Skip to main content

Datasets & Jobs

Manage relationships between queries and their associated datasets and scheduled jobs.

Query-Dataset Relationship

Queries can generate datasets in two ways:

  1. Embedded Datasets - Created when running a query via POST /api/queries/{id}/_run
  2. Standalone Datasets - Created explicitly and reference a query

GET /api/queries/{id}/dataset

Get the embedded dataset created by running a query for the current user.

Authentication: Required

Path Parameters:

ParameterTypeDescription
idstringQuery ID (UUID) or natural ID

Response:

{
"id": "embedded-dataset-uuid",
"name": "Sales Analysis Query Results",
"description": null,
"type": "query",
"queryId": "550e8400-e29b-41d4-a716-446655440000",
"ownerId": "john.doe",
"embedded": true,
"params": {
"startDate": "2024-01-01",
"region": "West"
},
"ttl": 3600000,
"records": 12450,
"size": 2458901,
"esIndex": "dataset_embedded_uuid",
"createdAt": "2024-02-09T10:00:00Z",
"updatedAt": "2024-02-09T10:30:00Z",
"dataUpdatedAt": "2024-02-09T10:30:00Z",
"_links": {
"self": { "href": "/api/datasets/embedded-dataset-uuid" },
"inf:query": { "href": "/api/queries/550e8400-e29b-41d4-a716-446655440000" },
"inf:fields": { "href": "/api/datasets/embedded-dataset-uuid/fields" },
"inf:data": { "href": "/api/datasets/embedded-dataset-uuid/data" }
},
"_embedded": {
"inf:query": {
"id": "550e8400-e29b-41d4-a716-446655440000",
"name": "Sales Analysis Query",
"language": "sql"
}
}
}

Returns 404 if no embedded dataset exists for the current user and query.

Dataset Lifecycle

Embedded datasets are created per user when running a query. Each user gets their own dataset instance with their specific parameters.


POST /api/queries/{id}/_create-dataset

Create a standalone (non-embedded) dataset from a query.

Authentication: Required Permission: query:run

Path Parameters:

ParameterTypeDescription
idstringQuery ID (UUID) or natural ID

Request Body:

{
"name": "Sales Analysis Dataset",
"description": "Standalone dataset for sales analysis",
"params": {
"startDate": "2024-01-01",
"region": "West"
},
"shared": true,
"folderId": "folder-uuid"
}

Response:

Returns the created dataset with 201 Created status.

{
"id": "new-dataset-uuid",
"name": "Sales Analysis Dataset",
"description": "Standalone dataset for sales analysis",
"type": "query",
"queryId": "550e8400-e29b-41d4-a716-446655440000",
"ownerId": "john.doe",
"embedded": false,
"params": {
"startDate": "2024-01-01",
"region": "West"
},
"shared": true,
"folderId": "folder-uuid",
"_links": {
"self": { "href": "/api/datasets/new-dataset-uuid" },
"inf:query": { "href": "/api/queries/550e8400-e29b-41d4-a716-446655440000" }
}
}

Use Cases:

  • Create a permanent dataset from a query
  • Share query results with fixed parameters
  • Attach dataset to a job for scheduled execution
  • Build reports on top of query results
Embedded vs Standalone
  • Embedded: Temporary, per-user, auto-created by _run
  • Standalone: Permanent, shareable, manually created

Jobs

Queries can be scheduled for automatic execution via jobs.

GET /api/queries/{id}/jobs

Get all jobs associated with a query.

Authentication: Required

Path Parameters:

ParameterTypeDescription
idstringQuery ID (UUID) or natural ID

Response:

{
"_links": {
"self": { "href": "/api/queries/{id}/jobs" }
},
"_embedded": {
"inf:job": [
{
"id": "job-uuid-1",
"name": "Daily Sales Report",
"description": "Generate daily sales summary",
"enabled": true,
"schedule": {
"type": "cron",
"expression": "0 6 * * *",
"timezone": "America/New_York"
},
"datasets": [
{
"datasetId": "dataset-uuid",
"queryId": "550e8400-e29b-41d4-a716-446655440000",
"params": {
"startDate": { "expression": "TODAY - 1 DAY" }
}
}
],
"actions": [
{
"type": "email",
"recipients": ["sales@company.com"],
"subject": "Daily Sales Report"
}
],
"lastRun": "2024-02-09T06:00:00Z",
"nextRun": "2024-02-10T06:00:00Z",
"_links": {
"self": { "href": "/api/jobs/job-uuid-1" }
}
},
{
"id": "job-uuid-2",
"name": "Weekly Sales Summary",
"enabled": true,
"schedule": {
"type": "cron",
"expression": "0 8 * * MON"
}
}
]
},
"start": 0,
"count": 2,
"total": 2
}
Job Discovery

This endpoint returns all jobs that reference the query, either directly or through a dataset that uses the query.


GET /api/queries/{id}/job-templates

Get job templates available for a query.

Authentication: Required

Path Parameters:

ParameterTypeDescription
idstringQuery ID (UUID) or natural ID

Response:

[
{
"id": "daily-email",
"name": "Daily Email Report",
"description": "Send query results via email every day",
"schedule": {
"type": "cron",
"expression": "0 6 * * *",
"description": "Daily at 6 AM"
},
"actions": [
{
"type": "email",
"format": "excel"
}
]
},
{
"id": "weekly-export",
"name": "Weekly Export",
"description": "Export results to CSV weekly",
"schedule": {
"type": "cron",
"expression": "0 8 * * MON",
"description": "Mondays at 8 AM"
},
"actions": [
{
"type": "export",
"format": "csv",
"destination": "ftp"
}
]
},
{
"id": "hourly-refresh",
"name": "Hourly Refresh",
"description": "Refresh dataset every hour",
"schedule": {
"type": "cron",
"expression": "0 * * * *",
"description": "Every hour"
}
}
]

Template Properties:

PropertyDescription
idTemplate identifier
nameDisplay name
descriptionTemplate purpose
schedulePre-configured schedule
actionsDefault actions (email, export, etc.)
Creating Jobs

Use these templates as starting points when creating jobs via POST /api/jobs. The templates provide common scheduling patterns and action configurations.


Dataset Management

Finding Query Datasets

// Get user's embedded dataset
const dataset = await GET('/api/queries/my-query/dataset');

// Check if dataset exists and is fresh
if (dataset && dataset.dataUpdatedAt) {
const age = Date.now() - new Date(dataset.dataUpdatedAt);
const ttl = dataset.ttl;

if (age < ttl) {
// Dataset is still valid
console.log('Using cached dataset');
} else {
// Dataset expired, need to refresh
await POST('/api/queries/my-query/_run');
}
} else {
// No dataset exists, run query
await POST('/api/queries/my-query/_run');
}

Creating Permanent Datasets

// Create a permanent dataset from query
const dataset = await POST('/api/queries/my-query/_create-dataset', {
name: 'Sales Analysis - Q4 2024',
description: 'Quarterly sales snapshot',
params: {
startDate: '2024-10-01',
endDate: '2024-12-31'
},
shared: true
});

// Dataset can now be shared, scheduled, or used in reports
await POST('/api/jobs', {
name: 'Monthly Sales Email',
schedule: { type: 'cron', expression: '0 8 1 * *' },
datasets: [{ datasetId: dataset.id }],
actions: [
{
type: 'email',
recipients: ['team@company.com'],
format: 'excel'
}
]
});

Job Integration

Creating a Job for a Query

// 1. Create standalone dataset from query
const dataset = await POST('/api/queries/my-query/_create-dataset', {
name: 'Sales Dataset',
params: { region: 'West' }
});

// 2. Create job to refresh dataset daily
const job = await POST('/api/jobs', {
name: 'Daily Sales Refresh',
enabled: true,
schedule: {
type: 'cron',
expression: '0 6 * * *',
timezone: 'America/New_York'
},
datasets: [
{
datasetId: dataset.id,
params: {
region: 'West',
startDate: { expression: 'TODAY - 7 DAYS' }
}
}
],
actions: [
{
type: 'email',
recipients: ['sales@company.com'],
subject: 'Daily Sales Update',
format: 'excel'
}
]
});

console.log('Job created:', job.id);
console.log('Next run:', job.nextRun);

Dynamic Parameters in Jobs

Jobs can use dynamic parameter expressions:

{
"datasets": [
{
"datasetId": "dataset-uuid",
"params": {
"startDate": { "expression": "TODAY - 1 DAY" },
"endDate": { "expression": "TODAY" },
"quarter": { "expression": "CURRENT_QUARTER" },
"fiscalYear": { "expression": "CURRENT_FISCAL_YEAR" }
}
}
]
}

Best Practices

Embedded vs Standalone Datasets

Use CaseType
Ad-hoc analysisEmbedded
User-specific viewsEmbedded
Short-lived resultsEmbedded
Shared reportsStandalone
Scheduled jobsStandalone
Permanent snapshotsStandalone

Dataset TTL Management

// Configure TTL based on data freshness needs
const ttlMapping = {
'real-time': 5 * 60 * 1000, // 5 minutes
'frequent': 30 * 60 * 1000, // 30 minutes
'hourly': 60 * 60 * 1000, // 1 hour
'daily': 24 * 60 * 60 * 1000, // 24 hours
'weekly': 7 * 24 * 60 * 60 * 1000 // 7 days
};

Job Scheduling Patterns

PatternCron ExpressionUse Case
Every hour0 * * * *Real-time dashboards
Daily at 6 AM0 6 * * *Daily reports
Weekdays at 9 AM0 9 * * 1-5Business day reports
First of month0 8 1 * *Monthly summaries
Weekly on Monday0 8 * * MONWeekly rollups

Parameter Expressions

Common dynamic expressions for job parameters:

{
"startDate": { "expression": "TODAY - 1 DAY" },
"endDate": { "expression": "TODAY" },
"lastWeek": { "expression": "TODAY - 7 DAYS" },
"monthStart": { "expression": "START_OF_MONTH" },
"monthEnd": { "expression": "END_OF_MONTH" },
"quarterStart": { "expression": "START_OF_QUARTER" },
"yearStart": { "expression": "START_OF_YEAR" }
}

Monitoring

Check Job Status

// Get jobs for a query
const jobs = await GET('/api/queries/my-query/jobs');

for (const job of jobs.items) {
console.log(`Job: ${job.name}`);
console.log(` Enabled: ${job.enabled}`);
console.log(` Last Run: ${job.lastRun}`);
console.log(` Next Run: ${job.nextRun}`);
console.log(` Status: ${job.lastRunStatus}`);
}

Dataset Freshness

// Check if dataset needs refresh
const dataset = await GET('/api/queries/my-query/dataset');

const dataAge = Date.now() - new Date(dataset.dataUpdatedAt);
const needsRefresh = dataAge > dataset.ttl;

if (needsRefresh) {
console.log('Dataset is stale, refreshing...');
await POST('/api/queries/my-query/_run', {
params: dataset.params
});
}

Cleanup

Remove Stale Datasets

// Find old embedded datasets
const datasets = await GET('/api/datasets?type=query&embedded=true');

for (const dataset of datasets.items) {
const age = Date.now() - new Date(dataset.dataUpdatedAt);
const maxAge = 7 * 24 * 60 * 60 * 1000; // 7 days

if (age > maxAge) {
console.log(`Deleting stale dataset: ${dataset.id}`);
await DELETE(`/api/datasets/${dataset.id}`);
}
}

Disable Old Jobs

// Disable jobs that haven't run in 90 days
const jobs = await GET('/api/queries/my-query/jobs');

for (const job of jobs.items) {
if (job.lastRun) {
const daysSinceRun = (Date.now() - new Date(job.lastRun)) / (24 * 60 * 60 * 1000);

if (daysSinceRun > 90) {
await PUT(`/api/jobs/${job.id}`, { enabled: false });
console.log(`Disabled inactive job: ${job.name}`);
}
}
}