Datasets & Jobs
Manage relationships between queries and their associated datasets and scheduled jobs.
Query-Dataset Relationship
Queries can generate datasets in two ways:
- Embedded Datasets - Created when running a query via
POST /api/queries/{id}/_run - Standalone Datasets - Created explicitly and reference a query
GET /api/queries/{id}/dataset
Get the embedded dataset created by running a query for the current user.
Authentication: Required
Path Parameters:
| Parameter | Type | Description |
|---|---|---|
id | string | Query ID (UUID) or natural ID |
Response:
{
"id": "embedded-dataset-uuid",
"name": "Sales Analysis Query Results",
"description": null,
"type": "query",
"queryId": "550e8400-e29b-41d4-a716-446655440000",
"ownerId": "john.doe",
"embedded": true,
"params": {
"startDate": "2024-01-01",
"region": "West"
},
"ttl": 3600000,
"records": 12450,
"size": 2458901,
"esIndex": "dataset_embedded_uuid",
"createdAt": "2024-02-09T10:00:00Z",
"updatedAt": "2024-02-09T10:30:00Z",
"dataUpdatedAt": "2024-02-09T10:30:00Z",
"_links": {
"self": { "href": "/api/datasets/embedded-dataset-uuid" },
"inf:query": { "href": "/api/queries/550e8400-e29b-41d4-a716-446655440000" },
"inf:fields": { "href": "/api/datasets/embedded-dataset-uuid/fields" },
"inf:data": { "href": "/api/datasets/embedded-dataset-uuid/data" }
},
"_embedded": {
"inf:query": {
"id": "550e8400-e29b-41d4-a716-446655440000",
"name": "Sales Analysis Query",
"language": "sql"
}
}
}
Returns 404 if no embedded dataset exists for the current user and query.
Embedded datasets are created per user when running a query. Each user gets their own dataset instance with their specific parameters.
POST /api/queries/{id}/_create-dataset
Create a standalone (non-embedded) dataset from a query.
Authentication: Required
Permission: query:run
Path Parameters:
| Parameter | Type | Description |
|---|---|---|
id | string | Query ID (UUID) or natural ID |
Request Body:
{
"name": "Sales Analysis Dataset",
"description": "Standalone dataset for sales analysis",
"params": {
"startDate": "2024-01-01",
"region": "West"
},
"shared": true,
"folderId": "folder-uuid"
}
Response:
Returns the created dataset with 201 Created status.
{
"id": "new-dataset-uuid",
"name": "Sales Analysis Dataset",
"description": "Standalone dataset for sales analysis",
"type": "query",
"queryId": "550e8400-e29b-41d4-a716-446655440000",
"ownerId": "john.doe",
"embedded": false,
"params": {
"startDate": "2024-01-01",
"region": "West"
},
"shared": true,
"folderId": "folder-uuid",
"_links": {
"self": { "href": "/api/datasets/new-dataset-uuid" },
"inf:query": { "href": "/api/queries/550e8400-e29b-41d4-a716-446655440000" }
}
}
Use Cases:
- Create a permanent dataset from a query
- Share query results with fixed parameters
- Attach dataset to a job for scheduled execution
- Build reports on top of query results
- Embedded: Temporary, per-user, auto-created by
_run - Standalone: Permanent, shareable, manually created
Jobs
Queries can be scheduled for automatic execution via jobs.
GET /api/queries/{id}/jobs
Get all jobs associated with a query.
Authentication: Required
Path Parameters:
| Parameter | Type | Description |
|---|---|---|
id | string | Query ID (UUID) or natural ID |
Response:
{
"_links": {
"self": { "href": "/api/queries/{id}/jobs" }
},
"_embedded": {
"inf:job": [
{
"id": "job-uuid-1",
"name": "Daily Sales Report",
"description": "Generate daily sales summary",
"enabled": true,
"schedule": {
"type": "cron",
"expression": "0 6 * * *",
"timezone": "America/New_York"
},
"datasets": [
{
"datasetId": "dataset-uuid",
"queryId": "550e8400-e29b-41d4-a716-446655440000",
"params": {
"startDate": { "expression": "TODAY - 1 DAY" }
}
}
],
"actions": [
{
"type": "email",
"recipients": ["sales@company.com"],
"subject": "Daily Sales Report"
}
],
"lastRun": "2024-02-09T06:00:00Z",
"nextRun": "2024-02-10T06:00:00Z",
"_links": {
"self": { "href": "/api/jobs/job-uuid-1" }
}
},
{
"id": "job-uuid-2",
"name": "Weekly Sales Summary",
"enabled": true,
"schedule": {
"type": "cron",
"expression": "0 8 * * MON"
}
}
]
},
"start": 0,
"count": 2,
"total": 2
}
This endpoint returns all jobs that reference the query, either directly or through a dataset that uses the query.
GET /api/queries/{id}/job-templates
Get job templates available for a query.
Authentication: Required
Path Parameters:
| Parameter | Type | Description |
|---|---|---|
id | string | Query ID (UUID) or natural ID |
Response:
[
{
"id": "daily-email",
"name": "Daily Email Report",
"description": "Send query results via email every day",
"schedule": {
"type": "cron",
"expression": "0 6 * * *",
"description": "Daily at 6 AM"
},
"actions": [
{
"type": "email",
"format": "excel"
}
]
},
{
"id": "weekly-export",
"name": "Weekly Export",
"description": "Export results to CSV weekly",
"schedule": {
"type": "cron",
"expression": "0 8 * * MON",
"description": "Mondays at 8 AM"
},
"actions": [
{
"type": "export",
"format": "csv",
"destination": "ftp"
}
]
},
{
"id": "hourly-refresh",
"name": "Hourly Refresh",
"description": "Refresh dataset every hour",
"schedule": {
"type": "cron",
"expression": "0 * * * *",
"description": "Every hour"
}
}
]
Template Properties:
| Property | Description |
|---|---|
id | Template identifier |
name | Display name |
description | Template purpose |
schedule | Pre-configured schedule |
actions | Default actions (email, export, etc.) |
Use these templates as starting points when creating jobs via POST /api/jobs. The templates provide common scheduling patterns and action configurations.
Dataset Management
Finding Query Datasets
// Get user's embedded dataset
const dataset = await GET('/api/queries/my-query/dataset');
// Check if dataset exists and is fresh
if (dataset && dataset.dataUpdatedAt) {
const age = Date.now() - new Date(dataset.dataUpdatedAt);
const ttl = dataset.ttl;
if (age < ttl) {
// Dataset is still valid
console.log('Using cached dataset');
} else {
// Dataset expired, need to refresh
await POST('/api/queries/my-query/_run');
}
} else {
// No dataset exists, run query
await POST('/api/queries/my-query/_run');
}
Creating Permanent Datasets
// Create a permanent dataset from query
const dataset = await POST('/api/queries/my-query/_create-dataset', {
name: 'Sales Analysis - Q4 2024',
description: 'Quarterly sales snapshot',
params: {
startDate: '2024-10-01',
endDate: '2024-12-31'
},
shared: true
});
// Dataset can now be shared, scheduled, or used in reports
await POST('/api/jobs', {
name: 'Monthly Sales Email',
schedule: { type: 'cron', expression: '0 8 1 * *' },
datasets: [{ datasetId: dataset.id }],
actions: [
{
type: 'email',
recipients: ['team@company.com'],
format: 'excel'
}
]
});
Job Integration
Creating a Job for a Query
// 1. Create standalone dataset from query
const dataset = await POST('/api/queries/my-query/_create-dataset', {
name: 'Sales Dataset',
params: { region: 'West' }
});
// 2. Create job to refresh dataset daily
const job = await POST('/api/jobs', {
name: 'Daily Sales Refresh',
enabled: true,
schedule: {
type: 'cron',
expression: '0 6 * * *',
timezone: 'America/New_York'
},
datasets: [
{
datasetId: dataset.id,
params: {
region: 'West',
startDate: { expression: 'TODAY - 7 DAYS' }
}
}
],
actions: [
{
type: 'email',
recipients: ['sales@company.com'],
subject: 'Daily Sales Update',
format: 'excel'
}
]
});
console.log('Job created:', job.id);
console.log('Next run:', job.nextRun);
Dynamic Parameters in Jobs
Jobs can use dynamic parameter expressions:
{
"datasets": [
{
"datasetId": "dataset-uuid",
"params": {
"startDate": { "expression": "TODAY - 1 DAY" },
"endDate": { "expression": "TODAY" },
"quarter": { "expression": "CURRENT_QUARTER" },
"fiscalYear": { "expression": "CURRENT_FISCAL_YEAR" }
}
}
]
}
Best Practices
Embedded vs Standalone Datasets
| Use Case | Type |
|---|---|
| Ad-hoc analysis | Embedded |
| User-specific views | Embedded |
| Short-lived results | Embedded |
| Shared reports | Standalone |
| Scheduled jobs | Standalone |
| Permanent snapshots | Standalone |
Dataset TTL Management
// Configure TTL based on data freshness needs
const ttlMapping = {
'real-time': 5 * 60 * 1000, // 5 minutes
'frequent': 30 * 60 * 1000, // 30 minutes
'hourly': 60 * 60 * 1000, // 1 hour
'daily': 24 * 60 * 60 * 1000, // 24 hours
'weekly': 7 * 24 * 60 * 60 * 1000 // 7 days
};
Job Scheduling Patterns
| Pattern | Cron Expression | Use Case |
|---|---|---|
| Every hour | 0 * * * * | Real-time dashboards |
| Daily at 6 AM | 0 6 * * * | Daily reports |
| Weekdays at 9 AM | 0 9 * * 1-5 | Business day reports |
| First of month | 0 8 1 * * | Monthly summaries |
| Weekly on Monday | 0 8 * * MON | Weekly rollups |
Parameter Expressions
Common dynamic expressions for job parameters:
{
"startDate": { "expression": "TODAY - 1 DAY" },
"endDate": { "expression": "TODAY" },
"lastWeek": { "expression": "TODAY - 7 DAYS" },
"monthStart": { "expression": "START_OF_MONTH" },
"monthEnd": { "expression": "END_OF_MONTH" },
"quarterStart": { "expression": "START_OF_QUARTER" },
"yearStart": { "expression": "START_OF_YEAR" }
}
Monitoring
Check Job Status
// Get jobs for a query
const jobs = await GET('/api/queries/my-query/jobs');
for (const job of jobs.items) {
console.log(`Job: ${job.name}`);
console.log(` Enabled: ${job.enabled}`);
console.log(` Last Run: ${job.lastRun}`);
console.log(` Next Run: ${job.nextRun}`);
console.log(` Status: ${job.lastRunStatus}`);
}
Dataset Freshness
// Check if dataset needs refresh
const dataset = await GET('/api/queries/my-query/dataset');
const dataAge = Date.now() - new Date(dataset.dataUpdatedAt);
const needsRefresh = dataAge > dataset.ttl;
if (needsRefresh) {
console.log('Dataset is stale, refreshing...');
await POST('/api/queries/my-query/_run', {
params: dataset.params
});
}
Cleanup
Remove Stale Datasets
// Find old embedded datasets
const datasets = await GET('/api/datasets?type=query&embedded=true');
for (const dataset of datasets.items) {
const age = Date.now() - new Date(dataset.dataUpdatedAt);
const maxAge = 7 * 24 * 60 * 60 * 1000; // 7 days
if (age > maxAge) {
console.log(`Deleting stale dataset: ${dataset.id}`);
await DELETE(`/api/datasets/${dataset.id}`);
}
}
Disable Old Jobs
// Disable jobs that haven't run in 90 days
const jobs = await GET('/api/queries/my-query/jobs');
for (const job of jobs.items) {
if (job.lastRun) {
const daysSinceRun = (Date.now() - new Date(job.lastRun)) / (24 * 60 * 60 * 1000);
if (daysSinceRun > 90) {
await PUT(`/api/jobs/${job.id}`, { enabled: false });
console.log(`Disabled inactive job: ${job.name}`);
}
}
}