Data Operations
Endpoints for managing dataset records: retrieving, adding, replacing, deleting, and searching data.
GET /api/datasets/{id}/data
Get dataset records with pagination, sorting, filtering, and full-text search.
Authentication: Required
Path Parameters:
| Parameter | Type | Description |
|---|---|---|
| id | string | Dataset ID or slug |
Query Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| q | string | - | Full-text search query |
| sort | string | - | Sort field (prefix with - for descending) |
| start | integer | 0 | Pagination offset |
| limit | integer | 30 | Number of results per page |
| filter | string | - | Filter ID or criteria JSON |
| report | string | - | Report ID for field selection |
| alias | string | - | Snapshot alias for time-based queries |
Response:
{
"_links": {
"self": { "href": "/api/datasets/{id}/data" }
},
"_embedded": {
"inf:record": [
{
"order_id": "ORD-2024-001",
"customer_name": "Acme Corp",
"amount": 1250.00,
"order_date": "2024-01-15T10:30:00Z",
"region": "West"
},
{
"order_id": "ORD-2024-002",
"customer_name": "TechStart Inc",
"amount": 3400.50,
"order_date": "2024-01-16T14:20:00Z",
"region": "East"
}
]
},
"start": 0,
"count": 2,
"total": 125000
}
Full-Text Search:
GET /api/datasets/sales-2024/data?q=Acme&limit=10
Sorting:
GET /api/datasets/sales-2024/data?sort=-amount&limit=50
Filtering:
GET /api/datasets/sales-2024/data?filter=filter-high-value
Or with inline criteria:
GET /api/datasets/sales-2024/data?filter={"amount":{"$gte":1000}}
For large datasets, always use pagination with reasonable limit values (30-100). Use sort and filter to narrow results before applying full-text search.
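The pagination loop implied by the start/limit parameters and the start/count/total response fields can be sketched as follows. This is an illustrative helper, not part of any official client; `fetch` stands in for the actual HTTP GET against /api/datasets/{id}/data and is assumed to return the parsed JSON body.

```python
def iter_records(fetch, dataset_id, limit=100):
    """Yield every record in a dataset by walking start/limit pages.

    `fetch(dataset_id, params)` is a stand-in for the HTTP call: it should
    GET /api/datasets/{id}/data with the given query parameters and return
    the parsed JSON response body.
    """
    start = 0
    while True:
        page = fetch(dataset_id, {"start": start, "limit": limit})
        records = page["_embedded"]["inf:record"]
        yield from records
        start += len(records)
        # Stop once we have walked past the reported total, or the
        # server returns an empty page.
        if start >= page["total"] or not records:
            break
```

Keeping `limit` in the recommended 30-100 range bounds each response while the generator still exposes the full dataset to callers.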
POST /api/datasets/{id}/data
Add new records to the dataset.
Authentication: Required
Permission: dataset:write
Path Parameters:
| Parameter | Type | Description |
|---|---|---|
| id | string | Dataset ID or slug |
Request Body:
Array of record objects or single record object.
Example Request (multiple records):
[
{
"order_id": "ORD-2024-100",
"customer_name": "New Customer A",
"amount": 500.00,
"order_date": "2024-02-08T10:00:00Z",
"region": "West"
},
{
"order_id": "ORD-2024-101",
"customer_name": "New Customer B",
"amount": 750.00,
"order_date": "2024-02-08T11:00:00Z",
"region": "East"
}
]
Example Request (single record):
{
"order_id": "ORD-2024-102",
"customer_name": "New Customer C",
"amount": 1200.00,
"order_date": "2024-02-08T12:00:00Z",
"region": "North"
}
Response:
{
"added": 2,
"total": 125002
}
Behavior:
- Records are appended to existing data (does not replace)
- Missing fields will be set to null
- Unknown fields will be created automatically
- Elasticsearch index is updated immediately
If the payload includes fields that don't exist in the dataset, they will be automatically created with inferred data types.
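Since the endpoint accepts an array, multiple records should be sent per request rather than one at a time. A minimal sketch of splitting a large record list into POST-sized batches (the batch size of 500 is an illustrative choice, not a documented limit):

```python
def chunked(records, batch_size=500):
    """Split a record list into batches for separate POST requests."""
    for i in range(0, len(records), batch_size):
        yield records[i:i + batch_size]

# Illustrative usage, where post_records is a stand-in for
# POST /api/datasets/{id}/data with a JSON array body:
#
# for batch in chunked(all_records, 500):
#     post_records("sales-2024", batch)
```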
PUT /api/datasets/{id}/data
Replace all dataset data (clear and rewrite).
Authentication: Required
Permission: dataset:write
Path Parameters:
| Parameter | Type | Description |
|---|---|---|
| id | string | Dataset ID or slug |
Request Body:
Array of record objects.
Example Request:
[
{
"product_id": "PROD-001",
"name": "Widget A",
"price": 19.99,
"category": "Electronics"
},
{
"product_id": "PROD-002",
"name": "Widget B",
"price": 24.99,
"category": "Electronics"
}
]
Response:
{
"replaced": 2,
"total": 2
}
Behavior:
- All existing data is deleted
- New data is indexed
- Field definitions are preserved
- Dataset structure remains intact
This operation permanently deletes all existing records before adding new ones. Use POST to append data instead.
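Because PUT is destructive, it can be worth validating the replacement payload before sending it. A rough sketch, assuming a hypothetical required-field set for the product example above (the schema check is illustrative; the API itself does not enforce it):

```python
# Hypothetical required fields for the product dataset above.
REQUIRED_FIELDS = {"product_id", "name", "price"}

def validate_for_replace(records):
    """Refuse a destructive PUT unless every record has the required fields."""
    bad = [i for i, r in enumerate(records)
           if not REQUIRED_FIELDS <= r.keys()]
    if bad:
        raise ValueError(f"records missing required fields at indexes: {bad}")
    return records
```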
DELETE /api/datasets/{id}/data
Clear all data from the dataset index.
Authentication: Required
Permission: dataset:write
Path Parameters:
| Parameter | Type | Description |
|---|---|---|
| id | string | Dataset ID or slug |
Query Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| removeFields | boolean | false | Also remove all field definitions |
Response:
204 No Content
Example:
DELETE /api/datasets/sales-2024/data
With field removal:
DELETE /api/datasets/sales-2024/data?removeFields=true
Behavior:
- All records are deleted from Elasticsearch
- Field definitions are preserved (unless removeFields=true)
- Dataset structure and configuration remain intact
- Index mapping is retained
Use removeFields=true only when you want to completely reset the dataset structure. This will remove all field definitions, calculated fields, and type mappings.
GET /api/datasets/{id}/_search
Execute an Elasticsearch search query against the dataset.
Authentication: Required
Path Parameters:
| Parameter | Type | Description |
|---|---|---|
| id | string | Dataset ID or slug |
Query Parameters:
| Parameter | Type | Description |
|---|---|---|
| query | object | Elasticsearch query DSL |
| _source | array/string | Fields to return |
| sort | array/string | Sort configuration |
| from | integer | Offset (Elasticsearch pagination) |
| size | integer | Result count |
| searchType | string | Elasticsearch search type |
| aggs | object | Aggregations |
| aggregations | object | Aggregations (alias) |
| report | string | Report ID for field selection |
| alias | string | Snapshot alias |
| snapshots | object | Snapshot configuration |
Example Request:
GET /api/datasets/sales-2024/_search?query={"match":{"customer_name":"Acme"}}&size=10&sort=["order_date:desc"]
Response:
Standard Elasticsearch response with hits and aggregations:
{
"took": 5,
"timed_out": false,
"hits": {
"total": { "value": 42, "relation": "eq" },
"max_score": 1.0,
"hits": [
{
"_id": "doc-123",
"_score": 1.0,
"_source": {
"order_id": "ORD-2024-001",
"customer_name": "Acme Corp",
"amount": 1250.00
}
}
]
}
}
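The object-valued query parameters above are sent as JSON-encoded strings. A sketch of assembling such a URL (the `search_url` helper and its signature are illustrative, not part of the API):

```python
import json
from urllib.parse import urlencode

def search_url(dataset_id, query, size=10, sort=None):
    """Build a GET /_search URL; object parameters are JSON-encoded."""
    params = {"query": json.dumps(query), "size": size}
    if sort is not None:
        params["sort"] = json.dumps(sort)
    # urlencode percent-escapes the JSON so it is safe in a URL.
    return f"/api/datasets/{dataset_id}/_search?" + urlencode(params)
```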
POST /api/datasets/{id}/_search
Execute an Elasticsearch search query with request body (for complex queries).
Authentication: Required
Path Parameters:
| Parameter | Type | Description |
|---|---|---|
| id | string | Dataset ID or slug |
Request Body:
Elasticsearch query DSL document.
Example Request:
{
"query": {
"bool": {
"must": [
{ "range": { "amount": { "gte": 1000 } } },
{ "term": { "region": "West" } }
]
}
},
"sort": [
{ "order_date": "desc" }
],
"from": 0,
"size": 50,
"aggs": {
"total_amount": {
"sum": { "field": "amount" }
},
"avg_amount": {
"avg": { "field": "amount" }
},
"by_region": {
"terms": { "field": "region" }
}
}
}
Response:
{
"took": 12,
"hits": {
"total": { "value": 156, "relation": "eq" },
"hits": [...]
},
"aggregations": {
"total_amount": {
"value": 487500.50
},
"avg_amount": {
"value": 3125.65
},
"by_region": {
"buckets": [
{ "key": "West", "doc_count": 89 },
{ "key": "East", "doc_count": 67 }
]
}
}
}
Supported Elasticsearch Features:
- Full query DSL (match, term, range, bool, etc.)
- Aggregations (terms, sum, avg, date_histogram, etc.)
- Sorting (single or multi-field)
- Source filtering (_source parameter)
- Pagination (from/size)
- Search types (query_then_fetch, dfs_query_then_fetch)
Use POST for complex queries with multiple filters, aggregations, or nested structures. GET is suitable for simple queries that fit in URL parameters.
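Request bodies like the bool-plus-aggregations example above are often assembled programmatically. A minimal sketch (the `bool_search_body` helper is a hypothetical convenience, not part of the API):

```python
def bool_search_body(musts=None, filters=None, size=50, aggs=None):
    """Assemble a request body for POST /api/datasets/{id}/_search.

    must clauses contribute to scoring; filter clauses only restrict
    results, so they are the cheaper choice when scoring is not needed.
    """
    body = {
        "query": {"bool": {"must": musts or [], "filter": filters or []}},
        "from": 0,
        "size": size,
    }
    if aggs:
        body["aggs"] = aggs
    return body
```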
Data Query Examples
Filter by Date Range
{
"query": {
"range": {
"order_date": {
"gte": "2024-01-01",
"lte": "2024-01-31"
}
}
}
}
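Range clauses like this can be generated from start/end values, omitting whichever bound is unset. A small illustrative builder (not part of the API):

```python
def date_range_query(field, gte=None, lte=None):
    """Build a range query clause, dropping unset bounds."""
    bounds = {k: v for k, v in {"gte": gte, "lte": lte}.items()
              if v is not None}
    return {"range": {field: bounds}}
```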
Full-Text Search with Filters
{
"query": {
"bool": {
"must": [
{ "match": { "customer_name": "Tech" } }
],
"filter": [
{ "range": { "amount": { "gte": 1000 } } },
{ "term": { "region": "West" } }
]
}
}
}
Aggregation by Category
{
"size": 0,
"aggs": {
"by_category": {
"terms": {
"field": "category",
"size": 10
},
"aggs": {
"total_sales": {
"sum": { "field": "amount" }
}
}
}
}
}
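The response to this query nests each category's total inside its terms bucket. A sketch of flattening that shape into a plain mapping (helper name and defaults are illustrative):

```python
def totals_by_bucket(response, agg_name="by_category", metric="total_sales"):
    """Flatten a terms aggregation with a nested sum into {key: value}."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {b["key"]: b[metric]["value"] for b in buckets}
```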
Date Histogram
{
"aggs": {
"sales_over_time": {
"date_histogram": {
"field": "order_date",
"calendar_interval": "month"
},
"aggs": {
"revenue": {
"sum": { "field": "amount" }
}
}
}
}
}
Best Practices
Performance Optimization
- Use filters instead of queries when you don't need scoring
- Limit aggregation size to prevent memory issues
- Use _source filtering to return only needed fields
- Paginate large result sets with from/size
- Avoid deep pagination (use search_after for large offsets)
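The search_after alternative to deep from/size offsets can be sketched as follows: each request carries the sort values of the previous page's last hit instead of a growing offset. This assumes the request body includes a deterministic sort; the helper itself is illustrative, not part of the API.

```python
def next_search_after(body, response):
    """Build the next-page request body from the previous response.

    Requires the body to include a deterministic sort (ideally with a
    tiebreaker field) so each hit carries usable "sort" values.
    """
    hits = response["hits"]["hits"]
    if not hits:
        return None                      # no more pages
    nxt = dict(body)
    nxt["search_after"] = hits[-1]["sort"]
    nxt.pop("from", None)                # search_after replaces offsets
    return nxt
```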
Data Consistency
- Batch inserts when adding multiple records (POST array)
- Verify field types before inserting to prevent mapping conflicts
- Group related changes into single requests where possible; transactions are not available at the data level
- Test with drafts before making production data changes
Security
- Validate input before inserting user-provided data
- Apply filters to restrict data access by user role
- Check permissions before bulk operations
- Audit changes for compliance and debugging