`engine.queries`

Purpose

The `engine.queries` dataset logs detailed information about each query executed in your environment. It captures both semantic details (e.g., query structure, labels, joins) and execution-level statistics (e.g., performance metrics, errors, resource usage). Use it to investigate slow queries, diagnose failures, and track usage patterns, and to find inefficient queries that are worth optimizing.
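
As a rough illustration of the slow-query and failure analysis described above, the following Python sketch tallies query statuses and collects the IDs of slow queries from exported records. It assumes the records are already available as Python dictionaries shaped like the schema on this page; the `triage` helper, the sample records, and the 5-second threshold are hypothetical and not part of the product.

```python
from collections import Counter

# Hypothetical threshold: treat anything at or above 5 seconds end-to-end as slow.
SLOW_MS = 5_000


def triage(records, slow_ms=SLOW_MS):
    """Count records per final status and list the IDs of slow queries."""
    status_counts = Counter()
    slow_ids = []
    for rec in records:
        info = rec.get("queryInfo", {})
        outcome = info.get("queryOutcome", {})
        # Records without a status are grouped under "unknown".
        status_counts[outcome.get("status", "unknown")] += 1
        if outcome.get("e2eDurationMs", 0) >= slow_ms:
            slow_ids.append(info.get("queryId"))
    return status_counts, slow_ids


# Made-up sample records that follow the documented field names.
sample = [
    {"queryInfo": {"queryId": "q-1",
                   "queryOutcome": {"status": "Completed", "e2eDurationMs": 842}}},
    {"queryInfo": {"queryId": "q-2",
                   "queryOutcome": {"status": "Completed", "e2eDurationMs": 9100}}},
    {"queryInfo": {"queryId": "q-3",
                   "queryOutcome": {"status": "Failed", "e2eDurationMs": 120,
                                    "failureClass": "clientError"}}},
]

counts, slow = triage(sample)
print(dict(counts), slow)
```

A sensible threshold depends on the workload, so `slow_ms` is left as a parameter rather than fixed.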

Some of the many use cases for engine.queries include:

Schema description

Full JSON path	Field data type	Field data example	Description
`clientInfo.userEmail`	String	`"alex@acme.io"`	The email address of the user who issued the query.
`clientInfo.originatingTeamId`	String	`"42"`	Internal team ID that owns or initiated the query.
`queryInfo.semanticLabels`	Object	`{...}`	Aggregated booleans describing structural/query features.
`queryInfo.semanticLabels.containsFreeTextSearch`	Boolean	`true`	Whether the query includes full-text search elements.
`queryInfo.semanticLabels.containsUnions`	Boolean	`false`	Whether SQL-style `UNION` clauses are present.
`queryInfo.semanticLabels.containsAggregations`	Boolean	`true`	Whether aggregate functions (e.g., `SUM`, `COUNT`, `AVG`) are used.
`queryInfo.semanticLabels.nonDefaultLimitRequested`	Boolean	`true`	Indicates a non-default result size was requested.
`queryInfo.semanticLabels.containsWildTextSearch`	Boolean	`false`	Whether wildcard search terms (e.g., `*`, `%`) are used.
`queryInfo.semanticLabels.containsGroupingSets`	Boolean	`false`	Whether grouping sets or similar constructs are used.
`queryInfo.semanticLabels.containsJoins`	Boolean	`true`	Whether the query joins multiple datasets/tables.
`queryInfo.semanticLabels.containsExtract`	Boolean	`true`	Whether extract operations (e.g., parsing time/strings) are used.
`queryInfo.semanticLabels.containsWriteto`	Boolean	`false`	Whether the query uses a `writeto` clause to materialize results into a dataset.
`queryInfo.semanticLabels.extractBeforeFilter`	Boolean	`false`	Whether the extract occurs before filtering in execution order.
`queryInfo.querySyntax`	Enum	`"dataprime"`	Syntax format used by the query: `dataprime`, `lucene`, or `opensearch`.
`queryInfo.interfaceType`	Enum	`"dataprime"`	Interface that the query was submitted against: `dataprime` or `opensearch`.
`queryInfo.tier`	Enum	`"low"`	Query priority tier: `high`, `medium`, or `low`.
`queryInfo.sources`	Array	`[{"fqDataset":"default/logs","teamId":"42","timeFrame":{...}}]`	Source datasets and configurations used in the query.
`queryInfo.sources.fqDataset`	String	`"default/logs"`	Fully-qualified dataset name (`<dataspace>/<dataset>`).
`queryInfo.sources.teamId`	String	`"42"`	Team responsible for the dataset (optional).
`queryInfo.sources.timeFrame`	Object	`{"start":1777545819954000000,"end":1777841999999000000,"durationMs":296180045}`	Selected time range for the source.
`queryInfo.sources.timeFrame.start`	Number	`1777545819954000000`	Start of the source's time range, in epoch nanoseconds.
`queryInfo.sources.timeFrame.end`	Number	`1777841999999000000`	End of the source's time range, in epoch nanoseconds.
`queryInfo.sources.timeFrame.durationMs`	Number	`296180045`	Total timeframe duration in milliseconds.
`queryInfo.sources.scopeExpression`	String	`"service='api' AND env='prod'"`	Expression that filters or scopes the source data.
`queryInfo.queryOutcome`	Object	`{...}`	Final execution outcome details.
`queryInfo.queryOutcome.errorMessage`	String	`"Syntax error near 'FROM'"`	Optional error message if the query failed.
`queryInfo.queryOutcome.outputRowCount`	Number	`124`	Number of rows returned.
`queryInfo.queryOutcome.status`	Enum	`"Completed"`	Final status: `Completed`, `Failed`, `Cancelled`, `TimedOut`, or `Incomplete`.
`queryInfo.queryOutcome.storage`	Object	`{...}`	Storage and execution metadata: where the query ran, which storage locations it touched, what it produced, and resource statistics.
`queryInfo.queryOutcome.storage.home`	Object	`{"cloud":"aws","aws":{"region":"eu-west-1"}}`	Cloud and region where the query was executed.
`queryInfo.queryOutcome.storage.home.cloud`	Enum	`"aws"`	Cloud provider hosting the query engine.
`queryInfo.queryOutcome.storage.home.aws.region`	String	`"eu-west-1"`	AWS region where the query was executed. Present when `cloud` is `aws`.
`queryInfo.queryOutcome.storage.locations`	Array	`[{"type":"objectStore","locationOwner":"customer","objectStore":{...}}]`	Storage locations the query read from or wrote to during execution. Each item is one of two variants — `objectStore` (customer-owned) or `high` (provider-owned).
`queryInfo.queryOutcome.storage.locations.type`	Enum	`"objectStore"`	Location variant: `objectStore` or `high`.
`queryInfo.queryOutcome.storage.locations.locationOwner`	Enum	`"customer"`	`customer` for `objectStore` locations; `provider` for `high` locations.
`queryInfo.queryOutcome.storage.locations.objectStore`	Object	`{...}`	Object-store metadata. Present when `type` is `objectStore`.
`queryInfo.queryOutcome.storage.locations.objectStore.type`	Enum	`"aws_s3"`	Object-store type: `aws_s3`, `gcp_gs`, `azure_blobStorage`, or `ibm_cos`.
`queryInfo.queryOutcome.storage.locations.objectStore.aws_s3.bucket`	String	`"customer-archive-bucket"`	S3 bucket name. Present when `objectStore.type` is `aws_s3`.
`queryInfo.queryOutcome.storage.locations.objectStore.aws_s3.region`	String	`"eu-west-1"`	S3 bucket region. Present when `objectStore.type` is `aws_s3`.
`queryInfo.queryOutcome.storage.locations.objectStore.stats`	Object	`{...}`	Object-store I/O stats for this location.
`queryInfo.queryOutcome.storage.locations.objectStore.stats.bytesRead`	Number	`0`	Bytes read from this object-store location.
`queryInfo.queryOutcome.storage.locations.objectStore.stats.getRequests`	Number	`0`	Number of `GET` requests issued to this object-store location.
`queryInfo.queryOutcome.storage.locations.objectStore.stats.headRequests`	Number	`0`	Number of `HEAD` requests issued to this object-store location.
`queryInfo.queryOutcome.storage.locations.high`	Object	`{...}`	Provider-owned storage metadata. Present when `type` is `high`.
`queryInfo.queryOutcome.storage.locations.high.stats.bytesRead`	Number	`0`	Bytes read from provider-owned storage.
`queryInfo.queryOutcome.storage.outputs`	Array	`[{"type":"writeto","writeto":{...}}]`	Outputs produced by the query (e.g., results materialized via `writeto`).
`queryInfo.queryOutcome.storage.outputs.type`	Enum	`"writeto"`	Output type.
`queryInfo.queryOutcome.storage.outputs.writeto`	Object	`{...}`	`writeto` output details. Present when `type` is `writeto`.
`queryInfo.queryOutcome.storage.outputs.writeto.dataset.stats.totalBytesWritten`	Number	`87893`	Total compressed bytes written across all targets.
`queryInfo.queryOutcome.storage.outputs.writeto.dataset.stats.totalUncompressedBytes`	Number	`24469`	Total uncompressed bytes written across all targets.
`queryInfo.queryOutcome.storage.outputs.writeto.dataset.targets`	Array	`[{...}]`	Per-target `writeto` results.
`queryInfo.queryOutcome.storage.outputs.writeto.dataset.targets.dataset`	String	`"materialized_events"`	Target dataset name.
`queryInfo.queryOutcome.storage.outputs.writeto.dataset.targets.dataspace`	String	`"default"`	Target dataspace.
`queryInfo.queryOutcome.storage.outputs.writeto.dataset.targets.writeMode`	Enum	`"append"`	Write mode used: `append` or `overwrite`.
`queryInfo.queryOutcome.storage.outputs.writeto.dataset.targets.stats.bytesWritten`	Number	`87893`	Compressed bytes written to this target.
`queryInfo.queryOutcome.storage.outputs.writeto.dataset.targets.stats.uncompressedBytes`	Number	`24469`	Uncompressed bytes written to this target.
`queryInfo.queryOutcome.storage.stats`	Object	`{...}`	Execution-level resource statistics for the query.
`queryInfo.queryOutcome.storage.stats.bytesRead`	Number	`7271615`	Bytes read from customer-exposed storage (customer bucket or provider-owned storage). Excludes cache and staging bucket reads.
`queryInfo.queryOutcome.storage.stats.crossRegionBytesRead`	Number	`0`	Bytes read from a different region than the DataPrime Query Engine (DQE) cluster's region.
`queryInfo.queryOutcome.storage.stats.limits`	Object	`{...}`	Per-resource limits. Each entry includes `reached` (whether the limit was hit) and, where applicable, `limit` (the threshold) and `value` (the measured amount). Threshold values vary by tier and account configuration; the `limit` examples in the rows below are illustrative only.
`queryInfo.queryOutcome.storage.stats.limits.scan.reached`	Boolean	`false`	`true` if the scan limit was reached.
`queryInfo.queryOutcome.storage.stats.limits.scan.limit`	Number	`1073741824`	Scan-byte threshold.
`queryInfo.queryOutcome.storage.stats.limits.scan.value`	Number	`14522`	Bytes the engine scanned (may include data from in-memory cache, not just storage).
`queryInfo.queryOutcome.storage.stats.limits.shuffleSize.reached`	Boolean	`false`	`true` if the shuffle size limit was reached.
`queryInfo.queryOutcome.storage.stats.limits.shuffleSize.limit`	Number	`1073741824`	Shuffle-byte threshold.
`queryInfo.queryOutcome.storage.stats.limits.shuffleSize.value`	Number	`83456`	Bytes shuffled by the query.
`queryInfo.queryOutcome.storage.stats.limits.filesRead.reached`	Boolean	`false`	`true` if the files-read limit was reached.
`queryInfo.queryOutcome.storage.stats.limits.filesRead.limit`	Number	`10000`	Files-read threshold.
`queryInfo.queryOutcome.storage.stats.limits.filesRead.value`	Number	`0`	Number of files the query read.
`queryInfo.queryOutcome.storage.stats.limits.aggBuckets.reached`	Boolean	`false`	`true` if the aggregation-buckets limit was reached.
`queryInfo.queryOutcome.storage.stats.limits.aggBuckets.limit`	Number	`10000`	Aggregation-buckets threshold.
`queryInfo.queryOutcome.storage.stats.limits.aggBuckets.value`	Number	`0`	Number of aggregation buckets used by the query.
`queryInfo.queryOutcome.storage.stats.limits.column.reached`	Boolean	`false`	`true` if the column limit was reached. The column count itself is not emitted because it is computed at ingest, not at query time.
`queryInfo.queryOutcome.storage.stats.limits.scrollTimeout.reached`	Boolean	`false`	`true` if the scroll operation timed out.
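`queryInfo.queryOutcome.failureType`	Enum	`"bad request"`	Failure reason, if applicable: `bad request`, `rate limit reached`, `business timeout`, `not found`, `permission denied`, `internal`, `resource exhausted`, `internal death`, `query failed`, or `query timed out`.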
`queryInfo.queryOutcome.failureClass`	Enum	`"clientError"`	Error class: `clientError` or `serverError`.
`queryInfo.queryOutcome.e2eDurationMs`	Number	`842`	End-to-end execution time (ms).
`queryInfo.queryId`	String	`"q-2025-09-04-abc123"`	Unique identifier for the query execution.
`queryInfo.queryBlueprints`	Object	`{...}`	Normalized representations of query components.
`queryInfo.queryBlueprints.queryTextSearchFilters`	String	`"text:\"payment failed\""`	Representation of text-based filters.
`queryInfo.queryBlueprints.queryNoLiterals`	String	`"SELECT * FROM logs WHERE status = ?"`	Query string with literals removed.
`queryInfo.queryBlueprints.queryLabelFilters`	String	`"service=api, env=prod"`	Normalized label-based filters.
`queryInfo.defaultTimeFrame`	Object	`{"start":1777801523000000000,"end":1777805123000000000,"durationMs":3600000}`	Default time range if none is specified.
`queryInfo.defaultTimeFrame.start`	Number	`1777801523000000000`	Default start time, in epoch nanoseconds.
`queryInfo.defaultTimeFrame.end`	Number	`1777805123000000000`	Default end time, in epoch nanoseconds.
`queryInfo.defaultTimeFrame.durationMs`	Number	`3600000`	Default time range duration in milliseconds.
`queryInfo.queryText`	String	`"source default/logs | limit 10"`	Original raw query issued by the user.
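
The time-frame fields mix two units: `start` and `end` are epoch nanoseconds, while `durationMs` is milliseconds. The sketch below shows one way to convert a time frame to readable UTC timestamps and to check that the three values agree. It uses the example values from the table; the `describe_time_frame` helper is a hypothetical name, not part of any product API.

```python
from datetime import datetime, timezone

NS_PER_MS = 1_000_000
NS_PER_S = 1_000_000_000


def describe_time_frame(time_frame: dict) -> dict:
    """Convert an epoch-nanosecond time frame into UTC datetimes and check its duration."""
    start_ns = time_frame["start"]
    end_ns = time_frame["end"]
    # Integer division keeps the arithmetic exact; floats lose precision at nanosecond scale.
    span_ms = (end_ns - start_ns) // NS_PER_MS
    return {
        # Datetimes are truncated to whole seconds for readability.
        "start": datetime.fromtimestamp(start_ns // NS_PER_S, tz=timezone.utc),
        "end": datetime.fromtimestamp(end_ns // NS_PER_S, tz=timezone.utc),
        "span_ms": span_ms,
        "matches_duration": span_ms == time_frame["durationMs"],
    }


# Example values copied from the queryInfo.sources.timeFrame row.
frame = {
    "start": 1777545819954000000,
    "end": 1777841999999000000,
    "durationMs": 296180045,
}
summary = describe_time_frame(frame)
print(summary["start"].isoformat(), summary["end"].isoformat())
print(summary["span_ms"], summary["matches_duration"])
```

The same helper applies to `queryInfo.defaultTimeFrame`, which uses the same three fields.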

`engine.queries` schema

{ clientInfo

userEmail

type: string
The email address of the user who issued the query.

originatingTeamId

type: string
The internal team ID that owns or initiated the query.

}

{ queryInfo

{ semanticLabels

containsFreeTextSearch

type: boolean
Whether the query includes full-text search elements.

containsUnions

type: boolean
Whether the query includes SQL-style UNION clauses.

containsAggregations

type: boolean
Whether the query uses aggregation functions like SUM, COUNT, AVG.

nonDefaultLimitRequested

type: boolean
Indicates if the query requests a result size different from the default.

containsWildTextSearch

type: boolean
Whether the query includes wildcard search terms (e.g., *, %).

containsGroupingSets

type: boolean
Whether grouping sets or similar constructs are used in the query.

containsJoins

type: boolean
Whether the query involves joining multiple datasets or tables.

containsExtract

type: boolean
Whether the query includes extract operations (e.g., parsing time or string fields).

containsWriteto

type: boolean
Whether the query uses a writeto clause to materialize results into a dataset.

extractBeforeFilter

type: boolean
Whether the extract operation occurs before filtering in execution order.

}

querySyntax

Enum: dataprime, lucene, opensearch
Syntax format used by the query.

interfaceType

Enum: dataprime, opensearch
Interface that the query was submitted against.

tier

Enum: high, medium, low
Query priority tier.

{ sources

type: array
A list of source datasets and their configurations used in the query.

{ items

fqDataset

type: string
The fully qualified name of the dataset (<dataspace>/<dataset>).

teamId

type: string
The team responsible for the dataset (optional).

{ timeFrame

start

type: number
Start of the source's time range, in epoch nanoseconds.

end

type: number
End of the source's time range, in epoch nanoseconds.

durationMs

type: number
Total duration of the selected timeframe in milliseconds.

}

scopeExpression

type: string
Expression that filters or scopes the source data (e.g., labels or conditions).

}

{ queryOutcome

errorMessage

type: string
Optional error message if the query failed.

outputRowCount

type: number
Number of rows returned by the query.

status

Enum: Completed, Failed, Cancelled, TimedOut, Incomplete
The final status of the query execution.

{ storage

{ home

cloud

Enum: aws
Cloud provider hosting the query engine.

{ aws

region

type: string
AWS region where the query was executed. Present when cloud is aws.

}

{ locations

type: array
Storage locations the query read from or wrote to. Each item is one of two variants — objectStore (customer-owned) or high (provider-owned).

{ items

type

Enum: objectStore, high
Location variant.

locationOwner

Enum: customer, provider
customer for objectStore locations; provider for high locations.

{ objectStore

Present when type is objectStore.

type

Enum: aws_s3, gcp_gs, azure_blobStorage, ibm_cos
Object-store type.

{ aws_s3

Present when objectStore.type is aws_s3.

bucket

type: string
S3 bucket name.

region

type: string
S3 bucket region.

}

{ stats

bytesRead

type: number
Bytes read from this object-store location.

getRequests

type: number
Number of GET requests issued to this object-store location.

headRequests

type: number
Number of HEAD requests issued to this object-store location.

}

{ high

Present when type is high.

{ stats

bytesRead

type: number
Bytes read from provider-owned storage.

}

{ outputs

type: array
Outputs produced by the query (e.g., results materialized via writeto).

{ items

type

Enum: writeto
Output type.

{ writeto

{ dataset

{ stats

totalBytesWritten

type: number
Total compressed bytes written across all targets.

totalUncompressedBytes

type: number
Total uncompressed bytes written across all targets.

}

{ targets

type: array
Per-target writeto results.

{ items

dataset

type: string
Target dataset name.

dataspace

type: string
Target dataspace.

writeMode

Enum: append, overwrite
Write mode used for this target.

{ stats

bytesWritten

type: number
Compressed bytes written to this target.

uncompressedBytes

type: number
Uncompressed bytes written to this target.

}

{ stats

bytesRead

type: number
Bytes read from customer-exposed storage (customer bucket or provider-owned storage). Excludes cache and staging bucket reads.

crossRegionBytesRead

type: number
Bytes read from a different region than the DataPrime Query Engine (DQE) cluster's region.

{ limits

Per-resource limit signals. Each entry includes reached (whether the limit was hit) and, where applicable, limit (the threshold) and value (the measured amount). Threshold values vary by tier and account configuration.

{ scan

reached

type: boolean
true if the scan limit was reached.

limit

type: number
Scan-byte threshold.

value

type: number
Bytes the engine scanned (may include in-memory cache, not just storage).

}

{ shuffleSize

reached

type: boolean
true if the shuffle size limit was reached.

limit

type: number
Shuffle-byte threshold.

value

type: number
Bytes shuffled by the query.

}

{ filesRead

reached

type: boolean
true if the files-read limit was reached.

limit

type: number
Files-read threshold.

value

type: number
Number of files the query read.

}

{ aggBuckets

reached

type: boolean
true if the aggregation-buckets limit was reached.

limit

type: number
Aggregation-buckets threshold.

value

type: number
Number of aggregation buckets used.

}

{ column

reached

type: boolean
true if the column limit was reached. The column count itself is not emitted because it is computed at ingest, not at query time.

}

{ scrollTimeout

reached

type: boolean
true if the scroll operation timed out.

}

failureType

Enum: bad request, rate limit reached, business timeout, not found, permission denied, internal, resource exhausted, internal death, query failed, query timed out
The reason the query failed, if applicable.

failureClass

Enum: clientError, serverError
Classification of the error as client-side or server-side.

e2eDurationMs

type: number
Total execution time from request to response in milliseconds.

}

queryId

type: string
A unique identifier for the query execution.

{ queryBlueprints

queryTextSearchFilters

type: string
A representation of the query's text-based filters.

queryNoLiterals

type: string
The query string with literals removed for comparison/normalization.

queryLabelFilters

type: string
A normalized form of the label-based filters in the query.

}

{ defaultTimeFrame

start

type: number
Default time range start, in epoch nanoseconds.

end

type: number
Default time range end, in epoch nanoseconds.

durationMs

type: number
Total default time range duration in milliseconds.

}

queryText

type: string
The original raw query as issued by the user.

}
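
To illustrate how the nested `storage` structure might be read, the Python sketch below lists the limits whose `reached` flag is `true` and totals `bytesRead` per location variant (`objectStore` or `high`). It assumes a record has already been loaded as a Python dictionary; the helper names and the sample record, including its numbers, are hypothetical.

```python
def _storage(record: dict) -> dict:
    """Return queryInfo.queryOutcome.storage, or an empty dict if any level is missing."""
    return record.get("queryInfo", {}).get("queryOutcome", {}).get("storage", {})


def reached_limits(record: dict) -> list:
    """Names of the limits under storage.stats.limits whose `reached` flag is true."""
    limits = _storage(record).get("stats", {}).get("limits", {})
    return sorted(name for name, entry in limits.items() if entry.get("reached"))


def bytes_read_by_variant(record: dict) -> dict:
    """Sum stats.bytesRead per location variant (objectStore or high)."""
    totals = {}
    for location in _storage(record).get("locations", []):
        variant = location.get("type")
        if variant is None:
            continue
        # Variant-specific metadata lives under a key named after the variant.
        stats = location.get(variant, {}).get("stats", {})
        totals[variant] = totals.get(variant, 0) + stats.get("bytesRead", 0)
    return totals


# Made-up record that follows the documented field names.
record = {"queryInfo": {"queryOutcome": {"storage": {
    "locations": [
        {"type": "objectStore", "locationOwner": "customer",
         "objectStore": {"type": "aws_s3",
                         "stats": {"bytesRead": 5000, "getRequests": 3, "headRequests": 1}}},
        {"type": "objectStore", "locationOwner": "customer",
         "objectStore": {"type": "aws_s3",
                         "stats": {"bytesRead": 1000, "getRequests": 1, "headRequests": 0}}},
        {"type": "high", "locationOwner": "provider",
         "high": {"stats": {"bytesRead": 2000}}},
    ],
    "stats": {"limits": {
        "scan": {"reached": False, "limit": 1073741824, "value": 14522},
        "aggBuckets": {"reached": True, "limit": 10000, "value": 10000},
        "column": {"reached": False},
        "scrollTimeout": {"reached": True},
    }},
}}}}

print(reached_limits(record))
print(bytes_read_by_variant(record))
```

Because `column` and `scrollTimeout` carry only a `reached` flag, the sketch reads that flag alone and does not assume `limit` or `value` are present.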

Next steps

Track field-level schema evolution over time with `engine.schema_fields`.
