Using DataPrime to enrich and reshape data

Goal

By the end of this guide you should be able to enrich documents using lookup tables, extract structured values from strings, parse key-value pairs, and explode arrays into separate documents.

Why it matters

Data in logs and traces is often messy, inconsistent, or incomplete. You may need to add metadata, normalize fields across schemas, parse semi-structured text, or reshape arrays into flat rows for easier analysis. DataPrime lets you do all of this at query time, without needing to preprocess or re-index.

These transformations are essential for debugging, auditing, and building clean, meaningful dashboards—even when your logs aren’t clean.


Enrich documents with lookup metadata

Description

The enrich command decorates your logs with metadata from an external lookup table. This is useful for adding human-readable context, such as names, departments, or team ownership, keyed on fields like userid, ip, or cluster_id.

Syntax

enrich <lookup_value> into <target_field> using <lookup_table>
  • lookup_value: The field in the document used as a lookup key.
  • target_field: The new key where enriched data will be stored.
  • lookup_table: The name of the custom enrichment table.

Example

Sample data

{
  "userid": "111"
}

Lookup table: user_lookup_table
ID    Name    Department
111   John    Finance
222   Emily   IT

Query

enrich userid into user_info using user_lookup_table

Result

{
  "userid": "111",
  "user_info": {
    "ID": "111",
    "Name": "John",
    "Department": "Finance"
  }
}

This query appends the relevant row from the lookup table as an object under user_info, creating a dynamic join on read.
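
To make the "join on read" idea concrete, here is a minimal Python sketch that models the behavior described above. It is not DataPrime's implementation. The enrich function and USER_LOOKUP_TABLE are hypothetical names, and leaving the document unchanged when no row matches is an assumption rather than documented behavior.

```python
# Toy model of a lookup-table join performed at read time.
# Assumption: rows are keyed by the string value of the ID column.
USER_LOOKUP_TABLE = {
    "111": {"ID": "111", "Name": "John", "Department": "Finance"},
    "222": {"ID": "222", "Name": "Emily", "Department": "IT"},
}


def enrich(doc, lookup_field, target_field, table):
    """Return a copy of doc with the matching lookup row stored under target_field."""
    enriched = dict(doc)
    row = table.get(doc.get(lookup_field))
    if row is not None:
        # Copy the row so later edits to the document don't touch the table.
        enriched[target_field] = dict(row)
    return enriched  # assumption: unchanged copy when no row matches


result = enrich({"userid": "111"}, "userid", "user_info", USER_LOOKUP_TABLE)
print(result)
```

The stored document is never modified. The lookup row is attached only to the copy returned for this query, which is why a later change to the table shows up the next time the query runs.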


Extract structured data from strings (extract + regexp)

Description

The extract command paired with the regexp extraction strategy allows you to pull structured values from text strings. It's ideal for turning loosely formatted logs into something queryable.

Syntax

extract <source_field> into <target_field> using regexp(e=/<named_capture_group>/)

Example

Sample data

{
  "message": "user Chris has logged in"
}

Query

extract message into parsed using regexp(e=/user (?<username>.*) has logged in/)

Result

{
  "message": "user Chris has logged in",
  "parsed": {
    "username": "Chris"
  }
}

Now you can filter, count, or visualize by parsed.username, rather than relying on full-text search.
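
The named capture group is what produces the key inside parsed. The following Python sketch models that mapping under stated assumptions: extract_regexp is a hypothetical helper, the pattern is applied as a search rather than a full match, and Python spells named groups as (?P<name>...) where the query above uses (?<name>...).

```python
import re

# Same pattern as the query above, rewritten in Python's named-group syntax.
PATTERN = re.compile(r"user (?P<username>.*) has logged in")


def extract_regexp(doc, source_field, target_field, pattern):
    """Copy doc and store the named capture groups under target_field."""
    out = dict(doc)
    match = pattern.search(str(doc.get(source_field, "")))
    # Each named group becomes a key; a non-matching string yields None (null).
    out[target_field] = match.groupdict() if match else None
    return out


doc = extract_regexp(
    {"message": "user Chris has logged in"}, "message", "parsed", PATTERN
)
print(doc["parsed"])  # {'username': 'Chris'}
```

Note that .* is greedy, so it captures everything up to the last occurrence of " has logged in". For usernames that never contain spaces, a narrower pattern such as \S+ is a safer choice.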


Parse key-value strings into objects (extract + kv)

Description

The kv strategy for extract is ideal for parsing structured fields that follow key-value formatting (e.g., logfmt, URL query strings). It creates an object with separate keys for each parsed item.

Syntax

extract <source_field> into <target_object> using kv(pair_delimiter='&', key_delimiter='=')

Note

kv is only one of several extraction strategies; regexp, shown in the previous section, is another. Choose the one that best matches the format of your source field.

Example

Sample data

{
  "query_string": "user=chris&env=prod"
}

Query

extract query_string into query_params using kv(pair_delimiter='&', key_delimiter='=')

Result

{
  "query_string": "user=chris&env=prod",
  "query_params": {
    "user": "chris",
    "env": "prod"
  }
}

You can now access query_params.user and query_params.env directly in filters, visualizations, or enrichments.
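
The two delimiters do different jobs: pair_delimiter separates one pair from the next, and key_delimiter separates a key from its value. The Python sketch below models that two-step split. extract_kv is a hypothetical helper, and skipping malformed pairs is an assumption, not documented DataPrime behavior.

```python
def extract_kv(text, pair_delimiter="&", key_delimiter="="):
    """Split text into pairs, then split each pair at the first key_delimiter."""
    parsed = {}
    for pair in text.split(pair_delimiter):
        key, sep, value = pair.partition(key_delimiter)
        if not sep or not key:
            # Assumption: a pair with no key_delimiter or no key is skipped.
            continue
        parsed[key] = value
    return parsed


query_params = extract_kv("user=chris&env=prod")
print(query_params)  # {'user': 'chris', 'env': 'prod'}
```

For a logfmt-style string such as "level=info status=200", the same sketch would be called with pair_delimiter=" ".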


Explode arrays into multiple documents (explode)

Description

The explode command transforms a document with an array field into multiple documents, one per array element. This makes it easier to count, filter, or group by individual values inside arrays.

Syntax

explode <array_field> into <item_field> original [preserve|discard]
  • original preserve: Retains all original fields in each new document.
  • original discard: Only includes the exploded value in each new document.

Example

Sample data

{
  "userid": "1",
  "scopes": ["read", "write"]
}

Query

explode scopes into scope original preserve

Result

{ "userid": "1", "scope": "read", "scopes": ["read", "write"] }
{ "userid": "1", "scope": "write", "scopes": ["read", "write"] }

Each document now contains a single scope value, while keeping the original context (userid, scopes).
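
The difference between original preserve and original discard is easiest to see side by side. This Python sketch models the semantics described above. The explode function here is a hypothetical stand-in, not DataPrime's implementation, and treating a missing array field as an empty array is an assumption.

```python
def explode(doc, array_field, item_field, original="preserve"):
    """Return one new document per element of doc[array_field]."""
    docs = []
    for item in doc.get(array_field, []):
        # preserve: start from a copy of the source document.
        # discard: start from an empty document.
        new_doc = dict(doc) if original == "preserve" else {}
        new_doc[item_field] = item  # replaces any existing field of that name
        docs.append(new_doc)
    return docs


source = {"userid": "1", "scopes": ["read", "write"]}
preserved = explode(source, "scopes", "scope", original="preserve")
discarded = explode(source, "scopes", "scope", original="discard")
print(preserved)
print(discarded)
```

With preserve, each output document still carries userid, so you can group by user and scope together. With discard, only the scope values remain, which is enough for a simple count per value.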


Common pitfalls or gotchas

  • enrich only works if your lookup key is a string. Cast non-string keys before enriching.
  • extract using regexp will return null if the pattern doesn't match.
  • kv extraction assumes consistent formatting. Watch for missing delimiters or malformed strings.
  • explode overwrites destination fields if names collide. Choose an <item_field> name that is not already used in the document.
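
Three of these pitfalls can be reproduced in a few lines of Python. This is an analogy under stated assumptions, not DataPrime itself: the lookup table is modeled as a dictionary with string keys, and all names are hypothetical.

```python
import re

lookup_table = {"111": {"Name": "John", "Department": "Finance"}}

# Pitfall: a numeric key misses a string-keyed table until it is cast.
numeric_userid = 111
miss = lookup_table.get(numeric_userid)      # None: 111 is not "111"
hit = lookup_table.get(str(numeric_userid))  # found after casting

# Pitfall: a pattern that does not match produces no parsed object.
pattern = re.compile(r"user (?P<username>.*) has logged in")
match = pattern.search("user Chris has logged out")
parsed = match.groupdict() if match else None

# Pitfall: exploding into a name that already exists replaces the old value.
doc = {"scope": "global", "scopes": ["read", "write"]}
exploded = [{**doc, "scope": item} for item in doc["scopes"]]

print(miss, hit, parsed)
print(exploded)  # the original "global" value is gone from every document
```

In a real query, the equivalent safeguards are to cast the key before enriching, filter out null results after a regexp extraction, and pick an unused destination name for explode.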