JSON Unnesting
JSON unnesting in Apache Pinot, powered by StarTree, makes it easier to work with deeply nested and semi-structured data. It lets you extract nested fields from JSON objects and arrays into flat, queryable columns during ingestion.
Whether you’re dealing with clickstream data, telemetry logs, or API responses, unnesting simplifies data modeling and enhances query performance.
Why JSON Unnesting?
- Simplifies Queries: Flatten nested data for easier filtering, aggregations, and joins.
- Improves Performance: Avoids expensive JSON path evaluations at query time.
- Streamlines Data Modeling: Convert complex structures into relational form at ingestion.
Enabling JSON Unnesting in StarTree
JSON unnesting can be configured via:
- The StarTree Data Portal UI, or
- The ingestion config JSON, using transformConfigs and unnestConfig.
Examples
The following examples show how to unnest a JSON object and a JSON array.
Example 1: Unnesting a JSON Object
Input JSON:
Unnesting Config:
Resulting Schema:
user_id | user_name | event |
---|---|---|
123 | Alice | click |
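
To make the flattening concrete, here is a minimal Python sketch of what object unnesting does conceptually. It is an illustration only, not StarTree's or Pinot's implementation: the function name `flatten_object`, the `_` delimiter, and the input record are all assumptions, with the input reconstructed from the result table above.

```python
from typing import Any


def flatten_object(
    record: dict[str, Any], delimiter: str = "_", prefix: str = ""
) -> dict[str, Any]:
    """Flatten nested dicts into one level, joining key paths with `delimiter`."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        # Build the column name from the path walked so far.
        column = f"{prefix}{delimiter}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the path as the prefix.
            flat.update(flatten_object(value, delimiter, column))
        else:
            flat[column] = value
    return flat


# Hypothetical input record (assumed; inferred from the result table).
event = {"user": {"id": 123, "name": "Alice"}, "event": "click"}
row = flatten_object(event)
print(row)  # {'user_id': 123, 'user_name': 'Alice', 'event': 'click'}
```

Each nested key path becomes one column name, which is why `user.id` appears as `user_id` in the result.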
Example 2: Unnesting an Array into Multiple Rows
Input JSON:
Unnesting Config:
Resulting Schema:
orderId | items_sku | items_qty |
---|---|---|
A001 | X1 | 2 |
A001 | X2 | 1 |
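
The array case can be sketched the same way. The snippet below is a hedged illustration of the row-expansion idea, not the actual ingestion logic: `unnest_array`, the `_` delimiter, and the input record are assumptions, with the input inferred from the result table above.

```python
from typing import Any


def unnest_array(
    record: dict[str, Any], field: str, delimiter: str = "_"
) -> list[dict[str, Any]]:
    """Emit one flat row per element of record[field].

    Fields outside the array are copied onto every output row.
    """
    parent = {k: v for k, v in record.items() if k != field}
    rows: list[dict[str, Any]] = []
    for element in record.get(field, []):
        row = dict(parent)  # copy the parent fields for this row
        for key, value in element.items():
            # Prefix element keys with the array field name.
            row[f"{field}{delimiter}{key}"] = value
        rows.append(row)
    return rows


# Hypothetical input record (assumed; inferred from the result table).
order = {
    "orderId": "A001",
    "items": [{"sku": "X1", "qty": 2}, {"sku": "X2", "qty": 1}],
}
rows = unnest_array(order, "items")
for r in rows:
    print(r)
```

In this sketch an empty or missing array yields no rows at all. Whether the real ingestion path keeps or drops such records is not stated here, so check that behavior before relying on it.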