In this recipe we’ll learn how to unnest or explode values when ingesting JSON documents into Apache Pinot. When we unnest an array it means that we’ll create one row in Pinot for each item in the array rather than one row for the whole JSON document.Documentation Index
Fetch the complete documentation index at: https://docs.startree.ai/llms.txt
Use this file to discover all available pages before exploring further.
| Pinot Version | 0.10.0 |
|---|---|
| Code | startreedata/pinot-recipes/json-unnest |
Prerequisites
To follow the code examples in this guide, you must install Docker locally and download recipes.Navigate to recipe
- If you haven’t already, download recipes.
- In terminal, go to the recipe by running the following command:
Launch Pinot Cluster
You can spin up a Pinot Cluster by running the following command:Dataset
We’re going to import the following JSON file:showLineNumbers
Ratings property of each movie. We’re going to unnest or explode the ratings so that we have one row per rating in our database after the data is ingested.
Pinot Schema and Table
Now let’s create a Pinot Schema and Table. First, the schema:Ratings.
We’ll also have the following table config:
Ratings field contains an array of JSON objects and that array will be exploded to create one row for each item in the array.
If you want to rename the unnested fields, see How to rename fields when unnesting arrays in JSON documents
Ingestion Job
Now we’re going to import the JSON file into Pinot. We’ll do this with the following ingestion spec:movie_ratings schema. If one of the fields doesn’t exist in the schema it will be skipped.
You can run the following command to run the import:
Querying
Once that’s completed, navigate to localhost:9000/#/query and click on themovie_ratings table or copy/paste the following query:
| Rated | Ratings.Source | Ratings.Value | Title |
|---|---|---|---|
| R | Internet Movie Database | 8.7/10 | The Matrix |
| R | Rotten Tomatoes | 88% | The Matrix |
| R | Metacritic | 73/100 | The Matrix |
| PG-13 | Internet Movie Database | 7.8/10 | Avatar |
| PG-13 | Rotten Tomatoes | 82% | Avatar |
| PG-13 | Metacritic | 83/100 | Avatar |
| PG | Internet Movie Database | 6.3/10 | The Boss Baby |
| PG | Rotten Tomatoes | 53% | The Boss Baby |
| PG | Metacritic | 50/100 | The Boss Baby |
| TV-PG | Internet Movie Database | 5.2/10 | Tall Girl |
| TV-PG | Rotten Tomatoes | 38% | Tall Girl |
| TV-MA | Internet Movie Database | 6.3/10 | Love Hard |
| TV-MA | Rotten Tomatoes | 57% | Love Hard |
| TV-MA | Metacritic | 42/100 | Love Hard |

