In this recipe we’ll learn how to infer a Pinot schema from a JSON input file.

| Pinot Version | 0.9.0 |
|---|---|
| Code | startreedata/pinot-recipes/infer-schema-json-data |
Prerequisites
To follow the code examples in this guide, you must install Docker locally and download recipes.
Navigate to recipe
- If you haven’t already, download recipes.
- In terminal, go to the recipe by running the following command:
JSON Data
We’re going to infer a Pinot schema from the following input file:
Infer schema
Now we’re going to infer a schema for this input file. We can do this using the JsonToPinotSchema command.
Schema with only dimension fields
You can generate a schema with one dimension field per JSON field by running the following command:
The schema is written to ./config/github.json, the contents of which are shown below:
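
To make "one dimension field per JSON field" concrete, here is a minimal Python sketch of the idea. It is not the JsonToPinotSchema implementation and not the generated file: the sample record is made up, the type mapping in `pinot_type` is an assumption of this sketch, and nested objects are flattened with a `.` delimiter while arrays are kept whole as strings.

```python
import json


def pinot_type(value):
    """Map a JSON value to a Pinot data type name (assumed mapping, for illustration)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int in Python
        return "BOOLEAN"
    if isinstance(value, int):
        return "LONG"
    if isinstance(value, float):
        return "DOUBLE"
    return "STRING"  # strings, nulls and un-unnested arrays all end up as strings here


def flatten(obj, prefix="", delimiter="."):
    """Flatten nested objects into dotted field names; arrays are left untouched."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{delimiter}{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, delimiter))
        else:
            flat[name] = value
    return flat


def infer_dimension_schema(record, schema_name):
    """Build a schema dict in which every flattened field is a dimension."""
    fields = [
        {"name": name, "dataType": pinot_type(value)}
        for name, value in flatten(record).items()
    ]
    return {"schemaName": schema_name, "dimensionFieldSpecs": fields}


# Hypothetical sample record, not the recipe's input file.
sample = {
    "id": "1",
    "created_at": "2021-01-01T00:00:00Z",
    "payload": {"size": 2, "commits": [{"sha": "a"}, {"sha": "b"}]},
}

schema = infer_dimension_schema(sample, "github")
print(json.dumps(schema, indent=2))
```

In this sketch `payload.commits` comes out as a single STRING dimension, which mirrors the behaviour described later in this recipe for arrays that are not unnested.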
Schema with time column
We should probably specify the created_at field as a date time field, which we can do using the following command:
The schema is written to ./config/github.json, the contents of which are shown below:
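
As a rough illustration of what changes in the schema, the Python sketch below moves one column out of `dimensionFieldSpecs` and into `dateTimeFieldSpecs`. It is not the tool's output: the helper `promote_to_datetime` is hypothetical, and the `format` and `granularity` values are placeholders chosen for this example rather than the values the recipe produces.

```python
def promote_to_datetime(schema, column, fmt, granularity):
    """Return a copy of `schema` with `column` declared as a date time field."""
    dimensions = []
    date_times = list(schema.get("dateTimeFieldSpecs", []))
    for field in schema["dimensionFieldSpecs"]:
        if field["name"] == column:
            # A date time field carries a format and a granularity in addition
            # to its name and data type.
            date_times.append({**field, "format": fmt, "granularity": granularity})
        else:
            dimensions.append(field)
    return {
        **schema,
        "dimensionFieldSpecs": dimensions,
        "dateTimeFieldSpecs": date_times,
    }


# Hypothetical dimension-only schema used as the starting point.
dims_only = {
    "schemaName": "github",
    "dimensionFieldSpecs": [
        {"name": "id", "dataType": "STRING"},
        {"name": "created_at", "dataType": "STRING"},
    ],
}

with_time = promote_to_datetime(
    dims_only,
    column="created_at",
    fmt="1:SECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd'T'HH:mm:ss'Z'",  # placeholder format
    granularity="1:SECONDS",  # placeholder granularity
)
print(with_time)
```

The input dictionary is left unchanged, so the dimension-only and time-column variants can be compared side by side.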
Schema with unnested fields
At the moment the payload.commits JSON array is stored as a string. We can unnest the array, so that each of the documents it contains is expanded into its own fields, by running the following command:
The schema is written to ./config/github_unnest.json, the contents of which are shown below:
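
The effect of unnesting can be sketched in a few lines of Python. This is an illustration of the general idea only, under the assumption that each array element becomes its own row and that its fields are prefixed with the array's path; the `unnest` helper and the sample record are hypothetical and are not taken from Pinot or from the recipe's data.

```python
def flatten(obj, prefix="", delimiter="."):
    """Flatten nested objects into dotted field names; arrays are left untouched."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{delimiter}{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, delimiter))
        else:
            flat[name] = value
    return flat


def unnest(record, field, delimiter="."):
    """Return one flat row per element of the array stored at `field`."""
    flat = flatten(record, delimiter=delimiter)
    elements = flat.pop(field, None)
    if not isinstance(elements, list) or not elements:
        # Nothing to unnest: restore the original value and return a single row.
        if elements is not None:
            flat[field] = elements
        return [flat]
    rows = []
    for element in elements:
        row = dict(flat)  # copy the fields shared by every element
        if isinstance(element, dict):
            row.update(flatten(element, field, delimiter))
        else:
            row[field] = element
        rows.append(row)
    return rows


# Hypothetical sample record with a nested array of commit documents.
sample = {
    "id": "1",
    "payload": {
        "commits": [
            {"sha": "a", "message": "first"},
            {"sha": "b", "message": "second"},
        ]
    },
}

rows = unnest(sample, "payload.commits")
for row in rows:
    print(row)
```

Each resulting row has fields such as `payload.commits.sha` and `payload.commits.message`, which is why a schema inferred after unnesting can contain one column per field of the array's documents instead of a single string column.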

