Pinot Version | 1.0.0 |
---|---|
Code | startreedata/pinot-recipes/chaining-transformation-functions |
Prerequisites
To follow the code examples in this guide, you must install Docker locally and download recipes.

Navigate to recipe
- If you haven’t already, download recipes.
- In a terminal, go to the recipe by running the following command:
Launch Pinot Cluster
You can spin up a Pinot Cluster by running the following command:

Dataset
We’re going to import the following JSON file:

We want to split each `userId` value into its parts and store them in individual columns.
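The file contents are not reproduced here. As a rough picture, assuming each line is a JSON document with a single top-level `payload` object holding a `userId` string, the input can be sketched in Python as below. The `userId` values are taken from the query results at the end of this guide; the one-document-per-line layout and the absence of other fields are assumptions.

```python
import json

# Hypothetical input, one JSON document per line. The userId values come from
# the query results in this guide; the exact file layout is an assumption.
SAMPLE_LINES = """\
{"payload": {"userId": "3287651__David Smith"}}
{"payload": {"userId": "4987622__Jenny Jones"}}
{"payload": {"userId": "1965900__Stephen Davis"}}
"""

documents = [json.loads(line) for line in SAMPLE_LINES.splitlines() if line.strip()]

for doc in documents:
    # Each document has exactly one top-level key, "payload".
    print(list(doc.keys()), doc["payload"]["userId"])
```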
Pinot Schema and Table
Now let’s create a Pinot Schema and Table. First, the schema:

The `userId` column will store the `userId` value from the JSON document. The `name` and `id` columns will store values extracted from the `userId`.
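For orientation, here is a minimal sketch of what such a schema could look like, written as a Python dictionary and serialized to JSON. The schema name `people` and the three column names come from this guide; the `STRING` data types and the use of dimension fields for all three columns are assumptions.

```python
import json

# Sketch of a Pinot schema for the three columns described above.
# Data types and the dimension-only layout are assumptions.
schema = {
    "schemaName": "people",
    "dimensionFieldSpecs": [
        {"name": "id", "dataType": "STRING"},
        {"name": "name", "dataType": "STRING"},
        {"name": "userId", "dataType": "STRING"},
    ],
}

# Print the JSON form, which is the shape Pinot schema files use.
print(json.dumps(schema, indent=2))
```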
We’ll also have the following table config:

This config defines transformation functions (under `ingestionConfig.transformConfigs`) that do the following:

- Extract `payload.userId` using the jsonPathString function.
- Split the corresponding string on `__` and extract the `id` and `name` using Groovy transformation functions.
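To make the chaining concrete, the sketch below traces the same steps in plain Python rather than in Pinot. It assumes the transform configs take roughly the form shown in the comments: a `jsonPathString` call followed by two `Groovy` expressions that read the derived `userId` column. Those expressions are an illustration, not a copy of the recipe's config, and the helper names are hypothetical.

```python
from typing import Tuple

# Assumed shape of the three transform configs (illustrative only):
#   userId <- jsonPathString(payload, '$.userId')
#   id     <- Groovy({userId.split('__')[0]}, userId)
#   name   <- Groovy({userId.split('__')[1]}, userId)


def extract_user_id(document: dict) -> str:
    """Stand-in for the jsonPathString step: read payload.userId."""
    return document["payload"]["userId"]


def split_user_id(user_id: str) -> Tuple[str, str]:
    """Stand-in for the Groovy steps: split on '__' into (id, name)."""
    parts = user_id.split("__")
    return parts[0], parts[1]


def transform(document: dict) -> dict:
    """Chain the steps: the second step consumes the first step's output."""
    user_id = extract_user_id(document)
    id_part, name_part = split_user_id(user_id)
    return {"id": id_part, "name": name_part, "userId": user_id}


row = transform({"payload": {"userId": "3287651__David Smith"}})
print(row)
# {'id': '3287651', 'name': 'David Smith', 'userId': '3287651__David Smith'}
```

The point of the chain is that `id` and `name` are computed from the already-derived `userId` value, not from the raw `payload` field.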
Ingestion Job
Now we’re going to import the JSON file into Pinot. We’ll do this with the following ingestion spec:

The import job will map fields in each JSON document to a corresponding column in the `people` schema. If one of the fields doesn’t exist in the schema, it will be skipped.

In this case, our JSON documents have only one top-level field, `payload`, which doesn’t have a corresponding column in the schema. Instead, transformation functions extract the `payload.userId` field and then store parts of it in different columns.
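The mapping rule can be sketched in Python as follows. This is a simplified model of the behavior described above, not Pinot's implementation: top-level fields that share a name with a schema column are copied, unmatched fields are dropped, and derived columns are filled in by transformation functions that can still read the dropped field. The function and variable names are hypothetical.

```python
SCHEMA_COLUMNS = ["id", "name", "userId"]


def derive_columns(document: dict) -> dict:
    """Stand-in for the transformation functions described in this guide."""
    user_id = document["payload"]["userId"]
    parts = user_id.split("__")
    return {"userId": user_id, "id": parts[0], "name": parts[1]}


def to_row(document: dict) -> dict:
    # Copy top-level fields that share a name with a schema column.
    row = {k: v for k, v in document.items() if k in SCHEMA_COLUMNS}
    # "payload" has no matching column, so it is not copied into the row,
    # but the transformation functions still read from it.
    row.update(derive_columns(document))
    return row


row = to_row({"payload": {"userId": "4987622__Jenny Jones"}})
print(sorted(row))       # column names present in the final row
print("payload" in row)  # whether the unmatched field was kept
```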
You can run the following command to run the import:
Querying
Once that’s completed, navigate to localhost:9000/#/query and click on the `people` table or copy/paste the following query:
id | name | userId |
---|---|---|
3287651 | David Smith | 3287651__David Smith |
4987622 | Jenny Jones | 4987622__Jenny Jones |
1965900 | Stephen Davis | 1965900__Stephen Davis |