How to Combine Source Fields into One Column
In this recipe we’ll learn how to combine the data from fields in our data source into a single column in Apache Pinot.
Pinot Version | 0.10.0 |
---|---|
Code | startreedata/pinot-recipes/combine-fields |
Prerequisites
To follow the code examples in this guide, you must install Docker locally and download recipes.
Navigate to recipe
- If you haven’t already, download recipes.
- In a terminal, go to the recipe by running the following command:
Launch Pinot Cluster
You can spin up a Pinot Cluster by running the following command:
This command will run a single instance of the Pinot Controller, Pinot Server, Pinot Broker, and Zookeeper. You can find the docker-compose.yml file on GitHub.
Dataset
We’re going to import the following JSON file:
data/movies.json
Pinot Schema and Table
Now let’s create a Pinot Schema and Table.
First, the schema:
config/schema.json
You can create the schema by running the following command:
We’ll also have the following table config:
config/table.json
The highlighted section contains a transformation function that concatenates the name and surname fields, separated by a space.
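To make the effect of the transformation concrete, here is a minimal Python sketch. It only illustrates what the transform does to each record and is not Pinot's implementation. The config fragment is an assumption: the column name fullName comes from the query output later in this recipe, and the `concat(name, surname, ' ')` expression is one plausible way to write it, so the actual config/table.json may differ. The sample records are inferred from the output table.

```python
import json

# Hypothetical fragment of a table config's ingestionConfig section.
# The exact transformFunction expression is an assumption, not copied
# from the recipe's config/table.json.
transform_config = {
    "ingestionConfig": {
        "transformConfigs": [
            {
                "columnName": "fullName",
                "transformFunction": "concat(name, surname, ' ')",
            }
        ]
    }
}


def combine_fields(record, separator=" "):
    """Emulate the transform: join name and surname into fullName."""
    combined = dict(record)  # copy so the source record is left untouched
    combined["fullName"] = f"{record['name']}{separator}{record['surname']}"
    return combined


# Sample records inferred from the output table in this recipe.
rows = [
    {"name": "Pete", "surname": "Smith"},
    {"name": "John", "surname": "Jones"},
]

print(json.dumps(transform_config, indent=2))
for row in rows:
    print(combine_fields(row)["fullName"])
```

The source fields name and surname do not need to be stored as columns themselves; in this sketch they are only inputs to the derived fullName value.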
You can create the table by running the following command:
Ingestion Job
Now we’re going to import the JSON file into Pinot. We’ll do this with the following ingestion spec:
config/job-spec.yml
You can run the following command to run the import:
Querying
Once that’s completed, navigate to localhost:9000/#/query and click on the people table or copy/paste the following query:
You will see the following output:
fullName |
---|
Pete Smith |
John Jones |
Query Results
We can see that the name and surname fields from our JSON file have been combined into a single fullName column for each person.
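The same check can be run outside the query console. The sketch below posts a SQL statement to the Pinot Broker's `/query/sql` endpoint using only the Python standard library. Several details are assumptions: the broker address `localhost:8099` (Pinot's default broker port, which depends on the port mapping in your docker-compose.yml), the query text, and the helper names `build_query_request`, `extract_rows`, and `run_query`, which are hypothetical.

```python
import json
import urllib.request

# Assumed broker address; adjust to match your docker-compose.yml port mapping.
BROKER_URL = "http://localhost:8099/query/sql"


def build_query_request(sql, url=BROKER_URL):
    """Build a POST request carrying the SQL statement as a JSON body."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_rows(response_json):
    """Pull the result rows out of a broker response dictionary."""
    return response_json.get("resultTable", {}).get("rows", [])


def run_query(sql, url=BROKER_URL):
    """Send the query to the broker and return the result rows."""
    with urllib.request.urlopen(build_query_request(sql, url)) as response:
        return extract_rows(json.load(response))
```

With the cluster running and the data ingested, calling `run_query("select fullName from people limit 10")` should return one row per person, each holding the combined fullName value.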