Pinot Version | 1.0.0 |
---|---|
Code | startreedata/pinot-recipes/csv-files-spaces-column-names |
Prerequisites
To follow the code examples in this guide, you must install Docker locally and download recipes.
Navigate to recipe
- If you haven’t already, download recipes.
- In the terminal, navigate to this recipe’s directory:
Launch Pinot Cluster
Launch a Pinot Cluster:
Dataset
We’re going to import the following CSV file, in which the Case Number column heading contains a space:
ID | Case Number |
---|---|
10224738 | HY411648 |
10224739 | HY411615 |
11646166 | JC213529 |
10224740 | HY411595 |
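To make the problem concrete, here is a small Python sketch that parses the sample rows and lists the header names that contain a space. It assumes the file is comma-delimited with a single header line; the `SAMPLE_CSV` text is typed out from the table above rather than read from the recipe's actual file.

```python
import csv
import io

# Sample rows copied from the table above, written as comma-delimited text.
# The delimiter is an assumption; the recipe's file may differ.
SAMPLE_CSV = """ID,Case Number
10224738,HY411648
10224739,HY411615
11646166,JC213529
10224740,HY411595
"""

reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
rows = list(reader)

# Field names come straight from the header line, spaces included.
headers_with_spaces = [name for name in reader.fieldnames if " " in name]

print(reader.fieldnames)     # ['ID', 'Case Number']
print(headers_with_spaces)   # ['Case Number']
```

Any header that shows up in `headers_with_spaces` needs a differently named column on the Pinot side.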
Pinot Schema and Table
Next we create a Pinot Schema and Table. A common pattern when creating a schema is to create columns that map directly to the names of the fields in our data source. We can’t do that in this case since column names can’t contain spaces, so instead we’ll use the following:
The entry under ingestionConfig.transformConfigs makes sure that data in the Case Number field in the data source is ingested into the CaseNumber column of the table. To learn more about writing these functions, see the ingestion transformation documentation.
Ingestion Job
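Before running the ingestion job, it can help to see the schema and the transform entry written out. The sketch below builds both as Python dictionaries and prints them as JSON. It is an illustration, not the recipe's own files: the schema name, the data types, and the reduced set of table config keys are assumptions. So is the double-quoted form used to reference the Case Number source field, so check it against the ingestion transformation documentation.

```python
import json

# Schema sketch: the Pinot column is CaseNumber because column names
# can't contain spaces. Schema name and data types are assumptions.
schema = {
    "schemaName": "crimes",
    "dimensionFieldSpecs": [
        {"name": "ID", "dataType": "INT"},
        {"name": "CaseNumber", "dataType": "STRING"},
    ],
}

# Table config fragment: only the keys relevant to this recipe are shown.
# The transform maps the "Case Number" source field onto the CaseNumber column.
# Quoting the source field name with double quotes is an assumption.
table_config = {
    "tableName": "crimes",
    "tableType": "OFFLINE",
    "ingestionConfig": {
        "transformConfigs": [
            {
                "columnName": "CaseNumber",
                "transformFunction": '"Case Number"',
            }
        ]
    },
}

print(json.dumps(schema, indent=2))
print(json.dumps(table_config, indent=2))
```

Keeping the rename in the table config means the CSV file itself never has to be edited.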
Next, we import the CSV file into Pinot. We’ll do this with the following ingestion spec:
Querying
Once that’s completed, navigate to localhost:9000/#/query and click on the crimes table or copy/paste the following query:
CaseNumber | ID |
---|---|
HY411648 | 10224738 |
HY411615 | 10224739 |
JC213529 | 11646166 |
HY411595 | 10224740 |
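The same check can be scripted instead of run in the query console. The sketch below is a minimal example under several assumptions: the query text is a guess at a suitable query, not the recipe's own; the controller on localhost:9000 is assumed to accept SQL as a JSON POST to /sql; and the response is assumed to carry a resultTable with column names and rows. `run_query` and `rows_as_dicts` are hypothetical helper names.

```python
import json
import urllib.request

# Assumed query; the recipe's own query text is not reproduced here.
QUERY = "select * from crimes limit 10"

# Assumed endpoint on the controller that serves the query console.
CONTROLLER_SQL_URL = "http://localhost:9000/sql"


def run_query(sql, url=CONTROLLER_SQL_URL):
    """POST a SQL query as JSON and return the decoded JSON response."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def rows_as_dicts(response):
    """Pair each result row with the column names from the response schema."""
    table = response["resultTable"]
    columns = table["dataSchema"]["columnNames"]
    return [dict(zip(columns, row)) for row in table["rows"]]
```

With the cluster running and the ingestion job finished, `rows_as_dicts(run_query(QUERY))` should return one dictionary per row, keyed by CaseNumber and ID rather than by the original Case Number heading.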