forceCommit
The forceCommit API immediately commits consuming segments. It is typically used after making stream-compatible changes to the table config, such as changing the segment.threshold parameters.
Pinot Version | 1.0.0 |
---|---|
Code | startreedata/pinot-recipes/force-commit |
Prerequisites
You will need to install Docker to follow the code examples in this guide.
Navigate to recipe
- If you haven’t already, download recipes.
- In a terminal, go to the recipe by running the following command:
Launch Pinot Cluster
You can spin up a Pinot Cluster by running the following command:
Data generator
This recipe contains a data generator that creates events with a timestamp, count, and UUID. You can generate data by running the following command:
Kafka ingestion
We’re going to ingest this data into an Apache Kafka topic using the kcat command line tool. We’ll also use jq to arrange the data in the key:payload structure that Kafka expects:
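As a rough Python equivalent of the key:payload reshaping step, the sketch below turns one generated event into a single `key:payload` line. The field names `ts`, `count`, and `uuid`, and the choice of the UUID as the message key, are assumptions made for illustration; the recipe's actual generator output may differ.

```python
import json


def to_keyed_line(event: dict, key_field: str = "uuid", sep: str = ":") -> str:
    """Return '<key><sep><compact JSON payload>' for one event.

    key_field and sep are illustrative defaults, not values taken from the recipe.
    """
    key = str(event[key_field])
    # Compact separators keep the payload on one line with no extra spaces.
    payload = json.dumps(event, separators=(",", ":"))
    return f"{key}{sep}{payload}"


if __name__ == "__main__":
    # Hypothetical event shaped like the generator's described output.
    sample = {"ts": 1685711067000, "count": 7, "uuid": "a1b2"}
    print(to_keyed_line(sample))
```

Note that the JSON payload itself contains colons, so whatever consumes these lines needs to split on the first separator only.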
Pinot Schema and Table
Now let’s create a Pinot Schema and Table. First, the schema:
Querying by segment
Once that’s been created, we can head over to the Pinot UI and run some queries. Pinot has several built-in virtual columns inside every schema that can be used for debugging purposes:
Column Name | Column Type | Data Type | Description |
---|---|---|---|
$hostName | Dimension | STRING | Name of the server hosting the data |
$segmentName | Dimension | STRING | Name of the segment containing the record |
$docId | Dimension | INT | Document id of the record within the segment |
The one we’re interested in here is $segmentName, which we can use to count the number of records in each segment:
$segmentName | maxTs | count(*) |
---|---|---|
events__0__1__20230602T1305Z | 2023-06-02 13:04:34 | 1423028 |
events__0__0__20230602T1304Z | 2023-06-02 13:04:27 | 2500000 |
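A per-segment count like the one above could be produced with a query along the following lines. This is a minimal sketch, not the recipe's own query: the table name `events` is inferred from the segment names, the timestamp column name `ts` is an assumption, and the broker address `http://localhost:8099` assumes a default local setup. It returns the raw maximum timestamp rather than the formatted value shown in the table.

```python
import json
import urllib.request

BROKER_URL = "http://localhost:8099"  # assumption: broker on its default local port


def segment_count_sql(table: str = "events", ts_column: str = "ts") -> str:
    """Build a query that counts records per segment via the $segmentName virtual column."""
    return (
        f"SELECT $segmentName, max({ts_column}) AS maxTs, count(*) "
        f"FROM {table} "
        "GROUP BY $segmentName "
        f"ORDER BY max({ts_column}) DESC "
        "LIMIT 100"
    )


def build_query_request(sql: str, broker_url: str = BROKER_URL) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the broker's SQL query endpoint."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        f"{broker_url}/query/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql: str, broker_url: str = BROKER_URL) -> dict:
    """Send the query; requires a running Pinot broker."""
    with urllib.request.urlopen(build_query_request(sql, broker_url)) as resp:
        return json.load(resp)
```

Calling `run_query(segment_count_sql())` against a running cluster should return the broker's JSON response; the same SQL can also be pasted into the query console in the Pinot UI.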
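Next, we call the forceCommit API. Running the segment query again shows that the previously consuming segment has been committed and that new segments have been created: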
$segmentName | maxTs | count(*) |
---|---|---|
events__0__5__20230602T1305Z | 2023-06-02 13:04:55 | 1000000 |
events__0__4__20230602T1305Z | 2023-06-02 13:04:50 | 1000000 |
events__0__3__20230602T1305Z | 2023-06-02 13:04:45 | 1000000 |
events__0__2__20230602T1305Z | 2023-06-02 13:04:39 | 1000000 |
events__0__1__20230602T1305Z | 2023-06-02 13:04:34 | 1423028 |
events__0__0__20230602T1304Z | 2023-06-02 13:04:27 | 2500000 |
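The forceCommit call itself can be sketched as a single HTTP request to the Pinot controller. To my understanding the controller exposes this as `POST /tables/{tableName}/forceCommit`; the controller address `http://localhost:9000` and the table name `events` (inferred from the segment names) are assumptions for a default local setup.

```python
import urllib.parse
import urllib.request

CONTROLLER_URL = "http://localhost:9000"  # assumption: controller on its default local port


def build_force_commit_request(
    table: str, controller_url: str = CONTROLLER_URL
) -> urllib.request.Request:
    """Prepare (but do not send) the forceCommit request for one table."""
    # Quote the table name so unusual characters cannot break the URL path.
    path = f"/tables/{urllib.parse.quote(table, safe='')}/forceCommit"
    return urllib.request.Request(f"{controller_url}{path}", method="POST")


def force_commit(table: str, controller_url: str = CONTROLLER_URL) -> str:
    """Send the request; requires a running Pinot controller. Returns the raw response body."""
    with urllib.request.urlopen(build_force_commit_request(table, controller_url)) as resp:
        return resp.read().decode("utf-8")
```

With a cluster running, `force_commit("events")` would ask the controller to commit the table's consuming segments, after which the per-segment query can be re-run to check the result.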