Partial Upserts in Pinot
To learn how to perform partial upserts on a real-time table, watch the following video, or complete the tutorial below.
To get a better understanding of upserts and how they work, see the Full Upserts documentation.
Pinot Version | 1.0.0 |
---|---|
Code | startreedata/pinot-recipes/partial-upserts |
Prerequisites
To follow the code examples in this guide, you must install Docker locally and download recipes.
Navigate to recipe
- If you haven’t already, download recipes.
- In a terminal, go to the recipe by running the following command:
Launch Pinot Cluster
You can spin up a Pinot Cluster by running the following command:
This command will run a single instance of the Pinot Controller, Pinot Server, Pinot Broker, Zookeeper, and Kafka. You can find the docker-compose.yml file on GitHub.
Create meetup_rsvp Kafka topic
This recipe explores capturing RSVPs for meetup events. A meetup can be hosted by multiple groups, at multiple venues.
Let’s create the meetup_rsvp topic in Kafka to record the RSVPs.
Pinot Schema and Table
Now let’s create a Pinot Schema and Table.
First, the schema:
config/meetup_rsvp_schema.json
Note that the event_id column is designated as the primary key, which is mandatory for upserts in Pinot.
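As a rough illustration of where the primary key is declared, here is a minimal sketch that builds a schema as a Python dictionary and prints it as JSON. Only the column names event_id, group_name, venue_name, and rsvp_count come from this guide; the data types and the mtime time column are assumptions, so the recipe's actual config/meetup_rsvp_schema.json may differ.

```python
import json

# Sketch of a Pinot schema for meetup_rsvp.
# Assumptions: all data types, and the "mtime" time column.
schema = {
    "schemaName": "meetup_rsvp",
    # Upserts require a primary key; records that share it are merged.
    "primaryKeyColumns": ["event_id"],
    "dimensionFieldSpecs": [
        {"name": "event_id", "dataType": "STRING"},
        # Multi-value columns, so several groups and venues can be stored.
        {"name": "group_name", "dataType": "STRING", "singleValueField": False},
        {"name": "venue_name", "dataType": "STRING", "singleValueField": False},
    ],
    "metricFieldSpecs": [
        {"name": "rsvp_count", "dataType": "INT"},
    ],
    "dateTimeFieldSpecs": [
        {
            "name": "mtime",  # hypothetical time column
            "dataType": "LONG",
            "format": "1:MILLISECONDS:EPOCH",
            "granularity": "1:MILLISECONDS",
        }
    ],
}

print(json.dumps(schema, indent=2))
```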
We’ll also have the following table config:
config/orders_table.json
In this table configuration, we only enable upserts on three columns: rsvp_count, group_name, and venue_name. Hence, the mode is set to PARTIAL.
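The upsert-related part of such a table config might look like the sketch below, written as a Python dictionary and printed as JSON. The mapping of columns to strategies is inferred from the behavior described in this guide (count incremented, group added if absent, venue appended); the rest of the table config, such as the stream settings, is omitted, and the recipe's actual file may differ.

```python
import json

# Sketch of the upsert-related portion of a real-time table config.
# Assumption: the strategy chosen for each column is inferred from the
# behavior described in the text, not copied from the recipe.
table_config_fragment = {
    "tableName": "meetup_rsvp",
    "tableType": "REALTIME",
    # Upsert tables route each query to a single replica group.
    "routing": {"instanceSelectorType": "strictReplicaGroup"},
    "upsertConfig": {
        "mode": "PARTIAL",
        # Columns not listed here fall back to the default strategy,
        # which overwrites the old value with the new one.
        "partialUpsertStrategies": {
            "rsvp_count": "INCREMENT",
            "group_name": "UNION",
            "venue_name": "APPEND",
        },
    },
}

print(json.dumps(table_config_fragment, indent=2))
```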
When using the APPEND strategy, you must make sure that the corresponding column can accept multiple values, by specifying the following config:
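A minimal sketch of that requirement, assuming the setting in question is the schema's singleValueField flag set to false on the column. The helper below is hypothetical (it is not part of Pinot) and simply lists columns whose strategy needs multiple values but whose field spec is still single-valued. Treating UNION the same way as APPEND is also an assumption.

```python
# Strategies assumed to need a multi-value column.
MULTI_VALUE_STRATEGIES = {"APPEND", "UNION"}


def single_valued_offenders(field_specs, strategies):
    """Return columns whose strategy needs a multi-value column
    but whose field spec is single-valued."""
    by_name = {spec["name"]: spec for spec in field_specs}
    offenders = []
    for column, strategy in strategies.items():
        spec = by_name.get(column, {})
        # A field is single-valued unless singleValueField is set to false.
        if strategy in MULTI_VALUE_STRATEGIES and spec.get("singleValueField", True):
            offenders.append(column)
    return offenders


field_specs = [
    # Multi-value: can hold several venues.
    {"name": "venue_name", "dataType": "STRING", "singleValueField": False},
    # Flag omitted on purpose to show what the check reports.
    {"name": "group_name", "dataType": "STRING"},
]
strategies = {"venue_name": "APPEND", "group_name": "UNION"}

print(single_valued_offenders(field_specs, strategies))
```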
When an RSVP event is ingested with an event_id that already exists in the table, rsvp_count is incremented by the incoming value.
The name of the group is added to the group_name column if it is not already present.
Also, the venue is appended to the venue_name column.
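To make the merge behavior concrete, the following sketch simulates it in plain Python. It is an illustration of the semantics described above, not Pinot's implementation: the strategy mapping, the sample records, and the mtime field are assumptions, and details such as null handling and the element order Pinot produces are not modeled.

```python
from typing import Any, Dict, List, Optional

# Assumed strategy per column; unlisted columns are overwritten.
STRATEGIES = {
    "rsvp_count": "INCREMENT",
    "group_name": "UNION",
    "venue_name": "APPEND",
}


def merge_record(previous: Optional[Dict[str, Any]], incoming: Dict[str, Any]) -> Dict[str, Any]:
    """Merge an incoming record into the previous record with the same key."""
    if previous is None:
        # First record for this key: store it as-is (copying lists).
        return {k: list(v) if isinstance(v, list) else v for k, v in incoming.items()}
    merged = dict(previous)
    for column, new_value in incoming.items():
        strategy = STRATEGIES.get(column, "OVERWRITE")
        old_value = previous.get(column)
        if strategy == "INCREMENT":
            merged[column] = old_value + new_value
        elif strategy == "APPEND":
            # Keep duplicates.
            merged[column] = list(old_value) + list(new_value)
        elif strategy == "UNION":
            # Add only values that are not already present.
            merged[column] = list(old_value) + [v for v in new_value if v not in old_value]
        else:
            merged[column] = new_value
    return merged


def apply_events(events: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Fold a stream of events into one merged record per event_id."""
    table: Dict[str, Dict[str, Any]] = {}
    for event in events:
        key = event["event_id"]
        table[key] = merge_record(table.get(key), event)
    return table


# Hypothetical RSVPs for the same meetup event.
events = [
    {"event_id": "3", "group_name": ["A"], "venue_name": ["V1"], "rsvp_count": 1, "mtime": 1},
    {"event_id": "3", "group_name": ["B"], "venue_name": ["V1"], "rsvp_count": 1, "mtime": 2},
    {"event_id": "3", "group_name": ["A"], "venue_name": ["V2"], "rsvp_count": 1, "mtime": 3},
]

print(apply_events(events)["3"])
```

With these three events, the count accumulates to 3, group_name keeps each group once, and venue_name keeps every venue it received, including the repeated V1.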
You can create the table and schema by running the following command:
Produce some RSVP events
We will simulate a few RSVPs by publishing the following events to the meetup_rsvp Kafka topic.
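As an illustration of what such events could look like, the sketch below prints a few RSVPs as JSON lines, one message per line. The field values are placeholders and the mtime field is an assumption; the recipe's own sample events may be shaped differently. The printed lines could then be fed to whichever Kafka producer you use.

```python
import json

# Hypothetical RSVP events; all values are placeholders.
events = [
    {"event_id": "3", "group_name": "Group A", "venue_name": "Venue 1", "rsvp_count": 1, "mtime": 1700000000000},
    {"event_id": "3", "group_name": "Group B", "venue_name": "Venue 1", "rsvp_count": 1, "mtime": 1700000001000},
    {"event_id": "4", "group_name": "Group A", "venue_name": "Venue 2", "rsvp_count": 1, "mtime": 1700000002000},
]


def to_json_lines(records):
    """Serialize records as newline-delimited JSON, one message per line."""
    return "\n".join(json.dumps(record) for record in records)


print(to_json_lines(events))
```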
Querying
Once that’s completed, navigate to localhost:9000/#/query and click on the meetup_rsvp table or copy/paste the following query:
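The SQL in the sketch below is an assumed example rather than the recipe's own query; the statement itself can be pasted into the query console. The surrounding Python shows one way to send the same statement to a Pinot broker over HTTP. The broker address localhost:8099 is an assumption about this cluster's port mapping.

```python
import json
import urllib.request

# Assumption: the broker is reachable on its default port 8099.
BROKER_URL = "http://localhost:8099/query/sql"

# Assumed example query; column names follow this guide.
SQL = "SELECT event_id, rsvp_count, group_name, venue_name FROM meetup_rsvp LIMIT 10"


def build_request(sql, url=BROKER_URL):
    """Build a POST request carrying the SQL statement as a JSON body."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, url=BROKER_URL):
    """Send the query and return the parsed JSON response."""
    with urllib.request.urlopen(build_request(sql, url), timeout=10) as response:
        return json.load(response)


# Usage, with the cluster running:
#   rows = run_query(SQL)["resultTable"]["rows"]
```

Each event_id should appear once in the result, with the merged values for that key.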