Introduction
Offline upserts allow an OFFLINE Pinot table to upsert records by primary key across multiple segments, so that query results look as if only the most recent version of each record exists. In a standard OFFLINE table every segment is independent: if two pushed segments each contain a row for the same primary key, queries return both rows. Offline upserts fix this. When a new segment is pushed, the server resolves conflicts with all previously loaded segments and surfaces only the winning record per primary key.
Record precedence ordering
When no explicit comparison column is configured, segment push time determines which record wins: the record from the most recently pushed segment always wins. This is the simplest setup for offline tables where each push represents a more complete or corrected snapshot of the data. You can also configure an explicit comparison column (e.g., an eventTime timestamp). In that case, the row with the highest value in the comparison column wins, regardless of push order.
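The two precedence rules can be illustrated with a small simulation. This is a sketch of the semantics described above, not the server's implementation; the function name, the row layout, and the tie-breaking choice (a later row wins when comparison values are equal) are assumptions made for illustration.

```python
from typing import Any, Optional


def resolve_upserts(
    segments: list[list[dict[str, Any]]],
    primary_key: str,
    comparison_column: Optional[str] = None,
) -> dict[Any, dict[str, Any]]:
    """Return the winning row per primary key.

    `segments` must be ordered by push time, oldest first.
    """
    winners: dict[Any, dict[str, Any]] = {}
    for segment in segments:
        for row in segment:
            key = row[primary_key]
            current = winners.get(key)
            if current is None:
                winners[key] = row
            elif comparison_column is None:
                # No comparison column: the most recently pushed segment wins.
                winners[key] = row
            elif row[comparison_column] >= current[comparison_column]:
                # Comparison column set: the highest value wins, whatever the
                # push order. Ties go to the later row (an assumption here).
                winners[key] = row
    return winners


# Hypothetical data: the second push carries an OLDER version of the record.
first_push = [{"orderId": 1, "status": "DELIVERED", "eventTime": 200}]
second_push = [{"orderId": 1, "status": "SHIPPED", "eventTime": 100}]

by_push_time = resolve_upserts([first_push, second_push], "orderId")
by_event_time = resolve_upserts([first_push, second_push], "orderId", "eventTime")

print(by_push_time[1]["status"])   # SHIPPED (stale row wins on push order)
print(by_event_time[1]["status"])  # DELIVERED (highest eventTime wins)
```

The example also shows why an explicit comparison column is the safer choice when push order cannot be guaranteed.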
Use Cases
| Scenario | Why offline upserts help |
|---|---|
| Incremental corrections | A small fraction of rows in the table has changed (e.g., late-arriving corrections, status updates) and you want to push only the changed rows as a new segment without re-ingesting the whole table. |
| Snapshot uploads with overlap | An upstream job exports daily snapshots that may overlap (the same primary key appears in yesterday’s and today’s export). Offline upserts deduplicate automatically. |
How to Configure Offline Upserts
Enabling offline upserts requires changes in three places: the schema, the table config, and (optionally) the ingestion task config.
1. Schema: declare primary key columns
primaryKeyColumns can contain multiple columns — Pinot hashes them together as a composite key.
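As a rough sketch, a schema with a composite primary key could look like the dictionary below, which serializes to the JSON that Pinot expects. The schema name and all column names are hypothetical; the field-spec keys follow the open-source Pinot schema format. The helper is illustrative only and simply checks that every primary key column is declared as a field.

```python
import json
from typing import Any


def missing_primary_key_columns(schema: dict[str, Any]) -> list[str]:
    """Return primary key columns that are not declared as fields."""
    declared = set()
    for spec_list in ("dimensionFieldSpecs", "metricFieldSpecs", "dateTimeFieldSpecs"):
        for field in schema.get(spec_list, []):
            declared.add(field["name"])
    return [c for c in schema.get("primaryKeyColumns", []) if c not in declared]


# Hypothetical schema: table and column names are made up for illustration.
schema = {
    "schemaName": "orders",
    "dimensionFieldSpecs": [
        {"name": "orderId", "dataType": "STRING"},
        {"name": "region", "dataType": "STRING"},
        {"name": "status", "dataType": "STRING"},
    ],
    "dateTimeFieldSpecs": [
        {
            "name": "eventTime",
            "dataType": "LONG",
            "format": "1:MILLISECONDS:EPOCH",
            "granularity": "1:MILLISECONDS",
        }
    ],
    # Composite key: (orderId, region) together identify a record.
    "primaryKeyColumns": ["orderId", "region"],
}

print(missing_primary_key_columns(schema))
print(json.dumps(schema, indent=2))
```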
2. Table config
Three sections of the table config must be set together: upsertConfig, segmentPartitionConfig, and routing.
- upsertConfig.mode must be "FULL". Partial upsert mode is not supported for offline tables (see Limitations).
- upsertConfig.metadataManagerClass must point to the StarTree RocksDB implementation. If it is omitted, the in-heap OSS default is used, which does not persist metadata across server restarts and is not recommended for production.
- segmentPartitionConfig on the primary key column is recommended. The partition function and partition count define how the data is partitioned and also govern overall scalability. Every segment containing a given primary key must land on the same server; otherwise the server cannot see all versions of the key and deduplication will be incorrect. In general, use a high partition count (e.g., 128) to leave room for organic growth. The partition configuration cannot be changed after the table is created.
- instanceSelectorType: strictReplicaGroup ensures that a query is routed to exactly one replica group. This is required so that the server’s local RocksDB view (which is per-partition) is authoritative for the keys it owns.

When comparisonColumns is set, the row with the highest value in that column wins. When it is omitted, the row from the most-recently-pushed segment wins.
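The settings above can be collected into one table config and checked before submission. The sketch below makes several assumptions: the table and column names are hypothetical, segmentPartitionConfig is nested under tableIndexConfig as in open-source Pinot, and the metadataManagerClass value is a placeholder because the actual StarTree class name is not given in this text. The checking function is an illustrative helper, not part of any Pinot or StarTree API.

```python
from typing import Any


def offline_upsert_config_problems(
    table_config: dict[str, Any], primary_key: str
) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = []
    upsert = table_config.get("upsertConfig", {})
    if upsert.get("mode") != "FULL":
        problems.append("upsertConfig.mode must be FULL")
    if not upsert.get("metadataManagerClass"):
        problems.append("upsertConfig.metadataManagerClass is not set")
    partition_map = (
        table_config.get("tableIndexConfig", {})
        .get("segmentPartitionConfig", {})
        .get("columnPartitionMap", {})
    )
    if primary_key not in partition_map:
        problems.append(f"no segmentPartitionConfig entry for {primary_key}")
    routing = table_config.get("routing", {})
    if routing.get("instanceSelectorType") != "strictReplicaGroup":
        problems.append("routing.instanceSelectorType must be strictReplicaGroup")
    return problems


# Hypothetical table config; replace the placeholder class name with the
# value documented for your StarTree release.
table_config = {
    "tableName": "orders",
    "tableType": "OFFLINE",
    "upsertConfig": {
        "mode": "FULL",
        "comparisonColumns": ["eventTime"],  # optional; omit to use push order
        "metadataManagerClass": "<StarTree RocksDB metadata manager class>",
    },
    "tableIndexConfig": {
        "segmentPartitionConfig": {
            "columnPartitionMap": {
                "orderId": {"functionName": "Murmur", "numPartitions": 128}
            }
        }
    },
    "routing": {"instanceSelectorType": "strictReplicaGroup"},
}

print(offline_upsert_config_problems(table_config, "orderId"))
```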
3. Partitioning upstream data with FileIngestionTask
For offline upserts to work correctly, every segment must be partitioned on the primary key: all rows with the same primary key value must reside on the same server.
FileIngestionTask handles repartitioning automatically, even when the upstream S3 data is not pre-partitioned by primary key. On each execution, the task executor reads segmentPartitionConfig from the table config and passes it to SegmentProcessorFramework, which physically sorts all ingested rows into per-partition buckets by applying the configured partition function (e.g., Murmur) across the configured number of partitions. Each output segment contains only rows that hash to a single partition bucket, satisfying the upsert requirement without any extra config.
The segmentPartitionConfig in the table config acts as the contract: the task executor reads it to know how many partitions to create and which hash function to use, producing pre-partitioned segments that satisfy the upsert requirement automatically.
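The bucketing idea can be sketched in a few lines. This is a conceptual illustration only: it uses CRC32 as a stand-in hash, which is not Pinot's Murmur partition function and will not produce the same partition ids, and the function and column names are hypothetical.

```python
import zlib
from collections import defaultdict
from typing import Any


def partition_id(primary_key_value: str, num_partitions: int) -> int:
    # Stand-in hash (CRC32), NOT Pinot's Murmur implementation.
    return zlib.crc32(primary_key_value.encode("utf-8")) % num_partitions


def bucket_rows(
    rows: list[dict[str, Any]], primary_key: str, num_partitions: int
) -> dict[int, list[dict[str, Any]]]:
    """Group rows so that each bucket holds exactly one partition's keys."""
    buckets: dict[int, list[dict[str, Any]]] = defaultdict(list)
    for row in rows:
        pid = partition_id(str(row[primary_key]), num_partitions)
        buckets[pid].append(row)
    return dict(buckets)


# Hypothetical input: unpartitioned rows, with order-1 appearing twice.
rows = [
    {"orderId": "order-1", "status": "SHIPPED"},
    {"orderId": "order-2", "status": "NEW"},
    {"orderId": "order-1", "status": "DELIVERED"},
    {"orderId": "order-3", "status": "NEW"},
]

buckets = bucket_rows(rows, "orderId", num_partitions=4)
for pid, bucket in sorted(buckets.items()):
    print(pid, [r["orderId"] for r in bucket])
```

Because the partition id depends only on the key and the partition count, both versions of order-1 always end up in the same bucket, which is the property the upsert logic relies on.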
Offline Upserts vs. Atomic Sync
StarTree also offers Atomic Sync (via consistentPushSwapEnabled=true on FileIngestionTask), which replaces the entire table atomically. Both features update an offline table from S3, but they are designed for different scenarios.
| Dimension | Offline Upserts | Atomic Sync |
|---|---|---|
| What it does | Merges new/updated rows into existing segments. Only changed rows need to be pushed. | Rebuilds and atomically swaps all segments in the table from the input source. |
| Granularity | Row-level upsert/deduplication by primary key. | File / Partition / Table-level replacement. |
| Data volume per push | Small-to-moderate fraction of the table (works well under ~40% of total rows being upserted). | Optimised for full partition or table refreshes (essentially 100% of rows). |
| Query consistency during push | New segments become visible as they are pushed; there is a brief window where both old and new versions of a row are in the table and the upsert filter resolves them. | Old and new sets of segments are swapped atomically — queries see either the old or the new version of the table, never a mix. |
| Index support | No StarTree index support (see Limitations). | StarTree indexes are supported, which significantly improves query performance. |
| Partial upsert | Not supported (FULL mode only). | Not applicable — full re-ingestion replaces all data. |
| When to choose | You are pushing incremental corrections, late data, or a small update batch. The primary key is known and stable. | You are regenerating the entire table (e.g., reprocessing pipeline, full restatement), or you require atomic cutover. |
Limitations
No partial upsert support. Only mode: FULL is supported for offline tables. In FULL mode, the entire winning row replaces the entire losing row. The column-level merge strategies available in PARTIAL mode for realtime upsert tables are not available. This is rarely a problem in practice, because column-level updates are typical of CDC/streaming use cases rather than batch ingestion.
No StarTree index. Columns in an offline upsert table cannot have a StarTree (multi-dimensional pre-aggregation) index configured. The controller will reject such a table config at creation time.
Segment push time ordering is implicit and irreversible. When comparisonColumns is not set, the server uses segment push time to break ties. Pushing an older segment after a newer one will cause the older data to win. Use an explicit comparisonColumns (e.g., a business event timestamp) if push order cannot be guaranteed.
