- Create the new schema based on the existing schema.
- Update the schema using the Controller API.
- Invoke the Table Reload operation to ensure the new schema is picked up by all Pinot segments and is therefore reflected in query results.
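The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive client: the controller address `http://localhost:9000`, the `orders` schema and table, and the `region` column are hypothetical, and the endpoint paths (`PUT /schemas/{schemaName}` and `POST /segments/{tableName}/reload`) should be checked against the Controller API of your Pinot version. The requests are only built here, not sent.

```python
import copy
import json
import urllib.request


def add_dimension_column(schema, name, data_type, default_null_value=None):
    """Step 1: return a copy of `schema` with one extra dimension column."""
    new_schema = copy.deepcopy(schema)
    spec = {"name": name, "dataType": data_type}
    if default_null_value is not None:
        spec["defaultNullValue"] = default_null_value
    new_schema.setdefault("dimensionFieldSpecs", []).append(spec)
    return new_schema


def build_schema_update_request(controller_url, schema):
    """Step 2: build (but do not send) the request that updates the schema."""
    url = f"{controller_url}/schemas/{schema['schemaName']}"
    body = json.dumps(schema).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def build_reload_request(controller_url, table_name):
    """Step 3: build (but do not send) the request that reloads all segments."""
    url = f"{controller_url}/segments/{table_name}/reload"
    return urllib.request.Request(url, method="POST")


# Hypothetical existing schema and controller address, for illustration only.
old_schema = {
    "schemaName": "orders",
    "dimensionFieldSpecs": [{"name": "orderId", "dataType": "STRING"}],
}
new_schema = add_dimension_column(old_schema, "region", "STRING", "unknown")
update_req = build_schema_update_request("http://localhost:9000", new_schema)
reload_req = build_reload_request("http://localhost:9000", "orders")
```

To apply the change against a real cluster, each request would be passed to `urllib.request.urlopen`, the schema update first and the reload second.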
Schema Update and Reload Integration
When a schema is updated, Pinot provides two options for applying the changes to existing segments:
- Immediate reload (`reload=true`) - Triggers `reloadAllSegments()` for all tables using the schema
- Deferred reload, the default (`reload=false`) - Sends schema refresh messages to servers without immediate reload
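As a small sketch of how the two options might be selected from a client, the helper below appends the `reload` flag to a schema update URL. It assumes the flag is passed as a query parameter on the schema update endpoint; the helper name, controller address, and schema name are hypothetical.

```python
from urllib.parse import urlencode


def schema_update_url(controller_url, schema_name, reload=False):
    """Build the schema update URL; reload=False mirrors the deferred default."""
    query = urlencode({"reload": str(reload).lower()})
    return f"{controller_url}/schemas/{schema_name}?{query}"


immediate = schema_update_url("http://localhost:9000", "orders", reload=True)
deferred = schema_update_url("http://localhost:9000", "orders")
print(immediate)  # http://localhost:9000/schemas/orders?reload=true
print(deferred)   # http://localhost:9000/schemas/orders?reload=false
```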
Backwards Compatible Changes
The following changes are considered backwards compatible:
- Adding new columns - You can add new fields to the schema
- Changing default values - Modifying default values for existing columns is allowed
- Time column granularity changes - Modifications to time field granularity specifications are permitted
- DateTime column format changes - Changes to datetime field formats are backwards compatible
Incompatible Changes
Generally speaking, the following changes are deemed incompatible:
- Removing existing columns
- Changing column data types
In StarTree Cloud, you can still make such incompatible changes by running the SegmentRefreshTask (SRT). Note that queries may not succeed until the SRT has finished.
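Before submitting a schema update, it can help to check a proposed schema against the two incompatible change types listed above. The following is a minimal client-side sketch, not Pinot's own validation logic: it assumes schemas are plain dictionaries using the `dimensionFieldSpecs`, `metricFieldSpecs`, and `dateTimeFieldSpecs` keys, and the function names and sample schemas are hypothetical.

```python
FIELD_SPEC_KEYS = ("dimensionFieldSpecs", "metricFieldSpecs", "dateTimeFieldSpecs")


def column_types(schema):
    """Map each column name in the schema to its declared data type."""
    return {
        spec["name"]: spec["dataType"]
        for key in FIELD_SPEC_KEYS
        for spec in schema.get(key, [])
    }


def incompatible_changes(old_schema, new_schema):
    """List removed columns and data type changes between two schemas."""
    old_cols = column_types(old_schema)
    new_cols = column_types(new_schema)
    problems = []
    for name, data_type in old_cols.items():
        if name not in new_cols:
            problems.append(f"removed column: {name}")
        elif new_cols[name] != data_type:
            problems.append(
                f"changed data type: {name} ({data_type} -> {new_cols[name]})"
            )
    return problems


# Hypothetical schemas, for illustration only.
old = {
    "schemaName": "orders",
    "dimensionFieldSpecs": [{"name": "orderId", "dataType": "STRING"}],
    "metricFieldSpecs": [{"name": "amount", "dataType": "INT"}],
}
new = {
    "schemaName": "orders",
    "dimensionFieldSpecs": [{"name": "region", "dataType": "STRING"}],
    "metricFieldSpecs": [{"name": "amount", "dataType": "DOUBLE"}],
}
problems = incompatible_changes(old, new)
```

An empty list means the sketch found neither kind of incompatible change; added columns such as `region` are ignored because they are backwards compatible.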
Query Semantics
It is important to ensure that queries don’t fail in the midst of a schema evolution operation. To make this easier, Pinot supports the notion of a virtual column, which is used in lieu of the actual newly added column until segments have loaded it. Queries touching this virtual column pick up the column's default values instead of failing (which was the old behavior). Here is the general guidance:

Scenario | Before | After |
---|---|---|
Rolling addition of a new column (newCol) | Queries fail with schema mismatch | Queries succeed; newCol is exposed as a virtual column |
Add a new column without segment reload | Empty results due to server‑side pruning | Queries succeed and return all columns, with newCol filled with default values |
Queries over existing columns only | No change | No change |
Queries referencing a partially loaded column | Merge errors or empty responses | Queries succeed with consistent results, including the new virtual column |
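The virtual column behavior in the table can be pictured with a toy model. The sketch below is purely illustrative and is not how Pinot implements it: it assumes each segment knows which columns it has physically loaded, and that any selected column missing from a segment is filled from the schema's default value. All names and sample data are hypothetical.

```python
def read_rows(segment_rows, segment_columns, schema_defaults, selected):
    """Project `selected` columns, using schema defaults for missing ones."""
    result = []
    for row in segment_rows:
        result.append({
            col: row[col] if col in segment_columns else schema_defaults[col]
            for col in selected
        })
    return result


# An old segment that has not yet been reloaded, so it lacks "region".
old_segment_rows = [{"orderId": "a1"}, {"orderId": "a2"}]
old_segment_columns = {"orderId"}

# A segment that already contains the new column.
new_segment_rows = [{"orderId": "b1", "region": "emea"}]
new_segment_columns = {"orderId", "region"}

defaults = {"region": "unknown"}
selected = ["orderId", "region"]

rows = (
    read_rows(old_segment_rows, old_segment_columns, defaults, selected)
    + read_rows(new_segment_rows, new_segment_columns, defaults, selected)
)
```

In this model, rows from the old segment carry `region` set to `"unknown"`, while rows from the newer segment keep their stored value, so the combined result has a consistent shape.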