Documentation Index
Fetch the complete documentation index at: https://docs.startree.ai/llms.txt
Use this file to discover all available pages before exploring further.
Context
Workload isolation is a new feature in StarTree Cloud (starting with the 0.14 release) that allows users to physically isolate queries from different workloads on the same Pinot table. For instance, assume we have the following two workloads:
- User-facing (application) queries: characterized by strict SLAs and high business impact
- Ad-hoc queries: required for triaging/debugging, with low business impact
Pre-requisite
For this feature to work, the table must be configured to use replica groups and the `StarTreeReplicaGroupInstanceSelector` instance selector type.
How to use this feature
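For reference, a table config satisfying the prerequisite above might look like the following hedged sketch. The field names follow the open-source Apache Pinot replica-group instance-assignment config; the table name, tenant tag, and instance counts are illustrative assumptions, not values from this document:

```json
{
  "tableName": "myTable_OFFLINE",
  "tableType": "OFFLINE",
  "routing": {
    "instanceSelectorType": "StarTreeReplicaGroupInstanceSelector"
  },
  "instanceAssignmentConfigMap": {
    "OFFLINE": {
      "tagPoolConfig": {
        "tag": "DefaultTenant_OFFLINE"
      },
      "replicaGroupPartitionConfig": {
        "replicaGroupBased": true,
        "numReplicaGroups": 3,
        "numInstancesPerReplicaGroup": 2
      }
    }
  }
}
```

Three replica groups are assumed here to match the Workload A / Workload B split used later in this page (RG 0 and 1 for one workload, RG 2 for the other).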
Preferred Replicas
Once enabled, you can configure isolation at the query level via the preferredReplicas query option. It accepts a comma-separated list of replica group IDs (derived from the ideal state, not from instance partitions) that the broker should use to route the query. Suppose we want the following configuration:
- Workload A: allowed to use both replica groups (0 and 1)
- Workload B: restricted to replica group 2
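Using standard Pinot query-option syntax, the two workloads might set the option as follows (the table name `myTable` is an illustrative assumption):

```sql
-- Workload A: may be served from replica groups 0 and 1
SET preferredReplicas='0,1';
SELECT COUNT(*) FROM myTable;

-- Workload B: restricted to replica group 2
SET preferredReplicas='2';
SELECT COUNT(*) FROM myTable;
```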
Preferred Replica list semantics
When multiple replica groups are listed, the broker rotates across them by request ID for load balancing; it does not treat the first entry as primary and the rest as failover. So preferredReplicas='0,1' routes roughly half of the requests entirely through RG 0 and the other half entirely through RG 1. To express "prefer RG 0, and fall over to RG 1 only when RG 0 cannot serve a segment", use preferredReplicas='0' together with fallbackReplicas='1' (see the next section).

Single-replica case: preferredReplicas='2' means the query is routed only to instances in replica group 2. If any segment is unavailable on RG 2, it is reported as unavailable (partial result) unless fallbackReplicas is set. Therefore, in the happy path, Workload A is served from RG 0 and RG 1 (load balanced), and Workload B is served exclusively from RG 2.
Fallback replicas
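The rotation behavior can be illustrated with a minimal sketch. The modulo-by-request-ID formula is an assumption about the exact rotation scheme, not the actual broker code:

```python
def pick_replica_group(preferred: list[str], request_id: int) -> str:
    """Rotate across the preferred replica groups by request ID.

    All segments of a single request go through the same replica group;
    consecutive requests alternate across the listed groups.
    """
    return preferred[request_id % len(preferred)]

# With preferredReplicas='0,1', requests alternate between RG 0 and RG 1:
groups = [pick_replica_group(["0", "1"], rid) for rid in range(4)]
print(groups)  # ['0', '1', '0', '1']
```

Note that the first entry gets no special priority: RG 0 and RG 1 each serve about half of the traffic.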
The fallbackReplicas query option is an optional companion to preferredReplicas. It accepts a comma-separated list of replica group IDs that the broker should consider only when none of the preferredReplicas can serve a given segment.
Selection order
For each segment, the broker walks the preferred replicas first (rotated by request ID for load balancing within the preferred set), and only falls through to the fallbackReplicas (in the listed order, no rotation) when no preferred replica has the segment online. Cross-preferred-replica failover takes priority over fallback: with preferredReplicas='0,1' and fallbackReplicas='2', a segment that is offline on RG 0 but online on RG 1 is served from RG 1, and the fallback metric is not incremented.
Upsert / dedup tables
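The per-segment selection order described above can be sketched as follows. This is a simplified model, not the actual instance selector: availability is reduced to a set of online (replica group, segment) pairs:

```python
def select_replica(segment, preferred, fallback, online, request_id):
    """Return (replica_group, used_fallback) for one segment.

    Preferred groups are tried first, rotated by request ID; fallback
    groups are tried in listed order only when no preferred group has
    the segment online. Returns (None, False) if nothing can serve it.
    """
    n = len(preferred)
    # Walk the preferred set starting at the rotation offset.
    for i in range(n):
        rg = preferred[(request_id + i) % n]
        if (rg, segment) in online:
            return rg, False
    # Fall through to fallback groups in the listed order (no rotation).
    for rg in fallback:
        if (rg, segment) in online:
            return rg, True
    return None, False  # segment unavailable -> partial result

online = {("1", "seg_0"), ("2", "seg_0"), ("2", "seg_1")}
# seg_0 is offline on RG 0 but online on RG 1: served by preferred RG 1,
# so the fallback metric is not incremented.
print(select_replica("seg_0", ["0", "1"], ["2"], online, request_id=0))
# -> ('1', False)
# seg_1 is offline on both preferred groups: served by fallback RG 2.
print(select_replica("seg_1", ["0", "1"], ["2"], online, request_id=0))
# -> ('2', True)
```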
Falling back to a different replica group is safe even for upsert / dedup tables: if any segment in a partition is unavailable on an instance, the underlying selector removes that instance as a candidate for all segments in that partition, so the fallback replica is chosen consistently across the whole partition.
Observability
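The partition-consistency rule can be modeled with a short sketch. This is a simplified illustration under the assumption that candidate instances are tracked per segment within each partition, not the real selector implementation:

```python
def prune_candidates(partitions, unavailable):
    """Drop an instance as a candidate for *all* segments of a partition
    when it cannot serve *any* segment of that partition.

    partitions: {partition: {segment: set of candidate instances}}
    unavailable: set of (instance, segment) pairs that are offline
    """
    result = {}
    for part, segments in partitions.items():
        # Instances missing at least one segment of this partition.
        bad = {inst for seg, insts in segments.items()
               for inst in insts if (inst, seg) in unavailable}
        result[part] = {seg: insts - bad for seg, insts in segments.items()}
    return result

partitions = {"p0": {"seg_a": {"s1", "s2"}, "seg_b": {"s1", "s2"}}}
# s1 is missing seg_b, so it is removed as a candidate for seg_a too,
# keeping upsert/dedup reads consistent across the partition:
pruned = prune_candidates(partitions, {("s1", "seg_b")})
print(pruned)  # {'p0': {'seg_a': {'s2'}, 'seg_b': {'s2'}}}
```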
Whenever any segment in a query is served by a fallback replica, the broker increments the per-table meter pinot.broker.<table>.queriesUsingFallbackReplica (exposed to Prometheus as pinot_broker_queriesUsingFallbackReplica_* with a "table" label). Use this metric to alert on workload isolation being silently violated.

Recommended pattern: for strict isolation, set preferredReplicas only. For best-effort isolation with availability under partial outages, set preferredReplicas to your isolation target and fallbackReplicas to the rest, and alert on the queriesUsingFallbackReplica meter so you know when isolation is being broken.

Here are the key failover scenarios:

| Failure Scenario | Implication (no fallbackReplicas) | Implication (with fallbackReplicas set) |
|---|---|---|
| Segment replica failure within a preferred RG, but another preferred RG has the segment | Served from the other preferred RG (no metric increment) | Same as left column — fallback is not used |
| Segment replica failure across all preferred RGs | Segment reported unavailable → partial result | Served from the first fallback RG that has the segment; queriesUsingFallbackReplica meter is incremented |
| Segment replica failure on a preferred RG with upserts enabled | Entire affected partition is failed over to the next preferred RG | Entire partition fails over to the next available RG (preferred or fallback) that has it (consistent across the partition) |
| Segment unavailable on every preferred and fallback RG | Partial result | Partial result |
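To act on the observability guidance above, a Prometheus alerting rule along these lines could be used. Treat it as a hedged sketch: the exact exported metric suffix (e.g. `_Count`) and label names depend on your metrics exporter, and `myTable` is an illustrative table name:

```yaml
groups:
  - name: pinot-workload-isolation
    rules:
      - alert: WorkloadIsolationBroken
        # Fires when any query on the table was served by a fallback
        # replica in the last 5 minutes (metric suffix is an assumption).
        expr: increase(pinot_broker_queriesUsingFallbackReplica_Count{table="myTable"}[5m]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Queries on {{ $labels.table }} are using fallback replicas"
```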
Limitations
- During any kind of segment movement (e.g., rebalance, or segment relocation such as realtime-to-offline moves or moves across tiers), workload isolation guarantees are lost; this is due to the internals of Apache Pinot as it works today.
- Only homogeneous replica groups are supported today. In other words, each replica group must contain the same number of instances.
- When preferredReplicas lists more than one replica group, request-ID-based rotation is the only load-balancing strategy; adaptive server selection is not consulted within the preferred set.

