This feature requires StarTree release 0.15.0 or later.
Scheduled Server Scaling lets you automatically scale Apache Pinot server replica groups up or down on a fixed schedule. This is useful when your query load is predictable — for example, running fewer server replica groups overnight or on weekends to save cost, and restoring full capacity before peak hours. Scaling is driven by cron schedules defined per Pinot tenant. At each scheduled time, the operator asks the Pinot controller which servers to remove (or restore) to reach your target replica-group count, drains queries off the affected servers, and scales their StatefulSets accordingly.
Scaling operates at the granularity of replica groups, not individual servers. You specify a target number of replica groups; the Pinot controller computes the exact set of servers to remove or restore to achieve that target. See Replica Group based Workload Isolation for background on replica groups and pools.
Scaling down reduces redundancy for the tenant until the matching scale-up restores it. A schedule that targets targetReplicaGroups: 1 leaves the tenant with no replica group failover for the duration of the scale-down window — if that single replica group has an issue, queries to the tenant’s tables will fail. Choose a target that keeps the minimum redundancy your workload needs.

Prerequisites

  • Your cluster is on StarTree release 0.15.0 or later.
  • Your cluster is managed by the StarTree Kubernetes operator via the PinotCluster custom resource (this feature is configured at the operator level, not from the Data Portal UI).
  • The tenant’s tables already use pool-based, replica-group-aware instance assignment — see Controller requirements for scale-down for the exact conditions the controller checks before it will scale down a tenant.

How it works

  1. You enable scheduled scaling in the PinotCluster spec under the server component and define one or more schedules per tenant.
  2. The operator creates and maintains a ScheduledServerScaling custom resource for each tenant you configure.
  3. At each schedule’s cron time:
    • Scale down — the operator calls the Pinot controller to determine which servers can be removed to reach the target replica-group count, drains in-flight queries off them (bounded by queryDrainTimeout), then scales those server StatefulSets to zero.
    • Scale up — the operator determines which previously-removed servers must come back to reach the target, and restores their StatefulSets.
  4. If the operator was down when a schedule fired, it still executes the missed run as long as it restarts within the missedExecutionWindow. After that window passes, the missed run is skipped and the schedule fires again at its next occurrence.
While a scale operation is in progress, the operator suppresses normal replica reconciliation for the affected servers, so a manually triggered cluster reconcile will not fight the schedule.
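The missed-run rule in step 4 can be sketched as a small decision function. This is only an illustration of the rule as described above, not the operator's code: the function name is hypothetical, and whether the window boundary is inclusive is an assumption.

```python
from datetime import datetime, timedelta, timezone


def should_run_missed(scheduled_at, now, missed_execution_window):
    """Return True if a run scheduled at `scheduled_at` should still execute at `now`.

    Hypothetical helper: models the documented rule that a missed run still
    executes if the operator is back within missedExecutionWindow.
    """
    if now < scheduled_at:
        # The schedule has not fired yet, so there is nothing to catch up on.
        return False
    # Assumption: a run exactly at the end of the window still counts.
    return now - scheduled_at <= missed_execution_window


fire_time = datetime(2024, 1, 1, 22, 0, tzinfo=timezone.utc)
window = timedelta(minutes=30)

# Operator restarts 10 minutes late: the run still executes.
print(should_run_missed(fire_time, fire_time + timedelta(minutes=10), window))
# Operator restarts 45 minutes late: the run is skipped.
print(should_run_missed(fire_time, fire_time + timedelta(minutes=45), window))
```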

Enabling it

Add a scheduledScaling block to the server component in your PinotCluster resource:
apiVersion: startreedata.io/v2alpha1
kind: PinotCluster
metadata:
  name: my-pinot
  namespace: pinot
spec:
  components:
    server:
      scheduledScaling:
        enabled: true
        tenants:
          - tenant: DefaultTenant
            schedules:
              - name: nightly-scale-down
                action: SCHEDULED_SCALE_DOWN
                cron: "0 22 * * *"          # Every day at 22:00 UTC
                targetReplicaGroups: 1
                queryDrainTimeout: "10m"
                missedExecutionWindow: "30m"
              - name: morning-scale-up
                action: SCHEDULED_SCALE_UP
                cron: "0 6 * * *"           # Every day at 06:00 UTC
                targetReplicaGroups: 3
                queryDrainTimeout: "10m"
                missedExecutionWindow: "30m"
To disable scheduled scaling, set enabled: false (or remove the scheduledScaling block). When a tenant is removed from the spec, the operator deletes its ScheduledServerScaling resource and restores any servers that were left scaled down.

Configuration reference

scheduledScaling (under spec.components.server):
| Field | Type | Description |
| --- | --- | --- |
| enabled | boolean | Master switch for scheduled scaling on this cluster. |
| tenants | array | One entry per Pinot tenant you want to schedule. Each has tenant + schedules. |
Each entry in tenants:
| Field | Type | Description |
| --- | --- | --- |
| tenant | string | Pinot server tenant name (e.g. DefaultTenant). |
| schedules | array | One or more scheduled actions for this tenant. |
Each entry in schedules:
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | yes | A label for the schedule, used in logs and status. |
| action | enum | yes | SCHEDULED_SCALE_DOWN or SCHEDULED_SCALE_UP. |
| cron | string | yes | 5-field Unix cron expression (minute hour day-of-month month day-of-week), evaluated in UTC. See examples below. |
| targetReplicaGroups | integer | yes | Desired number of server replica groups for the tenant after this action completes. Must be greater than 0. |
| queryDrainTimeout | string | no | Max time to wait for in-flight queries to drain off a server before scaling it down (e.g. 10m, 30s). |
| missedExecutionWindow | string | no | Grace period after the scheduled time during which a missed run still executes if the operator was down (e.g. 30m, 1h). Defaults to 15m. |

Cron format

Schedules use standard 5-field Unix cron (minute hour day-of-month month day-of-week). All times are evaluated in UTC — convert your local schedule to UTC before setting cron.
| Expression | Meaning (UTC) |
| --- | --- |
| 0 22 * * * | Every day at 22:00 UTC |
| 0 6 * * 1-5 | Weekdays at 06:00 UTC |
| 0 0 * * 0 | Every Sunday at midnight UTC |
| 30 18 * * 5 | Every Friday at 18:30 UTC |
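Converting a local time of day to a UTC cron expression is easy to get wrong when the conversion crosses midnight. The sketch below is a hypothetical helper, not part of any StarTree tooling, and it assumes a fixed UTC offset. If your time zone observes daylight saving time, the offset changes during the year and the schedule will shift by an hour in local time.

```python
from datetime import datetime, timedelta, timezone


def daily_cron_in_utc(local_hour, local_minute, utc_offset_hours):
    """Return (cron, day_shift) for a daily schedule given in local time.

    day_shift is +1 or -1 when the UTC time falls on the next or previous
    calendar day, which matters if you later add a day-of-week field.
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    # Any reference date works for a fixed offset; only the time of day matters.
    local = datetime(2024, 1, 15, local_hour, local_minute, tzinfo=local_tz)
    utc = local.astimezone(timezone.utc)
    day_shift = (utc.date() - local.date()).days
    return f"{utc.minute} {utc.hour} * * *", day_shift


# 22:00 at UTC-8 is 06:00 UTC on the following day.
print(daily_cron_in_utc(22, 0, -8))   # ('0 6 * * *', 1)
# 08:30 at UTC+5:30 is 03:00 UTC on the same day.
print(daily_cron_in_utc(8, 30, 5.5))  # ('0 3 * * *', 0)
```

A non-zero day_shift is a reminder that a local "Friday evening" schedule may need a Saturday day-of-week value in UTC.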

Typical pattern: nightly scale-down, morning scale-up

Pair a SCHEDULED_SCALE_DOWN with a SCHEDULED_SCALE_UP to shrink capacity during off-hours and restore it before peak load:
  • Scale down at night to a low targetReplicaGroups (e.g. 1).
  • Scale up in the morning back to your full targetReplicaGroups (e.g. 3).
Use the same tenant for both schedules. The scale-up restores exactly the servers that the matching scale-down removed.
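For a rough sense of what a paired schedule saves, the arithmetic can be sketched as follows. The function name and the "replica-group-hours" unit are illustrative only and are not something the operator reports. The sketch assumes both schedules fire daily, on the hour, in UTC.

```python
def replica_group_hours_saved_per_day(down_hour, up_hour, full_groups, reduced_groups):
    """Replica-group-hours not running per day for a daily down/up pair.

    Hypothetical helper: hours are UTC hours of day (0-23), matching the
    hour field of the two cron expressions.
    """
    # Modulo 24 handles windows that cross midnight, e.g. 22:00 to 06:00.
    window_hours = (up_hour - down_hour) % 24
    return window_hours * (full_groups - reduced_groups)


# Scale down to 1 group at 22:00, back up to 3 groups at 06:00:
# an 8-hour window with 2 groups removed.
print(replica_group_hours_saved_per_day(22, 6, 3, 1))  # 16
```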

Checking status

The operator tracks each tenant’s scaling state in the ScheduledServerScaling resource:
kubectl get scheduledserverscaling -n <namespace>
kubectl get scheduledserverscaling <name> -n <namespace> -o yaml
Key status fields:
| Field | Description |
| --- | --- |
| status.state | AVAILABLE, DELETING (scale-down in progress), DELETED (scaled down), RESTORING (scale-up in progress), RESTORED. |
| status.lastScalingOperation.action | The most recent action: SCHEDULED_SCALE_DOWN or SCHEDULED_SCALE_UP. |
| status.lastScalingOperation.lastOccurrence | Timestamp of the cron fire time that triggered the operation. |
| status.lastScalingOperation.serversAffected | List of server StatefulSets affected by the operation. |
| status.lastScalingOperation.targetReplicaGroups | The target replica-group count of the last completed operation. |

Notes and limitations

  • targetReplicaGroups must be greater than 0 — a schedule cannot remove every replica group.
  • If targetReplicaGroups equals the tenant’s current total replica groups, a scale-down is a no-op (nothing to remove).
  • Cron times are evaluated in UTC — convert your local time before setting cron.
  • Scheduled scaling only affects server StatefulSets for the named tenant. It does not change ZooKeeper, Controller, Broker, or Minion components.

Controller requirements for scale-down

At each scale-down, the operator asks the Pinot controller (GET /serverReplicaGroupScaleDown) which servers can be removed for the tenant. The controller returns a server list only when all of the following hold. If any check fails, the scale-down is rejected and no servers are removed. Fix the underlying condition and the next scheduled run (or a manual retry) will proceed.
Tenant and topology
  • The tenant exists and has servers tagged for it (<tenant>_OFFLINE / <tenant>_REALTIME).
  • Servers are exclusive to the tenant — no server tagged for this tenant may also carry tags for another tenant.
  • Every tenant server has a pool assignment, and each pool matches at least one of the tenant’s tags.
  • If a server has both OFFLINE and REALTIME tags, both must point to the same pool number.
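The tenant and topology conditions above can be approximated with a pre-check like the one below. This is a sketch under stated assumptions, not the controller's implementation: the function name is hypothetical, and the input shape (server name mapped to a dictionary of tag to pool number) is a simplified stand-in for what you would read from each server's instance configuration.

```python
def topology_problems(tenant, servers):
    """List reasons a tenant's servers would fail the topology checks.

    `servers` is a hypothetical mapping: server name -> {tag: pool or None}.
    """
    allowed = {f"{tenant}_OFFLINE", f"{tenant}_REALTIME"}
    if not any(allowed & set(tags) for tags in servers.values()):
        return [f"no servers are tagged for tenant {tenant}"]

    problems = []
    for name, tag_pools in servers.items():
        # Servers must be exclusive to the tenant.
        foreign = sorted(set(tag_pools) - allowed)
        if foreign:
            problems.append(f"{name}: also tagged {', '.join(foreign)}")
        tenant_pools = [pool for tag, pool in tag_pools.items() if tag in allowed]
        if any(pool is None for pool in tenant_pools):
            problems.append(f"{name}: missing pool assignment")
        elif len(set(tenant_pools)) > 1:
            # OFFLINE and REALTIME tags must point to the same pool.
            problems.append(f"{name}: OFFLINE and REALTIME pools differ")
    return problems


servers = {
    "server-0": {"DefaultTenant_OFFLINE": 0, "DefaultTenant_REALTIME": 0},
    "server-1": {"DefaultTenant_OFFLINE": 1, "DefaultTenant_REALTIME": 2},
}
print(topology_problems("DefaultTenant", servers))
# ['server-1: OFFLINE and REALTIME pools differ']
```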
Target value
  • targetReplicaGroups must be ≥ 1 and ≤ the current number of replica groups (pools).
  • If targetReplicaGroups equals the current replica-group count, the call succeeds but returns no servers (nothing to remove).
Table configuration (when the tenant has tables)
  • No table may use instancePartitionsMap (which bypasses pool-based assignment).
  • All non-dimension tables must use pool-based, replica-group-aware instance assignment.
  • Each table’s configured numReplicaGroups (when non-zero) must equal the current pool count.
No rebalance in progress
  • No table in the tenant may have an active or failed table rebalance job.
  • The tenant may not have an active, aborted, cancelled, or unscheduled tenant rebalance job.
The controller evaluates these against a point-in-time snapshot of cluster state. The highest-numbered pools are always selected for removal, so a given (tenant, targetReplicaGroups) request is deterministic for the same cluster state.
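The target-value rules and the highest-numbered-pool selection can be illustrated with a short sketch. It models only what is described above, with a hypothetical function name, and it works on pool numbers rather than the server list the controller actually returns.

```python
def pools_to_remove(current_pools, target_replica_groups):
    """Return the pool numbers a scale-down to the target would remove.

    Hypothetical helper: keeps the lowest-numbered pools and removes the
    highest-numbered ones, as the selection rule above describes.
    """
    pools = sorted(set(current_pools))
    if not 1 <= target_replica_groups <= len(pools):
        raise ValueError(
            f"targetReplicaGroups must be between 1 and {len(pools)}, "
            f"got {target_replica_groups}"
        )
    # Everything past the first `target` pools is removed; an empty list
    # means the target already equals the current count (a no-op).
    return pools[target_replica_groups:]


print(pools_to_remove([0, 1, 2], 1))  # [1, 2]
print(pools_to_remove([0, 1, 2], 3))  # []
```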

Troubleshooting

A scale-down is rejected, with no servers removed, whenever one of the controller requirements above isn’t met. Fix the underlying condition — the next scheduled run (or a manual retry) will proceed once it’s resolved.
| Symptom | Likely cause | What to do |
| --- | --- | --- |
| Scale-down never removes any servers | A server tagged for the tenant is also tagged for another tenant | Make servers exclusive to one tenant, or scope the schedule to a tenant that already has dedicated servers. |
| Scale-down never removes any servers | A server is missing a pool assignment, or OFFLINE/REALTIME tags point to different pools | Fix the server’s pool assignment so it matches the tenant’s tags, and align OFFLINE/REALTIME pool numbers. |
| Scale-down never removes any servers | targetReplicaGroups is greater than the tenant’s current replica-group (pool) count | Lower targetReplicaGroups to at most the current pool count. |
| Scale-down never removes any servers | A table uses instancePartitionsMap | Migrate the table to pool-based instance assignment before scheduling scale-down for its tenant. |
| Scale-down never removes any servers | A table’s numReplicaGroups doesn’t match the current pool count | Update the table’s replica-group config to match the tenant’s current pool count. |
| Scale-down never removes any servers | A table rebalance is active or failed, or a tenant rebalance is active, aborted, cancelled, or unscheduled | Let the rebalance finish (or clear the stuck job), then wait for the next scheduled run or retry manually. |
| Scale-up doesn’t fully restore capacity | The matching scale-down removed a different set of servers than expected | Check status.lastScalingOperation.serversAffected on the ScheduledServerScaling resource for both operations to confirm which servers were removed vs. restored. |
| A schedule didn’t run at all | The operator was down at the cron time and did not restart within missedExecutionWindow | Increase missedExecutionWindow if operator restarts routinely take longer than the current setting, or trigger the action manually. |