Scheduled Server Scaling

This feature requires StarTree release 0.15.0 or later.

Scheduled Server Scaling lets you automatically scale Apache Pinot server replica groups up or down on a fixed schedule. This is useful when your query load is predictable — for example, running fewer server replica groups overnight or on weekends to save cost, and restoring full capacity before peak hours. Scaling is driven by cron schedules defined per Pinot tenant. At each scheduled time, the operator asks the Pinot controller which servers to remove (or restore) to reach your target replica-group count, drains queries off the affected servers, and scales their StatefulSets accordingly.

Scaling operates at the granularity of replica groups, not individual servers. You specify a target number of replica groups; the Pinot controller computes the exact set of servers to remove or restore to achieve that target. See Replica Group based Workload Isolation for background on replica groups and pools.

Scaling down reduces redundancy for the tenant until the matching scale-up restores it. A schedule with targetReplicaGroups: 1 leaves the tenant with no replica-group failover for the duration of the scale-down window: if that single replica group has an issue, queries to the tenant’s tables will fail. Choose a target that keeps the minimum redundancy your workload needs.

Prerequisites

Your cluster is on StarTree release 0.15.0 or later.
Your cluster is managed by the StarTree Kubernetes operator via the PinotCluster custom resource (this feature is configured at the operator level, not from the Data Portal UI).
The tenant’s tables already use pool-based, replica-group-aware instance assignment — see Controller requirements for scale-down for the exact conditions the controller checks before it will scale down a tenant.

How it works

You enable scheduled scaling in the PinotCluster spec under the server component and define one or more schedules per tenant.
The operator creates and maintains a ScheduledServerScaling custom resource for each tenant you configure.
At each schedule’s cron time:
- Scale down — the operator calls the Pinot controller to determine which servers can be removed to reach the target replica-group count, drains in-flight queries off them (bounded by queryDrainTimeout), then scales those server StatefulSets to zero.
- Scale up — the operator determines which previously-removed servers must come back to reach the target, and restores their StatefulSets.
If the operator was down when a schedule fired, it still executes the missed run as long as it restarts within the missedExecutionWindow. After that window passes, the missed run is skipped until the next occurrence.

While a scale operation is in progress, the operator suppresses normal replica reconciliation for the affected servers, so a manually run cluster reconcile will not fight the schedule.
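
The missed-run rule in the steps above can be sketched in a few lines of Python. This is an illustration only, not the operator's code: `parse_duration` and `should_run_missed` are hypothetical helpers, the parser accepts only single-unit values such as `30s`, `30m`, or `1h`, and treating the end of the window as inclusive is an assumption.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical helper: handles only single-unit values like "30s", "30m", "1h".
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours"}


def parse_duration(text: str) -> timedelta:
    match = re.fullmatch(r"(\d+)([smh])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


def should_run_missed(scheduled_at: datetime, now: datetime,
                      missed_execution_window: str = "15m") -> bool:
    """Return True if a run scheduled at `scheduled_at` should still execute at `now`.

    Assumption: the end of the window is inclusive.
    """
    window = parse_duration(missed_execution_window)
    return scheduled_at <= now <= scheduled_at + window


fired = datetime(2024, 1, 1, 22, 0, tzinfo=timezone.utc)
print(should_run_missed(fired, fired + timedelta(minutes=20), "30m"))  # True
print(should_run_missed(fired, fired + timedelta(minutes=45), "30m"))  # False
```

With the documented default of `15m`, an operator that comes back 20 minutes after the cron time would skip the run and wait for the next occurrence.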

Enabling it

Add a scheduledScaling block to the server component in your PinotCluster resource:

apiVersion: startreedata.io/v2alpha1
kind: PinotCluster
metadata:
  name: my-pinot
  namespace: pinot
spec:
  components:
    server:
      scheduledScaling:
        enabled: true
        tenants:
          - tenant: DefaultTenant
            schedules:
              - name: nightly-scale-down
                action: SCHEDULED_SCALE_DOWN
                cron: "0 22 * * *"          # Every day at 22:00 UTC
                targetReplicaGroups: 1
                queryDrainTimeout: "10m"
                missedExecutionWindow: "30m"
              - name: morning-scale-up
                action: SCHEDULED_SCALE_UP
                cron: "0 6 * * *"           # Every day at 06:00 UTC
                targetReplicaGroups: 3
                queryDrainTimeout: "10m"
                missedExecutionWindow: "30m"

To disable scheduled scaling, set enabled: false (or remove the scheduledScaling block). When a tenant is removed from the spec, the operator deletes its ScheduledServerScaling resource and restores any servers that were left scaled down.

Configuration reference

scheduledScaling (under spec.components.server):

Field	Type	Description
`enabled`	boolean	Master switch for scheduled scaling on this cluster.
`tenants`	array	One entry per Pinot tenant you want to schedule. Each has `tenant` + `schedules`.

Each entry in tenants:

Field	Type	Description
`tenant`	string	Pinot server tenant name (e.g. `DefaultTenant`).
`schedules`	array	One or more scheduled actions for this tenant.

Each entry in schedules:

Field	Type	Required	Description
`name`	string	yes	A label for the schedule, used in logs and status.
`action`	enum	yes	`SCHEDULED_SCALE_DOWN` or `SCHEDULED_SCALE_UP`.
`cron`	string	yes	5-field Unix cron expression (`minute hour day-of-month month day-of-week`), evaluated in UTC. See examples below.
`targetReplicaGroups`	integer	yes	Desired number of server replica groups for the tenant after this action completes. Must be greater than 0.
`queryDrainTimeout`	string	no	Max time to wait for in-flight queries to drain off a server before scaling it down (e.g. `10m`, `30s`).
`missedExecutionWindow`	string	no	Grace period after the scheduled time during which a missed run still executes if the operator was down (e.g. `30m`, `1h`). Defaults to `15m`.
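
As a rough pre-flight check, the rules in the table above can be applied to a schedule entry before it goes into the PinotCluster spec. The sketch below is an assumption-laden illustration, not the operator's validation: `validate_schedule` is a hypothetical helper, it takes the entry as a plain Python dict, and it checks only the field count of the cron expression, not its contents.

```python
VALID_ACTIONS = {"SCHEDULED_SCALE_DOWN", "SCHEDULED_SCALE_UP"}
REQUIRED_FIELDS = ("name", "action", "cron", "targetReplicaGroups")


def validate_schedule(schedule: dict) -> list:
    """Return a list of problems found in one `schedules` entry (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in schedule:
            problems.append(f"missing required field: {field}")

    action = schedule.get("action")
    if action is not None and action not in VALID_ACTIONS:
        problems.append(f"unknown action: {action}")

    cron = schedule.get("cron")
    if cron is not None and len(str(cron).split()) != 5:
        problems.append("cron must have exactly 5 fields")

    target = schedule.get("targetReplicaGroups")
    if target is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = isinstance(target, int) and not isinstance(target, bool)
        if not is_int or target <= 0:
            problems.append("targetReplicaGroups must be an integer greater than 0")
    return problems


nightly = {
    "name": "nightly-scale-down",
    "action": "SCHEDULED_SCALE_DOWN",
    "cron": "0 22 * * *",
    "targetReplicaGroups": 1,
}
print(validate_schedule(nightly))  # []
print(validate_schedule({**nightly, "targetReplicaGroups": 0}))
```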

Cron format

Schedules use standard 5-field Unix cron (minute hour day-of-month month day-of-week). All times are evaluated in UTC — convert your local schedule to UTC before setting cron.

Expression	Meaning (UTC)
`0 22 * * *`	Every day at 22:00 UTC
`0 6 * * 1-5`	Weekdays at 06:00 UTC
`0 0 * * 0`	Every Sunday at midnight UTC
`30 18 * * 5`	Every Friday at 18:30 UTC
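
Because every expression is evaluated in UTC, a local wall-clock time has to be shifted before it is written into cron. The sketch below shows one way to do that for a daily schedule. `daily_cron_in_utc` is a hypothetical helper, and it assumes a fixed UTC offset, so it does not follow daylight-saving changes.

```python
from datetime import datetime, timedelta, timezone


def daily_cron_in_utc(local_hour: int, local_minute: int,
                      utc_offset_hours: float) -> str:
    """Build a daily 5-field cron expression in UTC from a local time.

    Assumption: the local zone has a fixed offset from UTC (no DST handling).
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    # The date is arbitrary; only the time of day matters for a daily schedule.
    local = datetime(2024, 1, 1, local_hour, local_minute, tzinfo=local_tz)
    utc = local.astimezone(timezone.utc)
    return f"{utc.minute} {utc.hour} * * *"


print(daily_cron_in_utc(22, 0, -8))    # 22:00 at UTC-8    -> "0 6 * * *"
print(daily_cron_in_utc(6, 30, 5.5))   # 06:30 at UTC+5:30 -> "0 1 * * *"
```

The helper covers only the daily case. When the conversion crosses midnight, as in the first example, a schedule restricted by day-of-week or day-of-month also needs its day fields shifted by hand.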

Typical pattern: nightly scale-down, morning scale-up

Pair a SCHEDULED_SCALE_DOWN with a SCHEDULED_SCALE_UP to shrink capacity during off-hours and restore it before peak load:

Scale down at night to a low targetReplicaGroups (e.g. 1).
Scale up in the morning back to your full targetReplicaGroups (e.g. 3).

Use the same tenant for both schedules. The scale-up restores exactly the servers that the matching scale-down removed.

Checking status

The operator tracks each tenant’s scaling state in the ScheduledServerScaling resource:

kubectl get scheduledserverscaling -n <namespace>
kubectl get scheduledserverscaling <name> -n <namespace> -o yaml

Key status fields:

Field	Description
`status.state`	`AVAILABLE`, `DELETING` (scale-down in progress), `DELETED` (scaled down), `RESTORING` (scale-up in progress), `RESTORED`.
`status.lastScalingOperation.action`	The most recent action: `SCHEDULED_SCALE_DOWN` or `SCHEDULED_SCALE_UP`.
`status.lastScalingOperation.lastOccurrence`	Timestamp of the cron fire time that triggered the operation.
`status.lastScalingOperation.serversAffected`	List of server StatefulSets affected by the operation.
`status.lastScalingOperation.targetReplicaGroups`	The target replica-group count of the last completed operation.
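
For scripted checks, the same fields can be read from the JSON form of the resource (for example, the output of `kubectl get scheduledserverscaling <name> -n <namespace> -o json`). The sketch below is illustrative: `summarize` is a hypothetical helper, and the sample object and its server names are made up to show the shape of the documented fields.

```python
import json

# States that the status table above describes as "in progress".
IN_PROGRESS_STATES = {"DELETING", "RESTORING"}


def summarize(resource: dict) -> dict:
    """Pick out the documented status fields from one ScheduledServerScaling object."""
    status = resource.get("status", {})
    last = status.get("lastScalingOperation", {})
    state = status.get("state")
    return {
        "state": state,
        "in_progress": state in IN_PROGRESS_STATES,
        "last_action": last.get("action"),
        "last_occurrence": last.get("lastOccurrence"),
        "servers_affected": last.get("serversAffected", []),
        "target_replica_groups": last.get("targetReplicaGroups"),
    }


# Hypothetical sample; real objects carry more fields and real server names.
sample = json.loads("""
{
  "status": {
    "state": "DELETED",
    "lastScalingOperation": {
      "action": "SCHEDULED_SCALE_DOWN",
      "lastOccurrence": "2024-01-01T22:00:00Z",
      "serversAffected": ["example-server-1", "example-server-2"],
      "targetReplicaGroups": 1
    }
  }
}
""")
print(summarize(sample))
```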

Notes and limitations

targetReplicaGroups must be greater than 0 — a schedule cannot remove every replica group.
If targetReplicaGroups equals the tenant’s current total replica groups, a scale-down is a no-op (nothing to remove).
Cron times are evaluated in UTC — convert your local time before setting cron.
Scheduled scaling only affects server StatefulSets for the named tenant. It does not change ZooKeeper, Controller, Broker, or Minion components.

Controller requirements for scale-down

At each scale-down, the operator asks the Pinot controller (GET /serverReplicaGroupScaleDown) which servers can be removed for the tenant. The controller only returns a server list when all of the following hold. If any fails, the scale-down is rejected and no servers are removed; fix the underlying condition and the next scheduled run (or a manual retry) will proceed.

Tenant and topology

The tenant exists and has servers tagged for it (<tenant>_OFFLINE / <tenant>_REALTIME).
Servers are exclusive to the tenant — no server tagged for this tenant may also carry tags for another tenant.
Every tenant server has a pool assignment, and each pool matches at least one of the tenant’s tags.
If a server has both OFFLINE and REALTIME tags, both must point to the same pool number.
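
To make the tenant and topology conditions concrete, here is a small sketch that applies them to an in-memory description of the servers. It is not the controller's implementation. The input shape (a dict of server name to `tags` and `pools`) and the helper `topology_violations` are hypothetical, and treating every tag other than the tenant's two tags as belonging to another tenant is a simplifying assumption.

```python
def topology_violations(tenant: str, servers: dict) -> list:
    """Return human-readable violations of the tenant/topology rules (empty if none).

    `servers` maps a server name to {"tags": [...], "pools": {tag: pool_number}}.
    This input shape is an assumption made for the sketch.
    """
    offline, realtime = f"{tenant}_OFFLINE", f"{tenant}_REALTIME"
    tenant_tags = {offline, realtime}
    tenant_servers = {name: cfg for name, cfg in servers.items()
                      if tenant_tags & set(cfg.get("tags", []))}
    if not tenant_servers:
        return [f"no servers are tagged for tenant {tenant}"]

    violations = []
    for name, cfg in sorted(tenant_servers.items()):
        tags = set(cfg.get("tags", []))
        pools = cfg.get("pools", {})
        # Simplification: any non-tenant tag counts as another tenant's tag.
        if tags - tenant_tags:
            violations.append(f"{name}: also carries tags outside the tenant")
        if not pools:
            violations.append(f"{name}: no pool assignment")
        elif set(pools) - tenant_tags:
            violations.append(f"{name}: pool entry does not match a tenant tag")
        if offline in pools and realtime in pools and pools[offline] != pools[realtime]:
            violations.append(f"{name}: OFFLINE and REALTIME tags point to different pools")
    return violations


servers = {
    "server-0": {"tags": ["DefaultTenant_OFFLINE", "DefaultTenant_REALTIME"],
                 "pools": {"DefaultTenant_OFFLINE": 0, "DefaultTenant_REALTIME": 0}},
    "server-1": {"tags": ["DefaultTenant_OFFLINE", "DefaultTenant_REALTIME"],
                 "pools": {"DefaultTenant_OFFLINE": 1, "DefaultTenant_REALTIME": 2}},
}
print(topology_violations("DefaultTenant", servers))
# ['server-1: OFFLINE and REALTIME tags point to different pools']
```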

Target value

targetReplicaGroups must be ≥ 1 and ≤ the current number of replica groups (pools).
If targetReplicaGroups equals the current replica-group count, the call succeeds but returns no servers (nothing to remove).

Table configuration (when the tenant has tables)

No table may use instancePartitionsMap (which bypasses pool-based assignment).
All non-dimension tables must use pool-based, replica-group-aware instance assignment.
Each table’s configured numReplicaGroups (when non-zero) must equal the current pool count.

No rebalance in progress

No table in the tenant may have an active or failed table rebalance job.
The tenant may not have an active, aborted, cancelled, or unscheduled tenant rebalance job.

The controller evaluates these against a point-in-time snapshot of cluster state. The highest-numbered pools are always selected for removal, so a given (tenant, targetReplicaGroups) request is deterministic for a given cluster state.
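
The target-value rules and the highest-numbered-pools selection can be illustrated with a short sketch. It models only what this page states (the allowed range for the target and which pools are dropped). `pools_to_remove` is a hypothetical helper, not a controller API, and the real controller returns servers rather than pool numbers.

```python
def pools_to_remove(current_pools, target_replica_groups: int) -> list:
    """Return the pool numbers a scale-down to `target_replica_groups` would drop.

    Keeps the lowest-numbered pools and removes the highest-numbered ones.
    """
    pools = sorted(set(current_pools))
    if not 1 <= target_replica_groups <= len(pools):
        raise ValueError(
            f"targetReplicaGroups must be between 1 and {len(pools)}, "
            f"got {target_replica_groups}"
        )
    # Equal to the current count: nothing to remove, so the slice is empty.
    return pools[target_replica_groups:]


print(pools_to_remove([0, 1, 2], 1))  # [1, 2]
print(pools_to_remove([0, 1, 2], 3))  # []
```

Calling it with a target of 4 against three pools raises `ValueError`, mirroring the rejected request described under Target value.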

Troubleshooting

A scale-down is rejected, with no servers removed, whenever one of the controller requirements above isn’t met. Fix the underlying condition — the next scheduled run (or a manual retry) will proceed once it’s resolved.

Symptom	Likely cause	What to do
Scale-down never removes any servers	A server tagged for the tenant is also tagged for another tenant	Make servers exclusive to one tenant, or scope the schedule to a tenant that already has dedicated servers.
Scale-down never removes any servers	A server is missing a pool assignment, or OFFLINE/REALTIME tags point to different pools	Fix the server’s pool assignment so it matches the tenant’s tags, and align OFFLINE/REALTIME pool numbers.
Scale-down never removes any servers	`targetReplicaGroups` is greater than the tenant’s current replica-group (pool) count	Lower `targetReplicaGroups` to at most the current pool count.
Scale-down never removes any servers	A table uses `instancePartitionsMap`	Migrate the table to pool-based instance assignment before scheduling scale-down for its tenant.
Scale-down never removes any servers	A table’s `numReplicaGroups` doesn’t match the current pool count	Update the table’s replica-group config to match the tenant’s current pool count.
Scale-down never removes any servers	A table rebalance is active or failed, or a tenant rebalance is active, aborted, cancelled, or unscheduled	Let the rebalance finish (or clear the stuck job), then wait for the next scheduled run or retry manually.
Scale-up doesn’t fully restore capacity	The matching scale-down removed a different set of servers than expected	Check `status.lastScalingOperation.serversAffected` on the `ScheduledServerScaling` resource for both operations to confirm which servers were removed vs. restored.
A schedule didn’t run at all	The operator was down past `missedExecutionWindow` at the cron time	Increase `missedExecutionWindow` if operator restarts routinely take longer than the current setting, or trigger the action manually.

Related

Replica Group based Workload Isolation: background on replica groups and pool-based instance assignment.
Cluster Health Dashboard: check overall cluster health, including replication and instance-pool checks relevant to scale-down eligibility.
