Overview

Some Minion tasks, such as large ingestion jobs, purges, or exports, process too much data to finish in a single run. They are broken up into multiple batches, and today each batch normally requires a separate trigger: either you call the task API again, or you wait for the next cron run.

Minion Task Orchestration removes that manual step. When enabled for a task, StarTree Cloud tracks the multi-batch job as a task plan: a single trigger creates the plan, and the controller automatically generates and submits each subsequent batch as soon as the previous one finishes, until there is no more work left to do.

A task plan is a long-running orchestration for one table and task type combination. Only one active plan is allowed per table and task type at a time, and the plan tracks its overall status, its batches, and its progress until it reaches a terminal state (COMPLETED, CANCELLED, or FAILED).

Task orchestration is supported for both the ad hoc Execute API and scheduled (cron) task triggers.

How is this different from a normal task run?

| Aspect | Normal task run | Orchestrated task run |
| --- | --- | --- |
| Trigger | Each run is independent (ad hoc API call or cron). | One trigger creates a plan; later batches are driven automatically by batch completion. |
| Scope | One invocation generates one batch of subtasks. | One plan spans multiple batches over time. |
| Next batch | You must trigger again, or wait for the next cron run. | The controller automatically submits the next batch when the current one finishes. |
| Progress tracking | Only the Helix task state is available. | A task plan tracks status, batch history, and progress, queryable via API. |
| Concurrency | Multiple runs for the same table/task can overlap. | At most one active plan per table and task type; new triggers are rejected or skipped while a plan is active. |
In short: a normal run produces one batch per trigger, while an orchestrated run produces a chain of batches from a single trigger, running until the job is done.
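
The difference can be illustrated with a toy model in Python. This is not StarTree code: `total_inputs` and `batch_size` are hypothetical parameters chosen only to show how many triggers each mode needs for the same job.

```python
def plan_batches(total_inputs: int, batch_size: int) -> list:
    """Split a job into per-batch sizes (illustrative model only)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batches = []
    remaining = max(total_inputs, 0)
    while remaining > 0:
        size = min(batch_size, remaining)
        batches.append(size)
        remaining -= size
    return batches


def triggers_needed(total_inputs: int, batch_size: int, orchestrated: bool) -> int:
    """Count the triggers needed to drain the job in this toy model."""
    batches = plan_batches(total_inputs, batch_size)
    if not batches:
        return 0  # nothing to process in this simplified model
    # Normal run: one trigger per batch.
    # Orchestrated run: one trigger creates the plan; the controller
    # chains the remaining batches on its own.
    return 1 if orchestrated else len(batches)


print(plan_batches(1000, 300))                        # [300, 300, 300, 100]
print(triggers_needed(1000, 300, orchestrated=False)) # 4
print(triggers_needed(1000, 300, orchestrated=True))  # 1
```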

Enabling task orchestration

Task orchestration requires the feature to be enabled at the cluster level, and then opted into per task.

Cluster-level control

Task orchestration is enabled by default at the cluster level. It can be turned off entirely with the controller configuration property below, which acts as a cluster-wide kill switch — when disabled, no task plans are created or progressed, regardless of any per-task setting.
| Property | Default | Description |
| --- | --- | --- |
| controller.startree.task.manager.enableTaskOrchestration | true | Enables the orchestration infrastructure cluster-wide. |
Enabling this property does not by itself orchestrate any task — each task must still opt in individually, as described below.

Enabling for an ad hoc trigger

Add enableTaskOrchestration to the task configuration when calling the Execute API:
{
  "taskType": "SegmentPurgeTask",
  "tableName": "myTable_OFFLINE",
  "taskConfigs": {
    "enableTaskOrchestration": "true"
  }
}
  • Only task types that support orchestration will use this path; other task types ignore the flag and run as before.
  • If a plan is already active for that table and task type, the ad hoc request is rejected until the existing plan completes or is aborted.
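
As a minimal sketch, the ad hoc payload above could be wrapped in an HTTP request as follows. The path `/tasks/execute` is an assumption based on the open-source Apache Pinot controller API and should be confirmed for your deployment; the controller URL and the `extra_headers` parameter (for whatever authentication your cluster requires) are hypothetical. The request is only built here, not sent.

```python
import json
import urllib.request

# Assumption: the Execute API lives at POST /tasks/execute, as in
# open-source Apache Pinot. Verify this path for your cluster.
EXECUTE_PATH = "/tasks/execute"


def build_execute_request(controller_url, task_type, table_name, extra_headers=None):
    """Build (but do not send) an ad hoc request with orchestration enabled."""
    payload = {
        "taskType": task_type,
        "tableName": table_name,
        # The flag is passed as the string "true", matching the JSON above.
        "taskConfigs": {"enableTaskOrchestration": "true"},
    }
    headers = {"Content-Type": "application/json"}
    headers.update(extra_headers or {})  # e.g. an auth header, if required
    return urllib.request.Request(
        controller_url.rstrip("/") + EXECUTE_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_execute_request(
    "http://localhost:9000", "SegmentPurgeTask", "myTable_OFFLINE"
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would submit it. If a plan is already active for the table and task type, expect the call to fail, since ad hoc requests are rejected in that case.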

Enabling for a scheduled (cron) trigger

Add enableTaskOrchestration alongside the schedule key in the table configuration:
{
  "task": {
    "taskTypeConfigsMap": {
      "SegmentPurgeTask": {
        "schedule": "0 0 2 * * ?",
        "enableTaskOrchestration": "true"
      }
    }
  }
}
When a cron trigger fires and orchestration is enabled, the controller creates a task plan from the table configuration and begins multi-batch orchestration automatically.
If a scheduled trigger fires while a plan is already active for that table and task type, the trigger is skipped (not rejected) and the existing plan continues unaffected. This is different from the ad hoc path, which rejects the request outright.
If the task’s generator doesn’t support orchestration, the trigger automatically falls back to the normal, one-shot batch generation.
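
If table configurations are managed programmatically, the flag can be added with a small helper. The sketch below is illustrative only: `enable_orchestration` is a hypothetical function name, and it operates on a table configuration already loaded into a Python dict. It does not fetch or upload anything.

```python
import copy


def enable_orchestration(table_config: dict, task_type: str) -> dict:
    """Return a copy of table_config with orchestration enabled for task_type."""
    updated = copy.deepcopy(table_config)  # leave the caller's dict untouched
    type_configs = updated.setdefault("task", {}).setdefault("taskTypeConfigsMap", {})
    # Task config values are strings, so the flag is "true" rather than True.
    type_configs.setdefault(task_type, {})["enableTaskOrchestration"] = "true"
    return updated


original = {
    "task": {
        "taskTypeConfigsMap": {
            "SegmentPurgeTask": {"schedule": "0 0 2 * * ?"}
        }
    }
}
updated = enable_orchestration(original, "SegmentPurgeTask")
print(updated["task"]["taskTypeConfigsMap"]["SegmentPurgeTask"])
```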

Disabling orchestration for a single task type

To disable orchestration for one problematic task type without touching the cluster-wide switch, set forceLegacyTaskFlow in that task type’s configuration:
{
  "task": {
    "taskTypeConfigsMap": {
      "SegmentPurgeTask": {
        "schedule": "0 0 2 * * ?",
        "forceLegacyTaskFlow": "true"
      }
    }
  }
}
  • forceLegacyTaskFlow takes precedence over enableTaskOrchestration — if both are true, the legacy flow is used.
  • It only stops new plans from being created. A plan that’s already active for that table and task type is left to finish on its own; it isn’t aborted.
  • If a single scheduled trigger covers multiple tables for the same task type, setting forceLegacyTaskFlow on any of them routes the whole trigger cycle to the legacy flow.
Some task types (for example StarTreeAlterTableTask) only support the orchestrated flow and reject ad hoc generation without it. Forcing the legacy flow for such a task type is an explicit choice that may cause task generation to fail — only use it if you understand the tradeoff.
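
The precedence rules described so far can be summarized as a small decision function. This is a reading of the documented behavior, not the controller's actual implementation: the function names, the result labels, and the case-insensitive parsing of "true" are assumptions made for illustration. It also does not model task types that only support the orchestrated flow.

```python
def _is_true(config: dict, key: str) -> bool:
    # Assumption: flags are strings and compared case-insensitively.
    return str(config.get(key, "false")).lower() == "true"


def resolve_flow(cluster_enabled, task_config, generator_supported, plan_active, trigger):
    """Decide what happens to one trigger; trigger is "ADHOC" or "CRON"."""
    if not cluster_enabled:
        return "LEGACY"  # cluster-wide kill switch wins over everything
    if _is_true(task_config, "forceLegacyTaskFlow"):
        return "LEGACY"  # takes precedence over enableTaskOrchestration
    if not _is_true(task_config, "enableTaskOrchestration"):
        return "LEGACY"  # the task has not opted in
    if not generator_supported:
        return "LEGACY"  # falls back to one-shot generation
    if plan_active:
        # Ad hoc requests are rejected; scheduled triggers are skipped.
        return "REJECTED" if trigger == "ADHOC" else "SKIPPED"
    return "ORCHESTRATED"


def cycle_uses_legacy(task_configs_for_tables) -> bool:
    """One scheduled cycle covering several tables goes legacy if any table forces it."""
    return any(_is_true(cfg, "forceLegacyTaskFlow") for cfg in task_configs_for_tables)


both = {"enableTaskOrchestration": "true", "forceLegacyTaskFlow": "true"}
print(resolve_flow(True, both, True, False, "CRON"))  # LEGACY
```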

Supported task types

Orchestration is available only for task types whose generator implements batch-by-batch generation. Currently supported task types:
| Task type | Description |
| --- | --- |
| File Ingestion Task | Supports all ingestion modes (sync/append, with or without consistent push) with automatic batch chaining and retries. Also supports consistent push with full-swap mode, completing a swap reliably across retries while avoiding accidental duplicate swaps. |
| Segment Purge Task | Chains batches automatically for large purge jobs that span more segments than a single batch can process. |
| Data Export Task | Exports completed segments from a source real-time table to an external target. Segments beyond the per-batch limit, and any pending commits, automatically flow into subsequent batches until the export queue is fully drained. |
For any other task type, enabling enableTaskOrchestration has no effect — the task always uses the standard, one-shot generation flow.

Monitoring task plans via API

When available, the controller exposes REST endpoints for inspecting and managing task plans. All endpoints require the same authentication and table-level authorization as other task APIs.
| Method | Path | Description |
| --- | --- | --- |
| GET | /tasks/taskPlans/isActive/{taskPlanId} | Check whether a task plan is currently active. |
| GET | /tasks/taskPlans/{planId} | Get the full details of a task plan by ID. |
| GET | /tasks/taskPlans | List all task plan IDs for a table and task type. Requires tableNameWithType and taskType query parameters. |
| GET | /tasks/taskPlans/active | List all active task plans, optionally filtered by tableNameWithType and/or taskType. |
| GET | /tasks/taskPlans/active/ids | List active task plan IDs only, with the same optional filters. |
| DELETE | /tasks/taskPlans/{planId} | Abort a task plan. Sets its status to ABORTING; no new batches are generated, though any already-running batch may still complete. Poll the isActive endpoint to confirm the plan has fully stopped. |
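
The abort-then-poll pattern for the DELETE endpoint can be sketched as below. The response format of the isActive endpoint is not described here, so the sketch takes a caller-supplied `check_active` callable that returns a boolean. That callable, the helper names, and the timeout defaults are all assumptions for illustration.

```python
import time
from urllib.parse import quote


def plan_url(controller_url: str, plan_id: str, is_active: bool = False) -> str:
    """Build the URL for a plan's detail/abort path or its isActive path."""
    base = controller_url.rstrip("/") + "/tasks/taskPlans"
    if is_active:
        base += "/isActive"
    return f"{base}/{quote(plan_id, safe='')}"


def wait_until_inactive(check_active, timeout_s=600.0, interval_s=5.0,
                        sleep=time.sleep, clock=time.monotonic) -> bool:
    """Poll check_active() until it returns False.

    Returns True if the plan stopped, False if timeout_s elapsed first.
    check_active is expected to call the isActive endpoint and return a bool.
    """
    deadline = clock() + timeout_s
    while True:
        if not check_active():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Demonstration with canned answers instead of real HTTP calls.
answers = iter([True, True, False])
stopped = wait_until_inactive(lambda: next(answers), sleep=lambda _s: None)
print(stopped)  # True
```

In real use, issue the DELETE against `plan_url(controller, plan_id)` first, then pass a function that queries `plan_url(controller, plan_id, is_active=True)` as `check_active`.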

Task plan data model

Each task plan returned by the API contains the following fields:
| Field | Type | Description |
| --- | --- | --- |
| planId | String | Unique ID, formatted as <tableNameWithType>__<taskType>__<uuid>. |
| tableNameWithType | String | The table this plan belongs to (e.g. myTable_OFFLINE). |
| taskType | String | The task type this plan orchestrates (e.g. SegmentPurgeTask). |
| status | Enum | One of ACTIVE, COMPLETED, CANCELLED, ABORTING, FAILED. |
| source | String | How the plan was created: ADHOC_CONFIG or TABLE_CONFIG. |
| properties | Map<String, String> | Plan-level configuration used to generate each batch. |
| batches | List | Ordered list of batch records submitted so far (see below). |
| inputsToProcess | Long | Total number of input units (e.g. segments) to process. |
| inputsBeingProcessed | Long | Number of input units currently in flight. |
| inputUnit | String | The unit the counts above are measured in (e.g. "segments"). |
| statusMessage | String | Human-readable status, such as the reason for an abort or completion. |
| customStats | Map<String, String> | Optional, task-type-specific statistics. |
Each entry in batches includes:
| Field | Type | Description |
| --- | --- | --- |
| batchSequence | Integer | 0-based sequence number of the batch. |
| submittedAtMs | Long | Time (epoch ms) the batch was submitted. |
| submittedTaskName | String | Name of the parent task submitted for this batch. |
| taskState | Enum | Current state of that batch’s parent task (e.g. NOT_STARTED, IN_PROGRESS, COMPLETED). |
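
A client might consume these fields as in the following sketch. The sample plan is made-up illustrative data, not real API output, and the helper names are hypothetical. Splitting the plan ID from the right assumes that neither the task type nor the UUID contains a double underscore.

```python
from typing import NamedTuple

# Terminal states as listed in the overview; ABORTING is not terminal.
TERMINAL_STATUSES = {"COMPLETED", "CANCELLED", "FAILED"}


class PlanId(NamedTuple):
    table_name_with_type: str
    task_type: str
    uuid: str


def parse_plan_id(plan_id: str) -> PlanId:
    """Split <tableNameWithType>__<taskType>__<uuid> from the right."""
    parts = plan_id.rsplit("__", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unexpected plan ID format: {plan_id!r}")
    return PlanId(*parts)


def summarize(plan: dict) -> str:
    """Render a one-line progress summary from a task plan dict."""
    pid = parse_plan_id(plan["planId"])
    state = "terminal" if plan["status"] in TERMINAL_STATUSES else "in progress"
    return (
        f"{pid.task_type} on {pid.table_name_with_type}: {plan['status']} ({state}), "
        f"{len(plan.get('batches', []))} batches submitted, "
        f"{plan.get('inputsBeingProcessed', 0)} of {plan.get('inputsToProcess', 0)} "
        f"{plan.get('inputUnit', 'inputs')} in flight"
    )


# Illustrative data only.
sample = {
    "planId": "myTable_OFFLINE__SegmentPurgeTask__1234-abcd",
    "status": "ACTIVE",
    "batches": [{"batchSequence": 0}, {"batchSequence": 1}],
    "inputsToProcess": 120,
    "inputsBeingProcessed": 30,
    "inputUnit": "segments",
}
print(summarize(sample))
```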

Task plan cleanup

Task plans are cleaned up automatically when their table is dropped, so deleted tables don’t leave orphaned plans behind:
  • Dropping a table synchronously removes all of its task plans (across every task type) before the table’s metadata is torn down.
  • A periodic background sweep also reaps any plans whose table no longer exists, catching cases the synchronous cleanup may have missed. This sweep runs on an interval controlled by:
| Property | Default | Description |
| --- | --- | --- |
| controller.startree.task.manager.orphanedTaskPlanCleanupIntervalInSeconds | 28800 (8 hours) | How often the controller sweeps for and removes plans belonging to deleted tables. |

Observability and metrics

Task orchestration emits controller-side metrics scoped to tableNameWithType and taskType, so you can monitor and alert on orchestration health. Global gauges are emitted per controller. Because only the controller that leads a table progresses that table’s plans, aggregate the global metrics across all controllers when building dashboards.

Meters

| Metric | Meaning |
| --- | --- |
| taskPlanAborted | A plan was aborted, either due to a failure-threshold breach or a generator-driven abort. |
| orchestrationCycleFailure | An exception occurred while progressing a plan. |
| taskPlanProgressionBlocked | Plan progression was skipped because the task queue is paused, or resource-utilization limits were hit. Stays set while blocked and clears once progression can resume. |
| scheduledTriggerActivePlanConflict | A scheduled trigger was skipped because a plan was already active for that table and task type. |
| taskGenerationFailureCount | Task generation failed for the table/task type. |

Gauges

| Metric | Meaning |
| --- | --- |
| taskPlanInputsToProcess, taskPlanInputsBeingProcessed, taskPlanInputsProcessed | Plan progress: total, in-flight, and cumulative-completed input units. |
| activeTaskPlansCount | Number of ACTIVE plans this controller is currently progressing. |
| taskPlansAbortingCount | Number of plans stuck in ABORTING, waiting for in-flight subtasks to terminate. |
| orchestrationTimeSinceLastPollMs | Time since the plan-completion polling job last ran. Rising continuously would indicate the polling job has stalled. |
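
As a rough sketch of how these gauges might feed a dashboard or alert, the helpers below sum activeTaskPlansCount across controllers and flag a possible polling stall. The stall heuristic is an assumption, not a documented rule: it requires the last few samples of orchestrationTimeSinceLastPollMs to rise without a reset and the latest one to exceed a hypothetical `threshold_ms`, which you would pick relative to your polling interval. The sample values are made up.

```python
def total_active_plans(per_controller_counts: dict) -> int:
    """Sum the per-controller activeTaskPlansCount gauge into a cluster total."""
    return sum(per_controller_counts.values())


def polling_stalled(samples_ms, threshold_ms: int, min_samples: int = 3) -> bool:
    """Heuristic stall check on orchestrationTimeSinceLastPollMs samples.

    Stalled means: the most recent min_samples readings are strictly
    increasing (the gauge never reset) and the latest exceeds threshold_ms.
    """
    recent = list(samples_ms)[-min_samples:]
    if len(recent) < min_samples:
        return False  # not enough data to judge
    rising = all(a < b for a, b in zip(recent, recent[1:]))
    return rising and recent[-1] > threshold_ms


# Illustrative samples only.
print(total_active_plans({"controller-0": 2, "controller-1": 0, "controller-2": 1}))  # 3
print(polling_stalled([1000, 61000, 121000, 181000], threshold_ms=120000))  # True
print(polling_stalled([1000, 61000, 500, 60500], threshold_ms=120000))      # False
```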

FAQs

Do I need to change anything for task types that don’t support orchestration?

No. Setting enableTaskOrchestration on an unsupported task type has no effect — it continues to use the standard one-shot generation flow.

What happens if I trigger an ad hoc run while a plan is already active?

The request is rejected with an error. Wait for the active plan to complete, or abort it using the DELETE /tasks/taskPlans/{planId} endpoint, before triggering again.

What happens if a scheduled (cron) trigger fires while a plan is active?

Unlike the ad hoc path, the scheduled trigger is silently skipped rather than rejected, and the scheduledTriggerActivePlanConflict metric is incremented. The existing plan is unaffected and continues to progress.

How do I stop an in-progress plan?

Call DELETE /tasks/taskPlans/{planId}. This moves the plan to ABORTING; any batch already running is allowed to finish, but no new batch is generated. Poll GET /tasks/taskPlans/isActive/{planId} until it reports the plan is no longer active.

How can I tell whether a task type supports orchestration?

Check the supported task types table. If a task type isn’t listed there, enabling enableTaskOrchestration for it has no effect.