# Minion Task Orchestration

> Automatically chain multi-batch Minion tasks to completion using task plans, without manual re-triggers or a fixed schedule.

## Overview

Some Minion tasks — such as large ingestion jobs, purges, or exports — process too much data to finish in a single run. They're broken up into multiple **batches**, and today each batch normally requires a separate trigger: either you call the task API again, or you wait for the next cron run.

**Minion Task Orchestration** removes that manual step. When enabled for a task, StarTree Cloud tracks the multi-batch job as a **task plan**: a single trigger creates the plan, and the controller automatically generates and submits each subsequent batch as soon as the previous one finishes, until there's no more work left to do.

A task plan is a long-running orchestration for one **table + task type** combination. Only one active plan is allowed per table and task type at a time, and the plan tracks overall status, its batches, and progress until it reaches a terminal state (`COMPLETED`, `CANCELLED`, or `FAILED`).

Task orchestration is supported for both the ad hoc [Execute API](/api-reference/task/execute-a-task-on-minion) and scheduled (cron) task triggers.

## How is this different from a normal task run?

| Aspect                | Normal task run                                        | Orchestrated task run                                                                                         |
| :-------------------- | :----------------------------------------------------- | :------------------------------------------------------------------------------------------------------------ |
| **Trigger**           | Each run is independent (ad hoc API call or cron).     | One trigger creates a **plan**; later batches are driven automatically by batch completion.                   |
| **Scope**             | One invocation generates one batch of subtasks.        | One plan spans multiple batches over time.                                                                    |
| **Next batch**        | You must trigger again, or wait for the next cron run. | The controller automatically submits the next batch when the current one finishes.                            |
| **Progress tracking** | Only the Helix task state is available.                | A task plan tracks status, batch history, and progress, queryable via API.                                    |
| **Concurrency**       | Multiple runs for the same table/task can overlap.     | At most one active plan per table and task type; new triggers are rejected or skipped while a plan is active. |

In short: a normal run produces one batch per trigger, while an orchestrated run produces a chain of batches from a single trigger, running until the job is done.

## Enabling task orchestration

A task is orchestrated only when the feature is enabled at the cluster level and the individual task opts in.

### Cluster-level control

Task orchestration is **enabled by default** at the cluster level. It can be turned off entirely with the controller configuration property below, which acts as a cluster-wide kill switch — when disabled, no task plans are created or progressed, regardless of any per-task setting.

| Property                                                   | Default | Description                                            |
| :--------------------------------------------------------- | :------ | :----------------------------------------------------- |
| `controller.startree.task.manager.enableTaskOrchestration` | `true`  | Enables the orchestration infrastructure cluster-wide. |

Enabling this property does not by itself orchestrate any task — each task must still opt in individually, as described below.

### Enabling for an ad hoc trigger

Add `enableTaskOrchestration` to the task configuration when calling the [Execute API](/api-reference/task/execute-a-task-on-minion):

```json
{
  "taskType": "SegmentPurgeTask",
  "tableName": "myTable_OFFLINE",
  "taskConfigs": {
    "enableTaskOrchestration": "true"
  }
}
```

* Only [task types that support orchestration](#supported-task-types) will use this path; other task types ignore the flag and run as before.
* If a plan is already active for that table and task type, the ad hoc request is **rejected** until the existing plan completes or is aborted.
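
As a minimal sketch, the request body above could be built and sent from Python as follows. The `/tasks/execute` path, the bearer-token header, and the `controller_url` argument are assumptions that do not come from this page; check the Execute API reference for the actual endpoint and authentication scheme.

```python
import json
import urllib.request


def build_execute_request(table_name, task_type, orchestrate=True):
    """Build the Execute API body shown above for one table and task type."""
    return {
        "taskType": task_type,
        "tableName": table_name,
        # Config values are strings, matching the JSON example above.
        "taskConfigs": {
            "enableTaskOrchestration": "true" if orchestrate else "false"
        },
    }


def submit(controller_url, body, token=None):
    """POST the body to the controller (path and auth header are assumed)."""
    request = urllib.request.Request(
        controller_url.rstrip("/") + "/tasks/execute",  # assumed path
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    if token:
        request.add_header("Authorization", "Bearer " + token)  # assumed scheme
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")


body = build_execute_request("myTable_OFFLINE", "SegmentPurgeTask")
print(json.dumps(body, indent=2))
```

With `urllib`, a 4xx or 5xx response raises `urllib.error.HTTPError`, so a request rejected because a plan is already active would likely surface as that exception rather than as a return value.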

### Enabling for a scheduled (cron) trigger

Add `enableTaskOrchestration` alongside the `schedule` key in the table configuration:

```json
{
  "task": {
    "taskTypeConfigsMap": {
      "SegmentPurgeTask": {
        "schedule": "0 0 2 * * ?",
        "enableTaskOrchestration": "true"
      }
    }
  }
}
```

When a cron trigger fires and orchestration is enabled, the controller creates a task plan from the table configuration and begins multi-batch orchestration automatically.

<Note>
  If a scheduled trigger fires while a plan is already active for that table and task type, the trigger is **skipped** (not rejected) and the existing plan continues unaffected. This is different from the ad hoc path, which rejects the request outright.
</Note>

If the task's generator doesn't support orchestration, the trigger automatically falls back to the normal, one-shot batch generation.
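
If table configurations are managed from scripts, the opt-in can be applied programmatically. The following is a sketch only: `enable_orchestration` is a hypothetical helper, and how the table configuration is fetched and written back is outside its scope. It refuses to touch a task type that has no `schedule`, because the flag is described here as sitting alongside that key.

```python
import copy


def enable_orchestration(table_config, task_type):
    """Return a copy of table_config with orchestration enabled for task_type."""
    updated = copy.deepcopy(table_config)  # leave the caller's dict untouched
    type_configs = updated.get("task", {}).get("taskTypeConfigsMap", {})
    task_type_config = type_configs.get(task_type)
    if task_type_config is None or "schedule" not in task_type_config:
        raise ValueError(f"{task_type} has no schedule in this table config")
    # Values are strings, as in the JSON example above.
    task_type_config["enableTaskOrchestration"] = "true"
    return updated


table_config = {
    "task": {
        "taskTypeConfigsMap": {
            "SegmentPurgeTask": {"schedule": "0 0 2 * * ?"}
        }
    }
}
updated_config = enable_orchestration(table_config, "SegmentPurgeTask")
print(updated_config["task"]["taskTypeConfigsMap"]["SegmentPurgeTask"])
```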

### Disabling orchestration for a single task type

To disable orchestration for one problematic task type without touching the cluster-wide switch, set `forceLegacyTaskFlow` in that task type's configuration:

```json
{
  "task": {
    "taskTypeConfigsMap": {
      "SegmentPurgeTask": {
        "schedule": "0 0 2 * * ?",
        "forceLegacyTaskFlow": "true"
      }
    }
  }
}
```

* `forceLegacyTaskFlow` takes precedence over `enableTaskOrchestration` — if both are `true`, the legacy flow is used.
* It only stops **new** plans from being created. A plan that's already active for that table and task type is left to finish on its own; it isn't aborted.
* If a single scheduled trigger covers multiple tables for the same task type, setting `forceLegacyTaskFlow` on **any** of them routes the whole trigger cycle to the legacy flow.

<Warning>
  Some task types (for example `StarTreeAlterTableTask`) only support the orchestrated flow and reject ad hoc generation without it. Forcing the legacy flow for such a task type is an explicit choice that may cause task generation to fail — only use it if you understand the tradeoff.
</Warning>
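
The precedence rules in this section can be summarized in a small decision function. This is an illustrative sketch, not the controller's implementation: the function names are hypothetical, and treating values case-insensitively is an assumption.

```python
def is_true(config, key):
    """Config values are strings such as "true"; a missing key counts as false."""
    return str(config.get(key, "false")).strip().lower() == "true"


def uses_orchestration(cluster_enabled, task_type_config, generator_supports_it):
    """Decide whether a new trigger would create a task plan."""
    if not cluster_enabled:
        return False  # cluster-wide kill switch overrides everything
    if is_true(task_type_config, "forceLegacyTaskFlow"):
        return False  # takes precedence over enableTaskOrchestration
    if not generator_supports_it:
        return False  # unsupported task types ignore the flag
    return is_true(task_type_config, "enableTaskOrchestration")


def cycle_forced_to_legacy(per_table_configs):
    """One scheduled trigger covering several tables: any table forcing
    the legacy flow routes the whole trigger cycle to it."""
    return any(is_true(c, "forceLegacyTaskFlow") for c in per_table_configs)


both_set = {"enableTaskOrchestration": "true", "forceLegacyTaskFlow": "true"}
print(uses_orchestration(True, both_set, True))
print(cycle_forced_to_legacy([{"enableTaskOrchestration": "true"}, both_set]))
```

The sketch covers only whether new plans are created; a plan that is already active keeps running regardless of these flags.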

## Supported task types

Orchestration is available only for task types whose generator implements batch-by-batch generation. Currently supported task types:

| Task type                                                                                      | Description                                                                                                                                                                                                                                                    |
| :--------------------------------------------------------------------------------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [**File Ingestion Task**](/corecapabilities/ingestdata/adv-concepts/batch/file-ingestion-task) | Supports all ingestion modes (sync/append, with or without consistent push) with automatic batch chaining and retries. Also supports consistent push with full-swap mode, completing a swap reliably across retries while avoiding accidental duplicate swaps. |
| [**Segment Purge Task**](/corecapabilities/manage-data/purge-task)                             | Chains batches automatically for large purge jobs that span more segments than a single batch can process.                                                                                                                                                     |
| [**Data Export Task**](/corecapabilities/external-table/data-export-task)                      | Exports completed segments from a source real-time table to an external target. Segments beyond the per-batch limit, and any pending commits, automatically flow into subsequent batches until the export queue is fully drained.                              |

For any other task type, enabling `enableTaskOrchestration` has no effect — the task always uses the standard, one-shot generation flow.

## Monitoring task plans via API

When available, the controller exposes REST endpoints for inspecting and managing task plans. All endpoints require the same authentication and table-level authorization as other task APIs.

| Method   | Path                                     | Description                                                                                                                                                                                              |
| :------- | :--------------------------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `GET`    | `/tasks/taskPlans/isActive/{taskPlanId}` | Check whether a task plan is currently active.                                                                                                                                                           |
| `GET`    | `/tasks/taskPlans/{planId}`              | Get the full details of a task plan by ID.                                                                                                                                                               |
| `GET`    | `/tasks/taskPlans`                       | List all task plan IDs for a table and task type. Requires `tableNameWithType` and `taskType` query parameters.                                                                                          |
| `GET`    | `/tasks/taskPlans/active`                | List all active task plans, optionally filtered by `tableNameWithType` and/or `taskType`.                                                                                                                |
| `GET`    | `/tasks/taskPlans/active/ids`            | List active task plan IDs only, with the same optional filters.                                                                                                                                          |
| `DELETE` | `/tasks/taskPlans/{planId}`              | Abort a task plan. Sets its status to `ABORTING`; no new batches are generated, though any already-running batch may still complete. Poll the `isActive` endpoint to confirm the plan has fully stopped. |
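
As a rough illustration, the endpoint URLs in the table can be assembled with the standard library. The controller address `http://localhost:9000` is a placeholder, and the helper names are hypothetical; only the paths and query parameter names come from the table above.

```python
from urllib.parse import quote, urlencode


def task_plans_url(controller_url, table_name_with_type, task_type):
    """List all plan IDs for a table and task type (both parameters required)."""
    query = urlencode(
        {"tableNameWithType": table_name_with_type, "taskType": task_type}
    )
    return f"{controller_url.rstrip('/')}/tasks/taskPlans?{query}"


def active_task_plans_url(controller_url, table_name_with_type=None, task_type=None):
    """List active plans; both filters are optional."""
    filters = {"tableNameWithType": table_name_with_type, "taskType": task_type}
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    url = f"{controller_url.rstrip('/')}/tasks/taskPlans/active"
    return f"{url}?{query}" if query else url


def task_plan_url(controller_url, plan_id):
    """Address one plan by ID; used with GET (details) or DELETE (abort)."""
    return f"{controller_url.rstrip('/')}/tasks/taskPlans/{quote(plan_id, safe='')}"


base = "http://localhost:9000"  # placeholder controller address
print(task_plans_url(base, "myTable_OFFLINE", "SegmentPurgeTask"))
print(active_task_plans_url(base, task_type="SegmentPurgeTask"))
```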

### Task plan data model

Each task plan returned by the API contains the following fields:

| Field                  | Type                 | Description                                                           |
| :--------------------- | :------------------- | :-------------------------------------------------------------------- |
| `planId`               | String               | Unique ID, formatted as `<tableNameWithType>__<taskType>__<uuid>`.    |
| `tableNameWithType`    | String               | The table this plan belongs to (e.g. `myTable_OFFLINE`).              |
| `taskType`             | String               | The task type this plan orchestrates (e.g. `SegmentPurgeTask`).       |
| `status`               | Enum                 | One of `ACTIVE`, `COMPLETED`, `CANCELLED`, `ABORTING`, `FAILED`.      |
| `source`               | String               | How the plan was created: `ADHOC_CONFIG` or `TABLE_CONFIG`.           |
| `properties`           | Map\<String, String> | Plan-level configuration used to generate each batch.                 |
| `batches`              | List                 | Ordered list of batch records submitted so far (see below).           |
| `inputsToProcess`      | Long                 | Total number of input units (e.g. segments) to process.               |
| `inputsBeingProcessed` | Long                 | Number of input units currently in flight.                            |
| `inputUnit`            | String               | The unit the counts above are measured in (e.g. `"segments"`).        |
| `statusMessage`        | String               | Human-readable status, such as the reason for an abort or completion. |
| `customStats`          | Map\<String, String> | Optional, task-type-specific statistics.                              |

Each entry in `batches` includes:

| Field               | Type    | Description                                                                                 |
| :------------------ | :------ | :------------------------------------------------------------------------------------------ |
| `batchSequence`     | Integer | 0-based sequence number of the batch.                                                       |
| `submittedAtMs`     | Long    | Time (epoch ms) the batch was submitted.                                                    |
| `submittedTaskName` | String  | Name of the parent task submitted for this batch.                                           |
| `taskState`         | Enum    | Current state of that batch's parent task (e.g. `NOT_STARTED`, `IN_PROGRESS`, `COMPLETED`). |
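
To show how these fields fit together, here is a small sketch that parses a `planId` and condenses a plan into a status summary. The sample plan, including its UUID, timestamps, and task names, is made up for illustration, and the helper names are hypothetical. Splitting from the right assumes that the task type and UUID never contain `__`.

```python
TERMINAL_STATUSES = {"COMPLETED", "CANCELLED", "FAILED"}


def parse_plan_id(plan_id):
    """Split <tableNameWithType>__<taskType>__<uuid> from the right, so a
    table name that itself contains "__" stays intact."""
    table_name_with_type, task_type, plan_uuid = plan_id.rsplit("__", 2)
    return table_name_with_type, task_type, plan_uuid


def summarize(plan):
    """Condense a task plan into the fields most useful for a status check."""
    batches = plan.get("batches", [])
    latest = max(batches, key=lambda b: b["batchSequence"]) if batches else None
    return {
        "planId": plan["planId"],
        "status": plan["status"],
        "terminal": plan["status"] in TERMINAL_STATUSES,
        "batchCount": len(batches),
        "latestBatchState": latest["taskState"] if latest else None,
        "inFlight": f'{plan["inputsBeingProcessed"]} of '
                    f'{plan["inputsToProcess"]} {plan["inputUnit"]}',
    }


# Hypothetical plan, shaped after the field tables above.
plan = {
    "planId": "myTable_OFFLINE__SegmentPurgeTask__123e4567-e89b-12d3-a456-426614174000",
    "status": "ACTIVE",
    "inputsToProcess": 120,
    "inputsBeingProcessed": 40,
    "inputUnit": "segments",
    "batches": [
        {"batchSequence": 0, "submittedAtMs": 1700000000000,
         "submittedTaskName": "example-batch-0", "taskState": "COMPLETED"},
        {"batchSequence": 1, "submittedAtMs": 1700000600000,
         "submittedTaskName": "example-batch-1", "taskState": "IN_PROGRESS"},
    ],
}
print(parse_plan_id(plan["planId"]))
print(summarize(plan))
```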

## Task plan cleanup

Task plans are cleaned up automatically when their table is dropped, so deleted tables don't leave orphaned plans behind:

* Dropping a table synchronously removes all of its task plans (across every task type) before the table's metadata is torn down.
* A periodic background sweep also reaps any plans whose table no longer exists, catching cases the synchronous cleanup may have missed. This sweep runs on an interval controlled by:

| Property                                                                    | Default           | Description                                                                        |
| :-------------------------------------------------------------------------- | :---------------- | :--------------------------------------------------------------------------------- |
| `controller.startree.task.manager.orphanedTaskPlanCleanupIntervalInSeconds` | `28800` (8 hours) | How often the controller sweeps for and removes plans belonging to deleted tables. |

## Observability and metrics

Task orchestration emits controller-side metrics scoped to `tableNameWithType` and `taskType`, so you can monitor and alert on orchestration health. Global gauges are emitted per controller. Because only the controller that leads a table progresses that table's plans, aggregate the global gauges across all controllers when building dashboards.

### Meters

| Metric                               | Meaning                                                                                                                                                                 |
| :----------------------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `taskPlanAborted`                    | A plan was aborted, either due to a failure-threshold breach or a generator-driven abort.                                                                               |
| `orchestrationCycleFailure`          | An exception occurred while progressing a plan.                                                                                                                         |
| `taskPlanProgressionBlocked`         | Plan progression was skipped because the task queue is paused, or resource-utilization limits were hit. Stays set while blocked and clears once progression can resume. |
| `scheduledTriggerActivePlanConflict` | A scheduled trigger was skipped because a plan was already active for that table and task type.                                                                         |
| `taskGenerationFailureCount`         | Task generation failed for the table/task type.                                                                                                                         |

### Gauges

| Metric                                                                               | Meaning                                                                                                              |
| :----------------------------------------------------------------------------------- | :------------------------------------------------------------------------------------------------------------------- |
| `taskPlanInputsToProcess`, `taskPlanInputsBeingProcessed`, `taskPlanInputsProcessed` | Plan progress: total, in-flight, and cumulative-completed input units.                                               |
| `activeTaskPlansCount`                                                               | Number of `ACTIVE` plans this controller is currently progressing.                                                   |
| `taskPlansAbortingCount`                                                             | Number of plans stuck in `ABORTING`, waiting for in-flight subtasks to terminate.                                    |
| `orchestrationTimeSinceLastPollMs`                                                   | Time since the plan-completion polling job last ran. A continuously rising value indicates that the polling job has stalled. |
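
Because the global gauges are reported per controller, a dashboard or script needs to combine them. The sketch below sums a count-style gauge over a hypothetical snapshot keyed by controller name; how the values are scraped depends on your metrics pipeline and is not shown. Summing suits counts such as `activeTaskPlansCount`, but it is not a meaningful aggregation for a duration like `orchestrationTimeSinceLastPollMs`.

```python
def aggregate_gauge(per_controller, metric):
    """Sum one count-style gauge over all controllers; a controller that
    does not report the metric contributes 0."""
    return sum(gauges.get(metric, 0) for gauges in per_controller.values())


# Hypothetical snapshot: controller name -> latest gauge values.
snapshot = {
    "controller-0": {"activeTaskPlansCount": 2, "taskPlansAbortingCount": 0},
    "controller-1": {"activeTaskPlansCount": 1, "taskPlansAbortingCount": 1},
    "controller-2": {},
}
print(aggregate_gauge(snapshot, "activeTaskPlansCount"))
print(aggregate_gauge(snapshot, "taskPlansAbortingCount"))
```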

## FAQs

### Do I need to change anything for task types that don't support orchestration?

No. Setting `enableTaskOrchestration` on an unsupported task type has no effect — it continues to use the standard one-shot generation flow.

### What happens if I trigger an ad hoc run while a plan is already active?

The request is rejected with an error. Wait for the active plan to complete, or abort it using the `DELETE /tasks/taskPlans/{planId}` endpoint, before triggering again.

### What happens if a scheduled (cron) trigger fires while a plan is active?

Unlike the ad hoc path, the scheduled trigger is silently skipped rather than rejected, and the `scheduledTriggerActivePlanConflict` metric is incremented. The existing plan is unaffected and continues to progress.

### How do I stop an in-progress plan?

Call `DELETE /tasks/taskPlans/{planId}`. This moves the plan to `ABORTING`; any batch already running is allowed to finish, but no new batch is generated. Then poll `GET /tasks/taskPlans/isActive/{taskPlanId}` with the same plan ID until it reports that the plan is no longer active.
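
The abort-then-poll sequence might look like the sketch below. The `delete_plan` and `is_active` callables are hypothetical wrappers around the `DELETE` and `isActive` endpoints; the response format of `isActive` is not described on this page, so parsing it into a boolean is left to the wrapper. The polling interval and timeout are arbitrary defaults.

```python
import time


def abort_and_wait(plan_id, delete_plan, is_active,
                   poll_interval_s=5.0, timeout_s=600.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Abort a plan, then poll until it is no longer active.

    Returns True once the plan has stopped, or False if it is still
    active when the timeout expires.
    """
    delete_plan(plan_id)  # moves the plan to ABORTING
    deadline = clock() + timeout_s
    while is_active(plan_id):
        if clock() >= deadline:
            return False  # a running batch may still be finishing
        sleep(poll_interval_s)
    return True


# Dry run with stand-ins for the HTTP calls: active twice, then stopped.
aborted = []
responses = iter([True, True, False])
stopped = abort_and_wait(
    "myTable_OFFLINE__SegmentPurgeTask__example",
    delete_plan=aborted.append,
    is_active=lambda plan_id: next(responses),
    sleep=lambda seconds: None,  # skip real waiting in the dry run
)
print(stopped)
```

Passing the HTTP calls in as callables keeps the loop independent of any particular client library and makes it easy to exercise without a running controller.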

### How can I tell whether a task type supports orchestration?

Check the [supported task types](#supported-task-types) table. If a task type isn't listed there, enabling `enableTaskOrchestration` for it has no effect.
