This feature is available starting in StarTree release 0.14.0. It must be enabled on demand — contact your StarTree representative to have it activated for your environment.
After creating an External table and triggering onboarding (see Onboarding Guide), use these APIs to monitor the onboarding pipeline — trigger runs on demand, check watcher state, and verify checkpoints.

Table of Contents

  1. Trigger External Table Sync Task
  2. Get Watcher Status
  3. Get Checkpoint

API Endpoints Quick Reference

| Method | Endpoint | Purpose |
| --- | --- | --- |
| POST | /tasks/schedule | Manually trigger an onboarding run |
| GET | /tables/{tableNameWithType}/externalTable/status | Get watcher run status |
| GET | /tables/{tableNameWithType}/externalTable/checkpoint | Get last ingested checkpoint |

1. Trigger External Table Sync Task

POST /tasks/schedule
After creating a table via the API, you must manually trigger the first onboarding run. The ExternalTableSyncTask is registered on a cron schedule (e.g. every 30 minutes), so without a manual trigger nothing runs until the next scheduled window. Use the Schedule API to start data loading immediately; it uses the task config defined in the table configuration.
Request body:
| Field | Required | Description |
| --- | --- | --- |
| taskType | Yes | Always ExternalTableSyncTask |
| tableName | Yes | The Pinot table name, e.g. my_table |
curl -X POST \
  "http://localhost:9000/tasks/schedule" \
  -H "Content-Type: application/json" \
  -d '{"taskType": "ExternalTableSyncTask", "tableName": "<TABLE_NAME>"}'
Success response (200):
{
  "ExternalTableSyncTask": "Task_ExternalTableSyncTask_<TIMESTAMP>"
}
Note: After triggering, poll Get Watcher Status to confirm the task moves from RUNNING to COMPLETED.
Note: For StarTree Cloud deployments the base URL includes the /api/pinot proxy prefix, e.g. https://<data-plane-host>/api/pinot/tasks/schedule.
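The same request can be issued from a script. The following is a minimal Python sketch using only the standard library. The helper names `build_schedule_request` and `trigger_sync` and the `base_url` parameter are hypothetical, and the sketch assumes the controller accepts the JSON body exactly as in the curl example and needs no authentication headers; add them if your deployment requires them.

```python
import json
import urllib.request


def build_schedule_request(base_url: str, table_name: str) -> urllib.request.Request:
    """Build (but do not send) the POST /tasks/schedule request.

    base_url is an assumption: "http://localhost:9000" for a local controller,
    or "https://<data-plane-host>/api/pinot" for StarTree Cloud.
    """
    body = json.dumps(
        {"taskType": "ExternalTableSyncTask", "tableName": table_name}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/tasks/schedule",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def trigger_sync(base_url: str, table_name: str) -> str:
    """Send the request and return the task name reported in the response."""
    request = build_schedule_request(base_url, table_name)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    # The success response maps the task type to the scheduled task name.
    return payload["ExternalTableSyncTask"]
```

Keeping request construction separate from sending makes it easy to inspect the URL and body before pointing the script at a real controller.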

2. Get Watcher Status

GET /tables/{tableNameWithType}/externalTable/status
Returns the last run status of the external table sync watcher (ExternalTableSyncWatcher) for the table.
Path parameter: tableNameWithType, the table name with type suffix, e.g. <TABLE_NAME>_OFFLINE.
Response fields:
| Field | Type | Description |
| --- | --- | --- |
| status | string | One of IDLE, RUNNING, COMPLETED, FAILED |
| startTimeMs | long | Run start time in ms. 0 if IDLE. |
| endTimeMs | long | Run end time in ms. 0 if RUNNING or IDLE. |
| filesDiscovered | int | Files found from the catalog in this run. |
| segmentsGenerated | int | Segments successfully created and uploaded. |
| errorMessage | string | Populated only on FAILED. |
| checkpointValue | string | Snapshot ID or watermark after a successful run. |
Status meanings:
| Status | Meaning |
| --- | --- |
| IDLE | No run has occurred yet, or the watcher is between scheduled runs. |
| RUNNING | A run is currently in progress. |
| COMPLETED | Last run succeeded. Check checkpointValue for the ingested snapshot. |
| FAILED | Last run failed. Check errorMessage and compare filesDiscovered vs segmentsGenerated to see how far it got. |
curl -X GET "http://localhost:9000/tables/<TABLE_NAME>_OFFLINE/externalTable/status"
Sample responses:
// COMPLETED
{
  "tableNameWithType": "<TABLE_NAME>_OFFLINE",
  "status": "COMPLETED",
  "startTimeMs": 1707500000000,
  "endTimeMs": 1707500060000,
  "filesDiscovered": 15,
  "segmentsGenerated": 15,
  "errorMessage": null,
  "checkpointValue": "1234567890123456789"
}

// FAILED
{
  "tableNameWithType": "<TABLE_NAME>_OFFLINE",
  "status": "FAILED",
  "startTimeMs": 1707500000000,
  "endTimeMs": 1707500030000,
  "filesDiscovered": 10,
  "segmentsGenerated": 5,
  "errorMessage": "Failed to upload segment: Connection timeout",
  "checkpointValue": null
}

// IDLE (no run yet — table was just created)
{
  "tableNameWithType": "<TABLE_NAME>_OFFLINE",
  "status": "IDLE",
  "startTimeMs": 0,
  "endTimeMs": 0,
  "filesDiscovered": 0,
  "segmentsGenerated": 0,
  "errorMessage": null,
  "checkpointValue": null
}
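
To illustrate how the response fields fit together, the sketch below turns a status payload into a one-line summary. The function name `summarize_run` and the wording of the summaries are hypothetical; the field names and the four status values are taken from the tables and samples above.

```python
def summarize_run(status: dict) -> str:
    """Turn a watcher status payload into a one-line, human-readable summary."""
    state = status["status"]
    if state == "IDLE":
        return "idle: no run recorded"
    if state == "RUNNING":
        return f"running: started at {status['startTimeMs']} ms"
    # endTimeMs is 0 while RUNNING or IDLE, so compute duration only for finished runs.
    duration_s = (status["endTimeMs"] - status["startTimeMs"]) / 1000
    progress = (
        f"{status['segmentsGenerated']} segments from "
        f"{status['filesDiscovered']} files"
    )
    if state == "COMPLETED":
        return (
            f"completed in {duration_s:.0f}s: {progress}, "
            f"checkpoint {status['checkpointValue']}"
        )
    if state == "FAILED":
        return (
            f"failed after {duration_s:.0f}s: {progress}, "
            f"error: {status['errorMessage']}"
        )
    raise ValueError(f"unexpected status: {state!r}")
```

Reporting both counts side by side, rather than a difference, avoids assuming that one discovered file always maps to exactly one segment.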
Error codes: 404 table not found | 500 internal error
The watcher status is stored in ZooKeeper at /EXTERNAL_TABLE_WATCHER_STATUS/{tableNameWithType}. Only the last run is retained — each new run overwrites the previous status.
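
Because only the last run is retained, a COMPLETED status seen right after a manual trigger may still describe the previous run. One way to handle this, sketched below, is to record startTimeMs before triggering and poll until a run with a different start time reaches COMPLETED or FAILED. This is an illustrative approach, not documented behavior: the helpers `get_status` and `wait_for_new_run`, the `base_url` parameter, and the polling interval and timeout defaults are all assumptions.

```python
import json
import time
import urllib.request
from typing import Callable

TERMINAL_STATES = {"COMPLETED", "FAILED"}


def get_status(base_url: str, table_name_with_type: str) -> dict:
    """Fetch the watcher status payload for one table."""
    url = f"{base_url.rstrip('/')}/tables/{table_name_with_type}/externalTable/status"
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def wait_for_new_run(
    fetch_status: Callable[[], dict],
    previous_start_ms: int,
    poll_interval_s: float = 10.0,
    timeout_s: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until a run other than the previously seen one has finished."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        # A changed startTimeMs means the stored status belongs to a newer run.
        is_new_run = status["startTimeMs"] != previous_start_ms
        if is_new_run and status["status"] in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("no new run finished before the timeout")
        sleep(poll_interval_s)
```

A typical call would capture `baseline = get_status(base_url, table)` before triggering, then run `wait_for_new_run(lambda: get_status(base_url, table), baseline["startTimeMs"])`.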

3. Get Checkpoint

GET /tables/{tableNameWithType}/externalTable/checkpoint
Returns the Iceberg snapshot ID (or timestamp) up to which data has been successfully ingested into Pinot. This is the watermark stored in ZooKeeper after each completed onboarding run.
Requires a top-level catalogType entry in the table’s ExternalTableSyncTask config (catalog-backed onboarding). Raw S3 (s3-catalog) tables return 400.
Response fields:
| Field | Type | Description |
| --- | --- | --- |
| checkpointType | string | SNAPSHOT_ID or TIMESTAMP. null if no onboarding has completed. |
| checkpointValue | string | The snapshot ID or timestamp value. null if no onboarding has completed. |
curl -X GET "http://localhost:9000/tables/<TABLE_NAME>_OFFLINE/externalTable/checkpoint"
Sample responses:
// After successful onboarding
{
  "tableNameWithType": "<TABLE_NAME>_OFFLINE",
  "checkpointType": "SNAPSHOT_ID",
  "checkpointValue": "1234567890123456789"
}

// No onboarding completed yet
{
  "tableNameWithType": "<TABLE_NAME>_OFFLINE",
  "checkpointType": null,
  "checkpointValue": null
}
Error codes: 400 not a catalog table | 404 table not found | 500 catalog error
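
A script calling this endpoint has to handle both the null checkpoint and the documented error codes. The sketch below shows one possible shape in Python, using only the standard library. The names `fetch_checkpoint`, `describe_checkpoint`, and `ERROR_MEANINGS` are hypothetical, and raising `RuntimeError` is an arbitrary choice for illustration; the status-code meanings are copied from the error code list above.

```python
import json
import urllib.error
import urllib.request

# Meanings as listed in the error codes for this endpoint.
ERROR_MEANINGS = {
    400: "not a catalog table",
    404: "table not found",
    500: "catalog error",
}


def describe_checkpoint(payload: dict) -> str:
    """Summarize a checkpoint payload, handling the 'nothing ingested yet' case."""
    if payload["checkpointType"] is None:
        return "no onboarding run has completed yet"
    return f"{payload['checkpointType']}={payload['checkpointValue']}"


def fetch_checkpoint(base_url: str, table_name_with_type: str) -> dict:
    """Fetch the checkpoint payload, translating documented HTTP errors."""
    url = (
        f"{base_url.rstrip('/')}/tables/{table_name_with_type}"
        "/externalTable/checkpoint"
    )
    try:
        with urllib.request.urlopen(url, timeout=30) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        meaning = ERROR_MEANINGS.get(err.code, "unexpected error")
        raise RuntimeError(
            f"checkpoint request failed ({err.code}): {meaning}"
        ) from err
```

For example, `describe_checkpoint(fetch_checkpoint("http://localhost:9000", "<TABLE_NAME>_OFFLINE"))` would yield a single line suitable for logging.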