This feature is available starting in StarTree release 0.14.0. Ensure your environment is running version 0.14.0 or later before following this guide.
This guide explains how to use the Pinot Controller’s Iceberg Catalog APIs to connect to an Iceberg catalog, explore its tables, and create a Pinot table with automated ingestion. Each section covers one catalog provider. Within each section, all APIs are shown individually with their exact request bodies.
Prefer a point-and-click setup? See the Data Portal Onboarding Guide.

How It Works

Onboarding an external table follows a linear discovery-to-ingestion workflow:
  1. Validate your catalog credentials and connectivity.
  2. Discover available namespaces and tables in the catalog.
  3. Create the Pinot schema and table — this also registers the IcebergIngestionTask on a cron schedule.
  4. Trigger the first ingestion run manually, since the scheduled task does not fire immediately after table creation.
Once ingestion is running, use the Observability APIs to monitor progress, verify checkpoints, and check file counts.

API Endpoints Quick Reference

| Step | Method | Endpoint | Purpose |
|------|--------|----------|---------|
| 1 | POST | /iceberg/catalog/validate | Validate catalog connection |
| 2 | POST | /iceberg/catalog/namespaces | List namespaces |
| 3 | POST | /iceberg/catalog/tables/list | List tables in a namespace |
| 4 | POST | /iceberg/catalog/tables | Create Pinot table and schema |
| 5 | POST | /periodictask/run | Manually trigger first ingestion run |
For monitoring and observability APIs, see the Observability page.
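As a rough illustration of how steps 1 to 4 share the same request shape, the sketch below builds (but does not send) a JSON POST request for a catalog endpoint using only the Python standard library. The helper name `build_catalog_request`, the `CATALOG_STEPS` list, and the controller address are assumptions made for this example; authentication headers, if your controller requires them, are not covered.

```python
import json
import urllib.request

# Endpoints from the quick-reference table, in the order they are called.
CATALOG_STEPS = [
    ("validate", "/iceberg/catalog/validate"),
    ("namespaces", "/iceberg/catalog/namespaces"),
    ("list_tables", "/iceberg/catalog/tables/list"),
    ("create_table", "/iceberg/catalog/tables"),
]


def build_catalog_request(controller_url, path, body):
    """Build (but do not send) a JSON POST request for a catalog endpoint."""
    return urllib.request.Request(
        url=controller_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Illustrative only: the catalogConfig here is left empty on purpose;
# use one of the provider-specific bodies from this guide instead.
request = build_catalog_request(
    "http://localhost:9000",  # assumed controller address
    CATALOG_STEPS[0][1],
    {"catalogType": "s3-catalog", "catalogConfig": {}},
)
```

Passing the resulting object to `urllib.request.urlopen()` would send it; keeping construction separate from sending makes the request easy to inspect first.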
Run these steps in order when onboarding a new external table:
Step 1 ──► POST /iceberg/catalog/validate
           Confirm credentials and catalog connectivity.

Step 2 ──► POST /iceberg/catalog/namespaces
           Discover available namespaces (databases).

Step 3 ──► POST /iceberg/catalog/tables/list
           List tables in the target namespace.

Step 4 ──► POST /iceberg/catalog/tables
           Create the Pinot schema + table and register the ingestion schedule.

Step 5 ──► POST /periodictask/run?taskname=IcebergIngestionTask&tableName=<TABLE_NAME>&type=OFFLINE
           ⚠️  If the table was created via the API, manually trigger the first ingestion run.
           The scheduled task does not fire immediately after table creation, so
           trigger it to start loading data without waiting for
           the next cron window (up to 30 minutes).
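The step 5 URL carries all of its parameters in the query string. A minimal sketch of assembling it is shown below; the function name `trigger_url` and the controller address are assumptions for illustration, while the path and parameter names come from the step above.

```python
from urllib.parse import urlencode


def trigger_url(controller_url, table_name):
    """Return the step 5 URL that triggers IcebergIngestionTask for one table."""
    query = urlencode({
        "taskname": "IcebergIngestionTask",
        "tableName": table_name,
        "type": "OFFLINE",
    })
    return f"{controller_url.rstrip('/')}/periodictask/run?{query}"


# Hypothetical usage with an assumed local controller.
url = trigger_url("http://localhost:9000", "raws3_table")
```

The URL is then sent as a POST with an empty body, as in the curl example in the FAQ.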

Pausing Ingestion

To stop the IcebergIngestionTask from running on its cron schedule, set enabled to false in the task config and update the table via the Pinot Controller API.
curl -X PUT \
  "http://localhost:9000/tables/<TABLE_NAME>_OFFLINE" \
  -H "Content-Type: application/json" \
  -d '{
    ...tableConfig...,
    "task": {
      "taskTypeConfigsMap": {
        "IcebergIngestionTask": {
          "enabled": "false",
          "schedule": "0 */5 * * * ?",
          ...
        }
      }
    }
  }'
Setting "enabled": "false" prevents the scheduler from creating new ingestion task instances. Any run currently in progress will complete normally. To resume ingestion, set "enabled": "true" and update the table again.
Note: Pausing does not delete existing segments or checkpoints. When you re-enable the task, ingestion resumes from the last recorded checkpoint — no data is re-ingested.
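If you manage table configs programmatically, the pause and resume toggle can be sketched as a small pure function. This is an illustration only: it assumes you have already fetched the full table config as a Python dict, and the name `set_ingestion_enabled` is hypothetical.

```python
import copy

TASK_TYPE = "IcebergIngestionTask"


def set_ingestion_enabled(table_config, enabled):
    """Return a copy of table_config with the task's "enabled" flag updated."""
    updated = copy.deepcopy(table_config)
    task_config = updated["task"]["taskTypeConfigsMap"][TASK_TYPE]
    # The flag is a string ("true" / "false"), matching the example above.
    task_config["enabled"] = "true" if enabled else "false"
    return updated
```

The returned dict is what you would send as the body of the PUT request shown above; the input dict is left unmodified, so the original config can be kept for comparison.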

Catalog Providers

1. S3 Data Lake

What it is: For raw Parquet files on S3 that are not managed by an Iceberg catalog service. There is no catalog REST endpoint; Pinot reads files directly from the specified S3 bucket and prefix. The namespace and table discovery APIs still work but return values derived from the S3 path.
catalogType: "s3-catalog"
Note: Unlike the other catalog types, this provider uses accessKey / secretKey (not restAccessKeyId / restSecretAccessKey) for credentials.
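Because the credential field names differ between providers, a quick pre-flight check can catch a body that mixes them up. The sketch below is a hypothetical helper, not part of the API; the key sets are taken from the example request bodies in this guide.

```python
# Credential field names per provider family, as used in this guide's examples.
S3_CATALOG_KEYS = {"accessKey", "secretKey"}
REST_STYLE_KEYS = {
    "restAccessKeyId", "restSecretAccessKey",
    "storageAccessKeyId", "storageSecretAccessKey",
}


def misplaced_credential_fields(catalog_type, catalog_config):
    """Return credential keys that belong to a different provider family."""
    wrong = REST_STYLE_KEYS if catalog_type == "s3-catalog" else S3_CATALOG_KEYS
    return sorted(wrong.intersection(catalog_config))
```

An empty list means no misplaced credential keys were found; it does not confirm that the required keys are present.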

1.1 Validate

POST /iceberg/catalog/validate
{
  "catalogType": "s3-catalog",
  "catalogConfig": {
    "bucketName": "<YOUR_S3_BUCKET>",
    "prefix": "path/to/parquet/data/",
    "region": "<REGION>",
    "accessKey": "<YOUR_ACCESS_KEY_ID>",
    "secretKey": "<YOUR_SECRET_ACCESS_KEY>"
  }
}

1.2 Create Pinot Table

POST /iceberg/catalog/tables
{
  "catalogType": "s3-catalog",
  "catalogConfig": {
    "namespace": "default",
    "tableName": "raws3_table",
    "bucketName": "<YOUR_S3_BUCKET>",
    "prefix": "path/to/parquet/data/",
    "region": "<REGION>",
    "accessKey": "<YOUR_ACCESS_KEY_ID>",
    "secretKey": "<YOUR_SECRET_ACCESS_KEY>"
  },
  "schemaOptions": {
    "includePartitionColumns": true,
    "timestampColumn": null,
    "timeUnit": "MILLISECONDS",
    "schemaName": "raws3_table"
  },
  "tableConfig": {
    "tableName": "raws3_table_OFFLINE",
    "tableType": "OFFLINE",
    "segmentsConfig": {
      "timeColumnName": "logical_timestamp",
      "retentionTimeUnit": "DAYS",
      "retentionTimeValue": "3600",
      "replication": "1",
      "segmentPushType": "APPEND"
    },
    "tenants": {
      "broker": "DefaultTenant",
      "server": "DefaultTenant"
    },
    "tableIndexConfig": {
      "rangeIndexVersion": 2,
      "autoGeneratedInvertedIndex": false,
      "createInvertedIndexDuringSegmentGeneration": false,
      "loadMode": "MMAP",
      "enableDefaultStarTree": false,
      "enableDynamicStarTreeCreation": false,
      "aggregateMetrics": false,
      "nullHandlingEnabled": true
    },
    "task": {
      "taskTypeConfigsMap": {
        "IcebergIngestionTask": {
          "schedule": "0 */30 * * * ?",
          "executor": "controller",
          "inputFormat": "parquet",
          "iceberg.source.type": "catalog",
          "catalogType": "s3",
          "catalog.s3.table.tableName": "raws3_table",
          "catalog.s3.bucketName": "<YOUR_S3_BUCKET>",
          "catalog.s3.prefix": "path/to/parquet/data/",
          "catalog.s3.region": "<REGION>",
          "catalog.s3.accessKey": "<YOUR_ACCESS_KEY_ID>",
          "catalog.s3.secretKey": "<YOUR_SECRET_ACCESS_KEY>"
        }
      }
    },
    "tierConfigs": [
      {
        "name": "myS3Tier",
        "segmentSelectorType": "time",
        "segmentAge": "0s",
        "storageType": "pinot_server",
        "serverTag": "DefaultTenant_OFFLINE",
        "tierBackend": "s3",
        "tierBackendProperties": {
          "enable.delegate.v2": "true",
          "region": "<REGION>",
          "bucket": "<YOUR_S3_BUCKET>",
          "accessKey": "<YOUR_ACCESS_KEY_ID>",
          "secretKey": "<YOUR_SECRET_ACCESS_KEY>",
          "sessionToken": "",
          "preload.index.keys.override": "*.inverted_index,*.composite_json_index",
          "s3client.crtasync.targetThroughputInGbps": "1000.0",
          "s3client.crtasync.maxConcurrency": "1000"
        }
      }
    ],
    "isDimTable": false
  }
}

2. Glue REST

What it is: Connects to AWS Glue as an Iceberg catalog using the Iceberg REST protocol. Uses AWS SigV4 for both the Glue catalog API (rest* fields) and S3 data access (storage* fields).
catalogType: "iceberg-rest" with "serviceType": "glue"

2.1 Validate

POST /iceberg/catalog/validate
{
  "catalogType": "iceberg-rest",
  "catalogConfig": {
    "restUri": "https://glue.<REGION>.amazonaws.com",
    "serviceType": "glue",
    "warehouse": "<YOUR_AWS_ACCOUNT_ID>",
    "restAuthType": "aws-sigv4",
    "restAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "restSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "restRegion": "<REGION>",
    "restService": "glue",
    "storageAuthType": "aws-sigv4",
    "storageAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "storageSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "storageRegion": "<REGION>"
  }
}

2.2 List Namespaces

POST /iceberg/catalog/namespaces
{
  "catalogType": "iceberg-rest",
  "catalogConfig": {
    "restAuthType": "aws-sigv4",
    "restAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "restSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "restRegion": "<REGION>",
    "restService": "glue",
    "storageAuthType": "aws-sigv4",
    "storageAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "storageSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "storageRegion": "<REGION>",
    "restUri": "https://glue.<REGION>.amazonaws.com",
    "serviceType": "glue",
    "warehouse": "<YOUR_AWS_ACCOUNT_ID>"
  }
}

2.3 List Tables

POST /iceberg/catalog/tables/list
{
  "catalogType": "iceberg-rest",
  "catalogConfig": {
    "restUri": "https://glue.<REGION>.amazonaws.com",
    "serviceType": "glue",
    "warehouse": "<YOUR_AWS_ACCOUNT_ID>",
    "restAuthType": "aws-sigv4",
    "restAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "restSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "restRegion": "<REGION>",
    "restService": "glue",
    "storageAuthType": "aws-sigv4",
    "storageAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "storageSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "storageRegion": "<REGION>"
  },
  "namespace": "st-database"
}
namespace — the Glue database name to list tables from.
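In the examples above, the List Tables body is the Validate body plus a namespace field placed beside catalogConfig rather than inside it. A minimal sketch of that derivation follows; the function name `list_tables_body` is an assumption for illustration.

```python
def list_tables_body(validate_body, namespace):
    """Build a List Tables body from a Validate body plus a namespace."""
    return {
        "catalogType": validate_body["catalogType"],
        # Copy so the original Validate body is not shared or mutated.
        "catalogConfig": dict(validate_body["catalogConfig"]),
        # namespace is a top-level field here, not a catalogConfig entry.
        "namespace": namespace,
    }
```

Reusing the validated body this way keeps the connection settings identical across the discovery calls.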

2.4 Create Pinot Table

POST /iceberg/catalog/tables
{
  "catalogType": "iceberg-rest",
  "catalogConfig": {
    "namespace": "st-database",
    "tableName": "glue_iceberg_table_wiki",
    "restUri": "https://glue.<REGION>.amazonaws.com",
    "serviceType": "glue",
    "warehouse": "<YOUR_AWS_ACCOUNT_ID>",
    "restAuthType": "aws-sigv4",
    "restAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "restSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "restRegion": "<REGION>",
    "restService": "glue",
    "storageAuthType": "aws-sigv4",
    "storageAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "storageSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "storageRegion": "<REGION>"
  },
  "schemaOptions": {
    "includePartitionColumns": true,
    "timestampColumn": null,
    "timeUnit": "MILLISECONDS",
    "schemaName": "glue_rest_table"
  },
  "tableConfig": {
    "tableName": "glue_rest_table_OFFLINE",
    "tableType": "OFFLINE",
    "segmentsConfig": {
      "timeColumnName": null,
      "retentionTimeUnit": "DAYS",
      "retentionTimeValue": "365",
      "replication": "1",
      "segmentPushType": "APPEND"
    },
    "tenants": {
      "broker": "DefaultTenant",
      "server": "DefaultTenant"
    },
    "tableIndexConfig": {
      "rangeIndexVersion": 2,
      "autoGeneratedInvertedIndex": false,
      "createInvertedIndexDuringSegmentGeneration": false,
      "loadMode": "MMAP",
      "enableDefaultStarTree": false,
      "enableDynamicStarTreeCreation": false,
      "aggregateMetrics": false,
      "nullHandlingEnabled": true
    },
    "task": {
      "taskTypeConfigsMap": {
        "IcebergIngestionTask": {
          "schedule": "0 */30 * * * ?",
          "executor": "controller",
          "inputFormat": "parquet",
          "iceberg.source.type": "catalog",
          "catalogType": "iceberg-rest",
          "catalog.iceberg-rest.serviceType": "glue",
          "catalog.iceberg-rest.table.namespace": "st-database",
          "catalog.iceberg-rest.table.tableName": "glue_iceberg_table_wiki",
          "catalog.iceberg-rest.restUri": "https://glue.<REGION>.amazonaws.com",
          "catalog.iceberg-rest.warehouse": "<YOUR_AWS_ACCOUNT_ID>",
          "catalog.iceberg-rest.auth.rest.authType": "aws-sigv4",
          "catalog.iceberg-rest.auth.rest.accessKeyId": "<YOUR_ACCESS_KEY_ID>",
          "catalog.iceberg-rest.auth.rest.secretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
          "catalog.iceberg-rest.auth.rest.region": "<REGION>",
          "catalog.iceberg-rest.auth.rest.service": "glue",
          "catalog.iceberg-rest.auth.storage.authType": "aws-sigv4",
          "catalog.iceberg-rest.auth.storage.accessKeyId": "<YOUR_ACCESS_KEY_ID>",
          "catalog.iceberg-rest.auth.storage.secretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
          "catalog.iceberg-rest.auth.storage.region": "<REGION>"
        }
      }
    },
    "tierConfigs": [],
    "isDimTable": false
  }
}
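Note: The IcebergIngestionTask config in Create uses the hierarchical catalog.iceberg-rest.auth.* key format, while catalogConfig uses flat keys. For example, restAccessKeyId in catalogConfig corresponds to catalog.iceberg-rest.auth.rest.accessKeyId in the task config.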
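The correspondence between the flat catalogConfig keys and the hierarchical task keys can be sketched as a small translation function. The mapping rules below are inferred from the iceberg-rest examples in this guide, not from a published specification, and the names `to_task_keys` and `_lower_first` are hypothetical.

```python
PREFIX = "catalog.iceberg-rest."
TABLE_KEYS = {"namespace", "tableName"}


def _lower_first(text):
    """Lower-case only the first character: 'AccessKeyId' -> 'accessKeyId'."""
    return text[0].lower() + text[1:]


def to_task_keys(catalog_config):
    """Translate flat catalogConfig keys into catalog.iceberg-rest.* task keys."""
    task_keys = {}
    for key, value in catalog_config.items():
        if key in TABLE_KEYS:
            new_key = f"{PREFIX}table.{key}"
        elif key.startswith("rest") and key != "restUri":
            # restAccessKeyId -> auth.rest.accessKeyId
            new_key = f"{PREFIX}auth.rest.{_lower_first(key[len('rest'):])}"
        elif key.startswith("storage"):
            # storageRegion -> auth.storage.region
            new_key = f"{PREFIX}auth.storage.{_lower_first(key[len('storage'):])}"
        else:
            # restUri, serviceType, warehouse, tableBucketArn keep their names.
            new_key = PREFIX + key
        task_keys[new_key] = value
    return task_keys
```

The result covers only the catalog.iceberg-rest.* entries; the task config still needs the remaining fields shown in the example, such as schedule, executor, inputFormat, iceberg.source.type, and catalogType.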

3. Glue Native

What it is: Connects to AWS Glue using the native Glue SDK (as opposed to the Iceberg REST protocol). Simpler to configure: no restUri or serviceType needed. The Glue database to use is specified by the database field.
catalogType: "glue-catalog"

3.1 Validate

POST /iceberg/catalog/validate
{
  "catalogType": "glue-catalog",
  "catalogConfig": {
    "region": "<REGION>",
    "database": "st-database",
    "restAuthType": "aws-sigv4",
    "restAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "restSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "restRegion": "<REGION>",
    "restService": "glue",
    "storageAuthType": "aws-sigv4",
    "storageAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "storageSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "storageRegion": "<REGION>"
  }
}

3.2 List Namespaces

POST /iceberg/catalog/namespaces
{
  "catalogType": "glue-catalog",
  "catalogConfig": {
    "region": "<REGION>",
    "restAuthType": "aws-sigv4",
    "restAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "restSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "restRegion": "<REGION>",
    "restService": "glue",
    "storageAuthType": "aws-sigv4",
    "storageAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "storageSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "storageRegion": "<REGION>"
  }
}

3.3 List Tables

POST /iceberg/catalog/tables/list
{
  "catalogType": "glue-catalog",
  "catalogConfig": {
    "region": "<REGION>",
    "restAuthType": "aws-sigv4",
    "restAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "restSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "restRegion": "<REGION>",
    "restService": "glue",
    "storageAuthType": "aws-sigv4",
    "storageAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "storageSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "storageRegion": "<REGION>"
  },
  "namespace": "st-database"
}

3.4 Create Pinot Table

POST /iceberg/catalog/tables
{
  "catalogType": "glue-catalog",
  "catalogConfig": {
    "namespace": "st-database",
    "tableName": "glue_iceberg_table_wiki",
    "region": "<REGION>",
    "database": "st-database",
    "restAuthType": "aws-sigv4",
    "restAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "restSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "restRegion": "<REGION>",
    "restService": "glue",
    "storageAuthType": "aws-sigv4",
    "storageAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "storageSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "storageRegion": "<REGION>"
  },
  "schemaOptions": {
    "includePartitionColumns": true,
    "timestampColumn": null,
    "timeUnit": "MILLISECONDS",
    "schemaName": "glue_native_table"
  },
  "tableConfig": {
    "tableName": "glue_native_table_OFFLINE",
    "tableType": "OFFLINE",
    "segmentsConfig": {
      "timeColumnName": null,
      "retentionTimeUnit": "DAYS",
      "retentionTimeValue": "365",
      "replication": "1",
      "segmentPushType": "APPEND"
    },
    "tenants": {
      "broker": "DefaultTenant",
      "server": "DefaultTenant"
    },
    "tableIndexConfig": {
      "rangeIndexVersion": 2,
      "autoGeneratedInvertedIndex": false,
      "createInvertedIndexDuringSegmentGeneration": false,
      "loadMode": "MMAP",
      "enableDefaultStarTree": false,
      "enableDynamicStarTreeCreation": false,
      "aggregateMetrics": false,
      "nullHandlingEnabled": true
    },
    "task": {
      "taskTypeConfigsMap": {
        "IcebergIngestionTask": {
          "schedule": "0 */30 * * * ?",
          "executor": "controller",
          "inputFormat": "parquet",
          "iceberg.source.type": "catalog",
          "catalogType": "glue",
          "catalog.glue.table.namespace": "st-database",
          "catalog.glue.table.tableName": "glue_iceberg_table_wiki",
          "catalog.glue.region": "<REGION>",
          "catalog.glue.database": "st-database",
          "catalog.glue.awsAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
          "catalog.glue.awsSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>"
        }
      }
    },
    "tierConfigs": [],
    "isDimTable": false
  }
}

4. Nessie REST

What it is: Connects to a Project Nessie server using the Iceberg REST protocol. Nessie itself requires no authentication in this configuration (restAuthType: "none"); only S3 credentials are needed to read the underlying data files.
catalogType: "iceberg-rest-s3" with "serviceType": "nessie"

4.1 Validate

POST /iceberg/catalog/validate
{
  "catalogType": "iceberg-rest-s3",
  "catalogConfig": {
    "restUri": "http://localhost:19120/iceberg",
    "serviceType": "nessie",
    "restAuthType": "none",
    "storageAuthType": "aws-sigv4",
    "storageAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "storageSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "storageRegion": "<REGION>"
  }
}

4.2 List Namespaces

POST /iceberg/catalog/namespaces
{
  "catalogType": "iceberg-rest-s3",
  "catalogConfig": {
    "restUri": "http://localhost:19120/iceberg",
    "serviceType": "nessie",
    "restAuthType": "none",
    "storageAuthType": "aws-sigv4",
    "storageAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "storageSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "storageRegion": "<REGION>"
  }
}

4.3 List Tables

POST /iceberg/catalog/tables/list
{
  "catalogType": "iceberg-rest-s3",
  "catalogConfig": {
    "restUri": "http://localhost:19120/iceberg",
    "serviceType": "nessie",
    "restAuthType": "none",
    "storageAuthType": "aws-sigv4",
    "storageAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "storageSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "storageRegion": "<REGION>"
  },
  "namespace": "default"
}

4.4 Create Pinot Table

POST /iceberg/catalog/tables
{
  "catalogType": "iceberg-rest-s3",
  "catalogConfig": {
    "namespace": "default",
    "tableName": "test_company_data",
    "restUri": "http://localhost:19120/iceberg",
    "serviceType": "nessie",
    "restAuthType": "none",
    "storageAuthType": "aws-sigv4",
    "storageAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "storageSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "storageRegion": "<REGION>"
  },
  "schemaOptions": {
    "includePartitionColumns": true,
    "timestampColumn": null,
    "timeUnit": "MILLISECONDS",
    "schemaName": "nessie_rest_table"
  },
  "tableConfig": {
    "tableName": "nessie_rest_table_OFFLINE",
    "tableType": "OFFLINE",
    "segmentsConfig": {
      "timeColumnName": null,
      "retentionTimeUnit": "DAYS",
      "retentionTimeValue": "365",
      "replication": "1",
      "segmentPushType": "APPEND"
    },
    "tenants": {
      "broker": "DefaultTenant",
      "server": "DefaultTenant"
    },
    "tableIndexConfig": {
      "rangeIndexVersion": 2,
      "autoGeneratedInvertedIndex": false,
      "createInvertedIndexDuringSegmentGeneration": false,
      "loadMode": "MMAP",
      "enableDefaultStarTree": false,
      "enableDynamicStarTreeCreation": false,
      "aggregateMetrics": false,
      "nullHandlingEnabled": true
    },
    "task": {
      "taskTypeConfigsMap": {
        "IcebergIngestionTask": {
          "schedule": "0 */30 * * * ?",
          "executor": "controller",
          "inputFormat": "parquet",
          "iceberg.source.type": "catalog",
          "catalogType": "iceberg-rest",
          "catalog.iceberg-rest.serviceType": "nessie",
          "catalog.iceberg-rest.table.namespace": "default",
          "catalog.iceberg-rest.table.tableName": "test_company_data",
          "catalog.iceberg-rest.restUri": "http://localhost:19120/iceberg",
          "catalog.iceberg-rest.auth.rest.authType": "none",
          "catalog.iceberg-rest.auth.storage.authType": "aws-sigv4",
          "catalog.iceberg-rest.auth.storage.accessKeyId": "<YOUR_ACCESS_KEY_ID>",
          "catalog.iceberg-rest.auth.storage.secretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
          "catalog.iceberg-rest.auth.storage.region": "<REGION>"
        }
      }
    },
    "tierConfigs": [],
    "isDimTable": false
  }
}

5. S3 Tables REST

What it is: Connects to AWS S3 Tables, a managed Iceberg-compatible table storage service. Uses the Iceberg REST protocol. Requires an additional tableBucketArn field that identifies the S3 Tables bucket.
catalogType: "iceberg-rest" with "serviceType": "s3Tables"
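The tableBucketArn and restUri values in this section follow fixed patterns built from the region, account ID, and bucket name. A minimal sketch of assembling them is shown below; the function names are hypothetical and the sample values are placeholders.

```python
def table_bucket_arn(region, account_id, bucket_name):
    """Assemble the S3 Tables bucket ARN in the format used in this section."""
    return f"arn:aws:s3tables:{region}:{account_id}:bucket/{bucket_name}"


def s3tables_rest_uri(region):
    """Assemble the restUri value used in this section's examples."""
    return f"https://s3tables.{region}.amazonaws.com"


# Placeholder values for illustration only.
arn = table_bucket_arn("us-east-1", "123456789012", "my-table-bucket")
uri = s3tables_rest_uri("us-east-1")
```

Deriving both values from one region variable helps keep restUri, restRegion, and the ARN consistent with each other.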

5.1 Validate

POST /iceberg/catalog/validate
{
  "catalogType": "iceberg-rest",
  "catalogConfig": {
    "restUri": "https://s3tables.<REGION>.amazonaws.com",
    "serviceType": "s3Tables",
    "tableBucketArn": "arn:aws:s3tables:<region>:<account-id>:bucket/<bucket-name>",
    "restAuthType": "aws-sigv4",
    "restAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "restSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "restRegion": "<REGION>",
    "restService": "s3tables",
    "storageAuthType": "aws-sigv4",
    "storageAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "storageSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "storageRegion": "<REGION>"
  }
}

5.2 List Namespaces

POST /iceberg/catalog/namespaces
{
  "catalogType": "iceberg-rest",
  "catalogConfig": {
    "restUri": "https://s3tables.<REGION>.amazonaws.com",
    "serviceType": "s3Tables",
    "tableBucketArn": "arn:aws:s3tables:<region>:<account-id>:bucket/<bucket-name>",
    "restAuthType": "aws-sigv4",
    "restAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "restSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "restRegion": "<REGION>",
    "restService": "s3tables",
    "storageAuthType": "aws-sigv4",
    "storageAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "storageSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "storageRegion": "<REGION>"
  }
}

5.3 List Tables

POST /iceberg/catalog/tables/list
{
  "catalogType": "iceberg-rest",
  "catalogConfig": {
    "restUri": "https://s3tables.<REGION>.amazonaws.com",
    "serviceType": "s3Tables",
    "tableBucketArn": "arn:aws:s3tables:<region>:<account-id>:bucket/<bucket-name>",
    "restAuthType": "aws-sigv4",
    "restAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "restSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "restRegion": "<REGION>",
    "restService": "s3tables",
    "storageAuthType": "aws-sigv4",
    "storageAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "storageSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "storageRegion": "<REGION>"
  },
  "namespace": "s3_namespace_wiki"
}

5.4 Create Pinot Table

POST /iceberg/catalog/tables
{
  "catalogType": "iceberg-rest",
  "catalogConfig": {
    "namespace": "s3_namespace_wiki",
    "tableName": "s3_table_wiki",
    "restUri": "https://s3tables.<REGION>.amazonaws.com",
    "serviceType": "s3Tables",
    "tableBucketArn": "arn:aws:s3tables:<region>:<account-id>:bucket/<bucket-name>",
    "restAuthType": "aws-sigv4",
    "restAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "restSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "restRegion": "<REGION>",
    "restService": "s3tables",
    "storageAuthType": "aws-sigv4",
    "storageAccessKeyId": "<YOUR_ACCESS_KEY_ID>",
    "storageSecretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
    "storageRegion": "<REGION>"
  },
  "schemaOptions": {
    "includePartitionColumns": true,
    "timestampColumn": null,
    "timeUnit": "MILLISECONDS",
    "schemaName": "s3tables_rest_table"
  },
  "tableConfig": {
    "tableName": "s3tables_rest_table_OFFLINE",
    "tableType": "OFFLINE",
    "segmentsConfig": {
      "timeColumnName": null,
      "retentionTimeUnit": "DAYS",
      "retentionTimeValue": "365",
      "replication": "1",
      "segmentPushType": "APPEND"
    },
    "tenants": {
      "broker": "DefaultTenant",
      "server": "DefaultTenant"
    },
    "tableIndexConfig": {
      "rangeIndexVersion": 2,
      "autoGeneratedInvertedIndex": false,
      "createInvertedIndexDuringSegmentGeneration": false,
      "loadMode": "MMAP",
      "enableDefaultStarTree": false,
      "enableDynamicStarTreeCreation": false,
      "aggregateMetrics": false,
      "nullHandlingEnabled": true
    },
    "task": {
      "taskTypeConfigsMap": {
        "IcebergIngestionTask": {
          "schedule": "0 */30 * * * ?",
          "executor": "controller",
          "inputFormat": "parquet",
          "iceberg.source.type": "catalog",
          "catalogType": "iceberg-rest",
          "catalog.iceberg-rest.serviceType": "s3Tables",
          "catalog.iceberg-rest.table.namespace": "s3_namespace_wiki",
          "catalog.iceberg-rest.table.tableName": "s3_table_wiki",
          "catalog.iceberg-rest.restUri": "https://s3tables.<REGION>.amazonaws.com",
          "catalog.iceberg-rest.tableBucketArn": "arn:aws:s3tables:<region>:<account-id>:bucket/<bucket-name>",
          "catalog.iceberg-rest.auth.rest.authType": "aws-sigv4",
          "catalog.iceberg-rest.auth.rest.accessKeyId": "<YOUR_ACCESS_KEY_ID>",
          "catalog.iceberg-rest.auth.rest.secretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
          "catalog.iceberg-rest.auth.rest.region": "<REGION>",
          "catalog.iceberg-rest.auth.rest.service": "s3tables",
          "catalog.iceberg-rest.auth.storage.authType": "aws-sigv4",
          "catalog.iceberg-rest.auth.storage.accessKeyId": "<YOUR_ACCESS_KEY_ID>",
          "catalog.iceberg-rest.auth.storage.secretAccessKey": "<YOUR_SECRET_ACCESS_KEY>",
          "catalog.iceberg-rest.auth.storage.region": "<REGION>"
        }
      }
    },
    "tierConfigs": [],
    "isDimTable": false
  }
}

Request Body Field Reference

catalogConfig — Common Fields

| Field | Applies to | Description |
|-------|------------|-------------|
| namespace | All (List Tables, Create) | Namespace (Glue database, Nessie namespace, etc.) containing the target table. Required for List Tables and Create. In List Tables it is a top-level field beside catalogConfig; in Create it goes inside catalogConfig. |
| tableName | Create | Name of the Iceberg table. Required for Create. |
| restUri | iceberg-rest, iceberg-rest-s3 | URL of the Iceberg REST catalog endpoint. |
| serviceType | iceberg-rest, iceberg-rest-s3 | Identifies the backing service: "glue", "nessie", or "s3Tables". |
| warehouse | iceberg-rest (glue) | AWS account ID, used as the Glue warehouse identifier. |
| tableBucketArn | iceberg-rest (s3Tables) | Full ARN of the S3 Tables bucket. |
| restAuthType | iceberg-rest, iceberg-rest-s3, glue-catalog | Auth type for the catalog REST API: "aws-sigv4" or "none". |
| restAccessKeyId / restSecretAccessKey | iceberg-rest, glue-catalog | AWS credentials for catalog API access. |
| restRegion | iceberg-rest, glue-catalog | AWS region for the catalog endpoint. |
| restService | iceberg-rest, glue-catalog | AWS service name for SigV4 signing: "glue" or "s3tables". |
| storageAuthType | iceberg-rest, iceberg-rest-s3, glue-catalog | Auth type for S3 data access: always "aws-sigv4". |
| storageAccessKeyId / storageSecretAccessKey | iceberg-rest, iceberg-rest-s3, glue-catalog | AWS credentials for reading Parquet data from S3. |
| storageRegion | iceberg-rest, iceberg-rest-s3, glue-catalog | AWS region for S3 data access. |
| region / database | glue-catalog | Region and default database for native Glue SDK access. |
| bucketName / prefix | s3-catalog | S3 bucket name and key prefix pointing to the Parquet files. |
| accessKey / secretKey | s3-catalog | AWS credentials (note: different field names from other providers). |

tableConfig — Key Fields

| Field | Description |
|-------|-------------|
| tableName | Pinot table name. Convention: <schemaName>_OFFLINE. |
| tableType | Always "OFFLINE" for Iceberg ingestion. |
| segmentsConfig.timeColumnName | Time column for Pinot segments. Can be null if the table has no time dimension. |
| segmentsConfig.retentionTimeValue / retentionTimeUnit | How long Pinot retains segments. |
| segmentsConfig.segmentPushType | Always "APPEND"; new Iceberg snapshots are appended as new segments. |
| tableIndexConfig.nullHandlingEnabled | Set to true to handle nullable columns from Iceberg schemas. |
| task.taskTypeConfigsMap.IcebergIngestionTask.schedule | Cron expression for ingestion frequency. "0 */30 * * * ?" runs every 30 minutes. |
| task.taskTypeConfigsMap.IcebergIngestionTask.inputFormat | Always "parquet", the only data file format this task supports. |

Frequently Asked Questions

Why isn’t my table ingesting data after creation? The IcebergIngestionTask runs on a cron schedule (default every 30 minutes) and does not fire automatically on table creation. You must manually trigger the first run:
curl -X POST \
  "http://localhost:9000/periodictask/run?taskname=IcebergIngestionTask&tableName=<TABLE_NAME>&type=OFFLINE" \
  -H "Content-Type: application/json" \
  -d ''
See Observability → Trigger Ingestion Task for details.
Which catalog type should I use?
| My setup | Use |
|----------|-----|
| Raw Parquet files directly on S3 (no catalog service) | s3-catalog (S3 Data Lake) |
| AWS Glue via the Iceberg REST protocol | iceberg-rest with serviceType: "glue" (Glue REST) |
| AWS Glue via the native Glue SDK | glue-catalog (Glue Native) |
| Project Nessie server | iceberg-rest-s3 with serviceType: "nessie" (Nessie REST) |
| AWS S3 Tables | iceberg-rest with serviceType: "s3Tables" (S3 Tables REST) |

What credentials do I need? For catalog types whose catalog API is signed with AWS SigV4 (iceberg-rest and glue-catalog), you need two sets of AWS credentials:
  • Catalog credentials (restAccessKeyId / restSecretAccessKey): to authenticate against the catalog API (Glue, S3 Tables).
  • Storage credentials (storageAccessKeyId / storageSecretAccessKey): to read Parquet data files from S3.
For Nessie (iceberg-rest-s3) with restAuthType set to "none", only the storage credentials are needed. For the S3 Data Lake provider (s3-catalog), only one set is needed, using the field names accessKey / secretKey.
What file format does Iceberg ingestion support? Only Parquet. Set "inputFormat": "parquet" in the IcebergIngestionTask config. Parquet is Iceberg's default data file format.
Can I set a custom ingestion schedule? Yes. The schedule field in IcebergIngestionTask accepts a six-field cron expression with the seconds field first, as in the examples in this guide. For example, "0 */5 * * * ?" runs every 5 minutes and "0 0 * * * ?" runs every hour. The schedule applies to all subsequent automatic runs; the first run must always be triggered manually.
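The schedule strings in this guide share one pattern: second 0, every N minutes, any hour and day. A small sketch for generating them is shown below; it assumes the six-field, seconds-first layout that the guide's examples follow, and the function name `every_n_minutes` is hypothetical.

```python
def every_n_minutes(n):
    """Return a six-field cron string that fires every n minutes at second 0."""
    if not 1 <= n <= 59:
        raise ValueError("n must be between 1 and 59")
    # Fields: second, minute, hour, day-of-month, month, day-of-week.
    return f"0 */{n} * * * ?"
```

Values of n that divide 60 evenly (5, 10, 15, 30) give uniform spacing; other values restart at the top of each hour, so the last interval in an hour is shorter.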
How do I pause ingestion without deleting the table? Set "enabled": "false" in the IcebergIngestionTask config and update the table. This stops the scheduler from creating new ingestion runs while preserving all existing segments and the last checkpoint. Re-enable by setting "enabled": "true". See Pausing Ingestion for the full example.
What happens if timeColumnName is null? Pinot creates the table without a time dimension. Segments are still ingested and queryable, but time-based retention and time-partition pruning are disabled. Set timeColumnName to a timestamp column in your Iceberg schema if you need those features.
My Validate call succeeds but List Namespaces returns nothing — why? The validate endpoint only confirms connectivity and credential validity. An empty namespace list typically means the credentials have access to the catalog service but the account or region contains no databases, or the warehouse / database field points to the wrong scope. Double-check that the region, warehouse, and database fields match your actual Glue or S3 Tables configuration.
Can I use IAM roles instead of access keys? Cross-account IAM role-based access is not yet supported.
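How do I change the ingestion schedule after table creation? Change the IcebergIngestionTask.schedule field in the table config, then update the table via the Pinot Controller API (PUT /tables/<TABLE_NAME>_OFFLINE), the same call used in Pausing Ingestion.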
Where can I monitor ingestion health? See the Observability page for the Watcher Status, Checkpoint, and File Count APIs.