# Release version 0.15.0: April 2026

# **Executive Summary**

Release `0.15.0` is centered on one major milestone: **StarTree Iceberg Tables are production ready for broader customer adoption, with Iceberg catalog support and enhanced operational reliability**.

* **StarTree Iceberg Tables:** Stronger validation, better status/error surfacing, and safer execution paths make Iceberg-based external tables easier to run in production at scale.
* **Reliability first:** Ingestion and task orchestration improvements reduce failure blast radius and improve recoverability for long-running jobs.
* **Data correctness and type handling:** Substantial work landed around Parquet readers, complex type mapping, and column metadata behavior to improve query correctness.
* **Operational visibility and security:** Better auth metrics, request-level tracing, and storage/query observability improve day-2 operations.

# **StarTree Cloud highlights**

## **New Features**

### **StarTree Iceberg Table is now GA**

StarTree Iceberg Tables take a major step forward in this release, with a strong focus on production readiness for Iceberg catalogs (including AWS Glue and S3 Tables) and in-place query workflows.

* Expanded validation and guardrails for external table onboarding.
* Better operational controls when handling real-world catalog and table configuration edge cases.
* Improved ingestion task behavior and lifecycle handling for external table pipelines.
* IAM role-based onboarding support for external Iceberg tables.

We published an internal performance benchmark comparing StarTree's external table offering with similar offerings in Trino and ClickHouse. In that benchmark, StarTree is orders of magnitude faster, with a 15x smaller infrastructure footprint. Read the full benchmark here: [https://startree.ai/resources/iceberg-query-benchmark-vs-trino-vs-clickhouse/](https://startree.ai/resources/iceberg-query-benchmark-vs-trino-vs-clickhouse/)

For onboarding and usage details, see the [External Tables documentation](https://docs.startree.ai/corecapabilities/external-table/onboarding-data-portal).

**Better reliability and control**

StarTree Iceberg Table onboarding now has stronger execution and recovery behavior for production workloads:

* More robust task execution flow for external table ingestion runs.
* Clearer status and root-cause error reporting to speed up incident triage.
* Optional handling for file-level failures to keep ingestion progressing when appropriate.
* REST API to clear page caches on the server and controller.

### **Query and type-system improvements for External Table**

This release improves correctness when querying external Parquet and complex schemas:

* Better handling for complex and nested type paths.
* Improvements in multi-value reads and metadata interpretation.
* More consistent behavior for column statistics and predicate handling.

### **Operational observability and auth telemetry**

StarTree Cloud now surfaces better operational signals across query, cache, and security paths:

* New/expanded metrics for auth plugin behavior.
* Better reader and cache observability for Parquet query paths.
* Improved request-level tracing for external table ingestion components.

### **Query Log**

**Built-in query log system table:** Every executed query (single-stage and multi-stage) is now captured asynchronously in a managed `system_query_log` Pinot table. The table records all relevant fields from the query response, which makes it easy to slice and dice historical queries by dimensions such as latency, memory used, and docs scanned.
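As a rough illustration of the kind of slicing this enables, the sketch below builds a SQL statement that lists the slowest logged queries. Only the table name `system_query_log` comes from these notes. The column names (`latency_ms`, `docs_scanned`, `query_text`) and the helper `build_slow_query_sql` are hypothetical placeholders, so check the actual table schema before adapting it.

```python
# Hypothetical column names: the real system_query_log schema may differ.
LATENCY_COL = "latency_ms"
DOCS_SCANNED_COL = "docs_scanned"
QUERY_COL = "query_text"


def build_slow_query_sql(min_latency_ms: int, limit: int = 20) -> str:
    """Return SQL listing the slowest logged queries above a latency threshold."""
    if min_latency_ms < 0 or limit <= 0:
        raise ValueError("min_latency_ms must be >= 0 and limit must be > 0")
    return (
        f"SELECT {QUERY_COL}, {LATENCY_COL}, {DOCS_SCANNED_COL} "
        f"FROM system_query_log "
        f"WHERE {LATENCY_COL} >= {int(min_latency_ms)} "
        f"ORDER BY {LATENCY_COL} DESC "
        f"LIMIT {int(limit)}"
    )


print(build_slow_query_sql(500, limit=10))
```

The resulting string can be submitted through whichever Pinot query client you already use.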

### **Replica scale down**

A replica group can now be scaled down on a specified schedule. Use this feature to reduce footprint during periods of low activity, for instance when query volume drops outside business hours.

## **Improvements**

* **Default Kafka AZ-aware ingestion:** Real-time tables now ingest from Kafka in an AZ-aware fashion by default, reducing cross-AZ traffic and therefore cost.
* **Task and orchestration maturity:** Better task metadata tracking, cleaner status transitions, and stronger scheduling/runtime controls.
* **Cluster and ingestion safeguards:** Additional checks and throttling paths help reduce overload during ingestion and segment operations.
* **Parquet path performance tuning:** Safer concurrent read behavior and improved cache/prefetch handling for heavy query workloads.
* **Reload hardening:** Reload operations can now be simulated (dry run), and in-progress reloads can be cancelled.
* **Min-max index on sorted raw columns:** Min-max indexes can now be built on sorted columns that do not use dictionary encoding, extending pruning coverage.
* **Offline upsert hardening:** Hardened offline upserts and added support for off-heap (persistent) upsert metadata.
* **`enforceConsumptionInOrder` enabled by default:** Partial upsert and dedup tables now enforce in-order consumption by default, preventing correctness issues from out-of-order message delivery.
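To make the pruning benefit of min-max indexes concrete, here is a minimal sketch of range-based pruning. It assumes each segment exposes a minimum and maximum for the filtered column. `SegmentStats` and `prune_segments` are hypothetical names used for illustration and are not Pinot classes.

```python
from dataclasses import dataclass


@dataclass
class SegmentStats:
    """Per-segment min/max for one column (illustrative, not Pinot's classes)."""
    name: str
    min_value: int
    max_value: int


def prune_segments(segments, lower, upper):
    """Keep only segments whose [min, max] range overlaps [lower, upper]."""
    return [
        s.name
        for s in segments
        if s.max_value >= lower and s.min_value <= upper
    ]


segments = [
    SegmentStats("seg_0", 0, 99),
    SegmentStats("seg_1", 100, 199),
    SegmentStats("seg_2", 200, 299),
]

# A filter like `col BETWEEN 150 AND 220` can skip seg_0 entirely.
print(prune_segments(segments, 150, 220))  # ['seg_1', 'seg_2']
```

Because the check needs only two values per segment, it works whether or not the column is dictionary encoded.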

## **Bug Fixes**

* Fixed multiple correctness issues in Parquet and complex-type readers that could affect query output under edge schemas.
* Fixed reliability issues in upsert/dedup and RocksDB lifecycle paths to reduce race conditions and memory leaks.
* Fixed several ingestion and controller edge cases, including null handling, status transitions, and flaky runtime behaviors.
* Fixed platform and packaging issues that impacted build stability and release workflows.

# **Apache Pinot Highlights**

This section describes [**Apache Pinot**](https://github.com/apache/pinot) (open source) changes in the baseline that ships with StarTree Cloud **0.15.0** compared with **0.14.0**. It does not describe StarTree-only extensions.

## New Features

* **Vector Search:** Full multi-phase vector search, including IVF\_FLAT, IVF\_PQ compressed ANN, filtered ANN, SQL radius queries, HNSW efSearch, IVF\_ON\_DISK, and an adaptive planner.
* **MSE Enhancements:** Native MSE planning for SUM/AVG over MV columns, broker pruning for non-partitioned leaf paths, and lookup join support in the physical optimizer.
* **Upsert derivations:** Post-partial-upsert transforms support derived columns after partial upsert merge.
* **Arrow Batch Ingestion:** `ArrowRecordReader` for ingesting Arrow IPC files. `pinot-arrow` is included in the standard Pinot binary bundle for Arrow-based columnar read paths.
* **Distinct Early Termination:** Support for early termination in the combine operator, for more predictable query latency.
* **AI Metadata on Schema/TableConfig:** `description` and `tags` fields on Schema, FieldSpec, and TableConfig for capturing enhanced user context.
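The idea behind distinct early termination can be sketched as follows. This is a simplified model that assumes per-segment results are merged one after another and that the query carries a row limit. `combine_distinct` is a hypothetical function, not Pinot's combine operator.

```python
def combine_distinct(per_segment_values, limit):
    """Merge distinct values from segments, stopping once `limit` is reached.

    Returns the distinct values and how many segments were actually read.
    """
    if limit <= 0:
        return [], 0
    seen = set()
    result = []
    segments_read = 0
    for values in per_segment_values:
        segments_read += 1
        for v in values:
            if v not in seen:
                seen.add(v)
                result.append(v)
                if len(result) >= limit:
                    # Early termination: remaining segments are never read.
                    return result, segments_read
    return result, segments_read


segments = [["a", "b", "a"], ["b", "c"], ["d", "e"], ["f"]]
values, read = combine_distinct(segments, limit=3)
print(values, read)  # ['a', 'b', 'c'] 2
```

In this model the work done depends on the limit and not on the total number of segments, which is where the more predictable latency comes from.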

## Improvements

* **MSE Observability:** Upstream/downstream stage ID MDC fields for debugging, plus improved error propagation and logging.
* **Adaptive Routing:** Adaptive routing stats are exported as broker metrics.
* **Auth ordering:** Authentication runs before query validation so unauthorized requests fail earlier.
* **Large response handling:** Cursor-style response lifecycle cleanup moves toward the broker with a batch delete API (review upstream notes if you integrate with response stores).
* **Minion observability:** Minion task generation logs carry correlation context (MDC) keyed by task id.
* **Dynamic Server Thread Pool:** Thread pool size can be modified at runtime without a restart.

## Bug Fixes

* **Upsert snapshots:** Bitmap optimizations during upsert snapshot and segment commit paths.
* **Kafka consumption:** Partition-level consumer avoids incorrect re-seeking past `read_committed`-filtered batches.
* **Record extraction:** Map serialization is canonicalized with more reliable preservation of primitive types in record extractors.
* **FUNNEL\_COUNT NPE:** Fixed a NullPointerException when the WHERE clause filters out all rows.

## Backwards Incompatible Changes

* **Native text and FST index removed:** Migrate to the Lucene-based text index.
* **Inverted index always created during segment generation:** This was previously optional.
