Query Analyzer - StarTree Docs

Beta feature — Query Analyzer is disabled by default and available on demand. Contact your StarTree account team to have it enabled for your environment.

Query Analyzer is an AI-powered tool built into StarTree Cloud that helps you understand and optimize Apache Pinot multi-stage engine (MSE) queries. It analyzes your SQL query alongside table metadata, explain plans, and execution statistics to produce prioritized, evidence-backed recommendations — all without requiring deep Pinot expertise.

The Problem

Optimizing Pinot multi-stage queries is hard because the debugging surface is fragmented and expert-only:

Explain plans are detailed but difficult to interpret for most users
Stage statistics show where time was spent, but not what to change
Table metadata (indexes, cardinality, data types) is spread across multiple pages
Users end up guessing, iterating blindly, or escalating to support

This leads to slow time-to-resolution, wasted compute, and repetitive support loops.

How to Access

Query Analyzer is available directly in the StarTree Data Portal Query Console. Once enabled for your environment, no additional configuration is required — write your SQL query and use the built-in Analyze options to get instant, AI-generated optimization guidance.

Navigate to Data Portal → Query Console, write your query, and select Analyze to get started.

What You Get Back

Query Analyzer returns three top-level sections for every analysis:

Output	Description
Query Summary	Plain-English description of what the query does
Risk Flags	Issues found, ranked by severity. Each flag carries its own supporting evidence, bottleneck metrics (runtime), and any correctness alerts — so context and cause are always together
Recommendations	Ranked, actionable suggestions with expected impact, a before/after change, and tradeoffs

Every recommendation is grounded in measured evidence, not generic advice.

Two Analysis Modes

Static Analysis

Run before executing your query. Flags likely problems using table metadata, index configuration, and the explain plan — so you can fix issues before spending compute.

Runtime Analysis

Run after executing your query. Uses real execution statistics to pinpoint actual bottlenecks and provide evidence-based fixes.

What Gets Analyzed

Index gaps Missing inverted, range, sorted, bloom filter, text, FST, JSON, star-tree, timestamp, geospatial, sparse, composite JSON, and vector indexes, as well as full scan risk. Query patterns Wrong join order, filters not pushed below joins, broadcast join risk, cross-join risk, OFFSET-based pagination, SELECT *, leading-wildcard LIKE, high result LIMIT, unnecessary sort, high-cardinality GROUP BY, group limit risk, heavy aggregation, suboptimal column encoding, missing cross-table filter, large table without pre-filter, and missing date range propagation. Operator-level execution (runtime only) Each risk flag produced during runtime analysis includes measured metrics from the relevant operator — execution time, memory allocation, rows produced, bytes shuffled, and correctness signals such as whether the aggregation group limit was hit.

Security and Data Privacy

Query Analyzer uses an AI model to generate recommendations. Your SQL and table metadata are processed as part of every analysis — review what is sent, how it is protected, and how it maps to your security and compliance requirements before enabling or using the feature.

Security and Compliance

Essential reading for security, compliance, and data governance teams.

AI model and cloud provider by deployment
What data is sent to the LLM — and what is explicitly excluded
Credential masking, data flows, and retention
Access controls, encryption, and platform certifications (SOC 2, ISO 27001, HIPAA readiness)

Guardrails

Query Analyzer is designed to produce high-signal, actionable output:

Every recommendation includes specific evidence from the query plan or execution statistics
Low-confidence and low-impact findings are suppressed — only high-signal results are shown
Correctness alerts (such as results being potentially incomplete due to group limits) are surfaced prominently
Query rewrite suggestions are validated against Pinot SQL syntax rules
All suggestions include tradeoffs so you can make informed decisions

Next Steps

How to Use Query Analyzer

Step-by-step guide to running static and runtime analysis from the Query Console, and how to act on the results.

Understanding Your Results

Reference guide to every output field, risk flag type, recommendation category, and what each means for your query.

Security and Compliance

AI model, data privacy, access controls, and platform compliance.

​The Problem

​How to Access

​What You Get Back

​Two Analysis Modes

Static Analysis

Runtime Analysis

​What Gets Analyzed

​Security and Data Privacy

Security and Compliance

​Guardrails

​Next Steps

How to Use Query Analyzer

Understanding Your Results

Security and Compliance

The Problem

How to Access

What You Get Back

Two Analysis Modes

What Gets Analyzed

Security and Data Privacy

Guardrails

Next Steps