Skip to main content
Beta feature — Query Analyzer is disabled by default and available on demand. Contact your StarTree account team to have it enabled for your environment.
This page is a reference for everything Query Analyzer returns. Use it to understand what each field means, what triggered a specific finding, and what the different recommendation and risk flag categories represent.

Output Sections

Query Summary

Risk Flags

The main output section. Each risk flag represents one detected issue and bundles everything related to that issue in one place: what was found, why it matters, the measured evidence behind it, and — for runtime analysis — the operator metrics and any correctness concerns.
FieldDescription
TypeThe category of issue detected (see Risk Flag Types below)
SeverityHow serious the issue is — HIGH, MEDIUM, or LOW
DescriptionA plain-English explanation of what was found and why it matters
EvidenceThe specific data that triggered the flag — for example, a column name, cardinality value, index configuration, or measured execution metric
ConfidenceHow confident the analyzer is in the finding — HIGH, MEDIUM, or LOW
Operator (runtime)The execution operator this flag relates to — for example, Hash Join or Table Scan (see Operator Types)
Metrics (runtime)Measured execution values for the operator — time, memory, rows, bytes shuffled
Only HIGH and MEDIUM confidence findings are shown. Low-confidence results are suppressed to keep output focused.
When a risk flag carries a correctness alert, address it before acting on performance recommendations. A correctness issue — such as the group limit being reached — means query results may already be incomplete, which affects how you should interpret the rest of the analysis.

Recommendations

Ranked, actionable suggestions. Each recommendation includes:
FieldDescription
RankPriority order — rank 1 has the highest expected impact
TitleShort description of the action to take
TypeThe category of change recommended (see Recommendation Types below)
DescriptionDetailed explanation of why this change helps
Suggested ChangeA before/after view of the specific change — a query snippet, index config, or hint
Expected ImpactHow much improvement is expected — HIGH, MEDIUM, or LOW
ConfidenceHow confident the analyzer is in this recommendation — HIGH, MEDIUM, or LOW
EvidenceReferences to the supporting evidence items that justify this recommendation
TradeoffsSide effects or caveats to consider before applying the change
Correctness NotesSemantic considerations — for example, whether a join reorder affects NULL handling
Recommendations are ranked by expected impact first, then confidence. A query rewrite that eliminates an expensive operation ranks above an index addition at the same impact level.

Risk Flag Types

Risk flags are detected during both static and runtime analysis. There are two categories.

Index Gaps

These flags indicate that a column used in a filter, join, or aggregation is missing an index that would significantly speed up the query.
Risk FlagWhat It Means
Missing Inverted IndexA column used in an equality or IN predicate has no inverted index, requiring a full scan of each segment
Missing Range IndexA column used in a range comparison (>, <, BETWEEN) has no range index
Missing Sorted IndexA high-selectivity column could reduce segment pruning with a sorted index
Missing Bloom FilterA low-cardinality column with equality predicates could benefit from a bloom filter to skip segments quickly
Missing Text IndexA column used with LIKE or text search has no full-text index
Missing FST IndexA column used with regex predicates has no FST (Finite State Transducer) index
Missing JSON IndexA column with JSON path access has no JSON index
Missing Star-Tree IndexAn aggregation pattern could be pre-computed with a star-tree index, potentially eliminating the scan entirely
Missing Timestamp IndexA time column used in time-bucketing functions has no timestamp index
Missing Geospatial IndexA column used in spatial predicates has no geospatial index
Missing Sparse IndexA column stored in cold/tiered storage could benefit from a sparse index to reduce remote reads
Missing Composite JSON IndexMultiple JSON path accesses could be accelerated with a composite JSON index
Missing Vector IndexA column used in vector similarity search has no vector index
Full Scan RiskThe query structure is likely to cause a full segment scan regardless of indexes

Query Pattern Issues

These flags indicate structural patterns in the SQL or query plan that are likely to cause poor performance.
Risk FlagWhat It Means
Wrong Join OrderThe larger table appears on the right (hash build) side of a join. Pinot builds the in-memory hash table from the right input — putting the larger table there wastes memory and slows the build
Filter Not Pushed DownA filter appears above a join in the execution plan instead of below it. Pushing filters down reduces the number of rows that reach the join
Broadcast JoinThe right-side input of a join is broadcast to every worker, which is expensive for large tables
Self JoinA table is joined with itself, which may indicate a query design issue
Cross Join RiskA join condition is non-equality, which can produce a Cartesian product
Offset PaginationOFFSET is used for pagination. For large offsets this requires scanning and discarding thousands of rows — use keyset (cursor) pagination instead
Select StarSELECT * fetches all columns including potentially large or unused ones
Leading WildcardA LIKE '%...' pattern cannot use index prefix optimization and requires scanning every value
High Result LimitA very large LIMIT value may produce excessive result sets
Unnecessary SortAn ORDER BY clause has no corresponding LIMIT, meaning all rows are sorted
High Cardinality Group ByGROUP BY on a high-cardinality column produces a large intermediate state that requires significant memory
Group Limit RiskThe estimated number of groups may approach or exceed the default limit of 100,000 groups
Heavy AggregationA memory-intensive aggregate function such as DISTINCTCOUNT is used
Suboptimal EncodingA column uses raw encoding where dictionary encoding would improve index effectiveness

Filter Propagation Issues

These flags identify missed opportunities to reduce data early by pushing filters across the query plan.
Risk FlagWhat It Means
Missing Cross-Table FilterA filter applied to one side of a join could also be applied to the other side, but isn’t — the engine ends up joining more rows than necessary
Large Table Without Pre-filterA large table is scanned with no pre-filter, causing a full scan before the join or aggregation that follows
Missing Date Range PropagationA date or time range filter on one side of a join is not propagated to the other side, missing an opportunity to prune data before the join executes

Operator Types

Operator types appear within risk flags during runtime analysis.
OperatorWhat It DoesKey Metrics to Watch
Table ScanScans segments and applies filtersEntries scanned in-filter (high = indexes not being used); entries scanned post-filter (high = star-tree opportunity)
Hash JoinBuilds a hash table from the right input and probes it with the leftHash table build time; memory allocated; rows emitted (high rows = join explosion)
Data SendSends rows to downstream workers across stagesBytes serialized; fan-out (number of workers data is sent to)
Data ReceiveReceives rows from upstream workersBytes deserialized; upstream wait time (high = upstream stage is slow)
AggregationGroups rows and computes aggregate functionsMemory allocated; groups emitted; whether the group limit was hit
Window FunctionComputes window functions over a partitionExecution time; memory allocated (high = large partitions or wide frames)
SortSorts rows, typically in combination with a LIMITExecution time and memory for sort buffer

Evidence Types

Evidence is embedded within each risk flag to show what triggered it. The type tells you where the finding came from.

Found in Both Modes

Evidence TypeWhat It Means
Full ScanA full segment scan was detected in the explain plan
Index UsedA specific index was used — for example, an inverted index for an equality predicate
Join OrderJoin order analysis based on table row counts
Missing IndexA column is missing an index that would help — covers inverted, range, text, FST, JSON, bloom filter, star-tree, timestamp, geospatial, sparse, composite JSON, and vector index types
Filter-Only JoinThe right side of a join is used only to filter rows, making it a candidate for rewriting as WHERE col IN (SELECT ...)
Star-Tree BlockedA star-tree index cannot be used because the aggregation depends on a cross-table join
Column CardinalityCardinality data from table statistics, used to assess index recommendations
Predicate PatternAnalysis of the query’s filter structure
Suboptimal EncodingColumn encoding is suboptimal for the query’s access pattern

Found During Static Analysis Only

Evidence TypeWhat It Means
Plan StructureA structural observation from the explain plan
Exchange TypeThe data distribution pattern between stages — hash, broadcast, or random

Found During Runtime Analysis Only

Evidence TypeWhat It Means
High Post-Filter ScanMany entries were scanned after the filter, suggesting a star-tree index opportunity
Join Build TimeHigh hash table build time indicates the right-side table is too large
Join ExplosionLarge number of output rows from the join, indicating a many-to-many relationship
Group Limit HitThe 100,000 group limit was reached — results may be incomplete
High MemoryAn operator allocated significant heap memory
Heavy ShuffleLarge volumes of data were sent between workers
Upstream WaitA stage spent significant time waiting for its upstream stage to produce data
Downstream WaitA stage was blocked waiting for the downstream consumer to process rows
Broadcast DetectedA broadcast exchange was detected in the execution tree

Recommendation Types

Recommendations appear in both static and runtime analysis.

Query Changes

TypeWhat Gets Changed
Query RewriteThe SQL itself — for example, reordering joins, replacing OFFSET with keyset pagination, or rewriting a join as a subquery
Hint AdditionA query hint added inline to control join strategy, aggregation behavior, or partition handling

Index Additions

TypeIndex Added
Add Inverted IndexSpeeds up equality and IN predicates on the target column
Add Range IndexSpeeds up range comparisons on the target column
Add Bloom FilterAllows fast segment skipping for equality predicates on the target column
Add Text IndexEnables full-text search and speeds up LIKE predicates
Add FST IndexEnables fast regex matching on the target column
Add JSON IndexSpeeds up JSON path access on the target column
Add Star-Tree IndexPre-computes aggregations to potentially eliminate scanning entirely
Add Timestamp IndexSpeeds up time-bucketing functions on datetime columns
Add Geospatial IndexSpeeds up spatial predicate evaluation
Add Sparse IndexReduces remote reads for columns in cold/tiered storage
Add Composite JSON IndexOptimizes multiple JSON path accesses together
Add Vector IndexEnables efficient vector similarity search

Configuration Changes

TypeWhat Gets Changed
Encoding ChangeColumn encoding type — for example, switching from raw to dictionary encoding
Join StrategyHow the join is executed — for example, using a lookup join for dimension tables or a colocated join to eliminate data shuffling
Aggregation HintAggregation behavior — for example, enabling group trimming for ORDER BY + LIMIT patterns, or increasing the group count limit