Overview and Purpose
The Composite JSON Index is an enhanced version of the JSON Index. It is available from StarTree version 0.11.0 and adds support for:
- Indexing selected paths with an internal range and/or text index.
- Controlling which paths are included in the inverted index (to make the index smaller and speed up index builds).
- Applying a filter to the number of matching flattened documents (per JSON document).
- Controlling how often to flush with the off-heap index creator and also limiting temporary memory usage.
- Disabling the indexing of array positions (to make the index smaller).
Example Illustration
Benchmark testing shows a significant response time reduction for range expressions. Consider the query:
Configuration
Enabling Composite JSON Index
To enable a composite JSON index on a column in your StarTree Cloud table, add the following configuration to your table definition:
Configuration Parameters
The composite JSON index configuration supports the JSON index options, as well as:
- flushThreshold: Flushes the off-heap posting list every n documents to limit memory usage.
- enablePositionalIndexing: Indicates whether to include array indexes.
  - Optional
  - Default: true
  - When set to false, Pinot does not include array indexes in the inverted index.
- invertedIndexConfigs: A list of paths to add to the inverted index.
  - Default: empty
  - Can be a single element with "includeAllPaths":true to include everything.
- rangeIndexConfigs: A list of paths to include in internal range index(es), along with settings. Supports the following parameters:
  - path: JSON path of the field to index.
    - Required
  - name: The name for the field.
    - Required and must be unique.
  - dataType: Data type of the JSON field.
    - Required
    - When createDictionary=true, this parameter accepts INT, LONG, FLOAT, DOUBLE, STRING, BIG_DECIMAL.
    - Otherwise, this parameter accepts INT, LONG, FLOAT, DOUBLE.
  - createDictionary: Indicates whether to create the dictionary for the field.
    - Optional
    - Default: false
  - dictionaryType: Indicates the dictionary type, fixedLength or variableLength.
    - Optional
    - Default: variableLength
    - Due to dictionary limitations, Pinot currently supports only variableLength for BIG_DECIMAL.
  - defaultValue: The value used for flattened records that do not include the field or that contain a badly formed value.
    - Optional
- textIndexConfigs: A list of paths to include in the internal Lucene text index(es).
  - Accepts path and name parameters.
  - The name parameter must be unique.
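The rangeIndexConfigs rules listed above can be checked mechanically before a table config is submitted. The following is a minimal Python sketch under stated assumptions, not part of StarTree or Pinot: the function name validate_range_index_configs, the example paths, and the message wording are hypothetical, and each entry is assumed to be a plain dictionary keyed by the parameter names documented above.

```python
# Hypothetical pre-submission check for rangeIndexConfigs entries.
# It encodes only the rules stated in the parameter list above.

NUMERIC_TYPES = {"INT", "LONG", "FLOAT", "DOUBLE"}
DICTIONARY_TYPES = NUMERIC_TYPES | {"STRING", "BIG_DECIMAL"}


def validate_range_index_configs(configs):
    """Return a list of problems; an empty list means no documented rule is violated."""
    problems = []
    seen_names = set()
    for i, cfg in enumerate(configs):
        # path, name and dataType are documented as required.
        for key in ("path", "name", "dataType"):
            if key not in cfg:
                problems.append(f"entry {i}: missing required parameter '{key}'")

        # name must be unique across entries.
        name = cfg.get("name")
        if name is not None:
            if name in seen_names:
                problems.append(f"entry {i}: duplicate name '{name}'")
            seen_names.add(name)

        # Allowed data types depend on createDictionary (default: false).
        create_dictionary = cfg.get("createDictionary", False)
        allowed = DICTIONARY_TYPES if create_dictionary else NUMERIC_TYPES
        data_type = cfg.get("dataType")
        if data_type is not None and data_type not in allowed:
            problems.append(
                f"entry {i}: dataType '{data_type}' is not accepted "
                f"when createDictionary is {create_dictionary}"
            )

        # dictionaryType defaults to variableLength; BIG_DECIMAL supports only that.
        dictionary_type = cfg.get("dictionaryType", "variableLength")
        if dictionary_type not in ("fixedLength", "variableLength"):
            problems.append(f"entry {i}: unknown dictionaryType '{dictionary_type}'")
        if data_type == "BIG_DECIMAL" and dictionary_type != "variableLength":
            problems.append(f"entry {i}: BIG_DECIMAL requires dictionaryType variableLength")
    return problems


# Example paths and names are made up for illustration.
example = [
    {"path": "$.age", "name": "age", "dataType": "INT"},
    {"path": "$.name", "name": "age", "dataType": "STRING"},
]
for problem in validate_range_index_configs(example):
    print(problem)
```

The second entry is flagged twice in this sketch: its name repeats an earlier one, and STRING is only accepted when createDictionary is true.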
Example Queries
Range-index-based range query against a JSON field
Range Index Default Values
The range index contains a value for each flattened row. If a value is missing or cannot be parsed, the index stores either the user-set defaultValue or, if none is set, the fixed default of 0. That default can cause unexpected results in range queries that have no upper or no lower bound. For example, with no defaultValue set, the following query will include documents without a value field:
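The effect of the default can be illustrated with a small simulation. This is a simplified Python model of the behavior described above, not Pinot code: the helper names, the sample rows, and the field name value are hypothetical, and numeric parsing is reduced to float().

```python
# Simplified model: every flattened row gets a range-index value, and rows
# with a missing or unparseable field fall back to a default.

FIXED_DEFAULT = 0  # fixed default described above when no defaultValue is set


def range_index_value(flattened_row, field, default_value=None):
    """Value the range index would hold for one flattened row (simplified)."""
    fallback = FIXED_DEFAULT if default_value is None else default_value
    raw = flattened_row.get(field)
    try:
        return float(raw)
    except (TypeError, ValueError):
        # Missing (None) or badly formed values use the fallback.
        return fallback


def matching_ids(rows, upper_bound, default_value=None):
    """IDs of rows whose indexed value is below upper_bound (no lower bound)."""
    return [
        row["id"]
        for row in rows
        if range_index_value(row, "value", default_value) < upper_bound
    ]


rows = [
    {"id": "a", "value": 5},
    {"id": "b", "value": 50},
    {"id": "c"},                   # no value field
    {"id": "d", "value": "n/a"},   # badly formed value
]

print(matching_ids(rows, 10))                      # ['a', 'c', 'd']
print(matching_ids(rows, 10, default_value=1000))  # ['a']
```

In this model, rows c and d match an upper-bound-only filter because they are indexed as 0. Choosing a defaultValue outside the ranges you query, or adding an explicit lower bound, keeps them out of the result.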
Array Position Indexing
A query that specifies an array index in the JSON path requires enablePositionalIndexing=true, even when the path is range-indexed. Otherwise, the query returns an empty result. For example, the following query:
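One way to catch this situation early is to check query paths against the index configuration. The sketch below is a hypothetical Python helper, not a Pinot or StarTree API. It assumes that an array position appears in the path as a numeric subscript such as [0], and that the index configuration is available as a dictionary keyed by the parameter names documented above. The example paths are made up.

```python
import re

# A numeric subscript such as [0] or [12]; a wildcard like [*] is not a position.
ARRAY_POSITION = re.compile(r"\[\d+\]")


def uses_array_position(json_path):
    """True if the JSON path addresses a specific array position."""
    return ARRAY_POSITION.search(json_path) is not None


def would_return_empty(json_path, index_config):
    """True if the path needs positional indexing but the config disables it."""
    enabled = index_config.get("enablePositionalIndexing", True)  # default: true
    return uses_array_position(json_path) and not enabled


config = {"enablePositionalIndexing": False}
print(would_return_empty("$.items[0].price", config))  # True
print(would_return_empty("$.items[*].price", config))  # False
print(would_return_empty("$.items[0].price", {}))      # False
```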
Range Queries against Mutable Segments
A range index is used for immutable (committed) segments, while mutable segments rely on the inverted index. When indexing hybrid or real-time tables, you must also include the paths used for range queries in the inverted index. Otherwise, queries will return empty results for the mutable segments. Consider the following example configuration:
Text Index Configuration
Apart from path and name, a textIndexConfigs entry can contain the fields allowed in a regular text index configuration: “rawValue”, “queryCache”, “useANDForMultiTermQueries”, “stopWordsInclude”, “stopWordsExclude”, “luceneUseCompoundFile”, “luceneMaxBufferSizeMB”, “luceneAnalyzerClass”, “luceneAnalyzerClassArgs”, “luceneAnalyzerClassArgTypes”, “luceneQueryParserClass”, “enablePrefixSuffixMatchingInPhraseQueries”, “reuseMutableIndex”, and “luceneNRTCachingDirectoryMaxBufferSizeMB”. For more information on the text index and its configuration, see the Pinot documentation.

