Context
Having text index support for data stored in remote tiered storage is crucial for accelerating text search queries. Previously, text indexes were not supported in StarTree tiered storage because the text index was stored separately from the Pinot segment and was not integrated into columns.psf. Starting with StarTree release 0.11.1, text indexes can be used in conjunction with the StarTree tiered storage configuration.
Key changes made to enable this support:
- File Consolidation: Text index directories are now consolidated into the columns.psf file.
- Tiered Storage Enabled: The inclusion of text indexes within columns.psf automatically enables tiered storage support.
- Minimal Configuration: Activation of the new format merely requires setting the storeInSegmentFile flag.
Text indexes can be stored in one of two modes:
- Default Mode: Text indexes continue to use separate directories and therefore lack tiered storage support.
- Consolidated Mode: When storeInSegmentFile: "true" is configured, text indexes are stored within columns.psf, enabling tiered storage support.
Sample configuration
The storeInSegmentFile flag controls whether text indexes are stored within the segment file or in separate directories:
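As a minimal sketch of how the flag might appear in a table config, the Python snippet below builds one fieldConfigList entry for a text-indexed column and prints it as JSON. The column name log_message is hypothetical, and placing storeInSegmentFile under properties is an assumption; confirm the exact placement against the StarTree documentation for your release.

```python
import json


def text_field_config(column: str, store_in_segment_file: bool = False) -> dict:
    """Build one fieldConfigList entry for a text-indexed column.

    Assumption: the flag is passed as a string under "properties".
    """
    config = {
        "name": column,
        "encodingType": "RAW",
        "indexTypes": ["TEXT"],
    }
    if store_in_segment_file:
        # Consolidated mode: the text index is written into columns.psf.
        config["properties"] = {"storeInSegmentFile": "true"}
    # Without the flag, the entry describes the default mode
    # (separate text index directories, no tiered storage support).
    return config


if __name__ == "__main__":
    # "log_message" is a hypothetical column name used for illustration.
    print(json.dumps(text_field_config("log_message", True), indent=2))
```

Omitting the flag entirely, rather than writing "false", keeps the default-mode entry identical to a config written before the flag existed.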
How to Enable Text Index for S3 Tiered Storage
New Table with Text Index and S3 Tiered Storage Support
For new tables, you have two options for configuring text indexes with S3 tiered storage support:
Case 1: Both Local and S3 in Consolidated Format
Case 2: Local in Default Format, S3 in Consolidated Format
In this case, tierOverwrites is used to decouple the local format from the S3 format.
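The decoupling can be sketched as follows, under stated assumptions: the tier name coldTier and column name log_message are hypothetical, tierOverwrites is assumed to be keyed by tier name, and the per-tier properties are assumed to be merged over the top-level ones. The helper stores_in_segment_file is illustrative only and is not part of any Pinot or StarTree API.

```python
import json
from typing import Optional


def text_field_config_with_tier_overwrite(column: str, tier_name: str) -> dict:
    """Field config that keeps local storage in default format and
    switches only the named tier to consolidated format (assumed layout)."""
    return {
        "name": column,
        "encodingType": "RAW",
        "indexTypes": ["TEXT"],
        # No top-level flag: local segments keep separate index directories.
        "tierOverwrites": {
            tier_name: {"properties": {"storeInSegmentFile": "true"}},
        },
    }


def stores_in_segment_file(field_config: dict, tier: Optional[str] = None) -> bool:
    """Resolve the effective flag for a tier (None means local storage).

    Assumption: tier-level properties override top-level properties.
    """
    props = dict(field_config.get("properties", {}))
    if tier is not None:
        overwrite = field_config.get("tierOverwrites", {}).get(tier, {})
        props.update(overwrite.get("properties", {}))
    return props.get("storeInSegmentFile", "false").lower() == "true"


if __name__ == "__main__":
    cfg = text_field_config_with_tier_overwrite("log_message", "coldTier")
    print(json.dumps(cfg, indent=2))
    print(stores_in_segment_file(cfg))              # False (local, default format)
    print(stores_in_segment_file(cfg, "coldTier"))  # True (S3 tier, consolidated)
```

A check like this could be run against a table config before segments move to the S3 tier, since a text index left in default format on that tier would not have tiered storage support.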
Enable Text Index on Existing Table with S3 Tiered Storage
For existing tables, local storage can remain as it is (default format), and tierOverwrites with storeInSegmentFile: "true" can be used to enable the text index on S3 tiered storage:
Note: For existing tables, the new flag takes effect on existing segments only after a table reload, whereas new segments (data) pick up the change. A table reload can take some time, depending on the size of the data in the table.
Important: Local storage can remain in the default format (separate directories), but when moving to S3, storeInSegmentFile must be set to "true" for tiered storage to work.
How to Pin the Text Index
Text index pinning works the same way as inverted index pinning and is specified using preload.keys in the tier configuration. For detailed information on preloaded index configuration, refer to the StarTree documentation on preloaded index.
To pin text indexes to specific storage tiers, configure the preload.keys in your tierConfigs:
Tiered storage features for text indexes work similarly to those for other indexes (for example, the inverted index). Once text indexes are consolidated into the Pinot segment file with storeInSegmentFile: "true", they can be moved between storage tiers just like any other index type.

