The Text index (Lucene) enhances complex text search capabilities by enabling efficient phrase matching, regex, and prefix searches on large text fields.
TEXT_MATCH
function, like this:
<column_name>
is the column text index is created on and <search_expression>
conforms to one of the following:
Search Expression Type | Example |
---|---|
Phrase query | TEXT_MATCH (<column_name>, ‘“distributed system”‘) |
Term Query | TEXT_MATCH (<column_name>, ‘Java’) |
Boolean Query | TEXT_MATCH (<column_name>, ‘Java AND c++‘) |
Prefix Query | TEXT_MATCH (<column_name>, ‘stream*‘) |
Regex Query | TEXT_MATCH (<column_name>, ‘/Exception.*/‘) |
Not Query | TEXT_MATCH (<column_name>, ’*:* NOT c%’) NOT TEXT_MATCH (<column_name>, ‘c%’) |
noDictionaryColumns
in tableIndexConfig
:
TEXT_MATCH
function supports an optional third parameter for specifying Lucene query parser options at query time. This allows for flexible and advanced text search without changing table configuration.
Function Signature:
text_column_name
: Name of the column to perform text search on.search_expression
: The query string for text search.options
(optional): Comma-separated string of key-value pairs to control query parsing and search behavior.Option | Values | Description |
---|---|---|
parser | CLASSIC , STANDARD , COMPLEX | Selects the Lucene query parser to use. Default is CLASSIC . |
allowLeadingWildcard | true , false | Allows queries to start with a wildcard (e.g., *term ). Default is false . |
defaultOperator | AND , OR | Sets the default boolean operator for multi-term queries. Default is OR . |
TooManyClauses
exceptions, causing query failures. This commonly occurs with:
pinot.lucene.max.clause.count
setting.