Prerequisites

Docker

Makefile

Load Documentation

Ask Your Questions

Clean up

Troubleshooting

Real-Time RAG Pinot

StarTree Docs

Real-time analytics enable users to make better and more timely decisions. Real-time analytics products pull data in as soon as it happens, pull data out as soon as it gets pulled in, and do so at scale.

What is Real-Time Analytics?

Ingest data from various streaming and batch sources using connectors. The StarTree Data Portal makes it easy to ingest and transform data.

Overview

Ingest Data

Connect to any data source using the custom connector. Configure the connector to use a batch or streaming source.

Connect to Any Data Source

Configure advanced settings for the table in the Data Portal, including ingestion behavior and time partitioning for Apache Pinot tables. These are crucial for correct table functioning and data integrity in Apache Pinot.

Additional Configuration

Review the overall table configuration and create the table in the last step.

Create Table

Learn about off-heap upserts and how to use them in StarTree

Off-Heap Upserts

Learn about off-heap dedup and how to use it in StarTree

Off-Heap Dedup

Off-Heap Deduplication

Hybrid Tables

Connect either Tableau Desktop or Tableau Server to your StarTree Cloud instance.

Tableau

Connect Tableau to StarTree Cloud

Connect Superset to StarTree Cloud to visualize your data.

Superset

Connect Superset to StarTree Cloud

The StarTree Security Manager provides a centralized location for managing access control within your StarTree environment.

Security Manager

Manage secure, fine-grained access to StarTree resources with Role-Based Access Control (RBAC), enabling administrators to define custom policies, create roles, and assign them to users or groups through the Security Manager interface integrated with your organization's Identity Provider (IDP).

Managing Access

API Token Management

Define and implement custom RBAC policies in StarTree Cloud by configuring granular permissions using StarTree Resource Names (SRNs), supporting over 150 distinct actions for precise access control across environments, clusters, and tables. 

Custom Policy Configuration

There are several commonly used actions that can be specified in a policy. 

Actions

Configure using the RBAC Manager API. RBAC allows you to define granular permissions for users, groups, and service tokens.

Using RBAC API

Using the StarTree Role-Based Access Control (RBAC) API

0.10.1 Release Version 0.10.1: February 2025

Definitions for real-time user-facing analytics, StarTree products, and platforms

Glossary

Cluster Health

StarTree Cloud Cluster Health Dashboard

Getting Started with ThirdEye

ThirdEye Architecture

ThirdEye Entities

ThirdEye Worker Management

DataFetcher

Dimension Exploration

EventFetcher

SqlExecution

TimeIndexFiller

Optimize Performance

Use Pre-Computed Time Columns

Add a Notification System

Subscription Groups and Notification Systems

Alert Tuning

Use Anomaly Filters to Tune Alerts

Root Cause Analysis

How to perform root cause analysis

Manage Holiday Effects

Manage Holiday Effects with StarTree ETS

Using ThirdEye API

How to use the ThirdEye API

Access Control in ThirdEye

Frequently Asked Questions

Observability & Monitoring

ThirdEye Observability and Monitoring

Alert Configuration

Alert Configuration and Execution

Scaling Workers

Dimensions Recommender

ThirdEye Dimensions Recommender

Aggregation Functions

Resources

Download Recipes

Clickstream Analytics Dashboard

Clickstream Analytics Dashboard with StarTree Cloud Free Tier and Streamlit

Learn about StarTree Cloud APIs for managing tables and querying data at scale.

Introduction

Run SQL queries on StarTree Cloud's real-time analytics engine.

Query Data

Simulates a preview of the table given the table config and schema

preview

Lists all tables in cluster

Creates a segment from the given file and pushes it to Pinot.
All steps happen on the controller. This API is NOT meant for production environments or large input files.
Example usage (query params need encoding):
```
curl -X POST -F file=@data.json -H "Content-Type: multipart/form-data" "http://localhost:9000/ingestFromFile?tableNameWithType=foo_OFFLINE&
batchConfigMapStr={
  "inputFormat":"csv",
  "recordReader.prop.delimiter":"|"
}" 
```

Ingest a file
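Because the query parameters in the example above need encoding, the following is a minimal Python sketch of one way to build the encoded URL. The helper name `build_ingest_url` is hypothetical, and the host, table name, and config values are simply copied from the curl example; the sketch only constructs the URL and does not send a request.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def build_ingest_url(base_url, table_name_with_type, batch_config):
    """Return an ingestFromFile URL with percent-encoded query parameters.

    Hypothetical helper for illustration; it is not part of any Pinot client.
    """
    query = urlencode({
        "tableNameWithType": table_name_with_type,
        # Serialize the config map to compact JSON before encoding it.
        "batchConfigMapStr": json.dumps(batch_config, separators=(",", ":")),
    })
    return f"{base_url}/ingestFromFile?{query}"


# Values taken from the curl example above.
batch_config = {"inputFormat": "csv", "recordReader.prop.delimiter": "|"}
url = build_ingest_url("http://localhost:9000", "foo_OFFLINE", batch_config)
print(url)

# Round trip: decoding the query string recovers the original config map.
decoded = parse_qs(urlsplit(url).query)
print(json.loads(decoded["batchConfigMapStr"][0]) == batch_config)
```

The resulting string can be passed to curl in place of the unencoded URL, so that braces, quotes, and the `|` delimiter are not misread by the shell or the server.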

Creates a segment from the file at the given URI and pushes it to Pinot.
All steps happen on the controller. This API is NOT meant for production environments or large input files.
Example usage (query params need encoding):
```
curl -X POST "http://localhost:9000/ingestFromURI?tableNameWithType=foo_OFFLINE
&batchConfigMapStr={
  "inputFormat":"json",
  "input.fs.className":"org.apache.pinot.plugin.filesystem.S3PinotFS",
  "input.fs.prop.region":"us-central",
  "input.fs.prop.accessKey":"foo",
  "input.fs.prop.secretKey":"bar"
}
&sourceURIStr=s3://test.bucket/path/to/json/data/data.json"
```

Ingest from the given URI

Get the instance partitions

Create/update the instance partitions

Remove the instance partitions

Replace an instance in the instance partitions

Assign server instances to a table

Pause the consumption of a realtime table

Pause consumption of a realtime table

Resume the consumption for a realtime table. The ConsumeFrom parameter specifies the offsets from which consumption should resume. The recommended value is 'lastConsumed', which continues from the offsets recorded in the segment ZK metadata; if those offsets are no longer available, the first available offsets are used to minimize data loss.

Resume consumption of a realtime table
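As a rough illustration of the resume call described above, the sketch below builds a request URL that passes 'lastConsumed' as the resume point. The endpoint path `/tables/{tableName}/resumeConsumption` and the query key `consumeFrom` are assumptions not stated in this text, and `build_resume_url` is a hypothetical helper; check the controller's API listing before relying on them.

```python
from urllib.parse import quote, urlencode


def build_resume_url(controller, table_name, consume_from="lastConsumed"):
    """Build a resume-consumption URL (hypothetical helper).

    ASSUMPTION: the path and the 'consumeFrom' query key are not given in
    the surrounding text and should be verified against the controller API.
    """
    # Encode the table name so characters such as '/' cannot alter the path.
    path = f"/tables/{quote(table_name, safe='')}/resumeConsumption"
    return f"{controller}{path}?{urlencode({'consumeFrom': consume_from})}"


# 'foo' and the localhost controller address are placeholder values.
print(build_resume_url("http://localhost:9000", "foo"))
```

Defaulting to 'lastConsumed' mirrors the recommendation above, so a caller has to opt in explicitly to any other resume point.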

Get status for a submitted force commit operation

Gets the segments that are in error state and segments with COMMITTING status in ZK metadata

Returns state of pauseless table

Return pause status of a realtime table along with list of consuming segments.

Return pause status of a realtime table

Force commit the current segments in consuming state and restart consumption. This should be used after schema or table config changes. Note that this is an asynchronous operation: a 200 response does not mean the commit has already completed. If specific partitions or consuming segments are provided, only those partitions or consuming segments will be force committed.

Force commit the current consuming segments

Gets the status of consumers from all servers. Note that the partitionToOffsetMap has been deprecated and will be removed in the next release. The info is now embedded within each partition's state as currentOffsetsMap.

Returns state of consuming segments

List table instances

List tables to live brokers mappings based on EV

List tables to live brokers mappings

List live brokers of the given table based on EV

List the brokers serving a table

Recommend config

Lists the table configs

Updates table config for a table

Deletes a table

This API returns the table config that matches the one you get from 'GET /tables/{tableName}'. This allows you to validate the table config before applying it.

Validate table config for a table

Rebalances a table (reassign instances and segments for a table)

Cancel all rebalance jobs for the given table, and noop if no rebalance is running

Gets detailed stats of a rebalance operation

Provides status of the table including ingestion status

table status

Get current table state

Enable/disable a table

Provides metadata info/stats about the table.

table stats

Adds a table

Get the aggregate validDocIds metadata of all segments for a table

Get the aggregate metadata of all segments for a table

Get the aggregate index details of all segments for a table

Get list of controller jobs for this table

Set hybrid table query time boundary based on offline segments' metadata

Delete hybrid table query time boundary

Rebuild broker resource for table

Validate the TableConfigs

Lists all TableConfigs in cluster

Add the TableConfigs using the tableConfigsStr json

Get the TableConfigs for a given raw tableName

Update the TableConfigs provided by the tableConfigsStr json

Delete the TableConfigs

Get table size details. Table size is the size of untarred segments including replication

Read table sizes

Get segment names to segment status map

Get table ideal state

Get table external view

Get the list of servers to restart in sequence

Get cached cluster health details for all pinot entities

Enable / disable the periodic cluster health check task. Note that this setting isn't persisted across controller restarts and /cluster/configs should be used to disable it permanently

Get all available cluster health checks and their details

Get cluster properties if deployed in AZ aware mode

Browse resources in a connection. For example, list directories and files in an S3 bucket, tables in a database, or topics in a Kafka cluster.

Recipes