Connect to Any Data Source

Step 1: Select the Data Source Category

To begin, select the data source category that best matches the system you are connecting to. The category identifies the ingestion mode and ensures that the correct configuration options are applied.

Batch Sources

Select this option for file-based systems, such as:
- AWS S3
- Google Cloud Storage (GCS)
- Azure Blob Storage
- HDFS
- Local file systems
These sources typically provide static data that can be ingested periodically.

Streaming Sources

Choose this category for real-time streaming data sources, such as:
- Apache Kafka
- Amazon Kinesis
- Apache Pulsar
These sources continuously generate data, requiring Pinot to process events in real time.
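As a rough illustration of what a streaming connection involves, the sketch below builds a minimal Kafka stream configuration in the style of open-source Apache Pinot's `streamConfigs` map. The helper name, topic, and broker address are hypothetical, the class names vary by Pinot and Kafka plugin version, and the connection form on this page may label these fields differently.

```python
import json


def kafka_stream_config(topic: str, brokers: str) -> dict:
    """Return a minimal Kafka stream config map (illustrative, not exhaustive)."""
    return {
        "streamType": "kafka",
        "stream.kafka.topic.name": topic,
        "stream.kafka.broker.list": brokers,
        # Plugin class names below depend on the Pinot/Kafka plugin version in use.
        "stream.kafka.consumer.factory.class.name":
            "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
        "stream.kafka.decoder.class.name":
            "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
    }


# Hypothetical topic and broker values, for illustration only.
config = kafka_stream_config("events", "localhost:9092")
print(json.dumps(config, indent=2))
```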

SQL-Based Sources

Use this option when connecting to SQL-queryable data warehouses, such as:
- Snowflake
- Google BigQuery
This setup allows Pinot to fetch data using SQL queries.
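The three categories above can be summarized as a simple lookup. This is only a sketch of the grouping described on this page; the dictionary and function names are hypothetical and do not correspond to any product API.

```python
# Source systems grouped by the category named on this page.
CATEGORY_BY_SOURCE = {
    "AWS S3": "Batch",
    "Google Cloud Storage (GCS)": "Batch",
    "Azure Blob Storage": "Batch",
    "HDFS": "Batch",
    "Local file systems": "Batch",
    "Apache Kafka": "Streaming",
    "Amazon Kinesis": "Streaming",
    "Apache Pulsar": "Streaming",
    "Snowflake": "SQL-Based",
    "Google BigQuery": "SQL-Based",
}


def category_for(source: str) -> str:
    """Return the category for a listed source, or raise for an unlisted one."""
    try:
        return CATEGORY_BY_SOURCE[source]
    except KeyError:
        raise ValueError(f"No category listed for source: {source!r}") from None


print(category_for("Apache Kafka"))
print(category_for("Snowflake"))
```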

Step 2: Configure the Data Source Connection

Once you have selected a data source category, enter the required connection details. The settings depend on the type of source you are connecting to and should align with Apache Pinot’s ingestion configuration. After entering the details, validate the connection.
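Before validating, it can help to confirm that no required setting was left blank. The sketch below shows one way to do that check locally; the field names and the per-category requirements are assumptions made for illustration, not the actual fields of the connection form.

```python
# Hypothetical required fields per category (illustrative only).
REQUIRED_FIELDS = {
    "Batch": ["input_uri", "file_format"],
    "Streaming": ["broker_list", "topic"],
    "SQL-Based": ["connection_url", "username", "query"],
}


def missing_fields(category: str, config: dict) -> list:
    """Return the required fields that are absent or empty in config."""
    return [name for name in REQUIRED_FIELDS[category] if not config.get(name)]


# A streaming config that is missing its topic.
draft = {"broker_list": "localhost:9092", "topic": ""}
print(missing_fields("Streaming", draft))
```

An empty list means every assumed required field has a value, so the connection is ready to validate.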

Step 3: Preview the Data

Click Show Sample Data to preview the source data before finalizing the configuration.

Next Step

Proceed with Data Modeling.
