Connect to Amazon Kinesis
Create a connection to ingest real-time data from Amazon Kinesis.
Step 1: In the Data Portal, click Tables and then click Create Table.
Step 2: Select Kinesis as the Data Source.
Step 3: Create a New Connection.
Click New Connection. If you want to use an existing connection, select the connection from the list and proceed to Step 5.
Enter a Source Name for the new connection and specify the Broker URL.
Step 4: Configure Connection Parameters
Connecting to Kinesis Using Basic Authentication
Use the following JSON configuration when Kinesis is set up with basic authentication using an access key and secret key.
Property Descriptions
Property | Required | Description |
---|---|---|
region | Yes | The AWS region (e.g., us-east-1) where the Kinesis stream is located. |
accessKey | Yes | The AWS access key used for authentication. |
secretKey | Yes | The AWS secret key used for authentication (paired with the access key). |
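As an illustration, the three properties above could be combined as in the following sketch. The values are placeholders, and the exact shape expected by the Data Portal is an assumption based only on the property table; replace the region and keys with your own.

```json
{
  "region": "us-east-1",
  "accessKey": "<your-aws-access-key>",
  "secretKey": "<your-aws-secret-key>"
}
```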
Connecting to Kinesis Using IAM-Based Authentication
Follow these steps to create an IAM role for the Kinesis setup.
Use the following JSON configuration when Kinesis is set up with IAM-based authentication, in which Pinot assumes an IAM role to access the stream securely.
Property Descriptions
Property | Required | Description |
---|---|---|
region | Yes | The AWS region (e.g., us-east-1) where the Kinesis stream is located. |
iamRoleBasedAccessEnabled | Yes | Set to true when IAM role-based authentication is used. |
roleArn | Yes | The ARN of the IAM role that Pinot will assume to access Kinesis. |
roleSessionName | Yes | An identifier used when assuming an IAM role via AWS Security Token Service to access the Kinesis stream securely. It can help track usage in AWS CloudTrail logs, metrics, etc. |
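A minimal sketch of an IAM-based configuration, assembled from the properties above. The account ID, role name, and session name are hypothetical placeholders, and writing true as a quoted string (rather than a JSON boolean) is an assumption; check which form your environment accepts.

```json
{
  "region": "us-east-1",
  "iamRoleBasedAccessEnabled": "true",
  "roleArn": "arn:aws:iam::123456789012:role/<your-role-name>",
  "roleSessionName": "<your-session-name>"
}
```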
Step 5: Test the Connection and Configure Data Ingestion
After configuring the connection properties, test the connection to ensure it functions properly. Once the connection is successful, configure the additional data settings using the following JSON format:
Property Descriptions
Property | Required | Description |
---|---|---|
stream.kinesis.topic.name | Yes | The name of the Kinesis stream from which Pinot will consume data (the property name uses "topic" for the stream). |
stream.kinesis.decoder.prop.format | Yes | The format of the Kinesis messages. Supported values include csv, json, avro, protobuf, parquet, etc. |
stream.kinesis.decoder.class.name | Yes | The class name of the decoder used for Kinesis message parsing. It is set based on the message format and schema. Examples: for JSON messages, org.apache.pinot.plugin.inputformat.json.JSONMessageDecoder; for Avro messages, org.apache.pinot.plugin.inputformat.avro.confluent.KafkaConfluentSchemaRegistryAvroMessageDecoder; for Protobuf messages, org.apache.pinot.plugin.inputformat.protobuf.KafkaConfluentSchemaRegistryProtoBufMessageDecoder. See Message Decoders for more information. |
stream.kinesis.consumer.factory.class.name | Yes | The Kinesis consumer factory class to use. The default class is org.apache.pinot.plugin.stream.kinesis.KinesisConsumerFactory for AWS Kinesis. |
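For a stream carrying JSON messages, the data settings above might be filled in as in this sketch. The stream name is a placeholder, and the decoder and factory class names are the ones listed in the table; other formats would need the matching decoder class.

```json
{
  "stream.kinesis.topic.name": "<your-stream-name>",
  "stream.kinesis.decoder.prop.format": "json",
  "stream.kinesis.decoder.class.name": "org.apache.pinot.plugin.inputformat.json.JSONMessageDecoder",
  "stream.kinesis.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kinesis.KinesisConsumerFactory"
}
```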
Configure Record Reader
Step 6: Preview the Data
Click Show Sample Data to preview the source data before finalizing the configuration.
Next Step
Proceed with Data Modeling.