DynamoDB Message Decoder Configurations
To configure a Pinot table to use a DynamoDB formatted streaming source, Pinot provides a decoder -ai.startree.pinot.plugin.inputformat.dynamodb.DynamoDbMessageDecoder
.
The properties of this decoder are listed below:
Configuration Key | Description |
---|---|
decoder.class.name | Specifies the primary decoder for DynamoDB messages. Set this to ai.startree.pinot.plugin.inputformat.dynamodb.DynamoDbMessageDecoder to enable DynamoDB CDC ingestion. |
dynamodb.timeColumnName | The column name where the ApproximateCreationDateTime from the DynamoDB JSON record should be stored. This timestamp can be used as the table’s default time column. |
dynamodb.deleteColumnName | The column name that will be set to true when a REMOVE record is received from DynamoDB, and false otherwise. This helps track deletion events in your Pinot table. |
dynamodb.envelope.decoder.class.name | Specifies the underlying decoder used to parse the message format. Since DynamoDB messages are in JSON format, this should typically be set to org.apache.pinot.plugin.inputformat.json.JSONMessageDecoder . |
dynamodb.envelope.decoder.prop. | Prefix to be used for any properties associated with the envelope decoder class. |
Configuration Example
When ingesting a DynamoDB formatted payload from a stream, the decoder used for the stream must beai.startree.pinot.plugin.inputformat.dynamodb.DynamoDbMessageDecoder
.
The following is an example stream config where the Pinot table is consuming from a JSON-encoded Kafka topic containing DynamoDB CDC payload:
org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory
and the decoder associated with this stream is ai.startree.pinot.plugin.inputformat.dynamodb.DynamoDbMessageDecoder
.
The configuration uses several key components:
- The primary decoder
ai.startree.pinot.plugin.inputformat.dynamodb.DynamoDbMessageDecoder
handles the DynamoDB-specific message format - The
dynamodb.timeColumnName
is populated with theApproximateCreationDateTime
from the DynamoDB JSON record - The
dynamodb.deleteColumnName
is set totrue
whenREMOVE
records are received from DynamoDB - The
dynamodb.envelope.decoder.class.name
is set toorg.apache.pinot.plugin.inputformat.json.JSONMessageDecoder
since the underlying DynamoDB messages are in JSON format