Message Decoders
In Apache Pinot, message decoders are the step in the real-time ingestion pipeline that transforms raw message bytes from streaming systems such as Apache Kafka, Amazon Kinesis, or Apache Pulsar into Pinot-compatible row objects (GenericRow) that can be indexed and stored.
Data consumed from a stream can arrive in a variety of formats, such as JSON, Avro, ORC, or Thrift, and Pinot needs to know how to deserialize it before it can be ingested. That is the job of a message decoder: each decoder handles one serialization format and converts every message into a structured row that matches the schema defined in Pinot.
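As a rough illustration of what a decoder does, the sketch below turns one JSON message payload into a row keyed by schema column name. It is a conceptual model in Python, not Pinot's decoder API: the function name `decode_json_message`, the `schema_fields` parameter, and the choice to return `None` for unparseable messages are all assumptions made for this example.

```python
import json
from typing import Any, Dict, List, Optional


def decode_json_message(payload: bytes, schema_fields: List[str]) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: turn one raw stream message into a row dict."""
    try:
        record = json.loads(payload.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        # Assumption for this sketch: messages that cannot be parsed are skipped.
        return None
    if not isinstance(record, dict):
        # Only JSON objects map naturally onto a row of named columns.
        return None
    # Keep only the columns the schema declares; absent columns become None.
    return {field: record.get(field) for field in schema_fields}


row = decode_json_message(
    b'{"userId": 42, "city": "Oslo", "extra": true}',
    ["userId", "city", "ts"],
)
print(row)  # {'userId': 42, 'city': 'Oslo', 'ts': None}
```

The `extra` field is dropped because the example schema does not declare it, and `ts` is filled with `None` because the message does not carry it.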
Message decoders are configured per real-time table, under the streamConfigs section, and allow you to define how incoming messages are interpreted and parsed.
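The sketch below builds a minimal streamConfigs fragment for a Kafka-backed table and prints it as JSON. The topic name and broker address are placeholders, and the decoder class name shown is the one commonly used for JSON messages on Kafka; class names and the exact location of streamConfigs inside the table config can differ between Pinot versions, so treat this as an assumption to check against your release. A working table config also needs further stream properties that are omitted here.

```python
import json

# Placeholder values: the topic name and broker address are hypothetical.
stream_configs = {
    "streamType": "kafka",
    "stream.kafka.topic.name": "events",
    "stream.kafka.broker.list": "localhost:9092",
    # The decoder is selected by its fully qualified class name.
    # Assumed class for JSON messages on Kafka; verify it for your Pinot version.
    "stream.kafka.decoder.class.name": (
        "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder"
    ),
}

# Render the fragment as it would appear inside a table config.
print(json.dumps({"streamConfigs": stream_configs}, indent=2))
```

Switching to a different message format would then mean changing the decoder class name, along with any decoder-specific properties that format requires.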
Supported Message Decoders in Pinot
Apache Pinot supports a range of message decoders:
JSON Decoder
Used to decode JSON-formatted messages from Kafka topics
Avro Decoder
Used when consuming Avro-encoded messages from Kafka topics
Protobuf Decoder
Used to decode Protobuf-encoded messages from Kafka topics
Debezium Decoder
Used to configure a Pinot table to use a Debezium-formatted streaming source
DynamoDB Decoder
Used to configure a Pinot table to use a DynamoDB-formatted streaming source
Prometheus Decoder
Used to provide support for ingesting Prometheus-formatted metrics data