In Apache Pinot, message decoders are a core part of the real-time ingestion pipeline. They transform raw byte-stream messages from streaming systems such as Apache Kafka, AWS Kinesis, or Pulsar into Pinot-compatible row objects (GenericRow) that can be indexed and stored. Data consumed from a stream can arrive in a variety of formats, such as JSON, Avro, ORC, or Thrift, and Pinot needs to know how to deserialize it into something it can work with. That’s where message decoders come in. Each decoder handles a specific serialization format and converts each message into a structured row according to the schema defined in Pinot.

Message decoders are configured per real-time table, under the streamConfigs section, where you define how incoming messages are interpreted and parsed.
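As a rough illustration of the per-table configuration described above, the following minimal sketch builds a streamConfigs map for a Kafka-backed table and prints it as JSON. The topic name, broker address, and the `build_stream_configs` helper are hypothetical, and the property keys and decoder class name are assumptions based on commonly used Pinot Kafka settings; check them against the documentation for your Pinot version before use.

```python
import json

# Hypothetical values for illustration only; replace with your own.
TOPIC = "events"
BROKERS = "localhost:9092"

# Assumed class name for the Kafka JSON decoder; verify for your Pinot version.
JSON_DECODER = "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder"


def build_stream_configs(topic: str, brokers: str, decoder_class: str) -> dict:
    """Return a streamConfigs map that names the decoder for a Kafka stream."""
    return {
        "streamType": "kafka",
        "stream.kafka.topic.name": topic,
        "stream.kafka.broker.list": brokers,
        # The decoder class tells Pinot how to turn message bytes into rows.
        "stream.kafka.decoder.class.name": decoder_class,
    }


stream_configs = build_stream_configs(TOPIC, BROKERS, JSON_DECODER)
print(json.dumps({"streamConfigs": stream_configs}, indent=2))
```

Under these assumptions, switching to a different serialization format would mostly mean changing the decoder class name, plus any decoder-specific properties that format requires.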

Supported Message Decoders in Pinot

Apache Pinot supports a range of message decoders:

JSON Decoder

Used to decode JSON-formatted messages from Kafka topics

Avro Decoder

Used when consuming Avro-encoded messages from Kafka topics

Protobuf Decoder

Used to decode Protobuf-encoded messages from Kafka topics

Debezium Decoder

Used to configure a Pinot table to consume a Debezium-formatted streaming source

DynamoDB Decoder

Used to configure a Pinot table to consume a DynamoDB-formatted streaming source

Prometheus Decoder

Used to ingest Prometheus-formatted metrics data