In Apache Pinot, message decoders play a crucial role in the real-time ingestion pipeline. They transform the raw message bytes consumed from streaming systems such as Apache Kafka, Amazon Kinesis, or Apache Pulsar into Pinot row objects (GenericRow) that can be indexed and stored.

Data consumed from a stream can arrive in a variety of serialization formats, such as JSON, Avro, Protobuf, or Thrift, and Pinot needs to know how to deserialize each message into something it can work with. That’s where message decoders come in. Each decoder handles one serialization format and converts every message into a structured row whose fields match the Pinot schema of the table.
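To make the idea concrete, here is a minimal, self-contained sketch of what a decoder does: it takes the raw bytes of one message and returns a row of named fields. The names `MessageDecoderSketch`, `DelimitedDecoderSketch`, and the comma-delimited format are hypothetical and exist only for illustration. Pinot's own decoders implement its `StreamMessageDecoder` interface and populate a `GenericRow` rather than a plain `Map`.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a decoder contract (not Pinot's actual API):
// raw message bytes in, one structured row out.
interface MessageDecoderSketch {
    Map<String, Object> decode(byte[] payload);
}

// Illustrative decoder for a comma-delimited payload with a fixed column order.
class DelimitedDecoderSketch implements MessageDecoderSketch {
    private final List<String> columns;

    DelimitedDecoderSketch(List<String> columns) {
        this.columns = columns;
    }

    @Override
    public Map<String, Object> decode(byte[] payload) {
        String text = new String(payload, StandardCharsets.UTF_8);
        // Limit -1 keeps trailing empty fields so the column count stays accurate.
        String[] values = text.split(",", -1);
        if (values.length != columns.size()) {
            // Malformed message: signal "skip this row" to the caller.
            return null;
        }
        Map<String, Object> row = new LinkedHashMap<>();
        for (int i = 0; i < values.length; i++) {
            row.put(columns.get(i), values[i].trim());
        }
        return row;
    }
}

class DecoderSketchDemo {
    public static void main(String[] args) {
        MessageDecoderSketch decoder =
            new DelimitedDecoderSketch(List.of("userId", "event", "ts"));
        byte[] message = "42,click,1700000000".getBytes(StandardCharsets.UTF_8);
        // Prints {userId=42, event=click, ts=1700000000}
        System.out.println(decoder.decode(message));
    }
}
```

Returning `null` for a malformed message is a design choice of this sketch: it lets the consumer skip a bad record instead of failing the whole stream.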

Message decoders are configured per real-time table in the streamConfigs section of the table config. There, a decoder class name property (for Kafka, stream.kafka.decoder.class.name) tells Pinot which decoder to use to interpret and parse incoming messages.
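As a rough illustration, a streamConfigs fragment for a Kafka topic carrying JSON messages might look like the following. The topic name and broker address are placeholders, and the class names shown are the ones commonly used with the Kafka 2.x plugin, so check them against the Pinot version you run.

```json
{
  "streamConfigs": {
    "streamType": "kafka",
    "stream.kafka.topic.name": "events",
    "stream.kafka.broker.list": "localhost:9092",
    "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
    "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder"
  }
}
```

Switching to a different message format is then mostly a matter of changing the decoder class name and adding any properties that decoder needs.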

Supported Message Decoders in Pinot

Apache Pinot supports a range of message decoders: