Guide for setting up AZ-aware Kafka ingestion
This guide demonstrates how to optimize cross-Availability Zone (AZ) traffic using AZ-aware Kafka consumers in your StarTree Pinot cluster.
This guide covers the following key areas:
In a StarTree Pinot cluster, Pinot servers utilize low-level Kafka consumers to retrieve data from Kafka brokers. When a Pinot consumer operates in a different Availability Zone than the broker hosting the required partition, each fetch request generates cross-AZ network traffic.
Cross-AZ traffic for Kafka consumers creates several challenges:
Implementing AZ-aware consumption in StarTree Pinot provides:
The optimization strategy centers on implementing AZ-aware Kafka consumers using the Kafka RackAwareReplicaSelector. This approach ensures that Pinot servers preferentially consume from Kafka brokers within the same Availability Zone.
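To make the selection behaviour concrete, here is a minimal Python sketch of how a rack-aware replica selector chooses where a consumer fetches from. It is a simplified model for illustration, not Kafka's actual implementation: the `Replica` class and `select_replica` function are hypothetical names, and the third zone `aps1-az3` is made up for the demo. On the Kafka side, this behaviour generally relies on brokers advertising `broker.rack` and setting `replica.selector.class` to `org.apache.kafka.common.replica.RackAwareReplicaSelector`, with the consumer supplying `client.rack`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Replica:
    """Hypothetical model of one in-sync replica of a partition."""
    broker_id: int
    rack: Optional[str]     # the hosting broker's rack (AZ), if configured
    log_end_offset: int     # how caught-up this replica is
    is_leader: bool = False


def select_replica(replicas: List[Replica], client_rack: Optional[str]) -> Replica:
    """Pick the replica a consumer should fetch from (simplified model)."""
    leader = next(r for r in replicas if r.is_leader)
    # No rack information from the client: fall back to the leader.
    if not client_rack:
        return leader
    same_rack = [r for r in replicas if r.rack == client_rack]
    # No replica in the client's rack: fall back to the leader.
    if not same_rack:
        return leader
    # If the leader itself is in the client's rack, prefer it.
    if leader in same_rack:
        return leader
    # Otherwise choose the most caught-up replica in the client's rack.
    return max(same_rack, key=lambda r: r.log_end_offset)


replicas = [
    Replica(broker_id=1, rack="aps1-az1", log_end_offset=100, is_leader=True),
    Replica(broker_id=2, rack="aps1-az2", log_end_offset=98),
    Replica(broker_id=3, rack="aps1-az3", log_end_offset=99),  # hypothetical zone
]
chosen = select_replica(replicas, client_rack="aps1-az2")
print(f"Consumer in aps1-az2 fetches from broker {chosen.broker_id}")
```

The fallback to the leader matters: a consumer without a rack, or in a zone with no replica, still consumes, just without the same-AZ benefit.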
The key steps are:
Step 1: Implement AZ-Aware Instance Assignment
Configure the instance assignment strategy to consider Availability Zone placement when distributing workloads across the cluster.
Step 3: Configure AZ-Aware Table Settings
The first step is to set up pool-based instance assignment: servers carry a common tag (e.g., CLOUD_AZ_POOL_REALTIME), and servers in the same AZ share the same pool value. For example, assign servers in aps1-az1 the value 0, servers in aps1-az2 the value 1, and so on.
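The AZ-to-pool mapping can be sketched in Python as follows. This is an illustration under assumptions: the helper names `az_pool_ids` and `server_instance_fragment` are hypothetical, and the `tags`/`pools` layout is assumed to mirror the Apache Pinot server instance config, so verify it against your cluster before relying on it.

```python
import json
from typing import Dict, List

SERVER_TAG = "CLOUD_AZ_POOL_REALTIME"  # tag name taken from the guide


def az_pool_ids(zones: List[str]) -> Dict[str, int]:
    """Give each distinct AZ a stable pool id: sorted order -> 0, 1, 2, ..."""
    return {zone: pool for pool, zone in enumerate(sorted(set(zones)))}


def server_instance_fragment(zone: str, pools: Dict[str, int]) -> dict:
    """Tag and pool portion of a server instance config (assumed layout)."""
    return {"tags": [SERVER_TAG], "pools": {SERVER_TAG: pools[zone]}}


# One entry per server; servers in the same AZ end up in the same pool.
server_zones = ["aps1-az1", "aps1-az2", "aps1-az1", "aps1-az2"]
pools = az_pool_ids(server_zones)
for zone in sorted(pools):
    print(zone, json.dumps(server_instance_fragment(zone, pools)))
```

Sorting the zone names before numbering keeps the pool ids stable across runs, so adding a server in an existing AZ does not reshuffle the other pools.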
When creating the real-time table, configure the Kafka consumer's client.rack property so that it carries the server's AZ.
The CLOUD_AZ environment variable is set automatically on the servers and contains the corresponding cloud zone information.
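The intended effect can be modelled with a short Python sketch: each server takes the table's Kafka stream settings and adds `client.rack` from its own `CLOUD_AZ` value. This only models the outcome. How the table config references `CLOUD_AZ` is not shown here and should be taken from the StarTree documentation. The function `consumer_props_for_server`, the topic name, and the broker address are hypothetical.

```python
import json
from typing import Mapping


def consumer_props_for_server(stream_configs: Mapping[str, str],
                              env: Mapping[str, str]) -> dict:
    """Model of the consumer properties one server ends up with."""
    rack = env.get("CLOUD_AZ", "")
    if not rack:
        # Without a zone the consumer would silently fetch from the leader.
        raise ValueError("CLOUD_AZ is not set on this server")
    props = dict(stream_configs)   # copy; the shared table config is untouched
    props["client.rack"] = rack    # standard Kafka consumer property
    return props


base = {
    "streamType": "kafka",
    "stream.kafka.topic.name": "events",        # hypothetical topic
    "stream.kafka.broker.list": "kafka:9092",   # hypothetical broker address
}

# In a real server the environment would be os.environ; a plain dict is used
# here so the sketch behaves the same wherever it runs.
props = consumer_props_for_server(base, {"CLOUD_AZ": "aps1-az1"})
print(json.dumps(props, indent=2))
```

The key point is that the value differs per server while the table config is shared, which is why it comes from the server's environment rather than a literal zone name.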
For pool-based instance assignment, add a CONSUMING entry to instanceAssignmentConfigMap that uses the tag CLOUD_AZ_POOL_REALTIME and enables poolBased:
Example config:
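A minimal sketch of what such a config might look like, built in Python and printed as JSON. The `tagPoolConfig` and `replicaGroupPartitionConfig` field names follow the Apache Pinot table config layout. The replica-group settings and their values are illustrative assumptions rather than recommendations from this guide, and `consuming_instance_assignment` is a hypothetical helper.

```python
import json


def consuming_instance_assignment(tag: str,
                                  num_replica_groups: int,
                                  num_instances_per_replica_group: int) -> dict:
    """Build the instanceAssignmentConfigMap section for CONSUMING segments."""
    return {
        "instanceAssignmentConfigMap": {
            "CONSUMING": {
                # Select servers by tag and group them by pool (one pool per AZ).
                "tagPoolConfig": {"tag": tag, "poolBased": True},
                # Assumed: replica-group assignment, which in pool-based mode
                # draws each replica group from a single pool.
                "replicaGroupPartitionConfig": {
                    "replicaGroupBased": True,
                    "numReplicaGroups": num_replica_groups,
                    "numInstancesPerReplicaGroup": num_instances_per_replica_group,
                },
            }
        }
    }


config = consuming_instance_assignment(
    tag="CLOUD_AZ_POOL_REALTIME",       # tag name from the guide
    num_replica_groups=2,               # illustrative value
    num_instances_per_replica_group=2,  # illustrative value
)
print(json.dumps(config, indent=2))
```

The printed section would be merged into the real-time table config alongside the stream settings.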
This guide addresses cross-Availability Zone network traffic in StarTree Pinot clusters. AZ-aware consumption is enabled by configuring pool-based instance assignment and setting the Kafka consumer's client.rack property to the server's AZ. Results show same-AZ traffic increasing from 50% to 96-98% across all tested zones, yielding significant cost savings and improved system performance.