Description
Component(s)
receiver/kafka
Is your feature request related to a problem? Please describe.
Support configuring the sticky rebalance strategy (`Consumer.Group.Rebalance.Strategy = sticky`) in the Kafka receiver, which is implemented on top of the IBM Sarama client.
This includes:
- Allow configuring the consumer group rebalance strategy (range, roundrobin, or sticky)
- Optionally support `Group.InstanceId` to enable static consumer membership (per KIP-345)
The kafkareceiver currently uses the IBM Sarama client with the default range rebalance strategy for consumer group coordination. This often leads to uneven partition assignments and large-scale rebalances when pods are restarted or scaled, causing unnecessary cache reloading, CPU spikes, and latency as metric metadata is recomputed or fetched again.
This is especially problematic in large-scale OpenTelemetry Collector deployments that rely on consistent partition ownership for optimized caching and reduced memory churn.
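To illustrate, here is a minimal sketch of the default Sarama behavior the receiver inherits today, assuming a recent IBM/sarama release (older versions expose the equivalent, now-deprecated `Consumer.Group.Rebalance.Strategy` field instead):

```go
package main

import (
	"fmt"

	"github.com/IBM/sarama"
)

func main() {
	cfg := sarama.NewConfig()
	// Sarama pre-populates the consumer group rebalance strategy with
	// range; the receiver never overrides it today.
	fmt.Println(cfg.Consumer.Group.Rebalance.GroupStrategies[0].Name()) // prints "range"
}
```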
Describe the solution you'd like
Expose support for the sticky rebalance strategy (`sarama.NewBalanceStrategySticky()`) in the kafkareceiver via the IBM Sarama client.
Specifically:
- Add a configuration option in kafkareceiver that sets the Sarama client's `Consumer.Group.Rebalance.Strategy` (see the sketch after the example config below)
- Optionally allow specifying `Group.InstanceId` to leverage static membership (e.g., group_instance_id: ${POD_NAME} for StatefulSets)
- Default to the current range strategy when no value is provided, to maintain backward compatibility
Example config:

```yaml
protocol_version: 3.2.1
group_id: otel-metrics-group
group_rebalance_strategy: sticky # optional, new property; defaults to range; possible values: range, roundrobin, sticky
group_instance_id: ${POD_NAME} # optional, new property; enables static membership
```
Note: `Group.InstanceId` requires Kafka brokers 2.3.0 or newer (static membership was introduced by KIP-345).
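A minimal sketch of how these two settings could be wired into Sarama, assuming a recent IBM/sarama release; the helper name and the receiver config field names are hypothetical, mirroring the example config above:

```go
package kafkareceiver

import (
	"fmt"

	"github.com/IBM/sarama"
)

// applyGroupRebalanceConfig maps the proposed receiver settings onto a
// Sarama client config. Hypothetical helper; not part of the current code.
func applyGroupRebalanceConfig(cfg *sarama.Config, strategy, instanceID string) error {
	switch strategy {
	case "", "range": // default, preserves today's behavior
		cfg.Consumer.Group.Rebalance.GroupStrategies = []sarama.BalanceStrategy{sarama.NewBalanceStrategyRange()}
	case "roundrobin":
		cfg.Consumer.Group.Rebalance.GroupStrategies = []sarama.BalanceStrategy{sarama.NewBalanceStrategyRoundRobin()}
	case "sticky":
		cfg.Consumer.Group.Rebalance.GroupStrategies = []sarama.BalanceStrategy{sarama.NewBalanceStrategySticky()}
	default:
		return fmt.Errorf("unsupported group_rebalance_strategy %q", strategy)
	}
	// Static membership (KIP-345); requires brokers >= 2.3.0.
	if instanceID != "" {
		cfg.Consumer.Group.InstanceId = instanceID
	}
	return nil
}
```

Keeping the empty string mapped to range would make the new option purely additive for existing deployments.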
This would allow consumers to maintain a more consistent partition-to-replica assignment across restarts and reduce the operational load during scaling events.
Describe alternatives you've considered
- Patching the Sarama client config in a forked Collector build (the workaround currently in use)
- Implementing sticky logic at the Kafka broker level: not viable, since partition assignment is always determined by the clients
Additional context
Sticky balancing in Sarama:
https://github.com/IBM/sarama/blob/main/balance_strategy.go
https://github.com/IBM/sarama/blob/main/consumer_group.go
KafkaReceiver implementation:
https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/internal/kafka/client.go
https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/kafkareceiver
This enhancement would help large-scale OTel deployments (millions of unique time series) reduce rebalance impact and improve cache and CPU efficiency.