Component(s)
receiver/kafka

What happened?

Description
There is a bug in the new version of Sarama that can cause consumers to lose messages when consuming from Azure Event Hubs: IBM/sarama#2677

Hopefully this bug will be fixed soon in Sarama, but in the meantime we are filing this issue so people are aware of it. We are downgrading Sarama to 1.40.1 to work around the issue.
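For anyone who builds a custom collector, the downgrade can be applied at build time. The snippet below is a minimal sketch of an OpenTelemetry Collector Builder (ocb) manifest that pins Sarama back to v1.40.1 through a `replaces` entry; the distribution name and the single receiver entry are illustrative assumptions, not our actual build configuration.

```yaml
# Minimal sketch (illustrative, not our actual builder manifest): pin Sarama
# below the versions affected by IBM/sarama#2677 when building a custom collector.
dist:
  name: otelcol-custom            # placeholder distribution name
  otelcol_version: 0.87.0

receivers:
  - gomod: github.com/open-telemetry/opentelemetry-collector-contrib/receiver/kafkareceiver v0.87.0

replaces:
  # Force the Go module graph to use the last Sarama release before the regression.
  - github.com/IBM/sarama => github.com/IBM/sarama v1.40.1
```

Once IBM/sarama#2677 is fixed and the collector picks up a patched Sarama release, the pin can be dropped.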
Steps to Reproduce
Use the kafka receiver with Azure Event Hubs.

Expected Result
The kafka receiver consumes all messages.

Actual Result
The kafka receiver skips some messages (observed in Azure metrics).

Collector version
0.87.0

Environment information
Environment
OS: (e.g., "Ubuntu 20.04")
Compiler (if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration
No response
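No collector configuration was attached, but for context, a minimal kafka receiver configuration pointing at an Event Hubs Kafka endpoint typically looks like the sketch below. The namespace, topic, consumer group, and connection string are placeholders, not values from our deployment.

```yaml
receivers:
  kafka:
    brokers:
      - my-namespace.servicebus.windows.net:9093   # Event Hubs Kafka endpoint (placeholder namespace)
    protocol_version: 2.0.0
    topic: otlp_metrics          # event hub name used as the Kafka topic (placeholder)
    group_id: otel-collector     # consumer group (placeholder)
    auth:
      sasl:
        mechanism: PLAIN
        username: "$ConnectionString"   # literal value required by Event Hubs
        password: "Endpoint=sb://my-namespace.servicebus.windows.net/;SharedAccessKeyName=...;SharedAccessKey=..."   # placeholder connection string
      tls: {}
```

Event Hubs exposes its Kafka-compatible endpoint on port 9093 over TLS and authenticates with SASL PLAIN using the literal username $ConnectionString, which is why the stock kafka receiver can consume from it.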
Log output
For the impacted pods we observe repeated log entries like the following, which were not present in the previous version, implying that the consumer group is being reset periodically for some reason. The Sarama bug indicates that when consumers are reset they lose the initial offset:
2023-10-25T13:52:32.922033081Z stderr F 2023-10-25T13:52:32.921Z info kafkareceiver@v0.87.0/kafka_receiver.go:561 Starting consumer group {"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T13:52:51.404728652Z stderr F 2023-10-25T13:52:51.404Z info kafkareceiver@v0.87.0/kafka_receiver.go:561 Starting consumer group {"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T14:02:33.817672442Z stderr F 2023-10-25T14:02:33.817Z info kafkareceiver@v0.87.0/kafka_receiver.go:561 Starting consumer group {"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T14:03:22.084838355Z stderr F 2023-10-25T14:03:22.084Z info kafkareceiver@v0.87.0/kafka_receiver.go:561 Starting consumer group {"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T14:12:54.337646163Z stderr F 2023-10-25T14:12:54.337Z info kafkareceiver@v0.87.0/kafka_receiver.go:561 Starting consumer group {"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T14:22:31.602084342Z stderr F 2023-10-25T14:22:31.601Z info kafkareceiver@v0.87.0/kafka_receiver.go:561 Starting consumer group {"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T14:32:22.996075195Z stderr F 2023-10-25T14:32:22.995Z info kafkareceiver@v0.87.0/kafka_receiver.go:561 Starting consumer group {"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T14:42:23.675027732Z stderr F 2023-10-25T14:42:23.674Z info kafkareceiver@v0.87.0/kafka_receiver.go:561 Starting consumer group {"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T14:42:54.389087635Z stderr F 2023-10-25T14:42:54.388Z info kafkareceiver@v0.87.0/kafka_receiver.go:561 Starting consumer group {"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T14:52:52.134123849Z stderr F 2023-10-25T14:52:52.134Z info kafkareceiver@v0.87.0/kafka_receiver.go:561 Starting consumer group {"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T15:02:38.449207265Z stderr F 2023-10-25T15:02:38.448Z info kafkareceiver@v0.87.0/kafka_receiver.go:561 Starting consumer group {"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T15:04:20.706895353Z stderr F 2023-10-25T15:04:20.706Z info kafkareceiver@v0.87.0/kafka_receiver.go:561 Starting consumer group {"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T15:12:38.396434507Z stderr F 2023-10-25T15:12:38.396Z info kafkareceiver@v0.87.0/kafka_receiver.go:561 Starting consumer group {"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T15:12:38.567983529Z stderr F 2023-10-25T15:12:38.567Z info kafkareceiver@v0.87.0/kafka_receiver.go:561 Starting consumer group {"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T15:12:50.681064864Z stderr F 2023-10-25T15:12:50.680Z info kafkareceiver@v0.87.0/kafka_receiver.go:561 Starting consumer group {"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T15:33:23.946760942Z stderr F 2023-10-25T15:33:23.946Z info kafkareceiver@v0.87.0/kafka_receiver.go:561 Starting consumer group {"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T15:42:50.885315129Z stderr F 2023-10-25T15:42:50.885Z info kafkareceiver@v0.87.0/kafka_receiver.go:561 Starting consumer group {"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T15:53:07.19624621Z stderr F 2023-10-25T15:53:07.195Z info kafkareceiver@v0.87.0/kafka_receiver.go:561 Starting consumer group {"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T16:02:37.35429016Z stderr F 2023-10-25T16:02:37.354Z info kafkareceiver@v0.87.0/kafka_receiver.go:561 Starting consumer group {"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
Additional context
No response
ben-childs-docusign changed the title from "losing messages when consuming from EventHub due to Sarama issue" to "[kafkareceiver] losing messages when consuming from EventHub due to Sarama issue" on Oct 25, 2023.
@ben-childs-docusign To make sure I understand right: There's nothing to do for the collector, but you've filed this issue as a notice to anyone using this configuration to let them know why they're seeing the above errors?