[kafkareceiver] losing messages when consuming from EventHub due to Sarama issue #28620

Closed
ben-childs-docusign opened this issue Oct 25, 2023 · 4 comments
Labels
admin issues tracker issues etc. bug Something isn't working receiver/kafka

Comments

@ben-childs-docusign
Contributor

ben-childs-docusign commented Oct 25, 2023

Component(s)

receiver/kafka

What happened?

Description

The new version of Sarama has a bug that can cause consumers to lose messages when subscribed to EventHub:
IBM/sarama#2677

Hopefully this will be fixed in Sarama soon, but in the meantime I am logging this issue so people are aware of it. We are downgrading Sarama to 1.40.1 to work around the issue.

Steps to Reproduce

Use the Kafka receiver with Azure EventHubs.

Expected Result

The Kafka receiver consumes all messages.

Actual Result

The Kafka receiver skips some messages (observed in Azure metrics).
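
For anyone who wants to confirm skipped messages on the consumer side rather than from Azure metrics, here is a minimal sketch. It assumes you can record the offsets a consumer actually received for one partition, in arrival order, and that offsets on that partition are normally contiguous (this is an assumption, not something confirmed in this issue). The names `offsetGap` and `findGaps` are hypothetical and not part of the collector or Sarama.

```go
package main

import "fmt"

// offsetGap describes a run of offsets that never reached the consumer.
// Both bounds are inclusive. (Hypothetical type, for illustration only.)
type offsetGap struct {
	From, To int64
}

// findGaps walks the offsets received from a single partition, in arrival
// order, and reports every forward jump larger than one.
func findGaps(offsets []int64) []offsetGap {
	var gaps []offsetGap
	for i := 1; i < len(offsets); i++ {
		prev, cur := offsets[i-1], offsets[i]
		if cur > prev+1 {
			gaps = append(gaps, offsetGap{From: prev + 1, To: cur - 1})
		}
	}
	return gaps
}

func main() {
	// Made-up offsets: the consumer jumps from 102 straight to 250.
	observed := []int64{100, 101, 102, 250, 251}
	for _, g := range findGaps(observed) {
		fmt.Printf("missing offsets %d-%d (%d messages)\n", g.From, g.To, g.To-g.From+1)
	}
}
```

A jump that lines up with one of the consumer restarts in the logs would be consistent with the offset being lost on reset, though on its own it does not prove the cause.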

Collector version

0.87.0

Environment information

Environment

OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration

No response

Log output

For the impacted pods, we observe repeated logs like the following, which weren't present in the previous version. This implies that the consumer is being reset periodically for some reason. According to the Sarama bug report, consumers lose their initial offset when they are reset:

2023-10-25T13:52:32.922033081Z stderr F 2023-10-25T13:52:32.921Z	info	[email protected]/kafka_receiver.go:561	Starting consumer group	{"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T13:52:51.404728652Z stderr F 2023-10-25T13:52:51.404Z	info	[email protected]/kafka_receiver.go:561	Starting consumer group	{"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T14:02:33.817672442Z stderr F 2023-10-25T14:02:33.817Z	info	[email protected]/kafka_receiver.go:561	Starting consumer group	{"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T14:03:22.084838355Z stderr F 2023-10-25T14:03:22.084Z	info	[email protected]/kafka_receiver.go:561	Starting consumer group	{"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T14:12:54.337646163Z stderr F 2023-10-25T14:12:54.337Z	info	[email protected]/kafka_receiver.go:561	Starting consumer group	{"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T14:22:31.602084342Z stderr F 2023-10-25T14:22:31.601Z	info	[email protected]/kafka_receiver.go:561	Starting consumer group	{"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T14:32:22.996075195Z stderr F 2023-10-25T14:32:22.995Z	info	[email protected]/kafka_receiver.go:561	Starting consumer group	{"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T14:42:23.675027732Z stderr F 2023-10-25T14:42:23.674Z	info	[email protected]/kafka_receiver.go:561	Starting consumer group	{"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T14:42:54.389087635Z stderr F 2023-10-25T14:42:54.388Z	info	[email protected]/kafka_receiver.go:561	Starting consumer group	{"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T14:52:52.134123849Z stderr F 2023-10-25T14:52:52.134Z	info	[email protected]/kafka_receiver.go:561	Starting consumer group	{"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T15:02:38.449207265Z stderr F 2023-10-25T15:02:38.448Z	info	[email protected]/kafka_receiver.go:561	Starting consumer group	{"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T15:04:20.706895353Z stderr F 2023-10-25T15:04:20.706Z	info	[email protected]/kafka_receiver.go:561	Starting consumer group	{"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T15:12:38.396434507Z stderr F 2023-10-25T15:12:38.396Z	info	[email protected]/kafka_receiver.go:561	Starting consumer group	{"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T15:12:38.567983529Z stderr F 2023-10-25T15:12:38.567Z	info	[email protected]/kafka_receiver.go:561	Starting consumer group	{"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T15:12:50.681064864Z stderr F 2023-10-25T15:12:50.680Z	info	[email protected]/kafka_receiver.go:561	Starting consumer group	{"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T15:33:23.946760942Z stderr F 2023-10-25T15:33:23.946Z	info	[email protected]/kafka_receiver.go:561	Starting consumer group	{"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T15:42:50.885315129Z stderr F 2023-10-25T15:42:50.885Z	info	[email protected]/kafka_receiver.go:561	Starting consumer group	{"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T15:53:07.19624621Z stderr F 2023-10-25T15:53:07.195Z	info	[email protected]/kafka_receiver.go:561	Starting consumer group	{"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}
2023-10-25T16:02:37.35429016Z stderr F 2023-10-25T16:02:37.354Z	info	[email protected]/kafka_receiver.go:561	Starting consumer group	{"kind": "receiver", "name": "kafka", "data_type": "metrics", "partition": 51}

Additional context

No response

@ben-childs-docusign ben-childs-docusign added bug Something isn't working needs triage New item requiring triage labels Oct 25, 2023
@ben-childs-docusign ben-childs-docusign changed the title losing messages when consuming from EventHub due to Sarama issue [kafkareceiver] losing messages when consuming from EventHub due to Sarama issue Oct 25, 2023
@crobert-1
Member

@ben-childs-docusign To make sure I understand right: There's nothing to do for the collector, but you've filed this issue as a notice to anyone using this configuration to let them know why they're seeing the above errors?

@ben-childs-docusign
Contributor Author

@crobert-1 Correct, we would need either an update from Sarama that fixes this or a rollback to the earlier version to resolve this issue.

@crobert-1 crobert-1 added admin issues tracker issues etc. and removed needs triage New item requiring triage labels Oct 31, 2023
@ben-childs-docusign
Contributor Author

This has been fixed with the following patch:
#29028
