Successes() channel returning &MessageToSend with flags:2 after requeue #179
I'll have to double check, but I think so. Flags are used internally to track the state of messages, such as whether they've been retried, so I would expect retried messages to have some extra flag(s) set. How did you even check that? We don't expose those flags in the API.
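As a rough illustration of that kind of internal bookkeeping, flags like these are typically a bitmask on the message struct. The type and constant names below are hypothetical stand-ins, not sarama's actual internals:

```go
package main

import "fmt"

// flagSet is a hypothetical bitmask for internal message state; the
// real field and constant names in sarama may differ.
type flagSet int8

const (
	chaser  flagSet = 1 << iota // bit value 1: hypothetical sentinel marker
	retried                     // bit value 2: message was requeued after a failure
)

func main() {
	var flags flagSet
	flags |= retried // set when the message is requeued

	fmt.Println(flags)              // prints 2, matching the flags:2 in the report
	fmt.Println(flags&retried != 0) // true: the retried bit is set
}
```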
(The log messages indicate that your cluster is undergoing leader election, so the producer has to switch brokers.)
Anyhow, I'm keeping tabs on this. I wonder why the cluster's doing leader election / why the metadata goes stale in the first place; it's running on my laptop, and there shouldn't be any surprising changes in cluster membership since it's pretty much always up.
Looks like some messages neither get acknowledged nor error out at all after this happens, although I haven't had time to check whether they actually get sent. If needed, I can try to come up with a test case for this, although it might take some time given my current workload.
today I learned
Ya, haha I know what this is. In retry scenarios the producer internally passes some "fake" messages in order to carry additional flags. I guess we shouldn't be returning those to the user :) For now you can just ignore them - they're basically meaningless, just an implementation detail leaking out where it shouldn't. Easy to fix.
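Until that fix lands, one way to ignore them on the consumer side is to filter them out of the channel. The flags field itself is unexported, but in the struct dump quoted at the bottom of this thread the fake messages arrive with both Key and Value nil. A minimal sketch, assuming the application never legitimately sends a message with a nil Key and a nil Value (consumeAcks and handle are illustrative names):

```go
package example

import "github.com/Shopify/sarama" // import path at the time of this issue (now github.com/IBM/sarama)

// consumeAcks drains the Successes() channel, skipping the leaked
// internal messages: in the report they arrive with both Key and Value
// nil, unlike real acknowledged messages.
func consumeAcks(successes <-chan *sarama.MessageToSend, handle func(*sarama.MessageToSend)) {
	for msg := range successes {
		if msg.Key == nil && msg.Value == nil {
			continue // implementation detail leaking out; safe to drop
		}
		handle(msg)
	}
}
```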
I dunno, but "kafka server: Tried to send a message to a replica that is not the leader for some partition. Your metadata is out of date." is a message we get from the broker, so the broker is definitely telling us that something is wrong.
Or not. There's a corner case here that isn't great. Still digging...
A reproducible case or a network traffic capture (tcpdump or wireshark) at the time of the event would really help (a more complete log sample might also be helpful). Every time I think I've figured out how this could occur, I realize it should be impossible. It would be trivial to add an "if flags set, don't send on Successes channel" check, but all that would do is mask the real problem - those messages shouldn't be in that part of the flow to begin with.
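For reference, the guard being dismissed there would be a one-liner on the success path. A sketch with illustrative stand-in types, not sarama's actual internals:

```go
package main

import "fmt"

// Illustrative stand-ins; not sarama's real internal types.
type MessageToSend struct {
	Topic string
	flags int8 // internal bookkeeping bits, unexported
}

type producer struct{ successes chan *MessageToSend }

// returnSuccess applies the "if flags set, don't send on Successes
// channel" guard: any message with internal flag bits set is dropped
// instead of reaching the user. As noted above, this masks the
// symptom rather than fixing the underlying flow.
func (p *producer) returnSuccess(msg *MessageToSend) {
	if msg.flags != 0 {
		return // internal bookkeeping message; never surface it
	}
	p.successes <- msg
}

func main() {
	p := &producer{successes: make(chan *MessageToSend, 2)}
	p.returnSuccess(&MessageToSend{Topic: "benchmark-kafka"})           // delivered
	p.returnSuccess(&MessageToSend{Topic: "benchmark-kafka", flags: 2}) // dropped
	close(p.successes)
	for msg := range p.successes {
		fmt.Println(msg.Topic, msg.flags)
	}
}
```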
I worked out the formal state machines of the relevant goroutines and stuck a log message in at every state change. This should help with debugging #179. The logs should still only trigger on broker rebalance or super-heavy traffic, so normal operation should be quiet.
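That instrumentation pattern, sketched with hypothetical state names (the producer's real state machine is more involved):

```go
package main

import "log"

// brokerState is an illustrative enumeration of a goroutine's states.
type brokerState int

const (
	stateNormal brokerState = iota
	stateRetrying
	stateFlushing
)

func (s brokerState) String() string {
	return [...]string{"normal", "retrying", "flushing"}[s]
}

type worker struct{ state brokerState }

// transition logs every state change, so rebalance or overload
// scenarios leave a trace while steady-state traffic stays quiet.
func (w *worker) transition(next brokerState) {
	if next == w.state {
		return // no change, no log noise
	}
	log.Printf("producer state change: %s -> %s", w.state, next)
	w.state = next
}

func main() {
	w := &worker{}
	w.transition(stateRetrying) // e.g. leader election forced a requeue
	w.transition(stateNormal)   // recovered after a metadata refresh
}
```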
I have a hypothesis - is it possible you are reusing MessageToSend structs between sends?
Hm, that shouldn't be the case: a new MessageToSend is created for every message.
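For context, the difference between the two patterns looks roughly like this; a sketch against the MessageToSend-era API, feeding the producer's input channel directly (sendReused and sendFresh are illustrative names):

```go
package example

import "github.com/Shopify/sarama" // import path at the time of this issue

// sendReused shows the hypothesized bug trigger: the producer's retry
// bookkeeping writes to unexported fields on the message, so sharing
// one pointer lets one send's state bleed into the next.
func sendReused(input chan<- *sarama.MessageToSend, values []sarama.Encoder) {
	msg := &sarama.MessageToSend{Topic: "benchmark-kafka"}
	for _, v := range values {
		msg.Value = v
		input <- msg // same pointer every time
	}
}

// sendFresh allocates a new struct per send, which is what the
// reporter describes doing.
func sendFresh(input chan<- *sarama.MessageToSend, values []sarama.Encoder) {
	for _, v := range values {
		input <- &sarama.MessageToSend{
			Topic: "benchmark-kafka",
			Value: v,
		}
	}
}
```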
Also reported at #199 (comment).

Go's closure semantics are really annoying; the fix is simply to stop spawning goroutines with the wrong arguments. Add a test (heavily based on https://gist.github.com/ORBAT/d0adcd790dff34b37b04) to ensure this behaviour doesn't regress. Huge thanks to Tom Eklöf for getting me all the logs etc. needed to track this down.
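The closure pitfall referenced in that commit message is the classic Go loop-variable capture (the language itself only changed this behaviour in Go 1.22): a goroutine spawned in a loop closes over the variable, not the value it had at spawn time. A minimal standalone illustration, not sarama's actual code:

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup

	// Buggy (before Go 1.22): every goroutine closes over the same
	// loop variable i, so they may all observe its final value.
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			fmt.Println("captured:", i) // may print 3, 3, 3
		}()
	}
	wg.Wait()

	// Fixed: pass the value as an argument so each goroutine gets its
	// own copy - "stop spawning goroutines with the wrong arguments".
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			fmt.Println("argument:", n) // prints 0, 1, 2 in some order
		}(i)
	}
	wg.Wait()
}
```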
Looks like PR #202 did the trick, yay! I'll go ahead and close the issue.
I'm seeing messages like

```
&sarama.MessageToSend{Topic:"benchmark-kafka", Key:sarama.Encoder(nil), Value:sarama.Encoder(nil), offset:5, partition:2, flags:2}
```

occasionally being returned on the Successes() channel. Here's what seems to usually happen in the logs right before receiving the message:

Is this intended behavior?
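For readers landing here from a newer sarama: MessageToSend has since been renamed ProducerMessage and the producer is now AsyncProducer. The reporter's setup corresponds roughly to this sketch against today's API (the broker address is a placeholder):

```go
package main

import (
	"log"

	"github.com/IBM/sarama"
)

func main() {
	config := sarama.NewConfig()
	config.Producer.Return.Successes = true // deliver acks on Successes()

	producer, err := sarama.NewAsyncProducer([]string{"localhost:9092"}, config)
	if err != nil {
		log.Fatal(err)
	}
	defer producer.Close()

	producer.Input() <- &sarama.ProducerMessage{
		Topic: "benchmark-kafka",
		Value: sarama.StringEncoder("hello"),
	}

	select {
	case msg := <-producer.Successes():
		log.Printf("acked: topic=%s partition=%d offset=%d", msg.Topic, msg.Partition, msg.Offset)
	case err := <-producer.Errors():
		log.Printf("failed: %v", err)
	}
}
```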