rdy_timeout handling round 2 #37

Merged 1 commit into nsqio:master on Jun 26, 2013


mreiferson
Member

more cleanup re: #36

  • fix edge cases where rdy_timeout was not being cleaned up
  • separate the handling of Reader global backoff timers from per-connection RDY delay timers, in order to handle the closing of a connection that held the backoff timer state
  • resume normal RDY state for all connections after completely exiting backoff state
  • handle re-connecting while in a backoff block
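
A minimal sketch of the timer separation described above, not the actual pynsq code. `FakeTimer`, `Conn`, `Reader`, `delay_rdy`, and `on_close` are hypothetical names for illustration; only `rdy_timeout` and `backoff_timeout` come from this PR. The point is that closing a connection cancels its own RDY delay timer and leaves the reader-wide backoff timer alone:

```python
class FakeTimer:
    """Hypothetical stand-in for an event-loop timeout handle."""

    def __init__(self, name):
        self.name = name
        self.cancelled = False

    def cancel(self):
        self.cancelled = True


class Conn:
    def __init__(self, conn_id):
        self.id = conn_id
        self.rdy_timeout = None  # per-connection RDY delay timer


class Reader:
    def __init__(self):
        self.conns = {}
        self.backoff_timeout = None  # reader-wide backoff timer

    def add_conn(self, conn):
        self.conns[conn.id] = conn

    def start_backoff(self):
        self.backoff_timeout = FakeTimer("backoff")

    def delay_rdy(self, conn):
        conn.rdy_timeout = FakeTimer("rdy:%s" % conn.id)

    def on_close(self, conn):
        # Cancel only the timer owned by the closing connection.
        if conn.rdy_timeout is not None:
            conn.rdy_timeout.cancel()
            conn.rdy_timeout = None
        self.conns.pop(conn.id, None)
        # The reader-level backoff timer is untouched, so backoff
        # state survives the loss of any single connection.


reader = Reader()
a, b = Conn("a"), Conn("b")
reader.add_conn(a)
reader.add_conn(b)
reader.start_backoff()
reader.delay_rdy(a)
timer_a = a.rdy_timeout
reader.on_close(a)
```

Because the backoff timer lives on the reader rather than on whichever connection happened to trigger it, there is no timer state to migrate when that connection goes away.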

cc @jehiah

@mreiferson
Member Author

pushed a few commits up and updated issue description

@jehiah
Member

jehiah commented Jun 25, 2013

I've pushed up a few more fixes here. This code is now stable for me and has been tested on some production systems. I am still thinking through whether there is anything else I want to tackle, and I have a few ideas on how to test.

  • We were putting a connection in the connection list too soon (it was possible for it to be sent a RDY count before it was connected). I delayed that until it is actually connected.
  • The end of message handling now calls one of 2 functions: exit or start a backoff block.
  • Other points that affect backoff (connection add, close) call restart backoff block as appropriate.
  • Backoff block functions never take a connection; they pick one randomly.
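
The points above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the class and method names (`on_connected`, `start_backoff_block`, `finish_backoff_block`, `exit_backoff`) are hypothetical, and so are the RDY 1 probe and the even split of `max_in_flight` on exit. What it shows is that a connection is only registered once connected, and that the backoff functions take no connection argument and choose one at random:

```python
import random


class Conn:
    def __init__(self, conn_id):
        self.id = conn_id
        self.rdy = 0

    def send_rdy(self, count):
        self.rdy = count


class Reader:
    def __init__(self, max_in_flight=4, rng=None):
        self.conns = {}
        self.max_in_flight = max_in_flight
        self.backoff_block = False
        self.rng = rng or random.Random()

    def on_connected(self, conn):
        # Register only once the connection is usable, so it can never
        # be handed a RDY count before it is actually connected.
        self.conns[conn.id] = conn

    def start_backoff_block(self):
        self.backoff_block = True
        for conn in self.conns.values():
            conn.send_rdy(0)

    def finish_backoff_block(self):
        # Block expired: probe on one randomly chosen connection
        # (RDY 1 is an assumption made for this sketch).
        self.backoff_block = False
        if not self.conns:
            return None
        conn = self.rng.choice(list(self.conns.values()))
        conn.send_rdy(1)
        return conn

    def exit_backoff(self):
        # Fully out of backoff: restore normal RDY on every connection.
        self.backoff_block = False
        per_conn = max(1, self.max_in_flight // max(1, len(self.conns)))
        for conn in self.conns.values():
            conn.send_rdy(per_conn)


reader = Reader(max_in_flight=4, rng=random.Random(0))
for name in ("a", "b"):
    reader.on_connected(Conn(name))
reader.start_backoff_block()
probe = reader.finish_backoff_block()
```

Since no backoff function holds a reference to a particular connection, a connection closing mid-backoff only requires restarting the block against whatever connections remain.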

* skip redistributing RDY while blocked by backoff
* stop callbacks on closed connections
* re-schedule backoff on remaining connections
* properly send RDY on connect (after IDENTIFY)
* log message body on exception
* throttle connection attempts
* rename various internal methods for clarity
* separate a conn's rdy_timeout (for disabled handling) from the
  reader's global backoff_timeout (and clear backoff_timeout)
* force redistribute after a connection closes when in backoff
  or when it would have toggled out of normal redistribution cases
* test improvements
* don't 'optimize' when RDY is already the value we want (far, far easier to reason about in tests)
* when completely exiting backoff return to normal operation
* choose a random connection when backoff block expires
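
As one illustration of the "throttle connection attempts" item above, here is a minimal sketch of a per-address throttle. It is an assumption about how such throttling could look, not the implementation in this commit: `ConnectThrottle`, `should_attempt`, and the 10 second interval are hypothetical, and the clock is injectable only so the behavior can be exercised without sleeping.

```python
import time


class ConnectThrottle:
    """Allow at most one connection attempt per address per interval."""

    def __init__(self, min_interval=10.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self.last_attempt = {}  # address -> time of last allowed attempt

    def should_attempt(self, address):
        now = self.clock()
        last = self.last_attempt.get(address)
        if last is not None and now - last < self.min_interval:
            return False  # too soon since the previous attempt
        self.last_attempt[address] = now
        return True


# Drive the throttle with a fake clock (hypothetical address).
fake_now = [100.0]
throttle = ConnectThrottle(min_interval=10.0, clock=lambda: fake_now[0])
address = ("127.0.0.1", 4150)
first = throttle.should_attempt(address)
second = throttle.should_attempt(address)
fake_now[0] += 10.0
third = throttle.should_attempt(address)
```

A rejected attempt does not update the stored timestamp, so repeated calls during the interval cannot push the next allowed attempt further out.
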
@jehiah
Member

jehiah commented Jun 26, 2013

LGTM. squash/rebase please.

This set of changes fantastically improves lots of strange edge cases. 🚀 💯

jehiah added a commit that referenced this pull request Jun 26, 2013
@jehiah jehiah merged commit e41e329 into nsqio:master Jun 26, 2013
@jehiah jehiah mentioned this pull request Aug 19, 2013
5 tasks
@mreiferson mreiferson mentioned this pull request Sep 3, 2013
6 tasks