Rework how the client connects to brokers. #10
Conversation
Fixes #9.

This ended up being more complicated than I had hoped and touched several different areas. TL;DR is that we now connect to the other brokers in the cluster asynchronously. Errors connecting only show up when somebody tries to use that broker. This is better than the old behaviour since it means that if some brokers in a cluster go down but the topics we care about are still available, we just keep going instead of blowing up for no reason.

The complicated part is that simply calling `go broker.Connect()` doesn't do what we want, so I had to write a `broker.AsyncConnect()`. The problem occurs if you've got code like this:

    go broker.Connect()
    // do some stuff
    broker.SendSomeMessage()

What can happen is that SendSomeMessage can be run before the Connect() goroutine ever gets scheduled, in which case SendSomeMessage will simply return NotConnected. The desired behaviour is that SendSomeMessage waits for Connect() to finish, which means AsyncConnect() has to *synchronously* take the broker lock before it launches the asynchronous connect call. Lots of fun.

And bonus change in this commit: rather than special-casing leader == -1 in `client.cachedLeader` and adding a big long comment to the LEADER_NOT_AVAILABLE case explaining the fallthrough statement, just delete that partition from the hash. So much easier to follow, I must have been on crack when I wrote the old way.
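To make that scheduling hazard and the locking trick concrete, here is a minimal sketch of the idea, assuming a plain TCP dial; the field names, the NotConnected value, and SendSomeMessage are placeholders for illustration, not the real API:

    package kafka

    import (
        "errors"
        "fmt"
        "net"
        "sync"
    )

    var NotConnected = errors.New("kafka: broker not connected")

    type Broker struct {
        host string
        port int32

        lock    sync.Mutex // guards conn and connErr
        conn    net.Conn
        connErr error
    }

    // AsyncConnect takes the lock *before* spawning the goroutine, so anyone
    // who needs the connection blocks on the lock until the dial finishes,
    // instead of racing the goroutine and seeing NotConnected.
    func (b *Broker) AsyncConnect() {
        b.lock.Lock() // synchronous: taken before AsyncConnect returns
        go func() {
            defer b.lock.Unlock()
            b.conn, b.connErr = net.Dial("tcp", fmt.Sprintf("%s:%d", b.host, b.port))
        }()
    }

    // SendSomeMessage stands in for any request method. Because it takes the
    // same lock, it waits for an in-flight AsyncConnect to complete first.
    func (b *Broker) SendSomeMessage() error {
        b.lock.Lock()
        defer b.lock.Unlock()
        switch {
        case b.connErr != nil:
            return b.connErr
        case b.conn == nil:
            return NotConnected
        }
        _, err := b.conn.Write([]byte("wire bytes would go here"))
        return err
    }

With that shape, client code can fire off AsyncConnect for every broker it learns about and only pays for a dead broker if it actually tries to talk to it.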
It looks like a good change. Afaik the db.Sql package works just like this.
    if b.conn != nil {
        return AlreadyConnected
    }
    b.conn_err = nil

    addr, err := net.ResolveIPAddr("ip", b.host)
Consider that you could do addr, b.conn_err = ... here. Just a thought; feel free to ignore if you have some reason not to do that.
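Spelled out, the suggestion is to assign the error straight into the struct field instead of a local err. A hedged fragment against the quoted lines (not the actual diff):

    // Current form (local variable):
    //     addr, err := net.ResolveIPAddr("ip", b.host)

    // Suggested form: assign the error straight into the field. addr has to
    // be declared separately, because := is not allowed when a struct field
    // appears on the left-hand side.
    var addr *net.IPAddr
    addr, b.conn_err = net.ResolveIPAddr("ip", b.host)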
Change looks good to me. Nice work.
@tobi does db.Sql not provide a synchronous open then? That seems an odd choice. Nevermind, I RTFM, that's a pattern worth matching.
Replaces Connect and AsyncConnect with Open and Connected
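For anyone reviewing the rename: this mirrors database/sql, where sql.Open returns immediately and connection errors only surface once the handle is used. A rough sketch of the intended call pattern, assuming a Broker like the one sketched earlier with AsyncConnect renamed to Open and a Connected() (bool, error) accessor added (both are guesses at the new surface, not a quote of the diff):

    // useCluster illustrates the Open/Connected flow: open everything eagerly,
    // surface errors lazily, and only for brokers we actually need.
    func useCluster(brokers []*Broker, leader *Broker) error {
        for _, b := range brokers {
            b.Open() // returns immediately; the dial happens in the background
        }

        // A dead broker only costs us something here, when it is used.
        if ok, err := leader.Connected(); !ok {
            return err
        }
        return leader.SendSomeMessage()
    }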
ping @burke for a quick re-review of the new pattern please?
Looks great. |
Rework how the client connects to brokers.
@burke @fw42 Thorough review please - I want this concurrent locking pattern to make sense to somebody else before I merge it.
CC @tobi
Fixes #9.
This ended up being more complicated than I had hoped and touched several
different areas. TL;DR is that we now connect to the other brokers in the
cluster asynchronously. Errors connecting only show up when somebody tries to
use that broker.
This is better than the old behaviour since it means that if some brokers in a
cluster go down but the topics we care about are still available, we just keep
going instead of blowing up for no reason.
The complicated part is that simply calling go broker.Connect() doesn't do
what we want, so I had to write a broker.AsyncConnect(). The problem occurs if
you've got code like this:

    go broker.Connect()
    // do some stuff
    broker.SendSomeMessage()

What can happen is that SendSomeMessage can be run before the Connect
goroutine ever gets scheduled, in which case SendSomeMessage will simply
return NotConnected. The desired behaviour is that SendSomeMessage waits for
Connect to finish, which means AsyncConnect has to synchronously take the
broker lock before it launches the asynchronous connect goroutine. Lots of fun.

And bonus change in this commit: rather than special-casing leader == -1 in
client.cachedLeader and adding a big long comment to the LEADER_NOT_AVAILABLE
case explaining the fallthrough statement, just delete that partition from the
hash. So much easier to follow, I must have been on crack when I wrote the old
way.
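To make the bonus change concrete, here is a rough before/after sketch, continuing the hypothetical package from the earlier broker sketch; the leaders map layout and the method names are guesses for illustration, not the actual client code:

    type Client struct {
        brokers map[int32]*Broker          // broker ID -> broker
        leaders map[string]map[int32]int32 // topic -> partition -> leader broker ID
    }

    // Old shape: on LEADER_NOT_AVAILABLE, store -1 and then special-case it
    // (plus a long comment about the fallthrough) on every lookup:
    //
    //     if id, ok := partitions[partition]; ok && id != -1 { ... }
    //
    // New shape: just delete the entry, so "no cached leader" and "leader not
    // available" are the same case and cachedLeader needs no special-casing.
    func (c *Client) forgetLeader(topic string, partition int32) {
        if partitions, ok := c.leaders[topic]; ok {
            delete(partitions, partition)
        }
    }

    func (c *Client) cachedLeader(topic string, partition int32) *Broker {
        if partitions, ok := c.leaders[topic]; ok {
            if id, ok := partitions[partition]; ok {
                return c.brokers[id]
            }
        }
        return nil
    }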