Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lazily connect to brokers in the client #309

Merged
merged 1 commit into from
Mar 13, 2015
Merged

Conversation

eapache
Copy link
Contributor

@eapache eapache commented Feb 26, 2015

Instead of opening a connection to all brokers immediately upon receiving their
information in metadata, wait until we are asked for them either via a call to
Leader or a call to any.

It seems simple enough (simpler than I was expecting actually) but I'm a bit wary, since it changes the internal semantics so much. Strong opinions and careful review please.

@Shopify/kafka

@eapache eapache force-pushed the lazy-broker-connections branch 2 times, most recently from 0b187cf to 6b7ee17 Compare March 6, 2015 16:35
@wvanbergen
Copy link
Contributor

This looks innocent enough 👍

Should we do some manual testing?

@eapache
Copy link
Contributor Author

eapache commented Mar 10, 2015

Ya, I'm really suspicious mostly because it looks so innocent. Throw whatever cases you can think of at it :)

@yagnik
Copy link

yagnik commented Mar 10, 2015

This code looks good, what do you guys think about genghis like stress testing tool for kafka, we could possibly just leverage genghis ?

@wvanbergen
Copy link
Contributor

Kind of unrelated to this PR, but we need:

Let's discuss this offline.

Instead of opening a connection to all brokers immediately upon receiving their
information in metadata, wait until we are asked for them either via a call to
`Leader` or a call to `any`.
@eapache eapache force-pushed the lazy-broker-connections branch from 6b7ee17 to 533cce2 Compare March 12, 2015 16:58
eapache added a commit that referenced this pull request Mar 12, 2015
It is now only called from one place in the client. This looks simple but is
actually super-subtle, and depends on lazy broker connections (PR #309).

disconnectBroker does a whole bunch of different things:
- calls `Close` on the broker connection
- adds the address to the internal `deadBrokerAddrs` map even if the broker is
  not a seed, which I think is wrong, since resurrectDeadBrokers will use it to
  repopulate the seedBrokerAddrs list
- rotates seedBrokers (if the broker was a seed broker)
- removes it from the brokers map (otherwise)

In the producer and consumer where we used to call disconnectBroker:
- We now call `Close` directly on the broker.
- The broker we are dealing with is not a seed broker, so the seedBrokers do not
  need rotating, and I don't think it's a problem that it no longer gets added
  to `deadBrokerAddrs`.
- The reason we removed it from the broker map was so that the next request for
  that broker would trigger a metadata request and reopen the connection. The
  producer and consumer both manually trigger metadata requests when necessary,
  and the fact that we now have lazy connection opening means simply closing it
  (which we do, see first bullet) is enough to cause the connection to reopen
  the next time it is requested, even if no metadata refresh is requested.
eapache added a commit that referenced this pull request Mar 12, 2015
It is now only called from one place in the client. This looks simple but is
actually super-subtle, and depends on lazy broker connections (PR #309).

disconnectBroker does a whole bunch of different things:
- calls `Close` on the broker connection
- adds the address to the internal `deadBrokerAddrs` map even if the broker is
  not a seed, which I think is wrong, since resurrectDeadBrokers will use it to
  repopulate the seedBrokerAddrs list
- rotates seedBrokers (if the broker was a seed broker)
- removes it from the brokers map (otherwise)

In the producer and consumer where we used to call disconnectBroker:
- We now call `Close` directly on the broker.
- The broker we are dealing with is not a seed broker, so the seedBrokers do not
  need rotating, and I don't think it's a problem that it no longer gets added
  to `deadBrokerAddrs`.
- The reason we removed it from the broker map was so that the next request for
  that broker would trigger a metadata request and reopen the connection. The
  producer and consumer both manually trigger metadata requests when necessary,
  and the fact that we now have lazy connection opening means simply closing it
  (which we do, see first bullet) is enough to cause the connection to reopen
  the next time it is requested, even if no metadata refresh is requested.
eapache added a commit that referenced this pull request Mar 12, 2015
It is now only called from one place in the client. This looks simple but is
actually super-subtle, and depends on lazy broker connections (PR #309).

disconnectBroker does a whole bunch of different things:
- calls `Close` on the broker connection
- adds the address to the internal `deadBrokerAddrs` map even if the broker is
  not a seed, which I think is wrong, since resurrectDeadBrokers will use it to
  repopulate the seedBrokerAddrs list
- rotates seedBrokers (if the broker was a seed broker)
- removes it from the brokers map (otherwise)

In the producer and consumer where we used to call disconnectBroker:
- We now call `Close` directly on the broker.
- The broker we are dealing with is not a seed broker, so the seedBrokers do not
  need rotating, and I don't think it's a problem that it no longer gets added
  to `deadBrokerAddrs`.
- The reason we removed it from the broker map was so that the next request for
  that broker would trigger a metadata request and reopen the connection. The
  producer and consumer both manually trigger metadata requests when necessary,
  and the fact that we now have lazy connection opening means simply closing it
  (which we do, see first bullet) is enough to cause the connection to reopen
  the next time it is requested, even if no metadata refresh is requested.
eapache added a commit that referenced this pull request Mar 12, 2015
It is now only called from one place in the client. This looks simple but is
actually super-subtle, and depends on lazy broker connections (PR #309).

disconnectBroker does a whole bunch of different things:
- calls `Close` on the broker connection
- adds the address to the internal `deadBrokerAddrs` map even if the broker is
  not a seed, which I think is wrong, since resurrectDeadBrokers will use it to
  repopulate the seedBrokerAddrs list
- rotates seedBrokers (if the broker was a seed broker)
- removes it from the brokers map (otherwise)

In the producer and consumer where we used to call disconnectBroker:
- We now call `Close` directly on the broker.
- The broker we are dealing with is not a seed broker, so the seedBrokers do not
  need rotating, and I don't think it's a problem that it no longer gets added
  to `deadBrokerAddrs`.
- The reason we removed it from the broker map was so that the next request for
  that broker would trigger a metadata request and reopen the connection. The
  producer and consumer both manually trigger metadata requests when necessary,
  and the fact that we now have lazy connection opening means simply closing it
  (which we do, see first bullet) is enough to cause the connection to reopen
  the next time it is requested, even if no metadata refresh is requested.
@eapache
Copy link
Contributor Author

eapache commented Mar 13, 2015

I ran this code (in combination with #338) through a bunch of failure scenarios using the vagrant cluster and willem's stressproducer and didn't have any problems.

I'll merge this and rebase #338 and then it can be reviewed separately.

eapache added a commit that referenced this pull request Mar 13, 2015
Lazily connect to brokers in the client
@eapache eapache merged commit 401d5b1 into master Mar 13, 2015
@eapache eapache deleted the lazy-broker-connections branch March 13, 2015 14:21
eapache added a commit that referenced this pull request Mar 13, 2015
It is now only called from one place in the client. This looks simple but is
actually super-subtle, and depends on lazy broker connections (PR #309).

disconnectBroker does a whole bunch of different things:
- calls `Close` on the broker connection
- adds the address to the internal `deadBrokerAddrs` map even if the broker is
  not a seed, which I think is wrong, since resurrectDeadBrokers will use it to
  repopulate the seedBrokerAddrs list
- rotates seedBrokers (if the broker was a seed broker)
- removes it from the brokers map (otherwise)

In the producer and consumer where we used to call disconnectBroker:
- We now call `Close` directly on the broker.
- The broker we are dealing with is not a seed broker, so the seedBrokers do not
  need rotating, and I don't think it's a problem that it no longer gets added
  to `deadBrokerAddrs`.
- The reason we removed it from the broker map was so that the next request for
  that broker would trigger a metadata request and reopen the connection. The
  producer and consumer both manually trigger metadata requests when necessary,
  and the fact that we now have lazy connection opening means simply closing it
  (which we do, see first bullet) is enough to cause the connection to reopen
  the next time it is requested, even if no metadata refresh is requested.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants