Curator framework improvements #2225
Conversation
🚢
Is there any concern about this causing issues for a sequence of events like:
With a single curator, I believe you're guaranteed that the getData call returns the data you just wrote, because the read and the write go to the same ZooKeeper server. But with this change, the read and write could go to different ZooKeeper servers, and the read might not return the latest data. Normally you could work around this by adding a call to
@jhaber this is only enabled for read-only cases on instances that do not contend for the leader latch (we split it into two separate deploys to isolate the scheduler from heavy read traffic), i.e. they never write. Only the leading scheduler does writes, and all other instances proxy to the leader (for other reasons as well, e.g. it keeps most state in memory and just persists to zk). If we were to enable this on instances that write, we'd have to do a bit more work around handing out a specific curator instance and using it for the duration of the method calls. We didn't feel we needed that level of optimization yet, as the main pain point we were trying to solve for was the read-only API instances.
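The split described above could be sketched roughly like this. All names here (`ReadWriteCurators`, `forRead`, `forWrite`) are hypothetical, not classes from this PR, and the generic type `C` stands in for `CuratorFramework` so the sketch is self-contained:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: read-only API instances rotate across a pool of
// clients, while the single leading scheduler always writes through one
// dedicated instance, preserving read-your-writes for the writer.
class ReadWriteCurators<C> {
    private final List<C> readPool;
    private final C writer;
    private final AtomicInteger rr = new AtomicInteger();

    ReadWriteCurators(List<C> readPool, C writer) {
        this.readPool = readPool;
        this.writer = writer;
    }

    // Reads rotate round-robin across the pool
    C forRead() {
        return readPool.get(Math.floorMod(rr.getAndIncrement(), readPool.size()));
    }

    // Writes always go through the single leader-held instance
    C forWrite() {
        return writer;
    }
}
```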
Ah ok, that makes sense, thanks for clarifying. Do you have a sense of how much of the benefit comes from reducing contention on the client side vs. the server side? If most of the benefit is client side, we could theoretically force all of the Also, a more robust option might be to defer the
yeah, the way it does builders makes that rough. In terms of benefit though, at Singularity scale, going from 1 -> 3 curator instances we saw the metrics for total curator call time go from maxing out around 5s+ to a few hundred millis
Currently we use a single instance of CuratorFramework for the entire service. This PR adds a way to create multiple instances of CuratorFramework and wrap them behind a load distributor (using round-robin scheduling).
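A round-robin distributor of the kind the description mentions could look roughly like the sketch below. The names (`RoundRobinPool`, `next`) are illustrative, not the actual classes introduced by this PR, and the generic type `C` stands in for `CuratorFramework` so the sketch compiles on its own:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: hand out clients from a fixed pool in
// round-robin order. Math.floorMod keeps the index non-negative
// even after the counter overflows.
class RoundRobinPool<C> {
    private final List<C> instances;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinPool(List<C> instances) {
        if (instances.isEmpty()) {
            throw new IllegalArgumentException("pool must not be empty");
        }
        this.instances = instances;
    }

    // Each call returns the next instance in rotation
    C next() {
        return instances.get(Math.floorMod(counter.getAndIncrement(), instances.size()));
    }
}
```

Callers would ask the pool for an instance per operation rather than holding one long-term, which is what makes this safe only for independent reads.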