
Curator framework improvements #2225

Merged · 6 commits merged into master on Aug 25, 2021
Conversation

ajammala (Contributor)

Currently we use a single instance of CuratorFramework for the entire service. This PR adds a way to create multiple instances of CuratorFramework and wrap them behind a load distributor (using round-robin scheduling).
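
A minimal sketch of what such a wrapper could look like (class and method names here are hypothetical, not the PR's actual code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import org.apache.curator.framework.CuratorFramework;

public class RoundRobinCuratorProvider {
  private final List<CuratorFramework> clients;
  private final AtomicLong counter = new AtomicLong();

  // clients are assumed to be already-started CuratorFramework instances
  public RoundRobinCuratorProvider(List<CuratorFramework> clients) {
    this.clients = new ArrayList<>(clients);
  }

  // Each call rotates to the next client, spreading calls across sessions
  public CuratorFramework next() {
    return clients.get((int) (counter.getAndIncrement() % clients.size()));
  }
}
```

Callers would then grab a client per operation, e.g. `provider.next().getData().forPath("/some/path")`.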

@ssalinas (Member)

🚢

ajammala merged commit 78de2ba into master on Aug 25, 2021
@jhaber (Member) commented Aug 25, 2021

Is there any concern about this causing issues for a sequence of events like:

  1. write data to path /XYZ
  2. issue a getData call to path /XYZ

With a single curator, I believe that you're guaranteed that the getData call returns the data you just wrote, because the read and write are going to the same ZooKeeper server. But with this change, the read and write could go to different ZooKeeper servers, and the read might not return the latest data. Normally you could work around this by adding a call to sync, but that sync call might not get routed to the same CuratorFramework as the subsequent getData call, in which case it would have no effect.
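
For illustration, a minimal sketch of that sequence against a single client (connect string and znode path are placeholders; the sync-then-read workaround only helps if it runs on the same client as the subsequent read):

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CountDownLatch;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class ReadAfterWriteExample {
  public static void main(String[] args) throws Exception {
    CuratorFramework client = CuratorFrameworkFactory.newClient(
        "localhost:2181", new ExponentialBackoffRetry(1000, 3));
    client.start();

    // 1. write data to path /XYZ
    client.create().orSetData().forPath("/XYZ", "payload".getBytes(StandardCharsets.UTF_8));

    // 2. issue a getData call to path /XYZ; on the same client/session this
    // is guaranteed to see the write above
    byte[] data = client.getData().forPath("/XYZ");

    // If the read might hit a server that lags the leader, sync() is the
    // usual workaround -- but only if it runs on the same client as the
    // getData(). Curator performs sync() in the background, so wait for it.
    CountDownLatch synced = new CountDownLatch(1);
    client.sync().inBackground((c, event) -> synced.countDown()).forPath("/XYZ");
    synced.await();
    byte[] fresh = client.getData().forPath("/XYZ");

    client.close();
  }
}
```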

@ssalinas (Member)

@jhaber this is only enabled for read-only cases on instances that do not contend for the leader latch (we split it into two separate deploys to isolate the scheduler from heavy read traffic), i.e. they never write. Only the leading scheduler does writes, and all other instances proxy to the leader (for various other reasons, such as it keeping most state in memory and just persisting to ZK).

If we were to enable this on instances that write, we'd have to do a bit more work around handing out a specific curator instance and using it for the duration of the method calls. We didn't feel we needed that level of optimization yet, since the main pain point we were trying to solve was the read-only API instances.

ssalinas deleted the curator_framework_improvements branch on Aug 25, 2021 at 19:51
@jhaber (Member) commented Aug 25, 2021

Ah ok makes sense, thanks for clarifying. Do you have a sense of how much of the benefit comes from reducing contention on the client side vs. server side? If most of the benefit is client side, we could theoretically force all of the CuratorFramework instances to connect to the same ZooKeeper server, which I think would avoid most of the weirdness and be safer to use on all the instances (maybe with some additional tweaks).
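
A rough sketch of that single-server variant (host and client count are placeholders, not anything in this PR): several clients to spread client-side work, all pinned to one ZooKeeper host so every call still lands on the same server.

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class SingleServerClients {
  // singleServer is one host from the ensemble, e.g. "zk1.example.com:2181"
  public static List<CuratorFramework> create(String singleServer, int count) {
    List<CuratorFramework> clients = new ArrayList<>();
    for (int i = 0; i < count; i++) {
      CuratorFramework client = CuratorFrameworkFactory.newClient(
          singleServer, new ExponentialBackoffRetry(1000, 3));
      client.start();
      clients.add(client);
    }
    return clients;
  }
}
```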

Also, a more robust option might be to defer the CuratorFramework selection until we know the path, and use a hash of the path to consistently pick the same CuratorFramework. But it would probably be a nightmare to wire that up because the path is usually specified last in the method call chain.
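
Something like the following (hypothetical names, not wired into Curator's builder chain):

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.curator.framework.CuratorFramework;

public class PathHashedCuratorProvider {
  private final List<CuratorFramework> clients;

  public PathHashedCuratorProvider(List<CuratorFramework> clients) {
    this.clients = new ArrayList<>(clients);
  }

  // The same znode path always maps to the same client, so a write and a
  // later read of that path share one ZooKeeper session
  public CuratorFramework forPath(String path) {
    // floorMod keeps the index non-negative even for negative hash codes
    return clients.get(Math.floorMod(path.hashCode(), clients.size()));
  }
}
```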

@ssalinas (Member)

Yeah, the way it does builders makes that rough. In terms of benefit, though: at Singularity's scale, going from 1 to 3 curator instances we saw the metric for total curator call time go from maxing out around 5s+ to a few hundred millis.

ssalinas added this to the 1.5.0 milestone on May 4, 2022