Race in Cluster.onConnect that leaves cluster in shutdown state indefinitely? #18

jdanbrown · 2012-09-19T03:04:51Z

I don't have hard evidence for this, but I think the following is possible:

1. Receive SyncConnected for sessionid "a"
2. Call onConnect()
3. zk session expires here, sessionid was "a"
4. Call joinCluster()
5. zk.get().getSessionId establishes new session with sessionid "b" and returns "b"
6. /<name>/nodes/<nodeID> created with connectionID "b", lifetime bound to sessionid "b"
7. Complete joinCluster(), onConnect(), SyncConnected
8. Receive SyncConnected for sessionid "b"
9. Call onConnect()
10. Call previousZKSessionStillActive()
11. previousZKSessionStillActive() uses session "b" to read /<name>/nodes/<nodeID>, finds connectionID "b", returns true
12. onConnect() returns early, skipping cluster setup

And somewhere in there an Expired event for sessionid "a" shuts down the cluster (I'm not sure if it matters when), leaving the cluster in a shutdown state indefinitely because onConnect() for sessionid "b" was tricked into skipping cluster setup.

jdanbrown · 2012-09-21T21:45:58Z

I can now trip this somewhat reliably in my own code (unpublished) by starting a handful of nodes (~5-10) and putting them through a short series of brief zk connection losses.

I think the root problem is that the code sometimes uses separate zk.get() calls, multiple times in sequence, while assuming that all interactions are in the same zk session, which is unsound because each zk.get() call can potentially start a new zk session.

Here's one simple example of where this problem can occur:

1. Cluster.joinCluster calls zk.get().getSessionId to figure out what sessionId to put in its ephemeral /nodes/<nodeId>
2. ZKUtils.createEphemeral calls zk.get().create to create the ephemeral node with the requested data

In the case where the two separate zk.get() calls return ZooKeeper instances for different sessions, we end up with a /nodes/<nodeId> actually created by session B (visible by stat) that thinks it was created by session A (visible by get). I think this particular race wouldn't cause the failure outlined above, but there's a zk.get() race at the core of that one too.
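To make the two-call race concrete, here is a minimal Scala sketch. FakeSession, FakeClient, Node, and joinClusterRacy are hypothetical stand-ins, not the real ZooKeeper or Twitter client API; the sketch only assumes that get() silently opens a new session once the current one has expired, as described above.

```scala
// Hypothetical stand-ins for illustration only; not the real ZooKeeper API.
object SessionRaceSketch {
  final class FakeSession(val id: String) {
    @volatile var expired = false
  }

  // Mimics the assumed behavior of zk.get(): reuse the live session,
  // or silently open a new one if the current one has expired.
  final class FakeClient {
    private var counter = 0
    private var current: FakeSession = newSession()

    private def newSession(): FakeSession = {
      counter += 1
      new FakeSession(s"session-$counter")
    }

    def get(): FakeSession = synchronized {
      if (current.expired) current = newSession()
      current
    }
  }

  // An ephemeral node: `data` is what a get would return,
  // `owner` is the session a stat would report as the creator.
  final case class Node(data: String, owner: String)

  // Two separate get() calls, as in joinCluster + createEphemeral.
  def joinClusterRacy(zk: FakeClient, expireBetweenCalls: Boolean): Node = {
    val first = zk.get()
    val recordedId = first.id                    // first call: read the session id
    if (expireBetweenCalls) first.expired = true // session expiry lands here
    val creator = zk.get()                       // second call: create the node
    Node(data = recordedId, owner = creator.id)
  }
}
```

With expireBetweenCalls = true, the returned Node carries the first session's id in its data while being owned by the second session, which is the mismatch described above.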

jdanbrown · 2012-09-21T22:21:51Z

As for a fix, I'm imagining something along these lines:

1. Indirect all existing calls to zk.get() through a call to currentZk()
2. currentZk() calls zk.get() except when there's a fixed ZooKeeper instance to tie us to a single session, which we can unobtrusively communicate through a dynamic variable fixedZk:

import scala.util.DynamicVariable
def currentZk(): ZooKeeper = fixedZk.value getOrElse zk.get()
val fixedZk = new DynamicVariable[Option[ZooKeeper]](None) // No fixed zk session by default

To fix a zk session for a particular block of code, wrap the block with zkSingleSession { ... }, which we can define as:

def zkSingleSession[X](x: => X): X = fixedZk.withValue(Some(zk.get())) { x }

Then any zkSingleSession { body } is guaranteed to observe only one ZooKeeper instance, and thus only one session. If body ever tries to use a ZooKeeper whose session has expired, it will trigger a SessionExpiredException that body probably can't meaningfully handle. The exception aborts body, and the caller of zkSingleSession is responsible for deciding whether to fail or retry.
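A self-contained sketch of the proposal, runnable without a ZooKeeper server. FakeSession, FakeClient, SessionExpired, Cluster, and sessionId() are hypothetical names invented for illustration; only scala.util.DynamicVariable is a real library class, and the real code would use ZooKeeper and its own exception types instead.

```scala
import scala.util.DynamicVariable

// Hypothetical stand-ins for illustration only; not the real ZooKeeper API.
object SingleSessionSketch {
  final class FakeSession(val id: String) {
    @volatile var expired = false
  }

  final class SessionExpired(id: String)
    extends RuntimeException(s"session $id expired")

  // Assumed behavior of zk.get(): replace an expired session with a new one.
  final class FakeClient {
    private var counter = 0
    private var current: FakeSession = newSession()

    private def newSession(): FakeSession = {
      counter += 1
      new FakeSession(s"session-$counter")
    }

    def get(): FakeSession = synchronized {
      if (current.expired) current = newSession()
      current
    }
  }

  final class Cluster(zk: FakeClient) {
    // No fixed session by default.
    private val fixedZk = new DynamicVariable[Option[FakeSession]](None)

    // Use the pinned session if there is one, otherwise ask the client.
    def currentZk(): FakeSession = fixedZk.value.getOrElse(zk.get())

    // Pin a single session for the duration of `body`.
    def zkSingleSession[X](body: => X): X =
      fixedZk.withValue(Some(zk.get()))(body)

    // Example operation: fails if the session it is given has expired.
    def sessionId(): String = {
      val session = currentZk()
      if (session.expired) throw new SessionExpired(session.id)
      session.id
    }
  }
}
```

Inside zkSingleSession { ... }, every sessionId() call sees the same pinned session, and an expiry surfaces as an exception for the caller to handle rather than as a silent switch to a new session. One caveat: DynamicVariable is backed by an inheritable thread-local, so the pin applies only to the calling thread (and threads it spawns); work that runs on another existing thread, such as a callback thread, would not see it.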

eribeiro · 2012-09-25T23:07:13Z

This bug seems to highlight a ginormous design flaw in Twitter's ZooKeeper API, that is, creating a method that does two things (retrieve, or create and retrieve) instead of one. I proposed a fix some time ago, but job commitments got in my way. Your bug report shows this question should be addressed as soon as possible! Maybe we can start at ZKClient and move up the stack?

jdanbrown mentioned this issue Oct 24, 2012

Log any exception that kills a Runnable.run #12

Open
