This repository has been archived by the owner on Nov 12, 2024. It is now read-only.

Members should not respond with Vote if sender is not in Configuration #49

Open
colin-scott opened this issue May 18, 2015 · 5 comments

Comments

@colin-scott
Contributor

Suppose there are 5 nodes, labeled raft-member-{1-5}.

In the beginning, we send ChangeConfiguration(raft-member-1, raft-member-2, raft-member-3) [bootstrap] messages to raft-member-{1-3}.

Later, we add two nodes, raft-member-{4-5}. We send new ChangeConfiguration(raft-member-1, raft-member-2, raft-member-3, raft-member-4, raft-member-5) messages to all 5 members, but the messages reach raft-member-{4-5} much sooner than they reach raft-member-{1-3}.

The current behavior in this scenario is that raft-member-4 or raft-member-5 will start an election, and send RequestVote messages to all members. Strangely, raft-member-{1-3} respond with votes (even though they are not aware of the existence of raft-member-{4-5})! This seems broken, as it can cause multiple leaders to be elected for the same term.
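To make the safety concern concrete, here is a minimal sketch of the quorum arithmetic in Python. It is not taken from akka-raft. The vote split is a hypothetical one chosen for illustration, and `majority` and `wins` are helper names invented here. It assumes each member votes at most once per term and that each candidate counts votes against the configuration it currently holds.

```python
def majority(config):
    """Smallest number of votes that is more than half of the configuration."""
    return len(config) // 2 + 1

def wins(votes, config):
    """True if the candidate's vote count reaches a majority of `config`."""
    return len(votes) >= majority(config)

old_config = {"raft-member-1", "raft-member-2", "raft-member-3"}
new_config = old_config | {"raft-member-4", "raft-member-5"}

# Hypothetical vote split within a single term (an assumption, not a recorded trace).
votes_for_4 = {"raft-member-4", "raft-member-5", "raft-member-1"}
votes_for_2 = {"raft-member-2", "raft-member-3"}

# No member appears in both sets, so nobody voted twice.
no_double_vote = votes_for_4.isdisjoint(votes_for_2)

# raft-member-4 already holds the 5-node config; raft-member-2 still holds the 3-node one.
leader_4 = wins(votes_for_4, new_config)  # 3 votes, needs 3
leader_2 = wins(votes_for_2, old_config)  # 2 votes, needs 2

print(no_double_vote, leader_4, leader_2)  # True True True
```

Both candidates reach what they believe is a majority without any member voting twice, which is how two leaders can end up in the same term.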

I realize this is probably not the proper way to initiate joint consensus, so maybe the fix is just to document the proper way to do so? Incidentally, how is joint consensus supposed to be triggered?

Thanks!

@colin-scott
Contributor Author

After reading the code more closely, it looks like ChangeConfiguration messages are supposed to be the way that joint consensus is triggered.

The behavior described in the javadoc comment matches the Raft spec:

  * [ChangeConfiguration messages] can take a while to propagate, and will only be applied when the config is passed on to all nodes

However, the documentation does not match how the code actually behaves: nodes appear to apply a config as soon as they receive it, which can lead to safety violations.
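For comparison, joint consensus as described in the Raft paper requires separate majorities from both the old and the new configuration while a change is in flight. The sketch below is an illustration in Python, not akka-raft code. `has_majority`, `joint_wins`, and `subsets` are hypothetical helper names. It brute-forces every possible vote set for the 3-node to 5-node change and checks that a winner under the joint rule always shares a voter with any winner under the old configuration alone.

```python
from itertools import combinations

def majority(config):
    return len(config) // 2 + 1

def has_majority(votes, config):
    # Only votes cast by members of `config` count toward its majority.
    return len(votes & config) >= majority(config)

def joint_wins(votes, old, new):
    # Joint rule: a majority of the old config AND a majority of the new config.
    return has_majority(votes, old) and has_majority(votes, new)

def subsets(members):
    """Yield every subset of `members` as a frozenset."""
    items = sorted(members)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

old_config = frozenset({"raft-member-1", "raft-member-2", "raft-member-3"})
new_config = old_config | {"raft-member-4", "raft-member-5"}

joint_winners = [v for v in subsets(new_config) if joint_wins(v, old_config, new_config)]
old_winners = [v for v in subsets(old_config) if has_majority(v, old_config)]

# Every pair of winning vote sets shares at least one voter, so a member
# that votes once per term cannot help elect two leaders.
always_overlap = all(j & o for j in joint_winners for o in old_winners)

# A vote set with a new-config majority but only one old member does not win.
split = frozenset({"raft-member-4", "raft-member-5", "raft-member-1"})

print(always_overlap, joint_wins(split, old_config, new_config))  # True False
```

Under this rule a candidate holding the new config cannot win on votes from raft-member-{4-5} plus a single old member, which is the kind of split that immediate application of the config seems to allow.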

@colin-scott
Contributor Author

I have a replayable execution that tickles this behavior, in case that's of any use.

@ktoso
Owner

ktoso commented May 18, 2015

Hi @colin-scott, yes a reproducer would be very welcome!
I have not yet been able to resume work on this project, but I plan to do so sometime this month. At that point these reproducers and the issues you raised will be invaluable for getting the project into shape :-)

Please post the reproducer (can be a gist, in line here, or a repo - whichever works best for you).

@colin-scott
Contributor Author

Sounds good, I'll set up the reproducing scenario soon.

@colin-scott
Contributor Author

A related issue:

Suppose we send a ChangeConfiguration(raft-member-1, raft-member-2) message to 4 different nodes: raft-member-1, raft-member-2, raft-member-3, raft-member-4. This is admittedly a little strange, since raft-member-3 and 4 are not themselves included in the ChangeConfiguration message we send them.

Regardless, the current behavior is the following: raft-member-3 and raft-member-4 might both start an election, so they send RequestVote messages (only) to raft-member-1 and raft-member-2. raft-member-1 and raft-member-2 both respond with Votes! So it's possible that raft-member-3 and raft-member-4 will both be elected leader, and they can then overwrite entries on raft-member-1 and raft-member-2.

To replay this test case:

$ git clone -b raft-strange-cluster-membership git@github.com:NetSys/sts2-applications.git
$ cd sts2-applications
$ git remote add interposition git@github.com:NetSys/sts2-interposition.git
$ git subtree pull --prefix=interposition interposition master
$ git clone git@github.com:NetSys/sts2-experiments.git experiments
$ sbt assembly
$ java -d64 -Xmx15g -cp target/scala-2.11/randomSearch-assembly-0.1.jar Main 2>&1 | tee console.out

Console output here.

Messages delivered in the test case (labels of the nodes are mixed around somewhat, since I started out with a 9-node cluster and minimized the execution to a 4 node cluster):

MsgEvent(deadLetters,raft-member-4,ChangeConfiguration(StableRaftConfiguration(Set(raft-member-8, raft-member-9))))
MsgEvent(deadLetters,raft-member-8,ChangeConfiguration(StableRaftConfiguration(Set(raft-member-8, raft-member-9))))
MsgEvent(raft-member-4,raft-member-4,BasicFingerprint(BeginElection))
MsgEvent(deadLetters,raft-member-2,ChangeConfiguration(StableRaftConfiguration(Set(raft-member-8, raft-member-9))))
MsgEvent(raft-member-4,raft-member-8,BasicFingerprint((RequestVote,Term(2),raft-member-4,Term(0),0)))
MsgEvent(deadLetters,raft-member-9,ChangeConfiguration(StableRaftConfiguration(Set(raft-member-8, raft-member-9))))
MsgEvent(raft-member-2,raft-member-2,BasicFingerprint(BeginElection))
MsgEvent(raft-member-2,raft-member-9,BasicFingerprint((RequestVote,Term(2),raft-member-2,Term(0),0)))
MsgEvent(raft-member-8,raft-member-4,BasicFingerprint(VoteCandidate(Term(0))))
MsgEvent(raft-member-9,raft-member-2,BasicFingerprint(VoteCandidate(Term(1))))

At the end of these message deliveries, raft-member-2 and raft-member-4 are both elected as leader in the same term.
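The outcome of this trace can be reproduced with a small model in Python. This is a sketch under stated assumptions, not akka-raft code. It assumes each candidate counts its own vote toward a majority of the 2-member configuration, and it ignores terms and log comparison entirely. `Voter`, `run`, and the `check_membership` flag are hypothetical names, and the flag models the guard proposed in the issue title.

```python
from dataclasses import dataclass
from typing import Optional

def majority(config):
    return len(config) // 2 + 1

@dataclass
class Voter:
    name: str
    config: frozenset
    voted_for: Optional[str] = None

    def on_request_vote(self, candidate, check_membership):
        """Return True if the vote is granted (terms are ignored in this model)."""
        # Hypothetical guard from the issue title: refuse candidates that are
        # not in this member's configuration.
        if check_membership and candidate not in self.config:
            return False
        if self.voted_for is None or self.voted_for == candidate:
            self.voted_for = candidate
            return True
        return False

def run(check_membership):
    config = frozenset({"raft-member-8", "raft-member-9"})
    voters = {name: Voter(name, config) for name in config}
    # RequestVote routing as in the trace: 4 -> 8 and 2 -> 9.
    requests = [("raft-member-4", "raft-member-8"),
                ("raft-member-2", "raft-member-9")]
    leaders = []
    for candidate, target in requests:
        votes = 1  # assumption: the candidate counts its own vote
        if voters[target].on_request_vote(candidate, check_membership):
            votes += 1
        if votes >= majority(config):
            leaders.append(candidate)
    return leaders

print(run(check_membership=False))  # ['raft-member-4', 'raft-member-2']
print(run(check_membership=True))   # []
```

Without the membership check, both outside candidates reach the 2-vote majority. With it, neither gets a vote. Whether the check belongs in the voter, in the candidate (not starting an election when it is outside the configuration), or in both is left open here.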

@ktoso ktoso modified the milestone: 1.0 Aug 9, 2015