An entity on a follower could stick at WaitForReplication if the entity has a ProcessCommand in its mailbox #157

Closed
xirc opened this issue Jul 6, 2022 · 0 comments · Fixed by #158
Labels
bug Something isn't working
@xirc
Contributor

xirc commented Jul 6, 2022

Situation

Suppose an entity belonging to a leader has one or more ProcessCommand messages in its mailbox (this can happen under heavy load).

  1. A RaftActor (RaftActor A, currently the leader) that is responsible for an entity (Entity X) becomes a follower for some reason.
  2. Entity X (in the Ready state) executes its command handler for a ProcessCommand already in its mailbox.
  3. Entity X sends a Replicate message to RaftActor A and then waits in the WaitForReplication state for a replication result (ReplicationSucceeded, ReplicationFailed, or Replica).
  4. RaftActor A discards the Replicate message from Entity X because RaftActor A is now a follower.
  5. Entity X keeps waiting for a replication result until the leader on another node (RaftActor B) completes a new replication for the entity with the same ID that belongs to RaftActor B (Entity Y).
  6. Once RaftActor B completes a new replication for Entity Y, RaftActor A eventually sends a Replica message to Entity X.
  7. Entity X (in the WaitForReplication state) receives the Replica, returns to the Ready state, and unstashes all stashed messages.
  8. If Entity X had stashed more ProcessCommand messages, it repeats the behavior above.
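
The steps above can be reproduced with a small toy model. This is only a sketch in plain Python, not the library's Akka code: the class names `Entity` and `FollowerRaftActor`, the string messages, and the `simulate_stuck_entity` helper are all hypothetical and exist only to illustrate the state transitions described in this issue.

```python
from collections import deque
from dataclasses import dataclass, field


class FollowerRaftActor:
    """Hypothetical stand-in for RaftActor A after it became a follower."""

    def __init__(self):
        self.discarded = 0

    def replicate(self, entity):
        # Step 4: the follower drops Replicate, so no result is sent back.
        self.discarded += 1


@dataclass
class Entity:
    """Hypothetical stand-in for Entity X (Ready <-> WaitForReplication)."""

    state: str = "Ready"
    stash: deque = field(default_factory=deque)
    handled: list = field(default_factory=list)

    def receive(self, message, raft_actor):
        if message == "Replica":
            if self.state == "WaitForReplication":
                # Step 7: a Replica releases the entity and unstashes everything.
                self.state = "Ready"
                pending, self.stash = self.stash, deque()
                for stashed in pending:
                    self.receive(stashed, raft_actor)
            return
        if self.state == "WaitForReplication":
            # Commands that arrive while waiting are stashed.
            self.stash.append(message)
            return
        # Steps 2-3: run the command handler, then wait for a replication result.
        self.handled.append(message)
        self.state = "WaitForReplication"
        raft_actor.replicate(self)


def simulate_stuck_entity():
    follower = FollowerRaftActor()
    entity = Entity()
    entity.receive("cmd-1", follower)    # handled, Replicate is discarded
    entity.receive("cmd-2", follower)    # stashed: the entity is still waiting
    entity.receive("Replica", follower)  # steps 6-7: released by an unrelated replication
    return entity, follower


if __name__ == "__main__":
    entity, follower = simulate_stuck_entity()
    print(entity.state, entity.handled, follower.discarded)
```

In this model the entity ends up in WaitForReplication again after the Replica arrives, because the unstashed command triggers another Replicate that the follower discards (step 8).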

Possible Solution

A RaftActor that is a Follower or Candidate (that is, not the Leader) should reply to the entity with a ReplicationFailed message when it receives a Replicate message, instead of discarding it.
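
A minimal sketch of the proposed behavior, again as a toy Python model rather than the actual fix. The `RaftActor.on_replicate` and `Entity.process_command` names are hypothetical, the reply is modeled as a synchronous return value (a real leader would reply asynchronously after the event is committed), and it is an assumption of this sketch that the entity returns to Ready on ReplicationFailed.

```python
from dataclasses import dataclass, field


@dataclass
class RaftActor:
    """Hypothetical model of a RaftActor with a fixed role."""

    role: str  # "Leader", "Follower", or "Candidate"

    def on_replicate(self, event):
        if self.role != "Leader":
            # Proposed behavior: reply with a failure instead of discarding.
            return "ReplicationFailed"
        # Simplification: a real leader replies only after the event is committed.
        return "ReplicationSucceeded"


@dataclass
class Entity:
    """Hypothetical model of an entity that always gets a replication result."""

    state: str = "Ready"
    applied: list = field(default_factory=list)
    failed: list = field(default_factory=list)

    def process_command(self, command, raft_actor):
        self.state = "WaitForReplication"
        result = raft_actor.on_replicate(command)
        if result == "ReplicationSucceeded":
            self.applied.append(command)
        else:
            # Assumption: on failure the entity records it and stops waiting.
            self.failed.append(command)
        self.state = "Ready"
        return result


if __name__ == "__main__":
    for role in ("Leader", "Follower", "Candidate"):
        entity = Entity()
        result = entity.process_command("cmd-1", RaftActor(role))
        print(role, result, entity.state)
```

Because every Replicate now receives some reply, the entity in this model never stays in WaitForReplication, whatever role the RaftActor currently has.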

@xirc xirc added this to the v2.1.1 milestone Jul 15, 2022
@xirc xirc added the bug Something isn't working label Oct 28, 2022