Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix old snapshot read in consecutive snapshot synchronization #217

Merged

Conversation

xirc
Copy link
Contributor

@xirc xirc commented Oct 26, 2023

Closes #215

Changes SnapshotSyncManager such that it executes only one snapshotsynchronization in its lifecycle. That means SnapshotSyncManager doesn't reuse SnapshotStore instances. Each synchronization will be performed by a different SnapshotSyncManager instance and in serial order.

TODO:

  • Verify the issue doesn't occur by executing fault injection tests several times
  • Add unit tests

Taichi Yamakawa added 4 commits October 25, 2023 11:41
State transition will depend on callers of the state update function.
RaftActor of the source replica group of the synchronization can execute
compactions during snapshot synchronization. In such a case, the single
SnapshotSyncManager instance could perform multiple snapshot
synchronizations in serial order. If the SnapsotSyncManager does so, it
could read an old snapshot in the newer synchronization. This old
snapshot read is related to the following aspects:

* SnapshotSyncManager reads snapshots from a SnapshotStore that the
  SnapshotSyncManager spawns as its descendant. The SnapshotStore is
  a different instance from a SnapshotStore running as a descendant of
  RaftActor of the source replica group. The SnapshotStore cannot read
  new snapshots the other one wrote (by such as compaction) if it has
  already recovered from persisted data.
* SnapshotSyncManager reuses an instance of SnapshotStore among multiple
  snapshot synchronizations. It cannot read a new snapshot written by a
  compaction that happens during the synchronization.

Change SnapshotSyncManager such that it executes only one snapshot
synchronization in its lifecycle. That means SnapshotSyncManager doesn't
reuse SnapshotStore instances. Each synchronization will be performed by
a different SnapshotSyncManager instance and in serial order.
SnapshotSyncManager no longer needs the Stop messages. It can decide
immediately to stop itself in the finalizing state.
Taichi Yamakawa and others added 4 commits October 26, 2023 15:53
Use SnapshotSyncManagerFinalizingSpec instead. It will contain tests for
non-persistence features in the finalizing state.
@xirc xirc marked this pull request as ready for review October 30, 2023 06:37
@xirc xirc requested a review from negokaz October 30, 2023 06:37
Copy link
Contributor

@negokaz negokaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! 👍 I left some comments.

Copy link
Contributor

@negokaz negokaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@negokaz negokaz merged commit 88d0bb3 into master Nov 6, 2023
@negokaz negokaz deleted the fix-old-snapshot-read-in-consecutive-snapshot-synchronization branch November 6, 2023 01:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consecutive snapshot synchronizations could not synchronize new entity snapshots
2 participants