Safekeeper peer recovery preparatory patches #5118
Now available under GET /tenant/xxx/timeline/yyy for inspection.
@petuhovskiy To make this easier to digest, I am splitting it into several smaller commits -- the ones pushed here are ready for review.
1624 tests run: 1550 passed, 0 failed, 74 skipped (full report). Flaky tests (1): Postgres 14.
The comment gets automatically updated with the latest test results.
8f0ae23 at 2023-08-29T19:38:46.837Z
Slightly refactors init: load_tenant_timelines is now also async so that it can properly init the timeline, but to keep the global map lock sync we acquire it anew for each timeline. The recovery task itself is just a stub here. Part of #4875
Add derive Ord for easy comparison of <term, lsn> pairs. part of #4875
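To illustrate what deriving Ord buys for <term, lsn> pairs, here is a minimal sketch. The struct name, field names and integer widths are assumptions for illustration, not taken from the PR. The derived ordering is lexicographic in field declaration order, so term is compared first and lsn only breaks ties within the same term.

```rust
// Hypothetical type for illustration; name and field types are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct TermLsn {
    pub term: u64, // declared first, so compared first
    pub lsn: u64,  // tie-breaker within the same term
}

/// Returns the most advanced position among the given peers,
/// or None if the slice is empty.
pub fn most_advanced(peers: &[TermLsn]) -> Option<TermLsn> {
    peers.iter().copied().max()
}
```

With this in place, a higher term always wins regardless of LSN, which is the comparison a recovering safekeeper would need when picking a peer to fetch from.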
We need them for safekeeper peer recovery #4875
1686ba1 to 4f5d5b4
I wonder whether this recovery could somehow be combined with retrieving, at the compute, the WAL needed for logical replication. In principle the approach is similar: we need to send WAL up to some boundary (in this case determined by the logical replication slot) to walproposer.
Or is it better not to mix these two things?
These are not closely related. The interface for fetching WAL from safekeepers over the pg protocol has existed for a long time and can be used for logical replication as well; in fact, we already have fetching code in walproposer (which I plan to remove soon, but anyway, it is trivial). This patchset extends it so that the not-yet-committed part can also be fetched dynamically, but that is not much needed for replication, as the uncommitted part is most often still on the compute that generates it. I still think the hardest part of logical replication is the persistence of replication slots and historical snapshots...
I forgot that this PR is split into commits and reviewed all changes as usual (all together in the Files changed tab).
Overall LGTM, let's merge and deploy.
It will be used by safekeeper as well.
Instead of fixed during the start of replication. To this end, create a term_flush_lsn watch channel similar to the commit_lsn one. This allows recovery streaming to continue if new data appears.
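The idea of a watch channel carrying the latest <term, flush_lsn> can be sketched with the standard library alone. This is a stand-in under stated assumptions, not the PR's code: the `Watch` type, its methods and the `TermLsn` struct are all hypothetical names, and a real implementation would more likely use an async watch channel than a blocking Mutex and Condvar.

```rust
use std::sync::{Arc, Condvar, Mutex};

// Hypothetical <term, lsn> pair; ordering is by term first, then lsn.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct TermLsn {
    pub term: u64,
    pub lsn: u64,
}

/// Minimal watch-style cell: stores only the latest value and
/// wakes waiters whenever it is replaced.
pub struct Watch<T> {
    value: Mutex<T>,
    changed: Condvar,
}

impl<T: Copy + Ord> Watch<T> {
    pub fn new(initial: T) -> Arc<Self> {
        Arc::new(Self {
            value: Mutex::new(initial),
            changed: Condvar::new(),
        })
    }

    /// Publish a new value and wake all waiters.
    pub fn send(&self, new: T) {
        *self.value.lock().unwrap() = new;
        self.changed.notify_all();
    }

    /// Block until the stored value is strictly greater than `seen`,
    /// then return it. Returns immediately if it already is.
    pub fn wait_past(&self, seen: T) -> T {
        let guard = self.value.lock().unwrap();
        let guard = self.changed.wait_while(guard, |v| *v <= seen).unwrap();
        *guard
    }
}
```

A streaming loop built on this would send WAL up to the value it last saw, then call `wait_past` with that value, so the end position follows new data instead of being fixed when replication starts.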
4f5d5b4 to 8f0ae23
Implements #4875