Fix corruption caused by mmap flushing problems #16019

Merged (1 commit), Mar 25, 2024

@rrevans rrevans commented Mar 22, 2024

Motivation and Context

See #15933

TL;DR: three separate bugs underlie that issue, and this PR addresses all of them. See the description below.

Description

  1. Make mmap flushes synchronous. Linux may skip flushing dirty pages
    already in writeback unless data-integrity sync is requested.

  2. Change zfs_putpage to use TXG_WAIT. Otherwise dirty pages may be
    skipped due to DMU pushing back on TX assign.

  3. Add missing mmap flush when doing block cloning.

  4. While here, pass errors from putpage to writepage/writepages.
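To illustrate item 2, here is a toy model of why a no-wait assignment can leave pages dirty while a waiting assignment cannot. Everything in it (`toy_dmu`, `toy_tx_assign`, `toy_flush`, the `TOY_*` modes) is a hypothetical stand-in invented for illustration. It is not the OpenZFS API and does not reproduce the real `zfs_putpage` logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not OpenZFS code. */
enum toy_assign_mode { TOY_NOWAIT, TOY_WAIT };

struct toy_dmu {
    int pushback_left; /* how many more no-wait assigns will be refused */
};

/* Returns 0 on success, -1 if the toy DMU pushes back in no-wait mode. */
static int toy_tx_assign(struct toy_dmu *dmu, enum toy_assign_mode mode)
{
    if (dmu->pushback_left > 0) {
        if (mode == TOY_NOWAIT) {
            dmu->pushback_left--;
            return -1;
        }
        /* Model blocking until the pressure clears. */
        dmu->pushback_left = 0;
    }
    return 0;
}

/* Make one writeback pass over the pages; returns how many stay dirty. */
size_t toy_flush(struct toy_dmu *dmu, bool *dirty, size_t npages,
    enum toy_assign_mode mode)
{
    size_t still_dirty = 0;

    assert(dirty != NULL || npages == 0);
    for (size_t i = 0; i < npages; i++) {
        if (!dirty[i])
            continue;
        if (toy_tx_assign(dmu, mode) != 0) {
            /* Page skipped: it is still dirty after the "flush". */
            still_dirty++;
            continue;
        }
        dirty[i] = false;
    }
    return still_dirty;
}
```

In this model a caller that assumes the flush cleaned every page is only correct in `TOY_WAIT` mode, which is the property the PR relies on when it switches to `TXG_WAIT`.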

This change fixes corruption edge cases, but unfortunately adds synchronous ZIL flushes for dirty mmap pages to llseek and bclone operations. Avoiding these sync writes later may be possible, but it would require trickier refactoring of the writeback code.
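For readers unfamiliar with the access pattern involved, the following userland sketch shows the kind of sequence that depends on dirty mmap pages being visible to `lseek`: it dirties one page of a sparse file through a shared mapping and then calls `lseek(SEEK_DATA)` without any `msync` or `fsync`. It is not the reproducer from #15933. The function name, file size, and data offset are assumptions chosen for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define FILE_SIZE (1 << 20)    /* 1 MiB sparse file (illustrative) */
#define DATA_OFF  (512 * 1024) /* where the mmap write lands (illustrative) */

/*
 * Dirty one page of a sparse file via a shared mapping, then ask the
 * filesystem where the first data region starts. Returns that offset,
 * or -1 on error. No msync()/fsync() is issued before lseek() on purpose.
 */
off_t first_data_after_mmap_write(const char *dir)
{
    char path[4096];
    off_t result = -1;

    int n = snprintf(path, sizeof (path), "%s/mmap-seek-XXXXXX", dir);
    assert(n > 0 && (size_t)n < sizeof (path));

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the file goes away once fd is closed */

    if (ftruncate(fd, FILE_SIZE) == 0) {
        char *map = mmap(NULL, FILE_SIZE, PROT_READ | PROT_WRITE,
            MAP_SHARED, fd, 0);
        if (map != MAP_FAILED) {
            /* The data exists only as a dirty page-cache page. */
            memcpy(map + DATA_OFF, "dirty", 5);
            result = lseek(fd, 0, SEEK_DATA);
            munmap(map, FILE_SIZE);
        }
    }
    close(fd);
    return result;
}
```

On a filesystem that accounts for the dirty page, the returned offset is at or before `DATA_OFF`. A result of -1 with `errno` set to `ENXIO` would mean the file was reported as entirely a hole, which is the kind of stale answer that leads tools that copy sparse files to skip data.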

How Has This Been Tested?

Both the reproducer from #15933 and the Gentoo emerge test script pass after running for multiple hours.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Fixes: openzfs#15933

Signed-off-by: Robert Evans <[email protected]>
@amotin amotin requested a review from behlendorf March 22, 2024 13:57
@behlendorf behlendorf added the Status: Accepted Ready to integrate (reviewed, tested) label Mar 22, 2024
@behlendorf behlendorf merged commit 102b468 into openzfs:master Mar 25, 2024
22 of 25 checks passed
@ssergiienko

Will it be cherry-picked to v2.2.4?

rrevans added a commit to rrevans/zfs that referenced this pull request Mar 28, 2024
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Robert Evans <[email protected]>
Closes openzfs#15933
Closes openzfs#16019
@tonyhutter

@ssergiienko yes it will - I just opened a backport PR (#16039) so it doesn't get forgotten.

behlendorf pushed a commit that referenced this pull request Mar 30, 2024
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Sep 4, 2024
@rrevans rrevans deleted the clone_mmap branch December 23, 2024 11:47