Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expand fragmentation table to reflect larger possibile allocation sizes #16986

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

pcd1193182
Copy link
Contributor

Motivation and Context

When you are using large recordsizes in conjunction with raidz, with incompressible data, you can pretty reliably be making 21 MB allocations. Unfortunately, the fragmentation metric in ZFS considers any metaslabs with 16 MB free chunks completely unfragmented, so you can have a metaslab report 0% fragmented and be unable to satisfy an allocation. When using the segment-based metaslab weight, this is inconvenient; when using the space-based one, it can seriously degrade performance.

Description

We expand the fragmentation table to extend up to 1GB, and redefine the table size based on the actual table, rather than having a static define. We also tweak the one variable that depends on fragmentation directly.

The one caveat for this change is that on pools with small disks (less than 200GB), once a metaslab is dirtied at all it will always report fragmented. This is because at our target of 200 metaslabs, the whole metaslab is less than a gigabyte, so the largest possible free segment is less than a gigabyte. This may result in some user questions, but most users probably don't have disks that small installed. At larger sizes, the problem disappears. Users may note an increase in fragmentation when this change is released, but it doesn't actually reflect any on-disk changes, just a new measurement scale.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.

How Has This Been Tested?

Basic sanity testing only; passes the zfs test suite and zloop, and reports fragmentation correctly.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@pcd1193182 pcd1193182 changed the title Expand fragmentation table size to reflect larger possibility space for allocation sizes Expand fragmentation table to reflect larger possibile allocation sizes Jan 23, 2025
@pcd1193182 pcd1193182 self-assigned this Jan 23, 2025
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Signed-off-by: Paul Dagnelie <[email protected]>
Copy link
Member

@amotin amotin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any particular motivation to go as high as 1GB? IIRC 16MB is pretty hard block limit for ZFS now that is not going to change (soon). Sure you've shown that 16MB may be not enough, and free ranges close to it might not represent no fragmentation, since close sized allocations may produce significant amount of smaller fragments, exposing the hidden fragmentation. But I think those effects should rapidly diminish and could be neglected somewhere about 64-128MB. Also looking on logic of vdev_metaslab_set_size() it seems 512MB is the lowest metaslab size for the most of cases, which makes 128MB also a sweet spot to allow almost empty metaslabs to remain non-fragmented.

@amotin amotin added the Status: Code Review Needed Ready for review and testing label Jan 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Code Review Needed Ready for review and testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants