Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spa: make read/write queues configurable #15675

Merged
merged 1 commit into from
Dec 20, 2023

Conversation

robn
Copy link
Member

@robn robn commented Dec 15, 2023

Motivation and Context

We are finding that as customers get larger and faster machines (hundreds of cores, large NVMe-backed pools) they keep hitting relatively low performance ceilings. Our profiling work almost always finds that they're running into bottlenecks on the SPA IO taskqs. Unfortunately there's often little we can advise at that point, because there's very few ways to change behaviour without patching.

Description

This commit adds two load-time parameters zio_taskq_read and zio_taskq_write that can configure the READ and WRITE IO taskqs directly.

This achieves two goals: it gives operators (and those that support them) a way to tune things without requiring a custom build of OpenZFS, which is often not possible, and it lets us easily try different config variations in a variety of environments to inform the development of better defaults for these kind of systems.

Because tuning the IO taskqs really requires a fairly deep understanding of how IO in ZFS works, and generally isn't needed without a pretty serious workload and an ability to identify bottlenecks, only minimal documentation is provided. Its expected that anyone using this is going to have the source code there as well.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.

Further discussion

I know there's not really much appetite for even more config parameters, but as it says in above, we keep running IO queue bottlenecks at customer sites, and most don't have a way to take a patch. This would at least let us offer them something.

I'm aware that this is exposing implementation details to a degree - its the point! To that end, I'd be more than happy somehow marking this that makes it clear the operator is taking matters into their own hand. Maybe you have to set a zfs_debug flag first, or maybe it logs "dragons enabled!", or something else? Your suggestion welcome.

How Has This Been Tested?

By hand, and with our internal performance test suite. The most important thing is that when not in use, nothing changes.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Dec 19, 2023
Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've often wanted to be able to easily experiment with exactly this kind of tuning, and adding one more module option is a reasonable way to handle it. If you can just fix up the CentOS 7 and FreeBSD 13 build failures we can move this forward.

@robn robn force-pushed the spa-taskq-parameterise branch from 2c3f205 to fe3bf03 Compare December 20, 2023 03:38
We are finding that as customers get larger and faster machines
(hundreds of cores, large NVMe-backed pools) they keep hitting
relatively low performance ceilings. Our profiling work almost always
finds that they're running into bottlenecks on the SPA IO taskqs.
Unfortunately there's often little we can advise at that point, because
there's very few ways to change behaviour without patching.

This commit adds two load-time parameters `zio_taskq_read` and
`zio_taskq_write` that can configure the READ and WRITE IO taskqs
directly.

This achieves two goals: it gives operators (and those that support
them) a way to tune things without requiring a custom build of OpenZFS,
which is often not possible, and it lets us easily try different config
variations in a variety of environments to inform the development of
better defaults for these kind of systems.

Because tuning the IO taskqs really requires a fairly deep understanding
of how IO in ZFS works, and generally isn't needed without a pretty
serious workload and an ability to identify bottlenecks, only minimal
documentation is provided. Its expected that anyone using this is going
to have the source code there as well.

Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
@robn robn force-pushed the spa-taskq-parameterise branch from fe3bf03 to e7b7747 Compare December 20, 2023 10:02
@robn
Copy link
Member Author

robn commented Dec 20, 2023

Alright, pending test results that should do it. CentOS 7 was the bug in old GCC where you have to initialise arrays of complex types with {{0}} instead of {0}. The FreeBSD one was that parameter code is not really generalised at all for PARAM_CALL parameters, so I had to learn how to write that, and then spent hours down a deep rabbithole discovering that (maybe) you there's no memory allocator available in early module load, so it all has to be done on the stack. So that's another thing on my list of someday cleanups, oh well.

(ignore me; just felt an afternoon of faffing about so close to Christmas deserved a paragraph 😅)

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Dec 20, 2023
@behlendorf behlendorf merged commit 6930ecb into openzfs:master Dec 20, 2023
24 of 25 checks passed
robn added a commit to robn/zfs that referenced this pull request Jan 11, 2024
robn added a commit to robn/zfs that referenced this pull request Jan 11, 2024
robn added a commit to robn/zfs that referenced this pull request Jan 11, 2024
@robn robn mentioned this pull request Jan 11, 2024
13 tasks
robn added a commit to robn/zfs that referenced this pull request Jan 12, 2024
robn added a commit to robn/zfs that referenced this pull request Jan 12, 2024
ixhamza pushed a commit to truenas/zfs that referenced this pull request Jan 15, 2024
behlendorf pushed a commit that referenced this pull request Jan 16, 2024
behlendorf pushed a commit that referenced this pull request Jan 16, 2024
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Mar 13, 2024
We are finding that as customers get larger and faster machines
(hundreds of cores, large NVMe-backed pools) they keep hitting
relatively low performance ceilings. Our profiling work almost always
finds that they're running into bottlenecks on the SPA IO taskqs.
Unfortunately there's often little we can advise at that point, because
there's very few ways to change behaviour without patching.

This commit adds two load-time parameters `zio_taskq_read` and
`zio_taskq_write` that can configure the READ and WRITE IO taskqs
directly.

This achieves two goals: it gives operators (and those that support
them) a way to tune things without requiring a custom build of OpenZFS,
which is often not possible, and it lets us easily try different config
variations in a variety of environments to inform the development of
better defaults for these kind of systems.

Because tuning the IO taskqs really requires a fairly deep understanding
of how IO in ZFS works, and generally isn't needed without a pretty
serious workload and an ability to identify bottlenecks, only minimal
documentation is provided. Its expected that anyone using this is going
to have the source code there as well.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes openzfs#15675
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Mar 13, 2024
We are finding that as customers get larger and faster machines
(hundreds of cores, large NVMe-backed pools) they keep hitting
relatively low performance ceilings. Our profiling work almost always
finds that they're running into bottlenecks on the SPA IO taskqs.
Unfortunately there's often little we can advise at that point, because
there's very few ways to change behaviour without patching.

This commit adds two load-time parameters `zio_taskq_read` and
`zio_taskq_write` that can configure the READ and WRITE IO taskqs
directly.

This achieves two goals: it gives operators (and those that support
them) a way to tune things without requiring a custom build of OpenZFS,
which is often not possible, and it lets us easily try different config
variations in a variety of environments to inform the development of
better defaults for these kind of systems.

Because tuning the IO taskqs really requires a fairly deep understanding
of how IO in ZFS works, and generally isn't needed without a pretty
serious workload and an ability to identify bottlenecks, only minimal
documentation is provided. Its expected that anyone using this is going
to have the source code there as well.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes openzfs#15675
allanjude pushed a commit to KlaraSystems/zfs that referenced this pull request May 21, 2024
Missed in openzfs#15696, backporting openzfs#15675.

Signed-off-by: Rob Norris <[email protected]>
(cherry picked from commit 437d598)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants