This repository has been archived by the owner on Oct 16, 2020. It is now read-only.
Issue Report
Transparent Huge Pages provides real benefit to certain applications by potentially reducing TLB misses and improving performance. For other applications, it can bloat memory usage and cause performance regressions. The kernel documentation claims that [madvise] is the default behavior:
"madvise" will enter direct reclaim like "always" but only for regions
that are have used madvise(MADV_HUGEPAGE). This is the default behaviour.
https://www.kernel.org/doc/Documentation/vm/transhuge.txt
However, mm/Kconfig shows that the default behavior is actually [always]:
https://github.com/torvalds/linux/blob/master/mm/Kconfig#L385-L407
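The bracket notation used in this report ([always], [madvise]) mirrors how the kernel reports the active mode at runtime. As a minimal sketch, assuming the standard sysfs file /sys/kernel/mm/transparent_hugepage/enabled is present (it is absent when THP is not compiled in), the mode in effect on a host could be checked roughly like this. The function names are illustrative only and do not come from any existing tool:

```python
from pathlib import Path
from typing import Optional

# Standard sysfs location for the THP mode; absent if THP is not compiled in.
THP_ENABLED_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text: str) -> Optional[str]:
    """Return the bracketed (active) mode from a line such as
    'always [madvise] never', or None if no token is bracketed."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def current_thp_mode(path: Path = THP_ENABLED_PATH) -> Optional[str]:
    """Read the active THP mode, or None if the file cannot be read."""
    try:
        return parse_thp_mode(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("active THP mode:", current_thp_mode())
```

Running this on an affected host would be one way to confirm which of the two defaults (documentation or Kconfig) actually applies there.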
By default, CoreOS enables transparent huge pages but does not specify whether to use always or madvise, so the kernel's Kconfig default, always, is chosen. Unfortunately, setting THP to [always] causes issues with a variety of software:
splunk: https://docs.splunk.com/Documentation/Splunk/7.3.2/ReleaseNotes/SplunkandTHP
mongodb: https://docs.mongodb.com/manual/tutorial/transparent-huge-pages/
couchbase: https://docs.couchbase.com/server/current/install/thp-disable.html
oracle: https://blogs.oracle.com/linux/performance-issues-with-transparent-huge-pages-thp
nuodb: http://doc.nuodb.com/4.0/Content/OpenShift-disable-THP.htm
Go runtime: golang/go#8832
jemalloc: https://blog.digitalocean.com/transparent-huge-pages-and-alternative-memory-allocators/
node.js: nodejs/node#11077
tcmalloc: gperftools/gperftools#1073
More recently, we've also seen memory usage bloat in Ceph (using tcmalloc) when THP is set to always, potentially resulting in OOM when running inside containers. There are various ways to work around this at the application level, including calling madvise() with MADV_NOHUGEPAGE or setting a prctl flag. Requiring these workarounds to disable THP for a given application is counter-intuitive for several reasons:
It puts the onus on developers to explicitly stop the kernel from engaging in sub-optimal behavior.
It's incredibly confusing to have a system-wide default that claims to "always" enable a setting that many applications may or may not silently disable through workarounds.
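To illustrate what the application-level workarounds mentioned above involve, here is a rough sketch of both: a per-region MADV_NOHUGEPAGE hint and the process-wide prctl flag. It assumes Linux and Python 3.8 or later (for mmap.madvise). The constant 41 is the value of PR_SET_THP_DISABLE in linux/prctl.h, and the helper names are hypothetical. Both calls are treated as best-effort, since either can fail on kernels built without THP:

```python
import ctypes
import mmap

# Value of PR_SET_THP_DISABLE from <linux/prctl.h> (available since Linux 3.15).
PR_SET_THP_DISABLE = 41


def map_without_thp(length: int) -> mmap.mmap:
    """Create an anonymous mapping and, where supported, mark it
    MADV_NOHUGEPAGE so the kernel does not back it with huge pages."""
    region = mmap.mmap(-1, length)
    advice = getattr(mmap, "MADV_NOHUGEPAGE", None)  # Linux-only constant
    if advice is not None:
        try:
            region.madvise(advice)
        except OSError:
            pass  # e.g. EINVAL when THP is not built into the kernel
    return region


def disable_thp_for_process() -> bool:
    """Set the per-process THP-disable flag via prctl(2).
    Returns True on success, False if the call is unavailable or fails."""
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        prctl = libc.prctl
        # prctl takes unsigned long arguments; unused ones must be zero.
        prctl.argtypes = [ctypes.c_int] + [ctypes.c_ulong] * 4
        prctl.restype = ctypes.c_int
        return prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0) == 0
    except (OSError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    print("THP disabled for process:", disable_thp_for_process())
    buf = map_without_thp(4 * 1024 * 1024)
    buf.close()
```

The per-region hint only covers mappings the application creates itself, while the prctl flag covers the whole process. Either way, every affected application has to carry code like this, which is the burden described in the points above.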
Finally, when another prominent distribution faced a similar choice, they ran stream and malloc tests that showed improvement at various allocation sizes when THP was disabled. Ultimately, that led them to switch to madvise with no apparent performance regressions:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1703742
Bug
In coreos-overlay, THP is set:
https://github.com/coreos/coreos-overlay/blob/master/sys-kernel/coreos-modules/files/amd64_defconfig-4.19#L216
But making madvise default also requires setting:
Environment
What hardware/cloud provider/hypervisor is being used to run Container Linux?
Expected Behavior
The current behavior is expected when THP is set to [always].
Actual Behavior
See:
https://docs.google.com/spreadsheets/d/1Xl3nWapi7ZKEmpnsSHHWO96iopEG0hK6GeDWhWKSfDo/edit?usp=sharing
Reproduction Steps
Other Information
https://unix.stackexchange.com/questions/495816/which-distributions-enable-transparent-huge-pages-for-all-applications
https://www.percona.com/blog/2019/03/06/settling-the-myth-of-transparent-hugepages-for-databases/
https://blog.nelhage.com/post/transparent-hugepages/
https://alexandrnikitin.github.io/blog/transparent-hugepages-measuring-the-performance-impact/
https://dl.acm.org/citation.cfm?id=3359640