Use uv_thread_getaffinity when --threads=auto #42340
Conversation
@maleadt @staticfloat Would this potentially be useful for PkgEval?
I think this PR alone is not quite enough for resource-limiting in CI, since there has to be some scheduler that sets the affinity of the scheduled jobs. I don't know how resource management works in PkgEval, though. I was also kind of hoping that it could be a part of the solution to #42242, provided that the buildkite runner can set the affinity.
PkgEval was already updating JULIA_NUM_THREADS accordingly (https://github.com/JuliaCI/PkgEval.jl/blob/b5fda0997792d02115e15a53f3d73d30dc7b87a9/src/run.jl#L144-L148), so this will allow removing that code.
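For illustration only (this is a hypothetical sketch, not the PkgEval code linked above): a harness that allots N CPUs to a job currently has to thread the count through the environment, roughly like this, whereas with this PR it could restrict the affinity and pass `--threads=auto` instead.

```julia
# Hypothetical CI-harness sketch; the function name and structure are made up here.
function launch_job(cmd::Cmd, ncpus::Integer)
    # Pre-PR workaround: tell Julia explicitly how many threads it may use.
    withenv("JULIA_NUM_THREADS" => string(ncpus)) do
        run(cmd)
    end
end

# With this PR, restricting the affinity (taskset/cgroups) and running
# `julia --threads=auto` would make the JULIA_NUM_THREADS plumbing unnecessary.
launch_job(`julia -e 'println(Threads.nthreads())'`, 4)
```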
cc: @carstenbauer, who looked into some adjacent stuff. How does this interact with the BLAS threads?
Doesn't seem to work with LIKWID / LIKWID.jl in particular.
(getaffinity.jl contains the code in the OP)
My assumption is that this is not the responsibility of Julia, but rather of the CI environment; e.g. we tell the buildkite agents something like "limit yourselves to these 16 cores" and just divide up the cores of the machine appropriately.
So, from https://github.com/RRZE-HPC/likwid/wiki/Likwid-Pin, it sounds like likwid-pin pins threads by overloading the pthread API (via LD_PRELOAD).
Ahhh... this is interesting. So they are monkey-patching pthread? I can imagine that it's very effective when you don't control all the binaries, but I don't know what can/should be done here. Detecting likwid-pin seems to require hard-coding their monkey-patching strategy, and I prefer not to do anything like that. Magic doesn't compose.
Yes, I agree (that's why I mentioned "a part of"). This PR is about playing nice with the surrounding environment.
The way I see it (as a non-expert, thus in simple words), we are trying to be smart with this PR and figure out the "advised" number of threads automatically to bound nthreads. This is great when it works, in which case I'm all for it.
However, detecting it in many environments isn't really enough to enforce nthreads bounding, is it?
I agree, we shouldn't hard-code / special-case anything.
Also, related to this, another concern may be that it does not work on macOS.
Maybe this is fixable in libuv, since it looks like there is a macOS API for this.
I don't think that API is "enabled" (ml_get_max_affinity_sets is hard-coded to 0, per https://github.com/apple/darwin-xnu/search?q=ml_get_max_affinity_sets), but it is unclear, since it is documented.
Fair enough. I'm not particularly focused on LIKWID. I guess my point is more general: IMO, an explicit request by the user (e.g. `julia -t N`) should be honored as-is.
Yes, definitely. This PR only changes the behavior of `auto`; an explicit thread count is still honored:

```
$ taskset --cpu-list 0,2,4,8 julia -t17 -e '@show Threads.nthreads()'
Threads.nthreads() = 17
```

But now that I look at the PR title, it's not really clear from it. Sorry about the confusion!
I'm assuming that, in addition to applying when the `--threads=auto` option is passed, this also applies when `JULIA_NUM_THREADS=auto` is set?
Yeah:

```
$ JULIA_NUM_THREADS=auto taskset --cpu-list 0,2,4,8 julia-dev -e '@show Threads.nthreads()'
Threads.nthreads() = 4
$ JULIA_NUM_THREADS=11 taskset --cpu-list 0,2,4,8 julia-dev -e '@show Threads.nthreads()'
Threads.nthreads() = 11
```

I only changed the function that resolves `auto`, so an explicitly specified number is not affected.
It looks like OpenBLAS is already doing the right thing when the affinity is set (checked with Julia 1.6).
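For reference, a quick way to check this yourself (not the output from the report above) is something like the following, run under a restricted mask such as `taskset --cpu-list 0,2,4,8`; it assumes `LinearAlgebra.BLAS.get_num_threads`, which is available on recent Julia versions.

```julia
# Compare what the system, Julia, and OpenBLAS each report under a restricted
# affinity mask; the concrete numbers depend on the mask and the Julia version.
using LinearAlgebra

@show Sys.CPU_THREADS          # CPUs reported by the system
@show Threads.nthreads()       # Julia threads actually started
@show BLAS.get_num_threads()   # threads OpenBLAS decided to use
```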
I am not sure we want `Sys.CPU_THREADS` to change dynamically at runtime. Perhaps we should make `addprocs` and `--threads=auto` directly check this affinity in some way (cached from startup)?
I do want to cache affinity on startup for #42338 though.
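A minimal sketch of the "cache it at startup" idea, with a stand-in query function; none of these names come from the PR or from Base.

```julia
module AffinityCache

# Stand-in for the real affinity query (see the libuv-based sketch near the end
# of this thread); here it just falls back to the system CPU count.
count_affinity() = Sys.CPU_THREADS

const EFFECTIVE_CPUS = Ref(0)

function __init__()
    # Query once at startup so consumers like `--threads=auto` or `addprocs`
    # see a stable value instead of one that can change at runtime.
    EFFECTIVE_CPUS[] = count_affinity()
end

end # module
```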
Actually, there's something wrong with linux64 (and FreeBSD) that I don't understand and can't reproduce locally. I'll turn this into a draft to avoid merging it accidentally.
e.g., https://build.julialang.org/#/builders/70/builds/1821/steps/5/logs/stdio shows the failure.
But in other runs (e.g., https://build.julialang.org/#/builders/70/builds/1791/steps/5/logs/stdio) the worker process seems to simply die without useful information.
The HEAD cdd8223 now passes tester_linux64: https://build.julialang.org/#/builders/70/builds/1833. But I'm not sure why these commits would fix the issue: 9a3cfd9...e8f469a
I'm hitting a failure during the build on macOS after this change.
Co-authored-by: Jameson Nash <[email protected]>
close #35787

This patch detects the "advised" bound of CPU resources using `uv_thread_getaffinity`. For example, we can now use `-tauto` with `taskset`.

This seems to be very useful in HPC environments, since you can just put `--threads=auto` in the batch script and stop worrying about the inconsistency between Julia's number of threads and the requested number of CPUs. It is also useful in containers like Docker, which can use cgroup to designate the available CPUs, which in turn is reflected in the affinity setting.

TODOs

- `cpumask` setting via `run(::Cmd)`? (Add `setcpuaffinity(cmd, cpus)` for setting CPU affinity of subprocesses #42469)

`uv_thread_getaffinity` seems to be able to detect CPU settings in many environments. I checked that the following code can detect the number of CPUs configured by `taskset`, systemd's `AllowedCPUs` (cgroup), and Slurm:
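A minimal sketch of such a check, calling into the libuv bundled with Julia via `ccall`; the signatures and types below are my assumption rather than the exact code from the OP, and they presume the bundled libuv exposes `uv_thread_getaffinity` (which is what this PR relies on).

```julia
# Count the CPUs in the calling thread's affinity mask via libuv.
function cpumask()
    masksize = ccall(:uv_cpumask_size, Cint, ())
    masksize > 0 || error("uv_cpumask_size failed: $masksize")
    mask = zeros(Cchar, masksize)
    # uv_thread_t is pthread_t on Linux, which fits in a Culong there.
    tid = Ref(ccall(:uv_thread_self, Culong, ()))
    err = ccall(:uv_thread_getaffinity, Cint,
                (Ptr{Culong}, Ptr{Cchar}, Csize_t),
                tid, mask, masksize)
    err == 0 || error("uv_thread_getaffinity failed: $err")
    return mask
end

navailable() = count(!iszero, cpumask())

@show navailable()
```

Running this under `taskset --cpu-list 0,2,4,8 julia getaffinity.jl` should report 4 on Linux; as discussed above, macOS does not support per-thread affinity, so the call fails there.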