Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add setcpuaffinity(cmd, cpus) for setting CPU affinity of subprocesses #42469

Merged
merged 23 commits into from
Jan 10, 2022

Conversation

tkf
Copy link
Member

@tkf tkf commented Oct 2, 2021

While working on implementing tests for #42340, I thought it'd make sense to add a Julia API to spawn processes with an explicit CPU affinity since libuv already can do this. It may also be useful for process-based parallelism libraries like Distributed and Dagger.

So, this patch adds a new API:

setcpuaffinity(original_command::Cmd, cpus) -> command::Cmd

Set the CPU affinity of the command by a list of CPU IDs (1-based) cpus. Passing cpus = nothing means to unset the CPU affinity if the original_command has any.

This is supported on Unix and Windows but not in macOS.

Example:

julia> run(setcpuaffinity(`sh -c 'taskset -p $$'`, [1, 2, 5]));
pid 2273's current affinity mask: 13

julia> 0b010011
0x13

We can now remove taskset from the test dependency.

Side notes: The behavior of cpus = nothing with JULIA_EXCLUSIVE=1 is currently badly defined/implemented #42338. So, we'd need to be careful when documenting its behavior.

What do you think? Is it an appropriate API for Julia Base?

@tkf tkf added needs docs Documentation for this change is required needs news A NEWS entry is required for this change labels Oct 2, 2021
base/cmd.jl Outdated Show resolved Hide resolved
base/cmd.jl Outdated Show resolved Hide resolved
@tkf tkf changed the title RFC: add setcpus(cmd, cpus) for setting CPU affinity of subprocesses RFC: add setcpuaffinity(cmd, cpus) for setting CPU affinity of subprocesses Oct 2, 2021
@tkf tkf added Buildkite - retry failed jobs and removed needs docs Documentation for this change is required needs news A NEWS entry is required for this change Buildkite - retry failed jobs labels Oct 2, 2021
julia> run(setcpuaffinity(`sh -c 'taskset -p \$\$'`, [1, 2, 5]));
pid 2273's current affinity mask: 13

julia> 0b010011
Copy link
Member

@KristofferC KristofferC Oct 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic required to go from the first command to understand the point of the second feels a bit large here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, good point. I added a bit more explanations in the latest commit.

test/show.jl Outdated
Comment on lines 2329 to 2335
@testset "Cmd" begin
@test sprint(show, `true`) == "`true`"
@test sprint(show, setenv(`true`, "A" => "B")) == """setenv(`true`,["A=B"])"""
@test sprint(show, setcpuaffinity(`true`, [1, 2])) == "setcpuaffinity(`true`,[1, 2])"
@test sprint(show, setenv(setcpuaffinity(`true`, [1, 2]), "A" => "B")) ==
"""setenv(setcpuaffinity(`true`,[1, 2]),["A=B"])"""
end
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find where show(::IO, ::Cmd) was tested. It looks like the original commit dee7f6c didn't add the test for show with setenv. So, I just threw this in for completeness.

@tkf tkf changed the title RFC: add setcpuaffinity(cmd, cpus) for setting CPU affinity of subprocesses Add setcpuaffinity(cmd, cpus) for setting CPU affinity of subprocesses Oct 14, 2021
@tkf
Copy link
Member Author

tkf commented Oct 14, 2021

@vtjnash Can you revisit the above coment? Is it good to go?

Copy link
Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some suggestions

base/cmd.jl Outdated Show resolved Hide resolved
base/cmd.jl Outdated Show resolved Hide resolved
base/cmd.jl Outdated Show resolved Hide resolved
base/cmd.jl Outdated Show resolved Hide resolved
base/cmd.jl Outdated
Comment on lines 326 to 327
setcpuaffinity(cmd::Cmd, ::Nothing) = Cmd(cmd; cpus = nothing)
setcpuaffinity(cmd::Cmd, cpus) = Cmd(cmd; cpus = collect(Int, cpus))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
setcpuaffinity(cmd::Cmd, ::Nothing) = Cmd(cmd; cpus = nothing)
setcpuaffinity(cmd::Cmd, cpus) = Cmd(cmd; cpus = collect(Int, cpus))
setcpuaffinity(cmd::Cmd, ::Nothing) = Cmd(cmd; cpus = nothing)
setcpuaffinity(cmd::Cmd, cpus::AbstractVector{Bool}) = Cmd(cmd; cpus = findall(cpus))
setcpuaffinity(cmd::Cmd, cpus::AbstractVector) = Cmd(cmd; cpus = convert(Vector{UInt16}, cpus))
setcpuaffinity(cmd::Cmd, cpus) = setcpuaffinity(cmd, collect(cpus)::AbstractVector)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm against dispatching on Bool element type to change the meaning (in general). It will change the behavior between cpus = Bool[true] and cpus = Integer[true] (or Any[true]). I prefer to add a distinct function like setcpuaffinitymask or maybe a Mask type for supporting this.

test/print_process_affinity.jl Outdated Show resolved Hide resolved
test/print_process_affinity.jl Outdated Show resolved Hide resolved
test/print_process_affinity.jl Outdated Show resolved Hide resolved
test/show.jl Outdated Show resolved Hide resolved
test/print_process_affinity.jl Outdated Show resolved Hide resolved
test/threads.jl Outdated Show resolved Hide resolved
test/threads.jl Outdated Show resolved Hide resolved
@tkf
Copy link
Member Author

tkf commented Oct 15, 2021

@vtjnash Can you check my comments above?

Copy link
Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know what happened on FreeBSD? The CI logs make it seem like it hung in the precompile tests, before getting to testing the new code.

test/print_process_affinity.jl Outdated Show resolved Hide resolved
test/print_process_affinity.jl Outdated Show resolved Hide resolved
test/threads.jl Outdated Show resolved Hide resolved
@tkf
Copy link
Member Author

tkf commented Nov 10, 2021

I have no idea what's happening in FreeBSD but I think the CI was older than the recent fixes for FreeBSD. I merged the master branch to see if it shows something new.

@tkf
Copy link
Member Author

tkf commented Nov 10, 2021

In win64 we are getting

ERROR: LoadError: IOError: getaffinity: bad file descriptor (EBADF)
Stacktrace:
 [1] uv_error
   @ .\libuv.jl:97 [inlined]
 [2] uv_thread_getaffinity()
   @ Main C:\buildbot\worker-tabularasa\tester_win64\build\share\julia\test\print_process_affinity.jl:18
 [3] print_process_affinity()
   @ Main C:\buildbot\worker-tabularasa\tester_win64\build\share\julia\test\print_process_affinity.jl:25
 [4] top-level scope
   @ C:\buildbot\worker-tabularasa\tester_win64\build\share\julia\test\print_process_affinity.jl:30
in expression starting at C:\buildbot\worker-tabularasa\tester_win64\build\share\julia\test\print_process_affinity.jl:29

(which is the same as #42469 (comment) but with a better error message).

I guess this requires libuv/libuv#3357 too.

@tkf
Copy link
Member Author

tkf commented Dec 16, 2021

Now that #43219 pulls in the fix libuv/libuv#3357, the test works in Windows CI.

@vtjnash Good to go?

@vchuravy vchuravy requested a review from vtjnash January 6, 2022 17:57
base/cmd.jl Outdated Show resolved Hide resolved
test/threads.jl Show resolved Hide resolved
@vchuravy vchuravy added this to the 1.8 milestone Jan 6, 2022
@vtjnash vtjnash merged commit 7b62896 into JuliaLang:master Jan 10, 2022
@tkf tkf deleted the cpumask branch January 17, 2022 04:22
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Feb 22, 2022
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Mar 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants