[BUG] Core Affinity doesn't seem to work (Or Reporter wrong?) #1812

Open
FabianSchuetze opened this issue Jul 11, 2024 · 5 comments

@FabianSchuetze

Describe the bug
I would like to run the benchmark on a particular CPU core. The docs say:

2. Set the benchmark program's task affinity to a fixed cpu.  For example:
   ```sh
   taskset -c 0 ./mybenchmark
   ```

However, when I run the basic_test app, I see the following:

build git:(main) taskset -c 0 ./test/basic_test
2024-07-11T11:29:31+02:00
Running ./test/basic_test
Run on (24 X 4700 MHz CPU s)
CPU Caches:
  L1 Data 48 KiB (x12)
  L1 Instruction 32 KiB (x12)
  L2 Unified 1280 KiB (x12)
  L3 Unified 25600 KiB (x1)
Load Average: 5.07, 4.92, 2.97
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
-----------------------------------------------------------------------------------------
Benchmark                                               Time             CPU   Iterations
-----------------------------------------------------------------------------------------
BM_empty                                            0.327 ns        0.325 ns   2148747493
BM_empty/threads:24                                 0.296 ns        0.328 ns   2130524928
...

I think it might be the reporting that is wrong here. Looking at top, I can verify that only core 0 is used. The code in [reporter](https://github.com/google/benchmark/blob/main/src/reporter.cc#L49C29-L49C37) seems to use a static number of cores.

System
Which OS, compiler, and compiler version are you using:

  • OS: ubuntu 22.04
  • Compiler and version: gcc 11.4

To reproduce
Steps to reproduce the behavior:

  1. sync to commit : ea71a14891474943fc1f34d359f9e0e82476ffe1
  2. cmake: cmake -D BENCHMARK_DOWNLOAD_DEPENDENCIES=1 -S . -B build
  3. make: cmake --build build/ -j20
  4. taskset -c 0 ./build/test/basic_test

Expected behavior
I would expect the test to run only on core 0 and the output to be:
Run on (1 X 4700 MHz CPU s) instead of Run on (24 X 4700 MHz CPU s)
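
For reference, the number this report expects would have to come from the process's affinity mask rather than from the machine's CPU count. Below is a minimal Linux-only sketch of that difference, assuming glibc's `sched_getaffinity` and `CPU_COUNT`. The names `OnlineCpuCount` and `AllowedCpuCount` are made up for illustration and are not part of benchmark.

```cpp
#include <sched.h>   // sched_getaffinity, cpu_set_t, CPU_ZERO, CPU_COUNT (Linux/glibc)
#include <unistd.h>  // sysconf, _SC_NPROCESSORS_ONLN

#include <cassert>

// Hypothetical helper: CPUs currently online on the machine.
long OnlineCpuCount() { return sysconf(_SC_NPROCESSORS_ONLN); }

// Hypothetical helper: CPUs the calling thread is allowed to run on,
// or -1 if the affinity mask could not be read (for example on a system
// with more CPUs than a fixed-size cpu_set_t can describe).
int AllowedCpuCount() {
  cpu_set_t mask;
  CPU_ZERO(&mask);
  // A pid of 0 means "the calling thread".
  if (sched_getaffinity(0, sizeof(mask), &mask) != 0) return -1;
  const int count = CPU_COUNT(&mask);
  // A thread that is running always has at least one CPU in its mask.
  assert(count >= 1);
  return count;
}
```

When launched as `taskset -c 0 ./program`, `AllowedCpuCount()` should return 1, while `OnlineCpuCount()` normally still reports every online CPU on the machine.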

@LebedevRI
Collaborator

Doesn't core affinity only ensure that the main thread stays on the same CPU,
not that the main thread is unable to start new threads?

https://man7.org/linux/man-pages/man1/taskset.1.html

       The taskset command is used to set or retrieve the CPU affinity
       of a running process given its pid, or to launch a new command
       with a given CPU affinity. CPU affinity is a scheduler property
       that "bonds" a process to a given set of CPUs on the system. The
       Linux scheduler will honor the given CPU affinity and the process
       will not run on any other CPUs.
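
On the question of new threads: on Linux, a thread created with `pthread_create` (which `std::thread` uses there) inherits a copy of its creator's affinity mask, so threads started after `taskset` stay on the same CPU set unless they change it themselves. Here is a small sketch that checks this, assuming Linux with glibc. The function names are illustrative only.

```cpp
#include <sched.h>  // sched_getaffinity, sched_setaffinity, CPU_* macros (Linux/glibc)

#include <cassert>
#include <thread>

// Illustrative helper: CPUs the calling thread may run on, or -1 on error.
static int CallingThreadCpuCount() {
  cpu_set_t mask;
  CPU_ZERO(&mask);
  if (sched_getaffinity(0, sizeof(mask), &mask) != 0) return -1;
  return CPU_COUNT(&mask);
}

// Pins the calling thread to the first CPU it is currently allowed to use
// (similar in effect to taskset -c N), then starts a worker thread and
// returns the number of CPUs in the worker's mask, or -1 on error.
int WorkerCpuCountAfterPinning() {
  cpu_set_t current;
  CPU_ZERO(&current);
  if (sched_getaffinity(0, sizeof(current), &current) != 0) return -1;

  int first = -1;
  for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
    if (CPU_ISSET(cpu, &current)) {
      first = cpu;
      break;
    }
  }
  assert(first >= 0);  // a running thread has at least one allowed CPU

  cpu_set_t pinned;
  CPU_ZERO(&pinned);
  CPU_SET(first, &pinned);
  if (sched_setaffinity(0, sizeof(pinned), &pinned) != 0) return -1;

  int worker_cpus = -1;
  // The worker is created after pinning, so it starts with the pinned mask.
  std::thread worker([&worker_cpus] { worker_cpus = CallingThreadCpuCount(); });
  worker.join();
  return worker_cpus;
}
```

If inheritance works as described, `WorkerCpuCountAfterPinning()` returns 1. Older toolchains may need `-pthread` when building this.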

@FabianSchuetze
Author

I agree.

What's the meaning of Run on (24 X 4700 MHz CPU s) then?

So that line does not reflect any threading decision made by benchmark, and instead just enumerates the CPU cores of the test system? Likewise, if taskset is not used, Run on (24 X 4700 MHz CPU s) does not indicate that the benchmark runs in parallel on different cores.

@LebedevRI
Collaborator

I think so, yes.
https://github.com/google/benchmark/blob/main/src/sysinfo.cc seems to temporarily unset core affinity to read CPU frequencies, but I don't think it ever reports what the actual affinity is.
But again, I'm not sure what happens with any extra threading (either libbenchmark-induced, or in the snippet-under-measurement).

@dmah42
Member

dmah42 commented Jul 12, 2024

we could fix the PrintBasicContext section to report on the number of CPUs used if we got the current affinity somewhere i guess?

@FabianSchuetze
Author

FabianSchuetze commented Jul 12, 2024

That would be wonderful, but I wonder if that solves only one-half of the issue?

If no affinity is set (0xFFFFFFF is returned), should the benchmark report that it runs on all cores? I don't think the benchmark is scheduled to run in parallel on different cores, is it?

However, particularly when the system is a hybrid architecture and consists of "performance" and "efficient" cores, reporting the affinity is useful, I think.
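
One possible shape for such a report, purely as a sketch: keep the existing machine-wide count and mention affinity only when it narrows the usable set, so the unrestricted case reads exactly as it does today. `DescribeCpus` and the wording of the suffix are hypothetical and are not benchmark's API or output. The allowed count is assumed to come from something like `sched_getaffinity`.

```cpp
#include <cassert>
#include <string>

// Hypothetical formatter, not part of google/benchmark.
// total_cpus:   CPUs on the machine (what the context line shows today).
// allowed_cpus: CPUs in the process's affinity mask, or <= 0 if unknown.
// mhz:          nominal CPU frequency in MHz.
std::string DescribeCpus(int total_cpus, int allowed_cpus, int mhz) {
  assert(total_cpus > 0);
  std::string line = "Run on (" + std::to_string(total_cpus) + " X " +
                     std::to_string(mhz) + " MHz CPU s)";
  // Only mention affinity when it actually restricts the process.
  if (allowed_cpus > 0 && allowed_cpus < total_cpus) {
    line += ", " + std::to_string(allowed_cpus) + " allowed by affinity";
  }
  return line;
}
```

With the numbers from this issue, `DescribeCpus(24, 1, 4700)` would give `Run on (24 X 4700 MHz CPU s), 1 allowed by affinity`. On a hybrid machine, listing the allowed CPU indices instead of only counting them would probably be more informative.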
