-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make PauseTiming()
and ResumeTiming()
per thread.
#286
Conversation
❌ Build benchmark 411 failed (commit 3668875662 by @EricWF) |
❌ Build benchmark 412 failed (commit 13b4a6c641 by @EricWF) |
❌ Build benchmark 413 failed (commit b6f58fa5a2 by @EricWF) |
@@ -861,19 +797,17 @@ RunBenchmark(const benchmark::internal::Benchmark::Instance& b, | |||
thread.join(); | |||
} | |||
for (std::size_t ti = 0; ti < pool.size(); ++ti) { | |||
pool[ti] = std::thread(&RunInThread, &b, iters, static_cast<int>(ti), &total); | |||
pool[ti] = std::thread(&RunInThread, &b, iters, | |||
static_cast<int>(ti + 1), &manager); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this an existing bug or are you incrementing thread_id for some reason related to the PR?
with the explicit RunInThread 0 below, should this loop start with ti = 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it so the main thread always runs as thread 0
for multi-threaded benchmarks. Previously we just slept the main thread.
the direction looks great. it's very clean and clearly more correct. |
❌ Build benchmark 414 failed (commit a6ab48a224 by @EricWF) |
Unfortunately this patch is starting to get big. I had to fix a bunch of TSAN race conditions, including existing ones in CHECK and VLOG. However I have managed to remove all (?) of the global state used to run the benchmarks. |
|
||
// TODO(ericwf): support MallocCounter. | ||
//static benchmark::MallocCounter *benchmark_mc; | ||
Mutex& getBenchmarkMutex() const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you do a format/style pass at some point? GetBenchmarkMutex, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll do it right now if you clarify what the style should be (Other than FunctionNameAllUpper).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think that's all it is.. clang-format Google-style should take care of the rest (there's some extra indents around the place).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we check in a .clang-format file to make formatting easier?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done :)
❌ Build benchmark 415 failed (commit a0eac36099 by @EricWF) |
this lgtm. have you run a comparison for all the benchmark tests to see how the timing changes? |
❌ Build benchmark 417 failed (commit 6e0eb519be by @EricWF) |
❌ Build benchmark 418 failed (commit 92aefcfe62 by @EricWF) |
I've done some. The results for |
❌ Build benchmark 419 failed (commit c3a8866d99 by @EricWF) |
❌ Build benchmark 420 failed (commit 7d3644d3cf by @EricWF) |
❌ Build benchmark 421 failed (commit 5867586ae3 by @EricWF) |
❌ Build benchmark 422 failed (commit 69e69bd07e by @EricWF) |
❌ Build benchmark 423 failed (commit ebc1b81f5c by @EricWF) |
❌ Build benchmark 424 failed (commit 809a62661c by @EricWF) |
❌ Build benchmark 425 failed (commit 3b22deabfb by @EricWF) |
I have now. The results are very close to the same except for on threaded benchmarks. Surprisingly threaded benchmarks that don't use PauseTiming still benefit, sometimes greatly. Previously all threads would have to complete the KeepRunning loop before the timer could be paused, this essentially made each benchmark as slow as its slowest thread. This added as much as 30% to the real time measurements. Now that has been fixed. |
wonderful! |
Currently we time benchmarks using a single global timer that tracks per-process CPU usage. Pausing and resuming this timer have to act as a barrier to all threads. This has crippling effects on multi-threaded benchmarks. If you pause every iterator you synchronize the entire benchmark. It's effectively no longer multi-threaded.
This patch changes to a per-thread timer. Instead of measuring process CPU time we sum thread CPU time and we pause on a per-thread basis.
Below are comparison of the new and old results from
basic_test.cc
. Note that theBM_spin_pause_during
test get 95% faster.There's still work to do on this, but I was hoping for initial feedback on the direction.