Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamically detect PMU capabilities through libpfm #1380

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

damageboy
Copy link

@damageboy damageboy commented Apr 2, 2022

Dynamically detect PMU capabilities through libpfm

  • Instead of allowing for up to 3 counters, libpfm's internal
    capabilities of reporting PMU info are used to manage a per-PMU
    "registry" and dynamically allocate "slots" according to the specific
    counters requested.
  • per-PMU information is obtained, where each PMU reports its own
    capabilities in the form of fixed/non-fixed counter limits.
  • In this PR/commit, it is still impossible to get more detailed
    (x86-only) counter information in terms of fixed/non-fixed counter
    association, due to what seems to be a lack of API surface on libpfm
    itself: https://sourceforge.net/p/perfmon2/mailman/message/37631173/
  • The maximal number of counters is bumped from 3 to 63, which together
    with the current padding "scheme" means we pre-allocate/inlline up-to
    64 counter slots (64-bits each) per measurement instance
  • Closes [FR] Support Dynamic PMU detection #1377

@google-cla
Copy link

google-cla bot commented Apr 2, 2022

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

For more information, open the CLA check for this pull request.

@damageboy damageboy force-pushed the main branch 3 times, most recently from 87ee46b to aa823ed Compare April 2, 2022 07:43
Copy link
Member

@dmah42 dmah42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also check the format issues please. use clang-format, ideally.

const char* desc_;
std::unordered_map<int, uint64_t> counter_ids_;
std::unordered_map<int, uint64_t> fixed_counter_ids_;
uint64_t available_counters_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://google.github.io/styleguide/cppguide.html#Integer_Types

for pretty good reason, we try to stick to int rather than uint64_t. obviously if 64-bit is required that is fine, but please don't use unsigned for these. instead used signed and check for over/underflow instead.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, in this case, I'm just trying to stick to whatever libpfm does:

typedef struct {
	...
	uint64_t		code;	/* event raw code (not encoding) */
    ...
} pfm_event_info_t;

Do you prefer something else given their choice of types?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the map should stay as uint64_t as we have no way to know what they're using for ids (they could be 64-bit UUID or something). the counters should change to int though.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this introduces a warning about comparting .size() which returns a size_t to int and the possible loss of integer precision. Do you want me to cast the available_counters_ to size_t for the comparison, or just leave as is?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ideally, we'd check the returned .size() is within bounds for int(64_t) and then cast to int(64_t).

i'm much more concerned about the unsigned bit than the 64-bit bit (though if we have more counters than fit in 32-bit... :O

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, This feels over the top to me considering the amount of .size() and the use of size_t everywhere, due to STL, but there you go, updated the code.

src/perf_counters.cc Outdated Show resolved Hide resolved
src/perf_counters.cc Show resolved Hide resolved
src/perf_counters.h Outdated Show resolved Hide resolved
src/perf_counters.cc Show resolved Hide resolved
@damageboy
Copy link
Author

also check the format issues please. use clang-format, ideally.

Yeah, sorry about that, I was editing the file through a different project with its own .clang-format and it got mixed up.

@dmah42
Copy link
Member

dmah42 commented Apr 5, 2022

can you check those bot failures please?

@damageboy
Copy link
Author

can you check those bot failures please?

Sure on it!

@damageboy
Copy link
Author

can you check those bot failures please?

Well, apart from the formatting error, the build errors are from that (apparently) not so innocent that was part of the pre-existing code that you requested: #1380 (comment)

Reverting that unrelated change seems to have solved the issue locally for me when building with:

cmake ../ -G Ninja -DCMAKE_BUILD_TYPE=Release -DBENCHMARK_ENABLE_WERROR=ON -DBENCHMARK_ENABLE_TESTING=OFF

The formatting issues have been fixed too.

@dmah42
Copy link
Member

dmah42 commented Apr 6, 2022

can you check those bot failures please?

Well, apart from the formatting error, the build errors are from that (apparently) not so innocent that was part of the pre-existing code that you requested: #1380 (comment)

Reverting that unrelated change seems to have solved the issue locally for me when building with:

cmake ../ -G Ninja -DCMAKE_BUILD_TYPE=Release -DBENCHMARK_ENABLE_WERROR=ON -DBENCHMARK_ENABLE_TESTING=OFF

The formatting issues have been fixed too.

ah crap, sorry for the extra churn then.


SinglePMURegistry(pfm_pmu_t pmu_id)
: pmu_id_(pmu_id), available_counters_(0), available_fixed_counters_(0) {
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the extra braces here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed, mistake...

- Instead of allowing for up to 3 counters, libpfm's internal
  capabilities of reporting PMU info are used to manage a per-PMU
  "registry" and dynamically allocate "slots" according to the specific
  counters requested.
- per-PMU information is obtained, where each PMU reports its own
  capabilities in the form of fixed/non-fixed counter limits.
- In this PR/commit, it is *still* impossible to get more detailed
  (x86-only) counter information in terms of fixed/non-fixed counter
  association, due to what seems to be a lack of API surface on libpfm
  itself: https://sourceforge.net/p/perfmon2/mailman/message/37631173/
- The maximal number of counters is bumped from 3 to 63, which together
  with the current padding "scheme" means we pre-allocate/inlline up-to
  64 counter slots (64-bits each) per measurement instance
- Closes google#1377
@damageboy
Copy link
Author

@dominichamon I'm having a hard time reproing the build issues on a pristine ubuntu 16.04 with gcc-5 using a local docker image.
Do you have any ideas on how I can repro the build breakage? Referring to this:

https://github.com/google/benchmark/runs/5850874474?check_suite_focus=true:

/__w/benchmark/benchmark/src/perf_counters.cc:145:52:   required from here
/usr/include/c++/5/bits/hashtable_policy.h:85:34: error: no match for call to '(const std::hash<pfm_pmu_t>) (const pfm_pmu_t&)'
  noexcept(declval<const _Hash&>()(declval<const _Key&>()))>

@dmah42
Copy link
Member

dmah42 commented Apr 12, 2022

@dominichamon I'm having a hard time reproing the build issues on a pristine ubuntu 16.04 with gcc-5 using a local docker image. Do you have any ideas on how I can repro the build breakage? Referring to this:

https://github.com/google/benchmark/runs/5850874474?check_suite_focus=true:

/__w/benchmark/benchmark/src/perf_counters.cc:145:52:   required from here
/usr/include/c++/5/bits/hashtable_policy.h:85:34: error: no match for call to '(const std::hash<pfm_pmu_t>) (const pfm_pmu_t&)'
  noexcept(declval<const _Hash&>()(declval<const _Key&>()))>

i haven't seen this error before. it smells a bit like an ABI mismatch, which suggests the bots are misconfigured, but i don't think this has been true until now.

it suggests we don't have a std::hash for pfm_pmu_t...

@damageboy
Copy link
Author

en this error before. it smells a bit like an ABI mismatch, which suggests the bots are misconfigured, but i don't think this has been true until now.

Yes, I realize this, to make things worse, this doesn't happen on a brand new printine ubuntu 16.04 docker image I ran locally.
It just compiles cleanly... :(

}

private:
pfm_pmu_t pmu_id_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: can you set pmu_id_ to an invalid value here? Easier to avoid uninitialized values (even if you set it later in the ctor, setting it at decl is easier to maintain)

or could it be const?

#if defined HAVE_LIBPFM

class SinglePMURegistry {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add some comments - purpose of the class, etc?


const char* name() const { return name_; }

bool AddCounter(int event_id) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add tests.

@dmah42
Copy link
Member

dmah42 commented May 1, 2022

ping .. are you planning on coming back to finish this off? it would be really good to be able to merge it.

@dmah42
Copy link
Member

dmah42 commented Feb 7, 2023

one more check to see if there's any plans to come back to this...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
incomplete work needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FR] Support Dynamic PMU detection
3 participants