Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instrument/label consistency #4228

Merged
merged 3 commits into from
Feb 28, 2024

Conversation

chrzaszcz
Copy link
Member

@chrzaszcz chrzaszcz commented Feb 22, 2024

Reimplement mongoose_instrument registry, allowing more consistency checks.

  • Label keys for a given EventName need to be the same for all registered events - otherwise Prometheus metrics would break.
  • In case of already registered events, the set_up callbacks are not called before raising the error.

A gen_server is used to eliminate possible race conditions during setup.

Previously, mongoose_instrument_registry was an ETS table. Now a map: #{event_name() => #{labels() => handlers()}} is used instead.

  • The map is initially stored in the state of the gen_server, but a call to mongoose_instrument:persist() should be done after initial setup, which stores it in a persistent term.
  • The term is then updated for every future set_up/tear_down call.
  • This means that the implementation is firstly optimized for updates, and after persist it is optimized for concurrent reads.

This is very similar to gen_hook, but the events are handled correctly even before persist is called. This seems to be needed, because some events might be triggered during startup, and they shouldn't fail.


Note: after a restart the persistent terms are always deleted, even if terminate was not called (e.g. kill). This means that all instrumentation is lost on restart. We could change this to make it permanent, but I think it would require a separate PR. The same applies to gen_hook, but additionally it can get into an inconsistent state when killed and started again - this is unlikely though.

@mongoose-im

This comment was marked as outdated.

Copy link

codecov bot commented Feb 22, 2024

Codecov Report

Attention: Patch coverage is 98.52941% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 84.40%. Comparing base (257a371) to head (277171b).

Files Patch % Lines
src/metrics/mongoose_instrument.erl 98.48% 1 Missing ⚠️
Additional details and impacted files
@@                 Coverage Diff                 @@
##           feature/instrument    #4228   +/-   ##
===================================================
  Coverage               84.39%   84.40%           
===================================================
  Files                     557      556    -1     
  Lines                   33588    33630   +42     
===================================================
+ Hits                    28348    28386   +38     
- Misses                   5240     5244    +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@chrzaszcz chrzaszcz force-pushed the instrument/label-consistency branch from 4cf4a30 to a260d18 Compare February 22, 2024 12:47
@mongoose-im

This comment was marked as outdated.

@chrzaszcz chrzaszcz force-pushed the instrument/label-consistency branch from a260d18 to 78074d5 Compare February 22, 2024 18:34
@mongoose-im

This comment was marked as outdated.

@chrzaszcz chrzaszcz force-pushed the instrument/label-consistency branch from 78074d5 to 7436e6d Compare February 23, 2024 08:30
@mongoose-im

This comment was marked as outdated.

@chrzaszcz chrzaszcz force-pushed the instrument/label-consistency branch from 7436e6d to 718f018 Compare February 23, 2024 15:09
@mongoose-im

This comment was marked as outdated.

@chrzaszcz chrzaszcz force-pushed the instrument/label-consistency branch from 718f018 to 050a842 Compare February 23, 2024 15:26
@mongoose-im

This comment was marked as outdated.

@mongoose-im

This comment was marked as outdated.

@chrzaszcz chrzaszcz force-pushed the instrument/label-consistency branch from a217513 to 3469900 Compare February 23, 2024 17:27
@mongoose-im

This comment was marked as outdated.

@chrzaszcz chrzaszcz force-pushed the instrument/label-consistency branch from 3469900 to 582da60 Compare February 27, 2024 10:44
@mongoose-im

This comment was marked as outdated.

The state is kept in a gen_server, and after calling persist/0 it
becomes mirrored in a persistent term, allowing efficient lookup.

Main benefits:
- Consistency of setup/teardown - previously these operations might
  intialize metrics, but fail to register handlers (e.g. for duplicates)
- Ability to prevent label key inconsistency for an event, which would
  make prometheus fail.
- Similarity to gen_hook
- Maybe a slight advantage in terms of efficiency (persistent term vs ETS)
- Less possibility of race conditions, because all modifications
  are sequential.

Potential bottlenecks:
- Before calling persist(), all operations are handled by the
  gen_server, but no load should arrive yet (hooks are inactive).
- After calling persist(), modifications are more costly, because the
  persistent term will be updated for each of them. This shouldn't be
  a problem, because at that point all instrumentation is already set up.
- Test functionality with and without the persistent term.
- Check label consistency, which is now enforced.
- Change the way call history is gathered - now the functions are
  executed in a different process, so filter by event name.
@chrzaszcz chrzaszcz force-pushed the instrument/label-consistency branch from 582da60 to 277171b Compare February 27, 2024 11:56
@mongoose-im
Copy link
Collaborator

mongoose-im commented Feb 27, 2024

elasticsearch_and_cassandra_26 / elasticsearch_and_cassandra_mnesia / 277171b
Reports root/ big
OK: 417 / Failed: 0 / User-skipped: 43 / Auto-skipped: 0


small_tests_25 / small_tests / 277171b
Reports root / small


small_tests_26 / small_tests / 277171b
Reports root / small


small_tests_26_arm64 / small_tests / 277171b
Reports root / small


ldap_mnesia_26 / ldap_mnesia / 277171b
Reports root/ big
OK: 2275 / Failed: 0 / User-skipped: 856 / Auto-skipped: 0


dynamic_domains_pgsql_mnesia_25 / pgsql_mnesia / 277171b
Reports root/ big
OK: 4421 / Failed: 0 / User-skipped: 111 / Auto-skipped: 0


dynamic_domains_mysql_redis_26 / mysql_redis / 277171b
Reports root/ big
OK: 4388 / Failed: 0 / User-skipped: 144 / Auto-skipped: 0


ldap_mnesia_25 / ldap_mnesia / 277171b
Reports root/ big
OK: 2275 / Failed: 0 / User-skipped: 856 / Auto-skipped: 0


dynamic_domains_pgsql_mnesia_26 / pgsql_mnesia / 277171b
Reports root/ big
OK: 4421 / Failed: 0 / User-skipped: 111 / Auto-skipped: 0


internal_mnesia_26 / internal_mnesia / 277171b
Reports root/ big
OK: 2415 / Failed: 0 / User-skipped: 716 / Auto-skipped: 0


pgsql_cets_26 / pgsql_cets / 277171b
Reports root/ big
OK: 4356 / Failed: 0 / User-skipped: 178 / Auto-skipped: 0


dynamic_domains_mssql_mnesia_26 / odbc_mssql_mnesia / 277171b
Reports root/ big
OK: 4418 / Failed: 0 / User-skipped: 114 / Auto-skipped: 0


pgsql_mnesia_25 / pgsql_mnesia / 277171b
Reports root/ big
OK: 4810 / Failed: 0 / User-skipped: 118 / Auto-skipped: 0


mysql_redis_26 / mysql_redis / 277171b
Reports root/ big
OK: 4789 / Failed: 0 / User-skipped: 139 / Auto-skipped: 0


pgsql_mnesia_26 / pgsql_mnesia / 277171b
Reports root/ big
OK: 4810 / Failed: 0 / User-skipped: 118 / Auto-skipped: 0


mssql_mnesia_26 / odbc_mssql_mnesia / 277171b
Reports root/ big
OK: 4807 / Failed: 0 / User-skipped: 121 / Auto-skipped: 0

@chrzaszcz chrzaszcz marked this pull request as ready for review February 27, 2024 12:38
Copy link
Contributor

@JanuszJakubiec JanuszJakubiec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@JanuszJakubiec JanuszJakubiec merged commit c6ec239 into feature/instrument Feb 28, 2024
4 checks passed
@JanuszJakubiec JanuszJakubiec deleted the instrument/label-consistency branch February 28, 2024 07:54
@jacekwegr jacekwegr added this to the 6.3.0 milestone Oct 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants