Added metrics for when we rate limit #5640

Merged: 5 commits, Feb 3, 2024

jakobht (Member) commented Jan 31, 2024

What changed?
Add metrics and logs for when we rate limit workflow IDs.

Why?
We need observability into which workflow IDs we rate limit.

How did you test it?
Tested locally and in unit tests

Potential risks
It might cause excessive logging, but it adds at most one extra log per gRPC call, so I expect it's OK.

Release notes

Documentation Changes

coveralls commented Jan 31, 2024

Pull Request Test Coverage Report for Build 018d6eb9-8f0c-484f-8138-3f55479d8974

  • 20 of 23 (86.96%) changed or added relevant lines in 2 files are covered.
  • 59 unchanged lines in 8 files lost coverage.
  • Overall coverage increased (+0.03%) to 62.675%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| common/log/tag/tags.go | 0 | 3 | 0.0% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| common/persistence/historyManager.go | 2 | 66.67% |
| service/history/task/transfer_active_task_executor.go | 2 | 72.22% |
| common/persistence/statsComputer.go | 3 | 94.64% |
| service/matching/taskListManager.go | 3 | 80.2% |
| service/history/task/transfer_standby_task_executor.go | 4 | 86.19% |
| common/persistence/sql/workflowStateMaps.go | 11 | 83.84% |
| service/history/execution/mutable_state_task_refresher.go | 14 | 64.56% |
| service/history/task/task_util.go | 20 | 70.57% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 018d6bf0-6d52-40ca-b6f4-6cd23f6e66a1 | 0.03% |
| Covered Lines | 92114 |
| Relevant Lines | 146970 |

💛 - Coveralls

Groxx (Member) commented Jan 31, 2024

Log spam: eh, it's probably fine tbh. Zap samples by default, per log message (the first string arg). And if we decide we do need something fancier or more control beyond that, we have plenty of options.

(I believe we have this disabled internally, but we should probably rethink that. And a log per RPC is far from a major increase anyway.)

Groxx (Member) left a comment

Some minor tweaks worth doing I think, but merge when ready :)

- Added internal/external to the metrics and logs
- Renamed emitMetrics to emitRateLimitMetrics
- Added domainID to log

Groxx (Member) left a comment

Tiny thing to fix in the tests, but LGTM once that unblocks :)

service/history/workflowcache/cache_test.go (outdated, resolved)
@jakobht jakobht enabled auto-merge (squash) February 3, 2024 10:27
@jakobht jakobht merged commit 465bb62 into cadence-workflow:master Feb 3, 2024
15 of 16 checks passed