-
Notifications
You must be signed in to change notification settings - Fork 805
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Attempt at adding hostport info to logs #6152
Conversation
client/history/client.go
Outdated
@@ -158,6 +158,7 @@ func (c *clientImpl) DescribeHistoryHost( | |||
peer, err = c.peerResolver.FromHostAddress(request.GetHostAddress()) | |||
} | |||
if err != nil { | |||
c.logger.Error("peer could not be resolved for host.", tag.Error(err), tag.Address(request.GetHostAddress())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's also include shardid or workflowid as a tag (whichever is used above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on it.
client/history/client.go
Outdated
@@ -158,6 +158,7 @@ func (c *clientImpl) DescribeHistoryHost( | |||
peer, err = c.peerResolver.FromHostAddress(request.GetHostAddress()) | |||
} | |||
if err != nil { | |||
c.logger.Error("peer could not be resolved for host.", tag.Error(err), tag.ShardID(int(request.GetShardIDForHost())), tag.WorkflowID(request.ExecutionForHost.GetWorkflowID()), tag.Address(request.GetHostAddress())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
request.ExecutionForHost.GetWorkflowID()
can panic.
may be worth putting this in each branch tbh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on it. tbh I am not sure if there is any other alternative here apart from putting these logs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a thing definitely worth fixing, but "put a log in here" seems reasonable. we don't have much of this info elsewhere.
eventually we can wrap these errors and take care of it in a centralized way, but we're not there yet.
if we're trying to log "host X has errored" and not "could not decide a host" (as I think that might be more accurate for the issue driving this PR), personally I was planning to add it to the op
-retryer. it has access to the peer in all cases (except the ratelimiter API, but it doesn't retry by design)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about the delay!
Codecov ReportAttention: Patch coverage is
Additional details and impacted files
... and 15 files with indirect coverage changes Continue to review full report in Codecov by Sentry.
|
Pull Request Test Coverage Report for Build 01909332-6a65-4ffa-bb92-200b7f6577c6Details
💛 - Coveralls |
What changed?
Our internal ring-use / task-processing doesn't report the intended destination, so zone-specific issues can be hard to identify / hard to narrow to more specific causes. This is an attempt to make the logs more meaning ful. an attempt was already made but it's not the most optimal because of:
Trying to put a log at this location as we have a retrier that has access to that peer here is it's used nearly for all of the similar requests.
Why?
As a part of the incident resolution.
How did you test it?
Potential risks
Release notes
Documentation Changes