Remove usage of timer.Timer in benchlist #2446
benchlist.timer = timer.NewTimer(benchlist.update)
go benchlist.timer.Dispatch()
return benchlist, benchlist.metrics.Initialize(ctx.Registerer)
Definitely odd that we initialized the metrics after kicking off the goroutine... Not a bug - but felt pretty close
return benchlist, nil
}

// TODO: Close this goroutine during node shutdown
We never shut down the timer previously... We still don't... But we should
// Note: If there are no nodes to remove, [duration] will be 0 and we
// will immediately wait until there are benched nodes.
duration := b.durationToSleep(now)
timer.Reset(duration)
Note: it is safe to reset a timer with a 0 or negative duration... I don't think that's actually possible... but it isn't a case we need to worry about
if _, ok := b.benchlistSet[nodeID]; ok {
return true
}
return false
This code made me sad
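For context, the four-line lookup above collapses to returning the `ok` value directly. The sketch below assumes `benchlistSet` behaves like a plain Go map; the `nodeID` type and the `isBenched` method name are stand-ins for illustration, and the actual replacement in the PR may look different.

```go
package main

import "fmt"

// nodeID is a hypothetical stand-in for the real node ID type.
type nodeID string

// benchlist here only models the set lookup, assuming a plain map.
type benchlist struct {
	benchlistSet map[nodeID]struct{}
}

// isBenched (hypothetical name) returns the membership result directly
// instead of branching on it and returning true/false by hand.
func (b *benchlist) isBenched(id nodeID) bool {
	_, ok := b.benchlistSet[id]
	return ok
}

func main() {
	b := &benchlist{benchlistSet: map[nodeID]struct{}{"node-a": {}}}
	fmt.Println(b.isBenched("node-a"), b.isBenched("node-b"))
}
```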
b.ctx.Log.Debug("benching validator after consecutive failed queries",
zap.Stringer("nodeID", nodeID),
zap.Duration("benchDuration", benchedUntil.Sub(now)),
zap.Int("numFailedQueries", b.threshold),
)
It felt weird having this log after marking the node as benched.
benchlist := &benchlist{
ctx: ctx,
resetTimer: make(chan struct{}, 1),
We use the same hack here of a length-1 buffered channel as we do with block building. A non-blocking push onto the channel guarantees that the timer will be reset afterwards, without the caller ever blocking on the reset itself. This is important because the method attempting to reset the timer is holding a lock that the timer may also be attempting to grab.
Why this should be merged
timer.Timer is horrible code, I now know better. This helps to remove the abomination.
How this works
Replaces the usage of timer.Timer with the standard lib's time.Timer
How this was tested
CI