eth,p2p: count timeout packet towards rtt #25588
Closed
This PR changes the way we calculate RTT a bit. I noticed that for some nodes (particularly our Azure nodes), there are a lot of dropped/failed trie heal messages.

Whenever we time out a packet, we update the capacity for that particular kind to `0`. But what we do not do is update the RTT estimate. This PR changes that, so that a failed delivery does update the RTT, by pretending that it completed in 2x the amount of time spent so far. So if it times out after two seconds, the RTT estimate is updated as if it had been successfully delivered after 4 seconds. The idea is to make timeouts bump up the RTT estimate.

This is from one of the bootnodes, which is syncing. The graph shows `Unexpected trienode heal` occurrences in the log. I restarted it with this PR at 11:50, which is where it flattens out. The dotted line represents 500 messages.

It's only one isolated and perhaps unrepresentative metric, but in that instance it seems to have improved the misdeliveries.