Fix hot state disk leak #5768
Seems like some of the tests are unhappy. I will look into it tomorrow.
Compared to the pruning in lighthouse/beacon_node/beacon_chain/src/migrate.rs Lines 759 to 768 in b5f2761
Squashed commit of the following:
commit 0f6ea4b
Author: Michael Sproul <[email protected]>
Date: Tue May 14 10:23:29 2024 +1000
Don't delete the genesis state when split is 0x0!
commit aa8f7c1
Author: Michael Sproul <[email protected]>
Date: Mon May 13 12:51:47 2024 +1000
Fix hot state leak
I'm not very familiar with the code here but changes look reasonable and I can't spot any flaws! Looks good to me 👍
@Mergifyio queue
✅ The pull request has been merged automatically at 8762d82
Issue Addressed
Fix an issue whereby the v5.2.0-RC hot database grows indefinitely.
Proposed Changes
The main reason for the unbounded growth is the storage of extra hot states.
I accidentally introduced the bug in:
In that PR, we added logic to both:
The problem is, we were deleting the temporary flags for all advanced states, when we should have only been doing this at skipped slots. For example, with 2 consecutive blocks at slots `N` and `N + 1` we would store (forever) the advanced pre-state for `N + 1`. Our pruning logic would never delete this state because it was not marked temporary, and it was not found by the `prune_abandoned_forks` logic because it did not lie on any side chains. To fix this bug, we now only delete the temporary flag when a skipped slot occurs.

In addition to this, states with temporary flags were previously only deleted on startup. To allow pruning of these states without restarting, I've ported over some of the state pruning code from full `tree-states`, which iterates the `HotStateSummary`s in the database and deletes all states older than the `split.slot` (or equal to the split slot with a different state root). This is safe even in the case where the split block lies at a slot prior to the split slot, because the database only stores the split slot's advanced (epoch-aligned) state in this case, and we do not risk deleting a canonical state by deleting states with `slot < split.slot`.

Thanks to @antondlr for noticing the state growth on the testnet deployment.
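For readers unfamiliar with the store, the pruning rule described above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `Split`, `HotStateSummary`, `should_prune` and `prune_hot_states` are hypothetical stand-ins, the database is modelled as an in-memory `HashMap`, and the zero-root guard is only one possible reading of the "Don't delete the genesis state when split is 0x0!" commit.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for the real store types.
type Slot = u64;
type Hash256 = [u8; 32];

#[derive(Debug, Clone, Copy)]
struct Split {
    slot: Slot,
    state_root: Hash256,
}

// Only the slot is needed to decide whether a hot state can be pruned.
#[derive(Debug, Clone, Copy)]
struct HotStateSummary {
    slot: Slot,
}

/// A hot state is prunable if it is older than the split slot, or if it
/// sits exactly at the split slot but is not the split state itself.
fn should_prune(state_root: &Hash256, summary: &HotStateSummary, split: &Split) -> bool {
    summary.slot < split.slot
        || (summary.slot == split.slot && *state_root != split.state_root)
}

/// Iterate every summary in the (mock) hot database and delete the
/// prunable ones. Returns the number of states removed.
fn prune_hot_states(db: &mut HashMap<Hash256, HotStateSummary>, split: &Split) -> usize {
    // Assumption: an all-zero split root means the split is uninitialised,
    // so nothing is pruned and the genesis state at slot 0 is kept.
    if split.state_root == [0u8; 32] {
        return 0;
    }
    let before = db.len();
    db.retain(|root, summary| !should_prune(root, summary, split));
    before - db.len()
}
```

With a split at slot 64, a state at slot 63 and a non-split state at slot 64 would both be removed, while the split state itself and anything at slot 65 or later would be kept.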