Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checkpoint of the world state should be moved by FCU #8143

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

Scooletz
Copy link
Contributor

@Scooletz Scooletz commented Jan 30, 2025

This PR changes the behavior of MergePlugin.MergeEnabled chains so that once a given block is considered final (FCU comes in), the checkpoint of the main block processor is moved to the last finalized state root hash. This ensures that the root hash of the world state touched by the main processor always progresses according to FCU. The global world state should not reference any ancient blocks.

Before

Currently, the pair of calls .CreateCheckpoint & .RestoreBranch in BlockProcessor never moves the current state forward. The state is captured and then rolled back to with each execution of the NewPayload. This means, that after the state is set when a node is initialized, it will be switched between an old initial hash root and the current branch hash root. You can see it in enhanced (via 54865e4) logs below. No matter how far, the checkpoint is the same and will be continuously restored to.

31 Jan 2025 08:34:49.302 Received ForkChoice: 21743051 (0xcae62e...091005), Safe: 21743003 (0x940ec8...fb4fc9), Finalized: 21742971 (0x2c8827...04665e)
31 Jan 2025 08:34:49.073  Block throughput      213.35 MGas/s  |    1,962.7 tps |             9.72 Blk/s | exec code  from cache   2,431 | new      1
31 Jan 2025 08:34:49.073  Block mev 0.0282 ETH   21.96 MGas    |      202   txs | calls      1,122 ( 30) | sload   3,581 | sstore  1,253 | create   2
31 Jan 2025 08:34:49.073 Processed            21743051         |      102.9 ms  | slot         13,382 ms |⛽ Gas gwei: 1.49 .. 1.53 (3.04) .. 45.00
31 Jan 2025 08:34:49.073 Restored the branch checkpoint - 0x2c2465a5d17ae367dea6033203a24536d060265234b153c2ef9ecc97ccd615bd | 0x2c2465a5d17ae367dea6033203a24536d060265234b153c2ef9ecc97ccd615bd
31 Jan 2025 08:34:49.073 Restoring the branch checkpoint - 0x2c2465a5d17ae367dea6033203a24536d060265234b153c2ef9ecc97ccd615bd 

and 8 hours earlier

30 Jan 2025 23:59:43.710 Received ForkChoice: 21740492 (0xa012c9...aafe4c), Safe: 21740460 (0xa486fb...ff9373), Finalized: 21740428 (0x167a8e...a236c3)
30 Jan 2025 23:59:37.159  Block throughput      255.48 MGas/s  |    2,870.2 tps |             8.52 Blk/s | exec code  from cache   3,979 | new      9
30 Jan 2025 23:59:37.159 Block mev 0.0523 ETH   30.00 MGas    |      337   txs | calls      1,862 ( 53) | sload   6,334 | sstore  1,901 | create   3
30 Jan 2025 23:59:37.159 Processed            21740492         |      117.4 ms  | slot         13,408 ms |⛽ Gas gwei: 2.45 .. 2.45 (4.02) .. 69.77
30 Jan 2025 23:59:37.159 Restored the branch checkpoint - 0x2c2465a5d17ae367dea6033203a24536d060265234b153c2ef9ecc97ccd615bd | 0x2c2465a5d17ae367dea6033203a24536d060265234b153c2ef9ecc97ccd615bd
30 Jan 2025 23:59:37.159 Restoring the branch checkpoint - 0x2c2465a5d17ae367dea6033203a24536d060265234b153c2ef9ecc97ccd615bd 

After

The checkpoint that StateRoot is reset to is properly updated. See the log outputs taken in different times

31 Jan 2025 14:19:49.518 Updating checkpoint to 0xe885162f5858ed9c9f245d803da9090abb5c02c89c7171c39403f3e6b36cd7ee at block 21744687
31 Jan 2025 14:19:49.518 Received ForkChoice: 21744764 (0x397e92...1390df), Safe: 21744719 (0x1ac1a3...c6e21e), Finalized: 21744687 (0x62a6dc...046d7f)
31 Jan 2025 14:19:49.293  Block throughput      223.01 MGas/s  |    2,841.4 tps |            20.15 Blk/s | exec code  from cache   1,391 | new      0
31 Jan 2025 14:19:49.293  Block mev 0.0774 ETH   11.07 MGas    |      141   txs | calls        638 ( 23) | sload   2,072 | sstore    673 | create   0
31 Jan 2025 14:19:49.293 Processed            21744764         |       49.6 ms  | slot         12,186 ms |⛽ Gas gwei: 4.51 .. 5.61 (9.54) .. 240.08
31 Jan 2025 14:19:49.293 Restored the branch checkpoint - 0xe885162f5858ed9c9f245d803da9090abb5c02c89c7171c39403f3e6b36cd7ee | 0xe885162f5858ed9c9f245d803da9090abb5c02c89c7171c39403f3e6b36cd7ee
31 Jan 2025 14:19:49.293 Restoring the branch checkpoint - 0xe885162f5858ed9c9f245d803da9090abb5c02c89c7171c39403f3e6b36cd7ee 

vs

31 Jan 2025 13:51:13.210 Synced Chain Head to 21744621 (0x04fb6c...ee9988)
31 Jan 2025 13:51:13.210 Updating checkpoint to 0x6b27f8e2062dc999b5ba1b8379702d6bd3c88b0a1307a4cda71c812e1adb806f at block 21744527
31 Jan 2025 13:51:13.210 Received ForkChoice: 21744621 (0x04fb6c...ee9988), Safe: 21744559 (0xca69e2...f5cbb9), Finalized: 21744527 (0x3f1824...8a246a)
31 Jan 2025 13:51:12.975  Block throughput      168.70 MGas/s  |    2,684.2 tps |            20.18 Blk/s | exec code  from cache     900 | new      0
31 Jan 2025 13:51:12.975  Block     0.0082 ETH    8.36 MGas    |      133   txs | calls        379 ( 40) | sload   1,462 | sstore    395 | create   0
31 Jan 2025 13:51:12.975 Processed            21744621         |       49.5 ms  | slot         11,420 ms |⛽ Gas gwei: 3.70 .. 3.70 (5.07) .. 13.78
31 Jan 2025 13:51:12.975 Restored the branch checkpoint - 0x6b27f8e2062dc999b5ba1b8379702d6bd3c88b0a1307a4cda71c812e1adb806f | 0x6b27f8e2062dc999b5ba1b8379702d6bd3c88b0a1307a4cda71c812e1adb806f
31 Jan 2025 13:51:12.975 Restoring the branch checkpoint - 0x6b27f8e2062dc999b5ba1b8379702d6bd3c88b0a1307a4cda71c812e1adb806f

Changes

  • enhancement of the MergePlugin to inform the main block processor about the checkpoint
  • fix the BlockProcessor so that it takes into consideration the provided checkpoint instead always falling back to the checkpoint captured at the node startup

Types of changes

What types of changes does your code introduce?

  • Bugfix (a non-breaking change that fixes an issue)
  • New feature (a non-breaking change that adds functionality)
  • Breaking change (a change that causes existing functionality not to work as expected)
  • Optimization
  • Refactoring
  • Documentation update
  • Build-related changes
  • Other: Description

Testing

Requires testing

  • Yes
  • No

If yes, did you write tests?

  • Yes
  • No

Notes on testing

Probably we should test MergeEnabled and others as well

Documentation

Requires documentation update

  • Yes
  • No

Requires explanation in Release Notes

  • Yes
  • No

@Scooletz
Copy link
Contributor Author

@Scooletz Scooletz marked this pull request as ready for review January 31, 2025 13:29
@LukaszRozmej
Copy link
Member

I think it is all not needed. What is really needed is in the ForkChoiceUpdate handler move main state provider state root to the root of the new head block

@LukaszRozmej
Copy link
Member

Alternative I had in mind: #8154

@Scooletz
Copy link
Contributor Author

Scooletz commented Feb 3, 2025

@LukaszRozmej My take on it is that the global state provider should be updated only from a single thread. Having it updated from FCU handler may result in some funky behaviors. It' better to have FCU responsible for signaling where other components (BlockProcessor) use the provided information when needed

@LukaszRozmej
Copy link
Member

LukaszRozmej commented Feb 3, 2025

@LukaszRozmej My take on it is that the global state provider should be updated only from a single thread. Having it updated from FCU handler may result in some funky behaviors. It' better to have FCU responsible for signaling where other components (BlockProcessor) use the provided information when needed

Good point, but NewPayload and FCU are always under single-threaded lock.

That being said, you could potentially maybe have blocks queued in block processor?

Added a fix for that.

@Scooletz
Copy link
Contributor Author

Scooletz commented Feb 3, 2025

@LukaszRozmej I see that there's a lock, but notification-way of doing it (this PR) seems to me more preferred. We don't leak the global state elsewhere, we let FCU notfity the processor and make the processor notifiable if other components would like to move the checkpoint (for some reason, like another consensus mechanism or sth).

@asdacap
Copy link
Contributor

asdacap commented Feb 3, 2025

I think the main purpose of the checkpoint is to revert when an exception happens. So my confusion is, why the DoNotUpdateHead do anything here. In any case, even with exception, it always reset the state at start block proccessing anyway, so why should it care to restore it?

@Scooletz
Copy link
Contributor Author

Scooletz commented Feb 3, 2025

@asdacap

I think the main purpose of the checkpoint is to revert when an exception happens. So my confusion is, why the DoNotUpdateHead do anything here. In any case, even with exception, it always reset the state at start block processing anyway, so why should it care to restore it?

First. This is not the change introduced in this PR, is it? Second, my understanding is that for options that do not contain DoNotUpdateHead, it should progress and do not reset the state. That was the behavior of the original.

@LukaszRozmej
Copy link
Member

I think the main purpose of the checkpoint is to revert when an exception happens. So my confusion is, why the DoNotUpdateHead do anything here. In any case, even with exception, it always reset the state at start block proccessing anyway, so why should it care to restore it?

Because it is FCU that moves head state and not NewPayload? We probably could move state root on NewPayload if we wanted?

@asdacap
Copy link
Contributor

asdacap commented Feb 4, 2025

@asdacap

I think the main purpose of the checkpoint is to revert when an exception happens. So my confusion is, why the DoNotUpdateHead do anything here. In any case, even with exception, it always reset the state at start block processing anyway, so why should it care to restore it?

First. This is not the change introduced in this PR, is it? Second, my understanding is that for options that do not contain DoNotUpdateHead, it should progress and do not reset the state. That was the behavior of the original.

I'm not saying that this PR introduce that change. I'm saying that if that logic is removed, that might solve your problem. The DoNotUpdateHead is also used to update the BlockTree's Head which make much more sense than here.

@asdacap
Copy link
Contributor

asdacap commented Feb 4, 2025

I think the main purpose of the checkpoint is to revert when an exception happens. So my confusion is, why the DoNotUpdateHead do anything here. In any case, even with exception, it always reset the state at start block proccessing anyway, so why should it care to restore it?

Because it is FCU that moves head state and not NewPayload? We probably could move state root on NewPayload if we wanted?

My thought is, why does it need to move at all if it is always moved at start of block processing.

@LukaszRozmej
Copy link
Member

I think the main purpose of the checkpoint is to revert when an exception happens. So my confusion is, why the DoNotUpdateHead do anything here. In any case, even with exception, it always reset the state at start block proccessing anyway, so why should it care to restore it?

Because it is FCU that moves head state and not NewPayload? We probably could move state root on NewPayload if we wanted?

My thought is, why does it need to move at all if it is always moved at start of block processing.

So you are saying - just don't revert and keep as is?

@tanishqjasoria
Copy link
Contributor

@asdacap

I think the main purpose of the checkpoint is to revert when an exception happens. So my confusion is, why the DoNotUpdateHead do anything here. In any case, even with exception, it always reset the state at start block processing anyway, so why should it care to restore it?

First. This is not the change introduced in this PR, is it? Second, my understanding is that for options that do not contain DoNotUpdateHead, it should progress and do not reset the state. That was the behavior of the original.

I'm not saying that this PR introduce that change. I'm saying that if that logic is removed, that might solve your problem. The DoNotUpdateHead is also used to update the BlockTree's Head which make much more sense than here.

I did the same with verkle, as it also had a similar problem with flat db layout.

@LukaszRozmej
Copy link
Member

@asdacap

I think the main purpose of the checkpoint is to revert when an exception happens. So my confusion is, why the DoNotUpdateHead do anything here. In any case, even with exception, it always reset the state at start block processing anyway, so why should it care to restore it?

First. This is not the change introduced in this PR, is it? Second, my understanding is that for options that do not contain DoNotUpdateHead, it should progress and do not reset the state. That was the behavior of the original.

I'm not saying that this PR introduce that change. I'm saying that if that logic is removed, that might solve your problem. The DoNotUpdateHead is also used to update the BlockTree's Head which make much more sense than here.

I did the same with verkle, as it also had a similar problem with flat db layout.

Can you be more clear?

@tanishqjasoria
Copy link
Contributor

@asdacap

I think the main purpose of the checkpoint is to revert when an exception happens. So my confusion is, why the DoNotUpdateHead do anything here. In any case, even with exception, it always reset the state at start block processing anyway, so why should it care to restore it?

First. This is not the change introduced in this PR, is it? Second, my understanding is that for options that do not contain DoNotUpdateHead, it should progress and do not reset the state. That was the behavior of the original.

I'm not saying that this PR introduce that change. I'm saying that if that logic is removed, that might solve your problem. The DoNotUpdateHead is also used to update the BlockTree's Head which make much more sense than here.

I did the same with verkle, as it also had a similar problem with flat db layout.

Can you be more clear?

I removed the DoNotUpdateHead resetting the state logic and that solved the issue.

@Scooletz
Copy link
Contributor Author

Scooletz commented Feb 4, 2025

As far as I understand, we could ditch the final reset as @asdacap is proposing. Then, if a branch is properly processed, we leave the root as is. If it's not we reset to what has been captured before execution. The crux is that we always reset the root of the branch if it's different from the one that branch should be executed on. Having this said, removing of the reset to the checkpoint after a proper execution, seems like a good candidate for ensuring the checkpoint progression?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants