Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge transition delay #38

Merged
merged 4 commits into from
Jul 28, 2022

Conversation

ralexstokes
Copy link
Member

Fixes #32.

@ralexstokes ralexstokes force-pushed the merge-transition-delay branch from f106619 to 3dd3c87 Compare July 7, 2022 21:32
Copy link
Contributor

@mkalinin mkalinin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm!

@StefanBratanov
Copy link
Contributor

Will relays/mev-boost return 204 when querying for the execution payload header during the delay time?

@ralexstokes
Copy link
Member Author

Will relays/mev-boost return 204 when querying for the execution payload header during the delay time?

this is the idea!

@StefanBratanov
Copy link
Contributor

Will relays/mev-boost return 204 when querying for the execution payload header during the delay time?

this is the idea!

Thanks! If CL handles the 204 properly and fallback to a local EL, is it not redundant to also add the additional logic from this PR?

@tbenr
Copy link

tbenr commented Jul 8, 2022

Will relays/mev-boost return 204 when querying for the execution payload header during the delay time?

this is the idea!

Thanks! If CL handles the 204 properly and fallback to a local EL, is it not redundant to also add the additional logic from this PR?

It seems more an honest builder then a honest validator spec.

My concern is that CL discards pre finalized states. So CL can immediately know if we have a finalized execution payload but currently cant easily know when we finalized it the first time. Tracking it at runtime is easy but it wont survive a restart. So we need to write it on DB somewhere (or scan canonical blocks backward at startup). I mean, we can do that, just pointing out that it is not trivial to get that information ATM.

And if the main thing here is to have builders not serving blocks in that timeframe, seems a redundant behaviour.

@ralexstokes
Copy link
Member Author

ralexstokes commented Jul 8, 2022

@StefanBratanov @tbenr fair point on the semantics, the intent is to have some number we all agree upon

and the idea is that (1) block builders will not produce blocks during this time (2) relays will not serve blocks during this time (and will 204 if the header endpoint is called) and that (3) proposers won't bother calling the endpoint at this time

if one party just wants to rely on another party failing to do their job and "papering over" the error then that's fine but you are taking on the risk that they will in fact fail to do their job and then you could be exposing yourself to increased risk during the Merge transition -- you can decide how much risk to take on for an unprecedented software upgrade

this PR currently mixes the concerns across (1) (2) and (3) out of convenience as we don't have an "honest block builder" or "honest relay" spec yet although if you think it is worthwhile I can re-work this PR in that direction

@ralexstokes
Copy link
Member Author

@tbenr

My concern is that CL discards pre finalized states. So CL can immediately know if we have a finalized execution payload but currently cant easily know when we finalized it the first time. Tracking it at runtime is easy but it wont survive a restart. So we need to write it on DB somewhere (or scan canonical blocks backward at startup). I mean, we can do that, just pointing out that it is not trivial to get that information ATM.

I think we can discuss implementation on the next CL call and if there is enough appetite to go another route then we can change tactics here

@tbenr
Copy link

tbenr commented Jul 8, 2022

if one party just wants to rely on another party failing to do their job and "papering over" the error then that's fine but you are taking on the risk that they will in fact fail to do their job and then you could be exposing yourself to increased risk during the Merge transition -- you can decide how much risk to take on for an unprecedented software upgrade

This is a good point. At the end it is matter of comparing complexity added to apply the rule VS confidence on our fallback mechanism (and additional latency introduced)

@StefanBratanov
Copy link
Contributor

StefanBratanov commented Jul 8, 2022

if one party just wants to rely on another party failing to do their job and "papering over" the error then that's fine but you are taking on the risk that they will in fact fail to do their job and then you could be exposing yourself to increased risk during the Merge transition -- you can decide how much risk to take on for an unprecedented software upgrade

Agree actually. Ideally we would like 1, 2 and 3 to do their job properly to be fully robust. I guess my main concern was running this logic indefinitely. I suppose we could introduce a CLI flag which enables this logic and default it to false and it will be up to the users who run mev-boost/relay during merge transitions to enable it. Happy to discuss implementation details during the CL call as you mentioned. :)

@mkalinin
Copy link
Contributor

mkalinin commented Jul 15, 2022

A thought after discussion on CL call yesterday. Allowing CL client to enable mev-boost upon non-empty payload in the checkpoint state opens a way to bypass the delay by restarting the client with fresh database right after the transition is finalized

UPD
An even easier way to bypass the delay might be overriding the config parameter with 0 value

@djrtwo
Copy link
Collaborator

djrtwo commented Jul 18, 2022

Right, @mkalinin. That conversation primarily left me with the impression that just finalizing the merge transition is likely the proper path here rather than the finality + delay. The subtle complexities that emerge with the delay are absolutely not worth it in my opinion as they just open the door for more errors and/or gameability.

I would suggest just changing this to a finality trigger
and if we also want a delay, consider asking "honest" relays to add such a delay, but not 100% on that

@ralexstokes ralexstokes merged commit 94df9ba into ethereum:main Jul 28, 2022
@ralexstokes ralexstokes deleted the merge-transition-delay branch July 28, 2022 15:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consider running builder software through the Merge testing
9 participants