-
Notifications
You must be signed in to change notification settings - Fork 477
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make pivot update attempts infinite #5988
Conversation
Ok but is that safe? |
@LukaszRozmej So I'd even say it is more safe than current approach because fallback like we had before was another testing path which would need to be covered all the time - now we reduce possible paths (this scenario will be possible only when someone would manually start to play with this config value to make it happen). Thing is that in reality we don't fix real issue (unable to "catch" missing headers if that occurs - this is because of nature how we download headers from latest to oldest - missing ones are more recent than initial PIVOT set) but ensure that it won't happen in default configuration. Proper fix would be problematic and I guess it would cause more regression than this approach. |
@@ -61,7 +61,7 @@ public interface ISyncConfig : IConfig | |||
[ConfigItem(DisabledForCli = true, HiddenFromDocs = true)] | |||
Keccak? PivotHashParsed => PivotHash is null ? null : new Keccak(Bytes.FromHexString(PivotHash)); | |||
|
|||
[ConfigItem(Description = "Max number of attempts (seconds) to update pivot block basing on Forkchoice message from Consensus Layer. Only for PoS chains.", DefaultValue = "900")] | |||
[ConfigItem(Description = "Max number of attempts (seconds) to update pivot block basing on Forkchoice message from Consensus Layer. Only for PoS chains. Infinite by default.", DefaultValue = "2147483647")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marcindsobczak Why not some value like -1? I guess this is because setting Int.Max is quicker?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We just have logic which is checking if attemts left > 0
in few places, making support for -1 will be nice, but requires bigger refactoring and it is always some risk. If we are not planning to use values between disabled and infinite, we could make it bool as well. But doing it extremely high (int.Max = 68 years) works the same as infinite, is quick and safest, as doesn't require any changes in the code, just config value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that is what I expected to hear - we can go safe way but would be nice to leave some issue to refactor this in future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created low priority good first issue for refactoring it to be infinite instead of ~68 years: #5992
@@ -141,7 +141,7 @@ private async Task<bool> TrySetFreshPivot(CancellationToken cancellationToken) | |||
|
|||
if (finalizedBlockHash is null || finalizedBlockHash == Keccak.Zero) | |||
{ | |||
if (_logger.IsInfo && _attemptsLeft % 10 == 0) _logger.Info($"Waiting for Forkchoice message from Consensus Layer to set fresh pivot block. {_attemptsLeft} attempts left"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
attempts left was pretty nice but if we keep int.max approach it does not have sense indeed. If You would plan to add some value like -1 to make it infinite instead of this approach, please rollback this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it will be infinite, then it will have even less sense, because infinite number of attempt will be always left. We can add one more variable and just increment it - it will be something like:
Waiting for Forkchoice message from Consensus Layer to set fresh pivot block [{counter}s]
And for users
Waiting for Forkchoice message from Consensus Layer to set fresh pivot block [1s]
Waiting for Forkchoice message from Consensus Layer to set fresh pivot block [2s]
...
Waiting for Forkchoice message from Consensus Layer to set fresh pivot block [123s]
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks nice! :)
When introducing PivotUpdator we made a timeout of 15min as a safety guarantee, with intention to make it infinite some day. In the meantime it proved that is working fine, the only issues are because of edge-cases caused by timeout. We are not able to sync if there is no CL - that's one thing, and downloading headers in the meantime is problematic, because in case of restart at this stage, pivot will be updated and we will end up with a gap in headers, between old pivot and new pivot. So we have two options - disable functionality of PivotUpdator if timeout would be hit, or make it infinite. In first scenario there will be no corruption (gap in headers), but after connecting with proper CL sync will start from old, hardcoded pivot, so will take way longer. Second option is to wait with downloading headers until pivot will be updated, then state sync can start almost immediately. In second option the whole sync process in most cases would be faster, so I recommend this approach. We could mix these approaches and make timeout e.g. 6h and if it will be hit, just save metadata to never update pivot on that node. In most cases timeout will never hit - 6h is enough time to wait for setting up CL. And if there is something wrong with setup, it will just start downloading headers after 6h, just to not wait infinitely. But it will not be able to start state sync until CL connection would be fixed anyway, and after fixing it, it will need to download a lot of blocks to start state sync, so doesn't have much sense |
Resolves #5986
Changes
MaxAttemptsToUpdatePivot
to infinite in practice (int.MaxValue
equal 0x7fffffff, equal 2147483647, equal ~68 years)attempts left
from logsTypes of changes
What types of changes does your code introduce?
Testing
Requires testing
Documentation
Requires documentation update
Requires explanation in Release Notes