
Investigate chain sync bottleneck when reaching 1.4M blocks synced #3294

Closed
EclesioMeloJunior opened this issue Jun 1, 2023 · 2 comments
Labels: Epic (issue used to track development status of a complex feature, aggregates several issues)
Comments

EclesioMeloJunior (Member) commented Jun 1, 2023

Issue summary

  • When Gossamer reaches ~1.41M blocks synced, the sync speed drops noticeably

[Screenshot 2023-06-13 at 10 21 51]

@EclesioMeloJunior self-assigned this Jun 5, 2023
@EclesioMeloJunior changed the title from "investigate chain sync bottleneck when increasing the number of workers" to "Investigate chain sync bottleneck when reaching 1.4M blocks synced" Jun 13, 2023
EclesioMeloJunior (Member, Author) commented:

The following pprof output was extracted while Gossamer was syncing in that region:

[image: pprof call graph]

Notice that the methods handleReadyBlock, handleWorkersResults, processBlockData and processBlockDataWithHeaderAndBody show 0s flat but 19.32s cumulative time, which indicates that nearly all of the time is spent in a function they call, and we know that VerifyBlock is part of the sync flow.

Going deeper into processBlockDataWithHeaderAndBody, we can see that VerifyBlock takes 18.82s:
[image: pprof call graph detail]

To be brief, the call chain is:

VerifyBlock -> verifyAuthorshipRight -> verifyBlockEquivocation -> b.blockState.GetBlockHashesBySlot -> GetSlotForBlock -> types.GetSlotFromHeader

In the end, GetSlotFromHeader calls DecodeBabePreDigest, which uses scale.Unmarshal to decode the varying data type, and as the profile shows this decode is quite expensive.
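To illustrate why this is costly, here is a toy sketch (simplified types, not Gossamer's actual header or SCALE codec): the slot number lives inside an encoded pre-digest, so every slot lookup pays a full decode, and the equivocation check repeats that decode for every unfinalized descendant on every import:

```go
package main

import "fmt"

// header is a toy stand-in for a block header: the slot number is only
// available by decoding the encoded BABE pre-digest.
type header struct {
	preDigest []byte // encoded pre-digest (slot number inside)
}

// decodeSlot stands in for types.DecodeBabePreDigest + scale.Unmarshal:
// a decode that must run on every call. Toy version: little-endian
// uint64 from the first 8 bytes.
func decodeSlot(h header) uint64 {
	var slot uint64
	for i := 0; i < 8; i++ {
		slot |= uint64(h.preDigest[i]) << (8 * i)
	}
	return slot
}

// getBlockHashesBySlot mirrors the profiled call chain: for every
// unfinalized descendant it calls the slot decoder again.
func getBlockHashesBySlot(descendants []header, slot uint64) int {
	count := 0
	for _, h := range descendants {
		if decodeSlot(h) == slot { // one decode per descendant, per import
			count++
		}
	}
	return count
}

func main() {
	chain := make([]header, 0, 1000)
	for i := 0; i < 1000; i++ {
		h := header{preDigest: []byte{byte(i), byte(i >> 8), 0, 0, 0, 0, 0, 0}}
		chain = append(chain, h)
		// Equivocation check on import: decodes every descendant so far,
		// so total decodes grow quadratically while finalization is stalled.
		getBlockHashesBySlot(chain, uint64(i))
	}
	fmt.Println("imported", len(chain), "blocks")
}
```

With finalization stalled, the descendant set grows with every import, so each VerifyBlock pays O(n) decodes and the sync as a whole pays O(n²).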

[image: pprof output for DecodeBabePreDigest]

This decode (DecodeBabePreDigest) is performed for every descendant of the highest finalized block, and we know that at this point (~1.41M) on the Westend chain finalization had stopped for some period of time.

Basically, each time we want to import a new block we need to verify it, and since no finalization is happening, the number of blocks since the latest finalized block only increases, which creates the bottleneck.
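One common way to remove this class of bottleneck is to decode each header's slot once and memoize it by block hash, so repeated equivocation checks become cheap map lookups instead of repeated decodes. A hypothetical sketch (slotCache and slotFor are illustrative names, not Gossamer APIs, and not necessarily how this was later fixed):

```go
package main

import "fmt"

// slotCache memoizes the expensive pre-digest decode so each header is
// decoded at most once, turning the per-import equivocation check from
// O(n) decodes into O(n) map lookups.
type slotCache struct {
	slots   map[string]uint64 // block hash -> decoded slot
	decodes int               // counts actual (expensive) decodes
}

func newSlotCache() *slotCache {
	return &slotCache{slots: make(map[string]uint64)}
}

// slotFor returns the cached slot for a block hash, invoking the
// decode function only on the first request.
func (c *slotCache) slotFor(hash string, decode func() uint64) uint64 {
	if s, ok := c.slots[hash]; ok {
		return s
	}
	c.decodes++
	s := decode()
	c.slots[hash] = s
	return s
}

func main() {
	c := newSlotCache()
	// Simulate three imports that each re-check the same two ancestors:
	for i := 0; i < 3; i++ {
		c.slotFor("blockA", func() uint64 { return 7 })
		c.slotFor("blockB", func() uint64 { return 8 })
	}
	// Only two decodes happen (one per distinct header), not six.
	fmt.Println("decodes:", c.decodes)
}
```

The trade-off is memory proportional to the number of unfinalized blocks, which is bounded again once finalization resumes and the cache can be pruned below the finalized head.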

@EclesioMeloJunior added the Type: Bug and Epic labels Jun 13, 2023
EclesioMeloJunior (Member, Author) commented:

This bottleneck was fixed by #3364

@P1sar P1sar closed this as completed Nov 16, 2023