fix: chain: put tipsetkey upon expansion of tipset #10069

Merged
merged 2 commits into from
Jan 19, 2023

Conversation

arajasek
Contributor

Related Issues

Fixes #10061

Proposed Changes

The bug, introduced in #9904, caused us to not put the "expanded" tipsetkey in our blockstore. We're now putting the expanded TSK.

I'm a little confused about how things were working before #9904. AFAICT we had the opposite bug before #9904 -- we only put the TSK when we were taking a new heaviest tipset, which does not happen during "catch-up" sync. I suspect we were missing all TSKs during catch-up sync in the previous version.
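
To make the failure mode concrete, here is a minimal toy model in Go. It is a sketch only: `Store`, `PutTipSet`, `expand`, and the `persistExpanded` flag are hypothetical names, not the Lotus `ChainStore` API, and siblings are grouped by height alone (it assumes blocks at the same height share parents).

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// TipSetKey is a toy tipset key: the sorted block IDs joined by commas.
type TipSetKey string

// keyOf builds the key for a set of block IDs without mutating the input.
func keyOf(blocks []string) TipSetKey {
	sorted := append([]string(nil), blocks...)
	sort.Strings(sorted)
	return TipSetKey(strings.Join(sorted, ","))
}

// Store is a toy chain store (hypothetical, not the Lotus ChainStore).
type Store struct {
	byHeight map[int][]string   // blocks seen so far, grouped by height
	keys     map[TipSetKey]bool // tipset keys that were persisted
}

// NewStore returns an empty toy store.
func NewStore() *Store {
	return &Store{byHeight: map[int][]string{}, keys: map[TipSetKey]bool{}}
}

// expand records the block (if new) and returns it together with every
// sibling already seen at the same height.
func (s *Store) expand(height int, block string) []string {
	known := false
	for _, b := range s.byHeight[height] {
		if b == block {
			known = true
			break
		}
	}
	if !known {
		s.byHeight[height] = append(s.byHeight[height], block)
	}
	return append([]string(nil), s.byHeight[height]...)
}

// PutTipSet persists the unexpanded key, expands the tipset, and persists the
// expanded key only when persistExpanded is true (false models the regression).
func (s *Store) PutTipSet(height int, block string, persistExpanded bool) TipSetKey {
	s.keys[keyOf([]string{block})] = true
	expanded := keyOf(s.expand(height, block))
	if persistExpanded {
		s.keys[expanded] = true
	}
	return expanded
}

// HasKey reports whether a tipset key was persisted.
func (s *Store) HasKey(k TipSetKey) bool { return s.keys[k] }

func main() {
	for _, fixed := range []bool{false, true} {
		s := NewStore()
		s.PutTipSet(10, "a", fixed)
		k := s.PutTipSet(10, "b", fixed)
		fmt.Printf("persistExpanded=%v key=%q stored=%v\n", fixed, k, s.HasKey(k))
	}
}
```

In this model, with `persistExpanded=false` the second block at a height yields an expanded key that nothing ever stored, so a later lookup by that key misses even though both single-block keys are present.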

Additional Info

Checklist

Before you mark the PR ready for review, please make sure that:

  • Commits have a clear commit message.
  • PR title is in the form of <PR type>: <area>: <change being made>
    • example: fix: mempool: Introduce a cache for valid signatures
    • PR type: fix, feat, build, chore, ci, docs, perf, refactor, revert, style, test
    • area, e.g. api, chain, state, market, mempool, multisig, networking, paych, proving, sealing, wallet, deps
  • New features have usage guidelines and / or documentation updates in
  • Tests exist for new functionality or change in behavior
  • CI is green

@arajasek arajasek requested a review from a team as a code owner January 19, 2023 16:38
@arajasek
Contributor Author

@raulk will be contributing a test <3

@arajasek arajasek linked an issue Jan 19, 2023 that may be closed by this pull request
Contributor

@ZenGround0 ZenGround0 left a comment

There's a case to be made that the key persistence belongs inside of expandTipSet, to match the fact that it's buried inside PersistTipSet in the non-expanded case.

But this LGTM

Member

@raulk raulk left a comment

Honestly this seems like a workaround since we're not correcting the fact that we are persisting tipset CIDs for unexpanded tipsets. However, if the Lotus team is happy with this solution (versus what we had in the past before this refactor was made, which worked better IMO and was cleaner IIRC), then let's go forward.

Just to check: this tipset CID will be correctly handled by the splitstore, right?

@arajasek
Contributor Author

> Honestly this seems like a workaround since we're not correcting the fact that we are persisting tipset CIDs for unexpanded tipsets. However, if the Lotus team is happy with this solution (versus what we had in the past before this refactor was made, which worked better IMO and was cleaner IIRC), then let's go forward.

@raulk We can drop the persistence of the unexpanded tipsets. I only have it there for redundancy / to cover our bases.

I can also definitely go back to what we had before, but I'm not convinced it was correct (from PR description):

> I'm a little confused about how things were working before #9904. AFAICT we had the opposite bug before #9904 -- we only put the TSK when we were taking a new heaviest tipset, which does not happen during "catch-up" sync. I suspect we were missing all TSKs during catch-up sync in the previous version.

I would like to understand this a bit before merging to make sure we know what we're doing and get it right this time.

> Just to check: this tipset CID will be correctly handled by the splitstore, right?

Yes, anything in our main chain will be protected, TSKs that aren't in our main chain will get GCed.
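
As a rough illustration of that retention rule, here is a toy mark step in Go. This is not the splitstore's actual algorithm or API: `retainMainChain` and the string keys are hypothetical, and the chain is modelled as a simple parent map.

```go
package main

import "fmt"

// retainMainChain walks parent links from head and returns the stored keys
// found on that path. Stored keys that are not on the path are the ones a
// sweep would be free to drop.
func retainMainChain(stored map[string]bool, parent map[string]string, head string) map[string]bool {
	kept := make(map[string]bool)
	seen := make(map[string]bool) // guards against cycles in malformed input
	for cur := head; cur != "" && !seen[cur]; cur = parent[cur] {
		seen[cur] = true
		if stored[cur] {
			kept[cur] = true
		}
	}
	return kept
}

func main() {
	// t1 <- t2 <- t3 is the main chain; fork2 is a sibling of t2 off the main chain.
	stored := map[string]bool{"t1": true, "t2": true, "t3": true, "fork2": true}
	parent := map[string]string{"t3": "t2", "t2": "t1", "fork2": "t1"}
	kept := retainMainChain(stored, parent, "t3")
	for _, k := range []string{"t1", "t2", "t3", "fork2"} {
		fmt.Printf("%s kept=%v\n", k, kept[k])
	}
}
```

Here `fork2` is stored but never reached from the head, so it is the only key left unprotected.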

@raulk
Member

raulk commented Jan 19, 2023

Confirmed that this patch fixes the issue with live syncing. After bringing up my node to speed with Hyperspace + letting it run for a few minutes, all block hashes are consistent as reported by the tool in #10060:

⟩ ./lotus-shed eth --repo=/Users/raul/.lotus-hyperspace check-tipsets | head -100
Current height: 10194
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10194 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10193 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10192 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10191 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10190 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10189 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10188 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10187 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10186 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10185 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10184 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10183 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10182 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10181 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10180 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10179 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10178 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10177 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10176 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10175 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10174 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10173 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10172 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10171 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10170 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10169 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10168 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10167 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10166 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10165 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10164 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10163 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10162 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10161 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10160 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10159 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10158 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10157 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10156 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10155 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10154 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10153 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10152 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10151 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10150 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10149 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10148 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10147 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10146 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10145 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10144 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10143 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10142 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10141 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10140 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10139 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10138 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10137 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10136 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10135 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10134 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10133 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10132 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10131 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10130 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10129 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10128 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10127 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10126 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10125 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10124 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10123 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10122 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10121 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10120 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10119 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10118 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10117 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10116 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10115 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10114 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10113 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10112 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10111 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10110 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10109 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10108 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10107 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10106 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10105 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10104 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10103 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10102 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10101 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10100 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10099 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10098 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10097 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @10096 are identical
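
For readers who have not looked at #10060, the check above amounts to a round trip: fetch a block by height, then fetch it again by the hash it reported, and compare. Below is a sketch of that idea in Go against an in-memory fake. `BlockSource`, `memSource`, and `mismatches` are hypothetical names for illustration, not the `lotus-shed` implementation, and no real RPC calls are made.

```go
package main

import "fmt"

// Block is a minimal stand-in for an Ethereum-style block response.
type Block struct {
	Number int
	Hash   string
}

// BlockSource is a hypothetical lookup interface mirroring the two RPC calls.
type BlockSource interface {
	ByNumber(n int) (Block, bool)
	ByHash(h string) (Block, bool)
}

// mismatches walks down from head over count heights and returns every height
// where the by-number and by-hash lookups disagree or a lookup fails.
func mismatches(src BlockSource, head, count int) []int {
	var bad []int
	for n := head; n > head-count && n >= 0; n-- {
		byNum, ok := src.ByNumber(n)
		if !ok {
			bad = append(bad, n)
			continue
		}
		byHash, ok := src.ByHash(byNum.Hash)
		if !ok || byHash != byNum {
			bad = append(bad, n)
		}
	}
	return bad
}

// memSource is an in-memory BlockSource used only for this demo.
type memSource struct {
	nums   map[int]Block
	hashes map[string]Block
}

func (m memSource) ByNumber(n int) (Block, bool) { b, ok := m.nums[n]; return b, ok }

func (m memSource) ByHash(h string) (Block, bool) { b, ok := m.hashes[h]; return b, ok }

// newMemSource builds a consistent source covering count heights ending at head.
func newMemSource(head, count int) memSource {
	m := memSource{nums: map[int]Block{}, hashes: map[string]Block{}}
	for n := head; n > head-count; n-- {
		b := Block{Number: n, Hash: fmt.Sprintf("h%d", n)}
		m.nums[n] = b
		m.hashes[b.Hash] = b
	}
	return m
}

func main() {
	src := newMemSource(10194, 5)
	delete(src.hashes, "h10192") // simulate a tipset key that was never persisted
	fmt.Println("inconsistent heights:", mismatches(src, 10194, 5))
}
```

A missing by-hash entry is how the original bug would surface in this model: the height resolves, but the hash derived from the expanded tipset key does not.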

@raulk
Member

raulk commented Jan 19, 2023

@arajasek unfortunately I don't have the bandwidth to go digging to find an explanation now. However, I can say that the previous version of the code had been running for months in Wallaby without issues. I can't spend the time now to re-grok the Lotus sync code to understand if there'd be any missing gaps if we went back to the original version, so let's merge this and open a TODO to clean this up later?

Successfully merging this pull request may close these issues.

Tipset CID flakiness
3 participants