feat: FIP-0076: implement migration to populate ProviderSectors #246
Conversation
builtin/v13/migration/market.go
Outdated
```go
} else {
	//fmt.Printf("deal %d not found in providerSectors: %v\n", deal, oldState)

	newState.SectorNumber = 0 // FIP: if such a sector cannot be found, assert that the deal's end epoch has passed and use sector ID 0
```
redundant?
builtin/v13/migration/market.go
Outdated
```go
}

func (m *marketMigrator) migrateProviderSectorsAndStatesWithDiff(ctx context.Context, store cbor.IpldStore, prevInStatesCid, prevOutStatesCid, prevOutProviderSectorsCid, inStatesCid, prevInProposals, inProposals cid.Cid) (cid.Cid, cid.Cid, error) {
	diffs, err := amt.Diff(ctx, store, store, prevInStatesCid, inStatesCid, amt.UseTreeBitWidth(market12.StatesAmtBitwidth))
```
So, unfortunately, we're diffing the wrong states here if we care about re-orgs. I.e., it's possible for a re-org to change the sector number to which a deal is assigned without changing anything about the deal itself, and this diff won't catch that because it's diffing the deal states which don't record sector information.
I discussed this with @arajasek and we think the answer is to just not allow reorgs. I.e., don't use the cached migration if it's not an ancestor. That'll ensure that neither sector nor deal IDs are reused.
Basically:
- Wait for some measure of finality (not real finality, just a few epochs) after the pre-migration epoch before running the pre-migration. Alternatively, run the pre-migration at the specified epoch but with a lookback (i.e., migrate state from a few epochs ago).
- In the "migration" struct itself, record the highest pre-migration tipset.
- When running the actual migration (or, really, any pre-migrations), if there is a pre-migration tipset recorded and it's not an ancestor of the tipset being migrated, discard the cache (see the sketch below).
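A minimal sketch of that cache-validity rule. The `TipSet` type, the `isAncestor` and `clearCache` helpers, and the field names are hypothetical stand-ins for the node's actual chain types, not the go-state-types or lotus API:

```go
package migration

// TipSet is a hypothetical stand-in for the node's chain tipset type; this
// sketches the rule described above, not the real API.
type TipSet struct{ /* ... */ }

type cachedMigration struct {
	premigTipSet *TipSet // highest tipset a pre-migration ran on, if any
}

// canUseCache applies the rule above: a cached pre-migration is reusable
// only if it ran on an ancestor of the tipset now being migrated.
func (c *cachedMigration) canUseCache(current *TipSet,
	isAncestor func(old, new *TipSet) (bool, error),
	clearCache func()) (bool, error) {
	if c.premigTipSet == nil {
		return false, nil // nothing cached yet
	}
	ok, err := isAncestor(c.premigTipSet, current)
	if err != nil {
		return false, err
	}
	if !ok {
		clearCache() // re-org across the pre-migration: discard everything
		return false, nil
	}
	return true, nil
}
```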
Ugh. OK, maybe that's not the way. We do diff the sectors anyway. So...
- When we diff the sectors, we need to record the changes.
- When we diff the deal states, we need to apply/remove the sector changes.
- Finally, we need to iterate over the rest of the sector changes and fix their deal states (sketched below).
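A rough sketch of that three-step flow, using hypothetical names (`sectorChanges` as the worklist filled in by step 1, `fixRemaining` as the final pass); the migrator's real structures differ:

```go
package migration

import "github.com/filecoin-project/go-state-types/abi"

// Step 1 (the sector diff) fills sectorChanges: dealID -> new sector number
// (0 meaning the deal's sector went away). Steps 2 and 3 consume it.

// Step 2: while diffing deal states, consume any recorded change for the deal.
func applySectorChangeToDeal(sectorChanges map[abi.DealID]abi.SectorNumber,
	deal abi.DealID, setSector func(abi.SectorNumber)) {
	if sn, ok := sectorChanges[deal]; ok {
		setSector(sn)
		delete(sectorChanges, deal) // handled; drop from the worklist
	}
}

// Step 3: whatever remains belongs to deals that never appeared in the
// deal-state diff, so their (otherwise unchanged) states must be patched.
func fixRemaining(sectorChanges map[abi.DealID]abi.SectorNumber,
	patch func(abi.DealID, abi.SectorNumber) error) error {
	for deal, sn := range sectorChanges {
		if err := patch(deal, sn); err != nil {
			return err
		}
	}
	return nil
}
```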
@magik6k is the above note something you can look at addressing?
Force-pushed from 95e0e25 to 1a176da.
builtin/v13/migration/miner.go
Outdated
return nil, xerrors.Errorf("failed to unmarshal sector %d after: %w", sectorNo, err) | ||
} | ||
|
||
if len(sectorBefore.DealIDs) != len(sectorAfter.DealIDs) { |
What if we have the same number of deals, but different deals because of a reorg?
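For illustration, a stricter comparison that a reorg with equal deal counts would not slip past (the helper name is ours, not from the PR):

```go
package migration

import "github.com/filecoin-project/go-state-types/abi"

// sameDealIDs compares deal ID lists element-wise, so a reorg that swaps
// deals without changing the count is still detected.
func sameDealIDs(a, b []abi.DealID) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}
```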
I only reviewed FromScratch, which looks ok to me.
v13: Address migration review
v13: Handle deal proposal in case of reorgs
v13: organize imports
v13: perf nit: avoid repeated map lookups
implement new rules for uncached operations
implement new rules for cached operations
Force-pushed from 1a176da to d51cd17.
```go
// no cached migration, so we simply iterate all sectors and collect deal IDs

err = inSectors.ForEach(&sector, func(i int64) error {
	if len(sector.DealIDs) == 0 || sector.Expiration < UpgradeHeight {
```
Can we pass this into the migration from somewhere?
Yup, I have a note for that too.
```go
}
// There can't already be an entry for this sector, so just set (not append)
// TODO: golang: I need to copy this, since sector is a reference that's constantly updated? Right? Steb?
sectorDealIDs := sector.DealIDs
```
IIRC, we'll overwrite `sector` each time, but create new slices. However, I'd still feel safer copying.
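A defensive copy along those lines, as a suggested replacement for the assignment above; it assumes the surrounding names from this PR (`sectorDeals`, `sector.SectorNumber`). `ForEach` decodes into the same `sector` variable on every iteration, so copying makes the stored slice independent of later iterations:

```go
// Copy sector.DealIDs before storing it: `sector` is reused as the decode
// target on each ForEach iteration, and the copy is cheap relative to the
// migration as a whole.
sectorDealIDs := make([]abi.DealID, len(sector.DealIDs))
copy(sectorDealIDs, sector.DealIDs)
sectorDeals[sector.SectorNumber] = sectorDealIDs
```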
return nil, xerrors.Errorf("didn't find sector %d in inSectors", sectorNo) | ||
} | ||
|
||
if len(sector.DealIDs) == 0 { |
Modify needs to be treated as remove + add, so we need to run the remove logic from below even in this case.
Modify should strictly be an add, because the protocol doesn't actually let you remove deals ever.
Yeah, this only matters for reorgs.
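For illustration, a sketch of treating an AMT diff `Modify` entry as remove-then-add, per the exchange above; `removeDeals`/`addDeals` are hypothetical stand-ins for the real bookkeeping:

```go
package migration

import (
	amt "github.com/filecoin-project/go-amt-ipld/v4"
	"github.com/filecoin-project/go-state-types/abi"
)

// applySectorChange routes one sector diff entry to the deal bookkeeping.
// The Modify arm only does real removal work across a reorg: on the normal
// chain the protocol never removes deals from a live sector, so Modify
// reduces to a pure add.
func applySectorChange(change *amt.Change, beforeDeals, afterDeals []abi.DealID,
	removeDeals, addDeals func(sectorNo uint64, deals []abi.DealID)) {
	switch change.Type {
	case amt.Add:
		addDeals(change.Key, afterDeals)
	case amt.Remove:
		removeDeals(change.Key, beforeDeals)
	case amt.Modify:
		removeDeals(change.Key, beforeDeals) // non-empty only across a reorg
		addDeals(change.Key, afterDeals)
	}
}
```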
```go
	sectorDeals = make(map[abi.SectorNumber][]abi.DealID)
	m.providerSectors.updatesToMinerToSectorToDeals[abi.ActorID(mid)] = sectorDeals
}
// There can't already be an entry for this sector, so just set (not append)
```
reorg
builtin/v13/migration/market.go
Outdated
```go
addProviderSectorEntry := func(deal abi.DealID) (abi.SectorNumber, error) {
	sid, ok := m.providerSectors.dealToSector[deal]
	if !ok {
		return 0, xerrors.Errorf("deal %d not found in providerSectors", deal) // todo is this normal and possible??
```
See note here.
builtin/v13/migration/market.go
Outdated
```go
} else {
	//fmt.Printf("deal %d not found in providerSectors: %v\n", deal, oldState)

	newState.SectorNumber = 0 // FIP: if such a sector cannot be found, assert that the deal's end epoch has passed and use sector ID 0
```
See note here.
builtin/v13/migration/miner.go
Outdated
```go
if len(sectorBefore.DealIDs) != len(sectorAfter.DealIDs) {
	if len(sectorBefore.DealIDs) != 0 {
		return nil, xerrors.Errorf("WHAT?! sector %d modified, this not supported and not supposed to happen", i) // todo: is it? Can't happen w/o a deep, deep reorg, and even then we wouldn't use the cache??
```
See note here.
builtin/v13/migration/top.go
Outdated
```diff
@@ -17,6 +17,8 @@ import (
 	"golang.org/x/xerrors"
 )
+
+const UpgradeHeight = abi.ChainEpoch(1000)
```
this should probably be an input to the top-level migration func / some config thing.
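A sketch of the suggested shape, with a hypothetical `Config` field standing in for however the real top-level migration threads its parameters:

```go
package migration

import "github.com/filecoin-project/go-state-types/abi"

// Hypothetical sketch: the upgrade epoch becomes a migration input rather
// than a package-level const.
type Config struct {
	UpgradeEpoch abi.ChainEpoch // replaces: const UpgradeHeight = abi.ChainEpoch(1000)
	// ... existing knobs (worker counts, cache, etc.)
}
```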
return nil, xerrors.Errorf("failed to load sectors array: %w", err) | ||
} | ||
|
||
hasCached, prevSectors, err := in.Cache.Read(migration.MinerPrevSectorsInKey(in.Address)) |
We need to think about use of the cache -- providerSectors is populated based on whether or not there's a cache hit, but the market migration doesn't consider the possibility that some miners were cached, but others weren't.
```go
// now, process updatesToMinerToSectorToDeals, making the appropriate updates

for miner, sectorUpdates := range m.providerSectors.updatesToMinerToSectorToDeals {
```
Let's add an "is not nil" check in both this case and the "without diff" case below, and initialize the one we're using. Otherwise, I'm a bit concerned we might have somehow cached the market migration but not the sector migration...
Also, it would be nice to make this independent somehow?
- Determine if we're doing a cached market migration.
- Determine independently if we're using a cache from a prior sector migration (a sketch of the nil check follows).
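A sketch of the suggested guard, assuming the field names from this PR; the error message is illustrative:

```go
// Fail loudly if the sector migration never populated the diff-update map,
// e.g. because the market migration hit the cache but the miner migration
// did not (or vice versa).
if m.providerSectors.updatesToMinerToSectorToDeals == nil {
	return cid.Undef, cid.Undef, xerrors.New("nil updatesToMinerToSectorToDeals: market and miner migration caches are out of sync")
}
```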
I agree, but I'd like this to be a future item too. We can unblock "having an nv22 network" for now.
```go
	SlashEpoch: oldState.LastUpdatedEpoch,
}

// TODO: We should technically only set this if the deal hasn't expired,
```
Might as well check?
That involves looking up the proposal for every single deal in here, which is significant work, I think.
Ah. We don't include that info in the deal state. What if we check whether the current epoch is more than DealMinDuration after SectorStartEpoch? If so, we look up the deal state. Honestly, I think looking up the deal state each time won't be a huge issue either.
You're right that this shouldn't matter; it's just a bit annoying that we're relying on the time between the migration and the first pre-migration being short.
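A sketch of that heuristic; `dealMinDuration` and the lookup/clear callbacks are stand-ins for the real policy constant and state accessors:

```go
package migration

import "github.com/filecoin-project/go-state-types/abi"

// maybeClearSector sketches the lookback heuristic discussed above: only
// pay for a proposal lookup when the deal could plausibly have ended.
func maybeClearSector(currentEpoch, sectorStart, dealMinDuration abi.ChainEpoch,
	lookupEndEpoch func() (abi.ChainEpoch, error),
	clearSector func()) error {
	if currentEpoch <= sectorStart+dealMinDuration {
		return nil // deal cannot have ended yet; skip the lookup
	}
	end, err := lookupEndEpoch()
	if err != nil {
		return err
	}
	if end < currentEpoch {
		clearSector() // deal has expired; use sector ID 0 per FIP-0076
	}
	return nil
}
```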
I'm gonna track this as a future item, and see if looking it up each time isn't too much slower.
```go
// if the deal is now slashed, UNSET the sector number
if newState.SlashEpoch != -1 {
	newState.SectorNumber = 0
```
re-org (if we decide to handle those): expiration/sector number can change.
builtin/v13/migration/market.go
Outdated
```go
err = actorSectors.ForEach(nil, func(k string) error {
	return errItemFound
})
nonEmpty := false
if err == errItemFound {
	nonEmpty = true
	err = nil
}
if err != nil {
	return cid.Undef, cid.Undef, xerrors.Errorf("failed to iterate actor sectors: %w", err)
}
```
Consider attaching this to the "map" definition in util/adt/map.go so I never have to look at this code again?
Yessir.
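A sketch of the suggested helper, hung off the existing Map wrapper in util/adt/map.go; the method name `IsEmpty` is ours, and it assumes the wrapper's `ForEach(out, fn)` shape used in the snippet above:

```go
package adt

import "errors"

var errItemFound = errors.New("item found")

// IsEmpty reports whether the map has no entries by aborting iteration at
// the first key, mirroring the inline pattern above.
func (m *Map) IsEmpty() (bool, error) {
	err := m.ForEach(nil, func(k string) error {
		return errItemFound
	})
	switch {
	case err == errItemFound:
		return false, nil
	case err != nil:
		return false, err
	default:
		return true, nil
	}
}
```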
```go
var sectorDealIDs market13.SectorDealIDs
for sector, deals := range sectors {
	sectorDealIDs = deals
```
I can't wait for the fixed loop to ship by default. This is one of the dumbest go design decisions.
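For context, a standalone illustration of the pre-Go 1.22 loop-variable semantics being lamented here; Go 1.22 gives each iteration its own variable:

```go
package main

import "fmt"

func main() {
	// Pre-Go 1.22, `v` is a single variable shared across iterations, so
	// closures that capture it all observe the final value.
	var fns []func() int
	for _, v := range []int{1, 2, 3} {
		fns = append(fns, func() int { return v })
	}
	for _, f := range fns {
		fmt.Println(f()) // Go <= 1.21: 3 3 3; Go >= 1.22: 1 2 3
	}
}
```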
LGTM. This PR is correct (in practice) assuming we detect and avoid using the cache for reorgs in lotus. That means:
- Performing the premigration at some small lookback from the actual premigration epoch.
- Recording the tipset where we did the premigration.
- During the real migration (or any subsequent premigrations), check that the tipset being migrated descends from the premigration tipset. If not, discard the cache.
- During the real migration, DO NOT update the cache. If we reorg around the migration epoch, we just need to re-run the migration based on the last premigration.
Extracted from #236.
This is a direct copy of the code from there (except for changes made in earlier PRs #245 and #244). I have not yet examined the code for correctness, nor have I addressed some outstanding review on that PR, so I'm opening this as a draft for now. It effectively still needs review, including from me.
In particular, reviewers should examine whether the use of the migration cache is incorrect when the "pre-migrated" state is NOT an ancestor of the migration input state. It SHOULD be the case that this is ancestry-agnostic.