perf: chain: export-range #10145
Conversation
The code from my side is mostly finished. I have this running in "production" already, as we rely on it for archival snapshots. Testing has involved making sure that it exports exactly the same set of blocks as the previous code (#9192). We have used snapshots generated by both the previous code and this code to process the full Filecoin chain with Lily, so we are reasonably sure that they include what they should.

In terms of performance, this version of the code seems to be a bit faster (~10%), mostly due to fixed bugs and improved blockstore access. Memory profiles are the same: essentially, a large amount of memory is required to keep the "processed CID set" and the task queue (we cannot control its growth, since a blockstore block can have an arbitrary number of children that need to be queued). That said, memory usage is consistent. We had episodes of OOMs/leaks with the previous code, but those have not happened so far with this one. The fixes to some deadlocking code make me feel better about that, though.

In the previous PR (#9192), there was a comment from @magik6k about back-pressure and improved RPC handling when doing a normal export (without …). I am of the opinion that we should remove that possibility altogether and that export-range should only export to local disk. This is much faster. Currently snapshots are very large and take about 1h this way. I don't find it very useful to be able to ship them remotely via the RPC API (even more so given that they are not compressed). In any case, I would rather implement that later, if there is a better way than the one currently used, than let it delay having this functionality upstreamed.

I don't know what is up with CI. I'm happy to update the documentation when we have finished the discussion here.
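The memory shape described above (a processed-CID set plus a task queue whose growth cannot be bounded, because each block may link to any number of children) can be illustrated with a minimal breadth-first DAG walk. This is an illustrative sketch only, not the Lotus implementation; the `links` map standing in for blockstore reads, and string keys standing in for CIDs, are assumptions for the example.

```go
package main

import "fmt"

// walkDAG visits every node reachable from root exactly once.
// The seen set and the queue are the two memory consumers: the
// queue can grow arbitrarily, because each node may enqueue any
// number of children before the walker drains them.
func walkDAG(root string, links map[string][]string) []string {
	seen := map[string]struct{}{root: {}}
	queue := []string{root}
	var order []string
	for len(queue) > 0 {
		c := queue[0]
		queue = queue[1:]
		order = append(order, c)
		for _, child := range links[c] {
			if _, ok := seen[child]; !ok {
				seen[child] = struct{}{}
				queue = append(queue, child)
			}
		}
	}
	return order
}

func main() {
	// A tiny diamond-shaped DAG: "c" and "d" are reachable twice
	// but must be visited (and exported) only once.
	links := map[string][]string{
		"a": {"b", "c"},
		"b": {"c", "d"},
		"c": {"d"},
	}
	fmt.Println(walkDAG("a", links)) // [a b c d]
}
```

In the real exporter both structures hold CIDs for tens of millions of blocks, which is where the large but consistent memory footprint comes from.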
This first commit contains the first and second implementation stabs (after primary review by @hsanjuan), using a stack for task buffering.

Known issues:
* Ctrl-C (context cancellation) results in the export code getting deadlocked.
* Duplicate blocks in exports.
* Duplicate block reads from the store.

Original commit messages:
* works
* works against mainnet and calibnet
* feat: add internal export api method - will hopefully make things faster by not streaming the export over the json rpc api
* polish: better file naming
* fix: potential race in marking cids as seen
* chore: improve logging
* feat: front export with cache
* fix: give hector a good channel buffer on this shit
* docsgen
Removing this will mean that we will not be able to use this new export code with the existing snapshot service and will have to continue to use the … cc @ognots
It is not going to be much faster than what you already have when pushing the snapshot through the API endpoint.
This commit moderately refactors the ranged export code.

It addresses several problems:
* Code does not finish cleanly and things hang on Ctrl-C.
* The same block is read multiple times in a row (artificially increasing cached blockstore hit metrics to 50%).
* It is unclear whether there are additional races (a single worker quits when reaching height 0).
* CARs produced have duplicated blocks (~400k for an 80M-block CAR or so). Some blocks appear up to 5 times.
* Pointers are used for tasks where they are not necessary.

The changes:
* Use a FIFO instead of a stack: a simpler implementation as its own type. This has not proven to be much more memory-friendly, but it has not made things worse either.
* Avoid a probably not-small number of allocations by not using unnecessary pointers.
* Fix duplicated blocks by atomically checking and adding to the CID set.
* Context termination now works correctly. Worker lifetime is correctly tracked and all channels are closed, avoiding memory leaks and deadlocks.
* Ensure all work is finished before exiting, something that might have been broken in some edge cases previously. In practice, we would not have seen this except perhaps in very early snapshots close to genesis.

Initial testing shows the code is currently about 5% faster. Resulting snapshots do not have duplicates, so they are a bit smaller. We have manually verified that no CID is lost versus previous results, with both old and recent snapshots.
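The duplicate-block fix above hinges on making the membership check and the insertion a single atomic step: if workers first check the set and then add to it separately, two workers can both see a CID as unseen and both enqueue it. A minimal sketch of the check-and-add pattern, assuming a mutex-guarded set (the `seenSet` type and string CIDs are hypothetical names for this example, not the actual Lotus types):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// seenSet checks membership and inserts in one critical section.
// Visit returns true only for the first caller to present a CID,
// so a block can never be enqueued (or written to the CAR) twice,
// even when many workers race on it.
type seenSet struct {
	mu   sync.Mutex
	cids map[string]struct{}
}

func newSeenSet() *seenSet {
	return &seenSet{cids: make(map[string]struct{})}
}

func (s *seenSet) Visit(c string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.cids[c]; ok {
		return false
	}
	s.cids[c] = struct{}{}
	return true
}

func main() {
	s := newSeenSet()
	var wg sync.WaitGroup
	var wins int32
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// All eight workers race on the same CID; exactly
			// one of them observes Visit() == true.
			if s.Visit("bafy-example-cid") {
				atomic.AddInt32(&wins, 1)
			}
		}()
	}
	wg.Wait()
	fmt.Println(wins)                        // 1
	fmt.Println(s.Visit("bafy-example-cid")) // false
}
```

The same effect could be achieved with `sync.Map.LoadOrStore`; the point is that "check" and "add" must not be observable as two separate operations.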
The improvements in the range-export code avoid reading most blocks twice, and avoid writing some blocks to disk multiple times. The cache hit rate went down from close to 50% to a maximum of 12% at the very end of the export. The reason is that most CIDs are never read twice, since they are correctly tracked in the CID set. These numbers do not support maintaining the CachingBlockstore code. Additional testing shows that removing it has similar memory-usage behaviour and about 5-minute-faster execution (around 10%). Less code to maintain and fewer options to mess up.
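The drop in hit rate is consistent with simple arithmetic: when almost every block is fetched twice, roughly half of all reads can be served from a cache, so a ~50% hit rate mostly measured the exporter's own redundant reads rather than useful caching. A small sketch of that arithmetic (the figures below are illustrative round numbers, not measurements from this PR):

```go
package main

import "fmt"

// hitRate returns the best-case fraction of blockstore reads a cache
// can serve, given the number of unique blocks and total read
// operations: only reads beyond the first fetch of a block can hit.
func hitRate(unique, totalReads int) float64 {
	if totalReads == 0 {
		return 0
	}
	return float64(totalReads-unique) / float64(totalReads)
}

func main() {
	// If each of ~80M blocks were read twice, half of all reads
	// could be cache hits.
	fmt.Printf("%.0f%%\n", 100*hitRate(80_000_000, 160_000_000)) // 50%
	// With deduplicated reads, re-reads are rare and the achievable
	// hit rate collapses to the low teens.
	fmt.Printf("%.0f%%\n", 100*hitRate(80_000_000, 91_000_000)) // 12%
}
```

In other words, once the CID set prevents re-reads, the cache has almost nothing left to do, which is what justifies removing the CachingBlockstore.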
Additionally, per feedback:
* Set "admin" permissions.
* Remove from v0api.
OK, I think I am good wrt tests. The things that are failing or stuck don't seem related to this. It is a bit of a pain to rebase on master given the conflicts in generated code. What is left to do? I can write a tutorial, but hopefully that can be done post-merge?
Taking over from #9192 .
Related Issues
#9192
Proposed Changes
A new `chain export-range` command that can create archival-grade ranged exports of the chain as quickly as possible (letting Lotus write to disk directly).
Additional Info
Checklist
Before you mark the PR ready for review, please make sure that:
* The PR title is in the form of <PR type>: <area>: <change being made>, e.g. fix: mempool: Introduce a cache for valid signatures
* PR type: fix, feat, build, chore, ci, docs, perf, refactor, revert, style, test
* area, e.g. api, chain, state, market, mempool, multisig, networking, paych, proving, sealing, wallet, deps