Chain ranged export: rework and address current shortcomings
This commit moderately refactors the ranged export code. It addresses several
problems:
  * The code does not finish cleanly and hangs on Ctrl-C.
  * The same block is read multiple times in a row (artificially increasing
    cached blockstore metrics to 50%).
  * It is unclear whether there are additional races (a single worker quits
    when reaching height 0).
  * The CARs produced contain duplicated blocks (~400k for an 80M-block CAR or
    so). Some blocks appear up to 5 times.
  * Pointers are used for tasks where they are not necessary.

The changes:

  * Use a FIFO instead of a stack: a simpler implementation, as its own type.
    This has not proven to be much more memory-friendly, but it has not made
    things worse either.
  * Avoid a probably non-trivial number of allocations by not using
    unnecessary pointers.
  * Fix duplicated blocks by atomically checking and adding to the CID set.
  * Context termination now works correctly. Worker lifetime is correctly
    tracked and all channels are closed, avoiding memory leaks and deadlocks.
  * Ensure all work is finished before returning, something that might have
    been broken in some edge cases previously. In practice, we would not have
    seen this except perhaps in very early snapshots close to genesis.
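
The duplicate-block fix in the list above depends on the membership check and the insertion being a single atomic step. The following is a minimal sketch of that idea, not the code from this commit: `cidSet`, `Visit`, and `countWrites` are hypothetical names, and plain strings stand in for CIDs.

```go
package main

import (
	"fmt"
	"sync"
)

// cidSet is a hypothetical concurrency-safe set; strings stand in for CIDs.
type cidSet struct {
	mu   sync.Mutex
	seen map[string]struct{}
}

func newCidSet() *cidSet {
	return &cidSet{seen: make(map[string]struct{})}
}

// Visit reports whether key was newly added. The check and the insertion
// happen under one lock, so exactly one caller gets true for a given key.
func (s *cidSet) Visit(key string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.seen[key]; ok {
		return false
	}
	s.seen[key] = struct{}{}
	return true
}

// countWrites simulates numWorkers goroutines that all try to export the
// same keys, and returns how many writes actually happened.
func countWrites(keys []string, numWorkers int) int {
	set := newCidSet()
	var wg sync.WaitGroup
	var mu sync.Mutex
	writes := 0
	for i := 0; i < numWorkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for _, k := range keys {
				if set.Visit(k) { // only the first visitor writes the block
					mu.Lock()
					writes++
					mu.Unlock()
				}
			}
		}()
	}
	wg.Wait()
	return writes
}

func main() {
	keys := []string{"cid-a", "cid-b", "cid-c"}
	fmt.Println(countWrites(keys, 8)) // prints 3
}
```

If the check and the add were two separate locked calls, two workers could both see a key as missing and both write it, which would be one way to end up with the duplicated blocks described above.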

Initial testing shows the code is currently about 5% faster. Resulting
snapshots do not have duplicates so they are a bit smaller. We have manually
verified that no CID is lost versus previous results, with both old and recent
snapshots.
hsanjuan committed Feb 6, 2023
1 parent 828354c commit 87872a8
Showing 8 changed files with 374 additions and 274 deletions.
4 changes: 2 additions & 2 deletions api/api_full.go
@@ -172,9 +172,9 @@ type FullNode interface {
 	// If oldmsgskip is set, messages from before the requested roots are also not included.
 	ChainExport(ctx context.Context, nroots abi.ChainEpoch, oldmsgskip bool, tsk types.TipSetKey) (<-chan []byte, error) //perm:read

-	ChainExportRange(ctx context.Context, head, tail types.TipSetKey, cfg *ChainExportConfig) (<-chan []byte, error) //perm:read
+	ChainExportRange(ctx context.Context, head, tail types.TipSetKey, cfg ChainExportConfig) (<-chan []byte, error) //perm:read

-	ChainExportRangeInternal(ctx context.Context, head, tail types.TipSetKey, cfg *ChainExportConfig) error //perm:read
+	ChainExportRangeInternal(ctx context.Context, head, tail types.TipSetKey, cfg ChainExportConfig) error //perm:read

 	// ChainPrune prunes the stored chain state and garbage collects; only supported if you
 	// are using the splitstore
30 changes: 14 additions & 16 deletions api/proxy_gen.go

Some generated files are not rendered by default.

3 changes: 2 additions & 1 deletion api/types.go
@@ -399,9 +399,10 @@ func (m *MsgUuidMapType) UnmarshalJSON(b []byte) error {
 	return nil
 }

+// ChainExportConfig holds configuration for chain ranged exports.
 type ChainExportConfig struct {
 	WriteBufferSize int
-	Workers int64
+	NumWorkers int
 	CacheSize int
 	IncludeMessages bool
 	IncludeReceipts bool
4 changes: 2 additions & 2 deletions api/v0api/full.go
@@ -161,9 +161,9 @@ type FullNode interface {
 	// If oldmsgskip is set, messages from before the requested roots are also not included.
 	ChainExport(ctx context.Context, nroots abi.ChainEpoch, oldmsgskip bool, tsk types.TipSetKey) (<-chan []byte, error) //perm:read

-	ChainExportRange(ctx context.Context, head, tail types.TipSetKey, cfg *api.ChainExportConfig) (<-chan []byte, error) //perm:read
+	ChainExportRange(ctx context.Context, head, tail types.TipSetKey, cfg api.ChainExportConfig) (<-chan []byte, error) //perm:read

-	ChainExportRangeInternal(ctx context.Context, head, tail types.TipSetKey, cfg *api.ChainExportConfig) error //perm:read
+	ChainExportRangeInternal(ctx context.Context, head, tail types.TipSetKey, cfg api.ChainExportConfig) error //perm:read

 	// MethodGroup: Beacon
 	// The Beacon method group contains methods for interacting with the random beacon (DRAND)
22 changes: 10 additions & 12 deletions api/v0api/proxy_gen.go

Some generated files are not rendered by default.
