Feat/datamodel selector retrieval #6393

Merged (7 commits) on Oct 5, 2021

@ribasushi (Collaborator) commented Jun 4, 2021

(This PR depends on and includes #6375.)

Introduce a new RetrievalOrder struct field and a CLI option that take a string representation as understood by https://pkg.go.dev/github.com/ipld/go-ipld-selector-text-lite#SelectorSpecFromPath. This allows partial retrieval of any sub-DAG of a deal, provided the user knows the exact low-level shape of the deal contents.

For example, with this patch one can retrieve the first entry of a UnixFS directory by executing:

  lotus client retrieve --miner f0XXXXX --datamodel-path-selector 'Links/0/Hash' bafyROOTCID ~/output

See the top of itests/deals_partial_retrieval_test.go for a more elaborate example.
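
To make the path syntax concrete, here is a minimal sketch of how a slash-separated path such as 'Links/0/Hash' descends a dag-pb-shaped node. It is a toy model built on plain Go maps and slices, not go-ipld-prime or go-ipld-selector-text-lite. The names `resolvePath` and `sampleDir`, and the placeholder hashes, are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// resolvePath walks a slash-separated path through nested maps and slices.
// Map segments are looked up by key, list segments by decimal index.
// Hypothetical helper: it only imitates the idea of a data-model path.
func resolvePath(node interface{}, path string) (interface{}, error) {
	for _, seg := range strings.Split(path, "/") {
		switch n := node.(type) {
		case map[string]interface{}:
			child, ok := n[seg]
			if !ok {
				return nil, fmt.Errorf("no field %q", seg)
			}
			node = child
		case []interface{}:
			i, err := strconv.Atoi(seg)
			if err != nil || i < 0 || i >= len(n) {
				return nil, fmt.Errorf("bad list index %q", seg)
			}
			node = n[i]
		default:
			return nil, fmt.Errorf("cannot descend into scalar at %q", seg)
		}
	}
	return node, nil
}

// sampleDir mimics the shape of an unsharded dag-pb UnixFS directory:
// a Links list whose entries carry Name and Hash fields.
func sampleDir() map[string]interface{} {
	return map[string]interface{}{
		"Links": []interface{}{
			map[string]interface{}{"Name": "a.txt", "Hash": "bafyFIRST"},
			map[string]interface{}{"Name": "b.txt", "Hash": "bafySECOND"},
		},
	}
}

func main() {
	got, err := resolvePath(sampleDir(), "Links/0/Hash")
	fmt.Println(got, err)
}
```

The sketch also shows why the user has to know the exact low-level shape of the deal contents: a path that names a missing field or an out-of-range index simply fails to resolve.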

@ribasushi (Collaborator Author) commented Jun 4, 2021

Review note: this is the first user-facing selector interface that I know of, and selector execution is not something we have much prior art for. I'd ask @warpfork, @mvdan, @willscott and @hannahhoward to examine these parts very carefully:

  • func TraverseDag(
        ctx context.Context,
        ds mdagipld.DAGService,
        startFrom cid.Cid,
        optionalSelector ipld.Node,
        visitCallback traversal.AdvVisitFn,
    ) error {
        // If no selector is given - use *.*
        // See discussion at https://github.com/ipld/go-ipld-prime/issues/171
        if optionalSelector == nil {
            ssb := builder.NewSelectorSpecBuilder(basicnode.Prototype.Any)
            optionalSelector = ssb.ExploreRecursive(
                selector.RecursionLimitNone(),
                ssb.ExploreUnion(
                    ssb.Matcher(),
                    ssb.ExploreAll(ssb.ExploreRecursiveEdge()),
                ),
            ).Node()
        }
        parsedSelector, err := selector.ParseSelector(optionalSelector)
        if err != nil {
            return err
        }
        // not sure what this is for TBH...
        linkContext := ipld.LinkContext{Ctx: ctx}
        // this is what allows us to understand dagpb
        nodePrototypeChooser := dagpb.AddSupportToChooser(
            func(ipld.Link, ipld.LinkContext) (ipld.NodePrototype, error) {
                return basicnode.Prototype.Any, nil
            },
        )
        // this is how we implement GETs
        linkSystem := cidlink.DefaultLinkSystem()
        linkSystem.StorageReadOpener = func(_ ipld.LinkContext, lnk ipld.Link) (io.Reader, error) {
            if cl, isCid := lnk.(cidlink.Link); !isCid {
                return nil, fmt.Errorf("unexpected link type %#v", lnk)
            } else {
                node, err := ds.Get(context.TODO(), cl.Cid)
                if err != nil {
                    return nil, err
                }
                return bytes.NewBuffer(node.RawData()), nil
            }
        }
        // this is how we pull the start node out of the DS
        startLink := cidlink.Link{Cid: startFrom}
        startNodePrototype, err := nodePrototypeChooser(startLink, linkContext)
        if err != nil {
            return err
        }
        startNode, err := linkSystem.Load(
            linkContext,
            startLink,
            startNodePrototype,
        )
        if err != nil {
            return err
        }
        // this is the actual execution, invoking the supplied callback
        return traversal.Progress{
            Cfg: &traversal.Config{
                Ctx:                            ctx,
                LinkSystem:                     linkSystem,
                LinkTargetNodePrototypeChooser: nodePrototypeChooser,
            },
        }.WalkAdv(startNode, parsedSelector, visitCallback)
    }
  • root := order.Root
    if order.DatamodelPathSelector != nil {
        // no err check - we just compiled this before starting, but now we do not wrap a `*`
        selspec, _ := textselector.SelectorSpecFromPath(*order.DatamodelPathSelector, nil) //nolint:errcheck
        if err := utils.TraverseDag(
            ctx,
            rdag,
            root,
            selspec.Node(),
            func(p traversal.Progress, n ipld.Node, r traversal.VisitReason) error {
                if r == traversal.VisitReason_SelectionMatch {
                    cidLnk, castOK := p.LastBlock.Link.(cidlink.Link)
                    if !castOK {
                        return xerrors.Errorf("cidlink cast unexpectedly failed on '%s'", p.LastBlock.Link.String())
                    }
                    root = cidLnk.Cid
                }
                return nil
            },
        ); err != nil {
            finish(xerrors.Errorf("Finding partial retrieval sub-root: %w", err))
            return
        }
    }
  • // FIXME - this is a direct copy from https://github.com/filecoin-project/go-fil-markets/blob/v1.4.0/shared/selectors.go#L11-L16
    // Unable to use it because we need the SelectorSpec, and markets exposes just a reified node
    ssb := builder.NewSelectorSpecBuilder(basicnode.Prototype.Any)
    selspec := ssb.ExploreRecursive(
        selector.RecursionLimitNone(),
        ssb.ExploreAll(ssb.ExploreRecursiveEdge()),
    )
    if order.DatamodelPathSelector != nil {
        var err error
        selspec, err = textselector.SelectorSpecFromPath(*order.DatamodelPathSelector, selspec)
        if err != nil {
            finish(xerrors.Errorf("failed to parse selector '%s': %w", *order.DatamodelPathSelector, err))
            return
        }
        log.Infof("partial retrieval of datamodel-path-selector %s/*", *order.DatamodelPathSelector)
    }
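
The sub-root lookup in the second excerpt can be illustrated with a small self-contained sketch. It is a toy model, not the PR's code and not go-ipld-prime: `link`, `blockStore`, `step`, `findSubRoot` and `sampleStore` are hypothetical names, and blocks are plain maps and slices. The assumption being illustrated is that the walk loads a new block each time the path lands on a link, so that the last link crossed when the path is exhausted identifies the sub-DAG root.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// link is a toy stand-in for a CID link.
type link string

// blockStore is a toy stand-in for a DAG service: link -> decoded block.
type blockStore map[link]interface{}

// step descends one path segment into a map (by key) or a list (by index).
func step(node interface{}, seg string) (interface{}, error) {
	switch n := node.(type) {
	case map[string]interface{}:
		if child, ok := n[seg]; ok {
			return child, nil
		}
	case []interface{}:
		if i, err := strconv.Atoi(seg); err == nil && i >= 0 && i < len(n) {
			return n[i], nil
		}
	}
	return nil, fmt.Errorf("cannot resolve segment %q", seg)
}

// findSubRoot follows path from root. Whenever a step lands on a link it
// loads the target block and remembers that link, loosely mirroring how the
// PR's callback reads p.LastBlock.Link when the selector matches.
func findSubRoot(store blockStore, root link, path string) (link, error) {
	node, ok := store[root]
	if !ok {
		return "", fmt.Errorf("root block %s not found", root)
	}
	last := root
	for _, seg := range strings.Split(path, "/") {
		child, err := step(node, seg)
		if err != nil {
			return "", err
		}
		if l, isLink := child.(link); isLink {
			if child, ok = store[l]; !ok {
				return "", fmt.Errorf("block %s not found", l)
			}
			last = l
		}
		node = child
	}
	return last, nil
}

// sampleStore holds a directory block and one of its two children.
func sampleStore() blockStore {
	return blockStore{
		"bafyROOT": map[string]interface{}{
			"Links": []interface{}{
				map[string]interface{}{"Name": "a.txt", "Hash": link("bafyFIRST")},
				map[string]interface{}{"Name": "b.txt", "Hash": link("bafySECOND")},
			},
		},
		"bafyFIRST": map[string]interface{}{"Data": "hello"},
	}
}

func main() {
	sub, err := findSubRoot(sampleStore(), "bafyROOT", "Links/0/Hash")
	fmt.Println(sub, err)
}
```

In this toy, a path that never crosses a link leaves the result at the original root, and a link whose block is missing from the store surfaces as an error, much as a failed GET would in the real traversal.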

cli/client.go

    ssb := builder.NewSelectorSpecBuilder(basicnode.Prototype.Any)
    selspec := ssb.ExploreRecursive(
        selector.RecursionLimitNone(),
        ssb.ExploreAll(ssb.ExploreRecursiveEdge()),

Contributor:

Do we want a matcher here as well?

Collaborator Author:

I opted for

// FIXME - this is a direct copy from https://github.com/filecoin-project/go-fil-markets/blob/v1.4.0/shared/selectors.go#L11-L16
// Unable to use it because we need the SelectorSpec, and markets exposes just a reified node

Diverging didn't make sense... @hannahhoward should comment here: either we export the spec, or we adjust both
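
For readers unfamiliar with the matcher question, the following toy sketch shows the distinction being discussed. It does not use go-ipld-prime; `tree`, `walk` and `countMatches` are hypothetical names, and the model rests on the assumption that a recursive explore-all selector visits every node either way, while adding a matcher changes whether the visitor sees each node reported as a match or merely as a candidate.

```go
package main

import "fmt"

// tree is a toy node: a label plus children.
type tree struct {
	label    string
	children []tree
}

// walk visits every node depth-first. withMatcher models adding a matcher
// to a recursive explore-all selector: nodes are then reported to the
// visitor as "match" instead of "candidate".
func walk(t tree, withMatcher bool, visit func(label, reason string)) {
	reason := "candidate"
	if withMatcher {
		reason = "match"
	}
	visit(t.label, reason)
	for _, c := range t.children {
		walk(c, withMatcher, visit)
	}
}

// countMatches reports how many nodes were visited and how many of those
// visits were flagged as matches.
func countMatches(t tree, withMatcher bool) (visited, matched int) {
	walk(t, withMatcher, func(_, reason string) {
		visited++
		if reason == "match" {
			matched++
		}
	})
	return visited, matched
}

// sampleTree is a root with two leaf children.
func sampleTree() tree {
	return tree{label: "root", children: []tree{{label: "a"}, {label: "b"}}}
}

func main() {
	fmt.Println(countMatches(sampleTree(), false))
	fmt.Println(countMatches(sampleTree(), true))
}
```

Under this model the set of nodes visited is identical in both cases, which may be why keeping the copied selector unchanged was considered acceptable; whether the real libraries behave this way in every code path is something the reviewers would need to confirm.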

alanshaw pushed a commit to nftstorage/nft.storage that referenced this pull request Jun 18, 2021
The deal data selector is now available so if folks actually want to retrieve data from Filecoin they'll soon be able to use this (🔜 currently needs custom lotus client filecoin-project/lotus#6393).

resolves #192
@ribasushi ribasushi marked this pull request as draft July 27, 2021 11:04
@ribasushi ribasushi force-pushed the feat/datamodel-selector-retrieval branch 2 times, most recently from 7ae0512 to ab1377f Compare September 10, 2021 07:22
@ribasushi ribasushi force-pushed the feat/datamodel-selector-retrieval branch 2 times, most recently from 08ee9df to 988d14b Compare September 10, 2021 07:33
The selector syntax is documented at
https://pkg.go.dev/github.com/ipld/go-ipld-selector-text-lite#SelectorSpecFromPath

Example use, assuming that:
  - The root of the deal is a plain dag-pb unixfs directory
  - The directory is not sharded
  - The user wants to retrieve the first entry in that directory

  lotus client retrieve --miner f0XXXXX --datamodel-path-selector 'Links/0/Hash' bafyROOTCID ~/output

For a much more elaborate example see the top of ./itests/deals_partial_retrieval_test.go
@ribasushi ribasushi force-pushed the feat/datamodel-selector-retrieval branch from 06f5253 to af0d9b6 Compare October 4, 2021 21:21
codecov bot commented Oct 4, 2021

Codecov Report

Merging #6393 (af0d9b6) into master (0a04302) will increase coverage by 0.43%.
The diff coverage is 44.19%.

@@            Coverage Diff             @@
##           master    #6393      +/-   ##
==========================================
+ Coverage   39.14%   39.57%   +0.43%     
==========================================
  Files         614      617       +3     
  Lines       64997    65373     +376     
==========================================
+ Hits        25440    25871     +431     
+ Misses      35150    34996     -154     
- Partials     4407     4506      +99     
Impacted Files                   Coverage Δ
api/api_full.go                  47.36% <ø> (ø)
api/v0api/v1_wrapper.go          2.22% <0.00%> (-0.11%) ⬇️
api/version.go                   80.00% <ø> (ø)
build/panic_reporter.go          0.00% <0.00%> (ø)
chain/gen/gen.go                 67.27% <0.00%> (ø)
cli/client.go                    22.82% <0.00%> (+3.71%) ⬆️
cmd/lotus-miner/actor.go         8.15% <0.00%> (+0.05%) ⬆️
cmd/lotus-miner/init.go          0.00% <0.00%> (ø)
cmd/lotus-miner/main.go          4.27% <0.00%> (-0.45%) ⬇️
cmd/lotus-seal-worker/main.go    0.00% <0.00%> (ø)
... and 71 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@ribasushi (Collaborator Author) commented

This PR has been updated once again to track latest master. In fact it had to be merged on top of (and thus includes) #7441, because go-ipld-prime interfaces changed in a patch version again 🤦‍♂️

There are now negative tests addressing most of the comments above; everything is green, and coverage even went up.

Also, an identical CLI flag with the same functionality got merged into the filc standalone client. Folks who need the feature today can try switching to that client.

@mvdan commented Oct 4, 2021

> because go-ipld-prime interfaces changed in a patch version again

Which interface, out of curiosity? Nothing came to mind between v0.12.0 and v0.12.3, and I just skimmed the list of commits and didn't spot anything obvious.

@ribasushi (Collaborator Author) commented

CommonSelector_MatchAllRecursively became something else

@rvagg (Member) commented Oct 5, 2021

Ahh, sorry, I suppose that was me. I'm trying to move away from exposing bare Selector instances and to push the Node form everywhere it's used externally. There'll probably be more breakages around that at some point, hopefully in a non-patch release!

@jennijuju jennijuju added P2 P2: Should be resolved release/backport labels Oct 5, 2021
@jennijuju jennijuju modified the milestones: v1.13.1, v1.13.0 Oct 5, 2021
@jennijuju jennijuju merged commit 26ae8bb into master Oct 5, 2021
@jennijuju jennijuju deleted the feat/datamodel-selector-retrieval branch October 5, 2021 16:27
@jennijuju jennijuju added the impact/api-breakage Impact: API Breakage label Oct 5, 2021
@arajasek arajasek restored the feat/datamodel-selector-retrieval branch October 6, 2021 00:00