Data transfer can resume a transfer on restart #3417

Closed
hannahhoward opened this issue Aug 8, 2020 · 4 comments
Labels
area/markets Area: Markets dif/hard Suggests that having worked on the specific component affected by this issue is important effort/weeks Effort: Multiple Weeks kind/feature Kind: Feature P2 P2: Should be resolved status/won't fix

Comments

@hannahhoward
Contributor

Currently, when a node shuts down while data transfers are in progress -- either piece data being transferred for a storage deal or in-progress retrievals -- those transfers do not resume when the node restarts.

The end goal of this task is as follows: when I restart my node, any data transfers that were going prior to the node shutdown resume, without resending data already sent from one node to another.

This ticket involves several steps, and a fairly deep knowledge of go-graphsync, go-data-transfer, and go-fil-markets. It's also probably good to familiarize yourself with the relevant portions of the spec to understand theoretical concepts (https://beta.spec.filecoin.io/#systems__filecoin_files__data_transfer , https://beta.spec.filecoin.io/#systems__filecoin_markets -- though please be careful as these specs may be out of date).

I'll outline the approach here, though it's probably best to do it in separate pieces, and it may require some collaboration to get up and running.

What we have already:

  1. The underlying protocol for data transfer is Graphsync. A normal Graphsync request is expressed as a root CID and an IPLD selector (in the broadest sense, IPLD selectors are a query language for IPLD graphs, similar to how SQL is a query language for relational databases). The go-graphsync library has a mechanism to perform a request where the responder is told not to send back certain data the requestor already has. This is the graphsync/do-not-send-cids extension, which go-graphsync already supports.
  2. The state of every data transfer is persisted to disk at a high level in the go-data-transfer library which uses go-statemachine to manage state. You can see the existing data transfer state machine definition here (https://github.com/filecoin-project/go-data-transfer/blob/master/channels/channels_fsm.go)
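
To make the first point concrete, here is a minimal sketch of the idea behind the do-not-send extension. It does not use the go-graphsync API: the `node` type, the string IDs standing in for CIDs, and the `blocksToSend` function are all hypothetical, and a plain depth-first walk stands in for selector traversal. It assumes the responder still walks through blocks it was told to skip (it holds them locally) and only omits them from what it transmits.

```go
package main

import "fmt"

// node is a hypothetical stand-in for an IPLD block: an ID (in place of a
// CID) plus the IDs of the blocks it links to.
type node struct {
	id    string
	links []string
}

// blocksToSend walks the graph depth-first from root and returns the IDs the
// responder would transmit, omitting every ID listed in doNotSend.
// Skipped blocks are still traversed so that their children are reached.
func blocksToSend(store map[string]node, root string, doNotSend map[string]bool) []string {
	var out []string
	visited := map[string]bool{}

	var walk func(id string)
	walk = func(id string) {
		if visited[id] {
			return
		}
		visited[id] = true
		n, ok := store[id]
		if !ok {
			return // block not held locally; nothing to send or follow
		}
		if !doNotSend[id] {
			out = append(out, id)
		}
		for _, l := range n.links {
			walk(l)
		}
	}
	walk(root)
	return out
}

func main() {
	store := map[string]node{
		"root": {id: "root", links: []string{"a", "b"}},
		"a":    {id: "a", links: []string{"c"}},
		"b":    {id: "b"},
		"c":    {id: "c"},
	}
	// The requestor already holds "root" and "a" from before the restart.
	already := map[string]bool{"root": true, "a": true}
	fmt.Println(blocksToSend(store, "root", already)) // prints [c b]
}
```

Only the blocks the requestor is missing come back, which is the property a resumed transfer relies on.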

What we need:

  1. We need to record data transfer state at a fine-grained level in go-data-transfer, for the side that is receiving data, in terms of the actual CIDs already received, so that we can create a Graphsync request that includes all the CIDs we already have when we restart the transfer.
  2. Upon restart of a node, we need to send an event to all tracked data transfers that are not in a terminal state to restart.
  3. We need to send a network message to the other end of a data transfer saying that our node shut down and we need to restart the transfer. The node that shut down could be the receiver of data or the sender of data. If we are the receiver, we can probably just create a new Graphsync request with the relevant CIDs to ignore (plus we'll need to find a way to make sure the request is still validated -- see how data transfer requests are validated in the spec). If we are the sender, the receiver will still need to send a new Graphsync request, so we'll need to send a separate message over the data transfer libp2p protocol from sender to receiver to communicate that we need a new Graphsync request.
  4. We'll need to make sure request validators in both storage and retrieval markets in go-fil-markets will revalidate a resuming request as well as a new request.
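
The first three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the go-data-transfer design: the `channel`, `status`, and `restartAction` types and the `planRestarts` function are hypothetical names, state lives in memory rather than on disk, and validation (step 4) is left out. The receiver records each CID as it arrives; on restart, every channel that is not in a terminal state yields either a new request carrying the CIDs to skip (we are the receiver) or a note to ask the other side for a new request (we are the sender).

```go
package main

import (
	"fmt"
	"sort"
)

// status is a hypothetical, simplified channel status.
type status int

const (
	ongoing status = iota
	completed
	failed
)

// terminal reports whether a channel in this status should be left alone on restart.
func (s status) terminal() bool { return s == completed || s == failed }

// channel is a hypothetical record of one data transfer.
type channel struct {
	id         string
	status     status
	isReceiver bool            // true if this node receives the data
	received   map[string]bool // CIDs (as strings) already received
}

// recordBlock notes that a block has been received on this channel.
func (c *channel) recordBlock(cid string) {
	if c.received == nil {
		c.received = map[string]bool{}
	}
	c.received[cid] = true
}

// restartAction describes what to do for one channel after a restart.
type restartAction struct {
	channelID string
	// sendNewRequest is true when we are the receiver and can issue a new
	// request ourselves; false means we are the sender and must ask the
	// receiver to issue one.
	sendNewRequest bool
	doNotSend      []string // CIDs to list as already held (receiver only)
}

// planRestarts returns one action per channel that is not in a terminal state.
func planRestarts(channels []*channel) []restartAction {
	var actions []restartAction
	for _, c := range channels {
		if c.status.terminal() {
			continue
		}
		a := restartAction{channelID: c.id, sendNewRequest: c.isReceiver}
		if c.isReceiver {
			for cid := range c.received {
				a.doNotSend = append(a.doNotSend, cid)
			}
			sort.Strings(a.doNotSend) // map order is random; sort for stable output
		}
		actions = append(actions, a)
	}
	return actions
}

func main() {
	in := &channel{id: "ch-1", status: ongoing, isReceiver: true}
	in.recordBlock("cid-b")
	in.recordBlock("cid-a")
	out := &channel{id: "ch-2", status: ongoing}
	done := &channel{id: "ch-3", status: completed, isReceiver: true}

	for _, a := range planRestarts([]*channel{in, out, done}) {
		fmt.Printf("%s newRequest=%v doNotSend=%v\n", a.channelID, a.sendNewRequest, a.doNotSend)
	}
}
```

Keeping the received set per channel is what lets the restarted request skip data already transferred; a real implementation would need to persist it alongside the existing channel state.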
@aarshkshah1992
Contributor

@hannahhoward Could you please flesh out this issue with a description of the problem, any logs, and the solutions that were proposed?

@aarshkshah1992
Contributor

@whyrusleeping Could you please assign this to me?

@hannahhoward
Contributor Author

So, fyi @aarshkshah1992 this is actually a fairly complicated ticket. You're welcome to take it on, but just fair warning. There are actually several steps, and I would take them on individually -- so we can do reviews and incremental improvements, even if we're not getting real user functionality till the end.

I'm going to write them in the description above.

@daviddias daviddias transferred this issue from filecoin-project/go-fil-markets Aug 31, 2020
@daviddias daviddias added area/markets Area: Markets P1 P1: Must be resolved labels Aug 31, 2020
@hannahhoward hannahhoward added P2 P2: Should be resolved dif/expert effort/weeks Effort: Multiple Weeks kind/feature Kind: Feature and removed P1 P1: Must be resolved labels Sep 17, 2020
@dineshshenoy dineshshenoy added dif/hard Suggests that having worked on the specific component affected by this issue is important and removed dif/expert labels Nov 16, 2020
@TippyFlitsUK
Contributor

TippyFlitsUK commented Feb 9, 2023

Hi Hannah 👋

The Legacy Lotus Markets sub-system reached EOL at the end of 31st January 2023.

This ticket is being marked as won't fix and closed as the Lotus Team will no longer be making any further fixes or enhancements to the legacy markets subsystem.

Please feel free to re-open this ticket in the new markets sub-system repository at https://github.com/filecoin-project/boost if you feel that it is still relevant.

Many thanks 🙏


6 participants