Skip to content
This repository has been archived by the owner on Feb 12, 2024. It is now read-only.

Cancelable requests AKA Context #2884

Closed
daviddias opened this issue Aug 15, 2016 · 19 comments · Fixed by #2993
Closed

Cancelable requests AKA Context #2884

daviddias opened this issue Aug 15, 2016 · 19 comments · Fixed by #2993
Assignees
Labels
exp/wizard Extensive knowledge (implications, ramifications) required kind/feature

Comments

@daviddias
Copy link
Member

We need a way to be able to cancel calls that have to hit the network, either through a timeout or by a voluntary cancelation.

JavaScript doesn't provide a good interface for this natively, we need to research or build one our own.

For context, read https://blog.golang.org/context which is what we use in Go. (Another resource - http://bouk.co/blog/context/)

Notes from the sprint discussion:

  • Do we want timeouts No timeout on ipfs.ls is called ipfs-inactive/js-ipfs-http-client#71 ? (If no, defer to milestone 4)
  • Should we have timeouts, should go-ipfs be doing the timeouts and share through the API?
    We need to decide now if we should add this soon or if this is something that is a high priority right now.
  • There should always be the possibility for no time out so that requests can go through.
    go-ipfs may or may not have a default timeout, @jbenet was very against it because of the delay (?)
  • Compromise: have no time out as a default, and have a flag to add one per request
    The core should have a notion of timeout - context might be a good abstraction to use here, captures the requirement. All the requests that are made should be cancelable by the caller.
  • We currently don't have cancels through the go-ipfs-api core, they can kill the connection
    In go-ipfs this was an issue: not being able to cancel requests when they were made, wasting processing time and resources. Once cancelations were chained down, things went a lot better. Once the cancelation wiring was cancelled, it closed out all the lingering bits.
  • Open conversation async to look at ways to do this.
@dignifiedquire
Copy link
Member

Discussion around cancellation in the promise community:

@jbenet
Copy link
Member

jbenet commented Aug 30, 2016

Random experiment. it's not correct, just a draft. https://gist.github.com/jbenet/11671be17cbe91c67a00d9638d99a45d

@dignifiedquire
Copy link
Member

@diasdavid I can start doing some research and planning here, but I don't think it is realistic we can keep this in milestone 2

@daviddias
Copy link
Member Author

Expanding on this a bit...

The main issue comes from the Inversion of Control nature that Callback APIs forces us to submit, where we can't be sure how our callback (Continuation) is going to be called (if called at all). In the other hand, it seems Promises seem to get this a little bit better by removing time out of the equation and enabling us to pass 'values' around, independently when they get resolved, and in addition, the native Promise implementation already has a mechanism for situations where multiple outcomes can happen, making sure that only one of the paths is resolved, the Promise.race.

What this means, is that from the standpoint of the js-ipfs core caller, we can have:

const doOperationP = new Promise((resolve, reject) => {
  // do whatever it is supposed to do and call resolve when it is done
})

let cancel 
const cancelP = new Promise((resolve, reject) => {
  cancel = () => {
    // Do the operations to rewind the operation
    reject()
  }
})
const timeoutP = new Promise((resolve, reject) => { 
    setTimeout(reject, constants.TIMEOUT)
})
Promise.race([
  doOpP,
  cancelP,
  timeoutP
])

This way, only one of the operations is called, keeping the control within outside to compose what we need that fits our expectations.

For an object.get cancellation, the cancel operation would be quite trivial, as it would require a single bitswap unwant, however, for files.cat, the or dag.resolve, the situation gets hairy because it will unwrap several nodes that we still do not know about, forcing us to share state across async operations.

The way that we share this state is another rabbit hole, as it can be just a shared object/array, a queue or probably something like Go or Closure, a channel, so that when it is time to cancel, we have a clean perspective of what CIDs need to be canceled, this would inject yet another abstraction (not sure if we want to go there necessarily).

Nevertheless, I also want to point out that we don't even need to use promises, fortunately (or apparently), everything you can do with Promises, you could do before just by using our favorite and already present throughout the codebase, async library, async. race is also an operation available http://caolan.github.io/async/docs.html#race. (and btw, Promise.all and Promise.gate is just async.parallel, IIUC).

@pgte
Copy link
Contributor

pgte commented Nov 14, 2017

My 5 cents on this discussion:
I think that here we're discussing two distinct things: 1) the API for expressing a cancelation, where you may use promises or some other type of control and then there's 2) the cancellation implementation itself.
I think that 2 may be the hardest part.

Since there's a lot of shared state that you may want to roll back, these contexts look a lot like traditional transactions, where for every action there is an "undo" action. Implementing these may not be trivial.

For instance, even for canceling object.get this may not be simple. Removing an item from the wantlist may be not as trivial as removing the wanted hash (unless bitswap keeps a client count for each hash - I don't know anything about bitswap internals). Or it this nmay be a case where it's simple, but I guess there will be many actions where canceling will be not trivial.

Nesting: like when canceling a cat and stoping the traversing of nodes. This I think could be implemented by nesting contexts. A context can have one parent, and canceling an unfinished parent cancels all the unfinished children recursively.

About timeouts: I believe that it should be the client to decide about the timeout and cancel the action itself. This may mean extending the action interface all the way to the client, for them to be able to cancel an action (probably through a control channel if done remotely, for instance, through HTTP), which means that actions must immediately return (like a promise) and provide an action unique ID..

@daviddias
Copy link
Member Author

Had the realization the other day that probably what we need is just to add http://npmjs.com/continuation-local-storage. Need to investigate further :)

@victorb
Copy link
Member

victorb commented Apr 18, 2018

@diasdavid should this maybe be part of OKRs? Currently there is (AFAIK) no way of doing timeouts in js-land...

@whyrusleeping
Copy link
Member

Any update here? I've been hearing people really would like to have nice timeouts in js-ipfs stuff.

@alanshaw
Copy link
Member

I've been playing with using AbortController as a means for doing this and have implemented some abortable methods in ipfsx.

As part of the async iterators endevour I proposed https://github.com/libp2p/interface-peer-discovery/issues/2 which uses AbortableController and I'd welcome feedback on the approach. I also extracted out a more generic module to make any iterator or async iterator abortable in this way: abortable-iterator.

@parkan
Copy link
Contributor

parkan commented Nov 14, 2018

ooh ipfsx looks very interesting, good to learn about it! would love to see it surfaced more

@mitra42
Copy link

mitra42 commented Nov 17, 2018

What we really need is error returns, since the async / distributed nature of IPFS makes that (I understand) impossible, a timeout is an alternative, the how (from the app developer perspective) is less important, but the user experience of sitting waiting for something that never returns because for example the hash is invalid is really bad.

We've implemented timeouts at the app level for fetch's, but the challenge (at the app level) is that a reasonable timeout for a small file, is too short for a long file, so IPFS is pretty much always going to be timed-out on anything that takes more than a few seconds to retrieve.

For streams its much worse - since the jsipfs.files.catReadableStream call always succeeds, but on a bad, or unfindable, hash no data is received. This - along with a high error rate finding even valid hashes - was why we had to turn IPFS streams off for video & audio in the dweb.archive.org app, there were just too many cases of people sitting there waiting for the IPFS stream to start delivering the movie, when if there had been an error return we could have fallen back to HTTP. We switched the streams to WebTorrent because that does a graceful HTTP fallback.

@mitra42
Copy link

mitra42 commented Dec 2, 2018

If you want a good example of why this is important ... take a look at https://dweb.me/arc/archive.org/details/etree and watch the console. Recently IPFS lost a lot of our data again go-ipfs#5765 and on this page you can see the result of each icon hanging during load - the createReadStream succeeds so its assumed IPFS will work, and the UI doesn't fall back to fetch over HTTP so the page looks awful if IPFS is enabled. (With IPFS turned off it looks like https://dweb.me/arc/archive.org/details/etree?paused=IPFS )

@daviddias
Copy link
Member Author

@alanshaw, @achingbrain, @vmx, @jacobheun & @vasco-santos, given that this is a PO and that #1670 is a dependency to achieve this (so that then we can consider the ipfsx approaches at their full potential), shouldn't #1670 be bumped to P0 and be tracked on the OKRs of js-ipfs, js-libp2p and js-ipld as a big item?

@jacobheun
Copy link
Contributor

Yes, I think #1670 should be a P0. We're likely not going to get to everything this quarter, but we should make it a priority to make progress here. I think this issue should also be a first level concern when we're doing work for #1670. I think it's going to be important to keep this in mind when updating repos like libp2p/interface-transport, as building in abortable behavior early should save us quite a bit of time as more of the repos are refactored.

@vmx
Copy link
Member

vmx commented Jan 8, 2019

In regards to IPLD, that one is covered by the new APIs work. Those APIs will use async/await.

@alanshaw
Copy link
Member

@daviddias 👍 #1670 is already a P0 on the JS IPFS roadmap (search "Revamped APIs") and we have a P1 for working on it this quarter (behind the P0 of completing base32 v1 CIDs) - HTH

@daviddias
Copy link
Member Author

One thing it is still not answered fully is how AbortController and the Async Iterators refactor will provide a way to cancel an action that spawned multiple actions, e.g.:

When catting a file, Unixfs will ask for the blocks, Bitswap will create mutiple events to fetch its blocks and the DHT will create multiple crawls to find the nodes that then it has to connect to fetch the file. Aborting a request isn't just about saying "don't give me back the result because I don't want it", it is about being able to cancel all of this initiated actions that will be consuming memory, CPU and bandwidth.

For contrast, the AbortController puts the responsibility on the Browser engine to "clean out" all HTTP requests, TCP sockets and so on, once a request is canceled.

So far, I think the best approach is what go-ipfs does with Context, which the parallel in JS world is https://www.npmjs.com/package/continuation-local-storage, but I haven't seen it being used with async iterators.

@achingbrain
Copy link
Member

I think the approach is similar - the signal would be passed all the way down the call chain in a similar way to go's Context and long running operations would occasionally check with it to see if they've been aborted.

// in the application
const controller = new AbortController()

setTimeout(() => controller.abort(), 100)

try {
  const res = await ipfs.cat('Qmfoo', {
    signal: controller.signal
  })
} catch (err) {
  if (err instanceof AbortError) {
    // request was aborted
  }
}
//.. meanwhile, in the unixfs exporter
for (let i = 0; i < node.Links.length; i++) {
  // potentially a long running network operation
  const child = await ipld.get(node.Links[i].Hash)
  
  // were we aborted while fetching the next node?
  if (options.signal.aborted) {
    throw new AbortError()
  }

  // otherwise continue as normal
}

@alanshaw
Copy link
Member

alanshaw commented Oct 17, 2019

Yes exactly this, the signal is the context, and multiple interested parties can register to be notified on an abort via signal.addEventListener('abort', doCleanup)

for reference: https://developer.mozilla.org/en-US/docs/Web/API/AbortSignal

@achingbrain achingbrain transferred this issue from ipfs-inactive/interface-js-ipfs-core Mar 10, 2020
@achingbrain achingbrain added exp/wizard Extensive knowledge (implications, ramifications) required kind/feature labels Mar 10, 2020
@achingbrain achingbrain self-assigned this Apr 6, 2020
@achingbrain achingbrain linked a pull request Apr 21, 2020 that will close this issue
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
exp/wizard Extensive knowledge (implications, ramifications) required kind/feature
Projects
None yet
Development

Successfully merging a pull request may close this issue.