-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Conversation
Pulls non-grpc changes out of #3403 to ease the continued merging of master into that branch. - Moves withTimeoutOption into core-utils - Moves TimeoutError into core-utils - Adds missing ts project links - Adds more add-all tests to interface suite - Ignores unpassable tests for non-grpc or core implementations - Normalises mode and mtime in normalise-input function
Pulls non-grpc changes out of #3403 to ease the continued merging of master into that branch. - Moves withTimeoutOption into core-utils - Moves TimeoutError into core-utils - Adds missing ts project links - Adds more add-all tests to interface suite - Ignores unpassable tests for non-grpc or core implementations - Normalises mode and mtime in normalise-input function - Dedupes mtime normalisation between core and http client
41acdb8
to
874f1f7
Compare
11f2292
to
5c07b65
Compare
Tagged @aschmahmann as it'd be great to get some go-IPFS perspective since ideally the gRPRC-web<->websocket bridge will be implemented in go-IPFS as well. |
d3ee91d
to
bdf3040
Compare
Thank you for creating this @achingbrain Change like this impacts GO, JS, and entire ecosystem: I feel we may want to give this more thought and will most likely park this until January, (folks are busy with various work threads and may not have time to give this a proper review and thought until then, including myself). Figuring this out should be on our cross-project roadmap for 2021.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything here makes sense to me from a design perspective. Not coming from a js background, I only had one comment/question on the actual code.
Is there anything that need to be done regarding CORS, that should be documented, like:
jsipfs config --json API.HTTPHeaders.Access-Control-Allow-Origin '["http://127.0.0.1:5003"]'
@achingbrain I'm probably just missing something, but how would this affect the remote clients in Go? Would we be trying to change our HTTP client to use GRPC? Would Go need a GRPC-web client, or just a GRPC client to match functionality? |
Not as implemented because the traffic is only ever coming from and going to Unless we want to open it up to other origins of course in which case yes, we'd need to add some CORS configuration, in which case it'd need to work the same as our existing HTTP API.
We'd need a go client that talks gRPC-web over WebSockets to get the full streaming functionality with error messages that work, etc. I don't think regular gRPC is something we can use as there's no way to run a client in the browser which means it won't work with the WebUI, etc.
In the long term, yes (though gRPC-web, not gRPC) - in the short term producing a client with API compatibility with the existing Core API with that uses WebSockets for parts of the API that have been implemented and with fallback to the older HTTP API for everything else such as the ipfs-client module introduced in this PR will let people start using the improved API immediately. I'm lukewarm on gRPC-web over HTTP only (e.g. no WebSockets) as it has the same problems our existing HTTP API has around streaming and error messages so would prefer WebSockets-only with no HTTP fallback, though in Go the WebSocket proxy is an upgrade for a gRPC-web server so you sort of get the HTTP part for free. Ultimately I think we should EOL the existing HTTP API and port everything to gRPC-web-over-WebSockets. Almost everything, it's a great opportunity to dedupe functionality and remove things that are deprecated but I don't want to derail the improvements made here by trying to decide on the perfect API so I think compatibility with the existing API is a hard requirement initially. During the transition people would use the client-with-HTTP-API-fallback which would let us do the upgrade with the minimum amount of disruption. Then when we are ready the existing HTTP API would then become an optional install for people who have built things using it directly (e.g. through cURL etc, without a client) and are not ready to migrate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its a lot to grasp, gonna need another run through but i left some questions inline.
Can you give a high level summary of the loadServices.js
files please?
The |
Adds a server running a gRPC endpoint over websockets, a client to access the server and a `ipfs-client` module that uses the gRPC client with HTTP fallback. So far only supports `ipfs.addAll` but the idea is to implement all streaming methods over websockets instead of HTTP, to give us bidirectional streaming and errors that work in the browser. Fixes: Depends on: - [ ] ipfs/js-ipfsd-ctl#561
06505be
to
56ddfaa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@achingbrain sorry for taking so long to doing this. Still, I am afraid you’ll be getting a rushed quality feedback from me. I feel bad for providing not as useful feedback as I wished for. My general impressions are:
- Surprised that it is connection per request, I was assuming multiplexing over single connection.
- Having different types of requests 1:1, m:1 , 1:n, m:n without general abstraction for RPC call / request doesn’t seat well with me. Which could just be a difference in how we approach things. That said I do wish there was something like request / response and both requestor and responder could choose to send stream or a single item. E.g. message-port libs take that approach by having
encodeAsyncIterable
which is than encoded into response (it’s trickier here because there I can just create message port and send it, but with more effort should be doable here as well). - In bunch of places (mentioned inline) I struggled to understand what the intention of the code. I think adding code comments telling what function supposed to do would be really helpful for anyone contributing to this in the future.
- There are some
setTimeouts
here and there, which does not feel right. If my guesses are right and they are in place so that writes don’t start before reads I think the channel abstraction used has a problem that is worth addressing there instead of places that make use of it. - This maybe another case where our approaches differ, but reading this code I often found myself distracted from understanding what the function does by the many small details. I think it would really help if some of those little details were factored out into functions. That way reading primary functions tells you what it does (at high level) and all the functions it uses tell you details of how it does it. I think I made couple of suggestions along those lines.
I hope you’ll find some of these feedback useful. Thanks for your patience and for addressing long standing issues. Exciting to see all the new things it will enable us to do in the future.
expect(files[0].cid.codec).to.equal('dag-pb') | ||
expect(files[0].size).to.equal(11) | ||
}) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to have a test that ensures that errors do really make it to the client as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do, it's called should error during add-all stream
and it's skipped for ipfs-http-client
but run for ipfs-core
and ipfs-client
.
@@ -0,0 +1,49 @@ | |||
{ | |||
"name": "ipfs-client", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe ipfs-local-client
or local-ipfs-client
instead ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's nothing local about the client, it can connect to a correctly configured websocket on a remote machine so I think the name is fine.
metadata: options | ||
}) | ||
|
||
setTimeout(() => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add code comment to explain this time out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is so that it starts alter read below I think it would make more sense to extract read in to separate function e.g.
const results = read(sink, options)
sendFiles(stream, soruce, options)...
yeild * results
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I don't think it's necessary here, have removed it.
async function * addAll (stream, options = {}) { | ||
const { | ||
source, | ||
sink |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading from sink and writing to source feels backwards.
opts = opts || {} | ||
|
||
async function mfsWrite (path, content, options = {}) { | ||
const stream = async function * () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to turn this into top level helper function likestream(path, content)
.
} | ||
|
||
for await (const result of ipfs.files.ls(request.path, opts)) { | ||
result.cid = result.cid.toString() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like with add
it would easier to read and understand if message was constructed instead of been mutated in place.
} | ||
} | ||
|
||
// path is sent with content messages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe sending it with metadata would make more sense here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted it to be part of the RPC protobuf so it's properly defined, rather than the metadata which isn't.
|
||
channel.handler = handler | ||
|
||
switch (handler.type) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feels like it would make more sense to decorate handlers at definition site such that they would all implement same API e.g. take channel & metadata and have the logic of message(s) reading / writing encapsulated there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but this is the internal gRPC-web api so it's out of our control.
@@ -0,0 +1,13 @@ | |||
'use strict' | |||
|
|||
module.exports = (string) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this reverses what prama-case does on the client ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, this file isn't used.
* @param {object} object - key/value pairs to turn into HTTP headers | ||
* @returns {Uint8Array} - HTTP headers | ||
**/ | ||
const objectToHeaders = (object) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was another similar function on the client, maybe they should be consolidated and used across both ends ?
Pulls non-grpc changes out of #3403 to ease the continued merging of master into that branch. - Moves withTimeoutOption into core-utils - Moves TimeoutError into core-utils - Adds missing ts project links - Adds more add-all tests to interface suite - Ignores unpassable tests for non-grpc or core implementations - Normalises mode and mtime in normalise-input function - Dedupes mtime normalisation between core and http client
Adds a server running a gRPC endpoint over websockets running on port 5003, a `ipfs-grpc-client` module to access the server and a `ipfs-client` module that uses the gRPC client with HTTP fallback. This is to solve shortcomings and limitations of the existing HTTP API and addresses the concerns raised in the 'Streaming HTTP APIs and errors, y u no work?' session we had at IPFS team week in NYC. ## Key points 1. Enables full duplex communication with a remote node When making an HTTP request in the browser, a [FormData][] object must be created. In order to add all the values to the FormData object, an incoming stream must be consumed in its entirety before the first byte is sent to the server. This means you cannot start processing a response before the request has been sent, so you cannot have full-duplex communication between client and server over HTTP. This seems unlikely to change in the near future. With a websocket transport for gRPC-web, individual messages can be sent backwards and forwards by the client or the server enabling full-duplex communication. This is essential for things like progress events from `ipfs.add` in the short term, and exposing the full stream capabilities of libp2p via remote client in the long term. 2. Enables streaming errors The existing HTTP API sends errors as HTTP trailers. No browser supports HTTP trailers so when a stream encounters an error, from the client's point of view the stream just stops with no possibility of finding out what happened. This can also mask intended behaviour cause users to incorrectly interpret the API. For example if you specify a timeout to a DHT query and that timeout is reached, in the browser the stream ends without an error and you take away the results you've received thinking all is well but on the CLI the same operation results in a non-zero exit code. A websocket transport has no restrictions here, since full-duplex communication is possible, errors can be received at any time. 3. Listens on websockets with no HTTP fallback gRPC-web exists and is a way of exposing a gRPC service over HTTP. Whereas gRPC supports four modes (unary, e.g. one request object and one response object, client streaming, server streaming and bidirectional streaming), gRPC-web only supports [unary and server streaming](https://github.com/grpc/grpc-web#wire-format-mode). This is due to limitations of the web platform mentioned above and doesn't give us anything over our existing HTTP API. The gRPC-web team are evaluating several options for client and bidirectional streaming, all of which require new capabilities to be added to browsers and none of which will be available in a reasonable time frame. Notably they have [no plans to use websockets](https://github.com/grpc/grpc-web/blob/master/doc/streaming-roadmap.md#issues-with-websockets) as a transport, even though it solves the problems we have today. The team from [improbable](https://improbable.io/) maintain a [gRPC-web-websockets bridge](https://github.com/improbable-eng/grpc-web) which the client added by this PR is compatible with. Their bridge also has a go implementation of a [reverse proxy](https://github.com/improbable-eng/grpc-web/tree/master/go/grpcwebproxy) for use with gRPC servers to turn them into gRPC-web servers with an optional websocket transport. My proposal is to embrace the use of websockets to solve our problems right now, then move to whatever streaming primitive the gRPC-web team settle on in the years to come. As implemented there's only websockets here and no HTTP fallback as the existing HTTP API works fine for unary operations so there's little to be gained by blocking this work on reimplementing the whole of the HTTP API in gRPC-web, and the client can pick and choose which API it'll use per-call. By running the websocket server on a different port to the existing HTTP API it gives us room to add gRPC-web fallback for the API if we find that useful. 4. Has protobuf definitions for all requests/responses See the [ipfs-grpc-protocol](https://github.com/ipfs/js-ipfs/tree/feat/add-grpc-server-and-client/packages/ipfs-grpc-protocol) module, which contains definitions for API requests/reponses. They've been ported from the existing API and will need some checking. The [ipfs-grpc-server/README.md](https://github.com/ipfs/js-ipfs/blob/feat/add-grpc-server-and-client/packages/ipfs-grpc-server/README.md) has a rundown of the websocket communication protocol that was ported from [improbable-eng/grpc-web](https://github.com/improbable-eng/grpc-web). 5. Options as metadata When making a request, metadata is sent during the preamble - these take the form of a string identical to HTTP headers as the initial websocket message - I've used this mechanism to send the options for a given invocation. Notably these are not defined as a protocol buffer, just an unspecified list of simple key/value pairs - maybe they should be to ensure compatibility between implementations? This will be trivial in the implementation in the PR as it contains a server implementation too but to do it in go will require patching or forking the improbable gRPC proxy. 6. Errors as metadata Similar to the existing HTTP API, message trailers are used to send errors. Four fields are used to re-construct the error on the client side: | Field | Notes | | ----- | ----- | | grpc-status | 0 for success, 1+ for error | | grpc-message | An error message | | grpc-stack | A stack trace with `\n` delimited lines | | grpc-code | A string code such as `'ERROR_BAD_INPUT'` that may be used for i18n translations to show a message to the user in their own language | Similar to options these fields are unspecified, if a convention is not enough, perhaps they should be specified as a protobuf and the trailer sent as binary? 7. Streams When sending data as part of an `ipfs.add`, we send repeated messages that contain a path, a content buffer and an index. The index is used to differentiate between streams - path cannot be used as it could be empty. Only the first supplied `path` is respected for a given index. On the server separate input streams are created for each file being added. A file stream is considered closed when an unset or empty content buffer is received. Ultimately this will allow us to apply backpressure on a per-file basis and read from different file streams in parallel and asymmetrically based on the available server capacity. 8. Performance Observed performance pegs gRPC-web over websockets as similar to the HTTP Client with pretty much zero optimisation work performed 9. Security Browsers require TLS for all use of websocket connections to localhost. They do not require it for the loopback address, however, which this PR uses, though loopback means the traffic will not leave the local machine. The incoming requests start as HTTP requests so have a referer header and user agent so would follow the same restrictions as the existing HTTP API. Fixes #2519 Fixes #2838 Fixes #2943 Fixes #2854 Fixes #2864 [FormData]: https://developer.mozilla.org/en-US/docs/Web/API/FormData
Adds a server running a gRPC endpoint over websockets running on port 5003, a
ipfs-grpc-client
module to access the server and aipfs-client
module that uses the gRPC client with HTTP fallback.This is to solve shortcomings and limitations of the existing HTTP API and addresses the concerns raised in the 'Streaming HTTP APIs and errors, y u no work?' session we had at IPFS team week in NYC.
Key points
When making an HTTP request in the browser, a FormData object must be created. In order to add all the values to the FormData object, an incoming stream must be consumed in its entirety before the first byte is sent to the server.
This means you cannot start processing a response before the request has been sent, so you cannot have full-duplex communication between client and server over HTTP. This seems unlikely to change in the near future.
With a websocket transport for gRPC-web, individual messages can be sent backwards and forwards by the client or the server enabling full-duplex communication. This is essential for things like progress events from
ipfs.add
in the short term, and exposing the full stream capabilities of libp2p via remote client in the long term.The existing HTTP API sends errors as HTTP trailers. No browser supports HTTP trailers so when a stream encounters an error, from the client's point of view the stream just stops with no possibility of finding out what happened.
This can also mask intended behaviour cause users to incorrectly interpret the API. For example if you specify a timeout to a DHT query and that timeout is reached, in the browser the stream ends without an error and you take away the results you've received thinking all is well but on the CLI the same operation results in a non-zero exit code.
A websocket transport has no restrictions here, since full-duplex communication is possible, errors can be received at any time.
gRPC-web exists and is a way of exposing a gRPC service over HTTP. Whereas gRPC supports four modes (unary, e.g. one request object and one response object, client streaming, server streaming and bidirectional streaming), gRPC-web only supports unary and server streaming. This is due to limitations of the web platform mentioned above and doesn't give us anything over our existing HTTP API.
The gRPC-web team are evaluating several options for client and bidirectional streaming, all of which require new capabilities to be added to browsers and none of which will be available in a reasonable time frame.
Notably they have no plans to use websockets as a transport, even though it solves the problems we have today.
The team from improbable maintain a gRPC-web-websockets bridge which the client added by this PR is compatible with. Their bridge also has a go implementation of a reverse proxy for use with gRPC servers to turn them into gRPC-web servers with an optional websocket transport.
My proposal is to embrace the use of websockets to solve our problems right now, then move to whatever streaming primitive the gRPC-web team settle on in the years to come.
As implemented there's only websockets here and no HTTP fallback as the existing HTTP API works fine for unary operations so there's little to be gained by blocking this work on reimplementing the whole of the HTTP API in gRPC-web, and the client can pick and choose which API it'll use per-call.
By running the websocket server on a different port to the existing HTTP API it gives us room to add gRPC-web fallback for the API if we find that useful.
See the ipfs-grpc-protocol module, which contains definitions for API requests/reponses.
They've been ported from the existing API and will need some checking.
The ipfs-grpc-server/README.md has a rundown of the websocket communication protocol that was ported from improbable-eng/grpc-web.
When making a request, metadata is sent during the preamble - these take the form of a string identical to HTTP headers as the initial websocket message - I've used this mechanism to send the options for a given invocation.
Notably these are not defined as a protocol buffer, just an unspecified list of simple key/value pairs - maybe they should be to ensure compatibility between implementations?
This will be trivial in the implementation in the PR as it contains a server implementation too but to do it in go will require patching or forking the improbable gRPC proxy.
Similar to the existing HTTP API, message trailers are used to send errors. Four fields are used to re-construct the error on the client side:
\n
delimited lines'ERROR_BAD_INPUT'
that may be used for i18n translations to show a message to the user in their own languageSimilar to options these fields are unspecified, if a convention is not enough, perhaps they should be specified as a protobuf and the trailer sent as binary?
When sending data as part of an
ipfs.add
, we send repeated messages that contain a path, a content buffer and an index. The index is used to differentiate between streams - path cannot be used as it could be empty. Only the first suppliedpath
is respected for a given index. On the server separate input streams are created for each file being added. A file stream is considered closed when an unset or empty content buffer is received. Ultimately this will allow us to apply backpressure on a per-file basis and read from different file streams in parallel and asymmetrically based on the available server capacity.Observed performance pegs gRPC-web over websockets as similar to the HTTP Client with pretty much zero optimisation work performed
Browsers require TLS for all use of websocket connections to localhost. They do not require it for the loopback address, however, which this PR uses, though loopback means the traffic will not leave the local machine.
The incoming requests start as HTTP requests so have a referer header and user agent so would follow the same restrictions as the existing HTTP API.
Fixes:
#2519
#2838
#2943
#2854
#2864
Depends on: