Commit
chore: editorial fixes
lidel committed Nov 9, 2022
1 parent eea310a commit 8fe745a
Showing 2 changed files with 50 additions and 27 deletions.
@@ -1,4 +1,4 @@
-# IPIP 0000: TAR Response Format on Web Gateways
+# IPIP-288: TAR Response Format on HTTP Gateways

- Start Date: 2022-06-10
- Related Issues:
@@ -48,15 +48,20 @@ directories inside.
However, there are certain behaviors, detailed in the [security section](#security)
that should be handled. To test such behaviors, the following fixtures can be used:

-- [`bafybeibfevfxlvxp5vxobr5oapczpf7resxnleb7tkqmdorc4gl5cdva3y`][inside-dag] is a UnixFS
-DAG that contains a file with a relative path that points inside the root directory.
-Downloading it as a TAR must work.
-- [`bafkreict7qp5aqs52445bk4o7iuymf3davw67tpqqiscglujx3w6r7hwoq`][inside-dag-tar] is an
-example TAR file that corresponds to the aforementioned UnixFS DAG. Its structure can be
-inspected in order to check if new implementations conform to the specification.
-- [`bafybeicaj7kvxpcv4neaqzwhrqqmdstu4dhrwfpknrgebq6nzcecfucvyu`][outside-dag] is a UnixFS
-DAG that contains a file with a relative path that points outside the root directory.
-Downloading it as a TAR must error.
+- [`bafybeibfevfxlvxp5vxobr5oapczpf7resxnleb7tkqmdorc4gl5cdva3y`][inside-dag]
+is a UnixFS DAG that contains a file with a name that looks like a relative
+path that points inside the root directory. Downloading it as a TAR must
+work.
+
+- [`bafkreict7qp5aqs52445bk4o7iuymf3davw67tpqqiscglujx3w6r7hwoq`][inside-dag-tar]
+is an example TAR file that corresponds to the aforementioned UnixFS DAG. Its
+structure can be inspected in order to check if new implementations conform
+to the specification.
+
+- [`bafybeicaj7kvxpcv4neaqzwhrqqmdstu4dhrwfpknrgebq6nzcecfucvyu`][outside-dag]
+is a UnixFS DAG that contains a file with a name that looks like a relative
+path that points outside the root directory. Downloading it as a TAR must
+error.

## Design rationale

@@ -78,13 +83,28 @@ downloading it.
CLI users will be able to download a directory with existing tools like `curl` and `tar` without
having to talk to implementation-specific RPC APIs like `/api/v0/get` from Kubo.

+Fetching a directory from a local gateway will be as simple as:
+
+```console
+$ export DIR_CID=bafybeigccimv3zqm5g4jt363faybagywkvqbrismoquogimy7kvz2sj7sq
+$ curl "http://127.0.0.1:8080/ipfs/$DIR_CID?format=tar" | tar xv
+bafybeigccimv3zqm5g4jt363faybagywkvqbrismoquogimy7kvz2sj7sq
+bafybeigccimv3zqm5g4jt363faybagywkvqbrismoquogimy7kvz2sj7sq/1 - Barrel - Part 1 - alt.txt
+bafybeigccimv3zqm5g4jt363faybagywkvqbrismoquogimy7kvz2sj7sq/1 - Barrel - Part 1 - transcript.txt
+bafybeigccimv3zqm5g4jt363faybagywkvqbrismoquogimy7kvz2sj7sq/1 - Barrel - Part 1.png
+```
+
### Compatibility

This IPIP is backwards compatible: adds a new opt-in response type, does not
modify preexisting behaviors.

+Existing content type `application/x-tar` is used when request is made with an `Accept` header.
+
### Security

+Third-party UnixFS file names may include unexpected values, such as `../`.
+
Manually created UnixFS DAGs can be turned into malicious TAR files. For example,
if a UnixFS directory contains a file that points at a relative path outside
its root, the unpacking of the TAR file may overwrite local files outside the expected
@@ -102,18 +122,20 @@ suggested to use a CAR file if they want to download the raw files.

### Alternatives

-One discussed alternative would be to support uncompressed ZIP files. However, TAR and
-TAR-related libraries are already supported and implemented for UnixFS files. Therefore,
-the addition of a TAR response format is facilitated, while introduction of ZIP would increase
-implementation complexity.
+One discussed alternative would be to support uncompressed ZIP files. However,
+TAR and TAR-related libraries are already supported by some IPFS
+implementations, and are easier to work with in CLI. TAR provides simpler
+abstraction, and layering compression on top of TAR stream allows for greater
+flexibility than alternative options that come with own, opinionated approaches
+to compression.

-In addition, we considered supporting [Gzipped TAR](https://github.com/ipfs/go-ipfs/pull/9034).
-However, there it may be a vector for DOS attacks since compression requires high CPU power.
+In addition, we considered supporting [Gzipped TAR](https://github.com/ipfs/go-ipfs/pull/9034) out of the box,
+but decided against it as gzip or alternative compression may be introduced on the HTTP transport layer.

### Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).

-[inside-dag]: https://dweb.link/ipfs/bafybeibfevfxlvxp5vxobr5oapczpf7resxnleb7tkqmdorc4gl5cdva3y
-[inside-dag-tar]: https://dweb.link/ipfs/bafkreict7qp5aqs52445bk4o7iuymf3davw67tpqqiscglujx3w6r7hwoq
-[outside-dag]: https://dweb.link/ipfs/bafybeicaj7kvxpcv4neaqzwhrqqmdstu4dhrwfpknrgebq6nzcecfucvyu
+[inside-dag]: https://dweb.link/ipfs/bafybeibfevfxlvxp5vxobr5oapczpf7resxnleb7tkqmdorc4gl5cdva3y?format=car
+[inside-dag-tar]: https://dweb.link/ipfs/bafkreict7qp5aqs52445bk4o7iuymf3davw67tpqqiscglujx3w6r7hwoq?format=car
+[outside-dag]: https://dweb.link/ipfs/bafybeicaj7kvxpcv4neaqzwhrqqmdstu4dhrwfpknrgebq6nzcecfucvyu?format=car
17 changes: 9 additions & 8 deletions http-gateways/PATH_GATEWAY.md
@@ -1,6 +1,6 @@
# Path Gateway Specification

-![Status: Work In Progress](https://img.shields.io/badge/status-wip-orange.svg?style=flat-square)
+![reliable](https://img.shields.io/badge/status-reliable-green.svg?style=flat-square)

**Authors**:

@@ -181,7 +181,7 @@ For example:

- [application/vnd.ipld.raw](https://www.iana.org/assignments/media-types/application/vnd.ipld.raw) – disables [IPLD codec deserialization](https://ipld.io/docs/codecs/), requests a verifiable raw [block](https://docs.ipfs.io/concepts/glossary/#block) to be returned
- [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) – disables [IPLD codec deserialization](https://ipld.io/docs/codecs/), requests a verifiable [CAR](https://docs.ipfs.io/concepts/glossary/#car) stream to be returned
-- [application/x-tar](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types) – returns UnixFS file or a directory as a [TAR](https://en.wikipedia.org/wiki/Tar_(computing)) stream. At the root of the TAR archive, a file or directory, with the CID of the content, is present. Produces 400 Bad Request for content that is not UnixFS.
+- [application/x-tar](https://en.wikipedia.org/wiki/Tar_(computing)) – returns UnixFS tree (files and directories) as a [TAR](https://en.wikipedia.org/wiki/Tar_(computing)) stream. Returned tree starts at a root item which name is the same as the requested CID. Produces 400 Bad Request for content that is not UnixFS.
<!-- TODO: https://github.com/ipfs/go-ipfs/issues/8823
- application/vnd.ipld.dag-json OR application/json – requests IPLD Data Model representation serialized into [DAG-JSON format](https://ipld.io/docs/codecs/known/dag-json/)
- application/vnd.ipld.dag-cbor OR application/cbor - requests IPLD Data Model representation serialized into [DAG-CBOR format](https://ipld.io/docs/codecs/known/dag-cbor/)
@@ -366,10 +366,11 @@ and CDNs, implementations should base it on both CID and response type:
- Example: `Etag: "DirIndex-2B423AF_CID-bafy…foo"`

- When a gateway can’t guarantee byte-for-byte identical responses, a “weak”
-etag should be used. For example, if CAR is streamed, and blocks arrive in
-non-deterministic order, the response should have `Etag: W/"bafy…foo.car"`.
-If TAR is generated by traversing an UnixFS directory in non-deterministic
-order, the response should have `Etag: W/"bafy…foo.tar"`.
+etag should be used.
+- Example: If CAR is streamed, and blocks arrive in non-deterministic order,
+the response should have `Etag: W/"bafy…foo.car"`.
+- Example: If TAR stream is generated by traversing an UnixFS directory in non-deterministic
+order, the response should have `Etag: W/"bafy…foo.x-tar"`.

- When responding to [`Range`](#range-request-header) request, a strong `Etag`
should be based on requested range in addition to CID and response format:
@@ -594,9 +595,9 @@ Data sent with HTTP response depends on the type of requested IPFS resource:
- Raw block
- Opaque bytes, see [application/vnd.ipld.raw](https://www.iana.org/assignments/media-types/application/vnd.ipld.raw)
- CAR
-- CAR file or stream, see [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car)
+- Arbitrary DAG as a verifiable CAR file or a stream, see [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car)
- TAR
-- TAR file or stream, see [application/x-tar](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types)
+- Deserialized UnixFS files and directories as a TAR file or a stream, see [application/x-tar](https://en.wikipedia.org/wiki/Tar_(computing))
<!-- TODO: https://github.com/ipfs/go-ipfs/issues/8823
- dag-json / dag-cbor
- See [https://github.com/ipfs/go-ipfs/issues/8823](https://github.com/ipfs/go-ipfs/issues/8823)
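
Note on the Security text touched by this commit: the concern it describes (UnixFS file names such as `../` that resolve outside the root once unpacked) can be illustrated with a small client-side check. The following is a minimal sketch in Python, not the gateway's implementation. The helper names `escapes_root` and `unsafe_entries` are hypothetical, and the sketch assumes TAR entry names use POSIX-style `/` separators.

```python
import io
import posixpath
import tarfile


def escapes_root(name: str) -> bool:
    """Return True if a TAR entry name would resolve outside the extraction root."""
    # Absolute paths never stay inside the directory being unpacked into.
    if name.startswith("/"):
        return True
    # Collapse "a/../b" style segments, then look for a leading "..".
    normalized = posixpath.normpath(name)
    return normalized == ".." or normalized.startswith("../")


def unsafe_entries(tar_bytes: bytes) -> list[str]:
    """List entry names in an uncompressed TAR archive that escape the root."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as archive:
        return [m.name for m in archive.getmembers() if escapes_root(m.name)]
```

A client could run `unsafe_entries` on a downloaded archive and refuse to unpack it when the list is non-empty. This mirrors the fixtures described in the IPIP: a name that looks like a relative path but stays inside the root is acceptable, while one that points outside the root must produce an error.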