-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: unify /upload and /car uploads #480
Conversation
Have the API pack uploaded files into a CAR, to make the two paths more similar and ideally simplify the upload to S3 path. There were two ideas here 1. If uploaded files are packed into a CAR by the API before sending to cluster we then only have to deal wth sending CARs to S3, which would be nicer than having to handle both raw files and CARs. 2. We have a lot of different notions of "how big is this thing"... the size of the uploaded CAR, the cumulative size of uploaded files, the sum of the size of each block in gthe CAR, the size or bytes value returned by cluter, and the size of all the blocks in the dag where we understand the ipld codecs and can follow the links, which could be different to the sum of the size of blocks in a CAR if it has redundent or duplicate blocks. It would be good to simplify that. The dream was - To lean on cluster to get the unixFS CumulativeSize when uploading a CAR with files with a unixFS root - ...and pack raw files uploads into a CAR. The reality is - We can't get the unixFS CumulativeSize out of cluster. It might return the FileSize for a CAR with a single file in, but that will be 0 for Directories. - We can't pack files into a unixFS flavour CAR with ipfs-car in a cloudflare worker today as it fails with a about `importScrtips` not being available in CloudFlare workers. License: (Apache-2.0 AND MIT) Signed-off-by: Oli Evans <[email protected]>
Looks like it might be our webpack config trying to split the service worker bundle into chunks, but as it's for upload to the cloudflare that should just be disabled. Am investigating. |
this fixes the `importScripts` error License: (Apache-2.0 AND MIT) Signed-off-by: Oli Evans <[email protected]>
Nice! Forcing webpack to not split the bundle fixes the importScripts error. |
License: (Apache-2.0 AND MIT) Signed-off-by: Oli Evans <[email protected]>
License: (Apache-2.0 AND MIT) Signed-off-by: Oli Evans <[email protected]>
Of note this pushes the api bundlesize up from 686.35KB to 799.28KB
|
} | ||
|
||
return new JSONResponse({ cid }) | ||
// TODO: do we want to wrap file uploads like we do car uploads from the client? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when receiving a single file, it won't have a path, while directories might have in the name. For directories with several files with no path like name it would likely make sense. I suggest that we should discuss this externally within an issue and keep the current behaviour in this PR context.
License: (Apache-2.0 AND MIT) Signed-off-by: Oli Evans <[email protected]>
License: (Apache-2.0 AND MIT) Signed-off-by: Oli Evans <[email protected]>
packages/api/src/car.js
Outdated
// We can't use the `bytes` from cluster where it's a unixfs root partial, as it'll be a shard of the total dag size. | ||
// We can't use the `size` from cluster as it's the unixfs FileSize which is 0 for directories, and is not set for raw encoded files, only dag-pb ones. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can add this knowledge to the function where we compute the dag size. What do you think?
Reading through the code, it seems strange this placement as it is not clear what it relates to
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, fair point, but the bytes
and size
are properties of the cluster.add
response. I'll shuffling it around and see if I can make it make more sense.
Co-authored-by: Vasco Santos <[email protected]>
License: (Apache-2.0 AND MIT) Signed-off-by: Oli Evans <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not finished review, submitting what I have...
Co-authored-by: Alan Shaw <[email protected]>
// NOTE: Tsize is optional, but all ipfs implementations we know of set it. | ||
// It's metadata, that could be missing or deliberately set to an incorrect value. | ||
// This logic is the same as used by go/js-ipfs to display the cumulative size of a dag-pb dag. | ||
return pbNodeBytes.length + pbNode.Links.reduce((acc, curr) => acc + (curr.Tsize || 0), 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vasco-santos uh oh. i think this should be:
- return pbNodeBytes.length + pbNode.Links.reduce((acc, curr) => acc + (curr.Tsize || 0), 0)
+ return pbNodeBytes.byteLength + pbNode.Links.reduce((acc, curr) => acc + (curr.Tsize || 0), 0)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any suggestions to reliably test this?
License: (Apache-2.0 AND MIT)
Signed-off-by: Oli Evans [email protected]