Inaccurate claimed DAG sizes #1427
Comments
Needs further investigation to understand full priority
I haven't gone too deep yet, but I'll start sharing my initial investigation results (in web3.storage). Data sample, 6484 CIDs:
Looking at the code, possible roots of the problem are:
CBOR dags From a quick look, I suspect PB directories
I haven't yet checked why the two report different sizes (is it UnixFS headers or a bug?), but I'm sure you know, @alanshaw. @alanshaw, can you run a query in prod where you use the DAG size from
Web3.storage findings
TL;DR
Relying on tsize is the issue for the observed CIDs. Not sure there's a whole lot we can do apart from switching to navigating the whole DAG instead of relying on metadata. (I'll put together a different comment for nft.storage, and in particular for CBOR DAGs.)
Investigation details
Query
Results:
Observation
67 of 67 dag-pb
CID bafybeicyifavcxymohtwpsiht7ule5hnb3o7cj6ulirbvar33vffhepupa
Cargo
Delta: 1477 kB
Web3.storage
size: 50
Context
CID bafybeid7dl444brakxtphxnhj5flad37wwntcweijzycd242uy6anqvadq
Cargo
Delta: 107 MB
Web3.storage
size: 163
Sizes
Context
CID bafybeictly5566rw434m4p6gihiqryqghei23vz5v5w4nwfhf3qyonyrwu
Cargo
Delta: 5121 MB
Web3.storage
size: 4224
*I had to cancel this one because it is too big and I was running out of space.
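To make the TL;DR concrete, here is a minimal Python sketch of the two ways of sizing a DAG: trusting the link metadata versus walking every block. It is not the web3.storage or dagcargo code. The `Block` and `Link` types, the in-memory `blocks` map, and the byte counts are all hypothetical and only illustrate how one understated `tsize` on a link makes the claimed size smaller than the real one.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Link:
    cid: str
    tsize: int  # cumulative size of the linked subtree, as claimed by metadata


@dataclass
class Block:
    size: int  # encoded size of this block in bytes
    links: List[Link] = field(default_factory=list)


def claimed_size(blocks: Dict[str, Block], root: str) -> int:
    """Size derived from metadata only: root block plus its links' tsize."""
    block = blocks[root]
    return block.size + sum(link.tsize for link in block.links)


def traversed_size(blocks: Dict[str, Block], root: str) -> int:
    """Size derived by walking the whole DAG, counting each block once."""
    seen = set()
    stack = [root]
    total = 0
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue  # shared sub-DAGs are only counted once
        seen.add(cid)
        block = blocks[cid]
        total += block.size
        stack.extend(link.cid for link in block.links)
    return total


# Hypothetical DAG: the root's link to "dir" carries a wrong (zero) tsize.
blocks = {
    "root": Block(size=50, links=[Link("dir", tsize=0)]),
    "dir": Block(size=100, links=[Link("file", tsize=1000)]),
    "file": Block(size=1000),
}

print(claimed_size(blocks, "root"))    # 50
print(traversed_size(blocks, "root"))  # 1150
```

Counting each block once is a choice made for this sketch. Whether duplicated blocks should count once or per reference depends on what "DAG size" is supposed to mean for billing or storage.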
This could be a bug in the JS UnixFS importer. It would be interesting to see if importing data with problematic size into a go-ipfs node resulted in a DAG with similar problematic
We currently use dagcargo to update DAG size when we don't know how big the DAG is:
We could/should set up a cron job to "fix up" where we do not agree.
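As a rough idea of what such a "fix up" pass could look like, here is a small Python sketch. It assumes each record carries the size we currently claim and the size measured by the other source (for example dagcargo). The `SizeRecord` type, its field names, and the sample values are hypothetical and do not reflect the real schema.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class SizeRecord(NamedTuple):
    cid: str
    claimed: Optional[int]   # size currently stored for the upload, if any
    measured: Optional[int]  # size reported by the other source, if known


def sizes_to_fix(records: Iterable[SizeRecord]) -> Dict[str, int]:
    """Return {cid: measured size} for every record where the two disagree.

    Records without a measured size are skipped, since there is nothing
    trustworthy to correct them with.
    """
    return {
        r.cid: r.measured
        for r in records
        if r.measured is not None and r.claimed != r.measured
    }


# Hypothetical sample rows.
records = [
    SizeRecord("cid-a", claimed=50, measured=1500),   # disagreement: fix
    SizeRecord("cid-b", claimed=4224, measured=4224),  # agreement: skip
    SizeRecord("cid-c", claimed=None, measured=163),   # unknown size: fill in
    SizeRecord("cid-d", claimed=10, measured=None),    # nothing measured: skip
]

print(sizes_to_fix(records))
```

A scheduled job could apply the returned map as updates. Which source to treat as authoritative when they disagree is still the open question in this thread.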
@Gozala does this problem go away with Uploads v2 (if we just use the size on disk in Elastic Provider)?
DAG sizes for some upload types are incorrect (reporting smaller than actual).
Examples:
examples.txt