[DataCap Application] NFT.Storage #49

Closed
galen-mcandrew opened this issue Sep 30, 2021 · 27 comments

Comments

@galen-mcandrew
Collaborator

Large Dataset Notary Application

Core Information

  • Organization Name: nft.storage, web3.storage - Protocol Labs
  • Website / Social Media: nft.storage, web3.storage
  • Total amount of DataCap being requested (between 500 TiB and 5 PiB): 5PiB
  • Weekly allocation of DataCap requested (usually between 1-100TiB): 5TiB
  • On-chain address for first allocation: f3watks6wyq5sakerofowyv7q4gwx4z6ukmfe2irh5zitspj6gpceudt3rrbvm5psofkcgoabev3s2mwtkcunq

Please respond to the questions below in paragraph form, replacing the text saying "Please answer here". Include as much detail as you can in your answer!

Project details

Share a brief history of your project and organization.

Protocol Labs is an open-source R&D lab. We build protocols, tools, and services to radically improve the internet. Our products serve thousands of organizations and millions of people. Protocols we’ve built to date include IPFS, Filecoin, libp2p, drand, IPLD and more. 

Our organization was founded back in 2014 - you can find more details about some of the important milestones along our history on our [website](https://protocol.ai/about/).  

What is the primary source of funding for this project?

We are developing a backend that will be used to serve two separate projects. 

The first project (nft.storage) is being funded by Protocol Labs to support the storage of what we view as a public good. Aside from the investment we're making to operationalize the Filecoin deals, we're also managing our own pin cluster to keep all NFTs persisted on our side - and we are working with Pinata to store additional copies as well.

The second project (web3.storage) enables early users of Filecoin to develop applications with a seamless user experience. The structure borrows heavily from NFT.Storage - users can sign up via a website, download a client library, and get rolling using the service. The purpose of this project is to enable non-NFT use cases and to provide an early sandbox for developers to get started with testing (offering a capped limit of free storage, with a verification process enabling higher limits). Protocol Labs intends to bear the IPFS costs required to deliver a strong user experience, while using Filecoin for persistence and storage. Over time, we'll rely more heavily on Filecoin for retrievals vs. IPFS (as we fully decentralize the backend).
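
As an illustrative sketch of that sign-up-and-upload flow (the package name and method names below are assumptions about the client library, not a definitive reference):

```ts
// Hedged sketch: assumes the web3.storage JS client exposes a constructor that takes an
// API token and a put() method that uploads files, pins them to IPFS, and queues Filecoin
// deals. Exact names and behavior are assumptions, not production code.
import { Web3Storage, File } from 'web3.storage';

async function storeExample(token: string): Promise<string> {
  const client = new Web3Storage({ token });
  const files = [new File(['hello filecoin'], 'hello.txt')];
  const rootCid = await client.put(files); // returns the root CID of the uploaded content
  return rootCid;
}
```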

What other projects/ecosystem stakeholders is this project associated with?

The NFT.storage project is being driven by Protocol Labs, but there are a number of large NFT platforms that intend to integrate this service as a component of their storage strategy (OpenSea, Rarible, Async Art, Palm) as well as platforms that intend to offer this as a free add-on service for their customers (Pinata, Fleek, Infura). 

The web3.storage project is also driven by Protocol Labs and aims to be one of the early starting points for developers trying to build in the IPFS/Filecoin ecosystem. We believe hackathon participants (e.g. HackFS) will benefit the most early on.

Use-case details

Describe the data being stored onto Filecoin

For NFT.Storage
We are offering free storage of all NFTs on IPFS and Filecoin through https://nft.storage/. This can include various tokenized media that should be preserved long term on Filecoin and IPFS to ensure owners of NFTs can continually access the media they own.

In the future, we might expand this to serve use cases broader than NFTs, but any changes to this plan will be added to this proposal for approval.

For Web3.Storage
We intend to offer free storage to any user with a GitHub or valid email signup. To start, we'll be offering a capped amount of free storage for all accounts - but will enable users to request additional storage capacity (for free) by contacting us (so we can vet for abuse). We intend to mirror the same Terms of Service requirements for stored content as IPFS (and NFT.Storage).

Where was the data in this dataset sourced from?


For NFT.Storage:
These NFTs are being sourced via three methods: 

1. Chain scraping (using theGraph to build an index of minted NFTs on the most popular blockchains)
2. Adoption of the NFT.storage library (to let developers directly integrate with our service)
3. Backend support with infrastructure providers (to enable pinning services to directly ship us content from their customers)

From these three methods, we persist the relevant CIDs in IPFS and add them to a deal queue, which we package into 32GiB files with a self-describing index.
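
To illustrate the batching step (the helper names, types, and threshold handling below are assumptions, not the production pipeline):

```ts
// Hypothetical sketch of the deal-queue batching described above: drain queued CIDs into
// ~32GiB aggregates, each of which would be shipped with its self-describing index.
interface QueuedCid {
  cid: string;  // root CID already persisted in IPFS
  size: number; // size of the content in bytes
}

const TARGET_AGGREGATE_SIZE = 32 * 1024 ** 3; // 32GiB, matching the sector-sized files

function drainQueue(queue: QueuedCid[]): QueuedCid[][] {
  const aggregates: QueuedCid[][] = [];
  let current: QueuedCid[] = [];
  let currentSize = 0;

  for (const entry of queue) {
    if (current.length > 0 && currentSize + entry.size > TARGET_AGGREGATE_SIZE) {
      aggregates.push(current); // this batch is full; package it for a deal
      current = [];
      currentSize = 0;
    }
    current.push(entry);
    currentSize += entry.size;
  }
  if (current.length > 0) aggregates.push(current);
  return aggregates;
}
```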

For web3.storage: 
These are files and requests we are pulling from early hackathon participants, and other early adopters of IPFS and Filecoin. We imagine this will likely end up being use cases that lean towards larger data volumes (as we are architecting our processes to make storing sizable chunks of data seamless) - but believe this service will be useful for any dApp developer trying to leverage decentralized storage for their use case. Equally, we believe this will become a friendly pathway for web2 developers to dip their toes into the IPFS/Filecoin ecosystems. 

Can you share a sample of what is in the dataset? A link to a file, an image, a table, etc., are good examples of this.

For NFT.Storage
This platform serves all different kinds of media, including images, files, and videos that have been tokenized as NFTs. These are typically created by an artist, minted as NFTs, and then can be transacted. When uploaded to Filecoin, they are still owned by the individual that last purchased the token, but are accessible to anyone. 

Here are some sample CIDs of NFTs that have been stored via nft.storage: 
- https://ipfs.io/ipfs/bafybeid5jpdqzlb4tqsd6peoa7qstoxat3ovsg62wutyp4gnzqbqsggfsq
- https://ipfs.io/ipfs/bafybeihity6bx24npzvvkzopjbat25ekefjwmnshe7rvldy72dxngzf644
- https://ipfs.io/ipfs/bafybeicvcevx3ktiqjsfwnjguu4lnzejlhgb35brayuod5xdtn7demfdhe

For Web3.Storage - we do not have any examples yet.

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

For NFT.Storage
All of this data is public - NFT.storage will support a “status” API that will give both pin status (to show if the content is in IPFS) as well as the relevant deal information - such that anyone can retrieve the relevant data from a miner and store their own local copy. 
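
A minimal sketch of how a consumer might call such a status API (the endpoint path and response fields below are assumptions, not the published contract):

```ts
// Hypothetical status lookup; the URL and response shape are illustrative assumptions.
async function checkStatus(cid: string): Promise<void> {
  const res = await fetch(`https://api.nft.storage/check/${cid}`);
  if (!res.ok) throw new Error(`status lookup failed: ${res.status}`);
  const { value } = await res.json();
  console.log('pin status:', value.pin?.status); // e.g. "pinned" once the content is in IPFS
  console.log('deals:', value.deals);            // Filecoin deal info: miner IDs, deal state, etc.
}

checkStatus('bafybeid5jpdqzlb4tqsd6peoa7qstoxat3ovsg62wutyp4gnzqbqsggfsq').catch(console.error);
```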

For Web3.Storage
We expect there may be a mix here - some developers will be building with public data, but it's possible some users may want (for their own reasons) to encrypt their data. Given that all of this content is going to be pinned on IPFS, the content will be retrievable. There will be public APIs that allow anyone to query a specific CID to get back the status of the content being pinned, as well as the miners who have been contracted to store those deals. For users who request larger allocations of storage capacity, we have the ability to vet teams for abuse.

I think this is an area where we're open to feedback - our aim is to give early developers as much flexibility to use these protocols as they see fit, and given that the miner selection process is decoupled from the client storage (i.e. clients cannot directly choose which miner to store with), typical economic concerns of self-dealing are mitigated.


What is the expected retrieval frequency for this data?

To start, we intend to persist all of this data in IPFS - so we expect the retrieval frequency out of Filecoin to be minimal. Over time, our intent is to lean more heavily on fast retrieval out of Filecoin and cache content intelligently in IPFS.

For NFT.Storage specifically: 
Part of our intended goal is to also enable teams who are not directly storing content via NFT.storage to request data out of this service, or to configure it as a fallback in the event an NFT's URL does not resolve.
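
A minimal sketch of that fallback idea, assuming a platform first tries the NFT's original URL and then resolves the same content by CID through a public IPFS gateway (the flow and gateway host are illustrative, not prescribed):

```ts
// Hypothetical fallback fetch: original URL first, then the content addressed by CID.
async function fetchWithFallback(originalUrl: string, cid: string): Promise<Response> {
  try {
    const res = await fetch(originalUrl);
    if (res.ok) return res;
  } catch {
    // original host unreachable; fall through to the gateway
  }
  return fetch(`https://ipfs.io/ipfs/${cid}`);
}
```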

For how long do you plan to keep this dataset stored on Filecoin? Will this be a permanent archival or a one-time storage deal?

We intend to store this content permanently on Filecoin - and will be actively monitoring to ensure that the requisite number of copies remain online and that deals are appropriately renewed. 

DataCap allocation plan

In which geographies do you plan on making storage deals?

We intend to use our allocation to make deals with MinerX fellows from around the world. These deals will be split across ~30 miners - in regions spanning the globe (US, Europe, parts of Asia).

What is your expected data onboarding rate? How many deals can you make in a day, in a week? How much DataCap do you plan on using per day, per week?

For NFT.Storage
In the first few months, we will be running our chain scraping process - which will require creating a backfill for all NFTs that have been minted to date. As that process executes, we imagine the rate of storage will be significantly higher (perhaps on the order of terabytes a day). We do not have concrete estimates of the total number of NFTs that we will be storing, but our estimate is 600TiB for Ethereum looking back historically. We expect to have a more refined forward-looking estimate of the rate of increase as we see steady-state adoption in the coming weeks.

Our current estimate is that we'll require 5TiB per week of DataCap to store newly minted NFTs from Ethereum with replication. From the 5PiB requested above, we expect 1PiB to be dedicated to supporting NFT.Storage.

For Web3.Storage
We expect we're going to see a rapid number of sign-ups during hackathons and other pushes where there might be a large number of new developers - but we expect these users to have low average usage, assuming the average lifetime storage amount is ~5GiB total.

Separately, our estimate is that there will likely be a handful of power users (e.g. teams like Voodfy; we've ballparked this at 10% of our user base) who will generate large volumes of data - on the order of tens of GiB a week.

A rough estimate would be: (8k total users by EOY * 0.9 * 5GiB) + (8k total users by EOY * 0.1 * 50GiB * 4 weeks * 6 months) = total DataCap used this year = (roughly) 1PiB

For a minimum of 5x replication, we'd need 5PiB.
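
A back-of-the-envelope check of that arithmetic (the user counts and per-user sizes are the assumptions stated above, not measured figures):

```ts
// Rough check of the estimate above; all inputs are stated assumptions.
const GiB = 1;
const TiB = 1024 * GiB;
const PiB = 1024 * TiB;

const totalUsers = 8_000;                               // total users by EOY (assumption)
const lightUsage = totalUsers * 0.9 * 5 * GiB;          // 90% of users storing ~5GiB each
const powerUsage = totalUsers * 0.1 * 50 * GiB * 4 * 6; // 10% storing ~50GiB/week for 6 months

const totalGiB = lightUsage + powerUsage;
console.log((totalGiB / PiB).toFixed(2), 'PiB before replication');        // ~0.95 PiB
console.log(((totalGiB * 5) / PiB).toFixed(2), 'PiB with 5x replication'); // ~4.75 PiB
```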

These are aggressive goals (and walking through the math above likely overestimates), but given the allocation mechanism it would be ideal to have this amount pre-approved - and to have notaries rate-limit accordingly.

In total, the request is:
1PiB for NFT.Storage
5PiB for Web3.Storage
-> 6PiB total -> requesting 5PiB, as that is the maximum.

How will you be distributing your data to miners? Is there an offline data transfer process?

We’ll be doing primarily online data transfers with miners - though we’ll be running a batching script to only trigger deals once we have full sectors. Miners are being sourced from the MinerX program - and will only receive deals if they are found to be in good standing with our client. Deals will be allocated in a round robin strategy to ensure equal distribution - with a primary focus on enabling geographic diversity per deal (redundant storage across locations). 
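
A minimal sketch of the round-robin allocation described above (the miner IDs and replication factor are illustrative placeholders, not the actual MinerX roster or broker logic):

```ts
// Hypothetical round-robin distributor over a vetted miner list.
const miners = ['f01001', 'f01002', 'f01003', 'f01004', 'f01005']; // placeholder IDs, in good standing
let cursor = 0;

function nextMiners(replication: number): string[] {
  // Pick the next `replication` distinct miners, wrapping around the list, so each
  // full-sector (32GiB) batch lands on geographically diverse providers.
  const chosen: string[] = [];
  for (let i = 0; i < Math.min(replication, miners.length); i++) {
    chosen.push(miners[(cursor + i) % miners.length]);
  }
  cursor = (cursor + replication) % miners.length;
  return chosen;
}

console.log(nextMiners(5)); // e.g. one 32GiB aggregate proposed to 5 distinct miners
```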

Note: our broker backend is being used for a number of other services (including Textile, Estuary, and Discover).

How do you plan on choosing the miners with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

As a component of our storage strategy, we'll be requesting fast/free retrieval, which will be required for miners to continue receiving deals from our client.

How will you be distributing data and DataCap across miners storing data?

We will be rotating through the MinerX fellows to ensure roughly equal distribution to Miners who offer high quality services to our client. We intend to fully allocate DataCap per 32GiB deal we strike with a Miner. 
@large-datacap-requests

Thanks for your request!
Everything looks good. 👌

A Governance Team member will review the information provided and contact you back pretty soon.

@galen-mcandrew
Collaborator Author

Multisig Notary requested

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

5TiB

@large-datacap-requests

Multisig created and sent to RKH f01322584

@galen-mcandrew
Collaborator Author

@ribasushi & @jnthnvctr Here is the new large dataset application issue, per the new LDN process.

@large-datacap-requests

DataCap Allocation requested

Multisig Notary address

f01322584

Client address

f3watks6wyq5sakerofowyv7q4gwx4z6ukmfe2irh5zitspj6gpceudt3rrbvm5psofkcgoabev3s2mwtkcunq

DataCap allocation requested

2.5TiB


Reiers commented Oct 19, 2021

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacecyrpzbwlojkh4qkh6m77dqoa4mmpzyvl6sc2huk4amwslabkstlk

Address

f3watks6wyq5sakerofowyv7q4gwx4z6ukmfe2irh5zitspj6gpceudt3rrbvm5psofkcgoabev3s2mwtkcunq

Datacap Allocated

2.5TiB

Signer Address

f1oz43ckvmtxmmsfzqm6bpnemqlavz4ifyl524chq

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecyrpzbwlojkh4qkh6m77dqoa4mmpzyvl6sc2huk4amwslabkstlk


Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzaceda4q3ifrhuv6apchqql3vjw5wtut3t2ls5hn4mfs3hc4k5kwntoq

Address

f3watks6wyq5sakerofowyv7q4gwx4z6ukmfe2irh5zitspj6gpceudt3rrbvm5psofkcgoabev3s2mwtkcunq

Datacap Allocated

2.5TiB

Signer Address

f1krmypm4uoxxf3g7okrwtrahlmpcph3y7rbqqgfa

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceda4q3ifrhuv6apchqql3vjw5wtut3t2ls5hn4mfs3hc4k5kwntoq

@dchoi27

dchoi27 commented Dec 14, 2022

Content policy of web3.storage:

web3.storage is a developer platform designed to be the easiest way to host data for availability over the public IPFS network and storage on Filecoin. All data stored on web3.storage is stored in Filecoin deals today through an automated pipeline. Data stored on web3.storage is subject to web3.storage's Terms of Service, and the web3.storage user is liable for the data they store.

The web3.storage team does not actively have visibility into the data users are storing. However, users are incentivized to store useful data through the product because they pay for storing data beyond a 5GiB free tier. (For instance, NFT.Storage is now a user of web3.storage, and it is a good indicative example of the type of content to expect via web3.storage.)

@ribasushi

( note for anyone following this issue: the above is the actual content policy, but was misposted to this issue by mistake )

@RobQuistNL

checker:manualTrigger

@Sunnyiscoming
Collaborator

Are there any problems with using datacap?

@ribasushi

> Are there any problems with using datacap?

The software platform is not fully live yet; launch is imminent. Datacap will start flowing then.

@Sunnyiscoming
Collaborator

Ok.

@large-datacap-requests

DataCap Allocation requested

Request number 2

Multisig Notary address

f01322584

Client address

f3watks6wyq5sakerofowyv7q4gwx4z6ukmfe2irh5zitspj6gpceudt3rrbvm5psofkcgoabev3s2mwtkcunq

DataCap allocation requested

5TiB

Id

8552e23d-ec7a-4840-9533-a44bf7c43331

@large-datacap-requests

Stats & Info for DataCap Allocation

Multisig Notary address

f01322584

Client address

f3watks6wyq5sakerofowyv7q4gwx4z6ukmfe2irh5zitspj6gpceudt3rrbvm5psofkcgoabev3s2mwtkcunq

Last two approvers

cryptowhizzard & Reiers

Rule to calculate the allocation request amount

100% of weekly dc amount requested

DataCap allocation requested

5TiB

Total DataCap granted for client so far

2.5TiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

4.99PiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 0 | 0 | 2.5TiB | NaN | 576GiB |

@filplus-checker-app

DataCap and CID Checker Report Summary1

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

Deal Data Shared with other Clients2

⚠️ CID sharing has been observed. (Top 3)

Full report

Click here to view the full report.

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

  2. To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

@filplus-checker-app

DataCap and CID Checker Report Summary1

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

Deal Data Shared with other Clients2

⚠️ CID sharing has been observed. (Top 3)

Full report

Click here to view the full report.

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

  2. To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

@filplus-checker-app

DataCap and CID Checker Report Summary1

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

Deal Data Shared with other Clients2

⚠️ CID sharing has been observed. (Top 3)

Full report

Click here to view the full report.

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

  2. To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...


Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceaduoznsak3zikpfxwimvejwhejocycvypzeugvmvml7oi2kmsxae

Address

f3watks6wyq5sakerofowyv7q4gwx4z6ukmfe2irh5zitspj6gpceudt3rrbvm5psofkcgoabev3s2mwtkcunq

Datacap Allocated

5.00TiB

Signer Address

f1krmypm4uoxxf3g7okrwtrahlmpcph3y7rbqqgfa

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceaduoznsak3zikpfxwimvejwhejocycvypzeugvmvml7oi2kmsxae

@github-actions

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

@github-actions github-actions bot added the Stale label Jul 21, 2023
@dchoi27

dchoi27 commented Jul 24, 2023

Please keep this open - we are moving to separate the FIL wallets for NFT.Storage and web3.storage, so this application will likely become the one for NFT.Storage (and #1838 the one for web3.storage).

@dchoi27

dchoi27 commented Jul 25, 2023

Actually, we can just close this one and our team will open a new one to get this off of Galen's plate.

@github-actions github-actions bot removed the Stale label Jul 25, 2023
@dchoi27

dchoi27 commented Jul 25, 2023

done here - #2110
this can be closed

@Chris00618

checker:manualTrigger

@filplus-checker-app

DataCap and CID Checker Report Summary1

Retrieval Statistics

⚠️ All retrieval success ratios are below 1%.

  • Overall Graphsync retrieval success rate:
  • Overall HTTP retrieval success rate: 0.00%
  • Overall Bitswap retrieval success rate:

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 99.38% of deals are for data replicated across less than 3 storage providers.

Deal Data Shared with other Clients2

⚠️ CID sharing has been observed. (Top 3)

Full report

Click here to view the CID Checker report.
Click here to view the Retrieval Dashboard.
Click here to view the Retrieval report.

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

  2. To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

@github-actions

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

@github-actions github-actions bot added the Stale label Aug 14, 2023
@github-actions

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!

@github-actions github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Aug 18, 2023