This repository has been archived by the owner on Jul 18, 2024. It is now read-only.

[DataCap Application] [DEPRECATED] NFT.Storage #12

Closed
jnthnvctr opened this issue Jun 14, 2021 · 61 comments

Comments

@jnthnvctr

jnthnvctr commented Jun 14, 2021

Large Dataset Notary Application

Core Information

  • Organization Name: [DEPRECATED] nft.storage, web3.storage - Protocol Labs
  • Website / Social Media: nft.storage, web3.storage
  • Total amount of DataCap being requested (between 500 TiB and 5 PiB): 5PiB
  • On-chain address for first allocation: f3watks6wyq5sakerofowyv7q4gwx4z6ukmfe2irh5zitspj6gpceudt3rrbvm5psofkcgoabev3s2mwtkcunq

Please respond to the questions below in paragraph form, replacing the text saying "Please answer here". Include as much detail as you can in your answer!

Project details

Share a brief history of your project and organization.

Protocol Labs is an open-source R&D lab. We build protocols, tools, and services to radically improve the internet. Our products serve thousands of organizations and millions of people. Protocols we’ve built to date include IPFS, Filecoin, libp2p, drand, IPLD and more. 

Our organization was founded back in 2014 - you can find more details about some of the important milestones along our history on our [website](https://protocol.ai/about/).  

What is the primary source of funding for this project?

We are developing a backend that will be used to serve two separate projects. 

The first project (nft.storage) is being funded by Protocol Labs to support the storage of what we view as a public good. Aside from the investment we’re making to operationalize the Filecoin deals, we’re also managing our own pin cluster to keep all NFTs persisted on our side - and working with Pinata to store additional copies as well. 

The second project (web3.storage) is a project to enable early users of Filecoin to have a seamless user experience to develop early applications. The structure is heavily borrowed from NFT.Storage - where users can sign up via a website, download a client library, and get rolling using the service. The purpose of this project is to enable non-NFT use cases, and provide an early sandbox for developers to get started with testing (offering a capped limit of free storage, and with a verification process enabling higher limits). Protocol Labs intends to bear the IPFS costs required to deliver a strong user experience, while using Filecoin for persistence and storage. Over time, we'll be relying more heavily on Filecoin for retrievals vs. relying on IPFS (as we fully decentralize out the backend). 

What other projects/ecosystem stakeholders is this project associated with?

The NFT.storage project is being driven by Protocol Labs, but there are a number of large NFT platforms that intend to integrate this service as a component of their storage strategy (OpenSea, Rarible, Async Art, Palm) as well as platforms that intend to offer this as a free add-on service for their customers (Pinata, Fleek, Infura). 

The web3.storage project is also driven by Protocol Labs and aims to be one of the early starting points for developers trying to build in the IPFS/Filecoin ecosystem. We believe hackathon participants (e.g. HackFS) will benefit the most early on. 

Use-case details

Describe the data being stored onto Filecoin

For NFT.Storage
We are offering free storage of all NFTs on IPFS and Filecoin through https://nft.storage/. This can include various tokenized media that should be preserved long term on Filecoin and IPFS to ensure owners of NFTs can continually access the media they own. 

In the future, we might expand this to serve use cases broader than NFTs, but any changes to this plan will be added to this proposal for approval.

For Web3.Storage
We intend to offer free storage to any user with a Github or valid email signup. To start, we'll be offering a capped amount of free storage for all accounts - but enable users to request additional storage capacity (for free) by contacting us (so we can vet for abuse). We intend to mirror the same Terms of Service requirements for stored content as IPFS (and NFT.Storage). 

Where was the data in this dataset sourced from?


For NFT.Storage:
These NFTs are being sourced via three methods: 

1. Chain scraping (using theGraph to build an index of minted NFTs on the most popular blockchains)
2. Adoption of the NFT.storage library (to let developers directly integrate with our service)
3. Backend support with infrastructure providers (to enable pinning services to directly ship us content from their customers)

From these three methods, we both persist relevant CIDs in IPFS, and add the CIDs into a deal queue - which we package into 32GiB files with a self-describing index.
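
As a rough illustration of the packaging step above (not the actual broker code), the sketch below greedily groups queued CIDs into batches that fit within 32GiB. The function name `batch_for_deals`, the `(cid, size_bytes)` input shape, and the simple sequential packing are all assumptions made for this example; the self-describing index is not modeled.

```python
from typing import Iterable, List, Tuple

GIB = 1024 ** 3
SECTOR_BYTES = 32 * GIB  # aggregate size taken from the text above


def batch_for_deals(
    items: Iterable[Tuple[str, int]], capacity: int = SECTOR_BYTES
) -> List[List[str]]:
    """Group (cid, size_bytes) pairs into batches whose total size fits in capacity.

    Hypothetical helper: items are packed in arrival order, and a new batch
    is started whenever the next item would overflow the current one.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for cid, size in items:
        if size > capacity:
            raise ValueError(f"{cid} is larger than a single batch")
        if used + size > capacity:
            batches.append(current)
            current, used = [], 0
        current.append(cid)
        used += size
    if current:
        batches.append(current)
    return batches


# Placeholder CIDs and sizes, for illustration only.
queue = [("cid-a", 20 * GIB), ("cid-b", 10 * GIB), ("cid-c", 5 * GIB)]
print(batch_for_deals(queue))
```

Sequential packing keeps the example short; a production packer would likely try harder to fill each batch.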

For web3.storage: 
These are files and requests we are pulling from early hackathon participants, and other early adopters of IPFS and Filecoin. We imagine this will likely end up being use cases that lean towards larger data volumes (as we are architecting our processes to make storing sizable chunks of data seamless) - but believe this service will be useful for any dApp developer trying to leverage decentralized storage for their use case. Equally, we believe this will become a friendly pathway for web2 developers to dip their toes into the IPFS/Filecoin ecosystems. 

Can you share a sample of what is in the dataset? A link to a file, an image, a table, etc., are good examples of this.

For NFT.Storage
This platform serves all different kinds of media, including images, files, and videos that have been tokenized as NFTs. These are typically created by an artist, minted as NFTs, and then can be transacted. When uploaded to Filecoin, they are still owned by the individual that last purchased the token, but are accessible to anyone. 

Here are some sample CIDs of NFTs that have been stored via nft.storage: 
- https://ipfs.io/ipfs/bafybeid5jpdqzlb4tqsd6peoa7qstoxat3ovsg62wutyp4gnzqbqsggfsq
- https://ipfs.io/ipfs/bafybeihity6bx24npzvvkzopjbat25ekefjwmnshe7rvldy72dxngzf644
- https://ipfs.io/ipfs/bafybeicvcevx3ktiqjsfwnjguu4lnzejlhgb35brayuod5xdtn7demfdhe

For Web3.Storage - we do not have any examples yet. 

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

For NFT.Storage
All of this data is public - NFT.storage will support a “status” API that will give both pin status (to show if the content is in IPFS) as well as the relevant deal information - such that anyone can retrieve the relevant data from a miner and store their own local copy. 

For Web3.Storage
We expect there may be a mix here - some developers will be building with public data, but it's possible some users may want (for their own reasons) to encrypt their data. Given that all of this content is going to be pinned on IPFS, the content will be retrievable. There will be public APIs that allow anyone to query for a specific CID to get back the status of the content being pinned, as well as the miners who have been contracted to store those deals. For users who request larger allocations of storage capacity, we have the ability to vet teams for abuse.

I think this is an area where we're open to feedback - our aim is to give early developers as much flexibility to use these protocols as they see fit, and given the miner selection process is decoupled from the client storage (i.e. Clients cannot directly choose which miner to store with) typical economic concerns of self-dealing are mitigated against.


What is the expected retrieval frequency for this data?

To start, we intend to persist all this data in IPFS - so the retrieval frequency out of Filecoin we expect to be minimal. Over time, our intent is to lean more heavily on fast retrieval out of Filecoin and cache content intelligently in IPFS. 

For NFT.Storage specifically: 
One of our goals is also to enable teams who are not directly storing content via NFT.storage to request data out of this service, or to configure it as a fallback in the event an NFT's URL does not resolve. 

For how long do you plan to keep this dataset stored on Filecoin? Will this be a permanent archival or a one-time storage deal?

We intend to store this content permanently on Filecoin - and will be actively monitoring to ensure that the requisite number of copies remain online and that deals are appropriately renewed. 

DataCap allocation plan

In which geographies do you plan on making storage deals?

We intend to use our allocation to make deals with MinerX fellows from around the world. These deals will be split across ~30 miners - in regions spanning the globe (US, Europe, parts of Asia). 

What is your expected data onboarding rate? How many deals can you make in a day, in a week? How much DataCap do you plan on using per day, per week?

For NFT.Storage
In the first few months, we will be running our chain scraping process - which will require creating a backfill for all NFTs that have been minted to date. As that process executes, we imagine the rate of storage will be significantly higher (perhaps on the order of terabytes a day). We do not have concrete estimates of the total number of NFTs that we will be storing, but our estimate is 600TiB for Ethereum looking back historically. We expect to have a more refined forward looking estimate of the rate of increase as we see steady state adoption in the coming weeks. 

Our current estimate is that we’ll require 5TiB per week of DataCap to store newly minted NFTs from Ethereum with replication. From the 5PiB requested above, we expect 1PiB to be dedicated to supporting NFT.Storage

For Web3.Storage
We expect to see a rapid influx of sign-ups during hackathons and other pushes that bring in a large number of new developers - but we expect these users to have low average usage, assuming an average lifetime storage amount of ~5GiB in total. 

Separately, we estimate that there will likely be a handful of power users (e.g. teams like Voodfy; we've ballparked this at 10% of our user base) who will generate large volumes of data - on the order of 10s of GiB a week. 

A rough estimate would be (8k total users by EOY * 0.9 * 5GiB) + (8k total users by EOY * 0.1 * 50GiB * 4 weeks * 6 months) = total DataCap used this year = (roughly) 1PiB

For a minimum of 5x replication, we'd need 5PiB.
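
For anyone who wants to check the arithmetic, here is a small sketch that reproduces the estimate above. The variable names are ours, and it assumes binary units (1PiB = 1024 * 1024 GiB) and reads "4 weeks * 6 months" as four weeks per month over six months.

```python
GIB_PER_PIB = 1024 ** 2  # assumption: binary units

total_users = 8_000                          # "8k total users by EOY"
casual_users = total_users * 90 // 100       # 90% low-usage accounts
power_users = total_users - casual_users     # remaining 10% power users

casual_gib = casual_users * 5                # ~5GiB lifetime each
power_gib = power_users * 50 * 4 * 6         # 50GiB/week, 4 weeks/month, 6 months

total_gib = casual_gib + power_gib
total_pib = total_gib / GIB_PER_PIB
replicated_pib = total_pib * 5               # minimum 5x replication

print(f"unreplicated: {total_gib} GiB (~{total_pib:.2f} PiB)")
print(f"with 5x replication: ~{replicated_pib:.2f} PiB")
```

Under these assumptions the unreplicated total comes out just under 1PiB and the replicated total just under 5PiB, consistent with the rounded figures above.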

These are aggressive goals (and the math above is likely an overestimate), but given the allocation mechanism it would be ideal to have this amount pre-approved - and for notaries to rate limit accordingly. 

In total, the request is: 
1PiB for NFT.Storage
5PiB for Web3.Storage
-> 6PiB total -> requesting 5PiB, as it is the maximum. 

How will you be distributing your data to miners? Is there an offline data transfer process?

We’ll primarily be doing online data transfers with miners - though we’ll be running a batching script to only trigger deals once we have full sectors. Miners are being sourced from the MinerX program - and will only receive deals if they are found to be in good standing with our client. Deals will be allocated using a round-robin strategy to ensure equal distribution - with a primary focus on enabling geographic diversity per deal (redundant storage across locations). 
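
To make the allocation strategy above more concrete, here is a minimal sketch of round-robin assignment with per-deal geographic diversity. It is an illustration under stated assumptions, not the broker's implementation: the `Miner` type, the `assign_deals` function, the `replicas` parameter, and the miner IDs and region labels are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Miner:
    miner_id: str          # hypothetical identifier
    region: str            # hypothetical region label
    good_standing: bool = True


def assign_deals(
    deal_ids: Iterable[str], miners: Iterable[Miner], replicas: int
) -> Dict[str, List[str]]:
    """Assign each deal to `replicas` miners in distinct regions, rotating fairly."""
    # Only miners in good standing are eligible to receive deals.
    rotation = deque(m for m in miners if m.good_standing)
    assignments: Dict[str, List[str]] = {}
    for deal in deal_ids:
        chosen: List[Miner] = []
        regions = set()
        # Walk the rotation in order, skipping regions already used for this deal.
        for miner in list(rotation):
            if miner.region not in regions:
                chosen.append(miner)
                regions.add(miner.region)
            if len(chosen) == replicas:
                break
        if len(chosen) < replicas:
            raise ValueError("not enough eligible miners in distinct regions")
        # Move the chosen miners to the back so others are picked next.
        for miner in chosen:
            rotation.remove(miner)
            rotation.append(miner)
        assignments[deal] = [m.miner_id for m in chosen]
    return assignments


miners = [
    Miner("miner-a", "US"),
    Miner("miner-b", "US"),
    Miner("miner-c", "EU"),
    Miner("miner-d", "ASIA", good_standing=False),
]
plan = assign_deals(["deal-1", "deal-2"], miners, replicas=2)
print(plan)
```

With only one eligible miner outside the US in this toy data, that miner is reused for both deals; a larger pool spreads the load more evenly.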

Note our broker backend is being used for a number of other services (including Textile, Estuary, and Discover)

How do you plan on choosing the miners with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

As a component of our storage strategy, we’ll be requesting fast / free retrieval which will be required in order for miners to continue to receive deals from our Client. 

How will you be distributing data and DataCap across miners storing data?

We will be rotating through the MinerX fellows to ensure roughly equal distribution to Miners who offer high quality services to our client. We intend to fully allocate DataCap per 32GiB deal we strike with a Miner. 

@cryptowhizzard

Sounds good. Happy to be notary for this as soon as we are ready to go.

@large-datacap-requests

Thanks for your request!
❗ We have found some problems in the information provided.
We could not find your Name in the information provided
We could not find your Filecoin address in the information provided
We could not find the Datacap requested in the information provided
We could not find any Web site or social media info in the information provided

Please, take a look at the request and edit the body of the issue providing all the required information.

@jnthnvctr
Author

@cryptowhizzard - please see the updates to the above issue! Note we're using the same backend for two projects, so rather than file two different issues I've added detail on both above!

@neogeweb3

Count me in!

@andrewxhill

+1

@large-datacap-requests

Thanks for your request!
❗ We have found some problems in the information provided.
We could not find your Name in the information provided
We could not find your Filecoin address in the information provided
We could not find the Datacap requested in the information provided
We could not find any Web site or social media info in the information provided

Please, take a look at the request and edit the body of the issue providing all the required information.

@jnthnvctr
Author

@andrewxhill and @neogeweb3 please note that I totally forgot about the replication factor - I've updated the issue above accordingly

@s0nik42

s0nik42 commented Jun 29, 2021

Good for me too

@Reiers

Reiers commented Jun 29, 2021

I'm in as well +1

@ozhtdong

Count me in!

@steven004

Great. Count me in. +1

And, I'm looking forward to some methods which can also be applied to other programs to ensure that the datacap is not abused.

@Broz221

Broz221 commented Jun 30, 2021

Would like to be the notary of this project.

@rayshitou

Nice. I'm in too

@Fenbushi-Filecoin

I'm in +1

@Destore2023

I'm in.

@Fatman13
Contributor

Fatman13 commented Jun 30, 2021

> Note our broker backend is being used for a number of other services (including Textile, Estuary, and Discover)

Is this broker backend open sourced by any chance?

@ribasushi

@Fatman13 it is open source as virtually everything we do. However things are still shaping up: I could point you to a repository, but it is 100% unstable and being force-pushed to, so I wouldn't look just yet ( if really curious - simply look at my recent commit history 😉 )

The aggregator routine however (the one I used in partial retrieval demos ) is ready to roll: https://pkg.go.dev/github.com/filecoin-project/go-dagaggregator-unixfs

@dkkapur
Collaborator

dkkapur commented Jul 5, 2021

Hi folks - great to see lots of notary support for this one! Moving forward with the first 7 notaries that responded.

  1. @cryptowhizzard - f1krmypm4uoxxf3g7okrwtrahlmpcph3y7rbqqgfa
  2. @neogeweb3 - f13k5zr6ovc2gjmg3lvd43ladbydhovpylcvbflpa
  3. @andrewxhill - f1n4kuihubfesg55brkx5ntwqyglvbhydxjfodwra
  4. @s0nik42 - f1wxhnytjmklj2czezaqcfl7eb4nkgmaxysnegwii
  5. @Reiers - f1oz43ckvmtxmmsfzqm6bpnemqlavz4ifyl524chq
  6. @ozhtdong - f1lwpw2bcv66pla3lpkcuzquw37pbx7ur4m6zvq2a
  7. @steven004 - f1qoxqy3npwcvoqy7gpstm65lejcy7pkd3hqqekna

@large-datacap-requests

Thanks for your request!
Everything looks good. 👌

A Governance Team member will review the information provided and contact you back pretty soon.

@s0nik42

s0nik42 commented Sep 8, 2021

@ribasushi I'm keeping my ledger plugged in then :p

@neogeweb3

@ribasushi no worries, let me know if there is anything I could help with along the process.

@large-datacap-requests

Thanks for your request!
Everything looks good. 👌

A Governance Team member will review the information provided and contact you back pretty soon.

@s0nik42

s0nik42 commented Sep 24, 2021

What is the status of this request? Is it still not fully approved?

@galen-mcandrew
Collaborator

@s0nik42, to my knowledge this second allocation has still not met the threshold of 4. That said, the team is still working through some tech hurdles, so they have not actually started any deals with the first allocation.

Right now, I'm seeing 3 approvals from Neo, Reiers, and Julien. I think it would be good to go ahead and get a fourth signature, so that it clears the second allocation before we roll all the LDNs to new multisigs.

@cryptowhizzard @andrewxhill @ozhtdong @steven004 could we get one more approval here?

@steven004

steven004 commented Sep 26, 2021 via email


Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedy2ezgiiuh4qbujlpuuvwpuznxk2xaerkoijik6t5ktdpkjwa5t6

Address

f3watks6wyq5sakerofowyv7q4gwx4z6ukmfe2irh5zitspj6gpceudt3rrbvm5psofkcgoabev3s2mwtkcunq

Datacap Allocated

2748779069440

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedy2ezgiiuh4qbujlpuuvwpuznxk2xaerkoijik6t5ktdpkjwa5t6
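
For readers who want the allocated amount in more familiar units, the short sketch below converts the figure above. It assumes the "Datacap Allocated" value is expressed in bytes and that binary units apply (1TiB = 2^40 bytes); the helper name `bytes_to_tib` is ours.

```python
TIB = 2 ** 40  # assumption: binary units


def bytes_to_tib(n_bytes: int) -> float:
    """Convert a byte count to TiB."""
    return n_bytes / TIB


allocated = 2748779069440  # value reported in the approval message above
print(f"{bytes_to_tib(allocated)} TiB")
```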

@galen-mcandrew
Collaborator

rolling to #49

@large-datacap-requests

Thanks for your request!
❗ We have found some problems in the information provided.
We could not find any Expected weekly DataCap usage rate in the information provided

Please, take a look at the request and edit the body of the issue providing all the required information.

