XLS-48d: Document Storage #132

mDuo13 · 2023-09-14T01:13:38Z

title: Document Storage
author: Rome Reginelli, Aanchal Malhotra
affiliation: Ripple
revision: 4.1
core_protocol_changes_required: true

Document Storage

Many use cases call for storing arbitrary, unstructured data in the ledger and retrieving it. This might be used for app configurations, identity data, oracles, user storage for play-to-earn games, provenance tracking of objects, and so on.

This proposal defines "CRUD operations" (create, retrieve, update, delete) for arbitrary documents.

- Adds one ledger entry type, Document
- Adds two new transaction types:
  - DocumentSet, which creates or updates a Document
  - DocumentDelete, which deletes a Document
- Extends the ledger_entry and account_objects API methods for getting Document entries from the ledger

These changes require an amendment to the XRP Ledger.

(This standards draft draws heavily on a previous draft of the XLS-40d: Decentralized Identity specification.)

Rationale

Blockchains are fundamentally good at storing data in a manner that is public, highly-available, and cryptographically verifiable. (Changes require a signed transaction and data integrity is verified with cryptographic hashes.) Various blockchain use cases depend on being able to store and retrieve data from the ledger in various formats. This specification aims to support further general purpose use of the XRP Ledger by allowing users to store small chunks of data, called Documents, either directly on-ledger or with an on-ledger reference to data stored off-ledger.

In general, this spec is applicable for anything where all of the following are true:

- The XRP Ledger does not validate the integrity or structure of the data.
- The data is owned by one account, which can update or delete it as needed.
- The document is either small enough to store directly on the XRP Ledger, or is stored elsewhere but the XRP Ledger stores the information necessary to look up and verify it.

Non-fungible tokens provide a similar ability to store arbitrary URIs or small data blocks in the ledger. However, NFTs have some properties that are critical to their use cases but inconvenient for the use cases Document Storage targets.

- NFTs can be traded, sold, or transferred. Documents cannot.
- Documents are mutable. NFTs cannot be changed after minting, at least under the XLS-20 standard. (XLS-46d: Dynamic NFTs proposes mutable NFTs.)
- It is possible to directly look up a given type of Document owned by a particular user knowing nothing more than the user account and the type of document. This is not possible with NFTs, which have a sequence that is defined at minting time, so it is not possible for different accounts to use a consistent ID to store equivalent data.

The functionality defined by Document Storage is similar to Stellar's "Manage Data" operation, but the amount and format of data allowed is different. Stellar allows for a 64-byte data field identified by a 64-byte identifying string; Document Storage allows up to two 256-byte data fields identified by a 4-byte identifying integer.

One example of a use case for Document Storage is non-sensitive client application settings. Wallet applications are already highly portable; as long as you know your secret keys, you can use a wallet app from any device and switch at any time. Most of the data about your XRP Ledger account is natively part of the ledger, such as your balances, trust lines, and various account settings. However, there is currently no place on-ledger to store settings that are specific to the client application, which means that switching to a different device means either exporting and re-importing certain settings, or having a separate backend service hosted by the wallet provider. With Document Storage, a client application can define a specific document number to store its settings at, which would be the same for every user, and define/retrieve settings from that Document purely on-ledger. Then, if the user connected using the same static wallet code from any device, they could instantly access their saved settings with no outside service or file needed. Taking it a step further, other apps could read these settings, or multiple separate apps could share settings in a standard format, allowing greater portability and compatibility. This allows many more dApps to have no backend other than the XRP Ledger itself.

Document Storage also has a potential for synergy with other proposed extensions, such as Hooks, which could read or modify Documents beyond what they can do with Hooks' built-in state management.

Of course, since all data in the XRP Ledger is public, Document Storage is not suited for storing secret, sensitive, or personal information. (Even if it is encrypted, having the data permanently publicly accessible is poor operational security.)

DocumentSet transaction

This transaction type creates or updates a Document ledger entry. In addition to transaction common fields, it has:

| Field | Type | Required? | Description |
|-------|------|-----------|-------------|
| `DocumentNumber` | UInt32 | Yes | An arbitrary number to identify the document to add or modify. You can uniquely identify any Document ledger entry by the pair (`Owner`, `DocumentNumber`). By convention, values of 65535 or less are reserved for "well known" documents and values of 65536 or greater are unreserved. |
| `Data` | VL | No | Arbitrary binary data (hex-formatted in JSON). Limited to 256 bytes or less. |
| `URI` | VL | No | A URI that encodes or locates a document (hex-formatted in JSON). Only bytes that are valid in URIs are allowed. Limited to 256 bytes or less. |

The Data and URI fields are collectively called the data fields of the Document. You can specify one, both, or neither of these.

The URI field can only contain bytes that correspond to characters valid in URIs, specifically the ASCII values for the following characters: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=%. (This is the same restriction that applies to the MemoType and MemoFormat fields.) It is intended, but not required, for the URI field to encode a reference to an off-ledger document that is too large to store on-ledger, such as an ipfs: URL.

The Data field may contain fully arbitrary data.
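
As an illustration of the fields above, here is a minimal Python sketch that assembles an unsigned DocumentSet transaction in JSON form and checks the size and character restrictions on the client side. The helper name `build_document_set` is hypothetical, and validating before submission is an assumption for illustration; the draft does not prescribe any client API.

```python
import string

# Characters the draft lists as valid in the URI field.
URI_ALLOWED = set(string.ascii_letters + string.digits + "-._~:/?#[]@!$&'()*+,;=%")
MAX_FIELD_BYTES = 256  # per-field limit stated in the draft


def build_document_set(account, document_number, data=None, uri=None):
    """Return an unsigned DocumentSet transaction as a JSON-ready dict.

    `data` is raw bytes and `uri` is a str; both are hex-encoded for JSON.
    Hypothetical helper, not part of any existing library.
    """
    if not 0 <= document_number < 2**32:
        raise ValueError("DocumentNumber must fit in a UInt32")
    tx = {
        "TransactionType": "DocumentSet",
        "Account": account,
        "DocumentNumber": document_number,
    }
    if data is not None:
        if len(data) > MAX_FIELD_BYTES:
            raise ValueError("Data is limited to 256 bytes")
        tx["Data"] = data.hex().upper()
    if uri is not None:
        # Check the character set first; every allowed character is ASCII,
        # so afterwards the string length equals the byte length.
        if any(ch not in URI_ALLOWED for ch in uri):
            raise ValueError("URI contains a character outside the allowed set")
        if len(uri) > MAX_FIELD_BYTES:
            raise ValueError("URI is limited to 256 bytes")
        tx["URI"] = uri.encode("ascii").hex().upper()
    return tx
```

The resulting dict would still need the common fields (fee, sequence, signature) filled in by whatever signing tool is used.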

The transaction creates or updates (aka "upserts") the corresponding Document entry in the ledger with the matching DocumentNumber, adding or replacing the Data and URI fields with the ones provided. Specify an empty Data or URI field to remove it. (If the field does not already exist in the Document, this has no effect.) This transaction always updates the PreviousTxnID and PreviousTxnLgrSeq fields of a Document even if it did not make changes to the data fields, so you can use an otherwise no-op transaction to "renew" a document.

The DocumentSet transaction has no specific transaction flags.
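
The upsert behavior described above can be sketched as a small in-memory model. This is not how a server would implement it: the `ledger` dict, the function name `apply_document_set`, and the reading that an omitted field is left unchanged are all assumptions made for illustration, and reserve and fee checks are left out entirely.

```python
def apply_document_set(ledger, tx, txn_id, ledger_seq):
    """Apply a DocumentSet to a toy ledger keyed by (Owner, DocumentNumber).

    Hypothetical model of the upsert rules; reserve checks are omitted.
    """
    key = (tx["Account"], tx["DocumentNumber"])
    # Create the entry if it does not exist yet (the "insert" half of upsert).
    entry = ledger.setdefault(
        key, {"Owner": tx["Account"], "DocumentNumber": tx["DocumentNumber"]}
    )
    for field in ("Data", "URI"):
        if field not in tx:
            continue  # assumption: an omitted field is left as it was
        if tx[field] == "":
            entry.pop(field, None)  # empty value removes the field, if present
        else:
            entry[field] = tx[field]  # add or replace
    # Always refreshed, even when the data fields did not change.
    entry["PreviousTxnID"] = txn_id
    entry["PreviousTxnLgrSeq"] = ledger_seq
    return entry
```

Because empty fields are removed rather than stored, this model also keeps the guarantee that Data and URI are never zero-length when present.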

DocumentDelete transaction

This transaction removes a Document ledger entry. In addition to transaction common fields, it has the following field:

| Field | Type | Required? | Description |
|-------|------|-----------|-------------|
| `DocumentNumber` | UInt32 | Yes | An arbitrary number to identify the document to delete. |

The DocumentDelete transaction has no specific transaction flags.

Document ledger entry

A Document ledger entry stores arbitrary data on behalf of a given account.

A Document entry counts as one item for purposes of the owner reserve, and is tracked in an owner directory. Only the owner of an object can update it, although you can use multi-signing to share that power. Each account can have (theoretically) up to 2³² Document entries, each with a different DocumentNumber, but practically speaking the owner reserve puts a stricter limit on how many Document entries one account can afford.

This ledger entry has the following fields:

| Field | Type | Required? | Description |
|-------|------|-----------|-------------|
| `Owner` | AccountID | Yes | The account that owns this data. Only this account has permission to update it. Automatically set to the sender of the `DocumentSet` transaction that created this ledger entry. |
| `DocumentNumber` | UInt32 | Yes | An arbitrary number to identify this entry. By convention, values of 65535 or less are reserved for "well known" documents and values of 65536 or greater are unreserved. |
| `Data` | VL | No | Arbitrary binary data, hex-formatted. Limited to 256 bytes or less. |
| `URI` | VL | No | A URI that encodes or locates the document (hex-formatted in JSON). Limited to 256 bytes or less. |
| `PreviousTxnID` | UInt256 | Yes | Hash of the previous transaction to modify this entry. (Same as on other entries with this field.) |
| `PreviousTxnLgrSeq` | UInt32 | Yes | Ledger index of the ledger when this entry was most recently updated/created. (Same as other entries with this field.) |

The fields Data and URI are guaranteed to be nonzero length if present.

Document ID format

The ledger entry ID of a Document object is the SHA-512Half of the following values, concatenated in order:

1. The Document space key (proposed: 0x0044)
2. The AccountID of the DocumentSet transaction sender (the owner)
3. The DocumentNumber value
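
A rough sketch of that computation in Python follows. SHA-512Half is the first 256 bits of a SHA-512 digest. The byte layout is an assumption based on how other XRP Ledger entry IDs are built (a 2-byte big-endian space key, the 20-byte AccountID, and a 4-byte big-endian integer); the draft itself only lists the three values. The function names are hypothetical, and the AccountID is taken as raw bytes to avoid pulling in base58 address decoding.

```python
import hashlib
import struct

DOCUMENT_SPACE_KEY = 0x0044  # proposed value from the draft


def sha512_half(data: bytes) -> bytes:
    """Return the first 32 bytes (256 bits) of the SHA-512 digest."""
    return hashlib.sha512(data).digest()[:32]


def document_id(account_id: bytes, document_number: int) -> str:
    """Compute a Document ledger entry ID as an uppercase hex string.

    Assumes big-endian encoding for the space key and DocumentNumber.
    """
    if len(account_id) != 20:
        raise ValueError("AccountID must be exactly 20 bytes")
    payload = (
        struct.pack(">H", DOCUMENT_SPACE_KEY)  # 2-byte space key
        + account_id                           # 20-byte owner AccountID
        + struct.pack(">I", document_number)   # 4-byte DocumentNumber
    )
    return sha512_half(payload).hex().upper()
```

Since the ID depends only on the owner and the DocumentNumber, any client can derive it without first querying the ledger, which is what makes the single-step lookup possible.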

Well-Known Document Numbers

In some cases, it is helpful to have certain types of documents conventionally stored at a predetermined DocumentNumber so that you can look up an account's document of that type without further information. Similar to reserved ports in TCP/IP, we define a range of "reserved" document numbers and a registry (a list to be maintained on xrpl.org) mapping specific numbers to specific types of documents.

The reserved range of DocumentNumber values is 0 through 65535 inclusive. To register a given document number for a specific purpose, create an XLS draft and specify the meaning and format of the document type to be stored at that number. The registry will be updated when that XLS draft is accepted.

Values of 65536 or greater are unreserved and may be used for any type of document.

At a protocol level, no rules are enforced regarding the type of document stored at any number.
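
Because the split is only a convention, any check for it would live in client code. A minimal sketch, with a hypothetical helper name:

```python
RESERVED_MAX = 65535  # top of the reserved "well known" range in the draft


def is_well_known(document_number: int) -> bool:
    """Return True if the number falls in the reserved range 0..65535.

    Client-side convention only; the protocol enforces nothing here.
    """
    if not 0 <= document_number <= 0xFFFFFFFF:
        raise ValueError("DocumentNumber must fit in a UInt32")
    return document_number <= RESERVED_MAX
```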

API Changes

The account_objects method can return Document ledger entries. Extend the API method to allow filtering by "type": "document" to return only these types of documents. The same applies to the ledger and ledger_data methods, which can also retrieve arbitrary ledger entries and filter by type.

The ledger_entry method can retrieve a specific Document ledger entry, similar to how it returns other types. There are two ways to look up a given Document:

- Look it up by ID using the existing Get Ledger Object By ID syntax (the index request field)
- Look it up by the DocumentNumber and the account that owns it, using a new document request field, which is an Object with two nested, required sub-fields as follows:

| Field | Type | Description |
|-------|------|-------------|
| `document.owner` | String - Address | The unique address of the account that owns the document. |
| `document.number` | Number - UInt32 | The `DocumentNumber` of the document to look up. |

This is similar to how you can look up offers, escrows, tickets, and others.
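
For illustration, the two lookup styles might produce request bodies like the ones built below. The `document` field is the one proposed in this draft and does not exist on any current server; the function names and the default `"ledger_index": "validated"` are choices made for this sketch, using the WebSocket request format.

```python
import json


def document_request_by_owner(owner, number, ledger_index="validated"):
    """Build a ledger_entry request using the proposed `document` field."""
    return {
        "command": "ledger_entry",
        "document": {"owner": owner, "number": number},
        "ledger_index": ledger_index,
    }


def document_request_by_id(entry_id, ledger_index="validated"):
    """Build a ledger_entry request using the existing `index` field."""
    return {
        "command": "ledger_entry",
        "index": entry_id,
        "ledger_index": ledger_index,
    }


if __name__ == "__main__":
    request = document_request_by_owner("rf1BiGeXwwQoi8Z2ueFYTEXSwuJYfV2Jpn", 12345)
    print(json.dumps(request, indent=2))
```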

Size Considerations

At 572 bytes including the bookkeeping fields, the maximum size of a single Document ledger entry is slightly larger than most directly user-editable ledger entries (for comparison, a trust line is between 234 and 250 bytes and a payment channel can be up to 247 bytes if my math is right), but much smaller than a single NFTokenPage entry, which can be over 73,000 bytes if it stores the maximum number of NFTs. Therefore, we consider it appropriate to require a single owner reserve increment per Document object owned in the ledger.

Users can doubtlessly find various ways to store arbitrary data in the ledger, but to discourage wasteful use of resources we explicitly don't define a way of linking multiple Document entries together for storing larger amounts of data. Users can use the data fields to identify, locate, and verify documents which are stored and distributed using another system such as IPFS or even BitTorrent. For example, the Data field can contain a hash, the URI field can be a magnet: link, and so on.
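
As a sketch of the hash-plus-link pattern, the snippet below derives a Data value from an off-ledger document and later checks a fetched copy against it. SHA-256 is an arbitrary choice here, since the draft does not name a hash function, and the helper names are hypothetical.

```python
import hashlib


def data_field_for(document: bytes) -> str:
    """Return a hex SHA-256 digest suitable for the Data field (32 bytes)."""
    return hashlib.sha256(document).hexdigest().upper()


def verify_document(document: bytes, data_hex: str) -> bool:
    """Check a fetched off-ledger document against the on-ledger Data value."""
    return hashlib.sha256(document).digest() == bytes.fromhex(data_hex)
```

A 32-byte digest fits well within the 256-byte Data limit, leaving the URI field free for the link to the full document.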

Document History

Some use cases may call for examining multiple recent values for a given Document. This is already possible using past ledger versions and transaction metadata. The PreviousTxnID and PreviousTxnLgrSeq fields can be used to directly look up the previous version of a Document the same way they can for other types of ledger entry with those fields. (Note: most ledger entry types have these fields; the big exception is RippleState, for trust lines. The other types that don't have these fields are not directly user-modifiable: Amendments, DirectoryNode, FeeSettings, LedgerHashes, and NegativeUNL.)

In the edge case where a ledger entry is modified more than once within a single ledger, you must use the transaction metadata (specifically the FinalFields of a ModifiedNode entry) to look up the intermediate state of the entry. This state exists for only a brief moment while executing the transactions to build the ledger, but it may be necessary to understand the full history of the entry.

A limitation of the PreviousTxnID and PreviousTxnLgrSeq fields is that the "thread" they create of an entry's history does not go back further than its creation. If an entry was previously deleted and later recreated with the same ID, each instance's history is separate.
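
One way such a history walk could look is sketched below. It follows the thread from an entry's PreviousTxnID through each transaction's metadata, reading FinalFields from ModifiedNode entries and stopping at the CreatedNode. The function name and the `get_tx_meta` callback (which would wrap something like a `tx` API call) are assumptions for illustration; the metadata is assumed to have the same shape as for other threaded ledger entries.

```python
def walk_history(entry_id, entry, get_tx_meta):
    """Return [(txn_id, fields), ...] from newest to oldest state.

    `get_tx_meta(txn_id)` is a hypothetical callback that returns the
    metadata dict of a validated transaction.
    """
    history = []
    txn_id = entry.get("PreviousTxnID")
    while txn_id is not None:
        meta = get_tx_meta(txn_id)
        next_id = None
        for wrapper in meta["AffectedNodes"]:
            # Each wrapper has a single key naming the kind of change.
            kind, node = next(iter(wrapper.items()))
            if node["LedgerIndex"] != entry_id:
                continue
            if kind == "ModifiedNode":
                # State of the entry right after this transaction applied.
                history.append((txn_id, node["FinalFields"]))
                # Thread pointer to the transaction that touched it before.
                next_id = node.get("PreviousTxnID")
            elif kind == "CreatedNode":
                history.append((txn_id, node["NewFields"]))
            break
        txn_id = next_id  # None ends the walk (creation or missing node)
    return history
```

Because each step reads one transaction's metadata, intermediate states from several modifications within the same ledger show up as separate items, and the walk ends at the creation of the current instance, as described above.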

It may be useful to add an API method to the server or client libraries for looking up past states of a ledger entry. This would not require an amendment.

shortthefomo · 2023-09-14T03:17:50Z

I'm confused as to how the URI data is accessible on ledger? For any on-ledger transactor to act on the data present in the URI, it would need to fetch that data.

From the perspective of oracle data, it's useful only on ledger, as the point of the oracle is to bring off-ledger data on ledger to act on. The oracle spec XLS-47d defines this more granularly than this proposal does, especially if oracles are bringing multiple pairs of data. The definition here can already be achieved via an AccountSet operation, which has much the same limitations as this from the perspective of oracle data.

From my first pass reading through this it seems to indicate this is possible (fetching the URI). Not even hooks could access data off ledger.

1 reply

mDuo13 Sep 14, 2023
Author

The data at the URI is not accessible on-ledger; only the URI itself is. So, for oracles, Document Storage is mostly useful if the data is small enough to fit on-ledger (e.g. ≤256 bytes per document, or 256+256 if you use URI creatively).

One case where this is still an advantage over offering an oracle as "just an HTTP API" is that Document data is always signed by the owner of the account—as part of the transaction to set it—so it's cryptographically associated with an XRPL identity. If your URI is a content-addressed-storage style scheme like ipfs: or magnet: (meaning, the URI contains a hash of the contents) then you can view that as an indirect signature of the contents. For other URI schemes, you could put a hash in the Data field alongside the URI. Paired with ledger history, which is also cryptographically secured, you have a verifiable chain of documents provably signed by a single entity. In this scenario, you still have to fetch & save the actual larger documents off-ledger, but the ledger data are a public & permanent integrity check on the data.

xVet · 2023-09-14T10:46:21Z

Interesting proposal and thank you for writing this up!

I think what you proposed is basically a URIToken that is soulbound.

Under soulbound we understand NFTs, or objects containing a URI, that are non transferable.

XLS-20 does have this flag, tfTransferable, but it's not a first-class object, so you need to have an indexer like Clio.

URITokens or xls40 can both act the same way with a flag to make them soulbound.

Can you share your thoughts around this? 😊

1 reply

mDuo13 Sep 14, 2023
Author

Documents and "soulbound NFTs" are pretty similar, but one of the key differences—as described in the Rationale section—is lookup. It's a single-step lookup to find "Document 12345 for account rf1BiGeXwwQoi8Z2ueFYTEXSwuJYfV2Jpn". So, if you're looking for a particular thing—say, some kind of identity information, or settings for a particular app, or whatever else—that's all it takes. The equivalent for a soulbound NFT would involve doing a search through multiple NFT pages to find one with the correct taxon, or something. You can't know what NFTokenID someone will have used because that's somewhat dynamic (sequence number based) at the time the NFT is minted.

ckeshava · 2023-09-15T22:21:47Z

Hello, thank you for writing this proposal. I have two questions:

1. Why would we need both Document Storage and Decentralized Identity? Although the implementation specifics might differ, because both of them encode 256+256 bytes, they appear to be redundant.
2. I am not able to conceive uses for this transaction type. Why would I pay the reserve requirements to store 512 bytes of data? (Especially since we can't access it on-chain, and XRPL doesn't have smart contracts either.)

A rebuttal to the use case of preserving document integrity: Today, we are publishing documents along with their checksums in order to prove their integrity. For example -- software downloads are accompanied with a public display of the expected md5 hashes.

Why would an app developer want to add an extra layer of indirection? Instead she could publish the ipfs/magnet/Google-Drive link directly on her website.

5 replies

mDuo13 Sep 19, 2023
Author

1. Two basic reasons: (a) decentralized identity data structures and code may evolve to be more specific and less suitable to other, generic data storage purposes, and (b) DID stores only one document; this spec allows storing over 2 billion. That would cost a prohibitive amount of XRP under current Mainnet reserve amounts, but you can certainly store a lot more documents usefully, maybe more so in the future or on sidechains.
2. It's all about the types of architecture that it enables. For one, consider a document that you want a specific list of people to be able to edit: combining Document Storage with a multi-signing wallet handles that in a public-key-powered, history-tracked way. If you have any sort of app that wants to interact with the blockchain in general, and that app needs to store some (non-sensitive) data, this adds a place to do that in a "cloud"-like way, without relying on a separate service. You could have a single, static wallet application that nevertheless loads all your customizations and preferences regardless of which device you access it from (just bring your account address), without relying on additional centralized services or APIs.

If you're thinking about a single document per app developer published on a website, you're not thinking big enough. This is about being able to store things on the blockchain even if you don't have a website, or don't want to use it for this. It is also about checking the integrity of documents for each user of the app, not just for the app itself.

Is the price right at 2 XRP for 256+256 bytes? That, I'm not so sure; certainly, there are use cases that will be priced out. But the interface itself is not fundamentally incompatible with a wide variety of possible uses.

mvadari Sep 20, 2023
Maintainer

Re 2: You can also fit smaller documents on-ledger that are <=512 bytes.

ckeshava Sep 20, 2023

Okay, the use of Document Storage to store wallet-specific information appears to be a promising use case.
What do you mean by "... this spec allows storing over 2 billion - ..."? Do you mean users can cascade on the links and store a large number of documents?

Is there any cost associated with DocumentSet (creating or updating existing documents)? I'm aware of the two XRP reserve requirement for a document. If we do not have any transaction fees for updates to a document, the network would be flooded with cost-free updates, right?

mvadari Sep 20, 2023
Maintainer

Creating a new object (including a document) always requires a reserve. Updating a ledger object (including a document) never requires a reserve and always requires a transaction fee.

ckeshava Sep 20, 2023

ok, thanks for the clarification

mDuo13 · 2023-10-30T20:22:58Z

Author

It sounds like there's not all that much demand for / interest in this functionality, so I guess I'll retire this proposal for now. We can always dig it back out if the need arises.
