Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add information byte to message share format #839

Closed
Tracked by #514
musalbas opened this issue Aug 17, 2022 · 4 comments
Closed
Tracked by #514

Add information byte to message share format #839

musalbas opened this issue Aug 17, 2022 · 4 comments
Assignees
Labels
T:spec-divergence Type: diversion from spec

Comments

@musalbas
Copy link
Member

The current share format is as follows:

nid (8 bytes) | message length (varint) | share data for shares at the start of a message;
nid (8 bytes) | share data for other shares.

I propose to change it to the following:

nid (8 bytes) | info byte | message length (varint) | share data for shares at the start of a message;
nid (8 bytes) | info byte | share data for other shares.

Where info byte is a byte with the following structure:

  • the first 7 bits are reserved for the version information in big endian form (initially, this will just be 0000000 until further notice);
  • the last bit is a "message start indicator", that is 1 if the share is at the start of a message.

Rationale:

  1. The message start indicators allow clients to parse a whole message in the middle of a namespace, without needing to read the whole namespace.
  2. The version bits allow us to upgrade the share format in the future, if we need to do so in such a way that different share formats can be mixed within a block.
@rootulp
Copy link
Collaborator

rootulp commented Aug 18, 2022

Thanks for the visual depictions of the current/proposed share format, they really help!

The message start indicators allow clients to parse a whole message in the middle of a namespace, without needing to read the whole namespace.

Clarifying question: is this because a a client cannot differentiate between message length (varint) and arbitrary share data in the current format?

Does the proposed format include the message start indicator in the info byte (as opposed to a separate byte) to save one byte per share? This seems like a sane tradeoff to make assuming we don't expect to exhaust 2^7 versions.

@musalbas
Copy link
Member Author

musalbas commented Aug 18, 2022 via email

@rootulp rootulp changed the title Add information byte to share format Add information byte to message share format Aug 19, 2022
@rootulp rootulp moved this to TODO in Celestia Node Aug 19, 2022
@rootulp rootulp moved this from TODO to In Progress in Celestia Node Aug 19, 2022
@rootulp rootulp added the T:spec-divergence Type: diversion from spec label Aug 22, 2022
@adlerjohn
Copy link
Member

adlerjohn commented Aug 23, 2022

assuming we don't expect to exhaust 2^7 versions.

The info byte could always be extended, e.g. if the 7 bits are 1111111 then read the second byte.

The proposal seems fine, I have no objections. Adding versions to things has proven to basically always be a good thing in hindsight. The overhead is minimal, especially considering we may increase share size.

@rootulp
Copy link
Collaborator

rootulp commented Aug 29, 2022

Closing in favor of celestiaorg/celestia-app#659

@rootulp rootulp closed this as completed Aug 29, 2022
Repository owner moved this from In Progress to Done in Celestia Node Aug 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T:spec-divergence Type: diversion from spec
Projects
No open projects
Archived in project
Development

No branches or pull requests

3 participants