
[questions] Understanding history sharding #2625

Closed
MarkusTeufelberger opened this issue Jul 12, 2018 · 7 comments
Labels: Documentation (README changes, code comments, etc.), Reviewed


@MarkusTeufelberger (Collaborator)

Are the following assumptions/statements about history sharding correct? (Pinging @miguelportilla, who wrote most of the code, for help.)

  1. Rippled (by default) will call the contents of ledgers [32571-48954] (both inclusive) "shard 0", [48955-65338] "shard 1" and so on.
  2. Every shard contains ALL nodes that appear in the 16384 ledgers it contains, even if they haven't changed during these 16384 ledgers (e.g. an AccountRoot node of a quite passive account).
  3. "All nodes" means they also contain InnerNodes.
  4. If 2 is true, that means that the total space used up by all shards is probably a bit larger than the actual node_db of a server with full history (since nodes are stored only once in there).
  5. Shard database files are not created deterministically, they can look potentially different on every single server out there, even if they contain the same data. This is even true if they were all generated with NuDB xor RocksDB.
  6. Ledger 0 (or ledger 32570 in the default config) will not make it into a shard.
  7. If I set my node to earliest_seq = 32569 so it also stores ledger 32570 in a shard, it will likely be incompatible with the rest of the network (since my "shard 0" is different from everyone else's "shard 0"), and sharing shards with others using Add shard download and import RPC #2561 will likely end in at least one ledger being requested from the network when importing the data.
  8. After importing the data with Add shard download and import RPC #2561 on a server that is configured for full history, but that has no data yet, the ledger contents will be written into both node_db and shard_db.
  9. It would be a smart strategy in the situation of 8 to start importing the latest shards (= highest IDs) first and then going back down to shard 0 instead of starting at 0 and moving forward in time.

What I want to achieve (once #2561 lands) is a trustworthy (meaning: deterministically generated) way to share shards via IPFS and/or BitTorrent or other P2P file-sharing networks, so it becomes easier to run a node with full history without using that many resources on the rippled servers that hold the data. I have some ideas about how this might be doable with NuDB (keep the salt static, write everything sorted and single-threaded into a data file with no spill records, then generate the index, which should put the spill records at the end of the data file(?)), but I'd like to be sure I understand history sharding before I start wrangling with NuDB again.

PS: It would be nice if the earliest_seq hack could be removed, so shard IDs map globally to the same ledger ranges, and shards 0 and 1 were simply disabled on XRPL... but that's probably worth a separate issue.

@miguelportilla (Contributor) commented Aug 17, 2018

Hi @MarkusTeufelberger,

Thanks for your questions, I've tried to answer them below.

The shard code uses two constants that can be modified to suit different networks. The first is the number of ledgers per shard. For the XRP Ledger network, that value is 16384, a compromise between resource usage and application performance. The second constant is the earliest ledger sequence allowed. For the XRP Ledger network, that value is 32570.

1- Shard indexes are calculated using the integral value of (ledger sequence - 1) / ledgers per shard.
Given ledger sequence 32570, we have (32570-1)/16384 = 1

A shard's first ledger sequence is the maximum of 1 + (shard index * ledgers per shard) and the earliest ledger sequence.
Given shard index 1, we have max(1 + (1 * 16384), 32570) = 32570

A shard's last ledger sequence is the integral value of (shard index + 1) * ledgers per shard.
Given shard index 1, we have (1 + 1) * 16384 = 32768

So the XRP ledger network calls the contents of ledgers [32570-32768] (both inclusive) "shard 1", [32769-49152] "shard 2" and so on.
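Put in code, the three formulas above read as follows (a quick Python sketch; the constants are the XRP Ledger values quoted in this comment, and the function names are just illustrative):

```python
# Shard boundary arithmetic for the XRP Ledger network, as described above.
LEDGERS_PER_SHARD = 16384     # ledgers per shard
EARLIEST_LEDGER_SEQ = 32570   # earliest ledger sequence allowed

def shard_index(ledger_seq: int) -> int:
    # integral value of (ledger sequence - 1) / ledgers per shard
    return (ledger_seq - 1) // LEDGERS_PER_SHARD

def first_ledger(index: int) -> int:
    # max of 1 + (shard index * ledgers per shard) and the earliest sequence
    return max(1 + index * LEDGERS_PER_SHARD, EARLIEST_LEDGER_SEQ)

def last_ledger(index: int) -> int:
    return (index + 1) * LEDGERS_PER_SHARD

assert shard_index(32570) == 1                        # earliest ledger is in shard 1
assert (first_ledger(1), last_ledger(1)) == (32570, 32768)
assert (first_ledger(2), last_ledger(2)) == (32769, 49152)
```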

2- Correct

3- Correct

4- Correct

5- Correct

6- In the XRP Ledger network, ledger 32570 is in shard 1.

7- Changing the constants may create incompatibility; they are not meant to be modified after use.

8- No, shard archives are only imported into the shards database.

9- Since we only import into the shards database, the order doesn't matter. If we did import into the node database, then the strategy you suggested might help if using contiguous shard indexes.

I believe it is possible to create 'deterministically generated' shards as you described. That process can also be applied to convert existing shards.

Implementing an IPFS client in rippled is on my list to do.

@MarkusTeufelberger (Collaborator, Author)

Great to hear and thanks for your answers! :-)

In that case I wonder why the earliest ledger sequence design was chosen instead of just skipping shards 1+2 by default. There is a nonzero chance that earlier ledgers could be recovered, and then all shard databases in the whole network might need to be migrated or changed because the earliest ledger sequence changes... compared to just not having a few hundred of the earliest ledgers in a shard, which would also make it much easier to calculate which shard a certain ledger height falls into.

> Since we only import into the shards database, the order doesn't matter.

Would a node that wants full history and has all/most shards locally on disk then fill its node_db from the shard_db as fast as it can read/write the data or would it still query the data from the network? I really hope the former is the case...

> I believe it is possible to create 'deterministically generated' shards as you described.

Write singlethreaded to a NuDB in deterministic order (e.g. sort all nodes by key alphabetically) and keep the salt of the database constant. I'm not so sure about the spill records, since the data gets written asynchronously, so if the disk can't keep up maybe it'll introduce problems? Should be testable though.
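The core of the determinism idea can be illustrated independently of NuDB: treat a shard's nodes as key→value byte strings and always serialize them sorted by key, so the output (and any hash over it) no longer depends on the order in which the nodes were acquired. A minimal sketch with made-up node data:

```python
import hashlib

def deterministic_digest(nodes: dict[bytes, bytes]) -> str:
    """Hash a shard's nodes in sorted-key order, so the result is
    independent of the order in which the nodes were acquired."""
    h = hashlib.sha256()
    for key in sorted(nodes):
        h.update(key)
        h.update(nodes[key])
    return h.hexdigest()

# Two servers that acquired the same nodes in a different order
# still produce the same digest.
a = {b"\x02" * 32: b"node-2", b"\x01" * 32: b"node-1"}
b = dict(reversed(list(a.items())))
assert deterministic_digest(a) == deterministic_digest(b)
```

The same sorted-order principle is what keeping the salt static and writing single-threaded would have to guarantee at the NuDB file level.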

A different option would be to have a dedicated import/export format for shards (it could be as simple as CSV, since these are only key/value pairs of hex strings) instead of sharing database files. This would also help alternative use cases - NuDB is not exactly widely used, and RocksDB database files are probably not that easy to use in a shard import context.
The downside would be that the same data is stored up to three times (node_db, shard_db, export/import file).
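For illustration, a sketch of what such a CSV export/import could look like, assuming nodes are just key/value pairs of raw bytes (this layout is hypothetical, not an existing rippled format):

```python
import csv
import io

def export_nodes(nodes: dict[bytes, bytes]) -> str:
    """Write nodes as hex key/value rows, sorted by key for determinism."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for key in sorted(nodes):
        writer.writerow([key.hex(), nodes[key].hex()])
    return buf.getvalue()

def import_nodes(text: str) -> dict[bytes, bytes]:
    """Parse the hex rows back into a key/value map."""
    return {bytes.fromhex(k): bytes.fromhex(v)
            for k, v in csv.reader(io.StringIO(text))}

# Round trip: export and re-import yields the same node map.
nodes = {b"\x01" * 32: b"ledger header", b"\x02" * 32: b"inner node"}
assert import_nodes(export_nodes(nodes)) == nodes
```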

@MarkusTeufelberger (Collaborator, Author)

Revisiting this because I'm slightly confused:

> So the XRP ledger network calls the contents of ledgers [32570-32768] (both inclusive) "shard 1", [32769-49153] "shard 2" and so on.

How can 49153 be the last ledger of shard 2 if the last ledger in a shard will always be a multiple of 16384?

I think your example is off by one and the last ledger should be 49152 ((shardIndex + 1) * ledgers per shard --> (2 + 1) * 16384 = 49152), right?

@miguelportilla (Contributor)

@MarkusTeufelberger Correct. I am not sure how that 3 snuck in there. ((2 + 1) * 16384 = 49152)

@MarkusTeufelberger (Collaborator, Author)

Ah, I was already worried I misunderstood that part too. :-D

For future me and/or anyone else reading this, here's the shard calculation stuff in Python 3:

```python
# Constants for the XRP Ledger network
LEDGERS_PER_SHARD = 16384
EARLIEST_PUBLIC_LEDGER = 32570

def earliest_ledger_in_shard(shard_num: int) -> int:
    return max(EARLIEST_PUBLIC_LEDGER, 1 + (shard_num * LEDGERS_PER_SHARD))

def last_ledger_in_shard(shard_num: int) -> int:
    # the max() is a no-op for any shard index >= 1 on the XRP Ledger network
    return max(EARLIEST_PUBLIC_LEDGER, (shard_num + 1) * LEDGERS_PER_SHARD)

def shard_index_by_ledger(ledger_height: int) -> int:
    return (ledger_height - 1) // LEDGERS_PER_SHARD
```

@mDuo13 added the Documentation (README changes, code comments, etc.) label on Aug 2, 2019
@carlhua (Contributor) commented Sep 16, 2020

@MarkusTeufelberger can we close this issue? Also, we welcome your feedback on #3455.

@MarkusTeufelberger (Collaborator, Author)

I guess it can be closed; maybe @mDuo13 wants to look over this and see whether it is covered in the documentation by now.
