Pruned full/bridge nodes: minimum viable pruning #2615

musalbas · 2023-08-26T08:42:57Z

#OperationSaveStorageSpace

This issue describes the minimum viable pruning needed for full and bridge nodes, which are milestones 1 and 2 in #2033.

First, let us define a storage window as a number of blocks, such that blocks older than the window aren't kept. For example, a storage window of 172,800 blocks covers 30 days' worth of blocks, assuming a 15-second block time. Defining it in days would be better but may not be trivial.
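As a rough sketch of the arithmetic above, the window can be derived from a retention period and an assumed constant block time. The names `StorageWindowBlocks` and `InWindow` are hypothetical and not part of any existing API; real block times vary, which is part of why a days-based definition is not trivial.

```go
package main

import (
	"fmt"
	"time"
)

// StorageWindowBlocks converts a retention period into a number of blocks,
// assuming a constant block time (an assumption; real block times drift).
func StorageWindowBlocks(retention, blockTime time.Duration) uint64 {
	if retention <= 0 || blockTime <= 0 {
		return 0
	}
	return uint64(retention / blockTime)
}

// InWindow reports whether height is one of the last `window` blocks,
// counting back from head (head itself included).
func InWindow(height, head, window uint64) bool {
	if height > head {
		return false
	}
	return head-height < window
}

func main() {
	window := StorageWindowBlocks(30*24*time.Hour, 15*time.Second)
	fmt.Println(window)                                         // 172800
	fmt.Println(InWindow(1_000_000-172_799, 1_000_000, window)) // true
	fmt.Println(InWindow(1_000_000-172_800, 1_000_000, window)) // false
}
```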

Light nodes should only sample blocks within the storage window.

For full and bridge nodes, we should add a flag --pruned [true/false] that persists a config setting, which is true by default on init.

When this flag is true, only blocks within the storage window are sampled, and the CAR files of blocks outside of the storage window are automatically deleted from the store. Headers can still be kept and synced.
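A minimal sketch of that behaviour, using in-memory maps as a stand-in for CAR files on disk. The `Node` type and its fields are hypothetical and only illustrate the rule "delete block data outside the window, keep headers".

```go
package main

import "fmt"

// Node is a toy model of a full or bridge node. Headers are always kept;
// block data (standing in for CAR files on disk) can be pruned.
type Node struct {
	Pruned  bool              // mirrors the proposed --pruned setting
	Window  uint64            // storage window, in blocks
	Headers map[uint64]string // height -> header (never pruned)
	Blocks  map[uint64][]byte // height -> block data (prunable)
}

// Prune deletes block data at or below head-Window and returns how many
// entries were removed. It is a no-op for non-pruned nodes.
func (n *Node) Prune(head uint64) int {
	if !n.Pruned || head < n.Window {
		return 0
	}
	cutoff := head - n.Window // heights <= cutoff are outside the window
	removed := 0
	for h := range n.Blocks {
		if h <= cutoff {
			delete(n.Blocks, h)
			removed++
		}
	}
	return removed
}

func main() {
	n := &Node{Pruned: true, Window: 3, Headers: map[uint64]string{}, Blocks: map[uint64][]byte{}}
	for h := uint64(1); h <= 10; h++ {
		n.Headers[h] = fmt.Sprintf("header-%d", h)
		n.Blocks[h] = []byte("data")
	}
	fmt.Println(n.Prune(10), len(n.Blocks), len(n.Headers)) // 7 3 10
}
```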

The most complicated part will be pruning the badger inverted index. To do so, we can split badger into multiple databases, one for each rolling storage window (e.g., a database for the last 100k blocks, another database for the 100k blocks before that). This would mean that getting an entry from the inverted index would require up to two reads, but that is okay as the reads can be parallelized, and we plan to remove the inverted index anyway. This also means that it might not be possible to switch a non-pruned node to a pruned node, because the inverted indexes are stored differently, unless we add logic for this, but it's not necessary to support this for minimum viable pruning.
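To make the rolling-database idea concrete, here is a sketch in which plain maps stand in for the per-window badger databases. It deliberately avoids the real badger API; `RollingIndex`, its methods, and the key/value layout are all hypothetical. The two lookups are done sequentially for simplicity, although they could run in parallel.

```go
package main

import "fmt"

// bucket stands in for one per-window database (a map instead of badger).
// Keys and values are illustrative: key -> block height.
type bucket map[string]uint64

// RollingIndex splits the index into fixed-size height ranges so that a
// whole range can be dropped at once.
type RollingIndex struct {
	span    uint64            // heights per bucket, e.g. 100_000
	buckets map[uint64]bucket // bucket id (height / span) -> entries
}

func NewRollingIndex(span uint64) *RollingIndex {
	return &RollingIndex{span: span, buckets: map[uint64]bucket{}}
}

// Put stores the entry in the bucket that covers its height.
func (r *RollingIndex) Put(key string, height uint64) {
	id := height / r.span
	if r.buckets[id] == nil {
		r.buckets[id] = bucket{}
	}
	r.buckets[id][key] = height
}

// Get checks the bucket containing head, then the one before it:
// at most two reads.
func (r *RollingIndex) Get(key string, head uint64) (uint64, bool) {
	cur := head / r.span
	if h, ok := r.buckets[cur][key]; ok {
		return h, true
	}
	if cur == 0 {
		return 0, false
	}
	h, ok := r.buckets[cur-1][key]
	return h, ok
}

// Prune drops every bucket older than the previous one and returns the
// number of buckets dropped.
func (r *RollingIndex) Prune(head uint64) int {
	cur := head / r.span
	dropped := 0
	for id := range r.buckets {
		if id+1 < cur {
			delete(r.buckets, id)
			dropped++
		}
	}
	return dropped
}

func main() {
	idx := NewRollingIndex(100_000)
	idx.Put("a", 50_000)  // bucket 0
	idx.Put("b", 150_000) // bucket 1
	idx.Put("c", 250_000) // bucket 2
	fmt.Println(idx.Prune(250_000)) // 1
	fmt.Println(idx.Get("b", 250_000)) // 150000 true
	fmt.Println(idx.Get("a", 250_000)) // 0 false
}
```

Keeping the current and the previous bucket means that, if the bucket span equals the storage window, at least one full window of entries is always retained.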

Pruned nodes should advertise themselves on a new discovery topic for pruned full nodes. Then, shrex-EDS/ND should have logic to discover non-pruned nodes in addition to pruned nodes when they try to access data from blocks older than the storage window.
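One literal reading of that rule, as a sketch: a request for a recent block only needs the pruned topic, while a request for an older block also looks up non-pruned peers. The topic names and the `DiscoveryTopics` helper are hypothetical placeholders, not existing identifiers.

```go
package main

import "fmt"

const (
	topicPruned   = "pruned-full" // hypothetical topic name for pruned full nodes
	topicArchival = "full"        // hypothetical topic name for non-pruned nodes
)

// DiscoveryTopics returns the topics a requester would query for a block
// at the given height. Blocks older than the storage window additionally
// require non-pruned peers.
func DiscoveryTopics(height, head, window uint64) []string {
	topics := []string{topicPruned}
	if head >= window && height <= head-window {
		topics = append(topics, topicArchival)
	}
	return topics
}

func main() {
	fmt.Println(DiscoveryTopics(900_000, 1_000_000, 172_800)) // [pruned-full]
	fmt.Println(DiscoveryTopics(100, 1_000_000, 172_800))     // [pruned-full full]
}
```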

musalbas · 2023-08-26T09:11:02Z

Bridge nodes should probably sync only block headers from core nodes for blocks outside of the storage window.
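A sketch of what header-only syncing could look like for a pruned bridge node. The `Bridge` type and the `fetch` callback are hypothetical; the point is only that the header is always recorded while block data is fetched and stored solely for heights inside the window.

```go
package main

import "fmt"

// Bridge is a toy model of a pruned bridge node.
type Bridge struct {
	Window  uint64            // storage window, in blocks
	Headers map[uint64]string // height -> header (always kept)
	Blocks  map[uint64][]byte // height -> block data (window only)
}

// Sync always records the header, but calls fetch and stores block data
// only for heights inside the window. It reports whether data was stored.
func (b *Bridge) Sync(height, head uint64, header string, fetch func() []byte) bool {
	b.Headers[height] = header
	if head >= b.Window && height <= head-b.Window {
		return false // outside the window: header only, no block data stored
	}
	b.Blocks[height] = fetch()
	return true
}

func main() {
	b := &Bridge{Window: 3, Headers: map[uint64]string{}, Blocks: map[uint64][]byte{}}
	fetches := 0
	fetch := func() []byte { fetches++; return []byte("data") }
	for h := uint64(1); h <= 10; h++ {
		b.Sync(h, 10, fmt.Sprintf("header-%d", h), fetch)
	}
	fmt.Println(len(b.Headers), len(b.Blocks), fetches) // 10 3 3
}
```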

musalbas · 2023-08-28T20:32:22Z

> This also means that it might not be possible to switch a non-pruned node to a pruned node, because the inverted indexes are stored differently, unless we add logic for this, but it's not necessary to support this for minimum viable pruning.

Correction: non-pruned nodes could also prune these indexes, given that the sampling window is 30 days and the indexes should only be needed for sampling (assuming we remove or reconstruct the index for shrex-ND).

distractedm1nd · 2023-09-22T13:52:08Z

Bridge nodes should always sync from height 1 regardless of recency window, correct?

musalbas · 2023-09-22T13:57:56Z

No, pruned bridge nodes shouldn't Put() blocks outside of the recency window.

What do you mean by "sync" in this case though?

github-actions bot added the needs:triage and external (issues created by non node team members) labels Aug 26, 2023

musalbas mentioned this issue Aug 29, 2023

GetSharesByNamespace should get shares by reading them in the CAR file sequentially, rather than walking the IPLD #2618

Closed

distractedm1nd mentioned this issue Sep 22, 2023

[EPIC] Storage Pruning #2748

Open

ramin added kind:discussion and removed needs:triage labels Dec 18, 2023
