Pruned full/bridge nodes: minimum viable pruning #2615
Comments
Bridge nodes should probably sync only block headers from core nodes for blocks outside of the storage window.
Correction: non-pruned nodes could also prune these indexes, given that the sampling window is 30 days and the indexes should only be needed for sampling (assuming we remove or reconstruct the index for shrex-ND).
Bridge nodes should always sync from height 1 regardless of the recency window, correct?
No, pruned bridge nodes shouldn't Put() blocks outside of the recency window. What do you mean by "sync" in this case, though?
#OperationSaveStorageSpace
This issue describes the minimum viable pruning needed for full and bridge nodes, which are milestones 1 and 2 in #2033.
First, let us define a storage window in terms of a number of blocks, such that blocks older than the window aren't kept. For example, a storage window of 172,800 blocks corresponds to 30 days' worth of blocks, assuming a 15-second block time. Defining the window in days would be better but may not be trivial.
Light nodes should only sample blocks within the storage window.
For full and bridge nodes, we should add a flag --pruned [true/false] that persists a config setting, which is true by default on init.
When this flag is true, only blocks within the storage window are sampled, and the CAR files of blocks outside of the storage window are automatically deleted from the store. Headers can still be kept and synced.
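The deletion rule can be sketched as a single pruning pass. This is only an illustration under stated assumptions: `carStore` is a hypothetical in-memory stand-in for the on-disk CAR file store, keyed by block height, and headers are assumed to live in a separate store that the pass never touches.

```go
package main

// carStore is a hypothetical stand-in for the store of CAR files,
// keyed by block height. Headers are assumed to be stored elsewhere.
type carStore map[uint64][]byte

// pruneOutsideWindow deletes the block data of every height that falls
// outside the storage window ending at head, and returns how many entries
// were removed.
func pruneOutsideWindow(store carStore, head, windowBlocks uint64) int {
	removed := 0
	for h := range store {
		// A height is outside the window once it is at least
		// windowBlocks behind the head.
		if h <= head && head-h >= windowBlocks {
			delete(store, h) // deleting during range is safe in Go
			removed++
		}
	}
	return removed
}
```

A real implementation would presumably run this periodically as the head advances, rather than scanning every stored height each time.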
The most complicated part will be pruning the badger inverted index. To do so, we can split badger into multiple databases, one for each rolling storage window (e.g., a database for the last 100k blocks, and another database for the 100k blocks before that). This would mean that getting an entry from the inverted index would require up to two reads, but that is okay as the reads can be parallelized, and we plan to remove the inverted index anyway. This also means that it might not be possible to switch a non-pruned node to a pruned node, because the inverted indexes are stored differently, unless we add logic for this, but it's not necessary to support this for minimum viable pruning.
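The rolling-database idea can be modelled roughly as follows. This sketch does not use the badger API: each inner map stands in for one separate database, and `rollingIndex`, `put`, `get`, and `dropOld` are hypothetical names. The two reads are done sequentially here for simplicity, although they could run in parallel.

```go
package main

// rollingIndex is a hypothetical in-memory model of the split inverted index.
// Each bucket stands in for one database covering bucketSize heights.
type rollingIndex struct {
	bucketSize uint64
	buckets    map[uint64]map[string]uint64 // bucket number -> (key -> height)
}

func newRollingIndex(bucketSize uint64) *rollingIndex {
	return &rollingIndex{bucketSize: bucketSize, buckets: map[uint64]map[string]uint64{}}
}

// put records key in the bucket that covers height.
func (r *rollingIndex) put(key string, height uint64) {
	b := height / r.bucketSize
	if r.buckets[b] == nil {
		r.buckets[b] = map[string]uint64{}
	}
	r.buckets[b][key] = height
}

// get looks key up in the bucket covering head and then in the one before it,
// so a lookup costs at most two reads.
func (r *rollingIndex) get(key string, head uint64) (uint64, bool) {
	cur := head / r.bucketSize
	if h, ok := r.buckets[cur][key]; ok {
		return h, true
	}
	if cur > 0 {
		if h, ok := r.buckets[cur-1][key]; ok {
			return h, true
		}
	}
	return 0, false
}

// dropOld removes every bucket older than the previous one and returns the
// number of buckets dropped. Pruning never touches individual keys.
func (r *rollingIndex) dropOld(head uint64) int {
	cur := head / r.bucketSize
	dropped := 0
	for b := range r.buckets {
		if b+1 < cur {
			delete(r.buckets, b)
			dropped++
		}
	}
	return dropped
}
```

One consequence of this layout is that the retained index covers between one and two bucket sizes of blocks at any time, so the bucket size would need to be at least as large as the storage window.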
Pruned nodes should advertise themselves on a new discovery topic for pruned full nodes. Then, shrexeds/nd should have logic to discover non-pruned nodes in addition to pruned nodes if they try to access data from blocks older than the storage window.
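The discovery rule could look roughly like this: requests for recent blocks may go to peers found on either topic, while requests for blocks older than the storage window only make sense against non-pruned peers. The topic strings and the function name below are placeholders chosen for illustration, not names used by the project.

```go
package main

// Placeholder topic names; the real topic strings are not specified here.
const (
	topicPrunedFull = "pruned-full"
	topicFull       = "full"
)

// discoveryTopicsFor returns the discovery topics whose peers can serve the
// block at height, given the current head and the storage window in blocks.
func discoveryTopicsFor(height, head, windowBlocks uint64) []string {
	if height <= head && head-height < windowBlocks {
		// Inside the window: pruned and non-pruned peers both hold the data.
		return []string{topicPrunedFull, topicFull}
	}
	// Outside the window: only non-pruned peers are expected to have it.
	return []string{topicFull}
}
```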