kvdb-rocksdb with no overlay cache #310
I have asked myself the same thing but never did anything about it due to the lack of hard data on the performance impact. I'm not convinced that "substrate isn't using it" is enough.
I can do some benchmarks on block import impact for parity-ethereum.
@dvdplm Yes, of course, I am not convinced either.
I'll look at what is required for my branch to work with … But we can also have two versions, even if …
I've run a quick-and-dirty import bench: importing 1k recent blocks (at ~9230k height) on master vs the ao-no-overlay branch. The thing that worries me is the semantic change. In parity-ethereum we use `write_buffered`, and if we just replace …
We do not provide any guarantees to ensure data consistency in the case of a crash, afaik, not beyond what RocksDB already does with the WAL anyway. Here's what I like about the idea of getting rid of the overlay: …
The downside is, like you say, that the consuming code has to be very carefully audited.
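The semantic change under discussion can be sketched with a toy overlay wrapper (all names here are hypothetical, not the real kvdb API): with the overlay, a buffered write is readable before it is flushed; without the overlay, a read in that window would miss.

```rust
use std::collections::HashMap;

// Illustrative stand-in for the persistent backend (hypothetical, not the
// real kvdb-rocksdb types).
#[derive(Default)]
struct Backend {
    data: HashMap<Vec<u8>, Vec<u8>>,
}

#[derive(Default)]
struct OverlayDb {
    backend: Backend,
    overlay: HashMap<Vec<u8>, Vec<u8>>,
}

impl OverlayDb {
    // With the overlay, a buffered write becomes visible to readers
    // immediately, even before it is flushed to the backend.
    fn write_buffered(&mut self, key: &[u8], value: &[u8]) {
        self.overlay.insert(key.to_vec(), value.to_vec());
    }

    // Reads consult the overlay first, then the backend.
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.overlay
            .get(key)
            .or_else(|| self.backend.data.get(key))
            .cloned()
    }

    // Flush moves buffered writes into the backend. Without the overlay,
    // a `get` between `write_buffered` and `flush` would return `None`:
    // that is the semantic change callers would have to be audited for.
    fn flush(&mut self) {
        for (k, v) in self.overlay.drain() {
            self.backend.data.insert(k, v);
        }
    }
}

fn main() {
    let mut db = OverlayDb::default();
    db.write_buffered(b"key", b"value");
    // Visible before flush only because of the overlay.
    assert_eq!(db.get(b"key"), Some(b"value".to_vec()));
    db.flush();
    assert_eq!(db.get(b"key"), Some(b"value".to_vec()));
}
```

Removing the overlay means any code that read its own buffered writes back before calling flush would silently change behaviour, which is why the audit matters.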
Actually, I have an optimised version of the no-overlay variant; want to check that one too, in the … branch. It squashes all key-values into one long … That will allow avoiding allocations when you, for example, do `transaction.write(&h256, &h256)`, which is widely used, afair. The overlay cache prevented this optimisation before.
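The single-buffer layout described above can be sketched roughly as follows (a hypothetical `Transaction`, assuming the elided type is a contiguous byte buffer; this is not the actual kvdb-rocksdb code):

```rust
// Instead of one heap allocation per key and per value, all bytes go into a
// single growable buffer and each op only records offsets into it.
struct Transaction {
    buf: Vec<u8>,                    // all keys and values, concatenated
    ops: Vec<(usize, usize, usize)>, // (key_start, key_end, value_end)
}

impl Transaction {
    fn new() -> Self {
        Transaction { buf: Vec::new(), ops: Vec::new() }
    }

    // `put` copies the bytes into the shared buffer; for a fixed-size key
    // like an H256 this avoids a separate per-entry `Vec` allocation.
    fn put(&mut self, key: &[u8], value: &[u8]) {
        let key_start = self.buf.len();
        self.buf.extend_from_slice(key);
        let key_end = self.buf.len();
        self.buf.extend_from_slice(value);
        let value_end = self.buf.len();
        self.ops.push((key_start, key_end, value_end));
    }

    // Iterate the recorded (key, value) slices, e.g. to build a write batch.
    fn iter(&self) -> impl Iterator<Item = (&[u8], &[u8])> + '_ {
        self.ops
            .iter()
            .map(move |&(ks, ke, ve)| (&self.buf[ks..ke], &self.buf[ke..ve]))
    }
}

fn main() {
    let mut tx = Transaction::new();
    tx.put(&[0u8; 32], &[1u8; 32]); // e.g. transaction.write(&h256, &h256)
    tx.put(b"other", b"value");
    // Two ops, one shared allocation growing amortised.
    assert_eq!(tx.iter().count(), 2);
}
```

An overlay cache that stores entries in a per-key map would force each key and value back into its own allocation, which is why the overlay blocked this optimisation.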
@NikVolf I don't expect it to make a difference, but will try tomorrow.
But …
@NikVolf I've tried the ao-lil-copy branch, as your branch didn't compile, and it didn't make a difference for the import bench. My concern about the semantics still holds.
@ordian Anyone who relies on these semantics shouldn't have done it in the first place; it is not written down anywhere (is it?)
Atomicity is, of course, on …
Here is my audit of … I think we should proceed with removing the overlay regardless and fix the usage in parity-ethereum later.
This overlay was originally introduced to optimize the block import pipeline, not for caching. The point is that we can start importing block N+1 while N is still being submitted (flushed) to RocksDB. Copying data in memory is still much faster than creating and writing a RocksDB batch. In the early days, with lighter blocks, that resulted in about 5-10% faster full sync speed. Although this is not currently used in substrate, we were going to use it eventually.
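The pipelining described above can be sketched with a channel and a background flusher thread (an illustrative model, not parity's actual import code; `run_pipeline` and its batch shape are made up for the sketch):

```rust
use std::sync::mpsc;
use std::thread;

// The importer hands each finished write batch to a background flusher, so
// importing block N+1 overlaps the (slow) flush of block N's batch. The
// overlay's job in the real system was to keep block N's data readable
// during that in-flight window.
fn run_pipeline(blocks: u64) -> usize {
    let (tx, rx) = mpsc::channel::<Vec<(Vec<u8>, Vec<u8>)>>();

    // Background flusher: stands in for submitting a batch to RocksDB.
    let flusher = thread::spawn(move || {
        let mut flushed = 0usize;
        for batch in rx {
            // In the real system this would be something like `db.write(batch)`.
            flushed += batch.len();
        }
        flushed
    });

    // Importer: builds the next block's batch while the previous one is
    // still being flushed.
    for block in 0..blocks {
        let batch = vec![(block.to_le_bytes().to_vec(), vec![0u8; 32])];
        tx.send(batch).expect("flusher alive");
    }
    drop(tx); // close the channel so the flusher loop ends

    flusher.join().expect("flusher panicked")
}

fn main() {
    // Three imported blocks yield three flushed entries.
    assert_eq!(run_pipeline(3), 3);
}
```

Without the overlay, the same overlap is still possible, but reads issued while a batch is in flight go straight to RocksDB and won't see the pending writes.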
@arkpar thanks for the input. I've tested with much heavier blocks than in the early days indeed, so I don't know how it will impact substrate import time (if it is to be used), but even then 5% is not worth complexity and limitations like this. We're essentially trading off how often data is flushed against latency, and that is something that can be tuned at the RocksDB settings level, I guess?
Currently, substrate is not using it, and probably for a good reason, since caching something that RocksDB caches (or is able to) on its own is generally not a good idea.
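For reference, the flush-frequency vs latency trade-off mentioned above is indeed tunable inside RocksDB itself, mainly through memtable settings. A sketch using the rust-rocksdb crate (a config fragment only; method availability varies by crate version, so treat it as illustrative rather than compile-checked):

```rust
use rocksdb::{Options, WriteOptions, DB};

fn open_tuned(path: &str) -> Result<DB, rocksdb::Error> {
    let mut opts = Options::default();
    opts.create_if_missing(true);
    // Larger memtables mean fewer, bigger flushes -- roughly the buffering
    // role the overlay played, but handled inside RocksDB itself.
    opts.set_write_buffer_size(128 * 1024 * 1024);
    opts.set_max_write_buffer_number(4);
    DB::open(&opts, path)
}

fn async_write_opts() -> WriteOptions {
    let mut wo = WriteOptions::default();
    // `sync = false` trades durability-on-crash for lower write latency;
    // the WAL still gives the crash consistency discussed above.
    wo.set_sync(false);
    wo
}
```

This keeps the caching and batching concerns inside RocksDB, which is the point of the comment above.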