kvdb-rocksdb with no overlay cache #310

Closed
NikVolf opened this issue Jan 6, 2020 · 13 comments · Fixed by #313

Comments

@NikVolf (Contributor) commented Jan 6, 2020

Currently, substrate is not using it, and probably for a good reason: caching something that RocksDB already caches (or is able to) on its own is generally not a good idea.

@dvdplm (Contributor) commented Jan 6, 2020

I have asked myself the same thing but never did anything about it due to the lack of hard data on the performance impact. I'm not convinced that "substrate isn't using it" is enough.

@ordian (Member) commented Jan 6, 2020

I can do some benchmarks on block import impact for parity-ethereum.

@NikVolf (Contributor, Author) commented Jan 6, 2020

@dvdplm Yes, of course, I am not convinced either.
We could also make substrate use it and see how it goes.

@NikVolf (Contributor, Author) commented Jan 6, 2020

I can do some benchmarks on block import impact for parity-ethereum.

I'll look into what is required for my branch to work with parity-ethereum.

But we could also keep two versions even if parity-ethereum strictly needs the overlay cache.

@ordian (Member) commented Jan 9, 2020

I've run a quick-and-dirty import bench (importing 1k recent blocks, at ~9230k height, on master vs the ao-no-overlay branch).
There seems to be a small (~4%) regression at most. Warp sync doesn't seem to regress substantially either.

The thing that worries me is the semantic change. In parity-ethereum we use write_buffered, and if we just replace write_buffered with write (which is what that branch does), we can commit some data in an intermediate state, and if the node crashes or exits it could cause some problems? But if that's not the case, I think we should proceed with no overlay and probably remove write_buffered and flush from the kvdb API altogether (another breaking change).
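
To make the semantic difference concrete, here is a minimal self-contained sketch (illustrative names only, not the real kvdb-rocksdb types): write_buffered parks the transaction in an in-memory overlay until flush, while write commits straight to the backing store, so intermediate state becomes durable immediately.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Illustrative stand-ins only; the real kvdb types and signatures differ.
type Key = Vec<u8>;
type Value = Vec<u8>;

#[derive(Default)]
struct Transaction {
    ops: Vec<(Key, Value)>,
}

#[derive(Default)]
struct Db {
    backing: Mutex<HashMap<Key, Value>>, // stand-in for the RocksDB-backed store
    overlay: Mutex<HashMap<Key, Value>>, // buffered, not-yet-committed writes
}

impl Db {
    // `write_buffered`: only the overlay is touched; nothing reaches the
    // backing store until `flush` is called.
    fn write_buffered(&self, tx: Transaction) {
        self.overlay.lock().unwrap().extend(tx.ops);
    }

    // `flush`: drain the overlay into the backing store in one go.
    fn flush(&self) {
        let mut overlay = self.overlay.lock().unwrap();
        self.backing.lock().unwrap().extend(overlay.drain());
    }

    // `write`: commit straight to the backing store. Replacing
    // `write_buffered` with this makes intermediate state durable at once.
    fn write(&self, tx: Transaction) {
        self.backing.lock().unwrap().extend(tx.ops);
    }
}

fn main() {
    let db = Db::default();
    let mut tx = Transaction::default();
    tx.ops.push((b"key".to_vec(), b"intermediate value".to_vec()));

    db.write_buffered(tx);
    // With the overlay, the backing store has not seen the write yet.
    assert!(db.backing.lock().unwrap().is_empty());

    db.flush();
    assert_eq!(db.backing.lock().unwrap().len(), 1);
}
```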

@dvdplm (Contributor) commented Jan 9, 2020

if the node crashes or exits it could cause some problems?

We do not provide any guarantees to ensure data consistency in the case of a crash afaik, not beyond what rocksdb already does with the WAL anyway.

Here's what I like about the idea of getting rid of the overlay:

  • less code is good
  • the overlay has no proven performance advantage
  • memory spent in the overlay can be given to rocksdb
  • the overlay makes it harder to reason about and test performance and bottlenecks

The downside is, like you say, that the consuming code has to be very carefully audited.

@NikVolf (Contributor, Author) commented Jan 9, 2020

@ordian

Actually, I have an optimised version of the no-overlay variant; do you also want to check the lil-copy branch?

It squashes all key-values into one long Vec inside the transaction.

That avoids allocations when you, for example, do transaction.write(&h256, &h256), which is widely used afair.

The overlay cache prevented this optimisation before.
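
Purely to illustrate the idea (my own sketch, not code from the lil-copy branch): all key and value bytes go into one contiguous buffer and each put only records offsets, so a pair of 32-byte key/value writes doesn't cost two heap allocations per call.

```rust
// Illustrative sketch of a "flat" transaction: one shared byte buffer plus
// offsets, instead of an owned Vec per key and per value. Columns, deletes,
// etc. are omitted for brevity.
#[derive(Default)]
struct FlatTransaction {
    data: Vec<u8>,                   // all key and value bytes, back to back
    ops: Vec<(usize, usize, usize)>, // (key_start, key_end, value_end) into `data`
}

impl FlatTransaction {
    fn put(&mut self, key: &[u8], value: &[u8]) {
        let key_start = self.data.len();
        self.data.extend_from_slice(key);
        let key_end = self.data.len();
        self.data.extend_from_slice(value);
        self.ops.push((key_start, key_end, self.data.len()));
    }

    // Iterate over the recorded (key, value) pairs without copying.
    fn iter<'a>(&'a self) -> impl Iterator<Item = (&'a [u8], &'a [u8])> + 'a {
        self.ops
            .iter()
            .map(move |&(ks, ke, ve)| (&self.data[ks..ke], &self.data[ke..ve]))
    }
}

fn main() {
    let mut tx = FlatTransaction::default();
    // e.g. the H256 -> H256 writes mentioned above: only the two Vecs inside
    // `FlatTransaction` ever grow; there is no per-put allocation.
    tx.put(&[0u8; 32], &[1u8; 32]);
    tx.put(&[2u8; 32], &[3u8; 32]);
    assert_eq!(tx.iter().count(), 2);
}
```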

@ordian (Member) commented Jan 9, 2020

@NikVolf I don't expect it to make a difference, but will try tomorrow.

We do not provide any guarantees to ensure data consistency in the case of a crash afaik, not beyond what rocksdb already does with the WAL anyway.

But write_buffered is guaranteed not to commit into rocksdb; if we change the semantics, we're now committing into the db what we previously didn't (intermediate state).

@ordian (Member) commented Jan 10, 2020

@NikVolf I've tried the ao-lil-copy branch, since your branch didn't compile, and it didn't make a difference for the import bench.

My concern about semantics still holds.

@NikVolf (Contributor, Author) commented Jan 11, 2020

@ordian Anyone who relies on these semantics shouldn't have done so in the first place; they are not documented anywhere (are they?)

write_buffered could have written the transaction immediately, for example if no cache is present, or whatever other logic might be behind it.

Atomicity is, of course, at the Transaction level, and this should be unambiguously stated if it is not already.
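
For clarity, here is a sketch of what "atomicity at the Transaction level" means for callers (a pseudo-API of my own, not the actual kvdb trait): everything that must land together goes into one transaction and is committed with a single write call.

```rust
use std::io;

// Pseudo-API for illustration; the real kvdb types carry columns and more.
struct Transaction {
    ops: Vec<(Vec<u8>, Option<Vec<u8>>)>, // Some(value) = put, None = delete
}

impl Transaction {
    fn new() -> Self {
        Transaction { ops: Vec::new() }
    }
    fn put(&mut self, key: &[u8], value: &[u8]) {
        self.ops.push((key.to_vec(), Some(value.to_vec())));
    }
    fn delete(&mut self, key: &[u8]) {
        self.ops.push((key.to_vec(), None));
    }
}

trait KeyValueDb {
    // The whole transaction is applied atomically or not at all; nothing is
    // guaranteed about state spanning *several* write calls.
    fn write(&self, tx: Transaction) -> io::Result<()>;
}

// All state changes for one block go into one transaction, so a crash can
// never leave half a block in the database.
fn import_block(db: &dyn KeyValueDb) -> io::Result<()> {
    let mut tx = Transaction::new();
    tx.put(b"block:123", b"header and body bytes");
    tx.put(b"state-root:123", b"new state root");
    tx.delete(b"pending:123");
    db.write(tx)
}
```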

@ordian (Member) commented Jan 13, 2020

Here is my audit of write_buffered usage in parity-ethereum: https://hackmd.io/IOQQynyJSoedXXq5SKqUVQ. I would appreciate some comments regarding the concerns about warp sync's usage of it.

I think we should proceed with removing the overlay regardless and fix the usage in parity-ethereum later.

@arkpar (Member) commented Jan 15, 2020

This overlay was originally introduced to optimize the block import pipeline, not for caching.

The point is that we can start importing block N+1 while block N is still being submitted (flushed) to RocksDB. Copying data in memory is still much faster than creating and writing a RocksDB batch. In the early days, with lighter blocks, that resulted in about 5-10% faster full sync.
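
As a rough illustration of that pipelining (my own sketch, not the actual import pipeline): the buffered write lets the import thread hand a finished batch to a background flusher and immediately move on to the next block, instead of waiting on the RocksDB commit.

```rust
use std::sync::mpsc;
use std::thread;

// Stand-in for a prepared write batch covering one imported block.
struct BlockBatch {
    number: u64,
}

fn main() {
    let (sender, receiver) = mpsc::channel::<BlockBatch>();

    // Background "flusher": the slow RocksDB commit happens off the import
    // thread, which is roughly the role the overlay played.
    let flusher = thread::spawn(move || {
        for batch in receiver {
            // In the real pipeline this would be the RocksDB batch write.
            println!("flushing block {}", batch.number);
        }
    });

    // Import thread: hand block N's batch over and immediately start
    // verifying/executing block N+1 instead of blocking on the disk.
    for number in 0..5 {
        let batch = BlockBatch { number }; // "import" block `number`
        sender.send(batch).expect("flusher is alive");
    }

    drop(sender); // close the channel so the flusher loop ends
    flusher.join().unwrap();
}
```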

Although this is not currently used in substrate, we were going to use it eventually.
However, if benchmarks show that the performance gains are now negligible, I'm fine with removing it.

@ordian (Member) commented Jan 15, 2020

@arkpar thanks for the input. I've tested with much heavier blocks than in the early days indeed, so I don't know how it will impact substrate import times (if it is to be used), but even then 5% is not worth the complexity and limitations like this.

We're essentially trading off how often data is flushed against latency, and that is something that can be tuned at the RocksDB settings level, I guess?
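
If I remember correctly, the knobs in question are RocksDB's memtable settings; with the rust-rocksdb crate, something along these lines shifts the buffering into RocksDB itself (method names are from my recollection of that crate and may differ between versions):

```rust
use rocksdb::{Options, DB};

fn main() -> Result<(), rocksdb::Error> {
    let mut opts = Options::default();
    opts.create_if_missing(true);

    // Larger/more memtables let RocksDB itself buffer more writes in memory
    // before flushing to disk, which is roughly what the overlay provided.
    opts.set_write_buffer_size(128 * 1024 * 1024); // 128 MiB per memtable
    opts.set_max_write_buffer_number(4);           // keep up to 4 memtables

    let _db = DB::open(&opts, "/tmp/kvdb-tuning-example")?;
    Ok(())
}
```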
