bug: RocksDB "busy" exceptions #3719

Closed
garyschulte opened this issue Apr 11, 2022 · 0 comments · Fixed by #3720

garyschulte commented Apr 11, 2022

Description

In a variety of circumstances Besu encounters a RocksDB "Busy" exception. This may be exacerbated by the use of OptimisticTransactionDB, which validates writes only at commit time. The "Busy" errors are especially pronounced on mainnet nodes during fast sync.
e.g.

{
   "timestamp":"2022-04-07T15:42:19,838",
   "level":"ERROR",
   "thread":"EthScheduler-Services-7 (batchPersistData)",
   "class":"FastWorldStateDownloadProcess",
   "message":"Pipeline failed",
   "throwable":"
        org.hyperledger.besu.plugin.services.exception.StorageException: org.rocksdb.RocksDBException: Busy
            at org.hyperledger.besu.plugin.services.storage.rocksdb.segmented.RocksDBColumnarKeyValueStorage$RocksDbTransaction.commit(RocksDBColumnarKeyValueStorage.java:287)
            at org.hyperledger.besu.services.kvstore.SegmentedKeyValueStorageTransactionTransitionValidatorDecorator.commit(SegmentedKeyValueStorageTransactionTransitionValidatorDecorator.java:49)
            at org.hyperledger.besu.services.kvstore.SegmentedKeyValueStorageAdapter$1.commit(SegmentedKeyValueStorageAdapter.java:90)
            at org.hyperledger.besu.ethereum.bonsai.BonsaiWorldStateKeyValueStorage$Updater.commit(BonsaiWorldStateKeyValueStorage.java:333)
            at org.hyperledger.besu.ethereum.eth.sync.fastsync.worldstate.PersistDataStep.persist(PersistDataStep.java:53)
            at org.hyperledger.besu.ethereum.eth.sync.fastsync.worldstate.FastWorldStateDownloadProcess$Builder.lambda$build$3(FastWorldStateDownloadProcess.java:202)
            at org.hyperledger.besu.services.pipeline.MapProcessor.processNextInput(MapProcessor.java:31)
            at org.hyperledger.besu.services.pipeline.ProcessingStage.run(ProcessingStage.java:38)
            at org.hyperledger.besu.services.pipeline.Pipeline.lambda$runWithErrorHandling$3(Pipeline.java:152)
            at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
            at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
            at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
            at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
            at java.base/java.lang.Thread.run(Thread.java:829)
        Caused by: org.rocksdb.RocksDBException: Busy
            at org.rocksdb.Transaction.commit(Native Method)
            at org.rocksdb.Transaction.commit(Transaction.java:206)
            at org.hyperledger.besu.plugin.services.storage.rocksdb.segmented.RocksDBColumnarKeyValueStorage$RocksDbTransaction.commit(RocksDBColumnarKeyValueStorage.java:281)
            ... 13 more"
}

We should either handle these exceptions and implement a retry strategy, or revert to the pessimistic TransactionDB.
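
For the retry option, a minimal sketch of what handling a "Busy" commit could look like. This is not Besu's actual code: the class name, retry bound, and single-key write are hypothetical, and it assumes the writes can safely be re-applied on a fresh transaction, since an optimistic transaction that fails validation cannot simply be re-committed.

```java
import org.rocksdb.OptimisticTransactionDB;
import org.rocksdb.RocksDBException;
import org.rocksdb.Status;
import org.rocksdb.Transaction;
import org.rocksdb.WriteOptions;

public final class BusyRetryExample {

  private static final int MAX_ATTEMPTS = 5; // hypothetical retry bound

  /** Re-applies the write on a fresh transaction whenever commit fails with Status.Code.Busy. */
  static void putWithRetry(final OptimisticTransactionDB db, final byte[] key, final byte[] value)
      throws RocksDBException {
    try (final WriteOptions writeOptions = new WriteOptions()) {
      for (int attempt = 1; ; attempt++) {
        try (final Transaction tx = db.beginTransaction(writeOptions)) {
          tx.put(key, value);
          try {
            tx.commit(); // optimistic validation happens here and may report Busy
            return;
          } catch (final RocksDBException e) {
            final Status status = e.getStatus();
            final boolean busy = status != null && status.getCode() == Status.Code.Busy;
            if (!busy || attempt >= MAX_ATTEMPTS) {
              throw e; // not a transient conflict, or retries exhausted
            }
            // otherwise loop around and redo the write on a brand-new transaction
          }
        }
      }
    }
  }
}
```

A real fix would need to re-apply the whole batch produced by the world state persist step (batchPersistData in the stack trace above), not a single put.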

Acceptance Criteria

  • Besu should handle storage exceptions gracefully and retry where possible
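
If retries alone cannot make the handling graceful in every case, the fallback mentioned in the description is reverting to the pessimistic TransactionDB. A rough sketch of opening and using one, with a placeholder path and default options (not Besu's storage factory code); pessimistic transactions take key locks when writing, so conflicting writers block or time out instead of failing commit validation with "Busy":

```java
import java.nio.charset.StandardCharsets;

import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;
import org.rocksdb.Transaction;
import org.rocksdb.TransactionDB;
import org.rocksdb.TransactionDBOptions;
import org.rocksdb.WriteOptions;

public final class PessimisticTxExample {

  public static void main(final String[] args) throws RocksDBException {
    RocksDB.loadLibrary();

    // "/tmp/pessimistic-example" is a placeholder path for illustration only.
    try (final Options options = new Options().setCreateIfMissing(true);
        final TransactionDBOptions txDbOptions = new TransactionDBOptions();
        final TransactionDB db =
            TransactionDB.open(options, txDbOptions, "/tmp/pessimistic-example");
        final WriteOptions writeOptions = new WriteOptions();
        final Transaction tx = db.beginTransaction(writeOptions)) {
      // Pessimistic transactions lock the key on put(), so a conflicting writer
      // waits (or times out) here instead of failing validation at commit time.
      tx.put("key".getBytes(StandardCharsets.UTF_8), "value".getBytes(StandardCharsets.UTF_8));
      tx.commit();
    }
  }
}
```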

Steps to Reproduce (Bug)

  1. The easiest way to reproduce is to fast-sync mainnet with the Bonsai storage format

Expected behavior:
Fast sync completes.

Actual behavior:
Unhandled "Busy" exceptions cause the world state downloader to abort, while the Besu process stays up and continues downloading blocks. When sync completes, the world state is incomplete.

Frequency:
~100% on EC2 instances with 6000 or fewer IOPS.

Versions (Add all that apply)

  • Software version: 22.x