
DB corrupted after graceful shutdown #626

Closed
raphjaph opened this issue Jun 28, 2023 · 10 comments · Fixed by #627

Comments

@raphjaph

raphjaph commented Jun 28, 2023

I'm upgrading redb from 0.13.0 to 1.0.1 because we wanted to pull in this fix. If I now run this as a service on a Debian machine until around block height 777000 and do `systemctl stop ord` followed by `systemctl start ord`, I get the following:

Stopping Ord server...
ord.service: Succeeded.
Stopped Ord server.
ord.service: Consumed 1h 9min 23.138s CPU time.
Started Ord server.
[2023-06-28T13:02:32Z INFO  ord::options] Connecting to Bitcoin Core at 127.0.0.1:8332/wallet/ord
[2023-06-28T13:02:32Z INFO  ord::options] Using credentials from cookie file at `/var/lib/bitcoind/.cookie`
error: DB corrupted: Failed to repair database. All roots are corrupted
   0: ord::index::Index::open
   1: ord::subcommand::Subcommand::run
   2: ord::main
   3: std::sys_common::backtrace::__rust_begin_short_backtrace
   4: std::rt::lang_start::{{closure}}
   5: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/ops/function.rs:287:13
      std::panicking::try::do_call
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:485:40
      std::panicking::try
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:449:19
      std::panic::catch_unwind
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panic.rs:140:14
      std::rt::lang_start_internal::{{closure}}
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/rt.rs:148:48
      std::panicking::try::do_call
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:485:40
      std::panicking::try
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:449:19
      std::panic::catch_unwind
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panic.rs:140:14
      std::rt::lang_start_internal
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/rt.rs:148:20
   6: main
   7: __libc_start_main
   8: _start
ord.service: Main process exited, code=exited, status=1/FAILURE
ord.service: Failed with result 'exit-code'.

I believe I'm shutting it down gracefully; I also tested this on macOS, where the same thing happens. We're also using multimap tables now, in case that helps with debugging. Let me know how I can provide more information.

@raphjaph
Author

raphjaph commented Jun 28, 2023

If you want to reproduce this, I recommend building from this branch and running with `--db-cache-size 2147483648` (2 GiB) and without `--index-sats`. It should take about half an hour to get to height 777000.

@cberner
Owner

cberner commented Jun 28, 2023

Uh oh :/ I'll take a look

@cberner
Owner

cberner commented Jun 28, 2023

@raphjaph I think I'm not doing it right. Here are the steps I followed:

  1. launch bitcoind
  2. run `cargo run --release -- --data-dir=./junk --db-cache-size 2147483648 --height-limit=777000 index run`
  3. press ctrl-c after it reaches block ~765k
  4. repeat step (2)

Indexing seems to continue just fine from there. Did I miss a step?

@raphjaph
Author

raphjaph commented Jun 28, 2023

@cberner maybe run it without `--height-limit`, use `server` instead of `index run`, and let it go above 777000. Then ctrl-c once and repeat step 2.

This is what I did:
`./target/release/ord --index update-redb.redb --db-cache-size 2147483648 server`

@cberner
Owner

cberner commented Jun 28, 2023

I was able to reproduce it using `timeout 2000 cargo run --release -- --data-dir=./junk --db-cache-size 2147483648 index run`. Digging into what's wrong now.
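For anyone else reproducing this: `timeout` kills its child with SIGTERM when the duration elapses, the same signal `systemctl stop` sends, so it exercises the graceful-shutdown path unattended instead of requiring a manual ctrl-c. A quick way to confirm the signal behavior (the `sleep` is just a placeholder process):

```shell
# GNU timeout exits with status 124 when it had to terminate the child
# itself, which tells you the SIGTERM path (not a normal exit) was taken.
timeout 1 sleep 10
echo "exit: $?"
# → exit: 124
```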

@cberner
Owner

cberner commented Jun 29, 2023

Ok, I think I found the issue. Can you try opening that database with master?

@veryordinally

@cberner This looks good! When do you plan to make a release? We'd like to ship a new ord version as quickly as possible and would prefer to base it on a redb release.

@cberner
Owner

cberner commented Jun 29, 2023

I'll make a new release today. Just letting the fuzzer run for a few hours first.

@raphjaph
Author

Awesome, thanks for fixing this so quickly!

@cberner
Owner

cberner commented Jun 29, 2023

For sure! Thanks for finding this!
