Improve online_delete configuration and DB tuning:
* Document delete_batch, back_off_milliseconds, age_threshold_seconds.
* Convert those time values to chrono types.
* Fix bug that ignored age_threshold_seconds.
* Add a "recovery buffer" to the config that gives the node a chance to
  recover before aborting online delete.
* Add begin/end log messages around the SQL queries.
* Add a new configuration section: [sqlite] to allow tuning the sqlite
  database operations. Ignored on full/large history servers.
* Update documentation of [node_db] and [sqlite] in the
  rippled-example.cfg file.
* Resolves XRPLF#3321
ximinez authored and manojsdoshi committed Jun 25, 2020
1 parent 3f480ec commit eba653c
Showing 21 changed files with 1,078 additions and 263 deletions.
181 changes: 155 additions & 26 deletions cfg/rippled-example.cfg
@@ -36,7 +36,7 @@
# For more information on where the rippled server instance searches for the
# file, visit:
#
# https://developers.ripple.com/commandline-usage.html#generic-options
# https://xrpl.org/commandline-usage.html#generic-options
#
# This file should be named rippled.cfg. This file is UTF-8 with DOS, UNIX,
# or Mac style end of lines. Blank lines and lines beginning with '#' are
@@ -869,18 +869,65 @@
#
# These keys are possible for any type of backend:
#
# earliest_seq The default is 32570 to match the XRP ledger
# network's earliest allowed sequence. Alternate
# networks may set this value. Minimum value of 1.
# If a [shard_db] section is defined, and this
# value is present in either [node_db] or [shard_db],
# it must be defined with the same value in both
# sections.
#
# online_delete Minimum value of 256. Enable automatic purging
# of older ledger information. Maintain at least this
# number of ledger records online. Must be greater
# than or equal to ledger_history.
#
# advisory_delete 0 for disabled, 1 for enabled. If set, then
# require administrative RPC call "can_delete"
# to enable online deletion of ledger records.
# These keys modify the behavior of online_delete, and thus are only
# relevant if online_delete is defined and non-zero:
#
# earliest_seq The default is 32570 to match the XRP ledger
# network's earliest allowed sequence. Alternate
# networks may set this value. Minimum value of 1.
# advisory_delete 0 for disabled, 1 for enabled. If set, the
# administrative RPC call "can_delete" is required
# to enable online deletion of ledger records.
# Online deletion does not run automatically if
# non-zero and the last deletion was on a ledger
# greater than the current "can_delete" setting.
# Default is 0.
#
# delete_batch When automatically purging, SQLite database
# records are deleted in batches. This value
# controls the maximum size of each batch. Larger
# batches keep the databases locked for more time,
# which may cause other functions to fall behind,
# and thus cause the node to lose sync.
# Default is 100.
#
# back_off_milliseconds
# Number of milliseconds to wait between
# online_delete batches to allow other functions
# to catch up.
# Default is 100.
#
# age_threshold_seconds
# The online delete process will only run if the
# latest validated ledger is younger than this
# number of seconds.
# Default is 60.
#
# recovery_wait_seconds
# The online delete process checks periodically
# that rippled is still in sync with the network,
# and that the validated ledger is less than
# 'age_threshold_seconds' old. By default, if it
# is not, the online delete process aborts and
# tries again later. If 'recovery_wait_seconds'
# is set and rippled is out of sync, but likely to
# recover quickly, then online delete will wait
# this number of seconds for rippled to get back
# into sync before it aborts.
# Set this value if the node is otherwise staying
# in sync, or recovering quickly, but the online
# delete process is unable to finish.
# Default is unset.
#
# Notes:
# The 'node_db' entry configures the primary, persistent storage.
@@ -892,6 +939,12 @@
# [import_db] Settings for performing a one-time import (optional)
# [database_path] Path to the book-keeping databases.
#
# The server creates and maintains 4 to 5 bookkeeping SQLite databases in
# the 'database_path' location. If you omit this configuration setting,
# the server creates a directory called "db" located in the same place as
# your rippled.cfg file.
# Partial pathnames are relative to the location of the rippled executable.
#
# [shard_db] Settings for the Shard Database (optional)
#
# Format (without spaces):
@@ -907,12 +960,84 @@
#
# max_size_gb Maximum disk space the database will utilize (in gigabytes)
#
# [sqlite] Tuning settings for the SQLite databases (optional)
#
# Format (without spaces):
# One or more lines of case-insensitive key / value pairs:
# <key> '=' <value>
# ...
#
# Example 1:
# safety_level=low
#
# Example 2:
# journal_mode=off
# synchronous=off
#
# WARNING: These settings can have significant effects on data integrity,
# particularly in systemic failure scenarios. It is strongly recommended
# that they be left at their defaults unless the server is having
# performance issues during normal operation or during automatic purging
# (online_delete) operations. A warning will be logged on startup if
# 'ledger_history' is configured to store more than 10,000,000 ledgers and
# any of these settings are less safe than the default. This is due to the
# inordinate amount of time and bandwidth it will take to safely rebuild a
# corrupted database of that size from other peers.
#
# Optional keys:
#
# There are 4 bookkeeping SQLite database that the server creates and
# maintains. If you omit this configuration setting, it will default to
# creating a directory called "db" located in the same place as your
# rippled.cfg file. Partial pathnames will be considered relative to
# the location of the rippled executable.
# safety_level Valid values: high, low
# The default is "high", which tunes the SQLite
# databases in the most reliable mode, and is
# equivalent to:
# journal_mode=wal
# synchronous=normal
# temp_store=file
# "low" is equivalent to:
# journal_mode=memory
# synchronous=off
# temp_store=memory
# These "low" settings trade speed and reduced I/O
# for a higher risk of data loss. See the
# individual settings below for more information.
# This setting may not be combined with any of the
# other tuning settings: "journal_mode",
# "synchronous", or "temp_store".
#
# journal_mode Valid values: delete, truncate, persist, memory, wal, off
# The default is "wal", which uses a write-ahead
# log to implement database transactions.
# Alternately, "memory" saves disk I/O, but if
# rippled crashes during a transaction, the
# database is likely to be corrupted.
# See https://www.sqlite.org/pragma.html#pragma_journal_mode
# for more details about the available options.
# This setting may not be combined with the
# "safety_level" setting.
#
# synchronous Valid values: off, normal, full, extra
# The default is "normal", which works well with
# the "wal" journal mode. Alternatively, "off"
# allows rippled to continue as soon as data is
# passed to the OS, which can significantly
# increase speed, but risks data corruption if
# the host computer crashes before writing that
# data to disk.
# See https://www.sqlite.org/pragma.html#pragma_synchronous
# for more details about the available options.
# This setting may not be combined with the
# "safety_level" setting.
#
# temp_store Valid values: default, file, memory
# The default is "file", which will use files
# for temporary database tables and indices.
# Alternatively, "memory" may save I/O, but
# rippled does not currently use many, if any,
# of these temporary objects.
# See https://www.sqlite.org/pragma.html#pragma_temp_store
# for more details about the available options.
# This setting may not be combined with the
# "safety_level" setting.
#
#
#
@@ -1212,24 +1337,27 @@ medium

# This is the primary persistent datastore for rippled. This includes transaction
# metadata, account states, and ledger headers. Helpful information can be
# found here: https://ripple.com/wiki/NodeBackEnd
# delete old ledgers while maintaining at least 2000. Do not require an
# external administrative command to initiate deletion.
# found at https://xrpl.org/capacity-planning.html#node-db-type
# type=NuDB is recommended for non-validators with fast SSDs. Validators or
# slow / spinning disks should use RocksDB. Caution: Spinning disks are
# not recommended. They do not perform well enough to consistently remain
# synced to the network.
# online_delete=512 is recommended to delete old ledgers while maintaining at
# least 512.
# advisory_delete=0 allows the online delete process to run automatically
# when the node has approximately two times the "online_delete" value of
# ledgers. No external administrative command is required to initiate
# deletion.
[node_db]
type=RocksDB
path=/var/lib/rippled/db/rocksdb
open_files=2000
filter_bits=12
cache_mb=256
file_size_mb=8
file_size_mult=2
online_delete=2000
type=NuDB
path=/var/lib/rippled/db/nudb
online_delete=512
advisory_delete=0

# This is the persistent datastore for shards. It is important for the health
# of the ripple network that rippled operators shard as much as practical.
# NuDB requires SSD storage. Helpful information can be found here
# https://ripple.com/build/history-sharding
# NuDB requires SSD storage. Helpful information can be found at
# https://xrpl.org/history-sharding.html
#[shard_db]
#path=/var/lib/rippled/db/shards/nudb
#max_size_gb=500
@@ -1248,7 +1376,8 @@ time.apple.com
time.nist.gov
pool.ntp.org

# To use the XRP test network (see https://ripple.com/build/xrp-test-net/),
# To use the XRP test network
# (see https://xrpl.org/connect-your-rippled-to-the-xrp-test-net.html),
# use the following [ips] section:
# [ips]
# r.altnet.rippletest.net 51235
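
The online_delete tuning keys documented in the [node_db] section above (delete_batch, back_off_milliseconds, age_threshold_seconds, recovery_wait_seconds) interact in a way that may be easier to follow as code. The following is a minimal sketch of that interaction using chrono types, not rippled's actual implementation. The names `OnlineDeleteSettings`, `Hooks`, `PurgeResult`, and `purgeInBatches` are hypothetical, a zero `recoveryWait` stands in for "unset", and the one-second polling interval during recovery is an assumption. The real process also judges whether the node is "likely to recover quickly", which this sketch does not model.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical settings struct. Field names mirror the config keys and the
// defaults are the ones documented in rippled-example.cfg.
struct OnlineDeleteSettings
{
    std::size_t deleteBatch = 100;          // delete_batch
    std::chrono::milliseconds backOff{100}; // back_off_milliseconds
    std::chrono::seconds ageThreshold{60};  // age_threshold_seconds
    std::chrono::seconds recoveryWait{0};   // recovery_wait_seconds (0 = unset)
};

// Hypothetical hooks so the loop can run without a real server or database.
struct Hooks
{
    std::function<std::chrono::seconds()> validatedLedgerAge;
    std::function<void(std::chrono::milliseconds)> sleepFor;
    std::function<void(std::size_t)> deleteRows;  // delete up to n rows
};

struct PurgeResult
{
    std::size_t deleted;
    bool completed;
};

// Delete totalRows rows in batches. Before each batch, require the latest
// validated ledger to be younger than ageThreshold. If it is not, poll for
// up to recoveryWait before aborting.
inline PurgeResult
purgeInBatches(
    std::size_t totalRows,
    OnlineDeleteSettings const& cfg,
    Hooks const& hooks)
{
    using namespace std::chrono_literals;
    assert(cfg.deleteBatch > 0);

    std::size_t deleted = 0;
    while (deleted < totalRows)
    {
        std::chrono::seconds waited{0};
        while (hooks.validatedLedgerAge() >= cfg.ageThreshold)
        {
            if (waited >= cfg.recoveryWait)
                return {deleted, false};  // abort; the caller retries later
            hooks.sleepFor(1s);  // assumed polling interval
            waited += 1s;
        }

        std::size_t const n = std::min(cfg.deleteBatch, totalRows - deleted);
        hooks.deleteRows(n);
        deleted += n;

        // Give other database users a chance to catch up between batches.
        if (deleted < totalRows)
            hooks.sleepFor(cfg.backOff);
    }
    return {deleted, true};
}
```

With the default settings, purging 250 rows in this sketch issues three batches (100, 100, 50) separated by two 100 ms pauses. A smaller deleteBatch or a larger backOff shortens each lock and lengthens the gaps between them, at the cost of a slower purge.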
4 changes: 2 additions & 2 deletions src/ripple/app/ledger/Ledger.cpp
@@ -228,14 +228,14 @@ Ledger::Ledger(
!txMap_->fetchRoot(SHAMapHash{info_.txHash}, nullptr))
{
loaded = false;
JLOG(j.warn()) << "Don't have TX root for ledger";
JLOG(j.warn()) << "Don't have transaction root for ledger " << info_.seq;
}

if (info_.accountHash.isNonZero() &&
!stateMap_->fetchRoot(SHAMapHash{info_.accountHash}, nullptr))
{
loaded = false;
JLOG(j.warn()) << "Don't have AS root for ledger";
JLOG(j.warn()) << "Don't have state data root for ledger " << info_.seq;
}

txMap_->setImmutable();
5 changes: 3 additions & 2 deletions src/ripple/app/main/Application.cpp
@@ -1019,7 +1019,7 @@ class ApplicationImp : public Application, public RootStoppable, public BasicApp

try
{
auto const setup = setup_DatabaseCon(*config_);
auto setup = setup_DatabaseCon(*config_, m_journal);

// transaction database
mTxnDB = std::make_unique<DatabaseCon>(
@@ -1069,6 +1069,7 @@ class ApplicationImp : public Application, public RootStoppable, public BasicApp
mLedgerDB->setupCheckpointing(m_jobQueue.get(), logs());

// wallet database
setup.useGlobalPragma = false;
mWalletDB = std::make_unique<DatabaseCon>(
setup,
WalletDBName,
@@ -1360,7 +1361,7 @@ class ApplicationImp : public Application, public RootStoppable, public BasicApp
JLOG(m_journal.fatal())
<< "Free SQLite space for transaction db is less than "
"512MB. To fix this, rippled must be executed with the "
"vacuum <sqlitetmpdir> parameter before restarting. "
"\"--vacuum\" parameter before restarting. "
"Note that this activity can take multiple days, "
"depending on database size.";
signalStop();
43 changes: 22 additions & 21 deletions src/ripple/app/main/DBInit.h
@@ -26,13 +26,23 @@ namespace ripple {

////////////////////////////////////////////////////////////////////////////////

// These pragmas are built at startup and applied to all database
// connections, unless otherwise noted.
inline constexpr char const* CommonDBPragmaJournal{"PRAGMA journal_mode=%s;"};
inline constexpr char const* CommonDBPragmaSync{"PRAGMA synchronous=%s;"};
inline constexpr char const* CommonDBPragmaTemp{"PRAGMA temp_store=%s;"};
// A warning will be logged if any lower-safety sqlite tuning settings
// are used and at least this much ledger history is configured. This
// includes full history nodes. This is because such a large amount of
// data will be more difficult to recover if a rare failure occurs,
// which are more likely with some of the other available tuning settings.
inline constexpr std::uint32_t SQLITE_TUNING_CUTOFF = 10'000'000;

// Ledger database holds ledgers and ledger confirmations
inline constexpr auto LgrDBName{"ledger.db"};

inline constexpr std::array<char const*, 3> LgrDBPragma{
{"PRAGMA synchronous=NORMAL;",
"PRAGMA journal_mode=WAL;",
"PRAGMA journal_size_limit=1582080;"}};
inline constexpr std::array<char const*, 1> LgrDBPragma{
{"PRAGMA journal_size_limit=1582080;"}};

inline constexpr std::array<char const*, 5> LgrDBInit{
{"BEGIN TRANSACTION;",
@@ -61,22 +71,13 @@ inline constexpr std::array<char const*, 5> LgrDBInit{
// Transaction database holds transactions and public keys
inline constexpr auto TxDBName{"transaction.db"};

inline constexpr
#if (ULONG_MAX > UINT_MAX) && !defined(NO_SQLITE_MMAP)
std::array<char const*, 6>
TxDBPragma
inline constexpr std::array TxDBPragma
{
{
#else
std::array<char const*, 5> TxDBPragma {{
#endif
"PRAGMA page_size=4096;", "PRAGMA synchronous=NORMAL;",
"PRAGMA journal_mode=WAL;", "PRAGMA journal_size_limit=1582080;",
"PRAGMA max_page_count=2147483646;",
"PRAGMA page_size=4096;", "PRAGMA journal_size_limit=1582080;",
"PRAGMA max_page_count=2147483646;",
#if (ULONG_MAX > UINT_MAX) && !defined(NO_SQLITE_MMAP)
"PRAGMA mmap_size=17179869184;"
"PRAGMA mmap_size=17179869184;"
#endif
}
};

inline constexpr std::array<char const*, 8> TxDBInit{
@@ -115,10 +116,8 @@ inline constexpr std::array<char const*, 8> TxDBInit{
// Temporary database used with an incomplete shard that is being acquired
inline constexpr auto AcquireShardDBName{"acquire.db"};

inline constexpr std::array<char const*, 3> AcquireShardDBPragma{
{"PRAGMA synchronous=NORMAL;",
"PRAGMA journal_mode=WAL;",
"PRAGMA journal_size_limit=1582080;"}};
inline constexpr std::array<char const*, 1> AcquireShardDBPragma{
{"PRAGMA journal_size_limit=1582080;"}};

inline constexpr std::array<char const*, 1> AcquireShardDBInit{
{"CREATE TABLE IF NOT EXISTS Shard ( \
@@ -130,6 +129,7 @@ inline constexpr std::array<char const*, 1> AcquireShardDBInit{
////////////////////////////////////////////////////////////////////////////////

// Pragma for Ledger and Transaction databases with complete shards
// These override the CommonDBPragma values defined above.
inline constexpr std::array<char const*, 2> CompleteShardDBPragma{
{"PRAGMA synchronous=OFF;", "PRAGMA journal_mode=OFF;"}};

@@ -172,6 +172,7 @@ inline constexpr std::array<char const*, 6> WalletDBInit{

static constexpr auto stateDBName{"state.db"};

// These override the CommonDBPragma values defined above.
static constexpr std::array<char const*, 2> DownloaderDBPragma{
{"PRAGMA synchronous=FULL;", "PRAGMA journal_mode=DELETE;"}};
