-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Parallellize snapshot creation #1799
Conversation
db.entries::<T>(None, IterDirection::Forward) | ||
.chunks(group_size) | ||
.into_iter() | ||
.take_while(|_| !cancel.is_cancelled()) | ||
.try_for_each(|chunk| writer.write(chunk.try_collect()?))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be interesting to see if you can parameterize the DB query to work with a quantity and/or primary index, so that you can also use Rayon and the task manager for the write_contract_snapshot
function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought about it, but the contract_snapshot
is there only to support writing e2e tests.
As I understood it you'd deploy yourself a contract, interact however you wish with it. When satisfied freeze that moment in time for e2e tests with the contract snapshot.
As such didn't feel a need to optimize for performance.
I'll give it a try anyway because the code might be more readable if I succeed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turned out nicely IMO. Have a look at lemme know your thoughts :)
Edit:
Also found a (probably) significant performance issue. Previously I had understood iter_all(prefix)
to mean "iterate all entries whose key has this prefix.
But what it actually means is "start iterating thought the db at the first match of this prefix (if any). Then continue until the end of the db. That caused the contract snapshot logic to load all state after the match and then separate out what belongs to the contract in question. Fixed that, so noice :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think introducing the KeyFilter
trait is a nice abstraction, as it allows us to build where
-esque clauses for queries at the call site level. I'm surprised that this change to the DB interface was relatively low impact on the rest of the codebase, which is also nice.
As a side note, my understanding of iter_all(prefix)
is that your initial understanding is actually correct. It is defined as :
/// Returns an iterator over the all entries in the table with the specified prefix.
fn iter_all_by_prefix<M, P>(
&self,
prefix: Option<P>,
) -> BoxedIter<super::Result<(M::OwnedKey, M::OwnedValue)>>
where
M: Mappable,
P: AsRef<[u8]>,
Self: IterableTable<M>,
{
self.iter_all_filtered::<M, P>(prefix, None, None)
}
indicating that it should be filtering by prefix. If that's not the case, there might be something wrong at the implementation level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised that this change to the DB interface was relatively low impact
It's because even though the method sounds general, it was created only to support generating snapshots. Besides that nobody uses entries
.
If that's not the case, there might be something wrong at the implementation level.
Interestingly enough that is not the case for rocksdb. If you look underneath you'll see the following (match on (prefix, start_key)):
(Some(prefix), None) => {
if direction == IterDirection::Reverse {
self.reverse_prefix_iter(prefix, column).into_boxed()
} else {
// start iterating in a certain direction within the keyspace
let iter_mode = IteratorMode::From(
prefix,
convert_to_rocksdb_direction(direction),
);
let mut opts = ReadOptions::default();
opts.set_prefix_same_as_start(true);
self._iter_all(column, opts, iter_mode).into_boxed()
}
}
It looks like that the prefix is used just to start the iteration from where it matches, but it is not going to stop it. A simple test shows this happening:
#[test]
fn recreate_me() {
let test = |mut db: Database<OnChain>| {
let contract_id_1 = ContractId::from_str(
"5962be5ebddc516cb4ed7d7e76365f59e0d231ac25b53f262119edf76564aab4",
)
.unwrap();
let mut insert_empty_code = |id| {
StorageMutate::<ContractsRawCode>::insert(&mut db, &id, &[]).unwrap()
};
insert_empty_code(contract_id_1);
let contract_id_2 = ContractId::from_str(
"5baf0dcae7c114f647f6e71f1723f59bcfc14ecb28071e74895d97b14873c5dc",
)
.unwrap();
insert_empty_code(contract_id_2);
use itertools::Itertools;
let matched_keys: Vec<_> = db
.iter_all_by_prefix::<ContractsRawCode, _>(Some(contract_id_1))
.map_ok(|(k, _)| k)
.try_collect()
.unwrap();
assert_eq!(matched_keys, vec![contract_id_1]);
};
let temp_dir = tempfile::tempdir().unwrap();
let db = Database::<OnChain>::in_memory();
// in memory passes
test(db);
let db = Database::<OnChain>::open_rocksdb(temp_dir.path(), 1024 * 1024 * 1024)
.unwrap();
// rocks db fails
test(db);
}
I'll raise this as a question internally and we can discuss there.
…1812) Follow-up PR to simplify the logic around parallel snapshot creation #1799. The change removes `Option<BlockHeight>`, `Option<DaBlockHeight>` and `Option<ChainConfig>` from snapshot fragments to not allow overriding of them on type system level. ### Before requesting review - [x] I have reviewed the code myself
We used `RocksDB` filtering by prefix during `iter_all_by_prefix`. But if the prefix is not set for the column, the `RocksDB` ignores it. We don't set the prefix for all tables because of performance purposes. But it means that iteration sometimes can be wrong. The change adds a `Rust` level filtering since not all tables really need a prefix. The issue found by @segfault-magnet in #1799 (comment) ### Before requesting review - [x] I have reviewed the code myself
## Version v0.25.0 ### Fixed - [#1821](#1821): Can handle missing tables in snapshot. - [#1814](#1814): Bugfix: the `iter_all_by_prefix` was not working for all tables. The change adds a `Rust` level filtering. ### Added - [#1831](#1831): Included the total gas and fee used by transaction into `TransactionStatus`. - [#1821](#1821): Propagate shutdown signal to (re)genesis. Also add progress bar for (re)genesis. - [#1813](#1813): Added back support for `/health` endpoint. - [#1799](#1799): Snapshot creation is now concurrent. - [#1811](#1811): Regenesis now preserves old blocks and transactions for GraphQL API. ### Changed - [#1833](#1833): Regenesis of `SpentMessages` and `ProcessedTransactions`. - [#1830](#1830): Use versioning enum for WASM executor input and output. - [#1816](#1816): Updated the upgradable executor to fetch the state transition bytecode from the database when the version doesn't match a native one. This change enables the WASM executor in the "production" build and requires a `wasm32-unknown-unknown` target. - [#1812](#1812): Follow-up PR to simplify the logic around parallel snapshot creation. - [#1809](#1809): Fetch `ConsensusParameters` from the database - [#1808](#1808): Fetch consensus parameters from the provider. #### Breaking - [#1826](#1826): The changes make the state transition bytecode part of the `ChainConfig`. It guarantees the state transition's availability for the network's first blocks. The change has many minor improvements in different areas related to the state transition bytecode: - The state transition bytecode lies in its own file(`state_transition_bytecode.wasm`) along with the chain config file. The `ChainConfig` loads it automatically when `ChainConfig::load` is called and pushes it back when `ChainConfig::write` is called. - The `fuel-core` release bundle also contains the `fuel-core-wasm-executor.wasm` file of the corresponding executor version. - The regenesis process now considers the last block produced by the previous network. When we create a (re)genesis block of a new network, it has the `height = last_block_of_old_netowkr + 1`. It continues the old network and doesn't overlap blocks(before, we had `old_block.height == new_genesis_block.hegiht`). - Along with the new block height, the regenesis process also increases the state transition bytecode and consensus parameters versions. It guarantees that a new network doesn't use values from the previous network and allows us not to migrate `StateTransitionBytecodeVersions` and `ConsensusParametersVersions` tables. - Added a new CLI argument, `native-executor-version,` that allows overriding of the default version of the native executor. It can be useful for side rollups that have their own history of executor upgrades. - Replaced: ```rust let file = std::fs::File::open(path)?; let mut snapshot: Self = serde_json::from_reader(&file)?; ``` with a: ```rust let mut json = String::new(); std::fs::File::open(&path) .with_context(|| format!("Could not open snapshot file: {path:?}"))? .read_to_string(&mut json)?; let mut snapshot: Self = serde_json::from_str(json.as_str())?; ``` because it is 100 times faster for big JSON files. - Updated all tests to use `Config::local_node_*` instead of working with the `SnapshotReader` directly. It is the preparation of the tests for the futures bumps of the `Executor::VERSION`. When we increase the version, all tests continue to use `GenesisBlock.state_transition_bytecode = 0` while the version is different, which forces the usage of the WASM executor, while for tests, we still prefer to test native execution. The `Config::local_node_*` handles it and forces the executor to use the native version. - Reworked the `build.rs` file of the upgradable executor. The script now caches WASM bytecode to avoid recompilation. Also, fixed the issue with outdated WASM bytecode. The script reacts on any modifications of the `fuel-core-wasm-executor` and forces recompilation (it is why we need the cache), so WASM bytecode always is actual now. - [#1822](#1822): Removed support of `Create` transaction from debugger since it doesn't have any script to execute. - [#1822](#1822): Use `fuel-vm 0.49.0` with new transactions types - `Upgrade` and `Upload`. Also added `max_bytecode_subsections` field to the `ConsensusParameters` to limit the number of bytecode subsections in the state transition bytecode. - [#1816](#1816): Updated the upgradable executor to fetch the state transition bytecode from the database when the version doesn't match a native one. This change enables the WASM executor in the "production" build and requires a `wasm32-unknown-unknown` target. ### Before requesting review - [x] I have reviewed the code myself ### After merging, notify other teams - [x] [Rust SDK](https://github.com/FuelLabs/fuels-rs/) - [x] [Sway compiler](https://github.com/FuelLabs/sway/) - [x] DevOps ## What's Changed * Add PR template by @Dentosal in #1806 * feat: Parallellize snapshot creation by @segfault-magnet in #1799 * Follow-up PR to simplify the logic around parallel snapshot creation by @xgreenx in #1812 * Bugfix: the `iter_all_by_prefix` was not working for all tables by @xgreenx in #1814 * Added back support for `/health` endpoint by @xgreenx in #1813 * Fetch consensus parameters from the provider by @xgreenx in #1808 * Fetch `ConsensusParameters` from the database by @xgreenx in #1809 * Handle FTI messages in executor by @Voxelot in #1787 * Use state transition bytecode from the database by @xgreenx in #1816 * feat: (re)genesis graceful shutdown by @segfault-magnet in #1821 * Use `fuel-vm 0.49.0` with new transactions types by @xgreenx in #1822 * Included the total gas and fee into `TransactionStatus` by @xgreenx in #1831 * Use versioning enum for WASM executor input and output by @xgreenx in #1830 * Support upgradability of the consensus parameters and state transition bytecode in genesis by @xgreenx in #1826 * Store old blocks and txs after regenesis by @Dentosal in #1811 * Regenesis of `SpentMessages` and `ProcessedTransactions` by @xgreenx in #1833 **Full Changelog**: v0.24.2...v0.25.0
…(#1812) Follow-up PR to simplify the logic around parallel snapshot creation FuelLabs/fuel-core#1799. The change removes `Option<BlockHeight>`, `Option<DaBlockHeight>` and `Option<ChainConfig>` from snapshot fragments to not allow overriding of them on type system level. ### Before requesting review - [x] I have reviewed the code myself
We used `RocksDB` filtering by prefix during `iter_all_by_prefix`. But if the prefix is not set for the column, the `RocksDB` ignores it. We don't set the prefix for all tables because of performance purposes. But it means that iteration sometimes can be wrong. The change adds a `Rust` level filtering since not all tables really need a prefix. The issue found by @segfault-magnet in FuelLabs/fuel-core#1799 (comment) ### Before requesting review - [x] I have reviewed the code myself
## Version v0.25.0 ### Fixed - [#1821](FuelLabs/fuel-core#1821): Can handle missing tables in snapshot. - [#1814](FuelLabs/fuel-core#1814): Bugfix: the `iter_all_by_prefix` was not working for all tables. The change adds a `Rust` level filtering. ### Added - [#1831](FuelLabs/fuel-core#1831): Included the total gas and fee used by transaction into `TransactionStatus`. - [#1821](FuelLabs/fuel-core#1821): Propagate shutdown signal to (re)genesis. Also add progress bar for (re)genesis. - [#1813](FuelLabs/fuel-core#1813): Added back support for `/health` endpoint. - [#1799](FuelLabs/fuel-core#1799): Snapshot creation is now concurrent. - [#1811](FuelLabs/fuel-core#1811): Regenesis now preserves old blocks and transactions for GraphQL API. ### Changed - [#1833](FuelLabs/fuel-core#1833): Regenesis of `SpentMessages` and `ProcessedTransactions`. - [#1830](FuelLabs/fuel-core#1830): Use versioning enum for WASM executor input and output. - [#1816](FuelLabs/fuel-core#1816): Updated the upgradable executor to fetch the state transition bytecode from the database when the version doesn't match a native one. This change enables the WASM executor in the "production" build and requires a `wasm32-unknown-unknown` target. - [#1812](FuelLabs/fuel-core#1812): Follow-up PR to simplify the logic around parallel snapshot creation. - [#1809](FuelLabs/fuel-core#1809): Fetch `ConsensusParameters` from the database - [#1808](FuelLabs/fuel-core#1808): Fetch consensus parameters from the provider. #### Breaking - [#1826](FuelLabs/fuel-core#1826): The changes make the state transition bytecode part of the `ChainConfig`. It guarantees the state transition's availability for the network's first blocks. The change has many minor improvements in different areas related to the state transition bytecode: - The state transition bytecode lies in its own file(`state_transition_bytecode.wasm`) along with the chain config file. The `ChainConfig` loads it automatically when `ChainConfig::load` is called and pushes it back when `ChainConfig::write` is called. - The `fuel-core` release bundle also contains the `fuel-core-wasm-executor.wasm` file of the corresponding executor version. - The regenesis process now considers the last block produced by the previous network. When we create a (re)genesis block of a new network, it has the `height = last_block_of_old_netowkr + 1`. It continues the old network and doesn't overlap blocks(before, we had `old_block.height == new_genesis_block.hegiht`). - Along with the new block height, the regenesis process also increases the state transition bytecode and consensus parameters versions. It guarantees that a new network doesn't use values from the previous network and allows us not to migrate `StateTransitionBytecodeVersions` and `ConsensusParametersVersions` tables. - Added a new CLI argument, `native-executor-version,` that allows overriding of the default version of the native executor. It can be useful for side rollups that have their own history of executor upgrades. - Replaced: ```rust let file = std::fs::File::open(path)?; let mut snapshot: Self = serde_json::from_reader(&file)?; ``` with a: ```rust let mut json = String::new(); std::fs::File::open(&path) .with_context(|| format!("Could not open snapshot file: {path:?}"))? .read_to_string(&mut json)?; let mut snapshot: Self = serde_json::from_str(json.as_str())?; ``` because it is 100 times faster for big JSON files. - Updated all tests to use `Config::local_node_*` instead of working with the `SnapshotReader` directly. It is the preparation of the tests for the futures bumps of the `Executor::VERSION`. When we increase the version, all tests continue to use `GenesisBlock.state_transition_bytecode = 0` while the version is different, which forces the usage of the WASM executor, while for tests, we still prefer to test native execution. The `Config::local_node_*` handles it and forces the executor to use the native version. - Reworked the `build.rs` file of the upgradable executor. The script now caches WASM bytecode to avoid recompilation. Also, fixed the issue with outdated WASM bytecode. The script reacts on any modifications of the `fuel-core-wasm-executor` and forces recompilation (it is why we need the cache), so WASM bytecode always is actual now. - [#1822](FuelLabs/fuel-core#1822): Removed support of `Create` transaction from debugger since it doesn't have any script to execute. - [#1822](FuelLabs/fuel-core#1822): Use `fuel-vm 0.49.0` with new transactions types - `Upgrade` and `Upload`. Also added `max_bytecode_subsections` field to the `ConsensusParameters` to limit the number of bytecode subsections in the state transition bytecode. - [#1816](FuelLabs/fuel-core#1816): Updated the upgradable executor to fetch the state transition bytecode from the database when the version doesn't match a native one. This change enables the WASM executor in the "production" build and requires a `wasm32-unknown-unknown` target. ### Before requesting review - [x] I have reviewed the code myself ### After merging, notify other teams - [x] [Rust SDK](https://github.com/FuelLabs/fuels-rs/) - [x] [Sway compiler](https://github.com/FuelLabs/sway/) - [x] DevOps ## What's Changed * Add PR template by @Dentosal in FuelLabs/fuel-core#1806 * feat: Parallellize snapshot creation by @segfault-magnet in FuelLabs/fuel-core#1799 * Follow-up PR to simplify the logic around parallel snapshot creation by @xgreenx in FuelLabs/fuel-core#1812 * Bugfix: the `iter_all_by_prefix` was not working for all tables by @xgreenx in FuelLabs/fuel-core#1814 * Added back support for `/health` endpoint by @xgreenx in FuelLabs/fuel-core#1813 * Fetch consensus parameters from the provider by @xgreenx in FuelLabs/fuel-core#1808 * Fetch `ConsensusParameters` from the database by @xgreenx in FuelLabs/fuel-core#1809 * Handle FTI messages in executor by @Voxelot in FuelLabs/fuel-core#1787 * Use state transition bytecode from the database by @xgreenx in FuelLabs/fuel-core#1816 * feat: (re)genesis graceful shutdown by @segfault-magnet in FuelLabs/fuel-core#1821 * Use `fuel-vm 0.49.0` with new transactions types by @xgreenx in FuelLabs/fuel-core#1822 * Included the total gas and fee into `TransactionStatus` by @xgreenx in FuelLabs/fuel-core#1831 * Use versioning enum for WASM executor input and output by @xgreenx in FuelLabs/fuel-core#1830 * Support upgradability of the consensus parameters and state transition bytecode in genesis by @xgreenx in FuelLabs/fuel-core#1826 * Store old blocks and txs after regenesis by @Dentosal in FuelLabs/fuel-core#1811 * Regenesis of `SpentMessages` and `ProcessedTransactions` by @xgreenx in FuelLabs/fuel-core#1833 **Full Changelog**: FuelLabs/fuel-core@v0.24.2...v0.25.0
closes: #1789
Feature:
A
SnapshotWriter
can now bepartial_close
ed which results in aSnapshotFragment
. Those can then bemerge
ed together andfinalize
d into aSnapshotMetadata
(and a written snapshot at that).The export process now spawns rayon tasks for every table. At the end of the task it will
partial_close
its writer and return theSnapshotFragment
.When all tasks are finished the fragments are merged and the snapshot is done.
A
TaskManager
was written to deduplicate rayon task fail-fast and cancel logic.Refactor:
Snapshot generation and import was moved together into
fuel-core
. Mainly because theTaskManager
needed a place to live accesible to both. But also because if you're adding tables to the snapshot you'd also be wanting to read them so having both close together seems sensible.