-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/engine no std #595
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this is ready for review yet, but it's not a draft PR, so as I noticed it, I've taken a look.
I approve of the want to restore a no-std build for the engine, but I worry about a few of the details here. Happy to chat if you'd like :)
.build(); | ||
pub fn new(_options: InstrumenterOptions) -> Self { | ||
// TODO: limit size | ||
let cache = RefCell::new(HashMap::new()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like a definite reversion of behaviour. We want a safe and performant cache implementation here, for use in the node.
Can we instead just fall back to this basic version in the no-std case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better yet, what about creating our own ThreadsafeCache
type, and swap out the implementation for either wrapping Moka or a Mutex<HashMap<K, Arc>> depending on if we use alloc
or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mutex
is not part of the no_std
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me try conditinal compiling first.
|
||
use crate::types::*; | ||
use crate::wasm::{WasmMeteringConfig, WasmModule}; | ||
|
||
pub struct WasmInstrumenter { | ||
cache: Cache<(Hash, Hash), Arc<Vec<u8>>>, | ||
cache: RefCell<HashMap<(Hash, Hash), Arc<Vec<u8>>>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this will panic if multiple threads try to lock it at the same time - as can happen in the Node when we are accepting multiple mempool submissions at once.
Can we wrap it in a Mutex instead of a RefCell?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mutex is not part of the no_std
.
@@ -116,7 +116,8 @@ mod tests { | |||
// RFC 2056 | |||
fn assert_all() { | |||
assert_send::<ScryptoInterpreter<DefaultWasmEngine>>(); | |||
assert_sync::<ScryptoInterpreter<DefaultWasmEngine>>(); | |||
// TODO: make sure engine is multi-thread safe! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We create a new Radix engine for each transaction and single thread it - it's the ScryptoInterpreter caches we're sharing.
#FearlessConcurrency should mean this is fine? 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the runtime shares pointer to the cache, they may run into illegal state due to memory volatility. Like I said, we need dedicated story to make sure multi-thread work rather pretend to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean I have already verified this - in the very least:
- The moka cache claims to be thread safe
- The multiple copies of the engine share the WasmModule which is just (a reference to) data, and from which they instantiate separate / independent WasmInstances
Separately, the only other interaction between engine threads is in the database layer - where RocksDB is threadsafe. (and we ensure there are no commits during execution of other transactions)
Is there anything I've missed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mainly the wasmer
implementation which stores a pointer which could be volatible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But that's the WasmerInstance
which contains the pointer, not the WasmerModule
. The WasmerInstance
is not shared - it's created for each execution from the (shared) WasmerModule
template.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed - Wasmer is written in Rust, and they have good understanding of Send/Sync (It appears even Instance implements send!), so we can trust that if Module is Sync then it is Sync. Indeed - I believe it's essentially a data object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't know what Rust is doing with the data under that pointer though since Rust doesn't know that it's a mutable object potentially being written to across multiple threads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you're referring to the pointer that we create ourselves. This pointer isn't in the object which is being shared. This pointer is in the WasmerInstance
(not shared) not the WasmerModule
(shared).
This pointer in WasmerInstance
has no affect on whether WasmerModule
is Sync
.
On a related note, I'm quite surprised that we use runtime_ptr: Arc<Mutex<usize>>,
and use usize
, not a raw pointer type here, but anyway... I agree WasmerInstance possibly has an unsound impl of Sync
(without looking into it further)... But feeling slightly like a broken record, we're not sharing WasmerInstance
. And I've double-checked - the WasmerInstance
pointer does not point into WasmerModule
(which would be unsound for sure if it did). :)
84ef28b
to
16dce1f
Compare
d7b6cdc
to
92966d0
Compare
92966d0
to
1a7a7f9
Compare
It's now ready for another review. @dhedey |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks! 👍
Happy for you to merge as-is or to extract out a separate Cache type.
EDIT: Looks like you still need to fix some failing tests: https://github.com/radixdlt/radixdlt-scrypto/actions/runs/3515933512/jobs/5891823809
indexmap = { version = "1.8.1" } | ||
moka = { version = "0.9.4", features = ["sync"], default-features = false } | ||
lru = { version = "0.8.1", default-features = false } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we put lru behind optional
too, and only enable it in the alloc
case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, we don't gain much from this (tiny compile time of a small crate) by coupling lru
and alloc
. A slightly better approach is to make it a feature if we really care. But, will punt on this for now. :)
@@ -116,6 +116,8 @@ mod tests { | |||
// RFC 2056 | |||
fn assert_all() { | |||
assert_send::<ScryptoInterpreter<DefaultWasmEngine>>(); | |||
// TODO: make sure engine is indeed multi-thread safe! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you say the engine, do you mean the DefaultWasmEngine
? or the RadixEngine itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my previous comment. Do you have specific worries or things you want to check? Else I think this code comment should probably be removed.
More and more non-std incompatible changes have been introduced into the code base.
To prevent further diverge, this PR restores
no_std
build and adds it to CI.