Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Two NearMaps can write into the same storage independently #182

Closed
ilblackdragon opened this issue Jun 8, 2020 · 11 comments
Closed

Two NearMaps can write into the same storage independently #182

ilblackdragon opened this issue Jun 8, 2020 · 11 comments

Comments

@ilblackdragon
Copy link
Member

ilblackdragon commented Jun 8, 2020

Example:

#<span class="error">[near_bindgen_macro]</span>
#<span class="error">[derive(BorshDeserialize, BorshSerialize, Default)]</span>
struct Blah {
}

#<span class="error">[near_bindgen_macro]</span>
impl Blah {
fn test(&self) 

{ let mut x: NearMap<String, String> = NearMap::new(b"x".to_vec()); let mut y: NearMap<String, String> = NearMap::new(b"x".to_vec()); x.insert(&"x".to_string(), &"1".to_string()); y.insert(&"x".to_string(), &"2".to_string()); // comment this line to get different error y.get(&"x".to_string()); } 

}

#<span class="error">[test]</span>
fn test() {
let mut context = test_utils::get_context(vec![]);
testing_env!(context);
let b = Blah {};
b.test();
}

Observe that code fails with `HostError(GuestPanic

{ panic_msg: "Index out of bounds" }

)`.

Also if you comment y.insert(&"x".to_string(), &"2".to_string()); observe a different error:
` HostError(GuestPanic

{ panic_msg: "The collection is an inconsistent state. Did previous smart contract execution terminate unexpectedly?" }

)`

These errors are unexpected for developer and don't provide clear guidance how to fix this.

Even though it's user error, this types of usage can be hidden very deeply in the code and hard to track.

@MaksymZavershynskyi
Copy link
Contributor

MaksymZavershynskyi commented Jun 8, 2020

Update: As discussed with @SkidanovAlex below, it is easier to fix it rather than make API for complex.

We need to introduce an explicit boolean flag for the constructor argument, which specifies whether the user intends to alias an existing map. If they did not intend to alias the contract should panic.

Specifically, it will look like:

let a = NearMap::new("x", false);
// Lots of code.
let b = NearMap::new("x", true); <--- specifies the intention to alias and so it does not panic.

while

let a = NearMap::new("x", false);
// Lots of code.
let b = NearMap::new("x", false); <-- panics because the user did not intend to alias

This would require keeping track of all currently created collections in a static variable std::collections::Set. When collection is created its id is added to the set, when collection is dropped its id is removed.

@MaksymZavershynskyi
Copy link
Contributor

@evgenykuzyakov WDYT?

@SkidanovAlex
Copy link

@nearmax I think that complicates stuff for the user even further?
Is there no way to make the collections work in a way that aliased collections work?
In the most naive implementation (that doesn't store anything locally and always fetches data from the storage) it would most certainly work, so most likely the issues are due to caching some data in the persistent map?

@MaksymZavershynskyi
Copy link
Contributor

Is there no way to make the collections work in a way that aliased collections work?

We can make aliased collections work. The issue comes from the fact that Vector stores len directly in the body of the collection rather than in a separate key-value:

Therefore two vectors that alias each other might start having inconsistent knowledge about their length if one of them is mutated.

@nearmax I think that complicates stuff for the user even further?

I don't think the proposed solution is good either, it is just first that came to mind.

Overall, I think we should revisit this interface from the DevX perspective. Should we hide it from the users that these collections alias the kvdatabase or should we make it very explicit? #183 I am leaning towards making it very explicit, since we cannot fully reproduce interface fo std::collections::* anyway (because of the API that relies heavily on the references).

@lexfrl
Copy link
Contributor

lexfrl commented Jun 11, 2020

I'm for storage*. Even in Solidity (which is a DSL for smart contracts anyway, not a general-purpose language) there is a strict separation between storage and memory. It will greatly simplify things and make it more clear for users.

https://medium.com/cryptologic/memory-and-storage-in-solidity-4052c788ca86

@evgenykuzyakov
Copy link
Contributor

I'd vote for performance. This is super rare when you need to have 2 persistent collections in memory at the same time, so making it explicit through a different constructor is fine. E.g.
let b = NearMap::reuse("x");

@lexfrl
Copy link
Contributor

lexfrl commented Jun 15, 2020

Indeed we need some kind of a storage which supports references.

@ilblackdragon
Copy link
Member Author

@nearmax @evgenykuzyakov how does "reuse" would work if len is stored locally in the data struct?
You would need to store len in the independent storage to make "reuse" work, wouldn't you?
If that's true, then it will automatically reuse on subsequent instantiations and it's more about communicating to the user that it's the same "table".

@austinabell
Copy link
Contributor

Can someone give me the use case of why you would want to reuse the same structure twice? This just seems like it would cause data races and application-specific bugs to do this.

Although allowing multiple structures to reference the data may be slightly easier to have read and write access to underlying storage, it would just become a foot gun where the compiler can't check the logic is sound.

One other note is that if you allow this reusing by creating multiple data structures, it does not allow you to add a caching layer on these types without introducing breaking changes and rug pull this functionality later.

cc: @matklad as he also recently had the suggestion to move this collections module to be named storage.

I'm indifferent about the naming because with an updated API these collections can be close to drag and drop with the std::collections but I do see also how storage can make it more clear what the difference is.

@matklad
Copy link
Contributor

matklad commented May 12, 2021

Yeah, I am not sure if we need to support aliasing at all in the library. If the user wants aliasble thing, they can wrap it in Rc<RefCell> themselves.

@austinabell
Copy link
Contributor

Closing this, the base patterns to avoid are documented in https://docs.near.org/sdk/rust/contract-structure/collections#error-prone-patterns and there isn't anything here actionable as the tradeoffs to writing the length back to storage after each operation and encouraging this pattern isn't feasible

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants