Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add EnumerableMap, refactor ERC721 #2160

Merged
merged 22 commits into from
Apr 2, 2020

Conversation

nventuro
Copy link
Contributor

Closes #2072.

This PR does a number of things (split across separate commits):

  • Refactor the internals of EnumerableSet, to make adding new sets easier
  • Add EnumerableMap, following the same approach for generic sets
  • Refactor ERC721 by using these internally

A couple things didn't go very well:

  • Due to issues with the type system, I had to drop EnumerableSet.enumerate. I plan on bringing this back once I get feedback from the Solidity team on how to best do it
  • Because of this, I also dropped ERC721._tokensOfOwner()

Finally, I added some small new tests to ERC721: during the migration I made a small mistake that was not caught by the suite, so I created a new test.

@nventuro nventuro requested a review from frangio March 30, 2020 17:38
@nventuro
Copy link
Contributor Author

Note also that this refactor could also include the approval-related mappings if we extended EnumerableMap/Set to generic structs as opposed to limiting it to 32-byte data chunks.

Copy link
Contributor

@frangio frangio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice!

contracts/utils/EnumerableSet.sol Outdated Show resolved Hide resolved
contracts/utils/EnumerableMap.sol Outdated Show resolved Hide resolved
contracts/utils/EnumerableMap.sol Outdated Show resolved Hide resolved
contracts/utils/EnumerableMap.sol Outdated Show resolved Hide resolved
contracts/utils/EnumerableMap.sol Outdated Show resolved Hide resolved
contracts/mocks/EnumerableMapMock.sol Outdated Show resolved Hide resolved
test/token/ERC721/ERC721.test.js Outdated Show resolved Hide resolved
contracts/token/ERC721/ERC721.sol Outdated Show resolved Hide resolved
contracts/token/ERC721/ERC721.sol Show resolved Hide resolved
contracts/token/ERC721/ERC721.sol Show resolved Hide resolved
@nventuro nventuro requested a review from frangio April 1, 2020 16:28
Copy link
Contributor

@frangio frangio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some thoughts on documentation. Should be applied to the other overloads.

contracts/utils/EnumerableSet.sol Outdated Show resolved Hide resolved
contracts/utils/EnumerableSet.sol Outdated Show resolved Hide resolved
@frangio
Copy link
Contributor

frangio commented Apr 1, 2020

I think this is good to go. The only thing pending is deciding whether we want UintToAddressMap or Uint256ToAddressMap.

@frangio
Copy link
Contributor

frangio commented Apr 2, 2020

Any news with respect to enumerate?

@nventuro
Copy link
Contributor Author

nventuro commented Apr 2, 2020

It can be done using inline assembly, but it triggers tons of shadowing warnings due to assembly keywords that clash with function names like add. I'll leave that out of scope for this rc, and see if we can get it back before the final release.

@nventuro nventuro merged commit bd07784 into OpenZeppelin:master Apr 2, 2020
@nventuro nventuro deleted the erc721-refactor branch April 2, 2020 18:43
@KaiRo-at
Copy link
Contributor

KaiRo-at commented Apr 8, 2020

Ugh, why are you merging everything into ERC721.sol? I have contracts that intentionally use different (e.g. lighter-gas-cost variants) of ERC721Enumerable and ERC721Metadata and with this refactoring I cannot base them on OpenZeppelin's ERC721 implementation any more but need to do a full copy of ERC721.sol with that changes only implementations specific to the enumerable and metadata inferfaces :(

@KaiRo-at
Copy link
Contributor

KaiRo-at commented Apr 8, 2020

FWIW, the previous variant of what I'm doing (and would like to do again with OpenZeppelin 3.0) is at https://etherscan.io/address/0x7e789E2dd1340971De0A9bca35b14AC0939Aa330#code in the ERC721EnumerableSimple contract (starting line 534) and the tokenURI() function in line 921 and following, for which I meanwhile created my own variant of the ERC721Metadata contract in newer (as yet unpublished) work. When you create a large number of NFTs (e.g. 150k as we did for Crypto stamp Edition 1), you want minting cost to be as small as possible...

@nventuro
Copy link
Contributor Author

nventuro commented Apr 9, 2020

That's very interesting @KaiRo-at, thanks a lot for your feedback!

We intend to support that use case for tokenURI() by default (#1745). The _mint optimization is interesting however, I had not heard of such an implementation before.

If I understand your code correctly, what you're basically doing is:

  • disallowing _burn
  • making all tokenIds sequential

These two together mean that there is no need for a tokenIndex -> tokenId mapping, since the ids and indexes match. You do still need the id -> owner mapping (ownerOf()) and to track the total number of tokens (totalSupply()), so it's two out of the four relevant storage writes that we'd saving (not counting tokensOfOwnerByIndex() related writes).

Now, there's multiple reasons why we decided to merge all ERC721 flavors: to reduce library complexity, make the code easier to read (e.g. by using EnumerableMap), and simplify usage of the library. After all, pretty much all ERC721 tokens include both the metadata and enumerable extensions. However, it does prevent these sort of alternative implementations, and we definitely would rather users didn't have to copy-paste parts of the code in order to achieve their goals.

I'm not sure what the best way to proceed regarding this is. On one hand, the requirements for this optimization are rather high (the ability to mint arbitrary tokenIds is often critical) and I wouldn't want to drop our usage of EnumerableMap, but on the other, failing to provide a way to reuse our code is a big issue.

@KaiRo-at
Copy link
Contributor

KaiRo-at commented Apr 9, 2020

Interesting to hear about #1745 - that would be great (esp. as the current solution with the _baseURI pattern results in rather ugly URLs and what OpenSea and us have done looks nicer in terms of URLs).

For the rest, you understood correctly, we save 2 stores on empty slots on minting as those are a series of collectibles that have sequential IDs anyhow (and cannot be burned by design, which becomes a requirement with this optimization). Even those 2 stores sum up quite a bit if you mint 150k NFTs up-front like we did for Crypto Stamp Edition 1 (and we will do a similar thing again).

That said, arbitrary token IDs of course need to be possible for other tokens, we have use cases ourselves where we need that (and have been using the default enumerable implementation).

In Solidity 0.5 days, I would have just gone and overridden the affected functions and be done (see the tokenURI() function I pointed to), but with the requirement of Solidity 0.6 to mark all to-be-overridden functions virtual, this kind of flexible pattern has been killed in many cases and I fear we will sadly have to resort to just a copy&modify pattern instead of having only our own changes isolated in our contracts that override/extend OpenZeppelin, which is not your fault at all, but Solidity's (and IMHO a mis-feature on their side).

Unfortunately, the merging of contracts for simplifying the OpenZeppelin library (as nice and clean it looks by itself) exacerbates that problem as it requires me to copy and replace the whole ERC-721 implementation instead of only parts of it. The only thing that could potentially save things somewhat is to mark almost everything virtual so people can override whatever they want, but of course that also makes it much easier for people to shoot themselves in the foot by forgetting to override something that is involved as well.

@frangio
Copy link
Contributor

frangio commented Apr 9, 2020

Note that we have indeed decided to mark all non-view functions virtual so it will be possible to override them. See the forum post.

I'm not sure if we should include burn by default though, given that in ERC20 the public burn function is a separate extension.

@KaiRo-at
Copy link
Contributor

KaiRo-at commented Apr 9, 2020

@frangio, it's enough to have _burn() internal virtual in there as ERC721.sol has now. That allows to use it where needed or even override it and put restrictions on it (I sometimes prefer to override it with a function that just reverts - or do the equivalent in a _beforeTokenTransfer() override - just to make it explicit that burning cannot happen). A problem is for example that tokenURI() is not overrideable as it is view and therefore not virtual, so I cannot implement a solution for #1745 in a contract that bases on ERC721.sol and will need to copy&modify even for a contract I have that needs everything else of that contract unchanged. :(

On the gas-saving stuff, I see that with the EnumerableMap for _tokenOwners, that looks quite slick design-wise but the EnumerableMap implores a gas cost of 3 empty-slot SSTOREs on adding any item whereas an unburnable NFT with 0-based sequential tokenIDs only would need 1 there. I completely see the appeal of the EnumerableMap but it even makes it hard to copy&modify the contract and remove the overhead there... Hmm, if a contract could override a using...for of its parent, I could use a stripped-down EnumerableMap that is basically just a mapping with additional functions... But that doesn't work, does it?

@nventuro
Copy link
Contributor Author

Sadly no, libraries don't work like that :/ That's an interesting idea though - I'd bring this up with the Solidity team and see what they think about it, they've been looking at their mechanisms for code reuse recently.

@KaiRo-at
Copy link
Contributor

So, I just ran our code through a test, and comparing numbers for a version with my ERC721EnumerableSimple ported to OpenZeppelin 3.0 beta.0 (before this change) and a version based on straight ERC721.sol from OpenZeppelin 3.0 RC.1, I see the latter requiring about 40k more gas per token, as I expected.

@KaiRo-at
Copy link
Contributor

And with just changing EnumerableMap, I can recover pretty much the whole regression, the code below is just for context on how I'm doing it:

--- openzeppelin-contracts/utils/EnumerableMap.sol      1985-10-26 09:15:00.000000000 +0100
+++ contracts/EnumerableMapSimple.sol   2020-04-14 18:07:43.454981171 +0200
@@ -1,6 +1,6 @@
 pragma solidity ^0.6.0;
 
-library EnumerableMap {
+library EnumerableMapSimple {
     // To implement this library for multiple types with as little code
     // repetition as possible, we write it in terms of a generic Map type with
     // bytes32 keys and values.
@@ -10,18 +10,9 @@
     // This means that we can only create new EnumerableMaps for types that fit
     // in bytes32.
 
-    struct MapEntry {
-        bytes32 _key;
-        bytes32 _value;
-    }
-
     struct Map {
         // Storage of map keys and values
-        MapEntry[] _entries;
-
-        // Position of the entry defined by a key in the `entries` array, plus 1
-        // because index 0 means a key is not in the map.
-        mapping (bytes32 => uint256) _indexes;
+        bytes32[] _entries;
     }
 
     /**
@@ -32,17 +23,14 @@
      * already present.
      */
     function _set(Map storage map, bytes32 key, bytes32 value) private returns (bool) {
-        // We read and store the key's index to prevent multiple reads from the same storage slot
-        uint256 keyIndex = map._indexes[key];
+        uint256 uintKey = uint256(key);
+        require(uintKey <= map._entries.length, "Cannot add entry that is not connected to existing IDs");
 
-        if (keyIndex == 0) { // Equivalent to !contains(map, key)
-            map._entries.push(MapEntry({ _key: key, _value: value }));
-            // The entry is stored at length-1, but we add 1 to all indexes
-            // and use 0 as a sentinel value
-            map._indexes[key] = map._entries.length;
+        if (uintKey == map._entries.length) { // add new entry
+            map._entries.push(value);
             return true;
         } else {
-            map._entries[keyIndex - 1]._value = value;
+            map._entries[uintKey] = value;
             return false;
         }
     }
@@ -52,45 +40,15 @@
      *
      * Returns true if the key was removed from the map, that is if it was present.
      */
-    function _remove(Map storage map, bytes32 key) private returns (bool) {
-        // We read and store the key's index to prevent multiple reads from the same storage slot
-        uint256 keyIndex = map._indexes[key];
-
-        if (keyIndex != 0) { // Equivalent to contains(map, key)
-            // To delete a key-value pair from the _entries array in O(1), we swap the entry to delete with the last one
-            // in the array, and then remove the last entry (sometimes called as 'swap and pop').
-            // This modifies the order of the array, as noted in {at}.
-
-            uint256 toDeleteIndex = keyIndex - 1;
-            uint256 lastIndex = map._entries.length - 1;
-
-            // When the entry to delete is the last one, the swap operation is unnecessary. However, since this occurs
-            // so rarely, we still do the swap anyway to avoid the gas cost of adding an 'if' statement.
-
-            MapEntry storage lastEntry = map._entries[lastIndex];
-
-            // Move the last entry to the index where the entry to delete is
-            map._entries[toDeleteIndex] = lastEntry;
-            // Update the index for the moved entry
-            map._indexes[lastEntry._key] = toDeleteIndex + 1; // All indexes are 1-based
-
-            // Delete the slot where the moved entry was stored
-            map._entries.pop();
-
-            // Delete the index for the deleted slot
-            delete map._indexes[key];
-
-            return true;
-        } else {
-            return false;
-        }
+    function _remove(Map storage /*map*/, bytes32 /*key*/) private pure returns (bool) {
+        revert("No removal supported");
     }
 
     /**
      * @dev Returns true if the key is in the map. O(1).
      */
     function _contains(Map storage map, bytes32 key) private view returns (bool) {
-        return map._indexes[key] != 0;
+        return uint256(key) < map._entries.length;
     }
 
     /**
@@ -112,9 +70,7 @@
     */
     function _at(Map storage map, uint256 index) private view returns (bytes32, bytes32) {
         require(map._entries.length > index, "EnumerableMap: index out of bounds");
-
-        MapEntry storage entry = map._entries[index];
-        return (entry._key, entry._value);
+        return (bytes32(index), map._entries[index]);
     }
 
     /**
@@ -132,9 +88,9 @@
      * @dev Same as {_get}, with a custom error message when `key` is not in the map.
      */
     function _get(Map storage map, bytes32 key, string memory errorMessage) private view returns (bytes32) {
-        uint256 keyIndex = map._indexes[key];
-        require(keyIndex != 0, errorMessage); // Equivalent to contains(map, key)
-        return map._entries[keyIndex - 1]._value; // All indexes are 1-based
+        uint256 uintKey = uint256(key);
+        require(map._entries.length > uintKey, errorMessage); // Equivalent to contains(map, key)
+        return map._entries[uintKey];
     }
 
     // UintToAddressMap
@@ -159,7 +115,7 @@
      *
      * Returns true if the key was removed from the map, that is if it was present.
      */
-    function remove(UintToAddressMap storage map, uint256 key) internal returns (bool) {
+    function remove(UintToAddressMap storage map, uint256 key) internal view returns (bool) {
         return _remove(map._inner, bytes32(key));
     }

@frangio
Copy link
Contributor

frangio commented Apr 14, 2020

@KaiRo-at Can you please create a new issue describing the problems in full? I'm starting to lose track of the context here.

@KaiRo-at
Copy link
Contributor

@KaiRo-at Can you please create a new issue describing the problems in full? I'm starting to lose track of the context here.

Yes, sorry, not the best style to start a discussion on a closed PR. I created #2187 for this issue, summarizing what this all is about (and putting that diff into another comment on there).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants