diff --git a/EIPS/eip-2348.md b/EIPS/eip-2348.md index 9b1231ac50b88a..8837c763055dbb 100644 --- a/EIPS/eip-2348.md +++ b/EIPS/eip-2348.md @@ -60,51 +60,82 @@ all new and updated accounts will have the account version `1`. The validation p rules described in the Version Header, `BEGINDATA`, Invalid Opcode Validation, and Static Jump Validations sections. -These EIP sections only apply to contracts stored or in the process of being stored in in accounts -with version `1`. This EIP never applies to contracts stored or in the process of being stored in -accounts at version `0`. Future EIPs may increase the set of contract versions this EIP applies to. +These EIP sections apply to contracts stored or in the process of being stored in accounts with +version `1`. This EIP never applies to contracts stored or in the process of being stored in +accounts at version `0`. For initcode being executed for `CREATE` and `CREATE2` operations this +applies if the contract invoking the opcode is version `1`. If the calling contract was stored in an +account with version `0` this EIP does not apply. + +Future EIPs may increase the set of contract versions this EIP applies to. ### Version Header -For contracts with the first byte from `0x01` to `0xff`, or whose total length is less than 4 bytes, -the contract is treated exactly as through it had been deployed to an account with version `0`. For +For contracts whose first byte is not `0xef`, or whose total length is less than 4 bytes, the +contract is treated exactly as though it had been deployed to an account with version `0`. For these contracts none of the other subsections in this EIP apply. -When deploying a contract if a contract starts with `0x00` and has a length 4 or later the first +When deploying a contract, if the contract starts with `0xef` and has a length of 4 or more the first four bytes form a version header. 
If a version header is not recognized by the EVM the contract deployment transaction fails with out-of-gas. -For this EIP, only header version `1` (contracts starting with the byte stream `0x00` `0x00` `0x00` -`0x01`) is defined. Future EIPs may expand on the valid set of headers. This version indicates that -next three validations are applied to the content of the contract, keeping all other semantics of -the current "version 0" EVM contracts, including the same gas schedule. +When executing a contract with a header the execution should start at `PC=4`, corresponding to the +first byte of the contract that is not part of the header. + +EVM implementations could model this as a 4 byte no-op no-gas operation that can only occur at the +zeroth index of a contract. However, they would need to ensure that the byte `0xef` is +invalid if it occurs in the code segment at any location other than the zeroth byte. -**unresolved** - How do we deal with executing with the header? +For this EIP the header byte sequence [`0xef`, `0x65`, `0x76`, `0x6d`] (corresponding to the +ISO/IEC 8859 part 1 string `'ïevm'`) is defined. This version indicates that the next set of +validations is applied to the content of the contract, keeping all other semantics of the current +"version 0" EVM contracts, including the same gas schedule. -- Should contract execution start at index 4 as PC=0, - - This causes EXTCODECOPY indexes to not match up -- should contract execution start at index 4 as PC=4, - - This would require some possibly non-trivial EVM changes -- should the version header be a multi-byte instruction which is a no-op? Contract starts at 0 and - PC=0 - - This introduces a new opcode, may be the simplest. +Future EIPs may expand on the valid set of headers. No other header sequences are defined in this +EIP. 
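The header rules in the added Version Header text can be illustrated with a short Python sketch. It is not part of the EIP or of any client; the names `parse_header`, `initial_pc`, and `UnknownHeaderError` are hypothetical, and raising an exception merely stands in for "the deployment fails".

```python
from typing import Optional

# Header defined by this EIP: 0xef followed by the bytes of "evm".
EVM_HEADER = bytes([0xEF, 0x65, 0x76, 0x6D])
HEADER_LENGTH = 4


class UnknownHeaderError(Exception):
    """Hypothetical error: contract starts with 0xef but the header is not recognized."""


def parse_header(code: bytes) -> Optional[bytes]:
    """Return the version header, or None if the contract is treated as version 0."""
    # Contracts shorter than 4 bytes, or not starting with 0xef, have no header.
    if len(code) < HEADER_LENGTH or code[0] != 0xEF:
        return None
    header = code[:HEADER_LENGTH]
    # Only one header sequence is defined; anything else is rejected.
    if header != EVM_HEADER:
        raise UnknownHeaderError(header.hex())
    return header


def initial_pc(code: bytes) -> int:
    """Execution starts at PC=4 when a header is present, otherwise at PC=0."""
    return HEADER_LENGTH if parse_header(code) is not None else 0
```

A caller would run `parse_header` at deployment time and use `initial_pc` when execution begins, so the header bytes themselves are never interpreted as opcodes.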
-### `BEGINDATA` +### `BEGINDATA` operation As described in [EIP-2327] a new opcode `BEGINDATA` (`0xb6`) is added that indicates the remainder of the contract should not be considered executable code. +If the EVM attempts to execute the `BEGINDATA` operation it should be treated as attempting to +execute an invalid operation. Similarly, jumping into any location after the `BEGINDATA` operation is +an invalid operation, even if the byte jumped to corresponds to the `JUMPDEST` opcode. + +### Code Segment Size Limit + +With the introduction of the `BEGINDATA` opcode the contract can now be conceptually split into a +code segment and a data segment. The code segment corresponds to all the bytes prior to and +including the `BEGINDATA` opcode or the entire contract if no `BEGINDATA` opcode is present. All +other data after the code segment is referred to as the data segment. If there is no `BEGINDATA` +operation there are no bytes in the data segment. + +In [EIP 170](https://eips.ethereum.org/EIPS/eip-170) a contract code size limit was introduced. The +size of the code segment, including the header bytes and `BEGINDATA` operation (if present), must be +less than or equal to the chain's specified contract code size limit, which is currently 24KiB for +mainnet. + +For contract creation transactions and the return of `CREATE` and `CREATE2` operations this limit +is already enforced for the entire size of the contract, including code and data segments. For the +initialization code for a `CREATE` or `CREATE2` operation there is no specified limit, so the +code segment length will need to be enforced separately in those instances. The +combined code and data segment size for init code in `CREATE` and `CREATE2` operations is out of +scope for this EIP. 
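A rough Python sketch of the code/data split and the code segment size check follows. The function names are hypothetical. It assumes the scan skips `PUSHn` immediates so that a `0xb6` byte inside push data is not mistaken for `BEGINDATA`, and that the caller passes the index of the first byte after any header.

```python
from typing import Tuple

BEGINDATA = 0xB6
PUSH1, PUSH32 = 0x60, 0x7F
MAX_CODE_SIZE = 0x6000  # EIP-170 limit on mainnet: 24 KiB


def code_segment_length(code: bytes, start: int = 0) -> int:
    """Length of the code segment, counting header bytes and BEGINDATA if present.

    `start` is the index of the first byte after the header (4 with a header, else 0).
    """
    pc = start
    while pc < len(code):
        op = code[pc]
        if op == BEGINDATA:
            return pc + 1  # the code segment includes the BEGINDATA byte
        if PUSH1 <= op <= PUSH32:
            pc += op - PUSH1 + 1  # skip the push immediate bytes
        pc += 1
    return len(code)  # no BEGINDATA: the whole contract is code


def split_segments(code: bytes, start: int = 0) -> Tuple[bytes, bytes]:
    """Return (code_segment, data_segment)."""
    n = code_segment_length(code, start)
    return code[:n], code[n:]


def code_segment_within_limit(code: bytes, start: int = 0,
                              limit: int = MAX_CODE_SIZE) -> bool:
    """True if the code segment alone fits in the contract code size limit."""
    return code_segment_length(code, start) <= limit
```

For init code, where the total size is not limited, only `code_segment_within_limit` would apply; for deployed contracts the existing whole-contract limit already covers both segments.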
+ ### Invalid Opcode Validation All data between the Version Header and either the `BEGINDATA` marker or the end of the contract if `BEGINDATA` is not present must represent a valid EVM program at all points of the data. Invalid opcode validation consists of the following process: -- Iterate over the code bytes one by one. - - If the code byte is a multi-byte operation, skip the appropriate number of bytes. - - If the code byte is a valid opcode or designated invalid instruction (`0xfe`), continue. - - If the code byte is the `BEGINDATA` operation (`0xb6`) stop iterating. - - Otherwise, throw out-of-gas. +- Iterate over the code bytes starting after the header bytes one by one. + - If the code byte is a multi-byte operation, skip the appropriate number of bytes and continue. + - If the code byte is a valid opcode or the designated invalid instruction (`0xfe`), continue. + - If the code byte is the `BEGINDATA` operation (`0xb6`) stop iterating and consider the contract + valid. + - If more bytes than the contract code size limit would be validated the contract is invalid and + the operation fails. + - Otherwise, the contract is invalid and the operation fails. As of the Istanbul upgrade all of the multi-byte operations are the `PUSHn` series of operations from `0x60` to `0x7f`. Future upgrades may add more multi-byte operations. @@ -129,9 +160,17 @@ performed separately at contract deployment time. ## Rationale -The first major feature is the invalid opcode removal. In the case where a contract has an invalid -opcode that later becomes a multi-byte opcode followed by a `JUMPDEST` marker that contract would -become invalid after an upgrade because the destination marker would become part of the new +The choice for the first byte of the header as `0xef` was first recommended in +[issue 154](https://github.com/ethereum/EIPs/issues/154) of the EIP repository. 
It also maps to an +unused opcode in the version 0 spec and packs next to the `0xf0` series of call instructions, and +the `evm` part was to mirror what WASM has done. Choosing `0x00` as the first byte was rejected as it could be +confused with a nonsensical but correct contract that starts with `STOP` and whose next operation is +`PUSH6` if lowercase e was selected, or `STOP` `GASLIMIT` `JUMP` `` if capital letters +were used. A header that was always invalid in the prior EVM specs was seen as desirable. + +The first major validation is the invalid opcode removal. In the case where a contract has an +invalid opcode that later becomes a multi-byte opcode followed by a `JUMPDEST` marker that contract +would become invalid after an upgrade because the destination marker would become part of the new multi-byte instruction, as described in the [EIP-663 discussion]. If no invalid opcodes can be deployed then the possibility of the `JUMPDEST` being absorbed by new multi-byte instructions is eliminated. @@ -164,7 +203,7 @@ that is deployed with a version header. Because of the version header validation contracts can be deployed. Existing compilers (such as solidity) can provide support for headers by prepending their output -stream with `0x00`, `0x00`, `0x00`, `0x01` and appending `0xb6` prior to any non-code data added as +stream with `0xef`, `0x65`, `0x76`, `0x6d` and appending `0xb6` prior to any non-code data added as part of the contract. ## Forwards Compatibility @@ -198,6 +237,8 @@ again for `CREATE2`. - three byte program, starts with zero - four bytes program, header only - header and begin data only + - validated code in `CREATE` and `CREATE2` init code with proper code segment size and total size + greater than the code segment limit - Negative - contract with otherwise valid program that starts with zero, 5 bytes or more - contract with header and invalid opcodes @@ -209,7 +250,11 @@ again for `CREATE2`. 
- header, and contract code+header to large by less than 4 bytes - header, and contract code+header to large by more than 4 bytes - header, contract code, begin data, data, and the whole thing is too large - - One test for each invalid opcode: no header, with header, and with header and `BEGINDATA` + - one test for each invalid opcode: no header, with header, and with header and `BEGINDATA` + - code segment size violations + - In a contract creation transaction + - In `CREATE` and `CREATE2` init code + - In `CREATE` and `CREATE2` created contracts ## Implementation
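As an illustration only, the invalid opcode validation process from the specification could be sketched in Python as below. This is not a reference implementation. It assumes the header has already been recognized, it assumes a `PUSHn` truncated by the end of the contract is accepted, and the opcode ranges reflect my reading of the Istanbul instruction set rather than anything stated in this EIP.

```python
BEGINDATA = 0xB6
INVALID = 0xFE  # designated invalid instruction, explicitly allowed
HEADER_LENGTH = 4
MAX_CODE_SIZE = 0x6000  # EIP-170 limit on mainnet: 24 KiB

# Inclusive opcode ranges assumed to be defined as of Istanbul.
_RANGES = [
    (0x00, 0x0B), (0x10, 0x1D), (0x20, 0x20), (0x30, 0x3F), (0x40, 0x47),
    (0x50, 0x5B), (0x60, 0x7F), (0x80, 0x8F), (0x90, 0x9F), (0xA0, 0xA4),
    (0xF0, 0xF5), (0xFA, 0xFA), (0xFD, 0xFD), (0xFF, 0xFF),
]
VALID_OPCODES = frozenset(
    op for first, last in _RANGES for op in range(first, last + 1)
) | {INVALID}


def validate_opcodes(code: bytes, limit: int = MAX_CODE_SIZE) -> bool:
    """Validate the code segment of a contract that starts with a known header."""
    pc = HEADER_LENGTH  # iteration starts after the header bytes
    while pc < len(code):
        if pc >= limit:
            return False  # code segment would exceed the size limit
        op = code[pc]
        if op == BEGINDATA:
            return True  # stop iterating; the rest is data
        if op not in VALID_OPCODES:
            return False  # invalid opcode in the code segment
        if 0x60 <= op <= 0x7F:
            pc += op - 0x5F  # skip the PUSHn immediate bytes
        pc += 1
    # No BEGINDATA was found, so the whole contract is the code segment.
    return len(code) <= limit
```

Under these assumptions a header-only contract and a header followed only by `BEGINDATA` both validate, matching the positive test cases listed above, while a `0xef` byte anywhere in the code segment after the header is rejected.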