From 3a7c2f9cf71b74c9321da42ef7c6add6eb5011e9 Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Thu, 11 Apr 2024 20:31:21 +0200 Subject: [PATCH] Update EIP-4750: align with EOF Megaspec Merged by EIP-Bot. --- EIPS/eip-4750.md | 46 +++++++++++++++++++++++++--------------------- 1 file changed, 25 insertions(+), 21 deletions(-) diff --git a/EIPS/eip-4750.md b/EIPS/eip-4750.md index aee0f2f3ba9a6c..90f716078d0b08 100644 --- a/EIPS/eip-4750.md +++ b/EIPS/eip-4750.md @@ -33,15 +33,15 @@ The type section of EOF containers must adhere to following requirements: 1. The section is comprised of a list of metadata where the metadata index in the type section corresponds to a code section index. Therefore, the type section size MUST be `n * 4` bytes, where `n` is the number of code sections. 2. Each metadata item has 3 attributes: a uint8 `inputs`, a uint8 `outputs`, and a uint16 `max_stack_height`. *Note:* This implies that there is a limit of 255 stack for the input and in the output. This is further restricted to 127 stack items, because the upper bit of both the input and output bytes are reserved for future use. `max_stack_height` is further defined in [EIP-5450](./eip-5450.md). -3. The first code section MUST have 0 inputs and 0 outputs. +3. The 0th code section MUST have 0 inputs and 0 outputs. Refer to [EIP-3540](./eip-3540.md) to see the full structure of a well-formed EOF bytecode. ### New execution state in EVM -A return stack is introduced, separate from the operand stack. It is a stack of items representing execution state to return to after function execution is finished. Each item is comprised of: code section index, offset in the code section (PC value), calling function stack height. +A return stack is introduced, separate from the operand stack. It is a stack of items representing execution state to return to after function execution is finished. Each item is comprised of code section index and offset in the code section (PC value). -Note: Implementations are free to choose particular encoding for a stack item. In the specification below we assume that representation is three unsigned integers: `code_section_index`, `offset`, `stack_height`. +Note: Implementations are free to choose particular encoding for a stack item. In the specification below we assume that representation is two unsigned integers: `code_section_index`, `offset`. The return stack is limited to a maximum 1024 items. @@ -58,17 +58,17 @@ If the code is legacy bytecode, any of these instructions results in an *excepti First we define several helper values: -- `caller_stack_height = return_stack.top().stack_height` - stack height value saved in the top item of return stack -- `type[i].inputs = type_section_contents[i * 4]` - number of inputs of ith section -- `type[i].outputs = type_section_contents[i * 4 + 1]` - number of outputs of ith section +- `type[i].inputs = type_section_contents[i * 4]` - number of inputs of ith code section +- `type[i].outputs = type_section_contents[i * 4 + 1]` - number of outputs of ith code section +- `type[i].max_stack_height = type_section_contents[i * 4 + 2:i * 4 + 4]` - maximum operand stack height of ith code section If the code is valid EOF1, the following execution rules apply: #### `CALLF` 1. Has one immediate argument,`target_section_index`, encoded as a 16-bit unsigned big-endian value. -2. EOF validation guarantees that operand stack has at least `caller_stack_height + type[target_section_index].inputs` items. -3. If operand stack size exceeds `1024 - type[target_section_index].max_stack_height` (i.e. if the called function may exceed the global stack height limit), execution results in exceptional halt. This also guarantees that the stack height after the call is within the limits. +2. *Note:* EOF validation [EIP-5450](./eip-5450.md) guarantees that operand stack has enough items to use as callee's inputs. +3. If operand stack size exceeds `1024 - type[target_section_index].max_stack_height + type[target_section_index].inputs` (i.e. if the called function may exceed the global stack height limit), execution results in exceptional halt. This also guarantees that the stack height after the call is within the limits. 4. If return stack already has `1024` items, execution results in exceptional halt. 5. Charges 5 gas. 6. Pops nothing and pushes nothing to operand stack. @@ -76,23 +76,23 @@ If the code is valid EOF1, the following execution rules apply: ``` (code_section_index = current_section_index, - offset = PC_post_instruction, - stack_height = data_stack.height - types[code_section_index].inputs) + offset = PC_post_instruction) ``` - Under `PC_post_instruction` we mean the PC position after the entire immediate argument of `CALLF`. Operand stack height is saved as it was before function inputs were pushed. + Under `PC_post_instruction` we mean the PC position after the entire immediate argument of `CALLF`. - *Note:* Code validation rules of [EIP-5450](./eip-5450.md) guarantee there is always an instruction following `CALLF` (since terminating instruction or unconditional jump is required to be final one in the section), therefore `PC_post_instruction` always points to an instruction inside section bounds. + *Note:* EOF validation [EIP-5450](./eip-5450.md) guarantees there is always an instruction following `CALLF` (since terminating instruction or unconditional jump is required to be final one in the section), therefore `PC_post_instruction` always points to an instruction inside section bounds. 8. Sets `current_section_index` to `target_section_index` and `PC` to `0`, and execution continues in the called section. #### `RETF` 1. Does not have immediate arguments. -2. EOF validation guarantees that operand stack has exactly `caller_stack_height + type[current_section_index].outputs` items. +2. *Note:* EOF validation [EIP-5450](./eip-5450.md) guarantees that operand stack has exact number of items to use as outputs. 3. Charges 3 gas. 4. Pops nothing and pushes nothing to operand stack. 5. Pops an item from return stack and sets `current_section_index` and `PC` to values from this item. - 1. If return stack is empty after this, execution halts with success. + +*Note:* EOF validation requirement for 0th code section to be non-returning (non-returning sections introduced in a separate EIP) guarantees that return stack cannot be empty before `RETF`. ### Code Validation @@ -103,6 +103,7 @@ In addition to container format validation rules above, we extend code section v 3. `RJUMP`, `RJUMPI` and `RJUMPV` immediate argument value (jump destination relative offset) validation: 1. Code section is invalid in case offset points to a position outside of section bounds. 2. Code section is invalid in case offset points to one of two bytes directly following `CALLF` instruction. +5. No unreachable code sections are allowed, i.e. every code section can be reached from the 0th code section with a series of `CALLF` / `JUMPF` (`JUMPF` introduced in a separate EIP) instructions (0th code section is always reachable). ### Disallowed instructions @@ -116,18 +117,21 @@ Dynamic jump instructions `JUMP` (`0x56`) and `JUMPI` (`0x57`) are invalid and t ### Execution -1. Execution starts at the first byte of the first code section, and PC is set to 0. -2. Return stack is initialized to contain one item: `(code_section_index = 0, offset = 0, stack_height = 0)` -3. If any instruction access the operand stack item below `caller_stack_height`, execution results in exceptional halt. This rule replaces the old stack underflow check. -4. No change in stack overflow check: if any instruction causes the operand stack height to exceed `1024`, execution results in exceptional halt. +1. Execution starts at the first byte of the 0th code section, and PC is set to 0. +2. Return stack is initialized empty. +3. Stack underflow check is not performed anymore. *Note:* EOF validation [EIP-5450](./eip-5450.md) guarantees that it cannot happen at run-time. +3. Stack overflow check is not performed anymore, except during `CALLF` as specified above. ## Rationale -### `RETF` in the top frame ends execution vs exceptionally halts +### `RETF` in the top frame ends execution vs exceptionally halts vs is not allowed during validation + +Alternative logic for `RETF` in the top frame could be to allow it during code validation and make it either: -Alternative logic for executing `RETF` in the top frame could be to exceptionally halt execution, because there is arguably no caller for the starting function. This would mean that return stack is initialized as empty, and `RETF` exceptionally aborts when return stack is empty. +- end execution if the return stack is emptied by the `RETF` or +- exceptionally halt if the return stack is empty before the `RETF`. -We have decided in favor of always having at least one item in the return stack, because it allows to avoid having a special case for empty stack in the interpreter loop stack underflow check. We keep the stack underflow rule general by having `caller_stack_height = 0` in the top frame. +This has been superseded with the validation rule of top frame (0th code section) being non-returning (non-returning sections introduced in a separate EIP), because validating non-returning status of functions is valuable by itself for other reasons. Therefore all considerations of runtime behavior of `RETF` in the top frame were obsoleted. ### Code section limit and instruction size