Skip to content

Commit

Permalink
Update EIP-4750: align with EOF Megaspec
Browse files Browse the repository at this point in the history
Merged by EIP-Bot.
  • Loading branch information
pdobacz authored Apr 11, 2024
1 parent 14d770b commit 3a7c2f9
Showing 1 changed file with 25 additions and 21 deletions.
46 changes: 25 additions & 21 deletions EIPS/eip-4750.md
Original file line number Diff line number Diff line change
Expand Up @@ -33,15 +33,15 @@ The type section of EOF containers must adhere to following requirements:

1. The section is comprised of a list of metadata where the metadata index in the type section corresponds to a code section index. Therefore, the type section size MUST be `n * 4` bytes, where `n` is the number of code sections.
2. Each metadata item has 3 attributes: a uint8 `inputs`, a uint8 `outputs`, and a uint16 `max_stack_height`. *Note:* This implies that there is a limit of 255 stack for the input and in the output. This is further restricted to 127 stack items, because the upper bit of both the input and output bytes are reserved for future use. `max_stack_height` is further defined in [EIP-5450](./eip-5450.md).
3. The first code section MUST have 0 inputs and 0 outputs.
3. The 0th code section MUST have 0 inputs and 0 outputs.

Refer to [EIP-3540](./eip-3540.md) to see the full structure of a well-formed EOF bytecode.

### New execution state in EVM

A return stack is introduced, separate from the operand stack. It is a stack of items representing execution state to return to after function execution is finished. Each item is comprised of: code section index, offset in the code section (PC value), calling function stack height.
A return stack is introduced, separate from the operand stack. It is a stack of items representing execution state to return to after function execution is finished. Each item is comprised of code section index and offset in the code section (PC value).

Note: Implementations are free to choose particular encoding for a stack item. In the specification below we assume that representation is three unsigned integers: `code_section_index`, `offset`, `stack_height`.
Note: Implementations are free to choose particular encoding for a stack item. In the specification below we assume that representation is two unsigned integers: `code_section_index`, `offset`.

The return stack is limited to a maximum 1024 items.

Expand All @@ -58,41 +58,41 @@ If the code is legacy bytecode, any of these instructions results in an *excepti

First we define several helper values:

- `caller_stack_height = return_stack.top().stack_height` - stack height value saved in the top item of return stack
- `type[i].inputs = type_section_contents[i * 4]` - number of inputs of ith section
- `type[i].outputs = type_section_contents[i * 4 + 1]` - number of outputs of ith section
- `type[i].inputs = type_section_contents[i * 4]` - number of inputs of ith code section
- `type[i].outputs = type_section_contents[i * 4 + 1]` - number of outputs of ith code section
- `type[i].max_stack_height = type_section_contents[i * 4 + 2:i * 4 + 4]` - maximum operand stack height of ith code section

If the code is valid EOF1, the following execution rules apply:

#### `CALLF`

1. Has one immediate argument,`target_section_index`, encoded as a 16-bit unsigned big-endian value.
2. EOF validation guarantees that operand stack has at least `caller_stack_height + type[target_section_index].inputs` items.
3. If operand stack size exceeds `1024 - type[target_section_index].max_stack_height` (i.e. if the called function may exceed the global stack height limit), execution results in exceptional halt. This also guarantees that the stack height after the call is within the limits.
2. *Note:* EOF validation [EIP-5450](./eip-5450.md) guarantees that operand stack has enough items to use as callee's inputs.
3. If operand stack size exceeds `1024 - type[target_section_index].max_stack_height + type[target_section_index].inputs` (i.e. if the called function may exceed the global stack height limit), execution results in exceptional halt. This also guarantees that the stack height after the call is within the limits.
4. If return stack already has `1024` items, execution results in exceptional halt.
5. Charges 5 gas.
6. Pops nothing and pushes nothing to operand stack.
7. Pushes to return stack an item:

```
(code_section_index = current_section_index,
offset = PC_post_instruction,
stack_height = data_stack.height - types[code_section_index].inputs)
offset = PC_post_instruction)
```

Under `PC_post_instruction` we mean the PC position after the entire immediate argument of `CALLF`. Operand stack height is saved as it was before function inputs were pushed.
Under `PC_post_instruction` we mean the PC position after the entire immediate argument of `CALLF`.

*Note:* Code validation rules of [EIP-5450](./eip-5450.md) guarantee there is always an instruction following `CALLF` (since terminating instruction or unconditional jump is required to be final one in the section), therefore `PC_post_instruction` always points to an instruction inside section bounds.
*Note:* EOF validation [EIP-5450](./eip-5450.md) guarantees there is always an instruction following `CALLF` (since terminating instruction or unconditional jump is required to be final one in the section), therefore `PC_post_instruction` always points to an instruction inside section bounds.
8. Sets `current_section_index` to `target_section_index` and `PC` to `0`, and execution continues in the called section.

#### `RETF`

1. Does not have immediate arguments.
2. EOF validation guarantees that operand stack has exactly `caller_stack_height + type[current_section_index].outputs` items.
2. *Note:* EOF validation [EIP-5450](./eip-5450.md) guarantees that operand stack has exact number of items to use as outputs.
3. Charges 3 gas.
4. Pops nothing and pushes nothing to operand stack.
5. Pops an item from return stack and sets `current_section_index` and `PC` to values from this item.
1. If return stack is empty after this, execution halts with success.

*Note:* EOF validation requirement for 0th code section to be non-returning (non-returning sections introduced in a separate EIP) guarantees that return stack cannot be empty before `RETF`.

### Code Validation

Expand All @@ -103,6 +103,7 @@ In addition to container format validation rules above, we extend code section v
3. `RJUMP`, `RJUMPI` and `RJUMPV` immediate argument value (jump destination relative offset) validation:
1. Code section is invalid in case offset points to a position outside of section bounds.
2. Code section is invalid in case offset points to one of two bytes directly following `CALLF` instruction.
5. No unreachable code sections are allowed, i.e. every code section can be reached from the 0th code section with a series of `CALLF` / `JUMPF` (`JUMPF` introduced in a separate EIP) instructions (0th code section is always reachable).

### Disallowed instructions

Expand All @@ -116,18 +117,21 @@ Dynamic jump instructions `JUMP` (`0x56`) and `JUMPI` (`0x57`) are invalid and t

### Execution

1. Execution starts at the first byte of the first code section, and PC is set to 0.
2. Return stack is initialized to contain one item: `(code_section_index = 0, offset = 0, stack_height = 0)`
3. If any instruction access the operand stack item below `caller_stack_height`, execution results in exceptional halt. This rule replaces the old stack underflow check.
4. No change in stack overflow check: if any instruction causes the operand stack height to exceed `1024`, execution results in exceptional halt.
1. Execution starts at the first byte of the 0th code section, and PC is set to 0.
2. Return stack is initialized empty.
3. Stack underflow check is not performed anymore. *Note:* EOF validation [EIP-5450](./eip-5450.md) guarantees that it cannot happen at run-time.
3. Stack overflow check is not performed anymore, except during `CALLF` as specified above.

## Rationale

### `RETF` in the top frame ends execution vs exceptionally halts
### `RETF` in the top frame ends execution vs exceptionally halts vs is not allowed during validation

Alternative logic for `RETF` in the top frame could be to allow it during code validation and make it either:

Alternative logic for executing `RETF` in the top frame could be to exceptionally halt execution, because there is arguably no caller for the starting function. This would mean that return stack is initialized as empty, and `RETF` exceptionally aborts when return stack is empty.
- end execution if the return stack is emptied by the `RETF` or
- exceptionally halt if the return stack is empty before the `RETF`.

We have decided in favor of always having at least one item in the return stack, because it allows to avoid having a special case for empty stack in the interpreter loop stack underflow check. We keep the stack underflow rule general by having `caller_stack_height = 0` in the top frame.
This has been superseded with the validation rule of top frame (0th code section) being non-returning (non-returning sections introduced in a separate EIP), because validating non-returning status of functions is valuable by itself for other reasons. Therefore all considerations of runtime behavior of `RETF` in the top frame were obsoleted.

### Code section limit and instruction size

Expand Down

0 comments on commit 3a7c2f9

Please sign in to comment.