From 8bdb9213ff2560a83aadd7cc4af062e08e98bd22 Mon Sep 17 00:00:00 2001 From: Santiago Palladino Date: Thu, 14 Mar 2024 09:01:56 -0300 Subject: [PATCH] docs(yellow-paper): Add pseudocode for verifying broadcasted functions in contract deployment (#4431) Plus other small fixes and a TODO --- .../docs/contract-deployment/classes.md | 98 ++++++++++++++----- .../docs/contract-deployment/instances.md | 16 +-- 2 files changed, 80 insertions(+), 34 deletions(-) diff --git a/yellow-paper/docs/contract-deployment/classes.md b/yellow-paper/docs/contract-deployment/classes.md index f54c5660a2f..ea3cfbbcd85 100644 --- a/yellow-paper/docs/contract-deployment/classes.md +++ b/yellow-paper/docs/contract-deployment/classes.md @@ -76,12 +76,14 @@ unconstrained_functions_artifact_tree_root = merkleize(unconstrained_functions_a artifact_hash = sha256( private_functions_artifact_tree_root, - unconstrained_functions_artifact_tree_root, - artifact_metadata, + unconstrained_functions_artifact_tree_root, + artifact_metadata_hash, ) ``` -For the artifact hash merkleization and hashing is done using sha256, since it is computed and verified outside of circuits and does not need to be SNARK friendly. Fields are left-padded with zeros to 256 bits before being hashed. Function leaves are sorted in ascending order before being merkleized, according to their function selectors. Note that a tree with dynamic height is built instead of having a tree with a fixed height, since the merkleization is done out of a circuit. +For the artifact hash merkleization and hashing is done using sha256, since it is computed and verified outside of circuits and does not need to be SNARK friendly, and then wrapped around the field's maximum value. Fields are left-padded with zeros to 256 bits before being hashed. Function leaves are sorted in ascending order before being merkleized, according to their function selectors. 
Note that a tree with dynamic height is built instead of having a tree with a fixed height, since the merkleization is done out of a circuit. + + Bytecode for private functions is a mix of ACIR and Brillig, whereas unconstrained function bytecode is Brillig exclusively, as described on the [bytecode section](../bytecode/index.md). @@ -126,13 +128,16 @@ In pseudocode: function register( artifact_hash: Field, private_functions_root: Field, + public_bytecode_commitment: Point, packed_public_bytecode: Field[], -) - assert is_valid_packed_public_bytecode(packed_public_bytecode) - +) version = 1 - bytecode_commitment = calculate_commitment(packed_public_bytecode) - contract_class_id = pedersen([version, artifact_hash, private_functions_root, bytecode_commitment], GENERATOR__CLASS_IDENTIFIER) + + assert is_valid_packed_public_bytecode(packed_public_bytecode) + computed_bytecode_commitment = calculate_commitment(packed_public_bytecode) + assert public_bytecode_commitment == computed_bytecode_commitment + + contract_class_id = pedersen([version, artifact_hash, private_functions_root, computed_bytecode_commitment], GENERATOR__CLASS_IDENTIFIER) emit_nullifier contract_class_id emit_unencrypted_event ContractClassRegistered(contract_class_id, version, artifact_hash, private_functions_root, packed_public_bytecode) @@ -157,13 +162,13 @@ Broadcasted contract artifacts that do not match with their corresponding `artif ``` function broadcast_all_private_functions( contract_class_id: Field, - artifact_metadata: Field, + artifact_metadata_hash: Field, unconstrained_functions_artifact_tree_root: Field, - functions: { selector: Field, metadata: Field, vk_hash: Field, bytecode: Field[] }[], + functions: { selector: Field, metadata_hash: Field, vk_hash: Field, bytecode: Field[] }[], ) emit_unencrypted_event ClassPrivateFunctionsBroadcasted( contract_class_id, - artifact_metadata, + artifact_metadata_hash, unconstrained_functions_artifact_tree_root, functions, ) @@ -172,19 +177,19 @@ 
function broadcast_all_private_functions( ``` function broadcast_all_unconstrained_functions( contract_class_id: Field, - artifact_metadata: Field, + artifact_metadata_hash: Field, private_functions_artifact_tree_root: Field, - functions:{ selector: Field, metadata: Field, bytecode: Field[] }[], + functions:{ selector: Field, metadata_hash: Field, bytecode: Field[] }[], ) emit_unencrypted_event ClassUnconstrainedFunctionsBroadcasted( contract_class_id, - artifact_metadata, + artifact_metadata_hash, unconstrained_functions_artifact_tree_root, functions, ) ``` - + The broadcast functions are split between private and unconstrained to allow for private bytecode to be broadcasted, which is valuable for composability purposes, without having to also include unconstrained functions, which could be costly to do due to data broadcasting costs. Additionally, note that each broadcast function must include enough information to reconstruct the `artifact_hash` from the Contract Class, so nodes can verify it against the one previously registered. 
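As a rough illustration of how a node might reconstruct the `artifact_hash` from a `ClassPrivateFunctionsBroadcasted` event, here is a minimal Python sketch. Several details are assumptions rather than part of the specification: fields are encoded as 32-byte big-endian integers, a multi-argument `sha256` hashes the concatenation of its arguments, every digest is reduced modulo the BN254 scalar field modulus (one reading of "wrapped around the field's maximum value"), and an unpaired node in the dynamic-height tree is promoted unchanged to the next level. The helper names `to_bytes32`, `sha256_to_field`, and `reconstruct_artifact_hash` are hypothetical.

```python
import hashlib

# Assumption: digests are wrapped into the BN254 scalar field.
FIELD_MODULUS = 21888242871839275222246405745257275088548364400416034343698204186575808495617


def to_bytes32(value: int) -> bytes:
    """Left-pad a field element with zeros to 256 bits (big-endian, assumed)."""
    return value.to_bytes(32, "big")


def sha256_to_field(*values: int) -> int:
    """sha256 over the concatenated 32-byte encodings, reduced into the field."""
    digest = hashlib.sha256(b"".join(to_bytes32(v) for v in values)).digest()
    return int.from_bytes(digest, "big") % FIELD_MODULUS


def artifact_function_leaf(selector: int, metadata_hash: int, bytecode: list) -> int:
    """Mirrors sha256(selector, metadata_hash, sha256(bytecode)) from the pseudocode."""
    return sha256_to_field(selector, metadata_hash, sha256_to_field(*bytecode))


def merkleize(leaves: list) -> int:
    """Dynamic-height sha256 tree; an unpaired node is promoted as-is (assumption)."""
    if not leaves:
        return 0  # Assumption: an empty tree has root zero.
    level = list(leaves)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                next_level.append(sha256_to_field(level[i], level[i + 1]))
            else:
                next_level.append(level[i])
        level = next_level
    return level[0]


def reconstruct_artifact_hash(
    private_functions: list,
    unconstrained_functions_artifact_tree_root: int,
    artifact_metadata_hash: int,
) -> int:
    """Rebuild artifact_hash from the fields of a broadcast of all private functions."""
    # Leaves are sorted in ascending order of function selector before merkleizing.
    ordered = sorted(private_functions, key=lambda f: f["selector"])
    leaves = [
        artifact_function_leaf(f["selector"], f["metadata_hash"], f["bytecode"])
        for f in ordered
    ]
    return sha256_to_field(
        merkleize(leaves),
        unconstrained_functions_artifact_tree_root,
        artifact_metadata_hash,
    )
```

A node would compare the returned value against the `artifact_hash` stored for `contract_class_id` and discard the broadcast on a mismatch.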
@@ -193,16 +198,18 @@ The `ContractClassRegisterer` contract also allows broadcasting individual funct ``` function broadcast_private_function( contract_class_id: Field, - artifact_metadata: Field, + artifact_metadata_hash: Field, unconstrained_functions_artifact_tree_root: Field, - function_leaf_sibling_path: Field, - function: { selector: Field, metadata: Field, vk_hash: Field, bytecode: Field[] }, + private_function_tree_sibling_path: Field[], + artifact_function_tree_sibling_path: Field[], + function: { selector: Field, metadata_hash: Field, vk_hash: Field, bytecode: Field[] }, ) emit_unencrypted_event ClassPrivateFunctionBroadcasted( contract_class_id, - artifact_metadata, + artifact_metadata_hash, unconstrained_functions_artifact_tree_root, - function_leaf_sibling_path, + private_function_tree_sibling_path, + artifact_function_tree_sibling_path, function, ) ``` @@ -210,18 +217,57 @@ function broadcast_private_function( ``` function broadcast_unconstrained_function( contract_class_id: Field, - artifact_metadata: Field, + artifact_metadata_hash: Field, private_functions_artifact_tree_root: Field, - function_leaf_sibling_path: Field, - function: { selector: Field, metadata: Field, bytecode: Field[] }[], + artifact_function_tree_sibling_path: Field[], + function: { selector: Field, metadata_hash: Field, bytecode: Field[] }[], ) emit_unencrypted_event ClassUnconstrainedFunctionBroadcasted( contract_class_id, - artifact_metadata, - unconstrained_functions_artifact_tree_root, - function_leaf_sibling_path: Field, + artifact_metadata_hash, + private_functions_artifact_tree_root, + artifact_function_tree_sibling_path, function, ) ``` +A node that captures a `ClassPrivateFunctionBroadcasted` should perform the following validation steps before storing the private function information in its database: + +``` +// Load contract class from local db +contract_class = db.get_contract_class(contract_class_id) + +// Compute function leaf and assert it belongs to the private 
functions tree +function_leaf = pedersen([selector as Field, vk_hash], GENERATOR__FUNCTION_LEAF) +computed_private_function_tree_root = compute_root(function_leaf, private_function_tree_sibling_path) +assert computed_private_function_tree_root == contract_class.private_function_root + +// Compute artifact leaf and assert it belongs to the artifact +artifact_function_leaf = sha256(selector, metadata_hash, sha256(bytecode)) +computed_artifact_private_function_tree_root = compute_root(artifact_function_leaf, artifact_function_tree_sibling_path) +computed_artifact_hash = sha256(computed_artifact_private_function_tree_root, unconstrained_functions_artifact_tree_root, artifact_metadata_hash) +assert computed_artifact_hash == contract_class.artifact_hash +``` + + + +The check for an unconstrained function is similar: + +``` +// Load contract class from local db +contract_class = db.get_contract_class(contract_class_id) + +// Compute artifact leaf and assert it belongs to the artifact +artifact_function_leaf = sha256(selector, metadata_hash, sha256(bytecode)) +computed_artifact_unconstrained_function_tree_root = compute_root(artifact_function_leaf, artifact_function_tree_sibling_path) +computed_artifact_hash = sha256(private_functions_artifact_tree_root, computed_artifact_unconstrained_function_tree_root, artifact_metadata_hash) +assert computed_artifact_hash == contract_class.artifact_hash +``` + It is strongly recommended for developers registering new classes to broadcast the code for `compute_hash_and_nullifier`, so any private message recipients have the code available to process their incoming notes. However, the `ContractClassRegisterer` contract does not enforce this during registration, since it is difficult to check the multiple signatures for `compute_hash_and_nullifier` as they may evolve over time to account for new note sizes. 
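The artifact-side check in the validation pseudocode for a single broadcasted unconstrained function can be sketched in Python as follows. This is an illustration under stated assumptions, not the reference implementation: the pseudocode's `compute_root` takes only a leaf and a sibling path, so the `leaf_index` parameter used here to decide left/right ordering is hypothetical, as are the 32-byte big-endian field encoding, hashing by concatenation, and the reduction of each digest modulo the BN254 scalar field modulus.

```python
import hashlib

# Assumption: digests are wrapped into the BN254 scalar field.
FIELD_MODULUS = 21888242871839275222246405745257275088548364400416034343698204186575808495617


def sha256_to_field(*values: int) -> int:
    """sha256 over concatenated 32-byte big-endian encodings, reduced into the field."""
    data = b"".join(v.to_bytes(32, "big") for v in values)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % FIELD_MODULUS


def compute_root(leaf: int, leaf_index: int, sibling_path: list) -> int:
    """Walk from a leaf to the root; leaf_index (hypothetical) picks the hashing side."""
    node = leaf
    for sibling in sibling_path:
        if leaf_index & 1:
            node = sha256_to_field(sibling, node)  # current node is a right child
        else:
            node = sha256_to_field(node, sibling)  # current node is a left child
        leaf_index >>= 1
    return node


def verify_unconstrained_function(
    registered_artifact_hash: int,
    private_functions_artifact_tree_root: int,
    artifact_metadata_hash: int,
    selector: int,
    metadata_hash: int,
    bytecode: list,
    leaf_index: int,
    artifact_function_tree_sibling_path: list,
) -> bool:
    """Return True if the broadcasted function is consistent with the class artifact hash."""
    leaf = sha256_to_field(selector, metadata_hash, sha256_to_field(*bytecode))
    unconstrained_root = compute_root(
        leaf, leaf_index, artifact_function_tree_sibling_path
    )
    computed_artifact_hash = sha256_to_field(
        private_functions_artifact_tree_root,
        unconstrained_root,
        artifact_metadata_hash,
    )
    return computed_artifact_hash == registered_artifact_hash
```

The private function check would follow the same shape, with an additional Pedersen-based membership check against the private functions tree root, which is omitted here because it depends on protocol-specific generators.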
+ +## Discarded Approaches + +### Bundling private function information into a single tree + +Data about private functions is split across two trees: one for the protocol, that deals only with selectors and verification keys, and one for the artifact, which deals with bytecode and metadata. While bundling together both trees would simplify the representation, it would also pollute the protocol circuits and require more hashing there. In order to minimize in-circuit hashing, we opted for keeping non-protocol info completely out of circuits. \ No newline at end of file diff --git a/yellow-paper/docs/contract-deployment/instances.md b/yellow-paper/docs/contract-deployment/instances.md index 7c19895dcdc..8c825f3456e 100644 --- a/yellow-paper/docs/contract-deployment/instances.md +++ b/yellow-paper/docs/contract-deployment/instances.md @@ -48,9 +48,9 @@ A contract instance at a given address can be either Initialized or not. An addr ### Uninitialized -The instance has not yet been initialized, meaning its constructor has not been called. This is the default state for any given address. A user who knows the preimage of the address can still issue a private call into a function in the contract, as long as that function does not assert that the contract has been initialized by checking the Initialization Nullifier. +The default state for any given address is to be uninitialized, meaning its constructor has not been called. A user who knows the preimage of the address can still issue a private call into a function in the contract, as long as that function does not assert that the contract has been initialized by checking the Initialization Nullifier. -All public function calls to an Uninitialized address _must_ fail, since the Contract Class for it is not known to the network. 
If the Class is not known to the network, then an Aztec Node, whether it is the elected sequencer or a full node following the chain, may not be able to execute the bytecode for a public function call, which is undesirable. The failing of public function calls to Uninitialized addresses is enforced by having the Public Kernel Circuit check that the Deployment Nullifier for the instance has been emitted. +All function calls to an Uninitialized contract that depend on the contract being initialized should fail, to prevent the contract from being used in an invalid state. This state allows using a contract privately before it has been initialized or deployed, which is used in [diversified and stealth accounts](../addresses-and-keys/diversified-and-stealth.md). @@ -60,8 +60,6 @@ An instance is Initialized when a constructor for the instance has been invoked, The Initialization Nullifier is defined as the contract address itself. Note that the nullifier later gets [siloed by the Private Kernel Circuit](../circuits/private-kernel-tail.md#siloing-values) before it gets broadcasted in a transaction. -In this state, public functions must still fail, for the same reason as for Uninitialized instances. This state then allows using a contract privately before it has been publicly deployed, which is useful for working on private contracts between a small set of parties. - :::warning It may be the case that it is not possible to read a nullifier in the same transaction that it was emitted due to protocol limitations. That would lead to a contract not being callable in the same transaction as it is initialized. To work around this, we can emit an Initialization Commitment along with the Initialization Nullifier, which _can_ be read in the same transaction as it is emitted. If needed, the Initialization Commitment is defined exactly as the Initialization Nullifier. 
::: @@ -83,11 +81,13 @@ Removing constructors from the protocol itself simplifies the kernel circuit, an ## Public Deployment -A Contract Instance is considered to be Publicly Deployed when it has been broadcasted to the network via a canonical `ContractInstanceDeployer` contract, which also emits a Deployment Nullifier associated to the deployed instance. A contract needs to be Publicly Deployed for any of its public functions to be called. Note that this last restriction makes Public Deployment a protocol-level concern, whereas Initialization is an application-level concern. +A Contract Instance is considered to be Publicly Deployed when it has been broadcasted to the network via a canonical `ContractInstanceDeployer` contract, which also emits a Deployment Nullifier associated to the deployed instance. -The Deployment Nullifier is defined as the address of the contract being deployed. Note that it later gets [siloed](../circuits/private-kernel-tail.md#siloing-values) using the `ContractInstanceDeployer` address by the Kernel Circuit, so this nullifier is effectively the hash of the deployed contract address and the `ContractInstanceDeployer` address. +All public function calls to an Undeployed address _must_ fail, since the Contract Class for it is not known to the network. If the Class is not known to the network, then an Aztec Node, whether it is the elected sequencer or a full node following the chain, may not be able to execute the bytecode for a public function call, which is undesirable. -Only in this state public function calls are valid. The Public Kernel Circuit validates that the Deployment Nullifier has been emitted by the `ContractInstanceDeployer` as part of its checks. Note that this requires hardcoding the address of an application-level contract in a protocol circuit. +The failing of public function calls to Undeployed addresses is enforced by having the Public Kernel Circuit check that the Deployment Nullifier for the instance has been emitted. 
Note that this makes Public Deployment a protocol-level concern, whereas Initialization is purely an application-level concern. Also, note that this requires hardcoding the address of the `ContractInstanceDeployer` contract in a protocol circuit. + +The Deployment Nullifier is defined as the address of the contract being deployed. Note that it later gets [siloed](../circuits/private-kernel-tail.md#siloing-values) using the `ContractInstanceDeployer` address by the Kernel Circuit, so this nullifier is effectively the hash of the deployed contract address and the `ContractInstanceDeployer` address. ### Canonical Contract Instance Deployer @@ -124,7 +124,7 @@ function deploy ( Upon seeing a `ContractInstanceDeployed` event from the canonical `ContractInstanceDeployer` contract, nodes are expected to store the address and preimage, so they can verify executed code during public code execution as described in the next section. -The `ContractInstanceDeployer` contract provides two implementations of the `deploy` function: a private and a public one. Contracts with a private constructor are expected to use the former, and contracts with public constructors expected to use the latter. Contracts that have already been privately Initialized can use either. +The `ContractInstanceDeployer` contract provides two implementations of the `deploy` function: a private and a public one. ### Genesis