Runtime triggers #516

Open
bowenwang1996 opened this issue Nov 5, 2023 · 37 comments
Comments

@bowenwang1996
Collaborator

bowenwang1996 commented Nov 5, 2023

It is sometimes useful for a smart contract to subscribe to some event and perform an action based on the result of that event. One simple example is a cron job. Currently, smart contracts have no way of scheduling an action that is triggered periodically (in this case the event that triggers the action is time). As a result, anyone who wants to achieve this needs to rely on offchain infrastructure that periodically calls the smart contract, which is quite cumbersome. A more complex use case is chain signatures, which need the execution to resume once the signature is generated by validators.

In terms of effect, this should be roughly equivalent to the smart contract having a method that constantly calls itself:

fn trigger(&self) -> Promise {
  if condition {
     do_something
  }
  Self::ext(env::current_account_id()).trigger()
}

However, a function like this is not practical as is because of the gas limit associated with each function call. Instead, we can extend the mechanism of callbacks so that they are triggered not immediately, but when a specific condition is met. More specifically, we can introduce a new type of Promise, DelayedPromise, that generates a postponed receipt, similar to what callbacks do today. However, the postponed receipt is stored globally in the shard instead of under a specific account (at least conceptually; in practice a separate index could be created if that helps), along with the condition that triggers the delayed promise. Then, during the execution of every chunk, the runtime first checks whether any of the delayed promises should be triggered and executes them if their triggering condition is met. Roughly, this would allow us to rewrite the example above into:

fn trigger_condition(&self) -> bool {
   // some condition specified by the contract
   condition
}

fn trigger(&self) -> Promise {
  if condition {
     do_something
  }
  // specify the trigger condition, callback and arguments to the callback. This is almost a call to self with callback, except that it also specifies the trigger condition.
  DelayedPromise::new(env::current_account_id(), trigger_condition, trigger, {})
}

For this idea to work, a few issues need to be resolved:

  • How the delayed promises are paid for. In NEAR's execution model, it is always the signer of the transaction who pays for the entire execution chain, regardless of the number of receipts emitted during the execution. However, in the case of a delayed promise, the situation may get more complex. There are two possible options here:
    1. a delayed promise can only be triggered once, with no possibility of scheduling further delayed promises. In this simple mode, we can reuse the existing gas model and continue to ask the signer of the transaction to pay for the promise without having to worry about issues like running out of gas.
    2. a delayed promise can be triggered multiple times, or it can schedule delayed promises within its execution. Under this assumption, the existing gas model does not work because then we would run into a similar problem described at the beginning - there won't be enough gas to execute the delayed promise because of the single function call gas limit (300Tgas). One possible idea to address this issue is to say that the contract which schedules the delayed promise has to pay for its execution and it is the responsibility of the smart contract to figure out how to transfer such cost to users if needed. Then, if the smart contract is not able to pay for the cost of the delayed promise, its execution would fail and no new promises would be scheduled.
  • How triggers are specified. The trigger should be specified by the contract and can depend on the execution context (VMContext). One idea is that we could introduce a special type of function that a smart contract can implement to specify the behavior of a trigger it plans to use. The function must return a boolean value and must consume little gas (the exact threshold needs to be defined). Otherwise a malicious attacker could run an infinite loop to cause validators to do a lot more work and slow down the network as a result. The cost of running the trigger should be priced into the cost of a delayed promise so that this cost is accounted for.
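The per-chunk check described above can be modelled outside the runtime. The following is a minimal sketch, not nearcore code: `Ctx`, `DelayedReceipt`, `MAX_TRIGGER_GAS`, `take_ready` and `every_tenth_block` are all hypothetical names, the gas cap value is made up, and dropping a trigger that exceeds the cap is an assumption (the proposal leaves that behaviour open).

```rust
/// Hypothetical execution context visible to a trigger (stands in for VMContext).
#[derive(Clone, Copy)]
pub struct Ctx {
    pub block_height: u64,
}

/// Hypothetical delayed receipt, stored shard-wide together with its trigger.
pub struct DelayedReceipt {
    pub receiver: String,
    pub method: String,
    /// Trigger condition: returns the verdict and the gas it burned.
    pub trigger: fn(&Ctx) -> (bool, u64),
}

/// Assumed cap on the gas a trigger may burn; the real threshold is undefined.
pub const MAX_TRIGGER_GAS: u64 = 1_000;

/// Example cron-like trigger: fires on every tenth block and reports 5 gas used.
pub fn every_tenth_block(ctx: &Ctx) -> (bool, u64) {
    (ctx.block_height % 10 == 0, 5)
}

/// Called at the start of every chunk. Returns the receipts whose trigger fired
/// and keeps the others queued. Triggers over the gas cap are dropped (assumption).
pub fn take_ready(queue: &mut Vec<DelayedReceipt>, ctx: &Ctx) -> Vec<DelayedReceipt> {
    let mut ready = Vec::new();
    let mut pending = Vec::new();
    for receipt in queue.drain(..) {
        let (fired, gas_used) = (receipt.trigger)(ctx);
        if gas_used > MAX_TRIGGER_GAS {
            continue; // too expensive to evaluate: discard instead of re-queueing
        }
        if fired {
            ready.push(receipt)
        } else {
            pending.push(receipt)
        }
    }
    *queue = pending;
    ready
}
```

With a single receipt using `every_tenth_block`, calling `take_ready` at height 7 returns nothing and leaves the receipt queued, while calling it at height 10 returns the receipt and empties the queue.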
@DavidM-D
Contributor

DavidM-D commented Nov 6, 2023

Thanks for writing up this proposal Bowen, it appears to me to be very sensible and solve our problem. I've got a few suggestions of how I think we can make this simpler and potentially more powerful using some well understood primitives.

If we consider an external action of type Input -> Output, the typed extended host function API I propose is:

yield : Input -> ()

resume : ActionReceipt -> (Output -> b) -> Output -> ()

Or rendered slightly closer to the WASM API:

yield : Bytes -> ()

resume : ActionReceiptId -> String -> [u8] -> ()

Each yield emits a new receipt type, YieldedReceipt.

type YieldedReceipt = {
    data: [u8]
}

Resume continues the execution in the ActionReceipt context, consuming the gas of the original caller.

Resume calls can be executed from the first yield call until either value_return, panic or abort is called during any resumption. In the interim, the contract runtime needs to store no data in excess of what it stores when there is an outstanding promise. Resume and yield do not have to be called the same number of times.

To sketch out how this would be used in the sign endpoint:

signer.near:
  // Caller calls and pays
  sign_start : (payload, derivation) {
    let caller = caller();
    // MPC service listens for these yields and calls resume with the result
    let sign_request_id = generate_request_id();
    yield((sign_request_id, payload, derivation, caller))
    store.insert(sign_request_id, (payload, derivation, caller))
  }

  // MPC service calls and pays
  sign_resume : (signature, sign_request_id, action_receipt_id) {
    let (payload, derivation, caller) = store.get(sign_request_id)
    assert_valid_sig!(signature, payload, caller)
    resume(action_receipt_id, "sign_finish", signature)
  }
  
  // Resume calls and caller pays
  sign_finish : (signature) {
    // Returns value back along the call tree
    value_return(signature)
  }

More generally, when the in-contract state is included, this is an untyped effect system. Potential use cases are:

  • Reading information from IPFS
  • Sending/receiving messages to private NEAR shards
  • Sending/receiving messages from other chains
  • Messaging and aggregating information from oracles
  • Running contracts over streams of data

@bowenwang1996
Collaborator Author

@DavidM-D sorry I don't think I fully understand your proposal. A couple of questions:

  • How exactly does resume work? Can anyone call resume and continue the execution?
  • "Resume continues the execution in the ActionReceipt context, consuming the gas of the original caller." This sounds like you want to save the execution context of the previous execution (before resume). This is actually quite complex since we essentially need to snapshot and store the wasm execution context in the state. @nagisa can provide a better explanation here

@nagisa
Contributor

nagisa commented Nov 7, 2023

Correct, the complication arises from the fact that WASM execution state is quite intertwined with the native state, so we can’t just save some explicitly defined data structures; we would need to save things like the stack, machine registers and such as well. Fortunately, since the spots where the saving and restoring might happen are well defined (by virtue of them being specific user-invoked functions), some of the concerns about adapting codegen to deal with being suspended and relocated at a later date aren’t as prominent.

@akhi3030
Contributor

akhi3030 commented Nov 7, 2023

There is some prior work where it is possible to save some execution context. Essentially it relies on the async system provided by Rust to enable this support. https://internetcomputer.org/docs/current/developer-docs/backend/rust/intercanister is a simple example of how something like this can be used to call another smart contract, yield execution until the response comes back, and then continue execution. https://github.com/dfinity/cdk-rs/blob/main/src/ic-cdk/src/futures.rs is the implementation of this feature in the SDK.

@itegulov

itegulov commented Nov 7, 2023

I don't think we need to save the execution context for this feature, although this is an interesting topic in and of itself that potentially deserves its own NEP (I remember briefly discussing what this would look like in the Rust SDK last year). Let me rewrite @DavidM-D's code in a way that would be representable with the existing tooling:

pub struct Signer {
  ...
}

impl Signer {
  /// Caller wants to sign payload using a key derived with the given path
  pub fn sign_request(payload: Vec<u8>, derivation_path: String) -> Promise<Signature> {
    let sign_request_id = generate_request_id();
    // The account that signed the transaction is recorded as the caller of this request.
    let caller = env::signer_account_id();
    store.insert(sign_request_id, (payload, derivation_path, caller));
    // Yield to self (current account id), meaning only this contract can resume. Pre-allocates gas for the resume
    // transaction.
    YieldPromise::new(env::current_account_id(), {sign_request_id, payload, derivation_path, caller}, 30k TGas)
      .then(Promise::new(env::current_account_id(), "sign_on_finish"))
  }

  /// Caller wants to fulfill a signature request and resume the chain of promises by the given receipt id
  pub fn sign_response(signature: Signature, sign_request_id: RequestId, action_receipt_id: ActionReceiptId) -> Promise<()> {
    assert_is_allowed_to_respond!(env::signer_account_id());
    let (payload, derivation, caller) = store.get(sign_request_id);
    assert_valid_sig!(signature, payload, caller);
    // Resumes a yielded promise from the corresponding `sign_request` call and consumes gas preallocated for
    // the resume transaction, thus refunding caller the gas they have spent so that ideally calling `sign_response`
    // did not cost anything.
    ResumePromise::new(env::current_account_id(), action_receipt_id, {sign_request_id, signature})
  }

  #[private]
  pub fn sign_on_finish(sign_request_id: RequestId, signature: Signature) -> Signature {
    store.remove(sign_request_id);
    signature
  }
}

The way I see it runtime triggers are essentially polling the contract's state to see if it has changed in a specific way, but that can very straightforwardly be replaced by scheduling a ResumePromise when you change the contract in that specific way. I am not sure if cross-contract polling was ever considered or requested (i.e. being able to return Promise<bool> in trigger_condition), but even if so the polling can be done offchain by doing free view calls and then just "proving" it once in the end when scheduling ResumePromise. Happy to provide a more concrete example on above if unclear

@akhi3030
Contributor

akhi3030 commented Nov 8, 2023

I just had a call with @itegulov to understand their proposal above a bit better. I like it very much; I think it is much simpler than what was proposed originally.

In the original proposal, we need a way to call into the contract regularly to allow it to implement polling to decide if the desired condition is met. However, as @itegulov points out, if the polling function is inspecting some state in the smart contract that is going to be updated over time by some other calls, then instead of polling, the contract can realise that the condition is met from these other calls and then resume execution.

The biggest challenge though is to figure out how to delay execution until the condition is met. More specifically, we have the following case:

  • Contract A calls the signer contract to get something signed.
  • The signer contract initiates the signing process.
  • Now the signer contract needs a way to delay replying to A till the signing process finishes. As far as I understand, we currently do not have such a mechanism in the protocol.
  • Further, when the signing process is finished, the signer contract needs to reply to A. This is effectively going to mix two different "call trees". We have one call tree from A to the signer contract, and another that is finishing the signing process and now needs to respond to A.

There is some related work that we can explore. This work is trying to come up with a clean way to implement what are effectively asynchronous host functions. It works as follows:

  • there exists a virtual smart contract that provides various functionalities.
  • When a smart contract wants to issue an asynchronous host function, they call the virtual smart contract.
  • When response is ready from the host function, a reply from the virtual smart contract is generated and sent back to the contract.

The above solves the problem of having multiple call trees that need to be conflated and solves the problem of delaying responding to a request till some condition is met.

The problem with this solution as I see it is that it will move many bits of the signature aggregation into the protocol.

@bowenwang1996
Collaborator Author

@akhi3030 on your proposal with virtual smart contract, how exactly does it solve the issue mentioned in the previous approach that is related to the ability to trigger a callback? You still need a way for the virtual smart contract to signal that the action is complete and the callback can be executed.

@akhi3030
Contributor

akhi3030 commented Nov 9, 2023

@bowenwang1996: I chatted with @DavidM-D earlier and am in line with what @itegulov is proposing above. I think it is simpler to implement in the protocol and also a more general framework.

Current situation

When a contract A calls B, B can either respond to A or it needs to call [another] contract. In other words, there is currently no way for B to say: I am not ready to respond to A yet, but I might be in the near future. It either has to produce a result for A or keep calling another contract to delay replying to A. And due to gas limits, it can only delay replying to A for so long before the gas limit is exhausted and an error is returned to A.

Proposal

Introduce a yield / resume API that allows B to delay replying to A for arbitrarily long.

When A calls B, we introduce a host function: yield that B can call to suspend replying to A. Then in future, another contract C (or A, it doesn't matter) calls into B and B decides that it is now ready to respond to the original call from A. At this point, it can call resume which will now generate a response for A.

How will this work for the fastauth project?

A smart contract calls the signer contract to get a payload signed. The signer contract initiates the signing process and calls yield. The indexers prepare the signature and call into the signer contract. After the signer contract has aggregated enough signatures and verified that they are correct, it calls resume to return the signature to the original caller.

I think @itegulov's proposed API above shows what this would look like.

Changes needed in the protocol

I imagine that the change needed in the protocol is that when yield is called by a contract, we need to create some state to store the "delayed execution". And when resume is called, we need to look up the respective "delayed execution" and continue it.

We will have to figure out how to charge for the additional state created when yield is called. One idea is that there is actually a time limit on how long this state is kept around for. Something that is yielded can be kept around for a fixed amount of time and if it is not resumed during that time, it is resumed with an error. We could set the time limit or we could allow the contract to specify the limit (with an upper bound). Having such a time limit allows us to figure out how much gas should be charged for this storage.
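The protocol-side bookkeeping described here (create state on yield, look it up on resume) can be sketched as a small in-memory registry. This is only an illustration under stated assumptions: `Yielded`, `YieldRegistry`, `yield_execution` and the numeric identifier are hypothetical and do not correspond to an existing nearcore or near-sdk API.

```rust
use std::collections::HashMap;

/// Hypothetical record kept for one yielded execution.
#[derive(Debug, PartialEq)]
pub struct Yielded {
    /// Account still waiting for a reply (contract A in the discussion).
    pub caller: String,
    /// Method on the yielding contract (B) to run once it is resumed.
    pub callback: String,
}

/// Hypothetical store of all outstanding "delayed executions".
#[derive(Default)]
pub struct YieldRegistry {
    next_id: u64,
    pending: HashMap<u64, Yielded>,
}

impl YieldRegistry {
    /// `yield`: park the reply to `caller` and return an identifier for it.
    pub fn yield_execution(&mut self, caller: &str, callback: &str) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.pending.insert(
            id,
            Yielded {
                caller: caller.to_string(),
                callback: callback.to_string(),
            },
        );
        id
    }

    /// `resume`: look up the delayed execution and hand it back with the payload.
    /// Returns `None` if the identifier is unknown or was already resumed.
    pub fn resume(&mut self, id: u64, payload: Vec<u8>) -> Option<(Yielded, Vec<u8>)> {
        self.pending.remove(&id).map(|yielded| (yielded, payload))
    }
}
```

Because each entry is removed when it is resumed, a second `resume` with the same identifier returns `None`, and entries can be resumed in any order regardless of the order in which they were yielded.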

@akhi3030
Contributor

@bowenwang1996, @itegulov, @DavidM-D, @saketh-are, @walnut-the-cat: just so we are all on the same page, we are now planning on moving forward with the proposal in the comment above. There are still a couple of open questions for me for the API that I will make subsequent posts about to clarify. Saketh and I will then write up a draft NEP so that we are all agreed on the precise API and then start working on an implementation in nearcore.

@akhi3030
Contributor

@itegulov, @DavidM-D, @saketh-are: I have a question about the API for yield and resume.

In order for the API to be generic enough, it will be possible that there are multiple outstanding yielded "executions" in a given contract that can later be resumed. It will also be possible that they will be ready to be resumed in different order than in which they were yielded. As such, the contract needs to specify which "execution" to resume. Does this concern make sense?

So what I am thinking is that yield needs to return a "token", an opaque identifier that the contract needs to save and pass in to resume to indicate which yielded execution to resume.

@itegulov: Looking at your comment, the closest thing I see resembling this identifier is sign_request_id. From your example, it seems like it is something that the application is generating internally. Any thoughts on how we could solve the above issue?

@itegulov

itegulov commented Nov 16, 2023

@akhi3030 yes, so this is what I meant by action_receipt_id. sign_request_id is indeed a domain-specific thing here. The idea is that sign_request(...) results in a Promise<Signature>, which is what the Rust SDK uses to denote a receipt id that is supposed to resolve into a value of a specific type. And then we reuse that receipt id to reference the yielded execution.

@akhi3030
Contributor

@itegulov: cool, thanks, then we are in agreement on this.

A follow-on question is that we need to have some sort of "time limit" for how long the yielded execution can stay alive for. This is because each yielded execution will create some state in the protocol that needs to be paid for. Otherwise, if a contract keeps forgetting to call resume on yielded execution, then that will accumulate state in the protocol.

My high level idea is that when some execution is yielded and it is not resumed within N blocks, then the protocol will resume it with a timeout error. Something like this could then also be used as a feature if a contract just wants a callback sometime in the future.

There are two options for an API here. First is that the N blocks specified above is a constant of the protocol that cannot be changed and then the gas fees are calculated based on that. But we could consider generalising this a bit more and say that the contract can specify N (with some upper bound if needed) and then the gas fee will be calculated accordingly. Does anyone have opinions on which approach to take here?

An additional API we could offer is that when N is about to run out, the contract could request to extend the time that the execution is yielded for. I think that we should keep this as future work. I think we can have the simpler API for now and build this in future if needed.
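The timeout rule sketched in this comment (resume with a timeout error if nothing resumed the execution within N blocks) could look roughly like the following. All names here (`ResumeOutcome`, `PendingYield`, `expire`) are hypothetical, and the choice to treat block `yielded_at + N` itself as still within the limit is an assumption.

```rust
/// How a yielded execution ends, as seen by its callback (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum ResumeOutcome {
    /// The contract called resume in time and supplied this payload.
    Resumed(Vec<u8>),
    /// Nobody resumed within N blocks, so the protocol resumed with an error.
    TimedOut,
}

/// Hypothetical bookkeeping entry for one yielded execution.
pub struct PendingYield {
    pub id: u64,
    /// Last block height at which a resume is still accepted: yielded_at + N.
    pub deadline: u64,
}

impl PendingYield {
    pub fn new(id: u64, yielded_at: u64, n_blocks: u64) -> Self {
        PendingYield {
            id,
            deadline: yielded_at + n_blocks,
        }
    }
}

/// Run once per block: removes every entry whose deadline has passed and
/// reports it as timed out. When the callback actually executes is not modelled.
pub fn expire(pending: &mut Vec<PendingYield>, height: u64) -> Vec<(u64, ResumeOutcome)> {
    let mut timed_out = Vec::new();
    pending.retain(|entry| {
        if height > entry.deadline {
            timed_out.push((entry.id, ResumeOutcome::TimedOut));
            false // drop the entry: its state no longer needs to be stored
        } else {
            true
        }
    });
    timed_out
}
```

For an entry yielded at height 100 with N = 5, `expire` leaves it alone at height 105 and reports `TimedOut` at height 106.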

@DavidM-D
Contributor

DavidM-D commented Nov 16, 2023 via email

@akhi3030
Contributor

Do we have to have an opaque reference type. Is it not possible to do it all on the application level? Surely if the contract has to verify the response it can also associate it with the correct callback?

@DavidM-D: I am not sure how that will work. For each yielded execution, the protocol has to create some state. And when the application wants to resume some execution, it has to tell the protocol which yielded execution to continue.

@itegulov

@akhi3030: having a timeout mechanism seems very reasonable to me. We can have a retry mechanism on the application level and even potentially subsidize it (e.g., set up a relayer that, given proof that you sent a transaction that resulted in timeout, funds a new retry transaction for you).

There are two options for an API here. First is that the N blocks specified above is a constant of the protocol that cannot be changed and then the gas fees are calculated based on that. But we could consider generalising this a bit more and say that the contract can specify N (with some upper bound if needed) and then the gas fee will be calculated accordingly. Does anyone have opinions on which approach to take here?

Let me think this through, in the meantime I have a couple of counter-questions. Let's say the execution got resumed in K blocks. Do you think it would be possible to refund unspent gas for N-K blocks of not storing the yielded execution? Also, what sort of magnitude would this block limit be - 10s, 100s, 1000s of blocks?

An additional API we could offer is that when N is about to run out, the contract could request to extend the time that the execution is yielded for. I think that we should keep this as future work. I think we can have the simpler API for now and build this in future if needed.

Agreed, this can be kept as a potential future extension. I don't see this being particularly useful for our use case, but a need might arise from somewhere else.

@akhi3030
Contributor

Let me think this through, in the meantime I have a couple of counter-questions. Let's say the execution got resumed in K blocks. Do you think it would be possible to refund unspent gas for N-K blocks of not storing the yielded execution?

Good point. Yes, it would make sense to refund the remaining gas for N-K blocks.

Also, what sort of magnitude would this block limit be - 10s, 100s, 1000s of blocks?

Hmm.... I don't have a good intuition for this yet. It will depend on how much state we need to store in the protocol. I imagine that we are probably talking about storing less than around 500 bytes per yielded execution. So we will have to do some gas estimations to figure out how much gas we should charge per block for that much storage. Maybe you have some intuition for this?

The other thought I had on this just now is that we probably do not need to specify an upper limit in the protocol on how big N can be. There will be an implicit maximum value for N based on how much gas is left for the account to burn and maybe that is a good enough upper limit.
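The arithmetic discussed in this exchange (charge up front for N blocks, refund the unused N-K blocks, and let the remaining gas bound N) can be written out as below. This is a sketch under loud assumptions: `GAS_PER_BYTE_BLOCK` is an invented price, not an estimated or proposed one, and the function names are hypothetical.

```rust
/// Invented price of keeping one byte of yielded state for one block.
/// The thread leaves the real value to future gas estimation.
pub const GAS_PER_BYTE_BLOCK: u64 = 10;

/// Gas charged up front for holding `bytes` of state for up to `n` blocks.
pub fn prepaid_storage_gas(bytes: u64, n: u64) -> u64 {
    bytes * n * GAS_PER_BYTE_BLOCK
}

/// Gas refunded when the execution is resumed after `k` blocks.
/// If `k` is not smaller than `n`, nothing is refunded.
pub fn refund(bytes: u64, n: u64, k: u64) -> u64 {
    bytes * n.saturating_sub(k) * GAS_PER_BYTE_BLOCK
}

/// Largest N that `gas_left` can pay for: the implicit upper bound on N.
/// Storing zero bytes costs nothing, so the bound is effectively unlimited.
pub fn max_blocks(bytes: u64, gas_left: u64) -> u64 {
    gas_left
        .checked_div(bytes * GAS_PER_BYTE_BLOCK)
        .unwrap_or(u64::MAX)
}
```

With the roughly 500 bytes mentioned above and the invented price, yielding for up to 100 blocks prepays 500,000 gas units, resuming after 30 blocks refunds 350,000, and 1,000,000 gas units left would bound N at 200 blocks.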

@walnut-the-cat
Contributor

We will have to figure out how to charge for the additional state created when yield is called. One idea is that there is actually a time limit on how long this state is kept around for. Something that is yielded can be kept around for a fixed amount of time and if it is not resumed during that time, it is resumed with an error.

This sounds very much like the previous discussion on charging delayed receipts for their storage usage..

My high level idea is that when some execution is yielded and it is not resumed within N blocks, then the protocol will resume it with a timeout error. Something like this could then also be used as a feature if a contract just wants a callback sometime in the future.

What's the realistic feasibility of this? Currently, we do not guarantee when delayed receipts will be executed in the future, and it seems that for this type of timeout to work, we need to provide some guarantee such as 'resume() will be executed one block after the condition is met'.

@akhi3030
Contributor

What's the realistic feasibility of this? Currently, we do not guarantee when delayed receipts will be executed in the future, and it seems that for this type of timeout to work, we need to provide some guarantee such as 'resume() will be executed one block after the condition is met'.

I don't think we need to provide any guarantee on when something will execute. Once the timeout passes, we just need to mark the receipt ready to be executed. It doesn't matter when it actually executes.

@akhi3030
Contributor

This sounds very much like the previous discussion on charging delayed receipts for their storage usage..

Yes. This is like the single trigger model which should work well with the current gas model.

@walnut-the-cat
Contributor

I don't think we need to provide any guarantee on when something will execute. Once the timeout passes, we just need to mark the receipt ready to be executed. It doesn't matter when it actually executes.

But doesn't that mean users cannot control or estimate how much gas will be refunded or what 'N (with some upper bound if needed)' should be?

I am afraid of this resulting in a situation similar to the one we have with gas attachment for txn calls (where users always use the highest/largest value).

@akhi3030
Contributor

No... that is not how I envision the API working. If the contract requests that the execution be yielded for up to N blocks, then in the worst case the amount of gas it will pay will be a function of N. However, just because the contract requested that the callback happen at most N blocks later, doesn't mean that the protocol has to provide that guarantee. The protocol will provide the guarantee that the call will not take place before N blocks; however, due to congestion and being busy, the protocol can choose to arbitrarily delay when the callback happens.

@akhi3030
Contributor

I have created a draft NEP for this work.

@walnut-the-cat
Contributor

just because the contract requested that the callback happen at most N blocks later, doesn't mean that the protocol has to provide that guarantee.

I agree that the protocol doesn't have to provide such a guarantee, but in the case where a callback is yielded for N blocks and times out because it couldn't get executed within the time limit, will the contract end up wasting gas for nothing?

The protocol will provide the guarantee that the call will not take place before N blocks

Just to be clear, the N used here is different from the N used in the previous sentence, as we are talking about a 'minimum' delay instead of a 'maximum' delay?

@walnut-the-cat walnut-the-cat moved this to Ready to be prioritised in Near One project tracking Nov 17, 2023
@walnut-the-cat walnut-the-cat moved this from Ready to be prioritised to In Progress in Near One project tracking Nov 17, 2023
@walnut-the-cat walnut-the-cat moved this from In Progress to Prioritised in Near One project tracking Nov 17, 2023
@saketh-are

I agree that the protocol doesn't have to provide such a guarantee, but in the case where a callback is yielded for N blocks and times out because it couldn't get executed within the time limit, will the contract end up wasting gas for nothing?

Suppose that a contract requests that execution be yielded for up to N blocks.

  • In the case that execution is resumed after K<=N blocks, the remaining gas for N-K blocks is refunded. No guarantee is provided on when the resumed execution actually occurs (protocol may choose to arbitrarily delay due to congestion, etc.), but the user does not continue to pay for storage once they make the resume call.
  • In the case that execution is not resumed within N blocks, the contract will be resumed automatically by the protocol with some indication that it was resumed due to timeout. Again no guarantee is made on when the execution occurs.

The protocol will provide the guarantee that the call will not take place before N blocks

This guarantee is referring to the fact that the timeout case won't be triggered by the protocol within N blocks.

@saketh-are

Following a discussion with @bowenwang1996, I propose removing the timeout feature for a couple of reasons:

  • The timeout does not provide anything useful towards the chain signatures use case. If anything, it will hurt if they cannot make the resume call quickly enough due to congestion.
  • The need for the protocol to automatically resume timed-out triggers adds unnecessary implementation complexity.

Without a timeout, we would take a storage deposit to pay for storing the yielded computation indefinitely.

@DavidM-D
Contributor

DavidM-D commented Dec 6, 2023

Will storage deposits be denominated in gas or NEAR? NEAR storage deposits make for complex interactions with relayers.

@bowenwang1996
Collaborator Author

Will storage deposits be denominated in gas or NEAR? NEAR storage deposits make for complex interactions with relayers.

Normally it should be in NEAR. However, I agree that the interactions would be more complex. Given that such a receipt will be quite small (a few hundred bytes), I think we can also consider burning gas (similar to zero-balance accounts).

@akhi3030
Contributor

I have some more questions about how things would work if we do not have timeouts. I mentioned them in a Slack thread but am also mentioning them here in case someone else is following the conversation.

If we do not have timeouts, then the following situation can happen:

  • Contract A calls B
  • B calls yield_create
  • B never calls yield_resume
  • Now A will never get a response to its message to B

This can create some problems such as:

  • If A had allocated some memory (or created some state) before calling B and it was intending to free it when B replies, that may never happen. This might leave A in a potentially inconsistent state.
  • I am not super familiar with the implementation details but on A, the protocol will have created some state to handle the response from B. As the response from B will never come, we will never be able to free up this state. Are we then charging enough at B to account for this state as well? But accounting for this state is not going to be trivial. Let's say the chain is A -> B -> C -> yield_create. Then C has to pay for 2x the state that will never get cleaned up. So essentially, whenever a contract calls yield_create, we need to be able to figure out the entire call chain till that point to figure out how much to charge.

@bowenwang1996
Collaborator Author

bowenwang1996 commented Dec 13, 2023

Summary of a meeting with @DavidM-D @itegulov @saketh-are:

  • Timeout is acceptable for the time being.
  • Given the complexity and the time it takes to implement the protocol change, we are considering the following approach:
    • Chain signature is represented by a smart contract with the API to sign (for callers) and respond (for MPC signers).
    • For the time being, we implement a purely smart-contract based approach where sign keeps calling itself until the request is responded to by signers. We need to test whether it works.
    • At the same time, we keep working on Support for yielded execution #519. The goal is to locally change the mpc contract implementation later so that developers are not affected.
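The interim approach, where sign keeps calling itself, is bounded by the prepaid gas of the original call. The sketch below models that bound only; it is not the MPC contract. `MAX_PREPAID_TGAS` uses the 300 TGas single-call limit mentioned earlier in the thread, while `gas_per_hop`, `PollResult` and `poll` are hypothetical and the per-hop cost is an arbitrary input.

```rust
/// Single function call gas limit mentioned earlier in the thread, in TGas.
pub const MAX_PREPAID_TGAS: u64 = 300;

/// Outcome of the self-calling loop (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum PollResult {
    /// Signers responded by the time the given number of self-calls had run.
    Responded { hops: u64 },
    /// Gas ran out first; the caller receives an error.
    OutOfGas { hops: u64 },
}

/// Simulates `sign` calling itself until the response is available.
/// `gas_per_hop` is the assumed cost of one self-call in TGas (must be > 0);
/// `response_ready_at_hop` is the hop at which signers have responded.
pub fn poll(gas_per_hop: u64, response_ready_at_hop: u64) -> PollResult {
    assert!(gas_per_hop > 0, "each self-call must cost some gas");
    let mut gas_left = MAX_PREPAID_TGAS;
    let mut hops = 0;
    loop {
        if gas_left < gas_per_hop {
            return PollResult::OutOfGas { hops };
        }
        gas_left -= gas_per_hop;
        hops += 1;
        if hops >= response_ready_at_hop {
            return PollResult::Responded { hops };
        }
    }
}
```

At an assumed 5 TGas per hop the loop can run at most 60 hops, so a response that arrives by hop 60 is delivered, while one that would only arrive at hop 100 ends in `OutOfGas`. How many blocks 60 hops span is not modelled here.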

@encody

encody commented Dec 18, 2023

Re: storage costs.

Lightweight Yields

Couple of questions: is it reasonable to assume that a YieldReceipt may only be resumed by the contract that created it? It seems like it, given the examples being discussed, but I don't think it has been explicitly said. If so, storage costs can be reduced, even bounded. (If not, ignore the rest of this, I guess.) It should not drastically change the usability of the API: external actors can still resume yields via a cross-contract call to this contract, which then resumes the requested yield.

First, it obviates the need to specify a resuming contract ID, and it could eliminate the need to specify a resuming function name, if a standard one is agreed-upon (not required for this optimization, and it leads to slightly worse devx imo).

However, it would also, and more importantly, eliminate the need to support promise chaining (.thenable-ness), which is a source of dynamic memory consumption, because all promise chains can be calculated and submitted as normal from the resumed callback (and functionally-equivalently, since the YieldPromise and its resumption are technically still part of the same call chain).

An API similar to that described by @itegulov above:

// Yield to self (current account id), meaning only this contract can resume. Pre-allocates gas for the resume
// transaction.
YieldPromise::new(env::current_account_id(), {sign_request_id, payload, derivation, caller}, 30k TGas)
  .then(Promise::new(env::current_account_id(), "sign_on_finish"))

Can therefore be simplified to have an upper-bound in storage cost. The arguments struct in particular can be replaced in favor of a contract-generated resume promise ID (e.g. sign_request_id in this case). Let us say that this is a u128 (16 bytes). Additional data can be stored in normal storage, indexed by that ID, incurring normal storage costs, and retrieved by the resuming function as needed. This also would allow the contract to clean up some associated state for never-resumed yields.

| Element | Byte Cost |
| --- | --- |
| Resume Promise ID (contract-generated) | 16 |
| Gas Amount | 16 |
| Receipt ID (vm-internal) | 32 |
| Resume Function Name (optional) | 256 |
| Total | 320 |
| Total (without function name) | 64 |

The near_sdk could provide helpers or method wrappers that support an API similar to the above.

Of course, this is not a complete picture (e.g. I assume that keeping an action receipt around, unresolved, like the one referenced in the YieldPromise, costs more than just storing the bytes of its ID), but it may simplify some calculations.
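To make the bounded-storage idea concrete, here is a small sketch in plain Rust, using `HashMap` as a stand-in for contract storage. The byte sizes are taken from the table above; everything else (`SignRequest`, `PendingRequests`, `register`, `take`, `discard`, the counter-based ID) is hypothetical and only illustrates keeping a 16-byte ID in the yield while the remaining data lives in normal storage.

```rust
use std::collections::HashMap;

// Field sizes from the table above; illustrative, not actual protocol constants.
const RESUME_PROMISE_ID_BYTES: u64 = 16; // contract-generated u128
const GAS_AMOUNT_BYTES: u64 = 16;
const RECEIPT_ID_BYTES: u64 = 32; // VM-internal
const MAX_FUNCTION_NAME_BYTES: u64 = 256; // optional

/// Upper bound on the bytes stored per pending yield.
pub fn yield_storage_upper_bound(with_function_name: bool) -> u64 {
    let fixed = RESUME_PROMISE_ID_BYTES + GAS_AMOUNT_BYTES + RECEIPT_ID_BYTES;
    if with_function_name {
        fixed + MAX_FUNCTION_NAME_BYTES
    } else {
        fixed
    }
}

/// Data that would otherwise travel in the yield's argument struct.
#[derive(Debug, Clone, PartialEq)]
pub struct SignRequest {
    pub payload: [u8; 32],
    pub caller: String,
}

/// Stand-in for contract storage, indexed by the resume promise ID.
#[derive(Default)]
pub struct PendingRequests {
    next_id: u128,
    by_id: HashMap<u128, SignRequest>,
}

impl PendingRequests {
    /// Stores the request and returns the 16-byte ID that the yield would carry.
    pub fn register(&mut self, request: SignRequest) -> u128 {
        let id = self.next_id;
        self.next_id += 1;
        self.by_id.insert(id, request);
        id
    }

    /// Called from the resuming function: fetches and frees the stored data.
    pub fn take(&mut self, id: u128) -> Option<SignRequest> {
        self.by_id.remove(&id)
    }

    /// Frees the state of a yield that was never resumed; returns whether anything was freed.
    pub fn discard(&mut self, id: u128) -> bool {
        self.by_id.remove(&id).is_some()
    }

    /// Number of requests still waiting to be resumed or discarded.
    pub fn len(&self) -> usize {
        self.by_id.len()
    }
}
```

In this arrangement the variable-sized data incurs ordinary storage costs paid by the contract, and the per-yield cost seen by the protocol stays within the 64 or 320 bytes computed by `yield_storage_upper_bound`.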

@akhi3030
Contributor

Thanks for the feedback, @encody. The plan is indeed that only the contract that yielded a receipt can then resume it.

@saketh-are

saketh-are commented Jan 5, 2024

Here's how the proposed yield_create/yield_resume API could fit into the MPC contract. Thanks @itegulov for some helpful input on this already.

pub struct MpcContract {
    protocol_state: ProtocolContractState,
    // Maps yielded promise id to the requested payload
    pending_requests: LookupMap<PromiseId, [u8; 32]>,
}

impl MpcContract {
    // Called by the end user; accepts a payload and returns a signature.
    pub fn sign(&mut self, payload: [u8; 32]) -> Promise {
        let promise = yield_create(
            // Callback with return type Option<Signature>
            Self::ext(env::current_account_id()).sign_on_finish(),
            YIELD_NUM_BLOCKS,
            STATIC_GAS_ALLOTMENT,
            GAS_WEIGHT
        );
    
        self.pending_requests.insert(env::promise_id(promise), payload);
        promise
    }
    
    // Called by an MPC node to submit a completed signature
    pub fn sign_respond(&mut self, request_promise_id: PromiseId, signature: Signature) {
        // Authorize the caller (the MPC node), not this contract's own account
        assert_is_allowed_to_respond!(env::predecessor_account_id());
        
        if let Some(payload) = self.pending_requests.get(&request_promise_id) {
            assert_valid_sig!(payload, signature);
            
            // The arg tuple passed here needs to match the type signature of the callback function
            yield_resume(request_promise_id, (request_promise_id, Ok(signature),));
        } else {
            env::panic_str("Unexpected response");
        }
    }

    // Callback made after the yield has completed, whether due to resumption or due to timeout
    pub fn sign_on_finish(
        &mut self,
        request_promise_id: PromiseId,
        signature: Result<Signature, PromiseError>
    ) -> Option<Signature> {
        self.pending_requests.remove(&request_promise_id);
        signature.ok()
    }
}

@akhi3030
Contributor

akhi3030 commented Jan 5, 2024

Thank you @saketh-are! Some questions:

  • when sign_respond is calling yield_resume, what is the type of signature there? It seems like it is String and it seems like sign_on_finish takes signature as a Result. I do not understand how this change of type is handled and where this happens.
  • when sign_respond calls yield_resume, why is signature inside a tuple and separate from the PromiseId? I understand that PromiseId is needed by the system to look up the previously yielded execution. However, this API seems to suggest a constraint on sign_on_finish: that it must take PromiseId as its first argument. Instead, wouldn't it make more sense for the call to be yield_resume(request_promise_id, (request_promise_id, signature))?

@saketh-are

Thanks, I have edited above:

  • Just a simple error, it should say Signature everywhere. At a lower level I believe a signature will be represented as two strings, but the details there shouldn't meaningfully change the interaction with yield/resume.
  • I like this suggestion, updated accordingly.

@akhi3030
Contributor

akhi3030 commented Jan 5, 2024

Thanks for making the change. I think the following change might still be needed to make the types match up:

-            yield_resume(request_promise_id, (request_promise_id, signature,));
+            yield_resume(request_promise_id, (request_promise_id, Ok(signature),));

Otherwise, this looks good to me.
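To spell out why the `Ok(...)` wrapper is needed, here is a standalone sketch in plain Rust. `PromiseId`, `Signature`, and `PromiseError` are simplified stand-ins defined locally for illustration, and `FinishArgs` and `resume_args` are hypothetical helpers; the point is only that the tuple handed to yield_resume must have the same type as the callback's parameters.

```rust
// Simplified local stand-ins for the types used in the contract example.
pub type PromiseId = u64;

#[derive(Debug, Clone, PartialEq)]
pub struct Signature(pub String);

#[derive(Debug, Clone, PartialEq)]
pub enum PromiseError {
    Failed,
}

/// The argument tuple that the callback expects.
pub type FinishArgs = (PromiseId, Result<Signature, PromiseError>);

/// Mirrors the shape of `sign_on_finish`: a successful result yields the signature,
/// while an error (for example a timeout) yields `None`.
pub fn sign_on_finish(args: FinishArgs) -> Option<Signature> {
    let (_request_promise_id, signature) = args;
    signature.ok()
}

/// Builds the tuple handed to `yield_resume`. Passing a bare `signature` here
/// would not type-check against `FinishArgs`; wrapping it in `Ok` does.
pub fn resume_args(id: PromiseId, signature: Signature) -> FinishArgs {
    (id, Ok(signature))
}
```

With this shape, the resumption path and the timeout path can share one callback: the former supplies `Ok(signature)` and the latter an `Err`.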

@saketh-are

saketh-are commented Jan 16, 2024

After some tinkering on implementation I think it makes more sense to design the host functions in the following way:

promise_await_data(account_id, yield_num_blocks) -> (Promise, DataId);

promise_submit_data(data_id, data);

Simply, promise_await_data creates a Promise which will resolve to the data passed through promise_submit_data.

We can rely on the composability of promises to attach a callback and consume the data as desired. MpcContract::sign in the example shared above would instead look like:

impl MpcContract {
    pub fn sign(&mut self, payload: [u8; 32]) -> Promise {
        let (promise, data_id) = promise_await_data(
            env::current_account_id(),
            YIELD_NUM_BLOCKS
        );

        self.pending_requests.insert(data_id, payload);

        promise.then(Self::ext(env::current_account_id()).sign_on_finish())
    }
}
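The intended semantics of the two host functions can be modelled with a toy runtime in plain Rust. This is only a sketch under assumptions: `MockRuntime`, `Resolution`, `advance_block`, and `resolution` are invented for illustration, the `account_id` argument and the returned `Promise` are omitted for brevity, and the exact block at which a timeout fires is a guess rather than specified behaviour.

```rust
use std::collections::HashMap;

pub type DataId = u64;

/// How an awaited promise ended up resolving in this model.
#[derive(Debug, PartialEq)]
pub enum Resolution {
    Data(Vec<u8>),
    TimedOut,
}

/// Toy stand-in for the runtime's bookkeeping of awaited data.
#[derive(Default)]
pub struct MockRuntime {
    block_height: u64,
    next_id: DataId,
    /// data_id -> last block height at which data may still be submitted
    pending: HashMap<DataId, u64>,
    resolved: HashMap<DataId, Resolution>,
}

impl MockRuntime {
    /// Registers a promise that waits up to `yield_num_blocks` blocks for data.
    pub fn promise_await_data(&mut self, yield_num_blocks: u64) -> DataId {
        let id = self.next_id;
        self.next_id += 1;
        self.pending.insert(id, self.block_height + yield_num_blocks);
        id
    }

    /// Resolves a pending promise with `data`.
    /// Returns false if the id is unknown, already resolved, or timed out.
    pub fn promise_submit_data(&mut self, data_id: DataId, data: Vec<u8>) -> bool {
        if self.pending.remove(&data_id).is_some() {
            self.resolved.insert(data_id, Resolution::Data(data));
            true
        } else {
            false
        }
    }

    /// Advances one block and times out every promise whose deadline has passed.
    pub fn advance_block(&mut self) {
        self.block_height += 1;
        let height = self.block_height;
        let expired: Vec<DataId> = self
            .pending
            .iter()
            .filter(|(_, deadline)| **deadline < height)
            .map(|(id, _)| *id)
            .collect();
        for id in expired {
            self.pending.remove(&id);
            self.resolved.insert(id, Resolution::TimedOut);
        }
    }

    /// Looks up how a promise resolved, if it has resolved at all.
    pub fn resolution(&self, data_id: DataId) -> Option<&Resolution> {
        self.resolved.get(&data_id)
    }
}
```

In this model a callback attached with `.then(...)` would run as soon as `resolution` becomes `Some`, receiving either the submitted data or a timeout error.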

@akhi3030
Contributor

@saketh-are: I like this simplification very much! Question: why do we need to include the account_id when calling promise_await_data()? Should it not always be the current account ID?
