Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reducing boilerplate required for ockam::Worker implementations #1564

Closed
thomcc opened this issue Jul 19, 2021 · 3 comments
Closed

Reducing boilerplate required for ockam::Worker implementations #1564

thomcc opened this issue Jul 19, 2021 · 3 comments
Assignees

Comments

@thomcc
Copy link
Contributor

thomcc commented Jul 19, 2021

So, I was tasked with brainstorming ways to reduce boilerplate for ockam workers. The two most promising routes that I was able to come up with were.

  1. Eliminate the Context associated type, which exists because of the split between ockam_core and ockam_node. This seems blocked by WIP feat(rust): add no_std + alloc support to enable embedded device support for ockam #1521, and even then, it's fairly nontrivial — it probably requires moving ockam_node::Context into ockam_core.

    Alternatively, if "true" async trait support lands (which I've heard is plausible by early next year, due to GAT support coming a lot further than expected recently, more options open up there).

    Either way, it seems important to do eventually though, since it's a bit of a design wart that Context is specified in that way, given that it should always be ockam::Context (at the moment).

  2. Add a shorter way to define simple workers.


The second of these is where I focused. I don't think the design I ended up with at the moment justifies its own existence quite yet, but there are a few plausible extensions to it which I didn't have time to get working, which probably would justify themselves.

Essentially, the goal would be for you to do:

ctx.start_function_worker("echoer", move |ctx, msg| async move {
    ctx.send(msg.return_route(), msg.body()).await
}).await?;

While this example doesn't use it, the closure would be able to access variables in scope, which could hold state. This would give it more or less full functionality of a Worker, with the notable exception being that it couldn't customize the initialize/shutdown behavior. For this reason (and the fact that there are a lot of good reasons to prefer explicitly specifying the state held by a struct to implicitly doing so via a closure), I don't think this should replace the current way of defining workers, just augment it.


So: I didn't get this working all the way, I'm not convinced it's impossible, but it's certainly difficult, and could require GATs (thankfully, these seem on track for this year according to rust-lang/rust#44265, and already nearly fully functional on nightly, so designs using them can be explored prior to them landing).

What I got working is summarized below (see https://github.com/thomcc/ockam/tree/func_worker for the implementation):

fn echoer<'ctx>(ctx: &'ctx mut Context, msg: Routed<String>) -> Pin<Box<dyn Future<Output = Result<()>> + Send + 'ctx>> {
    Box::pin(async move {
        ctx.send(msg.return_route(), msg.body()).await
    })
}

// later...
ctx.start_function_worker("echoer", echoer).await?;

While I didn't get this far, this was an exploratory sketch, and part of the idea was that the echoer function could be defined in a more ergonomic manner, specifically something like this should be entirely possible:

#[ockam::worker]
async fn echoer(ctx: &mut Context, msg: Routed<String>) -> Result<()> {
    ctx.send(msg.return_route(), msg.body()).await
}

This would require the ockam::worker macro become more than just a reexport of #[async_trait]. It would have to detect being used on a function, and translate the 2nd function definition into the first (doing so should be straightforward).


Unfortunately, this provides fairly limited benefit — it only allows for simple, stateless workers... the kind that show up all the time in example code, but not in the real world.

If arbitrary closures are supported, then the use case expands potentially to any worker that has a short body, and doesn't need custom initialize/shutdown implementations, which would have far more use-cases.

Concretely, for the example at the start to be possible, two things are needed that aren't supported by the code in my exploratory branch.

  1. Any async fn can be used — it shouldn't need a proc macro to type erase the generics.
  2. Any closure can be used (or rather, any FnMut) and not just data-less closures / function pointers.

(Note: I'm using "any" a bit loosely here — of course there would be bounds of Send and 'static in many places, and the signature would have to be correct, etc).

These are actually hard for the same reason, which is it seems like you need a higher ranked impl Trait result, e.g. you really want to say is:

// This would be part of the bound of `start_function_worker`
where
    // ...
    Func: 'static + Send +
        for<'ctx> FnMut(&'ctx mut Context, Routed<Msg>)
            // This next line is not legal Rust.
            -> impl 'ctx + Send + Future<Output = Result<()>>,

Except that you can't say this, and the workarounds for this lose the higher ranked 'ctx. (Perhaps there's something that can be done here using the Captures<'a> pattern, although it's not totally clear it can be applied here...).

The way you often work around this is another layer of type-level indirection, which probably will work, and if we assume GATs will be available, would likely be greatly simplified. Without them, I suspect it might be a bit involved, (On the bright side, it should be totally internal — the user wouldn't see it at all, ideally).

I didn't have time to investigate this, since it requires fully untangling several moving parts (for example, the internals of how async_trait works, just to start). It also might require a tweak to how ockam::Worker is defined (but also maybe not, again, I didn't finish my investigation).

Anyway, it's not 100% clear to me this is definitely worth the trouble. I think in a lot of cases having an explicit struct would be preferred over using a closure, even if a closure is an option. That said, this use case would greatly reduce the boilerplate for a lot of use cases.

@mrinalwadhwa
Copy link
Member

mrinalwadhwa commented Jul 20, 2021

@thomcc this is an awesome design exploration. Thank you for spending time on it 🙏

I look at your notes on:

#[ockam::worker]
async fn echoer(ctx: &mut Context, msg: Routed<String>) -> Result<()> {
    ctx.send(msg.return_route(), msg.body()).await
}

and imagine that they imply this should also be possible:

pub struct Echoer;

#[ockam::worker]
impl Worker for Echoer {
    async fn handle_message(&mut self, ctx: &mut Context, msg: Routed<String>) -> Result<()> {
        println!("Address: {}, Received: {}", ctx.address(), msg);
        ctx.send(msg.return_route(), msg.body()).await
    }
}

Which I think is a great win if we have to keep the #[ockam::worker] macro for other reasons.


I also really like your thoughts on this option. I think it's great for convenience in simple scenarios.

ctx.start_function_worker("echoer", move |ctx, msg| async move {
    ctx.send(msg.return_route(), msg.body()).await
}).await?;

@thomcc
Copy link
Contributor Author

thomcc commented Jul 20, 2021

and imagine that they imply this should also be possible:

pub struct Echoer;

#[ockam::worker]
impl Worker for Echoer {
    async fn handle_message(&mut self, ctx: &mut Context, msg: Routed<String>) -> Result<()> {
        println!("Address: {}, Received: {}", ctx.address(), msg);
        ctx.send(msg.return_route(), msg.body()).await
    }
}

Yes, this is possible in most cases. I think it probably wouldn't work if a user did something like:

type Message = Routed<String>;
#[ockam::worker]
impl Worker for Echoer {
    async fn handle_message(&mut self, ctx: &mut Context, msg: Message) -> Result<()> {
        println!("Address: {}, Received: {}", ctx.address(), msg);
        ctx.send(msg.return_route(), msg.body()).await
    }
}

Maybe there's a way to get this to work, a trait to extract the Routed type (similar to how GetCorrespondingTryType<T> works in the try trait v2 RFC), but I don't see it. That said, perhaps this isn't that big of a deal.

Although, thinking more... by adjusting the definition of Worker to require Message be the Routed type, this case could be made to work too. This would require an additional trait that's implemented only for Routed<M> where M: Message, but it would probably be largely transparent.

One reason that I didn't pursue this approach is that might be a bit confusing, it's essentially the proc macro implementing a variant of rust-lang/rust#29661, where the default is pulled out of the usage site. That's not really a deal breaker, but it would be a bit surprising to come across. I think in general the two lines of boilerplate this saves probably won't bother most users of the library — whereas a proc macro that changes how trait associated types work might be seen as fighting the language, and might bother users.

That said, it could be done in a largely transparent way — that is, only emit the associated type declarations if the user doesn't already explicitly provide them. This might still be somewhat confusing, although if a user doesn't like it, they would be free not to use it.

Sorry if this is a bit disjointed — I just woke up 😅. Also sorry if my responses are somewhat delayed today, I won't be at my computer for much of it.

@mrinalwadhwa
Copy link
Member

Not disjointed at all, that made perfect sense. Have a great day 👋

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants