Way to 0.1 #1

Open
15 tasks
tyranron opened this issue Aug 12, 2021 · 8 comments
Assignees
Labels
feature New feature or request k::design Related to overall design and/or architecture k::example Related to usage examples k::toolchain Related to project toolchain roadmap Roadmap for multiple steps
Milestone

Comments

@tyranron
Member

Project layout

  • arcana-core (core/ dir): contains core abstractions
  • arcana-codegen-impl (codegen/impl/ dir): contains codegen implementations
  • arcana-codegen (codegen/ dir): proc-macro shim crate for arcana-codegen-impl
  • arcana (project root): umbrella crate uniting all others behind feature-gates

Roadmap

  • Core design:
    • Events handling
    • Aggregates/read models(?) and event sourcing
    • Commands handling
    • Commands gateway
    • Unit of work
    • Repository
    • Queries handling
    • Queries gateway
    • Background processing
    • Sagas
  • Examples:
    • Example mini-chat project
  • Toolchain:
    • Makefile
    • GitHub Actions CI
    • Auto-releasing on tags
    • Issue/PR templates
@tyranron tyranron added roadmap Roadmap for multiple steps feature New feature or request k::design Related to overall design and/or architecture k::example Related to usage examples k::toolchain Related to project toolchain labels Aug 12, 2021
@tyranron tyranron added this to the 0.1.0 milestone Aug 12, 2021
@ilslv ilslv mentioned this issue Aug 13, 2021
30 tasks
@ilslv
Member

ilslv commented Aug 23, 2021

Dealing with EventVersion

First of all, we should understand that EventVersion causes trouble only when Events are deserialized. And since client data can have any format, generalising the behaviour is quite hard.

I propose a fairly bare-bones solution that keeps the maximum amount of flexibility:

// arcana

use std::marker::PhantomData;

use serde::de::{DeserializeSeed, Deserializer};

trait DeserializeEvent<'de>: Sized {
    fn deserialize_event<D>(
        name: EventName,
        ver: EventVersion,
        deserializer: D,
    ) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>;
}

impl<'de, Ev> DeserializeEvent<'de> for Ev
where
    Ev: VersionedEvent + serde::Deserialize<'de>,
{
    // ...
}

struct DeserializeEventSeed<Ev> {
    pub name: EventName,
    pub ver: EventVersion,
    _event: PhantomData<Ev>,
}

impl<'de, Ev> DeserializeSeed<'de> for DeserializeEventSeed<Ev>
where
    Ev: DeserializeEvent<'de>,
{
    type Value = Ev;

    fn deserialize<D>(self, deserializer: D) -> Result<Self::Value, D::Error>
    where
        D: Deserializer<'de>,
    {
        // Hand the name/version metadata over together with the deserializer.
        Ev::deserialize_event(self.name, self.ver, deserializer)
    }
}

// client code


#[derive(serde::Deserialize, VersionedEvent)]
#[event(name = "chat", version = 2)]
struct ChatEvent {
    id: String,
}

#[derive(serde::Deserialize, VersionedEvent)]
#[event(name = "file", version = 2)]
struct FileEvent {
    id: String,
}

#[derive(DeserializeEvent, Event)]
enum Event {
    #[event(deserialize(from = v1::ChatEvent))]
    Chat(ChatEvent),
    FileV2(FileEvent),
    FileV1(v1::FileEvent),
}

mod v1 {
    #[derive(serde::Deserialize, VersionedEvent)]
    #[event(name = "chat", version = 1)]
    pub struct ChatEvent {
        id: u16,
    }

    impl From<ChatEvent> for super::ChatEvent {
        fn from(ev: ChatEvent) -> Self {
            Self {
                id: ev.id.to_string(),
            }
        }
    }

    #[derive(serde::Deserialize, VersionedEvent)]
    #[event(name = "file", version = 1)]
    pub struct FileEvent {
        file: Vec<u8>,
    }
}

DeserializeEvent trait

The main reason to introduce this trait is that serde traverses the deserialized data only once, which can introduce unnecessary overhead.

For example:

#[derive(Serialize, Deserialize)]
#[serde(tag = "type")]
enum Message {
    Request { id: String, method: String, params: Params },
    Response { id: String, result: Value },
}

// JSON representation
// {"type": "Request", "id": "...", "method": "...", "params": {...}}

Until serde encounters the type field, it collects everything else into a serde_value::Value-like struct. That introduces dynamic allocations for the inner Box<Value>, which we might want to avoid.

DeserializeEvent derive-macro

There are 2 different possibilities for Events with the same EventName but different EventVersions:

  1. There is a From<V1> for V2 implementation
    In that case we simply indicate this relation with the #[event(deserialize(from = ...))] attribute
#[derive(DeserializeEvent, Event)]
enum Event {
    #[event(deserialize(from = v1::ChatEvent))]
    Chat(ChatEvent),
    // ...
}
  2. There is no From<V1> for V2 implementation
    In that case we can't really be sure what the user wants to happen: calling another EventSourced impl, a fallible conversion into another type, or simply returning an error. All of these are viable solutions, so we just deserialize the versions into different enum variants for handling
#[derive(DeserializeEvent, Event)]
enum Event {
    // ...
    FileV2(FileEvent),
    FileV1(v1::FileEvent)
}
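
One way to picture what such a derive could boil down to is a plain dispatch on the (name, version) pair. The sketch below is std-only and purely illustrative: the `decode` function, the string payload standing in for a real `serde::Deserializer`, and the type aliases are assumptions, not arcana's API.

```rust
// Hypothetical stand-ins for the real `EventName`/`EventVersion` types.
type EventName<'a> = &'a str;
type EventVersion = u16;

#[derive(Debug, PartialEq)]
pub struct ChatEvent {
    pub id: String,
}

#[derive(Debug, PartialEq)]
pub struct FileEvent {
    pub id: String,
}

pub mod v1 {
    #[derive(Debug, PartialEq)]
    pub struct ChatEvent {
        pub id: u16,
    }

    // Case 1: a lossless upgrade exists, so old data lands in the new variant.
    impl From<ChatEvent> for super::ChatEvent {
        fn from(ev: ChatEvent) -> Self {
            Self { id: ev.id.to_string() }
        }
    }

    #[derive(Debug, PartialEq)]
    pub struct FileEvent {
        pub file: Vec<u8>,
    }
}

#[derive(Debug, PartialEq)]
pub enum Event {
    Chat(ChatEvent),
    FileV2(FileEvent),
    // Case 2: no `From` impl, so the old version keeps its own variant.
    FileV1(v1::FileEvent),
}

/// Picks the concrete type to decode from the (name, version) pair.
pub fn decode(name: EventName<'_>, ver: EventVersion, payload: &str) -> Result<Event, String> {
    match (name, ver) {
        ("chat", 1) => {
            let id = payload
                .parse::<u16>()
                .map_err(|e| format!("bad v1 chat id: {}", e))?;
            Ok(Event::Chat(v1::ChatEvent { id }.into()))
        }
        ("chat", 2) => Ok(Event::Chat(ChatEvent { id: payload.to_owned() })),
        ("file", 1) => Ok(Event::FileV1(v1::FileEvent {
            file: payload.as_bytes().to_vec(),
        })),
        ("file", 2) => Ok(Event::FileV2(FileEvent { id: payload.to_owned() })),
        (n, v) => Err(format!("unknown event `{}` v{}", n, v)),
    }
}
```

With this shape, `decode("chat", 1, "42")` and `decode("chat", 2, "42")` both end up in `Event::Chat`, while the two file versions stay distinguishable for the caller.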

ack @tyranron

@tyranron
Member Author

@ilslv regarding serialization/deserialization, I'd like to temporarily keep that story aside from this project. Ideally, we shouldn't dictate this to library users at all. So implementing serialize/deserialize is totally their responsibility, not ours.

The question which needs discussion and investigation here is not about serialization/deserialization, but rather about Events evolving. Imagine that we have a large project which changes over time. Some new events appear (that's trivial), some disappear, some change in either a breaking or a non-breaking manner. How should we handle all of that?

Ideally, we don't want new events to bear the burden of their predecessors, and we want clear ways to avoid outdated versions and work only with new ones.

For example, we had this initially:

#[derive(event::Versioned)]
#[event(name = "email.added", version = 1)]
struct EmailAddedV1 {
    email: String,
}

And later we evolve to something like that:

#[derive(event::Versioned)]
#[event(name = "email.added", version = 2)]
struct EmailAddedV2 {
    email: String,
    by: UserId,
}

And we have a dilemma here:

  • Either we use a by: UserId field as our business rules dictate, and cannot have From<EmailAddedV1> for EmailAddedV2, as there is no meaningful value for the by field.
  • Or we use a by: Option<UserId> field and allow a handy From<EmailAddedV1> for EmailAddedV2 conversion, but at the price of eroding the strictness of the business rules.

Another question to investigate is how to better keep outdated events (modules layout, etc).

@ilslv
Member

ilslv commented Aug 24, 2021

Evolving schema

1. Extending an already existing event

Most of the time in this case we'll just add fields to some event.

There are 3 different ways of dealing with this situation:

  1. Creating From implementation
    Pros: only 1 EventSourced implementation needed
    Cons: events in the future can have a lot of Option fields to be able to transform from old ones

  2. Dealing with them as separate events
    Pros: stricter events definitions, which makes them harder to misuse
    Cons: having different EventSourced implementations, which makes it harder to understand what's really happening in the system

  3. Uniting events of different versions in enums
    This approach is a combination of the previous 2. We let developers decide whether they want to add a new strict variant without any Option fields, or replace the old variant with a less strict one.
    Pros: only 1 EventSourced implementation needed, refactoring-friendly
    Cons: less intuitive, may lead to more boilerplate (should be investigated)

/// Old event
#[derive(event::Versioned)]
#[event(name = "email.added", version = 1)]
struct EmailAddedV1 {
    email: String,
}

// 1. Creating `From` implementation

#[derive(event::Versioned)]
#[event(name = "email.added", version = 2)]
struct EmailAddedV2 {
    email: String,
    confirmed_by: Option<UserId>,
}

impl From<EmailAddedV1> for EmailAddedV2 {
    // ...
}

impl Sourced<EmailAddedV2> for S {
    // ...
}

// How it may look in the future

#[derive(event::Versioned)]
#[event(name = "email.added", version = 10)]
struct EmailAddedV10 {
    email: String,
    confirmed_by: Option<UserId>,
    a: Option<A>,
    lot: Option<Lot>,
    of: Option<Of>,
    optional: Option<Optional>,
    fields: Option<Fields>,
}

impl From<EmailAddedV1> for EmailAddedV10 {
    // ...
}

// ...

impl From<EmailAddedV9> for EmailAddedV10 {
    // ...
}

impl Sourced<EmailAddedV10> for S {
    // ...
}

// 2. Dealing with them as separate events

#[derive(event::Versioned)]
#[event(name = "email.added", version = 2)]
struct EmailAddedV2 {
    email: String,
    confirmed_by: UserId,
}

impl Sourced<EmailAddedV1> for S {
    // ...
}

impl Sourced<EmailAddedV2> for S {
    // ...
}

// How it may look in the future

impl Sourced<EmailAddedV4> for S {
    // ...
}

// ...

impl Sourced<EmailAddedV10> for S {
    // ...
}

// 3. Uniting events of different versions in enums

#[derive(event::Versioned)]
#[event(name = "email.added", version = 2)]
struct EmailAddedV2 {
    email: String,
    confirmed_by: UserId,
}

enum EmailAdded {
    V1(EmailAddedV1),
    V2(EmailAddedV2),
}

impl Sourced<EmailAdded> for S {
    // ...
}

// How it may look in the future

#[derive(event::Versioned)]
#[event(name = "email.added", version = 9)]
struct EmailAddedLegacy {
    email: String,
    confirmed_by: Option<UserId>,
    a: Option<A>,
    lot: Option<Lot>,
    of: Option<Of>,
    optional: Option<Optional>,
    fields: Option<Fields>,
}

#[derive(event::Versioned)]
#[event(name = "email.added", version = 10)]
struct EmailAddedV10 {
    email: String,
    confirmed_by: Option<UserId>,
    much: Much,
    stricter: Stricter,
    definition: Definition,
}

enum EmailAdded {
    Legacy(EmailAddedLegacy), // Converted from versions 1-9
    V10(EmailAddedV10), 
}

impl Sourced<EmailAdded> for S {
    // ...
}
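
To make option 3 a bit more concrete, here is a minimal sketch of what a single `Sourced<EmailAdded>` impl might look like once the versions are united in an enum. The `apply(&mut self, &Ev)` signature, the `User` state, and the `replay` helper are assumptions made for illustration, not the crate's confirmed API.

```rust
/// Hypothetical shape of the `Sourced` trait: fold an event into some state.
pub trait Sourced<Ev> {
    fn apply(&mut self, event: &Ev);
}

#[derive(Clone, Debug, PartialEq)]
pub struct UserId(pub u64);

pub struct EmailAddedV1 {
    pub email: String,
}

pub struct EmailAddedV2 {
    pub email: String,
    pub confirmed_by: UserId,
}

pub enum EmailAdded {
    V1(EmailAddedV1),
    V2(EmailAddedV2),
}

/// Hypothetical aggregate state.
#[derive(Debug, Default, PartialEq)]
pub struct User {
    pub email: Option<String>,
    pub confirmed_by: Option<UserId>,
}

// One impl handles every version; each arm keeps its own strict fields.
impl Sourced<EmailAdded> for User {
    fn apply(&mut self, event: &EmailAdded) {
        match event {
            EmailAdded::V1(ev) => {
                self.email = Some(ev.email.clone());
                // V1 never recorded who confirmed the email.
                self.confirmed_by = None;
            }
            EmailAdded::V2(ev) => {
                self.email = Some(ev.email.clone());
                self.confirmed_by = Some(ev.confirmed_by.clone());
            }
        }
    }
}

/// Rebuilds the state by applying events in order.
pub fn replay(events: &[EmailAdded]) -> User {
    let mut user = User::default();
    for ev in events {
        user.apply(ev);
    }
    user
}
```

The looseness (`Option<UserId>`) lives in the state rather than in the newest event definition, which is the trade-off option 3 is aiming for.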

2. Renaming/removing event's fields

  1. Deserialization-based only
#[derive(event::Versioned)]
#[event(name = "email.added", version = 1)] // Version didn't change
struct EmailAddedV2 {
    #[event(alias(value))]
    email: String,
}
  2. Proc-macro + deserialization based approach
#[derive(event::Versioned)]
#[event(name = "email.added", version = 2)]
struct EmailAddedV2 {
    #[event(alias(value, version = 1))]
    email: String,
}

// May be expanded to different structs or remain single with version validation on deserialization

Both 1 and 2 require enforcing our own deserialization onto the developer. I don't consider that much of a problem, as serde is the de facto standard in the Rust ecosystem.

  3. Different version with From impl
#[derive(event::Versioned)]
#[event(name = "email.added", version = 1)]
struct EmailAddedV1 {
    value: String,
}

#[derive(event::Versioned)]
#[event(name = "email.added", version = 2)]
struct EmailAddedV2 {
    email: String,
}

impl From<EmailAddedV1> for EmailAddedV2 {
    // ...
}

/// May be combined with `3. Uniting events of different versions in enums` from previous step

I lean more towards this option, as renaming fields should be quite an infrequent use case.

3. Ignore entire event

  1. Introduce some middleware for filtering deserialized events
+---------+    +---------+
|         |    |         |
|  Event  +---->  Event  |
| Storage +----> Adapter +-->
|         |    |         |
+---------+    +---------+
  2. Use deserializer as filter
    This can squeeze out some performance by not deserializing some fields.
    This option is worth mentioning, but I don't think it is a viable solution, as it makes the code harder to understand and the performance benefit is negligible
+---------+    +--------------+   +---------+
|         |    |              |   |         |
|  Event  +---->              |   |  Event  |
| Storage +----> Deserializer +---> Adapter +--->
|         |    |              |   |         |
+---------+    +--------------+   +---------+

4. Split large event

+---------+     +---------+
|         |     |         |
|  Event  |     |  Event  +--n-->
| Storage +--1--> Adapter +--n-->
|         |     |         |
+---------+     +---------+
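
Both the "ignore" (1 to 0) and the "split" (1 to n) cases can be pictured as a `flat_map` over the stored events. The following is a minimal std-only sketch using `Iterator` in place of an async stream; the `Stored`/`Adapted` enums and the `adapt` function are hypothetical names chosen for the illustration.

```rust
/// Events as they come out of the storage (hypothetical).
#[derive(Debug, PartialEq)]
pub enum Stored {
    Obsolete,
    EmailAdded { email: String },
    EmailAddedAndConfirmed { email: String, confirmed_by: String },
}

/// Events as the event-sourced logic wants to see them (hypothetical).
#[derive(Debug, PartialEq)]
pub enum Adapted {
    EmailAdded { email: String },
    EmailConfirmed { confirmed_by: String },
}

/// Maps one stored event into zero, one, or several adapted events.
pub fn adapt_one(event: Stored) -> Vec<Adapted> {
    match event {
        // 3. Ignore entire event.
        Stored::Obsolete => Vec::new(),
        Stored::EmailAdded { email } => vec![Adapted::EmailAdded { email }],
        // 4. Split large event, preserving the relative order.
        Stored::EmailAddedAndConfirmed { email, confirmed_by } => vec![
            Adapted::EmailAdded { email },
            Adapted::EmailConfirmed { confirmed_by },
        ],
    }
}

/// The adapter sits between the storage and the consumer and stays lazy.
pub fn adapt<I>(events: I) -> impl Iterator<Item = Adapted>
where
    I: IntoIterator<Item = Stored>,
{
    events.into_iter().flat_map(adapt_one)
}
```

Because `flat_map` processes events one at a time, the order of the incoming events is preserved in the output.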

5. Transforming events based on some Context

We can't guarantee that it would be possible to deterministically transform an old event into a new one (although it should be the last resort), so the Event Adapter should have some Context to work with.
This Context may accumulate some events and transform them, but its logic still has to be as small as possible. But I do think that those transformations must be infallible.
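
As a rough illustration of a Context that accumulates events, the sketch below merges an old "added" event with a later "confirmed" one into a single new event. Everything here (`Old`, `Merged`, `Context`, `transform`) is a hypothetical example, and dropping unmatched confirmations is just one assumed way to keep the transformation infallible.

```rust
use std::collections::HashSet;

/// Old events as stored (hypothetical).
pub enum Old {
    EmailAdded { email: String },
    EmailConfirmed { email: String, by: String },
}

/// The new event that replaces the old pair (hypothetical).
#[derive(Debug, PartialEq)]
pub struct Merged {
    pub email: String,
    pub by: String,
}

/// Keeps only the minimum needed between events: emails awaiting confirmation.
#[derive(Default)]
pub struct Context {
    pending: HashSet<String>,
}

/// Infallible: an event either produces output or is absorbed/dropped.
pub fn transform(event: Old, ctx: &mut Context) -> Option<Merged> {
    match event {
        Old::EmailAdded { email } => {
            // Remember the email and emit nothing yet.
            ctx.pending.insert(email);
            None
        }
        Old::EmailConfirmed { email, by } => {
            if ctx.pending.remove(&email) {
                Some(Merged { email, by })
            } else {
                // A confirmation without a matching "added" is skipped.
                None
            }
        }
    }
}
```

The `&mut Context` parameter mirrors the idea that events are processed strictly in order, so the pending set is never accessed concurrently.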

Proposal

To solve the first 2 problems I propose a combination of 1.3 and 2.3. This should cover most use cases.

// Declarations of events 1-8 with `Deserialize` and `From` impls for `EmailAddedLegacy`

#[derive(event::Versioned, Deserialize)]
#[event(name = "email.added", version = 9)]
struct EmailAddedLegacy {
    email: String,
    confirmed_by: Option<UserId>,
    a: Option<A>,
    lot: Option<Lot>,
    of: Option<Of>,
    optional: Option<Optional>,
    fields: Option<Fields>,
}

#[derive(event::Versioned, Deserialize)]
#[event(name = "email.added", version = 10)]
struct EmailAddedV10 {
    email: String,
    confirmed_by: Option<UserId>,
    much: Much,
    stricter: Stricter,
    definition: Definition,
}

#[derive(Event, Deserialize)]
enum EmailAdded {
    Legacy(EmailAddedLegacy), // Converted from versions 1-9
    V10(EmailAddedV10), 
}

impl Sourced<EmailAdded> for S {
    // ...
}

Regarding problems 3-5, it looks like we should add a new abstraction layer between the event storage and the EventSourced logic. I'll investigate an ergonomic and easy-to-use abstraction for it.

Unresolved questions

Should we consider blue-green deployment, where some instances of the same service are producing old events while other instances have already been upgraded?

ack @tyranron

@ilslv
Member

ilslv commented Aug 24, 2021

Discussed:

Should we consider blue-green deployment, where some instances of the same service are producing old events while other instances have already been upgraded?

Backwards-compatibility is preserved (when old versions of the events are stored for a short amount of time), while forward-compatibility is not, which results in 500 responses for a short time, until the instance is updated.

Event Adapter

Sounds like the way to go

@ilslv
Member

ilslv commented Aug 26, 2021

First draft of EventAdapter

Base trait

trait EventTransformer<Event> {
    type Context: ?Sized;
    type Error;
    type TransformedEvent;
    type TransformedEventStream<'ctx>: Stream<Item = Result<
        Self::TransformedEvent,
        Self::Error,
    >> + 'ctx;

    fn transform(
        event: Event,
        context: &mut Self::Context,
    ) -> Self::TransformedEventStream<'_>;
}

EventTransformer is implemented for some Adapter struct and is generic over Event, so different Adapters can transform the same Events differently.

Design decisions

  1. &mut Context
    As we want to preserve events order, we shouldn't process them concurrently. This allows us to guarantee exclusive access to Context.
    Downside: This design wouldn't allow using something like a buffered adapter

  2. No &self or &mut self
    I don't really see how a reference to Self would be useful, as we can encapsulate all dependencies in &mut Context. But that can be easily added.

Alternatives

Replace &mut Context with &mut self and keep all context inside Self.
Downside: dependency injection becomes really hard.

Convenience traits

trait EventTransformStrategy<Event> {
    type Strategy;
}

To avoid implementing everything by hand we would provide some convenience Strategies

impl EventTransformStrategy<SkippedEvent> for Adapter {
    type Strategy = strategy::Skip;
}

strategy::Skip allows skipping entire events.

impl EventTransformStrategy<EmailConfirmed> for Adapter {
    type Strategy = strategy::AsIs;
}

strategy::AsIs just passes event as is.

impl EventTransformStrategy<EmailAdded> for Adapter {
    type Strategy = strategy::Into<EmailAddedOrConfirmed>;
}

strategy::Into uses impl From<EmailAdded> for EmailAddedOrConfirmed to convert events.

impl EventTransformStrategy<EmailAddedAndConfirmed> for Adapter {
    type Strategy = strategy::Split<EmailAddedOrConfirmed, 2>;
}

impl From<EmailAddedAndConfirmed> for [EmailAddedOrConfirmed; 2] {
    fn from(ev: EmailAddedAndConfirmed) -> Self {
        [
            EmailAdded { email: ev.email }.into(),
            EmailConfirmed {
                confirmed_by: ev.confirmed_by,
            }
            .into(),
        ]
    }
}

strategy::Split allows converting into several events at once.

These are just examples and we can provide many more Strategies to simplify our life.
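
For readers wondering how an associated `Strategy` type can drive the actual transformation, here is a simplified, synchronous sketch. It is an assumption-laden stand-in, not the draft's implementation: the `strategy::Strategy` trait, the free `transform` function, and returning `Vec` instead of a `Stream` are all invented for the illustration.

```rust
pub mod strategy {
    use std::marker::PhantomData;

    /// Hypothetical: turns one `Ev` into zero or more `Out` events.
    pub trait Strategy<Ev, Out> {
        fn transform(event: Ev) -> Vec<Out>;
    }

    pub struct Skip;
    pub struct AsIs;
    pub struct Into<T>(PhantomData<T>);
    pub struct Split<T, const N: usize>(PhantomData<T>);

    impl<Ev, Out> Strategy<Ev, Out> for Skip {
        fn transform(_: Ev) -> Vec<Out> {
            Vec::new()
        }
    }

    impl<Ev> Strategy<Ev, Ev> for AsIs {
        fn transform(event: Ev) -> Vec<Ev> {
            vec![event]
        }
    }

    impl<Ev, T: From<Ev>> Strategy<Ev, T> for Into<T> {
        fn transform(event: Ev) -> Vec<T> {
            vec![T::from(event)]
        }
    }

    impl<Ev, T, const N: usize> Strategy<Ev, T> for Split<T, N>
    where
        [T; N]: From<Ev>,
    {
        fn transform(event: Ev) -> Vec<T> {
            Vec::from(<[T; N] as From<Ev>>::from(event))
        }
    }
}

/// Simplified counterpart of `EventTransformStrategy` from the draft.
pub trait EventTransformStrategy<Ev> {
    type Strategy;
}

/// Runs whichever strategy adapter `A` picked for the event type `Ev`.
pub fn transform<A, Ev, Out>(event: Ev) -> Vec<Out>
where
    A: EventTransformStrategy<Ev>,
    A::Strategy: strategy::Strategy<Ev, Out>,
{
    <A::Strategy as strategy::Strategy<Ev, Out>>::transform(event)
}

#[derive(Debug, PartialEq)]
pub struct SkippedEvent;
#[derive(Debug, PartialEq)]
pub struct EmailAdded { pub email: String }
#[derive(Debug, PartialEq)]
pub struct EmailConfirmed { pub confirmed_by: String }
#[derive(Debug, PartialEq)]
pub struct EmailAddedAndConfirmed { pub email: String, pub confirmed_by: String }
#[derive(Debug, PartialEq)]
pub enum EmailAddedOrConfirmed { Added(EmailAdded), Confirmed(EmailConfirmed) }

impl From<EmailAdded> for EmailAddedOrConfirmed {
    fn from(ev: EmailAdded) -> Self {
        Self::Added(ev)
    }
}

impl From<EmailAddedAndConfirmed> for [EmailAddedOrConfirmed; 2] {
    fn from(ev: EmailAddedAndConfirmed) -> Self {
        [
            EmailAddedOrConfirmed::Added(EmailAdded { email: ev.email }),
            EmailAddedOrConfirmed::Confirmed(EmailConfirmed { confirmed_by: ev.confirmed_by }),
        ]
    }
}

pub struct Adapter;
impl EventTransformStrategy<SkippedEvent> for Adapter { type Strategy = strategy::Skip; }
impl EventTransformStrategy<EmailConfirmed> for Adapter { type Strategy = strategy::AsIs; }
impl EventTransformStrategy<EmailAdded> for Adapter {
    type Strategy = strategy::Into<EmailAddedOrConfirmed>;
}
impl EventTransformStrategy<EmailAddedAndConfirmed> for Adapter {
    type Strategy = strategy::Split<EmailAddedOrConfirmed, 2>;
}
```

A call such as `transform::<Adapter, _, EmailAddedOrConfirmed>(EmailAdded { .. })` then resolves the strategy at compile time, with no dynamic dispatch involved.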

Besides that, we didn't lose the ability to implement EventTransformer manually.

impl EventTransformer<Custom> for Adapter {
    type Context = dyn Any;
    type Error = Infallible;
    type TransformedEvent = EmailAddedOrConfirmed;
    type TransformedEventStream<'ctx> = stream::Empty<Result<EmailAddedOrConfirmed, Infallible>>;

    fn transform(
        _: Custom,
        _: &mut Self::Context,
    ) -> Self::TransformedEventStream<'_> {
        stream::empty()
    }
}

That impl is basically the same as strategy::Skip.

trait EventAdapter<Events> {
    type Context: ?Sized;
    type Error;
    type TransformedEvents;
    type TransformedEventsStream<'ctx>: Stream<Item = Result<Self::TransformedEvents, Self::Error>>
        + 'ctx;

    fn transform_all(
        events: Events,
        context: &mut Self::Context,
    ) -> Self::TransformedEventsStream<'_>;
}

impl<Adapter, Events> EventAdapter<Events> for Adapter
where
    Events: Stream + 'static,
    Adapter: EventTransformer<Events::Item> + 'static,
    Adapter::Context: 'static,
{
    type Context = Adapter::Context;
    type Error = Adapter::Error;
    type TransformedEvents = Adapter::TransformedEvent;
    type TransformedEventsStream<'ctx> = AdapterStream<'ctx, Adapter, Events>;

    fn transform_all(
        events: Events,
        context: &mut Self::Context,
    ) -> Self::TransformedEventsStream<'_> {
        AdapterStream::new(events, context)
    }
}

This trait comes with a blanket impl for any compatible type implementing EventTransformer, and allows transforming a Stream of incoming events and a Context into a transformed Stream. GATs allow doing it without any unnecessary dynamic allocations required for type erasure.
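
The blanket-impl idea can be tried out without async machinery. Below is a deliberately simplified sketch that swaps `Stream` for `Iterator` and collects eagerly into a `Vec`, so it needs neither GATs nor unsafe code; the trait shapes and the `Stored`/`Added` types are assumptions for illustration, not the draft's real definitions.

```rust
/// Simplified, synchronous stand-in for `EventTransformer`.
pub trait EventTransformer<Ev> {
    type Context;
    type Transformed;

    fn transform(event: Ev, context: &mut Self::Context) -> Vec<Self::Transformed>;
}

/// Simplified stand-in for `EventAdapter`, working over an `Iterator`.
pub trait EventAdapter<Events> {
    type Context;
    type Transformed;

    fn transform_all(events: Events, context: &mut Self::Context) -> Vec<Self::Transformed>;
}

// Blanket impl: anything that can transform a single item
// can transform a whole sequence of such items.
impl<A, Events> EventAdapter<Events> for A
where
    Events: Iterator,
    A: EventTransformer<Events::Item>,
{
    type Context = <A as EventTransformer<Events::Item>>::Context;
    type Transformed = <A as EventTransformer<Events::Item>>::Transformed;

    fn transform_all(events: Events, context: &mut Self::Context) -> Vec<Self::Transformed> {
        let mut out = Vec::new();
        for event in events {
            // Events are handled one by one, so `context` is never shared.
            out.extend(<A as EventTransformer<Events::Item>>::transform(event, context));
        }
        out
    }
}

/// Hypothetical input and output events.
pub enum Stored {
    Skipped,
    Added(String),
}

#[derive(Debug, PartialEq)]
pub struct Added(pub String);

pub struct Adapter;

impl EventTransformer<Stored> for Adapter {
    // The context here just counts how many events were seen.
    type Context = usize;
    type Transformed = Added;

    fn transform(event: Stored, seen: &mut usize) -> Vec<Added> {
        *seen += 1;
        match event {
            Stored::Skipped => Vec::new(),
            Stored::Added(email) => vec![Added(email)],
        }
    }
}
```

Calling `Adapter::transform_all(events.into_iter(), &mut seen)` then works for any iterator of `Stored`, without `Adapter` ever implementing `EventAdapter` by hand.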

Whole implementation flow

// Declare all possible input events

#[derive(Debug)]
struct SkippedEvent;

#[derive(Debug)]
struct EmailAddedAndConfirmed {
    email: String,
    confirmed_by: String,
}

#[derive(Debug)]
struct EmailAdded {
    email: String,
}

#[derive(Debug)]
struct EmailConfirmed {
    confirmed_by: String,
}

// Unite them in an enum, deriving `EventTransformer`

#[derive(Debug, From, EventTransformer)]
#[event(transform(into = EmailAddedOrConfirmed, context = dyn Any))]
enum InputEmailEvents {
    Skipped(SkippedEvent),
    AddedAndConfirmed(EmailAddedAndConfirmed),
    Added(EmailAdded),
    Confirmed(EmailConfirmed),
}

// Declare enum of output events

#[derive(Debug, From)]
enum EmailAddedOrConfirmed {
    Added(EmailAdded),
    Confirmed(EmailConfirmed),
}

// Implement transformations

struct Adapter;

impl EventTransformStrategy<EmailAdded> for Adapter {
    type Strategy = strategy::AsIs;
}

impl EventTransformStrategy<EmailConfirmed> for Adapter {
    type Strategy = strategy::Into<EmailAddedOrConfirmed>;
}

impl EventTransformStrategy<EmailAddedAndConfirmed> for Adapter {
    type Strategy = strategy::Split<EmailAddedOrConfirmed, 2>;
}

impl From<EmailAddedAndConfirmed> for [EmailAddedOrConfirmed; 2] {
    fn from(ev: EmailAddedAndConfirmed) -> Self {
        [
            EmailAdded { email: ev.email }.into(),
            EmailConfirmed {
                confirmed_by: ev.confirmed_by,
            }
            .into(),
        ]
    }
}

impl EventTransformStrategy<SkippedEvent> for Adapter {
    type Strategy = strategy::Skip;
}


// Test Adapter

#[tokio::main]
async fn main() {
    let mut ctx = 1_usize; // Can be any type
    let events = stream::iter::<[InputEmailEvents; 4]>([
        EmailConfirmed {
            confirmed_by: "1".to_string(),
        }
        .into(),
        EmailAdded {
            email: "2".to_string(),
        }
        .into(),
        EmailAddedAndConfirmed {
            email: "3".to_string(),
            confirmed_by: "3".to_string(),
        }
        .into(),
        SkippedEvent.into(),
    ]);

    let collect = Adapter::transform_all(events, &mut ctx)
        .collect::<Vec<_>>()
        .await;

    println!("context: {}\nevents:{:?}", ctx, collect);
    // context: 1,
    // events: [
    //     Ok(Confirmed(EmailConfirmed { confirmed_by: "1" })), 
    //     Ok(Added(EmailAdded { email: "2" })), 
    //     Ok(Added(EmailAdded { email: "3" })), 
    //     Ok(Confirmed(EmailConfirmed { confirmed_by: "3" }))
    // ]
}

Complete example

Downsides

To implement the EventAdapter trait, we use a custom Stream with 1 line of unsafe code, as I couldn't figure out a way to do it safely.
An alternative is to use &Context everywhere, which would allow using a buffered adapter and providing a safe impl for the trait.

ack @tyranron

@tyranron
Member Author

@ilslv

EventTransformer trait

  1. &mut Context
    As we want to preserve events order, we shouldn't process them concurrently. This allows us to guarantee exclusive access to Context.
    Downside: This design wouldn't allow using something like a buffered adapter

Unsure about &mut. buffered things still allow preserving order while processing stuff concurrently. Using interior mutability for contexts is a common thing.

  2. No &self or &mut self
    I don't really see how a reference to Self would be useful, as we can encapsulate all dependencies in &mut Context. But that can be easily added.

One major argument for using &self is trait object safety, so someone will be able to use opaque dyn EventTransformers.

Another one is that dependencies vary, and it still may be meaningful to keep some of them in the Adapter rather than in the context.
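
A small sketch of both arguments: with a `&self` receiver the trait can be used as `dyn`, and an adapter can carry its own dependencies. The simplified trait below (returning `Vec`, no context or error type) and the `Skip`/`Tagger`/`run_all` names are hypothetical and only meant to show the shape.

```rust
/// Simplified, object-safe variant: every method takes `&self`.
pub trait EventTransformer<Ev> {
    type Transformed;

    fn transform(&self, event: Ev) -> Vec<Self::Transformed>;
}

/// A transformer without any state.
pub struct Skip;

impl EventTransformer<String> for Skip {
    type Transformed = String;

    fn transform(&self, _: String) -> Vec<String> {
        Vec::new()
    }
}

/// A transformer that keeps a dependency in itself rather than in a context.
pub struct Tagger {
    pub tag: String,
}

impl EventTransformer<String> for Tagger {
    type Transformed = String;

    fn transform(&self, event: String) -> Vec<String> {
        vec![format!("{}:{}", self.tag, event)]
    }
}

/// Opaque transformers chosen at runtime.
pub type DynTransformer = Box<dyn EventTransformer<String, Transformed = String>>;

/// Feeds the same event through every transformer, in order.
pub fn run_all(transformers: &[DynTransformer], event: &str) -> Vec<String> {
    transformers
        .iter()
        .flat_map(|t| t.transform(event.to_owned()))
        .collect()
}
```

A trait whose only method is an associated function without a receiver could not be turned into `dyn EventTransformer` like this unless that function were bounded by `Self: Sized`.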

EventTransformStrategy trait

strategy::Split allows to convert into several events at once.

It seems that an HList might be a better fit there than an array.


Okay, let's start with implementing that as a "step 1" for the events adapting story. Along with the implementation, please provide a full set of examples covering all possible situations, so we'll see how it plays out and how it will evolve in the future.

Additional type-level restrictions to ensure more variants being met, let's do in a "step 2" after merging "step 1".

@ilslv
Member

ilslv commented Aug 27, 2021

@tyranron

Unsure about &mut. buffered things still allow preserving order while processing stuff concurrently. Using interior mutability for contexts is a common thing.

Agreed, especially since in practice Context will hold some reference to a DbPool, which already provides interior mutability. I've implemented it with &mut to demonstrate the hardest constraints.

One major argument for using &self is trait object safety, so someone will be able to use opaque dyn EventTransformers.

Very good point, I entirely missed it.

It seems that an HList might be a better fit there than an array.

It's just a PoC. Of course we should provide some mechanism that allows varying the number of emitted elements based on the content of the input event, which the array approach doesn't.

tyranron added a commit that referenced this issue Aug 30, 2021
- bootstrap project structure

Additionally:
- bootstrap CI pipeline
- bootstrap Makefile

Co-authored-by: tyranron <[email protected]>
tyranron added a commit that referenced this issue Sep 2, 2021
…to single Rust type (#3, #1)

Additionally:
- support event::Initial in derive macros

Co-authored-by: tyranron <[email protected]>
@ilslv ilslv mentioned this issue Sep 3, 2021
16 tasks
tyranron added a commit that referenced this issue Sep 3, 2021
- add event::Sourcing trait for more handy dynamic dispatch usage

Co-authored-by: tyranron <[email protected]>
@ilslv
Member

ilslv commented Feb 14, 2022

@tyranron regarding our discussion of how tracing::Span::current() works and whether it can be used for Context, I've made a PoC recreating its basic capabilities. I guess with a bit more time I can remove the redundant clones.
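
For readers unfamiliar with the idea of an ambient "current" value, the following std-only sketch shows one possible shape: a thread-local stack that is pushed on entry and popped on exit. It is not the PoC mentioned above and makes no claim about how the tracing crate is implemented; `with_context` and `current` are hypothetical names.

```rust
use std::cell::RefCell;

thread_local! {
    // Stack of entered contexts for the current thread.
    static STACK: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

/// Runs `f` with `ctx` as the current context, restoring the previous one after.
/// Note: this simple version does not pop the context if `f` panics.
pub fn with_context<R>(ctx: &str, f: impl FnOnce() -> R) -> R {
    STACK.with(|s| s.borrow_mut().push(ctx.to_owned()));
    let out = f();
    STACK.with(|s| {
        s.borrow_mut().pop();
    });
    out
}

/// Returns a clone of the innermost context, if any is entered.
pub fn current() -> Option<String> {
    STACK.with(|s| s.borrow().last().cloned())
}
```

Code deep inside a call chain can then call `current()` without the context being threaded through every function signature, at the cost of a clone per lookup in this naive form.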


2 participants