Node

Basic requirements

The node must not block in any way. That is, the node must never get stuck waiting indefinitely on any operation, whatever the cause. Typical causes are a slow RPC server and a slow or unstable network.
It must be possible to shut down the node cleanly and reasonably quickly (within a few seconds at most).
Restarting the node must not lose any information about previously executed actions, e.g. filled requests, claims, challenges, etc.
The node should avoid serializing anything to non-volatile storage.

Node architecture

In order to satisfy the above requirements, we opted for a simple thread-based approach in which all software components that could potentially block run in their own threads. In particular, these are ContractEventMonitor and EventProcessor.

ContractEventMonitor

ContractEventMonitor listens for blockchain events emitted by a given contract, decodes the events into an internal event representation (event types in raisync.events) and forwards the decoded events to the EventProcessor. The event monitor does not query the JSON-RPC server for events directly. Rather, it does that via an instance of EventFetcher. The event fetcher handles communication with the JSON-RPC server and can block. However, even if the event fetcher for, say, the L2a chain blocks, it won't block the entire node, because it is only invoked by the contract event monitor, which runs in its own thread. Therefore, even if a JSON-RPC server is very slow or the connection is otherwise unreliable, the node as a whole will remain responsive.
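The arrangement described above could look roughly like the following. This is a minimal sketch using only the standard library, not the actual raisync code: the constructor arguments, the poll_period parameter, the start/stop methods and the fake_fetch helper are all assumptions made for illustration. The fetch callable stands in for an EventFetcher and on_events stands in for the EventProcessor entry point.

```python
import threading
import time
from typing import Callable, List


class ContractEventMonitor:
    """Illustrative monitor: polls a fetcher from its own thread."""

    def __init__(
        self,
        fetch: Callable[[], List[dict]],
        on_events: Callable[[List[dict]], None],
        poll_period: float = 0.05,  # hypothetical parameter
    ) -> None:
        self._fetch = fetch          # stand-in for EventFetcher; may block
        self._on_events = on_events  # stand-in for the EventProcessor
        self._poll_period = poll_period
        self._stop = threading.Event()
        # A daemon thread cannot keep the process alive if fetch() hangs.
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self, timeout: float = 2.0) -> None:
        self._stop.set()
        # Bounded join: shutdown stays quick even if the fetcher is stuck.
        self._thread.join(timeout)

    def _run(self) -> None:
        while not self._stop.is_set():
            events = self._fetch()  # only this thread can block here
            if events:
                self._on_events(events)
            # wait() returns early as soon as stop() is called.
            self._stop.wait(self._poll_period)


received: List[dict] = []
batches = [[{"name": "RequestCreated", "chain_id": 1}]]


def fake_fetch() -> List[dict]:
    # Returns one batch of events, then nothing.
    return batches.pop() if batches else []


monitor = ContractEventMonitor(fake_fetch, received.extend)
monitor.start()
time.sleep(0.2)
monitor.stop()
```

Because the join in stop() has a timeout, a hung fetcher delays shutdown by at most that timeout rather than indefinitely.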

With the current contract implementation, we have one contract, RequestManager, deployed on the source chain (L2a), and one contract, FillManager, deployed on the target chain (L2b). Therefore we have two event monitors, one for each contract, and each event monitor has its own event fetcher, as can be seen in the figure above. Each pair (EventFetcher, ContractEventMonitor) works independently of the other, allowing for very different speeds between the L2 chains.

EventProcessor

EventProcessor implements the Raisync protocol logic. It receives events from the event monitors and stores them in a list. It is important to note that events are not separated based on the chain they came from -- all events are stored in a single list, in the order they arrived, regardless of the originating chain.

(Implementation detail: each event object, an instance of an event type from raisync.events, has a chain_id attribute that can be used to identify the chain that event came from.)
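As an illustration of that detail, here is a small sketch of how events from both chains can share one list and still be told apart. The class layout, the request_id field and the chain ID constants are assumptions made for the example; only the chain_id attribute and the event names come from the text above.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical chain IDs, chosen only for this example.
L2A_CHAIN_ID = 1
L2B_CHAIN_ID = 2


@dataclass(frozen=True)
class Event:
    chain_id: int  # identifies the chain the event came from


@dataclass(frozen=True)
class RequestCreated(Event):
    request_id: int


@dataclass(frozen=True)
class RequestFilled(Event):
    request_id: int


# A single list holds events from both chains, in arrival order.
events: List[Event] = [
    RequestFilled(chain_id=L2B_CHAIN_ID, request_id=7),
    RequestCreated(chain_id=L2A_CHAIN_ID, request_id=7),
]

# The originating chain can still be recovered when needed.
from_l2b = [event for event in events if event.chain_id == L2B_CHAIN_ID]
```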

The thread of EventProcessor will typically sleep until something interesting happens. Delivery of fresh events by one or both of the contract event monitors is just such a case, which triggers the following:

processing events
processing requests
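The sleep-until-woken behaviour can be sketched with a threading.Event, as below. This is an assumption about one possible mechanism, not a description of the actual implementation; add_events, start, stop and the log attribute are hypothetical names, and the two processing steps are reduced to log entries.

```python
import threading
import time
from typing import List


class EventProcessor:
    """Illustrative processor thread that sleeps until events arrive."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._events: List[object] = []
        self._have_new_events = threading.Event()
        self._stop = threading.Event()
        self.log: List[str] = []  # records what the thread did
        self._thread = threading.Thread(target=self._run, daemon=True)

    def add_events(self, events: List[object]) -> None:
        # Called from the monitor threads.
        with self._lock:
            self._events.extend(events)
        self._have_new_events.set()  # wake the processor thread

    def start(self) -> None:
        self._thread.start()

    def stop(self, timeout: float = 2.0) -> None:
        self._stop.set()
        self._have_new_events.set()  # wake the thread so it can exit
        self._thread.join(timeout)

    def _run(self) -> None:
        while True:
            # Sleep until woken; the timeout bounds the wait.
            self._have_new_events.wait(timeout=1.0)
            if self._stop.is_set():
                return
            if self._have_new_events.is_set():
                # Clear before processing so later deliveries wake us again.
                self._have_new_events.clear()
                self.log.append("process events")
                self.log.append("process requests")


processor = EventProcessor()
processor.start()
processor.add_events(["RequestCreated"])
time.sleep(0.3)
processor.stop()
```

Clearing the flag before processing, rather than after, means that events delivered during processing trigger another pass instead of being missed.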

The first part, processing events, consists of going through the list of all events and trying to create new requests or modify the state of the corresponding requests. That process might not always succeed for every event. Consider, for example, the case where a RequestFilled event was received from L2b, but the corresponding RequestCreated event had not been seen yet. In that case, the RequestFilled event will simply be left as-is and will be retained in the event list. All events that have been successfully handled will be dropped from the event list.

Successfully handling an event typically means modifying the state of the Request instance corresponding to the event. To that end, EventProcessor makes use of RequestTracker facilities to keep track of, and access, all requests. The request state is, unsurprisingly, kept on the Request object itself.

An important thing to note is that processing events may consist of multiple iterations, meaning, one may need to iterate over the list of events to process more than once. To see why that is the case, consider the following list of events:

[RequestFilled, RequestCreated, ClaimCreated]

One can see that the RequestFilled arrived first, before the RequestCreated, which it depends on. That can happen since RequestFilled is emitted on L2b, and RequestCreated is emitted on L2a. With different blockchain speeds, network speeds and JSON-RPC speeds, one could even say that such a situation might not be rare. Obviously, the RequestFilled event cannot be processed because the corresponding Request has not been created yet. So in the first iteration, the event processor skips the RequestFilled and processes the RequestCreated event:

Iteration 0
-----------
Event list                                                           Action
[(RequestFilled), RequestCreated, ClaimCreated]                      request does not exist, ignore
[RequestFilled, (RequestCreated), ClaimCreated]                      create a new request
[RequestFilled, RequestCreated, (ClaimCreated)]                      request with the given ID exists, 
                                                                     but is in state pending, cannot claim

New event list:
[RequestFilled, ClaimCreated]

The second iteration looks like this:

Iteration 1
-----------
Event list                                                           Action
[(RequestFilled), ClaimCreated]                                      mark the request as filled
[RequestFilled, (ClaimCreated)]                                      mark the request as claimed

New event list:
[]

Once an iteration completes without making any changes to the state, the processing of events is considered done. In other words, a new iteration over the event list will start only if there were changes in the previous iteration. This is because those changes could have enabled processing of other events in the list. But once no further changes are possible and the event list cannot be reduced any more, the event processor moves on to the second part.
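The iteration scheme walked through above can be sketched as a loop that repeats until a pass handles no events. This is a simplified illustration rather than the node's actual code: events are plain (name, request ID) tuples, request state is a string in a dict, and the handle and process_events functions are hypothetical.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, int]  # (event name, request ID)


def handle(event: Event, requests: Dict[int, str]) -> bool:
    """Try to apply one event; return True if it was handled."""
    name, request_id = event
    state = requests.get(request_id)
    if name == "RequestCreated" and state is None:
        requests[request_id] = "pending"
        return True
    if name == "RequestFilled" and state == "pending":
        requests[request_id] = "filled"
        return True
    if name == "ClaimCreated" and state == "filled":
        requests[request_id] = "claimed"
        return True
    return False  # not applicable yet; keep the event for a later pass


def process_events(events: List[Event], requests: Dict[int, str]) -> int:
    """Iterate until a pass makes no changes; return the number of passes."""
    passes = 0
    while True:
        passes += 1
        # Keep only the events that could not be handled in this pass.
        remaining = [event for event in events if not handle(event, requests)]
        changed = len(remaining) != len(events)
        events[:] = remaining
        if not changed:
            return passes


events: List[Event] = [
    ("RequestFilled", 1),
    ("RequestCreated", 1),
    ("ClaimCreated", 1),
]
requests: Dict[int, str] = {}
passes = process_events(events, requests)
```

For the example list, two passes make changes and a third pass confirms that nothing is left to do, so the loop stops after three passes with an empty event list.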

The second part, processing requests, consists of going through all requests and checking whether there is an action that needs to be done. For example, if a pending request is encountered, the event processor may issue a fillRequest transaction. Similarly, if a filled request is encountered and it was our node that filled it, the event processor may issue a claim transaction. Here again the request tracker is used to access the requests.
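A rough sketch of this second part follows. It only decides which transactions would be issued and returns them as a list. Actually sending them and updating request state is left out. The Request fields, the filler attribute, the addresses and the pending_actions function are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Request:
    id: int
    state: str
    filler: Optional[str] = None  # hypothetical: address that filled it


def pending_actions(
    requests: Iterable[Request], our_address: str
) -> List[Tuple[str, int]]:
    """Return (transaction name, request ID) pairs the node may issue."""
    actions: List[Tuple[str, int]] = []
    for request in requests:
        if request.state == "pending":
            # Nobody has filled this request yet, so we may fill it.
            actions.append(("fillRequest", request.id))
        elif request.state == "filled" and request.filler == our_address:
            # We filled it ourselves, so we may claim.
            actions.append(("claim", request.id))
    return actions


tracked = [
    Request(1, "pending"),
    Request(2, "filled", filler="0xus"),
    Request(3, "filled", filler="0xother"),
    Request(4, "claimed", filler="0xus"),
]
actions = pending_actions(tracked, our_address="0xus")
```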

Request

The Request object holds the information about a submitted request and the associated state, including any claims that were made. The state machine is depicted by the following figure.

The state machine uses the python-statemachine package to declare states and transitions. Auto-generated transition methods like fill or claim are then used by the event processor to update the request state. This approach also ensures that only valid transitions are possible.

Request states mostly correspond to contract events, except for pending and filled-unconfirmed states. A request is in the initial state pending immediately after it is created. If a RequestFilled event is received, the request will move to the state filled. The process goes similarly for the claimed and withdrawn states, i.e. those states are entered when the corresponding blockchain events are processed.

The filled-unconfirmed state is an intermediate state used to mark requests that our node has filled, but for which the corresponding RequestFilled events have not been received yet. This is to avoid filling the same request more than once. Once the corresponding RequestFilled event arrives, the request proceeds to the state filled.
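To make the states and transitions concrete, here is a small hand-written sketch. It deliberately does not use python-statemachine. A plain transition table stands in for it so that the example depends only on the standard library. The set of transitions is inferred from the description above and may be incomplete, and the method names fill_unconfirmed and withdraw are hypothetical (only fill and claim are named in the text).

```python
class InvalidTransition(Exception):
    """Raised when a transition is not allowed from the current state."""


class RequestState:
    # transition name -> {source state: target state}
    # Inferred from the prose; the real state machine may differ.
    _TRANSITIONS = {
        "fill_unconfirmed": {"pending": "filled-unconfirmed"},
        "fill": {"pending": "filled", "filled-unconfirmed": "filled"},
        "claim": {"filled": "claimed"},
        "withdraw": {"claimed": "withdrawn"},
    }

    def __init__(self) -> None:
        self.current = "pending"  # initial state

    def _apply(self, name: str) -> None:
        try:
            self.current = self._TRANSITIONS[name][self.current]
        except KeyError:
            raise InvalidTransition(
                f"cannot {name} from state {self.current}"
            ) from None

    def fill_unconfirmed(self) -> None:
        self._apply("fill_unconfirmed")

    def fill(self) -> None:
        self._apply("fill")

    def claim(self) -> None:
        self._apply("claim")

    def withdraw(self) -> None:
        self._apply("withdraw")


request = RequestState()
request.fill_unconfirmed()  # our node sent the fill transaction
request.fill()              # RequestFilled event arrived
try:
    request.withdraw()      # not valid before the request is claimed
    rejected = False
except InvalidTransition:
    rejected = True
request.claim()
```

An invalid transition raises an exception and leaves the state unchanged, which mirrors the guarantee that only valid transitions are possible.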
