Skip to content

Commit

Permalink
Merge pull request #29 from anoma/mariari/why-elixir
Browse files Browse the repository at this point in the history
Create the Why Elixir blogpost
  • Loading branch information
0xapriori authored Oct 8, 2024
2 parents 54206cb + 9a6d5d0 commit 79d9f40
Show file tree
Hide file tree
Showing 5 changed files with 591 additions and 0 deletions.
Binary file added mariari/media/DistOpFuture.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added mariari/media/observer-1.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added mariari/media/observer-2.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
308 changes: 308 additions & 0 deletions mariari/why-elixir.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,308 @@
---
title: Anoma in Elixir
publish_date: 2024-10-08
category: ecosystem
image: media/DistOpFuture.png
excerpt: Anoma is a decentralized operating system with heterogeneous trust. For developing Anoma we've been using Elixir. This choice is somewhat unconventional, and thus begs the question "why write Anoma in Elixir!?"

---


## Summary

The article explains why Anoma is being built using the Elixir
programming language:

- Elixir, which runs on the Erlang virtual machine, lets developers
inspect and debug the system while it's running and update the code
without restarting
- Erlang was created to build robust, fault-tolerant networked systems
using lightweight processes that can crash without taking down the
whole system
- Elixir's process model is a natural fit for Anoma's architecture,
which is based around separate "engines" that communicate by passing
messages
- Elixir makes it easier to implement Anoma's design with less
accidental complexity, because the language matches the architecture
- Having the implementation closely match the specification will make
it easier to mathematically prove things about the properties of the
system down the road

In short, Elixir provides the right tools and abstractions to build
Anoma according to spec, while providing a good developer experience
and supporting future verification efforts.


<a id="orgc06199d"></a>

# Anoma: Why Elixir?

Anoma is a decentralized operating system with heterogeneous
trust. For developing Anoma we've been using [Elixir](https://elixir-lang.org/). This choice
is somewhat unconventional, and thus begs the question "why write
Anoma in Elixir!?". There are a plethora of reasons as to why [Elixir](https://elixir-lang.org/)
is a good choice. To motivate the reasoning, we will go first from
generic reasoning as to why it's a good fit for many projects before
diving deep into why it makes sense for Anoma in particular.


<a id="orgb7d7476"></a>

## Live Interactive Systems

When building any large complicated system, the number one feature one
wants is to be able to understand how the system is operating along
with being able to determine if there are mishaps when carrying out a
design. For most software projects, this means sitting down and
thinking about the system abstractly or tracing through the system
execution in the case of direct questions. Thankfully for us, we have
chosen [Elixir](https://elixir-lang.org/), which rests on the foundations of Erlang/OTP. In
practical terms, this means we get a live interactive system from
which we can ask the system questions.

For example, instead of having to sit and think about the
architecture of how the components of Anoma fit together, I can ask
the underlying system, "please give me the layout of processes for my
application":

![img](media/observer-1.png)

The specifics of the diagram doesn't matter. What does matter is that
the system can reflect on itself, helping to explain the very system
that we are building. Meaning, that the system we are building can be
in dialog with the builders to shape discussion around itself. No
longer is the system simply speculated about from the hazy
recollections that the engineers who made it have. A great upshot of
this is that if we have further questions about the system, we can
drill down and ask it more questions. For example what if we were
curious about one of those nodes in the diagram, what information can
we get from there?:

![img](media/observer-2.png)

As we can see, we now have diagnostics on the particular node along
with information about it's **state**, **stack trace**, etc.

This want and concern does not only apply to big complicated projects;
at every stage of development one wants to be able to play with the
software, change a few components, and recompile only the parts they
are interested in. Having the underlying Erlang/OTP available, allows
quicker feedback cycles by allowing the team to better understand the
system as we develop it.

To give a concrete list of features that Elixir utilizes to achieve
this is:

1. Hot code reloading
- This means we can have the software running locally, and be able
to change any part of it without having to shut down and restart
the service. This is invaluable for trying to experiment with new
ideas!
2. Underlying environmental reflection
- The underlying reflection gives the system the information it
needs to be able to report back crucial information about
itself. Without inroads to the system, an interactive system
would quickly turn opaque!
3. The entire system available (from parser, to compiler, to
visualization suite) from the user shell
- This allows a skilled engineer the ability to diagnose problems
and see practical implications of what they are doing. This is
apart of the [same technology that allowed people to debug a space
probe 100 million miles away!](https://flownet.com/gat/jpl-lisp.html)
4. A lack of artificial distinction between system and language.
- This means that the core functionality of the system is realized
through the language itself, proving itself capable enough to
compose a highly robust system.

With these properties and features laid out, this make Elixir quite
different from language in common use (Java, Rust, Python etc.)

With that said, there are [genres of languages](https://wiki.c2.com/?ImageBasedLanguage) that exhibit the same
features some of which have [superior visualization techniques](https://gtoolkit.com/). What is
it about the Erlang/OTP system that makes them special compared to
them?


<a id="org19197a5"></a>

## Designing Robust Systems

To understand what makes Elixir special, we first must go back to what
problems that Erlang and the Beam were meant to solve. Joe Armstrong's
thesis: [Making reliable distributed systems in the presence of
software errors](https://erlang.org/download/armstrong_thesis_2003.pdf), lays out their problem domain:

1. The system must handle very large numbers of concurrent activities
2. Actions must be performed at a certain point in time or within a
certain time.
3. Systems may be distributed over several computers.
4. The system is used to control hardware.
5. The software systems are very large.
6. The system exhibits complex functionality such as, feature
interaction.
7. The system should be in continuous operation for many years.
8. Software maintenance should be performed without stopping the
system.
9. There are stringet quality and reliability requirements
10. Fault tolerance both to hardware failures, and software errors,
must be provided.

In order to tackle these problems they created a philosophy and
ultimately requirements that the underlying operating system and
language should have.

In doing so Erlang/OTP, much like Anoma, defined out a language an
ultimately an operating system that tackles their very real problems.

To give a sample of the features the system has:

1. The system has very cheap processes
```elixir
iex(mariari@Gensokyo)10> :timer.tc(fn -> spawn(fn -> nil end) end)
{10, #PID<0.1689.0>}
```

- This snippet shows a process takes only 10 microseconds to spawn!
2. The architecture of the application is split into a series of
processes, each process are often referred to as actors.
- Actors are interesting objects as here are their central properties:
1. They contain some state they maintain
2. They can send and receive messages
3. Messages are processed sequentially (order is guaranteed)
between the same two actors!
3. Processes are isolated. Meaning that if a process crashes, it does
not affect other processes. [Erlang the movie](https://www.youtube.com/watch?v=BXmOlCy0oBM) gives a good example
of this at play on their real systems!
4. Processes are managed by a chain of [supervisors](https://www.erlang.org/doc/apps/stdlib/supervisor.html).
- Supervisors are interesting in that they look after many
processes (some of who may be supervisors). When a process they
are supervising crashes, they have policies on what happens to
the other processes under their control.
- It was realized that actors on their own are not enough to make a
robust system, and this is a crucial component that is missing
from most other actor models.
5. The mentality is "let it crash".
- With Supervisors this is a very potent strategy for creating
reliable and robust systems.
- When crashes happen early, we can isolate the errors, this is
hard to do when one has an Either/Result/Maybe type, as stack
information is often lost with those strategies. The supervisor
then can restart the crashed component helping the system
resume normal behavior.
- Further since parts of the system being down can be normal, one
typically thinks out a robust supervision system that lets the
system gracefully degrade. A common example is having a full
fledged system that does all the bells and whistles, however
when some external service is down (say you query an external
service for images or AI responses), then we can fall back on
simpler behavior until the issue is solved.
- This also helps in the case of underlying hardware errors, as
if something like a bit flip due to a cosmic ray were to
happen to a high integrity component, instead of the
application crashing and stopping, the application could crash
and sensible restart behavior can be had.
6. The system has fair concurrency, with it's own scheduler system.
- This means your processes aren't going to get starved of
resources.

These are just a few of the features that make the Erlang/OTP system
standout even among other interactive systems. These properties will
be vital when we consider how Anoma is modeled from both a code
perspective and a specification perspective.


<a id="org3dd3410"></a>

## Specs: Or What is an Engine.

Implementing Anoma comes after specifying Anoma. Natural language is
easier to use and parse than arbitrary code. This matters both for
explicating and understanding concepts.

So most high-level ideas for Anoma come in the form of the
specification and those are the ones that will make users initially
interested and excited about Anoma. To understand what Anoma has to
offer, the user will look at the specification. If they are interested
in the presented promises of the system, they become incentivized to
take a proper look at the code and engage with it.

What the user is looking for is for an instantiation of the presented
proposal with desired properties. That is, they need to establish that
a piece of code lying inside a repository is an actual implementation
of the system described.

To make this easy, the engineers need to have a good interface that
makes it easy to establish a connection between concepts central to
the specification and our code. Otherwise, the application becomes
hard to adopt and develop for any outside party. Having such
properties can fundamentally depend on your choice of a language.

For example, if you are interested in functions, it matters whether
you present id : Bool -> Bool as lambda x -> x or {{true, true},
{false, false}}. There is a reason why we don't use set theory
everywhere in mathematics. If you think of functions as something that
takes things in and pops things out, the type theoretic approach is
probably easier to comprehend and work with.

Following this idea, a good requirement for the desired language is to
have a good first-class citizen corresponding to the central concept
of the specification: the [engine](https://specs.anoma.net/latest/node_architecture/engines.html?h=engine). Almost every functionality in the
Anoma infrastructure is specified as a functionality of a particular
engine. Mempool behavior is implemented through the Mempool
Engine. Transaction execution is done through the Execution Engine.

An [engine](https://specs.anoma.net/latest/node_architecture/engines.html?h=engine) has several properties, some of the central ones being:

1. It is stateful
2. It can receive messages
3. It can send messages
4. Messages are processed sequentially

So what is an engine? Something that quacks like an agent, it seems!
Now what systems take agents as first-class citizens? Erlang-based
systems!

Having a good actor model of the language we use does not only allow
for an easier understanding of the codebase, greater engagement with
parties interested in Anoma development coming from the specification
side, it also grants us credence in future verification and auditing
work being done internally.

The formal verification efforts Anoma has are aimed not at the
verification of a particular implementation: implementations may
differ and canonical ones may change due to timely choices. Instead,
proving of system properties will be done through the formalization of
the [Anoma specification](https://specs.anoma.net/latest/). As it goes, a proof is only as good as the
initial formalization of the system. The closer the underlying
concepts of the specification and the implementation are, the more
believable it is that whatever we come to prove about Anoma will also
correspond to a proof about our Elixir implementation.


<a id="orgfdc8d20"></a>

## Conclusion

We have chosen Elixir to develop the Anoma node software. The Anoma
node consists of several independently moving pieces that have to be
orchestrated and connected (e.g., transaction execution, intra-node
communication, .. (insert some more here)). From a developer's
perspective it is beneficial to be able to introspect these individual
components at runtime, update them, and manipulate them.
Additionally, the Erlang VM offers us the basic building blocks to
build and connect such a system of independent processes in a robust
and scalable way.

The specification of Anoma dictates that any node consists of
different 'engines', which are isolted processes each responsible to
handle part of the Anoma node responsibilities. The Erlang VM its
computational model of processes with shared-nothing memory maps
nearly 1:1 on the architecture proposed in the Anoma
specification. This means that the implementation of the specification
will have much less accidental complexity compared to other
computational models.

Thus, choosing the Erlang VM to implement the Anoma specification
offers us better developer UX and reduces the impedance mismatch
between the specification of Anoma and its physical implementation.

Loading

0 comments on commit 79d9f40

Please sign in to comment.