Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal for incorporating explicit state intialization and association #22

Closed
JustusAdam opened this issue Nov 17, 2018 · 3 comments
Closed

Comments

@JustusAdam
Copy link
Member

JustusAdam commented Nov 17, 2018

Problem description and solution requirements

The Rust backend for Ohua requires an explicit association between the state of
a stateful function and the function itself. As a relic from Java this has up to
this point been implicit (as a class) or linked using a backend mechanism such
as clojure metadata.

This implicit approach is unfeasible in Rust, because the implicit association
of a class is not present, and more importantly the reflection mechanisms for
discovering initializers for state do not exist.

Furthermore we believe that an explicit state association will provide more
flexibility for the user and may be used to implement state sharing stateful
functions.

Requirements

  • it must be obvious, either from the code directly or from derived information
    which functions have state
  • the state initializer must be explicitly mentioned in the code to ensure
    the backend does not require reflection to find it.

Additionally our proposals so far are also designed such that they may be used
in the future to express state sharing. While this is not a necessary property
to solve the issue at hand, it is a desirable property.

Solution variants

We believe that new syntax may be necessary to implement state initialization and
association in a user friendly way.

Three variants were discussed so far.

Arrow assignment with implicit association

let s <- initState in
  statefulFunction s a b c

In arrow syntax a special new assignment operator <- (or similar) is used to
explicitly initialize a state cell. This binding (s in the example) can
only be used as a first argument to a stateful function and implicitly
associates as the statefulFunction's state.

Advantages

  • The state can be bound in a certain scope (like an smap), which will make it
    possible in the future to do dynamically created/auto resetting state
  • The created binding s can be supplied to multiple functions, allowing state
    sharing.

Disadvantages

  • Possible confusion over the fact s cannot be used in the same way other
    bound values can, despite seeming to be a regularly bound value.

    • This can be fixed by letting the compiler support the programmer by giving out error messages or warnings.
  • The operator <- is not self explanatory. It bears no inherent association
    with state as such.

    • This may be fixed by using something like an <s- arrow.
  • Scoped intialization does not exist yet which means we have to require users
    to bind the state in the outermost scope so that their code does not subtly
    break one we introduce the feature. However doing this is tedious, especially
    in the presence of algorithms, where having to explicitly provide state not
    only bloats code, but to me also feels like leaking implementation detail.

    • This can be fixed (rather easily) by requiring (for now) that state initializers have to be independent, i.e., they do not receive any input (except maybe from an Env arg). This can be enforced via the compiler. As such, an algorithm can have its own initializers.

    Even in the presence of scoped bindings we may want to provide additional
    syntax for locally initializing state that is automatically bound at the
    outermost scope with the very reasons just mentioned.

  • Implicitly associating state via the first argument of a function makes it
    less obvious for the user whether a function is stateful or not. Rather than
    being able to tell immediately from the syntax, one has to find the binding
    site for the first argument and consider its "type"

    • This may not be such a hard challenge because in the very end this feature is at the type system level. We essentially introduce a State a type. As such, finding out whether this is state or not is no harder than fixing any other type error. It is essentially the same problem for the programmer.
  • Finding which functions share state requires tracking down the use sites of s.

    • The compiler can help here again and report/warn on shared state.

with operator (explicit initializer association)

(statefulFunction with initState) a b c

let boundSf = statefulFunction with initState in
boundSf a b (boundSf c)

let (sharingSf1, sharingSf2) = function1, function2 with initState in
function1 a (function2 b c)

The with operator associates a state initializer with a function. with is a
binary operator and its rhs must be a callable expression in the target
language1. State sharing is done by either reusing an
already bound function (boundSf) or initializing multiple stateful functions
with the same function (function1 and function2).

1: The precise nature of the RHS is variable. We can use
a function or maybe even expression, which would allow for dynamically created
state.

Advantages

  • Actual state values are never directly handled in the ohua code. As a result
    state cells cannot be confused with other let bound values.
  • Because the state association happens at the use site, it is immediately clear
    whether the function is stateful or not. No need to chase down the binding
    site of the first argument.
  • with always binds state globally, thus no need to pay special attention to
    the future possibility of scoped state.
  • Functions sharing state always occur together, making it easy to see who is
    sharing a particular state.

Disadvantages

  • No possibility for scoped state. Could be fixed by adding something like a
    local keyword to scope state binding. let boundSf = statefulFunction with local initState in ....
  • Requiring all sharing functions to associate on the same with feels somewhat
    less flexible than using <-.
  • The point of with is to make state sharing more explicit but then it is possible to just use those created functions in different places. So, I essentially have the same problem again, such that we did not solve the problem of finding all function calls that share state in one place.
    • This design needs to be accompanied with compiler support that makes sure these functions that bind state are used only once.
    • Given an algorithm that wants to be parameterized over its state, it feels a bit weird to parameterize it over a stateful function.

Regular let with explicit state association

let s = initState in
(statefulFunction on s) a b c

let s = initState in
let boundSf = statefulFunction on s in
statefulFunction s b (statefulFunction c)

This variant is a blend of <- and with. It explicitly associates a state
cell with a stateful function using on. However unlike with it allows
binding the state cell earlier using let which enables state sharing by using
the bound state with multiple functions or using an on bound function multiple
times.

Depending on the availability of dynamic state binding and scoped state certain
restrictions may be placed on what the RHS of on has to be. For instance, in
the beginning we require that it resolve to an env binding.

Advantages

  • Like the <- variant state can be bound in scopes. Enabling dynamic state
    binding and scoped state.
  • Straight forward state sharing through multiple uses of s.
  • Like with stateful functions are easy do distinguish from stateless ones,
    because on occurs at the use site.

Disadvantages

  • The semantics of s are unclear with respect to whether it is usable as a
    regular value. It is unclear whether it can or should be allowed to be used as
    regular data. If not, this may also lead to the same confusion as was the case
    with the arrow variant.
  • Same scoped state problem as the arrow variant.
  • Finding functions that share state requires tracking down use sites of s.

On new syntax

I think it is worth mentioning that we may not necessarily require actual new
syntax for this feature. A similar effect may also be achieved by using a
function (such as init) that is especially recognized by the compiler.

statefulFunction (init initState) a b c

let s = init initState in
statefulFunction s a b (statefulFunction s c)

Similar as with <- init values would be traced through the code and
implicitly become the state of a function to which they are the first argument.

The same advantages and disadvantages as with <- apply.

One advantage of this approach over all the others is that no new syntax needs
to be introduced. This means that handling init can be entirely done in the
core library with no change to any of the parsers.
As a disadvantage, there can not be a stateful function named init, similar to the new keyword in languages like Java.

@sertel
Copy link
Contributor

sertel commented Nov 19, 2018

Thanks @JustusAdam for putting together this proposal.

This one is very important, so we should understand what that is conceptually.
Here is my take on it:

I see this as an introduction of a special type into our language. One can view it like this: There exist two (higher-kinded) types in our system

  • Data a This is the type for all value that flow in between the functions inside our algorithms.
  • State a This is the type for all values that stick with a function.

More formally, "stick" means that this function actually turns a function f:: a -> b into f:: (a,s) -> (b,s). I say stick because semantically, we/Ohua executes these functions for example inside an smap by reusing the new state for the next invocation, i.e., the state sticks with the function (for its lifetime or until it is reassigned). In Haskell, such functions of the Kleisli category are represented in terms of the State monad and the according Ohua-like execution is achieved via the concept of a value recursion using mfix. This is also what we do: We essentially fold over the state.

I like the discussion above about making the state aspect obvious to the programmer in two places:

  • at creation
  • at sticking state to a function

But after all, we are talking about a type-system aspect here, so if the programmer gets guided by the compiler then making it only obvious during creation is fine for me.
At the moment, I do favor the on-variant.
But whatever syntactic sugar we decide upon, in the end this should desugar into a normal let in the typed lambda caluclus of ALang.
(See issue #23 for my proposal to have a language for this syntatic sugar to keep ALang clean.)

@JustusAdam
Copy link
Member Author

A short note on how I implemented this:

I am currently in the process of implementing this both in the compiler and the parsers.
I decided to go with an operator named with but with the semantics of on. The reason is because I think with, as a word, makes more sense.

JustusAdam added a commit that referenced this issue Dec 10, 2018
Refactored old code to comply with the removed `Sf` pattern
JustusAdam added a commit that referenced this issue Dec 10, 2018
Changed lowering pass accordingly
JustusAdam added a commit that referenced this issue Dec 11, 2018
Fixed the `Arcs` type (now compound, direct and state arcs are separated)
Changed the `Arc` family of types (its a single `Arc` type now and the rest aliases)
Implemented passes necessary for state binding normalization
Also simplified the other inlining passes using compositions of smaller passes
Changed `DFFnRef`. It makes more sense now
Extended the DFLang parser to work with state args
JustusAdam added a commit that referenced this issue Dec 12, 2018
It now prints state arguments
JustusAdam added a commit that referenced this issue Dec 12, 2018
Swapped braces for brackets
JustusAdam added a commit to ohua-dev/docs that referenced this issue Dec 12, 2018
JustusAdam added a commit to ohua-dev/alang-sexpr-parser that referenced this issue Dec 12, 2018
JustusAdam added a commit to ohua-dev/alang-clike-parser that referenced this issue Dec 12, 2018
JustusAdam added a commit to ohua-dev/alang-ml-parser that referenced this issue Dec 12, 2018
@sertel
Copy link
Contributor

sertel commented Jan 17, 2019

I have the basic implementation in place. We will follow up aspects such as shared state etc. in other issues once we get there.

@sertel sertel closed this as completed Jan 17, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants