[eeps] EEP ???: Value-Based Error Handling Mechanisms

Fred Hebert mononcqc@REDACTED
Tue Sep 4 23:07:03 CEST 2018


This EEP adds a contextual <~ operator to begin ... end expressions, 
which allows them to be usable for value-based error handling, based on 
standard {ok, term()} | {error, term()} return value types.

This lets begin ... end become a control flow construct to replace or 
simplify deeply-nested case ... end expressions, and prevent using 
exceptions for control flow.

I want to thank Jesper Louis Andersen, Sean Cribbs, Tristan Sloughter, 
Evan Vigil-McClanahan, Anthony Ramine, Bryan Paxton, and Pedram Nimreezi 
for their feedback on this document.

The draft is attached to this document, and also available (along with 
an incomplete reference implementation to explore some semantics) at 
https://bitbucket.org/ferd/unwrap

Sorry for sending a first copy without the attachment,
Fred.
-------------- next part --------------
    EEP: Pending
    Title: Value-Based Error Handling Mechanisms
    Version: $Revision$
    Last-Modified: $Date$
    Author: Fred Hebert <mononcqc@REDACTED>
    Status: Draft
    Type: Standards Track
    Content-Type: text/x-rst
    Created: 31-Aug-2018
    Post-History:
****

# EEP ???: Value-Based Error Handling Mechanisms #

Abstract
========

This EEP adds a contextual `<~` operator to `begin ... end` expressions,
which allows them to be usable for value-based error handling, based on
standard `{ok, term()} | {error, term()}` return value types.

This lets `begin ... end` become a control flow construct to replace or
simplify deeply-nested `case ... end` expressions, and prevent using
exceptions for control flow.

Copyright
=========

This document has been placed in the public domain.

Specification
=============

The current syntax for a `begin ... end` expression is:

```
begin
    Exprs
end
```

The expression does not have a restricted scope, and is mostly used to
group multiple distinct expressions as a single block. We propose a new
type of expressions (denoted `UnwrapExprs`), only valid within a
`begin ... end` expression:

```
begin
    Exprs | UnwrapExprs
end
```

`UnwrapExprs` are defined as having the following form:

```
Pattern <~ Expr
```

This definition means that `UnwrapExprs` are only allowed at the
top-level of `begin ... end` expressions.

The `<~` operator takes the value return by `Expr` and inspects it.
If the value is a tuple of the form `{ok, Val}`, it unwraps `Val` from
the tuple, and matches it against `Pattern`.

If the pattern matches, all variables from `Pattern` are bound in the
local environment, and the full value `{ok, Val}` is returned by the
`UnwrapExpr`. If the value does not match, a `{badunwrap, Val}` error
is raised.

A special case exists when `Pattern` is the match-all
variable (`_`), which on top of allowing the value to be considered a successful
unwrapping if the returned value from `Expression` is `{ok, term()}`,
it also considers the atom `ok` to be valid as well.

If the value is a tuple of the form `{error, Reason}`, then the entire
`begin ... end` expression is short-circuited and returns `{error,
Reason}`. The variables that were bound in there remain bound, the
rest are undefined.

The compiler should warn about any variable that is used after the
`begin ... end` expression that was bound in or after the first
`UnwrapExpr` encountered within the block.

If the value returned does not match any of `{ok | error, term()}` as a
type, a `{badunwrap, Val}` error is raised.

Given the structure described here, the final expression may look like:

```
begin
    Foo = bar(),
    X <~ id({ok, 5}),
    [H|T] <~ id({ok, [1,2,3]}),
    ...
end
```

Do note that to allow easier pattern matching and more intuitive usage,
the `<~` operator should have associativity rules lower than `=`, such that:

```
begin
    X = [H|T] <~ exp()
end
```

is a valid `UnwrapExp` equivalent to the non-infix form `'<~'('='(X, [H|T]),
exp())`, since reversing the priorities would give `'='('<~'(X, [H|T]),
exp())`, which would create an `UnwrapExp` out of context and be invalid.


Motivation
==========

Erlang has some of the most flexible error handling available across a
large number of programming languages. The language supports:

1. three types of exceptions (`throw`, `error`, `exit`)
 - handled by `catch Exp`
 - handled by `try ... [of ...] catch ... [after ...] end`
2. links, `exit/2`, and `trap_exit`
3. monitors
4. return values such as `{ok, Val} | {error, Term}`, `{ok, Val} |
   false`, or `ok | {error, Val}`
5. A combination of one or more of the above

So why should we look to add more? There are various reasons for this,
incuding trying to reduce deeply nested conditional expressions,
cleaning up some messy patterns found in the wild, providing a better
separation of concern when implementing functions, and encouraging more
standard and idiomatic interfaces.

Reducing Nesting
----------------

One common pattern that can be seen in Erlang is deep nesting of `case
... end` expressions, to check complex conditionals.

Take the following code taken from
[Mnesia](https://github.com/erlang/otp/blob/a0ae44f324576104760a63fe6cf63e0ca31756fc/lib/mnesia/src/mnesia_backup.erl#L106-L126),
for example:

```
commit_write(OpaqueData) ->
    B = OpaqueData,
    case disk_log:sync(B#backup.file_desc) of
        ok ->
            case disk_log:close(B#backup.file_desc) of
                ok ->
                    case file:rename(B#backup.tmp_file, B#backup.file) of
                       ok ->
                            {ok, B#backup.file};
                       {error, Reason} ->
                            {error, Reason}
                    end;
                {error, Reason} ->
                    {error, Reason}
            end;
        {error, Reason} ->
            {error, Reason}
    end.
```

The code is nested to the extent that shorter aliases must be introduced
for variables (`OpaqueData` renamed to `B`), and half of the code just
transparently returns the exact values each function was given.

By comparison, the same code could be written as follows with the new
construct:

```
commit_write(OpaqueData) ->
    begin
        _ <~ disk_log:sync(OpaqueData#backup.file_desc),
        _ <~ disk_log:close(OpaqueData#backup.file_desc),
        _ <~ file:rename(OpaqueData#backup.tmp_file, OpaqueData#backup.file),
        {ok, OpaqueData#backup.file}
    end.
```

The semantics of this call are entirely identical, except that it is now
much easier to focus on the flow of individual operations.

Obsoleting Messy Patterns
-------------------------

Frequent ways in which people work with sequences of failable operations
include folds over lists of functions, and abusing list comprehensions.
Both patterns have heavy weaknesses that makes them less than ideal.

Folds over list of functions use patterns such as those defined in
[posts from the
mailing](http://erlang.org/pipermail/erlang-questions/2017-September/093575.html):

```
pre_check(Action, User, Context, ExternalThingy) ->
    Checks =
        [fun check_request/1,
         fun check_permission/1,
         fun check_dispatch_target/1,
         fun check_condition/1],
    Args = {Action, User, Context, ExternalThingy},
    Harness =
        fun
            (Check, ok)    -> Check(Args);
            (_,     Error) -> Error
        end,
    case lists:foldl(Harness, ok, Checks) of
        ok    -> dispatch(Action, User, Context);
        Error -> Error
    end.
```

This code requires declaring the functions one by one, ensuring the
entire context is carried from function to function. Since there is no
shared scope between functions, all functions must operate on all
arguments.

By comparison, the same code could be implemented with the new construct
as:

```
pre_check(Action, User, Context, ExternalThingy) ->
    begin
        _ <~ check_request(Context, User),
        _ <~ check_permissions(Action, User),
        _ <~ check_dispatch_target(ExternalThingy),
        _ <~ check_condition(Action, Context),
        dispatch(Action, User, Context)
    end
```

And if there was a need for derived state between any two steps, it
would be easy to weave it in:

```
pre_check(Action, User, Context, ExternalThingy) ->
    begin
        _ <~ check_request(Context, User),
        _ <~ check_permissions(Action, User),
        _ <~ check_dispatch_target(ExternalThingy),
        DispatchData <~ dispatch_target(ExternalThingy),
        _ <~ check_condition(Action, Context),
        dispatch(Action, User, Context)
    end
```

The list comprehension _hack_, by comparison, is a bit more rare. In
fact, it is mostly theoretical. Some things that hint at how it could
work can be found in [Diameter test
cases](https://github.com/erlang/otp/blob/869537a9bf799c8d12fc46c2b413e532d6e3b10c/lib/diameter/test/diameter_examples_SUITE.erl#L254-L266)
or the [PropEr plugin for
Rebar3](https://github.com/ferd/rebar3_proper/blob/e7eb96498a9d31f41c919474ec6800df62e237e1/src/rebar3_proper_prv.erl#L298-L308).

Its overal form uses generators in list comprehensions to tunnel a happy
path:

```
[Res] =
    [f(Z) || {ok, W} <- [b()],
             {ok, X} <- [c(W)],
             {ok, Y} <- [d(X)],
             Z <- [e(Y)]],
Res.
```

This form doesn't see too much usage since it is fairly obtuse and I
suspect most people have either been reasonable enough not to use it, or
did not think about it. Obviously the new form would be cleaner:

```
begin
    W <~ b(),
    X <~ c(W),
    Y <~ d(X),
    Z = e(Y),
    f(Z)
end
```

which on top of it, has the benefit of returning an error value if one
is found.

Better Separation of Concerns
-----------------------------

This form is not necessarily obvious at a first glance. To better
expose it, let's take a look at some functions defined in the
[`release_handler` module in
OTP](https://github.com/erlang/otp/blob/869537a9bf799c8d12fc46c2b413e532d6e3b10c/lib/sasl/src/release_handler.erl#L1894-L1923):

```
write_releases_m(Dir, NewReleases, Masters) ->
    RelFile = filename:join(Dir, "RELEASES"),
    Backup = filename:join(Dir, "RELEASES.backup"),
    Change = filename:join(Dir, "RELEASES.change"),
    ensure_RELEASES_exists(Masters, RelFile),
    case at_all_masters(Masters, ?MODULE, do_copy_files,
                        [RelFile, [Backup, Change]]) of
        ok ->
            case at_all_masters(Masters, ?MODULE, do_write_release,
                                [Dir, "RELEASES.change", NewReleases]) of
                ok ->
                    case at_all_masters(Masters, file, rename,
                                        [Change, RelFile]) of
                        ok ->
                            remove_files(all, [Backup, Change], Masters),
                            ok;
                        {error, {Master, R}} ->
                            takewhile(Master, Masters, file, rename,
                                      [Backup, RelFile]),
                            remove_files(all, [Backup, Change], Masters),
                            throw({error, {Master, R, move_releases}})
                    end;
                {error, {Master, R}} ->
                    remove_files(all, [Backup, Change], Masters),
                    throw({error, {Master, R, update_releases}})
            end;
        {error, {Master, R}} ->
            remove_files(Master, [Backup, Change], Masters),
            throw({error, {Master, R, backup_releases}})
    end.
```

At a glance, it is very difficult to clean up this code: there are 3
multi-node operations (backing up, updating, and moving release data),
each of which relies on the previous one to succeed.

You'll also notice that each error requires special handling, reverting
or removing specific operations on success or on failure. This is not a
simple question of tunnelling values in and out of a narrow scope.

Another thing to note is that this module, as a whole (and not just the
snippet presented here) uses `throw` expressions to operate non-local
return. The actual point of return handling these is spread through
various locations in the file:
[`create_RELEASES/4`](https://github.com/erlang/otp/blob/869537a9bf799c8d12fc46c2b413e532d6e3b10c/lib/sasl/src/release_handler.erl#L381-L388),
and
[`write_releases_1/3`](https://github.com/erlang/otp/blob/869537a9bf799c8d12fc46c2b413e532d6e3b10c/lib/sasl/src/release_handler.erl#L1864-L1881)
for example.

The `case catch Exp of` form is used throughout the file because
value-based error flow is painful in nested structures.

So let's take a look at how we could refactor this with the new
construct:

```
write_releases_m(Dir, NewReleases, Masters) ->
    RelFile = filename:join(Dir, "RELEASES"),
    Backup = filename:join(Dir, "RELEASES.backup"),
    Change = filename:join(Dir, "RELEASES.change"),
    begin
        _ <~ backup_releases(Dir, NewReleases, Masters, Backup, Change,
                             RelFile),
        _ <~ update_releases(Dir, NewReleases, Masters, Backup, Change),
        _ <~ move_releases(Dir, NewReleases, Masters, Backup, Change, RelFile)
    end.

backup_releases(Dir, NewReleases, Masters, Backup, Change, RelFile) ->
    case at_all_masters(Masters, ?MODULE, do_copy_files,
                        [RelFile, [Backup, Change]]) of
        ok ->
            ok;
        {error, {Master, R}} ->
            remove_files(Master, [Backup, Change], Masters)
            {error, {Master, R, backup_releases}}
    end.

update_releases(Dir, NewReleases, Masters, Backup, Change) ->
    case at_all_masters(Masters, ?MODULE, do_write_release,
                        [Dir, "RELEASES.change", NewReleases]) of
        ok ->
            ok;
        {error, {Master, R}} ->
            remove_files(all, [Backup, Change], Masters),
            {error, {Master, R, update_releases}}
    end.

move_releases(Dir, NewReleases, Masters, Backup, Change, RelFile) ->
    case at_all_masters(Masters, file, rename, [Change, RelFile]) of
        ok ->
            remove_files(all, [Backup, Change], Masters),
            ok;
        {error, {Master, R}} ->
            takewhile(Master, Masters, file, rename, [Backup, RelFile]),
            remove_files(all, [Backup, Change], Masters),
            {error, {Master, R, move_releases}}
    end.
```

The only reasonable way to rewrite the code was to extract all three
major multi-node operations into distinct functions. The improvements
are:

- The consequence of failing an operation is located near where the
  operation takes place
- The functions have return values that Dialyzer can more easily
  typecheck
- The functions are inherently more testable independently
- Context can still be added and carried on the generalized workflow at
  the parent level
- The chain of successful operations is very obvious and readable
- Exceptions are no longer required to make the code work, but if we
  needed it, only one `throw()` would be needed in `write_release_m`,
  therefore separating the flow control details from specific function
  implementations.

As a control experiment, let's try reusing our shorter functions with
the previous flow:

```
%% Here is the same done through exceptions:
write_releases_m(Dir, NewReleases, Masters) ->
    RelFile = filename:join(Dir, "RELEASES"),
    Backup = filename:join(Dir, "RELEASES.backup"),
    Change = filename:join(Dir, "RELEASES.change"),
    try
        ok = backup_releases(Dir, NewReleases, Masters, Backup, Change,
                             RelFile),
        ok = update_releases(Dir, NewReleases, Masters, Backup, Change),
        ok = move_releases(Dir, NewReleases, Masters, Backup, Change, RelFile)
    catch
        {error, Reason} -> {error, Reason}
    end.

backup_releases(Dir, NewReleases, Masters, Backup, Change, RelFile) ->
    case at_all_masters(Masters, ?MODULE, do_copy_files,
                        [RelFile, [Backup, Change]]) of
        ok ->
            ok;
        {error, {Master, R}} ->
            remove_files(Master, [Backup, Change], Masters)
            throw({error, {Master, R, backup_releases}})
    end.

update_releases(Dir, NewReleases, Masters, Backup, Change) ->
    case at_all_masters(Masters, ?MODULE, do_write_release,
                        [Dir, "RELEASES.change", NewReleases]) of
        ok ->
            ok;
        {error, {Master, R}} ->
            remove_files(all, [Backup, Change], Masters),
            throw({error, {Master, R, update_releases}})
    end.

move_releases(Dir, NewReleases, Masters, Backup, Change, RelFile) ->
    case at_all_masters(Masters, file, rename, [Change, RelFile]) of
        ok ->
            remove_files(all, [Backup, Change], Masters),
            ok;
        {error, {Master, R}} ->
            takewhile(Master, Masters, file, rename, [Backup, RelFile]),
            remove_files(all, [Backup, Change], Masters),
            throw({error, {Master, R, move_releases}})
    end.
```

Very little changes in the three distributed functions. However, the weakness
of this approach is that we have intimately tied implementation details of the
small functions to their parent's context.  This makes it hard to reason about
these functions in isolation or to reuse them in a different context.
Furthermore, the parent function may capture `throws` not intended for it.

It is my opinion that using value-based flow control, through similar
refactorings, yields safer and cleaner code, which also happens to have
far more reduced levels of nesting. It should therefore be possible to
express more complex sequences of operations without making them any
harder to read, nor reason about in isolation.

That is in part due to the nesting, but also because we take a more
compositional approach, where there is no need to tie local functions'
implementation details to the complexity of their overall pipeline and
execution context.

It is also the best way to structure code in order to handle all
exceptions and to provide the context they need as close as possible to
their source, and as far as possible from the execution flow.


Encouraging Standards
---------------------

In Erlang, `true` and `false` are regular atoms that only gained special
status through usage in boolean expressions. It would be easy to think
that more functions would return `yes` and `no` were it not from control
flow constructs.

Similarly, `undefined` has over years of use become a kind of default
"not found" value. Values such as `nil`, `null`, `unknown`, `undef`,
`false` and so on have seen some use, but a strong consistency in format
has ended up aligning the community on one value.

When it comes to return values for various functions, `{ok, Term}` is
the most common one for positive results that need to communicate a
value, `ok` for positive results with no other value than their own
success, and `{error, Term}` is most often uses for errors. Pattern
matching and assertions have enforced that it is easy to know whether a
call worked or not by its own structure.

However, many success values are still larger tuples: `{ok, Val,
Warnings}`, `{ok, Code, Status, Headers, Body}`, and so on. Such
variations are not problematic on their own, but it would likely not
hurt too much either to use `{ok, {Val, Warnings}}` or `{ok, {Code,
Status, Headers, Body}}`.

In fact, using more standard forms could lead to easier generalizations
and abstractions that can be applied to community-wide code. By choosing
specific formats for control flow on value-based error handling, we
explicitly encourage this form of standardization.


Rationale
=========

This section will detail the decision-making behind this EEP, including:

- Prior Art in Other Languages
- The choice of `begin ... end` as a construct and its scope
- Why introduce a new operator
- Other disregarded approaches
- The choice of supported values
- The choice of `{badunwrap, Val}` as a default exception

There's a lot of content to cover here.

Prior Art in Other Languages
----------------------------

Multiple languages have value-based exception handling, many of which
have a strong functional slant.

### Haskell ###

The most famous case is possibly Haskell with the `Maybe` monad, which
uses either `Nothing` (meaning the computation returned nothing) or
`Just x` (their type-based equivalent of `{ok, X}`). The union of both
types is denoted `Maybe x`. The following examples are taken from
https://en.wikibooks.org/wiki/Haskell/Understanding_monads/Maybe

Values for such errors are tagged in functions as follows:

```
safeLog :: (Floating a, Ord a) => a -> Maybe a
safeLog x
    | x > 0     = Just (log x)
    | otherwise = Nothing
```

Using the type annotations directly, it is possible to extract values
(if any) through pattern matching:

```
zeroAsDefault :: Maybe Int -> Int
zeroAsDefault mx = case mx of
    Nothing -> 0
    Just x -> x
```

One thing to note here is that as long as you are not able to find a
value to substitute for `Nothing` or that you cannot take a different
branch, you are forced to carry that uncertainty with you through all
the types in the system.

This is usually where Erlang stops. You have the same possibilities
(albeit dynamically checked), along with the possibility of transforming
invalid values into exceptions.

Haskell, by comparison, offers monadic operations and its _do notation_
to abstract over things:

```
getTaxOwed name = do
  number       <- lookup name phonebook
  registration <- lookup number governmentDatabase
  lookup registration taxDatabase
```

In this snippet, even though the `lookup` function returns a `Maybe x`
type, the do notation abstracts away the `Nothing` values, letting the
programmer focus on the `x` part of `Just x`. Even though the code is
written as if we can operate on discrete value, the function
automatically re-wraps its result into `Just x` and any `Nothing` value
just bypasses operations.

As such, the developer is forced to acknowledge that the whole
function's flow is conditional to values being in place, but they can
nevertheless write it mostly as if everything were discrete.

### OCaml ###

OCaml supports exceptions, with constructs such as `raise (Type
"value")` to raise an exception, and `try ... with ...` to handle them.
However, since exceptions wouldn't be tracked by the type system,
maintainers introduced a `Result` type.

The type is defined as

```
type ('a, 'b) result =
  | Ok of 'a
  | Error of 'b
```

which is reminiscent of Erlang's `{ok, A}` and `{error, B}`. OCaml users
appear to mostly use pattern matching, combinator libraries, and monadic
binding to deal with value-based error handling, something similar to
Haskell's usage.

### Rust ###

Rust defines two types of errors: unrecoverable ones (using `panic!`)
and recoverable ones, using the `Error<T, E>` values. The latter is of
interest to us, and defined as:

```
enum Result<T, E> {
    Ok(T),
    Err(E),
}
```

Which would intuitively translate to Erlang terms `{ok, T}` and `{error,
E}`. The simple way to handle these in Rust is through pattern matching:

```
let f = File::open("eep.txt");
match f {
    Ok(file) => do_something(file),
    Err(error) => {
        panic!("Error in file: {:?}", error)
    },
};
```

Specific error values have to be well-typed, and it seems that the Rust
community is still debating implementation details about how to best get
composability and annotations within a generic type.

However, their workflow for handling these is well-defined already. This
pattern matching form has been judged too cumbersome. To automatically
panic on error values, the `.unwrap()` method is added:

```
let f = File::open("eep.txt").unwrap();
```

In Erlang, we could approximate this with:

```
unwrap({ok, X}) -> X;
unwrap({error, T}) -> exit(T).

F = unwrap(file:open("eep.txt", Opts)).
```

Another construct exists to return errors to caller code more directly,
without panics, with the `?` operator:

```
fn read_eep() -> Result<String, io::Error> {
    let mut h = File::open("eep.txt")?;
    let mut s = String::new();
    h.read_to_string(&mut s)?;
    Ok(s)
}
```

Any value `Ok(T)` encountering `?` is unwrapped. Any value `Err(E)`
encountering `?` is returned to the caller as-is, as if a `match` with
`return` had been used. This operator however requires that the
function's type signature use the `Result<T, E>` type as a return value.

Prior to version 1.13, Rust used the `try!(Exp)` macro to the same
effect, but found it too cumbersome. Compare:

```
try!(try!(try!(foo()).bar()).baz())
foo()?.bar()?.baz()?
```

### Swift ###


Swift supports exceptions, along with type annotations declaring that a
function may raise exceptions, and `do ... catch` blocks.

There is a special operator `try?` which catches any thrown exception
and turns it into `nil`:

```
func someThrowingFunction() throws -> Int {
    // ...
}
let x = try? someThrowingFunction()
```

Here `x` can either have a value of `Int` or `nil`. The data flow is
often simplified by using `let` assignments in a conditional expression:

```
func fetchEep() -> Eep? {
    if let x = try? fetchEepFromDisk() { return x }
    if let x = try? fetchEepFromServer() { return x }
    return nil
}
```

### Go ###


Go has some fairly anemic error handling. It has panics, and error
values. Error values must be assigned (or explicitly ignored) but they
can be left unchecked and cause all kinds of issues.

Nevertheless, Go exposed [plans for new error
handling](https://go.googlesource.com/proposal/+/master/design/go2draft-error-handling-overview.md)
in future versions, which can be interesting.

Rather than changing semantics of their error handling, Go designers are
mostly considering syntactic changes to reduce the cumbersome nature of
their errors.

Go programs typically handled errors as follows:

```
func main() {
        hex, err := ioutil.ReadAll(os.Stdin)
        if err != nil {
                log.Fatal(err)
        }

        data, err := parseHexdump(string(hex))
        if err != nil {
                log.Fatal(err)
        }

        os.Stdout.Write(data)
}
```

The new proposed mechanism looks as follows:

```
func main() {
    handle err {
        log.Fatal(err)
    }

    hex := check ioutil.ReadAll(os.Stdin)
    data := check parseHexdump(string(hex))
    os.Stdout.Write(data)
}
```

The `check` keyword asks to implicitly check whether the second return
value `err` is equal to `nil` or not. If it is not equal to `nil`, the
latest defined `handle` block is called. It can return the result out to
exit the function, repair some values, or simply panic, to name a few
options.

### Elixir ###

Elixir has a slightly different semantic approach to error handling compared
to Erlang. Exceptions are discouraged for control flow (while Erlang
specifically uses `throw` for it), and the `with` macro is introduced:

```
with {:ok, var} <- some_call(),
     {:error, _} <- fail(),
     {:ok, x, y} <- parse_name(var)
do
    success(x, y, var)
else
    {:error, err} -> handle(err)
    nil -> {:error, nil}
end
```

The macro allows a sequence of pattern matches, after which the ?do ...?
block is called. If any of the pattern matches fails, the failing value
gets re-matched in the optional ?else ... end` section.

This is the most general control flow in this document, being fully
flexible with regards to which values it can handle. This was done in
part because there is not a strong norm regarding error or valid values
in either the Erlang nor Elixir APIs, at least compared to other
languages here.

This high level of flexibility has been criticized in some instances as
being a bit confusing: it is possible for users to make error-only
flows, success-only flows, mixed flows, and consequently the ?else?
clause can become convoluted.

The [OK library](https://github.com/CrowdHailer/OK) was released to
explicitly narrow the workflow to well-defined errors. It supports three forms,
the first of which is the `for` block:

```
OK.for do
  user <- fetch_user(1)
  cart <- fetch_cart(1)
  order = checkout(cart, user)
  saved_order <- save_order(order)
after
  saved_order
end
```

It works by _only_ matching on `{:ok, val}` to keep moving forwards when
using the `<-` operator: the `fetch_user/1` function above must return
`{:ok, user}` in order for the code to proceed. The `=` operator is
allowed for pattern matches the same way it usually does within Elixir.

Any return value that matches `{:error, t}` ends up returning directly
out of the expression. The `after ... end` section takes the last value
returned, and if it isn't already in a tuple of the form `{:ok val}`, it
wraps it as such.

The second variant is the `try` block:

```
OK.try do
  user <- fetch_user(1)
  cart <- fetch_cart(1)
  order = checkout(cart, user)
  saved_order <- save_order(order)
after
  saved_order
rescue
  :user_not_found -> {:error, missing_user}
end
```

This variant will capture exceptions as well (in the `rescue` block),
and will not re-wrap the final return value in the `after` section.

The last variant for the library is the pipe:

```
def get_employee_data(file, name) do
  {:ok, file}
  ~>> File.read
  ~> String.upcase
end
```

The goal of this variant is to simply thread together operations that
could result in either a success or error. The `~>>` operator matches
and returns an `{:ok, term}` tuple, and the `~>` operator wraps a value
into an `{:ok, term}` tuple.

Choosing `begin ... end` Expressions
------------------------------------

Abstractions over error flow requires to define a scope limiting the
way flow is controlled. Before choosing the `begin ... end` expression,
the following items needed consideration:

1. what is the scope we need to cover
2. what is the format of the structure to use
3. why ending up with `begin ... end`

### Scoping Limits ###

In the languages mentioned earlier, two big error handling categories
seem to emerge.

The first group of language seems to track their error handling at the
function level. For example, Go uses `return` to return early from the
current function.  Swift and Rust also scope their error handling
abstractions to the current function, but they also make use of their
type signatures to keep information about the control flow
transformations taking place. Rust uses the `Result<T, E>` type
signature to define what operations are valid, and Swift asks of
developers that they either handle the error locally, or annotate the
function with `throws` to make things explicit.

On the other hand, Haskell's do notation is restricted to specific
expressions, and so are all of Elixir's mechanisms.

Erlang, Haskell, and Elixir all primarily use recursion as an iteration
mechanism, and (outside of Haskell's monadic constructs) do not support
`return` control flow; it is conceptually more difficult for a `return`
(or `break`) to be useful when iteration requires recursion:
"returning" by exiting the current flow may not bail you out of what the
programmer might consider a loop, for example.

Instead, Erlang would use `throw()` exceptions as a control flow
mechanism for non-local return, along with a `catch` or a `try ...
catch`. Picking a value-based error handling construct that acts at the
function level would not necessarily be very interesting since almost
any recursive procedure would still require using exceptions.

As such, it feels simpler to use a self-contained construct built to
specifically focus on sequences of operations that contain value-based
errors.

### Format of Structure ###


Prior attempts at abstracting value-based error handling in Erlang
overloaded special constructs with parse transforms in order to provide
specific workflows.

For example, the [`fancyflow`](https://github.com/ferd/fancyflow)
library tried to abstract the following code:

```
sans_maybe() ->
    case file:get_cwd() of
        {ok, Dir} ->
            case file:read_file(filename:join([Dir, "demo",
"data.txt"])) of
                {ok, Bin} ->
                    {ok, {byte_size(Bin), Bin}};
                {error, Reason} ->
                    {error, Reason}
            end;
        {error, Reason} ->
            {error, Reason}
    end.
```

as:

```
-spec maybe() -> {ok, non_neg_integer()} | {error, term()}.
maybe() ->
    [maybe](undefined,
            file:get_cwd(),
            file:read_file(filename:join([_, "demo", "data.txt"])),
            {ok, {byte_size(_), _}}).
```

And Erlando would replace:

```
write_file(Path, Data, Modes) ->
    Modes1 = [binary, write | (Modes -- [binary, write])],
    case make_binary(Data) of
        Bin when is_binary(Bin) ->
            case file:open(Path, Modes1) of
                {ok, Hdl} ->
                    case file:write(Hdl, Bin) of
                        ok ->
                            case file:sync(Hdl) of
                                ok ->
                                    file:close(Hdl);
                                {error, _} = E ->
                                    file:close(Hdl),
                                    E
                            end;
                        {error, _} = E ->
                            file:close(Hdl),
                            E
                    end;
                {error, _} = E -> E
            end;
        {error, _} = E -> E
    end.
```

With monadic constructs in list comprehensions:

```
write_file(Path, Data, Modes) ->
    Modes1 = [binary, write | (Modes -- [binary, write])],
    do([error_m ||
        Bin <- make_binary(Data),
        Hdl <- file:open(Path, Modes1),
        Result <- return(do([error_m ||
                             file:write(Hdl, Bin),
                             file:sync(Hdl)])),
        file:close(Hdl),
        Result]).
```

Those cases specifically aimed for a way to write sequences of
operations where pre-defined semantics are bound by a special context,
but are limited to overloading constructs rather than introducing new
ones.

By comparison, most of Erlang's control flow expressions follow similar
structures. See the following most common ones:

```
case ... of
    Pattern [when Guard] -> Expressions
end

if
   Guard -> Expressions
end

begin
    Expressions
end

receive
    Pattern [when Guard] -> Expressions
after                                               % optional
    IntegerExp -> Expressions
end

try
    Expressions
of                                                  % optional
    Pattern [when Guard] -> Expressions
catch                                               % optional
    ExceptionPattern [when Guard] -> Expressions
after                                               % optional
    Expressions
end
```

It therefore logically follows that if we were to add a new construct,
it should be of the form

```
<keyword>
    ...
end
```

The questions remaining are: which keyword to choose, and which clauses
to support.


### Choosing `begin ... end` ###


Initially, a format similar to Elixir's `with` expression was being
considered:

```
<keyword>
    Expressions | UnwrapExpressions
of                                              % optional
    Pattern [when Guard] -> Expressions
end
```

With this construct, the basic `<keyword> ... end` form would follow the
currently proposed semantics, but the `of ...` section would allow
pattern matching on any return value from the expression, whether
`{error, Reason}` or any non-exception value returned by the last
expression in the main section.

This form would be in line with what `try ... of ... catch ... end`
allows: once the main section is covered, more work can be done within
the same construct.

However, `try ... of ... catch ... end` has a specific reason for
introducing the patterns and guards: protected code impacting tail
recursion.

In a loop such as:

```
map_nocrash(_, []) -> [];
map_nocrash(F, [H|T]) ->
    try
        F(H)
    of
        Val -> [Val | map_nocrash(F, T)]
    catch
        _:_ -> map_nocrash(F, T)
    end.
```

The `of` section allows to continue doing work in the case no exception
has happened, _without_ having to protect more than the current scope of
the function, nor preventing tail-recursion by forcing a presence of
each iteration on the stack.

No such concerns exist for value-based error handling, and while the
 `of ... end` section might be convenient at times, it is strictly not
necessary for the construct to be useful.

What was left was to choose a name. Initially, the `<keyword>` value
chosen was `maybe`, based on the Maybe monad. The problem is that
introducing any new keyword carries severe risks to backwards
compatibility.

For example, all of the following words were considered:

    ======= ================= =========================================
    Keyword Times used in OTP Rationale
             as a function
    ======= ================= =========================================
    maybe   0                 can clash with existing used words,
                               otherwise respects the spirit
    option  88                definitely clashes with existing code
    opt     68                definitely clashes with existing code
    check   49                definitely clashes with existing code
    let     0                 word is already reserved and free, but
                               makes no sense in context
    cond    0                 word is already reserved and free, may
                               make sense, but would prevent the
                               addition of a conditional expression
    given   0                 could work, kind of respects the context
    when    0                 reserved for guards, could hijack in new
                              context but may be confusing
    begin   0                 carries no conditional meaning, mostly
                              free for overrides

Initially, this proposal expected to use the `maybe` keyword:

```
maybe
    Pattern <op> Exp,
    ...
of
    Pattern -> Exp  % optional
end
```

but for the reasons mentioned in the previous section, the `of ...`
section became non-essential.

Then, with the strong requirements for backwards compatibility making it
difficult to introduce new keywords, along with the possibility to reuse
`begin` without changing any of its current behavior, this form became the
most interesting one.

The term `begin` is also reminiscent of transactions and abortive
contexts, which means that although not an ideal fit for value-based
error flow, it is also not entirely outlandish and could accept the new
added optional semantics without being too out of place.


A New Infix Operator
--------------------

In order to form `UnwrapExpr`, there is a need for a mechanism to
introduce pattern matching with distinct semantics from regular pattern
matching.

A naive parse transform approach with fake function calls would be the
most basic way to go:

```
begin
    unwrap(Pattern, Exp),
    % variables bound in Pattern are available in scope
end
```

However, this would introduce pattern matches in non-left-hand-side
positions and make nesting really weird to deal with without exposing
parse transform details and knowing how the code is translated.

A prefix keyword such `let <Pattern> = <Exp>` could also be used.
Such keywords unfortunately suffer the same issues as `maybe` would
have, and `let` typically has different implications.

An infix operator seems like a good fit since pattern matching already
uses them in multiple forms:

- `=` is used for pattern matches. Overloading it in error flow would
  prevent regular matching from being used
- `:=` is used for maps; using it could work, but would certainly be
  confusing when handling nested maps in a pattern
- `<-` could make sense. It is already restricted in scope to list and
  binary comprehensions and would therefore not clash nor be confused.
  However, the existing semantics of the operator imply a literal
  pattern match working like a filter. We're looking for the filter-like
  approach, but want to introduce implicit elements (`{ok|error, ...}`)
- `<=` same as `<-` but for binary generators

It would make sense to check for new operators specifically for this
context given the semantics:

    =======  ===========================================================
    Operator Description
    =======  ===========================================================
    #=       no clash with other syntax (maps, records, integers), no
             clash with abstract patterns EEP either.
    !=       No clash with message passing, but is sure to annone used
             to C-style inequality checks
    <~       Works with no known conflict; shouldn't clash with ROK's
             frame proposals (uses infix ~ and < > as delimiters).
    <|       reverse pipe operator. No obvious clash either

There is no strong argument for or against most of these. The choice of
`<~` mostly comes down to having similarity to list comprehensions' `<-`
operator both in semantics and appearance, although being different
overall.

### Operator Priority ###

Within the expected usage of the unwrap expressions, the `<~` operator
needs to have a precedence rule such that:

```
X = {Y,X} <~ <Exp>
```

Is considered a valid pattern match operation with `X = {Y,X}` being the
whole left-hand-side pattern, such that operation priorities are:

```
lhs <~ rhs
```

Instead of

```
lhs = rhs <~ <...>
```

In all other regards, the precedence rules should be the same as `=` in
order to provide the most unsurprising experience possible.

Other Disregarded Approaches and Variations
----------------------------

Other approaches were considered in making this proposal, and ultimately
disregarded.

### Elixir-Like Patterns in `with` ###

The Elixir approach is fairly comprehensive, and rather powerful. Rather
than handling success or errors, it generalizes over pattern matching as
a whole.

In the end though, it ends up not specifically doing that much in terms
of error handling.

The current proposal really wanted a mechanism more appropriate and dedicated
to value-based error handling than straight up pattern matching. The `<-`
operator (for example) could be brought in as an extension supporting more
generalized pattern matching, but is currently not in scope.

### Simplifying Chaining an Pipelining ###

One approach or pain point frequently brough up about Erlang concern
pipelining of operations. Could it be possible to make some
operations easier to chain?

If we take a set of functions `f()`, `g()`, and `h()` that all return
`{ok | error, _}` tuples, current day Erlang requires:

```
{ok, X} = f(),
{ok, Y} = g(X),
{ok, Z} = h(Y),
Z
```

Could there be an easier way to handle this type of chaining, based on
say, an `unwrap` function:

```
unwrap({ok, X}) -> X.

main() ->
    unwrap(h(unwrap(g(unwrap(f()))))).
```

And it appeared that generally, this turns out to be simple enough to do
with the earlier fold approach we had mentioned.

Overall, the various existing mechanisms appeared slightly inconvenient,
but not inconvenient enough to be worth adding a whole new language
mechanism just for it.


### `cond` and `cond let` ###

Anthony Ramine recommended looking into reusing the already reserved
`cond` and `let` keywords. He mentioned Rust planning something based on
these and how it could be ported to Erlang based on his prior work on
supporting the `cond` construct within the language.

The proposed mechanism would look like:

```
cond
    X > 5 -> % regular guard
        Exp;
    f() < 18 -> % function used in guard, as originally planned
        Exp;
    let {ok, Y} = exp(), Y < 5 ->
        Exp
end
```

The last clause would allow `Y` to be used in its own branch only if it
matches and all guards succeed; if the binding fails, a switch is
automatically made to the next branch.

As such, more complex sequences of operations could be covered as:

```
cond
    let {ok, _} = call1(),
    let {ok, _} = call2(),
    let Res = call3() ->
        Res;
    true ->
        AlternativeBranch
end
```

This mechanism is, in my opinion, worth exploring and maybe adding to
the language, but on its own does not adequately solve error handling
flow issues since errors cannot be exracted easily from failing
operations.

### Auto-Wrapping Return Values ###

Auto-wrapping return values is something the Elixir's `OK` library does,
as well as Haskell's do notation, but that neither Rust nor Swift does.

It seems that there is no very clear consensus on what could be done.
Thus, for the simplicity of the implementation and backards
compatibility of the `begin ... end` expression, just returning the
value as-is without auto-wrapping seems sensible.

It would therefore be up to the developer to just return whatever value
best matches their function's type signature, making easier to still
integrate return values with the system they have.

It also lets sequences of operations potentially return `ok` on success,
even if their individual functions returned values such as `true`, for
example, rather than `{ok, true}`.

The choice of supported match values
-------------------------------------

It is kind of straightforward why `{ok, V}` and `{error, T}` are used in
pattern matches as error values: they're the most standard way to
communicate a value and an error in non-overlapping patterns whichever
way you want to match.

On the other hand, it is less obvious why `_ <~ Exp` should positively
match on `ok` alone, and why, for example, `error` as an atom would
raise an exception as not matching any patterns.

The reason `ok` is considered valid can be found in comparing common
Erlang return values with their matches in other languages.

The following functions return `ok` when everything went well but
nothing is worth reporting. The list is not exhaustive:

- `lists:foreach/2`
- over 25 functions in the `file` module
- most functions in `disk_log`
- most functions sending data or handling control of sockets and ports
- most output functions from the `io` module
- logging functions in the `logger` module
- functions from the `applications` module interacting with config and
  starting or loading applications

The pattern is fully entrenched as a core pattern in Erlang and OTP, and
very attached to side-effectful operations.

The interesting aspect comes from seeing what Rust does for similar
functions, which is just return their own unit type, denoted as `()`.
When used with the `Result` types, it is to be returned a `OK(())`.

The Erlang equivalent would probably be `{ok, undefined}`, but `ok` as a
single atom currently plays that role fine, and so it was decided to
support it; it will let error flow integrate well with side-effectful
functions.

The same cannot be said of `error` as an atom result. Most errors can
and should return context with them that qualifies the error result,
since they often have more than one reason to fail. As evidence for this
line of thought, it is currently not possible to raise exceptions
without a `Reason`, whether done through `throw/1`, `error/1`,
`exit/1-2`, or `raise/3`.

Aligning with the standard practices in the Erlang language validate
using `_ <~ Exp` as a pattern suitable for `ok`, and only this pattern
since it allows to basically match on what would be a non-existing value
that wouldn't need to be bound in further contexts.

Choosing Exceptions Raised
--------------------------

The exception format proposed here is `{badunwrap, Value}`. This format
is chosen following Erlang/OTP standards:

- `badarg`
- `badarith`
- `badfun`
- `{badmatch, Val}`

Since "unwrapping" is how the kind of operation where `X` is extracted
from `{ok, X}`, the name `badunwrap` was chosen, along with the
mismatching value being borrowed from `{badmatch, _}`.

Backwards Compatibility
=======================

The possibility of an early exit from a `begin ... end` expression
means that variables declared within its scope are now potentially
unsafe to use outside of it.

This is a change of behaviour that brings `begin` in line with the
variables bound within a `case ... end` branch, a `try/catch` clause, or
a `receive ... end` branch.

This lack of safety only needs to be started at the first `UnwrapExpr`
encountered, since all variables bound before respect the same semantics
as the existing `begin ... end` expression. If this analysis is done
rather than just declaring all variables as unsafe wholesale, then there
is no backwards compatibility concern to be had.

The need for a new operator means code built with support for the new
expressions won't be portable to older Erlang releases.

Reference Implementation
========================

No reference implementation is usually required at this step, but
one is nevertheless provided in the original repository for this
EEP draft, at https://bitbucket.org/ferd/unwrap/. The implementation
uses parse transforms rather than an operator, since it would be
difficult to add custom operators at this point of the process.


More information about the eeps mailing list