[eeps] EEP ???: Value-Based Error Handling Mechanisms

Raimo Niskanen raimo+eeps@REDACTED
Wed Sep 5 12:12:17 CEST 2018


After some formatting I have added this as EEP 49.
Thank you for your proposal!
/ Raimo



On Tue, Sep 04, 2018 at 05:07:03PM -0400, Fred Hebert wrote:
> This EEP adds a contextual <~ operator to begin ... end expressions, 
> which allows them to be usable for value-based error handling, based on 
> standard {ok, term()} | {error, term()} return value types.
> 
> This lets begin ... end become a control flow construct to replace or 
> simplify deeply-nested case ... end expressions, and prevent using 
> exceptions for control flow.
> 
> I want to thank Jesper Louis Andersen, Sean Cribbs, Tristan Sloughter, 
> Evan Vigil-McClanahan, Anthony Ramine, Bryan Paxton, and Pedram Nimreezi 
> for their feedback on this document.
> 
> The draft is attached to this document, and also available (along with 
> an incomplete reference implementation to explore some semantics) at 
> https://bitbucket.org/ferd/unwrap
> 
> Sorry for sending a first copy without the attachment,
> Fred.

>     EEP: Pending
>     Title: Value-Based Error Handling Mechanisms
>     Version: $Revision$
>     Last-Modified: $Date$
>     Author: Fred Hebert <mononcqc@REDACTED>
>     Status: Draft
>     Type: Standards Track
>     Content-Type: text/x-rst
>     Created: 31-Aug-2018
>     Post-History:
> ****
> 
> # EEP ???: Value-Based Error Handling Mechanisms #
> 
> Abstract
> ========
> 
> This EEP adds a contextual `<~` operator to `begin ... end` expressions,
> which allows them to be usable for value-based error handling, based on
> standard `{ok, term()} | {error, term()}` return value types.
> 
> This lets `begin ... end` become a control flow construct to replace or
> simplify deeply-nested `case ... end` expressions, and prevent using
> exceptions for control flow.
> 
> Copyright
> =========
> 
> This document has been placed in the public domain.
> 
> Specification
> =============
> 
> The current syntax for a `begin ... end` expression is:
> 
> ```
> begin
>     Exprs
> end
> ```
> 
> The expression does not have a restricted scope, and is mostly used to
> group multiple distinct expressions as a single block. We propose a new
> type of expression (denoted `UnwrapExprs`), only valid within a
> `begin ... end` expression:
> 
> ```
> begin
>     Exprs | UnwrapExprs
> end
> ```
> 
> `UnwrapExprs` are defined as having the following form:
> 
> ```
> Pattern <~ Expr
> ```
> 
> This definition means that `UnwrapExprs` are only allowed at the
> top-level of `begin ... end` expressions.
> 
> The `<~` operator takes the value returned by `Expr` and inspects it.
> If the value is a tuple of the form `{ok, Val}`, it unwraps `Val` from
> the tuple, and matches it against `Pattern`.
> 
> If the pattern matches, all variables from `Pattern` are bound in the
> local environment, and the full value `{ok, Val}` is returned by the
> `UnwrapExpr`. If the value does not match, a `{badunwrap, Val}` error
> is raised.
> 
> A special case exists when `Pattern` is the match-all variable (`_`):
> on top of accepting any `{ok, term()}` value returned by `Expr` as a
> successful unwrapping, it also considers the bare atom `ok` to be
> valid.
> 
> If the value is a tuple of the form `{error, Reason}`, then the entire
> `begin ... end` expression is short-circuited and returns `{error,
> Reason}`. Variables bound before the short-circuit remain bound; the
> rest are left unbound.
> 
> The compiler should warn about any variable used after the
> `begin ... end` expression if that variable was bound in or after the
> first `UnwrapExpr` encountered within the block.
> 
> If the returned value matches neither `{ok, term()}` nor
> `{error, term()}`, a `{badunwrap, Val}` error is raised.
> 
> Given the structure described here, the final expression may look like:
> 
> ```
> begin
>     Foo = bar(),
>     X <~ id({ok, 5}),
>     [H|T] <~ id({ok, [1,2,3]}),
>     ...
> end
> ```
> 
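> To make the error paths concrete, here is a hedged sketch of the proposed
> semantics in use (`db_lookup/1` and `audit_log/2` are made-up helpers):
> 
> ```
> find_user(Id) ->
>     begin
>         User <~ db_lookup(Id),       % {ok, User} binds User;
>                                      % {error, R} returns {error, R} from the block
>         _ <~ audit_log(access, Id),  % the _ pattern also accepts a plain 'ok'
>         {ok, User}
>     end.
> ```
> 
> If `db_lookup/1` returned, say, the bare atom `not_found`, neither
> `{ok, _}` nor `{error, _}` would match and a `{badunwrap, not_found}`
> error would be raised.
> 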
> Do note that to allow easier pattern matching and more intuitive usage,
> the `<~` operator should bind less tightly than `=`, such that:
> 
> ```
> begin
>     X = [H|T] <~ exp()
> end
> ```
> 
> is a valid `UnwrapExpr` equivalent to the non-infix form `'<~'('='(X, [H|T]),
> exp())`, since reversing the priorities would give `'='('<~'(X, [H|T]),
> exp())`, which would create an `UnwrapExpr` out of context and be invalid.
> 
> 
> Motivation
> ==========
> 
> Erlang has some of the most flexible error handling available across a
> large number of programming languages. The language supports:
> 
> 1. three types of exceptions (`throw`, `error`, `exit`)
>  - handled by `catch Exp`
>  - handled by `try ... [of ...] catch ... [after ...] end`
> 2. links, `exit/2`, and `trap_exit`
> 3. monitors
> 4. return values such as `{ok, Val} | {error, Term}`, `{ok, Val} |
>    false`, or `ok | {error, Val}`
> 5. A combination of one or more of the above
> 
> So why should we look to add more? There are various reasons for this,
> including trying to reduce deeply nested conditional expressions,
> cleaning up some messy patterns found in the wild, providing a better
> separation of concern when implementing functions, and encouraging more
> standard and idiomatic interfaces.
> 
> Reducing Nesting
> ----------------
> 
> One common pattern that can be seen in Erlang is deep nesting of `case
> ... end` expressions, to check complex conditionals.
> 
> Take, for example, the following code from
> [Mnesia](https://github.com/erlang/otp/blob/a0ae44f324576104760a63fe6cf63e0ca31756fc/lib/mnesia/src/mnesia_backup.erl#L106-L126):
> 
> ```
> commit_write(OpaqueData) ->
>     B = OpaqueData,
>     case disk_log:sync(B#backup.file_desc) of
>         ok ->
>             case disk_log:close(B#backup.file_desc) of
>                 ok ->
>                     case file:rename(B#backup.tmp_file, B#backup.file) of
>                        ok ->
>                             {ok, B#backup.file};
>                        {error, Reason} ->
>                             {error, Reason}
>                     end;
>                 {error, Reason} ->
>                     {error, Reason}
>             end;
>         {error, Reason} ->
>             {error, Reason}
>     end.
> ```
> 
> The code is nested to the extent that shorter aliases must be introduced
> for variables (`OpaqueData` renamed to `B`), and half of the code just
> transparently returns the exact values each function was given.
> 
> By comparison, the same code could be written as follows with the new
> construct:
> 
> ```
> commit_write(OpaqueData) ->
>     begin
>         _ <~ disk_log:sync(OpaqueData#backup.file_desc),
>         _ <~ disk_log:close(OpaqueData#backup.file_desc),
>         _ <~ file:rename(OpaqueData#backup.tmp_file, OpaqueData#backup.file),
>         {ok, OpaqueData#backup.file}
>     end.
> ```
> 
> The semantics of this call are entirely identical, except that it is now
> much easier to focus on the flow of individual operations.
> 
> Obsoleting Messy Patterns
> -------------------------
> 
> Frequent ways in which people work with sequences of failable operations
> include folds over lists of functions, and abusing list comprehensions.
> Both patterns have significant weaknesses that make them less than ideal.
> 
> Folds over lists of functions use patterns such as those defined in
> [posts from the mailing
> list](http://erlang.org/pipermail/erlang-questions/2017-September/093575.html):
> 
> ```
> pre_check(Action, User, Context, ExternalThingy) ->
>     Checks =
>         [fun check_request/1,
>          fun check_permission/1,
>          fun check_dispatch_target/1,
>          fun check_condition/1],
>     Args = {Action, User, Context, ExternalThingy},
>     Harness =
>         fun
>             (Check, ok)    -> Check(Args);
>             (_,     Error) -> Error
>         end,
>     case lists:foldl(Harness, ok, Checks) of
>         ok    -> dispatch(Action, User, Context);
>         Error -> Error
>     end.
> ```
> 
> This code requires declaring the functions one by one, ensuring the
> entire context is carried from function to function. Since there is no
> shared scope between functions, all functions must operate on all
> arguments.
> 
> By comparison, the same code could be implemented with the new construct
> as:
> 
> ```
> pre_check(Action, User, Context, ExternalThingy) ->
>     begin
>         _ <~ check_request(Context, User),
>         _ <~ check_permissions(Action, User),
>         _ <~ check_dispatch_target(ExternalThingy),
>         _ <~ check_condition(Action, Context),
>         dispatch(Action, User, Context)
>     end
> ```
> 
> And if there was a need for derived state between any two steps, it
> would be easy to weave it in:
> 
> ```
> pre_check(Action, User, Context, ExternalThingy) ->
>     begin
>         _ <~ check_request(Context, User),
>         _ <~ check_permissions(Action, User),
>         _ <~ check_dispatch_target(ExternalThingy),
>         DispatchData <~ dispatch_target(ExternalThingy),
>         _ <~ check_condition(Action, Context),
>         dispatch(Action, User, Context)
>     end
> ```
> 
> The list comprehension _hack_, by comparison, is a bit more rare. In
> fact, it is mostly theoretical. Some things that hint at how it could
> work can be found in [Diameter test
> cases](https://github.com/erlang/otp/blob/869537a9bf799c8d12fc46c2b413e532d6e3b10c/lib/diameter/test/diameter_examples_SUITE.erl#L254-L266)
> or the [PropEr plugin for
> Rebar3](https://github.com/ferd/rebar3_proper/blob/e7eb96498a9d31f41c919474ec6800df62e237e1/src/rebar3_proper_prv.erl#L298-L308).
> 
> Its overall form uses generators in list comprehensions to tunnel a happy
> path:
> 
> ```
> [Res] =
>     [f(Z) || {ok, W} <- [b()],
>              {ok, X} <- [c(W)],
>              {ok, Y} <- [d(X)],
>              Z <- [e(Y)]],
> Res.
> ```
> 
> This form doesn't see too much usage since it is fairly obtuse and I
> suspect most people have either been reasonable enough not to use it, or
> did not think about it. Obviously the new form would be cleaner:
> 
> ```
> begin
>     W <~ b(),
>     X <~ c(W),
>     Y <~ d(X),
>     Z = e(Y),
>     f(Z)
> end
> ```
> 
> which, on top of that, has the benefit of returning the error value if
> one is encountered.
> 
> Better Separation of Concerns
> -----------------------------
> 
> This form is not necessarily obvious at first glance. To better
> expose it, let's take a look at some functions defined in the
> [`release_handler` module in
> OTP](https://github.com/erlang/otp/blob/869537a9bf799c8d12fc46c2b413e532d6e3b10c/lib/sasl/src/release_handler.erl#L1894-L1923):
> 
> ```
> write_releases_m(Dir, NewReleases, Masters) ->
>     RelFile = filename:join(Dir, "RELEASES"),
>     Backup = filename:join(Dir, "RELEASES.backup"),
>     Change = filename:join(Dir, "RELEASES.change"),
>     ensure_RELEASES_exists(Masters, RelFile),
>     case at_all_masters(Masters, ?MODULE, do_copy_files,
>                         [RelFile, [Backup, Change]]) of
>         ok ->
>             case at_all_masters(Masters, ?MODULE, do_write_release,
>                                 [Dir, "RELEASES.change", NewReleases]) of
>                 ok ->
>                     case at_all_masters(Masters, file, rename,
>                                         [Change, RelFile]) of
>                         ok ->
>                             remove_files(all, [Backup, Change], Masters),
>                             ok;
>                         {error, {Master, R}} ->
>                             takewhile(Master, Masters, file, rename,
>                                       [Backup, RelFile]),
>                             remove_files(all, [Backup, Change], Masters),
>                             throw({error, {Master, R, move_releases}})
>                     end;
>                 {error, {Master, R}} ->
>                     remove_files(all, [Backup, Change], Masters),
>                     throw({error, {Master, R, update_releases}})
>             end;
>         {error, {Master, R}} ->
>             remove_files(Master, [Backup, Change], Masters),
>             throw({error, {Master, R, backup_releases}})
>     end.
> ```
> 
> At a glance, it is very difficult to clean up this code: there are 3
> multi-node operations (backing up, updating, and moving release data),
> each of which relies on the previous one to succeed.
> 
> You'll also notice that each error requires special handling, reverting
> or removing specific operations on success or on failure. This is not a
> simple question of tunnelling values in and out of a narrow scope.
> 
> Another thing to note is that this module, as a whole (and not just the
> snippet presented here) uses `throw` expressions to perform non-local
> returns. The points where these throws are actually handled are spread
> through various locations in the file:
> [`create_RELEASES/4`](https://github.com/erlang/otp/blob/869537a9bf799c8d12fc46c2b413e532d6e3b10c/lib/sasl/src/release_handler.erl#L381-L388),
> and
> [`write_releases_1/3`](https://github.com/erlang/otp/blob/869537a9bf799c8d12fc46c2b413e532d6e3b10c/lib/sasl/src/release_handler.erl#L1864-L1881)
> for example.
> 
> The `case catch Exp of` form is used throughout the file because
> value-based error flow is painful in nested structures.
> 
> So let's take a look at how we could refactor this with the new
> construct:
> 
> ```
> write_releases_m(Dir, NewReleases, Masters) ->
>     RelFile = filename:join(Dir, "RELEASES"),
>     Backup = filename:join(Dir, "RELEASES.backup"),
>     Change = filename:join(Dir, "RELEASES.change"),
>     begin
>         _ <~ backup_releases(Dir, NewReleases, Masters, Backup, Change,
>                              RelFile),
>         _ <~ update_releases(Dir, NewReleases, Masters, Backup, Change),
>         _ <~ move_releases(Dir, NewReleases, Masters, Backup, Change, RelFile)
>     end.
> 
> backup_releases(Dir, NewReleases, Masters, Backup, Change, RelFile) ->
>     case at_all_masters(Masters, ?MODULE, do_copy_files,
>                         [RelFile, [Backup, Change]]) of
>         ok ->
>             ok;
>         {error, {Master, R}} ->
>             remove_files(Master, [Backup, Change], Masters),
>             {error, {Master, R, backup_releases}}
>     end.
> 
> update_releases(Dir, NewReleases, Masters, Backup, Change) ->
>     case at_all_masters(Masters, ?MODULE, do_write_release,
>                         [Dir, "RELEASES.change", NewReleases]) of
>         ok ->
>             ok;
>         {error, {Master, R}} ->
>             remove_files(all, [Backup, Change], Masters),
>             {error, {Master, R, update_releases}}
>     end.
> 
> move_releases(Dir, NewReleases, Masters, Backup, Change, RelFile) ->
>     case at_all_masters(Masters, file, rename, [Change, RelFile]) of
>         ok ->
>             remove_files(all, [Backup, Change], Masters),
>             ok;
>         {error, {Master, R}} ->
>             takewhile(Master, Masters, file, rename, [Backup, RelFile]),
>             remove_files(all, [Backup, Change], Masters),
>             {error, {Master, R, move_releases}}
>     end.
> ```
> 
> The only reasonable way to rewrite the code was to extract all three
> major multi-node operations into distinct functions. The improvements
> are:
> 
> - The consequence of failing an operation is located near where the
>   operation takes place
> - The functions have return values that Dialyzer can more easily
>   typecheck
> - The functions are inherently more testable independently
> - Context can still be added and carried on the generalized workflow at
>   the parent level
> - The chain of successful operations is very obvious and readable
> - Exceptions are no longer required to make the code work, but if we
>   needed it, only one `throw()` would be needed in `write_releases_m`,
>   therefore separating the flow control details from specific function
>   implementations.
> 
> As a control experiment, let's try reusing our shorter functions with
> the previous flow:
> 
> ```
> %% Here is the same done through exceptions:
> write_releases_m(Dir, NewReleases, Masters) ->
>     RelFile = filename:join(Dir, "RELEASES"),
>     Backup = filename:join(Dir, "RELEASES.backup"),
>     Change = filename:join(Dir, "RELEASES.change"),
>     try
>         ok = backup_releases(Dir, NewReleases, Masters, Backup, Change,
>                              RelFile),
>         ok = update_releases(Dir, NewReleases, Masters, Backup, Change),
>         ok = move_releases(Dir, NewReleases, Masters, Backup, Change, RelFile)
>     catch
>         {error, Reason} -> {error, Reason}
>     end.
> 
> backup_releases(Dir, NewReleases, Masters, Backup, Change, RelFile) ->
>     case at_all_masters(Masters, ?MODULE, do_copy_files,
>                         [RelFile, [Backup, Change]]) of
>         ok ->
>             ok;
>         {error, {Master, R}} ->
>             remove_files(Master, [Backup, Change], Masters),
>             throw({error, {Master, R, backup_releases}})
>     end.
> 
> update_releases(Dir, NewReleases, Masters, Backup, Change) ->
>     case at_all_masters(Masters, ?MODULE, do_write_release,
>                         [Dir, "RELEASES.change", NewReleases]) of
>         ok ->
>             ok;
>         {error, {Master, R}} ->
>             remove_files(all, [Backup, Change], Masters),
>             throw({error, {Master, R, update_releases}})
>     end.
> 
> move_releases(Dir, NewReleases, Masters, Backup, Change, RelFile) ->
>     case at_all_masters(Masters, file, rename, [Change, RelFile]) of
>         ok ->
>             remove_files(all, [Backup, Change], Masters),
>             ok;
>         {error, {Master, R}} ->
>             takewhile(Master, Masters, file, rename, [Backup, RelFile]),
>             remove_files(all, [Backup, Change], Masters),
>             throw({error, {Master, R, move_releases}})
>     end.
> ```
> 
> Very little changes in the three distributed functions. However, the weakness
> of this approach is that we have intimately tied implementation details of the
> small functions to their parent's context.  This makes it hard to reason about
> these functions in isolation or to reuse them in a different context.
> Furthermore, the parent function may capture `throws` not intended for it.
> 
> It is my opinion that using value-based flow control, through similar
> refactorings, yields safer and cleaner code, which also happens to have
> far less nesting. It should therefore be possible to express more
> complex sequences of operations without making them any harder to read
> or reason about in isolation.
> 
> That is in part due to the reduced nesting, but also because we take a more
> compositional approach, where there is no need to tie local functions'
> implementation details to the complexity of their overall pipeline and
> execution context.
> 
> It is also the best way to structure code in order to handle all
> exceptions and to provide the context they need as close as possible to
> their source, and as far as possible from the execution flow.
> 
> 
> Encouraging Standards
> ---------------------
> 
> In Erlang, `true` and `false` are regular atoms that only gained special
> status through usage in boolean expressions. It would be easy to think
> that more functions would return `yes` and `no` were it not for control
> flow constructs.
> 
> Similarly, `undefined` has over years of use become a kind of default
> "not found" value. Values such as `nil`, `null`, `unknown`, `undef`,
> `false` and so on have seen some use, but a strong consistency in format
> has ended up aligning the community on one value.
> 
> When it comes to return values for various functions, `{ok, Term}` is
> the most common one for positive results that need to communicate a
> value, `ok` for positive results with no other value than their own
> success, and `{error, Term}` is most often used for errors. Pattern
> matching and assertions have made it easy to tell whether a call worked
> or not from the shape of its result alone.
> 
> However, many success values are still larger tuples: `{ok, Val,
> Warnings}`, `{ok, Code, Status, Headers, Body}`, and so on. Such
> variations are not problematic on their own, but it would likely not
> hurt too much either to use `{ok, {Val, Warnings}}` or `{ok, {Code,
> Status, Headers, Body}}`.
> 
> In fact, using more standard forms could lead to easier generalizations
> and abstractions that can be applied to community-wide code. By choosing
> specific formats for control flow on value-based error handling, we
> explicitly encourage this form of standardization.
> 
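> For instance, a function returning a wide success tuple could be adapted
> to the standard shape with a small wrapper and then composed directly
> with the proposed construct (a sketch; `fetch/1` is a made-up function):
> 
> ```
> request(Url) ->
>     case fetch(Url) of
>         {ok, Code, Headers, Body} -> {ok, {Code, Headers, Body}};
>         {error, Reason}           -> {error, Reason}
>     end.
> 
> %% ... {Code, _Headers, Body} <~ request(Url), ...
> ```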
> 
> Rationale
> =========
> 
> This section will detail the decision-making behind this EEP, including:
> 
> - Prior Art in Other Languages
> - The choice of `begin ... end` as a construct and its scope
> - Why introduce a new operator
> - Other disregarded approaches
> - The choice of supported values
> - The choice of `{badunwrap, Val}` as a default exception
> 
> There's a lot of content to cover here.
> 
> Prior Art in Other Languages
> ----------------------------
> 
> Multiple languages have value-based error handling, many of which
> have a strong functional slant.
> 
> ### Haskell ###
> 
> The most famous case is possibly Haskell with the `Maybe` monad, which
> uses either `Nothing` (meaning the computation returned nothing) or
> `Just x` (their type-based equivalent of `{ok, X}`). The union of both
> types is denoted `Maybe x`. The following examples are taken from
> https://en.wikibooks.org/wiki/Haskell/Understanding_monads/Maybe
> 
> Values for such errors are tagged in functions as follows:
> 
> ```
> safeLog :: (Floating a, Ord a) => a -> Maybe a
> safeLog x
>     | x > 0     = Just (log x)
>     | otherwise = Nothing
> ```
> 
> Using the type annotations directly, it is possible to extract values
> (if any) through pattern matching:
> 
> ```
> zeroAsDefault :: Maybe Int -> Int
> zeroAsDefault mx = case mx of
>     Nothing -> 0
>     Just x -> x
> ```
> 
> One thing to note here is that as long as you cannot find a value to
> substitute for `Nothing` or take a different branch, you are forced to
> carry that uncertainty with you through all the types in the system.
> 
> This is usually where Erlang stops. You have the same possibilities
> (albeit dynamically checked), along with the possibility of transforming
> invalid values into exceptions.
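> 
> A rough Erlang counterpart of the `zeroAsDefault` example above, assuming
> a convention of `{ok, X} | undefined` for optional values, would be:
> 
> ```
> zero_as_default(undefined) -> 0;
> zero_as_default({ok, X})   -> X.
> ```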
> 
> Haskell, by comparison, offers monadic operations and its _do notation_
> to abstract over things:
> 
> ```
> getTaxOwed name = do
>   number       <- lookup name phonebook
>   registration <- lookup number governmentDatabase
>   lookup registration taxDatabase
> ```
> 
> In this snippet, even though the `lookup` function returns a `Maybe x`
> type, the do notation abstracts away the `Nothing` values, letting the
> programmer focus on the `x` part of `Just x`. Even though the code is
> written as if we can operate on discrete values, the function
> automatically re-wraps its result into `Just x` and any `Nothing` value
> just bypasses operations.
> 
> As such, the developer is forced to acknowledge that the whole
> function's flow is conditional to values being in place, but they can
> nevertheless write it mostly as if everything were discrete.
> 
> ### OCaml ###
> 
> OCaml supports exceptions, with constructs such as `raise (Type
> "value")` to raise an exception, and `try ... with ...` to handle them.
> However, since exceptions are not tracked by the type system,
> maintainers introduced a `Result` type.
> 
> The type is defined as
> 
> ```
> type ('a, 'b) result =
>   | Ok of 'a
>   | Error of 'b
> ```
> 
> which is reminiscent of Erlang's `{ok, A}` and `{error, B}`. OCaml users
> appear to mostly use pattern matching, combinator libraries, and monadic
> binding to deal with value-based error handling, something similar to
> Haskell's usage.
> 
> ### Rust ###
> 
> Rust defines two types of errors: unrecoverable ones (using `panic!`)
> and recoverable ones, using `Result<T, E>` values. The latter is of
> interest to us, and is defined as:
> 
> ```
> enum Result<T, E> {
>     Ok(T),
>     Err(E),
> }
> ```
> 
> Which would intuitively translate to Erlang terms `{ok, T}` and `{error,
> E}`. The simple way to handle these in Rust is through pattern matching:
> 
> ```
> let f = File::open("eep.txt");
> match f {
>     Ok(file) => do_something(file),
>     Err(error) => {
>         panic!("Error in file: {:?}", error)
>     },
> };
> ```
> 
> Specific error values have to be well-typed, and it seems that the Rust
> community is still debating implementation details about how to best get
> composability and annotations within a generic type.
> 
> However, their workflow for handling these is already well-defined. The
> pattern matching form above has been judged too cumbersome, so to
> automatically panic on error values, the `.unwrap()` method is provided:
> 
> ```
> let f = File::open("eep.txt").unwrap();
> ```
> 
> In Erlang, we could approximate this with:
> 
> ```
> unwrap({ok, X}) -> X;
> unwrap({error, T}) -> exit(T).
> 
> F = unwrap(file:open("eep.txt", Opts)).
> ```
> 
> Another construct exists to return errors to caller code more directly,
> without panics, with the `?` operator:
> 
> ```
> fn read_eep() -> Result<String, io::Error> {
>     let mut h = File::open("eep.txt")?;
>     let mut s = String::new();
>     h.read_to_string(&mut s)?;
>     Ok(s)
> }
> ```
> 
> Any value `Ok(T)` encountering `?` is unwrapped. Any value `Err(E)`
> encountering `?` is returned to the caller as-is, as if a `match` with
> `return` had been used. This operator however requires that the
> function's type signature use the `Result<T, E>` type as a return value.
> 
> Prior to version 1.13, Rust used the `try!(Exp)` macro to the same
> effect, but found it too cumbersome. Compare:
> 
> ```
> try!(try!(try!(foo()).bar()).baz())
> foo()?.bar()?.baz()?
> ```
> 
> ### Swift ###
> 
> 
> Swift supports exceptions, along with type annotations declaring that a
> function may raise exceptions, and `do ... catch` blocks.
> 
> There is a special operator `try?` which catches any thrown exception
> and turns it into `nil`:
> 
> ```
> func someThrowingFunction() throws -> Int {
>     // ...
> }
> let x = try? someThrowingFunction()
> ```
> 
> Here `x` can either have a value of `Int` or `nil`. The data flow is
> often simplified by using `let` assignments in a conditional expression:
> 
> ```
> func fetchEep() -> Eep? {
>     if let x = try? fetchEepFromDisk() { return x }
>     if let x = try? fetchEepFromServer() { return x }
>     return nil
> }
> ```
> 
> ### Go ###
> 
> 
> Go has some fairly anemic error handling. It has panics, and error
> values. Error values must be assigned (or explicitly ignored) but they
> can be left unchecked and cause all kinds of issues.
> 
> Nevertheless, Go exposed [plans for new error
> handling](https://go.googlesource.com/proposal/+/master/design/go2draft-error-handling-overview.md)
> in future versions, which can be interesting.
> 
> Rather than changing semantics of their error handling, Go designers are
> mostly considering syntactic changes to reduce the cumbersome nature of
> their errors.
> 
> Go programs typically handle errors as follows:
> 
> ```
> func main() {
>         hex, err := ioutil.ReadAll(os.Stdin)
>         if err != nil {
>                 log.Fatal(err)
>         }
> 
>         data, err := parseHexdump(string(hex))
>         if err != nil {
>                 log.Fatal(err)
>         }
> 
>         os.Stdout.Write(data)
> }
> ```
> 
> The new proposed mechanism looks as follows:
> 
> ```
> func main() {
>     handle err {
>         log.Fatal(err)
>     }
> 
>     hex := check ioutil.ReadAll(os.Stdin)
>     data := check parseHexdump(string(hex))
>     os.Stdout.Write(data)
> }
> ```
> 
> The `check` keyword asks to implicitly check whether the second return
> value `err` is equal to `nil` or not. If it is not equal to `nil`, the
> latest defined `handle` block is called. That block can return a result
> to exit the function, repair some values, or simply panic, to name a few
> options.
> 
> ### Elixir ###
> 
> Elixir has a slightly different semantic approach to error handling compared
> to Erlang. Exceptions are discouraged for control flow (while Erlang
> specifically uses `throw` for it), and the `with` macro is introduced:
> 
> ```
> with {:ok, var} <- some_call(),
>      {:error, _} <- fail(),
>      {:ok, x, y} <- parse_name(var)
> do
>     success(x, y, var)
> else
>     {:error, err} -> handle(err)
>     nil -> {:error, nil}
> end
> ```
> 
> The macro allows a sequence of pattern matches, after which the `do ...`
> block is called. If any of the pattern matches fails, the failing value
> gets re-matched in the optional `else ... end` section.
> 
> This is the most general control flow in this document, being fully
> flexible with regards to which values it can handle. This was done in
> part because there is not a strong norm regarding error or valid values
> in either the Erlang or Elixir APIs, at least compared to other
> languages here.
> 
> This high level of flexibility has been criticized in some instances as
> being a bit confusing: it is possible for users to make error-only
> flows, success-only flows, mixed flows, and consequently the `else`
> clause can become convoluted.
> 
> The [OK library](https://github.com/CrowdHailer/OK) was released to
> explicitly narrow the workflow to well-defined errors. It supports three forms,
> the first of which is the `for` block:
> 
> ```
> OK.for do
>   user <- fetch_user(1)
>   cart <- fetch_cart(1)
>   order = checkout(cart, user)
>   saved_order <- save_order(order)
> after
>   saved_order
> end
> ```
> 
> It works by _only_ matching on `{:ok, val}` to keep moving forwards when
> using the `<-` operator: the `fetch_user/1` function above must return
> `{:ok, user}` in order for the code to proceed. The `=` operator is
> allowed for pattern matches the same way it usually works within Elixir.
> 
> Any return value that matches `{:error, t}` ends up returning directly
> out of the expression. The `after ... end` section takes the last value
> returned, and if it isn't already in a tuple of the form `{:ok, val}`, it
> wraps it as such.
> 
> The second variant is the `try` block:
> 
> ```
> OK.try do
>   user <- fetch_user(1)
>   cart <- fetch_cart(1)
>   order = checkout(cart, user)
>   saved_order <- save_order(order)
> after
>   saved_order
> rescue
>   :user_not_found -> {:error, missing_user}
> end
> ```
> 
> This variant will capture exceptions as well (in the `rescue` block),
> and will not re-wrap the final return value in the `after` section.
> 
> The last variant for the library is the pipe:
> 
> ```
> def get_employee_data(file, name) do
>   {:ok, file}
>   ~>> File.read
>   ~> String.upcase
> end
> ```
> 
> The goal of this variant is to simply thread together operations that
> could result in either a success or error. The `~>>` operator matches
> and returns an `{:ok, term}` tuple, and the `~>` operator wraps a value
> into an `{:ok, term}` tuple.
> 
> Choosing `begin ... end` Expressions
> ------------------------------------
> 
> Abstracting over error flow requires defining a scope that limits the
> way flow is controlled. Before choosing the `begin ... end` expression,
> the following items needed consideration:
> 
> 1. what is the scope we need to cover
> 2. what is the format of the structure to use
> 3. why we ended up with `begin ... end`
> 
> ### Scoping Limits ###
> 
> In the languages mentioned earlier, two big error handling categories
> seem to emerge.
> 
> The first group of languages tracks error handling at the function
> level. For example, Go uses `return` to return early from the
> current function.  Swift and Rust also scope their error handling
> abstractions to the current function, but they also make use of their
> type signatures to keep information about the control flow
> transformations taking place. Rust uses the `Result<T, E>` type
> signature to define what operations are valid, and Swift asks of
> developers that they either handle the error locally, or annotate the
> function with `throws` to make things explicit.
> 
> On the other hand, Haskell's do notation is restricted to specific
> expressions, and so are all of Elixir's mechanisms.
> 
> Erlang, Haskell, and Elixir all primarily use recursion as an iteration
> mechanism, and (outside of Haskell's monadic constructs) do not support
> `return` control flow; it is conceptually more difficult for a `return`
> (or `break`) to be useful when iteration requires recursion:
> "returning" by exiting the current flow may not bail you out of what the
> programmer might consider a loop, for example.
> 
> Instead, Erlang would use `throw()` exceptions as a control flow
> mechanism for non-local return, along with a `catch` or a `try ...
> catch`. Picking a value-based error handling construct that acts at the
> function level would not necessarily be very interesting since almost
> any recursive procedure would still require using exceptions.
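> 
> For example, bailing out of a recursive traversal today typically looks
> like the following sketch (`validate/1` is a made-up function):
> 
> ```
> first_error(List) ->
>     try
>         lists:foreach(fun(X) ->
>             case validate(X) of
>                 ok -> ok;
>                 {error, Reason} -> throw({invalid, X, Reason})
>             end
>         end, List),
>         ok
>     catch
>         throw:{invalid, X, Reason} -> {error, {X, Reason}}
>     end.
> ```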
> 
> As such, it feels simpler to use a self-contained construct built to
> specifically focus on sequences of operations that contain value-based
> errors.
> 
> ### Format of Structure ###
> 
> 
> Prior attempts at abstracting value-based error handling in Erlang
> overloaded special constructs with parse transforms in order to provide
> specific workflows.
> 
> For example, the [`fancyflow`](https://github.com/ferd/fancyflow)
> library tried to abstract the following code:
> 
> ```
> sans_maybe() ->
>     case file:get_cwd() of
>         {ok, Dir} ->
>             case file:read_file(filename:join([Dir, "demo", "data.txt"])) of
>                 {ok, Bin} ->
>                     {ok, {byte_size(Bin), Bin}};
>                 {error, Reason} ->
>                     {error, Reason}
>             end;
>         {error, Reason} ->
>             {error, Reason}
>     end.
> ```
> 
> as:
> 
> ```
> -spec maybe() -> {ok, non_neg_integer()} | {error, term()}.
> maybe() ->
>     [maybe](undefined,
>             file:get_cwd(),
>             file:read_file(filename:join([_, "demo", "data.txt"])),
>             {ok, {byte_size(_), _}}).
> ```
> 
> And Erlando would replace:
> 
> ```
> write_file(Path, Data, Modes) ->
>     Modes1 = [binary, write | (Modes -- [binary, write])],
>     case make_binary(Data) of
>         Bin when is_binary(Bin) ->
>             case file:open(Path, Modes1) of
>                 {ok, Hdl} ->
>                     case file:write(Hdl, Bin) of
>                         ok ->
>                             case file:sync(Hdl) of
>                                 ok ->
>                                     file:close(Hdl);
>                                 {error, _} = E ->
>                                     file:close(Hdl),
>                                     E
>                             end;
>                         {error, _} = E ->
>                             file:close(Hdl),
>                             E
>                     end;
>                 {error, _} = E -> E
>             end;
>         {error, _} = E -> E
>     end.
> ```
> 
> With monadic constructs in list comprehensions:
> 
> ```
> write_file(Path, Data, Modes) ->
>     Modes1 = [binary, write | (Modes -- [binary, write])],
>     do([error_m ||
>         Bin <- make_binary(Data),
>         Hdl <- file:open(Path, Modes1),
>         Result <- return(do([error_m ||
>                              file:write(Hdl, Bin),
>                              file:sync(Hdl)])),
>         file:close(Hdl),
>         Result]).
> ```
> 
> Those cases specifically aimed for a way to write sequences of
> operations where pre-defined semantics are bound by a special context,
> but are limited to overloading constructs rather than introducing new
> ones.
> 
> By comparison, most of Erlang's control flow expressions follow similar
> structures. See the following most common ones:
> 
> ```
> case ... of
>     Pattern [when Guard] -> Expressions
> end
> 
> if
>    Guard -> Expressions
> end
> 
> begin
>     Expressions
> end
> 
> receive
>     Pattern [when Guard] -> Expressions
> after                                               % optional
>     IntegerExp -> Expressions
> end
> 
> try
>     Expressions
> of                                                  % optional
>     Pattern [when Guard] -> Expressions
> catch                                               % optional
>     ExceptionPattern [when Guard] -> Expressions
> after                                               % optional
>     Expressions
> end
> ```
> 
> It therefore logically follows that if we were to add a new construct,
> it should be of the form
> 
> ```
> <keyword>
>     ...
> end
> ```
> 
> The questions remaining are: which keyword to choose, and which clauses
> to support.
> 
> 
> ### Choosing `begin ... end` ###
> 
> 
> Initially, a format similar to Elixir's `with` expression was being
> considered:
> 
> ```
> <keyword>
>     Expressions | UnwrapExpressions
> of                                              % optional
>     Pattern [when Guard] -> Expressions
> end
> ```
> 
> With this construct, the basic `<keyword> ... end` form would follow the
> currently proposed semantics, but the `of ...` section would allow
> pattern matching on any return value from the expression, whether
> `{error, Reason}` or any non-exception value returned by the last
> expression in the main section.
> 
> This form would be in line with what `try ... of ... catch ... end`
> allows: once the main section is covered, more work can be done within
> the same construct.
> 
> However, `try ... of ... catch ... end` has a specific reason for
> introducing the patterns and guards: protected code impacting tail
> recursion.
> 
> In a loop such as:
> 
> ```
> map_nocrash(_, []) -> [];
> map_nocrash(F, [H|T]) ->
>     try
>         F(H)
>     of
>         Val -> [Val | map_nocrash(F, T)]
>     catch
>         _:_ -> map_nocrash(F, T)
>     end.
> ```
> 
> The `of` section allows work to continue when no exception has been
> raised, _without_ having to protect more than the current scope of
> the function, and without preventing tail recursion by forcing each
> iteration to remain on the stack.
> 
> No such concerns exist for value-based error handling, and while the
> `of ... end` section might be convenient at times, it is not strictly
> necessary for the construct to be useful.
> 
> What was left was to choose a name. Initially, the `<keyword>` value
> chosen was `maybe`, based on the Maybe monad. The problem is that
> introducing any new keyword carries severe risks to backwards
> compatibility.
> 
> For example, all of the following words were considered:
> 
>     ======= ================= =========================================
>     Keyword Times used in OTP Rationale
>              as a function
>     ======= ================= =========================================
>     maybe   0                 can clash with existing used words,
>                                otherwise respects the spirit
>     option  88                definitely clashes with existing code
>     opt     68                definitely clashes with existing code
>     check   49                definitely clashes with existing code
>     let     0                 word is already reserved and free, but
>                                makes no sense in context
>     cond    0                 word is already reserved and free, may
>                                make sense, but would prevent the
>                                addition of a conditional expression
>     given   0                 could work, kind of respects the context
>     when    0                 reserved for guards, could hijack in new
>                               context but may be confusing
>     begin   0                 carries no conditional meaning, mostly
>                               free for overrides
> 
> Initially, this proposal expected to use the `maybe` keyword:
> 
> ```
> maybe
>     Pattern <op> Exp,
>     ...
> of
>     Pattern -> Exp  % optional
> end
> ```
> 
> but for the reasons mentioned in the previous section, the `of ...`
> section became non-essential.
> 
> Then, with the strong requirements for backwards compatibility making it
> difficult to introduce new keywords, along with the possibility to reuse
> `begin` without changing any of its current behavior, this form became the
> most interesting one.
> 
> The term `begin` is also reminiscent of transactions and abortive
> contexts, which means that although not an ideal fit for value-based
> error flow, it is also not entirely outlandish and could accept the new
> added optional semantics without being too out of place.
> 
> 
> A New Infix Operator
> --------------------
> 
> In order to form `UnwrapExpr`, there is a need for a mechanism to
> introduce pattern matching with distinct semantics from regular pattern
> matching.
> 
> A naive parse transform approach with fake function calls would be the
> most basic way to go:
> 
> ```
> begin
>     unwrap(Pattern, Exp),
>     % variables bound in Pattern are available in scope
> end
> ```
> 
> However, this would introduce pattern matches in non-left-hand-side
> positions and make nesting really weird to deal with without exposing
> parse transform details and knowing how the code is translated.
> 
> A prefix keyword such as `let <Pattern> = <Exp>` could also be used.
> Such keywords unfortunately suffer the same issues as `maybe` would
> have, and `let` typically has different implications.
> 
> An infix operator seems like a good fit since pattern matching already
> uses them in multiple forms:
> 
> - `=` is used for pattern matches. Overloading it in error flow would
>   prevent regular matching from being used
> - `:=` is used for maps; using it could work, but would certainly be
>   confusing when handling nested maps in a pattern
> - `<-` could make sense. It is already restricted in scope to list and
>   binary comprehensions and would therefore not clash nor be confused.
>   However, the existing semantics of the operator imply a literal
>   pattern match working like a filter. We're looking for the filter-like
>   approach, but want to introduce implicit elements (`{ok|error, ...}`)
> - `<=` same as `<-` but for binary generators
> 
> It would make sense to check for new operators specifically for this
> context given the semantics:
> 
>     =======  ===========================================================
>     Operator Description
>     =======  ===========================================================
>     #=       no clash with other syntax (maps, records, integers), no
>              clash with abstract patterns EEP either.
>     !=       No clash with message passing, but is sure to annoy anyone
>              used to C-style inequality checks
>     <~       Works with no known conflict; shouldn't clash with ROK's
>              frame proposals (uses infix ~ and < > as delimiters).
>     <|       reverse pipe operator. No obvious clash either
> 
> There is no strong argument for or against most of these. The choice of
> `<~` mostly comes down to having similarity to list comprehensions' `<-`
> operator both in semantics and appearance, although being different
> overall.
> 
> ### Operator Priority ###
> 
> Within the expected usage of the unwrap expressions, the `<~` operator
> needs to have a precedence rule such that:
> 
> ```
> X = {Y,X} <~ <Exp>
> ```
> 
> Is considered a valid pattern match operation with `X = {Y,X}` being the
> whole left-hand-side pattern, such that operation priorities are:
> 
> ```
> lhs <~ rhs
> ```
> 
> Instead of
> 
> ```
> lhs = rhs <~ <...>
> ```
> 
> In all other regards, the precedence rules should be the same as `=` in
> order to provide the most unsurprising experience possible.
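> 
> Concretely, under these priorities both the whole unwrapped value and its
> sub-parts can be captured at once (a sketch; `lookup_pair/1` is a made-up
> function returning `{ok, {term(), term()}} | {error, term()}`):
> 
> ```
> begin
>     Pair = {X, Y} <~ lookup_pair(Key),  % binds Pair, X, and Y
>     {ok, {Y, X, Pair}}
> end
> ```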
> 
> Other Disregarded Approaches and Variations
> -------------------------------------------
> 
> Other approaches were considered in making this proposal, and ultimately
> disregarded.
> 
> ### Elixir-Like Patterns in `with` ###
> 
> The Elixir approach is fairly comprehensive, and rather powerful. Rather
> than handling success or errors, it generalizes over pattern matching as
> a whole.
> 
> In the end, though, it ends up not doing that much specifically in
> terms of error handling.
> 
> The current proposal really wanted a mechanism more appropriate and dedicated
> to value-based error handling than straight up pattern matching. The `<-`
> operator (for example) could be brought in as an extension supporting more
> generalized pattern matching, but is currently not in scope.
> 
> ### Simplifying Chaining and Pipelining ###
> 
> One pain point frequently brought up about Erlang concerns the
> pipelining of operations. Could it be possible to make some
> operations easier to chain?
> 
> If we take a set of functions `f()`, `g()`, and `h()` that all return
> `{ok | error, _}` tuples, current day Erlang requires:
> 
> ```
> {ok, X} = f(),
> {ok, Y} = g(X),
> {ok, Z} = h(Y),
> Z
> ```
> 
> Could there be an easier way to handle this type of chaining, based
> on, say, an `unwrap` function:
> 
> ```
> unwrap({ok, X}) -> X.
> 
> main() ->
>     unwrap(h(unwrap(g(unwrap(f()))))).
> ```
> 
> Generally, this turns out to be simple enough to do with the fold
> approach mentioned earlier.
> 
> Overall, the various existing mechanisms appeared slightly inconvenient,
> but not inconvenient enough to be worth adding a whole new language
> mechanism just for it.
> 
> 
> ### `cond` and `cond let` ###
> 
> Anthony Ramine recommended looking into reusing the already reserved
> `cond` and `let` keywords. He mentioned Rust planning something based on
> these and how it could be ported to Erlang based on his prior work on
> supporting the `cond` construct within the language.
> 
> The proposed mechanism would look like:
> 
> ```
> cond
>     X > 5 -> % regular guard
>         Exp;
>     f() < 18 -> % function used in guard, as originally planned
>         Exp;
>     let {ok, Y} = exp(), Y < 5 ->
>         Exp
> end
> ```
> 
> The last clause would allow `Y` to be used in its own branch only if it
> matches and all guards succeed; if the binding fails, a switch is
> automatically made to the next branch.
> 
> As such, more complex sequences of operations could be covered as:
> 
> ```
> cond
>     let {ok, _} = call1(),
>     let {ok, _} = call2(),
>     let Res = call3() ->
>         Res;
>     true ->
>         AlternativeBranch
> end
> ```
> 
> This mechanism is, in my opinion, worth exploring and maybe adding to
> the language, but on its own does not adequately solve error handling
> flow issues since errors cannot be extracted easily from failing
> operations.
> 
> ### Auto-Wrapping Return Values ###
> 
> Auto-wrapping return values is something Elixir's `OK` library does,
> as well as Haskell's do notation, but that neither Rust nor Swift does.
> 
> It seems that there is no very clear consensus on what could be done.
> Thus, for the simplicity of the implementation and backwards
> compatibility of the `begin ... end` expression, just returning the
> value as-is without auto-wrapping seems sensible.
> 
> It would therefore be up to the developer to just return whatever value
> best matches their function's type signature, making it easier to
> integrate return values with the system they already have.
> 
> It also lets sequences of operations potentially return `ok` on success,
> even if their individual functions returned values such as `true`, for
> example, rather than `{ok, true}`.
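> 
> As a sketch of what this means in practice (`parse_opts/1` and
> `apply_conf/1` are made-up helpers):
> 
> ```
> setup(Opts) ->
>     begin
>         Conf <~ parse_opts(Opts),  % {ok, Conf} | {error, Reason}
>         _ <~ apply_conf(Conf),     % ok | {error, Reason}
>         ok                         % returned as-is, not wrapped as {ok, ok}
>     end.
> ```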
> 
> The choice of supported match values
> -------------------------------------
> 
> It is fairly straightforward why `{ok, V}` and `{error, T}` are used in
> pattern matches as error values: they're the most standard way to
> communicate a value and an error in non-overlapping patterns whichever
> way you want to match.
> 
> On the other hand, it is less obvious why `_ <~ Exp` should positively
> match on `ok` alone, and why, for example, `error` as an atom would
> raise an exception as not matching any patterns.
> 
> The reason `ok` is considered valid can be found in comparing common
> Erlang return values with their matches in other languages.
> 
> The following functions return `ok` when everything went well but
> nothing is worth reporting. The list is not exhaustive:
> 
> - `lists:foreach/2`
> - over 25 functions in the `file` module
> - most functions in `disk_log`
> - most functions sending data or handling control of sockets and ports
> - most output functions from the `io` module
> - logging functions in the `logger` module
> - functions from the `applications` module interacting with config and
>   starting or loading applications
> 
> The pattern is fully entrenched as a core pattern in Erlang and OTP, and
> closely tied to side-effectful operations.
> 
> The interesting aspect comes from seeing what Rust does for similar
> functions, which is to return their unit type, denoted `()`. When used
> with the `Result` type, this is returned as `Ok(())`.
> 
> The Erlang equivalent would probably be `{ok, undefined}`, but `ok` as a
> single atom currently plays that role fine, and so it was decided to
> support it; it will let error flow integrate well with side-effectful
> functions.
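> 
> For instance, side-effectful calls that return a plain `ok` could be
> woven into a sequence without any wrapping (a sketch of the proposed
> semantics):
> 
> ```
> write_data(Path, Data) ->
>     begin
>         Fd <~ file:open(Path, [write]),  % {ok, Fd} | {error, Reason}
>         _ <~ file:write(Fd, Data),       % ok | {error, Reason}
>         _ <~ file:sync(Fd),              % ok | {error, Reason}
>         file:close(Fd)
>     end.
> ```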
> 
> The same cannot be said of `error` as an atom result. Most errors can
> and should return context with them that qualifies the error result,
> since they often have more than one reason to fail. As evidence for this
> line of thought, it is currently not possible to raise exceptions
> without a `Reason`, whether done through `throw/1`, `error/1`,
> `exit/1-2`, or `raise/3`.
> 
> Aligning with standard practice in the Erlang language validates
> accepting `ok` for the `_ <~ Exp` pattern, and only for that pattern,
> since it essentially matches a value that carries no extra information
> and never needs to be bound in further contexts.
> 
> Choosing Exceptions Raised
> --------------------------
> 
> The exception format proposed here is `{badunwrap, Value}`. This format
> is chosen following Erlang/OTP standards:
> 
> - `badarg`
> - `badarith`
> - `badfun`
> - `{badmatch, Val}`
> 
> Since "unwrapping" is how the kind of operation where `X` is extracted
> from `{ok, X}`, the name `badunwrap` was chosen, along with the
> mismatching value being borrowed from `{badmatch, _}`.
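> 
> As an illustration of the proposed semantics, a value matching neither
> `{ok, _}` nor `{error, _}` raises the new error, which can be caught like
> any other exception (`myapp` is a hypothetical application):
> 
> ```
> get_port() ->
>     try
>         begin
>             Port <~ application:get_env(myapp, port),  % {ok, Port} | undefined
>             {ok, Port}
>         end
>     catch
>         error:{badunwrap, undefined} -> {error, port_not_configured}
>     end.
> ```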
> 
> Backwards Compatibility
> =======================
> 
> The possibility of an early exit from a `begin ... end` expression
> means that variables declared within its scope are now potentially
> unsafe to use outside of it.
> 
> This is a change of behaviour that brings `begin` in line with the
> variables bound within a `case ... end` branch, a `try/catch` clause, or
> a `receive ... end` branch.
> 
> This lack of safety only needs to start at the first `UnwrapExpr`
> encountered, since all variables bound before it respect the same semantics
> as the existing `begin ... end` expression. If this analysis is done
> rather than just declaring all variables as unsafe wholesale, then there
> is no backwards compatibility concern to be had.
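> 
> A sketch of the intended analysis (`compute/0`, `f/0`, `g/1`, and `use/1`
> are made-up functions):
> 
> ```
> begin
>     A = compute(),  % bound before any UnwrapExpr: still safe afterwards
>     B <~ f(),       % first UnwrapExpr: B may be unset if f() short-circuits
>     C = g(B)
> end,
> use(A),             % fine
> use(B)              % should trigger an 'unsafe variable' warning
> ```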
> 
> The need for a new operator means code built with support for the new
> expressions won't be portable to older Erlang releases.
> 
> Reference Implementation
> ========================
> 
> No reference implementation is usually required at this step, but
> one is nevertheless provided in the original repository for this
> EEP draft, at https://bitbucket.org/ferd/unwrap/. The implementation
> uses parse transforms rather than an operator, since it would be
> difficult to add custom operators at this point of the process.



-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB


