[eeps] EEP ???: Value-Based Error Handling Mechanisms

Fred Hebert mononcqc@REDACTED
Thu Sep 6 13:42:48 CEST 2018


Hi everyone,

I have received some feedback during the last few days, mostly on IRC and
Twitter regarding this proposal. Mostly a lot of it was positive, but not
all of it.

The most common criticism related to the choice to limit the construct to
patterns of ok | {ok | error, term()}.

I guess if we wanted to type the values it would be:

-type error(E) :: ok | {error, E}.
-type error(T, E) :: {ok, T} | {error, E}.

Anyway the criticism came from people who would prefer the construct to
work with a full explicit pattern—to make the distinction clear, I’ll reuse
the list comprehension <- arrow instead of <~ when referring to that.

So instead of

begin
    {X,Y} <~ id({ok, {X,Y}})
    ...
end

You would have to write:

begin
    {ok, {X,Y}} <- id({ok, {X,Y}})
    ...
end

This is a fine general construct, but I believe it is inadequate and even
dangerous for errors; it is only good at skipping patterns, but can’t be
safely used as a good error handling mechanism.

One example of this could be taken from the current OTP pull request that
adds new return value to packet reading based on inet options:
https://github.com/erlang/otp/pull/1950

This PR adds a possible value for packet reception to the current form:

{ok, {PeerIP, PeerPort, Data}}

To ask make it possible to alternatively get:

{ok, {PeerIP, PeerPort, AncData, Data}}

Based on socket options set earlier. So let’s put it in context for the
current proposal:

begin
    {X,Y} <~ id({ok, {X,Y}}),
    {PeerIP, PeerPort, Data} <~ gen_udp:recv(...),
    ...
end

If AncData is received, an exception is raised: the value was not an error
but didn’t have the shape or type expected for the successful pattern to
match. Errors are still returned properly by exiting the begin ... end
block, and we ensure correctness in what we handle and return.

However, had we used this form:

begin
    {ok, {X,Y}} <- id({ok, {X,Y}}),
    {ok, {PeerIP, PeerPort, Data}} <- gen_udp:recv(...),
    ...
end

Since this would return early on any non-matching value (we can’t take a
stance on what is an error or ok value aside from the given pattern), the
whole expression, if the socket is configured unexpectedly, could return {ok,
{PeerIP, PeerPort, AncData, Data}}  on a failure to match

Basically, an unexpected but good result could be returned from a function
using the begin ... end construct, which would look like a success while it
was actually a complete failure to match and handle the information given.
This is made even more ambiguous when data has the right shape and type,
but a set of bound variables ultimately define whether the match succeeds
or fails.

In worst cases, It could let raw unformatted data exit a conditional
pipeline with no way to detect it after the fact, particularly if later
functions in begin ... end apply transformations to text, such as
anonymizing or sanitizing data, for example. This could be pretty unsafe
and near impossible to debug well.

Think for example of:

-spec fetch() -> iodata().
fetch() ->
    begin
        {ok, B = <<_/binary>>} <- f(),
        true <- validate(B),
        {ok, sanitize(B)}
    end.

If the value returned from f() turns out to be a list (say it’s a
misconfigured socket using list instead of binary as an option), the
expression will return early, the fetch() function will still return {ok,
iodata()} but you couldn’t know as a caller whether it is the transformed
data or non-matching content.  It would not be obvious to most developers
either that this could represent a major security risk by allowing
unexpected data to be seen as clean data.

It is basically a risky pattern if you want your code to be strict or
future-proof in the context of error handling. The current proposal, by
comparison, would raise an exception on unexpected good values, therefore
preventing ways to sneak such data into your control flow:

-spec fetch() -> iodata().
fetch() ->
    begin
        B = <<_/binary>> <~ f(),
        _ <~ validate(B), % returns ok if valid
        {ok, sanitize(B)}
    end.

Here misconfigured sockets won’t result in unchecked data passing trough
your app.

The general pattern mechanism may have a place, but I believe it could not
guarantee the same amount of safety as the current proposal in any
error-handling context, which was my concern in writing EEP 49.

I can think of no way to make the general pattern approach safer than or
even as safe as the currently suggested mechanism under its current form.
You would have to necessarily add complexity to the construct with a kind
of ‘else’ clause where you must handle all non-matching cases explicitly if
you want them returned (this is what Elixir allows), but unless the clause
is mandatory (it is not in Elixir), you will have that risky ambiguity
possibly hiding in all pattern matches:

-spec fetch() -> iodata().
fetch() ->
    begin
        {ok, B = <<_/binary>>} <- f(),
        true <- validate(B),
        {ok, sanitize(B)}
    else
        {error, _} = E -> E;
        false -> false
    end.

Putting it another way, to get the same amount of safety, you’d have to
re-state all acceptable error forms just to ensure the unexpected cases
don’t go through and instead appropriately raise exceptions, which would
likely be a missing else clause exception. This exception would obscure the
the original error site as well, something the current form does not do.

The follow up question I ask is whether this would be a significant
improvement over the current Erlang code, making it worth the new construct?

-spec fetch() -> iodata().
fetch() ->
    case f() of
        {ok, B = <<_/binary>>} ->
            case validate(B) of
                 true -> {ok, sanitize(B)};
                 false -> false
             end;
        {error, _} = E -> E
    end.

compared to the <- form, it has roughly the same line count (a few more the
more clauses are added), but nested cases have the benefit of making it
obvious which bad matches are acceptable in which context. Here, f() won’t
be able to validly return false as a surprise value whereas the general
pattern form would.

This problem does not exist in the EEP 49 mechanism specifically because it
mandates patterns that denote unambiguous success or failure conditions.
You end up with code that is shorter and safer at the same time.

I hope this helps validate the current design, but let me know if you
disagree.

Regards,
Fred
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/eeps/attachments/20180906/ab6e4524/attachment.htm>


More information about the eeps mailing list