[eeps] EEP ???: Value-Based Error Handling Mechanisms

Mon Sep 10 14:57:13 CEST 2018

Hi Fred!

I realize you have thought about this a lot more than I have had time to, so excuse anything which is not reasonable.

I’m convinced now that way you designed it makes sense, and (correct me if I’m wrong) that using explicit patterns would basically just result in a bunch of bad matches (with no distinction between errors and unexpected values) which makes the whole point of having it kind of moot.

As I mentioned on Twitter, I found it unintuitive that `Val` means `{ok, Val}` and `_` means `ok` which I think is a big departure from Erlang’s otherwise clear explicitness. Would it be possible to make those patterns explicit (so that you have to spell them out) but that the proposed error handling logic still applies? I.e. that `{error, Reason}` gets passed through, but `{ok, OtherVal}` results in a `{badunwrap, OtherVal}`?

Sorry again if there’s some subtlety that I missed making this a stupid suggestion ;-)

Cheers,
Adam

> On 6. Sep 2018, at 13:42, Fred Hebert <mononcqc@REDACTED> wrote:
> 
> Hi everyone,
> 
> I have received some feedback during the last few days, mostly on IRC and Twitter regarding this proposal. Mostly a lot of it was positive, but not all of it.
> 
> The most common criticism related to the choice to limit the construct to patterns of ok | {ok | error, term()}.
> 
> I guess if we wanted to type the values it would be:
> 
> -type error(E) :: ok | {error, E}.
> -type error(T, E) :: {ok, T} | {error, E}.
> 
> Anyway the criticism came from people who would prefer the construct to work with a full explicit pattern—to make the distinction clear, I’ll reuse the list comprehension <- arrow instead of <~ when referring to that.
> 
> So instead of
> 
> begin
>     {X,Y} <~ id({ok, {X,Y}})
>     ...
> end
> 
> You would have to write:
> 
> begin
>     {ok, {X,Y}} <- id({ok, {X,Y}})
>     ...
> end
> 
> This is a fine general construct, but I believe it is inadequate and even dangerous for errors; it is only good at skipping patterns, but can’t be safely used as a good error handling mechanism.
> 
> One example of this could be taken from the current OTP pull request that adds new return value to packet reading based on inet options:
> https://github.com/erlang/otp/pull/1950
> 
> This PR adds a possible value for packet reception to the current form:
> 
> {ok, {PeerIP, PeerPort, Data}}
> 
> To ask make it possible to alternatively get:
> 
> {ok, {PeerIP, PeerPort, AncData, Data}}
> 
> Based on socket options set earlier. So let’s put it in context for the current proposal:
> 
> begin
>     {X,Y} <~ id({ok, {X,Y}}),
>     {PeerIP, PeerPort, Data} <~ gen_udp:recv(...),
>     ...
> end
> 
> If AncData is received, an exception is raised: the value was not an error but didn’t have the shape or type expected for the successful pattern to match. Errors are still returned properly by exiting the begin ... end block, and we ensure correctness in what we handle and return.
> 
> However, had we used this form:
> 
> begin
>     {ok, {X,Y}} <- id({ok, {X,Y}}),
>     {ok, {PeerIP, PeerPort, Data}} <- gen_udp:recv(...),
>     ...
> end
> 
> Since this would return early on any non-matching value (we can’t take a stance on what is an error or ok value aside from the given pattern), the whole expression, if the socket is configured unexpectedly, could return {ok, {PeerIP, PeerPort, AncData, Data}}  on a failure to match
> 
> Basically, an unexpected but good result could be returned from a function using the begin ... end construct, which would look like a success while it was actually a complete failure to match and handle the information given. This is made even more ambiguous when data has the right shape and type, but a set of bound variables ultimately define whether the match succeeds or fails.
> 
> In worst cases, It could let raw unformatted data exit a conditional pipeline with no way to detect it after the fact, particularly if later functions in begin ... end apply transformations to text, such as anonymizing or sanitizing data, for example. This could be pretty unsafe and near impossible to debug well.
> 
> Think for example of:
> 
> -spec fetch() -> iodata().
> fetch() ->
>     begin
>         {ok, B = <<_/binary>>} <- f(),
>         true <- validate(B),
>         {ok, sanitize(B)}
>     end.
> 
> If the value returned from f() turns out to be a list (say it’s a misconfigured socket using list instead of binary as an option), the expression will return early, the fetch() function will still return {ok, iodata()} but you couldn’t know as a caller whether it is the transformed data or non-matching content.  It would not be obvious to most developers either that this could represent a major security risk by allowing unexpected data to be seen as clean data.
> 
> It is basically a risky pattern if you want your code to be strict or future-proof in the context of error handling. The current proposal, by comparison, would raise an exception on unexpected good values, therefore preventing ways to sneak such data into your control flow:
> 
> -spec fetch() -> iodata().
> fetch() ->
>     begin
>         B = <<_/binary>> <~ f(),
>         _ <~ validate(B), % returns ok if valid
>         {ok, sanitize(B)}
>     end.
> 
> Here misconfigured sockets won’t result in unchecked data passing trough your app.
> 
> The general pattern mechanism may have a place, but I believe it could not guarantee the same amount of safety as the current proposal in any error-handling context, which was my concern in writing EEP 49.
> 
> I can think of no way to make the general pattern approach safer than or even as safe as the currently suggested mechanism under its current form. You would have to necessarily add complexity to the construct with a kind of ‘else’ clause where you must handle all non-matching cases explicitly if you want them returned (this is what Elixir allows), but unless the clause is mandatory (it is not in Elixir), you will have that risky ambiguity possibly hiding in all pattern matches:
> 
> -spec fetch() -> iodata().
> fetch() ->
>     begin
>         {ok, B = <<_/binary>>} <- f(),
>         true <- validate(B),
>         {ok, sanitize(B)}
>     else
>         {error, _} = E -> E;
>         false -> false
>     end.
> 
> Putting it another way, to get the same amount of safety, you’d have to re-state all acceptable error forms just to ensure the unexpected cases don’t go through and instead appropriately raise exceptions, which would likely be a missing else clause exception. This exception would obscure the the original error site as well, something the current form does not do.
> 
> The follow up question I ask is whether this would be a significant improvement over the current Erlang code, making it worth the new construct?
> 
> -spec fetch() -> iodata().
> fetch() ->
>     case f() of
>         {ok, B = <<_/binary>>} ->
>             case validate(B) of
>                  true -> {ok, sanitize(B)};
>                  false -> false
>              end;
>         {error, _} = E -> E
>     end.
> 
> compared to the <- form, it has roughly the same line count (a few more the more clauses are added), but nested cases have the benefit of making it obvious which bad matches are acceptable in which context. Here, f() won’t be able to validly return false as a surprise value whereas the general pattern form would.
> 
> This problem does not exist in the EEP 49 mechanism specifically because it mandates patterns that denote unambiguous success or failure conditions. You end up with code that is shorter and safer at the same time.
> 
> I hope this helps validate the current design, but let me know if you disagree.
> 
> Regards,
> Fred
> _______________________________________________
> eeps mailing list
> eeps@REDACTED
> http://erlang.org/mailman/listinfo/eeps

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: Message signed with OpenPGP
URL: <http://erlang.org/pipermail/eeps/attachments/20180910/72eceac7/attachment.bin>