[eeps] EEP ???: Value-Based Error Handling Mechanisms

Mon Sep 10 22:34:21 CEST 2018


> On 10. Sep 2018, at 16:37, Fred Hebert <mononcqc@REDACTED> wrote:
> 
> It could make sense. I've just sent a PR on the EEP where I touch this a bit; it's not a strong argument. There are two possible approach: matching on all ok- and error-based tuples, or keeping the same exact semantics although requiring the pattern to be explicit.
> 
> In the first case the question is if it would make sense to choose all good values to be those in a tuple starting with ok (ok | {ok, _} | {ok, _, _} | ...), and all error values all those starting with error ({error, _} | {error, _, _} | ...).
> This approach would allow more flexibility on possible error values, but would make composition more difficult. Let's take the following three function signatures as an example:
> 
> -spec f() -> ok | {error, term()}.
> -spec g() -> {ok, term()} | {error, term(), term()}.
> -spec h() -> {ok, term(), [warning()]} | {error, term()}.
> If a single begin ... end block calls to these as the potential return value of a function, the caller now has to have the following type specification:
> 
> -spec caller() -> ok | {ok, term()} | {ok, term(), [warning()]}
>                 | {error, term()} | {error, term(), term()}.
> As you call more and more functions and compose them together, the cross-section of what is a valid returning function grows in complexity and may even end up giving more trouble to tools such as Dialyzer.
> 
Wouldn't the successful return only be the type of the last function, not any previous?
> So for that I would think that yeah, it would make more sense to just keep ok | {ok, Term} as accepted types because they encourage better long-term composability. The question of explicit patterns then is whether only:
> 
> ok <~ Exp
> 
> and
> 
> {ok, Pattern} <~ Exp
> 
> would make sense. I have to say I do not necessarily mind it, but we'd have to be careful to make sure that, for example, {error, T} <~ Exp and {ok, _, _} <~ Exp  are never matching either, and then pick what would make sense to send out as an error when it happens. Should it be considered invalid syntax, return a compile error (pattern will never match?), or just crash at runtime? By making the 'ok' part implicit, you avoid this issue entirely because it is not possible to write and invalid pattern.
> 
This also makes it look like it's possible to match any term (e.g. `{error, T}`) which would also confuse the user... Compile time errors are a minimum here I think.
> I do agree that seeing _ <~ Exp match on what is essentially nothing is a bit odd, so I would be open to making the patterns explicit. I'm just a bit annoyed by the idea that you're creating a class of possible programmer errors to handle which just were not possible at first. At this point I'm not feeling strongly either way, and have not necessarily received enough feedback to sway one way or the other.
> 
Yeah, I agree. However I'm not fond of the implicit version either (unless the syntax can be made more non-standard as to not break the principle of least surprise).

Cheers
Adam

>> On Mon, Sep 10, 2018 at 8:57 AM, Adam Lindberg <hello@REDACTED> wrote:
>> Hi Fred!
>> 
>> I realize you have thought about this a lot more than I have had time to, so excuse anything which is not reasonable.
>> 
>> I’m convinced now that way you designed it makes sense, and (correct me if I’m wrong) that using explicit patterns would basically just result in a bunch of bad matches (with no distinction between errors and unexpected values) which makes the whole point of having it kind of moot.
>> 
>> As I mentioned on Twitter, I found it unintuitive that `Val` means `{ok, Val}` and `_` means `ok` which I think is a big departure from Erlang’s otherwise clear explicitness. Would it be possible to make those patterns explicit (so that you have to spell them out) but that the proposed error handling logic still applies? I.e. that `{error, Reason}` gets passed through, but `{ok, OtherVal}` results in a `{badunwrap, OtherVal}`?
>> 
>> Sorry again if there’s some subtlety that I missed making this a stupid suggestion ;-)
>> 
>> Cheers,
>> Adam
>> 
>> > On 6. Sep 2018, at 13:42, Fred Hebert <mononcqc@REDACTED> wrote:
>> > 
>> > Hi everyone,
>> > 
>> > I have received some feedback during the last few days, mostly on IRC and Twitter regarding this proposal. Mostly a lot of it was positive, but not all of it.
>> > 
>> > The most common criticism related to the choice to limit the construct to patterns of ok | {ok | error, term()}.
>> > 
>> > I guess if we wanted to type the values it would be:
>> > 
>> > -type error(E) :: ok | {error, E}.
>> > -type error(T, E) :: {ok, T} | {error, E}.
>> > 
>> > Anyway the criticism came from people who would prefer the construct to work with a full explicit pattern—to make the distinction clear, I’ll reuse the list comprehension <- arrow instead of <~ when referring to that.
>> > 
>> > So instead of
>> > 
>> > begin
>> >     {X,Y} <~ id({ok, {X,Y}})
>> >     ...
>> > end
>> > 
>> > You would have to write:
>> > 
>> > begin
>> >     {ok, {X,Y}} <- id({ok, {X,Y}})
>> >     ...
>> > end
>> > 
>> > This is a fine general construct, but I believe it is inadequate and even dangerous for errors; it is only good at skipping patterns, but can’t be safely used as a good error handling mechanism.
>> > 
>> > One example of this could be taken from the current OTP pull request that adds new return value to packet reading based on inet options:
>> > https://github.com/erlang/otp/pull/1950
>> > 
>> > This PR adds a possible value for packet reception to the current form:
>> > 
>> > {ok, {PeerIP, PeerPort, Data}}
>> > 
>> > To ask make it possible to alternatively get:
>> > 
>> > {ok, {PeerIP, PeerPort, AncData, Data}}
>> > 
>> > Based on socket options set earlier. So let’s put it in context for the current proposal:
>> > 
>> > begin
>> >     {X,Y} <~ id({ok, {X,Y}}),
>> >     {PeerIP, PeerPort, Data} <~ gen_udp:recv(...),
>> >     ...
>> > end
>> > 
>> > If AncData is received, an exception is raised: the value was not an error but didn’t have the shape or type expected for the successful pattern to match. Errors are still returned properly by exiting the begin ... end block, and we ensure correctness in what we handle and return.
>> > 
>> > However, had we used this form:
>> > 
>> > begin
>> >     {ok, {X,Y}} <- id({ok, {X,Y}}),
>> >     {ok, {PeerIP, PeerPort, Data}} <- gen_udp:recv(...),
>> >     ...
>> > end
>> > 
>> > Since this would return early on any non-matching value (we can’t take a stance on what is an error or ok value aside from the given pattern), the whole expression, if the socket is configured unexpectedly, could return {ok, {PeerIP, PeerPort, AncData, Data}}  on a failure to match
>> > 
>> > Basically, an unexpected but good result could be returned from a function using the begin ... end construct, which would look like a success while it was actually a complete failure to match and handle the information given. This is made even more ambiguous when data has the right shape and type, but a set of bound variables ultimately define whether the match succeeds or fails.
>> > 
>> > In worst cases, It could let raw unformatted data exit a conditional pipeline with no way to detect it after the fact, particularly if later functions in begin ... end apply transformations to text, such as anonymizing or sanitizing data, for example. This could be pretty unsafe and near impossible to debug well.
>> > 
>> > Think for example of:
>> > 
>> > -spec fetch() -> iodata().
>> > fetch() ->
>> >     begin
>> >         {ok, B = <<_/binary>>} <- f(),
>> >         true <- validate(B),
>> >         {ok, sanitize(B)}
>> >     end.
>> > 
>> > If the value returned from f() turns out to be a list (say it’s a misconfigured socket using list instead of binary as an option), the expression will return early, the fetch() function will still return {ok, iodata()} but you couldn’t know as a caller whether it is the transformed data or non-matching content.  It would not be obvious to most developers either that this could represent a major security risk by allowing unexpected data to be seen as clean data.
>> > 
>> > It is basically a risky pattern if you want your code to be strict or future-proof in the context of error handling. The current proposal, by comparison, would raise an exception on unexpected good values, therefore preventing ways to sneak such data into your control flow:
>> > 
>> > -spec fetch() -> iodata().
>> > fetch() ->
>> >     begin
>> >         B = <<_/binary>> <~ f(),
>> >         _ <~ validate(B), % returns ok if valid
>> >         {ok, sanitize(B)}
>> >     end.
>> > 
>> > Here misconfigured sockets won’t result in unchecked data passing trough your app.
>> > 
>> > The general pattern mechanism may have a place, but I believe it could not guarantee the same amount of safety as the current proposal in any error-handling context, which was my concern in writing EEP 49.
>> > 
>> > I can think of no way to make the general pattern approach safer than or even as safe as the currently suggested mechanism under its current form. You would have to necessarily add complexity to the construct with a kind of ‘else’ clause where you must handle all non-matching cases explicitly if you want them returned (this is what Elixir allows), but unless the clause is mandatory (it is not in Elixir), you will have that risky ambiguity possibly hiding in all pattern matches:
>> > 
>> > -spec fetch() -> iodata().
>> > fetch() ->
>> >     begin
>> >         {ok, B = <<_/binary>>} <- f(),
>> >         true <- validate(B),
>> >         {ok, sanitize(B)}
>> >     else
>> >         {error, _} = E -> E;
>> >         false -> false
>> >     end.
>> > 
>> > Putting it another way, to get the same amount of safety, you’d have to re-state all acceptable error forms just to ensure the unexpected cases don’t go through and instead appropriately raise exceptions, which would likely be a missing else clause exception. This exception would obscure the the original error site as well, something the current form does not do.
>> > 
>> > The follow up question I ask is whether this would be a significant improvement over the current Erlang code, making it worth the new construct?
>> > 
>> > -spec fetch() -> iodata().
>> > fetch() ->
>> >     case f() of
>> >         {ok, B = <<_/binary>>} ->
>> >             case validate(B) of
>> >                  true -> {ok, sanitize(B)};
>> >                  false -> false
>> >              end;
>> >         {error, _} = E -> E
>> >     end.
>> > 
>> > compared to the <- form, it has roughly the same line count (a few more the more clauses are added), but nested cases have the benefit of making it obvious which bad matches are acceptable in which context. Here, f() won’t be able to validly return false as a surprise value whereas the general pattern form would.
>> > 
>> > This problem does not exist in the EEP 49 mechanism specifically because it mandates patterns that denote unambiguous success or failure conditions. You end up with code that is shorter and safer at the same time.
>> > 
>> > I hope this helps validate the current design, but let me know if you disagree.
>> > 
>> > Regards,
>> > Fred
>> > _______________________________________________
>> > eeps mailing list
>> > eeps@REDACTED
>> > http://erlang.org/mailman/listinfo/eeps
>> 
> 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/eeps/attachments/20180910/3fd5da99/attachment.htm>