[eeps] EEP ???: Value-Based Error Handling Mechanisms
Fred Hebert
mononcqc@REDACTED
Mon Sep 10 16:37:07 CEST 2018
It could make sense. I've just sent a PR on the EEP where I touch on this a
bit; it's not a strong argument. There are two possible approaches: matching
on all ok- and error-based tuples, or keeping the exact same semantics
while requiring the pattern to be explicit.
In the first case, the question is whether it would make sense to treat as good
values the atom ok and any tuple starting with ok (ok | {ok, _} | {ok, _, _} |
...), and as error values any tuple starting with error ({error, _} |
{error, _, _} | ...).
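A minimal sketch of what that broader classification could look like in
plain Erlang. The classify/1 helper and its module name are hypothetical and
not part of the EEP; they only illustrate the "any ok tuple / any error
tuple" rule:

```erlang
-module(classify_sketch).
-export([classify/1]).

%% Hypothetical helper, not part of the EEP: sorts a return value into
%% good, error, or unexpected under the "match every ok/error tuple" rule.
-spec classify(term()) -> good | error | unexpected.
classify(ok) ->
    good;
classify(T) when is_tuple(T), tuple_size(T) >= 2, element(1, T) =:= ok ->
    good;
classify(T) when is_tuple(T), tuple_size(T) >= 2, element(1, T) =:= error ->
    error;
classify(_) ->
    unexpected.
```

Under this rule {ok, a, b} and {error, a, b} are classified just like their
two-element counterparts, while a value such as false falls through to the
unexpected clause.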
This approach would allow more flexibility on possible error values, but
would make composition more difficult. Let's take the following three
function signatures as an example:
-spec f() -> ok | {error, term()}.
-spec g() -> {ok, term()} | {error, term(), term()}.
-spec h() -> {ok, term(), [warning()]} | {error, term()}.
If a single begin ... end block calls all three and its result is the return
value of a function, the caller now needs the following type
specification:
-spec caller() -> ok | {ok, term()} | {ok, term(), [warning()]}
| {error, term()} | {error, term(), term()}.
As you call more and more functions and compose them together, the
union of what the composed function may validly return grows in complexity
and may even end up giving more trouble to tools such as Dialyzer.
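To illustrate, here is a hand-written sketch in today's Erlang, using nested
case expressions instead of the proposed construct. The module name and the
stub bodies of f/0, g/0 and h/0 are assumptions made up for the example; only
their specs come from the text above. Every error shape is passed through
untouched, so each one has to show up in the caller's spec:

```erlang
-module(compose_sketch).
-export([f/0, g/0, h/0, caller/0]).

-type warning() :: term().

%% Stub bodies are hypothetical; only the specs come from the discussion.
-spec f() -> ok | {error, term()}.
f() -> ok.

-spec g() -> {ok, term()} | {error, term(), term()}.
g() -> {error, g_failed, extra}.

-spec h() -> {ok, term(), [warning()]} | {error, term()}.
h() -> {ok, value, []}.

%% Errors of any shape are returned as-is, so the spec must list them all.
-spec caller() -> {ok, term(), [warning()]}
                | {error, term()} | {error, term(), term()}.
caller() ->
    case f() of
        ok ->
            case g() of
                {ok, _} -> h();
                {error, _, _} = E -> E
            end;
        {error, _} = E -> E
    end.
```

This version always ends with h(), so only one ok shape appears; if the block
could end on any of the three calls, the ok side of the spec would widen in
the same way the error side does.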
So for that I would think that yeah, it would make more sense to just keep ok
| {ok, Term} as accepted types because they encourage better long-term
composability. The question of explicit patterns then is whether only:
ok <~ Exp
and
{ok, Pattern} <~ Exp
would make sense. I have to say I do not necessarily mind it, but we'd have
to be careful to make sure that, for example, {error, T} <~ Exp and
{ok, _, _} <~ Exp never match either, and then pick what would make sense
to send out as an error when it happens. Should it be considered invalid
syntax, return a compile error (pattern will never match?), or just crash
at runtime? By making the 'ok' part implicit, you avoid this issue entirely
because it is not possible to write an invalid pattern.
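As a rough approximation of the explicit-pattern variant, here is how a
single step such as {ok, B = <<_/binary>>} <~ Exp could be expanded by hand
in current Erlang. The module and function names are hypothetical, and the
{badunwrap, Value} exception term is borrowed from the suggestion quoted
below rather than from the EEP itself:

```erlang
-module(explicit_unwrap_sketch).
-export([step/1]).

%% Hand-expanded approximation of `{ok, B = <<_/binary>>} <~ Exp`,
%% where Result stands for the value of Exp.
-spec step(term()) -> {ok, binary()} | {error, term()}.
step(Result) ->
    case Result of
        %% The explicit pattern matched: the block would carry on with B.
        {ok, B} when is_binary(B) -> {ok, B};
        %% An error value: leave the block and return it unchanged.
        {error, _} = E -> E;
        %% Neither the pattern nor an error: crash instead of returning it.
        Other -> erlang:error({badunwrap, Other})
    end.
```

The last clause is what keeps an unexpected but successful-looking value,
such as {ok, "a list"}, from leaving the block as if it were a result.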
I do agree that seeing _ <~ Exp match on what is essentially nothing is a
bit odd, so I would be open to making the patterns explicit. I'm just a bit
annoyed by the idea that you're creating a class of possible programmer
errors to handle which just were not possible at first. At this point I'm
not feeling strongly either way, and have not necessarily received enough
feedback to sway one way or the other.
On Mon, Sep 10, 2018 at 8:57 AM, Adam Lindberg <hello@REDACTED> wrote:
> Hi Fred!
>
> I realize you have thought about this a lot more than I have had time to,
> so excuse anything which is not reasonable.
>
> I’m convinced now that the way you designed it makes sense, and (correct me if
> I’m wrong) that using explicit patterns would basically just result in a
> bunch of bad matches (with no distinction between errors and unexpected
> values), which makes the whole point of having it kind of moot.
>
> As I mentioned on Twitter, I found it unintuitive that `Val` means `{ok,
> Val}` and `_` means `ok` which I think is a big departure from Erlang’s
> otherwise clear explicitness. Would it be possible to make those patterns
> explicit (so that you have to spell them out) but that the proposed error
> handling logic still applies? I.e. that `{error, Reason}` gets passed
> through, but `{ok, OtherVal}` results in a `{badunwrap, OtherVal}`?
>
> Sorry again if there’s some subtlety that I missed making this a stupid
> suggestion ;-)
>
> Cheers,
> Adam
>
> > On 6. Sep 2018, at 13:42, Fred Hebert <mononcqc@REDACTED> wrote:
> >
> > Hi everyone,
> >
> > I have received some feedback during the last few days, mostly on IRC
> and Twitter, regarding this proposal. Most of it was positive, but
> not all of it.
> >
> > The most common criticism related to the choice to limit the construct
> to patterns of ok | {ok | error, term()}.
> >
> > I guess if we wanted to type the values it would be:
> >
> > -type error(E) :: ok | {error, E}.
> > -type error(T, E) :: {ok, T} | {error, E}.
> >
> > Anyway the criticism came from people who would prefer the construct to
> work with a full explicit pattern—to make the distinction clear, I’ll reuse
> the list comprehension <- arrow instead of <~ when referring to that.
> >
> > So instead of
> >
> > begin
> > {X,Y} <~ id({ok, {X,Y}})
> > ...
> > end
> >
> > You would have to write:
> >
> > begin
> > {ok, {X,Y}} <- id({ok, {X,Y}})
> > ...
> > end
> >
> > This is a fine general construct, but I believe it is inadequate and
> even dangerous for errors; it is only good at skipping patterns, but can’t
> be safely used as a good error handling mechanism.
> >
> > One example of this could be taken from the current OTP pull request
> that adds a new return value to packet reading based on inet options:
> > https://github.com/erlang/otp/pull/1950
> >
> > This PR adds a possible value for packet reception to the current form:
> >
> > {ok, {PeerIP, PeerPort, Data}}
> >
> > To make it possible to alternatively get:
> >
> > {ok, {PeerIP, PeerPort, AncData, Data}}
> >
> > Based on socket options set earlier. So let’s put it in context for the
> current proposal:
> >
> > begin
> > {X,Y} <~ id({ok, {X,Y}}),
> > {PeerIP, PeerPort, Data} <~ gen_udp:recv(...),
> > ...
> > end
> >
> > If AncData is received, an exception is raised: the value was not an
> error but didn’t have the shape or type expected for the successful pattern
> to match. Errors are still returned properly by exiting the begin ... end
> block, and we ensure correctness in what we handle and return.
> >
> > However, had we used this form:
> >
> > begin
> > {ok, {X,Y}} <- id({ok, {X,Y}}),
> > {ok, {PeerIP, PeerPort, Data}} <- gen_udp:recv(...),
> > ...
> > end
> >
> > Since this would return early on any non-matching value (we can’t take a
> stance on what is an error or ok value aside from the given pattern), the
> whole expression, if the socket is configured unexpectedly, could return
> {ok, {PeerIP, PeerPort, AncData, Data}} on a failure to match.
> >
> > Basically, an unexpected but good result could be returned from a
> function using the begin ... end construct, which would look like a success
> while it was actually a complete failure to match and handle the
> information given. This is made even more ambiguous when data has the right
> shape and type, but a set of bound variables ultimately define whether the
> match succeeds or fails.
> >
> > In the worst cases, it could let raw unformatted data exit a conditional
> pipeline with no way to detect it after the fact, particularly if later
> functions in begin ... end apply transformations to text, such as
> anonymizing or sanitizing data, for example. This could be pretty unsafe
> and near impossible to debug well.
> >
> > Think for example of:
> >
> > -spec fetch() -> iodata().
> > fetch() ->
> > begin
> > {ok, B = <<_/binary>>} <- f(),
> > true <- validate(B),
> > {ok, sanitize(B)}
> > end.
> >
> > If the value returned from f() turns out to be a list (say it’s a
> misconfigured socket using list instead of binary as an option), the
> expression will return early, the fetch() function will still return {ok,
> iodata()} but you couldn’t know as a caller whether it is the transformed
> data or non-matching content. It would not be obvious to most developers
> either that this could represent a major security risk by allowing
> unexpected data to be seen as clean data.
> >
> > It is basically a risky pattern if you want your code to be strict or
> future-proof in the context of error handling. The current proposal, by
> comparison, would raise an exception on unexpected good values, therefore
> preventing ways to sneak such data into your control flow:
> >
> > -spec fetch() -> iodata().
> > fetch() ->
> > begin
> > B = <<_/binary>> <~ f(),
> > _ <~ validate(B), % returns ok if valid
> > {ok, sanitize(B)}
> > end.
> >
> > Here misconfigured sockets won’t result in unchecked data passing through
> your app.
> >
> > The general pattern mechanism may have a place, but I believe it could
> not guarantee the same amount of safety as the current proposal in any
> error-handling context, which was my concern in writing EEP 49.
> >
> > I can think of no way to make the general pattern approach safer than or
> even as safe as the currently suggested mechanism under its current form.
> You would have to necessarily add complexity to the construct with a kind
> of ‘else’ clause where you must handle all non-matching cases explicitly if
> you want them returned (this is what Elixir allows), but unless the clause
> is mandatory (it is not in Elixir), you will have that risky ambiguity
> possibly hiding in all pattern matches:
> >
> > -spec fetch() -> iodata().
> > fetch() ->
> > begin
> > {ok, B = <<_/binary>>} <- f(),
> > true <- validate(B),
> > {ok, sanitize(B)}
> > else
> > {error, _} = E -> E;
> > false -> false
> > end.
> >
> > Putting it another way, to get the same amount of safety, you’d have to
> re-state all acceptable error forms just to ensure the unexpected cases
> don’t go through and instead appropriately raise exceptions, which would
> likely be a missing else clause exception. This exception would obscure the
> original error site as well, something the current form does not do.
> >
> > The follow up question I ask is whether this would be a significant
> improvement over the current Erlang code, making it worth the new construct?
> >
> > -spec fetch() -> iodata().
> > fetch() ->
> > case f() of
> > {ok, B = <<_/binary>>} ->
> > case validate(B) of
> > true -> {ok, sanitize(B)};
> > false -> false
> > end;
> > {error, _} = E -> E
> > end.
> >
> > Compared to the <- form, it has roughly the same line count (a few more
> the more clauses are added), but nested cases have the benefit of making it
> obvious which bad matches are acceptable in which context. Here, f() won’t
> be able to validly return false as a surprise value whereas the general
> pattern form would.
> >
> > This problem does not exist in the EEP 49 mechanism specifically because
> it mandates patterns that denote unambiguous success or failure conditions.
> You end up with code that is shorter and safer at the same time.
> >
> > I hope this helps validate the current design, but let me know if you
> disagree.
> >
> > Regards,
> > Fred