New EEP draft: Pinning operator ^ in patterns

Richard Carlsson carlsson.richard@REDACTED
Thu Jan 14 14:13:47 CET 2021


The way I planned it is:
  1. Even from the start, pinning will always be allowed, without requiring
any flag to opt in. This does not tell you about existing uses of
already-bound variables, but you can start using pinning right away for
readability and for avoiding bugs when refactoring. The compiler will
always tell you if a pinned variable doesn't exist, so you don't
accidentally accept any value in that position.
  2. You can enable warnings at your own pace in order to start cleaning up
your code.
  3. In a following major release, the warnings will be on by default, but
you can disable them to compile old code.
  4. In a distant future, it might become an error to not use ^ to mark
already-bound variables.

        /Richard


Den tors 14 jan. 2021 kl 13:33 skrev Raimo Niskanen <
raimo+erlang-questions@REDACTED>:

> As others have said: for Elixir this operator is essential, since they
> rebind variables without it.
>
> For Erlang, if using a pinning operator had been required from the start;
> I think that would have been a bit better than the current "match
> if already bound".  It is hard to be sure by looking at the code
> if the variable is already bound - you have to make a machine search.
>
> Introducing a pinning operator now is trickier...
>
> Having a compiler option to choose if pinning is allowed/required makes it
> hard to know what to expect from the code.  The compiler option is set in
> some Makefile far away from the source code.
>
> I think I would prefer that instead there should be a compiler pragma
> (I wish it would not be allowed from an include file but that is probably
> impossible to enforce) so it is visible in the current module what to
> expect about operator pinning.  Without the pragma the pinning operator is
> not allowed, with it pinning is mandatory; not a warning - an error if
> a pinning operator is missing.
>
> You get the idea: it should be possible to tell from the source code how to
> read it, at least on the module level.
>
> How to take the next step, i.e. when code not using pinning is the exception,
> to remove the compiler pragma, I have not thought about yet...
>
> Cheers
> / Raimo Niskanen
>
>
>
> On Thu, Dec 24, 2020 at 09:10:17PM +0100, Richard Carlsson wrote:
> > The ^ operator allows you to annotate already-bound pattern variables as
> > ^X, like in Elixir. This is less error prone when code is being refactored
> > and moved around so that variables previously new in a pattern may become
> > bound, or vice versa, and makes it easier for the reader to see the intent
> > of the code.
> >
> > See also https://github.com/erlang/otp/pull/2951
> >
> > Ho ho ho,
> >
> >         /Richard & the good folks at WhatsApp
>
> >     Author: Richard Carlsson <carlsson.richard(at)gmail(dot)com>
> >     Status: Draft
> >     Type: Standards Track
> >     Created: 21-Dec-2020
> >     Erlang-Version: 24
> >     Post-History: 24-Dec-2020
> > ****
> > EEP XXX: Pinning operator ^ in patterns
> > ----
> >
> >
> > Abstract
> > ========
> >
> > This EEP proposes the addition of a new unary operator `^` for
> > explicitly marking variables in patterns as being already bound.  This
> > is known as "pinning" in Elixir - see
> > [the Elixir documentation][Elixir doc].
> >
> > For example:
> >
> >     f(X, Y) ->
> >         case X of
> >             {a, Y} -> ok;
> >             _ -> error
> >         end.
> >
> > could be written more explicitly:
> >
> >     f(X, Y) ->
> >         case X of
> >             {a, ^Y} -> ok;
> >             _ -> error
> >         end.
> >
> > In Elixir, this operator is strictly necessary for being able to refer
> > to the value of a bound variable as part of a pattern, because
> > variables in patterns are always regarded as being new shadowing
> > instances (like in Erlang's fun clause heads), unless explicitly
> > pinned.
> >
> > In Erlang, the operator would be optional, but using it is still a
> > good idea because it makes programs more robust under edits and
> > refactorings, and furthermore allows the use of pinned variables in
> > fun clause heads and in comprehension generator patterns.
> >
> >
> > Specification
> > =============
> >
> > A new unary operator `^` is added to Erlang, called the "pinning
> > operator".  It may only be used in patterns, and only on variables.
> > Its meaning is that the "pinned" variable is to be interpreted in the
> > enclosing environment of the pattern, and its value used in its place
> > for that position in the pattern.
> >
> > In current Erlang, this behaviour is what happens automatically in
> > ordinary matching constructs if the variable is already bound in the
> > enclosing environment.  In the following example:
> >
> >     f(X, Y) ->
> >         case X of
> >             {a, Y} -> {ok, Y};
> >             _ -> error
> >         end.
> >
> > the use of `Y` in the pattern is regarded as a reference to the
> > function parameter `Y`, instead of as introducing a new variable, and
> > the `Y` in the clause body is then that same parameter.  Therefore,
> > annotating the pattern variable as `^Y` in this case does not change
> > the behaviour of the program, but makes the intent explicit:
> >
> >     f(X, Y) ->
> >         case X of
> >             {a, ^Y} -> {ok, Y};
> >             _ -> error
> >         end.
> >
> > For fun expressions and list comprehension generator patterns, the
> > pinning operator makes the language more expressive.  Take the
> > following Erlang code:
> >
> >     f(X, Y) ->
> >         F = fun ({a, Y}) -> {ok, Y};
> >                 (_) -> error
> >             end,
> >         F(X).
> >
> > Here, the occurrence of `Y` in the clause head of the fun `F` is a new
> > variable instance, shadowing the `Y` parameter of `f(X, Y)`, and the
> > fun clause will match any value in that position.  The `Y` in the
> > clause body is the one bound in the clause head.  However, using the
> > pinning operator, we can selectively match on variables bound in the
> > outer scope:
> >
> >     f(X, Y) ->
> >         F = fun ({a, ^Y})  -> {ok, Y};
> >                 (_) -> error
> >             end,
> >         F(X).
> >
> > In this case, there is no new binding of `Y`, and the use of `Y` in
> > the fun clause body refers to the function parameter.  But it is also
> > possible to combine pinning and shadowing in the same pattern:
> >
> >     f(X, Y) ->
> >         F = fun ({a, ^Y, Y})  -> {ok, Y};
> >                 (_) -> error
> >             end,
> >         F(X).
> >
> > In this case, the pinned field refers to the value of the
> > function parameter, but there is also a new shadowing binding of `Y`
> > to the third field of the tuple.  The use in the fun clause body now
> > refers to the shadowing instance.
> >
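For comparison, here is a minimal sketch of how the two scoping rules described above behave in current Erlang, without the `^` operator. The module and function names (`scope_demo`, `in_case`, `in_fun`) are hypothetical and chosen only for illustration. The compiler is expected to warn that `Y` is shadowed and that the parameter is unused in `in_fun/2`, but it still compiles the code.

```erlang
-module(scope_demo).
-export([in_case/2, in_fun/2]).

%% case clause: Y is already bound by the function head, so the
%% pattern {a, Y} is an equality test against the parameter.
in_case(X, Y) ->
    case X of
        {a, Y} -> {ok, Y};
        _ -> error
    end.

%% fun clause head: Y is a fresh variable that shadows the parameter,
%% so the clause matches any second element and returns that element.
in_fun(X, Y) ->
    F = fun ({a, Y}) -> {ok, Y};
            (_) -> error
        end,
    F(X).
```

With these definitions, `scope_demo:in_case({a, 2}, 1)` returns `error`, while `scope_demo:in_fun({a, 2}, 1)` returns `{ok, 2}`, because the fun ignores the outer `Y`.
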
> > Generator patterns in list comprehensions or binary comprehensions
> > follow the same rules as fun clause heads, so with pinning we can for
> > example write the following code:
> >
> >     f(X, Y) ->
> >         [{b, Y} || {a, ^Y, Y} <- X].
> >
> > where the `Y` in `{b, Y}` is the shadowing instance bound to the third
> > element of the pattern tuple.
> >
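As a point of reference, the same filtering can probably be expressed in current Erlang with a filter expression and two helper variables. This is only a sketch: the module name `gen_demo` and the names `OuterY` and `Temp` are hypothetical, and the compiler will likely warn about the shadowed `Y` in the generator.

```erlang
-module(gen_demo).
-export([f/2]).

%% Counterpart in current Erlang of [{b, Y} || {a, ^Y, Y} <- X].
%% OuterY keeps the parameter reachable once Y is shadowed by the
%% generator pattern; Temp captures the field to compare against it.
f(X, Y) ->
    OuterY = Y,
    [{b, Y} || {a, Temp, Y} <- X, Temp =:= OuterY].
```

For example, `gen_demo:f([{a, 1, x}, {a, 2, y}, {b, 1, z}], 1)` returns `[{b, x}]`: only the first tuple has the tag `a` and a second element equal to the parameter.
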
> > Finally, a new compiler flag `warn_unpinned_vars` is added, disabled
> > by default, which if enabled makes the compiler emit warnings about
> > all uses of already bound variables in patterns that are not
> > explicitly annotated with the `^` operator.  This allows users to
> > migrate their code module by module towards using explicit pinning in
> > all their code.  If pinning becomes the norm in Erlang, this flag
> > could be turned on by default, and eventually, the pinning operator
> > could become strictly required for referring to already bound
> > variables in patterns.
> >
> >
> > Rationale
> > =========
> >
> > The explicit pinning of variables in patterns makes programs more
> > readable, because the intent of the code becomes clear.  When already
> > bound variables are used in Erlang without any annotation, anyone
> > reading a piece of code must first study it closely to understand
> > which variables will be bound at the point of a pattern, before they
> > can tell whether any pattern variable is a new binding or implies an
> > equality assertion.  This is easy to miss even for experienced
> > Erlangers, be it during code reviews or while trying to understand a
> > piece of poorly commented code.
> >
> > Perhaps more importantly, pinning also makes programs more robust
> > under edits and refactorings.  Take our previous example, and add a
> > print statement:
> >
> >     f(X, Y) ->
> >         io:format("checking: ~p", [Y]),
> >         case X of
> >             {a, Y} -> {ok, Y};
> >             _ -> error
> >         end.
> >
> > Suppose someone renames the function parameter from `Y` to `Z` and
> > updates the print statement but forgets to update the use in the case
> > clause.  Without an explicit pinning annotation, the change would be
> > quietly allowed, but the `Y` in the pattern would be interpreted as a
> > new variable that will match any value, which will then be used in the
> > body.  This changes the behaviour of the program.  If the use in the
> > pattern had been annotated as `^Y`, the compiler would have generated
> > an error "Y is unbound" and the mistake would have been caught.
> >
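The renaming hazard described above can be reproduced in current Erlang. The sketch below uses hypothetical names (`rename_demo`, `before_rename`, `after_rename`) and adds a newline to the format string for readability. The second function compiles without any warning, since both `Z` and `Y` are used.

```erlang
-module(rename_demo).
-export([before_rename/2, after_rename/2]).

%% Original version: Y in the pattern refers to the parameter.
before_rename(X, Y) ->
    io:format("checking: ~p~n", [Y]),
    case X of
        {a, Y} -> {ok, Y};
        _ -> error
    end.

%% Parameter renamed to Z, but the case clause was left untouched.
%% Y is now unbound at the pattern, so it binds to any value.
after_rename(X, Z) ->
    io:format("checking: ~p~n", [Z]),
    case X of
        {a, Y} -> {ok, Y};
        _ -> error
    end.
```

Here `rename_demo:before_rename({a, 2}, 1)` returns `error`, whereas `rename_demo:after_rename({a, 2}, 1)` returns `{ok, 2}`: the equality check has silently disappeared.
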
> > When code is being modified to add a feature or fix a bug, a
> > programmer might want to introduce a new variable for a temporary
> > result.  In a long function body, this risks introducing a new bug.
> > Consider the following:
> >
> >     g(Stuff) ->
> >        ...
> >        Thing = case ... of
> >                    {a, T} -> T;
> >                    _ -> 0
> >                end,
> >        ...
> >        {ok, [Thing|Stuff]}.
> >
> > Here, `T` is a new variable, clearly intended as just a temporary and
> > local variable for extracting the second element of the tuple.  But
> > suppose that someone adds a binding of the name `T` further up in the
> > function body, without noticing that the name is already in use:
> >
> >     g(Stuff) ->
> >        ...
> >        T = q(Stuff) + 1,
> >        io:format("~p", [p(T)]),
> >        ...
> >        Thing = case ... of
> >                    {a, T} -> T;
> >                    _ -> 0
> >                end,
> >        ...
> >        {ok, [Thing|Stuff]}.
> >
> > Now the first clause of the case switch will only match if the second
> > element of the tuple has the exact same value as the previously
> > defined `T`.  Again, the compiler quietly accepts this change, while
> > if it had been instructed to warn about all non-annotated uses of
> > already bound variables in patterns, this mistake would have been
> > detected.
> >
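A cut-down, runnable variant of this second hazard might look as follows in current Erlang. It does not fill in the elided parts of `g/1`. The module and function names are hypothetical, and the literal `1` merely stands in for whatever value the earlier binding of `T` would hold.

```erlang
-module(temp_demo).
-export([before_edit/1, after_edit/1]).

%% T is a fresh temporary: any {a, _} tuple matches.
before_edit(Msg) ->
    case Msg of
        {a, T} -> T;
        _ -> 0
    end.

%% A binding of T was added earlier in the body. The unchanged pattern
%% {a, T} now only matches when the second element equals that value.
after_edit(Msg) ->
    T = 1,
    case Msg of
        {a, T} -> T;
        _ -> 0
    end.
```

`temp_demo:before_edit({a, 5})` returns `5`, but `temp_demo:after_edit({a, 5})` falls through to the catch-all clause and returns `0`. The compiler accepts both versions without complaint.
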
> >
> > Shadowing in Funs and Comprehensions
> > ------------------------------------
> >
> > In funs and comprehensions, pinning also lets us do things that
> > would otherwise require additional temporary variables.  Consider the
> > following code:
> >
> >     f(X, Y) ->
> >         F = fun ({a, Y}) -> {ok, Y};
> >                 (_) -> error
> >             end,
> >         F(X).
> >
> > Since the `Y` in the clause head of the fun is a new shadowing
> > instance, the pattern will match any value in that position.  To match
> > only the value passed as `Y` to `f`, a clause guard must be added, and
> > a temporary variable be used to access the outer `Y`:
> >
> >     f(X, Y) ->
> >         OuterY = Y,
> >         F = fun ({a, Y}) when Y =:= OuterY -> {ok, Y};
> >                 (_) -> error
> >             end,
> >         F(X).
> >
> > We could instead rename the inner use of `Y` to avoid shadowing, but
> > the equality test must still be written as an explicit guard:
> >
> >     f(X, Y) ->
> >         F = fun ({a, Z}) when Z =:= Y -> {ok, Y};
> >                 (_) -> error
> >             end,
> >         F(X).
> >
> > With the help of the pinning operator, such things are no longer a
> > concern, and we can simply write:
> >
> >     f(X, Y) ->
> >         F = fun ({a, ^Y}) -> {ok, Y};
> >                 (_) -> error
> >             end,
> >         F(X).
> >
> > Furthermore, in the odd case that the pattern would both need to
> > access the surrounding definition of `Y` as well as introduce a new
> > shadowing binding, this can be easily written using pinning:
> >
> >     f(X, Y) ->
> >         F = fun ({a, ^Y, Y})  -> {ok, Y};
> >                 (_) -> error
> >             end,
> >         F(X).
> >
> > but in current Erlang, two separate temporary variables would be
> > required:
> >
> >     f(X, Y) ->
> >         OuterY = Y,
> >         F = fun ({a, Temp, Y}) when Temp =:= OuterY -> {ok, Y};
> >                 (_) -> error
> >             end,
> >         F(X).
> >
> > As explained before, the same goes for patterns in generators of
> > comprehensions.
> >
> >
> >
> > Backwards Compatibility
> > =======================
> >
> > The addition of a new and previously unused operator `^` does not
> > affect the meaning of existing code, and the compiler will not emit
> > any new warnings or errors for existing code, unless explicitly
> > enabled with `warn_unpinned_vars`.  This change is therefore fully
> > backwards compatible.
> >
> >
> >
> > Implementation
> > ==============
> >
> > The implementation can be found in [PR #2951][pr].
> >
> >
> >
> > Copyright
> > =========
> >
> > This document has been placed in the public domain.
> >
> >
> > [Elixir doc]: https://elixir-lang.org/getting-started/pattern-matching.html#the-pin-operator
> >     "Elixir pattern matching - the pin operator"
> >
> > [pr]: https://github.com/erlang/otp/pull/2951
> >     "#2951: Add a new operator ^ for pinning of pattern variables"
> >
> >
> >
> > [EmacsVar]: <> "Local Variables:"
> > [EmacsVar]: <> "mode: indented-text"
> > [EmacsVar]: <> "indent-tabs-mode: nil"
> > [EmacsVar]: <> "sentence-end-double-space: t"
> > [EmacsVar]: <> "fill-column: 70"
> > [EmacsVar]: <> "coding: utf-8"
> > [EmacsVar]: <> "End:"
> > [VimVar]: <> " vim: set fileencoding=utf-8 expandtab shiftwidth=4 softtabstop=4: "
>
>
> --
>
> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>