New EEP draft: Pinning operator ^ in patterns

Thu Jan 14 22:37:15 CET 2021

I don't like the idea of pinning. One should already know which
variables are not bound. The pinning character would just be more junk
to keep track of.

I would vote for not having a pinning character.

Sam

On Thu, Jan 14, 2021 at 10:14 AM Leonard B <leonard.boyce@REDACTED> wrote:
>
> I've been mulling this over ever since it was first announced and I
> have to admit I have a fairly strong reaction to the proposal.
>
> I'm not a contributor, but I've been using Erlang in anger daily for
> 12 odd years now.
>
> I appreciate the utility of the proposed pinning operator in two specific cases:
> 1. matching against a previously bound variable name in function heads of funs
> 2. matching against a previously bound variable name in list comprehensions
>
> In both these cases, as I understand it, we end up saving a few
> characters and a temp variable name.
>
> As to eventually making it an error, that I'd be very much against.
> Having to update every single piece of code ever written to now add an
> extra character in every case we want to match against a previously
> defined variable name seems like an excessive burden.
>
> There is also the unfortunate side effect of it introducing confusion
> thanks to the ties with Elixir syntax. I've been seeing this both here
> on the list and in Slack discussions.
>
> Feel free to ignore the following...
> I ask myself, since there are already 'erlangy' scoping rules for
> variables (esp with funs) would it not just be better to change those
> to more fully allow defined variables from outer scope within
> fun/comprehension scope?
> IE, if a variable is previously assigned within the outer scope use
> that variable within the inner scope.
> This, in my addled mind, makes more sense
>
> EG:
> %% X is from outer scope, L is from outer scope, V is local to comprehension
> X = 5.
> L = [{5,a}, {6,b}, {5,c}].
> [V || {X, V} <- L].  %% with the scoping suggested above: [a,c]
>
>
>
> On Thu, Jan 14, 2021 at 8:14 AM Richard Carlsson
> <carlsson.richard@REDACTED> wrote:
> >
> > The way I planned it is:
> >   1. Even from the start, pinning will always be allowed, without requiring any flag to opt in. This does not tell you about existing uses of already-bound variables, but you can start using pinning right away for readability and for avoiding bugs when refactoring. The compiler will always tell you if a pinned variable doesn't exist, so you don't accidentally accept any value in that position.
> >   2. You can enable warnings at your own pace in order to start cleaning up your code.
> >   3. In a following major release, the warnings will be on by default, but you can disable them to compile old code.
> >   4. In a distant future, it might become an error to not use ^ to mark already-bound variables.
> >
> >         /Richard
> >
> >
> > Den tors 14 jan. 2021 kl 13:33 skrev Raimo Niskanen <raimo+erlang-questions@REDACTED>:
> >>
> >> As others have said: for Elixir this operator is essential, since they
> >> rebind variables without it.
> >>
> >> For Erlang, if using a pinning operator had been required from the start;
> >> I think that would have been a bit better than the current "match
> >> if already bound".  It is hard to be sure by looking at the code
> >> if the variable is already bound - you have to make a machine search.
> >>
> >> Introducing a pinning operator now is trickier...
> >>
> >> Having a compiler option to choose if pinning is allowed/required makes it
> >> hard to know what to expect from the code.  The compiler option is set in
> >> some Makefile far away from the source code.
> >>
> >> I think I would prefer that instead there should be a compiler pragma
> >> (I wish it would not be allowed from an include file but that is probably
> >> impossible to enforce) so it is visible in the current module what to
> >> expect about operator pinning.  Without the pragma the pinning operator is
> >> not allowed, with it pinning is mandatory; not a warning - an error if
> >> a pinning operator is missing.
> >>
> >> You get the idea: it should be possible to tell from the source code
> >> how to read it, at least on the module level.
> >>
> >> How to take the next step, i.e. removing the compiler pragma once code
> >> not using pinning is the exception, I have not thought about yet...
> >>
> >> Cheers
> >> / Raimo Niskanen
> >>
> >>
> >>
> >> On Thu, Dec 24, 2020 at 09:10:17PM +0100, Richard Carlsson wrote:
> >> > The ^ operator allows you to annotate already-bound pattern variables as
> >> > ^X, like in Elixir. This is less error prone when code is being refactored
> >> > and moved around so that variables previously new in a pattern may become
> >> > bound, or vice versa, and makes it easier for the reader to see the intent
> >> > of the code.
> >> >
> >> > See also https://github.com/erlang/otp/pull/2951
> >> >
> >> > Ho ho ho,
> >> >
> >> >         /Richard & the good folks at WhatsApp
> >>
> >> >     Author: Richard Carlsson <carlsson.richard(at)gmail(dot)com>
> >> >     Status: Draft
> >> >     Type: Standards Track
> >> >     Created: 21-Dec-2020
> >> >     Erlang-Version: 24
> >> >     Post-History: 24-Dec-2020
> >> > ****
> >> > EEP XXX: Pinning operator ^ in patterns
> >> > ----
> >> >
> >> >
> >> > Abstract
> >> > ========
> >> >
> >> > This EEP proposes the addition of a new unary operator `^` for
> >> > explicitly marking variables in patterns as being already bound.  This
> >> > is known as "pinning" in Elixir - see [the Elixir
> >> > documentation][Elixir doc].
> >> >
> >> > For example:
> >> >
> >> >     f(X, Y) ->
> >> >         case X of
> >> >             {a, Y} -> ok;
> >> >             _ -> error
> >> >         end.
> >> >
> >> > could be written more explicitly:
> >> >
> >> >     f(X, Y) ->
> >> >         case X of
> >> >             {a, ^Y} -> ok;
> >> >             _ -> error
> >> >         end.
> >> >
> >> > In Elixir, this operator is strictly necessary for being able to refer
> >> > to the value of a bound variable as part of a pattern, because
> >> > variables in patterns are always regarded as being new shadowing
> >> > instances (like in Erlang's fun clause heads), unless explicitly
> >> > pinned.
> >> >
> >> > In Erlang, pinning annotations would be optional, but are still a
> >> > good idea because they make programs more robust under edits and
> >> > refactorings, and furthermore allow the use of pinned variables in
> >> > fun clause heads and in comprehension generator patterns.
> >> >
> >> >
> >> > Specification
> >> > =============
> >> >
> >> > A new unary operator `^` is added to Erlang, called the "pinning
> >> > operator".  It may only be used in patterns, and only on variables.
> >> > Its meaning is that the "pinned" variable is to be interpreted in the
> >> > enclosing environment of the pattern, and its value used in its place
> >> > for that position in the pattern.
> >> >
> >> > In current Erlang, this behaviour is what happens automatically in
> >> > ordinary matching constructs if the variable is already bound in the
> >> > enclosing environment.  In the following example:
> >> >
> >> >     f(X, Y) ->
> >> >         case X of
> >> >             {a, Y} -> {ok, Y};
> >> >             _ -> error
> >> >         end.
> >> >
> >> > the use of `Y` in the pattern is regarded as a reference to the
> >> > function parameter `Y`, instead of as introducing a new variable, and
> >> > the `Y` in the clause body is then that same parameter.  Therefore,
> >> > annotating the pattern variable as `^Y` in this case does not change
> >> > the behaviour of the program, but makes the intent explicit:
> >> >
> >> >     f(X, Y) ->
> >> >         case X of
> >> >             {a, ^Y} -> {ok, Y};
> >> >             _ -> error
> >> >         end.
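> >> >
> >> > The current behaviour described above can be checked with a minimal
> >> > sketch in today's Erlang, without `^`.  The module name
> >> > `pin_case_demo` and the helper `demo/0` are illustrative assumptions,
> >> > not part of the proposal:
> >> >
> >> > ```erlang
> >> > -module(pin_case_demo).
> >> > -export([f/2, demo/0]).
> >> >
> >> > %% Y is already bound as a function parameter, so {a, Y} in the
> >> > %% case pattern is an equality test, not a new binding.
> >> > f(X, Y) ->
> >> >     case X of
> >> >         {a, Y} -> {ok, Y};
> >> >         _ -> error
> >> >     end.
> >> >
> >> > %% demo/0 is a hypothetical helper; each line is a match assertion.
> >> > demo() ->
> >> >     {ok, 1} = f({a, 1}, 1),   % second element equals Y
> >> >     error = f({a, 2}, 1),     % second element differs from Y
> >> >     error = f({b, 1}, 1),     % tag differs
> >> >     ok.
> >> > ```
> >> >
> >> > Calling `pin_case_demo:demo()` should return `ok` if the pattern
> >> > behaves as an equality test on the bound `Y`.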
> >> >
> >> > For fun expressions and list comprehension generator patterns, the
> >> > pinning operator makes the language more expressive.  Take the
> >> > following Erlang code:
> >> >
> >> >     f(X, Y) ->
> >> >         F = fun ({a, Y}) -> {ok, Y};
> >> >                 (_) -> error
> >> >             end,
> >> >         F(X).
> >> >
> >> > Here, the occurrence of `Y` in the clause head of the fun `F` is a new
> >> > variable instance, shadowing the `Y` parameter of `f(X, Y)`, and the
> >> > fun clause will match any value in that position.  The `Y` in the
> >> > clause body is the one bound in the clause head.  However, using the
> >> > pinning operator, we can selectively match on variables bound in the
> >> > outer scope:
> >> >
> >> >     f(X, Y) ->
> >> >         F = fun ({a, ^Y})  -> {ok, Y};
> >> >                 (_) -> error
> >> >             end,
> >> >         F(X).
> >> >
> >> > In this case, there is no new binding of `Y`, and the use of `Y` in
> >> > the fun clause body refers to the function parameter.  But it is also
> >> > possible to combine pinning and shadowing in the same pattern:
> >> >
> >> >     f(X, Y) ->
> >> >         F = fun ({a, ^Y, Y})  -> {ok, Y};
> >> >                 (_) -> error
> >> >             end,
> >> >         F(X).
> >> >
> >> > In this case, the pinned field refers to the value of the function
> >> > parameter, but there is also a new shadowing binding of `Y` to the
> >> > third field of the tuple.  The use in the fun clause body now refers
> >> > to the shadowing instance.
> >> >
> >> > Generator patterns in list comprehensions or binary comprehensions
> >> > follow the same rules as fun clause heads, so with pinning we can for
> >> > example write the following code:
> >> >
> >> >     f(X, Y) ->
> >> >         [{b, Y} || {a, ^Y, Y} <- X].
> >> >
> >> > where the `Y` in `{b, Y}` is the shadowing instance bound to the third
> >> > element of the pattern tuple.
> >> >
> >> > Finally, a new compiler flag `warn_unpinned_vars` is added, disabled
> >> > by default, which if enabled makes the compiler emit warnings about
> >> > all uses of already bound variables in patterns that are not
> >> > explicitly annotated with the `^` operator.  This allows users to
> >> > migrate their code module by module towards using explicit pinning in
> >> > all their code.  If pinning becomes the norm in Erlang, this flag
> >> > could be turned on by default, and eventually, the pinning operator
> >> > could become strictly required for referring to already bound
> >> > variables in patterns.
> >> >
> >> >
> >> > Rationale
> >> > =========
> >> >
> >> > The explicit pinning of variables in patterns makes programs more
> >> > readable, because the intent of the code becomes clear.  When already
> >> > bound variables are used in Erlang without any annotation, anyone
> >> > reading a piece of code must first study it closely to understand
> >> > which variables will be bound at the point of a pattern, before they
> >> > can tell whether any pattern variable is a new binding or implies an
> >> > equality assertion.  This is easy to miss even for experienced
> >> > Erlangers, be it during code reviews or while trying to understand a
> >> > piece of poorly commented code.
> >> >
> >> > Perhaps more importantly, pinning also makes programs more robust
> >> > under edits and refactorings.  Take our previous example, and add a
> >> > print statement:
> >> >
> >> >     f(X, Y) ->
> >> >         io:format("checking: ~p", [Y]),
> >> >         case X of
> >> >             {a, Y} -> {ok, Y};
> >> >             _ -> error
> >> >         end.
> >> >
> >> > Suppose someone renames the function parameter from `Y` to `Z` and
> >> > updates the print statement but forgets to update the use in the case
> >> > clause.  Without an explicit pinning annotation, the change would be
> >> > quietly allowed, but the `Y` in the pattern would be interpreted as a
> >> > new variable that will match any value, which will then be used in the
> >> > body.  This changes the behaviour of the program.  If the use in the
> >> > pattern had been annotated as `^Y`, the compiler would have generated
> >> > an error "Y is unbound" and the mistake would have been caught.
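> >> >
> >> > The silent change in behaviour can be sketched in today's Erlang as
> >> > follows.  The module name `pin_rename_demo`, the function names
> >> > `before_rename/2` and `after_rename/2`, and the helper `demo/0` are
> >> > assumptions made for illustration only:
> >> >
> >> > ```erlang
> >> > -module(pin_rename_demo).
> >> > -export([before_rename/2, after_rename/2, demo/0]).
> >> >
> >> > %% Original version: Y in the pattern refers to the parameter Y.
> >> > before_rename(X, Y) ->
> >> >     io:format("checking: ~p~n", [Y]),
> >> >     case X of
> >> >         {a, Y} -> {ok, Y};
> >> >         _ -> error
> >> >     end.
> >> >
> >> > %% Parameter renamed to Z, pattern left untouched: Y is now a
> >> > %% fresh variable that matches any value.
> >> > after_rename(X, Z) ->
> >> >     io:format("checking: ~p~n", [Z]),
> >> >     case X of
> >> >         {a, Y} -> {ok, Y};
> >> >         _ -> error
> >> >     end.
> >> >
> >> > %% demo/0 is a hypothetical helper; each line is a match assertion.
> >> > demo() ->
> >> >     error = before_rename({a, 2}, 1),    % 2 is not equal to Y = 1
> >> >     {ok, 2} = after_rename({a, 2}, 1),   % Y is new and binds to 2
> >> >     ok.
> >> > ```
> >> >
> >> > Both functions compile in current Erlang, which is the point: the
> >> > second one differs from the first only in the parameter name, yet
> >> > accepts input that the first one rejects.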
> >> >
> >> > When code is being modified to add a feature or fix a bug, a
> >> > programmer might want to introduce a new variable for a temporary
> >> > result.  In a long function body, this risks introducing a new bug.
> >> > Consider the following:
> >> >
> >> >     g(Stuff) ->
> >> >        ...
> >> >        Thing = case ... of
> >> >                    {a, T} -> T;
> >> >                    _ -> 0
> >> >                end,
> >> >        ...
> >> >        {ok, [Thing|Stuff]}.
> >> >
> >> > Here, `T` is a new variable, clearly intended as just a temporary and
> >> > local variable for extracting the second element of the tuple.  But
> >> > suppose that someone adds a binding of the name `T` further up in the
> >> > function body, without noticing that the name is already in use:
> >> >
> >> >     g(Stuff) ->
> >> >        ...
> >> >        T = q(Stuff) + 1,
> >> >        io:format("~p", [p(T)]),
> >> >        ...
> >> >        Thing = case ... of
> >> >                    {a, T} -> T;
> >> >                    _ -> 0
> >> >                end,
> >> >        ...
> >> >        {ok, [Thing|Stuff]}.
> >> >
> >> > Now the first clause of the case switch will only match if the second
> >> > element of the tuple has the exact same value as the previously
> >> > defined `T`.  Again, the compiler quietly accepts this change, while
> >> > if it had been instructed to warn about all non-annotated uses of
> >> > already bound variables in patterns, this mistake would have been
> >> > detected.
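> >> >
> >> > A runnable sketch of this scenario in today's Erlang is given below.
> >> > The elided parts of `g` are filled with stand-ins that are purely
> >> > assumptions: an extra `Input` parameter replaces the elided case
> >> > expression, and `length(Stuff) + 1` replaces `q(Stuff) + 1`.  The
> >> > module and function names are likewise illustrative:
> >> >
> >> > ```erlang
> >> > -module(pin_temp_demo).
> >> > -export([g_before/2, g_after/2, demo/0]).
> >> >
> >> > %% Before the edit: T is a fresh temporary bound by the pattern.
> >> > g_before(Input, Stuff) ->
> >> >     Thing = case Input of
> >> >                 {a, T} -> T;
> >> >                 _ -> 0
> >> >             end,
> >> >     {ok, [Thing|Stuff]}.
> >> >
> >> > %% After the edit: T is bound further up, so {a, T} now only
> >> > %% matches when the second element equals that earlier value.
> >> > g_after(Input, Stuff) ->
> >> >     T = length(Stuff) + 1,          % stand-in for q(Stuff) + 1
> >> >     io:format("~p~n", [T]),
> >> >     Thing = case Input of
> >> >                 {a, T} -> T;
> >> >                 _ -> 0
> >> >             end,
> >> >     {ok, [Thing|Stuff]}.
> >> >
> >> > %% demo/0 is a hypothetical helper; each line is a match assertion.
> >> > demo() ->
> >> >     {ok, [7, x, y]} = g_before({a, 7}, [x, y]),
> >> >     {ok, [0, x, y]} = g_after({a, 7}, [x, y]),  % T = 3, no match
> >> >     {ok, [3, x, y]} = g_after({a, 3}, [x, y]),  % matches only 3
> >> >     ok.
> >> > ```
> >> >
> >> > The case expression is textually identical in both functions; only
> >> > the added binding of `T` changes which inputs the first clause
> >> > accepts.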
> >> >
> >> >
> >> > Shadowing in Funs and Comprehensions
> >> > ------------------------------------
> >> >
> >> > In funs and comprehensions, pinning also lets us do things that
> >> > would otherwise require additional temporary variables.  Consider the
> >> > following code:
> >> >
> >> >     f(X, Y) ->
> >> >         F = fun ({a, Y}) -> {ok, Y};
> >> >                 (_) -> error
> >> >             end,
> >> >         F(X).
> >> >
> >> > Since the `Y` in the clause head of the fun is a new shadowing
> >> > instance, the pattern will match any value in that position.  To match
> >> > only the value passed as `Y` to `f`, a clause guard must be added, and
> >> > a temporary variable be used to access the outer `Y`:
> >> >
> >> >     f(X, Y) ->
> >> >         OuterY = Y,
> >> >         F = fun ({a, Y}) when Y =:= OuterY -> {ok, Y};
> >> >                 (_) -> error
> >> >             end,
> >> >         F(X).
> >> >
> >> > We could instead rename the inner use of `Y` to avoid shadowing, but
> >> > the equality test must still be written as an explicit guard:
> >> >
> >> >     f(X, Y) ->
> >> >         F = fun ({a, Z}) when Z =:= Y -> {ok, Y};
> >> >                 (_) -> error
> >> >             end,
> >> >         F(X).
> >> >
> >> > With the help of the pinning operator, such things are no longer a
> >> > concern, and we can simply write:
> >> >
> >> >     f(X, Y) ->
> >> >         F = fun ({a, ^Y}) -> {ok, Y};
> >> >                 (_) -> error
> >> >             end,
> >> >         F(X).
> >> >
> >> > Furthermore, in the odd case that the pattern needs both to access
> >> > the surrounding definition of `Y` and to introduce a new shadowing
> >> > binding, this can be easily written using pinning:
> >> >
> >> >     f(X, Y) ->
> >> >         F = fun ({a, ^Y, Y})  -> {ok, Y};
> >> >                 (_) -> error
> >> >             end,
> >> >         F(X).
> >> >
> >> > but in current Erlang, two separate temporary variables would be
> >> > required:
> >> >
> >> >     f(X, Y) ->
> >> >         OuterY = Y,
> >> >         F = fun ({a, Temp, Y}) when Temp =:= OuterY -> {ok, Y};
> >> >                 (_) -> error
> >> >             end,
> >> >         F(X).
> >> >
> >> > As explained before, the same goes for patterns in generators of
> >> > comprehensions.
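> >> >
> >> > For comprehensions, one way to express
> >> > `[{b, Y} || {a, ^Y, Y} <- X]` in today's Erlang is sketched below,
> >> > using two fresh variables and a filter.  The names `pin_lc_demo`,
> >> > `Y0`, `Y1` and `demo/0` are assumptions made for this illustration:
> >> >
> >> > ```erlang
> >> > -module(pin_lc_demo).
> >> > -export([f/2, demo/0]).
> >> >
> >> > %% Y0 takes the place of the pinned field and is compared to the
> >> > %% outer Y in a filter; Y1 takes the place of the shadowing Y.
> >> > f(X, Y) ->
> >> >     [{b, Y1} || {a, Y0, Y1} <- X, Y0 =:= Y].
> >> >
> >> > %% demo/0 is a hypothetical helper; each line is a match assertion.
> >> > demo() ->
> >> >     In = [{a, 1, 10}, {a, 2, 20}, {a, 1, 30}, {c, 1, 40}],
> >> >     [{b, 10}, {b, 30}] = f(In, 1),
> >> >     [] = f([{a, 2, 20}], 1),
> >> >     ok.
> >> > ```
> >> >
> >> > The filter uses `=:=` because a pattern match compares terms
> >> > exactly, so `1` and `1.0` are kept distinct in both forms.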
> >> >
> >> >
> >> >
> >> > Backwards Compatibility
> >> > =======================
> >> >
> >> > The addition of a new and previously unused operator `^` does not
> >> > affect the meaning of existing code, and the compiler will not emit
> >> > any new warnings or errors for existing code, unless explicitly
> >> > enabled with `warn_unpinned_vars`.  This change is therefore fully
> >> > backwards compatible.
> >> >
> >> >
> >> >
> >> > Implementation
> >> > ==============
> >> >
> >> > The implementation can be found in [PR #2951][pr].
> >> >
> >> >
> >> >
> >> > Copyright
> >> > =========
> >> >
> >> > This document has been placed in the public domain.
> >> >
> >> >
> >> > [Elixir doc]: https://elixir-lang.org/getting-started/pattern-matching.html#the-pin-operator
> >> >     "Elixir pattern matching - the pin operator"
> >> >
> >> > [pr]: https://github.com/erlang/otp/pull/2951
> >> >     "#2951: Add a new operator ^ for pinning of pattern variables"
> >> >
> >> >
> >> >
> >> > [EmacsVar]: <> "Local Variables:"
> >> > [EmacsVar]: <> "mode: indented-text"
> >> > [EmacsVar]: <> "indent-tabs-mode: nil"
> >> > [EmacsVar]: <> "sentence-end-double-space: t"
> >> > [EmacsVar]: <> "fill-column: 70"
> >> > [EmacsVar]: <> "coding: utf-8"
> >> > [EmacsVar]: <> "End:"
> >> > [VimVar]: <> " vim: set fileencoding=utf-8 expandtab shiftwidth=4 softtabstop=4: "
> >>
> >>
> >> --
> >>
> >> / Raimo Niskanen, Erlang/OTP, Ericsson AB