New EEP draft: Pinning operator ^ in patterns

Ingela Andin ingela.andin@REDACTED
Tue Jan 26 12:05:59 CET 2021


Hi!

So for better or worse I also want to share my view of the "pinning
operator".

I work for the OTP team, but the opinions I am about to share are my own.
The pinning operator, or actually annotation, has turned out to be very
controversial, and although I can see some arguments for it, I am still
far from convinced that I would like to see it in Erlang. I will try to
explain some of my concerns.

Before I started working for the OTP team I worked as an Erlang consultant
and teacher. When teaching Erlang I would tell my students that to program
Erlang/OTP they need to change their mindset from the one they use when
programming in imperative or object-oriented languages. Some things that
are central to Erlang programs are different: pattern matching and single
assignment, recursion as a very common way to solve a problem, and how you
think about concurrency.

One of my coworkers checked how many matches were done in the OTP source
code compared to our test suites, and there were actually more matches in
the source code. I think this is a result of how most Erlang programmers
solve problems the "Erlang way".  Also, this is not counting all the
matches done in function clauses, where the annotation makes no sense
(and hence is not included in the "pinning").  So with pinning we will have
two ways of matching depending on where we match.
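
A minimal sketch of the two situations, written in current Erlang without
the proposed operator. The module and function names are hypothetical and
chosen only for illustration:

```erlang
-module(pin_demo).
-export([in_head/2, in_case/2]).

%% Function clause head: using Y twice means both positions must be
%% equal. Nothing is bound before the head, so there is nothing to pin.
in_head({a, Y}, Y) -> ok;
in_head(_, _) -> error.

%% Case clause: Y is already bound by the function head, so the pattern
%% {a, Y} is an equality test against that value. This is the kind of
%% occurrence the EEP would have annotated.
in_case(X, Y) ->
    case X of
        {a, Y} -> ok;
        _ -> error
    end.
```

Both functions perform the same equality match, but as I read the EEP only
the occurrence in the case pattern would carry the annotation.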

Now the EEP proposal does not allow rebinding of variables, but let's face
it, that would be the next step, especially since rebinding is part of why
Elixir needs to have a "pinning operator", and it is argued that we should
have the ^ to resemble Elixir. Now I am not saying we should allow variable
rebinding, but in the case that we would, I think that the rebinding should
be the thing annotated, because rebinding should be the exception and
matching should be the default.

I think it is human to consider too little of the context, and alas the
compiler will never be able to catch every such mistake; we also have to
rely on regression tests, for instance. How many times has a feature
development or bug fix not involved fixing some other test that happened
to break?

So we will see what happens. I think there are still many things to
consider.

Regards Ingela

Den ons 20 jan. 2021 kl 15:42 skrev Raimo Niskanen <
raimo+erlang-questions@REDACTED>:

> I have a vague feeling that this has been asked,
> but since I can not find it:
>
> How are nested fun()s handled?
>
> foo(Y) ->
>     F = fun (X) ->
>             Y = X + ^Y,
>             FF = fun (Z) ->
>                      Z + ^Y
>                  end,
>             FF(Y)
>         end,
>     F(Y).
>
> Does the innermost Z + ^Y access the outermost Y from foo(Y), or the Y
> bound in F/1, i.e. Y = X + ^Y?
>
> Is there a way to choose which of the outer Y:s to refer to from within
> FF/1?
>
> Cheers
> / Raimo
>
>
>
> On Thu, Dec 24, 2020 at 09:10:17PM +0100, Richard Carlsson wrote:
> > The ^ operator allows you to annotate already-bound pattern variables as
> > ^X, like in Elixir. This is less error prone when code is being refactored
> > and moved around so that variables previously new in a pattern may become
> > bound, or vice versa, and makes it easier for the reader to see the intent
> > of the code.
> >
> > See also https://github.com/erlang/otp/pull/2951
> >
> > Ho ho ho,
> >
> >         /Richard & the good folks at WhatsApp
>
> >     Author: Richard Carlsson <carlsson.richard(at)gmail(dot)com>
> >     Status: Draft
> >     Type: Standards Track
> >     Created: 21-Dec-2020
> >     Erlang-Version: 24
> >     Post-History: 24-Dec-2020
> > ****
> > EEP XXX: Pinning operator ^ in patterns
> > ----
> >
> >
> > Abstract
> > ========
> >
> > This EEP proposes the addition of a new unary operator `^` for
> > explicitly marking variables in patterns as being already bound.  This
> > is known as "pinning" in Elixir - see [the Elixir
> > documentation][Elixir doc].
> >
> > For example:
> >
> >     f(X, Y) ->
> >         case X of
> >             {a, Y} -> ok;
> >             _ -> error
> >         end.
> >
> > could be written more explicitly:
> >
> >     f(X, Y) ->
> >         case X of
> >             {a, ^Y} -> ok;
> >             _ -> error
> >         end.
> >
> > In Elixir, this operator is strictly necessary for being able to refer
> > to the value of a bound variable as part of a pattern, because
> > variables in patterns are always regarded as being new shadowing
> > instances (like in Erlang's fun clause heads), unless explicitly
> > pinned.
> >
> > In Erlang, they would be optional, but are still a good idea because
> > they make programs more robust under edits and refactorings, and
> > furthermore allow the use of pinned variables in fun clause heads and
> > in comprehension generator patterns.
> >
> >
> > Specification
> > =============
> >
> > A new unary operator `^` is added to Erlang, called the "pinning
> > operator".  It may only be used in patterns, and only on variables.
> > Its meaning is that the "pinned" variable is to be interpreted in the
> > enclosing environment of the pattern, and its value used in its place
> > for that position in the pattern.
> >
> > In current Erlang, this behaviour is what happens automatically in
> > ordinary matching constructs if the variable is already bound in the
> > enclosing environment.  In the following example:
> >
> >     f(X, Y) ->
> >         case X of
> >             {a, Y} -> {ok, Y};
> >             _ -> error
> >         end.
> >
> > the use of `Y` in the pattern is regarded as a reference to the
> > function parameter `Y`, instead of as introducing a new variable, and
> > the `Y` in the clause body is then that same parameter.  Therefore,
> > annotating the pattern variable as `^Y` in this case does not change
> > the behaviour of the program, but makes the intent explicit:
> >
> >     f(X, Y) ->
> >         case X of
> >             {a, ^Y} -> {ok, Y};
> >             _ -> error
> >         end.
> >
> > For fun expressions and list comprehension generator patterns, the
> > pinning operator makes the language more expressive.  Take the
> > following Erlang code:
> >
> >     f(X, Y) ->
> >         F = fun ({a, Y}) -> {ok, Y};
> >                 (_) -> error
> >             end,
> >         F(X).
> >
> > Here, the occurrence of `Y` in the clause head of the fun `F` is a new
> > variable instance, shadowing the `Y` parameter of `f(X, Y)`, and the
> > fun clause will match any value in that position.  The `Y` in the
> > clause body is the one bound in the clause head.  However, using the
> > pinning operator, we can selectively match on variables bound in the
> > outer scope:
> >
> >     f(X, Y) ->
> >         F = fun ({a, ^Y})  -> {ok, Y};
> >                 (_) -> error
> >             end,
> >         F(X).
> >
> > In this case, there is no new binding of `Y`, and the use of `Y` in
> > the fun clause body refers to the function parameter.  But it is also
> > possible to combine pinning and shadowing in the same pattern:
> >
> >     f(X, Y) ->
> >         F = fun ({a, ^Y, Y})  -> {ok, Y};
> >                 (_) -> error
> >             end,
> >         F(X).
> >
> > In this case, the pinned field refers to the value of the function
> > parameter, but there is also a new shadowing binding of `Y`
> > to the third field of the tuple.  The use in the fun clause body now
> > refers to the shadowing instance.
> >
> > Generator patterns in list comprehensions or binary comprehensions
> > follow the same rules as fun clause heads, so with pinning we can for
> > example write the following code:
> >
> >     f(X, Y) ->
> >         [{b, Y} || {a, ^Y, Y} <- X].
> >
> > where the `Y` in `{b, Y}` is the shadowing instance bound to the third
> > element of the pattern tuple.
> >
> > Finally, a new compiler flag `warn_unpinned_vars` is added, disabled
> > by default, which if enabled makes the compiler emit warnings about
> > all uses of already bound variables in patterns that are not
> > explicitly annotated with the `^` operator.  This allows users to
> > migrate their code module by module towards using explicit pinning in
> > all their code.  If pinning becomes the norm in Erlang, this flag
> > could be turned on by default, and eventually, the pinning operator
> > could become strictly required for referring to already bound
> > variables in patterns.
> >
> >
> > Rationale
> > =========
> >
> > The explicit pinning of variables in patterns makes programs more
> > readable, because the intent of the code becomes clear.  When already
> > bound variables are used in Erlang without any annotation, anyone
> > reading a piece of code must first study it closely to understand
> > which variables will be bound at the point of a pattern, before they
> > can tell whether any pattern variable is a new binding or implies an
> > equality assertion.  This is easy to miss even for experienced
> > Erlangers, be it during code reviews or while trying to understand a
> > piece of poorly commented code.
> >
> > Perhaps more importantly, pinning also makes programs more robust
> > under edits and refactorings.  Take our previous example, and add a
> > print statement:
> >
> >     f(X, Y) ->
> >         io:format("checking: ~p", [Y]),
> >         case X of
> >             {a, Y} -> {ok, Y};
> >             _ -> error
> >         end.
> >
> > Suppose someone renames the function parameter from `Y` to `Z` and
> > updates the print statement but forgets to update the use in the case
> > clause.  Without an explicit pinning annotation, the change would be
> > quietly allowed, but the `Y` in the pattern would be interpreted as a
> > new variable that will match any value, which will then be used in the
> > body.  This changes the behaviour of the program.  If the use in the
> > pattern had been annotated as `^Y`, the compiler would have generated
> > an error "Y is unbound" and the mistake would have been caught.
> >
> > When code is being modified to add a feature or fix a bug, a
> > programmer might want to introduce a new variable for a temporary
> > result.  In a long function body, this risks introducing a new bug.
> > Consider the following:
> >
> >     g(Stuff) ->
> >        ...
> >        Thing = case ... of
> >                    {a, T} -> T;
> >                    _ -> 0
> >                end,
> >        ...
> >        {ok, [Thing|Stuff]}.
> >
> > Here, `T` is a new variable, clearly intended as just a temporary and
> > local variable for extracting the second element of the tuple.  But
> > suppose that someone adds a binding of the name `T` further up in the
> > function body, without noticing that the name is already in use:
> >
> >     g(Stuff) ->
> >        ...
> >        T = q(Stuff) + 1,
> >        io:format("~p", [p(T)]),
> >        ...
> >        Thing = case ... of
> >                    {a, T} -> T;
> >                    _ -> 0
> >                end,
> >        ...
> >        {ok, [Thing|Stuff]}.
> >
> > Now the first clause of the case switch will only match if the second
> > element of the tuple has the exact same value as the previously
> > defined `T`.  Again, the compiler quietly accepts this change, while
> > if it had been instructed to warn about all non-annotated uses of
> > already bound variables in patterns, this mistake would have been
> > detected.
> >
> >
> > Shadowing in Funs and Comprehensions
> > ------------------------------------
> >
> > In funs and comprehensions, pinning also lets us do things that
> > would otherwise require additional temporary variables.  Consider the
> > following code:
> >
> >     f(X, Y) ->
> >         F = fun ({a, Y}) -> {ok, Y};
> >                 (_) -> error
> >             end,
> >         F(X).
> >
> > Since the `Y` in the clause head of the fun is a new shadowing
> > instance, the pattern will match any value in that position.  To match
> > only the value passed as `Y` to `f`, a clause guard must be added, and
> > a temporary variable be used to access the outer `Y`:
> >
> >     f(X, Y) ->
> >         OuterY = Y,
> >         F = fun ({a, Y}) when Y =:= OuterY -> {ok, Y};
> >                 (_) -> error
> >             end,
> >         F(X).
> >
> > We could instead rename the inner use of `Y` to avoid shadowing, but
> > the equality test must still be written as an explicit guard:
> >
> >     f(X, Y) ->
> >         F = fun ({a, Z}) when Z =:= Y -> {ok, Y};
> >                 (_) -> error
> >             end,
> >         F(X).
> >
> > With the help of the pinning operator, such things are no longer a
> > concern, and we can simply write:
> >
> >     f(X, Y) ->
> >         F = fun ({a, ^Y}) -> {ok, Y};
> >                 (_) -> error
> >             end,
> >         F(X).
> >
> > Furthermore, in the odd case that the pattern would both need to
> > access the surrounding definition of `Y` as well as introduce a new
> > shadowing binding, this can be easily written using pinning:
> >
> >     f(X, Y) ->
> >         F = fun ({a, ^Y, Y})  -> {ok, Y};
> >                 (_) -> error
> >             end,
> >         F(X).
> >
> > but in current Erlang, two separate temporary variables would be
> > required:
> >
> >     f(X, Y) ->
> >         OuterY = Y,
> >         F = fun ({a, Temp, Y}) when Temp =:= OuterY -> {ok, Y};
> >                 (_) -> error
> >             end,
> >         F(X).
> >
> > As explained before, the same goes for patterns in generators of
> > comprehensions.
> >
> >
> >
> > Backwards Compatibility
> > =======================
> >
> > The addition of a new and previously unused operator `^` does not
> > affect the meaning of existing code, and the compiler will not emit
> > any new warnings or errors for existing code, unless explicitly
> > enabled with `warn_unpinned_vars`.  This change is therefore fully
> > backwards compatible.
> >
> >
> >
> > Implementation
> > ==============
> >
> > The implementation can be found in [PR #2951][pr].
> >
> >
> >
> > Copyright
> > =========
> >
> > This document has been placed in the public domain.
> >
> >
> > [Elixir doc]: https://elixir-lang.org/getting-started/pattern-matching.html#the-pin-operator
> >     "Elixir pattern matching - the pin operator"
> >
> > [pr]: https://github.com/erlang/otp/pull/2951
> >     "#2951: Add a new operator ^ for pinning of pattern variables"
> >
> >
> >
> > [EmacsVar]: <> "Local Variables:"
> > [EmacsVar]: <> "mode: indented-text"
> > [EmacsVar]: <> "indent-tabs-mode: nil"
> > [EmacsVar]: <> "sentence-end-double-space: t"
> > [EmacsVar]: <> "fill-column: 70"
> > [EmacsVar]: <> "coding: utf-8"
> > [EmacsVar]: <> "End:"
> > [VimVar]: <> " vim: set fileencoding=utf-8 expandtab shiftwidth=4 softtabstop=4: "
>
>
> --
>
> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>