New EEP draft: Pinning operator ^ in patterns

Raimo Niskanen raimo+erlang-questions@REDACTED
Thu Jan 14 16:41:05 CET 2021


On Thu, Jan 14, 2021 at 02:13:47PM +0100, Richard Carlsson wrote:
> The way I planned it is:
>   1. Even from the start, pinning will always be allowed, without requiring
> any flag to opt in. This does not tell you about existing uses of
> already-bound variables, but you can start using pinning right away for
> readability and for avoiding bugs when refactoring. The compiler will
> always tell you if a pinned variable doesn't exist, so you don't
> accidentally accept any value in that position.

I do not like an intermediate state where pinning may be used in only some
places in a module.  Code should be rewritten module wise to keep each
module consistent.  And if a module is using pinning it should be a
syntactical error to not use pinning where it should be.


>   2. You can enable warnings at your own pace in order to start cleaning up
> your code.
>   3. In a following major release, the warnings will be on by default, but
> you can disable them to compile old code.
>   4. In a distant future, it might become an error to not use ^ to mark
> already-bound variables.
> 
>         /Richard
> 
> 
> Den tors 14 jan. 2021 kl 13:33 skrev Raimo Niskanen <
> raimo+erlang-questions@REDACTED>:
> 
> > As others have said: for Elixir this operator is essential, since they
> > rebind variables without it.
> >
> > For Erlang, if using a pinning operator had been required from the start;
> > I think that would have been a bit better than the current "match
> > if already bound".  It is hard to be sure by looking at the code
> > if the variable is already bound - you have to make a machine search.
> >
> > Introducing a pinning operator now is trickier...
> >
> > Having a compiler option to choose if pinning is allowed/required makes it
> > hard to know what to expect from the code.  The compiler option is set in
> > some Makefile far away from the source code.
> >
> > I think I would prefer that instead there should be a compiler pragma
> > (I wish it would not be allowed from an include file but that is probably
> > impossible to enforce) so it is visible in the current module what to
> > expect about operator pinning.  Without the pragma the pinning operator is
> > not allowed, with it pinning is mandatory; not a warning - an error if
> > a pinning operator is missing.
> >
> > You get the idea: it should be possible from the source code how to read
> > it, at least on the module level.
> >
> > How to take the next step i.e when code not using pinning is the exception,
> > to remove the compiler pragma, I have not thought about yet...
> >
> > Cheers
> > / Raimo Niskanen
> >
> >
> >
> > On Thu, Dec 24, 2020 at 09:10:17PM +0100, Richard Carlsson wrote:
> > > The ^ operator allows you to annotate already-bound pattern variables as
> > > ^X, like in Elixir. This is less error prone when code is being
> > refactored
> > > and moved around so that variables previously new in a pattern may become
> > > bound, or vice versa, and makes it easier for the reader to see the
> > intent
> > > of the code.
> > >
> > > See also https://github.com/erlang/otp/pull/2951
> > >
> > > Ho ho ho,
> > >
> > >         /Richard & the good folks at WhatsApp
> >
> > >     Author: Richard carlsson <carlsson.richard(at)gmail(dot)com>
> > >     Status: Draft
> > >     Type: Standards Track
> > >     Created: 21-Dec-2020
> > >     Erlang-Version: 24
> > >     Post-History: 24-Dec-2020
> > > ****
> > > EEP XXX: Pinning operator ^ in patterns
> > > ----
> > >
> > >
> > > Abstract
> > > ========
> > >
> > > This EEP proposes the addition of a new unary operator `^` for
> > > explicitly marking variables in patterns as being already bound.  This
> > > is known as "pinning" in Elixir - see [Elixir doc][the Elixir
> > > documentation].
> > >
> > > For example:
> > >
> > >     f(X, Y) ->
> > >         case X of
> > >             {a, Y} -> ok;
> > >             _ -> error
> > >         end.
> > >
> > > could be written more explicitly:
> > >
> > >     f(X, Y) ->
> > >         case X of
> > >             {a, ^Y} -> ok;
> > >             _ -> error
> > >         end.
> > >
> > > In Elixir, this operator is strictly necessary for being able to refer
> > > to the value of a bound variable as part of a pattern, because
> > > variables in patterns are always regarded as being new shadowing
> > > instances (like in Erlang's fun clause heads), unless explicitly
> > > pinned.
> > >
> > > In Erlang, they would be optional, but are still a good idea because
> > > they make programs more robust under edits and refactorings, and
> > > furthermore allow the use of pinned variables in fun clause heads and
> > > in comprehension generator patterns.
> > >
> > >
> > > Specification
> > > =============
> > >
> > > A new unary operator `^` is added to Erlang, called the "pinning
> > > operator".  It may only be used in patterns, and only on variables.
> > > Its meaning is that the "pinned" variable is to be interpreted in the
> > > enclosing environment of the pattern, and its value used in its place
> > > for that position in the pattern.
> > >
> > > In current Erlang, this behaviour is what happens automatically in
> > > ordinary matching constructs if the variable is already bound in the
> > > enclosing environment.  In the following example:
> > >
> > >     f(X, Y) ->
> > >         case X of
> > >             {a, Y} -> {ok, Y};
> > >             _ -> error
> > >         end.
> > >
> > > the use of `Y` in the pattern is regarded as a reference to the
> > > function parameter `Y`, instead of as introducing a new variable, and
> > > the `Y` in the clause body is then that same parameter.  Therefore,
> > > annotating the pattern variable as `^Y` in this case does not change
> > > the behaviour of the program, but makes the intent explicit:
> > >
> > >     f(X, Y) ->
> > >         case X of
> > >             {a, ^Y} -> {ok, Y};
> > >             _ -> error
> > >         end.
> > >
> > > For fun expressions and list comprehension generator patterns, the
> > > pinning operator makes the language more expressive.  Take the
> > > following Erlang code:
> > >
> > >     f(X, Y) ->
> > >         F = fun ({a, Y}) -> {ok, Y};
> > >                 (_) -> error
> > >             end,
> > >         F(X).
> > >
> > > Here, the occurrence of `Y` in the clause head of the fun `F` is a new
> > > variable instance, shadowing the `Y` parameter of `f(X, Y)`, and the
> > > fun clause will match any value in that position.  The `Y` in the
> > > clause body is the one bound in the clause head.  However, using the
> > > pinning operator, we can selectively match on variables bound in the
> > > outer scope:
> > >
> > >     f(X, Y) ->
> > >         F = fun ({a, ^Y})  -> {ok, Y};
> > >                 (_) -> error
> > >             end,
> > >         F(X).
> > >
> > > In this case, there is no new binding of `Y`, and the use of `Y` in
> > > the fun clause body refers to the function parameter.  But it is also
> > > possible to combine pinning and shadowing in the same pattern:
> > >
> > >     f(X, Y) ->
> > >         F = fun ({a, ^Y, Y})  -> {ok, Y};
> > >                 (_) -> error
> > >             end,
> > >         F(X).
> > >
> > > In this case, the pinned field refers to the value of the function
> > > function parameter, but there is also a new shadowing binding of `Y`
> > > to the third field of the tuple.  The use in the fun clause body now
> > > refers to the shadowing instance.
> > >
> > > Generator patterns in list comprehensions or binary comprehensions
> > > follow the same rules as fun clause heads, so with pinning we can for
> > > example write the following code:
> > >
> > >     f(X, Y) ->
> > >         [{b, Y} || {a, ^Y, Y} <- X].
> > >
> > > where the `Y` in `{b, Y}` is the shadowing instance bound to the third
> > > element of the pattern tuple.
> > >
> > > Finally, a new compiler flag `warn_unpinned_vars` is added, disabled
> > > by default, which if enabled makes the compiler emit warnings about
> > > all uses of already bound variables in patterns that are not
> > > explicitly annotated with the `^` operator.  This allows users to
> > > migrate their code module by module towards using explicit pinning in
> > > all their code.  If pinning becomes the norm in Erlang, this flag
> > > could be turned on by default, and eventually, the pinning operator
> > > could become strictly required for referring to already bound
> > > variables in patterns.
> > >
> > >
> > > Rationale
> > > =========
> > >
> > > The explicit pinning of variables in patterns make programs more
> > > readable, because the intent of the code becomes clear.  When already
> > > bound variables are used in Erlang without any annotation, anyone
> > > reading a piece of code must first study it closely to understand
> > > which variables will be bound at the point of a pattern, before they
> > > can tell whether any pattern variable is a new binding or implies an
> > > equality assertion.  This is easy to miss even for experienced
> > > Erlangers, be it during code reviews or while trying to understand a
> > > piece of poorly commented code.
> > >
> > > Perhaps more importantly, pinning also makes programs more robust
> > > under edits and refactorings.  Take our previous example, and add a
> > > print statement:
> > >
> > >     f(X, Y) ->
> > >         io:format("checking: ~p", [Y]),
> > >         case X of
> > >             {a, Y} -> {ok, Y};
> > >             _ -> error
> > >         end.
> > >
> > > Suppose someone renames the function parameter from `Y` to `Z` and
> > > updates the print statement but forgets to update the use in the case
> > > clause.  Without an explicit pinning annotation, the change would be
> > > quietly allowed, but the `Y` in the pattern would be interpreted as a
> > > new variable that will match any value, which will then be used in the
> > > body.  This changes the behaviour of the program.  If the use in the
> > > pattern had been annotated as `^Y`, the compiler would have generated
> > > an error "Y is unbound" and the mistake would have been caught.
> > >
> > > When code is being modified to add a feature or fix a bug, a
> > > programmer might want to introduce a new variable for a temporary
> > > result.  In a long function body, this risks introducing a new bug.
> > > Consider the following:
> > >
> > >     g(Stuff) ->
> > >        ...
> > >        Thing = case ... of
> > >                    {a, T} -> T;
> > >                    _ -> 0
> > >                end,
> > >        ...
> > >        {ok, [Thing|Stuff]}.
> > >
> > > Here, `T` is a new variable, clearly intended as just a temporary and
> > > local variable for extracting the second element of the tuple.  But
> > > suppose that someone adds a binding of the name `T` further up in the
> > > function body, without noticing that the name is already in use:
> > >
> > >     g(Stuff) ->
> > >        ...
> > >        T = q(Stuff) + 1,
> > >        io:format("~p", [p(T)]),
> > >        ...
> > >        Thing = case ... of
> > >                    {a, T} -> T;
> > >                    _ -> 0
> > >                end,
> > >        ...
> > >        {ok, [Thing|Stuff]}.
> > >
> > > Now the first clause of the case switch will only match if the second
> > > element of the tuple has the exact same value as the previously
> > > defined `T`.  Again, the compiler quietly accepts this change, while
> > > if it had been instructed to warn about all non-annotated uses of
> > > already bound variables in patterns, this mistake would have been
> > > detected.
> > >
> > >
> > > Shadowing in Funs and Comprehensions
> > > ------------------------------------
> > >
> > > In funs and comprehensions, pinning also lets us do things that
> > > otherwise requires additional temporary variables.  Consider the
> > > following code:
> > >
> > >     f(X, Y) ->
> > >         F = fun ({a, Y}) -> {ok, Y};
> > >                 (_) -> error
> > >             end,
> > >         F(X).
> > >
> > > Since the `Y` in the clause head of the fun is a new shadowing
> > > instance, the pattern will match any value in that position.  To match
> > > only the value passed as `Y` to `f`, a clause guard must be added, and
> > > a temporary variable be used to access the outer `Y`:
> > >
> > >     f(X, Y) ->
> > >         OuterY = Y,
> > >         F = fun ({a, Y}) when Y =:= OuterY -> {ok, Y};
> > >                 (_) -> error
> > >             end,
> > >         F(X).
> > >
> > > We could instead rename the inner use of `Y` to avoid shadowing, but
> > > the equality test must still be written as an explicit guard:
> > >
> > >     f(X, Y) ->
> > >         F = fun ({a, Z}) when Z =:= Y -> {ok, Y};
> > >                 (_) -> error
> > >             end,
> > >         F(X).
> > >
> > > With the help of the pinning operator, such things are no longer a
> > > concern, and we can simply write:
> > >
> > >     f(X, Y) ->
> > >         F = fun ({a, ^Y}) -> {ok, Y};
> > >                 (_) -> error
> > >             end,
> > >         F(X).
> > >
> > > Furthermore, in the odd case that the pattern would both need to
> > > access the surrounding definition of `Y` as well as introduce a new
> > > shadowing binding, this can be easily written using pinning:
> > >
> > >     f(X, Y) ->
> > >         F = fun ({a, ^Y, Y})  -> {ok, Y};
> > >                 (_) -> error
> > >             end,
> > >         F(X).
> > >
> > > but in current Erlang, two separate temporary variables would be
> > > required:
> > >
> > >     f(X, Y) ->
> > >         OuterY = Y,
> > >         F = fun ({a, Temp, Y}) when Temp =:= OuterY -> {ok, Y};
> > >                 (_) -> error
> > >             end,
> > >         F(X).
> > >
> > > As explained before, the same goes for patterns in generators of
> > > comprehensions.
> > >
> > >
> > >
> > > Backwards Compatibility
> > > =======================
> > >
> > > The addition of a new and previously unused operator `^` does not
> > > affect the meaning of existing code, and the compiler will not emit
> > > any new warnings or errors for existing code, unless explicitly
> > > enabled with `warn_unpinned_vars`.  This change is therefore fully
> > > backwards compatible.
> > >
> > >
> > >
> > > Implementation
> > > ==============
> > >
> > > The implementation can be found in [PR #2951][pr].
> > >
> > >
> > >
> > > Copyright
> > > =========
> > >
> > > This document has been placed in the public domain.
> > >
> > >
> > > [Elixir doc]:
> > https://elixir-lang.org/getting-started/pattern-matching.html#the-pin-operator
> > >     "Elixir pattern matching - the pin operator"
> > >
> > > [pr]: https://github.com/erlang/otp/pull/2951
> > >     "#2951: Add a new operator ^ for pinning of pattern variables"
> > >
> > >
> > >
> > > [EmacsVar]: <> "Local Variables:"
> > > [EmacsVar]: <> "mode: indented-text"
> > > [EmacsVar]: <> "indent-tabs-mode: nil"
> > > [EmacsVar]: <> "sentence-end-double-space: t"
> > > [EmacsVar]: <> "fill-column: 70"
> > > [EmacsVar]: <> "coding: utf-8"
> > > [EmacsVar]: <> "End:"
> > > [VimVar]: <> " vim: set fileencoding=utf-8 expandtab shiftwidth=4
> > softtabstop=4: "
> >
> >
> > --
> >
> > / Raimo Niskanen, Erlang/OTP, Ericsson AB
> >

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB


More information about the erlang-questions mailing list