New EEP draft: Pinning operator ^ in patterns

Wed Jan 20 15:42:37 CET 2021

I have vague feeling that this has been asked,
but since I can not find it:

How is nested fun()s handled?

foo(Y) ->
    F = fun (X) ->
	    Y = X + ^Y,
	    FF = fun (Z) ->
	             Z + ^Y
	         end,
	    FF(Y)
        end,
    F(Y).

Does the innermost Z + ^Y access the outermost Y from foo(Y), or the Y
bound in F/1 i.e Y = X + ^Y?

Is there a way to choose which of the outer Y:s to refer to from within FF/1?

Cheers
/ Raimo

On Thu, Dec 24, 2020 at 09:10:17PM +0100, Richard Carlsson wrote:
> The ^ operator allows you to annotate already-bound pattern variables as
> ^X, like in Elixir. This is less error prone when code is being refactored
> and moved around so that variables previously new in a pattern may become
> bound, or vice versa, and makes it easier for the reader to see the intent
> of the code.
> 
> See also https://github.com/erlang/otp/pull/2951
> 
> Ho ho ho,
> 
>         /Richard & the good folks at WhatsApp

>     Author: Richard carlsson <carlsson.richard(at)gmail(dot)com>
>     Status: Draft
>     Type: Standards Track
>     Created: 21-Dec-2020
>     Erlang-Version: 24
>     Post-History: 24-Dec-2020
> ****
> EEP XXX: Pinning operator ^ in patterns
> ----
> 
> 
> Abstract
> ========
> 
> This EEP proposes the addition of a new unary operator `^` for
> explicitly marking variables in patterns as being already bound.  This
> is known as "pinning" in Elixir - see [Elixir doc][the Elixir
> documentation].
> 
> For example:
> 
>     f(X, Y) ->
>         case X of
>             {a, Y} -> ok;
>             _ -> error
>         end.
> 
> could be written more explicitly:
> 
>     f(X, Y) ->
>         case X of
>             {a, ^Y} -> ok;
>             _ -> error
>         end.
> 
> In Elixir, this operator is strictly necessary for being able to refer
> to the value of a bound variable as part of a pattern, because
> variables in patterns are always regarded as being new shadowing
> instances (like in Erlang's fun clause heads), unless explicitly
> pinned.
> 
> In Erlang, they would be optional, but are still a good idea because
> they make programs more robust under edits and refactorings, and
> furthermore allow the use of pinned variables in fun clause heads and
> in comprehension generator patterns.
> 
> 
> Specification
> =============
> 
> A new unary operator `^` is added to Erlang, called the "pinning
> operator".  It may only be used in patterns, and only on variables.
> Its meaning is that the "pinned" variable is to be interpreted in the
> enclosing environment of the pattern, and its value used in its place
> for that position in the pattern.
> 
> In current Erlang, this behaviour is what happens automatically in
> ordinary matching constructs if the variable is already bound in the
> enclosing environment.  In the following example:
> 
>     f(X, Y) ->
>         case X of
>             {a, Y} -> {ok, Y};
>             _ -> error
>         end.
> 
> the use of `Y` in the pattern is regarded as a reference to the
> function parameter `Y`, instead of as introducing a new variable, and
> the `Y` in the clause body is then that same parameter.  Therefore,
> annotating the pattern variable as `^Y` in this case does not change
> the behaviour of the program, but makes the intent explicit:
> 
>     f(X, Y) ->
>         case X of
>             {a, ^Y} -> {ok, Y};
>             _ -> error
>         end.
> 
> For fun expressions and list comprehension generator patterns, the
> pinning operator makes the language more expressive.  Take the
> following Erlang code:
> 
>     f(X, Y) ->
>         F = fun ({a, Y}) -> {ok, Y};
>                 (_) -> error
>             end,
>         F(X).
> 
> Here, the occurrence of `Y` in the clause head of the fun `F` is a new
> variable instance, shadowing the `Y` parameter of `f(X, Y)`, and the
> fun clause will match any value in that position.  The `Y` in the
> clause body is the one bound in the clause head.  However, using the
> pinning operator, we can selectively match on variables bound in the
> outer scope:
> 
>     f(X, Y) ->
>         F = fun ({a, ^Y})  -> {ok, Y};
>                 (_) -> error
>             end,
>         F(X).
> 
> In this case, there is no new binding of `Y`, and the use of `Y` in
> the fun clause body refers to the function parameter.  But it is also
> possible to combine pinning and shadowing in the same pattern:
> 
>     f(X, Y) ->
>         F = fun ({a, ^Y, Y})  -> {ok, Y};
>                 (_) -> error
>             end,
>         F(X).
> 
> In this case, the pinned field refers to the value of the function
> function parameter, but there is also a new shadowing binding of `Y`
> to the third field of the tuple.  The use in the fun clause body now
> refers to the shadowing instance.
> 
> Generator patterns in list comprehensions or binary comprehensions
> follow the same rules as fun clause heads, so with pinning we can for
> example write the following code:
> 
>     f(X, Y) ->
>         [{b, Y} || {a, ^Y, Y} <- X].
> 
> where the `Y` in `{b, Y}` is the shadowing instance bound to the third
> element of the pattern tuple.
> 
> Finally, a new compiler flag `warn_unpinned_vars` is added, disabled
> by default, which if enabled makes the compiler emit warnings about
> all uses of already bound variables in patterns that are not
> explicitly annotated with the `^` operator.  This allows users to
> migrate their code module by module towards using explicit pinning in
> all their code.  If pinning becomes the norm in Erlang, this flag
> could be turned on by default, and eventually, the pinning operator
> could become strictly required for referring to already bound
> variables in patterns.
> 
> 
> Rationale
> =========
> 
> The explicit pinning of variables in patterns make programs more
> readable, because the intent of the code becomes clear.  When already
> bound variables are used in Erlang without any annotation, anyone
> reading a piece of code must first study it closely to understand
> which variables will be bound at the point of a pattern, before they
> can tell whether any pattern variable is a new binding or implies an
> equality assertion.  This is easy to miss even for experienced
> Erlangers, be it during code reviews or while trying to understand a
> piece of poorly commented code.
> 
> Perhaps more importantly, pinning also makes programs more robust
> under edits and refactorings.  Take our previous example, and add a
> print statement:
> 
>     f(X, Y) ->
>         io:format("checking: ~p", [Y]),
>         case X of
>             {a, Y} -> {ok, Y};
>             _ -> error
>         end.
> 
> Suppose someone renames the function parameter from `Y` to `Z` and
> updates the print statement but forgets to update the use in the case
> clause.  Without an explicit pinning annotation, the change would be
> quietly allowed, but the `Y` in the pattern would be interpreted as a
> new variable that will match any value, which will then be used in the
> body.  This changes the behaviour of the program.  If the use in the
> pattern had been annotated as `^Y`, the compiler would have generated
> an error "Y is unbound" and the mistake would have been caught.
> 
> When code is being modified to add a feature or fix a bug, a
> programmer might want to introduce a new variable for a temporary
> result.  In a long function body, this risks introducing a new bug.
> Consider the following:
> 
>     g(Stuff) ->
>        ...
>        Thing = case ... of
>                    {a, T} -> T;
>                    _ -> 0
>                end,
>        ...
>        {ok, [Thing|Stuff]}.
> 
> Here, `T` is a new variable, clearly intended as just a temporary and
> local variable for extracting the second element of the tuple.  But
> suppose that someone adds a binding of the name `T` further up in the
> function body, without noticing that the name is already in use:
> 
>     g(Stuff) ->
>        ...
>        T = q(Stuff) + 1,
>        io:format("~p", [p(T)]),
>        ...
>        Thing = case ... of
>                    {a, T} -> T;
>                    _ -> 0
>                end,
>        ...
>        {ok, [Thing|Stuff]}.
> 
> Now the first clause of the case switch will only match if the second
> element of the tuple has the exact same value as the previously
> defined `T`.  Again, the compiler quietly accepts this change, while
> if it had been instructed to warn about all non-annotated uses of
> already bound variables in patterns, this mistake would have been
> detected.
> 
> 
> Shadowing in Funs and Comprehensions
> ------------------------------------
> 
> In funs and comprehensions, pinning also lets us do things that
> otherwise requires additional temporary variables.  Consider the
> following code:
> 
>     f(X, Y) ->
>         F = fun ({a, Y}) -> {ok, Y};
>                 (_) -> error
>             end,
>         F(X).
> 
> Since the `Y` in the clause head of the fun is a new shadowing
> instance, the pattern will match any value in that position.  To match
> only the value passed as `Y` to `f`, a clause guard must be added, and
> a temporary variable be used to access the outer `Y`:
> 
>     f(X, Y) ->
>         OuterY = Y,
>         F = fun ({a, Y}) when Y =:= OuterY -> {ok, Y};
>                 (_) -> error
>             end,
>         F(X).
> 
> We could instead rename the inner use of `Y` to avoid shadowing, but
> the equality test must still be written as an explicit guard:
> 
>     f(X, Y) ->
>         F = fun ({a, Z}) when Z =:= Y -> {ok, Y};
>                 (_) -> error
>             end,
>         F(X).
> 
> With the help of the pinning operator, such things are no longer a
> concern, and we can simply write:
> 
>     f(X, Y) ->
>         F = fun ({a, ^Y}) -> {ok, Y};
>                 (_) -> error
>             end,
>         F(X).
> 
> Furthermore, in the odd case that the pattern would both need to
> access the surrounding definition of `Y` as well as introduce a new
> shadowing binding, this can be easily written using pinning:
> 
>     f(X, Y) ->
>         F = fun ({a, ^Y, Y})  -> {ok, Y};
>                 (_) -> error
>             end,
>         F(X).
> 
> but in current Erlang, two separate temporary variables would be
> required:
> 
>     f(X, Y) ->
>         OuterY = Y,
>         F = fun ({a, Temp, Y}) when Temp =:= OuterY -> {ok, Y};
>                 (_) -> error
>             end,
>         F(X).
> 
> As explained before, the same goes for patterns in generators of
> comprehensions.
> 
> 
> 
> Backwards Compatibility
> =======================
> 
> The addition of a new and previously unused operator `^` does not
> affect the meaning of existing code, and the compiler will not emit
> any new warnings or errors for existing code, unless explicitly
> enabled with `warn_unpinned_vars`.  This change is therefore fully
> backwards compatible.
> 
> 
> 
> Implementation
> ==============
> 
> The implementation can be found in [PR #2951][pr].
> 
> 
> 
> Copyright
> =========
> 
> This document has been placed in the public domain.
> 
> 
> [Elixir doc]: https://elixir-lang.org/getting-started/pattern-matching.html#the-pin-operator
>     "Elixir pattern matching - the pin operator"
> 
> [pr]: https://github.com/erlang/otp/pull/2951
>     "#2951: Add a new operator ^ for pinning of pattern variables"
> 
> 
> 
> [EmacsVar]: <> "Local Variables:"
> [EmacsVar]: <> "mode: indented-text"
> [EmacsVar]: <> "indent-tabs-mode: nil"
> [EmacsVar]: <> "sentence-end-double-space: t"
> [EmacsVar]: <> "fill-column: 70"
> [EmacsVar]: <> "coding: utf-8"
> [EmacsVar]: <> "End:"
> [VimVar]: <> " vim: set fileencoding=utf-8 expandtab shiftwidth=4 softtabstop=4: "

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB