[erlang-questions] variable exported from case in OTP 18

Thu Oct 1 10:00:38 CEST 2015

On Thursday 01 October 2015 19:28:23 you wrote:
> 
> On 30/09/2015, at 12:42 am, zxq9 <zxq9@REDACTED> wrote:
> > 
> > The reason this has been made into a warning (and many tools compile with warnings-as-errors set) is that it is possible to not assign a variable you access outside the case in every branch of it and still get a clean compile:
> > 
> > foo() ->
> >    case bar() of
> >        {bing, Spam} -> eat(Spam);
> >        {bong, Eggs} -> eat(Eggs)
> >    end,
> >    puke(Spam).
> 
> No, that one's an outright error because Spam is not defined
> in *all* branches.

I didn't realize this actually produces a compile error, but indeed it does.

casetest.erl:9: variable 'Spam' unsafe in 'case' (line 5)

> It's really misleading to even talk about 'exporting' variables.
> Before there were funs or list comprehensions, the model was
> incredibly simple:
> 
> (a) the scope of a variable is the ENTIRE clause it appears in
> (b) at any use of a variable, every path from the entry to that
>     use must bind the variable once and only once.
> 
> (Yuck.  I just realised that (a) is an uncomfortably close
> parallel to JavaScript.)

This is the general sentiment that seemed to lead to this becoming a warning.

The example you presented earlier, though, illustrated the anti-case:

    {X,Y} = case foo(...)
              of {bar,X1,Y1} -> ..., {X1,Y1}
               ; {ugh,Y2,X2} -> ..., {X2,Y2}
            end,
    use(X, Y)

Clearly this is awful compared to just referencing the variables directly.

But... I almost never see the above in actual code. Usually something like:

foo({bar,X,Y}) -> bar_related_thing({X,Y});
foo({ugh,X,Y}) -> ugh_related_thing({X,Y}).

Almost invariably with some other state variable coming along for the ride.

Considering this more carefully now, I don't think either the aesthetics of case statements or whether "exporting" is confusing is an important issue. This rarely seems to come up in actual code. I can't think of a place similar to the example above where I wouldn't want a separate function instead of a case. That's probably why I (most folks?) have happily plugged away all this time without ever really noticing this.

> I'm reminded of programming languages like IMP and Ada where
> you are not allowed to write
>    p() and q() or r()
> because as a programmer you are presumed to be too dumb to
> work effectively with the concept of operator precedence as
> applied to Boolean operators.

Hey! I actually kind of like Ada. :-)

There is a balance between providing flexibility and providing constructs that almost encourage programmers to silently drop little landmines in their code. I had assumed (wrongly) that a case is supposed to be treated as its own semantic scope. I had always thought that was why the warnings were there, and previous discussions here tended to make me think I wasn't alone in expecting it to work that way. That assumption made me feel that referencing variables bound in a case is hackish and dirty -- something that might even break someday if the rule changed.

But it is not a particularly confusing idea: Just one scope.

I didn't recall the line in the docs that says, explicitly:

"The scope for a variable is its function clause. Variables bound in a branch of an if, case, or receive expression must be bound in all branches to have a value outside the expression. Otherwise they are regarded as 'unsafe' outside the expression."

It makes me feel sad, all the same. I like limited scope and knowing for sure that it is limited. OTOH, I don't like nested code. Forcing me to accept the return value of a case statement to get anything "out" of it reduces the impulse to pack functions full of cases instead of write separate functions. I suppose that's a language design decision anyway, and it is a decision that has already been made -- just not in the way I would have expected.
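
As a small illustration of the rule quoted from the docs, here is a sketch (the module and function names are invented for the example) in which a variable is bound in every branch of an if and is therefore safe to use after it:

```erlang
-module(scope_demo).
-export([classify/1]).

%% Hypothetical example: Sign is bound in every branch of the 'if',
%% so it has a value outside the expression.
classify(N) ->
    if
        N < 0  -> Sign = negative;
        N == 0 -> Sign = zero;
        N > 0  -> Sign = positive
    end,
    %% The function clause is the scope, so Sign is visible here.
    {N, Sign}.
```

Drop the binding from any one branch and the later use of Sign is rejected as unsafe, the same kind of error as the casetest.erl one above.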

> > and list comprehensions being another (it really is a separate scope, but that's not how some other languages work) this has been made into a warning.
> 
> I'm tolerably familiar with Clean, Haskell, F#, and Erlang.
> They all make list comprehensions a scope.
> The only declarative-family language I can think of where there
> is an analogue of list comprehension that isn't a scope is Prolog
> (setof/3 and bagof/3), and that works because Prolog is perfectly
> comfortable with variables that might or might not be bound.
> So which "other languages" make list comprehension not a scope?

Not declarative ones; I mean imperative languages that include this or that generator/comprehension feature and are familiar with the cool kids.

Python (2.x, at least), for example:

>>> [x for x in [1,2,3]]
[1, 2, 3]
>>> x
3

I don't think this is addressed in any PEPs, recommended style notes or anything like that. I *do* recall having seen it used more than once to recall the last value of a dynamically generated list after some other processing occurred. (I think in the belly of an XML template language... Cheetah? Django templates? Genshi? Something like this.)

There is (was?) a way to make something similar happen in Ruby as well, but that language is a big pile of curious decisions anyway. I can't seem to find the syntax for it now.

In the JavaScript proposal for list comprehensions the situation is even weirder:

y = [for (z of [1,2,3]) x = z];
x;

/*
y = [1,2,3]
x = 3
*/

Both x and y are accessible now, but z is not. Not that JavaScript or Ruby (or some aspects of Python) are great examples of clean language design, but this is what I meant by "how some other languages work": people coming from them might expect variables to leak out of list comprehensions, and (in light of the compiler warning) expect the opposite of case statements.

Comprehensions in these sorts of languages are more like wacky syntax over for loops with a few opportunities (apparently) for optimization. (Some operations in list comprehensions in Python are much faster than in an equivalent for loop.) I imagine people expect them to be similar in Erlang, especially considering that using unassigned list comprehensions as a shorthand for lists:foreach/2 specifically to get side-effects is now actually supported as an optimization.

...not that expectations born of JavaScript experience are things worth living up to.
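
For contrast with the Python and JavaScript snippets, a minimal Erlang sketch (module and function names are made up for illustration) showing that a generator variable stays inside the comprehension, plus the side-effect-only form mentioned above:

```erlang
-module(lc_scope).
-export([squares/0, print_all/1]).

%% N is local to the comprehension and does not leak out of it.
squares() ->
    Squares = [N * N || N <- [1, 2, 3]],
    %% Uncommenting the next line should fail to compile, because
    %% N is unbound outside the comprehension:
    %% io:format("~p~n", [N]),
    Squares.

%% Comprehension used only for its side effects, as a shorthand for
%% lists:foreach/2. The resulting list is simply discarded.
print_all(Items) ->
    [io:format("~p~n", [Item]) || Item <- Items],
    ok.
```

So the comprehension is the one place where Erlang does give you a separate scope, unlike case, if, and receive.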

> That's a quite different issue.  The issue we're talking about
> here is where a variable is unambiguously defined in EVERY branch
> of a branching construct yet the compiler whines when you try to use it.

After this discussion I feel like the warning should be removed entirely. "Just a compiler warning" has always struck me as an uncomfortably vague category of "technically right, but we really don't like things that way" that makes a programmer feel guilty about valid code (not to mention sparking discussions like this one several times a year -- though I do enjoy being proven wrong/learning details here I would probably never have stumbled on writing code by myself).
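
To make the complaint concrete, here is a sketch of the two spellings under discussion. It assumes the warning is the one enabled by the warn_export_vars compiler option (for instance erlc +warn_export_vars); the module and function names are invented for the example.

```erlang
-module(export_demo).
-export([direct/1, rebound/1]).

%% Value is bound in every branch, so using it after 'end' is legal
%% Erlang. With warn_export_vars enabled, the compiler is expected to
%% warn that Value is exported from the case.
direct(Msg) ->
    case Msg of
        {bar, Value} -> ok;
        {ugh, Value} -> ok
    end,
    Value.

%% Same behaviour, written so that nothing is "exported": the value
%% of the case expression itself is matched instead.
rebound(Msg) ->
    Value = case Msg of
                {bar, V} -> V;
                {ugh, V} -> V
            end,
    Value.
```

Both functions return the same thing for the same input; the only difference is which one the compiler complains about when the option is on.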

-Craig