[erlang-questions] case expression scope

Wed Mar 5 03:52:22 CET 2014

On 5/03/2014, at 11:41 AM, Szoboszlay Dániel wrote:

> On Tue, 04 Mar 2014 22:49:12 +0100, Richard A. O'Keefe <ok@REDACTED> wrote:
> 
>> No, you are still thinking in C/C++ terms.
>> There are three things in Erlang that are scopes:
>> 	- a function clause is a scope
>> 	- a 'fun' is a scope
>> 	- a list comprehension is a scope
> 
> Personally, I feel that these rules are neither convenient nor intuitive. Mainly due to the last one, about the list comprehensions. I *know* that list comprehensions are implemented with funs, hence the scoping,

Wrong.  Yes, list comprehensions *happen* to be implemented
as calls to out-of-line functions, but the *necessity* for
list comprehensions to be separate scopes falls directly
out of their semantics.

Suppose it were otherwise.

	R = [begin Y = f(X), Y end || X <- L].

What value would X have after the comprehension
 - if L = []?
 - if L = [1]?
 - if L = [1,2]?
What value would Y have?

The key thing is that X and Y get _different_ values
on each iteration, and Erlang variables being immutable,
the only way we can make sense of that is if each
iteration gets its _own_ X and Y.

>  And if you have background in any C-descendant language you would almost certainly feel that cases, ifs, trys, begins etc. shall have their own scopes too.

Surely C counts as a C-descendant language.
And in C, 'switch' is *not* a scope.  Consider:

int goo(int n) {
    switch (n) if (0)
        case 1:  return -27; else if (0)
        case 2:  return 43; else if (0)
        default: return 122;
}

#include <stdio.h>

int main(void) {
    int i;

    for (i = 0; i < 4; i++) printf("%d => %d\n", i, goo(i));
    return 0;
}

Yep.  That's legal (classic C, C89, C99).  I don't have
a copy of the C11 standard, but I'd be pretty surprised
if it wasn't legal C11.

In Classic C and C89, *none* of 'if', 'switch', 'while',
'do', and 'for' introduce new scopes.
In C99, it's clear from 6.8.4 that 'if' and 'switch'
do *NOT* introduce new scopes and it is clear from
6.8.5 that 'while' and 'do' do *NOT* introduce new
scopes.

In Classic C and C89 the *ONLY* construction that
introduces a new scope is '{...}'.
C99 adds
	iteration-statement:
		...
		for ( declaration expr-opt ; expr-opt ) statement
which *is* a new scope, but in C this is very exceptional.
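
To make that one exception concrete, here is a minimal sketch.
The function names sum_twice and outer_untouched are made up for
illustration, and it assumes a C99 or later compiler.  Each 'for'
that carries a declaration gets its own i, and that i neither
survives the loop nor disturbs an outer i.

```c
/* Hypothetical example: each 'for' declares its own i (C99 and later). */
int sum_twice(int n) {
    int total = 0;

    for (int i = 0; i < n; i++) total += i;
    for (int i = 0; i < n; i++) total += i;  /* a different i, not a redeclaration */
    /* No i is visible here; mentioning it would be a compile-time error. */
    return total;
}

/* Hypothetical example: the loop's i hides the outer i only inside the loop. */
int outer_untouched(void) {
    int i = 42;

    for (int i = 0; i < 3; i++) { }  /* inner i counts 0, 1, 2 and then vanishes */
    return i;                        /* the outer i, still 42 */
}
```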

Conversely, if switch cases were scopes,
you'd expect

void goo(int n) {
    switch (n) {
        case 1:
            int x = 1;
            x++;
        case 2:
            int x = 2;
            x++;
        default:
            int x = 3;
            x--;
    }
}

to be legal, but it isn't.
Change this to

void goo(int n) {
    switch (n) {
        case 1:;
            int x = 1;
            x++;
        case 2:;
            int x = 2;
            x++;
        default:;
            int x = 3;
            x--;
    }
}

and it *still* isn't legal, because the whole body of the
switch is a *single* scope and you are not allowed
more than one declaration of x within it.

I would therefore expect a C programmer to *expect* that
occurrences of a variable in two cases would be the same
variable.
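
For contrast, a minimal sketch of what a C programmer has to write
to get a separate x per case: wrap each case in its own '{...}',
since that is the construction that introduces a scope.  The
function name hoo and the result variable are made up for
illustration.

```c
/* Hypothetical example: explicit braces give each case its own scope. */
int hoo(int n) {
    int result;

    switch (n) {
        case 1: {
            int x = 1;   /* this x belongs to the first block only */
            x++;
            result = x;
            break;
        }
        case 2: {
            int x = 2;   /* a different x, legal because of the braces */
            x++;
            result = x;
            break;
        }
        default: {
            int x = 3;
            x--;
            result = x;
            break;
        }
    }
    return result;
}
```

Here hoo(1) yields 2, hoo(2) yields 3, and anything else yields 2.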

It turns out that in Java, too, the entire body of a switch is a
*single* scope, so the example isn't legal Java either.
Looking at the Java Language Specification for Java 7,
chapter 14, we find that 'if' and 'while' and 'do' do
*NOT* introduce new scopes, that 'for' may but need not,
that 'catch' does, but that 'try' does only in the
14.20.3 try-with-resources form.  So in Java, which is
surely a 'C-descendant' language, selection statements
do NOT introduce new scopes and any Java programmer who
thought they did, or who expected selection statements
in some other language to do so on the strength of what
Java does, would be exposed as not understanding Java.

> 
> Some examples where I really miss them:
> 
> foo(Dict) ->
>    Value = case orddict:find(foo, Dict) of
>                {ok, Value} -> Value;
>                error       -> calculate_very_expensive_default_value()
>            end,
>    ...

So don't do that.  If the compiler *did* allow that,
it would be as confusing as all-get-out for suffering
human beings trying to make sense of it.
It is a blessing that abominations like this are blocked.

This example is obviously not real code.
We have already seen in this thread that the *real*
code that inspired it would be hugely improved by
being broken into little functions, whereupon the
problem disappears.

> And we aren't done yet, let's add my favorite, the error-handling mess-up to the mix!
> 
> foo() ->
>    ...
>    V1 = case bar() of
>             {ok, Bar} ->
>                 Bar;
>             {error, Reason} ->
>                 lager:error("SOS ~p", [Reason]),
>                 BarDefault
>         end,
>    V2 = case baz() of
>             {ok, Baz} ->
>                 Baz;
>             {error, Reason} ->
>                 lager:error("SOS ~p", [Reason]),
>                 BazDefault
>         end,
>    ...

Again, this is totally unreal code.
Let me offer an equally unreal alternative:

safe_bar() ->
    case bar()
      of {ok, Bar} ->
             Bar
       ; {error, Reason} ->
             lager:error("SOS ~p", [Reason]),
             BarDefault
    end.

safe_baz() ->
    case baz()
      of {ok, Baz} ->
            Baz
       ; {error, Reason} ->
            lager:error("SOS ~p", [Reason]),
            BazDefault
    end.

foo() ->
    ...
    V1 = safe_bar(),
    V2 = safe_baz(),
    ...

> Of course I know real programmers never put two cases in the same function, but lazy people tend to. And lazy people also tend to use convenient variable names like Reason or Error in error branches.

Real programmers who put multiple cases in the same clause
take care to use disjoint names SO THAT HUMAN BEINGS WILL
NOT BE CONFUSED.

Lazy programmers who over-use names like Value, Error, Reason
deserve to be forced to maintain code written by other lazy
programmers.

> The only benefit I see of *not starting* a new scope within a case is that you can do this:
> 
> foo() ->
>    case bar() of
>        1 -> X = 1;
>        2 -> X = 2;
>        _ -> X = 0
>    end,
>    bar(X).
> 
> Which, in my opinion, is less readable than:
> 
> foo() ->
>    X = case bar() of
>            1 -> 1;
>            2 -> 2;
>            _ -> 0
>         end,
>    bar(X).

You are forgetting that there may be multiple variables
involved.  Packaging them up in tuples just so that you
can unpack them again?  Feh.

The thing is that Erlang is what it is
and is not something else.
Criticising Erlang for not being like something else
(especially when it turns out that the something else
is not in fact like that either)
is silly.

Before 'fun' and list comprehensions were added to Erlang
the rule was simple and uniform: one name, one variable,
everywhere in a function clause.  'fun' and list
comprehensions break that rule because they *have* to,
but even they try not to break it any more than they can help.

> 
> I hope these examples may pass the "this code is too trivial/ugly to even notice your actual problem" filter with better luck than the previous nominee. :)

Your examples are profoundly unreal.
Real examples could well be informative.
So far, all you have shown is that lazy programmers
can screw up and the compiler sometimes notices.
So this is news, already?