[erlang-questions] Why doesn't Erlang has return statement?

Wed Dec 17 05:19:30 CET 2014

On 17/12/2014, at 12:00 pm, Mark Nijhof <mark.nijhof@REDACTED> wrote:

> When I first started using Erlang I was missing the return statement as well, in other languages return gave me an easy way to exit a function as early as possible,

A ‘return’ statement is highly unusual in a functional language.
There is no ‘return’ in ML or O’Caml or F# or Haskell or …

Scheme does have (and SML/NJ has copied) something not entirely
unrelated, namely continuations.

  (call-with-current-continuation (lambda (k)
    . . .))

binds k to a function which when called exits
call-with-current-continuation, returning the value it was given.
What’s really interesting in Scheme is that this is a first-
class value and can be stored in global variables and data
structures and can be invoked many times.  So you can ‘return’
from the same invocation any number of times.

Examining >500,000 lines of Scheme code I happened to find lying
around, I discovered 91 mentions of call-with-current-continuation
or call/cc.  Most of these mentions are benchmarks, test cases,
compiler or preprocessor code dealing with it rather than using it.

Let’s look at one serious use of it.
The context is Reduced Ordered Binary Decision Diagrams,
represented as
    #t
    #f
    (<var> <robdd> <robdd>)

;; (true-model R) => `(true-model ,@REDACTED) for
;; some model M making R true if any exist
;; or ‘no-true-model if there are no such models.

(define (true-model R)
  (call-with-current-continuation (lambda (K)
    (let loop ((R R) (L '()))
      (if (boolean? R)
          (if R (K (cons 'true-model (reverse L))))
          (begin (loop (cadr R)  (cons (cons (car R) #t) L))
                 (loop (caddr R) (cons (cons (car R) #f) L)))))
    'no-true-model)))

What might not be clear here is that (let loop . . .)
defines and calls a recursive function; it is commonly used
for (tail recursive) loops but is not limited to that.

Let’s try doing this in Erlang, using

    true
    false
    {Var,Robdd_T,Robdd_F}

true_model(R) -> true_model(R, []).

true_model(true,    M) -> {true_model,M};
true_model(false,   M) -> no_true_model;
true_model({V,T,F}, M) ->
    case true_model(T, [{V,true}|M])
      of no_true_model -> true_model(F, [{V,false}|M])
       ; Success       -> Success
    end.

That wasn’t so hard.  Note that the Scheme code here is
doing something you can’t do in C, C++, Java, C# &c.
It’s returning from the *outer* function from inside the
*inner* function.

In fact, every use of call-with-current-contination
I can find in that body of Scheme code is either a test
case or a lexically scoped early exit from a recursive
traversal of some data structure, and there aren’t many
of those.

Here’s another real use, this time from a library for
handling (parsed) XML.

(define (first-descendant p Element)
  (if (not (procedure? p)) (set! p (is-named? p)))
  (call-with-current-continuation (lambda (k)
    (define (g List)
      (if (pair? List) (begin (f (car List)) (g (cdr List)))))
    (define (f Item)
      (if (p Item) (k Item))
      (g (element-children Item)))
    (f Element)
    #f)))

Given a predicate (or an element name) p, and an XML element,
this is supposed to return the first descendent in document
order satisfying p (or being an element named p).  If there
is none, it returns false.

Let’s use {Tag,Attributes,Children} for an Element,
          {pcdata,Text} for parsed character data,
and ignore everything else that can happen in XML.

And let’s fix the somewhat confusing “find first descendant
that is an element with a given name” merger with “find
first descendant satisfying a predicate”.

%— elsewhere in library

is_named_element(T, {T,_,_}) -> true;
is_named_element(_, _      ) -> false.

is_named_element(T) ->
    fun (X) -> is_named_element(T, X) end.

children({_,_,Children}) -> Children;
children({pcdata,_})     -> [].

%- unimportant glue code

first_descendant_named(T, XML) ->
    first_descendant(is_named(T), XML).

%- code under discussion

%% first_descendant(P, XML) finds the first descendant of XML
%% that satisfies P, or false if nothing does.  It might be
%% XML itself.  Note the early exit in the auxiliary function.

first_descendant(P, XML) ->
    case P(XML)
      of true  -> XML
       ; false -> first_descendant_kids(P, children(XML))
    end.

first_descendant_kids(_, []) ->
    false;
first_descendant_kids(P, [Kid|Kids]) ->
    case first_descendant(P, Kid)
      of false -> first_descendant_kids(P, Kids)
       ; Found -> Found
    end.

It’s the same pattern, with “return” being used to effect a
long-distance BUT VISIBLY BOUNDED control transfer.

Let’s try to simplify this using catch and throw.

first_descendant(P, XML) ->
    try
        first_descendant_item(P, XML)
    catch
        throw:Found -> Found
    end.

first_descendant_item(P, XML) ->
    case P(XML)
      of true  -> throw(XML)
       ; false -> first_descendant_kids(P, children(Kids))
    end.

first_descendant_kids(_, []) ->
    false;
first_descendant_kids(P, [Kid|Kids]) ->
    first_descendant_item(P, Kid)
    first_descendant_kids(P, Kids).

We could fiddle with the indentation of try-catch-end,
but there’s no reduction in code size here.

If it were a matter of searching for a matching *child*, a return statement
would be useful in C, because we’d use a while statement to search a list.
But for a matching *descendant*, we can’t use a plain ‘while’ loop.

Now, it so happens that I have a C implementation of the
“Document Value Model”.

xml find_first_descendant(bool (*p)(xml), xml e) {
    for_each_descendant(e, d)
        if (p(d)) return d;
    end_each_descendant
    return (xml)0;
}

This makes it look very much as if a C-style “return”
would be useful in cases like this.  Well, yes, but it
took *this* code to make it work:

#define for_each_descendant(e, d) {                     \
    xml const _x_a = (e);                               \
    struct XML_Stack *_x_s = 0;                         \
    x_desc_push(&_x_s, &_x_a, 1);                       \
    while (_x_s != 0) {                                 \
        xml const _x_c = x_desc_next(&_x_s);            \
        xml const d = _x_c;

#define end_each_descendant                             \
        if (is_element(_x_c) && x_size(_x_c) != 0)      \
            x_desc_push(&_x_s, &_x_c->u.children[1], x_size(_x_c));     \
    }                                                   \
    x_desc_free(&_x_s);                                 \
}

and the sad result is a storage leak.  So you have to write

xml find_first_descendant(bool (*p)(xml), xml e) {
    for_each_descendant(e, d)
        if (p(d)) descendant_found(d); /* free the hidden stack */
    end_each_descendant
    return (xml)0;
}

Using this style instead of a straightforward recursion comes with
a 70% performance penalty on a modern Mac.

Returning to Erlang, can we do something cleaner?

There are four conditions.
Is the predicate satisfied or not?              (find_first_descendant/2)
Is the XML item an element or not?              (children/1)
Have we reached the end of the children or not? (find_descendant_kids/2)
Should we exit early or not?                    (find_descendant_kids/2)

Of these, only the last one is elided by the catch+throw or call/cc
approach, and in the case of catch+throw it is replaced by “DID we exit
early or not?”, which is no real improvement.

I do agree that “check head, if that failed try tail” is not as pretty
as “check head, check tail”.

-define(OR_ELSE(E1,E2), case E1 of false -> E2 ; _X -> X end).

first_descendant_kids(_, []) ->
    false;
first_descendant_kids(P, [Kid|Kids]) ->
    ?OR_ELSE(first_descendant(P, Kid),
             first_descendant_kids(P, Kids)).

might be thought to clarify this, or it might not.
Personally, I think the original code, with the comment pointing out
the early exit, is good enough.