[erlang-questions] Idiomatic Erlang, style/performance question

Mon Jun 23 04:16:30 CEST 2008

On 20 Jun 2008, at 8:16 pm, Dominic Williams wrote:

> Hello,
>
> ROK wrote:
>> I prefer the term "Tail-Call Optimisation" because it is more  
>> accurate.
>> The idea (as explained in "Lambda - the Ultimate Goto") is that if  
>> the
>> dynamically last thing a function does (in any language) is to return
>> the result of another function call, then there is no point in  
>> keeping
>> the current stack frame.  You might as well arrange for the new
>> function to return directly to the original caller.  Think of *all*
>> tail calls as "argument binding + GOTO".
>
> Is the optimisation possible even when the tail call is not to the  
> same
> function?

That's precisely why I prefer to call it tail CALL optimisation.
It is not in any way limited to recursion.
>>>

>>> power(N, P) when is_integer(P), P >= 0 ->
>>>        ipower(N, P, 1);
>>>    power(N, P) when is_integer(P), P < 0 ->
>>>        1/ipower(N, -P, 1).
>>
>> The first call here is a tail call, the second is not.
>
> As I wrote in a previous post, I thought that there was no  
> optimisation
> here, because it is not a /recursive/ tail-call. Am I wrong? What is  
> the
> case for Erlang, in particular?

The first clause ends with a call to ipower/3.
The fact that it is not a *recursive* call doesn't matter a whit;
what counts is that there is nothing left to do in power/2 when
ipower/3 finishes.  The dynamically last action in the caller is
the call.  That clause IS subject to tail call optimisation.

The last thing that happens in the second clause is the division.
So that one isn't a tail call.

As for what Erlang actually does, there's one way to tell for sure.
% erlc -"'S'" foobar.erl
% view foobar.S

Given
power(X, P) when is_integer(P), P >= 0 ->
     ipower(X, P, 1);
power(X, P) when is_integer(P), P < 0 ->
     1/ipower(X, -P, 1).

ipower(_, 0, R) -> R;
ipower(X, P, R) ->
     ipower(X, P-1, R*X).

>
you get
{function, power, 2, 2}.
   {label,1}.
     {func_info,{atom,fib},{atom,power},2}.
   {label,2}.
     {test,is_integer,{f,3},[{x,1}]}.
     {test,is_ge,{f,3},[{x,1},{integer,0}]}.
     {move,{integer,1},{x,2}}.
     {'%live',3}.
     {call_only,3,{f,5}}.
%% In the code for clause 1, no stack frame is
%% allocated or deallocated and there is no 'return'.
   {label,3}.
     {test,is_integer,{f,1},[{x,1}]}.
     {test,is_lt,{f,1},[{x,1},{integer,0}]}.
     {allocate,0,2}.
     {gc_bif,'-',{f,0},2,[{x,1}],{x,1}}.
     {move,{integer,1},{x,2}}.
     {'%live',3}.
     {call,3,{f,5}}.
     {test_heap,{alloc,[{words,0},{floats,1}]},1}.
     {fconv,{integer,1},{fr,0}}.
     {fconv,{x,0},{fr,1}}.
     fclearerror.
     {bif,fdiv,{f,0},[{fr,0},{fr,1}],{fr,1}}.
     {fcheckerror,{f,0}}.
     {fmove,{fr,1},{x,0}}.
     {'%live',1}.
     {deallocate,0}.
     return.
%% In the code for clause 2, a stack frame IS
%% allocated, and deallocated, and there is a 'return'.

{function, ipower, 3, 5}.
   {label,4}.
     {func_info,{atom,fib},{atom,ipower},3}.
   {label,5}.
     {test,is_eq_exact,{f,6},[{x,1},{integer,0}]}.
     {move,{x,2},{x,0}}.
     return.
%% This is a simple return from a leaf call.
   {label,6}.
     {gc_bif,'-',{f,0},3,[{x,1},{integer,1}],{x,1}}.
     {gc_bif,'*',{f,0},3,[{x,2},{x,0}],{x,2}}.
     {'%live',3}.
     {call_only,3,{f,5}}.
%% This is a tail call with no frame.

Basically, if the code for a clause doesn't end with 'return',
tail call optimisation happened.