Tail recursion modulo cons

Richard Carlsson richardc@REDACTED
Thu Nov 21 09:52:51 CET 2002


On Wed, 20 Nov 2002, James Hague wrote:

> This is a compiler optimization that effectively makes functions like the
> following tail recursive:
> 
> map(F, [H|T]) ->
>     [F(H)|map(F, T)];
> map(_, []) -> [].
> 
> [...]
> Has anyone ever looked into this for Erlang?

The actual rewriting is not that difficult; in Erlang, it would be
something like this (assuming we had a 'set_tail' operation):

map(F, [H|T]) ->
    Cons = [F(H) | []],
    map(F, T, Cons);
map(_, []) -> [].

map(F, [H|T], Cons) ->
    NewCons = [F(H) | []],
    set_tail(Cons, NewCons),
    map(F, T, NewCons);
map(_, [], Cons) ->
    set_tail(Cons, []).

(Note that the tails of allocated cons cells must be initialised,
e.g. to '[]', so that if garbage collection happens during execution,
it will not find random bits in the tail. Thus, we get a penalty of an
extra write for each cons cell.)

The big obstacle for making this work is that the abstract machine and
garbage collector must be extended to handle write barriers. Right now,
it is necessary that there are never any pointers from an older
generation to a younger one. Now suppose a GC happens during execution
of the above 'map'. Then, a pre-allocated cons cell could be moved to
an older generation. When execution continues, it will allocate a new
cons cell on the youngest generation, and insert a pointer from the
old cell to the new one. The current garbage collector will not realise
that the new cell is live, when the next collection happens. Disaster
follows.

The quick summary is that it's not just a code transformation, but
also requires support in the runtime environment. (It might happen
some day, though, since there are several reasons for wanting to
have write barriers.)

	/Richard


Richard Carlsson (richardc@REDACTED)   (This space intentionally left blank.)
E-mail: Richard.Carlsson@REDACTED	WWW: http://user.it.uu.se/~richardc/




More information about the erlang-questions mailing list