Optimized lists

Thomas Lindgren thomasl_erlang@REDACTED
Thu Mar 17 15:34:28 CET 2005


--- Mats Cronqvist <mats.cronqvist@REDACTED>
wrote:

> > (3) There are plenty of list-processing functions
> which, to my understanding,
> >     don't involve tail recursion at the moment,
> but with a slightly different
> >     abstract machine *could* be tail recursions. 
> Here's one:
> > 
> > 	append([X|Xs], L) -> [X | append(Xs, L)];
> > 	append([], L) -> L.
> > 
> >     Making this and similar functions tail
> recursive would be very good
> >     news for memory use (using O(1) stack space
> instead of O(n)).  Such code
> >     is _very_ common, much more common than
> length/1 is.  Switching to tail
> >     recursion here would be IMPOSSIBLE if each
> list cell carried its length.
> 
> 
>    what changes to the emulator would that be?

I can only chime in with Richard: getting tail
recursion can be very advantageous, particularly in
native code. 

Consider: each stack frame of append requires at least
2 words in itself (return address + saved X), which is
4 extra words of memory traffic per list element
(store, load). 

Not to mention the tail recursive version requiring
O(1) stack space rather than O(n).

What is needed: a garbage collector that handles
pointers from old to new data (there are plenty of
existing ones that do), and a compiler that rewrites
code to use tail recursion. Prolog might serve as
inspiration.

A source level example:

append([X|Xs], L) -> [X | append(Xs, L)];
append([], L) -> L.

would become something like this (or perhaps something
a bit nicer):

% wrapper

append([], L) -> L;
append([X|Xs], L) -> 
   Lst = [X|empty],
   app(Xs, L, Lst),
   Lst.

% inner loop, destructive update of tail of Lst

app([X|Xs], L, Out_tail) -> 
   New_out_tail = [X|empty], 
   setcdr(Out_tail, New_out_tail),
   app(Xs, L, New_out_tail);
app([], L, Lst) -> 
   setcdr(Lst, L).

That is, the last tail of the output list Lst is
destructively updated in each iteration. The setcdr/2
operation can be implemented as a "trailed store", a
not-taken test and a store in the usual case.

In this case it's possible to speed this up even more,
since writing the 'empty' in the tail of the list cell
is only done for the sake of the GC :-) So if the GC
can handle it, you might leave the tail of the cell
uninitialized instead.

(I guess this serves as testimony of a youth ill-spent
with optimizing Prolog :-)

The equivalent technique can be used for tuples as
well.

Best,
Thomas



		
__________________________________ 
Do you Yahoo!? 
Yahoo! Small Business - Try our new resources site!
http://smallbusiness.yahoo.com/resources/ 



More information about the erlang-questions mailing list