Lazy lists - an implementation.

Wed Oct 18 14:00:36 CEST 2000

On Wed, 18 Oct 2000, Robert Virding wrote:

> I think that doing it as you have done making lazy lists as
> 
> 	LazyList :: () -> [] | [term() | LazyList]
> 
> makes things much simpler and more consistent.
> 
> However you still have the major problem that it is not integrated
> with the rest of the whole system (Erlang, OTP and apps).  This
> restricts what you can do with it, as the rest of the system sees this
> as funs not lists.

Yes, it is not a list. Should not even try to be a list. (I just
implemented them using cons cells because I liked the notation, but that
is trivial to change.)

I see all this as a question of interfaces between functions. As I just
wrote in reply to Seans question, a lazy data structure allows you to
attach computations without passing an explicit callback hook. In some
applications and programming styles, this might be precisely what you
want. In others not. I think that functions now operating on normal lists
should *not* be augmented to also handle lazy lists, simply because it
mucks up their otherwise simple and straightforward interface using normal
lists. If one wants lazy data structures, one should have completely
disjoint functions for operating on them. The programmer should be able to
know what might and might not happen.

I am actually tiring of the term "lazy lists", so I'm now switching to
"streams", and I hope that no-one will confuse this with Erlangs I/O
streams (which are a process communication protocol).

> As Sean Hinde points out if you already have the big list then why
> bother?

Only if you want to use an existing list with a function that wants a
stream as input.

> Unfortunately it would not directly help Ulf Wiger either as the large
> lists are generated by calls which generate "proper" lists. To use it
> you would have to write new interface functions to generate the lazy
> lists and force people to use the lazy libraries.  This would still
> mean you would have to go through code and modify it, as you would
> using th existing "lazy" interface to ETS tables.

Yes. But I think that the sort of "damage control" that Ulf wanted will in
the end make the code deteriorate into something unreadable, bug-prone and
unpredictable. Better rewrite the code once and for all into something
that can be read and understood, if you decide that you want to use this
sort of technique.

> Most importantly it can't be used in patterns!!!

Well, do you want to? If you want to look at the first N elements of a
stream, you just extract them and then do the matching. There is an
infinite number of possible abstract data types that you cannot do
matching on. Give us O'Keefes abstract patterns, and the whole problem
might go away.

> As it can't be used interchangably with lists, or even list patterns,
> wouldn't it be better to drop the list terminology and instead call
> them something else, lazy sequences?  This would help to lessen
> confusion.

I agree. I'll stick to "streams" from now on.

> I wouldn't even construct them with lists, use a tagged tuple instead.  
> I know it takes more space but I am only looking a one element at a
> time so it isn't really important.  This would help to remove a lot of
> errors where people inadvertantly used "normal" list operations on
> them which could generate very random errors.

Yes, you may well be right. (Me, I don't make errors like that. :-).

> And no way that I or anyone else would prefix their list procesing
> functions with a special clause!  :-)
> 
> func(LazyList, ...) when function(LazyList) -> func(LazyList(), ...);

No, as I said, I do absolutely not want to augment code handling normal
lists with extra cases for continuations. That way lies madness!

> In short if you want to implement a "lazy" sequence be extremely
> explicit and not try to make look like lists, which they are not and
> don't behave as.

OK, boss.

> Anyway without rewriting the result of an evaluation you can create an
> aweful amount of extra work.  Also you could have fun code like
> 
>     L1 = eval(LazyList),
>     L2 = eval(LazyList),
> 
> and L1 /= L2.

Yes, this is the major snag, so to speak. One would like the advantages of
computing things at most once, without the disadvantages of having to
explicitly pass around the "updated stream" to anyone who might want to
read it again from the same point.

As you also said in your "philosophical questions" letter, this laziness
stuff is already handled well by process communication (as e.g., I/O
streams), which also makes it clear who computes what and when. "Lazy
list" streams should only be used if they offer something that makes the
task at hand a lot easier, and you are prepared for the effects of lazy
evaluation.

The advantages of "lazy list" streams over message passing are then 1)
locality (no copying of data between processes) - but this might go away
*if* we get an Erlang implementation with a global heap, and 2) the simple
but powerful interface, allowing you to modify a stream as you like, but
also postponing the computation to another time (and possibly another
process). Actually, I can't say I have a huge use for them myself, but I
like to explore the idea. Real uses may crop up later.

	/Richard

Richard Carlsson (richardc@REDACTED)   (This space intentionally left blank.)
E-mail: Richard.Carlsson@REDACTED	WWW: http://www.csd.uu.se/~richardc/