[erlang-questions] generators/iterators

Tue Jun 19 07:05:31 CEST 2007

Hi - thanks for your notes on interfaces for lazy lists. I tend to see 
lists as materialized streams though, rather than the other way around.

Interestingly enough - prior art does in fact use exceptions to indicate 
the end of a stream - Python does just that.

Ugly though it may be, it seems to be a practical and effective way of 
communicating out of band information about a stream - i.e. when it 
terminates.

Iterating over tuples would be handy, as would a standard interface by 
which arbitrary data structures could be iterated over in comprehensions 
(without having to convert them into lists).

> While Erlang functions *can* have side effects, most of them shouldn't.
> So let's design a functional interface here.  See "Structure and
> Interpretation of Computer Programs" for background, but a simple
> approach is
> 	F() => []		at end of list
> 	F() => [H | F']		not at end of list.
> For example, let's try a range of integers:
>
> 	range(L, U) ->
> 	    fun() ->
> 		if L >  U -> []
> 		 ; L =< U -> [L | range(L+1, U)]
> 		end
> 	    end.
>
> Such things have been called 'streams' and 'lazy lists'; the difference
> between what we have here and a real stream or lazy list is that the
> result isn't remembered.  Whenever you call F() the work will be
> repeated.
>
> Suppose we have such a function and want to recover a list from it.
>
> 	f2l(F) -> case F()
> 		    of [] -> []
>                      ;  [H | G] -> [H | f2l(G)]
> 		  end.
>
> Now whenever we want to use such a function as a generator, all we have
> to do is
> 	[Y*Y | Y <- f2l(items(X))]
>
> If it's that easy, why not make the compiler do it?
> Because there are many 'functional generator' interfaces possible.
> You would still have to have something in the expression to say which
> of the methods you were using, and it might as well be f2l(_).
>
> The native/natural Erlang analogue of a Python iterator is a function
> that returns a list.  (The idea of iterators signalling a *normal*
> condition by raising an exception is pretty disgusting.  Prior art did
> not do that.)  This is also the native/natural equivalent in Haskell.
> For example, I have some students doing a simple analysis of the Lego
> robot language NQC, with code like this:
>
> 	direct_uses :: Identifier -> NQC_AST -> [Identifier]
> 	direct_uses name nqc =
> 	    let (b,c) = fun_body False nqc name in
> 	    list_to_set [i | (s,c') <- substmts b c,
> 			     e <- stmt_exprs s,
> 			     VarExpr i <- subexprs e,
> 			     not (deep_declared i c')]
>
> where substmts returns all the statements directly or indirectly
> contained in a statement, stmt_exprs returns all the expressions
> directly contained in a statement, and subexprs returns all the
> expressions directly or indirectly contained in an expression.
> Now Haskell is a lazy language, so this is equivalent to a backtracking
> search in Prolog.  But this would still be a reasonable way to proceed
> in Erlang:  the space required for the explicit lists is comparable to
> the space required for the data structures being iterated over.
>
> I *would* like to be able to iterate over tuples in Erlang, either
> by having a new <-: symbol like <- but working over tuples, or by
> having the compiler optimise <- tuple_to_list(...) specially.  The
> former is how Clean does it; the latter may be a better approach for
> Erlang and would allow for other optimisations in the same spirit,
> including (a better named version of) f2l(), should there ever be any
> great need for it.
>