Code generation (was NOOB)

Jay Nelson jay@REDACTED
Sat Sep 2 01:52:40 CEST 2006


Bjorn clarified some optimization approaches:

> We have considered improving the optimization, so that no list would
> be built in case 2.

Where case 2 was the line commented <2> below:

> foo(X) ->
>    A = [do_something(E) || E <- X],    % <1>
>    [do_other(E) || E <- X],            % <2>
>    ok.



There have been previous debates about using lists:foreach when only side effects are
desired and a list comprehension when the result list is needed.  I used to subscribe to
that approach, but the list comprehension is so much clearer that I now always use it
unless there is reason to be concerned about generating a large list.  To me, <1> and <2>
have clearly marked intent: the first needs the result, and the second is only for side
effects because there is no left-side pattern match.
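To make the comparison concrete, here is a minimal sketch of the two styles side by side.
The module name, function names and the {item, Item} message shape are made up for
illustration; only lists:foreach/2 and the comprehension syntax are standard.

```erlang
-module(side_effects).
-export([notify_foreach/2, notify_lc/2]).

%% Hypothetical helper: send every item to Dest using lists:foreach/2.
%% No result list is involved; lists:foreach/2 itself returns ok.
notify_foreach(Dest, Items) ->
    lists:foreach(fun(Item) -> Dest ! {item, Item} end, Items),
    ok.

%% Same side effect written as a list comprehension whose value is
%% discarded (no left-side pattern match), as in case <2> above.
notify_lc(Dest, Items) ->
    [Dest ! {item, Item} || Item <- Items],
    ok.
```

Both functions deliver the same messages in the same order; the only difference under
discussion is whether the second one pays for a list that nobody looks at.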

I would lobby for the compiler to not generate a list in this case.  It allows me to
code in the most consistent and clear style and not have to take a performance penalty
for doing so.

There are two other reasons why this optimization might be desirable:

1) When (or if) binary comprehensions are added, it would be possible to use a similar
construct to visit all bytes and emit them via message send or io:format or write them
to a file, without constructing and consuming new memory.  Thus, binaries nearly the
size of total available RAM could be received, transformed and re-emitted as below:

{ok, BigBinary} = file:read_file("bigfile"),   %% or from a receive
<< AmazonS3 ! convert_to_unicode(ByteChunk) || ByteChunk:1024 <- BigBinary >>.
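For comparison, here is a rough sketch of the same idea using the bit string generator
syntax that later Erlang/OTP releases adopted (Pattern <= Binary), inside a list
comprehension whose value is discarded.  The module and function names, the
{chunk, ...} message shape and the Convert fun (standing in for convert_to_unicode)
are assumptions for illustration only.

```erlang
-module(chunk_emit).
-export([emit_chunks/3]).

%% Walk Bin in 1024-byte chunks and send each converted chunk to Dest.
%% Convert is a caller-supplied fun standing in for convert_to_unicode/1.
%% The generator stops at the first point where the pattern no longer
%% matches, so a trailing piece shorter than 1024 bytes is NOT sent.
emit_chunks(Bin, Dest, Convert) when is_binary(Bin), is_function(Convert, 1) ->
    [Dest ! {chunk, Convert(Chunk)} || <<Chunk:1024/binary>> <= Bin],
    ok.
```

If the tail matters, it would have to be handled separately, for example with an
explicit recursive function that matches the remainder as its final clause.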


2) List comprehensions (and binary comprehensions) could be used for infinite streams,
much as tail recursion lets a loop run forever without growing the stack.


[ send_to_listeners(Publication) || Publication <- get_next_pub(Publisher) ]

get_next_pub(Publisher) ->
  receive
    {Publisher, Pub} -> Pub
  after
    5000 -> end_of_stream   %% Hmmm, well some token to tell comprehension to end.
                            %% Don't think [] would work.  Worst case it could just exit.
  end.


Half-baked idea, but if you spawn and link a process running the above code, the intent
is for it to sit emitting published events to all subscribers until the publisher pauses
for 5 seconds.  I think the coding style makes it concise and clear what is going on.
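As the language stands, the expression on the right of <- must evaluate to a list, so
the comprehension above would not stream as written.  A plain tail-recursive loop gives
roughly the behaviour described.  This is only a sketch: the module name, the
{publication, Pub} message shape and the explicit Timeout argument are assumptions,
and the listener list stands in for send_to_listeners.

```erlang
-module(pub_loop).
-export([relay/3]).

%% Forward every {Publisher, Pub} message to all Listeners until the
%% publisher has been silent for Timeout milliseconds (5000 in the
%% example above), then return end_of_stream.
relay(Publisher, Listeners, Timeout) ->
    receive
        {Publisher, Pub} ->
            lists:foreach(fun(L) -> L ! {publication, Pub} end, Listeners),
            %% Tail call: the loop runs in constant stack space.
            relay(Publisher, Listeners, Timeout)
    after Timeout ->
        end_of_stream
    end.
```

It could be started with something like
spawn_link(pub_loop, relay, [Publisher, Listeners, 5000]).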

Instead of get_next_pub, the generator could be lazy_gen_all_natural_numbers or some other
infinite stream.
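One way such a lazy generator can be modelled today is as a fun that returns the next
element together with the rest of the stream.  The sketch below is an illustration under
that assumption; the module and function names are invented, and the naturals are taken
to start at 0.

```erlang
-module(lazy_stream).
-export([naturals/0, take/2, each_while/3]).

%% A stream is a zero-arity fun returning {Head, RestOfStream}.
naturals() -> from(0).

from(N) -> fun() -> {N, from(N + 1)} end.

%% Force the first K elements of a stream into an ordinary list.
take(0, _Stream) -> [];
take(K, Stream) when is_integer(K), K > 0 ->
    {Head, Rest} = Stream(),
    [Head | take(K - 1, Rest)].

%% Apply Fun to each element for its side effect while Pred holds.
%% Tail recursive, so no list is built and the stack does not grow.
each_while(Pred, Fun, Stream) ->
    {Head, Rest} = Stream(),
    case Pred(Head) of
        true ->
            Fun(Head),
            each_while(Pred, Fun, Rest);
        false ->
            ok
    end.
```

each_while/3 is the side-effect-only counterpart of the comprehension idea: it visits
elements one at a time and stops on a condition rather than on the end of a list.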


jay



