Code generation (was NOOB)
Jay Nelson
jay@REDACTED
Sat Sep 2 01:52:40 CEST 2006
Bjorn clarified some optimization approaches:
> We have considered improving the optimization, so that no list would
> be built in case 2.
Where case 2 was the line commented <2> below:
> foo(X) ->
>     A = [do_something(E) || E <- X],   % <1>
>     [do_other(E) || E <- X],           % <2>
>     ok.
There have been previous debates about using lists:foreach when only side effects are desired
and a list comprehension when the result list is needed. I used to subscribe to that
approach, but the list comprehension is so much clearer that I always use it now unless
there is reason to be concerned about generating a large list. To me, <1> and <2> have
clearly marked intent: the first needs the result and the second is only for side
effects, because there is no left-hand side pattern match.
I would lobby for the compiler to not generate a list in this case. It allows me to
code in the most consistent and clear style and not have to take a performance penalty
for doing so.
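
For comparison, here is a minimal sketch of the two styles side by side. The module name side_effects and the body of do_other/1 are made up for illustration (it just sends a message to the calling process); only the shape of the two calls matters. Whether the second form allocates a throwaway list is exactly the compiler question discussed above.

```erlang
-module(side_effects).
-export([with_foreach/1, with_comprehension/1]).

%% lists:foreach/2 is used purely for side effects and returns ok.
with_foreach(X) ->
    lists:foreach(fun(E) -> do_other(E) end, X),
    ok.

%% Same intent written as a comprehension. The `_ =` match only makes
%% it explicit that the resulting list is deliberately thrown away.
with_comprehension(X) ->
    _ = [do_other(E) || E <- X],
    ok.

%% Hypothetical stand-in for the do_other/1 in the quoted example:
%% the side effect here is a message sent to the calling process.
do_other(E) ->
    self() ! {seen, E}.
```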
There are two other reasons why this optimization might be desirable:
1) When (or if) binary comprehensions are added, it would be possible to use a similar
construct to visit all bytes and emit them via message send or io:format or write them
to a file, without constructing and consuming new memory. Thus, binaries nearly the
size of total available RAM could be received, transformed and re-emitted as below:
{ok, BigBinary} = file:read_file("bigfile"), %% or from a receive
<< AmazonS3 ! convert_to_unicode(ByteChunk) || ByteChunk:1024 <- BigBinary >>.
2) List comprehensions (and binary comprehensions) could be used for infinite streams,
sort of the equivalent of tail recursion not using the stack.
[ send_to_listeners(Publication) || Publication <- get_next_pub(Publisher) ]

get_next_pub(Publisher) ->
    receive
        {Publisher, Pub} -> Pub
    after
        5000 -> end_of_stream  %% Hmmm, well some token to tell comprehension to end.
                               %% Don't think [] would work. Worst case it could just exit.
    end.
Half-baked idea, but if you spawn_link the above code, the intent is to sit emitting
published events to all subscribers until the publisher pauses for 5 seconds. I think
the coding style makes it concise and clear what is going on.
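
For reference, this is roughly the hand-written loop that the comprehension would stand in for: a sketch only, assuming the publisher sends {PublisherPid, Pub} tuples. The module name pub_relay, the Listeners argument and the {publication, Pub} message format are my own inventions for the example.

```erlang
-module(pub_relay).
-export([start/2, relay/2]).

%% Spawn a linked relay that forwards publications from Publisher
%% (a pid) to every pid in Listeners.
start(Publisher, Listeners) ->
    spawn_link(?MODULE, relay, [Publisher, Listeners]).

relay(Publisher, Listeners) ->
    receive
        {Publisher, Pub} ->
            send_to_listeners(Pub, Listeners),
            %% Tail call: the loop runs in constant stack space and
            %% never accumulates a result list.
            relay(Publisher, Listeners)
    after 5000 ->
        %% Publisher has been quiet for 5 seconds; the process ends.
        end_of_stream
    end.

%% Hypothetical delivery function: one message per listener.
send_to_listeners(Pub, Listeners) ->
    lists:foreach(fun(L) -> L ! {publication, Pub} end, Listeners).
```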
Instead of get_next_pub, the generator could be lazy_gen_all_natural_numbers or some other
infinite stream.
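
Coming back to reason 1): without binary comprehensions, the chunk-by-chunk walk has to be written by hand, something like the sketch below. The module and function names (chunk_walk, each_chunk/3) are hypothetical, and the chunk size is counted in bytes. As far as I understand the binary matching, Chunk and Rest are sub-binaries that refer to the original data rather than copies of it, so the walk itself should not need a second buffer.

```erlang
-module(chunk_walk).
-export([each_chunk/3]).

%% Call Fun on successive chunks of at most Size bytes of Bin, for
%% side effects only. No result binary or list is built.
each_chunk(Fun, Size, Bin)
  when is_function(Fun, 1), is_integer(Size), Size > 0, is_binary(Bin) ->
    case Bin of
        <<Chunk:Size/binary, Rest/binary>> ->
            Fun(Chunk),
            each_chunk(Fun, Size, Rest);   % tail call
        <<>> ->
            ok;
        Last ->
            %% Final chunk shorter than Size bytes.
            Fun(Last),
            ok
    end.
```

With that, the example from 1) would read each_chunk(fun(C) -> AmazonS3 ! convert_to_unicode(C) end, 1024, BigBinary).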
jay