[erlang-questions] Python Generators In Erlang and constant memory usage
Jani Launonen
launoja@REDACTED
Fri Mar 20 09:21:31 CET 2009
I admit not read throughfully your post, but I think you're looking for continuations.
I've done continuous based fibonacci iterator in Python and Erlang so it should be working solution.
But your generator in continuations:
-module(continuation).
-export([gen/1]).
gen(N) ->
{0, fun() -> gen_cont(0, N) end}.
gen_cont(I, N) ->
case I < N of
true -> {I + 1, fun() -> gen_cont(I + 1, N) end};
false -> ok
end.
How it works...
7> {_, Fun1} = continuation:gen(3).
{0,#Fun<continuation.0.17007156>}
8> {_, Fun2} = Fun1().
{1,#Fun<continuation.1.15616427>}
9> {_, Fun3} = Fun2().
{2,#Fun<continuation.1.15616427>}
10> {_, Fun4} = Fun3().
{3,#Fun<continuation.1.15616427>}
11> {_, Fun5} = Fun4().
----- Alkuperäinen viesti -----
Lähettäjä: Frederic T-H <mononcqc@REDACTED>
Päiväys: perjantai, maaliskuu 20, 2009 3:28 am
Aihe: [erlang-questions] Python Generators In Erlang and constant memory usage
Vastaanottaja: erlang-questions@REDACTED
> First time posting here, I've been reading this mailing lists
> for a few
> months now, programming Erlang for a few months too.
>
> Python and its most functionnal parts is what eventually brought
> me to
> Erlang. One feature that I loved in Python and instantly noticed
> was not
> available in Erlang was generators. Generators are basically
> normal
> functions, except they return an iterator, letting you fetch
> each result
> one by one, effectively letting you work with constant memory.
> An
> example of the way Python does it:
>
> def gen(n):
> i = 0
> while i<n:
> yield i
> i+=1
> return
>
> x = gen(2)
> x.next() # returns 0
> x.next() # returns 1
> x.next() # throws an exception
>
> In python, using yield creates a 'next' method, which is
> automatically
> called by most functions that can iterate over text, lists,
> tuples,
> dictionaries, etc.
>
> The idea is to not load all the results from a function at once
> and pass
> it around, but to send each part individually.
>
> As an example, say I have a result set consisting of 1000x5kb
> tuples
> (whatever they contain) which I want to map a function over. The
> way
> Erlang works, you'd need to have a list that is 5000kb in size
> in memory
> and manipulate it every single time the mapped function iterates
> over it.
>
> The end goal is to attempt to bypass having to evaluate the
> whole result
> set at once (heavy) and instead get each element individually
> (lighter).
> This is especially good for smaller systems like phones, web
> tablets
> (Nokia) or really anything with limited amounts of RAM.
>
> While possible to implement in Erlang, generators are quite
> problematic
> because every list BIF at the moment only operates on complete
> lists.
> This means adding generators in Erlang would require one to do
> something
> like duplicate the lists module.
>
> This may not be the first time this whole idea is discussed, but
> fast
> googling didn't reveal much so I've let myself write a small
> naive and
> simple prototype of what the implementation could be.
>
> The idea is quite simple: Have a recursive function as a
> different
> spawned process waiting to receive {Pid, next} before returning
> data.
> I've attached the file demonstrating the implementation, but in
> case I
> got it wrong, I also copy/pasted it there:
> http://pastebin.com/f31ae9731
> What the implementation does:
>
> - define tail-recursive fibonacci functions,
> a normal one and a
> generator version
> - define mockups for the sum and fold (foldl)
> functions from the
> list module adapted to generators
> - short unscientific tests to compare performances.
>
> What it doesn't do:
>
> - consider any kind of safety in case there
> is an error in a generator
> - contain anything past what you'd do for a
> proof of concept.
>
>
> The way it works, a normal tail-recursive way to get all the
> fibonacci
> numbers until the nth one could be done the following way:
>
> t_fib(1) -> [1];
> t_fib(2) -> [1,1];
> t_fib(L) -> t_fib(3,L,[1,1]).
>
> t_fib(L,L,[N1,N2|_]=F) ->
> lists:reverse([N1+N2|F]);
> t_fib(C, L, [N1,N2|_]=F) ->
> t_fib(C+1,L,[N1+N2|F]).
>
> That's not the prettiest function in the world, but it still
> does the
> job. Here's how it would be as a generator (with the macros I
> have
> defined in the file):
>
> y_fib(L) ->
> y_fib(1, L, 1, 0).
> y_fib(L, L, N1 ,N2) ->
> ?last_yield(N1+N2);
> y_fib(C, L, N1, N2) ->
> N = N1+N2,
> ?yield(N),
> y_fib(C+1, L, N2, N).
>
> The code overhead one gets writing it that way is minimal and
> denies the
> need to use a list as an accumulator.
>
> The generator sum function is written the following way:
>
> sum({M,Gen,Args}) ->
> Pid = spawn(M, Gen, Args),
> sum(Pid,0).
> sum(Pid,S) ->
> Pid ! {self(), next},
> receive
> {Pid, N} -> sum(Pid, S+N);
> {Pid, N, done} -> S+N
> end.
>
> Called with the custom sum function in the generators module
> (This could
> be made better, I agree):
>
> >
> generators:sum({generators,y_fib,[100]}).
> 927372692193078999175
>
> Now for performance, I've tested it with Erlang R13A. My
> computer: AMD phenom 9950 quad-core 2.6ghz
> processor, 4gb RAM, Windows 7 Beta
> build 7000 (32 bits).
> batch_test/3 will compare both versions with a sequence/list of
> Fibonacci numbers of length 10000 and 50000, 20 times each, and
> round
> the results showing how much faster the generator is (1.0 is equal).
>
> 26> generators:batch_test(sum, 10000, 20).
> 0.9438942747654998
> 27> generators:batch_test(sum, 50000, 20).
> 2.406425935803285
> 28> generators:batch_test(fold, 10000, 20).
> 0.6706634763086374
> 29> generators:batch_test(fold, 50000, 20).
> 1.3918125332818239
>
> This seems to show that using a generator scales much better,
> while
> becoming faster as the list grows and becomes expansive to
> generate and
> carry around.
>
> The real advantage, though, is the fact that using a function
> that won't
> accumulate lots of data (sum, folds, foreach, all, any, nth,
> etc.) will
> be done with constant memory. This is especially good for small
> systems
> or scaling as a whole.
>
> The testing I've done is minimal, and while I don't expect the
> Erlang
> community to jump on the idea and adopt it, I'm one of these
> people
> who'd like to eventually use this in production code. What kind
> of stuff
> would you see missing, or likely to cause trouble in the current
> implementation?
> What ways would I have to test this giving better results?
>
> Any other comments, tips? Thanks in advance! (Have a nice day also.)
>
More information about the erlang-questions
mailing list