[erlang-questions] About Garbage Collection
Richard O'Keefe
ok@REDACTED
Thu Nov 11 02:50:14 CET 2010
On 10/11/2010, at 7:19 PM, Ş. Volkan Erdoğan wrote:
> Hi,
> I have an idea about memory management in Erlang and although the idea
> is still raw I'd like to share with you.
>
> As far as I know:
> - There are no global variables in Erlang. Memory is allocated in functions.
> - Function calls in processes and message passing operations to other
> processes have copying semantics.
More precisely, you can't *tell* whether things are copied or not.
But in practice, function arguments are NOT copied.
Consider for example
len(List) -> len_loop(List, 0).
len_loop([], L) -> L;
len_loop([_|Tail], L) -> len_loop(Tail, L+1).
Suppose you have a list L = [1,...,N].
If arguments were copied, len(L) would take O(N**2) time.
Since arguments are *not* copied, len(L) takes O(N) time.
> So I believe that these properties guarantee that any memory location
> allocated by runtime is exclusive to the function which requested that
> memory
> and it is not possible for a reference of a memory location to
> escape from the function that created it.
Consider
f(X) -> Y = g(X), {Y,Y}.
g(X) -> [X,X,X].
The memory allocated by g/1 for the three-element list [X,X,X]
is *not* exclusive to g/1 but outlives the call that created it.
The value of Y in f/1 is the very same list allocated in g/1.
So this is wrong.
> And as a result it is
> possible to safely deallocate all memory allocated by the function at
> function exit since those memory locations can only be accessed by
> that function.
And that's wrong too.
However, there's something that _could_ be done.
There was a system called I think "Formula Algol" developed at CWI
many years ago; a computer algebra system. I reimplemented it in
Burroughs B6700 Algol in the late 70s and had a lot of fun with it.
The key idea there was that when a function returned,
the new cells it had allocated were copied back to its caller,
and everything not returned was freed.
However, that has problems too.
Consider
make_list(N)
when is_integer(N), N >= 0 ->
make_list_loop(N, []).
make_list_loop(0, L) ->
L;
make_list_loop(N, L) ->
make_list_loop(N-1, [N|L]).
For example, make_list(3) ==> [1,2,3].
Copying the results back would make this take O(N**2) time
whereas the real non-copying Erlang system takes O(N) time.
However, if we push on that idea a little harder, we come
close to the idea of REGION allocation. In effect, the heap
is broken into a lot of little heaps, and a function can
allocate its result in (one of) its caller's heap(s).
A good quick reference to that is
http://mlton.org/Regions
which explains why the MLTon compiler for ML does NOT use
regions.
I don't *think* regions are compatible with hot loading, but
it would be nice to be wrong about that.
More information about the erlang-questions
mailing list