[erlang-questions] Re: Shared/Hybrid Heap
Thu Oct 14 16:47:46 CEST 2010
Sharing binaries is fine because they use a different heap altogether and a
different garbage collection scheme (reference counting).
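
To illustrate the reference counting point, here is a minimal sketch in Python. It is a toy model, not how the Erlang VM is implemented; the names SharedBinary, acquire, release and demo_refcount are made up for the example. It assumes a large binary kept outside the process heaps, with one counted reference per process holding it.

```python
class SharedBinary:
    """Hypothetical stand-in for a reference-counted binary kept off the process heaps."""

    def __init__(self, data: bytes):
        self.data = data
        self.refcount = 0
        self.freed = False

    def acquire(self):
        # A process takes a reference; nothing is copied.
        self.refcount += 1
        return self

    def release(self):
        # A process drops its reference; the last one out frees the memory.
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True
            self.data = None


def demo_refcount():
    blob = SharedBinary(b"x" * 1024)
    p1_ref = blob.acquire()   # reference held by P1
    p2_ref = blob.acquire()   # reference held by P2
    p1_ref.release()
    still_alive = not blob.freed   # P2 still holds a reference
    p2_ref.release()
    return still_alive, blob.freed
```

No process has to be paused here: reclaiming the binary only depends on the counter, never on scanning another process's stack.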
Sharing a NIF resource implies you manage it manually through NIFs; how you
manage it is largely up to you.
As far as other languages go, yes, most (all?) of them stop all threads to
garbage collect (if those threads share memory). Java does this, C# does
this, as do Ruby, Python, etc. C# has recently introduced, or will shortly
introduce, a garbage collection algorithm that runs concurrently and thus
doesn't stop all threads, but from what I can remember, that isn't 100%
guaranteed (if threads continue to allocate past a threshold, I believe they
are still all stopped, waiting for GC to complete). This is an important
reason why multi-generational garbage collection algorithms are so popular:
they keep most GC cycles quick, thus stopping all threads for the least
amount of time.
The idea behind each process having its own heap is twofold: it helps memory
locality and it isolates that heap such that it can be GCed on its own,
without interrupting any other process. Processes are thus essentially
sandboxed in that regard.
This also means that any two processes, or more, can run garbage collection
concurrently without issues.
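
A rough sketch of the per-process heap idea, in Python. Everything here (Process, alloc, collect, demo_private_heaps) is hypothetical, and the collector is a plain mark-and-sweep chosen for brevity, not the algorithm the Erlang VM uses. The point is only that collect() touches one process's roots and one process's heap.

```python
class Process:
    """Toy process with a private heap and its own root set."""

    def __init__(self, name):
        self.name = name
        self.heap = {}        # address -> (value, list of child addresses)
        self.roots = set()    # addresses referenced from this process's stack
        self.next_addr = 0

    def alloc(self, value, children=(), root=False):
        addr = self.next_addr
        self.next_addr += 1
        self.heap[addr] = (value, list(children))
        if root:
            self.roots.add(addr)
        return addr

    def collect(self):
        """Mark from this process's roots only, then sweep this heap only."""
        marked = set()
        stack = list(self.roots)
        while stack:
            addr = stack.pop()
            if addr in marked:
                continue
            marked.add(addr)
            stack.extend(self.heap[addr][1])
        freed = len(self.heap) - len(marked)
        self.heap = {a: cell for a, cell in self.heap.items() if a in marked}
        return freed


def demo_private_heaps():
    p1, p2 = Process("P1"), Process("P2")
    leaf = p1.alloc("leaf")
    p1.alloc("kept", children=[leaf], root=True)
    p1.alloc("garbage")
    p2.alloc("untouched", root=True)
    p2.alloc("p2 garbage")
    freed = p1.collect()      # P2 is never inspected or paused
    return freed, len(p1.heap), len(p2.heap)
```

Collecting P1 frees its one unreachable cell and leaves P2's heap exactly as it was, garbage included, until P2 runs its own collection.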
While it sounds bad to stop all processes/threads sharing a piece of memory,
in reality, for most desktop applications, it is a non-issue. The story is
different for server software, however, which tends to allocate a lot and
have many threads.
When you want to share some data between P1 and P2, if they do not share
everything, a problem arises quite quickly: how do you mark allocations that
should be shared and those that shouldn't? If you share everything,
allocation is easy and goes in 1 heap (ignoring binaries here in their own
heap). If you do not, you have to select those allocations. In both cases,
because some memory is shared, the simplest way to GC will be to stop both
processes. After all, you will need to inspect their respective stacks to
see if they reference things in that shared heap, and if the processes are
running, the stacks will keep changing. Even if only for the time of a
memcpy of the stack, all processes will have to stop at some point or
another. You also won't be able to run the GC concurrently on P1 and P2 if
they share memory, or at least, it will be quite hard.
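
The need to inspect every stack can be sketched as follows, again as a toy Python model with made-up names (reachable, collect_shared, demo_shared_heap). It assumes the stacks passed in are snapshots taken while their processes are paused.

```python
def reachable(heap, roots):
    """Return every address reachable from the given roots."""
    marked, stack = set(), list(roots)
    while stack:
        addr = stack.pop()
        if addr not in marked:
            marked.add(addr)
            stack.extend(heap[addr])
    return marked


def collect_shared(heap, stacks):
    """Collect a shared heap; `stacks` should hold the stack of every process."""
    roots = [addr for stack in stacks for addr in stack]
    live = reachable(heap, roots)
    return {addr: kids for addr, kids in heap.items() if addr in live}


def demo_shared_heap():
    heap = {0: [1], 1: [], 2: [], 3: []}   # address -> child addresses
    p1_stack, p2_stack = [0], [2]
    wrong = collect_shared(heap, [p1_stack])             # P2's stack ignored
    right = collect_shared(heap, [p1_stack, p2_stack])   # both stacks scanned
    return sorted(wrong), sorted(right)
```

Scanning only P1's stack throws away address 2, which P2 still references. Getting it right means having a stable view of both stacks at once, which is why both processes end up stopped.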
Garbage collecting shared memory is a non-trivial problem to solve and, as
Erlang currently does, the simplest, most elegant way to deal with it is to
not share at all (and always copy).
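
A small sketch of "always copy", in Python with hypothetical names (Mailbox, send, receive, demo_copy_on_send). It only models the semantics of copying a message on send; it says nothing about how the Erlang VM actually implements message passing.

```python
import copy
from collections import deque


class Mailbox:
    """Toy mailbox that deep-copies every message on send."""

    def __init__(self):
        self.messages = deque()

    def send(self, term):
        # The receiver gets its own copy, so the two sides share no memory.
        self.messages.append(copy.deepcopy(term))

    def receive(self):
        return self.messages.popleft()


def demo_copy_on_send():
    inbox = Mailbox()
    original = {"counts": [1, 2, 3]}
    inbox.send(original)
    original["counts"].append(4)      # sender keeps working on its own data
    received = inbox.receive()
    return received, received["counts"] is original["counts"]
```

The receiver sees the message as it was when sent, and since nothing is shared, neither side's garbage collector ever needs to know about the other.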
On Thu, Oct 14, 2010 at 4:22 AM, Morten Krogh wrote:
>> You are missing the point -- or at least the point which I think Richard
>> is making. In the scheme you propose, T2's execution will be influenced by a
>> piece of data that _in principle_ is not shared. For example, T2 needs to be
>> stopped executing or synchronized with T1 by garbage collection which might
>> take place when process T1 allocates some big D2.
>> How is that different than the two processes sharing everything?
> I don't understand you. Are you saying that it is impossible to implement
> multithreading unless the threads share everything, and garbage collection
> will be slow?
> What about two erlang processes that share a binary. Isn't that how it
> works today?
> What about two erlang processes that share a nif resource?
> What about other languages and virtual machines. They have multithreading.
> We must be talking past each other here.
> By the way, I am not advocating such a change to erlang. I think you can
> get around shared memory by choosing the right processes, partition data
> correctly etc.
> It is only the most computationally intensive cases where you can't, and
> there one can use a nif resource or some external way of sharing data.