[erlang-questions] Garbage Collector Details

Fri Sep 9 19:55:50 CEST 2011

That was a fantastically interesting description of the VM internals! Thank
you very much!

1. Allocate/Grab a new heap fragment.

Is there a global lock for allocating these fragments, or is the heap
fragment allocator smarter than this? How is contention for this allocator
resolved?

Sincerely,

jw

--
Americans might object: there is no way we would sacrifice our living
standards for the benefit of people in the rest of the world. Nevertheless,
whether we get there willingly or not, we shall soon have lower consumption
rates, because our present rates are unsustainable.

On Fri, Sep 9, 2011 at 8:07 AM, Jesper Louis Andersen <
jesper.louis.andersen@REDACTED> wrote:

> On Fri, Sep 9, 2011 at 13:35, Johannes Auer <johannes.auer@REDACTED> wrote:
>
> > It says that a message is first copied to the private heap of the
> receiving process, and afterwards a pointer to that message is placed in the
> message queue. But copying to the heap would require locking, since many
> processes can send messages concurrently to the same receiving process.
> Could someone explain what is happening *exactly* when a message is sent
> from one process to another?
>
> There are three tricks being employed, and I am sure one from the
> Erlang/OTP team will correct me if I am wrong here.
>
> First, an Erlang process struct is governed by a series of locks (4 at
> the time being). For our purposes the most important locks are the
> MSGQ and STATUS locks which together can be called the SENDRECV lock.
> The two locks protect some specific fields in the process struct - we
> are concerned with the message queue called the InQueue. The InQueue
> is a linked list of incoming messages, protected by the MSGQ lock. But
> we also need the STATUS lock, so we will take both of these and call
> them the SENDRECV lock. There is also another important lock, the MAIN
> lock which protects every field not protected by a more specific lock.
>
> Second, notice that the heap of a process can be fragmented. That is,
> rather than consist of a contiguous piece of memory, it may consist of
> several smaller fragments. When we run a garbage collection in the
> process, these fragments will be combined and the fragments will be
> released.
>
> Third, the message queue is split into two. There is the above
> mentioned InQueue, protected by MSGQ. And there is also the PrivQueue,
> the private queue protected by MAIN. The idea is that if you hold the
> MSGQ lock, you can send messages but the process can still do stuff as
> long as it only needs MAIN.
>
> ++
>
> Whenever a process is running, it holds MAIN. This allows it exclusive
> access to most fields and the PrivQueue. When process S wants to send
> to R, it does the following (in a very simplified explanation):
>
> 1. Allocate/Grab a new heap fragment. I don't know how much fragment
> reuse there is, but there could be some.
> 2. Copy the message to the heap fragment. We have exclusive right to
> the fragment.
> 3. Grab the MSGQ and STATUS locks of R.
> 4. Add the message (and fragment) to the InQueue.
> 5. Release the MSGQ and STATUS locks of R.
>
> 4 is mostly just a pointer assignment, so the time we hold the lock is
> extremely small. Also, we don't stop another scheduler with MAIN from
> executing the process. Even the copy is "free" because it happens
> outside the lock.
>
> When the process executes, it will grab MAIN and do its work off the
> PrivQueue. If it exhausts, it'll grab the MSGQ lock and evacuate the
> messages to the PrivQueue. There is also an important invariant, which
> is that InQueue *never* contains pointers into the process "main
> heap". This means that the Garbage Collector does not need to include
> InQueue in its root set.
>
> Garbage collection then amounts to combining all the fragments and
> collecting garbage. The neat trick here is that the fragments works as
> a handoff algorithm. The pointer to the fragment is a "token" and he
> who holds that token is responsible for its collection. When we
> allocate a new fragment/token and then hand it off to the R process,
> we give the R process the responsibility of collecting and reclaiming
> the data. But the allocation and copying work is done in S.
>
> Optimistically, S rarely blocks R and R rarely blocks S. Even if we
> have multiple senders, they hold the lock for a very short time, so
> they are unlikely to trip over each other much. Of course it goes
> wrong if there are scores of threads trying to get that lock, but
> chances are you have a badly designed program then.
>
> I should also mention that this is a message from S to R locally on
> the same machine with S and R different processes. The code path if S
> = R or if S and R are on different distributed nodes is different.
>
>
> --
> J.
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20110909/83ec21f8/attachment.htm>