Garbage collection

Thu Sep 2 11:56:19 CEST 1999

On Thu, 2 Sep 1999, Klacke wrote:

klacke>Furthermore, when a process has garbed and the garb was 
klacke>not finished (since it took too long) and execution
klacke>was resumed instead, well then we have a fairly complex
klacke>situation. Imagine that we need to send a message to such 
klacke>a "half-garbed" process, we need to allocate some memory there
klacke>for the message. Where do we allocate that memory, hardly
klacke>on any of the heaps that are "half-garbed" ??
klacke>
klacke>(Or do we suspend the sender !!!, deadlocks ???)

Why would this cause deadlocks? The runtime system already uses flow
control when a process sends a message to a port, right?
If a process tries to send a message to another process which is
half-garbed, it is suspended; if no other processes can run, the
half-garbed process is allowed to continue. Since it doesn't depend on the
process waiting to send a message, there won't be any deadlock.
Or am I missing something?
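
To make the argument above concrete, here is a toy model of that scheduling
rule, written in Python rather than taken from the Erlang runtime. Everything
in it (the Proc class, gc_steps_left, outbox, run) is a hypothetical name
invented for illustration. It only shows that a sender suspended on a
half-garbed receiver is released once the receiver's gc is allowed to finish.

```python
from collections import deque


class Proc:
    """Hypothetical stand-in for a process; not the real runtime structure."""

    def __init__(self, name, gc_steps_left=0):
        self.name = name
        self.gc_steps_left = gc_steps_left  # > 0 means "half-garbed"
        self.mailbox = []
        self.outbox = deque()  # (target, message) pairs still to be sent

    def half_garbed(self):
        return self.gc_steps_left > 0


def run(procs, max_ticks=100):
    """Toy scheduler: returns a trace of (process name, event) tuples."""
    trace = []
    for _ in range(max_ticks):
        progressed = False
        for p in procs:
            if p.half_garbed():
                continue  # only resumed below, when nobody else can run
            if p.outbox:
                target, msg = p.outbox[0]
                if target.half_garbed():
                    trace.append((p.name, "suspended"))  # flow control
                    continue
                p.outbox.popleft()
                target.mailbox.append(msg)
                trace.append((p.name, "sent"))
                progressed = True
        if not progressed:
            pending = [p for p in procs if p.half_garbed()]
            if not pending:
                break  # nothing left to do
            # No other process can run: let a half-garbed process continue.
            pending[0].gc_steps_left -= 1
            trace.append((pending[0].name, "gc_step"))
    return trace
```

In this model the half-garbed process never waits on the suspended sender, so
the sender is always released after a bounded number of gc steps.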

klacke>Thus the situation which led to this "half-garb", was probably
klacke>that the live data was large. Interrupting the gc of this
klacke>large data set leaves us with a whole lot of memory which
klacke>is not released. The original two heaps of the process as
klacke>well as the new heap we're garbing to. Three heaps.
klacke>If a message arrives at the process, we need to have yet
klacke>another memory area associated with the process where we
klacke>can put the message, four heaps !!.

Yes.
If we suspend senders -- three heaps.
This is a tradeoff between memory usage and real-time behaviour.

The main point of the exercise is that designers will know fairly well
which parts of a program may cause a costly gc, and can structure their
processes accordingly (processes which have to be responsive can offload
heavy tasks to other processes.) There is no point in doing that today,
because the heavy gc will destroy real-time characteristics even if it
happens in a background process! What you can do today is rewrite your
Erlang code so that it performs the same job differently -- perhaps by
forcing small garbage collections along the way. This is "unnecessary" work
(since it's a workaround for shortcomings in the runtime system), and makes
for slower code at the Erlang level.

klacke>Now then, at a later stage when we're garbing
klacke>this 4-heap process, we need to make this gc reentrant as well
klacke>otherwise there's no point in the exercise at all.

I don't think I follow you.
It probably isn't a good idea to accept more data as we're garbing, because
we can't be sure that the gc will ever terminate. Therefore, we should
suspend senders during the gc, and if we do, the gc will behave as today,
except that it may yield at certain stages.
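
As a rough sketch of "behaves as today, but may yield", the Python generator
below copies a fixed live set in slices. It is an illustration under stated
assumptions, not how the runtime does it: copy_live_data, run_gc and budget
are hypothetical names, and the live set is a plain list that nobody appends
to while the copy is in progress (i.e. senders are suspended).

```python
def copy_live_data(live, budget):
    """Copy `live` to a new heap, yielding after every `budget` items.

    `live` must not grow during the copy; that is the "suspend senders"
    assumption, and it is what bounds the number of steps.
    """
    new_heap = []
    for i, term in enumerate(live, start=1):
        new_heap.append(term)
        if i % budget == 0 and i < len(live):
            yield  # hand control back to the scheduler
    return new_heap


def run_gc(live, budget):
    """Drive the collector to completion; returns (new_heap, yield_count)."""
    gc = copy_live_data(live, budget)
    yields = 0
    while True:
        try:
            next(gc)
            yields += 1
        except StopIteration as done:
            return done.value, yields
```

With ten live terms and a budget of four, the collector yields twice and then
finishes; with a budget larger than the live set it never yields, so a small
collection runs straight through.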

One aspect to consider is that the vast majority of collections which are
cheap should not become significantly more expensive because we want to
make the heavy ones reentrant. I don't know how important it is, or how to
go about it. Perhaps a few clever checks (e.g. heap size) could reveal
whether the gc will be a candidate for reentrant gc...
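
One way such a check might look, purely as a sketch: a single comparison made
before the collection starts, so the cheap case pays almost nothing. The
function name and the threshold value are made up for illustration and would
need tuning against real workloads.

```python
def choose_gc_mode(heap_words, threshold_words=100_000):
    """Pick a collector based on heap size alone.

    `threshold_words` is a hypothetical tuning knob, not a measured value.
    """
    return "reentrant" if heap_words > threshold_words else "plain"
```

Small heaps take the "plain" path and are collected as they are today; only
heaps above the threshold pay for the yielding machinery.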

/Uffe

Ulf Wiger, Chief Designer AXD 301      <ulf.wiger@REDACTED>
Ericsson Telecom AB                          tfn: +46  8 719 81 95
Varuvägen 9, Älvsjö                          mob: +46 70 519 81 95
S-126 25 Stockholm, Sweden                   fax: +46  8 719 43 44