[erlang-questions] Garbage collection

Fri Oct 31 11:54:47 CET 2014

On 31 October 2014 10:49, Lukas Larsson <lukas@REDACTED> wrote:

> On Fri, Oct 31, 2014 at 11:06 AM, Chandru <
> chandrashekhar.mullaparthi@REDACTED> wrote:
>
>>
>> On 31 October 2014 10:01, Lukas Larsson <lukas@REDACTED> wrote:
>>
>>> On Fri, Oct 31, 2014 at 10:51 AM, Chandru <
>>> chandrashekhar.mullaparthi@REDACTED> wrote:
>>>
>>>> Thank you Lukas.
>>>>
>>>> On 31 October 2014 09:37, Lukas Larsson <lukas@REDACTED> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> On Fri, Oct 31, 2014 at 10:20 AM, Chandrashekhar Mullaparthi <
>>>>> chandrashekhar.mullaparthi@REDACTED> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> I have a question about BEAM’s GC implementation. When an Erlang
>>>>>> process is being GCed, is the processing required to do the GC taken out of
>>>>>> the process’s 2000 reduction quota, or is it done after a process has been
>>>>>> scheduled out?
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> The GC work is taken out of the process' reductions. The GC is never
>>>>> triggered when it is scheduled out, but it can be triggered before being
>>>>> scheduled in, in which case the newly allotted reductions will be reduced
>>>>> by the GC work.
>>>>>
>>>>
>>>> So what happens if the process has a large heap? Can the GC end up
>>>> taking more time than to execute 2000 reductions? Or is it somehow time
>>>> bounded? If it is not time bounded, it explains a lot of the problems I'm
>>>> seeing on a system.
>>>>
>>>
>>> The current GC is not incremental, so once it has started doing work it
>>> cannot be interrupted. This means that if a process has a large heap it
>>> will block all other execution on that scheduler for the GC duration.
>>>
>>
>> Can it also block other schedulers by any chance? Robert Virding's
>> presentation [1] says that every 20-40k reductions, a new master scheduler
>> is chosen. If this transition of master scheduler has to happen
>> while one of the schedulers is stuck in a long GC, could it
>> potentially block other schedulers?
>>
>>
> It can result in schedulers not waking up properly, for the same reasons
> that long-running NIFs/BIFs cause this. So if this is your problem I would
> have a look at which processes are using a lot of heap space and try to
> reduce it, or make sure that they do not GC :)
>
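
For finding which processes are using a lot of heap space, a rough sketch
along these lines may help. The module name heap_top and the function top/1
are made up for illustration; only erlang:processes/0 and
erlang:process_info/2 are standard calls, and the sizes they report are in
words, not bytes.

```erlang
-module(heap_top).
-export([top/1]).

%% Return up to N live processes with the largest total heap, as
%% {Pid, HeapWords} tuples sorted largest first. Sizes are in words.
%% (Hypothetical helper, not part of OTP.)
top(N) when is_integer(N), N >= 0 ->
    Sizes = [{Pid, Words}
             || Pid <- erlang:processes(),
                %% process_info/2 returns undefined if Pid has already
                %% exited; the pattern below silently skips those.
                {total_heap_size, Words}
                    <- [erlang:process_info(Pid, total_heap_size)]],
    lists:sublist(lists:reverse(lists:keysort(2, Sizes)), N).
```

Calling heap_top:top(10) from the shell gives a snapshot only; heap sizes
change as processes run and collect, so it is worth sampling a few times.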

Thanks for the confirmation, and yes of course. I'm trying to convince
someone that the problem isn't GC but the system design
;-) Knowing exactly how it works helps in the argument.
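
One way to gather evidence for that argument might be the system monitor,
which can report individual collections that exceed a time threshold. This
is only a sketch: the module name gc_watch and the logging format are
assumptions, while erlang:system_monitor/2 and its {long_gc, Time} option
are standard.

```erlang
-module(gc_watch).
-export([start/1]).

%% Spawn a watcher that is notified whenever any process on this node
%% spends at least ThresholdMs milliseconds in a single garbage collection.
%% (Hypothetical helper, not part of OTP.)
start(ThresholdMs) when is_integer(ThresholdMs), ThresholdMs > 0 ->
    Watcher = spawn(fun loop/0),
    %% Note: this replaces any system monitor settings already in place.
    erlang:system_monitor(Watcher, [{long_gc, ThresholdMs}]),
    Watcher.

loop() ->
    receive
        {monitor, Pid, long_gc, Info} ->
            %% Info is a list of tuples describing the collection.
            io:format("long GC in ~p: ~p~n", [Pid, Info]),
            loop();
        _Other ->
            loop()
    end.
```

Something like gc_watch:start(50) would then log every collection taking
50 ms or more, which should show whether the large-heap processes are the
ones stalling a scheduler.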

>
> Scott has collected a bunch of his observations on the long-running
> NIFs/BIFs issue here:
> https://github.com/slfritchie/nifwait/blob/md5/README.md. Sometime in the
> not too distant future I hope to have the time to write an incremental GC
> for large Erlang heaps, but in the meantime I believe Scott recommends
> using something like "+sfwi 500 +scl false" in order to avoid this problem.
> Try them out and see if the options work for you.
>

Thanks, I'll give them a try.

cheers
Chandru