[erlang-questions] Garbage collecting binaries

Kenneth Lundin <>
Fri Oct 8 09:09:25 CEST 2010


Hi

The GC of binaries has been significantly improved since R13B03.
Before that it was quite easy to create situations where binaries
were not released early enough during GC.
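Part of that improvement is the per-process virtual binary heap: the amount of off-heap binary data a process references counts toward triggering its GC. Its minimum size can be tuned at spawn time. A rough sketch (handle_data/0 and all sizes are made up for illustration, not recommendations):

```erlang
%% min_bin_vheap_size (in words) sets the minimum size of the binary
%% virtual heap; together with fullsweep_after it influences how often
%% a binary-heavy process is collected, and thus how quickly the refc
%% binaries it references can be released.
Pid = spawn_opt(fun() -> handle_data() end,   % handle_data/0 is hypothetical
                [{min_bin_vheap_size, 100000},
                 {fullsweep_after, 20}]).
```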

As of R14 the new module binary is available. Its functions copy/1,2
and referenced_byte_size/1 can be helpful in situations where it is
important to free unused binaries as soon as possible.

See
http://www.erlang.org/doc/man/binary.html#referenced_byte_size-1
and
http://www.erlang.org/doc/man/binary.html#copy-1
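For example (a sketch; the 10 MB size is arbitrary):

```erlang
Big = binary:copy(<<0>>, 10000000),   % one 10 MB refc binary
<<Head:10/binary, _/binary>> = Big,   % Head is a 10-byte sub-binary of Big
binary:referenced_byte_size(Head),    % 10000000 -- Head pins all of Big
Small = binary:copy(Head),            % copy the 10 bytes into a new binary
binary:referenced_byte_size(Small).   % 10 -- only Small's own bytes
```

Keeping Small and dropping Head means the 10 MB parent can be garbage collected as soon as no other reference to it remains.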

/Kenneth, Erlang/OTP Ericsson

On Fri, Oct 8, 2010 at 3:01 AM, Robert Virding
<> wrote:
>  Large binaries are not stored in any process heap; they are handled
> specially, outside the normal process data. This allows them to be sent in
> messages between processes by reference instead of being copied, saving the
> copying of potentially large amounts of data, which is the main reason for
> doing it this way.
>
> However, this does cause problems for the gc. Once a large binary has been
> passed between processes it is no longer enough to collect just one process
> before freeing it. You must first collect every process through which the
> binary has passed before you can determine whether any references to it
> remain, or whether it can be freed. This means that freeing of large
> binaries is delayed, which makes it possible to fill memory with
> unreferenced binaries that have not yet been collected.
>
> The fact that you can get references to large binaries even though you only
> reference parts of them only aggravates matters.
>
> Unfortunately there are few ways around this. As usual it is a matter of
> making trade-offs and hoping for the best. Removing the special handling of
> large binaries would kill many applications.
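The multi-process point above can be sketched in code (a hypothetical illustration; the size and process structure are made up, and in a real shell the bound variable would itself keep a reference alive):

```erlang
Big = binary:copy(<<0>>, 10000000),        % 10 MB refc binary, off-heap
Pid = spawn(fun() ->
          receive Bin -> byte_size(Bin) end, % use the binary, then drop it
          receive stop -> ok end             % linger; the reference on
      end),                                  % Pid's heap is now garbage
Pid ! Big,
%% The sender collecting itself is not enough; the 10 MB stays allocated
%% until every process that has referenced it has been collected (or has
%% exited):
erlang:garbage_collect(),                    % collects the sender only
erlang:garbage_collect(Pid),                 % now Pid's dead reference
Pid ! stop.                                  % can be counted down
```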
>
> Robert
>
> On 2010-10-07 16.07, tom kelly wrote:
>>
>> Hello List,
>>
>> I need some help on garbage collecting binaries.
>> I have an application that handles large binaries and under load it eats
>> up
>> all the available memory then falls over, even when I start all the
>> data-handling processes with "{spawn_opt,[{fullsweep_after, 20}]}".
>>
>> Reading point 5.15 on the page:
>> http://www.erlang.org/faq/how_do_i.html
>> leads me to think that calling the garbage collector on the process that
>> created a binary will clean it up but my shell experiment below shows me
>> I'm
>> wrong. It's now 10 minutes since I've done this experiment and the large
>> binary is still in memory.
>>
>> Maybe I have to call the garbage collector of the process whose heap is
>> storing the binary? If so, which process is it?
>>
>> Any pointers as to what I'm doing wrong or misunderstanding will be greatly
>> appreciated.
>>
>> I'm on a legacy R12B5 system.
>>
>> //Tom.
>>
>>
>>
>> 18>  process_info(self(),total_heap_size).
>> {total_heap_size,4181}
>>
>> 19>  A = L(100000).
>> [140,161,41,128,44,96,215,43,15,164,88,107,1,167,4,125,118,
>>  180,121,181,160,124,244,140,169,215,31,82,43|...]
>>
>> 20>  process_info(self(),total_heap_size).
>> {total_heap_size,1149851}
>>
>> 21>  f(A).
>> ok
>>
>> 22>  erlang:garbage_collect().
>> true
>>
>> 23>  process_info(self(),total_heap_size).
>> {total_heap_size,3194}
>>
>>
>> 24>  memory(binary).
>> 6536
>>
>> 25>  A = B(100000).
>> <<232,241,55,171,218,35,86,122,211,185,232,1,203,249,181,
>>   218,176,33,88,131,102,56,102,82,158,114,200,174,253,...>>
>>
>> 26>  memory(binary).
>> 106704
>>
>> 27>  f(A).
>> ok
>>
>> 28>  memory(binary).
>> 106704
>>
>> 29>  erlang:garbage_collect().
>> true
>>
>> 30>  memory(binary).
>> 106560
>>
>
> ________________________________________________________________
> erlang-questions (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:
>
>

