[erlang-questions] Garbage collecting binaries
Robert Virding
robert.virding@REDACTED
Fri Oct 8 03:01:49 CEST 2010
Large binaries are not stored in any process heap, but are handled
specially outside the normal process data. This allows them to be sent
in messages between processes by reference instead of by copying, which
saves copying potentially large amounts of data. That is the main
reason for doing it this way.
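A minimal sketch of the by-reference behaviour, assuming a hypothetical
module name refc_demo and a helper run/1 that are not part of any
library. It builds a binary of the requested size, sends it to a second
process, and reports how much erlang:memory(binary) grew while both
processes held it.

```erlang
-module(refc_demo).
-export([run/1]).

%% Hypothetical demo: Bytes should be well above the small-binary limit
%% (for example 1048576) so the binary is stored outside the heaps.
run(Bytes) ->
    Before = erlang:memory(binary),
    Bin = <<0:(Bytes * 8)>>,              % built at run time
    Parent = self(),
    Pid = spawn(fun() ->
                        receive
                            {bin, B} ->
                                Parent ! {got, byte_size(B)},
                                %% keep the reference alive until told to stop
                                receive stop -> ok end
                        end
                end),
    Pid ! {bin, Bin},                     % only a reference is sent
    Size = receive {got, N} -> N end,
    After = erlang:memory(binary),
    Pid ! stop,
    %% Size seen by the receiver, and growth in binary memory
    {Size, After - Before}.
```

On a quiet node the second element is usually close to Bytes rather
than twice that, although other processes allocating binaries at the
same time will disturb the figure.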
However, this does cause problems for the gc. Once a large binary has
been passed between processes it is no longer enough to collect just one
process before freeing it. A process must itself be collected before you
can determine which of the references to large binaries it has held are
no longer in use. This means you have to collect at least every process
through which the binary has passed before you can determine whether
there are any references left to it, or whether it can be freed. As a
result, freeing of large binaries is delayed, and it is possible to fill
memory with unreferenced binaries which have not yet been collected.
The fact that you can hold a reference to a whole large binary even
though you only reference part of it just aggravates matters.
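The sub-binary case can be sketched as follows. The module name
subbin_demo and both functions are hypothetical, and binary:copy/1 is
assumed to be available, which means R14 or later and so not the R12B5
system mentioned in the quoted message.

```erlang
-module(subbin_demo).
-export([header/1, header_copy/1]).

%% Returns the first 16 bytes as a sub-binary. The result is small, but
%% it still references the whole of Bin, so Bin cannot be freed while
%% the result is alive.
header(Bin) ->
    <<Head:16/binary, _/binary>> = Bin,
    Head.

%% Returns an independent 16-byte copy, so the large binary can be
%% freed once nothing else references it.
header_copy(Bin) ->
    binary:copy(header(Bin)).
```

Calling binary:referenced_byte_size/1 on the two results shows the
difference: for header/1 it reports the size of the whole original
binary, for header_copy/1 it reports 16.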
Unfortunately there are few ways around this. As usual it is a matter of
making trade-offs and hoping for the best. Removing the special handling
of large binaries would kill many applications.
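One blunt trade-off of this kind, sketched here under the assumption
that an occasional pause is acceptable, is to force a collection in
every process so that stale references are dropped everywhere. The
module name gc_sweep and the function collect_all/0 are hypothetical.

```erlang
-module(gc_sweep).
-export([collect_all/0]).

%% Forces a garbage collection in every process on the node and returns
%% the change in erlang:memory(binary), in bytes. A positive number
%% means binary memory was released. The number can be zero or negative
%% if other processes allocate binaries in the meantime.
collect_all() ->
    Before = erlang:memory(binary),
    lists:foreach(fun(P) -> erlang:garbage_collect(P) end,
                  erlang:processes()),
    Before - erlang:memory(binary).
```

Sweeping every process is expensive on a node with many processes, so
in practice it is usually restricted to the processes known to handle
the binaries.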
Robert
On 2010-10-07 16.07, tom kelly wrote:
> Hello List,
>
> I need some help on garbage collecting binaries.
> I have an application that handles large binaries and under load it eats up
> all the available memory then falls over, even when I start all the
> data-handling processes with "{spawn_opt,[{fullsweep_after, 20}]}".
>
> Reading point 5.15 on the page:
> http://www.erlang.org/faq/how_do_i.html
> leads me to think that calling the garbage collector on the process that
> created a binary will clean it up but my shell experiment below shows me I'm
> wrong. It's now 10 minutes since I've done this experiment and the large
> binary is still in memory.
>
> Maybe I have to call the garbage collector of the process whose heap is
> storing the binary? If so, which process is it?
>
> Any pointers as to what I'm doing wrong or mis-understand will be greatly
> appreciated.
>
> I'm on a legacy R12B5 system.
>
> //Tom.
>
>
>
> 18> process_info(self(),total_heap_size).
> {total_heap_size,4181}
>
> 19> A = L(100000).
> [140,161,41,128,44,96,215,43,15,164,88,107,1,167,4,125,118,
> 180,121,181,160,124,244,140,169,215,31,82,43|...]
>
> 20> process_info(self(),total_heap_size).
> {total_heap_size,1149851}
>
> 21> f(A).
> ok
>
> 22> erlang:garbage_collect().
> true
>
> 23> process_info(self(),total_heap_size).
> {total_heap_size,3194}
>
>
> 24> memory(binary).
> 6536
>
> 25> A = B(100000).
> <<232,241,55,171,218,35,86,122,211,185,232,1,203,249,181,
> 218,176,33,88,131,102,56,102,82,158,114,200,174,253,...>>
>
> 26> memory(binary).
> 106704
>
> 27> f(A).
> ok
>
> 28> memory(binary).
> 106704
>
> 29> erlang:garbage_collect().
> true
>
> 30> memory(binary).
> 106560
>