[erlang-questions] Garbage collecting binaries

Kenneth Lundin <>
Fri Oct 8 14:16:36 CEST 2010


Hi Tom,

Of course you should experiment with the new binary module in R14B, but
the GC of binaries has been improved in general as well, so hopefully the
freeing of binaries will work well enough for your case (and most cases)
without needing the additional possibilities available via the new binary
module.
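
A minimal sketch of how those additional possibilities might be used. The
module name bin_release, the function names, and the factor of 2 are
assumptions made for illustration; only binary:copy/1, binary:copy/2 and
binary:referenced_byte_size/1 come from the binary module itself.

```erlang
-module(bin_release).
-export([detach/1, sizes/0]).

%% Hypothetical helper: if Bin keeps a much larger binary alive, return a
%% fresh copy that no longer references it; otherwise return Bin unchanged.
%% The threshold (2x) is an arbitrary choice for this sketch.
detach(Bin) when is_binary(Bin) ->
    case binary:referenced_byte_size(Bin) > 2 * byte_size(Bin) of
        true  -> binary:copy(Bin);
        false -> Bin
    end.

%% Compare how much memory a small sub-binary references before and after
%% it has been detached from the large binary it was matched out of.
sizes() ->
    Large = binary:copy(<<1>>, 100000),
    <<Small:10/binary, _/binary>> = Large,
    Detached = detach(Small),
    {binary:referenced_byte_size(Small),
     binary:referenced_byte_size(Detached)}.
```

Calling bin_release:sizes() should return a tuple whose first element is
much larger than the second, since Small still refers to the whole of Large
while Detached is a copy of only the 10 bytes that are actually needed.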

/Kenneth, Erlang/OTP Ericsson

On Fri, Oct 8, 2010 at 1:45 PM, tom kelly <> wrote:
> Hi Robert,
>
> Thanks for that extra insight, it's a lot clearer now! We do pass the
> binaries between different processes so that's an important point for me to
> keep in mind while I'm debugging this.
>
> Also great work Kenneth & OTP team! I've been exploring this area of Erlang
> for the first time this week, nice to find out you're so far ahead of us! We
> were already switching to R14 for our next trunk release so I'm looking
> forward to experimenting with the new binary modules there.
>
> //Tom.
>
>
>
> On Fri, Oct 8, 2010 at 8:09 AM, Kenneth Lundin <>
> wrote:
>>
>> Hi
>>
>> The GC of binaries has been significantly improved since R13B03.
>> Before that it was quite easy to create situations where binaries
>> were not released early enough during GC.
>>
>> In R14 the new module binary was introduced. In that module the
>> functions copy and referenced_byte_size can be helpful in situations
>> where it is important to free unused binaries as soon as possible.
>>
>> See
>> http://www.erlang.org/doc/man/binary.html#referenced_byte_size-1
>> and
>> http://www.erlang.org/doc/man/binary.html#copy-1
>>
>> /Kenneth, Erlang/OTP Ericsson
>>
>> On Fri, Oct 8, 2010 at 3:01 AM, Robert Virding
>> <> wrote:
>> > Large binaries are not stored in any process heap, but are handled
>> > specially outside the normal process data. This allows them to be sent
>> > in messages between processes by reference instead of being copied,
>> > which saves copying potentially large amounts of data and is the main
>> > reason for doing it this way.
>> >
>> > However, this does cause problems for the GC. Once a large binary has
>> > been passed between processes it is no longer enough to collect just one
>> > process before freeing it. You must first collect a process before you
>> > can determine which of its references to large binaries are no longer
>> > in use. This means you have to collect at least every process through
>> > which the binary has passed before you can determine whether there are
>> > any references left to it or whether it can be freed. As a result the
>> > freeing of large binaries is delayed, and it is possible to fill memory
>> > with unreferenced binaries which have not yet been collected.
>> >
>> > The fact that you can keep a reference to a large binary even though
>> > you only reference part of it just aggravates matters.
>> >
>> > Unfortunately there are few ways around this. As usual it is a matter of
>> > making trade-offs and hoping for the best. Removing the special handling
>> > of large binaries would kill many applications.
>> >
>> > Robert
>> >
>> > On 2010-10-07 16.07, tom kelly wrote:
>> >>
>> >> Hello List,
>> >>
>> >> I need some help on garbage collecting binaries.
>> >> I have an application that handles large binaries and under load it
>> >> eats up all the available memory then falls over, even when I start all
>> >> the data-handling processes with "{spawn_opt,[{fullsweep_after, 20}]}".
>> >>
>> >> Reading point 5.15 on the page:
>> >> http://www.erlang.org/faq/how_do_i.html
>> >> leads me to think that calling the garbage collector on the process
>> >> that created a binary will clean it up, but my shell experiment below
>> >> shows me I'm wrong. It's now 10 minutes since I did this experiment and
>> >> the large binary is still in memory.
>> >>
>> >> Maybe I have to call the garbage collector of the process whose heap is
>> >> storing the binary? If so, which process is it?
>> >>
>> >> Any pointers as to what I'm doing wrong or misunderstanding will be
>> >> greatly appreciated.
>> >>
>> >> I'm on a legacy R12B5 system.
>> >>
>> >> //Tom.
>> >>
>> >>
>> >>
>> >> 18>  process_info(self(),total_heap_size).
>> >> {total_heap_size,4181}
>> >>
>> >> 19>  A = L(100000).
>> >> [140,161,41,128,44,96,215,43,15,164,88,107,1,167,4,125,118,
>> >>  180,121,181,160,124,244,140,169,215,31,82,43|...]
>> >>
>> >> 20>  process_info(self(),total_heap_size).
>> >> {total_heap_size,1149851}
>> >>
>> >> 21>  f(A).
>> >> ok
>> >>
>> >> 22>  erlang:garbage_collect().
>> >> true
>> >>
>> >> 23>  process_info(self(),total_heap_size).
>> >> {total_heap_size,3194}
>> >>
>> >>
>> >> 24>  memory(binary).
>> >> 6536
>> >>
>> >> 25>  A = B(100000).
>> >> <<232,241,55,171,218,35,86,122,211,185,232,1,203,249,181,
>> >>   218,176,33,88,131,102,56,102,82,158,114,200,174,253,...>>
>> >>
>> >> 26>  memory(binary).
>> >> 106704
>> >>
>> >> 27>  f(A).
>> >> ok
>> >>
>> >> 28>  memory(binary).
>> >> 106704
>> >>
>> >> 29>  erlang:garbage_collect().
>> >> true
>> >>
>> >> 30>  memory(binary).
>> >> 106560
>> >>