[erlang-questions] Garbage collecting binaries
tom kelly
ttom.kelly@REDACTED
Fri Oct 8 13:45:16 CEST 2010
Hi Robert,
Thanks for that extra insight, it's a lot clearer now! We do pass the
binaries between different processes so that's an important point for me to
keep in mind while I'm debugging this.
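
For the debugging, a minimal sketch of forcing a collection in every process
that a binary has passed through might look like the following. The module
name bin_gc_sketch and the function collect_all/1 are hypothetical, and it
assumes the caller already knows which pids handled the binary:

```erlang
-module(bin_gc_sketch).
-export([collect_all/1]).

%% Hypothetical helper, not from this thread.
%% Forces a garbage collection in each given process and pairs every pid
%% with the result of erlang:garbage_collect/1 (false if the process is
%% no longer alive).
collect_all(Pids) when is_list(Pids) ->
    [{Pid, erlang:garbage_collect(Pid)} || Pid <- Pids].
```

This only drops references that the collected processes no longer use; a
process that still holds the binary, or a part of it, keeps it alive.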
Also great work Kenneth & OTP team! I've been exploring this area of Erlang
for the first time this week, nice to find out you're so far ahead of us! We
were already switching to R14 for our next trunk release so I'm looking
forward to experimenting with the new binary modules there.
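
As a first experiment, a minimal sketch of how binary:copy/1 and
binary:referenced_byte_size/1 might be combined on R14 or later. The module
name bin_copy_sketch, the function keep_prefix/2 and the factor-of-two
threshold are all assumptions for illustration, not something from this
thread:

```erlang
-module(bin_copy_sketch).
-export([keep_prefix/2]).

%% Hypothetical helper: keep the first N bytes of Bin without keeping
%% the whole parent binary alive.
keep_prefix(Bin, N) when is_binary(Bin), is_integer(N),
                         N >= 0, N =< byte_size(Bin) ->
    %% binary:part/3 returns a sub-binary that still references Bin.
    Part = binary:part(Bin, 0, N),
    %% If the data actually referenced is much larger than what we use,
    %% copy the part so that the large parent can be freed.
    %% The 2x threshold is an arbitrary choice for this sketch.
    case binary:referenced_byte_size(Part) > 2 * byte_size(Part) of
        true  -> binary:copy(Part);
        false -> Part
    end.
```

The copy costs some CPU and memory up front, so it only seems worthwhile when
the retained part is small compared to the binary it was cut from.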
//Tom.
On Fri, Oct 8, 2010 at 8:09 AM, Kenneth Lundin <kenneth.lundin@REDACTED> wrote:
> Hi
>
> The GC of binaries has been significantly improved since R13B03.
> Before that it was quite easy to create situations where binaries
> were not released early enough during GC.
>
> In R14 the new module binary was introduced. In that module the
> functions copy and referenced_byte_size can be helpful in situations
> where it is important to free unused binaries as soon as possible.
>
> See
> http://www.erlang.org/doc/man/binary.html#referenced_byte_size-1
> and
> http://www.erlang.org/doc/man/binary.html#copy-1
>
> /Kenneth, Erlang/OTP Ericsson
>
> On Fri, Oct 8, 2010 at 3:01 AM, Robert Virding
> <robert.virding@REDACTED> wrote:
> > Large binaries are not stored in any process heap, but are specially
> > handled outside the normal process data. This allows them to be sent in
> > messages between processes by reference instead of being copied, which
> > avoids copying potentially large amounts of data. That is the main
> > reason for doing it this way.
> >
> > However, this does cause problems for the gc. Once a large binary has
> > been passed between processes it is no longer enough to collect just one
> > process before freeing it. Only collecting a process reveals which of
> > the references to large binaries it has held are no longer in use. This
> > means you have to collect at least each process through which the
> > binary has passed before you can determine whether there are any
> > references left to it or whether it can be freed. The freeing of large
> > binaries is therefore delayed, so it is possible to fill memory with
> > unreferenced binaries which have not yet been collected.
> >
> > The fact that you can hold a reference to a whole large binary even
> > though you only reference part of it just aggravates matters.
> >
> > Unfortunately there are few ways around this. As usual it is a matter of
> > making trade-offs and hoping for the best. Removing the special handling
> > of large binaries would kill many applications.
> >
> > Robert
> >
> > On 2010-10-07 16.07, tom kelly wrote:
> >>
> >> Hello List,
> >>
> >> I need some help on garbage collecting binaries.
> >> I have an application that handles large binaries and under load it eats
> >> up all the available memory then falls over, even when I start all the
> >> data-handling processes with "{spawn_opt,[{fullsweep_after, 20}]}".
> >>
> >> Reading point 5.15 on the page:
> >> http://www.erlang.org/faq/how_do_i.html
> >> leads me to think that calling the garbage collector on the process that
> >> created a binary will clean it up, but my shell experiment below shows me
> >> I'm wrong. It's now 10 minutes since I did this experiment and the large
> >> binary is still in memory.
> >>
> >> Maybe I have to call the garbage collector of the process whose heap is
> >> storing the binary? If so, which process is it?
> >>
> >> Any pointers as to what I'm doing wrong or misunderstanding will be
> >> greatly appreciated.
> >>
> >> I'm on a legacy R12B5 system.
> >>
> >> //Tom.
> >>
> >>
> >>
> >> 18> process_info(self(),total_heap_size).
> >> {total_heap_size,4181}
> >>
> >> 19> A = L(100000).
> >> [140,161,41,128,44,96,215,43,15,164,88,107,1,167,4,125,118,
> >> 180,121,181,160,124,244,140,169,215,31,82,43|...]
> >>
> >> 20> process_info(self(),total_heap_size).
> >> {total_heap_size,1149851}
> >>
> >> 21> f(A).
> >> ok
> >>
> >> 22> erlang:garbage_collect().
> >> true
> >>
> >> 23> process_info(self(),total_heap_size).
> >> {total_heap_size,3194}
> >>
> >>
> >> 24> memory(binary).
> >> 6536
> >>
> >> 25> A = B(100000).
> >> <<232,241,55,171,218,35,86,122,211,185,232,1,203,249,181,
> >> 218,176,33,88,131,102,56,102,82,158,114,200,174,253,...>>
> >>
> >> 26> memory(binary).
> >> 106704
> >>
> >> 27> f(A).
> >> ok
> >>
> >> 28> memory(binary).
> >> 106704
> >>
> >> 29> erlang:garbage_collect().
> >> true
> >>
> >> 30> memory(binary).
> >> 106560
> >>