[erlang-questions] binary, ets and memory...

Robert Virding <>
Tue Oct 16 23:39:29 CEST 2012


Yes, that is why binary:copy was added: for just the case where you are using a small section of a large binary and the whole binary is kept alive.
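
A minimal sketch of that case, in case it helps to see it. The module name subbin_demo and the sizes are made up for illustration; binary:split/2, binary:copy/1,2 and binary:referenced_byte_size/1 are the standard calls from the binary module. It splits a small piece off a large binary and compares how much memory the piece references before and after copying:

```erlang
-module(subbin_demo).
-export([run/0]).

%% Hypothetical demo: the module name and sizes are illustrative only.
run() ->
    %% Build a large binary: 17 bytes repeated 100000 times.
    Big = binary:copy(<<"0123456789abcdef,">>, 100000),
    %% Without the global option, split returns [FirstPart, Rest].
    [First | _] = binary:split(Big, <<",">>),
    %% First is only 16 bytes long, but it is a sub-binary that
    %% still references the whole of Big.
    Referenced = binary:referenced_byte_size(First),
    %% binary:copy/1 creates a fresh binary holding only those 16 bytes.
    Copy = binary:copy(First),
    {byte_size(First), Referenced, binary:referenced_byte_size(Copy)}.
```

Calling subbin_demo:run() should return {16, 1700000, 16}: the uncopied part pins the full 1.7 MB, the copy pins only itself. Checking binary:referenced_byte_size/1 against byte_size/1 is a reasonable way to decide whether a copy is worth making.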

Another problem is with large binaries which are sent between processes. Yes, they are not copied and only a reference is passed, but it also means that it takes longer before the system can detect that the binary is no longer live and can be reclaimed. All the processes through which the binary has passed must be garbage collected first. The binary is a global object, shared by reference rather than living on any one process heap.
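
A rough sketch of how one might observe this. The module relay_demo and its message protocol are hypothetical. erlang:process_info(Pid, binary) is a real call, but its result is documented as subject to change without notice, so treat the output as diagnostic only:

```erlang
-module(relay_demo).
-export([run/0]).

%% Hypothetical relay process: receives a binary, acknowledges it,
%% and immediately drops it.
relay() ->
    receive
        {data, From, _Bin} ->
            From ! handled,
            relay();
        stop ->
            ok
    end.

run() ->
    Pid = spawn(fun relay/0),
    %% 1 MB binary: large enough to be reference counted, not copied.
    Pid ! {data, self(), binary:copy(<<0>>, 1000000)},
    receive handled -> ok end,
    %% The relay no longer uses the binary, but until it is garbage
    %% collected it may still hold a reference to it.
    {binary, Held} = erlang:process_info(Pid, binary),
    %% Force a collection of the relay and look again.
    true = erlang:garbage_collect(Pid),
    {binary, AfterGc} = erlang:process_info(Pid, binary),
    Pid ! stop,
    {Held, AfterGc}.
```

If the first list still mentions the large binary and the second does not, that is the delay described above: a process that merely passed the binary along keeps it alive until its next garbage collection. A mostly idle relay may not collect for a long time, which is why an explicit erlang:garbage_collect/1 or hibernating the process is sometimes used.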

Robert 

----- Original Message -----

> From: "Chris Hicks" <>
> To: , 
> Sent: Friday, 12 October, 2012 9:27:03 PM
> Subject: Re: [erlang-questions] binary, ets and memory...

> I could be wrong but I'm going to take a guess and say that in the
> first implementation the whole binary is being kept around and never
> destroyed. I think what's happening is you are getting a reference
> to part of a larger binary and passing that around, but the larger
> binary is sticking around since part of it is still being used.
> Copying the part you need, and thus creating an entirely new binary,
> is probably allowing all references to that large binary to
> disappear so that it can be GC'd.

> That's a rather naive guess based on what I know about how binaries
> work. Can anyone else back that up or tell me I'm wrong?

> Chris

> > From: 
> > Date: Fri, 12 Oct 2012 22:15:21 +0300
> > To: 
> > Subject: [erlang-questions] binary, ets and memory...
> >
> > Hello,
> >
> > Recently, my system started to swap. Investigation showed that
> > memory consumption was almost twice what I expected and, to be
> > honest, I am confused as to why...
> >
> > I am talking about Erlang R15B (erts-5.9) [source] [64-bit]
> > [smp:8:8] [async-threads:0] [hipe] [kernel-poll:false]
> >
> > So, I have two processes:
> > * The first process handles TCP/IP socket I/O. Received data is
> >   pushed to the second process.
> > * The second process splits the binaries with
> >   binary:split(Buf, [<<$,>>], [global]), parses the data and builds
> >   a list of tuples. When the list of tuples is ready, it folds the
> >   tuples into an ets table:
> >
> >     ets:new(cache, [named_table, ordered_set, public]),
> >     lists:foldl(
> >         fun({A, B}, _Acc) -> ets:insert(cache, {A, B}) end,
> >         true,
> >         List
> >     ).
> >
> > I have about 6M tuples, where the first element is a SHA1 signature
> > and the second element is an integer. The receiver process pushes
> > 100 tuples at a time. I hope you get the rough idea.
> >
> > When the cache is populated, I see the following memory usage, which
> > looks suspicious to me:
> >
> >     {total,1179235080},
> >     {processes,2373638},
> >     {processes_used,2373570},
> >     {system,1176861442},
> >     {atom,264505},
> >     {atom_used,253241},
> >     {binary,434761000},   <-- this looks strange to me. Why are
> >                               binaries left in the heap and in ets?
> >     {code,6521469},
> >     {ets,732409416}
> >
> > If I change my implementation to
> >
> >     lists:foldl(
> >         fun({A, B}, _Acc) -> ets:insert(cache, {binary:copy(A), B}) end,
> >         true,
> >         List
> >     ).
> >
> > then memory utilization is on par with my estimates:
> >
> >     {total,701448856},
> >     {processes,2405251},
> >     {processes_used,2405170},
> >     {system,699043605},
> >     {atom,264505},
> >     {atom_used,253241},
> >     {binary,2686280},
> >     {code,6521493},
> >     {ets,686663080}
> >
> > Best Regards,
> > Dmitry
> >
> > _______________________________________________
> > erlang-questions mailing list
> > 
> > http://erlang.org/mailman/listinfo/erlang-questions
