[erlang-questions] supervisor using 500+ MB of ram to track 6.7 million dead workers?

Siri Hansen erlangsiri@REDACTED
Fri Dec 28 10:50:37 CET 2012


Hi Daniel - It seems like you are using a different supervisor than the one
provided in stdlib. Or maybe a patched version of it - if so, do you know
which version it is based on?

There have been problems with memory consumption in simple_one_for_one
supervisors with *many* children in earlier releases, but that was fixed
a few releases ago.
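
One way to check which supervisor module the node is actually running is
to ask the code server where it was loaded from and compare that with the
stdlib version. This is only a minimal sketch: the module name sup_check
and the function report/0 are made up for illustration, and it assumes you
can load a module on (or attach a shell to) the affected node.

```erlang
-module(sup_check).
-export([report/0]).

%% Hypothetical helper, not part of OTP.
%% Collects where the loaded 'supervisor' module comes from, which stdlib
%% version the node reports, and the module's compile information.
report() ->
    [%% Path of the .beam file the code server resolves for supervisor.
     {beam_path, code:which(supervisor)},
     %% {ok, Vsn} for the stdlib application known to this node.
     {stdlib_vsn, application:get_key(stdlib, vsn)},
     %% Compiler options/version recorded in the loaded module.
     {compile_info, supervisor:module_info(compile)}].
```

If beam_path points somewhere other than the stdlib ebin directory of the
installed release, the node is probably picking up a different or patched
supervisor.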

Regards
/siri@REDACTED


2012/12/27 Daniel Barney <dan353hehe@REDACTED>

> Hey Guys,
>
> This is what I am running into: I have a supervisor which manages a
> bunch of processes that are each in charge of a gen_tcp or ssl socket.
> After upgrading to R15B02 I started noticing that memory usage was
> growing slowly and over the course of three days I found that the
> machines were starting to run out of memory. I do not remember which
> version we upgraded from, but it was from last year. I can check if it
> is really needed.
>
> We upgraded to R15B02, wrote a small patch and then we cherry-picked
> it onto the stable version. The patch shouldn't be affecting what I am
> seeing as I only changed two files in the SSL application, relating to
> parsing certs that do not obey the standard.
>
> So, assuming that I had a memory leak somewhere, I first checked to
> see if any processes were using large amounts of memory, and this is
> what I found:
>
> <0.785.0>             supervisor:cowboy_requests_sup/1   8024355 15270185    0
>                       gen_server:loop/6                        9
>
> This process was a supervisor, so I checked how many children it
> had, and this is what I got back:
>
> supervisor:count_children(Pid).
> [{specs,1},{active,80},{supervisors,0},{workers,6782028}]
>
>
> And then general process information:
>
> erlang:process_info(Pid).
> [{current_function,{gen_server,loop,6}},
>  {initial_call,{proc_lib,init_p,5}},
>  {status,waiting},
>  {message_queue_len,0},
>  {messages,[]},
>  {links,[<0.783.0>]},
>  {dictionary,[{'$ancestors',[<0.783.0>,cowboy_sup,<0.544.0>]},
>               {'$initial_call',{supervisor,cowboy_requests_sup,1}}]},
>  {trap_exit,true},
>  {error_handler,error_handler},
>  {priority,normal},
>  {group_leader,<0.543.0>},
>  {total_heap_size,67810415},
>  {heap_size,8024355},
>  {stack_size,9},
>  {reductions,1551847476},
>  {garbage_collection,[{min_bin_vheap_size,46368},
>                       {min_heap_size,233},
>                       {fullsweep_after,65535},
>                       {minor_gcs,31155}]},
>  {suspending,[]}]
>
> erlang:process_info(Pid,memory).
> {memory,542484248}
>
> And this is how many processes I have on the machine:
> length(processes()).
> 588
>
> So all of the workers for this supervisor are dead, but it doesn't
> seem to think so for some reason.
>
> Is there any reason why a supervisor would hold onto 6.7 million
> workers that are already dead?
>
> Any help would be appreciated,
> Daniel