[erlang-questions] supervisor using 500+ MB of ram to track 6.7 million dead workers?
Daniel Barney
dan353hehe@REDACTED
Thu Dec 27 22:11:19 CET 2012
Hey Guys,
This is what I am running into. I have a supervisor which manages a
bunch of processes that are each in charge of a gen_tcp or ssl socket.
After upgrading to R15B02 I started noticing that memory usage was
growing slowly, and over the course of three days I found that the
machines were starting to run out of memory. I do not remember which
version we upgraded from, but it was from last year. I can check if it
is really needed.
We upgraded to R15B02, wrote a small patch and then we cherry-picked
it onto the stable version. The patch shouldn't be affecting what I am
seeing as I only changed two files in the SSL application, relating to
parsing certs that do not obey the standard.
So, assuming that I had a memory leak somewhere, I first checked to see
if any processes were using large amounts of memory, and this is what
I found:
<0.785.0> supervisor:cowboy_requests_sup/1 8024355 15270185 0
gen_server:loop/6 9
This process was a supervisor, so I checked how many children it
had, and this is what I got back:
supervisor:count_children(Pid).
[{specs,1},{active,80},{supervisors,0},{workers,6782028}]
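For reading that result: per the supervisor documentation, active counts
the child processes that are currently running, while workers counts every
child entry of type worker, whether or not its process is still alive. A
minimal sketch of the subtraction follows. The module and function names
(sup_check, dead_entries) are hypothetical and only for illustration.

```erlang
-module(sup_check).
-export([dead_entries/1, dead_entries_of/1]).

%% Hypothetical helper: given the property list returned by
%% supervisor:count_children/1, return how many child entries
%% are not backed by a running process.
dead_entries(Counts) ->
    Workers     = proplists:get_value(workers, Counts, 0),
    Supervisors = proplists:get_value(supervisors, Counts, 0),
    Active      = proplists:get_value(active, Counts, 0),
    Workers + Supervisors - Active.

%% Convenience wrapper that queries a live supervisor.
dead_entries_of(Sup) ->
    dead_entries(supervisor:count_children(Sup)).
```

With the counts shown above, sup_check:dead_entries/1 gives
6782028 + 0 - 80 = 6781948 entries without a running process.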
And then general process information:
erlang:process_info(Pid).
[{current_function,{gen_server,loop,6}},
{initial_call,{proc_lib,init_p,5}},
{status,waiting},
{message_queue_len,0},
{messages,[]},
{links,[<0.783.0>]},
{dictionary,[{'$ancestors',[<0.783.0>,cowboy_sup,<0.544.0>]},
{'$initial_call',{supervisor,cowboy_requests_sup,1}}]},
{trap_exit,true},
{error_handler,error_handler},
{priority,normal},
{group_leader,<0.543.0>},
{total_heap_size,67810415},
{heap_size,8024355},
{stack_size,9},
{reductions,1551847476},
{garbage_collection,[{min_bin_vheap_size,46368},
{min_heap_size,233},
{fullsweep_after,65535},
{minor_gcs,31155}]},
{suspending,[]}]
erlang:process_info(Pid,memory).
{memory,542484248}
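The heap sizes in the process_info/1 output are reported in machine words,
while the memory item is in bytes. A small sketch of the conversion,
assuming a 64-bit emulator (8-byte words). The module name heap_bytes is
hypothetical.

```erlang
-module(heap_bytes).
-export([words_to_bytes/1, words_to_bytes/2]).

%% Convert a size in words to bytes using this emulator's word size.
words_to_bytes(Words) ->
    words_to_bytes(Words, erlang:system_info(wordsize)).

%% Same conversion with an explicit word size (assumed 8 on 64-bit).
words_to_bytes(Words, WordSize) ->
    Words * WordSize.
```

Under that assumption, the total_heap_size of 67810415 words comes to
67810415 * 8 = 542483320 bytes, which accounts for almost all of the
542484248 bytes reported by the memory item.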
And this is how many processes I have on the machine:
length(processes()).
588
So all of the workers for this supervisor are dead, but for some reason
the supervisor doesn't think so?
Is there any reason why a supervisor would hold onto 6.7 million
workers that are already dead?
Any help would be appreciated,
Daniel