[erlang-bugs] Scheduler Wall Time Statistics live|dead locking a process.
pan@REDACTED
Wed Jul 18 15:44:13 CEST 2012
Hi Fred!
On Wed, 18 Jul 2012, Fred Hebert wrote:
> Hi there,
>
> If you go on erlang-questions, you'll find the following thread I started
> regarding one of my gen_servers locking up forever until I try to connect to
> the VM: http://erlang.org/pipermail/erlang-questions/2012-July/068097.html
>
> And the information following it in
> http://erlang.org/pipermail/erlang-questions/2012-July/068099.html
>
> The gist of it is that apparently, the gen_server gets stuck while calling
> erlang:statistics(scheduler_wall_time). A process info dump on it returns:
>
> [{registered_name,vmstats_server},
> {current_function,{erlang,sched_wall_time,3}},
> {initial_call,{proc_lib,init_p,5}},
> {status,waiting},
> {message_queue_len,2},
> {messages,[{system,{<5998.7341.243>,#Ref<5998.0.3810.221818>},get_status},
> {system,{<5998.28757.800>,#Ref<5998.0.3811.260443>},get_status}]},
> {links,[<5998.918.0>]},
> {dictionary,[{random_seed,{17770,13214,15044}},
> {'$ancestors',[vmstats_sup,<5998.917.0>]},
> {'$initial_call',{vmstats_server,init,1}}]},
> {trap_exit,false},
> {error_handler,error_handler},
> {priority,normal},
> {group_leader,<5998.916.0>},
> {total_heap_size,122003},
> {heap_size,121393},
> {stack_size,21},
> {reductions,314325681},
> {garbage_collection,[{min_bin_vheap_size,46368},
> {min_heap_size,233},
> {fullsweep_after,65535},
> {minor_gcs,23774}]},
> {suspending,[]}]
> ok
>
> with the interesting parts:
> {current_function,{erlang,sched_wall_time,3}},
> {status,waiting},
>
> I'm unsure what exactly causes the problem, and we're running the VM with
> default arguments when it comes to scheduling and layout. It happens even
> when the virtual machine is under relatively low load (scheduler active wall
> time is less than 5%, but more than 2% of the total wall time when averaging
> all cores) and can also happen under higher load.
Ouch... Seems like one of the schedulers does not understand that it
should report data back to the process. Is there any chance of dumping
core of a machine where it hangs, or would that mean interruption of
service? I *really* would like to know what the schedulers are doing when
they should be reporting back...
>
> Only that process appears affected.
Yes, it's just waiting for a message that does not arrive, one that should
be sent from the VM when statistics for the scheduler are available...
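
Until the root cause is found, one possible stopgap is to keep the blocking
call out of the gen_server itself. The following is only a minimal sketch
under stated assumptions, not a fix for the underlying bug: the module name
swt_probe and the function sample/1 are hypothetical, and it assumes that a
helper process stuck in the call can still be killed, which has not been
verified against the hang described here. Only
erlang:statistics(scheduler_wall_time), spawn_monitor/1, erlang:demonitor/2
and exit/2 are existing calls.

```erlang
-module(swt_probe).
-export([sample/1]).

%% Hypothetical helper (not part of vmstats or OTP): run the possibly
%% hanging call in a throwaway process and give up after TimeoutMs, so
%% that the calling gen_server never blocks indefinitely.
sample(TimeoutMs) ->
    Parent = self(),
    Ref = make_ref(),
    {Pid, MRef} = spawn_monitor(fun() ->
        %% Returns 'undefined' unless the scheduler_wall_time flag
        %% has been enabled with erlang:system_flag/2.
        Parent ! {Ref, erlang:statistics(scheduler_wall_time)}
    end),
    receive
        {Ref, Result} ->
            %% Result arrived; drop the monitor and any pending 'DOWN'.
            erlang:demonitor(MRef, [flush]),
            {ok, Result};
        {'DOWN', MRef, process, Pid, Reason} ->
            %% Helper died before sending a result.
            {error, Reason}
    after TimeoutMs ->
        %% Assumption: the stuck helper can be killed.
        erlang:demonitor(MRef, [flush]),
        exit(Pid, kill),
        %% Discard a result that may have been sent just before the kill.
        receive {Ref, _Late} -> ok after 0 -> ok end,
        {error, timeout}
    end.
```

A caller would use something like swt_probe:sample(5000) and treat
{error, timeout} as "no sample this round". Logging each timeout would also
show how often the reply from the schedulers goes missing.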
Cheers,
/Patrik