[erlang-bugs] Schedulers getting "stuck", part II

Scott Lystig Fritchie
Mon Apr 29 22:01:43 CEST 2013


Hi, all.  I'd originally intended to cross-post last week's message
about stuck/collapsed schedulers to both erlang-questions and
erlang-bugs ... but forgot to do it.  So, here it is.

Incorporated by reference:

    http://erlang.org/pipermail/erlang-questions/2013-April/073490.html

Since that message, I've found that R16B on my 8-core MacBook Pro laptop
can get its schedulers stuck in about one run in four with a much less
time-consuming recipe than the original ... but it requires human
intervention to stop & restart if the collapse doesn't happen.  Note that
it's an 8-core box, and I'm dropping the number of online schedulers
down to 5.  Using 6 online schedulers doesn't appear to trigger the
collapse ... or I'm not patient enough to run it enough times to see it
happen.

     /usr/local/erlang/R16B.64bit/bin/erl +scl false -pz ebin -sname foo -eval 'N = 5, io:format("OS pid ~s\n\n", [os:getpid()]), timer:sleep(8*1000), io:format("go\n"),  erlang:system_flag(schedulers_online, N), os:cmd("say yo is " ++ integer_to_list(N)), timer:sleep(12*1000), timer:tc(erlang, apply, [fun () -> XX = lists:sort(element(1,wait:run(4*100, 1024*1024, 1100, 5))), {hd(XX), lists:last(XX)} end, []]).'

I run "iostat 1" in another window ... if it is reporting %user time in
the low 60s, all 5 (out of 8) schedulers are working the way that
they're supposed to.  If it reports 12-25 percent instead, only one or
two schedulers are active.
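
As a rough sanity check on those iostat numbers, here is a minimal sketch
of the arithmetic behind them.  It assumes that each active scheduler
keeps exactly one of the 8 cores fully busy and that nothing else on the
machine uses significant CPU.  The function name expected_user_pct is
made up for illustration; it is not part of the recipe above.

```shell
#!/bin/sh
# Hypothetical helper (not from the original recipe): expected iostat %user
# when BUSY schedulers each saturate one of CORES cores.
# Works in tenths of a percent so that plain POSIX sh integer math suffices.
expected_user_pct() {
    busy=$1
    cores=$2
    tenths=$(( 1000 * busy / cores ))
    printf '%d.%d\n' $(( tenths / 10 )) $(( tenths % 10 ))
}

# 8 cores, as on the laptop above: 5 is the healthy case,
# 1 and 2 are the collapsed cases.
for busy in 1 2 5; do
    printf '%s active scheduler(s): about %s%% user\n' \
        "$busy" "$(expected_user_pct "$busy" 8)"
done
```

Under those assumptions, 5 busy schedulers out of 8 cores works out to
62.5 percent, and 1 or 2 busy schedulers to 12.5 or 25.0 percent, which
lines up with the "low 60s" and "12-25 percent" readings.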

-Scott

