[erlang-bugs] Schedulers getting "stuck", part II

Patrik Nyblom
Tue Apr 30 16:09:28 CEST 2013


Hi Scott!

On 04/29/2013 10:01 PM, Scott Lystig Fritchie wrote:
> Hi, all.  I'd originally intended to cross-post last week's message
> about stuck/collapsed schedulers to both erlang-questions and
> erlang-bugs ... but forgot to do it.  So, here it is.
>
> Incorporated by reference:
>
>      http://erlang.org/pipermail/erlang-questions/2013-April/073490.html
>
> Since that message, I've found that R16B on my 8-core MacBook Pro laptop
> can get its schedulers stuck in about one case in four with a much less
> time-consuming recipe than the original recipe ... but it requires human
> intervention to stop & restart if collapse doesn't happen.  Note that
> it's an 8-core box, and I'm dropping the number of online schedulers
> down to 5.  Using 6 cores doesn't appear to be successful ... or I'm not
> patient enough to run it enough to see it happen.
Hmmm, dropping schedulers...? There seems to be a brand-new bug in R16B
when dropping schedulers, one that we've fixed in the maint branch. Could
you please please please try the tip of the maint branch (i.e. what's to
become R16B01)? The plain R16B "drop schedulers" bug ought to be unrelated
to the misbehaving schedulers you've seen in other cases, so I just want
to be sure we are not hunting a ghost with this test case, and that it
can really show the misbehaving schedulers that you also see in R15...
>       /usr/local/erlang/R16B.64bit/bin/erl +scl false -pz ebin -sname foo -eval 'N = 5, io:format("OS pid ~s\n\n", [os:getpid()]), timer:sleep(8*1000), io:format("go\n"),  erlang:system_flag(schedulers_online, N), os:cmd("say yo is " ++ integer_to_list(N)), timer:sleep(12*1000), timer:tc(erlang, apply, [fun () -> XX = lists:sort(element(1,wait:run(4*100, 1024*1024, 1100, 5))), {hd(XX), lists:last(XX)} end, []]).'
>
> I run "iostat 1" in another window ... if it is reporting %user time in
> the low 60s, all 5 (out of 8) schedulers are working the way that
> they're supposed to.  If you see 12-25 percent instead, you've got only
> one or two active schedulers.
>
> -Scott
Cheers,
Patrik