[erlang-bugs] Schedulers getting "stuck", part II
Scott Lystig Fritchie
Wed May 1 00:13:15 CEST 2013
Patrik, there are a couple of synthetic load cases whose end result
matches what we occasionally see Riak and Riak CS doing in the wild.
Many, many thanks to Joseph Blomstedt for inventing these two modules.
Both can be used by running the 'go/0' function.
The test10:go() function creates an oscillation between a couple of
workloads: one that tends toward scheduler collapse, and one that tends
to wake them up again.
The test11:go() function uses only a single load that tends toward
scheduler collapse.
Both of them fail fairly regularly on my 8 core MBP using R15B01,
R15B03, and R16B.
The io:format() messages are sent while load is not running, with very
generous pauses before starting the next phase of workload. If you call
io:format() during unfairly-scheduled workload (which these tests excel
at doing), the messages can be delayed by dozens of seconds.
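
Since output printed during the load can arrive late, one way to look at
scheduler state is to sample the VM's counters before and after a phase
and print the result afterwards. The following is only a sketch and is
not part of test10.erl or test11.erl; the module name sched_util_sketch
and the function measure/1 are made up for illustration. It relies on
erlang:statistics(scheduler_wall_time), which as far as I know is
available from R15B01 onward.

```erlang
-module(sched_util_sketch).
-export([measure/1]).

%% Hypothetical helper, not taken from the test modules.
%% Runs Fun() and returns [{SchedulerId, Utilization}], where Utilization
%% is a float between 0.0 and 1.0 computed from the difference between
%% two scheduler_wall_time samples.
measure(Fun) ->
    erlang:system_flag(scheduler_wall_time, true),
    T0 = lists:sort(erlang:statistics(scheduler_wall_time)),
    Fun(),
    T1 = lists:sort(erlang:statistics(scheduler_wall_time)),
    erlang:system_flag(scheduler_wall_time, false),
    [{Id, utilization(A1 - A0, W1 - W0)}
     || {{Id, A0, W0}, {Id, A1, W1}} <- lists:zip(T0, T1)].

%% Guard against a zero-length measurement interval.
utilization(_Active, 0) -> 0.0;
utilization(Active, Total) -> Active / Total.
```

Called as sched_util_sketch:measure(fun() -> test11:go() end), a result
where most schedulers sit near 0.0 while there is still plenty of
runnable work would be consistent with the collapse described here.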
Note that these synthetic tests are using two different functions to
cause scheduler collapse: test10.erl with crypto:md5_update/2, a NIF,
and test11.erl with erlang:external_size/1, a BIF. It's quite likely
that erlang:term_to_binary/1 is similarly effective/buggy.
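
For readers without the attachments, here is a rough sketch of the kind
of load test11.erl is described as generating: many processes, each
calling erlang:external_size/1 on a large term in a loop. This is not
Joseph's code. The module name collapse_sketch, the function go/3, and
all default sizes are assumptions chosen for illustration, and whether
this particular shape reproduces the collapse on a given VM is untested.

```erlang
-module(collapse_sketch).
-export([go/0, go/3]).

%% Hypothetical defaults; not taken from test11.erl.
go() ->
    go(erlang:system_info(schedulers) * 4, 200, 100000).

%% Spawn NumProcs workers. Each worker calls erlang:external_size/1
%% Iters times on a list of ListLen integers, then reports to the parent.
%% Note that the term is copied into every worker's heap on spawn.
go(NumProcs, Iters, ListLen) ->
    Term = lists:seq(1, ListLen),
    Parent = self(),
    Pids = [spawn_link(fun() -> worker(Parent, Term, Iters) end)
            || _ <- lists:seq(1, NumProcs)],
    %% Wait for every worker before returning.
    [receive {done, Pid} -> ok end || Pid <- Pids],
    ok.

worker(Parent, _Term, 0) ->
    Parent ! {done, self()};
worker(Parent, Term, N) ->
    _ = erlang:external_size(Term),
    worker(Parent, Term, N - 1).
```

Swapping erlang:external_size/1 for erlang:term_to_binary/1 in worker/3
would be one way to check the guess above about term_to_binary.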
Neither of them fails when using this patch on any of those three VM
... when also using "+scl false +zdnfgtse 500:500".