[erlang-questions] McErlang
Kostis Sagonas
kostis@REDACTED
Wed Oct 4 01:57:12 CEST 2017
On 10/03/2017 07:08 PM, John Hughes wrote:
> Concuerror will guarantee to find bugs… with relatively short runs,
> relatively few processes, and a small number of preemptions. It’s
> searching very thoroughly in a small part of the space of possible
> executions—and it’s very good at it, but there is still an exponential
> blowup that will strike eventually. Random sampling, such as QuickCheck
> and PULSE do, can sample from a much larger part of the space… and if
> that’s where the bugs are, then that’s where you have to look for them!
> So Concuerror/McErlang and random tools are really complementary—either
> one may find bugs faster, depending on the kind of bug. There is plenty
> of evidence that sampling larger random tests, and shrinking them, can
> be a much faster way to find faults than enumerating smaller tests.
I very much agree that systematic and random-based tools are really
complementary: both have their pros and cons and, for that reason, their
proper place in a programmer's tool suite. Also, for testing purposes,
there is a lot to be said for finding bugs sooner rather than later, and
one can perhaps detect bugs faster with random sampling if the bug
occurs sufficiently frequently.
Still, some of my recent experiences [1,2] are different: a
sufficiently powerful stateless model checking tool, such as Concuerror
(or a similar tool for C), can find concurrency bugs pretty fast,
_provided_ that one pays attention to the careful *modeling* of the
system. This is where one of the difficulties of using systematic
concurrency testing tools lies: how to "minimize the model"(*) in order
to fight the combinatorial explosion. Once this is done properly,
finding concurrency bugs is easy, because most such bugs require very
few preemptions/context switches to manifest themselves.
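To illustrate how few preemptions are typically needed, here is a
minimal sketch (module, table, and function names are made up for this
mail): two processes increment the same ETS counter with a non-atomic
lookup/insert pair, and a single preemption between the lookup and the
insert is enough to lose an update and violate the final assertion.

  -module(counter_race).
  -export([test/0]).

  %% Minimal model: two concurrent increments of a shared ETS counter.
  %% The lookup/insert pair is not atomic, so a single preemption
  %% between them loses an update and the assertion at the end fails.
  test() ->
      ets:new(tab, [public, named_table]),
      ets:insert(tab, {counter, 0}),
      Parent = self(),
      [spawn(fun() -> incr(), Parent ! done end) || _ <- [1, 2]],
      [receive done -> ok end || _ <- [1, 2]],
      [{counter, 2}] = ets:lookup(tab, counter).

  incr() ->
      [{counter, N}] = ets:lookup(tab, counter),
      ets:insert(tab, {counter, N + 1}).

Pointing Concuerror at this with something like "concuerror -m
counter_race -t test" (exact flags may differ between versions)
explores the interleavings of this small model and reports the schedule
that leads to the badmatch.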
(*) In many cases the part that needs to be minimized consists of the
components of the system that are not tested; in some others it's the
complete program under test.
Unfortunately, this manual minimization process that very thorough
testing tools require goes very much against the "you need to
stress-test to increase your likelihood of finding concurrency errors"
mindset that many programmers have been brainwashed into, which may or
may not be related to why random sampling often works well. (E.g., in
a program with N reader processes and M writer processes, all
concurrent, one does not need large values of N and M to find a data
race: one process of each kind suffices for read/write races, and two
writers suffice for write/write races or immutability violations.)
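As a purely hypothetical sketch of such a minimized model (again, all
names are mine), the reader/writer example above can be as small as:

  -module(rw_minimal).
  -export([test/0]).

  %% One writer and one reader suffice to expose a read/write race;
  %% for write/write races or immutability violations, use a second
  %% writer instead of the reader.
  test() ->
      ets:new(shared, [public, named_table]),
      ets:insert(shared, {key, 0}),
      spawn(fun() -> ets:insert(shared, {key, 1}) end),
      %% The reader (wrongly) assumes the value never changes; a
      %% systematic tool finds the schedule in which the writer runs
      %% first and this match fails.
      [{key, 0}] = ets:lookup(shared, key).

No stress-testing with hundreds of processes is needed by a tool that
explores schedules systematically: the two-process model above already
contains the racy interleaving.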
But if one is interested just in "finding bugs", bounding techniques can
often do wonders. [1] reports that, by using bounding, Concuerror
finds a bug in a particular method (Method 1 in that paper)
in less than a minute, whereas without bounding the tool needs 144 hours
to detect the bug. (Later, the same paper shows that, even without
bounding, the bugs in that method are detected very fast if one pays
attention to the model [1, Table 2].)
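(For the record, and with the caveat that option names change between
Concuerror versions so one should consult "concuerror --help" for the
current ones, enabling such a bound looks roughly like

  concuerror -m counter_race -t test --scheduling_bound_type bpor --scheduling_bound 1

where the module name comes from the sketch above and the bound value 1
is only for illustration.)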
An interesting (to me at least) question is what should happen after all
"shallow" bugs have been fixed and a particular tool does NOT detect a
bug anymore. What sort of guarantees are provided by the tool? Isn't
it time to switch to a systematic tool instead of a random-based one?
This is
exactly what I was referring to in my previous mail where I wrote:
    Thus Concuerror, in principle at least (i.e., modulo bugs in its
    code), actually *verifies*, using stateless model checking, that a
    particular (part of a) program is free of concurrency errors, in
    contrast to e.g. QuickCheck/PULSE which fundamentally cannot provide
    any correctness guarantees (other than probabilistic ones).
As both [1] and [2] report, it's possible to get actual guarantees about
the absence of certain classes of (concurrency) errors using a stateless
model checking tool, even in code bases of significant size such as
portions of the Linux kernel. This guarantee exists _because of_ the
systematic exploration and the fact that the tools are sound (i.e., can
in principle detect all errors).
Kostis
[1] Testing and Verifying Chain Repair Methods for Corfu Using Stateless
    Model Checking. http://concuerror.com/assets/pdf/iFM2017.pdf
[2] Stateless Model Checking of the Linux Kernel's Hierarchical
    Read-Copy-Update (Tree RCU). https://dl.acm.org/citation.cfm?id=3092287