[erlang-questions] McErlang
Kostis Sagonas
kostis@REDACTED
Wed Oct 4 01:57:12 CEST 2017
On 10/03/2017 07:08 PM, John Hughes wrote:
> Concuerror will guarantee to find bugs… with relatively short runs,
> relatively few processes, and a small number of preemptions. It’s
> searching very thoroughly in a small part of the space of possible
> executions—and it’s very good at it, but there is still an exponential
> blowup that will strike eventually. Random sampling, such as QuickCheck
> and PULSE do, can sample from a much larger part of the space… and if
> that’s where the bugs are, then that’s where you have to look for them!
> So Concuerror/McErlang and random tools are really complementary—either
> one may find bugs faster, depending on the kind of bug. There is plenty
> of evidence that sampling larger random tests, and shrinking them, can
> be a much faster way to find faults than enumerating smaller tests.
I very much agree that systematic and random-based tools are really
complementary: both have their pros and cons and, for that reason, their
proper place in a programmer's tool suite. Also, for testing purposes,
there is a lot to be said for finding bugs sooner rather than later, and
one can perhaps detect bugs faster with random sampling if the bug
occurs sufficiently frequently.
Still, some of my recent experiences [1,2] are different: a
sufficiently powerful stateless model checking tool, such as Concuerror
(or a similar tool for C), can find concurrency bugs pretty fast,
_provided_ that one pays attention to the careful *modeling* of the
system. This is where one of the difficulties of using systematic
concurrency testing tools lies: how to "minimize the model"(*) in order
to fight the combinatorial explosion. Once this is done properly,
finding concurrency bugs is easy, because most such bugs require very
few preemptions/context switches to manifest themselves.
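To illustrate how few preemptions are typically needed, here is a
minimal sketch (module, table, and function names are made up for this
mail): two processes increment the same ETS counter with a non-atomic
lookup/insert pair, and a single preemption between the lookup and the
insert is enough to lose an update and violate the final assertion.

  -module(counter_race).
  -export([test/0]).

  %% Minimal model: two concurrent increments of a shared ETS counter.
  %% The lookup/insert pair is not atomic, so a single preemption
  %% between them loses an update and the assertion at the end fails.
  test() ->
      ets:new(tab, [public, named_table]),
      ets:insert(tab, {counter, 0}),
      Parent = self(),
      [spawn(fun() -> incr(), Parent ! done end) || _ <- [1, 2]],
      [receive done -> ok end || _ <- [1, 2]],
      [{counter, 2}] = ets:lookup(tab, counter).

  incr() ->
      [{counter, N}] = ets:lookup(tab, counter),
      ets:insert(tab, {counter, N + 1}).

Pointing Concuerror at this with something like "concuerror -m
counter_race -t test" (exact flags may differ between versions)
explores the interleavings of this small model and reports the schedule
that leads to the badmatch.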
(*) In many cases the part that needs to be minimized consists of the
components of the system that are not tested; in some others it's the
complete program under test.
Unfortunately, this manual minimization process that very thorough
testing tools require goes very much against the "you need to
stress-test to increase your likelihood of finding concurrency errors"
mindset that many programmers have been brainwashed into, which may or
may not be related to why random sampling often works well. (E.g., in
a program with N reader processes and M writer processes, all
concurrent, one does not need large values of N and M to find a data
race: one process of each kind suffices for read/write races, and two
writers suffice for write/write races or immutability violations.)
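As a purely hypothetical sketch of such a minimized model (again, all
names are mine), the reader/writer example above can be as small as:

  -module(rw_minimal).
  -export([test/0]).

  %% One writer and one reader suffice to expose a read/write race;
  %% for write/write races or immutability violations, use a second
  %% writer instead of the reader.
  test() ->
      ets:new(shared, [public, named_table]),
      ets:insert(shared, {key, 0}),
      spawn(fun() -> ets:insert(shared, {key, 1}) end),
      %% The reader (wrongly) assumes the value never changes; a
      %% systematic tool finds the schedule in which the writer runs
      %% first and this match fails.
      [{key, 0}] = ets:lookup(shared, key).

No stress-testing with hundreds of processes is needed by a tool that
explores schedules systematically: the two-process model above already
contains the racy interleaving.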
But if one is interested just in "finding bugs", bounding techniques can
often do wonders. [1] reports that, by using bounding, Concuerror
finds a bug in a particular method (Method 1 in that paper)
in less than a minute, whereas without bounding the tool needs 144 hours
to detect the bug. (Later, the same paper shows that, even without
bounding, the bugs in that method are detected very fast if one pays
attention to the model [1, Table 2].)
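(For the record, and with the caveat that option names change between
Concuerror versions so one should consult "concuerror --help" for the
current ones, enabling such a bound looks roughly like

  concuerror -m counter_race -t test --scheduling_bound_type bpor --scheduling_bound 1

where the module name comes from the sketch above and the bound value 1
is only for illustration.)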
An interesting (to me at least) question is what should happen after all
"shallow" bugs have been fixed and a particular tool does NOT detect a
bug anymore. What sort of guarantees are provided by the tool? Isn't
it time to switch to a systematic tool instead of a random-based one?
This is
exactly what I was referring to in my previous mail where I wrote:
    Thus Concuerror, in principle at least (i.e., modulo bugs in its
    code), actually *verifies*, using stateless model checking, that a
    particular (part of a) program is free of concurrency errors, in
    contrast to e.g. QuickCheck/PULSE which fundamentally cannot provide
    any correctness guarantees (other than probabilistic ones).
As both [1] and [2] report, it's possible to get actual guarantees about
the absence of certain classes of (concurrency) errors using a stateless
model checking tool, even in code bases of significant size such as
portions of the Linux kernel. This guarantee exists _because of_ the
systematic exploration and the fact that the tools are sound (i.e., can
in principle detect all errors).
Kostis
[1] Testing and Verifying Chain Repair Methods for Corfu Using Stateless
    Model Checking. http://concuerror.com/assets/pdf/iFM2017.pdf
[2] Stateless Model Checking of the Linux Kernel's Hierarchical
    Read-Copy-Update (Tree RCU). https://dl.acm.org/citation.cfm?id=3092287