[erlang-questions] QuickCheck module for testing the new string module

Fri Apr 7 17:10:59 CEST 2017

On Thu, Apr 6, 2017 at 3:26 PM Björn Gustavsson <bjorn@REDACTED> wrote:

>
> Comments are welcome. This is my first major use of QuickCheck. I am
> interested in how I could improve the QC specifications and
> generators.
>
>
I think it's looking rather good. If you have the commercial variant of
EQC, here are some things you may want to do:

I'm very fond of using eqc:module({testing_budget, 30}, Mod) because it
gives you 30 seconds of tests equally distributed over the properties in
the file. Time bounds are usually nicer than a fixed number of tests. You can also
weight your properties so those which most often fail are tested a bit
more. I tend to pick 15 seconds when developing, 2-5 minutes for coffee
runs, 30 minutes for lunch and 12 hours for when you leave the office or go
to sleep.
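
A small wrapper makes those budgets easy to reuse. This is only a sketch:
the module name eqc_budget and the profile names are made up for
illustration, and 3 minutes is an arbitrary pick from the 2-5 minute range.
The only EQC call it relies on is the eqc:module/2 form shown above.

```erlang
-module(eqc_budget).
-export([run/2, budget/1]).

%% Time budgets in seconds (hypothetical profile names).
budget(dev)    -> 15;
budget(coffee) -> 3 * 60;          % somewhere in the 2-5 minute range
budget(lunch)  -> 30 * 60;
budget(night)  -> 12 * 60 * 60.

%% Run all properties in Mod, spreading the budget over them.
run(Profile, Mod) ->
    eqc:module({testing_budget, budget(Profile)}, Mod).
```

For example, eqc_budget:run(lunch, my_string_eqc) would spend 30 minutes on
the (hypothetical) module my_string_eqc.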

You can use the in_parallel transform on your file to execute your test
cases on all cores. The speedup is more or less linear in the number of
cores.

Start using classification in your test cases. You want to classify on the
structure of your generated strings, so you can see if you actually cover a
realistic set of strings or if you are looking at rather small strings only.
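
As a rough sketch of what I mean by classifying on structure: the module
name, the codepoint/0 generator and the shape/1 helper below are my own
inventions, not taken from your file, and the property body is only a
placeholder so that there is something to check.

```erlang
-module(string_class_eqc).
-include_lib("eqc/include/eqc.hrl").
-export([prop_classified/0, shape/1]).

%% Any Unicode code point except the surrogate range (assumed generator).
codepoint() ->
    oneof([choose(0, 16#D7FF), choose(16#E000, 16#10FFFF)]).

%% Coarse structural class of a generated string.
shape([]) -> empty;
shape(S) ->
    case lists:all(fun(C) -> C < 128 end, S) of
        true  -> ascii_only;
        false -> non_ascii
    end.

prop_classified() ->
    ?FORALL(S, list(codepoint()),
        classify(shape(S) =:= empty, empty,
        classify(shape(S) =:= ascii_only, ascii_only,
        classify(shape(S) =:= non_ascii, non_ascii,
            %% Placeholder: grapheme clusters never outnumber code points.
            string:length(S) =< length(S))))).
```

With a uniform code point generator like this one, I would expect the
printed percentages to show almost no ascii_only strings. That is exactly
the kind of thing classification tells you about your generator.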

Collect information about the length of your strings. I have some tooling
in https://github.com/jlouis/eqc_lib/blob/master/eqc_lib.erl for
summarizing data in the form of what R does on a data set (and stem+leaf
plots). Again, the goal is to verify that your generator is generating a
realistic input set.
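
Even without extra tooling, the built-in measure and collect go a long way.
A minimal sketch, where the module name, the generator and the bucket/1
helper are assumptions of mine and the property body is a placeholder:

```erlang
-module(string_len_eqc).
-include_lib("eqc/include/eqc.hrl").
-export([prop_length_stats/0, bucket/1]).

%% Any Unicode code point except the surrogate range (assumed generator).
codepoint() ->
    oneof([choose(0, 16#D7FF), choose(16#E000, 16#10FFFF)]).

%% Coarse length buckets for the collect/2 histogram.
bucket(0)               -> '0';
bucket(N) when N < 10   -> '1-9';
bucket(N) when N < 100  -> '10-99';
bucket(_)               -> '100+'.

prop_length_stats() ->
    ?FORALL(S, list(codepoint()),
        measure(codepoints, length(S),
        collect(bucket(length(S)),
            %% Placeholder: grapheme clusters never outnumber code points.
            string:length(S) =< length(S)))).
```

If nearly everything lands in the '0' and '1-9' buckets, the generator is
probably too timid for testing a string library.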

Since we are trying to handle unicode, I would lace the input with a
frequency generator which deliberately creates strings which are known to
be naughty[0][1]. In principle we should hit them randomly after a while,
but it is often simpler to just generate all the nasty strings more often
in the code. Normal tests and use are likely to quickly hit the common
faults. So go straight for the jugular: hit all the corner cases early and
often. You want to hit errors in fewer than 100 test cases if possible. The
goal here is to crash the code base. In general, look up what people in the
security world are using as fuzzing inputs.
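
Something along these lines is what I have in mind. It is a sketch only:
the handful of strings in naughty/0 are examples I picked by hand, not
copied from the lists in [0] and [1], and the weights, module name and
placeholder property are arbitrary.

```erlang
-module(naughty_eqc).
-include_lib("eqc/include/eqc.hrl").
-export([prop_naughty/0, naughty/0]).

%% Any Unicode code point except the surrogate range (assumed generator).
codepoint() ->
    oneof([choose(0, 16#D7FF), choose(16#E000, 16#10FFFF)]).

%% A few hand-picked troublesome strings, as lists of code points.
naughty() ->
    [ [],                              % empty string
      [0],                             % NUL
      [16#FEFF],                       % byte order mark
      [$e, 16#0301],                   % e + combining acute accent
      [16#202E, $a, $b],               % right-to-left override
      [16#1F468, 16#200D, 16#1F469],   % emoji joined by zero width joiner
      [16#10FFFF] ].                   % highest code point

%% Mostly random strings, laced with naughty ones, alone or embedded.
unicode_string() ->
    frequency([{3, list(codepoint())},
               {1, elements(naughty())},
               {1, ?LET({A, N, B},
                        {list(codepoint()), elements(naughty()),
                         list(codepoint())},
                        A ++ N ++ B)}]).

prop_naughty() ->
    ?FORALL(S, unicode_string(),
        %% Placeholder: grapheme clusters never outnumber code points.
        string:length(S) =< length(S)).
```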

Another point, which you may already cover, is that of negative testing:

* Positive: Valid inputs must succeed with the right value
* Negative: Invalid inputs should return the right error or throw an error

In my maps_eqc tests, which are available at [2], we have the following
lines:

https://github.com/jlouis/maps_eqc/blob/3ab960018684785415e7265245889caf083e330c/src/maps_eqc.erl#L320-L379

which verify the behaviour of the maps module when you input values which
are not valid maps or are otherwise invalid inputs. We can, in each case,
predict what the error should be, especially in the situation of {badkey, K}
errors. This in turn ensures that the error paths are exercised as well.
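
To make the idea concrete, here is a much smaller sketch than the linked
code (it is not the code from maps_eqc, and all names in it are made up).
A model predicts either the value or the exact error for maps:get/2, and
the property compares that prediction with what actually happens:

```erlang
-module(neg_sketch_eqc).
-include_lib("eqc/include/eqc.hrl").
-export([prop_get/0, safe_get/2, model_get/2]).

g_map() ->
    ?LET(L, list({int(), int()}), maps:from_list(L)).

%% Mostly valid maps, sometimes something that is not a map at all.
g_input() ->
    frequency([{4, g_map()},
               {1, oneof([int(), list(int())])}]).

%% Run the real call, turning exceptions into values.
safe_get(K, M) ->
    try maps:get(K, M) of
        V -> {ok, V}
    catch
        error:Reason -> {error, Reason}
    end.

%% The model: what we predict for both valid and invalid input.
model_get(K, M) when is_map(M) ->
    case maps:find(K, M) of
        {ok, V} -> {ok, V};
        error   -> {error, {badkey, K}}
    end;
model_get(_K, NotMap) ->
    {error, {badmap, NotMap}}.

outcome({ok, _})            -> ok;
outcome({error, {Tag, _}})  -> Tag.

prop_get() ->
    ?FORALL({K, M}, {int(), g_input()},
        begin
            Expected = model_get(K, M),
            collect(outcome(Expected),
                    equals(safe_get(K, M), Expected))
        end).
```

The collect/2 output shows how often the ok, badkey and badmap paths were
actually taken.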

A typical strategy here is either to use the fault/2 generator together with
a parameter that alters the fault injection rate, or to have separate
properties which always generate faulty input. Lace the generator with a 10%
fault injection rate at each part of your tree, say, so the chance of
generating a fault is fairly high when multiple such parts are taken
together. Then guard it with a ?SUCHTHAT on actually having a fault. But
beware of having to search too much in the ?SUCHTHAT, as that slows down test
case generation. Classification of the types of faults becomes paramount
here. You can find some of these strategies used in my enacl test cases[3].
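
A minimal sketch of the fault-injection variant, using maps:merge/2 as the
function under test. The module and generator names are invented for the
example, and I am assuming fault(Faulty, Good) and fault_rate(M, N, Gen)
behave as I remember them from the EQC documentation, so check them
against your EQC version:

```erlang
-module(fault_sketch_eqc).
-include_lib("eqc/include/eqc.hrl").
-export([prop_merge_badmap/0, has_fault/1]).

g_map() ->
    ?LET(L, list({int(), int()}), maps:from_list(L)).

g_not_map() ->
    oneof([int(), list(int()), bool()]).

%% Normally a map; with the current fault rate, a non-map instead.
g_arg() ->
    fault(g_not_map(), g_map()).

has_fault(X) -> not is_map(X).

%% 10% fault rate at each position, guarded so at least one fault exists.
g_faulty_args() ->
    ?SUCHTHAT({A, B},
              fault_rate(1, 10, {g_arg(), g_arg()}),
              has_fault(A) orelse has_fault(B)).

prop_merge_badmap() ->
    ?FORALL({A, B}, g_faulty_args(),
        classify(has_fault(A), first_arg_faulty,
        classify(has_fault(B), second_arg_faulty,
            try maps:merge(A, B) of
                _ -> false                       % must not succeed
            catch
                error:{badmap, Bad} -> has_fault(Bad)
            end))).
```

With only two positions the ?SUCHTHAT has to throw away roughly four out of
five candidates, which is the search cost I warned about. The more fault
sites in the tree, the cheaper the guard gets.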

Feel free to question me with stuff if needed!

[0] http://www.lookout.net/2011/06/special-unicode-characters-for-error.html
[1] https://github.com/minimaxir/big-list-of-naughty-strings
[2] https://github.com/jlouis/maps_eqc
[3] https://github.com/jlouis/enacl/blob/master/eqc_test/enacl_eqc.erl