David R. MacIver's Blog
A sketch of a new simplify API for Hypothesis
One of the decisions I’m quite happy I made is that the SearchStrategy API in Hypothesis is not public for 1.0. This gives me some freedom to break it when I figure out ways to make it better.
Which is good, because I’ve already figured out a way to break it to make it better.
In my fuzzer for yapf, I realised an interesting thing: The minimization happens in two phases, first line based, then character based. Once character based simplification has started we stop bothering with line based simplification altogether: Although in principle character based simplification could get us back into a state where line based simplification would be useful again, in practice this happens so rarely that trying line based simplification is basically never worth going back to.
Meanwhile, back in Hypothesis land, I was staring at the following
quality test this morning:
def dedupe(xs):
result = []
for x in xs:
if x not in result:
result.append(x)
return result
@given(strategy([bytes]).map(dedupe))
def is_sorted(xs):
assume(len(xs) >= 10)
assert sorted(xs) == xs
This is an example that is designed to fail and mostly exists to test
the quality of the minimization: What’s the simplest list of byte
strings it can find such that:
- The list only has distinct elements
- The list has at least 10 elements
- The list is not sorted
It’s not hard to find such an example - generally speaking Hypothesis
will find an example within the first couple of throws of the dice - but
producing a good example is harder, which is what this test
looks for (In the actual test we make a bunch of assertions about the
quality of the result we got at the end).
The problem is that sometimes this can get into a bad state where it takes an extraordinarily long time to minimize.
The problem is basically this: Every time it successfully minimizes an example, the simplification process starts again from scratch on the new example. This means that it tries all the usual simplifications for deleting elements from the list. These are a complete waste of time because they always shrink the list, violating the size assumption. But every time we shrink an element (and there are 10 elements to shrink) we do it all again.
I was originally thinking of trying to fix this in terms of making simplify skip simplifications which fail to produce examples too often, but that’s not really very satisfactory and requires a lot of delicate state tracking. It might be worth it in some cases, but I think it probably wouldn’t be.
Then I realised that this problem fits entirely into the approach I mentioned using for yapf, and we can use it to fix this problem.
Right now Hypothesis essentially follows the classic quickcheck API
where there is a function shrink which takes values and provides a
generator over simpler versions of the value. Simplified, the hypothesis
example finding algorithm looks like:
def find_minimal(x, f):
for s in shrink(x):
if f(s):
return find_minimal(s, f)
return x
(that’s not actual Hypothesis code. The real implementation is more
complex in several ways and doesn’t use recursion because Python, but
it’s the idea of it).
The idea is to instead change this so that there are multiple
shrinkers, each behaving like our shrink function above, and then we
apply each shrinker in turn, never returning to an earlier one. This
would be something akin to:
def minimize_with_shrinker(x, f, shrinker):
for s in shrinker(x):
if f(s):
return minimize_with_shrinker(s, f, shrinker)
return x
def find_minimal(x, f):
for shrinker in shrinkers:
x = minimize_with_shrinker(x, f, shrinker)
return x
This lets us fix the list example by splitting up shrinking into two
phases: First it tries to shrink by removing elements from the list.
Then it tries to shrink by shrinking individual elements. Once it starts
shrinking elements, the size of the list is fixed - it’s already found
the smallest list it’s going to.
This does impact the quality of the examples a little bit. If necessary that could be fixed by iterating to a fixed point: We run the above algorithm then we run it again until it no longer changes the result. A simpler version would just be to run it twice. I don’t plan to do this unless I find the results from a single pass of this are much worse than I expect them to be. The goal here isn’t always to find the single simplest example, only to find a simple enough example in a short amount of time. I think this will be entirely acceptable quality and it should be very aceptable in time in comparison to the existing implementation.
Comments
pozorvlak on 2015-03-28 12:04:41:
def minimize_with_shrinker(x, f, shrinker):
for s in shrink(x):
Should the second line be for s in shrinker(x):?
david on 2015-03-28 12:06:13:
Yes. Thanks.