Beautiful Debugging > Delta Debugging

28.5. Delta Debugging

Over the next several months, I refined and optimized the algorithm as well as the tool, such that eventually it would need just one hour to find the failure-inducing change. I eventually published the algorithm under the name delta debugging because it "debugs" programs by isolating a delta, or difference between two versions.

Here I'll show my Python implementation of the delta debugging algorithm. The function dd() takes three arguments—two lists of changes as well as a test:

If you plan to implement dd() yourself, you can easily use the code shown here (and included on the O'Reilly web site for this book). You also need three supporting functions:


split (list, n)

Splits a list into n sublists of equal length (except for possibly the last). Thus:

	split([1, 2, 3, 4, 5], 3)

yields:

	[[1, 2], [3, 4], [5]]


listminus() and listunion( )

Return the difference or union of two sets represented as lists, respectively. Thus:

	listminus([1, 2, 3], [1, 2])

yields:

	[3]

whereas:

	listunion([1, 2, 3], [3, 4])

yields:

	[1, 2, 3, 4]

The Python code is shown in Example 28-1.

Example 28-1. An implementation of the delta debugging algorithm

def dd(c_pass, c_fail, test):
    """Return a triple (DELTA, C_PASS', C_FAIL') such that
       - C_PASS subseteq C_PASS' subset C_FAIL' subseteq C_FAIL holds
       - DELTA = C_FAIL' - C_PASS' is a minimal difference
         between C_PASS' and C_FAIL' that is relevant with respect
         to TEST."""

    n = 2 # Number of subsets

    while 1:
        assert test(c_pass) == PASS # Invariant
        assert test(c_fail) == FAIL # Invariant
        assert n >= 2

        delta = listminus(c_fail, c_pass)

        if n > len(delta):
            # No further minimizing
            return (delta, c_pass, c_fail)

        deltas = split(delta, n)
        assert len(deltas) == n

        offset = 0
        j = 0
        while j < n:
            i = (j + offset) % n
            next_c_pass = listunion(c_pass, deltas[i])
            next_c_fail = listminus(c_fail, deltas[i])

            if test(next_c_fail) == FAIL and n == 2:
                c_fail = next_c_fail
                n = 2; offset = 0; break
            elif test(next_c_fail) == PASS:
                c_pass = next_c_fail
                n = 2; offset = 0; break
            elif test(next_c_pass) == FAIL:
                c_fail = next_c_pass
                n = 2; offset = 0; break
            elif test(next_c_fail) == FAIL:
                c_fail = next_c_fail
                n = max(n - 1, 2); offset = i; break
            elif test(next_c_pass) == PASS:
                c_pass = next_c_pass
                n = max(n - 1, 2); offset = i; break
            else:
                j = j + 1

        if j >= n:
            if n >= len(delta):
                return (delta, c_pass, c_fail)
            else:
                n = min(len(delta), n * 2)