Friday, January 14, 2011

Big-O Isn't Everything

Question: when does an O(N2) algorithm outperform an O(NlogN) algorithm?

As part of my article on primitives, I decided to implement Heapsort. I needed an in-place sort for primitive integers that used an external comparator, and couldn't find one with a reasonable amount of Googling. I picked Heapsort because it's an easy algorithm to implement, and has O(NlogN) behavior for all inputs (unlike Quicksort, which degrades to O(N2) on already-sorted input unless you take care picking the pivot).

Naturally, I wanted to see how my implementation performed, vis-a-vis the sorts provided by the JDK. Particularly on large datasets, which is the reason that I wanted an in-place sort. So I created a benchmark that measured thread CPU time to sort a 10,000,000 element array (which I knew would fit in RAM). Here are the results, the average of 10 executions on my aged desktop (a 2.4 Ghz P4 with only 512 kb of cache):

Implementation Time (sec) Compares
Arrays.sort(int[]) 2.55
Heapsort.sort(int[]) 18.08
Arrays.sort(Integer[]) 25.09 223,179,141
Heapsort.sort(Integer[]) 79.91 439,021,125

Those are dramatic differences. I knew that Heapsort would make more comparisons than Mergesort (which the JDK will use for Integer[]), so I added a counter to the comparator. But the running time was over three times longer, with only twice the number of comparisons. And Quicksort (used by the JDK for int[]) was over seven times faster. Perhaps some of that difference comes from Heapsort calling a comparator, but clearly, not all O(NlogN) algorithms are alike!

So I decided to look at the JDK source. And found this interesting bit of code:*

private static void sort1(int x[], int off, int len) {
// Insertion sort on smallest arrays
if (len < 7) {
    for (int i=off; i<len+off; i++)
    for (int j=i; j>off && x[j-1]>x[j]; j--)
        swap(x, j, j-1);
    return;
}

Insertion sort is an O(N2) algorithm. But it doesn't have a lot of moving parts — unlike Quicksort, which takes almost 40 lines of code in the JDK implementation, not counting the swap and range-swap methods. And when N is small, lines of code are more important than algorithmic efficiency (or, as my coworker Max points out, when N is bounded, you have a constant-time operation).

There's one more performance-enhancing feature of this snippet: all of the data accesses should be within the L1 cache. On Intel processors, the cache is structured as 64-byte chunks of memory — enough to hold 8 adjacent int values. While Arrays.sort() makes no attempt to ensure alignment of the insertion-sorted sub-arrays, it's a naturally cache-friendly algorithm. As is Quicksort itself, for that matter. And Heapsort, which jumps around the array comparing elements at position N and 2 * N, is decidedly cache-unfriendly.

All of which is to say that sometimes, micro-optimizations have a big effect. And that, rather than implementing Heapsort, I should have simply copied the OpenJDK implementation of Quicksort and added a comparator.**


* This optimization isn't specific to the JDK. The GNU C library implementation also switches to insertion sort, albeit at 4 elements, and credits Sedgewick for that and other optimizations.

** In case you're wondering, the updated Quicksort took 4.25 seconds to run, and made approximately 375,000,000 comparisons. If it weren't covered by GPL2, I'd put it in my toolbox in place of the Heapsort implementation.

Friday, January 7, 2011

Mock Objects and Statistics Gathering

Unit testing tends to focus on the API. The unspoken assumption is that, if the API behaves as expected, then the internals are correct. However, that's not guaranteed; even Uncle Bob concedes that a pure test-driven approach will lead you to BubbleSort rather than QuickSort (warning: it's a bit of a rant). Clearly, the “black box” approach of testing just the API is not sufficient; some tests need a “white box” approach, where you look into the implementation. But how to do that?

One way is to add test hooks: public methods that give access to internal object state. I can see how this might work: you write all your mainline code in terms of interfaces, with factories to give you instances of those interfaces. And as mainstream software development embraces dependency injection frameworks, this might be the right answer. But it still leaves me with a queasy feeling. All it takes is one undisciplined programmer writing a cast to the implementation class so that s/he can use those methods, and you've got brittle code waiting to break.

I recently tried a different approach: using a mock object to gather statistics about the internals of the object under test. Ironically, it was a sort routine.

Mock objects have been a part of test-centric development for a while. There are many libraries that help you create mock objects, and it's easy to create your own using proxies. But most of the articles that I've read on mock-centric tests use them to track simple interactions: you want to verify that your class takes a particular action, so you inject a mock that asserts it occurred.

When testing my sort routine, I wanted to verify that the sort actually performed in NlogN time. It was a static method, so there was no place to add a test hook even if I believed in doing so. But there was one hook that was a natural part of the API:

public static class CountingIntComparator
implements Heapsort.IntComparator
{
    public int count;
    public int expectedCount;

    public CountingIntComparator(int size)
    {
        // our implementation of heapsort should perform at most 3 compares per element
        expectedCount = 3 * size * (int)Math.ceil(Math.log(size) / Math.log(2));
    }

    public int compare(int i1, int i2)
    {
        count++;
        return (i1 < i2) ? -1
             : (i1 > i2) ? 1
             : 0;
    }

    public void assertCompareCount()
    {
        assertTrue("expected < " + expectedCount + ", was " + count,
                   count < expectedCount);
    }
}

This works because “O(NlogN)” refers to the number of comparisons made by the sort. And it made for very simple test code (note that I needed a relatively large test array; this is a probabilistic tests, and I didn't want the test to fail because the random number generator returned a string of “bad” values):

public void testIntSortManyElements() throws Exception
{
    final int size = 10000;

    int[] src = createRandomArray(size);
    int[] exp = createSortedCopy(src);

    CountingIntComparator cmp = new CountingIntComparator(size);
    Heapsort.sort(src, cmp);
    assertEquals(exp, src);

    cmp.assertCompareCount();
}

The take-away from this, if there is one, is that mock objects can do more than simple interaction tests. They can gather statistics that give you insight into the long-term behavior of your objects, without the need to expose object state to the entire world.

The key seems to be whether or not the API that you're building is amenable to injecting such a mock without adding a test-specific variant. Perhaps it's a technique that only works for utility classes. But I'll be looking for similar opportunities in more complex code.