Friday, January 29, 2010

Micro-Optimization is Easy and Feels Good

Have you ever seen code like this?

private String foo(String arg1, int arg2)
{
    StringBuilder buf = new StringBuilder();
    buf.append(arg1).append(" = ").append(arg2);
    return buf.toString();
}

If you ask the person who wrote this why s/he used a StringBuilder rather than simply concatenating the strings, you'll probably hear a long lecture about how it's more efficient. If you're a bytecode geek, you can respond by demonstrating that the compiler generates exactly the same code. But then the argument changes: it's more efficient in some cases, and is never less efficient, so is still a Good Thing. At that point it's usually easiest to (snarkily) say “well, you should at least pre-allocate a reasonable buffer size,” and walk away.
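To make the comparison concrete, here's a sketch of both versions side by side: the plain concatenation (which javac rewrites into StringBuilder.append() calls) and the explicit version with the snarky pre-sized buffer. The class and method names are my own, for illustration.

```java
public class ConcatDemo
{
    // plain concatenation: the compiler rewrites this using StringBuilder
    static String concat(String arg1, int arg2)
    {
        return arg1 + " = " + arg2;
    }

    // the hand-written version, with a pre-allocated buffer to skip the
    // default 16-character array (the "at least do it right" retort)
    static String explicit(String arg1, int arg2)
    {
        StringBuilder buf = new StringBuilder(32);
        buf.append(arg1).append(" = ").append(arg2);
        return buf.toString();
    }

    public static void main(String[] args)
    {
        System.out.println(concat("x", 12));    // prints "x = 12"
        System.out.println(explicit("x", 12));  // prints "x = 12"
    }
}
```

Run both through javap -c and you'll see the same sequence of StringBuilder.append() invocations in the concatenation case.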

These arguments are not caused by the time quantum fallacy. The StringBuilder fan recognizes that different operations take differing amounts of time, and is actively trying to minimize the overall time of his/her code. Never mind that, in context, string concatenation is rarely a significant time sink. The rationale is always “it can't hurt, and might help.”

This is the same attitude that drives curbside recycling. It feels good to put an aluminum can in the bin: the can can be melted down and reused, saving much of the energy needed to smelt the aluminum from ore. Doing so, however, avoids the bigger questions of whether aluminum is the most environmentally sound packaging material in the first place, or whether you should even be drinking the contents of that can.

In the same way, micro-optimizations let you feel good while avoiding the bigger question of what parts of your code should be optimized. Experienced programmers know that not all code should be optimized: there's an 80-20 (or more likely 90-10) rule at work. But finding the 10% that should be optimized is, quite frankly, hard.

It's not that we don't have tools: profilers of one form or another have been around since before I started working with computers. Java has had a built-in profiler since at least JDK 1.2, and a good profiler since (I think) 1.4. For that matter, some carefully placed logging statements can give you a good idea of what parts of your code take the longest (just be sure to use a logging guard so that you're not wasting CPU cycles!).
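A logging guard is just a test of the logger's level before building the message, so the concatenation never happens when the level is disabled. A minimal sketch using java.util.logging (the class and method names are mine):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class TimedOperation
{
    private static final Logger logger = Logger.getLogger(TimedOperation.class.getName());

    public static String process(String key, int value)
    {
        long start = System.nanoTime();
        String result = key + " = " + value;
        long elapsed = System.nanoTime() - start;

        // the guard: skip the message construction entirely unless
        // FINE-level logging is actually enabled
        if (logger.isLoggable(Level.FINE))
        {
            logger.fine("process(" + key + ") took " + elapsed + " ns");
        }
        return result;
    }

    public static void main(String[] args)
    {
        System.out.println(process("x", 42));  // prints "x = 42"
    }
}
```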

The real barrier to profiling — and by extension, intelligent optimization — is coming up with a representative workload. “Representative” is the key word here: you can write a performance test that spends all of its time doing things that your real users never do. Even when you think that you know what the users will do, it's tempting to take shortcuts: to profile an eCommerce application, you might add 10,000 items to a single cart. But the optimizations that you make after this test could be meaningless if the real world has 10,000 carts containing one item. And even if you get that right, there may be different implications running on an empty database versus one that has lots of data.

So yes, it's hard. But ultimately, the brain cycles that you spend thinking about representative workloads will have a greater payoff than those you spend thinking about micro-optimizations. And the end result feels better too, much like forgoing a can of soda for a glass of water.
