Friday, May 29, 2009

TDD Training Debrief

I'm going to interrupt the postings on SkipList, because the TDD class is over. Did I learn anything?

I came into class thinking that I had a fairly good understanding of TDD, albeit as a skeptic that it could directly drive design. I left thinking the same thing. I'm sure there were some pieces of information that I picked up without realizing — it's impossible for that not to happen if you have a good instructor. And I was exposed to Fitnesse, a tool that will probably be more useful in a future career stop. But I came in test-infected, left test-infected, and didn't have a revelation in the middle. My thoughts may not have changed, but the class pushed me to think deeply about the subject again, and test those thoughts.

Where I thought the class was tremendously valuable was in interacting with Uncle Bob. He's a lot more pragmatic in person, admitting that there are places where test-driven design falls down (e.g., TDD will give you bubble-sort, not quicksort). I can understand this difference in personality: written words have to stand on their own for all time, presentations are marketing (of ideas, if not of self), while the classroom is a place for discussion.

The biggest revelation, however, came from observing my fellow students. At any given time, about half of them were working on other things: responding to email, reviewing defects, writing code for their current projects, and generally ignoring what was happening around them. And to be honest, during the last example of the first day I joined them: a minor crisis had appeared in my inbox over lunch, and I decided that it was more important to respond to that crisis than to work through the example — after all, I already knew this stuff, right?

Uncle Bob finished the class with a talk on what he saw as the future of TDD. He used an interesting example: in the 1970s there was a lot of argument about the value of structured programming, yet today every mainstream language uses structured constructs — “goto” is no longer considered harmful, because it no longer exists. His opinion is that 30 years from now we won't be discussing TDD either, because it will be standard practice.

I think there's a flaw in this reasoning: computer languages are created by language designers, made whole by compiler writers, and are in effect handed down from on high for the rest of us. Even Fortran and Cobol have evolved: while there may be dusty decks filled with gotos, code written yesterday (and there is some) uses structured constructs. TDD won't follow that path, unless Eiffel becomes a mainstream language, because of the commitment that it requires.

Anything that requires commitment also has costs: what do I have to invest to travel this path (direct cost), and what do I have to give up (opportunity cost)? In an economics classroom, choices are easy: you add the direct and opportunity costs, and take the low-cost option. In real life, it's often hard to identify those costs — most people consider staying in the same place to have zero cost. And given that, there's no reason to change.

I can't remember when I became test-infected: I know that I was writing test programs to isolate defects some 20 years ago. As I read the testing literature that started to appear around 1999, a lightbulb went on in my head, and I tried writing my mainline code with tests. And learned that no, it didn't take me any longer, and yes, I found problems very early. In other words, that TDD was in fact the low-cost option.

Tuesday, May 26, 2009

TDD and the SkipList (part 1)

I'm not a purist when it comes to test-driven development, although I do believe that tests validate design as well as code. My mantra is “if it's hard to test, it will be hard to use” (which is disturbingly close to “if you can dodge a wrench, you can dodge a ball”). But the purist approach, as exemplified by “Uncle Bob” Martin, is that writing your tests first will actually lead to a better design — the term they use is “emergent design.”

The Bowling Game is Uncle Bob's canonical example. While Big Design Up Front (BDUF) looks at the real world and posits the need for objects such as Frame, TDD approaches from the simple goal of scoring the game, and ends up with a single Scorer class. A neat demonstration, but the immediate response is that it's trivial, and you're not likely to be writing a scorekeeping application for your bowling league. What about real applications?

Last year I attended one of Uncle Bob's presentations on clean code. As an example of code needing cleanup, he used excerpts from his own Fitnesse tool. And while I have a great deal of respect for people who point out the flaws in their own work, I had to ask: “didn't you develop this using TDD?” His answer was yes, but that TDD doesn't guarantee clean code. Not an answer to give a warm fuzzy feeling, particularly since most of the unclean bits simply needed better factoring — which is supposed to be TDD's sweet spot.

But my real break with TDD orthodoxy came when I tried to implement a SkipList using pure TDD. For those who didn't follow the link, a SkipList is a data structure that holds its elements according to their natural ordering, and provides O(logN) access. It does this with multiple tiers of linked lists: every element is linked into the bottom tier, and each higher tier has roughly half the elements of the tier beneath. I say “roughly,” because elements are randomly assigned to tiers (following a log2 distribution). To find an element, you start at the topmost tier and work your way right and down until you get to the bottom tier and run into an element whose value is larger than the one you seek.
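The tiered structure described above can be sketched in a few dozen lines. This is a minimal illustration, not the implementation discussed in these posts: the class name SkipListSketch, the cap of 16 tiers, the restriction to int values, and the rejection of duplicates are all assumptions made to keep the example short.

```java
import java.util.Random;

/** Minimal skip list of ints. A sketch only; names and limits are assumptions. */
class SkipListSketch {
    private static final int MAX_TIERS = 16;    // assumed cap on the number of tiers

    private static final class Node {
        final int value;
        final Node[] next;                      // next[i] is the successor on tier i
        Node(int value, int tiers) {
            this.value = value;
            this.next = new Node[tiers];
        }
    }

    // The head's value is never compared; it only anchors every tier.
    private final Node head = new Node(Integer.MIN_VALUE, MAX_TIERS);
    private final Random random;
    private int size;

    SkipListSketch(Random random) {
        this.random = random;
    }

    /** Every element is on tier 0; each higher tier is reached with probability 1/2. */
    private int randomTiers() {
        int tiers = 1;
        while (tiers < MAX_TIERS && random.nextBoolean()) {
            tiers++;
        }
        return tiers;
    }

    /** Start at the top tier, move right while the next value is smaller, then drop down. */
    boolean contains(int value) {
        Node node = head;
        for (int tier = MAX_TIERS - 1; tier >= 0; tier--) {
            while (node.next[tier] != null && node.next[tier].value < value) {
                node = node.next[tier];
            }
        }
        Node candidate = node.next[0];
        return candidate != null && candidate.value == value;
    }

    /** Same walk as contains(), remembering the last node visited on each tier. */
    boolean add(int value) {
        Node[] update = new Node[MAX_TIERS];
        Node node = head;
        for (int tier = MAX_TIERS - 1; tier >= 0; tier--) {
            while (node.next[tier] != null && node.next[tier].value < value) {
                node = node.next[tier];
            }
            update[tier] = node;
        }
        Node candidate = node.next[0];
        if (candidate != null && candidate.value == value) {
            return false;                       // assumption: duplicates are rejected
        }
        Node added = new Node(value, randomTiers());
        for (int tier = 0; tier < added.next.length; tier++) {
            added.next[tier] = update[tier].next[tier];
            update[tier].next[tier] = added;
        }
        size++;
        return true;
    }

    int size() {
        return size;
    }
}
```

Passing the Random in through the constructor is a deliberate choice here: a seeded generator makes the tier assignment repeatable, which matters as soon as you want a test that looks at the tiers.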

A SkipList doesn't implement the java.util.List API, because it doesn't preserve the insertion order of its elements. It does, however, implement the java.util.Collection API, and so I approached it through this public API, doing the simplest thing that could possibly work to make the tests pass. However, this wasn't going to get me to a SkipList: I could make all the tests pass using an ArrayList with linear search and insert. While I could actually implement a SkipList back end, I would break the rule of writing only enough code to make a single test pass. Clearly, testing the external interface wasn't enough; I had to look inside the implementation.
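To make the problem concrete, here is roughly what "the simplest thing that could possibly work" looks like. The class name SortedCollectionSketch is hypothetical, and this is a reconstruction of the idea rather than the code I actually wrote. Any black-box test written against java.util.Collection will pass, yet there is not a tier in sight.

```java
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Keeps elements in natural order using an ArrayList. Hypothetical sketch. */
class SortedCollectionSketch<T extends Comparable<T>> extends AbstractCollection<T> {
    private final List<T> elements = new ArrayList<>();

    /** Linear scan for the insertion point: O(N), not the O(logN) a SkipList promises. */
    @Override
    public boolean add(T value) {
        int index = 0;
        while (index < elements.size() && elements.get(index).compareTo(value) < 0) {
            index++;
        }
        elements.add(index, value);
        return true;
    }

    /** contains() and remove() are inherited and walk this iterator linearly. */
    @Override
    public Iterator<T> iterator() {
        return elements.iterator();
    }

    @Override
    public int size() {
        return elements.size();
    }
}
```

The only thing that separates this from a real SkipList is performance, and a functional test against the public API cannot see that.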

I posted this quandary in a comment to one of Uncle Bob's blog entries several years ago, and received the astounding (to me) reply that yes, I should open up the class and look inside it. My reply was perhaps a bit more sarcastic than needed, but I felt my point was valid: isn't encapsulation one of the reasons that we use object-oriented languages? Perhaps that explained some of the ugly code in Fitnesse.

One way to preserve encapsulation while still enabling white box testing is to program to interfaces: SkipList becomes a public interface, implemented by SkipListImpl. Mainline code uses the interface, test code uses the implementation. Of course, this implies the existence of a factory, because you need some way to create instances. Is this really what we want for a utility class?
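In outline, that arrangement might look like the following. Only SkipList and SkipListImpl are names from the paragraph above; the SkipLists factory, the tierCount() hook, and the ArrayList standing in for the real tiers are assumptions added so that the sketch compiles.

```java
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/** What mainline code sees. */
interface SkipList<T extends Comparable<T>> extends Collection<T> {
}

/** The factory that the interface forces into existence (hypothetical name). */
final class SkipLists {
    private SkipLists() {
    }

    static <T extends Comparable<T>> SkipList<T> create() {
        return new SkipListImpl<>();
    }
}

/** What white-box tests see. Package-private, so mainline code cannot depend on it. */
class SkipListImpl<T extends Comparable<T>> extends AbstractCollection<T>
        implements SkipList<T> {
    // Stand-in for the tiered linked lists; an assumption to keep the sketch short.
    private final List<T> standIn = new ArrayList<>();

    @Override
    public boolean add(T value) {
        int index = Collections.binarySearch(standIn, value);
        standIn.add(index < 0 ? -index - 1 : index, value);
        return true;
    }

    @Override
    public Iterator<T> iterator() {
        return standIn.iterator();
    }

    @Override
    public int size() {
        return standIn.size();
    }

    /** Hypothetical hook for tests in the same package; a real version would report its tiers. */
    int tierCount() {
        return 1;
    }
}
```

Mainline code would call SkipLists.create() and hold the result as a SkipList, while a test in the same package could cast to SkipListImpl and inspect tierCount(). That is three types where one used to do.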

An interesting side-effect of programming to interfaces is that we now need two tiers of tests: one that works with interfaces, one that works with the implementation classes. To me, this seems to be a move away from test-driven development, into the realm of testing for validation.

Back to the SkipList: if I wanted to develop the core linked-list code via TDD, I was going to have to extract that code and test it separately. This is one of the claimed benefits of TDD: it leads you to small, well-factored classes. But again, is this really what we want for a utility class? Or, to rephrase, do we want a library that exposes a lot of public classes used by only one consumer? My inner database developer is wary of 1:1 relationships.

I never saved my first attempt, so this weekend I started fresh; my next post will look at what TDD gave me. I decided to do this, not because I particularly need a SkipList, but because today I start a three-day class on TDD taught by Uncle Bob. I doubt there will be time in the class to do an actual implementation, but I plan to throw it out as a question. I'd like to see if I'm missing something or if the TDD purists are.

Wednesday, May 20, 2009

Tokyo at Night

Back in the 1980s, automakers started to get inventive with dashboard design. The traditional American dash, with its large speedometer front-illuminated by white lights, fell out of fashion as European imports appeared with ergonomic layout and lighting schemes such as backlit orange numerals that didn't damage the driver's night vision. Released from the fetters of convention, and with new materials such as LCDs at their disposal, dashboards underwent a dramatic transformation. In many cases, for the worse, as dials were replaced by bright moving bars. The automotive press derided these as looking like Tokyo at night.

Here we are in the new millennium, sitting in front of computers that have evolved from the monochrome screens of the “dumb terminal” era, through colored text displays, to graphical interfaces that can display millions of colors. And UI designers great and small, released from the fetters of convention, have responded by making use of color:

[Image: ls --color]

Does anyone find that distracting? I have no clue what half the colors mean, and that dark blue just seems to disappear into the background. One of my first tasks when moving to a new environment is unalias ls.

Text editors are equally garish. Out of the box, Eclipse uses a mixture of strident color and boldfaced fonts:

[Image: Eclipse default color scheme]

Boldface text is meant to attract attention. When it's used for a method signature, or a Java keyword, it takes your attention away from what's actually happening in the code. Fortunately, we can correct this:

[Image: My Eclipse color scheme]

Yep, it's boring: almost everything has the same tonal values. In fact, it took me several tries before I found a file that showed more than two colors. Java keywords are just text; I don't care about them, so don't do anything to highlight them. Function calls are important, so get highlighted, but not too much. Literal text is important and probably shouldn't exist, so it's highlighted with a shade of red (see if you can find the empty string). And I want to avoid auto-boxing, so it gets highlighted in bright red. At the other extreme, HTML markup within doc-comments almost disappears: I like the “computer text” in JavaDoc, but when I'm in the editor I don't want to see it.

Am I the only person to feel this way?

Tuesday, May 19, 2009

The Journey of a Thousand Miles Begins With sldfjouwe;ji

A long time ago I learned a technique to break writer's block: type random characters, followed by random words, followed by words and sentences that start to make sense. The idea is to get your fingers used to typing, jump-start your brain, and maybe generate some good ideas along the way. As long as you delete all the gobbledygook, and don't find yourself typing “all work and no play makes Jack a dull boy,” it's at worst harmless and may lead to good things. So when I have a writing project, that's the way I start.

I've also adapted this technique to programming: every morning I sit down at my home computer, pick one of my many unfinished projects, and spend an hour working on it. Whether I accomplish a lot or a little in that hour doesn't really matter; the goal is to flip a mental switch. Afterwards, I often find new ideas popping into my head on the drive to work (other drivers should be glad that I don't carry paper to write them down).

This hour every morning serves another purpose that's at least as important: it lets me actually create something. My workday is filled with answering the question “can we do this, and how much will it cost?” It's easy to forget the joy of actually making something: an article for my website, a blog entry, even a Java class that will never be seen by anyone else. All it takes is an hour each morning to remind me why I ended up in this career.

Tuesday, May 12, 2009

Stepping Into Space

I don't think I'll ever go skydiving. I might really like it: I liked spinning an airplane, and like going fast on a motorcycle, and freefall seems to be a combination of these. But there's a nagging fear that the chute won't open. It's not a fear of death per se, but an understanding of just how much time it takes to drop that last 3,000 feet.

About 15 seconds, which is a long time when there's nothing you can do but enjoy the view.

Leaving a job without your next one lined up is sometimes described as jumping without a parachute. A friend of mine did it a few weeks ago, and I've done it several times during my career. It's definitely scary, and you may have the nagging fear that you won't find another job.

The difference is that you can do something while falling. You may not find the job you want right away, but that's OK: it gives you more time to write blog entries. Or paint the house.

And you can still enjoy the view.

Friday, May 1, 2009

Book Review: Dreaming in Code

Dreaming in Code has been out for a few years now, and it's received glowing reviews, some of which are linked from its website. This is not one of those reviews: I ended up returning the book to the library after getting barely halfway through. I found it boring, so much so that I preferred reading about the history of paint.

DiC is often compared to Tracy Kidder's Soul of a New Machine. The latter book came out while I was in high school, and I consider it one of the influences that led me into the computer industry. Superficially, the two books are very similar: they're both histories of a high-tech project, written for a non-technical audience. Both chronicle the successes and failures of the project, both profile the people involved. And yet, SoaNM just does it better. Before writing this post, I opened my copy to a random page for insight … and read to the end of the chapter. Nearly thirty years after it was written, and with the knowledge that I've gained in that time, it still held my interest. Why? And why didn't DiC?

The answer, I think, starts in the prologue. Rosenberg writes of his own experiences programming. Kidder takes us onto a sailboat, running before a storm, and introduces us to Tom West. Every book needs a protagonist, and Tom is ours: we learn of his character, and the events of the prologue are a foreshadowing of the rest of the book. Turn to any page, and Tom West is nearby. Rosenberg, by comparison, lacks a protagonist: Mitch Kapor is a visionary on the fringes, and there is a succession of project managers and programmers who never really step up to the role. Perhaps that's why the project failed, but it doesn't make for good reading.

Not only does DiC not have a protagonist, it doesn't have a crisis. The team in SoaNM is working against a deadline, against internal competition, and against their own limitations. A conventional plotline, but it works, and it moves the book forward. By comparison, the team in DiC drifts around, throws out ideas to see if they stick, and then drifts away. I'm sure there was pain and angst in the Chandler offices, but that never makes its way to the page.

Or perhaps it does, but is diluted by the mass of explanatory material that Rosenberg injects into the book. And here we come back to a difference obvious from the prologues: Rosenberg is a part of the computer industry, even if his job has been to write about it. Kidder is a journalism professor, who likely had never used a computer before this book. Both need to present technical material in their books, and in rereading, I'm amazed at just how dense the technology is in SoaNM. But the difference is telling: Rosenberg looks at the material with an insider's eye, while Kidder is an outsider who has to learn before writing. And in so doing, he distills the content, giving enough detail for the novice, but not so much that the knowledgeable reader is bored.

After three weeks with Dreaming in Code, I gave up. The book was due back to the library, and I didn't have a lot of interest in continuing to read it. Based on what I've read about the project elsewhere, that seems appropriate.