Friday, December 18, 2009

Debugging 102: Sherlock Holmes and William of Ockham

So if assumptions are a problem, and they're almost impossible to eliminate once formed, how do you prevent them from being formed? A good first step is to examine every perceived fact with Occam's razor — or more correctly, the interpretation of Occam's razor that says “simpler is more likely.”

Looking back at my “broken” network cable, with razor in hand, I should have asked myself “how did this cable break while sitting on the floor?” There's no simple explanation for that. Cables don't break unless they experience some trauma, yet this cable was stored in an out-of-the-way place. Damage, therefore, wasn't a simple explanation; time to look for something else.

Occam's razor is also known by the tagline “SELECT isn't broken.” The belief that your program is perfect and the compiler, or DBMS, or third-party open-source library is broken is common, especially among neophyte programmers. It takes time and suppression of one's own ego to trust someone else's work, to believe that “if 10,000 programmers can use this compiler without it breaking, my code is to blame.” Of course, sometimes it is the compiler: I've discovered one or two compiler bugs in my career, along with numerous bugs in open-source libraries.

And this is where Sherlock Holmes enters the picture (as quoted from The Sign of the Four):

when you have eliminated the impossible, whatever remains, however improbable, must be the truth

The key part of this quote is “when you have eliminated.” Debugging is a process: start with the most likely situation (per Occam's razor), work to eliminate it as a possibility, then move on. The process fails when you either jump too quickly to an assumption or, as in the case of my network connection, don't work to eliminate your assumptions.
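As a rough illustration only, the process can be sketched in code: order the candidate explanations from most to least likely, try to rule each one out, and stop at the first that survives. The class names, the likelihood numbers, and the idea of expressing a check as a BooleanSupplier are all assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.BooleanSupplier;

/** Hypothetical: one candidate explanation plus a check that can rule it out. */
class Hypothesis {
    final String description;
    final double likelihood;        // rough guess; higher means "simpler"
    final BooleanSupplier ruledOut; // returns true when evidence eliminates it

    Hypothesis(String description, double likelihood, BooleanSupplier ruledOut) {
        this.description = description;
        this.likelihood = likelihood;
        this.ruledOut = ruledOut;
    }
}

class Elimination {
    /** Returns the most likely hypothesis that its check fails to eliminate, or null. */
    static Hypothesis firstSurvivor(List<Hypothesis> candidates) {
        List<Hypothesis> ordered = new ArrayList<>(candidates);
        // Most likely first, per Occam's razor.
        ordered.sort(Comparator.comparingDouble((Hypothesis h) -> h.likelihood).reversed());
        for (Hypothesis h : ordered) {
            if (!h.ruledOut.getAsBoolean()) {
                return h; // could not be eliminated, so it stays on the table
            }
        }
        return null; // everything was eliminated: time for new hypotheses
    }
}
```

The point of the sketch is that every hypothesis carries its own check: nothing is accepted or rejected without actually running the test.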

And one way to eliminate possibilities, even if they're based on strongly-held assumptions, is to apply the “Scientific Process” to your debugging.

Thursday, December 17, 2009

Debugging 101: Assumptions

Story time: this weekend I was configuring a router for my wife's uncle. I needed a live upstream network connection to test it, and happened to have a spare cable plugged into my main router: a gray cable that I'd used recently for a computer without wireless, coiled neatly next to my equipment rack.

I plugged this cable into the new router, and nothing happened. “Oh, that's right, the connector has a broken tab, it probably needs to be pushed back into its socket.” After five minutes of unplugging and replugging the cable, and testing it on my laptop, I decided that the cable simply wasn't working anymore, and grabbed another off the shelf. I configured the router and that was that … until I sat down at my desktop computer and couldn't connect to the Internet.

I had violated the first rule of debugging: don't assume.

In this case I assumed that the gray cable I was busy plugging and unplugging at my main router was the same gray cable that was plugged into the new router. In fact, it was the gray cable that was attached to my desktop computer. The end of the neatly coiled spare cable had long ago fallen onto the floor (it was missing the tab on its connector, remember).

There were plenty of clues staring me in the face. For one thing, the “spare” cable was zip-tied to my equipment rack. When I saw this, I actually said to myself “why did I do that?” Another clue was that only three “connection” lights were lit on the main router. Oh, and then there was the fact that I had two gray cables lying next to each other. But I ignored these clues.

That's the nature of assumptions: once you believe something, you won't let go of that belief until you have incontrovertible evidence to the contrary. Perhaps this was a good thing for early human evolution: it's best to assume that rustling in the brush is something that plans to eat you until you see otherwise. Those who abandoned their assumptions too quickly were removed from the gene pool.

But to a software developer, it's a trait that gets in the way of debugging. Fortunately, there are a few tricks to break assumptions … as long as you remember to use them before the assumption takes hold.

More on those tomorrow.

Friday, December 4, 2009

Integration Tests with Maven

Maven is simultaneously one of the most useful and most frustrating tools that I have ever used. As long as you stay within its “standard build process,” you can do some amazing things with very little configuration. But if you step off that standard, even in a small way, you'll end up in a world of pain. If you're lucky, a short session of Googling will suffice; if unlucky, you'll need to read plugin source code.

In my case, all I wanted to do was create a separate directory of integration tests, and allow them to be invoked from the command line.* I thought that I could get away with just redefining a system property:

mvn test

No luck: Maven used its default directory (confirmed with the -X command-line option). After some Googling, which included a bunch of Maven bug reports, I learned that some plugins load system properties before they load the properties defined in the POM — which I think misses the whole point of command-line defines.

My next step was to take a closer look at Surefire's configuration options in the hope that I could override there. And testSourceDirectory looked like it would work, but its datatype is File, and Maven doesn't try to convert non-String properties.

Some more Googling led me to build profiles. This seemed promising: I could create a named profile, and within it set the test directory to whatever I wanted. There was only one problem: profiles can't override the POM's build directories. If this post ever drops into the hands of the Maven team, I'd really like to know the reasoning behind that decision.**

OK, but perhaps I could use a profile to pass my directory directly to the Surefire plugin? When I first tried it, it even appeared to work (but it turned out that I'd accidentally put source files under the wrong directory tree). Back to the documentation, where I realized that tests are compiled by the compiler plugin, not the Surefire plugin, and the compiler plugin doesn't have any directory config. This is when I downloaded the Surefire plugin code and grepped for testSourceDirectory. It appears to be used only for TestNG; too bad I'm using JUnit.

In the end, I gave up on using a command-line override to run integration tests. I realized that Maven really, really wanted me to have a single test directory, used for all phases of the build. But splitting this directory into unit and integration phases meant I would have to use a naming convention and include/exclude filters. I still rejected this: I have no desire for “big ball of mud” source trees.

Instead I created a sub-project for the integration tests. Of course, it wasn't a true sub-project: Maven is very particular about parent-child relationships. Instead, I just created an "itest" directory within my main project directory, and gave it its own POM; Maven's transitive dependency mechanism really helped here. It took about 10 minutes, versus the 4+ hours I'd spent trying to override the test directory. And it didn't affect the mainline build; for once I was happy that Maven ignores things that it doesn't expect.

* Yes, I know that Maven has an integration-test phase. But after reading about the configuration needed, I decided that it would be a last resort.

** Update (7-Apr-10): last night I attended a presentation by Jason van Zyl, creator of Maven and founder of Sonatype. And his answer to why Maven 2 doesn't allow profiles to override all build elements was that they didn't see a use case at the time: integration tests are outside of the normal build flow — except that now they're finding a need to do just that when building the Hudson build-management tool. For myself, I've come to accept sub-projects as the simplest solution.

Thursday, December 3, 2009

Why Write Open Source Libraries

I just created my third open-source project on SourceForge, S34J. It's a set of a half-dozen objects that encapsulate the calls for Amazon's Simple Storage Service (S3) — at least, it will be once I finish the code. My other two projects are PracticalXML, a utility library hiding the (often painful) Java XML API, and SwingLib, a library of Swing GUI enhancements that currently has three classes (mostly because I haven't taken the time to upload more).

Other than PracticalXML, I don't expect anyone to ever use these libraries. And for PXML, I don't expect anyone other than the other maintainers, all former coworkers, to use it. So why write them, and why take the time to create a SourceForge project? The answer can be found in the project description for SwingLib:

Classes that I've written for Swing programming. Possibly useful for other people.

Throughout my career, I've written the same code over and over again: for example, a method that creates an XML element that inherits the namespace of its parent. Simple code, a matter of a few minutes to write, but after the third or fourth time it gets annoying, particularly if you have several dozen such methods. And once I type that code on an employer's computer, it becomes their property; I can't simply take a copy on to my next job (I'll note here that ideas, such as an API, are not protected by copyright; I always make a DomUtil class, and it always has an appendChildInheritNamespace() method, but the implementation has been from-scratch each time).
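For readers who haven't written one, here is a minimal sketch of what such a method might look like, using only the standard org.w3c.dom API. It is not the PracticalXML implementation: the signature, the prefix handling, and the newDocument() helper are assumptions made for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical sketch of a DOM helper class. */
class DomUtil {
    /** Creates an empty, namespace-aware DOM document (helper for the example). */
    static Document newDocument() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("unable to create DOM document", e);
        }
    }

    /**
     * Appends a new child element to the parent, giving the child the
     * parent's namespace URI and prefix (if it has them).
     */
    static Element appendChildInheritNamespace(Element parent, String localName) {
        String nsUri = parent.getNamespaceURI();
        String prefix = parent.getPrefix();
        // Reuse the parent's prefix so the serialized form stays consistent.
        String qname = (prefix == null || prefix.isEmpty())
                     ? localName
                     : prefix + ":" + localName;
        Element child = parent.getOwnerDocument().createElementNS(nsUri, qname);
        parent.appendChild(child);
        return child;
    }
}
```

Given a root element created as "ex:root" in the namespace "urn:example", calling appendChildInheritNamespace(root, "item") appends a child named "ex:item" in that same namespace.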

An auto mechanic acquires his tools over a lifetime, and takes them from job to job; they don't belong to the shop where he works. By releasing this code as open source, I can do the same. And, who knows, someone else might stumble on it and decide it's useful.

Tuesday, December 1, 2009

Building a Wishlist Service: The Data Access Layer

I'm a huge fan of hiding persistence behind a data access layer. In particular, a data access layer that provides the basic CRUD operations — create / retrieve / update / delete — in a form slightly less granular than the actual domain objects.

I adopted this attitude after realizing that a typical business application has different types — levels — of rules. At the top are the “business” rules that describe how data is to be transformed: for example, a withdrawal or deposit changes the value of a bank account. Below those are the “logical model” rules describing how objects interact: a deposit or withdrawal can't exist without an account. And at the bottom are the “physical model” rules of how data is structured: a transaction consists of an account number, an operation, and a dollar amount.
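To make the three levels concrete, here is a small sketch built on the bank account example. All of the names are hypothetical, and storing amounts as a count of cents is an assumption of the sketch (the text only says "a dollar amount").

```java
import java.util.HashMap;
import java.util.Map;

enum Operation { DEPOSIT, WITHDRAWAL }

/** Physical model: a transaction is an account number, an operation, and an amount. */
final class Transaction {
    final String accountNumber;
    final Operation operation;
    final long amountCents; // assumption: cents, to avoid floating-point rounding

    Transaction(String accountNumber, Operation operation, long amountCents) {
        this.accountNumber = accountNumber;
        this.operation = operation;
        this.amountCents = amountCents;
    }
}

class Ledger {
    private final Map<String, Long> balances = new HashMap<>();

    void openAccount(String accountNumber) {
        balances.putIfAbsent(accountNumber, 0L);
    }

    long balanceOf(String accountNumber) {
        Long balance = balances.get(accountNumber);
        if (balance == null) {
            throw new IllegalArgumentException("no such account: " + accountNumber);
        }
        return balance;
    }

    void apply(Transaction tx) {
        // Logical-model rule: a deposit or withdrawal can't exist without an account.
        long current = balanceOf(tx.accountNumber); // throws if the account is unknown

        // Business rule: a deposit or withdrawal changes the value of the account.
        long delta = (tx.operation == Operation.DEPOSIT) ? tx.amountCents : -tx.amountCents;
        balances.put(tx.accountNumber, current + delta);
    }
}
```

In a real application the three kinds of rule would live in different places; they are squeezed into one class here only to show them side by side.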

In my view, the data access layer should manage the logical model. For the product list service, it implements rules such as “adding an item to a list will create that list if it doesn't already exist.” I could do this in the database, using triggers, but that means writing new code for each DBMS, along with additional work for deployment or upgrades. Or I could do it in the high-level service code, manipulating the domain objects directly, but that would result in duplicated code. Using the data access layer, the high-level code is responsible only for populating and formatting the domain objects, while the database is responsible only for persistence.

Along with providing a place to hang rules, a data access layer simplifies testing: it allows the use of mock or stub objects in unit tests, thereby avoiding any actual database dependencies. Prior to developing the product list service, however, I had never used them in this way. For most data-driven applications, I would rely on integration tests to exercise a real persistence layer.
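A minimal sketch of the stub approach, assuming a hypothetical one-method data access interface and a hypothetical piece of high-level code that formats its results. The check is written without a test framework so that it stands alone; in practice it would be a JUnit test method.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical data access interface; the name and method are illustrative only. */
interface ProductListReader {
    List<String> retrieveItems(String listName);
}

/** High-level code: formats whatever the data access layer returns. */
class ProductListFormatter {
    private final ProductListReader reader;

    ProductListFormatter(ProductListReader reader) {
        this.reader = reader;
    }

    String format(String listName) {
        return listName + ": " + String.join(", ", reader.retrieveItems(listName));
    }
}

class ProductListFormatterCheck {
    /** Framework-free stand-in for a unit test method. */
    static void formatsItemsReturnedByStub() {
        // The stub returns canned data, so no database is involved.
        ProductListReader stub = listName -> Arrays.asList("book", "lamp");
        String result = new ProductListFormatter(stub).format("gifts");
        if (!result.equals("gifts: book, lamp")) {
            throw new AssertionError("unexpected output: " + result);
        }
    }
}
```

Because the formatter sees only the interface, the same code runs unchanged against a real persistence layer in an integration test.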

The product list service was different, primarily because I wanted to focus on client-server interaction, and didn't want to think about a persistence framework until later in the development process. So I spent a couple of hours developing a stub data access layer that used HashMap for storage, along with a suite of tests that allowed me to work through the rules of data management. My intent was that these could start as unit tests, then migrate to implementation tests once I added a real persistence framework.
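A HashMap-backed stub of this kind might look something like the sketch below. It is not the actual code from the service: the class name, the method set, and the use of plain strings for items are all assumptions. It does show where a logical-model rule such as "adding an item creates the list if it doesn't already exist" would live.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory data access object, keyed by list name. */
class InMemoryProductListDao {
    private final Map<String, List<String>> lists = new HashMap<>();

    /** Logical-model rule: adding an item creates the list if it doesn't exist. */
    void addItem(String listName, String item) {
        lists.computeIfAbsent(listName, k -> new ArrayList<>()).add(item);
    }

    boolean listExists(String listName) {
        return lists.containsKey(listName);
    }

    /** Returns a copy of the items, or an empty list when the list is unknown. */
    List<String> retrieveItems(String listName) {
        List<String> items = lists.get(listName);
        if (items == null) {
            return Collections.emptyList();
        }
        return new ArrayList<>(items);
    }

    /** Removes one occurrence of the item; false if the list or item is absent. */
    boolean deleteItem(String listName, String item) {
        List<String> items = lists.get(listName);
        return items != null && items.remove(item);
    }
}
```

Returning a copy from retrieveItems() keeps callers from modifying the stub's storage behind its back, which is roughly what a real persistence layer would enforce anyway.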

Going into the project, I had assumed that persistence would be handled via a relational DBMS, probably using Hibernate as an object-relational mapper.* As I proceeded with development, however, I realized that there was no good reason to do this: as long as I had some way to assure sequenced, atomic commits, I had no need for a relational database. And because I was coding to the data access layer rather than “active record,” I had the freedom to explore this option.

That exploration is worth its own posting.

* In my opinion, Hibernate's chief benefit is that it hides the actual DBMS, meaning that it's (relatively) easy to implement portable applications — a big win if you're planning to sell your app, not so big if you control its deployment. In the case of the product list service, the object model is very simple, so I wrote a data access implementation using JDBC talking to an in-memory HSQLDB database. The goal was to verify that I wasn't doing anything that relational databases couldn't support. It didn't take long to write, but I remembered just how painful explicit SQL can be.