Tuesday, January 19, 2010

Debugging 301: Into the Code

You've tried to write a Small, Self-Contained Example program and failed. Whatever the problem is, it only manifests in your full-scale application. What now?

The Scientific Method remains your best tool, but rather than trying to fit a hypothesis to the available data, it's now time to make predictions based on your hypotheses and then, using either log messages or a debugger, watch your program execute until you find the place where a prediction turns out to be wrong. I've written elsewhere about logging, so this post will focus on running your code in a debugger.

The power of a debugger comes from the fact that you can examine the entire state of your program and then change it at will. In some cases, you'll be able to restart methods to see how their execution changes with different starting conditions. However, this power is also a hindrance: it's easy to get overwhelmed by the amount of information available, and even easier to make assumptions and then spend hours attempting to prove them correct.
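
To make that concrete, here's a minimal sketch using jdb, the command-line debugger that ships with the JDK; the variable names are invented for illustration, and graphical debuggers such as Eclipse or IntelliJ expose the same operations through their variables and watch views. Assume the program is already stopped at a breakpoint:

    locals                  (list the local variables in the current frame)
    print batchSize         (inspect a single variable)
    dump this               (show all instance fields of the current object)
    set batchSize = 1       (change a value before continuing)
    reenter                 (pop the current frame and re-run the method from the top)
    cont                    (resume execution)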

The key is to form a hypothesis, and then look only at the code and data that are affected by that hypothesis. For example: you've written code that's supposed to upload data to a server, but it doesn't. You could waste a lot of time tracing the code, but it's more efficient to come up with a few hypotheses:

  1. You're not actually running the upload code.
  2. The upload fails due to a communication problem between client and server; the data never gets to the server.
  3. The server is unable to process the file.

Hypothesis #1 is easy to falsify: put a breakpoint at the start of the upload code and run your program. If the breakpoint is hit, that hypothesis is falsified and you can set it aside. If you don't hit the breakpoint, then you need to come up with new hypotheses to explain why (note, however, that falsifying this hypothesis says nothing about hypotheses #2 and #3; you could have more than one bug).
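
In jdb terms, the whole experiment is two commands (the class and method names here are placeholders for your own upload code); in an IDE, it's a click in the margin of the method's first line followed by launching a debug session:

    stop in com.example.Uploader.upload    (breakpoint at the start of the upload method)
    run                                    (start the program and trigger the upload)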

Hypothesis #2 is more difficult to falsify, because there could be many reasons for the upload to appear successful even though the data never arrives: library code that silently ignores an exception, for example. It might make sense to step through the code, but again, chasing every possibility can waste your time. Better to use a tool that can directly disprove the hypothesis: a TCP monitor, or the logging that's built into your communications library.
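
As an illustration of that last option: if your client happens to be built on Apache HttpClient 4.x with Log4J, a configuration along these lines turns on wire-level logging, showing exactly what was written to and read from the socket (the logger names come from HttpClient's logging documentation; other libraries have their own switches):

    log4j.rootLogger=INFO, console
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d [%t] %-5p %c - %m%n

    # show request/response headers and the raw bytes on the wire
    log4j.logger.org.apache.http.headers=DEBUG
    log4j.logger.org.apache.http.wire=DEBUG

If the request never appears on the wire, the problem is in the client; if it appears but the response is an error, you've moved on to hypothesis #3.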

If you get to hypothesis #3, it will soon be time to generate a whole new set of hypotheses covering the different cases that would cause the server to reject your file. And it's likely that the debugger will no longer be the best tool, because now you have enough information to create test cases that validate your code, or an SSCEP that validates the server code.
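
Suppose, purely for illustration, that the server rejects files that are missing a header row. That case becomes a small JUnit test against the server-side validation code (UploadValidator and ValidationResult are stand-ins for whatever classes your server actually uses):

    import org.junit.Test;
    import static org.junit.Assert.*;

    public class UploadValidatorTest
    {
        // stand-in for the server-side class that decides whether an upload is acceptable
        private final UploadValidator validator = new UploadValidator();

        @Test
        public void rejectsFileWithoutHeaderRow() throws Exception
        {
            byte[] content = "1,example\n".getBytes("UTF-8");
            assertFalse("file without header row should be rejected",
                        validator.validate(content).isAccepted());
        }

        @Test
        public void acceptsWellFormedFile() throws Exception
        {
            byte[] content = "id,name\n1,example\n".getBytes("UTF-8");
            assertTrue("well-formed file should be accepted",
                       validator.validate(content).isAccepted());
        }
    }

Each rejection case you discover gets its own test, which also keeps the bug from quietly coming back later.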
