CS 2 -- Debugging Techniques


Debugging

A story rolls around the annals of computer science about the origins of the word ``bug''. The story claims that a few years back, when computers took up whole rooms, one of those mighty dinosaurs started malfunctioning. It took engineers a long time, but finally, they discovered the problem. A small insect had gotten into the computer systems. In its unfortunate demise, it had shorted some of the circuits and caused a system failure. Whether or not the tale is true, the bugs of programs, as they are none-too-affectionately termed, have scourged programmers as long as they have been programming.

One of the skills that every programmer develops over his or her career is how to detect bugs and debug programs. One small bug can reverberate through a program with incredible effects. In fact, in a language such as C, a single missing quote can wreak havoc on a programmer's weekend.


Program Life Cycle

The program life cycle goes through the following stages: design, implementation, testing, debugging. The debugging stage requires the programmer to restart the cycle at either the design or implementation stage. Software engineering tells us we should attempt to minimize the total time spent on the entire cycle. Obviously, the best way to do this is to write a correct program right from the start. Now we all know that this is impossible in practice, but we should always keep this ideal in the back of our minds. How does this help us?

This is where the module comes in. A module is a small piece of a program that forms a conceptual whole and has a well-defined interface to the rest of the program. A module allows us to divide the program life cycle into smaller module life cycles. The same steps are involved, but the steps are much smaller and much more managable. Because they may be designed, implemented and tested independently, modules help keep debugging localized and debugging time to a minimum. They also serve as a good way to break the monotony of debugging a huge program, by allowing programmers to spread out debugging and intersperse it with coding time. Finally, modules allow us to decrease the total debugging time of the program, because we will better know where the source of the bug is. After all, if we have 3 modules, 2 of which we know are correct, and a bug pops up, where is the bug going to be?

Certainly modules are a huge help. But what about when our program cannot be easily broken up into modules? Or when the problem is in a single module. Is the previous paragraph a complete waste? No, we can still use the principles of debugging as we go, just go one function or one part of the algorithm at a time. We can also use the ideal of writing a correct program the first time. Do you understand the algorithm completely? Debugging a complex algorithm you don't understand is nearly impossible. Did you fully correct the compile time errors and warnings? Compiler errors and warnings are designed to help programmers find bugs before run time. They can save a lot of time. If you don't understand an error message, please ask your TA to explain it.

In any large programming project, programmers will write debugging support code (``debug code'') that will not be released in the final program. The idea of debug code is to make sure the program is correct at a certain point, or to make sure a data structure has correct values. For example, if you are building a linked list module, you should write debug code to print out the entire list. This code will never be released to the public, but will be invaluable to you at debugging time. Continuing the example, say your linked list routines fail when searching. Before possibly wasting time looking at a correct search routine, does the add routine work? One call to your debug routine at the beginning of your search will clear up any doubt of this.


Localized Debugging

It is unclear whether or not there is or should be a set procedure on debugging. However, there does seem to be an underlying scheme to it. It is obvious that the first step is to locate the bug. However, this is where most troubles arise. The standard solution to this problem is to set many breakpoints and watch where the program fails. Sometimes these appear as actual breakpoints in a symbolic debugger, but often, programmers resort to the almighty printf("DEBUG 1\n"); as a breakpoint.

In fact, there is much more that can be done before applying breakpoints to a particular program. The first step to fixing a bug is to find out exactly when a program fails. Second, attempt to locate the bug using that information and the other available resources. Remember that where a program crashes is not necessarily the location of the bug. And of course, third, fix the bug.

How does it fail?

The first step in every debugging process is to analyze the program execution for any information it might yield. Think about the algorithm and its data structures, and try to find whether there are any particular scenarios where it is likely to break down. As an example, let's use a calculator program that seems to be crashing at random times. In any mathematical program, the case of zero should be considered. Division by zero can appear in many places including the modulo and tangent functions. Test the zero case, is the program crashing in one of those situtations? How about the same function without a zero? This example seems a little simple, but it applies across the board.

Let's consider another example, that of a linked list. Whenever a program using a linked list is crashing, always consider the special cases: the head, the middle and the tail. If the program crashes during deletion, consider what happens when deleting from the head of the list. Are all the links being updated correctly? How about the addition? Is that executed correctly? Practically every operation on a linked list can have a special case in one of the three situations.

A more specific example involves string operations. Many times, the size allocated for a string falls a little short of the necessary amount, and the repercussions are felt throughout the program. Imagine, for example, a string that is too short to contain the '\0' character at the end. When applying a strcpy() function from that string to another, the function will continue copying, writing over memory, until it finds a '\0' somewhere in memory! Try strings of different sizes in cases where you think it might be sensitive. Does it fail for a particular size only?

Where does it fail? (I)

All right, we know as much as we can from the program execution. The next step is to figure out exactly where the program fails. Rather than automatically apply breakpoints, consider the algorithm and its implementation. The program execution should have revealed something about the situation in which the program fails. Where is that code implemented? Are there any obvious reasons why the program might be failing? For example, is it obvious the that algorithm is implemented incorrectly? How about memory; could the memory have been allocated improperly (or not at all)? Or, in the case of a linked list, think about link updates; are those being done correctly?

Always try to think about the program to discover the bug. Instant reaction may work faster in smaller programs (say fifty lines or less), but in a program thousands of lines long, insight is the key. If nothing else, analyzing the program execution should help you narrow down the general region of the bug.

Where does it fail? (II)

If it is still unclear where the program fails, then breakpoints may be the only way to find the location. However, many programmers apply breakpoints gratuitously, leading to screens full of unnecessary debugging information. This, unfortunately, is the wrong way to go about debugging. All it produces is a headache from the time you have to spend debugging the debugging output!

There are three things to remember about applying breakpoints. First, breakpoints should always be applied in passes (a few at a time) rather than all at once. Second, while not always obvious, there are certain critical steps in every segment of code. These junctures occur at particularly important state changes. A single breakpoint at a few of these states can nail down most bugs faster than a breakpoint after every line of code. And lastly, breakpoints can be used to tell more about the status of the program than just ``Got here''.

Consider, for example, a loop construct inside a program. There are two ways that such a loop might fail. It may continue indefinitely (the infinite loop), or there may be a fault inside the body of the loop. If it is unclear what could be happening, the first-pass breakpoints should be placed before and after the loop. If the loop is correct, both breakpoints will be reached; if the loop is infinite, only the first breakpoint will be reached. However, if there is a fault inside the body of the loop, the output will be indistinguishable from a correct loop, and a second pass of breakpoints must be added.

Consider what would would happen if the first-pass breakpoints included one inside the body of the loop. If there is no fault in the loop, the breakpoint would be reached as many times as the loop executes, producing large amounts of indistinguishable, useless output. Similarly, if the loop is infinite, useless amounts of output are produced. If there is a fault inside the loop, however, it could be detected with such a breakpoint. For this reason, breakpoints should be applied in passes. If the first pass reveals nothing, add second- and third-pass breakpoints until you get useful information.

The most important thing to remember about a breakpoint is that it should inform you of the state of the program. Remember that programs, as much as it would seem otherwise, are always deterministic. Every statement executed evolves a certain state into another state. Good insight into the state of the information at critical junctures will aid you in locating the faulty statement. You can also make breakpoints more descriptive while still keeping the breakpoint to one line. How about ``Entered sort function'', instead of just ``DEBUG 1''? Also, you can use that opportunity to print out the arguments to the function. Obviously, the function cannot be expected to produce the correct output if the input is coming in incorrectly.

Consider a recursive program and its execution. One breakpoint displaying its arguments can inform you if the recursive relation is being executed correctly. It can also illustrate whether or not the termination condition will be reached. In the case of a linked list program, a breakpoint displaying the status of a set of links before and after deletion can show whether or not the links are updated correctly. If the links are not updated correctly, it is unnecessary to wait until the program crashes. That information is seen immediately. This is faster, and better, than putting a simple breakpoint after every statement and waiting for the program to crash.

Fixing the bug

There is not much to be said about fixing a bug once it is found. It is generally a pretty clear-cut process. However, there is one word of caution to be considered: do not stop until you are satisfied you understand both the cause and remedy of a bug. Too many times, programmers find that a small change to a program can fix a bug, even if it is unclear why. That should not be an acceptable solution. In fact, in the long run, this is likely to lead to more bugs. And those will be harder to trace, since the code they hide in is unclear.

While finding the cause of a bug and then a cure may seem tedious at times, it will not only show you the shortcomings of your programming, but will also lead to better code. This is especially true in cases where the bug is related to memory errors. Many times a bug will spawn from overwritten or corrupted memory. In some cases, an adequately placed, yet unnecessary, statement is enough to fix the problem. Be aware, though, that the reason for this is that the statement is taking up space in memory that would otherwise be taken by a relevant statement. This does not guarantee that the bug will disappear. As more and more statements are added, the location of the ``fix'' statement may change and the bug may reappear! Take heed, fix a bug completely or it will come back to haunt you.


Assertions

Undoubtedly, the best time to fix a bug is before you run your program. As you write your program, you can insert assertions in your code using the C assert() macro. By calling assert with a conditional as an argument, (such as what you put inside if or while statements) you are expressing a condition which you think should be true at this point in your program. For instance, suppose you have the following piece of code:

int i, a[10];
for (i = 0; i < 10; ++i)
  {
  a[i] = 10-i;
  }
for (i = 0; i < 10; ++i)
  {
  a[a[i]] = a[i];
  }

If you run the above fragment, it might, if you are lucky, core dump. A less fortunate result is random corruption of some other unrelated variables in your program. Why is this? Because one of the array indices in the second loop is outside of the bounds of the array. Here is a case where assertions might have helped. If the condition inside an assertion fails, the program exits immediately, printing out the condition and line number of the failed assertion. Here's how the above program might have been written more safely using assertions:

#include <assert.h>
int i, a[10];
for (i = 0; i < 10; ++i)
  {
  assert(0 <= i && i < 10);
  a[i] = 10-i;
  }
for (i = 0; i < 10; ++i)
  {
  assert(0 <= i && i < 10);
  assert(0 <= a[i] && a[i] < 10);
  a[a[i]] = a[i];
  }

Notice that, in order to use the assert macro, we must include assert.h at the top of the file. In the first loop, since we are using i as an index into the array a which has ten elements, we assert that i lies within the valid range of the array, that is, between 0 and 9, inclusive. If, by some unrealized fluke, i were to take on a value which we had not expected, this assertion would fail, alerting us immediately to the error, before random corruption of memory could occur. Similarly, in the second loop, we use both i and a[i] as indices into the array, and thus assert that both are within the proper range.

If you include many assertions, and make them as strong as possible, it can help you catch bugs much faster than a debugger. In each function, loop, or interesting statement in your program, you should ask yourself, ``What properties am I relying on for this to work correctly?'' and assert them all.

Here is another example of a code snippet that makes good use of assertions:

/* This function returns a number between 0 and 1 indicating the
 * density of 'e' characters in a word, per character.
 */
float find_e_density(const char *word)
{
  int len, e_count=0, i;
  assert(word != NULL); /* word should point to valid memory */
  len = strlen(word);
  assert(len != 0); /* word should not be 0-length for valid density */
  for (i = 0; word[i] != 0; ++i)
    if (word[i] == 'e')
      ++e_count;
  return e_count/len; /* Above assertion protects against division by 0 */
}

Conclusion and How This Applies to CS 2

In CS 1, it was acceptable to ask your TA to debug much of the program for you. This is no longer true in CS 2. You should at least narrow the problem down to a function before asking a TA for help. Don't just say ``It crashes, fix it.'' If you do, your TA has every right to just walk away. Your TA can help you with ways to find a bug, give you possible causes for a bug, explain compiler errors and warnings, and explain what is wrong with a particular statement. Please feel free to ask your TA about any (and every) compiler error you don't understand. Understanding of compiler errors comes through experience which your TA can give you. In general, use your judgement. Remember that your TA has 8 or so other students to help, so don't monopolize.

All told, the clear lesson is that debugging is a skill which you can develop. It may not be as clear cut or spelled out as other aspects of programming, but there are definitely better and worse ways of debugging. Please note that the methods recommended here are only suggestions on how to debug. There is no best or worst way to debug, especially considering the large differences between programmers and their styles. Most importantly, take these suggestions and use them to develop your own style of debugging to best fit your own style of programming. And always remember, good, clear programming from the beginning will save you hours of debugging in the end.


CS 2 Home Page