One part of doing research is debugging your understanding of reality. This is hard work: How do you even discover where you misunderstand? If you discover a misunderstanding, how do you go about removing it?
The process of debugging computer programs is quite analogous to debugging reality misunderstandings. This is natural—a bug in a computer program is a misunderstanding between you and the computer about what you said. Many of the familiar techniques from debugging have exact parallels.
- Details When programming, there are often signs that some bug exists, like: “the graph my program output is shifted a little bit”, which might mean you have an indexing error. In debugging yourself, we often have some impression that something is “not right”. These impressions should be addressed directly and immediately. (Some people have the habit of suppressing worries in favor of excess certainty. That’s not healthy for research.)
- Corner Cases A “corner case” is an input to a program which is extreme in some way. We can often concoct our own corner cases and solve them. If the solution doesn’t match our (mis)understanding, a bug has been found. (See the first sketch after this list.)
- Warnings On The compiler “gcc” has the flag “-Wall”, which means “turn on all warnings about odd program forms”. You should always compile with “-Wall”: the payoff becomes obvious the moment you compare the time needed to fix a bug that “-Wall” catches with the time needed to track the same bug down the hard way. (See the second sketch after this list.)
The equivalent for debugging yourself is listening to others carefully. In research, some people have the habit of wanting to solve everything before talking to others. This is usually unhealthy. Talking about the problem that you want to solve is much more likely to lead to either solving it or discovering the problem is uninteresting and moving on.
- Debugging by Design When programming, people often design the process of creating the program so that it is easy to debug. The analogy for us is stepwise mastery—first master your understanding of something basic. Then take the next step, the next, etc…
- Isolation When a bug is discovered, the canonical early troubleshooting step is isolating the bug. For a parse error, what is the smallest program exhibiting the error? For a compiled program, what is the simplest set of inputs which exhibits the bug? For research, what is the simplest example that you don’t understand?
- Representation Change When programming, sometimes a big program simply becomes too unwieldy to debug. In these cases, it is often a good idea to rethink the problem the program is trying to solve. How can you better structure the program to avoid this unwieldiness?
The issue of how to represent the problem is perhaps even more important in research, since human brains are not as adept as computers at shifting and using representations. Significant initial thought on how to represent a research problem is helpful. And when it’s not going well, changing representations can make a problem radically simpler.
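As a concrete illustration of the “Corner Cases” point, here is a minimal C sketch (an invented example, not from the post above): the function looks reasonable on a typical input, and a hand-constructed extreme input exposes the misunderstanding.

```c
/* Illustrative sketch: a hand-made corner case reveals a bug that
 * ordinary inputs hide. */
#include <stdio.h>

/* Average of n integers. Looks fine on typical inputs. */
double average(const int *a, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += a[i];
    return (double)sum / n;   /* the corner case n == 0 divides by zero */
}

int main(void) {
    int data[] = {2, 4, 6};
    printf("%f\n", average(data, 3));  /* typical input: prints 4.000000 */
    printf("%f\n", average(data, 0));  /* corner case: division by zero yields nan */
    return 0;
}
```

If the nan surprises you, your understanding of what the function handles was wrong, and the corner case found the bug.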
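Similarly, a hypothetical sketch for the “Warnings On” point: two classic silent bugs that “gcc -Wall” reports at compile time rather than leaving for a long debugging session.

```c
/* Illustrative sketch of bugs that compile silently but that
 * "gcc -Wall" warns about. */
#include <stdio.h>

int main(void) {
    int errors = 5;
    double rate = 0.25;

    /* Intended "==", wrote "=": errors is silently set to 0 and the
     * branch is never taken. -Wall warns "suggest parentheses around
     * assignment used as truth value". */
    if (errors = 0)
        printf("no errors\n");

    /* Wrong format specifier: -Wall (via -Wformat) warns that "%d"
     * expects an int but a double is passed. */
    printf("error rate: %d\n", rate);

    return 0;
}
```

Compiling with “gcc -Wall” flags both lines immediately; without warnings turned on, you would likely discover them only at runtime.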
Some aspects of debugging a reality misunderstanding don’t have a good analogue for programming, because debugging yourself often involves social interactions. One basic principle is that your ego is unhelpful. Everyone (including me) dislikes having others point out when they are wrong, so there is a temptation to avoid admitting it (to others, or more harmfully to yourself). This temptation should be actively fought. With respect to others, admitting you are wrong allows a conversation to move on to other things. With respect to yourself, admitting you are wrong allows you to move on to other things. A good environment can help greatly with this problem. There is an immense difference in how people behave under “you lose your job if wrong” and “great, let’s move on”.
What other debugging techniques exist?
I relate to the idea of debugging as finding and resolving contradictions. The program is supposed to output X, but it outputs Y. This is a contradiction. Why does it output Y? This variable is Z, why? etc. Going back through the sequence of steps that caused the wrong output often isolates the mistake that led to the error. I’m not sure I’d call this a “debugging technique,” but it’s how I see myself debugging; it’s also a way to improve one’s understanding of, e.g., a mathematical technique — seek out and resolve contradictions.
Just to play devil’s advocate on one point — conversations about research are fine, but if you wait for someone else to tell you that “yes, that’s a good idea,” you’ll never put in the time to actually push forward on the research and find out. You’ve pointed out one extreme, but the other extreme is also bad. In part this is because it may not be clear how interesting the problem is until you do some preliminary work; if the conversation goes badly, you may become discouraged when there’s actually something there. It took me a while to figure that one out.
I would suggest the general principle of “Sanity check”, maybe as an extension of the “-Wall compiling” and “Corner case” rules. In code that is still in development, you can, for example, add a test of the value of a variable which should always be positive if everything is right, and issue some message if this is not the case. Sometimes you obtain this way an error message from a totally unexpected part of the program. Similarly, when I think I have come to some understanding of a specific point of “reality”, I try to test it against other knowledge I have, preferably coming from a different point of view or source. For example, in physics you are taught to always check the homogeneity of a formula; you could also check whether some result derived using an energy minimization principle violates, say, the second law of thermodynamics. Intuition coming from an uncorrelated source (which also includes “talking with other people”) is very valuable for bug discovery.
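A minimal C sketch of this “sanity check” idea (the function and numbers are invented for illustration): assert an invariant that should always hold, so a violation reports itself from wherever it actually happens.

```c
/* Illustrative sanity check: an invariant asserted in one place can
 * surface an error originating in a totally unexpected part of the
 * program. */
#include <assert.h>
#include <stdio.h>

/* Hypothetical update step: if the rest of the program is correct,
 * the balance should never go negative. */
double withdraw(double balance, double amount) {
    double updated = balance - amount;
    assert(updated >= 0.0 && "balance should stay non-negative");
    return updated;
}

int main(void) {
    double b = 100.0;
    b = withdraw(b, 30.0);   /* fine */
    b = withdraw(b, 90.0);   /* invariant violated: assert aborts with file and line */
    printf("%f\n", b);       /* not reached while assertions are enabled */
    return 0;
}
```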
I generally agree. Listening to people carefully enough to find valid criticism yet not getting discouraged is very tricky.
I saw a great talk by Mitch Marcus, where one of his main points, which flashed in big bold letters several times during the talk, was “Look at the data, stupid”. (This obviously doesn’t apply to all disciplines, but…) His point is that a lot of times when people work on problems, they do so cerebrally. They think about what they think should and will happen. But in many cases (especially supervised learning cases), we have data and we should look at it. I have found this often shows that many things I expect to exist don’t (or don’t exist in abundance), and that there are many more interesting facets of the problem that I didn’t anticipate but that are there all the time.