Debugging Your Brain

One part of doing research is debugging your understanding of reality. This is hard work: How do you even discover where you misunderstand? If you discover a misunderstanding, how do you go about removing it?

The process of debugging computer programs is quite analogous to debugging reality misunderstandings. This is natural—a bug in a computer program is a misunderstanding between you and the computer about what you said. Many of the familiar techniques from debugging have exact parallels.

  1. Details When programming, there are often signs that some bug exists like: “the graph my program output is shifted a little bit” = maybe you have an indexing error. In debugging yourself, we often have some impression that something is “not right”. These impressions should be addressed directly and immediately. (Some people have the habit of suppressing worries in favor of excess certainty. That’s not healthy for research.)
  2. Corner Cases A “corner case” is an input to a program which is extreme in some way. We can often concoct our own corner cases and solve them. If the solution doesn’t match our (mis)understanding, a bug has been found.
  3. Warnings On The compiler “gcc” has the flag “-Wall” which means “turn all warnings about odd program forms on”. You should always compile with “-Wall” as you immediately realize if you compare the time required to catch a bug that “-Wall” finds with the time required to debug the hard way.

    The equivalent for debugging yourself is listening to others carefully. In research, some people have the habit of wanting to solve everything before talking to others. This is usually unhealthy. Talking about the problem that you want to solve is much more likely to lead to either solving it or discovering the problem is uninteresting and moving on.

  4. Debugging by Design When programming, people often design the process of creating the program so that it is easy to debug. The analogy for us is stepwise mastery—first master your understanding of something basic. Then take the next step, the next, etc…
  5. Isolation When a bug is discovered, the canonical early trouble shooting step is isolating the bug. For a parse error, what is the smallest program exhibiting the error? For a compiled program: what are the simplest set of inputs which exhibit the bug? For research, what is the simplest example that you don’t understand?
  6. Representation Change When programming, sometimes a big program simply becomes too unwieldy to debug. In these cases, it is often a good idea to rethink the problem the program is trying to solve. How can you better structure the program to avoid this unwieldiness?

    The issue of how to represent the problem is perhaps even more important in research since human brains are not as adept as computers at shifting and using representations. Significant initial thought on how to represent a research problem is helpful. And when it’s not going well,
    changing representations can make a problem radically simpler.

Some aspects of debugging a reality misunderstanding don’t have a good analogue for programming because debugging yourself often involves social interactions. One basic principle is that your ego is unhelpful. Everyone (including me) dislikes having others point out when they are wrong so there is a temptation to avoid admitting it (to others, or more harmfully to yourself). This temptation should be actively fought . With respect to others, admitting you are wrong allows a conversation to move on to other things. With respect to yourself, admitting you are wrong allows you to move on to other things. A good environment can help greatly with this problem. There is an immense difference in how people behave under “you lose your job if wrong” and “great, let’s move on”.

What other debugging techniques exist?

MLTV

As part of a PASCAL project, the Slovenians have been filming various machine learning events and placing them on the web here. This includes, for example, the Chicago 2005 Machine Learning Summer School as well as a number of other summer schools, workshops, and conferences.

There are some significant caveats here—for example, I can’t access it from Linux. Based upon the webserver logs, I expect that is a problem for most people—Computer scientists are particularly nonstandard in their choice of computing platform.

Nevertheless, the core idea here is excellent and details of compatibility can be fixed later. With modern technology toys, there is no fundamental reason why the process of announcing new work at a conference should happen only once and only for the people who could make it to that room in that conference. The problems solved include:

  1. The multitrack vs. single-track debate. (“Sometimes the single track doesn’t interest me” vs. “When it’s multitrack I miss good talks”
  2. “I couldn’t attend because I was giving birth/going to a funeral/a wedding”
  3. “What was that? I wish there was a rewind on reality.”

There are some fears here too. For example, maybe a shift towards recording and placing things on the web will result in lower attendance at a conference. Such a fear is confused in a few ways:

  1. People go to conferences for many more reasons than just announcing new work. Other goals include doing research, meeting old friends, worrying about job openings, skiing, and visiting new places. There also a subtle benefit of going to a conference: it represents a commitment of time to research. It is this commitment which makes two people from the same place start working together at a conference. Given all these benefits of going to a conference, there is plenty of reason for them to continue to exist.
  2. It is important to remember that a conference is a process in aid of research. Recording and making available for download the presentations at a conference makes research easier by solving all the problems listed above.
  3. This is just another new information technology. When the web came out, computer scientists and physicists quickly adopted a “place any paper on your webpage” style even when journals forced them to sign away the rights of the paper to publish. Doing this was simply healthy for the researcher because his papers were more easily readable. The same logic applies to making presentations at a conference available on the web.

Deadline Season

Many different paper deadlines are coming up soon so I made a little reference table. Out of curiosity, I also computed the interval between submission deadline and conference.

Conference Location Date Deadline interval
COLT Pittsburgh June 22-25 January 21 152
ICML Pittsburgh June 26-28 January 30/February 6 140
UAI MIT July 13-16 March 9/March 16 119
AAAI Boston July 16-20 February 16/21 145
KDD Philadelphia August 23-26 March 3/March 10 166

It looks like the northeastern US is the big winner as far as location this year.

Automated Labeling

One of the common trends in machine learning has been an emphasis on the use of unlabeled data. The argument goes something like “there aren’t many labeled web pages out there, but there are a huge number of web pages, so we must find a way to take advantage of them.” There are several standard approaches for doing this:

  1. Unsupervised Learning. You use only unlabeled data. In a typical application, you cluster the data and hope that the clusters somehow correspond to what you care about.
  2. Semisupervised Learning. You use both unlabeled and labeled data to build a predictor. The unlabeled data influences the learned predictor in some way.
  3. Active Learning. You have unlabeled data and access to a labeling oracle. You interactively choose which examples to label so as to optimize prediction accuracy.

It seems there is a fourth approach worth serious investigation—automated labeling. The approach goes as follows:

  1. Identify some subset of observed values to predict from the others.
  2. Build a predictor.
  3. Use the output of the predictor to define a new prediction problem.
  4. Repeat…

Examples of this sort seem to come up in robotics very naturally. An extreme version of this is:

  1. Predict nearby things given touch sensor output.
  2. Predict medium distance things given the nearby predictor.
  3. Predict far distance things given the medium distance predictor.

Some of the participants in the LAGR project are using this approach.

A less extreme version was the DARPA grand challenge winner where the output of a laser range finder was used to form a road-or-not predictor for a camera image.

These automated labeling techniques transform an unsupervised learning problem into a supervised learning problem, which has huge implications: we understand supervised learning much better and can bring to bear a host of techniques.

The set of work on automated labeling is sketchy—right now it is mostly just an observed-as-useful technique for which we have no general understanding. Some relevant bits of algorithm and theory are:

  1. Reinforcement learning to classification reductions which convert rewards into labels.
  2. Cotraining which considers a setting containing multiple data sources. When predictors using different data sources agree on unlabeled data, an inferred label is automatically created.

It’s easy to imagine that undiscovered algorithms and theory exist to guide and use this empirically useful technique.

Yes , I am applying

Every year about now hundreds of applicants apply for a research/teaching job with the timing governed by the university recruitment schedule. This time, it’s my turn—the hat’s in the ring, I am a contender, etc… What I have heard is that this year is good in both directions—both an increased supply and an increased demand for machine learning expertise.

I consider this post a bit of an abuse as it is neither about general research nor machine learning. Please forgive me this once.

My hope is that I will learn about new places interested in funding basic research—it’s easy to imagine that I have overlooked possibilities.

I am not dogmatic about where I end up in any particular way. Several earlier posts detail what I think of as a good research environment, so I will avoid a repeat. A few more details seem important:

  1. Application. There is often a tension between basic research and immediate application. This tension is not as strong as might be expected in my case. As evidence, many of my coauthors of the last few years are trying to solve particular learning problems and I strongly care about whether and where a learning theory is useful in practice.
  2. Duration. I would like my next move to be of indefinite duration.

Feel free to email me (jl@hunch.net) if there is a possibility you think I should consider.