Interesting Papers at COLT 2007

Here are two papers that seem particularly interesting at this year’s COLT.

  1. Gilles Blanchard and François Fleuret, Occam’s Hammer. When we are interested in very tight bounds on the true error rate of a classifier, it is tempting to use a PAC-Bayes bound, which can (empirically) be quite tight; a sketch of the standard bound appears after this list. A disadvantage of the PAC-Bayes bound is that it applies to a classifier randomized over a set of base classifiers rather than to a single classifier. This paper shows that a similar bound can be proved which holds for a single classifier drawn from the set. The ability to safely use a single classifier is very nice. The technique applies generically to any base bound, so it has other applications, covered in the paper.
  2. Adam Tauman Kalai, Learning Nested Halfspaces and Uphill Decision Trees. Classification PAC-learning, where you prove that any problem in some set is polynomial-time learnable with respect to any distribution over the input X, is extraordinarily challenging, as judged by the lack of progress over a long period of time. This paper is about regression PAC-learning, and the results appear much more encouraging than those in classification PAC-learning. Under the assumptions that:
    1. The level sets of the correct regressed value are halfspaces.
    2. The level sets obey a Lipschitz condition.

    this paper proves that a good regressor can be PAC-learned using a boosting algorithm. (The “uphill decision trees” part of the paper is about one special case where you don’t need the Lipschitz condition.)
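
For item 1, here is one standard form of the PAC-Bayes bound (this is the usual statement for the randomized classifier, not Blanchard and Fleuret’s new result). For any prior $P$ over classifiers, with probability at least $1-\delta$ over an i.i.d. sample of size $m$, simultaneously for all posteriors $Q$:

\[
\mathrm{KL}\bigl(\hat{e}(Q) \,\|\, e(Q)\bigr) \le \frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{m+1}{\delta}}{m},
\]

where $\hat{e}(Q)$ and $e(Q)$ are the empirical and true error rates of the classifier that predicts with a freshly drawn $h \sim Q$ on each example. The bound governs this randomized (Gibbs) classifier; Occam’s Hammer is about obtaining a comparable guarantee for a single $h$ drawn once from $Q$.

For item 2, one way to formalize the assumptions (my notation, not necessarily the paper’s): the target is a regression function $f : \mathbb{R}^d \to [0,1]$ whose level sets are all halfspaces,

\[
\{x : f(x) \ge t\} = \{x : w_t \cdot x \ge b_t\} \quad \text{for every threshold } t
\]

(such sets are automatically nested as $t$ increases), together with a Lipschitz condition along the lines of $|f(x) - f(x')| \le L \|x - x'\|$.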

What to do with an unreasonable conditional accept

Last year about this time, we received a conditional accept for the Searn paper, which asked us to reference a paper that was not reasonable to cite, because strictly more relevant work by the same authors was already cited. We wrote a response explaining this and didn’t cite the paper in the final draft, giving the SPC an excuse to reject the paper, leading to unhappiness for all.

Later, Sanjoy Dasgupta suggested an alternative: talk to the PC chair as soon as you see that a conditional accept is unreasonable. William Cohen and I spoke about this by email; the relevant bit is:

If an SPC asks for a revision that is inappropriate, the correct action is to contact the chairs as soon as the decision is made, clearly explaining what the problem is, so we can decide whether or not to over-rule the SPC. As you say, this is extra work for us chairs, but that’s part of the job, and we’re willing to do that sort of work to improve the overall quality of the reviewing process and the conference.

In short, Sanjoy was right.

At the time, I operated under the belief that the PC chairs were simply too busy to bother with something like this, but that was wrong. William invited me to post this, and I hope we all learn a little bit from it. Obviously, this should only be used if there is a real flaw in the conditions of a conditionally accepted paper.

Create Your Own ICML Workshop

As usual, ICML 2007 will host a workshop program, this year on June 24th. The success of the program depends on researchers like you proposing interesting workshop topics and then organizing the workshops. I’d like to encourage all of you to consider sending a workshop proposal. The proposal deadline has been extended to March 5. See the workshop website for details.

Organizing a workshop is a unique way to gather an international group of researchers to focus for an entire day on a topic of your choosing. I’ve always found the cost of organizing a workshop to be modest, and very low compared to the benefits. The topic and format of a workshop are limited only by your imagination (and its attractiveness to potential participants) and need not follow the usual model of a mini-conference on a particular ML sub-area. I hope to see some interesting proposals rolling in.

The Forgetting

How many papers do you remember from 2006? 2005? 2002? 1997? 1987? 1967? One way to judge this would be to look at the citations of the papers you write—how many came from which year? For myself, the answers on recent papers are:

| year  | 2006 | 2005 | 2002 | 1997 | 1987 | 1967 |
| ----- | ---- | ---- | ---- | ---- | ---- | ---- |
| count | 4    | 10   | 5    | 1    | 0    | 0    |
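
If you want to produce this kind of table for your own papers, one quick way is to count the year fields in a bibliography. Here is a minimal Python sketch under that assumption; the filename refs.bib and the regular expression are illustrative, not something from the post:

```python
import re
from collections import Counter

def citation_years(bib_path):
    """Count how many BibTeX entries list each publication year."""
    with open(bib_path) as f:
        text = f.read()
    # Match fields like: year = {2006}, year = "2006", or year = 2006
    years = re.findall(r'year\s*=\s*[{"]?(\d{4})', text, flags=re.IGNORECASE)
    return Counter(years)

if __name__ == "__main__":
    # "refs.bib" is a hypothetical filename.
    for year, count in sorted(citation_years("refs.bib").items(), reverse=True):
        print(year, count)
```

Running this over the .bib file of a recent paper would give a spectrum like the one above.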

This spectrum is fairly typical of papers in general. There are many reasons that citations are focused on recent papers.

  1. The number of papers being published each year continues to grow, so there are simply more recent papers available to cite. This is not a very significant effect, though, because the rate of publication has not grown nearly as fast as the citation counts fall off with age.
  2. Dead men don’t reject your papers for not citing them. This reason seems lame, because it’s a distortion of the ideal of science. Nevertheless, it must be stated, because the effect can be significant.
  3. In 1997, I started as a PhD student. Naturally, papers after 1997 are better remembered because they were absorbed in real time. A large fraction of people writing papers and attending conferences haven’t been doing it for 10 years.
  4. Old papers aren’t on the internet. This is a huge effect for any papers prior to 1995 (or so). The ease of accessing a paper greatly influences an author’s ability to read and understand it. There are a number of journals which essentially have “internet access for the privileged elite who are willing to pay”. In my experience, this is only marginally better than having the papers stuck in the library.
  5. The recent past is more relevant to the present than the far past. There is a lot of truth in this—people discover and promote various problems or techniques which take off for a while, until their turn to be forgotten arrives.

Should we be disturbed by this forgetting? There are a few good effects. For example, when people forget, they reinvent, and sometimes they reinvent better. Nevertheless, it seems like the effect of forgetting is bad overall, because it causes wasted effort. There are two implications:

  1. For paper writers, it is very common to overestimate the value of a paper, even though we know that the impact of most papers is bounded in time. Perhaps by looking at those older papers, we can get an idea of what is important in the long term. For example, looking at my own older citations, simplicity is it. If you want a paper to have a long-term impact, it needs a simple algorithm, analysis method, or setting. Fundamentally, only those things which are teachable survive. Was your last paper simple? Could you teach it in a class? Are other people going to start doing so? Are the review criteria promoting the papers with a hope of survival?
  2. For conference organizers, it’s important to understand the way science has changed. Originally, you had to be a giant to succeed at science. Then, you merely had to stand on the shoulders of giants to succeed. Now, it seems that even the ability to peer over the shoulders of people standing on the shoulders of giants might be helpful. This is generally a good thing, because it means more people can help on a very hard task. Nevertheless, it seems that much of this effort is getting wasted in forgetting, because we do not have the right mechanisms to remember the information. Which is going to be the first conference to switch away from an ordered list of papers to something with structure? Wouldn’t it be great if all the content at a conference was organized in a wikipedia-like easy-for-outsiders-to-understand style?