Interesting Papers at COLT 2007 – Machine Learning (Theory)

Here are two papers that seem particularly interesting at this year’s COLT.

Gilles Blanchard and FranÃƒÂ§ois Fleuret, Occam’s Hammer. When we are interested in very tight bounds on the true error rate of a classifier, it is tempting to use a PAC-Bayes bound which can (empirically) be quite tight. A disadvantage of the PAC-Bayes bound is that it applies to a classifier which is randomized over a set of base classifiers rather than a single classifier. This paper shows that a similar bound can be proved which holds for a single classifier drawn from the set. The ability to safely use a single classifier is very nice. This technique applies generically to any base bound, so it has other applications covered in the paper.
Adam Tauman Kalai. Learning Nested Halfspaces and Uphill Decision Trees. Classification PAC-learning, where you prove that any problem amongst some set is polytime learnable with respect to any distribution over the input X is extraordinarily challenging as judged by lack of progress over a long period of time. This paper is about regression PAC-learning, and the results appear much more encouraging than exist in classification PAC-learning. Under the assumption that:
1. The level sets of the correct regressed value are halfspaces.
2. The level sets obey a Lipschitz condition.
this paper proves that a good regressor can be PAC-learned using a boosting algorithm. (The “uphill decision trees” part of the paper is about one special case where you don’t need the Lipschitz condition.)