When I was thinking about the best “10 year paper” for ICML, I also took a look at a few other conferences. Here is one from 10 years ago that interested me:
Prior to this paper, the only mechanism known for controlling or estimating the necessary sample complexity for learning over continuously parameterized predictors was VC theory and variants, all of which suffered from a basic problem: they were incredibly pessimistic in practice. This meant that only very gross guidance could be provided for learning algorithm design. The PAC-Bayes bound provided an alternative approach to sample complexity bounds which was radically tighter, quantitatively. It also imported and explained many of the motivations for Bayesian learning in a way that learning theory and perhaps optimization people might appreciate. Since this paper came out, there have been a number of moderately successful attempts to drive algorithms directly by the PAC-Bayes bound. We’ve gone from thinking that a bound driven algorithm is completely useless to merely a bit more pessimistic and computationally intense than might be necessary.
What other machine learning or learning theory papers from 10 years ago have had a substantial impact?