ICML 2009 Tutorial on Reductions in Machine Learning


Alina Beygelzimer, John Langford, and Bianca Zadrozny

Tutorial Description

Machine learning reductions are about reusing solutions to simple, core problems in order to solve more complex problems. A basic difficulty in applying machine learning in practice is that we often need to solve problems that don't quite match the problems solved by standard machine learning algorithms. Reductions are techniques that transform such practical problems into core machine learning problems. The core problems can then be solved with any existing learning algorithm, and the resulting solution can, in turn, be used to solve the original problem.
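
As a concrete illustration, consider one of the simplest reductions, one-vs-all, which turns a k-class problem into k binary problems. The sketch below is ours, not from the tutorial; the base learner (scikit-learn's LogisticRegression) and the function names are illustrative choices, and any binary learner could be substituted.

import numpy as np
from sklearn.linear_model import LogisticRegression

def one_vs_all_train(X, y, make_learner=LogisticRegression):
    """Reduce multiclass data to one binary problem per class (c vs. rest)."""
    learners = {}
    for c in np.unique(y):
        learners[c] = make_learner().fit(X, (y == c).astype(int))
    return learners

def one_vs_all_predict(learners, X):
    """Predict the class whose binary learner scores highest."""
    classes = np.array(list(learners))
    scores = np.column_stack([learners[c].decision_function(X) for c in classes])
    return classes[scores.argmax(axis=1)]

# Usage on a toy 3-class problem.
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 5))
y = rng.integers(0, 3, size=300)
predictions = one_vs_all_predict(one_vs_all_train(X, y), X)

The point of the sketch is that the reduction never looks inside the base learner: any binary classifier exposing a fit/decision_function interface can be plugged in, which is exactly the reusability that reductions are after.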

The material that we plan to cover is both algorithmic and analytic. We will discuss existing and new algorithms, along with the methodology for analyzing and creating new reductions. We will also discuss common design flaws in folklore reductions. In our experience, this approach is an effective tool for designing empirically successful, automated solutions to learning problems.

Slides

Presenters

Alina Beygelzimer is a research staff member at IBM Research, working on machine learning (including reductions) and algorithms. She has taught reductions as part of a graduate machine learning course at Columbia University.

John Langford is a senior research scientist at Yahoo! Research. His work includes research in machine learning, learning theory, algorithms, game theory, steganography, and CAPTCHAs. John has worked extensively on formalizing, analyzing, and teaching reductions.

Bianca Zadrozny is a professor in the computer science department at UFF in Brazil, where she has taught several courses and receives support from CNPq (Brazil's National Research Council) for her research in machine learning. Prior to that, she was a research staff member at IBM Research, where, among other things, she worked on reductions.

References

John's reductions page, with links to written and video tutorials, a workshop, and related papers.

Additional references:

Cost-sensitive classification:
P. Domingos. MetaCost: A general method for making classifiers cost-sensitive, KDD-99.

Multiclass classification:
E. Allwein, R. Schapire, and Y. Singer. Reducing multiclass to binary: A unifying approach for margin classifiers, Journal of Machine Learning Research, 2000.
K. Crammer and Y. Singer. On the learnability and design of output codes for multiclass problems, Machine Learning, 2002.
T. Dietterich and G. Bakiri. Solving multiclass learning problems via error-correcting output codes, Journal of Artificial Intelligence Research, 1995.
V. Guruswami and A. Sahai. Multiclass learning, boosting, and error-correcting codes, COLT-99.
T. Hastie and R. Tibshirani. Classification by pairwise coupling, Annals of Statistics, 1998.
R. Rifkin and A. Klautau. In defense of one-vs-all classification, Journal of Machine Learning Research, 2004.

Ranking:
N. Ailon and M. Mohri. An efficient reduction of ranking to classification, COLT-08.

Glossary of Reductions

Costing (from importance weighted classification to binary classification, by rejection sampling proportional to the importance weights; a short sketch appears after this glossary)

Probing (from squared loss regression to classification)

Quanting (from quantile regression to classification)

Searn (from structured prediction to binary classification)

Regression Tree (from multiclass regression to binary regression, in time logarithmic in the number of classes)

Error-Correcting Tournaments, including the Filter Tree (from cost-sensitive multiclass classification to binary classification)
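
To make the Costing entry concrete, here is a minimal sketch of the rejection-sampling mechanism it names: each example is kept with probability proportional to its importance weight, a plain binary classifier is trained on the resample, and several resamples are averaged to reduce sampling variance. The base learner (scikit-learn's DecisionTreeClassifier), the function names, and the default of 10 resamples are our illustrative assumptions, not details from the tutorial.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def costing_train(X, y, w, n_resamples=10, make_learner=DecisionTreeClassifier, seed=0):
    """Rejection-sample with acceptance probability w_i / max(w), then train
    an ordinary (unweighted) binary classifier on each resample."""
    rng = np.random.default_rng(seed)
    ensemble = []
    for _ in range(n_resamples):
        keep = rng.random(len(w)) < w / w.max()
        ensemble.append(make_learner().fit(X[keep], y[keep]))
    return ensemble

def costing_predict(ensemble, X):
    """Majority vote: average the 0/1 predictions and threshold at 1/2."""
    return (np.mean([clf.predict(X) for clf in ensemble], axis=0) >= 0.5).astype(int)

# Usage: importance weights make the rare positive class count 10x more.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 4))
y = (X[:, 0] > 1.0).astype(int)
w = np.where(y == 1, 10.0, 1.0)
predictions = costing_predict(costing_train(X, y, w), X)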

Logistics

  • ICML 2009
  • Where: T1, McGill University, Montreal, Canada.
  • When: Sunday, June 14, 9:00-11:30am