Machine Learning Reductions

This project is about applying the method of reduction to machine learning. We hope to understand the relationships between machine learning problems and to construct learning algorithms that work via reduction.


A new tutorial at ICML 2009, with a video lecture.
Helsinki University class: day 1 (also 4up), day 2 (also 4up).
Video lecture from MLSS 2006 Taipei.
Paper for SIGMETRICS.


Machine Learning Reductions


Multiclass regression -> Binary Regression. Alina Beygelzimer, John Langford, Yuri Lifshits, Gregory Sorkin, and Alex Strehl, Conditional Probability Tree Estimation Analysis and Algorithms, UAI 2009. We analyze and show the feasibility of doing multiclass regression in logarithmic time, with an application to an ad dataset.
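The logarithmic-time idea above can be sketched as a balanced binary tree over the labels, with one binary predictor per internal node; prediction descends the tree with O(log k) classifier calls. This is a minimal illustration with hypothetical interfaces, not the paper's exact estimator:

```python
# Sketch: multiclass prediction in O(log k) time via a binary tree of
# binary classifiers. Node/predictor interfaces here are hypothetical.

class Node:
    def __init__(self, classes, predictor=None, left=None, right=None):
        self.classes = classes      # labels reachable from this node
        self.predictor = predictor  # binary classifier: x -> False (left) / True (right)
        self.left, self.right = left, right

def build_tree(classes, make_predictor):
    """Recursively split the label set in half; one binary problem per node."""
    if len(classes) == 1:
        return Node(classes)
    mid = len(classes) // 2
    left = build_tree(classes[:mid], make_predictor)
    right = build_tree(classes[mid:], make_predictor)
    return Node(classes, make_predictor(classes[:mid], classes[mid:]), left, right)

def predict(node, x):
    """Descend the tree: O(log k) binary-classifier evaluations."""
    while node.predictor is not None:
        node = node.right if node.predictor(x) else node.left
    return node.classes[0]

# Toy use: 4 classes identified by x itself, with "perfect" node predictors.
make = lambda l, r: (lambda x: x in r)
tree = build_tree([0, 1, 2, 3], make)
print(predict(tree, 2))  # -> 2
```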
Ranking -> Classification. Nina Balcan, Nikhil Bansal, Alina Beygelzimer, Don Coppersmith, John Langford, and Greg Sorkin, Robust Reductions from Ranking to Classification, COLT 2007; journal version in MLJ Volume 72, Numbers 1-2, August 2008, pages 139-153. .tex This paper shows that correctly predicting the pairwise ordering strongly implies good AUC performance.
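A simple way to see the pairwise reduction (a sketch of the general idea, not the paper's exact algorithm): learn a binary comparator on pairs, then order items by their number of pairwise "wins", which tolerates a comparator that is occasionally wrong or intransitive:

```python
# Sketch: rank items using a learned pairwise classifier prefer(a, b)
# -> True if a should rank above b; order by count of pairwise wins.

def rank_by_wins(items, prefer):
    wins = {a: sum(prefer(a, b) for b in items if b is not a) for a in items}
    return sorted(items, key=lambda a: -wins[a])

# Toy comparator: prefer larger numbers.
print(rank_by_wins([3, 1, 2], lambda a, b: a > b))  # -> [3, 2, 1]
```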
Structured Prediction -> Binary Prediction. Hal Daumé III, John Langford, and Daniel Marcu, Search-Based Structured Prediction, MLJ 2009. Hal's repository. We show how to reduce arbitrarily complex structured prediction problems to binary classification using the Searn algorithm. The resulting algorithm is fast, performs well, and allows tackling of new problems.
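The inference side of search-based structured prediction can be sketched as building the output one decision at a time, with each decision made by a learned classifier conditioned on the input and the partial output so far (Searn's contribution is how to train that classifier; the interfaces below are hypothetical):

```python
# Sketch: greedy left-to-right decoding for structured prediction,
# where each step is a classification decision.

def greedy_decode(x_tokens, classifier):
    """classifier(token, prev_label) -> label; decode left to right."""
    labels, prev = [], None
    for tok in x_tokens:
        prev = classifier(tok, prev)
        labels.append(prev)
    return labels

# Toy tagger: capitalized tokens are "N" (name), everything else "O".
tag = lambda tok, prev: "N" if tok[0].isupper() else "O"
print(greedy_decode(["Alice", "met", "Bob"], tag))  # -> ['N', 'O', 'N']
```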
Multiclass -> Binary Classification (Regret Transform). Alina Beygelzimer, John Langford, and Pradeep Ravikumar, Error-Correcting Tournaments, ALT 2009.
John Langford and Alina Beygelzimer, Sensitive Error Correcting Output Codes, COLT 2005 .ps.gz .pdf .tex
These papers show that any cost sensitive classification problem can be reduced to binary classification or regression so that small regret implies small cost sensitive regret. The SECOC/PECOC results are best understood as a reduction from multiclass classification to regression, composed with Probing to reduce regression to binary classification. The newer Error-Correcting Tournament work is a more direct and tighter reduction to binary classification.
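For intuition, here is the classic (non-cost-sensitive) output-coding idea that SECOC builds on: each class gets a binary codeword, one binary classifier is learned per code bit, and prediction decodes the predicted bit vector to the nearest codeword in Hamming distance. The codebook below is an arbitrary toy example:

```python
# Sketch: classic ECOC decoding. Well-separated codewords let the
# decoder correct individual bit-classifier errors.

def ecoc_decode(bits, codebook):
    """codebook: {class: tuple of 0/1}; return class with closest codeword."""
    def hamming(c):
        return sum(b != cb for b, cb in zip(bits, codebook[c]))
    return min(codebook, key=hamming)

codebook = {"a": (0, 0, 0, 0, 0), "b": (1, 1, 1, 0, 0), "c": (0, 0, 1, 1, 1)}
# Suppose the five bit-classifiers output (1, 1, 0, 0, 0): one bit
# flipped from "b"'s codeword; decoding still recovers "b".
print(ecoc_decode((1, 1, 0, 0, 0), codebook))  # -> b
```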
Regression -> Classification. John Langford and Bianca Zadrozny, Estimating Class Membership Probabilities Using Classifier Learners, AISTATS 2005 .ps.gz .pdf .tex
John Langford, Roberto Oliveira, and Bianca Zadrozny, Predicting Conditional Quantiles via Reduction to Classification, UAI 2006 .ps.gz .pdf
The first paper shows how squared loss regression can be done with a classifier using the "Probing" algorithm. It is the first example of a regret reduction. The tutorial (above) has a slightly cleaner but equivalent algorithm and analysis.
The second paper uses a similar technique (the "Quanting" algorithm) to reduce quantile regression to classification.
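The core identity behind Probing can be sketched directly: for labels in [0, 1], E[y | x] equals the integral over thresholds t of P(y > t | x), so squared-loss regression reduces to a family of "is y above t?" classifiers whose averaged predictions estimate the conditional mean. This is an illustration of the identity, not the paper's exact algorithm:

```python
# Sketch: estimate E[y | x] by averaging threshold classifiers,
# approximating the integral of P(y > t | x) over t in [0, 1].

def probe_estimate(classifiers, x):
    """classifiers: one binary predictor per threshold, each x -> 0 or 1."""
    return sum(c(x) for c in classifiers) / len(classifiers)

# Toy use: the true conditional mean is x itself; a perfect classifier
# at threshold t answers whether x > t.
thresholds = [i / 100 for i in range(100)]
classifiers = [lambda x, t=t: x > t for t in thresholds]
print(probe_estimate(classifiers, 0.25))  # -> 0.25
```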
Multiclass -> Binary (Error Transform). Alina Beygelzimer, Varsha Dani, Tom Hayes, John Langford, and Bianca Zadrozny, Reductions Between Classification Tasks, ICML 2005 .tex
Naoki Abe, Bianca Zadrozny, and John Langford, An Iterative Method for Multi-class Cost-sensitive Learning, KDD 2004 .ps.gz, .pdf, .tex
Alina Beygelzimer, John Langford, and Bianca Zadrozny, Weighted One Against All, AAAI 2005 .ps.gz .pdf .tex
The first paper formalizes the notion of an error transform reduction and presents "Weighted All Pairs", proving that an error transform reduction exists from cost sensitive multiclass classification to binary classification.
The second paper is from earlier, when we did not yet know how to do this in a noninteractive fashion. It is a practical and effective algorithm.
The Weighted One Against All reduction is a universal improvement on the common One-Against-All reduction.
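For reference, the plain One-Against-All scheme that Weighted One Against All improves on looks like this: one binary scorer per class ("is it class i?"), with prediction by argmax. The weighted variant differs in how the induced binary examples are importance-weighted during training; the scorer interface below is a hypothetical toy:

```python
# Sketch: One-Against-All prediction. Each class has its own binary
# scorer; the predicted class is the one whose scorer is most confident.

def one_against_all_predict(scorers, x):
    """scorers: {class: x -> score}; higher score = more confident."""
    return max(scorers, key=lambda c: scorers[c](x))

# Toy scorers: confidence falls with distance from the class's center.
scorers = {0: lambda x: -abs(x - 0), 1: lambda x: -abs(x - 1), 2: lambda x: -abs(x - 2)}
print(one_against_all_predict(scorers, 1.2))  # -> 1
```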
RL -> Classification. John Langford and Bianca Zadrozny, Reducing T-step Reinforcement Learning to Classification (Error Reduction Version).
John Langford and Bianca Zadrozny, Relating Reinforcement Learning Performance to Classification Performance, ICML 2005 .ps.gz .pdf .tex
These papers analyze how to decompose reinforcement learning performance into classification performance. The unpublished draft is about an error transform reduction. The ICML paper is about a nonconstructive reduction.
Importance Weighted Classification -> Classification. Bianca Zadrozny, John Langford, and Naoki Abe, Cost Sensitive Learning by Cost-Proportionate Example Weighting, ICDM 2003 .ps.gz, .pdf, .tex The "costing" algorithm reduces importance weighted classification to classification, while empirically providing performance superior to hand-crafted algorithms.
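The resampling step at the heart of Costing can be sketched in a few lines: keep each example with probability proportional to its importance weight, then train any ordinary (unweighted) classifier on the accepted examples; in the paper, several such resampled classifiers are averaged to reduce variance. The helper name below is hypothetical:

```python
# Sketch of the Costing resampling step: cost-proportionate rejection
# sampling turns an importance-weighted dataset into an unweighted one.

import random

def costing_resample(examples, weights, rng):
    """Keep (x, y) with probability w / max(weights)."""
    wmax = max(weights)
    return [ex for ex, w in zip(examples, weights) if rng.random() < w / wmax]

# Toy use: heavily weighted examples are kept far more often.
rng = random.Random(0)
examples = [("x1", 0), ("x2", 1), ("x3", 1)]
sample = costing_resample(examples, [0.1, 1.0, 0.5], rng)
# An example with the maximum weight (here "x2") is always kept;
# lighter examples survive in proportion to their weight.
```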