Helsinki University class day 1 day 1, 4up day 2 day 2, 4up

Video lecture from MLSS 2006 Taipei

Paper for sigmetrics.

Multiclass regression -> Binary Regression | Alina Beygelzimer, John Langford, Yuri Lifshits, Gregory Sorkin, and Alex Strehl Conditional Probability Tree Estimation Analysis and Algorithms UAI 2009. | We analyze and show the feasibility of doing multiclass regression in logarithmic time, with an application to an ad dataset. |

Ranking -> Classification | Nina Balcan, Nikhil Bansal, Alina Beygelzimer, Don Coppersmith, John Langford, and Greg Sorkin Robust Reductions from Ranking to Classification COLT 2007 also Journal version at MLJ Volume 72, Numbers 1-2, August 2008, pages 139-153. .tex | This paper shows that correctly predicting the pairwise ordering strongly implies good AUC performance. |

Structured Prediction -> Binary Prediction | Hal Daumé III, John Langford, and Daniel Marcu Search-Based Structured Prediction, MLJ 2009. Hal's repository | We show how to reduce arbitrary complex structured prediction problems to binary classification using the Searn algorithm. The resulting algorithm is fast, performs well, and allows tackling of new problems. |

Multiclass -> Binary Classification (Regret Transform) |
Alina Beygelzimer, John Langford, and Pradeep Ravikumar Error-Correcting Tournaments ALT 2009. John Langford and Alina Beygelzimer Sensitive Error Correcting Output Codes COLT 2005 .ps.gz .pdf .tex |
These paper shows that any cost sensitive classification problem can be reduced to binary classification or regression so that small regret implies small cost sensitive regret. The SECOC/PECOC results are best understood as a reduction from multiclass classification to regression, composed with Probing reducing regression to binary classification. The newer Error Correcting Tournament work is a more direct and tighter reduction to binary classification. |

Regression -> Classification |
John Langford and Bianca Zadrozny Estimating Class Membership Probabilities Using Classifier Learners AISTAT 2005
.ps.gz .pdf .tex John Langford, Roberto Oliveira, and Bianca Zadrozny Predicting Conditional Quantiles via Reduction to Classification UAI 2006 .ps.gz .pdf |
The first paper shows how squared loss regression can be done with a classifier using the "Probing" algorithm. It is the first example of a regret reduction. The tutorial (above) has a slightly cleaner but equivalent algorithm and analysis. The second paper uses a similar technique (the "Quanting" algorithm) to reduce quantile regression to classification. |

Multiclass -> Binary (Error Transform) |
Alina Beygelzimer, Varsha Dani, Tom Hayes, John Langford, and Bianca Zadrozny Reductions Between Classification Tasks ICML 2005 .tex Naoki Abe, Bianca Zadrozny, and John Langford An Iterative Method for Multi-class Cost-sensitive Learning KDD2004 .ps.gz, .pdf, .tex Alina Beygelzimer, John Langford, and Bianca Zadrozny Weighted One Against All AAAI 2005 .ps.gz .pdf .tex |
The first paper formalizes the notion of an error transform reduction and presents "Weighted All Pairs", proving that it is an error transform reduction exist from cost sensitive multiclass classification problems to binary classification. The second paper is from earlier when we didn't know how to do this in a noninteractive fashion. It is a practical and effective algorithm. The Weighted One Against All reduction is a universal improvement on the common One-Against-All Reduction |

RL -> Classification |
John Langford and Bianca Zadrozny Reducing T-step Reinforcement Learning to Classification (Error Reduction Version) John Langford and Bianca Zadrozny Relating Reinforcement Learning Performance to Classification Performance ICML 2005 .ps.gz .pdf .tex |
These papers analyze how to decompose Reinforcement Learning performance into classification performance. The unpublished draft is about an error transform reduction. The ICML paper is about a nonconstructive reduction. |

Importance Weighted Classification -> Classification |
Bianca Zadrozny, John Langford, and Naoki Abe Cost Sensitive Learning by Cost-Proportionate Example Weighting ICDM 2003 .ps.gz, .pdf, .tex | The "costing" algorithm reduces importance weighted classification to classification, while empirically providing superior performance to hand crafted algorithms. |