Reinforcement Learning Reductions Analysis

This is a best-effort trace of the development of reductions analysis for reinforcement learning. (See also the sample complexity analysis page.)

With a Reduction Guarantee

RL->regression given an oracle to the optimal state distribution. Randomized.Sham Kakade, John Langford Approximately Optimal Approximate Reinforcement Learning ICML 2002. Also in: Sham Kakade On the Sample Complexity of Reinforcement Learning Thesis, Gatsby, UCL 2003
RL->binary classification given an oracle to the optimal state distribution. DerandomizedDrew Bagnell, Sham Kakade, Andrew Ng, & Jeff Schneider, Policy Search by Dynamic Programming, NIPS 2003
RL->binary classification given a generative model. e->TeJohn Langford and Bianca Zadrozny Reducing T-step Reinforcement Learing to Classification 2003 (unpublished)
RL->binary classification given a reset.Bianca Zadrozny Policy mining: Learning decision policies from fixed sets of data, Ph.D. Thesis, University of California, San Diego, 2003
RL->binary classifiation given oracle acces to optimal policy at training time. John Langford and Bianca Zadrozny Relating Reinforcement Learning Performance to Classification Performance ICML 2005
RL->binary classification given oracle access optimal policy (nice application)Doron Blatt and Alfred Hero, From weighted classification to policy search, NIPS 2005
RL->binary classification given oracle access to optimal policy at training time. Improves time complexity of the previous. Randomized.Hal Daumé III, John Langford, and Daniel Marcu Search-Based Structured Prediction (unpublished)

Without a Reduction Guarantee

There are also several efforts which are 'type correct', but without a reduction-style guarantee.
Reduces to importance weighted classification and applies to planning problems.Alan Fern, SungWook Yoon, and Robert Givan, Approximate Policy Iteration with a Policy Language Bias NIPS 2003
Shows the approach can work for some standard test domains.Michail G. Lagoudakis and Ronald Parr, Reinforcement Learning as Classification: Leveraging Modern Classifiers ICML 2003