A tutorial on modern reinforcement learning theory at ICML 2006, covering sample complexity analysis of exploration and analysis of reductions.

Reinforcement learning is a formalization of the AI problem in learning. The work here is very general and sometimes even useful.

Alexander L. Strehl, Lihong Li, Eric Wiewiora, John Langford, and Michael L. Littman. PAC Model-Free Reinforcement Learning. ICML 2006. (.tex, .ps.gz, .pdf) | An MDP can be explored with only O(SA) actions, where S is the number of states and A is the number of actions. Slides from Lihong's presentation.

Sham Kakade, Michael Kearns, and John Langford. Exploration in Metric State Spaces. ICML 2003. (.ps.gz, .pdf, .tex) | An MDP with a metric property can be explored with an amount of experience related to a covering number of the state space.

Sham Kakade and John Langford. Approximately Optimal Approximate Reinforcement Learning. ICML 2002. (.ps.gz, .pdf, .tex) | Introduces the "conservative policy iteration" algorithm, which keeps the advantages of policy iteration and policy gradient methods while avoiding several of their disadvantages.

John Langford, Martin Zinkevich, and Sham Kakade. Competitive Analysis of the Explore/Exploit Tradeoff. ICML 2002. (.ps.gz, .pdf, .tex) | Analysis of the explore/exploit tradeoff in a simplified model.
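
As a rough illustration of the "conservative" step in the conservative policy iteration entry above: instead of jumping to a greedy policy as policy iteration does, the new policy is a mixture (1 - alpha) * pi + alpha * pi', where pi' is an (approximately) greedy policy. The sketch below assumes a tabular policy stored as a NumPy array and a made-up advantage table. The function names, the example numbers, and the fixed alpha are all hypothetical, and the paper's rule for choosing alpha is not reproduced here.

```python
import numpy as np


def greedy_policy(advantages):
    """Deterministic policy that picks the highest-advantage action in each state."""
    n_states, n_actions = advantages.shape
    greedy = np.zeros((n_states, n_actions))
    greedy[np.arange(n_states), advantages.argmax(axis=1)] = 1.0
    return greedy


def conservative_update(pi, pi_greedy, alpha):
    """Mixture update: (1 - alpha) * pi + alpha * pi_greedy."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (1.0 - alpha) * pi + alpha * pi_greedy


# Hypothetical example: 2 states, 2 actions, uniform starting policy.
# The advantage values are invented purely for illustration.
pi = np.full((2, 2), 0.5)
advantages = np.array([[0.2, -0.2],
                       [-0.1, 0.1]])

new_pi = conservative_update(pi, greedy_policy(advantages), alpha=0.1)
print(new_pi)
```

With a small alpha the policy moves only slightly toward the greedy choice in each state, and every row of the result remains a probability distribution.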

A presentation on the state of RL theory. (The beginning.)

RLBench, a reinforcement learning benchmark suite.