FOCS 2020 tutorial on the Theoretical Foundations of Reinforcement Learning

Alekh Agarwal, Akshay Krishnamurthy, and John Langford

Overview

This tutorial covers the theoretical foundations of reinforcement learning, including many developments from the last half-decade that substantially deepen our understanding of what is possible and why. It also covers several important open problems. The tutorial has three key parts: the information theory of reinforcement learning, optimization and gradient descent in reinforcement learning, and latent state discovery.

The tutorial video

Backup video

Slides

Primary references

Other references

Lower bounds

The three challenges

Contextual bandits

Tabular Markov decision process

Linear bandits

Extrapolation methods for RL

Bellman and witness rank

Factored MDPs

Policy optimization algorithms

Exploration in policy optimization

Latent state discovery