Why Reinforcement Learning is Important

One prescription for solving a problem well is:

  1. State the problem, in the simplest way possible. In particular, this statement should involve no contamination with or anticipation of the solution.
  2. Think about solutions to the stated problem.

Stating a problem in a succinct and crisp manner tends to invite a simple, elegant solution. When a problem cannot be stated succinctly, we wonder if the problem is even understood. (And when a problem is not understood, we wonder if a solution can be meaningful.)

Reinforcement learning does step (1) well. It provides a clean, simple language to state general AI problems. In reinforcement learning there is a set of actions A, a set of observations O, and a reward r. The reinforcement learning problem, in general, is defined by a conditional measure D(o, r | (o,r,a)*) which produces an observation o and a reward r given a history (o,r,a)*. The goal in reinforcement learning is to find a policy pi: (o,r,a)* -> A mapping histories to actions so as to maximize (or approximately maximize) the expected sum of observed rewards.
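
To make the interface concrete, here is a minimal sketch in Python. The names (Environment, rollout, echo_policy) and the toy dynamics are illustrative assumptions, not part of any standard formulation: the environment stands in for D(o, r | (o,r,a)*) and the policy maps histories to actions.

    import random

    class Environment:
        """Stands in for D(o, r | (o,r,a)*): given the interaction history,
        produce the next observation and reward. The dynamics here are a toy
        example which rewards the action matching the previous observation."""
        def step(self, history):
            if history:
                last_observation, _, last_action = history[-1]
                reward = 1.0 if last_action == last_observation else 0.0
            else:
                reward = 0.0
            observation = random.choice([0, 1])
            return observation, reward

    def rollout(policy, environment, horizon):
        """Estimate the sum of rewards of a policy mapping histories to actions.
        The current observation and reward are passed separately only for
        convenience; they are the most recent elements of the history."""
        history, total_reward = [], 0.0
        for _ in range(horizon):
            observation, reward = environment.step(history)
            total_reward += reward
            action = policy(history, observation, reward)
            history.append((observation, reward, action))
        return total_reward

    # A simple policy: echo the most recent observation as the action.
    def echo_policy(history, observation, reward):
        return observation

    print(rollout(echo_policy, Environment(), horizon=20))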

This formulation is capable of capturing almost any (all?) AI problem. (Are there any other formulations capable of capturing a similar generality?) I don’t believe we yet have good RL solutions from step (2), but that is unsurprising given the generality of the problem.

Note that solving RL in this generality is impossible (for example, it can encode classification, which already cannot be solved in full generality without further assumptions). The two approaches that can be taken are:

  1. Simplify the problem. It is very common to consider the restricted problem where the history is summarized by the previous observation (a.k.a. a “Markov Decision Process”); a sketch of this restriction follows the list. In many cases, other restrictions are added.
  2. Think about relativized solutions (such as reductions).
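
To make restriction (1) concrete, here is a sketch (with illustrative names, compatible with the rollout sketch above) of how the Markov assumption shrinks the policy class: instead of mapping arbitrary histories to actions, a policy consults only the latest observation, so it can be represented as a simple table.

    def make_markov_policy(action_table):
        """A policy under the Markov restriction: the full history is ignored
        and only the most recent observation selects the action."""
        def policy(history, observation, reward):
            return action_table[observation]
        return policy

    # For example, always echo the observation seen most recently.
    markov_echo = make_markov_policy({0: 0, 1: 1})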

Both approaches are under active investigation.

Peekaboom

Luis has released Peekaboom, a successor to ESPgame (game site). The purpose of the game is similar: use the actions of people playing a game to gather data helpful in solving AI.

Peekaboom gathers more detailed, and perhaps more useful, data about vision. For ESPgame, the byproduct of the game was mutually agreed upon labels for common images. For Peekaboom, the location of the subimage generating the label is revealed by the game as well. Given knowledge about which portion of the image is related to a label, it may be more feasible to learn to recognize the appropriate parts.

There isn’t a dataset yet available for this game as there is for ESPgame, but hopefully a significant number of people will play and we’ll have one to work with soon.