Second Annual Reinforcement Learning Competition

The Second Annual Reinforcement Learning Competition is about to get started. The aim of the competition is to facilitate direct comparisons between various learning methods on important and realistic domains. This year’s event will feature well-known benchmark domains as well as more challenging problems of real-world complexity, such as helicopter control and robot soccer keepaway.

The competition begins on November 1st, 2007 when training software is released. Results must be submitted by July 1st, 2008. The competition will culminate in an event at ICML-08 in Helsinki, Finland, at which the winners will be announced.

For more information, visit the competition website.

Online as the new adjective

Online learning is in vogue, which means we should expect to see in the near future:

  1. Online boosting.
  2. Online decision trees.
  3. Online SVMs. (actually, we’ve already seen)
  4. Online deep learning.
  5. Online parallel learning.
  6. etc…

There are three fundamental drivers of this trend.

  1. Increasing size of datasets makes online algorithms attractive.
  2. Online learning can simply be more efficient than batch learning. Here is a picture from a class on online learning:
    online optimization picture
    The point of this picture is that even in 3 dimensions and even with linear constraints, finding the minima of a set in an online fashion can be typically faster than finding the minima in a batch fashion. To see this, note that there is a minimal number of gradient updates (i.e. 2) required in order to reach the minima in the typical case. Given this, it’s best to do these updates as quickly as possible, which implies doing the first update online (i.e. before seeing all the examples) is preferred. Note that this is the simplest possible setting—making the constraints nonlinear, increasing the dimensionality, introducing noise, etc… all make online learning preferred.
  3. Online learning is conceptually harder, and hence more researchable. The essential difficulty is that there is no well defined static global optima, as exists for many other optimization-based learning algorithms. Grappling with this lack is of critical importance.

NIPS workshops extended to 3 days

(Unofficially, at least.) The Deep Learning Workshop is being held the afternoon before the rest of the workshops in Vancouver, BC. Separate registration is needed, and open.

What’s happening fundamentally here is that there are too many interesting workshops to fit into 2 days. Perhaps we can get it officially expanded to 3 days next year.