The Peekaboom Dataset

Luis von Ahn’s Peekaboom project has yielded data (830MB).

Peekaboom is the second attempt (after Espgame) to use voluntary game play to produce a dataset that is useful for learning to solve vision problems. As a second attempt, it is meant to address the shortcomings of the first. In particular:

  1. The locations of specific objects are provided by the data.
  2. The data collection is far more complete and extensive.

The data consists of:

  1. The source images. (1 file per image, just short of 60K images.)
  2. The in-game events. (1 file per image, in a lispy syntax.)
  3. A description of the event language.
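Since the event files use a lispy syntax, a generic S-expression reader is one plausible starting point for loading them. The following is a minimal sketch in Python, not the dataset's official tooling: the function names (`tokenize`, `parse`, `read_events`) and the sample event text are made up for illustration, and the real event vocabulary is whatever the accompanying description of the event language defines. It also assumes atoms contain no whitespace or quoted strings.

```python
from typing import List, Union

# An expression is either an atom (kept as a string) or a list of expressions.
SExpr = Union[str, List["SExpr"]]


def tokenize(text: str) -> List[str]:
    # Pad parentheses with spaces so str.split() separates them from atoms.
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens: List[str]) -> SExpr:
    """Parse one expression from the front of `tokens`, consuming what it reads."""
    if not tokens:
        raise ValueError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items: List[SExpr] = []
        while tokens and tokens[0] != ")":
            items.append(parse(tokens))
        if not tokens:
            raise ValueError("missing closing parenthesis")
        tokens.pop(0)  # discard the matching ")"
        return items
    if token == ")":
        raise ValueError("unexpected closing parenthesis")
    return token


def read_events(text: str) -> List[SExpr]:
    """Read every top-level expression in `text` into nested Python lists."""
    tokens = tokenize(text)
    events: List[SExpr] = []
    while tokens:
        events.append(parse(tokens))
    return events


if __name__ == "__main__":
    # Hypothetical event text; the real files follow the dataset's own event language.
    sample = "(click 10 20) (guess dog)"
    for event in read_events(sample):
        print(event)
```

Atoms are left as strings on purpose: converting coordinates or timestamps to numbers is better done once the actual event language has been consulted.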

There is a great deal of very specific and relevant data here, so the hope that it will help solve vision problems seems quite reasonable.

3 Replies to “The Peekaboom Dataset”

  1. John, this post got me wondering about something maybe unrelated to this. Yesterday on NPR they had an interview with Jeff Hawkins about his current start-up Numenta and the work they’re doing on AI by modeling the functioning of the neocortex. He mentioned that he was using machine vision as a test case, and that he’s been having some very promising results. I’m curious whether you’re familiar at all with his work, and whether it falls within the subject of “machine learning” or is something quite different.

  2. I was not familiar, but it does look interesting, especially with respect to the workshop on atomic learning we are about to run.

    Unfortunately, it looks like Jeff is essentially uncontactable due to success.

    Yes, I would count that as machine learning.
