Principles of Learning Problem Design
NIPS 2007 Workshop
December 7, 2007 | Whistler, BC

Synopsis
This workshop is about how to design learning problems. The dominant approach to applying machine learning in practice involves a human labeling data. This approach is limited to situations where human experts exist, can be afforded, and are fast enough to solve the relevant problem. In many settings these constraints are not met, yet machine learning still appears possible via cleverly reinterpreting or reusing existing data. The basic idea is to create supervised learning problems from data which is not conventionally labeled, in such a way that successfully solving these ancillary problems helps solve the original learning problem. Since this task is akin to the problem of mechanism design in economics and game theory, we call it learning problem design. Recent examples of learning problem design include converting otherwise-unsupervised problems into supervised problems, creating recursive prediction problems (predicting from predictions), and reducing one learning task to another. This area is new and not entirely defined. Our goal is to bring together anyone interested in the topic, map out what we do and don't understand, and attempt to articulate the principles of learning problem design.

Examples
Here are a few examples of the kinds of problems where learning problem design becomes important.
There are several learning settings which, although related, do not adequately describe the problem we are interested in:
(1) These problems are not truly unsupervised, because there is an identifiable goal and the data carries a strongly relevant signal.
(2) The problems are often fundamentally easier than reinforcement learning, because the implications of changing world state can often be neglected. (Although perhaps not for (3) above.)
(3) This is also not a bandit problem, because we are interested in systems that generalize across available context information. In the language of bandit learning, competing with just the set of arms is too weak---we want to compete with a large set of policies choosing arms.
(4) This is not a multitask or transfer learning task, because we are not trying to solve or reuse the solutions of multiple (human-labeled) supervised tasks. Instead, we are interested in creating new "synthetic" prediction problems from data for use either as subroutines or as final output.
Naturally, techniques from all of these related settings may be helpful.
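To make the core idea concrete, here is a minimal sketch of one form of learning problem design: turning an unlabeled sequence into a synthetic supervised prediction problem by treating each item as the label for the context preceding it. The function name and corpus below are purely illustrative, not part of any workshop submission.

```python
def make_prediction_pairs(sequence, context_len=2):
    """Create (context, next-item) training pairs from an unlabeled sequence.

    Each pair is a labeled example for an ordinary supervised learner,
    even though no human labeled the data.
    """
    pairs = []
    for i in range(context_len, len(sequence)):
        context = tuple(sequence[i - context_len:i])
        pairs.append((context, sequence[i]))
    return pairs

# Illustrative unlabeled "corpus" of tokens.
corpus = "the cat sat on the mat the cat ran".split()
pairs = make_prediction_pairs(corpus)
# e.g. pairs[0] == (('the', 'cat'), 'sat')
```

Any off-the-shelf classifier can then be trained on these pairs; the design question is whether solving this ancillary prediction problem actually helps the original task.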
Invited speakers:
Luis von Ahn, CMU
Hal Daume, Utah
Sham Kakade, TTI
Mark Reid, NICTA
Rajat Raina, Stanford
David Bradley, CMU
Important dates:
Submission deadline: November 1
Notification: November 5
Workshop: December 7
Organizers: