New York’s ML Day

I’m not as naturally exuberant as Muthu 2 or David about CS/Econ day, but I believe it and ML day were certainly successful.

At the CS/Econ day, I particularly enjoyed Toumas Sandholm’s talk which showed a commanding depth of understanding and application in automated auctions.

For the machine learning day, I enjoyed several talks and posters (I better, I helped pick them.). What stood out to me was number of people attending: 158 registered, a level qualifying as “scramble to find seats”. My rule of thumb for workshops/conferences is that the number of attendees is often something like the number of submissions. That isn’t the case here, where there were just 4 invited speakers and 30-or-so posters. Presumably, the difference is due to a critical mass of Machine Learning interested people in the area and the ease of their attendance.

Are there other areas where a local Machine Learning day would fly? It’s easy to imagine something working out in the San Francisco bay area and possibly Germany or England.

The basic formula for the ML day is a committee picks a few people to give talks, and posters are invited, with some of them providing short presentations. The CS/Econ day was similar, except they managed to let every submitter do a presentation. Are there tweaks to the format which would improve things?

NIPS 2008 workshop on ‘Learning over Empirical Hypothesis Spaces’

This workshop asks for insights how far we may/can push the theoretical boundary of using data in the design of learning machines. Can we express our classification rule in terms of the sample, or do we have to stick to a core assumption of classical statistical learning theory, namely that the hypothesis space is to be defined independent from the sample? This workshop is particularly interested in – but not restricted to – the ‘luckiness framework’ and the recently introduced notion of ‘compatibility functions’ in a semi-supervised learning context (more information can be found at http://www.kuleuven.be/wehys).

Workshop Summary—Principles of Learning Problem Design

This is a summary of the workshop on Learning Problem Design which Alina and I ran at NIPS this year.

The first question many people have is “What is learning problem design?” This workshop is about admitting that solving learning problems does not start with labeled data, but rather somewhere before. When humans are hired to produce labels, this is usually not a serious problem because you can tell them precisely what semantics you want the labels to have, and we can fix some set of features in advance. However, when other methods are used this becomes more problematic. This focus is important for Machine Learning because there are very large quantities of data which are not labeled by a hired human.

The title of the workshop was a bit ambitious, because a workshop is not long enough to synthesize a diversity of approaches into a coherent set of principles. For me, the posters at the end of the workshop were quite helpful in getting approaches to gel.

Here are some answers to “where do the labels come from?”:

  1. Simulation Use a simulator (which need not be that good) to predict the cost of various choices and turn that into label information. Ashutosh had some cool demos showing the power of this approach. Gregory also presented a poster which might be viewed this way.
  2. Agreement A label is a point of agreement. Luis often used an agreement mechanism to induce labels with games. Sham discussed the power of agreement to constrain learning algorithms. Huzefa‘s work on bioprediction can be thought of as partly using agreement with previous structures to simulate the label of a new structure.
  3. Compilation Labels can be found by compiling one learning problem into another. Mark and I both talked about reductions a bit, which come with some nice formal guarantees.
  4. Backprop Labels are the signals in generalized backpropagation (David Bradley‘s talk).

Some answers to “where do the data come from” are:

  1. Everywhere The essential idea is to integrate as many data sources as possible. Rakesh had several algorithms which (in combination) allowed him to use a large number of diverse data sources in a text domain.
  2. Sparsity A representation is formed by finding a sparse set of basis functions on otherwise totally unlabeled data. Rajat discussed self-taught learning algorithms which achieve this.
  3. Self-prediction A representation is formed by learning to self-predict a set of raw features. Hal‘s talk covered this idea.

A workshop like this is successful if it informs the questions we ask (and answer) in the future. Some natural questions (some of which were discussed) are:

  1. What is a natural, sufficient langauge for adding prior information into a learning system? Which languages are insufficient? Shai described a sense in which kernels are insufficient as a language for prior information. Bayesian analysis emphasizes reasoning about the parameters of the model, but the language of examples or maybe label expectations may be more natural.
  2. What is missing from the above lists? And are the elements of the lists actually distinct?
  3. How do we modularize? Many of the approaches use problem-specific tricks. That’s to be expected for a direction of research which is just starting, but it’s important to modularize these techniques so they can be repeatedly and easily applied. Achieving modularity in a manner which supports prior information properly seems tricky.
  4. How do we formalize and analyze? Of the items listed above, I feel like we only have some reasonable understanding of the compilation approach. The other approaches and questions are essentially unexplored territory where some serious thinking may be helpful.

NIPS workshops extended to 3 days

(Unofficially, at least.) The Deep Learning Workshop is being held the afternoon before the rest of the workshops in Vancouver, BC. Separate registration is needed, and open.

What’s happening fundamentally here is that there are too many interesting workshops to fit into 2 days. Perhaps we can get it officially expanded to 3 days next year.