Machine Learning (Theory)


In Active Learning, the question changes

Tags: Active,Questions jl@ 10:54 am

A little over 4 years ago, Sanjoy made a post saying roughly “we should study active learning theoretically, because not much is understood”.

At the time, we did not understand basic things such as whether or not it was possible to PAC-learn with an active algorithm without making strong assumptions about the noise rate. In other words, the fundamental question was “can we do it?”

The nature of the question has fundamentally changed in my mind. The answer to the previous question is “yes”, both information theoretically and computationally, in most places where supervised learning could be applied.

In many situations, the question has now changed to: “is it worth it?” Is the programming and computational overhead low enough to make the label cost savings of active learning worthwhile? Currently, there are situations where this question could go either way. Much of the challenge for the future is in figuring out how to make active learning easier or more worthwhile.

At the active learning tutorial, I stated a set of somewhat more precise research questions that I don’t yet have answers to, and which I believe are worth answering. Here is a bit of an expansion on those questions for those interested.

  1. Is active learning possible in a fully adversarial setting? By fully adversarial, I mean when an adversary controls all of the algorithm’s observations. Some work by Claudio and Nicolo has moved in this direction, but there is not yet a solid answer.
  2. Is there an efficient and effective reduction of active learning to supervised learning? The bootstrap IWAL approach is efficient but not effective in some situations where other approaches can succeed. The algorithm here is a reduction to a special kind of supervised learning where you can specify both examples and constraints. For many supervised learning algorithms, adding constraints seems problematic.
  3. Can active learning succeed with alternate labeling oracles? The ones I see people trying to use in practice often differ because they can provide answers of varying specificity and cost, or because some oracles are good for some questions, but not good for others.
  4. At this point, there have been several successful applications of active learning, but that’s not the same thing as succeeding with more robust algorithms. Can we succeed empirically with more robust algorithms? And is the empirical cost of additional robustness worth the empirical peace-of-mind that your learning algorithm won’t go astray where other more aggressive approaches may do so?
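
For readers unfamiliar with the importance-weighting idea behind IWAL (question 2), here is a minimal sketch. It is not the bootstrap IWAL rule from the post: the names `iwal_sample`, `weighted_error`, `toy_query_prob`, and `toy_oracle` are hypothetical, and the query probability is a made-up rule chosen only for illustration. The point it shows is that if each label is requested with probability p and the example is then weighted by 1/p, the weighted error estimate stays unbiased, so the weighted examples can be handed to a supervised learner that accepts importance weights.

```python
import random


def iwal_sample(stream, query_prob, oracle, rng):
    """Return importance-weighted labeled examples as (x, y, weight) triples."""
    labeled = []
    for x in stream:
        p = query_prob(x)            # must be > 0 so that 1/p is finite
        if rng.random() < p:         # flip a coin with bias p
            y = oracle(x)            # pay for exactly one label
            labeled.append((x, y, 1.0 / p))
    return labeled


def weighted_error(h, labeled, n):
    """Importance-weighted estimate of h's error rate over all n stream points."""
    return sum(w for x, y, w in labeled if h(x) != y) / n


def toy_query_prob(x):
    # Hypothetical rule (an assumption, not from the post): query more often
    # near 0.5, with a floor of 0.1 so every point has some chance of a label.
    return max(0.1, 1.0 - 2.0 * abs(x - 0.5))


def toy_oracle(x):
    # Hypothetical noiseless labeler: the true threshold is 0.5.
    return x >= 0.5


if __name__ == "__main__":
    rng = random.Random(0)
    n = 20000
    stream = [rng.random() for _ in range(n)]
    labeled = iwal_sample(stream, toy_query_prob, toy_oracle, rng)
    # A hypothesis with threshold 0.6 errs exactly on [0.5, 0.6).
    estimate = weighted_error(lambda x: x >= 0.6, labeled, n)
    print(f"labels requested: {len(labeled)} of {n}")
    print(f"weighted error estimate: {estimate:.3f}")
```

In this toy setup the hypothesis with threshold 0.6 has true error 0.1 on uniform data, so the estimate should land near that value while requesting labels for only part of the stream. How to choose the query probability well is exactly where the real algorithms differ.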
5 Comments to “In Active Learning, the question changes”
  1. Shiva Kaul says:

    It was fun to watch the recent conquest of the agnostic setting in terms of the magnitude of sample complexity improvements and the general conditions in which they were realized. We still lack understanding of the precise conditions amenable to active learning. The (generalized) disagreement coefficient and splitting index are great, important contributions but work in this area is just beginning.

    The existing cost model – uniform, stationary, coming out of a fixed budget – was OK when active learning was being scrutinized relative to passive learning, but now requires more sophistication.

    Practically, there is an awful lot of inertia behind the manipulation of fixed data sets and the assumption of iid data. The transition requires both theoretical and engineering preparation.

    I hope to see some more cross-pollination with related fields as well. I think some of the most recent improvements could be folded back into experiment design and change-point estimation, and perhaps there could be a nice “passive vs. active” interchange with the compressed sensing folks.

  2. […] at 3:02 PM: As soon as I wrote this, I read a post on active learning in John Langford’s blog, which points to the following tutorial page, which among other […]

  3. Vikas says:

    Hi John,

    With respect to point 3, I just wanted to point to some recent papers that attempt to elicit alternative forms of supervision/model constraints using active learning schemes:

    Learning from Measurements in Exponential Families
    Percy Liang, Michael I. Jordan, and Dan Klein, ICML 2009

    Active Learning by Labeling Features
    Gregory Druck, Burr Settles, Andrew McCallum.
    To appear in Proceedings of EMNLP.

    Uncertainty Sampling and Transductive Experimental Design for Active Dual Supervision
    V. Sindhwani, P. Melville, R. Lawrence, ICML 2009


  4. […] thought was triggered by the ICML 2009 tutorial on active learning from John Langford’s blog article.  Active learning is a machine learning technique where the machine actively chooses which query […]

  5. […] label noise. Several more substantial improvements occurred, leading to a tutorial 2 years ago and discussion about what’s next. Since then, we cracked question (2) here and applied it to get an effective absurdly efficient […]
