Several bits of progress have been made since Sanjoy pointed out the significant lack of theoretical understanding of active learning. This is an update on the progress I know of. As a refresher, active learning as meant here is:
- There is a source of unlabeled data.
- There is an oracle from which labels can be requested for unlabeled data produced by the source.
- The goal is to perform well with minimal use of the oracle.
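The protocol above can be sketched as a generic pool-based loop. Everything here is a hypothetical stand-in for illustration: `oracle`, `train`, and `uncertainty` are whatever labeler, learner, and query-selection heuristic you plug in, and the loop just spends a fixed query budget on the examples the current classifier is least sure about.

```python
def active_learn(pool, oracle, train, uncertainty, budget):
    """Generic pool-based active learning loop (illustrative sketch).

    pool: unlabeled examples from the source
    oracle: maps an example to its label (the costly resource)
    train: maps a labeled dataset to a classifier
    uncertainty: scores how informative an example is to the model
    budget: maximum number of oracle queries allowed
    """
    labeled = []
    pool = list(pool)
    for _ in range(budget):
        if not pool:
            break
        model = train(labeled)
        # Query the example the current model is least certain about.
        x = max(pool, key=lambda ex: uncertainty(model, ex))
        pool.remove(x)
        labeled.append((x, oracle(x)))
    return train(labeled)
```

The point of the abstraction is that "perform well with minimal use of the oracle" becomes: achieve low error with a small `budget`.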
Here is what I’ve learned:
- Sanjoy has developed sufficient and semi-necessary conditions for active learning given the assumptions of IID data and “realizability” (that one of the classifiers is a correct classifier).
- Nina, Alina, and I developed an algorithm for active learning relying only on the assumption of IID data. A draft is here.

- Nicolo, Claudio, and Luca showed that it is possible to do active learning in an entirely adversarial setting for linear threshold classifiers here. This was published a year or two ago; I only recently learned about it.
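To give a flavor of the linear-threshold setting, here is a margin-based selective-sampling sketch: labels are requested with probability inversely related to the margin, so uncertain points are queried more often. This is only an illustration in the spirit of such results, not a faithful reproduction of the published algorithm; the parameter `b` trading off query rate against margin is an assumption of this sketch.

```python
import random

def selective_sampling_perceptron(stream, oracle, dim, b=1.0, seed=0):
    """Margin-based selective sampling for a linear threshold classifier.

    Queries the oracle with probability b / (b + |margin|), so
    low-margin (uncertain) examples are labeled more often; labeled
    mistakes trigger a standard perceptron update.
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    queries = 0
    for x in stream:
        margin = sum(wi * xi for wi, xi in zip(w, x))
        # Randomized query rule: certain (large-margin) points are
        # usually skipped, saving oracle calls.
        if rng.random() < b / (b + abs(margin)):
            y = oracle(x)  # label in {-1, +1}
            queries += 1
            if y * margin <= 0:  # mistake: perceptron update
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w, queries
```

On an easy stream the query count typically falls well below the stream length, which is exactly the behavior the adversarial analyses aim to guarantee.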
All of these results are relatively ‘rough’: they don’t necessarily yield practical algorithms as stated (although the last one has a few experiments). None of these results are directly comparable because the assumptions vary. Comparing the assumptions and the results leads to a number of remaining questions:
- Do the sufficient and semi-necessary conditions apply in the IID-only case? In the adversarial case?
- Is there a generic algorithm for any hypothesis space that works in the fully adversarial setting?
- What are special cases of these algorithms which are computationally tractable and useful?
The Foundations of Active Learning workshop at NIPS should be a good place to discuss these questions.