Learning Theory, by assumption – Machine Learning (Theory)

One way to organize learning theory is by assumption (in the assumption = axiom sense), from no assumptions to many assumptions. As you travel down this list, the statements become stronger, but the scope of applicability decreases.

No assumptions
1. Online learning: There exists a meta prediction algorithm which competes well with the best element of any set of prediction algorithms.
2. Universal Learning: Using a “bias” of 2^{-description length of Turing machine} in learning is equivalent to all other computable biases up to some constant.
3. Reductions: The ability to predict well on classification problems is equivalent to the ability to predict well on many other learning problems.
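
To make the online learning statement concrete, here is a minimal sketch of one well-known meta algorithm, exponential weights (often called Hedge). The simulated experts, their error rates, and the names `hedge`, `loss_rows`, and `eta` are assumptions made up for illustration. The regret bound in the last lines is the standard one for losses in [0, 1], and it holds for any loss sequence, not just this random one.

```python
import math
import random

def hedge(loss_rows, eta):
    """Exponential weights over rounds of per-expert losses in [0, 1].

    Returns (expected cumulative loss of the meta algorithm,
             cumulative loss of each expert).
    """
    n = len(loss_rows[0])
    cum = [0.0] * n          # cumulative loss of each expert so far
    total = 0.0              # expected cumulative loss of the meta algorithm
    for losses in loss_rows:
        m = min(cum)         # subtract the minimum for numerical stability
        weights = [math.exp(-eta * (c - m)) for c in cum]
        z = sum(weights)
        # Expected loss when following expert i with probability weights[i] / z.
        total += sum(w / z * l for w, l in zip(weights, losses))
        cum = [c + l for c, l in zip(cum, losses)]
    return total, cum

# Hypothetical data: expert i errs independently with probability 0.1 + 0.08 * i.
rng = random.Random(0)
T, N = 2000, 10
loss_rows = [[1.0 if rng.random() < 0.1 + 0.08 * i else 0.0 for i in range(N)]
             for _ in range(T)]

eta = math.sqrt(8 * math.log(N) / T)
meta_loss, expert_losses = hedge(loss_rows, eta)
regret = meta_loss - min(expert_losses)
bound = math.sqrt(T * math.log(N) / 2)   # standard bound for this choice of eta
print(f"regret = {regret:.2f}, bound = {bound:.2f}")
```

No assumption about how the losses are generated is needed for the bound, which is why this item sits under "No assumptions".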
Independent and Identically Distributed (IID) Data
1. Performance Prediction: Based upon past performance, you can predict future performance.
2. Uniform Convergence: Performance prediction works even after choosing classifiers based on the data from large sets of classifiers.
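
As a rough illustration of the two IID statements, the sketch below computes a one-sided Hoeffding deviation for a single fixed classifier and the union-bound version for a finite set of classifiers. The function names and the numbers (1000 examples, 1000 classifiers, failure probability 0.05) are assumptions chosen for illustration, and tighter bounds than these exist.

```python
import math

def hoeffding_gap(n, delta):
    """With probability >= 1 - delta over n IID examples, the true error of a
    single fixed classifier is at most its empirical error plus this gap."""
    return math.sqrt(math.log(1.0 / delta) / (2.0 * n))

def uniform_gap(n, delta, num_classifiers):
    """Same guarantee, holding simultaneously for every classifier in a finite
    set, obtained from a union bound over the set."""
    return math.sqrt(math.log(num_classifiers / delta) / (2.0 * n))

single = hoeffding_gap(n=1000, delta=0.05)
uniform = uniform_gap(n=1000, delta=0.05, num_classifiers=1000)
print(f"single classifier gap: {single:.4f}")
print(f"gap holding for all 1000 classifiers: {uniform:.4f}")
```

The price of choosing among many classifiers after seeing the data shows up only inside the logarithm, which is why uniform convergence remains usable for large sets.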
IID and partial constraints on the data source
1. PAC Learning: There exist fast algorithms for learning when all examples agree with some function in a function class (such as monomials, decision lists, etc.).
2. Weak Bayes: The Bayes law learning algorithm will eventually reach the right solution as long as the right solution has a positive prior.
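
For the PAC learning item, a minimal sketch of the classic elimination algorithm for monomials (conjunctions of literals) may help. The encoding of literals as `(index, value)` pairs and the target function used below are assumptions for illustration. The algorithm relies on the constraint named above: every example must agree with some monomial.

```python
from itertools import product

def learn_monomial(examples):
    """Elimination algorithm for monomials.

    examples: list of (x, y), x a tuple of booleans, y the boolean label.
    A literal (i, True) stands for x_i and (i, False) for NOT x_i.
    """
    n = len(examples[0][0])
    # Start with every literal, then drop those falsified by a positive example.
    literals = {(i, v) for i in range(n) for v in (True, False)}
    for x, y in examples:
        if y:
            literals -= {(i, not x[i]) for i in range(n)}
    return literals

def predict(literals, x):
    """The learned monomial is true iff every remaining literal is satisfied."""
    return all(x[i] == v for i, v in literals)

# Hypothetical target over 4 variables: x_0 AND NOT x_2.
def target(x):
    return x[0] and not x[2]

examples = [(x, target(x)) for x in product([False, True], repeat=4)]
learned = learn_monomial(examples)
print(sorted(learned))
```

Each example is processed in time linear in the number of variables, which is the sense in which the algorithm is fast.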
Strong Constraints on the Data Source
1. Bayes Learning: When the data source is drawn from the prior, using Bayes law is optimal.
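
A minimal sketch of a Bayes law update over a finite set of hypotheses is below. The three coin biases, the uniform prior, and the fixed sequence of 70 heads and 30 tails are all assumptions for illustration. It shows the mechanics of the update, not a proof of optimality.

```python
def bayes_update(prior, likelihoods):
    """One step of Bayes law: posterior is proportional to prior * likelihood."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    z = sum(unnormalized)
    return [u / z for u in unnormalized]

biases = [0.3, 0.5, 0.7]            # hypothetical coin biases (probability of heads)
posterior = [1.0 / 3.0] * 3         # uniform prior, positive on every hypothesis
flips = [1] * 70 + [0] * 30         # illustrative data: 70 heads, 30 tails

for flip in flips:
    likelihoods = [b if flip else 1.0 - b for b in biases]
    posterior = bayes_update(posterior, likelihoods)

# Predicted probability that the next flip is heads, averaged over the posterior.
predictive = sum(p * b for p, b in zip(posterior, biases))
print([round(p, 5) for p in posterior], round(predictive, 4))
```

Nearly all of the posterior mass ends up on the bias 0.7 hypothesis, the one most consistent with the data.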

This doesn’t include all forms of learning theory, because I do not know them all. If there are other bits you know of, please comment.