Machine learning and learning theory research
The Journal of Machine Learning Gossip has some fine satire about learning research. In particular, the guides are amusing and remarkably true.
As in all things, it’s easy to criticize the way things are and harder to make them better.
One way to organize learning theory is by assumption (in the assumption = axiom sense), from no assumptions to many assumptions. As you travel down this list, the statements become stronger, but the scope of applicability decreases.
This doesn’t include all forms of learning theory, because I do not know them all. If there are other bits you know of, please comment.
Machine learning makes the New Scientist. From the article:
COMPUTERS can learn the meaning of words simply by plugging into Google. The finding could bring forward the day that true artificial intelligence is developed….
But Paul Vitanyi and Rudi Cilibrasi of the National Institute for Mathematics and Computer Science in Amsterdam, the Netherlands, realised that a Google search can be used to measure how closely two words relate to each other. For instance, imagine a computer needs to understand what a hat is.
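The measure in the paper is the Normalized Google Distance, computed purely from page-hit counts. As a minimal sketch (the hit counts below are made up for illustration, not real Google numbers):

```python
import math

def ngd(fx, fy, fxy, N):
    """Normalized Google Distance from raw page-hit counts.

    fx, fy: hits for each term searched alone
    fxy:    hits for both terms together
    N:      total number of pages indexed
    Returns 0 when the terms always co-occur; grows as they diverge.
    """
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(N) - min(lx, ly))

# Hypothetical counts: "hat" and "head" co-occur often, so the
# distance between them comes out fairly small.
print(ngd(fx=8_000_000, fy=120_000_000, fxy=6_000_000, N=8_000_000_000))
```

A term compared with itself (fx = fy = fxy) gives distance 0, which is the sanity check to try first.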
You can read the paper at KC Google.
Hat tip: Kolmogorov Mailing List
Any thoughts on the paper?
One nice use for this blog is to consider and discuss papers that have appeared at recent conferences. I really enjoyed Andrew Ng and Sham Kakade’s paper Online Bounds for Bayesian Algorithms. From the paper:
The philosophy taken in the Bayesian methodology is often at odds with that in the online learning community…. the online learning setting makes rather minimal assumptions on the conditions under which the data are being presented to the learner — usually, Nature could provide examples in an adversarial manner. We study the performance of Bayesian algorithms in a more adversarial setting… We provide competitive bounds when the cost function is the log loss, and we compare our performance to the best model in our model class (as in the experts setting).
It’s a very nice analysis of some of my favorite algorithms, and it all hinges on a beautiful theorem:
Let Q be any distribution over parameters theta. Then for all sequences S:

L_{Bayes}(S) ≤ L_Q(S) + KL(Q||P)

where P is our prior and the losses L are: first, the log-loss of the Bayes algorithm (run online), and second, the expected log-loss with respect to an arbitrary distribution Q.
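The bound is easy to check numerically on a toy problem. Below is a small sketch (my own illustration, not from the paper): two fixed experts predicting a binary sequence, a uniform prior, and the Bayes mixture run online; taking Q to be a point mass on expert i, the bound reads L_Bayes(S) ≤ L_i(S) + log(1/P(i)).

```python
import math

# Two hypothetical experts, each predicting Pr[y=1] as a constant.
experts = [0.7, 0.4]
prior = [0.5, 0.5]

S = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]  # an arbitrary binary sequence

def lik(p, y):
    """Likelihood an expert with parameter p assigns to outcome y."""
    return p if y == 1 else 1 - p

# Run the Bayes algorithm online: predict with the posterior-weighted
# mixture, suffer log-loss, then update the posterior weights.
weights = prior[:]
L_bayes = 0.0
for y in S:
    mix = sum(w * lik(p, y) for w, p in zip(weights, experts))
    L_bayes += -math.log(mix)
    weights = [w * lik(p, y) for w, p in zip(weights, experts)]
    total = sum(weights)
    weights = [w / total for w in weights]

# Verify the theorem with Q = point mass on each expert:
# L_Bayes(S) <= L_i(S) + KL(delta_i || P) = L_i(S) + log(1/prior[i])
for i, p in enumerate(experts):
    L_i = -sum(math.log(lik(p, y)) for y in S)
    assert L_bayes <= L_i + math.log(1 / prior[i]) + 1e-9
```

The point-mass case is exactly the classic experts-setting regret bound: the Bayes mixture is never worse than the best single expert plus the log of its inverse prior weight.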
Any thoughts? Any other papers you think we should read?