Joel Predd mentioned “Antilearning” by Adam Kowalczyk, which is interesting from a foundational intuitions viewpoint.
There is a pervasive intuition that “nearby things tend to have the same label”. This intuition is instantiated in SVMs, nearest neighbor classifiers, decision trees, and neural networks. It turns out there are natural problems where this intuition is the opposite of the truth.
One natural situation where this occurs is in competition. For example, when Intel fails to meet its earnings estimate, is this evidence that AMD is doing badly also? Or evidence that AMD is doing well?
This violation of the proximity intuition means that when examples are few, negating a classifier which attempts to exploit proximity can provide predictive power (hence the term “antilearning”).
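A minimal sketch of the negation trick, assuming scikit-learn-style classifiers with 0/1 labels (the `Negated` wrapper is a hypothetical name, not anything from the paper):

```python
# minimal sketch: a binary classifier with below-chance accuracy p becomes an
# above-chance classifier (accuracy 1 - p) when its predictions are flipped
class Negated:
    """Hypothetical wrapper that negates a 0/1 classifier's predictions."""

    def __init__(self, clf):
        self.clf = clf

    def fit(self, X, y):
        self.clf.fit(X, y)
        return self

    def predict(self, X):
        return 1 - self.clf.predict(X)  # flip the 0/1 labels
```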
There are a few other reasons why this happens:
* Assume the XOR problem with four instances. If you perform 4-fold cross-validation (which on four instances is leave-one-out) with a linear SVM, you are guaranteed to misclassify every held-out instance (see the first sketch after this list).
* Assume that you do not perform stratified cross-validation, and that your learning algorithm ignores all the attributes and simply predicts the majority label of the training set. If there are 50% white and 50% black instances in your data, and you make a training/test split where the training set has more than 50% white instances, the test set is guaranteed to contain fewer than 50% white instances, so the majority-label predictor must score below 50% on the test set (see the second sketch after this list).
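A quick demonstration of the XOR point, assuming scikit-learn’s `SVC` with a linear kernel and leave-one-out splits:

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.svm import SVC

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])  # XOR labels

errors = 0
for train_idx, test_idx in LeaveOneOut().split(X):
    # any three XOR points are linearly separable, so the SVM fits them fine...
    clf = SVC(kernel="linear", C=1e6).fit(X[train_idx], y[train_idx])
    # ...but the held-out point always lands on the wrong side of the boundary
    errors += int(clf.predict(X[test_idx])[0] != y[test_idx][0])

print(f"{errors} of {len(X)} held-out instances misclassified")  # 4 of 4
```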
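And a sketch of the stratification point; the data and the particular split here are made up for illustration:

```python
import numpy as np

y = np.array([1] * 5 + [0] * 5)  # 50% "white" (1), 50% "black" (0)

# a non-stratified split that happens to over-represent white in training
train = np.array([0, 1, 2, 3, 5, 6])       # 4 white, 2 black
test = np.setdiff1d(np.arange(10), train)  # 1 white, 3 black

# a learner that ignores the attributes and predicts the training majority label
majority = int(y[train].mean() > 0.5)

print((y[test] == majority).mean())  # 0.25: guaranteed below chance
```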
Actually, this is not quite right. I am pretty sure that AMD and Intel are positively correlated. Market conditions seem to matter more than the effects of competition in this case.
The “Proximity intuition”, as John calls it, underlies much of our thinking. Consider algorithms like hill-climbing and simulated annealing. As humans, we tend to assume that, until proven otherwise, “everything is smooth”.
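As a toy illustration of how much that smoothness assumption buys, here is a minimal hill-climbing sketch on a made-up 1-D objective:

```python
import random

def hill_climb(f, x=0.0, step=0.1, iters=1000):
    """Greedy local search: only sensible if nearby points have similar values."""
    for _ in range(iters):
        candidate = x + random.uniform(-step, step)  # look at a nearby point
        if f(candidate) > f(x):                      # keep any local improvement
            x = candidate
    return x

# on a smooth objective the proximity intuition pays off: we end up near 3.0
print(hill_climb(lambda x: -(x - 3.0) ** 2))
```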