Machine Learning (Theory)


Antilearning: When proximity goes bad

Tags: General jl@ 11:16 am

Joel Predd mentioned “Antilearning” by Adam Kowalczyk, which is interesting from a foundational-intuitions viewpoint.

There is a pervasive intuition that “nearby things tend to have the same label”. This intuition is instantiated in SVMs, nearest-neighbor classifiers, decision trees, and neural networks. It turns out there are natural problems where this intuition is the opposite of the truth.

One natural situation where this occurs is in competition. For example, when Intel fails to meet its earnings estimate, is this evidence that AMD is doing badly also? Or evidence that AMD is doing well?

This violation of the proximity intuition means that when the number of examples is small, negating a classifier which attempts to exploit proximity can provide predictive power (hence the term “antilearning”).
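A minimal sketch of this effect, using XOR as the problem and a nearest-centroid classifier as a stand-in for a generic proximity-exploiting learner (the specific classifier and dataset here are illustrative choices, not taken from Kowalczyk's paper):

```python
from math import dist

# XOR: the four points where "nearby things share a label" is exactly wrong.
points = [(0, 0), (1, 1), (0, 1), (1, 0)]
labels = [0, 0, 1, 1]

def nearest_centroid_predict(train_pts, train_lbls, x):
    """Predict the label whose class centroid is closest to x."""
    centroids = {}
    for lbl in set(train_lbls):
        members = [p for p, l in zip(train_pts, train_lbls) if l == lbl]
        centroids[lbl] = tuple(sum(c) / len(members) for c in zip(*members))
    return min(centroids, key=lambda lbl: dist(centroids[lbl], x))

# Leave-one-out: train on three points, predict the held-out fourth.
correct = 0
for i in range(4):
    train_pts = points[:i] + points[i + 1:]
    train_lbls = labels[:i] + labels[i + 1:]
    pred = nearest_centroid_predict(train_pts, train_lbls, points[i])
    correct += (pred == labels[i])

print(correct / 4)      # 0.0 -- every held-out prediction is wrong
print(1 - correct / 4)  # 1.0 -- so the *negated* classifier is perfect
```

Every held-out point sits closer to the opposite class's centroid, so the proximity-based classifier is wrong on all four folds, and flipping its predictions yields a perfect classifier.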

4 Comments to “Antilearning: When proximity goes bad”
  1. Aleks says:

    There are a few other reasons why this happens:

    * Assume the XOR problem with four instances. If you perform 4-fold cross-validation and employ the linear SVM classifier, you’re guaranteed to always misclassify.

    * Assume that you do not perform stratified cross-validation, and that your learning algorithm ignores all the attributes except for the binary label. If there are 50% white and 50% black instances in your data, and you do a training/test split where the training set has more than 50% white instances, the test set is guaranteed to contain fewer than 50% white instances.
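    The second point is just arithmetic, sketched here with a hypothetical 10-example balanced dataset and an attribute-ignoring majority-class learner:

    ```python
    # 10 examples, labels perfectly balanced: 5 "white" (0) and 5 "black" (1).
    labels = [0] * 5 + [1] * 5

    # A non-stratified split that puts 3 of the 5 zeros into training...
    train = [0, 0, 0, 1, 1]   # 60% class 0
    # ...forces the test set below 50% zeros.
    test = [0, 0, 1, 1, 1]    # 40% class 0

    # A learner that ignores the attributes can do no better than
    # predicting the training majority class.
    majority = max(set(train), key=train.count)
    accuracy = sum(y == majority for y in test) / len(test)
    print(majority, accuracy)  # 0 0.4 -- worse than chance on the test set
    ```

    Any excess of one class in training is mirrored by a deficit in test, so the majority-class baseline is systematically below 50% on the test set.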

  2. Anonymous says:

    Actually this is not quite right. I am pretty sure that AMD and Intel are positively correlated. Market conditions seem to be more important than effects of competition in this case.

  3. Gilad Tsur says:

    The “proximity intuition”, as John calls it, underlies much of our thinking. Consider algorithms like hill climbing and simulated annealing. As humans, we just tend to assume, until proven otherwise, that “everything is smooth”.

  4. […] Last week I saw an interesting PhD monitoring presentation by Justin Bedo on the counter-intuitive phenomenon of “anti-learning”. For certain datasets, learning a classifier from a small number of samples and inverting its predictions performs much better than the original classifier. Most of the theoretical results Justin mentioned are discussed in a recent paper and video lecture by Adam Kowalczyk. These build on earlier work presented at ALT 2005. As John notes in his blog post from a couple of years ago, the strangeness of anti-learning is due to our assumption that proximity implies similarity. […]

