Machine Learning (Theory)

2/27/2008

The Stats Handicap

Graduating students in Statistics appear to be at a substantial handicap compared to graduating students in Machine Learning, despite being in substantially overlapping subjects.

The problem seems to be cultural. Statistics comes from a mathematics background, which emphasizes long papers published slowly through journal review. Machine Learning comes from a Computer Science background, which emphasizes quick publication at reviewed conferences. This has a number of implications:

  1. Graduating statistics PhDs often have 0-2 publications while graduating machine learning PhDs might have 5-15.
  2. Graduating ML students have had a chance for others to build on their work. Stats students have had no such chance.
  3. Graduating ML students have attended a number of conferences and presented their work, giving them a chance to meet people. Stats students have had fewer chances of this sort.

In short, Stats students have had relatively few chances to distinguish themselves and are heavily reliant on their advisors for jobs afterwards. This is a poor situation, because advisors have a strong incentive to place students well, implying that recommendation letters must always be taken with a grain of salt.

This problem is more or less prevalent depending on which Stats department students go to. In some places the difference is substantial, and in other places not.

One practical implication of this is that when considering graduating stats PhDs for hire, some amount of affirmative action is in order. At a minimum, this implies spending extra time getting to know the candidate and what the candidate can do.

9 Comments to “The Stats Handicap”
  1. asarwate says:

    Isn’t this also a bit subfield dependent? At Berkeley we have a number of cross-listed faculty in CS and Statistics, and I think their students do a bit differently. But if you are in Stat and studying something like random walks on non-Abelian groups or something more heavily in probability theory you might have a harder time making a case for a CS job.

    That aside, I think you’re absolutely right about having to take background and academic culture into account — length of CV is not everything.

  2. jl says:

    I would rate Berkeley in the “in other places not” category, as you say.

  3. anon says:

    This raises another question. For the field, which model would lead to better research in the long run: short, frequent conference papers or long, full journal papers?

  4. Anonymous says:

    I think it is also hard for people coming from a mathematical background to adjust to the significantly smaller “minimum publishable unit” prevalent in ML and CS.

  5. Stats and machine learning are more similar to each other than the empirical sciences in publishing. Lab biologists and experimental physicists can spend months or even years running experiments that typically lead to papers with many co-authors (dozens in some cases) if they lead to anything at all.

    For the other extreme, look to speech recognition (often considered a branch of EE), where conferences accept multiple submissions, often review on the basis of 500-word ASCII abstracts, and may have acceptance rates well above 50% (e.g. 70% for some Eurospeechs and ICSLPs, and 55% for ICASSP). I’ve seen folks with well over a dozen papers at a single conference (see, e.g., Alex Waibel’s group’s publications).

    As a result, people are now listing conference acceptance rates on their CVs with indications of who did how much of the work.

  6. Theodore V. says:

    Perhaps it’s because of the age of the fields… publishing a new and relevant paper in the Math/Stats world takes a lot of work and toiling on obscure and deep topics. ML and AI, however, were only theoretical until computers became affordable enough that the mass of researchers could start using them.

    ML=wide open for discovery, math/stats=chock full of previous work done by really brilliant people.

  7. karthik Jayasurya says:

    I agree with Theodore. Statistics is a beautifully woven, mathematics-based subject. I personally believe statistics is for guys who are math-crazy and love to turn coffee into intelligent theorems backed by solid proofs. Because many pioneering mathematicians and statisticians have already laid so much groundwork, significant work has to be done in order to propose a novel methodology.

    On the other hand, machine learning is recent, its definition is rapidly evolving, and it is heavily algorithmic. It is not always necessary to write down all the distributions on a piece of paper. Some ML papers are mostly about solving computational issues for an already solved problem, and most papers don't even have a solid proof of why and when exactly the method works. I think ML is the easier route in that respect, hence the large and growing number of publications. My point is that even if a student has only a couple of statistics journal publications, they carry enough weight to show the amount of effort put into them; I would still be happy with a couple.

  8. Part of the issue might be that most of the jobs are in the computer/technology industry, so most of the people doing the hiring have a CS background and, consciously or not, prefer (or better understand) candidates with a similar background. Probably this leads to the same practical implication: “some amount of affirmative action is in order”.

  9. [...] CS and Stats. There’s an interesting John Langford post on part of the issue, which he calls “The Stats Handicap”. He points out that stats Ph.D.’s have a big disadvantage in the job market because [...]
