The Stats Handicap

Graduating students in Statistics appear to be at a substantial handicap compared to graduating students in Machine Learning, despite being in substantially overlapping subjects.

The problem seems to be cultural. Statistics comes from a mathematics background, which emphasizes large publications that appear slowly after review at journals. Machine Learning comes from a Computer Science background, which emphasizes quick publication at reviewed conferences. This has a number of implications:

  1. Graduating statistics PhDs often have 0-2 publications while graduating machine learning PhDs might have 5-15.
  2. Graduating ML students have had a chance for others to build on their work. Stats students have had no such chance.
  3. Graduating ML students have attended a number of conferences and presented their work, giving them a chance to meet people. Stats students have had fewer chances of this sort.

In short, Stats students have had relatively few chances to distinguish themselves and are heavily reliant on their advisors for jobs afterwards. This is a poor situation, because advisors have a strong incentive to place students well, implying that recommendation letters must always be considered with a grain of salt.

This problem is more or less prevalent depending on which Stats department students go to. In some places the difference is substantial, and in other places not.

One practical implication of this is that when considering graduating stats PhDs for hire, some amount of affirmative action is in order. At a minimum, this implies spending extra time getting to know the candidate and what the candidate can do.

Research Political Issues

I’ve avoided discussing politics here, although not for lack of interest. The problem with discussing politics is that it’s customary for people to say much based upon little information. Nevertheless, politics can have a substantial impact on science (and we might hope for the reverse as well). It’s primary election time in the United States, so the topic is timely, although the issues are not.

There are several policy decisions which substantially affect the development of science and technology in the US.

  1. Education. The US has great contrasts in education. The top universities are very good places, yet the grade school education system produces mediocre results. For me, the contrast between a public education and Caltech was bracing. For many others attending Caltech, it clearly was not. Upgrading the K-12 education system in the US is a long-standing chronic problem which I know relatively little about. My own experience is that a basic attitude of “no child unrealized” is better than “no child left behind”. A fair claim can also be made that the US just doesn’t invest enough.
  2. Respect. Lack of respect for science and technology is routinely expressed in many ways in the US.
    1. The most bald form of lack of respect is scientific censorship. This may be easily understood as a generality: you choose to spend a large fraction of your life learning to interpret some part of the world. After years, you come to some conclusion about the nature of the world. Then, someone with no particular experience or expertise tells you to alter it.
    2. A more refined form of lack of respect is simply lack of presence in decision making. This isn’t necessarily intentional: many people simply make decisions from the gut, and then come up with reasons to justify their decision. This style explicitly cuts out the deep thinking of science. Many policies could have been better informed by a serious consideration of even basic science:
      1. The oil of Iraq is fundamentally less valuable if we are going to tackle global warming.
      2. Swapping gasoline for a hydrogen-based transportable energy source is dubious because it introduces another energy storage conversion on which to lose energy. The same goes for swapping bioethanol for gasoline. In contrast, hybrid and electric vehicles actually recover substantial energy from regenerative braking, and a plug-in hybrid could run off electricity in typical commuter usage.
      3. The Space Shuttle is a boondoggle design. The rocket equation implies that the ratio of initial to final mass for vehicles reaching earth orbit must be at least a factor of e^2.5 (it’s actually e^2.93 for the Space Shuttle). Making the system reusable implies that most of this mass returns to earth so the payload deliverable into space is only 1.2% of the liftoff mass. A better designed system might deliver payloads a factor of 4 larger or be much smaller.
      4. Passenger inspection at airports is another poor policy from the perspective of science. It isn’t effective, and there is no cost-efficient way to make it effective against a motivated opponent. Solid evidence for this is the continued use of mules to smuggle drugs. The basic problem from a chemistry point of view is that too much can be done with a small amount of mass. Deterrence and limitation (armored cockpits and active resistance for example) are fine policies.
    3. Lack of support. The simplest form of lack of respect is simply lack of support. The case for federal vs corporate funding of basic science and technology development is very simple: the benefit to society of conducting such work dramatically exceeds the benefit any one agent within society (such as a company) could gain from it. Of late, investment in core science has been an anemic 0.0005 of GDP, and visa issues hamstring broader technology development.
  3. Confidence. This is primarily related to the technology side of science and technology. Many policy decisions are made without confidence in the ability of technologists to adapt. This comes in at least two flavors.
    1. The foreordained solution. Policy often comes in the form “we use approach X to solve problem Y” (some examples are above). This demonstrates an overconfidence by policy makers in their ability to pick the winner, and a lack of confidence in the ability of technologists to solve problems. It also represents an opportunity for large established industries to get huge payoffs at taxpayer expense. The X-prize represents the opposite of this approach, and it has been radically more effective by any reasonable standard.
    2. Confusion about the meaning of wealth. Some people believe that wealth is about what you have. However, for a society it seems much better to measure wealth in terms of what the society can do. Policy makers often forget that science and technology is a capability when it comes time to think of a solution. For example, someone with no confidence in the ability to create and make affordable plug-in electric hybrids might think conquest for oil is necessary.
  4. Stability. People can’t program, do science, or invent new things when they are worried about more immediate events. There are several destabilizing trends going on in the US right now which either now or in the future may make it hard to focus away from immediate concerns.
    1. Debt and money supply. The federal debt for the US government is about 3.5 times the federal budget. This is bad for the simple reason that investors buying US treasury bonds aren’t investing in new technology. However, the destabilizing concern is more subtle. Since World War II, the US dollar has become the standard currency for exchange around the world. Since government debt creates a temptation for the government to (effectively) print money, the number of dollars in circulation has been rapidly growing. But a growing number of dollars means that the currency is devaluing, which makes owning dollars undesirable. I don’t know an example of a previous world currency that has ceased to be such, but basic economics says that bad things happen to dollar-based savings if all the dollars flow back into the US. So far, the decline of the dollar has been relatively gradual, but a very disruptive cliff might exist out there somewhere. Policies which increase debt (like cutting taxes and increasing spending) exacerbate this problem. There is no fix once the dollar loses world currency status because confidence can be lost quickly, but not regained.
    2. Health Care. The US is running an experiment to determine how large a fraction of GDP can be devoted to health care. Currently it’s over 15%, in first place, and growing. This is even worse than it sounds, because many comparable countries in Europe (or Japan) have older populations which should generally be more expensive to take care of. In the present situation, because health care is incredibly expensive, losing health insurance (which is typically tied to a job) is potentially catastrophic for any individual.
    3. Wealth Asymmetry. The US has shifted towards a substantially more asymmetric division of wealth since the 1970s. An asymmetric division of wealth is not fundamentally bad: there needs to be room for great success to imply great rewards. However, a casual correlation of science and technology development with the gini coefficient map reveals that a large gini coefficient and substantial science and technology development do not coincide. The problem is that wealth becomes inheritable, and it’s very unlikely that the wealth is inherited by someone interested in science and technology. Wealth is now scheduled to become perfectly inheritable in 2010 in the US.
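
A rough numerical check of the rocket-equation figures in the Space Shuttle item above may be useful. The sketch below assumes the quoted exponents (2.5 and 2.93) stand for delta-v divided by effective exhaust velocity, which is the usual reading of the rocket equation m0/mf = e^(delta_v/v_e). It treats the vehicle as a single idealized stage and ignores staging. The function and variable names are illustrative and do not come from the original post.

```python
import math

# Exponents and payload fraction as quoted in the post; their reading as
# delta_v / v_e is an assumption of this sketch.
MINIMUM_EXPONENT = 2.5
SHUTTLE_EXPONENT = 2.93
SHUTTLE_PAYLOAD_FRACTION = 0.012  # payload as a fraction of liftoff mass


def mass_ratio(exponent):
    """Initial-to-final mass ratio m0/mf = e^exponent (rocket equation)."""
    return math.exp(exponent)


def final_mass_fraction(exponent):
    """Fraction of liftoff mass that reaches orbit, mf/m0."""
    return 1.0 / mass_ratio(exponent)


def payload_share_of_orbit_mass(payload_fraction, exponent):
    """Payload as a share of the mass that reaches orbit."""
    return payload_fraction / final_mass_fraction(exponent)


for label, exponent in [("minimum", MINIMUM_EXPONENT), ("Shuttle", SHUTTLE_EXPONENT)]:
    print(f"{label}: m0/mf = {mass_ratio(exponent):.2f}, "
          f"mass reaching orbit = {final_mass_fraction(exponent):.1%}")

share = payload_share_of_orbit_mass(SHUTTLE_PAYLOAD_FRACTION, SHUTTLE_EXPONENT)
print(f"Shuttle payload as a share of orbiting mass: {share:.1%}")
```

Under these assumptions the mass ratios come out near 12.2 and 18.7, so about 5.3% of the Shuttle's liftoff mass reaches orbit and the quoted 1.2% payload is a bit under a quarter of that.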

I’m sure some of these issues are endemic to many other parts of the world as well, because there are fundamental conceptual difficulties with investing in the unknown instead of the known.

Machine Learning Jobs are Growing on Trees

The consensus of several discussions at ICML is that the number of jobs for people knowing machine learning well substantially exceeds supply. This is my experience as well. Demand comes from many places, but I’ve seen particularly strong demand from trading companies and internet startups.

Like all interest bursts, this one will probably pass because of economic recession or other distractions. Nevertheless, the general outlook for machine learning in business seems to be good. Machine learning is all about optimization when there is uncertainty and lots of data. The quantity of data available is growing quickly as computer-run processes and sensors become more common, and the quality of the data is dropping since there is little editorial control in its collection. Machine Learning is a difficult subject to master (*), so those who do should remain in demand over the long term.

(*) In fact, it would be reasonable to claim that no one has mastered it—there are just some people who know a bit more than others.

The Coming Patent Apocalypse

Many people in computer science believe that patents are problematic. The truth is even worse—the patent system in the US is fundamentally broken in ways that will require much more significant reform than is being considered now.

The myth of the patent is the following: Patents are a mechanism for inventors to be compensated according to the value of their inventions while making the invention available to all. This myth sounds pretty desirable, but the reality is a strange distortion slowly leading towards collapse.

There are many problems associated with patents, but I would like to focus on just two of them:

  1. Patent Trolls. The way that patents have generally worked over the last several decades is that they were a tool of large companies. Large companies would amass a large number of patents and then cross-license each other’s patents, in effect saying “we agree to owe each other nothing”. Smaller companies would sometimes lose in this game, essentially because they didn’t have enough patents to convince the larger companies that cross-licensing was a good idea. However, they didn’t necessarily lose, because small companies also do fewer things, which makes their patent violation profile smaller.

    The patent trolls arrived recently. They are a new breed of company which does nothing but produce patents and collect money from them. The thing which distinguishes patent troll companies is that they have no patent violation profile. A company with a large number of patents cannot credibly threaten to sue a troll unless it agrees to cross-license, because the troll doesn’t do anything which violates a patent.

    The best example (and proof that this method works) is NTP, which extracted $612.5M from RIM. Although this is the big case with lots of publicity, the process of extracting money goes on constantly all around you in backroom negotiations with the companies that actually do things. In effect, patent trolls impose an invisible tax on companies that do things, collected by companies that don’t. Restated another way, patent trolls are akin to people exploiting tax loopholes, except that they exploit the law to make money rather than simply to avoid losing it. Smaller companies are particularly prone to lose, because they simply cannot afford the extreme legal fees associated with fighting even a winning battle, but even large companies are vulnerable to a patent troll.

    The other side of this argument is that patent trolls are simply performing a useful business function: employing researchers to come up with ideas or (at least) putting a floor on the value of ideas which they buy up through patents. Unfortunately, this is simply not true in my experience, due to the next problem.

  2. Combinatorial Ideas. Patents are too easy. In fact, the process of coming up with patentable ideas is simply a matter of combinatorial application of existing ideas. This is a simple game that any reasonably intelligent person can play: you take idea 1 and idea 2, glue them together in any reasonable way, and get a patent.

    There are several reasons why the combinatorial application of existing ideas has become standard for patents.

    1. One of these is regulatory capture. It should surprise no one that the patent office, which gets paid for every patent application, has found a way to increase the number of patent applications.
    2. Another reason has to do with the way that patent law developed. Initially, patents were for processes doing things, rather than ideas. The scope of patents has steadily extended over time but the basic idea of patenting a process has been preserved. The fundamental problem is that processes can be created by the combinatorial application of ideas.

    The ease of patents is fundamentally valuable to patent troll type companies because they can acquire a large number of patents on processes which other companies accidentally violate.

The patent apocalypse happens when we project forward these two trends. Patents become ever easier to acquire and patent troll companies become ever more prevalent. In the end, every company which does something uses some obvious process that violates someone’s patent, and they have to pay at rates the patent owner chooses. There is no inherent bound on the number of patent troll type companies which can exist—they can multiply unchecked and drain money from every other company which does things until the system collapses.

I would like to make some positive suggestions here about how to reform the patent system, but it’s a hard mechanism design problem. Some obvious points are:

  1. Patents should have higher standards than research papers rather than substantially lower standards.
  2. The patent office should not make money from patents (this is not equivalent to saying that patent applicants should not be charged).
  3. The decision on whether or not to grant a patent should weigh the cost of granting. Many private property aficionados think “patents are great, because they compensate inventors”, but there is a real cost to society in granting obvious patents. You block people from doing things in the straightforward way or create hidden liabilities.
  4. Patents should be far more readable if the “available for all” part of the myth is to be upheld. Published academic papers are substantially more readable (which isn’t all that high of a bar).

Patent troll companies have found a clever way to exhibit the flaws in the current patent system. Substantial patent reform to eliminate this style of company would benefit just about everyone, except for these companies.