Machine Learning (Theory)


Graduates and Postdocs

Several strong graduates are on the job market this year.

  • Alekh Agarwal made the most scalable public learning algorithm as an intern two years ago. He has a deep and broad understanding of optimization and learning as well as the ability and will to make things happen programming-wise. I’ve been privileged to have Alekh visiting me in NY where he will be sorely missed.
  • John Duchi created Adagrad which is a commonly helpful improvement over online gradient descent that is seeing wide adoption, including in Vowpal Wabbit. He has a similarly deep and broad understanding of optimization and learning with significant industry experience at Google. Alekh and John have often coauthored together.
  • Stephane Ross visited me a year ago over the summer, implementing many new algorithms and working out the first scale free online update rule which is now the default in Vowpal Wabbit. Stephane is not on the market—Google robbed the cradle successfully :-) I’m sure that he will do great things.
  • Anna Choromanska visited me this summer, where we worked on extreme multiclass classification. She is very good at focusing on a problem and grinding it into submission both in theory and in practice—I can see why she wins awards for her work. Anna’s future in research is quite promising.

I also wanted to mention some postdoc openings in machine learning.


Simons Institute Big Data Program

Tags: Announcements,Funding,Workshop jl@ 8:17 am

Michael Jordan sends the below:

The new Simons Institute for the Theory of Computing
will begin organizing semester-long programs starting in 2013.

One of our first programs, set for Fall 2013, will be on the “Theoretical Foundations
of Big Data Analysis”. The organizers of this program are Michael Jordan (chair),
Stephen Boyd, Peter Buehlmann, Ravi Kannan, Michael Mahoney, and Muthu Muthukrishnan.

See for more information on
the program.

The Simons Institute has created a number of “Research Fellowships” for young
researchers (within at most six years of the award of their PhD) who wish to
participate in Institute programs, including the Big Data program. Individuals
who already hold postdoctoral positions or who are junior faculty are welcome
to apply, as are finishing PhDs.

Please note that the application deadline is January 15, 2013. Further details
are available at .

Mike Jordan


The Submodularity workshop and Lucca Professorship

Nina points out the Submodularity Workshop March 19-20 next week at Georgia Tech. Many people want to make Submodularity the new Convexity in machine learning, and it certainly seems worth exploring.

Sara Olson also points out a tenured faculty position at IMT Lucca with a deadline of May 15th. Lucca happens to be the ancestral home of 1/4 of my heritage :-)


Key Scientific Challenges and the Franklin Symposium

For graduate students, the Yahoo! Key Scientific Challenges program including in machine learning is on again, due March 9. The application is easy and the $5K award is high quality “no strings attached” funding. Consider submitting.

Those in Washington DC, Philadelphia, and New York, may consider attending the Franklin Institute Symposium April 25 which has several speakers and an award for V. Attendance is free with an RSVP.


Giving Thanks

Tags: Funding,Research jl@ 7:40 pm

Thanksgiving is perhaps my favorite holiday, because pausing your life and giving thanks provides a needed moment of perspective.

As a researcher, I am most thankful for my education, without which I could not function. I want to share this, because it provides some sense of how a researcher starts.

  1. My long term memory seems to function particularly well, which makes any education I get is particularly useful.
  2. I am naturally obsessive, which makes me chase down details until I fully understand things. Natural obsessiveness can go wrong, of course, but it’s a great ally when you absolutely must get things right.
  3. My childhood was all in one hometown, which was a conscious sacrifice on the part of my father, implying disruptions from moving around were eliminated. I’m not sure how important this was since travel has it’s own benefits, but it bears thought.
  4. I had several great teachers in grade school, and naturally gravitated towards teachers over classmates, as they seemed more interesting. I particularly remember Mr. Cox, who read Watership Down 10 minutes a day. The frustration of not getting to the ending drove me into reading books on my own, including just about every science fiction book in Lebanon Oregon.
  5. I spent a few summers picking strawberries and blueberries. It’s great motivation to not do that sort of thing for a living.
  6. Lebanon school district was willing to bend the rules for me, so I could skip unnecessary math classes. I ended up a year advanced, taking math from our local community college during senior year in high school.
  7. College applications was a very nervous time, because high quality colleges cost much more than we could reasonably expect to pay. I was very lucky to get into Caltech here. Caltech should not be thought of as a university—instead, it’s a research lab which happens to have a few undergraduate students running around. I understand from Preston that the operating budget is about 4% tuition these days. This showed in the financial aid package, where they basically let me attend for the cost of room&board. Between a few scholarships and plentiful summer research opportunities, I managed to graduate debt free. Caltech was also an exceptional place to study, because rules like “no taking two classes at the same time” were never enforced them. The only limits on what you could learn were your own.
  8. Graduate school was another big step. Here, I think Avrim must have picked out my application to Carnegie Mellon, which was a good fit for me. At the time, I knew I wanted to do research in some sort of ML/AI subject area, but not really what, so the breadth of possibilities at CMU was excellent. In graduate school, your advisor is much more important, and between Avrim and Sebastian, I learned quite a bit. The funding which made this all work out was mostly hidden from me at CMU, but there was surely a strong dependence on NSF and DARPA. Tom Siebel also directly covered my final year as as a Siebel Scholar.
  9. Figuring out what to do next was again a nervous time, but it did work out, first in a summer postdoc with Michael Kearns, then at IBM research as a Herman Goldstine Fellow, then at TTI-Chicago, and now at Yahoo! Research for the last 5 years.

My life is just one anecdote, from which it’s easy to be misled. But trying to abstract the details, it seems like the critical elements for success are a good memory, an interest in getting the details right, motivation, and huge amounts of time to learn and then to do research. Given that many of the steps in this process winnow out large fractions of people, some amount of determination and sheer luck is involved. Does the right person manage to see you as a good possibility?

But mostly I’d like to give thanks for the “huge amounts of time” which in practical terms translates into access to other smart people and funding. In education and research funding is something like oxygen—you really miss it when it’s not there, so Thanksgiving is a good time to remember it.


2011 ML symposium and the bears

The New York ML symposium was last Friday. Attendance was 268, significantly larger than last year. My impression was that the event mostly still fit the space, although it was crowded. If anyone has suggestions for next year, speak up.

The best student paper award went to Sergiu Goschin for a cool video of how his system learned to play video games (I can’t find the paper online yet). Choosing amongst the submitted talks was pretty difficult this year, as there were many similarly good ones.

By coincidence all the invited talks were (at least potentially) about faster learning algorithms. Stephen Boyd talked about ADMM. Leon Bottou spoke on single pass online learning via averaged SGD. Yoav Freund talked about parameter-free hedging. In Yoav’s case the talk was mostly about a better theoretical learning algorithm, but it has the potential to unlock an exponential computational complexity improvement via oraclization of experts algorithms… but some serious thought needs to go in this direction.

Unrelated, I found quite a bit of truth in Paul’s talking bears and Xtranormal always adds a dash of funny. My impression is that the ML job market has only become hotter since 4 years ago. Anyone who is well trained can find work, with the key limiting factor being “well trained”. In this environment, efforts to make ML more automatic and more easily applied are greatly appreciated. And yes, Yahoo! is still hiring too :)


Herman Goldstine 2011

Vikas points out the Herman Goldstine Fellowship at IBM. I was a Herman Goldstine Fellow, and benefited from the experience a great deal—that’s where work on learning reductions started. If you can do research independently, it’s recommended. Applications are due January 6.


CI Fellows program renewed

Tags: CS,Funding jl@ 4:01 pm

Lev Reyzin points out the CI Fellows program is renewed. CI Fellows are essentially NSF funded computer science postdocs for universities and industry research labs. I’ve been lucky and happy to have Lev visit me for a year under last year’s program, so I strongly recommend participating if it suits you.

As with last year, the application timeline is very short, with everything due by May 23.


COLT Treasurer is now Phil Long

Tags: Conferences,Funding jl@ 2:14 pm

For about 5 years, I’ve been the treasurer of the Association for Computational Learning, otherwise known as COLT, taking over from John Case before me. A transfer of duties to Phil Long is now about complete. This probably matters to almost no one, but I wanted to describe things a bit for those interested.

The immediate impetus for this decision was unhappiness over reviewing decisions at COLT 2009, one as an author and several as a member of the program committee. I seem to have disagreements fairly often about what is important work, partly because I’m focused on learning theory with practical implications, partly because I define learning theory more broadly than is typical amongst COLT members, and partly because COLT suffers a bit from insider-clique issues. The degree to which these issues come up varies substantially each year so last year is not predictive of this one. And, it’s important to understand that COLT remains healthy with these issues not nearly so bad as they were. Nevertheless, I would like to see them taken more actively into account than I’ve been able to persuade people so far.

After thinking about it for a few days before acting, I decided to go ahead with the transfer for another reason: I’ve been suffering from multitask poisoning. Partly this is Ada, but partly it’s many other things, each of which takes a small bit of my time, in aggregate leaving me disappointing people, myself in particular. The effect of this has been quite obvious in terms of the posting rate on

Fortunately, Phil Long was ready to take up the duties, and he’s well positioned to do so.

Despite the above, I found being treasurer not particularly difficult. The functions of the treasury part of ACL have been

  1. Self-insurance for the conference each year. Prior to the formation of ACL-the-nonprofit (which Bob was instrumental in), COLT used to buy insurance against the possibility that some disaster would strike canceling the conference while leaving the local organizer on the hook for substantial expenses. When I came in, the treasury was a little bit low for this function, and when I left, somewhat too high.
  2. Budget fragmentation avoidance. Local organizers typically have a local account from which they spend for expenses and collect registration fees. Without the ACL, dealing with net positive or negative local accounts from year to year was awkward. With the ACL, it’s easy to square things up at the end of each year.
  3. A stable point of contact for funding related things. COLT is partly sponsored by several big CS-related companies including IBM, Microsoft, and Google. Providing a stable point of contact definitely helps ease this process. This also helps on the publishing side, where Omnipress is the current publisher of proceedings.
  4. Budget advice for local organizers. Somewhat to my surprise, the proper role of the treasurer was typically asking the local organizer to reduce registration fees rather than increase. The essential observation is that local organizers, because they operate out of a local account, tend to be a bit conservative in budget estimates. On the other hand, because ACL has an adequate interest bearing account, we should expect and desire to spend the interest in each typical year. In effect, ACL is naturally in a position to sponsor COLT to a small but nontrivial degree.

After having been treasurer for a little while, I’m convinced that having a nonprofit to back a conference is a good idea easing many difficulties with relatively small effort.


Yahoo! ML events

Yahoo! is sponsoring two machine learning events that might interest people.

  1. The Key Scientific Challenges program (due March 5) for Machine Learning and Statistics offers $5K (plus bonuses) for graduate students working on a core problem of interest to Y! If you are already working on one of these problems, there is no reason not to submit, and if you aren’t you might want to think about it for next year, as I am confident they all press the boundary of the possible in Machine Learning. There are 7 days left.
  2. The Learning to Rank challenge (due May 31) offers an $8K first prize for the best ranking algorithm on a real (and really used) dataset for search ranking, with presentations at an ICML workshop. Unlike the Netflix competition, there are prizes for 2nd, 3rd, and 4th place, perhaps avoiding the heartbreak the ensemble encountered. If you think you know how to rank, you should give it a try, and we might all learn something. There are 3 months left.


CI Fellows

Tags: CS,Funding,Machine Learning jl@ 3:37 pm

Lev Reyzin points out the CI Fellows Project. Essentially, NSF is funding 60 postdocs in computer science for graduates from a wide array of US places to a wide array of US places. This is particularly welcome given a tough year for new hires. I expect some fraction of these postdocs will be in ML. The time frame is quite short, so those interested should look it over immediately.


Effective Research Funding

Tags: Funding,Research jl@ 10:17 am

With a worldwide recession on, my impression is that the carnage in research has not been as severe as might be feared, at least in the United States. I know of two notable negative impacts:

  1. It’s quite difficult to get a job this year, as many companies and universities simply aren’t hiring. This is particularly tough on graduating students.
  2. Perhaps 10% of IBM research was fired.

In contrast, around the time of the dot com bust, ATnT Research and Lucent had one or several 50% size firings wiping out much of the remainder of Bell Labs, triggering a notable diaspora for the respected machine learning group there. As the recession progresses, we may easily see more firings as companies in particular reach a point where they can no longer support research.

There are a couple positives to the recession as well.

  1. Both the implosion of Wall Street (which siphoned off smart people) and the general difficulty of getting a job coming out of an undergraduate education suggest that the quality of admitted phd students may increase. In half a decade when they start graduating, we might have some new and very creative ideas.
  2. The latest stimulus bill includes substantial additional research funding. This is particularly welcome news for those at universities, because it will compensate for other cutbacks which may be necessary there as endowments or state funding fall. It’s also particularly good for young researchers at universities who just got a position or succeed this year, as the derivative on research funding particularly impacts them.

There are two effects going on: Does a recession cause us to refocus on other possibly better ideas? Or does it cause us to focus on short term survival? The first effect helps research while the second effect does not. By far, most of the money invested by governments to fight the recession has gone towards survival, but a small fraction in the US is going towards other possibly better ideas, with a portion of that going towards research.

We could hope for a larger fraction of money heading towards new ideas, rather than rescuing old, but there is a basic issue: the apparatus for creation and use of new ideas in the US is simply too small—it may not be able to effectively use more funding. In order to justify further funding for research, we may need to be more creative than simply “give us more”.

However, this is easy. Throughout much of the 1900s, Bell Labs created many inventions which are fundamental to modern society, including the transistor, C(++), Unix, the laser, information theory, etc… In my view the vital ingredients for success are:

  1. Access to cutting edge problems. Even extraordinarily intelligent researchers can simply end up working on the wrong problem. Without direct access to and knowledge of such problems researchers can end up inventing their own, which occasionally works out well, but more often does not.
  2. Free time. This is both obvious and yet a common failure mode. Researchers at universities have many more demands on their time, including teaching, fundraising, mentoring, and running a university. Similarly, researchers at corporations can be sucked into patching an existing system rather than thinking about the best way to really solve a problem.
  3. Concentration. Two researchers working together can often manage much more than one apart, as each can bring relevant expertise and viewpoints necessary to solve a problem. This remains true up to the point where communication becomes a substantial overhead, which in my experience is about 5, but which we might imagine technology helps improve.

Bell Labs managed to satisfy all three of these desiderata. Some research universities manage to achieve at least access and concentration to some extent, but hidden difficulties exist. For example, professors often don’t work with other professors, because they are both too busy with students and they must make a case for tenure based on work which is unambiguously their own. I’m not extremely familiar with existing national labs, but I believe they often fail at (a)—at least research at national labs have had relatively little impact on newer fields such as computer science.

So, my suggestion would be funding research in modes which satisfy all three desiderata. The natural and easy way to do this is by the government partially subsidizing basic research at those corporations which have decided to fund basic research. In computer science at least, this includes Microsoft, IBM, Yahoo, Google, and what’s left of Bell Labs at ATnT and Alcatel. While this is precisely the conclusion you might expect from someone doing research at one of these places, it’s also what you would expect of someone intensely interested in research who sought out the best environment for research. In economic terms, these companies have for reasons of their own decided to provide a public good. As long as we are interested, as a nation, or as a civilization, in subsidizing this public good, it is desirable to do this as efficiently as possible.

Some people might think that basic research done at a university is inherently more desirable than the same in industry. I don’t see any reason for this. For example, it seems that patentable research is about as likely to be patented at a university as elsewhere, and hence equally restricted for public use over the duration of a patent. Other people might think that basic research only really happens at universities or national labs, but that simply doesn’t agree with history.

Given this, it’s odd that the rules for NSF funding, which is the premier source of funding for basic science in the US, generally requires university participation on proposals. This restriction naturally makes it easier for researchers at universities to acquire grant money than researchers not at universities. I don’t understand why this restriction is desirable from the viewpoint of a government wanting to effectively subsidize research.


Key Scientific Challenges

Yahoo released the Key Scientific Challenges program. There is a Machine Learning list I worked on and a Statistics list which Deepak worked on.

I’m hoping this is taken quite seriously by graduate students. The primary value, is that it gave us a chance to sit down and publicly specify directions of research which would be valuable to make progress on. A good strategy for a beginning graduate student is to pick one of these directions, pursue it, and make substantial advances for a PhD. The directions are sufficiently general that I’m sure any serious advance has applications well beyond Yahoo.

A secondary point, (which I’m sure is primary for many :) ) is that there is money for graduate students here. It’s unrestricted, so you can use it for any reasonable travel, supplies, etc…


Adversarial Academia

One viewpoint on academia is that it is inherently adversarial: there are finite research dollars, positions, and students to work with, implying a zero-sum game between different participants. This is not a viewpoint that I want to promote, as I consider it flawed. However, I know several people believe strongly in this viewpoint, and I have found it to have substantial explanatory power.

For example:

  1. It explains why your paper was rejected based on poor logic. The reviewer wasn’t concerned with research quality, but rather with rejecting a competitor.
  2. It explains why professors rarely work together. The goal of a non-tenured professor (at least) is to get tenure, and a case for tenure comes from a portfolio of work that is undisputably yours.
  3. It explains why new research programs are not quickly adopted. Adopting a competitor’s program is impossible, if your career is based on the competitor being wrong.

Different academic groups subscribe to the adversarial viewpoint in different degrees. In my experience, NIPS is the worst. It is bad enough that the probability of a paper being accepted at NIPS is monotonically decreasing in it’s quality. This is more than just my personal experience over a number of years, as it’s corroborated by others who have told me the same. ICML (run by IMLS) used to have less of a problem, but since it has become more like NIPS over time, it has inherited this problem. COLT has not suffered from this problem as much in my experience, although it had other problems related to the focus being defined too narrowly. I do not have enough experience with UAI or KDD to comment there.

There are substantial flaws in the adversarial viewpoint.

  1. The adversarial viewpoint makes you stupid. When viewed adversarially, any idea has crippling disadvantages and no advantages. Contorting your viewpoint enough to make this true damages your ability to conduct research. In short, it promotes poor mental hygiene.
  2. Many activities become impossible. Doing research is in general extremely hard, so there are many instances where working with other people can allow you to do things which are otherwise impossible.
  3. The previous two disadvantages apply even more strongly for a community—good ideas are more likely to be missed, change comes slowly, and often with steps backward.
  4. At it’s most basic level, the assumption that research is zero-sum is flawed, because the process of research is not done in a closed system. If the rest of society at large discovers that research is valuable, then the budget increases.

Despite these disadvantages, there is a substantial advantage as well: you can materially protect and aid your career by rejecting papers, preventing grants, and generally discriminating against key people doing interesting but competitive work.

The adversarial viewpoint has a validity in proportion to the number of people subscribing to it. For those of us who would like to deemphasize the adversarial viewpoint, what’s unclear is: how?

One concrete thing is: use Arxiv. For a long time, physicists have adopted an Arxiv-first philosophy, which I’ve come to respect. Arxiv functions as a universal timestamp which decreases the power of an adversarial reviewer. Essentially, you avoid giving away the power to muddy the track of invention. I’m expecting to use Arxiv for essentially all my past-but-unpublished and future papers.

It is plausible that limiting the scope of bidding, as Andrew McCallum suggested at the last ICML, and as is effectively implemented at this ICML, will help. The system of review at journals might also help for the same reason. In my experience as an author, if an anonymous reviewer wants to kill a paper they usually succeed. Most area chairs or program chairs are more interested in avoiding conflict with the reviewer (who they picked and may consider a friend) than reading the paper to determine the illogic of the review (which is a difficult task that simply cannot be done for all papers). NIPS experimented with a reputation system for reviewers last year, but I’m unclear on how well it worked, as an author’s score for a review and a reviewer’s score for the paper may be deeply correlated, revealing little additional information.

Public discussion of research can help with this, because very poor logic simply doesn’t stand up under public scrutiny. While I hope to nudge people in this direction, it’s clear that most people aren’t yet comfortable with public discussion.


The Stats Handicap

Graduating students in Statistics appear to be at a substantial handicap compared to graduating students in Machine Learning, despite being in substantially overlapping subjects.

The problem seems to be cultural. Statistics comes from a mathematics background which emphasizes large publications slowly published under review at journals. Machine Learning comes from a Computer Science background which emphasizes quick publishing at reviewed conferences. This has a number of implications:

  1. Graduating statistics PhDs often have 0-2 publications while graduating machine learning PhDs might have 5-15.
  2. Graduating ML students have had a chance for others to build on their work. Stats students have had no such chance.
  3. Graduating ML students have attended a number of conferences and presented their work, giving them a chance to meet people. Stats students have had fewer chances of this sort.

In short, Stats students have had relatively few chances to distinguish themselves and are heavily reliant on their advisors for jobs afterwards. This is a poor situation, because advisors have a strong incentive to place students well, implying that recommendation letters must always be considered with a grain of salt.

This problem is more or less prevalent depending on which Stats department students go to. In some places the difference is substantial, and in other places not.

One practical implication of this, is that when considering graduating stats PhDs for hire, some amount of affirmative action is in order. At a minimum, this implies spending extra time getting to know the candidate and what the candidate can do is in order.


Research Political Issues

Tags: Funding,General,Research jl@ 5:02 pm

I’ve avoided discussing politics here, although not for lack of interest. The problem with discussing politics is that it’s customary for people to say much based upon little information. Nevertheless, politics can have a substantial impact on science (and we might hope for the vice-versa). It’s primary election time in the United States, so the topic is timely, although the issues are not.

There are several policy decisions which substantially effect development of science and technology in the US.

  1. Education The US has great contrasts in education. The top universities are very good places, yet the grade school education system produces mediocre results. For me, the contrast between a public education and Caltech was bracing. For many others attending Caltech, it clearly was not. Upgrading the k-12 education system in the US is a long-standing chronic problem which I know relatively little about. My own experience is that a basic attitude of “no child unrealized” is better than “no child left behind”. A fair claim can also be made that the US just doesn’t invest enough.
  2. Respect Lack of respect for science and technology is routinely expressed in many ways in the US.
    1. The most bald form of lack of respect is scientific censorship. This may be easily understood as a generality: you choose to spend a large fraction of your life learning to interpret some part of the world. After years, you come to some conclusion about the nature of the world. Then, someone with no particular experience or expertise tells you to alter it.
    2. A more refined form of lack of respect is simply lack of presence in decision making. This isn’t necessarily intentional: many people simply make decisions from the gut, and then come up with reasons to justify their decision. This style explicitly cuts out the deep thinking of science. Many policies could have been better informed by a serious consideration of even basic science:
      1. The oil of Iraq is fundamentally less valuable if we are going to tackle global warming.
      2. Swapping gasoline for hydrogen-based transportable energy source is dubious because it introduces another energy storage conversion to lose energy on. The same goes for swapping bioethanol for gasoline. In contrast, hybrid and electric vehicles actually recover substantial energy from regenerative braking, and a plug-in hybrid could run off electricity in typical commuter usage.
      3. The Space Shuttle is a boondoggle design. The rocket equation implies that the ratio of initial to final mass for vehicles reaching earth orbit must be at least a factor of e2.5 (it’s actually e2.93 for the Space Shuttle). Making the system reusable implies that most of this mass returns to earth so the payload deliverable into space is only 1.2% of the liftoff mass. A better designed system might deliver payloads a factor of 4 larger or be much smaller.
      4. Passenger Inspections at airports is another poor policy from the perspective of science. It isn’t effective, and there is no cost-efficient way to make it effective against a motivated opponent. Solid evidence for this is the continued use of mules to smuggle drugs. The basic problem from a chemistry point of view is that too much can be done with a small amount of mass. Deterrence and limitation (armored cockpits and active resistance for example) are fine policies.
    3. Lack of support. The simplest form of lack of respect is simply lack of support. The case for federal vs corporate funding of basic science and technology development is very simple: the benefit to society of conducting such work dramatically exceeds the benefit any one agent within society (such as a company) could gain from it. Of late, investment in core science has been an anemic 0.0005 GDP and visa issues hamstring broader technology development.
  3. Confidence This is primarily related to the technology side of science and technology. Many policy decisions are made without confidence in the ability of technologists to adapt. This comes in at least two flavors.
    1. The foreordained solution. Policy often comes in the form “we use approach X to solve problem Y” (some examples are above). This demonstrates an overconfidence by policy makers in there ability to pick the winner, and a lack of confidence in the ability of technologists to solve problems. It also represents an opportunity for large established industries to get huge payoffs at taxpayer expense. The X-prize represents the opposite of this approach, and it has been radically more effective by any reasonable standard.
    2. Confusion about the meaning of wealth. Some people believe that wealth is about what you have. However, for a society it seems much better to measure wealth in terms of what the society can do. Policy makers often forget that science and technology is a capability when it comes time to think of a solution. For example, someone with no confidence in the ability to create and make affordable plugin electric hybrids might think it necessary to conquest for oil.
  4. Stability People can’t program, do science, or invent new things when they are worried about more immediate events. There are several destabilizing trends going on in the US right now which either now or in the future may make it hard to focus away from immediate concerns.
    1. Debt and money supply. The federal debt for the US government is about 3.5 times the federal budget. This is bad for the simple reason that investors buying US treasury bonds aren’t investing in new technology. However, the destabilizing concern is more subtle. Since world war II, the US dollar has become the standard currency for exchange around the world. Since debt by the government creates a temptation by the government to (effectively) print money, the number of dollars in circulation has been rapidly growing. But, a growing number of dollars means that the currency is devaluing, which makes owning dollars undesirable. I don’t know an example of a previous world currency that has ceased to be such, but basic economics says that bad things happen to dollar-based savings if all the dollars flow back into the US. So far, the decline of the dollar has been relatively gradual, but a very disruptive cliff might exist out there somewhere. Policies which increase debt (like cutting taxes and increasing spending) exacerbate this problem. There is no fix once the dollar loses world currency status because confidence can be lost quickly, but not regained.
    2. Health Care. The US is running an experiment to determine how large a fraction of GDP can be devoted to health care. Currently it’s over 15%, in first place, and growing. This is even worse than it sounds, because many comparable countries in Europe (or Japan) have older populations which should generally be more expensive to take care of. In the present situation, because health care is incredibly expensive, losing health insurance (which is typically tied to a job) is potentially catastrophic for any individual.
    3. Wealth Asymmetry. The US has shifted towards a substantially more asymmetric division of wealth since the 1970s. An asymmetric division of wealth is not fundamentally bad—there needs to be room for great success to imply great rewards. However, a casual correlation of science and technology development with the gini coefficient map reveals that a large gini coefficient and substantial science and technology development do not coincide. The problem is that wealth becomes inheritable, and it’s very unlikely that the wealth is inherited by a someone interested in science and technology. Wealth is now scheduled to become perfectly inheritable in 2010 in the US.

I’m sure some of these issues are endemic to many other parts of the world as well, because there are fundamental conceptual difficulties with investing in the unknown instead of the known.


Machine Learning Jobs are Growing on Trees

Tags: Funding,Machine Learning jl@ 10:59 am

The consensus of several discussions at ICML is that the number of jobs for people knowing machine learning well substantially exceeds supply. This is my experience as well. Demand comes from many places, but I’ve seen particularly strong demand from trading companies and internet startups.

Like all interest bursts, this one will probably pass because of economic recession or other distractions. Nevertheless, the general outlook for machine learning in business seems to be good. Machine learning is all about optimization when there is uncertainty and lots of data. The quantity of data available is growing quickly as computer-run processes and sensors become more common, and the quality of the data is dropping since there is little editorial control in it’s collection. Machine Learning is a difficult subject to master (*), so those who do should remain in demand over the long term.

(*) In fact, it would be reasonable to claim that no one has mastered it—there are just some people who know a bit more than others.


The Coming Patent Apocalypse

Tags: Funding,Research jl@ 5:46 pm

Many people in computer science believe that patents are problematic. The truth is even worse—the patent system in the US is fundamentally broken in ways that will require much more significant reform than is being considered now.

The myth of the patent is the following: Patents are a mechanism for inventors to be compensated according to the value of their inventions while making the invention available to all. This myth sounds pretty desirable, but the reality is a strange distortion slowly leading towards collapse.

There are many problems associated with patents, but I would like to focus on just two of them:

  1. Patent Trolls The way that patents have generally worked over the last several decades is that they were a tool of large companies. Large companies would amass a large number of patents and then cross-license each other’s patents—in effect saying “we agree to owe each other nothing”. Smaller companies would sometimes lose in this game, essentially because they didn’t have enough patents to convince the larger companies that cross-licensing was a good idea. However, they didn’t necessarily lose, because small companies are also doing fewer things which makes their patent violation profile smaller.

    The patent trolls arrived recently. They are a new breed of company which does nothing but produce patents and collect money from them. The thing which distinguishes patent troll companies is that they have no patent violation profile. A company with a large number of patents can not credibly threaten to sue them unless they cross-license their patents, because they don’t do anything which violates a patent.

    The best example (and proof that this method works) is NTP, which extracted $612.5M from RIM. Although this is the big case with lots of publicity, the process of extracting money goes on constantly all around your in backroom negotiations with the companies that actually do things. In effect, patent trolls impose an invisible tax on companies that do things by companies that don’t. Restated in another way, patent trolls are akin to exploiting tax loopholes—except they exploit the law to make money rather than simply to avoid losing it. Smaller companies are particularly prone to lose, because they simply can not afford the extreme legal fees associated with fighting even a winning battle, but even large companies are also vulnerable to a patent troll.

    The other side of this argument is that patent trolls are simply performing a useful business function: employing researchers to come up with ideas or (at least) putting a floor on the value of ideas which they buy up through patents. Unfortunately, this is simply not true in my experience, due to the next problem.

  2. Combinatorial Ideas Patents are too easy. In fact, the process of coming up with patentable ideas is simply a matter of combinatorial application of existing ideas. This is a simple game that any reasonably intelligent person can play: you take idea 1 and idea 2, glue them together in any reasonable way, and get a patent.

    There are several reasons why the combinatorial application of existing ideas has become standard for patents.

    1. One of these is regulatory capture. It should surprise no one that the patent office, which gets paid for every patent application, has found a way to increase the number of patent applications.
    2. Another reason has to do with the way that patent law developed. Initially, patents were for processes doing things, rather than ideas. The scope of patents has steadily extended over time but the basic idea of patenting a process has been preserved. The fundamental problem is that processes can be created by the combinatorial application of ideas.

    The ease of patents is fundamentally valuable to patent troll type companies because they can acquire a large number of patents on processes which other companies accidentally violate.

The patent apocalypse happens when we project forward these two trends. Patents become ever easier to acquire and patent troll companies become ever more prevalent. In the end, every company which does something uses some obvious process that violates someone’s patent, and they have to pay at rates the patent owner chooses. There is no inherent bound on the number of patent troll type companies which can exist—they can multiply unchecked and drain money from every other company which does things until the system collapses.

I would like to make some positive suggestions here about how to reform the patent system, but it’s a hard mechanism design problem. Some obvious points are:

  1. Patents should have higher standards than research papers rather than substantially lower standards.
  2. The patent office should not make money from patents (this is not equivalent to saying that the patent applications should not be charged).
  3. The decision in whether or not to grant a patent should weigh the cost of granting. Many private property afficionados think “patents are great, because they compensate inventors”, but there is a real cost to society in granting obvious patents. You block people from doing things in the straightforward way or create hidden liabilities.
  4. Patents should be far more readable if the “available for all” part of the myth is to be upheld. Published academic papers are substantially more readable (which isn’t all that high of a bar).

Patent troll companies have found a clever way to exhibit the flaws in the current patent system. Substantial patent reform to eliminate this style of company would benefit just about everyone, except for these companies.


Luis von Ahn is awarded a MacArthur fellowship.

For his work on the subject of human computation including ESPGame, Peekaboom, and Phetch. The new MacArthur fellows.


Research Budget Changes

Tags: Funding,Research jl@ 2:44 am

The announcement of an increase in funding for basic research in the US is encouraging. There is some discussion of this at the Computing Research Policy blog.

One part of this discussion has a graph of NSF funding over time, presumably in dollar budgets. I don’t believe that dollar budgets are the right way to judge the impact of funding changes on researchers. A better way to judge seems to be in terms of dollar budget divided by GDP which provides a measure of the relative emphasis on research.

This graph was assembled by dividing the NSF budget by the US GDP. For 2005 GDP, I used the current estimate and for 2006 and 2007 assumed an increase by a factor of 1.04 per year. The 2007 number also uses the requested 2007 budget which is certain to change.

This graph makes it clear why researchers were upset: research funding emphasis has fallen for 3 years in a row. The reality has been significantly more severe due to DARPA decreasing funding and industrial research labs (ATnT and Lucent for example) laying off large numbers of researchers about when the governments emphasis on basic research started declining.

It is certainly encouraging to see the emphasis on science growing again.

Older Posts »

Powered by WordPress