Breaking Abstractions

Sam Roweis’s comment reminds me of a more general issue that comes up in doing research: abstractions always break.

  1. Real numbers aren’t. Most real numbers cannot be represented by any machine. One implication is that many real-number-based algorithms run into difficulties when implemented with floating point numbers. (See the sketch after this list.)
  2. The box on your desk is not a Turing machine. A Turing machine can compute anything computable, given sufficient time. A typical computer fails badly when the state required for the computation exceeds some limit. (Also illustrated in the sketch below.)
  3. Nash equilibria aren’t equilibria. This comes up when trying to predict human behavior from the result of an equilibrium computation. Often, it doesn’t work.
  4. The probability isn’t. Probability is an abstraction expressing either our lack of knowledge (the Bayesian viewpoint) or fundamental randomization (the frequentist viewpoint). From the frequentist viewpoint, the lack of knowledge typically precludes actually knowing the fundamental randomization. From the Bayesian viewpoint, precisely specifying our lack of knowledge is extremely difficult and typically not done.
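As a minimal Python illustration of the first two breaks (the constants are chosen purely for demonstration):

```python
import math

# Break 1: machine "reals" aren't real numbers.
print(0.1 + 0.2 == 0.3)  # False: none of these decimals is exactly representable

# Catastrophic cancellation: (1 - cos(x)) / x^2 -> 0.5 as x -> 0 analytically,
# but in double precision cos(1e-8) rounds to exactly 1.0, so the result is 0.0.
x = 1e-8
print((1.0 - math.cos(x)) / x**2)

# Break 2: the box on your desk is not a Turing machine. A recursion that is
# trivial in theory exhausts a bounded resource (here, the call stack).
def depth(n):
    return 0 if n == 0 else 1 + depth(n - 1)

try:
    depth(10**6)
except RecursionError as e:
    print("abstraction broke:", e)
```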

So, what should we do when we learn that our basic tools can break? The answer, of course, is to keep using them until something better comes along. However, the uncomfortable knowledge that our tools can break matters in a few ways:

  1. When considering a new abstraction, the existence of a break does not imply that it is a useless abstraction. (Just as the existence of the breaks above does not imply that they are useless abstractions.)
  2. When using an abstraction in some new way, we must generally consider “is this a reasonable use?”
  3. Some tools break more often than other tools. We should actively consider the “rate of breakage” when deciding amongst tools.

Fast Physics for Learning

While everyone is silently working on ICML submissions, I found this discussion about a fast physics simulator chip interesting from a learning viewpoint. In many cases, learning attempts to predict the outcome of physical processes. Access to a fast simulator for these processes might be quite helpful in predicting those outcomes. Bayesian learning in particular may benefit directly, while many other algorithms (like support vector machines) might see their speed greatly increased.
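As a hedged sketch of the Bayesian case: a fast simulator can plug directly into inference via rejection-style approximate Bayesian computation, where posterior samples are obtained by running the simulator many times. Here `simulate` is a hypothetical stand-in for the chip, and the prior and tolerance are invented for illustration:

```python
import random

def simulate(theta):
    """Hypothetical stand-in for a call to a fast physics simulator."""
    return theta + random.gauss(0.0, 1.0)

def rejection_abc(observed, prior_sample, n=100_000, tol=0.05):
    """Keep parameter draws whose simulated outcome lands near the observation.

    The accepted draws approximate the posterior; the many simulator calls
    are exactly where a several-orders-of-magnitude speedup would pay off.
    """
    return [theta for theta in (prior_sample() for _ in range(n))
            if abs(simulate(theta) - observed) < tol]

posterior = rejection_abc(observed=2.0,
                          prior_sample=lambda: random.uniform(-5.0, 5.0))
print(len(posterior), "accepted draws; posterior mean ~",
      sum(posterior) / len(posterior))
```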

The biggest drawback is that writing software for these odd architectures is always difficult and time consuming, but a several-orders-of-magnitude speedup might make that worthwhile.

Funding Research

The funding of research (and machine learning research) is an issue which seems to have become more significant in the United States over the last decade. The word “research” is applied broadly here to science, mathematics, and engineering.

There are two essential difficulties with funding research:

  1. Longshot: Paying a researcher is often a big gamble. Most research projects don’t pan out, but a few big payoffs can make the whole portfolio worthwhile.
  2. Information Only: The output of much research is simply information: the right way to think about or do something.

The Longshot difficulty means that there is high variance in payoffs. This can be compensated for by funding many different research projects, which reduces the variance of the average return.
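A toy simulation (all numbers invented purely for illustration) makes the variance-reduction argument concrete: the expected per-project return is unchanged, but its variance falls roughly as 1/n with the number of funded projects:

```python
import random

# Each funded project costs 1 and pays 200 with probability 0.01,
# so the expected return per project is 0.01 * 200 - 1 = +1.
def per_project_return(n_projects, p_win=0.01, payoff=200.0, cost=1.0):
    wins = sum(random.random() < p_win for _ in range(n_projects))
    return wins * payoff / n_projects - cost

for n in (1, 100, 10_000):
    rs = [per_project_return(n) for _ in range(2_000)]
    mean = sum(rs) / len(rs)
    var = sum((r - mean) ** 2 for r in rs) / len(rs)
    print(f"projects: {n:>6}   mean return: {mean:+7.2f}   variance: {var:10.2f}")
```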

The Information-Only difficulty means that it’s hard to extract a profit directly from many types of research, so companies have difficulty justifying basic research. (Patents are one mechanism for doing this, but they are often extraordinarily clumsy or simply not applicable.)

These two difficulties together imply that research is often chronically underfunded compared to what would be optimal for any particular nation. They also imply that funding for research makes more sense for larger nations and makes sense for government (rather than private) investment.

The United States has a history of significant research, and significant benefits from research, but that seems to be under attack.

  1. Historically, the old phone monopoly ran Bell Labs, which employed thousands doing science and engineering research. Research made great sense for the monopoly as a place to stash money (and evade regulators) that might produce some return. They invented the transistor, the laser, and Unix. With the breakup of the phone monopoly this no longer made sense, so Bell Labs has been broken apart and has shrunk by orders of magnitude in staff.
  2. On a smaller scale, Xerox PARC (inventors of the mouse, Ethernet, and other basic pieces of computer technology) has been radically scaled back.
  3. IBM and HP, historically strong funders of computer-related research, have been forced to shift towards more directed research. (Some great research still happens at these places, but the overall trend seems clear.)
  4. The NSF has had funding cut.

What’s Left
The new monopoly on the block is Microsoft, which has been a significant funder of new research, some of it basic. IBM is still managing to do some basic research. Many companies fund directed research (with short-term expected payoffs). The NSF still has a reasonable budget, even if it is a bit reduced. Many other branches of the government fund directed research of one sort or another. From the perspective of a researcher, this isn’t as good as NSF funding because it is “money with strings attached”, including specific topics, demos, etc.

Much of the funding available falls into two or three categories: directed into academia, very directed, or both. These have difficulties from a research viewpoint.

  1. Into Academia: The difficulty with funding directed into academia is that the professors it is directed at are incredibly busy with nonresearch. Teaching and running a university are full-time jobs. It takes an extraordinary individual to manage all of this and still get research done. (Amazingly, many people do manage, but the workload can be crushing.) From the perspective of funding research, this is problematic because the people being funded are forced to devote much of their time to nonresearch. As an example from machine learning, AT&T inherited the machine learning group from Bell Labs, consisting of Leon Bottou, Michael Collins, Sanjoy Dasgupta, Yoav Freund, Michael Kearns, Yann LeCun, David McAllester, Robert Schapire, Satinder Singh, Peter Stone, Rich Sutton, Vladimir Vapnik (and maybe a couple more I’m forgetting). The environment there was almost pure research, without other responsibilities. It would be extremely difficult to argue that a similar-sized group drawn randomly from academia has had as significant an impact on machine learning. This group is gone now, scattered to many different locations.
  2. Very directed: It’s a basic fact of research that it is difficult to devote careful and deep attention to something that does not interest you. Attempting to do so simply means that many researchers aren’t efficient. (I’m not arguing against any direction here. It makes sense for any nation to invest in directions which seem important.)

The Good News (for researchers, anyways)
The good news at the moment is outside of the US. NICTA, in Australia, seems to be a well-made attempt to do research right. India, China, and Japan are all starting to fund research (in Japan’s case, basic research) more. With the rise of the EU, more funding for research makes sense because the benefit applies to a much larger pool of people. In machine learning, this is being realized with the PASCAL project. On the engineering side, centers like the Mozilla Foundation and OSDL (which are funded by corporate contributions) provide some funding for open source programmers.

We can hope for improvements in the US; there is certainly room for them. For example, the NSF budget is roughly 0.3% of the Federal government budget, so even a substantial increase in basic research funding would be trivial in the big picture. However, it’s never easy to trade off immediate needs against the silent loss of the future.

Antilearning: When proximity goes bad

Joel Predd mentioned “Antilearning” by Adam Kowalczyk, which is interesting from the viewpoint of foundational intuitions.

There is a pervasive intuition that “nearby things tend to have the same label”. This intuition is instantiated in SVMs, nearest neighbor classifiers, decision trees, and neural networks. It turns out there are natural problems where this intuition is the opposite of the truth.

One natural situation where this occurs is in competition. For example, when Intel fails to meet its earnings estimate, is this evidence that AMD is doing badly also? Or evidence that AMD is doing well?

This violation of the proximity intuition means that when the number of examples is small, negating a classifier which attempts to exploit proximity can provide predictive power (hence the term “antilearning”).
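A toy construction (mine, not from the paper) makes this concrete: place each class’s points farther from one another than from the opposite class, so every point’s nearest neighbor carries the wrong label, and the negated classifier is perfect:

```python
import numpy as np

# Four points where within-class distance (2) exceeds between-class
# distance (sqrt(2)): every point's nearest neighbor has the opposite label.
X = np.array([[1., 0.], [-1., 0.], [0., 1.], [0., -1.]])
y = np.array([+1, +1, -1, -1])

def loo_1nn_accuracy(X, y, negate=False):
    """Leave-one-out 1-nearest-neighbor accuracy, optionally negated."""
    correct = 0
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                      # exclude the held-out point itself
        pred = y[np.argmin(d)]
        correct += (-pred if negate else pred) == y[i]
    return correct / len(X)

print("1-NN accuracy:        ", loo_1nn_accuracy(X, y))               # 0.0
print("negated 1-NN accuracy:", loo_1nn_accuracy(X, y, negate=True))  # 1.0
```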

Why Papers?

Makc asked a good question in the comments: “Why bother to make a paper at all?” There are several reasons for writing papers which may not be immediately obvious to people outside academia.

The basic idea is that papers have considerably more utility than the obvious “present an idea”.

  1. Papers are formalized units of work. Academics (especially young ones) are often judged on the number of papers they produce.
  2. Papers have a formalized method of citing and crediting others: the bibliography. Academics (especially older ones) are often judged on the number of citations they receive.
  3. Papers enable a “more fair” anonymous review. Conferences receive many papers, from which a subset is selected. Discussion forums are inherently not anonymous for anyone who wants to build a reputation for good work.
  4. Papers are an excuse to meet your friends. Papers are the content of conferences, but much of what you do is talk to friends about interesting problems while there. Sometimes you even solve them.
  5. Papers are an excuse to get a large number of smart people in the same room and think about the same topic.
  6. Good papers are easy to read. In particular, they are much easier to read (and understand) than a long discussion thread. They remain easy to read decades later. (Writing good papers is hard.)

All of the above are reasons why writing papers is a good idea. It’s also important to understand that academia is a large system, and large systems have a lot of inertia. This means switching from paper writing to some other method of doing research won’t happen unless the other method is significantly more effective, and even then there will be a lot of inertia.
