John Langford – Page 46 – Machine Learning (Theory)

4/21/2008

The Science 2.0 article

I found the article about science using modern tools interesting, especially the part about ‘blogophobia’, which in my experience is often a substantial issue: many potential guest posters aren’t quite ready, because of the fear of a permanent public mistake, because it is particularly hard to write about the unknown (the essence of research), and because the system for public credit doesn’t yet really handle blog posts.

So far, science has been relatively resistant to discussing research on blogs. Some things need to change to get there. Public tolerance of the occasional mistake is essential, as is a willingness to cite (and credit) blogs as freely as papers.

I’ve often run into another reason for holding back myself: I don’t want to overtalk my own research. Nevertheless, I’m slowly changing to the opinion that I’m holding back too much: the real power of a blog in research is that it can be used to confer with many people, and that just makes research work better.

4/12/2008

Blog compromised

Iain noticed that hunch.net had zero width divs hiding spammy URLs. Some investigation reveals that the wordpress version being used (2.0.3) had security flaws. I’ve upgraded to the latest, rotated passwords, and removed the spammy URLs. I don’t believe any content was lost. You can check your own and other sites for a similar problem by greping for “width:0” or “width: 0” in the delivered html source.

4/12/2008

It Doesn’t Stop

I’ve enjoyed the Terminator movies and show. Neglecting the whacky aspects (time travel and associated paradoxes), there is an enduring topic of discussion: how do people deal with intelligent machines (and vice versa)?

In Terminator-land, the primary method for dealing with intelligent machines is to prevent them from being made. This approach works pretty badly, because a new angle on building an intelligent machine keeps coming up. This is partly a ploy for writer’s to avoid writing themselves out of a job, but there is a fundamental truth to it as well: preventing progress in research is hard.

The United States, has been experimenting with trying to stop research on stem cells. It hasn’t worked very well—the net effect has been retarding research programs a bit, and exporting some research to other countries. Another less recent example was encryption technology, for which the United States generally did not encourage early public research and even discouraged as a munition. This slowed the development of encryption tools, but I now routinely use tools such as ssh and GPG.

Although the strategy of preventing research may be doomed, it does bring up a Bill Joy type of question: should we actively chose to do research in a field where knowledge can be used to great harm? As an example, the Terminator series illustrates the dark fears of AI gone bad. Many researchers avoid this question by not thinking about it, but this is a substantial question of concern to society at large, and whether or not society supports a direction of research.

My answer is “yes, we should do research”. The reason is simple: I believe that good AI is the best chance of the survival of civilization. This might seem like a leap, but considering the following.

Civilization is not stable. Anyone who believes otherwise needs to try to smell the 1908. Just a lifetime ago, humans could barely fly and computers were people. These radical changes in the abilities of a civilization are strong evidence against stability. Further evidence of instabilities come from long term world changing trends such as greenhouse gas accumulation and population graphs.
Instability is bad in the long run. There are quite a number of doomsday-for-civilization scenarios kicking around—nuclear, plague, grey goo, black holes, etc… Many people find doomsday scenarios triggered by malevolence or accident to be unconvincing, since doomsday claims are so commonly debunked (remember the Y2K computer bug armageddon?). I am naturally skeptical myself, but it only takes one. In the next 10000 years, the odds of something going wrong seem fair.
… for a closed system. There is one really good caveat to instability, which is redundancy. Perhaps if we Earthlings screwup, our descendendents on Alpha Centauri can come pick up the pieces. The fundamental driver here is light speed latency: if it takes years for two groups to communicate, then it is unlikely they’ll manage to coordinate (with malevolence or accident) a simultaneous doomsday.
But real space travel requires AI. Getting from one star system to another with known physics turns out to be very hard. The best approaches I know involve giant lasers and multiple solar sails or fusion powered rockets, taking many years. Merely getting there, of course, is not enough—we need to get there with a kernel of civilization, capable of growing anew in the new system. Any adjacent star system may not have an earth-like planet implying the need to support a space-based civilization. Since travel between star systems is so prohibitively difficult, a basic question is: how small can we make a kernel of civilization? Many science fiction writers and readers think of generation ships, which would necessarily be enormous to support the air, food, and water requirements of a self-sustaining human population. A much simpler and easier solution comes with AI. A good design might mass 10³ kilograms or so and be designed to first land on an asteroid, then mine it, first creating a large solar cell array, and replicas to seed other asteroids. Eventually, hallowed out asteroids could support human life if the requisite materials (Oxygen, Carbon, Hydrogen, etc..) are found. The fundamental observation here is that intelligence and knowledge require very little mass.

I hope we manage to crack AI, opening the door to real space travel, so that civilization doesn’t stop.

3/23/20084/12/2008

Interactive Machine Learning

A new direction of research seems to be arising in machine learning: Interactive Machine Learning. This isn’t a familiar term, although it does include some familiar subjects.

What is Interactive Machine Learning? The fundamental requirement is (a) learning algorithms which interact with the world and (b) learn.

For our purposes, let’s define learning as efficiently competing with a large set of possible predictors. Examples include:

Online learning against an adversary (Avrim’s Notes). The interaction is almost trivial: the learning algorithm makes a prediction and then receives feedback. The learning is choosing based upon the advice of many experts.
Active Learning. In active learning, the interaction is choosing which examples to label, and the learning is choosing from amongst a large set of hypotheses.
Contextual Bandits. The interaction is choosing one of several actions and learning only the value of the chosen action (weaker than active learning feedback).

More forms of interaction will doubtless be noted and tackled as time progresses. I created a webpage for my own research on interactive learning which helps define the above subjects a bit more.

What isn’t Interactive Machine Learning?
There are several learning settings which fail either the interaction or the learning test.

Supervised Learning doesn’t fit. The basic paradigm in supervised learning is that you ask experts to label examples, and then you learn a predictor based upon the predictions of these experts. This approach has essentially no interaction.
Semisupervised Learning doesn’t fit. Semisupervised learning is almost the same as supervised learning, except that you also throw in many unlabeled examples.
Bandit algorithms don’t fit. They have the interaction, but not much learning happens because the sample complexity results only allow you to choose from amongst a small set of strategies. (One exception is EXP4 (page 66), which can operate in the contextual bandit setting.)
MDP learning doesn’t fit. The interaction is there, but the set of policies learned over is still too limited—essentially the policies just memorize what to do in each state.
Reinforcement learning may or may not fit, depending on whether you think of it as MDP learning or in a much broader sense.

All of these not-quite-interactive-learning topics are of course very useful background information for interactive machine learning.

Why now? Because it’s time, of course.

We know from other fields and various examples that interaction is very powerful.
1. From online learning against an adversary, we know that independence of samples is unnecessary in an interactive setting—in fact you can even function against an adversary.
2. From active learning, we know that interaction sometimes allows us to use exponentially fewer labeled samples than in supervised learning.
3. From context bandits, we gain the ability to learn in settings where traditional supervised learning just doesn’t apply.
4. From complexity theory we have “IP=PSPACE” roughly: interactive proofs are as powerful as polynomial space algorithms, which is a strong statement about the power of interaction.
We know that this analysis is often tractable. For example, since Sanjoy‘s post on Active Learning, much progress has been made. Several other variations of interactive settings have been proposed and analyzed. The older online learning against an adversary work is essentially completely worked out for the simpler cases (except for computational issues).
Real world problems are driving it. One of my favorite problems at the moment is the ad display problem—How do you learn which ad is most likely to be of interest? The contextual bandit problem is a big piece of this problem.
It’s more fun. Interactive learning is essentially a wide-open area of research. There are plenty of kinds of natural interaction which haven’t been formalized or analyzed. This is great for beginnners, because it means the problems are simple, and their solution does not require huge prerequisites.
It’s a step closer to AI. Many people doing machine learning want to reach AI, and it seems clear that any AI must engage in interactive learning. Mastering this problem is a next step.

Basic Questions

For natural interaction form [insert yours here], how do you learn? Some of the techniques for other methods of interactive learning may be helpful.
How do we blend interactive and noninteractive learning? In many applications, there is already a pool of supervised examples around.
Are there general methods for reducing interactive learning problems to supervised learning problems (which we know better)?

3/15/20083/16/2008

COLT Open Problems

COLT has a call for open problems due March 21. I encourage anyone with a specifiable open problem to write it down and send it in. Just the effort of specifying an open problem precisely and concisely has been very helpful for my own solutions, and there is a substantial chance others will solve it. To increase the chance someone will take it up, you can even put a bounty on the solution. (Perhaps I should raise the $500 bounty on the K-fold cross-validation problem as it hasn’t yet been solved).