Future Publication Models @ NIPS

Yesterday, there was a discussion about future publication models at NIPS. Yann and Zoubin have specific, detailed proposals, to which I’ll add links when I get them (Yann’s proposal and Zoubin’s proposal).

What struck me about the discussion is that there are many simultaneous concerns and many simultaneous proposals, which makes it difficult to keep all the distinctions straight in a verbal conversation. It also seemed like people were serious enough about this that we may see some real movement. Certainly, my personal experience motivates change, as I’ve posted many times about the substantial flaws in our review process, including some very poor personal experiences.

Concerns include the following:

  1. (Several) Reviewers are overloaded, boosting the noise in decision making.
  2. (Yann) A new system should run with as little built-in delay and friction to the process of research as possible.
  3. (Hanna Wallach) Double-blind review is particularly important for people who are unknown or from an unknown institution.
  4. (Several) But, it’s bad to take double-blind review so seriously as to disallow publishing on arXiv or personal webpages.
  5. (Yann) And double-blind is bad when it prevents publishing for substantial periods of time. Apparently, this comes up in CVPR.
  6. (Zoubin) Any new system should appear to outsiders as if it’s the old system, or a journal, because it’s already hard enough to justify CS tenure cases to other disciplines.
  7. (Fernando) There shouldn’t be a big change with a complex bureaucracy, but rather smaller changes which are obviously useful or at least worth experimenting with.

There were other concerns as well, but these are the ones that I remember.

Elements of proposals include:

  1. (Yann) Everything should go to arXiv or an arXiv-like system first, as in physics or mathematics. This addresses (1), because it delinks dissemination from review, relieving some of the burden of reviewing. It also addresses (2), since authors can immediately begin building on each other’s work. It conflicts with (3), because arXiv does not support double-blind submission, though there would be no conflict if we built our own system.
  2. (Fernando) Create a conference-coincident journal in which people can publish at any time. VLDB has apparently done this. It can be done smoothly by allowing submission in either conference-deadline mode or journal mode. This proposal addresses (1) by reducing peak demand on reviewing. It also addresses (6) above.
  3. (Daphne) Perhaps we should have a system which only reviews papers for correctness, which is not nearly as subjective as judging novelty or interestingness. This addresses (1) by eliminating some concerns for the reviewer. It is orthogonal to the double-blind debate. In biology, such a journal exists, because delays were becoming absurd and intolerable.
  4. (Yann) There should be multiple publishing entities (people or groups of people) that can bless a paper as interesting. This addresses (1).

There are many other proposal elements (too many for my memory), which hopefully we’ll see in particular proposals. If other people have concrete proposals, now is probably the right time to formalize them.

Vowpal Wabbit version 4.0, and a NIPS heresy

I’m releasing version 4.0 (tarball) of Vowpal Wabbit. The biggest change (by far) in this release is experimental support for cluster parallelism, with notable help from Daniel Hsu.

I also took advantage of the major version number to introduce some incompatible changes, including switching to murmurhash 2 and other alterations to cache files, so you’ll need to delete and regenerate them. In addition, the precise specification for a “tag” (i.e. a string that can be used to identify an example) changed: you can’t have a space between the tag and the ‘|’ at the beginning of the feature namespace.
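For concreteness, here’s what a tagged example line looks like under the new specification (the label, tag, and feature names here are made up for illustration):

    1 1.0 example_39|features height:1.5 weight:2.7

The tag (example_39) now has to directly abut the ‘|’; writing “example_39 |features” with a space between them no longer parses as a tag.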

And, of course, we made it faster.

For the future, I put up my todo list outlining the major improvements I want to see in the code. I’m planning to discuss the current mechanism and results of the cluster parallel implementation at the large scale machine learning workshop at NIPS later this week. Several people have asked me to do a tutorial/walkthrough of VW, which is arranged for Friday 2pm in the workshop room, so no skiing for me Friday. Come join us if this heresy interests you as well 🙂

AI Safety

Dan Reeves introduced me to Michael Vassar, who ran the Singularity Summit and educated me a bit on the subject of AI safety, for which the Singularity Institute has small grants.

I still believe that interstellar space travel is necessary for long-term civilization survival, and that AI is necessary for interstellar space travel. On these grounds alone, we could judge that developing AI is much safer than not. Nevertheless, there is a basic, reasonable fear, as expressed by some commenters, that AI could go bad.

A basic scenario starts with someone inventing an AI and telling it to make as much money as possible. The AI promptly starts trading in various markets to make money. To improve, it crafts a virus that takes over most of the world’s computers, using them as a surveillance network so that it can always make the right decision. The AI also branches out into any form of distance work, taking over the entire outsourcing process for all jobs that are entirely digital. To further improve, the AI invests a bit in robotics, creating automated manufacturing systems that produce all kinds of goods. Robot cars and construction teams complete the process, so that any human with money can order anything cheaply and quickly, but no jobs remain for humans.

At this point, the AI is stuck: it can eventually extract all the money from the economic system, and that’s all there is. But of course, it isn’t really stuck. It simply funds appropriate political campaigns so that in some country a measure passes granting the AI the right to make money, which it promptly does, mushrooming its wealth from trillions to the maximum number representable in all computers simultaneously. To remove this obstacle, the AI promptly starts making more computers on a worldwide scale until all available power sources are used up. To add more power, the AI starts a space program with beamed power. Unfortunately, it finds the pesky atmosphere an obstacle to space travel, so it chemically binds the atmosphere in the crust of the earth, allowing many Gauss guns to efficiently project material into space, where solar sails are used for orbital positioning. This process continues, slowed perhaps by the need to cool the earth’s core, until the earth and other viable rocky bodies in the solar system are discorporated into a Dyson sphere. Then, the AI goes interstellar with the same program.

Somewhere in this process, certainly by the time the atmosphere is chemically bound, all life on earth (except the AI, if you count it) is extinct. Furthermore, the AI, while intelligent by many measures, doesn’t seem to be accomplishing anything interesting.

One element of understanding AI safety seems to be understanding what an AI could do. Many people seem to ascribe arbitrary powers to any sort of superintelligence, making any constraints imposed on it ineffective. I don’t believe that’s the right approach: we should think of an AI as simply having much more ability to research, control, and manipulate large systems, all within the constraints of known physics.

Efforts to create safe AI go back to Asimov’s Three Laws of Robotics, which appear limited by their inability to encompass robotic warfare. The general problem is related to the wish problem: How do you specify a wish so that it can’t be misinterpreted? A cheap trick here is to add “… in a manner that I would consider acceptable” to the end of the wish. Applied to AI, this approach also has limits, because any limit imposed by a person can, and eventually will, be removed by a person given sufficient opportunity.

Perhaps a complementary approach is shown by the game RISK, where it appears to be virtually impossible for one player to win if all other players play defensively (i.e. build up armies and only attack in response to a provoking attack). Applied to AI, the idea would be that we make many AIs programmed to behave well either via laws or wish tricks, with an additional element of aggressively enforcing this behavior in other AIs. Then, if any AI is corrupted, the other AIs, with substantially more aggregate resources, will discover and deal with the problem.

Certain elements are necessary for this approach to work. There must be multiple AIs, and (more importantly) the resources any one controls must be small compared to the total, an extreme form of antimonopoly. Furthermore, the default must be that AIs are programmed not to harm humans and to enforce that behavior in other AIs. Getting the programming right is the hard part, and I’m not clear on how viable this is, or how difficult it is compared to simply creating an AI, which of course I haven’t managed.

ICML 2009 Workshops (and Tutorials)

I’m the workshops chair for ICML this year. As such, I would like to personally encourage people to consider running a workshop.

My general view of workshops is that they are excellent as opportunities to discuss and develop research directions—some of my best work has come from collaborations at workshops and several workshops have substantially altered my thinking about various problems. My experience running workshops is that setting them up and making them fly often appears much harder than it actually is, and the workshops often come off much better than expected in the end. Submissions are due January 18, two weeks before papers.

Similarly, Ben Taskar is looking for good tutorials, which is complementary. Workshops are about exploring a subject, while a tutorial is about distilling it down into an easily taught essence, a vital part of the research process. Tutorials are due February 13, two weeks after papers.

The Other Online Learning

If you search for “online learning” with any major search engine, it’s interesting to note that zero of the results are for online machine learning. This may not be a mistake if you are committed to a global ordering. In other words, the number of people specifically interested in the least interesting top-10 online human learning result might exceed the number of people interested in online machine learning, even given the presence of the other 9 results. The essential observation here is that the process of human learning is a big business (around 5% of GDP) affecting virtually everyone.

The internet is changing this dramatically, by altering the economics of teaching. Consider two possibilities:

  1. The classroom-style teaching environment continues as is, with many teachers for the same subject.
  2. All the teachers for one subject get together, along with perhaps a factor of 2 more people who are experts in online delivery. They spend a factor of 4 more time designing the perfect lecture & learning environment as verified by extensive study.

These two approaches have a similar economic cost, with the additional effort in the second approach being offset by the fact that it is a one-time effort rather than an annual effort.
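A minimal back-of-the-envelope sketch of that claim, with hypothetical numbers (only the factors of 2 and 4 come from the scenario above):

    # Back-of-the-envelope cost comparison; teachers and hours are hypothetical,
    # only the factors of 2 and 4 come from the scenario above.
    teachers = 1000        # hypothetical: teachers covering one subject
    hours_per_year = 100   # hypothetical: annual course-preparation hours each

    classroom_per_year = teachers * hours_per_year       # recurring, every year
    online_once = (2 * teachers) * (4 * hours_per_year)  # 2x people, 4x time, once

    print(online_once / classroom_per_year)  # -> 8.0: costs match after ~8 years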

I’m sure many people prefer the classroom approach, because it’s traditional, because a teacher can adjust dynamically and intelligently to the student, and because a teacher provides ancillary benefits such as day care and child abuse detection. Nevertheless, the second approach represents a compelling alternative for education. For classes commonly taught through high school, it’s difficult to imagine how good a learning experience could be after millions of hours are spent refining it to create the perfect approach. Imagine repeating a lecture over and over, testing the resulting student understanding a {day, week, month, year, decade} later, to such an extent that every slide, every sentence, and every exercise is optimized for excellent learning. We could even imagine adapting the lecture to the learning style of each student.

The process of converting to the second approach has been slow, but it seems to be picking up. This suggests we can expect several things:

  1. Shakeout. Like all new approaches, there is room for early adopters to win while the established old order suffers. We can expect the most severe impact on pure teaching institutions which do not adopt the newer approaches. Research universities will be insulated in two ways: much of their revenue comes from research grants anyway, while the new approach creates a flight to excellence, which the research universities can lay some claim to. At one extreme, I understand that only 4-5% of the operating budget for Caltech comes from student tuition.
  2. Centralized Testing. Although class lessons can be taught at a distance, and exercises worked out by students, there is great room for cheating. The remedy for this is a strong centralized testing service. This already exists in the form of SAT, GRE, and AP tests, because grade inflation and nonuniform standards are common across schools. If a student can ace these tests after taking online learning classes, then there is a real sense in which colleges accepting that student can be satisfied of their qualifications. We can expect this to become more true, and perhaps to see more employer-oriented tests. We can also expect that testable subjects have an inherent advantage in online learning. As centralized testing is a difficult market to break into, the existing systems have a substantial advantage here.
  3. Digitization. Doing online learning brings all the advantages and disadvantages of any other digital media. These include perfect replicability, essentially free distribution, and difficult economics: on one hand the approach could be vastly valuable, while on the other it’s difficult to charge someone for something they can get for free. The economics imply that there is room for a major charity or state government to accomplish a great deal which might be difficult to accomplish in a business model.
  4. Gaps. There are areas of teaching which are not amenable to online instruction. For example, teaching people to do research remains in the apprentice model. Similarly, letters of recommendation remain an aspect of the apprentice model. Subjects of relatively small interest such as individual research directions may not merit the effort of a highly polished online instruction system. Similarly, many elements of our current education system are not related to formal education, but rather are about students meeting students, teachers acting as daycare for students, or simply structuring the day for learning. Mechanisms achieving the same ends with online human learning systems are necessary, and the conflation of goals represented by the traditional education approach will retard (but not stop) the adoption of online learning approaches. This process has already taken a decade, and we can expect more decades to come.

For those of us interested in online machine learning, it’s natural to question the relationship with online human learning. The practices differ entirely, but the theory still applies, as there are no clauses in the theorem statements of the form “if the learning agent is not a human then…” When you examine the theorem statements for applicability to online human learning, a few ideas may transfer well. One of these is the necessity of, and techniques for, handling exploration problems. If there are two ways to teach a subject, you could simply try both and take the best. But if your resources are limited, then a UCB approach provides a more efficient mechanism for doing this testing. Similarly, if a student has a set of known attributes, contextual-bandit approaches suggest a sound mechanism for personalizing lessons.
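To make the UCB suggestion concrete, here is a minimal UCB1 sketch for choosing between two teaching methods; the pass rates and the reward model (1 if a student passes a test after a lesson, 0 otherwise) are hypothetical, not anything measured:

    import math
    import random

    def ucb1_teaching_trial(n_rounds=1000, pass_rates=(0.6, 0.7)):
        # counts[a]: times method a was tried; passes[a]: passes observed for it
        counts = [0, 0]
        passes = [0.0, 0.0]
        for t in range(1, n_rounds + 1):
            if 0 in counts:
                arm = counts.index(0)  # try each method once before using UCB
            else:
                # UCB1 index: empirical pass rate plus a confidence bonus
                arm = max(range(2), key=lambda a: passes[a] / counts[a]
                          + math.sqrt(2 * math.log(t) / counts[a]))
            reward = 1.0 if random.random() < pass_rates[arm] else 0.0
            counts[arm] += 1
            passes[arm] += reward
        return counts, passes

    counts, passes = ucb1_teaching_trial()
    print(counts)  # trials concentrate on the better-performing method

A contextual-bandit variant would replace the two fixed arms with a policy mapping student attributes to a choice of lesson, which is the personalization suggestion above.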

Much of our other theory about the process of online learning may be helpful for motivating heuristics, but it typically appears too pessimistic to accurately capture what is possible. For example, a common technique to explain an idea when teaching is to simply cover a few extreme cases from which all others are some interpolation. The closest common machine learning analogue to this is some active learning algorithms, where a learning algorithm chooses which examples to label. But, of course, this is not an accurate model, because it’s not the student but rather the teacher who chooses the examples. A setting more suitable for student and teacher has been studied in learning theory (see the bibliography here for a link into the citation tree). However, these results are typically rather brittle, so it’s not clear yet that we have understood the right way to formalize this process.