Multitask Poisoning

There are many ways that interesting research gets done. For example it’s common at a conference for someone to discuss a problem with a partial solution, and for someone else to know how to solve a piece of it, resulting in a paper. In some sense, these are the easiest results we can achieve, so we should ask: Can all research be this easy?

The answer is certainly no for fields where research inherently requires experimentation to discover how the real world works. However, mathematics, including parts of physics, computer science, statistics, etc… which are effectively mathematics don’t require experimentation. In effect, a paper can be simply a pure expression of thinking. Can all mathematical-style research be this easy?

What’s going on here is research-by-communication. Someone knows something, someone knows something else, and as soon as someone knows both things, a problem is solved. The interesting thing about research-by-communication is that it is becoming radically easier with improved technology. It’s already possible to communicate with nearly anyone anywhere in the world via {voice, video, text, shared program/website/blog}. As these mechanisms become cheaper in cost and ease of use, we’ll surely see their adoption into many activities. If all mathematical-style research can be sped up by communication, that’s great news for research.

However, I don’t believe it. The essential difficulty is that doing good research often requires the simultaneous understanding of several different things—the problem, all the broken approaches to solving some problem, why they break, and some hint about where to look for a solution. Absorbing and simultaneously keeping in mind this information requires a substantial amount of effort.

The extent of effort required is perhaps most easily understood in collaboration with yourself. Often, a problem is not immediately solved the first time it is thought of, instead a researcher must attack it again and again, until either giving up or finding a solution. A basic parameter in attacking a problem is: How hard is it to resume where you left off?

For many of the problems I’ve worked on, the answer is pretty hard. After weeks of not working on a problem, it might take several hours to refresh my memory with all the simultaneous concerns that go into a solution. Even if I worked on such a problem yesterday, it might take a half hour or an hour to reach a state where I’m prepared to make progress.

Given the difficulty of research, I (and many other people) often struggle in dealing with interruptions. Modern technology has made communication very easy, implying a stream of potential interruptions throughout the day, some of which are undeniably fruitful. And yet, an interruption means the overhead of getting back to thinking must be paid yet again.

Trading off properly between the value of avoiding the overhead and the value of dealing with interruptions is a constant struggle which typically did not exist before before modern communication technology made it prevalent. I think it is common to give in to the interrupts, and effectively cease to be able to do research other than research-by-communication. I’m not ready to do that, so I’ve found it essential to arrange large unused and uninterrupted periods of time for the purpose of doing research.

Although the solution is simple, I’ve often found implementing it challenging in several ways. There is the standard problem that people you deal with don’t understand the overhead of switching tasks in research. However, there is also a more insidious problem. Coping with the modern world requires that at least some portion of our time be devoted to interrupts, which are almost always easier to deal with than research. Dealing with these interrupts therefore can create a bad habit where you seek interrupts to achieve the (short term, easy) gratification of dealing with them. Thus, multitasking creates an internal expectation of multitasking, which makes multitasking preferred, eliminating the ability to do careful research. This is what I think of as multitask poisoning—it’s not just the time lost to swapping some context back in, but also the bad habits of self-interruption that it creates.

I don’t have a good answer to this problem, other than continuing the struggle to preserve substantial contiguous chunks of time for thinking. An alternative sometimes-applicable solution is to reduce the overhead of starting to solve a problem by decomposing the problem into subproblems. Where possible, this is of course valuable, but mathematical research is often almost uniquely undecomposable, because the nature of the problem is that it’s solution is unknown and hence undecomposable. Restated, the real problem is finding a valid decomposition.

I self-interrupted at least 3 times in making this post.

Many ways to Learn this summer

There are at least 3 summer schools related to machine learning this summer.

  1. The first is at University of Chicago June 1-11 organized by Misha Belkin, Partha Niyogi, and Steve Smale. Registration is closed for this one, meaning they met their capacity limit. The format is essentially an extended Tutorial/Workshop. I was particularly interested to see Valiant amongst the speakers. I’m also presenting Saturday June 6, on logarithmic time prediction.
  2. Praveen Srinivasan points out the second at Peking University in Beijing, China, July 20-27. This one differs substantially, as it is about vision, machine learning, and their intersection. The deadline for applications is June 10 or 15. This is also another example of the growth of research in China, with active support from NSF.
  3. The third one is at Cambridge, England, August 29-September 10. It’s in the MLSS series. Compared to the Chicago one, this one is more about the Bayesian side of ML, although effort has been made to create a good cross section of topics. It’s also more focused on tutorials over workshop-style talks.

2009 ICML discussion site

Mark Reid has setup a discussion site for ICML papers again this year and Monica Dinculescu has linked it in from the ICML site. Last year’s attempt appears to have been an acceptable but not wild success as a little bit of fruitful discussion occurred. I’m hoping this year will be a bit more of a success—please don’t be shy 🙂

I’d like to also point out that ICML‘s early registration deadline has a few hours left, while UAI‘s and COLT‘s are in a week.

CI Fellows

Lev Reyzin points out the CI Fellows Project. Essentially, NSF is funding 60 postdocs in computer science for graduates from a wide array of US places to a wide array of US places. This is particularly welcome given a tough year for new hires. I expect some fraction of these postdocs will be in ML. The time frame is quite short, so those interested should look it over immediately.

Server Update

The hunch.net server has been updated. I’ve taken the opportunity to upgrade the version of wordpress which caused cascading changes.

  1. Old threaded comments are now flattened. The system we used to use (Brian’s threaded comments) appears incompatible with the new threading system built into wordpress. I haven’t yet figured out a workaround.
  2. I setup a feedburner account.
  3. I added an RSS aggregator for both Machine Learning and other research blogs that I like to follow. This is something that I’ve wanted to do for awhile.
  4. Many other minor changes in font and format, with some help from Alina.

If you have any suggestions for site tweaks, please speak up.