How should we, as researchers in machine learning, organize ourselves?
The most immediate measurable objective of computer science research is publishing a paper. The most difficult aspect of publishing a paper is having reviewers accept and recommend it for publication. The simplest mechanism for doing this is to show theoretical progress on some standard, well-known, easily understood problem.
In doing this, we often fall into a local minimum of the research process. The basic problem in machine learning is that it is very unclear whether the mathematical model is the right one for the (or some) real problem. A good mathematical model in machine learning should have one fundamental trait: it should aid the design of effective learning algorithms. To date, our ability to solve interesting learning problems (speech recognition, machine translation, object recognition, etc…) remains limited (although improving), so the “rightness” of our models is in doubt.
If our mathematical models are bad, the simple mechanism of research above cannot yield the end goal. (This should be agreed on even by people who disagree about what the end goal of machine learning is!) Instead, research which proposes and investigates new mathematical models for machine learning might yield the end goal. Doing this is hard.
- Coming up with a new mathematical model is just plain not easy. Some sources of inspiration include:
  - Watching carefully: What happens successfully in practice can often be abstracted into a mathematical model.
  - Swapping fields: In other fields (for example crypto), other methods of analysis have been developed. Sometimes, these methods can be transferred.
  - Model repair: Existing mathematical models often have easily comprehensible failure modes. By thinking about how to avoid such failure modes, we can sometimes produce a new mathematical model.
- Speaking about a new model is hard. The difficulty starts with explaining it yourself. Often, when trying to converge on a new model, we think of it in terms of the difference with respect to an older model, leading to a tangled explanation. The difficulty continues with other people (in particular: reviewers) reading it. For a reviewer with limited time, it is very tempting to assume that any particular paper is operating in some familiar model and fail out. The best approach here seems to be super explicitness. You can’t be too blunt about saying “this isn’t the model you are thinking about”.
- Succeeding with new models is also hard. When people don’t have a reference frame to understand the new model, they are unlikely to follow up, as is necessary for success in academia.
The good news here is that a successful new model can be a big win. I wish it were an easier win: the barriers to success are formidably high, and it seems we should do everything possible to lower them for the sake of improving research.