Don’t mix the solution into the problem – Machine Learning (Theory)

A common defect of many pieces of research is defining the problem in terms of the solution. Here are some examples in learning:

1. “The learning problem is finding a good separating hyperplane.”
2. “The goal of learning is to minimize (y-p)² + C w² where y = the observation, p = the prediction and w = a parameter vector.”
3. Defining the loss function to be the one that your algorithm optimizes rather than the one imposed by the world.

The fundamental reason why this is a defect is that it creates artificial boundaries on how the problem can be solved. Artificial boundaries lead to the possibility of being blind-sided. For example, someone committing (1) or (2) above might be surprised to find a decision tree working well on a problem. Example (3) might result in someone else solving a learning problem better for real-world purposes, even if their solution is worse with respect to the loss the algorithm optimizes. This defect should be avoided so as to not artificially limit your learning kungfu.
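To make the point about example (3) concrete, here is a toy sketch with made-up numbers. The labels and the two score vectors are hypothetical and chosen purely for illustration: predictor A looks better under squared loss (the loss an algorithm might optimize), while predictor B is better under error rate (the loss the world imposes).

```python
# Hypothetical labels in {-1, +1} and scores from two hypothetical predictors.
y = [1, 1, 1, -1, -1, -1]
scores_a = [0.9, 0.9, -0.1, -0.9, -0.9, 0.1]   # close to the labels, but two sign mistakes
scores_b = [3.0, 3.0, 0.1, -3.0, -3.0, -0.1]   # far from the labels, but every sign is right


def squared_loss(labels, scores):
    """Mean of (y - p)^2: the loss an algorithm might optimize."""
    return sum((yi - pi) ** 2 for yi, pi in zip(labels, scores)) / len(labels)


def error_rate(labels, scores):
    """Fraction of examples whose score has the wrong sign: the real-world loss."""
    return sum(1 for yi, pi in zip(labels, scores) if yi * pi <= 0) / len(labels)


for name, scores in [("A", scores_a), ("B", scores_b)]:
    print(name, "squared loss:", squared_loss(y, scores),
          "error rate:", error_rate(y, scores))
```

Anyone who defined the problem as "minimize squared loss" would prefer A, and would be blind-sided by B winning on the problem that actually matters.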

The way to avoid this defect while still limiting the scope of investigation to something you can manage is to be explicit.

1. Admit what the real world-imposed learning problem is. For example: “The problem is to find a classifier minimizing error rate.”
2. Be explicit about where the problem ends and the solution begins. For example: “We use a support vector machine with an l₂ loss to train a classifier. We use the l₂ loss because it is an upper bound on the error rate that is computationally tractable to optimize.”
3. Finish the solution by evaluating it on the real problem. For example: “The error rate on our test set was 0.34.”
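The three steps above can be mirrored in how an experiment is written. The following is a minimal sketch, not a recommended method: it assumes synthetic two-class Gaussian data, uses NumPy, and swaps in a regularized squared loss (the objective from the second example at the top of the post) solved in closed form rather than a support vector machine. The function names `error_rate` and `fit_squared_loss` are invented for this illustration.

```python
import numpy as np


# Step 1: the problem. The world-imposed loss is the error rate.
def error_rate(y_true, scores):
    """Fraction of misclassified examples, for labels in {-1, +1}."""
    return float(np.mean(np.sign(scores) != y_true))


# Step 2: the solution. A tractable surrogate: minimize
# sum (y - Xw)^2 + C * ||w||^2, solved in closed form (ridge regression).
# For labels in {-1, +1}, (y - p)^2 >= 1 whenever sign(p) != y, so the
# squared loss upper-bounds the 0-1 error on every example.
def fit_squared_loss(X, y, C=1.0):
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + C * np.eye(d), X.T @ y)


# Synthetic data (an assumption for illustration): two Gaussian blobs.
rng = np.random.default_rng(0)
n = 400
y = rng.choice([-1.0, 1.0], size=n)
X = rng.normal(size=(n, 2)) + 1.5 * y[:, None]
X = np.hstack([X, np.ones((n, 1))])  # constant column acts as a bias term

X_train, y_train = X[:300], y[:300]
X_test, y_test = X[300:], y[300:]

w = fit_squared_loss(X_train, y_train, C=1.0)

# Step 3: finish the solution by reporting the real-world loss on held-out data.
test_error = error_rate(y_test, X_test @ w)
print("test error rate:", test_error)
```

Keeping `error_rate` separate from `fit_squared_loss` is the point of the design: the training routine could be replaced by a decision tree or anything else without changing how success is measured.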

It is important to note that this is not a critique of any particular method for solving learning problems, but rather of the process of thinking about learning problems. Eliminating this thinking-bug will leave people more capable of appreciating and using different approaches to solve the real learning problem.