The Limits of Learning Theory – Machine Learning (Theory)

Suppose we had an infinitely powerful mathematician sitting in a room and proving theorems about learning. Could he solve machine learning?

The answer is “no”. This answer is both obvious and sometimes underappreciated.

There are several ways to conclude that some bias is necessary in order to successfully learn. For example, suppose we are trying to solve a classification problem. At prediction time, we observe some features X and want to make a prediction of either 0 or 1. Bias is what makes us prefer one answer over the other based on past experience. In order to learn we must:

Have a bias. Always predicting that 0 is as likely as 1 is useless.
Have the “right” bias. Predicting 1 when the answer is 0 is also not helpful.

The implication of “have a bias” is that we cannot design effective learning algorithms with “a uniform prior over all possibilities”. The implication of “have the ‘right’ bias” is that our mathematician fails, since “right” is defined with respect to the solutions to problems encountered in the real world. The same effect occurs in various sciences such as physics: a mathematician cannot solve physics because the “right” answer is defined by the world.
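To make the “uniform prior over all possibilities” point concrete, here is a minimal sketch. It assumes a toy setting that is not in the post itself: 3-bit inputs, a handful of made-up training examples, and a learner (the hypothetical helper `posterior_of_one`) that treats every Boolean function on those inputs as equally likely a priori. After conditioning on the training data, such a learner is still exactly undecided about any input it has not seen.

```python
from itertools import product

def posterior_of_one(train, query, n_bits=3):
    """P(f(query) = 1 | train) under a uniform prior over ALL Boolean
    functions on n_bits inputs (a toy assumption for illustration)."""
    inputs = list(product([0, 1], repeat=n_bits))
    consistent = 0    # functions that agree with every training example
    predicts_one = 0  # of those, how many label the query as 1
    # Each candidate function is a full truth table: one label per input.
    for labels in product([0, 1], repeat=len(inputs)):
        f = dict(zip(inputs, labels))
        if all(f[x] == y for x, y in train):
            consistent += 1
            predicts_one += f[query]
    return predicts_one / consistent

# Hypothetical training data in which the label equals the first bit.
train = [((0, 0, 0), 0), ((0, 1, 1), 0),
         ((1, 0, 0), 1), ((1, 1, 0), 1), ((1, 0, 1), 1)]

print(posterior_of_one(train, (1, 1, 1)))  # unseen input -> 0.5
print(posterior_of_one(train, (1, 0, 0)))  # seen input   -> 1.0
```

The training data strongly suggests “the label is the first bit”, yet the bias-free learner gives the unseen input (1, 1, 1) probability 0.5, because exactly half of the consistent truth tables label it 1. Only a preference for some functions over others, that is, a bias, lets past experience say anything about new inputs.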

A similar question is “Can an entirely empirical approach solve machine learning?” The answer to this is “yes”, as long as we accept the evolution of humans as an empirical process and accept that a “solution” to machine learning is human-level learning ability.

A related question is then “Is mathematics useful in solving machine learning?” I believe the answer is “yes”. Although mathematics cannot tell us what the “right” bias is, it can:

Give us computational shortcuts relevant to machine learning.
Abstract empirical observations of what a good bias is, allowing transfer to new domains.

There is a reasonable hope that solving mathematics related to learning implies we can reach a good machine learning system in time shorter than the evolution of a human.

All of these observations imply that the process of solving machine learning must be partially empirical. (What works on real problems?) Anyone hoping to solve it must either engage in real-world experiments or listen carefully to people who engage in real-world experiments. A reasonable model here is physics, which has benefited from combined mathematical and empirical study.

4 Replies to “The Limits of Learning Theory”

Do you think there is one type of bias waiting to be discovered that will work for all real learning problems? Or do you expect that different biases will work well for different kinds of problems? In this case, would our job be to figure out which biases work best for each “kind” of real learning problem? (This creates a new learning problem).

The answer is “yes” and “no”. For “yes”: there exist good biases which can solve a diverse range of learning problems. For example, people seem to have them built in. For “no”, it’s important to remember that on any particular learning problem, the “best” bias is the solution. So the problem of “what is a good bias for this problem?” seems like it is just another way of stating “what is a good predictor?” (i.e. the original learning problem).

I will disagree. Let me focus on the first statement, about whether or not an “infinitely powerful mathematician” can “solve” machine learning. Before I go any further, however, I need to clarify what you mean by “solve”.

Perhaps you mean something like, “Come up with a mathematical equation that describes a super powerful learning system that can solve all solvable problems with speed.”

Naturally, armed with such an equation, our mathematician could then use it to predict things about the world and solve all sorts of very difficult problems etc.

Is this what you mean by “solve”?

Oops, that wasn’t supposed to be anonymous, let me try with both my email address and my name
