I just visited Yahoo Research which has several fundamental learning problems near to (or beyond) the set of problems we know how to solve well. Here are 3 of them.
- Ranking This is the canonical problem of all search engines. It is made extra difficult for several reasons.
- There is relatively little “good” supervised learning data and a great deal of data with some signal (such as click through rates).
- The learning must occur in a partially adversarial environment. Many people very actively attempt to place themselves at the top of
rankings. - It is not even quite clear whether the problem should be posed as ‘ranking’ or as ‘regression’ which is then used to produce a
ranking.
- Collaborative filtering Yahoo has a large number of recommendation systems for music, movies, etc… In these sorts of systems, users specify how they liked a set of things, and then the system can (hopefully) find some more examples of things they might like
by reasoning across multiple such sets. - Exploration with Generalization The cash cow of
search engines is displaying advertisements which are relevant to search along with search results. Better targeting these advertisements makes money (a small improvement might be worth $millions) and improves the value of the search engine for the user.It is natural to predict the set of advertisements which maximize the advertising payoff. This natural idea is stymied by both the extreme
multiplicity of advertisements under contract (think millions) and a lack of ability to measure hypotheticals like “What would have
happened if we had displayed a different set of advertisements for this (query,user) pair instead?” This is a combined exploration and
generalization problem.
Good solutions to any of these problems would be extremely useful (and not just at Yahoo). Even further small improvements on the existing solutions may be very useful.
For those interested, Yahoo (as an organization) knows these are learning problems and is very actively interested in solving them. Yahoo Research is committed to a relatively open method of solving these problems. Dennis DeCoste is one contact point for machine learning research at Yahoo Research.