This, again, is something of a research direction rather than a single problem.

There are several metrics people care about which depend upon the relative ranking of examples, and there are sometimes good reasons to care about such metrics. Examples include AROC (the area under the ROC curve), “F1”, the proportion of the time that the top-ranked element is in some class, the proportion of the top 10 examples in some class (Google’s problem), the lowest-ranked example of some class, and the “sort distance” from a predicted ranking to a correct ranking. See here for an example of some of these.
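To make a few of these concrete, here is a small sketch (the function names and implementations are mine, not from any of the referenced work) computing AROC, top-k precision, and the Kendall pairwise-disagreement count as one interpretation of “sort distance”, from scores and binary labels:

```python
def aroc(scores, labels):
    """Area under the ROC curve: the probability that a random positive
    example is scored above a random negative one (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def precision_at_k(scores, labels, k=10):
    """Proportion of the top-k ranked examples that are in the class."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    return sum(y for _, y in ranked[:k]) / k

def kendall_distance(rank_a, rank_b):
    """One notion of 'sort distance': the number of pairs of items on
    which the two rankings (lists in preference order) disagree."""
    pos_b = {item: i for i, item in enumerate(rank_b)}
    n = len(rank_a)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if pos_b[rank_a[i]] > pos_b[rank_a[j]])
```

Each of these depends only on the ordering the scores induce, not on their values, which is what separates this family of metrics from losses like squared error.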

**Problem** What does the ability to classify well imply about performance under these metrics?

**Past Work**

- Probabilistic classification under squared error can be solved with a classifier. A counterexample shows this does not imply a good AROC.
- Sample complexity bounds for AROC (and here).
- A paper on “Learning to Order Things”.
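A toy illustration of the spirit of that counterexample (this example is my own, not the one from the cited work): two probability estimates can be equally close to the true conditional probabilities under squared error while inducing opposite rankings, so that on this pair one has AROC 1 and the other AROC 0.

```python
true_p = [0.51, 0.49]   # true P(y=1|x) for a positive and a negative example
good   = [0.53, 0.47]   # preserves the true ordering
bad    = [0.49, 0.51]   # inverts it

def sq_err(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target))

# Equally good probability estimates under squared error...
assert abs(sq_err(good, true_p) - sq_err(bad, true_p)) < 1e-12
# ...but only one of them ranks the positive example above the negative one.
assert good[0] > good[1] and bad[0] < bad[1]
```

The point is that squared error is insensitive to small perturbations near a decision boundary, while ranking metrics can swing maximally on exactly those perturbations.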

**Difficulty** Several of these may be easy. Some of them may be hard.

**Impact** Positive or negative results will broaden our understanding of the relationship between different learning goals. It might also yield new algorithms (via the reduction) for solving these problems.

As far as ranking schemes go, there is a nice theory of rankings based on social choice principles. Kemeny optimality is an important notion here, and there are examples of combining ranking schemes via what is essentially a 1-median computation under a metric on rankings.
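The 1-median view can be sketched as follows (a brute-force illustration of my own, using the Kendall pairwise-disagreement distance as the metric; it enumerates all rankings, so it is only feasible for a handful of items, and exact Kemeny aggregation is NP-hard in general):

```python
from itertools import permutations

def kendall(a, b):
    """Number of pairs of items on which rankings a and b disagree."""
    pos = {x: i for i, x in enumerate(b)}
    return sum(1 for i in range(len(a)) for j in range(i + 1, len(a))
               if pos[a[i]] > pos[a[j]])

def kemeny(rankings):
    """1-median in ranking space: the ranking minimizing the total
    Kendall distance to the input rankings."""
    items = rankings[0]
    return min(permutations(items),
               key=lambda cand: sum(kendall(list(cand), r) for r in rankings))

votes = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
print(kemeny(votes))  # -> ('a', 'b', 'c')
```

Here “a” wins because it beats both other items in a majority of the input rankings, which matches the social-choice intuition behind Kemeny optimality.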

See http://www10.org/cdrom/papers/577/