This post is about an open problem in learning reductions.
Background A reduction might transform a a multiclass prediction problem where there are k possible labels into a binary learning problem where there are only 2 possible labels. On this induced binary problem we might learn a binary classifier with some error rate e. After subtracting the minimum possible (Bayes) error rate b, we get a regret r = e – b. The PECOC(Probabilistic Error Correcting Output Code) reduction has the property that binary regret r implies multiclass regret at most 4r0.5.
The problem This is not the “rightest” answer. Consider the k=2 case, where we reduce binary to binary. There exists a reduction (the identity) with the property that regret r implies regret r. This is substantially superior to the transform given by the PECOC reduction, which suggests that a better reduction may exist for general k. For example, we can not rule out the possibility that a reduction R exists with regret transform guaranteeing binary regret r implies at most multiclass regret c(k) r where c(k) is a k dependent constant between 1 and 4.
Difficulty I believe this is a solvable problem, given some serious thought.
Impact The use of some reduction from multiclass to binary is common practice, so a good solution could be widely useful. One thing to be aware of is that there is a common and reasonable concern about the ‘naturalness’ of induced problems. There seems to be no way to address this concern other than via empirical testing. On the theoretical side, a better reduction may help us understand whether classification or l2 regression is the more natural primitive for reduction. The PECOC reduction essentially first turns a binary classifier into an l2 regressor and then uses the regressor repeatedly to make multiclass predictions.
Some background material which may help:
- Dietterich and Bakiri introduce Error Correcting Output Codes.
- Guruswami and Sahai analyze ECOC as an error transform reduction. (see lemma 2)
- Allwein, Schapire, and Singer generalize ECOC to use loss-based decoding.
- Beygelzimer and Langford showed that ECOC is not a regret transform and proved the PECOC regret transform.