I found Tong Zhang’s paper on Data Dependent Concentration Bounds for Sequential Prediction Algorithms interesting. Roughly speaking, it states a tight bound on the future error rate for online learning algorithms assuming that samples are drawn independently. This bound is easily computed and will make the progressive validation approaches used here significantly more practical.

John,

Can you compare/contrast this paper with its reference [4] by Cesa-Bianchi et. al.? The later paper seemed appealingly elementary and with similar goals.

Drew

I believe [4] is superceded entirely. [4] was showing that it was possible (with bad constants) to take advantage of a small number of errors. This paper shows that we can do it efficiently, using a very fast approximation that is perfect with respect to the leading constant.