The competitors for the Netflix Prize are tantalizingly close winning the million dollar prize. This year, BellKor and Commendo Research sent a combined solution that won the progress prize. Reading the writeups 2 is instructive. Several aspects of solutions are taken for granted including stochastic gradient descent, ensemble prediction, and targeting residuals (a form of boosting). Relatively to last year, it appears that many approaches have added parameterizations, especially for the purpose of modeling through time.
The big question is: will they make the big prize? At this point, the level of complexity in entering the competition is prohibitive, so perhaps only the existing competitors will continue to try. (This equation might change drastically if the teams open source their existing solutions, including parameter settings.) One fear is that the progress is asymptoting on the wrong side of the 10% threshold. In the first year, the teams progressed through 84.3% of the 10% gap, and in the second year, they progressed through just 64.4% of the remaining gap. While these numbers suggest an asymptote on the wrong side, in the month since the progress prize another 34.0% improvement of the remainder has been achieved. It’s remarkable that it’s too close to call, with just a 0.0035 RMSE gap to win the big prize. Clever people finding just the right parameterization might very well succeed.