This is a project at Yahoo! Research to design a fast, scalable, useful learning algorithm. There are two ways to have a fast learning algorithm: (a) start with a slow algorithm and speed it up, or (b) build an intrinsically fast learning algorithm. This project takes approach (b), and it has reached a state where it may be useful to others as a platform for research and experimentation.

There are two algorithms: one implements specialist gradient descent (GD) on squared loss, and the other implements specialist exponentiated gradient descent (SEG) on squared loss. The code should be easy to use. Its only external dependency is the Boost program_options library, which is often installed by default.

Features

There are several features that (in combination) are fairly interesting.
Learning Rate

The code implements several methods for adjusting the learning rate. The default is a fixed learning rate which decays by a factor of 2^0.5 if multiple epochs are used. This seems to be a fairly stable default. For some datasets, a learning rate which decays as 1/(number of examples) or 1/(C + number of examples), in stochastic gradient descent style, can work better. Choosing C and the learning rate well appears to be substantially more problem-dependent, so this is not the default.

The Future

This project is "live" and ongoing. We are interested in incorporating any significant improvements from other people, and I believe any such improvements are of substantial research interest.

Authors

John Langford, Lihong Li, and Alex Strehl have all worked on VW while at Yahoo! Research.
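
To make the learning rate schedules described under Learning Rate concrete, here is a minimal Python sketch. It is not taken from the VW code. The function names, the assumption that the fixed rate is divided by the decay factor once per completed epoch, and the plain dense squared-loss gradient step (rather than the specialist variants) are all illustrative choices.

```python
def epoch_decayed_rate(initial_rate, epoch, decay):
    """Fixed rate within an epoch, divided by `decay` once per completed epoch.

    Assumption: `epoch` counts from 0, so the first pass uses `initial_rate`.
    """
    return initial_rate / (decay ** epoch)


def inverse_count_rate(initial_rate, examples_seen, c=0.0):
    """Rate decaying as 1 / (C + number of examples).

    With c = 0 this is the 1 / (number of examples) schedule, so
    `examples_seen` must be at least 1 in that case.
    """
    return initial_rate / (c + examples_seen)


def squared_loss_gd_step(weights, features, label, rate):
    """One plain gradient descent step on the loss 0.5 * (prediction - label)**2.

    Hypothetical dense version for illustration only.
    """
    prediction = sum(w * x for w, x in zip(weights, features))
    error = prediction - label
    return [w - rate * error * x for w, x in zip(weights, features)]


# Usage: two passes over a toy dataset with the epoch-decayed schedule.
data = [([1.0, 2.0], 1.0), ([0.5, -1.0], 0.0)]
weights = [0.0, 0.0]
for epoch in range(2):
    rate = epoch_decayed_rate(0.1, epoch, decay=2 ** 0.5)
    for features, label in data:
        weights = squared_loss_gd_step(weights, features, label, rate)
```

Swapping `epoch_decayed_rate` for `inverse_count_rate` inside the inner loop gives the stochastic gradient descent style schedule. As noted above, good values for C and the initial rate then depend on the problem.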