Machine Learning (Theory)

12/21/2007

Vowpal Wabbit Code Release

Tags: Code, Machine Learning, Online | jl @ 10:10 am

We are releasing the Vowpal Wabbit (Fast Online Learning) code as open source under a BSD (revised) license. This is a project at Yahoo! Research to build a useful large scale learning algorithm which Lihong Li, Alex Strehl, and I have been working on.

To appreciate the meaning of “large”, it’s useful to define “small” and “medium”. A “small” supervised learning problem is one where a human could use a labeled dataset and come up with a reasonable predictor. A “medium” supervised learning problem is one whose dataset fits into the RAM of a modern desktop computer. A “large” supervised learning problem is one which does not fit into the RAM of a normal machine. VW tackles large scale learning problems by this definition of large. I’m not aware of any other open source Machine Learning tools which can handle this scale (although they may exist). A few close ones are:

  1. IBM’s Parallel Machine Learning Toolbox isn’t quite open source. The approach used by this toolbox is essentially map-reduce style computation, which doesn’t seem amenable to online learning approaches. This is significant, because the fastest learning algorithms without parallelization tend to be online learning algorithms.
  2. Leon Bottou’s sgd implementation first loads data into RAM, then learns. Leon’s code is a great demonstration of how fast and effective online learning approaches (specifically stochastic gradient descent) can be. VW is about a factor of 3 faster on my desktop, and yields a solution with a lower error rate.

There are several other features such as feature pairing, sparse features, and namespacing that are often handy in practice.
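To make “feature pairing” across namespaces concrete, here is a toy Python sketch of the idea: every feature in one namespace is crossed with every feature in another, and only nonzero features are ever stored. This is an illustration of the concept only, not VW’s implementation; the dict-of-dicts layout and feature names are hypothetical.

```python
# Illustrative sketch (not VW's code): quadratic "feature pairing"
# across two namespaces of a sparse example.

def pair_features(example, ns_a, ns_b):
    """Cross every feature in namespace ns_a with every feature in ns_b."""
    paired = {}
    for fa, va in example[ns_a].items():
        for fb, vb in example[ns_b].items():
            # The paired feature's value is the product of the originals.
            paired[f"{ns_a}^{fa}*{ns_b}^{fb}"] = va * vb
    return paired

# A sparse example: only nonzero features are stored, grouped by namespace.
ex = {"user": {"age_30s": 1.0}, "doc": {"sports": 1.0, "long": 0.5}}
print(pair_features(ex, "user", "doc"))
# -> {'user^age_30s*doc^sports': 1.0, 'user^age_30s*doc^long': 0.5}
```

With sparse storage, the cost of pairing scales with the number of nonzero features per example rather than the full feature space, which is what makes it practical at large scale.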

At present, VW optimizes squared loss via gradient descent or exponentiated gradient descent over a linear representation.
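For readers unfamiliar with these two update rules, here is a minimal Python sketch of each for squared loss on a linear predictor. This is a toy illustration under assumed settings (made-up learning rate and data), not VW’s implementation.

```python
import math

# Toy sketch of the two updates mentioned above, for squared loss on a
# linear model y_hat = w . x. Learning rate and data are illustrative.

def gd_update(w, x, y, eta):
    """Plain (stochastic) gradient descent step for squared loss."""
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [wi - eta * err * xi for wi, xi in zip(w, x)]

def eg_update(w, x, y, eta):
    """Exponentiated gradient step: multiplicative update, then renormalize."""
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    w = [wi * math.exp(-eta * err * xi) for wi, xi in zip(w, x)]
    z = sum(w)  # keep the weights on the probability simplex
    return [wi / z for wi in w]

# A few online passes of gradient descent drive the squared error down.
w = [0.25, 0.25, 0.25, 0.25]
for x, y in [([1, 0, 0, 1], 1.0), ([0, 1, 1, 0], 0.0)] * 50:
    w = gd_update(w, x, y, 0.1)
```

The key property for large data is that each update touches only the weights of the (sparse) features present in the current example, so memory is bounded by the model size rather than the dataset size.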

This code is free to use, incorporate, and modify as per the BSD (revised) license. The project is ongoing inside of Yahoo. We will gladly incorporate significant improvements from other people, and I believe any significant improvements are of substantial research interest.

16 Comments to “Vowpal Wabbit Code Release”
  1. Very Cool. Congrats on the release.

  2. [...] Was doing some machine learning perusing and found a very interesting project that has been just released by the name Vowpal Wabbit. It’s good to see such impressive intellectual work being open. [...]

  3. Ron says:

    Thanks for posting this. I’ve had a lot of fun converting it into Java.

  4. Ron says:

    bug? parse_regressor.cc line 98 and line 111, I think “if (regressor.good())” isn’t needed. My interpretation is that the last weight of the file is being ignored.

  5. Ron says:

    Hmm, never mind on that bit about .good(), I was wrong.

  6. jl says:

    If you do find any bugs, I’m of course quite interested.

    I’m also interested in any comparative timings you have between the java and C++ code. We chose C++ because we thought it was necessary for speed, but some people claim otherwise.

  7. Daniel Lowd says:

    Neato. One other toolkit that seems related is VFML, by Geoff Hulten and Pedro Domingos. It’s a set of online algorithms for learning decision trees, Bayesian networks, and clustering, along with an API for implementing more algorithms.

  8. laowuz says:

    quite cool!

  9. Matt says:

    Thanks for sharing! Are there any higher-level bindings — e.g. for Python?

  10. Benoit says:

    Congrats! I hadn’t seen a classifier with such a good performance/speed ratio in a long time.

    I get big performance differences by changing the --initial_t and --power_t parameters. Could you give a short tutorial on how to choose them?

    Would it be possible to perform structured learning with it? I have played with MIRA lately and I was wondering if the same ideas would apply.

    • jl says:

      The answer is certainly “yes”, but it requires programming. Hal Daume and I have seriously discussed implementing Searn, perhaps providing a factor of 100-1000 speedup over his current implementation. This is particularly compelling, because Searn is already substantially faster than CRF-style structured prediction.

  11. [...] Actually the first public version of the hashing trick John Langford knew of was in the first release of Vowpal Wabbit back in [...]
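The hashing trick mentioned in that pingback can be sketched in a few lines of Python. This is an illustration of the general idea, not VW’s implementation; the table size and Python’s built-in hash() are assumed stand-ins for a real hash function.

```python
# Toy sketch of the hashing trick: feature names are hashed directly into
# indices of a fixed-size weight vector, so no dictionary mapping names to
# indices needs to be kept. Table size and hash() are illustrative choices.

NUM_WEIGHTS = 2 ** 18  # fixed table size; rare collisions are tolerated

def feature_index(name: str) -> int:
    return hash(name) % NUM_WEIGHTS

def to_sparse_vector(features: dict) -> dict:
    """Map {name: value} to {hashed_index: value}, summing on collision."""
    vec = {}
    for name, value in features.items():
        idx = feature_index(name)
        vec[idx] = vec.get(idx, 0.0) + value
    return vec

v = to_sparse_vector({"word:the": 1.0, "word:cat": 1.0})
assert all(0 <= i < NUM_WEIGHTS for i in v)
```

The appeal for large-scale learning is that memory use is fixed in advance by the table size, independent of how many distinct feature names the data stream contains.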

