Machine Learning (Theory)

9/29/2012

Vowpal Wabbit, version 7.0

A new version of VW is out. The primary changes are:

  1. Learning Reductions: I’ve wanted to get learning reductions working and we’ve finally done it. Not everything is implemented yet, but VW now supports direct:
    1. Multiclass Classification –oaa or –ect.
    2. Cost Sensitive Multiclass Classification –csoaa or –wap.
    3. Contextual Bandit Classification –cb.
    4. Sequential Structured Prediction –searn or –dagger

    In addition, it is now easy to build your own custom learning reductions for various plausible uses: feature diddling, custom structured prediction problems, or alternate learning reductions. This effort is far from done, but it is now in a generally useful state. Note that all learning reductions inherit the ability to do cluster parallel learning.

  2. Library interface: VW now has a basic library interface. The library provides most of the functionality of VW, with the limitation that it is monolithic and nonreentrant. These will be improved over time.
  3. Windows port: The priority of a windows port jumped way up once we moved to Microsoft. The only feature which we know doesn’t work at present is automatic backgrounding when in daemon mode.
  4. New update rule: Stephane visited us this summer, and we fixed the default online update rule so that it is unit invariant.

There are also many other small updates including some contributed utilities that aid the process of applying and using VW.

Plans for the near future involve improving the quality of various items above, and of course better documentation: several of the reductions are not yet well documented.

8/24/2012

Patterns for research in machine learning

There are a handful of basic code patterns that I wish I was more aware of when I started research in machine learning. Each on its own may seem pointless, but collectively they go a long way towards making the typical research workflow more efficient. Here they are:

  1. Separate code from data.
  2. Separate input data, working data and output data.
  3. Save everything to disk frequently.
  4. Separate options from parameters.
  5. Do not use global variables.
  6. Record the options used to generate each run of the algorithm.
  7. Make it easy to sweep options.
  8. Make it easy to execute only portions of the code.
  9. Use checkpointing.
  10. Write demos and tests.

Click here for discussion and examples for each item. Also see Charles Sutton’s and HackerNews’ thoughts on the same topic.

My guess is that these patterns will not only be useful for machine learning, but also any other computational work that involves either a) processing large amounts of data, or b) algorithms that take a significant amount of time to execute. Share this list with your students and colleagues. Trust me, they’ll appreciate it.

12/13/2011

Vowpal Wabbit version 6.1 & the NIPS tutorial

I just made version 6.1 of Vowpal Wabbit. Relative to 6.0, there are few new features, but many refinements.

  1. The cluster parallel learning code better supports multiple simultaneous runs, and other forms of parallelism have been mostly removed. This incidentally significantly simplifies the learning core.
  2. The online learning algorithms are more general, with support for l1 (via a truncated gradient variant) and l2 regularization, and a generalized form of variable metric learning.
  3. There is a solid persistent server mode which can train online, as well as serve answers to many simultaneous queries, either in text or binary.

This should be a very good release if you are just getting started, as we’ve made it compile more automatically out of the box, have several new examples and updated documentation.

As per tradition, we’re planning to do a tutorial at NIPS during the break at the parallel learning workshop at 2pm Spanish time Friday. I’ll cover the basics, leaving the fun stuff for others.

  1. Miro will cover the L-BFGS implementation, which he created from scratch. We have found this works quite well amongst batch learning algorithms.
  2. Alekh will cover how to do cluster parallel learning. If you have access to a large cluster, VW is orders of magnitude faster than any other public learning system accomplishing linear prediction. And if you are as impatient as I am, it is a real pleasure when the computers can keep up with you.

This will be recorded, so it will hopefully be available for viewing online before too long.

I hope to see you soon :)

8/15/2011

Vowpal Wabbit 6.0

I just released Vowpal Wabbit 6.0. Since the last version:

  1. VW is now 2-3 orders of magnitude faster at linear learning, primarily thanks to Alekh. Given the baseline, this is loads of fun, allowing us to easily deal with terafeature datasets, and dwarfing the scale of any other open source projects. The core improvement here comes from effective parallelization over kilonode clusters (either Hadoop or not). This code is highly scalable, so it even helps with clusters of size 2 (and doesn’t hurt for clusters of size 1). The core allreduce technique appears widely and easily reused—we’ve already used it to parallelize Conjugate Gradient, LBFGS, and two variants of online learning. We’ll be documenting how to do this more thoroughly, but for now “README_cluster” and associated scripts should provide a good starting point.
  2. The new LBFGS code from Miro seems to commonly dominate the existing conjugate gradient code in time/quality tradeoffs.
  3. The new matrix factorization code from Jake adds a core algorithm.
  4. We finally have basic persistent daemon support, again with Jake’s help.
  5. Adaptive gradient calculations can now be made dimensionally correct, following up on Paul’s post, yielding a better algorithm. And Nikos sped it up further with SSE native inverse square root.
  6. The LDA core is perhaps twice as fast after Paul educated us about SSE and representational gymnastics.

All of the above was done without adding significant new dependencies, so the code should compile easily.

The VW mailing list has been slowly growing, and is a good place to ask questions.

Enjoy.

6/22/2011

Ultra LDA

Shravan and Alex‘s LDA code is released. On a single machine, I’m not sure how it currently compares to the online LDA in VW, but the ability to effectively scale across very many machines is surely interesting.

« Newer PostsOlder Posts »

Powered by WordPress