{"id":1917,"date":"2011-08-15T21:46:08","date_gmt":"2011-08-16T03:46:08","guid":{"rendered":"http:\/\/hunch.net\/?p=1917"},"modified":"2011-08-15T21:46:08","modified_gmt":"2011-08-16T03:46:08","slug":"vowpal-wabbit-6-0","status":"publish","type":"post","link":"https:\/\/hunch.net\/?p=1917","title":{"rendered":"Vowpal Wabbit 6.0"},"content":{"rendered":"<p>I just released <a href=\"https:\/\/github.com\/JohnLangford\/vowpal_wabbit\/zipball\/6.0\">Vowpal Wabbit 6.0<\/a>.  Since the last version:<\/p>\n<ol>\n<li>VW is now 2-3 orders of magnitude faster at linear learning, primarily thanks to <a href=\"http:\/\/www.cs.berkeley.edu\/~alekh\/\">Alekh<\/a>.  Given the baseline, this is loads of fun, allowing us to easily deal with terafeature datasets, and dwarfing the scale of any other open source projects.  The core improvement here comes from effective parallelization over kilonode clusters (either <a href=\"http:\/\/hadoop.apache.org\/\">Hadoop<\/a> or not).   This code is highly scalable, so it even helps with clusters of size 2 (and doesn&#8217;t hurt for clusters of size 1).  The core allreduce technique appears widely and easily reused&#8212;we&#8217;ve already used it to parallelize Conjugate Gradient, LBFGS, and two variants of online learning.  
We&#8217;ll be documenting how to do this more thoroughly, but for now &#8220;README_cluster&#8221; and associated scripts should provide a good starting point.\n<\/li>\n<li>The new <a href=\"http:\/\/en.wikipedia.org\/wiki\/L-BFGS\">LBFGS<\/a> code from <a href=\"http:\/\/www.cs.cmu.edu\/~mdudik\/\">Miro<\/a> seems to commonly dominate the existing conjugate gradient code in time\/quality tradeoffs.<\/li>\n<li>The new matrix factorization code from <a href=\"http:\/\/research.yahoo.com\/Jake_Hofman\">Jake<\/a> adds a core algorithm.<\/li>\n<li>We finally have basic persistent daemon support, again with Jake&#8217;s help.<\/li>\n<li>Adaptive gradient calculations can now be made dimensionally correct, following up on <a href=\"http:\/\/www.machinedlearnings.com\/2011\/06\/dimensional-analysis-and-gradient.html\">Paul&#8217;s post<\/a>, yielding a better algorithm.  And <a href=\"http:\/\/www.cs.cornell.edu\/~nk\/\">Nikos<\/a> sped it up further with SSE native inverse square root. <\/li>\n<li>The <a href=\"http:\/\/en.wikipedia.org\/wiki\/Latent_Dirichlet_Allocation\">LDA<\/a> core is perhaps twice as fast after Paul educated us <a href=\"http:\/\/www.machinedlearnings.com\/2011\/06\/faster-lda-part-ii.html\">about SSE<\/a> and <a href=\"http:\/\/www.machinedlearnings.com\/2011\/06\/faster-lda-part-ii.html\">representational gymnastics<\/a>.<\/li>\n<\/ol>\n<p>All of the above was done without adding significant new dependencies, so the code should compile easily.<\/p>\n<p>The <a href=\"http:\/\/tech.groups.yahoo.com\/group\/vowpal_wabbit\/\">VW mailing list<\/a> has been slowly growing, and is a good place to ask questions.  <\/p>\n<p>Enjoy.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I just released Vowpal Wabbit 6.0. Since the last version: VW is now 2-3 orders of magnitude faster at linear learning, primarily thanks to Alekh. 
Given the baseline, this is loads of fun, allowing us to easily deal with terafeature datasets, and dwarfing the scale of any other open source project. The core improvement here &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/hunch.net\/?p=1917\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Vowpal Wabbit 6.0&#8221;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4,42,29],"tags":[],"class_list":["post-1917","post","type-post","status-publish","format-standard","hentry","category-announcements","category-code","category-machine-learning"],"_links":{"self":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts\/1917","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1917"}],"version-history":[{"count":0,"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts\/1917\/revisions"}],"wp:attachment":[{"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1917"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1917"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1917"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}