{"id":339,"date":"2008-07-06T23:26:10","date_gmt":"2008-07-07T05:26:10","guid":{"rendered":"http:\/\/hunch.net\/?p=339"},"modified":"2008-07-06T23:26:10","modified_gmt":"2008-07-07T05:26:10","slug":"to-dual-or-not","status":"publish","type":"post","link":"https:\/\/hunch.net\/?p=339","title":{"rendered":"To Dual or Not"},"content":{"rendered":"<p><a href=\"http:\/\/www.cs.huji.ac.il\/~singer\/\">Yoram<\/a> and <a href=\"http:\/\/ttic.uchicago.edu\/~shai\/\">Shai<\/a>&#8216;s <a href=\"http:\/\/ttic.uchicago.edu\/~shai\/icml08tutorial\/index.html\">online learning tutorial<\/a> at <a href=\"http:\/\/icml2008.cs.helsinki.fi\/\">ICML<\/a> brings up a question for me, &#8220;Why use the <a href=\"http:\/\/en.wikipedia.org\/wiki\/Dual_problem\">dual<\/a>?&#8221;<\/p>\n<p>The basic setting is learning a weight vector <em>w<sub>i<\/sub><\/em> so that the function <em>f(x)= sum<sub>i<\/sub> w<sub>i<\/sub> x<sub>i<\/sub><\/em> optimizes some convex loss function.<\/p>\n<p>The functional view of the dual is that instead of (or in addition to) keeping track of <em>w<sub>i<\/sub><\/em> over the feature space, you keep track of a vector <em>a<sub>j<\/sub><\/em> over the examples and define <em>w<sub>i<\/sub> = sum<sub>j<\/sub> a<sub>j<\/sub> x<sub>ji<\/sub><\/em>.<\/p>\n<p>The above view of duality makes operating in the dual appear unnecessary, because in the end a weight vector is always used.  The tutorial suggests that thinking about the dual gives a unified algorithmic font for deriving online learning algorithms.  I haven&#8217;t worked with the dual representation much myself, but I have seen a few examples where it appears helpful.<\/p>\n<ol>\n<li><strong>Noise<\/strong> When doing online optimization (i.e. 
online learning where you are allowed to look at individual examples multiple times), the dual representation may be helpful in dealing with noisy labels.<\/li>\n<li><strong>Rates<\/strong> One annoyance of working in the primal space with a gradient descent style algorithm is that finding the right learning rate can be a bit tricky.<\/li>\n<li><strong>Kernelization<\/strong> Instead of explicitly computing the weight vector, the representation of the solution can be left in terms of <em>a<sub>i<\/sub><\/em>, which is handy when the weight vector is only implicitly defined.<\/li>\n<\/ol>\n<p>A substantial drawback of dual analysis is that it doesn&#8217;t seem to apply well in nonconvex situations.  There is a reasonable (but not bullet-proof) argument that learning over nonconvex functions is unavoidable in solving some problems.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Yoram and Shai&#8216;s online learning tutorial at ICML brings up a question for me, &#8220;Why use the dual?&#8221; The basic setting is learning a weight vector wi so that the function f(x)= sumi wi xi optimizes some convex loss function. 
The functional view of the dual is that instead of (or in addition to) keeping &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/hunch.net\/?p=339\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;To Dual or Not&#8221;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[29,7],"tags":[],"class_list":["post-339","post","type-post","status-publish","format-standard","hentry","category-machine-learning","category-online"],"_links":{"self":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts\/339","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=339"}],"version-history":[{"count":0,"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts\/339\/revisions"}],"wp:attachment":[{"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=339"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=339"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=339"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}
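The primal/dual correspondence above (*w_i = sum_j a_j x_ji*) can be checked numerically. This is a hypothetical illustration with made-up data, not something from the tutorial: the same prediction is computed once through the explicit weight vector and once purely through the dual variables and inner products with stored examples.

```python
import numpy as np

# Illustration (assumed toy data): the primal keeps w over features;
# the dual keeps a over examples, with w_i = sum_j a_j x_ji.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))   # 5 examples, 3 features; rows are x_j
a = rng.normal(size=5)        # dual variables, one per example

w = X.T @ a                   # w_i = sum_j a_j x_ji

x_new = rng.normal(size=3)
primal_pred = w @ x_new       # f(x) = sum_i w_i x_i
dual_pred = a @ (X @ x_new)   # f(x) = sum_j a_j <x_j, x>

assert np.isclose(primal_pred, dual_pred)
```

Both routes compute the same number, which is why operating in the dual can look redundant: whenever the features are explicit and finite-dimensional, you can always collapse the dual variables back into a weight vector.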
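The kernelization point is where the dual stops being optional. A minimal sketch, assuming an RBF kernel and a toy XOR-style dataset (both my choices for illustration): a kernel perceptron stores only the per-example dual weights *a_j*, because the weight vector lives in the kernel's implicit feature space and is never formed.

```python
import numpy as np

def rbf(u, v, gamma=1.0):
    # RBF kernel: its feature map is infinite-dimensional, so the
    # weight vector exists only implicitly and we never compute it.
    return np.exp(-gamma * np.sum((u - v) ** 2))

def kernel_perceptron(X, y, epochs=10):
    # a[j] counts mistakes on example j; the predictor is kept
    # entirely in the dual: f(x) = sum_j a_j y_j K(x_j, x).
    n = len(X)
    a = np.zeros(n)
    for _ in range(epochs):
        for j in range(n):
            pred = sum(a[k] * y[k] * rbf(X[k], X[j]) for k in range(n))
            if y[j] * pred <= 0:   # mistake: bump this example's dual weight
                a[j] += 1.0
    return a

# Toy XOR-style data: not linearly separable in the primal input space.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([-1., 1., 1., -1.])
a = kernel_perceptron(X, y)
preds = [np.sign(sum(a[k] * y[k] * rbf(X[k], x) for k in range(len(X))))
         for x in X]
# preds matches y: the dual representation separates XOR via the kernel.
```

The design point is that every operation on the predictor (training updates, evaluation) goes through kernel values *K(x_j, x)* between stored examples, which is exactly the "solution kept in terms of the *a_j*" representation described above.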