{"id":468,"date":"2008-11-16T18:54:59","date_gmt":"2008-11-17T00:54:59","guid":{"rendered":"http:\/\/hunch.net\/?p=468"},"modified":"2008-11-26T07:45:20","modified_gmt":"2008-11-26T13:45:20","slug":"observations-on-linearity-for-reductions-to-regression","status":"publish","type":"post","link":"https:\/\/hunch.net\/?p=468","title":{"rendered":"Observations on Linearity for Reductions to Regression"},"content":{"rendered":"<p><a href=\"http:\/\/gosset.wharton.upenn.edu\/~foster\/index.pl\">Dean Foster<\/a> and <a href=\"http:\/\/www.cse.ucsd.edu\/~djhsu\/\">Daniel Hsu<\/a> had a couple observations about reductions to regression that I wanted to share.  This will make the most sense for people familiar with error correcting output codes (see the <a href=\"https:\/\/hunch.net\/~jl\/projects\/reductions\/tutorial\/paper\/chapter.pdf\">tutorial, page 11<\/a>).<\/p>\n<p>Many people are comfortable using linear regression in a one-against-all style, where you try to predict the probability of choice <i>i<\/i> vs other classes, yet they are not comfortable with more complex error correcting codes because they fear that they create harder problems.  This fear turns out to be mathematically incoherent under a linear representation: comfort in the linear case should imply comfort with more complex codes.<\/p>\n<p>In particular, If there exists a set of weight vectors <i>w<sub>i<\/sub><\/i> such that <i>P(i|x)= &lt;w<sub>i<\/sub>,x&gt;<\/i>, then for any invertible error correcting output code <i>C<\/i>, there exists weight vectors <i>w<sub>c<\/sub><\/i> which decode to perfectly predict the probability of each class.  
The proof is simple and constructive: the weight vector <i>w<sub>c<\/sub><\/i> can be constructed as the linear superposition of the <i>w<sub>i<\/sub><\/i> implied by the code, and invertibility ensures that a correct encoding yields a correct decoding.<\/p>\n<p>This observation extends to all-pairs-like codes which compare subsets of choices to subsets of choices using &#8220;don&#8217;t cares&#8221;.<\/p>\n<p>Using this observation, Daniel created a very short proof of the PECOC regret transform theorem (<a href=\"https:\/\/hunch.net\/images\/pecoc.pdf\">here<\/a>, and Daniel&#8217;s <a href=\"http:\/\/www.cse.ucsd.edu\/~djhsu\/notes\/pecoc.pdf\">updated version<\/a>).<\/p>\n<p>One further observation is that under ridge regression (a special case of linear regression), for any code, there exists a setting of parameters such that you might as well use one-against-all instead, because you get the same answer numerically.  The implication is that the advantages of codes more complex than one-against-all are confined to other prediction methods.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Dean Foster and Daniel Hsu had a couple of observations about reductions to regression that I wanted to share. This will make the most sense for people familiar with error correcting output codes (see the tutorial, page 11). 
Many people are comfortable using linear regression in a one-against-all style, where you try to predict the probability &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/hunch.net\/?p=468\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Observations on Linearity for Reductions to Regression&#8221;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[29,12],"tags":[],"class_list":["post-468","post","type-post","status-publish","format-standard","hentry","category-machine-learning","category-reductions"],"_links":{"self":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts\/468","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=468"}],"version-history":[{"count":0,"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts\/468\/revisions"}],"wp:attachment":[{"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=468"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=468"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=468"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}