{"id":482,"date":"2008-12-07T19:46:22","date_gmt":"2008-12-08T01:46:22","guid":{"rendered":"http:\/\/hunch.net\/?p=482"},"modified":"2008-12-07T19:46:22","modified_gmt":"2008-12-08T01:46:22","slug":"a-nips-paper","status":"publish","type":"post","link":"https:\/\/hunch.net\/?p=482","title":{"rendered":"A NIPS paper"},"content":{"rendered":"<p>I&#8217;m skipping NIPS this year in favor of <a href=\"https:\/\/hunch.net\/~ada\">Ada<\/a>, but I wanted to point out <a href=\"http:\/\/www.cs.toronto.edu\/~amnih\/papers\/hlbl_draft.pdf\">this paper<\/a> by <a href=\"http:\/\/www.cs.toronto.edu\/~amnih\/\">Andriy Mnih<\/a> and <a href=\"http:\/\/www.cs.toronto.edu\/~hinton\/\">Geoff Hinton<\/a>.  The basic claim of the paper is that by carefully but automatically constructing a binary tree over words, it&#8217;s possible to predict words well with huge computational resource savings over unstructured approaches.<\/p>\n<p>I&#8217;m interested in this beyond the application to word prediction because it is relevant to the general normalization problem: If you want to predict the probability of one of a large number of events, often you must compute a predicted score for all the events and then normalize, a computationally inefficient operation.  The problem comes up in many places using probabilistic models, but I&#8217;ve run into it with high-dimensional regression.<\/p>\n<p>There are a couple of workarounds for this computational bug:<\/p>\n<ol>\n<li>Approximate. There are many ways.  Often the approximations are uncontrolled (i.e., can be arbitrarily bad), and hence finicky in application.<\/li>\n<li>Avoid.  You don&#8217;t really want a probability; you want the most probable choice, which can be found more directly.  <a href=\"http:\/\/www.cs.nyu.edu\/~yann\/research\/ebm\/\">Energy-based model<\/a> update rules are an example of that approach and there are many other direct methods from supervised learning.  This is great when it applies, but sometimes a probability is actually needed.<\/li>\n<\/ol>\n<p>This paper points out that a third approach can be viable empirically: use a self-normalizing structure.  It seems highly likely that this is true in other applications as well.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I&#8217;m skipping NIPS this year in favor of Ada, but I wanted to point out this paper by Andriy Mnih and Geoff Hinton. The basic claim of the paper is that by carefully but automatically constructing a binary tree over words, it&#8217;s possible to predict words well with huge computational resource savings over unstructured approaches. &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/hunch.net\/?p=482\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;A NIPS paper&#8221;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6,27,29,18],"tags":[],"class_list":["post-482","post","type-post","status-publish","format-standard","hentry","category-bayesian","category-empirical","category-machine-learning","category-papers"],"_links":{"self":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts\/482","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=482"}],"version-history":[{"count":0,"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts\/482\/revisions"}],"wp:attachment":[{"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=482"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=482"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=482"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}