{"id":85,"date":"2005-06-08T09:15:38","date_gmt":"2005-06-08T15:15:38","guid":{"rendered":"\/?p=85"},"modified":"2006-03-22T23:34:41","modified_gmt":"2006-03-23T05:34:41","slug":"question-when-is-the-right-time-to-insert-the-loss-function","status":"publish","type":"post","link":"https:\/\/hunch.net\/?p=85","title":{"rendered":"Question: &#8220;When is the right time to insert the loss function?&#8221;"},"content":{"rendered":"<p>Hal <a href=\"https:\/\/hunch.net\/index.php?p=83#comments\">asks<\/a>  a very good question: &#8220;When is the right time to insert the loss function?&#8221;  In particular, should it be used at testing time or at training time?<\/p>\n<p>When the world imposes a loss on us, the standard Bayesian recipe is to predict the (conditional) probability of each possibility and then choose the possibility which minimizes the expected loss.  In contrast, as the <a href=\"https:\/\/hunch.net\/index.php?p=11\">confusion<\/a> over &#8220;loss = money lost&#8221; or &#8220;loss = the thing you optimize&#8221; might indicate, many people ignore the Bayesian approach and simply optimize their loss (or a close proxy for their loss) over the representation on the training set.<\/p>\n<p>The best answer I can give is &#8220;it&#8217;s unclear, but I prefer optimizing the loss at training time&#8221;.  My experience is that optimizing the loss in the most direct manner possible typically yields best performance.  This question is related to a basic principle which both <a href=\"http:\/\/yann.lecun.com\/\">Yann LeCun<\/a>(applied) and <a href=\"http:\/\/www.clrc.rhul.ac.uk\/people\/vlad\/\">Vladimir Vapnik<\/a>(theoretical) advocate: &#8220;solve the simplest prediction problem that solves the problem&#8221;.  (One difficulty with this principle is that &#8216;simplest&#8217; is difficult to define in a satisfying way.)<\/p>\n<p>One reason why it&#8217;s unclear is that optimizing an arbitrary loss is not an easy thing for a learning algorithm to cope with.  <a href=\"https:\/\/hunch.net\/index.php?cat=12\">Learning reductions<\/a> (which I am a big fan of) give a mechanism for doing this, but they are new and relatively untried.<\/p>\n<p>Drew Bagnell adds: Another approach to integrating loss functions into learning is to try to re-derive ideas about probability theory appropriate for other loss functions. For instance, Peter Grunwald and A.P. Dawid present a variant on <a href=\"http:\/\/arxiv.org\/abs\/math.ST\/0410076\">maximum entropy learning<\/a>.  Unfortunately, it&#8217;s even less clear how often these approaches lead to efficient algorithms. <\/p>\n","protected":false},"excerpt":{"rendered":"<p>Hal asks a very good question: &#8220;When is the right time to insert the loss function?&#8221; In particular, should it be used at testing time or at training time? 
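As I read the linked Grunwald-Dawid paper (a hedged summary, not a claim from the post), the re-derivation works by letting each loss L induce its own "generalized entropy": the least expected loss any act can achieve against a distribution. Log loss recovers Shannon entropy, and maximizing generalized entropy over a constraint set corresponds to a robust, minimax choice of act.

```latex
% Generalized entropy induced by a loss L (log loss recovers Shannon entropy):
H_L(p) \;=\; \inf_{a \in \mathcal{A}} \, \mathbb{E}_{Y \sim p}\!\left[ L(a, Y) \right]
% Maximum generalized entropy = robust Bayes (under regularity conditions):
\sup_{p \in \Gamma} H_L(p)
  \;=\; \sup_{p \in \Gamma} \inf_{a \in \mathcal{A}} \mathbb{E}_{Y \sim p}\!\left[ L(a, Y) \right]
  \;=\; \inf_{a \in \mathcal{A}} \sup_{p \in \Gamma} \mathbb{E}_{Y \sim p}\!\left[ L(a, Y) \right]
```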