{"id":107,"date":"2005-08-04T15:04:03","date_gmt":"2005-08-04T21:04:03","guid":{"rendered":"\/?p=107"},"modified":"2005-08-04T15:09:05","modified_gmt":"2005-08-04T21:09:05","slug":"why-reinforcement-learning-is-important","status":"publish","type":"post","link":"https:\/\/hunch.net\/?p=107","title":{"rendered":"Why Reinforcement Learning is Important"},"content":{"rendered":"<p>One prescription for solving a problem well is:<\/p>\n<ol>\n<li>State the problem in the simplest way possible. In particular, this statement should involve no contamination with or anticipation of the solution.<\/li>\n<li>Think about solutions to the stated problem.<\/li>\n<\/ol>\n<p>Stating a problem in a succinct and crisp manner tends to invite a simple, elegant solution.  When a problem cannot be stated succinctly, we wonder if the problem is even understood. (And when a problem is not understood, we wonder if a solution can be meaningful.)<\/p>\n<p>Reinforcement learning does step (1) well.  It provides a clean, simple language to state general AI problems.  In reinforcement learning there is a set of actions <em>A<\/em>, a set of observations <em>O<\/em>, and a reward <em>r<\/em>.  The reinforcement learning problem, in general, is defined by a conditional measure <em>D( o, r | (o,r,a)<sup>*<\/sup>)<\/em> which produces an observation <em>o<\/em> and a reward <em>r<\/em> given a history <em>(o,r,a)<sup>*<\/sup><\/em>.  The goal in reinforcement learning is to find a policy <em>pi:(o,r,a)<sup>*<\/sup> -> a<\/em> mapping histories to actions so as to maximize (or approximately maximize) the expected sum of observed rewards.<\/p>\n<p>This formulation is capable of capturing almost any (all?) AI problem.  (Are there any other formulations capable of capturing a similar generality?) I don&#8217;t believe we yet have good RL solutions from step (2), but that is unsurprising given the generality of the problem.  
<\/p>\n<p>Note that solving RL in this generality is impossible (for example, it can encode classification).  The two approaches that can be taken are:<\/p>\n<ol>\n<li>Simplify the problem.  It is very common to consider the restricted problem where the history is summarized by the previous observation (aka a &#8220;Markov Decision Process&#8221;).  In many cases, other restrictions are added.<\/li>\n<li>Think about relativized solutions (such as reductions).<\/li>\n<\/ol>\n<p>Both options are under active investigation.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>One prescription for solving a problem well is: State the problem, in the simplest way possible. In particular, this statement should involve no contamination with or anticipation of the solution. Think about solutions to the stated problem. Stating a problem in a succinct and crisp manner tends to invite a simple elegant solution. When a &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/hunch.net\/?p=107\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Why Reinforcement Learning is 
Important&#8221;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[11],"tags":[],"class_list":["post-107","post","type-post","status-publish","format-standard","hentry","category-reinforcement"],"_links":{"self":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts\/107","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=107"}],"version-history":[{"count":0,"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts\/107\/revisions"}],"wp:attachment":[{"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=107"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=107"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=107"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}