{"id":5937325,"date":"2017-01-04T14:24:08","date_gmt":"2017-01-04T20:24:08","guid":{"rendered":"http:\/\/hunch.net\/?p=5937325"},"modified":"2017-01-13T18:14:29","modified_gmt":"2017-01-14T00:14:29","slug":"ewrl-and-nips-2016","status":"publish","type":"post","link":"https:\/\/hunch.net\/?p=5937325","title":{"rendered":"EWRL and NIPS 2016"},"content":{"rendered":"<p>I went to the <a href=\"https:\/\/ewrl.wordpress.com\/ewrl13-2016\/\">European Workshop on Reinforcement Learning<\/a> and <a href=\"https:\/\/nips.cc\/Conferences\/2016\">NIPS<\/a> last month and saw several interesting things.  <\/p>\n<p>At EWRL, I particularly liked the talks from: <\/p>\n<ol>\n<li><a href=\"https:\/\/ewrl.files.wordpress.com\/2016\/12\/munos.pdf\">Remi Munos<\/a> on off-policy evaluation<\/li>\n<li><a href=\"https:\/\/ewrl.files.wordpress.com\/2017\/01\/ghavamzadeh.pdf\">Mohammad Ghavamzadeh<\/a> on learning safe policies<\/li>\n<li>Emma Brunskill on optimizing biased-but-safe estimators (sense a theme?)<\/li>\n<li><a href=\"https:\/\/ewrl.files.wordpress.com\/2016\/12\/levine.pdf\">Sergey Levine<\/a> on low sample complexity applications of RL in robotics.<\/li>\n<\/ol>\n<p>My talk is <a href=\"https:\/\/ewrl.files.wordpress.com\/2016\/12\/langford.pdf\">here<\/a>.  Overall, this was a well-organized workshop with diverse and interesting subjects, with the only caveat being that they had to limit registration \ud83d\ude42<\/p>\n<p>At NIPS itself, I found the poster sessions fairly interesting. 
<\/p>\n<ol>\n<li>Allen-Zhu and Hazan had a new <a href=\"https:\/\/arxiv.org\/abs\/1603.05642\">notion of a reduction<\/a> (<a href=\"https:\/\/www.youtube.com\/watch?v=Nmlp7Pk4uog&#038;feature=youtu.be\">video<\/a>).<\/li>\n<li>Zhao, Poupart, and Gordon had a new <a href=\"https:\/\/arxiv.org\/abs\/1601.00318\">way to learn Sum-Product Networks<\/a>.<\/li>\n<li>Ho, Littman, MacGlashan, Cushman, and Austerweil had a paper on how <a href=\"https:\/\/papers.nips.cc\/paper\/6413-showing-versus-doing-teaching-by-demonstration.pdf\">&#8220;Showing&#8221; is different from &#8220;Doing&#8221;<\/a>.<\/li>\n<li>Toulis and Parkes had a paper on <a href=\"http:\/\/papers.nips.cc\/paper\/6059-long-term-causal-effects-via-behavioral-game-theory.pdf\">estimation of long-term causal effects<\/a>.<\/li>\n<li>Rae, Hunt, Danihelka, Harley, Senior, Wayne, Graves, and Lillicrap had a paper on <a href=\"https:\/\/arxiv.org\/abs\/1610.09027\">large memories with neural networks<\/a>.\n<\/li>\n<li>Hardt, Price, and Srebro had a paper on <a href=\"https:\/\/arxiv.org\/pdf\/1610.02413.pdf\">Equal Opportunity in ML<\/a>.<\/li>\n<\/ol>\n<p>Format-wise, I thought having 2 sessions was better than 1, but I really would have preferred more.  The recorded spotlights are also pretty cool.<\/p>\n<p>The NIPS workshops were great, although I was somewhat reminded of <a href=\"http:\/\/www.nbparks.org\/Sports-Leagues\/photos\/NewSoccer_Kindergarten_9_09.jpg\">kindergarten soccer<\/a> in terms of lopsided attendance. 
This may be inevitable given how hot the field is, but I think it&#8217;s important for individual researchers to remember that:<\/p>\n<ol>\n<li>There are many important directions of research.<\/li>\n<li>You personally have a much higher chance of doing something interesting if everyone else is not doing it also.<\/li>\n<\/ol>\n<p>During the workshops, I learned about <a href=\"https:\/\/arxiv.org\/abs\/1412.6980\">ADAM<\/a> (a momentum form of Adagrad), <a href=\"https:\/\/www.eecs.tufts.edu\/~dsculley\/papers\/ml_test_score.pdf\">testing ML systems<\/a>, and that even <a href=\"https:\/\/en.wikipedia.org\/wiki\/TensorFlow\">TensorFlow<\/a> is finally looking into synchronous updates for parallel learning (<a href=\"https:\/\/hunch.net\/?p=151364\">allreduce<\/a> is the way).<\/p>\n<p>(edit: added one)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I went to the European Workshop on Reinforcement Learning and NIPS last month and saw several interesting things. At EWRL, I particularly liked the talks from: Remi Munos on off-policy evaluation Mohammad Ghavamzadeh on learning safe policies Emma Brunskill on optimizing biased-but-safe estimators (sense a theme?) 
Sergey Levine on low sample complexity applications of &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/hunch.net\/?p=5937325\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;EWRL and NIPS 2016&#8221;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[33,29,11],"tags":[],"class_list":["post-5937325","post","type-post","status-publish","format-standard","hentry","category-conferences","category-machine-learning","category-reinforcement"],"_links":{"self":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts\/5937325","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5937325"}],"version-history":[{"count":0,"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts\/5937325\/revisions"}],"wp:attachment":[{"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5937325"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5937325"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5937325"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}