{"id":4710099,"date":"2016-07-26T15:32:54","date_gmt":"2016-07-26T21:32:54","guid":{"rendered":"http:\/\/hunch.net\/?p=4710099"},"modified":"2016-07-26T15:32:54","modified_gmt":"2016-07-26T21:32:54","slug":"icml-2016-was-awesome","status":"publish","type":"post","link":"https:\/\/hunch.net\/?p=4710099","title":{"rendered":"ICML 2016 was awesome"},"content":{"rendered":"<div>I had a fantastic time at ICML 2016\u2014 I learned a great deal. There was far more good stuff than I could see, and it was exciting to catch up on recent advances.<\/div>\n<div><\/div>\n<p><div>David Silver gave one of the best tutorials I\u2019ve seen on his group\u2019s recent work in \u201cdeep\u201d reinforcement learning. I learned about a few new techniques, including the benefits of asychrononous \u00a0updates in distributed Q-learning <a href=\"https:\/\/arxiv.org\/abs\/1602.01783\">https:\/\/arxiv.org\/abs\/1602.01783<\/a>, which was presented in more detail at the main conference. The new domains being explored were exciting, as were the improvements made on the computational side. I would love to seen more pointers to some of the related work from the tutorial, particularly given there was such an exciting mix of new techniques and old staples (e.g. 
experience replay\u00a0<a href=\"http:\/\/www.dtic.mil\/dtic\/tr\/fulltext\/u2\/a261434.pdf\">http:\/\/www.dtic.mil\/dtic\/tr\/fulltext\/u2\/a261434.pdf<\/a>\u00a0), but the talk was so information-packed it would have been difficult.<\/div>\n<div><\/div>\n<\/p>\n<p><div>Pieter Abbeel gave an outstanding talk in the Abstraction in RL workshop <a href=\"http:\/\/rlabstraction2016.wix.com\/icml#!schedule\/bx34m\">http:\/\/rlabstraction2016.wix.com\/icml#!schedule\/bx34m<\/a>, and (I heard) another excellent one during the deep learning workshop.<\/div>\n<div>It was rumored that Aviv Tamar gave an exciting talk (I believe on this <a href=\"http:\/\/arxiv.org\/abs\/1602.02867\">http:\/\/arxiv.org\/abs\/1602.02867<\/a>), but I was forced to miss it to see Rong Ge\u2019s <a href=\"https:\/\/users.cs.duke.edu\/~rongge\/\">https:\/\/users.cs.duke.edu\/~rongge\/<\/a>\u00a0outstanding talk on a new-ish geometric tool for understanding non-convex optimization, the <i>strict saddle<\/i>. I first read about the approach here\u00a0<a href=\"http:\/\/arxiv.org\/abs\/1503.02101\">http:\/\/arxiv.org\/abs\/1503.02101<\/a>, but at ICML he and other authors demonstrated that a remarkable number of problems have this property, which enables efficient optimization via stochastic gradient descent (and other) procedures.<\/div>\n<div><\/div>\n<\/p>\n<div>\n<div>This was a theme of ICML\u2014 an incredible amount of good material, so much that I barely saw the posters at all because there was nearly always a talk I wanted to see!<\/div>\n<div><\/div>\n<p><div>Rocky Duan surveyed some benchmark RL continuous control problems\u00a0<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/duan16.pdf\">http:\/\/jmlr.org\/proceedings\/papers\/v48\/duan16.pdf<\/a>. An interesting theme of the conference\u2014 one that came up in conversation with John Schulman and Yann LeCun\u2014 was really old methods working well. 
In fact, this group demonstrated that variants of the natural\/covariant policy gradient proposed originally by Sham Kakade (with a derivation here:\u00a0<a href=\"http:\/\/repository.cmu.edu\/cgi\/viewcontent.cgi?article=1080&amp;context=robotics\">http:\/\/repository.cmu.edu\/cgi\/viewcontent.cgi?article=1080&amp;context=robotics<\/a>) are largely at the state of the art on many benchmark problems. There are some clever tricks necessary for large policy classes like neural networks (like using a partial least squares-style truncated conjugate gradient to solve for the change in policy in the usual F \delta = \nabla system one solves in the natural gradient procedure) that dramatically improve performance (<a href=\"https:\/\/arxiv.org\/abs\/1502.05477\">https:\/\/arxiv.org\/abs\/1502.05477<\/a>).\u00a0 I had begun to view these methods as doing little better (or worse) than black-box search, so it\u2019s exciting to see them make a comeback.<\/div>\n<div><\/div>\n<\/p>\n<p><div>Chelsea Finn <a href=\"http:\/\/people.eecs.berkeley.edu\/~cbfinn\/\">http:\/\/people.eecs.berkeley.edu\/~cbfinn\/<\/a>\u00a0gave an outstanding talk on this work\u00a0<a href=\"https:\/\/arxiv.org\/abs\/1603.00448\">https:\/\/arxiv.org\/abs\/1603.00448<\/a>. She and co-authors (Sergey Levine and Pieter) effectively came up with a technique that lets one apply Maximum Entropy Inverse Optimal Control without the double-loop procedure, using policy gradient techniques instead.\u00a0 Jonathan Ho described a related algorithm\u00a0<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/ho16.pdf\">http:\/\/jmlr.org\/proceedings\/papers\/v48\/ho16.pdf<\/a>\u00a0that also appeared to mix policy gradient and an optimization over cost functions. 
Both are definitely on my reading list, and I want to understand the trade-offs of the techniques.<\/div>\n<\/p>\n<p><div><\/div>\n<div>Both presentations were informative, and both made the interesting connection to Generative Adversarial Nets (GANs)\u00a0<a href=\"http:\/\/arxiv.org\/abs\/1406.2661\">http:\/\/arxiv.org\/abs\/1406.2661<\/a>. These were also a theme of the conference, both in talks and during discussions. It\u2019s a very cool idea that is getting more traction and being embraced by the neural net pioneers.<\/div>\n<div><\/div>\n<\/p>\n<p><div>David Belanger\u00a0<a href=\"https:\/\/people.cs.umass.edu\/~belanger\/belanger_spen_icml.pdf\">https:\/\/people.cs.umass.edu\/~belanger\/belanger_spen_icml.pdf<\/a>\u00a0gave an interesting talk on using backprop to optimize a structured output relative to a learned cost function. I left thinking the technique was closely related to inverse optimal control methods and the GANs, and wanting to understand why implicit differentiation wasn\u2019t being used to optimize the energy function parameters.<\/div>\n<div><\/div>\n<\/p>\n<p><div>Speaking of neural net pioneers\u2014 there were lots of good talks during both the main conference and workshops on what\u2019s new \u2014 and what\u2019s old <a href=\"https:\/\/sites.google.com\/site\/nnb2tf\/\">https:\/\/sites.google.com\/site\/nnb2tf\/<\/a>&#8212; in neural network architectures and algorithms.<\/div>\n<div><\/div>\n<\/p>\n<p><div>I was intrigued by\u00a0<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/balduzzi16.pdf\">http:\/\/jmlr.org\/proceedings\/papers\/v48\/balduzzi16.pdf<\/a>\u00a0and particularly by the well-written blog post it mentions <a href=\"http:\/\/colah.github.io\/posts\/2015-09-NN-Types-FP\/\">http:\/\/colah.github.io\/posts\/2015-09-NN-Types-FP\/<\/a>\u00a0by Christopher Olah. 
The notion that we need language tools to structure the design of learning programs (e.g.\u00a0<a href=\"http:\/\/www.umiacs.umd.edu\/~hal\/docs\/daume14lts.pdf\">http:\/\/www.umiacs.umd.edu\/~hal\/docs\/daume14lts.pdf<\/a>) and have tools to reason about them seems to be gaining currency. After reading these, I began to view some of the recent work of Wen, Arun, Byron, and myself (including\u00a0<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/sun16.pdf\">http:\/\/jmlr.org\/proceedings\/papers\/v48\/sun16.pdf<\/a>\u00a0at ICML) in this light\u2014 generative RNNs \u201cshould\u201d have a well-defined hidden state whose \u201ctype\u201d is effectively (moments of) future observations. I wonder now if there is a larger lesson here in the design of learning programs.<\/div>\n<div><\/div>\n<\/p>\n<p><div>Nando de Freitas and colleagues\u2019 approach of separating value and advantage function predictions in one network\u00a0<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/wangf16.pdf\">http:\/\/jmlr.org\/proceedings\/papers\/v48\/wangf16.pdf<\/a>\u00a0was quite interesting and had a lot of buzz.<\/div>\n<div><\/div>\n<div>Ian Osband gave an amazing talk on another topic that previously made me despair: exploration in RL\u00a0<a href=\"http:\/\/jmlr.org\/proceedings\/papers\/v48\/osband16.pdf\">http:\/\/jmlr.org\/proceedings\/papers\/v48\/osband16.pdf<\/a>.\u00a0This is one of the few approaches that combines the ability to use function approximation with rigorous exploration guarantees\/sample complexity in the tabular case (and amazingly *better* sample complexity than previous papers that work only in the tabular case). 
Super cool and also very high on my reading list.<\/div>\n<div><\/div>\n<\/p>\n<p><div>Boaz Barak <a href=\"http:\/\/www.boazbarak.org\/Papers\/\">http:\/\/www.boazbarak.org\/<\/a> gave a truly inspired talk that mixed a kind of coherent computationally-bounded Bayesian-ism (Slogan: \u201cCompute like a frequentist, think like a Bayesian.\u201d) with demonstrating a lower bound for Sum-of-Squares (SoS) procedures. Well outside of my expertise, but delivered in a way that made you feel like you understood all of it.<\/div>\n<div><\/div>\n<\/p>\n<p><div>Honglak Lee gave an exciting talk on the benefits of semi-supervision in CNNs <a href=\"http:\/\/web.eecs.umich.edu\/~honglak\/icml2016-CNNdec.pdf\">http:\/\/web.eecs.umich.edu\/~honglak\/icml2016-CNNdec.pdf<\/a>. The authors demonstrated that a remarkable amount of the information needed to reproduce an input image was preserved quite deep in CNNs, and further that encouraging the ability to reconstruct could significantly enhance discriminative performance on real benchmarks.<\/div>\n<div><\/div>\n<\/p>\n<p><div>The problem with this ICML is that I think it would take literally weeks of reading\/watching talks to really absorb the high-quality work that was presented. I\u2019m *very* grateful to the organizing committee\u00a0<a href=\"http:\/\/icml.cc\/2016\/?page_id=39\">http:\/\/icml.cc\/2016\/?page_id=39<\/a>\u00a0for making it so valuable.<\/div>\n<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>I had a fantastic time at ICML 2016\u2014 I learned a great deal. There was far more good stuff than I could see, and it was exciting to catch up on recent advances. David Silver gave one of the best tutorials I\u2019ve seen on his group\u2019s recent work in \u201cdeep\u201d reinforcement learning. 
I learned about &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/hunch.net\/?p=4710099\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;ICML 2016 was awesome&#8221;<\/span><\/a><\/p>\n","protected":false},"author":9,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[33,1,29],"tags":[],"class_list":["post-4710099","post","type-post","status-publish","format-standard","hentry","category-conferences","category-general","category-machine-learning"],"_links":{"self":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts\/4710099","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/users\/9"}],"replies":[{"embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=4710099"}],"version-history":[{"count":0,"href":"https:\/\/hunch.net\/index.php?rest_route=\/wp\/v2\/posts\/4710099\/revisions"}],"wp:attachment":[{"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=4710099"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=4710099"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hunch.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=4710099"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}