<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Machine Learning (Theory)</title>
	<atom:link href="http://hunch.net/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://hunch.net</link>
	<description>Machine learning and learning theory research</description>
	<lastBuildDate>Sun, 24 Jan 2010 18:03:01 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Specializations of the Master Problem</title>
		<link>http://hunch.net/?p=1167</link>
		<comments>http://hunch.net/?p=1167#comments</comments>
		<pubDate>Sun, 24 Jan 2010 18:03:01 +0000</pubDate>
		<dc:creator>jl</dc:creator>
				<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Online]]></category>
		<category><![CDATA[Reductions]]></category>
		<category><![CDATA[Reinforcement]]></category>

		<guid isPermaLink="false">http://hunch.net/?p=1167</guid>
		<description><![CDATA[One thing which is clear on a little reflection is that there exists a single master learning problem capable of encoding essentially all learning problems.  This problem is of course a very general sort of reinforcement learning where the world interacts with an agent as:

The world announces an observation x.
The agent makes a choice [...]]]></description>
			<content:encoded><![CDATA[<p>One thing which is clear on a little reflection is that there exists a single master learning problem capable of encoding essentially all learning problems.  This problem is of course a very general sort of reinforcement learning where the world interacts with an agent as:</p>
<ol>
<li>The world announces an observation <em>x</em>.</li>
<li>The agent makes a choice <em>a</em>.</li>
<li>The world announces a reward <em>r</em>.</li>
</ol>
<p>The goal here is to maximize the sum of the rewards over the time of the agent.  No particular structure relating <em>x</em> to <em>a</em> or <em>a</em> to <em>r</em> is implied by this setting so we do not know effective general algorithms for the agent.  It&#8217;s very easy to prove lower bounds showing that an agent cannot hope to succeed here&#8212;just consider the case where actions are unrelated to rewards.  Nevertheless, there is a real sense in which essentially all forms of life are agents operating in this setting, somehow succeeding.  The gap between these observations drives research&#8212;How can we find tractable specializations of the master problem general enough to provide an effective solution in real problems?</p>
<p>The process of specializing is a tricky business, as you want to simultaneously achieve tractable analysis, sufficient generality to be useful, and yet capture a new aspect of the master problem not otherwise addressed.  Consider: How is it even possible to choose a setting where analysis is tractable before you even try to analyze it? What follows is my mental map of different specializations.</p>
<h3>Online Learning</h3>
<p>The online learning setting is perhaps the most satisfying specialization more general than standard batch learning at present, because it turns out to additionally provide tractable algorithms for many batch learning settings.</p>
<p>Standard online learning models specialize in two ways: You assume that the choice of action in step 2 does not influence future observations and rewards, and you assume additional information is available in step 3, a retrospectively available reward for each action.  The algorithm for an agent in this setting typically has a given name&#8212;gradient descent, weighted majority, Winnow, etc&#8230;</p>
<p>The general algorithm here is a more refined version of follow-the-leader than in batch learning, with online update rules.  An awesome discovery about this setting is that it&#8217;s possible to compete with a set of predictors even when the world is totally adversarial, substantially strengthening our understanding of what learning is and where it might be useful.  For this adversarial setting, the algorithm alters into a form of follow-the-perturbed leader, where the learning algorithm randomizes it&#8217;s action amongst the set of plausible alternatives in order to defeat an adversary.  </p>
<p>The standard form of argument in this setting is a potential argument, where at each step you show that if the learning algorithm performs badly, there is some finite budget from which an adversary deducts it&#8217;s ability.   The form of the final theorem is that you compete with the accumulated reward of a set any one-step policies <em>h:X &#8211; > A</em>, with a dependence log(#policies) or weaker in regret, a measure of failure to compete.  </p>
<p>A good basic paper to read here is:<br />
Nick Littlestone and <a href="http://users.soe.ucsc.edu/~manfred/">Manfred Warmuth</a>, <a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.37.1595&#038;rep=rep1&#038;type=pdf">The Weighted Majority Algorithm</a>, which shows the basic information-theoretic claim clearly.  <a href="http://www.vovk.net/">Vovk</a>&#8217;s page on <a href="http://www.vovk.net/aa/index.html">aggregating algorithms</a> is also relevant, although somewhat harder to read.</p>
<p>Provably computationally tractable special cases all have linear structure, either on <a href="http://people.cs.uchicago.edu/~kalai/papers/onlineopt/onlineopt.pdf">rewards</a> or <a href="http://webdocs.cs.ualberta.ca/~maz/publications/ICML03.pdf">policies</a>.  Good results are often observed empirically by applying <a href="http://www.cs.toronto.edu/~hinton/absps/naturebp.pdf">backpropagation for nonlinear architectures</a>, with the danger of local minima understood.</p>
<h3>Bandit Analysis</h3>
<p>In the bandit setting, step 1 is omitted, and the difficulty of the problem is weakened by assuming that action in step (2) don&#8217;t alter future rewards.  The goal is generally to compete with all constant arm strategies.  </p>
<p>Analysis in this basic setting started very specialized with <a href="http://en.wikipedia.org/wiki/Gittins_index">Gittin&#8217;s Indicies</a> and gradually generalized over time to include IID and fully adversarial settings, with <a href="http://www.cs.princeton.edu/~schapire/uncompress-papers.cgi/AuerCeFrSc01.ps">EXP3</a> a canonical algorithm.  If there are <em>k</em> strategies available, the standard theorem states that you can compete with the set of all constant strategies up to regret <em>k</em>.  The most impressive theoretical discovery in this setting is that the dependence on <em>T</em>, the number of timesteps, is not substantially worse than supervised learning despite the need to explore.</p>
<p>Given the dependence on <em>k</em> all of these algorithms are computationally tractable.  </p>
<p>However, the setting is flawed, because the set of constant strategies is inevitably too weak in practice&#8212;it&#8217;s an example of optimal decision making given that you ignore almost all information.  Adding back the observation in step 1 allows competing with a large set of policies, while the regret grows only as log(#policies) or weaker.  Canonical algorithms here are <a href="http://www.cs.princeton.edu/~schapire/uncompress-papers.cgi/AuerCeFrSc01.ps">EXP4</a> (computationally intractable, but information theoretically near-optimal), <a href="http://hunch.net/~jl/projects/interactive/sidebandits/bandit.pdf">Epoch-Greedy</a> (computationally tractable given an oracle optimizer), and the <a href="http://arxiv.org/abs/0812.4044">Offset Tree</a> providing a reduction to supervised binary classification.</p>
<h3>MDP analysis</h3>
<p>A substantial fraction of reinforcement learning has specialized on the <a href="http://en.wikipedia.org/wiki/Markov_Decision_Process">Markov Decision Process</a> setting, where the observation <em>x</em> is a state <em>s</em>, which is a sufficient statistic for predicting all future observations.   Compared to the previous settings, dealing with time dependence is explicitly required, but learning typically exists in only primitive forms.  </p>
<p>The first work here was in the 1950&#8217;s where the actual MDP was assumed known and the problem was simply computing a good policy, typically via dynamic programming style solutions.  More recently, principally in the 1990&#8217;s, the setting where the MDP was not assumed known was analyzed.  A very substantial theoretical advancement was the <a href="http://www.cis.upenn.edu/~mkearns/papers/reinforcement.pdf"><em>E<sup>3</sup></em> algorithm</a> which requires only <em>O(S<sup>2</sup>A)</em> experience to learn a near-optimal policy where the world is an MDP with <em>S</em> state and <em>A</em> actions per state.  A further improvement on this is <a href="http://hunch.net/~jl/projects/RL/Delayed_Q/icml06.ps">Delayed Q-Learning</a>, where only <em>O(SA)</em> experience is required.  There are many variants on the model-based approach and not much for the model-free approach.  <a href="http://www.research.rutgers.edu/~lihong/">Lihong Li</a>&#8217;s <a href="http://www.research.rutgers.edu/~lihong/pub/Li09Unifying.pdf">thesis</a> probably has the best detailed discussion at present.</p>
<p>There are some unsatisfactory elements of the analysis here.  First, I&#8217;ve suppressed the dependence on the definition of &#8220;approximate&#8221; and the typical time horizon, for which the dependence is often bad and the optimality is unclear.  The second is the dependence on <em>S</em>, which is intuitively unremovable, with this observation formalized in the lower bound <a href="http://stat.wharton.upenn.edu/~skakade/">Sham</a> and I worked on (section 8.6 of <a href="http://stat.wharton.upenn.edu/~skakade/papers/thesis/sham_thesis.pdf">Sham&#8217;s thesis</a>).  Empirically, these and related algorithms are often finicky, because in practice the observation isn&#8217;t a sufficient statistic and the number of states isn&#8217;t small, so approximating things as such is often troublesome.</p>
<p>A very different variant of this setting is given by <a href="http://en.wikipedia.org/wiki/Control_theory">Control theory</a>, which I know less about than I should.  The canonical setting for control theory is with a known MDP having linear transition dynamics.  More exciting are the <a href="http://en.wikipedia.org/wiki/System_identification">system identification</a> problems where the system must be first identified.  I don&#8217;t know any good relatively assumption free results for this setting.</p>
<h3>Oracle Advice Shortcuts</h3>
<p>Techniques here specialize the setting to situations in which some form of oracle advice is available when a policy is being learned.  A good example of this is an oracle which provides samples from the distribution of observations visited by a good policy.  Using this oracle, <a href="http://hunch.net/~jl/projects/RL/aoarl/Final.pdf">conservative policy iteration</a> is guaranteed to perform well, so long as a base learning algorithm can predict well.  This algorithm was refined and improved a bit by <a href="http://www.ri.cmu.edu/pub_files/pub4/bagnell_james_2003_1/bagnell_james_2003_1.pdf">PSDP</a>, which works via dynamic programming, improving guarantees to work with regret rather than errors.</p>
<p>An alternative form of oracle is provide by access to a good policy at training time.  In this setting, <a href="http://hunch.net/~jl/projects/reductions/searn/searn.pdf">Searn</a> has similar provable guarantees with a similar analysis.</p>
<p>The oracle based algorithms appear to work well anywhere these oracles are available.</p>
<h3>Uncontrolled Delay</h3>
<p>In the uncontrolled delay setting, step (2) is removed, and typically steps (1) and (3) are collapsed into one observation, where the goal becomes state tracking.  Most of the algorithms for state tracking are heavily model dependent, implying good success within particular domains.  Examples include <a href="http://en.wikipedia.org/wiki/Kalman_Filter">Kalman filters</a>, <a href="http://en.wikipedia.org/wiki/Hidden_markov_model">hidden markov models</a>, and <a href="http://en.wikipedia.org/wiki/Particle_filter">particle filters</a> which typical operate according to an explicit probabilistic model of world dynamics.</p>
<p>Relatively little is known for a nonparametric version of this problem.  One observation is that the process of predicting adjacent observations well forms states as a byproduct when the observations are sufficiently rich as <a href="http://arxiv.org/abs/0905.3369">detailed here</a>.  </p>
<p>A basic question is: What&#8217;s missing from the above?  A good answer is worth a career.</p>
]]></content:encoded>
			<wfw:commentRss>http://hunch.net/?feed=rss2&amp;p=1167</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Deadline Season, 2010</title>
		<link>http://hunch.net/?p=1182</link>
		<comments>http://hunch.net/?p=1182#comments</comments>
		<pubDate>Tue, 19 Jan 2010 23:37:04 +0000</pubDate>
		<dc:creator>jl</dc:creator>
				<category><![CDATA[Machine Learning]]></category>

		<guid isPermaLink="false">http://hunch.net/?p=1182</guid>
		<description><![CDATA[Many conference deadlines are coming soon.



Deadline
Double Blind / Author Feedback
Time/Place


ICML
January 18((workshops) / February 1 (Papers) / February 13 (Tutorials)
Y/Y
Haifa, Israel, June 21-25


KDD
February 1(Workshops) / February 2&#038;5 (Papers) / February 26 (Tutorials &#038; Panels)) / April 17 (Demos)
N/S
Washington DC, July 25-28


COLT
January 18 (Workshops) / February 19 (Papers)
N/S
Haifa, Israel, June 25-29


UAI
March 11 (Papers)
N?/Y
Catalina Island, California, July 8-11


ICML [...]]]></description>
			<content:encoded><![CDATA[<p>Many conference deadlines are coming soon.</p>
<table border=1>
<tr>
<td></td>
<td>Deadline</td>
<td>Double Blind / Author Feedback</td>
<td>Time/Place</td>
</tr>
<tr>
<td><a href="http://www.icml2010.org/">ICML</a></td>
<td>January 18((workshops) / February 1 (Papers) / February 13 (Tutorials)</td>
<td>Y/Y</td>
<td>Haifa, Israel, June 21-25</td>
</tr>
<tr>
<td><a href="http://www.kdd.org/kdd2010/cfp_research.shtml">KDD</a></td>
<td>February 1(Workshops) / February 2&#038;5 (Papers) / February 26 (Tutorials &#038; Panels)) / April 17 (Demos)</td>
<td>N/S</td>
<td>Washington DC, July 25-28</td>
</tr>
<tr>
<td><a href="http://www.colt2010.org/">COLT</a></td>
<td>January 18 (Workshops) / February 19 (Papers)</td>
<td>N/S</td>
<td>Haifa, Israel, June 25-29</td>
</tr>
<tr>
<td><a href="http://event.cwi.nl/uai2010/cfp.html">UAI</a></td>
<td>March 11 (Papers)</td>
<td>N?/Y</td>
<td>Catalina Island, California, July 8-11</td>
</tr>
</table>
<p>ICML continues to experiment with the reviewing process, although perhaps less so than last year.  </p>
<p>The S &#8220;sort-of&#8221; for COLT is because author feedback occurs only after decisions are made.  </p>
<p>KDD is notable for being the most comprehensive in terms of {Tutorials, Workshops, Challenges, Panels, Papers (two tracks), Demos}.  The S for KDD is because there is sometimes author feedback at the decision of the SPC.</p>
<p>The (past) January 18 deadline for workshops at ICML is nominal, as I (as workshop chair) almost missed it myself and we have space for a few more workshops.  If anyone is thinking &#8220;oops, I missed the deadline&#8221;, send in your proposal by Friday the 22nd.</p>
<p>This year, I&#8217;m an area chair for ICML and on the SPC for KDD.  I hope to see interesting papers on plausibly useful learning theory (broadly interpreted) at each conference, as I did last year.</p>
]]></content:encoded>
			<wfw:commentRss>http://hunch.net/?feed=rss2&amp;p=1182</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Sam Roweis died</title>
		<link>http://hunch.net/?p=1172</link>
		<comments>http://hunch.net/?p=1172#comments</comments>
		<pubDate>Thu, 14 Jan 2010 01:02:42 +0000</pubDate>
		<dc:creator>jl</dc:creator>
				<category><![CDATA[Announcements]]></category>
		<category><![CDATA[Machine Learning]]></category>

		<guid isPermaLink="false">http://hunch.net/?p=1172</guid>
		<description><![CDATA[and I can&#8217;t help but remember him.
I first met Sam as an undergraduate at Caltech where he was TA for Hopfield&#8217;s class, and again when I visited Gatsby, when he invited me to visit Toronto, and at too many conferences to recount.  His personality was a combination of enthusiastic and thoughtful, with a great [...]]]></description>
			<content:encoded><![CDATA[<p>and I can&#8217;t help but remember him.</p>
<p>I first met <a href="http://cs.nyu.edu/~roweis/">Sam</a> as an undergraduate at <a href="http://www.caltech.edu/">Caltech</a> where he was TA for <a href="http://en.wikipedia.org/wiki/John_Joseph_Hopfield">Hopfield</a>&#8217;s class, and again when I visited <a href="http://www.gatsby.ucl.ac.uk/">Gatsby</a>, when he invited me to visit <a href="http://web.cs.toronto.edu/Page4.aspx">Toronto</a>, and at too many conferences to recount.  His personality was a combination of enthusiastic and thoughtful, with a great ability to phrase a problem so it&#8217;s solution must be understood.  With respect to my own work, Sam was the one who advised me to make <a href="http://hunch.net/~jl/projects/prediction_bounds/tutorial/langford05a.ps">my first tutorial</a>, leading to others, and to other things, all of which I&#8217;m grateful to him for.  In fact, my every interaction with Sam was positive, and that was his way.</p>
<p>His death is <a href="http://nyulocal.com/on-campus/2010/01/13/nyu-computer-science-professor-sam-roweis-jumps-to-death-in-washington-square-village/">being called a suicide</a> which is so incompatible with my understanding of Sam that it strains my credibility.  But we know that his many responsibilities were great, and it is well understood that basically all sane researchers have legions of inner doubts.  Having been depressed now and then myself, it&#8217;s helpful to understand at least intellectually that the true darkness of the now is overestimated, and that you have more friends than you think.  Sam was one of mine, and I&#8217;ll miss him.</p>
<p>My last interaction with Sam, last week, was discussing a new research direction that interested him, optimizing the cost of acquiring feature information in the learning algorithm.  This problem is endemic to real-world applications, and has been studied to some extent elsewhere, but I expect that in our unwritten future  history, we&#8217;ll discover that further study of this problem is more helpful than almost anyone realizes.  The reply that I owed him feels heavy, and an incompleteness is hanging.  For his wife and children it is surely so incomparably greater that I lack words.</p>
<p>(Added) Others: <a href="http://earningmyturns.blogspot.com/2010/01/sam-roweis.html">Fernando</a>, <a href="http://www.sigcrap.org/2010/01/13/sad-news-about-sam-roweis/">Kevin McCurley</a>, <a href="http://blog.smellthedata.com/2010/01/sam.html">Danny Tarlow</a>, <a href="http://hoggresearch.blogspot.com/2010/01/sam-roweis.html">David Hogg</a>, <a href="http://yyue.blogspot.com/2010/01/sam-roweis-1972-2010.html">Yisong Yue</a>, <a href="http://blog.computationalcomplexity.org/2010/01/sam-roweis-1972-2010.html">Lance Fortnow</a> on Sam, a <a href="http://samroweis1972-2010.blogspot.com/">Memorial site</a>, and a <a href="http://nips.cc/SamRoweis">Memorial Fund</a></p>
]]></content:encoded>
			<wfw:commentRss>http://hunch.net/?feed=rss2&amp;p=1172</wfw:commentRss>
		<slash:comments>38</slash:comments>
		</item>
		<item>
		<title>Interesting things at NIPS 2010</title>
		<link>http://hunch.net/?p=1152</link>
		<comments>http://hunch.net/?p=1152#comments</comments>
		<pubDate>Sun, 27 Dec 2009 21:40:19 +0000</pubDate>
		<dc:creator>jl</dc:creator>
				<category><![CDATA[Machine Learning]]></category>

		<guid isPermaLink="false">http://hunch.net/?p=1152</guid>
		<description><![CDATA[Several papers at NIPS caught my attention.

Elad Hazan and Satyen Kale, Online Submodular Optimization They define an algorithm for online optimization of submodular functions with regret guarantees.  This places submodular optimization roughly on par with online convex optimization as tractable settings for online learning.  
Elad Hazan and Satyen Kale On Stochastic and Worst-Case [...]]]></description>
			<content:encoded><![CDATA[<p>Several papers at NIPS caught my attention.</p>
<ol>
<li><a href="http://www.cs.princeton.edu/~ehazan/">Elad Hazan</a> and <a href="http://www.cs.princeton.edu/~satyen/">Satyen Kale</a>, <a href="http://books.nips.cc/papers/files/nips22/NIPS2009_0482.pdf">Online Submodular Optimization</a> They define an algorithm for online optimization of submodular functions with regret guarantees.  This places submodular optimization roughly on par with online convex optimization as tractable settings for online learning.  </li>
<li><a href="http://www.cs.princeton.edu/~ehazan/">Elad Hazan</a> and <a href="http://www.cs.princeton.edu/~satyen/">Satyen Kale</a> <a href="http://books.nips.cc/papers/files/nips22/NIPS2009_0470.pdf">On Stochastic and Worst-Case Models of Investing</a>.  At it&#8217;s core, this is yet another example of modifying worst-case online learning to deal with variance, but the application to financial models is particularly cool and it seems plausibly superior other common approaches for financial modeling.</li>
<li><a href="http://www.ri.cmu.edu/person.html?person_id=1584">Mark Palatucci</a>, <a href="http://www.ri.cmu.edu/person.html?person_id=241">Dean Pomerlau</a>, <a href="http://www.cs.cmu.edu/~tom/">Tom Mitchell</a>, and <a href="http://www.cs.toronto.edu/~hinton/">Geoff Hinton</a> <a href="http://books.nips.cc/papers/files/nips22/NIPS2009_0395.pdf">Zero Shot Learning with Semantic Output Codes</a> The goal here is predicting a label in a multiclass supervised setting where the label never occurs in the training data.  They have some basic analysis and also a nice application to FMRI brain reading.</li>
<li><a href="http://www.cs.cmu.edu/~shobha/">Shobha Venkataraman</a>, <a href="http://www.cs.cmu.edu/~avrim/">Avrim Blum</a>, <a href="http://www.eecs.berkeley.edu/~dawnsong/">Dawn Song</a>, <a href="http://www2.research.att.com/~sen/">Subhabrata Sen</a>, and <a href="http://spatscheck.com/oliver/">Oliver Spatscheck</a>, <a href="http://books.nips.cc/papers/files/nips22/NIPS2009_0810.pdf">Tracking Dynamic Sources of Malicious Activity at Internet Scales</a>.  This is a plausible combination of worst-case learning algorithms in a tree-like structure over IP space to track and predict bad IPs.  Their empirical results look quite good to me and there are many applications where this prediction problem needs to be solved.</li>
<li><a href="http://talk.ucsd.edu/kamalika/">Kamalika Chaudhuri</a>, <a href="http://cseweb.ucsd.edu/~djhsu/">Daniel Hsu</a>, and <a href="http://cseweb.ucsd.edu/~yfreund/">Yoav Freund</a>, <a href="http://books.nips.cc/papers/files/nips22/NIPS2009_1110.pdf">A Parameter Free Hedging Algorithm</a>  This paper is about eliminating the learning rate parameter from online learning algorithms.  While that&#8217;s certainly useful, the approach taken involves a double-exponential rather than a single exponential potential, which is strange and potentially useful in many other places.</li>
<li><a href="http://paul.rutgers.edu/~bbai/">Bing Bai</a>, <a href="http://www.kyb.tuebingen.mpg.de/bs/people/weston/">Jason Weston</a>, <a href="http://david.grangier.info/">David Grangier</a>, <a href="http://ronan.collobert.com/">Ronan Collobert</a>, <a href="http://www.nec-labs.com/research/machine/ml_website/main/person.php?person=kunihiko">Kunihiko Sadamasa</a>, <a href="http://home.yanjunqi.net/">Yanjun Qi</a>, <a href="http://research.google.com/pubs/author121.html">Corinna Cortes</a>, <a href="http://cs.nyu.edu/~mohri/"></a><a href="http://books.nips.cc/papers/files/nips22/NIPS2009_0881.pdf">Polynomial Semantic Indexing</a>  This is about an empirically improved algorithm for learning ranking functions based on (query,document) content.  The sexy Semantic name is justified because it is not based on syntactic matching of query to document.</li>
</ol>
<p>I also found the <a href="http://nips.cc/Conferences/2009/PublicationModelsDebate">future publication models</a> discussion interesting.  The <a href="http://hunch.net/?p=1086">follow-up post here</a> has details and further discussion.</p>
<p>At the workshops, I was deeply confronted with the problem of too many interesting workshops to attend in the given amount of time.  Two talks stood out for me:</p>
<ol>
<li><a href="http://www.cs.cmu.edu/~guestrin/">Carlos Guestrin</a> gave a talk in the <a href="http://research.microsoft.com/~sumitb/adaiml09">interactive machine learning workshop</a> on <a href="http://www.cs.cmu.edu/afs/cs/Web/People/kbe/tdn_kdd09.pdf">Turning Down the Noise in the Blogosphere</a> by <a href="http://www.cs.cmu.edu/afs/cs/Web/People/kbe/">Khalid El-Arini</a>, <a href="http://www.cs.cmu.edu/~gveda/">Gaurav Veda</a>, <a href="http://people.cs.cmu.edu/person/68732.html">Dafna Shahaf</a>, and <a href="http://www.cs.cmu.edu/~guestrin/">Carlos Guestrin</a>  which I missed at <a href="http://www.sigkdd.org/kdd2009/">KDD</a> this year.  The paper discusses the use exponential weight online learning algorithms to rerank blog posts based on user-specific interests.  It comes with a <a href="http://www.turndownthenoise.org/">demonstration website</a> where you can test it out.</li>
<li><a href="http://people.seas.harvard.edu/~valiant/">Leslie Valiant</a> gave a talk on representations and operations on concepts in a brain-like fashion.  The style of representation and algorithm involves distributed representations on sparse graphs, an approach which is relatively unfamiliar.  <a href="http://en.wikipedia.org/wiki/Bloom_filter">Bloom filters</a> and in machine learning experience with <a href="http://hunch.net/~jl/projects/hash_reps/index.html">learning through hashing functions</a> has sharpened my intuition a bit.  The talk seemed to cover <a href="http://people.seas.harvard.edu/~valiant/memassoc-eccc.pdf">Memorization and Association on a Realistic Neural Model</a> at <a href="http://www.mitpressjournals.org/loi/neco?cookieSet=1">Neural Computation</a> as well as <a href="http://people.seas.harvard.edu/~valiant/KR08MichaelValiant.pdf">A First Experimental Demonstration of Massive Knowledge Infusion</a> at <a href="http://www.cse.unsw.edu.au/~kr2008/">KR</a>.</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://hunch.net/?feed=rss2&amp;p=1152</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Top graduates this season</title>
		<link>http://hunch.net/?p=1131</link>
		<comments>http://hunch.net/?p=1131#comments</comments>
		<pubDate>Thu, 24 Dec 2009 22:55:26 +0000</pubDate>
		<dc:creator>jl</dc:creator>
				<category><![CDATA[Graduates]]></category>
		<category><![CDATA[Machine Learning]]></category>

		<guid isPermaLink="false">http://hunch.net/?p=1131</guid>
		<description><![CDATA[I would like to point out 3 graduates this season as having my confidence they are capable of doing great things. 

Daniel Hsu has diverse papers with diverse coauthors on {active learning, mulitlabeling, temporal learning, &#8230;} each covering new algorithms and methods of analysis.  He is also a capable programmer, having helped me with [...]]]></description>
			<content:encoded><![CDATA[<p>I would like to point out 3 graduates this season as having my confidence they are capable of doing great things. </p>
<ol>
<li><a href="http://cseweb.ucsd.edu/~djhsu/">Daniel Hsu</a> has diverse papers with diverse coauthors on {active learning, mulitlabeling, temporal learning, &#8230;} each covering new algorithms and methods of analysis.  He is also a capable programmer, having helped me with some nitty-gritty details of cluster parallel <a href="http://hunch.net/~vw">Vowpal Wabbit</a> this summer.  He has an excellent tendency to just get things done.</li>
<li><a href="http://ai.stanford.edu/~nlambert/">Nicolas Lambert</a> doesn&#8217;t nominally work in machine learning, but I&#8217;ve found his work in <a href="http://ai.stanford.edu/~nlambert/papers/EC2008-elicitability.pdf">elicitation</a> relevant nevertheless.  In essence, elicitable properties are closely related to learnable properties, and the elicitation complexity is related to a notion of learning complexity.  See the <a href="http://mark.reid.name/files/pubs/icml09.pdf">Surrogate regret bounds paper</a> for some related discussion.  Few people successfully work at such a general level that it crosses fields, but he&#8217;s one of them.</li>
<li><a href="http://www.yisongyue.com/">Yisong Yue</a> is deeply focused on interactive learning, which he has attacked at all levels: theory, algorithm adaptation, programming, and <a href="http://www.scientificblogging.com/stated_degree_confidence/blog/selfimproving_systems_learn_through_human_interaction">popular description</a>.  I&#8217;ve seen a relentless multidimensional focus on a new real-world problem be an excellent strategy for research and expect he&#8217;ll succeed.</li>
</ol>
<p>The obvious caveat applies&#8212;I don&#8217;t know or haven&#8217;t fully appreciated everyone&#8217;s work so I&#8217;m sure I missed people.  I&#8217;d like to particularly point out <a href="http://www.eecs.berkeley.edu/~pliang/">Percy Liang</a> and <a href="http://people.csail.mit.edu/dsontag/">David Sontag</a> as plausibly such whom I&#8217;m sure others appreciate a great deal.</p>
]]></content:encoded>
			<wfw:commentRss>http://hunch.net/?feed=rss2&amp;p=1131</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Inherent Uncertainty</title>
		<link>http://hunch.net/?p=1108</link>
		<comments>http://hunch.net/?p=1108#comments</comments>
		<pubDate>Wed, 09 Dec 2009 18:01:16 +0000</pubDate>
		<dc:creator>jl</dc:creator>
				<category><![CDATA[Announcements]]></category>
		<category><![CDATA[Machine Learning]]></category>

		<guid isPermaLink="false">http://hunch.net/?p=1108</guid>
		<description><![CDATA[I&#8217;d like to point out Inherent Uncertainty, which I&#8217;ve added to the ML blog post scanner on the right.   My understanding from Jake is that the intention is to have a multiauthor blog which is more specialized towards learning theory/game theory than this one.  Nevertheless, several of the posts seem to be [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;d like to point out <a href="http://www.inherentuncertainty.org/">Inherent Uncertainty</a>, which I&#8217;ve added to the ML blog post scanner on the right.   My understanding from <a href="http://www.cs.berkeley.edu/~jake/">Jake</a> is that the intention is to have a multiauthor blog which is more specialized towards learning theory/game theory than this one.  Nevertheless, several of the posts seem to be of wider interest.</p>
]]></content:encoded>
			<wfw:commentRss>http://hunch.net/?feed=rss2&amp;p=1108</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Future Publication Models @ NIPS</title>
		<link>http://hunch.net/?p=1086</link>
		<comments>http://hunch.net/?p=1086#comments</comments>
		<pubDate>Wed, 09 Dec 2009 17:49:23 +0000</pubDate>
		<dc:creator>jl</dc:creator>
				<category><![CDATA[Conferences]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Reviewing
]]></category>

		<guid isPermaLink="false">http://hunch.net/?p=1086</guid>
		<description><![CDATA[Yesterday, there was a discussion about future publication models at NIPS.  Yann and Zoubin have specific detailed proposals which I&#8217;ll add links to when I get them (Yann&#8217;s proposal and Zoubin&#8217;s proposal).
What struck me about the discussion is that there are many simultaneous concerns as well as many simultaneous proposals, which makes it difficult [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday, there was a discussion about <a href="http://nips.cc/Conferences/2009/PublicationModelsDebate">future publication models at NIPS</a>.  <a href="http://yann.lecun.com/">Yann</a> and <a href="http://learning.eng.cam.ac.uk/zoubin/">Zoubin</a> have specific detailed proposals which I&#8217;ll add links to when I get them (<a href="http://yann.lecun.com/ex/pamphlets/publishing-models.html">Yann&#8217;s proposal</a> and <a href="http://hunch.net/?page_id=1115">Zoubin&#8217;s proposal</a>).</p>
<p>What struck me about the discussion is that there are many simultaneous concerns as well as many simultaneous proposals, which makes it difficult to keep all the distinctions straight in a verbal conversation.  It also seemed like people were serious enough about this that we may see some real movement.  Certainly, my personal experience motivates that as I&#8217;ve <a href="http://hunch.net/?cat=21">posted many times</a> about the substantial flaws in our review process, including some very poor personal experiences.</p>
<p>Concerns include the following:</p>
<ol>
<li>(Several) Reviewers are overloaded, boosting the noise in decision making.</li>
<li>(<a href="http://yann.lecun.com/">Yann</a>) A new system should run with as little built-in delay and friction to the process of research as possible.</li>
<li>(<a href="http://www.cs.umass.edu/~wallach/">Hanna Wallach</a>(updated)) Double-blind review is particularly important for people who are unknown or from an unknown institution.</li>
<li>(Several) But, it&#8217;s bad to take double blind so seriously as to disallow publishing on <a href="arxiv.org">arxiv</a> or personal webpages.</li>
<li>(<a href="http://yann.lecun.com/">Yann</a>) And double-blind is bad when it prevents publishing for substantial periods of time.  Apparently, this comes up in <a href="http://www.cvpr2009.org/">CVPR</a>.</li>
<li>(<a href="http://learning.eng.cam.ac.uk/zoubin/">Zoubin</a>) Any new system should appear to outsiders as if it&#8217;s the old system, or a journal, because it&#8217;s already hard enough to justify CS tenure cases to other disciplines.</li>
<li>(<a href="http://www.cis.upenn.edu/~pereira/">Fernando</a>) There shouldn&#8217;t be a big change with a complex bureaucracy, but rather a smaller changes which are obviously useful or at least worth experimenting with.</li>
</ol>
<p>There were other concerns as well, but these are the ones that I remember.</p>
<p>Elements of proposals include:</p>
<ol>
<li>(<a href="http://yann.lecun.com/">Yann</a>) Everything should go to Arxiv or an arxiv-like system first, as per physics or mathematics.  This addresses (1), because it delinks dissemination from review, relieving some of the burden of reviewing.  It also addresses (2) since with good authors they can immediately begin building on each other&#8217;s work.  It conflicts with (3), because Arxiv does not support double-blind submission.  It does not conflict if we build our own system.</li>
<li>(<a href="http://www.cis.upenn.edu/~pereira/">Fernando</a>) Create a conference coincident journal in which people can publish at any time. <a href="http://www.vldb.org/">VLDB</a> has apparently done this.  It can be done smoothly by allowing submission in either conference deadline mode or journal mode.  This proposal addresses (1) by reducing peak demand on reviewing.  It also addresses (6) above.</li>
<li>(<a href="http://ai.stanford.edu/~koller/">Daphne</a>)  Perhaps we should have a system which <em>only</em> reviews papers for correctness, which is not nearly as subjective as for novelty or interestingness.  This addresses (1), by eliminating some concerns for the reviewer.  It is orthogonal to the double blind debate.  In biology, <a href="PLoS ONE">such a journal exists</a> (pointer updated), because delays were becoming absurd and intolerable.</li>
<li>(<a href="http://yann.lecun.com/">Yann</a>) There should be multiple publishing entities (people or groups of people) that can bless a paper as interesting.  This addresses (1).</li>
</ol>
<p>There are many other proposal elements (too many for my memory), which hopefully we&#8217;ll see in particular proposals.  If other people have concrete proposals, now is probably the right time to formalize them.</p>
]]></content:encoded>
			<wfw:commentRss>http://hunch.net/?feed=rss2&amp;p=1086</wfw:commentRss>
		<slash:comments>33</slash:comments>
		</item>
		<item>
		<title>Vowpal Wabbit version 4.0, and a NIPS heresy</title>
		<link>http://hunch.net/?p=1068</link>
		<comments>http://hunch.net/?p=1068#comments</comments>
		<pubDate>Mon, 07 Dec 2009 18:42:35 +0000</pubDate>
		<dc:creator>jl</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Online]]></category>

		<guid isPermaLink="false">http://hunch.net/?p=1068</guid>
		<description><![CDATA[I&#8217;m releasing version 4.0(tarball) of Vowpal Wabbit.  The biggest change (by far) in this release is experimental support for cluster parallelism, with notable help from Daniel Hsu.  
I also took advantage of the major version number to introduce some incompatible changes, including switching to murmurhash 2, and other alterations to cachefiles.  You&#8217;ll [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m releasing <a href="http://github.com/JohnLangford/vowpal_wabbit/tree/4.0">version 4.0</a>(<a href="http://hunch.net/~vw/vowpal_wabbit_v4.0.tar.gz">tarball</a>) of <a href="http://hunch.net/~vw">Vowpal Wabbit</a>.  The biggest change (by far) in this release is experimental support for cluster parallelism, with notable help from <a href="http://cseweb.ucsd.edu/~djhsu/">Daniel Hsu</a>.  </p>
<p>I also took advantage of the major version number to introduce some incompatible changes, including switching to <a href="http://murmurhash.googlepages.com/">murmurhash 2</a>, and other alterations to cachefiles.  You&#8217;ll need to delete and regenerate them.  In addition, the precise specification for a &#8220;tag&#8221; (i.e. string that can be used to identify an example) changed&#8212;you can&#8217;t have a space between the tag and the &#8216;|&#8217; at the beginning of the feature namespace.  </p>
<p>And, of course, we made it faster.</p>
<p>For the future, I put up my <a href="http://hunch.net/~vw/todo.html">todo list</a> outlining the major future improvements I want to see in the code.  I&#8217;m planning to discuss the current mechanism and results of the cluster parallel implementation at the <a href="http://www.select.cs.cmu.edu/meetings/biglearn09/">large scale machine learning workshop</a> at <a href="http://nips.cc/">NIPS</a> later this week.  Several people have asked me to do a tutorial/walkthrough of VW, which is arranged for friday 2pm in the workshop room&#8212;no skiing for me Friday.  Come join us if this heresy interests you as well <img src='http://hunch.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://hunch.net/?feed=rss2&amp;p=1068</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>AI Safety</title>
		<link>http://hunch.net/?p=1053</link>
		<comments>http://hunch.net/?p=1053#comments</comments>
		<pubDate>Sun, 29 Nov 2009 21:21:26 +0000</pubDate>
		<dc:creator>jl</dc:creator>
				<category><![CDATA[AI]]></category>

		<guid isPermaLink="false">http://hunch.net/?p=1053</guid>
		<description><![CDATA[Dan Reeves introduced me to Michael Vassar who ran the Singularity Summit and educated me a bit on the subject of AI safety which the Singularity Institute has small grants for.  
I still believe that interstellar space travel is necessary for long term civilization survival, and the AI is necessary for interstellar space travel. [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://messymatters.com/">Dan Reeves</a> introduced me to <a href="http://singinst.org/aboutus/team">Michael Vassar</a> who ran the <a href="http://www.singularitysummit.com/">Singularity Summit</a> and educated me a bit on the subject of AI safety which the <a href="http://singinst.org/">Singularity Institute</a> has <a href="http://singinst.org/researchgrants/summary">small grants for</a>.  </p>
<p>I still believe that <a href="http://hunch.net/?p=324">interstellar space travel is necessary for long term civilization survival, and the AI is necessary for interstellar space travel</a>.  On these grounds alone, we could judge that developing AI is much more safe than not.  Nevertheless, there is a basic reasonable fear, as expressed by some commenters, that AI could go bad.</p>
<p>A basic scenario starts with someone inventing an AI and telling it to make as much money as possible.  The AI promptly starts trading in various markets to make money.  To improve, it crafts a virus that takes over most of the world&#8217;s computers using it as a surveillance network so that it can always make the right decision.  The AI also branches out into any form of distance work, taking over the entire outsourcing process for all jobs that are entirely digital.  To further improve, the AI invests a bit into robotics, creating automated manufacturing systems that produce all kinds of goods.  Robot cars and construction teams complete the process, so that any human with money can order anything cheaply and quickly, but no jobs remain for humans.  </p>
<p>At this point, the AI is stuck&#8212;it can eventually extract all the money from the economic system, and that&#8217;s all there is.  But of course, it isn&#8217;t really stuck.  It simply funds appropriate political campaigns so that in some country a measure passes granting the AI the right to make money, which it promptly does, mushrooming it&#8217;s wealth from trillions to the maximum number representable in all computers simultaneously.  To remove this obstacle, the AI promptly starts making more computers on a worldwide scale until all available power sources are used up.  To add more power, the AI starts a space program with beamed power.  Unfortunately, it finds the pesky atmosphere an obstacle to space travel, so it chemically binds the atmosphere in the crust of the earth allowing many <a href="http://en.wikipedia.org/wiki/Coilgun">Gauss Guns</a> to efficiently project material into space where <a href="http://en.wikipedia.org/wiki/Solar_sail">solar sails</a> are used for orbital positioning.  This process continues, slowed perhaps by the need to cool the Earth&#8217;s core, until the earth and other viable rocky bodies in the solar system are discorporated into a <a href="http://en.wikipedia.org/wiki/Dyson_Sphere">Dyson sphere</a>.  Then, the AI goes interstellar with the same program.</p>
<p>Somewhere in this process, certainly by the time the atmosphere is chemically bound, all life on earth (except the AI if you count it) is extinct.  Furthermore, the AI while intelligent by many measures doesn&#8217;t seem to be accomplishing anything interesting.  </p>
<p>One element of understanding AI safety seems to be understanding what an AI could do.  Many people seem to ascribe arbitrary powers to any sort of superintelligence, making any constraints imposed on them ineffective.  I don&#8217;t believe that&#8217;s the right approach&#8212;we should think of an AI as simply having much more ability to research, control, and manipulate large systems, all within the constraints of known physics.</p>
<p>Efforts to create safe AI go back to <a href="http://en.wikipedia.org/wiki/Isaac_Asimov">Asimov</a>&#8217;s <a href="http://en.wikipedia.org/wiki/Three_laws_of_robotics">Three Laws of Robotics</a>, which appears limited by the inability to encompass robotic warfare.  The general problem is related to the wish problem: How do you specify a wish in a manner so that it can&#8217;t be misinterpreted?  A cheap trick here is to add &#8220;&#8230; in a manner that I would consider acceptable&#8221; to the end of the wish.  Applied to AI, this approach also has limits because any limit imposed by a person can and eventually will be removed by a person given sufficient opportunity.  </p>
<p>Perhaps a complementary approach is shown by the <a href="http://en.wikipedia.org/wiki/Risk_(game)">game RISK</a>, where it appears to be virtually impossible for one player to win if all other players play defensively (i.e. build up armies and only attack in response to a provoking attack).  Applied to AI, the idea would be that we make many AIs programmed to behave well either via laws or wish tricks, with an additional element of aggressively enforcing this behavior in other AIs.  Then, if any AI is corrupted, the other AIs, with substantially more aggregate resources, will discover and deal with the problem.</p>
<p>Certain elements are necessary for this approach to work.  There must be multiple AIs, and (more importantly) the resources any one controls must be a small compared to all, an extreme form of antimonopoly.  Furthermore, the default must be that AIs are programmed to not harm or cause harm to humans, enforcing that behavior in other AIs.  Getting the programming right is the hard part, and I&#8217;m not clear on how viable this is, or how difficult it is compared to simply creating an AI, which of course I haven&#8217;t managed.</p>
]]></content:encoded>
			<wfw:commentRss>http://hunch.net/?feed=rss2&amp;p=1053</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>ICML 2010 Workshops (and Tutorials)</title>
		<link>http://hunch.net/?p=1046</link>
		<comments>http://hunch.net/?p=1046#comments</comments>
		<pubDate>Mon, 23 Nov 2009 22:35:49 +0000</pubDate>
		<dc:creator>jl</dc:creator>
				<category><![CDATA[Workshop]]></category>

		<guid isPermaLink="false">http://hunch.net/?p=1046</guid>
		<description><![CDATA[I&#8217;m the workshops chair for ICML this year.  As such, I would like to personally encourage people to consider running a workshop.
My general view of workshops is that they are excellent as opportunities to discuss and develop research directions&#8212;some of my best work has come from collaborations at workshops and several workshops have substantially [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m the <a href="http://www.icml2010.org/workshops.html">workshops chair</a> for <a href="http://www.icml2010.org/">ICML</a> this year.  As such, I would like to personally encourage people to consider running a workshop.</p>
<p>My general view of workshops is that they are excellent as opportunities to discuss and develop research directions&#8212;some of my best work has come from collaborations at workshops and several workshops have substantially altered my thinking about various problems.  My experience running workshops is that setting them up and making them fly often appears much harder than it actually is, and the workshops often come off much better than expected in the end.  Submissions are due January 18, two weeks before papers.</p>
<p>Similarly, <a href="http://www.seas.upenn.edu/~taskar/">Ben Taskar</a> is looking for good <a href="http://www.icml2010.org/tutorials.html">tutorials</a>, which is complementary.  Workshops are about exploring a subject, while a tutorial is about distilling it down into an easily taught essence, a vital part of the research process.  Tutorials are due February 13, two weeks after papers.</p>
]]></content:encoded>
			<wfw:commentRss>http://hunch.net/?feed=rss2&amp;p=1046</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
