<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Machine Learning (Theory)</title>
	<atom:link href="http://hunch.net/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://hunch.net</link>
	<description>Machine learning and learning theory research</description>
	<lastBuildDate>Sat, 27 Jun 2009 02:32:43 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Netflix nearly done</title>
		<link>http://hunch.net/?p=827</link>
		<comments>http://hunch.net/?p=827#comments</comments>
		<pubDate>Sat, 27 Jun 2009 02:32:43 +0000</pubDate>
		<dc:creator>jl</dc:creator>
				<category><![CDATA[Competitions]]></category>
		<category><![CDATA[Empirical]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Supervised]]></category>

		<guid isPermaLink="false">http://hunch.net/?p=827</guid>
		<description><![CDATA[A $1M qualifying result was achieved on the public Netflix test set by a 3-way ensemble team.  This is just in time for Yehuda&#8217;s presentation at KDD, which I&#8217;m sure will be one of the best attended ever.  
This isn&#8217;t quite over&#8212;there are a few days for another super-conglomerate team to come together [...]]]></description>
			<content:encoded><![CDATA[<p>A $1M qualifying result was achieved on the <a href="http://www.netflixprize.com//leaderboard">public Netflix test set</a> by a <a href="http://www.research.att.com/~volinsky/netflix/bpc.html">3-way ensemble team</a>.  This is just in time for <a href="http://research.yahoo.com/Yehuda_Koren">Yehuda</a>&#8217;s presentation at <a href="http://www.sigkdd.org/kdd2009/">KDD</a>, which I&#8217;m sure will be one of the best attended ever.  </p>
<p>This isn&#8217;t quite over&#8212;there are a few days for another super-conglomerate team to come together and there is some small chance that the performance is nonrepresentative of the final test set, but I expect not.  </p>
<p>Regardless of the final outcome, the biggest lesson for ML from the Netflix contest has been the formidable performance edge of ensemble methods.</p>
]]></content:encoded>
			<wfw:commentRss>http://hunch.net/?feed=rss2&amp;p=827</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Interesting papers at UAICMOLT 2009</title>
		<link>http://hunch.net/?p=813</link>
		<comments>http://hunch.net/?p=813#comments</comments>
		<pubDate>Wed, 24 Jun 2009 08:36:42 +0000</pubDate>
		<dc:creator>jl</dc:creator>
				<category><![CDATA[Conferences]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Papers]]></category>

		<guid isPermaLink="false">http://hunch.net/?p=813</guid>
		<description><![CDATA[Here&#8217;s a list of papers that I found interesting at ICML/COLT/UAI in 2009.

Elad Hazan and Comandur Seshadhri Efficient learning algorithms for changing environments at ICML.  This paper shows how to adapt learning algorithms that compete with fixed predictors to compete with changing policies.  The definition of regret they deal with seems particularly useful [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s a list of papers that I found interesting at <a href="http://www.cs.mcgill.ca/~icml2009/">ICML</a>/<a href="http://www.cs.mcgill.ca/~colt2009">COLT</a>/<a href="http://www.cs.mcgill.ca/~uai2009">UAI</a> in 2009.</p>
<ol>
<li><a href="http://www.cs.princeton.edu/~ehazan/">Elad Hazan</a> and <a href="http://www.cs.princeton.edu/~csesha/">Comandur Seshadhri</a> <a href="http://www.conflate.net/icml/paper/2009/75">Efficient learning algorithms for changing environments</a> at ICML.  This paper shows how to adapt learning algorithms that compete with fixed predictors to compete with changing policies.  The definition of regret they deal with seems particularly useful in many situation.</li>
<li><a href="http://www.cs.utah.edu/~hal/">Hal Daume</a>, <a href="http://conflate.net/icml/paper/2009/297">Unsupervised Search-based Structured Prediction</a> at ICML.  This paper shows a technique for reducing unsupervised learning to supervised learning which (a) make a fast unsupervised learning algorithm and (b)  makes semisupervised learning both easy and highly effective. </li>
<li>There were two papers with similar results on active learning in the KWIK framework for linear regression, both reducing the sample complexity to .  One was <a href="http://homes.dsi.unimi.it/~cesabian/">Nicolo Cesa-Bianchi</a>, <a href="http://www.dicom.uninsubria.it/~cgentile/">Claudio Gentile</a>, and <a href="http://www.idiap.ch/~forabona/">Francesco Orabona</a> <a href="http://conflate.net/icml/paper/2009/289">Robust Bounds for Classification via Selective Sampling</a> at ICML and the other was <a href="http://paul.rutgers.edu/~thomaswa/">Thomas Walsh</a>, <a href="http://szityu.web.eotvos.elte.hu/">Istvan Szita</a>, <a href="http://paul.rutgers.edu/~cdiuk/">Carlos Diuk</a>, <a href="http://www.cs.duke.edu/~mlittman/">Michael Littman</a> <a href="http://www.cs.mcgill.ca/~uai2009/papers/UAI2009_0137_63d573cf0e7d454d9e3e3eb713845209.pdf">Exploring compact reinforcement-learning representations with linear regression</a> at UAI.  The UAI paper covers application to RL as well.</li>
<li><a href="http://www.stat.cornell.edu/~li/">Ping Li</a>, <a href="http://www.cs.mcgill.ca/~uai2009/papers/UAI2009_0130_3c7f00baafc699511c26697b0e87ecff.pdf">Improving Compressed Counting</a> at UAI.  This paper talks about how to keep track of the moments in a datastream with very little space and computation.  I&#8217;m not sure I have a use for it yet, but it seems like a cool piece of basic technology.</li>
<li><a href="http://mark.reid.name/">Mark Reid</a> and <a href="http://axiom.anu.edu.au/~williams/">Bob Williamson</a> <a href="http://conflate.net/icml/paper/2009/400">Surrogate Regret Bounds for Proper Losses</a> at ICML.  This paper points out that via the integral characterization of proper losses, proper scoring rules can be reduced to binary classification.  The results unify and generalize the <a href="http://hunch.net/~jl/projects/reductions/probing/aistat_real_final/aistat_final.ps">Probing</a> and <a href="http://hunch.net/~jl/projects/reductions/median/median_uai.ps">Quanting</a> reductions we worked on previously.  This paper is also related to <a href="http://ai.stanford.edu/~nlambert/">Nicolas Lambert</a>&#8217;s <a href="http://ai.stanford.edu/~nlambert/papers/EC2008-elicitability.pdf">work</a>, which is quite thought provoking in terms of specifying what is learnable.</li>
<li><a href="http://cseweb.ucsd.edu/~djhsu/">Daniel Hsu</a>, <a href="http://ttic.uchicago.edu/~sham/">Sham M. Kakade</a> and <a href="http://stat.rutgers.edu/~tzhang/">Tong Zhang</a>.  <a href="http://www.cs.mcgill.ca/~colt2009/papers/011.pdf#page=1">A Spectral Algorithm for Learning Hidden Markov Models</a> COLT.  This paper shows that a subset of HMMs can be learned using an SVD-based algorithm.</li>
<li><a href="http://cseweb.ucsd.edu/~skpotufe/">Samory Kpotufe</a>, <a href="http://www.cs.mcgill.ca/~colt2009/papers/019.pdf#page=1">Escaping the curse of dimensionality with a tree-based regressor</a> at COLT.  This paper shows how to directly applying regression in high dimensional vector spaces and have it succeed anyways because the data is naturally low-dimensional.</li>
<li><a href="http://www.cs.uwaterloo.ca/~shai/">Shai Ben-David</a>, <a href="http://david.palenica.com/">David Pal</a> and <a href="http://ttic.uchicago.edu/~shai/">Shai Shalev-Shwartz</a>. <a href="http://www.cs.mcgill.ca/~colt2009/papers/032.pdf#page=1">Agnostic Online Learning</a> at COLT.  This paper characterizes the ability to learn when an adversary is choosing features in the online setting as the &#8220;Littlestone dimension&#8221;.</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://hunch.net/?feed=rss2&amp;p=813</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>In Active Learning, the question changes</title>
		<link>http://hunch.net/?p=805</link>
		<comments>http://hunch.net/?p=805#comments</comments>
		<pubDate>Mon, 15 Jun 2009 16:54:27 +0000</pubDate>
		<dc:creator>jl</dc:creator>
				<category><![CDATA[Active]]></category>
		<category><![CDATA[Questions]]></category>

		<guid isPermaLink="false">http://hunch.net/?p=805</guid>
		<description><![CDATA[A little over 4 years ago, Sanjoy made a post saying roughly &#8220;we should  study active learning theoretically, because not much is understood&#8221;.   
At the time, we did not understand basic things such as whether or not it was possible to PAC-learn with an active algorithm without making strong assumptions about the [...]]]></description>
			<content:encoded><![CDATA[<p>A little over 4 years ago, <a href="http://www.cs.ucsd.edu/~dasgupta/">Sanjoy</a> <a href="http://hunch.net/?p=49">made a post</a> saying roughly &#8220;we should  study active learning theoretically, because not much is understood&#8221;.   </p>
<p>At the time, we did not understand basic things such as whether or not it was possible to PAC-learn with an active algorithm without making strong assumptions about the noise rate.  In other words, the fundamental question was &#8220;can we do it?&#8221;</p>
<p>The nature of the question has fundamentally changed in my mind.   The answer is to the previous question is &#8220;yes&#8221;, both information theoretically and computationally, most places where supervised learning could be applied.  </p>
<p>In many situation, the question has now changed to: &#8220;is it worth it?&#8221;  Is the programming and computational overhead low enough to make the label cost savings of active learning worthwhile?  Currently, there are situations where this question could go either way.  Much of the challenge for the future is in figuring out how to make active learning easier or more worthwhile.</p>
<p>At the <a href="http://hunch.net/~active_learning">active learning tutorial</a>, I stated a set of somewhat more precise research questions that I don&#8217;t yet have answer to, and which I believe are worth answering.  Here is a bit of an expansion on those questions for those interested.</p>
<ol>
<li>Is active learning possible in a fully adversarial setting?  By fully adversarial, I mean when an adversary controls all the algorithms observations.  <a href="http://www.conflate.net/icml/paper/2009/289">Some work</a> by <a href="http://www.dicom.uninsubria.it/~cgentile/">Claudio</a> and <a href="http://homes.dsi.unimi.it/~cesabian/">Nicolo</a> has moved in this direction, but there is not yet a solid answer.</li>
<li>Is there an efficient and effective reduction of active learning to supervised learning?  The <a href="http://arxiv.org/abs/0812.4952">bootstrap IWAL</a> approach is efficient but not effective in some situations where other approaches can succeed.  The  algorithm <a href="http://www.cs.ucsd.edu/~dasgupta/papers/cal.pdf">here</a> is a reduction to a special kind of supervised learning where you can specify both examples and constraints.  For many supervised learning algorithms, adding constraints seems problematic.</li>
<li>Can active learning succeed with alternate labeling oracles?  The ones I see people trying to use in practice often differ because they can provide answers of varying specificity and cost, or because some oracles are good for some questions, but not good for others.</li>
<li>At this point, there have been several successful applications of active learning, but that&#8217;s not the same thing as succeeding with more robust algorithms.  Can we succeed empirically with more robust algorithms?  And is the empirical cost of additional robustness worth the empirical peace-of-mind that your learning algorithm won&#8217;t go astray where other more aggressive approaches may do so?</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://hunch.net/?feed=rss2&amp;p=805</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Functionally defined Nonlinear Dynamic Models</title>
		<link>http://hunch.net/?p=777</link>
		<comments>http://hunch.net/?p=777#comments</comments>
		<pubDate>Wed, 03 Jun 2009 10:31:57 +0000</pubDate>
		<dc:creator>jl</dc:creator>
				<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Reinforcement]]></category>
		<category><![CDATA[Supervised]]></category>

		<guid isPermaLink="false">http://hunch.net/?p=777</guid>
		<description><![CDATA[Suppose we have a set of observations over time x1,x2,&#8230;,xt and want to predict some future event yt+1.  An inevitable problem arises, because learning a predictor h(x1,&#8230;,xt) of yt+1 is generically intractable due to the size of the input.  To make this problem tractable, what&#8217;s necessary is a method for summarizing the relevant [...]]]></description>
			<content:encoded><![CDATA[<p>Suppose we have a set of observations over time <em>x<sub>1</sub>,x<sub>2</sub>,&#8230;,x<sub>t</sub></em> and want to predict some future event <em>y<sub>t+1</sub></em>.  An inevitable problem arises, because learning a predictor <em>h(x<sub>1</sub>,&#8230;,x<sub>t</sub>)</em> of <em>y<sub>t+1</sub></em> is generically intractable due to the size of the input.  To make this problem tractable, what&#8217;s necessary is a method for summarizing the relevant information in past observations for the purpose of prediction in the future.  In other words, state is required.</p>
<p>Existing approaches for deriving state have some limitations.</p>
<ol>
<li><a href="http://en.wikipedia.org/wiki/Hidden_Markov_model">Hidden Markov models</a> learned with EM suffer from local minima, use tabular learning approaches which provide dubious generalization ability, and often require substantial a.priori specification of the observations.</li>
<li><a href="http://en.wikipedia.org/wiki/Kalman_filter">Kalman Filters</a> and <a href="http://en.wikipedia.org/wiki/Particle_filter">Particle Filters</a> are very parametric in the sense that substantial information must be specified up front.</li>
<li>Dynamic Bayesian Networks (<a href="http://en.wikipedia.org/wiki/Graphical_model">graphical models</a> through time) require substantial a.priori specification and often require the solution of difficult computational problems to use.  Some of these difficulties are <a href="http://www.cis.upenn.edu/~mkearns/papers/dbnhard.pdf">representational rather than computational</a>. </li>
<li>The <a href="http://citeseer.ist.psu.edu/old/641900.html">Subspace-ID</a> approach from control theory uses a linear representation, with the basic claim that it works well when all transformations are linear, but not so well when things are nonlinear. (Thanks to <a href="http://www.ri.cmu.edu/person.html?person_id=689">Drew</a> for pointing it out.)  In making this post, I ran across <a href="http://a2c2.org/conferences/acc2009/MT2_larimore_syllabus_ACC09.pdf">this two day tutorial</a> which discusses extensions of this idea to nonlinear systems.  Unfortunately, I&#8217;ll miss the tutorial, and I haven&#8217;t found the related paper.</li>
</ol>
<p>The point of <a href="http://arxiv.org/abs/0905.3369">this paper</a> at <a href="http://www.cs.mcgill.ca/~icml2009/">ICML</a> is that some dynamic systems (those which are &#8220;invertible&#8221;), can be decomposed into separate bounded resource prediction problems which, when solved, create an implicit definition of state.  This allows us to use any general purpose supervised learning algorithm to solve the <a href="http://hunch.net/?p=180">state formation problem</a> without requiring linearity or any specific representation.  When writing papers you don&#8217;t generally gush too hard, but it&#8217;s fair to say that I&#8217;m excited by this approach.</p>
<ol>
<li>It&#8217;s not a known dead end.</li>
<li>It doesn&#8217;t require lots of prior specification &#038; information when you have lots of data.</li>
<li>It leverages the huge amount of work that has gone into supervised learning algorithm design.</li>
<li>It works in controlled systems also, where the control is simply another observation.</li>
<li>It works with generalization from the start, rather than requiring the (often awkward) addition of generalization later.</li>
<li>It doesn&#8217;t require predicting everything in order to predict what you want.</li>
<li>It can work with very large observation spaces, and can even work better the larger the observation space, because larger observations imply more invertibility.</li>
</ol>
<p>I expect some people reading this paper will be disappointed that it doesn&#8217;t solve all problems.  That&#8217;s good news for anyone interested in research.  For those who aren&#8217;t note that this is (in some sense) a generalization of subspace ID, and hence that there are other applications of the approach known to work in practice.  Furthermore, we have some <a href="http://www.cse.ucsd.edu/~djhsu/papers/hmm-colt.pdf">sample complexity analysis in the linear case</a>.  </p>
<p>It&#8217;s relatively rare to have a paper about a new approach to solving a problem as intractable as nonlinear dynamics has proved to be, so if you see a flaw please speak up.</p>
]]></content:encoded>
			<wfw:commentRss>http://hunch.net/?feed=rss2&amp;p=777</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Multitask Poisoning</title>
		<link>http://hunch.net/?p=760</link>
		<comments>http://hunch.net/?p=760#comments</comments>
		<pubDate>Mon, 01 Jun 2009 08:25:02 +0000</pubDate>
		<dc:creator>jl</dc:creator>
				<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[Research]]></category>

		<guid isPermaLink="false">http://hunch.net/?p=760</guid>
		<description><![CDATA[There are many ways that interesting research gets done.  For example it&#8217;s common at a conference for someone to discuss a problem with a partial solution, and for someone else to know how to solve a piece of it, resulting in a paper.  In some sense,  these are the easiest results we [...]]]></description>
			<content:encoded><![CDATA[<p>There are many ways that interesting research gets done.  For example it&#8217;s common at a conference for someone to discuss a problem with a partial solution, and for someone else to know how to solve a piece of it, resulting in a paper.  In some sense,  these are the easiest results we can achieve, so we should ask: Can all research be this easy?  </p>
<p>The answer is certainly no for fields where research inherently requires  experimentation to discover how the real world works.  However, mathematics, including parts of physics, computer science, statistics, etc&#8230; which are effectively mathematics don&#8217;t require experimentation. In effect, a paper can be simply a pure expression of thinking.  Can all mathematical-style research be this easy?</p>
<p>What&#8217;s going on here is research-by-communication.  Someone knows something, someone knows something else, and as soon as someone knows both things, a problem is solved.  The interesting thing about research-by-communication is that it is becoming radically easier with improved technology.  It&#8217;s already possible to communicate with nearly anyone anywhere in the world via {voice, video, text, shared program/website/blog}. As these mechanisms become cheaper in cost and ease of use, we&#8217;ll surely see their adoption into many activities.  If all mathematical-style research can be sped up by communication, that&#8217;s great news for research.</p>
<p>However, I don&#8217;t believe it.  The essential difficulty is that doing good research often requires the simultaneous understanding of several different things&#8212;the problem, all the broken approaches to solving some problem, why they break, and some hint about where to look for a solution.  Absorbing and simultaneously keeping in mind this information requires a substantial amount of effort.  </p>
<p>The extent of effort required is perhaps most easily understood in collaboration with yourself.  Often, a problem is not immediately solved the first time it is thought of, instead a researcher must attack it again and again, until either giving up or finding a solution.  A basic parameter in attacking a problem is: How hard is it to resume where you left off?  </p>
<p>For many of the problems I&#8217;ve worked on, the answer is pretty hard.  After weeks of not working on a problem, it might take several hours to refresh my memory with all the simultaneous concerns that go into a solution.  Even if I worked on such a problem yesterday, it might take a half hour or an hour to reach a state where I&#8217;m prepared to make progress.  </p>
<p>Given the difficulty of research, I (and many other people) often struggle in dealing with interruptions.  Modern technology has made communication very easy, implying a stream of potential interruptions throughout the day, some of which are undeniably fruitful.  And yet, an interruption means the overhead of getting back to thinking must be paid yet again.</p>
<p>Trading off properly between the value of avoiding the overhead and the value of dealing with interruptions is a constant struggle which typically did not exist before before modern communication technology made it prevalent.  I think it is common to give in to the interrupts, and effectively cease to be able to do research other than research-by-communication.  I&#8217;m not ready to do that,  so I&#8217;ve found it essential to arrange large unused and uninterrupted periods of time for the purpose of doing research.  </p>
<p>Although the solution is simple, I&#8217;ve often found implementing it challenging in several ways.  There is the standard problem that people you deal with don&#8217;t understand the overhead of switching tasks in research.  However, there is also a more insidious problem.  Coping with the modern world requires that at least some portion of our time be devoted to interrupts, which are almost always easier to deal with than research.  Dealing with these interrupts therefore can create a bad habit where you seek interrupts to achieve the (short term, easy) gratification of dealing with them.  Thus, multitasking creates an internal expectation of multitasking, which makes multitasking preferred, eliminating the ability to do careful research.   This is what I think of as multitask poisoning&#8212;it&#8217;s not just the time lost to swapping some context back in, but also the bad habits of self-interruption that it creates.</p>
<p>I don&#8217;t have a good answer to this problem, other than continuing the struggle to preserve substantial contiguous chunks of time for thinking.   An alternative sometimes-applicable solution is to reduce the overhead of starting to solve a problem by decomposing the problem into subproblems.  Where possible, this is of course valuable, but mathematical research is often almost uniquely undecomposable, because the nature of the problem is that it&#8217;s solution is unknown and hence undecomposable.  Restated, the real problem <em>is</em> finding a valid decomposition.</p>
<p>I self-interrupted at least 3 times in making this post.</p>
]]></content:encoded>
			<wfw:commentRss>http://hunch.net/?feed=rss2&amp;p=760</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Many ways to Learn this summer</title>
		<link>http://hunch.net/?p=756</link>
		<comments>http://hunch.net/?p=756#comments</comments>
		<pubDate>Sat, 30 May 2009 21:43:47 +0000</pubDate>
		<dc:creator>jl</dc:creator>
				<category><![CDATA[Announcements]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Teaching]]></category>
		<category><![CDATA[Workshop]]></category>

		<guid isPermaLink="false">http://hunch.net/?p=756</guid>
		<description><![CDATA[There are at least 3 summer schools related to machine learning this summer.

The first is at University of Chicago June 1-11 organized by Misha Belkin, Partha Niyogi, and  Steve Smale.  Registration is closed for this one, meaning they met their capacity limit.  The format is essentially an extended Tutorial/Workshop.  I was [...]]]></description>
			<content:encoded><![CDATA[<p>There are at least <em>3</em> summer schools related to machine learning this summer.</p>
<ol>
<li><a href="http://www.cse.ohio-state.edu/mlss09/">The first</a> is at <a href="http://www.uchicago.edu/">University of Chicago</a> June 1-11 organized by <a href="http://www.cse.ohio-state.edu/~mbelkin/">Misha Belkin</a>, <a href="http://people.cs.uchicago.edu/~niyogi/">Partha Niyogi</a>, and  <a href="http://ttic.uchicago.edu/~smale/index.html">Steve Smale</a>.  Registration is closed for this one, meaning they met their capacity limit.  The format is essentially an extended Tutorial/Workshop.  I was particularly interested to see <a href="http://people.seas.harvard.edu/~valiant/">Valiant</a> amongst the speakers.  I&#8217;m also presenting Saturday June 6, on logarithmic time prediction.</li>
<li><a href="http://www.seas.upenn.edu/~psrin/">Praveen Srinivasan</a> points out the second at <a href="http://en.pku.edu.cn/">Peking University</a> in Beijing, China, July 20-27.  <a href="http://vision.stanford.edu/VLPR2009/">This one</a> differs substantially, as it is about vision, machine learning, and their intersection.  The deadline for applications is June 10 or 15.  This is also another example of the growth of research in China, with active support from <a href="http://www.nsf.gov/">NSF</a>.</li>
<li>The third one is at <a href="http://www.cam.ac.uk/">Cambridge</a>, England, August 29-September 10.  It&#8217;s in the <a href="http://mlss.cc/">MLSS series</a>.  Compared to the Chicago one, this one is more about the Bayesian side of ML, although effort has been made to create a good cross section of topics.  It&#8217;s also more focused on tutorials over workshop-style talks.</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://hunch.net/?feed=rss2&amp;p=756</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2009 ICML discussion site</title>
		<link>http://hunch.net/?p=752</link>
		<comments>http://hunch.net/?p=752#comments</comments>
		<pubDate>Sun, 24 May 2009 19:54:18 +0000</pubDate>
		<dc:creator>jl</dc:creator>
				<category><![CDATA[Announcements]]></category>
		<category><![CDATA[Conferences]]></category>
		<category><![CDATA[Machine Learning]]></category>

		<guid isPermaLink="false">http://hunch.net/?p=752</guid>
		<description><![CDATA[Mark Reid has setup a discussion site for ICML papers again this year and Monica Dinculescu has linked it in from the ICML site.  Last year&#8217;s attempt appears to have been an acceptable but not wild success as a little bit of fruitful discussion occurred.  I&#8217;m hoping this year will be a bit [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://mark.reid.name/">Mark Reid</a> has setup a <a href="http://www.conflate.net/icml/">discussion site for ICML papers</a> again this year and <a href="http://www.cs.mcgill.ca/~mdincu/">Monica Dinculescu</a> has linked it in from the ICML site.  Last year&#8217;s attempt appears to have been an acceptable but not wild success as a little bit of fruitful discussion occurred.  I&#8217;m hoping this year will be a bit more of a success&#8212;please don&#8217;t be shy <img src='http://hunch.net/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>I&#8217;d like to also point out that <a href="http://www.cs.mcgill.ca/~icml2009/index.html">ICML</a>&#8217;s early <a href="http://www.cs.mcgill.ca/~icml2009/registration.html">registration</a> deadline has a few hours left, while <a href="http://www.cs.mcgill.ca/~uai2009/">UAI</a>&#8217;s and <a href="http://www.cs.mcgill.ca/~colt2009/">COLT</a>&#8217;s are in a week.</p>
]]></content:encoded>
			<wfw:commentRss>http://hunch.net/?feed=rss2&amp;p=752</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>CI Fellows</title>
		<link>http://hunch.net/?p=748</link>
		<comments>http://hunch.net/?p=748#comments</comments>
		<pubDate>Tue, 19 May 2009 21:37:31 +0000</pubDate>
		<dc:creator>jl</dc:creator>
				<category><![CDATA[CS]]></category>
		<category><![CDATA[Funding]]></category>
		<category><![CDATA[Machine Learning]]></category>

		<guid isPermaLink="false">http://hunch.net/?p=748</guid>
		<description><![CDATA[Lev Reyzin points out the CI Fellows Project.  Essentially, NSF is funding 60 postdocs in computer science for graduates from a wide array of US places to a wide array of US places.  This is particularly welcome given a tough year for new hires.  I expect some fraction of these postdocs will [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.cs.yale.edu/homes/lr288/">Lev Reyzin</a> points out the <a href="http://cifellows.org/">CI Fellows Project</a>.  Essentially, <a href="http://www.nsf.gov/">NSF</a> is funding 60 postdocs in computer science for graduates from a wide array of US places to a wide array of US places.  This is particularly welcome given a tough year for new hires.  I expect some fraction of these postdocs will be in ML.  The time frame is quite short, so those interested should look it over immediately.</p>
]]></content:encoded>
			<wfw:commentRss>http://hunch.net/?feed=rss2&amp;p=748</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Server Update</title>
		<link>http://hunch.net/?p=743</link>
		<comments>http://hunch.net/?p=743#comments</comments>
		<pubDate>Sun, 17 May 2009 21:32:29 +0000</pubDate>
		<dc:creator>jl</dc:creator>
				<category><![CDATA[Meta]]></category>

		<guid isPermaLink="false">http://hunch.net/?p=743</guid>
		<description><![CDATA[The hunch.net server has been updated.  I&#8217;ve taken the opportunity to upgrade the version of wordpress which caused cascading changes.

Old threaded comments are now flattened.  The system we used to use (Brian&#8217;s threaded comments) appears incompatible with the new threading system built into wordpress.  I haven&#8217;t yet figured out a workaround.
I setup [...]]]></description>
			<content:encoded><![CDATA[<p>The hunch.net server has been updated.  I&#8217;ve taken the opportunity to upgrade the version of wordpress which caused cascading changes.</p>
<ol>
<li>Old threaded comments are now flattened.  The system we used to use (<a href="http://meidell.dk/archives/2004/09/04/nested-comments/">Brian&#8217;s threaded comments</a>) appears incompatible with the new threading system built into wordpress.  I haven&#8217;t yet figured out a workaround.</li>
<li>I setup a <a href="http://feeds2.feedburner.com/MachineLearningtheory">feedburner account</a>.</li>
<li>I added an RSS aggregator for both Machine Learning and other research blogs that I like to follow.  This is something that I&#8217;ve wanted to do for awhile.</li>
<li>Many other minor changes in font and format, with some help from <a href="http://hunch.net/~beygel">Alina</a>.</li>
</ol>
<p>If you have any suggestions for site tweaks, please speak up.</p>
]]></content:encoded>
			<wfw:commentRss>http://hunch.net/?feed=rss2&amp;p=743</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>Computability in Artificial Intelligence</title>
		<link>http://hunch.net/?p=727</link>
		<comments>http://hunch.net/?p=727#comments</comments>
		<pubDate>Fri, 08 May 2009 09:15:55 +0000</pubDate>
		<dc:creator>Marcus Hutter</dc:creator>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Reinforcement]]></category>
		<category><![CDATA[Theory]]></category>
		<category><![CDATA[Universal Learning]]></category>

		<guid isPermaLink="false">http://hunch.net/?p=727</guid>
		<description><![CDATA[Normally I do not blog, but John kindly invited me to do so.  Since computability issues play a major role in Artificial Intelligence and Machine Learning, I would like to take the opportunity to comment on that and raise some questions.
The general attitude is that AI is about finding efficient smart algorithms. For large [...]]]></description>
			<content:encoded><![CDATA[<p>Normally I do not blog, but John kindly invited me to do so.  Since computability issues play a major role in Artificial Intelligence and Machine Learning, I would like to take the opportunity to comment on that and raise some questions.</p>
<p>The general attitude is that AI is about finding efficient smart algorithms. For large parts of machine learning, the same attitude is not too dangerous. If you want to concentrate on conceptual problems, simply become a statistician. There is no analogous escape for modern research on AI (as opposed to <a href="http://en.wikipedia.org/wiki/GOFAI">GOFAI</a> rooted in logic).
</p>
<p>Let me show by analogy why <b>limiting research to computational questions is bad for any field.</b></p>
<p>Except in computer science, computational aspects play little role in the development of <i>fundamental</i> theories: Consider e.g. set theory with axiom of choice, foundations of logic, exact/full minimax for zero-sum games, quantum (field) theory, string theory, &#8230; Indeed, at least in physics, every new fundamental theory seems to be less computable than previous ones. Of course, once a subject has been formalized, further research (a) analyzes the structure of the theory and (b) tries to compute efficient approximations. Only in (b) do computational aspects play a role.</p>
<p>So my question is: <b>Why are computational questions so prevalent in AI research? Here are some unconvincing arguments I&#8217;ve heard:</b></p>
<p><b>A) Because AI is a subfield of computer science</b>, <i>and the task of computer scientists is to find (efficient) algorithms for well-defined problems?</i></p>
<p>I think it does not do any (real-world) problem any good to confine it to computer science. Of course, philosophers and cognitive scientists also care about AI, but where are the mathematicians?</p>
<p><b>B) Because formalizing AI and finding efficient smart programs goes hand-in-hand?</b> <i>Separating these two issues would lead to no, or at best to results which are misleading or useless in the construction of intelligent machines?</i></p>
<p>I am not aware of any convincing argument that separating the issues of &#8220;axiomatizing a field&#8221; and &#8220;finding efficient solutions&#8221; will (likely) fail for AI. The examples above of other fields actually indicate the opposite. Of course, interaction is important to avoid <i>both</i> sides running wild. For instance, von Neumann&#8217;s minimax solution for games, albeit infeasible for most games, is the cornerstone of most practical approximations.</p>
<p><b>C) Because there is some deep connection between intelligence and computation</b><i> which can not be disentangled?</i></p>
<p>Sure, you could say that intelligence is by definition about computationally efficient decision making. This is as unconvincing as argument (A). Pointing out that the human brain is a computational device is quite useful in many ways, but doesn&#8217;t proves (C) either. Of course, ultimately we want a &#8220;fast&#8221; smart algorithm. How is AI different from wanting a fast algorithm computing primes, which you derive from a non-algorithmic definition of primes; or drawing fractals?</p>
<p><b>D) Because AI is trivial if computational issues are ignored?</b> <i>All conceptual problems have already been solved?</i></p>
<p>Many have expressed ideas that some form of exhaustive search over all possible solutions and picking the &#8220;best&#8221; one does the job. <a href="http://dx.doi.org/10.1142/S0129054102001199">This works essentially for exactly those problems that are well-defined</a>. For instance, optimal minimax play of a zero-sum game or solving NP complete problems are conceptually trivial, i.e. if computation time is ignored. But in general AI and machine learning, there is <i>not</i> a universally agreed-upon objective function. The Turing test is informal (involves a human judge in the loop), maximizing expected reward (the true distribution is not known, so expectation w.r.t. to what?), etc. <a href="http://www.hutter1.net/ai/uaibook.htm">The AIXI model</a>, briefly discussed at this blog, is the first complete and formal such criterion, for which, let me phrase it that way, no flaw has yet been identified. <a href="http://www.lulu.com/content/2043514">Shane Legg&#8217;s award-winning thesis</a> gives an informal introduction and contains lots of discussion.</p>
<p><b>Conceptual and computational problems in AI should be studied jointly as well as separately</b>, but the latter is not (yet) fashionable. When AI was more logic oriented, some good logicians helped develop the foundations of &#8220;deductive&#8221; AI. Where are the researchers giving modern &#8220;inductive&#8221; AI its foundation? I am talking about generic learning agents, not classifying i.i.d. data. Reinforcement learners? Well, most of the hard results are from adaptive control theorists, but it&#8217;s reassuring to see parts of these communities merging. It&#8217;s a pity that so few mathematicians are interested in AI. A field &#8220;mathematical AI&#8221; with the prestige of &#8220;mathematical physics&#8221; would be exciting. As a start: 40% of the <u>COLT</u> &amp; <a href="http://www-alg.ist.hokudai.ac.jp/~thomas/ALTARCH/altarch.jhtml">ALT</a> papers on generic learning agents, 30% induction, 20% time-series forecasting, 10% i.i.d. Currently it&#8217;s reversed.</p>
]]></content:encoded>
			<wfw:commentRss>http://hunch.net/?feed=rss2&amp;p=727</wfw:commentRss>
		<slash:comments>24</slash:comments>
		</item>
	</channel>
</rss>
