General – Page 10 – Machine Learning (Theory)

5/29/2005

Bad ideas

I found these two essays on bad ideas interesting. Neither of these is written from the viewpoint of research, but they are both highly relevant.

Why smart people have bad ideas by Paul Graham
Why smart people defend bad ideas by Scott Berkun (which appeared on slashdot)

In my experience, bad ideas are common and overconfidence in ideas is common. This overconfidence can take either the form of excessive condemnation or excessive praise. Some of this is necessary to the process of research. For example, some overconfidence in the value of your own research is expected and probably necessary to motivate your own investigation. Since research is a rather risky business, much of it does not pan out. Learning to accept when something does not pan out is a critical skill which is sometimes never acquired.

Excessive condemnation can be a real ill when it’s encountered. This has two effects:

When the penalty for being wrong is too large, it means people have a great investment in defending “their” idea. Since research is risky, “their” idea is often wrong (or at least in need of amendment).
A large penalty implies people are hesitant to introduce new ideas.

Both of these effects slow the progress of research. How much, exactly, is unclear and very difficult to imagine measuring.

While it may be difficult to affect the larger community of research, you can and should take these considerations into account when choosing coauthors, advisors, and other people you work with. The ability to say “oops, I was wrong”, have that be accepted without significant penalty, and move on is very valuable for the process of thinking.

5/28/2005 (updated 5/30/2005)

Running A Machine Learning Summer School

We just finished the Chicago 2005 Machine Learning Summer School. The school was 2 weeks long with about 130 participants (or 140 counting the speakers). For perspective, this is perhaps the largest graduate level machine learning class I am aware of anywhere and anytime (previous MLSSs have been close). Overall, it seemed to go well, although the students are the real authority on this. For those who missed it, DVDs will be available from our Slovenian friends. Email Mrs Spela Sitar of the Jožef Stefan Institute for details.

The following are some notes for future planning and those interested.
Good Decisions

Acquiring the larger-than-necessary “Assembly Hall” at International House. Our attendance came in well above our expectations, so this was a critical early decision that made a huge difference.
The invited speakers were key. They made a huge difference in the quality of the content.
Delegating early and often was important. One key difficulty here is gauging how much a volunteer can (or should) do. Many people are willing to help a little, so breaking things down into small chunks is important.

Unclear Decisions

Timing (May 16-27, 2005): We wanted to take advantage of the special emphasis on learning quarter here. We also wanted to run the summer school in the summer. These two goals could not both be fully satisfied. By starting as late as possible in the quarter, we were in the “summer” for universities on a semester schedule but not those on a quarter schedule. Thus, we traded some students and scheduling conflicts at the University of Chicago for the advantages of the learning quarter.
Location (Hyde Park, Chicago):
Advantages:
1. Easy to fly to.
2. Easy to get funding. (TTI and UChicago were both significant contributors.)
3. Easy (on-site) organization.
Disadvantages:
1. US visas were processed too slowly for, or denied to, 7+ students.
2. Location in Chicago implied many locals drifted in and out.
3. The Hyde Park area lacks real hotels, creating housing difficulties.
Workshop colocation: We colocated with two workshops. The advantage of this is more content. The disadvantage was that it forced talks to start relatively early. This meant that attendance at the start of the first lecture was relatively low (60-or-so), ramping up through the morning. Although some students benefitted from the workshop talks, most appeared to gain much more from the summer school.

Things to do Differently Next Time

Delegate harder and better. Doing various things rather than delegating means you feel like you are “doing your part”, but it also means that you are distracted and do not see other things which need to be done... and they simply don’t get done unless you see them.
Have a ‘sorting session’. With 100+ people in the room, it is difficult to meet people of similar interests. This should be explicitly aided. One good suggestion is “have a poster session for any attendees”. Sorting based on other dimensions might also be helpful. The wiki helped here for social events.
Torture the speakers more. Presenting an excess of content in a minimum of time to an audience of diverse backgrounds is extremely difficult. This difficulty cannot be avoided, but it can be ameliorated. Having presentation slides and suggested reading well in advance helps. The bad news here is that it is very difficult to get speakers to make materials available in advance. They naturally want to tweak slides at the last minute and include the newest cool discoveries.
Schedules posted at the entrance.

The Future

There will almost certainly be future machine learning summer schools in the series and otherwise. My impression is that the support due to being “in series” is not critical to success, but it is considerable. For those interested, running one “in series” starts with a proposal consisting of {organizers,time/location,proposed speakers,budget} sent to Alex Smola and Bernhard Schoelkopf. I am sure they are busy, so conciseness is essential.

5/17/2005

A Short Guide to PhD Graduate Study

Graduate study is a mysterious and uncertain process. The easiest way to see this is by noting that a very old advisor/student mechanism is still preferred. There is no known successful mechanism for “mass producing” PhDs as is done (in some sense) for undergraduate and masters study. Here are a few hints that might be useful to prospective or current students based on my own experience.

Masters or PhD. (a) You want a PhD if you want to do research. (b) You want a masters if you want to make money. People wanting (b) will be manifestly unhappy with (a) because it typically means years of low pay. People wanting (a) should try to avoid (b) because it prolongs an already long process.
Attitude. Many students struggle for a while with the wrong attitude towards research. Most students come into graduate school with 16-19 years of schooling where the principal means of success is proving that you know something via assignments, tests, etc… Research does not work this way. Research is what a PhD is about.
The right attitude is something more like “I have some basic questions about the world and want to understand the answers to them.” The process of discovering the answers, writing them up, and convincing others that you have the right answers is what a PhD is about.

Let me repeat this another way: you cannot get a PhD by doing homework (even homework assigned by your advisor). Many students fall into this failure mode because it is very natural to continue as you have for the last n years of education. The difficulty is often exacerbated by the mechanics of the PhD process. For example, students are often dependent on their advisors for funding and typical advisors have much more accumulated experience and knowledge than the typical student.

There are several reasons why you cannot succeed with the “homework” approach:
1. Your advisor doesn’t have time to micromanage your work.
2. A very significant part of doing good research is understanding the motivations (and failures of motivations) of different approaches. Offloading thinking about this on your advisor means that you are missing a critical piece of your own education.
3. It invites abuse. Advisors are often trapped in their own twisty maze of too many things to do, so they are very tempted to offload work (including nonresearch work) onto students. Students doing some of this can make some sense. A bit of help with nonresearch can be useful here and there and even critical when it comes to funding the students. But a student should never be given (or even allowed to take on) more load than comfortably leaves room for significant research.
4. With respect to the wider research community, a PhD is an opportunity to develop a personality independent of your advisor. Doing your advisor’s homework doesn’t accomplish this.
Advisor. The choice of advisor is the most important choice in a PhD education. You want one who is comfortable with your own independent streak. You want one who is well enough off to fund you and who won’t greatly load you down with nonresearch tasks. You want one whose research style fits yours and who has the good regard of the larger research community. This combination of traits is difficult to come by.
Even more difficult is coming by it twice. I recommend having two advisors because it gives you twice the sources of good advice, wisdom, and funding.
Institution. The choice of advisor is more important than the choice of institution. A good advisor is a make-or-break decision with respect to success. The institution is a less important choice of the form “make or make well”. A good institution will have sufficient computational resources and sufficient funding to cover student costs. Quality of life outside of school should be a significant concern because you will be spending years in the same place.
Lifestyle. Before choosing to go for a PhD, try to understand the research lifestyle. If it doesn’t fit reasonably well, don’t try. You will just end up unhappy after years of your life wasted. This is a common failure mode.

5/12/2005 (updated 9/6/2005)

Math on the Web

Andrej Bauer has set up a Mathematics and Computation Blog. As a first step he has tried to address the persistent and annoying problem of math on the web. As a basic tool for precisely stating and transferring understanding of technical subjects, mathematics is necessary. Despite this necessity, every mechanism for expressing mathematics on the web seems unnaturally clumsy. Here are some of the methods and their drawbacks:

1. MathML. This was supposed to be the answer, but it has two severe drawbacks: “Internet Explorer” doesn’t read it and the language is an example of push-XML-to-the-limit which no one would ever consider writing in. (In contrast, html is easy to write in.) It’s also very annoying that math fonts must be installed independently of the browser, even for Mozilla-based browsers.
2. Create inline images. This has several big drawbacks: font size is fixed for all viewers, you can’t cut & paste inside the images, and you can’t hyperlink from (say) symbol to definition. Math World is a good example using this approach.
3. Html Extensions. For example, y_i = x². The drawback here is that the available language is very limited (no square roots, integrals, sums, etc…). This is what I have been using for posts.
4. Raw latex. Researchers are used to writing math in latex and compiling it into postscript or pdf. It is possible to simply communicate in that language. Unfortunately, the language can make simple things like fractions appear (syntactically) much more complicated. More importantly, latex is not nearly as universally known as the mathematics laid out in math books.
5. Translation. An obvious trick is to translate this human-editable syntax into something. There are two difficulties here:
   1. What do you translate to? None of the presentation mechanisms above are fully satisfying.
   2. Lost in translation. For example in latex, it’s hard to make a hyperlink from a variable in one formula to an anchor in the variable definition of another formula and have that translated correctly into (say) MathML.

Approach (4) is what Andrej’s blog is using, with a javascript translator that changes output depending on the destination browser. Ideally, the ‘smart translator’ would use whichever of {MathML, image, html extensions, human-edit format} was best and supported by the destination browser, but that is not yet the case. Nevertheless, it is a good start.
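
To illustrate the earlier point that raw latex makes simple fractions look syntactically complicated, here is a small example of my own choosing (it is not taken from Andrej’s blog). The source below is what a reader of a raw-latex post would have to parse by eye:

```latex
% A two-level fraction: trivial on paper, but nested commands and braces in source.
\[
  \frac{1}{1 + \frac{1}{x}} = \frac{x}{x + 1}
\]
```

Written as plain text, the same identity is 1/(1 + 1/x) = x/(x + 1), which is arguably easier to read when nothing renders it.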

5/11/2005

Visa Casualties

For the Chicago 2005 machine learning summer school we are organizing, at least 5 international students cannot come due to visa issues. There seem to be two aspects to visa issues:

1. Inefficiency. The system effectively rejected students simply by being incapable of even starting to evaluate their visas in less than a month.
2. Politics. Border controls became much tighter after the September 11 attacks. Losing a big chunk of downtown of the largest city in a country will do that.

What I (and the students) learned is that (1) is a much larger problem than (2). Only 1 prospective student seems to have received an explicit visa rejection. Fixing problem (1) should be a no-brainer, because the lag time almost surely indicates overload, and overload on border controls should worry even people concerned with (2). The obvious fixes to overload are “spend more money” and “make the system more efficient”.

With respect to (2), which is the more minor issue by the numbers, it is unclear that the political calculus was done right. There is an obvious demonstrated risk that letting the wrong people through border controls means large buildings can be destroyed. However, there is a subtle risk in making the acquisition of a visa a more uncertain process: it contributes towards shifting science, (human) learning, and technology outside of the US. This shift is economically detrimental to the US. For some anecdotal evidence of this effect, note that this is the first machine learning summer school in the US but the 6th in the series. A less striking but perhaps surer measurement is that many of the machine learning related summer conferences are in Europe this year.