Machine Learning (Theory)

6/18/2014

An ICML proposal: yearly surveys

I’d like to propose that ICML conduct a yearly survey, similar to the one from 2010 or 2012, whose results are reported to all.

The key reason for this is information: I expect everyone participating in ICML has some baseline interest in how ICML is doing. Everyone involved has personal anecdotal information, but we all understand that a few examples can be highly misleading.

Aside from satisfying everyone’s joint curiosity, I believe this could improve ICML itself. Consider for example reviewing. Every program chair comes in with ideas for how to make reviewing better. Some succeed, but nearly all are forgotten by the next round of program chairs. Making survey information available will help quantify success and correlate it with design decisions.

The key question to ask for this is “who?” The reason why surveys don’t happen more often is that it has been the responsibility of program chairs who are typically badly overloaded. I believe we should address this by shifting the responsibility to a multiyear position, similar to or the same as a webmaster. This may imply a small cost to the community (<$1/participant) for someone’s time to do and record the survey, but I believe it’s a worthwhile cost.

I plan to bring this up with the IMLS board in Beijing, but would like to invite any comments or thoughts.

6/20/2010

2010 ICML discussion site

A substantial difficulty with the 2009 and 2008 ICML discussion system was a communication vacuum, where authors were not informed of comments, and commenters were not informed of responses to their comments without explicit monitoring. Mark Reid has set up a new discussion system for 2010 with the goal of addressing this.

Mark didn’t want to make it too intrusive, so you must opt in. As an author, find your paper and “Subscribe by email” to the comments. As a commenter, you have the option of providing an email for follow-up notification.

6/22/2005

Languages of Learning

Tags: Organization jl@ 1:08 pm

A language is a set of primitives which can be combined to successfully create complex objects. Languages arise in all sorts of situations: mechanical construction, martial arts, communication, etc. Languages appear to be the key to successfully creating complex objects: it is difficult to come up with any convincing example of a complex object which is not built using some language. Since languages are so crucial to success, it is interesting to organize various machine learning research programs by language.

The most common languages in machine learning are languages for representing the solution to a machine learning problem. These include:

  1. Bayes Nets and Graphical Models: A language for representing probability distributions. The key concept supporting modularity is conditional independence. Michael Kearns has been working on extending this to game theory.
  2. Kernelized Linear Classifiers: A language for representing linear separators, possibly in a large space. The key form of modularity here is kernelization.
  3. Neural Networks: A language for representing and learning functions. The key concept supporting modularity is backpropagation. (Yann LeCun gave some very impressive demos at the Chicago MLSS.)
  4. Decision Trees: Another language for representing and learning functions. The key concept supporting modularity is partitioning the input space.

Many other learning algorithms can be seen as falling into one of the above families.
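As a rough illustration of what “linear separators, possibly in a large space” means in practice, here is a minimal sketch of a kernel perceptron. The function names, the RBF kernel, and the `gamma` and `epochs` settings are all assumptions chosen for the example. The learned separator is linear in the feature space implied by the kernel, but the code only ever touches the kernel function, which is the modularity that kernelization provides.

```python
import math

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length numeric sequences."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def kernel_score(examples, alphas, kernel, x):
    """Signed score of x: a kernel-weighted vote of past mistake examples."""
    return sum(a * yj * kernel(xj, x) for a, (xj, yj) in zip(alphas, examples))

def train_kernel_perceptron(examples, kernel, epochs=10):
    """Train on (x, y) pairs with y in {-1, +1}.

    Returns one mistake count per training example; these counts are the
    only learned parameters.
    """
    alphas = [0] * len(examples)
    for _ in range(epochs):
        for i, (x, y) in enumerate(examples):
            # Perceptron rule: update only on a mistake (or a zero score).
            if y * kernel_score(examples, alphas, kernel, x) <= 0:
                alphas[i] += 1
    return alphas

def predict(examples, alphas, kernel, x):
    """Classify x as +1 or -1 using the trained mistake counts."""
    return 1 if kernel_score(examples, alphas, kernel, x) > 0 else -1

# XOR-style data: not linearly separable in the original two dimensions.
xor_data = [((0, 0), -1), ((1, 1), -1), ((0, 1), 1), ((1, 0), 1)]
xor_alphas = train_kernel_perceptron(xor_data, rbf_kernel)
```

Swapping `rbf_kernel` for a different kernel changes the space the separator lives in without touching the training loop.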

In addition there are languages related to various aspects of learning.

  1. Reductions: A language for translating between varying real-world losses and core learning algorithm optimizations.
  2. Feature Languages: Exactly how features are specified varies from one learning algorithm to another. Several people have been working on languages for features that cope with sparsity or the cross-product nature of databases.
  3. Data interaction languages: The statistical query model of learning algorithms provides a standardized interface between data and learning algorithm.

These lists surely miss some languages—feel free to point them out below.
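To give a flavor of the reductions language, here is a minimal sketch of one well-known reduction, one-against-all, which turns a multiclass problem into several binary ones. It is only one illustrative instance. The names `train_one_against_all` and `centroid_learner` are hypothetical, and the toy binary learner is an assumption made to keep the example self-contained.

```python
def train_one_against_all(train_binary, examples, num_classes):
    """Reduce k-class classification to k binary problems.

    train_binary takes a list of (x, 0/1 label) pairs and returns a scorer
    mapping x to a float, where larger means "more likely positive".
    Returns a multiclass predictor mapping x to a class index.
    """
    scorers = []
    for k in range(num_classes):
        # Relabel: class k becomes positive, everything else negative.
        relabeled = [(x, 1 if y == k else 0) for x, y in examples]
        scorers.append(train_binary(relabeled))

    def predict(x):
        # Predict the class whose binary scorer is most confident.
        return max(range(num_classes), key=lambda k: scorers[k](x))

    return predict

def centroid_learner(binary_examples):
    """Toy binary learner for 1-D inputs: score by closeness to the positive mean."""
    positives = [x for x, label in binary_examples if label == 1]
    center = sum(positives) / len(positives)
    return lambda x: -abs(x - center)

toy_data = [(0.0, 0), (0.2, 0), (5.0, 1), (5.2, 1), (10.0, 2), (10.3, 2)]
toy_predict = train_one_against_all(centroid_learner, toy_data, 3)
```

The reduction never looks inside `train_binary`, so any binary learner with the same interface can be plugged in.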

With respect to research, “interesting” language-related questions include:

  1. For what aspects of learning is a language missing? Anytime adhocery is encountered, this suggests that there is room for a language. Finding what is not there is both hard and valuable.
  2. Are any of these languages fundamentally flawed or fundamentally advantageous with respect to another language?
  3. What are the easiest to use and most effective primitives for these languages?

6/13/2005

Wikis for Summer Schools and Workshops

Tags: Organization dinoj@ 4:52 pm

Chicago ’05 ended a couple of weeks ago. This was the sixth Machine Learning Summer School, and the second one that used a wiki. (The first was Berder ’04, thanks to Gunnar Raetsch.) Wikis are relatively easy to set up, greatly aid social interaction, and should be used a lot more at summer schools and workshops. They can even be used as the meeting’s webpage, as a permanent record of its participants’ collaborations — see for example the wiki/website for last year’s NVO Summer School.

A basic wiki is a collection of editable webpages, maintained by software called a wiki engine. The engine used at both Berder and Chicago was TikiWiki; it is well documented and gets you something running fast. It uses PHP and MySQL, but doesn’t require you to know either. TikiWiki has far more features than most wikis, as it is really a full Content Management System. (My thanks to Sebastian Stark for pointing this out.) Here are the features we found most useful:

  • Bulletin boards, or forums. The most-used one was the one for social events, which allowed participants to find company for doing stuff without requiring organizer assistance. While conferences, by their inherently less interactive nature, don’t usually benefit from all aspects of wikis, this is one feature worth adding to every one. [Example]

    Other useful forums to set up are “Lost and Found”, and discussion lists for lectures — although the latter only work if the lecturer is willing to actively answer questions arising on the forum. You can set forums up so that all posts to them are immediately emailed to someone.

  • Editable pages. For example, we set up pages for each lecture that we were able to edit easily later as more information (e.g. slides) became available. Lecturers who wanted to modify their pages could do so without requiring organizer help or permission. (Not that most of them actually took advantage of this in practice… but this will happen in time, as the wiki meme infects academia.) [Example]

  • Sign-up sheets. Some tutorials or events were only open to a limited number of people. Having editable pages means that people can sign up themselves. [Example]

  • FAQs. You can set up general categories, add questions, and place the same question in different categories. We set most of this up before the summer school, with directions on how to get there from the airport, what to bring, etc. We also had volunteers post answers to anticipated FAQs like the location of local restaurants and blues clubs. [Example]

  • Menus. You can set up the overall layout of the webpage by specifying the locations and contents of menus on the left and right of a central ‘front page’. This is done via the use of ‘modules’, and makes it possible for your wiki pages to completely replace the webpages, if you are willing to make some aesthetic sacrifices.

  • Different levels of users: The utopian wiki model of having ‘all pages editable by everyone’ is … well, utopian. You can set up different groups of users with different permissions.

  • Calendars. Useful for scheduling, and for changes to schedules. (With the number of changes we had, we really needed this.) You can have multiple calendars e.g. one for lectures, another for practical sessions, and another for social events — and users can overlay them on each other. [Example]

A couple of other TikiWiki features that we didn’t get working at Chicago, but would have been nice to have, are these:

  • Image Galleries. Gunnar got this working at Berder, where it was a huge success. Photographs are great icebreakers, even the ones that don’t involve dancing on tables.

  • Surveys. These are easy to set up, and have an option for participants to see, or not to see, the results of surveys. This is useful when asking people to rate lectures.

TikiWiki also has several features that we didn’t use, such as blogs and RSS feeds. It also has a couple of bugs (and features that are bad enough to be called bugs), such as permission issues and the inability to print calendars neatly. These will doubtless get cleaned up in due course.

Finally, owing to much prodding from John and some other MLSS participants, I’ve written up my experiences in using TikiWiki @ Chicago ’05 on my website, including installation instructions and a list of “Good Things to Do”. This documentation is meant to be a survival guide complementary to the existing TikiWiki documentation, which can sometimes be overwhelming.

4/14/2005

Families of Learning Theory Statements

Tags: Organization jl@ 4:41 pm

The diagram above shows a very broad viewpoint of learning theory.

Arrow | Typical statement | Examples
Past->Past | Some prediction algorithm A does almost as well as any of a set of algorithms. | Weighted Majority
Past->Future | Assuming independent samples, past performance predicts future performance. | PAC analysis, ERM analysis
Future->Future | Future prediction performance on subproblems implies future prediction performance using algorithm A. | ECOC, Probing

A basic question is: Are there other varieties of statements of this type? Avrim noted that there are also “arrows between arrows”: generic methods for transforming between Past->Past statements and Past->Future statements. Are there others?
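To make the Past->Past row concrete, here is a minimal sketch of deterministic Weighted Majority for binary predictions. The function name, the penalty factor `beta=0.5`, and the toy experts are illustrative assumptions. With `beta=0.5`, the standard analysis bounds the combined predictor's mistakes by roughly 2.41 (m + log2 n), where m is the best expert's mistake count and n is the number of experts. That is a statement purely about performance on the observed sequence, with no assumption about how the data was generated.

```python
def weighted_majority(experts, examples, beta=0.5):
    """Run deterministic Weighted Majority over binary (0/1) predictions.

    experts: list of functions mapping x to 0 or 1.
    examples: iterable of (x, y) pairs with y in {0, 1}.
    Returns (mistakes of the combined predictor, list of per-expert mistakes).
    """
    weights = [1.0] * len(experts)
    algo_mistakes = 0
    expert_mistakes = [0] * len(experts)
    for x, y in examples:
        preds = [expert(x) for expert in experts]
        # Weighted vote for each label; ties go to label 1.
        vote_one = sum(w for w, p in zip(weights, preds) if p == 1)
        vote_zero = sum(w for w, p in zip(weights, preds) if p == 0)
        guess = 1 if vote_one >= vote_zero else 0
        if guess != y:
            algo_mistakes += 1
        # Multiplicatively penalize every expert that was wrong this round.
        for i, p in enumerate(preds):
            if p != y:
                expert_mistakes[i] += 1
                weights[i] *= beta
    return algo_mistakes, expert_mistakes

# Toy experts (hypothetical): always 0, always 1, and parity of x.
toy_experts = [lambda x: 0, lambda x: 1, lambda x: x % 2]
toy_sequence = [(x, x % 2) for x in range(20)]
wm_mistakes, per_expert = weighted_majority(toy_experts, toy_sequence)
```

On this toy sequence the parity expert is never wrong, so the bound says the combined predictor can make at most a handful of mistakes no matter how the other experts behave.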
