Machine Learning (Theory)


ICML Board and Reviewer profiles

The outcome of the election for the IMLS (which runs ICML) adds Emma Brunskill, Kamalika Chaudhuri, and Hugo Larochelle to the board. The current members of the board (and the reason for board membership) are:

President Elect is a 2-year position with little responsibility, but I decided to look into two things. One is the website, which seems relatively difficult to navigate. Ideas for how to improve it are welcome.

The other is creating a longitudinal reviewer profile. I keenly remember the day after reviews were due when I was program chair (in 2012), with a panic-inducing number of reviews still unfinished. To help with this, I’m planning to create a profile of reviewers which program chairs can refer to in making decisions about who to ask to review. There are a number of ways to do this wrong which I’m avoiding with the following procedure:

  1. After reviews are assigned, capture the reviewer/paper assignment. Call this set A.
  2. After reviews are due, capture the completed & incomplete reviews for papers. Call these sets B & C respectively.
  3. Strip the paper ids from B (completed reviews) turning it into a multiset D of reviewers’ completed reviews.
  4. Compute the intersection of C and A (the incomplete reviews that were already in the original assignment) then turn it into a multiset E of reviewers’ incomplete reviews.
  5. Store D & E for long term reference.
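
The procedure above can be sketched in a few lines of Python. This is only an illustration, not the code ICML uses: the function name, the reviewer names, and the paper ids are all made up, and each of A, B, and C is assumed to be a set of (reviewer, paper) pairs. Step 4 is taken here to mean keeping only the incomplete reviews whose reviewer/paper pair was already in A, which is what is needed so that late assignments are not counted against a reviewer.

```python
from collections import Counter


def reviewer_profile(assigned, completed, incomplete):
    """Build the multisets D and E from sets of (reviewer, paper) pairs.

    assigned   -- set A, captured right after reviews are assigned
    completed  -- set B, captured after reviews are due
    incomplete -- set C, captured after reviews are due
    """
    # Step 3: drop the paper ids from B, leaving a multiset of reviewers.
    d = Counter(reviewer for reviewer, _paper in completed)
    # Step 4 (assumed reading): only count incomplete reviews that were
    # part of the original assignment A, then drop the paper ids.
    e = Counter(reviewer for reviewer, _paper in incomplete & assigned)
    return d, e


# Hypothetical data: bob asked for p3 to be reassigned, carol picked it up
# late and finished it, and dave was assigned p4 late and did not finish.
A = {("alice", "p1"), ("alice", "p2"), ("bob", "p1"), ("bob", "p3")}
B = {("alice", "p1"), ("alice", "p2"), ("carol", "p3")}
C = {("bob", "p1"), ("dave", "p4")}

D, E = reviewer_profile(A, B, C)
print(D)
print(E)
```

In this toy example only bob's unfinished review of p1 ends up in E. Dave's late assignment and bob's reassigned paper leave no trace, and neither D nor E contains a paper id.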

This approach:

  • Is objectively defined. Approaches based on subjective measurements seem both fraught with judgment issues and inconsistent. Consider for example the impressive variation we all see in review quality.
  • Does not record a review as late for reviewers who are assigned a paper late in the process, via steps (1) and (4). We want to encourage reviewers to take on the unusual but important late tasks that arrive.
  • Does not record a review as late for reviewers who discover they are inappropriate after assignment and ask for reassignment. We want to encourage reviewers to look at their papers early and, if necessary, ask for a paper to be reassigned early.
  • Preserves anonymity of paper/reviewer assignments for authors who later become program chairs. The conversion into a multiset removes the paper id entirely.

Overall, my hope is that several years of this will provide a good and useful tool enabling program chairs and good (or at least not-bad) reviewers to recognize each other.


Vowpal Wabbit 8.5.0 & NIPS tutorial

Yesterday, I tagged VW version 8.5.0 which has many interactive learning improvements (both contextual bandit and active learning), better support for sparse models, and a new baseline reduction which I’m considering making a part of the default update rule.

If you want to know the details, we’ll be doing a mini-tutorial during the Friday lunch break at the Extreme Classification workshop at NIPS. Please join us if interested.

Edit: also announced at the Learning Systems workshop


The Real World Interactive Learning Tutorial

Alekh and I have been polishing the Real World Interactive Learning tutorial for ICML 2017 on Sunday.

This tutorial should be of pretty wide interest. For data scientists, we are crossing a threshold into easy use of interactive learning while for researchers interactive learning is plausibly the most important frontier of understanding. Great progress on both the theory and especially on practical systems has been made since an earlier NIPS 2013 tutorial.

Please join us if you are interested :-)


ICML is changing its constitution

Andrew McCallum has been leading an initiative to update the bylaws of IMLS, the organization which runs ICML. I expect most people aren’t interested in such details. However, the bylaws change rarely and can have an impact over a long period of time so they do have some real importance. I’d like to hear comments from anyone with a particular interest before this year’s ICML.

In my opinion, the most important aspect of the bylaws is the at-large election of members of the board which is preserved. Most of the changes between the old and new versions are aimed at better defining roles, committees, etc… to leave IMLS/ICML better organized.

Anyways, please comment if you have concerns or thoughts.


Machine Learning the Future Class

This spring, I taught a class on Machine Learning the Future at Cornell Tech covering a number of advanced topics in machine learning including online learning, joint (structured) prediction, active learning, contextual bandit learning, logarithmic time prediction, and parallel learning. Each of these classes was recorded from the laptop via Zoom and I just uploaded the recordings to YouTube.

In some ways, this class is a followup to the large scale learning class I taught with Yann LeCun 4 years ago. The videos for that class were taken down(*) so these lectures both update and replace shared subjects as well as having some new subjects.

Much of this material is fairly close to research so to assist other machine learning lecturers around the world in digesting the material, I’ve made all the source available as well. Feel free to use and improve.

(*) The NYU policy changed so that students could not be shown in classroom videos.
