Coronavirus and Machine Learning Conferences

I’ve been following the renamed COVID-19 epidemic closely since potential exponentials deserve that kind of attention.

The last few days have convinced me it’s a good idea to start making contingency plans for machine learning conferences like ICML. The plausible options happen to be structurally aligned with calls to enable reduced travel to machine learning conferences, but of course the need is much more immediate.

I’ll discuss relevant observations about COVID-19 and then the impact on machine learning conferences.

COVID-19 observations

  1. COVID-19 is capable of exponentiating, with a base (the number of new cases each case generates) estimated at 2.13-3.11 and a doubling time of around a week when unchecked (see the back-of-the-envelope calculation after this list).
  2. COVID-19 is far more deadly than the seasonal flu with estimates of a 2-3% fatality rate but also much milder than SARS or MERS. Indeed, part of what makes COVID-19 so significant is the fact that it is mild for many people leading to a lack of diagnosis, more spread, and ultimately more illness and death.
  3. COVID-19 can be controlled at a large scale via draconian travel restrictions. The number of new observed cases per day peaked about 2 weeks after China’s lockdown and has been declining for the last week.
  4. COVID-19 can be controlled at a small scale by careful contact tracing and isolation. There have been hundreds of cases spread across the world over the last month which have not created new uncontrolled outbreaks.
  5. New significant uncontrolled outbreaks in Italy, Iran, and South Korea have been revealed over the last few days. Some details:
    1. The 8 COVID-19 deaths in Iran suggest that the few reported cases (as of 2/23) are only the tip of the iceberg.
    2. The fact that South Korea and Italy can suddenly discover a large outbreak despite heavy news coverage suggests that it can really happen anywhere.
    3. These new outbreaks suggest that in a few days COVID-19 is likely to become a world-problem with a declining China aspect rather than a China-problem with ramifications for the rest of the world.
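
For intuition about how the growth base and the doubling time in observation (1) fit together, here is a back-of-the-envelope calculation. It treats the base as the number of new cases each case generates per serial interval; the serial interval of roughly a week is an assumption made only for illustration, not a figure from this post.

$$N(t) \approx N_0 \, R^{\,t/\tau}, \qquad T_{\text{double}} = \tau \, \frac{\ln 2}{\ln R}.$$

With $R \approx 2.5$ (mid-range of 2.13-3.11) and a serial interval $\tau \approx 7$ days, this gives $T_{\text{double}} \approx 7 \times 0.69 / 0.92 \approx 5$ days, i.e., on the order of the week-long doubling time quoted above.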

There remains quite a bit of uncertainty about COVID-19, of course. The plausible bet is that the known control measures remain effective when and where they can be exercised with new ones (like a vaccine) eventually reducing it to a non-problem.

Conferences
The plausible scenario leaves conferences still in a delicate position because they require that many things go right in order to function. We can easily envision 3 quite different futures here, all consistent with the plausible case.

  1. Good case: New COVID-19 outbreaks are systematically controlled via proven measures, with the overall number of daily cases declining steadily as they are right now. The impact on conferences is marginal, with lingering travel restrictions affecting some (<10%) potential attendees.
  2. Poor case: Multiple COVID-19 outbreaks turn into a pandemic (= a multi-continent epidemic) in regions unable to effectively exercise either control measure. Outbreaks in other regions occur, but they are effectively controlled. The impact on conferences is significant, with many (50%?) avoiding travel due to either restrictions or uncertainty about restrictions.
  3. Bad case: The same as (2), except that an outbreak occurs in the area of the conference. This makes the conference nonviable due to travel restrictions alone. It’s notable here that Italy’s new outbreak involves travel lockdowns a few hundred miles/kilometers from Vienna, where ICML 2020 is planned.

Even the first outcome could benefit from some planning while gracefully handling the last outcome requires it.

The obvious response to these plausible scenarios is to reduce the dependence of a successful conference on travel. To do this we need to think about what a conference is in terms of the roles that it fulfills. The quick breakdown I see is:

  1. Distilling knowledge. Luckily, our review process is already distributed.
  2. Passing on knowledge.
  3. Meeting people, both old friends and discovering new ones.
  4. Finding a job / employee.

Which of these roles can be effectively supported remotely, and how?

I’m planning to have discussions over the next few weeks about this to distill out some plans. If you have good ideas, let’s discuss. Unlike most contingency planning, it seems likely that efforts are not wasted no matter what the outcome 🙂

Code submission should be encouraged but not compulsory

ICML, ICLR, and NeurIPS are all considering or experimenting with code and data submission as a part of the review or publication process, with the hypothesis that it aids reproducibility of results. Reproducibility has been a rising concern, with discussions in papers, workshops, and invited talks.

The fundamental driver is of course lack of reproducibility. Lack of reproducibility is an inherently serious and valid concern for any kind of publishing process where people rely on prior work to compare with and to do new things. Lack of reproducibility (due to random initialization, for example) was one of the things leading to a period of unpopularity for neural networks when I was a graduate student. That unpopularity has proved nonviable (Surprise! Learning circuits is important!), but the reproducibility issue remains. Furthermore, there is always the opportunity for authors to ‘cheat’ in reporting results, and a latent suspicion that some do, which could be allayed using a reproducible approach.

With the above said, I think the reproducibility proponents should understand that reproducibility is a value but not an absolute value. As an example here, I believe it’s quite worthwhile for the community to see AlphaGoZero published even if the results are not necessarily easily reproduced. There is real value for the community in showing what is possible irrespective of whether or not others can readily reproduce the same mastery of Go, and there is real value in having an algorithm like this be public even if the code is not. Treating reproducibility as an absolute value could exclude results like this.

An essential understanding here is that machine learning is (at least) 3 different kinds of research.

  • Algorithms: The goal is coming up with a better algorithm for solving some category of learning problems. This is the most typical viewpoint at these conferences.
  • Theory: The goal is generally understanding what is possible or not possible for learning algorithms. Although these papers may have algorithms, they are often not the point and demanding an implementation of them is a waste of time for author, reviewer, and reader.
  • Applications: The goal is solving some particular task. AlphaGoZero is a reasonable example of this—it was about beating the world champion in Go with algorithmic development in service of that. For this kind of research perfect programmatic reproducibility may be infeasible because the computation is too extreme, the data is proprietary, etc…

Using a one-size-fits-all approach where you demand that every paper “is” a programmatically reproducible implementation is a mistake that would create a division that reduces our community. Keeping this three-fold focus fundamentally enriches the community both literally and ontologically.

Another view here is provided by considering the argument at a wider scope. Would you prefer that health regulations/treatments be based on all scientific studies, including those where data is not fully released to the public (i.e., almost all of them, for privacy reasons)? Or would you prefer that health regulations/treatments be based only on data fully released to the public? Preferring the latter is equivalent to ignoring most scientific studies in making decisions.

The alternative to a compulsory approach is to take an additive view. The additive approach has a good track record amongst reviewing process changes.

  • When I was a graduate student, papers were not double blind. The community switched to double blind because it adds an opportunity for reviewers to review fairly and it gives authors a chance to have their work reviewed fairly whether they are junior or senior. As a community we also do not restrict posting on arxiv or talks about a paper before publication, because that would subtract from what authors can do. Double blind reviewing could be divisive, but it is not when used in this fashion.
  • When I was a graduate student, there was also a hard limit on the number of pages in submissions. For theory papers this meant that proofs were not included. We changed the review process to allow (but not require) submission of an appendix which could optionally be used by reviewers. This again adds to the options available to authors/reviewers and is generally viewed as positive by everyone involved.

What can we add to the community in terms of reproducibility?

  1. Can reviewers do a better job of reviewing if they have access to the underlying code or data?
  2. Can authors benefit from releasing code?
  3. Can readers of a paper benefit from an accompanying code release?

The answer to each of these questions is a clear ‘yes’ if done right.

For reviewers, it’s important to not overburden them. They may lack the computational resources, platform, or personal time to do a full reproduction of results even if that is possible. Hence, we should view code (and data) submission in the same way as an appendix which reviewers may delve into and use if they so desire.

For authors, code release has two benefits—it provides an additional avenue for convincing reviewers who default to skeptical and it makes followup work significantly more likely. My most cited paper was Isomap which did indeed come with a code release. Of course, this is not possible or beneficial for authors in many cases. Maybe it’s a theory paper where the algorithm isn’t the point? Maybe either data or code can’t be fully released since it’s proprietary? There are a variety of reasons. From this viewpoint we see that releasing code should be supported and encouraged but optional.

For readers, having code (and data) available obviously adds to the depth of value that a paper has. Not every reader will take advantage of that but some will and it enormously reduces the barrier to using a paper in many cases.

Let’s assume we do all of these additive and enabling things, which is about where Kamalika and Russ aimed the ICML policy this year.

Is there a need to go further towards compulsory code submission? I don’t yet see evidence that default-skeptical reviewers aren’t capable of weighing the value of reproducibility against other values in considering whether a paper should be published.

Should we do less than the additive and enabling things? I don’t see why—the additive approach provides pure improvements to the author/review/publish process. Not everyone is able to take advantage of this, but that seems like a poor reason to restrict others from taking advantage when they can.

One last thing to note is that this year’s code submission process is an experiment. We should all want program chairs to be able to experiment, because that is how improvements happen. We should do our best to work with such experiments, try to make a real assessment of success/failure, and expect adjustments for next year.

ICML 2019: Some Changes and Call for Papers

The ICML 2019 Conference will be held from June 10-15 in Long Beach, CA — about a month earlier than last year. To encourage reproducibility as well as high quality submissions, this year we have three major changes in place.

There is an abstract submission deadline on Jan 18, 2019. Only submissions with proper abstracts will be allowed to submit a full paper, and placeholder abstracts will be removed. The full paper submission deadline is Jan 23, 2019.

This year, the author list at the paper submission deadline (Jan 23) is final. No changes will be permitted after this date for accepted papers.

Finally, to foster reproducibility, we highly encourage code submission with papers. Our submission form will have space for two optional supplementary files — a regular supplementary manuscript, and code. Reproducibility of results and easy accessibility of code will be taken into account in the decision-making process.

Our full Call for Papers is available here.

Kamalika Chaudhuri and Ruslan Salakhutdinov
ICML 2019 Program Chairs

ICML Board and Reviewer profiles

The outcome of the election for the IMLS (which runs ICML) adds Emma Brunskill and Hugo Larochelle to the board. The current members of the board (and the reason for board membership) are:

President Elect is a 2-year position with little responsibility, but I decided to look into two things. One is the website, which seems relatively difficult to navigate. Ideas for how to improve it are welcome.

The other is creating a longitudinal reviewer profile. I keenly remember the day after reviews were due when I was program chair (in 2012): there was a panic-inducing number of unfinished reviews. To help with this, I’m planning to create a profile of reviewers which program chairs can refer to in making decisions about who to ask to review. There are a number of ways to do this wrong, which I’m avoiding with the following procedure:

  1. After reviews are assigned, capture the reviewer/paper assignment. Call this set A.
  2. After reviews are due, capture the completed & incomplete reviews for papers. Call these sets B & C respectively.
  3. Strip the paper ids from B (completed reviews), turning it into a multiset D of reviewers’ completed reviews.
  4. Compute C∩A (as a set intersection), then turn it into a multiset E of reviewers’ incomplete reviews.
  5. Store D & E for long term reference.
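
To make the bookkeeping concrete, here is a minimal Python sketch of the procedure above. It assumes assignments and reviews are available as (reviewer, paper_id) pairs; the function and variable names are purely illustrative and not part of any actual conference tooling.

```python
from collections import Counter

def build_reviewer_profile(assigned, completed, incomplete):
    """Turn review bookkeeping into per-reviewer multisets.

    assigned:   set of (reviewer, paper_id) pairs captured right after assignment (A)
    completed:  set of (reviewer, paper_id) pairs finished by the deadline (B)
    incomplete: set of (reviewer, paper_id) pairs unfinished at the deadline (C)
    """
    # D: strip the paper ids from the completed reviews, keeping only
    # how many reviews each reviewer finished.
    D = Counter(reviewer for reviewer, _ in completed)
    # E: count an unfinished review only if it was in the original assignment,
    # so papers assigned late (or reassigned) are never held against a reviewer.
    E = Counter(reviewer for reviewer, _ in (incomplete & assigned))
    return D, E

# Tiny usage example with made-up data.
A = {("alice", 1), ("bob", 1), ("bob", 2)}
B = {("alice", 1), ("bob", 1)}
C = {("bob", 2), ("carol", 3)}  # carol's paper 3 was assigned after A was captured
D, E = build_reviewer_profile(A, B, C)
print(D)  # alice: 1 completed, bob: 1 completed
print(E)  # bob: 1 incomplete; carol is not penalized for the late assignment
```

Because only the per-reviewer counts D and E are retained for the long term, the paper ids themselves never enter the stored record.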

This approach:

  • Is objectively defined. Approaches based on subjective measurements seem both fraught with judgment issues and inconsistent. Consider for example the impressive variation we all see in review quality.
  • Does not record a review as late for reviewers who are assigned a paper late in the process, via steps (1) and (4). We want to encourage reviewers to take on the unusual but important late tasks that arrive.
  • Does not record a review as late for reviewers who discover they are inappropriate after assignment and ask for reassignment. We want to encourage reviewers to look at their papers early and, if necessary, ask for a paper to be reassigned early.
  • Preserves anonymity of paper/reviewer assignments for authors who later become program chairs. The conversion into a multiset removes the paper id entirely.

Overall, my hope is that several years of this will provide a good and useful tool enabling program chairs and good (or at least not-bad) reviewers to recognize each other.

ICML is changing its constitution

Andrew McCallum has been leading an initiative to update the bylaws of IMLS, the organization which runs ICML. I expect most people aren’t interested in such details. However, the bylaws change rarely and can have an impact over a long period of time, so they do have some real importance. I’d like to hear comments from anyone with a particular interest before this year’s ICML.

In my opinion, the most important aspect of the bylaws is the at-large election of members of the board which is preserved. Most of the changes between the old and new versions are aimed at better defining roles, committees, etc… to leave IMLS/ICML better organized.

Anyways, please comment if you have a concern or thoughts.