Patterns for research in machine learning

There are a handful of basic code patterns that I wish I had been more aware of when I started research in machine learning. Each on its own may seem pointless, but collectively they go a long way towards making the typical research workflow more efficient. Here they are, with rough sketches for several of them after the list:

  1. Separate code from data.
  2. Separate input data, working data and output data.
  3. Save everything to disk frequently.
  4. Separate options from parameters.
  5. Do not use global variables.
  6. Record the options used to generate each run of the algorithm.
  7. Make it easy to sweep options.
  8. Make it easy to execute only portions of the code.
  9. Use checkpointing.
  10. Write demos and tests.
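To make a few of these concrete, here is a minimal Python sketch of items 1, 2, 4 and 6 together. The paths and names are hypothetical rather than prescribed by any library: the code repository contains no data, data lives in separate input, working and output directories, options (settings chosen by hand) are kept apart from parameters (values the algorithm learns), and the options for every run are written to disk next to that run's results.

```python
import json
import time
import uuid
from pathlib import Path

# Item 1: the code repository contains no data; everything lives under a
# separate data directory (all paths here are hypothetical).
INPUT_DIR = Path("data/input")      # item 2: raw input data, treated as read-only
WORKING_DIR = Path("data/working")  # item 2: intermediate artifacts, safe to delete
OUTPUT_DIR = Path("data/output")    # item 2: final results, one subdirectory per run

def run_experiment(options):
    """Run one experiment and record exactly which options produced it."""
    # A fresh, uniquely named directory per run.
    run_dir = OUTPUT_DIR / f"{time.strftime('%Y%m%d-%H%M%S')}-{uuid.uuid4().hex[:6]}"
    run_dir.mkdir(parents=True, exist_ok=True)

    # Item 4: options (e.g. learning rate, chosen by hand) are kept separate
    # from parameters (e.g. weights, which the algorithm learns).
    # Item 6: write the options to disk before doing any work, so that even a
    # crashed run leaves a record of its configuration.
    with open(run_dir / "options.json", "w") as f:
        json.dump(options, f, indent=2)

    # ... load data from INPUT_DIR, train, and write the learned parameters
    # and results into run_dir ...
    return run_dir

if __name__ == "__main__":
    run_experiment({"learning_rate": 0.01, "num_iterations": 1000, "seed": 0})
```

Once every run records its own options like this, item 7 becomes a short loop over option combinations (reusing the hypothetical run_experiment above):

```python
from itertools import product

learning_rates = [0.1, 0.01, 0.001]
seeds = [0, 1, 2]

# One recorded run per combination of option values.
for lr, seed in product(learning_rates, seeds):
    run_experiment({"learning_rate": lr, "num_iterations": 1000, "seed": seed})
```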

Click here for discussion and examples for each item. Also see Charles Sutton’s and HackerNews’ thoughts on the same topic.
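Items 3 and 9 usually come down to the same mechanism: frequently write enough state to disk that an interrupted run can be resumed rather than restarted, and the same habit is what makes item 8 practical, since finished stages can be loaded instead of recomputed. A minimal checkpointing sketch, again with hypothetical names:

```python
import pickle
from pathlib import Path

# Hypothetical checkpoint location inside the working directory.
CHECKPOINT = Path("data/working/checkpoint.pkl")

def train(num_iterations, checkpoint_every=100):
    # Item 9: resume from the latest checkpoint if one exists.
    if CHECKPOINT.exists():
        with open(CHECKPOINT, "rb") as f:
            start, params = pickle.load(f)
    else:
        start, params = 0, {"w": 0.0}

    for i in range(start, num_iterations):
        params["w"] += 0.001  # stand-in for a real update step

        # Item 3: periodically save everything needed to resume,
        # here just the iteration count and the current parameters.
        if (i + 1) % checkpoint_every == 0:
            CHECKPOINT.parent.mkdir(parents=True, exist_ok=True)
            with open(CHECKPOINT, "wb") as f:
                pickle.dump((i + 1, params), f)

    return params

if __name__ == "__main__":
    train(1000)
```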

My guess is that these patterns will be useful not only for machine learning, but also for any other computational work that involves either a) processing large amounts of data, or b) algorithms that take a significant amount of time to execute. Share this list with your students and colleagues. Trust me, they'll appreciate it.

ICML survey and comments

Just about nothing could keep me from attending ICML, except for Dora, who arrived on Monday. Consequently, I have only secondhand reports that the conference is going well.

For those who are remote (like me) or reading after the conference (like everyone), Mark Reid has set up the ICML discussion site where you can comment on any paper or subscribe to papers. Authors are automatically subscribed to their own papers, so it should be possible to have a discussion significantly after the fact, as people desire.

We also conducted a survey before the conference and now have the survey results. These can be compared with the ICML 2010 survey results. For the questions that are comparable across years, we can sometimes order the answers and assign them scores ranging from 0 to 3 or 0 to 4, with 3 or 4 being best and 0 worst, and then compute the average difference between 2012 and 2010.
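As a toy illustration of that comparison (the response counts below are made up, not the actual survey data): each answer choice gets a score, the scores are averaged weighted by how many respondents chose each answer, and the 2010 average is subtracted from the 2012 average.

```python
def average_score(counts):
    """counts[i] = number of respondents who chose the answer scored i."""
    return sum(score * n for score, n in enumerate(counts)) / sum(counts)

# Hypothetical counts for one question, answers scored 0 (worst) to 3 (best).
counts_2010 = [5, 20, 40, 35]
counts_2012 = [4, 18, 42, 36]

# Prints 0.05: on this made-up question, 2012 averages slightly higher than 2010.
print(round(average_score(counts_2012) - average_score(counts_2010), 2))
```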

Glancing through them, I see:

  1. Most people found the papers they reviewed a good fit for their expertise (-.037 w.r.t. 2010). Achieving this was one of our subgoals in the pursuit of high-quality decisions.
  2. Most people had sufficient time for doing reviews. This was something we worried about significantly when shifting the paper deadline and otherwise massaging the schedule. Most people also thought the review period was sufficiently long and that most reviews were high quality (+.023 w.r.t. 2010).
  3. About 1/4 of reviewers say that author response changed their mind on a paper, and 2/3 of reviewers say that discussion changed their mind on a paper. The expected decision impact of author response is reduced from 2010 (-.135). The existence of author response is overwhelmingly preferred.
  4. People generally found ICML reviewing the same as or better than previous ICMLs (+.35 w.r.t. 2010) and other similar conferences (+.198 w.r.t. 2010), at the cost of being somewhat more work. A substantial bump in reviewing quality was a primary goal.
  5. The ACs spent substantially more time (43 hours on average) than PC members (28 hours on average). This agrees with our expectation, since the set of ACs didn't change even though we had a 50% increase in submissions. The AC load we had this year was probably too high and will need to be reduced somewhat for next year.
  6. 2/3 of authors prefer the option to revise a paper during author response.
  7. The choice of how to deal with increased submissions is deeply undecided, with a slight preference for the short talk plus poster format that we used.
  8. Most people like having two workshop days or don’t care.
  9. There is a strong preference for colocation with COLT and UAI, with the next tier of preference being IJCAI, KDD, AAAI, and CVPR.