Here are a few other papers I enjoyed from ICML06.
Topic Models:
Dynamic Topic Models
David Blei, John Lafferty
A nice model for how topics in LDA-type models can evolve over time,
using a linear dynamical system on the natural parameters and a very
clever structured variational approximation (in which the mean-field
parameters are pseudo-observations of a virtual LDS). Like all Blei
papers, he makes it look easy, but it is extremely impressive.
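(For concreteness, here is a tiny generative sketch of the drift idea as I understand it, not the paper's inference: each topic's natural parameters follow a Gaussian random walk, and the word distribution at each epoch is the softmax of those parameters. All constants and variable names below are mine, not the paper's.)

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, n_topics, n_epochs, sigma = 1000, 5, 10, 0.05  # illustrative sizes only

# natural parameters beta[t, k, :] for topic k at epoch t
beta = np.zeros((n_epochs, n_topics, vocab))
beta[0] = rng.normal(scale=1.0, size=(n_topics, vocab))
for t in range(1, n_epochs):
    # linear-Gaussian drift on the natural parameters
    beta[t] = beta[t - 1] + rng.normal(scale=sigma, size=(n_topics, vocab))

# word distributions are softmax(beta): they change smoothly across epochs
phi = np.exp(beta - beta.max(axis=-1, keepdims=True))
phi /= phi.sum(axis=-1, keepdims=True)
```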
Pachinko Allocation
Wei Li, Andrew McCallum
A very elegant (but computationally challenging) model that induces
correlations among topics using a multi-level DAG whose interior nodes
are “super-topics” and “sub-topics” and whose leaves are the
vocabulary words. It makes the slumbering monster of structure learning stir.
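(To make the DAG picture concrete, here is a rough generative sketch of a four-level version, written from my reading rather than taken from the paper: the root and each super-topic hold per-document multinomials, and only the sub-topics emit words. All hyperparameters below are placeholders.)

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, n_super, n_sub, doc_len = 50, 3, 8, 20          # illustrative sizes only

phi = rng.dirichlet(np.full(vocab, 0.1), size=n_sub)   # sub-topic word distributions

def generate_document():
    theta_root = rng.dirichlet(np.full(n_super, 1.0))                # doc's mix over super-topics
    theta_super = rng.dirichlet(np.full(n_sub, 1.0), size=n_super)   # per-super-topic mix over sub-topics
    words = []
    for _ in range(doc_len):
        s = rng.choice(n_super, p=theta_root)            # pick a super-topic
        z = rng.choice(n_sub, p=theta_super[s])          # then a sub-topic under it
        words.append(rng.choice(vocab, p=phi[z]))        # then a word from the sub-topic
    return words

doc = generate_document()
```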
Sequence Analysis (I missed these talks since I was chairing another session)
Online Decoding of Markov Models under Latency Constraints
Mukund Narasimhan, Paul Viola, Michael Shilman
An “ah-ha!” paper showing how to trade off latency and decoding
accuracy when doing MAP labelling (Viterbi decoding) in sequential
Markovian models. You’ll wish you had thought of this yourself.
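(To illustrate the trade-off, and not the paper's actual algorithm: a toy online decoder that commits to the label at time t - d once it has seen observation t, by backtracking the current best prefix path. Large d recovers full Viterbi; small d means low latency but possibly regretted commitments.)

```python
import numpy as np

def online_viterbi(log_pi, log_A, log_emit, d):
    """Online MAP decoding with fixed lookahead d (assumes 1 <= d < T).
    log_pi: (K,) initial log-probs, log_A: (K, K) transition log-probs,
    log_emit: (T, K) emission log-probs for the observed sequence."""
    T, K = log_emit.shape
    delta = log_pi + log_emit[0]              # best prefix score ending in each state
    back = np.zeros((T, K), dtype=int)        # back[t, j] = best predecessor of state j at time t
    labels = []
    for t in range(1, T):
        scores = delta[:, None] + log_A       # scores[i, j]: end in i at t-1, move to j
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_emit[t]
        if t >= d:                            # commit the label at time t - d
            state = int(delta.argmax())
            for s in range(t, t - d, -1):     # backtrack d steps from the current best state
                state = int(back[s, state])
            labels.append(state)
    state = int(delta.argmax())               # flush the last d labels with a final backtrack
    tail = [state]
    for s in range(T - 1, T - d, -1):
        state = int(back[s, state])
        tail.append(state)
    return labels + tail[::-1]
```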
Efficient Inference on Sequence Segmentation Models
Sunita Sarawagi
A smart way to re-represent potentials in segmentation models
to reduce the complexity of inference from cubic in the input sequence
length to linear. Also check out her NIPS 2004 paper with William Cohen
on “segmentation CRFs”. Moral of the story: segmentation is NOT just
sequence labelling.
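(To see why segmentation is not just labelling, here is the naive segment-level Viterbi recursion, not Sarawagi's faster representation: potentials score whole segments, so the DP maximizes over segment boundaries as well as labels, and that is exactly what blows up the naive complexity. Function and variable names are mine.)

```python
import numpy as np

def segment_viterbi_score(seg_score, trans, T, K, max_len):
    """Best score of any segmentation of x[0:T].
    seg_score(start, end, y): score of labelling the segment x[start:end] with label y.
    trans[y_prev, y]: label-to-label transition score between adjacent segments.
    Naive cost is O(T * max_len * K^2); the paper's point is avoiding this blow-up."""
    V = np.full((T + 1, K), -np.inf)          # V[j, y]: best segmentation of x[:j] ending in label y
    for j in range(1, T + 1):
        for start in range(max(0, j - max_len), j):
            for y in range(K):
                prev = 0.0 if start == 0 else (V[start] + trans[:, y]).max()
                s = prev + seg_score(start, j, y)
                if s > V[j, y]:
                    V[j, y] = s
    return V[T].max()
```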
Optimal Partitionings/Labellings
The uniqueness of a good optimum for K-means
Marina Meila
Marina shows a stability result for K-means clustering, namely
that if you find a “good” clustering it is not too “different” from the
(unknowable) optimal clustering, and that all other good clusterings
are “near” it. So don’t worry about local minima in K-means as long
as you get a low objective value.
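(For reference, and in my own notation rather than Marina's: the “objective” here is the usual K-means distortion, and the result says that any clustering with distortion close to the minimum is close, as a partition, to the optimal one.)

```latex
\mathrm{cost}(C) \;=\; \sum_{k=1}^{K} \sum_{x_i \in C_k} \left\| x_i - \mu_k \right\|^2,
\qquad
\mu_k \;=\; \frac{1}{|C_k|} \sum_{x_i \in C_k} x_i .
```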
Quadratic Programming Relaxations for Metric Labeling and Markov Random Field MAP Estimation
Pradeep Ravikumar, John Lafferty
Pradeep and John introduce QP relaxations for the problem of finding
the best joint labelling of a set of points (connected by a weighted
graph and with a known metric cost between labels, also extended to
the non-metric case). Surprisingly, they show that the QP relaxation
is both computationally more attractive and more accurate than
the “natural” LP relaxation or loopy BP approximations.
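(Roughly, and in my notation, so treat this as a sketch of the formulation rather than the paper's exact statement: keep a distribution \mu_u over labels at each node, score nodes by their potentials \theta_u and edges by products of node marginals, and optimize over the simplices.)

```latex
\min_{\mu}\;\;
\sum_{u \in V} \sum_{i} \theta_u(i)\,\mu_u(i)
\;+\; \sum_{(u,v) \in E} \sum_{i,j} \theta_{uv}(i,j)\,\mu_u(i)\,\mu_v(j)
\qquad \text{s.t.} \quad \sum_i \mu_u(i) = 1,\;\; \mu_u(i) \ge 0 \;\; \forall u, i .
```

The “natural” LP relaxation instead introduces separate edge variables \mu_{uv}(i,j) tied to the node variables by marginalization constraints, which makes for a much larger program.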
Distinguished Paper Award Winners
How Boosting the Margin Can Also Boost Classifier Complexity
Lev Reyzin, Robert Schapire
Trading Convexity for Scalability
Ronan Collobert, Fabian Sinz, Jason Weston, Leon Bottou
Looping Suffix Tree-Based Inference of Partially Observable Hidden State
Michael Holmes, Charles Isbell
Just an additional comment: there was a very interesting-looking paper by Mike Jordan and Barbara Engelhardt
http://www.icml2006.org/icml_documents/camera-ready/038_A_Graphical_Model_fo.pdf which unfortunately
overlapped with Marina’s talk. Related to Marina’s talk was a paper by Fernando and Takeo
http://www.icml2006.org/icml_documents/camera-ready/031_Discriminative_Clust.pdf.
Linli Xu had a paper following up on the great stuff from the NIPS workshops
(convex HMMs!): http://www.icml2006.org/icml_documents/camera-ready/133_Discriminative_Unsup.pdf
Also on the clustering front, overlapping with Blei’s talk, was this paper by Arik and Zoubin:
http://www.icml2006.org/icml_documents/camera-ready/008_A_New_Approach_to_Da.pdf