Here are a few other papers I enjoyed from ICML06.

Topic Models:

Dynamic Topic Models

David Blei, John Lafferty

A nice model for how topics in LDA-type models can evolve over time,

using a linear dynamical system on the natural parameters and a very

clever structured variational approximation (in which the mean field

parameters are pseudo-observations of a virtual LDS). As in all of his

papers, Blei makes it look easy, but it is extremely impressive.
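
To make the "linear dynamical system on the natural parameters" idea concrete, here is a minimal sketch of how one topic could drift over time. It assumes the simplest possible dynamics, a Gaussian random walk with a single variance, and maps the natural parameters to a word distribution with a softmax. The names (evolve_topic, sigma) are hypothetical, and the variational inference from the paper is not shown.

```python
import math
import random

def softmax(beta):
    """Map natural parameters to a probability distribution over words."""
    m = max(beta)  # subtract the max for numerical stability
    exps = [math.exp(b - m) for b in beta]
    z = sum(exps)
    return [e / z for e in exps]

def evolve_topic(beta0, steps, sigma, rng):
    """Random-walk the natural parameters for `steps` transitions.

    Returns steps + 1 word distributions, one per time slice.
    """
    beta = list(beta0)
    slices = [softmax(beta)]
    for _ in range(steps):
        # Assumed dynamics: beta_t = beta_{t-1} + Gaussian noise
        beta = [b + rng.gauss(0.0, sigma) for b in beta]
        slices.append(softmax(beta))
    return slices

# Usage: a 5-word vocabulary drifting over 3 transitions.
rng = random.Random(0)
for dist in evolve_topic([0.0] * 5, steps=3, sigma=0.5, rng=rng):
    print([round(p, 3) for p in dist])
```

Working in the natural parameters rather than on the simplex is what lets the noise be Gaussian; the softmax takes care of normalisation at every time slice.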

Pachinko Allocation

Wei Li, Andrew McCallum

A very elegant (but computationally challenging) model which induces

correlation amongst topics using a multi-level DAG whose interior nodes

are “super-topics” and “sub-topics” and whose leaves are the

vocabulary words. Makes the slumbering monster of structure learning stir.

Sequence Analysis (I missed these talks since I was chairing another session)

Online Decoding of Markov Models with Latency Constraints

Mukund Narasimhan, Paul Viola, Michael Shilman

An “ah-ha!” paper showing how to trade off latency and decoding

accuracy when doing MAP labelling (Viterbi decoding) in sequential

Markovian models. You’ll wish you had thought of this yourself.
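
For reference, here is a minimal sketch of plain Viterbi decoding, the full-latency baseline: it cannot emit any label until the whole sequence has been seen, which is exactly the cost the paper trades against accuracy. This is not the paper's latency-constrained algorithm; the function and the toy two-state model are illustrative assumptions, and the sequence is assumed to be non-empty.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """MAP state sequence for an HMM; all scores are log-probabilities."""
    # best[t][s]: score of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []
    for o in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][o]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # The backtrace starts only after the last observation: full latency.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

# Toy model: two sticky states, each favouring one symbol.
states = ["A", "B"]
log_start = {s: math.log(0.5) for s in states}
log_trans = {"A": {"A": math.log(0.9), "B": math.log(0.1)},
             "B": {"A": math.log(0.1), "B": math.log(0.9)}}
log_emit = {"A": {"x": math.log(0.9), "y": math.log(0.1)},
            "B": {"x": math.log(0.1), "y": math.log(0.9)}}
print(viterbi(["x", "x", "y", "y"], states, log_start, log_trans, log_emit))
# ['A', 'A', 'B', 'B']
```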

Efficient inference on sequence segmentation models

Sunita Sarawagi

A smart way to re-represent potentials in segmentation models

to reduce the complexity of inference from cubic in the input sequence length

to linear. Also check out her NIPS2004 paper with William Cohen

on “segmentation CRFs”. Moral of the story: segmentation is NOT just

sequence labelling.

Optimal Partitionings/Labellings

The uniqueness of a good optimum for K-means

Marina Meila

Marina shows a stability result for K-means clustering, namely

that if you find a “good” clustering it is not too “different” from the

(unknowable) optimal clustering and that all other good clusterings

are “near” it. So, don’t worry about local minima in K-means as long

as you get a low objective.
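
The "objective" here is the usual K-means distortion. Below is a minimal sketch of computing it for a given labelling, so that runs from different initialisations can be compared on the same footing. It is illustrative only: the function name is hypothetical, and the stability bound from the paper is not implemented.

```python
def kmeans_objective(points, labels):
    """Sum of squared distances from each point to its cluster centroid."""
    clusters = {}
    for p, k in zip(points, labels):
        clusters.setdefault(k, []).append(p)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total

# Two well-separated pairs of points, labelled well and badly.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
print(kmeans_objective(points, [0, 0, 1, 1]))  # 4.0
print(kmeans_objective(points, [0, 1, 0, 1]))  # 100.0
```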

Quadratic Programming Relaxations for Metric Labeling and Markov Random Field MAP Estimation

Pradeep Ravikumar, John Lafferty

Pradeep and John introduce QP relaxations for the problem of finding

the best joint labelling of a set of points (connected by a weighted

graph and with a known metric cost between labels, later extended to

the non-metric case). Surprisingly, they show that the QP relaxation

is both computationally more attractive and more accurate than

the “natural” LP relaxation or than loopy BP approximations.

Distinguished Paper Award Winners

How Boosting the Margin Can Also Boost Classifier Complexity

Lev Reyzin, Robert Schapire

Trading Convexity for Scalability

Ronan Collobert, Fabian Sinz, Jason Weston, Leon Bottou

Looping Suffix Tree-Based Inference of Partially Observable Hidden State

Michael Holmes, Charles Isbell

Just an additional comment: there was a very interesting-looking paper by Mike Jordan and Barbara Engelhardt

http://www.icml2006.org/icml_documents/camera-ready/038_A_Graphical_Model_fo.pdf which unfortunately

overlapped with Marina’s talk. Related to Marina’s talk was a paper by Fernando and Takeo

http://www.icml2006.org/icml_documents/camera-ready/031_Discriminative_Clust.pdf.

Linli Xu had a paper, following up on the great stuff from the NIPS workshops

(convex HMMs!) http://www.icml2006.org/icml_documents/camera-ready/133_Discriminative_Unsup.pdf

Also on the clustering front, overlapping with Blei’s talk was this paper by Arik and Zoubin

http://www.icml2006.org/icml_documents/camera-ready/008_A_New_Approach_to_Da.pdf