I thought this was a very good NIPS with many excellent papers. The following are a few NIPS papers which I liked and I hope to study more carefully when I get the chance. The list is not exhaustive and in no particular order…

- Preconditioner Approximations for Probabilistic Graphical Models.

Pradeep Ravikumar and John Lafferty.

I thought the use of preconditioner methods from solving linear systems in the context of approximate inference was novel and interesting. The results look good and I’d like to understand the limitations.

- Rodeo: Sparse nonparametric regression in high dimensions.

John Lafferty and Larry Wasserman.

A very interesting approach to feature selection in nonparametric regression from a frequentist framework. The use of lengthscale variables in each dimension reminds me a lot of ‘Automatic Relevance Determination’ in Gaussian process regression; it would be interesting to compare Rodeo to ARD in GPs.

- Interpolating between types and tokens by estimating power law generators.

Goldwater, S., Griffiths, T. L., & Johnson, M.

I had wondered how Chinese restaurant processes and Pitman-Yor processes related to Zipf’s plots and power laws for word frequencies. This paper seems to have the answers.

- A Bayesian spatial scan statistic.

Daniel B. Neill, Andrew W. Moore, and Gregory F. Cooper.

When I first learned about spatial scan statistics I wondered what a Bayesian counterpart would be. I liked the fact that their method was simple, more accurate, and much *faster* than the usual frequentist method.

- Q-Clustering.

M. Narasimhan, N. Jojic and J. Bilmes.

A very interesting application of sub-modular function optimization to clustering. This feels like a hot area.

- Worst-Case Bounds for Gaussian Process Models.

Sham M. Kakade, Matthias W. Seeger, & Dean P. Foster.

It’s useful for Gaussian process practitioners to know that their approaches don’t do silly things when viewed from a worst-case frequentist setting. This paper provides some relevant theoretical results.

Yesterday we were discussing some NIPS papers in our lab and I mentioned the Q-clustering paper. My colleague Jens remarked that he thought the maximum separation criterion clustering (which the paper mostly focuses on) was exactly “single-linkage” clustering, for which a well-known algorithm exists (based on the minimum spanning tree). After giving it some thought I must say that I am now also convinced it is the same thing (although I may have overlooked something due to my limited knowledge of the clustering area; if this is indeed true, I suspect it is a well-known characterization of single-linkage clustering).

For the record, single linkage clustering can be constructed as follows:

1. Initialize with each point being one cluster

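2. Merge the two clusters at minimum distance among all cluster pairs (the distance between two clusters being the smallest distance between a point of one and a point of the other), then iterate.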

(NB: complexity is O(n^3))

Doesn’t this exactly produce a sequence of maximum separation clusterings?
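As a sanity check rather than a proof, the construction above can be sketched in Python and compared against a brute-force search on a handful of points. Everything here is an assumption made for illustration: the function names are hypothetical, distances are Euclidean, and "separation" is taken to mean the smallest distance between two points that lie in different clusters. The brute-force search enumerates every labelling, so it is only usable for very small inputs.

```python
import itertools
import math


def cluster_distance(points, a, b):
    """Single-linkage distance: smallest distance between a point of a and a point of b."""
    return min(math.dist(points[i], points[j]) for i in a for j in b)


def single_linkage(points):
    """Naive single-linkage agglomeration.

    Returns one clustering per level, from n singleton clusters down to a
    single cluster. Each clustering is a list of frozensets of point indices.
    """
    clusters = [frozenset([i]) for i in range(len(points))]
    levels = [list(clusters)]
    while len(clusters) > 1:
        # Step 2: pick the pair of clusters at minimum distance and merge it.
        a, b = min(
            itertools.combinations(clusters, 2),
            key=lambda pair: cluster_distance(points, pair[0], pair[1]),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        levels.append(list(clusters))
    return levels


def separation(points, clusters):
    """Smallest distance between two points lying in different clusters."""
    return min(
        cluster_distance(points, a, b)
        for a, b in itertools.combinations(clusters, 2)
    )


def best_separation(points, k):
    """Brute force: largest separation over all partitions into exactly k clusters."""
    n = len(points)
    best = 0.0
    for labels in itertools.product(range(k), repeat=n):
        if len(set(labels)) != k:
            continue  # skip labellings that leave some cluster empty
        clusters = [
            frozenset(i for i in range(n) if labels[i] == c) for c in range(k)
        ]
        best = max(best, separation(points, clusters))
    return best


def agrees_with_brute_force(points):
    """True if every single-linkage level (k >= 2) attains the best separation."""
    return all(
        math.isclose(
            separation(points, clusters), best_separation(points, len(clusters))
        )
        for clusters in single_linkage(points)[:-1]
    )
```

Calling `agrees_with_brute_force` on a short list of coordinate tuples compares each level of the single-linkage hierarchy with the best achievable separation for the same number of clusters. Agreement on a toy example is only evidence for the equivalence, not an argument for it.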