I’ve been looking at some recent embeddings work, and am struck by how beautiful the theory and algorithms are. It also makes me wonder, what are embeddings good for?
A few things immediately come to mind (rough code sketches of each follow the list):
(1) For visualization of high-dimensional data sets.
In this case, one would like good algorithms for embedding specifically into 2- and 3-dimensional Euclidean spaces.
(2) For nonparametric modeling.
The usual nonparametric models (histograms, nearest neighbor) often require resources which are exponential in the dimension. So if the data actually lie close to some low-dimensional
surface, it might be a good idea to first identify this surface and embed the data before applying the model.
Incidentally, for applications like these, it’s important to have a functional mapping from high to low dimension, which some techniques do not provide.
(3) As a prelude to classifier learning.
The hope here is presumably that learning will be easier in the low-dimensional space, because of (i) better generalization and (ii) a more “natural” layout of the data.
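As a rough sketch of (1), assuming scikit-learn and matplotlib are available (Isomap and the swiss-roll data are my own placeholder choices, not anything prescribed above):

```python
# Sketch of use (1): embed high-dimensional data into 2D purely for plotting.
import matplotlib.pyplot as plt
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

X, color = make_swiss_roll(n_samples=1000, noise=0.05)        # toy "high-dimensional" data
X2 = Isomap(n_neighbors=10, n_components=2).fit_transform(X)  # 2D coordinates

plt.scatter(X2[:, 0], X2[:, 1], c=color, s=5)
plt.title("2D embedding for visualization")
plt.show()
```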
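For (2), a minimal sketch of the embed-then-model idea, again assuming scikit-learn; PCA stands in for the embedding (it is one of the methods that does give a functional mapping for new points), and a kernel density estimate stands in for the nonparametric model:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KernelDensity

rng = np.random.RandomState(0)
A = rng.randn(3, 100)                                  # hidden 3D structure in 100 dimensions
X = rng.randn(500, 3) @ A + 0.01 * rng.randn(500, 100)

embed = PCA(n_components=3).fit(X)                     # functional mapping: applies to new points
kde = KernelDensity(bandwidth=0.5).fit(embed.transform(X))  # nonparametric model in low dimension

x_new = rng.randn(1, 3) @ A                            # a fresh high-dimensional point
print(kde.score_samples(embed.transform(x_new)))       # log-density evaluated via the mapping
```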
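And for (3), a sketch of embedding as a preprocessing step before a classifier (digits, Isomap, and nearest neighbors are arbitrary illustrative choices):

```python
from sklearn.datasets import load_digits
from sklearn.manifold import Isomap
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)                    # 64-dimensional inputs

clf_full = KNeighborsClassifier(n_neighbors=5)
clf_embedded = make_pipeline(Isomap(n_neighbors=10, n_components=10),
                             KNeighborsClassifier(n_neighbors=5))

print("full space:", cross_val_score(clf_full, X, y, cv=5).mean())
print("embedded:  ", cross_val_score(clf_embedded, X, y, cv=5).mean())
```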
I’d be curious to know of other uses for embeddings.
There are some problems for which projecting onto a lower-dimensional manifold is a good choice (“… learning will be easier in the low-dimensional space …”), and others for which transforming the data into a higher-dimensional space makes them linearly separable. It seems this depends not so much on the specific problem as on our assumptions about which topology is easier to classify with the methods in our toolbox. I am curious to know more about this issue.
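To make the two directions concrete, here is a small sketch (assuming scikit-learn; the concentric-circles data, kernel PCA, and RBF kernel are my own illustrative choices, not anything the commenter specified). The same data can be attacked by going “up” with a kernel or “down” with a nonlinear embedding followed by a linear model:

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

X, y = make_circles(n_samples=400, factor=0.3, noise=0.05)

# "Up": an RBF-kernel SVM works implicitly in a high-dimensional feature space.
up = SVC(kernel="rbf", gamma=10.0)

# "Down": a nonlinear 2D embedding (kernel PCA) followed by a linear classifier.
down = make_pipeline(KernelPCA(n_components=2, kernel="rbf", gamma=10.0),
                     LogisticRegression())

print("kernel SVM:        ", cross_val_score(up, X, y, cv=5).mean())
print("embed then linear: ", cross_val_score(down, X, y, cv=5).mean())
```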
Of course, the original use of embeddings was none of the above :) They started out as a tool for approximating certain NP-hard problems (like sparsest cut).
I have seen several people have problems with (3). One reason you might expect problems is successive approximation: first you approximate the manifold, and then you approximate the decision boundary, so the errors of the two stages compound.
Michael Littman has been thinking about (2) (although I am unsure of his success).
I’ve used dimension reduction for density estimation for human poses. In that particular problem, estimating a multimodal density from a small set of data points in the full, high-dimensional space was not practical, but a low-dimensional embedding (and a Bayesian model) made it tractable. Specifically, we used the GPLVM. It doesn’t have the lovely closed-form results of the other embedding methods, but it seems much more useful, at least for the kinds of problems I’m interested in.
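A rough sketch of the shape of that pipeline, with loud caveats: the GPLVM itself is not implemented here; PCA is a crude stand-in for the learned mapping, a Gaussian mixture stands in for the Bayesian density model, and the “pose” data is synthetic. It only illustrates embed-then-estimate-a-multimodal-density, assuming scikit-learn:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

rng = np.random.RandomState(0)
# Pretend these are 60-dimensional pose vectors, only a few hundred of them,
# clustered around a handful of typical poses.
centers = rng.randn(4, 60)
poses = np.vstack([c + 0.1 * rng.randn(80, 60) for c in centers])

embed = PCA(n_components=3).fit(poses)            # stand-in for the GPLVM mapping
latent = embed.transform(poses)

# A multimodal density in 3 dimensions is easy to fit from 320 points;
# a full 60-dimensional mixture from the same data would be badly underdetermined.
density = GaussianMixture(n_components=4, covariance_type="full").fit(latent)

new_pose = centers[0] + 0.1 * rng.randn(60)
print(density.score_samples(embed.transform(new_pose[None, :])))   # log-density of a new pose
```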
I have been working on the subject for a while. I think a good interpretation of manifold learning is abstraction. It abstracts concepts present in a data set.