Manifold-based dimension reduction algorithms share the following general outline.
Given: a metric d() and a set of points S
- Construct a graph with one node per point and an edge from each node to each of its k nearest neighbors. Weight each edge by the distance between the two points it connects.
- Digest the graph. This might include computing the shortest path between all pairs of points or figuring out how to linearly interpolate each point from its neighbors.
- Find a set of points in a low dimensional space which preserve the digested properties.
Examples include LLE, Isomap (which I worked on), Hessian-LLE, SDE, and many others. The hope with these algorithms is that they can recover the low dimensional structure of point sets in high dimensional spaces. Many of them can be shown to work in interesting ways producing various compelling pictures.
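The first two steps of the outline can be sketched in a few lines. This is a minimal illustration, not any particular published algorithm: the function names `knn_graph` and `shortest_paths`, the half-circle data, and the choice to treat edges as undirected are all assumptions made for the example. The digest used here is the shortest-path kind (as in Isomap); the final embedding step is omitted.

```python
import heapq
import math

def knn_graph(points, k, dist):
    """Step 1: link each point to its k nearest neighbors, weighted by distance."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        nearest = sorted(
            (dist(points[i], points[j]), j) for j in range(n) if j != i
        )[:k]
        for d, j in nearest:
            graph[i][j] = d
            graph[j][i] = d  # assumption: edges are treated as undirected
    return graph

def shortest_paths(graph, source):
    """Step 2 (shortest-path digest): Dijkstra distances from source along edges."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best.get(u, math.inf):
            continue  # stale queue entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < best.get(v, math.inf):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return best

# Hypothetical data: 21 points on a half circle, a 1-D manifold sitting in 2-D.
points = [(math.cos(math.pi * i / 20), math.sin(math.pi * i / 20)) for i in range(21)]
graph = knn_graph(points, k=2, dist=math.dist)
geodesic = shortest_paths(graph, 0)
```

On this data the two endpoints are a straight-line distance of 2 apart, while `geodesic[20]` comes out close to pi, the length of the arc. That gap between the metric d() and the distance along the graph is the property the third step tries to preserve in low dimensions.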
Despite doing some early work in this direction, I suffer from a motivational problem: Why do we want to recover the low dimensional structure? One answer is “for better data visualization”. This is compelling if you have data visualization problems. However, I don’t. I want to make machines that can better predict the future, which generally appears to be a sound goal of learning. Reducing the dimensionality of a dataset is not obviously helpful in accomplishing this. In fact, doing so violates one of the basic intuitions of applied learning algorithms: “avoid double approximation”. (One approximation = the projection into the low dimensional space, another approximation = the classifier learned on that space.)
Another answer is “for robots”. Several people have experimented with using a vision sensor and a dimension reduction technique in an attempt to extract the manifold of pose space. These attempts have not generally worked well, basically because the Euclidean distance on pixels is not particularly good at predicting which things are “nearby”. However, we might be able to do considerably better if we learn the distance. At the 1-bit level, we might learn a predictor from image pairs to “nearby” or “far”. Any stream S of images i_1, i_2, i_3, …, i_n can be transformed into a binary problem according to:
{((i_j, i_k), 1 - I(j = k+1 or k = j+1)) : i_j, i_k in S}
where I() is the indicator function, so adjacent pairs get label 0 (“nearby”) and all other pairs get label 1 (“far”). In unmath: “the binary problem formed by predicting whether images are adjacent in the chain of experience”. (*) A good solution to this binary problem would give us an interesting 1-bit metric. Using regression and counting numbers of transitions might provide a more conventional multibit metric.
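Constructing this binary problem from a stream is mechanical. The sketch below is one way to do it, with a hypothetical function name `adjacency_problem`; as a simplifying assumption it emits each unordered pair of distinct positions once, rather than every ordered pair in S.

```python
from itertools import combinations

def adjacency_problem(stream):
    """Label each image pair 0 if adjacent in the stream ("nearby"), else 1 ("far")."""
    examples = []
    for j, k in combinations(range(len(stream)), 2):
        adjacent = abs(j - k) == 1  # I(j = k+1 or k = j+1)
        examples.append(((stream[j], stream[k]), 0 if adjacent else 1))
    return examples

# Hypothetical stream: strings stand in for images.
stream = ["img_a", "img_b", "img_c", "img_d"]
examples = adjacency_problem(stream)
```

A stream of n images yields n-1 “nearby” pairs and on the order of n^2 “far” pairs, which is why the positive/negative ratio may need adjusting before training.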
This metric, if well solved, has a concrete meaning: the minimum distance in terms of actuator transitions between positions. A shortest path in this space is a sequence of actuator movements leading from a position A to a position B. A projection of this space into low dimensions provides some common format which both the human and the robot can understand. Commanding the robot to go to some location is just a matter of pointing out that location in the low dimensional projection.
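If the learned metric really does count actuator transitions, then planning reduces to ordinary shortest-path search. As a rough sketch, assume a hypothetical table `transitions` mapping each position to the positions reachable by one actuator movement; breadth-first search then returns a minimum-length sequence of movements. The position and action names are invented for the example.

```python
from collections import deque

def plan(transitions, start, goal):
    """Breadth-first search for the fewest actuator movements from start to goal."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        position = queue.popleft()
        if position == goal:
            actions = []
            # Walk back to the start, collecting the action taken at each step.
            while parents[position] is not None:
                position, action = parents[position]
                actions.append(action)
            return actions[::-1]
        for action, nxt in transitions.get(position, {}).items():
            if nxt not in parents:
                parents[nxt] = (position, action)
                queue.append(nxt)
    return None  # goal is unreachable from start

# Hypothetical transition table: position -> {action: next position}.
transitions = {
    "A": {"forward": "B", "left": "C"},
    "B": {"forward": "D"},
    "C": {"forward": "B"},
    "D": {},
}
route = plan(transitions, "A", "D")
```

Here `route` is a list of action names; its length is the transition-count distance between the two positions.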
This is a possible use for manifold-based dimension reduction techniques which I find compelling, if it works out. (Anyone interested in playing with this should talk to Dana Wilkinson, who is considering experimenting with this approach.)
(*) We probably would want to tweak the positive/negative ratio to reflect the pattern encountered in usage.
(**) Post tweaked to fix an oversight.