This hashing trick is implemented in Vowpal Wabbit as the core representation, as well as Torch and Stream, part of Elefant. It is particularly useful for online learning algorithms consuming large amounts of data.
Qinfeng Shi, James Petterson, Gideon Dror, John Langford, Alex Smola, and SVN Vishwanathan, Hash Kernels for Structured Data, AISTAT 2009 and JMLR 2009.
This paper defines a 'hash kernel', shows that it can be handy for several problems, and makes an observation that hashing can work well asymptotically due to internal feature redundancy.
Kilian Weinberger, Anirban Dasgupta, John Langford, Alex Smola, Josh Attenberg, Feature Hashing for Large Scale Multitask Learning, ICML 2009 Warning: Theorem 3 in this paper appears to be false. We'll be updating the draft soon.
Here, we provide a large scale application of the hashing trick to mass personalized spam filtering, define an unbiased hash, and prove that the unbiased hash is length preserving when the magnitudes of individual elements are not large. This implies that hashing can be thought of as a sparsity preserving projection.
See also Small Statistical Models by Random Feature Mixing by Kuzman Ganchev and Mark Dredze which shows this approach can work for NLP applications.