Research Projects/Papers

Interactive Learning	This research project is to address learning in interactive situations. It includes active learning (where the algorithm chooses which examples to label), contextual bandits (which is similar, except less label information is returned), and online learning (where the label is revealed after each prediction).
Learning Reductions	The goal of this project is to learn the relationships between the learning problems common in the machine learning community. Typical statements are of the form "Ability to approximately solve problem A implies ability to approximately solve problem B" and typical algorithms are reductions/compilers/transformers from one problem to another.
Hashing Representations	We investigate the use of hashing functions to define feature indicies, allowing radical space savings and substantial speed improvements in learning algorithms.
Scaling Up Machine Learning	This is a book that I edited with Ron Bekkerman and Misha Bilenko that has many kinds of parallel learning algorithms. See also, our survey tutorial of the field.
Nearest Neighbors	The "Cover Tree" is a datastructure on which nearest neighbor (and similar) queries can be calculated. The algorithm relies upon only a metric, so it is applicable to noneuclidean spaces.
Prediction Bounds	The goal is gaining a fine understanding of how well a machine can learn to predict things. Typical statements are of the form "assuming samples are independently drawn from an unknown distribution, this classifier errs less than a 0.4 portion of the time". Algorithms optimize these bounds to find a classifier with a low guaranteed error rate.
CAPTCHA	The goal of the captcha project is to construct tests which a computer can generate and grade, but not pass. These are simultaneously tightly specified challenges for AI research and useful to prevent robotic attacks on resources. Luis von Ahn, Manuel Blum, Nick Hopper and John Langford CAPTCHA: Using Hard AI Problems for Security Eurocrypt 2003 .ps.gz, .pdf, .tex Luis von Ahn, Manuel Blum, and John Langford Telling Humans and Computers Apart (Automatically) CMU Tech Report .ps.gz, .pdf, .tex
Isomap	This is an "unsupervised learning" algorithm. Given many points in some high (1000000) dimensional space, isomap attempts to extract a low (1-10) dimensional manifold containing the points. If the high dimensional points are an isometric embedding of a low dimensional manifold (and some niceness conditions apply) isomap is guaranteed to recoever it. Josh Tenenbaum, Vin de Silva and John Langford. A Global Geometric Framework for Nonlinear Dimensionality Reduction . Science 290, 2319-2323, 2000 isomap site a comment and response Mira Bernstein, Vin de Silva, John Langford and Josh Tenenbaum Graph approximations to geodesics on embedded manifolds Technical report 2000
Reinforcement Learning	Reinforcement Learning is a formalization of the general AI problem. It is thus very general and capable of capturing many other machine learning problems. One of the fundamental problems arising is the "exploration/exploitation tradeoff".
Centmail	Shard Goel, Jake Hofman, John Langford, David M. Pennock, and Daniel M. Reeves. CentMail: Rate Limiting via Certified Micro-Donations at the WWW2009 developers track. This is about a scheme to eliminate spam by introducing microdonations to the email system.
Predictive Indexing	If you have a machine learned rule for ranking elements, how do you arrange objects so that not all of them need to be evaluated to produce a top-k ranking? This is a classic problem that we provide a new and experimentally effective twist on. Sharad Goel, John Langford and Alexander Strehl, Predictive Indexing for Fast Search, NIPS 2008.
Steganography	Encryption hides the content of a message. Steganography hides the existence of a message. Where and how is steganography possible? Nick Hopper, John Langford, and Luis von Ahn Provably Secure Steganography Crypto 2002/CMU Tech report .ps.gz, .pdf, .tex (revised corrections .tex w.r.t. crypto submission) It turns out that anonmyous two-party computation is also possible. Luis von Ahn, Nick Hopper, and John Langford Covert Two-Party Computation STOC 2005 .ps.gz .pdf .tex
Online Learning for Information Markets	It turns out rational agents produce learning information markets. Alina Beygelzimer, John Langford, and David Pennock, Learning Performance of Prediction Markets with Kelly Bettors, AAMAS 2012
Self-financed Prediction	How do you setup a mechanism for resolving bets on outcomes which is incentive compatible and self financing? Nicolas Lambert, John Langford, Jennifer Wortman, Yiling Chen, Daniel Reeves, Yoav Shoham, and David Pennock Self-financed Wagering Mechanisms for Forecasting EC 2008.
Incentive Compatible Exploration	Does the exploration required for learning a policy messup the incentives for ad auctions on the internet? Jennifer Wortman, Yevgeniy Vorobeychik, Lihong Li, and John Langford, Maintaining Equilibria During Exploration in Sponsored Search Auctions WINE 2007 also Algorithmica 2010. (This is the longer journal draft.)
Quantum Compression	Any classical compression algorithm can be compiled into a quantum compression algorithm. John Langford Generic Quantum Block Compression (at xxx.lanl.gov and Phys. rev. A.) May 2002
Graphical Games	With many players interacting locally, what correlated equilibria exist? Sham Kakade, Michael Kearns, John Langford, and Luis Ortiz, Correlated Equilibria in Graphical Games, ACM EC 2003. .ps.gz, .pdf, .tex
Probabilistic Graphplan	Avrim Blum and John Langford Probabilistic Planning in the Graphplan Framework. ECP, 1999.
Particle Filters	Particle filters are a flexible nonparametric approximate Bayes posterior algorithm. Sebastian Thrun, John Langford, and Vandi Verma, Risk Sensitive Particle Filters, NIPS2001. .pdf, .ps.gz, .tex S. Thrun and John Langford, 1998. Monte Carlo Hidden Markov Models. Technical Report S. Thrun, John Langford, and Dieter Fox 1999. Monte Carlo Hidden Markov Models: Learning Non-Parametric Models of Partially Observable Stochastic Proecesses. ICML99
FeatureBoost	FeatureBoost is a boosting-style algorithm intendended to improve the robustness of a learned classifier to perturbations in the underlying algorithm. Joseph O'Sullivan, John Langford, Rich Caruana and Avrim Blum. FeatureBoost: A Meta-Learning Algorithm that Improves Model Robustness. ICML2000 .pdf, .ps.gz, .tex
Cloud Control	John Langford, Lihong Li, Preston McAfee, Kishore Papineni, Cloud control: Voluntary admission control for Intranet traffic management Information Systems and e-Business Management 2010.

Research Projects/Papers

Predictive Indexing

Steganography

Online Learning for Information Markets

Self-financed Prediction

Incentive Compatible Exploration

Quantum Compression

Graphical Games

Probabilistic Graphplan

Particle Filters

FeatureBoost

Cloud Control