Scaling Up Machine Learning, the Tutorial, KDD 2011

Ron Bekkerman, Misha Bilenko and John Langford

Part I slides (PowerPoint): Introduction

Part II.a slides (PowerPoint): Tree Ensembles

Part II.b slides (PowerPoint): Graphical Models

Part III slides (Summary + GPU learning + Terascale linear learning)

This tutorial gives a broad view of modern approaches for scaling up machine learning and data mining methods on parallel and distributed platforms. Demand for scaling up machine learning is task-specific: for some tasks it is driven by enormous dataset sizes, for others by model complexity or by the requirement for real-time prediction. Selecting a task-appropriate parallelization platform and algorithm requires understanding their benefits, trade-offs and constraints. This tutorial provides an integrated overview of state-of-the-art platforms and algorithm choices. These span a range of hardware options (from FPGAs and GPUs to multi-core systems and commodity clusters), programming frameworks (including CUDA, MPI, MapReduce, and DryadLINQ), and learning settings (e.g., semi-supervised and online learning). The tutorial is example-driven, covering a number of popular algorithms (e.g., boosted trees, spectral clustering, belief propagation) and diverse applications (e.g., speech recognition and object recognition in vision).

The tutorial is based on (but not limited to) the material from our upcoming edited book from Cambridge University Press, which is currently in production and will be available in December 2011.

Presenters

Ron Bekkerman is a senior research scientist at LinkedIn where he develops machine learning and data mining algorithms to enhance LinkedIn products. Prior to LinkedIn, he was a researcher at HP Labs. Ron completed his PhD in Computer Science at the University of Massachusetts Amherst in 2007. He holds BSc and MSc degrees from the Technion – Israel Institute of Technology. Ron has published on various aspects of clustering, including multimodal clustering, semi-supervised clustering, interactive clustering, consensus clustering, one-class clustering, and clustering parallelization.

Misha Bilenko is a researcher in the Machine Learning and Intelligence group at Microsoft Research, which he joined in 2006 after receiving his PhD from the University of Texas at Austin. His current research interests include large-scale machine learning methods, adaptive similarity functions and personalized advertising.

John Langford is a senior researcher at Yahoo! Research. He studied Physics and Computer Science at the California Institute of Technology, earning a double bachelor's degree in 1997, and received his PhD from Carnegie Mellon University in 2002. Previously, he was affiliated with the Toyota Technological Institute and IBM's Watson Research Center. He is the author of the popular Machine Learning weblog, hunch.net. John's research focuses on the fundamentals of learning, including sample complexity, learning reductions, active learning, learning with exploration, and the limits of efficient optimization.