
Lectures: Tue/Thu 2:40pm–3:55pm, MUDD 535 (note the ROOM CHANGE)
Instructor:
Alina Beygelzimer
(beygel at us.ibm.com)
Guest Instructors (alphabetically):
Sanjoy Dasgupta (UCSD),
Tony Jebara (Columbia),
John Langford (Yahoo! Research), and
Cynthia Rudin (Columbia).
Office hours: after each class
Machine learning is about building machines that learn from past experience.
The goal of the course is to present the fundamental issues involved in
solving machine learning problems and to introduce a broad range of machine
learning methods.
Specific topics will include:
 What is machine learning? Basic concepts, types of prior information, types of learning problems, loss function semantics.

Supervised learning basics
 Three learning algorithms:
Decision tree learning, nearest neighbor methods, Bayesian learning.
 Evaluating predictors: Bias-variance decomposition,
overfitting, cross-validation.
 Regression: linear regression, least squares and other loss functions, regularized regression
(Cynthia)
 Reductions between different supervised learning problems
 Theoretical foundations of supervised learning
 The PAC model, Occam's razor, VC-dimension.
 Mistake bounds: halving, winnow, weighted majority, relation to PAC
 The perceptron algorithm, margins
 AdaBoost and Logistic Regression, margins theory for Boosting (Cynthia)
 Bandit problems, online optimization
 Learning rankings (Cynthia)
 Support vector machines and kernel methods
 Large-scale learning: Stochastic gradient descent, backpropagation (John)
 Markov Decision Processes and Reinforcement Learning (John)
 Mixing unlabeled and labeled data, co-training
 Active learning
 Generative models, maximum likelihood parameter estimation, EM (Tony)
 Graphical models, Hidden Markov Models (Tony)
 Random projection methods; Johnson-Lindenstrauss lemma
 Dimensionality reduction
 Algorithms for nearest neighbor search
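To give a flavor of the material, here is a minimal sketch (illustrative only, not course code) of one of the algorithms listed above, the mistake-driven perceptron, written in Python with NumPy; the toy dataset below is a made-up linearly separable example.

```python
import numpy as np

def perceptron(X, y, epochs=10):
    """Train a perceptron on an (n, d) feature array X with labels y in {-1, +1}.

    Illustrative sketch of the classic mistake-driven update rule.
    """
    w = np.zeros(X.shape[1])  # weight vector
    b = 0.0                   # bias term
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            # Update only when the current hypothesis misclassifies (or is on) the point.
            if yi * (np.dot(w, xi) + b) <= 0:
                w += yi * xi
                b += yi
    return w, b

# Toy linearly separable data (hypothetical example).
X = np.array([[1.0, 1.0], [2.0, 2.0], [-1.0, -1.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b = perceptron(X, y)
preds = np.sign(X @ w + b)  # predictions on the training points
```

The convergence of this update rule on separable data, and the role of the margin in bounding its mistakes, are exactly the topics covered in the perceptron and margins lectures.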
Reading Material
Most of the material will be provided electronically (check
this blog for updates).
The following books are recommended (not crucial, but good to have):
 Duda, Hart, Stork, "Pattern Classification", 2000.
 Bishop, "Pattern Recognition and Machine Learning" (preview)
 Kearns and Vazirani, "An Introduction to Computational Learning Theory", MIT Press, 1994 (limited preview at Google Book Search)
 Hastie, Tibshirani, and Friedman, "Elements of Statistical Learning: Data Mining, Inference and Prediction", 2001 (limited preview at Google Book Search)
 Schoelkopf and Smola, "Learning with Kernels", MIT Press, 2002 (limited preview, and a short introduction to learning with kernels).
 Mitchell, "Machine Learning", 1997.
Grading
There will be four homeworks, each due one to two weeks after it
is assigned (50% of the grade), a midterm (25%), and a final (25%).
Late assignments will not be accepted.
You may discuss homework assignments and papers with other students,
but you must write up your solutions entirely on your own.
If you discuss homework problems with other students, list their names on your homework.
