COMS 4771 Machine Learning (Spring 2008)

Announcements (Blog)

Lectures and Homeworks

Lectures: Tue/Thu 2:40pm-3:55pm, MUDD 535 (note the room change)

Instructor: Alina Beygelzimer (beygel at us.ibm.com)

Guest Instructors (in alphabetical order): Sanjoy Dasgupta (UCSD), Tony Jebara (Columbia), John Langford (Yahoo! Research), and Cynthia Rudin (Columbia).

Office hours: after each class

Machine learning is about making machines that learn from past experience. The goal of the course is to present the fundamental issues involved in solving machine learning problems and to introduce a broad range of machine learning methods.

Specific topics will include:

- What is machine learning? Basic concepts, types of prior information, types of learning problems, loss function semantics
- Supervised learning basics
  - Three learning algorithms: decision tree learning, nearest neighbor methods, Bayesian learning
  - Evaluating predictors: bias-variance decomposition, overfitting, cross-validation
  - Regression: linear regression, least squares and other loss functions, regularized regression (Cynthia)
  - Reductions between different supervised learning problems
- Theoretical foundations of supervised learning
  - The PAC model, Occam's razor, VC-dimension
  - Mistake bounds: halving, Winnow, weighted majority, relation to PAC
  - The perceptron algorithm, margins
  - AdaBoost and logistic regression, margin theory for boosting (Cynthia)
  - Bandit problems, online optimization
  - Learning rankings (Cynthia)
- Support vector machines and kernel methods
- Large-scale learning: stochastic gradient descent, back-propagation (John)
- Markov decision processes and reinforcement learning (John)
- Mixing unlabeled and labeled data, co-training
- Active learning
- Generative models, maximum likelihood parameter estimation, EM (Tony)
- Graphical models, hidden Markov models (Tony)
- Random projection methods; Johnson-Lindenstrauss lemma
- Dimensionality reduction
- Algorithms for nearest neighbor search

Reading Material

Most of the material will be provided electronically (check the blog for updates). The following books are recommended but not required:

Duda, Hart, Stork, "Pattern Classification", 2000.
Bishop, "Pattern Recognition and Machine Learning", 2006 (preview).
Kearns and Vazirani, "An Introduction to Computational Learning Theory", MIT Press, 1994 (limited preview at Google Book Search).
Hastie, Tibshirani, and Friedman, "The Elements of Statistical Learning: Data Mining, Inference, and Prediction", 2001 (limited preview at Google Book Search).
Schoelkopf and Smola, "Learning with Kernels", MIT Press, 2002 (limited preview and a short introduction to learning with kernels).
Mitchell, "Machine Learning", 1997.

Grading

There will be four homework assignments, each due one to two weeks after it is assigned (together 50% of the grade), a midterm (25%), and a final (25%). Late assignments will not be accepted.

You may discuss your homework assignments and papers with other students, but no collaboration or help is allowed in the actual writing of solutions. If you discuss homework problems with other students, list their names on the homework.