NYU Large Scale Machine Learning Class

Yann LeCun and I are coteaching a class on Large Scale Machine Learning starting late January at NYU. This class will cover many tricks to get machine learning working well on datasets with many features, examples, and classes, along with several elements of deep learning and the support systems that enable all of this.

This is not a beginning class—you really need to have taken a basic machine learning class previously to follow along. Students will be able to run and experiment with large scale learning algorithms since Yahoo! has donated servers which are being configured into a small scale Hadoop cluster. We are planning to cover the frontier of research in scalable learning algorithms, so good class projects could easily lead to papers.

For me, this is a chance to teach on many topics of past research. In general, it seems like researchers should engage in at least occasional teaching of research, both as a proof of teachability and to see their own research through that lens. More generally, I expect there is quite a bit of interest: figuring out how to use data to make predictions well is a topic of growing importance to many fields. In 2007, this was true, and demand is much stronger now. Yann and I also come from quite different viewpoints, so I’m looking forward to learning from him as well.

We plan to videotape lectures and put them (as well as slides) online, but this is not a MOOC in the sense of online grading and class certificates. I’d prefer that it was, but there are two obstacles: NYU is still figuring out what to do as a University here, and this is not a class that has ever been taught before. Turning previous tutorials and class fragments into coherent subject matter for the 50 students we can support at NYU will be pretty challenging as is. My preference, however, is to enable external participation where it’s easily possible.

Suggestions or thoughts on the class are welcome 🙂

30 Replies to “NYU Large Scale Machine Learning Class”

  1. That sounds great. Will the videos be posted as you go along… so people can follow/contribute at a distance?
    I guess what would be nice would be to create a discussion forum like Coursera’s so that
    a) distance learners can team up
    b) discuss questions with each other and with students (obviously we don’t expect you to get involved in anything but the most interesting questions)

    1. I’d be happy to participate (as much as possible) from afar. It would be great if, in addition to lectures and slides being available online, we could have access to handouts/tutorials/homeworks. It would be even better if there could be some centralized site to connect us distance learners with each other – we don’t necessarily need your help/grading/feedback if we can generate our own.

    2. Can’t wait till this starts! Would love a discussion forum, but if there’s only a few of us we could just discuss questions over email.

  2. +1 Would love it if it were a MOOC.

    jon, even if it weren’t a classic MOOC with tests/grades, merely having lectures videotaped would be useful.

  3. Respectfully, why must you, a couple of luminaries of machine learning, also use the over-hyped term “Big Data” as the title of your class?
    It’s interesting you don’t use the term here in the blog.

    1. “Big Data” is a running joke in the lab, but I can’t take credit here 🙂

  4. Some of my thoughts and requests.

    1. Like everyone else, I would love to follow along, so it would be nice if the lectures and slides were available online after the actual lecture.
    2. From the course outline, I suspect that there will be a lot of practical work and assignments, which I would also like to follow along with.
    3. Providing a key reference to each topic would be helpful for me.


  5. I definitely agree with Kirk. In fact, it is not a big deal if it is not a MOOC: if it were, it would probably have been hard not to make it a watered-down version of the class. I personally enjoyed Andrew Ng’s video lectures much more than the MOOC they offered. For me, listening to lectures by the top researchers in the field and having access to the class notes, practicals, and assignments is more than enough.


    1. First priority goes to registered students. Any space remaining would work for auditing, but it isn’t clear yet if any space will remain.

  6. I’m a co-founder of TechTalks.tv and would love to chat with you about how we can help with your workshop video and slides. After reading through these comments, there is clearly a need for a suitable platform to host them on. Your course topic is a perfect fit for the userbase on Tech Talks. Hope to hear from you.


  7. This sounds great. I will follow along if the vids and slides are available but really hope you get to make this a MOOC soon.

  8. Can you please link us to the slides and assignments?

  9. Hi,

    thank you for offering this interesting course.

    Is it possible to add subtitles to the videos for the hearing impaired?

    Thanks and best wishes.

    1. It costs too much for now, but they are looking into creating an interface to crowdsource subtitles.

      1. OK, thanks for your help. The slides are quite helpful on their own and provide the necessary information for looking up the relevant material.

  10. I know this is a lot to ask, but a lot of people in the world operate in low-bandwidth environments (I live in India and am on a 256 Kbps line), so would it be possible for you to upload this to YouTube or any other site where these can be downloaded?
    (I don’t expect you to say yes, but I’d very much appreciate it if you did 🙂 )

