The Decision Service is a first-in-the-world project making tractable reinforcement learning easily used by developers everywhere. We are hiring developers, data scientists, and a product manager. Please consider joining us to do something interesting in this life.

## 4/12/2017

## 1/4/2017

### EWRL and NIPS 2016

I went to the European Workshop on Reinforcement Learning and NIPS last month and saw several interesting things.

At EWRL, I particularly liked the talks from:

- Remi Munos on off-policy evaluation
- Mohammad Ghavamzadeh on learning safe policies
- Emma Brunskill on optimizing biased-but-safe estimators (sense a theme?)
- Sergey Levine on low sample complexity applications of RL in robotics.

My talk is here. Overall, this was a well-organized workshop with diverse and interesting subjects, the only caveat being that they had to limit registration.

At NIPS itself, I found the poster sessions fairly interesting.

- Allen-Zhu and Hazan had a new notion of a reduction (video).
- Zhao, Poupart, and Gordon had a new way to learn Sum-Product Networks.
- Ho, Littman, MacGlashan, Cushman, and Austerweil had a paper on how “Showing” is different from “Doing”.
- Toulis and Parkes had a paper on estimation of long term causal effects.
- Rae, Hunt, Danihelka, Harley, Senior, Wayne, Graves, and Lillicrap had a paper on large memories with neural networks.
- Hardt, Price, and Srebro had a paper on Equal Opportunity in ML.

Format-wise, I thought having 2 poster sessions was better than 1, but I really would have preferred more. The recorded spotlights are also pretty cool.

The NIPS workshops were great, although I was somewhat reminded of kindergarten soccer in terms of lopsided attendance. This may be inevitable given how hot the field is, but I think it’s important for individual researchers to remember that:

- There are many important directions of research.
- You personally have a much higher chance of doing something interesting if everyone else is not doing it also.

During the workshops, I learned about ADAM (a momentum form of Adagrad), testing ML systems, and that even TensorFlow is finally looking into synchronous updates for parallel learning (allreduce is the way).
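For readers who have not seen allreduce-style synchronous updates, here is a minimal single-process sketch of the idea. It is only an illustration under my own assumptions: the function names (`allreduce_sum`, `local_gradient`, `synchronous_sgd`), the toy 1-D least-squares model, and the data shards are all hypothetical, and a real system would do the reduction over the network rather than in a Python list.

```python
def allreduce_sum(worker_vectors):
    """Simulated allreduce: every worker ends up with the elementwise sum."""
    total = [sum(vals) for vals in zip(*worker_vectors)]
    return [list(total) for _ in worker_vectors]


def local_gradient(w, shard):
    """Gradient sum of 0.5 * (w * x - y)**2 over one shard, plus the shard size."""
    g = 0.0
    for x, y in shard:
        g += (w * x - y) * x
    return [g, float(len(shard))]


def synchronous_sgd(shards, steps=200, lr=0.05):
    """Each simulated worker holds its own copy of w; copies stay identical."""
    weights = [0.0 for _ in shards]
    for _ in range(steps):
        # 1. Every worker computes a gradient on its own data shard.
        local = [local_gradient(w, s) for w, s in zip(weights, shards)]
        # 2. One allreduce gives every worker the global gradient sum and count.
        reduced = allreduce_sum(local)
        # 3. Every worker applies the same averaged update.
        weights = [w - lr * r[0] / r[1] for w, r in zip(weights, reduced)]
    return weights


# Toy data generated from y = 3 * x, split unevenly across three workers.
shards = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0)], [(0.5, 1.5), (1.5, 4.5)]]
final_weights = synchronous_sgd(shards)
```

Because every worker sees the same reduced gradient, the copies of the weight never drift apart, which is the property that makes synchronous updates easy to reason about.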

(edit: added one)

## 12/8/2016

### Vowpal Wabbit version 8.3 and tutorial

I just released Vowpal Wabbit 8.3 and we are planning a tutorial at NIPS Saturday over the lunch break in the ML systems workshop. Please join us if interested.

8.3 should be backwards compatible with the entire 8.x series. There have been big changes since the last version related to:

- Contextual bandits, particularly w.r.t. the decision service.
- Learning to search, for which we have a paper at NIPS.
- Logarithmic time multiclass classification.

## 8/26/2016

### ICML 2016 videos and statistics

The ICML 2016 videos are out.

I also wanted to share some statistics from registration that might be of general interest.

The total number of people attending: 3103.

Industry: 47% University: 46%

Male: 83% Female: 14%

Local (NY, NJ, or CT): 27%

North America: 70% Europe: 18% Asia: 9% Middle East: 2% Remainder: <1% including 2 from Antarctica

## 7/26/2016

### ICML 2016 was awesome

*strict saddle.* I first read about the approach here http://arxiv.org/abs/1503.02101, but at ICML he and other authors demonstrated a remarkable number of problems that have this property, which enables efficient optimization via stochastic gradient descent (and other) procedures.
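To make the property concrete, here is a small toy sketch of my own (it is not taken from the linked paper). The function f(x, y) = x² + y⁴/4 − y²/2 is an assumed example: it has a saddle at the origin whose Hessian is diag(2, −1), so there is a direction of strictly negative curvature, and its minima sit at (0, ±1). Plain gradient descent started exactly at the saddle never moves, while a little gradient noise is enough to escape to a minimum.

```python
import random


def grad(x, y):
    """Gradient of the toy function f(x, y) = x**2 + y**4 / 4 - y**2 / 2."""
    return 2.0 * x, y ** 3 - y


def descend(noise_std, steps=2000, lr=0.1, seed=0):
    """Gradient descent from the saddle at (0, 0) with optional Gaussian gradient noise."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    for _ in range(steps):
        gx, gy = grad(x, y)
        # Noise stands in for the randomness of a stochastic gradient.
        gx += rng.gauss(0.0, noise_std)
        gy += rng.gauss(0.0, noise_std)
        x -= lr * gx
        y -= lr * gy
    return x, y


stuck = descend(noise_std=0.0)     # exact gradients: stays at the saddle
escaped = descend(noise_std=0.01)  # noisy gradients: ends near (0, 1) or (0, -1)
```

The step size, noise level, and step count here are arbitrary choices for the illustration, not values recommended by any of the papers.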