The Heritage Health Prize is potentially the largest prediction prize yet at $3M, which is sure to get many people interested. Several elements of the competition may be worth discussing.
- The most straightforward way for HPN to deploy this predictor is in determining who to cover with insurance. This might easily cover the costs of running the contest itself, but the value to the health system as a whole is minimal, as people not covered still exist. While HPN itself is a provider network, they have active relationships with a number of insurance companies, and the right to resell any entry. It’s worth keeping in mind that the research and development may nevertheless end up being useful in the longer term, especially as entrants also keep the right to their code.
- The judging metric is something I haven’t seen previously. If a patient has probability 0.5 of being in the hospital 0 days and probability 0.5 of being in the hospital ~53.6 days, the optimal prediction in expectation is ~6.4 days, far below the mean of ~26.8 days. This is evidence against the first point above, since cost is probably closer to linear in the number of hospital days. As a starting point, I suspect many people will simply optimize conditional squared loss on the log-transformed target x = log(days + 1) and then back out an inferred prediction according to p = e^x - 1, with clipping. The standard approach of ensembling should be effective.
- The team structure seems a bit strange to me. I’m not sure there is a good reason for it from a prediction point of view and 8 may be too hard a limit on team size, imposing bin packing problems on the entrants.
- Privacy is clearly a huge concern. They anonymized the data, require entrants to protect the data, and admonish people to not try to break privacy. Despite that, the data will be released to large numbers of people, so I wouldn’t be surprised if someone attempts a join attack of some sort. Whether or not a join attack succeeds could make a huge difference in how this contest is viewed in the long term.
- The Accuracy Threshold is a big deal. If they set it at an out-of-reach point (which they could easily do), the size of the prize becomes $0.5M. This part of the contest is supposed to be determined next month.
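The arithmetic in the judging-metric point can be checked with a small sketch. It assumes the metric is squared error on log(days + 1), which is what the p = e^x - 1 back-transform suggests; the function names and the clipping bounds are hypothetical, chosen only for illustration.

```python
import math


def optimal_prediction(outcomes):
    """Prediction minimizing expected squared error in log(days + 1).

    `outcomes` is a list of (probability, days) pairs. Under the assumed
    metric, the minimizer is the mean of log(days + 1), mapped back
    with p = e^x - 1.
    """
    mean_log = sum(prob * math.log1p(days) for prob, days in outcomes)
    return math.expm1(mean_log)


def expected_sq_log_error(pred, outcomes):
    """Expected squared difference between log(pred + 1) and log(days + 1)."""
    return sum(
        prob * (math.log1p(pred) - math.log1p(days)) ** 2
        for prob, days in outcomes
    )


def clip(pred, lo, hi):
    """Clip a back-transformed prediction to [lo, hi] (bounds are assumptions)."""
    return min(max(pred, lo), hi)


# The patient from the example: 0 days or ~53.6 days, each with probability 0.5.
patient = [(0.5, 0.0), (0.5, 53.6)]
best = optimal_prediction(patient)
linear_mean = sum(prob * days for prob, days in patient)

print(f"optimal under log metric: {best:.2f} days")
print(f"mean number of days:      {linear_mean:.2f} days")
print(f"loss at optimal: {expected_sq_log_error(best, patient):.3f}")
print(f"loss at mean:    {expected_sq_log_error(linear_mean, patient):.3f}")
```

Predicting the plain mean number of days scores worse under this metric than the log-space optimum, which is the gap between the metric and a roughly linear cost.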
This contest is not a slam-dunk, but it has the potential to become one, and I’ll be interested to see how it turns out.