I spent the last month becoming a connoisseur of digital contact tracing approaches since this seems like something where I might be able to help. Many other people have been thinking along similar lines (great), but I also see several misconceptions that even smart and deeply involved people are making.
For the following, a key distinction to understand is between proximity and location approaches. In proximity approaches (such as DP3T, TCN, MIT PACT(*), Apple, or one of the UW PACT(*) protocols, which I am involved in), smartphones use Bluetooth Low Energy and possibly ultrasonics to discover other smartphones nearby. Location approaches (such as MIT Safe Paths or Israel’s) instead record the absolute location of the device based on GPS, cell tower triangulation, or Wi-Fi signals.
Location traces are both poor quality and intrinsically identifying
Many people associate the ability of a phone to determine where it is with the ability to discover where it is with high precision. This is typically incorrect. Common healthcare guidance for possible contact is “within 2 meters for 10 minutes” while location data is often off by 10-100 meters, with accuracy varying by which location method is in use. As an example, approximately everyone in Manhattan may be within 100 meters of someone who later tested positive for COVID-19. Given this inaccuracy, I expect users of a system based on location crossing to simply turn it off due to the large number of false positives.
These location traces, even though they are crude, are also highly identifying. When going about your normal pre-pandemic life, you move from location X to Y to Z. Typically no one else goes from X to Y to Z in the same timeframe (clocks are typically very accurate). If you test positive and make your trace available to help suppress the virus, a store owner with a video camera and a credit card record might de-anonymize you and accuse you of killing someone they care about. Given the stakes here, preserving as much anonymity as possible is critical for convincing people to release the information which is needed to control the virus.
Given this, approaches which upload the location data of users seem likely to have reduced adoption and many false positives. While some governments, like Israel, are choosing to use all location data on an involuntary basis, the lack of effectiveness compared to proximity-based approaches and the draconian compromise of civil liberties are worrisome.
Location traces can be useful in a privacy-preserving way
Understanding the above, people often conclude that location traces are subsumed by alternatives. That’s not true. Location approaches can be made very private by simply never allowing a location trace to leave the personal device. While this might feel contradictory to epidemiological success, it’s actually extremely helpful in at least two ways.
- People have a pretty poor memory, so when they test positive and someone calls them up to do a contact tracing interview, having a location trace on their phone can be incredibly useful in jogging their memory. Using the location trace this way allows the manual contact tracing process to be much more complete. It can also be made much faster by allowing infected people to prefill much of a contact interview form before they get a call.
- The virus is inherently very localized, so public health authorities often want to quickly talk to people at location X or warn people to stay away from location Y until it is cleaned. This can be strongly enabled by on-device location traces. The phone can download all the public health messages in a region and check automatically which are relevant to the phone’s location trace, surfacing those as needed to the user. This provides more power than crossing location traces. A message of “If you were at store X on April 16th, please email email@example.com” allows people to not respond if they went to store V next door.
Both of these capabilities are a part of the UW PACT protocols I worked on for this reason.
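As a rough sketch of the second capability (not the UW PACT implementation), the on-device check could look something like the following. The `TracePoint` and `AreaMessage` types, their field names, and the "inside a radius during a time window" matching rule are all assumptions made for illustration. The point is only that matching happens locally, so the trace never leaves the phone.

```python
import math
from dataclasses import dataclass
from typing import List

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


@dataclass(frozen=True)
class TracePoint:
    """One locally stored location sample (hypothetical format)."""
    lat: float
    lon: float
    timestamp: float  # seconds since epoch


@dataclass(frozen=True)
class AreaMessage:
    """A public health message tied to a place and time window (hypothetical format)."""
    text: str
    lat: float
    lon: float
    radius_m: float
    start: float  # seconds since epoch
    end: float    # seconds since epoch


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def relevant_messages(trace: List[TracePoint], messages: List[AreaMessage]) -> List[AreaMessage]:
    """Return the downloaded messages that match at least one point of the local trace.

    Runs entirely on the device: the messages are public, the trace stays private.
    """
    hits = []
    for msg in messages:
        for point in trace:
            in_window = msg.start <= point.timestamp <= msg.end
            if in_window and haversine_m(point.lat, point.lon, msg.lat, msg.lon) <= msg.radius_m:
                hits.append(msg)
                break  # one matching point is enough to surface this message
    return hits
```

Because location data is often off by tens of meters, a real system would need to choose `radius_m` with that error in mind. The sketch leaves that choice to whoever publishes the message.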
Proximity-only approaches have an x2 problem
When people abandon location-based approaches, it’s in favor of proximity-based approaches. For any proximity protocol to work, both the infected person and the contact must be running the protocol, implying there are two ways for it to fail to be useful.
To get a sense of what is necessary, consider the reproduction number of the coronavirus. Estimates vary, but a reproduction number of 2.5 is reasonable. That is, the virus might infect 2.5 new people per infected person on average in the absence of countermeasures. To keep an infection with a base reproduction number of 2.5 from exponentiating, it is necessary to reduce the reproduction number to 1, which can be done when 60% of contacts are discovered, assuming (optimistically) no testing error and perfect isolation of discovered contacts before they infect anyone else.
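The 60% figure follows from a one-line calculation, sketched here under the same optimistic assumptions as above (every discovered contact infects no one, so the effective reproduction number is R0 × (1 − f) for a discovered fraction f). The function name is mine, chosen for illustration.

```python
def required_discovery_fraction(r0: float) -> float:
    """Fraction of contacts that must be discovered to bring the reproduction number to 1.

    Solves r0 * (1 - f) = 1 for f, assuming perfect testing and perfect
    isolation of every discovered contact.
    """
    if r0 <= 1.0:
        return 0.0  # already at or below the threshold, nothing to discover
    return 1.0 - 1.0 / r0


if __name__ == "__main__":
    for r0 in (1.5, 2.5, 4.0):
        fraction = required_discovery_fraction(r0)
        print(f"R0 = {r0}: discover at least {fraction:.1%} of contacts")
```

With a reproduction number of 2.5 this gives 1 − 1/2.5 = 0.6, and the required fraction rises if the true reproduction number is higher.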
To reach 60% you need about 77.5% of people to download and run proximity protocols, since a contact is discovered only when both people run the protocol and 0.775 × 0.775 ≈ 0.6. This is impossible in many places where smartphones are owned by fewer than 77.5% of the population. Even in places where it’s possible, it’s difficult to imagine reaching that level of usage without it being a mandatory part of the operating system that you are forced to use. Even then, subpopulations without smartphones are not covered. The square problem gets worse at lower levels of adoption. At 10% adoption (which corresponds to a hugely popular app), only 1% of contacts can be discovered via this mechanism. Despite the smallness, informing 1% of contacts does have real value, in about the same sense that if someone loaned you money at a 1%/week interest rate you would call them a loan shark. At the same time, this is only 1/60th of a solution to getting the reproduction number below 1.
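The arithmetic behind the square problem can be checked with a few lines. This is a sketch that assumes the two people in any contact adopt the app independently and at the same rate, which is a simplification. The function names are hypothetical.

```python
import math


def discovered_fraction(adoption: float) -> float:
    """Fraction of contacts discoverable when both parties must run the protocol.

    Assumes each side of a contact runs the app independently with
    probability `adoption`, so coverage is adoption squared.
    """
    return adoption ** 2


def required_adoption(target_fraction: float) -> float:
    """Adoption rate needed to discover `target_fraction` of contacts."""
    return math.sqrt(target_fraction)


if __name__ == "__main__":
    print(f"Adoption needed for 60% coverage: {required_adoption(0.60):.1%}")
    for adoption in (0.10, 0.25, 0.50, 0.775):
        print(f"{adoption:.1%} adoption -> {discovered_fraction(adoption):.1%} of contacts")
```

If adoption clusters within social groups, coverage inside those groups would be better than the square suggests and coverage across groups would be worse. The independence assumption is only the simplest case.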
Hence, people advocating for proximity approaches must either hope for pervasive mandatory use (which will still miss subcommunities without smartphones) or accept that proximity approaches are only a part of the picture.
This quadratic structure also implies that the number of successful proximity tracing protocols will be either 0 or 1 in any geographic region. Given that Apple/Google are building a protocol into their OSes, that’s the candidate for the possible 1 in most of the world once it becomes available(**).
This quadratic structure is difficult to avoid. For example, if location traces are crossed with location traces, the same issue comes up. Similarly for proximity tracing, you could imagine recording “wild” Bluetooth beacons and then reporting them to avoid the quadratic structure. This, however, unavoidably reveals contacts publicly, which can then cause the positive person to be revealed publicly.
Interestingly, traditional manual contact tracing does not suffer from the quadratic problem. Hence approaches (discussed above) which augment and benefit from manual contact tracing have a linear value structure, which matters enormously with lower levels of adoption.
The primary thrust of contact tracing needs to be manual, as that is what has worked in countries (like South Korea) which suppressed large outbreaks. Purely digital approaches don’t seem like a credible solution due to the issues discussed above. Hybrid approaches with smartphone-based apps can help by complementing manual contact tracing and perhaps via proximity approaches. Getting there requires high levels of adoption, which implies trust is a critical commodity. In addition to navigating the issues above, projects need to be open source, voluntary, useful, and strongly respect privacy (the ACLU recommendations are good here). This is what the CovidSafe project is aimed at in implementing the UW PACT protocols. Projects not navigating the above issues as well are less credible in my understanding.
An acknowledgement: many people have affected my thinking through this process, particularly those on the UW PACT paper and CovidSafe projects.
(*) I have no idea how the name collision occurred. We started using PACT here 3 weeks ago and circulated drafts to many people, including a subset of the MIT PACT group, before putting it on arXiv.
(**) The Apple protocol is a bit worrisome as development there is not particularly open and I have a concern about the crypto protocol. The Tracing Key on page 5, if acquired via hack or subpoena, allows you to prove the location of a device years after the fact. This is not epidemiologically required and there are other protocols without this weakness. Edit: The new version of their protocol addresses this issue.