Alex is a Canadian epidemiologist and data scientist focused on applying the intelligence process to detecting and assessing emerging pandemic threats and managing biological emergencies. He has led teams in risk assessment, strategic intelligence, data science, and data engineering, often in the context of an ongoing crisis response. Currently, Alex leads BlueDot's Epidemic Intelligence Unit, which fuses human and artificial intelligence to process millions of discrete information sources, investigate thousands of signals, and help organizations respond rapidly and effectively to high-consequence biothreat events.
Here is a Q&A with Alex (answers have been lightly edited for clarity):
Q: Why did you propose this question?
A: Primarily, I was hopeful this question could serve as an early warning signal of public health emergencies. It’s also just an interesting combination of forecasting themes: it depends on partially predictable (or at least detectable) external events and a somewhat murky institutional decision-making process, and it draws lots of attention while having a nice, crisp resolution criterion.
Selfishly, it was a good opportunity to learn. I find the PHEIC mechanism itself and the process surrounding it kind of fascinating, and being involved in a forecasting question is a great excuse to learn lots of gory detail about some interesting area.
Q: What were you/your organization hoping to learn from the crowd forecasting?
A: I lead a team of analysts and data scientists at BlueDot.global, a biothreat intelligence company 100% focused on infectious diseases, particularly high consequence events of the type that would warrant a PHEIC. We regularly do structured assessments of events detected by our ML system, and I’m always on the hunt for interesting new data sources, so I thought this question might serve as a source of additional early warning.
Q: What was your thought process in approaching this question early on?
A: Early on, I was simply betting on the base rate, which is a bit less than 1 PHEIC every 2 years (~40% per year). I also reasoned that the WHO and its Director-General, Dr. Tedros Adhanom Ghebreyesus, had been so heavily criticized for their perceived early inaction on COVID-19 that they’d be somewhat more inclined to act decisively, so I shifted up slightly.
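For readers who want to check that figure, here’s a minimal sketch of how the base rate falls out of the public record of PHEIC declarations. The observation window (IHR (2005) entry into force through the start of 2022) and the Poisson conversion at the end are assumptions added for illustration, not part of Alex’s answer.

```python
import math
from datetime import date

# PHEIC declarations under the IHR (2005), by announcement date (public WHO record),
# prior to the 2022 Monkeypox question.
pheic_dates = [
    date(2009, 4, 25),   # H1N1 influenza
    date(2014, 5, 5),    # polio
    date(2014, 8, 8),    # Ebola (West Africa)
    date(2016, 2, 1),    # Zika
    date(2019, 7, 17),   # Ebola (Kivu, DRC)
    date(2020, 1, 30),   # COVID-19
]

# Observation window: IHR (2005) entered into force 15 June 2007; cut off at the
# start of 2022 (an assumption, roughly when the question would have opened).
window_years = (date(2022, 1, 1) - date(2007, 6, 15)).days / 365.25

rate_per_year = len(pheic_dates) / window_years
print(f"~{rate_per_year:.2f} PHEIC declarations per year")  # ~0.41/year, i.e. the ~40% figure

# A slightly more careful conversion: probability of at least one declaration
# in a given year, if arrivals are roughly Poisson.
p_at_least_one = 1 - math.exp(-rate_per_year)
print(f"~{p_at_least_one:.0%} chance of at least one PHEIC in a year")  # ~34%
```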
Q: When and why did you start leaning more strongly toward the final outcome?
A: I adjusted upward on the basis of epidemic signals detected by BlueDot, first very slightly and then more aggressively as the magnitude of the Monkeypox event became more clear. Once the first Emergency Committee (EC) meeting was announced, the mechanics of the PHEIC decision-making process dominated. For example, conditional on a second EC meeting being called shortly after a first non-declaration, a PHEIC is actually quite likely! This makes sense; as another forecaster put it, “when the boss calls you back right after the first meeting to re-think the decision, what’s likely to happen?”
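To make that conditional reasoning concrete, here’s a minimal sketch of the kind of odds update a forecaster might apply once the second EC meeting was announced. The prior and the likelihood ratio are hypothetical numbers chosen for illustration, not Alex’s or BlueDot’s actual estimates.

```python
# Illustrative only: hypothetical numbers, not an actual forecast.
# Shows how a strong conditional signal (a second EC meeting called soon after a
# non-declaration) can swamp the earlier forecast via a simple Bayesian odds update.

def update(prior: float, likelihood_ratio: float) -> float:
    """Posterior probability after multiplying prior odds by a likelihood ratio."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Hypothetical forecast after the epidemic signals, before the second EC meeting.
prior = 0.45

# Hypothetical: a quick second EC meeting is, say, 6x more likely in worlds where a
# PHEIC is about to be declared than in worlds where it isn't.
lr_second_ec_meeting = 6.0

print(f"posterior ≈ {update(prior, lr_second_ec_meeting):.0%}")  # ≈ 83%
```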
Q: From your perspective, why do you think the crowd forecast stayed relatively low?
A: I think there’s some confusion generally about the PHEIC mechanism. It functions as an “alarm system,” since it’s the only formal emergency declaration available to WHO, but it’s not synonymous with a COVID-level pandemic declaration. My guess is that a more casual interpretation of “PHEIC = pandemic” led the crowd to underestimate the likelihood of a PHEIC in general, and of Monkeypox causing one in particular.
There was also a last-minute twist, with Dr. Tedros overruling the EC (which voted 9-6 against a declaration), but I don’t think there was enough time between the vote becoming public and the declaration itself for that to matter much.
Q: How can crowd forecasting benefit the public health sector / policymakers?
A: It’s potentially a perfect layer between the core public health activities of surveillance (monitoring for threats) and response (acting appropriately), since it provides clear, quantitative estimates of highly complex events.
A major barrier is likely the degree to which public health decision-makers trust crowdsourced assessments versus the judgment of their own in-house teams and experts. I think more widespread understanding and adoption of probabilistic reasoning could change that, as could well-developed, highly legible track records of strong performance over a range of relevant questions.
Take the example of COVID-19. A worrisome signal was detected by BlueDot and a few others in late December 2019, and various warnings were issued, but generally these weren't acted upon robustly until months later. In a world with widespread trust in crowd forecasts, where a clear, visible, authoritative crowd estimate of a looming pandemic existed, I think we could have shortened the time to response by months and saved tens of thousands of lives. I’m excited for platforms like INFER to help us do better next time!