Machine learning could help stop a future pandemic in its tracks by indicating which individuals should be tested for the disease. That is the finding of physicists at the University of Gothenburg, Sweden and CNR-IPCF, Italy, whose neural-network-derived method proved far more effective than standard contact-tracing strategies at containing a simulated outbreak. Though the model has yet to be tested under real-world conditions, lead author Laura Natali says it could be especially useful in the early stages of an epidemic, when tests are scarce and little is known about how a new disease spreads.
In their study, Natali and colleagues began by dividing a population of 100 000 simulated individuals into three groups: those who are susceptible to the disease (S), those who are currently infected (I), and those who have recovered (R). During the simulation, these individuals move randomly around sub-regions of a 320 × 320 lattice of cells. At each time step, individuals within a certain radius of an infected person have a probability β of becoming infected, and a probability γ thereafter of recovering and becoming immune.
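To make the update rule concrete, here is a minimal sketch of one time step of such a lattice simulation. It is not the researchers' code: the function name `sir_step`, the one-cell random walk with wrap-around edges, and the choice to treat "within a certain radius" as "in the same cell" are all simplifying assumptions made for illustration.

```python
import numpy as np

S, I, R = 0, 1, 2  # susceptible, infected, recovered


def sir_step(pos, state, beta, gamma, size, rng):
    """Advance every individual by one time step (illustrative sketch only)."""
    # Assumed movement rule: each coordinate changes by -1, 0 or +1,
    # wrapping around at the lattice edges.
    pos = (pos + rng.integers(-1, 2, size=pos.shape)) % size

    # Count the infected individuals in each lattice cell.
    infected_per_cell = np.zeros((size, size), dtype=int)
    np.add.at(infected_per_cell, (pos[state == I, 0], pos[state == I, 1]), 1)

    # Assumed contact rule: a susceptible individual sharing a cell with
    # at least one infected individual is infected with probability beta.
    exposed = (state == S) & (infected_per_cell[pos[:, 0], pos[:, 1]] > 0)
    newly_infected = exposed & (rng.random(len(state)) < beta)

    # Individuals infected before this step recover with probability gamma.
    newly_recovered = (state == I) & (rng.random(len(state)) < gamma)

    state = state.copy()
    state[newly_infected] = I
    state[newly_recovered] = R
    return pos, state
```

Calling `sir_step` repeatedly with a `numpy.random.default_rng()` generator, an integer array of positions and an integer array of states traces out an epidemic curve; the values of `beta` and `gamma` are left to the caller because the article does not quote them.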
To capture the impact of asymptomatic disease carriers – a major feature of the COVID-19 pandemic – the researchers assign each simulated individual a temperature. The temperatures of infected individuals are, on average, higher than those of healthy individuals. However, the “healthy” and “infected” temperature distributions overlap substantially, making it impossible to determine an individual’s status by temperature alone. This means that tests are needed to identify which individuals are infected. The model assumes that these tests are accurate but not widely available, such that the number of individuals who can be tested (and, if infected, isolated) at each time step t is always much less than the total population.
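The effect of overlapping temperature distributions can be illustrated with a short sketch. The Gaussian shape, the means, the spread, the 10% infection rate and the threshold below are all hypothetical values chosen for illustration; the article does not give the distributions used in the study.

```python
import numpy as np


def sample_temperatures(infected, rng, healthy_mean=36.8,
                        infected_mean=37.4, spread=0.5):
    """Draw one temperature per individual; all numbers are hypothetical."""
    means = np.where(infected, infected_mean, healthy_mean)
    return rng.normal(means, spread)


rng = np.random.default_rng(1)
infected = rng.random(10_000) < 0.1  # arbitrary 10% infection rate
temps = sample_temperatures(infected, rng)

# A simple temperature threshold misclassifies people in both directions,
# which is why the model still needs tests to confirm infection.
flagged = temps > 37.1
false_negatives = np.sum(infected & ~flagged)
false_positives = np.sum(~infected & flagged)
```

Under these assumed numbers, both `false_negatives` and `false_positives` are non-zero, so temperature can serve as a hint for ranking individuals but not as a diagnosis.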
Different strategies, different outcomes
Using this model, the researchers explored four possible scenarios. In the first, the disease spread unchecked through the population, with no containment measures. At t=150, nearly all individuals in this scenario had been infected.
In the second scenario, the researchers focused their limited testing capacity on individuals who had the most contacts (defined as being in the same cell) with others who had previously tested positive, using temperature data to break any ties. From time t=20 onward, all individuals who tested positive under this strategy were “frozen” in place and not allowed to interact with anyone else. This scenario is based on standard contact-tracing methods, and it produced a much lower peak in the infection rate. Still, the disease was not eliminated: at t=150, around 20% of the population remained infected, and thus capable of passing it on to the remaining susceptible individuals.
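The ranking rule in this contact-tracing scenario (most contacts with known positives first, temperature as the tie-breaker) can be sketched in a few lines. The function name `contact_tracing_picks` and its arguments are hypothetical, and the sketch covers only the selection step, not the bookkeeping needed to count contacts.

```python
import numpy as np


def contact_tracing_picks(contacts_with_positives, temperature, n_tests):
    """Return the indices of the individuals to test, in priority order."""
    contacts = np.asarray(contacts_with_positives)
    temperature = np.asarray(temperature)
    # np.lexsort treats the LAST key as the primary one: sort by most
    # contacts first, then by highest temperature to break ties.
    order = np.lexsort((-temperature, -contacts))
    return order[:n_tests]
```

With limited tests, `n_tests` is much smaller than the population, so only the top of this ranking is ever tested at a given time step.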
The third scenario mimics the strict lockdowns that many countries adopted to combat the spread of the SARS-CoV-2 coronavirus, which causes COVID-19. From t=20, all individuals in the “lockdown” scenario were frozen in place. This drastic action – isolating the entire population at once – kept infection rates very low and eliminated the disease entirely by t=120. However, the researchers note that such a comprehensive quarantine would be “unrealistic” in practice.
Enter the machine
In the final scenario, Natali and colleagues explored whether it might be possible to eliminate the disease while isolating only part of the population. To this end, they used a neural network to select which individuals to test. “In general, a neural network receives some inputs, elaborates them through a series of hidden layers of artificial neurons, and returns an output,” they explain. “In our case, the input consists of contact-tracing information for a given individual n for the last 10 time steps.”
Based on the number of known infectious individuals within various distances of individual n, the number of actual contacts between n and these known infectious individuals, and n’s total number of contacts, the neural network outputs a value p: the probability that individual n is infected. If p<0.5, they are considered healthy. If p>0.995, they are immediately isolated. A value between 0.5 and 0.995 means they are prioritized for testing, beginning with individuals displaying the highest temperature until all available tests are depleted.
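The decision rule applied to the network's output can be sketched as follows. This covers only the triage step, not the neural network itself; the function name `triage` and its arguments are hypothetical, while the thresholds 0.5 and 0.995 are those quoted above. Individuals whose p falls below the lower threshold are simply left untested.

```python
import numpy as np


def triage(p, temperature, n_tests, isolate_above=0.995, test_above=0.5):
    """Return (indices to isolate immediately, indices to test)."""
    p = np.asarray(p)
    temperature = np.asarray(temperature)

    # Very high predicted probability: isolate without spending a test.
    isolate = np.flatnonzero(p > isolate_above)

    # Intermediate probability: candidates for the limited tests.
    candidates = np.flatnonzero((p >= test_above) & (p <= isolate_above))
    # Highest temperature first, until the available tests run out.
    order = np.argsort(-temperature[candidates], kind="stable")
    to_test = candidates[order[:n_tests]]
    return isolate, to_test
```

Here `p` would come from the trained network and `n_tests` from the testing capacity at that time step; both are inputs the caller must supply.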
To boost the accuracy of the network’s predictions, the researchers “trained” it using data from t=20, when testing begins. Thanks to this training, the network’s predictive power improved over time, with striking results: the infection rate peaked at 5.1% of the population and quickly dropped to zero thereafter, even with no more than 25% of the population isolated. “We show that it is possible to use relatively simple and limited information to make predictions of who would be most beneficial to test,” Natali says. “This allows better use of available testing resources.”
Highly adaptable
The researchers note that their network makes no assumptions about either the disease or the underlying SIR (susceptible, infectious, recovered) model. This, they claim, means that it should adapt its predictions automatically to epidemics with more complex dynamics, such as a disease with an incubation period, delays in the testing process, or different patterns of individual movement.
Natali and colleagues showed that their model was also effective at suppressing an epidemic when individuals can get the disease more than once. “In the case of temporary immunization, the neural-network-informed strategy can prevent a disease outbreak from becoming endemic,” they conclude.
The research is published in Machine Learning: Science and Technology.