Diagnostic imaging

AI-reconstructed medical images can’t be trusted

04 Jun 2020

Medical images reconstructed using artificial intelligence (AI) techniques are unreliable, according to recent research by an international team of mathematicians. The team found that deep-learning tools that create high-quality images from short scan times introduce multiple alterations and artefacts into the data that could affect diagnosis. These issues were found in multiple systems, suggesting the phenomenon will not be easy to fix.

Cutting medical scan time could reduce costs and allow more scans to be performed. To enable this, some researchers have developed AI systems that construct high-quality images from undersampled, low-resolution scans. The medical imaging equipment samples fewer data points than would normally be required, and the AI enhances these data to create a high-resolution image. The AI is trained on datasets of previous high-quality images. This is a radical shift from classical reconstruction techniques based on mathematical theory, which do not learn from or rely on previous data.
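
As an illustration of this set-up, the toy sketch below (not the study's code) undersamples the Fourier-domain (k-space) measurements of a synthetic image and applies a classical zero-filled reconstruction; in the AI systems under test, a trained deep network would replace that final step.

```python
# A minimal sketch, assuming a Fourier (k-space) acquisition model:
# the scanner measures only a fraction of the usual samples, and a
# reconstruction step must fill in the rest. The zero-filled inverse
# FFT is a classical baseline, standing in for a trained network.
import numpy as np

rng = np.random.default_rng(0)

def undersample_kspace(image: np.ndarray, keep_fraction: float):
    """Keep only a random subset of Fourier coefficients."""
    kspace = np.fft.fft2(image)
    mask = rng.random(image.shape) < keep_fraction
    return kspace * mask, mask

def zero_filled_recon(kspace_masked: np.ndarray) -> np.ndarray:
    """Classical baseline: inverse FFT with missing samples set to zero."""
    return np.real(np.fft.ifft2(kspace_masked))

# Toy 64x64 "phantom": a bright square on a dark background.
image = np.zeros((64, 64))
image[24:40, 24:40] = 1.0

kspace_masked, mask = undersample_kspace(image, keep_fraction=0.25)
recon = zero_filled_recon(kspace_masked)
print(f"sampled {mask.mean():.0%} of k-space, "
      f"mean reconstruction error {np.abs(recon - image).mean():.3f}")
```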

A study published in the Proceedings of the National Academy of Sciences, however, finds that these AI algorithms have serious instability issues. Small structural changes, such as the presence of a small tumour, may not be captured, while tiny, almost undetectable perturbations, like those created by patient movement, can lead to severe artefacts in the final image.

The team, led by Anders Hansen at the University of Cambridge, tested six different neural networks trained to create enhanced images from MRI or CT scans. The researchers fed the networks data designed to replicate three possible issues: tiny perturbations; small structural changes; and changes in the sampling rate compared with the data on which the AI was trained.

Tiny perturbations can be generated by factors such as the patient shifting, white-noise-like interference from the scanner and small anatomical differences between people, the researchers say. Such issues created a range of artefacts and instabilities in the AI systems.

“What we show is that a tiny perturbation that is so small that you can’t even see it with your eyes can suddenly make a change so that there is now a new thing that appears in the image, or something that is removed,” Hansen explains. “So, you can get false positives and false negatives.”
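
The kind of stability probe the researchers describe can be sketched as follows. The `reconstruct` function here is a hypothetical stand-in (a plain inverse FFT) for whichever trained network is under test; the ratio it reports stays bounded for a stable method but blows up for an unstable one.

```python
# A minimal sketch, not the study's code: reconstruct the same
# measurements with and without a tiny random perturbation and
# compare the outputs.
import numpy as np

rng = np.random.default_rng(1)

def reconstruct(measurements: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a trained reconstruction network."""
    return np.real(np.fft.ifft2(measurements))

def instability_ratio(measurements: np.ndarray, eps: float) -> float:
    """Output change per unit of input perturbation.

    Stable methods keep this bounded; the unstable networks in the
    study turn imperceptible input changes into large output changes.
    """
    delta = eps * (rng.standard_normal(measurements.shape)
                   + 1j * rng.standard_normal(measurements.shape))
    change = np.linalg.norm(reconstruct(measurements + delta)
                            - reconstruct(measurements))
    return change / np.linalg.norm(delta)

measurements = np.fft.fft2(rng.random((64, 64)))
print(f"output change per unit input change: "
      f"{instability_ratio(measurements, 1e-6):.4f}")
```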

To test the ability of the systems to detect small structural changes, the team added letters and symbols from playing cards to the images. One of the networks was able to reconstruct these details, but the other five exhibited issues ranging from blurring to almost complete removal of the changes.
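
The structural-change test can be sketched similarly: add a small detail to a toy image (the study used letters and card symbols; here, a tiny bright dot), reconstruct with and without it from the same sampling mask, and check how much of the detail survives. `reconstruct` is again a hypothetical stand-in for a trained network.

```python
# A minimal sketch, not the study's code, of the structural-change test.
import numpy as np

rng = np.random.default_rng(2)
mask = rng.random((64, 64)) < 0.25      # fixed 25% sampling mask

def measure(image: np.ndarray) -> np.ndarray:
    """Undersampled Fourier measurements of the image."""
    return np.fft.fft2(image) * mask

def reconstruct(kspace: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a trained reconstruction network."""
    return np.real(np.fft.ifft2(kspace))

base = np.zeros((64, 64))
base[24:40, 24:40] = 1.0                # main "anatomy": a bright square
detail = np.zeros_like(base)
detail[10:12, 10:12] = 1.0              # the small structural change

diff = reconstruct(measure(base + detail)) - reconstruct(measure(base))
recovered = diff[10:12, 10:12].mean()
print(f"recovered detail intensity: {recovered:.2f} (ideal: 1.00)")
```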

Only one of the neural networks produced better images as the researchers increased the sampling rate of the scans. Another stagnated, with no improvement in quality, while in three, the reconstructions dropped in quality as the number of samples increased. The sixth AI system does not allow the sampling rate to be changed.
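
The sampling-rate experiment can be sketched by reconstructing the same toy image at increasing sampling fractions and tracking the error. A well-behaved method should improve as more data are measured; the zero-filled baseline below does, whereas several of the networks in the study stagnated or got worse.

```python
# A minimal sketch, not the study's code, of the sampling-rate sweep.
import numpy as np

rng = np.random.default_rng(3)

image = np.zeros((64, 64))
image[24:40, 24:40] = 1.0
kspace = np.fft.fft2(image)

for frac in (0.10, 0.25, 0.50, 0.75, 1.00):
    mask = rng.random(image.shape) < frac   # fresh mask per sampling rate
    recon = np.real(np.fft.ifft2(kspace * mask))
    err = np.abs(recon - image).mean()
    print(f"sampling {frac:4.0%}: mean reconstruction error {err:.4f}")
```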

Hansen says that researchers need to start testing the stability of these systems. “What they will see on a large scale is that many of these AI systems are unstable,” he explains. The “big, big problem”, according to Hansen, is that there is no mathematical understanding of how these AI systems work. “They become a black box and if you don’t test these things properly you can have completely disastrous outcomes.”

Similar instabilities have also been highlighted in deep-learning tools that classify images. “You take a tiny little perturbation and the AI system says the image of the cat is suddenly a fire truck,” Hansen explains. One can now imagine, he says, a pipeline in which an unstable AI classifies a medical image that has been reconstructed by another unstable neural network. “You are now going to decide do you have cancer or not? The question is, would you like to try it?” he asks.
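
The classifier instability Hansen alludes to can be demonstrated on a toy linear classifier using the classic fast-gradient-sign idea (no real cat/fire-truck model is involved): a per-pixel change far below the image contrast is enough to flip the predicted class.

```python
# A minimal sketch of an adversarial perturbation on a toy linear
# classifier; the weights and "image" are random stand-ins.
import numpy as np

rng = np.random.default_rng(4)

w = rng.standard_normal(64 * 64)          # toy linear classifier weights
x = rng.standard_normal(64 * 64)          # toy flattened "image"

score = w @ x                              # sign of score = predicted class
eps = 1.1 * abs(score) / np.abs(w).sum()   # just enough to flip the sign
x_adv = x - np.sign(score) * eps * np.sign(w)

# The score changes sign, i.e. the predicted class flips, even though
# no pixel moved by more than eps.
print(f"original score {score:+.2f} -> perturbed score {w @ x_adv:+.2f}; "
      f"max per-pixel change {eps:.4f}")
```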

Hansen believes that these reconstruction techniques do have potential, but there are things that machine learning will not be able to figure out. “What is absolutely crucial is to understand the limitations,” he explains.

Such techniques are not yet being used clinically. The team say that they created the tests because they do not want these AI systems to be approved by regulatory bodies unless they have been thoroughly tested.
