First systematic review and meta-analysis suggests artificial intelligence may be as effective as health professionals at diagnosing disease
Artificial intelligence (AI) appears to detect diseases from medical imaging with similar levels of accuracy to health-care professionals, according to the first systematic review and meta-analysis to synthesise all the available evidence from the scientific literature, published in The Lancet Digital Health journal.
Nevertheless, only a few studies were of sufficient quality to be included in the analysis, and the authors caution that the true diagnostic power of the AI technique known as deep learning — the use of algorithms, big data, and computing power to emulate human learning and intelligence — remains uncertain because of the lack of studies that directly compare the performance of humans and machines, or that validate AI’s performance in real clinical environments.
“We reviewed over 20,500 articles, but less than 1% of these were sufficiently robust in their design and reporting that independent reviewers had high confidence in their claims. What’s more, only 25 studies validated the AI models externally (using medical images from a different population), and just 14 studies actually compared the performance of AI and health professionals using the same test sample,” explains Professor Alastair Denniston from University Hospitals Birmingham NHS Foundation Trust, UK, who led the research.
“Within those handful of high-quality studies, we found that deep learning could indeed detect diseases ranging from cancers to eye diseases as accurately as health professionals. But it’s important to note that AI did not substantially out-perform human diagnosis.”
With deep learning, computers can examine thousands of medical images to identify patterns of disease. This offers enormous potential for improving the accuracy and speed of diagnosis. Reports of deep learning models outperforming humans in diagnostic testing have generated much excitement and debate, and more than 30 AI algorithms for healthcare have already been approved by the US Food and Drug Administration.
Despite strong public interest and market forces driving the rapid development of these technologies, concerns have been raised about whether study designs are biased in favour of machine learning, and the degree to which the findings are applicable to real-world clinical practice.
To provide more evidence, researchers conducted a systematic review and meta-analysis of all studies comparing the performance of deep learning models and health