An AI boost for clinical trials
Kevin Hughes needed volunteers. It was 1994, and the breast-cancer surgeon was starting a randomized, controlled trial at Massachusetts General Hospital in Boston. He and his colleagues wanted to test the efficacy of a treatment regimen commonly followed by people with a certain type of early-stage breast cancer: surgery followed by the drug tamoxifen and radiation therapy. Although this was an established protocol, it wasn’t clear whether the radiotherapy was beneficial for all women, in particular those who were older.
The researchers sought volunteers over the age of 70 whose tumours were of a particular size and type. Of the roughly 40,000 women in the United States each year who could have qualified, they managed to enrol 636 people. That was enough for the study, but it took five years to find them.
Recruitment is just one of many bottlenecks in conducting clinical trials. “Medical research is remarkably inefficient in so many different ways,” says Eric Topol, director of the Scripps Research Translational Institute in La Jolla, California. An analysis of clinical-trial data from January 2000 up to April 2019 estimated that only around 12% of drug-development programmes ended in success1 (see ‘The state of clinical trials’). Most clinical trials fail because they don’t demonstrate the efficacy or safety of an intervention. Others flop because of a flawed study design, a shortage of money, participant drop-outs or a failure to recruit enough volunteers in the first place. From entering and transferring data to ensuring that participants take the correct dosage, delays, inaccuracies and inefficiencies abound at every stage.
To improve clinical trials, researchers in academia and the pharmaceutical industry are turning to artificial intelligence (AI). Fuelled by the rapidly increasing amounts of medical data that are available to researchers, including those provided by electronic health records and wearable devices, sophisticated machine-learning algorithms have the potential to save billions of dollars, to speed up medical advances and to expand access to experimental treatments. “Improving clinical trials would be a huge deal,” Hughes says.
The trial led by Hughes was one of the successful ones2. Although the extra step of radiotherapy reduced the rate of breast-cancer recurrence, it didn’t affect the overall survival rate. For older women, at least, the added financial cost and risk of radiotherapy might outweigh the potential benefit. A follow-up study reached the same conclusion3. Had he and his colleagues found people faster, Hughes says, they might have arrived at their conclusions sooner and begun to better inform women earlier. It would also have enabled the researchers to move on to other burning questions.
The recruitment process is often the most time-consuming and expensive step of a trial. According to a 2016 study4, 18% of cancer trials that launched between 2000 and 2011 as part of the US National Cancer Institute’s National Clinical Trials Network failed to find even half the number of patients they were seeking after three or more years of trying, or had closed entirely after signing up only a few volunteers. An estimated 20% of people with cancer are eligible to participate in such trials, but fewer than 5% do5. “Recruitment is the number one barrier to clinical research,” says Chunhua Weng, a biomedical informaticist at Columbia University in New York City.
Many are hoping that AI can make a difference. One branch of AI, called natural language processing (NLP), enables computers to analyse the written and spoken word. When applied to medicine, such techniques could allow algorithms to search doctors’ notes and pathology reports for people who would be eligible to participate in a given clinical trial.
The challenge is that the text in such documents is often free flowing and unstructured, and valuable information might only be implicit, requiring some background knowledge or context to understand. Doctors, for instance, have several ways of describing the same concept — a heart attack might be referred to as a myocardial infarction, a myocardial infarct or even just ‘MI’. But an NLP algorithm can be trained to spot all such synonyms by exposure to sample medical records that have been annotated by researchers. The algorithm can then apply that knowledge to interpret unannotated records.
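The synonym problem can be illustrated with a minimal Python sketch. It is a deliberately simplified stand-in: instead of a model trained on annotated records, it uses a small hand-written lookup table, and the table contents and the function name `find_concepts` are assumptions made for illustration only.

```python
import re

# Hypothetical synonym table mapping surface forms to one canonical concept.
# A trained NLP model would learn such mappings from annotated records;
# here they are written by hand purely for illustration.
SYNONYMS = {
    "heart attack": "myocardial infarction",
    "myocardial infarction": "myocardial infarction",
    "myocardial infarct": "myocardial infarction",
    "mi": "myocardial infarction",
}

# Try longer phrases first so that "myocardial infarction" is matched
# in preference to the shorter "myocardial infarct".
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(k) for k in sorted(SYNONYMS, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def find_concepts(note):
    """Return the set of canonical concepts mentioned in a free-text note."""
    return {SYNONYMS[m.group(1).lower()] for m in _PATTERN.finditer(note)}
```

Calling `find_concepts("Pt had an MI in 2012.")` and `find_concepts("History of heart attack.")` both yield the same canonical concept. A lookup table like this cannot handle context such as negation ("no evidence of MI"), which is one reason trained models are used in practice.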
Efforts are being made to make it easier for computers to interpret the descriptions of clinical trials. The inclusion and exclusion criteria of trials are commonly written in plain text. So that hospitals can search patient databases for people who are eligible to take part, these criteria must first be translated into a standardized, coded query format that the database can understand. Weng and her colleagues built an open-source web tool called Criteria2Query that uses NLP to do just that — enabling researchers and administrators to search databases without needing to know a database query language6.
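As a rough illustration of what translating plain-text criteria into a database query involves, the following Python sketch converts a few rigidly phrased criteria into a parameterized SQL query. It is not how Criteria2Query works: the real tool uses NLP on free text, whereas this sketch relies on a simple pattern, and the `patients` table, its columns and the function names are all hypothetical.

```python
import re
import sqlite3

# Hypothetical vocabulary: words in a criterion mapped to SQL operators.
OPS = {"over": ">", "under": "<", "at least": ">=", "at most": "<="}

# Recognizes criteria of the form "<field> <comparison> <number>",
# for the two fields of the toy table below.
CRITERION = re.compile(
    r"(age|bmi)\s+(over|under|at least|at most)\s+(\d+(?:\.\d+)?)",
    re.IGNORECASE,
)


def criteria_to_sql(criteria):
    """Translate simple text criteria into a parameterized SQL query."""
    clauses, params = [], []
    for text in criteria:
        m = CRITERION.search(text)
        if m is None:
            raise ValueError(f"cannot parse criterion: {text!r}")
        column, op = m.group(1).lower(), OPS[m.group(2).lower()]
        clauses.append(f"{column} {op} ?")
        params.append(float(m.group(3)))
    return "SELECT id FROM patients WHERE " + " AND ".join(clauses), params


def demo():
    """Run the generated query against a tiny in-memory patient table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE patients (id INTEGER, age REAL, bmi REAL)")
    conn.executemany(
        "INSERT INTO patients VALUES (?, ?, ?)",
        [(1, 74, 27.5), (2, 66, 24.0), (3, 81, 33.1)],
    )
    sql, params = criteria_to_sql(["Age over 70", "BMI under 30"])
    return [row[0] for row in conn.execute(sql, params)]
```

Only column names fixed by the pattern reach the SQL string, and the numeric values are passed as query parameters, so the text of a criterion is never interpolated directly into the query.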
AI can also help patients to look for clinical trials by themselves. Typically, people rely on their doctors to inform them about suitable studies. Some patients search the website ClinicalTrials.gov, which lists more than 300,000 studies that are being conducted in the United States and 209 other countries. Daunting scale aside, the often highly technical eligibility criteria can be incomprehensible to the public. “It’s pretty overwhelming,” says Edward Shortliffe, a physician and biomedical informaticist at Columbia University.
To help patients to make sense of eligibility criteria, Weng and her colleagues developed another open-source web tool, called DQueST. The software reads trials on ClinicalTrials.gov and then generates plain-English questions such as “What is your BMI?” to assess users’ eligibility. An initial evaluation7 showed that after 50 questions, the tool could filter out 60–80% of trials that the user was not eligible for, with an accuracy of a little more than 60%.
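The idea of narrowing a list of trials through successive questions can be sketched in a few lines of Python. This is an illustrative toy rather than DQueST's actual method: the `Trial` structure, the numeric ranges and the rule for choosing the next question (ask about whichever unanswered field appears in the most remaining trials) are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Trial:
    name: str
    # Field name -> (minimum, maximum) allowed value; None means unbounded.
    ranges: dict = field(default_factory=dict)


def still_eligible(trial, answers):
    """A trial stays in play unless an answer falls outside one of its ranges."""
    for key, value in answers.items():
        if key in trial.ranges:
            low, high = trial.ranges[key]
            if (low is not None and value < low) or (high is not None and value > high):
                return False
    return True


def filter_trials(trials, answers):
    """Keep only the trials that the answers so far have not ruled out."""
    return [t for t in trials if still_eligible(t, answers)]


def next_question(trials, answers):
    """Pick the unanswered field used by the most remaining trials (a heuristic)."""
    counts = {}
    for t in trials:
        for key in t.ranges:
            if key not in answers:
                counts[key] = counts.get(key, 0) + 1
    if not counts:
        return None
    # Sorting first makes ties resolve alphabetically, so results are repeatable.
    return max(sorted(counts), key=counts.get)
```

For example, with one trial requiring an age of at least 70, a second requiring an age of 18 to 65, and a third requiring a BMI of at most 25, answering "age 72" removes the second trial, and a follow-up answer of "BMI 27" removes the third.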
Tools such as those developed by Weng have plenty of room for improvement. Machine-learning algorithms rely on being fed training data from which they can learn, and to reach their potential, they need plenty. But labelling important features in these data, as is required to train NLP algorithms, is time-consuming. The problem in academia, Weng says, is that both data and people power are limited.
Industry might be better placed to overcome those obstacles, and the past few years have seen a burst of activity. For example, digital-health company Antidote in New York City has developed a tool that helps people to search for trials. Other companies are working with health-care providers to find participants for trials in patient data held by these providers. Software developed by Deep 6 AI, an AI-based trials recruitment company in Pasadena, California, was used by researchers at Cedars-Sinai Smidt Heart Institute in Los Angeles, California, to find 16 suitable participants for a trial in one hour. A conventional approach had turned up only two people in six months.
Similarly, in a pilot study5 conducted by Mayo Clinic in Rochester, Minnesota, IBM’s Watson for Clinical Trial Matching system, which is powered by the company’s Watson supercomputer, increased the a