A fairer way forward for AI in health care
When data scientists in Chicago, Illinois, set out to test whether a machine-learning algorithm could predict how long people would stay in hospital, they thought that they were doing everyone a favour. Keeping people in hospital is expensive, and if managers knew which patients were most likely to be eligible for discharge, they could move them to the top of doctors’ priority lists to avoid unnecessary delays. It would be a win–win situation: the hospital would save money and people could leave as soon as possible.
Starting their work at the end of 2017, the scientists trained their algorithm on patient data from the University of Chicago academic hospital system. Taking data from the previous three years, they crunched the numbers to see what combination of factors best predicted length of stay. At first they only looked at clinical data. But when they expanded their analysis to other patient information, they discovered that one of the best predictors for length of stay was the person’s postal code. This was puzzling. What did the duration of a person’s stay in hospital have to do with where they lived?
As the researchers dug deeper, they became increasingly concerned. The postal codes that correlated with longer hospital stays were in poor and predominantly African American neighbourhoods. People from these areas stayed in hospital longer than did those from more affluent, predominantly white areas. The reason for this disparity eluded the team. Perhaps people from the poorer areas were admitted with more severe conditions. Or perhaps they were less likely to be prescribed the drugs they needed.
The finding threw up an ethical conundrum. If optimizing hospital resources was the sole aim of their programme, people’s postal codes would clearly be a powerful predictor for length of hospital stay. But using them would, in practice, divert hospital resources away from poor, black people towards wealthy white people, exacerbating existing biases in the system.
“The initial goal was efficiency, which in isolation is a worthy goal,” says Marshall Chin, who studies health-care ethics at University of Chicago Medicine and was one of the scientists who worked on the project. But fairness is also important, he says, and this was not explicitly considered in the algorithm’s design.
This story from Chicago serves as a timely warning as medical researchers turn to artificial intelligence (AI) to improve health care. AI tools could bring great benefits to people who aren’t currently served well by the medical system. For example, an AI tool for screening chest X-rays for signs of tuberculosis, developed by start-up Zebra Medical Vision in Shefayim, Israel, is being rolled out in hospitals in India to speed up diagnosis of people with the disease. Machine-learning algorithms could also help scientists to tease out which people are likely to respond best to which treatments, ushering in an era of tailor-made medicine that might improve outcomes.
But this revolution hinges on the data that are available for these tools to learn from, and those data mirror the unequal health system we see today. “In some health-care systems, there are very basic things that are being ignored, basic quality of care that people are not receiving,” says Kadija Ferryman, an anthropologist at the New York University Tandon School of Engineering who studies the social, cultural and ethical impacts of the use of AI in health care. These inequalities are preserved in the terabytes of health data being generated around the world. And these data have primed the health-care industry for the kind of disruption that is being driven by ride-sharing platforms in the transport sector and home-rental platforms such as Airbnb in the hotel industry, Ferryman says. “Apple, Google, Amazon: they are all making inroads into the health-care space.” But because AI algorithms learn from existing data, there is a risk, Ferryman says, that the tools that result from this gold rush could entrench or deepen inequalities, such as the fact that black people in US emergency rooms are 40% less likely to receive pain medication than are white people (ref. 1).
The Chicago story is an example of bias being documented in a system before it is implemented. But not all occurrences are caught. In January, at the Conference on Fairness, Accountability and Transparency in Atlanta, Georgia, scientists from the University of California, Berkeley, and the University of Chicago presented evidence of “significant racial bias” in an algorithm that determines health-care decisions for more than 70 million people in the United States (ref. 2).
The algorithm in question allocates ‘risk scores’, which are used to enrol people at high risk of future complex health needs into specially resourced care programmes. The researchers found that black people had significantly more chronic illnesses than did white people with the same risk scores. This means that white people are more likely to be enrolled in targeted programmes than are black people with the same level of health. If the algorithm scored black and white people equally, the researchers said, black people would be enrolled into the programmes at more than twice the current rate.
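The kind of check described above, comparing how sick people in different groups are at the same risk score, can be sketched in a few lines of Python. This is a minimal illustration, not the researchers’ method: the group labels, the use of score percentiles, the ten-point band width and every record in `RECORDS` are made-up assumptions chosen only to show the shape of the comparison.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (group, risk_percentile, chronic_condition_count).
# Hand-made values for illustration only; they are not data from the study.
RECORDS = [
    ("A", 82, 3), ("A", 85, 4), ("A", 55, 2), ("A", 52, 1),
    ("B", 81, 5), ("B", 88, 6), ("B", 58, 3), ("B", 51, 4),
]


def mean_conditions_by_band(records, band_width=10):
    """Average chronic-condition count for each (score band, group) pair."""
    buckets = defaultdict(list)
    for group, percentile, conditions in records:
        band = percentile // band_width  # e.g. percentiles 80-89 -> band 8
        buckets[(band, group)].append(conditions)
    return {key: mean(values) for key, values in buckets.items()}


def gap_by_band(records, reference, comparison, band_width=10):
    """Within each band, comparison-group mean minus reference-group mean.

    A consistently positive gap means the comparison group is sicker than
    the reference group at the same score, so the score understates its need.
    """
    means = mean_conditions_by_band(records, band_width)
    bands = sorted({band for band, _ in means})
    return {
        band: means[(band, comparison)] - means[(band, reference)]
        for band in bands
        if (band, reference) in means and (band, comparison) in means
    }


if __name__ == "__main__":
    for band, gap in gap_by_band(RECORDS, "A", "B").items():
        print(f"band {band}: gap in mean chronic conditions = {gap:+.1f}")
```

In this toy data, group B carries more chronic conditions than group A in every score band, which is the pattern the researchers reported for black and white patients with equal risk scores. A real audit would need far more records, a clinically meaningful measure of health and some estimate of statistical uncertainty.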
Rubbish in, rubbish out
Impaired access to care for certain people is just one way in which AI tools could widen the health gap globally. Another problem is making sure that AI-powered tools can be applied equally to different groups of people. Information from certain population groups tends to be missing from the data with which these tools learn, meaning that the tool might work less well for members of those communities.
White, adult men are strongly over-represented in existing medical data sets, at the expense of data from white women and children, and from people of all ages in other ethnic groups. This lack of diversity in the data is likely to result in biased algorithms (ref. 3).
There are some efforts to plug these gaps. In 2015, the US National Institutes of Health (NIH) created the All of Us initiative with US$130 million in funding. The research programme aims to form a database of genetic and health data from one million volunteers, expanding the data sets available for guiding the development of precision medicine to provide better quality care for everyone in the United States. It s