
Did AI Tools Help Doctors During the Pandemic?

22 Aug 2021 6:00am

This week a Montreal-based artificial intelligence startup named Stratuscent issued a press release promising “the world’s first AI-powered air quality monitor” for both Covid-19 and influenza. The company named it NOZE, since its website says it uses a “digital sense of smell” to scan the very air that you’re breathing, all powered by “a single sensor built on years of NASA innovation.” (And as an added bonus, it monitors indoor mold risk and tracks nine other air contaminants, including “tobacco odor,” their press release says.)

The website says the built-in AI component analyzes temperature, humidity, carbon dioxide, and particulate matter (as well as Covid and influenza markers). “As a connected device, it is future-proof, being able to detect new smells… as updates are pushed to it from the cloud.” Units will start shipping in October (from getNOZE.com).

But whatever fate awaits NOZE, they’re not the only ones trying to apply AI to the problems of the pandemic. Since April 2020, the European Commission has been using AI-powered tools from a company named InferVision to “enhance” the detection of Covid-19 patients, according to the company’s press release, which states that “strong benefits” have been reported by hospitals in Belgium, Estonia, France, Italy, Portugal, Romania, Spain, Sweden and the Netherlands. InferVision’s tools “quickly and accurately” analyze CT-scan chest images for signs of pulmonary infections, looking for the telltale tissue density, which the press release calls “key findings of viral pneumonia caused by the coronavirus.”

And work continues on other possible tools. A March article in MobiHealthNews describes a faster AI screening tool called CURIAL being validated in the emergency rooms of an Oxford University teaching hospital. Leading that research team is Andrew Soltan, one of the hospital’s clinicians (and a clinical machine learning researcher at Oxford). Soltan told The New Stack this week that they’ve just completed that validation testing (which he described as a “very significant study”). And he remains enthusiastic about the technology’s potential.

“This kind of technology, applied at the front door of hospitals, can make a big difference to helping reduce pressure on overstretched emergency departments and may be particularly valuable in smaller hospitals where there isn’t a lab on-site,” Soltan said.

“We hope that results of the upcoming CURIAL study, which are due to be released shortly, will contribute to the evidence base for safe and responsible use of clinical AI.”

But it all raises a larger question: Where were the breakthrough AI applications during the pandemic?

High Hopes

Back in April 2020, MIT Technology Review noted rising hopes for AI systems to screen for Covid (fueled partly by “staff shortages and overwhelming patient loads.”) That article even called the pandemic “a gateway for AI adoption in health care,” arguing that it led to “a growing number” of hospitals turning to AI, many for the first time.

But even then, Rizwan Malik, a lead radiologist at the Royal Bolton Hospital, was telling the site that after speaking to 24 companies pitching AI-based tools for Covid screenings, “Most of them were utter junk. They were trying to capitalize on the panic and anxiety.”

A March report from the Centre for Data Ethics and Innovation notes that AI played an important role in vaccine research, while also finding that the U.K.’s broader response to the pandemic leaned more heavily on pure data science. AI deployments happened more often in the healthcare sector, “where the unprecedented nature of the crisis has required public services to consider all available technologies, including those still nascent.”

In July an article in MIT Technology Review argued that “hundreds” of AI tools were created to help with the pandemic response — then cited two research papers examining data from 2020 to question whether those tools were ultimately as effective as hoped.

First under scrutiny was an article in the British Medical Journal that examined 169 studies (which had resulted in 232 prediction models). They weren’t all modeling the same thing. Seven tried to identify people at risk in the general population, while several others tried to diagnose the severity of a Covid infection or the risk of its leading to an ICU admission, to intubation, or to a fatal outcome. A few studies even tried to estimate exactly how long a patient would need to be hospitalized. But whatever their subject, the paper’s authors seemed to find the same results again and again: “[A]lmost all published prediction models are poorly reported, and at high risk of bias such that their reported predictive performance is probably optimistic.”

Of course, working specifically with healthcare-related data comes with a unique set of privacy requirements. So, for its July article on Covid and AI, MIT Technology Review circled back to one of the paper’s authors, Laure Wynants, an epidemiologist at Maastricht University in the Netherlands who studies predictive tools.

Wynants still believes AI could potentially help patients, but she told the Technology Review a discouraging story that highlights another important issue. One company had been marketing its own deep-learning algorithms for healthcare, and she identified a high risk of bias in several models published by researchers tied to it. Despite reaching out to the organization, she never heard back, and doesn’t know what the company implemented.

Even when speaking directly to hospitals, “there’s a lot of secrecy,” she told the Technology Review, with some medical AI vendors requiring their customers to sign nondisclosure agreements. Wrote Will Douglas Heaven, author of the Review piece, “When she asked doctors what algorithms or software they were using, they sometimes told her they weren’t allowed to say.”

Details About Datasets

MIT Technology Review also cited a second paper, published in Nature Machine Intelligence, that performed a careful review of 62 papers published between January and October 2020. Of course, there’s a large universe of papers to review, and not every one offers the level of detail this kind of academic analysis requires. But still, the researchers remained unimpressed with what they saw, stating unequivocally, “Our review finds that none of the models identified are of potential clinical use due to methodological flaws and/or underlying biases.”

For example, it notes that 16 of the 62 papers reviewed (more than 25%) used a pneumonia dataset as their control — without ever mentioning that the dataset “consists of pediatric patients aged between one and five.” The danger, the reviewers warn, is a model that is “likely to overperform,” and instead of detecting people suffering from Covid infections, “it is merely detecting children versus adults.”

Another paper had combined several datasets that inadvertently contained duplicates, leading to the possibility of “algorithms being trained and tested on identical or overlapping datasets while believing them to be from distinct sources.” And another common issue was small sample sizes.
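That duplication pitfall is mechanical enough to check for before training: hashing every image by its content, rather than trusting filenames or source labels, reveals overlap between supposedly distinct collections. Here is a minimal sketch of that check; the dataset names and scan contents are invented for illustration:

```python
import hashlib

def fingerprint(image_bytes: bytes) -> str:
    """Content hash of a scan, independent of its filename or source."""
    return hashlib.sha256(image_bytes).hexdigest()

# Two hypothetical collections that secretly share one scan.
dataset_a = {"siteA_001.png": b"scan-alpha", "siteA_002.png": b"scan-beta"}
dataset_b = {"siteB_101.png": b"scan-alpha", "siteB_102.png": b"scan-gamma"}

hashes_a = {fingerprint(img) for img in dataset_a.values()}
hashes_b = {fingerprint(img) for img in dataset_b.values()}

duplicates = hashes_a & hashes_b
print(f"{len(duplicates)} scan(s) appear in both sources")  # 1 scan(s)
```

Running a check like this before merging sources would catch the train/test overlap the reviewers describe, since identical images collide on their hash no matter which dataset they arrived in.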

One of the study’s co-authors was Derek Driggs, a machine-learning researcher at the University of Cambridge, who told MIT Technology Review that he still believes AI has the potential to help, and he’s working on a machine-learning tool himself to try to aid doctors during the pandemic. But he also shared two particularly telling anecdotes highlighted by Heaven in the Technology Review: “Driggs’s group trained its own model using a dataset that contained a mix of scans taken when patients were lying down and standing up. Because patients scanned while lying down were more likely to be seriously ill, the AI learned wrongly to predict serious Covid risk from a person’s position.”

“In yet other cases, some AIs were found to be picking up on the text font that certain hospitals used to label the scans. As a result, fonts from hospitals with more serious caseloads became predictors of Covid risk.”
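The failure mode in both anecdotes is easy to reproduce in miniature: when a nuisance attribute (here, a made-up patient-position flag) happens to correlate with the label in the training data, even a trivial classifier will latch onto it instead of the pathology. The records below are invented for illustration:

```python
from collections import Counter, defaultdict

# Toy training records: (patient_position, outcome). In this fabricated
# sample, position is perfectly correlated with outcome -- the confound.
train = [("lying", "severe")] * 40 + [("standing", "mild")] * 40

# A deliberately naive "model": memorize the majority label per position.
counts = defaultdict(Counter)
for position, label in train:
    counts[position][label] += 1
model = {pos: c.most_common(1)[0][0] for pos, c in counts.items()}

# A seriously ill patient scanned standing up is misclassified,
# because the model learned position, not pathology.
print(model["standing"])  # "mild"
```

Real deep-learning models fail this way far less transparently, which is why the reviewers flag dataset composition, not just model architecture, as the source of bias.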

A Call for More Collaboration

Like Wynants, Driggs also told MIT Technology Review that he saw potential for improvement if researchers focused more on improving and testing existing models rather than having each study start building its own unique model from scratch. And they both called for more openness about models and training protocols — as well as greater collaboration between AI teams and the medical clinicians they’re trying to support.

The call for openness is already being taken up by the larger industry. Last December, participants in the Turing Institute’s workshops suggested accessible and centralized “data lakes,” along with protocols for data standardization. (Other suggestions included more “equitable” access to data.) Their calls for openness were captured in a June report released by the Institute.

“If the community can make progress in these areas, then when we are next faced with a pandemic… we should be better placed as a collective to respond,” the report stated.

They’re not the only ones. Later the report mentions a call in June for “data readiness” for health emergencies that was published by the top scientific bodies of G7 nations. “As the pandemic is brought under control, the G7 should champion the cause of establishing health data as a global public good… [T]he G7 should capture this moment to help build a trustworthy and trusted international data system for health emergencies.”

The Turing Institute also documents its own role in responding to the pandemic, including working as a partner on the DECOVID project (which created a detailed database of anonymized patient health data, including 185,000 patient records from the National Health Service) and Project Odysseus, which provided near-real-time measures of activity on city streets using traffic cameras and other sensors. The data was ultimately used by London’s transport authority to make real-world pandemic decisions like where to move bus stops or close parking spaces. Activity data also fed the urban analytics workstream, a project that ultimately created models simulating the effects of different lockdown strategies.

Final thoughts? “Navigating our way through the pandemic without the knowledge and resources of the data science and AI community would have been markedly more difficult,” the Turing Institute report concludes. “These are transformational times for the community as its research becomes ever more embedded in everyday life.

“We need to draw on our experiences during this pandemic to ensure that data science and AI continue to change lives for the better.”



The New Stack is a wholly owned subsidiary of Insight Partners. TNS owner Insight Partners is an investor in the following companies: Real.

Featured image by Annie Spratt via Unsplash.
