A Blog by Jonathan Low

 

Oct 2, 2020

Can Health Authorities Ensure Adaptive AI Learns the Right Lessons?

Adaptive AI learns from new information, but who can ensure that the lessons learned are the right ones and that harmful impacts do not inadvertently follow? JL

Sam Surette reports in Stat:

The promise of adaptive AI lies in its ability to learn and respond to new information. Manufacturers can train an algorithm only so far on in-house data. When an algorithm encounters a real-world clinical setting, adaptive AI allows it to learn from new data and incorporate feedback to optimize its performance. AI that can reshape itself to fit existing clinical environments could learn the wrong lessons from the clinicians or institutions it monitors, reinforcing harm. To provide for effective AI, products must be designed to evaluate and test for implicit biases in health care delivery systems in real time.
Picture this: As a Covid-19 patient fights for her life on a ventilator, software powered by artificial intelligence analyzes her vital signs and sends her care providers drug-dosing recommendations — even as the same software simultaneously analyzes in real time the vital signs of thousands of other ventilated patients across the country to learn more about how the dosage affects their care and automatically implements improvements to its drug-dosing algorithm.
This type of AI has never been allowed by the Food and Drug Administration. But that day is coming.
AI that continuously learns from new data and modifies itself, called adaptive AI, faces some steep barriers. All FDA-cleared or approved AI-based software is “locked,” meaning the software cannot adapt itself based on real-world use; any change must be made by the manufacturer and confirmed with new testing to show that it still works properly.

To unlock the transformative power of adaptive AI, the FDA and industry will need to develop new scientific approaches and embrace an expansive new definition of what it means to design a product. It also means creating artificial self-control: a built-in system of limits on the types of improvements the software can make to itself and the rules it uses to decide whether to make them.
The promise of adaptive AI lies in its ability to learn and respond to new information. Manufacturers, however, can train an algorithm only so far on in-house data. When an algorithm encounters a real-world clinical setting, adaptive AI might allow it to learn from these new data and incorporate clinician feedback to optimize its performance. This kind of feedback mechanism is common in non-medical AI services. It’s what’s going on when they ask users to respond to the prompt, “Was this helpful?” to improve their recommendations.

Adaptive AI could also adapt to entire institutions. A hospital in Minneapolis may see a very different mix of patients than one in Baton Rouge, 1,200 miles down the Mississippi River, in terms of age, comorbidities such as obesity or diabetes, and other factors. Because clinically appropriate performance depends in part on factors such as disease prevalence, having access to local data can help fine-tune performance to match the needs of each institution. Adaptive AI might even learn subtle differences between institutions, such as how frequently they perform certain blood tests, which are otherwise difficult to factor into calculations.
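For a sense of what that kind of site-level fine-tuning could look like, here is a minimal sketch in Python of one standard technique: rescaling a calibrated risk score when local disease prevalence differs from the prevalence in the training data (a “prior shift” correction). The function name and the 5%/15% prevalence figures are illustrative assumptions, not details of any cleared product.

import numpy as np

def adjust_for_local_prevalence(p_model, prior_train, prior_local):
    """Rescale a calibrated risk score for a site whose disease prevalence
    differs from the population the model was trained on (prior shift).

    p_model     : model's predicted probability of disease (float or array)
    prior_train : disease prevalence in the training data
    prior_local : disease prevalence observed at the local institution
    """
    # Convert probabilities to odds, rescale by the ratio of local to
    # training prior odds, then convert back to probabilities.
    odds = p_model / (1.0 - p_model)
    prior_ratio = (prior_local / (1.0 - prior_local)) / (prior_train / (1.0 - prior_train))
    adjusted_odds = odds * prior_ratio
    return adjusted_odds / (1.0 + adjusted_odds)

# Example: a 0.30 risk score from a model trained where prevalence was 5%,
# applied at a hospital where local prevalence is closer to 15%.
print(adjust_for_local_prevalence(np.array([0.30]), 0.05, 0.15))  # ~0.59

Adjusting only the output in this way leaves the underlying model untouched, which is why output recalibration is often discussed as one of the mildest forms of local adaptation.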
Allowing for adaptive learning offers advantages but also introduces risks. AI that can reshape itself to fit existing clinical environments could learn the wrong lessons from the clinicians or institutions it monitors, reinforcing harmful biases. In 2019, research published in Science reported that an algorithm widely used by U.S. health care systems to guide health decisions showed marked evidence of racial bias. Because the algorithm used health costs as a proxy for risk, it mistook Black patients’ unequal access to care for a lower need for care, reducing by more than half the number of Black patients identified for extra care.
To provide for effective and equitable artificial intelligence, products must be designed to evaluate and test for the implicit biases in our health care access and delivery systems in real time. If implemented correctly, AI might actually reduce, rather than mimic or amplify, the implicit biases of its human counterparts, who often have trouble recognizing their own biases, let alone testing for them.
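What might evaluating for bias “in real time” look like in software? One simple ingredient is a recurring subgroup audit that compares the model’s performance across patient groups as new data accumulate. The sketch below assumes scikit-learn-style scoring and numpy arrays; the 0.05 tolerance is an illustrative choice, not a published standard.

import numpy as np
from sklearn.metrics import roc_auc_score

def audit_subgroup_performance(y_true, y_score, groups, max_gap=0.05):
    """Compare model discrimination (AUC) across patient subgroups and
    flag any gap larger than max_gap.

    y_true  : observed outcomes (0/1), numpy array
    y_score : model risk scores, numpy array
    groups  : subgroup label for each patient, numpy array
    """
    aucs = {}
    for group in np.unique(groups):
        mask = groups == group
        aucs[group] = roc_auc_score(y_true[mask], y_score[mask])
    gap = max(aucs.values()) - min(aucs.values())
    return aucs, gap, gap <= max_gap

An audit like this catches only the performance gaps the chosen metric can see; the cost-as-proxy failure described above would also require scrutinizing whether the prediction target itself encodes unequal access to care.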
To promote the exciting potential of adaptive AI while mitigating its known risks, developers and the FDA need a new approach. In addition to considering the product as originally designed, they need to prespecify how, and how much, a product can change on its own.
Design provides the blueprint for the product, while artificial self-control provides the guidance and constraint for the product’s evolution. Adaptive AI governed by artificial self-control would be able to make only specific types of changes, and only after successfully passing a series of robust automated tests. The guardrails provided by artificial self-control would ensure that the software performs acceptably and equitably as it adapts. Rather than unleashing adaptive AI entirely, artificial self-control lets a manufacturer put it on a longer leash, allowing the algorithm to explore within a defined space to find its optimal operating point.
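As a rough illustration (not an FDA template), that “defined space” could be written down as an explicit change-control specification that the software consults before any self-update. The fields, names, and bounds below are hypothetical.

from dataclasses import dataclass

@dataclass(frozen=True)
class ChangeControlSpec:
    """Limits an adaptive algorithm must respect when modifying itself."""
    tunable_parameters: tuple   # the only parameters the software may adjust
    max_threshold_shift: float  # largest allowed move in the decision threshold per update
    min_validation_auc: float   # performance bar an update must clear before release

def change_is_permitted(spec, current, proposed):
    """Return True only if a proposed self-update stays inside the spec."""
    # Touching any parameter outside the declared tunable set is out of bounds.
    if any(name not in spec.tunable_parameters for name in proposed):
        return False
    # The decision threshold may only drift a bounded amount per update.
    shift = abs(proposed.get("decision_threshold", current["decision_threshold"])
                - current["decision_threshold"])
    return shift <= spec.max_threshold_shift

spec = ChangeControlSpec(tunable_parameters=("decision_threshold",),
                         max_threshold_shift=0.02,
                         min_validation_auc=0.90)
print(change_is_permitted(spec,
                          current={"decision_threshold": 0.50},
                          proposed={"decision_threshold": 0.55}))  # False: drift too large

The point of keeping the specification separate from the learning code is that the leash itself stays locked even while the algorithm inside it adapts.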
A leash, even a long one, functions only when someone is holding the other end. AI is frequently described using human analogies, such as how it “learns” or “makes decisions.” But AI is not self-aware. Every action it takes is the responsibility of its developers, including artificial self-control.
Implementation of artificial self-control might look something like this. Take the drug-dosing algorithm I mentioned at the beginning of this article. When the algorithm is ready to incorporate what it has learned from real-world data about how drug-dosing recommendations have affected other ventilated patients, it first goes through a controlled revalidation process: it automatically tests its performance on a random sample from a large test dataset in the cloud, one the manufacturer has carefully curated to be representative of the overall patient population and to contain high-quality information about drug dosing and patient outcomes.
The curated test dataset lets the algorithm check whether it has developed any bias from the real-world data, or whether other data-quality issues could degrade its performance. If the algorithm meets minimum performance requirements on the test dataset, the update is allowed to proceed and become available to clinicians to better manage their patients. The test is logged, and each data point used in it is carefully controlled to ensure that the algorithm is not simply getting better and better at predicting the answers in a small test set (a common problem in machine learning called overfitting) but is genuinely improving its performance. If unanticipated problems emerge in real-world use, the manufacturer can quickly roll back the update to a previous version.
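A minimal sketch of that revalidation gate, assuming a scikit-learn-style classifier and AUC as the performance metric, might look like the following; the dataset layout, the 0.90 bar, and the function names are assumptions for illustration, not details of any actual product.

import logging
import random
from sklearn.metrics import roc_auc_score

log = logging.getLogger("change_control")

def gated_update(current_model, candidate_model, curated_pool,
                 min_auc=0.90, sample_size=5000):
    """Accept a self-generated update only if it clears a preset performance
    bar on a fresh random sample of the manufacturer's curated test data.
    Drawing a different sample for each run reduces the chance that the
    algorithm is merely overfitting one small, fixed test set."""
    sample = random.sample(curated_pool, min(sample_size, len(curated_pool)))
    X = [case["features"] for case in sample]
    y = [case["outcome"] for case in sample]

    candidate_auc = roc_auc_score(y, candidate_model.predict_proba(X)[:, 1])
    log.info("revalidation: candidate AUC=%.3f, bar=%.3f", candidate_auc, min_auc)

    if candidate_auc >= min_auc:
        return candidate_model   # the update goes live for clinicians
    return current_model         # otherwise stay on (or roll back to) the prior version

In practice the manufacturer would also version every accepted model and retain the previous one, so that a rollback in the field is a configuration change rather than a rebuild.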
For now, FDA-cleared artificial intelligence software products are manufactured in a conventional way. All algorithm updates are controlled by the manufacturer, not the software. Many such updates require a new FDA premarket review, even if the previous version had already received FDA clearance. Yet the FDA has its eye on the future, evidenced by a discussion paper released in April 2019 on how the agency might regulate adaptive AI. It envisions incorporating a new component into its premarket reviews: a “predetermined change control plan” that specifies future modifications and the associated methodology for implementing them in a controlled manner.
The use of these predetermined change-control plans would enable adaptive AI by allowing the FDA to review possible modifications ahead of time, obviating the need for a new premarket review before each significant algorithm update. In other words, it would mean FDA-authorized artificial self-control.
In addition to the discussion paper, two recent device marketing authorizations have laid the groundwork for adaptive AI. These include some of the first FDA-authorized predetermined change-control plans, albeit for locked algorithms. One authorization was for IDx-DR, AI-based software designed to automatically screen for diabetic retinopathy in primary care settings. The other, made by my company, Caption Health, was for Caption AI, an AI-guided imaging acquisition system that enables novice users to acquire heart ultrasound scans.
Both products have the potential to expand patient access to specialized diagnostics. But they also set a precedent for the FDA “pre-reviewing” future changes, blazing a trail for other companies to follow, including — inevitably — the first company to obtain FDA authorization of adaptive AI.
To be clear, both of these change-control plans still require the company to design, test, and lock the algorithm before it’s released. From there, though, it is only a short step to using the same mechanisms for artificial self-control.
As the FDA gains experience with premarket reviews of AI-based products, it should continue to collaborate with experts in industry and academia to establish good machine learning practices, as it has done through participation in Xavier Health’s AI Initiative since 2017. The FDA should also consider placing greater emphasis on real-world evidence and post-market surveillance mechanisms for these products, similar to how the FDA has responded to the rapidly evolving Covid-19 pandemic.
To the best of my knowledge, no company has sought FDA authorization for adaptive AI. The first to do so should endeavor to set a high bar in terms of safety, effectiveness, and trust. In light of lingering skepticism toward medical AI among patients and clinicians, the most successful introduction of adaptive AI will be for products whose decision-making is readily understandable to the users. A product that highlights suspected lesions in a CT image, for example, can be verified by a radiologist, and any mistakes would be apparent.
Research into more explainable AI is still nascent but progressing quickly. In the meantime, a medical device designed to assist a clinician in real time may be a good candidate for the successful introduction of adaptive AI. Such products act as a co-pilot to the clinician, who can build trust in the product quickly, something that can be difficult for a “black box” diagnostic algorithm to achieve.
Though the regulatory framework is becoming clearer, thanks to proactive measures by the FDA, it may be some time before artificial self-control is ready to leave the nest. Its transformative potential is undeniable, but manufacturers and the FDA would be wise to wait until adaptive AI can demonstrate practical clinical benefit over locked algorithms and can be made more explainable for users. A clumsy roll-out could have a chilling effect on the entire field.
To paraphrase the early Christian theologian Augustine of Hippo: Give me self-control, but not yet.
