How do we regulate AI in healthcare?
As AI becomes more routine in healthcare, a central question emerges: how do we regulate it?
This question has been on the mind of the Food and Drug Administration (FDA), which over the past 5 years has released several pieces of guidance on the use of AI in areas such as drug development and software as a medical device.
These documents are intended to stay ahead of the rapid expansion of AI into clinical care. The FDA has already cleared over 500 AI and machine learning devices, and that number will only grow over the next few years.
Why does the FDA need to regulate AI and machine learning?
For one, AI and machine learning tools have been known to reinforce bias, most notoriously against Black individuals and other disadvantaged communities. Much of this stems from a biased data generation process, something that is very difficult to correct for with fancy statistics.
Additionally, AI has the potential to touch many patients. The scale of AI is orders of magnitude greater than that of a drug, because AI-based software can be deployed across electronic health record systems and other commonly used platforms.
How is AI currently regulated?
Many might argue that AI and ML should not be regulated like medical devices because the safety concerns involved, such as bias and scale, are unique. However, if software is intended to treat, diagnose, cure, mitigate, or prevent disease, the FDA considers it a medical device. This is commonly termed the "Software as a Medical Device" (SaMD) categorization and includes products such as radiology computer-aided detection tools or software that analyzes eye images to detect critical changes due to diabetes. SaMDs are regulated based on risk: Class I devices are lowest risk and generally just present readings, whereas Class III devices are life-sustaining or life-supporting; a continuous glucose monitor is an example of the latter.
Certain software intended to support a healthy lifestyle or to ease scheduling and practice management is exempt from the SaMD categorization and from FDA regulation altogether.
There are also separate considerations for clinical decision support (CDS) products. CDS products used in only one city may not face requirements as stringent as those for broadly deployed devices.
There are other considerations on which the FDA has begun to release guidance. One relates to continuously updating algorithms: with AI, it is critical to integrate new data to improve the algorithm rather than deploy only a "locked" algorithm. In the draft guidance "Marketing Submission Recommendations for a Predetermined Change Control Plan for Artificial Intelligence/Machine Learning (AI/ML)-Enabled Device Software Functions," the FDA recommends approaches for ensuring that AI/ML devices can be rapidly modified and updated without a separate submission for every update.
This strategy of maintaining oversight over the lifetime of an AI/ML device, rather than only at the approval decision, is a smart one and a departure from how the FDA typically regulates drugs and devices. For a drug, the primary regulatory assessment of safety and efficacy occurs before the drug reaches the market.
However, for AI, performance metrics, safety, and impacts on decision-making or resource allocation may change throughout the tool's lifecycle. For example, an AI device that predicts mortality may be able to learn from a new data source, e.g. molecular testing, that was not available when the model was trained. Because of this, a total life-cycle approach to AI becomes necessary, in which pre-specified "Change Control Plans" for monitoring and improvement may matter more than how well the model performs at first.
There are other ways to improve AI regulation. For one, regulators frequently see only the model's performance, not the training data that went into the algorithm, which may be X-rays, electronic health record data, or other types of patient screening data. For the FDA to accurately assess potential underrepresentation or bias in the training data, this source data should be made more widely available prior to FDA clearance. Furthermore, metrics of bias against underrepresented populations, both pre- and post-marketing, need to be incorporated into regulatory decisions.
More generally, the FDA could obviate the need for significant post-marketing surveillance if initial approvals were based on higher-quality prospective data, such as randomized trials, that verify patient benefit. This is where we sense the future of AI regulation is going. There is a playbook to follow from drug and device regulation: show prospective efficacy prior to approval. Greater emphasis on randomized trials may encourage companies to redirect resources toward collaborations with academia that strengthen such trials.
These are initial ideas for regulation. But the key is first to ensure higher-quality data to improve accuracy and decrease bias. Until we do that, other types of regulation may fall flat.
Ravi B. Parikh, MD, MPP, FACP, is an Assistant Professor of Medicine and Health Policy and a medical oncologist at the University of Pennsylvania and Corporal Michael J. Crescenz VA Medical Center. He is the Director of the HACLab.