AI in imaging: the regulatory landscape
https://doi.org/10.1093/bjr/tqae002
Advance access publication: 4 January 2024
Review
3. … resolution, for example, to reduce examination time while preserving image diagnostic quality.15
4. Identifying a disease-specific signature, learned from multiple image features, that could be used in the diagnosis of a disease, for example, an imaging signature of Alzheimer disease (AD) from MRI.16 Unlike categories 1 and 2, this application of AI goes beyond automating a task that could be done by a radiologist on a workstation. These approaches may use information from sources other than just the images to generate their output.
5. Predicting outcomes based on medical images, for example, predicting outcomes in ischaemic stroke,17 intracranial aneurysm rupture, COVID-19,18 or oncology.19 These approaches may also use information from sources other than just the images.

Labels are often provided by experts, who delineate image features by hand. For example, an algorithm to find the boundary of the left ventricle in a cardiac ultrasound scan may be trained with images that have been carefully delineated by a radiologist or ultrasound technician using a drawing tool on a workstation. However, labelling may also be done based on data that are not in the images; for example, an algorithm might be trained to tell the difference between patients with rapidly progressing and slowly progressing disease by training with longitudinal outcome data.
One particular type of machine learning, referred to as deep learning, has recently become extremely widespread in medical imaging applications. Deep learning describes methods in which more sophisticated AI models, typically neural networks with many layers, learn the relevant image features directly from the training data rather than relying on hand-engineered features.
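To make the two labelling strategies described above concrete, the following is a minimal sketch of supervised training against hand-drawn expert masks. It is an illustration only: the network, data, and shapes are hypothetical stand-ins and are not taken from any of the cited devices or papers.

```python
# Minimal, illustrative sketch: supervised training of an image-analysis model
# from expert labels. Names, shapes, and data are hypothetical.
import torch
import torch.nn as nn

# Toy "deep learning" model: a few convolutional layers that map an image to a
# per-pixel probability of belonging to the left-ventricle region.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, kernel_size=1),  # logits for the segmentation mask
)
loss_fn = nn.BCEWithLogitsLoss()      # compares prediction with the expert delineation
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Hypothetical training batch: ultrasound frames plus masks drawn by a radiologist
# or sonographer on a workstation (the "label by hand" case described above).
images = torch.randn(8, 1, 128, 128)                        # stand-in for real scans
expert_masks = torch.randint(0, 2, (8, 1, 128, 128)).float()

for _ in range(100):                       # training loop, heavily abbreviated
    optimizer.zero_grad()
    logits = model(images)
    loss = loss_fn(logits, expert_masks)   # penalise disagreement with the expert
    loss.backward()
    optimizer.step()

# The "label from outcome data" case needs no hand delineation: the target would
# instead be, for example, a per-patient progression label derived from longitudinal
# follow-up, and the model head would output one value per image rather than one
# value per pixel.
```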
Clinical evaluation of SaMD addresses three questions:
• Is there a valid clinical association between your SaMD output and your SaMD’s targeted clinical condition?
• Does your SaMD correctly process input data to generate accurate, reliable, and precise output data?
• Does use of your SaMD’s accurate, reliable, and precise output data achieve your intended purpose in your target population in the context of clinical care?
Table 2. SaMD risk categories: intended medical purpose (horizontal) vs targeted healthcare condition (vertical).

                 Treat or diagnose   Drive clinical management   Inform clinical management
Critical         IV                  III                         II
Serious          III                 II                          I
Non-serious      II                  I                           I
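Purely as an illustration of how the categorization in Table 2 is applied, it can be written as a simple lookup from the state of the healthcare condition and the significance of the SaMD output to the risk category. The function and the example device below are hypothetical and are not drawn from the article.

```python
# Illustrative only: the SaMD risk categorisation of Table 2 expressed as a lookup.
# The category rises with both the seriousness of the condition and the weight the
# output carries in the clinical decision.
SAMD_RISK_CATEGORY = {
    # (state of healthcare condition, significance of SaMD output): category
    ("critical", "treat or diagnose"): "IV",
    ("critical", "drive clinical management"): "III",
    ("critical", "inform clinical management"): "II",
    ("serious", "treat or diagnose"): "III",
    ("serious", "drive clinical management"): "II",
    ("serious", "inform clinical management"): "I",
    ("non-serious", "treat or diagnose"): "II",
    ("non-serious", "drive clinical management"): "I",
    ("non-serious", "inform clinical management"): "I",
}

def samd_risk_category(condition: str, significance: str) -> str:
    """Return the SaMD risk category (I-IV) for a device's intended use."""
    return SAMD_RISK_CATEGORY[(condition.lower(), significance.lower())]

# Hypothetical example: software that flags intracranial haemorrhage on CT and
# drives urgent clinical management of a critical condition.
print(samd_risk_category("critical", "drive clinical management"))  # -> "III"
```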
Table 3. Good machine learning practice for medical device development: guiding principles.

1. Multi-disciplinary expertise is leveraged throughout the total product life cycle
2. Good software engineering and security practices are implemented
3. Clinical study participants and datasets are representative of the intended patient population
4. Training datasets are independent of test sets
5. Selected reference datasets are based upon best available methods
6. Model design is tailored to the available data and reflects the intended use of the device
7. Focus is placed on the performance of the Human-AI team
8. Testing demonstrates device performance during clinically relevant conditions
9. Users are provided clear and essential information
10. Deployed models are monitored for performance and re-training risks are managed
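Principle 4 (and the related point made later in the Discussion about not using a single dataset such as UK Biobank or ADNI for both training and testing) is easy to get wrong when several images come from the same patient. The sketch below, on hypothetical data, shows one common way to enforce independence with a patient-level split; it is an illustration, not a method prescribed by the guidance.

```python
# Minimal sketch (hypothetical data): keeping test data independent of training data
# at the patient level, so that no patient contributes images to both sets.
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)
n_images = 1000
features = rng.normal(size=(n_images, 64))         # stand-in for image-derived features
labels = rng.integers(0, 2, size=n_images)         # stand-in for reference labels
patient_ids = rng.integers(0, 200, size=n_images)  # several images per patient

# Split by patient, not by image: images from the same patient are highly
# correlated, and letting them straddle the split inflates apparent test performance.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(splitter.split(features, labels, groups=patient_ids))

assert set(patient_ids[train_idx]).isdisjoint(patient_ids[test_idx])
print(f"train images: {len(train_idx)}, test images: {len(test_idx)}")
```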
The FDA, Health Canada, and the UK MHRA jointly published “Good machine learning practice for medical device development: guiding principles” in October 2021.27 This is a short document that captures some aspects of good practice in the development of medical devices that incorporate machine learning. Table 3 reproduces these guiding principles.
While these guiding principles are helpful, for example in stating that independent test data (rather than cross-validated methods) should be used, it is not in all cases clear how compliance can be shown. In order to provide greater clarity to developers, the FDA recognized as a “consensus standard” a guidance document published by AAMI, CR34971:2022, for the application of the established medical device risk management standard, ISO14971, to medical devices incorporating AI and machine learning. This document has subsequently been released by BSI as BS/AAMI 34971:2023, demonstrating its international impact. This publication starts with a cautionary note:

Despite the sophistication and complicated methodologies employed, machine learning systems can introduce risks to safety by learning incorrectly, making wrong inferences, and then recommending or initiating actions that, instead of better outcomes, can lead to harm.

The amplification of errors in an AI system has the potential to create large scale harm to patients.

With medical devices without AI, risk can be assessed from real-world experience with that technology. With AI-enabled medical devices, however, that experience is lacking … it may be more complex to identify risks and bias since the algorithmic decision pathways may be challenging to interpret.

Risk management in medical devices is already focused on possible harm to patients and the hazardous situation that can give rise to that harm. This AAMI publication highlights the fact that AI introduces new possible hazards that are not properly covered by current product development methodology for “rule-based” algorithms, and provides a detailed recipe for how to handle risk in AI software. Table 4 gives the risks highlighted in this document.
The FDA is arguably the leading medical device regulator in providing guidance for developers and manufacturers of AI-enabled devices. While technically the FDA jurisdiction is limited to the United States, several other jurisdictions provide fast-track means for FDA-cleared or approved devices to be put on the market in their own countries. Most recently, the UK MHRA has announced plans for such a recognition route to enable FDA-cleared and approved devices to be sold in the UK.
A paper authored by employees at the FDA was recently published, focusing specifically on regulatory concepts and challenges for AI-enabled medical imaging devices.28 This article emphasizes how radiology has been a pioneer in adopting AI-enabled medical devices in a clinical environment, but also highlights how these devices “come with unique challenges”, including the need for large and representative datasets, dealing with bias, understanding impact on clinical workflows, and maintaining safety and efficacy over time.
One key innovation from the FDA is the concept of “Predetermined Change Control Plans for Artificial Intelligence/Machine Learning-enabled Medical Devices”. This idea was proposed in the FDA “Artificial Intelligence/Machine Learning Software as a Medical Device Action Plan” in January 2021,29 and in 2023 a draft guidance was published30 that describes how this approach would be used to provide a route for manufacturers to make pre-specified modifications to a marketed AI-enabled device without submitting a new marketing application.
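Neither the guidance documents nor this article prescribe a particular implementation, but the flavour of the post-market monitoring that GMLP principle 10 and a predetermined change control plan call for can be sketched as follows. The metrics, thresholds, and actions are hypothetical and would in practice be defined by the manufacturer and agreed with the regulator.

```python
# Illustrative sketch only: checking a deployed model against pre-specified
# acceptance criteria of the kind a predetermined change control plan might contain.
from dataclasses import dataclass

@dataclass
class AcceptanceCriteria:
    min_sensitivity: float = 0.90   # pre-specified in the change control plan
    min_specificity: float = 0.85
    max_drift: float = 0.05         # allowed shift in mean prediction score

def evaluate_post_market(tp: int, fn: int, tn: int, fp: int,
                         baseline_mean_score: float, current_mean_score: float,
                         criteria: AcceptanceCriteria) -> list:
    """Return the list of triggered actions; empty means 'continue routine monitoring'."""
    actions = []
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    drift = abs(current_mean_score - baseline_mean_score)
    if sensitivity < criteria.min_sensitivity or specificity < criteria.min_specificity:
        actions.append("performance below pre-specified threshold: investigate and consider re-training")
    if drift > criteria.max_drift:
        actions.append("score drift detected: review data sources and acquisition changes")
    return actions

# Hypothetical month of post-market data.
print(evaluate_post_market(tp=180, fn=30, tn=700, fp=90,
                           baseline_mean_score=0.42, current_mean_score=0.49,
                           criteria=AcceptanceCriteria()))
```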
Table 4. Risk categories for AI/ML medical devices, to be incorporated in ISO14971 risk analysis.
Table 5. Issues to be addressed in ensuring safe and effective use of AI tools.

Human-led governance, accountability, and transparency

In addition to taking account of publications from medical device regulators on AI-enabled devices, developers need to take account of other relevant regulations, such as those governing data privacy.
Table 6. AI-enabled hardware radiology devices cleared by FDA August 2021 to July 2024.

Type of device                                      Number   Product code
Ultrasonic pulsed Doppler imaging system            28       IYN
Ultrasonic pulsed echo imaging system               1        IYO
Mobile X-ray system                                 1        IZL
Computed tomography X-ray system                    38       JAK
Emission computed tomography                        8        KPS
Magnetic resonance diagnostic device                26       LNH
Stationary X-ray system                             3        MQB
Densitometer, bone                                  1        KGI
Image-intensified fluoroscopic X-ray system         2        OWB
Optoacoustic imaging system                         1        QNK
Medical charged-particle radiation therapy system   15       MUJ
Table 7. AI-enabled software radiology devices cleared by FDA August 2021 to July 2024.

Discussion
There have been large numbers of publications on applications of AI and machine learning to medical imaging and radiology, and hundreds of medical devices placed on the market that are based on machine learning and AI tools. This rapid innovation, however, has highlighted some important challenges that the field needs to address in order for these innovative tools to be trusted by patients and healthcare professionals. In particular, there is increasing evidence that poorly implemented AI could lead to patient harm, and there is a need to identify and mitigate the underlying risks.
Two key challenges for the field are dealing with bias that might detrimentally impact real-world performance, and ensuring that the output is relevant to clinical care, that is, clinically meaningful. These challenges are illustrated by the role that publicly available datasets have played in catalysing innovation in AI algorithms. There is now a wide range of publicly available datasets that can be used to train machine learning image analysis algorithms, and here we will in particular consider the UK Biobank and the Alzheimer Disease Neuroimaging Initiative (ADNI). These datasets have driven a lot of high-quality science, but they do not include a representative sample of the general population, and they illustrate the problem of bias in the data used to train imaging AI models.

Bias
Petrick et al28 reported that a particular concern of regulators is how studies used to evaluate performance are “often based …”.

Clinical meaningfulness
The widespread availability of well-curated public databases has catalysed the innovation of AI tools, but a perverse consequence is that they encourage algorithm developers to focus on problems implicit in the datasets, rather than challenges in clinical care. For example, many authors developing algorithms trained on the ADNI dataset demonstrate that they can separate subjects who are “normal”, “mild cognitive impairment” (MCI), or “Alzheimer’s Disease”, or that they can accurately predict conversion of MCI to early AD. However, not only do the patients enrolled in ADNI not represent the typical patient population in a community memory clinic, but these sorts of classifiers may not be relevant to addressing a clinically meaningful question. For example, if a patient arrives in a memory clinic with impaired memory, the question is not likely to be “does this patient have MCI or
AD”, but “what is the underlying pathology causing these symptoms”, as that can impact subsequent management. Borchert et al reported that in their systematic review “We found no studies that assessed the common clinical challenge of differential diagnosis from among multiple (>2) possible diagnoses”, which is quite a strong critique of the field.
The regulatory framework for AI-enabled medical devices described in this article has relevance to addressing these sorts of limitations in academic AI tool development. The Clinical Evaluation SaMD framework helps clearly define the need to evaluate performance in the context of clinical care; Good Machine Learning Practice makes clear the importance of independent datasets for testing and validating (you should not use a single dataset like UK Biobank or ADNI for both training and testing); and the FDA-recognized consensus standard AAMI CR34971:2022 provides a detailed framework for identifying and mitigating risks such as bias in AI-enabled devices.

Conclusions
Artificial intelligence has already demonstrated it has great potential to enable novel and valuable medical technologies, and the great majority of AI-enabled medical devices marketed are for medical imaging applications. However, as the examples given earlier in this article illustrate, the literature contains many papers that justify the medical device regulators’ position that these methodologies introduce risks that are different, and in many cases greater, than the risks present in traditional “rule-based” software medical devices. As a consequence, AI-enabled devices on the market mitigate these risks with indications for use that require they be used under expert supervision, often in parallel with current clinical practice, reducing their likely impact on clinical practice. For AI to have a greater clinical impact, developers of AI-enabled medical imaging tools need to provide more rigorous risk analysis and performance assessment than is expected for the traditional software methods that are already on the market.
Radiologists and their professional bodies have a key role to play in helping imaging AI researchers and device developers to put in place more rigorous frameworks for developing medical imaging AI devices, and in monitoring their performance on the market in clinical practice. Radiologists and their radiographer and medical physics colleagues have a detailed understanding of the variation in patient presentation, the impact of artefacts, variability due to radiographic practice, and variability caused by different imaging device manufacturers and acquisition parameters, which is of great value in
helping identification and mitigation of risks in AI medical imaging tool development.
Also, as the technology evolves, the regulatory landscape is likely to continue to evolve; in particular, the ways in which AI software can be updated once on the market, and the ways in which the balance of pre-market and post-market performance data can be used to demonstrate safety, are likely to evolve in the near future.
The evolving regulatory landscape can be criticized for providing developers with a “moving target” by rapidly changing the documentation required before AI-enabled medical devices can be put on the market, thus providing a barrier to innovation. However, it is also arguable that regulators are being agile in providing developers with increasing clarity on