IMDRF Machine Learning-Enabled Medical Devices - A Subset of Artificial Intelligence-Enabled Medical Devices - Key Terms and Definitions
Proposed Document
This document was produced by the International Medical Device Regulators Forum. There are
no restrictions on the reproduction or use of this document; however, incorporation of this
document, in part or in whole, into another document, or its translation into languages other than
English, does not convey or represent an endorsement of any kind by the International Medical
Device Regulators Forum.
IMDRF/AIMD WG/N@@:202X
_____________________________________________________________________________________________
31 Preface
32
33 The document herein was produced by the International Medical Device Regulators Forum
34 (IMDRF), a voluntary group of medical device regulators from around the world. The document
35 has been subject to consultation throughout its development.
36
37 There are no restrictions on the reproduction, distribution or use of this document; however,
38 incorporation of this document, in part or in whole, into any other document, or its translation into
39 languages other than English, does not convey or represent an endorsement of any kind by the
40 International Medical Device Regulators Forum.
41 1.0 Introduction
42 Artificial Intelligence (AI) is a branch of computer science, statistics, and engineering that uses
43 algorithms or models to perform tasks and exhibit behaviors such as learning, making decisions
44 and making predictions. The subset of AI known as Machine Learning (ML) allows computer
45 algorithms to learn through data, without being explicitly programmed, to perform a task.
46
47 Approaches utilizing ML, sometimes colloquially referred to as AI or AI/ML, have been employed
48 in several fields, such as the automotive industry, robotics, medicine, finance, and art. ML has
49 given many sectors an ability to gain new insights from large amounts of data and to support tasks.
50
51 There has been accelerated adoption and use of ML-enabled approaches in medical devices. We
52 refer to these medical devices as Machine Learning-enabled Medical Devices, or MLMD. AI
53 systems are typically implemented as software in medical devices or as Software as a Medical
54 Device. MLMD have the potential to transform health care by deriving new and important insights
55 from the vast amount of data generated during all phases of the healthcare process. Examples of
56 applications include earlier disease detection and diagnosis; identification of new observations or
57 patterns in human physiology; development of personalized diagnostics and therapeutics;
58 workflow optimization; and guidance in use of the device with the goal of improving user and
59 patient experience. One of the greatest benefits of MLMD resides in their ability to learn from
60 real-world use and experience to improve their performance.
61
62 The purpose of this publication is to establish relevant terms and definitions across the Total
63 Product Life Cycle (TPLC) to promote consistency, support global harmonization efforts, and
64 provide a foundation for the development of future guidelines related to MLMD. Terms referenced
65 herein have either been previously defined in Global Harmonization Task Force (GHTF)
66 documents or by internationally recognized standards on AI. Some terms and definitions have
67 been generated by or are discussed by the IMDRF Artificial Intelligence Medical Device (AIMD)
68 Working Group within this document.
69
70 The overarching objective of this effort is to promote consistent expectations and understanding
71 for MLMD, promote patient safety, foster innovation, and encourage access to advances in
72 healthcare technology.
73
74 2.0 Scope
75 This document applies to key terms and definitions relating to Machine Learning-enabled Medical
76 Devices (MLMD).
77
78 Note 1 : MLMD are medical devices. A product must first meet the definition of a medical device
79 before it can be an MLMD.
80
81 Note 2 : Most jurisdictions include "accessories to medical devices" in the definition of "medical
82 device". Other jurisdictions define "accessories to medical devices" separately. The definitions
83 and the concepts in this document are intended to apply in both cases.
84
85 Note 3 : This document does not attempt to define established definitions in the field of computer
86 science; however, it does strive to highlight and clarify conflicting terms and definitions as
87 necessary. This document does not provide guidelines for the development, risk management or
88 evaluation of MLMD.
89
90 Note 4 : Terms and definitions that refer to technical standards that are under development (e.g., ISO,
91 IEC, IEEE) may be updated upon final publication of those standards.
92
93 3.0 References
94 3.1 IMDRF / GHTF
99 3.2 Standards
100 The standards below were consulted in the writing of this document and may be useful in meeting
101 the key definition of MLMD discussed herein. This list is not intended as a required or complete
102 list of standards that can be used to meet the key definition of MLMD.
103 ISO/IEC DIS 22989 Information technology — Artificial intelligence — Artificial
104 Intelligence Concepts and Terminology
106 AAMI, BSI, Turpin, R., Hoefer, E., Lewelling, J., & Baird, P. (2020). Machine Learning
107 AI in Medical Devices: Adapting Regulatory Frameworks and Standards to Ensure Safety
108 and Performance. AAMI/BSI Initiative on Artificial Intelligence.
109 https://www.bsigroup.com/en-US/medical-devices/resources/Whitepapers-and-articles/machine-learning-ai-in-medical-devices/
111 Kohavi, R., & Provost, F. (Eds.). (n.d.). Glossary of Terms: Special Issue on Applications
112 of Machine Learning and the Knowledge Discovery Process.
113 https://ai.stanford.edu/~ronnyk/glossary.html
114 Kan, A. (2017). Machine learning applications in cell image analysis. Immunology and
115 Cell Biology, 95(6), 525–530.
116 https://doi.org/10.1038/icb.2017.16
117
133
134 Figure 1 Overview of AI and ML Concepts
135
136 ISO/IEC DIS 22989, the draft international standard on AI concepts and terminology, describes ML
137 as a process that optimises the parameters of an ML model so that the model’s behaviour reflects
138 the data or experience.
139
140 There are several different types of ML methods, as well as different algorithms. For example,
141 some applications may use Supervised Learning, others may use Unsupervised or Semi-
142 Supervised Learning (Section 6.0). Different types of algorithms include neural networks (e.g.,
143 feed forward neural network, recurrent neural network, convolutional neural network, etc.),
1 A.L. Samuel, “Some Studies in Machine Learning Using the Game of Checkers.” IBM Journal 1(3), 210–229 (1959)
144 Bayesian networks, decision trees, and support vector machines, among others. The learning process
145 itself may be an iterative process of trial and error, also known as Reinforcement Learning.
146
147 Note : Within this document, the term ML algorithm is used to represent a software procedure
148 developed using ML, and consisting of mathematics and logic, that can process data. The term ML
149 model is used here to represent the relationship or function that is the result of Training an ML
150 algorithm with data.
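
The distinction drawn in the Note above can be illustrated with a minimal Python sketch. It is an illustration only and is not taken from this document or from any standard: the function name `fit_line`, the returned `model`, and the toy data are all hypothetical. The fitting routine plays the role of the ML algorithm, and the function it returns plays the role of the ML model.

```python
from typing import Callable, Sequence


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Callable[[float], float]:
    """ML algorithm (illustrative): ordinary least squares for y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def model(x: float) -> float:
        """ML model: the learned relationship between input and output."""
        return slope * x + intercept

    return model


# Training: running the algorithm on data yields the model.
model = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])

# The trained model is then applied to an input it has not seen.
prediction = model(4.0)
```

In this sketch the same algorithm trained on different data would return a different model, which is why the two terms are kept separate.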
151
152 The following sections provide key definitions that are relevant to ML when used in medical
153 devices (Section 5.0), definitions from technical standards (Section 6.0), followed by a discussion
154 of common ML terms (Section 7.0).
155
163 Medical Device: Any instrument, apparatus, implement, machine, appliance, implant, reagent for
164 in vitro use, software, material or other similar or related article, intended by the manufacturer to
165 be used, alone or in combination, for human beings, for one or more of the specific medical
166 purpose(s) of:
167 diagnosis, prevention, monitoring, treatment or alleviation of disease,
168 diagnosis, monitoring, treatment, alleviation of, or compensation for, an injury,
169 investigation, replacement, modification, or support of the anatomy, or of a physiological
170 process,
171 supporting or sustaining life,
172 control of conception,
173 cleaning, disinfection or sterilization of medical devices,
174 providing information by means of in vitro examination of specimens derived from the
175 human body;
176 and does not achieve its primary intended action by pharmacological, immunological, or
177 metabolic means, in or on the human body, but which may be assisted in its intended function
178 by such means.
179
180 Note 1 : Products which may be considered to be medical devices in some jurisdictions but
181 not in others include:
182 disinfection substances,
183 aids for persons with disabilities,
184 devices incorporating animal and/or human tissues,
185 devices for in-vitro fertilization or assisted reproduction technologies.
186
187 Note 2 : For clarification purposes, in certain regulatory jurisdictions, devices for
188 cosmetic/aesthetic purposes are also considered medical devices.
189
190 Note 3 : For clarification purposes, in certain regulatory jurisdictions, the commerce of
191 devices incorporating human tissues is not allowed.
192
193 Editorial issue has been corrected from IMDRF/GRRP WG/N47:2018.
194
199 Note : Bias is used in both data science and in legal discussions. When used in data science,
200 bias is the tendency of a statistic to overestimate or underestimate a parameter. From a legal
201 point of view, however, bias is any prejudiced or partial personal or social perception of a
202 person or group. For the purposes of this document, bias is a data science term, and not a
203 legal one. Bias can be introduced into study design, conduct or analysis. Sources of bias
204 include selection bias (of study sample), operational bias, and analyses that do not account
205 for missing data.
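
As an illustration of bias in the data science sense described in the Note above, the following Python sketch compares two estimators of a population variance. The population values and all names are hypothetical and chosen only for illustration. Every possible sample of size two is enumerated, so the expected value of each estimator is computed exactly rather than simulated.

```python
from itertools import product
from statistics import fmean

# Hypothetical population; its variance is the parameter being estimated.
population = [1.0, 2.0, 3.0, 4.0]
mu = fmean(population)
true_variance = fmean([(x - mu) ** 2 for x in population])


def variance(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    m = fmean(sample)
    return sum((x - m) ** 2 for x in sample) / divisor


# Every equally likely sample of size 2 drawn with replacement.
samples = list(product(population, repeat=2))

expected_divide_by_n = fmean([variance(s, len(s)) for s in samples])
expected_divide_by_n_minus_1 = fmean([variance(s, len(s) - 1) for s in samples])

# Bias: expected value of the statistic minus the true parameter.
bias_divide_by_n = expected_divide_by_n - true_variance
bias_divide_by_n_minus_1 = expected_divide_by_n_minus_1 - true_variance
```

Under these assumptions the estimator that divides by n systematically underestimates the population variance, while the estimator that divides by n - 1 has no bias.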
207 Training that leads to change of an MLMD with each exposure to data that takes place on
208 an ongoing basis during the operation phase of the MLMD life cycle. (Modified from
209 ISO/IEC DIS 22989)
210 Note : Batch Learning is training that leads to change of an MLMD through discrete updates,
211 based on defined sets of data, that take place at distinct points prior to or
212 during the operation phase of the MLMD life cycle.
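
The difference between continuous learning and batch learning can be sketched in Python as follows. This is a toy illustration under stated assumptions, not a description of any real MLMD: the class `MeanModel`, its single parameter, and the data stream are hypothetical.

```python
class MeanModel:
    """Toy model with a single parameter: the running mean of observed values."""

    def __init__(self) -> None:
        self.mean = 0.0
        self.count = 0

    def update(self, values) -> None:
        """Fold one or more new observations into the parameter."""
        for v in values:
            self.count += 1
            self.mean += (v - self.mean) / self.count


stream = [2.0, 4.0, 6.0, 8.0]

# Continuous learning: the model changes with each exposure to data.
continuous = MeanModel()
continuous_history = []
for value in stream:
    continuous.update([value])
    continuous_history.append(continuous.mean)

# Batch learning: the model changes only at distinct points, and each
# update is based on a defined set of data.
batch = MeanModel()
batch_history = []
for defined_set in (stream[:2], stream[2:]):
    batch.update(defined_set)
    batch_history.append(batch.mean)
```

Both models end with the same parameter value in this sketch. What differs is how often the model in use changes: four times under continuous learning and twice under batch learning.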
213
215 An objectively determined benchmark that is used as the expected result for comparison,
216 assessment, training, etc. (e.g., ground truth, gold standard).
218 Machine learning utilizing a reward function to optimize either a policy function or a value
219 function by sequential interaction with an environment. (ISO/IEC DIS 22989)
220 Note 1 to entry: Policy functions and value functions express a strategy that is learned by
221 the environment.
224 Property of consistent intended behavior and results. (ISO/IEC DIS 22989)
225 6.6 Semi-Supervised Machine Learning
226 Machine learning that makes use of both labelled and unlabelled data during training.
227 (ISO/IEC DIS 22989)
228 Note 1 : Descriptive information can be broader than just labelling. Annotation is the
229 process of attaching descriptive information to data, such as metadata, labels, or anchors.
230 The data itself is unchanged in the annotation process.2
231 Note 2 : Additional information about this term can be found in Section 7.2.
233 Machine learning that makes use of labelled data during training. (ISO/IEC DIS 22989)
234 Note 1 : Descriptive information can be broader than just labelling. Annotation is the
235 process of attaching descriptive information to data, such as metadata, labels, or anchors.
236 The data itself is unchanged in the annotation process.2
237 Note 2 : Additional information about this term can be found in Section 7.2.
239 A subset of the data that is never shown to the ML model during training, used to verify
240 what the model has learned. (Modified from ISO/IEC DIS 22989)
242 Process intended to establish or to improve the parameters of a machine learning model,
243 based on a machine learning algorithm, by using training data. (Modified from ISO/IEC
244 DIS 22989)
246 Subset of input data samples used to train a machine learning model. (ISO/IEC DIS 22989)
248 Machine learning that makes use of unlabelled data during training. (ISO/IEC DIS 22989)
249 Note 1 : Descriptive information can be broader than just labelling. Annotation is the
250 process of attaching descriptive information to data, such as metadata, labels, or anchors.
251 The data itself is unchanged in the annotation process.2
252 Note 2 : Additional information about this term can be found in Section 7.2.
2 ISO/IEC DIS 22989 Information technology — Artificial intelligence — Artificial Intelligence Concepts and Terminology
259 MLMD offer unique benefits, flexibility, and challenges related to their capacity for change.
260 The transparent communication of the various aspects of these changes is important to the safety,
261 performance, and effectiveness of MLMD.
262
263 The examples outlined in this discussion are not exhaustive and the relevant information may
264 expand over time. It is important to note that changes such as software patches, operating system
265 updates, and cybersecurity improvements can affect both MLMD and non-MLMD medical
266 devices; although important, these changes are not within the scope of this discussion.
267
268 There are a number of unique changes related to MLMD, including changes to the ML model or
269 to the environment of use relative to the ML training data. The following discussion highlights
270 these important aspects in two sections, MLMD Changes and MLMD Environmental Changes.
271
272 7.1.1 Changes to MLMD
273 A change to the device could include a modification to the machine learning model, algorithm,
274 weights, or parameters. An MLMD is in a locked state when changes are not permitted. Aspects that
275 describe these changes include the cause, effect, trigger, domain, timing, and effectuation. These
276 attributes describe what changes, as well as why, where, when, and how the MLMD change occurs.
277
278 Note : The word "locked" has been used by the community in a number of different ways. Some
279 have defined a "locked device" as one that has been developed using ML methods but that
280 the developer does not intend to modify at the present time. Others have used the
281 term "locked device" as any device that does not perform "continuous learning." When using the
282 word "locked" it is important to provide clarifying language around its use to communicate how it
283 is being used.
284
285 Figure 2 Aspects of MLMD Changes
286 The cause refers to the source of the change to the MLMD, for example, re-training with new or
287 appended data or new training methods, algorithm/model, tuning, etc.
288
289 The effect refers to the resulting change to the MLMD, which can include an amended intended
290 use or indications for use, modified performance, changes in inputs or outputs, etc.
291
292 The trigger refers to the event that prompts or instigates the change to the MLMD, which can
293 include performance thresholds, training data batch-size thresholds, exposure to new
294 data/experiences, scheduled time intervals, MLMD environmental changes, user feedback, etc.
295
296 The domain refers to the scope or applicable extent of the change to the MLMD, which can be
297 categorized as either homogeneous or heterogeneous. A homogeneous change is a uniform change
298 that occurs universally (sometimes referred to as a global adaptation; note that global does not
299 denote around-the-world). Heterogeneous changes are non-uniform changes that can be specific
300 to one clinic, region, demographic, etc. (sometimes referred to as local adaptations).3
301
302 The effectuation refers to where the mechanism for change implementation resides, which can
303 either be external (i.e., updated by the developer or user) or internal (i.e., updated by a change-
304 control-algorithm within the device).
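
The aspects of an MLMD change discussed above could be recorded in a simple data structure, for example when communicating a change. The Python sketch below is one hypothetical way to do so. The class and field names are assumptions made for illustration, the example values are invented, and timing is kept as free text because this discussion lists it without elaboration.

```python
from dataclasses import dataclass
from enum import Enum


class Domain(Enum):
    HOMOGENEOUS = "homogeneous"      # uniform change (global adaptation)
    HETEROGENEOUS = "heterogeneous"  # non-uniform change (local adaptation)


class Effectuation(Enum):
    EXTERNAL = "external"  # updated by the developer or user
    INTERNAL = "internal"  # updated by a change-control algorithm in the device


@dataclass(frozen=True)
class MLMDChange:
    """Hypothetical record of one change to an MLMD."""
    cause: str         # source of the change
    effect: str        # resulting change to the MLMD
    trigger: str       # event that prompts the change
    domain: Domain     # scope of the change
    timing: str        # when the change occurs (free text, an assumption)
    effectuation: Effectuation

    def summary(self) -> str:
        return (f"{self.cause} -> {self.effect} "
                f"({self.domain.value}, {self.effectuation.value})")


# Invented example values, for illustration only.
change = MLMDChange(
    cause="re-training with appended data",
    effect="modified performance",
    trigger="performance threshold",
    domain=Domain.HETEROGENEOUS,
    timing="at a scheduled interval",
    effectuation=Effectuation.EXTERNAL,
)
```

Using enumerations for domain and effectuation reflects that the text gives each of these exactly two categories, whereas the other aspects are open-ended.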
306 An MLMD environmental change is a modification to the setting of the MLMD relative to the
307 ML development data. Aspects that describe an MLMD environmental change include the cause,
308 effect, and domain.
309
310
311 Figure 3 Aspects of MLMD Environmental Changes
312 The cause of an MLMD environmental change refers to the source of the change relative to the
313 development environment. Examples of such causes include changes to the format or quality of
314 the MLMD inputs (e.g., changes to third party image processing, incidents of adversarial machine
315 learning); changes in the patient population (e.g., demographic shift); changes in clinical practice
316 (e.g., earlier interventions that mask features used by the model for classification), etc.
317
318 The effect of an MLMD environmental change can involve deteriorated or improved performance,
319 effectiveness, or safety.
320
321 The domain of an MLMD environmental change refers to the scope or applicable extent of the
322 change, which can be categorized as either homogeneous or heterogeneous. Heterogeneous
323 changes are non-uniform changes that can be specific to one clinic, region, demographic, etc.
324 (sometimes referred to as local changes). Homogeneous changes are changes that occur uniformly
325 (universally, globally) over some groups or settings/context. Note that global does not denote
326 around-the-world.
3 “Introduction to Online Machine Learning: Simplified”, https://www.analyticsvidhya.com/blog/2015/01/introduction-online-machine-learning-simplified-2/
327 7.2 Supervised / Unsupervised / Semi-Supervised Learning
328 Supervised and Unsupervised Machine Learning are two methods that are commonly used to train
329 machine learning algorithms, but they are not the only methods available. The terms “supervised”
330 and “unsupervised” in a machine learning context refer to the training methods, and specifically
331 whether labelled or unlabelled data are used. Supervised Machine Learning utilizes labelled data
332 during Training to learn the relationship between independent attributes and a designated
333 dependent attribute (the label). In other words, supervised learning is a task to learn a mapping
334 from input to output values, where the correct output values are known (labelled training data).
335 Most induction algorithms are developed through supervised learning. Unsupervised Machine
336 Learning utilizes unlabelled data during Training to group data without a pre-specified dependent
337 attribute. In other words, unsupervised learning is the ability to find patterns from input values,
338 where the output values are unknown. Examples of unsupervised learning include some types of
339 algorithms that perform clustering or dimensionality reduction.
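
The contrast between labelled and unlabelled training data can be made concrete with a small Python sketch. Both methods are deliberately simple stand-ins chosen for illustration: a nearest-neighbour rule for supervised learning and a split at the widest gap for unsupervised grouping of one-dimensional values. The function names and data are hypothetical.

```python
def nearest_neighbour_label(labelled, x):
    """Supervised: predict the label of the closest labelled training example."""
    _, label = min(labelled, key=lambda pair: abs(pair[0] - x))
    return label


def split_at_largest_gap(values):
    """Unsupervised: group 1-D values into two clusters at the widest gap."""
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]


# Labelled training data: each input value comes with a known output (the label).
labelled_training_data = [(1.0, "low"), (1.5, "low"), (8.0, "high"), (9.0, "high")]
predicted = nearest_neighbour_label(labelled_training_data, 7.2)

# Unlabelled training data: only input values, with no designated output.
unlabelled_training_data = [9.0, 1.0, 8.0, 1.5]
cluster_a, cluster_b = split_at_largest_gap(unlabelled_training_data)
```

The supervised function returns one of the labels supplied during training. The unsupervised function returns groups that carry no names, because no labels were ever provided.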
340
341 Machine learning systems can use a mix of supervised and unsupervised learning (sometimes
342 referred to as semi-supervised learning), as well as other learning methods such as Reinforcement
343 Learning.
344
345 The terms “Supervised Machine Learning” and “Unsupervised Machine Learning” are often
346 misunderstood. When used in a machine learning context, “supervised” or “unsupervised” does
347 not refer to the presence or absence of a human supervisor overseeing the software. “Supervised”
348 or “unsupervised” does not refer to the role that the software plays in a clinical environment, i.e.,
349 it does not describe the level of “autonomy” in practice. “Supervised” or “unsupervised” also does
350 not refer to whether the software updates itself in a self-effectuating update process, i.e., whether
351 it performs its own updates or adaptations.
352
354 The term validation has been used to represent different concepts within the fields of medical
355 device development and machine learning algorithm development.
356
357 Validation within the context of medical device development has been defined as follows:
358
359 Validation means confirmation by examination and provision of objective evidence that the
360 particular requirements for a specific intended use can be consistently fulfilled.4
361
362 The term validation has also been used within the field of machine learning to refer to either data
363 curation (sometimes referred to as data validation) or model tuning (sometimes referred to as
364 validation5).
4 Design Control Guidance for Medical Device Manufacturers (GHTF.SG3.N99-9)
5 Ripley, B. (1996). Glossary. In Pattern Recognition and Neural Networks (pp. 347-354). Cambridge: Cambridge
365
366 Data curation and model tuning can occur throughout the product lifecycle. Data curation refers
367 to the selection, management and assessment of the quality attributes of data sets. Model tuning is
368 a particular phase of model development during which ML model hyper-parameters are tuned; this
369 optional tuning phase can be combined with the Training phase to optimize the ML model selection.
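
The roles of the datasets involved in Training and model tuning can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the k-nearest-neighbour classifier, the candidate hyper-parameter values, and all data are hypothetical. The dataset used for tuning is named `tuning_set` rather than "validation set" to avoid the ambiguity discussed in this section.

```python
def knn_predict(train, x, k):
    """Majority label among the k training points closest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)


def accuracy(train, evaluation, k):
    """Fraction of evaluation points whose label is predicted correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in evaluation)
    return hits / len(evaluation)


# Training data, including one deliberately noisy label at 2.5.
training_set = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 0), (2.5, 1),
                (6.0, 1), (7.0, 1), (8.0, 1), (9.0, 1)]
# Data used only to choose the hyper-parameter k (model tuning).
tuning_set = [(2.4, 0), (3.4, 0), (6.4, 1), (8.6, 1)]
# Data never used during training or tuning.
test_set = [(1.4, 0), (7.6, 1)]

# Model tuning: pick the hyper-parameter that performs best on the tuning set.
candidate_ks = (1, 3)
best_k = max(candidate_ks, key=lambda k: accuracy(training_set, tuning_set, k))

# The held-out test set is used once, after tuning, to check the result.
test_accuracy = accuracy(training_set, test_set, best_k)
```

Keeping the three datasets separate means the test result is not influenced by the choice of hyper-parameter. None of these steps constitutes medical device validation in the sense defined above.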
370
371 MLMD manufacturers, regulators, and users should be aware of the conflicting interpretations of
372 the term validation and ensure that communication regarding the development phases and the
373 associated datasets is clear to avoid confusion between data validation, model tuning, and medical
374 device validation. It is recommended that the use of the term “validation” be accompanied by the
375 context when referring to model tuning, data curation, and the associated datasets. Alternatively,
376 the use of the term validation that refers to the training and tuning process may be avoided in the
377 context of medical device development.
378