AI Breast Screening

AI shows promise for breast cancer screening


measurements. In absorption spectroscopy, the signal is sensed only indirectly, from the light that does not interact with the sample (Fig. 1a). Weak absorption is therefore very difficult to detect, because it changes the intensity of the transmitted light only marginally. Theoretically, the detection of weak absorbers could be improved by increasing the intensity of the incident light, but commonly used infrared detectors become less sensitive at higher light intensities¹², imposing a practical limit on the maximum light intensity that can be used. By contrast, Pupeza et al. detect the signal of interest, the radiation emitted from the vibrating molecules, directly (Fig. 1b). This is analogous to the difference between absorbance and fluorescence measurements in the visible spectral range: fluorescence measurements are the more sensitive because they detect a signal directly from the sample, and can even detect it from a single molecule.

Pupeza and colleagues demonstrate the high sensitivity of their approach in various ways. For example, they were able to detect 40-fold lower concentrations of a compound in solution, and to better distinguish between two similar compounds, than when using absorption spectroscopy. They also obtained spectra of biological samples that block nearly all of the incoming light (in one case, at least 99.999%). Thus, the new approach senses light where currently used methods see only darkness. This is an impressive achievement, and might alleviate both of the main problems of conventional infrared spectroscopy: sensitivity and strong infrared absorption by water. It will simplify sample preparation in many cases by removing the need for sample concentration or drying, and will open up new applications, particularly those involving aqueous biological samples.

The authors suggest several ideas for taking the method further, such as by increasing the power of the laser used to irradiate the sample. It is to be hoped that such measures will further narrow the technological gap that at present prevents the method from achieving the ultimate goal of single-molecule sensitivity in bulk water. Other challenges will be to increase the spectral range of the measurements to include the shorter wavelengths at which prominent and diagnostically useful signals are found for proteins, lipids and nucleotides, and to develop a spectrometer suitable for commercialization at a competitive price.

Andreas Barth is in the Department of Biochemistry and Biophysics, Stockholm University, Stockholm 106 91, Sweden.
e-mail: [email protected]

1. Pupeza, I. et al. Nature 577, 52–59 (2020).
2. Herschel, W. Phil. Trans. R. Soc. Lond. 90, 284–292 (1800).
3. van Dishoeck, E. F. Annu. Rev. Astron. Astrophys. 42, 119–167 (2004).
4. Ehrenfreund, P. & Charnley, S. B. Annu. Rev. Astron. Astrophys. 38, 427–483 (2000).
5. Bibring, J.-P. et al. Nature 428, 627–630 (2004).
6. Chalmers, J. M. & Griffiths, P. R. (eds) Handbook of Vibrational Spectroscopy Vols 4 & 5 (Wiley, 2001).
7. Barth, A. & Haris, P. I. (eds) Biological and Biomedical Infrared Spectroscopy (IOS, 2009).
8. Sun, D.-W. (ed.) Infrared Spectroscopy for Food Quality Analysis and Control (Academic, 2008).
9. Chalmers, J. M., Edwards, H. G. M. & Hargreaves, M. D. (eds) Infrared and Raman Spectroscopy in Forensic Science (Wiley, 2012).
10. Bunaciu, A. A., Fleschin, Ş., Hoang, V. D. & Aboul-Enein, H. Y. Crit. Rev. Anal. Chem. 47, 67–75 (2017).
11. Wu, Q. & Zhang, X.-C. Appl. Phys. Lett. 67, 3523–3525 (1995).
12. Theocharous, E., Ishii, J. & Fox, P. N. Appl. Opt. 43, 4182–4188 (2004).

Medical research

AI shows promise for breast cancer screening

Etta D. Pisano

Could artificial intelligence improve the accuracy of screening for breast cancer? A comparison of the diagnostic performance of expert physicians and computers suggests so, but the clinical implications are as yet uncertain. See p.89

Screening is used to detect breast cancer early in women who have no obvious signs of the disease. This image-analysis task is challenging because cancer is often hidden or masked in mammograms by overlapping ‘dense’ breast tissue. The problem has stimulated efforts to develop computer-based artificial-intelligence (AI) systems to improve diagnostic performance. On page 89, McKinney et al.¹ report the development of an AI system that outperforms expert radiologists in accurately interpreting mammograms from screening programmes. The work is part of a wave of studies investigating the use of AI in a range of medical-imaging contexts².

Despite some limitations, McKinney and colleagues’ study is impressive. Its strengths include the large scale of the data sets used for training and subsequently validating the AI algorithm. Mammograms for 25,856 women in the United Kingdom and 3,097 women in the United States were used to train the AI system. The system was then used to identify the presence of breast cancer in mammograms of women who were known to have had either biopsy-proven breast cancer or normal follow-up imaging results at least 365 days later. These outcomes are the widely accepted gold standard for confirming breast cancer status in people undergoing screening for the disease. The authors report that the AI system outperformed both the historical decisions made by the radiologists who initially assessed the mammograms, and the decisions of 6 expert radiologists who interpreted 500 randomly selected cases in a controlled study.

McKinney and colleagues’ results suggest that AI might some day have a role in aiding the early detection of breast cancer, but the authors rightly note that clinical trials will be needed to further assess the utility of this tool in medical practice. The real world is more complicated and potentially more diverse than the type of controlled research environment reported in this study. For example, the study did not include all the different mammography technologies currently in use, and most images were obtained using a mammography system from a single manufacturer. The study included examples of two types of mammogram: tomosynthesis (also known as 3D mammography) and conventional digital (2D) mammography. It would be useful to know how the system performed individually for each technology.

The demographics of the population studied by the authors are not well defined, apart from by age. The performance of AI algorithms can be highly dependent on the population used in the training sets. It is therefore important that a representative sample of the general population be used in the development of this technology, to ensure that the results are broadly applicable.

Another reason to temper excitement about this and similar AI studies is the lessons learnt from computer-aided detection (CAD) of breast cancer. CAD, an earlier computer system aimed at improving mammography interpretation in the clinic, showed great promise in experimental testing, but fell short in real-world settings³. CAD marks

Nature | Vol 577 | 2 January 2020 | 35


© 2020 Springer Nature Limited. All rights reserved.
News & views
mammograms to draw the interpreter’s attention to areas that might be abnormal. However, analysis of a large sample of clinical mammography interpretations from the US Breast Cancer Surveillance Consortium registry demonstrated that there was no improvement in diagnostic accuracy with CAD³. Moreover, that study revealed that the addition of CAD worsened sensitivity (the performance of radiologists in determining that cancer was present), thus increasing the likelihood of a false negative test. CAD did not result in a significant change in specificity (the performance of radiologists in determining that cancer was not present) and the likelihood of a false positive test³.

It has been speculated that CAD was not as useful in the clinic as experimental data suggested it might be because radiologists ignored or misused its input owing to the high frequency of marks on the images that were not findings suggestive of cancer. This outcome was attributed by some to the limited processing power available for CAD, which meant that comparisons with previous imaging studies of the same person were not possible⁴. Thus, CAD might mark regions that were not changing over time and that could be easily dismissed by expert readers. Another factor that limited CAD is that it was developed using the performance of human-based diagnosis. It was trained using mammograms in which humans had found signs of cancer and others that were false negatives (cases in which humans could not see signs of cancer although the disease was indeed present)⁴. Similar pitfalls could be encountered with AI-based decision aids, too.

A system by which AI finds abnormalities that humans miss will require radiologists to adapt to the use of these types of tool. Imagine a system in which an algorithm marks a dense breast area on a screening mammogram and the human radiologist cannot see anything that looks potentially malignant. With CAD, radiologists scrutinize the areas marked, and if they decide the mark is probably not cancer, they assign the mammogram as being negative for malignancy. However, if AI algorithms are to make a bigger difference than CAD in detecting cancers that are currently missed, an abnormality detected by the AI system, but not perceived as such by the radiologist, would probably require extra investigation. This might result in a rise in the number of people who receive callbacks for further evaluation. A clinical trial would show the effect of the AI system on the detection of cancer and the rate of false positive diagnoses, while also allowing the development of effective clinical practice in response to mammograms flagged as abnormal by AI but not by the radiologist.

In addition, it would be essential to develop a mechanism for monitoring the performance of the AI system as it learns from cases it encounters, as occurs in machine-learning algorithms. Such performance metrics would need to be available to those using these tools, in case performance deteriorates over time.

It is sobering to consider the sheer volume of data needed to develop and test AI algorithms for clinical tasks. Breast cancer screening is perhaps an ideal application for AI in medical imaging because large curated data sets suitable for algorithm training and testing are already available, and information for validating straightforward clinical end points is readily obtainable. Breast cancer screening programmes routinely measure their diagnostic performance: whether cancer is correctly detected (a true positive) or missed (a false negative). Some areas found on mammograms might be identified as abnormal but turn out on further testing not to be cancerous (false positives). For most women, screening identifies no abnormalities, and when there is still no evidence of cancer one year later, this is classified as a true negative.

Most other medical tasks have more-complicated clinical outcomes, however, in which the clinician’s decision is not a binary one (between the presence or absence of cancer), and thus further signs and symptoms must also be considered. In addition, most diseases lack readily accessible, validated data sets in which the ‘truth’ is defined relatively easily. Obtaining validated data sets for more-complex clinical problems will require greater effort by readers and the development of tools that can interrogate electronic health records to identify and annotate cases representing specific diagnoses.

To achieve the promise of AI in health care that is implied by McKinney and colleagues’ study, anonymized data in health records might thus have to be treated as precious resources of potential benefit to human health, in much the same way as public utilities such as drinking water are currently treated. Clearly, however, if such AI systems are to be developed and used widely, attention must be paid to patient privacy, and to how data are stored and used, by whom, and with what type of oversight.

Etta D. Pisano is at the American College of Radiology, Philadelphia, Pennsylvania 19103, USA, and at Beth Israel Lahey Medical Center, Harvard Medical School, Boston, Massachusetts.
e-mail: [email protected]

1. McKinney, S. M. et al. Nature 577, 89–94 (2020).
2. Neri, E. et al. Insights Imaging 10, 44 (2019).
3. Lehman, C. D. et al. JAMA Intern. Med. 175, 1828–1837 (2015).
4. Kohli, A. & Jha, S. J. Am. Coll. Radiol. 15, 535–537 (2018).

Astronomy

Galaxy cluster illuminates the cosmic dark ages

Nina A. Hatch

Observations of a distant cluster of galaxies suggest that star formation began there only 370 million years after the Big Bang. The results provide key details about where and when the first stars and galaxies emerged in the Universe. See p.39

Shortly after the Big Bang, the Universe was completely dark. Stars and galaxies, which provide the Universe with light, had not yet formed, and the Universe consisted of a primordial soup of neutral hydrogen and helium atoms and invisible ‘dark matter’. During these cosmic dark ages, which lasted for several hundred million years, the first stars and galaxies emerged. Unfortunately, observations of this era are challenging because dark-age galaxies are exceptionally faint¹. On page 39, Willis et al.² provide a glimpse of what happened during the dark ages by doing some galactic archaeology. By measuring the ages of stars in one of the most distant clusters of galaxies known, the authors located galaxies that formed stars in the dark ages, close to the earliest possible time that stars could emerge.

A galaxy cluster is a group of thousands of galaxies that orbit each other at speeds³ of about 1,000 kilometres per second. They are prevented from flying apart by the gravitational pull of the accompanying dark matter, which has the equivalent total mass of about one hundred trillion Suns⁴. Astronomers use these clusters as laboratories for many experiments in astrophysics, such as measuring the composition of the Universe, testing theories of gravity and determining how galaxies form. Willis et al. used one of the


