
Noninvasive Diagnosis of Nonalcoholic Fatty Liver Disease and Quantification of Liver Fat With

This study developed and evaluated deep learning algorithms using one-dimensional convolutional neural networks to noninvasively diagnose nonalcoholic fatty liver disease (NAFLD) and quantify liver fat using radiofrequency ultrasound data. The algorithms demonstrated high accuracy in diagnosing NAFLD with a sensitivity of 97% and specificity of 94%, and they correlated well with MRI-derived proton density fat fraction measurements. The findings suggest that these deep learning techniques can provide an effective alternative to traditional imaging methods for assessing hepatic steatosis.


ORIGINAL RESEARCH • GASTROINTESTINAL IMAGING

Noninvasive Diagnosis of Nonalcoholic Fatty Liver Disease and Quantification of Liver Fat with Radiofrequency Ultrasound Data Using One-dimensional Convolutional Neural Networks
Aiguo Han, PhD • Michal Byra, PhD • Elhamy Heba, MD1 • Michael P. Andre, PhD • John W. Erdman, Jr, PhD •
Rohit Loomba, MD, MHSc • Claude B. Sirlin, MD • William D. O’Brien, Jr, PhD
From the Bioacoustics Research Laboratory, Department of Electrical and Computer Engineering (A.H., W.D.O.), and Department of Food Science and Human Nutrition (J.W.E.), University of Illinois at Urbana-Champaign, 306 N Wright St, Urbana, IL 61801; Department of Radiology (M.B., M.P.A.), Liver Imaging Group, Department of Radiology (E.H., C.B.S.), and NAFLD Research Center, Division of Gastroenterology, Department of Medicine (R.L.), University of California, San Diego, La Jolla, Calif; and Department of Ultrasound, Institute of Fundamental Technological Research, Polish Academy of Sciences, Warsaw, Poland (M.B.). Received May 21, 2019; revision requested July 29; revision received December 2; accepted December 18. Address correspondence to A.H. (e-mail: [email protected]).
Supported by the National Institutes of Health (R01DK106419).
1 Current address: Department of Radiology, SUNY Upstate Medical University, Syracuse, NY.
Conflicts of interest are listed at the end of this article.
See also the editorial by Lockhart and Smith in this issue.

Radiology 2020; 295:342–350 • https://doi.org/10.1148/radiol.2020191160

Background: Radiofrequency ultrasound data from the liver contain rich information about liver microstructure and composition.
Deep learning might exploit such information to assess nonalcoholic fatty liver disease (NAFLD).

Purpose: To develop and evaluate deep learning algorithms that use radiofrequency data for NAFLD assessment, with MRI-derived
proton density fat fraction (PDFF) as the reference.

Materials and Methods: A HIPAA-compliant secondary analysis of a single-center prospective study was performed for adult participants with NAFLD and control participants without liver disease. Participants in the parent study were recruited between February 2012 and March 2014 and underwent same-day US and MRI of the liver. Participants were randomly divided into training and test groups of equal size. The training group was used to develop two algorithms via cross-validation: a classifier to diagnose NAFLD (MRI PDFF ≥ 5%) and a fat fraction estimator to predict MRI PDFF. Both algorithms used one-dimensional convolutional neural networks. The test group was used to evaluate the classifier for sensitivity, specificity, positive predictive value, negative predictive value, and accuracy and to evaluate the estimator for correlation, bias, limits of agreement, and linearity between predicted fat fraction and MRI PDFF.

Results: A total of 204 participants were analyzed: 140 had NAFLD (mean age, 52 years ± 14 [standard deviation]; 82 women) and 64 were control participants (mean age, 46 years ± 21; 42 women). In the test group, the classifier provided 96% (95% confidence interval [CI]: 90%, 99%) (98 of 102) accuracy for NAFLD diagnosis (sensitivity, 97% [95% CI: 90%, 100%], 68 of 70; specificity, 94% [95% CI: 79%, 99%], 30 of 32; positive predictive value, 97% [95% CI: 90%, 99%], 68 of 70; negative predictive value, 94% [95% CI: 79%, 98%], 30 of 32). The estimator-predicted fat fraction correlated with MRI PDFF (Pearson r = 0.85). The mean bias was 0.8% (P = .08), and 95% limits of agreement were -7.6% to 9.1%. The predicted fat fraction was linear with an MRI PDFF of 18% or less (r = 0.89, slope = 1.1, intercept = 1.3) and nonlinear with an MRI PDFF greater than 18%.

Conclusion: Deep learning algorithms using radiofrequency ultrasound data are accurate for diagnosis of nonalcoholic fatty liver disease and hepatic fat fraction quantification when other causes of steatosis are excluded.
© RSNA, 2020

Online supplemental material is available for this article.

Abbreviations
AUC = area under receiver operating characteristic curve, CI = confidence interval, CNN = convolutional neural network, NAFLD = nonalcoholic fatty liver disease, PDFF = proton density fat fraction, RF = radiofrequency, TGC = time gain compensation

Summary
When other causes of steatosis are excluded, de novo one-dimensional convolutional neural network algorithms can accurately identify nonalcoholic fatty liver disease and quantify hepatic fat fraction by using raw radiofrequency ultrasound data.

Key Results
• Deep learning with raw ultrasound data provided hepatic fat fraction estimates correlated to proton density fat fraction measured with confounder-corrected chemical shift–encoded MRI (Pearson r = 0.85).
• The proposed deep learning approach can diagnose nonalcoholic fatty liver disease (area under the receiver operating characteristic curve, 0.98) and is robust to changes in system settings, including transmit focal range and time gain compensation.

Nonalcoholic fatty liver disease (NAFLD) is the most common chronic liver disease worldwide, affecting approximately 25% of the human population (1). NAFLD covers a spectrum of liver abnormalities ranging from simple steatosis to nonalcoholic steatohepatitis. Hepatic steatosis, characterized by the accumulation of fat droplets within hepatocytes, can progress to nonalcoholic steatohepatitis, fibrosis, cirrhosis, and even hepatocellular carcinoma (1,2). Early detection and treatment may halt or reverse NAFLD progression (2). Liver biopsy remains the reference standard for diagnosing NAFLD and grading hepatic steatosis. However, biopsy is costly, invasive, and inappropriate for screening.
There is a critical need to develop noninvasive imaging methods to assess hepatic steatosis. Several modalities have been investigated (3–8), among which MRI and conventional (qualitative) US have the advantage of involving no ionizing radiation. Confounder-corrected chemical shift–encoded MRI can measure the proton density fat fraction (PDFF), a leading method for noninvasive quantification of hepatic steatosis (4,5). However, chemical shift–encoded MRI is not routinely accessible. Conventional US is widely available for NAFLD assessment but is limited by its qualitative nature, operator dependency, and modest accuracy (3).
Quantitative analysis of raw radiofrequency (RF) ultrasound signals shows potential for objective and accurate disease assessment (9). This analysis is based on the premise that by altering tissue microstructures, disease processes can cause quantifiable RF signal changes. Of note, the RF signals contain more information than do gray-scale B-mode images because information is lost or altered when B-mode images are generated from the raw data (9). Thus, when compared with B-mode images, the rich RF data may allow more comprehensive characterization of pathophysiologic conditions. By using a well-characterized phantom for system calibration, quantitative ultrasound techniques analyze the fundamental RF signals to extract system-independent parameters, such as the attenuation and backscatter coefficients, with minimal operator dependency (10–14). These coefficients are correlated with hepatic fat fraction (15–17).
To take further advantage of RF signals and to eliminate the dependency on a calibration phantom, we propose a phantom-free deep-learning ultrasound approach for objective, accurate, and automated NAFLD diagnosis and liver fat quantification. Deep learning (18,19) based on convolutional neural networks (CNNs) can extract features from raw data and has been applied to ultrasound B-mode image analysis (20–25) but not RF analysis for steatosis assessment. The current study developed and evaluated one-dimensional CNNs for NAFLD diagnosis and liver fat quantification using contemporaneous MRI PDFF as the reference standard. MRI PDFF was used because it accurately quantifies liver fat (4,5) and can be safely and ethically acquired in asymptomatic control participants.

Materials and Methods

Study Participants
This study was a secondary analysis of 204 prospectively enrolled adult research participants with NAFLD and control participants without liver disease. The parent study was reported in a previous article (16) and used a different ultrasound analysis technique. For the current analysis, we developed and evaluated deep learning techniques in the same participants. The University of California, San Diego, approved this secondary analysis and the parent study, both of which complied with the Health Insurance Portability and Accountability Act. All participants provided written informed consent.
Participants were consecutively recruited by an expert hepatologist (R.L., >10 years of experience) from the University of California, San Diego, NAFLD Research Center between February 2012 and March 2014. Inclusion criteria were age of at least 18 years and willingness and ability to participate. Exclusion criteria were clinical, laboratory, or histologic evidence of a liver disease other than NAFLD; excessive alcohol consumption (>30 g per day within the past 10 years or >10 g per day in the previous year); and steatogenic or hepatotoxic medication use. NAFLD in study participants was defined as MRI PDFF of 5% or greater, with other causes of steatosis excluded (16). Control participants (MRI PDFF <5%) had no liver disease based on comprehensive clinical and laboratory testing performed under the supervision of and interpreted by the hepatologist. All participants underwent same-day US and chemical shift–encoded MRI of the liver.

Ultrasound Protocol
Nonenhanced US was performed by a research physician (E.H., 1 year of hands-on US training) using the 4C1 convex array (1–4 MHz) on a clinical ultrasound machine (Siemens S2000; Siemens, Issaquah, Wash) with an Ultrasound Research Interface option that allowed direct acquisition of RF data. Participants were positioned in the dorsal decubitus position, with the right arm at maximum abduction. The right liver lobe was visualized via a right intercostal approach, and a representative region of the parenchyma was identified, thereby avoiding major vasculature. The physician adjusted the transmit focal range and time gain compensation (TGC) (ie, a setting that reduces the effect of ultrasound attenuation on clinical images by increasing the received signal intensity with time [depth] [26]) for each participant, while fixing other settings. Ten consecutive RF frames were recorded at a rate of 10 frames per second when the participant was executing a breath hold in shallow expiration. Each frame had 560 lateral lines and was 10 cm deep. The machine automatically recorded the transmit focal range and the TGC settings (Appendix E1 [online]).

Ultrasound Data Preprocessing
A fixed region of interest with standard size and location (central 256 RF lines laterally; 1.8–9.7 cm axially) relative to the image frame was used for the one-dimensional CNN algorithms, yielding 2560 RF signals per participant (256 RF signals per frame × 10 frames per participant). The region of interest was intended to cover as much of the liver region below the liver capsule as possible while generally avoiding tissues outside the liver. A fixed region of interest rather than a hand-drawn one tailored to each participant’s liver anatomy was applied to minimize human intervention. No effort was made to

Radiology: Volume 295: Number 2—May 2020 • radiology.rsna.org


Diagnosis of NAFLD with Radiofrequency Ultrasound Data

Table 1: Demographic, Physical, Biochemical, and MRI Proton Density Fat Fraction Characteristics of Study Participants

Characteristic | Training Group (n = 102) | Test Group (n = 102) | P Value
Men (%)† | 40 | 38 | .89
Age (y)* | 51 ± 17 | 49 ± 17 | .34
Height (cm)* | 166 ± 10 | 167 ± 10 | .38
Weight (kg)* | 85 ± 21 | 84 ± 20 | .81
BMI (kg/m²)* | 31 ± 6 | 30 ± 6 | .43
Ethnic origin (%)† | | | .67
  White | 47 | 48 | …
  Hispanic | 31 | 26 | …
  Asian | 14 | 16 | …
  Black | 4 | 4 | …
  Other | 4 | 6 | …
Diabetes† | 42 | 47 | .57
Biochemical profile*
  Hemoglobin (g/dL) | 14 ± 2 | 14 ± 2 | .09
  Hematocrit (%) | 40 ± 4 | 42 ± 4 | .04
  Platelet count (×10³/µL) | 251 ± 72 | 255 ± 66 | .68
  AST (U/L) | 34 ± 27 | 34 ± 36 | .95
  ALT (U/L) | 42 ± 37 | 44 ± 55 | .81
  Alkaline phosphatase (U/L) | 76 ± 28 | 74 ± 23 | .53
  GGT (U/L) | 45 ± 46 | 41 ± 45 | .60
  Total bilirubin (mg/dL) | 0.5 ± 0.4 | 0.5 ± 0.3 | .86
  Albumin (g/dL) | 4.5 ± 0.4 | 4.9 ± 3.9 | .32
  Glucose (mg/dL) | 106 ± 47 | 110 ± 48 | .52
  Triglycerides (mg/dL) | 145 ± 81 | 163 ± 275 | .54
  Total cholesterol (mg/dL) | 183 ± 41 | 180 ± 46 | .65
  HDL cholesterol (mg/dL) | 55 ± 21 | 54 ± 16 | .72
  LDL cholesterol (mg/dL) | 101 ± 32 | 97 ± 30 | .31
  INR | 1.0 ± 0.2 | 1.0 ± 0.2 | .57
Imaging*
  MRI PDFF, segments 5–8 (%) | 11 ± 9 | 11 ± 8 | .54
Note.—Unless otherwise noted, data are mean ± standard deviation. ALT = alanine aminotransferase, AST = aspartate aminotransferase, BMI = body mass index, GGT = γ-glutamyl transpeptidase, HDL = high-density lipoprotein, INR = international normalized ratio, LDL = low-density lipoprotein, PDFF = proton density fat fraction (mean calculated from segments 5–8). Table 1 is adapted and reprinted, with permission, from reference 16.
* Mean value provided with standard deviations and P values (t test). All laboratory results were obtained while patients were fasting.
† The χ² test P values are presented; note that the χ² test for comparing ethnic proportions in the two groups was conducted for white patients versus Hispanic patients versus Asian patients, black patients, and those with some other ethnicity.

completely exclude regions outside the liver, however, and the regions of interest contained variable amounts of extrahepatic tissue and structures.
Because TGC settings affect RF signals, quantitative analyses were performed before and after removal of the machine-recorded TGC settings.
The RF data of the last five frames were corrupted in two participants in the test group. The intact frames (frames 1–5) were duplicated for both participants to make 10 frames per participant for convenience of algorithm testing.
To reduce data size, the signals were downsampled by decimating the RF by four (ie, keeping every fourth sample) without filtering, to reduce the sampling frequency from 40 MHz to 10 MHz. According to the Nyquist-Shannon sampling theorem, the 10-MHz sampling frequency was sufficient to preserve useful information contained in the original signal acquired from the 4C1 transducer (bandwidth, approximately 2–4 MHz). The downsampled signal containing 1024 sample points was input to the one-dimensional CNNs.

CNN Algorithm Development
Two one-dimensional CNN algorithms were developed: a binary classifier and a fat fraction estimator. For each RF signal input, the classifier output an NAFLD classification score between 0 and 1, and the fat fraction estimator output the predicted fat fraction as a percentage.
The participants were equally divided into training (n = 102) and test (n = 102) groups by using stratified randomization (16). The algorithms were developed by using the training group via cross-validation and were evaluated by using the test group. Details are presented in Appendix E2 (online), and the code is available for research use at https://github.com/han51/nafld-1d-cnn.
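The decimation step described under Ultrasound Data Preprocessing can be illustrated with a minimal sketch. It is not the authors' code (see the repository cited above for that). The sampling frequencies, the factor of four, and the approximate 4-MHz upper band edge come from the text; the 4096-sample input length, the random test signal, and the function name `decimate_without_filter` are assumptions made only for illustration.

```python
import numpy as np

FS_ORIGINAL_HZ = 40e6   # original sampling frequency stated in the text
DECIMATION_FACTOR = 4   # "keeping every fourth sample"
BAND_UPPER_HZ = 4e6     # approximate upper band edge of the 4C1 transducer


def decimate_without_filter(rf_line, factor=DECIMATION_FACTOR):
    """Keep every `factor`-th sample; no anti-aliasing filter is applied."""
    return np.asarray(rf_line)[::factor]


fs_decimated_hz = FS_ORIGINAL_HZ / DECIMATION_FACTOR   # 10 MHz
nyquist_hz = fs_decimated_hz / 2                        # 5 MHz
# Content up to the band edge stays below the new Nyquist frequency.
band_is_preserved = nyquist_hz > BAND_UPPER_HZ

# Hypothetical RF line: the 4096-sample length is an assumption chosen so that
# decimation by four gives the 1024 points stated in the text.
rng = np.random.default_rng(0)
rf_line = rng.standard_normal(4096)
rf_decimated = decimate_without_filter(rf_line)

print(f"decimated sampling frequency: {fs_decimated_hz / 1e6:.0f} MHz")
print(f"Nyquist frequency: {nyquist_hz / 1e6:.0f} MHz, band preserved: {band_is_preserved}")
print(f"samples per line after decimation: {rf_decimated.size}")
```

Skipping the anti-aliasing filter is only safe when the signal has negligible energy above the new Nyquist frequency (5 MHz here), which is the condition the text invokes for the 2–4 MHz transducer band.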


Figure 1: Flowchart of study participants included and excluded in the study, adapted from reference 16. PDFF = proton density fat fraction.

Statistical Analysis
Algorithms were evaluated at the participant level. Because each algorithm generated one output per RF signal input and because there were 2560 signal inputs per participant, the classifier and fat fraction estimator generated 2560 NAFLD classification scores and 2560 fat fraction estimates per participant, respectively. The 2560 outputs were averaged for each algorithm to yield composite per-participant scores and estimates. A cutoff of 0.5 for the composite score was set a priori for the NAFLD classifier.
To evaluate classifier performance, we calculated sensitivity, specificity, positive predictive value, negative predictive value, and overall accuracy for NAFLD identification in the test group. We also generated the receiver operating characteristic curve of the composite NAFLD classification score for the test group and calculated the area under the receiver operating characteristic curve (AUC) and the 95% confidence interval (CI). The DeLong test (27) was performed to compare the AUCs obtained by using signals without and with TGC.
To evaluate the fat fraction estimator performance, we calculated the correlation (Pearson r), bias, limits of agreement, and linearity between predicted fat fraction and MRI PDFF. Linearity was assessed by using sequential tests of polynomial fits to the plot of estimated fat fraction versus MRI PDFF (28). Linear range was identified if the linearity test over the entire MRI PDFF range failed. Linear regression slope, intercept, and R² were evaluated. Analyses were performed by using MATLAB R2016a (Mathworks, Natick, Mass) and RStudio 1.2 (RStudio, Boston, Mass) software. A P value less than .05 indicated statistical significance.

Results

Participant Characteristics
Participant characteristics are reported in Table 1. A total of 204 participants (Fig 1), 140 with NAFLD (mean age, 52 years ± 14 [standard deviation]; 82 women) and 64 control participants (mean age, 46 years ± 21; 42 women), were analyzed. Men made up 40% (41 of 102) of the training group and 38% (39 of 102) of the test group. The mean body mass index was 31 kg/m² ± 6 in the training group and 30 kg/m² ± 6 in the test group. The mean MRI PDFF (segments 5–8) was 11% ± 9 and 11% ± 8 in the training and test groups, respectively. In each group, MRI PDFF ranged from 1% to 35%, and 70 of 102 participants (69%) had NAFLD (MRI PDFF ≥ 5%).

RF Signals and B-Mode Images
Representative B-mode images (with TGC) and RF signals are shown for two participants, referred to as participants A (MRI PDFF, 1%) (Fig 2) and B (MRI PDFF, 28%) (Fig 3). Figure 2a is a B-mode image reconstructed from frame 1 of the raw RF data (with TGC) acquired in participant A. The fixed region of interest is outlined by the superimposed yellow box. The blue dashed line is one of the 256 lines covered by the region of interest. The RF signals without and with TGC corresponding to the blue line in Figure 2a are shown in Figure 2b. Ultrasound attenuation with time (depth) was modest in the RF signal without TGC (Fig 2b), corresponding to the lower fat fraction. The TGC caused the deeper and weaker signals to be artificially more pronounced than the more superficial signals.
The reconstructed B-mode images were visually similar between adjacent frames (Fig 2a, 2c), although careful examination revealed differences that were likely due to slight motion between frames. In contrast, the RF signals along the same scan line (Fig 2d) were noticeably different between adjacent frames (eg, different intensities around 40, 70, and 95 µsec), both without and with TGC, as might be expected due to random and structured effects, which contributed to speckle and noise.
The reconstructed B-mode image and RF signals from participant B (Fig 3) visually differed from those from participant A (Fig 2a, 2b). The B-mode image was more homogeneous for participant B than for participant A. Blood vessels were visible on the B-mode image for participant A but were obscured for participant B. For the RF signals from participant B, increased attenuation with time (depth) was evident without TGC, corresponding to the higher fat fraction. The TGC compensated for this attenuation by increasing signal amplification with time.

Classification
In the test group, the composite NAFLD classification score provided a high degree of discrimination between control participants without liver disease and participants with NAFLD, as demonstrated by the receiver operating characteristic curves obtained by using RF signals without and with TGC (Fig 4). The AUCs were 0.98 (95% CI: 0.94, 1.00) and 0.95 (95% CI: 0.91, 0.99) for scores obtained by using RF signals without and with TGC, respectively. The two AUC estimates did not differ (P = .23).
Applying the predetermined threshold of 0.5 on the composite NAFLD classification score for NAFLD diagnosis in the test group yielded 68 true-positive results, two false-positive results, two false-negative results, and 30 true-negative results when RF signals without TGC were used and yielded 64


Figure 2: Data from 22-year-old woman with low proton density fat fraction (1%) (control participant, denoted participant A). Computer-reconstructed nonenhanced ultrasound B-mode images (sagittal plane with time gain compensation) and the underlying radiofrequency signals. (a) B-mode image frame 1 (with time gain compensation), with yellow outline superimposed to indicate the region of interest for deep learning analysis. (b) Radiofrequency signals corresponding to the blue line in a, without and with time gain compensation. (c) B-mode image frame 2 (with time gain compensation). (d) Radiofrequency signals corresponding to same location as indicated by the blue line in c but different frames (blue = frame 1, black = frame 2) without and with time gain compensation. Fixed region of interest includes signals from outside the liver.

true-positive results, four false-positive results, six false-negative results, and 28 true-negative results when RF signals with TGC were used. These diagnostic results yielded a classification accuracy of 96% in the test group using RF signals without TGC, with 97% sensitivity, 94% specificity, 97% positive predictive value, and 94% negative predictive value (Table 2). They yielded a classification accuracy of 90% in the test group using RF signals with TGC, with 91% sensitivity, 88% specificity, 94% positive predictive value, and 82% negative predictive value (Table 2).

Fat Fraction Estimation
The predicted fat fraction values correlated with the MRI PDFF in the test group for RF signals without and those with TGC (Fig 5). The Pearson correlation coefficient was 0.85 (P < .001) and 0.80 (P < .001) for use of RF signals without and with TGC, respectively.
Graphically, the predicted fat fraction versus MRI PDFF scatterplots (Fig 5) track the identity line. A linearity test (28) showed no nonlinearity between predicted fat fraction and MRI PDFF for MRI PDFF of 18% or less, regardless of whether TGC was removed. Linear regression of the predicted fat fraction against MRI PDFF within the linear range (MRI PDFF ≤ 18%) yielded a slope of 1.1, an intercept of 1.3, and R² of 0.79 (Pearson r = 0.89) when signals without TGC were used and a slope of 0.9, an intercept of 3.1, and R² of 0.59 (Pearson r = 0.77) when signals with TGC were used. The fat fraction estimator underestimated the fat fraction for MRI PDFF greater than 18%, suggesting a saturation effect outside the linear range. Linear regression of the predicted fat fraction against MRI PDFF over the entire MRI PDFF range (MRI PDFF < 35%) yielded a slope of 0.7, an intercept of 3.8, and R² of 0.73 (Pearson r = 0.85) when signals without TGC were used and a slope of 0.6, an intercept of 4.8, and R² of 0.64 (Pearson r = 0.80) when signals with TGC were used; the R² values were equal to the squared values of the Pearson correlation coefficients, as expected.
The mean bias of the predicted fat fraction over the entire MRI PDFF range was 0.8% (P = .08), and 95% limits of agreement were -7.6% to 9.1% when signals without TGC were used (Fig 6). When signals with TGC were used, the mean bias became 0.34% (P = .49), and the 95% limits of agreement were -9.4% to 10.0%.
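The participant-level classification figures reported in the Results can be re-derived from the four confusion-matrix counts. The sketch below first shows the averaging and thresholding of per-signal scores described under Statistical Analysis, then recomputes the point estimates. It is an illustration, not the authors' implementation: the function names, the made-up per-signal scores, and the choice to treat a composite score exactly at the cutoff as positive are assumptions, and the confidence intervals are not reproduced.

```python
import numpy as np


def composite_score(per_signal_scores):
    """Average the per-signal classifier outputs into one participant-level score."""
    return float(np.mean(per_signal_scores))


def diagnose_nafld(per_signal_scores, cutoff=0.5):
    """Assumption: a composite score at or above the cutoff counts as NAFLD."""
    return composite_score(per_signal_scores) >= cutoff


def diagnostic_metrics(tp, fp, fn, tn):
    """Point estimates only; confidence intervals are not reproduced here."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Made-up per-signal scores for one hypothetical participant (2560 signals).
rng = np.random.default_rng(1)
scores = rng.uniform(0.6, 1.0, size=2560)
print("hypothetical participant classified as NAFLD:", diagnose_nafld(scores))

# Confusion-matrix counts reported in the text for the test group.
without_tgc = diagnostic_metrics(tp=68, fp=2, fn=2, tn=30)
with_tgc = diagnostic_metrics(tp=64, fp=4, fn=6, tn=28)
for label, metrics in (("without TGC", without_tgc), ("with TGC", with_tgc)):
    print(label, {name: f"{100 * value:.0f}%" for name, value in metrics.items()})
```

For example, accuracy without TGC is (68 + 30) / 102 = 98 of 102, which is the 96% quoted in the text.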


Figure 3: Data from 50-year-old man with high proton density fat fraction (28%) (participant with nonalcoholic fatty liver disease, denoted
participant B). Computer-reconstructed nonenhanced ultrasound B-mode image (transverse plane with time gain compensation) and underlying
radiofrequency signals. (a) B-mode image frame 1 for participant B, with yellow outline superimposed to indicate region of interest for
deep learning analysis. (b) Radiofrequency signals corresponding to blue dashed line shown in a, without and with time gain compensation.
Boundaries of the liver are not well delineated, and it is unclear whether the fixed region of interest includes signals from outside the liver.

Figure 4: Receiver operating characteristic curves with 95% confidence bands of the composite nonalcoholic fatty liver disease classification scores yielded by the classifier for the test group using radiofrequency ultrasound signals (a) without and (b) with time gain compensation as the inputs. AUC = area under receiver operating characteristic curve.

Table 2: Performance Metrics for Nonalcoholic Fatty Liver Disease Diagnosis in Test Group

Performance Metrics | Input: RF without TGC | Input: RF with TGC
Sensitivity | 97 (90, 100) [68/70] | 91 (82, 97) [64/70]
Specificity | 94 (79, 99) [30/32] | 88 (71, 96) [28/32]
PPV | 97 (90, 99) [68/70] | 94 (86, 98) [64/68]
NPV | 94 (79, 98) [30/32] | 82 (68, 91) [28/34]
Accuracy | 96 (90, 99) [98/102] | 90 (83, 95) [92/102]
Note.—Metrics were obtained by applying the predetermined threshold of 0.5 on the composite nonalcoholic fatty liver disease classification scores generated by the binary classifier. Values are expressed as percentages, with 95% confidence intervals in parentheses and fractions in square brackets. NPV = negative predictive value, PPV = positive predictive value, RF = radiofrequency, TGC = time gain compensation.

Discussion
We developed one-dimensional convolutional neural network (CNN) algorithms for nonalcoholic fatty liver disease (NAFLD) diagnosis and fat fraction estimation using ultrasound radiofrequency (RF) signals as the input and MRI proton density fat fraction (PDFF) as the reference standard. The algorithms showed promising performance in a test group of 102 participants. The classifier yielded high classification accuracy (96%) and an area under the receiver operating characteristic curve of 0.98. The fat fraction estimator predicted fat fraction values that correlated with MRI PDFF (r = 0.85; P < .001) and that were linear with MRI PDFF over a broad range of clinically relevant MRI PDFF values. However, we also observed a possible saturation effect at MRI PDFF greater than 18%, the exact cause of which is not yet well understood. A potential explanation was insufficient training data for MRI PDFF greater than 18%. Another potential explanation was that ultrasonic signals could be insensitive to fat fraction changes at high MRI PDFF values. We also demonstrated the feasibility of developing and training one-dimensional CNNs de novo using RF signals, without using techniques such as transfer learning (ie, reuse of a model pretrained on a different problem) and data augmentation (ie, artificial expansion of the input data through various transformations). We showed algorithm robustness under varying transmit focal range and time gain compensation (TGC) settings, although better performance was achieved by using signals without TGC. Other settings (eg, transmit frequency, line density) potentially critical to the algorithm performance were fixed. However, the model robustness with focal range and TGC suggested the one-dimensional CNN algorithms could be robust to more settings, possibly providing a phantom-free approach for ultrasound diagnosis using RF signals.


Figure 5: Predicted fat fraction versus MRI-derived proton density fat fraction obtained by using radiofrequency signals (a) without and
(b) with time gain compensation. Blue lines represent the linear range. Gray line represents the identity line.
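The regression summary behind a plot of this kind (slope, intercept, R², Pearson r) can be sketched as follows, together with a numerical check of the statement in the Results that R² equals the square of the Pearson correlation coefficient for a simple linear fit. All data here are synthetic and hypothetical; the slope of 1.1 and intercept of 1.3 are borrowed from the text only to shape the made-up points, and nothing below reproduces the study's measurements.

```python
import numpy as np

# Synthetic, made-up data restricted to the linear range discussed in the text.
rng = np.random.default_rng(42)
pdff = rng.uniform(1.0, 18.0, size=60)                        # reference values (%)
predicted = 1.1 * pdff + 1.3 + rng.normal(0.0, 2.0, size=60)  # predictions (%)

# Ordinary least-squares line: np.polyfit returns the highest degree first.
slope, intercept = np.polyfit(pdff, predicted, deg=1)

# Coefficient of determination from residual and total sums of squares.
fitted = slope * pdff + intercept
ss_res = np.sum((predicted - fitted) ** 2)
ss_tot = np.sum((predicted - predicted.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

# Pearson correlation coefficient between reference and prediction.
pearson_r = np.corrcoef(pdff, predicted)[0, 1]

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"R^2 = {r_squared:.3f}, r^2 = {pearson_r ** 2:.3f}")
```

The identity holds for any least-squares line with a single predictor, which is why the reported pairs (for example, r = 0.89 and R² = 0.79) agree up to rounding.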

Figure 6: Difference between predicted fat fraction (FF) and MRI-derived proton density fat fraction (PDFF) versus the MRI-derived PDFF
plots obtained by using radiofrequency signals (a) without and (b) with time gain compensation. SD = standard deviation.
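The bias and 95% limits of agreement shown in a difference plot of this kind are conventionally computed as the mean difference and the mean difference plus or minus 1.96 standard deviations. The sketch below assumes that conventional definition, which the text does not spell out; the function name `bland_altman` and the four data points are hypothetical.

```python
import numpy as np


def bland_altman(predicted, reference):
    """Return (bias, lower limit, upper limit) of agreement.

    Assumes the conventional definition: mean difference +/- 1.96 sample
    standard deviations of the differences.
    """
    diff = np.asarray(predicted, dtype=float) - np.asarray(reference, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Tiny made-up example: the differences are +1, -1, +1, -1.
predicted_ff = [2.0, 4.0, 6.0, 8.0]
reference_pdff = [1.0, 5.0, 5.0, 9.0]
bias, lower, upper = bland_altman(predicted_ff, reference_pdff)
print(f"bias = {bias:.2f}, limits of agreement = ({lower:.2f}, {upper:.2f})")
```

Under this definition the limits are symmetric about the bias, and the reported values are consistent with that: the midpoint of -7.6% and 9.1% is 0.75%, which matches the reported 0.8% bias up to rounding.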

Several studies have used deep learning with B-mode images for steatosis classification (Table 3). Byra et al (24) proposed a transfer learning approach to diagnose fatty liver disease using ultrasound B-mode images with a deep CNN pretrained with nonmedical images. They evaluated the approach in 55 patients with severe obesity, 38 of whom had fatty liver (with biopsy used as the reference standard), yielding 100% sensitivity, 88% specificity, 96% overall accuracy, and an AUC of 0.98. Reddy et al (25) used a similar transfer learning approach to diagnose fatty liver disease on 157 ultrasound liver images, with radiologists’ qualitative score used as the ground truth, yielding 95% sensitivity, 85% specificity, 91% accuracy, and an AUC of 0.96. Although our classifier achieved performances nominally similar to those of Byra et al (24) and better than those of Reddy et al (25), it is difficult to directly compare the various studies because of differences in the reference and participant samples.
Several studies quantified liver steatosis by using MRI PDFF or liver biopsy as the reference standard (Table 3). For example, a study of 153 patients (29) showed that controlled attenuation parameter was correlated with the percentage of steatosis (Spearman r = 0.47), with biopsy used as the reference standard, and a study (30) in 107 participants showed a 0.57 Spearman correlation coefficient between the controlled attenuation parameter and MRI PDFF.
Use of RF signals has several potential advantages. Not only do RF signals contain more information than B-mode images (9) or the envelope data (Appendix E3 [online]), they are also less dependent on system settings and postprocessing operations or can be corrected for these, which can reduce variability. For instance, RF signals are not influenced by the dynamic range setting and filtering operations that affect the appearance of B-mode images. Additionally, diagnostic techniques based on RF signals are potentially more suitable for devices that do not easily produce B-mode images (eg, emerging wearable ultrasound devices [31]). Although training the one-dimensional CNN algorithms takes a considerable amount of time, the trained algorithms can be run in real time to analyze new data.
Our study had several limitations. First, the ultrasound data were acquired from a single scanner platform by one physician. The cross-platform and cross-operator generalizability of the algorithms remains to be tested. Second, the RF data are not yet readily available on all commercial ultrasound systems. However, more manufacturers are starting to provide RF capabilities. Third, this study did not address whether deep learning

348 radiology.rsna.org n Radiology: Volume 295: Number 2—May 2020



Table 3: Summary of Ultrasound-based Studies on Fatty Liver Disease Diagnosis That Used Deep Learning and on Steatosis Quantification with Controlled Attenuation Parameter

| Variable | Byra et al (24) | Reddy et al (25) | Myers et al (29) | Guthrie et al (30) |
| --- | --- | --- | --- | --- |
| Task | Fatty liver disease diagnosis | Fatty liver disease diagnosis | Steatosis quantification | Steatosis quantification |
| Reference standard | Liver biopsy | Radiologists’ qualitative score | Liver biopsy | MRI PDFF |
| Sample size | 55 patients with severe obesity, 38 of whom had fatty liver disease | 157 ultrasound liver images from unknown number of participants | 153 patients | 107 participants |
| Method | Deep learning with ultrasound B-mode images | Deep learning with ultrasound B-mode images | CAP | CAP |
| Results | Sensitivity, 100%; specificity, 88%; accuracy, 96%; AUC, 0.98 | Sensitivity, 95%; specificity, 85%; accuracy, 91%; AUC, 0.96 | Spearman r = 0.47 between CAP and biopsy-determined steatosis percentage | Spearman r = 0.57 between CAP and MRI PDFF |

Note.—AUC = area under receiver operating characteristic curve, CAP = controlled attenuation parameter, PDFF = proton density fat fraction.
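
The two kinds of figures reported in Table 3, classification metrics and Spearman rank correlations, can be computed as in the sketch below. This is an illustration under stated assumptions, not code from any of the cited studies. The confusion-matrix counts for Byra et al (24) are inferred from the reported percentages (38 of 55 patients with fatty liver leaves 17 without; 88% specificity is consistent with 15 of 17), and the paired CAP and PDFF values are made up. All function names are hypothetical.

```python
from math import sqrt


def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Assumed counts, inferred from the percentages reported for Byra et al (24).
sens, spec, acc = classification_metrics(tp=38, fn=0, tn=15, fp=2)
print(f"sensitivity={sens:.0%} specificity={spec:.0%} accuracy={acc:.0%}")

# Made-up paired measurements (CAP in dB/m, PDFF in percent); not study data.
cap = [210, 250, 240, 300, 330, 280]
pdff = [3.0, 6.5, 9.0, 14.0, 22.0, 11.0]
print(f"Spearman r = {spearman_r(cap, pdff):.2f}")
```

Because Spearman r depends only on ranks, it measures how consistently CAP orders participants by steatosis rather than how well CAP predicts the numeric fat fraction.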

algorithms using RF data could outperform those using B-mode images. Finally, despite our efforts to provide methodologic details, other investigators might still have difficulty reproducing this deep learning study. To facilitate reproducibility, we have made our code available for research use.

A possible direction for future studies is to optimize the regions of interest to include only signals in the liver. The fixed region of interest used in this study was easy to implement and required no human intervention but included signals in a variable and uncontrolled manner from structures outside the liver, which probably reduced the algorithm performance. Another direction could be to assess the role of our algorithms in providing a cost-effective solution for quantifying longitudinal changes of liver fat in response to treatment.

In conclusion, one-dimensional convolutional neural network algorithms can be developed and trained de novo to accurately identify nonalcoholic fatty liver disease and quantify hepatic fat fraction using raw radiofrequency ultrasound data in the appropriate clinical context. The algorithms are robust to changes in several machine settings, including transmit focal range and time gain compensation.

Author contributions: Guarantors of integrity of entire study, A.H., W.D.O.; study concepts/study design or data acquisition or data analysis/interpretation, all authors; manuscript drafting or manuscript revision for important intellectual content, all authors; approval of final version of submitted manuscript, all authors; agrees to ensure any questions related to the work are appropriately resolved, all authors; literature research, A.H., M.P.A., R.L., C.B.S.; clinical studies, A.H., E.H., M.P.A., R.L., C.B.S.; experimental studies, A.H., M.P.A., W.D.O.; statistical analysis, A.H., M.B., W.D.O.; and manuscript editing, all authors.

Disclosures of Conflicts of Interest: A.H. Activities related to the present article: disclosed no relevant relationships. Activities not related to the present article: institution received funding from Siemens Healthineers for the prior study. Other relationships: disclosed no relevant relationships. M.B. disclosed no relevant relationships. E.H. disclosed no relevant relationships. M.P.A. disclosed no relevant relationships. J.W.E. disclosed no relevant relationships. R.L. Activities related to the present article: received funding from NIEHS (5P42ES010337), NCATS (5UL1TR001442), NIDDK (R01DK106419, P30DK120515), and DOD PRCRP (CA170674P2); received an investigator-initiated study grant from Siemens. Activities not related to the present article: disclosed no relevant relationships. Other relationships: is a consultant or advisory board member for Arrowhead Pharmaceuticals, AstraZeneca, Bird Rock Bio, Boehringer Ingelheim, Bristol-Myer Squibb, Celgene, Cirius, CohBar, Conatus, Eli Lilly, Galmed, Gemphire, Gilead, Glympse bio, GNI, GRI Bio, Intercept, Ionis, Janssen Inc., Merck, Metacrine, NGM Biopharmaceuticals, Novartis, Novo Nordisk, Pfizer, Prometheus, Sanofi, Siemens, and Viking Therapeutics; institution has received grant support from Allergan, Boehringer-Ingelheim, Bristol-Myers Squibb, Cirius, Eli Lilly, Galectin Therapeutics, Galmed Pharmaceuticals, GE, Genfit, Gilead, Intercept, Grail, Janssen, Madrigal Pharmaceuticals, Merck, NGM Biopharmaceuticals, NuSirt, Pfizer, pH Pharma, Prometheus, and Siemens; is the co-founder of Liponexus. C.B.S. Activities related to the present article: disclosed no relevant relationships. Activities not related to the present article: is on the board of the Society of Abdominal Radiology, AMRA, Guerbet, and Bristol Myers Squibb; is a consultant for GE Healthcare, Bayer, AMRA, Fulcrum Therapeutics, and IBM/Watson Health; institution received grants from Gilead, GE Healthcare, Siemens, GE MRI, Bayer, GE Digital, GE US, ACR Innovation, Philips, and Celgene; is a speaker for GE Healthcare and Bayer; institution receives royalties from Wolters Kluwer Health (UpToDate Publishing); developed educational presentations for Medscape and Resoundant; institution has lab service agreements with Enanta, ICON Medical Imaging, Gilead, Shire, Virtualscopics, Intercept, Synageva, Takeda, Genzyme, Janssen, NuSirt, Celgene-Parexel, and Organovo; has independent consulting contracts with Epigenomics and Blade Therapeutics; developed educational presentations or articles for Medscape. Other relationships: disclosed no relevant relationships. W.D.O. disclosed no relevant relationships.

References
1. Loomba R, Sanyal AJ. The global NAFLD epidemic. Nat Rev Gastroenterol Hepatol 2013;10(11):686–690.
2. Friedman SL, Neuschwander-Tetri BA, Rinella M, Sanyal AJ. Mechanisms of NAFLD development and therapeutic strategies. Nat Med 2018;24(7):908–922.
3. Machado MV, Cortez-Pinto H. Non-invasive diagnosis of non-alcoholic fatty liver disease. A critical appraisal. J Hepatol 2013;58(5):1007–1019.
4. Le TA, Chen J, Changchien C, et al. Effect of colesevelam on liver fat quantified by magnetic resonance in nonalcoholic steatohepatitis: a randomized controlled trial. Hepatology 2012;56(3):922–932.
5. Noureddin M, Lam J, Peterson MR, et al. Utility of magnetic resonance imaging versus histology for quantifying changes in liver fat in nonalcoholic fatty liver disease trials. Hepatology 2013;58(6):1930–1940.
6. Park CC, Nguyen P, Hernandez C, et al. Magnetic resonance elastography vs transient elastography in detection of fibrosis and noninvasive measurement of steatosis in patients with biopsy-proven nonalcoholic fatty liver disease. Gastroenterology 2017;152(3):598–607.e2.
7. Reeder SB, Cruite I, Hamilton G, Sirlin CB. Quantitative assessment of liver fat with magnetic resonance imaging and spectroscopy. J Magn Reson Imaging 2011;34(4):729–749.
8. Artz NS, Hines CDG, Brunner ST, et al. Quantification of hepatic steatosis with dual-energy computed tomography: comparison with tissue reference standards and quantitative magnetic resonance imaging in the ob/ob mouse. Invest Radiol 2012;47(10):603–610.
9. Oelze ML, Mamou J. Review of quantitative ultrasound: envelope statistics and backscatter coefficient imaging and contributions to diagnostic ultrasound. IEEE Trans Ultrason Ferroelectr Freq Control 2016;63(2):336–351.
10. Han A, Andre MP, Erdman JW Jr, Loomba R, Sirlin CB, O’Brien WD Jr. Repeatability and reproducibility of a clinically based QUS phantom study and methodologies. IEEE Trans Ultrason Ferroelectr Freq Control 2017;64(1):218–231.
11. Han A, Andre MP, Deiranieh L, et al. Repeatability and reproducibility of the ultrasonic attenuation coefficient and backscatter coefficient measured in the right lobe of the liver in adults with known or suspected nonalcoholic fatty liver disease. J Ultrasound Med 2018;37(8):1913–1927.
12. Han A, Labyed Y, Sy EZ, et al. Inter-sonographer reproducibility of quantitative ultrasound outcomes and shear wave speed measured in the right lobe of the liver in adults with known or suspected non-alcoholic fatty liver disease. Eur Radiol 2018;28(12):4992–5000.
13. Han A, Zhang YN, Boehringer AS, et al. Inter-platform reproducibility of ultrasonic attenuation and backscatter coefficients in assessing NAFLD. Eur Radiol 2019;29(9):4699–4708.
14. Yao LX, Zagzebski JA, Madsen EL. Backscatter coefficient measurements using a reference phantom to extract depth-dependent instrumentation factors. Ultrason Imaging 1990;12(1):58–70.
15. Andre MP, Han A, Heba E, et al. Accurate diagnosis of nonalcoholic fatty liver disease in human participants via quantitative ultrasound. 2014 IEEE International Ultrasonics Symposium, Chicago, September 3–6, 2014. Piscataway, NJ: IEEE, 2014; 2375–2377.
16. Lin SC, Heba E, Wolfson T, et al. Noninvasive diagnosis of nonalcoholic fatty liver disease and quantification of liver fat using a new quantitative ultrasound technique. Clin Gastroenterol Hepatol 2015;13(7):1337–1345.e6.
17. Paige JS, Bernstein GS, Heba E, et al. A pilot comparative study of quantitative ultrasound, conventional ultrasound, and MRI for predicting histology-determined steatosis grade in adult nonalcoholic fatty liver disease. AJR Am J Roentgenol 2017;208(5):W168–W177.
18. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521(7553):436–444.
19. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017;542(7639):115–118 [Published correction appears in Nature 2017;546(7660):686.].
20. Han S, Kang HK, Jeong JY, et al. A deep learning framework for supporting the classification of breast lesions in ultrasound images. Phys Med Biol 2017;62(19):7714–7728.
21. Xu Y, Wang Y, Yuan J, Cheng Q, Wang X, Carson PL. Medical breast ultrasound image segmentation by machine learning. Ultrasonics 2019;91:1–9.
22. Yap MH, Pons G, Martí J, et al. Automated breast ultrasound lesions detection using convolutional neural networks. IEEE J Biomed Health Inform 2018;22(4):1218–1226.
23. Byra M, Galperin M, Ojeda-Fournier H, et al. Breast mass classification in sonography with transfer learning using a deep convolutional neural network and color conversion. Med Phys 2019;46(2):746–755.
24. Byra M, Styczynski G, Szmigielski C, et al. Transfer learning with deep convolutional neural network for liver steatosis assessment in ultrasound images. Int J CARS 2018;13(12):1895–1903.
25. Reddy DS, Bharath R, Rajalakshmi P. A novel computer-aided diagnosis framework using deep learning for classification of fatty liver disease in ultrasound imaging. 2018 IEEE 20th International Conference on e-Health Networking, Applications and Services (Healthcom), Ostrava, Czech Republic, September 17–20, 2018. Piscataway, NJ: IEEE, 2018.
26. Reid JM, Wild JJ. Ultrasonic ranging for cancer diagnosis. Electronics (Basel) 1952;25(5):136–138.
27. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44(3):837–845.
28. Raunig DL, McShane LM, Pennello G, et al. Quantitative imaging biomarkers: a review of statistical methods for technical performance assessment. Stat Methods Med Res 2015;24(1):27–67.
29. Myers RP, Pollett A, Kirsch R, et al. Controlled attenuation parameter (CAP): a noninvasive method for the detection of hepatic steatosis based on transient elastography. Liver Int 2012;32(6):902–910.
30. Guthrie H, Castro N, Beysen C, Morrow L, Hompesch M. Relationship between controlled attenuation parameter (CAP) and magnetic resonance imaging-derived proton density fat fraction (MRI-PDFF) in subjects at high risk for nonalcoholic fatty liver disease (NAFLD), ProSciento, Inc. https://prosciento.com/wp-content/uploads/2019/02/NASHTAG2019-Poster_Hompesch_FINAL-CB12292018L.pdf. Accessed April 23, 2019.
31. Wang C, Li X, Hu H, et al. Monitoring of the central blood pressure waveform via a conformal ultrasonic device. Nat Biomed Eng 2018;2(9):687–695.
