
Version of Record: https://www.sciencedirect.com/science/article/pii/S0010482519300642

Radiological images and machine learning: trends, perspectives, and prospects

Zhenwei Zhang, Ervin Sejdić*

Abstract
The application of machine learning to radiological images is an increasingly active research area
that is expected to grow in the next five to ten years. Recent advances in machine learning have
the potential to recognize and classify complex patterns from different radiological imaging modalities
such as x-rays, computed tomography, magnetic resonance imaging, and positron emission tomography. In many applications, machine learning based systems have shown performance comparable to
human decision-making. The applications of machine learning are the key ingredients of future clinical
decision making and monitoring systems. This review covers the fundamental concepts behind various
machine learning techniques and their applications in several radiological imaging areas, such as medical
image segmentation, brain function studies and neurological disease diagnosis, as well as computer-
aided systems, image registration, and content-based image retrieval systems. We also briefly discuss current challenges and future directions regarding the application of machine learning in
radiological imaging. By giving insight into how to take advantage of machine learning powered applications,
we expect that clinicians will be able to prevent and diagnose diseases more accurately and efficiently.
Keywords: deep learning, machine learning, imaging modalities, deep neural network

1 Introduction
Radiology is a branch of medicine that uses imaging techniques to detect, diagnose and treat diseases [1–3].
Diagnostic radiology helps radiologists image internal body structures to diagnose the cause of symptoms,
screen for illnesses and detect the body’s response to treatments. The most common radiology modalities
include: plain X-ray, computed tomography (CT), magnetic resonance imaging (MRI), positron emission
tomography (PET), and ultrasound imaging. Fig. 1 shows internal body structures viewed via these
different imaging techniques, and Fig. 2 illustrates an example of CT and PET images. In MRI images, the
white areas represent subcutaneous fat, while in the CT images, the white areas represent the skull. However,
the main disadvantage for all x-ray and gamma ray imaging modalities is the risk of radiation exposure for
patients [4–8]. Ultrasound imaging is convenient because it does not expose patients or radiologists to
radiation, but it has poor penetration through bone or air, which makes images difficult to interpret [9, 10].
MRI and CT images can capture anatomical changes in tissues, while PET imaging detects biochemical
and physiological changes, which often occur before anatomical changes [11]. A disadvantage is that patients
with ferromagnetic orthopedic implants, materials, and devices cannot undergo MRI procedures. MRI also
has relatively long scanning times, which imposes limitations for patients in need of urgent care [12, 13].
The broader use of radiological image analysis increases the workload for radiologists, and therefore the
development of intelligent computer-aided systems for automated image analysis that can achieve faster and
more accurate results for large volumes of imaging data is essential.
This paper provides an overview of machine learning techniques used in radiological image analysis. We
begin with a brief overview of current imaging technologies. In section 2, we review general concepts of
* Zhenwei Zhang and Ervin Sejdić are with the Department of Electrical and Computer Engineering, Swanson School of
Engineering, University of Pittsburgh, Pittsburgh, PA, 15261, USA. E-mail: [email protected]. Ervin Sejdić is the corresponding
author.

© 2019 published by Elsevier. This manuscript is made available under the Elsevier user license
https://www.elsevier.com/open-access/userlicense/1.0/
machine learning and detail methods most commonly used in recent years. In section 3, we provide an
overview of the most current studies dealing with machine learning and radiological images. This review paper
mainly focuses on the most recent contributions to different machine learning techniques (i.e., after 2014),
and the reader should refer to previous review papers for older contributions related to machine learning and
biomedical imaging [14–18], or contributions that focus solely on a single machine learning approach (e.g.,
deep learning [19, 20]). Lastly, we have summarized these contributions by outlining current technological
limitations and potential future areas of research in this field.
Contributions cited in this review were collected using various research databases such as Google Scholar,
PubMed (MEDLINE), IEEE Xplore, and SpringerLink. All contributions collected were published between
the middle of 2014 and the middle of 2017. We used variations of the keywords including but not limited
to combinations of machine learning techniques (SVM, random forest, regression, neural networks, deep
learning), applications (segmentation, computer assisted system, brain studies) and imaging modalities (MRI,
x-rays, ultrasound, CT). While deep learning techniques have been prevalent in the past five years, our search
included not only these popular topics but also traditional methods. In this review, preference was
given to papers that presented real data rather than theoretical frameworks. Similarly, we did not include
papers that repeated past experiments unless the data collection or data analysis procedures were different.

Figure 1: An example of CT (a), MRI (b), and ultrasound (c) images displaying brain structures. Soft
tissue has better resolution in MRI images, and each type of MRI sequence displays a different brightness
for the same structures [21]. Ultrasound is more convenient than CT and MRI; however, it is unable to
capture some information well, as ultrasound waves do not transmit well through bone [22].

2 Machine Learning in Radiology


In recent years, machine learning algorithms have become useful tools for the analysis of medical images
in many radiology applications [15, 23]. For example, machine learning algorithms can extract the useful
information found within the details of medical images [24]. Thus, computer-aided systems based on machine
learning help radiologists to make informed decisions while interpreting these images [15].

2.1 Types of Learning


Depending on the utilization of labels in the training data, there are three categories of machine learning
algorithms: supervised learning, unsupervised learning, and semi-supervised learning. Supervised learning is
the most common form of machine learning and is widely used for classification and regression [26]. Data is
usually collected and labeled in categories, as the purpose of supervised learning is to find an appropriate
input-output function from training data that generalizes well to testing data. An objective function can be
computed to measure the error between the desired pattern and the output score; in general, many scientific
contributions focus on finding a suitable objective function with adjustable parameters. Conversely, in cases
where labeled data sets are relatively rare or difficult to acquire, unsupervised learning can derive deductions
from data without corresponding label information; its purpose is to discover the hidden structure or
distribution of the data [27]. Unsupervised learning approaches include clustering and blind signal separation
techniques such as principal component analysis and independent component analysis.

Figure 2: (A) axial view of a CT scan; (B) coronal PET image. CT images show better resolution than PET
images; however, each type of image can provide useful diagnostic information. In this case, the coronal PET
images show multiple foci of intense FDG uptake in the pelvic area, while the CT images do not demonstrate
any abnormalities [25].

Lastly, semi-supervised learning lies between supervised and unsupervised learning [28, 29]. During the
training phase, semi-supervised learning begins with a small set of labeled data and augments the training
set size by gradually labeling the unlabeled data.

2.2 Feature Selection


Feature extraction and representation is a crucial step in medical image processing. With the development
of modern medical techniques, higher resolutions and more features have become available to feed classifiers;
however, achieving an optimal solution with high-dimensional features is an obstacle for machine learning
techniques. There is significant interest in extracting and identifying reliable features from radiological
images to improve classification performance [30, 31]. Several methods exist for extracting features from
medical images, including region-based, shape-based, texture-based, and bag-of-words features [32–38].
The performance of most image retrieval systems depends on the use of these features. Table 1 summarizes
image features used in radiological image analysis. Color features are among the most essential image
features and include RGB values, histograms [39], color moments [40], and color coherence vectors. Texture
features, computed over groups of pixels, help characterize a wide range of images; the Gabor filter is the
most common method for texture extraction [35]. The scale-invariant feature transform (SIFT) and
speeded-up robust features (SURF) algorithms are two popular scale- and rotation-invariant feature detectors
and descriptors in computer vision [41]. Different types of images exhibit significant contrast variation, so
visual features such as color, shape, and texture are not sufficient on their own to classify images. High-level
features are therefore useful for overcoming the intensity variations across different image types and
extracting the appropriate information from them. Selecting ideal features that reflect the most useful
contents of images remains a challenging problem in machine learning.

2.3 Overview of Machine Learning Methods


Machine learning has been developing rapidly in recent years, and it is impossible to cover all recently-
developed techniques in one section. In this section, we will review the most commonly used machine
learning methods in radiology, such as linear models, the support vector machine, decision tree learning, the
ensemble classifier, as well as neural networks and deep learning. This section provides a general description
of machine learning techniques and will aid in understanding their applications in the field of radiology, as
described in subsequent sections.

Table 1: A summary of image features used in ML systems

Feature type | Description | Examples
Color | Invariant to differences in size and direction | Histogram [42–44]
Shape | Binary representation of images | Sphericity [44, 45]
Texture | Description of image structure: randomness, linearity, roughness, granulation, and homogeneity | Haralick's features [45, 46]; Gabor features [47–49]; Co-occurrence [50]; Curvelet-based [51, 52]; Wavelet-based [53, 54]
Local | Description of local image information using regions, objects of interest, corners, or edges | Local binary pattern [44]; Scale-invariant feature transform [55–57]; Speeded-up robust features [57, 58]
Other | Other methods to extract image features | CNN [59]

Figure 3: The basic idea of linear and non-linear classification: (a) linear case, (b) non-linear case. A
linear model uses linear functions to separate the data but is not suitable for non-linear cases; an SVM can
separate non-linearly separable data using different kernel functions.

2.3.1 Linear Models for Regression and Classification


Regression predicts a value from given input features, whereas classification assigns an input x to one of
several predefined classes [60]. The simplest linear models establish a linear relationship among the input
variables. Commonly used linear models include linear regression, Fisher's linear discriminant analysis
(LDA), and logistic regression. Given the input feature vector $\mathbf{x} = (x_1, \ldots, x_N)$, the output is

$$y(\mathbf{x}, \boldsymbol{\omega}) = \omega_0 + \sum_{i=1}^{N} \omega_i x_i.$$

Logistic regression is one of the most basic classifiers; it predicts the probability that an input x belongs to
one class (class 1) versus the probability that it belongs to the other class (class 0). The basic idea of logistic
regression is to learn a logistic function of the form:

$$P(y = 1 \mid \mathbf{x}) = \frac{1}{1 + \exp(-\boldsymbol{\omega}^{T}\mathbf{x})}$$

where $\mathbf{x}$ is the input vector and $\boldsymbol{\omega}$ is a weight vector for the input. The logistic
function is a continuous function that maps any input from negative infinity to positive infinity into an
output that always lies between zero and one [61]. Fig. 3 illustrates linearly and non-linearly separable cases
for a dataset.
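As a minimal illustration of the formula above (a sketch using scikit-learn on hypothetical toy data, not a method from any cited study), the probability produced by a fitted logistic regression model can be reproduced directly from the learned weights:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: four 2-D feature vectors with binary labels.
X = np.array([[0.2, 1.1], [0.5, 0.9], [1.8, 2.2], [2.1, 2.5]])
y = np.array([0, 0, 1, 1])

model = LogisticRegression().fit(X, y)

# P(y = 1 | x) = 1 / (1 + exp(-(w^T x + w0))), matching the formula above.
w, w0 = model.coef_[0], model.intercept_[0]
x_new = np.array([1.0, 1.5])
p_manual = 1.0 / (1.0 + np.exp(-(w @ x_new + w0)))
p_sklearn = model.predict_proba(x_new.reshape(1, -1))[0, 1]
print(p_manual, p_sklearn)  # the two probabilities agree
```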

2.3.2 Support Vector Machine


Support vector machines (SVM) are kernel-based supervised learning techniques widely used for classification
and regression [60, 62]. The basic idea of SVM is to find an optimal hyperplane for linearly separable
patterns. It attempts to maximize the geometric margin on the training set and minimize the training error.
Then, a kernel function maps the original data into a new space to handle non-linearly separable cases,
resulting in a two-class classification problem.

Figure 4: A medical example of a decision tree. Patients are classified into two classes, high risk and low
risk, using features such as the minimum systolic blood pressure over the initial 24-hour period (> 91?),
age (> 65?), and the presence of sinus tachycardia. The classification tree operates similarly to a clinician's
examination process.

Let $\mathbf{x}_i$, $i = 1, 2, \ldots, N$, be the feature vectors of the training set $X$, with corresponding
class indicators $y_i \in \{-1, +1\}$. The goal of the SVM is to construct a classifier of the form:

$$y(\mathbf{x}) = \operatorname{sign}\left[\sum_{i=1}^{N_s} \lambda_i y_i K(\mathbf{x}_i, \mathbf{x}) + \omega_0\right]$$

The function $K(\mathbf{x}_i, \mathbf{x})$ is called the kernel function, and the different mathematical
properties of kernel functions enable many pattern recognition and regression models. An SVM with a linear
kernel is computationally faster than an SVM with a quadratic kernel. SVM models using fewer but more
significant features are more likely to be robust and less prone to overfitting [63].
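To make the kernel idea concrete, here is a small sketch (scikit-learn on synthetic data of our own choosing, not from any cited study) contrasting a linear kernel with an RBF kernel on data that is not linearly separable:

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Non-linearly separable toy data: two concentric rings.
X, y = make_circles(n_samples=200, noise=0.05, factor=0.4, random_state=0)

# A linear kernel cannot separate the rings; an RBF kernel maps the data
# into a space where a separating hyperplane exists.
linear_svm = SVC(kernel="linear").fit(X, y)
rbf_svm = SVC(kernel="rbf", gamma=2.0).fit(X, y)

print("linear kernel accuracy:", linear_svm.score(X, y))  # roughly chance level
print("RBF kernel accuracy:", rbf_svm.score(X, y))        # near perfect
```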

2.3.3 Decision Tree Learning


Decision trees are one of the most popular classification approaches in machine learning [64]. A decision
tree consists of a root, leaves, and internal nodes [65–67]. The internal nodes use certain features to split the
instance space into two or more subspaces, and each leaf represents one class. A leaf may represent the most
appropriate target value or indicate the probability of the target having a specific value. Fig. 4 shows an
example of a decision tree model. Decision trees are capable of handling datasets with missing values and
errors; however, they may overfit the training data and add unnecessary features. In radiological image
analysis, decision trees are usually ensembled to form random forests for prediction and classification.
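A decision tree like the one in Fig. 4 can be sketched in a few lines. The data below is entirely hypothetical and only mirrors the figure's feature names; the printed rules show how the tree recovers threshold-style questions from data:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical triage data mirroring Fig. 4:
# columns = [min systolic BP over 24 h, age, sinus tachycardia (0/1)]
X = np.array([[85, 70, 0], [88, 50, 1], [100, 70, 1],
              [105, 60, 0], [95, 80, 0], [110, 40, 1]])
y = np.array([1, 1, 1, 0, 0, 0])  # 1 = high risk, 0 = low risk

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(tree, feature_names=["min_sys_bp", "age", "tachycardia"]))
```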

2.3.4 Ensemble Learning


Ensemble learning combines multiple classifiers and applies voting algorithms to achieve a final classification.
Popular ensemble approaches include boosting and bagging [68]. Fig. 5 shows the basic idea of ensemble
learning. In boosting, extra weight is assigned to incorrectly predicted points, and a set of weak classifiers is
applied to the data in the training phase; the outputs of the weak classifiers and the weighted inputs are
combined to produce the final prediction. In bagging, each sub-classifier is independently constructed using a
bootstrap sample of the data set, and a majority voting method is applied for the final prediction [69].
Random forests are an ensemble learning method consisting of a multitude of decision trees. In standard tree
construction, each node is split using the best split among all features; in a random forest, each node is split
using the best split among a random subset of features.

Figure 5: The concept of ensemble learning: an ensemble classifier is made up of several sub-classifiers, and
the final output combines the outputs of these sub-classifiers according to their weights.

The random forest is one of the most powerful machine learning predictors used in detection,
classification, and segmentation [70], particularly for brain [71, 72] and heart [73, 74] images.
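A minimal random forest sketch (scikit-learn, with synthetic features standing in for image-derived ones) showing the two sources of randomness described above, bootstrap sampling and per-node feature subsets:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical feature matrix standing in for image-derived features.
X, y = make_classification(n_samples=500, n_features=30, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each tree is fit on a bootstrap sample (bagging); at each node only a
# random subset of the features (max_features) is considered for the split.
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                                random_state=0).fit(X_train, y_train)
print("test accuracy:", forest.score(X_test, y_test))
```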

2.3.5 Neural Networks and Deep Learning


Deep learning techniques have become a hot topic in machine learning due to the availability of sufficient
computational power and high volumes of data. These approaches can learn features for classification and
detection directly from the data [75, 76]. Avoiding hand-designed features is the main advantage of deep
learning in comparison with other machine learning methods. Frameworks such as the restricted Boltzmann
machine [77], convolutional neural networks (CNNs) [78], and sparse autoencoders have proven to be useful
tools in many applications such as Alzheimer's disease diagnosis [79], segmentation [80], and tissue
classification [81]. CNNs have a large number of parameters and therefore require vast volumes of labeled
training data, which makes training CNNs on medical images challenging given the difficulty of acquiring
labeled databases [82]. Nevertheless, several studies have used CNNs to extract features from medical images
and achieved good classification performance [83, 84].
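For illustration only, the following PyTorch sketch defines a small CNN of the kind discussed; the architecture and layer sizes are our own assumptions, not taken from any cited study:

```python
import torch
import torch.nn as nn

# A minimal CNN for single-channel (e.g., grayscale CT slice) inputs;
# the layer sizes are illustrative only.
class SmallCNN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 32x32 -> 16x16
        )
        self.classifier = nn.Linear(32 * 16 * 16, n_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = SmallCNN()
logits = model(torch.randn(4, 1, 64, 64))  # batch of four 64x64 images
print(logits.shape)  # torch.Size([4, 2])
```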

2.4 Evaluating Machine Learning Techniques


Physicians may rely on the prediction or classification results of machine learning algorithms. However,
performing a single round of training and testing on a data set may not yield a meaningful estimate of an
algorithm's accuracy. Cross-validation reduces the variance of accuracy estimates by ensuring that each data
instance is used for both training and testing an equal number of times: the data is randomly split into k
subsets, and each subset is held out in turn while the model is trained on the rest.
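A brief sketch of k-fold cross-validation with scikit-learn (synthetic data; k = 5 is an arbitrary choice):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# 5-fold cross-validation: the data is split into k = 5 subsets, and each
# subset is held out once for testing while the model trains on the rest.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("per-fold accuracy:", scores)
print("mean +/- std:", scores.mean(), scores.std())
```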
The Dice similarity coefficient is used in segmentation tasks to measure the spatial overlap between two
segmented target regions [85]. For target regions or volumes A and B, the Dice similarity coefficient is
defined as the ratio of their intersection to their average size [86]:

$$DSC(A, B) = \frac{2|A \cap B|}{|A| + |B|}$$

The Dice similarity coefficient is 0 when there is no overlap and 1 when complete agreement is present.
Fig. 6 illustrates the Dice similarity coefficient for different degrees of overlap.
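The Dice similarity coefficient is straightforward to compute for binary masks; the following sketch (our own helper function, with hypothetical masks) follows the definition above:

```python
import numpy as np

def dice_coefficient(a: np.ndarray, b: np.ndarray) -> float:
    """Dice similarity coefficient for two binary segmentation masks."""
    a, b = a.astype(bool), b.astype(bool)
    intersection = np.logical_and(a, b).sum()
    denom = a.sum() + b.sum()
    return 2.0 * intersection / denom if denom > 0 else 1.0

# Two overlapping 4x4 masks: |A| = 4, |B| = 4, |A intersect B| = 2 -> DSC = 0.5
A = np.zeros((4, 4)); A[0:2, 0:2] = 1
B = np.zeros((4, 4)); B[1:3, 0:2] = 1
print(dice_coefficient(A, B))  # 0.5
```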
Figure 6: The Dice similarity coefficient represents spatial overlap.

Figure 7: An ROC curve consists of points obtained by evaluating a model many times with different
classification thresholds. The AUC is the area beneath the ROC curve and summarizes model performance
more compactly than the full curve.

In clinical practice, subjects with a disease are labeled as positive and healthy subjects are labeled as
negative. True positive (TP), false positive (FP), false negative (FN), and true negative (TN) are defined as
follows:
TP: a test detects the disease when the disease is present
TN: a test does not detect the disease when the disease is absent
FP: a test detects the disease when the disease is absent
FN: a test does not detect the disease when the disease is present
The goal of a computer-aided diagnosis system is to detect as many true positives as possible while
minimizing false positives and false negatives. Several popular metrics are used to assess classifier outcomes.
Sensitivity measures the ability of a test to correctly detect patients with disease, while specificity measures
the ability of a test to correctly identify healthy subjects. They can be written as:

$$\text{sensitivity} = \frac{TP}{TP + FN}, \qquad \text{specificity} = \frac{TN}{TN + FP}$$

Other popular methods used to assess models include the area under the receiver operating characteristic
(ROC) curve and the top precision value. ROC curves describe the relationship between sensitivity and
specificity. The area under the curve (AUC) measures the entire area under the ROC curve from (0,0) to
(1,1) and represents the probability that the model can distinguish between the classes. Figure 7 illustrates
the relation between the ROC curve and the AUC. The top precision, the proportion of top-ranked relevant
images retrieved before the first irrelevant database image [38, 87], is a popular evaluation metric in retrieval
systems.
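These metrics can be computed from a confusion matrix; below is a brief sketch using scikit-learn on hypothetical predictions and scores:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0])           # 1 = diseased
y_pred = np.array([1, 1, 0, 0, 0, 1, 1, 0])           # thresholded decisions
y_score = np.array([.9, .8, .4, .2, .3, .6, .7, .1])  # model probabilities

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
auc = roc_auc_score(y_true, y_score)  # area under the ROC curve
print(sensitivity, specificity, auc)
```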

3 Application of Machine Learning in Radiology


3.1 Segmentation
Image segmentation is a necessary step for effective disease diagnosis and treatment in radiology imaging
research. It helps clinicians to understand structural information and spatial anatomic relationships; however,
manual segmentation depends on the experience of clinicians and is very time-consuming [24]. Automated
methods are therefore essential for improving diagnostic analysis and for the reproducibility of large-scale
clinical studies.

3.1.1 Brain segmentation


Tree-based methods are a hot topic currently being investigated in the brain segmentation field. For example,
Yoo et al. segmented multiple sclerosis lesions in multimodal 3D MR images using unsupervised features [88]:
features were extracted from T2-weighted and proton-density MR images using a deep belief network, and
a random forest was built for the final supervised classification. To improve model performance on noisy
training data and robustness against overfitting, Maier et al. proposed an extra-trees forest to locate,
segment, and quantify sub-acute ischemic stroke lesions [89]. They used voxel-wise local features such as
intensity, weighted local mean, local histogram, and 2D center distance; however, their method can only
handle T1-weighted and diffusion-weighted sequences and high-quality images. Multimodal data from the
same patient can provide extra useful information for diagnosis. Mitra et al. therefore proposed using
features from multimodal data to segment ischemic lesions, white matter, and other secondary lesions. In
their study, the algorithm combined expectation-maximization likelihood estimation and a Bayesian Markov
random field to segment probable lesion areas from FLAIR data, then applied a random forest to the
multimodal data [90].
Neural networks and deep learning techniques are powerful tools for brain segmentation tasks. Si et al.
proposed a semi-automatic method to classify the pixels of brain MRI into lesioned and healthy tissue using
an artificial neural network with gray levels and statistical features as inputs [91]. The segmentation of
early-brain tissue is more difficult than that of adult brains due to the lower tissue contrast [92], but multiple
image modalities contain complementary information that compensates for insufficient tissue contrast [93].
Zhang et al. [94] showed that fractional anisotropy images are more potent in distinguishing gray matter
from white matter, and that T2-weighted images perform better at capturing cerebrospinal fluid; they
proposed a CNN method combining these multiple image modalities to improve segmentation performance.
Similarly, Kleesiek et al. segmented brain and non-brain tissue by feeding data into a neural network with
seven hidden convolutional layers [95]. Their model can be applied to any single image modality or to a
combination of several modalities of varying size. Deep learning methods can also automatically segment
MRI images of the human brain into many anatomical regions [96, 97]. As shown in Fig. 8, Chen et al.
extended ResNet to volumetric brain anatomical segmentation [98], integrating low-level image appearance
features, implicit shape information, and high-level context to further improve the volumetric segmentation
performance [98].

3.1.2 Other segmentation applications


Segmentation is also applied to identify and detect other structures [99], such as organs, bones, muscles,
and fractures. As in brain segmentation, tree-based methods are popular in other types of segmentation
tasks. Lombaert et al. presented kidney segmentation using the Laplacian Forest [100]. They used the
intensity within a randomly-shaped cuboid centered around several pixels during data training. The idea of
the Laplacian Forest is to use a guided bagging strategy to provide more related image information to the
tree models, which yields substantial improvements in model accuracy.

Figure 8: Chen et al. applied their model to different imaging modalities: (a)-(c) denote T1, T1-IR, and
T2-FLAIR MR images; (d) represents the ground-truth label; (e)-(g) illustrate the segmentation results
using each single image modality; (h) is the result combining all image modalities [98].

Conze et al. proposed a
semi-automatic liver tumor segmentation method combining a simple linear iterative clustering superpixel
algorithm and a random forest, which considers the inter-dependencies among voxels [101]. The multi-phase
cluster-wise features used in their approach, which account for spatial consistency, are more robust for a
random forest. Analysis of the knee also plays a vital role in clinical assessment and surgical planning. The
cartilage is typically small, and Haar-like operators are often unreliable for extracting context features from
it. To overcome these limitations, Liu et al. proposed a novel method using a multi-atlas context forest,
which segments bones first and then cartilage [102]. They trained classifiers using appearance and context
features to align the expert segmentations of the atlases in each iteration.
Medical segmentation research also utilizes regression-based models. Chen et al. proposed an automated
method to localize and segment intervertebral discs from MRI [103]. They used unified regression and
classification frameworks to estimate displacements for image points from the visual features around them
and achieved satisfactory results. Ventricle segmentation in MRI is an essential task for investigating most
cardiac disorders; the primary challenge is the considerable shape variation among different patients [104].
A cascade shape regression method has been proposed for segmenting the right ventricle in cardiac MRI,
applying gradient boosted regression to regress multidimensional right-ventricle shape landmarks from image
appearance while considering correlations between landmarks. This method minimizes the shape alignment
error over the training data and shows better segmentation performance than multi-atlas label-fusion based
segmentation methods.
Other traditional supervised methods applied to segmentation tasks include dictionary learning and Bayes
classification. Tong et al. proposed extracting voxel-intensity features for multi-organ segmentation (liver,
kidneys, pancreas, and spleen) using dictionary learning and a sparse coding technique (Fig. 9) [105]. The
atlases selected for segmenting the images profoundly influence the performance of multi-atlas-based
methods [106]; to deal with the high inter-subject variation in CT images, they applied a voxel-wise local
atlas selection strategy to improve performance. Griffis et al. proposed a supervised learning method that
automatically delineates stroke lesions using Naïve Bayes classification on single T1-weighted MRI sequence
data [107]. To save time and money, their approach uses single-scan data, detects direct lesion effects, and
performs better than manual delineation.
Image quality remains a limitation for feature extraction from radiological images. In many cases, such as
brain boundary segmentation, the data are of low contrast by nature, and both resolution and partial volume
effects influence the definition of boundaries [108]. Some research contributions focus on multiple modalities
to obtain complementary information [90, 109–111]; however, it is difficult and inconvenient to apply various
testing methods to patients. The accuracy of a segmentation system is also difficult to measure and compare
because the ground truth varies with the delineations of different experts [112], and it is challenging and
expensive to obtain manually labeled data from several experts for reliability tests [113].

Table 2: Overview of segmentation methods for different radiological images

Ref. | Image type | # images | Goal | Methods | Dice coefficient
[114] | MRI | 12 | Brain tissue | Sparse dictionary learning | 0.91 (gray matter), 0.87 (white matter)
[90] | MRI | 36 | Stroke lesion | Random forest | 0.82
[101] | CT | 42 | Liver tumor | Random forests & supervoxels | 0.93
[115] | CT | 30 | Liver tumor | CNN | 0.84
[102] | MRI | 70 | Knee | Multi-atlas context forests | 0.97 (bone), 0.81 (cartilage)
[105] | CT | 150 | Multi-organ | Discriminative dictionary learning | 0.90 (liver), 0.88 (kidney), 0.55 (pancreas), 0.92 (spleen)
[94] | MRI | 10 | Brain tissue | CNN | 0.95 (gray matter), 0.86 (white matter)
[116] | CT | 82 | Pancreas | CNN | 0.72
[107] | MRI | 30 | Stroke lesion | Gaussian Naïve Bayes classification | 0.81
[91] | MRI | 12 | Brain lesion | ANN | 0.79
[117] | MRI | 66 | Prostate | Sparse auto-encoder & sparse patch matching | 0.88
[118] | MRI | 45 | Left ventricle | CNN & stacked auto-encoder | 0.97
[95] | MRI | 53 | Brain tumor | CNN | 0.95
[119] | CT | 73 | Lung texture | Convolutional restricted Boltzmann machines | 0.74
[97] | MRI | 57 | Brain segmentation | CNN | 0.86
[120] | 4D-CT | 22 | Brain tissue | SVM | 0.79 (gray matter), 0.81 (white matter)
[121] | MRI | 65 | Brain lesion | CNN | 0.79
[122] | CT | 42 | Liver tumor | CNN | 0.97
[123] | MRI | 73 | Brain tumor | CNN | 0.65

Figure 9: Tong et al. performed discriminative dictionary learning at multiple resolutions to generate a
probabilistic atlas for each organ. The graph-cuts algorithm is implemented in native space, combining
information across resolutions to achieve the final segmentation results [105].

Table 3: A summary of recent CAD studies.
AUC = area under the curve; ROC = receiver operating characteristic; TP = true positive rate; MAE = mean absolute error

Ref. | Year | Image type | # cases | Disease | Measurement | Result | Keywords
[131] | 2014 | mammography | 956 | Breast cancer | AUC | 0.81 | Combination of classifiers
[132] | 2014 | mammography | 500 | Breast cancer | AUC | 0.91 | Naïve Bayes classification
[63] | 2014 | MRI | 81 | Cervical cancer | Accuracy | 0.69 | Texture features, SVM
[133] | 2015 | mammography | 340 | Breast cancer | AUC | 0.73 | Texture features, SVM
[134] | 2015 | mammography | 772 | Breast cancer | AUC | 0.89 | Feature selection method
[135] | 2015 | CT | 750 | Lung | AUC | 0.98 | Structured SVM
[136] | 2015 | X-ray | 5,440 | Lung | Accuracy | 0.92 | SVM
[137] | 2015 | MRI | 83 | Pediatric cardiomyopathy | Accuracy | 0.81 | Bayesian rule learning
[138] | 2016 | mammography | 736 | Breast cancer | AUC | 0.82 | CNN
[139] | 2016 | mammography | 2,604 | Breast cancer | AUC | 0.93 | Wavelet neural network
[140] | 2016 | ultrasound | 520 | Breast lesions | Accuracy | 0.82 | Stacked denoising auto-encoder
[141] | 2016 | ultrasound | 95 | Liver lesions | Accuracy | 0.87 | SVM
[142] | 2016 | CT | 104 | Vertebral body fractures | TP | 0.81 | SVM
[143] | 2016 | CT | 409 | Wrist, radius, ulna fractures | ROC | 0.89 | Random forest
[144] | 2017 | mammography | 45,000 | Breast cancer | AUC | 0.91 | CNN
[145] | 2017 | CT | 1,012 | Lung cancer | Sensitivity | 0.89 | ANN
[146] | 2017 | CT | 52 | Teeth | Accuracy | 0.89 | CNN
[147] | 2017 | CT | 344 | Prostate cancer | ROC | 0.80 | CNN
[148] | 2017 | X-ray | 1,391 | Bone age | MAE | 0.80 | CNN
[149] | 2017 | X-ray | 108,948 | Thorax diseases | Accuracy | 0.63 | CNN
[150] | 2017 | X-ray | 112,120 | Thorax diseases | AUC | 0.84 | CNN
[151] | 2017 | MRI | 107 | Brain tumor | Accuracy | 0.88 | Logistic regression

3.2 Computer Aided Diagnosis


Computer-aided diagnosis (CAD) systems can detect, mark, and assess potential pathologies for radiologists
to help improve identification accuracy in the case of data overload and human resource limitation. The
analysis, quantification, and categorization of images with these methods is an important technique, which
can improve patient safety and care. CAD systems have achieved breakthroughs in the detection of lesions
[124, 125], epidural masses [126], fractures [127], as well as degenerative diseases [128] and cancer [129].
Fisher’s linear discriminant, Bayesian methods, artificial neural networks, and SVM are widely used as
classifiers in CAD applications [13,130]. Table 3 summarizes some current CAD investigations with machine
learning techniques.
Breast cancer is one of the most common cancers in the world. Currently, about one in ten women suffer
from it, and early diagnosis and treatment of breast cancer can increase the chance of survival significantly
[152]. Among diagnostic techniques, mammography is the best approach for detecting breast cancer in its
early stages, and features indicating abnormalities can be extracted directly from the medical images
[153, 154]. The identification of benign and malignant masses is the core principle for using mammography
to diagnose breast cancer [155]. Perez et al. developed machine learning classifiers that combine suitable
feature selection methods with different machine learning techniques [131]; the feature selection methods
included chi-square discretization, information gain, one rule, relief, and a u-test based filter. They later
improved their feature selection algorithm, called uFilter, which ranks features in descending order [134].
Their method proved useful across different datasets and reduced the number of employed features without
decreasing classification accuracy.
The SVM classifier is widely used in breast cancer diagnosis with various features, such as wavelet features,
gray-level co-occurrence matrix features, intensity features, and texture features [133, 156]. Banaem et al.
proposed a fully automatic tool that classifies mammogram data into normal and abnormal cases. They used
gray-level co-occurrence and maximum-difference methods to extract proper features, and an ensemble
classifier combining SVM, KNN, and Naïve Bayes was applied to improve diagnostic accuracy [157]. Many
investigations consider not only the accuracy of the model but also its complexity. Arevalo et al. trained an
SVM model that integrated a two-layer CNN for mass lesion classification [138, 158]. Similarly, Jiao et al.
trained two SVM classifiers using deep learning features extracted from two different layers of CNN
networks [159]. An automated CAD system combining content-based image retrieval was proposed to detect
masses [132]. The main idea of this approach is to use scale-invariant feature transform features to match a
query mammogram against exemplar masses in the database, and then to use Naïve Bayes classification and
thresholded maps to detect the masses. The model complexity is low because there is no sliding-window-based
scanning.
The SVM method is also widely studied in other diagnostic tasks such as lesion, injury, and fracture
detection. In these tasks, the choice of features plays a significant role in model accuracy. Torheim et al.
predicted cervical cancer from dynamic contrast-enhanced MRI; in their study, gray-level co-occurrence
matrix based textural features were used as explanatory variables [63]. Wang et al. improved the accuracy
of lung lesion detection from CT images by using a 3D matrix pattern based SVM with latent variables.
Their study focused on detecting lung lesions with irregular shapes and low intensity, rather than nodules,
providing a new approach to lung lesion detection [135]. For the detection of thoracic and lumbar vertebral
fractures [142], Burns et al. extracted 28 features from the cortical shell in CT images based on the Denis
middle-column method, which is specific to detecting fracture discontinuities on vertebral body cortices. Jin
et al. established a prognosis model of cervical spondylotic myelopathy using a least-squares SVM [160].
They extracted values of fractional anisotropy, axial diffusivity, mean diffusivity, and radial diffusivity from
each slice of the DTI metrics as features, which yielded 88.62% prediction accuracy.
Popular methods such as deep belief networks [161] and convolutional neural networks [140, 162] have
achieved promising results in many diagnostic applications, including chest pathology identification [163],
cancer detection [164, 165], and lung diseases [166]. Neural network based methods rely heavily on big data.
A semi-supervised algorithm has been proposed to deal with large amounts of unlabeled data using CNN
approaches [167]; incorporating unlabeled data increased the overall accuracy compared with using labeled
data alone.
There are many advantages to using machine learning techniques in CAD systems. The first is accurate and
robust performance in many radiology studies; for instance, CAD systems have reached near-perfect accuracy
(over 99%) in oral cancer detection [168], which is comparable to manual diagnosis. Moreover, CAD systems
are expected to perform consistently and produce robust results on large amounts of data at any time and
place, while manual diagnosis may be affected by fatigue, reading time, and the practitioner's emotional
state. The second advantage is that a diagnosis can be produced quickly. Many radiology analyses are
time-consuming and require experienced radiologists; for example, software developed for breast cancer
prediction [169] can review charts 30 times faster than humans can. As another example, the suggested
approach in breast cancer diagnosis is the double reading of mammograms by two radiologists [131]; with
the help of a CAD system, only one radiologist is needed instead of two, which could help increase the
survival rate among women in a cost-effective manner [170].
Although we are witnessing better accuracy from computer-aided diagnosis systems on the most common
clinical problems, current contributions still have room for improvement before application in clinical
practice. First, the majority of current diagnostic contributions focus on the prediction of a single type of
disease, which may not meet clinical demands: one or more diseases may exist in one radiological image (for
example, effusion and atelectasis in one chest x-ray image). Second, current model training is mainly based
on one type of measurement, whereas most disease decisions in clinical practice rely on measurements from
multiple domains (such as patient demographics, image screening, blood tests, and drug tests); combining
information from multiple measurements may increase model accuracy. Third, current medical datasets
mainly cover common diseases. Only a limited number of rare diseases are exposed to human clinicians, and
many contributions may not consider these individual cases during model training. More comprehensive
systems that can detect various types of diseases and report rare cases are expected in the future.

3.3 Functional Brain Studies and Neurological Diseases


Brain tumors, neurological disorders such as epilepsy, and neurodegenerative diseases have attracted much
attention in brain-related investigations. In brain-related image diagnosis, a large number of features can
be extracted from brain regions related to the nature of pathological changes. Cortical thickness [171], the
volume of brain structures [172], and voxel tissue probability maps around regions of interest [173] are
popular choices for feature extraction [174]. Different MRI modalities, such as T1-weighted or fluid-attenuated
inversion recovery imaging, contain large amounts of information and noise [91]. Therefore, a compelling
feature fusion strategy is necessary for neuroimaging analysis and classification [175, 176].

3.3.1 Support vector machine in brain studies
In brain studies, the SVM is a powerful tool for feature selection, which may improve model accuracy.
Larroza et al. developed a classification model of brain metastasis and radiation necrosis in contrast-enhanced
T1-weighted images; features were extracted by texture analysis and reduced using a linear SVM [31]. Bron
proposed a feature selection method based on the SVM significance value [177]: the significance value
(p-value) quantifies the contribution of each feature to the SVM classifier and is used to reduce voxel-based
morphometry features. Neurodegenerative diseases such as Parkinson's disease begin before the onset of
symptoms, so medical treatment is more effective if the disease is detected at an early stage. Among the
various forms of Parkinsonism, progressive supranuclear palsy is one of the most difficult to identify in its
early stages [178]. Salvatore et al. proposed classifying control subjects, progressive supranuclear palsy
patients, and Parkinson's disease patients using SVM models, with features extracted by spatial
transformations and principal component analysis from T1-weighted sequences; the accuracy of
discriminating Parkinson's disease from progressive supranuclear palsy was above 90% [24]. Fig. 10 uses a
color scale to express the importance of each region during classification. To improve the diagnostic accuracy
of classifying Parkinson's disease patients, Singh proposed an unsupervised feature extraction method for
T1-weighted sequences using a Kohonen self-organizing map algorithm; with a least-squares SVM, the
accuracy of identifying the affected area in Parkinson's disease reached 99% [179]. In [174], features were
extracted using a deep network with a stacked denoising sparse autoencoder, which makes the input data
points more linearly separable for the SVM. Liu proposed an inherent structure-guided multi-view learning
method to classify Alzheimer's disease and mild cognitive impairment patients [180]. They extracted 1,500
features from gray matter density, applied multi-task feature selection to reduce the dimensionality, and
used an ensemble classification method with multiple SVM classifiers.
Beyond disease studies, some research has applied machine learning techniques to understand the brain's
functional network architecture. Smyser compared fMRI data from 50 preterm-born and 50 term-born
infants using an SVM [181]. The results show that inter- and intra-hemispheric functional connections
throughout the brain are stronger in full-term infants, a finding that might be helpful for developing models
that define indices of brain maturation.

3.3.2 Ensemble learning in brain studies


Ensemble learning methods combine multiple classifiers and are popular in Alzheimer's disease diagnosis.
Alzheimer's disease is estimated to affect around 5.4 million patients in America and is the most common
form of dementia among the elderly [128, 182]; it leads to the loss of cognitive function and eventually death.
Liu et al. proposed a classification framework that works across different image modalities for classifying
Alzheimer's disease patients [175]. Their method contains several levels of classifiers: low-level classifiers
that use different types of low-level features from patches; high-level classifiers that combine coarse-scale
imaging features in each patch with the outputs of the low-level classifiers; and a final ensemble classification
that combines the decisions of the high-level classifiers with a weighted voting strategy (Fig. 11). In [183],
high accuracy was obtained for Alzheimer's disease/healthy and mild cognitive impairment/healthy
classification. However, the accuracy in classifying mild cognitive impairment that converts to Alzheimer's
disease was much lower (57.4%), though still slightly higher than majority-class classification. Komlagan et
al. developed an ensemble learning method that uses gray matter grading for weak classifiers and selects the
most relevant sub-ensembles through sparse logistic regression [184]. They trained a global linear SVM for
the final classification. Combining high-quality biomarkers with advanced learning methods makes the
results comparable to those of multi-modality methods.

3.3.3 Other techniques in brain studies


Some researchers leverage regression and principal component analysis for classification and feature mapping.
Ahmed et al. detected neocortical structural lesions with an automated approach that computed five
surface-based MRI features and combined them in a logistic regression [185]. To deal with class imbalance,
they used a bagging approach and an iteratively-reweighted least squares algorithm; the base-level classifier
was trained on all the minority-class instances together with an equally sized random sample of
majority-class instances.
Figure 10: Salvatore et al. [24] proposed a supervised learning method to identify PD and PSP using MR
images. The figures show maps of the voxel-based pattern distribution of brain structural differences; the
color scale expresses the importance of each voxel in the SVM classification.


Figure 11: Flow chart of the hierarchical classification algorithm proposed in [175]. Low-level classifiers
transform imaging and spatial-correlation features from local patches, and their outputs are integrated into
high-level classifiers together with coarse-scale imaging features. The final classification is achieved by
ensembling the outputs of the high-level classifiers.

Hong proposed a machine learning technique combining surface-based analysis in patients with a subtype
of focal cortical dysplasia [186]. The automated approach used features of focal cortical dysplasia morphology
and intensity, and Fisher's linear discriminant was applied as the classifier to identify focal cortical dysplasia
in patients. Huang proposed the use of a soft-split random forest to predict clinical scores in Alzheimer's
disease patients [187]; in their method, lasso regression is applied to map MRI features, which are then
reduced by principal component analysis. Li combined principal component analysis, the lasso method, and
a deep learning framework to extract features by fusing information from MRI and PET images for the
classification of Alzheimer's disease/mild cognitive impairment patients [183]. Zhu et al. focused on
identifying Alzheimer's disease patients using multi-view or visual features of image data and proposed
several feature selection approaches for Alzheimer's disease classification. In 2014, they integrated subspace
learning into a sparse least-squares regression framework for multi-class classification [188]. They then
mapped histogram-of-oriented-gradients features (which are diverse) onto region-of-interest features (which
are robust to noise), providing complementary information and enhancing disease-status identification
accuracy [189]. Other machine learning techniques, such as convolutional neural networks, are widely
investigated in this field as well. Table 4 summarizes recent contributions related to Alzheimer's disease
classification.

Table 4: Recent studies on Alzheimer's disease
NC: normal control; AD: Alzheimer's disease; pMCI: progressive mild cognitive impairment; sMCI: stable mild cognitive impairment

Ref. | Year | Database | # images | Image types | Classification groups (accuracy) | Keywords
[111] | 2014 | ADNI | 834 | MRI | AD vs. NC (89%); pMCI vs. sMCI (70%) | Multiple instance learning
[188] | 2014 | ADNI | 202 | MRI+PET | AD vs. MCI vs. NC (73.35%); AD vs. pMCI vs. sMCI vs. NC (61.06%) | Sparse discriminative feature selection
[190] | 2014 | ADNI | 1,071 | MRI | AD vs. NC (89%); pMCI vs. sMCI (73%) | Manifold and transfer learning
[184] | 2014 | ADNI | 814 | MRI | pMCI vs. sMCI (75.6%) | Gray matter grading, weak-classifier fusion
[180] | 2015 | ADNI | 459 | MRI | AD vs. NC (93.83%); pMCI vs. sMCI (80.9%); pMCI vs. NC (89.09%) | Hierarchical fusion of features
[191] | 2015 | ADNI | 202 | PET+MRI | pMCI vs. sMCI (78.7%) | Multimodal multi-label transfer learning
[189] | 2015 | ADNI | 830 | MRI | AD vs. NC (91.31%); MCI vs. NC (78.07%); pMCI vs. sMCI (75.54%) | HoG mapping
[47] | 2016 | OASIS | 416 | MRI | AD vs. NC (80.76%) | Gabor filter
[192] | 2016 | ADNI | 416 | MRI | AD vs. NC vs. MCI (89.1%) | CNN
[193] | 2016 | Self-collected | 67 | MRI | AD vs. NC (96.77%) | SVM
[194] | 2016 | Self-collected | 89 | fMRI | AD vs. NC (97.50%); MCI vs. AD (87.30%); MCI vs. NC (72.00%) | SVM
[195] | 2016 | Dartmouth College | 116 | MRI | AD vs. NC (97.14%) | Feature ranking selection
[196] | 2016 | Self-collected | 43 | fMRI | AD vs. NC (96.85%) | Deep learning selection
[197] | 2017 | Self-collected | 250 | DTI | AD vs. NC (89.60%) | Elastic net selection

3.4 Image Retrieval


With the increased use of modern medical diagnostic techniques, large numbers of medical images are stored
in hospital archives, and manual annotation and attribution of these images is impractical [14]. Picture
archiving and communication systems have been widely introduced in many hospitals [198]. These systems
can retrieve images based on keywords; however, such images may not be directly useful for making clinical
decisions. Unlike traditional image search systems, which match keywords and image tags, content-based
image retrieval extracts rich content from images and searches for other images with similar content.
Content-based image retrieval is becoming necessary for medical image databases and may become an
effective source of anatomical and functional information for diagnostic, educational, and research
purposes [199]. Table 5 lists current investigations on image retrieval.
Recently, similarity or distance learning has become a hot topic in the image retrieval field. Traditional
choices include the Euclidean distance, the χ² distance, the Mahalanobis distance, the l1-norm distance [38],
the maximum likelihood approach [200], and the Bayes ensemble [201]. As in other machine learning tasks,
feature extraction is an important step in image retrieval systems. Kurtz et al. proposed using a hierarchical
semantic-based distance to retrieve images based on 72 manually annotated semantic terms from each region
of interest [202]. They then built a semantic framework that learns image descriptions of each term using
Riesz wavelets and an SVM. In [203], local wavelet patterns were introduced as a new feature descriptor.
Their experiments exploited the relationships among neighboring pixels and performed well in CT image
retrieval; their results are shown in Fig. 12. Unlike traditional similarity learning, which only maximizes the
margin, Meng et al. proposed a novel similarity learning algorithm that incorporates the top precision
performance measure into the loss function [38]. Their method showed advantages over other traditional
similarity learning methods.
Other supervised techniques applied in retrieval systems include online dictionary learning, ensemble
learning, and principal component analysis. The main advantage of an online dictionary learning system is
its computational time: learned dictionaries represent the dataset in a sparse model, which is a valuable
tool for representing data [204]. A method using online dictionary learning, with features extracted by
multi-scale wavelet packet decomposition from different types of images, was proposed in [205]. Srinivas
et al. proposed a medical image classification approach using online dictionary learning with edge- and
patch-based features to distinguish 18 categories [206]. Ahn et al. developed a robust method to improve
X-ray image classification [207]: a fusion strategy that combines domain-transferred convolutional neural
networks and sparse spatial pyramid classification, which performs better than either method alone.
Figure 12: Retrieving images using local wavelet pattern features and similarity measurement. All retrieved
images are from the same category, achieving 100% precision in this example [203]: (a) query image; (b) top
10 retrieved images.

Table 5: A summary of recent image retrieval research using machine learning techniques

Ref. | Year | Image types | # images | Results | Keywords
[209] | 2014 | CT | 72 | AUC: 0.93 | Riesz wavelets, hierarchical semantic-based distance
[208] | 2015 | MRI | 30 | Accuracy: 0.88 | Partial least squares discriminant analysis, principal component analysis
[203] | 2015 | CT | EXACT09: 40; TCIA: 604 | Precision: 0.88 | Local wavelet pattern
[210] | 2015 | Multimodality | ImageCLEF: 10,000 | MAP: 0.29 | Deep Boltzmann machine
[211] | 2015 | MRI | OASIS: 421 | Precision: 0.48 | Local binary patterns, gray-level co-occurrence matrices
[206] | 2016 | X-ray & CT | ImageCLEF: 5,400 | Accuracy: 0.98 | Sparse representation, online dictionary learning
[38] | 2016 | Multimodality | Indoor: 15,620; Caltech256: 30,670; Corel5000: 5,000; ImageCLEF: 2,785 | Top precision: 0.36 | Support top irrelevant machine
[212] | 2017 | CT | EXACT09: 675; TCIA: 604 | Precision: 0.96 | Gabor and Schmid filters

Faria et al. proposed a retrieval method for brain MRI images. They captured anatomical features from
T1-weighted images using least-squares discriminant analysis and principal component analysis and
performed a search for images between healthy controls and patients with primary progressive aphasia [208].
As large portions of the medical images in a dataset lack labels and annotations, semi-supervised and
unsupervised techniques are needed in retrieval systems. An unsupervised image retrieval approach based
on K-SVD clustering iterates between grouping similar images into clusters and generating a dictionary for
each cluster until the clusters converge [42]. The advantage of this method is that it requires no training
data for classification and is not restricted to a specific context. Since labeled data is limited, Herrera
proposed a semi-supervised learning method for image classification that uses k-nearest neighbors to expand
the training data set and a random forest for the final classification [113].
Medical image retrieval gives clinicians an opportunity to search for similar disease cases. Accuracy and
retrieval time are both vital aspects of a medical image retrieval system, and a practical model with relevant
image feature extraction is required to obtain better results. Furthermore, some image retrieval contributions
have mainly investigated small datasets and limited disease cases. With the number of digital radiological
images in hospital databases increasing every year, whether these systems can retrieve disease cases stably
and efficiently from huge datasets remains an open question for researchers.

3.5 Image Prediction


With the development of neuroimaging techniques, various new image modalities have been applied in daily
clinical practice to make diagnosis and treatment more efficient and accurate. Image prediction methods,
which combine various image modalities and provide information for diagnosis, are therefore fundamental.
The main idea of image prediction is to estimate radiological images in different modalities or at higher
resolution, which can provide detailed functional information for assessment and diagnosis. PET is a
molecular imaging technique widely used in clinical cancer diagnosis; it produces 3D images that reflect
tissue metabolic activity in the human body [213]. However, the quality of PET images is proportional to
the injected dose and the imaging time, so low-dose PET images can suffer in quality. A great deal of effort
has therefore been made to predict high-quality PET images. Kang proposed a regression forest based
approach to predict standard-dose PET images from low-dose PET and multimodal MRI images [214].
Figure 13: Deep auto-context convolutional neural networks were proposed for estimating standard-dose
PET (SPET) images from low-dose PET (LPET) and T1 images [215]. SPET images were estimated using
LPET images alone and in combination with T1 images; the networks performed better when both image
modalities were included.

used a regression forest as their non-linear prediction model, with features drawn from local intensity patches of
the MRI data and the low-dose PET. Meanwhile, Wang et al. used a mapping-based sparse representation approach for
prediction [11]. They used a graph-based distribution mapping method to reduce the patch distribution
differences between MRI and low-dose PET and constructed a patch-selection-based dictionary learning method
to predict standard-dose PET. Both methods performed better than a patch-based sparse
model. In [215], Xiang et al. used convolutional neural networks to estimate standard-dose PET images from
low-dose PET/MR images (Fig. 13). By using neural networks, they could map the inputs to the
output directly, without any pre- or post-processing beyond the optimization in the training stage. Huynh et al.
predicted CT images from MRI data using a structured random forest instead of a classical random forest [70].
A structured random forest is an extension of a random forest that predicts structured outputs instead
of scalar outputs [216, 217]. Combining information obtained from multiple sources
improves prediction accuracy.
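As a rough illustration of the patch-wise prediction idea behind [214], the following sketch trains a random forest regressor that maps concatenated low-dose PET and MRI patches to the corresponding standard-dose PET voxel intensity. The patch size, the sampling of voxel centers, and the function names are assumptions made for illustration; the original works used more elaborate features, and [70] used a structured variant.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def extract_patches(volume, centers, radius=2):
    """Flatten the cubic patch of side 2*radius+1 around each voxel center.
    Assumes all centers lie at least `radius` voxels from the volume border."""
    r = radius
    return np.array([volume[i-r:i+r+1, j-r:j+r+1, k-r:k+r+1].ravel()
                     for i, j, k in centers])

def train_spet_predictor(lpet, mri, spet, centers):
    """Features: concatenated low-dose PET and MRI patches.
    Targets: the standard-dose PET intensity at each patch center."""
    X = np.hstack([extract_patches(lpet, centers),
                   extract_patches(mri, centers)])
    y = np.array([spet[i, j, k] for i, j, k in centers])
    return RandomForestRegressor(n_estimators=100, n_jobs=-1).fit(X, y)
```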
Compared to classification and segmentation tasks, contributions involving radiological image pre-
diction are still quite limited. The rapid development of hybrid imaging scanners (PET-CT, SPECT-CT,
PET-MRI) has provided integrated images for diagnostic purposes. The primary challenge remains finding
an appropriate way to match the correspondences among different image modalities [218]. Besides, current
contributions mainly focus on brain data, but we expect to see more non-brain-related contributions
in this area soon.

4 Current Challenges
Until now, machine learning has been used to help radiologists in diagnostic tasks, but it still cannot
substitute for the clinician's role. There are some limitations regarding the application of machine learning
in clinical practice [219]. One such limitation is that the majority of these studies in radiology are based on
supervised learning: the algorithms learn specific patterns based on previous decisions made by radiologists.
The algorithm is expected to reach 100% agreement with the human clinician; however, in
many cases, an accurate diagnosis was reached only by several radiologists after multiple diagnostic tests. Whether
a machine can perform well on its own still needs to be investigated in the future.
Facilitating data collection and sharing is a crucial point for further investigation in many studies.
Clinically applied algorithms depend on two critical factors: robustness on large datasets and the accuracy
achieved [220]. In many machine learning related studies, the size of the chosen dataset is relatively
small [89, 94, 103, 107, 208]. This is due to limited access to patients or limited diagnostic work in the
research setting. For example, in some clinical practices, there is no pathological exam done as a follow-up
procedure in cases of brain metastasis [31], which limits the acquisition of data. The application of machine
learning to a limited number of cases (only around twenty patients were studied in some cases) is not always

persuasive. Whether these clinical tools are robust enough to analyze immense amounts of medical data
accurately remains a question. While 20-50 images were sufficient in past research, hundreds or more
image sets will be required in the future to meet increasing requirements concerning robustness and accuracy.
The creation of large databases and sharing centers, such as ADNI for Alzheimer's disease patients, the NIH
repositories for chest X-rays, and TCIA for cancer imaging, helps to effectively collect millions of images
for research. Furthermore, current studies trained and evaluated their models on various datasets,
which makes it challenging to compare their algorithms. A systematic evaluation standard based on various
diseases and various public datasets is required in medical applications.
Beyond the data size, data quality and feature selection remain highly important for effective machine
learning techniques. Medical images contain rich features that are clinically important, and it is challenging to
capture the visual appearance of disease with low-level image features. At the same time, high-dimensional features
can be redundant in many tasks, such as image retrieval and classification. The choice of different
high-level descriptions as input features is a prominent research topic. Choosing informative features for
training can lead to robust models, whereas overfitting, underfitting, and misclassification usually occur
when features are not selected well. More work remains to be done on selecting and utilizing proper features
from images.
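As one minimal, filter-style illustration of such feature selection, the sketch below ranks high-dimensional image descriptors by their mutual information with the diagnostic label and keeps only the top k before classification. The pipeline, the value of k, and the SVM classifier are assumptions for illustration, not a prescription from the cited studies.

```python
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Keep the k descriptors most informative about the diagnostic label,
# then classify with an RBF-kernel SVM.
model = make_pipeline(
    SelectKBest(score_func=mutual_info_classif, k=100),
    SVC(kernel="rbf"),
)
# model.fit(X_train, y_train)  # X_train: n_samples x n_features image descriptors
```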
Transfer learning is a well-accepted strategy for learning from limited datasets. There are three
main transfer learning research directions in the field of radiological imaging. The first is to reduce bias
among images acquired with different equipment and under different protocols [123, 221]. The second approach is to
learn various abnormalities from the same data source [222, 223]. The last approach is to find a good feature
representation in other domains and then apply it to the radiological imaging field [224]. Transfer
learning allows us to deal with various scenarios by leveraging already existing information from a
related task or domain. For more details on transfer learning techniques and radiological imaging, readers
should refer to [225].
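The third direction, reusing a representation learned in another domain, is commonly implemented by fine-tuning a network pretrained on natural images; a minimal PyTorch sketch follows. The ResNet-18 backbone, the three-class head, and the hyperparameters are illustrative assumptions rather than the setup of any cited study.

```python
import torch
import torch.nn as nn
from torchvision import models

# Reuse ImageNet features: freeze the pretrained backbone and train only a
# new classification head for a (hypothetical) three-class radiological task.
model = models.resnet18(pretrained=True)
for param in model.parameters():
    param.requires_grad = False              # keep the generic features fixed
model.fc = nn.Linear(model.fc.in_features, 3)  # new trainable head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
# The usual training loop over radiological images then updates model.fc only.
```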
Imbalanced data are standard in medical diagnosis, where the majority of the data are normal and only
a minority class is abnormal [226]. For example, brain tumors are uncommon, occurring in only about 1‰
of the population, yet they remain among the most fatal forms of cancer [16]. This data imbalance might
affect prediction accuracy and cause a bias toward the majority class [227]. Several researchers considered
the imbalanced situation in their models [38, 94, 185]; however, the majority of studies either have not properly
addressed this issue [94, 228] or simply use the same amount of data from each class. How to utilize
imbalanced data to improve the accuracy of machine learning algorithms remains an open question in this
field.
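One common remedy, illustrated in the minimal sketch below, is to weight training errors inversely to class frequency and to evaluate with stratified folds and AUC rather than raw accuracy; the choice of classifier and fold count are assumptions for illustration.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

def evaluate_on_imbalanced(X, y):
    """Weight the rare (abnormal) class more heavily instead of discarding
    majority-class data; report AUC, which is less misleading than accuracy
    when one class dominates."""
    clf = RandomForestClassifier(n_estimators=200, class_weight="balanced")
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    return cross_val_score(clf, X, y, cv=cv, scoring="roc_auc")
```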
Large amounts of radiological images are produced in hospitals every year. However, most of these
images are not utilized for further training of machine learning algorithms, as the training process is
constrained by available resources. Useful information is hidden in this mass of data, and diagnostic machine learning
models could be improved by using these streaming data. Online learning, which updates a model while data
are streaming in, is an idea currently being developed in recommender systems and other machine learning
based systems in other fields. This idea can also be transferred to medical diagnosis systems to make
full use of streaming image datasets.
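A minimal sketch of this idea using scikit-learn's incremental (partial_fit) API is given below; the feature stream generator and the binary label set are hypothetical placeholders.

```python
from sklearn.linear_model import SGDClassifier

# Update a diagnostic model batch by batch as images stream in, instead of
# retraining from scratch; assumes precomputed feature vectors per image.
model = SGDClassifier(loss="log_loss")  # logistic regression fit by SGD
first_batch = True
for X_batch, y_batch in image_feature_stream():  # hypothetical generator
    if first_batch:
        model.partial_fit(X_batch, y_batch, classes=[0, 1])
        first_batch = False
    else:
        model.partial_fit(X_batch, y_batch)
```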
Researchers have generated ever more powerful and innovative diagnosis models for radiological imaging
[229]. However, very few of them are commercialized and deployed in the market. The main challenge is to
comply with government requirements in various countries [230]. Current FDA protocols suggest that medical
products should pass clinical trials and be produced, commercialized, and used in a defined, unchanging
form. If a machine learning model is used for medical diagnosis, a pre-built and frozen model must be tested
in different clinical environments, assessed for various real-life medical conditions, and carefully evaluated
on how these conditions affect the accuracy of the diagnosis. A recent study showed that a pre-trained
model demonstrated significantly lower performance on external data obtained from another hospital
system [231]. In order to reduce bias and improve performance, current machine learning solutions in
nonmedical fields typically update the parameters of the models every time new data are included. However,
this is not realistic for medical products, as the system must pass a new clinical trial after each update.
Therefore, this remains a major issue for machine learning algorithms in medical applications.

Knowing how deep neural networks work remains an open question, and this prompts clinicians and patients
to distrust these models. Due to the huge number of parameters in the models, it is difficult to interpret
how the models make diagnostic decisions between input and output. This could potentially be fatal if a
machine learning model leads to a wrong conclusion [232], as medical experts cannot verify these models.
Deep learning researchers have recently computed heat maps using class activation maps in order to provide a more
concrete analysis of how these models perform [233]. An activation map visually highlights the discriminative
regions in a medical image that the model used to identify the category. However, related research on
network explainability and visualization is still limited in the field of radiological imaging.
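A minimal sketch of class activation mapping in this spirit follows; it assumes a CNN that ends in global average pooling followed by a linear classifier, from which the final convolutional feature maps and classifier weights are taken.

```python
import torch
import torch.nn.functional as F

def class_activation_map(features, fc_weights, class_idx):
    """Weight the final convolutional feature maps (C x H x W) by the
    classifier weights of the target class and sum over channels, producing
    a heat map of the regions that drove the prediction (as in [233])."""
    cam = torch.einsum("chw,c->hw", features, fc_weights[class_idx])
    cam = F.relu(cam)                 # keep positively contributing regions
    cam = cam - cam.min()
    cam = cam / (cam.max() + 1e-8)    # normalize to [0, 1] for overlaying
    return cam
```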

5 Conclusion and Future Work


In this paper, we reviewed five applications of machine learning techniques to radiological images: image
segmentation, computer-aided detection and diagnosis, functional brain studies and neurological disease
diagnosis, image classification and retrieval, and image registration. While machine learning techniques are
active in computer-aided systems that assist radiologists in daily diagnosis and studies, the use of machine
learning techniques in radiology is still evolving. There are many strategies that this field could investigate
in the future:
- Previous contributions have shown that machine learning based systems can produce results com-
parable to those of radiologists themselves. However, the accuracy of these systems must
still be improved; that is, systems must become more accurate than radiologists. Otherwise, the
widespread application of machine learning techniques will be limited. A possible approach to achieve
this superior performance is to design better machine learning models or to gather more representative
data that can be continuously used to improve the algorithms.
- Although the core advancement of deep learning is its ability to learn useful features directly from
data, its accuracy and performance are highly limited by the size of the available data. Traditional machine learning
methods still play a role when only small amounts of labeled data are available. Understanding how to choose
and use features from images effectively remains a significant direction for these traditional methods.

- Another issue concerns the translation of these techniques into clinical practice. While many machine
learning algorithms have already shown good results, they still need to pass the clinical trials required by
governments. Additionally, many people still place more faith in human decisions, as clinicians consciously
tend to decide with all the relevant information in mind. These attitudes make it difficult to justify
the use of algorithms for clinical decision making in all possible cases, but through rigorous research
contributions, we can justify the use of machine learning algorithms in the cases where patient outcomes
can be improved.
The application and development of machine learning techniques for radiological images is currently a hot
topic, and a large number of algorithms are being developed to ensure higher accuracy and lower computa-
tional complexity. We expect that machine learning techniques will become essential components of clinical
tools and will be widely used to assess patients' health in the future.

Acknowledgment
Research reported in this publication was supported by the Eunice Kennedy Shriver National Institute of
Child Health & Human Development of the National Institutes of Health under Award Number R01HD092239.
The content is solely the responsibility of the authors and does not necessarily represent the official views of
the National Institutes of Health.

Conflicts of Interest
None declared.

References
[1] R. A. Novelline and L. F. Squire, Squire’s fundamentals of radiology. La Editorial, UPR, 2004.

[2] M. Chen, T. Pope, and D. Ott, Basic radiology. McGraw Hill Professional, 2010.
[3] W. Herring, Learning radiology: Recognizing the basics. Elsevier Health Sciences, 2015.
[4] S. J. Swensen, J. R. Jett, T. E. Hartman, D. E. Midthun, S. J. Mandrekar, S. L. Hillman, A.-M.
Sykes, G. L. Aughenbaugh, A. O. Bungum, and K. L. Allen, “CT screening for lung cancer:
five-year prospective experience,” Radiology, vol. 235, no. 1, pp. 259–265, 2005.
[5] V. R. Iyer and S. I. Lee, “MRI, CT, and PET/CT for ovarian cancer detection and adnexal lesion
characterization,” American Journal of Roentgenology, vol. 194, no. 2, pp. 311–321, 2010.
[6] M. S. Pearce, J. A. Salotti, M. P. Little, K. McHugh, C. Lee, K. P. Kim, N. L. Howe, C. M. Ronckers,
P. Rajaraman, A. W. Craft, L. Parker, and A. B. De González, “Radiation exposure from CT scans
in childhood and subsequent risk of leukaemia and brain tumours: A retrospective cohort study,” The
Lancet, vol. 380, no. 9840, pp. 499–505, 2012.
[7] R. Smith-Bindman, J. Lipson, R. Marcus, K. P. Kim, M. Mahesh, R. Gould, A. B. de Gonzalez, and
D. L. Miglioretti, “Radiation dose associated with common computed tomography examinations and
the associated lifetime attributable risk of cancer,” Archives of Internal Medicine, vol. 169, no. 22, pp.
2078–2086, 2009.
[8] D. P. Frush, L. F. Donnelly, and N. S. Rosen, “Computed tomography and radiation risks: what
pediatric health care providers should know.” Pediatrics, vol. 112, no. 4, pp. 951–957, 2003.
[9] Y. L. Huang, D. R. Chen, and Y. K. Liu, “Breast cancer diagnosis using image retrieval for different
ultrasonic systems,” in International Conference on Image Processing, 2004, pp. 2957–2960.

[10] J. Shan, “A fully automatic segmentation method for breast ultrasound images,” Ph.D. dissertation,
Utah State University, 2011.
[11] Y. Wang, P. Zhang, L. An, G. Ma, J. Kang, F. Shi, X. Wu, J. Zhou, D. S. Lalush, W. Lin, and
D. Shen, “Predicting standard-dose PET image from low-dose PET and multimodal MR images using
mapping-based sparse representation,” Physics in Medicine and Biology, vol. 61, no. 2, p. 791, 2016.
[12] M. Sundaram and M. H. Mcguire, “Computed tomography or magnetic resonance for evaluating the
solitary tumor or tumor-like lesion of bone?” Skeletal Radiology, vol. 17, no. 6, pp. 393–401, 1988.
[13] J. Yao, J. E. Burns, and R. M. Summers, “Computer aided detection of bone metastases in the
thoracolumbar spine,” in Spinal Imaging and Image Analysis, 2015, pp. 97–130.

[14] H. S. J. Ibrahim and A. Mukhtar, “Content based image retrieval in mammograms: a survey,” Inter-
national Journal of Engineering Science, vol. 4638, 2016.
[15] S. Wang and R. M. Summers, “Machine learning and radiology,” Medical Image Analysis, vol. 16,
no. 5, pp. 933–951, 2013.

[16] S. Bauer, R. Wiest, L.-P. Nolte, and M. Reyes, “A survey of MRI-based medical image analysis for
brain tumor studies,” Physics in Medicine and Biology, vol. 58, no. 13, pp. R97–R129, 2013.

[17] D. Garcı́a-Lorenzo, S. Francis, S. Narayanan, D. L. Arnold, and D. L. Collins, “Review of automatic
segmentation methods of multiple sclerosis white matter lesions on conventional magnetic resonance
imaging,” Medical Image Analysis, vol. 17, no. 1, pp. 1–18, 2013.
[18] K. Kourou, T. P. Exarchos, K. P. Exarchos, M. V. Karamouzis, and D. I. Fotiadis, “Machine learning
applications in cancer prognosis and prediction,” Computational and Structural Biotechnology Journal,
vol. 13, pp. 8–17, 2015.
[19] G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoorian, J. A. van der Laak,
B. van Ginneken, and C. I. Sánchez, “A survey on deep learning in medical image analysis,” arXiv
preprint arXiv:1702.05747, 2017.

[20] D. Shen, G. Wu, and H.-I. Suk, “Deep learning in medical image analysis,” Annual Review of Biomedical
Engineering, no. 0, 2017.
[21] University of Wisconsin School of Medicine and Public Health. (2016) Neuroradiology learning
module.
[22] C. Bailey, T. A. Huisman, R. M. de Jong, and M. Hwang, “Contrast-enhanced ultrasound and elastog-
raphy imaging of the neonatal brain: A review,” Journal of Neuroimaging, vol. 27, no. 5, pp. 437–441,
2017.
[23] B. J. Erickson, P. Korfiatis, Z. Akkus, and T. L. Kline, “Machine learning for medical imaging,”
Radiographics, vol. 37, no. 2, pp. 505–515, 2017.

[24] C. Salvatore, A. Cerasa, I. Castiglioni, F. Gallivanone, A. Augimeri, M. Lopez, G. Arabia, M. Morelli,
M. C. Gilardi, and A. Quattrone, “Machine learning on brain MRI data for differential diagnosis of
Parkinson’s disease and Progressive Supranuclear Palsy,” Journal of Neuroscience Methods, vol. 222,
pp. 230–237, 2014.
[25] D. W. Townsend, T. Beyer, and T. M. Blodgett, “Pet/ct scanners: a hardware approach to image
fusion,” in Seminars in nuclear medicine, vol. 33, no. 3. Elsevier, 2003, pp. 193–204.
[26] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015.
[27] P. Dayan, “Unsupervised learning,” The MIT Encyclopedia of the Cognitive Sciences, pp. 1–7, 2009.
[28] T. Mitchell and A. Blum, “Combining labeled and unlabeled data with co-training,” in 11th Annual
Conference on Computational Learning Theory, 1998, pp. 92–100.
[29] X. Zhu, “Semi-supervised learning,” in Encyclopedia of Machine Learning, 2011, pp. 892–897.
[30] S. T. Chao, M. S. Ahluwalia, G. H. Barnett, G. H. J. Stevens, E. S. Murphy, A. L. Stockham,
K. Shiue, and J. H. Suh, “Challenges with the diagnosis and treatment of cerebral radiation necrosis,”
International Journal of Radiation Oncology Biology Physics, vol. 87, no. 3, pp. 449–457, 2013.

[31] A. Larroza, D. Moratal, A. Paredes-Sánchez, E. Soria-Olivas, M. L. Chust, L. A. Arribas, and E. Arana,
“Support vector machine classification of brain metastasis and radiation necrosis based on texture
analysis in MRI,” Journal of Magnetic Resonance Imaging, vol. 42, no. 5, pp. 1362–1368, 2015.
[32] C. F. Tsai, “Image mining by spectral features: A case study of scenery image classification,” Expert
Systems with Applications, vol. 32, no. 1, pp. 135–142, 2007.
[33] M. M. Islam, D. Zhang, and G. Lu, “A geometric method to compute directionality features for texture
images,” in IEEE International Conference on Multimedia and Expo, no. 3, 2008, pp. 1521–1524.

[34] N. C. Yang, W. H. Chang, C. M. Kuo, and T. H. Li, “A fast MPEG-7 dominant color extraction with
new similarity measure for image retrieval,” Journal of Visual Communication and Image Represen-
tation, vol. 19, no. 2, pp. 92–105, 2008.
[35] D. P. Tian, “A review on image feature extraction and representation techniques,” International Jour-
nal of Multimedia and Ubiquitous Engineering, vol. 8, no. 4, pp. 385–395, 2013.
[36] W. Xie, Y. Li, and Y. Ma, “Breast mass classification in digital mammography based on extreme
learning machine,” Neurocomputing, vol. 173, pp. 930–941, 2016.
[37] R. Rastghalam and H. Pourghassem, “Breast cancer detection using MRF-based probable texture
feature and decision-level fusion-based classification using HMM on thermography images,” Pattern
Recognition, vol. 51, pp. 176–186, 2016.
[38] J. Meng, Y. Jiang, X. Xu, and I. Priananda, “Support top irrelevant machine: learning similarity
measures to maximize top precision for image retrieval,” Neural Computing and Applications, pp.
1–10, 2016.
[39] J. Yue, Z. Li, L. Liu, and Z. Fu, “Content-based image retrieval using color and texture fused features,”
Mathematical and Computer Modelling, vol. 54, no. 3, pp. 1121–1127, 2011.
[40] G. Pass and R. Zabih, “Histogram refinement for content-based image retrieval,” in IEEE Workshop
on Applications of Computer Vision, 1996, pp. 96–102.
[41] L. Juan and O. Gwun, “A comparison of SIFT, PCA-SIFT and SURF,” International Journal of Image
Processing, vol. 3, no. 4, pp. 143–152, 2009.
[42] M. Srinivas, R. R. Naidu, C. S. Sastry, and C. K. Mohan, “Content based medical image retrieval
using dictionary learning,” Neurocomputing, vol. 168, pp. 880–895, 2015.
[43] R. R. Gundreddy, M. Tan, Y. Qiu, S. Cheng, H. Liu, and B. Zheng, “Assessment of performance and
reproducibility of applying a content-based image retrieval scheme for classification of breast lesions,”
Medical physics, vol. 42, no. 7, pp. 4241–4249, 2015.
[44] Y. Xu, L. Lin, H. Hu, H. Yu, C. Jin, J. Wang, X. Han, and Y.-W. Chen,
“Combined density, texture and shape features of multi-phase contrast-enhanced CT images for CBIR
of focal liver lesions: a preliminary study,” in Innovation in Medicine and Healthcare 2015. Springer,
2016, pp. 215–224.
[45] A. K. Dhara, S. Mukhopadhyay, A. Dutta, M. Garg, and N. Khandelwal, “A combination of shape
and texture features for classification of pulmonary nodules in lung CT images,” Journal of Digital
Imaging, vol. 29, no. 4, pp. 466–475, 2016.
[46] L. P. Suresh, S. S. Dash, and B. K. Panigrahi, “Artificial intelligence and evolutionary algorithms in
engineering systems,” Advances in Intelligent Systems and Computing, vol. 324, pp. 109–117, 2015.
[47] P. Keserwani, V. S. C. Pammi, O. Prakash, A. Khare, and M. Jeon, “Classification of Alzheimer
disease using gabor texture feature of hippocampus region,” International Journal of Image, Graphics
and Signal Processing, vol. 8, no. 6, pp. 13–20, 2016.
[48] X. Zhu, X. He, P. Wang, Q. He, D. Gao, J. Cheng, and B. Wu, “A method of localization and
segmentation of intervertebral discs in spine MRI based on Gabor filter bank,” BioMedical Engineering
OnLine, vol. 15, no. 1, p. 32, 2016.
[49] W. L. Lee, K. Chang, and K. S. Hsieh, “Unsupervised segmentation of lung fields in chest radio-
graphs using multiresolution fractal feature vector and deformable models,” Medical and Biological
Engineering and Computing, vol. 54, no. 9, pp. 1409–1422, 2016.

[50] S. Murala and Q. M. Jonathan Wu, “Local ternary co-occurrence patterns: A new feature descriptor
for MRI and CT image retrieval,” Neurocomputing, vol. 119, pp. 399–412, 2013.
[51] S. Dhahbi, W. Barhoumi, and E. Zagrouba, “Breast cancer diagnosis in digitized mammograms using
curvelet moments,” Computers in Biology and Medicine, vol. 64, pp. 79–90, 2015.

[52] G. Sethi and B. S. Saini, “Abdomen disease diagnosis in CT images using flexiscale curvelet transform
and improved genetic algorithm,” Australasian Physical & Engineering Sciences in Medicine, vol. 38,
no. 4, pp. 671–688, 2015.
[53] D. C. Pereira, R. P. Ramos, and M. Z. do Nascimento, “Segmentation and detection of breast cancer
in mammograms combining wavelet analysis and genetic algorithm,” Computer Methods and Programs
in Biomedicine, vol. 114, no. 1, pp. 88–101, 2014.
[54] H. Madero Orozco, O. O. Vergara Villegas, V. G. Cruz Sánchez, H. D. J. Ochoa Domı́nguez, and
M. D. J. Nandayapa Alfaro, “Automated system for lung nodules classification based on wavelet
feature descriptor and support vector machine.” BioMedical Engineering OnLine, vol. 14, no. 1, p. 9,
2015.

[55] J. Arias, J. Martı́nez-Gómez, J. A. Gámez, A. G. Seco de Herrera, and H. Müller, “Medical image
modality classification using discrete Bayesian networks,” Computer Vision and Image Understanding,
vol. 151, pp. 61–71, 2016.
[56] D.-H. Lee, D.-W. Lee, and B.-S. Han, “Possibility study of scale invariant feature transform (SIFT)
algorithm application to spine magnetic resonance imaging,” Plos One, vol. 11, no. 4, p. e0153043,
2016.
[57] M. Alkhawlani and M. Elmogy, “Content-based image retrieval using local features descriptors and
bag-of-visual words,” International Journal of Advanced Computer Science and Applications, vol. 6,
no. 9, pp. 212–219, 2015.

[58] K. Velmurugan and S. S. Baboo, “Content-based image retrieval using SURF and colour moments,”
Global Journal of Computer Science and Technology, vol. 11, no. 10, pp. 1–4, 2011.
[59] M. Srinivas, D. Roy, and C. K. Mohan, “Discriminative feature extraction of X-ray images using deep
convolutional neural networks,” in IEEE International Conference on Acoustics, Speech and Signal
Processing (ICASSP), 2016, pp. 917–921.
[60] C. M. Bishop, Pattern Recognition and Machine Learning. Springer, 2006.

[61] K. Deng, “OMEGA : On-line memory-based general purpose system classifier,” Ph.D. dissertation,
Carnegie Mellon University, 1998.
[62] J. A. K. Suykens and J. Vandewalle, “Least squares support vector machine classifiers,” Neural Pro-
cessing Letters, vol. 9, no. 3, pp. 293–300, 1999.

[63] T. Torheim, E. Malinen, K. Kvaal, H. Lyng, U. G. Indahl, E. K. F. Andersen, and C. M. Futsaether,
“Classification of dynamic contrast enhanced MR images of cervical cancers using texture analysis and
support vector machines,” IEEE Transactions on Medical Imaging, vol. 33, no. 8, pp. 1648–1656, 2014.
[64] W.-Y. Loh, “Fifty years of classification and regression trees,” International Statistical Review, vol. 82,
no. 3, pp. 329–348, 2014.

[65] L. Rokach and O. Maimon, “Classification Trees,” in Data Mining and Knowledge Discovery Handbook,
2010, pp. 149–174.
[66] A. T. Azar and S. M. El-Metwally, “Decision tree classifiers for automated medical diagnosis,” Neural
Computing and Applications, vol. 23, no. 7-8, pp. 2387–2403, 2013.

[67] N. Speybroeck, “Classification and regression trees,” International Journal of Public Health, vol. 57,
no. 1, pp. 243–246, 2012.
[68] E. Bauer and R. Kohavi, “An empirical comparison of voting classification algorithms: bagging,
boosting, and variants,” Machine Learning, vol. 36, no. 1–2, pp. 105–139, 1999.

[69] A. Liaw and M. Wiener, “Classification and regression by randomForest,” R News, vol. 2, no. 3,
pp. 18–22, 2002.
[70] T. Huynh, Y. Gao, J. Kang, L. Wang, P. Zhang, D. Shen, and the Alzheimer’s Disease Neuroimaging
Initiative, “Multi-source information gain for random forest: an application to CT image prediction
from MRI data,” in International Workshop on Machine Learning in Medical Imaging, 2015, pp. 321–
329.
[71] D. Zikic, B. Glocker, E. Konukoglu, A. Criminisi, C. Demiralp, J. Shotton, O. M. Thomas, T. Das,
R. Jena, and S. J. Price, “Decision forests for tissue-specific segmentation of high-grade gliomas in
multi-channel MR,” Medical Image Computing and Computer-Assisted Intervention, vol. 15, no. Pt 3,
pp. 369–76, 2012.
[72] E. Geremia, O. Clatz, B. H. Menze, E. Konukoglu, A. Criminisi, and N. Ayache, “Spatial decision
forests for MS lesion segmentation in multi-channel magnetic resonance images,” NeuroImage, vol. 57,
no. 2, pp. 378–390, 2011.
[73] R. Sammouda, R. M. Jomaa, and H. Mathkour, “Heart region extraction and segmentation from chest
CT images using Hopfield Artificial Neural Networks,” in International Conference on Information
Technology and e-Services, 2012, pp. 3–8.
[74] V. Lempitsky, M. Verhoek, J. A. Noble, and A. Blake,
“Random forest classification for automatic delineation of myocardium in real-time 3D echocardiogra-
phy,” in International Conference on Functional Imaging and Modeling of the Heart, 2009, pp. 447–456.

[75] H.-C. Shin, H. R. Roth, M. Gao, L. Lu, Z. Xu, I. Nogues, J. Yao, D. Mollura, and R. M. Summers, “Deep
convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics
and transfer learning.” IEEE Transactions on Medical Imaging, vol. PP, no. 99, p. 1, 2016.
[76] Y. Song, L. Zhang, S. Chen, D. Ni, B. Lei, and T. Wang, “Accurate segmentation of cervical cytoplasm
and nuclei based on multiscale convolutional network and graph partitioning,” IEEE Transactions on
Biomedical Engineering, vol. 62, no. 10, pp. 2421–2433, 2015.
[77] R. Salakhutdinov and G. E. Hinton, “Deep boltzmann machines,” in 12th International Conference on
Artificial Intelligence and Statics, no. 3, 2009, pp. 448–455.
[78] Y. LeCun, K. Kavukcuoglu, and C. Farabet, “Convolutional networks and applications in vision,”
in IEEE International Symposium on Circuits and Systems: Nano-Bio Circuit Fabrics and Systems,
2010, pp. 253–256.
[79] H.-I. Suk and D. Shen, “Deep learning-based feature representation for AD/MCI classification,” in
International Conference on Medical Image Computing and Computer-Assisted Intervention, 2013, pp.
583–590.

[80] A. Prasoon, K. Petersen, C. Igel, F. Lauze, E. Dam, and M. Nielsen, “Deep feature learning for knee
cartilage segmentation using a triplanar convolutional neural network,” in International Conference
on Medical Image Computing and Computer-Assisted Intervention, 2013, pp. 246–253.

[81] A. A. Cruz-Roa, J. E. Arevalo Ovalle, A. Madabhushi, and F. A. González Osorio, “A deep learn-
ing architecture for image representation, visual interpretability and automated basal-cell carcinoma
cancer detection,” in International Conference on Medical Image Computing and Computer-Assisted
Intervention, 2013, pp. 403–410.
[82] J. Shiraishi, L. L. Pesce, C. E. Metz, and K. Doi, “Experimental design and data analysis in receiver
operating characteristic studies: lessons learned from reports in radiology from 1997 to 2006,” Radiology,
vol. 253, no. 3, 2009.
[83] B. Van Ginneken, A. A. A. Setio, C. Jacobs, and F. Ciompi, “Off-the-shelf convolutional neural network
features for pulmonary nodule detection in computed tomography scans,” in 12th IEEE International
Symposium on Biomedical Imaging, 2015, pp. 286–289.
[84] S. Choi, “X-ray image body part clustering using deep convolutional neural network,” ImageCLEF
2015 Medical Clustering Task, pp. 6–8, 2015.
[85] K. H. Zou, S. K. Warfield, A. Bharatha, C. M. C. Tempany, M. R. Kaus, S. J. Haker, W. M. Wells,
F. A. Jolesz, and R. Kikinis, “Statistical validation of image segmentation Quality Based on a Spatial
Overlap Index,” Academic Radiology, vol. 11, no. 2, pp. 178–189, 2004.
[86] V. K. Reed, W. A. Woodward, L. Zhang, E. A. Strom, G. H. Perkins, W. Tereffe, J. L. Oh, T. K. Yu,
I. Bedrosian, G. J. Whitman, T. A. Buchholz, and L. Dong, “Automatic segmentation of whole breast
using atlas approach and deformable image registration,” International Journal of Radiation Oncology
Biology Physics, vol. 73, no. 5, pp. 1493–1500, 2009.
[87] N. Li, R. Jin, and Z. Zhou, “Top rank optimization in linear time,” Advances in Neural Information
Processing Systems, pp. 1–9, 2014.
[88] Y. Yoo, T. Brosch, A. Traboulsee, D. Li, and R. Tam, “Deep learning of image features from unlabeled
data for multiple sclerosis lesion segmentation,” in International Workshop on Machine Learning in
Medical Imaging, 2014, pp. 117–124.
[89] O. Maier, M. Wilms, J. von der Gablentz, U. M. Krämer, T. F. Münte, and H. Handels, “Extra tree
forests for sub-acute ischemic stroke lesion segmentation in MR sequences,” Journal of Neuroscience
Methods, vol. 240, pp. 89–100, 2015.
[90] J. Mitra, P. Bourgeat, J. Fripp, S. Ghose, S. Rose, O. Salvado, A. Connelly, B. Campbell, S. Palmer,
G. Sharma, S. Christensen, and L. Carey, “Lesion segmentation from multimodal MRI using random
forest following ischemic stroke,” NeuroImage, vol. 98, pp. 324–335, 2014.
[91] T. Si, A. De, and A. Kumar, “Artificial neural network based lesion segmentation of brain MRI,”
Communications on Applied Electronics, vol. 4, no. 5, pp. 1–5, 2016.
[92] G. Li, L. Wang, F. Shi, W. Lin, and D. Shen, “Multi-atlas based simultaneous labeling of longitudinal
dynamic cortical surfaces in infants,” in International Conference on Medical Image Computing and
Computer-Assisted Intervention, 2013, pp. 58–65.
[93] L. Wang, F. Shi, Y. Gao, G. Li, J. H. Gilmore, W. Lin, and D. Shen, “Integration of sparse multi-
modality representation and anatomical constraint for isointense infant brain MR image segmentation,”
NeuroImage, vol. 89, pp. 152–164, 2014.
[94] W. Zhang, R. Li, H. Deng, L. Wang, W. Lin, S. Ji, and D. Shen, “Deep convolutional neural networks
for multi-modality isointense infant brain image segmentation,” NeuroImage, vol. 108, pp. 214–224,
2015.
[95] J. Kleesiek, G. Urban, A. Hubert, D. Schwarz, K. Maier-Hein, M. Bendszus, and A. Biller, “Deep MRI
brain extraction: a 3D convolutional neural network for skull stripping,” NeuroImage, vol. 129, pp.
460–469, 2016.

[96] A. de Brebisson and G. Montana, “Deep neural networks for anatomical brain segmentation,” in
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2015,
pp. 20–28.
[97] P. Moeskops, M. A. Viergever, A. M. Mendrik, L. S. de Vries, M. J. Benders, and I. Išgum, “Automatic
segmentation of mr brain images with a convolutional neural network,” IEEE transactions on medical
imaging, vol. 35, no. 5, pp. 1252–1261, 2016.
[98] H. Chen, Q. Dou, L. Yu, J. Qin, and P.-A. Heng, “Voxresnet: Deep voxelwise residual networks for
brain segmentation from 3d mr images,” NeuroImage, 2017.
[99] C. Lindner, S. Thiagarajah, J. M. Wilkinson, T. Consortium, G. A. Wallis, and T. F. Cootes, “Fully
automatic segmentation of the proximal femur using random forest regression voting,” Medical Image
Analysis, vol. 32, no. 8, pp. 1462–1472, 2013.
[100] H. Lombaert, D. Zikic, A. Criminisi, and N. Ayache, “Laplacian forests: Semantic image segmentation
by guided bagging,” in International Conference on Medical Image Computing and Computer-Assisted
Intervention, 2014, pp. 496–504.

[101] S. Roy, A. Carass, J. L. Prince, and D. L. Pham, “Semi-automatic liver tumor segmentation in dynamic
contrast-enhanced CT scans using random forests and supervoxels,” in International Workshop on
Machine Learning in Medical Imaging, 2015, pp. 212–219.
[102] Q. Liu, Q. Wang, L. Zhang, Y. Gao, and D. Shen, “Multi-atlas context forests for knee MR image
segmentation,” in International Workshop on Machine Learning in Medical Imaging, 2015, pp. 186–
193.
[103] C. Chen, D. Belavy, and G. Zheng, “3D intervertebral disc localization and segmentation from MR
images by data-driven regression and classification,” in International Workshop on Machine Learning
in Medical Imaging. Springer, 2014, pp. 50–58.

[104] S. Sedai, P. Roy, and R. Garnavi, “Segmentation of right ventricle in cardiac MR images using shape
regression,” in International Workshop on Machine Learning in Medical Imaging, 2015, pp. 1–8.
[105] T. Tong, R. Wolz, Z. Wang, Q. Gao, K. Misawa, M. Fujiwara, K. Mori, J. V. Hajnal, and D. Rueckert,
“Discriminative dictionary learning for abdominal multi-organ segmentation,” Medical Image Analysis,
vol. 23, no. 1, pp. 92–104, 2015.

[106] P. Aljabar, R. A. Heckemann, A. Hammers, J. V. Hajnal, and D. Rueckert, “Multi-atlas based seg-
mentation of brain images: atlas selection and its effect on accuracy,” NeuroImage, vol. 46, no. 3, pp.
726–738, 2009.
[107] J. C. Griffis, J. B. Allendorfer, and J. P. Szaflarski, “Voxel-based Gaussian naïve Bayes classification of
ischemic stroke lesions in individual T1-weighted MRI scans,” Journal of Neuroscience Methods, vol.
257, pp. 97–108, 2016.
[108] Y. Wang, J. Nie, P. T. Yap, G. Li, F. Shi, X. Geng, L. Guo, and D. Shen, “Knowledge-guided
robust MRI brain extraction for diverse large-scale neuroimaging studies on humans and non-human
primates,” PLoS ONE, vol. 9, no. 1, pp. 1–23, 2014.
[109] Y. Jin, Y. Shi, L. Zhan, B. A. Gutman, G. I. de Zubicaray, K. L. McMahon, M. J. Wright, A. W.
Toga, and P. M. Thompson, “Automatic clustering of white matter fibers in brain diffusion MRI with
an application to genetics,” NeuroImage, vol. 100, pp. 75–90, 2014.
[110] D. Zhang and D. Shen, “Multi-modal multi-task learning for joint prediction of multiple regression
and classification variables in Alzheimer’s disease,” NeuroImage, vol. 59, no. 2, pp. 895–907, 2012.

[111] T. Tong, R. Wolz, Q. Gao, R. Guerrero, J. V. Hajnal, and D. Rueckert, “Multiple instance learning
for classification of dementia in brain MRI,” Medical Image Analysis, vol. 18, no. 5, pp. 808–818, 2014.
[112] S. F. Eskildsen, P. Coupé, V. Fonov, J. V. Manjón, K. K. Leung, N. Guizard, S. N. Wassef, L. R.
Østergaard, and D. L. Collins, “BEaST: brain extraction based on nonlocal segmentation technique,”
NeuroImage, vol. 59, no. 3, pp. 2362–2373, 2012.

[113] A. G. Seco de Herrera, D. Markonis, R. Joyseeree, R. Schaer, and A. Foncubierta-Rodríguez,
“Semi-supervised learning for image modality classification,” in Multimodal Retrieval in the Medical
Domain, 2015, pp. 85–98.
[114] S. Roy, A. Carass, J. L. Prince, and D. L. Pham, “Subject specific sparse dictionary learning for atlas
based brain MRI segmentation,” in International Workshop on Machine Learning in Medical Imaging,
2014, pp. 248–255.
[115] W. Li, F. Jia, and Q. Hu, “Automatic segmentation of liver tumor in ct images with deep convolutional
neural networks,” Journal of Computer and Communications, vol. 3, no. 11, p. 146, 2015.
[116] H. R. Roth, L. Lu, A. Farag, H.-C. Shin, J. Liu, E. B. Turkbey, and R. M. Summers, “Deeporgan: Multi-
level deep convolutional networks for automated pancreas segmentation,” in International Conference
on Medical Image Computing and Computer-Assisted Intervention. Springer, 2015, pp. 556–564.
[117] Y. Guo, Y. Gao, and D. Shen, “Deformable mr prostate segmentation via deep feature learning and
sparse patch matching,” IEEE transactions on medical imaging, vol. 35, no. 4, pp. 1077–1089, 2016.

[118] M. Avendi, A. Kheradvar, and H. Jafarkhani, “A combined deep-learning and deformable-model ap-
proach to fully automatic segmentation of the left ventricle in cardiac mri,” Medical image analysis,
vol. 30, pp. 108–119, 2016.
[119] G. van Tulder and M. de Bruijne, “Combining generative and discriminative representation learning
for lung ct analysis with convolutional restricted boltzmann machines,” IEEE transactions on medical
imaging, vol. 35, no. 5, pp. 1262–1272, 2016.
[120] R. Manniesing, M. T. H. Oei, L. J. Oostveen, J. Melendez, E. J. Smit, B. Platel, C. I. Sánchez,
F. J. A. Meijer et al., “White matter and gray matter segmentation in 4D computed tomogra-
phy,” Scientific Reports, vol. 7, p. 1, 2017.
[121] M. Havaei, A. Davy, D. Warde-Farley, A. Biard, A. Courville, Y. Bengio, C. Pal, P.-M. Jodoin, and
H. Larochelle, “Brain tumor segmentation with deep neural networks,” Medical image analysis, vol. 35,
pp. 18–31, 2017.
[122] P. Hu, F. Wu, J. Peng, P. Liang, and D. Kong, “Automatic 3d liver segmentation based on deep
learning and globally optimized surface evolution,” Physics in Medicine and Biology, vol. 61, no. 24,
p. 8676, 2016.

[123] D. Paredes, A. Saha, and M. A. Mazurowski, “Deep learning for segmentation of brain tumors: can we
train with images from different institutions?” in Medical Imaging 2017: Computer-Aided Diagnosis,
vol. 10134. International Society for Optics and Photonics, 2017, p. 101341P.
[124] S. D. O’Connor, J. Yao, and R. M. Summers, “Lytic metastases in thoracolumbar spine: computer-
aided detection at CT–preliminary study.” Radiology, vol. 242, no. 3, pp. 811–816, 2007.

[125] J. Yao, H. Munoz, J. E. Burns, and L. Lu, “Computer aided detection of spinal degenerative osteophytes
on sodium fluoride PET/CT,” Computational Methods and Clinical Applications for Spine Imaging,
pp. 51–60, 2014.

[126] J. Liu, S. Pattanaik, J. Yao, E. Turkbey, W. Zhang, X. Zhang, and R. M. Summers, “Computer aided
detection of epidural masses on computed tomography scans,” Computerized Medical Imaging and
Graphics, vol. 38, no. 7, pp. 606–612, 2014.
[127] J. Yao, J. E. Burns, H. Munoz, and R. M. Summers, “Detection of vertebral body fractures based on
cortical shell unwrapping,” in International Conference on Medical Image Computing and Computer-
Assisted Intervention, vol. 15, no. 3, 2012, pp. 509–516.
[128] K.-H. Thung, C.-Y. Wee, P.-T. Yap, D. Shen, and the Alzheimer’s Disease Neuroimaging Initiative,
“Neurodegenerative disease diagnosis using incomplete multi-modality data via matrix shrinkage and
completion,” NeuroImage, vol. 91, pp. 386–400, 2014.

[129] C. D. Lehman, R. D. Wellman, D. S. M. Buist, K. Kerlikowske, A. N. A. Tosteson, and D. L. Miglioretti,
“Diagnostic accuracy of digital screening mammography with and without computer-aided detection,”
JAMA Internal Medicine, vol. 175, no. 11, pp. 1–10, 2015.
[130] J. Yao, A. Dwyer, R. M. Summers, and D. J. Mollura, “Computer-aided diagnosis of pulmonary infec-
tions using texture analysis and support vector machine classification,” Academic Radiology, vol. 18,
no. 3, pp. 306–314, 2011.
[131] N. Pérez, M. A. Guevara, A. Silva, I. Ramos, and J. Loureiro, “Improving the performance of machine
learning classifiers for breast cancer diagnosis based on feature selection,” in Federated Conference on
Computer Science and Information Systems, vol. 2, 2014, pp. 209–217.
[132] M. Jiang, S. Zhang, and D. Metaxas, “Detection of mammographic masses by content-based image
retrieval,” in International Workshop on Machine Learning in Medical Imaging, 2014, pp. 33–41.
[133] W. Sun, T.-L. B. Tseng, W. Qian, J. Zhang, E. C. Saltzstein, B. Zheng, F. Lure, H. Yu, and S. Zhou,
“Using multiscale texture and density features for near-term breast cancer risk analysis.” Medical
Physics, vol. 42, no. 6, pp. 2853–2862, 2015.

[134] N. P. Pérez, M. A. Guevara López, A. Silva, and I. Ramos, “Improving the Mann-Whitney statisti-
cal test for feature selection: An approach in breast cancer diagnosis on mammography,” Artificial
Intelligence in Medicine, vol. 63, no. 1, pp. 19–31, 2015.
[135] Q. Wang, W. Zhu, and B. Wang, “Three-dimensional SVM with latent variable: application for detec-
tion of lung lesions in CT images.” Journal of Medical Systems, vol. 39, no. 1, p. 171, 2015.

[136] S. Antani, “Automated detection of lung diseases in chest X-Rays,” US National Library of Medicine,
2015.
[137] V. Gopalakrishnan, P. G. Menon, and S. Madan, “cmri-bed: A novel informatics framework for cardiac
mri biomarker extraction and discovery applied to pediatric cardiomyopathy classification,” Biomedical
engineering online, vol. 14, no. 2, p. S7, 2015.

[138] J. Arevalo, F. A. González, R. Ramos-Pollán, J. L. Oliveira, and M. A. Guevara Lopez, “Representation
learning for mammography mass lesion classification with convolutional neural networks,” Computer
Methods and Programs in Biomedicine, vol. 127, pp. 248–257, 2016.
[139] S. P. Singh and S. Urooj, “An improved CAD system for breast cancer diagnosis based on generalized
pseudo-zernike moment and Ada-DEWNN classifier,” Journal of Medical Systems, vol. 40, no. 4, pp.
1–13, 2016.
[140] J.-Z. Cheng, D. Ni, Y.-H. Chou, J. Qin, C.-M. Tiu, Y.-C. Chang, C.-S. Huang, D. Shen, and C.-M.
Chen, “Computer-aided diagnosis with deep learning architecture: applications to breast lesions in us
images and pulmonary nodules in ct scans,” Scientific reports, vol. 6, 2016.

[141] A. Rani, D. Mittal et al., “Detection and classification of focal liver lesions using support vector
machine classifiers,” Journal of Biomedical Engineering and Medical Imaging, vol. 3, no. 1, p. 21, 2016.
[142] J. E. Burns, J. Yao, H. Muñoz, and R. M. Summers, “Automated detection, localization, and classifi-
cation of traumatic vertebral body fractures in the thoracic and lumbar spine at CT,” Radiology, vol.
278, no. 1, pp. 64–73, 2016.

[143] R. Ebsim, J. Naqvi, and T. Cootes, “Detection of wrist fractures in x-ray images,” in Workshop on
Clinical Image-Based Procedures. Springer, 2016, pp. 1–8.
[144] T. Kooi, G. Litjens, B. van Ginneken, A. Gubern-Mérida, C. I. Sánchez, R. Mann, A. den Heeten, and
N. Karssemeijer, “Large scale deep learning for computer aided detection of mammographic lesions,”
Medical Image Analysis, vol. 35, pp. 303–312, 2017.
[145] X. Liu, F. Hou, H. Qin, and A. Hao, “A cade system for nodule detection in thoracic ct images based
on artificial neural network,” Science China Information Sciences, vol. 60, no. 7, p. 072106, 2017.
[146] Y. Miki, C. Muramatsu, T. Hayashi, X. Zhou, T. Hara, A. Katsumata, and H. Fujita, “Classification of
teeth in cone-beam ct using deep convolutional neural network,” Computers in Biology and Medicine,
vol. 80, pp. 24–29, 2017.
[147] A. Mehrtash, A. Sedghi, M. Ghafoorian, M. Taghipour, C. M. Tempany, T. Kapur, P. Mousavi,
P. Abolmaesumi, and A. Fedorov, “Classification of clinical significance of MRI prostate findings using
3D convolutional neural networks,” in SPIE Medical Imaging. International Society for Optics and
Photonics, 2017, p. 101342A.

[148] C. Spampinato, S. Palazzo, D. Giordano, M. Aldinucci, and R. Leonardi, “Deep learning for automated
skeletal bone age assessment in x-ray images,” Medical Image Analysis, vol. 36, pp. 41–51, 2017.
[149] X. Wang, Y. Peng, L. Lu, Z. Lu, M. Bagheri, and R. M. Summers, “Chestx-ray8: Hospital-scale chest
x-ray database and benchmarks on weakly-supervised classification and localization of common thorax
diseases,” in Computer Vision and Pattern Recognition (CVPR), 2017 IEEE Conference on. IEEE,
2017, pp. 3462–3471.
[150] P. Rajpurkar, J. Irvin, K. Zhu, B. Yang, H. Mehta, T. Duan, D. Ding, A. Bagul, C. Langlotz, K. Sh-
panskaya et al., “Chexnet: Radiologist-level pneumonia detection on chest x-rays with deep learning,”
arXiv preprint arXiv:1711.05225, 2017.

[151] K. L.-C. Hsieh, C.-M. Lo, and C.-J. Hsiao, “Computer-aided grading of gliomas based on local and
global mri features,” Computer methods and programs in biomedicine, vol. 139, pp. 31–38, 2017.
[152] C. H. Lee, D. D. Dershaw, D. Kopans, P. Evans, B. Monsees, D. Monticciolo, R. J. Brenner, L. Bassett,
W. Berg, S. Feig, E. Hendrick, E. Mendelson, C. D’Orsi, E. Sickles, and L. W. Burhenne, “Breast cancer
screening with imaging: recommendations from the society of breast imaging and the ACR on the use
of mammography, breast MRI, breast ultrasound, and other technologies for the detection of clinically
occult breast cancer,” Journal of the American College of Radiology, vol. 7, no. 1, pp. 18–27, 2010.
[153] R. Nithya and B. Santhi, “Classification of normal and abnormal patterns in digital mammograms for
diagnosis of breast cancer,” International Journal of Computer Applications, vol. 28, no. 6, pp. 21–25,
2011.

[154] S.-T. Luo and B.-W. Cheng, “Diagnosing breast masses in digital mammography using feature selection
and ensemble methods,” Journal of Medical Systems, vol. 36, no. 2, pp. 569–577, 2012.
[155] Y. Jiang, R. M. Nishikawa, R. A. Schmidt, C. E. Metz, M. L. Giger, and K. Doi, “Improving breast
cancer diagnosis with computer-aided diagnosis,” Academic Radiology, vol. 6, no. 1, pp. 22–33, 1999.

[156] W. Sun, B. Zheng, F. Lure, T. Wu, J. Zhang, B. Y. Wang, E. C. Saltzstein, and W. Qian, “Prediction of
near-term risk of developing breast cancer using computerized features from bilateral mammograms,”
Computerized Medical Imaging and Graphics, vol. 38, no. 5, pp. 348–357, 2014.
[157] H. Y. Banaem, A. M. Dehnavi, and M. Shahnazi, “Ensemble supervised classification method using
the regions of interest and grey level co-occurrence matrices features for mammograms Data,” Iranian
Journal of Radiology, vol. 12, no. 3, 2015.
[158] J. Arevalo, F. A. González, R. Ramos-Pollán, J. L. Oliveira, and M. A. Guevara Lopez, “Convolutional
neural networks for mammography mass lesion classification,” in 37th Annual International Conference
of the IEEE Engineering in Medicine and Biology Society, 2015, pp. 797–800.

[159] Z. Jiao, X. Gao, Y. Wang, and J. Li, “A deep feature based framework for breast masses classification,”
Neurocomputing, vol. 197, pp. 1–11, 2016.
[160] R. Jin, K. D. Luk, J. Cheung, and Y. Hu, “A machine learning based prognostic prediction of cervical
myelopathy using diffusion tensor imaging,” in Computational Intelligence and Virtual Environments
for Measurement Systems and Applications (CIVEMSA), 2016 IEEE International Conference on.
IEEE, 2016, pp. 1–4.
[161] A. M. Abdel-Zaher and A. M. Eldeib, “Breast cancer classification using deep belief networks,” Expert
Systems with Applications, vol. 46, pp. 139–144, 2016.
[162] D. Wang, A. Khosla, R. Gargeya, H. Irshad, and A. H. Beck, “Deep learning for identifying metastatic
breast cancer,” arXiv preprint arXiv:1606.05718, 2016.

[163] Y. Bar, I. Diamant, L. Wolf, and H. Greenspan, “Deep learning with non-medical training used for
chest pathology identification,” in Medical Imaging 2015: Computer-Aided Diagnosis, vol. 9414. In-
ternational Society for Optics and Photonics, 2015, p. 94140V.
[164] R. Rasti, M. Teshnehlab, and S. L. Phung, “Breast cancer diagnosis in dce-mri using mixture ensemble
of convolutional neural networks,” Pattern Recognition, vol. 72, pp. 381–390, 2017.
[165] X. Wang, W. Yang, J. Weinreb, J. Han, Q. Li, X. Kong, Y. Yan, Z. Ke, B. Luo, T. Liu et al., “Searching
for prostate cancer by fully automated magnetic resonance imaging classification: deep learning versus
non-deep learning,” Scientific reports, vol. 7, no. 1, p. 15415, 2017.
[166] M. Anthimopoulos, S. Christodoulidis, L. Ebner, A. Christe, and S. Mougiakakou, “Lung pattern clas-
sification for interstitial lung diseases using a deep convolutional neural network,” IEEE transactions
on medical imaging, vol. 35, no. 5, pp. 1207–1216, 2016.
[167] W. Sun, T.-L. Tseng, J. Zhang, and W. Qian, “Enhancing deep convolutional neural network scheme
for breast cancer diagnosis with unlabeled data,” Computerized Medical Imaging and Graphics, 2016.

[168] K. P. Exarchos, Y. Goletsis, and D. I. Fotiadis, “Multiparametric decision support system for the
prediction of oral cancer reoccurrence,” IEEE Transactions on Information Technology in Biomedicine,
vol. 16, no. 6, pp. 1127–1134, 2012.
[169] T. A. Patel, M. Puppala, R. O. Ogunti, J. E. Ensor, T. He, J. B. Shewale, D. P. Ankerst, V. G.
Kaklamani, A. A. Rodriguez, S. T. C. Wong, and J. C. Chang, “Correlating mammographic and
pathologic findings in clinical decision support using natural language processing and data mining
methods,” Cancer, pp. 1–8, 2016.
[170] T. Ayer, M. U. Ayvaci, Z. X. Liu, O. Alagoz, and E. S. Burnside, “Computer-aided diagnostic models
in breast cancer screening.” Imaging in Medicine, vol. 2, no. 3, pp. 313–323, 2010.

[171] S. Kloppel, C. M. Stonnington, C. Chu, B. Draganski, R. I. Scahill, J. D. Rohrer, N. C. Fox, C. R.
Jack, J. Ashburner, and R. S. J. Frackowiak, “Automatic classification of MR scans in Alzheimer’s
disease,” Brain, vol. 131, no. 3, pp. 681–689, 2008.
[172] M. Chupin, A. Hammers, R. S. N. Liu, O. Colliot, J. Burdett, E. Bardinet, J. S. Duncan, L. Garnero,
and L. Lemieux, “Automatic segmentation of the hippocampus and the amygdala driven by hybrid
constraints: Method and validation,” NeuroImage, vol. 46, no. 3, pp. 749–761, 2009.
[173] Y. Fan, D. Shen, R. C. Gur, R. E. Gur, and C. Davatzikos, “COMPARE: classification of morphological
patterns using adaptive regional elements,” IEEE Transactions on Medical Imaging, vol. 26, no. 1, pp.
93–105, 2007.
[174] Y. Chen, B. Shi, C. D. Smith, and J. Liu, “Nonlinear feature transformation and deep fusion for
Alzheimer’s disease staging analysis,” in Lecture Notes in Computer Science (including subseries Lec-
ture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2015, pp. 304–312.
[175] M. Liu, D. Zhang, and D. Shen, “Hierarchical fusion of features and classifier decisions for Alzheimer’s
disease diagnosis,” Human Brain Mapping, vol. 35, no. 4, pp. 1305–1319, 2014.
[176] C. Chu, A. L. Hsu, K. H. Chou, P. Bandettini, and C. Lin, “Does feature selection improve classification
accuracy? Impact of sample size and feature selection on classification using anatomical magnetic
resonance images,” NeuroImage, vol. 60, no. 1, pp. 59–70, 2012.
[177] E. Bron, M. Smits, J. Van Swieten, W. Niessen, and S. Klein, “Feature selection based on SVM
significance maps for classification of dementia,” in International Workshop on Machine Learning in
Medical Imaging, 2014, pp. 272–279.
[178] D. Gelb, E. Oliver, and S. Gilman, “Diagnostic criteria for Parkinson disease,” Arch Neurol, vol. 56,
no. 4, pp. 368–376, 1999.
[179] G. Singh and L. Samavedham, “Unsupervised learning based feature extraction for differential diagnosis
of neurodegenerative diseases: A case study on early-stage diagnosis of Parkinson disease,” Journal of
Neuroscience Methods, vol. 256, pp. 30–40, 2015.
[180] M. Liu, D. Zhang, and D. Shen, “Inherent structure-guided multi-view learning for Alzheimer’s disease
and mild cognitive impairment classification,” in International Workshop on Machine Learning in
Medical Imaging, 2015, pp. 296–303.
[181] C. D. Smyser, N. U. Dosenbach, T. A. Smyser, A. Z. Snyder, C. E. Rogers, T. E. Inder, B. L.
Schlaggar, and J. J. Neil, “Prediction of brain maturity in infants using machine-learning algorithms,”
NeuroImage, vol. 136, pp. 1–9, 2016.
[182] Alzheimer’s Association, “Alzheimer’s disease facts and figures,” Alzheimer’s & Dementia, vol. 12,
no. 4, p. 88, 2015.
[183] F. Li, L. Tran, K.-H. Thung, S. Ji, D. Shen, and J. Li, “Robust deep learning for improved classification
of AD / MCI patients,” in International Workshop on Machine Learning in Medical Imaging, 2014,
pp. 240–247.
[184] M. Komlagan, V.-T. Ta, X. Pan, J.-P. Domenger, D. Collins, and P. Coupé, “Anatomically constrained
weak classifier fusion for early detection of Alzheimer’s disease,” in International Workshop on Machine
Learning in Medical Imaging, 2014, pp. 141–148.
[185] B. Ahmed, C. E. Brodley, K. E. Blackmon, R. Kuzniecky, G. Barash, C. Carlson, B. T. Quinn,
W. Doyle, J. French, O. Devinsky, and T. Thesen, “Cortical feature analysis and machine learning
improves detection of “MRI-negative” focal cortical dysplasia,” Epilepsy & Behavior, vol. 48, pp. 21–28,
2015.

31
[186] S. J. Hong, H. Kim, D. Schrader, N. Bernasconi, B. C. Bernhardt, and A. Bernasconi, “Automated
detection of cortical dysplasia type II in MRI-negative epilepsy,” Neurology, vol. 83, no. 1, pp. 48–55,
2014.
[187] L. Huang, Y. Gao, Y. Jin, K.-H. Thung, and D. Shen, “Soft-split sparse regression based random
forest for predicting future clinical scores of Alzheimer’s disease,” International Workshop on Machine
Learning in Medical Imaging, pp. 194–202, 2015.
[188] X. Zhu, H.-i. Suk, and D. Shen, “Sparse discriminative feature selection for multi-class Alzheimer’s
disease classification,” in International Workshop on Machine Learning in Medical Imaging, 2014, pp.
157–164.

[189] X. Zhu, H.-i. Suk, Y. Zhu, and K.-h. Thung, “Multi-view classification for identification of Alzheimer’s
Disease,” in International Workshop on Machine Learning in Medical Imaging, vol. 255-262, 2015, pp.
255–262.
[190] R. Guerrero, C. Ledig, and D. Rueckert, “Manifold alignment and transfer learning for classification of
Alzheimer’s disease,” in International Workshop on Machine Learning in Medical Imaging, 2014, pp.
77–84.
[191] B. Cheng, M. Liu, and D. Zhang, “Multimodal multi-label transfer learning for early diagnosis of
Alzheimer’s disease,” in International Workshop on Machine Learning in Medical Imaging. Springer,
2015, pp. 238–245.
[192] S. Sarraf, J. Anderson, G. Tofighi et al., “DeepAD: Alzheimer’s disease classification via deep convolu-
tional neural networks using MRI and fMRI,” bioRxiv, p. 070441, 2016.
[193] Z. Long, B. Jing, H. Yan, J. Dong, H. Liu, X. Mo, Y. Han, and H. Li, “A support vector machine based
method to identify mild cognitive impairment with multi-level characteristics of magnetic resonance
imaging,” Neuroscience, vol. 331, pp. 169–176, 2016.

[194] A. Khazaee, A. Ebrahimzadeh, and A. Babajani-Feremi, “Application of advanced machine learning
methods on resting-state fMRI network for identification of mild cognitive impairment and Alzheimer’s
disease,” Brain Imaging and Behavior, vol. 10, no. 3, pp. 799–817, 2016.
[195] R. Armananzas, M. Iglesias, D. A. Morales, and L. Alonso-Nanclares, “Voxel-based diagnosis of
Alzheimer’s disease using classifier ensembles,” IEEE Journal of Biomedical and Health Informatics,
vol. PP, no. 99, pp. 1–7, 2016.

[196] S. Sarraf and G. Tofighi, “Deep learning-based pipeline to recognize alzheimer’s disease using fmri
data,” in Future Technologies Conference (FTC). IEEE, 2016, pp. 816–820.
[197] T. M. Schouten, M. Koini, F. de Vos, S. Seiler, M. de Rooij, A. Lechner, R. Schmidt, M. van den
Heuvel, J. van der Grond, and S. A. Rombouts, “Individual classification of alzheimer’s disease with
diffusion magnetic resonance imaging,” Neuroimage, vol. 152, pp. 476–481, 2017.
[198] R. S. Kumar and M. Senthilmurugan, “Content-based image retrieval system in medical applications,”
International Journal of Engineering Research and Technology, vol. 2, no. 3, 2013.
[199] C.-H. Wei, C.-T. Li, and R. Wilson, “A content–based approach to medical image database retrieval,”
Database Modeling for Industrial Data Management: Emerging Technologies and Applications, pp.
258–291, 2005.
[200] J. Yu, J. Amores, N. Sebe, P. Radeva, and Q. Tian, “Distance learning for similarity estimation,”
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 3, pp. 451–462, 2008.
[201] T. Emrich, F. Graf, H. P. Kriegel, M. Schubert, and M. Thoma, “Similarity estimation using Bayes
ensembles,” in International Conference on Scientific and Statistical Database Management, 2010, pp.
537–554.
[202] C. Kurtz, C. F. Beaulieu, S. Napel, and D. L. Rubin, “A hierarchical knowledge-based approach
for retrieving similar medical images described with semantic annotations,” Journal of Biomedical
Informatics, vol. 49, pp. 227–244, 2014.
[203] S. R. Dubey, S. K. Singh, and R. K. Singh, “Local wavelet pattern: a new feature descriptor for
image retrieval in medical CT databases,” IEEE Transactions on Image Processing, vol. 24, no. 12,
pp. 5892–5903, 2015.
[204] I. Ramirez, P. Sprechmann, and G. Sapiro, “Classification and clustering via dictionary learning with structured incoherence and shared features,” in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010, pp. 3501–3508.
[205] M. Srinivas and C. K. Mohan, “Medical images modality classification using multi-scale dictionary learning,” in International Conference on Digital Signal Processing, 2014, pp. 621–625.
[206] ——, “Classification of medical images using edge-based features and sparse representation,” in IEEE
International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016, pp. 912–916.
[207] E. Ahn, A. Kumar, J. Kim, C. Li, D. Feng, and M. Fulham, “X-ray image classification using domain transferred convolutional neural networks and local sparse spatial pyramid,” in 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI), 2016, pp. 855–858.
[208] A. V. Faria, K. Oishi, S. Yoshida, A. Hillis, M. I. Miller, and S. Mori, “Content-based image retrieval
for brain MRI: an image-searching engine and population-based analysis to utilize past clinical data
for future diagnosis,” NeuroImage: Clinical, vol. 7, pp. 367–376, 2015.
[209] C. Kurtz, A. Depeursinge, S. Napel, C. F. Beaulieu, and D. L. Rubin, “On combining image-based and
ontological semantic dissimilarities for medical image retrieval applications,” Medical Image Analysis,
vol. 18, no. 7, pp. 1082–1100, 2014.
[210] Y. Cao, S. Steffey, H. Jianbiao, D. Xiao, C. Tao, P. Chen, and H. Müller, “Medical image retrieval: a
multimodal approach,” Cancer Informatics, vol. 13, pp. 125–136, 2015.
[211] M. Verma and B. Raman, “Center symmetric local binary co-occurrence pattern for texture, face and bio-medical image retrieval,” Journal of Visual Communication and Image Representation, vol. 32, pp. 224–236, 2015.
[212] R. Lan, S. Zhong, Z. Liu, Z. Shi, and X. Luo, “A simple texture feature for retrieval of medical images,”
Multimedia Tools and Applications, pp. 1–14, 2018.
[213] E. M. Rohren, T. G. Turkington, and R. E. Coleman, “Clinical applications of PET in oncology,” Radiology, vol. 231, pp. 305–332, 2004.
[214] J. Kang, Y. Gao, F. Shi, D. S. Lalush, W. Lin, and D. Shen, “Prediction of standard-dose PET image
by low-dose PET and MRI images,” Medical Physics, vol. 42, no. 9, pp. 5301–5309, 2015.
[215] L. Xiang, Y. Qiao, D. Nie, L. An, W. Lin, Q. Wang, and D. Shen, “Deep auto-context convolutional neural networks for standard-dose PET image estimation from low-dose PET/MRI,” Neurocomputing, vol. 267, pp. 406–416, 2017.
[216] P. Kontschieder, S. R. Bulò, H. Bischof, and M. Pelillo, “Structured class-labels in random forests for
semantic image labelling,” in IEEE International Conference on Computer Vision, 2011, pp. 2190–
2197.
[217] P. Dollar and C. L. Zitnick, “Structured forests for fast edge detection,” in IEEE International Con-
ference on Computer Vision, 2013, pp. 1841–1848.
[218] X. Yang, R. Kwitt, M. Styner, and M. Niethammer, “Quicksilver: Fast predictive image registration–a
deep learning approach,” NeuroImage, vol. 158, pp. 378–396, 2017.
[219] A. Cerasa, “Machine learning on Parkinson’s disease? Let’s translate into clinical practice,” Journal
of Neuroscience Methods, vol. 266, pp. 161–162, 2015.
[220] J. Weese and C. Lorenz, “Four challenges in medical image analysis from an industrial perspective,”
Medical Image Analysis, vol. 33, pp. 1339–1351, 2016.
[221] V. Cheplygina, A. van Opbroek, M. A. Ikram, M. W. Vernooij, and M. de Bruijne, “Asymmetric similarity-weighted ensembles for image segmentation,” in 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI). IEEE, 2016, pp. 273–277.
[222] W. Shen, M. Zhou, F. Yang, D. Dong, C. Yang, Y. Zang, and J. Tian, “Learning from experts: devel-
oping transferable deep features for patient-level lung cancer prediction,” in International Conference
on Medical Image Computing and Computer-Assisted Intervention. Springer, 2016, pp. 124–131.
[223] B. Cheng, M. Liu, H.-I. Suk, D. Shen, D. Zhang, and the Alzheimer’s Disease Neuroimaging Initiative, “Multimodal manifold-regularized transfer learning for MCI conversion prediction,” Brain Imaging and Behavior, vol. 9, no. 4, pp. 913–926, 2015.
[224] R. Paul, S. H. Hawkins, Y. Balagurunathan, M. B. Schabath, R. J. Gillies, L. O. Hall, and D. B.
Goldgof, “Deep feature transfer learning in combination with traditional features predicts survival
among patients with lung adenocarcinoma,” Tomography: a journal for imaging research, vol. 2, no. 4,
p. 388, 2016.
[225] V. Cheplygina, M. de Bruijne, and J. P. Pluim, “Not-so-supervised: a survey of semi-supervised,
multi-instance, and transfer learning in medical image analysis,” arXiv preprint arXiv:1804.06353,
2018.
[226] L. Mena and J. A. Gonzalez, “Machine learning for imbalanced datasets: Application in medical diagnostic,” in Proceedings of the FLAIRS Conference, 2006, pp. 574–579.
[227] N. Japkowicz and S. Stephen, “The class imbalance problem: A systematic study,” Intelligent Data
Analysis, vol. 6, no. 5, pp. 429–449, 2002.
[228] J. Wang, X. Yang, H. Cai, W. Tan, C. Jin, and L. Li, “Discrimination of breast cancer with microcalcifications on mammography by deep learning,” Scientific Reports, vol. 6, p. 27327, 2016.
[229] W. Samek, T. Wiegand, and K.-R. Müller, “Explainable artificial intelligence: Understanding, visual-
izing and interpreting deep learning models,” arXiv preprint arXiv:1708.08296, 2017.
[230] E. Thelisson, K. Padh, and L. E. Celis, “Regulatory mechanisms and algorithms towards trust in AI/ML,” in Proceedings of the IJCAI 2017 Workshop on Explainable Artificial Intelligence (XAI), Melbourne, Australia, 2017.
[231] J. R. Zech, M. A. Badgeley, M. Liu, A. B. Costa, J. J. Titano, and E. K. Oermann, “Variable generaliza-
tion performance of a deep learning model to detect pneumonia in chest radiographs: A cross-sectional
study,” PLoS medicine, vol. 15, no. 11, p. e1002683, 2018.
[232] R. Caruana, Y. Lou, J. Gehrke, P. Koch, M. Sturm, and N. Elhadad, “Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission,” in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2015, pp. 1721–1730.
[233] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba, “Learning deep features for discriminative
localization,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,
2016, pp. 2921–2929.