Computational Analysis and Deep Learning for Medical Care: Principles, Methods, and Applications, 1st Edition. Amit Kumar Tyagi (Editor).
Table of Contents
Cover
Title Page
Copyright
Preface
Part 1: Deep Learning and Its Models
1 CNN: A Review of Models, Application of IVD Segmentation
1.1 Introduction
1.2 Various CNN Models
1.3 Application of CNN to IVD Detection
1.4 Comparison With State-of-the-Art Segmentation
Approaches for Spine T2W Images
1.5 Conclusion
References
2 Location-Aware Keyword Query Suggestion Techniques
With Artificial Intelligence Perspective
2.1 Introduction
2.2 Related Work
2.3 Artificial Intelligence Perspective
2.4 Architecture
2.5 Conclusion
References
3 Identification of a Suitable Transfer Learning Architecture
for Classification: A Case Study with Liver Tumors
3.1 Introduction
3.2 Related Works
3.3 Convolutional Neural Networks
3.4 Transfer Learning
3.5 System Model
3.6 Results and Discussions
3.7 Conclusion
References
4 Optimization and Deep Learning-Based Content Retrieval,
Indexing, and Metric Learning Approach for Medical Images
4.1 Introduction
4.2 Related Works
4.3 Proposed Method
4.4 Results and Discussion
4.5 Conclusion
References
Part 2: Applications of Deep Learning
5 Deep Learning for Clinical and Health Informatics
5.1 Introduction
5.2 Related Work
5.3 Motivation
5.4 Scope of the Work in Past, Present, and Future
5.5 Deep Learning Tools, Methods Available for Clinical,
and Health Informatics
5.6 Deep Learning: Not-So-Near Future in Biomedical
Imaging
5.7 Challenges Faced Toward Deep Learning Using in
Biomedical Imaging
5.8 Open Research Issues and Future Research Directions
in Biomedical Imaging (Healthcare Informatics)
5.9 Conclusion
References
6 Biomedical Image Segmentation by Deep Learning Methods
6.1 Introduction
6.2 Overview of Deep Learning Algorithms
6.3 Other Deep Learning Architecture
6.4 Biomedical Image Segmentation
6.5 Conclusion
References
7 Multi-Lingual Handwritten Character Recognition Using
Deep Learning
7.1 Introduction
7.2 Related Works
7.3 Materials and Methods
7.4 Experiments and Results
7.5 Conclusion
References
8 Disease Detection Platform Using Image Processing Through
OpenCV
8.1 Introduction
8.2 Problem Statement
8.3 Conclusion
8.4 Summary
References
9 Computer-Aided Diagnosis of Liver Fibrosis in Hepatitis
Patients Using Convolutional Neural Network
9.1 Introduction
9.2 Overview of System
9.3 Methodology
9.4 Performance and Analysis
9.5 Experimental Results
9.6 Conclusion and Future Scope
References
Part 3: Future Deep Learning Models
10 Lung Cancer Prediction in Deep Learning Perspective
10.1 Introduction
10.2 Machine Learning and Its Application
10.3 Related Work
10.4 Why Deep Learning on Top of Machine Learning?
10.5 How is Deep Learning Used for Prediction of Lungs
Cancer?
10.6 Conclusion
References
11 Lesion Detection and Classification for Breast Cancer
Diagnosis Based on Deep CNNs from Digital Mammographic
Data
11.1 Introduction
11.2 Background
11.3 Methods
11.4 Application of Deep CNN for Mammography
11.5 System Model and Results
11.6 Research Challenges and Discussion on Future
Directions
11.7 Conclusion
References
12 Health Prediction Analytics Using Deep Learning Methods
and Applications
12.1 Introduction
12.2 Background
12.3 Predictive Analytics
12.4 Deep Learning Predictive Analysis Applications
12.5 Discussion
12.6 Conclusion
References
13 Ambient-Assisted Living of Disabled Elderly in an
Intelligent Home Using Behavior Prediction—A Reliable Deep
Learning Prediction System
13.1 Introduction
13.2 Activities of Daily Living and Behavior Analysis
13.3 Intelligent Home Architecture
13.4 Methodology
13.5 Senior Analytics Care Model
13.6 Results and Discussions
13.7 Conclusion
Nomenclature
References
14 Early Diagnosis Tool for Alzheimer’s Disease Using 3D
Slicer
14.1 Introduction
14.2 Related Work
14.3 Existing System
14.4 Proposed System
14.5 Results and Discussion
14.6 Conclusion
References
Part 4: Deep Learning - Importance and Challenges for Other
Sectors
15 Deep Learning for Medical Healthcare: Issues, Challenges,
and Opportunities
15.1 Introduction
15.2 Related Work
15.3 Development of Personalized Medicine Using Deep
Learning: A New Revolution in Healthcare Industry
15.4 Deep Learning Applications in Precision Medicine
15.5 Deep Learning for Medical Imaging
15.6 Drug Discovery and Development: A Promise
Fulfilled by Deep Learning Technology
15.7 Application Areas of Deep Learning in Healthcare
15.8 Privacy Issues Arising With the Usage of Deep
Learning in Healthcare
15.9 Challenges and Opportunities in Healthcare Using
Deep Learning
15.10 Conclusion and Future Scope
References
16 A Perspective Analysis of Regularization and Optimization
Techniques in Machine Learning
16.1 Introduction
16.2 Regularization in Machine Learning
16.3 Convexity Principles
16.4 Conclusion and Discussion
References
17 Deep Learning-Based Prediction Techniques for Medical
Care: Opportunities and Challenges
17.1 Introduction
17.2 Machine Learning and Deep Learning Framework
17.3 Challenges and Opportunities
17.4 Clinical Databases—Electronic Health Records
17.5 Data Analytics Models—Classifiers and Clusters
17.6 Deep Learning Approaches and Association
Predictions
17.7 Conclusion
17.8 Applications
References
18 Machine Learning and Deep Learning: Open Issues and
Future Research Directions for the Next 10 Years
18.1 Introduction
18.2 Evolution of Machine Learning and Deep Learning
18.3 The Forefront of Machine Learning Technology
18.4 The Challenges Facing Machine Learning and Deep
Learning
18.5 Possibilities With Machine Learning and Deep
Learning
18.6 Potential Limitations of Machine Learning and Deep
Learning
18.7 Conclusion
Acknowledgement
Contribution/Disclosure
References
Index
List of Illustrations
Chapter 1
Figure 1.1 Architecture of LeNet-5.
Figure 1.2 Architecture of AlexNet.
Figure 1.3 Architecture of ZFNet.
Figure 1.4 Architecture of VGG-16.
Figure 1.5 Inception module.
Figure 1.6 Architecture of GoogleNet.
Figure 1.7 (a) A residual block.
Figure 1.8 Architecture of ResNeXt.
Figure 1.9 Architecture of SE-ResNet.
Figure 1.10 Architecture of DenseNet.
Figure 1.11 Architecture of MobileNets.
Chapter 2
Figure 2.1 General architecture of a search engine.
Figure 2.2 The increased mobile users.
Figure 2.3 AI-powered location-based system.
Figure 2.4 Architecture diagram for querying.
Chapter 3
Figure 3.1 Phases of CECT images (1: normal liver; 2: tumor
within liver; 3: sto...
Figure 3.2 Architecture of convolutional neural network.
Figure 3.3 AlexNet architecture.
Figure 3.4 GoogLeNet architecture.
Figure 3.5 Residual learning—building block.
Figure 3.6 Architecture of ResNet-18.
Figure 3.7 System model for case study on liver tumor
diagnosis.
Figure 3.8 Output of bidirectional region growing
segmentation algorithm: (a) in...
Figure 3.9 HA Phase Liver CT images: (a) normal liver; (b)
HCC; (c) hemangioma; ...
Figure 3.10 Training progress for AlexNet.
Figure 3.11 Training progress for GoogLeNet.
Figure 3.12 Training progress for ResNet-18.
Figure 3.13 Training progress for ResNet-50.
Chapter 4
Figure 4.1 Proposed system for image retrieval.
Figure 4.2 Schematic of the deep convolutional neural
networks.
Figure 4.3 Proposed feature extraction system.
Figure 4.4 Proposed model for the localization of the
abnormalities.
Figure 4.5 Graph for the retrieval performance of the metric
learning for VGG19.
Figure 4.6 PR values for state of art ConvNet model for CT
images.
Figure 4.7 PR values for state of art CNN model for CT images.
Figure 4.8 Proposed system—PR values for the CT images.
Figure 4.9 PR values for proposed content-based image
retrieval.
Figure 4.10 Graph for loss function of proposed deep
regression networks for tra...
Figure 4.11 Graph for loss function of proposed deep
regression networks for val...
Chapter 5
Figure 5.1 Different informatics in healthcare [28].
Chapter 6
Figure 6.1 CT image reconstruction (past, present, and future)
[3].
Figure 6.2 (a) Classic machine learning algorithm, (b) Deep
learning algorithm.
Figure 6.3 Traditional neural network.
Figure 6.4 Convolutional Neural Network.
Figure 6.5 Psoriasis images [2].
Figure 6.6 Restricted Boltzmann Machine.
Figure 6.7 Autoencoder architecture with vector and image
inputs [1].
Figure 6.8 Image of chest x-ray [60].
Figure 6.9 Regular thoracic disease identified in chest x-rays
[23].
Figure 6.10 MRI of human brain [4].
Chapter 7
Figure 7.1 Architecture of the proposed approach.
Figure 7.2 Sample Math dataset (including English
characters).
Figure 7.3 Sample Bangla dataset (including Bangla numeric).
Figure 7.4 Sample Devanagari dataset (including Hindi
numeric).
Figure 7.5 Dataset distribution for English dataset.
Figure 7.6 Dataset distribution for Hindi dataset.
Figure 7.7 Dataset distribution for Bangla dataset.
Figure 7.8 Dataset distribution for Math Symbol dataset.
Figure 7.9 Dataset distribution.
Figure 7.10 Precision-recall curve on English dataset.
Figure 7.11 ROC curve on English dataset.
Figure 7.12 Precision-recall curve on Hindi dataset.
Figure 7.13 ROC curve on Hindi dataset.
Figure 7.14 Precision-recall curve on Bangla dataset.
Figure 7.15 ROC curve on Bangla dataset.
Figure 7.16 Precision-recall curve on Math Symbol dataset.
Figure 7.17 ROC curve on Math symbol dataset.
Figure 7.18 Precision-recall curve of the proposed model.
Figure 7.19 ROC curve of the proposed model.
Chapter 8
Figure 8.1 Eye image dissection [34].
Figure 8.2 Cataract algorithm [10].
Figure 8.3 Pre-processing algorithm [48].
Figure 8.4 Pre-processing analysis [39].
Figure 8.5 Morphologically opened [39].
Figure 8.6 Finding circles [40].
Figure 8.7 Iris contour separation [40].
Figure 8.8 Image inversion [41].
Figure 8.9 Iris detection [41].
Figure 8.10 Cataract detection [41].
Figure 8.11 Healthy eye vs. retinoblastoma [33].
Figure 8.12 Unilateral retinoblastoma [18].
Figure 8.13 Bilateral retinoblastoma [19].
Figure 8.14 Classification of stages of skin cancer [20].
Figure 8.15 Eye cancer detection algorithm.
Figure 8.16 Sample test cases.
Figure 8.17 Actual working of the eye cancer detection
algorithm.
Figure 8.18 Melanoma example [27].
Figure 8.19 Melanoma detection algorithm.
Figure 8.20 Asymmetry analysis.
Figure 8.21 Border analysis.
Figure 8.22 Color analysis.
Figure 8.23 Diameter analysis.
Figure 8.24 Completed detailed algorithm.
Chapter 9
Figure 9.1 Basic overview of a proposed computer-aided
system.
Figure 9.2 Block diagram of the proposed system for finding
out liver fibrosis.
Figure 9.3 Block diagram representing different pre-
processing stages in liver f...
Figure 9.4 Flow chart showing student’s t test.
Figure 9.5 Diagram showing SegNet architecture for
convolutional encoder and dec...
Figure 9.6 Basic block diagram of VGG-16 architecture.
Figure 9.7 Flow chart showing SegNet working process for
classifying liver fibro...
Figure 9.8 Overall process of the CNN of the system.
Figure 9.9 The stages in identifying liver fibrosis by using
Conventional Neural...
Figure 9.10 Multi-layer neural network architecture for a CAD
system for diagnos...
Figure 9.11 Graphical representation of Support Vector
Machine.
Figure 9.12 Experimental analysis graph for different
classifier in terms of acc...
Chapter 10
Figure 10.1 Block diagram of machine learning.
Figure 10.2 Machine learning algorithm.
Figure 10.3 Structure of deep learning.
Figure 10.4 Architecture of DNN.
Figure 10.5 Architecture of CNN.
Figure 10.6 System architecture.
Figure 10.7 Image before histogram equalization.
Figure 10.8 Image after histogram equalization.
Figure 10.9 Edge detection.
Figure 10.10 Edge segmented image.
Figure 10.11 Total cases.
Figure 10.12 Result comparison.
Chapter 11
Figure 11.1 Breast cancer incidence rates worldwide (source:
International Agenc...
Figure 11.2 Images from MIAS database showing normal,
benign, malignant mammogra...
Figure 11.3 Image depicting noise in a mammogram.
Figure 11.4 Architecture of CNN.
Figure 11.5 A complete representation of all the operation
that take place at va...
Figure 11.6 An image depicting Pouter, Plesion, and Pbreast in
a mammogram.
Figure 11.7 The figure depicts two images: (a) mammogram
with a malignant mass a...
Figure 11.8 A figure depicting the various components of a
breast as identified ...
Figure 11.9 An illustration of how a mammogram image
having tumor is segmented t...
Figure 11.10 A schematic representation of classification
procedure of CNN.
Figure 11.11 A schematic representation of classification
procedure of CNN durin...
Figure 11.12 Proposed system model.
Figure 11.13 Flowchart for MIAS database and unannotated
labeled images.
Figure 11.14 Image distribution for training model.
Figure 11.15 The graph shows the loss for the trained model
on train and test da...
Figure 11.16 The graph shows the accuracy of the trained
model for both test and...
Figure 11.17 Depiction of the confusion matrix for the trained
CNN model.
Figure 11.18 Receiver operating characteristics of the trained
model.
Figure 11.19 The image shows the summary of the CNN
model.
Figure 11.20 Performance parameters of the trained model.
Figure 11.21 Prediction of one of the image collected from
diagnostic center.
Chapter 12
Figure 12.1 Deep learning [14]. (a) A simple, multilayer deep
neural network tha...
Figure 12.2 Flowchart of the model [25]. The orange icon
indicates the dataset, ...
Figure 12.3 Evaluation result [25].
Figure 12.4 Deep learning techniques evaluation results [25].
Figure 12.5 Deep transfer learning–based screening system
[38].
Figure 12.6 Classification result.
Figure 12.7 Regression result [45].
Figure 12.8 AE model of deep learning [47].
Figure 12.9 DBN for induction motor fault diagnosis [68].
Figure 12.10 CNN model for health monitoring [80].
Figure 12.11 RNN model for health monitoring [87].
Figure 12.12 Deep learning models usage.
Chapter 13
Figure 13.1 Intelligent home layout model.
Figure 13.2 Deep learning model in predicting behavior
analysis.
Figure 13.3 Lifestyle-oriented context aware model.
Figure 13.4 Components for the identification, simulation, and
detection of acti...
Figure 13.5 Prediction stages.
Figure 13.6 Analytics of event.
Figure 13.7 Prediction of activity duration.
Chapter 14
Figure 14.1 Comparison of normal and Alzheimer brain.
Figure 14.2 Proposed AD prediction system.
Figure 14.3 KNN classification.
Figure 14.4 SVM classification.
Figure 14.5 Load data in 3D slicer.
Figure 14.6 3D slicer visualization.
Figure 14.7 Normal patient MRI.
Figure 14.8 Alzheimer patient MRI.
Figure 14.9 Comparison of hippocampus region.
Figure 14.10 Accuracy of algorithms with baseline records.
Figure 14.11 Accuracy of algorithms with current records.
Figure 14.12 Comparison of without and with dice coefficient.
Chapter 15
Figure 15.1 U-Net architecture [19].
Figure 15.2 Architecture of the 3D-DCSRN model [29].
Figure 15.3 SMILES code for Cyclohexane and Acetaminophen
[32].
Figure 15.4 Medical chatbot architecture [36].
Chapter 16
Figure 16.1 A classical perceptron.
Figure 16.2 Forward and backward paths on an ANN
architecture.
Figure 16.3 A DNN architecture.
Figure 16.4 A DNN architecture for digit classification.
Figure 16.5 Underfit and overfit.
Figure 16.6 Functional mapping.
Figure 16.7 A generalized Tikhonov functional.
Figure 16.8 (a) With hidden layers (b) Dropping h2 and h5.
Figure 16.9 Image cropping as one of the features of data
augmentation.
Figure 16.10 Early stopping criteria based on errors.
Figure 16.11 (a) Convex, (b) Non-convex.
Figure 16.12 (a) Affine (b) Convex function.
Figure 16.13 Workflow and an optimizer.
Figure 16.14 (a) Error (cost) function (b) Elliptical: Horizontal
cross section.
Figure 16.15 Contour plot for a quadratic cost function with
elliptical contours...
Figure 16.16 Gradients when steps are varying.
Figure 16.17 Local minima. (When the gradient ∇ of the
partial derivatives is po...
Figure 16.18 Contour plot showing basins of attraction.
Figure 16.19 (a) Saddle point S. (b) Saddle point over a two-
dimensional error s...
Figure 16.20 Local information encoded by the gradient
usually does not support ...
Figure 16.21 Direction of gradient change.
Figure 16.22 Rolling ball and its trajectory.
Chapter 17
Figure 17.1 Artificial Neural Networks vs. Architecture of
Deep Learning Model [...
Figure 17.2 Machine learning and deep learning techniques
[4, 5].
Figure 17.3 Model of reinforcement learning
(https://www.kdnuggets.com).
Figure 17.4 Data analytical model [5].
Figure 17.5 Support Vector Machine—classification approach
[1].
Figure 17.6 Expected output of K-means clustering [1].
Figure 17.7 Output of mean shift clustering [2].
Figure 17.8 Genetic Signature–based Hierarchical Random
Forest Cluster (G-HR Clu...
Figure 17.9 Artificial Neural Networks vs. Deep Learning
Neural Networks.
Figure 17.10 Architecture of Convolution Neural Network.
Figure 17.11 Architecture of the Human Diseases Pattern
Prediction Technique (EC...
Figure 17.12 Comparative analysis: processing time vs.
classifiers.
Figure 17.13 Comparative analysis: memory usage vs.
classifiers.
Figure 17.14 Comparative analysis: classification accuracy vs.
classifiers.
Figure 17.15 Comparative analysis: sensitivity vs. classifiers.
Figure 17.16 Comparative analysis: specificity vs. classifiers.
Figure 17.17 Comparative analysis: FScore vs. classifiers.
Chapter 18
Figure 18.1 Deep Neural Network (DNN).
Figure 18.2 The evolution of machine learning techniques
(year-wise).
List of Tables
Chapter 1
Table 1.1 Various parameters of the layers of LeNet.
Table 1.2 Every column indicates which feature map in S2 are
combined by the uni...
Table 1.3 AlexNet layer details.
Table 1.4 Various parameters of ZFNet.
Table 1.5 Various parameters of VGG-16.
Table 1.6 Various parameters of GoogleNet.
Table 1.7 Various parameters of ResNet.
Table 1.8 Comparison of ResNet-50 and ResNext-50 (32 × 4d).
Table 1.9 Comparison of ResNet-50 and ResNext-50 and SE-
ResNeXt-50 (32 × 4d).
Table 1.10 Comparison of DenseNet.
Table 1.11 Various parameters of MobileNets.
Table 1.12 State-of-art of spine segmentation approaches.
Chapter 2
Table 2.1 History of search engines.
Table 2.2 Three types of user refinement of queries.
Table 2.3 Different approaches for the query suggestion
techniques.
Chapter 3
Table 3.1 Types of liver lesions.
Table 3.2 Dataset count.
Table 3.3 Hyperparameter settings for training.
Table 3.4 Confusion matrix for AlexNet.
Table 3.5 Confusion matrix for GoogLeNet.
Table 3.6 Confusion matrix for ResNet-18.
Table 3.7 Confusion matrix for ResNet-50.
Table 3.8 Comparison of classification accuracies.
Chapter 4
Table 4.1 Retrieval performance of metric learning for VGG19.
Table 4.2 Performance of retrieval techniques of the trained
VGG19 among fine-tu...
Table 4.3 PR values of various models—a comparison for CT
image retrieval.
Table 4.4 Recall vs. precision for proposed content-based
image retrieval.
Table 4.5 Loss function of proposed deep regression networks
for training datase...
Table 4.6 Loss function of proposed deep regression networks
for validation data...
Table 4.7 Land mark details (identification rates vs. distance
error) for the pr...
Table 4.8 Accuracy value of the proposed system.
Table 4.9 Accuracy of the retrieval methods compared with
the metric learning–ba...
Chapter 6
Table 6.1 Definition of the abbreviations.
Chapter 7
Table 7.1 Performance of proposed models on English dataset.
Table 7.2 Performance of proposed model on Bangla dataset.
Table 7.3 Performance of proposed model on Math Symbol
dataset.
Chapter 8
Table 8.1 ABCD factor for TDS value.
Table 8.2 Classify mole according to TDS value.
Chapter 9
Table 9.1 The confusion matrix for different classifier.
Table 9.2 Performance analysis of different classifiers:
Random Forest, SVM, Naï...
Chapter 10
Table 10.1 Result analysis.
Chapter 11
Table 11.1 Comparison of different techniques and tumor.
Chapter 13
Table 13.1 Cognitive functions related with routine activities.
Table 13.2 Situation and design features.
Table 13.3 Accuracy of prediction.
Chapter 14
Table 14.1 Accuracy comparison and mean of algorithms with
baseline records.
Table 14.2 Accuracy comparison and mean of algorithms with
current records.
Chapter 15
Table 15.1 Variances of Convolutional Neural Network (CNN).
Table 15.2 Various issues challenges faced by researchers for
using deep learnin...
Chapter 17
Table 17.1 Comparative analysis: classification accuracy for 10
datasets—analysi...
Chapter 18
Table 18.1 Comparison among data mining, machine learning,
and deep learning.
Scrivener Publishing
100 Cummings Center, Suite 541J
Beverly, MA 01915-6106
Publishers at Scrivener
Martin Scrivener ([email protected])
Phillip Carmical ([email protected])
Computational Analysis and Deep Learning for Medical Care
Abstract
The widespread success of the Convolutional Neural Network (CNN) in domains such as image classification, object recognition, and scene classification has revolutionized machine learning research, especially on medical images. Magnetic Resonance Images (MRIs) suffer from severe noise, weak edges, low contrast, and intensity inhomogeneity. Recent advances in deep learning, with fewer connections and parameters, have made such networks easier to train. This chapter presents an in-depth review of the various deep architectures, as well as their application to segmenting the intervertebral disc (IVD) from 3D spine images, together with an evaluation. The first section studies the traditional deep CNN architectures, such as LeNet, AlexNet, ZFNet, GoogLeNet, VGGNet, ResNet, the Inception model, ResNeXt, SENet, MobileNet V1/V2, and DenseNet, along with the parameters and components associated with each model. The second section discusses the application of these models to segmenting the IVD from spine images. Finally, the theoretical performance and experimental results reported in the state-of-the-art literature show that a 2.5D multi-scale FCN performs best, with a Dice Similarity Coefficient (DSC) of 90.64%.
Keywords: CNN, deep learning, intervertebral disc degeneration, MRI segmentation
1.1 Introduction
The concept of the Convolutional Neural Network (CNN) was introduced by Fukushima. The principle behind CNNs is that the human visual mechanism is hierarchical in structure. CNNs have been successfully applied in various image domains such as image classification, object recognition, and scene classification. A CNN is defined as a series of convolutional and pooling layers. In a convolutional layer, the image is convolved with a filter, i.e., the filter slides over the image spatially, computing dot products. The pooling layer produces a smaller feature set.
One major cause of low back pain is disc degeneration. Automated detection of lumbar abnormalities from clinical scans is a burden for radiologists. Researchers have focused on automating the segmentation of large sets of MRI data because of the huge size of such images. The success of CNNs in various fields of object detection has encouraged researchers to apply these models to the detection of the intervertebral disc (IVD), which in turn helps in the diagnosis of disease.
The remainder of the chapter is structured as follows. The next section studies the various CNN models. Section 1.3 presents applications of CNN to the detection of the IVD. Section 1.4 compares state-of-the-art segmentation approaches for spine T2W images, and Section 1.5 concludes.
In the first convolutional layer, the number of learning parameters is (5 × 5 + 1) × 6 = 156, where 6 is the number of filters, 5 × 5 is the filter size, and 1 accounts for the bias; there are 28 × 28 × 156 = 122,304 connections.
In the subsampling layer, the feature map size is 14 × 14, the number of learning parameters is (coefficient + bias) × no. of filters = (1 + 1) × 6 = 12, and the number of connections is 30 × 14 × 14 = 5,880.
Layer 3: In this layer, the 16 feature maps of C3 are only partially connected to the six feature maps of the previous layer S2, as shown in Table 1.2. Each unit in C3 is connected to several 5 × 5 receptive fields at identical locations in S2. The total number of trainable parameters is (3×5×5+1)×6 + (4×5×5+1)×9 + (6×5×5+1)×1 = 1,516, and the total number of connections is (3×5×5+1)×6×10×10 + (4×5×5+1)×9×10×10 + (6×5×5+1)×10×10 = 151,600. In total, LeNet-5 has about 60K parameters.
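These layer-wise counts follow directly from convolution arithmetic. The short Python sketch below (the helper name conv_params is ours, not from the chapter) reproduces the numbers quoted for C1 and C3:

```python
def conv_params(kernel_h, kernel_w, in_maps, out_maps):
    """Trainable parameters of a conv layer: weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_maps + 1) * out_maps

# C1: six 5 x 5 filters over the single input channel, 28 x 28 output
c1_params = conv_params(5, 5, 1, 6)        # (5*5 + 1) * 6 = 156
c1_connections = 28 * 28 * c1_params       # 122,304

# C3: 16 maps partially connected to S2 (Table 1.2): six maps see 3 inputs,
# nine maps see 4 inputs, and one map sees all 6; output is 10 x 10.
c3_params = (conv_params(5, 5, 3, 6)
             + conv_params(5, 5, 4, 9)
             + conv_params(5, 5, 6, 1))    # 1,516
c3_connections = 10 * 10 * c3_params       # 151,600
```

Counting connections as output positions times parameters per position is what makes partial connectivity pay off: C3 would need (6×5×5+1)×16 = 2,416 parameters if fully connected to S2.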
1.2.2 AlexNet
Alex Krizhevsky et al. [2] presented a new architecture, AlexNet, to classify the ImageNet dataset, which consists of 1.2 million high-resolution images, into 1,000 different classes. In the original implementation, the layers were split into two groups and trained on two separate GPUs (GTX 580, 3 GB), which took around 5–6 days. The network contains five convolutional layers interleaved with max-pooling layers, followed by three fully connected layers and finally a 1,000-way softmax classifier. The network uses the ReLU activation function, data augmentation, dropout, local response normalization, and overlapping pooling. AlexNet has 60M parameters. Figure 1.2 shows the architecture of AlexNet and Table 1.3 shows its various parameters.
Table 1.2 Each column indicates which feature maps in S2 are combined by the units in a particular feature map of C3 [1].

      0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
 0    X           X  X  X        X  X  X  X     X  X
 1    X  X           X  X  X        X  X  X  X     X
 2    X  X  X           X  X  X        X     X  X  X
 3       X  X  X        X  X  X  X        X     X  X
 4          X  X  X        X  X  X  X     X  X     X
 5             X  X  X        X  X  X  X     X  X  X
Figure 1.2 Architecture of AlexNet.
First Layer: AlexNet accepts a 227 × 227 × 3 RGB image as input which is fed to the first convolutional layer
with 96 kernels (feature maps or filters) of size 11 × 11 × 3 and a stride of 4 and the dimension of the output
image is changed to 96 images of size 55 × 55. The next layer is a max-pooling (sub-sampling) layer, which uses a window size of 3 × 3 and a stride of 2 and produces an output of size 27 × 27 × 96.
Second Layer: The second convolutional layer filters the 27 × 27 × 96 image with 256 kernels of size 5 × 5
and a stride of 1 pixel. Then, it is followed by max-pooling layer with filter size 3 × 3 and a stride of 2 and the
output image is changed to 256 images of size 13 × 13.
Third, Fourth, and Fifth Layers: The third, fourth, and fifth convolutional layers use a filter size of 3 × 3 and a stride of 1. The third and fourth convolutional layers have 384 feature maps each, and the fifth layer uses 256 filters. These layers are followed by a max-pooling layer with a filter size of 3 × 3 and a stride of 2, giving 256 feature maps of size 6 × 6.
Sixth Layer: The 6 × 6 × 256 output is flattened into 9,216 values, which form the input to the fully connected layers.
Seventh and Eighth Layers: The seventh and eighth layers are fully connected layers with 4,096 neurons.
Output Layer: The activation used in the output layer is softmax and consists of 1,000 classes.
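The shape and parameter bookkeeping in this walkthrough can be checked mechanically. The sketch below is illustrative (the helper names are ours, and the "same" paddings assumed for CONV2 through CONV5 are not stated explicitly in the text); it reproduces the 62,378,344-parameter total of Table 1.3:

```python
def conv_out(size, kernel, stride, pad=0):
    """Spatial output size of a conv/pool layer: floor((N - F + 2P)/S) + 1."""
    return (size - kernel + 2 * pad) // stride + 1

def conv_params(kernel, in_ch, out_ch):
    """Weights plus one bias per filter."""
    return (kernel * kernel * in_ch + 1) * out_ch

# Spatial sizes: 227 -> 55 -> 27 -> 13 -> 6, as in the text.
s = conv_out(227, 11, 4)       # CONV1: 55
s = conv_out(s, 3, 2)          # POOL1: 27
s = conv_out(s, 5, 1, pad=2)   # CONV2 (assumed "same" padding): 27
s = conv_out(s, 3, 2)          # POOL2: 13
s = conv_out(s, 3, 2)          # POOL3 after CONV3-5 (all "same"): 6

total = (conv_params(11, 3, 96)          # CONV1:    34,944
         + conv_params(5, 96, 256)       # CONV2:   614,656
         + conv_params(3, 256, 384)      # CONV3:   885,120
         + conv_params(3, 384, 384)      # CONV4: 1,327,488
         + conv_params(3, 384, 256)      # CONV5:   884,992
         + (6 * 6 * 256 + 1) * 4096     # FC6:  37,752,832
         + (4096 + 1) * 4096            # FC7:  16,781,312
         + (4096 + 1) * 1000)           # FC8:   4,097,000
```

Note that almost 94% of the parameters sit in the three fully connected layers, which is why later architectures replace them with pooling.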
1.2.3 ZFNet
The architecture of ZFNet, introduced by Zeiler [3], is the same as that of AlexNet, except that the first convolutional layer uses a reduced kernel size of 7 × 7 with a stride of 2. The smaller kernel retains more features and lets the network obtain better hyper-parameters at a lower computational cost. The numbers of filters in the third, fourth, and fifth convolutional layers are increased to 512, 1024, and 512, respectively. A new visualization technique, deconvolution (mapping features back to pixels), is used to analyze the feature maps of the first and second layers.
Table 1.3 AlexNet layer details.

Sl. no.  Layer   Kernel size  Stride  Activation shape  Weights     Bias   # Parameters  Activation  # Connections
1        Input   -            -       (227,227,3)       0           0      -             relu        -
2        CONV1   11 × 11      4       (55,55,96)        34,848      96     34,944        relu        105,415,200
3        POOL1   3 × 3        2       (27,27,96)        0           0      0             relu        -
4        CONV2   5 × 5        1       (27,27,256)       614,400     256    614,656       relu        111,974,400
5        POOL2   3 × 3        2       (13,13,256)       0           0      0             relu        -
6        CONV3   3 × 3        1       (13,13,384)       884,736     384    885,120       relu        149,520,384
7        CONV4   3 × 3        1       (13,13,384)       1,327,104   384    1,327,488     relu        112,140,288
8        CONV5   3 × 3        1       (13,13,256)       884,736     256    884,992       relu        74,760,192
9        POOL3   3 × 3        2       (6,6,256)         0           0      0             relu        -
10       FC      -            -       9,216             37,748,736  4,096  37,752,832    relu        37,748,736
11       FC      -            -       4,096             16,777,216  4,096  16,781,312    relu        16,777,216
12       FC      -            -       4,096             4,096,000   1,000  4,097,000     relu        4,096,000
Output   FC      -            -       1,000             -           -      0             softmax     -
Total                                                                      62,378,344
1.2.4 VGGNet
Simonyan and Zisserman [4] introduced VGGNet for the ImageNet Challenge in 2014. VGG-16 consists of 16 layers and accepts a 224 × 224 × 3 RGB image as input, after the global mean is subtracted from each pixel. The image is fed to a series of 13 convolutional layers, each using a small receptive field of 3 × 3 with "same" padding and a stride of 1. In contrast to AlexNet and ZFNet, which use large filters in their early layers, VGGNet stacks 3 × 3 convolutional layers without max-pooling between them; a stack of two such layers has an effective receptive field of 5 × 5, and a stack of three covers 7 × 7, with fewer parameters and more nonlinearity than a single large filter. As the spatial size decreases through the network, the depth increases. The max-pooling layers use a window of size 2 × 2 and a stride of 2. The convolutional stack is followed by three fully connected layers: the first two with 4,096 neurons each, and the third the output layer with 1,000 neurons, since the ILSVRC classification task has 1,000 classes. The final layer is a softmax layer. Training was carried out on 4 Nvidia Titan Black GPUs for 2–3 weeks using the ReLU nonlinearity. VGG-16 has 138 million parameters (522 MB). The test-set top-5 error rate during the competition was 7.1%. Figure 1.4 shows the architecture of VGG-16, and Table 1.5 shows its parameters.
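The trade-off behind stacking 3 × 3 filters can be made concrete with a rough count (the channel width C = 256 below is illustrative; biases are ignored):

```python
def receptive_field(n_layers, kernel=3):
    """Effective receptive field of n stacked stride-1 conv layers."""
    rf = 1
    for _ in range(n_layers):
        rf += kernel - 1
    return rf

def stack_weights(n_layers, kernel, channels):
    """Weights of n stacked k x k conv layers with C channels in and out."""
    return n_layers * kernel * kernel * channels * channels

C = 256
two_stack = receptive_field(2)            # 5: same field as one 5 x 5 layer
three_stack = receptive_field(3)          # 7: same field as one 7 x 7 layer
w_three_3x3 = stack_weights(3, 3, C)      # 27 * C^2 = 1,769,472
w_one_7x7 = stack_weights(1, 7, C)        # 49 * C^2 = 3,211,264
```

Three 3 × 3 layers thus cover the same 7 × 7 field with roughly 55% of the weights, while inserting two extra nonlinearities.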
Table 1.4 Various parameters of ZFNet.

Layer name         Input size  Filter size  Window size  # Filters  Stride  Padding  Output size    # Feature maps  # Parameters
Conv 1             224 × 224   7 × 7        -            96         2       0        110 × 110      96              14,208
Max-pooling 1      110 × 110   -            3 × 3        -          2       0        55 × 55        96              0
Conv 2             55 × 55     5 × 5        -            256        2       0        26 × 26        256             614,656
Max-pooling 2      26 × 26     -            3 × 3        -          2       0        13 × 13        256             0
Conv 3             13 × 13     3 × 3        -            384        1       1        13 × 13        384             885,120
Conv 4             13 × 13     3 × 3        -            384        1       1        13 × 13        384             1,327,488
Conv 5             13 × 13     3 × 3        -            256        1       1        13 × 13        256             884,992
Max-pooling 3      13 × 13     -            3 × 3        -          2       0        6 × 6          256             0
Fully connected 1  -           -            -            -          -       -        4,096 neurons  -               37,752,832
Fully connected 2  -           -            -            -          -       -        4,096 neurons  -               16,781,312
Fully connected 3  -           -            -            -          -       -        1,000 neurons  -               4,097,000
Softmax            -           -            -            -          -       -        1,000 classes  -               62,357,608 (Total)
1.2.5 GoogLeNet
In 2014, Google [5] proposed the Inception network (GoogLeNet) for the detection and classification
tasks of the ImageNet Challenge. The basic unit of this model is the "Inception cell": a set of parallel
convolutional layers with different filter sizes, which performs a series of convolutions at different
scales and concatenates the results; the different filter sizes extract feature maps at different scales.
To reduce the computational cost and the input channel depth, 1 × 1 convolutions are used. So that the
branches can be concatenated, the pooling branch uses max pooling with "same" padding, which preserves
the spatial dimensions. Beyond the original, three further versions of Inception (v2, v3, and v4) and
Inception-ResNet have been defined. Figure 1.5 shows the inception module, and Figure 1.6 shows the
architecture of GoogLeNet.
Each image is resized so that the input to the network is a 224 × 224 × 3 image, with the mean
subtracted before the training image is fed to the network. The dataset contains 1,000 categories, with
1.2 million images for training, 100,000 for testing, and 50,000 for validation. GoogLeNet is 22 layers
deep, uses nine inception modules, and replaces the fully connected layers with global average pooling
to go from 7 × 7 × 1,024 to 1 × 1 × 1,024, which saves a huge number of parameters. It also includes
auxiliary softmax output units that act as regularizers. It was trained on high-end GPUs within a week
and achieved a top-5 error rate of 6.67%. GoogLeNet trains faster than VGG, and a pre-trained
GoogLeNet is considerably smaller than a pre-trained VGG.
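Two of these claims can be checked with simple arithmetic: the output depth of an inception cell is the sum of its branch outputs, and global average pooling removes the weights a dense layer over the 7 × 7 × 1,024 volume would need. A plain-Python sketch (function name is our own):

```python
def inception_out_channels(n1x1, n3x3, n5x5, pool_proj):
    """Output depth of an inception cell: the four parallel branches
    produce maps of the same spatial size, concatenated channelwise."""
    return n1x1 + n3x3 + n5x5 + pool_proj

# Inception (3a) from Table 1.6: 64 + 128 + 32 + 32 output channels.
print(inception_out_channels(64, 128, 32, 32))  # 256

# Parameter saving of global average pooling over a fully connected
# layer mapping the final 7x7x1024 volume to 1,024 units.
fc_weights = 7 * 7 * 1024 * 1024   # what a dense layer would cost
gap_weights = 0                    # GAP has no learnable weights
print(fc_weights - gap_weights)    # 51,380,224 parameters saved
```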
Table 1.5 Various parameters of VGG-16.
Layer name        | Input size    | Filter size | Window size | # Filters | Stride/Padding | Output size | # Feature maps | # Parameters
Conv 1            | 224 × 224     | 3 × 3       | -           | 64        | 1/1            | 224 × 224   | 64             | 1,792
Conv 2            | 224 × 224     | 3 × 3       | -           | 64        | 1/1            | 224 × 224   | 64             | 36,928
Max-pooling 1     | 224 × 224     | -           | 2 × 2       | -         | 2/0            | 112 × 112   | 64             | 0
Conv 3            | 112 × 112     | 3 × 3       | -           | 128       | 1/1            | 112 × 112   | 128            | 73,856
Conv 4            | 112 × 112     | 3 × 3       | -           | 128       | 1/1            | 112 × 112   | 128            | 147,584
Max-pooling 2     | 112 × 112     | -           | 2 × 2       | -         | 2/0            | 56 × 56     | 128            | 0
Conv 5            | 56 × 56       | 3 × 3       | -           | 256       | 1/1            | 56 × 56     | 256            | 295,168
Conv 6            | 56 × 56       | 3 × 3       | -           | 256       | 1/1            | 56 × 56     | 256            | 590,080
Conv 7            | 56 × 56       | 3 × 3       | -           | 256       | 1/1            | 56 × 56     | 256            | 590,080
Max-pooling 3     | 56 × 56       | -           | 2 × 2       | -         | 2/0            | 28 × 28     | 256            | 0
Conv 8            | 28 × 28       | 3 × 3       | -           | 512       | 1/1            | 28 × 28     | 512            | 1,180,160
Conv 9            | 28 × 28       | 3 × 3       | -           | 512       | 1/1            | 28 × 28     | 512            | 2,359,808
Conv 10           | 28 × 28       | 3 × 3       | -           | 512       | 1/1            | 28 × 28     | 512            | 2,359,808
Max-pooling 4     | 28 × 28       | -           | 2 × 2       | -         | 2/0            | 14 × 14     | 512            | 0
Conv 11           | 14 × 14       | 3 × 3       | -           | 512       | 1/1            | 14 × 14     | 512            | 2,359,808
Conv 12           | 14 × 14       | 3 × 3       | -           | 512       | 1/1            | 14 × 14     | 512            | 2,359,808
Conv 13           | 14 × 14       | 3 × 3       | -           | 512       | 1/1            | 14 × 14     | 512            | 2,359,808
Max-pooling 5     | 14 × 14       | -           | 2 × 2       | -         | 2/0            | 7 × 7       | 512            | 0
Fully connected 1 | 4,096 neurons | -           | -           | -         | -              | -           | -              | 102,764,544
Fully connected 2 | 4,096 neurons | -           | -           | -         | -              | -           | -              | 16,781,312
Fully connected 3 | 1,000 neurons | -           | -           | -         | -              | -           | -              | 4,097,000
Softmax           | 1,000 classes | -           | -           | -         | -              | -           | -              | -
Figure 1.5 Inception module.
1.2.6 ResNet
Usually, an input feature map is fed through a series of convolutional layers, a non-linear activation
function (ReLU), and a pooling layer to provide the output for the next layer, and training is done by
the back-propagation algorithm. The accuracy of a network can be improved by increasing its depth, but
once the network converges, its accuracy saturates; adding still more layers then degrades performance
rapidly, which results in higher training error. To address this degradation and the
vanishing/exploding gradient problem, ResNet [6] proposed a residual learning framework in which new
layers fit a residual mapping rather than the desired underlying mapping. If the identity mapping is
already near-optimal, it is easier to push the residual toward zero than to fit an identity with a
stack of non-linear layers. The principles of ResNet are residual learning, identity mapping, and skip
connections: the input of a block is added to the output of the block's stacked convolutional layers,
after which a non-linear activation (ReLU) and pooling are applied.
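The skip connection can be sketched in a few lines of plain Python on scalar values (the function names are ours; real blocks operate on tensors):

```python
def relu(v):
    return max(0.0, v)

def residual_block(x, residual_fn):
    """y = ReLU(F(x) + x): the stacked layers learn only the residual
    F(x); the identity skip connection carries x through unchanged."""
    return relu(residual_fn(x) + x)

# If the residual is pushed to zero, the block reduces to the identity
# (for non-negative inputs), so adding such blocks cannot hurt training.
print(residual_block(3.0, lambda x: 0.0))      # 3.0
# A non-zero residual perturbs the identity rather than replacing it.
print(residual_block(3.0, lambda x: 0.5 * x))  # 4.5
```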
Table 1.6 Various parameters of GoogleNet.
Layer name     | Input size | Filter/Window, stride  | Depth | # 1×1 | # 3×3 reduce | # 3×3 | # 5×5 reduce | # 5×5 | Pool proj | Output size
Convolution    | 224 × 224  | 7 × 7, stride 2, pad 2 | 1     | -     | -            | -     | -            | -     | -         | 112 × 112 × 64
Max pool       | 112 × 112  | 3 × 3, stride 2        | 0     | -     | -            | -     | -            | -     | -         | 56 × 56 × 64
Convolution    | 56 × 56    | 3 × 3, stride 1, pad 1 | 2     | -     | 64           | 192   | -            | -     | -         | 56 × 56 × 192
Max pool       | 56 × 56    | 3 × 3, stride 2        | 0     | -     | -            | -     | -            | -     | -         | 28 × 28 × 192
Inception (3a) | 28 × 28    | -                      | 2     | 64    | 96           | 128   | 16           | 32    | 32        | 28 × 28 × 256
Inception (3b) | 28 × 28    | -                      | 2     | 128   | 128          | 192   | 32           | 96    | 64        | 28 × 28 × 480
Max pool       | 28 × 28    | 3 × 3, stride 2        | 0     | -     | -            | -     | -            | -     | -         | 14 × 14 × 480
Inception (4a) | 14 × 14    | -                      | 2     | 192   | 96           | 208   | 16           | 48    | 64        | 14 × 14 × 512
Inception (4b) | 14 × 14    | -                      | 2     | 160   | 112          | 224   | 24           | 64    | 64        | 14 × 14 × 512
Inception (4c) | 14 × 14    | -                      | 2     | 128   | 128          | 256   | 24           | 64    | 64        | 14 × 14 × 512
Inception (4d) | 14 × 14    | -                      | 2     | 112   | 144          | 288   | 32           | 64    | 64        | 14 × 14 × 528
Inception (4e) | 14 × 14    | -                      | 2     | 256   | 160          | 320   | 32           | 128   | 128       | 14 × 14 × 832
Max pool       | 14 × 14    | 3 × 3, stride 2        | 0     | -     | -            | -     | -            | -     | -         | 7 × 7 × 832
Inception (5a) | 7 × 7      | -                      | 2     | 256   | 160          | 320   | 32           | 128   | 128       | 7 × 7 × 832
Inception (5b) | 7 × 7      | -                      | 2     | 384   | 192          | 384   | 48           | 128   | 128       | 7 × 7 × 1,024
Avg pool       | 7 × 7      | 7 × 7                  | 0     | -     | -            | -     | -            | -     | -         | 1 × 1 × 1,024
Dropout (40%)  | -          | -                      | 0     | -     | -            | -     | -            | -     | -         | 1 × 1 × 1,024
Linear         | -          | 1,000 units            | 1     | -     | -            | -     | -            | -     | -         | 1 × 1 × 1,000
Softmax        | -          | -                      | 0     | -     | -            | -     | -            | -     | -         | 1 × 1 × 1,000
The plain architecture is based on VGGNet (stacks of 3 × 3 filters), with shortcut connections
inserted to turn it into a residual network, as shown in Figure 1.7(b). The 34-layer residual network
has a lower training error than the 18-layer residual network. As in GoogLeNet, a global average
pooling layer feeds the final classification layer. ResNets have been trained with a depth of up to
152 layers. Accuracy is better than GoogLeNet and VGGNet, and ResNet is computationally more efficient
than VGGNet; ResNet-152 achieves a 95.51% top-5 accuracy. Figure 1.7(a) shows a residual block,
Figure 1.7(b) shows the architecture of ResNet, and Table 1.7 shows the parameters of ResNet.
1.2.7 ResNeXt
The ResNeXt [7] architecture builds on the advantages of ResNet (residual networks) and GoogLeNet
(multi-branch architecture) and requires fewer hyperparameters than the traditional ResNet. The "next"
refers to the next dimension, "cardinality", an additional dimension on top of the depth and width of
ResNet. The input is split channelwise into groups, and the standard residual block is replaced with a
"split-transform-merge" procedure. The architecture stacks a series of residual blocks and follows two
rules: (1) blocks producing spatial maps of the same size share the same hyperparameters; (2) each
time the spatial map is downsampled by a factor of two, the block width is doubled. ResNeXt was the
first runner-up of the ILSVRC classification task and produces better results than ResNet. Figure 1.8
shows the architecture of ResNeXt, and the comparison with ResNet is shown in Table 1.8.
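The appeal of cardinality is that the grouped 3 × 3 stage keeps the parameter budget of a ResNet bottleneck while widening the transform. A plain-Python sketch of the two counts (biases omitted, function names our own):

```python
def bottleneck_params(c_in, width, c_out):
    """ResNet bottleneck: 1x1 reduce -> 3x3 -> 1x1 expand."""
    return c_in * width + 3 * 3 * width * width + width * c_out

def resnext_block_params(c_in, cardinality, group_width, c_out):
    """Same template, but the 3x3 stage is split into `cardinality`
    parallel groups, each `group_width` channels wide."""
    width = cardinality * group_width
    grouped_3x3 = cardinality * (3 * 3 * group_width * group_width)
    return c_in * width + grouped_3x3 + width * c_out

# A 256-channel stage: ResNet-50 bottleneck vs. a ResNeXt 32x4d block.
print(bottleneck_params(256, 64, 256))        # 69,632
print(resnext_block_params(256, 32, 4, 256))  # 70,144
```

Both blocks cost roughly 70k weights, which is why ResNeXt can raise accuracy at essentially the same complexity (compare Table 1.8).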
Figure 1.7 (a) A residual block.
Table 1.7 Various parameters of ResNet.
1.2.8 SE-ResNet
Hu et al. [8] proposed the Squeeze-and-Excitation Network (SENet), which took first place in the
ILSVRC 2017 classification task, built around a lightweight gating mechanism. The architecture
explicitly models interdependencies between the channels of convolutional features to achieve dynamic
channel-wise feature recalibration. In the squeeze phase, the SE block applies a global average
pooling operation; in the excitation phase, it applies channel-wise scaling. For an input image of
size 224 × 224, the running time of ResNet-50 is 164 ms, whereas it is 167 ms for SE-ResNet-50. Also,
SE-ResNet-50 requires ∼3.87 GFLOPs, a 0.26% relative increase over the original ResNet-50. The top-5
error is reduced to 2.251%. Figure 1.9 shows the architecture of SE-ResNet, and Table 1.9 compares
ResNet-50 with SE-ResNet-50 and SE-ResNeXt-50.
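A minimal sketch of the squeeze-and-excitation operation in plain Python, with feature maps as flat per-channel lists (the tiny weight matrices and the function name are our own illustration, not the paper's API):

```python
import math

def se_recalibrate(feature_maps, w1, w2):
    """Recalibrate channels: squeeze each channel to one scalar by global
    average pooling, pass the vector through two small fully connected
    layers (ReLU, then sigmoid), and rescale each channel by its gate."""
    squeezed = [sum(ch) / len(ch) for ch in feature_maps]         # squeeze
    hidden = [max(0.0, sum(s * w for s, w in zip(squeezed, row)))
              for row in w1]                                      # FC + ReLU
    gates = [1.0 / (1.0 + math.exp(-sum(h * w for h, w in zip(hidden, row))))
             for row in w2]                                       # FC + sigmoid
    return [[v * g for v in ch] for ch, g in zip(feature_maps, gates)]

# With all-zero excitation weights every gate is sigmoid(0) = 0.5,
# so each channel is simply halved.
print(se_recalibrate([[2.0, 4.0]], [[0.0]], [[0.0]]))  # [[1.0, 2.0]]
```

The reduction in the hidden layer (w1 mapping C channels down to C/r) is what keeps the overhead small, consistent with the ∼0.26% GFLOP increase quoted above.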
1.2.9 DenseNet
The architecture was proposed by [9]; every layer connects directly to every other layer so as to
ensure maximum information (and gradient) flow, so a model with L layers has L(L+1)/2 direct
connections. A number of dense blocks (groups of layers connected to all previous layers) and
transition layers control the complexity of the model. Each layer within a dense block adds a fixed
number of feature maps (the growth rate) to the model. A transition layer reduces the number of
channels using a 1 × 1 convolutional layer and halves the width and height with an average pooling
layer of stride 2. Each layer concatenates the output feature maps of all previous layers with the
incoming feature maps, i.e., each layer has direct access to the gradients from the loss function and
to the original input image. Further, DenseNets need a smaller set of parameters than a traditional
CNN and mitigate the vanishing-gradient problem. Figure 1.10 shows the architecture of DenseNet, and
Table 1.10 shows various DenseNet architectures.
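The channel bookkeeping inside a dense block is simple arithmetic. A plain-Python sketch (function names and the example growth rate of 32 are illustrative):

```python
def dense_block_channels(c_in, num_layers, growth_rate):
    """Each layer concatenates its `growth_rate` new feature maps onto
    everything before it, so depth grows linearly inside a dense block."""
    return c_in + num_layers * growth_rate

def direct_connections(num_layers):
    """Direct layer-to-layer connections in an L-layer dense block:
    L(L+1)/2, versus L for a plain feed-forward stack."""
    return num_layers * (num_layers + 1) // 2

# A 6-layer block with growth rate k = 32 starting from 64 channels.
print(dense_block_channels(64, 6, 32))  # 256
print(direct_connections(6))            # 21
```

The transition layer's 1 × 1 convolution then shrinks this 256-channel output back down before the next block, keeping the model compact.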
Table 1.8 Comparison of ResNet-50 and ResNeXt-50 (32 × 4d).
1.2.10 MobileNets
Google's MobileNet V1 [10] uses depthwise separable convolutions instead of normal convolutions,
which reduces the model size and complexity. A depthwise separable convolution is a depthwise
convolution followed by a pointwise convolution: a single filter is applied to each input channel,
and a 1 × 1 pointwise convolution then combines the outputs of the depthwise convolution. After each
convolution, batch normalization (BN) and ReLU are applied. The whole architecture consists of 30
layers, beginning with (1) a convolutional layer with stride 2, (2) a depthwise layer, (3) a
pointwise layer, (4) a depthwise layer with stride 2, and (5) a pointwise layer. The advantage of
MobileNets is that they require fewer parameters and are less complex (fewer multiplications and
additions). Figure 1.11 shows the architecture of MobileNets, and Table 1.11 shows its various
parameters.
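The saving from factorizing a convolution can be quantified directly. A plain-Python sketch of the multiply-add counts (function names and the example layer shape are our own):

```python
def standard_conv_cost(k, c_in, c_out, out_hw):
    """Multiply-adds of a k x k standard convolution over an
    out_hw x out_hw output map."""
    return k * k * c_in * c_out * out_hw * out_hw

def separable_conv_cost(k, c_in, c_out, out_hw):
    """Depthwise k x k filter per input channel, then a 1 x 1
    pointwise convolution to mix channels."""
    depthwise = k * k * c_in * out_hw * out_hw
    pointwise = c_in * c_out * out_hw * out_hw
    return depthwise + pointwise

# One 3x3 layer, 128 -> 256 channels on a 56x56 output map.
std = standard_conv_cost(3, 128, 256, 56)
sep = separable_conv_cost(3, 128, 256, 56)
print(round(std / sep, 1))  # 8.7x fewer multiply-adds
```

The ratio works out to 1 / (1/c_out + 1/k²), so with 3 × 3 kernels the factorized form costs roughly 8 to 9 times less whenever the output depth is large.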
Figure 1.9 Architecture of SE-ResNet.
1.5 Conclusion
In this chapter, we discussed various CNN architectural models and their parameters. In the first
phase, architectures such as LeNet, AlexNet, VGGNet, GoogLeNet, ResNet, ResNeXt, SENet, DenseNet, and
MobileNet were studied. In the second phase, the application of CNN to the segmentation of IVD was
presented, together with a comparison against state-of-the-art segmentation approaches for spine T2W
images. From the experimental results, it is clear that the 2.5D multi-scale FCN outperforms all other
models. As a future study, the current models may be modified to obtain optimized results.