
Hindawi

Computational and Mathematical Methods in Medicine


Volume 2021, Article ID 5940433, 12 pages
https://doi.org/10.1155/2021/5940433

Research Article
Gastrointestinal Tract Disease Classification from Wireless
Endoscopy Images Using Pretrained Deep Learning Model

J. Yogapriya,1 Venkatesan Chandran,2 M. G. Sumithra,2,3 P. Anitha,4 P. Jenopaul,4 and C. Suresh Gnana Dhas5

1 Department of Computer Science and Engineering, Kongunadu College of Engineering and Technology, Trichy, 621215 Tamil Nadu, India
2 Department of Electronics and Communication Engineering, Dr. N.G.P. Institute of Technology, Coimbatore, 641048 Tamil Nadu, India
3 Department of Biomedical Engineering, Dr. N.G.P. Institute of Technology, Coimbatore, 641048 Tamil Nadu, India
4 Department of EEE, Adi Shankara Institute of Engineering and Technology, Kalady, Ernakulam, Kerala 683574, India
5 Department of Computer Science, Ambo University, Ambo, Post Box No. 19, Ethiopia

Correspondence should be addressed to C. Suresh Gnana Dhas; [email protected]

Received 10 May 2021; Revised 3 July 2021; Accepted 16 August 2021; Published 11 September 2021

Academic Editor: John Mitchell

Copyright © 2021 J. Yogapriya et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Wireless capsule endoscopy is a noninvasive wireless imaging technology that has become increasingly popular in recent years. One of the major drawbacks of this technology is that it generates a large number of images that must be analyzed by medical personnel, which takes time. Various research groups have proposed different image processing and machine learning techniques to classify gastrointestinal tract diseases. In this research, traditional image processing algorithms and a data augmentation technique are combined with adjusted pretrained deep convolutional neural networks to classify diseases in the gastrointestinal tract from wireless endoscopy images. We take advantage of the pretrained models VGG16, ResNet-18, and GoogLeNet, convolutional neural network (CNN) models with adjusted fully connected and output layers. The proposed models are validated on a dataset consisting of 6702 images from 8 classes. The VGG16 model achieved the highest results, with 96.33% accuracy, 96.37% recall, 96.5% precision, and 96.5% F1-measure. Compared to other state-of-the-art models, the VGG16 model has the highest Matthews Correlation Coefficient value of 0.95 and Cohen's kappa score of 0.96.

1. Introduction

Esophageal, stomach, and colorectal cancers account for 2.8 million new cases and 1.8 million deaths worldwide per year. Ulcers, bleeding, and polyps are all examples of gastrointestinal infections [1]. Since the beginning of 2019, an estimated 27,510 cases have been diagnosed in the United States, 62.63% in males and 37.37% in females, with estimated deaths of 40.49%, 61% in males and 39% in females [2]. Due to the complex nature of the gastrointestinal tract, gastroscopy instruments are not suitable for identifying and examining gastrointestinal infections such as bleeding, polyps, and ulcers. In the year 2000, wireless capsule endoscopy (WCE) was developed to solve the problems with gastroscopy instruments [3]. According to the yearly report in 2018, roughly 1 million patients were successfully treated with the assistance of WCE [4]. To detect disease, the doctor employs the WCE procedure to inspect the interior of the gastrointestinal tract (GIT) [5, 6]. The capsule autonomously glides across the GI tract, streaming real-time video to the clinician; after the video has been transmitted, the capsule is discharged through the anus.
The received video frames are examined by the physician to decide about the diseases [7]. The major diseases diagnosed using WCE are ulcers, bleeding, malignancy, and polyps in the digestive system. Anatomical landmarks, pathological findings, and polyp removal play a vital role in diagnosing diseases of the digestive system from WCE-captured images. It is a more convenient diagnostic method, providing a wide range of visuals [8], and it reduces the patient's discomfort and the complications of conventional endoscopy methods such as computed tomography enteroclysis and enteroscopy. The accuracy of diagnosing tumours and gastrointestinal bleeding, especially in the small intestine, has improved. However, the overall process is very time-consuming because all the frames extracted from each patient must be analyzed [9]. Furthermore, even the most experienced physicians face difficulties that demand a large amount of time to analyze all of the data, because the contaminated zone in one frame may not appear in the next. Even though the majority of the frames contain no useful material, the physician must go through the entire video in order. Owing to inexperience or negligence, this may often result in a misdiagnosis [10].

Segmentation, classification, detection, and localization are the techniques researchers use to solve this problem. Feature extraction and visualization are important steps that determine the overall accuracy of a computer-aided diagnosis method. Different features are extracted based on texture analysis, color, points, and edges in the images [11]. The extracted features alone are insufficient to determine the model's overall accuracy; as a result, feature selection is a time-consuming process that is crucial in determining the model's output. Advancements in the field of deep learning, especially CNNs, can solve this problem [12]. The progress of CNNs has been promising in the last decade, with automated detection of diseases in various organs of the human body, such as the brain [13], cervical cancer [14], eye diseases [15], and skin cancer [16]. Unlike conventional machine learning algorithms, the CNN model has the advantage of extracting features hierarchically, from low level to high level. The remainder of the manuscript is organized as follows: Section 2 explains the related work in the field of GIT diagnosis; Section 3 discusses the dataset considered for this study; Section 4 describes the pretrained architectures used to diagnose eight different diseases from WCE images; Section 5 contains the findings derived from the proposed method; Section 6 concludes the work.

2. Related Work

The automated prediction of anatomical landmarks, pathological observations, and polyp groups from images obtained using wireless capsule endoscopy is the subject of this research. The experimental groupings of the pictures make it simple for medical experts to make an accurate diagnosis and prescribe a treatment plan. Significant research in this area has led to the automatic detection of infection from large numbers of images, saving time and effort for medical experts while simultaneously boosting diagnosis accuracy. Automatically detecting infected images among WCE images has lately been a popular research topic, with a slew of papers published in the field. Traditional machine learning algorithms and deep learning algorithms are used in these studies. Improving the classification of disease areas with a high degree of precision in automatic detection remains a great challenge, and advanced deep learning techniques are important in WCE to boost its analytical value. The AlexNet model was proposed to classify the upper gastrointestinal organs from images captured under different conditions; the model achieves an accuracy of 96.5% in upper gastrointestinal anatomical classification [17]. Another technique was proposed to reduce the review time of endoscopy screening based on factorization analysis, using a sliding window mechanism with singular value decomposition; it achieves an overall precision of 92% [18]. A system was proposed for automatically detecting irregular WCE images by extracting fractal features with the differential box-counting method; tested on two datasets of WCE frames, it achieves binary classification accuracies of 85% and 99% for dataset I and dataset II, respectively [19]. The pretrained models Inception-v4, Inception ResNet-v2, and NASNet were used to classify anatomical landmarks from WCE images, obtaining 98.45%, 98.48%, and 97.35%, respectively; of these, the Inception-v4 model achieves a precision of 93.8% [20]. To extract features from the data, the authors of [21] used AlexNet and GoogLeNet, aiming to address the issues of low contrast and abnormal lesions in endoscopy. A computer-aided diagnostic tool for classifying ulcerative colitis achieves areas under the curve of 0.86 for Mayo 0 and 0.98 for Mayo 0-1 [22]. A convolutional neural network with four layers was proposed to classify different classes of ulcers from WCE video frames; the test results were improved by tweaking the model's hyperparameters, achieving an accuracy of 96.8% [23]. The authors of [24] introduced a new virtual reality capsule to simulate and identify normal and abnormal regions; this environment generates new 3D images for gastrointestinal diseases. In [25], local spatial features are retrieved from pixels of interest in a WCE image using a linear separation approach; the proposed probability density function model-fitting approach not only reduces computing complexity but also yields a more consistent representation of a class, performing admirably in terms of precision with a score of 96.77%. In [26], a Gabor capsule network is proposed for classifying complex images like the Kvasir dataset, achieving an overall accuracy of 91.50%. A wavelet transform combined with a CNN is proposed to classify gastrointestinal tract diseases and achieves an overall average performance of 93.65% in classifying the eight classes [27].

From the literature, CNN models can provide better results if the dataset is large. But there are several obstacles at each step that reduce model performance: the low-contrast video frames in the dataset make segmenting the regions difficult, and the extraction and selection of important traits is another difficult step in identifying disorders including ulcers, bleeding, and polyps. The workflow of the proposed method for disease classification using wireless endoscopy is shown in Figure 1.

Figure 1: Workflow for GIT disease classification from wireless endoscopy. (Unlabeled WCE images are labeled by medical experts and augmented; the trained model then predicts one of eight classes: normal-pylorus, normal Z-line, esophagitis, normal-cecum, polyps, ulcerative colitis, dyed resection margins, and dyed lifted polyps.)

The significant contributions of this study are as follows.

(1) A computer-assisted diagnostic system is proposed to classify GIT diseases into multiple categories, covering anatomical landmarks, pathological observations, and polyp removal

(2) A pretrained model is used to overcome the small-dataset and overfitting problems, which reduce model accuracy [28]

(3) The VGG16, ResNet-18, and GoogLeNet pretrained CNN architectures classify gastrointestinal tract diseases from the endoscopic images after slight modification of the architectures

(4) The visual features from which the GIT disease classification decisions are obtained are visualized using the occlusion sensitivity map

(5) The modified pretrained architectures are also compared with other models that use handcrafted and in-depth features to detect GIT diseases, in terms of accuracy, recall, precision, F1-measure, receiver operating characteristic (ROC) curve, and Cohen's kappa score

3. Dataset Description

The dataset used in this study consists of GIT images taken with endoscopic equipment at Norway's VV Health Trust. The training data are obtained from a large gastroenterology department at one of the hospitals in this trust. Medical experts meticulously annotated the dataset, which was named Kvasir-V2. The dataset was made available in the fall of 2017 as part of the MediaEval Medical Multimedia Challenge, a benchmarking initiative that assigns tasks to research groups [29]. Anatomical landmarks, pathological observations, and polyp removal are among the eight groups that make up the dataset, with 1000 images each. The images in the dataset range in resolution from 720 × 576 to 1920 × 1072 pixels. The different diseases with their corresponding class-label encoding are provided in Table 1.

An anatomical landmark is a characteristic of the GIT that can be seen through an endoscope. It is necessary for navigation and as a reference point for describing the location of a given finding. It is also possible that the landmarks are specific areas of pathology, such as ulcers or inflammation. Class 0 and class 1 are the two classes of polyp removal. Class 3, class 4, and class 5 are the most important anatomical landmarks. The essential pathological findings are class 2, class 6, and class 7. Sample images from the dataset are shown in Figure 2, and the distribution of the dataset is represented in Figure 3.

4. Proposed Deep Learning Framework

To solve the issue of small data sizes, transfer learning was used to fine-tune three major pretrained deep neural networks, VGG16, ResNet-18, and GoogLeNet, on the training images of the augmented Kvasir version 2 dataset.

4.1. Transfer Learning. In the world of medical imaging, classifying multiple diseases using the same deep learning architecture is a difficult task. Transfer learning is a technique for repurposing a model trained on one task for a comparable task that requires some adaptation. When there are not enough training samples to train a model from scratch, transfer learning is particularly beneficial for applications like medical image classification for rare or emerging diseases. This is particularly true for deep neural network models, which must be trained with a huge number of parameters. Transfer learning enables model parameters to start with good initial values that need only minimal tweaks to be curated for the new problem. Transfer learning can be done in two ways: one approach is training the model from the top layers, and the other is freezing the top layers of the model and fine-tuning it on the new dataset. Eight different types of diseases are considered in the proposed model, so the first approach is used, where the model is trained from the top layers.
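As a concrete illustration of the first approach, the following minimal PyTorch sketch (an illustrative stand-in only; the experiments in this paper were run with Caffe on NVIDIA DIGITS) loads an ImageNet-pretrained VGG16, swaps its output layer for the eight Kvasir classes, and leaves every layer trainable:

```python
# Illustrative sketch only; not the authors' Caffe/DIGITS code.
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 8  # eight Kvasir v2 classes (see Table 1)

# Start from ImageNet weights so the parameters begin at good initial values.
model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)

# Replace the 1000-way ImageNet classifier with an 8-way output layer.
model.classifier[6] = nn.Linear(model.classifier[6].in_features, NUM_CLASSES)

# First approach from the text: fine-tune the whole network, so every
# parameter (not just the new head) stays trainable.
for param in model.parameters():
    param.requires_grad = True
```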

Table 1: Kvasir v2 dataset details.

Dyed lifted polyps (class 0): Lifting the polyps decreases the risk of electrocautery damage to the deeper layers of the GI wall. It is essential to pinpoint the areas where polyps can be removed from the underlying tissue.

Dyed resection margins (class 1): The resection margins are crucial for determining whether or not the polyp has been entirely removed.

Esophagitis (class 2): Esophagitis is a condition in which the esophagus becomes inflamed or irritated. It appears as a break in the mucosa of the esophagus.

Normal-cecum (class 3): The cecum is a long tube-like structure in the lower abdominal cavity. It usually receives foods that have not been digested. The significance of identifying the cecum is that it serves as evidence of a thorough colonoscopy.

Normal-pylorus (class 4): The pylorus binds the stomach to the duodenum, the first section of the small bowel. The pylorus must be located before the duodenum can be instrumented endoscopically, which is a complicated procedure.

Normal-Z-line (class 5): The Z-line depicts the esophagogastric junction, which connects the esophagus's squamous mucosa to the stomach's columnar mucosa. It is vital to identify the Z-line to determine whether or not a disease is present.

Polyps (class 6): Polyps are clumps of lesions that grow within the intestine. Although the majority of polyps are harmless, a few of them can lead to colon cancer. As a result, detecting polyps is essential.

Ulcerative colitis (class 7): Ulcerative colitis (UC) can affect the entire bowel and can lead to long-term inflammation or bowel wounds.
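For quick reference, the Table 1 encoding can be written down as a simple mapping (a convenience sketch; the label strings follow the Kvasir v2 folder names and are an assumption of this sketch):

```python
# Class-label encoding from Table 1.
KVASIR_CLASSES = {
    0: "dyed-lifted-polyps",
    1: "dyed-resection-margins",
    2: "esophagitis",
    3: "normal-cecum",
    4: "normal-pylorus",
    5: "normal-z-line",
    6: "polyps",
    7: "ulcerative-colitis",
}

# Groupings used in Section 3: polyp removal (classes 0-1), anatomical
# landmarks (classes 3-5), and pathological findings (classes 2, 6, 7).
```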

Figure 2: Sample images of the Kvasir v2 dataset for the eight classes (normal-pylorus, normal Z-line, normal-cecum, esophagitis, polyps, ulcerative colitis, dyed lifted polyps, and dyed resection margins).

Figure 3: Dataset distribution among the different classes (the bar chart shows 1000 images in each of the eight classes).



Figure 4: VGG16 architecture for gastrointestinal tract disease classification. (Five blocks of 3 × 3 convolution layers, 13 in total, each block followed by max pooling, then three dense layers and the output layer.)

VGG16, GoogLeNet, and ResNet-18 are the pretrained models used for classifying the different gastrointestinal tract diseases from endoscopic images. These pretrained models are used as baselines, and their performance is increased by using various performance improvement techniques.

4.2. Gastrointestinal Tract Disease Classification Using VGG16. The VGG16 model comprises 16 layers: 13 convolution layers and three dense layers. The model was initially introduced in 2014 for the ImageNet competition and is one of the best models for image classification. Figure 4 depicts the architecture of the VGG16 model.

Rather than accumulating many differently shaped layers, the model relies on 3 × 3 convolution layers with a stride of one and padding that keeps the spatial size the same. The max-pooling layers use a 2 × 2 filter with a stride of two. The network is completed by two dense layers followed by the softmax layer. There are approximately 138 million parameters in the model [30]. Dense layers 1 and 2 consist of 4096 nodes each, and dense layer 1 alone accounts for roughly 100 million parameters. The number of parameters in that layer can be reduced without degrading the performance of the model.

4.3. Gastrointestinal Tract Disease Classification Using ResNet-18. Another pretrained model for classifying gastrointestinal tract disease from endoscopic images is ResNet-18. Figure 5 depicts the architecture of the ResNet-18 network. This model is based on a convolutional neural network, one of the most common architectures for efficient training, and it allows a smooth gradient flow. The identity shortcut links in the ResNet-18 model skip one or more layers, giving later parts of the network a direct connection to its earlier layers and making gradient updates much easier for those layers [31]. The ResNet-18 model comprises 17 convolution layers and one fully connected layer.

Figure 5: ResNet-18 architecture for gastrointestinal tract disease classification. (A 7 × 7, 64-channel stem convolution with stride 2 and 3 × 3 max pooling, followed by four stages of 3 × 3 convolutions with 64, 128, 256, and 512 channels and skip connections, then average pooling and an 8-way fully connected layer.)

4.4. Gastrointestinal Tract Disease Classification Using GoogLeNet. In many transfer learning tasks, the GoogLeNet model is a deep CNN that obtains good classification accuracy while improving compute efficiency. With a top-5 error rate of 6.67%, GoogLeNet, commonly known as the Inception model, won the ImageNet competition in 2014. The inception module is shown in Figure 6, and the GoogLeNet architecture is shown in Figure 7. It has 22 layers, including 2 convolution layers, 4 max-pooling layers, and 9 linearly stacked inception modules. Average pooling is introduced after the last inception module. To perform dimension reduction, 1 × 1 filters are employed before the more expensive 3 × 3 and 5 × 5 operations. Compared to the AlexNet model, GoogLeNet uses far fewer parameters.

Figure 6: Inception module. (1 × 1, 3 × 3, and 5 × 5 convolutions and 3 × 3 max pooling are applied in parallel to the previous layer, with 1 × 1 convolutions for dimension reduction, and the branch outputs are concatenated.)
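The role of the 1 × 1 reductions is easiest to see in code. The sketch below (an illustrative PyTorch module, not the authors' implementation; activations are omitted for brevity) builds one inception-style block whose four branch outputs are concatenated along the channel axis:

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Inception-style block as in Figure 6 (illustrative sketch)."""

    def __init__(self, in_ch, c1x1, c3x3_red, c3x3, c5x5_red, c5x5, pool_proj):
        super().__init__()
        self.branch1 = nn.Conv2d(in_ch, c1x1, kernel_size=1)
        # 1x1 reduction before the expensive 3x3 convolution.
        self.branch2 = nn.Sequential(
            nn.Conv2d(in_ch, c3x3_red, kernel_size=1),
            nn.Conv2d(c3x3_red, c3x3, kernel_size=3, padding=1),
        )
        # 1x1 reduction before the expensive 5x5 convolution.
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_ch, c5x5_red, kernel_size=1),
            nn.Conv2d(c5x5_red, c5x5, kernel_size=5, padding=2),
        )
        # 3x3 max pooling followed by a 1x1 projection.
        self.branch4 = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_ch, pool_proj, kernel_size=1),
        )

    def forward(self, x):
        # Concatenate the four branches along the channel dimension.
        return torch.cat(
            [self.branch1(x), self.branch2(x), self.branch3(x), self.branch4(x)],
            dim=1,
        )
```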

Figure 7: GoogLeNet architecture for gastrointestinal tract disease classification. (Stem convolutions with max pooling, nine stacked inception modules with interleaved max pooling, then average pooling, dropout, and the softmax output.)

Figure 8: Augmented Kvasir v2 dataset (original image data with the corresponding augmented image data).

4.5. Data Augmentation. CNN models have proven suitable for many computer vision tasks; however, they require a considerable amount of training data to avoid overfitting. Overfitting occurs when a deep learning model learns a high-variance function that precisely models the training data but has a narrow range of generalizability. In many cases, especially for medical image datasets, obtaining a large amount of data is a tedious task. Different data augmentation techniques are therefore used to increase the size and consistency of the data and to mitigate overfitting. These techniques produce surrogate data that has been subjected to different rotations, width and height shifts, zooming, and horizontal flips but is not identical to the original data. Here, the rotation range is fixed at 45°, the width- and height-shift ranges at 0.2, and the zoom range at 0.2, with horizontal flipping enabled. The augmented dataset derived from the original Kvasir version 2 dataset is shown in Figure 8.

5. Results and Discussion

In this work, the Kvasir version 2 dataset is used for the classification of GIT diseases. The entire dataset is divided into an 80% training and a 20% validation set. NVIDIA DIGITS with the Caffe deep learning framework is used to build the pretrained CNN models, which are trained and tested on a system configured with an Intel i9 processor and a 32 GB NVIDIA Quadro RTX6000 GPU. Images with resolutions ranging from 720 × 576 to 1920 × 1072 pixels were resized to 256 × 256 pixels. The augmented dataset consists of 33536 images, 4192 per class, and is divided into 80% training and 20% validation splits: 26832 images for training and 6704 images for validation. The pretrained models are trained from scratch with hyperparameters of 30 epochs, a batch size of 8, the Adam optimizer, and a learning rate of 1e-05 with a step size of 33%, chosen via trial and error while considering the computing facility. The Adam optimizer is used for its reduced complexity during model training [32]. A softmax classification layer and categorical cross-entropy are used at the output of the pretrained model, as given in equations (1) and (2):

σ(Z)_i = e^{z_i} / Σ_{j=1}^{K} e^{z_j},   (1)

where σ denotes the softmax, Z denotes the input vector, e^{z_i} denotes the standard exponential of the input vector, K denotes the number of classes, and e^{z_j} denotes the standard exponential of the output vector.
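Put together, the augmentation ranges of Section 4.5 and the training settings above could look as follows (a hypothetical PyTorch/torchvision sketch; the actual experiments used Caffe on NVIDIA DIGITS, and the dataset path is a placeholder):

```python
import torch
from torchvision import datasets, models, transforms

# Augmentation matching Section 4.5: 45-degree rotations, 0.2 width/height
# shifts, 0.2 zoom, horizontal flips, with frames resized to 256 x 256.
train_tf = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomAffine(degrees=45, translate=(0.2, 0.2), scale=(0.8, 1.2)),
    transforms.ToTensor(),
])

# "kvasir_v2/train" is a placeholder path for the augmented training split.
train_set = datasets.ImageFolder("kvasir_v2/train", transform=train_tf)
loader = torch.utils.data.DataLoader(train_set, batch_size=8, shuffle=True)

model = models.vgg16(num_classes=8)  # stand-in; see the Section 4.1 sketch

# Hyperparameters from the text: Adam optimizer, learning rate 1e-05,
# batch size 8, 30 epochs. nn.CrossEntropyLoss combines the softmax of
# equation (1) with the cross-entropy of equation (2).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
criterion = torch.nn.CrossEntropyLoss()

for epoch in range(30):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```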

Figure 9: VGG16 training graph for GIT classification (training loss, validation loss, and top-1/top-5 validation accuracy versus epoch).

Figure 10: ResNet-18 training graph for GIT classification (training loss, validation loss, and validation accuracy versus epoch).

Categorical Cross-Entropy = − Σ_{i=1}^{Size} y_i · log(ŷ_i),   (2)

where y_i denotes the target value and ŷ_i is the ith model output scalar value. The confusion matrix, obtained after validating each model on the validation collection of 6704 images, is used to evaluate the classification models' results. The training output curves of the three pretrained models are shown in Figures 9-11. Each graph plots the training loss and accuracy against the epoch. The VGG16 model is trained for 30 epochs on the training dataset and converges after about 15 epochs, with accuracy of around 96%. After 30 epochs, the model provides a top_1 accuracy of 96.62%, a top_5 accuracy of 100%, and a validation loss of 0.18.
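A quick numeric check of equations (1) and (2), using made-up logits rather than values from the study:

```python
import numpy as np

z = np.array([2.0, 1.0, 0.1])            # example logits for three classes
softmax = np.exp(z) / np.exp(z).sum()    # equation (1)
# softmax -> approximately [0.659, 0.242, 0.099]; the entries sum to 1

y = np.array([1.0, 0.0, 0.0])            # one-hot target vector
cce = -np.sum(y * np.log(softmax))       # equation (2)
# cce -> approximately 0.417, i.e. -log(0.659)
```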

Figure 11: GoogLeNet training graph for GIT classification (training and validation losses of the main and auxiliary classifier heads, with top-1/top-5 validation accuracies, versus epoch).

VGG16 (rows: classifier results; columns: truth data)

          Class 0  Class 1  Class 2  Class 3  Class 4  Class 5  Class 6  Class 7
Class 0       824       11        0        0        0        0        3        0
Class 1        18      819        1        0        0        0        0        0
Class 2         0        0      764        0        1       72        1        0
Class 3         0        0        0      831        0        0        4        3
Class 4         0        0        0        0      835        0        3        0
Class 5         0        0       80        0        0      757        1        0
Class 6         2        0        0        6        2        0      819        9
Class 7         0        1        0       11        2        0       15      809

Figure 12: VGG16 confusion matrix for GIT classification.
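For later reference, the Figure 12 matrix can be transcribed directly (a convenience sketch; rows are classifier outputs, columns are ground truth):

```python
import numpy as np

# VGG16 confusion matrix from Figure 12 (rows: predicted, columns: truth).
CM_VGG16 = np.array([
    [824,  11,   0,   0,   0,   0,   3,   0],
    [ 18, 819,   1,   0,   0,   0,   0,   0],
    [  0,   0, 764,   0,   1,  72,   1,   0],
    [  0,   0,   0, 831,   0,   0,   4,   3],
    [  0,   0,   0,   0, 835,   0,   3,   0],
    [  0,   0,  80,   0,   0, 757,   1,   0],
    [  2,   0,   0,   6,   2,   0, 819,   9],
    [  0,   1,   0,  11,   2,   0,  15, 809],
])

# The diagonal holds the correctly classified samples per class;
# 6458 of the 6704 validation images are correct (96.33% top-1 accuracy).
assert CM_VGG16.sum() == 6704
```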

The ResNet-18 model gives a lower training accuracy of 78.83% and a high training loss of 0.58 after 30 epochs. The GoogLeNet model obtains a top_1 accuracy of 91.21%, a top_5 accuracy of 100%, and a training loss of 0.21.

After model training is completed, the models are validated with the validation dataset, and the confusion matrices are drawn from it. Figures 12-14 present the confusion matrices of the three pretrained models on the validation dataset. Each confusion matrix is drawn with truth data against classifier results. From the confusion matrix, the True Positive Value (TPV), False Positive Value (FPV), True Negative Value (TNV), and False Negative Value (FNV) are calculated; the diagonal elements represent the TPV of the corresponding class. The different performance metrics, such as top_1 accuracy, top_5 accuracy, recall, precision, and Cohen's kappa score, are calculated using the equations given in Table 2.

The kappa coefficient is the de facto norm for assessing rater agreement, as it eliminates the agreement expected by chance. Cohen's kappa value is obtained from equation (3), where G denotes the overall number of correctly predicted samples, H denotes the total number of elements, c_l denotes the number of times class l was predicted, and s_l denotes the number of times class l actually occurred [33]:

CK = (G × H − Σ_{l=1}^{L} c_l × s_l) / (H² − Σ_{l=1}^{L} c_l × s_l).   (3)

The kappa coefficient is used when there are more than two classes, to assess classification performance. The kappa score ranges from 0 to 1, and its interpretation is provided in Table 3.
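Equation (3) can be evaluated directly from a confusion matrix. The sketch below uses the notation of the text (G correct predictions, H total elements, c_l predicted counts, s_l true counts); applied to CM_VGG16 from the Figure 12 transcription, it returns about 0.958, consistent with the 0.96 in Table 4:

```python
import numpy as np

def cohens_kappa(cm):
    """Cohen's kappa per equation (3); cm[i, j]: predicted i, true j."""
    H = cm.sum()                # total number of elements
    G = np.trace(cm)            # overall correctly predicted samples
    c = cm.sum(axis=1)          # c_l: times class l was predicted
    s = cm.sum(axis=0)          # s_l: times class l actually occurred
    return (G * H - np.sum(c * s)) / (H**2 - np.sum(c * s))
```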

ResNet-18 (rows: classifier results; columns: truth data)

          Class 0  Class 1  Class 2  Class 3  Class 4  Class 5  Class 6  Class 7
Class 0       662      150        2        2        1        0       14        7
Class 1       205      627        0        2        0        0        3        1
Class 2         0        0      626        0       17      192        1        2
Class 3         0        0        0      705        0        0       64       69
Class 4         0        0       12        0      796       13        6       11
Class 5         0        0      216        0       15      604        2        1
Class 6         4        1        0       81       16        0      549      187
Class 7         1        0        1       52       19        1       52      712

Figure 13: ResNet-18 confusion matrix for GIT classification.

GoogLeNet (rows: classifier results; columns: truth data)

          Class 0  Class 1  Class 2  Class 3  Class 4  Class 5  Class 6  Class 7
Class 0       787       50        0        0        0        0        0        1
Class 1        58      780        0        0        0        0        0        0
Class 2         0        0      667        0        3      165        2        1
Class 3         0        0        0      765        0        0       40       33
Class 4         0        0        1        0      813       14        7        3
Class 5         0        0      137        0        9      689        2        1
Class 6         3        0        0       21        4        0      757       53
Class 7         0        0        0       11        0        0       33      794

Figure 14: GoogLeNet confusion matrix for GIT classification.

Table 2: Classification metrics.

Accuracy (ACC) = (TPV + TNV) / (TPV + TNV + FNV + FPV)
Precision = TPV / (TPV + FPV)
Recall = TPV / (TPV + FNV)
F1-measure = 2 × (Precision × Recall) / (Precision + Recall)

Table 3: Cohen's kappa interpretation.

Value range     Interpretation (agreement)
0               None
0.01 to 0.20    Minor
0.21 to 0.40    Moderate
0.41 to 0.60    Reasonable
0.61 to 0.80    Significant
0.81 to 1.00    Perfect

All the pretrained models are trained from scratch to classify gastrointestinal tract diseases using the Kvasir v2 dataset, and the results are reported in Table 4. The VGG16 model outperformed all the other pretrained models in terms of all classification metrics. It achieved the highest top_1 classification accuracy of 96.33% compared to the ResNet-18 and GoogLeNet models, and it also performs strongly on recall and precision, with 96.37% and 96.5%, respectively. The GoogLeNet model achieved better top_1 classification accuracy than ResNet-18. The kappa coefficient is calculated for the models; the VGG16 and GoogLeNet models provided almost perfect agreement, with values of 0.96 and 0.89, respectively. Because of the high misclassification of diseases in the categories dyed lifted polyps, dyed resection margins, esophagitis, normal Z-line, and polyps, ResNet-18 gives very low values on all classification metrics. Owing to the injection of liquid underneath the polyp, the models find it difficult to correctly distinguish dyed lifted polyps from dyed resection margins. The VGG16 and GoogLeNet models proved to provide better accuracy in classifying the GIT diseases; however, classification remains difficult because of the interclass similarity between dyed lifted polyps and dyed resection margins, as well as the intraclass similarity between normal Z-line and esophagitis.

The MCC is a more reliable statistical rate that produces a high score only when the prediction results are good on all four values: TPV, FPV, TNV, and FNV. It is calculated using equation (4).
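The Table 2 metrics likewise follow mechanically from the per-class counts; an illustrative sketch using the same matrix convention as above:

```python
import numpy as np

def per_class_metrics(cm):
    """Accuracy, precision, recall, and F1 per Table 2.

    cm[i, j]: samples predicted as class i whose true class is j.
    """
    tpv = np.diag(cm).astype(float)     # true positives per class
    fpv = cm.sum(axis=1) - tpv          # predicted as the class, but wrong
    fnv = cm.sum(axis=0) - tpv          # belong to the class, but missed
    precision = tpv / (tpv + fpv)
    recall = tpv / (tpv + fnv)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tpv.sum() / cm.sum()     # overall top-1 accuracy
    return accuracy, precision, recall, f1
```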

Table 4: Performance analysis of pretrained models on GIT classification.

Model name Top_1 ACC (%) Top_5 ACC (%) Recall (%) Precision (%) F1-measure (%) Kappa score
VGG16 96.33 100 96.37 96.50 96.50 0.96
GoogLeNet 90.27 100 90.33 90.27 90.37 0.89
ResNet-18 78.77 99.99 78.91 78.77 78.75 0.75

Table 5: Performance analysis of proposed method with existing models.

Method Accuracy
DenseNet-201 [34] 90.74
ResNet-18 [34] 88.43
Baseline+Inceptionv3 + VGGNet [35] 96.11
Ensemble model [36] 93.7
Logistic regression tree [29] 94.2
Proposed method 96.33

Figure 15: (a) VGG16 ROC curves for GIT classification. Per-class areas under the curve: class 0 = 0.94, class 1 = 0.94, class 2 = 0.95, class 3 = 0.98, class 4 = 1.00, class 5 = 0.95, class 6 = 0.94, class 7 = 0.96; micro-average = 0.96, macro-average = 0.96. (b) Heat map for test data (original image and occlusion heat map for an ulcerative colitis frame).
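Occlusion sensitivity itself is straightforward to sketch: slide a gray patch across the input, record how much the predicted class probability drops, and render the drops as a heat map. The following is an illustrative PyTorch implementation (patch size and stride are arbitrary choices, not values from the paper):

```python
import numpy as np
import torch

def occlusion_map(model, image, target_class, patch=32, stride=16):
    """Probability drop for the target class as each region is masked.

    image: float tensor of shape (3, H, W), already normalized.
    """
    model.eval()
    _, H, W = image.shape
    ys = range(0, H - patch + 1, stride)
    xs = range(0, W - patch + 1, stride)
    heat = np.zeros((len(ys), len(xs)))
    with torch.no_grad():
        base = torch.softmax(model(image.unsqueeze(0)), dim=1)[0, target_class]
        for i, y in enumerate(ys):
            for j, x in enumerate(xs):
                occluded = image.clone()
                occluded[:, y:y + patch, x:x + patch] = 0.5  # gray patch
                prob = torch.softmax(model(occluded.unsqueeze(0)), dim=1)[0, target_class]
                heat[i, j] = (base - prob).item()  # big drop = important region
    return heat
```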

MCC = (G × H − Σ_{l=1}^{L} c_l × s_l) / √((H² − Σ_{l=1}^{L} c_l²)(H² − Σ_{l=1}^{L} s_l²)).   (4)

Using the Kvasir v2 dataset, the modified VGG16 model is compared with other models for classifying GIT diseases, based on the results reported in the articles listed in Table 5.
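Equation (4) uses the same aggregate counts as equation (3); an illustrative sketch (on CM_VGG16 from Figure 12 it evaluates to roughly 0.958, matching the 0.95 reported for the modified VGG16):

```python
import numpy as np

def multiclass_mcc(cm):
    """Matthews Correlation Coefficient per equation (4).

    cm[i, j]: samples predicted as class i whose true class is j.
    """
    H = cm.sum()
    G = np.trace(cm)
    c = cm.sum(axis=1)   # c_l: predicted counts per class
    s = cm.sum(axis=0)   # s_l: true counts per class
    num = G * H - np.sum(c * s)
    den = np.sqrt(float(H**2 - np.sum(c**2)) * float(H**2 - np.sum(s**2)))
    return num / den
```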

The DenseNet-201 and ResNet-18 models reported in reference [34] achieved accuracies of 90.74% and 88.43%. Both models were trained for more than 400 epochs, which took roughly 10 hours. The model reported in [35] provides an accuracy of 96.11%, very close to the proposed method in Table 5, but it uses a three-stage combination of a baseline model, Inception-V3, and a VGG model, which requires high computational power, and obtained a Matthews Correlation Coefficient (MCC) of 0.826. In [36], a CNN and transfer learning model is proposed to classify GIT diseases using global features; it achieves an accuracy of 93.7% with an MCC value of 0.71. The logistic model tree proposed in [29] uses handcrafted features on 4000 images and achieves an accuracy of 94.2%, but with a poor MCC value of 0.72. Its significant disadvantage is that it requires knowledge of feature extraction and feature selection techniques. The modified pretrained VGG16 model obtained an MCC value of 0.95, which outperforms all the other models. From the MCC of all the state-of-the-art methods, we found that the modified VGG16 method proves to be in almost perfect agreement for classifying GIT diseases.

The time complexity of the modified pretrained models is compared with that of the other models in classifying GIT diseases. The proposed VGG16, GoogLeNet, and ResNet-18 models reported training times of 1 hour 50 minutes, 1 hour 7 minutes, and 57 minutes, respectively. The literature shows that DenseNet-201 [34] and ResNet-18 [34] were trained for more than 10 hours. The ROC curve in Figure 15(a) depicts the tradeoff between the true-positive and false-positive rates, showing the performance of the classification model at different classification thresholds. The ROC is drawn for the eight classes to determine a better threshold for each category; a curve that hugs the top-left corner indicates better classification performance. Occlusion sensitivity is used to obtain the deep neural network's sensitivity map, identifying the input image area responsible for the predicted diagnosis. The heat map for test data is shown in Figure 15(b). This test procedure identified the region of interest, which was crucial in the development of the VGG16 model. The model's occlusion sensitivity map is visualized to determine the areas of greatest concern when evaluating a diagnosis. The occlusion test's greatest advantage is that it gives insight into the decisions of neural networks, which are otherwise black boxes. The input can be occluded without disrupting training, since the evaluation is performed at the end of the experiment.

6. Conclusion

These findings show that recent pretrained models, such as VGG16, ResNet-18, and GoogLeNet, can be used in medical imaging domains such as image processing and analysis. CNN models can advance medical imaging technology by offering a higher degree of automation while also speeding up processes and increasing efficiency. The algorithm in this study obtained a state-of-the-art result in gastrointestinal tract disease classification, with 96.33% accuracy and equally high sensitivity and specificity. Transfer learning is helpful for various challenging tasks and is one solution to computer vision problems for which only small datasets are often accessible. Medical applications demonstrate that advanced CNN architectures can generalize and acquire very rich features, mapping information on images similar to those in the ImageNet database and correctly classifying very different cases. Compared to the various machine learning and deep learning models used to classify gastrointestinal tract disease, the VGG16 model achieves better results: 96.33% accuracy, a 0.96 Cohen's kappa score, and 0.95 MCC. The requirement of manually marked data is the algorithm's weakest point; the network could inherit flaws from an analyst, as diagnosing diseases correctly is difficult even for humans in many cases. Using a larger dataset labelled by a larger community of experts will be one way to overcome this limitation.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this article.

References

[1] M. A. Khan, M. A. Khan, F. Ahmed et al., "Gastrointestinal diseases segmentation and classification based on duo-deep architectures," Pattern Recognition Letters, vol. 131, pp. 193–204, 2020.
[2] M. A. Khan, M. Rashid, M. Sharif, K. Javed, and T. Akram, "Classification of gastrointestinal diseases of stomach from WCE using improved saliency-based method and discriminant features selection," Multimedia Tools and Applications, vol. 78, no. 19, pp. 27743–27770, 2019.
[3] T. Rahim, M. A. Usman, and S. Y. Shin, "A survey on contemporary computer-aided tumor, polyp, and ulcer detection methods in wireless capsule endoscopy imaging," Computerized Medical Imaging and Graphics, vol. 85, p. 101767, 2020.
[4] A. Liaqat, M. A. Khan, J. H. Shah, M. Sharif, M. Yasmin, and S. L. Fernandes, "Automated ulcer and bleeding classification from WCE images using multiple features fusion and selection," Journal of Mechanics in Medicine and Biology, vol. 18, no. 4, article 1850038, 2018.
[5] N. Dey, A. S. Ashour, F. Shi, and R. S. Sherratt, "Wireless capsule gastrointestinal endoscopy: direction-of-arrival estimation based localization survey," IEEE Reviews in Biomedical Engineering, vol. 10, pp. 2–11, 2017.
[6] A. S. Ashour, N. Dey, W. S. Mohamed et al., "Colored video analysis in wireless capsule endoscopy: a survey of state-of-the-art," Current Medical Imaging (formerly Current Medical Imaging Reviews), vol. 16, no. 9, pp. 1074–1084, 2020.
[7] Q. Wang, N. Pan, W. Xiong, H. Lu, N. Li, and X. Zou, "Reduction of bubble-like frames using a RSS filter in wireless capsule endoscopy video," Optics & Laser Technology, vol. 110, pp. 152–157, 2019.
[8] M. T. K. B. Ozyoruk, G. I. Gokceler, T. L. Bobrow et al., "EndoSLAM dataset and an unsupervised monocular visual odometry and depth estimation approach for endoscopic videos: Endo-SfMLearner," Medical Image Analysis, vol. 71, article 102058, 2021.
[9] M. Islam, B. Chen, J. M. Spraggins, R. T. Kelly, and K. S. Lau, "Use of single-cell -omic technologies to study the gastrointestinal tract and diseases, from single cell identities to patient features," Gastroenterology, vol. 159, no. 2, pp. 453–466.e1, 2020.
[10] T.-C. Hong, J. M. Liou, C. C. Yeh et al., "Endoscopic submucosal dissection comparing with surgical resection in patients with early gastric cancer - a single center experience in Taiwan," Journal of the Formosan Medical Association, vol. 119, no. 12, pp. 1750–1757, 2020.
[11] M. Suriya, V. Chandran, and M. G. Sumithra, "Enhanced deep convolutional neural network for malarial parasite classification," International Journal of Computers and Applications, pp. 1–10, 2019.
[12] T. M. Berzin, S. Parasa, M. B. Wallace, S. A. Gross, A. Repici, and P. Sharma, "Position statement on priorities for artificial intelligence in GI endoscopy: a report by the ASGE Task Force," Gastrointestinal Endoscopy, vol. 92, no. 4, pp. 951–959, 2020.
[13] S. Murugan, C. Venkatesan, M. G. Sumithra et al., "DEMNET: a deep learning model for early diagnosis of Alzheimer diseases and dementia from MR images," IEEE Access, vol. 9, pp. 90319–90329, 2021.
[14] V. Chandran, M. G. Sumithra, A. Karthick et al., "Diagnosis of cervical cancer based on ensemble deep learning network using colposcopy images," vol. 2021, pp. 1–15, 2021.
[15] A. Khosla, P. Khandnor, and T. Chand, "A comparative analysis of signal processing and classification methods for different applications based on EEG signals," Biocybernetics and Biomedical Engineering, vol. 40, no. 2, pp. 649–690, 2020.
[16] P. Tang, Q. Liang, X. Yan et al., "Efficient skin lesion segmentation using separable-Unet with stochastic weight averaging," Computer Methods and Programs in Biomedicine, vol. 178, pp. 289–301, 2019.
[17] S. Igarashi, Y. Sasaki, T. Mikami, H. Sakuraba, and S. Fukuda, "Anatomical classification of upper gastrointestinal organs under various image capture conditions using AlexNet," Computers in Biology and Medicine, vol. 124, article 103950, 2020.
[18] A. Biniaz, R. A. Zoroofi, and M. R. Sohrabi, "Automatic reduction of wireless capsule endoscopy reviewing time based on factorization analysis," Biomedical Signal Processing and Control, vol. 59, p. 101897, 2020.
[19] S. Jain, A. Seal, A. Ojha et al., "Detection of abnormality in wireless capsule endoscopy images using fractal features," Computers in Biology and Medicine, vol. 127, p. 104094, 2020.
[20] T. Cogan, M. Cogan, and L. Tamil, "MAPGI: accurate identification of anatomical landmarks and diseased tissue in gastrointestinal tract using deep learning," Computers in Biology and Medicine, vol. 111, article 103351, 2019.
[21] H. Alaskar, A. Hussain, N. Al-Aseem, P. Liatsis, and D. Al-Jumeily, "Application of convolutional neural networks for automated ulcer detection in wireless capsule endoscopy images," Sensors, vol. 19, no. 6, p. 1265, 2019.
[22] T. Ozawa, S. Ishihara, M. Fujishiro et al., "Novel computer-assisted diagnosis system for endoscopic disease activity in patients with ulcerative colitis," Gastrointestinal Endoscopy, vol. 89, no. 2, pp. 416–421.e1, 2019.
[23] V. Vani and K. M. Prashanth, "Ulcer detection in wireless capsule endoscopy images using deep CNN," Journal of King Saud University - Computer and Information Sciences, 2020.
[24] K. İncetan, I. O. Celik, A. Obeid et al., "VR-Caps: a virtual environment for capsule endoscopy," Medical Image Analysis, vol. 70, p. 101990, 2021.
[25] A. K. Kundu and S. A. Fattah, "Probability density function based modeling of spatial feature variation in capsule endoscopy data for automatic bleeding detection," Computers in Biology and Medicine, vol. 115, article 103478, 2019.
[26] M. Abra Ayidzoe, Y. Yu, P. K. Mensah, J. Cai, K. Adu, and Y. Tang, "Gabor capsule network with preprocessing blocks for the recognition of complex images," Machine Vision and Applications, vol. 32, no. 4, 2021.
[27] S. Mohapatra, J. Nayak, M. Mishra, G. K. Pati, B. Naik, and T. Swarnkar, "Wavelet transform and deep convolutional neural network-based smart healthcare system for gastrointestinal disease detection," Interdisciplinary Sciences: Computational Life Sciences, vol. 13, no. 2, pp. 212–228, 2021.
[28] P. Muruganantham and S. M. Balakrishnan, "A survey on deep learning models for wireless capsule endoscopy image analysis," International Journal of Cognitive Computing in Engineering, vol. 2, pp. 83–92, 2021.
[29] K. Pogorelov, K. R. Randel, C. Griwodz et al., "KVASIR: a multi-class image dataset for computer aided gastrointestinal disease detection," in Proceedings of the 8th ACM on Multimedia Systems Conference, pp. 164–169, New York, NY, USA, 2017.
[30] A. Caroppo, A. Leone, and P. Siciliano, "Deep transfer learning approaches for bleeding detection in endoscopy images," Computerized Medical Imaging and Graphics, vol. 88, article 101852, 2021.
[31] S. Minaee, R. Kafieh, M. Sonka, S. Yazdani, and G. Jamalipour Soufi, "Deep-COVID: predicting COVID-19 from chest X-ray images using deep transfer learning," Medical Image Analysis, vol. 65, p. 101794, 2020.
[32] M. N. Y. Ali, M. G. Sarowar, M. L. Rahman, J. Chaki, N. Dey, and J. M. R. S. Tavares, "Adam deep learning with SOM for human sentiment classification," International Journal of Ambient Computing and Intelligence, vol. 10, no. 3, pp. 92–116, 2019.
[33] M. Grandini, E. Bagli, and G. Visani, "Metrics for multi-class classification: an overview," 2020, https://arxiv.org/abs/2008.05756.
[34] C. Gamage, I. Wijesinghe, C. Chitraranjan, and I. Perera, "GI-Net: anomalies classification in gastrointestinal tract through endoscopic imagery with deep learning," in 2019 Moratuwa Engineering Research Conference (MERCon), pp. 66–71, Moratuwa, Sri Lanka, 2019.
[35] T. Agrawa, R. Gupta, S. Sahu, and C. E. Wilson, "SCL-UMD at the medico task-mediaeval 2017: transfer learning based classification of medical images," CEUR Workshop Proceedings, vol. 1984, pp. 3–5, 2017.
[36] S. S. A. Naqvi, S. Nadeem, M. Zaid, and M. A. Tahir, "Ensemble of texture features for finding abnormalities in the gastrointestinal tract," CEUR Workshop Proceedings, vol. 1984, 2017.
