Deepak Kumar Swain
School of Computing
National College of Ireland
Combining VGG16 with Random Forest and Capsule
Network for Detecting Multiple Myeloma
Deepak Kumar Swain
x19216769
Abstract
Multiple Myeloma (MM) is a common blood cancer that affects white blood cells.
Patients' survival rates improve with early diagnosis. Early diagnosis, however,
remains a major challenge. This study presents a model for detecting multiple
myeloma from microscopic images of patients' blood cells that combines VGG16 with
capsule networks (CapsNet) and random forests (RF). The models are trained on 85
blood cell images, and accuracy and intersection over union (IoU) are used to
compare them with state-of-the-art (SOTA) models such as U-Net and Mask R-CNN.
Comparing the two proposed models, the VGG16-CapsNet model gave better accuracy,
but the VGG16-RF model performed better at segmenting myeloma cells. However, the
segmented output achieved was not better than that of the Mask R-CNN model.
1 Introduction
In the biomedical sector, detecting malignant white blood cells that are produced in bone
marrow is a difficult process. In the human body, stem cells are responsible for the devel-
opment of white blood cells (WBC), red blood cells (RBC), and platelets. WBC serve as
a barrier to pathogens, while RBC operate as a route for oxygen to circulate throughout
the human body. Platelets, also known as thrombocytes, are blood components that
react to injuries or bleeding by causing blood to clot (McKenzie and Williams; 2015).
WBC are important components of the human body’s internal defense system, and
they are mostly produced in the red bone marrow. The thymus gland, lymph nodes, and
spleen all generate different types of WBC. The human body requires that WBC divide
in an organized way. Multiple Myeloma is a cancer that develops when a person’s body
produces too many aberrant WBC. These aberrant cells then quickly mutate, crowding
out regular red and white blood cells. From 2014 to 2018 in the US, the yearly rates
of new MM cases and of deaths caused by MM, per 100,000 men and women, were 7.1
and 3.2 respectively. In 2021, around 35,000 new cases of MM were detected and
12,410 deaths were caused by MM in the US.
Multiple myeloma affects plasmacytes, which are a kind of blood cell. Antibodies are
produced by normal plasma cells to fight infections (Vyshnav et al.; 2020). A physical
examination is usually used to discover the disease, with the doctor looking for physical
symptoms such as spleen or liver enlargement, as well as lymphadenopathy. Blood tests
are then performed to check for abnormalities in red or white blood cells, or bone marrow
samples are removed and examined for multiple myeloma. The sample testing aids in
determining the illness type and pace of progression. The mechanisms outlined above
are quite effective at identifying malignant cells. These procedures, on the other hand,
are laborious, time-consuming, and prone to human error. In recent years, machine
learning techniques have been successfully applied in healthcare (Anilkumar et al.;
2020) to tasks such as classification and identification of various diseases, as well as
detection of uncommon disorders that are difficult to analyze from a microscopic image
with the human eye.
Owing to their effectiveness and accuracy, capsule neural networks (LaLonde et al.;
2020) and transfer learning (Karimi et al.; 2021) approaches have gained relevance for
image segmentation in the medical sector in recent years. In transfer learning, for
example, a pretrained model's learned weights are used as the starting point for
training instead of starting from scratch. This helps train models faster without
reducing the quality of the results. It is also helpful when the dataset available for
a study is comparatively small. To the best of our knowledge, capsule neural networks
have not yet been employed for image segmentation of any form of blood malignancy,
such as leukemia, myeloma, or lymphoma. So far, researchers have used autoencoder-type
models such as U-Net, Mask R-CNN, Tiramisu and Faster R-CNN for image segmentation,
in which the encoder down-samples the original image to generate a learned feature
vector and the decoder up-samples the learned features to give the original image or
object as output. Transfer learning techniques have also gained popularity, especially
when the dataset is small, because the models are not trained from scratch (Zan et al.;
2020). The pre-trained parameters are used as the starting point for training on the
target dataset.
The proposed solution combines VGG16, CapsNet and RF to perform image seg-
mentation. The aim of this research is to investigate how accurately transfer learning
techniques are able to detect multiple myeloma from microscopic blood cell images. The
following sets of study objectives were developed to answer the research question:
• Examine SOTA models such as U-Net and Mask R-CNN, which have been widely
applied in the biomedical field.
• Design transfer learning models combining VGG16 with Capsule network and Ran-
dom Forest for the segmentation of myeloma cells.
• Train and assess the performance of the models with a small dataset (fewer than
100 images).
Section 2 addresses related work with a focus on machine learning approaches for
image segmentation and classification of multiple myeloma. Section 3 discusses the
research methodology, while Section 4 discusses the experiments and results of applying the
proposed approach. Section 5 concludes the research and discusses future work.
2 Related Work
Various research projects on image segmentation of blood malignancies such as
leukemia and multiple myeloma have been undertaken in recent years, and some of the
findings are presented in the following sub-sections. Researchers have identified and
classified multiple myeloma using a variety of methodologies and approaches. Section
2.1 reviews existing work on segmentation of blood cancer images. Section 2.2 reviews
the literature on CapsNet and RF.
the classification accuracy of Acute Lymphoblastic Leukemia (ALL) type (L1, L2, and
L3). The researchers discovered that using PCA rather than other feature extraction
approaches improves sensitivity and specificity. For image segmentation of leukemia, Ja-
been et al. (2020) utilized a new method that used color filters to locate regions of interest
(in this case, white blood cells) and feature extraction utilizing transformation techniques
such as curvelet transformation and wavelet transformation. The collected feature vectors
were then input into K-nearest neighbours (KNN) and support vector machine (SVM)
models to verify the proposed algorithm’s accuracy. Amin et al. (2014) utilized a bin-
ary support vector machine (SVM) model to categorize the pictures into malignant and
noncancerous classes and used K-means clustering to separate the white blood cells. The
model was tested using K-fold cross-validation (k=10) to check whether it was overfitting
the data, and it was found that it was not. Using a faster region-based convolutional
neural network (Faster R-CNN) model, Hossain et al. (2020) suggested a method to detect
malignant blood types in acute lymphoblastic leukemia. For image segmentation and
classification of acute lymphoblastic leukemia, Sukhia et al. (2019) employed two distinct
methods. The diffused expectation maximisation (DEM) method was used to detect
three separate classes in the images: WBC, RBC, and background. The pictures were
categorized into classes (normal blood cell and lymphoblast blood cell) using a sparse
classifier after feature extraction and selection. Vincent et al. (2015) employed the stack-
ing idea, in which a two-step neural network was used to provide two separate sorts of
classifications. The blood smear picture samples were used as input in the first phase,
and required characteristics were retrieved from the images. With the aid of a sequential
neural network, the pictures were categorized into two groups (normal blood cells images
and aberrant blood cells images) after feature extraction. The result of the first stage was
utilized as input for feature extraction in the second step. Following feature extraction,
the aberrant cell pictures were further classified into acute lymphoblastic leukemia (ALL)
and acute myeloid leukemia (AML) with the help of another sequential neural network.
autoencoder structure to compensate for global information loss in restricting the routing.
This network was compared to cutting-edge networks like U-Net, Tiramisu, and P-HNN.
Wu and Misra (2019) used RF, wavelet transform, and hessian matrix to segment images
of organic shales. The existence of the four shale components, porosity, rock matrix,
pyrite, and organic components, was determined using features extracted from pictures
and put into a RF model. For automated identification of maize tassels, Zan et al.
(2020) utilized transfer learning by merging VGG16 and RF, which is appropriate for
complicated situations. First, the RF was used to partition drone images into tassel and
non-tassel areas, followed by the morphological method to extract likely tassel region
recommendations, and finally, false positives were removed using VGG16.
Owing to their strong feature extraction capability, machine learning-based techniques
have become increasingly popular for image segmentation in recent years. Convolutional
neural networks such as Faster R-CNN, U-Net, Mask R-CNN, and Tiramisu have been
frequently used for image segmentation in research projects, with excellent results.
However, the most significant disadvantage of convolutional neural networks is that they
require a huge amount of training data to obtain a good outcome. Another issue with
convolutional neural networks is that they have trouble storing object orientation and
location in pictures. This scenario arises as a result of the network’s usage of the pooling
approach, which can assist in locating the required component in an image but fails to
comprehend the component’s location. It just looks to see if the component is there in
the image and then classifies it based on that. It is capable of accurately classifying a
picture as a whole. However, it is impossible to overlook the importance of component
placements, particularly in medical images. The issue with convolutional neural net-
works (CNN) has been well-known for years, and numerous studies are currently being
conducted to address it. To address CNN’s limitations, Hinton et al. (2011) proposed the
basic principles underpinning capsule neural networks. Sabour et al. (2017) eventually
deciphered the reasoning and presented the notion of dynamic capsule routing. This is
a very new notion that is still being studied. Even though deep learning techniques are
widely used in biomedical fields, classical machine learning techniques perform very
well on tabular data. Among these techniques, Random Forest (RF) is widely used in
biomedical fields because, unlike many other machine learning algorithms, it is less
prone to overfitting and more robust to imbalanced data (Csaholczi et al.; 2021).
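The pooling limitation described in this section can be illustrated with a toy example. The sketch below is not taken from any of the cited models: it is a plain-Python 2x2 max pooling, with hypothetical 4x4 inputs, showing that two images whose bright pixel sits at different positions inside the same pooling window produce identical pooled outputs, so the exact location is lost.

```python
def max_pool_2x2(image):
    """Apply non-overlapping 2x2 max pooling to a 2D list of numbers."""
    pooled = []
    for r in range(0, len(image) - 1, 2):
        row = []
        for c in range(0, len(image[0]) - 1, 2):
            window = (image[r][c], image[r][c + 1],
                      image[r + 1][c], image[r + 1][c + 1])
            row.append(max(window))
        pooled.append(row)
    return pooled


# Two toy inputs: the same bright pixel at different positions
# inside the top-left 2x2 window.
image_a = [[9, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]
image_b = [[0, 0, 0, 0],
           [0, 9, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]

pooled_a = max_pool_2x2(image_a)
pooled_b = max_pool_2x2(image_b)

print(pooled_a)               # [[9, 0], [0, 0]]
print(pooled_a == pooled_b)   # True: the position inside the window is discarded
```

Pooling keeps the evidence that a feature is present but not where it was inside the window, which is the behaviour that capsule networks are intended to address.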
3 Methodology
The proposed method, as illustrated in Figure 1, performs image segmentation using
the capabilities of image classification algorithms. A transfer learning approach was
adopted by using a VGG16 model that had already been trained on the ImageNet dataset
(WordNet; 2021), which has 1000 classes. The final layer of the VGG16 model was
removed, and the remaining trained weights were used as the starting point for training
on the multiple myeloma dataset with Random Forest and Capsule Neural Network models.
The two models were trained separately.
Figure 1: Proposed method where both Random Forest and Capsule Network models
were trained separately by using VGG16 as backbone
The current study makes use of a dataset from The Cancer Imaging Archive (TCIA)
(Gupta and Gupta; 2019). The dataset contains microscopic images collected from bone
marrow aspirate slides of patients with multiple myeloma who were diagnosed using
standard methods. The slides were stained with Jenner-Giemsa stain. Images were captured
at 1000x magnification using a Nikon Eclipse-200 microscope and a digital camera. The
images were captured in raw BMP format at a resolution of 2560x1920 pixels. However,
for the purposes of this project, they were converted to PNG format. The dataset
contains 85 images in total. Before being used for segmentation, all 85 images were
stain normalized by professionals in the biomedical area. These stain-corrected images
are provided as an annotated dataset, with plasma cells marked in all image slides.
Figure 2 shows a selection of annotated images taken from the slides.
Figure 2: (a) Ground Truth Image, (b) Annotation Using VGG Annotator
For data preparation, images were annotated using the VGG Image Annotator (Dutta
and Zisserman; 2019), and the annotations were exported as a JSON file for creating
masked images. Two popular techniques for finding contours in images, K-means
clustering and computer vision operations from OpenCV, were tried for this purpose.
OpenCV, however, gave better results for masked image creation. To create the
masked images, the pixel coordinates of the myeloma cells were taken from the JSON files,
and a bitwise AND operation was then performed to obtain the contours of the myeloma
cells in the images. This approach produced the desired masked images, which was
difficult to achieve with K-means clustering.
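The mask-creation step described above can be sketched in a few lines. This is a minimal, dependency-free illustration and not the code used in the study: the annotation record, the field names (`regions`, `shape_attributes`, `all_points_x`, `all_points_y`, assumed here to follow a VIA-style polygon export), the toy image, and the helper functions are all hypothetical. In practice the same idea is usually expressed with OpenCV polygon filling followed by a bitwise AND.

```python
import json


def point_in_polygon(x, y, xs, ys):
    """Ray-casting test: is the point (x, y) inside the polygon (xs, ys)?"""
    inside = False
    j = len(xs) - 1
    for i in range(len(xs)):
        # Does the edge (j -> i) straddle the horizontal line through y?
        if (ys[i] > y) != (ys[j] > y):
            x_at_y = xs[i] + (y - ys[i]) * (xs[j] - xs[i]) / (ys[j] - ys[i])
            if x < x_at_y:
                inside = not inside
        j = i
    return inside


def polygon_mask(height, width, xs, ys):
    """Binary mask (255 inside, 0 outside), sampled at pixel centres."""
    return [[255 if point_in_polygon(c + 0.5, r + 0.5, xs, ys) else 0
             for c in range(width)] for r in range(height)]


def apply_mask(image, mask):
    """Bitwise AND of an 8-bit grayscale image with a 0/255 mask."""
    return [[pixel & m for pixel, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


# Hypothetical annotation record (assumed VIA-style polygon fields).
annotation = json.loads("""
{"filename": "example.png",
 "regions": [{"shape_attributes": {"name": "polygon",
                                   "all_points_x": [1, 4, 4, 1],
                                   "all_points_y": [1, 1, 3, 3]}}]}
""")
shape = annotation["regions"][0]["shape_attributes"]

mask = polygon_mask(4, 6, shape["all_points_x"], shape["all_points_y"])
image = [[10 * (r + 1) + c for c in range(6)] for r in range(4)]  # toy image
masked = apply_mask(image, mask)

for row in masked:
    print(row)
```

Only the pixels inside the annotated polygon keep their original values; everything else becomes 0, which gives the masked image used as the segmentation target.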
The truncated output of the pretrained VGG16 model was used as input for both the
RF and CapsNet models. For test images, the model returned the coordinates of the
masked regions, which were then converted to images to visualize the segmented
output.
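One common way to feed convolutional features to a tabular classifier such as RF is to treat every pixel as one row. The sketch below assumes that layout; the source does not state the exact shapes used, so the toy feature map, the mask, and the thresholding stand-in for a trained classifier are all hypothetical. It shows how an H x W x C feature map becomes an (H*W) x C table and how flat per-pixel predictions are turned back into an image.

```python
def flatten_features(feature_map):
    """Turn an H x W x C feature map into H*W rows of C features each."""
    return [pixel_features for row in feature_map for pixel_features in row]


def flatten_labels(mask):
    """Turn an H x W mask into a flat list of H*W per-pixel labels."""
    return [label for row in mask for label in row]


def unflatten(flat_predictions, height, width):
    """Reshape H*W per-pixel predictions back into an H x W image."""
    return [flat_predictions[r * width:(r + 1) * width] for r in range(height)]


# Toy 2 x 3 feature map with 2 channels per pixel (a stand-in for the
# truncated VGG16 output) and its ground-truth mask (1 = myeloma cell).
feature_map = [[[0.1, 0.9], [0.2, 0.8], [0.7, 0.1]],
               [[0.0, 1.0], [0.9, 0.2], [0.8, 0.3]]]
mask = [[0, 0, 1],
        [0, 1, 1]]

X = flatten_features(feature_map)   # 6 rows, 2 columns: the training table
y = flatten_labels(mask)            # 6 labels, one per pixel

# Placeholder for a trained classifier's predictions: thresholds channel 0.
predictions = [1 if row[0] > 0.5 else 0 for row in X]
segmented = unflatten(predictions, height=2, width=3)

print(segmented)   # [[0, 0, 1], [0, 1, 1]]
```

With this layout, any classifier that accepts a feature table and a label vector can be trained on `X` and `y`, and its flat output can be reshaped for visual comparison with the ground truth.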
Figure 3: Output of state of the art models and proposed models on the same test data
Figure 4: Visual Representation of IoU Values on a Test Image for VGG16-RF Model:
(a) Ground Truth, (b) Prediction, (c) Ground Truth Vs Prediction comparison based on
IoU values
the output of the Mask R-CNN model. However, it is clear that none of the models
gives results that match the ground truth exactly. The ground truth image contains
only four myeloma cells, but the Mask R-CNN and VGG16-RF outputs each show six
myeloma cells. This also supports the argument that metrics such as accuracy, precision,
recall and F1-score are not valid measures for evaluating these models. Hence, IoU has
been used in the current research to validate each predicted myeloma cell. Figure 4 shows
the IoU values for a sample image. The IoU values of the individual predicted cells are
0.713, 0.807, 0.923, 0.784, 0.000 and 0.000 respectively. For the invalid cells, which
have no matching cell in the ground truth, the IoU value is shown as 0.
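For reference, IoU for a pair of binary masks is the number of pixels in their intersection divided by the number of pixels in their union. The following sketch uses hypothetical 6x6 masks and a hypothetical `box_mask` helper rather than the study's data; it shows a partially overlapping prediction and a spurious prediction with no ground-truth counterpart, which scores 0 as described above.

```python
def iou(mask_a, mask_b):
    """Intersection over union of two same-sized binary masks."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            intersection += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    return intersection / union if union else 0.0


def box_mask(height, width, top, left, bottom, right):
    """Binary mask with ones in rows [top, bottom) and columns [left, right)."""
    return [[1 if top <= r < bottom and left <= c < right else 0
             for c in range(width)] for r in range(height)]


truth = box_mask(6, 6, 1, 1, 4, 4)        # ground-truth cell, 3x3 pixels
prediction = box_mask(6, 6, 1, 2, 4, 5)   # same size, shifted one column
no_cell = box_mask(6, 6, 0, 0, 0, 0)      # no ground-truth cell here
spurious = box_mask(6, 6, 4, 0, 6, 2)     # predicted cell with no match

print(iou(truth, truth))        # 1.0
print(iou(truth, prediction))   # 0.5  (6 shared pixels / 12 in the union)
print(iou(no_cell, spurious))   # 0.0
```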
4.3 Discussion
The VGG16-CapsNet model failed to achieve the desired segmented output even though
it achieved the highest accuracy. The VGG16-RF model achieved lower accuracy than
Mask R-CNN, but it produced a similar result, as shown in Figure 3. This also
supports the argument that accuracy is not the correct measure for image segmentation.
Hence, IoU is used to evaluate how accurately the models perform the segmentation.
As shown in Figure 4, the VGG16-RF model gave good IoU values for the desired
myeloma cells. However, the model also segmented some cells that are not myeloma
cells. This occurred because their colour contrast was almost the same as that of
myeloma cells, so the model failed to disregard them. The same behaviour is seen in
the output of the Mask R-CNN model in the existing work. However, the existing
Mask R-CNN work has not been reproduced in the current research.
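The point that accuracy can mislead for segmentation can be made concrete with a small, hypothetical example. In the sketch below, a 10x10 image contains a single 2x2 cell, and the "model" predicts background everywhere. Pixel accuracy is high because background dominates, while IoU for the cell is zero. The image size and masks are illustrative assumptions, not data from the study.

```python
def pixel_accuracy(truth, prediction):
    """Fraction of pixels whose predicted label equals the true label."""
    total = 0
    correct = 0
    for row_t, row_p in zip(truth, prediction):
        for t, p in zip(row_t, row_p):
            total += 1
            correct += 1 if t == p else 0
    return correct / total


def foreground_iou(truth, prediction):
    """IoU of the foreground (label 1) between two binary masks."""
    intersection = 0
    union = 0
    for row_t, row_p in zip(truth, prediction):
        for t, p in zip(row_t, row_p):
            intersection += 1 if (t == 1 and p == 1) else 0
            union += 1 if (t == 1 or p == 1) else 0
    return intersection / union if union else 0.0


# 10x10 ground truth with one 2x2 cell (4 foreground pixels out of 100).
truth = [[1 if 4 <= r < 6 and 4 <= c < 6 else 0 for c in range(10)]
         for r in range(10)]
# A degenerate prediction that marks every pixel as background.
all_background = [[0] * 10 for _ in range(10)]

accuracy = pixel_accuracy(truth, all_background)
cell_iou = foreground_iou(truth, all_background)

print(accuracy)   # 0.96
print(cell_iou)   # 0.0
```

A prediction that finds no cell at all still scores 96% pixel accuracy here, which is why a region-overlap measure such as IoU is the more informative metric for this task.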
References
Acharya, V. and Kumar, P. (2019). Detection of acute lymphoblastic leukemia using
image segmentation and data mining algorithms, Springer .
URL: https://doi.org/10.1007/s11517-019-01984-1
Amin, M. M., Kermani, S., Talebi, A. and Oghli, M. G. (2014). Recognition of acute
lymphoblastic leukemia cells in microscopic images using k-means clustering and sup-
port vector machine classifier, JMSS .
URL: http://dx.doi.org/10.4103/2228-7477.150428
Anilkumar, K., Manoj, V. and Sagi, T. (2020). A survey on image segmentation of blood
and bone marrow smear images with emphasis to automated detection of leukemia,
ELSEVIER .
URL: https://doi.org/10.1016/j.bbe.2020.08.010
Dutta, A. and Zisserman, A. (2019). The via annotation software for images, audio and
video, 27th ACM International Conference on Multimedia p. 2276–2279.
URL: https://annotate.officialstatistics.org/
Gautam, A., Singh, P., Raman, B. and Bhadauria, H. (2016). Automatic classification
of leukocytes using morphological features and naïve bayes classifier, IEEE .
URL: https://doi.org/10.1109/TENCON.2016.7848161
Gupta, R. and Gupta, A. (2019). Mimm sbilab dataset: Microscopic images of multiple
myeloma, The Cancer Imaging Archive .
URL: https://doi.org/10.7937/tcia.2019.pnn6aypl
Hossain, M. A., Sabik, M. I., Muntasir, I., Islam, A. M., Islam, S. and Ahmed, A. (2020).
Leukemia detection mechanism through microscopic image and ml techniques, IEEE .
URL: https://doi.org/10.1109/TENCON50793.2020.9293925
Jabeen, A., Jabeen, S., Shah, S. A. and Rao, W. A. (2020). Efficient features for effectively
detection of leukemia cells, IEEE .
URL: https://doi.org/10.1109/INMIC50486.2020.9318085
Karimi, D., Warfield, S. K. and Gholipour, A. (2021). Transfer learning in medical image
segmentation: New insights from analysis of the dynamics of model parameters and
learned representations, ELSEVIER 116.
URL: https://doi.org/10.1016/j.artmed.2021.102078
LaLonde, R., Xu, Z., Irmakci, I., Jain, S. and Bagci, U. (2020). Capsules for biomedical
image segmentation and challenging issues, ELSEVIER .
URL: https://doi.org/10.1016/j.media.2020.101889
Mishra, S., Majhi, B., Sa, P. K. and Sharma, L. (2016). Gray level co-occurrence matrix
and random forest based acute lymphoblastic leukemia detection, ELSEVIER .
URL: https://doi.org/10.1016/j.bspc.2016.11.021
Sabour, S., Hinton, G. E. and Frosst, N. (2017). Dynamic routing between capsules,
NeurIPS .
Saeedizadeh, Z., Dehnavi, A. M., Talebi, A., Rabbani, H., Sarrafzadeh, O. and Vard, A.
(2016). Automatic recognition of myeloma cells in microscopic images using bottleneck
algorithm, modified watershed and svm classifier, Journal of Microscopy .
URL: https://doi.org/10.1111/jmi.12314
Sajjad, M., Khan, S., Jan, Z., Muhammad, K., Moon, H., Kwak, J. T., Rho, S., Baik,
S. W. and Mehmood, I. (2017). Leukocytes classification and segmentation in micro-
scopic blood smear: A resource-aware healthcare service in smart cities, IEEE .
URL: https://doi.org/10.1109/ACCESS.2016.2636218
Sukhia, K. N., Ghafoor, A., Riaz, M. M. and Iltaf, N. (2019). Automated acute lympho-
blastic leukaemia detection system using microscopic images, IET .
URL: https://doi.org/10.1049/iet-ipr.2018.5471
Vincent, I., Kwon, K.-R., Lee, S.-H. and Moon, K.-S. (2015). Acute lymphoid leukemia
classification using two-step neural network classifier, IEEE .
URL: https://doi.org/10.1109/FCV.2015.7103739
Vyshnav, M. T., Sowmya, V., Gopalakrishnan, E. A., Sajith, V. V. V., Menon, V. K. and
Soman, K. P. (2020). Deep learning based approach for multiple myeloma detection,
IEEE .
URL: https://doi.org/10.1109/ICCCNT49239.2020.9225651
Wu, Y. and Misra, S. (2019). Intelligent image segmentation for organic-rich shales using
random forest, wavelet transform, and hessian matrix, IEEE .
URL: https://doi.org/10.1109/LGRS.2019.2943849
Zan, X., Zhang, X., Xing, Z., Liu, W., Zhang, X., Su, W., Liu, Z., Zhao, Y. and Li, S.
(2020). Automatic detection of maize tassels from uav images by combining random
forest classifier and vgg16, IEEE 112.
URL: https://doi.org/10.3390/rs12183049