BACHELOR OF TECHNOLOGY
in
Electronics and Communication Engineering

Submitted by
Shambhavi Sinha
Reg. No: 200907122

Manipal
1st July, 2024
CERTIFICATE
This is to certify that the project titled SpectraDerm Vision: Deep Learning for
Multiclass Skin Cancer Analysis is a record of the bonafide work done by Shambhavi
Sinha (Reg. No. 200907122) submitted in partial fulfilment of the requirements for the
award of the Degree of Bachelor of Technology (BTech) in ELECTRONICS AND
COMMUNICATION ENGINEERING of Manipal Institute of Technology, Manipal,
Karnataka, (A Constituent Unit of Manipal Academy of Higher Education), during the
academic year 2023 - 2024.
ACKNOWLEDGMENTS
I would like to express my sincere gratitude to the individuals whose support and guidance
were instrumental in the successful completion of this project.
First and foremost, I am deeply indebted to Director Anil Rana for his unwavering support and
encouragement throughout the course of this project. His leadership and vision have been a
source of inspiration.
I extend my heartfelt thanks to Dr. Pallavi R Mane, Head of the Department, for her valuable
insights, continuous support, and for providing an environment conducive to learning and
research.
A special note of thanks to Dr. Anu Shaju Areeckal for her expert guidance, constructive
criticism, and constant encouragement. Her expertise and advice were crucial in navigating the
challenges faced during this project.
I would also like to acknowledge the invaluable assistance of Manan Bhatt (Reg. 200907188).
His support and collaboration were significant in the successful completion of this work.
Finally, I am grateful to all the faculty members whose assistance and cooperation were sought
during the project work. Their knowledge, suggestions, and support have been immensely
helpful.
ABSTRACT
Skin cancer is among the most prevalent cancers worldwide, significantly affecting a large
portion of the global population. There is an urgent need for an accurate, non-invasive method
to differentiate between benign and malignant tumors from dermoscopic images. This paper
introduces a novel lightweight deep learning model, SpectraDerm Vision, which enhances the
accuracy of skin cancer diagnosis. This model integrates advanced preprocessing and color
consistency techniques with a new Dual Attention mechanism, based on the modified
MobileNet-V2 architecture. The effectiveness of SpectraDerm Vision is demonstrated using
the public ISIC Challenge dataset, where it achieves impressive results: 98.85% accuracy,
99.28% precision, 98.21% recall, and 98.74% F1-score. SpectraDerm Vision shows
significantly superior performance compared to existing transfer learning models, making it
highly effective for the precise detection of skin cancer from dermoscopic images.
Additionally, we have extended the experiment to multiclass classification using the 2019 ISIC
dataset for 8 classes, achieving remarkable results with a Macro Precision of 94.68%, Macro
Recall of 91.94%, and Macro F1 Score of 92.16%. These outcomes further underscore the
robustness and versatility of SpectraDerm Vision in accurately classifying various types of skin
lesions, making it a powerful tool for dermatological diagnostics.
LIST OF TABLES

LIST OF FIGURES
Contents

Acknowledgement
Abstract
List of Figures
List of Tables

Chapter 1 INTRODUCTION
1.1 Introduction
1.2 Introduction to Area of Work
1.3 Brief Present Day Scenario with Regard to the Work Area
1.4 Motivation
1.5 Objective of the Work
1.6 Target Specifications
1.7 Project Work Schedule
1.8 Organization of Report

Chapter 2 BACKGROUND THEORY

Chapter 3 METHODOLOGY
3.1 Introduction
3.2 Methodology
3.2.1 Dataset Used
3.2.2 Dataset Splitting
3.2.3 Addressing Class Imbalance
3.2.4 Preprocessing and Algorithm Description
3.2.5 Proposed Model Architecture
3.3 Preliminary Result Analysis
3.4 Conclusion

Chapter 4 RESULT ANALYSIS

Chapter 5 CONCLUSION AND FUTURE SCOPE OF WORK

Chapter 6 HEALTH, SAFETY, RISK AND ENVIRONMENT ASPECTS

REFERENCES
PROJECT DETAILS
CHAPTER 1
INTRODUCTION
1.1. Introduction
This chapter outlines the scope and intentions of the project, detailing the methodology
and the objectives aimed at enhancing the precision of skin cancer classification using
advanced deep learning techniques. It sets the stage for a detailed discussion on the current
landscape of skin cancer diagnosis, the motivation behind this project, and the expected impact
of the research.
1.3. Brief Present Day Scenario with Regard to the Work Area
Currently, the diagnosis of skin cancer is primarily conducted through histopathological
examination, requiring tissue samples and expert analysis. However, this method is not only
invasive but also prone to delays and variability in diagnosis due to the subjective nature of
visual assessments and the scarcity of specialized dermatopathologists.
1.5. Objective of the Work
Main Work Objective: To develop and validate a lightweight, efficient deep learning
model capable of high-accuracy classification of skin cancer from dermatoscopic
images.
Secondary Objective: To compare the performance of the proposed model against
existing benchmarks, demonstrating improvements in speed and accuracy.
1.6. Target Specifications
Importance of the End Result: The project aims to achieve over 98% accuracy,
precision, and recall, making the SpectraDerm Vision model a viable tool for clinical
and remote diagnostic settings.
1.7. Project Work Schedule
A detailed timeline is laid out for the project, from data collection through to final
testing, ensuring structured progress and timely completion.
1.8. Organization of the Project Report (Chapter Wise)
Introduction: Overview of the project, its significance, and structure.
Literature Review: Analysis of current technologies and previous work in the field.
Methodology: Detailed account of the techniques and processes used in the project.
Results and Discussion: Presentation and discussion of the findings from the project
testing.
Conclusion and Future Work: Summary of the project, its impact, and potential future
directions.
CHAPTER 2
BACKGROUND THEORY
2.1. Introduction
This section explores the theoretical framework and prior studies that form the basis of
our project. It offers an extensive review of the literature, capturing the latest advancements
and theoretical discourse in dermatological image analysis, with a special emphasis on skin
cancer detection.
Recent Developments in the Work Area: Early detection of skin cancer is crucial in
dermatology due to its aggressive nature and significant implications on patient outcomes.
A variety of methods have been explored to enhance early detection capabilities.
Brief Background Theory: In the literature, multiple machine learning methods have been
explored for skin cancer detection, indicating an ongoing evolution of technology in this
field. Each method presents a potential for further improvement.
Artificial Intelligence Techniques: Ozkan et al. [2] explored four distinct machine
learning models: Support Vector Machines (SVM), Decision Trees, Artificial Neural
Networks (ANN), and K-Nearest Neighbors (KNN). Among these, the ANN model
achieved the highest performance on the PH2 dataset with an accuracy of 92.5% and an
F1-score of 90.45%.
Hybrid Models: Toğaçar et al. [4] integrated auto-encoders with MobileNetV2 and spiking
neural networks, leading to a classification accuracy of up to 93.54% when combining
datasets. Meanwhile, Hosny et al. [5] utilized AlexNet with extensive data augmentation
techniques to achieve an average accuracy of 98.61% on the PH2 dataset, with a precision
of 97.73%.
Optimization Techniques: Sayed et al. [6] combined convolutional neural networks with
the Bald Eagle Search optimization technique and used the ROS method to address data
imbalances in the ISIC dataset. This model achieved high metrics across accuracy,
specificity, sensitivity, and F1-score, highlighting the effectiveness of their novel method.
Comprehensive Reviews: Magdy et al. [7] provided a detailed review of various machine
learning and deep learning models for skin cancer classification. They highlighted
VGG-16 with a KNN-PDNN classifier as exceptionally effective, albeit noting the
model's substantial size as a limitation [8].
Enhanced Architectures for Multiclass Classification: Thanga Purni et al. [9] introduce
EOSA-Net, a unique architecture optimized specifically for multiclass classification of
skin cancer. It leverages an ensemble of deep learning techniques, significantly enhancing
the robustness and accuracy of predictions. The model's extensive use of data augmentation
also improves its generalization capabilities, showcasing superior performance in accuracy
and precision compared to other state-of-the-art models. However, the complexity of
EOSA-Net could require substantial computational resources, potentially limiting its
deployment in resource-constrained environments.
Comprehensive Analysis of CNN Architectures: Akter et al. [10] explore a range of deep
convolutional neural networks, including MobileNet, DenseNet, InceptionV3, ResNet-50,
VGG-16 and Xception, for multiclass skin cancer classification. This study highlights the
high accuracy rates, particularly with InceptionV3, which reaches up to 90%. It also
discusses the benefits and limitations of using stacking ensemble models, noting issues with
overfitting as training accuracy consistently exceeds validation accuracy.
Innovative Use of Vision Transformer Networks: Arshed et al. [11] investigate the
application of Vision Transformer (ViT) networks in combination with convolutional
neural network-based pre-trained models. This novel approach takes advantage of the
strengths of both architectures, enhancing performance in multiclass skin cancer
classification. The study provides competitive results in benchmarking against various
advanced methods. However, the high computational demand of Vision Transformers and
challenges in model interpretability remain as significant concerns.
This work sets the stage for 'SpectraDerm Vision,' a new Dual Attention model built on the
modified MobileNet-V2 architecture, promising further improvements in multi-class skin
cancer diagnosis through advanced preprocessing and colour consistency techniques.
2.4. Summary
The review underscores a pivotal trend towards integrating deep learning with
traditional and novel optimization techniques to enhance the detection and classification of
skin cancer. These developments reflect a significant potential for improving diagnostic
accuracy and patient management.
Real-world Application and Validation: The transition from controlled test environments
to real-world clinical settings remains a significant challenge. Models that perform well on
datasets like ISIC may not necessarily translate directly to clinical success without thorough
real-world testing and validation to ensure they meet rigorous clinical standards and can
function effectively within existing medical workflows.
Scalability: The scalability of AI diagnostic tools is critical. Models must adapt to varying
levels of healthcare infrastructure, from small clinics to large hospitals, ensuring that all
patients benefit from advancements in AI diagnostics regardless of the setting.
Data Privacy and Security: Ensuring the security and privacy of patient data is paramount,
especially as models become more integrated into healthcare systems. Compliance with
medical regulatory bodies is an essential step in making the deployment of AI in the
medical sector more feasible.
Bias and Fairness: It is necessary to address biases in AI models to ensure that they cater
to diverse groups of patients. This involves training models on diverse datasets to prevent
biases that could adversely affect diagnostic accuracy for certain demographic groups.
Continuous Learning and Adaptation: The ability of AI models to learn and adapt
continuously from new data is a promising aspect that can keep them relevant and improve
their diagnostic precision over time.
2.6.3 Scalability and Adaptability:
As AI-based diagnostic models evolve, the scalability and adaptability of these systems
become crucial. Models must be designed to scale effectively across different healthcare
environments and adapt to changes in technology and medical practice. This includes being
able to update algorithms as new data becomes available, ensuring that models remain
current and effective.
2.7. Conclusions
The chapter concludes by reiterating the importance of continued research and
innovation in the field of dermatological imaging, highlighting how the proposed SpectraDerm
Vision model is poised to advance the capabilities of skin cancer diagnostics.
CHAPTER 3
METHODOLOGY
3.1. Introduction
This chapter presents the comprehensive methodology employed for the project
focused on classifying various skin cancers using advanced machine learning techniques. It
elaborates on the dataset used, preprocessing steps, the innovative architecture of the proposed
model, and preliminary results, and draws conclusions from these analyses.
3.2. Methodology
• Dataset Used:
The ISIC (International Skin Imaging Collaboration) 2019 [12,13] dataset is a
comprehensive collection of dermoscopic images designed to support the classification
of skin lesions into various diagnostic categories. This dataset is utilized for training
machine learning models to enhance the accuracy of skin cancer detection and
classification. The dataset includes images from the ISIC challenges of 2018 and 2017,
resulting in a substantial compilation of 25,331 images.
This dataset supports the classification of skin lesions into multiple diagnostic
categories, which include:
• Melanoma (MEL)
• Basal Cell Carcinoma (BCC)
• Squamous Cell Carcinoma (SCC)
• Actinic Keratosis (AK)
• Dermatofibroma (DF)
• Melanocytic Nevus (NV)
• Vascular Lesion (VASC)
• Benign Keratosis (BKL, including solar lentigo, seborrheic keratosis, and lichen
planus-like keratosis)
• Data Split
The dataset is randomly sampled into a train set, a validation set, and a test set. The
train–test split is 8:2, and the train–validation split within the training set is 9:1. This
determines the distribution of images across the splits for each diagnostic category; a
minimal splitting sketch is shown below.
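The following is a minimal sketch of the described 8:2 train–test and 9:1 train–validation split, assuming scikit-learn and that image paths and labels are already loaded into parallel lists; stratification is an added assumption used here to preserve class proportions in each split.

from sklearn.model_selection import train_test_split

def split_dataset(image_paths, labels, seed=42):
    # 80% train+validation, 20% test
    x_trainval, x_test, y_trainval, y_test = train_test_split(
        image_paths, labels, test_size=0.2, stratify=labels, random_state=seed)
    # Within the training portion: 90% train, 10% validation
    x_train, x_val, y_train, y_val = train_test_split(
        x_trainval, y_trainval, test_size=0.1, stratify=y_trainval,
        random_state=seed)
    return (x_train, y_train), (x_val, y_val), (x_test, y_test)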
• Preprocessing
The presence of hair in skin images can interfere with the accurate diagnosis and
classification of skin conditions, such as melanoma. Removing hair from these images
is crucial for improving the performance of machine learning algorithms used in medical
image analysis. This report outlines a method for hair removal using image processing
techniques.
Algorithm Description:
The algorithm for hair removal consists of several key steps:
• Convert Image to Grayscale: Hair removal is performed on grayscale images
to simplify processing.
• Apply Gaussian Blur: Gaussian blur is used to smooth the image, reducing
noise and small details that can interfere with hair detection.
• Thresholding: Thresholding helps in differentiating hair from the skin. This step
converts the grayscale image into a binary image where the hair appears as white
regions.
• Morphological Operations: Morphological operations, such as closing and
opening, are applied to refine the detected hair regions.
• Inpainting: The detected hair regions are inpainted using the surrounding skin
pixels to remove the hair from the image.
Detailed Steps:
• Convert Image to Grayscale: The input color image is converted to a
grayscale image using the cv2.cvtColor function.
• Apply Gaussian Blur: Gaussian blur is applied to the grayscale image to smooth
it.
• Thresholding: Otsu's thresholding method is used to create a binary image
where the hair appears white on a black background.
• Morphological Operations: Morphological closing is applied to fill small holes
in the hair regions.
• Inpainting: The hair regions in the binary image are inpainted in the original
grayscale image.
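The following is a compact OpenCV sketch of the five steps above; the blur window, the inverted threshold polarity (so that dark hair appears white in the mask), the structuring-element size, and the inpainting radius are illustrative assumptions rather than the project's exact settings.

import cv2

def remove_hair(image_bgr):
    # 1. Convert to grayscale to simplify processing
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # 2. Gaussian blur to suppress noise and small details
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    # 3. Otsu's thresholding; THRESH_BINARY_INV turns dark hair white
    _, mask = cv2.threshold(blurred, 0, 255,
                            cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # 4. Morphological closing fills small holes in the hair mask
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    # 5. Telea inpainting replaces masked pixels using surrounding skin
    return cv2.inpaint(image_bgr, mask, 3, cv2.INPAINT_TELEA)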
Mathematical Explanation
The mathematical foundation of the hair removal algorithm can be broken down into
several key components:
• Grayscale Conversion: Grayscale conversion is an important image processing
technique used to reduce a colour image to a single-channel image, where each
pixel merely represents a shade of gray rather than colour. This simplification is
achieved by eliminating the hue and saturation information while retaining the
luminance. It is particularly useful in applications where colour does not carry
essential information, such as texture analysis, edge detection, and, in our case,
skin cancer detection, where structural properties may be more informative than
colour. The conversion is done using the standard luminance formula:

Y = 0.299 R + 0.587 G + 0.114 B

• Gaussian Blur: The grayscale image is smoothed by convolving it with a
Gaussian kernel:

G(x, y) = (1 / (2πσ²)) exp(−(x² + y²) / (2σ²))

where G(x, y) is the Gaussian kernel, and σ is the standard deviation. The
convolution of the image with this kernel results in a blurred image.
• Otsu's Thresholding: Otsu's thresholding is an automatic technique used to
convert a grayscale image into a binary image. The two classes of pixels
(background and foreground) form a bi-modal histogram, from which the
algorithm computes the optimum threshold separating the classes. The threshold
is determined such that the intra-class variance (the variance within each class)
is minimized. This method is extremely effective when the histogram of the
image has a distinct separation between the pixel intensity values of the
foreground and background.

• Morphological Closing: The binary hair mask is refined by morphological
closing, defined as

A • B = (A ⊕ B) ⊖ B

where A is the input image, B is the structuring element, ⊕ denotes dilation,
and ⊖ denotes erosion. This operation helps in filling small holes in the binary
image.
• Inpainting: Inpainting reconstructs lost or deteriorated parts of an image. It is
commonly used to restore damaged photographs or artworks, remove objects from
images, or repair areas where data is missing. The Telea inpainting technique, one
method of achieving this, uses fast marching methods (a numerical technique for
solving boundary value problems) to propagate pixel values into the regions to be
inpainted. It works by iteratively replacing the pixel values in the damaged region
with a combination of surrounding pixel values, taking into account the geometric
and photometric properties of the image for seamless blending.
Before proceeding further, please note that the images depicted below have undergone
a series of preprocessing techniques, including grayscale conversion, Gaussian blur, Otsu’s
thresholding, morphological operations, and inpainting as described above. These images
represent various classes of carcinoma and are provided to illustrate the step-by-step process
of how masks are created and applied to each image. This visualization helps demonstrate the
transformation from raw dermoscopic images to enhanced versions, highlighting how each
preprocessing step clarifies and emphasizes critical features necessary for the accurate
classification and analysis of skin cancer lesions. By showcasing these processed images, we
aim to underline the effectiveness of our methods in improving diagnostic accuracy and visual
clarity, essential for reliable medical analysis.
Fig. 3.1(a)-(d) Illustration of preprocessing techniques applied to various classes of malignant skin cancer images along with
benign samples. The first column shows the original unprocessed images, the second column displays features resembling black
hair, the third column represents pure dark pixels, the fourth column represents continuous extensions of similar-density pixels,
the fifth column represents the combined mask formed from the other masks, and the sixth column represents the net filtered
output. a) Melanoma and Melanocytic Nevus, b) Basal Cell Carcinoma and Actinic Keratosis, c) Benign Keratosis and
Dermatofibroma, d) Vascular Lesions and Squamous Cell Carcinoma.
Fig. 3.2. Architecture of the proposed Multiscale-FocusNet, comprising feature extraction (MB Block), multi-scale feature
learning (Block) and final prediction (Final Block)
• Batch Normalization and ReLU Activation: Applied sequentially after each
convolution, batch normalization normalizes the outputs, stabilizing the learning
process by reducing internal covariate shift. This is complemented by ReLU
(Rectified Linear Unit) activation, which introduces non-linearity to the model,
enabling it to learn and represent more complex patterns and distinctions in the
input data.
Fig. 3.3 Architecture of proposed Feature extraction block – MB Block, comprising activation layers, pooling
layers and dual headed attention layer at the end.
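As a minimal sketch of the convolution → batch normalization → ReLU ordering described above, assuming a Keras-style functional API (the filter count and kernel size are placeholders, not the project's exact configuration):

from tensorflow.keras import layers

def conv_bn_relu(x, filters, kernel_size=3):
    # Convolution; bias is omitted since BN supplies a learned shift
    x = layers.Conv2D(filters, kernel_size, padding="same", use_bias=False)(x)
    # Batch normalization stabilizes learning by normalizing activations
    x = layers.BatchNormalization()(x)
    # ReLU introduces the non-linearity described above
    return layers.ReLU()(x)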
• Dilated Kernels: Expand the receptive field without reducing the resolution or
increasing the number of parameters. The kernel used is filled with spaces to
increase the field size (atrous effect).
• Atrous Convolution: Enhances the model's ability to incorporate wider
contextual information without losing detail in the image's resolution. By
adjusting the dilation rate, atrous convolutions can gather information from a
larger area, helping the model to better understand and interpret broader spatial
relationships within the image, which is essential for effective lesion analysis
[17].
Fig. 3.4 Architecture of proposed Multi-scale Feature Learning block, comprising various convolution layers with
stabilising dropout and batch normalisation layers.
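A one-line Keras sketch illustrates the idea (the filter count and dilation rate are illustrative assumptions):

from tensorflow.keras import layers

# A 3x3 kernel with dilation_rate=2 covers a 5x5 area of the input while
# keeping only 9 learned weights and the full spatial resolution.
atrous = layers.Conv2D(64, kernel_size=3, padding="same", dilation_rate=2)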
• Channel Attention: Recalibrates the relevance of each channel based on its
content; this module ensures that features which are more predictive of skin
cancer characteristics are emphasized during the learning process, thereby
enhancing the overall discriminative capability of the model [16].
• Spatial Attention: Directs the model’s focus to crucial spatial regions within
the image. This mechanism refines the feature maps to concentrate more on
areas likely to contain diagnostic features of melanomas, thus improving the
model’s accuracy by reducing distractions from less relevant regions.
Fig. 3.5 Dual Attention Module: Spatial attention coupled with Channel attention, comprising two fully connected layers
(FC1 and FC2) and Global Average Pooling (GAP)
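A minimal Keras sketch of such a dual attention module follows, using the GAP → FC1 → FC2 channel branch named in Fig. 3.5 and a convolutional spatial branch; the reduction ratio, kernel size, and layer shapes are assumptions, not the project's exact design.

import tensorflow as tf
from tensorflow.keras import layers

def dual_attention(x, reduction=8):
    channels = x.shape[-1]
    # Channel attention: GAP -> FC1 (reduce) -> FC2 (restore) -> sigmoid gate
    gap = layers.GlobalAveragePooling2D()(x)
    fc1 = layers.Dense(channels // reduction, activation="relu")(gap)
    fc2 = layers.Dense(channels, activation="sigmoid")(fc1)
    scale = layers.Reshape((1, 1, channels))(fc2)
    x_ca = layers.Multiply()([x, scale])        # reweight each channel
    # Spatial attention: mean over channels -> conv -> sigmoid spatial map
    avg_map = layers.Lambda(
        lambda t: tf.reduce_mean(t, axis=-1, keepdims=True))(x_ca)
    sa = layers.Conv2D(1, kernel_size=7, padding="same",
                       activation="sigmoid")(avg_map)
    return layers.Multiply()([x_ca, sa])        # reweight each location

# Example usage on a placeholder feature map:
inp = tf.keras.Input(shape=(56, 56, 32))
out = dual_attention(inp)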
• Loss Function: The loss compares the predicted probability distribution with
the actual category labels across multiple classes. By doing so, it drives the
model to refine its predictions across all categories, enhancing the accuracy
and precision in classifying different types of skin lesions.
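Assuming the standard categorical cross-entropy, which pairs naturally with the SoftMax output described next, the loss for one sample with one-hot labels y and predicted probabilities p over K classes is:

L = − Σ_i y_i log(p_i), summing over the K classes.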
Fig. 3.6 Final Block, comprising global average pooling and dense layers to condense the information for prediction
3.2.5.5 SoftMax:
• The SoftMax function is a mathematical function commonly used in
classification models, particularly in the final layer of neural networks, to
transform raw output scores from the model into probabilities. It works by
exponentiating and then normalizing the output scores, ensuring that the sum
of the probabilities of all output classes equals one. This function is ideal for
multiclass classification problems as it provides a clear, probabilistic output
for each class, making it easier to determine which class the input most likely
belongs to, based on the highest probability.
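Concretely, for a vector of logits z over K classes, the probability assigned to class i is

softmax(z)_i = exp(z_i) / Σ_j exp(z_j), j = 1, …, K

so every output lies in (0, 1) and the K outputs sum to one.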
Fig. 3.7 a) Example of how the sigmoid processes a block of input values, b) ideal waveform of the SoftMax function
• In the SpectraDerm Vision model, the output from the final block, which
contains the refined and processed features, is fed into the SoftMax function
to obtain the final predictions. This step is crucial as it converts the logits
(raw classification scores) from the final block into probabilities that sum to
one. These probabilities represent the model’s confidence in each class of
skin cancer, allowing for a clear and decisive classification based on which
class has the highest probability. This final step is essential for ensuring that
the model's predictions are both interpretable and actionable, facilitating
accurate diagnostic decisions in clinical settings.
Its low computational footprint also makes the model well suited to resource-
constrained environments such as mobile devices or remote clinical settings where computing
power is limited.
3.4. Conclusions
The SpectraDerm Vision model represents a significant advancement in the non-
invasive and early detection of skin cancer. Its low computational demand paired with high
accuracy offers a promising solution for early detection campaigns, potentially improving
prognosis through timely and precise diagnosis. The model's performance, enhanced by
advanced architectural features such as dual-headed attention and dilated convolutions,
demonstrates a successful balance between model complexity and diagnostic efficacy. This
chapter underscores the potential for SpectraDerm Vision to revolutionize dermatological
screenings by providing a tool that is both accessible and effective in real-world settings.
CHAPTER 4
RESULT ANALYSIS
4.1. Introduction
This chapter delves into the evaluation of the SpectraDerm Vision model,
specifically its performance on the ISIC 2019 Challenge dataset. We employ a
detailed set of evaluation metrics to thoroughly assess the model’s capability in diagnosing
various classes of skin cancer. The results are showcased through both graphical
representations and detailed tables, providing a clear visual and statistical analysis of the
model's effectiveness. This section will also explore the significance of the findings,
pinpoint any unexpected outcomes, and explore potential reasons for such deviations.
To comprehensively evaluate our model, we utilize a set of diverse metrics that
collectively gauge the diagnostic accuracy of the SpectraDerm Vision model in
distinguishing among different skin lesion types from dermoscopic images. These key
metrics include sensitivity, specificity, accuracy, and the F1-score, each offering insights
into different aspects of diagnostic performance.
4.2.1 Macro Average: The Macro Average calculates the average of the metric for
each class independently and does not take class size into account. This method
treats all classes equally, regardless of their frequency in the dataset. It provides
a measure of the model's effectiveness across all classes and can highlight if the
model is underperforming in any smaller classes.
Table 4.1 Result table for macro average metrics

Metrics            Results
Macro Precision    94.68%
Macro Recall       91.94%
Macro F1 Score     92.16%

4.2.2 Micro Average: The Micro Average aggregates the contributions of all classes
before computing the metric, so classes with more instances carry proportionally
more weight in the final value.

Table 4.2 Result table for micro average metrics

Metrics            Results
Micro Precision    93.77%
Micro Recall       90.94%
Micro F1 Score     92.06%
4.2.3 Weighted Average: The Weighted Average adjusts the metrics to account for
the number of instances (support) of each class, which is essential for
imbalanced datasets. This average multiplies each class metric by the class’s
support, sums these products, and divides by the total number of instances
across all classes. This method ensures that classes with more instances have a
proportionately greater impact on the overall metric.
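All three averages can be computed directly from the predictions, for example with scikit-learn; in this sketch the label arrays are placeholders for the actual ground-truth and predicted class indices.

from sklearn.metrics import precision_recall_fscore_support

y_true = [0, 1, 2, 2, 1, 0, 2]   # placeholder ground-truth labels
y_pred = [0, 1, 2, 1, 1, 0, 2]   # placeholder model predictions
for avg in ("macro", "micro", "weighted"):
    p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred, average=avg)
    print(f"{avg}: precision={p:.4f} recall={r:.4f} f1={f1:.4f}")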
Table 4.3 Result table for weighted average metrics

Metrics              Results
Weighted Precision   94.76%
Weighted Recall      92.58%
Weighted F1 Score    92.14%
4.2.4 Confusion Matrix: The confusion matrix below offers a detailed breakdown of
the SpectraDerm Vision model’s performance across different diagnostic categories.
This matrix is an essential tool for visualizing the model's ability to correctly classify
each type of skin lesion, as well as identifying which types are most frequently confused
with others.
Fig. 4.1 Confusion Matrix depicting the predicted instances of each class from the dataset.
Each column of the matrix represents the instances of a predicted class, while each row
represents the instances of the actual class. This setup allows us to easily observe not only
the true positive rates for each class but also the specific types of errors, such as false
positives and false negatives, that the model is making. This information is critical for
understanding the model's strengths and weaknesses in classifying different types of skin
cancers and guiding further improvements.
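Such a matrix can be generated directly from the predictions, for example with scikit-learn, using the eight ISIC 2019 class abbreviations listed earlier; the label arrays here are placeholders.

from sklearn.metrics import confusion_matrix

class_names = ["MEL", "BCC", "SCC", "AK", "DF", "NV", "VASC", "BKL"]
y_true = [0, 5, 1, 5, 7]          # placeholder ground-truth class indices
y_pred = [0, 5, 1, 7, 7]          # placeholder predicted class indices
cm = confusion_matrix(y_true, y_pred, labels=list(range(len(class_names))))
print(cm)  # rows = actual classes, columns = predicted classes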
4.5. Deviations from Expected Results & Justification:
While SpectraDerm Vision performs exceptionally well across most standard metrics,
such as sensitivity and F1-score, there is a noted slight deviation in achieving a lower loss value
than anticipated. This discrepancy could be attributed to the inherent challenges in
differentiating between certain benign and malignant lesions that exhibit similar dermatoscopic
features. These lesions can sometimes blur the distinct boundaries that the model relies on for
classification, leading to slight inaccuracies in the final loss calculations. Further refinement of
the feature extraction and classification layers could potentially reduce this loss, enhancing the
model’s ability to discern nuanced differences in skin lesions more effectively.
Table 4.4 Comparison of the proposed method with existing approaches

Methods                     Feature Extractor   Dataset         Classifier                               Specificity (%)   Sensitivity (%)   F1-Score (%)   Accuracy (%)
Ozkan et al. [2]            –                   PH2             ANN                                      96.11             90.86             –              92.50
Rebouças Filho et al. [3]   LBP                 PH2             MLM                                      98.80             98.33             98.34          98.33
                            GLCM                PH2             MLM-NN                                   95.71             90.83             91.23          90.83
                            HU                  PH2             SVM                                      91.90             83.33             84.02          83.33
Toğaçar et al. [4]          –                   ISIC            MobileNetV2 + Autoencoder without SNN    93.44             93.01             93.83          93.20
                            –                   ISIC            MobileNetV2 + Autoencoder using SNN      95.15             95.36             95.68          95.27
Sayed et al. [6]            –                   ISIC-2020       SqueezeNet                               96.75             100               98.40          98.37
Hosny et al. [5]            –                   PH2-Original    DCNN                                     83.33             72.92             –              80
                            –                   PH2-Augmented   DCNN                                     98.93             98.83             –              98.61
Magdy et al. [7]            VGG16               ISIC            KNN-PDNN                                 99.75             99                99.37          99.38
                            MobileNetV2         ISIC            KNN                                      98.50             90.25             94.13          94.38
Thanga Purani et al. [9]    –                   ISIC            EOSA-Net                                 –                 –                 –              93.05
Akter et al. [10]           –                   HAM10000        InceptionNetV3                           –                 –                 –              90
                            –                   HAM10000        Xception                                 –                 –                 –              88
                            –                   HAM10000        DenseNet                                 –                 –                 –              88
Arshed et al. [11]          –                   HAM10000        ViT                                      92.16             92.14             92.17          92.14
Proposed Method             –                   ISIC-2019       SpectraDerm Vision                       91.21             92.74             –              94.85
4.6. Conclusions:
The SpectraDerm Vision model has demonstrated remarkable effectiveness in the
classification of various types of skin cancer, consistently outperforming existing models
across key diagnostic metrics such as accuracy, sensitivity, and F1-score. The model’s novel
integration of dual-headed attention mechanisms and specialized convolutional blocks
significantly boosts its capability to accurately identify critical dermatoscopic features. Despite
its compact size, the model does not compromise on performance, establishing it as a
formidable tool for clinical application.
The architecture’s ability to maintain high precision while being lightweight makes it
especially suitable for deployment in resource-constrained environments, enhancing
accessibility and efficiency in skin cancer diagnostics. Looking forward, there is substantial
potential for further enhancements to the model. Improvements in training methods, dataset
diversity, and real-time processing capabilities could broaden its applicability, making
SpectraDerm Vision a pivotal component in advancing dermatological diagnostics.
This chapter not only underscores the current achievements of the SpectraDerm Vision model
but also sets the stage for further discussions on potential improvements and the broader
implications of deploying this technology in real-world clinical settings. The overarching goal
remains to enhance diagnostic accuracy and accessibility, thereby significantly improving
patient outcomes in the healthcare system.
CHAPTER 5
CONCLUSION AND FUTURE SCOPE OF WORK
5.2. Conclusions
SpectraDerm Vision has showcased remarkable capabilities in the multiclass
classification of various skin cancers, achieving a Macro Precision of 94.68%, a Macro
Recall of 91.94%, and a Macro F1 Score of 92.16%. These metrics are particularly
telling of the model's proficiency in handling class imbalances and accurately
distinguishing between multiple types of skin lesions, a crucial feature for effective clinical
diagnostics.
The application of a dual-headed attention mechanism significantly enhances the
model's ability to focus on relevant features within complex dermatoscopic images. This
refinement in focus aids in distinguishing between benign and malignant lesions across
different categories of skin cancer, enhancing diagnostic accuracy. The high macro-averaged
scores across precision, recall, and F1 demonstrate that SpectraDerm Vision not only
performs well on average but also maintains consistent accuracy across various classes, an
essential factor in clinical settings where misclassification can have serious implications.
The efficacy of SpectraDerm Vision in a multiclass diagnostic context underscores
its substantial potential for integration into clinical decision-support systems, offering a
significant enhancement over existing state-of-the-art models. This performance boost is
particularly valuable in dermatological healthcare, where the accurate early detection of
skin cancer can dramatically improve patient management and outcomes.
Looking ahead, there is considerable scope for further enhancing SpectraDerm
Vision. Future work can explore expanding the model's capabilities to include even more
diverse types of skin conditions, further refining its classification algorithms to improve
specificity and sensitivity across underrepresented classes. Additionally, clinical trials in
real-world settings are necessary to validate the model's effectiveness in practical
applications and to gather feedback for iterative improvements. Ultimately, the goal is to
integrate SpectraDerm Vision into widespread clinical use, providing dermatologists with
a powerful tool for early and accurate detection of skin cancer, thereby enhancing the
quality of care and patient outcomes.
5.3.4 Continuous Learning: Enabling the model to learn from new data as it becomes
available, without needing explicit retraining phases. In this way the model can
adjust to new trends in dermatological conditions, variations in imaging
technology, and evolving medical standards, maintaining its relevance and
accuracy.
5.3.5 Enhanced Model Explainability: Increasing the explainability of the model can
build trust among clinicians and patients by making its decision-making process
more transparent. Techniques such as feature importance mapping and the
integration of explainable AI frameworks can help in illustrating why certain
diagnostic conclusions were reached, which is particularly important in medical
settings for justifying treatment decisions.
CHAPTER 6
HEALTH, SAFETY, RISK AND ENVIRONMENT ASPECTS
6.3. Concluding Remarks
The health, safety, risk, and environmental aspects of SpectraDerm Vision
have been thoroughly managed to ensure that the technology not only advances
dermatological diagnostics but does so in a manner that is safe, secure, and
sustainable. As the technology moves forward, continuous improvement in these areas
will remain a priority to address any new challenges that arise and to adapt to
evolving standards and expectations in healthcare technology.
REFERENCES
[13]. Rotemberg, V., Kurtansky, N., Betz-Stablein, B., Caffery, L., Chousakos, E., Codella,
N., Combalia, M., Dusza, S., Guitera, P., Gutman, D., Halpern, A., Helba, B., Kittler,
H., Kose, K., Langer, S., Lioprys, K., Malvehy, J., Musthaq, S., Nanda, J., Reiter, O.,
Shih, G., Stratigos, A., Tschandl, P., Weber, J., Soyer, P.: A patient-centric dataset of
images and metadata for identifying melanomas using clinical context. Sci Data 8
(2021) 34. https://doi.org/10.1038/s41597-021-00815-z.
[14]. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.C.: Mobilenetv2: Inverted
residuals and linear bottlenecks. In Proc. IEEE conference on Computer Vision and
Pattern Recognition (2018) 4510-4520. https://doi.org/10.1109/CVPR.2018.00474.
[15]. Chollet, F.: Xception: Deep learning with depthwise separable convolutions. In Proc.
IEEE Conference on Computer Vision and Pattern Recognition (2017) 1251-1258.
[16]. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L.,
Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing
Systems (2017) 30.
[17]. Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions. In
International Conference on Learning Representations (ICLR) (2016).
[18]. Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep
convolutional neural networks. Advances in neural information processing systems
(2012) 25.
[19]. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In
Proceedings of the IEEE conference on computer vision and pattern recognition (2016)
770-778.
[20]. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image
recognition. In ICLR (2015).
PROJECT DETAILS
Student Details

Student Name: Shambhavi Sinha
Register Number: 200907122        Section / Roll No: A/20
Email Address: [email protected]        Phone No (M): 7667388068

Student Name: Manan Bhatt
Register Number: 200907188        Section / Roll No: B/27
Email Address: [email protected]        Phone No (M): 9769325418

Project Details

Project Title: SpectraDerm Vision: Deep Learning for Multiclass Skin Cancer Analysis
Project Duration: 5 months        Date of reporting: 1st July, 2024

Organization Details

Organization Name: Manipal Institute of Technology
Full postal address with pin code: Manipal Institute of Technology, Eshwar Nagar, Manipal, Karnataka, 576104
Website address: www.manipal.edu