SKIN CANCER IMAGE RECOGNITION USING THE HAM10000 DATASET

Group 5 - FPT University

February 21, 2025

Abstract

This paper applies machine learning to skin cancer recognition from medical images
using the HAM10000 dataset. We use Explainable AI (XAI) methods, namely
counterfactual explanations, LIME, and SHAP, to interpret the model’s decisions.
In our experiments, the model achieves 98.20% accuracy on the test set with a loss
of 0.0898, and the XAI methods help clarify the basis of its predictions.

1 Introduction
Skin cancer is one of the most common types of cancer. Early detection from skin
images can improve treatment success rates. The HAM10000 dataset contains 10,015
labeled skin lesion images and supports deep learning research on automatic skin
cancer recognition. In addition, Explainable AI (XAI) methods such as counterfactual
explanations, LIME, and SHAP help doctors better understand model decisions.

2 Research Methods

2.1 Dataset

The HAM10000 dataset was collected from different patients and covers seven types of
skin lesions, both malignant and benign. The data was balanced and preprocessed by
normalizing the images, applying data augmentation, and converting the images into a
format suitable for training.
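
The paper does not specify which normalization or augmentations were used. As a minimal sketch, assuming pixel scaling to [0, 1] and simple flips (the function names `normalize` and `augment` are illustrative only), the preprocessing step could look like this:

```python
import numpy as np

def normalize(image):
    """Scale uint8 pixel values from [0, 255] to float32 values in [0, 1]."""
    return image.astype(np.float32) / 255.0

def augment(image):
    """Return the image plus a horizontally and a vertically flipped copy.
    Flips are an assumed augmentation, not the one stated in the paper."""
    return [image, np.flip(image, axis=1), np.flip(image, axis=0)]

# A random 4x4 RGB array stands in for a real lesion image.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)

# Convert the augmented variants into one float array ready for a model.
batch = np.stack([normalize(variant) for variant in augment(image)])
print(batch.shape)
```

Each source image yields three training samples here; a real pipeline would typically also resize images to the network's input size.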

2.2 Algorithms Used

2.2.1 Counterfactual-based XAI

The counterfactual-based XAI method generates samples that differ only slightly from
the original input in order to identify the factors that most influence the model’s
decision. If a small change to a factor substantially alters the prediction, that
factor has a strong impact on the model’s output.
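
The idea can be illustrated on a toy problem. The sketch below is not the paper's implementation: it assumes a hypothetical three-feature linear classifier (`predict`) and searches, one feature at a time, for the smallest change that flips the prediction.

```python
import numpy as np

def predict(x, w, b):
    """Toy binary classifier: 1 if the linear score is positive, else 0."""
    return int(x @ w + b > 0)

def single_feature_counterfactuals(x, w, b, step=0.05, max_steps=200):
    """For each feature, find the smallest change (a multiple of `step`, in
    either direction) that flips the prediction; None if no flip is found."""
    original = predict(x, w, b)
    result = {}
    for i in range(len(x)):
        result[i] = None
        for k in range(1, max_steps + 1):
            for sign in (1.0, -1.0):
                candidate = x.copy()
                candidate[i] += sign * k * step
                if predict(candidate, w, b) != original:
                    result[i] = sign * k * step
                    break
            if result[i] is not None:
                break
    return result

# Hypothetical weights and input, chosen only for illustration.
w = np.array([2.0, -0.5, 0.0])
b = -0.52
x = np.array([0.5, 0.4, 0.3])
changes = single_feature_counterfactuals(x, w, b)
print(changes)
```

Feature 0 flips the prediction with the smallest change, so it is the most influential; feature 2 has zero weight and never flips it. For images, the perturbed factors would be pixels or regions rather than scalar features.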

2.2.2 LIME

LIME (Local Interpretable Model-agnostic Explanations) generates a set of simulated
data points close to the original input and then trains a simple model (often a
linear model) to approximate the decisions of the main model within that local region.
The weights of the linear model indicate which parts of the input have the most
influence on the model’s prediction.
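
The procedure described above can be sketched with NumPy alone. This is a simplified illustration on tabular features, not the `lime` library and not the paper's code; `black_box`, the Gaussian sampling, and the exponential proximity kernel are all assumptions made for the example.

```python
import numpy as np

def black_box(X):
    """Stand-in for the main model: a nonlinear score of two features."""
    return np.tanh(3.0 * X[:, 0]) + 0.1 * X[:, 1] ** 2

def lime_explain(x, model, n_samples=2000, scale=0.1, kernel_width=0.25, seed=0):
    """Fit a proximity-weighted linear surrogate around x and return its slopes."""
    rng = np.random.default_rng(seed)
    # 1. Sample simulated points close to the instance being explained.
    X = x + rng.normal(0.0, scale, size=(n_samples, x.size))
    y = model(X)
    # 2. Weight each sample by its proximity to x.
    dist = np.linalg.norm(X - x, axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 3. Solve the weighted least-squares problem (intercept + slopes).
    A = np.hstack([np.ones((n_samples, 1)), X - x])
    sw = np.sqrt(weights)[:, None]
    coef, *_ = np.linalg.lstsq(A * sw, y * sw[:, 0], rcond=None)
    return coef[1:]

x = np.array([0.0, 1.0])
coefs = lime_explain(x, black_box)
print(coefs)
```

The first slope is much larger than the second, which indicates that feature 0 dominates the model's behavior near `x`. For images, LIME usually perturbs superpixels rather than raw feature values.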

2.2.3 SHAP

SHAP (SHapley Additive exPlanations) is based on cooperative game theory. It uses
Shapley values to measure the contribution of each input feature to the prediction,
so that the contributions add up to the difference between the model’s output and a
baseline output.
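
To make the Shapley value concrete, the sketch below computes exact values by enumerating every feature coalition. It is an illustration under stated assumptions, not the `shap` library: the toy `model`, the input `x`, and the all-zero `baseline` are hypothetical, and exact enumeration is only feasible for a handful of features.

```python
from itertools import combinations
from math import factorial

def shapley_values(value, n_features):
    """Exact Shapley values for a coalition value function over n features."""
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Probability weight of a coalition of this size preceding feature i.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

x = (1.0, 2.0, 3.0)
baseline = (0.0, 0.0, 0.0)

def model(v):
    """Toy model with one additive term and one interaction term."""
    return 2.0 * v[0] + v[1] * v[2]

def value(coalition):
    """Model output when only the features in `coalition` take their real values."""
    mixed = [x[j] if j in coalition else baseline[j] for j in range(len(x))]
    return model(mixed)

phi = shapley_values(value, len(x))
print(phi)
```

The three values sum to `model(x) - model(baseline)`, and the interaction term is split equally between features 1 and 2.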

3 Experiments and Results

Table 1: Model training results.

Metric                      Value
Number of training images   10,015
Number of test images       2,003
Test set accuracy           98.20%
Loss function (Loss)        0.0898
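
The paper does not state which loss function produced the reported value. Assuming categorical cross-entropy, which is common for multi-class image classification, the two metrics can be computed from predicted class probabilities as in this sketch (the probabilities and labels are made-up examples, not the paper's data):

```python
import numpy as np

def accuracy(probs, labels):
    """Fraction of samples whose highest-probability class matches the label."""
    return float(np.mean(np.argmax(probs, axis=1) == labels))

def cross_entropy(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(np.clip(picked, eps, 1.0))))

# Four hypothetical samples over three classes.
probs = np.array([
    [0.90, 0.05, 0.05],
    [0.20, 0.70, 0.10],
    [0.10, 0.30, 0.60],
    [0.50, 0.40, 0.10],
])
labels = np.array([0, 1, 2, 1])

acc = accuracy(probs, labels)
loss = cross_entropy(probs, labels)
print(acc, loss)
```

Three of the four samples are classified correctly, so the accuracy is 0.75; the loss also penalizes correct predictions that are made with low confidence.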

3.1 Model Performance Comparison

We conducted experiments with multiple models on the HAM10000 dataset to evaluate
performance. Table 2 presents the comparison results:

Table 2: Performance comparison of different models on the test set.

Model            Accuracy   Loss
ResNet50         96.85%     0.1123
VGG16            95.42%     0.1357
EfficientNetB0   97.68%     0.0985
Proposed Model   98.20%     0.0898

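The proposed model achieves the highest accuracy and the lowest loss of the four
models compared. Its accuracy exceeds that of EfficientNetB0, the strongest baseline,
by 0.52 percentage points, and that of ResNet50 and VGG16 by 1.35 and 2.78 percentage
points, respectively.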

4 Conclusion
Experimental results show that the model achieves high performance in skin cancer
recognition on the HAM10000 dataset. XAI methods such as counterfactual explanations,
LIME, and SHAP provide clearer information about the reasoning behind the model’s
predictions and thus better support clinical diagnosis.

