
IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 14, No. 1, February 2025, pp. 54–61
ISSN: 2252-8938, DOI: 10.11591/ijai.v14.i1.pp54-61

Predicting enhanced diagnostic models: deep learning for multi-label retinal disease classification

Sridhevi Sundararajan, Harikrishnan Ramachandran, Harshita Gupta, Yashraj Patil
Symbiosis Institute of Technology (SIT)-Pune Campus, Symbiosis International Deemed University (SIDU), Pune, India

Article Info

Article history:
Received Mar 29, 2024
Revised Jun 28, 2024
Accepted Jul 26, 2024

Keywords:
Classification
Deep learning
Medical imaging
Multi-label
Prediction models
Retinal disease

ABSTRACT

In this study, we assess three convolutional neural network (CNN) architectures—VGG16, ResNet50, and InceptionV3—for multi-label classification of fundus images in the retinal fundus multi-disease image dataset (RFMID2), comprising 860 images. Focusing on diabetic retinopathy, exudation, and hemorrhagic retinopathy, we preprocessed the dataset for uniformity and balance. Using transfer learning, the models were adapted for feature extraction and fine-tuned to our multi-label classification task. Their performance was measured by subset accuracy, precision, recall, F1-score, Hamming loss, and Jaccard score. VGG16 emerged as the top performer, with the highest subset accuracy (84.81%) and macro precision (95.83%), indicating its superior class-distinction capabilities. ResNet50 showed commendable accuracy (79.75%) and precision (86.70%), whereas InceptionV3 lagged with lower accuracy (66.67%) and precision (81.21%). These findings suggest VGG16's depth offers advantages in multi-label classification, highlighting InceptionV3's limitations in complex scenarios. This analysis helps optimize CNN architecture selection for specific tasks, suggesting future exploration of dataset variability, ensemble methods, and hybrid models for improved performance.

This is an open access article under the CC BY-SA license.

Corresponding Author:
Harikrishnan Ramachandran
Symbiosis Institute of Technology (SIT)-Pune Campus, Symbiosis International Deemed University (SIDU)
Pune 412115, India
Email: [email protected]

1. INTRODUCTION
The integration of AI-driven image processing in healthcare has marked a significant leap forward in
diagnosing critical retinal conditions such as diabetic retinopathy [1], exudation [2], and hemorrhagic retinopa-
thy [3]. These conditions, notorious for their potential to impair vision, are increasingly being detected with
advanced computational approaches, notably convolutional neural networks (CNNs) [4]. Our investigation
evaluates the capabilities of three distinguished CNN models—VGG16 [5], ResNet50 [6], and InceptionV3 [7]—on the retinal fundus multi-disease image dataset (RFMID2) [8], which encompasses 860 fundus images annotated for these specific diseases. The generation of the RFMID2 dataset is depicted in Figure 1.
Applying transfer learning [9] followed by feature extraction [10] of the fundal images, we aim to fine-tune
these models to the task of multi-label retinal disease classification. The RFMID2 dataset serves as an ideal
ground for this exploration, given its diverse and comprehensive annotation of retinal pathologies. By ensuring
a balanced dataset and employing rigorous preprocessing measures, we seek to establish a level field for eval-
uating model performance across key metrics [11]: subset accuracy, precision, recall, F1 score, hamming loss,
and Jaccard score [12]. As the global incidence of retinal diseases escalates, the urgency for developing rapid and reliable diagnostic tools becomes paramount. This study not only compares the effectiveness of different CNN architectures in automated retinal disease detection but also aims to contribute to the enhancement of diagnostic processes in ophthalmology. Through our findings, we anticipate setting the stage for future advancements in automated retinal imaging [13], potentially transforming patient care and management in the field.
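For reference, the less common of these metrics have standard definitions; with $N$ images, $L$ labels, true label set $Y_i$, and predicted label set $\hat{Y}_i$ for image $i$:

```latex
\text{Subset accuracy} = \frac{1}{N}\sum_{i=1}^{N} \mathbb{1}\big[\hat{Y}_i = Y_i\big],
\qquad
\text{Hamming loss} = \frac{1}{NL}\sum_{i=1}^{N}\sum_{j=1}^{L} \mathbb{1}\big[\hat{y}_{ij} \neq y_{ij}\big],
\qquad
\text{Jaccard score} = \frac{1}{N}\sum_{i=1}^{N} \frac{\lvert Y_i \cap \hat{Y}_i \rvert}{\lvert Y_i \cup \hat{Y}_i \rvert}
```

where $\hat{y}_{ij}$ is the predicted indicator for label $j$ of image $i$. The Jaccard score shown is the sample-averaged form; the macro and micro label-averaged variants are the ones reported later in Table 5.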

Figure 1. Creation of the retinal fundus multi-disease image dataset RFMID2 [8]

The advent of CNNs has significantly impacted machine learning [14], enhancing multi-label clas-
sification [15] tasks across various domains, including medical image analysis [16]. VGG16, ResNet, and
Inception are notable architectures that have contributed to advancements in this field. VGG16 is celebrated for
its depth and the use of small, uniform filters, making it adept at capturing detailed features crucial for medical
imaging. Its architectural simplicity has played a pivotal role in deepening our understanding of CNNs. ResNet
introduces skip connections, allowing the training of very deep networks by overcoming the vanishing gradi-
ent problem. This innovation enables the learning of complex patterns, illustrating the model’s versatility in
tasks such as disease classification from medical scans. Inception models, with their unique inception modules,
capture information at different scales. This capability is particularly beneficial for medical diagnostics, where
images may contain intricate patterns across various resolutions. Applying these CNNs in medical imaging
has transformed diagnostic methods, enabling the classification and diagnosis of multiple diseases from a sin-
gle image with unprecedented accuracy. This advancement not only simplifies the diagnostic process but also
improves early disease detection, potentially enhancing patient outcomes. The continuous evolution of CNN
architectures promises further advancements in medical image analysis. VGG16, ResNet, and Inception models, with their versatility and adaptability, point to a future where AI-driven diagnostics, combined with human expertise, offer faster and more precise diagnoses. Table 1 provides a brief review of deep learning techniques for retinal disease classification.
In the exploration of CNNs for retinal disease classification using datasets like RFMID2, researchers face a set of challenges but also encounter numerous opportunities for advancement. Understanding these can guide future research directions and help in overcoming existing limitations. Table 2 outlines the challenges and opportunities in retinal disease classification, from data collection to diagnosis.
The advent of deep learning has revolutionized the field of medical imaging, offering unprecedented
opportunities for enhancing diagnostic accuracy and patient care. CNNs, in particular, have shown remark-
able success in automatically detecting and classifying diseases from various types of medical images. This
potential is especially pertinent in the realm of ophthalmology, where conditions like diabetic retinopathy, ex-
udation, and hemorrhagic retinopathy can be identified through retinal imaging. Timely and precise detection
of these diseases is vital to prevent severe consequences, such as blindness. Thus, leveraging deep learning
for this purpose addresses a significant need in healthcare. In light of these developments, this work proposes
to apply and compare three state-of-the-art CNN architectures—VGG16, InceptionV3, and ResNet50—for the
classification of ophthalmic diseases in retinal images. By leveraging these models’ pre-trained weights for
feature extraction and employing custom classification layers, this study aims to investigate each architecture's efficacy in accurately identifying diabetic retinopathy, exudation, and hemorrhagic retinopathy. This comparative analysis will contribute to identifying the most suitable deep learning approach for enhancing diagnostic processes in ophthalmology, ultimately aiding in early disease detection and treatment planning. Table 3 summarizes the techniques and analysis carried out for this research.

Table 1. Review on deep learning approaches in retinal disease classification

- Multilevel glowworm swarm optimization convolutional neural network (MGSCNN) [17] for multi-disease classification: A method that integrates CNN with glowworm swarm optimization (GSO) [18] for classifier optimization, emphasizing normalization, smoothing, and resizing in preprocessing. This model showcases efficacy in enhancing classification accuracy for retinal images.
- Addressing class imbalance [19]: The literature provides various strategies to handle class imbalance problems, including resampling methods (both oversampling minority classes and undersampling majority ones), classifier adaptation that tailors the model to the dataset's imbalance, ensemble approaches that use multiple models, and cost-sensitive methods that employ custom weightings in the loss function to emphasize the minority classes.
- Optical coherence tomography (OCT) image classification with VGG-16 [20]: Describes the adaptation of VGG-16 using transfer learning for OCT retinal images, focusing on data augmentation and custom layers for disease classification. Includes performance evaluation metrics and Grad-CAM for model visualization [21].
- Publicly available fundus image datasets [22]: A variety of datasets have been reviewed and used in multi-label retinal disease classification, including the DRIVE, STARE, CHASE-DB, Messidor, and e-ophtha datasets. Each dataset has been developed for different diagnostic purposes, exhibiting diverse variations and abnormalities that are crucial for training deep learning models.
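As an illustrative sketch of the cost-sensitive strategy listed in Table 1 (assuming a multi-hot label matrix and TensorFlow; the negatives-to-positives weighting shown is one common choice, not a scheme prescribed by [19]):

```python
import numpy as np
import tensorflow as tf

# y_train: multi-hot label matrix of shape (num_images, num_labels).
y_train = np.array([[1, 0, 0], [1, 1, 0], [0, 0, 1], [1, 0, 0]], dtype=np.float32)

# One common weighting: negatives-to-positives ratio per label, so that
# positives of rare labels contribute more to the loss.
pos = y_train.sum(axis=0)
neg = y_train.shape[0] - pos
pos_weight = tf.constant(neg / np.maximum(pos, 1.0), dtype=tf.float32)

def weighted_bce(y_true, y_pred, eps=1e-7):
    """Binary cross-entropy with per-label up-weighting of positive examples."""
    y_pred = tf.clip_by_value(y_pred, eps, 1.0 - eps)
    loss = -(pos_weight * y_true * tf.math.log(y_pred)
             + (1.0 - y_true) * tf.math.log(1.0 - y_pred))
    return tf.reduce_mean(loss)
```

Such a loss can be passed to a Keras model in place of the standard binary cross-entropy, leaving the rest of the training pipeline unchanged.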

Table 2. From data to diagnosis: artificial intelligence challenges and opportunities in retinal disease classification

Challenges:
- Class imbalance: A common issue in medical datasets, including those for retinal diseases, is class imbalance. Minority classes (rarer diseases) are underrepresented, which can bias the model towards more frequent conditions. Addressing this requires innovative strategies beyond simple resampling, demanding more sophisticated approaches like cost-sensitive learning or synthetic data generation.
- Data quality and variability: The quality of retinal images can vary significantly due to different imaging conditions, patient demographics, and disease stages. This variability can hinder the model's ability to generalize from the training data to real-world applications.
- Interpretability and trust: For clinical applications, it is crucial that model predictions are interpretable by healthcare professionals. Developing models that offer transparency in their decision-making process remains a challenge, affecting their adoption in clinical settings.
- Integration with clinical workflows: Successfully integrating AI models into clinical workflows involves challenges related to system compatibility, data privacy, and the need for real-time analysis without disrupting existing procedures.

Opportunities:
- Advanced architectural innovations [23]: Exploring beyond VGG16, ResNet, and Inception models to newer architectures can provide opportunities for improved performance. Innovations in neural network design could offer solutions to class imbalance, enhance feature extraction, and improve model interpretability.
- Multi-modal data fusion [24]: Combining retinal fundus images with other types of medical data (e.g., OCT [25], patient demographics) through multi-modal learning approaches could enhance diagnostic accuracy and offer a more holistic view of patient health.
- Transfer learning and domain adaptation [26]: Leveraging transfer learning more effectively, including domain adaptation techniques, can help models better generalize across different datasets and imaging conditions, reducing the gap between research settings and real-world applications.
- Collaboration between AI researchers and clinicians: Strengthening the collaboration between AI researchers and clinical practitioners can lead to the development of more practical, user-friendly AI tools that fit seamlessly into clinical workflows and address real healthcare needs.

The study explores the impact of architectural variations inherent to VGG16, InceptionV3, and ResNet50. This includes the depth of the networks, the use of inception modules versus residual connections, and their implications for feature representation and extraction capabilities. Each model was fine-tuned to the specific task of medical image classification. This involved adjusting learning rates, experimenting with different numbers of dense layers in the classification head, and exploring dropout for regularization to prevent overfitting.
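As a minimal sketch of this setup in Keras (the dense-layer width, dropout rate, and learning rate shown are illustrative choices, not values reported in this paper):

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

NUM_LABELS = 3  # diabetic retinopathy, exudation, hemorrhagic retinopathy

# Pre-trained convolutional base: ImageNet weights, classifier removed.
base = VGG16(weights="imagenet", include_top=False, input_shape=(512, 512, 3))
base.trainable = False  # freeze the base for feature extraction

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),                  # GAP, as described in Table 3
    layers.Dense(256, activation="relu"),             # illustrative head width
    layers.Dropout(0.5),                              # illustrative dropout rate
    layers.Dense(NUM_LABELS, activation="sigmoid"),   # independent per-label probabilities
])

# Sigmoid outputs paired with binary cross-entropy is the usual
# configuration for multi-label targets.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),  # illustrative rate
    loss="binary_crossentropy",
)
```

Swapping `VGG16` for `ResNet50` or `InceptionV3` from `tensorflow.keras.applications` yields the other two variants under the same classification head.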


Table 3. CNN architectures in ophthalmic disease detection: techniques, variations, and insights

- Dataset description: The study uses RFMID2, consisting of 860 images labeled with the presence or absence of diabetic retinopathy, exudation, and hemorrhagic retinopathy. The labels are encoded in a CSV file, facilitating a structured approach to supervised learning.
- Balancing and sampling: To address potential class imbalance, a balanced subset of images was constructed. This ensures equitable representation of each condition, enhancing the model's ability to learn from an evenly distributed dataset.
- Image resizing and normalization: All images were resized to 512x512 pixels to standardize input size. Additionally, pixel values were normalized to the [0, 1] range to aid model convergence and efficiency during training.
- Data analysis: A comparative analysis of label frequencies and combinations was conducted for both retinal fundus multi-disease image datasets, including graphical representations of label frequencies and co-occurrence patterns.
- Utilization of pre-trained models: Leveraging the VGG16, InceptionV3, and ResNet50 models pre-trained on ImageNet enables the extraction of rich, hierarchical features from the medical images, even with a limited dataset size.
- Global average pooling (GAP): Following feature extraction, a GAP layer was applied to reduce feature dimensionality while maintaining spatial hierarchies, facilitating a robust input for the classification layer.

2. METHOD
This section outlines the methods adopted in our investigation. It details the research framework, the procedures for obtaining the dataset, and the analytical strategies applied to fulfill our investigative goals. We offer an in-depth description of our procedural approach to maintain transparency and ensure a clear understanding of our research execution.
2.1. Setup and variables considered
Table 4 describes the specification of the retinal fundus dataset used in this research. RFMID2 [8], a publicly accessible dataset, contains 860 retinal images, from which diabetic retinopathy, exudation, and hemorrhagic retinopathy were selected for study.

Table 4. Specification of the retinal fundus dataset [8]

- Research domain: Multi-label classification of retinal fundus images
- Detailed study focus: Classification of various diseases using retinal fundus images
- Collection method: Utilized the TOPCON TRC-NW300 for data collection
- Data presentation: Includes images and associated data in CSV format
- Variables under study: The majority of subjects underwent mydriatic dilation using 0.5% tropicamide; another subset of patients without mydriasis was also assessed.
- Experimental setup: Images captured using non-invasive techniques, maintaining specific distances between the camera lens and the subjects' eyes, with patients in a seated position.
- Data origin: Data collected from Shri Ganpati Netralaya in Jalna, Maharashtra, and the Center of Excellence in Signal and Image Processing at SGGS Institute of Engineering and Technology in Nanded, Maharashtra, India.

2.2. Dataset processing


Figure 2 depicts the flowchart of the methodology used in this research for the classification of retinal diseases from the RFMID2 dataset. The methodology begins with identifying the labels crucial to this study, namely diabetic retinopathy, exudation, and hemorrhagic retinopathy, from the 860 fundus images. Starting with data collection, the pipeline integrates the fundus images with the disease labels and applies preprocessing steps such as resizing, normalization, and dataset partitioning, which facilitates model training and validation. Particular emphasis is placed on dataset imbalance to ensure an unbiased model evaluation. Three CNN architectures, namely VGG16, InceptionV3, and ResNet50, are used for feature extraction from the fundus images. GAP is implemented to condense the complex image features into a form in which patterns associated with the retinal diseases in this study are easier to identify. Training is conducted, followed by evaluation of the models' predictions against the established performance metrics. This structured comparison and analysis of results highlights the efficiency of the models in diagnosing retinal diseases, and the predictions also shed light on details surrounding diabetic retinopathy, exudation, and hemorrhagic retinopathy. Through this rigorous process, the research identifies the most promising CNN architecture for fundus images, thereby offering valuable insights into the automated diagnosis of retinal conditions and advancing ophthalmic care.

Figure 2. Flowchart for deep learning in ophthalmic disease classifications
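A minimal preprocessing sketch consistent with these steps follows; the CSV name, directory layout, and column names are hypothetical placeholders, since the paper does not specify them:

```python
import numpy as np
import pandas as pd
from PIL import Image
from sklearn.model_selection import train_test_split

IMG_SIZE = (512, 512)  # resize target stated in Table 3

def load_image(path: str) -> np.ndarray:
    """Resize to 512x512 and scale pixel values into [0, 1]."""
    img = Image.open(path).convert("RGB").resize(IMG_SIZE)
    return np.asarray(img, dtype=np.float32) / 255.0

# Hypothetical CSV with one row per image and one multi-hot column per disease.
labels = pd.read_csv("rfmid2_labels.csv")
X = np.stack([load_image(f"images/{img_id}.png") for img_id in labels["ID"]])
y = labels[["DR", "exudation", "HR"]].to_numpy(dtype=np.float32)

# Partition into training and validation sets.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```

In practice, the balanced subset described above would be drawn before this split, so that each disease is equitably represented in both partitions.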

3. RESULTS AND DISCUSSION


Based on the comparison of the three architectures considered in this research, we arrived at the following analysis, which reveals the distinct performance features of the VGG16, InceptionV3, and ResNet models, highlighting their suitability for various applications based on their unique strengths and weaknesses. The VGG16
model demonstrates superior performance in almost all evaluated metrics. With a subset accuracy of 84.81%, it
significantly outperforms Inception and ResNet models in predicting the entire set of labels for an instance. This
is complemented by its high precision, both on a macro (95.83%) and micro (95.56%) level, suggesting that the
model is particularly effective at minimizing false positives. Moreover, VGG16’s recall rates (macro: 80.74%,
micro: 79.63%) gives a strong indication about its capability to identify positive instances, which, combined
with its precision, leads to the highest F1-scores among the models (macro: 87.22%, micro: 86.87%). These
results highlight VGG16’s balanced performance, making it a robust choice for tasks requiring precise label
predictions. ResNet, on the other hand, presents itself as a competitive alternative, with a subset accuracy of
79.75%, positioning it between VGG16 and Inception in terms of the ability to match the full label set. While
its precision and recall scores are moderately lower than those of VGG16, they still showcase respectable
performance (precision macro: 86.70%, recall macro: 75.37%). The F1-scores for ResNet (macro: 80.00%, micro: 79.21%) further underscore its balanced precision-recall trade-off, indicating its effectiveness in a vari-
ety of scenarios where a slightly lower accuracy is acceptable. Inception’s performance, while trailing behind
VGG16 and ResNet, offers insights into the challenges and limitations of the architecture within the specific
task context. With a subset accuracy of 66.67%, and the lowest precision (macro: 81.21%, micro: 78.05%)
and recall (macro: 59.72%, micro: 60.38%) scores among the three, Inception appears to struggle more with
correctly identifying positive instances and minimizing false positives. Its F1-scores (macro: 65.51%, micro:
68.09%) reflect these challenges, suggesting areas for improvement, particularly in tasks requiring high preci-
sion and recall. Table 5 shows the results for the three CNN architectures, namely VGG16, Inception, and ResNet, evaluated in this research study.

Table 5. Comparative analysis of VGG16, Inception, and ResNet across key performance indicators
Metrics VGG16 (%) Inception (%) ResNet (%)
Subset accuracy 84.81 66.67 79.75
Precision (Macro) 95.83 81.21 86.70
Recall (Macro) 80.74 59.72 75.37
F1-Score (Macro) 87.22 65.51 80.00
Precision (Micro) 95.56 78.05 85.11
Recall (Micro) 79.63 60.38 74.07
F1-Score (Micro) 86.87 68.09 79.21
Hamming Loss 5.49 12.82 8.86
Jaccard Score (Macro) 78.62 53.10 69.30
Jaccard Score (Micro) 76.79 51.61 65.57
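All of the metrics in Table 5 can be computed from multi-hot ground-truth and prediction matrices with scikit-learn; a short sketch, with toy arrays standing in for the real test-set outputs:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, f1_score, hamming_loss,
                             jaccard_score, precision_score, recall_score)

# Toy multi-hot matrices: rows are images, columns are the three diseases.
y_true = np.array([[1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 1, 1]])
y_pred = np.array([[1, 0, 0], [1, 0, 0], [0, 0, 1], [0, 1, 1]])

# For multi-label input, accuracy_score is the exact-match (subset) accuracy.
print("Subset accuracy:", accuracy_score(y_true, y_pred))

for avg in ("macro", "micro"):
    print(f"Precision ({avg}):", precision_score(y_true, y_pred, average=avg, zero_division=0))
    print(f"Recall ({avg}):", recall_score(y_true, y_pred, average=avg, zero_division=0))
    print(f"F1-score ({avg}):", f1_score(y_true, y_pred, average=avg, zero_division=0))
    print(f"Jaccard ({avg}):", jaccard_score(y_true, y_pred, average=avg))

print("Hamming loss:", hamming_loss(y_true, y_pred))
```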

The graph presented in Figure 3 compares the performance of the VGG16, Inception, and ResNet models across the key metrics discussed in Table 5. We see that VGG16
emerges as the most accurate and balanced model, offering high precision and recall, which are critical for
applications demanding stringent accuracy in label prediction. ResNet’s performance, while not reaching the
heights of VGG16, remains strong, suggesting its potential as a versatile model capable of delivering robust
results across various tasks. Inception’s lower performance metrics indicate potential difficulties in tasks re-
quiring high precision and recall, highlighting the importance of model selection based on specific task re-
quirements and the inherent trade-offs between different architectures. This comparison not only emphasizes
the strengths of VGG16 in achieving high accuracy and a balanced precision-recall relationship but also illus-
trates the versatility of ResNet and the targeted applicability of Inception. The choice between these models
should be guided by the specific requirements of the application, including the need for precision, recall, and
the ability to correctly predict complete sets of labels.

Figure 3. Grouped bar chart depicting the comparative analysis of VGG16, Inception, and ResNet across key performance indicators


4. CONCLUSION
This study’s findings underscore the importance of model selection in the deployment of CNNs for
image recognition tasks. The choice of model can significantly influence the outcome, depending on the specific
requirements of accuracy, computational efficiency, and application context. Future research should explore
the adaptability of these models to emerging image recognition challenges, including their performance on
increasingly complex datasets and in real-world scenarios. Additionally, investigating the integration of these
architectures into ensemble methods could offer paths to further enhance their accuracy and efficiency. In
conclusion, while VGG16 emerges as the top performer in our analysis, the strengths of ResNet and Inception
in certain contexts cannot be overlooked. The decision to employ a particular model should be guided by the
specific demands of the task at hand, balancing considerations of accuracy, efficiency, and practical constraints.
As the field of deep learning continues to evolve, so too will the capabilities and applications of these models,
promising new opportunities for innovation and improvement.

REFERENCES
[1] T. E. Tan and T. Y. Wong, “Diabetic retinopathy: Looking forward to 2030,” Frontiers in Endocrinology, vol. 13, 2023, doi:
10.3389/fendo.2022.1077669.
[2] P. G. Pavani, B. Biswal, and T. K. Gandhi, “Simultaneous multiclass retinal lesion segmentation using fully automated RILBP-YNet
in diabetic retinopathy,” Biomedical Signal Processing and Control, vol. 86, 2023, doi: 10.1016/j.bspc.2023.105205.
[3] P. Fu et al., “Efficacy and safety of pan retinal photocoagulation combined with intravitreal anti-VEGF agents for high-
risk proliferative diabetic retinopathy: A systematic review and meta-analysis,” Medicine, vol. 102, no. 39, 2023, doi:
10.1097/MD.0000000000034856.
[4] M. Krichen, “Convolutional neural networks: A survey,” Computers, vol. 12, no. 8, 2023, doi: 10.3390/computers12080151.
[5] A. Kay and M. Nguyen, “Transfer learning with VGG16 deep convolutional neural network model effectively differentiates between
subtypes of bright and dark lesions,” Investigative Ophthalmology & Visual Science, vol. 64, no. 8, pp. 242–242, 2023.
[6] T. Castilla, M. S. Martínez, M. Leguía, I. Larrabide, and J. I. Orlando, “A ResNet is all you need: modeling a strong baseline for
detecting referable diabetic retinopathy in fundus images,” in 18th International Symposium on Medical Information Processing
and Analysis, 2023, pp. 212–221, doi: 10.1117/12.2669816.
[7] K. D. Bhavani and M. F. Ukrit, “Design of inception with deep convolutional neural network based fall detection and classification
model,” Multimedia Tools and Applications, vol. 83, no. 8, pp. 23799–23817, 2024, doi: 10.1007/s11042-023-16476-6.
[8] S. Panchal et al., “Retinal fundus multi-disease image dataset (RFMiD) 2.0: A dataset of frequently and rarely identified diseases,”
Data, vol. 8, no. 2, 2023, doi: 10.3390/data8020029.
[9] M. Iman, H. R. Arabnia, and K. Rasheed, “A review of deep transfer learning and recent advancements,” Technologies, vol. 11, no.
2, 2023, doi: 10.3390/technologies11020040.
[10] A. D. Vairamani, “Detection and diagnosis of diseases by feature extraction and analysis on fundus images using deep learning
techniques,” in Computational Methods and Deep Learning for Ophthalmology, 2023, pp. 211–227, doi: 10.1016/B978-0-323-
95415-0.00009-7.
[11] S. Guefrachi, A. Echtioui, and H. Hamam, “Automated diabetic retinopathy screening using deep learning,” Multimedia Tools and
Applications, vol. 83, no. 24, pp. 65249–65266, 2024, doi: 10.1007/s11042-024-18149-4.
[12] S. Pravin, S. Kanagasabapathy, V. Sivaraman, S. Jayaraman, and P. Manickavelu, “Efficient CNN based detection of diabetic
retinopathy,” AIP Conference Proceedings, vol. 2829, no. 1, 2023, doi: 10.1063/5.0156753.
[13] K. A. Heger and S. M. Waldstein, “Artificial intelligence in retinal imaging: current status and future prospects,” Expert Review of
Medical Devices, vol. 21, no. 1–2, pp. 73–89, 2024, doi: 10.1080/17434440.2023.2294364.
[14] O. Srivastava, M. Tennant, P. Grewal, U. Rubin, and M. Seamone, “Artificial intelligence and machine learning in ophthalmology:
A review,” Indian Journal of Ophthalmology, vol. 71, no. 1, pp. 11–17, 2023, doi: 10.4103/ijo.IJO_1569_22.
[15] Z. Li, M. Xu, X. Yang, Y. Han, and J. Wang, “A multi-label detection deep learning model with attention-guided image enhancement
for retinal images,” Micromachines, vol. 14, no. 3, 2023, doi: 10.3390/mi14030705.
[16] P. Kaur and R. K. Singh, “A review on optimization techniques for medical image analysis,” Concurrency and Computation: Prac-
tice and Experience, vol. 35, no. 1, 2023, doi: 10.1002/cpe.7443.
[17] R. Chavan and D. Pete, “Automatic multi-disease classification on retinal images using multilevel glowworm swarm convolutional
neural network,” Journal of Engineering and Applied Science, vol. 71, no. 1, 2024, doi: 10.1186/s44147-023-00335-0.
[18] H. Gao et al., “Optimum design of a reusable spacecraft launch system using electromagnetic energy: an artificial intelligence GSO
algorithm,” Energies, vol. 16, no. 23, 2023, doi: 10.3390/en16237717.
[19] A. R. Chłopowiec, K. Karanowski, T. Skrzypczak, M. Grzesiuk, A. B. Chłopowiec, and M. Tabakov, “Counteracting data bias
and class imbalance—towards a useful and reliable retinal disease recognition system,” Diagnostics, vol. 13, no. 11, 2023, doi:
10.3390/diagnostics13111904.
[20] E. Hassan et al., “Enhanced deep learning model for classification of retinal optical coherence tomography images,” Sensors, vol.
23, no. 12, 2023, doi: 10.3390/s23125393.
[21] M. S. Jamil, S. P. Banik, G. M. A. Rahaman, and S. Saha, “Advanced GradCAM++: Improved visual explanations of CNN decisions
in diabetic retinopathy,” in Computer Vision and Image Analysis for Industry 4.0, 2023, pp. 64–75, doi: 10.1201/9781003256106-6.
[22] T. Krzywicki, P. Brona, A. M. Zbrzezny, and A. E. Grzybowski, “A global review of publicly available datasets containing fundus
images: Characteristics, barriers to access, usability, and generalizability,” Journal of Clinical Medicine, vol. 12, no. 10, 2023, doi:
10.3390/jcm12103587.
[23] N. Anton et al., “Comprehensive review on the use of artificial intelligence in ophthalmology and future research directions,”
Diagnostics, vol. 13, no. 1, 2022, doi: 10.3390/diagnostics13010100.


[24] A. Kumar et al., “A novel deep learning approach for retinopathy prediction using multimodal data fusion,” International Journal
of Intelligent Systems and Applications in Engineering, vol. 12, no. 11, pp. 70–77, 2024.
[25] D. Restrepo et al., “Ophthalmology optical coherence tomography databases for artificial intelligence algorithm: a review,” Seminars
in Ophthalmology, vol. 39, no. 3, pp. 193–200, 2024, doi: 10.1080/08820538.2024.2308248.
[26] P. Ruamviboonsuk, N. Kaothanthong, V. Ruamviboonsuk, and T. Theeramunkong, “Transfer learning for artificial intelligence in
ophthalmology,” in Digital Eye Care and Teleophthalmology, Cham: Springer International Publishing, 2023, pp. 181–198, doi:
10.1007/978-3-031-24052-2_14.

BIOGRAPHIES OF AUTHORS

Sridhevi Sundararajan is actively engaged in her postgraduate studies, pursuing a Master of Technology in Engineering Design at Symbiosis Institute of Technology in Pune, India. She has around 15 years of experience in the IT domain, with C++ and Unix being her main forte. Her area of interest is research in the medical domain. She likes to spend her time reading books related to the medical field and exploring the unknown. She can be contacted at email: [email protected].

Harikrishnan Ramachandran is currently working as an associate professor in the Department of Electronics and Telecommunication Engineering, Symbiosis Institute of Technology, Pune
Campus, Symbiosis International Deemed University, Pune, India. His main research interest in-
cludes smart grid, optimization, internet of things, artificial intelligence, and data analytics. He is
an IEEE senior member, fellow life member of Institution of Engineers India, fellow life member of
IETE India, life member of Indian Society for Technical Education and life member of Computer
Society of India. He can be contacted at email: [email protected].

Harshita Gupta is currently in the final year of her bachelor’s degree in Electronics and
Telecommunication Engineering from Symbiosis Institute of Technology, Pune, Symbiosis Interna-
tional Deemed University. Additionally, she is pursuing a minor in Data Science from the Department
of Computer Science of the same university. With a keen interest in the dynamic fields of artificial
intelligence, machine learning, deep learning, and computer vision, she is poised to contribute to the
future of technology. She can be contacted at email: [email protected].

Yashraj Patil is a final year research scholar at Symbiosis Institute of Technology, Pune
Campus, Symbiosis International Deemed University, Pune, India, pursuing a Master of Technology
in Engineering Design. His main research interest includes internet of things, artificial intelligence,
data analytics, and earth and space science. He is a UNITAR Global Diplomacy Fellow, distinguished
speaker at UN General Assembly Science Summit, GLOBE Scientist at the NASA’s GLOBE Program
and a peer mentor in NASA’s SEES Summer Internship Program. He can be contacted at email:
[email protected].
