INSTITUTE OF ENGINEERING
ADVANCED COLLEGE OF ENGINEERING AND MANAGEMENT
DEPARTMENT OF ELECTRONICS AND COMPUTER ENGINEERING
KALANKI, KATHMANDU
[CT 654]
Submitted By:
Sushovan KC (ACE078BCT083)
Swikar Paudel (ACE078BCT084)
2081/09/10
ACKNOWLEDGEMENT
We take this opportunity to express our deepest and sincere gratitude to our
Academic Project Coordinator, Er. Laxmi Prasad Bhatt, Department of Electronics
and Computer Engineering, for his insightful advice, motivating suggestions,
invaluable guidance, and support in selecting this project, and for his
constant encouragement throughout our journey to date.
We express our deep gratitude to Er. Prem Chandra Roy, Head of the Department of
Electronics and Computer Engineering, and Er. Dhiraj Pyakurel, Deputy Head of the
Department of Electronics and Computer Engineering, for their regular support,
cooperation, and coordination.
We also gratefully acknowledge the facilities that the department provided in a timely manner.
We would like to convey our thanks to the teaching and non-teaching staff of the
Department of Electronics & Communication and Computer Engineering, ACEM, for
their invaluable help and support. We are also grateful to all our classmates for
their help, encouragement, and valuable suggestions.
Last but not least, we would like to express our deep appreciation to our
grandparents, parents, and siblings for their constant support and encouragement.
Sushovan KC (ACE078BCT083)
Swikar Paudel (ACE078BCT084)
Table of Contents
Title Page
ACKNOWLEDGEMENT
Table of Contents
List of Figures
List of Abbreviations/Acronyms
CHAPTER 1
INTRODUCTION
1.1 Background
1.2 Motivation
1.3 Statement of the Problem
1.4 Project objective
1.5 Significance of the study
CHAPTER 2
LITERATURE REVIEW
CHAPTER 3
REQUIREMENT ANALYSIS
3.1 Hardware Requirements:
3.2 Software Requirements:
3.3 Operating system:
3.4 Functional requirements
3.4.1 Input handling
3.4.2 Preprocessing:
3.4.3 Labeled output:
3.5 Non-functional requirements
3.5.1 Performance
3.5.2 Reliability
3.5.3 Consistency
3.5.4 Usability
3.5.5 Security
3.5.6 Resource utilization
CHAPTER 4
SYSTEM DESIGN AND ARCHITECTURE
4.1 Use Case Diagram
4.2 System Design
4.3 DFD
4.4 System Flowchart
CHAPTER 5
METHODOLOGY
5.1 Machine Learning Life Cycle
U-Net
CHAPTER 6
EXPECTED OUTPUT
CHAPTER 7
TIME SCHEDULE
CHAPTER 8
TOTAL COST
REFERENCES
List of Figures
List of Abbreviations/Acronyms
AI Artificial Intelligence
DN Deconvolution Network
CHAPTER 1
INTRODUCTION
Nepal, like many developing countries, faces significant healthcare obstacles,
especially in ophthalmology. A cataract forms when the proteins in the eye's lens
break down and clump together, causing the lens to become cloudy. Cataracts
continue to be one of the leading causes of blindness in the nation, accounting for
about 62% of all blindness cases in the country.
The lack of diagnostic equipment and specialized eye doctors in rural areas of Nepal
worsens the problem. These challenges have raised the need for innovative
methods, including the application of Artificial Intelligence (AI) in medical diagnostics.
Deep learning approaches, especially those using Convolutional Neural
Networks (CNNs), have recently proven very effective at detecting anomalies in
medical images. Among these techniques, the U-Net architecture was designed for
medical image segmentation and has proved very effective at highlighting and
localizing the features associated with disease.
1.1 Background
Cataracts are the leading cause of blindness worldwide and mostly affect developing
countries like Nepal, where access to medical facilities and skilled ophthalmologists
is limited. According to experts, early diagnosis and treatment can prevent 80% of
cataract-related blindness. However, rural and underprivileged communities face
significant challenges because traditional screening methods require skilled workers
and advanced equipment, both of which are often unavailable.
Artificial intelligence (AI) has shown great potential in medical image analysis,
particularly through deep learning architectures like U-Net. U-Net is a strong
candidate for detecting cataracts because of its effectiveness in analyzing biomedical
images. Using U-Net, this project aims to develop an automated cataract detection
system that provides accurate and affordable screening, improving early
diagnosis and addressing cataract-related problems.
Integrating U-Net into a cataract detection system has several benefits:
1. Automation of disease identification:
Using U-Net, the time needed to identify cataracts will be reduced
significantly, which helps in the prevention and treatment of the disease.
2. Cost effectiveness:
The proposed system will reduce the cost of cataract detection by
providing an affordable screening tool with low operating costs.
3. Scalability:
The proposed system can be deployed in mobile health posts, rural areas,
and other outreach settings without the need for highly skilled operators or
medical practitioners.
4. Early detection:
The proposed system can help detect cataracts early, enabling early
medical treatment and reducing further complications caused by
the disease.
By addressing the limitations of traditional cataract detection, this project aims to
develop a U-Net-based detection system that can effectively segment the
cataract-affected region in a given image, reducing cost and time while improving
the accuracy, efficiency, and accessibility of the cataract detection process. Through
early diagnosis, the project aims to reduce cataract-related blindness and help
improve the health of people living in Nepal and around the world.
1.2 Motivation
Cataract-induced blindness is a preventable yet persistent issue in Nepal, particularly
in rural and underprivileged areas where access to ophthalmic care is limited.
Traditional diagnostic methods are resource-intensive and often unavailable in remote
regions, leaving many individuals undiagnosed until it is too late.
By using deep learning models like U-Net, we can develop an automated system for
cataract detection that is accurate, efficient, and easily accessible. This project is driven
by the goal of reducing preventable blindness, improving early diagnosis, and strengthening
the healthcare system in Nepal, ultimately enhancing the quality of life in the country.
1.3 Statement of the Problem
Cataract is one of the leading causes of visual impairment and blindness around the
world, particularly among the elderly population. It occurs when the proteins in the
eye's lens break down and clump together, causing the lens to become cloudy. If left
undetected, it may lead to permanent vision loss, so early detection plays a crucial
role in timely diagnosis and treatment. However, manual detection of cataracts
through medical imaging can be time-consuming and requires considerable expertise
from ophthalmologists. Traditional automated methods often fail to identify cataracts
correctly because of poor image quality, variable lighting conditions, and the
presence of artifacts.
This problem highlights the need for an accurate, automated, and efficient approach
to detecting and segmenting cataract-affected regions in eye images. U-Net, a deep
learning technique, has the potential to deliver precise and consistent cataract
detection, leading to better patient outcomes.
CHAPTER 2
LITERATURE REVIEW
Machine learning and deep learning researchers have proposed many approaches to
cataract detection. These approaches differ in the features they extract from the
data and in how the algorithms operate on those features. Several studies have
applied algorithms such as deep CNNs and conventional machine learning methods in
order to achieve high performance in cataract detection.
Gao et al. [12] investigated a deep learning-based method for grading the severity of
nuclear cataracts from slit-lamp images. Local filters were learned by clustering
image patches and fed into a convolutional neural network (CNN). A set of recursive
neural networks (RNNs) then extracted higher-order features, and support vector
regression performed the cataract grading. Zhang et al. [1] proposed a deep CNN
(DCNN) for cataract detection and grading that used the feature maps from the pooling
layers of the architecture. This method was time-efficient and achieved accuracies of
93.52% and 86.69% in cataract detection and grading, respectively. Hossain et al. [7]
proposed an automatic cataract detection system using DCNNs and a trained classifier
model based on ResNet, with an accuracy of 95.77%. Yang et al. [8] proposed an
ensemble learning-based approach for cataract detection and grading. Three
independent feature sets were extracted, and two learning models were built for each
set. Images were classified by combining the base learning models with ensemble
methods, giving correct classification rates (CCRs) of 93.2% and 84.5% for
cataract detection and grading, respectively. Karamihan [9] detected cataract eye
images and their characteristics using a deep CNN with GoogleNet transfer learning
in MATLAB and reported that the resulting system was accurate and reliable.
Sahana [10] used an InceptionV3 architecture with transfer learning in TensorFlow
to classify cataracts as mature or immature, reaching an accuracy of 87.5%.
Pratap and Kokil [11] investigated cataract diagnosis under noisy conditions. A
pre-trained CNN was used for feature extraction, followed by a set of locally and
globally trained independent support vector networks. The results demonstrated
robustness against noise. It was the first work to investigate the robustness of
cataract detection systems.
Many studies have addressed cataract detection and grading with conventional machine
learning methods, and a growing number use deep learning methods. However, several
challenges remain, such as improving the accuracy of the models while minimizing
their complexity by reducing the number of training parameters, layers, depth,
running time, and overall model size.
| Study | Objective | Method | Description | Input images | Result |
|---|---|---|---|---|---|
| Zhang et al. [1] | Propose a time-efficient method for cataract detection and grading. | Deep CNN utilizing feature maps from pooling layers. | Developed a deep CNN (DCNN) utilizing feature maps from pooling layers for cataract detection and grading. | Fundus image of posterior eye. | Achieved accuracies of 93.52% in cataract detection and 86.69% in grading; method noted for time efficiency. |
| Hossain et al. [7] | Develop an automatic cataract detection system. | DCNNs with ResNet-based classifier. | Proposed an automatic cataract detection system using DCNNs with a trained classifier model based on the ResNet architecture. | Fundus image of posterior eye. | Achieved an accuracy of 95.77% in cataract detection. |
| Yang et al. [8] | Introduce an ensemble learning-based approach for cataract detection and grading. | Ensemble learning with multiple feature sets and models. | Introduced an ensemble learning-based approach with three independent feature sets and two learning models per set; classification achieved by combining multiple learning models. | Fundus image of posterior eye. | Achieved correct classification rates (CCRs) of 93.2% for cataract detection and 84.5% for grading. |
| Karamihan [9] | Utilize deep learning for cataract image detection and characterization. | GoogleNet transfer learning in MATLAB. | Utilized a deep CNN through GoogleNet transfer learning in MATLAB for cataract image detection and characterization. | Fundus image of posterior eye. | Demonstrated that the system is accurate and reliable for cataract detection. |
| Sahana [10] | Classify mature and immature cataracts using deep learning. | InceptionV3 architecture with transfer learning. | Employed the InceptionV3 architecture with transfer learning using TensorFlow to classify mature and immature cataracts. | Right fundus image and left fundus image. | Achieved an accuracy of 87.5% in cataract classification. |
| Pratap and Kokil [11] | Investigate cataract diagnosis under noisy environments. | Pre-trained CNN with support vector networks. | Investigated cataract diagnosis in noisy environments using a pre-trained CNN for feature extraction, combined with locally and globally trained independent support vector networks. | Fundus retinal images. | Demonstrated robustness against noise; first study to assess robustness of cataract detection systems. |
CHAPTER 3
REQUIREMENT ANALYSIS
3.4.2 Preprocessing:
The system should preprocess input images through steps including image
normalization, resizing, and noise reduction, to ensure uniform
input for the segmentation model.
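The three preprocessing steps named above can be illustrated with a minimal NumPy sketch. It is not the project's actual pipeline: the function names, the 256×256 target size, the nearest-neighbour resizing, the 3×3 mean filter used for noise reduction, and the assumption of 8-bit input images are all illustrative choices.

```python
import numpy as np


def resize_nearest(image: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Resize an H x W (x C) image with nearest-neighbour sampling."""
    in_h, in_w = image.shape[:2]
    rows = np.arange(out_h) * in_h // out_h  # source row for each output row
    cols = np.arange(out_w) * in_w // out_w  # source column for each output column
    return image[rows[:, None], cols]


def mean_filter(image: np.ndarray, k: int = 3) -> np.ndarray:
    """Reduce noise with a k x k mean filter (k odd, reflected borders)."""
    pad = k // 2
    pad_width = [(pad, pad), (pad, pad)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image.astype(np.float32), pad_width, mode="reflect")
    h, w = image.shape[:2]
    out = np.zeros(image.shape, dtype=np.float32)
    # Sum the k*k shifted copies of the image, then divide to get the mean.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)


def preprocess(image: np.ndarray, size: int = 256) -> np.ndarray:
    """Resize, denoise, and scale an 8-bit image to float32 values in [0, 1]."""
    resized = resize_nearest(image, size, size)
    denoised = mean_filter(resized, k=3)
    return denoised / 255.0  # assumes 8-bit input (values 0..255)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    eye_image = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)  # stand-in image
    print(preprocess(eye_image).shape)
```

In practice an image library would usually supply higher-quality resizing and denoising; the sketch only shows how the steps chain together to produce uniform model input.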
3.5 Non-functional requirements
3.5.1 Performance
The system should be able to provide results within a reasonable timeframe to
ensure timely analysis of cataract for diagnostic purposes.
3.5.2 Reliability
The system should achieve a high level of accuracy in cataract detection,
minimizing false positives and false negatives to enhance reliability.
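One way to make this requirement measurable is to count false positives and false negatives per pixel between a predicted mask and a reference mask. The sketch below is only an illustration: the function names are hypothetical, and the choice of accuracy, sensitivity, specificity, and Dice coefficient as metrics is an assumption rather than something the proposal specifies.

```python
import numpy as np


def confusion_counts(pred: np.ndarray, truth: np.ndarray) -> dict:
    """Count per-pixel true/false positives and negatives for binary masks."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    return {
        "tp": int(np.sum(pred & truth)),
        "fp": int(np.sum(pred & ~truth)),
        "fn": int(np.sum(~pred & truth)),
        "tn": int(np.sum(~pred & ~truth)),
    }


def _ratio(num: int, den: int) -> float:
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def segmentation_metrics(pred: np.ndarray, truth: np.ndarray) -> dict:
    """Summarize mask agreement with four common metrics."""
    c = confusion_counts(pred, truth)
    tp, fp, fn, tn = c["tp"], c["fp"], c["fn"], c["tn"]
    return {
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
        "sensitivity": _ratio(tp, tp + fn),  # falls as false negatives rise
        "specificity": _ratio(tn, tn + fp),  # falls as false positives rise
        "dice": _ratio(2 * tp, 2 * tp + fp + fn),
    }


if __name__ == "__main__":
    truth = np.array([[1, 1, 0, 0], [1, 1, 0, 0]])
    pred = np.array([[1, 0, 0, 0], [1, 1, 1, 0]])
    print(confusion_counts(pred, truth))
    print(segmentation_metrics(pred, truth))
```

For the small example masks there are 3 true positives, 1 false positive, 1 false negative, and 3 true negatives, so all four metrics evaluate to 0.75.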
3.5.3 Consistency
The system should be able to give consistent results across a variety of datasets
ensuring reliability in diverse scenarios.
3.5.4 Usability
The system should be easy to use for a variety of medical
professionals, without requiring a high level of technical expertise.
3.5.5 Security
The system should protect all stored data with strong encryption
to ensure the confidentiality and privacy of patient
information. Access to the system should be granted only to authorized
healthcare professionals.
CHAPTER 4
SYSTEM DESIGN AND ARCHITECTURE
4.2 System Design
4.3 DFD
4.4 System Flowchart
CHAPTER 5
METHODOLOGY
5.1 Machine Learning Life Cycle
We plan to follow the machine learning life cycle for our project, as it ensures a
systematic and organized approach to developing and deploying machine learning
models. Following this life cycle lets us focus on each phase with appropriate
methods and build a well-structured machine learning solution that meets the
project objectives and requirements.
5.3 Algorithm
U-Net
The U-Net architecture is a popular neural network architecture commonly used for
medical image segmentation tasks. It was proposed by Olaf Ronneberger, Philipp
Fischer, and Thomas Brox in the 2015 paper "U-Net: Convolutional Networks for
Biomedical Image Segmentation" [5]. The U-Net architecture is distinguished
mainly by its U-shaped structure, which includes a contracting path (downsampling), a
bottleneck, and an expansive path (upsampling).
[Source: https://lmb.informatik.uni-freiburg.de/people/ronneber/u-net/]
1. Contracting path-
The contracting path acts as the encoder of the network. It
captures contextual information and features by progressively reducing the
spatial dimensions of the input. Each step consists of two 3×3 convolution
layers, each followed by a ReLU activation function, and then a 2×2
max-pooling operation with stride 2 that downsamples the feature maps,
reducing their spatial dimensions.
2. Bottleneck-
The bottleneck sits at the center of the U-Net architecture, between the
contracting path and the expansive path. It consists of two 3×3 convolution
layers, each followed by a ReLU activation function. It does not contain a
max-pooling operation, as it serves as a bridge between the encoder and the
decoder in the architecture.
3. Expanding path-
The expanding path of the U-Net performs upsampling to reconstruct the spatial
resolution of the image while combining the encoded features from the
contracting path. Each step begins with a 2×2 up-convolution that doubles the
spatial resolution and halves the number of feature channels. The result is
concatenated with the corresponding feature map from the contracting path
(skip connection) and then passed through two 3×3 convolution layers, each
followed by a ReLU activation, as in the contracting path.
4. Skip connection-
Skip connections directly link the corresponding levels of the contracting and
expanding paths in order to retain spatial context and fine-grained details
from earlier layers. By providing a direct path for information flow, they
also help mitigate the vanishing gradient problem.
5. Output layer-
The final layer uses a 1×1 convolution to map the feature maps to the desired
number of output channels (e.g., for binary segmentation, a single channel with
sigmoid activation; for multi-class segmentation, multiple channels with
softmax activation).
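The five components above can be followed through the network by tracing the spatial size and channel count at each stage. The sketch below is a shape calculation only, not a trainable model. It assumes the configuration of the original U-Net paper [5]: unpadded 3×3 convolutions (each trims 2 pixels), four downsampling steps, 64 base channels, and a 572×572 input. The function name and the single output channel (binary segmentation, as described in the output layer item) are illustrative assumptions. Many implementations pad their convolutions instead, in which case the spatial size is unchanged by each convolution.

```python
def unet_shape_trace(input_size=572, base_channels=64, depth=4, out_channels=1):
    """Return (stage, spatial_size, channels) after each U-Net stage.

    Assumes unpadded 3x3 convolutions, 2x2 max pooling with stride 2,
    and 2x2 up-convolutions, as in the original U-Net paper.
    """
    trace = []
    size, channels = input_size, base_channels

    # Contracting path: two 3x3 convs (size - 4), then 2x2 pooling (size / 2).
    for level in range(depth):
        size -= 4
        trace.append((f"down{level + 1}", size, channels))
        if size % 2:
            raise ValueError(f"odd size {size} cannot be pooled evenly")
        size //= 2
        channels *= 2  # the next level doubles the feature channels

    # Bottleneck: two 3x3 convs, no pooling.
    size -= 4
    trace.append(("bottleneck", size, channels))

    # Expanding path: up-conv doubles the size and halves the channels;
    # the cropped skip feature map is concatenated, then two 3x3 convs follow.
    for level in reversed(range(depth)):
        size *= 2
        channels //= 2
        size -= 4
        trace.append((f"up{level + 1}", size, channels))

    # Output layer: a 1x1 convolution changes only the channel count.
    trace.append(("output", size, out_channels))
    return trace


if __name__ == "__main__":
    for stage, size, channels in unet_shape_trace():
        print(f"{stage:>10}: {size} x {size} x {channels}")
```

With the default arguments the trace ends at a 388×388 output map, matching the output size reported in the original paper. Because the unpadded convolutions shrink the feature maps, each skip feature map has to be center-cropped before concatenation.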
CHAPTER 6
EXPECTED OUTPUT
[Source: https://iopscience.iop.org/article/10.1088/1742-6596/1937/1/012053/pdf]
CHAPTER 7
TIME SCHEDULE
CHAPTER 8
TOTAL COST
Although the project will require time and computing resources, the prototype will
not need any capital investment. Since the only expense is computing power, we will
use our own devices to run and test the system, minimizing the funds needed.
Overall, the project is expected to be completed on a low budget.
REFERENCES
[1] Linglin Zhang et al., "Automatic cataract detection and grading using Deep
Convolutional Neural Network," 2017 IEEE 14th International Conference on
Networking, Sensing and Control (ICNSC), Calabria, 2017, pp. 60-65, doi:
10.1109/ICNSC.2017.8000068.
[4] Y. Dong, Q. Zhang, Z. Qiao and J. -J. Yang, "Classification of cataract fundus image
based on deep learning," 2017 IEEE International Conference on Imaging Systems and
Techniques (IST), Beijing, China, 2017, pp. 1-5, doi: 10.1109/IST.2017.8261463.
[5] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional Networks for
Biomedical Image Segmentation," in Medical Image Computing and Computer-Assisted
Intervention – MICCAI 2015, N. Navab, J. Hornegger, W. Wells, and A. Frangi, Eds.,
Lecture Notes in Computer Science, vol. 9351. Springer, Cham, 2015.
[6] Xi Xu, Linglin Zhang, Jianqiang Li, Yu Guan, and Li Zhang, "A Hybrid Global-
Local Representation CNN Model for Automatic Cataract Grading," IEEE Journal of
Biomedical and Health Informatics, vol. 24, no. 2, pp. 437-446, Feb. 2020.
[8] X. Yang, Y. Zhang, and Z. Li, "Ensemble Learning-Based Approach for Cataract
Detection and Grading," Journal of Biomedical Informatics, vol. 85, pp. 68-74, May
2018.
[9] Karamihan, "Detection of Cataract Eye Images and Their Characteristics Using
Deep CNN through GoogleNet Transfer Learning and MATLAB," International
Journal of Advanced Computer Science and Applications, vol. 10, no. 5, pp. 123-130,
2019.
[10] Sahana, "Transfer Learning for Cataract Detection Using V3 Architecture and
TensorFlow," International Journal of Recent Technology and Engineering, vol. 8, no.
3, pp. 4567-4571, Sept. 2019.
[11] Pratap and Kokil, "Cataract Diagnosis Under Noisy Environment Using Pre-
trained CNN and Support Vector Machine," Biomedical Signal Processing and
Control, vol. 52, pp. 177-183, May 2019.
[12] X. Gao, S. Lin, and T. Y. Wong, "Automatic Feature Learning to Grade Nuclear
Cataracts Based on Deep Learning," IEEE Transactions on Biomedical Engineering,
vol. 62, no. 11, pp. 2693-2701, Nov. 2015.