Project Report
VISVESVARAYA TECHNOLOGICAL UNIVERSITY
“Jnana Sangama”, Belagavi-590018
PROJECT REPORT
On
“A HIGH PERFORMANCE GLAUCOMA SCREENING TECHNIQUE USING
CNN ARCHITECTURE”
Submitted in partial fulfilment of the requirements for the award of the degree of
BACHELOR OF ENGINEERING
In
ELECTRONICS AND COMMUNICATION
Submitted by
DEEKSHITHA S 1CK16EC009
DHIYA N S 1CK16EC010
SRINIVASA J 1CK15EC047
SUSHIL KAVIN U 1CK15EC049
Under the Guidance of
Mr. CHETHAN KUMAR N.S
Asst. Professor
Dept. of ECE, CBIT, Kolar
CERTIFICATE
This is to certify that the Project entitled “A HIGH PERFORMANCE
GLAUCOMA SCREENING TECHNIQUE USING CNN ARCHITECTURE”
is a bonafide work carried out by
DEEKSHITHA S 1CK16EC009
DHIYA N S 1CK16EC010
SRINIVASA J 1CK15EC047
SUSHIL KAVIN U 1CK15EC049
in partial fulfillment for the award of the degree of Bachelor of Engineering in
Electronics and Communication Engineering of the Visvesvaraya Technological
University, Belagavi, during the year 2019-2020. The report has been approved as
it satisfies the academic requirements in respect of project work prescribed by the
VTU for the Bachelor of Engineering degree.
NAME: USN:
ACKNOWLEDGMENT
First and foremost, we would like to thank the ALMIGHTY for giving us
the strength, knowledge, ability and opportunity to undertake this Project and to
persevere and complete it satisfactorily.
We extend our special in-depth, sincere gratitude to our Internal Guide, Mr.
CHETHAN KUMAR N.S, Asst. Professor, Dept. of ECE, CBIT, whose inspiration,
patience, valuable suggestions and guidance helped us complete the project work
successfully.
Finally, we would like to thank all the teaching and non-teaching staff
of the Department of Electronics and Communication Engineering,
CBIT, Kolar, for their valuable suggestions and support during the tenure of the
project.
DEEKSHITHA S 1CK16EC009
DHIYA N S 1CK16EC010
SRINIVASA J 1CK15EC047
SUSHIL KAVIN U 1CK15EC049
DECLARATION
We also declare that, to the best of our knowledge and belief, the matter embodied
in this dissertation has not been submitted previously by us for the award of any
degree or diploma to any other university.
ABSTRACT
Contents
Acknowledgment
Declaration
Abstract
Table of Contents
List of Figures
1 INTRODUCTION
  1.1 Types of Glaucoma
    1.1.1 Clinical diagnosis of Glaucoma
    1.1.2 CAD for Glaucoma
    1.1.3 Artificial Intelligence
  1.2 Motivation
  1.3 Problem Statement
  1.4 Objectives
  1.5 Organization of Thesis
2 LITERATURE SURVEY
3 PROPOSED METHODOLOGY
  3.1 Preprocessing
  3.2 Feature Extraction
  3.3 Classification
4 WORK FLOW
6 EXPERIMENTAL RESULTS
References
List of Figures
List of Abbreviations
ACC Accuracy
AI Artificial Intelligence
DL Deep Learning
DR Diabetic Retinopathy
FN False Negative
GB Gigabyte
GPU Graphics Processing Unit
ML Machine Learning
PC Personal Computer
PRC Precision
SN Sensitivity
SP Specificity
TB Terabyte
TN True Negative
TP True Positive
TPR True Positive Rate
Chapter 1
INTRODUCTION
Glaucoma is a chronic eye disease that occurs as a result of optic nerve damage
due to elevated intraocular pressure [2]. Glaucoma is a leading cause of blindness
in people aged over 60 years [3]. The World Health Organization has declared
glaucoma to be the second largest cause of blindness worldwide [4]. The number
of people with glaucoma worldwide was estimated at 64.3 million in 2013, and 80
million in 2020 [5]. As it may be asymptomatic, early detection and treatment are
important to prevent vision loss [1].
Hence, it is important to have a reliable early detection system for glaucoma onset [1, 10].
2. Primary Angle Closure Glaucoma : It is a less common form of glaucoma
and is caused by blocked drainage canals, resulting in a sudden rise in intraocular
pressure. It involves a closed or narrow angle between the iris and the cornea,
develops very quickly, has symptoms and damage that are usually very noticeable,
and demands immediate medical attention. It is also called acute glaucoma or
narrow-angle glaucoma. Unlike open-angle glaucoma, angle-closure glaucoma is a
result of the angle between the iris and cornea closing [13].
The clinical diagnosis of glaucoma includes a series of tests carried out by the
ophthalmologist. The key to preventing glaucoma is to have regular eye check-ups
after the age of 40.
The following tests are commonly performed to diagnose glaucoma:
2. Ophthalmoscopy : The doctor examines the optic nerve for glaucoma damage.
Eye drops are used to dilate the pupil so that the doctor can see through the eye
to examine the shape and color of the optic nerve [10].
The clinical diagnosis of the eye using the above techniques is time consuming and
involves inter/intra-observer variability. Digital fundus images captured using a
fundus camera can be effectively utilized for observing the progression of diabetic
retinopathy (DR), glaucoma and age-related macular degeneration (AMD). The
interesting clinical features of the eye, such as the retina, optic disc, blood vessels
etc., can be clearly visualized in fundus images. In addition, the fundus camera is
reliable, less expensive and easy to operate, and it can be used to measure various
structures such as the change in cup-to-disc ratio, optic nerve head (ONH), cup
diameter etc. Hence, fundus images can be effectively utilized as a cost-effective
tool for the diagnosis of retinal health and eye abnormalities (DR, AMD and
glaucoma) using a single fundus image. Computer aided diagnosis (CAD) of
fundus images helps to diagnose retinal health using various computational
algorithms. It is a cost-effective tool which can avoid the inter/intra-observer
variability that may be encountered in clinical diagnosis [10].
1.1.2 CAD for Glaucoma
1.4 Objectives
The main objectives of this project are:
Chapter 3 : It deals with the model that is used, the mathematical background
of CNN, and the datasets used for the training and testing process. The proposed
glaucoma detection system based on CNN and supervised classification approaches
is presented.
Chapter 4 : It deals with the overall workflow, i.e., the sequence of tasks that
processes the set of data.
Chapter 6 : The software and hardware tools used for building the automatic
glaucoma detection models are discussed.
Chapter 2
LITERATURE SURVEY
not affect the discriminative ability. Heatmap analysis showed that the optic disc
area was the most important area for the discrimination of glaucoma [3].
accuracy and a high AUC score for the detection of glaucoma from OCT probability
map images. In fact, one model of each type - one CNN-A model and one CNN-B
model - achieved the best accuracy rates of the entire set [7].
Chapter 3
PROPOSED METHODOLOGY
We develop an algorithm with a CNN architecture that avoids the classical
hand-crafted feature extraction step by performing feature extraction and
classification at one time within the same network of neurons, and consequently
provides a diagnosis automatically and without user input. Convolutional Neural
Networks are currently among the most powerful models for classifying images.
They have two distinct parts. At the input, an image is provided in the form of a
matrix of pixels; a grayscale image has two dimensions, and color is represented
by a third dimension of depth 3 for the fundamental colors (RGB). The first part
of a CNN is the convolutional part itself. It functions as an extractor of image
characteristics: an image is passed through a succession of filters, or convolution
kernels, creating new images called convolution maps. Some intermediate layers
reduce the resolution of the image by a local maximum operation. In the end, the
convolution maps are flattened and concatenated into a feature vector, called a
CNN code. This CNN code produced by the convolutional part is then fed into a
second part, consisting of fully connected layers (a multilayer perceptron). The
role of this part is to combine the characteristics of the CNN code to classify the
image. The output is a final layer with one neuron per category.
In the input layer, images of arbitrary pixel size are resized to 224 x 224 x 3 by
pre-processing. Then, in the convolutional layer, the 224 x 224 x 3 image pixels
are convolved with the weights and bias terms. An activation function (ReLU) is
used to convert all the negative values obtained after convolution to 0 and to
retain the positive values as they are; this activation function decides whether to
activate a neuron or not. These feature maps are then given to a max-pooling
layer in order to reduce the dimension of the image and to increase the computation
speed. The pooled output is then given to batch normalization, which normalizes
the activations in each channel (RGB). This process is carried out three more
times, with a dropout layer in alternate blocks; dropout randomly drops some
units in our model so as to reduce the model complexity and increase the speed
of performance. We also use a regularizer that adds the sum of all the squared
weight values of the weight matrix to the loss function, so that the weights are
updated to reduce the model loss. After all these steps, the image pixel values are
given to the flattening layer, which converts the 2D feature maps into a 1D format
to feed the data to the fully connected network for classification.
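As an illustration of the layer stack described above, the following Keras sketch
builds the convolutional part under stated assumptions: the filter counts, kernel
sizes and dropout rates are placeholders and not taken from the report, while the
224 x 224 x 3 input, ReLU activations, max pooling, batch normalization,
alternate-block dropout, L2 weight regularizer and flattening follow the text.

```python
# Minimal sketch of the convolutional part described above (assumptions noted in comments).
from tensorflow.keras import layers, models, regularizers

def build_feature_extractor():
    """Four Conv2D + ReLU blocks with max pooling, batch norm and alternate dropout."""
    model = models.Sequential()
    model.add(layers.InputLayer(input_shape=(224, 224, 3)))         # resized fundus image
    for i, filters in enumerate([32, 64, 128, 256]):                # filter counts are assumptions
        model.add(layers.Conv2D(filters, (3, 3), padding='same', activation='relu',
                                kernel_regularizer=regularizers.l2(1e-4)))  # L2 penalty on weights
        model.add(layers.MaxPooling2D((2, 2)))                      # reduce spatial resolution
        model.add(layers.BatchNormalization())                      # normalize activations per channel
        if i % 2 == 1:                                               # dropout in alternate blocks
            model.add(layers.Dropout(0.25))
    model.add(layers.Flatten())                                      # 2D feature maps -> 1D CNN code
    return model
```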
3.1 Preprocessing
In the preprocessing step, we processed the images from the different data sets to
a common and standard format in order to train the networks in a homogeneous
way. In the input layer, images of arbitrary pixel size are resized to 224 x 224 x 3
by pre-processing. In our model, we use a data augmentation technique that
involves creating transformed versions of the images in the training dataset that
belong to the same class as the original image. In our work we use transforms that
include a range of operations from the field of image manipulation, such as shifts,
flips and rotations. A shift to an image means moving all pixels of the image in
one direction, such as horizontally or vertically, while keeping the image dimensions
the same. The “width shift range” and “height shift range” arguments to the
ImageDataGenerator constructor control the amount of horizontal and vertical
shift respectively. A rotation augmentation randomly rotates the image clockwise
by a given number of degrees from 0 to 360. The rotation will likely rotate pixels
out of the image frame and leave areas of the frame with no pixel data that must
be filled in. Image data augmentation is typically applied only to the training
dataset, and not to the test dataset. The Keras deep learning library provides the
ability to apply data augmentation automatically when training a model; this is
achieved using the ImageDataGenerator class.
3.3 Classification
The classification block is also known as the fully connected part; these layers are
always the last layers of a neural network. The classifier we use is an artificial
neural network (ANN); the specialty of this network is that it connects each and
every neuron of a layer to every neuron of the next layer, which yields better
accuracy when compared to other classifiers. In our model we use 4 fully connected
layers: the first 3 fully connected layers work along with dropout layers, and the
last fully connected layer works with a sigmoid activation function, which produces
an output in the range 0 to 1. We also use an optimizer called ADAM to optimize
the weight values during back-propagation. The back-propagation process updates
the weights of the model in reverse order (from the last layer to the first) based
on the error calculated by the model, using the optimizer; this back-propagation
helps in increasing the accuracy of our model. A learning-rate schedule is used in
our model for tuning this parameter of the optimization algorithm, which
determines the step size at each iteration while moving toward a minimum of the
loss function. The number of epochs is defined before training our model; one
epoch is when the entire dataset is passed both forward and backward through
the neural network exactly once. Our model is trained for 100 epochs. Using all
the extracted features, the binary classifier classifies the images into infected and
non-infected, where 0 indicates glaucoma infected and 1 indicates non-infected
(no glaucoma).
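The sketch below adds a fully connected head of the kind described in this section
to the earlier hypothetical feature extractor; the layer widths and dropout rates
are assumptions, while the four dense layers, sigmoid output and Adam optimizer
follow the text.

```python
# Hypothetical classification head on top of the earlier build_feature_extractor() sketch.
from tensorflow.keras import layers
from tensorflow.keras.optimizers import Adam

model = build_feature_extractor()                    # convolutional part from the earlier sketch
for units in [512, 256, 128]:                        # first 3 fully connected layers (widths assumed)
    model.add(layers.Dense(units, activation='relu'))
    model.add(layers.Dropout(0.5))                   # dropout alongside the first 3 dense layers
model.add(layers.Dense(1, activation='sigmoid'))     # output in [0, 1]: 0 = glaucoma, 1 = non-glaucoma

model.compile(optimizer=Adam(learning_rate=1e-4),    # Adam updates weights during back-propagation
              loss='binary_crossentropy',
              metrics=['accuracy'])
```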
Chapter 4
WORK FLOW
The overall sequence of tasks that processes the set of data in the high performance
glaucoma screening technique using CNN architecture is as follows:
1. Database
The publicly available database called ACRIMA is used for the evaluation of
glaucoma classification methods. It consists of 705 images, of which 396 are
glaucomatous images and 309 are non-glaucomatous images.
4. Preprocessing
(b) Convolution
Convolution is a specialized type of linear operation used for feature extraction,
where a small array of numbers, called a kernel, is applied across the input, which
is an array of numbers. An element-wise product between each element of the
kernel and the input is calculated at each location and summed to obtain the
output value in the corresponding position of the output, called a feature map.
This procedure is repeated applying multiple kernels to form an arbitrary number
of feature maps, which represent different characteristics of the input; different
kernels can thus be considered as different feature extractors. There are 4 important
key hyper-parameters of the convolutional layers.
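The report's own list of these hyper-parameters is not preserved at this point; as
a hedged illustration, a commonly cited set is the number of kernels, the kernel
size, the stride and the padding, shown below with Keras' Conv2D.

```python
# The four hyper-parameters shown here are a common set, given as an assumption,
# since the report's own list is cut off at this point.
from tensorflow.keras import layers

conv = layers.Conv2D(
    filters=32,          # number of kernels, i.e. number of feature maps produced
    kernel_size=(3, 3),  # spatial size of each kernel
    strides=(1, 1),      # step taken by the kernel across the input
    padding='same')      # keep the feature map the same size as the input
```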
The learning rate is specified for the optimizer; it is the amount of change applied
to the model during each step of the search process, or the step size. We
systematically drop the learning rate after specific epochs during training, and it
is considered to be the most important hyperparameter to tune in a neural network
in order to achieve good performance of the model.
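A hedged sketch of such a step-decay schedule using Keras' LearningRateScheduler
callback is shown below; the drop factor and epoch interval are assumptions.

```python
# Step-decay learning-rate schedule; the drop factor and interval are assumptions.
from tensorflow.keras.callbacks import LearningRateScheduler

def step_decay(epoch, lr):
    # Drop the learning rate by a factor of 10 every 30 epochs.
    if epoch > 0 and epoch % 30 == 0:
        return lr * 0.1
    return lr

lr_callback = LearningRateScheduler(step_decay, verbose=1)
```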
After creating the CNN model, the model undergoes training with the training
data sets and validation data sets, and hyperparameters such as the batch size
and the number of epochs are tuned to get better performance from the model.
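A sketch of such a training run is given below, assuming the augmentation
generators from the earlier sketch and a directory layout split into training and
validation folders; the paths and batch size are hypothetical, and the 100 epochs
follow the report.

```python
# Illustrative training call; directory names and batch size are assumptions.
train_generator = train_datagen.flow_from_directory(
    'ACRIMA/train', target_size=(224, 224), batch_size=32, class_mode='binary')
val_generator = test_datagen.flow_from_directory(
    'ACRIMA/val', target_size=(224, 224), batch_size=32, class_mode='binary')

history = model.fit(
    train_generator,
    validation_data=val_generator,
    epochs=100,                  # as used in this project
    callbacks=[lr_callback])     # learning-rate schedule from the earlier sketch
```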
9. Model accuracy
After training the CNN model, the testing datasets are used to make predictions,
which helps determine how well the CNN model is trained.
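Continuing the same hypothetical setup, the trained model can be evaluated on
the held-out test images roughly as follows; the test directory and the 0.5 decision
threshold are assumptions.

```python
# Sketch of testing the trained model; the test directory is an assumed path.
test_generator = test_datagen.flow_from_directory(
    'ACRIMA/test', target_size=(224, 224), batch_size=32,
    class_mode='binary', shuffle=False)                  # keep order so predictions align with labels

loss, accuracy = model.evaluate(test_generator)
probabilities = model.predict(test_generator).ravel()    # sigmoid outputs in [0, 1]
predictions = (probabilities > 0.5).astype(int)          # 0 = glaucoma, 1 = non-glaucoma
print(f'Test accuracy: {accuracy:.4f}')
```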
It supports Python, but not R or Scala yet. Google Colab provides free graphics
processing units. The types of graphics processing units that are available in Colab
vary over time; this is necessary for Colab to be able to provide access to these
resources for free. The graphics processing units available in Google Colab often
include Nvidia K80s, T4s, P4s and P100s. There is no way to choose what type of
graphics processing unit you connect to in Google Colab at any given time.
Colab is able to provide free resources in part by having dynamic usage limits
that sometimes fluctuate, and by not providing guaranteed or unlimited resources.
This means that overall usage limits as well as idle timeout periods, maximum
VM lifetime, graphics processing unit types available, and other factors vary over
time.
Colab does not publish these limits, because they can vary quickly.
Colab is ideal for everything from improving our Python coding skills to working
with deep learning libraries like PyTorch, Keras, TensorFlow, and OpenCV. We can
create notebooks in Colab, upload notebooks, store notebooks, share notebooks,
mount our Google Drive and use whatever is stored in our drive; we can upload
our personal Jupyter Notebooks, upload notebooks directly from GitHub, upload
Kaggle files, download our notebooks, and use them.
5.2 GPU
A graphics processing unit (GPU) is a specialized electronic circuit designed to
rapidly manipulate and alter memory to accelerate the creation of images in a
frame buffer intended for output to a display device. The graphics card is
responsible for rendering an image to your monitor; it does this by converting
data into a signal your monitor can understand. This device is made freely
available through cloud-based Google Colab with limited-access sessions. Google
Colab provides a single 12 GB NVIDIA Tesla K80 graphics processing unit that
can be used for up to 12 hours continuously. Recently, Colab also started offering
free TPUs.
To use Google Colab in GPU mode, make sure the hardware accelerator is
configured to graphics processing unit. Sometimes all graphics processing units
are in use and there is no graphics processing unit available. If this is the case, we
will get an alert and have to wait for a while and try again.
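A quick way to confirm that the Colab runtime actually has a GPU attached is
sketched below; this is a generic TensorFlow check, not code from the report.

```python
# Check whether the Colab runtime exposes a GPU
# (Runtime -> Change runtime type -> Hardware accelerator -> GPU).
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
if gpus:
    print('GPU available:', gpus)
else:
    print('No GPU available; the runtime is CPU-only.')
```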
5.3 Deep learning framework
5.3.1 TensorFlow
TensorFlow is an open source library developed by Google, originally for its
internal use. TensorFlow is Google Brain's second-generation system. Its main
usage is in machine learning and dataflow programming. The name TensorFlow
derives from the operations that such neural networks perform on multidimensional
data arrays, which are referred to as tensors. TensorFlow, as the name indicates,
is a framework to define and run computations involving tensors. A tensor is a
generalization of vectors and matrices to potentially higher dimensions. Internally,
TensorFlow represents tensors as n-dimensional arrays of base datatypes.
TensorFlow computations are expressed as stateful dataflow graphs. Its flexible
architecture allows for the easy deployment of computation across a variety of
platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to mobile
and edge devices.
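As a minimal illustration of tensors as n-dimensional arrays (not code from the
report):

```python
# Tensors of increasing rank in TensorFlow.
import tensorflow as tf

scalar = tf.constant(3.0)                  # rank-0 tensor
vector = tf.constant([1.0, 2.0, 3.0])      # rank-1 tensor
image_batch = tf.zeros([8, 224, 224, 3])   # rank-4 tensor: a batch of RGB images
print(scalar.shape, vector.shape, image_batch.shape)
```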
TensorFlow has been the most famous deep learning library in recent years. A
practitioner using TensorFlow can build any deep learning structure, such as a
CNN, an RNN or a simple artificial neural network. TensorFlow is used by
academics, start-ups, and large companies. Google uses TensorFlow in almost all
of its daily products, including Gmail, Photos and the Google Search Engine.
Google built the TensorFlow framework to let researchers and developers work
together on AI models; once developed and scaled, it allows lots of people to use it.
5.3.2 Keras
5.4 Python
Python is an interpreted, high-level, general-purpose programming language.
Created by Guido van Rossum and first released in 1991, Python's design
philosophy emphasizes code readability with its notable use of significant
whitespace. Its language constructs and object-oriented approach aim to help
programmers write clear, logical code for small and large-scale projects. Python
is dynamically typed and garbage-collected. It supports multiple programming
paradigms, including structured (particularly procedural), object-oriented, and
functional programming. Python is often described as a "batteries included"
language due to its comprehensive standard library.
The Python 2 language was officially discontinued in 2020 (first planned for
2015), and “Python 2.7.18 is the last Python 2.7 release and therefore the last
Python 2 release.” No more security patches or other improvements will be released
for it. Python interpreters are available for many operating systems.
Chapter 6
EXPERIMENTAL RESULTS
Our automatic glaucoma detection model using retinal fundus images was built
and tested on a personal laptop with an Intel Core i5 CPU @ 1.60 GHz to 1.80 GHz
and 8 GB of random-access memory (RAM). The model was implemented in an
open source deep learning framework (TensorFlow and Keras) in a Python
environment (Google Colab).
Using the 705 glaucoma and non-glaucoma retinal fundus images from the ACRIMA
data set, our model was randomly trained and tested: 80% of the images (consisting
of both glaucoma and non-glaucoma) were randomly chosen for training and 20%
of the images (consisting of both glaucoma and non-glaucoma) were randomly
chosen for testing. The model extracted the features and was trained to classify
images into glaucoma and non-glaucoma.
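One way such a random 80/20 split could be produced is sketched below, assuming
the ACRIMA images are arranged in per-class folders and using scikit-learn; this
is an assumption about the procedure, not the report's own split code.

```python
# Hedged sketch of a random, stratified 80/20 split; folder names are assumptions.
from glob import glob
from sklearn.model_selection import train_test_split

glaucoma = glob('ACRIMA/glaucoma/*.jpg')           # 396 glaucomatous images (assumed layout)
normal = glob('ACRIMA/non_glaucoma/*.jpg')         # 309 non-glaucomatous images (assumed layout)
paths = glaucoma + normal
labels = [0] * len(glaucoma) + [1] * len(normal)   # 0 = glaucoma, 1 = non-glaucoma

train_paths, test_paths, y_train, y_test = train_test_split(
    paths, labels, test_size=0.2, random_state=42, stratify=labels)
```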
The performance of our model was evaluated using statistical measures such as
precision, recall, F1-score, accuracy and the AUC-ROC curve for three different
learning rates, 0.1, 0.001 and 0.0001, with 100 epochs for each learning rate. A
comparison of the three learning rates with the statistical measures is shown in
the table.
Accuracy = (TP + TN) / (TP + TN + FP + FN)    (6.1)

Precision = TP / (TP + FP)    (6.2)

Recall/Sensitivity = TP / (TP + FN)    (6.3)

Specificity = TN / (TN + FP)    (6.4)

F1-score = 2 x (Precision x Recall) / (Precision + Recall)    (6.5)

where
TP = True Positive values
TN = True Negative values
FP = False Positive values
FN = False Negative values
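These measures can be computed directly from the test predictions; the sketch
below uses scikit-learn and the hypothetical test_generator, probabilities and
predictions names from the earlier evaluation sketch (scikit-learn treats label 1 as
the positive class by default).

```python
# Illustrative computation of the reported metrics with scikit-learn.
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix, roc_auc_score)

y_true = test_generator.classes                      # ground-truth labels (shuffle=False above)
tn, fp, fn, tp = confusion_matrix(y_true, predictions).ravel()

print('Accuracy   :', accuracy_score(y_true, predictions))    # Eq. (6.1)
print('Precision  :', precision_score(y_true, predictions))   # Eq. (6.2)
print('Recall     :', recall_score(y_true, predictions))      # Eq. (6.3)
print('Specificity:', tn / (tn + fp))                          # Eq. (6.4)
print('F1-score   :', f1_score(y_true, predictions))          # Eq. (6.5)
print('AUC-ROC    :', roc_auc_score(y_true, probabilities))
```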
Our model obtained its best performance at the 0.0001 learning rate for 100 epochs,
with an accuracy of 0.99, precision of 0.99, recall of 0.99 and F1-score of 0.99; we
also achieved an AUC-ROC curve area of 0.99, as shown in the graph.
Chapter 7
7.1 Applications
1. It can be used to screen for glaucoma at an earlier stage.
4. It reduces the work of ophthalmologists in finding the cause of eye disease.
7.2 Advantages
1. The algorithm reduces the processing time taken by manual, computer-based
screening methods without compromising accuracy.
7.3 Disadvantages
1. In our project we are using the ACRIMA data set, which contains a relatively
small number of images.
4. The ConvNet requires a large dataset to process and to train the neural network.
Chapter 8
In our model, glaucoma is diagnosed using an advanced deep learning architecture
on retinal fundus images without using hand-crafted features. To develop a
glaucoma deep learning algorithm, a supervised convolutional neural network
(CNN) model is applied to 705 fundus images to extract features through multiple
layers from raw pixel intensities. We examined the positive effect of the
architecture, the training strategy, the size of the data set and the data collected
from online clinical sources. We used the ACRIMA data set, which is publicly
available online. The implemented algorithm contains 4 convolutional layers and
4 fully connected layers, and we adopt batch normalization and max-pooling
layers. To improve computational efficiency and to avoid over-fitting, we adopt
dropout and data augmentation. Precision (PRC), Sensitivity (SN), Accuracy
(ACC), Specificity (SP) and other statistical measures are utilized to evaluate the
performance of the glaucoma deep learning algorithm. On average, a Precision
(PRC) of 99%, Accuracy (ACC) of 99%, Specificity (SP) of 99% and Sensitivity
(SN) of 99% were achieved. Conclusively, we evaluated the performance of
integrating data from the clinical history with colour fundus images. The results
show an improvement in specificity and sensitivity with homogeneous AUCs.
Further tests with more data and new architectural approaches should be developed
and assessed to confirm this line of work.
processing, massive datasets and huge models. Further, simulation will be carried
out on clinical data, and real-time hardware implementation using a Raspberry Pi
will also be investigated.
References
[1] Andres Diaz-Pinto, Sandra Morales, Valery Naranjo, Thomas Köhler, Jose M.
Mossi and Amparo Navea. “CNNs for automatic glaucoma assessment using fundus
images: an extensive validation”. BioMedical Engineering OnLine, March 2019.
[2] Ali Serener and Sertan Serte. “Transfer Learning for Early and Advanced
Glaucoma Detection with Convolutional Neural Networks”. IEEE International
Conference, January 2019.
[3] Sang Phan, Shin’ichi Satoh, Yoshioki Yoda, Kenji Kashiwagi and Tetsuro
Oshika. “Evaluation of deep convolutional neural networks for glaucoma detection”.
Research Center for Medical Bigdata (RCMB), National Institute of Informatics,
January 2019.
[4] Muhammad Naseer Bajwa, Muhammad Imran Malik, Shoaib Ahmed Siddiqui,
Andreas Dengel, Faisal Shafait, Wolfgang Neumeier and Sheraz Ahmed. “Two-stage
framework for optic disc localization and glaucoma classification in retinal fundus
images using deep learning”. BMC Medical Informatics and Decision Making,
July 2019.
[5] Juan J. Gomez-Valverde, Alfonso Anton, Gian Luca Fatti, Bart Liefers,
Alejandra Herranz, Andres Santos, Clara I. Sanchez and Maria J. Ledesma-Carbayo.
“Automatic glaucoma classification using colour fundus images based on
convolutional neural networks and transfer learning”. ResearchGate, published
25 January 2019.
[8] Nacer Eddine Benzebouchi, Nabiha Azizi and Seif Eddine Bouzaine. “Glaucoma
Diagnosis Using Cooperative Convolutional Neural Networks”. International
Journal of Advances in Electronics and Computer Science, ISSN: 2393-2835,
January 2018.
[9] Jongwoo Kim, Sema Candemir, George R. Thoma and Emily Y. Chew. “Region
of Interest Detection in Fundus Images Using Deep Learning and Blood Vessel
Information”. IEEE 31st International Symposium on Computer-Based Medical
Systems, January 2018.
[11] Baidaa Al-Bander, Waleed Al-Nuaimy, Majid A. Al-Taee and Yalin Zheng.
“Automated Glaucoma Diagnosis using Deep Learning Approach”. 14th
International Multi-Conference on Systems, Signals & Devices (SSD), 2017.
[12] Radia Touahri, Nabiha Azizi, Nacer Eddine Benzebouchi, Nacer Eddine
Hammami and Ouided Moumene. “A Comparative Study of Convolutional Neural
Network and Twin SVM for Automatic Glaucoma Diagnosis”. Embedded &
Distributed Systems (EDiS), IEEE, pp. 16, 2017.