3.deep Learning Approach To Diabetic Retinopathy Detection

This document discusses a deep learning approach to detecting diabetic retinopathy from fundus images. Diabetic retinopathy progresses in stages from mild non-proliferative to proliferative and can lead to blindness if left untreated. Existing methods of diagnosis are inefficient and inconsistent. The authors propose an automatic deep learning method using convolutional neural networks to classify images and detect the stage of diabetic retinopathy. They achieve high sensitivity and specificity on benchmark datasets, outperforming other methods. Their multi-stage transfer learning approach makes use of similar datasets with different labels.

Deep Learning Approach to Diabetic Retinopathy Detection

Borys Tymchenko1 a, Philip Marchenko2 b and Dmitry Spodarets3 c
1 Institute of Computer Systems, Odessa National Polytechnic University, Shevchenko av. 1, Odessa, Ukraine
2 Department of Optimal Control and Economical Cybernetics, Faculty of Mathematics, Physics and Information Technology,
Odessa I.I. Mechnikov National University, Dvoryanskaya str. 2, Odessa, Ukraine
3 VITech Lab, Rishelevska St, 33, Odessa, Ukraine

[email protected], [email protected], [email protected]

Keywords: Deep learning, diabetic retinopathy, deep convolutional neural network, multi-target learning, ordinal
regression, classification, SHAP, Kaggle, APTOS.

arXiv:2003.02261v1 [cs.LG] 3 Mar 2020

Abstract: Diabetic retinopathy is one of the most threatening complications of diabetes that leads to permanent blindness
if left untreated. One of the essential challenges is early detection, which is very important for treatment
success. Unfortunately, the exact identification of the diabetic retinopathy stage is notoriously tricky and
requires expert human interpretation of fundus images. Simplification of the detection step is crucial and
can help millions of people. Convolutional neural networks (CNN) have been successfully applied in many
adjacent subjects, and for diagnosis of diabetic retinopathy itself. However, the high cost of big labeled
datasets, as well as inconsistency between different doctors, impede the performance of these methods. In
this paper, we propose an automatic deep-learning-based method for stage detection of diabetic retinopathy
from a single photograph of the human fundus. Additionally, we propose a multistage approach to transfer
learning, which makes use of similar datasets with different labeling. The presented method can be used
as a screening method for early detection of diabetic retinopathy with sensitivity and specificity of 0.99 and
is ranked 54 of 2943 competing methods (quadratic weighted kappa score of 0.925466) on APTOS 2019
Blindness Detection Dataset (13000 images).

a https://orcid.org/0000-0002-2678-7556
b https://orcid.org/0000-0001-9995-9454
c https://orcid.org/0000-0001-6499-4575

1 INTRODUCTION

Diabetic retinopathy (DR) is one of the most threatening complications of diabetes, in which damage occurs to the retina and causes blindness. It damages the blood vessels within the retinal tissue, causing them to leak fluid and distort vision. Along with other diseases leading to blindness, such as cataracts and glaucoma, DR is one of the most frequent ailments, according to the US, UK, and Singapore statistics (NCHS, 2019; NCBI, 2018; SNEC, 2019).

DR progresses through four stages:
• Mild non-proliferative retinopathy, the earliest stage, where only microaneurysms can occur;
• Moderate non-proliferative retinopathy, a stage in which the blood vessels lose their ability to transport blood due to their distortion and swelling as the disease progresses;
• Severe non-proliferative retinopathy, which deprives the retina of blood supply due to the increased blockage of more blood vessels, hence signaling the retina to grow fresh blood vessels;
• Proliferative diabetic retinopathy, the advanced stage, where the growth features secreted by the retina activate proliferation of new blood vessels, which grow along the inside covering of the retina in some vitreous gel, filling the eye.

Each stage has its own characteristics and particular properties, so doctors may fail to take some of them into account and thus make an incorrect diagnosis. This motivates the creation of an automatic solution for DR detection.

New cases of this disease could be reduced by at least 56% with proper and timely treatment and monitoring of the eyes (Rohan T, 1989). However, the initial stage of this ailment has no warning signs, and detecting it at an early stage is a real challenge. Moreover, even well-trained clinicians sometimes cannot reliably evaluate the stage manually from diagnostic images of a patient's fundus (according to Google's research (Krause et al., 2017), see Figure 1). At the same time, doctors most often agree when lesions are apparent. Furthermore, existing diagnostic procedures are quite inefficient because of the time they take and the number of ophthalmologists involved in each patient's case. Such sources of disagreement cause wrong diagnoses and unstable ground truth for automatic solutions, which were provided to help in the research stage.

Figure 1: Google showed that ophthalmologists' diagnoses differ for the same fundus image. Best viewed in color.

Thus, algorithms for DR detection began to appear. The first algorithms were based on different classical algorithms from computer vision and on setting thresholds (Michael D. Abràmoff and Quellec, 2010; Christopher E.Hann, 2009; Nathan Silberman and Subramanian, 2010). Nevertheless, in the past few years, deep learning approaches have proved their superiority over other algorithms in tasks of classification and object detection (Harry Pratt, 2016). In particular, convolutional neural networks (CNN) have been successfully applied in many adjacent subjects and for diagnosis of diabetic retinopathy itself (Shaohua Wan, 2018; Harry Pratt, 2016).

In 2019, APTOS (Asia Pacific Tele-Ophthalmology Society) and the ML competition platform Kaggle challenged ML and DL researchers to develop a five-class automatic DR diagnosis solution (APTOS 2019 Blindness Detection Dataset). In this paper, we propose a transfer learning approach and an automatic method for detecting the stage of diabetic retinopathy from a single photograph of the human fundus. This approach is able to learn useful features even from a noisy and small dataset and could be used as a DR stage screening method in automatic solutions. This method was ranked 54 of 2943 different methods in the APTOS 2019 Blindness Detection Competition and achieved a quadratic weighted kappa score of 0.925466.

2 RELATED WORK

Many research efforts have been devoted to the problem of early diabetic retinopathy detection. First of all, researchers tried to use classical methods of computer vision and machine learning to provide a suitable solution to this problem. For instance, Priya et al. (Priya and Aruna, 2012) proposed a computer-vision-based approach for the detection of diabetic retinopathy stages using color fundus images. They extracted features from the raw image using image processing techniques and fed them to an SVM for binary classification, achieving a sensitivity of 98%, a specificity of 96%, and an accuracy of 97.6% on a testing set of 250 images. Other researchers tried to fit other models for multiclass classification, e.g., applying PCA to images and fitting decision trees, naive Bayes, or k-NN (Conde et al., 2012), with best results of 73.4% accuracy and 68.4% F-measure on a dataset of 151 images with different resolutions.

With the growing popularity of deep-learning-based approaches, several methods that apply CNNs to this problem appeared. Pratt et al. (Harry Pratt, 2016) developed a network with CNN architecture and data augmentation, which can identify the intricate features involved in the classification task, such as micro-aneurysms, exudate, and hemorrhages in the retina, and consequently provide a diagnosis automatically and without user input. They achieved a sensitivity of 95% and an accuracy of 75% on 5,000 validation images. There are also other works on CNNs from other researchers (Carson Lam and Lindsey, 2018; Yung-Hui Li and Chung, 2019). It is useful to note that Asiri et al. reviewed a significant number of available methods and datasets, highlighting their pros and cons (Asiri et al., 2018). Besides, they pointed out the challenges to be addressed in designing and learning efficient and robust deep-learning algorithms for various problems in DR diagnosis and drew attention to directions for future research.

Other researchers also applied transfer learning with CNN architectures. Hagos et al. (Hagos and Kant, 2019) trained InceptionNet V3, pretrained on the ImageNet dataset, for 5-class classification and achieved an accuracy of 90.9%. Sarki et al. (Rubina Sarki, 2019) trained ResNet50, Xception Nets, DenseNets and VGG with ImageNet pretraining and achieved a best accuracy of 81.3%. Both teams of researchers used datasets provided by APTOS and Kaggle.

3 PROBLEM STATEMENT

3.1 Datasets

The image data used in this research was taken from several datasets. We used an open dataset from the Kaggle Diabetic Retinopathy Detection Challenge 2015 (EyePACs, 2015) for pretraining our CNNs. This dataset is the largest publicly available. It consists of 35126 fundus photographs of the left and right eyes of American citizens, labeled with stages of diabetic retinopathy:
• No diabetic retinopathy (label 0)
• Mild diabetic retinopathy (label 1)
• Moderate diabetic retinopathy (label 2)
• Severe diabetic retinopathy (label 3)
• Proliferative diabetic retinopathy (label 4)

In addition, we used other smaller datasets: the Indian Diabetic Retinopathy Image Dataset (IDRiD) (Sahasrabuddhe and Meriaudeau, 2018), from which we used 413 photographs of the fundus, and the MESSIDOR (Methods to Evaluate Segmentation and Indexing Techniques in the field of Retinal Ophthalmology) (Decencière et al., 2014) dataset, from which we used 1200 fundus photographs. As the original MESSIDOR dataset has different grading from other datasets, we used the version that was relabeled to standard grading by a panel of ophthalmologists (Google Brain, 2018).

As the evaluation was performed on the Kaggle APTOS 2019 Blindness Detection (APTOS2019) dataset (APTOS, 2019), we had access only to the training part of it. The full dataset consists of 18590 fundus photographs, which are divided into 3662 training, 1928 validation, and 13000 testing images by the organizers of the Kaggle competition. All datasets have similar distributions of classes; the distribution for APTOS2019 is shown in Figure 2.

Figure 2: Classes distribution in APTOS2019 dataset.

As different datasets have a similar distribution, we considered it a fundamental property of this type of data. We made no modifications to the dataset distribution (undersampling, oversampling, etc.).

The smallest native size among all of the datasets is 640x480. A sample image from APTOS2019 is shown in Figure 3.

Figure 3: Sample of fundus photo from the dataset.

3.2 Evaluation metric

In this research, we used the quadratic weighted Cohen's kappa score as our main metric. The kappa score measures the agreement between two ratings. The quadratic weighted kappa is calculated between the scores assigned by the human rater and the predicted scores. This metric varies from -1 (complete disagreement between raters) to 1 (complete agreement between raters). The definition of κ is:

$$\kappa = 1 - \frac{\sum_{i=1}^{k} \sum_{j=1}^{k} w_{ij} o_{ij}}{\sum_{i=1}^{k} \sum_{j=1}^{k} w_{ij} e_{ij}}, \tag{1}$$

where $k$ is the number of categories, and $o_{ij}$ and $e_{ij}$ are elements of the observed and expected matrices, respectively. $w_{ij}$ is calculated as follows:

$$w_{ij} = \frac{(i - j)^2}{(k - 1)^2}. \tag{2}$$

Due to the properties of Cohen's kappa, researchers must interpret this ratio carefully. For instance, if we consider two pairs of raters with the same percentage of agreement but different proportions of ratings, this difference will drastically affect the kappa ratio.

Another problem is the number of codes: as the number of codes grows, kappa becomes higher. Also, kappa may be low even though there are high levels of agreement, and even though individual ratings are accurate. All of the above makes kappa a volatile ratio to analyze.

The main reason to use the kappa ratio is that we do not have access to the labels of the validation and test datasets. The kappa value for these datasets is obtained by submitting our model and runner's code to the checking system on the Kaggle site. Moreover, we do not have explicit access to images from the test dataset.

Along with the kappa score, we calculate macro F1-score, accuracy, sensitivity, and specificity on a holdout dataset of 736 images taken from the APTOS2019 training data.
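
To make Equations (1) and (2) concrete, the quadratic weighted kappa can be computed directly from two lists of integer grades. The following is a minimal pure-Python sketch written for illustration, not the competition's scoring code. The function name `quadratic_weighted_kappa`, the construction of the expected matrix from the two marginal histograms, and the choice to return 1.0 when the denominator vanishes are assumptions of this sketch.

```python
def quadratic_weighted_kappa(rater_a, rater_b, num_classes):
    """Quadratic weighted kappa between two equally long lists of integer grades."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must grade the same non-empty set of items")
    k = num_classes
    if k < 2:
        raise ValueError("at least two categories are required")

    # Observed matrix o_ij: how often rater A gave grade i while rater B gave grade j.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1

    # Marginal histograms of each rater.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    numerator = 0.0
    denominator = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2          # Equation (2)
            expected = hist_a[i] * hist_b[j] / n           # e_ij under independence
            numerator += weight * observed[i][j]
            denominator += weight * expected

    if denominator == 0.0:
        # Both raters used one identical grade for every item.
        # Assumption of this sketch: report this as perfect agreement.
        return 1.0
    return 1.0 - numerator / denominator                   # Equation (1)
```

Identical ratings give a score of 1, and two raters who always pick opposite extremes give -1, which matches the range stated above.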

4 METHOD

The diabetic retinopathy detection problem can be viewed from several angles: as a classification problem, as a regression problem, and as an ordinal regression problem (Ananth and Kleinbaum, 1997). This is possible because the stages of the disease come sequentially.

4.1 Preprocessing

Model training and validation were performed with preprocessed versions of the original images. The preprocessing consisted of image cropping followed by resizing.

Due to the way APTOS2019 was collected, there are spurious correlations between the disease stage and several image meta-features, e.g., resolution, crop type, zoom level, or overall brightness. The correlation matrix is shown in Figure 4.

Figure 4: Spurious correlations between meta-features and diagnosis.

To keep the CNN from overfitting to these features and to reduce correlations between image content and its meta-features, we used a large number of augmentations. Additionally, as we do not have access to the test dataset both in the competition and in real life, we decided to show as much data variance as possible to the models.

4.2 Data augmentation

We used online augmentations: at least one augmentation was applied to each training image before it was passed to the CNN. We used the following augmentations from the Albumentations (A. Buslaev and Kalinin, 2018) library: optical distortion, grid distortion, piecewise affine transform, horizontal flip, vertical flip, random rotation, random shift, random scale, a shift of RGB values, random brightness and contrast, additive Gaussian noise, blur, sharpening, embossing, random gamma, and cutout (Devries and Taylor, 2017).

4.3 Network architecture

We aim to classify each fundus photograph accurately. We build our neural networks using a conventional deep CNN architecture, which has a feature extractor (encoder) and a smaller decoder for a specific task (head).

However, training the encoder from scratch is difficult, especially given the small amount of training data. Thus, we use ImageNet-pretrained CNNs as initialization for the encoder (Iglovikov and Shvets, 2018).

We propose a multi-task learning approach to detect diabetic retinopathy. We use three decoders. Each is trained to solve its own task based on features extracted with the CNN backbone:
• classification head,
• regression head,
• ordinal regression head.

Here, the classification head outputs a one-hot encoded vector, where the presence of each stage is represented as 1. The regression head outputs a real number in the range [0, 4.5), which is then rounded to an integer that represents the disease stage. For the ordinal regression head, we use the approach described in (Cheng, 2007). Briefly, if the data point falls into category k, it automatically falls into all categories from 0 to k − 1. So, this head aims to predict all categories up to the target. The final prediction is obtained by fitting a linear regression model to the outputs of the three heads. The neural network structure is shown in Figure 5.

We train all heads and the feature extractor jointly in order to reduce training time. We keep the linear regression model frozen until the post-training stage.
Figure 5: Three-head CNN structure.

4.4 Training process

We use a multi-stage training process with different settings and datasets in every stage.

4.4.1 Pretraining

We found that labeling schemes are inconsistent between datasets, so we decided to use the largest one (2015 data) to pretrain our CNNs. Using transfer learning is possible because the natural features of diabetic retinopathy are consistent between different people and do not depend on the dataset.

In addition, different datasets are collected on different equipment. Incorporation of this knowledge into the model increases its ability to generalize and elevates the importance of natural features by reducing sensitivity to instrument noise.

We initialize the feature extractor with weights from an ImageNet-pretrained CNN. Heads are initialized with random weights (He et al., 2015). We train a model for 20 epochs on 2015 data with minibatch SGD and a cosine-annealing learning rate schedule (Loshchilov and Hutter, 2016).

Every head minimizes its own loss function: cross-entropy for the classification head, binary cross-entropy for the ordinal regression head, and mean absolute error for the regression head.

After pretraining, we use the encoder weights as initialization for subsequent stages. In our experiments, we observed a consistent improvement of metrics when we replaced the weights of the heads with random initialization before the main training, so we discard the trained heads.

4.4.2 Main training

The main training is performed on 2019 data, IDRiD, and MESSIDOR combined. Starting with the weights obtained in the pretraining stage, we performed 5-fold cross-validation and evaluated models on the holdout set.

At this stage, we change the loss functions for the heads: Focal Loss (Lin et al., 2017) for the classification head, binary Focal Loss (Lin et al., 2017) for the ordinal regression head, and mean squared error for the regression head.

We trained each fold for 75 epochs using the Rectified Adam optimizer (Liyuan Liu, 2019) with a cosine-annealing learning rate schedule. To preserve the pretrained weights while the new heads are in a random state, we disabled training of (froze) the encoder for five epochs while training the heads only.

During the main training, we monitor separability in the feature space generated by the encoder. We generate 2-dimensional embeddings with t-SNE (van der Maaten and Hinton, 2008) and visualize them in the validation phase for manual control of training performance. Figure 6 shows t-SNE embeddings labeled with ground truth data and predicted classes. From the picture, it can be seen that images with no signs of DR are separable with a large margin from other images that have any sign of DR. Additionally, stages of DR come sequentially in the embedding space, which corresponds to the semantics of real diagnoses.

Figure 6: Feature embeddings with t-SNE. Ground truth (top) and predicted (bottom) classes. Best viewed in color.

4.4.3 Post-training

In the post-training stage, we only fit the linear regression model to the outputs of the different heads.

We found it essential to keep it from updating during the previous stages because otherwise it converges to a suboptimal local minimum with the weights of two heads close to zero. These coefficients prevent gradients from updating the corresponding heads' weights and further discourage the network from converging.

Initial weights for every head were set to 1/3 and then trained for five epochs to minimize the mean squared error function.

The difference between the prediction distributions of the regression head and the linear regression outputs is shown in Figure 7.

Figure 7: Output distributions for regression head and combination of heads.

4.4.4 Regularization

At training time, we regularize our models for better robustness. We use conventional methods, e.g., weight decay (Krogh and Hertz, 1992) and dropout. Also, we penalize the network for overconfident predictions by using label smoothing (Szegedy et al., 2016).

In addition to label smoothing for the classification and ordinal regression heads, we propose a label smoothing scheme for the linear regression head. It can be used if it is known that the underlying targets are discrete. We add random uniform noise to the discrete targets:

$$T_s = T + \Delta, \qquad \Delta \sim U(a, b),$$

where $T_s$ is the smoothed target label, $T$ is the original label, and $U$ is the uniform distribution. In this case, $-a = b = \frac{T_i - T_{i+1}}{3}$, and $T_i$, $T_{i+1}$ are neighbouring discrete target labels.

Applying this smoothing scheme, we could reduce the importance of wrong labeling.

4.4.5 Ensembling

For final scoring, we ensembled models with 3 encoder architectures at different resolutions that scored best on the holdout dataset: EfficientNet-B4 (380x380), EfficientNet-B5 (456x456) (Tan and Le, 2019), and SE-ResNeXt50 (380x380 and 512x512) (Hu et al., 2017).

Our best performing solution is an ensemble of 20 models (4 architectures x 5 folds) with test-time augmentations (horizontal flip, vertical flip, transpose, rotate, zoom). Overall, this scheme generated 200 predictions per fundus image. These predictions were averaged with a 0.25-trimmed mean to eliminate outliers from possibly overfitted models. A trimmed mean is used to filter out outliers to reduce variance.

We used the Catalyst framework (Kolesnikov, 2018) based on PyTorch (Paszke et al., 2017) with GPU support. Evaluation of the whole ensemble was performed on an Nvidia P100 GPU in 9 hours, processing 2.5 seconds per image.

5 RESULTS

As experimental results, we provide two tables with the metrics described in Section 3.2. The first table contains results from local validation without test-time augmentation (TTA) (Table 1), and the second contains results with TTA (Table 2).

Our test stage was split into two parts: local testing and Kaggle testing. As we found locally that the ensembling method is the best one, we evaluated it on the Kaggle validation and test datasets.

On a local dataset of 736 images, ensembling with TTA performed slightly worse than without it. The ensemble with TTA performed better on the testing dataset of 13000 images, as it has a better ability to generalize to unseen images.

Ensembles scored 0.818462/0.924746 validation/test QWK for the trimmed mean ensemble without TTA and 0.826567/0.925466 QWK for the trimmed mean ensemble with TTA.

Additionally, we evaluated binary classification (DR/No DR) to check the best model's quality as a screening method (see Tables 1 and 2, last row).

The ensemble with TTA showed its stability in the final scoring, keeping a consistent rank (58 and 54 of 2943) on the validation and testing datasets, respectively.

6 INTERPRETATION

In medical applications, it is important to be able to interpret models' predictions. While good performance on the validation dataset can be used to select the best-trained model for production, it is insufficient for real-life use of this model.

By using SHAP (Shapley Additive exPlanations) (Lundberg and Lee, 2017), it is possible to visualize features that contribute to the assessment of the disease stage. SHAP unites several previous methods and represents the only possible consistent and locally accurate additive feature attribution method.

Using SHAP makes it possible to check that the model learns useful features during training and uses correct features at inference time. Furthermore, in uncertain cases, visualization of salient features can assist the physician in focusing on regions of interest where features are the most noticeable.

In Figure 8, we show an example visualization of SHAP values for one of the models from the ensemble. Red color denotes features that increase the output value for a given class, and blue color denotes features that decrease the output value for a given class. The overall intensity of the features denotes the saliency of the given region for the classification process.

Figure 8: SHAP analysis of sample images. Best viewed in color.
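
The trimmed mean used for ensembling in Section 4.4.5, and reported in the tables that follow, can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' implementation. It assumes that "0.25-trimmed" means discarding the lowest 25% and the highest 25% of the sorted predictions before averaging; the function name `trimmed_mean` and its parameter name are hypothetical.

```python
def trimmed_mean(values, proportion_to_cut=0.25):
    """Average the values left after cutting a fraction from each sorted end."""
    if not 0.0 <= proportion_to_cut < 0.5:
        raise ValueError("proportion_to_cut must be in [0, 0.5)")
    if len(values) == 0:
        raise ValueError("at least one prediction is required")

    ordered = sorted(values)
    # Number of predictions dropped from EACH end (assumed convention).
    cut = int(len(ordered) * proportion_to_cut)
    kept = ordered[cut:len(ordered) - cut]
    return sum(kept) / len(kept)


# Eight hypothetical per-model predictions for one image; two are outliers.
predictions = [0.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 10.0]
plain = sum(predictions) / len(predictions)   # pulled upwards by the 10.0
robust = trimmed_mean(predictions)            # outliers are discarded first
```

With these hypothetical numbers the plain mean is 2.75 while the trimmed mean is 2.0, which shows how a single overfitted model is prevented from shifting the ensemble's grade.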
Model                                           QWK    Macro F1  Accuracy  Sensitivity  Specificity
EfficientNet-B4                                 0.965  0.811     0.903     0.812        0.976
EfficientNet-B5                                 0.963  0.815     0.907     0.807        0.977
SE-ResNeXt50 (512x512)                          0.969  0.854     0.924     0.871        0.982
SE-ResNeXt50 (380x380)                          0.960  0.788     0.892     0.785        0.974
Ensemble (mean)                                 0.968  0.840     0.921     0.8448       0.981
Ensemble (trimmed mean)                         0.971  0.862     0.929     0.860        0.983
Ensemble (trimmed mean, binary classification)  0.981  0.989     0.986     0.991        0.991
Table 1: Results of experiments and metrics tracked, without using TTA.

Model                                           QWK    Macro F1  Accuracy  Sensitivity  Specificity
EfficientNet-B4                                 0.966  0.806     0.902     0.809        0.977
EfficientNet-B5                                 0.963  0.812     0.902     0.807        0.976
SE-ResNeXt50 (512x512)                          0.971  0.853     0.928     0.868        0.983
SE-ResNeXt50 (380x380)                          0.962  0.799     0.899     0.798        0.976
Ensemble (mean)                                 0.968  0.827     0.917     0.828        0.980
Ensemble (trimmed mean)                         0.969  0.840     0.919     0.840        0.981
Ensemble (trimmed mean, binary classification)  0.986  0.993     0.993     0.993        0.993
Table 2: Results of experiments and metrics tracked, with using TTA.
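
The last row of each table treats the task as binary screening (DR / No DR). The sketch below shows one way such sensitivity and specificity values could be derived from five-class grades. It is written for illustration only and assumes that any grade above 0 counts as DR; the function name `screening_metrics` and the sample grades are hypothetical and are not taken from the paper's data.

```python
def screening_metrics(true_stages, predicted_stages):
    """Return (sensitivity, specificity) after collapsing grades to DR / No DR."""
    if len(true_stages) != len(predicted_stages):
        raise ValueError("both lists must have the same length")

    tp = fp = tn = fn = 0
    for true_stage, predicted_stage in zip(true_stages, predicted_stages):
        has_dr = true_stage > 0          # assumption: grades 1-4 all count as DR
        flagged = predicted_stage > 0
        if has_dr and flagged:
            tp += 1
        elif has_dr and not flagged:
            fn += 1
        elif not has_dr and flagged:
            fp += 1
        else:
            tn += 1

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one DR and one No DR example")

    sensitivity = tp / (tp + fn)   # share of DR eyes that were flagged
    specificity = tn / (tn + fp)   # share of healthy eyes that were cleared
    return sensitivity, specificity


# Hypothetical grades for six images.
sens, spec = screening_metrics([0, 0, 1, 2, 4, 0], [0, 1, 1, 0, 3, 0])
```

Note that a prediction of grade 3 for a true grade 4 still counts as a correct screening decision here, which is one reason binary scores can be higher than the five-class ones.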

7 CONCLUSION

In this paper, we proposed a multistage transfer learning approach and an automatic method for detecting the stage of diabetic retinopathy from a single photograph of the human fundus. We used an ensemble of 3 CNN architectures (EfficientNet-B4, EfficientNet-B5, SE-ResNeXt50) and transfer learning for our final solution. The experimental results show that the proposed method achieves high and stable results even with an unstable metric. The main advantage of this method is that it increases generalization and reduces variance by using an ensemble of networks, pretrained on a large dataset and fine-tuned on the target dataset. Future work can extend this method with the calculation of SHAP for the whole ensemble, not only for a particular network, and with more accurate hyperparameter optimization. Besides, we can do experiments using the pretrained encoders on other tasks related to eye ailments. Also, it is possible to investigate meta-learning (Nichol et al., 2018) with these models, but we realized that this requires separate in-depth research.

REFERENCES

A. Buslaev, A. Parinov, E. K. V. I. I. and Kalinin, A. A. (2018). Albumentations: fast and flexible image augmentations. ArXiv e-prints.
Ananth, C. V. and Kleinbaum, D. G. (1997). Regression models for ordinal responses: a review of methods and applications. International Journal of Epidemiology, 26(6):1323–1333.
APTOS (2019). APTOS 2019 blindness detection. Accessed: 2019-10-20.
Asiri, N., Hussain, M., and Aboalsamh, H. A. (2018). Deep learning based computer-aided diagnosis systems for diabetic retinopathy: A survey. CoRR, abs/1811.01238.
Carson Lam, Darvin Yi, M. G. and Lindsey, T. (2018). Automated detection of diabetic retinopathy using deep learning.
Cheng, J. (2007). A neural network approach to ordinal regression. CoRR, abs/0704.1028.
Christopher E.Hann, J. Geoffrey Chase, J. A. R. D. H. G. M. S. (2009). Diabetic retinopathy screening using computer vision.
Conde, P., de la Calleja, J., Medina, M., and Benitez Ruiz, A. B. (2012). Application of machine learning to classify diabetic retinopathy.
Decencière, E., Zhang, X., Cazuguel, G., Lay, B., Cochener, B., Trone, C., Gain, P., Ordonez, R., Massin, P., Erginay, A., Charton, B., and Klein, J.-C. (2014). Feedback on a publicly distributed database: the messidor database. Image Analysis & Stereology, 33(3):231–234.
Devries, T. and Taylor, G. W. (2017). Improved regularization of convolutional neural networks with cutout. CoRR, abs/1708.04552.
EyePACs (2015). Diabetic retinopathy detection. Accessed: 2019-10-20.
Google Brain (2018). Messidor-2 diabetic retinopathy grades. Accessed: 2019-10-20.
Hagos, M. T. and Kant, S. (2019). Transfer learning based detection of diabetic retinopathy from small dataset. CoRR, abs/1905.07203.
Harry Pratt, Frans Coenen, D. M. B. S. P. H. Y. Z. (2016). Convolutional neural networks for diabetic retinopathy.
He, K., Zhang, X., Ren, S., and Sun, J. (2015). Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. CoRR, abs/1502.01852.
Hu, J., Shen, L., and Sun, G. (2017). Squeeze-and-excitation networks. CoRR, abs/1709.01507.
Iglovikov, V. and Shvets, A. (2018). Ternausnet: U-net with VGG11 encoder pre-trained on imagenet for image segmentation. CoRR, abs/1801.05746.
Kolesnikov, S. (2018). Reproducible and fast dl and rl.
Krause, J., Gulshan, V., Rahimy, E., Karth, P., Widner, K., Corrado, G. S., Peng, L., and Webster, D. R. (2017). Grader variability and the importance of reference standards for evaluating machine learning models for diabetic retinopathy. CoRR, abs/1710.01711.
Krogh, A. and Hertz, J. A. (1992). A simple weight decay can improve generalization. In Moody, J. E., Hanson, S. J., and Lippmann, R. P., editors, Advances in Neural Information Processing Systems 4, pages 950–957. Morgan-Kaufmann.
Lin, T., Goyal, P., Girshick, R. B., He, K., and Dollár, P. (2017). Focal loss for dense object detection. CoRR, abs/1708.02002.
Liyuan Liu, Haoming Jiang, P. H. W. C. X. L. J. G. J. H. (2019). On the variance of the adaptive learning rate and beyond. CoRR, abs/1908.03265.
Loshchilov, I. and Hutter, F. (2016). SGDR: stochastic gradient descent with restarts. CoRR, abs/1608.03983.
Lundberg, S. M. and Lee, S.-I. (2017). A unified approach to interpreting model predictions. In Guyon, I., Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R., editors, Advances in Neural Information Processing Systems 30, pages 4765–4774. Curran Associates, Inc.
Michael D. Abràmoff, Joseph M. Reinhardt, S. R. R. J. C. F. V. B. M. M. N. and Quellec, G. (2010). Automated early detection of diabetic retinopathy.
Nathan Silberman, Kristy Ahlrich, R. F. and Subramanian, L. (2010). Case for automated detection of diabetic retinopathy.
NCBI (2018). The economic impact of sight loss and blindness in the uk adult population.
NCHS (2019). Eye disorders and vision loss among u.s. adults aged 45 and over with diagnosed diabetes.
Nichol, A., Achiam, J., and Schulman, J. (2018). On first-order meta-learning algorithms. CoRR, abs/1803.02999.
Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., and Lerer, A. (2017). Automatic differentiation in PyTorch. In NIPS Autodiff Workshop.
Priya, R. and Aruna, P. (2012). Svm and neural network based diagnosis of diabetic retinopathy.
Rohan T, Frost C, W. N. (1989). Prevention of blindness by screening for diabetic retinopathy: a quantitative assessment.
Rubina Sarki, Sandra Michalska, K. A. H. W. Y. Z. (2019). Convolutional neural networks for mild diabetic retinopathy detection: an experimental study. bioRxiv.
Sahasrabuddhe, P. P. S. P. R. K. M. K. G. D. V. and Meriaudeau, F. (2018). Indian diabetic retinopathy image dataset (idrid).
Shaohua Wan, Yan Liang, Y. Z. (2018). Deep convolutional neural networks for diabetic retinopathy detection by image classification.
SNEC (2019). Singapore's eye health.
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2016). Rethinking the inception architecture for computer vision. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
Tan, M. and Le, Q. V. (2019). Efficientnet: Rethinking model scaling for convolutional neural networks. arXiv:1905.11946. Published in ICML 2019.
van der Maaten, L. and Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9:2579–2605.
Yung-Hui Li, Nai-Ning Yeh, S.-J. C. and Chung, Y.-C. (2019). Computer-assisted diagnosis for diabetic retinopathy based on fundus images using deep convolutional neural network.
