Combined Optic Disc and Optic Cup Segmentation Network Based on Adversarial Learning
ABSTRACT Glaucoma is a group of diseases characterized by progressive optic nerve damage, ultimately resulting in irreversible visual impairment. Early diagnosis through color fundus photography, including measurement of the vertical cup-to-disc ratio (CDR), can help prevent vision loss. The normal range of CDR values is usually 0.3–0.5; values above 0.6 are considered suspicious for glaucoma. However, asymmetrical thinning at the edges of the inferior-superior-temporal-nasal (ISNT) region and large differences between datasets pose challenges for existing automatic segmentation methods. To address these challenges, this paper proposes a joint segmentation method for the optic disc (OD) and optic cup (OC) based on an adversarial network, incorporating new supervision functions to guide the network optimization process. The effectiveness and stability of this framework were evaluated on two public retinal fundus image datasets, Drishti-GS and REFUGE. On the Drishti-GS dataset, our method achieved a score of 0.850/0.964/0.086, while on the REFUGE dataset it obtained a score of 0.887/0.975/0.061. These results indicate the effectiveness of our approach.
INDEX TERMS Adversarial learning, deep learning, color fundus photography, glaucoma.
of the apparent cup. Subsequently, the extracted image is transformed using polar coordinates to increase the proportion of the optic disc and optic cup regions in the entire image, thereby enhancing the accuracy of subsequent segmentation. The segmentation process employs adversarial learning networks, which are adjusted in terms of network structure and loss function to improve overall performance. To evaluate the effectiveness of the proposed method, experiments were conducted on two public datasets (Drishti-GS and REFUGE). The results demonstrate that the model achieves the desired outcomes.

This work's primary contributions are as follows:
(1) We explored supervised adversarial networks to improve the adaptability of different segmentation networks to different datasets and to improve the generalization ability of networks.
(2) A polar coordinate transformation method is proposed to transform the image so that the accuracy of the final segmentation task can be improved.
(3) We evaluated our model on two public fundus image datasets and achieved good results in the joint OD and OC segmentation task.

The remainder of this paper is organized as follows. We review related techniques in Section II, and Section III introduces the proposed method in detail. The evaluation and results are presented in Section IV. Finally, we discuss the results and draw conclusions in Sections V and VI.

II. RELATED WORKS
Nowadays, many researchers are engaged in research on the segmentation of the OD and OC, and many methods are effective. These methods rely heavily on hand-crafted visual features for segmentation, such as image gradient information, features of stereoscopic image pairs, local texture features, and superpixel-based classifiers. The boundary between the OC and OD is often difficult to distinguish, so OC segmentation is harder and relies more on manually annotated features. In recent years, it has been found that joint segmentation of the OD and OC can improve the performance of segmentation networks [28].

OD Segmentation: The optic disc, also known as the optic nerve papilla, is located at the posterior pole of the eyeball, approximately 3 millimeters to the nasal side and about 1.5 millimeters in diameter. The optic disc plays an essential role in fundus examination. Early methods of extracting optic disc boundaries relied on templates. Lowell et al. [29] utilized image gradient changes to segment optic disc boundaries and incorporated the active contour method. Since both optic discs and optic cups have ellipsoid shapes, methods based on circular transformation have also been used [2], [3]. To enhance the robustness of the model, Fu et al. [9] incorporated local texture features in a multidimensional space. Pixel classification methods have achieved notable results in current semantic segmentation tasks, particularly in the field of medical image segmentation; by converting the boundary segmentation task into a pixel classification problem, researchers have found it more tractable. Cheng et al. [30] used a superpixel classifier to segment the optic disc and optic cup, with manually crafted visual features to improve detection accuracy. Abramoff et al. [5] introduced parallax values extracted from stereo image pairs to distinguish the optic disc from the background. Although these methods show good results, they all rely on manual annotation information and are therefore more susceptible to image quality and pathological changes.

TABLE 1. Advantages and disadvantages of the above methods.

OC Segmentation: An important indicator for diagnosing glaucoma is the optic cup, situated in the center of the optic disc within a brighter oval depression. Enlargement of the optic cup encroaches on the optic disc rim, indicating glaucoma. Under normal circumstances, the optic cup is less than 1/3 the size of the optic disc, but this proportion is larger in patients with glaucoma. Wong et al. proposed a level-set algorithm to automatically segment the boundary of the OC [31]. Later, blood vessel curvature information in retinal images was shown to be beneficial for OC segmentation [32]. However, because of the natural distortion of fundus blood vessels near the OC boundary, the accuracy of OC segmentation based on blood vessel distortion information is not satisfactory. In addition, Cheng et al. introduced the pixel classifier method into the OC segmentation task [30]. More and more useful methods are being introduced into OC segmentation tasks [33], [34]. All of the aforementioned methods depend on manually labeled visual features, primarily focusing on the contrast information between the edge of the optic nerve head and the optic cup.

Joint OD and OC Segmentation: The optic disc and optic cup are closely related in physiological structure; the optic cup is contained within the optic disc, which means that pixels belonging to the optic cup also belong to the optic disc. Joint segmentation of the optic disc and cup can achieve high accuracy in calculating the CDR value [35]. Joshi et al. segmented the OD and OC step by step [36]. Zheng et al. integrated a prior graph-cut framework into OD and OC segmentation [37]. The above methods are based on the assumption that any pixel in the fundus image belongs to only one class, such as background, optic disc, or optic cup.
A. ROI EXTRACTION
Increasing the proportion of the optic disc and optic cup in the whole image can help improve the accuracy of segmentation. For this purpose, we use the ROI extraction framework to locate the optic disc and crop the region around it.

FIGURE 4. The proposed segmentation network architecture. It includes a down-sampling part, an up-sampling part, and a skip connection part, as well as an ASPP module.
Joint Morphology Loss: We used new functions to guide the learning of the network. A Dice coefficient loss $L_{DL}$ and a smoothness loss $L_{SL}$ make up this compound loss function. We can express it as:

$$L_{seg} = \lambda_1 L_{Dice}(p_d, y_d) + \lambda_2 L_{Dice}(p_c, y_c) + \lambda_3 \big[ L_{Smooth}(p_d, y_d) + L_{Smooth}(p_c, y_c) \big] \tag{1}$$

where $p_d$ and $y_d$ represent the OD prediction probability map and the binary ground-truth mask after polar coordinate transformation, and $p_c$ and $y_c$ represent the OC prediction probability map and the binary ground-truth mask after polar transformation; by adjusting the $\lambda$ weights, the contribution of each loss term can be changed.

The Dice coefficient loss [41] measures the overlap between the prediction and the ground truth, and is written as

$$L_{Dice}(p, y) = 1 - \frac{2\sum_{i\in\Omega} p_i\, y_i}{\sum_{i\in\Omega} p_i^2 + \sum_{i\in\Omega} y_i^2} \tag{2}$$

where $\Omega$ represents all pixels in the image after polar coordinate transformation, $p$ represents the predicted probability map, and $y$ represents the ground-truth mask after polar coordinate transformation. The smoothness loss is written as:

$$L_{Smooth}(p, y) = \sum_{i\in\Omega} \sum_{j\in N_4(i)} B_{i,j} \times y_i \times \left| p_i - p_j \right| \tag{3}$$

$$B_{i,j} = \begin{cases} 1 & \text{if } y_i = y_j \\ 0 & \text{otherwise} \end{cases} \tag{4}$$

The smoothness of the contour image can be improved by reducing the variation between adjacent pixels, where $N_4(i)$ denotes the four-connected neighbors of pixel $i$, $p$ denotes the prediction, and $y$ denotes the ground truth.
By constantly updating the parameters of the segmentation network and the discriminator, better segmentation results can be obtained. We use formula (5) as the objective function of the discriminator:

$$L_D = -\sum_{m,n} \Big[ z \log\big(D(S(I_c^S))\big) + (1 - z)\log\big(1 - D(S(I_c^T))\big) \Big] \tag{5}$$

where $z = 1$ if the patch prediction is from the training set, and $z = 0$ if the patch prediction is from the testing set. In the process of segmentation, we use the supervision function to ensure the reliability of the segmentation results, which includes two parts:

$$L_S = L_{seg}(I_c^S) + L_{adv}(I_c^T) \tag{6}$$

$$L_{adv}(I_c^T) = -\sum_{m,n} \log\big(D(S(I_c^T))\big) \tag{7}$$

We used two public datasets whose training parts contain the original images and the corresponding manual annotation images; in the optimization process of the network, we can use the joint morphology-aware segmentation loss $L_{seg}$.
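A corresponding sketch of the adversarial terms in Eqs. (5)-(7), again in NumPy, where d_train and d_test stand for the patch-level discriminator outputs $D(S(I_c^S))$ and $D(S(I_c^T))$ (these names are ours, not the paper's):

import numpy as np

def discriminator_loss(d_train, d_test, eps=1e-7):
    # Eq. (5): binary cross-entropy over patch predictions, with z = 1
    # for training-set patches and z = 0 for testing-set patches.
    return -(np.sum(np.log(d_train + eps))
             + np.sum(np.log(1.0 - d_test + eps)))

def adversarial_loss(d_test, eps=1e-7):
    # Eq. (7): drives the segmenter to produce testing-set predictions
    # that the discriminator scores as "training-like".
    return -np.sum(np.log(d_test + eps))

# Eq. (6), as a sketch: the total supervision for the segmenter is
# joint_morphology_loss(...) on source images plus adversarial_loss(d_test).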
C. POLAR TRANSFORMATION FOR FUNDUS IMAGE
In order to improve the performance of our network, we use the method of polar coordinate transformation. The transformed image gives the OD and OC a definite spatial constraint relationship, which accords with the actual physiological structure. If we set the center of the disc as $O(u_0, v_0)$, then any pixel $P(u, v)$ in the original color fundus image corresponds to a pixel $P'(\theta, r)$ in the transformed image, as shown in Figure 6(c), where $r$ represents the radius from the origin $O$ and $\theta$ represents the azimuth. Polar coordinates correspond to Cartesian coordinates as follows:

$$\begin{cases} u = r\cos\theta \\ v = r\sin\theta \end{cases} \iff \begin{cases} r = \sqrt{u^2 + v^2} \\ \theta = \tan^{-1}(v/u) \end{cases} \tag{8}$$
IV. RESULTS
Experiments and Results: To verify the validity of the proposed method, we conducted experiments on two publicly available datasets, Drishti-GS and REFUGE.
The method is verified on a computer with an Intel i5-9400 CPU, 32 GB of RAM, an Nvidia 2080Ti GPU, and the Windows 10 platform; it is implemented in Python with TensorFlow as the back end. First, we crop the original image to 480×480 with the optic disc at the center and pass this region to the network as the ROI for the next operation. According to previous experience, datasets that are too small tend to cause network overfitting, and simple data augmentation, such as translation and rotation, cannot solve the overfitting problem. To address this issue, polar coordinate transformation is utilized to enhance the performance of the network; the process is depicted in Figure 7. During the training phase, the Adam optimizer iteratively optimizes the network's parameters. The initial learning rate is set at 1e-3 and adjusted in subsequent iterations. The discriminator network is also continuously optimized during training, with learning rates set at 2e-5 and 1e-5, respectively. The learning rate is reduced by a factor of 10 every 4 learning periods, for a total of 400 learning periods. After obtaining the predicted mask, a morphological operation is performed as a post-processing step to refine the segmentation mask.

FIGURE 7. Visual examples of polar coordinate transformation on the Drishti-GS and REFUGE datasets. The top three rows are from the Drishti-GS dataset and the bottom three rows are from the REFUGE dataset, where (a) fundus image, (b) ROI region, (c) polar transformation of the ROI region, (d) polar transformation of the label, and (e) label for the ROI region.
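The learning-rate schedule and the morphological clean-up described above could be sketched as follows (the step schedule is taken from the numbers in the text, while the choice of morphological operations is our assumption; the paper does not specify them):

import numpy as np
from scipy import ndimage

def lr_at(period, base_lr=1e-3, drop=0.1, every=4):
    # Learning rate divided by 10 every 4 learning periods.
    return base_lr * drop ** (period // every)

def refine_mask(mask):
    # Post-processing sketch: fill holes, then keep the largest
    # connected component of the predicted binary mask.
    mask = ndimage.binary_fill_holes(mask)
    labels, n = ndimage.label(mask)
    if n > 1:
        sizes = ndimage.sum(mask, labels, index=range(1, n + 1))
        mask = labels == (1 + int(np.argmax(sizes)))
    return mask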
Evaluation Metrics: We use the following indicators to evaluate our segmentation results: the Dice coefficient (Dice), the Jaccard index (Jac), and the vertical cup-to-disc ratio (CDR). The criteria are defined as:

$$\text{Dice} = \frac{2 \times N_{tp}}{2 \times N_{tp} + N_{fp} + N_{fn}} \tag{9}$$

$$\text{Jaccard} = \frac{N_{tp}}{N_{tp} + N_{fp} + N_{fn}} \tag{10}$$

$$\delta = \left| CDR_p - CDR_g \right|, \qquad CDR = \frac{VD_{cup}}{VD_{disc}} \tag{11}$$

where $N_{tp}$, $N_{fp}$, and $N_{fn}$ denote the numbers of true-positive, false-positive, and false-negative pixels, and $VD_{cup}$ and $VD_{disc}$ are the vertical diameters of the cup and disc.
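In NumPy, these metrics can be computed from binary masks as in the sketch below (the boolean-mask convention and the absolute difference in the CDR error are our reading of Eqs. (9)-(11)):

import numpy as np

def dice(pred, gt):
    # Eq. (9), with pred and gt as boolean masks.
    tp = np.sum(pred & gt)
    fp = np.sum(pred & ~gt)
    fn = np.sum(~pred & gt)
    return 2.0 * tp / (2.0 * tp + fp + fn)

def jaccard(pred, gt):
    # Eq. (10).
    tp = np.sum(pred & gt)
    return tp / (tp + np.sum(pred & ~gt) + np.sum(~pred & gt))

def vertical_diameter(mask):
    # Span of rows containing any foreground pixel.
    rows = np.where(mask.any(axis=1))[0]
    return (rows.max() - rows.min() + 1) if rows.size else 0

def cdr_error(pred_cup, pred_disc, gt_cup, gt_disc):
    # Eq. (11): CDR = VD_cup / VD_disc; delta is the CDR error.
    cdr_p = vertical_diameter(pred_cup) / vertical_diameter(pred_disc)
    cdr_g = vertical_diameter(gt_cup) / vertical_diameter(gt_disc)
    return abs(cdr_p - cdr_g)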
According to the experimental results, our approach demonstrates superior performance on the two publicly available datasets. As shown in Table 2, our method outperformed other methods in the joint OD and OC segmentation task. Specifically, on the Drishti-GS dataset our method achieved a score of 0.850/0.964/0.086, while on the REFUGE dataset it obtained a score of 0.887/0.975/0.061. These results indicate the effectiveness of our approach.
Figure 8 presents the segmentation results on the
Drishti-GS and REFUGE datasets. Based on these results,
we can conclude that our method's segmentation results (column C) are more accurate and closer to the gold standard than those of U-Net (column D). This indicates that our approach can provide more reliable technical support to clinicians.

In the experiment, we also calculated the CDR value to measure the usefulness of the proposed method in glaucoma screening. The results on the ORIGA and REFUGE datasets are shown in columns 4 and 8 of Table 1. From the glaucoma screening results, we have the following observations: most FCN-based segmentation networks rely heavily on pixel-level labels, resulting in unsatisfactory results; however, the GNS network has some advantages in network structure, allowing it to collect high-dimensional features and ultimately produce better results. As shown in Figure 9, the ROC curves give AUC = 0.8590 on the Drishti-GS dataset and AUC = 0.8788 on the REFUGE dataset.

FIGURE 9. The ROC curves with AUC scores for glaucoma screening based on the segmentation results on the Drishti-GS and REFUGE datasets.

FIGURE 10. Example of polar coordinate transformation when choosing different R-values for the same image. From left to right, R = 400, 500, 600, 700, 800.

Ablation Experiments: Ablation experiments were conducted on the Drishti-GS dataset. The results achieved by different components of the model are shown in Table 3. For ease of presentation, we take the U-Net framework as the baseline. When the GAN network is combined with the baseline, a better Dice score is obtained in optic disc segmentation, with an increase of 4.65%, but optic cup segmentation gets a lower score.

V. DISCUSSION
Glaucoma diagnosis primarily depends on retinal images, particularly the optic disc and cup region in the center of the image. However, existing methods suffer from limitations such as differences in labels, datasets, and the physiological structure between the optic disc and cup. To address these issues, we propose a generative adversarial network. Our approach offers several improvements over existing methods:
results in the references. In future work, we will explore the validation of model performance across different datasets.

VI. CONCLUSION
In this study, we utilized an enhanced generative adversarial network to segment the optic disc (OD) and optic cup (OC) in color fundus data. By transforming the joint segmentation task into a multi-label segmentation task, we successfully achieved the separation of these two structures. Our proposed framework comprises two components: a segmentation network and a discriminator. The segmentation network is designed to learn the conditional distribution between the fundus image and its corresponding label, while the discriminator is responsible for distinguishing the source of the image-label pair. To ensure that the OD and OC information is well balanced in the image, we first employed an ROI extraction network to isolate the region of interest, followed by a polar coordinate transformation for optimal performance. Our experimental results on two separate datasets indicate the viability of our approach. This method has the potential to aid clinicians in diagnosis, and we plan to evaluate its performance on additional public datasets in future work. Our approach also holds valuable insights for other related fields.
REFERENCES
[1] Y.-C. Tham, X. Li, T. Y. Wong, H. A. Quigley, T. Aung, and C.-Y. Cheng, "Global prevalence of glaucoma and projections of glaucoma burden through 2040," Ophthalmology, vol. 121, no. 11, pp. 2081–2090, Nov. 2014, doi: 10.1016/j.ophtha.2014.05.013.
[2] A. Aquino, M. E. Gegúndez-Arias, and D. Marín, "Detecting the optic disc boundary in digital fundus images using morphological, edge detection, and feature extraction techniques," IEEE Trans. Med. Imag., vol. 29, no. 11, pp. 1860–1869, Nov. 2010, doi: 10.1109/TMI.2010.2053042.
[3] S. Lu, "Accurate and efficient optic disc detection and segmentation by a circular transformation," IEEE Trans. Med. Imag., vol. 30, no. 12, pp. 2126–2133, Dec. 2011, doi: 10.1109/TMI.2011.2164261.
[4] A. Chakravarty and J. Sivaswamy, "Joint optic disc and cup boundary extraction from monocular fundus images," Comput. Methods Programs Biomed., vol. 147, pp. 51–61, Aug. 2017, doi: 10.1016/j.cmpb.2017.06.004.
[5] M. D. Abramoff, W. L. M. Alward, E. C. Greenlee, L. Shuba, C. Y. Kim, J. H. Fingert, and Y. H. Kwon, "Automated segmentation of the optic disc from stereo color photographs using physiologically plausible features," Investigative Ophthalmol. Vis. Sci., vol. 48, no. 4, p. 1665, Apr. 2007, doi: 10.1167/iovs.06-1081.
[6] M. Hayat, S. Aramvith, and T. Achakulvisut, "SEGSRNet for stereo-endoscopic image super-resolution and surgical instrument segmentation," 2024, arXiv:2404.13330.
[7] M. Hayat, S. Aramvith, and T. Achakulvisut, "Combined channel and spatial attention-based stereo endoscopic image super-resolution," in Proc. IEEE Region 10 Conf. (TENCON), Chiang Mai, Thailand, Oct. 2023, pp. 920–925, doi: 10.1109/TENCON58879.2023.10322331.
[8] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proc. Adv. Neural Inf. Process. Syst. (NIPS), 2012, pp. 1097–1105, doi: 10.1145/3065386.
[9] H. Fu, J. Cheng, Y. Xu, D. W. K. Wong, J. Liu, and X. Cao, "Joint optic disc and cup segmentation based on multi-label deep network and polar transformation," IEEE Trans. Med. Imag., vol. 37, no. 7, pp. 1597–1605, Jul. 2018, doi: 10.1109/TMI.2018.2791488.
[10] Y. Jiang, L. Duan, J. Cheng, Z. Gu, H. Xia, H. Fu, C. Li, and J. Liu, "JointRCNN: A region-based convolutional neural network for optic disc and cup segmentation," IEEE Trans. Biomed. Eng., vol. 67, no. 2, pp. 335–343, Feb. 2020, doi: 10.1109/TBME.2019.2913211.
[11] Z. Gu, J. Cheng, H. Fu, K. Zhou, H. Hao, Y. Zhao, T. Zhang, S. Gao, and J. Liu, "CE-Net: Context encoder network for 2D medical image segmentation," IEEE Trans. Med. Imag., vol. 38, no. 10, pp. 2281–2292, Oct. 2019, doi: 10.1109/TMI.2019.2903562.
[12] Z. Wang, N. Dong, S. D. Rosario, M. Xu, P. Xie, and E. P. Xing, "Ellipse detection of optic disc-and-cup boundary in fundus images," in Proc. IEEE 16th Int. Symp. Biomed. Imag. (ISBI), Apr. 2019, pp. 601–604.
[13] K. K. Maninis, J. Pont-Tuset, P. Arbeláez, and L. Van Gool, "Deep retinal image understanding," in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent. Cham, Switzerland: Springer, 2016, pp. 140–148, doi: 10.1007/978-3-319-46723-8_17.
[14] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2015, pp. 3431–3440, doi: 10.1109/CVPR.2015.7298965.
[15] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," 2014, arXiv:1409.1556.
[16] J. Zilly, J. M. Buhmann, and D. Mahapatra, "Glaucoma detection using entropy sampling and ensemble learning for automatic optic cup and disc segmentation," Computerized Med. Imag. Graph., vol. 55, pp. 28–41, Jan. 2017, doi: 10.1016/j.compmedimag.2016.07.012.
[17] F. Ding, G. Yang, J. Wu, D. Ding, J. Xv, G. Cheng, and X. Li, "High-order attention networks for medical image segmentation," in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent. Cham, Switzerland: Springer, 2020, pp. 253–262, doi: 10.1007/978-3-030-59710-8_25.
[18] S. Wang, L. Yu, X. Yang, C.-W. Fu, and P.-A. Heng, "Patch-based output space adversarial learning for joint optic disc and cup segmentation," IEEE Trans. Med. Imag., vol. 38, no. 11, pp. 2485–2495, Nov. 2019, doi: 10.1109/TMI.2019.2899910.
[19] S. Hong, J. Oh, H. Lee, and B. Han, "Learning transferrable knowledge for semantic segmentation with deep convolutional neural network," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 3204–3212.
[20] J. Hoffman, D. Wang, F. Yu, and T. Darrell, "FCNs in the wild: Pixel-level adversarial and constraint-based adaptation," 2016, arXiv:1612.02649.
[21] Y.-H. Chen, W.-Y. Chen, Y.-T. Chen, B.-C. Tsai, Y.-C. F. Wang, and M. Sun, "No more discrimination: Cross city adaptation of road scene segmenters," 2017, arXiv:1704.08509.
[22] M. Wang and W. Deng, "Deep visual domain adaptation: A survey," Neurocomputing, vol. 312, pp. 135–153, Oct. 2018, doi: 10.1016/j.neucom.2018.05.083.
[23] S. Kadambi, Z. Wang, and E. Xing, "WGAN domain adaptation for the joint optic disc-and-cup segmentation in fundus images," Int. J. Comput. Assist. Radiol. Surg., vol. 15, no. 7, pp. 1205–1213, Jul. 2020, doi: 10.1007/s11548-020-02144-9.
[24] J. Sivaswamy, S. R. Krishnadas, G. D. Joshi, M. Jain, and A. U. S. Tabish, "Drishti-GS: Retinal image dataset for optic nerve head (ONH) segmentation," in Proc. IEEE 11th Int. Symp. Biomed. Imag. (ISBI), Apr. 2014, pp. 53–56.
[25] F. Fumero, S. Alayon, J. L. Sanchez, J. Sigut, and M. Gonzalez-Hernandez, "RIM-ONE: An open retinal image database for optic nerve evaluation," in Proc. 24th Int. Symp. Comput.-Based Med. Syst. (CBMS), Jun. 2011, pp. 1–6.
[26] J. I. Orlando et al., "REFUGE challenge: A unified framework for evaluating automated methods for glaucoma assessment from fundus photographs," Med. Image Anal., vol. 59, Jan. 2020, Art. no. 101570, doi: 10.1016/j.media.2019.101570.
[27] M. Arjovsky, S. Chintala, and L. Bottou, "Wasserstein GAN," 2017, arXiv:1701.07875.
[28] P. V. C. Hough, "Method and means for recognizing complex patterns," U.S. Patent 3 069 654, Dec. 3, 1962.
[29] J. Lowell, A. Hunter, D. Steel, A. Basu, R. Ryder, E. Fletcher, and L. Kennedy, "Optic nerve head segmentation," IEEE Trans. Med. Imag., vol. 23, no. 2, pp. 256–264, Feb. 2004, doi: 10.1109/tmi.2003.823261.
[30] J. Cheng, J. Liu, Y. Xu, F. Yin, D. W. K. Wong, N.-M. Tan, D. Tao, C.-Y. Cheng, T. Aung, and T. Y. Wong, "Superpixel classification based optic disc and optic cup segmentation for glaucoma screening," IEEE Trans. Med. Imag., vol. 32, no. 6, pp. 1019–1032, Jun. 2013, doi: 10.1109/TMI.2013.2247770.
[31] D. W. K. Wong, J. Liu, J. H. Lim, X. Jia, F. Yin, H. Li, and T. Y. Wong, "Level-set based automatic cup-to-disc ratio determination using retinal fundus images in ARGALI," in Proc. 30th Annu. Int. Conf. IEEE Eng. Med. Biol. Soc., Aug. 2008, p. 10, doi: 10.1109/IEMBS.2008.4649648.
[32] D. W. K. Wong, J. Liu, J. H. Lim, H. Li, and T. Y. Wong, "Automated detection of kinks from blood vessels for optic cup segmentation in retinal images," Proc. SPIE, vol. 7260, no. 6, pp. 964–970, 2009, doi: 10.1117/12.810784.
[33] Y. Xu, D. Xu, S. Lin, J. Liu, J. Cheng, C. Y. Cheung, T. Aung, and T. Y. Wong, "Sliding window and regression based cup detection in digital fundus images for glaucoma diagnosis," in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent. Berlin, Germany: Springer, 2011, pp. 1–8, doi: 10.1007/978-3-642-23626-6_1.
[34] Y. Xu, L. Duan, S. Lin, X. Chen, D. Wong, T. Wong, and J. Liu, "Optic cup segmentation for glaucoma detection using low-rank superpixel representation," in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent., 2014, pp. 1–8, doi: 10.1007/978-3-319-10404-1_98.
[35] Y. Xu, J. Liu, S. Lin, D. Xu, C. Y. Cheung, T. Aung, and T. Y. Wong, "Efficient optic cup detection from intra-image learning with retinal structure priors," in Proc. MICCAI, vol. 15, 2012, pp. 58–65, doi: 10.1007/978-3-642-33415-3_8.
[36] G. D. Joshi, J. Sivaswamy, and S. R. Krishnadas, "Optic disk and cup segmentation from monocular color retinal images for glaucoma assessment," IEEE Trans. Med. Imag., vol. 30, no. 6, pp. 1192–1205, Jun. 2011, doi: 10.1109/TMI.2011.2106509.
[37] Y. Zheng, D. Stambolian, J. O'Brien, and J. Gee, "Optic disc and cup segmentation from color fundus photograph using graph cut with priors," in Proc. MICCAI, 2013, pp. 75–82, doi: 10.1007/978-3-642-40763-5_10.
[38] A. Sevastopolsky, "Optic disc and cup segmentation methods for glaucoma detection with modification of U-Net convolutional neural network," Pattern Recognit. Image Anal., vol. 27, no. 3, pp. 618–624, Jul. 2017, doi: 10.1134/s1054661817030269.
[39] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in Proc. 18th Int. Conf. Med. Image Comput. Comput.-Assist. Intervent., vol. 9351. Cham, Switzerland: Springer, 2015, pp. 234–241, doi: 10.1007/978-3-319-24574-4_28.
[40] R. O. Duda and P. E. Hart, "Use of the Hough transformation to detect lines and curves in pictures," Commun. ACM, vol. 15, no. 1, pp. 11–15, Jan. 1972, doi: 10.1145/361237.361242.
[41] J. Hu, L. Shen, and G. Sun, "Squeeze-and-excitation networks," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 7132–7141, doi: 10.1109/CVPR.2018.00745.
[42] A. Sevastopolsky, S. Drapak, K. Kiselev, B. M. Snyder, J. D. Keenan, and A. Georgievskaya, "Stack-U-Net: Refinement network for improved optic disc and cup image segmentation," Proc. SPIE, vol. 10949, Mar. 2019, Art. no. 1094928, doi: 10.1117/12.2511572.
[43] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, "MobileNetV2: Inverted residuals and linear bottlenecks," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 4510–4520, doi: 10.1109/CVPR.2018.00474.
[44] M. Tan, B. Chen, R. Pang, V. Vasudevan, M. Sandler, A. Howard, and Q. V. Le, "MnasNet: Platform-aware neural architecture search for mobile," 2018, arXiv:1807.11626.
[45] S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," 2015, arXiv:1502.03167.
[46] S. Yu, D. Xiao, S. Frost, and Y. Kanagasingam, "Robust optic disc and cup segmentation with deep learning for glaucoma detection," Computerized Med. Imag. Graph., vol. 74, pp. 61–71, Jun. 2019, doi: 10.1016/j.compmedimag.2019.02.005.
[47] X. Ren, S. Ahmad, L. Zhang, L. Xiang, D. Nie, F. Yang, Q. Wang, and D. Shen, "Task decomposition and synchronization for semantic biomedical image segmentation," IEEE Trans. Image Process., vol. 29, pp. 7497–7510, 2020, doi: 10.1109/TIP.2020.3003735.
[48] M. K. Kar and M. K. Nath, "Efficient segmentation of vessels and disc simultaneously using multi-channel generative adversarial network," Social Netw. Comput. Sci., vol. 5, no. 3, p. 288, Feb. 2024, doi: 10.1007/s42979-024-02610-0.
YONG LIU was born in Henan, China, in 1986. He received the bachelor's degree in electrical engineering and automation from Yanshan University, in 2009, and the master's degree in circuit and system from Henan Normal University, in 2013. He is currently pursuing the Ph.D. degree in control science and engineering with Wuhan University of Science and Technology. His research interests include artificial intelligence, medical image processing, and intelligent control.
JIN WU (Member, IEEE) was born in Anhui, China, in 1967. She received the degree in electronic information engineering from Huazhong University of Science and Technology, in 1988, the master's degree in detection technology and automatic equipment from the University of Science and Technology Beijing, in 1997, and the Ph.D. degree in pattern recognition and intelligent systems from Huazhong University of Science and Technology, in 2006.
She became the Director of the Department of Electronic and Information Engineering, Wuhan University of Science and Technology, in 1998, and the Associate Dean of the School of Information, Wuhan University of Science and Technology, in 2007. Her research interests include image processing, pattern recognition and intelligent systems, signal and information processing, and multimedia communication.
Dr. Wu is a member of the Professional Committee of Image and Video Processing and Communication of the Chinese Society of Image and Graphics, and a Senior Member of the Chinese Institute of Electronics and the Chinese Optical Society.

XUEZHI ZHOU was born in Henan, China, in 1993. He received the B.S. degree in electronic information engineering from Henan Normal University, Xinxiang, China, in 2014, and the Ph.D. degree in information and communication engineering from Xidian University, Xi'an, China, in 2020. His research interests include radiomics, medical image analysis, and bioinformatics.