
2019 12th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI)

Automatic Thyroid Ultrasound Image Segmentation Based on U-shaped Network
Jianrui Ding*, Zichen Huang, Mengdie Shi
School of Computer Science and Technology
Harbin Institute of Technology
Harbin, China

Chunping Ning
Department of Ultrasound
The Affiliated Hospital of Qingdao University
Qingdao, China

Abstract—Automatic tumor segmentation in thyroid ultrasound images is challenging because of the poor image quality. Recently, U-shaped networks, especially U-Net, have achieved good results in medical image segmentation. In this paper, we propose a modified U-Net model (ReAgU-Net) that embeds improved residual units into the skip connections between the encoding and decoding paths and introduces an attention gate mechanism that multiplies the weighted feature maps obtained from shallow layers with the features from deep layers. In addition, a hyperparameter is introduced to combine Focal-Tversky loss, Dice loss and cross-entropy loss so that they jointly guide the model optimization process. The experimental results demonstrate that the proposed approach outperforms the other U-shaped models.

Keywords-thyroid ultrasound image; automatic segmentation; U-Net; ReAgU-Net

I. INTRODUCTION

Thyroid nodules are one of the most common diseases in adults. Although most nodules are benign, the incidence of thyroid cancer has risen quickly in recent years. In the global cancer statistics for 2018, the incidence and mortality of thyroid cancer rank ninth and sixth respectively [1]. A computer-aided diagnosis (CAD) system can describe the nodules objectively and quantitatively, reduce the subjectivity of doctors, and provide a useful reference for them. Automatic thyroid ultrasound image segmentation is a key step in CAD and is very challenging because of low contrast, speckle noise, weak boundaries and artifacts.

Following their great success in natural scene image analysis, more and more deep learning methods have been applied to medical image segmentation [2-8], including thyroid ultrasound image segmentation [9-11]. The major strategy of these approaches is to apply a Convolutional Neural Network (CNN) to encode the image and then upsample the deep features to decode it. Although some results have been achieved, several problems still need to be studied in depth.

Natural scene images are easy to access and easy to label, and they come in large data sets. The deep learning models designed for them usually have a deep hierarchy and a large number of parameters. Medical images, in contrast, are difficult to obtain and label, and their data sets are small. Therefore, directly applying an existing deep learning model leaves the training data sparsely distributed in the feature space. This leads to over-fitting and harms the generalization ability of the model. Therefore, we need to adapt the existing deep learning models to fit medical images.

To solve the above problem, this paper proposes an improved U-shaped model. On the basis of U-Net, residual substructures and attention gates are embedded in the skip connections to narrow the semantic gap between shallow and deep features. In addition, considering the small objects in medical images, this paper improves the loss function so that the model gains sensitivity while still attending to overlap.

II. MATERIALS AND METHODS

A. Patients and imaging acquisition
A total of 192 patients (148 females, mean age 46.31±9.79 years, range 11~67 years; and 44 males, mean age 54.9±11.7 years, range 29~81 years) and 1936 images were evaluated in the study. The mean size of the nodules was 1.74cm (range 0.77~2.64cm).

Ultrasonography (US) acquisition was performed with the HITACHI Vision 900, HIVISION Preirus (Hitachi Medical System, Tokyo, Japan) and Siemens S2000 (Siemens Medical Solutions), equipped with a linear probe with a central frequency of 7.5~14.0MHz. All the examinations were conducted by two experienced sonographers, and all the nodules used in this study were delineated by them as the ground truth. Both of them have more than 6 years' experience. Sample images are shown in Fig.1.

Figure 1. Sample images from dataset

Authorized licensed use limited to: China University of Geosciences Wuhan Campus. Downloaded on December 10,2024 at 11:59:55 UTC from IEEE Xplore. Restrictions apply.
978-1-7281-4852-6/19/$31.00 ©2019 IEEE
B. Embedding residual unit in skip connections
The prototype of the residual unit [12] is shown in Fig.2(a). The structure of the new residual learning unit proposed in this paper is shown in Fig.2(b); it reduces the number of parameters while increasing the depth of the model. It can effectively combine the semantic-level features from the encoder with the abstract features from the decoder, thus narrowing the semantic gap to a certain extent.

(a) original residual unit (b) proposed residual unit
Figure 2. Residual unit

C. Using attention gate mechanism
To narrow the semantic gap, we embed multiple residual units in the skip connections, which deepens the network horizontally as shown in Fig.2. This inevitably leads to a loss of spatial information and a location shift of the abstract features. The Attention Gate (AG) [13] mechanism extracts context information from low-level features, obtains a weighted feature map and multiplies it with the abstract features. This process can autonomously learn to focus on the target structure without additional supervision, so it has the effect of position correction. The AG mechanism is shown in Fig.3.

Figure 3. AG mechanism [13]

D. Design of loss function
In thyroid ultrasound images, nodules are usually much smaller than the background. Deep learning treats image segmentation as pixel classification, so the difference between the foreground and background areas appears as an imbalance between the positive and negative classes. Moreover, the cost of misclassifying foreground as background and that of misclassifying background as foreground cannot be treated equally. Cross Entropy (CE) and Dice loss cannot solve this class imbalance problem.

In the task of object detection, Lin et al. [14] compared single-stage and multi-stage detection methods and found that a single-stage detector is often faster and simpler, but its accuracy lags behind that of a cascade detector. Their research shows that the most important problem is the severe imbalance between foreground and background categories. In this paper, a single-stage segmentation method is used, so Focal-Tversky loss is introduced. The three losses (Cross Entropy, Dice and Focal-Tversky) are combined as in equation (1).

$$L_c = \varepsilon \cdot FTL_c + (1 - \varepsilon) \cdot (CE_c + DL_c) \tag{1}$$

The hyperparameter ε was determined by grid search.

E. Training ReAgU-Net
On the basis of U-Net, residual units and attention gates are embedded in the skip connections to narrow the semantic gap between shallow and deep features. The whole network structure of our model ReAgU-Net is shown in Fig.4.

Figure 4. Network Structure of ReAgU-Net

The ReAgU-Net model uses the Adam optimization algorithm [15] to iteratively update the weights of the model. Its main parameters are: α (learning rate, which controls the parameter update step), β1 (exponential decay rate of the first-order moment estimate, usually set to 0.9), β2 (exponential decay rate of the second-order moment estimate, usually set to 0.999) and ε (a very small constant, such as 10⁻⁷; this is the Adam constant, not the loss weight ε of equation (1)). The training process is shown in Fig.5.
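
As a minimal sketch of the Adam update rule used for training, the following NumPy code performs the moment updates, the bias correction and the parameter step of the standard formulation in [15]. The function name `adam_step` and the toy quadratic objective are illustrative assumptions; this is not the training code used for ReAgU-Net.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-7):
    """One Adam update (hypothetical helper). Returns new parameters and moments."""
    m = beta1 * m + (1.0 - beta1) * grad        # biased 1st moment estimate
    v = beta2 * v + (1.0 - beta2) * grad ** 2   # biased 2nd raw moment estimate
    m_hat = m / (1.0 - beta1 ** t)              # bias-corrected 1st moment
    v_hat = v / (1.0 - beta2 ** t)              # bias-corrected 2nd raw moment
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy objective (an assumption for illustration): f(theta) = ||theta||^2,
# whose gradient is 2 * theta.
theta = np.array([1.0, -2.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)

# The first step moves each coordinate by roughly alpha, whatever the gradient scale.
first_theta, _, _ = adam_step(theta, 2.0 * theta, m, v, t=1, alpha=0.01)

for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2.0 * theta, m, v, t, alpha=0.01)
```

In the first step the bias-corrected ratio is close to the sign of the gradient, so the step size is governed by α alone; this is why α is described as controlling the parameter update step.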

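
To make equation (1) concrete, the following NumPy sketch combines the three loss terms for a single foreground class. The paper does not spell out its Focal-Tversky formulation, so the Tversky weights `alpha` and `beta`, the focal exponent `gamma` and the smoothing constant `smooth` are assumptions borrowed from common practice, and `eps_w` stands for the grid-searched weight ε. The function name `combined_loss` is hypothetical.

```python
import numpy as np

def combined_loss(pred, target, eps_w=0.5, alpha=0.7, beta=0.3, gamma=0.75, smooth=1e-6):
    """Sketch of L = eps_w * FTL + (1 - eps_w) * (CE + DL) for one foreground class.

    pred:   predicted foreground probabilities in [0, 1]
    target: binary ground-truth mask of the same shape
    All default values are assumptions, not values reported in the paper.
    """
    p = np.clip(np.asarray(pred, dtype=float).ravel(), 1e-7, 1.0 - 1e-7)
    y = np.asarray(target, dtype=float).ravel()

    tp = np.sum(p * y)            # soft true positives
    fn = np.sum((1.0 - p) * y)    # soft false negatives
    fp = np.sum(p * (1.0 - y))    # soft false positives

    # Pixel-wise binary cross-entropy, averaged over all pixels.
    ce = -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    # Dice loss: one minus the soft Dice coefficient.
    dice = 1.0 - (2.0 * tp + smooth) / (2.0 * tp + fp + fn + smooth)
    # Tversky index weights FN and FP differently; the exponent focuses on hard cases.
    tversky = (tp + smooth) / (tp + alpha * fn + beta * fp + smooth)
    ftl = max(1.0 - tversky, 0.0) ** gamma

    return eps_w * ftl + (1.0 - eps_w) * (ce + dice)

# A small nodule-like mask: 4 foreground pixels out of 16.
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0
good = combined_loss(mask, mask)        # prediction equals the ground truth
bad = combined_loss(1.0 - mask, mask)   # prediction is fully inverted
```

Setting `alpha` larger than `beta` penalizes missed foreground pixels more than false alarms, which is one way to express that the two kinds of error should not be treated equally.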
Require: α: Learning rate
Require: β1, β2 ∈ [0,1): Exponential decay rates for the 1st and 2nd moment estimates
Require: ε: Very small constant
Require: m: Batch size
Require: f(θ): Stochastic objective function with parameters θ
Require: θ_0: Initial parameter vector
   m_0 ← 0 (Initialize 1st moment vector)
   v_0 ← 0 (Initialize 2nd moment vector)
   t ← 0 (Initialize timestep)
1  while θ_t not converged do
2    t ← t + 1
3    Randomly select m pairs of image and ground truth, i = 1, 2, …, m
4    Compute the segmented image for each selected image, i = 1, 2, …, m
5    g_t ← ∇θ f_t(θ_{t−1}) (Get gradients w.r.t. stochastic objective at timestep t)
6    m_t ← β1 · m_{t−1} + (1 − β1) · g_t (Update biased 1st moment estimate)
7    v_t ← β2 · v_{t−1} + (1 − β2) · g_t² (Update biased 2nd raw moment estimate)
8    m′_t ← m_t / (1 − β1^t) (Compute bias-corrected 1st moment estimate)
9    v′_t ← v_t / (1 − β2^t) (Compute bias-corrected 2nd raw moment estimate)
10   θ_t ← θ_{t−1} − α · m′_t / (√v′_t + ε) (Update parameters)
11 end while
12 return θ_t (Resulting parameters)

Figure 5. Training the ReAgU-Net

III. RESULTS

A. Evaluation Criteria
In this paper, four commonly used indexes are used to evaluate the segmentation algorithm: mean Intersection over Union (mIoU), Dice Similarity Coefficient (DSC), Precision and Recall. They are computed as in equations (2) to (5).

$$mIoU = \frac{1}{c} \sum_{i=1}^{c} \frac{TP_i}{TP_i + FN_i + FP_i} \tag{2}$$

$$DSC = \frac{2 \cdot TP}{2 \cdot TP + FP + FN} \tag{3}$$

$$Precision = \frac{TP}{TP + FP} \tag{4}$$

$$Recall = \frac{TP}{TP + FN} \tag{5}$$

where c is the number of classes (foreground and background), i indexes the class, and TP (True Positive), TN (True Negative), FP (False Positive) and FN (False Negative) are counts of pixels classified by the algorithm.

B. Experimental Results
The dataset was divided into a training set, a validation set and a test set in the ratio 7:2:1. The training set is used to train the model iteratively, and the convergence of the model is judged by the error on the validation set. Then, the trained model is used to segment the test set. The segmentation performance is shown in Table 1.

Table 1. Comparison of segmentation results

model       mIoU    DSC     Precision   Recall
U-Net       0.722   0.820   0.829       0.811
UNet++      0.765   0.854   0.872       0.837
ReAgU-Net   0.788   0.869   0.873       0.865

From the table, we can see that the ReAgU-Net model improves mIoU, DSC, Precision and Recall by 6.6, 4.9, 4.4 and 5.4 percentage points compared with the U-Net model, and by 2.3, 1.5, 0.1 and 2.8 percentage points compared with UNet++. This shows that the ReAgU-Net model recognizes the location and contour of nodules better and has higher accuracy.

Some results are shown in Fig. 6.

[Figure columns, left to right: Original Image, U-Net, ReAgU-Net, U-Net++, Ground Truth]
Figure 5. Training the ReAgU-Net

To better demonstrate the effectiveness of each improvement and compare their roles in the model, we add each improvement to the U-Net model in turn and compare their contributions. The results are shown in Table 2.

Table 2. Segmentation Results of Different Improvement Points

model                           mIoU    DSC     Precision   Recall
U-Net                           0.722   0.820   0.829       0.811
U-Net + AG                      0.754   0.844   0.838       0.850
U-Net + R-RB                    0.787   0.867   0.871       0.863
U-Net + AG + R-RB + Dice loss   0.766   0.851   0.858       0.844
ReAgU-Net                       0.788   0.869   0.873       0.865

From the table, we can see that the improved residual units contribute more to the performance than the attention gate mechanism, and that the loss function proposed in this paper further improves the performance compared with Dice loss alone.

C. Performance comparison under different data sets
To test the generalization ability of the model, in addition to the data provided by the Affiliated Hospital of Qingdao University, an open dataset of 428 thyroid ultrasound images from DDTI (Digital Database Thyroid Image) [16] was used. These images were collected with TOSHIBA Nemio 30
and TOSHIBA Nemio MX. The frequency of the linear probe is 12MHz. The location of the nodule in each image is recorded in an XML file, so the corresponding mask image can be obtained by parsing the XML file.

Some examples are shown in Fig. 6.

Figure 6. Sample image from DDTI

From the figure, we can see that multiple nodules exist in most images in the DDTI dataset, which greatly increases the difficulty of segmentation.

The mIoU values of the segmentation results are listed in Table 3.

Table 3. mIoU index of different datasets

model       Original dataset   DDTI
U-Net       0.722              0.201
UNet++      0.765              0.258
ReAgU-Net   0.788              0.260

The low performance on the DDTI dataset may be attributed first to the multiple nodules, and second to the small number of images in the DDTI dataset compared with the original dataset (428 vs. 1936). Deep learning algorithms require a large number of training samples to reach satisfactory results.

IV. DISCUSSIONS AND CONCLUSIONS
This paper proposed an improved U-shaped model for thyroid ultrasound image segmentation. The advantages of the proposed method are: (1) Residual units are embedded in the skip connections to lessen the semantic gap while reducing the parameters and increasing the depth of the model. A batch normalization layer is introduced to increase the backpropagated gradient, which avoids vanishing gradients and improves the convergence speed. (2) The attention gate mechanism is introduced to enable the model to learn autonomously and focus on the object structure without additional supervision, thus addressing the spatial information loss caused by increasing the horizontal depth of the network. (3) Combining the advantages of Dice loss, cross-entropy loss and Focal-Tversky loss, a new loss function is designed to address the imbalance between foreground and background categories in medical image segmentation. Experiments show that ReAgU-Net improves mIoU, DSC, precision and recall by 6.6, 4.9, 4.4 and 5.4 percentage points compared with U-Net, and it also achieves the highest mIoU of the three models on the DDTI dataset.

However, automatic segmentation of thyroid ultrasound images remains a very challenging task. There are still some images that the algorithm cannot segment well. Some examples are shown in Fig. 7.

[Figure columns, left to right: Original Image, U-Net, ReAgU-Net, U-Net++, Ground Truth]
Figure 7. Examples of unsatisfactory segmentation results

When the contrast between nodules and background is low, as shown in the first row, none of the three models performs well. The dark areas of blood vessels, muscle folds and low-echo regions in the image have gray levels similar to that of the lesion, so the probability of misjudgment increases in these cases. When multiple nodules appear, as in the DDTI dataset, the algorithm also needs improvement.

We believe that, besides continuing to collect a large number of samples to enrich the training data, we should make full use of prior knowledge of thyroid pathology and integrate it into the model. This will further improve the robustness of the model.

ACKNOWLEDGMENT

This work is supported, in part, by National Science Foundation of China; the grant number is 81501477.

REFERENCES

[1] F. Bray, J. Ferlay, I. Soerjomataram, et al. "Global Cancer Statistics 2018: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries," CA Cancer J Clin, vol. 68, pp. 394-424, 2018.
[2] O. Ronneberger, P. Fischer, T. Brox. "U-Net: Convolutional Networks for Biomedical Image Segmentation," International Conference on Medical image computing and computer-assisted intervention. Springer, Cham, pp. 234-241, 2015.
[3] J. Long, E. Shelhamer and T. Darrell. "Fully Convolutional Networks for Semantic Segmentation," Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 3431-3440, 2015.
[4] S. Zheng, S. Jayasumana, B. Romera-Paredes, et al. "Conditional Random Fields as Recurrent Neural Networks," Proceedings of the IEEE international conference on computer vision, pp. 1529-1537, 2015.
[5] M. Alom, M. Hasan, C. Yakopcic, et al. "Recurrent Residual Convolutional Neural Network based on U-Net (R2U-Net) for Medical Image Segmentation," arXiv preprint arXiv:1802.06955, 2018.
[6] Z. Zhou, M. Siddiquee, N. Tajbakhsh, et al. "UNet++: A Nested U-Net Architecture for Medical Image Segmentation," Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. Springer, Cham, pp. 3-11, 2018.
[7] Y. Xue, T. Xu, H. Zhang, et al. "SegAN: Adversarial Network with Multi-scale L1 Loss for Medical Image Segmentation," Neuroinformatics, vol. 16, pp. 383-392, 2018.
[8] M. Rezaei, K. Harmuth, W. Gierke, et al. "A Conditional Adversarial Network for Semantic Segmentation of Brain Tumor," International MICCAI Brainlesion Workshop. Springer, Cham, pp. 241-252, 2017.
[9] H. Li, J. Weng, Y. Shi, et al. "An improved deep learning approach for detection of thyroid papillary cancer in ultrasound images," Scientific Reports, vol. 8, pp. 1-12, 2018.
[10] P. Poudel, A. Illanes, D. Sheet, et al. "Evaluation of Commonly Used Algorithms for Thyroid Ultrasound Images Segmentation and Improvement Using Machine Learning Approaches," Journal of Healthcare Engineering, pp. 1-13, 2018.
[11] J. Ma, F. Wu, T. Jiang, et al. "Ultrasound image-based thyroid nodule automatic segmentation using convolutional neural networks," International Journal of Computer Assisted Radiology and Surgery, vol. 12, pp. 1895-1910, 2017.
[12] K. He, X. Zhang, S. Ren, et al. "Deep Residual Learning for Image Recognition," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, 2016.
[13] J. Fu, H. Zheng and T. Mei. "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-grained Image Recognition," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4438-4446, 2017.
[14] T. Lin, P. Goyal, R. Girshick, et al. "Focal Loss for Dense Object Detection," Proceedings of the IEEE International Conference on Computer Vision, pp. 2980-2988, 2017.
[15] D. Kingma, J. Ba. "Adam: A Method for Stochastic Optimization," arXiv preprint arXiv:1412.6980, 2014.
[16] L. Pedraza, C. Vargas, F. Narváez, et al. "An open access thyroid ultrasound image database," 10th International Symposium on Medical Information Processing and Analysis. International Society for Optics and Photonics, 2015.
