Article info

Article history:
Received 15 September 2019
Received in revised form 8 April 2020
Accepted 13 April 2020
Available online 14 May 2020

Keywords:
Inception v3
Artificial feature extraction
Transfer learning
Medical image processing
Cervical cancer disease diagnosis

Abstract

Traditional cell classification methods generally extract multiple features of the cell manually. Moreover, the simple use of artificial feature extraction methods has low universality; for example, it is unsuitable for cervical cell recognition because of the complexity of cervical cell texture and the large individual differences between cells. Using a convolutional neural network classification method is a good way to solve this problem. However, although the cell features can then be extracted automatically, the cervical cell domain knowledge is lost, and the corresponding features of different cell types are missing; hence, the classification effect is not sufficiently accurate. To address the limitations of the two mentioned classification methods, this paper proposes a cell classification algorithm that combines Inception v3 and artificial features, which effectively improves the accuracy of cervical cell recognition. In addition, to address the under-fitting problem and carry out effective deep learning training with a relatively small amount of medical data, this paper inherits the strong learning ability of transfer learning and achieves accurate and effective cervical cell image classification on the Herlev dataset. Using this method, an accuracy of more than 98% is achieved, providing an effective framework for computer-aided diagnosis of cervical cancer. The proposed algorithm has good universality, low complexity, and high accuracy, rendering it suitable for further extension and application to the classification of other types of cancer cells.

© 2020 Elsevier B.V. All rights reserved.
https://doi.org/10.1016/j.asoc.2020.106311
used when researchers extract the texture features of cells, which mainly include contrast, energy, and entropy. The classification of cervical cells was achieved by Walker et al. [16,17] using the GLCM method, whilst Plissiti et al. [18] used local binary pattern features to analyse cervical cell texture.

The CNN extracts features of the cell image through its self-learning capability. Hence, the specific learning methods and selected features are an uncertain aspect of the process, which could affect classification accuracy. Therefore, this paper proposes an innovative cell recognition algorithm that combines Inception v3 and artificial features. Using this method, the classification accuracy for cervical cancer cells exceeds 98%. The remainder of the paper is divided into the following four sections: Inception v3 combined with transfer learning, design of the proposed algorithm, analysis of the experimental results, and concluding remarks.
2. Inception v3 combined with transfer learning

Before explaining the framework combining the Inception v3 model and transfer learning, a brief literature review is presented.

Network structures based on Inception v3 combined with transfer learning have attracted the attention of scholars in recent years, owing to their excellent performance on a wide range of small datasets. Zhang [19] achieved a high-precision classification of five representative snake species, and Lin et al. [20] completed the classification of the German Traffic Sign Recognition Benchmark (GTSRB). Xia et al. [21] classified flowers from the Oxford-17 and Oxford-102 flower datasets, obtaining good classification results. Lee et al. [22] successfully detected dental caries in periapical X-ray films. Xu et al. [23] used the algorithm to identify geo-tagged field photos and applied it to land-cover classification. Li et al. [24] successfully achieved the classification of lymph node metastasis in colorectal cancer. Meanwhile, Mednikov et al. [25] realised an effective classification of breast masses.
2.1. Inception v3

The CNN [26–28] is a neural network designed specifically for image recognition problems. It mimics the multi-layered process by which humans recognise images: the pupils take in the pixels; certain cells in the cerebral cortex perform initial processing, finding edges and orientations; shapes are determined abstractly (such as circles and squares); and higher-level judgements are then made (such as recognising an object as a balloon). CNNs typically include five layers: input, convolutional, pooling, fully connected, and output. A general CNN model is presented in Fig. 1.
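As a concrete illustration of this five-layer structure, the following minimal Keras sketch stacks the layers in that order; the filter counts and the 299*299 input size are illustrative assumptions, not a configuration taken from this paper.

```python
from tensorflow.keras import layers, models

def build_simple_cnn(num_classes):
    """Five-layer CNN: input, convolutional, pooling, fully connected, output."""
    return models.Sequential([
        layers.Conv2D(32, 3, activation="relu",
                      input_shape=(299, 299, 3)),         # input + convolutional
        layers.MaxPooling2D(2),                           # pooling layer
        layers.Flatten(),
        layers.Dense(128, activation="relu"),             # fully connected layer
        layers.Dense(num_classes, activation="softmax"),  # output layer
    ])
```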
The GoogLeNet network is a CNN that was proposed by Google in 2014. It adopts the Inception network structure, which not only reduces the number of network parameters but also increases the network depth. Hence, it is widely used in image classification tasks. As the core of GoogLeNet is the Inception structure, the GoogLeNet network is also called the Inception network [29–33]. There are many versions of GoogLeNet, mainly Inception v1 (2014), Inception v2 (2015), Inception v3 (2015), Inception v4 (2016), and Inception-ResNet (2016).

The Inception module typically contains convolutions of three different sizes and one maximum pooling. The output of the previous layer passes through these branches in parallel, the channels are aggregated, and nonlinear fusion is then performed. In this way, the expressiveness of the network and its adaptability to different scales can be improved, and over-fitting can be prevented. Fig. 2 shows the Inception network structure.
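The module just described can be sketched with the Keras functional API as follows; the branch filter counts are assumptions for illustration, not the exact GoogLeNet settings.

```python
from tensorflow.keras import layers

def inception_block(x, f1=64, f3=128, f5=32, fp=32):
    """Parallel 1*1, 3*3 and 5*5 convolutions plus max pooling, concatenated."""
    b1 = layers.Conv2D(f1, 1, padding="same", activation="relu")(x)
    b3 = layers.Conv2D(f3, 3, padding="same", activation="relu")(x)
    b5 = layers.Conv2D(f5, 5, padding="same", activation="relu")(x)
    bp = layers.MaxPooling2D(3, strides=1, padding="same")(x)
    bp = layers.Conv2D(fp, 1, padding="same", activation="relu")(bp)
    # Aggregate the branches along the channel axis; the nonlinear fusion
    # happens in the ReLU activations of the branches above.
    return layers.Concatenate(axis=-1)([b1, b3, b5, bp])
```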
Inception v3 is provided by Keras as a network pre-trained on ImageNet; the default input image size is 299*299 with three channels. The Inception v3 network structure used in this paper is shown in Fig. 3.

Compared with the previous versions (Inception v1 and v2), the Inception v3 network structure uses a convolution kernel splitting method to divide large convolutions into smaller ones. For example, a 3*3 convolution is split into 3*1 and 1*3 convolutions. Through this splitting, the number of parameters is reduced; hence, network training can be accelerated, while spatial features can be extracted more effectively. At the same time, Inception v3 optimises the Inception module using grids of three different sizes (35*35, 17*17, and 8*8), as shown in Fig. 4.
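A quick calculation illustrates the parameter saving of this splitting for an assumed channel width C; the numbers are an example, not measurements from this paper.

```python
# Parameter count of one 3*3 convolution vs. a 3*1 followed by a 1*3
# convolution, ignoring biases; C is an assumed example channel width.
C = 64
full_3x3 = 3 * 3 * C * C                    # 36864 weights
split = 3 * 1 * C * C + 1 * 3 * C * C       # 24576 weights
print(1 - split / full_3x3)                 # -> 0.333..., one third fewer parameters
```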
2.2. Transfer learning

A major feature of deep learning is that it requires a large amount of data. If the amount of data is too small, under-fitting will occur during training. To enable deep learning training on a small dataset, scholars have proposed the concept of transfer learning. Through the learning ability of transfer learning, accurate and efficient image classification can be achieved with only a small dataset.

For an emerging task with an insufficient training set, the transfer learning method first pre-trains a well-performing model on ImageNet (1.2 million images in 1000 categories) so that it learns the features of the ImageNet dataset. In practice, the weight parameters pre-trained on ImageNet are used to initialise the newly established model; this carries the previously learned features over to the new model and ensures better results.

After reviewing the above techniques, the core idea of this paper is to build a network model based on Inception v3. First, the model is pre-trained on ImageNet; then it is fine-tuned on the cervical cancer cell dataset. In this way, good classification results are achieved on a small dataset with the proposed method.
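This recipe can be sketched in Keras roughly as follows; the seven-class head, the frozen first stage, and the elided data pipeline are assumptions for illustration, not the paper's exact training schedule.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import InceptionV3

# Load Inception v3 with ImageNet weights and no classification head.
base = InceptionV3(weights="imagenet", include_top=False,
                   input_shape=(299, 299, 3), pooling="avg")
base.trainable = False                      # first stage: reuse pre-trained features

# New head for the cervical cell classes (7 is the Herlev class count).
outputs = layers.Dense(7, activation="softmax")(base.output)
model = models.Model(base.input, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, ...)  # then fine-tune on the cell dataset
```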
3. Design of the Inception v3 combined with artificial feature extraction

In this section, the design of Inception v3 combined with artificial feature extraction is presented, to recognise cervical cells with higher accuracy than traditional recognition methods. The aim is to propose an effective framework for computer-aided diagnosis of cervical cancer.

3.1. Extraction of artificial features

Feature extraction is used to find effective features in cell images. Under normal circumstances, cancerous and normal cells differ greatly in colour and morphology. Typically, features are extracted from three aspects: colour, morphology, and texture. In this research, nine features from these aspects are selected and integrated into the deep learning model.

3.1.1. Colour features

In this research, a colour histogram is used to extract the colour features that distinguish normal from abnormal cells; it describes the proportion of each colour in the cell image. The proportion is defined as follows:

$$H(i) = \frac{n_i}{N}, \quad i = 0, 1, \ldots, L-1, \tag{1}$$

where $i$ is the grey level to which a pixel belongs, $L$ is the total number of grey levels, $n_i$ is the number of pixels at grey level $i$, and $N$ is the total number of pixels.
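Eq. (1) translates directly into a few lines of NumPy; this sketch assumes an 8-bit grey-scale cell image.

```python
import numpy as np

def colour_histogram(image, L=256):
    """Eq. (1): H(i) = n_i / N for an 8-bit grey-scale cell image."""
    counts = np.bincount(image.ravel(), minlength=L)  # n_i for each grey level i
    return counts / image.size                        # divide by N, the pixel count
```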
From this histogram, statistical colour features are computed, including the following:

(2) Variance: This indicates the degree of pixel dispersion. The larger the value, the more dispersed the distribution.

$$\sigma^2 = \sum_{i=0}^{L-1} (i - \mu)^2 H(i), \tag{3}$$

where $\mu$ is the mean grey level.

(4) Energy: This indicates the degree of uniformity of the image distribution.

$$\mu_N = \sum_{i=0}^{L-1} H(i)^2 \tag{5}$$
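The two histogram statistics above can be computed directly from the $H$ of Eq. (1), as in the following sketch.

```python
import numpy as np

def histogram_features(H):
    """Variance (Eq. (3)) and energy (Eq. (5)) of the histogram H from Eq. (1)."""
    i = np.arange(H.size)
    mu = np.sum(i * H)                        # mean grey level
    variance = np.sum((i - mu) ** 2 * H)      # Eq. (3): degree of pixel dispersion
    energy = np.sum(H ** 2)                   # Eq. (5): uniformity of the distribution
    return variance, energy
```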
3.1.2. Texture features

The Tamura texture description is more intuitive than the GLCM and has a better visual effect. The roughness feature is selected from the Tamura features, and there are three steps in calculating the roughness, as follows:

(A) Calculate the average intensity of the pixels in an active window of size $2^k \times 2^k$ in the image:

$$A_k(x, y) = \frac{1}{2^{2k}} \sum_{i=x-2^{k-1}}^{x+2^{k-1}-1} \; \sum_{j=y-2^{k-1}}^{y+2^{k-1}-1} I(i, j), \tag{7}$$

where $k = 0, 1, \ldots, 5$ and $I(i, j)$ is the grey value of the pixel at $(i, j)$.

(B) Calculate the average intensity difference between windows, where the windows cannot overlap each other in the horizontal and vertical directions:

$$E_{k,h}(x, y) = \left| A_k(x + 2^{k-1}, y) - A_k(x - 2^{k-1}, y) \right|, \qquad E_{k,v}(x, y) = \left| A_k(x, y + 2^{k-1}) - A_k(x, y - 2^{k-1}) \right| \tag{8}$$

For each pixel, it is necessary to determine the appropriate $k$ value and the optimal size $S_{\mathrm{best}}(x, y) = 2^k$ that maximise the $E$ value.

(C) Roughness is obtained by averaging $S_{\mathrm{best}}$ over the entire image, where $m \times n$ is the size of the image:

$$F_{\mathrm{crs}} = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} S_{\mathrm{best}}(i, j) \tag{9}$$
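The three steps can be sketched as follows. The wrap-around handling of border pixels via np.roll is an implementation shortcut, not something specified in this section, and $k$ runs from 1 here because the $k = 0$ window is a single pixel.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def tamura_roughness(I, kmax=5):
    """Three-step Tamura roughness of a grey-scale image, Eqs. (7)-(9)."""
    I = I.astype(np.float64)
    best_E = np.full(I.shape, -1.0)
    S_best = np.ones(I.shape)
    for k in range(1, kmax + 1):
        A = uniform_filter(I, size=2 ** k)   # Eq. (7): mean over 2^k * 2^k window
        h = 2 ** (k - 1)
        # Eq. (8): differences between opposite, non-overlapping windows
        E_h = np.abs(np.roll(A, -h, axis=1) - np.roll(A, h, axis=1))
        E_v = np.abs(np.roll(A, -h, axis=0) - np.roll(A, h, axis=0))
        E = np.maximum(E_h, E_v)
        S_best = np.where(E > best_E, 2 ** k, S_best)  # size 2^k maximising E
        best_E = np.maximum(E, best_E)
    return float(S_best.mean())              # Eq. (9): average over the m*n image
```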
3.1.3. Morphological features

The morphological features describe the size and shape of the cells. This research uses the eight-connected chain code to obtain the morphological features of the cells. The eight-connected chain code takes any pixel point P as reference, and the surrounding eight pixels are its connected pixels. Fig. 5 shows the direction relationship between the surrounding pixels and P, with Fig. 6 being a specific example.

Fig. 5. Eight-connected chain code diagram.
Fig. 6. Code representation.

The pattern observed in Fig. 6 is expressed as {2 1 2 0 6 0 6 4 5 4 2}. According to this principle, the morphological features of the cells are extracted from the cell images. Fig. 7 presents a cervical cell diagram, and the following three morphological features are selected:

(1) Area: the number of pixels occupied by the cell;
(2) Circumference: the length of one circuit around the cell boundary;
(3) Nuclear-to-cytoplasm ratio: the ratio of the nuclear area to the cell area:

$$\frac{N}{C} = \frac{\mathrm{Nuc}_{\mathrm{area}}}{\mathrm{Nuc}_{\mathrm{area}} + \mathrm{Cyto}_{\mathrm{area}}} \tag{10}$$
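A minimal sketch of these three features, assuming boolean nucleus and cytoplasm masks from an earlier segmentation step; the masks and the 4-neighbour boundary test used for the circumference are assumptions for illustration.

```python
import numpy as np

def morphological_features(nucleus, cytoplasm):
    """Area, circumference and N/C ratio from boolean nucleus/cytoplasm masks."""
    cell = nucleus | cytoplasm
    area = int(cell.sum())                    # (1) area: pixels occupied by the cell
    # (2) circumference: cell pixels with at least one 4-neighbour outside the cell
    p = np.pad(cell, 1)
    interior = p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]
    circumference = int((cell & ~interior).sum())
    # (3) Eq. (10): nuclear area over total (nuclear + cytoplasmic) area
    nc_ratio = nucleus.sum() / (nucleus.sum() + cytoplasm.sum())
    return area, circumference, nc_ratio
```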
3.2. Combining Inception v3 with artificial feature extraction

Traditional image processing methods mainly identify cells by artificially extracting useful features. However, the texture of cervical cells is complex, and the individual differences between cells are large. Therefore, the simple use of artificial feature extraction methods is not effective. Although cell features can be extracted automatically using a CNN classifier, domain knowledge of cervical cells is then lacking; the features characteristic of the different cell types are missing, making the classification effect less than ideal.

By addressing the limitations of these two classification approaches, an effective cell recognition algorithm is proposed in this paper. The proposed algorithm combines the manually extracted features with the features learned automatically by Inception v3.
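This combination can be sketched as a two-input Keras model in which the nine handcrafted features are concatenated with the Inception v3 features before the classifier; the layer sizes and the seven-class output are illustrative assumptions, not the paper's exact design.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import InceptionV3

deep = InceptionV3(weights="imagenet", include_top=False,
                   input_shape=(299, 299, 3), pooling="avg")
handcrafted = layers.Input(shape=(9,), name="artificial_features")

# Concatenate deep and handcrafted features, then classify.
merged = layers.Concatenate()([deep.output, handcrafted])
hidden = layers.Dense(256, activation="relu")(merged)
outputs = layers.Dense(7, activation="softmax")(hidden)
model = models.Model([deep.input, handcrafted], outputs)
```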
Table 1
Cell image enhancement.

Table 2
Expanded number of cells per category.

Category                     Quantity
Normal cells                 968 in total
  Superficial squamous       296
  Intermediate squamous      280
  Columnar                   392
Abnormal cells               2700 in total
  Mild lesion                728
  Moderate lesion            584
  Severe lesion              788
  Carcinoma in situ          600
Fig. 9. Two-classification accuracy change during iteration.
Fig. 10. Two-classification loss rate change during iteration.
Fig. 11. Seven-classification accuracy change during iteration.
Fig. 12. Seven-classification loss rate change during iteration.
To observe the advantages of the proposed algorithm more effectively, some of the latest algorithms are compared to observe differences in the accuracy of the experimental results. The comparison results are displayed in Fig. 14. The experimental results indicate that, compared with the other classification algorithms, the proposed algorithm has higher accuracy.
Fig. 14. Comparison of the results of the two-classification experiments (see Refs. [34–37]).
This paper classifies cervical cells based on feature extraction and deep learning algorithms, which improves the accuracy of classification and recognition. Although the deep learning framework is effective, its working principle is not yet fully understood. Furthermore, combining the mechanisms of artificial features and of the features generated by the deep network needs further study and analysis. Effective and more explainable feature combination methods will be further studied based on the particularities of cervical cell images.

CRediT authorship contribution statement

N. Dong: Conceptualization, Methodology, Supervision, Funding acquisition. L. Zhao: Writing - original draft, Data curation, Formal analysis. C.H. Wu: Investigation, Project administration, Writing - review & editing. J.F. Chang: Software, Visualization, Validation.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The authors sincerely thank the anonymous reviewers and the handling editor for their valuable comments. This work was supported by the National Natural Science Foundation of China under No. 61773282. Gratitude is extended to the Big Data Intelligence Centre of The Hang Seng University of Hong Kong for supporting the research.

References

[1] M. Wu, C. Yan, H. Liu, Q. Liu, Y. Yin, Automatic classification of cervical cancer from cytological images by using convolutional neural network, Biosci. Rep. 38 (6) (2018) BSR20181769.
[2] L. Guo, Image Recognition of Cervical Cytopathic Lesions Based on Convolutional Neural Network, Guangxi Normal University, 2017.
[3] Y. Zhao, L.B. Zeng, Q.S. Wu, Classification of cervical cell images by convolutional neural networks, J. Comput. Aided Design Comput. Graph. 30 (11) (2018) 2049–2054.
[4] Y.W. Xu, A. Hosny, R. Zeleznik, C. Parmar, T. Coroller, I. Franco, R.H. Mak, Hugo J.W.L. Aerts, Deep learning predicts lung cancer treatment response from serial medical imaging, Clin. Cancer Res. 25 (11) (2019) 3266–3275.
[5] M. Mittal, L.M. Goyal, S. Kaur, I. Kaur, A. Vermad, D.J. Hemanth, Deep learning based enhanced tumor segmentation approach for MR brain images, Appl. Soft Comput. J. 78 (2019) 346–354.
[6] O. Nunobiki, M. Sato, E. Taniguchi, W. Tang, M. Nakamura, H. Utsunomiya, Y. Nakamura, I. Mori, K. Kakudo, Color image analysis of cervical neoplasia using RGB computer color specification, Anal. Quant. Cytol. Histol. 24 (5) (2002) 289–294.
[7] R. Vijayashree, K. Ramesh Rao, A semi-automated morphometric assessment of nuclei in pap smears using ImageJ, J. Evol. Med. Dent. Sci. 4 (2015) 5363–5370.
[8] Y. Zhang, An effective white space image color space sequential segmentation method, J. Xi'an Jiaotong Univ. 32 (8) (1998) 52–56.
[9] L. Hua, Y.K. Ye, Knowledge-based early diagnosis system for lung cancer, Appl. Res. Comput. 17 (2) (2000) 90–92.
[10] X.Q. Lu, N. Li, S.F. Chen, Study on the application of morphology, color characteristics and neural network in lung cancer cell identification, J. Comput. Aided Design Comput. Graph. 13 (1) (2001) 87–92.
[11] J. Jantzen, J. Norup, G. Dounias, B. Bjerregaard, Pap-smear benchmark data for pattern classification, Nat. Insp. Smart Inf. Syst. (2005) 1–9.
[12] J. Hallinan, P. Jackway, Detection of malignancy associated changes in thionin stained cervical cells, in: Conference on Digital Image Computing and Applications, 1995, pp. 426–431.
[13] M.E. Plissiti, C. Nikou, A review of automated techniques for cervical cell image analysis and classification, in: Biomedical Imaging and Computational Modeling in Biomechanics, Springer, Netherlands, 2013, pp. 1–18.
[14] Y.F. Chen, P.C. Huang, K.C. Lin, H.H. Lin, L.E. Wang, C.C. Cheng, T.P. Chen, Y.K. Chan, J.Y. Chiang, Semi-automatic segmentation and classification of pap smear cells, IEEE J. Biomed. Health Inf. 18 (1) (2014) 94–108.
[15] R.M. Haralick, K. Shanmugam, I.H. Dinstein, Textural features for image classification, IEEE Trans. Syst. Man Cybern. (6) (1973) 610–621.
[16] R.F. Walker, P. Jackway, B. Lovell, I.D. Longstaff, Classification of cervical cell nuclei using morphological segmentation and textural feature extraction, in: Proceedings of ANZIIS '94 - Australian New Zealand Intelligent Information Systems Conference, 1994, pp. 297–301.
[17] R.F. Walker, P. Jackway, B. Lovell, Cervical cell classification via co-occurrence and Markov random field features, in: Proceedings of Digital Image Computing: Techniques and Applications, 1995, pp. 294–299.
[18] M.E. Plissiti, C. Nikou, A. Charchanti, Automated detection of cell nuclei in Pap smear images using morphological reconstruction and clustering, IEEE Trans. Inf. Technol. Biomed. 15 (2) (2011) 233–241.
[19] H.Y. Zhang, Snake image recognition based on Inception-v3 model, Electron. Technol. Softw. Eng. 10 (2019) 58–61.
[20] Y. Lin, X.Y. Zhang, Research on road traffic sign recognition based on Inception v3 model, Jiangxi Sci. 36 (5) (2018) 849–852.
[21] X.L. Xia, C. Xu, B. Nan, Inception-v3 for flower classification, in: 2017 2nd International Conference on Image, Vision and Computing, pp. 783–787.
[22] J.H. Lee, D.H. Kim, S.N. Jeong, S.H. Choi, Detection and diagnosis of dental caries using a deep learning-based convolutional neural network algorithm, J. Dent. 77 (2018) 106–111.
[23] G. Xu, X. Zhu, D. Fu, J. Dong, Automatic land cover classification of geo-tagged field photos by deep learning, Environ. Model. Softw. 91 (2017) 127–134.
[24] J. Li, P. Wang, Y.Z. Li, Y. Zhou, X.L. Liu, K. Luan, Transfer learning of pre-trained Inception-v3 model for colorectal cancer lymph node metastasis classification, in: 2018 IEEE International Conference on Mechatronics and Automation, Vol. 10, 2018, pp. 1650–1654.
[25] Y. Mednikov, S. Nehemia, B. Zheng, O. Benzaquen, D. Lederman, Transfer representation learning using Inception-v3 for the detection of masses in mammography, in: 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Honolulu, HI, USA, 2018.
[26] S. Matiz, K.E. Barner, Inductive conformal predictor for convolutional neural networks: Applications to active learning for image classification, Pattern Recognit. 90 (2019) 172–182.
[27] Y. Wang, Y.T. Chen, N.N. Yang, L.F. Zheng, N. Dey, A.S. Ashour, V. Rajinikanth, João Manuel R.S. Tavares, F.Q. Shi, Classification of mice hepatic granuloma microscopic images based on a deep convolutional neural network, Appl. Soft Comput. 74 (2019) 40–50.
[28] N. Meng, E.Y. Lam, K.K. Tsia, H.K. So, Large-scale multi-class image-based cell classification with deep learning, IEEE J. Biomed. Health Inf. 23 (5) (2019) 2091–2098.
[29] M. Abadi, A. Agarwal, et al., TensorFlow: Large-scale machine learning on heterogeneous distributed systems, 2016, CoRR abs/1603.04467.
[30] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich, Going deeper with convolutions, in: 2015 IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 1–9.
[31] S. Ioffe, C. Szegedy, Batch normalization: Accelerating deep network training by reducing internal covariate shift, in: International Conference on Machine Learning, 2015.
[32] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, Z. Wojna, Rethinking the Inception architecture for computer vision, in: 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2818–2826.
[33] C. Szegedy, S. Ioffe, V. Vanhoucke, A.A. Alemi, Inception-v4, Inception-ResNet and the impact of residual connections on learning, in: Thirty-First AAAI Conference on Artificial Intelligence, 2016.
[34] K.B. Kim, D.H. Song, Y.W. Woo, Nucleus segmentation and recognition of uterine cervical Pap-smears, in: International Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft Computing, 2007, pp. 153–160.
[35] K. Bora, M. Chowdhury, L.B. Mahanta, M.K. Kundu, A.K. Das, Automated classification of pap smear images to detect cervical dysplasia, Comput. Methods Programs Biomed. 138 (2017) 31–47.
[36] L.L. Zhao, K. Li, J.P. Yin, Q. Liu, S.Q. Wang, Complete three-phase detection framework for identifying abnormal cervical cells, IET Image Process. 11 (4) (2017) 258–265.
[37] Z.M. Yang, Y.W. Yang, B. Yang, W.B. Pang, Z.N. Tian, Multi-stream convolutional neural network classification algorithm based on the characteristics of cervical cells, J. Comput. Aided Design Comput. Graph. 31 (4) (2019) 531–540.