
Computerized Medical Imaging and Graphics 89 (2021) 101892

Hierarchical pathology screening for cervical abnormality


Ming Zhou a,1, Lichi Zhang a,1, Xiaping Du a, Xi Ouyang a, Xin Zhang a, Qijia Shen a, Dong Luo b, Xiangshan Fan c, Qian Wang a,*

a Institute for Medical Imaging Technology, School of Biomedical Engineering, Shanghai Jiao Tong University, China
b Advanced Industrial Technology Research Institute, Shanghai Jiao Tong University, China
c Department of Pathology, the Affiliated Drum Tower Hospital, Nanjing University Medical School, China

A R T I C L E  I N F O

Keywords:
Object detection
Image classification
Cervical smear screening
TCT examination

A B S T R A C T

Cervical smear screening is an imaging-based cancer detection tool of pivotal importance for early-stage diagnosis. A computer-aided screening system can automatically determine whether a scanned whole-slide image (WSI) of cervical cells should be classified as "abnormal" or "normal", and then alert pathologists. Such a system can significantly reduce the workload of human experts and is therefore in high demand in clinical practice. Most screening methods are based on automatic cervical cell detection and classification, but their accuracy is generally limited due to the high variation of cell appearance and the lack of context information from the surroundings. Here we propose a novel hierarchical framework for automatic cervical smear screening, aiming at robust case-level diagnosis as well as the localization of suspected "abnormal" cells. Our framework consists of three stages. We commence by extracting a large number of pathology images from the scanned WSIs and applying abnormal cell detection to each pathology image. Then, we feed the detected "abnormal" cells with their corresponding confidence into our novel classification model for a comprehensive analysis of the extracted pathology images. Finally, we summarize the classification outputs of all extracted images and determine the overall screening result for the target case. Experiments show that our three-stage hierarchical method can effectively suppress the errors from cell-level detection, and provides an effective and robust way of screening for cervical abnormality.

1. Introduction

Cervical cancer is the second most common cancer among adult women, and it can be effectively treated and cured if diagnosed early (Schiffman et al., 2007). On the other hand, delaying the diagnosis of cervical cancer until an advanced stage has a markedly negative impact on patient prognosis and consumes considerable healthcare resources. Currently, early screening for cervical cancer is recommended worldwide, as it is an effective approach to the prevention and management of the disease.

Nowadays, the general staining methods for cervical cancer screening are based on the Pap smear and the thin-prep cytologic test (TCT) (Koss, 1989). The TCT examination is more widely applied; it uses liquid-based thin-layer cell detection technology to examine the collected cells and determine whether they are cancerous following "The Bethesda System (TBS)" rules (Nayar and Wilbur, 2015). Doctors can also diagnose whether there are precancerous lesions and microorganism infections (such as mold, trichomoniasis, viruses, chlamydia, etc.). The TCT examination proceeds as follows. The clinician first dilates the vagina of the subject, and then uses a professional "small brush" to collect shed cells from the cervix. The collected cell specimens are preserved in a cell preservation solution and sent to the laboratory for slide preparation and staining. Finally, a professional pathologist reads the cell morphology under a microscope and produces the TBS diagnostic report.

According to the TBS rules, the cervical cells under examination are generally diagnosed into the following categories: the normal class, i.e., negative for intraepithelial malignancy (NILM); atypical squamous cells (ASC); low-grade squamous intraepithelial lesion (LSIL); high-grade squamous intraepithelial lesion (HSIL); and squamous cell carcinoma (SCC). Specifically, NILM is a "normal" conclusion, while the rest are screened as "abnormal" in different degrees.

* Corresponding author. E-mail address: [email protected] (Q. Wang).
1 Ming Zhou and Lichi Zhang contributed equally to this work and are co-first authors.

https://doi.org/10.1016/j.compmedimag.2021.101892
Received 20 August 2020; Received in revised form 6 February 2021; Accepted 24 February 2021
Available online 11 March 2021
0895-6111/© 2021 Elsevier Ltd. All rights reserved.

Fig. 1. Examples of the extracted images (with the size of 1024 × 1024 pixels) for different categories, including A for “normal” (NILM), and B-E for “abnormal” (i.e.,
ASC, LSIL, HSIL, SCC, respectively). Typical abnormal cells are zoomed-in on the right of each image (with expert annotations in red bounding boxes), along with the
results generated by cell-level detection in our framework (highlighted by blue boxes) (For interpretation of the references to colour in this figure legend, the reader is
referred to the web version of this article.).

Note that the morphology and appearance of the abnormal cervical cells are quite different from those of the normal cells. From the examples shown in Fig. 1, we can observe that the normal cells typically have small nuclei and full/round cytoplasm, while abnormal cells have irregular nuclei and wrinkled edges.

Currently, most TCT examinations are conducted manually by expert pathologists, which is labor-intensive and inevitably subjective. These issues become even more challenging in developing countries with large populations, where the demand for screening is high yet qualified pathologists are in short supply. Therefore, it is highly desirable to develop a fully automatic computer-aided screening system based on the scanned whole-slide images (WSIs). Such a system is expected to screen for cervical cancer in a high-throughput and robust way, which can ultimately reduce the mortality rate of cervical cancer.

The techniques related to automatic cervical disease screening have been extensively investigated, especially in the era of deep learning. The majority of attempts in the literature focus on cervical cell detection, segmentation, and classification. For example, Zhang et al. (2017) proposed a method named DeepPap that uses a convolutional neural network (CNN) to extract features and then uses additional datasets for fine-tuning. Zhao et al. (2019) proposed the Deformable Multipath Ensemble Model (D-MEM) for cervical cell nuclear segmentation. Zhang et al. (2019) proposed a binary-tree structured cervical cell nuclear segmentation network incorporating attention mechanisms. Hussain et al. (2020) proposed a fusion method that merges multiple neural networks. Taha et al. (2017) proposed to classify cervical cancer cells by employing a pre-trained CNN architecture as a feature extractor and an SVM as the classifier. In general, cell-level detection and classification remain challenging due to the high variation of cervical cells and the complexity of the diseases. It is particularly difficult to generalize those methods to high-throughput cervical cancer screening, as the diagnosis of even a single patient's case needs to be ensembled from thousands of cells.

In this work, we propose a novel hierarchical framework for WSI-based cervical cancer screening. Since the WSI of a case is enormous and it is infeasible to directly classify the full WSI, we first extract a certain number of "images" with fixed sizes from the WSI scan of each case. Then, for each extracted image, we adopt a cell detection technique to find the abnormal "cells", with a confidence value assigned to each abnormal candidate. Next, we refer to the locations of the detected abnormal cells and extract the corresponding image patches. By ensembling these patches, we complete image-level classification and acquire the relative confidence behind the decision. In the end, we conduct "case"-level classification to combine the image-level cues and derive the overall diagnosis for the patient. In this way, our proposed framework can effectively utilize the cell-level abnormality detection output, and further boost the robustness of case-level classification by considering image-level appearance and morphology in the vicinity of potential abnormal cells. Our screening system has high computational efficiency and robustness, making it eligible to be embedded in a fully automatic pipeline for high-throughput cervical cancer screening.

The rest of this paper is organized as follows. Section 2 introduces related works, covering current developments in object detection and image classification methods. Then, we introduce the methodology of our hierarchical cervical cancer screening system in Section 3, with its three major stages illustrated respectively. Section 4 presents the experiments that validate our method, and the conclusions are given in Section 5.

2. Related works

2.1. Object detection

We adopt object detection to identify those cells that are potentially abnormal. Object detection is one of the basic tasks in the computer vision field. Recently, many robust and efficient object detection methods have been put forward with the rapid development of deep learning. However, the task remains challenging, with considerable room for improvement.

The current object detection algorithms are mainly based on deep learning, and can be roughly divided into two categories.


Fig. 2. The overall pipeline of our hierarchical WSI cervical cancer screening framework. Firstly, in the lowest cell-level detection, we identify those suspected
abnormal cells. Then, in the image-level classification task, we determine the normal/abnormal label for each image based on detected abnormal cells and their
surrounding patches. In the highest case-level classification task, we ensemble information from different images and derive the final classification report for the case
under consideration.

(1) One-stage detection. This type of algorithm does not require independent region proposals, and can directly detect the target objects from images together with the corresponding category probabilities. Representative one-stage object detection methods are YOLO (Redmon et al., 2016), SSD (Liu et al., 2016), RetinaNet (Lin et al., 2017b), FCOS (Tian et al., 2019), etc.

(2) Two-stage detection. The algorithms in this category divide the object detection problem into two parts. The first part generates candidate regions (or region proposals), which enclose the approximate positions of the target objects. Then, in the second part, they classify and refine the detection outcomes. Typical two-stage detection methods are Fast R-CNN (Girshick, 2015), Faster-RCNN (Ren et al., 2015), Grid RCNN (Lu et al., 2019), etc.

The main performance indicators of object detection are accuracy and speed. Accuracy mainly considers the localization and classification quality of the detected objects. Generally, two-stage algorithms have advantages in terms of accuracy, but suffer from relatively long computation time. One-stage algorithms, on the other hand, have advantages in terms of speed. For our cervical cancer screening, we conduct comprehensive investigations and apply currently popular object detection methods to cell-level detection, to find the optimal solution.

2.2. Image representation and classification

Recognition and classification of target images is one of the major tasks in the computer vision domain. It is also widely demanded in the medical image analysis field, for example in computer-aided diagnosis. Feature description of the image is the core of (case) classification. Generally speaking, classification algorithms use either hand-crafted features or learned features to encode the entire image. Widely used hand-crafted features include SIFT (Lowe, 1999), HOG (Freeman and Roth, 1995), SURF (Bay et al., 2006), etc. Learned features, in most traditional methods, are low-level and obtained through shallow learning. Thus, there is still a big "semantic gap" between the high-level perception of the image and the low-level feature representations. Nowadays, deep learning technology uses sophisticated network architectures to further boost the representation capability of the learned features, which can deliver better performance than many traditional methods.

Several deep learning methods have already demonstrated their validity in applications, such as the basic convolutional neural network (CNN) (LeCun et al., 1998), the residual network (ResNet) (He et al., 2016), and the densely connected convolutional network (DenseNet) (Huang et al., 2017). The main idea of the CNN is to construct an end-to-end multi-layer network, which can integrate low-, middle- and high-level features for comprehensive learning and modeling. However, the basic CNN cannot train very deep models, which limits its performance in classification tasks. ResNet mitigates this issue by skipping blocks of convolutional layers with shortcut connections. DenseNet further extends the idea of ResNet: it connects the convolutional layers more densely than ResNet, aiming at a more robust performance. However, the above-mentioned classification methods cannot be directly adopted in the cervical cancer screening task, since the cervical cells usually occupy only a small part of the image region, which is difficult to capture.

3. Method

We propose a hierarchical framework for cervical cancer screening, considering the huge size of a WSI and the infeasibility of handling a WSI scan in one shot. The overall pipeline of our framework is presented in Fig. 2. Given a case and the corresponding WSI data, we first crop the WSI into multiple images, each of which is sized 1024 × 1024.
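As a minimal sketch of this cropping step (our illustration, not the authors' released code; the use of the openslide-python library and the simple grid sampling are assumptions), the extraction of fixed-size images from a WSI could look like:

```python
# Minimal sketch of extracting fixed-size 1024x1024 images from a WSI.
# Assumptions: openslide-python is installed and the slide format is
# supported; the exhaustive grid sampling below is only illustrative.
import openslide

def extract_images(wsi_path, size=1024, max_images=300):
    slide = openslide.OpenSlide(wsi_path)
    width, height = slide.dimensions  # full-resolution size at level 0
    images = []
    for y in range(0, height - size + 1, size):
        for x in range(0, width - size + 1, size):
            region = slide.read_region((x, y), 0, (size, size)).convert("RGB")
            images.append(((x, y), region))
            if len(images) >= max_images:
                return images
    return images
```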


Fig. 3. The framework of cell-level detection. We use RetinaNet to complete abnormal cervical cell detection.

Fig. 4. The image-level classification includes two modules: Patch Encoder Module (PEM) and Fusion Module (FM).

For each selected image, cell-level detection is conducted to find the suspicious abnormal cells. Then, based on the cell-level detection outputs, we complete the classification of the selected image. For each case, the detection/classification outcomes from the many cropped images are ensembled, such that the final case-level classification can be attained. The methodological details of the three stages are presented next.

3.1. Cell-level detection

Our image classification requires cell-level detection to provide context information, and the performance of cell-level detection is critical to the subsequent classification accuracy. As mentioned in Section 2.1, many object detection methods have been developed and widely applied in recent years, such as Faster-RCNN, RetinaNet, FCOS, Grid RCNN, SSD, etc. It is important to choose a reliable object detection solution that best fits our need of finding candidate abnormal cervical cells.

Specifically, we have adopted RetinaNet in this work, which shows the advantages of both high accuracy and efficiency compared with the alternatives in our experiments. The network architecture of RetinaNet is presented in Fig. 3. It includes a backbone network of ResNet and a Feature Pyramid Network (FPN) (Lin et al., 2017a), which extracts features and obtains pyramid representations at the same time. After obtaining the feature pyramids, two subnetworks (classification + position regression) are applied to each layer of the feature pyramids to localize and classify the detected objects. Note that the output of cell detection using RetinaNet consists of bounding boxes with their corresponding confidence scores. Here we only choose the top-k bounding boxes with the highest confidence values, which are fed into the subsequent image-level classification. Detailed comparisons of individual cell detection algorithms are presented in Section 4.1.
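The hand-off between the detector and the subsequent classifier can be sketched as follows (our illustration under assumed data structures, not the authors' code; `detections` is a hypothetical list of (box, confidence) pairs as produced by a RetinaNet-style detector, and at least one detection is assumed):

```python
# Select the top-k detections by confidence and cut out the surrounding
# 224x224 patches centered on each detected bounding box.
import numpy as np

def topk_patches(image, detections, k=5, patch=224):
    """image: HxWx3 array; detections: list of ((x1, y1, x2, y2), confidence)."""
    detections = sorted(detections, key=lambda d: d[1], reverse=True)[:k]
    h, w = image.shape[:2]
    patches, confidences = [], []
    for (x1, y1, x2, y2), conf in detections:
        cx, cy = int((x1 + x2) / 2), int((y1 + y2) / 2)
        # Clamp the patch center so the crop stays inside the image.
        x0 = min(max(cx - patch // 2, 0), w - patch)
        y0 = min(max(cy - patch // 2, 0), h - patch)
        patches.append(image[y0:y0 + patch, x0:x0 + patch])
        confidences.append(conf)
    return np.stack(patches), np.array(confidences)
```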
3.2. Image-level classification

The cell-level detection described in Section 3.1 can find the abnormal cells in the provided images, but the detection inevitably contains errors due to the high variation of cells and cell clusters. Therefore, our hierarchical framework revisits the detected cells and their surroundings in the extracted images, and conducts a classification task to further determine whether each image should be recognized as normal or abnormal, aiming at refining the cell-level detection results and providing a more robust classification estimation.

The methodology of the image-level classification stage is presented in Fig. 4. Specifically, we commence by extracting patches (sized 224 × 224) from the given image (1024 × 1024). The center locations of the patches are the same as those of the detected (abnormal) bounding boxes in cell-level detection. There are thus k patches for an image, with the i-th patch denoted as Patch_i and c_i the confidence produced by abnormal cell detection.

As shown in Fig. 4, the image-level classification can be further divided into two modules. The first is the Patch Encoder Module (PEM), which uses a backbone encoder E with shared weights to extract the feature map of each input patch. Then, the confidence provided by the cell detection network is passed in as an auxiliary context input. The confidence of the cell detection network usually correlates with classification: for example, a high-confidence cell implies that the corresponding patch has a high probability of being classified into the same category. Thus the features extracted from different patches should be given different weights. In PEM, each feature map is multiplied by the corresponding cell detection confidence to obtain the weighted feature map PEM_i of Patch_i, which is written as

PEM_i = c_i · E(Patch_i)    (1)
The second module, the Fusion Module (FM), aims to fuse the k patches to complete the image-level classification. Here we conduct element-wise addition over the feature maps of all k patches, followed by global average pooling (GAP), and derive the classification outcome through a fully connected (FC) layer. The design of FM is also shown in Fig. 4.
• In the case-level classification, we use another 237 WSI cases to train
After the image-level classification, each extracted image has its an SVM classifier and 361 cases to verify the performance. The
corresponding classification label with varying confidence. The confi­ training/testing cases are naturally separated according to the dates
dence score can represent the degree of abnormality of the current when they have been collected. To guarantee the efficiency of our
image, and the vector corresponding to all the confidence degrees of the framework, we only extract 300 pathology images from each case.
selected image set of a case can represent the degree of abnormality of Note that the images are randomly selected, with higher priority to
the current case. In the case-level classification, we collect the confi­ be chosen from the center part of WSI, to avoid choosing the pe­
dence scores of all images per case, and pass as feature vectors to train a ripheral part that usually has no or few cervical cells available.
support vector machine (SVM) classifier (Suykens and Vandewalle, • There is no data overlap across the three stages, and we obtain the
1999). Specifically, we first choose a fixed number of images from each training set and validation set divided by the patient case ID to
WSI case. The feature vector used in SVM is then instantiated as the ensure that the images of a patient will only appear in the training set
histogram of the confidence values of all selected images. The SVM or the validation set.
classifier is therefore trained to implement the case-level normal/­
abnormal classification for screening. For the comparison experiment All our evaluations are implemented by PyTorch, with a CPU of Intel
with SVM, we choose MLP (Multi-Layer Perceptron) and LASSO (Least Core i7− 4790 K and a GPU of Nvidia GTX 1080Ti.
Absolute Shrinkage and Selection Operator). Both MLP and LASSO are
common methods for simple classification tasks.
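A minimal sketch of this case-level stage (our illustration with scikit-learn; the number of histogram bins and the RBF kernel are assumptions not fixed by the text above):

```python
# Sketch: turn per-image abnormality confidences into a histogram feature
# vector and train an SVM for case-level normal/abnormal screening.
import numpy as np
from sklearn.svm import SVC

def case_feature(image_confidences, bins=10):
    """image_confidences: confidences in [0, 1] of the ~300 selected images."""
    hist, _ = np.histogram(image_confidences, bins=bins, range=(0.0, 1.0))
    return hist / max(len(image_confidences), 1)  # normalized histogram

def train_case_classifier(cases, labels):
    """cases: list of per-case confidence arrays; labels: 0 normal, 1 abnormal."""
    X = np.stack([case_feature(c) for c in cases])
    clf = SVC(kernel="rbf", probability=True)  # kernel choice is an assumption
    clf.fit(X, np.asarray(labels))
    return clf
```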
4.1. Cell-level abnormality detection
4. Experiments

In this section, we demonstrate the validity of the three stages in the proposed hierarchical screening framework. Our method mainly targets case-level screening, which needs to provide the screening result of the case, as well as the suspected abnormal cells if the case is positive. In the following experimental design, we successively verify the performance of the model on the cell-level, pathology image-level, and case-level tasks. Specifically, in Section 4.1 we compare the detection performance of five different detection methods at the cell level. In Section 4.2 we conduct an ablation study to validate our novel image-level classification model. Finally, in Section 4.3 we demonstrate our classification results at the case level.

The WSI dataset was collected, with the corresponding diagnostic results, from a large number of cases through our collaborating institutes. Specifically, the diagnostic results record whether each participant is normal or abnormal. The WSI data is scanned from the cell specimens at 20× magnification. For cell-level detection, we collaborated with a group of expert pathologists to manually mark abnormal cells with bounding boxes. Meanwhile, an image that contains at least one abnormal cell is defined as an abnormal image in the image-level classification task. The case-level classification task adopts the TBS result directly as the gold standard. Some details of the experiments are listed as follows:

• In the cell-level detection, we use 10,619 annotated images for model training and evaluation. We also investigate the following methods and compare their performance on our cell-level detection task, to choose the optimal one for our framework: Faster-RCNN, RetinaNet, FCOS, Grid RCNN, and SSD.
• In the image-level classification, we use 4,928 normal images and 1,612 abnormal images for the experiments, which do not overlap with those used in the detection task. All images are extracted to have the same size of 1024 × 1024 pixels. Examples of the extracted images are shown in Fig. 1. ResNet50, DenseNet121, and our image-level classification model are evaluated and compared on this data.
• In the case-level classification, we use another 237 WSI cases to train an SVM classifier and 361 cases to verify the performance. The training/testing cases are naturally separated according to the dates on which they were collected. To guarantee the efficiency of our framework, we extract only 300 pathology images from each case. Note that the images are randomly selected, with higher priority given to the center part of the WSI, to avoid the peripheral part that usually contains no or few cervical cells.
• There is no data overlap across the three stages, and the training and validation sets are divided by patient case ID, so that the images of a patient appear only in the training set or only in the validation set.

All our evaluations are implemented in PyTorch, with an Intel Core i7-4790K CPU and an Nvidia GTX 1080Ti GPU.

4.1. Cell-level abnormality detection

The first experiment investigates the current widely applied object detection methods and compares their performance in detecting abnormal cells. For the data, we have collected 10,619 manually annotated images from 100 patient cases. Note that there are in total 69,081 abnormal cells marked with bounding boxes in these images. In the experiments, we used 90% of these pathology images for model training, and the remaining 10% for model verification. We use average precision (AP) and runtime as the evaluation metrics. AP is computed from precision and recall, as the area under the precision-recall curve. We apply the five aforementioned models to cell-level detection, and compare the results in Table 1.

Table 1
AP and runtime comparisons of five different models for abnormal cell detection.

Method        Faster-RCNN    SSD            FCOS           Grid RCNN      RetinaNet
              (Ren et al.,   (Liu et al.,   (Tian et al.,  (Lu et al.,    (Lin et al.,
              2015)          2016)          2019)          2019)          2017b)
AP            0.706          0.661          0.697          0.689          0.715
Runtime (s)   0.170          0.292          0.125          0.163          0.128

As shown in Table 1, the RetinaNet model achieves the best AP score, while the performance of SSD is relatively poor. For runtime, FCOS has the fastest computation. By comprehensively considering both the AP and the runtime of these methods, we decided to choose RetinaNet for the abnormal cell detection task in our framework. We also show some visualization results of abnormal cell detection using RetinaNet in Fig. 1.
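Since AP is defined here as the area under the precision-recall curve, a small sketch of how it could be computed from ranked detections follows (our illustration; the exact AP protocol, e.g., the IoU threshold that makes a detection a true positive, is not spelled out in the text):

```python
# Sketch: average precision as the area under the precision-recall curve,
# given detections ranked by descending confidence and flagged as TP/FP.
import numpy as np

def average_precision(is_true_positive, num_ground_truth):
    """is_true_positive: booleans for detections sorted by confidence."""
    flags = np.asarray(is_true_positive, dtype=bool)
    tp = np.cumsum(flags)
    fp = np.cumsum(~flags)
    recall = tp / num_ground_truth
    precision = tp / (tp + fp)
    # Step-wise integration of precision over recall.
    return float(np.sum((recall[1:] - recall[:-1]) * precision[1:])
                 + recall[0] * precision[0])
```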


Table 2
The relationship between the value of k and the proportion of hit images. The recommended value is k = 5.

k value                        1      2      3      4      5      6      7
Proportion of the hit images   0.796  0.845  0.902  0.934  0.954  0.956  0.959

Table 4
Accuracy, F1-score, and Sensitivity of the different methods for image-level classification.

Method        Confidence-Based Threshold   DenseNet121   ResNet50   Proposed
Accuracy      0.694                        0.851         0.863      0.921
F1-score      0.632                        0.782         0.802      0.873
Sensitivity   0.667                        0.810         0.823      0.903

Table 3
Accuracy, F1-score, and Sensitivity for different values of k. The recommended value is k = 5.

k value       1      3      5      7
Accuracy      0.823  0.884  0.921  0.920
F1-score      0.782  0.817  0.873  0.867
Sensitivity   0.801  0.832  0.903  0.891

4.2. Validation of image-level classification

Although cell-level detection can locate suspicious abnormal cells, it cannot be directly used to infer the image-level or case-level classification results, due to the high error rate of the detection task. To this end, we use the cell detection results as auxiliary context to guide the subsequent classification tasks. Specifically, for image-level classification, we select the top-k bounding boxes with high confidence values produced by cell-level detection. The selected cells and their nearby patches, as well as their confidence values from cell detection, are used by the image-level classification.
To find the optimal value of k, we first check the inputs to image-level classification. Referring to the testing results of cell-level detection using RetinaNet in Section 4.1, we denote an image as a "hit image" if the selected patches from the image overlap with at least one ground-truth bounding box of abnormal cells. Naturally, as the number of selected patches increases, more and more images contain areas that coincide with the bounding boxes annotated by the pathologists and thus qualify as "hit images". It can be observed in Table 2 that, when the patch number reaches k = 5, the proportion of hit images among all testing images for cell-level detection generally converges; this value is therefore used in our framework.

Note that we further investigate the selection of the patch number in the actual image-level classification in Table 3, where we use 4,928 normal images and 1,612 abnormal images for the classification experiments. These images come from 100 normal cases and 80 abnormal cases, which are different from the dataset used for abnormal cell detection. The results also indicate that the patch number k = 5 provides the best classification results in terms of accuracy, F1-score and sensitivity.
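The "hit image" criterion can be made concrete with a small sketch (our illustration; the overlap test between a selected patch and a ground-truth box is assumed to be simple rectangle intersection):

```python
# Sketch: an image is a "hit image" if any of its selected top-k patches
# overlaps at least one ground-truth bounding box of an abnormal cell.
def rects_overlap(a, b):
    """a, b: rectangles as (x1, y1, x2, y2)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def is_hit_image(patch_boxes, ground_truth_boxes):
    return any(rects_overlap(p, g)
               for p in patch_boxes for g in ground_truth_boxes)

def hit_proportion(images):
    """images: list of (patch_boxes, ground_truth_boxes) per abnormal image."""
    hits = sum(is_hit_image(p, g) for p, g in images)
    return hits / len(images)
```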
Next, we investigate the effectiveness of the PEM module with a set of ablation experiments. In particular, we compare with the variant in which the weight coefficients of all patches are set to 1, which yields an accuracy of 0.852. Compared with this model that ignores the confidences generated by the detection network, the model that uses these confidences as fusion weights improves the accuracy of the classification task by 0.069. This proves that the confidences generated by the detection network are important context cues for the pathology image classification task.
Besides, to verify the effectiveness of the proposed method, we also feed the pathology images under study into different backbone networks (e.g., ResNet50, DenseNet121); in these experiments, we do not first use the detection model to generate abnormal patches. At the same time, to prove that the proposed model is more effective than a direct classification model, we use the confidence of the object detection network as a baseline for pathology image classification: after detection, if a patch with a confidence greater than a threshold appears in an image, we treat this image as an abnormal image. Here, we set the threshold to 0.4. In this way, we can also classify pathology images; we simply use "Confidence-Based Threshold" to denote this method. The experimental results addressed above are shown in Table 4, where we use accuracy, F1-score and sensitivity to characterize the performance of the different methods. We also show the ROC curves of these methods in Fig. 5, which further visualizes the differences in their classification performance.

As can be seen from Table 4, just using the detection results for the classification task is relatively insufficient; this also reveals that the detection model produces many failure cases. However, the information from the detection can be integrated into other methods as context knowledge. The proposed two-stage method then achieves better performance than the one-stage methods, such as ResNet and DenseNet, with about a 7% improvement in accuracy. We also compare the ROC curves of the three classification methods in Fig. 5. It can be seen that our proposed method outperforms the other two, which further demonstrates its effectiveness and superiority.

Fig. 5. The ROC curve of ResNet, DenseNet, and the proposed method.
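The "Confidence-Based Threshold" baseline amounts to a one-line decision rule; as a sketch (our illustration of the rule as described, with the 0.4 threshold taken from the text):

```python
# Sketch of the "Confidence-Based Threshold" baseline: an image is called
# abnormal if any detected patch has confidence above the threshold (0.4).
def confidence_based_threshold(patch_confidences, threshold=0.4):
    return int(any(c > threshold for c in patch_confidences))  # 1 = abnormal
```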


4.3. The experiment of case-level classification

In the case-level classification, we use 129 normal WSI cases and 108 abnormal ones. Each image of these cases first goes through the cell detection and image classification modules, and the information thus obtained is collected for the subsequent training. As mentioned above, we use the feature vectors of the cases as input and pass them to the SVM to obtain a classification result. Specifically, after screening out the blank images, we randomly select 300 pathology images as representative images, to obtain a feature distribution that characterizes the degree of abnormality of the case. Given the feature distribution of the case, we use an SVM to determine whether the screened case is abnormal. Correspondingly, the comparative experimental design is as follows. We simply use the image-level classification results as a baseline to determine the case category: we count the outputs of the image classification model over the 300 images of the current case, and if the proportion of images determined to be abnormal exceeds a certain threshold (5% is selected here), we determine the classification result of the current case to be abnormal. We use "Abnormal Image Ratio" to denote this method. Comparing the different classification methods, the results are shown in Table 5 and Fig. 6; we also include the results of MLP and LASSO.

Table 5
Accuracy, F1-score, and Sensitivity of the different methods for case-level classification.

Method        Abnormal Image Ratio   MLP     LASSO   Proposed Method
Accuracy      0.856                  0.814   0.879   0.905
F1-score      0.801                  0.740   0.754   0.867
Sensitivity   0.834                  0.703   0.744   0.891

Fig. 6. The ROC curve of different methods.

As observed from the table and figure above, our SVM method brings a significant improvement over directly using the image classification results for case diagnosis. With only a simple rule (such as the proportion of abnormal images described above), the errors of image classification cannot be attenuated; on the contrary, they accumulate and reduce the accuracy of the case diagnosis. Among the classification methods that use the same feature vectors as input, MLP and LASSO perform worse than the SVM, and in the ROC curves the area enclosed by the SVM is the largest. Therefore, the proposed SVM classification model over the aggregated image information can reduce the classification errors of the front-end networks and achieve the goal of integrating useful information to assist diagnosis.
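The "Abnormal Image Ratio" baseline can be sketched as follows (our illustration; the 5% threshold and the 300-image sample follow the text above):

```python
# Sketch of the "Abnormal Image Ratio" baseline: a case is abnormal when
# more than 5% of its representative images are classified as abnormal.
def abnormal_image_ratio(image_labels, ratio_threshold=0.05):
    """image_labels: 0/1 image-level predictions for one case."""
    ratio = sum(image_labels) / len(image_labels)
    return int(ratio > ratio_threshold)  # 1 = abnormal case
```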


5. Conclusion

In this paper, we propose a novel framework for cervical cancer screening based on the provided WSIs, which consists of three steps. The first step is to locate the "abnormal" cervical cells and the corresponding confidence information using an object detection technique. The second step is to utilize the output of cell detection for patch extraction, and the patches are further fused in a novel pathology image classification network to obtain the image-level estimation. The third step is to fuse the outputs of the image-level classification to make the final diagnosis at the case level. Experiments show that our method achieves better accuracy than directly using object detection or applying a classification network directly on the original image. Our main goal is to provide a more robust solution to the cervical cancer screening process, and to address the current limitations of cell-level detection and classification. For example, it would be interesting to further investigate the recent developments in GCN-based classification methods (Bai et al., 2020a, b), which could be incorporated into our classification framework. Future work includes a further investigation of our method's validity as more pathology images become available. Besides, we also plan to integrate the multiple stages into a single model whose overall training process is end-to-end.

CRediT authorship contribution statement

Ming Zhou: Resources, Methodology, Software, Formal analysis, Validation, Visualization, Writing - original draft. Lichi Zhang: Resources, Conceptualization, Validation, Supervision, Writing - review & editing. Xiaping Du: Resources, Methodology, Software, Validation. Xi Ouyang: Methodology, Validation. Xin Zhang: Resources, Methodology, Software. Qijia Shen: Methodology, Validation. Dong Luo: Funding acquisition, Investigation. Xiangshan Fan: Conceptualization, Resources, Supervision. Qian Wang: Conceptualization, Investigation, Supervision, Project administration, Funding acquisition, Writing - review & editing.

Declaration of Competing Interest

The authors report no declarations of interest.

Acknowledgements

This research was supported by grants from the National Natural Science Foundation of China (62001292), the Shanghai Pujiang Program (19PJ1406800), the Nanjing Medical Science and Technique Development Foundation (YKK19066) and the Interdisciplinary Program of Shanghai Jiao Tong University.

References

Bai, L., Cui, L., Jiao, Y., Rossi, L., Hancock, E., 2020a. Learning backtrackless aligned-spatial graph convolutional networks for graph classification. IEEE Trans. Pattern Anal. Mach. Intell. https://doi.org/10.1109/TPAMI.2020.3011866.
Bai, L., Cui, L., Zhang, Z., Xu, L., Wang, Y., Hancock, E.R., 2020b. Entropic dynamic time warping kernels for co-evolving financial time series analysis. IEEE Trans. Neural Netw. Learn. Syst. https://doi.org/10.1109/TNNLS.2020.3006738.
Bay, H., Tuytelaars, T., Van Gool, L., 2006. SURF: speeded up robust features. In: European Conference on Computer Vision. Springer, pp. 404–417.
Freeman, W.T., Roth, M., 1995. Orientation histograms for hand gesture recognition. International Workshop on Automatic Face and Gesture Recognition, pp. 296–301.
Girshick, R., 2015. Fast R-CNN. Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448.
He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778.
Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q., 2017. Densely connected convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708.
Hussain, E., Mahanta, L.B., Das, C.R., Talukdar, R.K., 2020. A comprehensive study on the multi-class cervical cancer diagnostic prediction on pap smear images using a fusion-based decision from ensemble deep convolutional neural network. Tissue Cell 65, 101347.
Koss, L.G., 1989. The Papanicolaou test for cervical cancer detection: a triumph and a tragedy. JAMA 261, 737–743.
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P., 1998. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324.
Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S., 2017a. Feature pyramid networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2117–2125.
Lin, T.-Y., Goyal, P., Girshick, R., He, K., Dollár, P., 2017b. Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988.
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., Berg, A.C., 2016. SSD: single shot multibox detector. In: European Conference on Computer Vision. Springer, pp. 21–37.
Lowe, D.G., 1999. Object recognition from local scale-invariant features. In: Proceedings of the Seventh IEEE International Conference on Computer Vision. IEEE, pp. 1150–1157.
Lu, X., Li, B., Yue, Y., Li, Q., Yan, J., 2019. Grid R-CNN. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7363–7372.
Nayar, R., Wilbur, D.C., 2015. The Bethesda System for Reporting Cervical Cytology: Definitions, Criteria, and Explanatory Notes. Springer.
Redmon, J., Divvala, S., Girshick, R., Farhadi, A., 2016. You only look once: unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788.
Ren, S., He, K., Girshick, R., Sun, J., 2015. Faster R-CNN: towards real-time object detection with region proposal networks. Advances in Neural Information Processing Systems, pp. 91–99.
Schiffman, M., Castle, P.E., Jeronimo, J., Rodriguez, A.C., Wacholder, S., 2007. Human papillomavirus and cervical cancer. Lancet 370, 890–907.
Suykens, J.A., Vandewalle, J., 1999. Least squares support vector machine classifiers. Neural Process. Lett. 9, 293–300.

Taha, B., Dias, J., Werghi, N., 2017. Classification of cervical-cancer using pap-smear images: a convolutional neural network approach. In: Annual Conference on Medical Image Understanding and Analysis. Springer, pp. 261–272.
Tian, Z., Shen, C., Chen, H., He, T., 2019. FCOS: fully convolutional one-stage object detection. Proceedings of the IEEE International Conference on Computer Vision, pp. 9627–9636.
Zhang, L., Lu, L., Nogues, I., Summers, R.M., Liu, S., Yao, J., 2017. DeepPap: deep convolutional networks for cervical cell classification. IEEE J. Biomed. Health Inform. 21, 1633–1643.
Zhang, J., Liu, Z., Du, B., He, J., Li, G., Chen, D., 2019. Binary tree-like network with two-path fusion attention feature for cervical cell nucleus segmentation. Comput. Biol. Med. 108, 223–233.
Zhao, J., Li, Q., Li, X., Li, H., Zhang, L., 2019. Automated segmentation of cervical nuclei in pap smear images using deformable multi-path ensemble model. In: 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019). IEEE, pp. 1514–1518.
