Ensemble U-Net Model For Efficient Polyp Segmentation
ABSTRACT
This paper presents our approach developed for the Medico automatic polyp segmentation challenge 2020¹. We used a U-Net model with two different encoder backbones: ResNet-34 and EfficientNet-B2. The two models were trained separately, using Tversky loss, with a view to ensembling. We performed CutMix and standard augmentations for data pre-processing. For ensembling, we chose the hyperparameter of the loss function in a range that gives the individual models high recall while relaxing precision.
We evaluated the individual models and the ensemble model on validation data. The ResNet-34 backbone model and the ensemble model were submitted to the challenge website for further evaluation on the test data. Our ensemble model improved performance on all segmentation metrics compared to the single networks, except for recall and F2 score.

1 INTRODUCTION
Colorectal cancer is one of the leading causes of death worldwide. Colonoscopy is preferred for detecting and removing colorectal polyps, which are the precursors of colorectal cancer (CRC) [3]. Polyps generally occur as a protrusion of the mucosa that looks like a bumpy structure. However, wide variation in the shape, size, and intensity of polyps, together with specular reflections in colonoscopy images, can make polyps very difficult for endoscopists to detect. Missed polyps can have a severe impact on CRC patients and often contribute to the higher mortality rate in CRC [3]. In recent years, several computer-aided polyp detection and segmentation methods have been developed [2, 4, 7]. While detection methods report the image-level presence or absence of polyps or locate them with a rectangular box, semantic segmentation provides pixel-wise classification targeting finer polyp boundaries. In this paper, we focus on semantic segmentation for automated delineation of polyps.

2 RELATED WORK
State-of-the-art polyp segmentation methods use Convolutional Neural Networks (CNNs). Akbari et al. [1] used an FCN-8S [15] network to obtain regions of probable polyps, followed by Otsu thresholding and selection of the largest connected component to segment polyp regions, resulting in 81% accuracy on the CVC-ColonDB database². Sánchez-González et al. [16] first proposed a polyp detection system that uses texture to find potential polyp windows, which were further segmented to produce masks of polyp location and extent. Kang et al. [10] used a transfer learning-based ensemble method. They ensembled Mask R-CNN [5] models, one with a ResNet-50 backbone and another with ResNet-101 [6], and then performed a bitwise combination of the two predicted masks. CNN-based polyp segmentation methods inevitably carry uncertainty in their predictions. Wickstrøm et al. [18] studied uncertainty estimation and model interpretability for the polyp segmentation task. They also advanced two methods: FCN-8 [15], by adding batch normalization after each layer, and SegNet, by including dropout. Their best-performing method on the EndoScene dataset used a Monte Carlo Dropout model and had far fewer parameters.

¹ https://multimediaeval.github.io/editions/2020/tasks/medico/
² http://mv.cvc.uab.es/projects/colon-qa/cvccolondb/

Copyright held by the owner/author(s).
MediaEval’20, December 14-15 2020, Online

3 DATASET
We use the publicly available Kvasir-SEG dataset [9], which consists of 1000 gastrointestinal polyp images and corresponding manually annotated segmentation masks verified by an experienced gastroenterologist. Sample images from this dataset are shown in Figure 1. We randomly split the dataset into training and validation sets (88% and 12%), resulting in 880 training images and 120 validation images. The organisers of the challenge [8] provided 160 test images, for which no ground truth masks were provided.

Figure 1: Original colonoscopy images and corresponding ground truth polyp masks for the Kvasir-SEG training dataset [9]

4 METHOD
An encoder-decoder architecture with transfer learning was used to compute the predicted masks on the provided polyp dataset [9]. In addition, we exploited different data augmentation techniques and used the Tversky loss function [14] to tune the precision and recall of the individual models for effective ensembling.

4.1 Encoder-decoder architecture
The encoder-decoder architecture is one of the most widely used architectures for medical image segmentation. The encoder takes the input and downscales it by computing feature representations at various resolution scales, and outputs feature maps that hold encoded information about the input image. In the decoder part, these feature maps are upsampled and restored to the full segmentation
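
The resolution changes described in Section 4.1 can be illustrated with a toy sketch in plain Python. It is only an assumption-laden illustration, not the U-Net used in this work: the learned convolutions and skip connections of a real U-Net [13] are replaced by fixed 2x2 average pooling and nearest-neighbour upsampling, and the function names (`downsample`, `upsample`, `encode`, `decode`) are hypothetical.

```python
def downsample(fmap):
    """Halve height and width with 2x2 average pooling (even sizes assumed)."""
    h, w = len(fmap), len(fmap[0])
    return [[(fmap[i][j] + fmap[i][j + 1]
              + fmap[i + 1][j] + fmap[i + 1][j + 1]) / 4.0
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]


def upsample(fmap):
    """Double height and width by repeating each value (nearest neighbour)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def encode(image, depth=2):
    """Encoder stand-in: feature maps at successively coarser resolutions."""
    maps = [image]
    for _ in range(depth):
        maps.append(downsample(maps[-1]))
    return maps


def decode(maps):
    """Decoder stand-in: bring the coarsest map back to the input resolution."""
    x = maps[-1]
    for _ in range(len(maps) - 1):
        x = upsample(x)
    return x


# Toy 8x8 "image"; a real input would be a colonoscopy frame.
image = [[float((i + j) % 2) for j in range(8)] for i in range(8)]
maps = encode(image, depth=2)
restored = decode(maps)
shapes = [(len(m), len(m[0])) for m in maps]
```

Here `shapes` lists the resolution at each encoder level, and `restored` has the same height and width as `image`, which is the property a segmentation decoder needs in order to emit one prediction per input pixel.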
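
The precision-recall trade-off that Section 4 attributes to the Tversky loss [14] can be sketched as follows. This is a minimal sketch over flattened masks in plain Python, following the general form TP / (TP + alpha * FP + beta * FN) from [14]; the function names and the values alpha = 0.3 and beta = 0.7 are illustrative assumptions, not the hyperparameters chosen in this work.

```python
def tversky_index(pred, target, alpha=0.3, beta=0.7, eps=1e-7):
    """Tversky index for flat lists of predicted probabilities and 0/1 labels.

    alpha weights false positives and beta weights false negatives;
    alpha = beta = 0.5 reduces to the Dice coefficient.
    """
    tp = sum(p * t for p, t in zip(pred, target))
    fp = sum(p * (1 - t) for p, t in zip(pred, target))
    fn = sum((1 - p) * t for p, t in zip(pred, target))
    return (tp + eps) / (tp + alpha * fp + beta * fn + eps)


def tversky_loss(pred, target, alpha=0.3, beta=0.7):
    """Loss to minimise: one minus the Tversky index."""
    return 1.0 - tversky_index(pred, target, alpha, beta)


# Toy example: three polyp pixels followed by three background pixels.
target = [1, 1, 1, 0, 0, 0]
over_segmented = [1, 1, 1, 1, 0, 0]   # one false positive
under_segmented = [1, 1, 0, 0, 0, 0]  # one false negative

loss_over = tversky_loss(over_segmented, target)
loss_under = tversky_loss(under_segmented, target)
```

With beta larger than alpha, a missed polyp pixel costs more than a spurious one, so `loss_under` exceeds `loss_over`; training against such a loss pushes a model towards higher recall at the expense of precision.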
MediaEval’20, December 14-15 2020, Online S. Shrestha et al.
REFERENCES
[1] Mojtaba Akbari, Majid Mohrekesh, Ebrahim Nasr-Esfahani, S. M. Reza Soroushmehr, Nader Karimi, Shadrokh Samavi, and Kayvan Najarian. 2018. Polyp Segmentation in Colonoscopy Images Using Fully Convolutional Network. (2018). arXiv:eess.IV/1802.00368
[2] Sharib Ali, Mariia Dmitrieva, Noha M. Ghatwary, Sophia Bano, Gorkem Polat, Alptekin Temizel, and others. 2020. A translational pathway of deep learning methods in GastroIntestinal Endoscopy. CoRR abs/2010.06034 (2020). https://arxiv.org/abs/2010.06034
[3] Melina Arnold, Mónica S Sierra, Mathieu Laversanne, Isabelle Soerjomataram, Ahmedin Jemal, and Freddie Bray. 2017. Global patterns and trends in colorectal cancer incidence and mortality. Gut 66, 4 (2017), 683–691. https://doi.org/10.1136/gutjnl-2015-310912
[4] Jorge Bernal and others. 2018. Polyp detection benchmark in colonoscopy videos using gtcreator: A novel fully configurable tool for easy and fast annotation of image databases. In Proc. Comput. Assist. Radiol. Surg. (CARS).
[5] K. He, G. Gkioxari, P. Dollár, and R. Girshick. 2017. Mask R-CNN. In 2017 IEEE International Conference on Computer Vision (ICCV). 2980–2988. https://doi.org/10.1109/ICCV.2017.322
[6] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2015. Deep Residual Learning for Image Recognition. (2015). arXiv:cs.CV/1512.03385
[7] Debesh Jha, Sharib Ali, Håvard D. Johansen, Dag D. Johansen, Jens Rittscher, Michael A. Riegler, and Pål Halvorsen. 2020. Real-Time Polyp Detection, Localisation and Segmentation in Colonoscopy Using Deep Learning. CoRR abs/2011.07631 (2020). https://arxiv.org/abs/2011.07631
[8] Debesh Jha, Steven A. Hicks, Krister Emanuelsen, Håvard D. Johansen, Dag Johansen, Thomas de Lange, Michael A. Riegler, and Pål Halvorsen. 2020. Medico Multimedia Task at MediaEval 2020: Automatic Polyp Segmentation. In Proc. of MediaEval 2020 CEUR Workshop.
[9] Debesh Jha, Pia H. Smedsrud, Michael A. Riegler, Pål Halvorsen, Thomas de Lange, Dag Johansen, and Håvard D. Johansen. 2019. Kvasir-SEG: A Segmented Polyp Dataset. (2019). arXiv:eess.IV/1911.07069
[10] J. Kang and J. Gwak. 2019. Ensemble of Instance Segmentation Models for Polyp Segmentation in Colonoscopy Images. IEEE Access 7 (2019), 26440–26447. https://doi.org/10.1109/ACCESS.2019.2900672
[11] Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, Yoshua Bengio and Yann LeCun (Eds.). http://arxiv.org/abs/1412.6980
[12] Tianyu Ma, Hang Zhang, Hanley Ong, Amar Vora, Thanh D. Nguyen, Ajay Gupta, Yi Wang, and Mert Sabuncu. 2020. Ensembling Low Precision Models for Binary Biomedical Image Segmentation. (2020). arXiv:eess.IV/2010.08648
[13] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. 2015. U-Net: Convolutional Networks for Biomedical Image Segmentation. (2015). arXiv:cs.CV/1505.04597
[14] Seyed Sadegh Mohseni Salehi, Deniz Erdogmus, and Ali Gholipour. 2017. Tversky loss function for image segmentation using 3D fully convolutional deep networks. (2017). arXiv:cs.CV/1706.05721
[15] E. Shelhamer, J. Long, and T. Darrell. 2017. Fully Convolutional Networks for Semantic Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 39, 4 (2017), 640–651. https://doi.org/10.1109/TPAMI.2016.2572683
[16] A. Sánchez-González, B. Garcia-Zapirain, D. Sierra-Sosa, and A. Elmaghraby. 2018. Colon Polyp Segmentation Using Texture Analysis. In 2018 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT). 579–588. https://doi.org/10.1109/ISSPIT.2018.8642748
[17] Mingxing Tan and Quoc V. Le. 2020. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. (2020). arXiv:cs.LG/1905.11946
[18] K. Wickstrøm, M. Kampffmeyer, and R. Jenssen. 2018. Uncertainty Modeling and Interpretability in Convolutional Neural Networks for Polyp Segmentation. In 2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP). 1–6. https://doi.org/10.1109/MLSP.2018.8516998
[19] Sangdoo Yun, Dongyoon Han, Seong Joon Oh, Sanghyuk Chun, Junsuk Choe, and Youngjoon Yoo. 2019. CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features. (2019). arXiv:cs.CV/1905.04899