Progress Report
CANDIDATE’S DECLARATION
I/We hereby certify that the work which is being presented in the project progress report
entitled “Cervical Intraepithelial Neoplasia Detection using case based Reasoning
technique” in partial fulfillment of the requirements for the award of the Degree of Bachelor
of Technology in Computer Science and Engineering (Core) in the Department of Computer
Science and Engineering of the Graphic Era (Deemed to be University), Dehradun shall be
carried out by the undersigned under the supervision of Akanksha Kapruwan, Asst.
Professor, Department of Computer Science and Engineering, Graphic Era (Deemed to be
University), Dehradun.
The above-mentioned students shall be working under the supervision of the undersigned on
the “Cervical Intraepithelial Neoplasia Detection using case based Reasoning technique”.
Examination
The following sections give a brief introduction and the problem statement for the work.
1.1 Introduction
Cervical cancer (CC) is one of the most commonly diagnosed gynecological cancers
worldwide (1). CC develops from an initial premalignant state called cervical
intraepithelial neoplasia (CIN), which is graded according to the extent of dysplastic
abnormalities in the epithelial cells of the cervix. Three stages of pre-malignancy are
defined: CIN-I [low-grade intraepithelial lesion (LSIL)], and CIN-II and CIN-III
[high-grade intraepithelial lesion (HSIL)] (2, 3). Persistent infection with high-risk
human papillomavirus (HPV) is considered necessary for the development of CIN; however,
the majority of women clear the infection (2, 3). Nevertheless, a fraction of women develop
CIN that can progress to CC if not detected and treated (2, 3). The prolonged period
needed for progression from carcinogenic HPV infection to precancerous CIN to cancer
allows these lesions to be detected and treated and mortality from cancer to be
dramatically reduced (4). However, concerns have been raised about the low accuracy of the
Papanicolaou test and the financial burden of cytology-based screening (5). Additionally,
overtreatment of CIN remains a matter of considerable discussion (6). Thus, markers for
estimating the progression of CIN may be valuable for optimizing cervical
screening and treatment.
The input to our classifiers is a photograph of the cervix taken through a vaginal
speculum. The output is the probability distribution over the three classes, from which
we extract the most likely class.
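The step from a probability distribution to a single predicted class can be sketched as follows. This is a minimal illustration, not the project's classifier: the class labels, the raw scores, and the helper names `softmax` and `most_likely_class` are assumptions introduced here.

```python
import math

# Hypothetical labels: the report only states that there are three classes.
CLASS_NAMES = ("CIN1", "CIN2", "CIN3")


def softmax(scores):
    """Turn raw classifier scores into a probability distribution."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def most_likely_class(probabilities, class_names=CLASS_NAMES):
    """Return the label whose probability is highest."""
    best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[best_index]


# Made-up scores for a single image, for illustration only.
probs = softmax([0.2, 1.5, -0.3])
label = most_likely_class(probs)
```

Keeping the full distribution, rather than only the top label, lets a clinician see how confident the model is before acting on the prediction.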
A convolutional neural network (CNN), also known as a ConvNet, is a deep neural
network (DNN) widely used for image recognition, image processing and NLP. A
CNN has input and output layers and multiple hidden layers, many of which are
convolutional. In a way, CNNs are regularized multilayer perceptrons.
Cervical dysplasia usually causes no symptoms and is most often discovered by a routine
Pap test. Mild cervical dysplasia sometimes resolves without treatment and may only
require careful observation with follow-up Pap tests. But moderate to severe cervical
dysplasia usually requires treatment to remove the abnormal cells and reduce the risk of
cervical cancer. Sometimes, mild dysplasia that has persisted longer than two years may
be treated, as well.
• Manual analysis of cervical images is tedious, laborious and error-prone.
We intend to create a system that can aid doctors in classifying the stage of cervical cancer
and, in turn, help women in rural India get the cervical cancer screening that could
potentially save lives, and create awareness regarding menstrual health, using case-based
reasoning.
Chapter 2
Objectives
The objectives of the proposed work are as follows:
Chapter 3
1. Preliminary study
Cervical cancer is the fourth most common cancer in women. Accurate and timely
detection can save lives. Automatic and reliable cervical cancer detection methods can be
devised through accurate segmentation and classification of colposcopy images into stages
such as normal, CIN1, CIN2/3 and cancer.
In this project we identify the correlations between the parameters that are likely to be
responsible for cervical cancer and classify images into the various stages of cervical cancer
using a convolutional neural network (CNN).
Fig. 2. Four cases of colposcopy data, which are Normal, CIN1, CIN2/3, and Cervical Cancer respectively
1.1 Methods of CIN Classification
We combine natural language processing (NLP), ontology learning and artificial
intelligence techniques to classify stages of CIN. The main goal of this study is to apply
NLP to the ontology learning and population task, and then to apply an improved case-based
reasoning (CBR) technique to the problem of classifying cases of CIN into stages, even when
the disease is still at an early stage of manifestation in the presented case. An NLP model
for extracting features from the presented case was designed and implemented. The
originality of the current study lies in the robustness and efficiency of the
sentence-level extraction of feature-value pairs for all a priori declared features.
Furthermore, the case retrieval similarity metric applied in the proposed NLP-based CBR
framework contributes to the performance of the proposed system.
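To make sentence-level extraction of feature-value pairs concrete, the sketch below uses a simple rule-based pattern as a stand-in for the NLP model, whose details the report does not give. The declared features, the "feature is value" or "feature: value" phrasing, and the function names are all assumptions made for illustration.

```python
import re

# Hypothetical a priori declared features; the report does not list them.
DECLARED_FEATURES = ("age", "hpv status", "lesion size")


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_pairs(text, features=DECLARED_FEATURES):
    """Return {feature: value} for each declared feature found in a sentence.

    The value is taken to be the text after 'is' or ':' up to the next
    comma, semicolon or full stop. This pattern is an illustration only.
    """
    pairs = {}
    for sentence in split_sentences(text):
        for feature in features:
            pattern = re.escape(feature) + r"\s*(?:is|:)\s*([^,;.]+)"
            match = re.search(pattern, sentence, flags=re.IGNORECASE)
            if match and feature not in pairs:
                pairs[feature] = match.group(1).strip()
    return pairs


# Invented case description, not real patient data.
case_text = "Age is 34, HPV status is positive. Lesion size: 4 mm."
pairs = extract_pairs(case_text)
```

A pattern this simple would miss paraphrases and values containing a full stop, which is one reason an ontology-backed NLP model may be preferred in practice.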
CBR is an artificial intelligence paradigm that has proven useful in medical
systems. It exploits the similarity of cases in its knowledge base to provide a solution
to a new case or problem. Cases closely related to the new case are usually
retrieved using similarity models such as Euclidean distance, which have
been adopted by different researchers. However, all CBR systems face the challenge of
feature extraction and formalization. Furthermore, choosing the best distance
measure for computing the similarity of cases is a problem that demands an optimal solution,
given the sensitivity of medical cases. Case-based reasoning means using old experiences to
understand and solve new problems. In case-based reasoning, a reasoner remembers a
previous situation similar to the current one and uses that to solve the new problem [38].
CBR and expert systems have a long tradition in artificial intelligence. CBR was first
formulated in the late 1970s. It is an approach to problem solving and learning for both
humans and computers [39]. Case-based reasoning is useful in problem solving and in the
automation of learning by an agent. Empirical evidence has shown that reasoning by
reusing past cases is a powerful and frequently applied way for humans to solve
problems. An essential feature of case-based
reasoning is its coupling to learning and its strong association with machine learning [40].
Ben-Bassat et al. [41] enumerated some features of CBR, including that cases
presenting similar symptoms and findings result from the same fault or disease, and that a
“Nearest Neighbor” algorithm is used to identify an unknown diagnosis from the known ones.
Moreover, CBR avoids the knowledge acquisition bottleneck of rule-based reasoning (RBR),
compiles past solutions, mimics the diagnostic experience of human experts, avoids past
mistakes, interprets rules, supplements weak domain models, facilitates explanation,
supports knowledge acquisition and learning, and exploits the database of solved problems
in order to learn.
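The retrieval step described above, finding the stored case nearest to a new case under Euclidean distance, can be sketched as follows. The case base, its three numeric features and the function names are invented for illustration; the project's improved similarity metric is not specified in this report, so plain Euclidean distance is used here.

```python
import math

# Hypothetical case base: (feature vector, known diagnosis).
# The feature values are made up and carry no clinical meaning.
CASE_BASE = [
    ((0.1, 0.2, 0.0), "normal"),
    ((0.4, 0.5, 0.3), "CIN1"),
    ((0.8, 0.9, 0.7), "CIN2/3"),
]


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve_nearest(new_case, case_base=CASE_BASE):
    """Retrieve: return (distance, stored features, diagnosis) of the closest case."""
    return min(
        (
            (euclidean_distance(new_case, features), features, diagnosis)
            for features, diagnosis in case_base
        ),
        key=lambda item: item[0],
    )


def retain(new_case, diagnosis, case_base=CASE_BASE):
    """Retain: store a solved case so that later retrievals can reuse it."""
    case_base.append((tuple(new_case), diagnosis))


distance, matched, diagnosis = retrieve_nearest((0.45, 0.5, 0.35))
```

Because Euclidean distance treats every feature equally, features on larger numeric scales dominate the result unless the vectors are normalized first; this is one reason the choice of distance measure matters for medical cases.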
1. Design of an ontology learning algorithm for feature extraction and mapping from
2. Dataset preparation
A dataset is critical for developing and validating deep learning systems for cervical cancer
screening. We use a multistate colposcopy image (MSCI) dataset. Classic feature extractors and
classifiers are used for CIN grading. The proposed MSCI dataset consists of colposcopy
images of different grades of precancerous lesions (normal, CIN1, and CIN2/3) and cancer,
and the images of each grade include three states: acetic acid reaction, green filter, and iodine
test. Fig. 4 shows the dataset used for classification based on the colour of the colposcopy
images.
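One way to organize such a dataset for later training and testing is a manifest of (path, grade, state) records, split so that every grade appears in both sets. The sketch below assumes a made-up manifest, hypothetical file names, and a 20% test fraction; none of these come from the MSCI dataset itself.

```python
import random
from collections import defaultdict

# Grades and states as named in the text; the identifiers are our own spelling.
GRADES = ("normal", "CIN1", "CIN2/3", "cancer")
STATES = ("acetic_acid", "green_filter", "iodine")


def stratified_split(records, test_fraction=0.2, seed=0):
    """Split (path, grade, state) records so each grade appears in both sets."""
    by_grade = defaultdict(list)
    for record in records:
        by_grade[record[1]].append(record)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for grade in sorted(by_grade):
        items = by_grade[grade]
        rng.shuffle(items)
        n_test = max(1, round(len(items) * test_fraction))
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test


# Made-up manifest: 5 cases per grade, 3 states per case (hypothetical paths).
records = [
    (f"g{g}_case{i}_{state}.jpg", grade, state)
    for g, grade in enumerate(GRADES)
    for i in range(5)
    for state in STATES
]
train, test = stratified_split(records)
```

This sketch splits individual images. If the three states of one case should never be divided between training and testing, the split would need to be made per case instead.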
Chapter 4
No.
1 Classification: 10
2 Training: 14
3 Testing: 13
· Data organization
· Forecasted output
· Expected output.
Model testing will be carried out using the testing set.
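Comparing the forecasted output with the expected output on the testing set can be sketched as below. The labels are invented sample data, and accuracy together with a table of label-pair counts is only one possible choice of evaluation; the report does not state which metrics will be used.

```python
from collections import Counter


def accuracy(expected, forecasted):
    """Fraction of test samples whose forecasted label equals the expected one."""
    if len(expected) != len(forecasted):
        raise ValueError("expected and forecasted must have the same length")
    if not expected:
        raise ValueError("no test samples")
    correct = sum(e == f for e, f in zip(expected, forecasted))
    return correct / len(expected)


def confusion_counts(expected, forecasted):
    """Count how often each (expected, forecasted) label pair occurs."""
    return Counter(zip(expected, forecasted))


# Invented labels for five test images.
expected = ["normal", "CIN1", "CIN2/3", "cancer", "CIN1"]
forecasted = ["normal", "CIN1", "CIN1", "cancer", "CIN1"]

acc = accuracy(expected, forecasted)
confusion = confusion_counts(expected, forecasted)
```

The pair counts show which grades are confused with each other, which a single accuracy figure hides. In this sample one CIN2/3 image is forecast as CIN1.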
Chapter 5
Weekly Task
Week No. | Date (From-To) | Work Allocated | Work Completed (Yes/No) | Remarks | Guide Signature
14-10-22
to
28-10-22
to
04-11-22
to
Present
References
[1] A. Jemal, F. Bray, M. M. Center, J. Ferlay, E. Ward, and D. Forman, “Global cancer statistics,”
CA: A Cancer Journal for Clinicians, vol. 61, no. 2, pp. 69–90, 2011.
[3] E. F. Dunne, E. R. Unger, M. Sternberg et al., “Prevalence of HPV infection among females in the
United States,” The Journal of the American Medical Association, vol. 297, no. 8, pp. 813–819, 2007.
[4] N. Munoz, F. X. Bosch, S. de Sanjose et al., “Epidemiologic classification of human
papillomavirus types associated with cervical cancer,” The New England Journal of Medicine, vol.
348, no. 6, pp. 518–527, 2003.
[6] A. Mathew and P. S. George, “Trends in incidence and mortality rates of squamous cell carcinoma
and adenocarcinoma of cervix—worldwide,” Asian Pacific Journal of Cancer Prevention, vol. 10, no.
4, pp. 645–650, 2009.
[7] J. Cuzick, M. Arbyn, R. Sankaranarayanan et al., “Overview of human papillomavirus-based and
other novel options for cervical cancer screening in developed and developing countries,” Vaccine,
vol. 26, supplement 10, pp. K29–K41, 2008.