Project Documentation
Benign and Malignant Colon Cancer Prediction using Deep CNN models
Name: Nithish S
Reg. No: 953621243039
Company Name: Hera Diagnostics, Rajapalayam
Time Frame: 26/06/2023 to 04/08/2023
1. Project Description
Diagnosing a patient with cancer usually involves multiple steps: non-invasive procedures
for initial screening, followed by invasive procedures once the initial screening gives a reliable
indication. Both stages are time-consuming, and the invasive procedures such as biopsies
(removing a piece of muscle, tissue, fluid or blood from around the affected region, mostly
through surgery) take the most time. After a sample is removed from the patient, it is placed
on a slide, stained with chemicals such as hematoxylin and eosin, and left to dry. A pathologist
then analyzes the slides manually, which takes hours. The pathologist usually looks for features
such as Cell Polarity, Cell Size and Shape, Invasion into Stroma, Mitotic Figures, Chromatin
Content and Nuclear-Cytoplasmic Ratio.
● Problem Statement: The main purpose of the project is to reduce the effort of the
pathologist, who spends hours analyzing the slide samples.
● Goals: Develop an AI model that augments the diagnostic process without taking over
the pathologist's job.
● Dataset: The dataset is open source and contains only images, with no annotations.
Each image has a dimension of 768x768 pixels. The data is divided into two categories,
1. Benign and 2. Adenocarcinoma, and the categories are balanced, with 5000 images
in each.
2. Methodology
We can use AI methods such as CNN (Convolutional Neural Network) based algorithms
for precise classification of cells. These algorithms learn directly from images by extracting
features from them and identifying the patterns underlying the input images. In addition to
CNN-based algorithms, transfer learning with pre-trained models such as VGG16 or ResNet
can expedite the training process and improve the model's accuracy by leveraging features
learned from large image datasets. This combination of CNN architectures and transfer
learning provides a solid foundation for our AI-assisted cell classification system.
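To make the idea of feature extraction concrete, here is a minimal sketch of what a single convolutional filter does to an image patch. It is an illustration only, not the model used in this project: the 5x5 patch and the vertical-edge filter are hypothetical values chosen by hand, whereas a real CNN learns its filter weights during training and stacks many such filters.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image with no padding (as in a CNN layer)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the kernel-sized window starting at (i, j).
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0, value) for value in row] for row in feature_map]


# Hypothetical 5x5 grayscale patch: dark (0) on the left, bright (1) on the right.
patch = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-written vertical-edge filter (assumption: a trained CNN would learn this).
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = relu(convolve2d(patch, vertical_edge))
for row in feature_map:
    print(row)
```

The filter responds strongly where the patch changes from dark to bright and gives zero in the uniform region. With transfer learning, filters of this kind are reused from a pre-trained network such as VGG16 or ResNet instead of being learned from scratch.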
Day-wise Tasks Completed During the Internship Period
7 04/07/2023 • Splitting the dataset into training and testing sets ensures
that a separate portion of the data is kept aside for
evaluating the model.
• These two days (9 & 10) lay the foundation for the subsequent
model development and training phases, ensuring a well-
structured and optimized neural network ready for learning
from the data.
13 11/07/2023 • On Day 13, I reviewed the results from Day 12's initial
testing. If the model exhibited any shortcomings, this was
the time to make necessary adjustments to its architecture,
hyperparameters, or preprocessing steps.
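The train/test split listed for 04/07/2023 can be sketched as below. This is a sketch under stated assumptions, not the code used during the internship: the 80/20 ratio, the random seed, the folder layout and the file names are all hypothetical. Only the class names and the count of 5000 images per class come from the dataset description.

```python
import random


def stratified_split(paths_by_class, test_fraction=0.2, seed=42):
    """Split each class separately so both sets keep the same class balance."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label, paths in sorted(paths_by_class.items()):
        shuffled = list(paths)
        rng.shuffle(shuffled)
        n_test = int(len(shuffled) * test_fraction)
        test += [(path, label) for path in shuffled[:n_test]]
        train += [(path, label) for path in shuffled[n_test:]]
    return train, test


# Hypothetical file names; the real dataset's naming is not given in this report.
paths_by_class = {
    "benign": [f"benign/img_{i:04d}.jpeg" for i in range(5000)],
    "adenocarcinoma": [f"adenocarcinoma/img_{i:04d}.jpeg" for i in range(5000)],
}

train_set, test_set = stratified_split(paths_by_class)
print(len(train_set), len(test_set))
```

Splitting per class rather than over the pooled list keeps the test set balanced between benign and adenocarcinoma images, which makes accuracy and confusion-matrix figures easier to interpret.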
Conclusion
Through this project, we have successfully achieved several significant goals aimed at enhancing
the diagnostic process in the field of cell classification for cancer diagnosis. Our primary
objective was to develop an AI-assisted cell classification system that augments the pathologist's
workflow, reducing their workload while maintaining a high standard of accuracy and reliability.
One of the key accomplishments was the development of a robust deep learning model, based on
Convolutional Neural Networks (CNN), which demonstrated exceptional capabilities in
accurately classifying cells into benign and adenocarcinoma categories. This model, fine-tuned
through rigorous optimization, significantly reduces the time and effort required for manual slide
analysis. It acts as a valuable second opinion for pathologists, aiding in the identification of
crucial features such as cell polarity, size, shape, invasion into stroma, mitotic figures, chromatin
content, and nuclear-cytoplasmic ratio.
1. Adenocarcinoma Samples 2. Benign Samples
Furthermore, the integration of this AI model into a user-friendly graphical user interface (GUI)
marked another milestone. The GUI, built using Flask, HTML, CSS, and JavaScript, ensures
seamless interaction between pathologists and the AI system. It provides a platform where users
can effortlessly upload cell images, receive prompt analysis, and access classification results in a
comprehensible format.
In addition to the technical achievements, the project prioritized usability and user feedback.
Continuous optimization based on user inputs led to an interface that not only streamlines the
diagnostic process but also caters to the specific needs and expectations of medical professionals.
This collaborative approach ensured that the system aligns with the highest standards of medical
accuracy and usability.
5. Confusion Matrix
In conclusion, this project underscores the successful fusion of cutting-edge AI technology with
medical expertise. It represents a significant step forward in the quest to enhance cancer
diagnosis, reduce human error, and ultimately improve patient outcomes. Our achievement lies
not only in the development of a sophisticated AI model and user-friendly interface but also in
the promise of a brighter future for medical professionals and patients alike.