
A

Major Project Report

On

OBJECT DETECTION USING CONVOLUTIONAL NEURAL NETWORK

Submitted in partial fulfilment of the requirements for the award of the degree of

Bachelor of Technology

In

COMPUTER SCIENCE & ENGINEERING

By

P. RAHUL 20681A0551

S. AVINASH 20681A0557

K. SHYAMSUNDAR 20681A0529

B. SANDHYA 20681A0505

B. SANDHYA 20681A0507

Under the Esteemed Guidance of

Dr. B. V. Pranay Kumar

(Associate Professor)

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

CHRISTU JYOTHI INSTITUTE OF TECHNOLOGY & SCIENCE


Colombo Nagar, Yeshwanthapur, Jangaon-506167, Telangana

2023-2024
CHRISTU JYOTHI INSTITUTE OF TECHNOLOGY & SCIENCE

Colombo Nagar, Yeshwanthapur, Jangaon-506167.

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

2023– 2024

CERTIFICATE

This is to certify that this is a bona fide record of the project work entitled “Object Detection

using Convolutional Neural Network” carried out by P. Rahul (20681A0551), S. Avinash

(20681A0557), K. Shyamsundar (20681A0529), B. Sandhya (20681A0505), B. Sandhya

(20681A0507) during the academic year 2023-2024, in partial fulfillment of the requirements for the award of the degree of Bachelor of Technology in Computer Science &

Engineering offered by Christu Jyothi Institute of Technology & Science, Yeshwanthapur,

Jangaon.

Project Co-Guide Project Guide HOD-CSE


Ch. Prudvini Dr. B. V. Pranay Kumar M. RAMARAJU

(Assistant Professor) (Associate Professor) (Assistant Professor)

External Examiner
CHRISTU JYOTHI INSTITUTE OF TECHNOLOGY & SCIENCE

Colombo Nagar, Yeshwanthapur, Jangaon-506167.

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

2023 – 2024

DECLARATION

We hereby declare that the document entitled “Object Detection using

Convolutional Neural Network” submitted to the Christu Jyothi Institute of Technology

& Science in partial fulfilment of the requirements for the award of the degree of

Bachelor of Technology in Computer Science and Engineering is a record of an original

work done by us under the guidance of Dr. B. V. Pranay Kumar, Associate Professor, and Mrs. Ch. Prudvini, Assistant Professor. This document has not been submitted to any other university for the award of any other degree.

P. RAHUL 20681A0551

S. AVINASH 20681A0557

K. SHYAMSUNDAR 20681A0529

B. SANDHYA 20681A0505

B. SANDHYA 20681A0507
ACKNOWLEDGEMENT

On the submission of our project entitled “Object Detection using Convolutional

Neural Network”, we would like to thank our Director Rev. Fr. D. VIJAYA PAUL

REDDY for giving us the opportunity to carry out our project work.

We extend our sincere thanks to our Principal Dr. S. CHANDRASHEKAR REDDY for his consistent cooperation and encouragement.

We would also like to extend our sincere thanks to Mr. M. RAMARAJU, Assistant

Professor, Head of the Department, Computer Science & Engineering, for his

valuable suggestions and motivating guidance during our project work.

We would like to extend our gratitude and sincere thanks to our guide Dr. B. V. PRANAY KUMAR, Associate Professor, and co-guide Mrs. CH. PRUDVINI, Assistant Professor, Department of Computer Science and Engineering, for their valuable and timely suggestions.

We are very thankful to our teachers for providing the required background during the project work. We would also like to extend our gratitude to our friends and all those who directly or indirectly helped us in completing our project work.

P. RAHUL 20681A0551

S. AVINASH 20681A0557

K. SHYAMSUNDAR 20681A0529

B. SANDHYA 20681A0505

B. SANDHYA 20681A0507

CHRISTU JYOTHI INSTITUTE OF TECHNOLOGY & SCIENCE
Colombo Nagar, Yeshwanthapur, Jangaon – 506167

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

Vision and Mission of the Institute

Vision

To admit and groom students from rural backgrounds and be a truly rural technical institution, benefiting society and the nation as a whole.

Mission

The mission of the institution is to create, deliver, and refine knowledge. Being a rural technical institute, we aim to:

1. Enhance our position to one of the best technical institutions and measure our performance against the highest defined standards.

2. Provide the highest quality learning environment to our students for their greater wellbeing, so as to equip them with the highest technical and professional ethics.

3. Produce engineering graduates fully equipped to meet the ever-growing needs of industry and society.

CHRISTU JYOTHI INSTITUTE OF TECHNOLOGY & SCIENCE
Colombo Nagar, Yeshwanthapur, Jangaon – 506167

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

Vision and Mission of the Department

Vision

To be a center of eminence to mould young, fresh minds into challenging computer

science professionals with ethical values.

Mission

1. Enrich knowledge and wisdom with a repository of books and modernized laboratories aided by dedicated resources.

2. Organize training and activities on upcoming techniques and inter-personal skills.

3. Develop the ability to provide sustainable solutions to real-world situations through collaborations.

CHRISTU JYOTHI INSTITUTE OF TECHNOLOGY & SCIENCE
Colombo Nagar, Yeshwanthapur, Jangaon – 506167

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

Program Educational Objectives

PEO 1: Graduates of B. Tech (CSE) are able to formulate, analyse, and solve hardware and software problems within given constraints and pursue research.

PEO 2: Demonstrate knowledge in core areas of computer science and related engineering to comprehend engineering trade-offs and create novel products.

PEO 3: Show awareness of the life-long learning needed for a successful professional career, and exhibit ethical values, excellence, leadership, and social responsibility.

CHRISTU JYOTHI INSTITUTE OF TECHNOLOGY & SCIENCE

Colombo Nagar, Yeshwanthapur, Jangaon – 506167

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

PROGRAM OUTCOMES

PO No Program Outcomes
PO1 Engineering knowledge: Apply the knowledge of mathematics, science, engineering fundamentals, and an engineering specialization to the solution of complex engineering problems.
PO2 Problem analysis: Identify, formulate, review research literature, and analyze complex engineering problems, reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering sciences.
PO3 Design/development of solutions: Design solutions for complex engineering problems and design system components or processes that meet the specified needs with appropriate consideration for public health and safety, and cultural, societal, and environmental considerations.
PO4 Conduct investigations of complex problems: Use research-based knowledge and research methods, including design of experiments, analysis and interpretation of data, and synthesis of information, to provide valid conclusions.
PO5 Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools, including prediction and modeling, to complex engineering activities with an understanding of the limitations.
PO6 The engineer and society: Apply reasoning informed by contextual knowledge to assess societal, health, safety, legal, and cultural issues and the consequent responsibilities relevant to professional engineering practice.
PO7 Environment and sustainability: Understand the impact of professional engineering solutions in societal and environmental contexts, and demonstrate the knowledge of, and need for, sustainable development.
PO8 Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of engineering practice.
PO9 Individual and team work: Function effectively as an individual, and as a member or leader in diverse teams and in multidisciplinary settings.
PO10 Communication: Communicate effectively on complex engineering activities with the engineering community and with society at large, such as being able to comprehend and write effective reports and design documentation, and make effective presentations.
PO11 Project management and finance: Demonstrate knowledge and understanding of engineering and management principles and apply these to one's own work, as a member and leader in a team, to manage projects in multidisciplinary environments.
PO12 Life-long learning: Recognize the need for, and have the preparation and ability to engage in, independent and life-long learning in the broadest context of technological change.

CHRISTU JYOTHI INSTITUTE OF TECHNOLOGY & SCIENCE
Colombo Nagar, Yeshwanthapur, Jangaon – 506167

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

Program Specific Outcomes

PSO 1 Proficiency Skill: Understand, analyse, and develop computer programs in the areas related to algorithms, system software, multimedia, web design, big data analytics, and networking for efficient design of computer-based systems of varying complexity.

PSO 2 Problem-Solving Skill: Apply standard practices and strategies in software project development, using open-ended programming environments to deliver a quality product for business success.

PSO 3 Successful Career and Entrepreneurship: Employ modern computer languages, environments, and platforms to create innovative career paths, become an entrepreneur, and develop a zest for higher studies.

ABSTRACT

In recent years, noticeable growth has been observed in the field of computer vision research. In computer vision, object detection is the task of classifying and localizing objects in order to detect them. Widely used object detection applications include human–computer interaction, video surveillance, satellite imagery, transport systems, and activity recognition. Within the wider family of deep learning architectures, the convolutional neural network (CNN), made up of a set of neural network layers, is used for visual imagery. Deep CNN architectures exhibit impressive results for the detection of objects in digital images. This report presents a comprehensive review of recent developments in object detection using convolutional neural networks. It explains the types of object detection models, the benchmark datasets available, and the research work carried out on applying object detection models to various applications. Object recognition is a popular task in computer vision. The method usually requires a dataset annotated with location information for the objects, in the form of bounding boxes around them. In this project, we have implemented a method to carry out object recognition in a weakly supervised manner, i.e., using a partially annotated dataset. The dataset provides information about what objects are present in an image but not where they are. We have used a Convolutional Neural Network (CNN) based architecture to perform this task. We also validated, by experimenting with different architectures, that mere information about the presence or absence of objects in an image (weak labels) does provide their location information for free. With recent advancements in deep neural networks for image processing, classifying and detecting objects accurately is now possible. In this project, Convolutional Neural Networks (CNNs) are used to detect objects in the environment.

TABLE OF CONTENTS

1. INTRODUCTION .............................................................................................. 1

1.1 Background 1

1.2 Objectives 1

1.3 Scope and Significance 2

1.4 Object Detection and Models 2

1.5 Convolutional neural Network 3

2. LITERATURE SURVEY .................................................................................. 7

3. SOFTWARE REQUIREMENTS ANALYSIS ................................................ 8

3.1 Existing System .............................................................................................. 8

3.2 Proposed System ............................................................................................ 9

3.3 Modules ........................................................................................................ 10

3.4 System Architecture ..................................................................................... 11

3.5 Software Requirement Specification ........................................................... 12

3.5.1 Overall Description ............................................................................... 12

3.5.2 Functional Requirements ....................................................................... 13

3.5.3 Non-Functional Requirements............................................................... 14

4. SOFTWARE DESIGN ..................................................................................... 15

4.1 Data Flow Diagram ...................................................................................... 15

4.2 UML Diagrams............................................................................................. 16


4.2.1 Use Case Diagram ................................................................................. 16

4.2.2 Sequence Diagram................................................................................ 18

4.2.3 Class Diagram ...................................................................................... 19

4.2.4 Activity Diagram .................................................................................. 20

4.2.5 Deployment Diagram ............................................................................ 21

5 . SOFTWARE AND HARDWARE REQUIREMENTS ............................... 22

6 . CODING .......................................................................................................... 23

6.2 Sample Code ................................................................................................ 23

7 . TESTING ........................................................................................................ 32

7.2 Types Of Testing .......................................................................................... 32

7.2.3 Unit Testing ........................................................................................... 32

7.2.4 Integration Testing ................................................................................ 32

7.2.5 Functional Testing ................................................................................. 33

7.2.6 System Testing ...................................................................................... 33

7.2.7 White Box Testing ................................................................................ 34

7.2.8 Black Box Testing ................................................................................. 34

7.2.9 Acceptance Testing ............................................................................... 34

7.2.10 Validation ........................................................................................... 35

7.2.11 Test Cases........................................................................................... 35

8 . OUTPUT SCREENS ...................................................................................... 38

9 . CONCLUSION ............................................................................................... 42

10. FURTHER ENHANCEMENTS ................................................ 43

11. REFERENCES ................................................................... 45

LIST OF FIGURES:

Figure 1: Introduction (R-CNN) 3

Figure 2: CNN- Architecture 4

Figure 3: System Architecture 11

Figure 4: Data Flow Diagram 15

Figure 5: Use Case Diagram 17

Figure 6: Sequence Diagram 18

Figure 7: Class Diagram 19

Figure 8: Activity Diagram 20

Figure 9: Deployment Diagram 21

Figure 10: Test Case1 37

Figure 11: Test Case2 37

Figure 12: Output Screen1 38

Figure 13: Output Screen2 39

Figure 14: Output Screen3 39

Figure 15: Output Screen4 40

Figure 16: Output Screen5 40

Figure 17: Output Screen6 41

Figure 18: Output Screen7 41

1. INTRODUCTION
1.1 Background

In recent years, advancements in computer vision technology have revolutionized various industries, ranging from autonomous vehicles and surveillance systems to healthcare and retail. Object detection, a fundamental task in computer vision, plays a crucial role in enabling machines to understand and interpret the visual world. By accurately identifying and localizing objects within images or videos, object detection systems empower applications such as pedestrian detection, vehicle counting, facial recognition, and anomaly detection.

The motivation behind this project stems from the growing demand for robust and efficient object detection systems capable of operating in real-world environments. Traditional computer vision techniques often struggle with complex scenarios, such as occlusions, varying lighting conditions, and object scale variations. Deep learning-based approaches, particularly Convolutional Neural Networks (CNNs), have emerged as a promising solution for addressing these challenges. CNNs have demonstrated remarkable performance in object detection tasks, leveraging their ability to automatically learn hierarchical features from raw pixel data.

1.2 Objectives

The primary objective of this project is to develop an object detection system using
CNNs with Python. Specifically, the project aims to achieve the following goals:
i. Implement a CNN-based object detection model capable of accurately localizing and classifying objects within images.
ii. Train the model on a diverse dataset of annotated images to learn representative features for various object categories.
iii. Explore and analyze the strengths and limitations of different CNN architectures and training strategies for object detection tasks.

1.3 Scope and Significance:

The scope of this project encompasses the implementation and evaluation of an object detection system using CNNs with Python. The project will focus on
exploring different CNN architectures and training methodologies to achieve
optimal performance in terms of accuracy, speed, and memory efficiency. While
the project will primarily target object detection in images, the techniques and
methodologies developed can be extended to video-based object detection and
other related tasks in computer vision.
The significance of this project lies in its potential to contribute to the
advancement of object detection technology and its practical applications. A
robust and efficient object detection system can enhance various domains,
including autonomous driving, surveillance systems, industrial automation, and
medical imaging. By leveraging CNNs and deep learning techniques, the project
aims to push the boundaries of what is possible in object detection, paving the
way for innovative solutions to real-world challenges.

1.4 Object Detection Models:

With the increasing use of face detection, video surveillance, vehicle tracking, and autonomous vehicle driving, fast and accurate object detection systems are heavily required. The output of object detection is normally a bounding box around an object together with a confidence value. Object detection can be single-class, where only one object class is found in a particular image, or multi-class, where more than one object belonging to different classes is to be found. Object detection systems mostly rely on a large set of training examples to construct a model for detecting an object class. The available frameworks for object detection can be categorized into two types: region proposal networks and unified networks. The models based on region proposal networks are called multi-stage or two-stage models; the unified models are called single-stage models. The multi-stage models perform the detection task in two stages: (i) regions of interest (ROIs) are generated in the first stage, and (ii) classification is performed on these ROIs. Two-stage detectors are more accurate but slower, while single-stage detectors are faster because less computational work is needed.

Fig 1: R-CNN: a region-based CNN detector.

Mask R-CNN, R-CNN, Fast R-CNN, and Faster R-CNN are two-stage object detectors. YOLO and SSD are single-stage object detection models. Two-stage models detect the foreground objects first and then classify them into a specific class, whereas single-stage detectors skip foreground detection and take uniform samples of objects through a grid. The following section describes the single-stage and two-stage detector models briefly.

1.5 Convolutional Neural Network:

Convolutional neural network (CNN) is a class of deep, feed-forward artificial neural network that has been utilized to produce accurate performance in computer vision tasks, such as image classification and detection [5]. CNNs are like traditional neural networks, but with deeper layers. They have weights, biases, and outputs passed through a nonlinear activation. The neurons of a CNN are arranged in a volumetric fashion: height, width, and depth. Fig. 2 shows the CNN architecture; it is composed of convolutional layers, pooling layers, and fully connected layers. Convolutional and pooling layers are typically alternated, and the depth of each filter increases from left to right while the output size (height and width) decreases. The fully connected layer is the last stage, and is similar to the last layer of a conventional neural network.
A Single Shot MultiBox Detector (SSD) is an object detection approach for images using a deep neural network [24]. It produces bounding boxes and class scores for detected objects at greater speed than previous approaches such as You Only Look Once (YOLO). On the other hand, MobileNet is a deep learning model that can be applied to various recognition tasks such as object detection, landmark recognition, and face attribute detection. The advantages of MobileNet are that it is accurate, fast, small, and easy to tune.

Fig 2: CNN Architecture
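The convolution-and-pooling pattern described above can be made concrete with a small NumPy sketch of one convolutional layer (with ReLU) followed by max pooling. The 6×6 input, the edge-detecting filter values, and the 2×2 pool size are illustrative assumptions, not the project's actual network:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid convolution (cross-correlation, as in CNN libraries)."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling: halves height and width."""
    h, w = fmap.shape
    return fmap[:h - h % size, :w - w % size] \
        .reshape(h // size, size, w // size, size).max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)   # toy 6x6 "image"
edge_kernel = np.array([[1., 0., -1.],
                        [1., 0., -1.],
                        [1., 0., -1.]])            # vertical-edge filter

fmap = np.maximum(conv2d(image, edge_kernel), 0)   # conv + ReLU -> 4x4
pooled = max_pool(fmap)                            # max pool -> 2x2
print(fmap.shape, pooled.shape)                    # (4, 4) (2, 2)
```

With a 6×6 input, a 3×3 filter yields a 4×4 feature map and 2×2 max pooling reduces it to 2×2, matching the shrinking output size described above.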

1.6 YOLO (You Only Look Once):


YOLO is a real-time object detection algorithm that processes images in a single pass through a neural network, dividing them into a grid and predicting bounding boxes and class probabilities for objects within each grid cell. It balances speed and accuracy, making it suitable for applications requiring fast and efficient object detection.

 Single Pass Detection: YOLO processes images in a single pass through a
neural network, making it extremely fast for real-time applications.
 Grid Division: Input images are divided into a grid, and each grid cell predicts
bounding boxes and object probabilities.
 Bounding Box Prediction: Each grid cell predicts multiple bounding boxes with
confidence scores, indicating the likelihood of containing an object.
 Class Prediction: YOLO also predicts class probabilities for the detected
objects within each grid cell.
 YOLO, short for "You Only Look Once," is a popular algorithm for real-time
object detection. It's a deep learning algorithm that belongs to the family of
Convolutional Neural Networks (CNNs). Here's a breakdown of how it works:
 Single Neural Network: Unlike traditional object detection algorithms that apply a classifier to different parts of the image (region-based methods like R-CNN), YOLO uses a single neural network to predict bounding boxes and class probabilities directly from full images in one evaluation.
 Grid Division: YOLO divides the input image into a grid of cells. Each cell is
responsible for predicting bounding boxes and class probabilities within itself.
 Bounding Box Prediction: For each grid cell, YOLO predicts bounding boxes.
These bounding boxes are represented by a set of 5 numbers: (x, y, w, h,
confidence). (x, y) represent the coordinates of the center of the bounding box
relative to the cell, and (w, h) represent the width and height of the bounding
box relative to the entire image. Confidence represents the probability that the
box contains an object and how accurate the box is.
 Class Prediction: YOLO also predicts the probability of different classes for
each bounding box. It uses a softmax function to predict the probabilities of
different classes.
 Non-max Suppression: After predicting bounding boxes and class probabilities,
YOLO applies non-max suppression to remove duplicate detections. It keeps
only the most confident bounding box when multiple boxes overlap
significantly.
 Here's a simplified overview of the YOLO algorithm process:
1. Input: YOLO takes an input image.
2. Feature Extraction: It processes the image through a series of convolutional
layers to extract features.
3. Grid Division: It divides the image into an S x S grid.
4. Prediction: For each grid cell, YOLO predicts bounding boxes and class
probabilities.
5. Non-max Suppression: It removes duplicate detections.
6. Output: The final output is a set of bounding boxes along with their associated
class probabilities.
 YOLO has several advantages, including its speed and efficiency in processing
images, making it suitable for real-time applications like autonomous driving,
surveillance, and robotics. However, it may struggle with detecting small
objects or objects close together in the image.
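The non-max suppression step described above can be sketched in plain NumPy. The (x1, y1, x2, y2) corner format and the 0.5 IoU threshold are common conventions assumed here for illustration:

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter)

def non_max_suppression(boxes, scores, iou_thresh=0.5):
    """Keep the highest-scoring box, drop boxes overlapping it, repeat."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        order = rest[iou(boxes[best], boxes[rest]) < iou_thresh]
    return keep

boxes = np.array([[10, 10, 50, 50],      # two heavily overlapping detections
                  [12, 12, 52, 52],
                  [100, 100, 140, 140]])  # a separate object
scores = np.array([0.9, 0.8, 0.7])
print(non_max_suppression(boxes, scores))  # [0, 2]
```

The duplicate box (index 1) overlaps the most confident one with IoU above 0.5, so only the best detection of each object survives.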

2. LITERATURE SURVEY

[1] H. S. Parekh, D. G. Thakore, and U. K. Jaliya (2014): This survey provides an
overview of various object detection and tracking methods. It likely discusses
different techniques, algorithms, and approaches employed in the field of
computer vision for object detection and tracking. Such surveys are valuable for
understanding the landscape of existing methods and identifying potential gaps or
areas for improvement.

[2] J. Deng et al. (2009): This paper introduces ImageNet, a large-scale


hierarchical image database. Although not specifically focused on object
detection, ImageNet has become a crucial resource for training and evaluating
object detection models due to its extensive collection of labeled images spanning
a wide range of object categories.

[3] H. Chen et al. (2018): This paper presents LSTD, a low-shot transfer detector
for object detection. The approach likely addresses the challenge of detecting
objects with limited training data, possibly by leveraging transfer learning
techniques to adapt pre-trained detectors to new object classes with only a few
examples.

[4] H. Xu et al. (2017): The paper introduces Deep Regionlets for object detection,
which could be a method for detecting objects within images using region-based
convolutional neural networks. This approach may focus on improving the region
proposal process or feature representation to enhance detection accuracy.

[5] C. C. Kao et al. (2018): The paper presents Localization-Aware Active Learning for Object Detection, which could be a technique for selecting informative samples during the training of object detection models. This approach may involve actively selecting training samples that are challenging or informative for improving localization accuracy.

[6] M. Tan et al. (2020): "EfficientDet: Scalable and Efficient Object Detection" introduced the EfficientDet architecture.
3. SOFTWARE REQUIREMENTS AND ANALYSIS

3.1 Existing System

Traditional object detection systems often rely on handcrafted features and shallow classifiers (e.g., Haar cascades, HOG features). They are limited in handling complex scenarios with occlusions, scale variations, and cluttered backgrounds, and are less efficient in terms of both accuracy and computational speed than deep learning-based approaches. Deep learning-based systems generally outperform traditional methods in accuracy and robustness, especially in challenging scenarios. CNN-based approaches offer faster inference times, making them suitable for real-time applications, and transfer learning enables efficient training on limited datasets, reducing the need for large-scale annotated data.

Existing systems vary in terms of programming languages, frameworks, and hardware dependencies. Some offer pre-trained models or cloud-based APIs for easy integration. Considerations such as model size, memory footprint, and computational requirements influence deployment choices. Traditional methods may be simpler to implement and deploy, but deep learning-based approaches offer better performance with proper training and optimization. Deep learning models often require more computational resources (e.g., GPUs) for training and inference than traditional methods, but transfer learning allows them to generalize well even with limited annotated data, whereas traditional methods may struggle with diverse datasets.
Disadvantages of Existing System:

1. Limited Accuracy
2. High Computational Cost
3. Lack of Flexibility
4. Slow Inference Speed
5. Lack of Generalization

3.2 Proposed System

The proposed system utilizes Convolutional Neural Networks (CNNs) for object detection and may employ architectures such as YOLO (You Only Look Once), SSD (Single Shot Multibox Detector), or Faster R-CNN. It allows for end-to-end learning of object features and spatial information. Data preprocessing includes augmentation techniques to increase dataset diversity. The system utilizes transfer learning with pre-trained CNN models for faster convergence and better generalization, fine-tunes model parameters on the specific object detection task, and optimizes hyperparameters such as learning rate and batch size for improved performance. Model performance is evaluated using metrics such as mean Average Precision (mAP), precision, recall, and F1-score, with validation conducted on a separate dataset to assess generalization ability. Detection results are visualized with bounding boxes and class labels overlaid on images.

The system is developed using the Python programming language with libraries/frameworks such as TensorFlow, Keras, and OpenCV. Its modular design allows for easy integration into other applications or systems and offers flexibility in deployment, supporting various hardware platforms.
Advantages of Proposed System:

1. Established Techniques
2. Compatibility with Limited Hardware
3. Familiarity and Accessibility
4. Robustness to Small Datasets
5. Versatility in Application Domains

3.3 Modules:

Data Preprocessing Module:

Purpose: Responsible for preparing the input dataset for training and
evaluation.
Tasks: Data loading: Reads annotated images and corresponding ground truth
bounding boxes from the dataset.
Data augmentation: Applies augmentation techniques such as rotation, flipping,
scaling, and translation to increase dataset diversity and model robustness.
Data normalization: Normalizes pixel values of images to ensure consistency
and convergence during training.
Implementation: Utilizes Python libraries such as OpenCV and NumPy for
image processing and manipulation.
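The normalization and flipping tasks described above can be sketched with NumPy only; a real pipeline would typically add OpenCV-based rotation and scaling, and the (x1, y1, x2, y2) box format is an assumption for illustration:

```python
import numpy as np

def normalize(image):
    """Scale 8-bit pixel values to the [0, 1] range expected by the network."""
    return image.astype(np.float32) / 255.0

def horizontal_flip(image, boxes):
    """Mirror the image left-right and remap (x1, y1, x2, y2) boxes to match."""
    h, w = image.shape[:2]
    flipped = image[:, ::-1]
    boxes = boxes.copy()
    boxes[:, [0, 2]] = w - boxes[:, [2, 0]]  # swap and mirror x-coordinates
    return flipped, boxes

image = np.random.randint(0, 256, size=(100, 200, 3), dtype=np.uint8)
boxes = np.array([[20, 30, 60, 80]])  # one ground-truth box

norm = normalize(image)
flipped, flipped_boxes = horizontal_flip(image, boxes)
print(flipped_boxes)  # x-coordinates mirrored: 20..60 becomes 140..180
```

Remapping the ground-truth boxes alongside the pixels is what keeps augmented samples correctly annotated for training.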

Convolutional Neural Network (CNN) Module:

Purpose: Implements the core object detection model based on CNN architecture.
Tasks: Feature extraction: Learns hierarchical features from input images using
multiple convolutional layers.
Object localization: Predicts bounding boxes and class probabilities for objects
present in the input image.

Evaluation Module:

Purpose: Evaluates the performance of the trained model on validation and test
datasets.
Tasks: Metric calculation: Computes evaluation metrics such as mean Average
Precision (mAP), precision, recall, and F1-score to quantify model performance.
Visualization: Generates visualizations of detection results, including bounding
boxes overlaid on input images and associated confidence scores.
Implementation: Utilizes Python libraries such as scikit-learn and Matplotlib for
metric calculation and result visualization.
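As a concrete example of the metric calculation, precision, recall, and F1-score follow directly from the true-positive, false-positive, and false-negative counts. The function below is a plain-Python sketch of what scikit-learn would compute:

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall, and F1-score from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Example: 8 correct detections, 2 spurious boxes, 2 missed objects
p, r, f1 = detection_metrics(tp=8, fp=2, fn=2)
```

mAP builds on these quantities by averaging the area under the precision-recall curve over classes and IoU thresholds.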

Deployment Module:
Purpose: Facilitates the deployment of the trained object detection model for real-
world applications.
Tasks:
Integration: Integrates the trained model into existing software systems or
applications for object detection tasks.
Optimization: Optimizes the model for deployment on target hardware platforms,
such as CPUs, GPUs, or edge devices.

3.4 System Architecture:

Fig 3: System Architecture

The system architecture for the "Object Detection using CNN with Python" project
comprises four interconnected modules. The Data Preprocessing Module prepares
the dataset by loading annotated images, applying augmentation techniques for
diversity, and normalizing pixel values. The Convolutional Neural Network (CNN)
Module implements the core object detection model, learning features, predicting
bounding boxes, and handling training/inference phases. The Evaluation Module
assesses model performance by computing metrics like mean Average Precision
(mAP) and generating visualizations of detection results.

3.5 SOFTWARE REQUIREMENT SPECIFICATION

3.5.1 Overall Description:

The "Object Detection using CNN with Python" project aims to develop a robust
and efficient system for object detection tasks utilizing Convolutional Neural
Networks (CNNs) implemented in Python. The system will process input images,
detect objects within them, and provide bounding box coordinates along with class
labels. It will be capable of accurately identifying objects in various scenarios,
contributing to applications such as surveillance, autonomous vehicles, and
industrial automation.
1. Purpose:

The purpose of this project is to provide a comprehensive solution for object
detection using CNNs with Python. By leveraging deep learning techniques, the
system aims to achieve high accuracy and efficiency in detecting objects within
images. The project aims to address the growing demand for automated object
detection systems across various industries and applications, enabling users to
analyze visual data and make informed decisions.
2. Scope:

The project scope includes the development of software modules for data
preprocessing, CNN model implementation, evaluation, and deployment. It

encompasses tasks such as loading and preprocessing input datasets, training and
fine-tuning CNN models, evaluating model performance using standard metrics,
and deploying the trained model for real-world applications. The system will
support integration with existing software systems and provide APIs/interfaces for
seamless interaction.

3. Audience:

The primary audience for this project includes data scientists, machine learning
engineers, software developers, researchers, and academics. Data scientists and
machine learning engineers will benefit from the development and deployment of
object detection models using CNNs.

3.5.2 Functional Requirements

Data Preprocessing Module:


1. Load annotated images and corresponding ground truth bounding boxes
from the dataset.
2. Apply augmentation techniques such as rotation, flipping, scaling, and
translation to increase dataset diversity.
3. Normalize pixel values of images to ensure consistency and convergence
during training.

Convolutional Neural Network (CNN) Module:


1. Define and implement a CNN architecture suitable for object detection
tasks.
2. Train the CNN model using annotated image datasets with ground truth
labels.
3. Predict bounding boxes and class probabilities for objects present in input
images during inference.

Evaluation Module:
1. Compute evaluation metrics such as mean Average Precision (mAP),

precision, recall, and F1-score.
2. Generate visualizations of detection results, including bounding boxes
overlaid on input images and associated confidence scores.
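Matching predicted boxes against ground truth for these metrics is usually based on intersection-over-union (IoU). A minimal sketch, assuming boxes are given as (x_min, y_min, x_max, y_max) tuples:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x_min, y_min, x_max, y_max) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two 10x10 boxes offset by 5 pixels overlap in a 5x5 region: 25 / 175
score = iou((0, 0, 10, 10), (5, 5, 15, 15))
```

A prediction is typically counted as a true positive when its IoU with a ground-truth box exceeds a threshold such as 0.5.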

Deployment Module:
1. Integrate the trained object detection model into existing software systems
or applications.
2. Optimize the model for deployment on target hardware platforms such as
CPUs, GPUs, or edge devices.
3. Develop APIs or interfaces for interacting with the model, enabling
seamless integration with external systems.

3.5.3 Non-Functional Requirements:

Performance:
1. The system should achieve high accuracy in object detection, with minimal
false positives and false negatives.
2. The model training and inference processes should be computationally
efficient, with reasonable execution times.

Scalability:
1. The system should be capable of handling large-scale datasets with
thousands of images and objects.
2. The model should scale gracefully with increased computational resources
for training and inference.

Reliability:
1. The system should exhibit robustness to variations in object appearance,
lighting conditions, and background clutter.
2. The trained model should generalize well to unseen data and object
categories, avoiding overfitting.

Usability:
1. The system should have a user-friendly interface, allowing users to easily
configure and interact with the different modules.

4. SOFTWARE DESIGN

4.1 Data Flow Diagram

A Data Flow Diagram (DFD) in UML (Unified Modeling Language) is a graphical
representation of the flow of data through a system. It is used
to model the processes and data involved in a system or application, and to
illustrate the data flow between different components of the system. DFDs
can be used to model systems at various levels of detail, from high-level
overviews to detailed specifications. They can help identify data
dependencies and interactions between different components of a system and
can be used to communicate system requirements and specifications to
stakeholders and development teams.

Fig 4: Data Flow Diagram

4.2 UML Diagrams:
Unified Modeling Language (UML) is a general-purpose modelling language. The
main aim of UML is to define a standard way to visualize the way a system has
been designed. It is quite similar to blueprints used in other fields of engineering.
It helps in designing and characterizing, especially those software systems that
incorporate the concept of object orientation. It describes the working of both
software and hardware systems.

The UML is a language for

 Visualizing
 Specifying
 Constructing
 Documenting
4.2.1 Use Case Diagram

The Use Case Diagram illustrates the interactions between the Object Detection
System and its modules. The primary actor, which can be a user or an external
system, interacts with the system through various use cases. In this project, the main
functionalities represented by the use cases include data preprocessing, object
detection using CNN, model evaluation, and deployment. Each of these modules
encapsulates specific tasks and functionalities necessary for achieving the
objectives of the system. The primary actors are the User, who uploads images for
detection, and the System, responsible for processing and detecting objects. The
main use cases include uploading an image, processing it with the CNN model,
detecting objects within the image, displaying the results to the user, and optionally
allowing the user to save the annotated image. Each use case depends on the
"Process Image" and "Detect Objects" functionalities, and the "Save Results" use
case can be extended from any other use case. The diagram visually represents
these interactions, aiding in understanding the system's functionality at a glance.
Fig 5: Use Case Diagram

4.2.2 Sequence Diagram

The Sequence Diagram illustrates the interactions between the user and the system
components over time, showing the sequence of messages exchanged between
them. In this project, the Sequence Diagram captures the workflow of the user
interacting with the Object Detection System. It starts with the user selecting a
dataset and proceeds with the system performing data preprocessing, CNN model
training, evaluation, and deployment. Each step in the sequence represents a
specific action or task performed by the system components, demonstrating how
they collaborate to achieve the desired outcomes. The Sequence Diagram provides
insights into the dynamic behavior of the system during its operation.

Fig 6: Sequence Diagram

4.2.3 Class Diagram


The Class Diagram represents the static structure of the Object Detection System,
depicting the classes, their attributes, and relationships. In this project, the main
classes include Object Detector, Data Preprocessor, CNN Model, and Evaluation
Module. These classes encapsulate the functionalities related to data preprocessing,
CNN model implementation, evaluation, and deployment. The relationships
between the classes illustrate how they collaborate and communicate to achieve the
overall objectives of the system. The Class Diagram provides a blueprint for the
implementation of the system's components and their interactions.

Fig 7: Class Diagram

4.2.4 Activity Diagram

The Activity Diagram represents the workflow or flow of control within the Object
Detection System, illustrating the sequence of activities or actions performed by the
system components. In this project, the Activity Diagram outlines the steps
involved in data preprocessing, object detection using CNN, model evaluation, and
deployment. Each activity represents a specific task or operation carried out by the
system modules, such as loading data, training the model, evaluating performance,
and integrating the model into existing systems.

Fig 8: Activity Diagram

4.2.5 Deployment Diagram

The Deployment Diagram depicts the physical deployment architecture of the
Object Detection System, illustrating the distribution of system components across
hardware nodes or environments. In this project, the Deployment Diagram
showcases the deployment of various modules and components onto different
hardware nodes. For example, the Data Preprocessing Module, CNN Model,
Evaluation Module, and Deployment Module may be deployed on separate servers
or cloud instances. The diagram also represents the communication paths between
the deployed components, such as data flow between the preprocessing module and
the CNN model. Additionally, it may illustrate external systems or devices
interacting with the deployed system through APIs or interfaces.

Fig 9: Deployment Diagram

5. SOFTWARE AND HARDWARE REQUIREMENTS

Software Requirements:

Operating System : Windows 10 or above

Coding Language  : Python

Software         : Visual Studio Code

Hardware Requirements:

Processor : i3 or above

Hard Disk : 40 GB

RAM       : 8 GB

Keyboard  : Standard Windows keyboard

Mouse     : Two- or three-button mouse

6. CODING

6.1 Sample Code:

import os
import sys


def main():
    """Run administrative tasks."""
    os.environ.setdefault('DJANGO_SETTINGS_MODULE',
                          'config.settings.development')
    try:
        from django.core.management import execute_from_command_line
    except ImportError as exc:
        raise ImportError(
            "Couldn't import Django. Are you sure it's installed and "
            "available on your PYTHONPATH environment variable? Did you "
            "forget to activate a virtual environment?"
        ) from exc
    execute_from_command_line(sys.argv)


if __name__ == '__main__':
    main()

import os
import sys
import tkinter as tk
from tkinter import messagebox


def run_django():
    os.environ.setdefault('DJANGO_SETTINGS_MODULE',
                          'config.settings.development')
    try:
        from django.core.management import execute_from_command_line
        execute_from_command_line(sys.argv)
    except ImportError:
        messagebox.showerror(
            "Error",
            "Couldn't import Django. Make sure it's installed and available "
            "on your PYTHONPATH or activate a virtual environment.")
    except Exception as e:
        messagebox.showerror("Error", str(e))


def main():
    """Run administrative tasks."""
    root = tk.Tk()
    root.title("Django Command-Line Utility")
    root.geometry("300x150")

    label = tk.Label(
        root, text="Click the button to run the Django command-line utility:")
    label.pack(pady=10)

    button = tk.Button(root, text="Run Django", command=run_django)
    button.pack(pady=10)

    root.mainloop()


if __name__ == '__main__':
    main()
import sys

this_python = sys.version_info[:2]
min_version = (3, 7)
if this_python < min_version:
    message_parts = [
        "This script does not work on Python {}.{}".format(*this_python),
        "The minimum supported Python version is {}.{}.".format(*min_version),
        "Please use https://bootstrap.pypa.io/pip/{}.{}/get-pip.py "
        "instead.".format(*this_python),
    ]
    print("ERROR: " + " ".join(message_parts))
    sys.exit(1)


import os.path
import pkgutil
import shutil
import tempfile
import argparse
import importlib
from base64 import b85decode


def include_setuptools(args):
    """
    Install setuptools only if absent and not excluded.
    """
    cli = not args.no_setuptools
    env = not os.environ.get("PIP_NO_SETUPTOOLS")
    absent = not importlib.util.find_spec("setuptools")
    return cli and env and absent


def include_wheel(args):
    """
    Install wheel only if absent and not excluded.
    """
    cli = not args.no_wheel
    env = not os.environ.get("PIP_NO_WHEEL")
    absent = not importlib.util.find_spec("wheel")
    return cli and env and absent


def determine_pip_install_arguments():
    pre_parser = argparse.ArgumentParser()
    pre_parser.add_argument("--no-setuptools", action="store_true")
    pre_parser.add_argument("--no-wheel", action="store_true")
    pre, args = pre_parser.parse_known_args()

    args.append("pip")

    if include_setuptools(pre):
        args.append("setuptools")

    if include_wheel(pre):
        args.append("wheel")

    return ["install", "--upgrade", "--force-reinstall"] + args


def monkeypatch_for_cert(tmpdir):
    """Patches `pip install` to provide default certificate with the lowest priority.

    This ensures that the bundled certificates are used unless the user specifies a
    custom cert via any of pip's option passing mechanisms (config, env-var, CLI).

    A monkeypatch is the easiest way to achieve this, without messing too much with
    the rest of pip's internals.
    """
    from pip._internal.commands.install import InstallCommand

    # We want to be using the internal certificates.
    cert_path = os.path.join(tmpdir, "cacert.pem")
    with open(cert_path, "wb") as cert:
        cert.write(pkgutil.get_data("pip._vendor.certifi", "cacert.pem"))

    install_parse_args = InstallCommand.parse_args

    def cert_parse_args(self, args):
        if not self.parser.get_default_values().cert:
            self.parser.defaults["cert"] = cert_path  # calculated above
        return install_parse_args(self, args)

    InstallCommand.parse_args = cert_parse_args


def bootstrap(tmpdir):
    monkeypatch_for_cert(tmpdir)

    from pip._internal.cli.main import main as pip_entry_point

    args = determine_pip_install_arguments()
    sys.exit(pip_entry_point(args))


def main():
    tmpdir = None
    try:
        tmpdir = tempfile.mkdtemp()
        pip_zip = os.path.join(tmpdir, "pip.zip")
        with open(pip_zip, "wb") as fp:
            # DATA is the base85-encoded pip wheel embedded in the full
            # get-pip.py script; the payload is elided from this listing.
            fp.write(b85decode(DATA.replace(b"\n", b"")))
        sys.path.insert(0, pip_zip)

        # Run the bootstrap
        bootstrap(tmpdir=tmpdir)
    finally:
        # Clean up our temporary working directory
        if tmpdir:
            shutil.rmtree(tmpdir, ignore_errors=True)
{
"cSpell.words": [
"detectobj",
"Inferenced"
]
}
from django.db import migrations, models
import django.db.models.deletion


class Migration(migrations.Migration):

    initial = True

    dependencies = [
        ('modelmanager', '0001_initial'),
        ('images', '0001_initial'),
    ]

    operations = [
        migrations.CreateModel(
            name='InferencedImage',
            fields=[
                ('id', models.BigAutoField(auto_created=True, primary_key=True,
                                           serialize=False, verbose_name='ID')),
                ('created', models.DateTimeField(auto_now_add=True,
                                                 verbose_name='Creation Date and Time')),
                ('modified', models.DateTimeField(auto_now=True,
                                                  verbose_name='Modification Date and Time')),
                ('inf_image_path', models.CharField(blank=True, max_length=250,
                                                    null=True)),
                ('detection_info', models.JSONField(blank=True, null=True)),
                ('yolo_model', models.CharField(
                    blank=True,
                    choices=[('yolov5s.pt', 'yolov5s.pt'), ('yolov5m.pt', 'yolov5m.pt'),
                             ('yolov5l.pt', 'yolov5l.pt'), ('yolov5x.pt', 'yolov5x.pt')],
                    default=('yolov5s.pt', 'yolov5s.pt'),
                    help_text='Selected yolo model will download. Requires an '
                              'active internet connection.',
                    max_length=250, null=True,
                    verbose_name='YOLOV5 Models')),
                ('model_conf', models.DecimalField(blank=True, decimal_places=2,
                                                   max_digits=4, null=True,
                                                   verbose_name='Model confidence')),
                ('custom_model', models.ForeignKey(
                    blank=True,
                    help_text='Machine Learning model for detection',
                    null=True,
                    on_delete=django.db.models.deletion.DO_NOTHING,
                    related_name='detectedimages',
                    to='modelmanager.mlmodel',
                    verbose_name='Custom ML Models')),
            ],
            options={
                'abstract': False,
            },
        ),
    ]
from django.contrib import admin

from .models import InferencedImage


@admin.register(InferencedImage)
class InferencedImageAdmin(admin.ModelAdmin):
    list_display = ["orig_image", "inf_image_path",
                    "model_conf", "custom_model"]


from django.apps import AppConfig


class DetectobjConfig(AppConfig):
    default_auto_field = 'django.db.models.BigAutoField'
    name = 'detectobj'


from django import forms

from .models import InferencedImage


class InferencedImageForm(forms.ModelForm):
    model_conf = forms.DecimalField(label="Model confidence",
                                    max_value=1,
                                    min_value=0.25,
                                    max_digits=3,
                                    decimal_places=2,
                                    initial=0.45,
                                    help_text="Confidence of the model for prediction.",
                                    )

    class Meta:
        model = InferencedImage
        fields = ('custom_model', 'model_conf')


class YoloModelForm(forms.ModelForm):
    model_conf = forms.DecimalField(label="Model confidence",
                                    max_value=1,
                                    min_value=0.25,
                                    max_digits=3,
                                    decimal_places=2,
                                    initial=0.45,
                                    help_text="Confidence of the model for prediction.",
                                    )

    class Meta:
        model = InferencedImage
        fields = ('yolo_model', 'model_conf')


import os

from django.conf import settings
from django.db import models
from django.utils.translation import gettext_lazy as _

from config.models import CreationModificationDateBase


class InferencedImage(CreationModificationDateBase):
    orig_image = models.ForeignKe
7. TESTING

Testing is necessary for the success of the system. During testing, the program to
be tested is executed with a set of test data, and the output of the program for the
test data is evaluated to determine whether the program is performing as expected.
The purpose of testing is to discover errors: testing is the process of trying to
discover every conceivable fault or weakness in a work product. It provides a way
to check the functionality of components, sub-assemblies, assemblies, and/or a
finished product. It is the process of exercising software with the intent of ensuring
that the software system meets its requirements and user expectations and does not
fail in an unacceptable manner. There are several types of tests, and each test type
addresses a specific testing requirement.

7.1 Types Of Testing

7.1.1 Unit Testing

Unit testing for an object detection project with CNNs involves testing individual
parts, like layers in the network, one at a time. Each test checks
if a specific component works correctly, such as a convolutional layer
identifying features accurately. These tests are automated and help catch errors
early. We might use mock data to simulate different situations for thorough
testing. This can include testing the functionalities of different layers within the
CNN architecture, such as convolutional layers, pooling layers, and fully
connected layers.
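Such a unit test can be written with Python's built-in unittest module. The example below exercises a hypothetical max-pooling helper on mock data; the function name is illustrative, not taken from the project code:

```python
import unittest
import numpy as np

def max_pool2x2(feature_map):
    """Downsample a feature map by taking the max over non-overlapping 2x2 blocks."""
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % 2, :w - w % 2]
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

class TestMaxPool(unittest.TestCase):
    def test_output_shape_and_values(self):
        # Mock feature map standing in for a real layer's output
        fm = np.array([[1, 2, 3, 4],
                       [5, 6, 7, 8]])
        pooled = max_pool2x2(fm)
        self.assertEqual(pooled.shape, (1, 2))
        self.assertTrue((pooled == np.array([[6, 8]])).all())

result = unittest.TextTestRunner(verbosity=0).run(
    unittest.TestLoader().loadTestsFromTestCase(TestMaxPool))
```

The same pattern applies to convolutional and fully connected layers: feed a small known input through the component and assert on the expected shape and values.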
7.1.2 Integration Testing

Integration testing for the object detection project involving convolutional neural
networks (CNNs) and related techniques would involve
verifying the seamless interaction between different modules or components of
the system. Firstly, it would entail testing the integration between the data
preprocessing module and the CNN model to ensure that input images are
correctly preprocessed and fed into the network. Subsequently, integration
testing would assess the coordination between the feature extraction and
classification modules within the CNN architecture, verifying that features
extracted from input images are appropriately interpreted and classified into
object categories.
7.1.3 Functional Testing

Functional testing for an object detection project involves
systematically validating that the system performs according to its functional
specifications. In this context, valid input could consist of images containing
objects from different categories, sizes, orientations, and lighting conditions,
reflecting real-world scenarios. The functions being tested would include the
entire object detection pipeline, including preprocessing steps like image
normalization, feature extraction using convolutional neural networks (CNNs),
and post-processing steps such as non-maximum suppression for refining object
bounding boxes. The output of the functional testing phase would involve
verifying that the system accurately detects and classifies objects within the input
images, ensuring that the detected objects correspond to the correct classes with
appropriate confidence scores.
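The non-maximum suppression step mentioned above can be sketched as a greedy procedure: keep the highest-scoring box and discard any remaining box that overlaps it beyond an IoU threshold. This is an illustrative implementation, not the project's production code:

```python
def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression over (x_min, y_min, x_max, y_max) boxes."""
    def iou(a, b):
        # Intersection-over-union of two boxes
        x1, y1 = max(a[0], b[0]), max(a[1], b[1])
        x2, y2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, x2 - x1) * max(0, y2 - y1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter)

    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop boxes that overlap the kept box too much
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep

# Two near-duplicate boxes and one distinct box: NMS keeps two of the three.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
kept = nms(boxes, scores=[0.9, 0.8, 0.7])
```

Functional tests would then assert that duplicate detections are suppressed while distinct objects survive, exactly as in this toy case.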

7.1.4 System Testing

System testing for the object detection project involves evaluating the
entire system's functionality and performance to ensure it meets the specified
requirements and user expectations. Firstly, the system is tested against non-
functional requirements, such as scalability, reliability, and performance. This
involves assessing its ability to process a variety of input images efficiently,
including images with different resolutions, aspect ratios, and complexities.
7.1.5 White Box Testing

White box testing for the object detection project would involve
examining the internal structure and implementation details of the system,
particularly focusing on the convolutional neural network (CNN) architecture.
This would include analyzing the design and configuration of individual layers
within the CNN, such as convolutional layers, pooling layers, and fully
connected layers. White box testing would also entail inspecting the learned
features, weights, and activations within the network to ensure they align with
the expected patterns for different object classes.

7.1.6 Black Box Testing

Black box testing for an object detection project involves evaluating the
system's functionality without relying on knowledge of its internal structure or
implementation details. In this context, black box testing entails providing input
images with diverse characteristics and assessing the system's output against
expected results. Test cases may include images with varying lighting
conditions, backgrounds, object orientations, sizes, and occlusions. The
objective is to verify that the object detection system correctly identifies and
classifies objects regardless of the specifics of its internal workings.
7.1.7 Acceptance Testing

Acceptance testing for the object detection project would involve
verifying whether the system meets the specified requirements and satisfies the
stakeholders' expectations. This testing phase typically occurs after the system
has undergone unit, integration, functional, and system testing. During
acceptance testing, stakeholders, including end-users, domain experts, and
project sponsors, interact with the system to assess its usability, accuracy, and
overall performance. This can involve providing a representative set of input
images and evaluating the system's ability to detect and classify objects
accurately across various scenarios.

7.1.8 Validation

Validation testing for the object detection project involves assessing
the performance and generalization ability of the trained model on unseen data.
This process typically includes splitting the dataset into training, validation, and
test sets, where the validation set is used to tune hyperparameters and monitor
the model's performance during training. Cross-validation techniques may also
be employed to ensure robustness and mitigate overfitting.
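The dataset split described above can be sketched as follows; the fractions and the fixed seed are illustrative choices:

```python
import random

def train_val_test_split(items, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle and partition a dataset into train/validation/test subsets."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n = len(shuffled)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

# Example: 100 image identifiers split 70 / 15 / 15
train, val, test = train_val_test_split(list(range(100)))
```

The validation subset is then used for hyperparameter tuning and early stopping, while the test subset is held out until final evaluation.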

7.1.9 Test Cases

Test cases for the object detection project encompass a range of
scenarios to thoroughly evaluate the system's functionality, robustness, and
accuracy. These test cases typically involve diverse input images containing
various objects, backgrounds, lighting conditions, and occlusions to simulate
real-world scenarios. Each test case specifies the expected behavior of the object
detection system, including the correct identification and localization of objects
within the images. Test cases cover different object classes, sizes, and
orientations to ensure the model's ability to generalize across a wide range of
object instances. Additionally, edge cases, such as objects partially occluded or
overlapping with others, are included to assess the system's resilience in
challenging conditions. Test cases also consider performance aspects, such as
processing time and memory usage, particularly when dealing with large
datasets or real-time applications. These test cases would include evaluating the
model's accuracy in correctly identifying objects of interest, assessing its ability
to generalize to unseen data by conducting tests on a separate validation dataset,
measuring the model's speed and efficiency in processing images, and testing
its robustness against different environmental conditions such as changes in
lighting, background clutter, or variations in object size and orientation.
Additionally, test cases may involve analyzing the model's performance across
different object categories to ensure balanced performance across the entire
dataset. Furthermore, tests might be conducted to evaluate the model's resilience
to occlusion or partial visibility of objects. Overall, comprehensive test cases
are essential for validating the reliability and effectiveness of the object
detection model built using CNN.

S.No  Action / Input                                   Expected Output                                Actual Output (After Execution)
1     Provide an image containing objects              Detected objects labeled with bounding boxes   Detected objects labelled
2     Provide an image with no objects                 Empty detection result                         Empty detection result
3     Provide an image with multiple objects           All objects detected and labeled correctly     All objects detected
4     Provide an image with partially obscured objects Partial detection of obscured objects          Partial detection
5     Provide a very cluttered image                   Accurate detection despite clutter             Accurate detection
Fig 10: Testcase 1: Uploading an image and detecting a single object at a time.

Here in the above figure, we have tested by uploading a single image; a single
object was detected from that image successfully, and the test case passed.

Fig 11: Testcase 2: Uploading an image and detecting multiple objects in that
image at a time.

Here in the above figure, we have tested by uploading a single image containing
multiple objects; the multiple objects were detected from that image successfully,
and the test cases passed.

8. OUTPUT SCREENS

Fig 12: A Home page for Object detection

The above figure shows the home page for object detection. Here, we can upload
an image in the Create Imageset section and, by selecting a model, the object is
detected. The detected object images are saved in the My Imagesets list section,
and we can train the models as needed to detect any kind of object.

Fig 13: Django admin page for Object detection

The above figure shows the admin page, which holds the details of the objects
saved after detection with this object detection application. The admin can add
users and can also delete a user. After a user has been successfully added, that
user can upload an image and detect the object from that image, as you can see in
the figures below.

Fig 14: Image uploaded in object detection and choosing the model
In the above figure, we have uploaded the image and chosen the model for object
detection.

Fig 15: Django administration page.

Here in the above figure, we can see all the users who have logged in to this
application, and we can change a user as per our requirements.

Fig 16: Single object detected in the image.

Here in the above figure, we can observe that an image is uploaded into our object
detection application; after clicking the Detect button, we can see the detected
object with its name. As shown, a dog is detected in the image.

Fig 17: Uploading image and detecting multiple object in that image.

Here in the above and below figures, we can observe that an image containing
multiple objects is uploaded into our object detection application; after clicking
the Detect button, we can see the detected objects along with their names, such as
Person and Horse.
Fig 18: Uploading an image and detecting multiple objects in that image

9. CONCLUSION

With the increasing usage of face detection systems, video surveillance, vehicle
tracking, and autonomous vehicle driving, fast and accurate object detection
systems are heavily required. Object detection refers to locating and classifying
objects in a digital image. With the progressive results from deep CNN
architectures, CNN-based object detectors are used in a variety of applications.
Based on the methodology, a model is categorized as either a single-stage or a
two-stage object detection model. This report summarizes the different CNN-based
models, including R-CNN, Fast R-CNN, Faster R-CNN, Mask R-CNN, SSD, and
YOLO. Apart from this, it explains the different features of the available datasets
and covers the research work carried out so far that has applied object detection
models in various fields of application. Our project successfully demonstrated the
effectiveness of Convolutional Neural Networks (CNNs) in detecting objects within
images. By leveraging CNN architectures and training them on labeled datasets, we
achieved notable accuracy in identifying and localizing various objects. This
technology holds immense potential across numerous domains, including
autonomous vehicles, surveillance systems, and medical imaging. Through this
project, we've gained valuable insights into the application of deep learning
techniques for object detection, paving the way for further advancements in this
field.

10. FURTHER ENHANCEMENTS

The object recognition system can be applied in areas such as surveillance,
face recognition, fault detection, and character recognition. The objective of this
work is to develop an object recognition system to recognize the 2D and 3D
objects in an image. The performance of the object recognition system depends on
the features used and the classifier employed for recognition. This research work
proposes a novel feature extraction method for extracting global features and
obtaining local features from the region of interest. The research work also
attempts to hybridize the traditional classifiers to recognize the object. The object
recognition system developed in this research was tested with benchmark datasets
such as COIL-100, Caltech 101, ETH-80, and MNIST, and is implemented in
MATLAB 7.5. It is important to mention the difficulties observed during the
experimentation of the object recognition system due to the several features
present in the image. The research work suggests that the image be preprocessed
and reduced to a size of 128 x 128. The proposed feature extraction method helps
to select the important features; to improve the efficiency of the classifier, the
number of features should be small. Specifically, the contributions of this research
work are as follows:

 An object recognition system is developed that recognizes two-dimensional
and three-dimensional objects.
 The features extracted are sufficient for recognizing the object and marking
the location of the object.
 The proposed classifier is able to recognize the object at less computational
cost.
 The proposed global feature extraction requires less time, compared to the
traditional feature extraction method.
 The performance of the SVM-KNN is greater and promising when
compared with the BPN and SVM.
 The performance of the One-against-One classifier is efficient.
 Global feature extracted from the local parts of the image.

 The local feature PCA-SIFT is computed from the blobs detected by the
Hessian-Laplace detector. Along with the local features, the width and
height of the object, computed through the projection method, are used.

 For night-time visual tracking, a night vision mode should be available as
an inbuilt feature of the CCTV camera.

 The number of features, local or global, used for recognition can be
increased to improve the efficiency of the object recognition system.

 The proposed object recognition system uses grey-scale images and
discards the colour information. The colour information in the image could
also be used for recognition; colour-based object recognition plays a vital
role in robotics.

 Foreground object extraction depends on binary segmentation, which is
carried out by applying thresholding techniques, so blob extraction and
tracking depend on the threshold value.

 Splitting and merging cannot be handled well in all conditions with a
single camera, owing to the loss of information when a 3D object is
projected onto 2D images.

 A fully occluded object cannot be tracked, and is considered a new object
in the next frame.

 Object identification with motion estimation needs to be fast enough to be
implemented in a real-time system. There is still scope for developing
faster algorithms for object identification; such algorithms can be
implemented on an FPGA or CPLD for fast execution.
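The 128 x 128 preprocessing and the threshold-dependent foreground extraction discussed above can be sketched as follows. This is a minimal illustration, not the pipeline used in this work: the synthetic frame, the fixed threshold value, and the helper names `preprocess` and `extract_blob` are all assumptions for the example.

```python
import numpy as np


def preprocess(image: np.ndarray, size: int = 128) -> np.ndarray:
    """Nearest-neighbour resize of a grey-scale image to size x size."""
    rows = (np.arange(size) * image.shape[0] / size).astype(int)
    cols = (np.arange(size) * image.shape[1] / size).astype(int)
    return image[np.ix_(rows, cols)]


def extract_blob(image: np.ndarray, threshold: int = 127):
    """Binarise the image with a global threshold and return the
    foreground mask plus the bounding box of the foreground blob."""
    mask = image > threshold
    coords = np.argwhere(mask)
    if coords.size == 0:
        return mask, None  # no foreground: tracking would lose the object
    (top, left), (bottom, right) = coords.min(axis=0), coords.max(axis=0)
    return mask, (top, left, bottom, right)


# Synthetic 200 x 300 grey-scale frame with a bright rectangle as the "object".
frame = np.zeros((200, 300), dtype=np.uint8)
frame[50:120, 80:220] = 200

small = preprocess(frame)        # reduced to 128 x 128 as suggested above
mask, box = extract_blob(small)  # blob location depends on the threshold value
print(small.shape, box)
```

Because the mask, and hence the recovered bounding box, depends entirely on the chosen threshold, a poorly chosen value merges or loses blobs, which is exactly the limitation noted above.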

11.REFERENCES

[1] H. S. Parekh, D. G. Thakore and U. K. Jaliya, "A Survey on Object Detection
and Tracking Methods," International Journal of Innovative Research in Computer
and Communication Engineering, vol. 2, no. 2, pp. 2970-2978, February 2014.
[2] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei, "ImageNet: A
Large-Scale Hierarchical Image Database," in 2009 IEEE Conference on Computer
Vision and Pattern Recognition, Miami, FL, USA, 2009.
[3] H. Chen, Y. Wang, G. Wang and Y. Qiao, "LSTD: A Low-Shot Transfer
Detector for Object Detection," arXiv:1803.01529v1, 2018.
[4] H. Xu, X. Lv, X. Wang, Z. Ren and R. Chellappa, "Deep Regionlets for Object
Detection," arXiv:1712.02408v1, December 2017.
[5] C.-C. Kao, P. Sen, T.-Y. Lee and M.-Y. Liu, "Localization-Aware Active
Learning for Object Detection," arXiv:1801.05124v1, January 2018.
[6] J. P. N. Cruz, M. L. Dimaala, L. G. L. Francisco, E. J. S. Franco, A. A. Bandala
and E. P. Dadios, "Object Recognition and Detection by Shape and Color Pattern
Recognition Utilizing Artificial Neural Networks," in 2013 International
Conference of Information and Communication Technology (ICoICT), Bandung,
Indonesia, 2013.
[7] A. Karpathy, October 2017. [Online]. Available:
http://cs231n.github.io/convolutional-networks/.
[8] R. L. Galvez, N. N. F. Giron, M. K. Cabatuan and E. P. Dadios, "Vehicle
Classification Using Transfer Learning in Convolutional Neural Networks," in
2017 2nd Advanced Research in Electrical and Electronic Engineering Technology
(ARIEET), Jakarta, Indonesia, 2017.
