Face Mask Detection Project

The document is a project report submitted by M KALYAN CHAKRAVARTHI for the partial fulfillment of the degree of Master of Technology in Computer Science and Engineering. The project aims to develop a real-time face mask detection system using a Raspberry Pi kit. The report includes an introduction to face mask recognition and the role of machine learning. It discusses the hardware and software requirements and provides an organization of the document. It also includes chapters on literature review, problem definition, dataset description, system analysis and design, implementation, testing, and conclusion.


Realtime Face-Mask Detection on Raspberry Kit

A PROJECT REPORT

Submitted by

M KALYAN CHAKRAVARTHI ( 193J1D5802)

In partial fulfillment for the award of the degree

Of

MASTER OF TECHNOLOGY

IN

COMPUTER SCIENCE AND ENGINEERING

Under the Esteemed Guidance of

Mrs.G.M.PADMAJA ,

Assistant Professor

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


RAGHU INSTITUTE OF TECHNOLOGY (AUTONOMOUS)

Approved by AICTE, New Delhi, Accredited by NBA & NAAC-‘A’ Grade, Permanently Affiliated to
JNTU KAKINADA
Dakamarri (V), Bheemunipatnam (M), Visakhapatnam District, Andhra Pradesh, India.

Department of Computer Science and Engineering

CERTIFICATE

This is to certify that the project report entitled “Realtime Face-Mask Detection on
Raspberry Kit” is the bonafide work of “M KALYAN CHAKRAVARTHI (Regd.No:
193J1D5802)” who carried out the project work under my supervision.

Mrs. G.M.Padmaja                         Dr.S.Adinarayana,Ph.D.
Internal Guide                           HEAD OF THE DEPARTMENT

External Examiner

DECLARATION

I hereby declare that this project entitled “Realtime Face-Mask Detection on Raspberry
Kit” is the original work done by me in partial fulfillment of the requirement for the award of the
Degree of Master of Technology in Computer Science & Engineering. This project work/project
report has not been previously submitted to any other University/Institution for the award of any other
degree.

M KALYAN CHAKRAVARTHI (REGD.NO.193J1D5802)


ACKNOWLEDGEMENT

I would like to thank all the people who helped me in the successful completion of my project
“Realtime Face-Mask Detection on Raspberry Kit”.

I would like to thank Sri Kalidindi Raghu Garu, Chairman of Raghu Institute of Technology, for
providing the necessary facilities.

I wish to express thanks to Dr. S. Satyanarayana, Principal, AU College of Engineering(A) for
supporting me in his capacity throughout my study.

I express my deep sense of gratitude to my Project Guide, Anil Chakravarthy, Assistant
Professor, Department of Computer Science and Systems Engineering (CSE), Andhra University
College of Engineering (A), for guiding me all through the project work, giving right direction and
shape to my learning by extending her expertise and experience in education. I am truly
indebted for her excellent and enlightened guidance.

I am very thankful to our beloved Head of the Department Dr.S.Adinarayana, Department of
Computer Science and Systems Engineering (CSSE), Andhra University College of Engineering
(A), for his valuable suggestions and constant motivation that greatly helped the project to be
successfully completed.

I also extend my heartfelt gratitude to all the teaching, technical and non-teaching staff of the
Department of Computer Science and Engineering (CSE) for their support.

I thank all those who contributed directly or indirectly to successfully carrying out this project
work.

MALLA KALYAN CHAKRAVARTHI


ABSTRACT

Face recognition has become a popular and significant technology in recent years. In
real-world settings such as video surveillance, where a person does not cooperate with the
system, masked faces are a common scenario. Current face recognition performance degrades
on such masked faces, yet the difficulties created by masks are usually disregarded. Face
recognition is a promising area of applied computer vision. The technique is used to
recognize a face or identify a person automatically from given images. In daily activities
such as passport checking, smart doors, access control, voter verification, criminal
investigation, and many other purposes, face recognition is widely used to authenticate a
person correctly and automatically. Face recognition has gained much attention as a unique,
reliable biometric recognition technology, which makes it more popular than other
authentication techniques such as passwords, PINs, and fingerprints.

The primary concern of this work is facial masks, and especially enhancing the
recognition accuracy on different masked faces. A feasible approach is proposed that
consists of first detecting the facial regions. The occluded face detection problem is
approached using a Cascaded Convolutional Neural Network (CNN). Its performance has also
been evaluated on heavily masked faces, with encouraging outcomes. Finally, a correlative
study is also presented for a better understanding.

FEASIBILITY STUDY

The feasibility study is carried out to test whether the proposed system is worth
implementing. The proposed system will be selected if it is good enough to meet the
performance requirements.

The feasibility study is carried out mainly in three sections, namely:

• Economic Feasibility

• Technical Feasibility

• Behavioural Feasibility

Economic Feasibility

Economic analysis, more commonly known as cost-benefit analysis, is the most frequently used
method for evaluating the effectiveness of the proposed system. This procedure determines the
benefits and savings that are expected from the proposed system. The hardware in the
system department is sufficient for system development.

Technical Feasibility

This study centres on the system department's hardware and software and the extent to which
they can support the proposed system. Since the department already has the required hardware
and software, there is no question of increasing the cost of implementing the proposed system.
By these criteria, the proposed system is technically feasible and can be developed with the
existing facilities.

Behavioural Feasibility

People are inherently resistant to change and need a sufficient amount of training, which
would result in a lot of expenditure for the organization. The proposed system can generate
reports with day-to-day information immediately at the user’s request, instead of a report
that doesn’t contain much detail.

CONTENTS

Chapters

Abstract
List of Figures
1 Introduction
1.1 Face-mask Recognition
1.2 Role of Machine Learning
1.2.1 Evolution of machine learning
1.2.2 Importance of machine learning
1.2.3 Machine Learning Applications
1.3 System Requirements
1.3.1 Hardware Requirements
1.3.2 Software Requirements
1.4 Organization of the Documents
2 Literature Survey
3 Problem Statement
3.1 Introduction
3.2 Existing System
3.2.1 Disadvantages of Existing System
3.3 Proposed system
3.3.1 Advantages of Proposed System
4 Dataset Description
4.1 Introduction
4.2 Kaggle input dataset
5 Analysis & Design
5.1 Introduction
5.2 Architecture
5.3 Flow Chart
5.4 UML diagrams
5.4.1 Use case diagram
5.4.2 Sequence diagram
5.4.3 Class diagram
5.5 Modules of CNN
5.6 CNN algorithm description
6 Implementation & Source Code
6.1 Algorithm Implementation
6.2 Source code
7 Testing & Experimental Analysis
7.1 Testing
7.2 Experimental Analysis
7.3 Expected output
8 Conclusion and Future Work
REFERENCES


LIST OF TABLES

Table 1: Hardware Requirements

Table 2: Software Requirements

Table 3: Dataset Description

Table 4: Testcases

LIST OF FIGURES

Figure 1: Importing input Dataset

Figure 2: Block Diagram for Proposed System

Figure 3: Working flow chart

Figure 4: UML diagrams

Figure 5: Sequence diagrams

Figure 6: Class diagram

Figure 7: CNN Layer1 Convolutional

Figure 8: CNN Layer2 Pooling

Figure 9: Model Summary

Figure 10: Resulted Output


Real-Time Face-Mask Detection Using Open CV

Organization of Chapters:

Chapter 1: Introduction

Chapter 2: Literature Survey

Chapter 3: Problem Statement

Chapter 4: Dataset description

Chapter 5: Analysis and Design

Chapter 6: Implementation

Chapter 7: Experimental Result analysis

Chapter 8: Conclusion
Chapter 1

Introduction

1.1 Face-Mask Recognition:


Rapid advancements in the fields of Science and Technology have led us to a stage where
we are capable of achieving feats that seemed improbable a few decades ago. Technologies in
fields like Machine Learning and Artificial Intelligence have made our lives easier and provide
solutions to several complex problems in various areas.

Face mask detection refers to detecting whether a person is wearing a mask or not. In fact, the
problem builds on face detection, where the face is detected using different
machine learning algorithms for the purposes of security, authentication and surveillance. Face
detection is a key area in the field of Computer Vision and Pattern Recognition. A significant
body of research has contributed sophisticated algorithms for face detection in the past. The
primary research on face detection was done in 2001 using handcrafted feature design and
traditional machine learning algorithms to train effective classifiers for detection
and recognition. The problems encountered with this approach include high complexity in
feature design and low detection accuracy. In recent years, face detection methods based on
deep convolutional neural networks (CNN) have been widely developed to improve detection
performance.

Modern Computer Vision algorithms are approaching human-level performance in visual
perception tasks. From image classification to video analytics, Computer Vision has proven to
be a revolutionary aspect of modern technology. In a world battling against the Novel
Coronavirus Disease (COVID-19) pandemic, technology has been a lifesaver. With the aid of
technology, ‘work from home’ has substituted our normal work routines and has become a part
of our daily lives. However, for some sectors, it is impossible to adapt to this new norm.

In this paper, we propose a two-stage CNN architecture, where the first stage detects human
faces, while the second stage uses a lightweight image classifier to classify the faces detected in
the first stage as either ‘Mask’ or ‘No Mask’ faces and draws bounding boxes around them
along with the detected class name.
This algorithm was extended to videos as well. The detected faces are then tracked between
frames using an object tracking algorithm, which makes the detection robust to noise. This
system can then be integrated with an image or video capturing device like a CCTV camera, to
track safety violations, promote the use of face masks, and ensure a safe working environment.
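
The two-stage flow described above can be sketched in a few lines of Python. This is only an illustration of how the stages connect, not the implementation used in this project: the names `detect_masks`, `fake_detector` and `fake_classifier` are hypothetical, and the two "models" are simple stand-ins for the face-detection CNN and the lightweight mask classifier.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]      # (x, y, width, height) in pixels
Frame = Sequence[Sequence[int]]      # toy grayscale image: rows of pixel values


def crop(frame: Frame, box: Box) -> List[List[int]]:
    """Cut the region described by box out of the frame."""
    x, y, w, h = box
    return [list(row[x:x + w]) for row in frame[y:y + h]]


def detect_masks(
    frame: Frame,
    detect_faces: Callable[[Frame], List[Box]],
    classify_face: Callable[[List[List[int]]], str],
) -> List[Tuple[Box, str]]:
    """Stage 1 finds face boxes; stage 2 labels each cropped face."""
    results = []
    for box in detect_faces(frame):
        label = classify_face(crop(frame, box))   # 'Mask' or 'No Mask'
        results.append((box, label))
    return results


# Hypothetical stand-ins for the two CNN stages, used only to exercise the flow.
def fake_detector(frame: Frame) -> List[Box]:
    return [(0, 0, 2, 2), (2, 0, 2, 2)]


def fake_classifier(face: List[List[int]]) -> str:
    mean = sum(sum(row) for row in face) / (len(face) * len(face[0]))
    return "Mask" if mean > 127 else "No Mask"


frame = [
    [200, 200, 10, 10],
    [200, 200, 10, 10],
]
detections = detect_masks(frame, fake_detector, fake_classifier)
```

Passing the two stages in as functions keeps the pipeline independent of any particular detector or classifier, so either stage could be swapped for a trained model without changing the loop.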

1.2 Role of Machine Learning:


Machine learning is a method of data analysis that automates analytical model building. It is
a branch of artificial intelligence based on the idea that systems can learn from data, identify
patterns and make decisions with minimal human intervention.

1.2.1 Evolution of machine learning


Because of new computing technologies, machine learning today is not like machine learning of
the past. It was born from pattern recognition and the theory that computers can learn without
being programmed to perform specific tasks; researchers interested in artificial intelligence
wanted to see if computers could learn from data. The iterative aspect of machine learning is
important because as models are exposed to new data, they are able to independently adapt. They
learn from previous computations to produce reliable, repeatable decisions and results.

While many machine learning algorithms have been around for a long time, the ability to
automatically apply complex mathematical calculations to big data – over and over, faster and
faster – is a recent development. Here are a few widely publicized examples of machine learning
applications you may be familiar with:

• The heavily hyped, self-driving Google car? The essence of machine learning.
• Online recommendation offers such as those from Amazon and Netflix? Machine learning
applications for everyday life.

• Knowing what customers are saying about you on Twitter? Machine learning combined
with linguistic rule creation.

• Fraud detection? One of the more obvious, important uses in our world today.
1.2.2 Importance of Machine Learning
Resurging interest in machine learning is due to the same factors that have made data mining and
Bayesian analysis more popular than ever: growing volumes and varieties of
available data, computational processing that is cheaper and more powerful, and affordable data
storage.

All of these things mean it's possible to quickly and automatically produce models that can
analyze bigger, more complex data and deliver faster, more accurate results even on a very large
scale. And by building precise models, an organization has a better chance of identifying
profitable opportunities or avoiding unknown risks.

Features of Machine Learning:


• Machine learning models involve machines learning from data without human help or
intervention.
• Machine Learning is the science of making computers learn and act like
humans by feeding them data and information, without being explicitly programmed.
• Machine Learning is quite different from traditional programming: here the data and the
expected output are given to the computer, and in return it gives us the program that
provides solutions to the various problems.
• It is nothing but automating the automation.
• Writing software is the bottleneck.
• Machine learning is about getting computers to program themselves.
• Within the field of data analytics, machine learning is a method used to devise
complex models and algorithms that lend themselves to prediction; in commercial use,
this is known as predictive analytics. These analytical models allow researchers, data
scientists, engineers, and analysts to "produce reliable, repeatable decisions and results"
and uncover "hidden insights" through learning from historical relationships and trends in
the data.

Categories of Machine Learning:

Machine learning tasks are typically classified into several broad categories:

Supervised learning: The computer is presented with example inputs and their desired outputs,
given by a "teacher", and the goal is to learn a general rule that maps inputs to outputs. As
special cases, the input signal can be only partially available, or restricted to special feedback.

Semi-supervised learning: The computer is given only an incomplete training signal: a training
set with some (often many) of the target outputs missing.

Active learning: The computer can only obtain training labels for a limited set of instances
(based on a budget), and also has to optimize its choice of objects to acquire labels for. When
used interactively, these can be presented to the user for labelling.

Unsupervised learning: No labels are given to the learning algorithm, leaving it on its own to
find structure in its input. Unsupervised learning can be a goal in itself (discovering hidden
patterns in data) or a means towards an end (feature learning).

Reinforcement learning: Data (in form of rewards and punishments) are given only as
feedback to the program's actions in a dynamic environment, such as driving a vehicle or
playing a game against an opponent.
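
As a small illustration of the supervised case, the sketch below learns a rule from labelled examples and then applies it to new inputs. It uses a nearest-centroid classifier, chosen here only because it fits in a few lines; the function names and the toy data are assumptions made for this example.

```python
def fit_centroids(samples, labels):
    """Average the training inputs of each class (the 'teacher' supplies the labels)."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean distance)."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], x))
    return min(centroids, key=sq_dist)


# Toy labelled data: two well-separated groups of 2-D points.
train_x = [(1.0, 1.0), (1.2, 0.8), (4.0, 4.2), (3.8, 4.0)]
train_y = ["A", "A", "B", "B"]
model = fit_centroids(train_x, train_y)
```

Here `predict(model, (0.9, 1.1))` falls in class "A" and `predict(model, (4.1, 3.9))` in class "B", because each point lies nearest to that class's averaged training inputs.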

1.2.3 Machine Learning Applications

Artificial Intelligence is everywhere. Chances are that you are using it in one way or another
without even knowing it. One of the popular applications of AI is Machine Learning, in
which computers, software, and devices perform via cognition (very similar to the human brain).
Here we share a few applications of machine learning.

I. Image Recognition:

One of the most common uses of machine learning is image recognition. There are many
situations where the object to be classified is a digital image. For digital images, the
measurements describe the output of each pixel in the image.

• In the case of a black and white image, the intensity of each pixel serves as one
measurement. So if a black and white image has N × N pixels, the total number of pixels
and hence of measurements is N².
• In a colored image, each pixel is considered as providing 3 measurements, the
intensities of the 3 main colour components, i.e. RGB. So for an N × N colored image there
are 3N² measurements.

• For face detection – The categories might be face versus no face present. There might be
a separate category for each person in a database of several individuals.

• For character recognition – We can segment a piece of writing into smaller images, each
containing a single character. The categories might consist of the 26 letters of the
English alphabet, the 10 digits, and some special characters.
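
The measurement counts given above for grayscale and colored images can be checked with a short sketch. The representation used here (nested lists, with an RGB pixel written as a 3-tuple) is an assumption for illustration only.

```python
def measurement_count(height, width, channels=1):
    """Number of measurements for an image: one per pixel per channel."""
    return height * width * channels


def flatten(image):
    """Turn an image into a flat feature list.

    Each pixel is either an int (grayscale) or an (r, g, b) tuple (color).
    """
    features = []
    for row in image:
        for pixel in row:
            if isinstance(pixel, tuple):
                features.extend(pixel)    # 3 measurements per colored pixel
            else:
                features.append(pixel)    # 1 measurement per grayscale pixel
    return features


gray = [[0, 255], [128, 64]]                                      # N = 2
rgb = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (10, 20, 30)]]   # N = 2

gray_features = flatten(gray)   # N^2 = 4 values
rgb_features = flatten(rgb)     # 3 * N^2 = 12 values
```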

II. Speech Recognition:

Speech recognition (SR) is the translation of spoken words into text. It is also known as
“automatic speech recognition” (ASR), “computer speech recognition”, or “speech to text”
(STT). In speech recognition, a software application recognizes spoken words. The measurements
in this application might be a set of numbers that represent the speech signal. We can segment
the signal into portions that contain distinct words or phonemes.

In each segment, we can represent the speech signal by the intensities or energy in different
time-frequency bands. Although the details of signal representation are outside the scope of this
report, we can represent the signal by a set of real values. Speech recognition applications
include voice user interfaces such as voice dialing, call routing, and domotic (home
automation) appliance control. It can also be used for simple data entry, preparation of
structured documents, speech-to-text processing, and in aircraft.

III. Medical Diagnosis:

ML provides methods, techniques, and tools that can help solve diagnostic and prognostic
problems in a variety of medical domains. It is being used for the analysis of the importance of
clinical parameters and of their combinations for prognosis, e.g. prediction of disease
progression, for the extraction of medical knowledge for outcomes research, for therapy planning
and support, and for overall patient management. ML is also being used for data analysis, such
as detection of regularities in the data by appropriately dealing with imperfect data, interpretation
of continuous data used in the Intensive Care Unit, and for intelligent alarming resulting in
effective and efficient monitoring.

It is argued that the successful implementation of ML methods can help the integration of
computer-based systems in the healthcare environment providing opportunities to facilitate and
enhance the work of medical experts and ultimately to improve the efficiency and quality of
medical care. In medical diagnosis, the main interest is in establishing the existence of a disease
followed by its accurate identification. There is a separate category for each disease under
consideration and one category for cases where no disease is present. Here, machine learning
improves the accuracy of medical diagnosis by analyzing data of patients.

IV. Statistical Arbitrage:

In finance, statistical arbitrage refers to automated trading strategies that are typically
short-term and involve a large number of securities. In such strategies, the user tries to implement a
trading algorithm for a set of securities on the basis of quantities such as historical correlations
and general economic variables. These measurements can be cast as a classification or estimation
problem. The basic assumption is that prices will move towards a historical average.

In the case of classification, the categories might be sell, buy or do nothing for each security. In
the case of estimation, one might try to predict the expected return of each security over a future
time horizon. In this case, one typically needs to use the estimates of the expected return to make
a trading decision (buy, sell, etc.).
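
The classification view above (sell, buy or do nothing, under the assumption that prices move back towards a historical average) can be sketched as a simple z-score rule. The function name, the threshold value and the price series are illustrative assumptions, not a real trading strategy.

```python
from statistics import mean, pstdev


def trading_signal(prices, threshold=1.0):
    """Classify the latest price as 'sell', 'buy' or 'hold'.

    The earlier prices define the historical average; the latest price is
    compared against it in units of the historical standard deviation.
    """
    history, latest = prices[:-1], prices[-1]
    avg, spread = mean(history), pstdev(history)
    if spread == 0:
        return "hold"                 # no variation to measure against
    z = (latest - avg) / spread
    if z > threshold:
        return "sell"                 # above average, expected to fall back
    if z < -threshold:
        return "buy"                  # below average, expected to rise back
    return "hold"


high_signal = trading_signal([10, 11, 9, 10, 14])
low_signal = trading_signal([10, 11, 9, 10, 6])
flat_signal = trading_signal([10, 11, 9, 10, 10])
```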

V. Learning Associations:

Learning associations is the process of developing insights into various associations between
products. A good example is how seemingly unrelated products may reveal an association with one
another when analyzed in relation to the buying behaviors of customers.

One application of machine learning is studying the associations between the products people
buy, which is also known as basket analysis. If a buyer buys ‘X’, is he or she more likely to buy
‘Y’ because of a relationship that can be identified between them? This leads to relationships
such as the one that exists between fish and chips. When new products launch in the market,
new relationships develop. Knowing these relationships could help in suggesting
the associated product to the customer, for a higher likelihood of the customer buying it, and can
also help in bundling products for a better package.
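
A minimal sketch of basket analysis follows, using the two standard quantities "support" (how often items appear together) and "confidence" (how often Y is bought given that X is bought). The baskets are made-up example data.

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in items."""
    items = set(items)
    return sum(items <= set(basket) for basket in baskets) / len(baskets)


def confidence(baskets, x, y):
    """Estimate how likely y is bought given that x is bought.

    Assumes x appears in at least one basket (otherwise this divides by zero).
    """
    return support(baskets, {x, y}) / support(baskets, {x})


# Made-up purchase records for illustration.
baskets = [
    {"fish", "chips"},
    {"fish", "chips", "peas"},
    {"fish", "bread"},
    {"bread", "milk"},
]

fish_and_chips = support(baskets, {"fish", "chips"})   # 2 of 4 baskets
chips_given_fish = confidence(baskets, "fish", "chips")  # 2 of the 3 fish baskets
```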

VI. Classification:

Classification is the process of placing each individual from the population under study into
one of many classes, which are identified from independent variables. Classification helps
analysts use measurements of an object to identify the category to which that object belongs. To
establish an efficient rule, analysts use data consisting of many examples of objects with their
correct classification.

For example, before a bank decides to disburse a loan, it assesses customers on their ability to
repay the loan. It can do so by considering factors such as the customer’s earnings, age, savings
and financial history. This information is taken from past loan data. Hence, the analyst uses it
to establish the relationship between customer attributes and related risks.

VII. Prediction:

Consider the example of a bank computing the probability of any of its loan applicants defaulting
on the loan repayment. To compute the probability of default, the system first needs to classify
the available data into certain groups, described by a set of rules prescribed by the analysts. Once
the classification is done, the probability can be computed as needed. Such probability
computations can be applied across all sectors for varied purposes.
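
The two steps described above, grouping past data by an analyst-defined rule and then computing a default probability per group, can be sketched as follows. The grouping rule, the income cut-off and the loan records are all hypothetical values chosen for illustration.

```python
from collections import defaultdict


def income_group(record):
    """Hypothetical analyst rule: bucket applicants by yearly income."""
    return "low" if record["income"] < 30000 else "high"


def default_rates(records, group_of=income_group):
    """Estimate the probability of default within each group."""
    totals, defaults = defaultdict(int), defaultdict(int)
    for record in records:
        group = group_of(record)
        totals[group] += 1
        defaults[group] += record["defaulted"]   # True counts as 1
    return {group: defaults[group] / totals[group] for group in totals}


# Made-up past loan data.
past_loans = [
    {"income": 20000, "defaulted": True},
    {"income": 25000, "defaulted": False},
    {"income": 50000, "defaulted": False},
    {"income": 80000, "defaulted": False},
    {"income": 28000, "defaulted": True},
    {"income": 60000, "defaulted": True},
]
rates = default_rates(past_loans)
```

A new applicant would then be assigned to a group by the same rule and given that group's estimated rate.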

VIII. Extraction:

Information Extraction (IE) is another application of machine learning. It is the process of
extracting structured information from unstructured data, for example web pages, articles, blogs,
business reports, and e-mails. A relational database maintains the output produced by the
information extraction. The process of extraction takes a set of documents as input and produces
structured data. This output is in a summarized form such as an Excel sheet or a table in a
relational database.
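
To show the input and output shape of extraction (documents in, table-like rows out), here is a small sketch. It uses fixed regular expressions as a rule-based stand-in rather than a learned extractor, and the patterns and sample documents are assumptions for illustration.

```python
import re

# Simplified patterns; real e-mail and date formats are more varied.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_rows(documents):
    """Turn free-text documents into structured rows, one row per document."""
    rows = []
    for doc_id, text in enumerate(documents):
        rows.append({
            "doc_id": doc_id,
            "emails": EMAIL.findall(text),
            "dates": DATE.findall(text),
        })
    return rows


docs = [
    "Report sent to alice@example.com on 2021-03-04.",
    "No contact details here.",
]
rows = extract_rows(docs)
```

Each row has the same fields, so the result could be written directly to a spreadsheet or a relational table.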

Nowadays, extraction is becoming key in the big data industry. A huge volume of data is being
generated, most of which is unstructured. The first key challenge is
handling unstructured data: converting it to a structured form based on
some pattern so that it can be stored in an RDBMS. Apart from this, data
collection mechanisms are also changing.

IX. Regression:

We can apply machine learning to regression as well. Assume that x = x1, x2, x3, …, xn are the
input variables and y is the outcome variable. In this case, we can use machine learning
technology to produce the output (y) on the basis of the input variables (x).
A model can express the relationship between the various parameters as below:

y = g(x), where g is a function that depends on specific characteristics of the model.

In regression, we can use the principle of machine learning to optimize the parameters, to cut the
approximation error and calculate the closest possible outcome. We can also use machine
learning for function optimization. We can choose to alter the inputs to get a better model. This
gives a new and improved model to work with. This is known as response surface design.
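
As a concrete instance of y = g(x), the sketch below takes g to be a straight line, g(x) = a + b·x, and chooses the parameters a and b that minimize the squared approximation error (ordinary least squares). The choice of a linear model and the sample data are assumptions for this example.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x for a single input variable."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def g(x, intercept, slope):
    """The fitted model: predict y from x."""
    return intercept + slope * x


# Sample points that lie exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_line(xs, ys)
prediction = g(4.0, a, b)
```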

1.3 System Requirements

1.3.1 Hardware Requirements


Table 1.3.1: Hardware Requirements

SYSTEM            Intel Core i3, i5, or i7
RAM               8 GB and above
HARD DISK         10 GB and above
INPUT DEVICES     Keyboard and Mouse
OUTPUT DEVICES    Monitor or PC

Table 1 shows the basic / minimal hardware requirements to do the project.


1.3.2 Software Requirements

Table 1.3.2: Software Requirements

OPERATING SYSTEM        Windows 7, 10 or above / Ubuntu Linux
IDE/EDITOR              Jupyter, Visual Studio, Google Colab
FRONT END               OpenCV
BACK END                Python and Files
PROGRAMMING LANGUAGE    Python

Table 2 shows the basic / minimal software requirements to do the project.

1.3.2.1 Python:
Python is an interpreted, object-oriented, high-level programming language with dynamic
semantics. Its high-level built-in data structures, combined with dynamic typing and dynamic
binding, make it very attractive for Rapid Application Development, as well as for use as a
scripting or glue language to connect existing components together. Python's simple, easy to
learn syntax emphasizes readability and therefore reduces the cost of program maintenance.
Python supports modules and packages, which encourages program modularity and code reuse.
The Python interpreter and the extensive standard library are available in source or binary form
without charge for all major platforms, and can be freely distributed.

Often, programmers fall in love with Python because of the increased productivity it provides.
Since there is no compilation step, the edit-test-debug cycle is incredibly fast. Debugging Python
programs is easy: a bug or bad input will never cause a segmentation fault. Instead, when the
interpreter discovers an error, it raises an exception. When the program doesn't catch the
exception, the interpreter prints a stack trace. A source level debugger allows inspection of local
and global variables, evaluation of arbitrary expressions, setting breakpoints, stepping through
the code a line at a time, and so on. The debugger is written in Python itself, testifying to
Python's introspective power. On the other hand, often the quickest way to debug a program is to
add a few print statements to the source: the fast edit-test-debug cycle makes this simple
approach very effective.
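
The behaviour described above, where bad input raises an exception and an uncaught exception produces a stack trace, can be seen in a short sketch. The function names are hypothetical; `traceback.format_exc()` is used to capture the same trace text the interpreter would print.

```python
import traceback


def parse_percentage(text):
    """Convert text to a percentage, raising ValueError on bad input."""
    value = float(text)                    # raises ValueError for non-numeric text
    if not 0 <= value <= 100:
        raise ValueError(f"percentage out of range: {value}")
    return value


def safe_parse(text):
    """Return (value, None) on success or (None, stack trace text) on failure."""
    try:
        return parse_percentage(text), None
    except ValueError:
        return None, traceback.format_exc()


value, trace = safe_parse("abc")   # bad input: an exception, not a crash
ok_value, ok_trace = safe_parse("42")
```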

1.3.2.2 Importance of Python:

Python was designed to be easy to understand and fun to use (its name came from Monty Python
so a lot of its beginner tutorials reference it). Fun is a great motivator, and since you'll be able to
build prototypes and tools quickly with Python, many find coding in Python a satisfying
experience. Thus, Python has gained popularity for being a beginner-friendly language, and it
has replaced Java as the most popular introductory language at Top U.S. Universities.

Easy to Understand

Being a very high level language, Python reads like English, which takes a lot of syntax-learning
stress off coding beginners. Python handles a lot of complexity for you, so it is very beginner-
friendly in that it allows beginners to focus on learning programming concepts and not have to
worry about too many details.

Very Flexible

As a dynamically typed language, Python is really flexible. This means there are no hard rules on
how to build features, and you'll have more flexibility solving problems using different methods
(though the Python philosophy encourages using the obvious way to solve things). Furthermore,
Python is also more forgiving of errors, so you'll still be able to compile and run your program
until you hit the problematic part.

Scalability
Not Easy to Maintain

Because Python is a dynamically typed language, the same thing can easily mean something
different depending on the context. As a Python app grows larger and more complex, this may
get difficult to maintain as errors will become difficult to track down and fix, so it will take
experience and insight to know how to design your code or write unit tests to ease
maintainability.
Slow

As a dynamically typed language, Python can be slow: because a name can refer to anything,
the interpreter has to do a lot of look-ups at run time to determine what each name
currently refers to, and this slows Python's performance down.

At any rate, there are alternatives such as PyPy that are faster implementations of Python.
While they might still not be as fast as Java, for example, they certainly improve the speed
greatly.

Community
As you step into the programming world, you'll soon understand how vital support is, as the
developer community is all about giving and receiving help. The larger a community, the more
likely you'd get help and the more people will be building useful tools to ease the process of
development.

Career Opportunities
On Angel List, Python is the 2nd most demanded skill and also the skill with the highest average
salary offered.

With the rise of big data, Python developers are in demand as data scientists, especially since
Python can be easily integrated into web applications to carry out tasks that require machine
learning.

Future
According to the TIOBE index, Python is the 4th most popular programming language out of
100. With the rise of Ruby on Rails and more recently Node.js, Python's usage as the main
prototyping language for backend web development has diminished somewhat, especially since it
has a fragmented MVC ecosystem. However, with big data becoming more and more important,
Python has become a skill that is more in demand than ever, especially as it can be integrated
into web applications.

As an open source project, Python is actively worked on with a moderate update cycle, pushing
out new versions every year or so to make sure it remains relevant. A programming language's
ability to stay relevant also depends on whether the language is getting new blood. In terms of
search volume for anyone interested in learning Python, it has skyrocketed to the 1st place when
compared to other languages.

Benefits of Python:
• Presence of Third-Party Modules
• Extensive Support Libraries
• Open Source and Community Development
• Learning Ease and Support Available
• User-friendly Data Structures
• Productivity and Speed
• Highly Extensible and Easily Readable Language.

1.3.2.3 Python Modules

Python allows us to store our code in files (also called modules). This is very useful for more
serious programming, where we do not want to retype a long function definition from the very
beginning just to change one mistake. In doing this, we are essentially defining our own
modules, just like the modules defined already in the Python library. To support this, Python has
a way to put definitions in a file and use them in a script or in an interactive instance of the
interpreter. Such a file is called a module; definitions from a module can be imported into other
modules or into the main module.
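
As a minimal sketch of this mechanism, the snippet below saves a small module to a temporary
directory and then imports it. The module name face_paths_demo and the function face_filename are
hypothetical names chosen for illustration; they are not part of the project code.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source of a tiny module. Its name and contents are illustrative assumptions.
MODULE_SOURCE = '''
def face_filename(frame_index, face_index):
    """Return the file name for a cropped face image."""
    return "input/%d%dface.jpg" % (frame_index, face_index)
'''


def load_demo_module():
    """Save MODULE_SOURCE as face_paths_demo.py and import it like any other module."""
    module_dir = tempfile.mkdtemp()
    Path(module_dir, "face_paths_demo.py").write_text(MODULE_SOURCE)
    sys.path.insert(0, module_dir)   # make the directory importable
    importlib.invalidate_caches()    # let the import system see the new file
    return importlib.import_module("face_paths_demo")


face_paths_demo = load_demo_module()
print(face_paths_demo.face_filename(3, 1))  # input/31face.jpg
```

In everyday use the file would simply sit next to the script, and a plain import statement would
be enough.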

 NumPy - NumPy is a module for Python. The name is an acronym for "Numeric Python"
or "Numerical Python". NumPy enriches the Python programming language with powerful
data structures, implementing multi-dimensional arrays and matrices.
 OpenCV - OpenCV-Python is a library of Python bindings designed to solve computer
vision problems. ... OpenCV-Python makes use of NumPy, which is a highly optimized
library for numerical operations with a MATLAB-style syntax. All the OpenCV array
structures are converted to and from NumPy arrays.
 Keras - Keras is a minimalist Python library for deep learning that can run on top of
Theano or TensorFlow. It was developed to make implementing deep learning models as
fast and easy as possible for research and development.
 TensorFlow - TensorFlow is an open source artificial intelligence library that uses data
flow graphs to build models. It allows developers to create large-scale neural networks
with many layers. TensorFlow is mainly used for classification, perception, understanding,
discovering, prediction and creation.

Jupyter notebook:
The Jupyter Notebook is an open source web application that you can use to create and share
documents that contain live code, equations, visualizations, and text. Jupyter Notebook is
maintained by the people at Project Jupyter. Jupyter Notebooks are powerful, versatile, shareable
and provide the ability to perform data visualization in the same environment. Jupyter Notebooks
allow data scientists to create and share their documents, from codes to full blown reports.

Advantages of Jupyter Notebook:


1. All in one place: As you know, Jupyter Notebook is an open-source web-based interactive
environment that combines code, text, images, videos, mathematical equations, plots, maps,
graphical user interface and widgets to a single document.
2. Easy to convert: Jupyter Notebook allows users to convert the notebooks into other formats
such as HTML and PDF. It also uses online tools and nbviewer which allows you to render a
publicly available notebook in the browser directly.
3. Easy to share: Jupyter Notebooks are saved in the structured text files (JSON format),
which makes them easily shareable.
4. Language independent: Jupyter Notebook is language-independent because a notebook is
stored in JSON (JavaScript Object Notation) format, which is a language-independent,
text-based file format. In addition, a notebook can be processed by any programming
language and can be converted to other file formats such as Markdown, HTML, and PDF.
5. Interactive code: Jupyter Notebook uses the ipywidgets package, which provides many
common user interface controls for exploring code and data interactively.

Visual Studio Code:
Visual Studio Code features a lightning fast source code editor, perfect for day-to-day use.
With support for hundreds of languages, VS Code helps you be instantly productive with syntax
highlighting, bracket-matching, auto-indentation, box-selection, snippets, and more.
Benefits of Visual Studio Code:
1. Cross-platform support: Windows, Linux, Mac
2. Light-weight
3. Robust architecture
4. IntelliSense
5. Freeware: free of cost, probably the best feature of all for programmers, and even more
so for organizations.
6. Web support: many users know it only for desktop applications, but it also provides
great tool support for web technologies such as HTML, CSS, and JSON.

Google Colab:
Colaboratory, or “Colab” for short, is a product from Google Research. Colab allows anybody to
write and execute arbitrary python code through the browser, and is especially well suited
to machine learning, data analysis and education.

Benefits of Google colab:

 Sharing: You can share your Google Colab notebooks very easily. Thanks to Google
Colab, everyone with a Google account can copy the notebook to their own Google
Drive account. There is no need to install any modules to run the code, since modules
come preinstalled within Google Colab.
 Versioning: You can save your notebook to GitHub with one click on a button. There is
no need to type git add, git commit, git push, and git pull commands in your
command-line client (if you already use versioning).
 Code snippets: Google Colab has a great collection of snippets that you can plug into
your code. For example, if you want to write data to a Google Sheet automatically, there
is a snippet for it in the snippet library.
 Forms for non-technical users: Not only programmers have to analyze data, and Python
can be useful for almost everyone in an office job. The problem is that non-technical
people are often afraid of making even the tiniest change to the code. Google Colab has a
solution for that: insert the comment #@param {type:"string"} after a variable assignment
to turn that variable into an easy-to-use form input field. Non-technical users only need
to change the form fields, and Google Colab automatically updates the code. You can find
more information at https://colab.research.google.com/notebooks/forms.ipynb
 Performance: Use the computing power of the Google servers instead of your own
machine. Running python scripts requires often a lot of computing power and can take time.
By running scripts in the cloud, you don’t need to worry. Your local machine performance
won’t drop while executing your Python scripts.
 Price: Best of all it’s free.
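
The form feature listed among the benefits above can be sketched as follows. The variable names
and default values are assumptions made for illustration. Outside Colab the #@param annotations
are ordinary comments, so the cell still runs as plain Python.

```python
# In Colab, each annotated assignment is rendered as a form field next to the cell.
video_path = "video.mp4"  #@param {type:"string"}
scale_percent = 50  #@param {type:"slider", min:10, max:100, step:10}
show_labels = True  #@param {type:"boolean"}

print(video_path, scale_percent, show_labels)
```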
Chapter 2

Literature Survey

Initially, researchers focused on the edge and gray values of the face image. This approach was
based on a pattern recognition model that relied on prior information about the face model.
AdaBoost was a good training classifier. Face detection technology achieved a breakthrough with
the famous Viola-Jones detector, which greatly improved real-time face detection. The Viola-Jones
detector optimized Haar features but failed to tackle real-world problems, being influenced by
factors such as face brightness and face orientation. It could only detect frontal, well-lit
faces and did not work well in dark conditions or with non-frontal images. These issues led
researchers to develop new face detection models based on deep learning that give better results
under different facial conditions. We have developed our face detection model using a
Convolutional Neural Network (CNN) so that it can detect the face in any geometric condition,
frontal or non-frontal. Convolutional Neural Networks are widely used for image classification
tasks.

[1] Single-stage Detectors:


The single-stage detectors treat object detection as a simple regression problem: they take
the input image and learn the class probabilities and bounding box coordinates directly.
OverFeat and DeepMultiBox were early examples. YOLO (You Only Look Once) popularized the
single-stage approach by demonstrating real-time predictions and achieving remarkable detection
speed, but it suffered from low localization accuracy compared with two-stage detectors,
especially for small objects. Basically, the YOLO network divides an image into a grid of size
GxG, and each grid cell generates N predictions for bounding boxes. Each grid cell is limited to
having only one class during the prediction, which restricts the network from finding smaller
objects. YOLOv2 improved on YOLO by adding batch normalization, a high-resolution classifier, and
anchor boxes. YOLOv3 builds upon YOLOv2 with an improved backbone classifier, multi-scale
prediction, and a new network for feature extraction. Although YOLOv3 executes faster than the
Single-Shot Detector (SSD), it does not perform well in terms of classification accuracy.
Moreover, YOLOv3 requires a large amount of computational power for inference, making it
unsuitable for embedded or mobile devices. SSD networks perform better than YOLO owing to small
convolutional filters, multiple feature maps, and prediction at multiple scales. The key
difference between the two architectures is that YOLO utilizes two fully connected layers,
whereas the SSD network uses convolutional layers of varying sizes. Besides, RetinaNet, proposed
by Lin et al., is also a single-stage object detector. It uses a feature pyramid and focal loss
to detect dense objects in the image across multiple layers and achieves remarkable accuracy as
well as speed comparable to two-stage detectors.
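
To make the grid idea concrete, the sketch below finds the grid cell that is responsible for an
object, namely the cell containing the centre of its bounding box. The function name and the
example numbers (a 448x448 image and G = 7) are assumptions for illustration and are not taken
from this project.

```python
def responsible_cell(box, image_size, grid_size):
    """Return (row, col) of the grid cell that contains the centre of `box`.

    box        -- (x_min, y_min, x_max, y_max) in pixels
    image_size -- (width, height) of the image in pixels
    grid_size  -- G, the number of cells along each side
    """
    x_min, y_min, x_max, y_max = box
    width, height = image_size
    center_x = (x_min + x_max) / 2
    center_y = (y_min + y_max) / 2
    # Scale the centre to grid units and clamp so that a centre on the
    # right or bottom border still falls into the last cell.
    col = min(int(center_x / width * grid_size), grid_size - 1)
    row = min(int(center_y / height * grid_size), grid_size - 1)
    return row, col


print(responsible_cell((100, 200, 200, 300), (448, 448), 7))  # (3, 2)
```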

[2] Two-stage Detectors:


In contrast to single-stage detectors, two-stage detectors follow a long line of reasoning in
computer vision for the prediction and classification of region proposals. They first predict
proposals in an image and then apply a classifier to these regions to classify potential
detections. Various two-stage region proposal models have been proposed by researchers in the
past. The region-based convolutional neural network, abbreviated as R-CNN, was described in 2014
by Ross Girshick et al. It may have been one of the first large-scale applications of CNNs to the
problem of object localization and recognition. The model was successfully demonstrated on
benchmark datasets such as VOC-2012 and ILSVRC-2013 and produced state-of-the-art results.
Basically, R-CNN applies a selective search algorithm to extract a set of object proposals at an
initial stage and applies an SVM (Support Vector Machine) classifier for predicting objects and
related classes at a later stage. The spatial pyramid pooling network (SPPNet), which modifies
R-CNN with an SPP layer, collects features from various region proposals and feeds them into a
fully connected layer for classification. The capability of SPPNet to compute feature maps of the
entire image in a single shot resulted in a significant improvement in object detection speed,
nearly 20 times faster than R-CNN. Next, Fast R-CNN is an extension of R-CNN and SPPNet. It
introduces a new layer, the Region of Interest (RoI) pooling layer, between shared convolutional
layers to fine-tune the model. Moreover, it allows a detector and a regressor to be trained
simultaneously without altering the network configuration. Although Fast R-CNN effectively
integrates the benefits of R-CNN and SPPNet, it still lacks detection speed compared to
single-stage detectors. Further, Faster R-CNN is an amalgam of Fast R-CNN and a Region Proposal
Network (RPN). It enables nearly cost-free region proposals by gradually integrating individual
blocks (e.g. proposal detection, feature extraction, and bounding box regression) of the object
detection system in a single step. Although this integration breaks through the speed bottleneck
of Fast R-CNN, there remains a computation redundancy at the subsequent detection stage. The
Region-based Fully Convolutional Network (R-FCN) is the only model that allows complete
backpropagation for training and inference. Feature Pyramid Networks (FPN) can detect
non-uniform objects but are least used by researchers due to high computation cost and more
memory usage. Furthermore, Mask R-CNN strengthens Faster R-CNN by including the prediction of
segmentation masks on each RoI. Although two-stage detection yields high object detection
accuracy, it is limited by low inference speed in real-time video surveillance.
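
Localization accuracy, on which the two detector families are compared above, is commonly
quantified by the intersection over union (IoU) between a predicted box and a ground-truth box.
The following is a minimal sketch; the function name and the box format are assumptions made for
illustration.

```python
def intersection_over_union(box_a, box_b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle (if any).
    left = max(box_a[0], box_b[0])
    top = max(box_a[1], box_b[1])
    right = min(box_a[2], box_b[2])
    bottom = min(box_a[3], box_b[3])

    # Width and height are clamped to zero when the boxes do not overlap.
    intersection = max(0, right - left) * max(0, bottom - top)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Two 10x10 boxes overlapping in a 5x5 square: 25 / 175, roughly 0.143
print(intersection_over_union((0, 0, 10, 10), (5, 5, 15, 15)))
```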
Chapter 3
Problem Statement

Introduction: Face recognition is one of the well-studied real-life problems. Excellent progress
has been made in face recognition technology in recent years. Face alterations and the presence
of different masks make recognition much more challenging. The primary concern of this work is
facial masks, and in particular improving the recognition accuracy for different masked faces.
Face mask detection has emerged as a challenging problem in the domain of image processing and
computer vision.

3.1 Existing System


There are several recent works on facial mask recognition using different frameworks.
Most of the existing methods work on Fully Convolutional Networks. In the existing system, a
Fully Convolutional Network is used to detect faces, and facial feature extraction is then
performed using a semantic segmentation model.

3.1.1 Disadvantages of Existing System

 A Fully Convolutional Network is significantly slower than, say, max pooling, in both the
forward and the backward pass. If the network is pretty deep, each training step takes
much longer.
 The network is too slow and complicated if all that is needed is a good pre-trained model.

3.2 Proposed System


In this model, our aim is to detect whether or not a person is wearing a mask. Our proposed
system analyzes people’s faces using a Convolutional Neural Network (CNN) architecture in two
stages. In the first stage, the system detects the faces in the input image, and the detected
faces are cropped and normalized to a size of 150×150. These face images are then used as input
to the CNN. The second stage uses a lightweight image classifier to classify the faces detected
in the first stage as either ‘Mask’ or ‘No Mask’ and draws bounding boxes around them along with
the detected class name.
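
The crop-and-normalize step of the first stage can be sketched with NumPy alone. This is an
illustration under stated assumptions: the function name is hypothetical, and nearest-neighbour
sampling stands in for whatever resizing method the project actually uses.

```python
import numpy as np


def crop_and_resize(image, box, size=150):
    """Crop `box` = (x, y, w, h) from `image` and resize it to size x size.

    Nearest-neighbour sampling is used so that the sketch needs only NumPy.
    """
    x, y, w, h = box
    face = image[y:y + h, x:x + w]
    # For every output row/column, pick the index of the source pixel to copy.
    rows = np.arange(size) * face.shape[0] // size
    cols = np.arange(size) * face.shape[1] // size
    return face[rows[:, None], cols]


frame = np.zeros((480, 640, 3), dtype=np.uint8)   # stand-in for a video frame
face = crop_and_resize(frame, (200, 100, 90, 120))
print(face.shape)  # (150, 150, 3)
```

Whatever the size of the detected face, the classifier in the second stage always receives an
input of the same shape.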
3.2.1 Advantages of Proposed System

 The proposed system uses only a Convolutional Neural Network (CNN) model to detect
human faces.
 Feature extraction performs better than in the existing system, and computation is very
fast. The power of a CNN is that it detects distinct features from images by itself, without
any human intervention.
Chapter 4

Dataset Description

In our research, we worked on the covid-face-mask-detection-dataset, which we obtained
from the Kaggle website.

Kaggle allows users to find and publish data sets, explore and build models in a web-based data-
science environment, work with other data scientists and machine learning engineers, and enter
competitions to solve data science challenges. Kaggle has over 50,000 public datasets and
400,000 public notebooks.

Dataset                                     No. of faces with masks    No. of faces without masks
Kaggle-covid-face-mask-detection-dataset    503                        503

Fig 1: Input data to the model


Chapter 5

Analysis and Design

Object-oriented analysis and design (OOAD) is a software engineering approach that models a
system as a group of interacting objects. Each object represents an entity of interest in the system
being modeled and is characterized by its class, its state (data elements), and its behaviors.
Various models can be created to show the static structure, dynamic behaviors, and runtime
deployment of these collaborating objects. There are a number of different notations for
representing these models, such as the Unified Modeling Language (UML).
Object-oriented analysis (OOA) applies object modeling techniques to analyze the
functional requirements for a system. Object-oriented design (OOD) elaborates the analysis
models to produce implementation specifications. OOA focuses on what this system does, OOD
on how this system does it.

Design: The most creative and challenging phase of the life cycle is system design. The term
design describes a final system and the process by which it is developed. It refers to the
technical specifications that will be applied in the implementation of the candidate system.
Design may be defined as “the process of applying various techniques and principles for the
purpose of defining a device, a process or a system in sufficient detail to permit its physical
realization”. The resulting designs are documented and evaluated by management as a step towards
implementation.

The importance of software design can be stated in a single word: “Quality”. Design provides us
with representations of software that can be assessed for quality. Design is the only way we can
accurately translate customers’ requirements into a complete software product or system. Without
design, we risk building an unstable system that might fail if small changes are made, that may
be difficult to test, or whose quality cannot be tested. Design is therefore an essential facet
in the development of software products.
5.1 Architecture :

Fig 2: CNN Architecture

In this model, our aim is to detect whether or not a person is wearing a mask. Face mask
detection has emerged as a challenging problem in the domain of image processing and computer
vision. Face detection has various use cases ranging from face recognition to capturing facial
motions, where the latter calls for the face to be revealed with very high precision. Owing to
the rapid advancement in machine learning algorithms, face mask detection now seems to be well
addressed. The technology is more relevant today because it is used to detect faces in both
static images and videos, and the same is graphically represented in the figure as a flow
diagram.

5.2 Flow Diagram:

Step1: Import all the required libraries and packages.
Step2: Train the model on images of faces with masks and without masks. After training is
complete, validate and test the model for accuracy, and finally save the model file for
prediction.
Step3: Run the video and check whether a face is detected in each frame.
Step4: If a human face is detected, the image is re-scaled; if it is not a human face, the
condition breaks.
Step5: After re-scaling, the image is converted to greyscale and checked against the face
recognition values, and a rectangular box is displayed on the screen.
Step6: After face recognition is complete, the model predicts whether the face has a mask or not.
Step7: If the person wears a mask, the display shows a green rectangular box with the name
Mask on the screen. If the person does not wear a mask, the display shows a red rectangular box
with the name No-mask on the screen.
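
Steps 6 and 7 amount to a small decision rule, sketched below. The 0.5 threshold, the convention
that low scores mean Mask, and the function name are assumptions made for illustration. Colours
are given in the BGR order used by OpenCV.

```python
GREEN = (0, 255, 0)  # BGR
RED = (0, 0, 255)    # BGR


def label_and_color(score, threshold=0.5):
    """Map a classifier score in [0, 1] to the text and box colour to display.

    Assumes class 0 is 'Mask' and class 1 is 'No-mask'.
    """
    if score < threshold:
        return "Mask", GREEN
    return "No-mask", RED


for score in (0.1, 0.9):
    print(score, label_and_color(score))
```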

Fig 3: Working flow chart

5.3 UML DIAGRAMS:

UML, short for Unified Modeling Language, is a standardized modeling language consisting of
an integrated set of diagrams, developed to help system and software developers specify,
visualize, construct, and document the artifacts of software systems, as well as for business
modeling and other non-software systems. The UML represents a collection of best engineering
practices that have proven successful in the modeling of large and complex systems. The UML is a
very important part of developing object-oriented software and the software development process.
The UML uses mostly graphical notations to express the design of software projects. Using the
UML helps project teams communicate, explore potential designs, and validate the architectural
design of the software.

The goal of UML is to provide a standard notation that can be used by all object-oriented
methods and to select and integrate the best elements of precursor notations. UML has been
designed for a broad range of applications. Hence, it provides constructs for a broad range of
systems and activities.

5.3.1 Use case Diagram:

A UML use case diagram is the primary form of system/software requirements for a new
software program under development. Use cases specify the expected behavior (what), and not the
exact method of making it happen (how). Once specified, use cases can be denoted by both
textual and visual representation (i.e., use case diagram).

A key concept of use case modeling is that it helps us design a system from the end user's
perspective. It is an effective technique for communicating system behavior in the user's terms
by specifying all externally visible system behavior.
Fig 4: use case diagram
5.3.2 Sequence Diagram:

UML Sequence Diagrams are interaction diagrams that detail how operations are carried
out. They capture the interaction between objects in the context of a collaboration. Sequence
Diagrams are time focused, and they show the order of the interaction visually by using the
vertical axis of the diagram to represent time: what messages are sent and when.

Fig 5: sequence diagram


Sequence Diagrams capture:

 the interaction that takes place in a collaboration that either realizes a use case or
an operation (instance diagrams or generic diagrams)
 high-level interactions between the user of the system and the system, between the
system and other systems, or between subsystems (sometimes known as system
sequence diagrams)

5.3.3 Class Diagram:

In software engineering, a class diagram in the UML is a type of static structure diagram
that describes the structure of a system by showing the system's classes, their attributes,
operations (or methods), and the relationships among objects.

A UML class diagram is made up of:

 A set of classes and


 A set of relationships between classes

The user performs an operation called giving input to the system. The system trains and tests the
model and predicts the result based on the inputs given to it and displays the result to the user.

Fig 6: class diagram


5.4 Modules of CNN:

CONVOLUTIONAL NEURAL NETWORKS: (CNN)

A Convolutional Neural Network, also known as a CNN, is a deep learning model that is mostly
used for the analysis of visual imagery. A CNN is a class of deep feedforward artificial neural
network (ANN). The network is trained on the dataset supplied to it and then predicts the labels
to be assigned to new inputs. For any kind of data, this neural network uses its strengths
against the curse of dimensionality. Some of the areas where CNNs are widely used are image
recognition, image classification, image captioning, and object detection. CNNs gained immense
popularity in 2012 with AlexNet, developed by Alex Krizhevsky and colleagues. In just three
years, engineers advanced from the 8-layer AlexNet to the 152-layer ResNet. CNNs also come in
handy for tasks such as recommendation systems and natural language processing (NLP), where
contextual importance is considered. The key task of the neural network is to process all the
layers, and hence detect all the underlying features, automatically. A CNN is a convolution tool
that separates the different features of the picture for analysis and prediction.

 A convolutional neural network is a type of artificial neural network that has so far
been most popularly used for analyzing images.
 A CNN has hidden layers called convolutional layers, and these layers are precisely
what make it a CNN.
 Convolutional layers are able to detect patterns.
 In a CNN model, an input object passes through the convolutional layers, which transform
it before it reaches the output.

A CNN typically has three types of layers:

1. Convolutional layer
2. Pooling layer
3. Fully connected layer.

1. Convolution Layer: The convolution layer is a core building block of the CNN. It carries
the main portion of the network’s computational load. This layer performs a dot product
between two matrices, where one matrix is the set of learnable parameters, otherwise known
as a kernel, and the other matrix is the restricted portion of the receptive field. The kernel
is spatially smaller than the image but extends through its full depth.

Fig:7 Convolutional Layer

During the forward pass, the kernel slides across the height and width of the image,
producing the image representation of that receptive region. This produces a two-dimensional
representation of the image known as an activation map that gives the response of the kernel
at each spatial position of the image. The sliding size of the kernel is called a stride. If
we have an input of size W × W × D and Dout kernels with a spatial size of F, stride S, and
amount of padding P, then the size of the output volume can be determined by the following
formula:

$$W_{out} = \frac{W - F + 2P}{S} + 1$$

This yields an output volume of size $W_{out} \times W_{out} \times D_{out}$.
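
The output-size relation can be checked numerically with a short sketch. The function name and
the example values are assumptions: a 150x150 input matches the face size used in this project,
while the count of 32 kernels is only illustrative.

```python
def conv_output_shape(width, kernel, stride=1, padding=0, num_kernels=1):
    """Output volume of a convolution on a width x width x D input.

    Uses W_out = (W - F + 2P) / S + 1; the depth equals the number of kernels.
    """
    span = width - kernel + 2 * padding
    if span < 0 or span % stride != 0:
        raise ValueError("kernel, stride and padding do not tile the input evenly")
    w_out = span // stride + 1
    return w_out, w_out, num_kernels


print(conv_output_shape(150, kernel=3, num_kernels=32))             # (148, 148, 32)
print(conv_output_shape(150, kernel=3, padding=1, num_kernels=32))  # (150, 150, 32)
```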

2. Pooling layer: The pooling layer replaces the output of the network at certain locations by
deriving a summary statistic of the nearby outputs. This helps in reducing the spatial size of
the representation, which decreases the required amount of computation and weights. The
pooling operation is processed on every slice of the representation individually.
There are several pooling functions such as the average of the rectangular
neighborhood, L2 norm of the rectangular neighborhood, and a weighted average based on
the distance from the central pixel. However, the most popular process is max pooling,
which reports the maximum output from the neighborhood.

Fig:8 Pooling
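If we have an activation map of size W × W × D, a pooling kernel of spatial size F, and stride S,
then the size of the output volume can be determined by the following formula:

$$W_{out} = \frac{W - F}{S} + 1$$

This yields an output volume of size $W_{out} \times W_{out} \times D$.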
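
As an illustration of max pooling on a single slice of an activation map, the sketch below uses
a 2x2 window with stride 2. The function name and the input values are assumptions chosen for
the example.

```python
import numpy as np


def max_pool_2d(activation, size=2, stride=2):
    """Max pooling over one 2-D slice of an activation map."""
    out_h = (activation.shape[0] - size) // stride + 1
    out_w = (activation.shape[1] - size) // stride + 1
    pooled = np.empty((out_h, out_w), dtype=activation.dtype)
    for i in range(out_h):
        for j in range(out_w):
            window = activation[i * stride:i * stride + size,
                                j * stride:j * stride + size]
            pooled[i, j] = window.max()   # keep the strongest response
    return pooled


activation = np.array([[1, 3, 2, 4],
                       [5, 6, 1, 0],
                       [7, 2, 9, 8],
                       [0, 1, 3, 4]])
print(max_pool_2d(activation))
# [[6 4]
#  [7 9]]
```

The 4x4 slice shrinks to 2x2, which agrees with (W - F) / S + 1 = (4 - 2) / 2 + 1 = 2.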

3. Fully Connected Layer: Neurons in this layer have full connectivity with all neurons in
the preceding and succeeding layers, as seen in a regular fully connected neural network
(FCNN). This is why its output can be computed as usual by a matrix multiplication followed by
a bias offset. The FC layer helps to map the representation between the input and the output.
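
The computation of a fully connected layer can be sketched in a few lines of NumPy. The function
name and all numeric values are assumptions chosen so that the result is easy to verify by hand.

```python
import numpy as np


def fully_connected(inputs, weights, bias):
    """Dense layer without activation: one matrix product plus a bias."""
    return inputs @ weights + bias


features = np.array([[1.0, 2.0, 3.0]])   # one sample with three features
weights = np.array([[0.5, -1.0],
                    [0.0, 1.0],
                    [1.0, 0.5]])          # maps 3 inputs to 2 outputs
bias = np.array([0.1, -0.2])

print(fully_connected(features, weights, bias))  # [[3.6 2.3]]
```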

Designing a Convolutional Neural Network:

Our convolutional neural network has architecture as follows:

[INPUT]

→ [CONV 1] → [BATCH NORM] → [ReLU] → [POOL 1]

→ [CONV 2] → [BATCH NORM] → [ReLU] → [POOL 2]

→ [FC LAYER] → [RESULT]


5.5 CNN Algorithm Description:

 Importing the libraries and reading the CSV file.
 Getting the training features X and labels y from the pixels of the CSV and converting
them into NumPy arrays. We also add an additional dimension to our feature vector by
using the np.expand_dims() function; this is done to make the input suitable for the
CNN which we will design later. Both features and labels are stored as .npy files to
be used later.
 Importing the required libraries for the CNN.
 We have a 150x150-pixel resolution, so the width and height are both 150. We will be
processing our inputs with a batch size of 64.
 Load the features and labels into x and y respectively.
 Divide the data into training and testing sets and save the test features and labels to
be used later.
 Sequential() - A sequential model is just a linear stack of layers, in which layers are
put on top of each other as we progress from the input layer to the output layer.
 model.add(Conv2D()) - This is a 2D convolutional layer which performs the
convolution operation described in Section 5.4. To quote the Keras documentation,
“This layer creates a convolution kernel that is convolved with the layer input to
produce a tensor of outputs.” Here we are using a 3x3 kernel size and Rectified Linear
Unit (ReLU) as our activation function.
 model.add(MaxPooling2D()) - This function performs the pooling operation on the
data as explained in Section 5.4. We are taking a pooling window of 2x2 with 2x2
strides in this model. To read more about MaxPooling, refer to the Keras
documentation.
 model.add(Dropout()) - Dropout is a technique where randomly selected neurons are
ignored during training. They are “dropped out” randomly. This reduces overfitting.
 model.add(Flatten()) - This just flattens the input from ND to 1D and does not affect
the batch size.
 model.add(Dense()) - According to the Keras documentation, Dense implements the
operation: output = activation(dot(input, kernel) + bias), where activation is the
element-wise activation function passed as the activation argument, kernel is a
weights matrix created by the layer, and bias is a bias vector created by the layer. In
simple words, it is the final layer, which uses the features learned by the previous
layers and maps them to the label.
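
The layer calls listed above can be combined into a model along the lines of the architecture
in Section 5.4. This is a minimal sketch, not the project's exact model: the filter counts (32
and 64), the dropout rate (0.5), the single sigmoid output, the Adam optimizer, and the function
name build_mask_classifier are all assumptions, since the report does not state them.

```python
from tensorflow.keras import layers, models


def build_mask_classifier(input_size=150):
    """[CONV -> BATCH NORM -> ReLU -> POOL] x 2 -> FC, for binary classification."""
    model = models.Sequential([
        layers.Input(shape=(input_size, input_size, 3)),
        layers.Conv2D(32, (3, 3)),                     # assumed 32 kernels of 3x3
        layers.BatchNormalization(),
        layers.Activation("relu"),
        layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
        layers.Conv2D(64, (3, 3)),                     # assumed 64 kernels of 3x3
        layers.BatchNormalization(),
        layers.Activation("relu"),
        layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
        layers.Dropout(0.5),                           # assumed dropout rate
        layers.Flatten(),
        layers.Dense(1, activation="sigmoid"),         # score for the second class
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


model = build_mask_classifier()
model.summary()
```

A single sigmoid unit is one common choice for a two-class problem such as Mask versus No Mask;
a two-unit softmax output would work equally well.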

Fig 9: CNN Model Dataset Summary


Chapter 6

Implementation

Implementation is the stage of the project when the theoretical design is turned into a
working system. Thus, it can be considered the most critical stage in achieving a successful
new system and in giving the user confidence that the new system will work and be effective.

The implementation stage involves careful planning, investigation of the existing system and its
constraints on implementation, designing of methods to achieve changeover and evaluation of
changeover methods.

6.1 Importing all the required packages

6.2 Loading a dataset


6.3 Defining directories

6.4 Importing matplot library

6.5 Classification of images into classes


6.6 Model Summary

6.7 Validating data


6.8 Plotting chart for Training and Validation Loss

6.9 Plotting chart for Training and Validation Accuracy

6.10 Evaluation of Model and print test loss and test accuracy

6.11 Facial Mask Prediction


Mask.py: (face-mask detection)

# Importing libraries
import os

import cv2
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import load_img, img_to_array

# Loading model.h5
model = load_model('model.h5')
img_width, img_height = 150, 150
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# Providing video as input to the model
cap = cv2.VideoCapture('video.mp4')
os.makedirs('input', exist_ok=True)  # folder for the cropped face images
img_count_full = 0
font = cv2.FONT_HERSHEY_SIMPLEX
fontScale = 1
thickness = 2

while True:
    img_count_full += 1
    response, color_img = cap.read()
    if not response:
        break  # end of video or read error

    # Scale the frame to 50% of its size to speed up detection
    scale = 50
    width = int(color_img.shape[1] * scale / 100)
    height = int(color_img.shape[0] * scale / 100)
    dim = (width, height)
    color_img = cv2.resize(color_img, dim, interpolation=cv2.INTER_AREA)
    gray_img = cv2.cvtColor(color_img, cv2.COLOR_BGR2GRAY)

    # Detect faces on the grayscale frame (scaleFactor=1.1, minNeighbors=6)
    faces = face_cascade.detectMultiScale(gray_img, 1.1, 6)
    img_count = 0
    for (x, y, w, h) in faces:
        org = (x - 10, y - 10)
        img_count += 1
        color_face = color_img[y:y + h, x:x + w]
        face_path = 'input/%d%dface.jpg' % (img_count_full, img_count)
        cv2.imwrite(face_path, color_face)
        img = load_img(face_path, target_size=(img_width, img_height))
        img = img_to_array(img)
        img = np.expand_dims(img, axis=0)
        prediction = model.predict(img)
        # Assumes a single sigmoid output in which class 0 is 'Mask'
        if prediction[0][0] < 0.5:
            class_label = "Mask"
            color = (0, 255, 0)  # green (BGR)
        else:
            class_label = "No Mask"
            color = (0, 0, 255)  # red (BGR)
        cv2.rectangle(color_img, (x, y), (x + w, y + h), color, 3)
        cv2.putText(color_img, class_label, org, font, fontScale, color, thickness, cv2.LINE_AA)

    cv2.imshow('Face mask detection', color_img)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Chapter 7

Testing & Experimental result analysis

7.1 TYPES OF TESTS:

UNIT TESTING:

Unit testing involves the design of test cases that validate that the internal program logic is
functioning properly, and that program inputs produce valid outputs. All decision branches and
internal code flow should be validated. It is the testing of individual software units of the
application. It is done after the completion of an individual unit and before integration. This
is structural testing that relies on knowledge of the unit's construction and is invasive. Unit
tests perform basic tests at component level and test a specific business process, application,
and/or system configuration.
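
A unit test for one small unit of this project might look like the sketch below. The helper
scaled_size is a hypothetical function that mirrors the frame-scaling arithmetic in Mask.py; it
is an assumption for illustration and does not appear in the project code.

```python
import unittest


def scaled_size(width, height, percent):
    """Frame size after scaling both sides to `percent` of the original."""
    return int(width * percent / 100), int(height * percent / 100)


class ScaledSizeTest(unittest.TestCase):
    def test_half_size(self):
        self.assertEqual(scaled_size(640, 480, 50), (320, 240))

    def test_fractional_result_is_truncated(self):
        self.assertEqual(scaled_size(101, 51, 50), (50, 25))


# Build and run the suite explicitly so the sketch also works inside a notebook.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ScaledSizeTest)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```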

INTEGRATION TESTING:

Integration tests are designed to test integrated software components to determine if they
actually run as one program. Testing is event driven and is more concerned with the basic
outcome of screens or fields. Integration tests demonstrate that although the components were
individually satisfactory, as shown by successful unit testing, the combination of components
is correct and consistent. Integration testing is specifically aimed at exposing the problems
that arise from the combination of components.

VALIDATION TESTING:

An engineering validation test (EVT) is performed on the first engineering prototypes to ensure
that the basic unit performs to design goals and specifications. Identifying design problems and
solving them as early in the design cycle as possible is the key to keeping projects on time and
within budget. Too often, product design and performance problems are not detected until late in
the product development cycle, when the product is ready to be shipped. The old adage holds
true: it costs a penny to make a change in engineering, a dime in production and a dollar after
a product is in the field.

Verification is a quality control process that is used to evaluate whether or not a product,
service, or system complies with regulations, specifications, or conditions imposed at the start
of a development phase. Verification can take place in development, scale-up, or production, and
is often an internal process. Validation is a quality assurance process of establishing evidence
that provides a high degree of assurance that a product, service, or system accomplishes its
intended requirements. This often involves acceptance of fitness for purpose with end users and
other product stakeholders.

The testing process overview is as follows:

Fig: The Testing Process

SYSTEM TESTING:

System testing of software or hardware is testing conducted on a complete, integrated system to
evaluate the system's compliance with its specified requirements. System testing falls within the
scope of black box testing, and as such, should require no knowledge of the inner design of the
code or logic.

As a rule, system testing takes, as its input, all of the "integrated" software components that have
successfully passed integration testing and also the software system itself integrated with any
applicable hardware system.

System testing is a more limited type of testing; it seeks to detect defects both within the
"inter-assemblages" and also within the system as a whole. System testing is performed on the
entire system in the context of a Functional Requirement Specification (FRS) or System
Requirement Specification (SRS).

TESTCASE: Test Case for Prediction Result

Serial Number of Test Case   TC 01
Module Under Test            Predict facial masks
Description                  The user trains the model on the data and uses it on a new person
                             in normal or live video to predict whether there is a mask or not.
Input                        Live video stream data and trained model
Output                       The application detects facial masks and draws bounding boxes
                             around them along with the detected class name.
Remarks                      Test successful.
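
Test case TC 01 can also be expressed as an automated check. The sketch below is one possible arrangement, not the procedure used in this project: the cascade detector and the trained model are replaced by hypothetical stand-ins (`fake_detector`, `fake_classifier`), so that the expected output, one bounding box and one class name per face, can be asserted without a camera.

```python
def detect_and_label(frame, detect_faces, classify_face):
    """Return one (box, label) pair per detected face in the frame.

    detect_faces(frame)       -> list of (x, y, w, h) boxes
    classify_face(frame, box) -> "Mask" or "No Mask"
    """
    results = []
    for box in detect_faces(frame):
        results.append((box, classify_face(frame, box)))
    return results


# Stand-ins for the cascade detector and the trained model (hypothetical).
def fake_detector(frame):
    return [(10, 20, 64, 64), (120, 30, 64, 64)]


def fake_classifier(frame, box):
    # Pretend that faces in the left part of the frame wear a mask.
    return "Mask" if box[0] < 100 else "No Mask"


def test_tc01():
    results = detect_and_label("frame-0", fake_detector, fake_classifier)
    # Every detected face must carry a four-value box and a class name.
    assert all(len(box) == 4 for box, _ in results)
    assert [label for _, label in results] == ["Mask", "No Mask"]
    return results
```

Passing the detector and classifier as arguments keeps the pipeline logic testable separately from OpenCV and the model.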
7.2 Experimental Evaluation:

To avoid overfitting, two major steps are taken. First, data augmentation is performed. Second,
the model accuracy is closely observed over 60 epochs for both the training and testing phases.
The observations are reported in the figures below:

Figure representing Training and Validation loss

Figure representing Training and Validation accuracy
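
The report does not list which augmentation transforms were applied. As a rough illustration of the idea only, the sketch below applies two common ones, a random horizontal flip and a random brightness change, to an image stored as a list of pixel rows. The function names and the 0.8 to 1.2 brightness range are assumptions made for this example.

```python
import random


def horizontal_flip(image):
    """Mirror an image given as a list of pixel rows."""
    return [list(reversed(row)) for row in image]


def adjust_brightness(image, factor):
    """Scale every pixel by `factor`, clipping to the 0-255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


def augment(image, rng):
    """Return a randomly flipped and brightened copy of the image."""
    if rng.random() < 0.5:
        out = horizontal_flip(image)
    else:
        out = [list(row) for row in image]
    return adjust_brightness(out, rng.uniform(0.8, 1.2))
```

Each call such as `augment(image, random.Random())` yields a slightly different training sample from the same source image, which is what helps reduce overfitting.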

Observation: Model accuracy keeps increasing over the early epochs and becomes stable after
epoch=3, as depicted graphically in Fig. 2 above. To summarize the experimental results, the
proposed model achieves high accuracy in face and mask detection with less inference time and
less memory consumption compared to recent techniques. Significant effort was put into
resolving the data imbalance problem in the existing MAFA dataset, resulting in a new unbiased
dataset which is highly suitable for COVID-related mask detection tasks. The newly created
dataset, optimal face detection approach, localization of the person's identity and avoidance
of overfitting resulted in an overall system that can be easily installed in an embedded device
at public places to curtail the spread of Coronavirus.
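
The point at which accuracy "becomes stable" can be made precise with a small check on the per-epoch accuracy history. The sketch below is one possible definition, assuming stability means that consecutive epochs differ by at most a chosen tolerance from some epoch onwards; the function name, the tolerance of 0.01 and the sample values in the usage note are illustrative and are not measurements from this project.

```python
def first_stable_epoch(accuracies, tolerance=0.01):
    """Return the 1-based epoch from which accuracy never moves by more than
    `tolerance` between consecutive epochs, or None if it never settles."""
    for start in range(len(accuracies) - 1):
        tail = accuracies[start:]
        steps = [abs(b - a) for a, b in zip(tail, tail[1:])]
        if all(step <= tolerance for step in steps):
            return start + 1
    return None
```

For example, `first_stable_epoch([0.60, 0.75, 0.88, 0.93, 0.935, 0.94])` uses made-up values and reports the first epoch of the flat tail of that list.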

7.3 EXPECTED OUTPUT:

Fig:10 Result after detecting face with and without mask


Chapter 8
Conclusion

A novel architecture for an economical face-mask detection technology is proposed and
implemented in this report. We presented a Convolutional Neural Network model for facial
mask recognition. The proposed model includes 2 convolutional layers and 2 max-pooling layers.
The system recognizes faces in input images and classifies them as "Mask" or "No Mask". Our
facial mask recognition system can therefore be integrated with an image or video capturing
device such as a CCTV camera to track safety violations, promote the use of face masks, and
ensure a safe working environment. In our future work we will focus on applying the
Convolutional Neural Network model to 3D face images in order to detect masks more accurately.
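
To give a feel for what 2 convolutional layers and 2 max-pooling layers do to the size of the feature maps, the sketch below traces the spatial size through such a stack. The kernel size (3x3), stride (1), lack of padding, 2x2 pooling and the 150-pixel input are all assumptions made for this example; the conclusion does not state the values used in the project.

```python
def conv_out(size, kernel=3, stride=1):
    """Output size of an unpadded convolution along one dimension."""
    return (size - kernel) // stride + 1


def pool_out(size, pool=2):
    """Output size of non-overlapping max pooling along one dimension."""
    return size // pool


def feature_map_sizes(input_size, blocks=2):
    """Sizes after each layer of `blocks` conv + max-pool pairs."""
    sizes = [input_size]
    for _ in range(blocks):
        sizes.append(conv_out(sizes[-1]))   # convolution shrinks by kernel - 1
        sizes.append(pool_out(sizes[-1]))   # pooling halves the size
    return sizes
```

Under these assumptions, `feature_map_sizes(150)` gives the side length of the feature map after each of the four layers, which determines how many values reach the dense classifier.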

Future Work:
The future enhancements that can be projected for the project are:
• A more interactive user interface.
• Delivery as a mobile application.
• More detailed results applicable to real-time applications, such as in public places.
REFERENCES

1. Roomi, Mansoor, and Beham, M. Parisa, “A Review of Face Recognition Methods,” in
International Journal of Pattern Recognition and Artificial Intelligence, 2013, 27(04),
p. 1356005.
2. S. Syed Navaz, T. Dhevi Sri, Pratap Mazumder, “Face recognition using principal
component analysis and neural network,” in International Journal of Computer
Networking, Wireless and Mobile Communications (IJCNWMC), vol. 3, pp. 245-256,
Mar. 2013.
3. Turk, Matthew A., and Alex P. Pentland, “Face recognition using eigenfaces,” in IEEE
Computer Society Conference on Computer Vision and Pattern Recognition, 1991, pp.
586-591.
4. H. Li, Z. Lin, X. Shen, J. Brandt, and G. Hua, “A convolutional neural network cascade
for face detection,” in IEEE CVPR, 2015, pp. 5325-5334.
5. Wei Bu, Jiangjian Xiao, Chuanhong Zhou, Minmin Yang, Chengbin Peng, “A Cascade
Framework for Masked Face Detection,” in IEEE International Conference on CIS &
RAM, Ningbo, China, 2017, pp. 458-462.
6. Shiming Ge, Jia Li, Qiting Ye, Zhao Luo, “Detecting Masked Faces in the Wild with
LLE-CNNs,” in IEEE Conference on Computer Vision and Pattern Recognition,
2017, pp. 2682-2690.
7. M. Opitz, G. Waltner, G. Poier, et al., “Grid Loss: Detecting Occluded Faces,” in ECCV,
2016, pp. 386-402.
8. X. Zhu and D. Ramanan, “Face Detection, pose estimation and landmark localization in
the wild,” in IEEE CVPR, 2012, pp. 2879-2886.
9. Florian Schroff, Dmitry Kalenichenko, James Philbin, “FaceNet: A Unified Embedding
for Face Recognition and Clustering,” in The IEEE Conference on Computer Vision and
Pattern Recognition (CVPR), 2015, pp. 815-823.
10. Ankan Bansal, Rajeev Ranjan, Carlos D. Castillo, Rama Chellappa, “Deep Features for
Recognizing Disguised Faces in the Wild,” in IEEE/CVF Conference on Computer Vision
and Pattern Recognition Workshops, 2018.
