Face Mask Detection Project
A PROJECT REPORT
Submitted by
Of
MASTER OF TECHNOLOGY
IN
Mrs. G. M. PADMAJA,
Assistant Professor
Approved by AICTE, New Delhi, Accredited by NBA & NAAC 'A' Grade, Permanently Affiliated to
JNTU KAKINADA
Dakamarri(V), Bheemunipatnam (M), Visakhapatnam District, Andhra Pradesh, India.
CERTIFICATE
This is to certify that the project report entitled “Realtime Face-Mask Detection on
Raspberry Kit” is the bonafide work of “M KALYAN CHAKRAVARTHI (Regd.No:
193J1D5802)” who carried out the project work under my supervision.
External Examiner
DECLARATION
I hereby declare that this project entitled “Realtime Face-Mask Detection on Raspberry
Kit” is the original work done by me in partial fulfillment of the requirement for the award of the
Degree of Master of Technology in Computer Science & Engineering. This project work/report
has not been previously submitted to any other University/Institution for the award of any other
degree.
I would like to thank all the people who helped me in the successful completion of my project "Realtime
Face-Mask Detection on Raspberry Kit".
I would like to thank Sri Kalidindi Raghu Garu, Chairman of Raghu Institute of Technology, for
providing the necessary facilities.
Professor, Department of Computer Science and Systems Engineering (CSSE), Andhra University
College of Engineering (A), for guiding me all through the project work and giving the right direction
and shape to my learning by extending her expertise and experience in education. I am really thankful to her.
Computer Science and Systems Engineering (CSSE), Andhra University College of Engineering
(A), for his valuable suggestions and constant motivation, which greatly helped the project to be
completed successfully.
I also extend my heartfelt gratitude to all the teaching, technical and non-teaching staff of the
Department of Computer Science and Engineering (CSE) for their support.
I thank all those who contributed directly or indirectly to carrying out this project work
successfully.
The feasibility study is carried out to test whether the proposed system is worth being
implemented. The proposed system will be selected if it is good enough at meeting the
performance requirements. The study considers:
• Economic Feasibility
• Technical Feasibility
• Behavioural Feasibility
Economic Feasibility
Economic analysis, more commonly known as cost-benefit analysis, is the most frequently used
method for evaluating the effectiveness of a proposed system. This procedure determines the
benefits and savings that are expected from the proposed system. The hardware in the systems
department is sufficient for system development.
Technical Feasibility
This study centres on the department's hardware and software and the extent to which they
can support the proposed system. Since the department already has the required hardware and
software, there is no question of increasing the cost of implementing the proposed system. By
these criteria, the proposed system is technically feasible and can be developed with the
existing facilities.
Behavioural Feasibility
People are inherently resistant to change and need a sufficient amount of training, which
would result in a lot of expenditure for the organization. The proposed system can generate reports
with day-to-day information immediately at the user's request, instead of a periodic report that
does not contain much detail.
CONTENTS
2 Literature Survey 27
3 Problem Statement 30
5.1 Introduction 22
5.2 Architecture
LIST OF FIGURES
Figure 4: Test Cases 42
Organization of Chapters:
Chapter 1: Introduction
Chapter 6: Implementation
Chapter 8: Conclusion
Chapter 1
Introduction
Face mask detection refers to detecting whether a person is wearing a mask or not. In fact, the
problem is the reverse engineering of face detection, where the face is detected using different
machine learning algorithms for the purposes of security, authentication and surveillance. Face
detection is a key area in the field of Computer Vision and Pattern Recognition. A significant
body of research has contributed sophisticated algorithms for face detection in the past. The
primary research on face detection was done in 2001 using the design of handcrafted features and the
application of traditional machine learning algorithms to train effective classifiers for detection
and recognition. The problems encountered with this approach include high complexity in
feature design and low detection accuracy. In recent years, face detection methods based on
deep convolutional neural networks (CNNs) have been widely developed to improve detection
performance.
In this project, we propose a two-stage CNN architecture, where the first stage detects human
faces, while the second stage uses a lightweight image classifier to classify each face detected in
the first stage as either ‘Mask’ or ‘No Mask’ and draws a bounding box around it
along with the detected class name.
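The second-stage decision described above can be sketched in a few lines. This is a minimal illustration only: the face detector and the mask classifier themselves are assumed to exist elsewhere, and `detections` stands in for their combined output, one (box, p_mask) pair per face, where p_mask is an assumed classifier probability that the face wears a mask.

```python
# Minimal sketch of the stage-2 decision logic. The boxes and
# probabilities below are invented for illustration; a real system
# would obtain them from the face detector and mask classifier.

def label_faces(detections, threshold=0.5):
    """Map raw detections to ('Mask'/'No Mask', bounding box) pairs."""
    results = []
    for box, p_mask in detections:
        label = "Mask" if p_mask >= threshold else "No Mask"
        results.append((label, box))
    return results

# Example: two faces, one confidently masked, one not.
faces = [((10, 10, 50, 50), 0.93), ((80, 15, 45, 45), 0.12)]
print(label_faces(faces))
```

In the full system, the returned label and box would then be drawn on the frame with the detected class name.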
This algorithm was extended to videos as well. The detected faces are then tracked between
frames using an object tracking algorithm, which makes the detection robust to noise. This
system can then be integrated with an image or video capturing device like a CCTV camera, to
track safety violations, promote the use of face masks, and ensure a safe working environment.
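One common way to track detected faces between frames is to match bounding boxes by their overlap (intersection-over-union). The sketch below is an assumption about how such matching could work, not the exact tracker used in this project; real systems often use dedicated trackers such as centroid or correlation trackers.

```python
# Hedged sketch of frame-to-frame tracking via IoU matching.
# Boxes are (x, y, w, h) tuples.

def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    ix = max(0, min(ax2, bx2) - max(a[0], b[0]))  # overlap width
    iy = max(0, min(ay2, by2) - max(a[1], b[1]))  # overlap height
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def match_faces(prev_boxes, new_boxes, min_iou=0.3):
    """Pair each new box with the previous box it overlaps most."""
    matches = {}
    for i, nb in enumerate(new_boxes):
        best = max(range(len(prev_boxes)),
                   key=lambda j: iou(prev_boxes[j], nb),
                   default=None)
        if best is not None and iou(prev_boxes[best], nb) >= min_iou:
            matches[i] = best
    return matches
```

Matching by overlap lets a face keep its 'Mask'/'No Mask' identity across frames even when the classifier flickers on a single frame.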
While many machine learning algorithms have been around for a long time, the ability to
automatically apply complex mathematical calculations to big data – over and over, faster and
faster – is a recent development. Here are a few widely publicized examples of machine learning
applications you may be familiar with:
• The heavily hyped, self-driving Google car? The essence of machine learning.
• Online recommendation offers such as those from Amazon and Netflix? Machine learning
applications for everyday life.
• Knowing what customers are saying about you on Twitter? Machine learning combined
with linguistic rule creation.
• Fraud detection? One of the more obvious, important uses in our world today.
1.2.2 Importance of Machine Learning
Resurging interest in machine learning is due to the same factors that have made data mining and
Bayesian analysis more popular than ever: growing volumes and varieties of available data,
computational processing that is cheaper and more powerful, and affordable data storage.
All of these things mean it's possible to quickly and automatically produce models that can
analyze bigger, more complex data and deliver faster, more accurate results even on a very large
scale. And by building precise models, an organization has a better chance of identifying
profitable opportunities or avoiding unknown risks.
Machine Learning Tasks
Machine learning tasks are typically classified into several broad categories:
Supervised learning: The computer is presented with example inputs and their desired outputs,
given by a "teacher", and the goal is to learn a general rule that maps inputs to outputs. As
special cases, the input signal can be only partially available, or restricted to special feedback.
Semi-supervised learning: The computer is given only an incomplete training signal: a training
set with some (often many) of the target outputs missing.
Active learning: The computer can only obtain training labels for a limited set of instances
(based on a budget), and also has to optimize its choice of objects to acquire labels for. When
used interactively, these can be presented to the user for labelling.
Unsupervised learning: No labels are given to the learning algorithm, leaving it on its own to
find structure in its input. Unsupervised learning can be a goal in itself (discovering hidden
patterns in data) or a means towards an end (feature learning).
Reinforcement learning: Data (in form of rewards and punishments) are given only as
feedback to the program's actions in a dynamic environment, such as driving a vehicle or
playing a game against an opponent.
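To make the supervised case above concrete, here is a toy illustration: a one-nearest-neighbour classifier learns a rule from labelled examples (the "teacher") and applies it to new inputs. The data points and labels are invented purely for illustration.

```python
# Toy supervised learning: classify a new point by the label of its
# nearest labelled training example (1-nearest-neighbour).

def predict_1nn(train, x):
    """Return the label of the training point closest to x."""
    def dist(p, q):
        # squared Euclidean distance is enough for comparison
        return sum((a - b) ** 2 for a, b in zip(p, q))
    nearest = min(train, key=lambda ex: dist(ex[0], x))
    return nearest[1]

examples = [((0.0, 0.0), "No Mask"), ((1.0, 1.0), "Mask")]
print(predict_1nn(examples, (0.9, 0.8)))  # closest to the (1, 1) example
```

The "general rule" here is implicit in the stored examples; more powerful methods such as CNNs learn an explicit parametric rule instead.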
Artificial Intelligence is everywhere. The chances are that you are using it in one way or another
without even knowing about it. One of the popular applications of AI is Machine Learning, in
which computers, software, and devices perform via cognition (very similar to the human brain).
Herein, we share a few applications of machine learning.
I. Image Recognition:
One of the most common uses of machine learning is image recognition. There are many
situations where you can classify the object as a digital image. For digital images, the
measurements describe the outputs of each pixel in the image.
• In the case of a black-and-white image, the intensity of each pixel serves as one
measurement. So if a black-and-white image has N*N pixels, the total number of pixels,
and hence of measurements, is N².
• In a coloured image, each pixel is considered as providing 3 measurements, corresponding
to the intensities of the 3 main colour components, i.e. RGB. So for an N*N coloured
image there are 3N² measurements.
• For face detection – The categories might be face versus no face present. There might be
a separate category for each person in a database of several individuals.
• For character recognition – We can segment a piece of writing into smaller images, each
containing a single character. The categories might consist of the 26 letters of the
English alphabet, the 10 digits, and some special characters.
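The measurement counts given above can be checked directly: a grayscale N*N image yields N² values and a colour (RGB) one 3N². A tiny helper makes this explicit (the image size 28 is chosen arbitrarily for illustration):

```python
# Count the per-pixel measurements in an n x n image:
# grayscale -> n^2, RGB colour -> 3 * n^2.

def measurement_count(n, channels=1):
    """Number of per-pixel measurements in an n x n image."""
    return channels * n * n

print(measurement_count(28))     # grayscale 28x28 image
print(measurement_count(28, 3))  # RGB 28x28 image
```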
II. Speech Recognition:
Speech recognition (SR) is the translation of spoken words into text. It is also known as
"automatic speech recognition" (ASR), "computer speech recognition", or "speech to text"
(STT). In speech recognition, a software application recognizes spoken words. The measurements
in this application might be a set of numbers that represent the speech signal. We can segment
the signal into portions that contain distinct words or phonemes.
In each segment, we can represent the speech signal by the intensities or energy in different time-
frequency bands. Although the details of signal representation are outside the scope of this
report, we can represent the signal by a set of real values. Speech recognition applications
include voice user interfaces such as voice dialing, call routing, and domotic (home) appliance
control. It can also be used for simple data entry, preparation of structured documents, and
speech-to-text processing.
III. Medical Diagnosis:
ML provides methods, techniques, and tools that can help solve diagnostic and prognostic
problems in a variety of medical domains. It is being used for the analysis of the importance of
clinical parameters and of their combinations for prognosis, e.g. prediction of disease
progression, for the extraction of medical knowledge for outcomes research, for therapy planning
and support, and for overall patient management. ML is also being used for data analysis, such
as detection of regularities in the data by appropriately dealing with imperfect data, interpretation
of continuous data used in the Intensive Care Unit, and for intelligent alarming resulting in
effective and efficient monitoring.
It is argued that the successful implementation of ML methods can help the integration of
computer-based systems in the healthcare environment providing opportunities to facilitate and
enhance the work of medical experts and ultimately to improve the efficiency and quality of
medical care. In medical diagnosis, the main interest is in establishing the existence of a disease
followed by its accurate identification. There is a separate category for each disease under
consideration and one category for cases where no disease is present. Here, machine learning
improves the accuracy of medical diagnosis by analyzing data of patients.
IV. Statistical Arbitrage:
In finance, statistical arbitrage refers to automated trading strategies that are typically short-
term and involve a large number of securities. In such strategies, the user tries to implement a
trading algorithm for a set of securities on the basis of quantities such as historical correlations
and general economic variables. These measurements can be cast as a classification or estimation
problem. The basic assumption is that prices will move towards a historical average.
In the case of classification, the categories might be buy, sell, or do nothing for each security. In
the case of estimation, one might try to predict the expected return of each security over a future
time horizon. In this case, one typically needs to use the estimates of the expected return to make
a trading decision (buy, sell, etc.).
V. Learning Associations:
Learning association is the process of developing insights into the various associations between
products. A good example is how seemingly unrelated products may reveal an association to one
another when analyzed in relation to the buying behaviour of customers.
One application of machine learning is studying the associations between the products people
buy, which is also known as basket analysis: if a buyer buys 'X', he or she may be inclined to buy
'Y' because of a relationship that can be identified between them. This leads to relationships such
as the one that exists between fish and chips. When new products launch in the market, knowing
these relationships helps in developing new associations. Knowing these relationships could help
in suggesting the associated product to the customer, for a higher likelihood of the customer
buying it; it can also help in bundling products for a better package.
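Basket analysis as described above is usually quantified with support and confidence for a rule "X → Y". The sketch below shows the arithmetic on a handful of invented baskets; real systems mine large transaction logs with algorithms such as Apriori.

```python
# Support and confidence for an association rule X -> Y,
# computed over a tiny set of invented shopping baskets.

def support(baskets, items):
    """Fraction of baskets containing every item in `items`."""
    hits = sum(1 for b in baskets if items <= b)
    return hits / len(baskets)

def confidence(baskets, x, y):
    """Estimated P(Y in basket | X in basket)."""
    return support(baskets, x | y) / support(baskets, x)

baskets = [{"fish", "chips"}, {"fish", "chips", "cola"},
           {"fish"}, {"bread"}]
print(confidence(baskets, {"fish"}, {"chips"}))  # 2 of 3 fish baskets
```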
VI. Classification:
Classification is the process of placing each individual from the population under study into one
of many classes, identified by independent variables. Classification helps analysts use
measurements of an object to identify the category to which that object belongs. To establish an
efficient rule, analysts use data consisting of many examples of objects with their correct
classification.
For example, before a bank decides to disburse a loan, it assesses customers on their ability to
repay the loan. It can do this by considering factors such as the customer's earnings, age, savings
and financial history, taken from past loan data. Hence, the classifier is used to create a
relationship between customer attributes and the associated risk.
VII. Prediction:
Consider the example of a bank computing the probability of any of its loan applicants defaulting
on the loan repayment. To compute the probability of default, the system will first need to classify
the available data into certain groups, described by a set of rules prescribed by the analysts. Once
the classification is done, we can compute the probability as needed. These probability
computations can be carried out across all sectors for varied purposes.
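The two steps above, group by a rule and then estimate a probability per group, can be sketched directly. The grouping rule (income above 50, in arbitrary units) and the records are invented purely for illustration.

```python
# Group loan records by an analyst-supplied rule, then estimate
# the default probability within each group from past outcomes.

def default_rate_by_group(records, group_of):
    """records: (applicant, defaulted) pairs; group_of: grouping rule."""
    totals, defaults = {}, {}
    for applicant, defaulted in records:
        g = group_of(applicant)
        totals[g] = totals.get(g, 0) + 1
        defaults[g] = defaults.get(g, 0) + (1 if defaulted else 0)
    return {g: defaults[g] / totals[g] for g in totals}

# Invented past records: (attributes, did the applicant default?)
records = [({"income": 30}, True), ({"income": 40}, True),
           ({"income": 45}, False), ({"income": 80}, False)]
rates = default_rate_by_group(records, lambda a: a["income"] > 50)
print(rates)
```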
VIII. Extraction:
Nowadays, extraction is becoming a key task in the big data industry. A huge volume of data is
being generated, most of it unstructured. The first key challenge is the handling of unstructured
data; the next is the conversion of unstructured data to a structured form, based on some pattern,
so that it can be stored in an RDBMS. Apart from this, present-day data collection mechanisms
are also changing.
IX. Regression:
We can apply machine learning to regression as well. Assume that x = x1, x2, x3, …, xn are the
input variables and y is the outcome variable. In this case, we can use machine learning
technology to produce the output (y) on the basis of the input variables (x).
You can use a model to express the relationship between the various parameters, for example
y = f(x1, x2, …, xn).
In regression, we can use the principle of machine learning to optimize the parameters, to cut the
approximation error and calculate the closest possible outcome. We can also use machine
learning for function optimization: we can choose to alter the inputs to get a better model. This
gives a new and improved model to work with. This is known as response surface design.
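As a worked sketch of the setup above, here is ordinary least squares for a single input variable: it chooses the parameters (w, b) of the line y = w*x + b that minimise the squared approximation error. The data points are invented and chosen to lie exactly on a line.

```python
# Fit y ~ w*x + b by ordinary least squares: pick the (w, b)
# that minimise sum((w*x + b - y)^2) over the training pairs.

def fit_line(xs, ys):
    """Return (w, b) minimising the squared approximation error."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    w = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return w, my - w * mx

w, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])  # data lies on y = 2x + 1
print(w, b)
```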
Table 2 shows the basic/minimal software requirements for the project.
1.3.2.1 Python:
Python is an interpreted, object-oriented, high-level programming language with dynamic
semantics. Its high-level built-in data structures, combined with dynamic typing and dynamic
binding, make it very attractive for Rapid Application Development, as well as for use as a
scripting or glue language to connect existing components together. Python's simple, easy-to-
learn syntax emphasizes readability and therefore reduces the cost of program maintenance.
Python supports modules and packages, which encourages program modularity and code reuse.
The Python interpreter and the extensive standard library are available in source or binary form
without charge for all major platforms, and can be freely distributed.
Often, programmers fall in love with Python because of the increased productivity it provides.
Since there is no compilation step, the edit-test-debug cycle is incredibly fast. Debugging Python
programs is easy: a bug or bad input will never cause a segmentation fault. Instead, when the
interpreter discovers an error, it raises an exception. When the program doesn't catch the
exception, the interpreter prints a stack trace. A source level debugger allows inspection of local
and global variables, evaluation of arbitrary expressions, setting breakpoints, stepping through
the code a line at a time, and so on. The debugger is written in Python itself, testifying to
Python's introspective power. On the other hand, often the quickest way to debug a program is to
add a few print statements to the source: the fast edit-test-debug cycle makes this simple
approach very effective.
Python was designed to be easy to understand and fun to use (its name came from Monty Python
so a lot of its beginner tutorials reference it). Fun is a great motivator, and since you'll be able to
build prototypes and tools quickly with Python, many find coding in Python a satisfying
experience. Thus, Python has gained popularity for being a beginner-friendly language, and it
has replaced Java as the most popular introductory language at Top U.S. Universities.
Easy to Understand
Being a very high level language, Python reads like English, which takes a lot of syntax-learning
stress off coding beginners. Python handles a lot of complexity for you, so it is very beginner-
friendly in that it allows beginners to focus on learning programming concepts and not have to
worry about too many details.
Very Flexible
As a dynamically typed language, Python is really flexible. This means there are no hard rules on
how to build features, and you'll have more flexibility solving problems using different methods
(though the Python philosophy encourages using the obvious way to solve things). Furthermore,
Python is also more forgiving of errors, so you'll still be able to run your program
until you hit the problematic part.
Scalability
Not Easy to Maintain
Because Python is a dynamically typed language, the same thing can easily mean something
different depending on the context. As a Python app grows larger and more complex, this may
get difficult to maintain as errors will become difficult to track down and fix, so it will take
experience and insight to know how to design your code or write unit tests to ease
maintainability.
Slow
As a dynamically typed language, Python is slow because its flexibility means the interpreter
must do a lot of referencing at run time to determine what the definition of something is, and
this slows Python's performance down.
At any rate, there are alternatives such as PyPy that are faster implementations of Python.
While they might still not be as fast as Java, for example, it certainly improves the speed
greatly.
Community
As you step into the programming world, you'll soon understand how vital support is, as the
developer community is all about giving and receiving help. The larger a community, the more
likely you'd get help and the more people will be building useful tools to ease the process of
development.
Career Opportunities
On Angel List, Python is the 2nd most demanded skill and also the skill with the highest average
salary offered.
With the rise of big data, Python developers are in demand as data scientists, especially since
Python can be easily integrated into web applications to carry out tasks that require machine
learning.
Future
According to the TIOBE index, Python is the 4th most popular programming language out of
100. With the rise of Ruby on Rails and, more recently, Node.js, Python's usage as the main
prototyping language for backend web development has diminished somewhat, especially since it
has a fragmented MVC ecosystem. However, with big data becoming more and more important,
Python has become a skill that is more in demand than ever, especially since it can be integrated
into web applications.
As an open source project, Python is actively worked on with a moderate update cycle, pushing
out new versions every year or so to make sure it remains relevant. A programming language's
ability to stay relevant also depends on whether the language is getting new blood. In terms of
search volume for anyone interested in learning Python, it has skyrocketed to the 1st place when
compared to other languages.
Benefits of Python:
Presence of Third-Party Modules
Extensive Support Libraries
Open Source and Community Development
Learning Ease and Support Available
User-friendly Data Structures
Productivity and Speed
Highly Extensible and Easily Readable Language.
Python allows us to store our code in files (also called modules). This is very useful for more
serious programming, where we do not want to retype a long function definition from the very
beginning just to change one mistake. In doing this, we are essentially defining our own
modules, just like the modules defined already in the Python library. To support this, Python has
a way to put definitions in a file and use them in a script or in an interactive instance of the
interpreter. Such a file is called a module; definitions from a module can be imported into other
modules or into the main module.
NumPy - NumPy is a module for Python. The name is an acronym for "Numeric Python"
or "Numerical Python". NumPy enriches the programming language Python with powerful
data structures, implementing multi-dimensional arrays and matrices.
OpenCV - OpenCV-Python is a library of Python bindings designed to solve computer
vision problems. OpenCV-Python makes use of NumPy, which is a highly optimized
library for numerical operations with a MATLAB-style syntax. All the OpenCV array
structures are converted to and from NumPy arrays.
Keras - Keras is a minimalist Python library for deep learning that can run on top of
Theano or TensorFlow. It was developed to make implementing deep learning models as
fast and easy as possible for research and development.
TensorFlow - TensorFlow is an open-source artificial intelligence library that uses data
flow graphs to build models. It allows developers to create large-scale neural networks
with many layers. TensorFlow is mainly used for classification, perception, understanding,
discovering, prediction and creation.
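As a small illustration of the NumPy structures mentioned above, a grayscale image is simply a two-dimensional array; the values below are invented to stand in for pixel intensities.

```python
# A tiny grayscale "image" as a NumPy 2-D array: 4x4 pixels,
# all black except a bright 2x2 patch in the middle.

import numpy as np

img = np.zeros((4, 4), dtype=np.uint8)  # black 4x4 image
img[1:3, 1:3] = 255                     # bright 2x2 patch
print(img.shape, img.dtype, int(img.sum()))
```

OpenCV hands images to Python in exactly this form, which is why the two libraries combine so naturally.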
Jupyter notebook:
The Jupyter Notebook is an open source web application that you can use to create and share
documents that contain live code, equations, visualizations, and text. Jupyter Notebook is
maintained by the people at Project Jupyter. Jupyter Notebooks are powerful, versatile and
shareable, and provide the ability to perform data visualization in the same environment. Jupyter
Notebooks allow data scientists to create and share their documents, from code to full-blown reports.
Visual Studio:
Visual Studio Code features a lightning fast source code editor, perfect for day-to-day use.
With support for hundreds of languages, VS Code helps you be instantly productive with syntax
highlighting, bracket-matching, auto-indentation, box-selection, snippets, and more.
Benefits of Visual Studio:
1. Cross-platform support: Windows, Linux, Mac
2. Light-weight
3. Robust Architecture
4. Intelli-Sense
5. Freeware: free of cost, probably the best feature of all for all the programmers out there,
even more so for organizations.
6. Many users will use it, or might have used it, for desktop applications only, but it also
provides great tool support for web technologies like HTML, CSS and JSON.
Google Colab:
Colaboratory, or “Colab” for short, is a product from Google Research. Colab allows anybody to
write and execute arbitrary python code through the browser, and is especially well suited
to machine learning, data analysis and education.
Sharing: You can share your Google Colab notebooks very easily. Thanks to Google
Colab everyone with a Google account can just copy the notebook on his own Google
Drive account. No need to install any modules to run any code, modules come preinstalled
within Google Colab.
Versioning: You can save your notebook to Github with just one simple click on a button.
No need to write "git add, git commit, git push, git pull" commands in your command client
(that is, if you used versioning already)!
Code snippets: Google Colab has a great collection of snippets you can just plug into
your code. E.g., if you want to write data to a Google Sheet automatically, there's a snippet
for it in the Google library.
Forms for non-technical users: Not only programmers have to analyze data; Python can
be useful for almost everyone in an office job. The problem is that non-technical people are
scared to death of making even the tiniest change to the code. But Google Colab has a
solution for that: just insert the comment #@param {type:"string"} and you turn any
variable field into an easy-to-use form input field. Your non-technical user only needs to
change the form fields, and Google Colab will automatically update the code. You can find
more info at https://colab.research.google.com/notebooks/forms.ipynb
Performance: Use the computing power of the Google servers instead of your own
machine. Running Python scripts often requires a lot of computing power and can take time.
By running scripts in the cloud, you don't need to worry: your local machine's performance
won't drop while your Python scripts execute.
Price: Best of all it’s free.
Chapter 2
Literature Survey
Initially, researchers focused on the edge and gray value of the face image. It was based on a
pattern recognition model, having prior information of the face model. Ada-boost was a good
training classifier. The face detection technology got a breakthrough with the famous Viola-
Jones Detector, which greatly improved real-time face detection. Viola-Jones detector optimized
the features of Haar, but failed to tackle real-world problems and was influenced by various
factors like face brightness and face orientation. Viola-Jones could only detect frontal, well-lit
faces; it failed to work well in dark conditions and with non-frontal images. These issues have
led independent researchers to develop new face detection models based on deep learning, to
obtain better results under different facial conditions. We have developed our face detection
model using a Convolutional Neural Network (CNN), such that it can detect the face in any
geometric condition, frontal or non-frontal. Convolutional Neural Networks have long been
used for image classification tasks.
Introduction: Face recognition is one of the well-studied real-life problems, and excellent
progress has been made in face recognition technology over recent years. Face alterations
and the presence of different masks make it much more challenging. The primary concern of this
work is facial masks, especially enhancing the recognition accuracy of different masked
faces. Face mask detection has emerged as a challenging problem in the domain of image
processing and computer vision.
A fully convolutional network is a significantly slower operation than, say, max pooling, both
forward and backward. If the network is pretty deep, each training step is going to take
much longer.
The network is a bit too slow and complicated if you just want a good pre-trained model.
The approach proposed in this project uses only a Convolutional Neural Network (CNN)
model to detect human faces.
Here, feature extraction performed best compared to the existing system, and computation is
very fast. The power of a CNN is that it detects distinct features from images all by itself,
without any actual human intervention.
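The "distinct features" a CNN detects are computed by convolution. Below is a minimal NumPy sketch of one convolution step (valid padding, stride 1) with a hand-made edge kernel; in a trained CNN, the kernel values would be learned from data rather than chosen by hand.

```python
# One convolution step: slide a small kernel over the image and take
# the elementwise product-sum at each position (valid padding, stride 1).

import numpy as np

def conv2d(image, kernel):
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)
kernel = np.array([[1, -1]], dtype=float)  # responds to vertical edges
print(conv2d(image, kernel))
```

The output is large (in magnitude) exactly where the image changes from dark to bright, which is how early CNN layers pick up edges without human-designed features.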
Chapter 4
Dataset Description
Kaggle allows users to find and publish data sets, explore and build models in a web-based data-
science environment, work with other data scientists and machine learning engineers, and enter
competitions to solve data science challenges. Kaggle has over 50,000 public datasets and
400,000 public notebooks.
Object-oriented analysis and design (OOAD) is a software engineering approach that models a
system as a group of interacting objects. Each object represents an entity of interest in the system
being modeled and is characterized by its class, its state (data elements), and its behaviors.
Various models can be created to show the static structure, dynamic behaviors, and runtime
deployment of these collaborating objects. There are a number of different notations for
representing these models, such as the Unified Modelling Language (UML).
Object-oriented analysis (OOA) applies object modeling techniques to analyze the
functional requirements for a system. Object-oriented design (OOD) elaborates the analysis
models to produce implementation specifications. OOA focuses on what this system does, OOD
on how this system does it.
Design: The most creative and challenging phase of the life cycle is system design. The term
design describes a final system and the process by which it is developed. It refers to the
technical specifications that will be applied in the implementation of the candidate system. The
design may be defined as "the process of applying various techniques and principles for the
purpose of defining a device, a process or a system in sufficient detail to permit its physical
realization", which is documented and evaluated by management as a step towards implementation.
The importance of software design can be stated with a single word: "Quality". Design provides
us with a representation of software that can be assessed for quality. Design is the only way
we can accurately translate customers' requirements into a complete software product or
system. Without design, we risk building an unstable system that might fail if small changes are
made, that may be difficult to test, or whose quality cannot be assessed. So design is an
essential facet in the development of software products.
5.1 Architecture:
In this model, our aim is to detect whether the person is wearing the mask or not wearing
the mask. Face Mask detection has turned up to be an astonishing problem in the domain of
image processing and computer vision. Face detection has various use cases ranging from face
recognition to capturing facial motions, where the latter calls for the face to be revealed with
very high precision. Despite the rapid advancement in the domain of machine learning
algorithms, face mask detection technology is not yet fully addressed. This technology is
more relevant today because it can be used to detect faces in static images as well as in videos;
the same is graphically represented in the figure as a flow diagram.
The goal of UML is to provide a standard notation that can be used by all object-oriented
methods and to select and integrate the best elements of precursor notations. UML has been
designed for a broad range of applications. Hence, it provides constructs for a broad range of
systems and activities.
A UML use case diagram is the primary form of system/software requirements for a new
software program under development. Use cases specify the expected behavior (what), and not the
exact method of making it happen (how). Use cases once specified can be denoted by both
textual and visual representation (i.e., use case diagram).
A key concept of use case modeling is that it helps us design a system from the end user's
perspective. It is an effective technique for communicating system behavior in the user's terms
by specifying all externally visible system behavior.
Fig 4: Use case diagram
5.3.2 Sequence Diagram:
UML sequence diagrams are interaction diagrams that detail how operations are carried
out. They capture the interaction between objects in the context of a collaboration. Sequence
diagrams are time-focused: they show the order of interaction visually by using the vertical
axis of the diagram to represent time, indicating what messages are sent and when. Sequence
diagrams typically show:
- the interactions that take place in a collaboration that either realizes a use case or an
operation (instance diagrams or generic diagrams);
- high-level interactions between the user of the system and the system, between the
system and other systems, or between subsystems (sometimes known as system
sequence diagrams).
5.3.3 Class Diagram:
In software engineering, a class diagram in UML is a type of static structure diagram
that describes the structure of a system by showing the system's classes, their attributes,
operations (or methods), and the relationships among objects.
The user provides input to the system. The system trains and tests the model, predicts the
result based on the inputs given to it, and displays the result to the user.
A Convolutional Neural Network (CNN) belongs to a subfield of deep learning that is
mostly used for the analysis of visual imagery. A CNN is a class of deep feedforward artificial
neural network (ANN). It uses the dataset supplied to it for training and predicts the labels to
be assigned to future inputs. Its architecture also gives it strength against the curse of
dimensionality. Some of the areas where CNNs are broadly utilized are image recognition,
image classification, image captioning, and object detection. CNNs gained immense popularity
when AlexNet won the ImageNet competition in 2012; in just three years, engineers advanced
the design from the 8-layer AlexNet to the 152-layer ResNet. CNNs also come in handy for
tasks involving recommendation systems, contextual importance, or natural language
processing (NLP). The key chore of the neural network is to make sure it processes all the
layers, and hence detects all the underlying features, automatically. A CNN acts as a
convolution tool that separates the distinct features of a picture for analysis and prediction.
A convolutional neural network is an artificial neural network that has so far been the
most popular choice for analyzing images. A CNN has hidden layers called convolutional
layers, and these layers are precisely what make it a CNN: convolutional layers are especially
able to detect patterns. When we feed any object to a CNN as input, it passes through the
convolutional layers and is transformed into an output. A typical CNN consists of three kinds
of layers:
1. Convolutional layer
2. Pooling layer
3. Fully connected layer.
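As a rough illustration of how these three layers cooperate, the following NumPy sketch runs a tiny 6x6 image through a convolution, a ReLU activation with max pooling, and a fully connected layer. The kernel and weights here are arbitrary placeholders, not learned values, and the helper names are illustrative only:

```python
import numpy as np

def conv2d(x, kernel):
    # "valid" convolution (cross-correlation, as in most DL libraries)
    H, W = x.shape
    F = kernel.shape[0]
    out = np.zeros((H - F + 1, W - F + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + F, j:j + F] * kernel)
    return out

def max_pool(x, F=2):
    # non-overlapping max pooling
    out = np.zeros((x.shape[0] // F, x.shape[1] // F))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = x[i * F:(i + 1) * F, j * F:(j + 1) * F].max()
    return out

# 6x6 image -> conv (3x3) -> 4x4 map -> ReLU + pool (2x2) -> 2x2 -> FC score
x = np.arange(36, dtype=float).reshape(6, 6)
k = np.ones((3, 3)) / 9.0          # placeholder averaging kernel
feat = max_pool(np.maximum(conv2d(x, k), 0))
w = np.full(feat.size, 0.01)       # placeholder fully connected weights
score = w @ feat.flatten()
```

In a real CNN the kernel and the fully connected weights are learned during training; this sketch only shows how each layer reshapes the data on the way to a single output score.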
1. Convolution Layer: The convolution layer is a core building block of the CNN. It carries
the main portion of the network’s computational load. This layer performs a dot product
between two matrices, where one matrix is the set of learnable parameters otherwise known
as a kernel and the other matrix is the restricted portion of the receptive field. The kernel is
spatially smaller than an image but is more in-depth.
During the forward pass, the kernel slides across the height and width of the image-
producing the image representation of that receptive region. This produces a two-
dimensional representation of the image known as an activation map that gives the response
of the kernel at each spatial position of the image. The sliding size of the kernel is called a
stride. If we have an input of size W * W * D and Dout kernels with a spatial size of F, stride S,
and amount of padding P, then the size of the output volume is determined by the following
formula:

Wout = (W - F + 2P) / S + 1

The resulting output volume has size Wout * Wout * Dout.
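The output-size formula can be sketched as a small helper (the function name conv_output_size is illustrative):

```python
def conv_output_size(W, F, S, P):
    # spatial output size of a convolution: Wout = (W - F + 2P) / S + 1
    return (W - F + 2 * P) // S + 1

# a 150x150 input with a 3x3 kernel, stride 1 and padding 1
# keeps its spatial size
print(conv_output_size(150, 3, 1, 1))  # -> 150
```

For example, with the 150x150 inputs used in this project, a 3x3 kernel with stride 1 and padding 1 ("same" padding) preserves the spatial size.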
2. Pooling layer: The pooling layer replaces the output of the network at certain locations by
deriving a summary statistic of the nearby outputs. This helps reduce the spatial size of the
representation, which decreases the required amount of computation and the number of
weights. The pooling operation is applied on every slice of the representation individually.
There are several pooling functions such as the average of the rectangular
neighborhood, L2 norm of the rectangular neighborhood, and a weighted average based on
the distance from the central pixel. However, the most popular process is max pooling,
which reports the maximum output from the neighborhood.
Fig 8: Pooling
If we have an activation map of size W * W * D, a pooling kernel of spatial size F, and stride S,
then the size of the output volume is determined by the following formula:

Wout = (W - F) / S + 1

The resulting output volume has size Wout * Wout * D (pooling leaves the depth unchanged).
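A minimal NumPy implementation of max pooling, including the output-size computation just described (the helper name max_pool2d is illustrative):

```python
import numpy as np

def max_pool2d(x, F=2, S=2):
    # output size follows Wout = (W - F) / S + 1
    out_h = (x.shape[0] - F) // S + 1
    out_w = (x.shape[1] - F) // S + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # report the maximum output from each neighborhood
            out[i, j] = x[i * S:i * S + F, j * S:j * S + F].max()
    return out

x = np.array([[1,  2,  3,  4],
              [5,  6,  7,  8],
              [9, 10, 11, 12],
              [13, 14, 15, 16]])
pooled = max_pool2d(x)  # 2x2 map: [[6, 8], [14, 16]]
```

A 2x2 kernel with stride 2 halves each spatial dimension, which is the most common configuration in practice.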
3. Fully Connected Layer: Neurons in this layer have full connectivity with all neurons in
the preceding and succeeding layers, as in a regular feedforward neural network. Its output
can therefore be computed as usual by a matrix multiplication followed by a bias offset. The
FC layer helps map the representation between the input and the output.
Implementation
Implementation is the stage of the project in which the theoretical design is turned into a
working system. It can thus be considered the most critical stage in achieving a successful
new system and in giving the user confidence that the new system will work and be
effective.
The implementation stage involves careful planning, investigation of the existing system and its
constraints on implementation, designing of methods to achieve changeover and evaluation of
changeover methods.
6.10 Evaluation of Model and print test loss and test accuracy
# importing libraries
import cv2
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import load_img, img_to_array

# loading the trained model from model.h5
model = load_model('model.h5')
img_width, img_height = 150, 150
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# providing a video file as input to the model
cap = cv2.VideoCapture('video.mp4')
img_count_full = 0

# parameters for drawing the label text
font = cv2.FONT_HERSHEY_SIMPLEX
org = (1, 1)
class_label = ''
fontScale = 1
color = (255, 0, 0)
thickness = 2

while True:
    img_count_full += 1
    response, color_img = cap.read()
    if response == False:
        break

    # downscale the frame to 50% for faster processing
    scale = 50
    width = int(color_img.shape[1] * scale / 100)
    height = int(color_img.shape[0] * scale / 100)
    dim = (width, height)
    color_img = cv2.resize(color_img, dim, interpolation=cv2.INTER_AREA)

    # the Haar cascade detector works on grayscale images
    gray_img = cv2.cvtColor(color_img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray_img, 1.1, 6)

    img_count = 0
    for (x, y, w, h) in faces:
        org = (x - 10, y - 10)
        img_count += 1

        # crop the detected face, save it, and reload it at the model's input size
        color_face = color_img[y:y + h, x:x + w]
        cv2.imwrite('input/%d%dface.jpg' % (img_count_full, img_count), color_face)
        img = load_img('input/%d%dface.jpg' % (img_count_full, img_count),
                       target_size=(img_width, img_height))
        img = img_to_array(img)
        img = np.expand_dims(img, axis=0)
        prediction = model.predict(img)

        # class 0 corresponds to "Mask", anything else to "No Mask"
        if prediction == 0:
            class_label = "Mask"
            color = (255, 0, 0)
        else:
            class_label = "No Mask"
            color = (0, 255, 0)

        cv2.rectangle(color_img, (x, y), (x + w, y + h), (0, 0, 255), 3)
        cv2.putText(color_img, class_label, org, font, fontScale,
                    color, thickness, cv2.LINE_AA)

    cv2.imshow('Face mask detection', color_img)
    # press 'q' to stop the detection loop
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Chapter 7
UNIT TESTING:
Unit testing involves the design of test cases that validate that the internal program logic is
functioning properly, and that program inputs produce valid outputs. All decision branches and
internal code flow should be validated. It is the testing of individual software units of the
application. It is done after the completion of an individual unit before integration. This is a
structural testing, that relies on knowledge of its construction and is invasive. Unit tests perform
basic tests at component level and test a specific business process, application, and/or system
configuration.
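As a concrete illustration, a unit test for the prediction-to-label step of the detector might look like the following. The helper label_from_prediction is hypothetical, mirroring the labelling logic in the implementation code, and the test exercises both decision branches as described above:

```python
def label_from_prediction(prediction):
    # hypothetical helper mirroring the detector's labelling logic:
    # class 0 means "Mask", anything else means "No Mask"
    return "Mask" if prediction == 0 else "No Mask"

def test_label_from_prediction():
    # validate both decision branches of the unit
    assert label_from_prediction(0) == "Mask"
    assert label_from_prediction(1) == "No Mask"

test_label_from_prediction()
```

With a test runner such as pytest, functions named test_* are discovered and run automatically, so each unit can be verified before integration.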
INTEGRATION TESTING:
Integration tests are designed to test integrated software components to determine if they
actually run as one program. Testing is event driven and is more concerned with the basic
outcome of screens or fields. Integration tests demonstrate that although the components were
individually satisfactory, as shown by successful unit testing, the combination of components
is correct and consistent. Integration testing is specifically aimed at exposing the problems that
arise from the combination of components.
VALIDATION TESTING:
An engineering validation test (EVT) is performed on first engineering prototypes, to ensure that
the basic unit performs to design goals and specifications. It is important in identifying design
problems; solving them as early in the design cycle as possible is the key to keeping
projects on time and within budget. Too often, product design and performance problems are
not detected until late in the product development cycle — when the product is ready to be
shipped. The old
adage holds true: It costs a penny to make a change in engineering, a dime in production and a
dollar after a product is in the field.
Verification is a Quality control process that is used to evaluate whether or not a product,
service, or system complies with regulations, specifications, or conditions imposed at the start of
a development phase. Verification can be in development, scale-up, or production. This is often
an internal process. Validation is a Quality assurance process of establishing evidence that
provides a high degree of assurance that a product, service, or system accomplishes its intended
requirements. This often involves acceptance of fitness for purpose with end users and other
product stakeholders.
SYSTEM TESTING:
As a rule, system testing takes, as its input, all of the "integrated" software components that have
successfully passed integration testing and also the software system itself integrated with any
applicable hardware system.
System testing is a more limited type of testing; it seeks to detect defects both within the "inter-
assemblages" and also within the system as a whole. System testing is performed on the entire
system in the context of a Functional Requirement Specification (FRS) or System Requirement
Specification (SRS).
To avoid the problem of overfitting, two major steps were taken. First, we performed data
augmentation. Second, the model accuracy was critically observed over 60 epochs for both the
training and testing phases. The observations are reported in the figures below.
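Data augmentation enlarges the training set by applying label-preserving transformations to each image; in Keras this is commonly done with ImageDataGenerator. The idea can be sketched with plain NumPy as follows (the helper name augment and the specific transformations are illustrative, not the project's exact configuration):

```python
import numpy as np

def augment(img):
    # return simple augmented variants of one (H, W, C) image array
    flipped = np.fliplr(img)                  # horizontal flip
    shifted = np.roll(img, shift=5, axis=1)   # small horizontal shift
    bright = np.clip(img * 1.2, 0, 255)       # brightness jitter
    return [flipped, shifted, bright]

img = np.full((150, 150, 3), 100.0)  # stand-in for a 150x150 face crop
variants = augment(img)              # three extra training samples
```

Each variant shows the network a slightly different view of the same face, which reduces overfitting to the particular poses and lighting conditions in the original dataset.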
Future Work:
The future enhancements that can be projected for the project are:
- A more interactive user interface.
- Deployment as a mobile application.
- More detailed results applicable to real-time settings, such as public places.
REFERENCES