Image Recognition Using Neural Network & Deep Learning
“IMAGE RECOGNITION
USING NEURAL NETWORK &
DEEP LEARNING”
Thesis submitted in partial fulfillment of the curriculum prescribed for
the award of the degree of Bachelor of Engineering in
Computer Science & Engineering by
1CR14CS138 Siddharth
1CR14CS163 Vivekanand Vivek
1CR14CS007 Akarsh Ramesh Khatagalli
1CR14CS134 Shreyans Maitrey
Under the Guidance of
2017-18
VISVESVARAYA TECHNOLOGICAL UNIVERSITY
JNANASANGAMA, BELAGAVI - 590018
Certificate
This is to certify that the project entitled “IMAGE RECOGNITION USING
NEURAL NETWORK & DEEP LEARNING” is a bonafide work carried out by
Siddharth, Vivekanand Vivek, Akarsh Ramesh Khatagalli and Shreyans Maitrey
in partial fulfillment of the award of the degree of Bachelor of Engineering in
Computer Science & Engineering of Visvesvaraya Technological University, Belgaum,
during the year 2017-18. It is certified that all corrections / suggestions indicated
during reviews have been incorporated in the report. The project report has been
approved as it satisfies the academic requirements in respect of the project work
prescribed for the Bachelor of Engineering Degree.
External Viva
1.
2.
Acknowledgement
We take this opportunity to thank all of those who have generously
helped us to give a proper shape to our work and complete our BE project
successfully. A successful project is the fruitful culmination of efforts by many
people, some directly involved and others indirectly, by providing
support and encouragement.
Siddharth
Vivekanand Vivek
Akarsh Ramesh Khatagalli
Shreyans Maitrey
Table of Contents
List of Figures
List of Tables
Abstract
1 PREAMBLE
1.1 INTRODUCTION
1.2 MOTIVATION
1.3 FACE DETECTION APPROACHES
2 LITERATURE SURVEY
2.1 INTRODUCTION
2.2 LITERATURE SURVEY
2.3 PAPER 1
2.4 PAPER 2
2.5 PAPER 3
3 THEORETICAL BACKGROUND
3.1 INTRODUCTION
3.2 PERCEPTRONS
3.3 SIGMOID NEURONS
3.4 THE ARCHITECTURE OF NEURAL NETWORKS
3.5 A SIMPLE NETWORK TO CLASSIFY FACES
3.6 DEEP NEURAL NETWORK
3.7 CONVOLUTIONAL NEURAL NETWORK
4.3 NON-FUNCTIONAL REQUIREMENTS
4.4 HARDWARE REQUIREMENTS
4.5 SOFTWARE REQUIREMENTS
4.6 SOFTWARE QUALITY ATTRIBUTES
5 SYSTEM ANALYSIS
5.1 INTRODUCTION
5.2 FEASIBILITY STUDY
6 SYSTEM DESIGN
6.1 INTRODUCTION
6.2 SYSTEM DEVELOPMENT METHODOLOGY
6.3 DESIGN USING UML
6.4 DATA FLOW DIAGRAM
6.5 USE CASE DIAGRAM
6.6 ACTIVITY DIAGRAM
6.7 SEQUENCE DIAGRAM
7 IMPLEMENTATION
7.1 INTRODUCTION
7.2 TRAINING MODULE CODE
7.3 IMAGE RECOGNITION CODE
References
List of Figures
List of Tables
Abstract
The main idea of the project is to classify a product's category by
looking only at its image. We think that it makes sense to use deep
convolutional neural networks for this task. To begin with, we plan to use
transfer learning and fine-tuning on ResNet50, VGG-19, and some other
networks pre-trained on ImageNet data that are accessible in Keras. For
this step we plan to use only a small subset of the whole training
data, which should also represent only a small subset of the classes that
we need to classify data into (50-100 classes out of a total of 5000 classes).
Chapter 1
PREAMBLE
1.1 INTRODUCTION
As the necessity for higher levels of security rises, technology is bound to advance to
fulfill these needs. Any new creation, enterprise, or development should be uncomplicated
and acceptable to end users in order to spread worldwide. This strong demand for
user-friendly systems that can secure our assets and protect our privacy without
losing our identity in a sea of numbers has drawn the attention of scientists
toward what is called biometrics. A more mathematical treatment, Mathematical
Introduction for Face Recognition: Pixel Arithmetic, is available for readers who are
interested in the mathematical perspective and representation of pixels in face
recognition applications; it also contains a VB.NET implementation of the Pixel class.
Biometrics is an emerging area of bioengineering; it is the automated method of
recognizing a person based on a physiological or behavioral characteristic. There exist
several biometric systems, such as signature, fingerprints, voice, iris, retina, hand
geometry, ear geometry, and face. Among these systems, facial recognition appears
to be one of the most universal, collectable, and accessible systems.
Biometric face recognition is a particularly attractive biometric approach, since it
focuses on the same identifier that humans use primarily to distinguish one person
from another: their faces. One of its main goals is the understanding of the complex
human visual system and the knowledge of how humans represent faces in order to
discriminate different identities with high accuracy.
The face recognition problem can be divided into two main stages: face verification
(or authentication), and face identification (or recognition).
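The difference between the two stages can be illustrated with a minimal sketch. It assumes each face has already been reduced to a numeric feature vector; the vectors, the names and the 0.6 distance threshold below are hypothetical and chosen only for illustration. Verification is a one-to-one comparison against a claimed identity, while identification is a one-to-many search over all enrolled identities.

```python
import math


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def verify(probe, claimed_template, threshold=0.6):
    """Verification (1:1): does the probe match the claimed identity?"""
    return distance(probe, claimed_template) <= threshold


def identify(probe, gallery, threshold=0.6):
    """Identification (1:N): return the closest enrolled name, or None."""
    best_name, best_dist = None, float("inf")
    for name, template in gallery.items():
        d = distance(probe, template)
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name if best_dist <= threshold else None


# Hypothetical 3-dimensional "face" vectors for two enrolled people.
gallery = {"alice": [0.1, 0.9, 0.3], "bob": [0.8, 0.2, 0.5]}
probe = [0.15, 0.85, 0.35]

is_alice = verify(probe, gallery["alice"])  # one comparison
who = identify(probe, gallery)              # one comparison per enrolled person
```

The threshold decides how strict the match is: a probe that is far from every enrolled vector is rejected rather than assigned to the nearest person.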
1.2 MOTIVATION
Face recognition has been a sought-after problem in biometrics and has a variety
of applications in modern life. The problem of face recognition attracts researchers
working in biometrics, pattern recognition, and computer vision. Several face
recognition algorithms are also used in many different applications apart from
biometrics, such as video compression and indexing. They can also be used to classify
multimedia content, to allow fast and efficient searching for material that is of interest
to the user. An efficient face recognition system can be of great help in forensic sciences,
identification for law enforcement, surveillance, authentication for banking and
security systems, and giving preferential access to authorised users, i.e. access control for
secured areas. The problem of face recognition has gained even more importance
after the recent increase in terrorism-related incidents. Use of face recognition
for authentication also reduces the need to remember passwords and can provide
much greater security if it is used in combination with other security
measures for access control. The cost of the license for an efficient commercial face
recognition system ranges from $30,000 to $150,000, which shows the significant value
of the problem. Though face recognition is considered to be a very crucial
authentication system, even after two decades of continuous research and the evolution of many
face recognition algorithms, a truly robust and efficient system that can produce good
results in real time and under normal conditions is still not available. The Face Recognition
Vendor Test (FRVT), conducted by the National Institute of Standards
and Technology (NIST), USA, has shown that commercial face recognition
systems do not perform well under normal daily conditions. Some of the latest face
recognition algorithms involving machine learning tools perform well, but their
training period and processing time are large enough to limit their use in practical
applications. Hence there is a continuous effort to propose an effective face recognition
system with high accuracy and acceptable processing time.
matching methods, the templates are predefined by experts, whereas the templates in
appearance-based methods are learned from examples in images. Statistical analysis
and machine learning techniques can be used to find the relevant characteristics of face
and non-face images.
Chapter 2
LITERATURE SURVEY
2.1 INTRODUCTION
A literature survey is carried out mainly to analyze the background of the
current project, which helps to find flaws in existing systems and indicates which
unsolved problems we can work on. The following topics therefore not only illustrate the
background of the project but also uncover the problems and flaws that motivated
us to propose solutions and work on this project.
2.3 PAPER 1
In a feedforward neural network (FFNN), the neurons are connected in a directed way
with a clear start and end, i.e., the input layer and the output layer. The layers between
these two are called the hidden layers. Learning occurs through adjustment of the weights,
and the aim is to minimize the error between the output obtained from the output
layer and the desired output for the input presented at the input layer. The weights are
adjusted by the process of backpropagation, in which the partial derivative of the error
with respect to the last layer of weights is calculated first. The weight adjustment is then
repeated layer by layer, moving backwards, until the weights connected to the input layer
are updated.
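The weight adjustment described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the method of the surveyed paper: the network size (2 inputs, 3 hidden neurons, 1 output), the sigmoid activation, the squared-error loss, the learning rate and the OR-function training data are all choices made for this example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, p):
    """Return the hidden activations and the network output for input x."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(p["W1"], p["b1"])]
    y = sigmoid(sum(w * hi for w, hi in zip(p["W2"], h)) + p["b2"])
    return h, y


def train_step(x, target, p, lr=0.5):
    """One backpropagation step for the error E = 0.5 * (y - target)^2."""
    h, y = forward(x, p)
    # Last layer first: dE/dz for the output neuron (sigmoid' = y * (1 - y)).
    delta_out = (y - target) * y * (1.0 - y)
    # Propagate the error backwards through the output weights.
    delta_hid = [delta_out * p["W2"][j] * h[j] * (1.0 - h[j])
                 for j in range(len(h))]
    # Gradient-descent updates, from the output layer back to the input layer.
    for j in range(len(h)):
        p["W2"][j] -= lr * delta_out * h[j]
        for i in range(len(x)):
            p["W1"][j][i] -= lr * delta_hid[j] * x[i]
        p["b1"][j] -= lr * delta_hid[j]
    p["b2"] -= lr * delta_out
    return 0.5 * (y - target) ** 2


random.seed(0)
params = {
    "W1": [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)],
    "b1": [0.0, 0.0, 0.0],
    "W2": [random.uniform(-1, 1) for _ in range(3)],
    "b2": 0.0,
}
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]  # logical OR

losses = []
for epoch in range(2000):
    losses.append(sum(train_step(x, t, params) for x, t in data))
```

Note that the hidden-layer error terms are computed before any weight is changed, so every update in a step uses the gradients of the same forward pass.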
2.4 PAPER 2
An Artificial Neural Network (ANN) is an information processing paradigm that is
inspired by the way biological nervous systems, such as the brain, process information.
The key element of this paradigm is the novel structure of the information processing
system. It is composed of a large number of highly interconnected processing elements
(neurons) working in unison to solve specific problems. ANNs, like people, learn by
example. An ANN is configured for a specific application, such as pattern recognition
or data classification, through a learning process. Learning in biological systems in-
volves adjustments to the synaptic connections that exist between the neurons. This
is true of ANNs as well.
2.5 PAPER 3
We propose a deep convolutional neural network architecture codenamed Inception
that achieves the new state of the art for classification and detection in the ImageNet
Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of
this architecture is the improved utilization of the computing resources inside the
network. By a carefully crafted design, we increased the depth and width of the
network while keeping the computational budget constant. To optimize quality, the
architectural decisions were based on the Hebbian principle and the intuition of multi-
scale processing. One particular incarnation used in our submission for ILSVRC14 is
called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the
context of classification and detection.
Chapter 3
THEORETICAL BACKGROUND
3.1 INTRODUCTION
The process of image classification involves two steps: training of the system followed
by testing. Training means taking the characteristic properties of the
images and forming a unique description for each class. This is done for
all classes, depending on the type of classification problem: binary or
multi-class. Testing means categorizing the test images into the
classes for which the system was trained. A class is assigned according to
the partitioning between classes derived from the training features. Since 2006, deep
structured learning, more commonly called deep learning or hierarchical learning,
has emerged as a new area of machine learning research. Several definitions are
available for deep learning; quoting one of them, deep learning
is defined as: a class of machine learning techniques that exploit many layers of
nonlinear information processing for supervised or unsupervised feature extraction
and transformation and for pattern analysis and classification. This work aims at
the application of Convolutional Neural Networks (CNNs) to image classification.
The image data used for testing the algorithm includes remote sensing data of aerial
images and scene data from the SUN database. The rest of the paper is organized as
follows. Section 2 deals with the working of the network, followed by Section 2.1 with
the theoretical background. The working of the CNN gives the experimental procedure in
detail.
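The two steps, training and testing, can be sketched with a deliberately simple stand-in for a CNN. The example below is an assumption made for illustration only: it uses a nearest-centroid rule, where training summarises each class by the mean of its feature vectors and testing assigns the class whose summary is closest. The two-dimensional feature vectors and the class names are hypothetical.

```python
import math


def train(samples):
    """Training: describe each class by the mean of its feature vectors."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}


def classify(model, features):
    """Testing: assign the class whose description is closest to the features."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Hypothetical 2-feature training images for a binary problem.
training_set = [
    ([0.9, 0.1], "field"), ([0.8, 0.2], "field"),
    ([0.2, 0.9], "urban"), ([0.1, 0.8], "urban"),
]
test_set = [([0.7, 0.3], "field"), ([0.3, 0.6], "urban")]

model = train(training_set)
predictions = [classify(model, features) for features, _ in test_set]
accuracy = sum(p == label for p, (_, label) in zip(predictions, test_set)) / len(test_set)
```

A CNN replaces the hand-made feature vectors and the mean-based description with learned filters, but the split into a training phase and a testing phase on held-out images stays the same.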
3.2 PERCEPTRONS
Perceptrons were developed in the 1950s and 1960s by the scientist Frank Rosenblatt,
inspired by earlier work by Warren McCulloch and Walter Pitts. Today, it’s more
common to use other models of artificial neurons - in this report, and in much modern
work on neural networks, the main neuron model used is one called the sigmoid neuron.
We’ll get to sigmoid neurons shortly. But to understand why sigmoid neurons are
defined the way they are, it’s worth taking the time to first understand perceptrons.
So how do perceptrons work? A perceptron takes several binary inputs, x1, x2, ..., and
produces a single binary output:
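A common way to write the perceptron rule is: output 1 if the weighted sum of the inputs plus a bias is positive, and 0 otherwise. The sketch below assumes this formulation; the particular weights (-2, -2) and bias (3) are example values chosen so that the perceptron behaves like a NAND gate.

```python
def perceptron(inputs, weights, bias):
    """Return 1 if the weighted sum of the inputs plus the bias is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


# Evaluate the example perceptron on all four combinations of two binary inputs.
nand_outputs = [perceptron([x1, x2], [-2, -2], 3)
                for x1 in (0, 1) for x2 in (0, 1)]
```

Only the input (1, 1) drives the weighted sum below zero (-2 - 2 + 3 = -1), so it is the only input for which the output is 0.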
In fact, the exact form of σ isn’t so important - what really matters is the shape of the
function when plotted. Here’s the shape:
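For reference, the sigmoid function usually meant in this context is the logistic function, sigma(z) = 1 / (1 + exp(-z)). The short sketch below, which assumes that definition, prints a few values to show the shape: close to 0 for large negative z, exactly 0.5 at z = 0, and close to 1 for large positive z.

```python
import math


def sigmoid(z):
    """Logistic function: sigma(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


# Sample the curve at a few points to see the smooth S-shaped transition.
for z in (-6, -2, 0, 2, 6):
    print(f"z = {z:+d}   sigma(z) = {sigmoid(z):.4f}")
```

Unlike the step-like output of a perceptron, small changes in z produce small changes in the output, which is what makes gradual learning of weights possible.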
As mentioned earlier, the leftmost layer in this network is called the input layer, and
the neurons within the layer are called input neurons. The rightmost or output layer
contains the output neurons, or, as in this case, a single output neuron. The middle
layer is called a hidden layer, since the neurons in this layer are neither inputs nor
outputs. The term “hidden” perhaps sounds a little mysterious, but it really means
nothing more than “not an input or an output”. The network
above has just a single hidden layer, but some networks have multiple hidden layers.
For example, the following four-layer network has two hidden layers:
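The layer structure can be made concrete with a small feedforward sketch. The layer sizes (4 inputs, hidden layers of 5 and 3 neurons, 1 output), the random weights and the sigmoid activation are assumptions for illustration; they are not taken from a figure in this report.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_layer(n_in, n_out, rng):
    """Random weights (n_out rows of n_in values) and zero biases for one layer."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def feedforward(x, layers):
    """Pass x through each layer in turn: a = sigmoid(W a + b)."""
    a = x
    for weights, biases in layers:
        a = [sigmoid(sum(w * ai for w, ai in zip(row, a)) + b)
             for row, b in zip(weights, biases)]
    return a


rng = random.Random(0)
sizes = [4, 5, 3, 1]  # input layer, two hidden layers, output layer
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
output = feedforward([0.2, 0.7, 0.1, 0.9], layers)
```

A four-layer network has three sets of weights, one between each pair of adjacent layers, and the output of each layer is the input of the next.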
Chapter 4
SYSTEM REQUIREMENT SPECIFICATION
4.1 INTRODUCTION
This chapter describes the requirements. It specifies the hardware and software
requirements needed to run the application properly. The Software
Requirement Specification (SRS) is explained in detail, including an overview of the
dissertation as well as its functional and non-functional requirements.
An SRS document describes all data, functional and behavioral requirements of the
software under production or development. The SRS is a fundamental document that
forms the foundation of the software development process. It is the complete
description of the behavior of a system to be developed. It not only lists the requirements
of a system but also describes its major features. Requirement analysis
in systems engineering and software engineering encompasses those tasks that go into
determining the needs or conditions to meet for a new or altered product, taking
account of the possibly conflicting requirements of the various stakeholders, such as
beneficiaries or users. Requirement analysis is critical to the success of a
development project. Requirements must be documented, measurable, testable, related to
identified business needs or opportunities, and defined to a level of detail sufficient for
system design.
The SRS functions as a blueprint for completing a project. The SRS is often
referred to as the “parent” document because all subsequent project management
documents, such as design specifications, statements of work, software architecture
specification, testing and validation plans, and documentation plans, are related to it.
It is important to note that an SRS contains functional and non-functional require-
ments only.
Thus the goal of preparing the SRS document is to
• Input test case must not have compilation and runtime errors.
• The application must not stop working when kept running for even a long time.
• The application must function as expected for every set of test cases provided.
• The application should generate the output for given input test case and input
parameters.
• Product Requirements
• Organizational Requirements
• User Requirements
• Response time: The time the system takes to load and the time for responses
on any action the user does.
• Growth Requirements: As the system grows it will need more storage space to
keep up with the efficiency.
• Architectural Standards: The standards needed for the system to work and
sustain.
• Ease of Use: The front end is designed in such a way that it provides an interface
which allows the user to interact in an easy manner.
• Modularity: The complete product is broken up into many modules and well-
defined interfaces are developed to explore the benefit of flexibility of the prod-
uct.
• Robustness: This software is being developed in such a way that the overall
performance is optimized and the user can expect the results within a limited
time with utmost relevancy and correctness.
Process Standards: IEEE standards, which are followed by most software developers
all over the world, are used to develop the application.
Design Methods: Design is one of the important stages in the software engineering
process. This stage is the first step in moving from the problem domain to the solution
domain. In other words, starting with what is needed, design takes us to how to satisfy
the needs.
The customers are those that perform the eight primary functions of systems engi-
neering, with special emphasis on the operator as the key customer. Operational
requirements will define the basic need and, at a minimum, will be related to these
following points:
• Performance and related parameters: It points out the critical system parame-
ters to accomplish the mission
• GPU : Nvidia
• VRAM : 6GB
• VRAM : 1.5GB
• Tools : PyCharm
Chapter 5
SYSTEM ANALYSIS
5.1 INTRODUCTION
Design is a meaningful engineering representation of something that is to be built.
It is the most crucial phase in the development of a system. Software design is a
process through which the requirements are translated into a representation of the
software. Design is the place where quality is fostered in software engineering. Based on the
user requirements and the detailed analysis of the existing system, the new system
must be designed. This is the phase of system design. Design is the way
to accurately translate a customer's requirements into the finished software product.
Design creates a representation or model and provides details about the software data
structures, architecture, interfaces and components that are necessary to implement a system.
The logical system design arrived at as a result of systems analysis is converted into
the physical system design.
• Operational Feasibility
• Economical Feasibility
• Technical Feasibility
• Social Feasibility
Chapter 6
SYSTEM DESIGN
6.1 INTRODUCTION
Design is a meaningful engineering representation of something that is to be built.
It is the most crucial phase in the development of a system. Software design is a
process through which the requirements are translated into a representation of the
software. Design is the place where quality is fostered in software engineering. Based on the
user requirements and the detailed analysis of the existing system, the new system
must be designed. This is the phase of system design. Design is the way
to accurately translate a customer's requirements into the finished software product.
Design creates a representation or model and provides details about the software data
structures, architecture, interfaces and components that are necessary to implement a system.
The logical system design arrived at as a result of systems analysis is converted into
the physical system design.
• Coding: In this phase the programmer starts coding in order to give a full
sketch of the product. In other words, the system specifications are converted into
machine-readable computer code.
• Testing: In this phase all programs (modules) are integrated and tested to
ensure that the complete system meets the software requirements. The testing is
concerned with verification and validation.
• Maintenance: The maintenance phase is the longest phase, in which the
software is updated to fulfill changing customer needs, adapt to
changes in the external environment, correct errors and oversights previously
undetected in the testing phase, and enhance the efficiency of the software.
• Fewer human resources are required, since once one phase is finished those people can
start working on the next phase.
Basic elements:
Chapter 7
IMPLEMENTATION
7.1 INTRODUCTION
The implementation phase of the project is where the detailed design is actually
transformed into working code. The aim of this phase is to translate the design into the
best possible solution in a suitable programming language. This chapter covers the
implementation aspects of the project, giving details of the programming language
and development environment used. It also gives an overview of the core modules
of the project with their step by step flow. The implementation stage requires the
following tasks:
• Careful planning.
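import math
import os
import os.path
import pickle
from PIL import Image, ImageDraw
from sklearn import neighbors
import face_recognition
from face_recognition.face_recognition_cli import image_files_in_folder

# NOTE: parts of this listing are missing in the source. Lines marked "assumed"
# are reconstructed from the surviving fragments, not the authors' exact code.
def train(train_dir, model_save_path=None, n_neighbors=None,
          knn_algo='ball_tree', verbose=False):  # signature assumed
    X, y = [], []
    # Assumed: one sub-folder per person; encode the single face in each image.
    for class_dir in os.listdir(train_dir):
        class_path = os.path.join(train_dir, class_dir)
        if not os.path.isdir(class_path):
            continue
        for img_path in image_files_in_folder(class_path):
            image = face_recognition.load_image_file(img_path)
            boxes = face_recognition.face_locations(image)
            if len(boxes) == 1:
                X.append(face_recognition.face_encodings(image, known_face_locations=boxes)[0])
                y.append(class_dir)
    # Choose k as sqrt(number of samples) when it is not given.
    if n_neighbors is None:
        n_neighbors = int(round(math.sqrt(len(X))))
        if verbose:
            print("Chose n_neighbors automatically:", n_neighbors)
    # Train the k-nearest-neighbours classifier with distance-weighted votes.
    knn_clf = neighbors.KNeighborsClassifier(n_neighbors=n_neighbors,
                                             algorithm=knn_algo, weights='distance')
    knn_clf.fit(X, y)
    # Save the trained classifier to disk if a path was given.
    if model_save_path is not None:
        with open(model_save_path, 'wb') as f:
            pickle.dump(knn_clf, f)
    return knn_clf

def predict(X_img_path, knn_clf=None, model_path=None, distance_threshold=0.6):  # signature assumed
    # Load a trained model from disk if one was not passed in.
    if knn_clf is None:
        with open(model_path, 'rb') as f:
            knn_clf = pickle.load(f)
    X_img = face_recognition.load_image_file(X_img_path)       # assumed
    X_face_locations = face_recognition.face_locations(X_img)  # assumed
    if len(X_face_locations) == 0:
        return []
    # Assumed: label each face by its nearest neighbour, or "unknown" if too far.
    encodings = face_recognition.face_encodings(X_img, known_face_locations=X_face_locations)
    distances = knn_clf.kneighbors(encodings, n_neighbors=1)[0]
    names = knn_clf.predict(encodings)
    return [(name if dist[0] <= distance_threshold else "unknown", loc)
            for name, loc, dist in zip(names, X_face_locations, distances)]

def show_prediction_labels_on_image(img_path, predictions):  # signature assumed
    pil_image = Image.open(img_path).convert("RGB")  # assumed
    draw = ImageDraw.Draw(pil_image)                 # assumed
    for name, (top, right, bottom, left) in predictions:  # assumed
        draw.rectangle(((left, top), (right, bottom)), outline=(0, 0, 255))
        draw.text((left + 6, bottom + 5), name, fill=(0, 0, 255))
    pil_image.show()

if __name__ == "__main__":
    print("Training classifier...")
    classifier = train("train", model_save_path="trained_model.dat", n_neighbors=2)
    print("Training complete!")
    for image_file in os.listdir("test"):
        full_file_path = os.path.join("test", image_file)
        predictions = predict(full_file_path, model_path="trained_model.dat")
        show_prediction_labels_on_image(os.path.join("test", image_file), predictions)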
import face_recognition
from PIL import Image, ImageTk
import tkinter as Tkinter  # the module is named "Tkinter" in Python 2
import cv2
import constants as CONSTANTS
import sys

knownImagesPreEncoded = []
# ...
del knownImages
# ...
        # Identify image
        face_locations = face_recognition.face_locations(image)
        tempReco = []
        # ...
                , min(face_locations[fset][0], face_locations[fset][2])
                , fill=CONSTANTS.LABEL_TEXT_COLOR
                , font=CONSTANTS.LABEL_FONT, text=tempReco[fset])
        canvas.pack()
        print("Updating...")
    except IndexError:
        print("Error:", sys.exc_info())
    canvas.after(1, updateImage, root, canvas)

# Create a window
top = Tkinter.Tk()
# Add canvas to window to display RGB image
canvas = Tkinter.Canvas(top, height=image.shape[0] * CONSTANTS.SCALING_X,
                        width=image.shape[1] * CONSTANTS.SCALING_Y)
# Update canvas as fast as possible
updateImage(top, canvas)
# Listen for events
top.mainloop()
# Release video permission
videoCApture.release()
Chapter 8
8.1 INTRODUCTION
Testing is an important phase in the development life cycle of the product; it is
the phase in which the errors remaining from all the earlier phases are detected. Hence testing
plays a very critical role in quality assurance and in ensuring the reliability of the
software. Once the implementation is done, a test plan should be developed and run
on a given set of test data. Each test has a different purpose, but all work to verify that all
the system elements have been properly integrated and perform their allocated functions.
The testing process is carried out to make sure that the product does exactly
what it is supposed to do. Testing is the final verification and validation
activity within the organization itself. In the testing stage, we try to achieve the
following goals:
During testing, the major activities are concentrated on the examination and
modification of the source code. The test cases executed for this project are listed below.
The description of each test case, the steps to be followed, the expected result, the status
and screenshots are explained with each of the test cases.
• Guarantee that all independent paths within a module have been exercised at
least once.
• Execute all loops at their boundaries and within their operational bounds.
• Performance errors.
• Errors in objects.
Chapter 9
9.1 CONCLUSION
This project was the first attempt to develop a system of this nature. We identified from
the beginning that producing a complete result would be impossible within the given
time frame. We viewed the project as a journey in which we learnt many lessons and
gained insights into the subject, which we have tried to share in this report and summarise
in this chapter. We tried to look at the problem from many points of view, which
generated some new ideas that could be explored in future. We suggested formal
approaches for modelling and analysing the system which are by no means complete
but could become the starting point for further research. We also created a working system
and algorithms which we claim to be useful and extensible. However, as we have seen
in this chapter, all these achievements are only partially successful.
Personally, we would consider this project a success if the ideas described in the
report can become a useful reference for future work on the subject.
• robots that can see, feel, and predict the world around them
• composition of music
• trends found in the human genome to aid in the understanding of the data
compiled by the Human Genome Project
[1] Li Deng and Dong Yu, Deep Learning: Methods and Applications, Microsoft
Research.
[2] T. M. Lillesand, R. W. Kiefer and J. W. Chipman, Remote Sensing and
Image Interpretation.
[5] Abhishek Kumar Pandey, Mohd Anas Khan and Aleena Swetapadma, A
back-propagation neural network based method for post life expectancy estimation
of thoracic surgery patients, 2017 International Conference On Smart Technologies
For Smart Nation.