Neural Network
Facial recognition is a category of biometric software that maps an individual's facial features
mathematically and stores the data as a faceprint. The software uses deep learning algorithms to compare
a live capture or digital image to the stored faceprint in order to verify an individual's identity.
High-quality cameras in mobile devices have made facial recognition a viable option for authentication as
well as identification. Apple’s iPhone X, for example, includes Face ID technology that lets users unlock
their phones with a faceprint mapped by the phone's camera. The phone's software, which is designed with
3-D modeling to resist being spoofed by photos or masks, captures and compares over 30,000 variables.
As of this writing, Face ID can be used to authenticate purchases with Apple Pay and in the iTunes Store, App Store, and iBooks Store. Apple encrypts the faceprint data and stores it locally on the device (in the Secure Enclave) rather than in the cloud, and authentication likewise takes place directly on the device.
Model Overview:
Let’s visualize how to create a basic facial recognition application using a pre-trained deep neural network. Training of the network has already been done, as shown in the diagram below.
I am using this pre-trained network to compare the embedding vectors of the images stored in the file system with the embedding vector of the image captured from the webcam, as the second diagram illustrates.
As per the diagram above, if the 128-dimensional embedding vector of the face captured by the webcam is close to an embedding vector stored in the database, the person is recognized. All the images stored in the file system are converted to a dictionary with names as keys and embedding vectors as values.
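To make the matching step concrete, here is a minimal MATLAB sketch of it (not the full pipeline). The function getEmbedding is a hypothetical wrapper around the pre-trained network and is not defined in this document, the image file names are placeholders, and the 0.6 distance threshold is an assumption.
% Build the dictionary: person name -> stored embedding vector
db = containers.Map();
db('alice') = getEmbedding(imread('alice.jpg'));   % getEmbedding and the image files are assumed
db('bob')   = getEmbedding(imread('bob.jpg'));
% Compare the webcam capture against every stored embedding
queryVec = getEmbedding(webcamFrame);              % webcamFrame: image captured from the webcam
names    = keys(db);
bestName = 'unknown';
bestDist = inf;
for k = 1:numel(names)
    d = norm(queryVec - db(names{k}));             % Euclidean distance between embeddings
    if d < bestDist
        bestDist = d;
        bestName = names{k};
    end
end
if bestDist < 0.6                                  % assumed recognition threshold
    fprintf('Recognized: %s (distance %.2f)\n', bestName, bestDist);
else
    disp('Face not recognized');
end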
An affine transformation rotates the face so that the positions of the eyes, nose, and mouth are consistent from face to face. Fixing these positions aids in finding the similarity between two images when applying one-shot learning to face recognition.
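As a rough sketch of this alignment step, the snippet below warps a face so that three landmarks land on fixed template positions. It assumes the eye and mouth centres have already been located by some landmark detector; both the landmark and template coordinates are illustrative placeholders.
% Affine alignment of a face using three landmark points
faceImg   = imread('c:\Data\girl1.jpg');    % same example image used in the usage note below
eyeL      = [42 58];                        % detected [x y] landmark positions (placeholders)
eyeR      = [96 55];
mouth     = [70 110];
landmarks = [eyeL; eyeR; mouth];
template  = [30 30; 70 30; 50 75];          % canonical positions in a 100x100 aligned crop (assumed)
tform       = fitgeotrans(landmarks, template, 'affine');   % affine transform from the 3 point pairs
alignedFace = imwarp(faceImg, tform, 'OutputView', imref2d([100 100]));
imshow(alignedFace)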
% Additions by me:
% Minimum face size constraint
% Adaptive theta thresholding (theta is thresholded by mean2(theta)/4)
% Parameters are modified to detect better. Please check the paper for
% the parameters they propose.
% Check the paper for more details.
% usage:
% I=double(imread('c:\Data\girl1.jpg'));
% detect_face(I);
% The function will display the bounding box if a face is found.
% Notes: This algorithm is very primitive and doesn't work in real life.
% The reason why I implemented it is that I believe low-cost platforms
% need this kind of algorithm. However, this one doesn't perform very
% well in my opinion (if I implemented it correctly)
function []=detect_face(I)
I=double(I);
H=size(I,1);
W=size(I,2);
R=I(:,:,1);
G=I(:,:,2);
B=I(:,:,3);
% luminance (Y) channel (standard Rec. 601 weights assumed)
Y=0.299*R+0.587*G+0.114*B;
% normalize Y
minY=min(min(Y));
maxY=max(max(Y));
Y=255.0*(Y-minY)./(maxY-minY);
YEye=Y;
Yavg=sum(sum(Y))/(W*H);
T=1;
if (Yavg<64)
    T=1.4;
elseif (Yavg>192)
    T=0.6;
end
if (T~=1)
    RI=R.^T;
    GI=G.^T;
else
    RI=R;
    GI=G;
end
C=zeros(H,W,3);
C(:,:,1)=RI;
C(:,:,2)=GI;
C(:,:,3)=B;
figure,imshow(C/255);
title('Lighting compensation');
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
S=zeros(H,W);
[SkinIndexRow,SkinIndexCol] =find(10<Cr & Cr<45);
for i=1:length(SkinIndexRow)
    S(SkinIndexRow(i),SkinIndexCol(i))=1;
end
figure,imshow(S);
title('skin');
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%
figure,imshow(SN);
title('skin with noise removal');
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%
L = bwlabel(SN,8);
BB = regionprops(L, 'BoundingBox');
bboxes= cat(1, BB.BoundingBox);
widths=bboxes(:,3);
heights=bboxes(:,4);
hByW=heights./widths;
lenRegions=size(bboxes,1);
foundFaces=zeros(1,lenRegions);
rgb=label2rgb(L);
figure,imshow(rgb);
title('face candidates');
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%
figure, imshow(RIC/255);
title('Possible face R channel');
M=zeros(HCur, WCur);
for j=1:HCur
    Hist(j)=length(find(M(j,:)==1));
end
wMax=find(Hist==max(Hist));
wMax=wMax(1); % just take one of them.
figure, imshow(M);
title('Mouth map');
eyeH=HCur-wMax;
eyeW=WCur;
YC=YEye(YStart:YStart+eyeH-1, XStart:XStart+eyeW-1);
E=zeros(eyeH,eyeW);
[EyeIndexRow,EyeIndexCol] =find(65<YC & YC<80);
for j=1:length(EyeIndexRow)
    E(EyeIndexRow(j),EyeIndexCol(j))=1;
end
foundFaces(i)=1;
numFaceFound=numFaceFound+1;
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%
if (numFaceFound>0)
    disp('Indices of faces found: ');
    ind=find(foundFaces==1);
    CurBB=bboxes(ind,:);
    CurBB
else
    close all;
end
end
Model Overview:
We use a NN for our task. It consists of convolutional NN (CNN) layers, recurrent NN (RNN) layers and
a final Connectionist Temporal Classification (CTC) layer. Fig. 2 shows an overview of our HTR
system.
Fig. 2: Overview of the NN operations (green) and the data flow through the NN (pink).
We can also view the NN more formally as a function (see Eq. 1) that maps an image (or matrix) M of size W×H to a character sequence (c1, c2, …) with a length between 0 and L. Because the text is recognized at the character level, words or texts not contained in the training data can be recognized too (as long as the individual characters are classified correctly).
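Eq. 1 itself is not reproduced in this text, so the following is a reconstruction based purely on the description above, written in LaTeX notation (n is the length of the recognized character sequence):
\mathrm{NN} : M \in \mathbb{R}^{W \times H} \rightarrow (c_1, c_2, \ldots, c_n), \qquad 0 \le n \le L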
%% Manual Cropping
img = imread('sample.bmp');
imshow(img)
imgGray = rgb2gray(img);                    % convert RGB image to grayscale
imgCrop = imcrop(imgGray);                  % interactive crop: draw a rectangle on the displayed image
imshow(imgCrop)
%% Resizing
imgLGE = imresize(imgCrop, 5, 'bicubic');   % enlarge 5x using bicubic interpolation
imshow(imgLGE)
%% Rotation
imgRTE = imrotate(imgLGE, 35);              % rotate 35 degrees counter-clockwise
imshow(imgRTE)
%% Binary Image
imgBW = im2bw(imgLGE, 0.90455);             % binarize the enlarged (unrotated) image at threshold 0.90455
imshow(imgBW)
AND Gate
Input          Output
X1    X2       Y
0     0        0
0     1        0
1     0        0
1     1        1
Here φ is the unit-step activation: φ(z) = 1 if z ≥ 0, and φ(z) = 0 otherwise.
Row 1:
From W1X1 + W2X2 + b, initializing W1 and W2 as 1 and b as -1.5, we get:
X1(1) + X2(1) + (-1.5) = 0x1 + 0x1 - 1.5 = -1.5, ⸫ φ (-1.5) = 0
⸫ Output, Y= 0
Row 2:
X1(1) + X2(1) + (-1.5) = 0x1 + 1x1 - 1.5 = -0.5, ⸫ φ (-0.5) = 0
⸫ Output, Y= 0
Row 3:
X1(1) + X2(1) + (-1.5) = 1x1 + 0x1 - 1.5 = -0.5, ⸫ φ (-0.5) = 0
⸫ Output, Y= 0
Row 4:
X1(1) + X2(1) + (-1.5) = 1x1 + 1x1 - 1.5 = 0.5, ⸫ φ (0.5) = 1
⸫ Output, Y= 1
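The same calculation can be checked in MATLAB. Below is a minimal sketch that evaluates a single neuron over all four input rows, using the weights above (W1 = W2 = 1, b = -1.5) and the unit-step activation with φ(0) = 1; the variable names are illustrative only.
% Single-layer perceptron evaluated on the AND truth table
X   = [0 0; 0 1; 1 0; 1 1];     % the four input rows (X1, X2)
W   = [1; 1];                   % W1 and W2 as initialized above
b   = -1.5;
phi = @(z) double(z >= 0);      % step activation with phi(0) = 1, as used in the text
Y   = phi(X*W + b)              % expected output: [0; 0; 0; 1]
Swapping in W = [-1; -1] and b = 1 reproduces the NAND rows that follow, and the OR, NOT, and NOR gates work the same way with their respective weights.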
NAND Gate
Input          Output
X1    X2       Y
0     0        1
0     1        1
1     0        1
1     1        0
Fig: Truth table
Row 1:
From W1X1 + W2X2 + b, initializing W1 and W2 as -1 and b as 1, we get:
X1(-1) + X2(-1) + 1 = 0x(-1) + 0x(-1) + 1 = 1, ⸫ φ(1) = 1
⸫ Output, Y= 1
Row 2:
X1(-1) + X2(-1) + 1 = 0x(-1) + 1x(-1) + 1=0 , ⸫ φ(0) = 1
⸫ Output, Y= 1
Row 3:
X1(-1) + X2(-1) + 1 = 1x(-1) + 0x(-1) + 1 = 0, ⸫ φ(0) = 1
⸫ Output, Y= 1
Row 4:
X1(-1) + X2(-1) + 1 = 1x(-1) + 1x(-1) + 1= -1, ⸫ φ(-1) = 0
⸫ Output, Y= 0
OR Gate
Input          Output
X1    X2       Y
0     0        0
0     1        1
1     0        1
1     1        1
Row 1:
From W1X1 + W2X2 + b, initializing W1 and W2 as 1 and b as -1, we get:
X1(1) + X2(1) + (-1) = 0x1 + 0x1 - 1 = -1, ⸫ φ(-1) = 0
⸫ Output, Y= 0
Row 2:
X1(1) + X2(1) + (-1) = 0x1 + 1x1 - 1 = 0, ⸫ φ(0) = 1
⸫ Output, Y= 1
Row 3:
X1(1) + X2(1) + (-1) = 1x1 + 0x1 - 1 = 0, ⸫ φ(0) = 1
⸫ Output, Y= 1
Row 4:
X1(1) + X2(1) + (-1) = 1x1 + 1x1 - 1 = 1, ⸫ φ(1) = 1
⸫ Output, Y= 1
NOT Gate
Input     Output
X1        Y
0         1
1         0
Row 1:
From W1X1 + b (the NOT gate has a single input, so there is no W2 term), initializing W1 as -1 and b as 0.5, we get:
X1(-1) + 0.5 = 0x(-1) + 0.5= 0.5, ⸫ φ(0.5) = 1
⸫ Output, Y= 1
Row 2:
X1(-1) + 0.5 = 1x(-1) + 0.5 = -0.5, ⸫ φ(-0.5) = 0
⸫ Output, Y= 0
NOR Gate
Input          Output
X1    X2       Y
0     0        1
0     1        0
1     0        0
1     1        0
Row 1:
From W1X1 + W2X2 + b, initializing W1 and W2 as -1 and b as 0.5, we get:
X1(-1) + X2(-1) + 0.5= 0x(-1) + 0x(-1) + 0.5 = 0.5, ⸫ φ(0.5) = 1
⸫ Output, Y= 1
Row 2:
X1(-1) + X2(-1) + 0.5= 0x(-1) + 1x(-1) + 0.5 = -0.5, ⸫ φ(-0.5) = 0
⸫ Output, Y= 0
Row 3:
X1(-1) + X2(-1) + 0.5 = 1x(-1) + 0x(-1) + 0.5 = -0.5, ⸫ φ(-0.5) = 0
⸫ Output, Y= 0
Row 4:
X1(-1) + X2(-1) + 0.5= 1x(-1) + 1x(-1) + 0.5 = -1.5, ⸫ φ(-1.5) = 0
⸫ Output, Y= 0
XNOR Gate
Input          Output
X1    X2       Y
0     0        1
0     1        0
1     0        0
1     1        1
XNOR is built from two hidden neurons, an AND neuron and a NOR neuron, whose outputs are then fed to an OR output neuron:
AND (x1+x2–1.5)
NOR (-x1-x2+0.5)
OR (x1+x2–1)
Row 1 (X1=0, X2=0):
AND neuron: X1(1) + X2(1) - 1.5 = 0x1 + 0x1 - 1.5 = -1.5, ⸫ φ(-1.5) = 0
NOR neuron: X1(-1) + X2(-1) + 0.5 = 0x(-1) + 0x(-1) + 0.5 = 0.5, ⸫ φ(0.5) = 1
OR output: 0x1 + 1x1 – 1 = 0, ⸫ φ(0) = 1
⸫ Output, Y= 1
XOR Gate
Input          Output
X1    X2       Y
0     0        0
0     1        1
1     0        1
1     1        0
Fig: Truth table
XOR is built from two hidden neurons, an OR neuron and a NAND neuron, whose outputs are then fed to an AND output neuron:
OR (x1+x2–1)
NAND (-x1-x2+1)
AND (x1+x2–1.5)
Row 1 (X1=0, X2=0):
OR neuron: X1(1) + X2(1) - 1 = 0x1 + 0x1 - 1 = -1, ⸫ φ(-1) = 0
NAND neuron: X1(-1) + X2(-1) + 1 = 0x(-1) + 0x(-1) + 1 = 1, ⸫ φ(1) = 1
AND output: 0x1 + 1x1 – 1.5 = -0.5, ⸫ φ(-0.5) = 0
⸫ Output, Y= 0
Row 2 (X1=0, X2=1):
OR neuron: X1(1) + X2(1) - 1 = 0x1 + 1x1 - 1 = 0, ⸫ φ(0) = 1
NAND neuron: X1(-1) + X2(-1) + 1 = 0x(-1) + 1x(-1) + 1 = 0, ⸫ φ(0) = 1
AND output: 1x1 + 1x1 – 1.5 = 0.5, ⸫ φ(0.5) = 1
⸫ Output, Y= 1
Row 3 (X1=1, X2=0):
OR neuron: X1(1) + X2(1) - 1 = 1x1 + 0x1 - 1 = 0, ⸫ φ(0) = 1
NAND neuron: X1(-1) + X2(-1) + 1 = 1x(-1) + 0x(-1) + 1 = 0, ⸫ φ(0) = 1
AND output: 1x1 + 1x1 – 1.5 = 0.5, ⸫ φ(0.5) = 1
⸫ Output, Y= 1
Row 4 (X1=1, X2=1):
OR neuron: X1(1) + X2(1) - 1 = 1x1 + 1x1 - 1 = 1, ⸫ φ(1) = 1
NAND neuron: X1(-1) + X2(-1) + 1 = 1x(-1) + 1x(-1) + 1 = -1, ⸫ φ(-1) = 0
AND output: 1x1 + 0x1 – 1.5 = -0.5, ⸫ φ(-0.5) = 0
⸫ Output, Y= 0
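The two-layer computation can be verified the same way. Below is a minimal MATLAB sketch wiring together the OR, NAND, and AND neurons with exactly the weights used in the rows above.
% Two-layer perceptron evaluated on the XOR truth table
X   = [0 0; 0 1; 1 0; 1 1];           % the four input rows (X1, X2)
phi = @(z) double(z >= 0);            % same step activation, phi(0) = 1
h1  = phi(X*[1; 1]  - 1);             % hidden OR neuron   (x1 + x2 - 1)
h2  = phi(X*[-1; -1] + 1);            % hidden NAND neuron (-x1 - x2 + 1)
Y   = phi([h1 h2]*[1; 1] - 1.5)       % AND of the hidden outputs; expected [0; 1; 1; 0]
Replacing the hidden neurons with the AND and NOR neurons and the output neuron with the OR neuron yields the XNOR gate in the same way.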