Thesis PPT Hritu Raj-1
1. DeepFace: Closing the Gap to Human-Level Performance
   Method: 9-layer CNN for face verification tasks. Incorporates techniques such as multi-task learning and supervised [...]
   Dataset: Labeled Faces in the Wild (LFW), containing more than 13,000 labeled face images.
   Limitation: Accuracy is low.
2. VGGFace: A Deep Face Recognition Model [4]
   Method: Deep CNN architecture (VGG-16 model) adapted for face recognition tasks. The network is trained to learn discriminative features directly from face images.
   Dataset: VGGFace, which includes over 2.6 million images from over 2,600 identities.
   Limitation: Highlighted challenges in handling variations in facial expressions, occlusions, and aging.
3. VGGFace2: A Dataset for Recognising Faces across Pose and Age
   Method: Deep CNNs based on the VGG-16 and VGG-19 models.
   Dataset: VGGFace2, which consists of over 3.3 million images from over 9,000 identities.
   Limitation: Recognition accuracy varies with race and skin tone.
5. ArcFace: Additive Angular Margin Loss for Deep Face Recognition [7]
   Method: Introduced the Additive Angular Margin (ArcFace) loss.
   Dataset: LFW (Labeled Faces in the Wild), MegaFace, and MS1M.
   Limitation: Does not work well on low-resolution images.
6. MobileFaceNet: Efficient CNN Model for Face Recognition on Mobile Devices [8]
   Method: Compact CNN architecture designed for face recognition on mobile and embedded devices, reducing model size and computational complexity.
   Dataset: CASIA-WebFace, MegaFace, and LFW (Labeled Faces in the Wild).
   Limitation: Being lightweight, its accuracy is slightly lower and can be improved.
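To make the additive angular margin idea from ArcFace [7] concrete, the following is a minimal single-sample sketch in plain Python. The function name and the inputs (a list of cosine similarities to each class centre) are illustrative assumptions; the defaults s = 64 and m = 0.5 are the values reported in the ArcFace paper. Production implementations also handle the case where the angle plus margin exceeds π, which this sketch ignores.

```python
import math


def arcface_loss(cosines, target, s=64.0, m=0.5):
    """Additive angular margin loss for one sample (illustrative sketch).

    cosines: cosine similarity between the embedding and each class centre.
    target:  index of the ground-truth class.
    s, m:    feature scale and angular margin.
    """
    logits = []
    for j, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # guard acos against rounding error
        if j == target:
            # Add the margin to the angle of the ground-truth class only.
            c = math.cos(math.acos(c) + m)
        logits.append(s * c)
    # Numerically stable cross-entropy: log-sum-exp minus the target logit.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]
```

With m = 0 the function reduces to ordinary softmax cross-entropy on scaled cosines; a positive margin increases the loss for the same input, which pushes the embedding closer to its class centre during training.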
Proposed Dataset
➢ Created for facial recognition in low-resolution CCTV footage.
➢ Designed to develop and evaluate recognition
systems.
➢ Focused on low-resolution environments.
Process of Dataset Creation
1. Raw Data Collection
➢ A CCTV camera from TVT, with a frame rate of 20 FPS and a resolution of 1910×1077, was used.
➢ Videos were recorded using a screen recorder on Linux.
Figure: 3 Proposed Dataset
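As a rough illustration of how frames might be sampled from a 20 FPS recording, the sketch below lists the frame indices to keep when saving one frame per fixed interval. The sampling interval is an assumption (the slides do not state one), and the function name is hypothetical.

```python
def sample_frame_indices(duration_s, fps=20, every_s=1.0):
    """Return the frame indices to keep when saving one frame every `every_s` seconds."""
    total_frames = int(duration_s * fps)
    step = max(1, round(every_s * fps))  # never step by less than one frame
    return list(range(0, total_frames, step))
```

For example, a 3-second clip at 20 FPS sampled once per second keeps frames 0, 20, and 40.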
2. Preprocessing
➢ Frames were extracted from the recorded videos.
➢ Faces were detected and cropped with MTCNN.
➢ Cropped image size: 130×170 pixels.
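One way to obtain crops with a fixed 130×170 shape is to expand each detected face box to that aspect ratio before cropping and resizing. The sketch below is an assumption about how this could be done, not the procedure used for the dataset; the box format (x, y, w, h) and the function name are hypothetical, and the detector itself is not shown.

```python
def crop_box(face, frame_w=1910, frame_h=1077, out_w=130, out_h=170):
    """Expand a face box (x, y, w, h) to the out_w:out_h aspect ratio.

    The box stays centred on the face where possible and is clamped to the
    frame. Returns integer pixel coordinates (left, top, right, bottom).
    """
    x, y, w, h = face
    cx, cy = x + w / 2, y + h / 2
    aspect = out_w / out_h
    # Grow the shorter side so the box matches the target aspect ratio.
    if w / h < aspect:
        w = h * aspect
    else:
        h = w / aspect
    # Shrink uniformly if the expanded box would not fit inside the frame.
    scale = min(1.0, frame_w / w, frame_h / h)
    w, h = w * scale, h * scale
    # Shift the box back inside the frame if it crosses a border.
    left = min(max(cx - w / 2, 0), frame_w - w)
    top = min(max(cy - h / 2, 0), frame_h - h)
    return round(left), round(top), round(left + w), round(top + h)
```

The returned region can then be cropped and resized to 130×170 pixels with any image library; because the aspect ratio already matches, the resize does not distort the face.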
High-Definition Face Images: 102
Unique Identities: 102
Males: 95
Females: 7
These conditions highlight the need for advanced super-resolution and recognition models to improve accuracy.
Dataset captured by drone:
Category    Value
Videos      200
Subjects    58
[Figure panels: Input; Enhanced Faces using Super-Resolution; Recognition]
Figure: 7 Architecture for the proposed method: (a) MTCNN [10], (b) super-resolution, (c) recognition (FaceNet [14]).
Proposed Architecture
Pipeline
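The three stages of the architecture (detection, super-resolution, recognition) can be sketched as a small pipeline class. This is a structural outline under stated assumptions: the three stage callables are hypothetical placeholders for MTCNN, a super-resolution model, and FaceNet, and matching by cosine similarity against a gallery with a fixed threshold is an illustrative choice, not a detail taken from the slides.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence

Embedding = Sequence[float]


def cosine_similarity(a: Embedding, b: Embedding) -> float:
    """Cosine of the angle between two embedding vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class FacePipeline:
    detect: Callable[[object], List[object]]  # (a) returns cropped faces from a frame
    enhance: Callable[[object], object]       # (b) super-resolves one cropped face
    embed: Callable[[object], Embedding]      # (c) maps a face to an embedding
    threshold: float = 0.5                    # hypothetical acceptance threshold

    def identify(self, frame: object, gallery: Dict[str, Embedding]) -> List[Optional[str]]:
        """Return the best-matching gallery name per detected face, or None."""
        names: List[Optional[str]] = []
        for face in self.detect(frame):
            embedding = self.embed(self.enhance(face))
            best, best_score = None, self.threshold
            for name, reference in gallery.items():
                score = cosine_similarity(embedding, reference)
                if score >= best_score:
                    best, best_score = name, score
            names.append(best)
        return names
```

Keeping each stage as a separate callable makes it straightforward to swap the super-resolution model while holding detection and recognition fixed, which is how the effect of enhancement on recognition accuracy would typically be compared.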
Software Configuration
● Operating System: Ubuntu 20.04 LTS
● Deep Learning Frameworks: TensorFlow and PyTorch for implementing and training models.
● CUDA and cuDNN: NVIDIA CUDA and cuDNN libraries for GPU acceleration.
● Python: Python 3.10 as the primary programming language, with various libraries for data handling
and preprocessing.
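A quick way to confirm that a machine matches this configuration is a small check script. The sketch below uses only the Python standard library; the function name and the package list are assumptions based on the frameworks named above, and it only tests whether the packages are importable, not whether the GPU is usable.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 10), packages=("tensorflow", "torch")):
    """Report whether the Python version and required packages are available."""
    report = {"python_ok": sys.version_info[:2] >= tuple(min_python)}
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report
```

Calling `check_environment()` returns a dictionary of booleans. Once PyTorch is installed, `torch.cuda.is_available()` can additionally confirm that CUDA is usable.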
Illustrative Result
[Figure panels: Input; Cropped; Enhanced]
[2] Zou, W. W. W., & Yuen, P. C. (2012). Very low resolution face recognition problem. IEEE Transactions on
Image Processing, 21(1), 327–340. doi: 10.1109/TIP.2011.2162423
[3] Taigman, Y., Yang, M., Ranzato, M., & Wolf, L. (2014). DeepFace: Closing the gap to human-level performance in face verification. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Parkhi, O. M., Vedaldi, A., & Zisserman, A. (2015). Deep Face Recognition. In British Machine Vision Conference (BMVC).
[5] Cao, Q., Shen, L., Xie, W., Parkhi, O. M., & Zisserman, A. (2018). VGGFace2: A dataset for recognising faces across pose and age.
International Conference on Automatic Face and Gesture Recognition (FG).
[6] King, D. E. (2009). Dlib-ml: A machine learning toolkit. Journal of Machine Learning Research, 10, 1755-1758.
[7] Deng, J., Guo, J., Xue, N., & Zafeiriou, S. (2019). ArcFace: Additive angular margin loss for deep face recognition. Proceedings of
the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Chen, S., Liu, Y., Gao, X., & Han, Z. (2018). MobileFaceNets: Efficient CNNs for accurate real-time face verification on
mobile devices. In Chinese Conference on Biometric Recognition (CCBR).
[11] Kalra, I., Singh, M., Nagpal, S., Singh, R., Vatsa, M., & Sujit, P. B. DroneSURF: Benchmark dataset for drone-based face recognition.
IEEE Xplore. IIIT-Delhi, India.
[12] Wang, X., Li, Y., Zhang, H., & Shan, Y. (2021). Towards real-world blind face restoration with generative facial prior. In Proceedings of
the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Zhou, S., Chan, K. C. K., Li, C., & Loy, C. C. (2022). Towards robust blind face restoration with codebook lookup transformer. In
Advances in Neural Information Processing Systems (NeurIPS).
[14] Schroff, F., Kalenichenko, D., & Philbin, J. (2015). FaceNet: A unified embedding for face recognition and clustering. In Proceedings of
the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Wang, X., Yu, K., Wu, S., Gu, J., Liu, Y., Dong, C., Loy, C. C., Qiao, Y., & Tang, X. (2018). ESRGAN: Enhanced super-resolution
generative adversarial networks. In Proceedings of the European Conference on Computer Vision Workshops (ECCVW).
[16] Wang, X., Xie, L., Dong, C., & Shan, Y. (2021). Real-ESRGAN: Training real-world blind super-resolution with pure synthetic data.
In Proceedings of the IEEE International Conference on Computer Vision (ICCV) Workshops.
[17] Bulat, A., Yang, J., & Tzimiropoulos, G. (2021). To learn image super-resolution, use a GAN to learn how to do image degradation first.
arXiv preprint arXiv:2003.04047.
Thank You