International Journal of Engineering Trends and Applications (IJETA) – Volume 11 Issue 3 May - Jun 2024
RESEARCH ARTICLE OPEN ACCESS
Enhanced Human Age and Gender Detection Using Convolutional
Neural Networks
Santosh Kumar, Harshvardhan Tailor, Hemant Singh Jadoun,
Mandeep Kumar Biloniya, Aryan Jangid
Department of CS/IT/AI&DS, Global Institute of Technology, Jaipur
Department of AI&DS, Global Institute of Technology, Jaipur
Department of AI&DS, Global Institute of Technology, Jaipur
Department of AI&DS, Global Institute of Technology, Jaipur
Department of AI&DS, Global Institute of Technology, Jaipur
ABSTRACT
The primary motivation behind developing an automated method for gender and age detection in humans stems from its crucial
role in pattern recognition and computer vision. Beyond age and facial emotion detection, it holds significance in various
applications within computer vision, particularly in non-verbal communication methods such as facial expressions and hand
gestures, which are integral to human-computer interaction.
While considerable research has successfully advanced computer modeling of age, gender, and emotions, it still lags behind the
capabilities of the human vision system. In this project, we propose an architecture for age and gender classification utilizing
Convolutional Neural Networks (CNNs). Our model demonstrates superior accuracy in both age and gender detection compared
to other classifier-based methods. Additionally, for modeling human emotions, we aim to predict emotions using deep CNNs to
enhance future marketing strategies.
We employ the Viola-Jones pre-processing algorithm to extract features from images, which are then fed as input to the CNN.
The results of the detection process are presented to the user through a well-designed user interface.
Keywords — Face Detection, Viola-Jones, Deep CNN
Figure 1: CNN architecture
I. INTRODUCTION
Age detection from images plays a vital role in human and computer vision and has a wide range of applications, such as forensics and social media. Facial images can also be used to detect other human attributes, such as gender and emotion. Extensive research has already been conducted on detecting age from facial features. Various standard public datasets can be used for real-time age detection, which makes it possible to compare the performance of different methods publicly. As a result, a large body of active research exists, with several recent works using Convolutional Neural Networks (CNNs) for feature extraction.
Facial expressions are a form of non-verbal communication between humans, and their interpretation is being widely studied. Facial expressions play an important role in human interaction, and Facial Expression Recognition (FER) algorithms based on computer vision support applications such as human-computer interaction and data analytics.
II. LITERATURE SURVEY
Our exploration of the background began with research papers and online blog posts relevant to our topic. One notable research paper introduces a novel framework for facial expression recognition utilizing an attentional convolutional network. Attention proves crucial in detecting facial expressions, enabling neural networks with fewer than 10 layers to rival much deeper networks in emotion recognition tasks. Despite successful face recognition, challenges persist due to factors such as illumination, pose variation, facial expressions, and facial components like eyebrows, nose, and mouth length. To address these, the researchers utilized the Dlib library together with OpenCV to handle face recognition tasks, employing methods such as AdaBoost for feature selection and the Viola-Jones algorithm for Haar-like feature extraction. The extracted features are then fed into a CNN model for processing. Their deep model was trained on a vast dataset comprising four million images for face recognition purposes. This model serves as the foundation for our facial attribute recognizers, which are further fine-tuned for tasks including apparent age estimation, gender recognition, and emotion recognition. We collected images from various sources, amassing over four million images encompassing more than 40,000 individuals for facial recognition. Each
ISSN: 2393-9516 www.ijetajournal.org Page 251
image is labeled with gender, and the data is annotated with emotions. These images undergo a semi-automated trimming process, with human annotators involved to ensure accuracy. Subsequently, the images undergo preprocessing to extract and align the faces.
III. DEVELOPMENT PROCESS
Figure 2: Proposed System Architecture
Our age and gender detection builds on the progress in image classification and object detection fueled by deep learning. From deep learning, we take four key ideas that we apply to our solution:
• The deeper the neural network (through a sheer increase in parameters and model complexity), the better its capacity to model highly non-linear transformations, up to some optimal depth on current architectures.
• The larger and more diverse the datasets used for training, the better the network learns to generalize and the more robust it becomes to over-fitting.
• The alignment of the object in the input image impacts the overall performance.
• When the training dataset is small, use a network pre-trained on comparable inputs to benefit from the transferred knowledge.
Our procedure always begins by rotating the input image at various angles and keeping the face detection with the highest score. We then align the face using that angle and crop it for the subsequent steps. This is a simple and effective procedure that does not involve facial landmark detection. We use the deep VGG-16 architecture for our Convolutional Neural Network (CNN). We start from CNNs pre-trained on the large ImageNet dataset for image classification. The representations learned for discriminating 1000 object categories give us a meaningful representation and a warm start for further fine-tuning on relatively smaller face datasets.
Fine-tuning the CNN on facial images with age annotations is an important step for better performance, because the CNN adapts to the particular data distribution and to the task of age detection. Because facial images with apparent age annotations are scarce, we take advantage of fine-tuning on face images crawled from the internet.
We collected 523,050 face images from the IMDb and Wikipedia websites to form our new dataset, IMDB-WIKI.
While age estimation is naturally a regression problem, we extend our approach by framing it as a multi-class classification over age categories, followed by a refinement that takes the expected value over the SoftMax output.
Our main contributions are as follows:
1. The UTK Face dataset is the largest dataset with real age and gender annotations.
2. A new regression approach is employed, combining deep classification with subsequent refinement of expected values.
3. The FER-2013 (Kaggle) dataset is used for emotion detection.
• We introduced the IMDB-WIKI dataset for age detection and offer a detailed analysis of the proposed DEX system. We applied the method and reported the results on standard age estimation datasets. Additionally, for emotion detection and classification, we extensively evaluated and tested various pre-processing techniques and model architectures. Through our experimentation, we developed a custom Convolutional Neural Network (CNN) model that achieved an accuracy of 70.47% on the FER-2013 test set.
• During pre-processing, we experimented with centered and scaled data and found that subtracting the mean significantly helps to align the data distribution across all sets before training and evaluation. To further enhance our model's robustness, we implemented data augmentation techniques such as random rotation, shifting, flipping, cropping, and shearing of training images. This approach led to approximately a ten percent reduction in error.
• We implemented several CNN architectures sourced from different papers for emotion detection across various datasets. Ultimately, our custom-developed CNN architecture yielded the best performance. However, we acknowledge that detecting errors in neural networks can be challenging.
• Through our analysis, we observed errors across different emotion classes and conducted visual inspections of images classified correctly and incorrectly. An early observation was our difficulty in accurately classifying certain emotions, particularly those reliant on fine details in images, such as small facial features or curves. To address this, we increased the number of layers and reduced filter sizes in our network to enhance its capacity to capture intricate details.
• However, this adjustment led to overfitting issues, which we mitigated by implementing dropout, early stopping at approximately 100 epochs, and augmenting our training set. We noted that the network was effectively learning only training-set noise after achieving approximately 70% accuracy on
the development set, as evidenced by the accuracy plot during training.
• Moving forward, we suggest further exploration into enabling increased parameterization of the network for improved performance.
• During real-time classification using OpenCV's Haar cascades to detect and extract a face region from a webcam video feed, we found that it is best neither to subtract the training mean nor to normalize the pixels within the detected face region before classification.
• In real-time classification, our model demonstrated strengths in detecting neutral, happy, surprised, and angry emotions. However, illumination emerged as a crucial factor influencing the model's performance. This suggests that our training set may not accurately represent the distribution of emotions under lower brightness conditions on the screen. Further investigation into this aspect could enhance the robustness of our model.
IV. IMPLEMENTATION DETAILS
A. Dataset Download: We start by acquiring image sets from two datasets: UTK Face, which provides gender and age labels, and FER-2013.
B. Label Encoding Function: Once downloaded, we encode labels using one-hot encoding. This is necessary because Convolutional Neural Networks (CNNs) require numeric values for processing.
C. Resizing and Pre-processing: Images are resized to a standard size of 256x256 pixels and converted to grayscale, after which Histogram of Oriented Gradients (HOG) features are computed. For emotion detection, the Viola-Jones algorithm with AdaBoost is employed to extract facial features suited to emotion recognition, and the image is reshaped to focus solely on the face.
D. Feature Extraction and Integration: Facial features such as eyebrow shape, nose, mouth, and facial contours are extracted using the Dlib library together with OpenCV. These features, along with age, gender, and emotional attributes, are integrated for training.
E. Training and Testing: The CNN model is constructed based on the VGG-16 architecture and trained over multiple epochs. A Gaussian loss function is utilized to filter out distorted or irrelevant images. During testing, users input images, and the model predicts age, gender, and emotion based on what it learned from the training images.
F. Output: Finally, fresh images undergo the same feature extraction process, with the extracted features fed into the trained model to predict the corresponding labels.
V. REQUIREMENT ANALYSIS
A. SOFTWARE:
Prerequisites are:
• Keras 2 (with TensorFlow backend)
• OpenCV
• Python 3.5 (the TensorFlow release used did not support later Python versions)
• NumPy
• TensorFlow
• h5py (for Keras model serialization)
B. HARDWARE:
Intel Core processor with a high clock frequency and a powerful GPU.
C. DATASET:
a. UTK Face – for age and gender.
b. FER-2013 – for emotion detection.
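The label-encoding step of Section IV.B can be illustrated with a minimal sketch in plain Python. The helper `one_hot` and the example labels are hypothetical and are not taken from the authors' code. The emotion list follows the commonly used FER-2013 class ordering, and the gender names are assumed for illustration.

```python
def one_hot(labels, classes):
    """Return one one-hot vector per label, with positions ordered like `classes`."""
    index = {name: i for i, name in enumerate(classes)}
    encoded = []
    for label in labels:
        row = [0.0] * len(classes)
        row[index[label]] = 1.0  # raises KeyError for a label not in `classes`
        encoded.append(row)
    return encoded


# Assumed class lists: the usual FER-2013 ordering and two illustrative gender names.
EMOTION_CLASSES = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]
GENDER_CLASSES = ["male", "female"]

# Hypothetical labels standing in for values read from the dataset metadata.
gender_vectors = one_hot(["female", "male", "female"], GENDER_CLASSES)
emotion_vectors = one_hot(["happy", "neutral"], EMOTION_CLASSES)

print(gender_vectors)
print(emotion_vectors[0])
```

Each vector has a single 1.0 at the position of its class, which is the target format expected by a SoftMax output layer. With integer class indices, the Keras utility `to_categorical` would likely serve the same purpose.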
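The expected-value refinement mentioned in Section III (age estimation as multi-class classification, refined through the SoftMax output) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the age classes are assumed to be the integers 0 to 100, and the logits are synthetic values standing in for the output of the network's last layer.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_age(logits, ages):
    """Refine a classification output into one age: sum of age times probability."""
    probs = softmax(logits)
    return sum(p * a for p, a in zip(probs, ages))


# Assumption: one class per year of age from 0 to 100.
ages = list(range(0, 101))

# Synthetic logits that peak at age 30, standing in for real network outputs.
logits = [-((a - 30) ** 2) / 50.0 for a in ages]

estimate = expected_age(logits, ages)
print(round(estimate, 2))
```

Compared with taking only the most probable class, the expected value uses the whole probability distribution, so the estimate can fall between two neighbouring age classes.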