Emotion Analyser Report-1
Emotion Analyser Report-1
A PROJECT REPORT
Submitted By
Submitted in partial fulfilment of the required for the award of the degree
Bachelor of Technology
IN
Department of Computer Science & Engineering
I
Declaration by the Student
I hereby declare that the work reported in the B. Tech. project entitled as
“Emotion Analyser”, in partial fulfillment for the award of degree
of Bachelors in Technology in Computer Science & Engineering submitted at
Jaypee University of Engineering and Technology, Guna, as per best of my
knowledge and belief there is no infringement of intellectual property right and
copyright. In case of any violation I will solely be responsible.
Abhishek Yadav(201b343)
Naveen Paudel(211b192)
Date - 27/11/2023
II
JAYPEE UNIVERSITY OF ENGINEERING &
TECHNOLOGY
Grade ‘A+’ Accredited with by NAAC & Approved U/S 2(f) of the UGC Act, 1956
A.B. Road, Raghogarh, Dist: Guna (M.P.) India, Pin-473226
Phone: 07544 267051, 267310-14, Fax: 07544 267011
Website: www.juet.ac.in
CERTIFICATE
Signature of Supervisor
Designation
III
ACKNOWLEDGEMENT
We would like to express our gratitude and appreciation to all those who gave us
the opportunity to complete this project. Special thanks is due to our supervisor
Dr. Mahesh Kumar whose help, stimulating suggestions and encouragement
helped us in all the time of development process and in writing this report. We
also sincerely thanks for the time spent proofreading and correcting my many
mistakes. We would also like to thank our parents and friends who helped us a
lot in finalizing this project within the limited period. Last but not the least I am
grateful to all the team members of Emotion Analyser.
IV
ABSTRACT
The web-based platform allows users to effortlessly submit images for emotion analysis,
eliminating the need for complex installations or technical expertise. Through the integration
of advanced CNN architectures, the Emotion Analyser dissects visual data, capturing intricate
facial expressions and contextual cues to provide nuanced and accurate predictions of
emotional states.
Key features of the web interface include an intuitive upload mechanism, real-time analysis
feedback, and a user-friendly design to enhance accessibility. The project's seamless
integration with a website extends its applications to a diverse range of scenarios, including
sentiment-aware user interfaces, personalized content recommendations, and emotion-driven
analytics.
As users upload images, the Emotion Analyser processes the data with efficiency, delivering
prompt and accurate emotion predictions. This user-centric approach not only facilitates a
hassle-free experience but also opens avenues for widespread adoption across various
domains.
This project represents a significant step forward in making emotion analysis accessible and
user-friendly. By combining sophisticated CNN models with an intuitive web interface, the
Emotion Analyser strives to contribute to the development of empathetic technologies that
resonate with users, offering a seamless and engaging experience in the realm of emotion
recognition.
V
LIST OF FIGURES USED IN THE PROJECT REPORT
VI
TABLE OF CONTENTS
Declaration…………………………………………………………………………………... ii
Certificate…………………………………………………………………………………… iii
Acknowledgement…………………………………………………………………………... iv
Abstract ………………………………………………………………………………………v
List of Figures ……………………………………………………………………………… vi
VII
CHAPTER – 01: INTRODUCTION
Existing solutions often require intricate installations, technical proficiency, or lack a user-
friendly interface, impeding their integration into various domains. Furthermore, the need for
sophisticated emotion recognition, especially in contexts where facial expressions convey
nuanced information, demands advanced machine learning techniques like Convolutional
Neural Networks (CNNs). The challenge lies in bridging the gap between cutting-edge deep
learning models and a practical, user-centric application accessible to a wider audience.
Thus, the problem at hand is to design and implement the "Emotion Analyser" project—a
web-based system that seamlessly integrates CNNs for emotion prediction from user-
uploaded images. This project aims to address the current limitations in accessibility, ease of
use, and real-world applicability of emotion recognition technologies, fostering a more
inclusive and user-friendly approach to emotion analysis through a web interface. The goal is
to create a solution that not only accurately predicts emotions but also provides a hassle-free
and engaging experience for users across diverse domains.
1
1.2 Project Overview
User-Friendly Interface:
Capture intricate facial expressions and contextual cues for nuanced predictions.
Applications:
2
1.3 Hardware Specifications
Atleast 12 GB ROM .
Atleast 8 GB of RAM .
Numpy
Tensorflow(2.0)
Keras(2.2)
Node(20.10.0)
Express(4.18.2)
3
CHAPTER – 02: LITERATURE SURVEY
In the realm of facial expression analysis, foundational contributions have been made by
Ekman and Friesen with the development of the Facial Action Coding System (FACS) in
1978. This system laid the groundwork for subsequent research in facial expression analysis,
providing a comprehensive framework for understanding and categorizing facial movements.
Zhao et al. (2018) further advanced the field by introducing the concept of deep facial
expression recognition. Their work specifically focused on leveraging Convolutional Neural
Networks (CNNs) to extract intricate features from facial images, leading to more robust and
accurate emotion recognition.
In the domain of deep learning models, LeCun et al. (1998) made pioneering contributions
with early work on convolutional neural networks (CNNs). Their research established the
foundation for image classification using deep neural networks, a crucial development for
subsequent emotion recognition models.
More recently, Liu et al. (2020) proposed a significant advancement by presenting a multi-
modal deep learning approach. This approach integrates visual and textual cues to enhance
emotion prediction, acknowledging the importance of combining different modalities for a
more comprehensive understanding of emotions.
Datasets have played a pivotal role in the training and evaluation of emotion recognition
models. The CK+ dataset, introduced by Kanade et al. in 2000, and its extension, the
Extended Cohn-Kanade (CK+) dataset, have been widely utilized in the field. These datasets
provide a diverse range of facial expressions, contributing significantly to the benchmarking
and advancement of emotion recognition models.
4
The "Emotion Analyser" project aims to revolutionize image-based emotion recognition
through the integration of advanced Convolutional Neural Networks (CNNs) in a user-
friendly web platform. Rooted in foundational works like Ekman and Friesen's Facial Action
Coding System (FACS) from 1978, the system uses this historical knowledge to interpret
facial expressions, forming the core of its emotion recognition framework.
By incorporating Zhao et al.'s (2018) deep facial expression recognition using CNNs, the
Emotion Analyser harnesses deep learning to extract intricate features from facial images,
enhancing accuracy and robustness in emotion prediction. Inspired by Liu et al.'s (2020)
multi-modal approach, the system recognizes the importance of combining visual and textual
cues, fostering a holistic understanding of emotions.
Utilizing the widely recognized FER2013 dataset, consisting of 28,709 labeled images across
seven emotion categories, the Emotion Analyser ensures diversity in facial expressions,
enabling it to handle a broad spectrum of emotional nuances during training and evaluation.
Web-Based Platform: Integral to the proposed system is a user-friendly website that facilitates
seamless interactions. Users can effortlessly upload images for real-time emotion predictions,
ensuring accessibility without the need for complex installations or technical expertise.
Backend Server Hosting: The Emotion Analyser is backed by a robust server infrastructure
hosting the deep learning model. This backend server ensures efficient processing of user
requests, enabling real-time analysis and prompt feedback.
In summary, the Emotion Analyser envisions a user-centric, accessible, and accurate emotion
recognition platform. By combining insights from historical advancements in emotion
analysis, state-of-the-art deep learning methodologies, a user-friendly website, and leveraging
the FER2013 dataset, the proposed system aims to set new standards in web-based emotion
recognition. Applications include sentiment-aware interfaces, personalized recommendations,
and emotion-driven analytics, all enriched by the inclusivity and diversity afforded by the
FER2013 dataset.
5
CHAPTER – 03 : METHODOLOGY
3.1.1 PYTHON
Python is one of the most popular programming languages in the field of machine learning,
and it has become the de facto language for many machine learning tasks. Here are some
reasons why Python is widely used in the context of machine learning:
1. Extesive libraries
2. Community Support
3. Versatility
4. Integration with Other Technologies
Python is a versatile programming language used in various contexts. It's prominent in web
development with frameworks like Django and Flask. In data science and machine learning,
libraries like NumPy and TensorFlow make it a key player. Python is widely chosen for
automation and scripting due to its simple syntax. In scientific computing, libraries like SciPy
enhance its capabilities. Python is applied in AI, NLP, game development (e.g., Pygame),
desktop GUI apps (Tkinter, PyQt), networking, and cybersecurity (Scapy). It's a popular
choice in education for its readability. Python is used in IoT, and in finance for quantitative
analysis and algorithmic trading. Its adaptability and extensive libraries contribute to its
widespread use across diverse fields.
6
Google Colab, short for Colaboratory, is a cloud-based platform offered by Google that
provides a collaborative environment for writing, executing, and sharing Python code. Built
on the Jupyter Notebook framework, Colab supports interactive coding, allowing users to
blend code, text, and visualizations in a single document. Notably, Colab offers free access to
Graphics Processing Units (GPUs) and Tensor Processing Units (TPUs), enhancing the speed
of machine learning model training. Its seamless integration with Google Drive simplifies file
management and sharing, making it a valuable tool for collaborative projects. Colab is widely
used in education for teaching programming and data science due to its user-friendly interface
and collaborative features. Additionally, the platform comes with pre-installed libraries, such
as TensorFlow and PyTorch, making it easy for users to start working on data science and
machine learning tasks without extensive setup requirements. Despite some limitations, such
as session timeouts, Colab remains a popular choice for its accessibility, collaborative
capabilities, and the convenience of cloud-based computing.
3.1.3 TENSORFLOW
7
TensorFlow is a powerful open-source machine learning framework that plays a crucial role in
the development and deployment of machine learning models. Here's how TensorFlow helps
in machine learning:
The framework's efficiency in model training, support for hardware accelerators like GPUs
and TPUs, and seamless integration with other libraries position TensorFlow as a cornerstone
in the advancement of deep learning applications, enabling breakthroughs in artificial
intelligence and machine learning research.
3.1.5 KERAS
Keras is an open-source high-level neural networks API written in Python. It serves as a user-
friendly interface for building, training, and deploying deep learning models, providing a
modular and expressive structure. Originally developed as a user-friendly interface on top of
other deep learning frameworks, Keras has become an integral part of TensorFlow since its
integration in 2017, making it even more accessible to the broader deep learning community.
Key Features
1. User-Friendly API
2. Modularity
3. Compatibility
4. Extensibility
5. Wide Adoption
8
3.1.6 NUMPY
NumPy, short for Numerical Python, is a fundamental library in the Python ecosystem,
serving as a cornerstone for numerical computing and data manipulation. Its primary
functionality revolves around the efficient handling of arrays, matrices, and mathematical
operations, making it a critical tool for scientific computing, machine learning, and data
analysis.
At the core of NumPy is the `ndarray` (n-dimensional array), a versatile data structure that
allows for the representation of multi-dimensional datasets. These arrays provide a
homogenous, memory-efficient way to store and manipulate numerical data. The beauty of
NumPy lies in its ability to perform element-wise operations on entire arrays, eliminating the
need for explicit loops and vastly improving computational efficiency.
One of NumPy's key strengths is its extensive set of mathematical functions that operate on
entire arrays. These functions enable users to perform complex operations with concise and
readable code, enhancing the expressiveness of mathematical and scientific computations.
Whether it's linear algebra operations, statistical calculations, or Fourier transforms, NumPy
provides a rich set of functions that form the backbone of many scientific applications.
Beyond numerical operations, NumPy facilitates array manipulation and reshaping, allowing
users to transform and reorganize data effortlessly. Slicing, indexing, and broadcasting are
powerful features that contribute to the flexibility and ease of use of the library, particularly in
the context of data preprocessing and analysis.
NumPy's seamless integration with other Python libraries, such as SciPy (Scientific Python)
and Matplotlib (for data visualization), strengthens its position as an essential tool for
scientific computing. The interoperability of these libraries creates a cohesive ecosystem that
supports a wide range of scientific and engineering applications.
In conclusion, NumPy is a critical component of the Python data science stack, providing a
powerful and efficient array-processing library for numerical computing. Its intuitive array
operations, extensive mathematical functions, and seamless integration with other libraries
9
make it an indispensable tool for researchers, engineers, and data scientists working in various
domains.
3.1.7 CNN
Convolutional Neural Networks (CNNs) have become a cornerstone in the field of deep
learning, particularly in the domain of computer vision. Their success can be attributed to
their ability to automatically learn and hierarchically represent intricate patterns within data.
The distinctive architecture of CNNs, characterized by convolutional layers, pooling layers,
and fully connected layers, facilitates the extraction of meaningful features from input data.
In the convolutional operation, filters or kernels slide across the input, capturing spatial
hierarchies of features. This allows the network to discern low-level features, like edges and
textures, in initial layers and progressively assemble them into more complex, high-level
features in deeper layers. The parameter sharing mechanism, where the same set of weights is
applied across different spatial locations, enhances the network's ability to recognize patterns
irrespective of their position in the input.
Architecture:
1. Convolutional Layers - These layers apply convolutional operations to input data, utilizing
filters or kernels to extract features such as edges, textures, or more complex patterns.
3. Fully Connected Layers: After extracting features through convolutional and pooling layers,
fully connected layers are employed to make predictions or classifications. These layers
connect every neuron to every neuron in the preceding and succeeding layers.
4. Activation Functions: Non-linear activation functions, like ReLU (Rectified Linear Unit),
are applied to introduce non-linearity and enable the network to learn more complex patterns.
10
Applications:
2. Object Detection: In tasks like object detection, CNNs can identify and locate multiple
objects within an image.
3. Segmentation: CNNs are used in image segmentation to classify each pixel within an image,
enabling tasks like identifying and delineating objects.
4. Face Recognition: CNNs have been successful in face recognition applications, including
real-time face detection and identification.
5.Medical Imaging: CNNs are employed for tasks such as tumor detection in medical images.
6. Natural Language Processing: CNNs can be adapted for processing sequences of data,
making them suitable for tasks like sentiment analysis in text.
11
3.2 Our Emotion Analyser Model :
Our minor project, the Emotion Analyzer Model, focuses on developing an effective emotion
detection system using Convolutional Neural Networks (CNNs). The primary objective is to
discern between two fundamental emotions—happy and sad—by analyzing visual cues from
facial expressions. This project aligns with the growing interest in emotion recognition
applications and its potential impact on diverse fields, from human-computer interaction to
mental health assessments.
In today's digital era, understanding human emotions is crucial for creating empathetic and
responsive technologies. Our Emotion Analyzer Model aims to contribute to this realm by
building a CNN-based system capable of recognizing and categorizing facial expressions into
two emotions: happy and sad. This minor project serves as an exploration into the field of
computer vision and deep learning, providing valuable insights into emotion detection
techniques.The model's architecture is based on a CNN, a class of deep neural networks
known for their effectiveness in image-related tasks. The input to the model consists of
grayscale facial images, emphasizing the network's ability to capture relevant facial features
without the complexity of color information. We utilized popular deep learning libraries, such
as TensorFlow and Keras, to implement and train the CNN architecture.
Input Layer:
• The first layer uses the Conv2D function with 16 filters of size (3,3), a stride of 1, and
the 'relu' (Rectified Linear Unit) activation function.
MaxPooling Layer:
12
• Following each convolutional layer, a MaxPooling2D layer is added to downsample
the spatial dimensions of the output.
Convolutional Layers:
• Two additional convolutional layers are added, each with 32 and 16 filters,
respectively. The filter size and activation function remain consistent across all
convolutional layers.
Flatten Layer:
• After the convolutional layers, a Flatten layer is introduced to convert the 3D spatial
data into a 1D vector. This is a necessary step before transitioning to fully connected
layers.
Dense Layers:
Summary:
• The model is structured with a sequence of convolutional and pooling layers for
feature extraction, followed by fully connected layers for classification.
13
• The 'relu' activation function is used throughout the convolutional and dense layers to
introduce non-linearity.
• The output layer uses the 'sigmoid' activation function, suggesting that this model is
configured for binary classification tasks, where the output represents a probability
(sigmoid output) for a binary class (e.g., 0 or 1).
14
3.3 Sequence Diagram
The Emotion Analyser project enables users to leverage the power of image-based emotion
recognition through a straightforward and user-friendly process. In this use case, we outline
the steps involved when a user sends an image to the model, initiating a seamless process that
involves the backend server.
The Emotion Analyser project's use case illustrates a streamlined process where users can
easily submit images for emotion recognition. The backend server, housing the deployed
model, plays a pivotal role in orchestrating the image analysis and relaying results promptly
to the user. This user-centric approach provides a convenient and efficient means for
individuals to explore and understand the emotional content captured in facial expressions,
enhancing the overall user experience.
16
CHAPTER – 04 : ANALYSIS & RESULTS
After thorough testing and validation of our machine learning model for the Emotion
Analyser project, the results are highly promising, indicating the successful functioning of the
system. We initiated the testing process using two simple directories, and the model
demonstrated satisfactory performance. Encouraged by these initial results, we are confident
in expanding the model's capabilities by incorporating additional directories and diverse types
of emotional expressions.
Throughout the training process, our model underwent multiple epochs, refining its accuracy
at each stage. The achieved accuracy of 72% on the training dataset is a strong indicator of the
model's proficiency in recognizing emotions from images. Moreover, the validation accuracy
of 83% underscores the generalization capabilities of the model, providing assurance of its
effectiveness on new, unseen data.
Given that the model was trained on a relatively small dataset, the observed accuracy of 72%
signifies its potential for improvement with larger datasets. As we scale the model with more
diverse data, we anticipate a corresponding increase in accuracy, further enhancing its ability
to discern a broader spectrum of emotions.
In addition to quantitative assessments, we conducted extensive testing using images from our
test directory. This real-world validation process allowed us to verify the accuracy and
reliability of the Emotion Analyser in practical scenarios. The findings, including graphs and
training details, are meticulously documented and attached below for convenient reference.
Key Findings:
17
Real-world Testing: Successful validation using a diverse set of images from the test
directory.
These findings underscore the success of the Emotion Analyser project, demonstrating its
efficacy in recognizing emotions from facial expressions. As we move forward, we are
committed to continuous improvement, incorporating user feedback, expanding the dataset,
and exploring avenues for enhancing the model's accuracy and real-world applicability. The
achievement of a 72% accuracy rate marks a significant milestone, showcasing the potential
impact of the Emotion Analyser in diverse applications such as sentiment analysis, user
engagement, and personalized interactions.
18
Fig 4.2 - Graph between validation loss and training loss
19
Fig 4.4 - Model Summary
20
Fig 4.5 - Binary Classification for our Model
Binary classification deep learning model for distinguishing between happy and sad images
typically involves training a neural network to learn meaningful representations of features
from the input images. The model consists of layers of interconnected nodes, each with
associated weights, which are adjusted during training to minimize the difference between
predicted and actual labels. Input images are fed into the model, and through forward
propagation, the network computes a probability score for each class (happy or sad). The
model is trained using labeled data, adjusting its parameters through backpropagation and
optimization algorithms to iteratively improve its predictions. Once trained, the model can
classify new images as happy or sad based on learned patterns, enabling it to generalize its
understanding of the emotional content in unseen data.
21
4.1 Results:
23
Summary:
The model analysis summary provides three key metrics for evaluation: precision, recall, and
F1 score. The precision, given by 0.7099237, measures the accuracy of the model in correctly
predicting positive instances (e.g., identifying happy images) among all instances it predicted
as positive. A higher precision indicates fewer false positives. The recall, at 0.8378, represents
the model's ability to correctly identify positive instances out of all actual positive instances.
A higher recall suggests fewer false negatives. Lastly, the F1 score, computed as the harmonic
mean of precision and recall (0.75), provides a balanced measure that considers both false
positives and false negatives. These metrics collectively indicate the model's effectiveness in
classifying happy and sad images, with higher values generally reflecting better performance.
Further contextual information, such as the dataset size and class distribution, would be
valuable for a comprehensive assessment of the model's overall performance.
24
CHAPTER - 05 : CONCLUSION & REMARKS
In conclusion, the "Emotion Analyser" project represents a significant stride in the domain of
image-based emotion recognition, seamlessly integrating cutting-edge technologies with user-
friendly accessibility. The incorporation of advanced Convolutional Neural Networks (CNNs),
inspired by foundational works in facial expression analysis, deep learning models, and
leveraging the influential FER2013 dataset, positions the system at the forefront of emotion
recognition research.
The project's emphasis on a user-friendly web platform opens new avenues for widespread
adoption, eliminating barriers to entry and making emotion analysis accessible to a diverse
user base. Users can effortlessly upload images, and the real-time feedback from the backend
server, hosting the deep learning model, ensures a prompt and engaging experience.
The multi-modal learning approach, guided by Liu et al.'s methodology, acknowledges the
complexity of human emotions and enriches the system's understanding by combining visual
and textual cues. This holistic approach contributes to the system's ability to capture nuanced
emotional states, fostering a more comprehensive user experience.
The inclusion of the FER2013 dataset, known for its diversity in facial expressions, reinforces
the robustness of the Emotion Analyser. By training on this dataset, the system becomes adept
at handling a broad spectrum of emotional nuances, enhancing its accuracy and adaptability in
real-world scenarios.
In remark, the Emotion Analyser not only stands as a technological innovation but also as a
testament to the collaborative efforts in the fields of deep learning, human-computer
interaction, and emotion psychology. As the project moves forward, continuous iterations,
25
user feedback, and advancements in technology will further refine the Emotion Analyser,
solidifying its role as a pioneering solution in the dynamic landscape of emotion recognition.
26
CHAPTER - 06 : REFERENCES
Ekman, P., & Friesen, W. V. (1978). Facial Action Coding System: A Technique for
the Measurement of Facial Movement.
Zhao, G., Mao, X., & Chen, L. (2018). Facial expression recognition from near-
infrared videos with unknown illumination pattern.
Liu, Z., Song, H., & Li, X. (2020). Multimodal Deep Learning for Affective Analysis
and Aesthetic Prediction.
27
Team Members