0% found this document useful (0 votes)
16 views

Research Methods Speech Recognition

Uploaded by

ajac238
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
16 views

Research Methods Speech Recognition

Uploaded by

ajac238
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 17

Write date here

Speech
Recognition in ML
An Application of Machine Learning

Hosios Samuel Kirubakaran


Jason Kishore Mohan Raj
Surendaranath Kanniyappan
Agenda Case Study Highlight
EDNUS is used as an example application of speech
recognition, adapted based on findings from the research paper
“Sparse Autoencoder-Based Speech Emotion Recognition.”

1 Machine Learning 8 Workflow of EDNUS

2 Speech Recognition in ML 9 Super Learner Model

3 The Power and Potential 10 Deep Auto Encoder

4 Core Aspects 11 Results & Traction

5 EDNUS Overview 13 Summary of Key Takeaways

6 EDNUS’ Technical Foundations 14 References


Key Layers

Machine Machine Learning

Learning Supervised &


Unsupervised Learning

Neural Networks &


Deep Learning

Applications of Machine Learning


Speech Recognition
Natural Language Processing (NLP) & NLP
Speech Recognition
Image Recognition and Computer Vision
Automation and Robotics
Recommendation Systems Emotion Detection
Predictive Analytics
Fraud Detection and Cybersecurity
Healthcare and Biomedicine
Autonomous Vehicles
Finance and Banking

1
[2]

Speech
Recognition in ML
Converting spoken language to text using
computational models.

Key Applications:
• Virtual Assistants (e.g., Siri, Alexa)
• Healthcare Diagnostics
• Emotion Analysis for Mental Health

Why ML for Speech?


• Adaptability to accents, intonations, and
environments
• Efficiency in real-time processing
• Emotional insights through feature
extraction

2
[2]

The Power and Potential


Purpose Impact

• Accessibility • Inclusive Technology


• Human-Computer Interaction • Industry Transformation
• Healthcare Support • Research Advancements

Current Landscape Future Potential

Virtual assistants, Context-aware multilingual


Healthcare accessibility, capabilities
Enabling real-time Enhanced privacy
interactions Emotional understanding

3
Core Speech Recognition
Aspects

Accuracy in Speech-to-Text Real-Time Processing Emotion and Sentiment Analysis

Ensure high precision in Enable rapid analysis and Go beyond basic transcription by
transcribing spoken language, response for applications that identifying emotional tones and
addressing challenges like rely on immediate feedback, such sentiments in speech, useful for
accents, background noise, and as virtual assistants and applications in mental health and
diverse dialects. interactive customer support. user experience enhancement.

4
[1]

EDNUS
Emotion Detection of Neurological disorder
Using Speech
EDNUS demonstrates the application of speech
recognition in detecting emotions, which is
crucial for supporting individuals with
neurological disorders.
Provides real-time emotional insights for better
support in patient care.
Focuses on speech-based emotion recognition,
addressing a key need in mental health.
Offers a non-invasive, accessible approach for
monitoring emotional states.
Its use of deep autoencoders and ensemble
models addresses challenges in capturing
subtle emotional cues, providing an innovative
approach within healthcare, unlike other
proposed models.

5
[1]

EDNUS’ Technical Foundations


Emotion Detection: EDNUS analyzes speech signals to detect emotions,
focusing on tonal features.

ML Techniques: Uses deep autoencoders for feature extraction and


ensemble classifiers for accurate emotion classification.

Key Features: Extracts MFCC, chroma, and Mel spectrogram features to


capture emotional markers.

Architecture: Combines autoencoders for noise reduction with a meta-


learner for refined predictions.

Purpose: Provides real-time emotional insights to support neurological care.

6
Classification of
Emotions

Happy Sad Calm Angry

[3]

7
Workflow Algorithm/ SLM / DAE

Feature Engineering Model

The workflow starts with audio


upload, where speech features are Driver Code
extracted and processed through an
ML algorithm. Results are then
displayed via the user interface,
providing real-time emotional
feedback based on the speech input.

8
Super Learner Model
The Logistic Regression meta-learner
combines predictions from base models
to produce a final, more accurate output,
effectively optimizing each model’s
contribution to improve overall
performance.

9
Feature Extraction using DAE

Deep Auto Encoder


A Deep Autoencoder is a neural network that
compresses and reconstructs data, capturing
key features and reducing noise.

Architecture: Includes an encoder for


compression, a decoder for reconstruction, and
a bottleneck layer for dimensionality reduction.

Role in EDNUS: Extracts vital speech features


to improve emotion detection accuracy.

Benefits: Enhances classification by focusing


on relevant features, leading to more precise
emotion recognition.

[5]

10
Results
With and Without Auto Encoder

With Deep Auto Encoder Without Deep Auto Encoder

Accuracy Score 84 76

Cohen Kappa Score 78 68

F1 Score 84 76

Jaccard Score 72 61

Hamming Loss 0.1614 0.2395

11
Our Traction
Benefits of using Deep Auto Encoder

INFERENCE

High accuracy due to deep autoencoder and


feature selection.

IMPACT

With the Deep Autoencoder, the model


demonstrates consistent improvement across
all metrics, with increases ranging from
10.5% to 18%

effectively reduces the Hamming Loss by


approximately 33%

[4]
12
Summary of Key
Takeaways
• Accurate Emotion Detection: EDNUS effectively captures emotional
nuances in speech using deep autoencoders for feature extraction.

• Ensemble Learning: The system’s use of a meta-learning model combines


multiple classifiers, optimizing performance across varied speech inputs.

• Real-Time Application: EDNUS provides real-time feedback, making it


suitable for healthcare scenarios where immediate emotional insights are
valuable.

• Impact on Neurological Care: By focusing on non-invasive emotion


detection, EDNUS offers a supportive tool for understanding patient well-
being.

13
Back to Agenda
References
[1] https://www.shutterstock.com

[2] https://chatgpt.com

[3] https://www.canva.com

[4] https://colab.research.google.com

[5] https://doi.org/10.1007/978-981-19-2130-8_42

14
Thank you!

15

You might also like