Research Methods Speech Recognition

Uploaded by

ajac238


Speech
Recognition in ML
An Application of Machine Learning

Hosios Samuel Kirubakaran

Jason Kishore Mohan Raj
Surendaranath Kanniyappan
Agenda

1 Machine Learning
2 Speech Recognition in ML
3 The Power and Potential
4 Core Aspects
5 EDNUS Overview
6 EDNUS’ Technical Foundations
8 Workflow of EDNUS
9 Super Learner Model
10 Deep Auto Encoder
11 Results & Traction
13 Summary of Key Takeaways
14 References

Case Study Highlight
EDNUS is used as an example application of speech recognition, adapted based on findings from the research paper “Sparse Autoencoder-Based Speech Emotion Recognition.”

Machine Learning

Key Layers
• Machine Learning
• Supervised & Unsupervised Learning
• Neural Networks & Deep Learning

Applications of Machine Learning
• Natural Language Processing (NLP) & Speech Recognition
• Image Recognition and Computer Vision
• Automation and Robotics
• Recommendation Systems
• Predictive Analytics
• Fraud Detection and Cybersecurity
• Healthcare and Biomedicine
• Autonomous Vehicles
• Finance and Banking

Highlighted applications: Speech Recognition, NLP, Emotion Detection

1
[2]

Speech
Recognition in ML
Converting spoken language to text using
computational models.

Key Applications:
• Virtual Assistants (e.g., Siri, Alexa)
• Healthcare Diagnostics
• Emotion Analysis for Mental Health

Why ML for Speech?

• Adaptability to accents, intonations, and
environments
• Efficiency in real-time processing
• Emotional insights through feature
extraction

2
[2]

The Power and Potential

Purpose
• Accessibility
• Human-Computer Interaction
• Healthcare Support

Impact
• Inclusive Technology
• Industry Transformation
• Research Advancements

Current Landscape
• Virtual assistants
• Healthcare accessibility
• Enabling real-time interactions

Future Potential
• Context-aware multilingual capabilities
• Enhanced privacy
• Emotional understanding

3
Core Speech Recognition Aspects

Accuracy in Speech-to-Text: Ensure high precision in transcribing spoken language, addressing challenges like accents, background noise, and diverse dialects.

Real-Time Processing: Enable rapid analysis and response for applications that rely on immediate feedback, such as virtual assistants and interactive customer support.

Emotion and Sentiment Analysis: Go beyond basic transcription by identifying emotional tones and sentiments in speech, useful for applications in mental health and user experience enhancement.

4
[1]

EDNUS
Emotion Detection of Neurological disorder
Using Speech
• EDNUS demonstrates the application of speech recognition in detecting emotions, which is crucial for supporting individuals with neurological disorders.
• Provides real-time emotional insights for better support in patient care.
• Focuses on speech-based emotion recognition, addressing a key need in mental health.
• Offers a non-invasive, accessible approach for monitoring emotional states.
• Its use of deep autoencoders and ensemble models addresses challenges in capturing subtle emotional cues, which sets its approach apart from other proposed models in healthcare.

5
[1]

EDNUS’ Technical Foundations

Emotion Detection: EDNUS analyzes speech signals to detect emotions, focusing on tonal features.

ML Techniques: Uses deep autoencoders for feature extraction and ensemble classifiers for accurate emotion classification.

Key Features: Extracts MFCC, chroma, and Mel spectrogram features to capture emotional markers.

Architecture: Combines autoencoders for noise reduction with a meta-learner for refined predictions.

Purpose: Provides real-time emotional insights to support neurological care.
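
MFCC and Mel spectrogram features are both built on the mel frequency scale. The slides do not say which formula or library EDNUS uses, so the sketch below is only an illustration: it assumes the widely used conversion mel = 2595 · log10(1 + f / 700), and the helper names (`hz_to_mel`, `mel_to_hz`, `mel_center_frequencies`) are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (assumed formula, not from the slides)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min, f_max):
    """Center frequencies (Hz) of filters spaced evenly on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


# Hypothetical settings: 5 filters between 0 Hz and 8000 Hz.
centers = mel_center_frequencies(5, 0.0, 8000.0)
for c in centers:
    print(round(c, 1))
```

Because the filters are evenly spaced in mels, their centers sit closer together at low frequencies and spread out at high frequencies, which roughly mirrors how pitch is perceived.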

6
Classification of
Emotions

Happy Sad Calm Angry

[3]

7
Workflow

The workflow starts with audio upload, where speech features are extracted and processed through an ML algorithm. Results are then displayed via the user interface, providing real-time emotional feedback based on the speech input.

[Diagram labels: Feature Engineering; Algorithm / Model (SLM / DAE); Driver Code]

8
Super Learner Model
The Logistic Regression meta-learner
combines predictions from base models
to produce a final, more accurate output,
effectively optimizing each model’s
contribution to improve overall
performance.
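
The stacking idea can be sketched in a few lines. This is a toy illustration under stated assumptions, not the EDNUS implementation: it uses two classes instead of four emotions, two made-up base models whose predicted probabilities are hard-coded in `BASE_PREDS`, and a plain gradient-descent logistic regression as the meta-learner. All names and numbers are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_learner(base_preds, labels, epochs=2000, lr=0.5):
    """Fit a logistic-regression meta-learner on base-model outputs."""
    n_models = len(base_preds[0])
    n = len(labels)
    weights = [0.0] * n_models
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_models
        grad_b = 0.0
        for row, y in zip(base_preds, labels):
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, row)))
            # Gradient of the mean log-loss with respect to weights and bias.
            for j, x in enumerate(row):
                grad_w[j] += (p - y) * x / n
            grad_b += (p - y) / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


def predict(weights, bias, row):
    """Final class from the meta-learner (1 if probability >= 0.5)."""
    z = bias + sum(w * x for w, x in zip(weights, row))
    return 1 if sigmoid(z) >= 0.5 else 0


# Hypothetical base-model probabilities: column 0 is a reliable model,
# column 1 is a noisy one. Labels are the true classes.
BASE_PREDS = [(0.1, 0.6), (0.2, 0.4), (0.3, 0.5), (0.2, 0.3),
              (0.8, 0.4), (0.9, 0.6), (0.7, 0.5), (0.8, 0.7)]
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = fit_meta_learner(BASE_PREDS, LABELS)
print("learned weights:", weights, "bias:", bias)
```

The learned weights show how the meta-learner decides how much each base model contributes. In practice the meta-learner is usually trained on out-of-fold base predictions to avoid leaking training labels, a step this sketch omits.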

9
Feature Extraction using DAE

Deep Auto Encoder

A Deep Autoencoder is a neural network that compresses and reconstructs data, capturing key features and reducing noise.

Architecture: Includes an encoder for compression, a decoder for reconstruction, and a bottleneck layer for dimensionality reduction.

Role in EDNUS: Extracts vital speech features to improve emotion detection accuracy.

Benefits: Enhances classification by focusing on relevant features, leading to more precise emotion recognition.

[5]
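
The encoder, bottleneck, and decoder roles can be shown with a deliberately tiny example. The sketch below is a single-layer linear autoencoder, far simpler than the deep autoencoder described above. It compresses 2-D points into one bottleneck value and reconstructs them. The data, initial weights, and learning rate are all assumptions chosen for illustration.

```python
# Toy data: 2-D points that lie on a line, so one bottleneck unit suffices.
DATA = [(t, 2.0 * t) for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]


def encode(enc, x):
    """Encoder: 2 inputs -> 1 bottleneck value."""
    return enc[0] * x[0] + enc[1] * x[1]


def decode(dec, z):
    """Decoder: 1 bottleneck value -> 2 reconstructed outputs."""
    return (dec[0] * z, dec[1] * z)


def reconstruction_loss(enc, dec, data):
    """Mean squared reconstruction error over the data set."""
    total = 0.0
    for x in data:
        x_hat = decode(dec, encode(enc, x))
        total += sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total / len(data)


def train(data, epochs=1000, lr=0.02):
    """Train encoder and decoder weights by batch gradient descent."""
    enc, dec = [0.1, 0.1], [0.1, 0.1]
    n = len(data)
    for _ in range(epochs):
        g_enc, g_dec = [0.0, 0.0], [0.0, 0.0]
        for x in data:
            z = encode(enc, x)
            err = [dec[j] * z - x[j] for j in range(2)]
            for j in range(2):
                g_dec[j] += 2.0 * err[j] * z / n
            # Backpropagate the error through the decoder to the bottleneck.
            dz = sum(2.0 * err[j] * dec[j] for j in range(2))
            for i in range(2):
                g_enc[i] += dz * x[i] / n
        enc = [enc[i] - lr * g_enc[i] for i in range(2)]
        dec = [dec[j] - lr * g_dec[j] for j in range(2)]
    return enc, dec


initial_loss = reconstruction_loss([0.1, 0.1], [0.1, 0.1], DATA)
enc, dec = train(DATA)
final_loss = reconstruction_loss(enc, dec, DATA)
print("loss before:", initial_loss, "loss after:", final_loss)
```

After training, the single bottleneck value from `encode` is the compressed feature. In a pipeline like the one described here, such bottleneck features would be passed on to the classifiers in place of the raw inputs.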

10
Results
With and Without Auto Encoder

Metric                   With Deep Auto Encoder   Without Deep Auto Encoder
Accuracy Score (%)       84                       76
Cohen Kappa Score (%)    78                       68
F1 Score (%)             84                       76
Jaccard Score (%)        72                       61
Hamming Loss             0.1614                   0.2395
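
For readers unfamiliar with these metrics, the sketch below computes three of them (accuracy, Hamming loss, and Cohen's kappa) from scratch for a single-label, multi-class task. The labels and predictions are invented for illustration and are not the EDNUS evaluation data. The slides do not say how the F1 and Jaccard scores were averaged, so those are left out.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def hamming_loss(y_true, y_pred):
    """Fraction of wrong labels; equals 1 - accuracy for single-label tasks."""
    return 1.0 - accuracy(y_true, y_pred)


def cohen_kappa(y_true, y_pred):
    """Agreement between truth and prediction, corrected for chance."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Chance agreement: probability both pick the same class independently.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical labels for the four emotion classes.
y_true = ["happy", "sad", "calm", "angry", "happy", "sad", "calm", "angry"]
y_pred = ["happy", "sad", "calm", "angry", "happy", "sad", "angry", "calm"]

print("accuracy:", accuracy(y_true, y_pred))
print("hamming loss:", hamming_loss(y_true, y_pred))
print("cohen kappa:", cohen_kappa(y_true, y_pred))
```

In this single-label setting the Hamming loss is the complement of accuracy, which matches the table: a loss of 0.1614 corresponds to roughly 84% accuracy.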

11
Our Traction
Benefits of using Deep Auto Encoder

INFERENCE

High accuracy due to the deep autoencoder and feature selection.

IMPACT

With the Deep Autoencoder, the model improves consistently across all metrics: the scores increase by 10.5% to 18% in relative terms, and the Hamming Loss falls by approximately 33%.
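
These percentages can be checked directly against the results table. The short sketch below assumes they are relative changes, (with − without) / without; the function name `relative_change` is hypothetical.

```python
def relative_change(with_dae, without_dae):
    """Relative change in percent, measured against the no-autoencoder value."""
    return (with_dae - without_dae) / without_dae * 100.0


# Values taken from the results table (with DAE, without DAE).
RESULTS = {
    "Accuracy Score": (84, 76),
    "Cohen Kappa Score": (78, 68),
    "F1 Score": (84, 76),
    "Jaccard Score": (72, 61),
    "Hamming Loss": (0.1614, 0.2395),
}

changes = {name: relative_change(w, wo) for name, (w, wo) in RESULTS.items()}
for name, pct in changes.items():
    print(f"{name}: {pct:+.1f}%")
```

Accuracy and F1 give the smallest gain (about 10.5%), the Jaccard score the largest (about 18%), and the Hamming Loss drops by about 32.6%, which rounds to the stated 33%.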

[4]
12
Summary of Key Takeaways

• Accurate Emotion Detection: EDNUS effectively captures emotional nuances in speech using deep autoencoders for feature extraction.

• Ensemble Learning: The system’s use of a meta-learning model combines multiple classifiers, optimizing performance across varied speech inputs.

• Real-Time Application: EDNUS provides real-time feedback, making it suitable for healthcare scenarios where immediate emotional insights are valuable.

• Impact on Neurological Care: By focusing on non-invasive emotion detection, EDNUS offers a supportive tool for understanding patient well-being.

13
References
[1] https://www.shutterstock.com

[2] https://chatgpt.com

[3] https://www.canva.com

[4] https://colab.research.google.com

[5] https://doi.org/10.1007/978-981-19-2130-8_42

14
Thank you!
