Full single-type deep learning models with multihead attention for speech enhancement
15 April 2023
Noel Zacarias-Morales, José Adán Hernández-Nolasco & Pablo Pancardo
Applied Intelligence (2023)
Abstract
Artificial neural network (ANN) models with attention mechanisms for removing noise from audio signals, known as speech enhancement models, have proven effective. However, their architectures become complex, deep, and computationally demanding as they pursue higher efficiency. Given this situation, we selected four simple, less resource-demanding models and evaluated them with the same training parameters and performance metrics to ensure a fair comparison. Our purpose was to demonstrate that simple neural network models with multihead attention are efficient when implemented on devices with conventional computational resources, since they deliver results competitive with those of hybrid, complex, and resource-demanding models. We experimentally evaluated multilayer perceptron (MLP), one-dimensional and two-dimensional convolutional neural network (CNN), and gated recurrent unit (GRU) deep learning models, each with and without multihead attention, and analyzed the generalization capability of each model. The results showed that although these architectures consist of only one type of ANN, multihead attention increased the efficiency of the speech enhancement process, yielding results competitive with those of complex models. This study therefore serves as a reference for building simple, efficient single-type ANN models with attention.
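As a concrete illustration of the single-type-plus-attention idea summarized above, the sketch below pairs a GRU encoder (one ANN type) with a multihead self-attention layer that predicts a time-frequency mask for a noisy magnitude spectrogram. This is a minimal PyTorch sketch, not the authors' exact architecture: the class name, layer sizes, number of heads, the mask-based output, and the 257-bin input are illustrative assumptions.

import torch
import torch.nn as nn

class GRUWithMultiheadAttention(nn.Module):
    """Hypothetical single-type (GRU) enhancer with multihead self-attention."""
    def __init__(self, n_freq_bins=257, hidden_size=256, num_heads=4):
        super().__init__()
        # Single ANN type: a stacked GRU over spectrogram frames.
        self.gru = nn.GRU(n_freq_bins, hidden_size, num_layers=2,
                          batch_first=True)
        # Multihead self-attention over the GRU outputs.
        self.attn = nn.MultiheadAttention(hidden_size, num_heads,
                                          batch_first=True)
        # Predict a [0, 1] time-frequency mask for the noisy magnitudes.
        self.mask = nn.Sequential(nn.Linear(hidden_size, n_freq_bins),
                                  nn.Sigmoid())

    def forward(self, noisy_mag):
        # noisy_mag: (batch, frames, n_freq_bins) magnitude spectrogram
        h, _ = self.gru(noisy_mag)
        h, _ = self.attn(h, h, h)        # self-attention: query = key = value
        return noisy_mag * self.mask(h)  # enhanced magnitude estimate

# Usage: a batch of 8 utterances, 100 frames, 257 frequency bins.
x = torch.rand(8, 100, 257)
print(GRUWithMultiheadAttention()(x).shape)  # torch.Size([8, 100, 257])

Replacing the GRU with an MLP or a 1D/2D CNN front end, while keeping the attention and mask layers, would sketch the other single-type variants the abstract lists.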
Data Availability
The datasets generated and analyzed during the current study are available
from the corresponding author upon reasonable request.