Thai Voice Recognition for Controlling Electrical Appliances Using Long Short-Term Memory
Anantaporn Hanskunatai
Department of Computer Science, Faculty of Science
King Mongkut’s Institute of Technology Ladkrabang
Bangkok 10520, Thailand
e-mail: [email protected]
Abstract—Human speech possesses characteristics in each word that can be recognized and learned by computers. This research proposes the use of deep learning models to predict speech commands that turn various electrical appliances on and off, using a sound-conversion process that extracts sound-wave values from the recordings and applies them to the training process in different ways. Because an utterance can have more than one syllable, and similar-sounding words share characteristics, prediction can be difficult. This research compares a Convolutional Neural Network (CNN) with Long Short-Term Memory (LSTM), which is part of the Recurrent Neural Network (RNN) family, on a Thai-language speech dataset of turn-on and turn-off commands for 7 types of electrical appliances, 14 classes in total, preprocessed by reducing noise and trimming silence from the front and back of the audio files. The experimental results signify that the proposed Long Short-Term Memory achieves the best accuracy.

Keywords—deep learning; voice recognition; long short-term memory

I. INTRODUCTION

Speech recognition with machine learning is widely used. The method is to make the computer understand the form or nature of a sound so that it can decide the meaning of the sound. Human speech can be detected and converted into sound-wave signals. However, many general components must be implemented, such as determining the sample rate before training; research [1] shows this is important for determining the frequency resolution, which takes the form of a wavelet. The Mel Frequency Cepstral Coefficients (MFCC) [2] can also be used to identify the characteristics of a sound. The received data is divided into segments, and the power and frequency of the sound in each segment are extracted into a vector of numbers, in effect converting the audio signal into an image of its features. Afterward, the data is arranged in time order, which can be rendered as an audio waveform or stored as matrix data, so that it can be sent into the learning process of each deep learning model. Before training, noise must be eliminated or muted [3], since noise is an important variable that can cause significant problems in voice recognition [4]. Because machine learning can learn to remember images well, the Convolutional Neural Network (CNN) [5] has been used in speech recognition: an experiment in [3] achieved 89.7% accuracy predicting words of 1-2 syllables with different meanings by learning from images of sound waves, and in [6], with various environmental sounds, the accuracy was 77%, which shows the efficiency of Convolutional Neural Networks in speech recognition. In addition to the CNN, the Recurrent Neural Network (RNN) [7] has been used for speech recognition; it uses the previous state outputs for learning, training in a loop over states. However, the Recurrent Neural Network still suffers from the vanishing gradient problem [8]. Long Short-Term Memory (LSTM) [9], a member of the Recurrent Neural Network family, was adopted to solve the vanishing gradient problem. It learns iteratively from matrix data and has a memory that remembers the value of each state output, together with gates that decide whether to remember a value, clear it, or pass it on. Both learning methods require dropout to solve the overfitting problem [10]. In this article, the Convolutional Neural Network uses standard dropout, while for the LSTM model, following [11], Bayesian modeling is used to calculate the dropout, which gives better results.

This research proposes using a Convolutional Neural Network and Long Short-Term Memory to test and compare the accuracy of predicting speech commands that turn various electrical appliances on and off.

II. METHODOLOGY

A. Data Preparation and Preprocessing
In this research, speech data collected from 40 people, male and female (Thai language), is used: commands to turn electrical appliances on and off, such as light bulbs, air conditioners, computers, TVs, doors, fans, and curtains, 2,105 voice recordings in total. The recordings are of unequal length, and environmental noise is present. The voice recordings have a sample rate of 44.1 kHz and go through a process of silencing the front and back of each audio file and cancelling noise, using the FFmpeg and SoX libraries to produce audio files with less noise [3]; the speech lengths remain unequal, as the spoken words have different numbers of syllables. For the characteristics of each utterance, 20 MFCC features are used to make a spectrogram image with a size of 250x185; the color spectrogram images are used for training the CNN, and the spectrogram frequency data for training the LSTM. 80% of the data is used as the training set and the remaining 20% as the test set. The dataset is summarized in Table I.
TABLE I. THE DATASET (TOTAL: 2,105 RECORDINGS)
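As a concrete illustration of this preprocessing, the following is a minimal Python sketch. It assumes the sox and ffmpeg command-line tools are installed and the librosa library is available; the silence threshold (1%), the minimum silence duration (0.1 s), the afftdn denoise filter, and the file names are illustrative assumptions, not the paper's exact settings.

import subprocess

import librosa
import numpy as np

def trim_silence(src: str, dst: str) -> None:
    # Trim silence from the front, then reverse, trim again, and
    # reverse back to trim the tail (a standard SoX recipe).
    subprocess.run(
        ["sox", src, dst,
         "silence", "1", "0.1", "1%",
         "reverse",
         "silence", "1", "0.1", "1%",
         "reverse"],
        check=True,
    )

def denoise(src: str, dst: str) -> None:
    # Broadband noise reduction with FFmpeg's afftdn filter, used here
    # as an assumed stand-in for the paper's noise-cancellation step.
    subprocess.run(["ffmpeg", "-y", "-i", src, "-af", "afftdn", dst],
                   check=True)

def extract_mfcc(path: str, n_mfcc: int = 20) -> np.ndarray:
    # Load the 44.1 kHz recording and compute the 20 MFCC features
    # described above; returns an (n_mfcc, frames) matrix.
    signal, sr = librosa.load(path, sr=44100)
    return librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)

trim_silence("raw.wav", "trimmed.wav")   # hypothetical file names
denoise("trimmed.wav", "clean.wav")
features = extract_mfcc("clean.wav")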
B. Recurrent Neural Network and Long Short-Term Memory

This research uses an RNN, which works in a loop on the hidden layer of the neural network, using the prior state data in the calculation for the current state before sending data to the next state. The model is able to understand the current data from learning prior data, as shown in Figure 1.

Figure 1 shows the process of the RNN: x_t is the input data at that time, h_t is the value from the hidden state at that time, and y_t is the output. When the loop is unrolled, it can be seen that multiple hidden states work jointly in sequence. h_t is calculated as shown in (1):

h_t = f_h(U_h h_{t-1} + W_h x_t + b_h)    (1)

where f_h is the activation function of the hidden layer, U_h is the weight matrix of the previous hidden state, and W_h is the weight matrix of the current hidden state. y_t is calculated as shown in (2):

y_t = f_y(W_y h_t + b_y)    (2)
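To make (1) and (2) concrete, here is a minimal NumPy sketch of a single recurrent step; the choice of tanh for f_h and softmax for f_y, and all array shapes, are assumptions made for illustration.

import numpy as np

def rnn_step(x_t, h_prev, U_h, W_h, b_h, W_y, b_y):
    # Eq. (1): new hidden state from the previous state and current input.
    h_t = np.tanh(U_h @ h_prev + W_h @ x_t + b_h)
    # Eq. (2): output from the current hidden state (softmax assumed for f_y).
    logits = W_y @ h_t + b_y
    y_t = np.exp(logits - logits.max())
    y_t /= y_t.sum()
    return h_t, y_t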
Figure 2 shows the steps used. It starts by receiving speech, which is converted into a sound wave and stored in matrix format. After that, the data is fed in row by row for training, and there is a memory that records the state from previous sound waves. The Forget Gate (f_t) decides whether to remember previous information or not, using the sigmoid function as the decider, as shown in (3):

f_t = σ(W_f · [h_{t-1}, x_t] + b_f)    (3)
The Input Gate (i_t) then decides which new information to store, again using the sigmoid function, as shown in (4):

i_t = σ(W_i · [h_{t-1}, x_t] + b_i)    (4)

The Input Modulation Gate uses the tanh function instead, producing the candidate values for the update, as shown in (5):

C̃_t = tanh(W_c · [h_{t-1}, x_t] + b_c)    (5)
In the next step, after getting the forget gate and input gate values, the cell state is updated as in (6):

c_t = f_t · c_{t-1} + i_t · C̃_t    (6)
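Putting (3) through (6) together, the following is a minimal NumPy sketch of one LSTM cell update. The output gate and hidden-state lines at the end do not appear in the excerpt above; they follow the standard formulation of [9] and are marked as such in the comments.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev,
              W_f, b_f, W_i, b_i, W_c, b_c, W_o, b_o):
    z = np.concatenate([h_prev, x_t])    # the stacked [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)         # eq. (3): forget gate
    i_t = sigmoid(W_i @ z + b_i)         # eq. (4): input gate
    c_hat = np.tanh(W_c @ z + b_c)       # eq. (5): input modulation gate
    c_t = f_t * c_prev + i_t * c_hat     # eq. (6): cell state update
    o_t = sigmoid(W_o @ z + b_o)         # standard output gate [9]
    h_t = o_t * np.tanh(c_t)             # standard hidden state [9]
    return h_t, c_t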
However, in this research, an experiment was made to adjust the cell size of the LSTM and the number of stacked LSTM layers, with the structure as in Figure 4.
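The sketch below shows how such a tunable stacked-LSTM classifier might look, assuming TensorFlow/Keras (the paper does not name its framework), 20 MFCC features per frame, and the 14 classes of Table I; the cell size (units), the number of stacked layers (n_stacks), and the dropout rate are the quantities being tuned, and the default values here are illustrative only.

import tensorflow as tf
from tensorflow.keras import layers, models

def build_stacked_lstm(timesteps, n_features=20, n_classes=14,
                       units=128, n_stacks=2, dropout_rate=0.2):
    model = models.Sequential()
    for i in range(n_stacks):
        # Only the first layer needs the input shape; every layer but
        # the last returns the full sequence so the next LSTM can stack.
        kwargs = {"input_shape": (timesteps, n_features)} if i == 0 else {}
        model.add(layers.LSTM(units,
                              return_sequences=(i < n_stacks - 1),
                              dropout=dropout_rate,
                              recurrent_dropout=dropout_rate,
                              **kwargs))
    model.add(layers.Dense(n_classes, activation="softmax"))
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

Keras reuses the same recurrent-dropout mask at every timestep, which is in the spirit of the variational dropout of [11] used for the LSTM here.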
Figure 5. The confusion matrix of the Convolutional Neural Network, with accuracy 91.14%.

IV. CONCLUSION
In this paper, the results of the LSTM are quite high, partly because the noise in the sound dataset has been reduced. The experiments also show that an appropriate cell size increases the prediction results, but when the size is increased too much, the prediction efficiency decreases. A model that mixes CNN and RNN (CRNN) is an additional experimental approach. Overall, LSTM can be used to classify voice commands that turn electrical appliances on and off.
REFERENCES
[1] Bhushan C. Kamble, "Speech Recognition Using Artificial Neural Network – A Review", Int'l Journal of Computing, Communications & Instrumentation Engg. (IJCCIE), Vol. 1, Issue 1, 2016.
[2] Atik Charisma, M. Reza Hidayat, Yuda Bakti Zainal, "Speaker Recognition Using Mel-Frequency Cepstrum Coefficients and Sum Square Error", The 3rd International Conference on Wireless and Telematics, 2017.
[3] Pete Warden, "Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition", arXiv:1804.03209v1 [cs.CL], April 2018.
[4] Md Mahadi Hasan Nahid, Bishwajit Purkaystha, Md Saiful Islam, "Bengali Speech Recognition: A Double Layered LSTM-RNN Approach", 2017 20th International Conference of Computer and Information Technology (ICCIT), 22-24 December 2017.
[5] Saad Albawi, Tareq Abed Mohammed, Saad Al-Zawi, "Understanding of a Convolutional Neural Network", ICET 2017, Antalya, Turkey.
[6] Aditya Khamparia, Deepak Gupta, Nhu Gia Nguyen, Ashish Khanna, Babita Pandey, Prayag Tiwari, "Sound Classification Using Convolutional Neural Network and Tensor Deep Stacking Network", IEEE Access, doi: 10.1109/ACCESS.2018.2888882, January 8, 2019.
[7] Alex Sherstinsky, "Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) Network", arXiv:1808.03314v4 [cs.LG], 4 November 2018.
[8] Stefano Squartini, Amir Hussain, Francesco Piazza, "Preprocessing Based Solution for the Vanishing Gradient Problem in Recurrent Neural Networks", IEEE, 2003.
[9] Sepp Hochreiter, Jürgen Schmidhuber, "Long Short-Term Memory", Neural Computation, 9(8):1735-1780, 1997.
[10] Imanol Bilbao, Javier Bilbao, "Overfitting Problem and the Over-Training in the Era of Data", The 8th IEEE International Conference on Intelligent Computing and Information Systems (ICICIS 2017).
[11] Yarin Gal, Zoubin Ghahramani, "A Theoretically Grounded Application of Dropout in Recurrent Neural Networks", arXiv:1512.05287v5 [stat.ML], 5 October 2016.