Speech Recognition Using Convolutional Neural Netw
6) (2018) 133-137
Research paper
Abstract
Automatic speech recognition (ASR) is the process of converting spoken speech signals into text transcripts. In the present era of
the computer revolution, ASR plays a major role in enhancing the user experience by letting users communicate with machines in a natural way.
It removes the need for traditional devices such as the keyboard and mouse, and lets the user perform a wide range of tasks, such as controlling
devices and interacting with customer care. In this paper, an ASR-based airport enquiry system is presented. The system has been
developed natively for the Telugu language. The database is built from the most frequently asked questions in an airport enquiry. Because
of its high performance, a Convolutional Neural Network (CNN) is used for training and testing on the database. The salient features of
weight sharing, local connectivity and pooling result in thorough training of the system, and hence in superior testing performance.
Experiments performed on wideband speech signals show a significant improvement in system performance in comparison to
traditional techniques.
Copyright © 2018 Authors. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted
use, distribution, and reproduction in any medium, provided the original work is properly cited.
International Journal of Engineering & Technology 134
2. Convolutional Neural Network
In this step, if a pixel value is negative, it is replaced with zero. This is done for all the filtered signals. The result
is another type of layer, known as a rectified linear unit (ReLU) layer: a stack of signals becomes a stack of signals
with no negative values. The three layers are then stacked so that the output of one becomes the input of the next. The final layer
is the fully connected layer.
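The rectification and stacking described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes the three stacked layers are convolution, ReLU and max pooling, and the function names (`relu`, `conv1d_valid`, `max_pool1d`, `forward`), the toy signal, the kernels and the fully connected weights are all hypothetical.

```python
import numpy as np

def relu(x):
    """Replace every negative value with zero; other values pass through."""
    return np.maximum(x, 0.0)

def conv1d_valid(signal, kernel):
    """Slide the kernel over the signal (cross-correlation, no padding)."""
    return np.correlate(signal, kernel, mode="valid")

def max_pool1d(x, size=2):
    """Keep the maximum of each non-overlapping window of `size` samples."""
    usable = (len(x) // size) * size  # drop a trailing partial window
    return x[:usable].reshape(-1, size).max(axis=1)

def forward(signal, kernels, fc_weights):
    """Assumed stack: convolution -> ReLU -> pooling -> fully connected."""
    filtered = [conv1d_valid(signal, k) for k in kernels]
    rectified = [relu(f) for f in filtered]       # no negative values remain
    pooled = [max_pool1d(r) for r in rectified]
    features = np.concatenate(pooled)             # flatten the stack
    return fc_weights @ features                  # one score per output unit

# Toy data, chosen only for illustration.
signal = np.array([1.0, -2.0, 3.0, 0.5, -1.0, 2.0])
kernels = [np.array([1.0, -1.0]), np.array([0.5, 0.5])]
fc_weights = np.array([[1.0, 0.0, 0.0, 0.0],
                       [0.0, 1.0, 1.0, 1.0]])

scores = forward(signal, kernels, fc_weights)
```

Each stage consumes the previous stage's output, which is the sense in which the layers are "stacked". In a trained network the kernels and `fc_weights` would be learned rather than fixed by hand.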
Questions:
E: How do I book my flight?
T: Nenu vimana ticket ela book cheskogalanu / Nenu vimana ticket ela book cheskovali
Etc.
The total number of words trained for the process of recognition is 583. The
resulting total phone count is 1765. The distinct phones are BU,
CHE, E, GA, K, KE, KO, LA, LI, MA, NA, NE, NU, S, SKO,
T, TI, VA, VI.
Figure. (a)-(l): Similarity matrices of phones, 3 phones per sample.