Project New
Speech recognition is a technology that enables a computer to capture the words spoken by
a human with the help of a microphone. These words are then recognized by a speech
recognizer, and finally the system outputs the recognized words. The process of speech
recognition consists of several steps, which are discussed one by one in the following
sections.
Ideally, a speech recognition engine would recognize every word uttered by a human. In
practice, the performance of a speech recognition engine depends on a number of factors.
Vocabulary size, multiple users, and noisy environments are the major factors that affect
a speech recognition engine.
1.2 History
The concept of speech recognition emerged in the 1940s. The first practical speech
recognition program appeared in 1952 at Bell Labs and recognized a digit in a noise-free
environment. The 1940s and 1950s are considered the foundational period of speech
recognition technology. During this period, work focused on the foundational paradigms of
speech recognition, namely automation and information-theoretic models.
In the 1960s, small vocabularies (on the order of 10-100 words) of isolated words could be
recognized, based on simple acoustic-phonetic properties of speech sounds. The key
technologies developed during this decade were filter banks and time normalization
methods. In the 1970s, medium vocabularies (on the order of 100-1000 words) were
recognized using simple template-based pattern recognition methods. In the 1980s, large
vocabularies (1000 words to unlimited) were used, and speech recognition problems were
addressed with statistical methods, using a wide range of networks for handling language
structures. The key inventions of this era were the Hidden Markov Model (HMM) and the
stochastic language model, which together enabled powerful new methods for handling the
continuous speech recognition problem efficiently and with high performance. In the 1990s,
the key technologies developed were methods for stochastic language understanding,
statistical learning of acoustic and language models, and methods for implementing large
vocabulary speech understanding systems.
After five decades of research, speech recognition technology has finally entered the
marketplace, benefiting users in a variety of ways. Designing a machine that truly
functions like an intelligent human remains a major challenge going forward.
1.3 Motivation
There are two primary motivations for this research.
First, software like this can give the programmer a smooth, easy, portable interpreter.
Second, the speech-to-text module allows the programmer to save time.
An implied benefit of designing a working model of this system is the involvement of a
reconfigurable architecture. This, in itself, would allow any compatible speech
processing algorithm to be implemented directly at the hardware level, almost on-the-fly.
1.4 Objective