SPEECH

The document discusses speech recognition including its meaning, working process, advantages, disadvantages and future. Speech recognition is the process of converting spoken words to text. It works by using algorithms through language modeling and hidden Markov models. The future of speech recognition includes developing systems that can instantly translate languages with high accuracy and understand the meaning behind words.

Uploaded by

Ramesh k

SPEECH RECOGNITION

CONTENTS
Introduction
Meaning of Speech Recognition
Working of Speech Recognition
Speech Recognition Flowchart
Recognition process Flow Summary
Advantages
Disadvantages
The Future of Speech Recognition
Conclusion
Introduction
Speech recognition is the process of converting an acoustic
signal, captured by a microphone or a telephone, to a set of words.
The recognized words can be an end in themselves, as in
applications such as command and control, data entry, and
document preparation.
They can also serve as the input to further linguistic processing in
order to achieve speech understanding.
It is also known as Automatic Speech Recognition (ASR),
Computer Speech Recognition, or Speech To Text (STT).
WHAT IS SPEECH RECOGNITION?
• SPEECH RECOGNITION BASICALLY MEANS TALKING TO A COMPUTER AND HAVING
IT RECOGNIZE WHATEVER WE'RE SAYING.
• BY DEFINITION, SPEECH RECOGNITION IS THE INTERDISCIPLINARY
SUBFIELD OF COMPUTATIONAL LINGUISTICS THAT DEVELOPS
METHODOLOGIES AND TECHNOLOGIES THAT ENABLE THE RECOGNITION AND
TRANSLATION OF SPOKEN LANGUAGE INTO TEXT BY COMPUTERS. IT IS ALSO
KNOWN AS AUTOMATIC SPEECH RECOGNITION (ASR), COMPUTER SPEECH
RECOGNITION OR SPEECH TO TEXT (STT).
HOW DOES IT WORK?
• Speech recognition works as a pipeline that converts PCM (pulse
code modulation) digital audio from a sound card into recognized speech.
• It relies on algorithms for acoustic modeling and language modeling. Acoustic
modeling represents the relationship between linguistic units of speech and
audio signals; language modeling matches sounds with word sequences to help
distinguish between words that sound similar.
• Hidden Markov models (HMMs) are also used to recognize temporal patterns in
speech and improve accuracy.
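As a rough illustration of how a hidden Markov model recovers a temporal pattern, the Python sketch below runs Viterbi decoding over a toy two-state model. The state names ("silence", "speech"), the "low"/"high" energy observations, and every probability are made-up values chosen for illustration; they are not taken from any real recognizer.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    # Each column maps state -> (best log-probability so far, previous state).
    first = observations[0]
    columns = [{
        s: (math.log(start_p[s]) + math.log(emit_p[s][first]), None)
        for s in states
    }]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the predecessor that gives the best path into state s.
            score, prev = max(
                (columns[-1][p][0] + math.log(trans_p[p][s]), p)
                for p in states
            )
            column[s] = (score + math.log(emit_p[s][obs]), prev)
        columns.append(column)

    # Trace back from the best final state to the start.
    state = max(states, key=lambda s: columns[-1][s][0])
    path = [state]
    for column in reversed(columns[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path

# Hypothetical toy model: is each audio frame silence or speech?
states = ("silence", "speech")
start_p = {"silence": 0.8, "speech": 0.2}
trans_p = {
    "silence": {"silence": 0.7, "speech": 0.3},
    "speech": {"silence": 0.2, "speech": 0.8},
}
emit_p = {
    "silence": {"low": 0.9, "high": 0.1},
    "speech": {"low": 0.3, "high": 0.7},
}
frames = ["low", "low", "high", "high", "high", "low", "low"]

decoded = viterbi(frames, states, start_p, trans_p, emit_p)
print(list(zip(frames, decoded)))
```

In a real system the hidden states would typically be sub-phoneme units and the observations would be acoustic feature vectors, but the decoding idea is the same: choose the state sequence that best explains the whole series of frames, not each frame in isolation.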
TYPES OF SPEECH RECOGNITION

1) Speaker-Dependent
2) Speaker-Independent
1) Speaker-Dependent:-
• Speaker-dependent software is commonly used for dictation,
while speaker-independent software is more commonly found in telephone
applications.
• Speaker-dependent software works by learning the unique characteristics
of a single person's voice, in a way similar to voice recognition. New users
must first "train" the software by speaking to it, so the computer can
analyze how the person talks. This often means users have to read a few
pages of text to the computer before they can use the speech recognition
software.
2) Speaker-Independent:-
• Speaker-independent software is designed to recognize anyone's voice, so no
training is involved. This makes it the only real option for applications such
as interactive voice response systems, where businesses can't ask callers to
read pages of text before using the system. The downside is that
speaker-independent software is generally less accurate than speaker-dependent
software.
• Speaker-independent speech recognition engines generally compensate for
this lower accuracy by limiting the grammars they use. With a smaller list of
recognized words, the speech engine is more likely to correctly recognize
what a speaker said.
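The effect of a smaller grammar can be sketched in a few lines of Python. This is only an analogy: string similarity from the standard difflib module stands in for acoustic similarity, and the word lists and the misheard input "kall" are hypothetical.

```python
import difflib

def candidates(heard, grammar, cutoff=0.6):
    """Return the grammar entries that are plausibly what was heard."""
    # String similarity is a stand-in for acoustic similarity here.
    return difflib.get_close_matches(heard, grammar, n=len(grammar), cutoff=cutoff)

# Hypothetical word lists for a phone application.
large_vocabulary = ["call", "fall", "tall", "ball", "hall", "mall", "redial", "hang up"]
small_grammar = ["call", "redial", "hang up"]

heard = "kall"  # a noisy rendering of the spoken word "call"

print(candidates(heard, large_vocabulary))  # several equally close rhyming words
print(candidates(heard, small_grammar))     # only one plausible command remains
```

With the large vocabulary, every rhyming word is an equally good match and the engine has to guess; with the restricted grammar, the only close entry is the intended command.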
Recognition Process Flow Summary
• Step 1: User Input
The system captures the user's voice as an analog
acoustic signal.
• Step 2: Digitization
The analog acoustic signal is digitized.
• Step 3: Phonetic Breakdown
The signal is broken into phonemes.
• Step 4: Statistical Modeling
Phonemes are mapped to their phonetic representation using a statistical
model.
• Step 5: Matching
Using the grammar, the phonetic representation, and the dictionary, the
system returns an n-best list (i.e., candidate words, each with a confidence score).
  • Grammar: the set of words or phrases that constrains the range of input
  or output in the voice application.
  • Dictionary: the table that maps phonetic representations to words (e.g.,
  thu, thee -> the).
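The matching step can be sketched in Python as follows. This is a minimal illustration, not how any particular engine implements it: the function name n_best, the confidence scores, and the extra dictionary and grammar entries beyond "thu, thee -> the" are assumptions made up for the example.

```python
def n_best(hypotheses, dictionary, grammar, n=3):
    """Turn scored phonetic hypotheses into a ranked list of allowed words."""
    scores = {}
    for phonetic, confidence in hypotheses:
        word = dictionary.get(phonetic)
        if word is None or word not in grammar:
            continue  # unknown pronunciation, or word not allowed by the grammar
        # Several pronunciations can map to one word; keep its best score.
        scores[word] = max(scores.get(word, 0.0), confidence)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]

# Dictionary: phonetic representation -> word ("thu" and "thee" both mean "the").
dictionary = {"thu": "the", "thee": "the", "then": "then", "tree": "tree"}

# Grammar: the words this hypothetical application accepts at this point.
grammar = {"the", "then", "a"}

# Hypothetical output of the statistical modeling step: (phonetic form, confidence).
hypotheses = [("thee", 0.62), ("thu", 0.55), ("then", 0.30), ("tree", 0.21)]

for word, confidence in n_best(hypotheses, dictionary, grammar):
    print(word, confidence)
```

Here "tree" is dropped because the grammar does not allow it, and the two pronunciations of "the" collapse into a single entry that carries the higher of their two scores.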
Program Training
• The process is more complicated for phrases and sentences, because the system
has to figure out where each word starts and stops.
• Statistical systems need lots of exemplary training data to reach their optimal
performance, sometimes on the order of thousands of hours of human-transcribed
speech and hundreds of megabytes of text.
• The training data are used to create acoustic models of words, word lists and
multi-word probability networks.
• The details can make the difference between a well-performing system and a
poorly-performing system, even when both use the same basic algorithm.
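One simple form of a multi-word probability network is a bigram model, which estimates how likely each word is to follow the previous one. The Python sketch below trains one from a tiny, made-up text corpus; a real system would use far more text, and the function names are assumptions for this example.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split()  # <s> marks sentence start
        for prev, word in zip(words, words[1:]):
            counts[prev][word] += 1
    return counts

def probability(counts, prev, word):
    """Estimate P(word | prev) from the raw counts (no smoothing)."""
    following = counts.get(prev, Counter())
    total = sum(following.values())
    return following[word] / total if total else 0.0

# Hypothetical training text.
corpus = [
    "i want to go home",
    "i have two dogs",
    "i want to call home",
    "she has two cats",
]
counts = train_bigrams(corpus)

# "to" and "two" sound alike, but only one is likely after "want".
print(probability(counts, "want", "to"))
print(probability(counts, "want", "two"))
```

Because "two" never follows "want" in the training text, the model prefers "to" in that position, which is how a language model helps separate words that sound the same. Practical systems also apply smoothing so that unseen word pairs do not get a probability of exactly zero.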
ADVANTAGES
• Accessibility for people with disabilities.
• Organizations: increases productivity, reduces costs and errors.
• Lower operational costs.
• Advances in technology will allow consumers and businesses to
implement speech recognition systems at a relatively low cost.
• Cell-phone users can dial pre-programmed numbers by voice
command.
• Users can trade stocks through a voice-activated trading system.
• Speech recognition technology can also replace touch-tone
dialing, making it possible to target customers who speak
different languages.
DISADVANTAGES
• Difficult to build a perfect system.
• Conversations
  • Involve more than just words (non-verbal communication,
  stutters, etc.).
  • Every human being differs in voice,
  mouth, and speaking style.
• Filtering background noise is a task that can be difficult even for
humans to accomplish.
The Future of Speech Recognition
• The Defense Advanced Research Projects Agency (DARPA) has
three teams of researchers working on Global Autonomous Language
Exploitation (GALE), a program that will take in streams of
information from foreign news broadcasts and newspapers and
translate them.
• DARPA hopes to create software that can instantly translate two languages
with at least 90 percent accuracy.
• DARPA is also funding an R&D effort called TRANSTAC to enable
soldiers to communicate more effectively with civilian
populations in non-English-speaking countries.
Conclusion:
• At some point in the future, speech recognition may become speech
understanding.
• The statistical models that allow computers to decide what a person just
said may someday allow them to grasp the meaning behind the words.
• Although it is a huge leap in terms of computational power and software
sophistication, some researchers argue that speech recognition
development offers the most direct line from the computers of today to
true artificial intelligence.
