Work 3

The document discusses the history and development of voice recognition technology over the past 50 years. It provides details on early voice recognition systems from the 1950s and discusses major advances and applications through today. The document also explains the differences between voice recognition and speech recognition as well as how voice recognition systems work.

Uploaded by

mwarubwaj000

GROUP ASSIGNMENT NO 3

GROUP NUMBER 15
• History of voice recognition
• Voice recognition is the process of converting a voice into digital data. The
technology first appeared in the 1950s, but it has become really popular only in
recent years. In this article, we will look at what this technology is and how it works.
We will tell you how it is used in some industries and introduce you to some
well-known voice/speech recognition solutions.
• Voice recognition technology has grown rapidly over the past five decades.
In 1976, computers could understand only slightly more than 1,000
words. That total jumped to roughly 20,000 in the 1980s as IBM continued to
develop voice recognition technology.

• In 1952, Bell Laboratories invented AUDREY, the Automatic Digit Recognizer,
which could only understand the numbers zero through nine. In the early to
mid-1970s, the U.S. Department of Defense started contributing toward speech
recognition system development by funding the Defense Advanced Research
Projects Agency (DARPA) Speech Understanding Research program. Harpy,
developed by Carnegie Mellon, was another voice recognition system at the time
and could recognize up to 1,011 words.
• In 1990, Dragon Systems launched Dragon Dictate, the first speech
recognition product for consumers. It was later succeeded by Dragon
NaturallySpeaking, which subsequently became a Nuance Communications
product. In 1997, IBM introduced IBM ViaVoice, one of the first voice
recognition products that could recognize continuous speech.
• Apple introduced Siri in 2011, and it's still a prominent voice recognition
assistant. In 2016, Google launched its Google Assistant for phones. Voice
recognition systems can be found in devices including phones, smart
speakers, laptops, desktops and tablets as well as in software like Dragon
Professional and Philips SpeechLive.

• During this past decade, several other technology leaders have developed
more sophisticated voice recognition software, such as Amazon Alexa.
Released in 2014, Amazon Alexa also acts as a personal assistant
that responds to voice commands. Currently, voice recognition software is
available for Windows, Mac, Android, iOS and Windows Phone devices.
What is voice recognition?

• Voice or speaker recognition is the ability of a program to identify a person based on their unique
voiceprint. It works by scanning the speech and establishing a match with the desired voiceprint. The
development of AI opened up extensive opportunities for this subfield of computer science. It enables
us to interact with machines without touching them. It is growing rapidly, and developers are finding
more and more ways to apply it in various fields.
• In everyday usage, the term voice recognition is also applied to the ability of a machine or program
to receive and interpret dictation or to understand and perform spoken commands.
• Voice recognition systems let consumers interact with technology simply by speaking to it, enabling
hands-free requests, reminders and other simple tasks.
• Voice recognition can identify and distinguish voices using automatic speech recognition (ASR)
software programs. Some ASR programs require users to first train the program to recognize their voice
for a more accurate speech-to-text conversion. Voice recognition systems evaluate a voice's
frequency, accent and flow of speech.
IS THERE ANY DIFFERENCE BETWEEN VOICE RECOGNITION AND SPEECH RECOGNITION?
• It is essential to understand the differences between these two things. The purpose of voice recognition
is to identify the voice owner. Speech recognition's purpose is to identify the words of the speaker. In
the first case, the program needs a unique voiceprint of the speaker for comparison. In the second case,
the program needs a huge dictionary to identify the speaker's words.

• While speech recognition translates anyone’s voice, voice recognition is a biometric system that
recognizes and authenticates a specific user’s voice.

• It analyzes the unique features of a person’s voice, including pitch, tone, and rhythm, to create a unique
voiceprint for identification.

• This technology is often used for security purposes, such as unlocking mobile devices or accessing
systems.

• Although the terms voice recognition and speech recognition are often used interchangeably, they aren't
the same, and a critical distinction must be made. Voice recognition identifies the speaker, whereas speech
recognition evaluates what is said.
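The voiceprint comparison described above can be sketched as follows. This is only an illustration, not how any particular product works: the voiceprints are assumed to be short numeric feature vectors, the vector values and the 0.8 threshold are made up, and cosine similarity is just one common way to compare such vectors.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def same_speaker(enrolled, sample, threshold=0.8):
    """Accept the sample if it is close enough to the enrolled voiceprint.

    The threshold is a hypothetical value chosen for this example.
    """
    return cosine_similarity(enrolled, sample) >= threshold

# Hypothetical voiceprints (illustrative numbers, not real measurements).
enrolled_voiceprint = [0.8, 0.1, -0.3, 0.5]
sample_same_person = [0.75, 0.15, -0.25, 0.55]
sample_other_person = [-0.2, 0.9, 0.4, 0.1]

print(same_speaker(enrolled_voiceprint, sample_same_person))   # True
print(same_speaker(enrolled_voiceprint, sample_other_person))  # False
```

Note that nothing here looks at which words were spoken. That is the difference from speech recognition, which would need a vocabulary instead of a stored voiceprint.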
TYPES OF VOICE RECOGNITION SYSTEMS
• Voice recognition systems fall into two categories:
 Text-Dependent — The system is trained to recognize predetermined voice
passphrases by the speaker;
 Text Independent — It doesn't require predetermined passphrases. The
subject of the analysis is conversational speech.
TYPES OF SPEECH RECOGNITION SYSTEMS
We can classify Automatic Speech Recognition (ASR) systems in several ways.
The first is by how much they depend on the speaker, which gives two types:
 Speaker Dependent — The program is trained to recognize a specific voice, similar to voice
recognition. The speaker must “talk” to the program and give it the ability to analyze the voice.
Such systems are easier to implement. They provide high accuracy in speech recognition;
 Speaker Independent — This type of speech recognition software has wider usage. It doesn't
require training to analyze the voice. The emphasis is on the speaker's word recognition.
Typical examples of such programs are IVR systems.
The other method of categorization is based on how the user speaks. Those categories are:
 Discrete Speech Recognition: ASR applications have used this method since the early
versions. The speaker must pronounce each word separately, inserting pauses between them.
Such programs are harder to work with, because it isn't easy to keep a steady pace of
spoken words;
 Continuous Speech Recognition: This is a relatively new method of ASR and requires more
effort to develop. The speaker's speech rate is close to normal in this case.
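To illustrate why discrete speech recognition depends on pauses, here is a minimal sketch of splitting a recording into words by looking for runs of silence. It assumes the audio has already been reduced to one energy value per short frame; the function name, the threshold and the minimum pause length are all hypothetical choices for the example.

```python
def split_on_pauses(energies, threshold=0.1, min_pause=3):
    """Return (start, end) frame index pairs for words separated by pauses.

    energies:  per-frame energy values; a frame below `threshold` is silence.
    min_pause: number of consecutive silent frames that ends a word.
    The `end` index is exclusive.
    """
    segments = []
    start = None      # first frame of the word being tracked, or None
    silent_run = 0    # consecutive silent frames seen inside the current word
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_pause:
                segments.append((start, i - silent_run + 1))
                start = None
                silent_run = 0
    if start is not None:
        # The recording ended while a word was still open.
        segments.append((start, len(energies) - silent_run))
    return segments

# Two "words": a long pause separates them, a one-frame dip does not.
frame_energies = [0, 0, 0.5, 0.6, 0.4, 0, 0, 0, 0, 0.7, 0.8, 0.05, 0.9, 0, 0]
print(split_on_pauses(frame_energies))  # [(2, 5), (9, 13)]
```

If the speaker does not pause long enough, two words merge into one segment, which is the difficulty the discrete approach puts on the user.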
• How does voice recognition work?
Voice recognition uses technology to evaluate the biometrics of your voice.
That includes the frequency and flow of your voice, as well as your accent.
Every word you speak is broken up into segments of several tones. These segments are
then digitised and translated to create your own unique voice template.
Artificial intelligence, deep learning, and machine learning are the forces
behind speech recognition. Artificial intelligence is used to understand the
colloquialisms, abbreviations, and acronyms we use. Machine learning then
pieces together the patterns in this data and learns from them using neural networks.
Voice recognition software on computers requires analog audio to be
converted into digital signals, known as analog-to-digital (A/D) conversion.
For a computer to decipher a signal, it must have a digital database of words or
syllables as well as a quick process for comparing this data to signals.
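The analog-to-digital step can be sketched roughly as follows: the signal is measured at regular intervals (sampling) and each measurement is rounded to one of a fixed number of levels (quantization). The 440 Hz tone, the 16 kHz sample rate and the 8-bit depth are assumptions chosen for the example, not values taken from any specific system.

```python
import math

def sample_and_quantize(signal, duration_s, sample_rate_hz, bits):
    """Sample a continuous signal and quantize each sample to an integer code.

    signal: a function of time (seconds) returning a value in [-1, 1].
    Returns a list of integer codes in the range 0 .. 2**bits - 1.
    """
    levels = 2 ** bits
    n_samples = round(duration_s * sample_rate_hz)
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz
        x = max(-1.0, min(1.0, signal(t)))  # clip to the valid input range
        # Map [-1, 1] onto the integer codes 0 .. levels - 1.
        code = min(levels - 1, int((x + 1.0) / 2.0 * levels))
        codes.append(code)
    return codes

def tone(t):
    """A stand-in for the analog input: a 440 Hz sine wave."""
    return math.sin(2 * math.pi * 440.0 * t)

codes = sample_and_quantize(tone, duration_s=0.001, sample_rate_hz=16000, bits=8)
print(len(codes))  # 16 samples for one millisecond at 16 kHz
print(codes[0])    # 128, the code for an input of exactly 0.0
```

A higher sample rate captures higher frequencies and more bits per sample give finer amplitude steps, both at the cost of more data to store and compare.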
• A voice recognition program runs many times faster if the entire vocabulary can be loaded into RAM
compared to searching the hard drive for some of the matches. Processing speed is critical, as it affects
how fast the computer can search the RAM for matches.
• Audio also must be processed for clarity, so some devices may filter out background noise. In some
voice recognition systems, certain frequencies in the audio are emphasized so the device can recognize
a voice better.
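One simple way to emphasize certain frequencies is a pre-emphasis filter, which boosts the higher frequencies of speech before further analysis. The sketch below assumes this is the kind of emphasis meant; the coefficient 0.97 is a conventional choice, not something stated in this document.

```python
def pre_emphasis(samples, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1] to a list of audio samples.

    Slowly changing (low-frequency) content is attenuated, while
    rapidly changing (high-frequency) content is boosted.
    """
    if not samples:
        return []
    out = [samples[0]]  # the first sample has no predecessor
    for n in range(1, len(samples)):
        out.append(samples[n] - alpha * samples[n - 1])
    return out

steady = pre_emphasis([1.0, 1.0, 1.0, 1.0])         # constant input
alternating = pre_emphasis([1.0, -1.0, 1.0, -1.0])  # fastest possible change
print(steady)       # after the first sample, values shrink to about 0.03
print(alternating)  # after the first sample, values grow to about +/-1.97
```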

• Voice recognition systems analyze speech through one of two models: the hidden Markov model and
neural networks. The hidden Markov model breaks down spoken words into their phonemes, while
recurrent neural networks use the output from previous steps to influence the input to the current step.
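As an illustration of the hidden Markov model idea, the sketch below uses the Viterbi algorithm to find the most likely sequence of phonemes behind a series of observed sound frames. Everything in the toy model is invented for the example: the three phoneme states, the frame labels and all probabilities. Real systems use far larger models trained on recorded speech.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               the state that path came from)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
        best.append(column)
    # Trace back from the most likely final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path

# Toy model for the word "cat" (hypothetical numbers).
states = ["k", "ae", "t"]
start_p = {"k": 0.8, "ae": 0.1, "t": 0.1}
trans_p = {
    "k": {"k": 0.5, "ae": 0.5, "t": 0.0},
    "ae": {"k": 0.0, "ae": 0.6, "t": 0.4},
    "t": {"k": 0.0, "ae": 0.0, "t": 1.0},
}
emit_p = {
    "k": {"burst": 0.7, "voiced": 0.1, "closure": 0.2},
    "ae": {"burst": 0.05, "voiced": 0.9, "closure": 0.05},
    "t": {"burst": 0.3, "voiced": 0.1, "closure": 0.6},
}
frames = ["burst", "voiced", "voiced", "closure"]
print(viterbi(frames, states, start_p, trans_p, emit_p))  # ['k', 'ae', 'ae', 't']
```

The vowel state is kept for two frames because staying in it explains both "voiced" frames better than any other path.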

• As uses for voice recognition technology grow and more users interact with it, the organizations
implementing voice recognition software will have more data and information to feed into
neural networks for voice recognition systems. This improves the capabilities and accuracy of voice
recognition products.
• The popularity of smartphones opened up the opportunity to add voice recognition technology into
consumer pockets, while home devices -- such as Google Home and Amazon Echo -- brought voice
recognition technology into living rooms and kitchens.
Voice recognition uses
• The uses for voice recognition have grown quickly as AI,
machine learning and consumer acceptance have matured. Examples
of how voice recognition is used include the following:
 Virtual assistants. Siri, Alexa and Google virtual assistants all
implement voice recognition software to interact with users. The way
consumers use voice recognition technology varies depending on the
product. But they can use it to transcribe voice to text, set up
reminders, search the internet and respond to simple questions and
requests, such as play music or share weather or traffic information.
 Smart devices. Users can control their smart homes, including smart
thermostats and smart speakers, using voice recognition software.
 Automated phone systems. Organizations use voice recognition with
their phone systems so that callers can reach the corresponding
department by saying a specific number.
 Conferencing. Voice recognition is used in live captioning a speaker
so others can follow what is said in real time as text.
 Bluetooth. Bluetooth systems in modern cars support voice recognition to help
drivers keep their eyes on the road. Drivers can use voice recognition to perform
commands such as "call my office."
 Dictation and voice recognition software. These tools can help users dictate and
transcribe documents without having to enter text using a physical keyboard or
mouse.
 Government. The National Security Agency has used voice recognition systems
dating back to 2006 to identify terrorists and spies or to verify the audio of anyone
speaking.
Voice recognition advantages and disadvantages
Voice recognition offers numerous benefits:
 Consumers can multitask by speaking directly to their voice assistant
or other voice recognition technology.
 Users who have trouble with sight can still interact with their devices.
 Machine learning and sophisticated algorithms help voice recognition
technology quickly turn spoken words into written text.
 This technology can capture speech faster than some users can type.
This makes tasks like taking notes or setting reminders faster and
more convenient.
 Increases the productivity of businesses;
 Automates the interaction between businesses and customers;
 Adds an extra security level;
 Helps people with disabilities;
 Helps control your home devices;
 Assists drivers with in-car ASR systems and more.

Some disadvantages of the technology include the following:
 Background noise can produce false input.
 While accuracy rates are improving, all voice recognition systems and
programs make errors.
 There's a problem with words that sound alike but are spelled
differently and have different meanings -- for example, hear and here.
This issue might be largely overcome using stored contextual
information. However, this requires more RAM and faster processors.
 Systems can't fully recognize speech if the speaker speaks quickly
or unclearly;
 Large vocabularies are required to improve recognition accuracy;
 Each language requires separate training for ASR;
 Businesses can collect and use the user's voice data without their
permission;
 Time and financial costs are high;
 ASR software consumes a lot of memory and requires a large amount
of RAM.
Modern ASR systems are based on three models: acoustic, pronunciation, and language.

i. Acoustic modeling maps the voice signal to phonemes (units of sound). The Hidden
Markov Model (HMM) is a common acoustic modeling approach. Other approaches use deep
neural networks, convolutional neural networks, etc.;
ii. The pronunciation model defines how phonemes can be combined to make words;
iii. Language modeling helps distinguish between words and phrases that
sound the same.
• After the speech is recorded, the noise is cleared, and the useful signal is filtered from the
recording. The recording is divided into small fragments. After that, each fragment is passed
through the acoustic model. These fragments are compared against phoneme models, statistical
models built in advance that describe the pronunciation of each sound in speech. Based on these
matches, words are assembled from phonemes. The efficiency of finding words strongly depends
on the size of the pre-prepared phoneme database.
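The last two models can be sketched together in a few lines. In this simplified illustration, a pronunciation lexicon maps phoneme groups to candidate words, and a bigram language model picks between words that sound the same by looking at the previous word. The lexicon entries, the bigram counts and the greedy word-by-word decoding are all assumptions made for the example.

```python
# Hypothetical pronunciation lexicon: phoneme sequence -> candidate words.
LEXICON = {
    ("AY",): ["I"],
    ("K", "AE", "N"): ["can"],
    ("K", "AH", "M"): ["come"],
    ("Y", "UW"): ["you"],
    ("HH", "IY", "R"): ["hear", "here"],  # homophones
}

# Hypothetical bigram counts: how often the second word follows the first.
BIGRAMS = {
    ("I", "can"): 80,
    ("can", "hear"): 50,
    ("can", "here"): 1,
    ("come", "here"): 60,
    ("come", "hear"): 1,
    ("hear", "you"): 40,
}

def decode(phoneme_groups):
    """Turn groups of phonemes into words, one group per word.

    For each group, the candidate that most often follows the previous
    word is chosen (a greedy choice; real decoders search more widely).
    """
    words = []
    for group in phoneme_groups:
        candidates = LEXICON[tuple(group)]
        previous = words[-1] if words else None
        best = max(candidates, key=lambda w: BIGRAMS.get((previous, w), 0))
        words.append(best)
    return words

print(decode([["AY"], ["K", "AE", "N"], ["HH", "IY", "R"], ["Y", "UW"]]))
# ['I', 'can', 'hear', 'you']
print(decode([["K", "AH", "M"], ["HH", "IY", "R"]]))
# ['come', 'here']
```

The same three phonemes produce "hear" in one sentence and "here" in the other, purely because of the word that came before.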
Challenges of Voice Recognition Technology

Accuracy and Precision

• Voice recognition faces challenges in both accuracy and precision.
Accuracy refers to how well the software recognizes spoken words
and transcribes them correctly. In contrast, precision refers to how
well the software can distinguish between similar-sounding words or
phrases.

• For example, if someone says “there” instead of “their,” the software
must be able to recognize the correct word based on the context of the
sentence. This requires a high level of precision.
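Transcription accuracy is often reported as a word error rate (WER): the number of word substitutions, insertions and deletions needed to turn the recognized text into the correct text, divided by the number of words in the correct text. The document does not name this metric, so the sketch below is offered only as one common way to put a number on accuracy.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] = edit distance between the reference words processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)

spoken = "they left their keys over there"
recognized = "they left there keys over there"
wer = word_error_rate(spoken, recognized)
print(round(wer, 3))  # 0.167, one wrong word out of six
```

A single confusion between "their" and "there" already costs one word in six here, which is why context-based disambiguation matters.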
Noise and Disturbances
• Background noise, such as traffic, construction work, or conversations in the
vicinity, can interfere with the user’s voice signal, making it difficult for the
software to distinguish the spoken words.

• Similarly, disturbances in the environment, such as a sudden loud noise, can
cause errors in the speech recognition process.

• To overcome these challenges, speech recognition software uses various
techniques, such as noise cancellation algorithms, to filter out background noise
and enhance the clarity of the user’s voice signal.

• However, these methods are not fool-proof and may only work effectively in
some situations. Therefore, it is best to use speech recognition technology in
a controlled and quiet environment to ensure optimal performance.
Language and Accent Barriers
• While speech recognition systems have come a long way in accurately recognizing
spoken language, they still struggle with accents and dialects that deviate
significantly from the standard language models they were trained on.

• This can be particularly problematic in multicultural or multilingual environments where
different accents and dialects are prevalent.

• For example, an English-language speech recognition system trained on American
English may have difficulty accurately recognizing the accents of speakers from other
English-speaking countries, such as the United Kingdom, Australia, or India.
• In addition, speech recognition systems may also struggle with languages that have
unique phonetic features or use tonal distinctions, such as Mandarin or Cantonese.
• These languages require more advanced language models and algorithms to recognize
spoken words and phrases accurately.
Privacy and Security
• Speech recognition systems often process sensitive and personal information, such as
passwords, credit card numbers, and private conversations. Therefore, protecting users’ data
privacy and preventing unauthorized access is crucial.

• One of the primary privacy concerns with speech recognition is data collection and storage.
Voice recordings may contain sensitive information, and the storage and use of these
recordings can pose a risk to user privacy if not handled correctly.

• Moreover, speech recognition technology may also face security challenges related to
malicious attacks or breaches that could compromise sensitive data.

• For instance, a hacker could gain access to a voice-controlled device or system and use it to
gather information, such as login credentials or financial information.
• To address these challenges, developers of speech recognition technology must incorporate
privacy and security features in their products, such as encryption, secure data storage, and
user control over data collection and deletion.
THANKS
