MR. JAYANAND KAMBLE
WELCOME TO JAK’S TUTORIAL
Prof. Jayanand Kamble
COURSE CODE: 20UIT704EA
TITLE: SPEECH RECOGNITION & NATURAL LANGUAGE
PROCESSING
Weekly Teaching Hrs: L 3, T 0, P 0
Evaluation Scheme: CT 20, MSE 20, ESE 60
Credit: 3
CONTENT: 03 PHONOLOGY
• Speech sounds,
• Phonetic transcription,
• Phoneme and phonological rules,
• Optimality theory
• Machine learning of phonological rules
• Phonological Aspects of prosody and speech
synthesis
• Pronunciation, spelling and N-grams:
• Spelling errors
• Detection and elimination using probabilistic
models
• Pronunciation variations
• Decision tree model
• Counting words in corpora
• Simple N-grams, smoothing,
• N-gram for spelling and pronunciation
SPEECH SOUNDS
• Knowing a language includes knowing the sounds of that
language
• Phonetics is the study of speech sounds
• We are able to segment a continuous stream of speech
into distinct parts and recognize those parts when they occur in other words
• Everyone who knows a language knows how to segment
sentences into words and words into sounds
IDENTITY OF SPEECH SOUNDS
• Our linguistic knowledge allows us to ignore nonlinguistic
differences in speech (such as individual pitch levels, speaking
rates, coughs)
• We are capable of making sounds that are not speech
sounds in English but are in other languages
• The science of phonetics aims to describe all the sounds
of all the world’s languages
• Acoustic phonetics: focuses on the physical properties of the
sounds of language
• Auditory phonetics: focuses on how listeners perceive the
sounds of language
• Articulatory phonetics: focuses on how the vocal tract produces
the sounds of language
Marathi terms: उच्चार (articulatory), श्रवणविषयक (auditory), ध्वनिक (acoustic)
THE PHONETIC ALPHABET
• Spelling does not consistently represent the sounds of
language
• Some problems with ordinary spelling:
• The same sound may be represented by many letters or
combinations of letters, for example /iː/ in:
• he, people, key
• believe, seize, machine
• The same letter may represent a variety of sounds, for example the letter a in:
• father, village
• A combination of letters may represent a single
sound
• shoot (sh), character (ch), Thomas (Th)
• A single letter may represent a combination of
sounds
• xerox (the final x stands for /ks/)
PHONOLOGY
• Phonology is the study of the sound system of a language.
• A language's sound system is made up of a set of phonemes which are
used according to phonological rules.
• Phonology describes sound contrasts which create differences in meaning
within a language.
• Phonological systems are made up of phonemes and each language has
its own phonological system.
• This means that the study of phonology is language-specific.
• A phoneme is the smallest unit of sound that can distinguish
one word's meaning from another's.
• Phonemes are the basic phonological units and form the
building blocks of speech sounds.
• In a phonemic transcription, each phoneme is represented
by a single written symbol.
• Phonotactics is the study of the rules governing the
possible phoneme sequences in a language
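A phonotactic constraint can be pictured as a membership test on sound sequences. The following is only a minimal sketch: the set LEGAL_ONSETS is a hypothetical and deliberately tiny sample of English word-initial clusters, and ordinary letters stand in for phonemes.

```python
# Hypothetical, incomplete sample of permitted word-initial consonant clusters.
LEGAL_ONSETS = {"", "b", "bl", "br", "s", "st", "str", "t", "tr"}
VOWELS = set("aeiou")

def onset(word: str) -> str:
    """Return the consonant letters that precede the first vowel letter."""
    for i, ch in enumerate(word):
        if ch in VOWELS:
            return word[:i]
    return word  # no vowel at all: the whole string counts as the onset

def is_possible_word(word: str) -> bool:
    """Check the word-initial cluster against the toy onset inventory."""
    return onset(word) in LEGAL_ONSETS

print(is_possible_word("blick"))  # True: not a real word, but a possible one
print(is_possible_word("bnick"))  # False: "bn" is not in the onset inventory
```

A fuller model would work on phoneme strings and would also constrain syllable-final clusters, not just onsets.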
PHONOLOGICAL RULES
• Phonological rules are the principles that govern how
sounds change during speech
• These rules describe the process of articulation (how a speaker
produces the speech sounds stored in the brain).
• Phonological rules help us understand which sounds
change, what they change to, and where the change
happens.
• Phonological rules can be divided into four common
types: assimilation, dissimilation, insertion, and deletion.
ASSIMILATION
• Assimilation is the process of changing one feature of a
sound to make it more similar to a neighboring sound.
• This rule can be seen in the English plural system:
• The -s is pronounced voiced or voiceless depending on
whether the preceding sound is voiced or voiceless.
• So, the English plural -s can be pronounced in different
ways depending on the word it is part of, for example:
• In the word snakes, the letter 's' is pronounced /s/ (after voiceless /k/).
• In the word dogs, the letter 's' is pronounced /z/ (after voiced /g/).
• In the word dresses, the ending is pronounced /ɪz/ (after the sibilant /s/).
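The plural rule described above can be sketched as a small function. This is an illustration only: the phoneme classes SIBILANTS and VOICELESS are simplified assumptions, not a full inventory of English, and the function expects the final phoneme of the stem rather than its spelling.

```python
# Simplified phoneme classes (assumed for illustration, not exhaustive).
SIBILANTS = {"s", "z", "ʃ", "ʒ", "ʧ", "ʤ"}
VOICELESS = {"p", "t", "k", "f", "θ"}

def plural_allomorph(final_phoneme: str) -> str:
    """Return the pronunciation of plural -s after the given stem-final phoneme."""
    if final_phoneme in SIBILANTS:
        return "ɪz"  # a vowel separates two sibilants: dresses
    if final_phoneme in VOICELESS:
        return "s"   # voiceless ending after a voiceless sound: snakes
    return "z"       # voiced ending after voiced consonants and vowels: dogs

for word, final in [("snake", "k"), ("dog", "g"), ("dress", "s")]:
    print(word, "->", plural_allomorph(final))
```

The sibilant check comes first because sibilants such as /s/ are themselves voiceless or voiced, and the /ɪz/ ending must take priority over plain voicing agreement.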
DISSIMILATION
• Dissimilation is the process of changing one feature of a
sound to make it less similar to a neighboring sound.
• This type of rule makes two sounds more distinguishable.
It can help non-native speakers to pronounce words.
• For example, the word chimney [ˈʧɪmni] may be pronounced
chimley [ˈʧɪmli]: the nasal [n] following the nasal [m] changes to [l].
INSERTION
• Insertion is the process of adding an extra sound between two
others.
• For example, English speakers often insert a voiceless stop between
a nasal and a voiceless fricative to make a word easier to
pronounce.
• In the word strength /strɛŋθ/, the sound /k/ is added and it
becomes /strɛŋkθ/.
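The insertion rule above can be sketched as a single pass over a phoneme list. The choice of which stop to insert is an assumption added here: the table NASAL_TO_STOP pairs each nasal with the voiceless stop made at the same place of articulation, and VOICELESS_FRICATIVES is a simplified set.

```python
# Assumed pairing: each nasal with the voiceless stop at the same place of articulation.
NASAL_TO_STOP = {"m": "p", "n": "t", "ŋ": "k"}
VOICELESS_FRICATIVES = {"f", "θ", "s", "ʃ"}

def insert_stop(phonemes: list[str]) -> list[str]:
    """Insert a voiceless stop between a nasal and a following voiceless fricative."""
    out = []
    for i, seg in enumerate(phonemes):
        out.append(seg)
        nxt = phonemes[i + 1] if i + 1 < len(phonemes) else None
        if seg in NASAL_TO_STOP and nxt in VOICELESS_FRICATIVES:
            out.append(NASAL_TO_STOP[seg])
    return out

# Each character of the string is treated as one phoneme.
print("".join(insert_stop(list("strɛŋθ"))))  # strɛŋkθ
```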
DELETION
• Deletion is the process of not pronouncing a sound (consonant,
vowel, or whole syllable) present in a word or phrase, to make it
easier to say.
• For example:
• In the phrase “you and me” [juː ənd miː] it is possible not to say the
sound /d/.
• you and me [juː ən miː].
• Deletion also occurs within single words:
• /h/ in him [ɪm].
• The second /f/ in fifth [fɪθ].
PHONETIC TRANSCRIPTION
• Phonetic transcription is a written record of how words are
pronounced, so it serves as a guide to pronouncing them.
• Ordinary verbatim transcription, by contrast, records exactly
which words people say, including filler words and sounds such as
“um,” “like,” “uh,” or “hmm.”
• Phonetic transcription and traditional transcription use different
writing systems.
• Phonetic transcription uses phonetic symbols, each representing a consonant
or vowel sound.
• Phonetic transcription can be used for any language in the world.
• Phonetic transcription features symbols from the International Phonetic
Alphabet (IPA).
• The IPA is the most widely used and recognized system for phonetic
transcription.
• Some words, like “dress,” look very similar to their English
spellings when written in the IPA: drɛs.
• Other words, like the word “other” itself, look quite
different: ˈʌðər.
• These differences occur because English spelling uses the
Latin alphabet inconsistently, whereas the IPA assigns one
symbol to each sound.
WHY IS PHONETIC TRANSCRIPTION IMPORTANT?
• All languages have differences in accents and dialects,
country to country, region to region, and even town to
town.
• Professional voice actors should be able to adjust their
default accents to those needed for whatever job they’re
doing.
• This may be tricky, however, which is where phonetic
transcription comes in handy.
OPTIMALITY THEORY
• In a traditional phonological derivation, we are given an underlying
lexical form and a surface form.
• Surface forms of words are those found in natural language text.
• The corresponding lexical form of a surface form is the lemma
followed by grammatical information (for example the part of
speech, gender and number).
• In English, "give", "gives", "giving", "gave" and "given" are surface forms of
the verb give.
• The corresponding lexical form is "give" plus the part of speech, verb.
• The phonological system then consists of one component:
a sequence of rules which map the underlying form to the
surface form.
• Optimality Theory (OT) (Prince and Smolensky, 1993)
offers an alternative way of viewing phonological
derivation, based on two functions (GEN and EVAL) and a
set of ranked violable constraints (CON).
• Given an underlying form, the GEN function produces all imaginable surface
forms, even those which couldn’t possibly be a legal surface form for the input.
• The EVAL function then applies each constraint in CON to these surface forms in
order of constraint rank.
• The surface form which best satisfies the ranked constraints is chosen as the output.
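The interplay of GEN, EVAL and CON can be sketched on a toy example. Everything concrete here is an assumption made for illustration: real GEN is unbounded, whereas this gen only deletes segments or appends one epenthetic "e"; the three constraints (no_coda, max_io, dep_io) are simplified stand-ins for constraints commonly discussed in the OT literature; letters stand in for phonemes.

```python
from itertools import combinations
from typing import NamedTuple

VOWELS = set("aeiou")

class Candidate(NamedTuple):
    form: str
    deleted: int   # number of input segments left out
    inserted: int  # number of segments added

def gen(underlying: str) -> list[Candidate]:
    """Toy GEN: keep any subset of segments, optionally append an epenthetic 'e'."""
    n = len(underlying)
    out = []
    for kept in range(n + 1):
        for keep in combinations(range(n), kept):
            form = "".join(underlying[i] for i in keep)
            out.append(Candidate(form, n - kept, 0))
            out.append(Candidate(form + "e", n - kept, 1))
    return [c for c in out if c.form]  # drop the empty candidate

def no_coda(c: Candidate) -> int:
    return 0 if c.form[-1] in VOWELS else 1  # penalize a word-final consonant

def max_io(c: Candidate) -> int:
    return c.deleted                          # penalize deletion

def dep_io(c: Candidate) -> int:
    return c.inserted                         # penalize insertion

def eval_ot(candidates, ranking):
    """Toy EVAL: compare violation counts constraint by constraint, highest rank first."""
    return min(candidates, key=lambda c: tuple(con(c) for con in ranking))

print(eval_ot(gen("pat"), [no_coda, max_io, dep_io]).form)  # pate
print(eval_ot(gen("pat"), [no_coda, dep_io, max_io]).form)  # pa
print(eval_ot(gen("pat"), [max_io, dep_io, no_coda]).form)  # pat
```

The same input and the same constraints give three different winners, which shows that in this sketch only the ranking distinguishes one grammar from another.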
MACHINE LEARNING OF PHONOLOGICAL RULES
• The task of a machine learning system is to automatically
induce a model for some domain, given some data from
the domain.
• Thus, a system to learn phonological rules would be given
at least a set of (surface forms of) words to induce from.
• A supervised algorithm is one which is given the correct
answers for some of this data, using these answers to
induce a model which can generalize to new data it hasn’t
seen before.
• An unsupervised algorithm does this purely from the data.
While unsupervised algorithms don’t get to see the
correct labels for the classifications, they can be given
hints about the nature of the rules or models they should
be forming. For example, the knowledge that the models
will be in the form of automata is itself a kind of hint. Such
hints are called a learning bias.
• Ellison (1992) showed that concepts like the consonant and vowel distinction, the syllable structure of a language, and harmony relationships could be learned by a system that chooses the simplest model from the set of potential models.
• Simplicity can be measured by choosing the model with the minimum coding length, or the highest probability.
• Daelemans et al. (1994) used the Instance-Based Generalization algorithm (Aha et al., 1991) to learn stress rules for Dutch. The algorithm is a supervised one: it is given a number of words together with their stress patterns, and it induces generalizations about the mapping from the sequence of light and heavy syllable types in a word to its stress pattern. (A syllable is a part of a word that contains a single vowel sound and that is pronounced as a unit.)
• Johnson (1984) gives one of the first computational algorithms for phonological rule induction. His algorithm works for rules of the form a → b / C, where C is the feature matrix of the segments around a.
• Johnson’s algorithm sets up a system of constraint equations which C must satisfy, by considering both the positive contexts and the negative contexts.
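The idea of supervised, instance-based learning of stress can be sketched with a nearest-neighbor lookup. This is not the algorithm used by Daelemans et al.; it is a minimal illustration in the same spirit. The training pairs in MEMORY are invented toy data (not Dutch), with "L" for a light syllable, "H" for a heavy one, and the label giving the position of the stressed syllable.

```python
def distance(a: tuple, b: tuple) -> int:
    """Count mismatching positions; a length difference counts as extra mismatches."""
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches + abs(len(a) - len(b))

def predict_stress(weights: tuple, memory: list) -> int:
    """Return the stress label of the stored instance closest to `weights`."""
    nearest = min(memory, key=lambda item: distance(item[0], weights))
    return nearest[1]

# Invented training instances: (syllable weights, stressed syllable counted from 1).
MEMORY = [
    (("L", "L"), 1),
    (("L", "H"), 2),
    (("H", "L"), 1),
    (("L", "L", "H"), 3),
    (("L", "H", "L"), 2),
]

print(predict_stress(("L", "L", "H"), MEMORY))  # 3, a stored instance
print(predict_stress(("H", "H", "L"), MEMORY))  # 2, nearest is ("L", "H", "L")
```

The learner stores labeled examples and generalizes to an unseen word by similarity, which is what makes it supervised: the stress patterns are the correct answers supplied with the data.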
