
APPROACHES TO AUTOMATIC LEXICON LEARNING WITH LIMITED TRAINING EXAMPLES

Nagendra Goel (1), Samuel Thomas (2), Mohit Agarwal (3), Pinar Akyazi (4), Lukáš Burget (5), Kai Feng (6), Arnab Ghoshal (7), Ondřej Glembek (5), Martin Karafiát (5), Daniel Povey (8), Ariya Rastrow (2), Richard C. Rose (9), Petr Schwarz (5)

(1) Go-Vivace Inc., Virginia, USA, [email protected]; (2) Johns Hopkins University, MD, [email protected]; (3) IIIT Allahabad, India; (4) Boğaziçi University, Turkey; (5) Brno University of Technology, Czech Republic; (6) Hong Kong UST; (7) Saarland University, Germany; (8) Microsoft Research, Redmond, WA; (9) McGill University, Canada

ABSTRACT

Preparation of a lexicon for speech recognition systems can be a significant effort in languages where the written form is not exactly phonetic. On the other hand, in languages where the written form is quite phonetic, some common words are often mispronounced. In this paper, we use a combination of lexicon learning techniques to explore whether a lexicon can be learned when only a small lexicon is available for boot-strapping. We discover that for a phonetic language such as Spanish, it is possible to do that better than what is possible from generic rules or hand-crafted pronunciations. For a more complex language such as English, we find that it is still possible but with some loss of accuracy.

Index Terms— Lexicon Learning, LVCSR

1. INTRODUCTION

This paper describes work done during the Johns Hopkins University 2009 summer workshop by the group titled "Low Development Cost, High Quality Speech Recognition for New Languages and Domains". For other work done by the same team, see [1], which describes work on UBM models, [2], which describes in more detail our work as it relates to cross-language acoustic model training, and [3], which provides more details on issues of speaker adaptation in this framework.

Traditionally, pronunciation dictionaries or lexicons are hand-crafted using a predefined phone-set. For building ASR systems in a new language, having a hand-crafted dictionary covering the entire vocabulary of the recognizer can be an expensive option. Linguistically trained human resources may be scarce and prone to errors. Therefore it is desirable to have automated methods that can leverage a limited amount of acoustic training data and a small pronunciation dictionary to generate a much larger lexicon for the recognizer.

In this paper, we explore some approaches for automatically generating pronunciations for words using limited hand-crafted training examples. To address the issues of using these dictionaries in different acoustic conditions, or to determine a phone-set inventory, other approaches have been proposed [4, 5]. Use of multiple pronunciations when a much larger amount of acoustic data is available for those words is explored in [6].

In order to cover the words that are not seen in the acoustic training data, it is necessary to have a grapheme-to-phoneme (G2P) system that uses the word orthography to guess the pronunciation of the word. Our main approach is to iteratively refine this G2P system by adding more pronunciations to the training pool if they can be reliably estimated from the acoustics.

We find that for a language like English, the G2P models trained on a small startup lexicon can be very inaccurate. It is necessary to iteratively refine the pronunciations generated by the G2P for each word, while constraining the pronunciation search space to the top N pronunciations. On the other hand, if the language is very graphemic in pronunciation, such as Spanish, G2P models may be very accurate, but miss a number of common alternate pronunciations. Therefore, to add more alternates, it helps to use free phonetic speech recognition and align it with the transcripts.

The rest of the paper is organized as follows. In Section 2 we describe the approaches we use to estimate pronunciations. In Section 3 we discuss how we use these approaches in experiments on two languages, English and Spanish, and present the results. We conclude with a discussion of the results in Section 4.

This work was conducted at the Johns Hopkins University Summer Workshop, which was supported by National Science Foundation Grant Number IIS-0833652, with supplemental funding from Google Research, DARPA's GALE program and the Johns Hopkins University Human Language Technology Center of Excellence. BUT researchers were partially supported by Czech MPO project No. FR-TI1/034. Thanks to CLSP staff and faculty, to Tomas Kašpárek for system support, to Patrick Nguyen for introducing the participants, to Mark Gales for advice and HTK help, and to Jan Černocký for proofreading and useful comments.

2. PRONUNCIATION ESTIMATION

Theoretically, the problem of lexicon estimation for words can be defined as

    \hat{Prn} = \arg\max_{Prn} P(Prn | W, X),    (1)

where P(Prn | W, X) is the likelihood of the pronunciation given the word sequence and acoustic data. If optimized in an unconstrained manner (for the words for which acoustic data is available), each instance of a word could potentially have a different optimal pronunciation. It has been found in practice that doing such an optimization without additional constraints does not improve the system's performance. Also, this approach is not applicable to words that have not been seen in the acoustic training data. For these words it is necessary to have a well trained G2P system.
2.1. Deriving pronunciations from graphemes

We use the joint-multigram approach for grapheme-to-phoneme conversion proposed in [7, 8] to learn these pronunciation rules in a data-driven fashion. Using a G2P engine gives us one additional advantage. Due to the statistical nature of the engine that we use, it is possible to estimate not only the most likely pronunciation of a word but also a list of other, less likely pronunciations. This lets us split the pronunciation search into two parts. In the first part, we find a set of N possible pronunciations for each word W by training a G2P with a bootstrap lexicon. We then use the acoustic data X to choose the pronunciation \hat{Prn} that maximizes the likelihood of the data.

Using a set of graphoneme (pair of grapheme and phoneme sequence) probabilities, the pronunciation generation models learn the best rules to align graphemes to phonemes. The trained models are used to derive the most probable pronunciation \hat{Prn} for each word W, such that

    \hat{Prn} = \arg\max_{Prn} P(W, Prn),    (2)

where P(W, Prn) is the joint probability of the word and its possible pronunciations. Trained acoustic models are then used to derive the most probable pronunciation \hat{Prn} for each word W in the acoustic data. Using the acoustic data X, we approximate Eqn. (1) as

    \hat{Prn} = \arg\max_{Prn} P(X | Prn) P(Prn | W).    (3)

Limiting the number of alternate pronunciations for each word to the top N pronunciations of the word and assuming P(Prn | W) to be a constant for each word, Eqn. (3) reduces to

    \hat{Prn} = \arg\max_{Prn \in \text{top-}N \text{ pron. of } W} P(X | Prn).    (4)

The trained G2P models are used to generate pronunciations for the remaining words, i.e., words in the training corpus and in the recognition language model of the ASR system that are not present in the initial pronunciation dictionary.
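As an illustration of the two-stage search in Eqs. (2)-(4), the following is a minimal Python sketch, not the workshop's actual tooling: g2p_nbest (returning n-best candidate pronunciations with their joint probabilities) and align_loglik (returning the acoustic log-likelihood of force-aligning one utterance with a fixed pronunciation) are hypothetical stand-ins for a G2P engine and an acoustic aligner.

    def pick_pronunciation(word, utterances, g2p_nbest, align_loglik, n=5):
        """Eq. (4): among the top-N G2P candidates for `word`, return the
        pronunciation that maximizes the total acoustic log-likelihood of
        the utterances containing the word."""
        candidates = g2p_nbest(word, n)   # [(phone_seq, P(W, Prn)), ...], per Eq. (2)
        best_prn, best_score = None, float("-inf")
        for prn, _joint in candidates:    # P(Prn | W) treated as constant: Eq. (3) -> (4)
            score = sum(align_loglik(utt, word, prn) for utt in utterances)
            if score > best_score:
                best_prn, best_score = prn, score
        return best_prn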
2.2. Refining pronunciations

We start the iterative process of building a lexicon using an initial pronunciation dictionary containing a few hand-crafted pronunciations. We use this dictionary as a bootstrap lexicon for training G2P models as described in the previous section. Since we do not have any trained acoustic models yet, we use the G2P models to generate pronunciations for all the remaining words in the recognizer's vocabulary. Our first acoustic models are then trained using this dictionary.

We now use this initial acoustic model to search for the best pronunciations of words as described earlier. In Eq. (4), which is essentially a forced alignment step involving a Viterbi search through the word lattices, pronunciations that increase the likelihood of the training data are picked up. We use the set of pronunciations derived from this process to create a new pronunciation dictionary. This new pronunciation dictionary, along with the initial pronunciation dictionary of hand-crafted pronunciations, is used to re-train the G2P models and subsequently new acoustic models. Using these acoustic models to force align the data, we recreate the pronunciation dictionary with the new pronunciations to build new G2P models. This procedure is repeated iteratively until the best performing acoustic models are obtained. We do not retain multiple pronunciations in the dictionary for each word, as we did not find this to be helpful. Instead, we pick the pronunciation with the maximum number of aligned instances for the word. Before using the resulting dictionary to train the G2P, we also discard words where the chosen pronunciation had only one aligned instance in the data.
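The selection rule at the end of each iteration reduces to a count-and-filter step. A minimal sketch follows, assuming the forced-alignment pass has been collapsed to a list of (word, phone sequence) pairs, one per aligned word instance; this input format is an assumption, not the output of any particular aligner.

    from collections import Counter

    def select_pronunciations(alignments):
        """Keep, for each word, the pronunciation aligned in the largest
        number of instances; drop words whose winning pronunciation was
        aligned only once."""
        counts = {}                                  # word -> Counter over phone tuples
        for word, phones in alignments:
            counts.setdefault(word, Counter())[tuple(phones)] += 1
        lexicon = {}
        for word, ctr in counts.items():
            best_prn, n_instances = ctr.most_common(1)[0]
            if n_instances > 1:                      # discard single-instance evidence
                lexicon[word] = list(best_prn)
        return lexicon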
2.2.1. Approach for phonetic languages such as Spanish

In the case of Spanish, since letter-to-sound (LTS) rules are very simple, the G2P system does not generate sufficient alternates for the dictionary learning described above. We therefore use an unsupervised approach to generate an optimized pronunciation dictionary using the acoustic training data. Using an ASR system built with the initial pronunciation dictionary, we decode the training data both phonetically and at the word level. We use the time stamps on these recognized outputs to pick a set of reliable, phonetically annotated words. The selection procedure is illustrated with an example below. Table 1 shows an illustration of the phonetic and word level recognition of a hypothetical sentence. The sentence is transcribed at the word level into the sequence of words "w1 w2 w3 ... w6" and at the phonetic level into the phonemes "p1 p2 p3 ... p30".

Table 1. Illustration of aligning phonetic and word level transcriptions

    Start frame   Word  |  Start frame   Phoneme
    10            w1    |   8            p1
                        |  11            p2
    ...           ...   |  ...           ...
    21            w2    |  21            p9
                        |  24            p10
                        |  26            p11
    28            w3    |  28            p12
    ...           ...   |  ...           ...
    47            w6    |  43            p25
                        |  44            p26
                        |  ...           ...
                        |  48            p30

In this example, we pick the phoneme sequence "p9 p10 p11" as the pronunciation of the word w2, as their phonetic and word alignments match. In this unsupervised approach, by indirectly using the likelihoods of the acoustic data, we rely on the acoustic data to pick reliable pronunciations.
2.3. Adding more pronunciations to the dictionary using untranscribed audio data

Using the best acoustic models trained in the previous step, new pronunciations are added to the pronunciation dictionary in this step. We use the best acoustic model to decode in-domain speech from different databases. The decoded output is augmented with a confidence score representative of how reliable the recognized output is. The recognized output is also used as a reference transcript to force align the acoustic data to phonetic labels. For this forced alignment step we use a reference dictionary with the top N pronunciations (for example, N = 5) from the best G2P model. Using a threshold on the confidence score, reliable words and their phonetic labels are selected. Table 2 shows an illustration of a decoded sentence along with confidence scores for each word. The sentence is decoded into a sequence of words "w1 w2 w3 ... w8" with confidence scores "c1 c2 c3 ... c8". Using the decoded sequence of words, the sentence is also force-aligned into phonemes "p1 p2 p3 ... p48".

Table 2. Illustration of a decoded sentence along with confidence scores and aligned phonetic labels

    Start frame   Word   Confidence score   Phoneme
    4             w1     c1 = 0.15          p1
                                            p2
    ...           ...    ...                ...
    11            w3     c3 = 0.93          p10
                                            p11
                                            p12
    15            w4     c4 = 0.84          p13
    ...           ...    ...                ...
    42            w7     c7 = 0.96          p32
                                            p33
                                            ...
                                            p48

In our case, we set a confidence score threshold of 0.9 and select words like w3, with its phonetic transcription "p10 p11 p12". We also remove pronunciations that are not clear winners against other competing pronunciations of the same word instance. We train G2P models after adding new words and their pronunciations derived using this unsupervised technique.
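A minimal sketch of this selection, assuming the decoder output has been collapsed to (word, confidence, aligned phone sequence) triples (a hypothetical format); the clear-winner check can then reuse the instance counting shown in Section 2.2.

    def select_by_confidence(decoded, threshold=0.9):
        """Keep the aligned phone sequence only for word instances whose
        confidence score clears the threshold (0.9 in our experiments)."""
        return [(word, phones)
                for word, conf, phones in decoded
                if conf >= threshold]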
3. EXPERIMENTS AND RESULTS

For our experiments in English, we built an LVCSR system using the Callhome English corpus [9]. The conversational nature of the speech database, along with high out-of-vocabulary rates, use of foreign words and telephone channel distortions, makes the task of speech recognition on this database challenging. The conversational telephone speech (CTS) database consists of 120 spontaneous telephone conversations between native English speakers. Eighty conversations, corresponding to about 15 hours of speech, form the training set. The vocabulary size of this training set is 5K words. Instead of using a pronunciation dictionary that covers the entire 5K words, we use a dictionary that contains only the 1K most frequently occurring words. The pronunciations for these words are taken from the PRONLEX dictionary.

Two sets of 20 conversations, each containing roughly 1.8 hours of speech, form the test and development sets. With the selected set of 1K words, the OOV rate is close to 12%. We build a 62K trigram language model (LM) with an OOV rate of 0.4%. The language model is interpolated from individual models created using the English Callhome corpus, the Switchboard corpus, the Gigaword corpus and some web data. The web data is obtained by crawling the web for sentences containing high frequency bigrams and trigrams occurring in the training text of the Callhome corpus. We use the SRILM tools to build the LM. We use 39-dimensional PLP features to build a single pass HTK [10] based recognizer with 1920 tied states and 18 mixtures per state, along with this LM.

In our experiments, our goal is to improve the pronunciation dictionary such that it effectively covers the pronunciations of unseen words in the training and test sets. Figure 1 illustrates the iterative process we use to improve this limited pronunciation dictionary for English. We start the training process with a pronunciation dictionary of the 1K most frequently occurring words. This pronunciation dictionary is used to train G2P models, which generate pronunciations for the remaining unseen words in the train and test sets of the ASR system. As described in Section 2, we use the trained acoustic models to subsequently refine pronunciations. The forced alignment step picks pronunciations that increase the likelihood of the training data from a set of the 5 most likely pronunciations predicted by the model.

[Fig. 1. Schematic of lexicon learning with limited training examples. Loop: create an initial dictionary with limited training examples; train grapheme-to-phoneme models; use the models to generate pronunciations to train and test the LVCSR system; force align the training data with the new models; pick new pronunciations for words and update the pronunciation dictionary; repeat while the acoustic models improve. Once they stop improving, use the best models to decode in-domain speech from different databases, pick reliable pronunciations, update the dictionary, train new grapheme-to-phoneme models, and generate pronunciations to test the LVCSR system.]
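The loop of Fig. 1 can be summarized as below. This is a minimal sketch with the heavy stages abstracted as callables; train_g2p, train_acoustic_model and force_align are hypothetical stand-ins for the actual training and alignment tools, and select_pronunciations is the vote-and-filter sketch from Section 2.2.

    def learn_lexicon(seed_lexicon, vocab, train_data,
                      train_g2p, train_acoustic_model, force_align):
        """Iterate G2P training, lexicon generation, acoustic model training
        and forced alignment until recognition accuracy stops improving."""
        lexicon, best_acc, best = dict(seed_lexicon), float("-inf"), None
        while True:
            g2p = train_g2p(lexicon)          # retrain G2P on the current pool
            full_lex = {w: lexicon.get(w) or g2p(w) for w in vocab}
            am, acc = train_acoustic_model(full_lex, train_data)
            if acc <= best_acc:               # acoustic models no longer improve
                return best
            best_acc, best = acc, (full_lex, am)
            alignments = force_align(am, full_lex, train_data)
            # majority vote plus singleton filter (Section 2.2); hand-crafted
            # seed pronunciations always take precedence
            lexicon = {**select_pronunciations(alignments), **seed_lexicon}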
We select close to 3.5K words and their pronunciations from this forced alignment step, after throwing out singletons and words that do not have a clear preferred pronunciation. These new pronunciations, along with the initial training set, are then used in the next iteration. We continue this iterative process as long as the performance of the recognizer increases.

We start with models trained using only 1K graphonemes (word-pronunciation pairs). For each subsequent iteration, pronunciations from forced alignments are used to train new grapheme-to-phoneme models. Table 3 shows the word accuracies we obtain for different iterations of lexicon training. We obtain the best performance in Iteration 4. We use G2P models of order 4 in this experiment.

Table 3. Word recognition accuracies (%) using different iterations of training for English

    Iteration 1                                 41.38
    Iteration 2                                 42.00
    Iteration 3                                 41.45
    Iteration 4                                 42.93
    Iteration 5                                 42.77
    Iteration 6                                 42.37
    Iteration 4 + new pronunciations
      from untranscribed Switchboard data       43.25
    Full training dictionary                    44.35

To add new words and their pronunciations to the dictionary, we decoded 300 hours of Switchboard data using the best acoustic models obtained in Iteration 4. The decoded outputs were then used as labels to force align the acoustic data. Using the approach outlined in Section 2.3, we use a confidence based measure to select about 2.5K new pronunciations. These pronunciations are appended to the pronunciation dictionary used in Iteration 4. We added the pronunciations with a precedence order to ensure that words in the pronunciation dictionary have the most reliable pronunciations: limited hand-crafted pronunciations first, followed by pronunciations from forced alignment with the best acoustic models, and finally pronunciations from unsupervised learning, while allowing only one pronunciation per word. New grapheme-to-phoneme models are trained using this dictionary. Without retraining the acoustic models, we used the new grapheme-to-phoneme models to generate a new pronunciation dictionary. This new dictionary is then used to decode the test set. Adding words and pronunciations using this unsupervised technique improves the performance further, from 42.93% to 43.25%. To verify the effectiveness of our technique, we use the complete PRONLEX dictionary to train the ASR system. Compared to the best performance possible with the current training set, the iterative process brings us within 1% WER of the full ASR system. We use G2P models of order 8 while training with the complete dictionary.
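The precedence rule amounts to an ordered merge with one pronunciation per word. A minimal sketch, assuming each source is a dict mapping a word to its single phone sequence:

    def merge_with_precedence(hand_crafted, forced_aligned, unsupervised):
        """Later updates overwrite earlier ones, so unsupervised entries lose
        to forced-alignment entries, which lose to hand-crafted entries."""
        merged = dict(unsupervised)       # lowest precedence first
        merged.update(forced_aligned)
        merged.update(hand_crafted)       # hand-crafted pronunciations always win
        return merged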
In the second scenario, Spanish, the written form is phonetic and simple LTS rules are usually used for creating lexicons. For our experiments, we build an LVCSR system using the Callhome Spanish corpus. We attempt to improve the pronunciation dictionary for this language by creating an optimized initial pronunciation dictionary using the acoustic training data. Similar to the English database, the Spanish database consists of 120 spontaneous telephone conversations between native speakers. We use 16 hours of Spanish to train an ASR system as described before. We use an automatically generated pronunciation dictionary from Callhome as the initial pronunciation dictionary. After training an ASR system using this dictionary, we decode the training data both phonetically and at the word level. As described in Section 2.2, we derive a set of reliable pronunciations by aligning these transcripts. We use this new dictionary to train grapheme-to-phoneme models for Spanish. Similar to the English lexicon experiments, we train new acoustic models and grapheme-to-phoneme models using reliable pronunciations from a forced alignment step. Table 4 shows the results of our experiments with the Spanish data. Using the improved dictionary improves the performance of the system by over 1%.

Table 4. Word recognition accuracies (%) using different initial pronunciation dictionaries for Spanish

    Using automatically generated LDC pronunciations   30.45
    Using optimized pronunciation dictionary           31.65

4. CONCLUSIONS

We have proposed and explored several approaches to improve pronunciation dictionaries created with only a few hand-crafted samples. The techniques provide improvements for ASR systems in two different languages using only a few training examples. However, the selection of the right techniques depends on the nature of the language. Although we explored unsupervised learning of the lexicon for English, we did not combine it with unsupervised learning of acoustic models. We plan to do so, and hope that this will make a powerful learning technique for resource-poor languages.

5. REFERENCES

[1] D. Povey et al., "Subspace Gaussian mixture models for speech recognition," submitted to ICASSP, 2010.
[2] L. Burget et al., "Multilingual acoustic modeling for speech recognition based on subspace Gaussian mixture models," submitted to ICASSP, 2010.
[3] A. Ghoshal et al., "A novel estimation of feature-space MLLR for full-covariance models," submitted to ICASSP, 2010.
[4] T. Slobada and A. Waibel, "Dictionary learning for spontaneous speech recognition," in Proc. ICSLP, 1996.
[5] R. Singh, B. Raj, and R. M. Stern, "Automatic generation of phone sets and lexical transcriptions," in Proc. IEEE ICASSP, 2000, pp. 1691-1694.
[6] C. Wooters and A. Stolcke, "Multiple-pronunciation lexical modeling in a speaker independent speech understanding system," in Proc. ICSLP, 1994.
[7] S. Deligne and F. Bimbot, "Inference of variable-length linguistic and acoustic units by multigrams," Speech Communication, vol. 23, no. 3, pp. 223-241, 1997.
[8] M. Bisani and H. Ney, "Joint sequence models for grapheme-to-phoneme conversion," Speech Communication, vol. 50, no. 5, pp. 434-451, 2008.
[9] A. Canavan, D. Graff, and G. Zipperlen, "CALLHOME American English Speech," Linguistic Data Consortium, 1997.
[10] S. Young et al., "The HTK Book," Cambridge University Engineering Department, 2009.
