AIM-502: UNIT-2 WORD LEVEL ANALYSIS
Given a sequence of N-1 words, an N-gram model predicts the word most likely to follow. It is a probabilistic language model trained on a collection of text, and it is useful in applications such as speech recognition and machine translation. A simple model has limitations that can be improved by smoothing, interpolation, and backoff. In essence, an N-gram language model estimates a probability distribution over sequences of words. Consider the two sentences "There was heavy rain" and "There was heavy flood". From experience, a speaker would judge the first to be more natural. An N-gram language model captures this: "heavy rain" occurs more frequently in text than "heavy flood", so the first sentence is more likely and would be selected by the model. In a unigram (1-gram) model, prediction relies only on how often each word occurs, without considering the preceding words. In a bigram (2-gram) model, only the previous word is used to predict the current word; in a trigram (3-gram) model, the two previous words are used.
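As a minimal sketch of these ideas, the Python snippet below builds unigram and bigram counts from a tiny made-up corpus (the corpus and the counts are illustrative, not from any real dataset) and estimates P(word | previous word) by maximum likelihood:

```python
from collections import Counter

# Tiny illustrative corpus (hypothetical; any tokenized text works the same way).
corpus = [
    ["there", "was", "heavy", "rain"],
    ["there", "was", "heavy", "rain"],
    ["there", "was", "heavy", "flood"],
]

# Count unigrams and bigrams over the corpus.
unigrams = Counter(w for sent in corpus for w in sent)
bigrams = Counter(
    (sent[i], sent[i + 1]) for sent in corpus for i in range(len(sent) - 1)
)

def bigram_prob(prev, word):
    # Maximum-likelihood estimate: P(word | prev) = count(prev, word) / count(prev).
    return bigrams[(prev, word)] / unigrams[prev]

# "heavy rain" occurs more often than "heavy flood", so it scores higher.
print(bigram_prob("heavy", "rain"))   # 2/3
print(bigram_prob("heavy", "flood"))  # 1/3
```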
In an N-gram language model, the probability of a sentence is decomposed by the chain rule:

P("There was heavy rain") = P("There", "was", "heavy", "rain")
= P("There") P("was" | "There") P("heavy" | "There was") P("rain" | "There was heavy")
Since estimating such long conditional probabilities from data is not practical, the Markov assumption is used to approximate this with a bigram model:

P("There was heavy rain") ≈ P("There") P("was" | "There") P("heavy" | "was") P("rain" | "heavy")
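Continuing the sketch above (reusing the unigrams, bigrams, and bigram_prob defined there), the whole-sentence probability under the bigram approximation is just the product of these terms:

```python
def sentence_prob(sentence):
    # P(w1) * P(w2|w1) * P(w3|w2) * ... under the bigram (Markov) approximation.
    total = sum(unigrams.values())
    prob = unigrams[sentence[0]] / total  # unigram estimate for the first word
    for prev, word in zip(sentence, sentence[1:]):
        prob *= bigram_prob(prev, word)
    return prob

print(sentence_prob(["there", "was", "heavy", "rain"]))   # ≈ 0.167
print(sentence_prob(["there", "was", "heavy", "flood"]))  # ≈ 0.083
```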
A plain maximum-likelihood model assigns zero probability to any N-gram that never appeared in training, which is why smoothing techniques such as backoff and interpolation are used. The two differ in how they use lower-order models: backoff considers each lower-order estimate one at a time, falling back only when the higher-order estimate is unavailable, whereas interpolation combines all the lower-order probabilities together. As a result, interpolation is more computationally expensive than backoff.
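A rough sketch of linear interpolation, again reusing the counters from the earlier snippet; the lambda weights here are arbitrary illustrations, whereas in practice they are tuned on held-out data:

```python
# Trigram counts over the same corpus.
trigrams = Counter(
    (s[i], s[i + 1], s[i + 2]) for s in corpus for i in range(len(s) - 2)
)

def interpolated_prob(w2, w1, w0, l3=0.6, l2=0.3, l1=0.1):
    # P(w0 | w2 w1) ≈ l3*P_trigram + l2*P_bigram + l1*P_unigram, with l3+l2+l1 = 1.
    total = sum(unigrams.values())
    p_uni = unigrams[w0] / total
    p_bi = bigrams[(w1, w0)] / unigrams[w1] if unigrams[w1] else 0.0
    p_tri = trigrams[(w2, w1, w0)] / bigrams[(w2, w1)] if bigrams[(w2, w1)] else 0.0
    return l3 * p_tri + l2 * p_bi + l1 * p_uni

print(interpolated_prob("was", "heavy", "rain"))  # mixes all three orders
```

A backoff model, by contrast, would use the trigram estimate alone whenever its count is nonzero, and fall back to the bigram (and then the unigram) estimate only when it is zero; interpolation always mixes all three.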