Week-2


Spelling Correction: Edit Distance

Pawan Goyal

CSE, IITKGP

Week 2: Lecture 1

Pawan Goyal (IIT Kharagpur) Spelling Correction: Edit Distance Week 2: Lecture 1 1 / 20
Spelling Correction

I am writing this email on behaf of ...


The user typed ‘behaf’.

Which are some close words?

behalf
behave
....

Isolated word error correction

Pick the one that is closest to ‘behaf’
How to define ‘closest’?
Need a distance metric
The simplest metric: edit distance

Edit Distance

The minimum edit distance between two strings
is the minimum number of editing operations:
Insertion
Deletion
Substitution
Minimum Edit Distance

Example
Edit distance from ‘intention’ to ‘execution’

Minimum Edit Distance

If each operation has a cost of 1 (Levenshtein)
Distance between ‘intention’ and ‘execution’ is 5

If substitution costs 2 (alternate version)
Distance between ‘intention’ and ‘execution’ is 8
How to find the Minimum Edit Distance?

Searching for a path (sequence of edits) from the start string to the final string:
Initial state: the word we are transforming

Operators: insert, delete, substitute
Goal state: the word we are trying to get to
Path cost: the number of edits (what we want to minimize)
Minimum Edit as Search

How to navigate?

The space of all edit sequences is huge
Lots of distinct paths end up at the same state
We don’t have to keep track of all of them
Keep track of the shortest path to each state
Defining Minimum Edit Distance Matrix

For two strings


X of length n

Y of length m

We define D(i, j)
the edit distance between X[1..i] and Y[1..j]
i.e., the first i characters of X and the first j characters of Y

Thus, the edit distance between X and Y is D(n, m)
Computing Minimum Edit Distance

Dynamic Programming

A tabular computation of D(n, m)
Solving problems by combining solutions to subproblems
Bottom-up:
Compute D(i, j) for small i, j
Compute larger D(i, j) based on previously computed smaller values
Compute D(i, j) for all i and j till you get to D(n, m)
Dynamic Programming Algorithm

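The tabular computation described above can be sketched in Python (a minimal illustration of the recurrence, not the lecture's exact pseudocode; the `sub_cost` parameter covers both the Levenshtein and cost-2-substitution variants):

```python
def min_edit_distance(x, y, sub_cost=1):
    """D[i][j] = edit distance between x[:i] and y[:j]."""
    n, m = len(x), len(y)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i                            # delete all of x[:i]
    for j in range(1, m + 1):
        D[0][j] = j                            # insert all of y[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if x[i - 1] == y[j - 1] else sub_cost
            D[i][j] = min(D[i - 1][j] + 1,          # deletion
                          D[i][j - 1] + 1,          # insertion
                          D[i - 1][j - 1] + sub)    # substitution (or match)
    return D[n][m]

print(min_edit_distance("intention", "execution"))              # 5 (Levenshtein)
print(min_edit_distance("intention", "execution", sub_cost=2))  # 8 (substitution costs 2)
```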
The Edit Distance Table

Computing Alignments

Computing edit distance may not be sufficient for some applications

We often need to align characters of the two strings to each other
We do this by keeping a “backtrace”
Every time we enter a cell, remember where we came from
When we reach the end, trace back the path from the upper right corner to read off the alignment
Minimum Edit with Backtrace

Adding Backtrace to Minimum Edit

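One way to add the backtrace, sketched in Python (the gap symbol `*` and the tie-breaking order among pointers are arbitrary illustrative choices, not prescribed by the lecture):

```python
def align(x, y):
    """Levenshtein distance plus one optimal alignment via backtrace."""
    n, m = len(x), len(y)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    ptr = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0], ptr[i][0] = i, "up"           # deletion
    for j in range(1, m + 1):
        D[0][j], ptr[0][j] = j, "left"         # insertion
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if x[i - 1] == y[j - 1] else 1
            # every time we fill a cell, remember where the minimum came from
            D[i][j], ptr[i][j] = min((D[i - 1][j - 1] + sub, "diag"),
                                     (D[i - 1][j] + 1, "up"),
                                     (D[i][j - 1] + 1, "left"))
    top, bottom, i, j = [], [], n, m           # trace back from (n, m)
    while i > 0 or j > 0:
        if ptr[i][j] == "diag":
            top.append(x[i - 1]); bottom.append(y[j - 1]); i, j = i - 1, j - 1
        elif ptr[i][j] == "up":
            top.append(x[i - 1]); bottom.append("*"); i -= 1
        else:
            top.append("*"); bottom.append(y[j - 1]); j -= 1
    return D[n][m], "".join(reversed(top)), "".join(reversed(bottom))

d, a, b = align("intention", "execution")
print(d)   # 5
print(a)
print(b)
```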
The distance matrix

Every non-decreasing path from (0,0) to (M,N) corresponds to an alignment of two sequences.
An optimal alignment is composed of optimal sub-alignments.
Result of Backtrace

Performance

Time: O(nm)
Space: O(nm)
Backtrace: O(n + m)
Weighted Edit Distance, Other variations

Pawan Goyal

CSE, IITKGP

Week 2: Lecture 2

Pawan Goyal (IIT Kharagpur) Weighted Edit Distance, Other variations Week 2: Lecture 2 1 / 12
Weighted Edit Distance

Why add weights to the computation?
Some letters are more likely to be mistyped.
Confusion Matrix for Spelling Errors

Keyboard Design

Weighted Minimum Edit Distance

How to modify the algorithm with transpose?

Transpose
transpose(x, y) = (y, x)
Also known as metathesis

Modification to the dynamic programming algorithm

D(i, j) = min of:
  D(i − 1, j) + 1                                              (deletion)
  D(i, j − 1) + 1                                              (insertion)
  D(i − 1, j − 1) + (1 if x[i] ≠ y[j], else 0)                 (substitution)
  D(i − 2, j − 2) + 1 if x[i] = y[j − 1] and x[i − 1] = y[j]   (transposition)
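The recurrence with the transposition case can be sketched in Python (restricted Damerau-Levenshtein, unit costs assumed):

```python
def dl_distance(x, y):
    """Edit distance with insertion, deletion, substitution, and
    transposition of two adjacent letters (unit costs)."""
    n, m = len(x), len(y)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        D[i][0] = i
    for j in range(m + 1):
        D[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if x[i - 1] == y[j - 1] else 1
            D[i][j] = min(D[i - 1][j] + 1,          # deletion
                          D[i][j - 1] + 1,          # insertion
                          D[i - 1][j - 1] + sub)    # substitution
            if i > 1 and j > 1 and x[i - 1] == y[j - 2] and x[i - 2] == y[j - 1]:
                D[i][j] = min(D[i][j], D[i - 2][j - 2] + 1)  # transposition
    return D[n][m]

print(dl_distance("ca", "ac"))  # 1 (one transposition; plain Levenshtein gives 2)
```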
How to find dictionary entries with smallest edit distance?

Naïve Method
Compute edit distance from the query term to each dictionary term – an exhaustive search
Can be made efficient if we do it over a trie structure
How to find dictionary entries with smallest edit distance?

Generate all possible terms with an edit distance ≤ 2 (deletion + transpose + substitution + insertion) from the query term and search them in the dictionary.

For a word of length 9 and an alphabet of size 36, this will lead to 114,324 terms to search for
For Chinese, the alphabet size is 70,000 (Unicode Han Characters)
How to find dictionary entries with smallest edit distance?

Symmetric Delete Spelling Correction


Generate terms with an edit distance ≤ 2 (deletes) from each dictionary

term (offline)
Generate terms with an edit distance ≤ 2 (deletes) from the input terms and search in the dictionary

Number of deletes within edit distance ≤ 2 for a word of length 9 will be 45
A further check is required to remove the false positives
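The delete generation used by symmetric delete correction can be sketched in Python; for a 9-letter word with all-distinct letters it produces the 45 variants mentioned above (9 single deletes + 36 double deletes):

```python
def deletes_within_2(word):
    """All strings reachable from `word` by deleting one or two characters."""
    d1 = {word[:i] + word[i + 1:] for i in range(len(word))}          # one delete
    d2 = {w[:i] + w[i + 1:] for w in d1 for i in range(len(w))}       # two deletes
    return d1 | d2

print(len(deletes_within_2("copyright")))  # 45 for a 9-letter word with distinct letters
```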
Spelling Correction

Types of spelling errors: Non-word Errors

behaf → behalf

Types of spelling errors: Real-word Errors
Typographical errors: three → there
Cognitive errors (homophones): piece → peace, too → two
Non-word spelling errors

Non-word spelling error detection


Any word not in a dictionary is an error

The larger the dictionary the better

Non-word spelling error correction
Generate candidates: real words that are similar to the error word
Choose the best one:
Shortest weighted edit distance
Highest noisy channel probability
Real word spelling errors

For each word w, generate candidate set

Find candidate words with similar pronunciations
Find candidate words with similar spelling
Include w in candidate set

Choosing best candidate
Noisy Channel
Noisy Channel Model for Spelling Correction

Pawan Goyal

CSE, IITKGP

Week 2: Lecture 3

Pawan Goyal (IIT Kharagpur) Noisy Channel Model for Spelling Correction Week 2: Lecture 3 1 / 17
Noisy Channel

We see an observation x of the misspelled word

Find the correct word w

ŵ = argmax_{w ∈ V} P(w|x)
  = argmax_{w ∈ V} P(x|w) P(w) / P(x)
  = argmax_{w ∈ V} P(x|w) P(w)
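A toy version of this argmax, using the ‘behaf’ example from Lecture 1; the channel and prior probabilities below are made-up numbers for illustration, not estimates from real data:

```python
# Hypothetical values -- in practice P(x|w) comes from a confusion matrix
# and P(w) from corpus counts.
channel = {"behalf": 0.005, "behave": 0.001}       # P("behaf" | w)
prior   = {"behalf": 0.00002, "behave": 0.00004}   # P(w)

def best_correction(candidates):
    # w-hat = argmax over candidates of P(x|w) * P(w)
    return max(candidates, key=lambda w: channel[w] * prior[w])

print(best_correction(["behalf", "behave"]))  # behalf
```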
Non-word spelling error: acress

Words with similar spelling


Small edit distance to error

Words with similar pronunciation
Small edit distance of pronunciation to error

Damerau-Levenshtein edit distance
Minimum edit distance, where edits are:
Insertion, Deletion, Substitution,
Transposition of two adjacent letters
Words within edit distance 1 of acress

Candidate generation

80% of errors are within edit distance 1

Almost all errors within edit distance 2

Allow deletion of space or hyphen
thisidea → this idea
inlaw → in-law
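Candidate generation at edit distance 1 can be sketched in the style of Peter Norvig's well-known spelling corrector (the lowercase alphabet and function name are illustrative choices, not from the lecture):

```python
def edits1(word, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """All strings one insertion, deletion, substitution, or adjacent
    transposition away from `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes    = {L + R[1:] for L, R in splits if R}
    transposes = {L + R[1] + R[0] + R[2:] for L, R in splits if len(R) > 1}
    replaces   = {L + c + R[1:] for L, R in splits if R for c in alphabet}
    inserts    = {L + c + R for L, R in splits for c in alphabet}
    return deletes | transposes | replaces | inserts

print("behalf" in edits1("behaf"))  # True: one insertion away
```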
Computing error probability: confusion matrix

del[x,y]: count (xy typed as x)

ins[x,y]: count (x typed as xy)
sub[x,y]: count (x typed as y)
trans[x,y]: count (xy typed as yx)

Insertion and deletion are conditioned on previous character
Channel model

Channel model for acress

Noisy channel probability for acress

Using a bigram language model

“ ... versatile acress whose ...”

Counts from the Corpus of Contemporary American English with add-1 smoothing
P(actress|versatile) = 0.000021, P(across|versatile) = 0.000021
P(whose|actress) = 0.0010, P(whose|across) = 0.000006
P(“versatile actress whose”) = 0.000021 × 0.0010 = 210 × 10^−10
P(“versatile across whose”) = 0.000021 × 0.000006 ≈ 1 × 10^−10
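The two products above can be reproduced directly from the quoted bigram probabilities (a minimal sketch; a full model would back the table with corpus counts):

```python
# Bigram probabilities quoted on the slide (COCA, add-1 smoothing)
bigram = {
    ("versatile", "actress"): 0.000021,
    ("versatile", "across"):  0.000021,
    ("actress", "whose"):     0.0010,
    ("across", "whose"):      0.000006,
}

def phrase_prob(words):
    """Approximate P(w1 ... wn) as the product of successive bigram probabilities."""
    p = 1.0
    for prev, cur in zip(words, words[1:]):
        p *= bigram[(prev, cur)]
    return p

print(phrase_prob(["versatile", "actress", "whose"]))  # ~2.1e-08 = 210 x 10^-10
print(phrase_prob(["versatile", "across", "whose"]))   # ~1.3e-10
```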
Real-word spelling errors

The study was conducted mainly be John Black
The design an construction of the system ...

25-40% of spelling errors are real words
Noisy channel for real-word spell correction

Given a sentence X = w1, w2, w3, ..., wn

Candidate(w1) = {w1, w1′, w1″, w1‴, ...}
Candidate(w2) = {w2, w2′, w2″, w2‴, ...}
Candidate(w3) = {w3, w3′, w3″, w3‴, ...}

Choose the sequence W that maximizes P(W|X)
Noisy channel for real-word spell correction
Simplification: One error per sentence

Choose among all possible sentences with one word replaced

two of thew (observed X)
w1, w2″, w3 → two off thew
w1, w2, w3′ → two of the
w1‴, w2, w3 → too of thew

Choose the sequence W that maximizes P(W|X)
Getting the probability values

Noisy Channel

Ŵ = argmax_{W ∈ S} P(W|X)
  = argmax_{W ∈ S} P(X|W) P(W)

where X is the observed sentence and S is the set of all the possible sequences from the candidate set

P(X|W)
Same as for non-word spelling correction
Also requires the probability of no error, P(w|w)
Probability of no error

What is the probability for a correctly typed word? P(“the”|“the”)

It may depend on the source text under consideration
1 error in 10 words → 0.9
1 error in 100 words → 0.99
Computing P(W)

Use a Language Model
Unigram
Bigram
...
N-gram Language Models

Pawan Goyal

CSE, IITKGP

Week 2: Lecture 4

Pawan Goyal (IIT Kharagpur) N-gram Language Models Week 2: Lecture 4 1 / 24




Context Sensitive Spelling Correction

The office is about fifteen minuets from my house

EL
Use a Language Model PT
P(about fifteen minutes from) > P(about fifteen minuets from)
N

Pawan Goyal (IIT Kharagpur) N-gram Language Models Week 2: Lecture 4 2 / 24


Probabilistic Language Models: Applications

Speech Recognition
P(I saw a van) >> P(eyes awe of an)

Machine Translation
Which sentence is more plausible in the target language?
P(high winds) > P(large winds)

Other Applications
Context Sensitive Spelling Correction
Natural Language Generation
...


Completion Prediction

Language model also supports predicting the completion of a sentence.
Please turn off your cell ...
Your program does not ...

Predictive text input systems can guess what you are typing and give
choices on how to complete it.


Probabilistic Language Modeling

Goal: Compute the probability of a sentence or sequence of words:

P(W) = P(w1, w2, w3, ..., wn)

Related Task: probability of an upcoming word:

P(w4 | w1, w2, w3)

A model that computes either of these is called a language model


Computing P(W)

How to compute the joint probability
P(about, fifteen, minutes, from)

Basic Idea
Rely on the Chain Rule of Probability


The Chain Rule

Conditional Probabilities
P(B|A) = P(A, B) / P(A)

P(A, B) = P(A)P(B|A)

More Variables
P(A, B, C, D) = P(A)P(B|A)P(C|A, B)P(D|A, B, C)

The Chain Rule in General
P(x1, x2, ..., xn) = P(x1)P(x2|x1)P(x3|x1, x2) ... P(xn|x1, ..., xn−1)


Probability of words in sentences

P(w1 w2 ... wn) = ∏_i P(wi | w1 w2 ... wi−1)

P("about fifteen minutes from") =
P(about) x P(fifteen | about) x P(minutes | about fifteen) x P(from | about fifteen minutes)


Estimating These Probability Values

Count and divide

P(office | about fifteen minutes from) =
Count(about fifteen minutes from office) / Count(about fifteen minutes from)

What is the problem?
We may never see enough data for estimating these


Markov Assumption

Simplifying Assumption: Use only the previous word
P(office | about fifteen minutes from) ≈ P(office | from)

Or the previous couple of words
P(office | about fifteen minutes from) ≈ P(office | minutes from)


Markov Assumption

More Formally: kth order Markov Model

Chain Rule:
P(w1 w2 ... wn) = ∏_i P(wi | w1 w2 ... wi−1)

Using Markov Assumption: only k previous words
P(w1 w2 ... wn) ≈ ∏_i P(wi | wi−k ... wi−1)

We approximate each component in the product
P(wi | w1 w2 ... wi−1) ≈ P(wi | wi−k ... wi−1)


N-Gram Models

P(office | about fifteen minutes from)

An N-gram model uses only N − 1 words of prior context.
Unigram: P(office)
Bigram: P(office | from)
Trigram: P(office | minutes from)

Markov model and Language Model
An N-gram model is an (N − 1)-order Markov Model


N-Gram Models

We can extend to trigrams, 4-grams, 5-grams

In general, an insufficient model of language:
language has long-distance dependencies:
"The computer which I had just put into the machine room on the fifth
floor crashed."

In most of the applications, we can get away with N-gram models


Estimating N-gram probabilities

Maximum Likelihood Estimate
Value that makes the observed data the "most probable"

P(wi | wi−1) = count(wi−1, wi) / count(wi−1)
             = c(wi−1, wi) / c(wi−1)


An Example

P(wi | wi−1) = c(wi−1, wi) / c(wi−1)

<s> I am here </s>
<s> who am I </s>
<s> I would like to know </s>

Estimating bigrams
P(I | <s>) = 2/3
P(</s> | here) = 1
P(would | I) = 1/3
P(here | am) = 1/2
P(know | like) = 0
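The estimates above can be reproduced with a few lines of counting code. A small sketch over the same three toy sentences, using exact fractions so the results match the slide's 2/3 and 1/2:

```python
from fractions import Fraction
from collections import Counter

# Toy corpus from the slide, with sentence boundary markers.
corpus = [
    "<s> I am here </s>",
    "<s> who am I </s>",
    "<s> I would like to know </s>",
]

unigrams, bigrams = Counter(), Counter()
for sent in corpus:
    tokens = sent.split()
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

def p(word, prev):
    """MLE bigram estimate P(word | prev) = c(prev, word) / c(prev)."""
    return Fraction(bigrams[(prev, word)], unigrams[prev])

# Matches the slide: P(I|<s>) = 2/3, P(here|am) = 1/2, P(know|like) = 0
```

The unseen bigram "like know" gets probability 0, which is exactly the sparsity problem that smoothing (later in this week) addresses.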


Bigram counts from 9222 Restaurant Sentences

[Table of bigram counts not preserved in the text extraction]


Computing bigram probabilities

Normalize by unigram counts

Bigram Probabilities
[Tables not preserved in the text extraction]


Computing Sentence Probabilities

P(<s> I want english food </s>)
= P(I | <s>) x P(want | I) x P(english | want) x P(food | english) x P(</s> | food)
= 0.000031


What knowledge does n-gram represent?

P(english | want) = .0011
P(chinese | want) = .0065
P(to | want) = .66
P(eat | to) = .28
P(food | to) = 0
P(want | spend) = 0
P(i | <s>) = .25


Practical Issues

Everything in log space
Avoids underflow
Adding is faster than multiplying
log(p1 × p2 × p3 × p4) = log p1 + log p2 + log p3 + log p4

Handling zeros
Use smoothing
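The log-space identity can be checked numerically. A small sketch, using the bigram probabilities quoted on the earlier sentence-probability slide for "<s> I want english food </s>":

```python
import math

# Bigram probabilities for "<s> I want english food </s>" as on the earlier
# slide: P(I|<s>), P(want|I), P(english|want), P(food|english), P(</s>|food).
probs = [0.25, 0.33, 0.0011, 0.5, 0.68]

# Multiplying many small probabilities risks floating-point underflow;
# summing their logs does not.
log_prob = sum(math.log(p) for p in probs)
prob = math.exp(log_prob)   # ~0.000031, matching the slide
```

For a real test set with millions of tokens, only `log_prob` would ever be materialized; `exp` is applied (if at all) at the very end.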


Language Modeling Toolkit

SRILM
http://www.speech.sri.com/projects/srilm/


Google N-grams

Number of tokens: 1,024,908,267,229
Number of sentences: 95,119,665,584
Number of unigrams: 13,588,391
Number of bigrams: 314,843,401
Number of trigrams: 977,069,902
Number of fourgrams: 1,313,818,354
Number of fivegrams: 1,176,470,663

http://googleresearch.blogspot.in/2006/08/all-our-n-gram-are-belong-to-you.html


Example from the 4-gram data

serve as the inspector 66
serve as the inspiration 1390
serve as the installation 136
serve as the institute 187
serve as the institution 279
serve as the institutional 461


Google Books Ngram Data

[Figure not preserved in the text extraction]


Evaluation of Language Models, Basic Smoothing

Pawan Goyal

CSE, IITKGP

Week 2: Lecture 5

Pawan Goyal (IIT Kharagpur) Evaluation of Language Models, Basic Smoothing Week 2: Lecture 5 1 / 16
Evaluating Language Model

Does it prefer good sentences to bad sentences?
Assign higher probability to real (or frequently observed) sentences than
to ungrammatical (or rarely observed) ones

Training and Test Corpora
Parameters of the model are trained on a large corpus of text, called the
training set.
Performance is tested on a disjoint (held-out) test set using an
evaluation metric.
Extrinsic evaluation of N-gram models

Comparison of two models, A and B
Use each model for one or more tasks: spelling corrector, speech
recognizer, machine translation
Get accuracy values for A and B
Compare accuracy for A and B
Intrinsic evaluation: Perplexity

Intuition: The Shannon Game
How well can we predict the next word?
I always order pizza with cheese and . . .
The president of India is . . .
I wrote a . . .

Unigram model doesn't work for this game.

A better model of text
is one which assigns a higher probability to the actual word
Perplexity

The best language model is one that best predicts an unseen test set

Perplexity (PP(W))
Perplexity is the inverse probability of the test data, normalized by the number
of words:

PP(W) = P(w1 w2 ... wN)^(−1/N)

Applying the chain rule:

PP(W) = ( ∏_i 1 / P(wi | w1 ... wi−1) )^(1/N)

For bigrams:

PP(W) = ( ∏_i 1 / P(wi | wi−1) )^(1/N)
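The definition translates directly into code. A minimal sketch, computing PP(W) in log space from the per-token conditional probabilities P(wi | history):

```python
import math

def perplexity(token_probs):
    """PP(W) = P(w1 ... wN)^(-1/N), given each P(wi | history) in token_probs.
    Computed in log space so long test sets do not underflow."""
    N = len(token_probs)
    log_p = sum(math.log(p) for p in token_probs)
    return math.exp(-log_p / N)

# A model that assigns every token probability 1/10 has perplexity exactly 10,
# matching the random-digits example on the next slide.
```

Note the function fails (math domain error) if any probability is 0, which foreshadows why smoothing is needed before perplexity can be measured on unseen data.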
Example: A Simple Scenario

Consider a sentence consisting of N random digits
Find the perplexity of this sentence as per a model that assigns a
probability p = 1/10 to each digit.

PP(W) = P(w1 w2 ... wN)^(−1/N)
      = ((1/10)^N)^(−1/N)
      = (1/10)^(−1)
      = 10
Lower perplexity = better model

WSJ Corpus
Training: 38 million words
Test: 1.5 million words

[Table of perplexity results not preserved in the text extraction]

Unigram perplexity: 962?
The model is as confused on test data as if it had to choose uniformly and
independently among 962 possibilities for each word.
The Shannon Visualization Method

Use the language model to generate word sequences
Choose a random bigram (<s>, w) as per its probability
Choose a random bigram (w, x) as per its probability
And so on until we choose </s>
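A minimal sketch of this sampling loop, over a tiny hypothetical bigram distribution (the distribution itself is made up for illustration, not taken from the lecture's corpus):

```python
import random

# Tiny hypothetical bigram distribution P(next | current); each row sums to 1.
bigram_dist = {
    "<s>":   {"I": 0.7, "who": 0.3},
    "I":     {"am": 0.6, "would": 0.4},
    "who":   {"am": 1.0},
    "am":    {"here": 1.0},
    "would": {"like": 1.0},
    "like":  {"to": 1.0},
    "to":    {"know": 1.0},
    "here":  {"</s>": 1.0},
    "know":  {"</s>": 1.0},
}

def generate(seed=0):
    """Start at <s>, repeatedly sample the next word from P(. | current) until </s>."""
    rng = random.Random(seed)
    word, words = "<s>", []
    while True:
        nxt = rng.choices(list(bigram_dist[word]),
                          weights=list(bigram_dist[word].values()))[0]
        if nxt == "</s>":
            return " ".join(words)
        words.append(nxt)
        word = nxt
```

Every generated sentence starts with "I" or "who" and ends at </s>, mirroring the choose-a-bigram-then-continue procedure on the slide.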
Shakespeare as Corpus

N = 884,647 tokens, V = 29,066
Shakespeare produced 300,000 bigram types out of V² = 844 million
possible bigrams.
Approximating Shakespeare

[Figure of generated sentences not preserved in the text extraction]
Problems with simple MLE estimate: zeros

Training set
... denied the allegations
... denied the reports
... denied the claims
... denied the request

Test data
... denied the offer
... denied the loan

Zero probability n-grams
P(offer | denied the) = 0
The test set will be assigned a probability 0
And the perplexity can't be computed
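The failure mode is easy to reproduce in a few lines; a small sketch using the training set above with an MLE bigram estimate:

```python
from collections import Counter

# Training set from the slide.
train = ["denied the allegations", "denied the reports",
         "denied the claims", "denied the request"]

unigrams, bigrams = Counter(), Counter()
for s in train:
    toks = s.split()
    unigrams.update(toks)
    bigrams.update(zip(toks, toks[1:]))

def p_mle(w, prev):
    return bigrams[(prev, w)] / unigrams[prev]

p_offer = p_mle("offer", "the")   # unseen bigram -> 0.0
# Any test sentence containing "the offer" then gets probability 0, and
# perplexity, which takes a -1/N power of that probability, cannot be computed.
```

This is precisely the gap that the smoothing methods on the following slides are designed to fill.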
Language Modeling: Smoothing

With sparse statistics
Steal probability mass to generalize better
Laplace Smoothing (Add-one estimation)

Pretend as if we saw each word (N-gram) one more time than we actually
did
Just add one to all the counts!

MLE estimate for bigram:
P_MLE(wi | wi−1) = c(wi−1, wi) / c(wi−1)

Add-1 estimate:
P_Add-1(wi | wi−1) = (c(wi−1, wi) + 1) / (c(wi−1) + V)
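The add-1 estimate is a one-line change to the MLE counter. A minimal sketch over a toy two-sentence corpus; counting the boundary markers in the vocabulary here is an illustrative choice, not prescribed by the lecture:

```python
from fractions import Fraction
from collections import Counter

# Add-1 (Laplace) smoothing for bigrams:
#   P_Add-1(wi | wi-1) = (c(wi-1, wi) + 1) / (c(wi-1) + V)

train = ["<s> denied the allegations </s>",
         "<s> denied the reports </s>"]

unigrams, bigrams = Counter(), Counter()
for s in train:
    toks = s.split()
    unigrams.update(toks)
    bigrams.update(zip(toks, toks[1:]))

V = len(unigrams)   # vocabulary size (boundary markers counted in, for illustration)

def p_add1(w, prev):
    return Fraction(bigrams[(prev, w)] + 1, unigrams[prev] + V)

# Unseen bigrams now get probability 1/(c(prev) + V) instead of 0, and the
# smoothed distribution over the vocabulary still sums to 1.
```

The smoothed P(w | "the") is non-zero for every in-vocabulary word, at the cost of shaving mass off the observed bigrams (the "reconstituted counts" effect on the next slide).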
Reconstituted counts as effect of smoothing

Effective bigram count c*(wn−1 wn):

c*(wn−1 wn) / c(wn−1) = (c(wn−1 wn) + 1) / (c(wn−1) + V)
Comparing with bigrams: Restaurant corpus

[Tables of original and smoothed bigram counts not preserved in the text extraction]
More general formulations: Add-k

P_Add-k(wi | wi−1) = (c(wi−1, wi) + k) / (c(wi−1) + kV)

P_Add-k(wi | wi−1) = (c(wi−1, wi) + m(1/V)) / (c(wi−1) + m)

Unigram prior smoothing:

P_UnigramPrior(wi | wi−1) = (c(wi−1, wi) + mP(wi)) / (c(wi−1) + m)

A good value of k or m?
Can be optimized on a held-out set
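Both generalizations differ from add-1 only in the pseudo-count that gets added. A small sketch; the toy corpus and the default k and m values are illustrative only and would in practice be tuned on held-out data:

```python
from collections import Counter

# Add-k:          P(wi|wi-1) = (c(wi-1,wi) + k)       / (c(wi-1) + kV)
# Unigram prior:  P(wi|wi-1) = (c(wi-1,wi) + m P(wi)) / (c(wi-1) + m)

train = "the cat sat on the mat".split()
unigrams = Counter(train)
bigrams = Counter(zip(train, train[1:]))
V, N = len(unigrams), len(train)

def p_add_k(w, prev, k=0.5):
    return (bigrams[(prev, w)] + k) / (unigrams[prev] + k * V)

def p_unigram_prior(w, prev, m=1.0):
    prior = unigrams[w] / N          # unigram estimate P(wi)
    return (bigrams[(prev, w)] + m * prior) / (unigrams[prev] + m)

# Both remain proper distributions over the vocabulary for any k, m > 0;
# the unigram-prior variant backs off toward P(wi) rather than toward 1/V.
```

Replacing the uniform 1/V prior with P(wi) means frequent words receive more of the stolen probability mass, which is usually a better guess for unseen bigrams.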
