
Year: 2023-2024

Spring Semester

Natural Language
Processing

Dr. Wafaa Samy


Syntax (Part 1)

Lecture (8)

2
Contents
• Syntax
• Context-Free Grammar (CFG)
• Derivation
• Parsing

3
Syntax
• The study of the order of words in a sentence and their relationships.
o Syntax defines word categories and functions.

• Subject, verb, object is a sequence of functions that corresponds to a common order in many European languages, including English and French.
• Syntactic Parsing determines the structure of a sentence.
4
Syntax (Cont.)
• Why should we care about syntax?
• Grammars (and parsing) are key components
in many applications like:
o Grammar checkers
o Question answering
o Information extraction
o Machine translation

5
Context-Free Grammars
• Context-Free Grammars (CFGs).
o Also known as:
 Phrase structure grammars.
 Backus-Naur Form (BNF).
• Consist of: Rules, Terminals, and Non-terminals
1. Terminals:
o We’ll take these to be words (for now).
2. Non-Terminals:
o The constituents in a language.
 Like noun phrase, verb phrase, verb, noun, sentence, etc.

3. Rules:
o Rules are equations that consist of a single non-terminal on the left
and any number of terminals and non-terminals on the right.

6
Some NP Rules
• Here are some rules for our noun phrases (NP):

NP → Det Nominal
NP → ProperNoun
Nominal → Noun | Noun Nominal

• Together, these three rules describe two kinds of NPs:
1. One that consists of a determiner followed by a
nominal.
2. And another that says that proper names are NPs.
• The third rule illustrates two things:
o An explicit disjunction (by using “|”): Two kinds of nominal.
o A recursive definition: Same non-terminal “Nominal” on the
right and left-side of the rule.
7
L0 Grammar

8
Formal Definition

9
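For reference, the figure here presumably gives the standard 4-tuple definition: a CFG is G = (N, Σ, R, S), where N is a set of non-terminals, Σ is a set of terminals (disjoint from N), R is a set of rules of the form A → α with A ∈ N and α ∈ (Σ ∪ N)*, and S ∈ N is the designated start symbol. This matches the notation G = ({S}, {a, b, c}, P, S) used in the questions later in these slides.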
Context-Free Grammar
• A context-free grammar consists of:
1. A set of production rules, each of which expresses the ways
that symbols of the language can be grouped and ordered
together, and
2. A lexicon of words and symbols.

10
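As a concrete illustration (not from the slides), here is a minimal sketch of such a grammar in NLTK's notation; the exact rule set is an assumption, loosely following the fragment used in these lectures:

import nltk

# Production rules plus a small lexicon, bundled into one grammar object.
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det Nominal | ProperNoun
Nominal -> Noun | Noun Nominal
VP -> Verb | Verb NP
Det -> 'a' | 'the'
Noun -> 'flight'
ProperNoun -> 'Cairo' | 'Adel'
Verb -> 'study'
""")

print(grammar.start())        # S, the designated start symbol
for p in grammar.productions():
    print(p)                  # one production rule per line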
Derivation and Parsing
• A derivation is a sequence of rules applied to a string
that accounts for that string:
o Covers all the elements in the string.
o Covers only the elements in the string.

11
Example (1): Derivation

• Use this grammar to make a derivation for the following statement:

Adel study Automata with Maged

• Derivation for the statement:

S ⇒ NP VP
⇒ ProperNoun VP
⇒ Adel VP
⇒ Adel Verb NP
⇒ Adel study NP
⇒ Adel study NP PP
⇒ Adel study ProperNoun PP
⇒ Adel study Automata PP
⇒ Adel study Automata Preposition NP
⇒ Adel study Automata with NP
⇒ Adel study Automata with ProperNoun
⇒ Adel study Automata with Maged
12
Derivation

If A → α is a production rule, and β and γ are any strings in the set (Σ ∪ V)*, then we say that β A γ directly derives β α γ, written as:

β A γ ⇒ β α γ

• Derivation is a generalization of direct derivation.

13
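A single direct-derivation step is easy to picture as list surgery; here is a tiny sketch (the helper name and list-of-symbols representation are hypothetical):

# Rewrite the first occurrence of non-terminal A in "beta A gamma"
# using a rule A -> alpha, giving "beta alpha gamma".
def direct_derive(symbols, lhs, rhs):
    i = symbols.index(lhs)                      # locate A
    return symbols[:i] + rhs + symbols[i + 1:]  # beta + alpha + gamma

print(direct_derive(["Adel", "VP"], "VP", ["Verb", "NP"]))
# ['Adel', 'Verb', 'NP']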
Language Defined by CFG
• The formal language defined by a CFG is the set of strings
that are derivable from the designated start symbol.

14
Parsing
• Parsing (or syntactic parsing) is the process of taking a string and a grammar and returning one or more parse trees for that string.
• It is completely analogous to running a finite-state
transducer with a tape.
o It’s just more powerful:
 Remember this means that there are languages we can
capture with CFGs that we can’t capture with finite-
state methods.

15
Example (2): Parsing

• Parse tree for: Adel study Automata with Maged
16


Example (3): Parsing

Parse Tree

17
Year: 2023-2024
Spring Semester

Natural Language
Processing

Dr. Wafaa Samy


Syntax (Part 2)

Lecture (9)

2
Contents
• Context-Free Grammar (CFG) (Cont.)
o Grammatical and Ungrammatical Sentences
o Grammar Equivalence
o Sentence Types
 Noun Phrase
• Agreement
 Verb Phrase
• Subcategorization
3
Question (1)
• Which of the following rules is NOT a correct context-free rule, where non-terminals are {A, B} and terminals are {a, b}?
1. A → a
2. A → a | bA
3. AB → a
4. a → bA
5. Both 3 and 4

4
Question (1): Solution
• Which of the following rules is NOT a correct context-free rule, where non-terminals are {A, B} and terminals are {a, b}?
1. A → a
2. A → a | bA
3. AB → a
4. a → bA
5. Both 3 and 4

Answer: 5 (Both 3 and 4). In a context-free rule the left-hand side must be a single non-terminal: rule 3 has two symbols (AB) on the left, and rule 4 has a terminal (a) on the left.
5
Question (2)
• Given {a, the, study} ∈ terminals and {VP, Verb, Det} ∈ non-terminals, which of the following production rules is NOT a correct CFG rule:
1. Det → a | the
2. VP → Verb
3. Verb → study
4. a → Det

6
Question (2): Solution
• Given {a, the, study} ∈ terminals and {VP, Verb, Det} ∈ non-terminals, which of the following production rules is NOT a correct CFG rule:
1. Det → a | the
2. VP → Verb
3. Verb → study
4. a → Det

Answer: 4. The left-hand side of a CFG rule must be a single non-terminal, but a is a terminal.

7
Question (3)
• Given {x, y} is a set of terminals and A is a
non-terminal, which of the following words is
correctly generated from the grammar with a
rule: A → x | yA
1. xyyy
2. yxyxyx
3. yyyx
4. xyy

8
Question (3): Solution
• Given {x, y} is a set of terminals and A is a non-terminal, which of the following words is correctly generated from the grammar with a rule: A → x | yA
1. xyyy
2. yxyxyx
3. yyyx
4. xyy

Answer: 3 (yyyx). The rule generates zero or more y’s followed by a single x: A ⇒ yA ⇒ yyA ⇒ yyyA ⇒ yyyx.

9
Grammatical and Ungrammatical
Sentences
• A CFG defines a formal language (a set of strings).

• Sentences (strings of words) that can be derived by a grammar are in the formal language defined by that grammar, and are called grammatical sentences.

10
Grammatical and Ungrammatical
Sentences (Cont.)
• Sentences that cannot be derived by a given formal
grammar are not in the language defined by that grammar,
and are referred to as ungrammatical sentences.


11
Grammar Equivalence
• Two Context Free Grammars (CFG) are
equivalent if they generate the same language
(i.e. set of strings).

12
Example (1)
(Figure: two grammars compared side by side, each defined by its set of non-terminals, set of terminals, start symbol, and set of rules.)

13
An English Grammar Fragment
• Sentences
• Noun phrases
o Agreement

• Verb phrases
o Subcategorization

• Prepositional phrases

14
Sentence Types
1. Declaratives: (e.g. A plane left.)
S → NP VP

2. Imperatives: (e.g. Leave!)
S → VP

3. Yes-No Questions: (e.g. Did the plane leave?)
S → Aux NP VP

4. WH Questions: (e.g. When did the plane leave?)
S → WH-NP Aux NP VP
15
Note
NP → Proper-Noun
NP → Det Nominal
Nominal → Noun | Noun Nominal
Proper-Noun → Cairo | Adel
Det → a
Det → the
Noun → flight

• Note: A CFG can be thought of as a generator for sentences.


o We could read the arrow as: rewrite the symbol on the left with
the string of symbols on the right.
o Example:
 NP → Det Nominal, rewrite NP as Det Nominal.
16
Noun Phrases
• Let’s consider the following rule in more detail:
NP  Det Nominal
• Most of the complexity of English noun
phrases is hidden in this rule.

17
Example (2): Noun Phrase
• The statement “a flight” can be parsed (or derived) from the rules as:
NP → Proper-Noun
NP → Det Nominal
Nominal → Noun | Noun Nominal
Proper-Noun → Cairo | Adel
Det → a
Det → the
Noun → flight

NP ⇒ Det Nominal ⇒ a Nominal ⇒ a Noun ⇒ a flight

18
Determiners
• Noun phrases can start with determiners.
• Determiners can be:
o Simple lexical items: the, this, a, an, etc.
 A car.

o Or simple possessives.
 John’s car.

o Or complex recursive versions of that.


 John’s sister’s husband’s son’s car.
19
Nominal
• Contains the head and any pre- and post- modifiers
of the head.
• Pre-modifiers:
o Quantifiers, cardinals, ordinals...
 Three cars.
o Adjectives.
 Large cars.
o Ordering constraints.
 Three large cars.
 ?large three cars.
20
Post-modifiers
• Three kinds of post-modifiers:
1. Prepositional phrases (e.g. From Seattle).
o All flights from Cairo.
2. Non-finite clauses (e.g. Arriving before noon).
o Any flights arriving before noon.
3. Relative clauses (e.g. That serves breakfast).
o A flight that serves breakfast.
• Same general (recursive) rule to handle these:
o Nominal → Nominal PP
o Nominal → Nominal GerundVP
o Nominal → Nominal RelClause
21
Example (3): Noun Phrases
• Consider the NP structure: parsing (or derivation) for the following example:

All the morning flights from Denver to Tampa leaving before 10.

• Clearly this NP is really about flights. That’s the central, critical noun in this NP. Let’s call that the head.
22
Agreement
• By agreement, we have in mind constraints that hold
among various constituents that take part in a rule or
set of rules.

• For example, in English, determiners and the head nouns in NPs have to agree in their number.

This flight (correct)    *This flights (incorrect)
Those flights (correct)   *Those flight (incorrect)
23
Problem
• Our earlier NP rules are clearly deficient since they
don’t capture the agreement constraint.
o NP → Det Nominal
 Accepts, and assigns correct structures, to grammatical
examples (this flight).
 But it is also happy with incorrect examples (*these
flight).

o Such a rule is said to overgenerate.


o We’ll come back to this in a bit.
24
Possible CFG Solution
• Possible solution for agreement:

SgS → SgNP SgVP
PlS → PlNP PlVP
SgNP → SgDet SgNom
PlNP → PlDet PlNom
PlVP → PlV NP
SgVP → SgV NP

• In English, subjects and verbs have to agree in person and number. Determiners and nouns have to agree in number.
• Disadvantage: Explosion of rules can be a problem.
25
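A quick sketch of this trick in NLTK, with a toy lexicon (the specific words and the Sg/Pl rule set are assumptions following the slide's naming):

import nltk

agreement_grammar = nltk.CFG.fromstring("""
S -> SgNP SgVP | PlNP PlVP
SgNP -> SgDet SgNom
PlNP -> PlDet PlNom
SgVP -> SgV
PlVP -> PlV
SgDet -> 'this'
PlDet -> 'these'
SgNom -> 'flight'
PlNom -> 'flights'
SgV -> 'leaves'
PlV -> 'leave'
""")

parser = nltk.ChartParser(agreement_grammar)
print(len(list(parser.parse("this flight leaves".split()))))   # 1 parse
print(len(list(parser.parse("these flight leaves".split()))))  # 0: *these flight is rejected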
Year: 2023-2024
Spring Semester

Natural Language
Processing

Dr. Wafaa Samy


Syntax (Part 3)

Lecture (10)

2
Contents
• Context-Free Grammar (CFG) (Cont.)
o Sentence Types (Cont.)
 Noun Phrase
• Agreement
 Verb Phrase
• Subcategorization
• Parsing
o Top-Down Parsing
o Bottom-Up Parsing

3
Question (1)
• For the grammar given below, find out the context free
language.
• The grammar G = ({S}, {a, b, c}, P, S) with the productions:
S → aSa (Rule 1)
S → bSb (Rule 2)
S → c (Rule 3)

4
Question (1): Solution
• For the grammar given below, find out the context free
language.
• The grammar G = ({S}, {a, b, c}, P, S) with the productions:
S → aSa (Rule 1)
S → bSb (Rule 2)
S → c (Rule 3)

L(G) = {c, aca, bcb, aacaa, abcba, bbcbb, bacab, …}

L(G) = {wcwᴿ : w ∈ {a, b}*}, where wᴿ is the reverse of w.

5
Verb Phrases
• English VPs consist of a head verb along with 0 or
more following constituents which we’ll call
arguments.

6
Subcategorization
• But, even though there are many valid VP rules in
English, not all verbs are allowed to participate in all
those VP rules.
• We can subcategorize the verbs in a language
according to the sets of VP rules that they participate
in.
• This is a modern take on the traditional notion of
transitive/intransitive.
• Modern grammars may have many (i.e. 100s or such)
classes.
7
Subcategorization (Cont.)
• Sneeze: John sneezed
• Find: Please find [a flight to NY]NP
• Give: Give [me]NP [a cheaper fare]NP
• Help: Can you help [me]NP [with a flight]PP
• Prefer: I prefer [to leave earlier]TO-VP
• Told: I was told [United has a flight]S
• …
(Correct)
8
Subcategorization (Cont.)
• *John sneezed the book.
• *I prefer United has a flight.
• *Give with a flight.
(Incorrect)

• As with agreement phenomena, we need a way to formally express the constraints.

9
Why?
• Right now, the various rules for VPs
overgenerate.
o They permit the presence of strings containing
verbs and arguments that don’t go together.
o For example: *John sneezed the book. (Incorrect)
VP  V NP
therefore, Sneezed the book is a VP since “sneeze” is
a verb and “the book” is a valid NP.

10
Possible CFG Solution
• Possible solution for agreement:

SgS → SgNP SgVP
PlS → PlNP PlVP
SgNP → SgDet SgNom
PlNP → PlDet PlNom
PlVP → PlV NP
SgVP → SgV NP

• Can use the same trick for all the verb/VP classes.
• In English, subjects and verbs have to agree in person and number. Determiners and nouns have to agree in number.
• Disadvantage: Explosion of rules can be a problem.
11
Possible CFG Solution (Cont.)
• Verb-with-NP-complement → find | leave | …
• Verb-with-S-complement → think | say | believe | …
• Verb-with-no-complement → sneeze | disappear | …

• VP → Verb-with-NP-complement NP
• VP → Verb-with-S-complement S
• VP → Verb-with-no-complement

• …
12
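The same idea can be sketched in NLTK, with one verb class per complement frame (the class names mirror the slide; the toy NP rules and lexicon are assumptions):

import nltk

subcat_grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> 'John' | Det N
Det -> 'the'
N -> 'book'
VP -> VerbWithNPComplement NP | VerbWithNoComplement
VerbWithNPComplement -> 'find'
VerbWithNoComplement -> 'sneezed'
""")

parser = nltk.ChartParser(subcat_grammar)
print(len(list(parser.parse("John sneezed".split()))))           # 1 parse
print(len(list(parser.parse("John sneezed the book".split()))))  # 0: *John sneezed the book is rejected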
CFG Solution for Agreement
• It works and stays within the power of CFGs.
• But it’s ugly.
• And it doesn’t scale all that well, because the interaction among the various constraints explodes the number of rules in our grammar.

13
The Point
• CFGs appear to be just about what we need to
account for a lot of basic syntactic structure in
English.
• But there are problems.
o That can be dealt with adequately, although not elegantly,
by staying within the CFG framework.

• There are simpler, more elegant solutions that take us out of the CFG framework (beyond its formal power):
o LFG, HPSG, Construction grammar, XTAG, etc.

14
Parsing
• Parsing means taking an input and producing some sort of structure for it.
• Syntactic parsing is the task of recognizing a sentence and assigning a syntactic structure to it.
• Parsing with context-free grammars refers to the task of assigning proper parse trees to input strings.
o Proper here means a parse tree that covers all and only
the elements of the input and has an S at the top.
• It doesn’t actually mean that the system can select the
correct parse tree from among all the possible trees.

15
Search Strategies
• How can we use grammar (e.g. the grammar
shown) to assign the correct parse tree to an
input string (e.g. Book that trip)?

16
Search Strategies
1. Top-down or goal-directed search.
2. Bottom-up or data-directed search.

(Figure: a parser takes a stream of words plus a context-free grammar and produces parse tree(s).)

17
Top-Down and Bottom-Up
• Top-down:
o Helps with POS ambiguities – only consider relevant POS.
o Only searches for trees that can be answers (i.e. S’s).
o But also suggests trees that are not consistent with any of
the words.
o Spends a lot of time on impossible parses (trees that are
not consistent with any of the words).
• Bottom-up:
o Has to consider every POS.
o Only forms trees consistent with the words.
o But suggests trees that make no sense globally.
 Spends a lot of time on useless structures (trees that make no
sense globally, or trees that will not start with S on the top).
18
(1) Top-Down Parsing
• A top-down parser searches for a parse tree
by trying to build from the root node S down
to the leaves.
• Top-Down Search:
o Since we’re trying to find parse trees rooted with
an S (Sentences), why not start with the rules that
give us an S.
o Then we can work our way down from there to
the words.
19
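NLTK's recursive-descent parser implements exactly this strategy; here is a short sketch using the same toy grammar as Example (1) below (the parser choice is ours, and note this parser cannot handle left-recursive grammars):

import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det Nominal
Nominal -> Noun
VP -> V PP
PP -> Prep NP
Det -> 'the'
Noun -> 'cat' | 'mat'
V -> 'sat'
Prep -> 'on'
""")

# Top-down: start from S and expand rules down to the words.
parser = nltk.RecursiveDescentParser(grammar)
for tree in parser.parse("the cat sat on the mat".split()):
    print(tree)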
Top Down Space

20
Sample L1 Grammar

21
Example (1)

The cat sat on the mat

S → NP VP
22
Example (1) (Cont.)

The cat sat on the mat

S → NP VP
NP → Det Nominal
Det → the
23
Example (1) (Cont.)

The cat sat on the mat

S → NP VP
NP → Det Nominal
Det → the
Nominal → Noun
Noun → cat
24
Example (1) (Cont.)

The cat sat on the mat

S → NP VP
NP → Det Nominal
Det → the
Nominal → Noun
Noun → cat
VP → V PP
V → sat
25
Example (1) (Cont.)

The cat sat on the mat

S → NP VP
NP → Det Nominal
Det → the
Nominal → Noun
Noun → cat
VP → V PP
V → sat
PP → Prep NP
Prep → on
26
Example (1) (Cont.)

The cat sat on the mat

S → NP VP
NP → Det Nominal
Det → the
Nominal → Noun
Noun → cat
VP → V PP
V → sat
PP → Prep NP
Prep → on
NP → Det Nominal
Det → the
27
Example (1) (Cont.)

The cat sat on the mat

S → NP VP
NP → Det Nominal
Det → the
Nominal → Noun
Noun → cat
VP → V PP
V → sat
PP → Prep NP
Prep → on
NP → Det Nominal
Det → the
Nominal → Noun
Noun → mat
28
Example (2)

Time flies like an arrow

S → NP VP
NP → Nominal
Nominal → Noun
Noun → time
VP → V PP
V → flies
PP → Prep NP
Prep → like
NP → Det Nominal
Det → an
Nominal → Noun
Noun → arrow
29
Year: 2023-2024
Spring Semester

Natural Language
Processing

Dr. Wafaa Samy


Syntax (Part 4)

Lecture (11)

2
Contents
• Parsing:
o Top-Down Parsing
o Bottom-Up Parsing

3
Problems with Top-Down Parsing
1. Left Recursion.

2. Structural Ambiguity.

4
Left Recursive
• A grammar is left-recursive if it contains at least one non-terminal A that has a derivation including itself anywhere along its leftmost branch:

A ⇒∗ α A β, for some α, β with α ⇒∗ ε


• For example:
NP → Det Nominal
Det → NP’s

5
Immediate Left-Recursion
• The grammar has a rule: A → A α

• Leads to infinite expansion of trees.

• For example:
S → NP VP
NP → NP PP
VP → VP PP

6
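A small sketch of what goes wrong for a top-down parser (the toy grammar is an assumption; NLTK's recursive-descent parser keeps expanding the leftmost NP without consuming input and is expected to exhaust Python's recursion limit):

import nltk

lr_grammar = nltk.CFG.fromstring("""
NP -> NP PP | 'flights'
PP -> 'from' 'Cairo'
""")

parser = nltk.RecursiveDescentParser(lr_grammar)
try:
    list(parser.parse("flights from Cairo".split()))
except RecursionError:
    print("Top-down expansion looped on the left-recursive rule NP -> NP PP")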
Structural Ambiguity
• Occurs when the grammar assigns more than one possible
parse tree to a sentence.

7
Example (1): Structure Ambiguity
I shot an elephant in my pajamas

8
Example (1): Structure Ambiguity
(Cont.)
I shot an elephant in my pajamas

9
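This sentence is a classic demonstration; a sketch with NLTK's chart parser recovers both trees (PP attached to the VP vs. to the NP). The grammar follows the well-known NLTK book example:

import nltk

groucho_grammar = nltk.CFG.fromstring("""
S -> NP VP
PP -> P NP
NP -> Det N | Det N PP | 'I'
VP -> V NP | VP PP
Det -> 'an' | 'my'
N -> 'elephant' | 'pajamas'
V -> 'shot'
P -> 'in'
""")

parser = nltk.ChartParser(groucho_grammar)
for tree in parser.parse("I shot an elephant in my pajamas".split()):
    print(tree)  # two distinct parse trees are printed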
Example (2): Structure Ambiguity

I saw the man with the telescope
10
Example (2): Structure Ambiguity (Cont.)

I saw the man with the telescope
11
(2) Bottom-up Parsing
• Bottom-Up Parsing:
o Of course, we also want trees that cover the input words. So we might also start with trees that link up with the words in the right way.
o Then work our way up from there to larger and larger trees.

• In bottom-up parsing,
1. The parser starts with the words of the input sentence, and
2. Tries to build parse trees from the words up, again by applying
rules from the grammar one at a time.

• The parse is successful if the parser succeeds in building a parse tree rooted in the start symbol S that covers all words of the input sentence.
12
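NLTK's shift-reduce parser is one (greedy) bottom-up strategy; a minimal sketch follows (the toy grammar is an assumption, and note this parser keeps a single hypothesis, so it can miss parses that full bottom-up search would find):

import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det Noun
VP -> Verb NP
Det -> 'the'
Noun -> 'cat' | 'mat'
Verb -> 'saw'
""")

# Bottom-up: shift words onto a stack, reduce whenever a rule's
# right-hand side matches the top of the stack.
parser = nltk.ShiftReduceParser(grammar)
for tree in parser.parse("the cat saw the mat".split()):
    print(tree)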
Bottom-Up Search

13
Bottom-Up Search (Cont.)

14
Bottom-Up Search (Cont.)

15
Bottom-Up Search (Cont.)

16
Bottom-Up Search (Cont.)

17
Bottom-Up Search (Cont.)

18
Local Ambiguity
• Occurs when some part of a sentence is
ambiguous.
• For example, the sentence:

Book that flight


is not ambiguous, but when the parser sees the
first word book, it cannot know if the word is a
verb or a noun until later.
(The parser must consider both possible parses).
19

20
Top-Down and Bottom-Up: Review
• Top-down:
o Helps with POS ambiguities – only consider relevant POS.
o Only searches for trees that can be answers (i.e. S’s).
o But also suggests trees that are not consistent with any of
the words.
o Spends a lot of time on impossible parses (trees that are
not consistent with any of the words).
• Bottom-up:
o Has to consider every POS.
o Only forms trees consistent with the words.
o But suggests trees that make no sense globally.
 Spends a lot of time on useless structures (trees that make no
sense globally).
21
Control
• Of course, in both cases we left out how to
keep track of the search space and how to
make choices:
o Which node to try to expand next.
o Which grammar rule to use to expand a node.

• One approach is called backtracking:
o Make a choice; if it works out, then fine.
o If not, then back up and make a different choice.
 Same as with ND-Recognize.
22
Top-Down (Depth-First) Parsing

Depth-First Expansion: Expand a particular node at a level, only considering an alternate node at that level if the parser fails as a result of the earlier expansion. I.e., expand the tree all the way down until you can’t expand any more.
23
Top-Down (Breadth-First) Parsing

Breadth-First Expansion: All the nodes at each level are expanded once before going to the next (lower) level. This is memory intensive when many grammar rules are involved.
24
Example (3)
S  NP VP
NP  Pronoun | Proper-Noun | Det Nominal
Nominal  Noun Nominal | Noun
VP  Verb | Verb NP | Verb NP PP | Verb PP
PP  Preposition NP

25
Example (3): Sample Lexicon

26
Example (3) (Cont.)
• This grammar can be used to generate sentences of a language as:

S ⇒ NP VP
⇒ Pronoun VP
⇒ I VP
⇒ I Verb NP
⇒ I prefer NP
⇒ I prefer Det Nominal
⇒ I prefer a Nominal
⇒ I prefer a Noun Nominal
⇒ I prefer a morning Nominal
⇒ I prefer a morning Noun
⇒ I prefer a morning flight

I prefer a morning flight.
27
Example (4)
• Use the following grammar to find a derivation and
parse tree for the following sentence:
does this flight include a meal?

28
Example (4) (Cont.)
S ⇒ Aux NP VP
⇒ does NP VP
⇒ does Det Nominal VP
⇒ does this Nominal VP
⇒ does this Noun VP
⇒ does this flight VP
⇒ does this flight Verb NP
⇒ does this flight include NP
⇒ does this flight include Det Nominal
⇒ does this flight include a Nominal
⇒ does this flight include a Noun
⇒ does this flight include a meal

does this flight include a meal?


29
Example (4) (Cont.)
Parse Tree

does this flight include a meal?


30
Year: 2023-2024
Spring Semester

Natural Language
Processing

Dr. Wafaa Samy


Syntax (Part 5)

Lecture (12)

2
Contents
• Probabilistic Context-Free Grammar
(PCFG)
• CKY Parsing

3
Probabilistic Context-Free Grammar
(PCFG)

4
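The definition itself is on the slide's figure; as a standard reference point, a PCFG is a CFG in which every rule carries a probability, and the probabilities of all rules sharing a left-hand side sum to 1. A minimal sketch in NLTK's notation (the toy rules and numbers are assumptions):

import nltk

# Each alternative gets a probability in brackets; for every left-hand
# side the probabilities must sum to 1 (e.g. the two NP rules: 0.6 + 0.4).
toy_pcfg = nltk.PCFG.fromstring("""
S -> NP VP [1.0]
NP -> Det Noun [0.6] | 'people' [0.4]
VP -> Verb NP [1.0]
Det -> 'the' [1.0]
Noun -> 'fish' [1.0]
Verb -> 'like' [1.0]
""")
print(toy_pcfg.productions())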
A PCFG Example
Grammar Lexicon

5
A PCFG Example (Cont.)

6
Deriving a PCFG from a Corpus

7
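The figure on this slide is not reproduced here; the standard maximum-likelihood estimate it presumably illustrates reads each rule's probability off treebank counts:

P(A → β | A) = Count(A → β) / Σγ Count(A → γ) = Count(A → β) / Count(A)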
Example (1)
• Two parse trees for the sentence: Book the dinner flight.
• Use the Probabilistic Context Free Grammar to solve the ambiguity.

8
Example (1) (Cont.)
The Left Tree The Right Tree

9
Example (1) (Cont.)
• The probability of each tree can be computed by multiplying
the probability of each of the rules used in the derivation.

10
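A sketch of this computation with NLTK's Viterbi parser, reusing the toy PCFG above (the numbers are assumptions); tree.prob() is exactly the product of the probabilities of the rules used:

import nltk

toy_pcfg = nltk.PCFG.fromstring("""
S -> NP VP [1.0]
NP -> Det Noun [0.6] | 'people' [0.4]
VP -> Verb NP [1.0]
Det -> 'the' [1.0]
Noun -> 'fish' [1.0]
Verb -> 'like' [1.0]
""")

parser = nltk.ViterbiParser(toy_pcfg)
for tree in parser.parse("people like the fish".split()):
    # 1.0 (S) * 0.4 (NP -> 'people') * 1.0 (VP) * 1.0 (Verb)
    # * 0.6 (NP -> Det Noun) * 1.0 (Det) * 1.0 (Noun) = 0.24
    print(tree, tree.prob())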
Example (2)
• Use the following Grammar rules:

11
Example (2) (Cont.)
• Two parse trees for the sentence: People fish tanks with rods.
• Use the Probabilistic Context Free Grammar to solve the ambiguity.

12
Example (2) (Cont.)
• Two parse trees for the sentence: People fish tanks with rods.
• Use the Probabilistic Context Free Grammar to solve the ambiguity.

13
Example (2) (Cont.)
Tree Probabilities

14
Dynamic Programming (DP)
• DP search methods fill tables with partial results
and thereby:
o Avoid doing avoidable repeated work.
o Solve exponential problems in polynomial time (well, no, not really).
o Efficiently store ambiguous structures with shared
sub-parts.

• We’ll cover one approach that roughly corresponds to the top-down and bottom-up approaches:
o CKY (Cocke, Kasami, and Younger).
15
CKY Parsing
• First we’ll limit our grammar to epsilon-free, binary rules.
• More specifically, in Chomsky-Normal Form, we want our rules to be of the form:

A → B C (binary rule: 2 non-terminals on the right)
A → w (a single terminal on the right)

• That is, rules can expand to either 2 non-terminals or to a single terminal.
16
Problem
• What if your grammar isn’t binary?
o As in the case of the TreeBank grammar?
• Convert it to binary … any arbitrary CFG can be
rewritten into Chomsky-Normal Form (CNF)
automatically.
• What does this mean?
o The resulting grammar accepts (and rejects) the same
set of strings as the original grammar (i.e. the
resulting grammar must be equivalent to the original
grammar).
o But the resulting derivations (parse trees) are
different.
17
Binarization Intuition
• Eliminate chains of unit productions.
• Introduce new intermediate non-terminals into the
grammar that distribute rules with length > 2 over several
rules.
• So S → A B C turns into S → X C and X → A B.
• Where X is a non-terminal symbol that doesn’t occur anywhere else in the grammar.
18
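A sketch of this intermediate-symbol trick (the rule representation and the X1, X2, … names are hypothetical):

from itertools import count

_fresh = count(1)

def binarize(lhs, rhs):
    """Split lhs -> rhs with len(rhs) > 2 into binary rules, left to right."""
    rules = []
    while len(rhs) > 2:
        new = f"X{next(_fresh)}"        # new non-terminal used nowhere else
        rules.append((new, rhs[:2]))    # X -> first two symbols
        rhs = [new] + rhs[2:]
    rules.append((lhs, rhs))
    return rules

print(binarize("S", ["A", "B", "C"]))
# [('X1', ['A', 'B']), ('S', ['X1', 'C'])], i.e. X -> A B and S -> X C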
Sample L1 Grammar

19
CNF Conversion

20
CKY
• So let’s build a table so that an A spanning from i to j in the input is placed in cell [i, j] in the table.
• So, a non-terminal spanning an entire string will sit in cell [0, n].
o Hopefully an S.
• If we build the table bottom-up, we’ll know that the parts of the A must go from i to k and from k to j, for some k.

21
CKY (Cont.)
• Meaning that for a rule like A → B C we should look for a B in [i, k] and a C in [k, j].
• In other words, if we think there might be an A spanning i, j in the input… AND A → B C is a rule in the grammar, THEN
• There must be a B in [i, k] and a C in [k, j], for some i < k < j.
22
CKY (Cont.)
• So to fill the table, loop over the cell[i,j] values
in some systematic way:
o What constraint should we put on that systematic
search?
o For each cell, loop over the appropriate k values
to search for things to add.

23
CKY Algorithm

24
Note
• We arranged the loops to fill the table a column
at a time, from left to right, bottom to top.
o This assures us that whenever we’re filling a cell,
the parts needed to fill it are already in the table
(to the left and below).
o It’s somewhat natural in that it processes the input left to right, a word at a time.
 Known as online.

25
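Putting the pieces together, here is a compact CKY recognizer sketch (the CNF fragment below is a hypothetical subset in the spirit of the slides' L1 grammar; S -> Verb NP stands in for the imperative S -> VP after CNF conversion):

from collections import defaultdict

binary = {                      # (B, C) -> set of A with rule A -> B C
    ("Det", "Noun"): {"NP"},
    ("Verb", "NP"): {"VP", "S"},
}
lexical = {                     # word -> set of A with rule A -> word
    "book": {"Verb", "Noun"},
    "that": {"Det"},
    "flight": {"Noun"},
}

def cky(words):
    n = len(words)
    table = defaultdict(set)    # table[(i, j)] = non-terminals spanning words i..j
    for j in range(1, n + 1):               # columns left to right
        table[(j - 1, j)] |= lexical.get(words[j - 1], set())
        for i in range(j - 2, -1, -1):      # rows bottom to top
            for k in range(i + 1, j):       # all split points
                for (B, C), heads in binary.items():
                    if B in table[(i, k)] and C in table[(k, j)]:
                        table[(i, j)] |= heads
    return table

chart = cky("book that flight".split())
print("S" in chart[(0, 3)])     # True: the whole input parses as a sentence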
Example (3)

26
Example (3) (Cont.)


27
Example (3) (Cont.)
Filling column 5

j=5
i from 3 to 0
k from i+1 to 4

28
Example (3) (Cont.)

j=5
i=3
k=4

29
Example (3) (Cont.)

j=5
i=2
k = 3 to 4

30
Example (3) (Cont.)

j=5
i=1
k = 2 to 4

31
Example (3) (Cont.)

j=5
i=0
k = 1 to 4

32
CKY Notes
• Since it’s bottom up, CKY populates the table
with a lot of phantom constituents.
o Segments that by themselves are constituents but
cannot really occur in the context in which they
are being suggested.
o To avoid this we can switch to a top-down control strategy.
o Or we can add some kind of filtering that blocks constituents where they cannot happen in a final analysis.
33
