
Year: 2023-2024

Spring Semester

Natural Language
Processing

Dr. Wafaa Samy


Syntax (Part 1)

Lecture (8)

Contents
• Syntax
• Context-Free Grammar (CFG)
• Derivation
• Parsing

Syntax
• Syntax studies the order of words in a sentence and their relationships.
o Syntax defines word categories and functions.

• Subject, verb, object is a sequence of functions that corresponds to a common word order in many European languages, including English and French.
• Syntactic Parsing determines the structure of a
sentence.
Syntax (Cont.)
• Why should we care about syntax?
• Grammars (and parsing) are key components
in many applications like:
o Grammar checkers
o Question answering
o Information extraction
o Machine translation

Context-Free Grammars
• Context-Free Grammars (CFGs).
o Also known as:
 Phrase structure grammars.
 Backus-Naur Form (BNF).
• Consist of: Rules, Terminals, and Non-terminals
1. Terminals:
o We’ll take these to be words (for now).
2. Non-Terminals:
o The constituents in a language.
 Like noun phrase, verb phrase, verb, noun, sentence, etc.

3. Rules:
o Rules (productions) consist of a single non-terminal on the left-hand side and any number of terminals and non-terminals on the right-hand side.
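• As a concrete (illustrative) sketch, a CFG of this kind can be represented in Python with plain dictionaries; the rules shown are the NP rules used later in this lecture:

# A minimal CFG representation: each non-terminal maps to a list of
# alternative right-hand sides (each a sequence of terminals/non-terminals).
grammar = {
    "S":          [["NP", "VP"]],
    "NP":         [["Det", "Nominal"], ["ProperNoun"]],
    "Nominal":    [["Noun"], ["Noun", "Nominal"]],
    "Det":        [["a"], ["the"]],
    "Noun":       [["flight"]],
    "ProperNoun": [["Cairo"], ["Adel"]],
}

non_terminals = set(grammar)        # symbols that have rules of their own

def is_terminal(symbol):
    """Terminals are the symbols with no rules of their own (the words)."""
    return symbol not in non_terminals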

Some NP Rules
• Here are some rules for our noun phrases (NP):
NP → Det Nominal
NP → ProperNoun
Nominal → Noun | Noun Nominal
• Together, these three rules describe two kinds of NPs:
1. One that consists of a determiner followed by a nominal.
2. And another that says that proper names are NPs.
• The third rule illustrates two things:
o An explicit disjunction (using “|”): two kinds of Nominal.
o A recursive definition: the same non-terminal “Nominal” appears on both the right- and left-hand sides of the rule.
L0 Grammar
[Figure: the L0 grammar and lexicon; not reproduced.]
Formal Definition
• Formally, a context-free grammar G is a 4-tuple (V, Σ, R, S):
o V: a set of non-terminal symbols.
o Σ: a set of terminal symbols (disjoint from V).
o R: a set of rules (productions) of the form A → β, where A ∈ V and β is a string of symbols from (Σ ∪ V)*.
o S: a designated start symbol, S ∈ V.
Context-Free Grammar
• A context-free grammar consists of:
1. A set of production rules, each of which expresses the ways
that symbols of the language can be grouped and ordered
together, and
2. A lexicon of words and symbols.

Derivation and Parsing
• A derivation is a sequence of rules applied to a string
that accounts for that string:
o Covers all the elements in the string.
o Covers only the elements in the string.

Example (1): Derivation
• Use this grammar to make a derivation for the following statement:
Adel study Automata with Maged

S ⇒ NP VP
⇒ ProperNoun VP
⇒ Adel VP
⇒ Adel Verb NP
⇒ Adel study NP
⇒ Adel study NP PP
⇒ Adel study ProperNoun PP
⇒ Adel study Automata PP
⇒ Adel study Automata Preposition NP
⇒ Adel study Automata with NP
⇒ Adel study Automata with ProperNoun
⇒ Adel study Automata with Maged
Derivation

If A → α is a production rule, and β and γ are any strings in the set (Σ ∪ V)*, then we say that βAγ directly derives βαγ, written as:
βAγ ⇒ βαγ

• Derivation is a generalization of direct derivation.

Language Defined by CFG
• The formal language defined by a CFG is the set of strings
that are derivable from the designated start symbol.

Parsing
• Parsing (or syntactic parsing) is the process of taking a string and a grammar and returning one or more parse trees for that string.
• It is completely analogous to running a finite-state
transducer with a tape.
o It’s just more powerful:
 Remember: this means that there are languages we can capture with CFGs that we can’t capture with finite-state methods.

Example (2): Parsing
• Parse tree for: Adel study Automata with Maged
[Parse tree figure not reproduced.]
Example (3): Parsing
[Parse tree figure not reproduced.]
Year: 2023-2024
Spring Semester

Natural Language
Processing

Dr. Wafaa Samy


Syntax (Part 2)

Lecture (9)

Contents
• Context-Free Grammar (CFG) (Cont.)
o Grammatical and Ungrammatical Sentences
o Grammar Equivalence
o Sentence Types
 Noun Phrase
• Agreement
 Verb Phrase
• Subcategorization
Question (1)
• Which of the following rules is NOT a correct
context-free rule, where non-terminals are
{A, B}, and terminals are {a, b}.
1. A a
2. A  a | bA
3. AB  a
4. a  bA
5. Both 3 and 4

Question (1): Solution
• Which of the following rules is NOT a correct
context-free rule, where non-terminals are
{A, B}, and terminals are {a, b}.
1. A a
2. A  a | bA
3. AB  a
4. a  bA
5. Both 3 and 4

Question (2)
• Given {a, the, study} ∈ terminals and {VP, Verb, Det} ∈ non-terminals, which of the following production rules is NOT a correct CFG rule:
1. Det → a | the
2. VP → Verb
3. Verb → study
4. a → Det

Question (2): Solution
• Given {a, the, study} ∈ terminals and {VP, Verb, Det} ∈ non-terminals, which of the following production rules is NOT a correct CFG rule:
1. Det → a | the
2. VP → Verb
3. Verb → study
4. a → Det ← Answer. Not a valid CFG rule: the left-hand side must be a single non-terminal, but a is a terminal.

Question (3)
• Given {x, y} is a set of terminals and A is a
non-terminal, which of the following words is
correctly generated from the grammar with a
rule: A → x | yA
1. xyyy
2. yxyxyx
3. yyyx
4. xyy

Question (3): Solution
• Given {x, y} is a set of terminals and A is a non-
terminal, which of the following words is
correctly generated from the grammar with a
rule: A → x | yA
1. xyyy
2. yxyxyx
3. yyyx ← Answer. Repeated use of A → yA prepends y’s, and every derivation must end with A → x, so the generated language is y*x (x, yx, yyx, yyyx, …).
4. xyy
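• A tiny Python sketch makes the solution concrete by enumerating the first few strings that the rule A → x | yA can generate:

# Every derivation applies A -> yA some number of times, then ends with
# A -> x, so each generated string has the form y^n x.
def generate(max_n):
    for n in range(max_n + 1):
        yield "y" * n + "x"

print(list(generate(4)))   # ['x', 'yx', 'yyx', 'yyyx', 'yyyyx']
# 'yyyx' (option 3) is generated; 'xyyy', 'yxyxyx', and 'xyy' are not.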

Grammatical and Ungrammatical Sentences
• A CFG defines a formal language (a set of strings).

• Sentences (strings of words) that can be derived by a grammar are in the formal language defined by that grammar, and are called grammatical sentences.

Grammatical and Ungrammatical Sentences (Cont.)
• Sentences that cannot be derived by a given formal
grammar are not in the language defined by that grammar,
and are referred to as ungrammatical sentences.

Grammar Equivalence
• Two Context Free Grammars (CFG) are
equivalent if they generate the same language
(i.e. set of strings).

Example (1)
[Figure: two equivalent grammars, each shown with its set of non-terminals, set of terminals, start symbol, and set of rules; not reproduced.]
An English Grammar Fragment
• Sentences
• Noun phrases
o Agreement

• Verb phrases
o Subcategorization

• Prepositional phrases

Sentence Types
1. Declaratives: (e.g. A plane left.)
S → NP VP
2. Imperatives: (e.g. Leave!)
S → VP
3. Yes-No Questions: (e.g. Did the plane leave?)
S → Aux NP VP
4. WH Questions: (e.g. When did the plane leave?)
S → WH-NP Aux NP VP
Note
NP → Proper-Noun
NP → Det Nominal
Nominal → Noun | Noun Nominal
Proper-Noun → Cairo | Adel
Det → a
Det → the
Noun → flight

• Note: A CFG can be thought of as a generator for sentences.


o We could read the arrow as: rewrite the symbol on the left with
the string of symbols on the right.
o Example:
 NP → Det Nominal, rewrite NP as Det Nominal.
Noun Phrases
• Let’s consider the following rule in more detail:
NP → Det Nominal
• Most of the complexity of English noun
phrases is hidden in this rule.

Example (2): Noun Phrase
• The string “a flight” can be parsed (or derived) from the rules as:
NP → Proper-Noun
NP → Det Nominal
Nominal → Noun | Noun Nominal
Proper-Noun → Cairo | Adel
Det → a
Det → the
Noun → flight

NP ⇒ Det Nominal ⇒ a Nominal ⇒ a Noun ⇒ a flight

Determiners
• Noun phrases can start with determiners.
• Determiners can be:
o Simple lexical items: the, this, a, an, etc.
 A car.

o Or simple possessives.
 John’s car.

o Or complex recursive versions of that.


 John’s sister’s husband’s son’s car.
Nominal
• Contains the head and any pre- and post- modifiers
of the head.
• Pre-modifiers:
o Quantifiers, cardinals, ordinals...
 Three cars.
o Adjectives.
 Large cars.
o Ordering constraints.
 Three large cars.
 ?large three cars.
Post-modifiers
• Three kinds of post-modifiers:
1. Prepositional phrases (e.g. From Seattle).
o All flights from Cairo.
2. Non-finite clauses (e.g. Arriving before noon).
o Any flights arriving before noon.
3. Relative clauses (e.g. that serve breakfast).
o A flight that serves breakfast.
• Same general (recursive) rule to handle these:
o Nominal → Nominal PP
o Nominal → Nominal GerundVP
o Nominal → Nominal RelClause
Example (3): Noun Phrases
• Consider the NP structure (parsing or derivation) for the following example:
All the morning flights from Denver to Tampa leaving before 10.
• Clearly this NP is really about flights: that’s the central, critical noun in this NP. Let’s call it the head.
Agreement
• By agreement, we have in mind constraints that hold
among various constituents that take part in a rule or
set of rules.

• For example, in English, determiners and the head nouns in NPs have to agree in their number.

This flight (correct)      *This flights (incorrect)
Those flights (correct)    *Those flight (incorrect)
Problem
• Our earlier NP rules are clearly deficient since they
don’t capture the agreement constraint.
o NP → Det Nominal
 Accepts, and assigns correct structures, to grammatical
examples (this flight).
 But it is also happy with incorrect examples (*these
flight).

o Such a rule is said to overgenerate.


o We’ll come back to this in a bit.
Possible CFG Solution
• A possible CFG solution for agreement:
SgS → SgNP SgVP
PlS → PlNP PlVP
SgNP → SgDet SgNom
PlNP → PlDet PlNom
SgVP → SgV NP
PlVP → PlV NP
• Disadvantage: the resulting explosion of rules can be a problem.
• In English, subjects and verbs have to agree in person and number; determiners and nouns have to agree in number.
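• To make the rule explosion concrete, here is a small illustrative Python sketch (the rule encoding is invented for this example, not from the slides) that mechanically specializes a base grammar for each value of the number feature; every additional agreement feature multiplies the rule count again:

# Each base rule lists its RHS and which RHS positions must agree with
# the LHS (e.g. the object NP of a VP does not agree with the verb).
base_rules = [
    ("S",  ["NP", "VP"],   {0, 1}),
    ("NP", ["Det", "Nom"], {0, 1}),
    ("VP", ["V", "NP"],    {0}),
]
features = ["Sg", "Pl"]

specialized = []
for feat in features:
    for lhs, rhs, agree in base_rules:
        new_rhs = [feat + sym if i in agree else sym for i, sym in enumerate(rhs)]
        specialized.append((feat + lhs, new_rhs))

for lhs, rhs in specialized:
    print(lhs, "->", " ".join(rhs))
# 3 base rules become 6 (SgS -> SgNP SgVP, PlS -> PlNP PlVP, ...);
# adding a second agreement feature (e.g. person) would double them again.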
Year: 2023-2024
Spring Semester

Natural Language
Processing

Dr. Wafaa Samy


Syntax (Part 3)

Lecture (10)

Contents
• Context-Free Grammar (CFG) (Cont.)
o Sentence Types (Cont.)
 Noun Phrase
• Agreement
 Verb Phrase
• Subcategorization
• Parsing
o Top-Down Parsing
o Bottom-Up Parsing

Question (1)
• For the grammar given below, find the context-free language it generates.
• The grammar is G = ({S}, {a, b, c}, P, S), with productions P:
S → aSa (Rule 1)
S → bSb (Rule 2)
S → c (Rule 3)

Question (1): Solution
• For the grammar given below, find the context-free language it generates.
• The grammar is G = ({S}, {a, b, c}, P, S), with productions P:
S → aSa (Rule 1)
S → bSb (Rule 2)
S → c (Rule 3)

L(G) = {c, aca, bcb, aacaa, abcba, bbcbb, bacab, …}

L(G) = {wcwᴿ : w ∈ {a, b}*}, where wᴿ is the reverse of w

Verb Phrases
• English VPs consist of a head verb along with 0 or more following constituents, which we’ll call arguments, e.g.:
VP → Verb | Verb NP | Verb NP PP | Verb PP
Subcategorization
• But, even though there are many valid VP rules in
English, not all verbs are allowed to participate in all
those VP rules.
• We can subcategorize the verbs in a language
according to the sets of VP rules that they participate
in.
• This is a modern take on the traditional notion of
transitive/intransitive.
• Modern grammars may have many (i.e. 100s or such)
classes.
Subcategorization (Cont.)
• Sneeze: John sneezed
• Find: Please find [a flight to NY]NP
• Give: Give [me]NP[a cheaper fare]NP
• Help: Can you help [me]NP[with a flight]PP
• Prefer: I prefer [to leave earlier]TO-VP
• Told: I was told [United has a flight]S
• …
(Correct)
Subcategorization (Cont.)
• *John sneezed the book.
• *I prefer United has a flight.
• *Give with a flight.
(Incorrect)

• As with agreement phenomena, we need a way to formally express the constraints.

Why?
• Right now, the various rules for VPs
overgenerate.
o They permit the presence of strings containing
verbs and arguments that don’t go together.
o For example: *John sneezed the book. (Incorrect)
VP → V NP
Therefore, “sneezed the book” counts as a VP, since “sneeze” is a verb and “the book” is a valid NP.

Possible CFG Solution
• A possible CFG solution for agreement:
SgS → SgNP SgVP
PlS → PlNP PlVP
SgNP → SgDet SgNom
PlNP → PlDet PlNom
SgVP → SgV NP
PlVP → PlV NP
• We can use the same trick for all the verb/VP classes.
• Disadvantage: the explosion of rules can be a problem.
• In English, subjects and verbs have to agree in person and number; determiners and nouns have to agree in number.
Possible CFG Solution (Cont.)
• Verb-with-NP-complement → find | leave | …
• Verb-with-S-complement → think | say | believe | …
• Verb-with-no-complement → sneeze | disappear | …

• VP → Verb-with-NP-complement NP
• VP → Verb-with-S-complement S
• VP → Verb-with-no-complement

• …
CFG Solution for Agreement
• It works and stays within the power of CFGs.
• But it’s ugly.
• And it doesn’t scale all that well, because the interaction among the various constraints explodes the number of rules in our grammar.

The Point
• CFGs appear to be just about what we need to
account for a lot of basic syntactic structure in
English.
• But there are problems.
o That can be dealt with adequately, although not elegantly,
by staying within the CFG framework.

• There are simpler, more elegant solutions that take us out of the CFG framework (beyond its formal power):
o LFG, HPSG, Construction Grammar, XTAG, etc.
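• These frameworks typically replace atomic categories with feature structures. As a very loose Python illustration (not how any of these formalisms is actually implemented), agreement can be enforced by unifying feature dictionaries instead of multiplying out Sg/Pl rule copies:

# Loose illustration of feature-based agreement (not a real LFG/HPSG system):
# categories carry feature dictionaries, and combining two constituents
# requires their shared features to unify (agree).
def unify(f1, f2):
    """Return the merged features, or None if any shared feature clashes."""
    merged = dict(f1)
    for key, value in f2.items():
        if key in merged and merged[key] != value:
            return None                      # e.g. {'num': 'sg'} vs {'num': 'pl'}
        merged[key] = value
    return merged

this_det  = {"num": "sg"}
flights_n = {"num": "pl"}
print(unify(this_det, flights_n))            # None -> "*this flights" is ruled out
print(unify({"num": "pl"}, flights_n))       # {'num': 'pl'} -> "those flights" is fine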

Parsing
• Parsing means taking an input and producing some sort of structure for it.
• Syntactic parsing is the task of recognizing a sentence and assigning a syntactic structure to it.
• Parsing with context-free grammars refers to the task of assigning proper parse trees to input strings.
o Proper here means a parse tree that covers all and only
the elements of the input and has an S at the top.
• It doesn’t actually mean that the system can select the
correct parse tree from among all the possible trees.

Search Strategies
• How can we use grammar (e.g. the grammar
shown) to assign the correct parse tree to an
input string (e.g. Book that trip)?

Search Strategies (Cont.)
1. Top-down or goal-directed search.
2. Bottom-up or data-directed search.

[Diagram: a stream of words plus a context-free grammar feed into the parser, which outputs parse tree(s).]

Top-Down and Bottom-Up
• Top-down:
o Helps with POS ambiguities – only consider relevant POS.
o Only searches for trees that can be answers (i.e. S’s).
o But also suggests trees that are not consistent with any of
the words.
o Spends a lot of time on impossible parses (trees that are
not consistent with any of the words).
• Bottom-up:
o Has to consider every POS.
o Only forms trees consistent with the words.
o But suggests trees that make no sense globally.
 Spends a lot of time on useless structures (trees that make no
sense globally, or trees that will not start with S on the top).
(1) Top-Down Parsing
• A top-down parser searches for a parse tree
by trying to build from the root node S down
to the leaves.
• Top-Down Search:
o Since we’re trying to find parse trees rooted with
an S (Sentences), why not start with the rules that
give us an S.
o Then we can work our way down from there to
the words.
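• As an illustrative sketch (not the slides’ algorithm), a top-down, depth-first recognizer with backtracking can be written in a few lines of Python, using the dictionary-based grammar representation from Part 1:

def parse(symbols, words, grammar):
    """Return True iff the symbol sequence can derive exactly `words`.
    Note: this loops forever on left-recursive rules (see Part 4)."""
    if not symbols:
        return not words                       # succeed only if input is consumed
    first, rest = symbols[0], symbols[1:]
    if first in grammar:                       # non-terminal: try each expansion
        return any(parse(list(rhs) + rest, words, grammar)
                   for rhs in grammar[first])
    # terminal: it must match the next input word
    return bool(words) and words[0] == first and parse(rest, words[1:], grammar)

grammar = {
    "S": [["NP", "VP"]], "NP": [["Det", "Nominal"]], "Nominal": [["Noun"]],
    "VP": [["V", "PP"]], "PP": [["Prep", "NP"]],
    "Det": [["the"]], "Noun": [["cat"], ["mat"]], "V": [["sat"]], "Prep": [["on"]],
}
print(parse(["S"], "the cat sat on the mat".split(), grammar))   # True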
Top Down Space
[Figure: the top-down search space; not reproduced.]
Sample L1 Grammar
[Figure: the L1 grammar; not reproduced.]
Example (1)
The cat sat on the mat

Expanding top-down from S, one rule at a time:
S → NP VP
NP → Det Nominal
Det → the
Nominal → Noun
Noun → cat
VP → V PP
V → sat
PP → Prep NP
Prep → on
NP → Det Nominal
Det → the
Nominal → Noun
Noun → mat

[The original slides grow the parse tree step by step from S down to the words; the tree figures are not reproduced.]
Example (2)
Time flies like an arrow

S → NP VP
NP → Nominal
Nominal → Noun
Noun → time
VP → V PP
V → flies
PP → Prep NP
Prep → like
NP → Det Nominal
Det → an
Nominal → Noun
Noun → arrow

[Parse tree figure not reproduced.]
Year: 2023-2024
Spring Semester

Natural Language
Processing

Dr. Wafaa Samy


Syntax (Part 4)

Lecture (11)

Contents
• Parsing:
o Top-Down Parsing
o Bottom-Up Parsing

Problems with Top-Down Parsing
1. Left Recursion.

2. Structural Ambiguity.

Left Recursion
• A grammar is left-recursive if it contains at least one non-terminal A such that it has a derivation that includes itself anywhere along its leftmost branch:

A ⇒* αAβ, for some α and β with α ⇒* ε


• For example:
NP → Det Nominal
Det → NP’s

Immediate Left-Recursion
• The grammar has a rule: A → A α

• This leads to an infinite expansion of trees.

• For example:
S → NP VP
NP → NP PP
VP → VP PP
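• A standard fix from the parsing literature (not shown on these slides) rewrites immediate left recursion as right recursion:
A → A α | β becomes A → β A′, with A′ → α A′ | ε
• For example, NP → NP PP | Det Nominal becomes NP → Det Nominal NP′ and NP′ → PP NP′ | ε, which generates the same strings without sending a top-down parser into an infinite loop.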

Structural Ambiguity
• Occurs when the grammar assigns more than one possible
parse tree to a sentence.

Example (1): Structure Ambiguity
I shot an elephant in my pajamas
[Two parse tree figures, one per attachment of “in my pajamas” (to the NP “an elephant” or to the VP headed by “shot”); not reproduced.]
Example (2): Structure Ambiguity
I saw the man with the telescope
[Two parse tree figures, one per attachment of “with the telescope” (to the NP “the man” or to the VP headed by “saw”); not reproduced.]
(2) Bottom-up Parsing
• Bottom-Up Parsing:
o Of course, we also want trees that cover the input words, so we might also start with trees that link up with the words in the right way.
o Then we work our way up from there to larger and larger trees.

• In bottom-up parsing,
1. The parser starts with the words of the input sentence, and
2. Tries to build parse trees from the words up, again by applying
rules from the grammar one at a time.

• The parse is successful if the parser succeeds in building a parse tree rooted in the start symbol S that covers all words of the input sentence.
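• As a rough illustration (not the slides’ algorithm), a naive shift-reduce recognizer captures the bottom-up idea in Python; it greedily reduces whatever matches the top of the stack, so a real parser would still need search or backtracking over the choices:

def bottom_up_recognize(words, rules, start="S"):
    """Shift words onto a stack; reduce whenever a rule's RHS matches the top.
    Greedy and order-sensitive: e.g. 'book' is reduced by the first lexical
    rule that matches, so a real parser must also explore the Noun reading."""
    stack, remaining = [], list(words)
    while True:
        for lhs, rhs in rules:                       # try to reduce
            k = len(rhs)
            if k <= len(stack) and stack[-k:] == list(rhs):
                stack[-k:] = [lhs]
                break
        else:                                        # no rule applied
            if remaining:
                stack.append(remaining.pop(0))       # shift the next word
            else:
                return stack == [start]

rules = [("Verb", ["book"]), ("Det", ["that"]), ("Noun", ["flight"]),
         ("Nominal", ["Noun"]), ("NP", ["Det", "Nominal"]),
         ("VP", ["Verb", "NP"]), ("S", ["VP"])]
print(bottom_up_recognize("book that flight".split(), rules))   # True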
Bottom-Up Search
[A sequence of figures building the parse bottom-up from the words toward S; not reproduced.]
18
Local Ambiguity
• Occurs when some part of a sentence is
ambiguous.
• For example, the sentence:
Book that flight
is not ambiguous, but when the parser sees the first word book, it cannot know if the word is a verb or a noun until later.
(The parser must consider both possible parses.)
Top-Down and Bottom-Up: Review
• Top-down:
o Helps with POS ambiguities – only consider relevant POS.
o Only searches for trees that can be answers (i.e. S’s).
o But also suggests trees that are not consistent with any of
the words.
o Spends a lot of time on impossible parses (trees that are
not consistent with any of the words).
• Bottom-up:
o Has to consider every POS.
o Only forms trees consistent with the words.
o But suggests trees that make no sense globally.
 Spends a lot of time on useless structures (trees that make no
sense globally).
Control
• Of course, in both cases we left out how to
keep track of the search space and how to
make choices:
o Which node to try to expand next.
o Which grammar rule to use to expand a node.

• One approach is called backtracking.


o Make a choice, if it works out then fine.
o If not then back up and make a different choice.
 Same as with ND-Recognize.
Top-Down (Depth-First) Parsing
Depth-First Expansion: Expand a particular node at a level, only considering an alternate node at that level if the parser fails as a result of the earlier expansion. That is, expand the tree all the way down until you can’t expand any more.
Top-Down (Breadth-First) Parsing
Breadth-First Expansion: All the nodes at each level are expanded once before going to the next (lower) level. This is memory-intensive when many grammar rules are involved.
Example (3)
S → NP VP
NP → Pronoun | Proper-Noun | Det Nominal
Nominal → Noun Nominal | Noun
VP → Verb | Verb NP | Verb NP PP | Verb PP
PP → Preposition NP

Example (3): Sample Lexicon
[Lexicon figure not reproduced.]

Example (3) (Cont.)
• This grammar can be used to generate sentences of the language, e.g. “I prefer a morning flight”:

S ⇒ NP VP
⇒ Pronoun VP
⇒ I VP
⇒ I Verb NP
⇒ I prefer NP
⇒ I prefer Det Nominal
⇒ I prefer a Nominal
⇒ I prefer a Noun Nominal
⇒ I prefer a morning Nominal
⇒ I prefer a morning Noun
⇒ I prefer a morning flight

[The corresponding parse tree figure is not reproduced.]
Example (4)
• Use the following grammar to find a derivation and
parse tree for the following sentence:
does this flight include a meal?

Example (4) (Cont.)
S ⇒ Aux NP VP
⇒ does NP VP
⇒ does Det Nominal VP
⇒ does this Nominal VP
⇒ does this Noun VP
⇒ does this flight VP
⇒ does this flight Verb NP
⇒ does this flight include NP
⇒ does this flight include Det Nominal
⇒ does this flight include a Nominal
⇒ does this flight include a Noun
⇒ does this flight include a meal

Example (4) (Cont.)
Parse tree for “does this flight include a meal?”
[Figure not reproduced.]
Year: 2023-2024
Spring Semester

Natural Language
Processing

Dr. Wafaa Samy


Syntax (Part 5)

Lecture (12)

Contents
• Probabilistic Context-Free Grammar
(PCFG)
• CKY Parsing

Probabilistic Context-Free Grammar (PCFG)
• A PCFG is a CFG in which each rule A → β carries a conditional probability P(A → β | A), and the probabilities of all rules with the same left-hand side sum to 1.
A PCFG Example
[Figure: an example PCFG grammar and lexicon, with a probability attached to each rule; not reproduced.]
Deriving a PCFG from a Corpus
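• The worked example is not reproduced here, but the standard recipe is maximum-likelihood estimation from a treebank: count how often each rule is used and normalize by the total number of expansions of its left-hand side:

P(A → β | A) = Count(A → β) / Count(A)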

Example (1)
• Two parse trees for the sentence: Book the dinner flight.
• Use the Probabilistic Context Free Grammar to solve the ambiguity.

Example (1) (Cont.)
[Figure: the left tree and the right tree for the two parses; not reproduced.]
Example (1) (Cont.)
• The probability of each tree can be computed by multiplying
the probability of each of the rules used in the derivation.
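• A sketch of this computation in Python; the rule probabilities below are hypothetical, since the slide’s grammar table is not reproduced here:

from math import prod

# Hypothetical rule probabilities (for illustration only).
rule_prob = {
    ("S", ("VP",)): 0.05,
    ("VP", ("Verb", "NP")): 0.20,
    ("NP", ("Det", "Nominal")): 0.20,
    ("Nominal", ("Noun", "Nominal")): 0.20,
    ("Nominal", ("Noun",)): 0.75,
    ("Verb", ("book",)): 0.30,
    ("Det", ("the",)): 0.60,
    ("Noun", ("dinner",)): 0.10,
    ("Noun", ("flight",)): 0.30,
}

def tree_probability(rules_used):
    """P(tree) = product of the probabilities of all rules in its derivation."""
    return prod(rule_prob[rule] for rule in rules_used)

# Rules used by one parse of "Book the dinner flight":
left_tree = [("S", ("VP",)), ("VP", ("Verb", "NP")), ("Verb", ("book",)),
             ("NP", ("Det", "Nominal")), ("Det", ("the",)),
             ("Nominal", ("Noun", "Nominal")), ("Noun", ("dinner",)),
             ("Nominal", ("Noun",)), ("Noun", ("flight",))]
print(tree_probability(left_tree))   # ≈ 1.6e-06; compare against the other parse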

Example (2)
• Use the following Grammar rules:

Example (2) (Cont.)
• Two parse trees for the sentence: People fish tanks with rods.
• Use the Probabilistic Context-Free Grammar to solve the ambiguity.
[The two parse tree figures are not reproduced.]
Example (2) (Cont.)
Tree Probabilities
[Figure: the probability computed for each tree; not reproduced.]

Dynamic Programming (DP)
• DP search methods fill tables with partial results
and thereby:
o Avoid doing avoidable repeated work.
o Solve exponential problems in polynomial time (well, not really).
o Efficiently store ambiguous structures with shared
sub-parts.

• We’ll cover one approach that roughly corresponds to the top-down and bottom-up approaches:
o CKY (Cocke, Kasami, and Younger).
CKY Parsing
• First we’ll limit our grammar to epsilon-free, binary rules.
• More specifically, in Chomsky Normal Form, we want our rules to be of the form:
A → B C (two non-terminals), or
A → w (a single terminal)
• That is, rules can expand either to two non-terminals or to a single terminal.
Problem
• What if your grammar isn’t binary?
o As in the case of the TreeBank grammar?
• Convert it to binary … any arbitrary CFG can be
rewritten into Chomsky-Normal Form (CNF)
automatically.
• What does this mean?
o The resulting grammar accepts (and rejects) the same
set of strings as the original grammar (i.e. the
resulting grammar must be equivalent to the original
grammar).
o But the resulting derivations (parse trees) are
different.
Binarization Intuition
• Eliminate chains of unit productions.
• Introduce new intermediate non-terminals into the
grammar that distribute rules with length > 2 over several
rules.
• So
S → A B C
turns into
S → X C
and
X → A B
• where X is a non-terminal symbol that doesn’t occur anywhere else in the grammar.
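• A sketch of this step in Python (only the long-rule binarization part of CNF conversion; unit-production and terminal handling are omitted):

# Binarize rules with RHS length > 2 by introducing fresh intermediate
# non-terminals that do not occur anywhere else in the grammar.
def binarize(rules):
    out, counter = [], 0
    for lhs, rhs in rules:
        while len(rhs) > 2:
            counter += 1
            new_nt = f"X{counter}"           # fresh symbol
            out.append((new_nt, rhs[:2]))    # X1 -> A B
            rhs = [new_nt] + rhs[2:]         # remaining rule: S -> X1 C ...
        out.append((lhs, rhs))
    return out

print(binarize([("S", ["A", "B", "C"])]))
# [('X1', ['A', 'B']), ('S', ['X1', 'C'])]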
Sample L1 Grammar
[Figure not reproduced.]
CNF Conversion
[Figure: the L1 grammar converted to Chomsky Normal Form; not reproduced.]
CKY
• So let’s build a table so that an A spanning from i to j in the input is placed in cell [i, j] in the table.
• A non-terminal spanning the entire string will then sit in cell [0, n]:
o Hopefully an S.
• If we build the table bottom-up, we’ll know that the parts of the A must go from i to k and from k to j, for some k.
CKY (Cont.)
• Meaning that for a rule like A → B C we should look for a B in [i, k] and a C in [k, j].
• In other words, if we think there might be an A spanning i..j in the input, AND A → B C is a rule in the grammar, THEN there must be a B in [i, k] and a C in [k, j] for some i < k < j.
CKY (Cont.)
• So to fill the table, loop over the cell[i,j] values
in some systematic way:
o What constraint should we put on that systematic
search?
o For each cell, loop over the appropriate k values
to search for things to add.

CKY Algorithm
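• The algorithm figure is not reproduced, so here is a sketch of a CKY recognizer in Python for a CNF grammar, following the conventions above (cell [i, j] holds every non-terminal that spans words i..j, and columns are filled left to right, bottom to top):

from collections import defaultdict

def cky_recognize(words, binary_rules, lexical_rules, start="S"):
    """CKY recognition for a grammar in Chomsky Normal Form.
    binary_rules:  list of (A, B, C) triples for rules A -> B C
    lexical_rules: dict mapping a word to the non-terminals that derive it
    table[(i, j)] holds all non-terminals spanning words[i:j]."""
    n = len(words)
    table = defaultdict(set)
    for i, word in enumerate(words):              # lexical (diagonal) cells
        table[(i, i + 1)] = set(lexical_rules.get(word, ()))
    for j in range(2, n + 1):                     # fill left-to-right, bottom-up
        for i in range(j - 2, -1, -1):
            for k in range(i + 1, j):             # try every split point
                for A, B, C in binary_rules:
                    if B in table[(i, k)] and C in table[(k, j)]:
                        table[(i, j)].add(A)
    return start in table[(0, n)]

# Toy CNF grammar for "book that flight" ("book" is both Verb and Noun,
# echoing the local-ambiguity example above).
binary = [("S", "Verb", "NP"), ("VP", "Verb", "NP"), ("NP", "Det", "Noun")]
lexicon = {"book": {"Verb", "Noun"}, "that": {"Det"}, "flight": {"Noun"}}
print(cky_recognize("book that flight".split(), binary, lexicon))   # True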

Note
• We arranged the loops to fill the table a column
at a time, from left to right, bottom to top.
o This assures us that whenever we’re filling a cell,
the parts needed to fill it are already in the table
(to the left and below).
o It’s somewhat natural in that it processes the input left to right, a word at a time.
 Known as online.

Example (3)
[Figures: snapshots of the CKY table being filled for an example sentence; not reproduced. The loop variables at each step were:]
• Filling column 5: j = 5, i from 3 down to 0, k from i+1 to 4
• j = 5, i = 3, k = 4
• j = 5, i = 2, k = 3 to 4
• j = 5, i = 1, k = 2 to 4
• j = 5, i = 0, k = 1 to 4
CKY Notes
• Since it’s bottom up, CKY populates the table
with a lot of phantom constituents.
o Segments that by themselves are constituents but
cannot really occur in the context in which they
are being suggested.
o To avoid this, we can switch to a top-down control strategy,
o or we can add some kind of filtering that blocks constituents where they cannot happen in a final analysis.