
Syntactic Analysis

P.S.Sreeja
Syntax
• The word syntax comes from the Greek sýntaxis, meaning "setting out
together or arrangement", and refers to the way words are arranged
together.
• Language Models: Importance of modeling word order
• POS categories: An equivalence class for words
• More complex notions: constituency, grammatical relations,
subcategorization etc.
Defining the notions: Constituency
• The fundamental idea of constituency is that groups of words may behave as
a single unit or phrase, called a constituent.
• For example, a group of words called a noun phrase often acts as a unit;
noun phrases range from single words (she) to longer phrases (a mammoth
breakfast).
Preposed or Postposed
• For example, the prepositional phrase on September seventeenth can be
placed in a number of different locations in the following examples,
including preposed at the beginning and postposed at the end:
• On September seventeenth, I'd like to fly from Atlanta to Denver.
• I'd like to fly on September seventeenth from Atlanta to Denver.
• I'd like to fly from Atlanta to Denver on September seventeenth.
Examples
• While the entire phrase can be placed in different positions, the
individual words making up the phrase cannot be:
• *On September, I'd like to fly seventeenth from Atlanta to Denver.
• *On I'd like to fly, September seventeenth from Atlanta to Denver.
Modeling Constituency
Rules or productions
CFG for Languages
The symbols that are used in a CFG are divided into two classes:
• Terminal symbols: the actual words of the language.
• Non-terminal symbols: symbols that express abstractions over the
terminals, such as phrase types and POS categories.
• A derivation is the sequence of rule expansions that rewrites the start
symbol into a string of words.
• A parse tree represents such a derivation graphically.
Examples
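To make the two symbol classes concrete, a toy CFG can be written down directly in code. This is a minimal sketch in Python; the particular rules and words are invented for illustration (echoing the she ate a mammoth breakfast example later in these slides), not a grammar taken from them:

```python
# A toy CFG. Non-terminals are the keys of the dictionary; each maps to a
# list of possible right-hand sides (tuples of symbols). Symbols that never
# appear as a key are the terminal symbols (actual words).
GRAMMAR = {
    "S":       [("NP", "VP")],
    "NP":      [("Det", "Noun"), ("Pronoun",)],
    "VP":      [("Verb", "NP"), ("Verb",)],
    "Det":     [("a",), ("the",)],
    "Noun":    [("breakfast",), ("flight",)],
    "Pronoun": [("she",)],
    "Verb":    [("ate",), ("booked",)],
}

NON_TERMINALS = set(GRAMMAR)
TERMINALS = {sym for rhss in GRAMMAR.values()
             for rhs in rhss for sym in rhs if sym not in GRAMMAR}
```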
Syntactic parsing
What is Parsing?
• Parsing is the process of taking a string and a grammar and returning all
possible parse trees for that string.
• That is, find all trees whose root is the start symbol S and whose leaves
cover exactly the words in the input.
Top-Down Parsing
• Searches for a parse tree by trying to build from the root node S down to
the leaves.
• Start by assuming that the input can be derived by the designated
start symbol S
• Find all trees that can start with S, by looking at the grammar rules
with S on the left-hand side
• Trees are grown downward until they eventually reach the POS
categories at the bottom
• Trees whose leaves fail to match the words in the input can be
rejected
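The top-down search just described can be sketched as a small recursive-descent recognizer over the GRAMMAR dictionary from the earlier sketch. It expands the leftmost symbol with every applicable rule and rejects branches whose leaves fail to match the input; note that a naive recognizer like this would loop forever on left-recursive rules, which the toy grammar avoids:

```python
def derives_top_down(symbols, words, grammar):
    """Can the sequence `symbols` derive exactly the sequence `words`?
    Expands the leftmost symbol first, trying all rules (with backtracking)."""
    if not symbols:
        return not words                       # success only if input consumed
    first, rest = symbols[0], symbols[1:]
    if first in grammar:                       # non-terminal: try each rule
        return any(derives_top_down(tuple(rhs) + rest, words, grammar)
                   for rhs in grammar[first])
    # terminal: must match the next input word
    return bool(words) and words[0] == first and \
        derives_top_down(rest, words[1:], grammar)

# e.g. derives_top_down(("S",), tuple("she ate a breakfast".split()), GRAMMAR)
# returns True with the toy grammar above.
```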
Bottom-Up Parsing
• The parser starts with the words of the input, and tries to build trees
from the words up, by applying rules from the grammar one at a time
• The parser looks for places in the parse-in-progress where the right-hand
side of some rule might fit.
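The bottom-up search can be sketched the same way, as a naive backtracking shift-reduce recognizer over the same GRAMMAR: shift words onto a stack, and reduce whenever the top of the stack matches some rule's right-hand side. A sketch only; it assumes the grammar has no cycles of unary rules, or the search would not terminate:

```python
def derives_bottom_up(words, grammar, stack=()):
    """Backtracking shift-reduce recognition: succeed if the whole input
    can be reduced to the start symbol S."""
    if stack == ("S",) and not words:
        return True
    # Reduce: if the stack ends with some rule's RHS, replace it by the LHS.
    for lhs, rhss in grammar.items():
        for rhs in rhss:
            if len(rhs) <= len(stack) and stack[-len(rhs):] == tuple(rhs):
                if derives_bottom_up(words, grammar,
                                     stack[:-len(rhs)] + (lhs,)):
                    return True
    # Shift: move the next input word onto the stack.
    return bool(words) and derives_bottom_up(words[1:], grammar,
                                             stack + (words[0],))
```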
Grammatical relations
• Grammatical relations are a formalization of ideas from traditional grammar
such as SUBJECTS and OBJECTS, and other related notions.

• She ate a mammoth breakfast: here She is the SUBJECT and a mammoth
breakfast is the OBJECT.
Dynamic Programming Parsing
• To avoid extensive repeated work, we must cache intermediate results,
i.e. completed phrases.
• Caching (memoizing) is critical to obtaining a polynomial-time parsing
(recognition) algorithm for CFGs.
• Dynamic programming algorithms based on both top-down and bottom-up
search can achieve O(n³) recognition time, where n is the length of the
input string.
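One way to see where the polynomial bound comes from: the question "can non-terminal A derive the words between positions i and j?" has only O(n²) distinct instances per symbol, so caching each answer bounds the total work at O(n³). A hedged sketch of a memoized top-down recognizer, assuming the grammar is already in Chomsky normal form, in the {lhs: [rhs-tuple, ...]} encoding used above:

```python
from functools import lru_cache

def make_recognizer(grammar, words):
    """Return derives(sym, i, j): can `sym` derive exactly words[i:j]?
    lru_cache stores each (sym, i, j) answer, so shared sub-phrases are
    analyzed once instead of repeatedly."""
    @lru_cache(maxsize=None)
    def derives(sym, i, j):
        for rhs in grammar.get(sym, ()):
            if len(rhs) == 1 and j == i + 1 and words[i] == rhs[0]:
                return True                    # terminal rule covers one word
            if len(rhs) == 2 and any(derives(rhs[0], i, k) and
                                     derives(rhs[1], k, j)
                                     for k in range(i + 1, j)):
                return True                    # binary rule: try every split
        return False
    return derives

# e.g. make_recognizer(cnf_grammar, sentence)("S", 0, len(sentence))
```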
Dynamic Programming Parsing Methods
• CKY (Cocke-Kasami-Younger) algorithm: bottom-up; requires normalizing
(binarizing) the grammar.
• Earley parser: top-down; does not require normalizing the grammar, but is
more complex.
• More generally, chart parsers retain completed phrases in a chart and can
combine top-down and bottom-up search.
CKY Algorithm
• The grammar must first be converted to Chomsky normal form (CNF), in
which every production has
• either exactly two non-terminals on the RHS,
• or exactly one terminal symbol on the RHS.
• If we have a longer rule (for instance the illustrative schema A → B C D),
we replace the leftmost pair of non-terminals with a new non-terminal and
introduce a new production, resulting in the new rules X1 → B C and
A → X1 D.
• Parse bottom-up, storing the phrases formed from all substrings in a
triangular table (the chart).
Converting to CNF
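The leftmost-pair replacement just described is easy to mechanize. A sketch; it handles only the "too many non-terminals" case of CNF conversion (unit productions and rules mixing terminals with non-terminals need their own steps, omitted here):

```python
def binarize(grammar):
    """For every rule A -> B C D ... with more than two symbols on the RHS,
    repeatedly replace the leftmost pair with a fresh non-terminal X1, X2,
    ... and add the production Xn -> (leftmost pair)."""
    new_grammar, fresh = {}, 0
    for lhs, rhss in grammar.items():
        for rhs in rhss:
            rhs = list(rhs)
            while len(rhs) > 2:
                fresh += 1
                name = f"X{fresh}"
                new_grammar.setdefault(name, []).append((rhs[0], rhs[1]))
                rhs = [name] + rhs[2:]
            new_grammar.setdefault(lhs, []).append(tuple(rhs))
    return new_grammar

# e.g. binarize({"A": [("B", "C", "D")]})
# == {"X1": [("B", "C")], "A": [("X1", "D")]}
```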
CKY Algorithm
• Let n be the number of words in the input. Think of n + 1 lines
separating them, numbered 0 to n.
• The cell [i][j] of the table denotes the span of words between line i and
line j.
• We build a table so that cell [i][j] contains all the non-terminals that
can span exactly the words between line i and line j.
• We fill the table bottom-up, shortest spans first.
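Putting this together, here is a minimal sketch of the CKY recognizer, using the same line numbering (cell [i][j] holds the non-terminals spanning the words between lines i and j) and assuming the grammar is already in CNF:

```python
def cky_recognize(words, grammar, start="S"):
    """Bottom-up CKY recognition over a triangular table of spans."""
    n = len(words)
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):            # spans of width 1: the words
        for lhs, rhss in grammar.items():
            if (word,) in rhss:
                table[i][i + 1].add(lhs)
    for width in range(2, n + 1):               # then wider spans, bottom-up
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):           # every possible split point
                for lhs, rhss in grammar.items():
                    for rhs in rhss:
                        if (len(rhs) == 2 and rhs[0] in table[i][k]
                                and rhs[1] in table[k][j]):
                            table[i][j].add(lhs)
    return start in table[0][n]
```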
CKY for CFG
Dependency Grammars and Parsing
Dependency Structure
Representation
Dependency Structure
Some of the Universal Dependency relations
Criteria for Heads and Dependents
Comparison
Dependency tree

Dependency Graphs
• A dependency structure can be defined as a directed graph G, consisting of
• a set V of nodes, and
• a set A of arcs (edges).
• Labeled graphs:
• Nodes in V are labeled with word forms (and annotation).
• Arcs in A are labeled with dependency types.
• A dependency graph is a directed graph that satisfies the following
constraints:
1. There is a single designated root node that has no incoming arcs.
2. With the exception of the root node, each vertex has exactly one
incoming arc.
3. There is a unique path from the root node to each vertex in V.
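The three constraints are easy to check mechanically. A hedged sketch; the encoding (node 0 as ROOT, words numbered 1..n, arcs as (head, dependent) pairs) is a convention chosen here for illustration:

```python
from collections import Counter

def is_well_formed(n_words, arcs, root=0):
    """Check the three dependency-graph constraints listed above."""
    incoming = Counter(dep for _, dep in arcs)
    if incoming[root] != 0:                  # 1. root has no incoming arcs
        return False
    if any(incoming[v] != 1 for v in range(1, n_words + 1)):
        return False                         # 2. one incoming arc each
    # 3. Unique path from root to each vertex: given single heads, it
    #    suffices that following heads from any vertex reaches the root
    #    (i.e. there are no cycles).
    head = {dep: h for h, dep in arcs}
    for v in range(1, n_words + 1):
        seen, node = set(), v
        while node != root:
            if node in seen:
                return False                 # cycle detected
            seen.add(node)
            node = head[node]
    return True

# e.g. for "she ate breakfast": is_well_formed(3, {(0, 2), (2, 1), (2, 3)})
# returns True (ate heads she and breakfast; ROOT heads ate).
```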
Formal Conditions on Dependency Graphs
Formal Conditions: Basic Intuitions
Dependency Parsing
Transition-Based Dependency Parsing
• This architecture draws on shift-reduce parsing, a paradigm originally
developed for analyzing programming languages (Aho and Ullman, 1972).
• In transition-based parsing, the key components are a stack, an input
buffer of words, and a parser that takes actions on the parse.
• The parser proceeds left to right, successively shifting items from the
buffer onto the stack.

• At each time point, the parser examines the top two elements on the stack
and makes a decision about which transition to apply to build the parse.
• The possible transitions correspond to the intuitive actions one might
take in creating a dependency tree:
• Assign the current word as the head of some previously seen word,
• Assign some previously seen word as the head of the current word,
• Postpone dealing with the current word, storing it for later processing.

• This intuition is captured by the following three transition operators,
which operate on the top two elements of the stack:
• LEFTARC: Assert a head-dependent relation between the word at the top of
the stack and the second word; remove the second word from the stack.
• RIGHTARC: Assert a head-dependent relation between the second word on the
stack and the word at the top; remove the top word from the stack.
• SHIFT: Remove the word from the front of the input buffer and push it
onto the stack.
• Operations like LEFTARC and RIGHTARC are sometimes called reduce
operations, based on a metaphor from shift-reduce parsing, in which
reducing means combining elements on the stack.
• There are preconditions on applying these operators:
• The LEFTARC operator cannot be applied when ROOT is the second element
of the stack (since by definition the ROOT node cannot have any incoming
arcs).
• Both the LEFTARC and RIGHTARC operators require two elements to be on the
stack to be applied.
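A minimal sketch of this machinery (arc-standard style). In a real parser a trained classifier, the oracle, chooses the transition at each step; here the caller supplies the transition sequence, and the operators and their preconditions follow the definitions above:

```python
def transition_parse(n_words, transitions):
    """Apply a sequence of LEFTARC / RIGHTARC / SHIFT transitions.
    Words are numbered 1..n_words, with 0 as ROOT.
    Returns the resulting set of (head, dependent) arcs."""
    stack, buffer, arcs = [0], list(range(1, n_words + 1)), set()
    for t in transitions:
        if t == "SHIFT":
            stack.append(buffer.pop(0))       # front of buffer onto stack
        elif t == "LEFTARC":
            assert len(stack) >= 2 and stack[-2] != 0   # preconditions above
            dependent = stack.pop(-2)         # remove the second word
            arcs.add((stack[-1], dependent))  # top of stack is its head
        elif t == "RIGHTARC":
            assert len(stack) >= 2
            dependent = stack.pop()           # remove the top word
            arcs.add((stack[-1], dependent))  # second word is its head
    return arcs

# e.g. for the two words "book flight":
# transition_parse(2, ["SHIFT", "SHIFT", "RIGHTARC", "RIGHTARC"])
# == {(1, 2), (0, 1)}   (book heads flight; ROOT heads book)
```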
