Morphological Parsing
UNIT 2
Syntax: Formal Grammars of English - Word Level Analysis: Regular Expressions - Finite-State Automata - Syntactic Analysis / Parsing: Context-free Grammar - Types of Parsing: Morphological Parsing, Syntactic Parsing, Statistical Parsing, Probabilistic Parsing, Constituency Parsing - Spelling Error Detection and Correction - Words and Word Classes - Part-of-Speech Tagging.
2.3 PARSING
The word ‘parsing’ originates from the Latin word ‘pars’ (meaning ‘part’); parsing is used to draw the exact meaning, or dictionary meaning, from the text. It is also called syntactic analysis or syntax analysis.
Syntax analysis determines the syntactic structure of a text and checks the text for meaningfulness by comparing it against the rules of the formal grammar of the language.
Grammar is essential for describing the syntactic structure of well-formed programs. A mathematical model of grammar was given by Noam Chomsky in 1956, and it is effective for describing computer languages.
Mathematically, a grammar G can be formally written as a 4-tuple (N, T, S, P), where −
N or VN = the set of non-terminal symbols, i.e., variables.
T or Σ = the set of terminal symbols.
S = the start symbol, where S ∈ N.
P = the set of production rules for terminals as well as non-terminals. Each rule has the form α → β, where α and β are strings over VN ∪ Σ and at least one symbol of α belongs to VN.
Context-free grammar, also called CFG, is a notation for describing languages and a superset of regular grammar.
A CFG consists of a finite set of grammar rules with the following four components −
Set of Non-terminals: Denoted by V. The non-terminals are syntactic variables that denote sets of strings, which further help to define the language generated by the grammar.
Set of Terminals: Also called tokens, and denoted by Σ. Strings are formed from these basic symbols.
Set of Productions: Denoted by P. The set defines how the terminals and non-terminals can be combined. Every production consists of a non-terminal (the left side of the production), an arrow, and a string of terminals and/or non-terminals (the right side of the production).
Start Symbol: The derivation begins from the start symbol, denoted by S. One of the non-terminal symbols is always designated as the start symbol.
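As an illustration, here is a minimal sketch of these four components using NLTK's CFG class in Python (the toy grammar itself is an assumption chosen for this example):

import nltk

# Toy grammar: S is the start symbol; S, NP, VP, Det, N, V are the
# non-terminals; the quoted words are the terminals.
grammar = nltk.CFG.fromstring("""
    S -> NP VP
    NP -> 'I' | Det N
    VP -> V NP
    Det -> 'a'
    N -> 'fox'
    V -> 'saw'
""")

print(grammar.start())          # the start symbol S
for prod in grammar.productions():
    print(prod)                 # the production rules in P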
Regular Grammar
Regular grammars (RGs) are CFGs that generate regular languages. A regular grammar is a CFG whose productions are restricted to two forms, either A → a or A → aB, where A, B ∈ N and a ∈ T. Regular grammars are equivalent to regular expressions; they encode precisely those languages that can be recognized by a DFA.
Notice that productions in a regular grammar have at most one non-terminal on the right-hand side, and that this non-terminal always occurs at the end of a production.
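For example, the regular grammar with productions S → aS, S → bA and A → a generates exactly the strings of the form a…aba; the same language is described by the regular expression a*ba.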
Constituency Grammar
Phrase structure grammar, introduced by Noam Chomsky, is based on the constituency relation; that is why it is also called constituency grammar. It is the opposite of dependency grammar.
Dependency Grammar
Dependency grammar, in contrast, describes the syntax of a sentence in terms of dependency relations between individual words rather than phrasal constituents. A parse tree that uses constituency grammar is called a constituency-based parse tree, and a parse tree that uses dependency grammar is called a dependency-based parse tree.
Parsing techniques fall into two broad categories −
Top-down Parsing
Bottom-up Parsing
Top-down Parsing: In this kind of parsing, the parser starts constructing the parse tree from the start symbol and then tries to transform the start symbol into the input. The most common form of top-down parsing uses recursive procedures to process the input. The main disadvantage of recursive-descent parsing is backtracking.
Bottom-up Parsing: In this kind of parsing, the parser starts with the input symbols and tries to construct the parse tree up to the start symbol.
Deep Parsing vs. Shallow Parsing
Deep parsing: the search strategy gives a complete syntactic structure for a sentence. It is suitable for complex NLP applications; dialogue systems and summarization are examples of NLP applications where deep parsing is used.
Shallow parsing: the task of parsing only a limited part of the syntactic information from the given text. It can be used for less complex NLP applications; information extraction and text mining are examples of NLP applications where shallow parsing is used.
The process of deriving a string is called derivation. A parse tree (or derivation tree) may be defined as the graphical depiction of a derivation. The start symbol of the derivation serves as the root of the parse tree. In every parse tree, the leaf nodes are terminals and the interior nodes are non-terminals. A property of the parse tree is that reading its leaves from left to right produces the original input string.
1. Leftmost Derivation
The process of deriving a string by expanding the leftmost non-terminal at each step is called leftmost derivation, i.e., the input is scanned and replaced from left to right. The sentential form in this case is called the left-sentential form.
2. Rightmost Derivation
The process of deriving a string by expanding the rightmost non-terminal at each step is called rightmost derivation, i.e., the sentential form of the input is scanned and replaced from right to left. The sentential form in this case is called the right-sentential form.
Notes
For unambiguous grammars, the leftmost derivation and the rightmost derivation represent the same parse tree.
For ambiguous grammars, the leftmost derivation and the rightmost derivation may represent different parse trees.
In the example below, the given grammar is unambiguous; that is why its leftmost derivation and rightmost derivation represent the same parse tree.
Leftmost Derivation Tree = Rightmost Derivation Tree
Properties Of Parse Tree
The root node of a parse tree is the start symbol of the grammar.
Each leaf node of a parse tree represents a terminal symbol.
Each interior node of a parse tree represents a non-terminal symbol.
A parse tree is independent of the order in which the productions are used during derivation.
Problem
Consider the grammar −
S → bB / aA
A → b / bS / aAA
B → a / aS / bBB
Derive the string bbaababa using leftmost derivation and rightmost derivation.
Solution
1. Leftmost Derivation
S → bB
→ bbBB (Using B → bBB)
→ bbaB (Using B → a)
→ bbaaS (Using B → aS)
→ bbaabB (Using S → bB)
→ bbaabaS (Using B → aS)
→ bbaababB (Using S → bB)
→ bbaababa (Using B → a)
2. Rightmost Derivation
S → bB
→ bbBB (Using B → bBB)
→ bbBaS (Using B → aS)
→ bbBabB (Using S → bB)
→ bbBabaS (Using B → aS)
→ bbBababB (Using S → bB)
→ bbBababa (Using B → a)
→ bbaababa (Using B → a)
3. Parse Tree
[Figure: the parse tree for bbaababa, common to both derivations]
Whether we consider the leftmost derivation or the rightmost derivation, we get the same parse tree. The reason is that the given grammar is unambiguous.
2.3.5 PARSER
The main roles of the parser include the following:
To report any syntax error.
To recover from commonly occurring errors so that the processing of the remainder of the program can continue.
To create the parse tree.
To create the symbol table.
To produce intermediate representations (IR).
A parser is basically a procedural interpretation of a grammar. It finds an optimal tree for the given sentence after searching through the space of a variety of trees. Commonly used types of parsers include the following:
i. Recursive descent Parser
ii. Shift-reduce Parser
iii. Chart Parser
iv. Regexp parser
v. Dependency Parser
vi. Morphological Parser
vii. Constituency Parser
iv. Regexp Parser
Regexp parsing is one of the most widely used parsing techniques. Following are some important points about the regexp parser:
As the name implies, it uses a regular expression, defined in the form of a grammar, on top of a POS-tagged string.
It basically uses these regular expressions to parse the input sentences and generate a parse tree, as in the sketch below.
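A minimal sketch with NLTK's RegexpParser (the chunk pattern and the tagged sentence are assumptions chosen for illustration):

import nltk

# Chunk grammar: an NP is an optional determiner, any number of
# adjectives, and then a noun. The pattern matches POS tags, not words.
chunker = nltk.RegexpParser("NP: {<DT>?<JJ>*<NN>}")

# A POS-tagged input sentence (Penn Treebank tag set).
tagged = [("the", "DT"), ("quick", "JJ"), ("fox", "NN"),
          ("jumped", "VBD"), ("over", "IN"), ("the", "DT"), ("dog", "NN")]

# Produces a tree with two NP chunks: (the quick fox) and (the dog).
print(chunker.parse(tagged))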
v. Dependency Parser
Dependency Parsing (DP) refers to examining the dependencies between the words of a
sentence to analyze its grammatical structure. Dependency parsing doesn’t make use of
phrasal constituents or sub-phrases. Instead, the syntax of the sentence is expressed in terms
of dependencies between words — that is, directed, typed edges between words in a graph.
More formally, a dependency parse tree is a graph G = (V, E) where the set of vertices V contains the words in the sentence, and each edge in E connects two words. The graph must satisfy three conditions:
o There is a single root node with no incoming edges.
o Every vertex except the root has exactly one incoming edge.
o There is a unique path from the root to every vertex in V.
Additionally, each edge in E has a type, which defines the grammatical relation that occurs between the two words.
Let us see what the example sentence “I saw a fox” looks like if we perform dependency parsing:
[Figure: dependency parse tree of “I saw a fox”]
As we can see, the result is completely different from a constituency parse. With this approach, the root of the tree is the verb of the sentence, and the edges between words describe their relationships.
For example, the word “saw” has an outgoing edge of type nsubj to the word “I”, meaning
that “I” is the nominal subject of the verb “saw”. In this case, we say that “I” depends
on “saw”.
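As a sketch, a dependency parse like the one described above can be produced with spaCy (this assumes the en_core_web_sm model has been downloaded; the exact labels may vary by model version):

import spacy

nlp = spacy.load("en_core_web_sm")  # small English pipeline
doc = nlp("I saw a fox")

# For each word, print its dependency relation and its head word.
for token in doc:
    print(f"{token.text:<4} --{token.dep_}--> {token.head.text}")

# Expected output (approximately):
#   I    --nsubj--> saw
#   saw  --ROOT--> saw
#   a    --det--> fox
#   fox  --dobj--> saw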
Morphemes are the smallest meaning-bearing units of a language. Example: we can break the word foxes into two parts, fox and -es; that is, the word foxes is made up of two morphemes, one being fox and the other -es.
Morphemes can be divided into two types −
i. Stems
The stem is the core meaningful unit of a word; we can also say that it is the root of the word. Example: in the word foxes, the stem is fox.
ii. Affixes
As the name suggests, affixes add some additional meaning and grammatical function to words. For example, in the word foxes, the affix is -es.
Affixes can further be divided into the following four types −
o Prefixes − As the name suggests, prefixes precede the stem. For example, in the word unbuckle, un- is the prefix.
o Suffixes − As the name suggests, suffixes follow the stem. For example, in the word cats, -s is the suffix.
o Infixes − As the name suggests, infixes are inserted inside the stem. For example, the word cupful can be pluralized as cupsful by using -s as an infix.
o Circumfixes − Circumfixes precede and follow the stem. There are very few examples of circumfixes in the English language. A common example is the pattern a-…-ing, as in a-hunting, where a- precedes and -ing follows the stem.
Word Order
The ordering of the component morphemes within a word is also determined during morphological parsing.
Morphology
Morphology is the study of the following:
The formation of words.
The origin of the words.
Grammatical forms of the words.
Use of prefixes and suffixes in the formation of words.
How parts-of-speech (PoS) of a language are formed.
Morphological parsing is the problem of recognizing that a word breaks down into smaller meaningful units called morphemes, and producing some sort of linguistic structure for it.
Requirements for building a Morphological parser
Let us now see the requirements for building a morphological parser
Lexicon: This includes the list of stems and affixes, along with basic information about them; for example, information such as whether a stem is a noun stem or a verb stem.
Morphotactics: This is basically the model of morpheme ordering; in other words, the model explains which classes of morphemes can follow other classes of morphemes inside a word. For example, one morphotactic fact is that the English plural morpheme always follows the noun rather than preceding it.
Orthographic rules: These spelling rules are used to model the changes that occur in a word. For example, the rule converting y to ie in words like city + s = cities, not citys.
The goal of morphological parsing is to find out what morphemes a given word is built from.
For example, a morphological parser should be able to tell us that the word cats is the plural
form of the noun stem cat, and that the word mice is the plural form of the noun stem mouse.
So, given the string cats as input, a morphological parser should produce an output that looks
similar to cat N PL. Here are some more examples:
mouse → mouse N SG
mice → mouse N PL
foxes → fox N PL
To get from the surface form of a word to its morphological analysis, we proceed in two steps.
Step 1: Split the word up into its possible components. For example, cats is split into cat + s, and foxes may be split into fox + s (or, incorrectly, into foxe + s).
Step 2: Use a lexicon of stems and affixes to look up the categories of the stems and the meaning of the affixes.
cat + s will get mapped to cat N PL, and
fox + s to fox N PL.
We will also find out now that foxe is not a legal stem. This tells us that
splitting foxes into foxe + s was actually an incorrect way of splitting foxes, which
should be discarded.
Note: For the word houses splitting it into house + s is correct.
[Figure: the two steps of the morphological parser, illustrated with examples such as cats → cat + s → cat N PL]
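The following is a minimal sketch of the two steps in Python; the tiny lexicon, the suffix-splitting rules and the irregular-plural table are all assumptions chosen for illustration (a real morphological parser would typically be built from finite-state transducers):

# Step 2 resources: a lexicon of stems with their categories, plus a
# table of irregular forms.
LEXICON = {"cat": "N", "fox": "N", "house": "N", "mouse": "N"}
IRREGULAR = {"mice": "mouse"}

def morph_parse(word):
    if word in IRREGULAR:                 # irregular forms are looked up directly
        return f"{IRREGULAR[word]} N PL"
    # Step 1: propose splits of the surface form into stem + affix,
    # undoing orthographic rules (e.g. foxes -> fox + s, not foxe + s).
    candidates = []
    if word.endswith("es"):
        candidates.append(word[:-2])      # foxes -> fox
    if word.endswith("s"):
        candidates.append(word[:-1])      # cats -> cat, foxes -> foxe
    # Step 2: keep only the splits whose stem is in the lexicon.
    for stem in candidates:
        if stem in LEXICON:
            return f"{stem} {LEXICON[stem]} PL"
    if word in LEXICON:
        return f"{word} {LEXICON[word]} SG"
    return None                           # not a recognized word

print(morph_parse("cats"))    # cat N PL
print(morph_parse("foxes"))   # fox N PL (the split foxe + s is discarded)
print(morph_parse("mice"))    # mouse N PL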
The constituency parse tree is based on the formalism of context-free grammars. In this type
of tree, the sentence is divided into constituents, that is, sub-phrases that belong to a specific
category in the grammar.
In English, for example, the phrases “a dog”, “a computer on the table” and “the nice sunset”
are all noun phrases, while “eat a pizza” and “go to the beach” are verb phrases.
The grammar provides a specification of how to build valid sentences, using a set of rules.
As an example, the rule VP → V NP means that we can form a verb phrase (VP) using a verb (V) followed by a noun phrase (NP).
While we can use these rules to generate valid sentences, we can also apply them the other
way around, in order to extract the syntactical structure of a given sentence according to the
grammar.
Example of a constituency parse tree for the simple sentence, “I saw a fox”:
[Figure: constituency parse tree of “I saw a fox”]
A constituency parse tree always contains the words of the sentence as its terminal nodes. Usually, each word has a parent node containing its part-of-speech tag (noun, adjective, verb, etc.), although this may be omitted in other graphical representations.
All the other non-terminal nodes represent the constituents of the sentence and are usually one of verb phrase (VP), noun phrase (NP), or prepositional phrase (PP).
To sum up, constituency parsing creates trees containing a syntactical representation of a sentence, according to a context-free grammar. This representation is highly hierarchical and divides the sentence into its phrasal constituents.
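As a sketch, the constituency parse of “I saw a fox” can be computed with NLTK's chart parser over a toy grammar (the grammar is an assumption chosen to cover this one sentence; real parsers use far larger grammars):

import nltk

# A toy CFG that covers the example sentence.
grammar = nltk.CFG.fromstring("""
    S -> NP VP
    NP -> 'I' | Det N
    VP -> V NP
    Det -> 'a'
    N -> 'fox'
    V -> 'saw'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("I saw a fox".split()):
    tree.pretty_print()   # draws the constituency tree as ASCII art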
Probabilistic parsing uses dynamic programming algorithms to compute the most likely
parse(s) of a given sentence, given a statistical model of the syntactic structure of a language.
CFG: A context-free grammar consists of:
1. a set of non-terminal symbols N
2. a set of terminal symbols Σ (disjoint from N)
3. a set of productions P, each of the form A → α, where A is a non-terminal and α is a string of symbols from the infinite set of strings (Σ ∪ N)*
4. a designated start symbol S
Probabilistic CFGs / Stochastic Grammars (PCFGs)
A probabilistic CFG augments each rule in P with a conditional probability: A → β [p], where p is the probability that the non-terminal A will be expanded to the sequence β. This is often written as P(A → β) or P(A → β | A).
Why are PCFGs useful?
• They assign a probability to each parse tree T.
• They are useful in disambiguation: choose the most likely parse. If we make independence assumptions, the probability of a parse is P(T) = ∏_{n∈T} p(r(n)), where r(n) is the rule used to expand node n of the tree.
• They are useful in language modeling tasks.
Example:
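A minimal sketch using NLTK's PCFG class and Viterbi parser (the toy grammar and its probabilities are assumptions chosen for illustration):

import nltk

# Toy PCFG: each rule carries a conditional probability, and the
# probabilities of all rules sharing a left-hand side sum to 1.
grammar = nltk.PCFG.fromstring("""
    S -> NP VP    [1.0]
    NP -> 'I'     [0.5]
    NP -> Det N   [0.5]
    VP -> V NP    [1.0]
    Det -> 'a'    [1.0]
    N -> 'fox'    [1.0]
    V -> 'saw'    [1.0]
""")

# The Viterbi parser uses dynamic programming to find the most probable
# parse; its probability is the product of the probabilities of the
# rules used in the tree: P(T) = prod_{n in T} p(r(n)).
parser = nltk.ViterbiParser(grammar)
for tree in parser.parse("I saw a fox".split()):
    print(tree)
    print(tree.prob())   # 1.0 * 0.5 * 1.0 * 1.0 * 0.5 * 1.0 * 1.0 = 0.25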
Statistical parsing is the task of computing the most probable parse of a sentence given a
probabilistic (or weighted) context-free grammar (CFG). The weights of the probabilistic or
weighted CFG are typically learned on a corpus of texts.
Parsing natural language presents several challenges that don’t occur when parsing
programming languages. The reason for this is that natural language is often ambiguous,
meaning there can be multiple valid parse trees for the same sentence.
Let’s consider for a moment the sentence, “I shot an elephant in my pajamas”. It has two possible interpretations: one where the speaker was wearing pajamas while shooting the elephant, and the other where the elephant was inside the speaker’s pajamas.
These are both valid from a syntactical perspective, but humans are able to solve these
ambiguities very quickly – and often unconsciously – since many of the possible
interpretations are unreasonable for their semantics or for the context in which the sentence
occurs. However, it’s not as easy for a parsing algorithm to select the most likely parse tree
with great accuracy.
To do this, most modern parsers use supervised machine learning models that are trained on
manually annotated data. Since the data is annotated with the correct parse trees, the model
will learn a bias towards more likely interpretations.
2.3.8 CONCLUSION
• Basic parsing approaches without constraints are not practical in real applications.
• Whatever approach is taken, bear in mind that the lexicon is the real bottleneck.
• There is a real trade-off between coverage and efficiency, so it is a good idea to sacrifice broad coverage (e.g., domain-specific parsers, controlled language), or to use a scheme that minimizes the disadvantages (e.g., probabilistic parsing).
– From a computational perspective, a parser provides a formalism for writing linguistic rules and an implementation which can apply the rules to an input text.
– An interface to allow grammar development and testing (e.g., tracing rules, showing trees), and an interface with the application of which the parser is a part (which may be hidden from the end-user), are both necessary.
• All of the above should be tailored to meet the needs of the application.