Machine 22

This document discusses different types of parsing including chart parsing, regex parsing, dependency parsing, and chunking. It provides examples of how to implement each type of parsing using the NLTK library in Python. Chart parsing uses dynamic programming to efficiently parse text. Regex parsing uses regular expressions to parse part-of-speech tagged sentences. Dependency parsing represents linguistic relationships between words using directed links. Chunking performs shallow parsing by grouping chunks of related words in a sentence without fully analyzing syntactic structure.

Uploaded by

shahzad sultan

Chapter 4

A chart parser
We will apply the algorithm design technique of dynamic programming to the
parsing problem. Dynamic programming stores intermediate results and reuses
them when appropriate, which yields significant efficiency gains. Applied to
syntactic parsing, it lets us store partial solutions to the parsing task and
look them up when needed, so that we arrive at a complete solution efficiently.
This approach to parsing is known as chart parsing.
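
As a minimal sketch of this idea, the following uses NLTK's ChartParser on a toy grammar. The grammar and the sentence are assumptions made up for illustration; they are not part of the chapter. The chart holds the edges (partial analyses) built during parsing, which is where the stored intermediate results live.

```python
import nltk

# Toy grammar: an assumption for illustration only.
grammar = nltk.CFG.fromstring("""
    S -> NP VP
    NP -> Det N | Det Adj N
    VP -> V NP
    Det -> 'the'
    Adj -> 'big'
    N -> 'dog' | 'cat'
    V -> 'chased'
""")

chart_parser = nltk.ChartParser(grammar)
tokens = "the big dog chased the cat".split()

# The chart records every edge (partial or complete constituent) found so far,
# so a sub-phrase such as "the cat" is analysed once and then reused.
chart = chart_parser.chart_parse(tokens)
print("edges stored in the chart:", chart.num_edges())

# parse() yields each complete tree that the grammar licenses for the tokens.
trees = list(chart_parser.parse(tokens))
for tree in trees:
    print(tree)
```

With an ambiguous grammar the same call would yield more than one tree, and the shared sub-phrases would still be stored in the chart only once.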

For a better understanding of the parsers, you can go through an example at
http://www.nltk.org/howto/parse.html.

A regex parser
A regex parser uses regular expressions, written in the form of a grammar, on top
of a POS-tagged string. The parser uses these regular expressions to parse the given
sentences and generate a parse tree from them. A working example of the regex
parser is given here:
# Regex parser
>>> import nltk
>>> from nltk.chunk.regexp import ChunkRule, RegexpParser
>>> # A standalone rule that chunks any tag sequence (not used by the grammar below)
>>> chunk_rules = ChunkRule("<.*>+", "chunk everything")
>>> reg_parser = RegexpParser('''
...     NP: {<DT>? <JJ>* <NN>*}   # NP
...     P: {<IN>}                 # Preposition
...     V: {<V.*>}                # Verb
...     PP: {<P> <NP>}            # PP -> P NP
...     VP: {<V> <NP|PP>*}        # VP -> V (NP|PP)*
... ''')
>>> test_sent = "Mr. Obama played a big role in the Health insurance bill"
>>> test_sent_pos = nltk.pos_tag(nltk.word_tokenize(test_sent))
>>> parsed_out = reg_parser.parse(test_sent_pos)
>>> parsed_out
Tree('S', [('Mr.', 'NNP'), ('Obama', 'NNP'), Tree('VP', [Tree('V',
[('played', 'VBD')]), Tree('NP', [('a', 'DT'), ('big', 'JJ'), ('role',
'NN')])]), Tree('P', [('in', 'IN')]), ('Health', 'NNP'), Tree('NP',
[('insurance', 'NN'), ('bill', 'NN')])])

Parsing Structure in Text

The following is a graphical representation of the tree for the preceding code:

(Root
  (NP (NNP Mr.) (NNP Obama))
  (VP (VBD played)
    (NP (DT a) (JJ big) (NN role))
    (PP (IN in)
      (NP (DT the) (NNP Health) (NN insurance) (NN bill)))))

In the current example, we define the kind of patterns (regular expressions over
POS tags) that we think will make up a phrase. For example, the pattern
{<DT>? <JJ>* <NN>*} says that an optional determiner followed by any number of
adjectives and then nouns is most likely a noun phrase. This is a linguistic rule
that we have defined by hand to get a rule-based parse tree.
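
To see a single pattern at work in isolation, here is a small sketch that applies one NP rule to a hand-tagged sentence. The tags are written by hand (an assumption, so that no tagger model is needed), and the rule uses <NN>+ instead of <NN>* so that a chunk must contain at least one noun; both choices are illustrative.

```python
from nltk.chunk import RegexpParser

# Hand-tagged tokens: an assumption so the example needs no tagger model.
tagged = [("Obama", "NNP"), ("played", "VBD"),
          ("a", "DT"), ("big", "JJ"), ("role", "NN")]

# Variant of the NP rule from the text; <NN>+ requires at least one noun.
np_parser = RegexpParser("NP: {<DT>?<JJ>*<NN>+}")
tree = np_parser.parse(tagged)

# Collect the words of every subtree that was labelled NP.
noun_phrases = [
    " ".join(word for word, tag in subtree.leaves())
    for subtree in tree.subtrees(lambda t: t.label() == "NP")
]
print(noun_phrases)
```

Note that <NN> matches only the tag NN, so the proper noun Obama (NNP) stays outside the chunk; a pattern such as <NN.*> would be needed to include it.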

Dependency parsing
Dependency parsing (DP) is a modern parsing mechanism. The main concept of DP
is that linguistic units (words) are connected to each other by directed links.
These links are called dependencies in linguistics. Dependency parsing is an active
area of work in the current parsing community. While phrase structure parsing is
still widely used, dependency parsing has turned out to be more effective for free
word order languages (such as Czech and Turkish).
A very clear distinction can be seen by comparing the parse trees generated by a
phrase structure grammar and a dependency grammar for the same sentence, for
example "The big dog chased the cat". The parse trees for this sentence are:


[Figure: phrase structure tree and dependency tree for "the big dog chased the cat"]

Phrase structure tree:
  (NP (Art the) (Adj big) (N dog))
  (VP (V chased) (NP (Art the) (N cat)))

If we look at both parse trees, the phrase structure tree tries to capture the
relationships between words and phrases, and eventually between phrases. A
dependency tree, in contrast, only looks at dependencies between words; for
example, big is dependent on dog.
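
The word-to-word view can be sketched in plain Python as a list of directed links. Only the link from big to dog is stated in the text; the other links and all relation labels (det, amod, nsubj, obj, root) are assumptions chosen for illustration.

```python
# Each entry is (index, word, head_index, relation).
# head_index 0 stands for an artificial ROOT node.
# Relation labels and most heads are illustrative assumptions.
arcs = [
    (1, "The", 3, "det"),
    (2, "big", 3, "amod"),
    (3, "dog", 4, "nsubj"),
    (4, "chased", 0, "root"),
    (5, "the", 6, "det"),
    (6, "cat", 4, "obj"),
]

words = {index: word for index, word, _, _ in arcs}
words[0] = "ROOT"


def head_of(index):
    """Return the word that the token at `index` depends on."""
    for token_index, _, head_index, _ in arcs:
        if token_index == index:
            return words[head_index]
    raise KeyError(index)


def dependents_of(index):
    """Return the words that depend directly on the token at `index`."""
    return [word for _, word, head_index, _ in arcs if head_index == index]


print(head_of(2))        # head of "big"
print(dependents_of(4))  # direct dependents of "chased"
```

Every word has exactly one head, which is what makes the structure a tree over words rather than over phrases.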
NLTK provides a couple of ways to do dependency parsing. One of them is to use
a probabilistic, projective dependency parser, but it has the restriction of being
trained on a limited set of training data. One of the state-of-the-art dependency
parsers is the Stanford parser. Fortunately, NLTK has a wrapper around it, and in
the following example, I will show how to use the Stanford parser with NLTK:
# Stanford Parser [Very useful]
>>> from nltk.parse.stanford import StanfordParser
>>> english_parser = StanfordParser('stanford-parser.jar',
...                                 'stanford-parser-3.4-models.jar')
>>> # raw_parse_sents expects a list of sentence strings
>>> english_parser.raw_parse_sents(["this is the english parser test"])
Parse
(ROOT
  (S
    (NP (DT this))
    (VP (VBZ is)
      (NP (DT the) (JJ english) (NN parser) (NN test)))))
Universal dependencies
nsubj(test-6, this-1)
cop(test-6, is-2)
det(test-6, the-3)
amod(test-6, english-4)
compound(test-6, parser-5)
root(ROOT-0, test-6)


Universal dependencies, enhanced

nsubj(test-6, this-1)
cop(test-6, is-2)
det(test-6, the-3)
amod(test-6, english-4)
compound(test-6, parser-5)
root(ROOT-0, test-6)

The output looks quite complex but, in reality, it's not. It consists of three
parts. The first is the POS tags and the parse tree of the given sentence, which
is also drawn in the following figure. The second is the dependencies between
the given words, along with their positions. The third is the enhanced version
of the dependencies:

(Root
  (NP (DT this))
  (VP (VBZ is)
    (NP (DT the) (JJ english) (NN parser) (NN test))))
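
The first part of that output can also be inspected programmatically. As a small sketch, the bracketed parse shown earlier is loaded here with NLTK's Tree class, so that the POS tags and the words can be read off without running the Stanford parser itself. The parse string is copied from the output above; nothing else is assumed.

```python
from nltk.tree import Tree

# Bracketed parse copied from the parser output shown earlier.
parse_str = """
(ROOT
  (S
    (NP (DT this))
    (VP (VBZ is)
      (NP (DT the) (JJ english) (NN parser) (NN test)))))
""".strip()

tree = Tree.fromstring(parse_str)

# pos() pairs every word with its POS tag; leaves() returns just the words.
pos_tags = tree.pos()
words = tree.leaves()
print(pos_tags)
print(words)

# Labels of all phrase-level nodes that directly contain other subtrees.
phrase_labels = [t.label() for t in tree.subtrees() if isinstance(t[0], Tree)]
print(phrase_labels)
```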

For a better understanding of how to use a Stanford parser, refer to
http://nlpviz.bpodgursky.com/home and
http://nlp.stanford.edu:8080/parser/index.jsp.

Chunking
Chunking is shallow parsing where, instead of reaching for the deep structure
of the sentence, we try to group together parts of the sentence that each carry
some meaning.


A chunk can be defined as the minimal unit that can be processed. So, for example, the
sentence "the President speaks about the health care reforms" can be broken into two
chunks. One is "the President", which is noun dominated and hence is called a noun
phrase (NP). The remaining part of the sentence is dominated by a verb, hence it is
called a verb phrase (VP). If you look closely, there is one more sub-chunk in the part
"speaks about the health care reforms": it contains another NP, so it can be broken
down again into "speaks about" and "the health care reforms", as shown in the
following figure:

[The President]NP speaks about [The Health Care Reforms]NP

This is how we broke the sentence into parts and that's what we call chunking.
Formally, chunking can also be described as a processing interface to identify
non-overlapping groups in unrestricted text.
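
The non-overlapping groups can be made explicit with IOB tags, where B-NP opens a chunk, I-NP continues it, and O marks a word outside any chunk. The following is a minimal sketch on the example sentence; the POS tags are written by hand and the NP rule is an assumption chosen for illustration.

```python
from nltk.chunk import RegexpParser, tree2conlltags

# Hand-tagged version of the example sentence (tags are an assumption).
tagged = [("the", "DT"), ("President", "NNP"), ("speaks", "VBZ"),
          ("about", "IN"), ("the", "DT"), ("health", "NN"),
          ("care", "NN"), ("reforms", "NNS")]

# Illustrative NP rule: optional determiner, adjectives, one or more nouns.
chunker = RegexpParser("NP: {<DT>?<JJ>*<NN.*>+}")
tree = chunker.parse(tagged)

# tree2conlltags flattens the chunk tree into (word, pos, iob) triples.
iob = tree2conlltags(tree)
for word, pos, chunk_tag in iob:
    print(word, pos, chunk_tag)

chunk_tags = [chunk_tag for _, _, chunk_tag in iob]
```

Because every word receives exactly one tag, no word can belong to two chunks, which is the non-overlapping property mentioned above.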

Now, we understand the difference between shallow and deep parsing. In deep
parsing, we reach the full syntactic structure of the sentence with the help of a
CFG, and in some cases we also need to go for semantic parsing to understand the
meaning of the sentence. On the other hand, there are cases where we don't need
analysis this deep. Let's say that, from a large portion of unstructured text, we
just want to extract the key phrases, named entities, or specific patterns of
entities. For this, we will go for shallow parsing instead of deep parsing, because
deep parsing involves processing the sentence against all the grammar rules and
generating a variety of syntactic trees until the parser arrives at the best tree
through backtracking and reiteration. This entire process is time consuming and
cumbersome and, even after all the processing, you might not get the right parse
tree. Shallow parsing only produces a shallow parse structure in terms of chunks,
which is relatively faster.

So, let's write some code snippets to do some basic chunking:

# Chunking
>>> import nltk
>>> from nltk.chunk.regexp import *
>>> test_sent = ("The prime minister announced he had asked the chief "
...              "government whip, Philip Ruddock, to call a special party room "
...              "meeting for 9am on Monday to consider the spill motion.")
>>> test_sent_pos = nltk.pos_tag(nltk.word_tokenize(test_sent))
>>> # Chunk one or more consecutive verb tags, optionally followed by a pronoun
>>> rule_vp = ChunkRule(r'(<VB.*>)?(<VB.*>)+(<PRP>)?', 'Chunk VPs')
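
One way such a rule can be applied is through RegexpChunkParser, which takes a list of rules and a label for the chunks they produce. The sketch below is an assumption about usage, not a continuation of the snippet above: it runs the same verb pattern on a short hand-tagged fragment so that no tagger model is needed.

```python
from nltk.chunk.regexp import ChunkRule, RegexpChunkParser
from nltk.tree import Tree

# Hand-tagged fragment (an assumption; tags written by hand).
tagged = [("he", "PRP"), ("had", "VBD"), ("asked", "VBN"),
          ("the", "DT"), ("whip", "NN")]

# Same verb pattern as above: consecutive verb tags, optional trailing pronoun.
rule_vp = ChunkRule(r'(<VB.*>)?(<VB.*>)+(<PRP>)?', 'Chunk VPs')

# chunk_label names the node created for each matched chunk.
parser_vp = RegexpChunkParser([rule_vp], chunk_label='VP')
tree = parser_vp.parse(Tree('S', tagged))
print(tree)

verb_chunks = [
    [word for word, tag in subtree.leaves()]
    for subtree in tree.subtrees(lambda t: t.label() == 'VP')
]
print(verb_chunks)
```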
