NLP 4
Syntax Analysis
Mrs. Priyanka Bhoir
Steps in NLP
• Morphological Analysis
• Syntactic Analysis (e.g., parsing "John ate the apple")
• Semantic Analysis
POS Tagging
Annotate each word in a sentence with a part-of-speech.
POS Tags / Word Classes
o 9 traditional word classes of parts of speech
◦ Noun, verb, adjective, preposition, adverb, article, pronoun, conjunction, interjection
N noun chair, bandwidth, pacing
V verb study, debate, munch
ADJ adjective purple, tall, ridiculous
ADV adverb unfortunately, slowly
P preposition of, by, to
PRO pronoun I, me, mine
DET determiner the, a, that, those
Defining POS Tagging
The process of assigning a part-of-speech tag to each word in an input text or corpus.
WORDS: the koala put the keys on the table
TAGS: N, V, P, DET
Tagged: the/DET koala/N put/V the/DET keys/N on/P the/DET table/N
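As a quick illustration of what a tagger produces, here is a minimal sketch using NLTK's off-the-shelf pretrained tagger; it assumes the nltk package and its tokenizer and tagger resources are installed, and the exact tags can vary by model version.

# Minimal sketch: tag a sentence with NLTK's pretrained tagger.
# Assumes: pip install nltk, plus nltk.download('punkt') and
# nltk.download('averaged_perceptron_tagger') have been run once.
import nltk

tokens = nltk.word_tokenize("The koala put the keys on the table")
print(nltk.pos_tag(tokens))
# Roughly: [('The', 'DT'), ('koala', 'NN'), ('put', 'VBD'), ('the', 'DT'),
#           ('keys', 'NNS'), ('on', 'IN'), ('the', 'DT'), ('table', 'NN')]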
Part-of-Speech Tagsets
There are various tag sets to choose from.
•The choice of the tag set depends on the nature of the application.
–We may use a small tag set (more general tags) or
–a large tag set (finer tags).
•Some widely used part-of-speech tag sets:
–Penn Treebank has 45 tags
–Brown Corpus has 87 tags
–C7 tag set has 146 tags
•In a tagged corpus, each word is associated with a tag from the used tag set.
Penn Treebank Tagset
Tag Ambiguity
Tagging Whole Sentences with POS is Hard
Ambiguous POS contexts
◦ E.g., Time flies like an arrow. (time can be a noun or a verb, flies a noun or a verb, like a verb or a preposition)
How do we disambiguate POS?
❑ Tagging is a disambiguation task: words are ambiguous (they have more than one
possible part of speech), and the goal is to find the correct tag for the situation.
❑ Many words have only one POS tag (e.g., is, Mary, smallest).
❑ Transformation-based tagging
◦ Learned rules (statistical and linguistic)
◦ E.g., Brill tagger
Rule-Based Tagging
o The rule-based approach uses handcrafted sets of rules to tag the input sentence.
o Stage 1:
o Typically…start with a dictionary of words and possible tags
o Assign all possible tags to words using the dictionary
o Stage 2:
o Write rules by hand to selectively remove tags
o Stop when each word has exactly one (presumably correct) tag
Rules based on context
Please book that flight.   (book = verb)
I bought a book.           (book = noun)
•Rule-1: if the previous word is "to", then eliminate all noun tags.
•Rule-2: if the previous tag is an article, then eliminate all verb tags.
(Rules look at a context window around the ambiguous word "book": words W(n-1), W(n), W(n+1) and tags T(n-1), T(n), T(n+1).)
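The sketch below walks through both stages with a hypothetical toy dictionary and the two rules above; the dictionary entries and tag choices are made up for illustration and are not taken from any real tagger.

# Toy two-stage rule-based tagger: dictionary lookup, then rule-based pruning.
# The dictionary and rules are illustrative only.
TAG_DICT = {
    "please": {"UH", "VB"},
    "book":   {"NN", "VB"},
    "that":   {"DT", "IN"},
    "flight": {"NN"},
    "i":      {"PRP"},
    "bought": {"VBD"},
    "a":      {"DT"},
    "to":     {"TO"},
}

def tag(words):
    # Stage 1: assign all possible tags from the dictionary (default NN if unknown).
    candidates = [set(TAG_DICT.get(w.lower(), {"NN"})) for w in words]
    # Stage 2: prune with hand-written contextual rules.
    for i in range(1, len(words)):
        prev = words[i - 1].lower()
        # Rule-1: after "to", eliminate noun readings (if another reading remains).
        if prev == "to" and candidates[i] - {"NN"}:
            candidates[i] -= {"NN"}
        # Rule-2: after an article/determiner, eliminate verb readings.
        if "DT" in TAG_DICT.get(prev, set()) and candidates[i] - {"VB", "VBD"}:
            candidates[i] -= {"VB", "VBD"}
    return list(zip(words, candidates))

print(tag("Please book that flight".split()))  # "book" stays ambiguous: no rule fires
print(tag("I bought a book".split()))          # Rule-2 leaves only the NN reading of "book"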
Start with a POS Dictionary
Step 1: Assign All Possible POS to Each Word
She       → PRP
promised  → VBD, VBN
to        → TO
back      → VB, JJ, RB, NN
the       → DT
bill      → NN, VB
Step 2: Apply Rules Eliminating Some POS
E.g., eliminate VBN if VBD is an option when VBN|VBD follows "<start> PRP"
She       → PRP
promised  → VBD, VBN   (the rule applies here)
to        → TO
back      → VB, JJ, RB, NN
the       → DT
bill      → NN, VB
Apply Rules Eliminating Some POS
After applying the rule, the VBN reading of "promised" is eliminated:
She       → PRP
promised  → VBD
to        → TO
back      → VB, JJ, RB, NN
the       → DT
bill      → NN, VB
Rule-Based Part-of-Speech Tagging: Example
(Figure: an example sentence with candidate tags Pronoun, Verb, Article, Verb/Noun, shown before and after rule application.)
Properties of Rule-Based POS Tagging
o Rule-based POS taggers possess the following properties:
o These taggers are knowledge-driven taggers.
o The rules in rule-based POS tagging are built manually.
o The information is coded in the form of rules.
o They use a limited number of rules, approximately around 1000.
o Smoothing and language modeling are defined explicitly in rule-based taggers.
EngCG ENGTWOL Tagger
• Richer dictionary: includes morphological and syntactic features as well as possible POS.
• Uses two-level morphological analysis on the input and returns all possible POS.
• Applies negative constraints (> 3,744) to rule out incorrect POS.
Sample ENGTWOL Dictionary
ENGTWOL Tagging: Stage 1
Step 1: Run words through the FST morphological analyzer to get POS info from morphology.
E.g.: Pavlov had shown that salivation …
Pavlov      PAVLOV N NOM SG PROPER
had         HAVE V PAST VFIN SVO
            HAVE PCP2 SVO
shown       SHOW PCP2 SVOO SVO SV
that        ADV
            PRON DEM SG
            DET CENTRAL DEM SG
            CS
salivation  N NOM SG
ENGTWOL Tagging: Stage 2
Step 2: Apply NEGATIVE constraints.
E.g., the adverbial "that" rule:
◦ Eliminate all readings of that except the adverbial one, as in "It isn't that odd."
POS Tagging Approaches
Rule-based tagging
◦ E.g., EngCG ENGTWOL tagger
Stochastic tagging
◦ Probabilistic models, e.g., HMM tagger
Transformation-based tagging
◦ Learned rules (statistical and linguistic)
◦ E.g., Brill tagger
Stochastic POS Tagging
o A model that includes frequency or probability (statistics) can be called stochastic.
o The simplest stochastic taggers apply the following approaches for POS tagging:
o Word Frequency Approach
◦ Uses the probability that a word occurs with a particular tag:
choose the tag encountered most frequently with the word in the training set (see the sketch below).
o Tag Sequence Probabilities
◦ Calculates the probability of a given sequence of tags occurring.
◦ Called the n-gram approach because the best tag for a given word is determined
by the probability with which it occurs with the n previous tags.
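A minimal sketch of the word-frequency (most-frequent-tag) baseline; the tiny hand-made "training corpus" here is invented for illustration and stands in for a real tagged corpus.

from collections import Counter, defaultdict

# Tiny stand-in for a tagged training corpus: (word, tag) pairs.
training = [
    ("the", "DT"), ("race", "NN"), ("is", "VBZ"), ("on", "IN"),
    ("they", "PRP"), ("race", "VB"), ("to", "TO"), ("race", "NN"),
]

counts = defaultdict(Counter)
for word, tag in training:
    counts[word][tag] += 1

def most_frequent_tag(word, default="NN"):
    # Pick the tag seen most often with this word in training;
    # unseen words fall back to a default tag.
    if word in counts:
        return counts[word].most_common(1)[0][0]
    return default

print(most_frequent_tag("race"))   # 'NN' (seen twice vs 'VB' once)
print(most_frequent_tag("flies"))  # 'NN' (unseen word -> default)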
Properties of Stochastic POS Tagging
Stochastic POS taggers possess the following properties:
o Tagging is based on the probability of a tag occurring.
o It requires a training corpus.
o There is no probability for words that do not appear in the training corpus.
o It uses a testing corpus that is different from the training corpus.
o The simplest form chooses the most frequent tag associated with a word in the training corpus.
Stochastic tagging
– We call the tags hidden because they are not observed.
• A Hidden Markov Model (HMM) allows us to talk about both observed events (like the
words that we see in the input) and hidden events (like the part-of-speech tags) that we
think of as causal factors in our probabilistic model.
• E.g., P(John bit the apple) = ? What is the most likely hidden tag sequence behind this
observed sentence?
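Below is a minimal Viterbi decoding sketch for a bigram HMM tagger over that example; every probability in it is a made-up toy number, not an estimate from a real corpus.

import math

# Toy HMM: made-up probabilities for illustration only.
tags = ["NN", "VB", "DT"]
start_p = {"NN": 0.3, "VB": 0.2, "DT": 0.5}
trans_p = {  # P(tag_i | tag_{i-1})
    "NN": {"NN": 0.2, "VB": 0.6, "DT": 0.2},
    "VB": {"NN": 0.3, "VB": 0.1, "DT": 0.6},
    "DT": {"NN": 0.8, "VB": 0.1, "DT": 0.1},
}
emit_p = {   # P(word | tag)
    "NN": {"john": 0.3, "apple": 0.4, "bit": 0.1},
    "VB": {"bit": 0.7, "apple": 0.05, "john": 0.05},
    "DT": {"the": 0.9},
}

def viterbi(words):
    # V[i][t] = (best log-score of any tag path ending in t at position i, best previous tag)
    V = [{t: (math.log(start_p[t]) + math.log(emit_p[t].get(words[0], 1e-6)), None)
          for t in tags}]
    for w in words[1:]:
        row = {}
        for t in tags:
            best_prev = max(tags, key=lambda p: V[-1][p][0] + math.log(trans_p[p][t]))
            score = (V[-1][best_prev][0] + math.log(trans_p[best_prev][t])
                     + math.log(emit_p[t].get(w, 1e-6)))
            row[t] = (score, best_prev)
        V.append(row)
    # Backtrace the best tag sequence.
    best = max(tags, key=lambda t: V[-1][t][0])
    path = [best]
    for row in reversed(V[1:]):
        path.append(row[path[-1]][1])
    return list(reversed(path))

print(viterbi("john bit the apple".split()))  # -> ['NN', 'VB', 'DT', 'NN'] with these toy numbers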
Entropy
• Event: X
• Probability: P(X)
• Surprise: log(1/P(X)). If the probability is high, the surprise is low.
• Entropy: expected surprise. For an obvious event, entropy will be low; and for rare events?
Entropy
• Entropy (or self-information) is the average uncertainty of a single random variable:
H(X) = - Σ_x p(x) log2 p(x)
(i) H(X) = 0 only when the value of X is determinate, hence providing no new information.
(ii) H(X) ≥ 0, and H(X) > 0 when the value of X is not deterministic, hence providing new information.
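A minimal numeric check of this definition, with made-up example distributions:

import math

def entropy(probs):
    # H(X) = -sum p(x) * log2 p(x); terms with p = 0 contribute nothing.
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([1.0]))                     # 0 bits: determinate outcome, no surprise (may print as -0.0)
print(entropy([0.5, 0.5]))                # 1.0 bit: fair coin
print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0 bits: uniform over 4 outcomes
print(entropy([0.9, 0.1]))                # ~0.47 bits: skewed, less uncertain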
Entropy: high at first and last word
Aoccdrnig to rseearch at an Elingsh uinervtisy, it deosn't
mttaer in waht oredr the ltteers in a wrod are, the olny
iprmoatnt tihng is that the frist and lsat ltteer is at the rghit
pclae. The rset can be a toatl mses and you can sitll raed it
wouthit a porbelm. Tihs is bcuseae we do not raed ervey lteter
by it slef but the wrod as a wlohe.
(Figure panels: 0% Removed, 10% Removed, 20% Removed, 30% Removed, 40% Removed.)
Maximum Entropy Model
Limitations of HMM
• Limited context (only the previous tag)
• Cannot use morphological clues (suffix, capitalization)
• Trouble handling unknown words (no emission probabilities)
• No additional heterogeneous observations/features used
With no constraints, the most uniform distribution:
P(Noun) = 1/4, P(Verb) = 1/4, P(Adjective) = 1/4, P(Adverb) = 1/4
Constraints:
P(Verb) + P(Noun) = 3/4
P(Noun) + P(Adjective) + P(Adverb) + P(Verb) = 1
Answer (the maximum-entropy distribution satisfying the constraints):
P(Verb) = 3/8, P(Noun) = 3/8, P(Adjective) = 1/8, P(Adverb) = 1/8
(moving away from the uniform distribution)
Adding Constraints
❑ Brings the distribution further from the uniform distribution
❑ Raises the maximum likelihood of the data
❑ Lowers the maximum entropy (a small numeric check follows below)
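A small worked check of the example above, reusing the numbers from the slide:

import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

uniform     = [1/4, 1/4, 1/4, 1/4]   # no constraints: maximum entropy is 2 bits
constrained = [3/8, 3/8, 1/8, 1/8]   # max-entropy solution under P(Verb) + P(Noun) = 3/4

print(entropy(uniform))      # 2.0
print(entropy(constrained))  # ~1.81: adding the constraint lowered the maximum entropy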
Entropy
• Event: X; Probability: P(X); Surprise: log(1/P(X)). If the probability is high, the surprise is low.
• Entropy is the expected surprise: H(P) = E_P[ log2 (1/P(X)) ] = - Σ_x P(x) log2 P(x)
• For an obvious event, entropy will be low; and for rare events?
Generative vs. Discriminative
Generative:
◦ Assigns a joint probability to paired observation and label sequences: P(X, Y)
◦ P(S, O) = ∏ P(Si | Si-1) · P(Oi | Si)
◦ Assumes features are independent
Discriminative:
◦ Assigns a conditional probability to paired observation and label sequences: P(Y | X)
◦ P(S | O) = ∏ P(Si | Si-1, Oi)
◦ No longer assumes that features are independent
Conditional Random Fields (CRFs)
o Discriminative model
o (Figure: a linear chain of labels connected to their observations.)
o Transition functions add associations between transitions from one label to another.
o State functions determine the identity of the state based on the input.
o CRFs are based on the idea of Markov Random Fields
◦ Modelled as an undirected graph connecting labels with observations
Conditional Random Fields
• Each feature gets a learned weight, e.g., a positive weight value means a strong feature for this state.
• E.g., if the input word is a full stop, then the POS tag is <e>: a strong state feature for that input word and output tag.
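A minimal sketch of the kinds of state and transition features such a model would weight; the feature names and the example sentence are invented for illustration and do not follow any particular CRF library's API.

# Illustrative CRF-style features for one position in a sentence.
# In a real CRF each feature gets a learned weight; positive weights make
# the (label, feature) pairing more likely, negative weights less likely.

def state_features(words, i, label):
    w = words[i]
    return {
        f"word={w.lower()}&tag={label}": 1.0,
        f"suffix3={w[-3:]}&tag={label}": 1.0,
        f"is_capitalized={w[0].isupper()}&tag={label}": 1.0,
        f"is_fullstop={w == '.'}&tag={label}": 1.0,
    }

def transition_feature(prev_label, label):
    return {f"prev={prev_label}&tag={label}": 1.0}

words = ["She", "promised", "."]
print(state_features(words, 2, "<e>"))   # the full-stop feature fires for the end tag
print(transition_feature("VBD", "<e>"))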
POS Tagging Approaches
Rule-based tagging
◦ E.g., EngCG ENGTWOL tagger
Transformation-based tagging
◦ E.g., Brill tagger
Transformation-Based (Brill) Tagging
Also known as Brill tagging.
Combines rule-based and stochastic tagging:
◦ Like the rule-based approach, rules are used to specify tags in a certain environment.
◦ Like the stochastic approach, a tagged corpus is used to find the most likely tags.
◦ Before the rules are applied, the tagger labels every word with its most likely tag.
◦ Rules are learned from data.
Transformation-Based Tagging: Example
Example: He is expected to race tomorrow
• First, label every word with its most-likely tag.
E.g., for occurrences of race in the corpus: P(NN|race) = .98, P(VB|race) = .02
– he/PRP is/VBZ expected/VBN to/TO race/NN tomorrow/NN
• After selecting the most-likely tags, we apply transformation rules:
– Change NN to VB when the previous tag is TO
– This rule converts race/NN into race/VB:
– he/PRP is/VBZ expected/VBN to/TO race/VB tomorrow/NN
• This may not work for every case:
– "… according to race", where race really is a noun.
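A minimal sketch of applying that single transformation to an initially most-likely-tagged sentence (toy data, with the rule from the slide):

# Start: every word carries its most-likely tag from the corpus.
tagged = [("he", "PRP"), ("is", "VBZ"), ("expected", "VBN"),
          ("to", "TO"), ("race", "NN"), ("tomorrow", "NN")]

def apply_rule(tagged, from_tag, to_tag, when_prev_tag):
    # "Change from_tag to to_tag when the previous tag is when_prev_tag."
    out = list(tagged)
    for i in range(1, len(out)):
        if out[i][1] == from_tag and out[i - 1][1] == when_prev_tag:
            out[i] = (out[i][0], to_tag)
    return out

print(apply_rule(tagged, "NN", "VB", "TO"))
# [..., ('to', 'TO'), ('race', 'VB'), ('tomorrow', 'NN')]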
TBL Tagging Algorithm
How Are Transformation Rules Learned?
We assume that we have a tagged corpus.
• The Brill tagger algorithm has three major steps:
1. Tag the corpus with the most likely tag for each word (unigram model).
2. Choose a transformation that deterministically replaces an existing tag with a new tag
such that the resulting tagged training corpus has the lowest error rate.
3. Apply the transformation to the training corpus and add the rule to the end of the rule set.
These steps are repeated until a stopping criterion is reached (see the sketch below).
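A compressed, hypothetical sketch of that greedy loop over a single rule template ("change tag a to tag b when the previous tag is z") and a one-sentence toy corpus; a real implementation would use many templates and a full corpus.

def apply_rule(tagged, a, b, z):
    # "Change tag a to tag b when the previous tag is z."
    out = list(tagged)
    for i in range(1, len(out)):
        if out[i][1] == a and out[i - 1][1] == z:
            out[i] = (out[i][0], b)
    return out

def errors(predicted, gold):
    return sum(p[1] != g[1] for p, g in zip(predicted, gold))

def learn_rules(initial, gold, tagset, max_rules=5):
    current, rules = list(initial), []
    for _ in range(max_rules):
        best = None
        # Try every instantiation of the template (a -> b when the previous tag is z).
        for a in tagset:
            for b in tagset:
                for z in tagset:
                    candidate = apply_rule(current, a, b, z)
                    gain = errors(current, gold) - errors(candidate, gold)
                    if best is None or gain > best[0]:
                        best = (gain, (a, b, z), candidate)
        if best[0] <= 0:          # stopping criterion: no rule reduces the error
            break
        rules.append(best[1])
        current = best[2]
    return rules

gold = [("he", "PRP"), ("is", "VBZ"), ("expected", "VBN"),
        ("to", "TO"), ("race", "VB"), ("tomorrow", "NN")]
initial = [("he", "PRP"), ("is", "VBZ"), ("expected", "VBN"),
           ("to", "TO"), ("race", "NN"), ("tomorrow", "NN")]
print(learn_rules(initial, gold, ["PRP", "VBZ", "VBN", "TO", "NN", "VB"]))
# -> [('NN', 'VB', 'TO')], i.e. "change NN to VB when the previous tag is TO"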
Transformation Rules
A transformation rule is selected from a small set of templates,
e.g., "if word-1 is an X and the word is a Y, then change the tag to Z".
Template form: change tag a to tag b when … (conditions on the surrounding words and tags).
TBL Issues
• Problem: could keep applying (new) transformations ad infinitum.
• Problem: rules are learned in an ordered sequence.
• Problem: rules may interact, i.e., rules may make errors that are corrected by later rules.
• More complex problem: tagging multi-part words.
Dan | ...