Unit-5-NLP (1)

The document discusses coreference resolution in NLP, which identifies linguistic expressions that refer to the same entity, and its applications in tasks like text understanding and sentiment analysis. It also covers concepts like anaphora, cataphora, discourse analysis, and semantic role labeling, along with algorithms for discourse segmentation and word sense disambiguation. Additionally, it highlights challenges in word sense disambiguation and its relevance in various NLP fields.

Uploaded by

dyagalavarshith

UNIT-V

Coreference Resolution:

Coreference resolution is an NLP task that tries to find all linguistic expressions in a given text that refer to the same real-world entity. It works as follows.

Suppose you have to find the pronouns in a sentence and replace them with the relevant nouns. Coreference resolution can do that: it finds and groups the words that refer to the same entities and replaces pronouns with noun phrases.

I gave my book to Abdul because he said that he wants to write the assignment.

Coreference resolution is used in a variety of NLP tasks, such as:

• Text understanding

• Document summarization

• Information extraction

• Sentiment analysis
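The find-group-and-substitute step can be sketched in plain Python. This is only a toy illustration: the mention cluster below is written out by hand, whereas a real resolver would predict it from the text.

```python
def replace_pronouns(tokens, antecedents):
    """antecedents maps a token index (a pronoun) to its antecedent string."""
    return " ".join(antecedents.get(i, tok) for i, tok in enumerate(tokens))

tokens = "I gave my book to Abdul because he said that he wants to write the assignment".split()
# Indices 7 and 10 are the two occurrences of "he"; both corefer with "Abdul".
antecedents = {7: "Abdul", 10: "Abdul"}
print(replace_pronouns(tokens, antecedents))
# I gave my book to Abdul because Abdul said that Abdul wants to write the assignment
```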

Different types of References:

Anaphora and cataphora

Anaphora: the use of an expression whose interpretation depends on another expression earlier in the context (its antecedent).

The movie was horrible, so it couldn't be enjoyed. (Here "it" refers back to the movie.)


Cataphora: when the referring expression points forward to its referent, it is cataphora.

Despite his difficulty, Rama went ahead to help him.

Coreference vs. Anaphora:

Not all anaphoric relations are coreference.

We went to see a movie last night. The tickets were expensive.

The relation is anaphoric ("the tickets" depends on "a movie" for its interpretation) but not coreference, since the two expressions do not refer to the same entity.


Text Coherence: As we have previously discussed, coherent discourse in NLP aims to find the coherence relations among the discourse text.

Hobbs's algorithm is one of several approaches to pronoun resolution. The algorithm is based mainly on the syntactic parse trees of the sentences.

Consider two sentences:

Sentence 1(S1): Jack is an engineer.

Sentence 2 (S2): Jill likes him.

So here, we have the syntactic parse trees of the two sentences.

The algorithm starts with the target pronoun and walks up the parse tree to the root node 'S'. For each noun phrase or 'S' node that it finds, it does a breadth-first, left-to-right search of the node's children to the left of the target. So in our example, the algorithm starts with the parse tree of sentence 2 and climbs up to the root node S2. Then it does a breadth-first search to find a noun phrase (NP). Here the algorithm finds its first noun phrase at the noun 'Jill'. [Source: https://medium.com/analytics-vidhya/hobbs-algorithm-pronoun-resolution-7620aa1af538]

Binding theory states that a reflexive can refer to the subject of the most immediate clause in which it appears, whereas a non-reflexive cannot co-refer with this subject. Words such as himself, herself, themselves, etc. are known as reflexives.

Let’s understand this with an example.

• John bought himself a new car.

Here, himself refers to John. Whereas if the sentence is

• John bought him a new car.



So according to the binding theory constraint, ‘him’ in our example will not refer to Jill.

Hence the algorithm now starts the search in the syntax tree of the previous sentence. The subject Jack in the sentence "Jack is an engineer" is explored before the object "engineer", and finally Jack is the resolved referent for the pronoun "him".
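The search just described can be sketched as a simplified, toy version of the Hobbs walk (not the full nine-step algorithm): the parse trees are hand-built nested tuples, and the binding constraint is passed in as a set of blocked candidates rather than computed from the tree.

```python
from collections import deque

# Toy parse trees: a node is (label, children); leaves are plain strings.

def leaves(node):
    if isinstance(node, str):
        return [node]
    return [w for child in node[1] for w in leaves(child)]

def np_phrases(tree):
    """Collect NP phrases via a breadth-first, left-to-right walk."""
    found, queue = [], deque([tree])
    while queue:
        node = queue.popleft()
        if isinstance(node, str):
            continue
        label, children = node
        if label == "NP":
            found.append(" ".join(leaves(node)))
        queue.extend(children)
    return found

def resolve(pronoun, current_tree, previous_trees, blocked):
    """Search the current sentence first, then earlier sentences,
    skipping candidates ruled out by the binding constraint."""
    for cand in np_phrases(current_tree):
        if cand != pronoun and cand not in blocked:
            return cand
    for tree in reversed(previous_trees):
        for cand in np_phrases(tree):
            if cand not in blocked:
                return cand
    return None

s1 = ("S", [("NP", ["Jack"]),
            ("VP", [("V", ["is"]), ("NP", [("DT", ["an"]), ("NN", ["engineer"])])])])
s2 = ("S", [("NP", ["Jill"]),
            ("VP", [("V", ["likes"]), ("NP", ["him"])])])

# "him" is non-reflexive, so binding theory blocks the subject "Jill".
print(resolve("him", s2, [s1], blocked={"Jill"}))  # Jack
```

Note how the breadth-first walk visits the subject NP before any NP inside the verb phrase, which is why "Jack" is preferred over "engineer" in the previous sentence.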

Discourse:

When we are dealing with Natural Language Processing, the provided language consists of structured, collective, and consistent groups of sentences, which are termed discourse in NLP. Discourse analysis is extracting meaning out of a corpus or text. Discourse analysis is very important in Natural Language Processing and helps train NLP models better.

Coherence in terms of discourse in NLP means making sense of the words, i.e., making meaningful connections and correlations. There is a close connection between coherence and discourse structure. A coherence relation tells us that there is some sort of connection between the words.

Cohesion: a close relationship, based on grammar or meaning, between two parts of a sentence.
Discourse Structure:
So far, we have discussed discourse and coherence, but we have not discussed the
structure of the discourse in NLP. Let us now look at the structure that discourse in NLP
must have. Now, the structure of the discourse depends on the type of segmentation
applied to the discourse.
Algorithms for Discourse Segmentation
We have different algorithms for unsupervised discourse segmentation and supervised discourse segmentation.
Unsupervised Discourse Segmentation
The class of unsupervised segmentation is also termed linear segmentation. Unsupervised discourse segmentation can be performed with the help of lexical cohesion. Lexical cohesion indicates a relationship among similar units, for example, synonyms.
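The lexical-cohesion idea behind linear segmentation can be sketched as a crude, TextTiling-style segmenter. The overlap measure and the threshold value below are arbitrary illustrative choices, not part of any standard algorithm.

```python
def cohesion(a, b):
    """Lexical cohesion between two sentences: normalized word overlap."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(1, min(len(wa), len(wb)))

def segment(sentences, threshold=0.1):
    """Place a boundary wherever cohesion between neighbours dips below threshold."""
    boundaries = []
    for i in range(len(sentences) - 1):
        if cohesion(sentences[i], sentences[i + 1]) < threshold:
            boundaries.append(i + 1)
    return boundaries

docs = [
    "the bank approved the loan",
    "the loan rate at the bank was low",
    "rivers flood their banks in spring",
]
print(segment(docs))  # [2] -> topic shift before the third sentence
```

A real linear segmenter would use stemming and smoothed similarity over sliding windows, but the principle is the same: boundaries fall where lexical cohesion dips.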
Supervised Discourse Segmentation
In supervised discourse segmentation, by contrast, we deal with a training data set having labeled boundaries.
Text Coherence
As we have previously discussed, the coherent discourse in NLP aims to find the
coherence relation among the discourse text.
Coherence Relations:
Suppose we have two kinds of related sentences, namely: S0 and S1.
Result
We infer that the state asserted by the first statement, S0, causes the state asserted by the second statement, S1. For example: Rahul is late. He will be punished.
In the above example, the first statement, S0 (Rahul is late), has caused the second statement, S1 (He will be punished).
Explanation
Conversely, we infer that the second statement, S1, gives the cause or explanation of the first statement, S0. For example: Rahul fought with his friend. He was drunk.
Parallel
By parallel, we mean that the assertion from statement S0, i.e., p(a1, a2, ...), and the assertion from statement S1, i.e., p(b1, b2, ...), are such that ai and bi are similar for all values of i.
In simpler terms, the sentences are parallel. For example: He wants food. She wants money. The two statements are parallel, as there is a sense of wanting in both sentences.
Elaboration (Additional Information)
Elaboration means that the same proposition P can be inferred from both assertions, S0 and S1. For example: Rahul is from Delhi. Rohan is from Mumbai.
Occasion
The occasion relation holds when a change of state can be inferred from the first assertion, S0, whose final state can be inferred from S1, and vice versa. For example: Rahul took the money. He gave it to Rohan.
Let us consider the following phrases and number them serially.
• S1: Rahul went to the bank to deposit money.
• S2: He then went to Rohan's shop.
• S3: He wanted a phone.
• S4: He did not have a phone.
• S5: He also wanted to buy a laptop from Rohan's shop.
Now the entire discourse can be represented using a hierarchical discourse structure.


Word Senses and WordNet
Word sense: A sense (or word sense) is a discrete representation of one aspect of the
meaning of a word.


WordNet:
• A large online thesaurus: a database that represents word senses, with versions in many languages.
• It also represents relations between senses.
• We represent each sense with a superscript:
mouse1 and mouse2, bank1 and bank2
mouse1: ... a mouse controlling a computer system in 1968 ...
mouse2: ... a quiet animal like a mouse ...
bank1: ... a bank can hold the investments in a custodial account ...
bank2: ... as agriculture burgeons on the east bank, the river ...
Here are the glosses for two senses of bank:
1. financial institution that accepts deposits and channels the money into
lending activities
2. sloping land (especially the slope beside a body of water).
Thesauruses:
➢ They define a sense through its relationship with other senses.

Semantic Role Labeling:


Semantic role labeling (SRL) is a natural language processing (NLP) technique that
involves identifying the syntactic and semantic roles of words in a sentence.
SRL involves identifying the roles of words in a sentence, such as the subject, object, and
verb. The subject is the entity that performs the action, while the object is the entity that
receives the action. The verb is the action itself. For example, in the sentence “John ate the
apple,” “John” is the subject, “ate” is the verb, and “the apple” is the object.
Sometimes, the subject and object are not explicitly stated in the sentence, making it
difficult to determine their roles. In addition, the same word can have different roles
depending on the context in which it is used.
One popular approach to SRL is the FrameNet approach, which uses a database of frames,
which are structured representations of common situations and events, and their associated
semantic roles.
FrameNet maps argument structure for frames, which are evoked by a lexical unit
Another approach to SRL is the PropBank approach, which uses a database of syntactic
frames, which are templates that describe the structure of a sentence and the roles of its
constituent words.
PropBank is an annotation of syntactically parsed, or treebanked, structures with 'predicate-argument' structures.

[ARG0 John] broke [ARG1 the window]


[ARG1 The window] broke
As this example shows, the arguments of the verbs are labeled as numbered arguments:
Arg0, Arg1, Arg2 and so on.
The second task of the PropBank annotation involves assigning functional tags to all
modifiers of the verb, such as manner (MNR), locative (LOC), temporal (TMP) and
others:
Mr. Bush met him privately, in the White House, on Thursday.
Rel: met
Arg0: Mr. Bush
Arg1: him


ArgM-MNR: privately
ArgM-LOC: in the White House
ArgM-TMP: on Thursday.
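The labeled arguments above can be held in a plain data structure so that downstream code can query roles by label; the dictionary layout here is just one convenient, hypothetical choice, not a standard PropBank API.

```python
# The PropBank annotation above, held as a dictionary keyed by role label.
frame = {
    "rel": "met",
    "ARG0": "Mr. Bush",
    "ARG1": "him",
    "ARGM-MNR": "privately",
    "ARGM-LOC": "in the White House",
    "ARGM-TMP": "on Thursday",
}

def role(frame, label):
    """Return the filler of a given role, or None if absent."""
    return frame.get(label)

print(role(frame, "ARG0"))  # Mr. Bush
```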

Semantic Roles:

Semantic/thematic roles: verbs describe events or states ('eventualities'):

Tom broke the window with a rock.
The window broke.
The window was broken by Tom / by a rock.

Thematic roles refer to the participants of these events:
• Agent (who performed the action): Tom
• Patient (whom the action was performed on): the window
• Tool/Instrument (what was used to perform the action): the rock
Word-sense disambiguation in NLP

Natural Language Processing (NLP) is a branch of artificial intelligence that studies the ability of computers to interpret and "understand" human language. Word-sense disambiguation is an open problem in computational linguistics concerned with identifying which sense of a word is used in a sentence.
Dictionary-based or knowledge-based methods: as the name suggests, these methods primarily rely on dictionaries for disambiguation.

Lesk’s Algorithm: A simple method for word-sense disambiguation:



The Lesk definition, on which the Lesk algorithm is based, is to "measure overlap between sense definitions for all words in context".
Perhaps one of the earliest and still most commonly used methods for word-sense disambiguation today is Lesk's algorithm, proposed by Michael E. Lesk in 1986. Lesk's algorithm is based on the idea that words that appear together in text are somehow related, and that this relationship and the corresponding context can be extracted through the definitions of the words of interest as well as of the other words used around them.
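A simplified Lesk can be written directly from this idea: score each sense by the word overlap between its gloss and the context. The tiny sense inventory and stopword list below are hand-made for illustration, reusing the two bank glosses given earlier.

```python
# Minimal sense inventory, following the two 'bank' glosses above.
GLOSSES = {
    "bank1": "financial institution that accepts deposits and channels the money into lending activities",
    "bank2": "sloping land especially the slope beside a body of water",
}

STOPWORDS = {"the", "a", "an", "of", "into", "that", "and", "especially", "beside"}

def tokens(text):
    return {w for w in text.lower().split() if w not in STOPWORDS}

def simplified_lesk(word, context):
    """Pick the sense whose gloss overlaps most with the context words."""
    ctx = tokens(context)
    best, best_overlap = None, -1
    for sense, gloss in GLOSSES.items():
        overlap = len(ctx & tokens(gloss))
        if overlap > best_overlap:
            best, best_overlap = sense, overlap
    return best

print(simplified_lesk("bank", "I deposited money at the bank and discussed lending"))
# bank1  (gloss shares "money" and "lending" with the context)
```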

Supervised Methods:

For disambiguation, machine learning methods make use of sense-annotated corpora for training. These methods assume that the context alone can provide enough evidence to disambiguate the sense.

The context is represented as a set of "features" of the words, including information about the surrounding words. Support vector machines and memory-based learning are the most successful supervised learning approaches to WSD.
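The feature representation mentioned here can be sketched as a bag-of-words window around the target occurrence. The window size and feature-name format are arbitrary illustrative choices; a real system would feed such features to a classifier such as an SVM.

```python
def context_features(tokens, i, window=2):
    """Bag-of-words features from a +/- window around the target word."""
    feats = {}
    for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
        if j != i:
            feats[f"w[{j - i}]={tokens[j].lower()}"] = 1
    return feats

sent = "I deposited cash at the bank yesterday".split()
print(context_features(sent, 5))
# {'w[-2]=at': 1, 'w[-1]=the': 1, 'w[1]=yesterday': 1}
```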

Semi-supervised Methods

Due to the lack of training corpora, most word sense disambiguation algorithms use semi-supervised learning methods, because semi-supervised methods use both labelled and unlabelled data. These methods require a very small amount of annotated text and a large amount of plain unannotated text. The technique used by semi-supervised methods is bootstrapping from seed data.

What are the challenges in word sense disambiguation?

WSD faces a lot of challenges and problems.


The most common problem is the difference between various dictionaries or text corpora. Different dictionaries give different meanings for words, which makes the senses of those words be perceived differently. A lot of text information is out there, and often it is not possible to process everything properly.
[Source: https://www.engati.com/glossary/word-sense-disambiguation]

Word sense disambiguation has many applications in various text processing and NLP fields.
➢ WSD can also be used in Text Mining and Information Extraction tasks.

➢ WSD can be used alongside Lexicography.

➢ Similarly, WSD can be used for Information Retrieval purposes.
