Reference Resolution: Adam Meyers New York University
Reference Resolution: Adam Meyers New York University
Adam Meyers
New York University
Computational Linguistics
Lecture 8
2011-2012
Outline
• What is Reference Resolution?
• Linguistic Analysis of Coreference
• Coreference Algorithms: Proper Nouns,
Pronouns, Common Nouns
• Evaluation Issues
• Summary
Computational Linguistics
Lecture 8
2011-2012
Reference Resolution
• Reference Resolution:
– Which words/phrases refer to some other word/phrase?
– How are they related?
• Anaphora vs. Cataphora
– Anaphora: an anaphor is a word/phrase that refers back to
another phrase: the antecedent of the anaphor
• Mary thought that she lost her keys.
Computational Linguistics
Lecture 8
2011-2012
Coreference between Proper Nouns (NEs)
• Instances of the same name string in a document usually refers to the same entity
– IBM, IBM, IBM, IBM, … → Entity_IBM
– George Bush, George Bush, … → Entity_GB
• Abbreviations and Nicknames match full name (full name is often first)
– Examples:
• International Business Machines, IBM, Big Blue… → Entity_IBM
• George Bush, George Bush, W, … → Entity_GB
– Abbreviations: mostly rule based (acronyms, subsequences, etc)
• Example: If subsequence is a place name, some types of organizations are OK antecedents
– New York Yankees ←New York, New York Times ←New York
– Nicknames are idiosyncratic – need a lexicon
• Simple rules work, links most informative NPs, results in high 90s, no literature
– Important component of IE systems
• One interesting problem: Name disambiguation
– Distinguishing multiple individuals with the same name
– Usually, a problem across documents
• Exception: George Bush and his son George W were there.
– Abbreviation rules may allow two possible antecedents (and then George said)
Computational Linguistics
Lecture 8
2011-2012
Pronouns in English
• Definite Pronouns: typically refer to specific NPs
– 3rd person personal pronouns
• he, him, his, she, her, hers, it, its, they, them, their, theirs
– 3rd person Reflexive pronouns
• himself, herself, itself, themselves
– each other – reciprocal pronoun, similar to reflexives
– 1st and 2nd person pronouns
• I, me, my, myself, mine, our, ours, ourselves, you, your, yours, yourself
• Dialogues between 2 people; or writer/speaker and audience
• Indefinite Pronouns:
– one – can be used for type coreference
– Other indefinites – no antecedents in text
• something, someone, everything, everyone, ...
Computational Linguistics
Lecture 8
2011-2012
rd
3 Def Prons: NonSyntactic Constraints/Preferences
• Usually have an antecedent
• Gender/number/person agreement (language specific)
– Robert ← he, Robert ← she, Robert ← it, Robert ← they
– IBM ← he, IBM ← she, IBM ← it, IBM ← they
– I ← she, me ← her, you ← they
• Selection Restrictions
– Children have many toys. They love to play.
– Children have many toys. They are always breaking.
• Pragmatics
– Mary yelled at Alice. She interrupted the phone call.
– Mary yelled at Alice. She can be so mean sometimes.
• Others: closer antecedents preferred, repeated NPs are more likely to
be antecedents, etc. (J&M have several more examples)
Computational Linguistics
Lecture 8
2011-2012
Binding Theory Constraints
• An Antecedent of personal pronouns cannot be
“too close” to the pronoun.
• An Antecedent of a reflexive/reciprocal pronoun
cannot be “too far” from the pronoun.
• Definitions of “too close” and “too far”
– Vary from language to language
– Vary among different classes of pronouns/reflexives
– Are defined using different primitive concepts within
different linguistic theories
• Binding Theory Constraints are usually defined in
terms of syntactic configurations
Computational Linguistics
Lecture 8
2011-2012
Binding Theory for English 3rd Pers Prons
• Case 1: If the pronoun p is inside an NP premodified by a
possessive, the antecedent needs to be outside of this NP
– John likes Mary's drawing of him
– John likes his drawing of Mary
• Case 2: Otherwise, the antecedent must be outside the immediate
tensed clause containing the personal pronoun.
– John said that he liked pizza.
– John wanted for him to like pizza.
– John liked him.
• Theories of binding vary about how these (and similar) constraints
are encoded, but the differences in the final result (quality of
system output) is minimal. While these 2 rules cover most cases,
there are also some exceptions:
– John always carries a slice of pizza with him.
Computational Linguistics
Lecture 8
2011-2012
Binding Theory for English
Reflexives/Reciprocals
• The antecedent of a reflexive/reciprocal must be the closest
subject or possessive such that:
– The antecedent precedes and “commands” the pronoun
• A commands B if A is the sibling of a phrase that dominates B.
– There is no possessive or subject for phrases in the path in the phrase
structure tree between antecedent and pronoun
• Examples:
– Mary saw herself vs. *Mary said that John would meet herself soon
– Mary's picture of herself vs. *Mary saw John's picture of herself
• These rules covers most cases.
– Exception: Pictures of themselves made the actors nervous.
Computational Linguistics
Lecture 8
2011-2012
Reflexive Pronoun Constraint
Computational Linguistics
Lecture 8
2011-2012
Binding Theory Details are English Specific
• zìjǐ – Chinese reflexive pronoun (example)
– Ambiguous Example from Choi 1997
• Zhangsan renwei Lisi zhidao Wangwu xihuan zìjǐ
Computational Linguistics
Lecture 8
2011-2012
Hobbs Search Algorithm to Find Antecedent of Anaphors
Computational Linguistics
Lecture 8
2011-2012
Hobbs Search Example
Computational Linguistics
Lecture 8
2011-2012
No-Parse Hobbs-like Search
• Only Consider Nouns/NGs satisfying constraints
• Continue searching until antecedent or loop exits
1. Initialize sentence_counter to 0 and search current
sentence from left to right, ending before pronoun.
2. Repeat the following step until an antecedent is found
or sentence_counter reaches the maximum (e.g., 3)
i. Search previous sentence from left to right
ii. Increment sentence_counter by 1
Computational Linguistics
Lecture 8
2011-2012
More Pronoun Coreference Systems
• Lappin and Leass (1994): Hobbs-Search-like procedure,
Morphological filter, Binding Theory, Pleonastic Pronoun
Handler, preferences based on grammatical role hierarchy
(subject > object > ind-object), preference for same grammatical
role, frequency of noun, recency, decision procedure for finding
pronoun coreference
– 4% over Hobbs Search
• Other Systems Using Statistical Weights or Machine Learning
Score a Little Bit Better, e.g., Dagan et. al. (1995) score another
3% better (89% vs. 82%).
Computational Linguistics
Lecture 8
2011-2012
Common Noun Coreference
• Definite Common Nouns
– Poessio and Veira (2000) baseline:
• A common noun phrase NP1with determiner “the” can be coreferential with
a preceding NP2 if:
– NP1 and NP2 have the same head
– And (ignoring determiners) NP1 has a subset of the modifiers of NP2
• There has been very little improvement on this baseline and very
few systems that correctly identify the other cases with any large
degree of accuracy
• Other factors:
– Distance between NP1 and NP2
– Other determiners, modifiers, possessives, etc.
Computational Linguistics
Lecture 8
2011-2012
Why is Common Noun Coreference Difficult?
• Only some common noun phrases are anaphoric
– Definite vs. Generic
• The officers vs. officers vs. an officer
– Limit to the phrases is a conservative decision
• this, that, those, possessives, ... improves recall, lowers precision
• When can a common noun corefer to another noun?
– Limit to identical nouns is a conservative decision
– Other choices improve recall, lower precision
• My experience: a hand-crafted list of matches to NE classes
– Ex: PERSON matches: man, human, person, individual,
woman, .., officer, attorney, ...
– Hurts approximately as much as it helps (paper wasn't accepted
to conference)
Computational Linguistics
Lecture 8
2011-2012
Scoring Coreference 1
• Basics: Precision=System
Correct
Output
Recall=
Correct
Answer Key
F-Score=
1
2
1
+
• Problem: How do you measure number ofprecision correct?
recall
• MUC-6:
– Coreference Chains = Partitions of NPs
– Recall and Precision are based on mismatches (edit distance) between
partitions: numbers of links added and/or subtracted to change incorrect
partitions to correct ones
• Given 7 NPs in a system output chain: A1, A2,A3,A4,A5,B1,B2 such that:
– The sets {A1, A2,A3,A4,A5 } and {B1,B2 } belong to separate chains in the Answer Key
– The system output contains: 5 correct links and 1 incorrect link
» Precision = 5/6 = 83%
– The system has 5 links and there is one link missing
» Recall = 5/5 = 100%
– F-Score = 91%
Computational Linguistics
Lecture 8
2011-2012
Scoring Coreference 2
• B-Cubed (Bagga and Baldwin 1998)
– Precision calculated for each system chain (and averaged)
• Given 7 NPs in a system output chain: A1, A2,A3,A4,A5,B1,B2 such that:
– The sets {A1, A2,A3,A4,A5 } and {B1,B2 } belong to separate chains in the
Answer Key
– The precision calculated for each item in chain and averaged:
– 5 X 5 2 X 2 ∗ 1 ≈.59
7 7 7
Computational Linguistics
Lecture 8
2011-2012
Homework #8 – Slide 1
• Write a Proposal for Your Final Project
– There are 3 types of proposals: system, annotation, paper
• Proposals for System Projects Should Include:
– A General description of the type of system you want to work
on (relation detection, using corpus X, and technique Y)
– A Preliminary List of References (at least 3 references)
– A Description of a Baseline System: an extremely simple
system that is likely to produce some result, even if it is not an
exciting result.
– A Description of at least one more interesting experiment, set of
features, etc.
Computational Linguistics
Lecture 8
2011-2012
Homework # 8 – Slide 2
• Proposals for Annotation Projects Should Include:
– An initial description of the phenomena you would like to
annotate.
– A preliminary list of references (at least 3)
– A small sample corpus (25-50 sentences) with some sample
annotation
– The name of your other annotator (hint: 2 students in the class
can be each other's annotators)
• Proposals for Research Papers Should Include:
– A proposed introductory paragraph (this can be changed for
the final version)
– An outline
– A preliminary list of references (at least 10)
Computational Linguistics
Lecture 8
2011-2012