14-LexicalSemantics
Martin Rajman
&
Jean-Cédric Chappelier
Overview
• Basic concepts
• Semantic relations
• Resources for Lexical Semantics: Wordnet
• Applications of Lexical Semantics
Basic concepts
Lexical Semantics vs.
Compositional Semantics
Compositional Semantics
• Compositional Semantics is the study of the meaning of complex
linguistic units such as sentences, paragraphs, or documents
Hands-on…
Reading tests
• Consider the following text:
Usual representations
• Symbolic representations:
➢various formal logics: the meaning is expressed as a logical formula that can
then be manipulated through various inferential mechanisms;
➢various graph-based representations: the meaning is expressed as a graph
that can then be manipulated through various graph transformations;
• Vectorial representations:
➢typically approaches based on “distributional semantics” (e.g. word
embeddings): the meaning is represented as a vector in a (usually
high-dimensional) vector space and can then be manipulated through
vector-based operations (e.g. weighted sums, projections, etc.)
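The vector-based operations mentioned above can be illustrated with a minimal sketch. The three-dimensional vectors are invented toy values, not real embeddings, and the function names are hypothetical; the sketch only shows a weighted sum and a cosine similarity.

```python
import math

# Toy 3-dimensional "embeddings" (invented values, for illustration only).
vectors = {
    "mouse": [0.9, 0.1, 0.3],
    "rat": [0.8, 0.2, 0.1],
    "keyboard": [0.1, 0.9, 0.4],
}

def weighted_sum(weighted_words):
    """Combine word vectors as a weighted sum (a crude phrase representation)."""
    dim = len(next(iter(vectors.values())))
    result = [0.0] * dim
    for word, weight in weighted_words:
        for i, value in enumerate(vectors[word]):
            result[i] += weight * value
    return result

def cosine(u, v):
    """Cosine similarity between two vectors of the same dimension."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# A "phrase vector" built from two word vectors with equal weights.
phrase = weighted_sum([("mouse", 0.5), ("keyboard", 0.5)])

# With these toy values, "mouse" is closer to "rat" than to "keyboard".
print(cosine(vectors["mouse"], vectors["rat"]))
print(cosine(vectors["mouse"], vectors["keyboard"]))
```

Such operations are cheap to compute at scale, which is the point made on the next slide, but nothing in a weighted sum captures word order or logical structure.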
Usual representations (2)
• Currently, only vectorial representations can be deployed at a large
scale because:
➢it is extremely difficult (if not impossible) to guarantee the consistency of
large sets of logical propositions derived from textual input, which often
makes the inferential mechanisms very hard to use;
➢there is not yet a consensus either on which graph-based representations
(semantic nets? conceptual graphs? ...) are the most suitable for expressing
the meaning of linguistic entities, or on which operations should be
applied to these representations;
• ... but the vector-based operations currently available seem to be
too simplistic to suitably mimic the transformations that are
required to manipulate linguistic meaning.
Intermediate conclusion
• Large-scale Compositional Semantics is still out of reach,
and
• This lecture will therefore restrict itself to a simpler form of semantics, the
semantics of individual words, i.e. Lexical Semantics
Lexical Semantics
• Lexical Semantics is the study of the meaning of words (i.e. of the
simplest linguistic units)
• A standard resource for exploring lexical semantics, for human
readers, is the dictionary (not to be confused with an encyclopedia,
which is not concerned with word meanings but with
comprehensive information about subjects/topics/fields from the real
world)
Note: In this course, a dictionary (especially when tailored for some
automated processing) will also often be called a lexicon
Word sense
• A word sense can be represented, for
example, as:
– A definition in natural language
– A definition based on its relationships (e.g. “is a”,
“has a”) to other word senses
– A set of synonyms (“synset”)
• Dictionaries usually define word
senses/meanings … However, different
dictionaries often use different definitions
(different content and/or different granularity)
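The three representations listed above can coexist in one record. The sketch below is a hypothetical data structure (the class and field names are assumptions, not part of any lexicon format), filled with the "computer mouse" sense that appears later in these slides.

```python
from dataclasses import dataclass, field

@dataclass
class WordSense:
    sense_id: str
    # 1) definition in natural language
    gloss: str = ""
    # 2) relationships to other word senses: relation name -> target senses
    relations: dict = field(default_factory=dict)
    # 3) set of synonyms ("synset")
    synset: frozenset = frozenset()

mouse_device = WordSense(
    sense_id="mouse_2",
    gloss=("An input device that is moved over a pad or other flat surface "
           "to produce a corresponding movement of a pointer on a graphical display."),
    relations={"is_a": ["device"]},
    synset=frozenset({"mouse", "computer mouse"}),
)

print(mouse_device.relations["is_a"])
print(sorted(mouse_device.synset))
```

Different dictionaries would fill such a record differently, in content and in granularity, which is the caveat stated in the last bullet.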
(a bit) more formally: Lexemes
• Lexeme:
▪ An individual entry in a lexicon/dictionary
▪ A pairing of a particular orthographic and
phonological form with some meaning representation
   Orthographic form   Phonological form   Meaning
1. bass                [beys]              adj. low in pitch; a bass instrument
2. bass                [bas]               n. (…) freshwater or marine fishes (…)
3. wood                [wood]              n. (…) substance of a tree (…)
4. would               [wood]              v. A pt. and pp. of WILL
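As a minimal sketch, the lexeme table above can be stored as (orthographic form, phonological form, meaning) triples. The names below are hypothetical, the meanings are abbreviated, and the phonological forms are the simplified respellings from the table. Grouping the entries by one of the two forms reveals which lexemes share a spelling and which share a pronunciation.

```python
from collections import defaultdict
from typing import NamedTuple

class Lexeme(NamedTuple):
    orthographic: str
    phonological: str
    meaning: str

lexicon = [
    Lexeme("bass", "beys", "adj. low in pitch; a bass instrument"),
    Lexeme("bass", "bas", "n. freshwater or marine fishes"),
    Lexeme("wood", "wood", "n. substance of a tree"),
    Lexeme("would", "wood", "v. A pt. and pp. of WILL"),
]

def shared_forms(lexemes, key):
    """Group lexemes by the given form and keep forms shared by several lexemes."""
    groups = defaultdict(list)
    for lexeme in lexemes:
        groups[getattr(lexeme, key)].append(lexeme)
    return {form: entries for form, entries in groups.items() if len(entries) > 1}

same_spelling = shared_forms(lexicon, "orthographic")  # the two "bass" lexemes
same_sound = shared_forms(lexicon, "phonological")     # "wood" and "would"

print(list(same_spelling))
print([lexeme.orthographic for lexeme in same_sound["wood"]])
```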
Dictionary definitions
• Propose a definition for the word “bee”...
Hands-on…
“A flying insect, of the superfamily Apoidea, known for its organised societies
and for collecting pollen and (in some species) producing wax and honey.”
Word sense definition in Natural Language
Semantic Relations
Many different types of relations
• Relations characterizing word meanings:
▪ Homonymy / Homophony / Homography
▪ Polysemy
▪ Synonymy
Polysemy
• A relation that holds between multiple related
meanings within a single lexeme
Orthographic form   Meaning
Crown               1. Headgear worn by a monarch
                    2. The highest part of anything, e.g. a tree
                    3. The part of a tooth that is covered by enamel
                    …
Homonymy vs. Polysemy
• Both homonyms and polysemes are spelled and pronounced the same but
...
• homonyms have different etymologies and usually correspond to two
distinct entries in a lexicon, while polysemes share the same etymology and
correspond to different meanings of the same lexicon entry
Example:
➢“bat” (the flying mammal) comes from a dialectal variant of the Middle
English “bakke”, while “bat” (the wooden club) comes from the Old English
“batt”
but
➢“crown” (the headgear) and “crown” (the highest part) both come from
the Anglo-Norman “coroune”
Sources of polysemy
• Metaphor
– “Germany will pull Slovenia out of its economic
slump”
– “I spent 2 hours on that homework”
• Metonymy
– “The White House announced yesterday”
– “This chapter talks about part-of-speech tagging”
– Bank (building) and bank (financial institution)
Synonymy
• Two words are synonymous if they have the
same sense
• Criteria for synonymy:
– They have the same value for all their semantic
features
– They map to the same concept
– They satisfy the Leibniz substitution principle
• The substitution of one for the other never changes the truth
value of a sentence in which the substitution is made
• Example of non-synonyms:
• Tony is the big brother
• Tony is the large brother
Hyponymy/Hypernymy
A hyponym is a word whose meaning contains
the entire meaning of another, known as the
superordinate or hypernym.
[Figure: hypernym nodes “animal” and “device”, linked to their hyponyms by is_a_kind_of arcs]
Defining word senses with semantic relations
• A standard way of defining word senses with semantic relations is
to follow the Aristotelian principle of “Genus-Differentia”:
➢Genus: each word meaning is first associated with a hypernym through a
“hyponymy/hypernymy” relation (this is equivalent to defining the superclass
associated with a given class in an object-oriented model)
➢Differentia: each word meaning is then uniquely differentiated from the other
hyponyms of its hypernym by additional relations (e.g. Meronymy/Holonymy)
associating it with other word meanings
• Of course, to make this type of approach realistic on a large scale,
more than two semantic relations are required!
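The Genus-Differentia principle above can be sketched as a small check: two senses are told apart only if they differ in genus or in differentia. The sense inventory, the relation names and the function name below are illustrative assumptions, not entries from a real lexicon.

```python
# Each sense has a genus (its hypernym) and a differentia
# (a set of additional (relation, target) pairs). Illustrative data only.
senses = {
    "mouse_1": {"genus": "rodent", "differentia": {("meronym_of", "Mus")}},
    "rat_1": {"genus": "rodent", "differentia": {("meronym_of", "Rattus")}},
    "mouse_2": {"genus": "device", "differentia": {("used_for", "pointing")}},
}

def ambiguous_pairs(inventory):
    """Return pairs of senses sharing both genus and differentia."""
    seen = {}
    clashes = []
    for sense_id, entry in inventory.items():
        key = (entry["genus"], frozenset(entry["differentia"]))
        if key in seen:
            clashes.append((seen[key], sense_id))
        else:
            seen[key] = sense_id
    return clashes

# With genus and differentia, every sense is uniquely defined.
print(ambiguous_pairs(senses))

# With the genus alone, the two rodents can no longer be distinguished.
genus_only = {
    sense_id: {"genus": entry["genus"], "differentia": set()}
    for sense_id, entry in senses.items()
}
print(ambiguous_pairs(genus_only))
```

The second call shows why the genus alone is not enough: the hypernym separates the animal from the device, but not one rodent from another.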
Hands-on…
https://commons.wikimedia.org/w/index.php?curid=28335
2. An input device that is moved over a pad or other flat surface to produce a
corresponding movement of a pointer on a graphical display.
Hands-on…
Let us go further!...
• The definitions based on semantic relations given so far are good enough
for distinguishing the meanings of various polysemic words, but they do not
make it possible to distinguish between the hyponyms of a given hypernym!...
[Figure: semantic graph with the hypernym nodes “device” and “rodent” and hyponym arcs to the word senses mouse_1, mouse_2, rat_1 and rat_2]
Hands-on…
[Figure: semantic graph over the nodes rodent, genus, mouse_1, rat_1, Mus and Rattus, connected by hyponym and meronym arcs]
Intermediate conclusion (2)
• In a relation-based approach to Lexical Semantics, the word meanings are
defined as the nodes of a directed graph, the arcs of which correspond to
various semantic relations
• The targeted semantic graph is built with the main purpose of correctly
differentiating the various meanings of the words (which is one of the
primary objectives of Lexical Semantics)
• Most often, pure lexical semantic models are not sophisticated enough to
be fully adequate for more advanced exploitations (such as the automated
generation of the answers to the questions asked in the simple reading
test given at the beginning of the lecture)
• For such advanced applications, lexical semantic models will have to be
embedded in more complex ones providing some (possibly limited)
semantic representation for linguistic units larger than words
(Compositional Semantics)
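The directed graph described in the first bullet can be sketched as a list of labelled arcs. The nodes, the relation label and the helper names are illustrative assumptions; a real resource would hold many more relation types.

```python
# Arcs of a small semantic graph: (source sense, relation, target sense).
arcs = [
    ("mouse_1", "hypernym", "rodent"),
    ("rat_1", "hypernym", "rodent"),
    ("rodent", "hypernym", "animal"),
    ("mouse_2", "hypernym", "device"),
]

def targets(node, relation):
    """All nodes reached from `node` by an arc labelled `relation`."""
    return [t for s, r, t in arcs if s == node and r == relation]

def hypernym_chain(node):
    """Follow hypernym arcs upward.

    Assumes at most one hypernym per node and no cycles.
    """
    chain = []
    current = targets(node, "hypernym")
    while current:
        chain.append(current[0])
        current = targets(current[0], "hypernym")
    return chain

print(hypernym_chain("mouse_1"))
print(hypernym_chain("mouse_2"))
```

The two chains differ from the first step on, which is how the graph differentiates the two meanings of "mouse".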
Resources for
Lexical Semantics
WordNet
http://wordnetweb.princeton.edu/perl/webwn
WordNet Search - 3.1
Display Options:
Key: "S:" = Show Synset (semantic) relations, "W:" = Show Word (lexical) relations
Display options for sense: (gloss) "an example sentence"
Display options for word: word#sense number
Noun
S: (n) mouse#1 (any of numerous small rodents typically resembling diminutive rats
having pointed snouts and small ears on elongated bodies with slender usually
hairless tails)
S: (n) shiner#1, black eye#1, mouse#2 (a swollen bruise caused by a blow to the
eye)
S: (n) mouse#3 (person who is quiet or timid)
S: (n) mouse#4, computer mouse#1 (a hand-operated electronic device that
controls the coordinates of a cursor on your computer screen as you move it around
on a pad; on the bottom of the device is a ball that rolls on the surface of the pad) "a
mouse takes much more room than a trackball"
Verb
S: (v) sneak#1, mouse#1, creep#2, pussyfoot#1 (to go stealthily or furtively) "..stead
of sneaking around spying on the neighbor's house"
S: (v) mouse#2 (manipulate the mouse of a computer)
WordNet Search - 3.1
Noun
S: (n) mouse#1
S: (n) shiner#1, black eye#1, mouse#2
S: (n) mouse#3
S: (n) mouse#4, computer mouse#1
Verb
Synsets
• Hypothesis: A synonym is often sufficient to identify
a sense.
• Example
– “board” means 1) piece of lumber 2) group of people
assembled for some reason
– Sense 1: {board, plank} Sense 2: {board, committee}
(Note that this is true for English which is rich in synonyms but
may not be true for all languages…)
• Nouns
– Organised as topical hierarchies with lexical
inheritance (hyponymy/hypernymy and
meronymy/holonymy).
• Verbs
– Organised by a variety of entailment relations
• Adjectives
– Organised on the basis of bipolar opposition
(antonymy relations)
• Adverbs
– Like adjectives
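The synset hypothesis stated earlier for “board” can be sketched with plain sets. The sense identifiers and function names are hypothetical; only the two synsets come from the slide.

```python
# The two synsets given for "board" on the Synsets slide.
synsets = {
    "board_1": {"board", "plank"},
    "board_2": {"board", "committee"},
}

def senses_of(word):
    """All senses whose synset contains the word."""
    return sorted(s for s, words in synsets.items() if word in words)

def identify_sense(word, synonym):
    """Senses of `word` whose synset also contains `synonym`."""
    return [s for s in senses_of(word) if synonym in synsets[s]]

print(senses_of("board"))
print(identify_sense("board", "plank"))
print(identify_sense("board", "committee"))
```

Here one synonym is enough to pick out each sense; as the slide notes, this relies on the language having enough synonyms.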
Application of lexical semantics in
language engineering
Lexical semantics in Speech
Processing
Lexical semantics for Spelling Error
Correction
• In some cases a spelling error can result in a real
word in the lexicon and therefore cannot be detected
by a conventional spell checker
• Examples:
➢ It is my sincere hole [hope] that you will recover soon
➢ The boss [toss] of the coin
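A minimal sketch of why these errors escape a conventional checker: the erroneous word is itself in the lexicon, so a membership test accepts it. The tiny lexicon and the function names are assumptions made for illustration. The sketch only generates the real-word alternatives; choosing between them requires semantic knowledge of the context, which is not modelled here.

```python
import string

# Tiny illustrative lexicon; a real one would be far larger.
lexicon = {"hole", "hope", "boss", "toss", "coin", "sincere"}

def is_flagged(word):
    """A conventional checker only flags words missing from the lexicon."""
    return word not in lexicon

def confusables(word):
    """Lexicon words that differ from `word` by exactly one substituted letter."""
    found = set()
    for i in range(len(word)):
        for letter in string.ascii_lowercase:
            candidate = word[:i] + letter + word[i + 1:]
            if candidate != word and candidate in lexicon:
                found.add(candidate)
    return sorted(found)

print(is_flagged("hole"))   # the error is a real word, so it is not flagged
print(confusables("hole"))
print(confusables("boss"))
```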
Lexical semantics in Information
Retrieval
• Semantic indexing
– Indexing word senses instead of words
– Improves
• Recall by handling synonymy
• Precision by handling homonymy and polysemy
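The effect of semantic indexing on recall and precision can be sketched with two inverted indexes over the same toy collection. The documents, the sense identifiers and the assumption that a word sense disambiguation step has already labelled every token are all hypothetical.

```python
from collections import defaultdict

# Hypothetical output of a word sense disambiguation step:
# each document is a list of (word, sense id) pairs.
documents = {
    "d1": [("plank", "board_1"), ("wood", "wood_1")],
    "d2": [("board", "board_2"), ("meeting", "meeting_1")],
    "d3": [("board", "board_1"), ("nail", "nail_1")],
}

def build_index(docs, use_senses):
    """Inverted index keyed by word senses (semantic) or by words (classical)."""
    index = defaultdict(set)
    for doc_id, tokens in docs.items():
        for word, sense in tokens:
            index[sense if use_senses else word].add(doc_id)
    return index

word_index = build_index(documents, use_senses=False)
sense_index = build_index(documents, use_senses=True)

# Query: "board" in the "piece of lumber" sense.
print(sorted(word_index["board"]))     # misses d1 (synonym), returns d2 (other sense)
print(sorted(sense_index["board_1"]))  # finds d1 and d3, excludes d2
```

The word index loses recall on the synonym "plank" and precision on the committee sense; the sense index avoids both, provided the disambiguation step is reliable.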
Lexical semantics for Information
Retrieval
• Indexing schemes
b) Indexing with a semantic ontology: each indexing
term is extended with all its hypernym senses
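Scheme b) can be sketched as follows, assuming a toy taxonomy in which each sense has at most one direct hypernym. The taxonomy, the sense identifiers and the function names are illustrative assumptions.

```python
# Illustrative hypernym links (sense -> direct hypernym sense).
hypernym_of = {
    "mouse_1": "rodent_1",
    "rat_1": "rodent_1",
    "rodent_1": "animal_1",
}

def expand(sense):
    """Return the sense followed by all its hypernyms, nearest first."""
    expanded = [sense]
    while expanded[-1] in hypernym_of:
        expanded.append(hypernym_of[expanded[-1]])
    return expanded

def index_terms(senses):
    """Indexing terms of a document: every sense plus all its hypernym senses."""
    terms = []
    for sense in senses:
        for term in expand(sense):
            if term not in terms:
                terms.append(term)
    return terms

print(expand("mouse_1"))
print(index_terms(["mouse_1", "rat_1"]))
```

With this expansion, a document that only mentions rats also becomes retrievable by a query on "rodent_1" or "animal_1".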
c) Synset (or hypernym synset) indexing: each
indexing term is replaced with its hypernym synset
d) Minimum Redundancy Cut (MRC) indexing: each
indexing term is replaced with its dominating
semantic concept defined by MRC
Key points