Speech and Language Processing

This document provides an overview of speech and language processing. It discusses the different levels of language analysis required for a system like HAL to understand and generate speech, including phonetics, phonology, morphology, syntax, semantics, pragmatics, and discourse. It describes how ambiguity is resolved at each of these levels through techniques like part-of-speech tagging, word sense disambiguation, and parsing. The document also discusses tokenization, stemming, tagging parts of speech, context-free grammars, WordNet, and semantic tagging systems.

Uploaded by

seogmin chun

Speech and Language Processing

Daniel Jurafsky and James H. Martin,


Prentice Hall, 2000.
Introduction
•  Dave Bowman: Open the pod bay doors, HAL.
•  HAL: I’m sorry Dave, I’m afraid I can’t do that.
•  From the Screenplay of 2001: A Space Odyssey.
•  What would it take to create at least the language-related parts of
HAL?
•  HAL would need to understand humans via speech recognition and natural
language understanding (and, of course, lip-reading), and to communicate
with humans via natural language generation and speech synthesis.
HAL would also need to be able to do information retrieval (finding
out where needed textual resources reside), information extraction
(extracting pertinent facts from those textual resources) and inference
(drawing conclusions based on known facts).
Language Processing Systems
•  Language processing systems range from
mundane applications such as word counting,
through spelling correction and grammar checking,
to cutting-edge applications such as automated
question answering on the web and real-time
spoken language translation.
•  What distinguishes them from other data
processing systems is their use of knowledge of
language. Even the Unix word count program
(wc) has knowledge of what constitutes a word.
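As a minimal sketch of the knowledge of language that wc embodies, the Python below treats a word as a maximal run of non-whitespace characters. The function name count_words is illustrative only and is not part of any tool mentioned above.

```python
def count_words(text: str) -> int:
    """Count words the way wc -w does: runs of non-whitespace characters."""
    # str.split() with no arguments splits on any run of whitespace
    # and discards empty strings, so punctuation stays attached to words.
    return len(text.split())


if __name__ == "__main__":
    print(count_words("Open the pod bay doors, HAL."))
    # The same concept written two ways gives two different counts.
    print(count_words("data base"), count_words("database"))
```

Even this tiny definition makes a linguistic decision: "data base" counts as two words and "database" as one.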
Levels of Language (1)
•  To determine what Dave is saying, HAL must be able to
analyse the incoming audio signal. Similarly HAL must be
able to generate an audio signal that Dave can understand.
These tasks require knowledge of phonetics (how words
are pronounced in terms of individual speech units called
phones, listed in the international phonetic alphabet) and
phonology (the systematic way that sounds are differently
realised in different environments, e.g. cat, cook).
•  HAL is capable of producing contractions like I’m and
can’t. Producing and recognizing these and other variations
of individual words (e.g. recognising that doors is plural)
requires knowledge of morphology.
Levels of Language (2)
•  HAL has knowledge of syntax, rules for the combination of words. He
knows that the sequence I’m I do, sorry that afraid Dave I’m can’t will
not make sense to Dave, even though it contains exactly the same
words as the original. He has knowledge of lexical semantics (the
meanings of words, e.g. the difference between door and window, open
and shut). Compare the distinction between syntax and semantics with
that between compile-time and run-time errors in a computer program.
•  Next, despite its bad behaviour, HAL knows enough to be polite to
Dave, embellishing his responses with I’m sorry and I’m afraid. The
appropriate use of polite and indirect language comes under
pragmatics. HAL’s correct use of the word that in its response to
Dave’s request provides structure in their conversation, which requires
knowledge of discourse conventions.
•  See “The scope of linguistics”, p14 of “Teach Yourself Linguistics” by
Jean Aitchison.
To summarize, the knowledge of language needed to engage
in complex behaviour can be separated into six distinct
categories:

•  Phonetics and Phonology - the study of linguistic sounds


•  Morphology - the study of the meaningful components of
words
•  Syntax - the study of the structural relationships between
words
•  Semantics - the study of meaning
•  Pragmatics - the study of how language is used to
accomplish goals
•  Discourse - the study of linguistic units larger than a single
utterance
•  Most or all tasks in speech and language processing can be
viewed as resolving ambiguity at one of these six levels.
How many different meanings can you think of for the
sentence I made her duck?
Resolving Ambiguity
•  Duck can be a verb or noun, while her can be a dative pronoun or a
possessive pronoun. Make can mean create or cook or compel. It can
also be transitive, taking a single direct object (I cooked waterfowl
belonging to her) or ditransitive, taking two objects, meaning that the
first object (her) was made into the second object (duck). In a spoken
sentence, there would be another kind of ambiguity. What is it?
•  How do we resolve or disambiguate these ambiguities?
•  Deciding whether duck is a verb or a noun can be solved by part-of-speech
tagging.
•  Deciding whether make means create, cook or compel can be solved by
word sense disambiguation.
•  Deciding whether make is transitive or ditransitive is an example of
syntactic disambiguation and can be addressed by probabilistic
parsing.
Tokenisation and Stemming
•  Tokenisation, e.g.:
•  Data base, database, data-base (splitting at
the word level)
•  3.21, Dr. Mr. (splitting at the sentence level)
•  Stemming Rules (see Paice’s rules)
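The two tokenisation problems above can be sketched in Python as follows. This is an illustration under stated assumptions, not a production tokeniser: the abbreviation list, the regular expression and the function names split_sentences and tokenize_words are all hypothetical.

```python
import re

# Assumed, deliberately tiny list of abbreviations that end in a full stop.
ABBREVIATIONS = {"Dr.", "Mr."}

# Decimal numbers first, then (possibly hyphenated) words, then punctuation.
WORD_RE = re.compile(r"\d+\.\d+|\w+(?:-\w+)*|[^\w\s]")


def split_sentences(text):
    """Split at ., ? or ! unless the token is a known abbreviation."""
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        if token.endswith((".", "?", "!")) and token not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:  # trailing text with no final punctuation
        sentences.append(" ".join(current))
    return sentences


def tokenize_words(text):
    """Split into word tokens, keeping 3.21 and data-base in one piece."""
    return WORD_RE.findall(text)


if __name__ == "__main__":
    print(split_sentences("Dr. Smith paid 3.21 pounds. Mr. Jones left."))
    print(tokenize_words("data-base costs 3.21."))
```

The full stop inside 3.21 never triggers a split because the token does not end in a full stop, while Dr. and Mr. are protected by the abbreviation list.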
Stemming rules
•  Rules for the reduction of different
grammatical forms of a word to a common
canonical form.
•  E.g. terror, terrorism, terrorist, terrorists →
terror.
•  The most popular set of rules is Porter’s
rules.
•  Look at the handout of Paice’s rules.
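To make the idea of a stemming rule concrete, here is a toy suffix-stripping stemmer in Python. It is only a sketch: the suffix list and the minimum stem length are assumptions chosen to cover the terror example, and they are not Porter’s or Paice’s actual rules.

```python
# Assumed suffix list, longest first so "ists" is tried before "s".
SUFFIXES = ["ists", "isms", "ist", "ism", "s"]


def stem(word, min_stem=3):
    """Strip the first matching suffix, keeping at least min_stem letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


if __name__ == "__main__":
    for w in ["terror", "terrorism", "terrorist", "terrorists"]:
        print(w, "->", stem(w))
```

Real rule sets add conditions on the remaining stem to limit over-stemming, which this sketch handles only through the crude min_stem check.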
Word Classes and Part-of-Speech Tagging
•  There is no definitive list of word classes, but the C7 tagset, for example, has 146 tags (Garside et al., 1997).
•  Two broad supercategories: closed class and open class.
•  Main open classes are nouns (cat, Daniel), verbs (walk), adjectives (green) and
adverbs (slowly).
•  Main closed classes are:
•  Prepositions: on, under, over, near, by, at, from, to, with
•  Determiners: a, an, the
•  Pronouns: she, who, I, others
•  Conjunctions: and, but, or, as, if, when
•  Auxiliary verbs: can, may, should, are
•  Particles: up, down, on, off, in, out, at, by
•  Numerals: one, two, three, first, second, third
•  Tagsets for English, e.g. Penn Treebank
•  The/DT grand/JJ jury/NN commented/VBD on/IN a/DT number/NN of/IN other/JJ
topics/NNS ./.
•  Exercise: manual CLAWS tagger with disambiguation.
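The word/TAG format of the Penn Treebank example above is easy to read programmatically. The following Python sketch splits each token at its last slash; the function name parse_tagged is illustrative.

```python
def parse_tagged(text):
    """Turn 'word/TAG word/TAG ...' into a list of (word, tag) pairs."""
    pairs = []
    for token in text.split():
        # rpartition splits at the LAST slash, so './.' gives ('.', '.').
        word, _, tag = token.rpartition("/")
        pairs.append((word, tag))
    return pairs


SENTENCE = (
    "The/DT grand/JJ jury/NN commented/VBD on/IN a/DT number/NN "
    "of/IN other/JJ topics/NNS ./."
)

if __name__ == "__main__":
    pairs = parse_tagged(SENTENCE)
    # Penn Treebank noun tags all begin with NN.
    print([word for word, tag in pairs if tag.startswith("NN")])
```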
Context-free grammars for English
•  S à NP + VP
•  NP à DET + NOUN
•  VP à VERB + NP
•  DET à the
•  NOUN à man | book
•  VERB à took
CFGs are also called Phrase-Structure Grammars
•  They consist of a set of rules or productions, each of
which expresses the ways that symbols of the language can
be grouped together, and a lexicon of words or symbols.
The symbols that correspond to words in the surface form
of the language are called terminal symbols.
•  The CFG may be thought of in two ways: as a device for
generating sentences (top-down parsing), or as a device
for assigning a structure to a given sentence (bottom-up
parsing). It is sometimes convenient to represent a parse
tree in bracketed notation (e.g. the Penn Treebank).
•  [S [NP [DET the] [NOUN man] ] [VP [VERB took] [NP
[DET the] [NOUN book] ] ] ]
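Viewed as a device for assigning structure, the same toy grammar can drive a small parser that prints bracketed notation. The Python below is a sketch of a naive top-down, backtracking parser. The tree encoding and the names parse, bracket and parse_sentence are assumptions, and the method would loop forever on a left-recursive grammar.

```python
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["DET", "NOUN"]],
    "VP": [["VERB", "NP"]],
    "DET": [["the"]],
    "NOUN": [["man"], ["book"]],
    "VERB": [["took"]],
}


def parse(symbol, words, pos):
    """Yield (tree, next_pos) for each way symbol matches words[pos:]."""
    if symbol not in GRAMMAR:  # terminal: must equal the next word
        if pos < len(words) and words[pos] == symbol:
            yield symbol, pos + 1
        return
    for rhs in GRAMMAR[symbol]:
        states = [([], pos)]  # partial analyses: (children so far, position)
        for child in rhs:
            states = [
                (children + [tree], nxt)
                for children, p in states
                for tree, nxt in parse(child, words, p)
            ]
        for children, end in states:
            yield (symbol, children), end


def bracket(tree):
    """Render a tree as [LABEL child child ...]."""
    if isinstance(tree, str):
        return tree
    label, children = tree
    return "[" + label + " " + " ".join(bracket(c) for c in children) + "]"


def parse_sentence(sentence):
    """Return the bracketed parse, or None if the sentence is not in the language."""
    words = sentence.split()
    for tree, end in parse("S", words, 0):
        if end == len(words):  # the parse must consume every word
            return bracket(tree)
    return None


if __name__ == "__main__":
    print(parse_sentence("the man took the book"))
    print(parse_sentence("man the took book"))
```

Apart from spacing before the closing brackets, the first result matches the bracketed tree shown above; the scrambled word order has no parse.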
SVO triples
•  A parser can show the existence of useful
subject-verb-object triples, showing two
related entities and the relation between
them.
•  aspirin (treats) headache
•  HBOS (takes-over) Halifax
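Once a parse tree is available, a subject-verb-object triple can be read off its structure. The Python sketch below does this for the toy tree used earlier. The nested-tuple tree encoding and the helper names find, head_word and svo are assumptions for illustration; real parsers produce richer trees and need more careful head finding.

```python
# (label, children) tuples; leaves are plain strings.
TREE = ("S", [
    ("NP", [("DET", ["the"]), ("NOUN", ["man"])]),
    ("VP", [
        ("VERB", ["took"]),
        ("NP", [("DET", ["the"]), ("NOUN", ["book"])]),
    ]),
])


def find(tree, label):
    """Return the first direct child of tree with the given label."""
    for child in tree[1]:
        if not isinstance(child, str) and child[0] == label:
            return child
    return None


def head_word(np):
    """Take the NOUN inside a noun phrase as its head."""
    return find(np, "NOUN")[1][0]


def svo(tree):
    """Subject = NP under S, verb and object = VERB and NP under VP."""
    subject = find(tree, "NP")
    vp = find(tree, "VP")
    verb = find(vp, "VERB")[1][0]
    obj = find(vp, "NP")
    return head_word(subject), verb, head_word(obj)


if __name__ == "__main__":
    print(svo(TREE))
```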
Word Sense Disambiguation (1)
•  The WordNet thesaurus lists the range of
different senses a word can have, and also the
range of relations between related word senses:
•  hypernym, e.g. breakfast → meal (noun)
•  hyponym, e.g. meal → lunch
•  has-member e.g. faculty → professor
•  member-of e.g. copilot → crew
•  has-part e.g. table → leg
•  part-of e.g. course → meal
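The relations above come in inverse pairs (hypernym and hyponym, member-of and has-member, part-of and has-part). The Python sketch below stores one direction and derives the other. It uses only the example words from this slide in a hypothetical dictionary; it is not the WordNet API, and real WordNet relates word senses (synsets) rather than word strings.

```python
from collections import defaultdict

# One direction of each relation, using the slide's example words.
RELATIONS = {
    "hypernym": {"breakfast": ["meal"], "lunch": ["meal"]},
    "member-of": {"copilot": ["crew"], "professor": ["faculty"]},
    "part-of": {"course": ["meal"], "leg": ["table"]},
}

# Name of the inverse of each stored relation.
INVERSE_NAME = {
    "hypernym": "hyponym",
    "member-of": "has-member",
    "part-of": "has-part",
}


def invert(mapping):
    """Reverse a word -> [targets] mapping into target -> [words]."""
    inverse = defaultdict(list)
    for word, targets in mapping.items():
        for target in targets:
            inverse[target].append(word)
    return {target: sorted(words) for target, words in inverse.items()}


def related(word, relation):
    """Look up a relation, deriving inverse relations on demand."""
    if relation in RELATIONS:
        return RELATIONS[relation].get(word, [])
    for name, inverse_name in INVERSE_NAME.items():
        if inverse_name == relation:
            return invert(RELATIONS[name]).get(word, [])
    raise KeyError(relation)


if __name__ == "__main__":
    print(related("breakfast", "hypernym"))
    print(related("meal", "hyponym"))
    print(related("table", "has-part"))
```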
The DDF Thesaurus
•  PENICILLANATE h.t. ANTIBIOTICS
•  Penicillanate-sulfone use SULBACTAM
•  PENICILLATE h.t. ANTIBIOTICS
•  Penicillin use BENZYL-PENICILLIN
Word Sense Disambiguation (2)
•  hypernym e.g. fly → travel (verb)
•  troponym e.g. walk → stroll
•  entails e.g. snore → sleep
•  antonym e.g. increase → decrease
•  antonym e.g. heavy → light (adjective)
•  antonym e.g. quickly → slowly (adverb)
•  synsets e.g. {chump, fish, fool, gull, mark, patsy,
fall guy, sucker, schlemiel, soft touch, mug }
The ACAMRIT semantic tagger
•  The SEMTAG semantic tagset was originally
loosely based on the categories found in the
Longman Lexicon of Contemporary English
(McArthur, 1981).
•  The categories are arranged in a hierarchy, with 21
major discourse fields denoted by an upper case
letter (such as E for “emotional actions, states and
processes”), then divided and sometimes even
subdivided again.
The ACAMRIT semantic tagger
•  This is shown using numeric components of the semantic codes
such as 4.1.
•  Antonyms are identified using the symbols + and -. Thus
“happy” is normally tagged E4.1+, and “sad” is normally tagged
as E4.1-.
•  Comparatives are shown with ++ or --, and superlatives with
+++ or ---.
•  In some cases, a word type can only represent one possible
category. Often, however, a word such as “spring” can have a
number of different meanings, each requiring a different
semantic tag. In such cases, disambiguation is achieved using
six types of additional evidence, as follows:
Additional Evidence for WSD
•  The POS tag assigned by CLAWS. For example, if “spring” is a verb,
we know it must mean “jump”.
•  The general likelihood of a word taking a particular meaning, as found
in certain frequency dictionaries.
•  Idiom lists are kept. If an entire idiomatic phrase is found in the text
being analysed, it is assumed that the idiomatic meaning of each word
in the phrase is more likely than individual interpretations of the
words.
•  The domain of discourse can be an indicator. For example, if the topic
of discussion is footwear, then “boot” is unlikely to refer to the boot of
a car.
•  Special rules have been developed for the auxiliary verbs “be” and
“have”.
•  Proximity disambiguation. Are any collocates of the word, suggesting
a particular interpretation, found in the immediate vicinity?
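A few of these evidence types can be combined in a simple cascade. The Python below is a hedged sketch only: the sense inventory, frequency ranks and collocate sets for "spring" are invented for illustration, and the order in which the evidence is applied is an assumption, not the tagger’s documented procedure.

```python
# Hypothetical sense inventory; rank 1 = most frequent sense overall.
SENSES = {
    "spring": [
        {"sense": "jump", "pos": "verb", "rank": 3, "collocates": set()},
        {"sense": "season", "pos": "noun", "rank": 1,
         "collocates": {"summer", "winter"}},
        {"sense": "coil", "pos": "noun", "rank": 2,
         "collocates": {"metal", "mattress"}},
    ]
}


def disambiguate(word, pos, context):
    """Pick a sense using POS, then nearby collocates, then frequency."""
    # 1. POS evidence: keep only senses compatible with the tagger's POS.
    candidates = [s for s in SENSES[word] if s["pos"] == pos]
    if not candidates:  # unknown POS: fall back to every sense
        candidates = SENSES[word]
    if len(candidates) == 1:
        return candidates[0]["sense"]
    # 2. Proximity evidence: prefer senses with a collocate in the context.
    nearby = set(context)
    hits = [s for s in candidates if s["collocates"] & nearby]
    if hits:
        candidates = hits
    # 3. General likelihood: choose the most frequent remaining sense.
    return min(candidates, key=lambda s: s["rank"])["sense"]


if __name__ == "__main__":
    print(disambiguate("spring", "verb", []))
    print(disambiguate("spring", "noun", ["the", "mattress"]))
    print(disambiguate("spring", "noun", ["in", "the"]))
```

Idiom lists, domain of discourse and the special auxiliary-verb rules are left out of this sketch.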
ACAMRIT uses SEMTAG tags
•  The full set of SEMTAG semantic tags can
be found at http://www.comp.lancs.ac.uk/ucrel/acamrit/setags.txt.
Some ACAMRIT codes
•  G1 Government, Politics and elections
•  G1.1 Government etc.
•  G1.2 Politics
•  G2 Crime, law and order
•  G2.1 Crime, law and order: Law and order
•  G2.2 General ethics
•  G3 Warfare, defence and the army; weapons
•  H1 Architecture and kinds of houses and buildings
•  H2 Parts of buildings
•  H3 Areas around or near houses
•  H4 Residence
•  H5 Furniture and household fittings
•  I1 Money generally
•  I1.1 Money: Affluence
•  I1.2 Money: Debts
•  I1.3 Money: Price
•  I2 Business
•  I2.1 Business: Generally
•  I2.2 Business: Selling
•  I3 Work and employment
•  I3.1 Work and employment: Generally
•  I3.2 Work and employment: Professionalism
•  I4 Industry
Text tagged with Part Of Speech and Semantic Code
I_PPIS1_Z8mf went_VVD_M1[i3.2.1 down_RP_M1[i3.2.2
yesterday_RT_T1.1.1 to_II_Z5 the_AT_Z5 Peiraeus_NP1_Z99
with_IW_Z5 Glaucon_NP1_Z99 ,_,_PUNC the_AT_Z5
son_NN1_S4m of_IO_Z5 Ariston_NP1_Z99 ,_,_PUNC
to_TO_Z5 pay_VVI_I1.2 my_APPGE_Z8 devotions_NN2_Z99
to_II_Z5 the_AT_Z5 Goddess_NN1_S9/S2.1f ,_,_PUNC
and_CC_Z5 also_RR_N5++ because_CS_Z5 I_PPIS1_Z8mf
wished_VVD_X7+ to_TO_Z5 see_VVI_X3.4 how_RRQ_Z5
they_PPHS2_Z8mfn would_VM_A7+ conduct_VVI_A1.1.1
the_AT_Z5 festival_NN1_K1/S1.1.3+ since_CS_Z5
this_DD1_Z8 was_VBDZ_A3+ its_APPGE_Z8
inauguration_NN1_Z99 ._._PUNC
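Each token in the tagged text above has the form word_POS_SEMTAG, and the semantic tags follow the scheme described earlier: an upper case field letter, numeric subdivisions, and optional + or - signs. The Python sketch below splits tokens and decodes the leading part of a semantic tag. The regular expression and function names are assumptions, and trailing material such as gender markers, slashes or idiom brackets is simply ignored here.

```python
import re

# Field letter, dotted numeric subdivisions, then optional + or - signs.
SEM_RE = re.compile(r"([A-Z])(\d+(?:\.\d+)*)([+-]*)")


def parse_token(token):
    """Split 'word_POS_SEM' from the right, so ',_,_PUNC' works too."""
    word, pos, sem = token.rsplit("_", 2)
    return word, pos, sem


def decode_semtag(sem):
    """Decode the leading field, subdivisions and polarity of a tag."""
    match = SEM_RE.match(sem)
    if match is None:  # e.g. PUNC has no field letter plus number
        return None
    field, numbers, polarity = match.groups()
    return {
        "field": field,
        "subdivisions": [int(n) for n in numbers.split(".")],
        "polarity": polarity,
    }


if __name__ == "__main__":
    for token in "wished_VVD_X7+ to_TO_Z5 see_VVI_X3.4".split():
        word, pos, sem = parse_token(token)
        print(word, pos, decode_semtag(sem))
```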
Discourse
•  Up to now, we have focussed mainly on language
phenomena that operate at the word or sentence level. Of
course, language does not normally consist of isolated,
unrelated sentences, but instead of related groups of
sentences. We refer to such a group of sentences as a
discourse (e.g. HCI).
•  Coherence and reference are discourse phenomena:
consider
•  John went to Bill’s car dealership to check out an Acura
Integra. He looked at it for about an hour.
•  Automatic reference resolution depends mainly on
proximity rules and constraints on coreference, e.g.
agreement in gender and number.
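The proximity-plus-agreement idea can be sketched in a few lines of Python. The mention list with its gender and number features is hand-built for the example above, and the names MENTIONS, PRONOUNS and resolve are hypothetical.

```python
# Candidate antecedents in order of appearance: (text, gender, number).
MENTIONS = [
    ("John", "masc", "sg"),
    ("Bill", "masc", "sg"),
    ("car dealership", "neut", "sg"),
    ("Acura Integra", "neut", "sg"),
]

# Agreement features of each pronoun; None means any gender.
PRONOUNS = {
    "he": ("masc", "sg"),
    "she": ("fem", "sg"),
    "it": ("neut", "sg"),
    "they": (None, "pl"),
}


def resolve(pronoun, mentions):
    """Return the nearest preceding mention that agrees with the pronoun."""
    gender, number = PRONOUNS[pronoun.lower()]
    for text, g, n in reversed(mentions):  # most recent first
        if n == number and (gender is None or g == gender):
            return text
    return None


if __name__ == "__main__":
    print(resolve("it", MENTIONS))
    print(resolve("He", MENTIONS))
```

With these features, "it" resolves to the Acura Integra, but pure proximity links "He" to Bill, whereas most readers would choose John. This is one reason practical systems add further preferences, such as favouring grammatical subjects.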
Discoursal Annotation
•  (0) The state Supreme Court has refused to release
{1[2 Rahway State Prison 2] inmate 1} (1 James
Scott 1) on bail. (1 The fighter 1) is serving 30-40
years for a 1975 armed robbery conviction. (1
Scott 1) had asked for freedom while <1 he waits
for an appeal decision. Meanwhile {3 <1 his
promoter 3], {{ 3 Murad Mohammad 3} said
Wednesday <3 he netted only $15,250 for (4 [1
Scott 1] ‘s nationally televised fight against {5
ranking contender 5 } (5 Yaqui Lopez 5) last
Saturday 4).
Dialogue Acts (Bunt) or Conversational Moves (Power)
•  STATEMENT A claim made by the speaker
•  INFO-REQUEST A question by the speaker
•  CHECK A question for confirming information
•  INFLUENCE-ON-ADDRESSEE (= Searle’s directives)
•  OPEN-OPTION A weak suggestion or listing of options
•  ACTION-DIRECTIVE An actual command
•  INFLUENCE-ON-SPEAKER (= Austin’s commissives)
•  OFFER Speaker offers to do something (subject to confirmation)
•  COMMIT Speaker is committed to doing something
•  CONVENTIONAL Other
•  OPENING Greetings
•  CLOSING Farewells
•  THANKING Thanking and responding to thanks.