TOPIC AND FOCUS

STUDIES IN LINGUISTICS AND PHILOSOPHY

VOLUME 82

Managing Editors
GENNARO CHIERCHIA, University of Milan
KAI VON FINTEL, M.I.T., Cambridge
F. JEFFREY PELLETIER, Simon Fraser University

Editorial Board
JOHAN VAN BENTHEM, University of Amsterdam
GREGORY N. CARLSON, University of Rochester
DAVID DOWTY, Ohio State University, Columbus
GERALD GAZDAR, University of Sussex, Brighton
IRENE HEIM, M.I.T., Cambridge
EWAN KLEIN, University of Edinburgh
BILL LADUSAW, University of California, Santa Cruz
TERRENCE PARSONS, University of California, Irvine

The titles published in this series are listed at the end of this volume.
TOPIC AND FOCUS
CROSS-LINGUISTIC PERSPECTIVES ON MEANING
AND INTONATION

edited by

CHUNGMIN LEE
Seoul National University
Seoul, Republic of Korea

MATTHEW GORDON
University of California
Santa Barbara, CA, USA

and
DANIEL BÜRING
University of California
Los Angeles, CA, USA
A C.I.P. Catalogue record for this book is available from the Library of Congress.

ISBN-10 1-4020-4795-9 (HB)

ISBN-13 978-1-4020-4795-4 (HB)
ISBN-10 1-4020-4796-7 (e-book)
ISBN-13 978-1-4020-4796-7 (e-book)

Published by Springer,
P.O. Box 17, 3300 AA Dordrecht, The Netherlands.

www.springer.com

Printed on acid-free paper

All Rights Reserved

© 2007 Springer
No part of this work may be reproduced, stored in a retrieval system, or transmitted
in any form or by any means, electronic, mechanical, photocopying, microfilming, recording
or otherwise, without written permission from the Publisher, with the exception
of any material supplied specifically for the purpose of being entered
and executed on a computer system, for exclusive use by the purchaser of the work.
TABLE OF CONTENTS

Preface

Gorka Elordieta
Constraints on Intonational Prominence of Focalized Constituents
Ardis Eschenberg
Polish Narrow Focus Constructions
David Gil
Intonation and Thematic Roles in Riau Indonesian
Matthew Gordon
The Intonational Realization of Contrastive Focus in Chickasaw
Carlos Gussenhoven
Types of Focus in English
Nancy Hedberg and Juan M. Sosa
The Prosody of Topic and Focus in Spontaneous English Dialogue
Emiel Krahmer and Marc Swerts
Perceiving Focus
Manfred Krifka
The Semantics of Questions and the Focusation of Answers
Chungmin Lee
Contrastive (Predicate) Topic, Intonation, and Scalar Meanings
Kimiko Nakanishi
Prosody and Scope Interpretations of the Topic Marker ‘wa’ in Japanese
Ho-Hsien Pan
Focus and Taiwanese Unchecked Tones
Elisabeth Selkirk
Bengali Intonation Revisited: An Optimality Theoretic Analysis in which
FOCUS Stress Prominence Drives FOCUS Phrasing
Mark Steedman
Information-Structural Semantics for English Intonation
Klaus von Heusinger
Discourse Structure and Intonational Phrasing
PREFACE

During the 2001 Linguistic Summer Institute at University of California, Santa
Barbara, a group of linguists gathered at a workshop to discuss the expression and
role of topicalization and focus from a variety of perspectives: phonetic,
phonological, syntactic, semantic, and pragmatic. The workshop was designed to lay
the groundwork for collaborative efforts between linguists devoted to the study of
meaning and linguists engaged in the quantitative study of intonation.
This volume contains papers emerging from the Santa Barbara Workshop on
Topic and Focus. A wide variety of methodologies and research interests related to
topic and focus are represented in the papers. Some works present results of phonetic
studies, either acoustic or perceptual, on the expression of topic and/or focus; others
examine semantic or pragmatic features of topic and/or focus, while others are
concerned with the interface between intonation and meaning.
Data from several different languages are represented in the papers, including
several languages with relatively little documentation, particularly in the area of
topic and focus, e.g. Basque, Chickasaw, Indonesian, Polish, Taiwanese. The broad
sample of languages, coupled with the wide variety of research topics addressed by
the papers, promises to enrich our typological understanding of topic and focus
phenomena and provide an impetus for further research. The following paragraphs
offer brief summaries of the papers contained in this volume:
Gorka Elordieta’s paper describes prosodic conditions governing focus in a
dialect of Basque with pitch accents. He finds that narrow focus is only
intonationally marked for words carrying a pitch accent. Unaccented words rely on
contextual information rather than intonational cues to signal focus, unlike in
Japanese, in which unaccented words are free to express focus prosodically.
Ardis Eschenberg’s paper explores the role of word order and prosody in the
expression of focus in Polish. She shows that intonation and word order are used in
different capacities depending on the focused element and the type of focus.
Eschenberg discusses the implications her research has for various syntactic and
semantic theories of focus.
David Gil’s paper focuses on the role of intonation in signalling thematic roles in
the Riau dialect of Indonesian, a language with relatively free word order and no
obligatory case or agreement marking. Based on an analysis of data from a
naturalistic corpus of utterances, Gil finds that intonation is not used to cue thematic
roles. Drawing on this result, Gil proposes a model of Indonesian syntax and
semantics lacking traditional morphosyntactic categories.
Matthew Gordon’s paper is a phonetic study of the effect of focus on
fundamental frequency and duration in Chickasaw, a language in which focus is
morphologically marked. Gordon finds considerable variation between speakers in
the use of f0 and duration as correlates of focus, with temporal disjuncture between
elements playing a more important role than f0 in the expression of focus. Based on
these results, Gordon suggests that focus may be marked phonetically even in a
language in which focus has an overt morphological realization.
Carlos Gussenhoven’s work provides an overview of how various types of focus
are expressed syntactically and prosodically. Basing his classification on data from
several languages, Gussenhoven suggests that focus may differ along a number of
pragmatically conditioned dimensions. He finds that different categories of focus are
expressed through different intonational contours, with identificational focus
seeming to occupy a special status in its reliance on morphological as opposed to
prosodic cues.
Nancy Hedberg and Juan Sosa investigate the evidence for a prosodic distinction
between topic accents and focus accents in their paper. In an analysis of naturally
occurring English speech, they do not find any differences in pitch accent type
pointing to separate categories of topic and focus accent. On the other hand, they
find extensive marking of information structure categories with high pitch accents.
In their paper, Emiel Krahmer and Marc Swerts discuss a dialogue reconstructing
experiment designed to examine the role of pitch accents in perceiving focus in two
languages, Dutch and Italian, differing in the importance of pitch accents as a marker
of focus. Krahmer and Swerts find that Dutch listeners rely more on pitch accent
cues to reconstruct focus than Italian listeners, in keeping with the greater role of
pitch in signalling focus in Dutch. Results of an audiovisual experiment employing
talking heads suggest that visual cues can also play a role in the perception of focus,
though primarily when pitch cues are indecisive.
Manfred Krifka’s paper explores the proper semantic treatment of focus patterns
in response to constituent questions. He finds that neither the framework of
Alternative Semantics nor a theory that works with givenness rather than semantic
focus as a basic concept offers an adequate analysis of focus arising in answers to
questions. On the other hand, Krifka argues that the theory of Structured Meaning
provides a superior account of this type of focus.
In his paper, Chungmin Lee characterizes Contrastive Topic and Contrastive
Predicate Topic, particularly in connection with their ‘conventional’ scalar implicatures.
He distinguishes a typical kind that evokes a ‘conventional’ implicature from list
contrastive topics, which lack any implicature. The Contrastive Topic marker in
Korean gets a high tone responsible for focality, analogously to the fall-rise contour
in English. Lee’s paper explores the scalar meaning of type-subtype scalarity and
subtype, arguing for the inherent tendency of subtype scalarity even in entities. It
also explores scope relations between scope bearers and Contrastive Topic and CT’s
narrow-scope nature. The apparent non-narrow-scope of CT is claimed to be a
topicalization effect. Predicates are claimed to be inherently subtype-scalar when
CT-marked just like numerals and quantifiers. In conclusion, the uttered part is a
concessive admission with the intent of conveying a forceful implicature in the
unuttered part.
In her paper, Kimiko Nakanishi examines the prosodic and semantic properties
associated with the Japanese topic marker wa. She shows that the two pragmatic
functions of wa, as a marker of theme and contrast, are distinguished prosodically.
She further claims that the theme vs. contrast distinction is accounted for by an
Alternative Semantics analysis, in which the two functions of wa correspond to
different scope interpretations and pragmatic functions.
Ho-Hsien Pan’s paper explores the influence of focus on fundamental frequency
and duration in Taiwanese, a language with lexical tone. Parallel to languages in
which tone is not used at the lexical level, Pan finds that increased duration and
expanded pitch range are both associated with narrow focus. However, duration
turns out to be a more reliable marker of focus than f0, a result which Pan suggests
may be due to the high functional load of f0 height in distinguishing lexical items in
Taiwanese.
Elisabeth Selkirk’s paper develops an Optimality Theoretic analysis of focus
constituency in Bengali, which is typologically unusual in requiring that focused
elements be delimited on both sides by phonological phrase boundaries. In order to
account for the Bengali focus facts, Selkirk proposes a theory of the prosody-syntax
interface in which a family of focus prominence constraints requires a focused
morphosyntactic structure to contain a phonological prominence within a specified
prosodic constituent. Selkirk shows that a member of this focus prominence
constraint family, working in conjunction with other hierarchically ranked tonal and
prosodic alignment constraints, offers a principled account of the complex tonal
phonology of Bengali.
Mark Steedman’s paper builds on his earlier work to develop a new theory of
intonation structure in which intonational tones are reduced to a small set of
semantically grounded binary oppositions. Steedman’s theory assumes a
distinction between the beliefs that the speaker attributes to the hearer by the literal
meaning of his or her utterance, and those that the hearer is actually committed to.
Steedman shows that this division is crucial in offering an adequate account of
situations in which the speaker and hearer do not mutually believe a proposition that
the speaker assumes is shared.
Klaus von Heusinger’s paper explores the function of intonational phrasing in
discourse, finding that semantics plays an important role in determining prosodic
constituency in discourse. He argues that discourse relations may hold between
relatively small subclausal units, which are defined in terms of their functions as
arguments in discourse. Von Heusinger argues that a version of Segmented
Discourse Representation Theory is equipped to handle the mutual relations holding
between discourse units.
The editors gratefully acknowledge the National Science Foundation’s support of
the Santa Barbara workshop on Topic and Focus through grant BCS-0104212. In
addition, we would like to thank the Linguistic Society of America’s Summer
Institute and the Institute for Social, Behavioral and Economic Research for their
logistical and administrative support of the workshop. Thanks are also extended to
Ed Luna for his editorial assistance in preparing the manuscripts for publication.

Chungmin Lee
Matthew Gordon
Daniel Büring

January 2006
GORKA ELORDIETA

CONSTRAINTS ON INTONATIONAL PROMINENCE
OF FOCALIZED CONSTITUENTS*

1. INTRODUCTION

Across languages, in narrow contrastive focus constructions one or more cues
(morphological, syntactic, intonational) are used by speakers in order to express the
intended meaning correctly, singling out the focalized element or constituent from
the rest of the elements in the sentence. However, in this article I will provide
evidence that in the pitch-accent dialects of Basque classified as Northern Bizkaian
Basque (NBB, Hualde, Elordieta, Gaminde and Smiljanic 2002) narrow focus
expressions may be left unexpressed through these cues. There are cases in which
focalized words cannot be identified on the basis of syntax or intonation alone
(morphology does not play a role as a focus cue in Basque). They may satisfy the
necessary syntactic conditions, but they do not satisfy the necessary conditions
imposed by the intonational grammar of these dialects. There is a constraint on
intonational focalization limiting main intonational prominence to focalized words
that bear a lexical or derived pitch accent, and more radically to words that
constitute a separate intonational unit on their own, an Accentual Phrase (AP). A
word forms an independent AP if it has a H*+L pitch accent and the word to its left
ends an AP.

2. BACKGROUND

It is well known that languages differ in the overt cues they use to make the hearer
identify clearly the focalized constituent. On the one hand, there are languages
which signal focalized elements intonationally, without overt syntactic or
morphological cues. These are languages of the so-called English type, in which
focalized elements receive main prosodic prominence in-situ, with no movement
from their base position.1
Other Germanic languages such as Dutch and German also have this strategy of
English for signaling narrow focus. However, in some cases these languages may
also resort to syntactic movement operations to cue focus. When the verb is the
focus of the sentence and a definite object is used, scrambling of the object may take
place so that the verb is interpreted as narrow focus (Reinhart and Neeleman 1998).
The verb receives main prosodic prominence by virtue of being in clause-final
position.2

C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 1–22.
© 2007 Springer.

Some languages display two kinds of strategies for narrow focus manifestation:
one in which words are assigned main prominence in their base-generated syntactic
position (a strategy of the English-type), and another one in which syntactic
displacement operations are produced such that these words or constituents end up
occupying a syntactically specified position for narrow focus, by means of
scrambling or fronting, or some other means. Unlike Dutch or German, the latter
option is available for all constituents and is not subject to definiteness constraints,
and perhaps most importantly, in these languages focalized words which are
syntactically displaced are also assigned main prosodic prominence in the sentence
(cf. Bolinger 1954, Ladd 1980, Culicover and Rochemont 1983, Vallduví 1990,
Cinque 1993, Reinhart 1995, Selkirk 1995, Zubizarreta 1998, Frota 1998, among
others). Spanish and Italian constitute examples of this type of language. But in
Spanish focalized words can also occur in non-in-situ positions, such as sentence-
end position or also a fronted position, in both cases accompanied by main sentence
stress (cf. Bolinger 1972, Contreras 1978, 1980, Uriagereka 1995, Zubizarreta 1998
among others, for discussion of the different options).
Then, there are languages which signal focus morphologically, by the addition
of a suffix, a prefix or some other overt marker that indicates focalization. This
strategy can be combined with syntactic displacement, intonational marking, or a
combination of both. In Wolof, for instance, a so-called ‘emphatic marker’ inserted
before the verb indicates which constituent is being focalized, whether it is the
subject, a complement, or the verb (cf. Rialland and Robert 2001). Narrow focus is
also cued syntactically in Wolof, as focalized constituents have to appear in
sentence-initial position.3 It is important to point out that in this language no
intonational prominence or phrasing effects are manifested on the focalized element.
English Creoles could be similar to Wolof in this respect, as focus is marked
morphologically and also syntactically, by fronting the focalized constituent, and
prosodic marking may be absent (cf. Bickerton 1993).
In Japanese, morphological marking combines with prosodic marking to
indicate focus: prosodic prominence and phrasing effects leave clear which
constituent(s) have to be interpreted as narrow focus, and the focus particle -ga
follows a focalized subject (cf. Pierrehumbert and Beckman 1988, Haraguchi 1991,
Kubozono 1993, among others).
In other languages, focus is both syntactically and intonationally identified.
That is, narrowly focalized elements not only receive main prosodic prominence
and/or are accompanied by intonational phrasing boundaries, but they also occupy a
syntactic position structurally defined for focalized expressions, be it Spec-CP,
Spec-FocusP, the most embedded position in the sentence, the position immediately
preceding the verb, or some other position. Hungarian, Turkish, Quechua, Basque
and Hausa are examples of this type of language (cf. among others Horvath 1986,
Kiss 1995, 1998 for Hungarian; Vogel and Kenesei 1987, 1990 for Turkish; Ortiz de
Urbina 1989, 1999, Hualde et al. 1994, Elordieta 2001, Arregi 2001, Etxepare and
Ortiz de Urbina 2003 for Basque; Inkelas and Leben 1990 for Hausa). The following
paradigm from Basque illustrates this pattern in which focalized elements must
appear immediately preceding the verb. Thus, examples (1e-g) are ill-formed
because they contain focalized constituents which are either postverbal or not
immediately preverbal. Sentence (1a) represents a neutral declarative sentence, and
the rest are sentences with focalized constituents (capitalized):4
(1) a. Jonek Mireni liburua eman dio
       John-erg Miren-dat book-abs give aux
       ‘John has given the book to Miren’

    b. Jonek liburua MIRENI eman dio
       John-erg book-abs MIREN-DAT give aux
       ‘John has given the book TO MIREN’

    c. Mireni liburua JONEK eman dio
       Miren-dat book-abs JOHN-ERG give aux
       ‘JOHN has given the book to Miren’

    d. Jonek Mireni LIBURUA eman dio
       John-erg Miren-dat BOOK-ABS give aux
       ‘John has given THE BOOK to Miren’

    e. *Jonek liburua eman dio MIRENI
       John-erg book-abs give aux MIREN-DAT

    f. *JONEK Mireni liburua eman dio
       JOHN-ERG Miren-dat book-abs give aux

    g. *Jonek Mireni eman dio LIBURUA
       John-erg Miren-dat give aux BOOK-ABS

These examples show that, although Basque is a language with flexible word
order, there is a syntactic constraint in this language on the relative word order
between focus constituents and the verb, namely that they must be left-adjacent to it
(cf. the references mentioned in the previous paragraph for details on syntactic
analyses that could explain this constraint).5 But apart from this syntactic restriction,
in Basque the focalized expression receives main prominence in the sentence, that is,
focus is cued both syntactically and intonationally.
Serbo-Croatian offers a particularly rich case in focus marking possibilities
(Godjevac 2000, Frota 2002). Like in English, prosodic phrasing and prominence
with canonical word order serves to cue narrow focus. Another strategy to signal
narrow focus is to produce a marked word order by scrambling operations, assigning
at the same time prosodic phrasing and prominence cues to the constituent that is
focalized (i.e., the Hungarian-Basque type). Finally, it is also possible to mark
narrow focus by scrambling operations under a neutral intonation, leaving the
focalized constituent in sentence-final position, so that it receives default sentence
stress (like in Dutch or German for verb focus). Thus, three different strategies or
options are available in Serbo-Croatian to signal narrow focus.
The different possibilities for signaling narrow focus discussed above might not
constitute an exhaustive typology, although they might suffice for expository
purposes. The table in (2) summarizes this typology of the different possibilities for
signaling focus by means of syntax, morphology or prosody, or a combination of
more than one of these strategies. A few representative languages are also included.
Slots with a ‘?’ are those that to my knowledge do not have representatives.
(2)
Strategy for focus marking                 Sample languages
(a) Prosody alone                          - Only strategy: English, European Portuguese
                                           - One of the strategies: Dutch, German, Spanish, Italian
(b) Morphology alone                       ?
(c) Syntactic displacement alone           - Only strategy: ?
                                           - One of the strategies: Serbo-Croatian, Dutch, German
(d) Prosody and morphology                 - Only strategy: Japanese
                                           - One of the strategies: ?
(e) Morphology and syntactic displacement  - Only strategy: Wolof
                                           - One of the strategies: ?
(f) Prosody and syntactic displacement     - Only strategy: Hungarian, Basque, Turkish
                                           - One of the strategies: Serbo-Croatian, Spanish, Italian
(g) Prosody, syntactic displacement        ?
    and morphology
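
For readers who wish to query the typology in (2) mechanically, the table can be transcribed into a small lookup structure. The following Python sketch is only an illustration: the names `TYPOLOGY` and `languages_using` are hypothetical, "syntax" abbreviates syntactic displacement, and cells marked ‘?’ are rendered as empty lists. It records nothing beyond what the table itself lists.

```python
# Illustrative transcription of table (2); all names here are hypothetical.
# Key: the set of cues a strategy combines.
# Value: languages for which it is the only strategy ("only") and
# languages for which it is one of several strategies ("one_of").
TYPOLOGY = {
    frozenset({"prosody"}): {
        "only": ["English", "European Portuguese"],
        "one_of": ["Dutch", "German", "Spanish", "Italian"],
    },
    frozenset({"morphology"}): {"only": [], "one_of": []},
    frozenset({"syntax"}): {
        "only": [],
        "one_of": ["Serbo-Croatian", "Dutch", "German"],
    },
    frozenset({"prosody", "morphology"}): {"only": ["Japanese"], "one_of": []},
    frozenset({"morphology", "syntax"}): {"only": ["Wolof"], "one_of": []},
    frozenset({"prosody", "syntax"}): {
        "only": ["Hungarian", "Basque", "Turkish"],
        "one_of": ["Serbo-Croatian", "Spanish", "Italian"],
    },
    frozenset({"prosody", "syntax", "morphology"}): {"only": [], "one_of": []},
}


def languages_using(cue: str) -> set[str]:
    """Return every language listed under some strategy that involves `cue`."""
    found: set[str] = set()
    for cues, cell in TYPOLOGY.items():
        if cue in cues:
            found.update(cell["only"])
            found.update(cell["one_of"])
    return found


print(sorted(languages_using("morphology")))
```

A query such as `languages_using("syntax")` then collects rows (c), (e) and (f) in one step, which makes it easy to see that English appears only under prosodic marking.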

Despite all these possibilities of marking focus, I will show that in pitch-
accent dialects of Basque (i.e., Northern Bizkaian Basque, NBB) there are cases in
which words which constitute the narrow focus of the utterance are not singled out
by syntactic, morphological or intonational means. In these dialects, intonational
highlighting of narrow focus is restricted to words which bear a lexical or derived
accent, or for some speakers, to words that constitute Accentual Phrases (APs) by
themselves. That is, not any independent word can bear intonational prominence
even though it may be the pragmatic focus of the utterance. I discuss these cases
in the following section.

3. SYNTACTIC AND PROSODIC CONSTRAINTS ON FOCUS IN NBB

3.1. Lexically and morphologically conditioned accentual classes in NBB

In order to understand the syntactic and prosodic constraints on focus in NBB, it is
necessary to provide an overview of the prosodic features of these dialects. NBB
dialects are pitch accent varieties of the Bizkaian dialect of Basque, and are spoken
in the northwestern Basque-speaking area, along the coast and in a band of around
15 kilometers inland from the coast. A noteworthy feature of these dialects is the
lexical distinction between unaccented and accented roots, stems and affixes, like in
Japanese (cf. Poser 1984, Pierrehumbert and Beckman 1988, Haraguchi 1991,
Kubozono 1993 among others for details on Japanese tone and intonation structure).
An accented root or affix is sufficient to render an accented word, which surfaces
with prominence on a non-final syllable in all contexts. In most NBB varieties, the
syllable preceding the leftmost accented morpheme surfaces with main prominence,
as illustrated in (3) below for the Gernika variety (accented morphemes are indicated
by an apostrophe). In a few varieties, it is always the penultimate syllable that is
accented (as in the Lekeitio variety, cf. Hualde et al. 1994, Hualde 1997, 1999,
Elordieta 1997, 1998):

(3) a. sagar -‘ata -‘tik → sa.gá.rra.ta.tik ‘from the apples’
       apple-plur.loc.-abl.

    b. léku -‘ata -ra → lé.ku.e.tara ‘to the places’
       place-plur.loc.-all.

A combination of unaccented roots and affixes produces unaccented words
(except in compounding, where even if the members are unaccented the compound
word is accented). Unaccented words will only receive prosodic prominence if they
occur immediately preceding the verb or are pronounced in isolation. In these cases,
in most NBB varieties they display prominence on the final syllable, and in a few
dialects they show penultimate prominence (e.g., Ondarroa and Markina, cf. Hualde
1997, 2000). This kind of prominence is called derived accent by Jun and Elordieta
(1997), to distinguish it from the lexical accent of accented words. In all other
contexts, unaccented words do not surface with any kind of prosodic prominence on
any syllable. Thus, observe the behavior of the unaccented word laguna ‘the friend’
in (4), corresponding to the Lekeitio variety (henceforth Lekeitio Basque, LB). This
word is composed of the unaccented root lagun ‘friend’ and the unaccented singular
determiner –a. Prosodic prominence is indicated by an acute accent mark. The
different word orders in (4a-d) are due to the flexible word order of Basque,
constrained by topic and focus or theme-rheme structures. That is, (4a-d) differ in
information structure (rheme constituents are underlined).

(4) a. umiágas laguná etorri da
       child-com friend-abs come aux
       ‘The friend has come with the child’

    b. laguná etorri da umiágas
       friend-abs come aux child-com
       ‘The friend has come with the child’

    c. laguna umiágas etorri da
       friend-abs child-com come aux
       ‘The friend has come with the child’

    d. umiágas etorri da laguna
       child-com come aux friend-abs
       ‘The friend has come with the child’

    e. *laguná umiágas etorri da

    f. *umiágas etorri da laguná
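
The distribution of prominence in (4) can be restated as a simple decision rule: a lexically accented word always surfaces with prominence, whereas an unaccented word does so only when it immediately precedes the verb or is pronounced in isolation. The Python sketch below is a minimal illustration under stated assumptions, not an implementation of any published analysis. The `Token` type and the function `prominent_forms` are hypothetical names, lexical accentedness is supplied by hand, forms are written without the acute mark, and compounds, the position of the prominent syllable, and prominence on the verb itself are not modelled.

```python
from typing import NamedTuple


class Token(NamedTuple):
    form: str
    accented: bool         # lexically accented (assumed known from the lexicon)
    is_verb: bool = False  # True for the verb and its auxiliary


def prominent_forms(sentence: list[Token]) -> list[str]:
    """Return the non-verbal words that surface with prosodic prominence."""
    result = []
    for i, tok in enumerate(sentence):
        if tok.is_verb:
            continue  # verbal prominence is outside the scope of this sketch
        following = sentence[i + 1] if i + 1 < len(sentence) else None
        in_isolation = len(sentence) == 1
        immediately_preverbal = following is not None and following.is_verb
        # Derived accent: unaccented word in isolation or right before the verb.
        derived = not tok.accented and (in_isolation or immediately_preverbal)
        if tok.accented or derived:
            result.append(tok.form)
    return result


umiagas = Token("umiagas", accented=True)
laguna = Token("laguna", accented=False)
etorri = Token("etorri", accented=False, is_verb=True)
da = Token("da", accented=False, is_verb=True)

print(prominent_forms([umiagas, laguna, etorri, da]))  # word order of (4a)
print(prominent_forms([laguna, umiagas, etorri, da]))  # word order of (4c)
```

With the word order of (4a) both words come out as prominent, while with the order of (4c) only the accented word does, matching the absence of an accent mark on laguna there.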

The unaccented/accented distinction is directly relevant for intonational
phrasing in NBB. Prominence is realized as a H*+L pitch accent on the syllable that
is phonologically associated with accent. As already mentioned above, accented
words will always bear stress in any position in the sentence, whereas unaccented
words only display a H*+L pitch accent if they are immediately left-adjacent to the
verb, i.e., when they bear derived accent. The intonational pattern that arises is the
following: the sentence starts with an initial low tone (%L), immediately followed
by a rise phonetically associated with the second or third syllable of the first word.
The pitch level is maintained until reaching a H*+L pitch accent, whether of an
accented word or an unaccented word with derived accent. If after that H*+L pitch
accent there is another word, the contour that is observed is one in which again there
is an initial low tone on the first syllable of that word, the pitch level rising again on
the second or third syllable of the following word, and the high tone level plateau
being maintained on all syllables until another H*+L accent, corresponding to an
accented word or an unaccented word preceding the verb, i.e., with derived accent.
And if another word follows, the same pattern is observed. Thus, a cycle of low
tone, rise, plateau and H*+L pitch accent is observed. The intonational units or
constituents with this shape are identified by Elordieta (1997, 1998) as Accentual
Phrases (APs). Jun and Elordieta (1997) and Elordieta (1998) show that APs consist
of an initial %L boundary tone, a phrasal H tone (H-) on the second syllable,6 and a
H*+L pitch accent. The phrasal H tone spreads phonologically onto all syllables
between the second one and the one with the pitch accent. Schematically, the tonal
structure of an AP is %L H- H*+L (cf. also Hualde et al. 2002).7
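
The phrasing generalization just described, namely that every H*+L pitch accent closes an AP and the next word opens a new one, can be sketched procedurally. The Python fragment below is a hedged illustration only: `Word` and `accentual_phrases` are hypothetical names, lexical accentedness is entered by hand, forms are written without the acute mark, and the function assumes that it receives exactly the words preceding the verb, so that the last word is immediately preverbal and bears at least a derived accent. It groups words into APs and does not model tonal alignment or downstep.

```python
from dataclasses import dataclass


@dataclass
class Word:
    form: str
    lexically_accented: bool  # assumed to be known from the lexicon


def accentual_phrases(preverbal_words: list[Word]) -> list[list[str]]:
    """Group the words preceding the verb into Accentual Phrases (APs).

    Assumption: the last word of the list is immediately preverbal, so it
    carries a pitch accent either lexically or as a derived accent.
    """
    phrases: list[list[str]] = []
    current: list[str] = []
    last = len(preverbal_words) - 1
    for i, word in enumerate(preverbal_words):
        current.append(word.form)
        # An AP closes at every word that carries a H*+L pitch accent.
        if word.lexically_accented or i == last:
            phrases.append(current)
            current = []
    return phrases


# Three unaccented words before the verb: a single AP is expected.
three_unaccented = [
    Word("alargunen", False),
    Word("nebien", False),
    Word("dirua", False),
]
# Two accented words before the verb: two APs are expected.
two_accented = [Word("amumen", True), Word("liburuak", True)]

print(accentual_phrases(three_unaccented))
print(accentual_phrases(two_accented))
```

Each sublist returned corresponds to one AP with the schematic tonal structure %L H- H*+L.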
Figures 1-2 illustrate the general shape of APs in NBB, corresponding to (5a-b),
respectively. Figure 1 is an example of a sentence containing three unaccented
words before the verb; from an IP-initial %L there is a rise on the second syllable,
reaching the peak on the third syllable, and the H tone continues until the H*+L
pitch accent on the final syllable of the third word (i.e., the one immediately
preceding the verb, with the derived accent). The pitch drops on the verb until the
end of the utterance. In the figures in this article, the pitch accent is aligned with the
right edge of the accented syllable. Fig. 2 contains two accented words, each of them
with their corresponding H*+L pitch accent. Due to downstep, the second phrasal
H- does not rise as much as the first one, and the second peak is smaller than the
first one (cf. Elordieta 1997, 1998, Jun and Elordieta 1997 for details and more pitch
tracks):

(5) a. AP{%L H- H*+L}
       | | |
       alargunen nebien diruá galdu dot
       widow-gen brother-gen money-abs lose aux
       ‘I have lost the widow’s brother’s money’

    b. AP{%L H*+L} AP{%L H- H*+L}
       | | | | |
       amúmen liburúak biar doras
       grandmother-gen books-abs need aux
       ‘I need grandmother’s books’

Figure 1. alargunen nebien diruá galdu dot

Figure 2. amúmen liburúak biar doras

3.2. Intonational restrictions on the assignment of prominence to focalized words

As explained in section 2 above and illustrated in (1), in NBB only words contained
in an immediately preverbal syntactic constituent can be focalized. The focalized
word does not have to immediately precede the verb, but it has to be contained in a
syntactic constituent that is immediately preceding the verb. Thus, in the following
examples, both (6a) and (6b) are grammatical. (6c) is ungrammatical, however, because
the syntactic constituent containing the focalized word is not immediately preverbal
(syntactic constituents are separated by square brackets):

(6) a. [maixuári] [lagúnen LIBURÚAK] emon dotzaras.
       teacher-dat friends-gen BOOKS-ABS give aux
       ‘I have given the friends’ BOOKS to the teacher’
       (responding to stimuli such as: ‘Which of the friends’ things have you
       given to the teacher?’)

    b. [maixuári] [LAGÚNEN liburúak] emon dotzaras.
       teacher-dat FRIENDS-GEN books-abs give aux
       ‘I have given THE FRIENDS’ books to the teacher’
       (responding to stimuli such as: ‘Whose books have you given to the
       teacher?’)

    c. *[MAIXUÁRI] [lagúnen liburúak] emon dotzaras.
       TEACHER-DAT friends-gen books-abs give aux
       ‘I have given the friends’ books TO THE TEACHER’
       (erroneously responding to stimuli such as: ‘Who have you given the
       friends’ books to?’)
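
The syntactic condition illustrated in (6), and in (1) when each constituent is a single word, amounts to a membership test: the focalized word must belong to the constituent that immediately precedes the verb. The Python sketch below states this under explicit assumptions. The placeholder `VERB`, the function name `focus_is_licensed`, and the representation of a sentence as a list of constituents are all hypothetical conventions introduced for illustration; forms are written without the acute mark, and the check covers only the syntactic condition, not the intonational one discussed next.

```python
from typing import Union

VERB = "V"  # hypothetical placeholder standing for the verb plus auxiliary

Constituent = list[str]


def focus_is_licensed(sentence: list[Union[str, Constituent]], focus: str) -> bool:
    """True if `focus` is inside the constituent immediately preceding VERB."""
    verb_position = sentence.index(VERB)
    if verb_position == 0:
        return False  # nothing precedes the verb
    return focus in sentence[verb_position - 1]


# Constituent structure shared by (6a-c).
example_6 = [["maixuari"], ["lagunen", "liburuak"], VERB]

print(focus_is_licensed(example_6, "liburuak"))  # focus as in (6a)
print(focus_is_licensed(example_6, "lagunen"))   # focus as in (6b)
print(focus_is_licensed(example_6, "maixuari"))  # focus as in (6c)
```

The same function rejects the postverbal focus of (1e) when the sentence is encoded as `[["Jonek"], ["liburua"], VERB, ["Mireni"]]`, since the constituent before the verb does not contain the focalized word.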

However, in cases of utterances where one of the words constitutes the narrow
focus of the utterance, even if that word is contained in the immediately preverbal
constituent, there is a further constraint it must obey in order to be intonationally
singled out. In the variety of NBB I have investigated, LB, a focalized word can
be the most prominent intonationally only if it has a lexical pitch accent (i.e., if it is a
lexically accented word) or if it has a derived accent (i.e., it is an unaccented word
immediately preceding the verb). Let us illustrate this constraint with sentence (7)
(repeated from (5b)), containing only one preverbal constituent with two accented
words, amúmen ‘grandmother’s’ and liburúak ‘books’. The intonational structure
corresponding to this constituent is thus the following:

(7) AP{%L H*+L} AP{%L H- H*+L}
    | | | | |
    amúmen liburúak biar doras
    grandmother-gen books-abs need aux
    ‘I need grandmother’s books’

That is, in the immediately preverbal syntactic constituent there are two APs,
each of them containing one accented word. Let us now describe the main patterns
observed in contexts of narrow focus, that is, in cases in which the focalized word
replaces the variable introduced by a wh-word in a previous question. The two
words in (7) would become the narrow focus of an utterance if they formed part of a
response to the questions in (8a,b), respectively:

(8) a. Nóren liburúak biar dósus?

whose books-abs need aux
‘Whose books do you need?’

b. Sér biar dósu amuména?

what need aux grandmother-gen
‘What do you need of grandmother’s?’

Since amúmen and liburúak have lexical H*+L pitch accents, they can be
pronounced standing out as the most prominent words in the utterance. An interesting
aspect worth mentioning is that in narrow focus cases in which the first word
is focalized the pronunciation of such utterances is not usually distinguished from
cases of broad focus. That is, the first word will not necessarily show a boosted pitch
level and/or a following decreased pitch level. In the data I have analyzed from
five female speakers of LB, only one speaker produced some utterances in which the
first word was pronounced with a higher pitch followed by a lower level on the
following word. This might be due to the fact that in broad focus cases the
difference in pitch between the first peak and the following peaks is already quite
big (cf. Fig. 2). However, when the second word is focalized, there are more
instances in which the word is made more prominent intonationally and perceptually
distinguishable from broad focus cases. The focalized word may present a higher
pitch level (although the peak is still lower than the first peak, due to downstep),
followed by a decreased pitch level. Quite often there may also be a displacement of
the peak of the first word to the posttonic syllable. This strategy signals old
information or topic status for that word.8 For sentence (9), which would be an
answer to (8b), Figure 3 illustrates a case without peak delay at the end of the
preceding word, and Figure 4 illustrates a case with peak displacement, indicated in
the tone tier with a ‘>’ sign:

(9) amúmen LIBURÚAK biar doras.

grandmother-gen BOOKS-ABS need aux
‘I need grandmother’s BOOKS’

A similar scenario would apply for a preverbal constituent containing two
words, the first one accented and the second word unaccented. The accented word
has a lexical H*+L accent, and the unaccented word receives a H*+L pitch accent
by virtue of preceding the verb (i.e., it has a derived accent on its final syllable). The
sentence in (10) is an example:

Figure 3. amúmen LIBURÚAK biar doras.

Figure 4. amúmen LIBURÚAK biar doras.

(10) AP{%L H*+L} AP{%L H- H*+L}

| | | | |
Amáien alabiá topa dot
Amaia-gen daughter-abs find aux
‘I came across Amaia’s daughter’

If the first word were the narrow focus of the sentence, most commonly it
would not receive more prominence than in broad focus cases. If the second word
were the narrow focus, however, it would be made more prominent by presenting a
higher pitch level than in broad focus cases, accompanied or not by peak delay in the
first word (interestingly, when there is peak delay in the previous word a higher
pitch level on the focalized word is not necessary). An example with peak delay in
the first word is illustrated below in Figure 5, corresponding to (11). As described
above, however, this pattern is not obligatory, and it is also quite normal to find
cases which are intonationally very similar to broad focus utterances.9

(11) Amáien ALABIÁ topa dot.

‘I came across Amaia’s DAUGHTER’.

Figure 5. Amáien ALABIÁ topa dot

However, in the case of preverbal constituents containing one or more
unaccented words in nonfinal position (i.e., not immediately preceding the verb) the
situation is different. An unaccented word will only get a derived accent if it is left-
adjacent to the verb, and hence an unaccented word which is the narrow focus of an
utterance but which is not in the position that grants a derived accent cannot be
made more prominent intonationally. In a neutral sentence such as (12), the
leftmost unaccented word, nebien ‘the brother’s’, would not receive main
prominence even if it were the narrow focus of the sentence (as an answer to
‘Whose money have you lost?’), because it does not have a pitch accent, lexical or
derived. A crucial aspect of this pattern in NBB is that focus does not insert accents
that are not already there lexically or by virtue of a preverbal position. The first
word is lexically unaccented, and even if it is focalized, it remains unaccented, that
is, no accent is associated to it, as it is not left-adjacent to the verb and hence does
not receive a derived accent. This impossibility does not depend on the accentual
nature of the following word, as the same impossibility occurs with accented words
following the unaccented word. Thus, in a sentence such as (13) it would not be
possible to highlight the first word. The type of contours that surface in these
instances is one in which the leftmost word has to be pronounced with the same
pitch level as the following word, in the same AP. Figure 6 serves to illustrate such a
contour, corresponding to narrow focalization of the word nebien in (12):

(12) AP{%L H- H*+L}

| | |
nebien diruá galdu dot
brother-gen money-abs lose aux
‘I have lost the brother’s money’

(13) AP{%L H- H*+L}

| | |
lagunen liburúa biar dot
friend-gen book-abs need aux
‘I need the friend’s book’

Figure 6. nebien diruá galdu dot

As for the second word in sentences such as (12)-(13), we do not find a uniform
pattern across speakers. However, such interspeaker variation reveals important
facts about constraints on the intonational realization of main prominence in
contexts of narrow focus. For two of the five speakers recorded, the second words in
those cases would be able to receive main prominence if they were the narrow focus
of the utterance, as in (14b), responding to a question such as (14a). An observed
strategy in these cases is a continuation rise at the end of the preceding word,
signaling old or known information. This rise cannot be due to an accent in the first
word, so it must be due to H- (cf. Fig. 7). Another possibility is to have a sustained
pitch at the end of the preceding word followed by a rise in pitch level on the
focused word (other non-intonational features such as higher intensity may also
be present). In both cases, a decrease in pitch level follows the focalized word. The
same pattern is observed in cases in which the second word is lexically accented, as
in (15):

(14) a. Ser galdu dósu nebiena?

What lose aux brother-gen
‘What have you lost of the brother?’

b. nebien DIRUÁ galdu dot

brother-gen MONEY-ABS lose aux
‘I have lost the brother’s MONEY’

(15) a. Ser biar dosu lagunena?

what need aux friend-gen
‘What do you need of the friend?’

b. lagunen LIBURÚA biar dot

friend-gen BOOK-ABS need aux
‘I need the friend’s BOOK’

Figure 7. nebien DIRUÁ galdu dot

Importantly, three of our five speakers did not produce utterances like (14b), or
could not pronounce the second word in (15b) with main intonational prominence.
That is, they cannot highlight a word intonationally if it is preceded by an
unaccented word. For these speakers, neither the leftmost word nor the second
word can be prosodically highlighted. Regardless of which word is the narrow
focus of the utterance, the whole AP (i.e., the two words) would have to be
pronounced together. The explanation for this pattern is that these speakers have a
stricter constraint on the intonational highlighting of focalized words. This
constraint states that only words which constitute APs by themselves can be made
intonationally prominent. In cases of two words with accent, such as the ones in (7)/
(10), each word constitutes its own AP, and can thus be singled out intonationally.
But in cases in which the first word is unaccented, the second word does not
constitute an AP by itself. Rather, it continues the AP that the first word started. As
the intonational schemas in (12)-(13) show, the unaccented word starts an AP, with
the initial %L H- tone sequence, but since it does not have a pitch accent, the phrasal
H- tone spreads onto the next word, until the H*+L accent (lexical or derived) of the
following word ends the AP (cf. Jun and Elordieta 1997; Elordieta 1998). There is thus
only one AP before the verb, containing the two words. Since neither word forms an
independent AP, they cannot be made intonationally prominent on their own. The two
words have to be pronounced at the same pitch level, in the same AP. The contour
observed in these instances is similar to the one illustrated in Figure 6, which showed
the impossibility of having the leftmost word as the most prominent word in the
utterance. The important issue at work here is that no pitch accent is specially
inserted on the first unaccented word, even if it is the narrow focus of the sentence
from a pragmatic or information-structure point of view, as already mentioned
above. Hence no AP boundary can be inserted at the right edge of the first word.
That is, the lexical association of pitch accents is respected by focus in NBB.
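The AP formation just described can be illustrated with a minimal Python sketch. It is not part of the original analysis: the names Word and phrase_into_aps are hypothetical, and the sketch assumes only what the text states, namely that every H*+L accent (lexical, or derived on the word left-adjacent to the verb) closes an AP, and that focus never inserts an accent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    form: str
    lexically_accented: bool  # True if the word carries a lexical H*+L


def phrase_into_aps(preverbal: List[Word]) -> List[List[str]]:
    """Group the words of a preverbal constituent into accentual phrases.

    Assumption (from the text): an AP starts with %L H- and is closed by
    the first word bearing H*+L. The last preverbal word always bears an
    accent, lexical or derived, so it always closes an AP.
    """
    aps: List[List[str]] = []
    current: List[str] = []
    last = len(preverbal) - 1
    for i, word in enumerate(preverbal):
        current.append(word.form)
        # Focus status is deliberately not an input: it cannot add an accent.
        if word.lexically_accented or i == last:
            aps.append(current)
            current = []
    return aps


# Example (7): two accented words, hence two APs.
print(phrase_into_aps([Word("amúmen", True), Word("liburúak", True)]))
# Example (12): an unaccented word followed by a word with a derived accent.
print(phrase_into_aps([Word("nebien", False), Word("diruá", False)]))
```

Under these assumptions, each word in (7) forms its own AP, whereas the two words in (12) are grouped into a single AP, which is the configuration in which the more restrictive speakers cannot single out either word.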
Thus, a mismatch between semantics and intonation arises in cases where a
word which does not constitute an AP by itself is the corrective focus of an
utterance. No intonational cues are used within the utterance containing the
contrastively focalized word alone in order to convey the intended meaning. There is
no way to single out the focalized word syntactically, as the word occurs with other
words in the preverbal constituent. Disambiguation can only come from the
preceding linguistic context. This mismatch situation between semantics and
prosody does not arise in the languages surrounding NBB (Spanish and French) or in
other Indo-European languages. And an insufficiency of syntax and/or morphology to
mark focalized words is unattested in the languages for which there are descriptions
of focus realization, a summary of which was provided in section 2. Thus, this
property of NBB is interesting from a typological point of view as well.
The patterns of realization of intonational highlighting change slightly when
corrective focus is considered. Corrective focus refers to those instances in which
the speaker corrects one of the words or syntactic phrases that her interlocutor has
stated incorrectly. For instance:

(16) a. Nóren alabia topa dosula? Alaznena?

whose daughter-abs find aux Alazne-gen
‘Whose daughter did you come across? Alazne’s?’

b. Es, AMÁIEN alabiá topa dot.

no AMAIA-GEN daughter-abs find aux
‘No, I came across AMAIA’s daughter.’

In (16b) above, the first accented word Amáien can be made more prominent,
usually by having a boosted pitch level followed by a decreased pitch level in the
rest of the material in the sentence. Thus, in corrective focus the first word is
distinguishable from cases of broad focus, unlike in narrow non-corrective focus.
The second word in (16b) would also be made more prominent, by means of a
delayed peak in the preceding word, signaling the character of topic or old
information of that word. This type of contour is illustrated in Figure 8, for a
sentence such as Es, Amáien ALABIÁ topa dot ‘No, I came across Amaia’s
DAUGHTER’. Another option is to have simply a higher pitch level on the
focalized word, without a preceding peak displacement. Quite often, the focalized
word is accompanied by higher intensity levels and longer duration.10 As already
described above, the same options would be available for sentences in which the
second word were lexically accented.

Figure 8. Es, Amáien ALABIÁ topa dot

But the interesting cases are those in which the first word is unaccented,
forming an AP with the following word. As described above, in narrow non-
corrective focus some speakers could not intonationally highlight either of the two
words, due to a constraint that a word has to constitute an AP by itself in order to be
the most prominent word in the utterance, rather than simply having a pitch accent.
In corrective focus, however, these speakers can place main intonational prominence
in a word even if it does not constitute an AP by itself. The sufficient condition is
that the word has an accent, lexical or derived, as in narrow non-corrective focus
for the other speakers. Words bearing an accent and following an unaccented word
may surface with main prominence, cued by a rise in pitch on the focalized word
coming from a sustained pitch of the unaccented word, or by a rise at the end of the
prefocal unaccented word. In both cases, usually the focalized word displays higher
intensity and duration (cf. Elordieta and Hualde 2001, 2003). It is important to
bear in mind, however, that this type of prosodic realization is scarce in the
production of the most restrictive speakers, that is, those for whom a word has to
constitute an AP by itself in order to stand out as the most prominent word.11 Figure
9 illustrates an F0 contour for a sentence such as (17b), in which the first option is
realized, and Figure 10 illustrates the second possibility, with a rise at the end of the
first word.

(17) a. Ser biar dosula lagunena? Kuadernúa?

what need aux friend-gen notebook-abs
‘What do you need of the friend? His notebook?’

b. Es, lagunen LIBURÚA biar dot.

‘I need the friend’s BOOK’.

Figure 9. lagunen LIBURÚA biar dot

Figure 10. lagunen LIBURÚA biar dot

We will finish our presentation of the intonational constraints on the prosodic
realization of focus in NBB by summarizing in a table the focus realizations for all
logically possible two-word combinations in a preverbal phrase. The left-hand
column summarizes the patterns in narrow non-corrective focus, and the right-hand
column those of corrective focus. When the two types of constraints for intonational
highlighting (having an accent or being an independent AP) produce different
outputs, they are distinguished as (a) and (b).12

Narrow (non-corrective) focus               Corrective focus

   H*L          H*L                            H*L          H*L
    |            |                              |            |
AP[Accented]–AP[Accented] – Verb            AP[Accented]–AP[Accented] – Verb
Each word can be highlighted                Each word can be highlighted (boosted
                                            pitch on focalized word more frequent
                                            than in non-corrective focus)

   H*L          H*L                            H*L          H*L
    |            |                              |            |
AP[Accented]–AP[Unaccented] – Verb          AP[Accented]–AP[Unaccented] – Verb
Each word can be highlighted                Each word can be highlighted (boosted
                                            pitch on focalized word more frequent
                                            than in non-corrective focus)

                H*L                                         H*L
                 |                                           |
AP[Unaccented–Accented] – Verb              AP[Unaccented–Accented] – Verb
a. Neither word can be highlighted; they    a. Neither word can be highlighted; they
   are uttered in the same AP                  are uttered in the same AP
b. Only the word with an accent can be      b. Only the word with an accent can be
   highlighted                                 highlighted (more frequent than in
                                               non-corrective focus)

                H*L                                         H*L
                 |                                           |
AP[Unaccented–Unaccented] – Verb            AP[Unaccented–Unaccented] – Verb

4. SUMMARY AND CONCLUSION

In this paper I have described the main constraints on the realization of prosodic
prominence on focalized words in a pitch accent dialect of Basque. It has been
shown that the minimum condition a word has to satisfy to receive main prosodic
prominence if pragmatically focalized is that it has an accent, whether lexical or
derived. However, in cases of narrow non-corrective focus some speakers reveal the
existence of a more restrictive constraint, which demands that a word must
constitute an AP by itself in order to surface with main prominence. In corrective
focus the sufficient condition for the five speakers recorded is that a word has an
accent. In either case, the interesting fact is that an unaccented word which does not
have a derived accent cannot receive an accent even if it is pragmatically focalized. The
context seems to prevent possible ambiguities between neutral and narrow focus
readings of unaccented words without a derived accent. To my knowledge, these are
crosslinguistically unattested constraints, and in this regard NBB is different even
from a language like Tokyo Japanese, which also has a lexical distinction between
accented and unaccented words, but which allows any unaccented word to be
prosodically highlighted (cf. Pierrehumbert and Beckman 1988).
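
The conditions summarized in this section can be restated as a small decision function. This is an illustrative sketch under stated assumptions, not an implementation taken from the study: the function name can_bear_main_prominence and its parameters are hypothetical, speakers are idealized into two groups (with or without the stricter AP constraint), and the marginal productions discussed in note 12 are ignored.

```python
from typing import List


def can_bear_main_prominence(
    lexically_accented: List[bool],
    focus_index: int,
    corrective: bool,
    strict_speaker: bool,
) -> bool:
    """Decide whether the focalized preverbal word can be highlighted.

    lexically_accented[i] is True if preverbal word i has a lexical H*+L.
    The last preverbal word always bears an accent (lexical or derived).
    """
    last = len(lexically_accented) - 1
    has_accent = lexically_accented[focus_index] or focus_index == last
    if not has_accent:
        # Focus never inserts an accent on an unaccented, nonfinal word.
        return False
    if corrective or not strict_speaker:
        # Minimum condition: bearing an accent is sufficient.
        return True
    # Stricter constraint (narrow non-corrective focus, some speakers):
    # the word must be an AP by itself, i.e. it is initial or the
    # preceding word closes its own AP with a lexical accent.
    return focus_index == 0 or lexically_accented[focus_index - 1]


# (13) lagunen liburúa, focus on the second word:
print(can_bear_main_prominence([False, True], 1, corrective=False, strict_speaker=True))
print(can_bear_main_prominence([False, True], 1, corrective=True, strict_speaker=True))
```

Under these assumptions the first call returns False and the second returns True, mirroring the (a)/(b) split in the summary table for phrases that begin with an unaccented word.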

Dept. of Linguistics and Basque Studies, University of the Basque Country,
Vitoria-Gasteiz, Spain

NOTES
*
Many thanks are due to Matthew Gordon and José Ignacio Hualde for comments on earlier versions of
this article, as well as to Sónia Frota, Carlos Gussenhoven and Kiwa Ito for help with section 2. Of course,
this article would not have been made possible without my native informants, to whom I am indebted
immensely. This work was funded by research grants from the Department of Education, Universities and
Research of the Basque Government (PI-1998-127), the University of the Basque Country (UPV-HA-
8025/20 and 9/UPV 00033.130-13888/2001) and the Ministry of Science and Technology of Spain
(BFF2002-04238-C02-01/FEDER).
1
For expository purposes, we exclude cleft and pseudo-cleft sentences from the discussion, as
below in the text we will compare this type of language with another type that marks focus constituents
syntactically without clefting, by having focalized constituents occupy a certain syntactic position.
Thus, we want to distinguish languages which have a structural position for focus from
languages such as English that do not, although they may make use of cleft sentences to mark focus.
2
Scrambling is disfavored or does not apply with indefinite objects. In such cases, there is simply main
prosodic prominence on the verb.
3
However, when an object is focalized and there is a nonpronominal subject, the focalized object has to
follow the subject, which obligatorily appears thematized (i.e., topicalized, cf. Rialland and Robert
2001:897-898).
4
The following abbreviations will be used: abl = ablative, abs = absolutive, all = allative, aux = auxiliary,
dat = dative, erg = ergative, gen = genitive, ines = inessive, loc = locative, pl = plural, sg = singular.
5
It is possible for focalized constituents to appear after the verb, but they are usually uttered as separate
intermediate or intonational phrases. They are usually preceded by pauses, fillers such as e ‘err…/um…’,
or final lengthening of the verb ending in a rising intonation. It appears that copulas can be followed by
focalized constituents even without a pause (Hualde et al. 1994). In central and eastern dialects it is
possible to have focalized elements postverbally without a pause (cf. Hidalgo 1994, Elordieta 2003), apart
from the usual preverbal position, but the speakers I have consulted cannot have postverbal focus as an
answer to a wh-word. In that case preverbal focus is the only option. Perhaps only informational, non-
contrastive focus (Kiss 1998) presented by the speaker in her own discourse can appear postverbally in
these dialects, but more research is needed on this topic before making any generalizations.
6
Jun and Elordieta (1997) found that in APs up to four syllables long the peak of H- is reached on the
second syllable, and in APs more than four syllables long it was reached on the third syllable. This H- is
not phonetically realized when the second syllable is associated to a pitch accent.
7
For some speakers, in sequences of four or more unaccented words certain dips in pitch can be observed
between two unaccented words. Jun and Elordieta (1997) and Elordieta (1998) take these to be AP-
boundaries, in the absence of H*+L pitch accents. However, the dips were difficult to perceive and were
much smaller than regular drops after H*+L pitch accents (see relevant pitch tracks in the mentioned
articles). Also, the factors conditioning these breaks were not very well established; desire for heaviness
reduction and slower rate of speech were suggested as factors involved in the insertion of these breaks,
but no systematic study was carried out to prove these claims. Moreover, these facts were subject to speaker
dependence; some speakers always produce plateaus in sequences of four or more unaccented words,
without breaks. This issue deserves a more systematic study, which I plan to undertake in future
research.
8
The delayed peaks at the end of prefocal words were already observed for some speakers of LB by Ito
et al. (2003). However, their data involved cases of corrective focus, which we also discuss below. The
patterns presented in this paper show that it is possible to find such delayed peaks in non-corrective
narrow focus as well. Other strategies of main prominence that can be observed in these contexts and
which are not intonational in nature are higher intensity and duration on the focalized word.
9
Indeed, the speakers of LB on which Elordieta (2003) based his findings did not produce utterances in
which the second word was most prominent intonationally, and this led to positing the absence of such a
possibility. That conclusion must now be corrected to capture the facts presented in this article.
10
Although the results in Elordieta and Hualde (2001, 2003) showed that lengthening applied to words
in corrective focus, it must be pointed out that in those utterances speakers were instructed to put special
emphasis on those words. In other recordings in which speakers were not told to put emphasis on the
correction, I have observed that lengthening did not occur significantly. It seems that a specific
experiment (left for future research) is needed to clarify the role of lengthening as a cue to corrective
focus.
11
Thus, highlighting words following an unaccented word is possible, but not frequent
in LB. Its frequency is speaker dependent, but as stated in note 9, the possibility of finding such patterns
has to be incorporated into the intonational grammar of LB, contra what was assumed in Elordieta (2003).
12
Interestingly, the two speakers that patterned differently from the other three speakers in contexts of
narrow non-corrective focus in being able to highlight a word following an unaccented word also
patterned differently in other respects. For contexts in which the first unaccented word was correctively
focalized, they produced contours in which this word was prosodically set apart, by having a higher pitch
level followed by a fall in pitch for the following word, or by being pronounced with greater intensity and
duration. However, such cases were few in number, compared to the majority of cases in which the
unaccented word did not surface with main prominence, thus patterning with the other three speakers. At
this point I consider it premature to conclude that highlighting the unaccented word in these contexts is
a solid possibility in LB, and leave the issue open for further research based on data from more speakers
and based on more tokens of each type of context.

REFERENCES

Arregi, Karlos. “Focus and Word Order in Basque.” Manuscript, Massachusetts Institute of Technology,
2001.
Bickerton, Derek. “Subject Focus Pronouns.” In Francis Byrne and Donald Winford (eds.), Focus and
Grammatical Relations in Creole Languages, pp. 189-212. Amsterdam: John Benjamins, 1993.
Bolinger, Dwight. “English Prosodic Stress and Spanish Sentence Order.” Hispania 37 (1954): 152-156.
Bolinger, Dwight. “Accent is Predictable (If You’re a Mind-reader).” Language 48 (1972): 633-644.
Cinque, Guglielmo. “A Null Theory of Phrase and Compound Stress.” Linguistic Inquiry 24 (1993):
239-297.
Contreras, Heles. El Orden de Palabras en Español. Madrid: Cátedra, 1978.
Contreras, Heles. “Sentential Stress, Word Order, and the Notion of Subject in Spanish.” In Linda Waugh
and C.H. van Schooneveld (eds.), The Melody of Language, pp. 45-53. Baltimore: University Park
Press, 1980.
Culicover, Peter, and Michael Rochemont. “Stress and Focus in English.” Language 59 (1983): 123-165.
Elordieta, Arantzazu. Verb Movement and Constituent Permutation in Basque. Utrecht: LOT, 2001.
Elordieta, Gorka. “Accent, Tone and Intonation in Lekeitio Basque.” In Fernando Martínez-Gil and
Alfonso Morales-Front (eds.), Issues in the Phonology and Morphology of the Iberian Languages,
pp. 4-78. Washington, DC: Georgetown University Press, 1997.
Elordieta, Gorka. “Intonation in a Pitch-Accent Dialect of Basque.” International Journal of Basque
Linguistics and Philology 32 (1998): 511-569.
Elordieta, Gorka. “Intonation.” In José I. Hualde and Jon Ortiz de Urbina (eds.), A Grammar of Basque,
pp. 72-113. Berlin: Mouton de Gruyter, 2003.
Elordieta, Gorka, and José I. Hualde. “The Role of Duration as a Correlate of Accent in Lekeitio Basque.”
In Proceedings of Eurospeech 2001 - Scandinavia, 105-108, 2001.
Elordieta, Gorka, and José I. Hualde. “Tonal and Durational Correlates of Accent in Contexts of
Downstep in Northern Bizkaian Basque.” Journal of the International Phonetic Association, 33
(2003): 195-209.
Etxepare, Ricardo, and Jon Ortiz de Urbina. “Focalization.” In José I. Hualde and Jon Ortiz de Urbina
(eds.), A Grammar of Basque, pp. 459-515. Berlin: Mouton de Gruyter, 2003.
Frota, Sónia. Prosody and Focus in European Portuguese. University of Lisbon: Doctoral dissertation,
1998 [Published by Garland in 2000].
Frota, Sónia. Review of Intonation, Word Order and Focus Projection in Serbo-Croatian (Godjevac
2000). Glot International 6 (2002): 251-256.
Godjevac, Svetlana. Intonation, Word Order and Focus Projection in Serbo-Croatian. Doctoral
Dissertation, Ohio State University, 2000.
Haraguchi, Shosuke. A Theory of Stress and Accent. Dordrecht: Foris, 1991.
Hidalgo, Bittor. Hitz Ordenaren Estatistikak Euskaraz. Doctoral dissertation, University of the Basque
Country, 1994.
Horvath, Julia. Focus in the Theory of Grammar and the Syntax of Hungarian. Dordrecht: Foris, 1986.
Hualde, José I. Euskararen Azentuerak. Bilbao: Servicio Editorial de la Universidad del País Vasco,
1997.
Hualde, José I. “Basque Accentuation.” In Harry van der Hulst (ed.), Word Prosodic Systems in the
Languages of Europe, pp. 947-993. Berlin: Mouton de Gruyter, 1999.
Hualde, José I. “On System-Driven Sound Change: Accent Shift in Markina Basque.” Lingua 110 (2000):
99-129.
Hualde, José I., Gorka Elordieta and Arantzazu Elordieta. The Basque Dialect of Lekeitio. Bilbao and San
Sebastián: Servicio Editorial de la Universidad del País Vasco, 1994.
Hualde, José I., Gorka Elordieta, Iñaki Gaminde and Rajka Smiljanic. “From Pitch-Accent to Stress-
Accent in Basque.” In Carlos Gussenhoven and Natasha Warner (eds.), Papers in Laboratory
Phonology VII, pp. 557-584. Berlin: Mouton de Gruyter, 2002.
Inkelas, Sharon, and William Leben. “Where Phonology and Phonetics Intersect: The case of Hausa
Intonation.” In John Kingston and Mary Beckman (eds.), Papers in Laboratory Phonology I, pp.
17-34. Cambridge: Cambridge University Press, 1990.
Ito, Kiwako, Gorka Elordieta, and José I. Hualde. “Peak alignment and intonational change in Basque.”
Proceedings of the 15th International Congress of Phonetic Sciences, pp. 2929-2932. Barcelona, Spain,
2003.
Jun, Sun-Ah, and Gorka Elordieta. “Intonational Structure of Lekeitio Basque.” In Antonis Botinis,
Georgios Kouroupetroglou and George Carayiannis (eds.), Intonation: Theory, Models and
Applications, pp. 193-196. Proceedings of an ESCA Workshop. Athens, Greece, 1997.
Kiss, Katalin É. “Introduction.” In Katalin É. Kiss (ed.), Discourse Configurational Languages, pp. 3-27.
New York, Oxford: Oxford University Press, 1995.
Kiss, Katalin É. “Identificational Focus Versus Information Focus.” Language 74 (1998): 245-273.
Kubozono, Haruo. The Organization of Japanese Prosody. Tokyo: Kurosio, 1993.
Ladd, Robert D. The Structure of Intonational Meaning: Evidence from English. Bloomington, Indiana:
Indiana University Linguistics Club, 1980.
Ortiz de Urbina, Jon. Parameters in the Grammar of Basque. Dordrecht: Foris, 1989.
Ortiz de Urbina, Jon. “Focus in Basque.” In Georges Rebuschi and Laurice Tuller (eds.), The Grammar of
Focus, pp. 311-333. Amsterdam and Philadelphia: John Benjamins, 1999.
Pierrehumbert, Janet, and Mary Beckman. Japanese Tone Structure. Cambridge, Mass.: MIT Press, 1988.
Poser, William. The Phonetics and Phonology of Tone and Intonation in Japanese. Doctoral Dissertation,
MIT, 1984.
Reinhart, Tanya. “Interface Strategies.” Manuscript, Utrecht University, 1995.
Reinhart, Tanya, and Ad Neeleman. “Scrambling and the PF Interface.” In W. Gueder and Myriam Butt
(eds.), Projecting from the Lexicon. Stanford: CSLI Publications, 1998.
Rialland, Annie, and Stéphanie Robert. “The Intonational System of Wolof.” Linguistics 39 (2001):
893-939.
Selkirk, Elisabeth. “Sentence Prosody: Intonation, Stress, and Phrasing.” In John Goldsmith (ed.), The
Handbook of Phonological Theory, pp. 550-569. Cambridge: Blackwell Publishers, 1995.
Uriagereka, Juan. “An F Position in Western Romance.” In Katalin É. Kiss (ed.), Discourse
Configurational Languages, pp. 153-175. Oxford: Oxford University Press, 1995.
Vallduví, Enric. The Informational Component. University of Pennsylvania: Doctoral dissertation, 1990.
Vogel, Irene, and István Kenesei. “The Interface between Phonology and Other Components of
Grammar: The Case of Hungarian.” Phonology Yearbook 4 (1987): 243-263.
Vogel, Irene, and István Kenesei. “Syntax and Semantics in Phonology.” In Sharon Inkelas and Draga
Zec (eds.), The Phonology-Syntax Connection, pp. 365-378. Chicago: University of Chicago Press,
1990.
Zubizarreta, María Luisa. Prosody, Focus and Word Order. Cambridge, Mass.: MIT Press, 1998.
ARDIS ESCHENBERG

POLISH NARROW FOCUS CONSTRUCTIONS

1. INTRODUCTION1
Polish, a western Slavic language, is a so-called ‘free word order’ or ‘scrambling’
language. SVO ordering has been posited to be basic for Polish (Szober 1963), and a
study by Klemensiewicz (1949) found the majority of isolated sentences to conform
to this ordering. However, other constituent orders are still common.
Variations in word order have often been explained in terms of information
structure (Szwedek 1976; Willim 1989), as well as constituent length (Siewerska
1993). However, a single word order can occur with various types of information
structure (Eschenberg 1999). In such cases, prosody may provide a way to
distinguish between the differing information structure types. Analyses which rely
on textual data or fail to consider prosody will be unable to account for cases where
one word order is used for differing information structures.
This paper explores Polish constructions involving focus on a single constituent,
narrow focus constructions. Not only word order but also intonation, particularly
sentence stress, is considered. First, declarative sentences are examined; then,
wh-questions are addressed. Word order alone cannot be used to account for narrow
focus in Polish; prosody is crucial. Failure to consider prosody will be seen to cause
confusion between construction types. Differences in word order will be shown to
be motivated by different types of presupposition, as proposed by Dryer (1996). A
more restricted definition of focus type offered by Kiss (1998) will be seen to apply
in this situation.

2. FOCUS AND SYNTACTIC CONSTITUENTS: NARROW FOCUS

CONSTRUCTIONS

2.1. Theoretical Background

Analyses of Polish focus structure consistently refer to the syntactic structure of
clauses (Szober 1963, Szwedek 1976; Willim 1989, Siewerska 1993, Eschenberg
1999). Lambrecht (1994) bases his theory of information structure on the syntactic
notions of predicate, argument and sentence, which also have semantic
underpinnings. His concepts of predicate focus, argument focus and sentence focus
‘evoke both differences in syntactic focus domains such as VP, NP, PP, S, and

23
C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 23–40.
© 2007 Springer.
differences in the focus portions of the pragmatically structured proposition, i.e.
predicate, argument, and sentence (222).’ This captures the generalization that
focus, a primarily pragmatic concept, tends to be associated with constituents which
are syntactic in nature. In this theory, the syntactic domain which expresses the
focus component of the pragmatically structured proposition is the focus domain of
the proposition. Thus, for sentence focus constructions, the focus domain is the
sentence, for predicate focus constructions it is the VP, and for argument focus
constructions it is an NP or PP.
Focus constructions must specify not only the focus domain, but also
presupposition, assertion, and, most obviously, focus. The presupposition is the set
of lexico-grammatically evoked propositions the speaker assumes the hearer knows,
believes, or will take for granted at the time of the utterance (52). The assertion is
the proposition expressed by a sentence which the hearer is expected to know or
believe or take for granted as a result of hearing the sentence uttered (52). The focus
of the assertion is the semantic component of a pragmatically structured proposition
whereby the assertion differs from the presupposition (213). The following provides
an example of these concepts used in an argument focus construction (from
Lambrecht 1994: 228, 5.11’):

Sentence: My CAR broke down.

Presupposition: “speaker’s x broke down”
Assertion: “x = car”
Focus: “car”
Focus domain: NP
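
The four components above can be written down as a small record. The following is only an illustrative sketch in Python, not part of Lambrecht's formalism: the class name, the field names and the method focus_type are hypothetical, and the mapping from focus domain to focus type simply restates the correspondence given in the text (S for sentence focus, VP for predicate focus, NP or PP for argument focus).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FocusStructure:
    """Hypothetical record of the components named in the text."""
    sentence: str
    presupposition: str  # proposition assumed to be shared
    assertion: str       # what the hearer learns from the utterance
    focus: str           # where assertion and presupposition differ
    focus_domain: str    # syntactic category: "S", "VP", "NP" or "PP"

    def focus_type(self) -> str:
        # Restates the domain/type correspondence given in the text.
        if self.focus_domain == "S":
            return "sentence focus"
        if self.focus_domain == "VP":
            return "predicate focus"
        if self.focus_domain in {"NP", "PP"}:
            return "argument focus"
        raise ValueError(f"unexpected focus domain: {self.focus_domain}")


# The example sentence from Lambrecht (1994: 228) cited above.
example = FocusStructure(
    sentence="My CAR broke down.",
    presupposition="speaker's x broke down",
    assertion="x = car",
    focus="car",
    focus_domain="NP",
)
print(example.focus_type())
```

The record makes explicit that the focus type is read off the syntactic focus domain, while presupposition and assertion are stated separately.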

This paper explores argument focus, which has the communicative function of
identifying a referent. Argument focus has also been called ‘narrow focus’ (Van
Valin & LaPolla 1997) and occurs when one constituent is focal. The term narrow
focus captures the fact that this constituent may be not only an actual argument
(subject, object, indirect object), but also an oblique NP or PP or a nucleus (V).
Narrow focus can be further divided into marked and unmarked narrow focus where
unmarked narrow focus occurs when the focal constituent occurs in the unmarked
focus position in the sentence for the given language. For example, the final
position2 is the unmarked focus position for English. Thus, as English is SVO,
objects, which occur finally, are unmarked for narrow focus.
Similarly, Polish has a final focus position which is unmarked. The following
section will examine narrow focus in Polish, beginning with marked narrow focus
on the subject and continuing to unmarked narrow focus on the object.

2.2. Narrow Focus in Polish

Narrow focus can be elicited through the use of a wh-question calling for an
argument filler. When replying to a wh-question asking about a subject, narrow
focus is placed on the subject in the reply sentence. In such a response where the
subject is focal, two possible replies are felicitous:
POLISH NARROW FOCUS CONSTRUCTIONS 25

(1) Q: Kto śpiewał? ‘Who sang?’

A: a. PIOTR śpiewa-ł-Ø.
Peter.NOM sing-PAST-3Msg
b. Śpiewał PIOTR.
‘PETER sang.’

Both SV (1a) and VS (1b) ordered sentences are felicitous replies placing narrow
focus on the subject.3 In each case, the subject is prosodically marked, receiving
intonational prominence.
Similarly, an answer containing (unmarked) narrow focus on an object can be
ordered in two ways, where in each case the object is intonationally prominent.

(2) Q: Co kupi-ł-eś? ‘What did you buy?’

A: a. SAMOCHÓD kupi-ł-em.
car.ACC buy-PAST-1Msg
b. Kupi-ł-em SAMOCHÓD.
‘I bought a car.’

In (2a) the object is sentence initial and prominent, and in (2b) the object is sentence
final and prominent.
Although the above example (2) did not involve an overt subject, a similar
situation arises when an overt subject is present (3).

(3) Q: Kogo Jan kocha? ‘Who does Jan love?’

A: a. Jan kocha MARI-Ę.
John.NOM love.3sg.PRES Maria-ACC
b. Jan MARI-Ę kocha.
c. # MARI-Ę kocha Jan.
d. ? MARI-Ę Jan kocha.
‘John loves MARY.’

Again, the object can occur in its canonical position, sentence finally (a). It can also
occur pre-verbally after the subject (b), but is less felicitous sentence initially (c,d).
Note that in each of the above, while the word order changes, the pitch accent
placed upon the focal constituent is similar. This can be seen in a pitch curve, such
as in Figure 1.


Figure 1. Comparison of pitch curves for (a) SVO and (b) SOV ordered sentences.

In Figure 1(a), the final focal object begins at 6.6 seconds as the pitch curve rises
and continues until the end of the sentence. In (b) the medial focal object again
begins on the ascent of the curve and continues through its peak and descent. The
final constituent in each case is lengthened. Therefore, the curve associated with the
final object is lengthened compared to the medial object's curve. However, the
general shape and range in hertz associated with the focal object is similar for both
the final and medial focal objects.
Error correction paradigms provide another way to elicit narrow focus
constructions, yielding similar results to wh-question elicitation (4).

(4) Q: Jan kocha Kasię. ‘Jan loves Kasha.’

A: a. Nie, Jan kocha MARI-Ę.
No, John.NOM love.3sg.PRES Maria-ACC
b. Nie, Jan MARI-Ę kocha.
c. ?Nie, MARI-Ę kocha Jan.
d. # Nie, MARI-Ę Jan kocha.
‘No, John loves MARY.’

The error correction paradigm in (4) provides the same grammaticality judgments
and intonational contours as the similar wh-question paradigm in (3). This can be
seen in a comparison of the plots of the pitch curves as well (Figure 2).

Figure 2. Pitch curves of (a) wh-question and (b) error correction paradigm responses:
Jan kocha MARIĘ.

Thus, for both error corrections and replies to wh-questions, variable word
orderings exist in Polish. Subjects can occur initially or finally, and objects can
occur pre-verbally or finally. In all cases the focal argument receives prosodic
prominence.

3. PREVIOUS ANALYSES
Variability in Polish word order is not a newly discovered phenomenon. Indeed, as
with many Slavic languages, Polish has been studied extensively by Prague School
linguists, who call the principles underlying the flexibility in word order the
“functional sentence perspective (FSP).” To describe how information is distributed
in a sentence, that is to give the information structure of sentences, Mathesius (1929:
127) divides the parts of an utterance into “theme” and “rheme.” The theme is what
“one is talking about, the topic,” and the rheme is “what one says about it, the
comment” (Daneš 1970: 134). These have also been explained as a distinction
between new information, rheme, and given information, theme.
Using the latter interpretation, Szwedek (1976: 51) states that it is “not true that
order of sentence elements in Polish is free or is a matter of style,” but that it is
“strictly determined” and “reflects the organization of the utterance according to the
new/given information distribution which, of course, is dependent on the context
and situation.” Thus, for the above, both the focal object and focal subject are
predicted to come last due to this organization. Szwedek notes that canonically
ordered, in-situ focus (SVO) is more colloquial or conversational for focal subjects
than focus final placement (VOS). He does not discuss SOV constructions.
Variation in word-ordering has also been studied by linguists from other schools
of thought. Willim (1989: 38) notes that subjects are often introduced into discourse
in final position. She calls these VOS ordered sentences ‘presentational.’ However,
she does not note which argument in these presentational sentences is prosodically
prominent, and, thus, it is difficult to apply her analysis to the above. In her analysis,
OSV ordering with a prosodically prominent object is a case of ‘topicalization,’
where the object is focal and non-presupposed (122-3). Like Szwedek, she also does
not discuss SOV constructions. Neither of these analyses completely accounts for all
of the variation seen in (1-4).

4. EFFECTS OF PRESUPPOSITION

4.1. Theoretical background

Although the alternative felicitous word orderings in the above examples could be
strictly equivalent variations, this is not necessarily so. Just as the above Polish
wh-questions can be replied to using two different constructions, English wh-
questions can also be felicitously answered by two different constructions. Dryer
(1996) notes that both simple focus (SVO) sentences and cleft sentences can serve
as answers to English wh-questions (5, adapted from Dryer 1996: 486).


(5) Q: Who saw John?

A: a. MARY saw John.
b. It was MARY who saw John.

Both the simple focus sentence (5a) and the cleft sentence (5b) felicitously
answer the wh-question. The speaker of the wh-question believes that someone saw
John and is asking who that person is. Lambrecht (1994: 283) notes that the speakers
of wh-questions typically presuppose that there is an answer which fulfills the
question. He states that one does not normally ask questions one does not expect
answers to. However, the replies to the wh-question do not necessarily contain this
presupposition. Dryer (1996: 188), following Rochemont (1986: 130), claims that
cleft constructions necessarily contain this pragmatic presupposition but simple
focus sentences do not. In the above, (5b) necessarily presupposes that someone saw
John but (5a) does not. The following provides an overview of Dryer’s arguments as
relevant to this paper.
The presuppositional content of the replies becomes apparent in situations where
the question does not contain such a pragmatic presupposition. Example (6), adapted
from Dryer (1996: 510), provides an example of a question where the speaker does
not assume that someone did in fact see John.

(6) Q: Did anyone see John?

A: a. MARY saw John.
b. #It was MARY that saw John.

Note that only the simple focus sentence (6a) is a felicitous reply to the question
when there is no presupposition that someone saw John. The cleft cannot be used as
a felicitous reply (6b) because it inherently presupposes that someone did see John,
and the question does not presuppose this.
Although the questions in (5) and (6) differ for presupposition, both activate the
proposition ‘someone saw John.’ In (5), the first speaker believes this proposition to
be true; s/he presupposes it. In (6), the speaker does not have such a belief.
Therefore, the presupposition cannot be part of the common ground between the two
speakers.
When the presupposition of the answer is negated in the reply, the cleft cannot
occur (7).

(7) Q: Who saw John?

A: a. NOBODY saw John.
b. #It was NOBODY that saw John.

The answer (7a) does not presuppose that someone saw John. In fact, it asserts just
the opposite, that no one saw John. The cleft cannot felicitously assert this because
the cleft contains a presupposition that someone saw John (Rochemont
1986: 130, Dryer 1996: 188). Thus, while clefts inherently contain pragmatic
presupposition, simple focus sentence answers do not.

4.2. Effects of presupposition in Polish

Turning to Polish, all the paradigms presented so far elicit replies which may
contain pragmatic presupposition. For the wh-question, one argument of the
presupposed proposition is not known, but assumed to exist. For the error correction
paradigm, the argument is incorrectly assumed and must be corrected. (8) provides
an example in Polish of a question which does not contain such a pragmatic
presupposition.

(8) Q: Czy ktoś śpiewał? ‘Did anyone sing?’

A: a. PIOTR śpiewa-ł-Ø.
Peter.NOM sing-PAST-3Msg
b. #Śpiewał PIOTR.
c. ŚPIEWAŁ Piotr.
‘Peter sang.’

Similar to (6), the question in (8) does not contain the presupposition that someone
actually sang. The speaker of the question has activated the proposition ‘someone
sang,’ but does not necessarily believe it to be true. The SV ordered sentence with
focus on the subject is felicitous (8a). It contains but does not presuppose the
proposition that someone sang. However, VS ordering is not grammatical if
prosodic prominence is placed on the subject (8b). Behaving analogously to an
English cleft construction, the VS construction presupposes that someone sang and
cannot felicitously answer a question which does not contain such a presupposition.
To use the VS construction would entail that the presupposition is part of the
common ground between speakers, but the question shows that it is not. The VS
ordering is felicitous if the sentential stress is perceived to be on the verb (8c). This,
however, is not a case of narrow focus on just the subject; rather, the entire
sentence is in focus. Indeed, in a spectrogram, the pitch curve actually shows stress
on both the verb and the subject in such a construction. While (8a) places narrow
focus on Piotr, (8c) places focus on the entire proposition. Both necessarily assert that
someone sang as the question does not presuppose this. However, one focuses on the
actor, entailing the event, and the other focuses on the entire event.
Narrow focus on subjects can occur with either sentence initial or sentence final
subjects (1). However, sentence final subjects contain pragmatic presupposition
which their sentence initially placed counterparts do not (8). Focal objects have also
been seen to occur both initially and finally (2). The following explores the effects
of presupposition on object word order (9).


(9) Q: Czy Jan kocha kogoś? ‘Does John love anyone?’

A: a. Jan kocha MARI-Ę.
John.NOM love.PRES.3sg Mary-ACC
b. #Jan MARI-Ę kocha.
‘Jan loves MARY.’

In example (9), the reply to a question with object focus but no pragmatic
presupposition felicitously occurs only with canonical ordering (SVO), as in (9a).
SOV ordering, similar to a cleft construction in English, cannot felicitously answer
the question.
Thus, it can be seen that non-canonical word orderings with prosodic
prominence on an argument entail pragmatic presupposition. Canonically ordered
SVO sentences with prosodic prominence on an argument do not entail such a
presupposition. Without pragmatic presupposition, focus must occur in-situ, that is,
the word ordering must be SVO. In all constructions, the focal constituent receives
prosodic prominence.
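
The generalization just stated can be summarized as a single condition. The sketch below is a hedged illustration in Python, not a full model of Polish: the function name and its two boolean parameters are hypothetical, and it covers only the SV/VS and SVO/SOV patterns in (1), (8a, b) and (9), where prosodic prominence falls on the focal argument. It says nothing about the object-initial orders in (3c, d) or about (8c), where the verb is stressed.

```python
def is_felicitous(canonical_order: bool, presupposition_shared: bool) -> bool:
    """Felicity of a reply with prosodic prominence on the focal argument.

    canonical_order: True for in-situ focus (SV, SVO); False for VS or SOV.
    presupposition_shared: True if the relevant proposition (e.g. 'someone
    sang') is believed by both speakers, as after a wh-question.
    """
    # Non-canonical order requires the presupposition; canonical order
    # is compatible with its presence or absence.
    return canonical_order or presupposition_shared


# (1a) SV and (1b) VS after 'Who sang?' (presupposition shared)
print(is_felicitous(True, True), is_felicitous(False, True))
# (8a) SV and (8b) VS after 'Did anyone sing?' (no shared presupposition)
print(is_felicitous(True, False), is_felicitous(False, False))
```

Only the combination of non-canonical order and no shared presupposition is excluded, which matches the judgements in (8b) and (9b).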
Examples such as (8) and (9) necessarily lead to a revision of Lambrecht’s
formulations of assertion, presupposition and focus. Presupposition is not simply the
set of lexico-grammatically evoked propositions the speaker assumes the hearer
knows, believes, or will take for granted at the time of the utterance (Lambrecht
1994: 52). Rather, it is only the set of propositions that the speaker assumes the
hearer believes at the time of the utterance. His definition of assertion as the
proposition expressed by a sentence which the hearer is expected to know or believe
or take for granted as a result of hearing the sentence uttered still holds true (52).
However, the focus can no longer be defined as the semantic component of a
pragmatically structured proposition whereby the assertion differs from the
presupposition (213). Both (8a) and (8c) assert that someone sang and that Piotr is the
person who sang. Neither contains a presupposition about the beliefs of the hearer.
However, the focus in these two constructions is not the same. In (8a), the focus is
the argument ‘Piotr’ and in (8c) it is the entire sentence. Focus is determined not by
subtracting presupposition from assertion but rather by prosody.

5. IDENTIFICATIONAL AND INFORMATIONAL FOCUS

5.1. The phenomenon

Somewhat similar to Dryer (1996), Kiss (1998) also distinguishes between two
types of focus, ‘identificational focus’ and ‘informational focus,’ using
presupposition. Identificational focus conveys the exhaustive subset of a set of
contextually or situationally given elements for which the predicate phrase holds.
Informational focus conveys new, non-presupposed information (245-6).
Informational focus is not associated with movement, and, although all sentences contain
information focus, not all contain identificational focus (246).

Using tests developed by Szabolcsi (1981) and Farkas (p.c. to Kiss 1998), Kiss
demonstrates that identificational focus expresses exhaustive identification in
Hungarian pre-verbal focus constructions and in English cleft sentences. One test
involves a pair of sentences where the first contains two coordinated objects and the
second contains only one of the two objects. If the second sentence involves
exhaustive identification, it cannot be a logical entailment of the first. That is, if the
second sentence expresses exhaustive identification, it contradicts the first. The
following provides such a test in Polish using both canonical and non-canonical
word order.

(10) A: Jan kupi-ł-Ø CHLEB i MASŁO.
Jan.NOM buy-PAST-3Msg bread.ACC and butter.ACC
‘Jan bought BREAD and BUTTER.’
B: On kupi-ł-Ø MASŁO.
he buy-PAST-3Msg butter.ACC
‘He bought BUTTER.’

(11) A: Jan CHLEB i MASŁO kupi-ł-Ø.
Jan.NOM bread.ACC and butter.ACC buy-PAST-3Msg
‘It was BREAD and BUTTER Jan bought.’
B: On MASŁO kupi-ł-Ø.
he butter.ACC buy-PAST-3Msg
‘It was BUTTER he bought.’

While (10B) is a logical consequence of (10A), (11B) is not a logical consequence
of (11A). (11B) contradicts (11A) as (11B) asserts an exhaustive set which is not
equal to the exhaustive set of (11A). Thus, SOV sentences in Polish are instances of
identificational focus and SVO sentences are instances of informational focus.
Kiss’s prediction that informational focus is not associated with movement (or
non-canonical word ordering) is thus upheld.
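
The logic of the coordination test can be made concrete with sets. The following Python sketch is only an illustration under a simplifying assumption: each sentence is reduced to the set of things said to have been bought, and the function name and the exhaustive flag are hypothetical labels, not terms from Kiss (1998).

```python
def second_follows_from_first(first_items, second_items, exhaustive):
    """Coordination test: does the second sentence follow from the first?

    exhaustive=False models informational focus: each sentence only
    claims that the named items were bought.
    exhaustive=True models identificational focus: each sentence claims
    that the named items are all that was bought.
    """
    first, second = set(first_items), set(second_items)
    if exhaustive:
        # Two exhaustive claims are compatible only if the sets are equal.
        return first == second
    # A non-exhaustive claim follows if it names a subset of the first.
    return second <= first


# (10): SVO order, informational focus
print(second_follows_from_first({"chleb", "masło"}, {"masło"}, exhaustive=False))
# (11): SOV order, identificational focus
print(second_follows_from_first({"chleb", "masło"}, {"masło"}, exhaustive=True))
```

Under these assumptions the first call yields True, corresponding to (10B) following from (10A), and the second yields False, corresponding to (11B) contradicting (11A).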
Kiss also shows that the identificational focus position in Hungarian is not
available for certain types of constituents, such as ‘also’ phrases, ‘even’ phrases and
the existential quantifiers ‘somebody/something’ (251). This also proves true for
pre-verbal focal objects in Polish (12).

(12) a. #Jan TEŻ SWETR kupi-ł-Ø.
Jan.NOM also sweater.ACC buy-PAST-3Msg
*‘It was ALSO A SWEATER Jan bought.’
b. #Jan NAWET SWETR kupi-ł-Ø.
Jan.NOM even sweater.ACC buy-PAST-3Msg
*‘It was EVEN A SWEATER Jan bought.’
c. #Jan COŚ kupi-ł-Ø.
Jan.NOM something.ACC buy-PAST-3Msg
*‘It was SOMETHING Jan bought.’


The preverbal focal object placement is not felicitous for ‘also’ phrases (12a),
‘even’ phrases (12b) and an existential quantifier (12c). All of these constructions
are possible for final objects (13).

(13) a. Jan kupi-ł-Ø TEŻ SWETR.
Jan.NOM buy-PAST-3Msg also sweater.ACC
‘4Jan bought ALSO A SWEATER.’
b. Jan kupi-ł-Ø NAWET SWETR.
Jan.NOM buy-PAST-3Msg even sweater.ACC
‘Jan bought EVEN A SWEATER.’
c. Jan kupi-ł-Ø COŚ.
Jan.NOM buy-PAST-3Msg something.ACC
‘Jan bought SOMETHING.’

Whereas focal objects placed non-canonically were not felicitous for such phrases,
focal objects in-situ (clause final) are felicitous for ‘also’ phrases (13a), ‘even’
phrases (13b), and an existential quantifier (13c). Thus, the identificational focus
constructions are not felicitous, but the informational focus constructions, which are
not associated with movement, are felicitous in these examples.
In the analyses in sections 2 and 4, both focal subjects and objects were found to
behave in similar ways based on in-situ versus non-canonical word ordering and
focus. Although Kiss does not explore subjects, a thorough investigation of the
Polish phenomena presented thus far requires such an examination. The following
presents sentences similar to (12, 13) involving focal subjects rather than focal
objects.

(14) a. MARIA TEŻ śpiewa-ł-a.
Maria.NOM also sing-PAST-3Fsg
a’. Śpiewała TEŻ MARIA.
‘MARIA ALSO sang.’
b. NAWET JAN śpiewa-ł-Ø.
even Jan.NOM sing-PAST-3Msg
b’. Śpiewał NAWET JAN.
‘EVEN JAN sang.’
c. #KTOŚ śpiewa-ł-Ø.
someone.NOM sing-PAST-3Msg
c’. Śpiewał KTOŚ.
‘SOMEONE sang.’

‘Also’ phrases (14a, 14a’) and ‘even’ phrases (14b, 14b’) are felicitous for focal
subjects regardless of whether the subject is placed initially or finally. Although in
such constructions focal objects could only occur in the canonical position of
informational focus (13), focal subjects can occur in canonical or non-canonical
positions. However, focal existential quantifiers are not felicitous in initial position
(14c) but are felicitous in final position (14c’).

Although the felicity judgements of (14c, 14c’) seem very odd considering the
results seen earlier, Kiss notes that existential quantifiers cannot function as either
identificational or informational focus. Thus, (14c’) must be a different type of
construction; it cannot be an identificationally focused final subject as in (1b).
Indeed, it is a presentative with a pitch accent on the introduced element ktoś. This is
an example of the VOS ordered sentences Willim (1989: 38) refers to. These
constructions introduce a new element rather than providing the contrastive reading
(section 3) of identificational focus due to exhaustive identification. Here, rather
than exhaustive identification, a constituent is introduced.
Similarly, the non-canonically ordered subjects in (14a’) and (b’) are not
examples of identificational focus, but rather presentatives. In the earlier examples
(1, 2, 3, 4) informational focus and identificational focus constituents have similar
pitch accents but different word orderings. This is confirmed by both native speaker
judgment and spectrographic analysis (Figure 1). However, speakers do not judge the
SV ordered and VS ordered sentences in (14) to have the same pitch accents.
Whereas speakers state that in (14a) and (14b) the strongest pitch accent is on the
adverb (and a lesser pitch accent occurs on the noun4), they consistently judge
(14a’) and (14b’) to place the strongest pitch accent on the noun (and a lesser pitch
accent on the adverb). Spectrographic analysis confirms speaker judgments of
prosodic prominence (Figure 3).

Figure 3. Pitch curves of (14a) and (14a’), ‘also’ phrases with prosodically prominent
subjects.

In Figure 3, the highest points in the pitch curve differ for (a) and (b). In (a), the
highest point is over ‘also,’ but in (b) it is over ‘Maria.’ This confirms native
speaker judgements. Informational focus with a subject noun phrase results in
prosodic prominence on the adverb in ‘also’ and ‘even’ phrases. Speakers also judge
the strongest pitch accent to be on the adverb in such constructions when the object
is focal (13).


That (14a’) and (14b’) are not identificational focus is further supported by the
fact that their pitch curves differ from clear examples of subject identificational focus
(Figure 4).

Figure 4. Comparison of pitch curves for a final focal subject (a) and presentational final
subject (b).

Whereas Maria begins when the pitch curve is already mid-ascent (4.5 sec.) in the
focal subject construction (a), it begins on the lowest point of the pitch curve (7.96
sec.) in the presentative construction. That is, a local minimum occurs in the pitch
curve well before the subject in (a) but coincides with the subject in (b). The fact
that the pitch curves are not identical is due to the fact that the VS sentences in
(14) are not instances of identificational focus, but rather are presentational
constructions. Thus, careful analysis of prosody can distinguish between sentence
final identificational focus subjects and sentence final presentational subjects.
Thus, in Kiss’ analysis, SV and SVO sentences are examples of informational
focus, while SOV and VOS sentences are instances of identificational focus.
Additionally, VOS sentences can occur as sentences involving introduction of a
constituent.
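
The prosodic criterion described for Figure 4 (whether the pitch minimum falls well before the final subject or coincides with its onset) can be sketched as a simple comparison. The Python code below is a hypothetical illustration only: the function name, the 0.05-second tolerance and the toy pitch values are assumptions made for the example and are not measurements from this study.

```python
def classify_final_subject(times, f0, subject_onset, tolerance=0.05):
    """Classify a VS sentence by where the pitch minimum falls.

    times, f0: equal-length lists of sample times (s) and pitch (Hz)
    covering the stretch leading into the final subject.
    subject_onset: time (s) at which the subject begins.
    tolerance: assumed window (s) within which the minimum counts as
    coinciding with the subject onset.
    """
    if not times or len(times) != len(f0):
        raise ValueError("times and f0 must be non-empty and equal in length")
    # Time of the lowest pitch sample in the contour.
    t_min = times[min(range(len(f0)), key=lambda i: f0[i])]
    if abs(t_min - subject_onset) <= tolerance:
        return "presentative"
    if t_min < subject_onset:
        return "identificational focus"
    return "unclear"


# Invented toy contours, for illustration only.
times = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
early_minimum = [180, 150, 170, 200, 230, 210]  # minimum well before onset
onset_minimum = [200, 190, 170, 150, 220, 200]  # minimum at subject onset
print(classify_final_subject(times, early_minimum, subject_onset=0.3))
print(classify_final_subject(times, onset_minimum, subject_onset=0.3))
```

A real analysis would need a pitch tracker and a principled choice of tolerance; the sketch only shows that the distinction drawn in the text reduces to the timing of the minimum relative to the subject.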

5.2. Identificational focus versus focus with pragmatic presupposition

Although both Dryer’s and Kiss’ analyses are able to distinguish between the
variations found in section 2, they are not necessarily identical. Both concur that
informational focus (simple focus in Dryer’s terms) conveys non-presupposed
information. However, whereas Dryer explicitly states that clefts contain pragmatic
presupposition that involves belief and not simply activation, Kiss states that
identificational focus may convey contextually or environmentally given elements.
Crucially for these constructions, Dryer examines the non-focus portion of the
sentence, while Kiss considers the focal portion. That is, Dryer concentrates on what
is presupposed by the sentence while Kiss considers what is asserted. While Dryer
notes that a cleft necessarily presupposes a proposition, Kiss notes that
identificational focus asserts all the variables that fulfil this proposition. For
example, for the sentence ‘it is Jan that sang,’ Dryer’s analysis shows that this
construction presupposes the proposition that someone sang. Kiss’ analysis shows
that Jan is the only person who sang. Thus, their insights are complementary.
Together, they yield a larger picture of this construction, giving both its
presupposition and assertion.
However, identificational focus does not always lead to an exhaustive set of
variables cross-linguistically. In languages such as Finnish, Kiss notes that
identificational focus may or may not be exhaustive (1998: 271). Thus, ultimately, a
[+exhaustive] feature must be noted to truly account for the phenomenon of
identificational focus (or focus with presupposition) in Polish.

6. RELATED PHENOMENA

6.1. Clitic pronouns

Further evidence supporting a distinction between identificational and informational
focus can be found in the Polish pronoun system. Polish object (accusative case)
pronouns have two forms for the second person singular. These are the long form
ciebie and the short form cię. Ciebie is used to give emphasis, to point out that it is
only you of all the possible people. This coincides with Kiss' identificational focus,
where the one person from a group is being pointed out.

(15) a. Ewa kocha cię.
Ewa.NOM loves you.ACC
‘Ewa loves you.’
b. Ewa (TYLKO) CIEBIE kocha.
Ewa.NOM only you.ACC loves
‘Ewa loves (ONLY) YOU.’
c. #Ewa cię kocha.
d. #Ewa kocha (TYLKO) CIEBIE.

Accordingly, use of ciebie coincides with the structure and intonation used for
identificational focus. It is placed in non-canonical position, pre-verbally, and given
prosodic stress (15b). It is less felicitous in the canonical (final) object position
reserved for informational focus (15d). Conversely, the non-presupposed cię occurs
most felicitously in canonical object position (15a) and less felicitously pre-verbally
(15c). This phenomenon further supports the above analysis of identificational
versus informational focus in Polish.


6.2. Wh-questions
In the literature, wh-questions are often assumed to be a type of narrow focus with
properties similar to non-wh focus. For example, Kiss (1998: 249) states that for
Hungarian, a wh-phrase other than ‘why’ is ‘always placed in the preverbal
identificational focus position…’ However, she notes that wh-questions can be
answered by identificational or informational focus. This leads to an ambiguity as to
whether wh-question words are a type of identificational focus or not.
Polish, however, provides clear evidence that wh-focus is not the same as
identificational focus in a declarative (16).

(16) a. KTO umar-ł-Ø?
who.NOM die-PAST-3Msg
‘Who died?’
b. *Umar-ł-Ø KTO?
c. UMAR-Ł-Ø kto?
‘Did anyone die?’

In (16a), the felicitous wh-question, the subject is both initial and focal. This is
similar to the informational focus position of a subject (14a,b). It is unlike
identificational focus subjects, which have been seen to occur finally (8). In (16b)
the focal subject is final and the resulting sentence is ungrammatical. Example (16c)
shows that a ‘wh’ subject can occur finally, but only when it is not prosodically
prominent, or focal. In such a case, it also does not receive a wh-reading. Unlike in
Hungarian, Polish focal wh-subjects are clearly not in the identificational focus
position.
The fact that (16c) does not have a wh-reading can be seen by looking at its
felicitous answers:

(17) Q: UMAR-Ł-Ø kto?

A: MARIA umar-ł-a.
Maria.NOM die-PAST-3Fsg
A’: UMAR-Ł-A Maria.
A”: #Umar-ł-a MARIA.
‘Maria died.’
A’”: ?MARIA.
‘Mary.’
A””: Nie.
‘No.’

Only answers which do not presuppose that someone did indeed die are felicitous,
such as (A) with canonical order and prosodic prominence on the subject
(informational focus). (A”), an example of identificational focus, has the pragmatic
presupposition that someone died and is not grammatical. The answer ‘no’ (A””) is
a felicitous reply here but would not be for the wh-question ‘who sang?’ This
paradigm proves different from an actual wh-question, such as (1), and, rather, is
similar to a question involving an indefinite pronoun (8). This focal wh-question/
non-focal indefinite pronoun patterning can also be seen in Siouan languages such
as Omaha and Lakhota, where words which function as wh-words when focal act as
indefinites when non-focal.
Just as focal wh-subjects occur initially (16), focal wh-objects also occur initially
(18):

(18) a. CO Jan kupi-ł-Ø?
what.ACC Jan.NOM buy-PAST-3Msg
b. *Jan kupi-ł-Ø CO?
‘What did Jan buy?’
c. KUPI-Ł-EŚ co?
buy-PAST-2Msg what.ACC
‘Did you buy anything?’

Similar to wh-subjects, wh-objects must occur initially and be prosodically accented
to receive a wh-reading (18a). Wh-objects in final position, which is the canonical,
informational focus position for non-wh-objects, cannot receive prosodic
prominence or a wh-word reading (18b). The wh-object word can occur finally but
in this case it is not prosodically prominent and functions as an indefinite and not a
wh-word (18c). Again, it can be seen that the grammatical wh-word placement is not
equivalent to the identificational focus position. Identificational focus objects are
placed pre-verbally but after the subject, SOV, as in (11). Here, the wh-word is
before the subject, WHSV, (18a).
Thus, it can be seen that wh-focus differs from non-wh-focus. It requires initial
position in the sentence, regardless of what type of argument the wh-word is.
Prosody importantly distinguishes grammatical and ungrammatical final placement
of a wh-word. Even grammatical sentence-final occurrence of a wh-word does not
yield a wh-reading and does not have prosodic prominence (focus) on the wh-word.

7. SUMMARY AND CONCLUSION

Word order and prosody intertwine to create different focus constructions in Polish.
An analysis based on only one or the other fails, as both are integral to Polish focus.
For narrow focus constructions, when word order differs, focal constituents can have
similar pitch accents (1, 2, Figure 1). In other constructions, the same word order
may involve differing prosody based on focus type (14). Failing to consider prosody
as well as word order results in an inability to draw relevant conclusions about the
felicity of word orders (8, 11, 14).
Also, it has been seen that narrow focus in Polish involves a finer
distinction than provided by a theory such as Lambrecht (1994), which is based on
syntactic constituenthood and semantic role. Under Lambrecht's theory, different
word orderings seem interchangeable (1,2). However, these differing word orderings
function in distinct ways (8, 9). In order to distinguish between constructions which
differ in word order but not prosodically prominent constituent (for example, 1a and
1b), Dryer’s notion of presupposition proves valuable (8, 9). Kiss’ definition of
identificational focus proves equally applicable (10,11). In both cases, a stipulation
that the construction provides exhaustive identification needs to be integrated.
In addition to refining the concept of narrow focus to include presupposition,
Kiss’ analysis provides that movement is not associated with
informational focus. Supporting this, in Polish informational focus occurs in-situ
(10, 14), while identificational focus is associated with non-canonical position (11,
13). Use of Polish clitics versus full pronouns provides additional evidence for the
distinction between informational and identificational focus (15).
However, Kiss’ observation that wh-words in Hungarian tend to occur in the
identificational focus position does not hold for Polish. Wh-word focus is a different
type of focus and behaves differently from focus in declaratives. Wh-word focus in
wh-questions entails placing the wh-word in initial position and giving it prosodic
prominence. This is true regardless of the argument type of the wh-word. Again,
accounting for prosody proves crucial, in that a wh-word that is not prosodically
prominent can occur sentence-finally. In this case, however, an indefinite reading
rather than a wh-reading is obtained.
Thus, Polish, as a flexible word order language, provides an ideal testing ground
for theories of focus. Just examining prosodic accent on single constituents leads to
evidence for identificational focus, informational focus, wh-question focus, and
presentatives. Word order and/or prosody can distinguish each; there are no overlaps
where two constructions are homophonous and only distinguishable through
context. Positing a focus position applicable regardless of semanticosyntactic roles
proves valid for wh-words, but not for other forms of narrow focus. The position of
constituents involved in presentatives, informational focus and identificational focus
is best explained as in-situ versus non-canonical position, rather than as fixed
positions. Table 1 provides a summary of the word orders and prosody involved for
the constructions examined in this paper.

Table 1. Different focus constructions in Polish and their syntactico-prosodic realization

Focus type                                         Polish manifestation

Narrow focus on subject, informational focus       SV(O)
Narrow focus on subject, identificational focus    V(O)S, pitch curve minimum before the beginning of S
Presentative subject construction                  V(O)S, pitch curve minimum at the beginning of S
Narrow focus on object, informational focus        SVO
Narrow focus on object, identificational focus     SOV
wh-question focus                                  WH(S)V(O)

Ardis Eschenberg
University at Buffalo
Nebraska Indian Community College

8. NOTES
1 I would like to thank Janina Aniszewska, Jolanta Łapat, Małgorzata Łapat, Czesław Prokopczyk, and
Piotr Szewczyk for their patience, teaching and insight into the Polish language. Any mistakes here
are the responsibility of the author, but all the truth obtained is due to the kindness of these consultants. I
would also like to thank Daniel Büring for his insightful comments.
2 Final position in the core, not the clause, where the core consists of the predicate and its arguments.
3 Bold underline represents prosodically accented constituent. Small caps are used to indicate sentence
stress in sample sentences.
4 The stronger pitch accent is indicated by bold small caps, while the lesser is in small caps.

9. REFERENCES
Daneš, František. “One instance of Prague school methodology: functional analysis of utterance and
text.” In Paul L. Garvin (ed.), Method and Theory in Linguistics. Paris: Mouton & Co, 1970.
Dryer, Matthew. “Focus, pragmatic presupposition, and activated propositions.” Journal of Pragmatics
26 (1996): 475-523.
Eschenberg, Ardis. Focus in Polish. M.A. thesis. University at Buffalo, 1999.
Kiss, Katalin. “Identificational versus information focus.” Language 74.2 (June 1998): 245-273.
Klemensiewicz, Zbigniew. Lokalizacja podmiotu i orzeczenia w zdaniach izolowanych. Biuletyn PTJ 9
(1949): 8-19.
Lambrecht, Knud. Information Structure and Sentence Form: a theory of topic, focus, and the mental
representations of discourse referents. New York: Cambridge University Press, 1994.
Mathesius, Vilem. Functional linguistics. In M. Mayenova, ed., O spojnosci tekstu, pp. 121-42. Warsaw:
1987.
Siewierska, Anna. “Syntactic weight vs. information structure and word order variation in Polish.”
Journal of Linguistics 29.2 (1993): 233-266.
Szabolcsi, Anna. “The semantics of topic-focus articulation.” In Jan Groenendijk, Theo Janssen, and
Martin Stokhof (eds.), Formal methods in the study of language, pp. 513-41. Amsterdam:
Mathematisch Centrum, 1981.
Szober, Stanisław. Gramatyka języka polskiego. Warsaw: PWN, 1963.
Szwedek, Aleksander. Word Order, Sentence Stress and Reference in English and Polish. Edmonton:
Linguistic Research, Inc, 1976.
Van Valin, Robert and Randy LaPolla. Syntax: Structure, meaning and function. New York:
Cambridge University Press, 1997.
Willim, Ewa. On word order: a government binding study of English and Polish. Krakow: Uniwersytet
Jagellonski, 1989.
DAVID GIL

INTONATION AND THEMATIC ROLES IN RIAU INDONESIAN*

1. INTRODUCTION
What kinds of meanings may be expressed by intonation? There is general
agreement that intonation may convey emotions, and, related to this, speakers’
attitudes towards the propositional content of utterances. It is also well-known that
certain intonation contours may be associated with specific speech acts such as
questions. Moreover, as reflected by the title of this volume, intonation may encode
various pragmatic functions such as topic and focus.
Another, rather more indirect way in which intonation may express meanings is
via its relationship to syntactic structure. In general, intonation contours parse an
utterance into intonation groups, which correspond closely, albeit not always
perfectly, to syntactic constituents. However, in many cases, a given string of words
may be associated with two or more different constituent structures, each of which
in turn is associated with a different meaning. In such cases, the different syntactic
structures and corresponding meanings may be reflected by different intonation
groups.
Nevertheless, the range of meanings expressible by intonation is highly
constrained. For example, no language has intonation contours which, when applied
to any sentence, add meanings such as past tense, ‘in the rain’, or ‘because John
came to the party’. Thus, a major goal of any theory of intonation must be to
determine the set of meanings potentially encodable by intonation in one or more
human languages.
This paper contributes to the above goal through the examination of one specific
semantic domain, namely thematic roles: actor, undergoer, goal and the like. Most
commonly, thematic roles are encoded with various morphosyntactic features,
typically some combination of word order, case marking and verbal agreement. One
might wonder whether there are any languages in which thematic roles can also be
expressed by means of intonation. This paper addresses the question through an
empirical examination of intonation and thematic roles in one particular language,
namely the Riau dialect of Indonesian. The results of the study are negative: no
evidence is found that might point towards any correlation between intonation and
thematic roles in Riau Indonesian. This, in turn, lends greater cogency
to the question of whether it is in fact possible in any language for thematic roles to be
encoded by intonation.

C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 41–68.
© 2007 Springer.

2. THE STATE OF THE ART

What is known with regard to the relationship between intonation and thematic
roles? At present, I am not familiar with a single study in the linguistic literature
showing the existence of a language in which intonation can be used to encode
thematic roles. An email query on the LINGTYP Discussion List (22 March 2001)
seeking references to such studies produced no clear cases. However, the email
query did reveal the presence of a common belief that languages in which intonation
may distinguish between thematic roles “ought to” exist; some potential examples
that were suggested include Hebrew, Persian, Russian and Italian.
In Hebrew, for example, if a number of morphosyntactic variables are set right,
it is possible to construct sentences exhibiting actor-undergoer ambiguities, such as
the following:

(1) Kelev radaf yeled

dog:M chase:PST:3:SG:M child:M
(i) ‘A dog chased a boy’
(ii) ‘A boy chased a dog’

Speakers of Hebrew occasionally claim that the two meanings can be distinguished
by intonation. But when asked how, they do not provide systematic answers. In
general, the most readily available interpretation is that in which the actor precedes
the undergoer, as in (1/i) above. In order to obtain the less readily available
interpretation, that in (1/ii), speakers of Hebrew sometimes offer a distinctive
intonation contour, involving greater pitch variation and greater duration for certain
syllables. However, when questioned, they will generally concede that even with the
distinctive intonation contour, the sentence can also be understood as in (1/i); and
then they will often admit that even with an ordinary intonation contour, the
sentence can also be understood as in (1/ii). Similar facts are reported also for Persian
and other Middle-Eastern languages by Stilo (1984, personal communication).
As suggested by the above, there would seem to be a rather striking mismatch
between the widespread conviction that intonation can be used to differentiate
between thematic roles, and the absence of any detailed empirical studies testing the
veracity of such claims. To the best of my knowledge, then, this paper represents the
first attempt to subject the possible relationship between intonation and thematic
roles to systematic empirical investigation.

3. RIAU INDONESIAN
Riau Indonesian is the variety of Indonesian spoken in informal situations by the
inhabitants of Riau province in east-central Sumatra. Riau Indonesian is quite
different from Standard Indonesian, familiar to many general linguists from a
substantial descriptive and theoretical literature.1
One of the most salient characteristics of Riau Indonesian is the absence of
obligatory morphosyntactic coding for a wide range of categories which play a
central role in the grammars of many other languages. In particular, there is no
obligatory morphosyntactic device for distinguishing thematic roles: word order is
flexible, and there is no case-marking or morphological agreement. Thus, in a
simple clause, a given expression denoting a participant in an activity could bear
any thematic role whatsoever with respect to that activity: it could be the actor or
the undergoer, or it could stand in any other semantic relationship that makes sense
in the given context. Indeed, it is only context that enables the hearer of such
utterances to interpret them in appropriate ways.
Below are some examples of Riau Indonesian sentences illustrating thematic role
indeterminacy. These examples, and all the Riau Indonesian examples that follow in
this paper, are from a corpus of naturalistic texts. As abstract sentences, each of the
following examples is indeterminate with respect to thematic roles; however, as
actual utterances, each is associated with a specific interpretation, as indicated in the
translation. Since the interpretation of the utterance is heavily context-dependent,
the context is also indicated, right above the translation, within square brackets.2

(2a) Beli aku laser, ‘kan

buy 1:SG laser Q
[Contemplating a shopping trip]
‘I’ll buy a laser, right’

(b) Beli nasi goreng aku

buy rice fry 1:SG
[Group of people decide they want to play cards; somebody tells
speaker to go out and buy some; speaker objects on the grounds that
it’s somebody else’s turn to go out]
‘I bought the fried rice’

(3a) Saya pakai kaca mata, Vid

1:SG use glass eye FAM|David
[Speaker putting on a new pair of glasses]
‘I’m wearing my glasses, David’

(b) Honda pakai abang Elly

motorcycle use elder.brother Elly
[Interlocutor tells speaker to go and buy food; speaker doesn’t budge;
interlocutor asks speaker why he isn’t going; speaker explains]
‘Elly’s using the motorcycle’

(4a) Si Pai aku usir

PERS Pai 1:SG send.away
[Complaining about his younger brother Pai, who won’t have
anything to do with him]
‘Pai sent me away’

(b) Abang dia sendiri dia

elder.brother 3 one-AG-stand 3

usir
send.away
[Complaining about his younger brother Pai, who won’t have
anything to do with him]
‘His very own brother he sent away’

In each of the above examples, a word denoting an activity is in boldface, and its
two associated participants are in italics. In (2) the activity word occurs before its
two participants, in (3) it occurs between them, and in (4) it occurs after them both.
Within each of the three sentence pairs, the activity word is the same; however, the
actor precedes the undergoer in the first sentence while following it in the second
sentence. Thus, in (2a) actor aku ‘I’ precedes undergoer laser ‘laser’ while in (2b)
actor aku ‘I’ follows undergoer nasi goreng ‘fried rice’; in (3a) actor saya ‘I’
precedes undergoer kaca mata ‘glasses’ while in (3b) actor abang Elly ‘Elly’
follows undergoer Honda ‘motorcycle’; and in (4a) actor si Pai ‘Pai’ precedes
undergoer aku ‘I’ while in (4b) actor dia ‘he’ follows undergoer abang dia sendiri
‘his very own brother’. Thus, each of the three sentence pairs constitutes a near
minimal pair illustrating the indeterminacy of thematic role assignment. Together,
sentences (2) - (4) show that in a basic sentence consisting of activity, actor and
undergoer, these three items may occur in any of the six possible orders. Similar
facts obtain also with respect to other thematic roles. Examples such as the above
occur frequently in the corpus; other similar examples are cited in Gil (1994:181,
1999:191-193, 2002b:246-249). Thus, sentences such as these point towards the
conclusion that in Riau Indonesian, grammar does not provide any obligatory
grammatical means for distinguishing between thematic roles.3
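
The claim that (2) - (4) jointly cover all six logically possible orders can be checked mechanically. The following minimal Python sketch is an illustration only: the labels V, A and U (activity, actor, undergoer) and the function name are assumptions introduced here, and the orders are read off the glosses above, setting aside the tag ’kan in (2a) and the vocative in (3a).

```python
from itertools import permutations

# Constituent order of each corpus example, read off the glosses in (2)-(4).
# V = activity word, A = actor, U = undergoer (labels are illustrative).
ATTESTED = {
    "2a": ("V", "A", "U"),  # beli aku laser
    "2b": ("V", "U", "A"),  # beli nasi goreng aku
    "3a": ("A", "V", "U"),  # saya pakai kaca mata
    "3b": ("U", "V", "A"),  # Honda pakai abang Elly
    "4a": ("A", "U", "V"),  # si Pai aku usir
    "4b": ("U", "A", "V"),  # abang dia sendiri dia usir
}


def unattested_orders(attested):
    """Return the logically possible orders that no example instantiates."""
    possible = set(permutations(("V", "A", "U")))
    return possible - set(attested.values())


# An empty result means every one of the six orders is exemplified.
print(sorted(unattested_orders(ATTESTED)))
```

An empty result corresponds to the statement in the text that activity, actor and undergoer may occur in any of the six possible orders.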
Given the kind of indeterminacy present in examples such as the above, it is only
natural to wonder whether intonation might play a role in differentiating between
various interpretations. In fact, practically every time I have presented examples
such as the above in lectures, somebody in the audience has asked whether it isn’t
perhaps the case that different interpretations involving different assignments of
thematic roles might be distinguishable by means of different intonation contours.
However, the answer to this question is a simple, straightforward ‘no’: intonation
does not and cannot differentiate between different assignments of thematic roles in
Riau Indonesian. Thus, for example, in sentences such as those in (2) - (4), there are
no systematic differences between the intonation contours of the (a) sentences, in
which the actor precedes the undergoer, and the (b) sentences, in which the actor
follows the undergoer.
Here the matter should rest, but unfortunately it does not always do so. Rather,
many scholars continue to hold steadfast to the belief that intonation must
distinguish thematic roles in Riau Indonesian, and in other varieties of Malay/
Indonesian. (Some of the possible reasons behind the persistence of this belief are
discussed in Gil 2003). However, not a single one of these scholars, when
challenged, has been able to formulate an explicit description of exactly how
intonation can be used to distinguish thematic roles, and to the best of my knowledge,
no such account appears anywhere in the linguistic literature on Malay/ Indonesian.
The closest to an explicit proposal that I have come across is perhaps the
following. (The claim is stated in my own words, and constitutes my interpretation
of one or two suggestions made by colleagues in informal discussions.) In general,
in Riau Indonesian, there is a significant tendency for undergoers to follow
activities, as in (2) and (3a) above. Accordingly, when undergoers precede
activities, as in (3b) and (4), this unusual word order is signalled by a pause
occurring right after the undergoer. Within a generative framework involving
movement, this generalization might be restated as follows: when an undergoer is
fronted to a higher position in the clause, a pause occurs between it and the clause
from which it was extracted. This “pause proposal” at least constitutes an explicit
hypothesis which can be tested against the facts. But as shown in Section 6
below, it is clearly false.4

4. TWO HYPOTHESES
So what needs to be done in order to finally put such claims to rest? Three methods
suggest themselves. First, one might use elicitation, and ask native speakers for their
judgements of sentences exhibiting various possible pairings of intonation contours
and thematic roles. Secondly, one might construct experiments, which would
present native speakers with various tasks requiring them to make use of
intonational cues in order to distinguish thematic roles. Thirdly, one might study
naturalistic corpora, and search for possible correlations between intonation
contours and thematic roles. While each of these three methods is in principle
equally valid, this study chooses to make use of the third method, involving
naturalistic corpora. The reasons for this choice are entirely practical. On the one
hand, elicitation and experiments are particularly problematical in the study of Riau
Indonesian. As a regional colloquial language variety, Riau Indonesian stands in a
basilect-to-acrolect relationship with Standard Indonesian. Put a speaker of Riau
Indonesian in what is perceived to be a learnèd setting such as an elicitation session
or a controlled experiment, and he or she is likely to switch to Standard Indonesian,
no matter how clearly and repeatedly the investigator has asked the speaker to use
“ordinary language”, that is to say, Riau Indonesian. On the other hand, in Riau
Indonesian an extensive naturalistic corpus is available, containing recordings of
speech from many different speakers in a variety of settings, including narrative and
conversational. Accordingly, the present study makes use of the third method,
examining a naturalistic corpus for possible correlations between intonation
contours and thematic roles.
Two specific hypotheses are examined:

(5a) Hypothesis A (existential):

For each sentence, there exists at least one intonation contour which
renders the sentence undifferentiated with respect to thematic roles.

(b) Hypothesis B (universal):

For each sentence, every available intonation contour renders the
sentence undifferentiated with respect to thematic roles.

Both of the above hypotheses negate the claim that intonation distinguishes
between thematic roles in Riau Indonesian. However, the second hypothesis is
stronger than the first: one can envisage a state of affairs in which the first
hypothesis holds but the second one fails, but not vice versa. As we shall see in
Section 6 below, the naturalistic corpus provides overwhelming support for the
weaker Hypothesis A, and substantial support for the stronger Hypothesis B.
Accordingly, the results of this study lead to the conclusion that intonation does not
differentiate thematic roles in Riau Indonesian.
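
The logical relationship between the two hypotheses can be made concrete with a small sketch. Everything in it is an assumption made for illustration: a sentence is modelled as a mapping from contour labels to the set of thematic-role readings available under that contour, and the contour names and reading labels are invented toy values, not corpus data.

```python
def hypothesis_a(readings_by_contour, all_readings):
    """Existential: at least one contour leaves every reading available."""
    return any(r == all_readings for r in readings_by_contour.values())


def hypothesis_b(readings_by_contour, all_readings):
    """Universal: every contour leaves every reading available."""
    return all(r == all_readings for r in readings_by_contour.values())


# Hypothetical toy sentence with two possible role assignments.
both = {"actor-first", "undergoer-first"}

# A state of affairs in which A holds but B fails: contour c2 forces one reading.
mixed = {"c1": both, "c2": {"undergoer-first"}}
print(hypothesis_a(mixed, both), hypothesis_b(mixed, both))

# A state of affairs in which both hold: no contour differentiates the roles.
neutral = {"c1": both, "c2": both}
print(hypothesis_a(neutral, both), hypothesis_b(neutral, both))
```

Provided a sentence has at least one intonation contour, any mapping that satisfies the universal check also satisfies the existential one, which is the sense in which Hypothesis B is the stronger of the two.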

5. BASIC SUPRASEGMENTAL PATTERNS

To be in a position to examine the Riau Indonesian naturalistic corpus for possible
correlations between intonation contours and thematic roles, it is first necessary to
describe the basic suprasegmental patterns and establish an inventory of the major
intonation contours available in the language.

5.1. Word Structure

In Riau Indonesian, as in most other languages, intonation contours interact with
word structure; hence, before going any further, it is first necessary to develop a
clear picture of word structure in Riau Indonesian.
Riau Indonesian is a strongly isolating language, with no inflectional
morphology, little derivational morphology and little compounding. However,
unlike the stereotypical isolating languages of mainland Southeast Asia, the typical
or canonical word in Riau Indonesian is bisyllabic.

The bisyllabic nature of the Riau Indonesian word raises the issue of word
stress. As observed by Tadmor (1999, 2000), word stress in Malay and Indonesian
presents a thorny problem, with different scholars often providing conflicting
descriptions. Thus, for example, van Ophuijsen (1915) claims that stress is on the
final syllable, Amran (1984:60) maintains that it is on the penultimate, while Kähler
(1956:37) asserts that it is either on the final syllable (if the penultimate is a schwa)
or on the penultimate (in all other cases). One possible source for these
discrepancies might be that different scholars are unwittingly describing different
regional and/or social varieties of Malay / Indonesian. Thus, Tadmor (1999, 2000)
shows a tendency for word stress in Malay / Indonesian to progress from final, in
the western parts of the archipelago, towards penultimate, in the eastern regions,
reflecting a similar progression in the local languages, which often constitute
substrates for the regional varieties of Malay/ Indonesian. Another possible source
for these inconsistencies could well be that Malay/ Indonesian has no word stress.
In such a case, the patterns that are being described may be present in the
investigator’s ear but not in the language itself, as is suggested by Goedemans and
van Zanten (to appear). Alternatively, the patterns described may be phonetically
real, but pertaining not to word stress but rather to intonational prominence, as is in
fact suggested in the continuation of this section. Indeed, for Riau Indonesian, I am
not familiar with any positive evidence supporting the existence of a privileged
syllable which could be characterized as the locus of word stress. In this sense, then,
Riau Indonesian may be appropriately characterized as lacking word stress.
Nevertheless, while Riau Indonesian words lack a privileged syllable, there is
strong evidence for the presence of a privileged bisyllabic unit, which may be
referred to as the core foot. As represented in (6) below, the core foot (F) consists
of two syllables (S), each of which consists in turn of an onset (O) plus a rhyme (R):

(6) The Core Foot:

                  F
             S         S
           O   R     O   R
           m   a     k   an          ‘eat’
                     m   i           ‘noodles’
      ke   p   i     t   ing         ‘crab’
           b   e     l   i     kan   ‘buy’
      di             c   at          ‘paint’

Most words, such as makan ‘eat’, are bisyllabic and thus coextensive with most
or all of the core foot. A few shorter monosyllabic words, such as mi ‘noodles’,
occupy only the second syllable of the foot, while a small number of longer words,
such as kepiting ‘crab’, occupy the entirety of the core foot plus additional space
preceding it. Clitics, when present, invariably occur outside of the core foot, either
after it, for example the end-point marker -kan in belikan ‘buy’, or before it, for
example the undergoer marker di- in dicat ‘paint’. The core foot is thus what
underlies the basic bisyllabic nature of Riau Indonesian words. However, the
existence of the core foot is also supported by a number of additional independent
phenomena.
One such phenomenon involves patterns of reduction in fast connected speech.
Typically, as shown in (7) below, material belonging to the core foot is retained,
while preceding material may undergo partial or complete deletion:

(7) Reduction in Fast Connected Speech:

                  F
             S         S
           O   R     O   R
      p    s   a     w   at    → [psawat] ~ [sawat] ‘airplane’
      tang k   e     r   ang   → [taŋkeraŋ] ~ [ŋkeraŋ] ~ [keraŋ] ‘[place name]’

Whereas the above phenomenon involves the contraction of overly long words, a
number of others involve the expansion of words that are too short to fill the core
foot.
One such phenomenon pertains to the personal marker si, which marks
expressions as constituting names of people:

(8) The Personal Marker “si” in Non-Vocative Expressions:

                  F
             S         S
           O   R     O   R
      si   t   o     p   an    → [sitopan] ~ [stopan] ~ [topan] ‘[name]’
           s   i     p   an    → [sipan], *[span], *[pan] ‘[name]’

Before bisyllabic names, such as Topan, the personal marker si is optional, and,
when present, it may undergo reduction of the kind exemplified in (7); this is shown
in the first line underneath the tree diagram in (8) above. However, names also
possess a monosyllabic familiar form derived by truncation; for example Pan from
Topan.5 Often, this form is used vocatively; however, it is also used in non-vocative
functions, in which case the use of the personal marker si is obligatory; this is
shown in the second line in (8). Thus, one of the functions of the personal marker si
is to expand the monosyllabic familiar form of the name to fill the core foot.
A similar phenomenon involves words with what might be characterized as a
defective penultimate rhyme. For this purpose it is necessary to acknowledge the
existence of two subdialects of Riau Indonesian, which may be referred to as the
schwa dialect and the schwaless dialect respectively. In the former dialect,
the schwa ə is part of the phonemic inventory, though even in this dialect, it never
occurs in the final syllable. Of interest here however is the second, or schwaless
dialect, in which there is no phonemic schwa. Consider the way in which a word
containing a schwa in the schwa dialect, [bəsar] ‘big’, is realized in the schwaless
dialect:

(9) Spreading and Epenthesis:

                  F
             S         S
           O   R     O   R
           b         s   ar    → [bs̩ar] ~ [bəsar] ~ [besar] ‘big’

As shown above, realizations of the word in question involve a syllabic [s̩] (as
evidenced by the ways in which native speakers parse the sequence into syllables), a
phonetic schwa [ə], or a full mid-high front vowel [e] (phonetically identical to the
mid-high front vowel phoneme). This range of possibilities can be most
appropriately accounted for by positing a segmental melody bsar occupying the
core foot as per (9) above, with an empty penultimate rhyme position which is
subsequently filled either by backward spreading of the sibilant s or by epenthesis
of a schwa or full vowel. Thus, these phonological processes, spreading and
epenthesis, beef up an impoverished segmental melody, thereby enabling the word
to extend across the entire core foot.
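
The three ways of filling the empty penultimate rhyme can be sketched as a small function. This is a hedged illustration rather than a phonological analysis: the function name and the representation of the melody as three strings (first onset, second onset, final rhyme) are assumptions, the syllabic consonant is written with the IPA syllabicity mark, and the phonetic schwa is written ə.

```python
SYLLABIC = "\u0329"  # IPA combining mark for a syllabic consonant


def realizations(onset1, onset2, rhyme2):
    """Fill an empty penultimate rhyme in the three ways described in the text."""
    # Backward spreading: the second onset also serves as a syllabic nucleus.
    spread = onset1 + onset2 + SYLLABIC + rhyme2
    # Epenthesis of a phonetic schwa.
    schwa = onset1 + "ə" + onset2 + rhyme2
    # Epenthesis of a full mid-high front vowel.
    full = onset1 + "e" + onset2 + rhyme2
    return [spread, schwa, full]


# The melody bsar 'big' from (9).
print(realizations("b", "s", "ar"))
```

Called with the melody of the loan word smek in (11), that is `realizations("s", "m", "ek")`, the same function yields the three variants listed there.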
An analogous though somewhat less systematic phenomenon involves loan
words which, in the source language, are monosyllabic:

(10) Expansion of Monosyllabic Loan Words:

                  F
             S         S
           O   R     O   R
               o         om    < Dutch oom ‘uncle’
           g   o     l   op    < English golf

As suggested by the above examples, such monosyllabic words are often expanded
to form bisyllabic words in Riau Indonesian, though the strategies by which such
expansion is achieved are idiosyncratic and unpredictable. However, a particular
subclass of such cases, in the schwaless subdialect, make use of the same processes
of spreading and epenthesis that apply, as in (9) above, to native words:

(11) Expansion of Monosyllabic Loan Words through Spreading and Epenthesis:

                  F
             S         S
           O   R     O   R
           s   (n)   tr  um    → [strum] ~ [sətrum] ~ [setrum] ~
                                 [sn̩trum] ~ [sətrum] ~ [sentrum]
                                 < Dutch stroom ‘electric current’
           s         m   ek    → [sm̩ek] ~ [səmek] ~ [semek]
                                 < English smack

In the first example, the borrowing of Dutch stroom involves the optional
introduction of a nasal stop n, followed by various combinations of spreading and
epenthesis. In the second example, the borrowing of English smack involves either
the spreading of the nasal stop m or epenthesis. In general, evidence from borrowing
may be open to alternative interpretations, since the path from source to target
language could potentially involve any number of intermediate way stations, with
the word in question actually entering Riau Indonesian from another variety of
Indonesian, already in bisyllabic form. However, in at least one case, smek < smack,
it may be safely surmised that the word entered Riau Indonesian directly from
English. This is because the borrowing was actually observed to take place, in the
late 1990’s, via television, immediately following the introduction into US
professional wrestling (hugely popular throughout Indonesia) of the brand name
Smack Down. Accordingly, this latter example provides clear-cut evidence for the
relevance of the core foot as a factor governing the incorporation of loan words into
Riau Indonesian.
The final phenomenon supporting the core foot comes from the Warasa ludling,
a secret language in which the sequence war- is inserted at the beginning of each
word.6 In (12) below the results are shown of applying the ludling to the words
represented in (6) above:

(12) Warasa Ludling:

                  F
             S         S
           O   R     O   R
      wa   r   a     k   an          makan → warakan ‘eat’
      wa   r         m   i           mi → waremi ‘noodles’
      wa   r   i     t   ing         kepiting → wariting ‘crab’
      wa   r   e     l   i     kan   belikan → warelikan ‘buy’
      wa   r         c   at          dicat → warecat ‘paint’

As shown in (12) above, the sequence war- is inserted into a position that is defined
structurally, with reference to the core foot: r occupies the first onset of the core
foot with wa immediately preceding it. The effect of adding war- to a word thus
depends crucially on the size of the original word. For most words, which are
bisyllabic, adding war- involves deletion of the first consonant, if the word begins
with a consonant, for example makan → warakan. However, for monosyllabic
words, adding war- involves not deletion but rather the further insertion of an
epenthetic vowel, for example mi → waremi. Conversely, for polysyllabic words,
adding war- involves the deletion not just of the first consonant of the penultimate
syllable, but of any and all preceding material, for example kepiting → wariting.
For stems combined with an enclitic, the ludling ignores the enclitic and treats the
stem as though it constituted the entire word, for example belikan → warelikan. In
contrast, for stems combined with a proclitic, adding war- involves the deletion of
the proclitic, and treats the remainder of the word as though the clitic were absent;
for example dicat → warecat, with the further insertion of an epenthetic vowel.
Thus, as shown in (12) above, the application of the Warasa ludling relies crucially
on the core foot, thereby providing yet additional evidence for its central role in the
structure of the Riau Indonesian word.
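
Because the ludling is defined with reference to the core foot, it can be stated as a short procedure over words that have already been parsed into foot positions. The sketch below is an illustration under stated assumptions: the `Word` record, its field names and the `warasa` function are hypothetical, and the parses are those shown in (6).

```python
from dataclasses import dataclass


@dataclass
class Word:
    """A word parsed against the core foot; empty strings mark unfilled slots."""
    pre: str = ""     # material before the core foot (extra syllable or proclitic)
    onset1: str = ""  # first onset of the core foot
    rhyme1: str = ""  # first rhyme of the core foot
    onset2: str = ""  # second onset of the core foot
    rhyme2: str = ""  # second rhyme of the core foot
    post: str = ""    # enclitic following the core foot


def warasa(word: Word) -> str:
    """Apply the ludling: wa precedes the foot and r takes over its first onset."""
    # Material before the foot (word.pre) and the original first onset are dropped.
    # An empty first rhyme is filled with an epenthetic e.
    rhyme1 = word.rhyme1 or "e"
    return "wa" + "r" + rhyme1 + word.onset2 + word.rhyme2 + word.post


EXAMPLES = {
    "makan": Word(onset1="m", rhyme1="a", onset2="k", rhyme2="an"),
    "mi": Word(onset2="m", rhyme2="i"),
    "kepiting": Word(pre="ke", onset1="p", rhyme1="i", onset2="t", rhyme2="ing"),
    "belikan": Word(onset1="b", rhyme1="e", onset2="l", rhyme2="i", post="kan"),
    "dicat": Word(pre="di", onset2="c", rhyme2="at"),
}

for form, parsed in EXAMPLES.items():
    print(form, "->", warasa(parsed))
```

The single rule covers deletion in bisyllabic and longer words, epenthesis in monosyllabic words, retention of the enclitic and loss of the proclitic, which is why the ludling counts as evidence for the core foot rather than for a set of unrelated adjustments.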
Thus, a number of independent phenomena support the existence of a core foot
underlying the structure of the word in Riau Indonesian. Although, as noted in the
beginning of this section, Riau Indonesian has no privileged syllable which could be
characterized as the locus of word stress, the core foot does constitute a privileged
unit, albeit of a larger size. As such, Riau Indonesian may be characterized as being
endowed with a somewhat more abstract variety of word stress, whose locus is not
the syllable, as in most typical instances of stress, but rather the bisyllabic core foot.
As we shall see in Section 5.3 below, the characterization of the core foot as bearing
word stress may account also for properties of focus intonation.

5.2. Intonation Groups and Final Prominence

As in most other languages, intonation contours form intonation groups with a
hierarchical tree structure, in which smaller units group together to form larger ones,
which in turn group together to form even larger ones, and so forth; see, for
example, Nespor and Vogel (1986). Such intonational groupings often coincide to a
certain degree with syntactic groupings.
Because of this, intonation can sometimes help to disambiguate between
different readings associated with different syntactic constituencies underlying the
same sequence of words. Consider the following example:

(13) Tengok tikus aku

look mouse 1:SG
[Speaker learning to play a game of laptop billiards in which it is
rather difficult to control the position of the simulated player with the
track pad, and the player often ends up under the table; the first time
this happened, I jokingly asked him whether he was looking at the
mice; when this happened once again, speaker joked]
‘I’m looking at the mice’

In order to facilitate the intended interpretation, the above sentence was associated
with an intonation contour which effected the grouping [Tengok tikus] aku.
However, in a different context, a different intonation contour could have been used
to effect a different grouping, Tengok [tikus aku], which would have a quite
different meaning, ‘Looking at my mice’. It should be acknowledged, however, that
the above sentence may also be uttered with intonation contours that do not reflect
any internal constituent structure and hence do not disambiguate between the two
potentially available meanings.
Perhaps the most noticeable characteristic of intonation groups is final
prominence. Within each intonation group, the final syllable is accented, thereby
providing a salient marker of intonation phrase boundaries. Thus, for example, in
(13) above, the grouping [Tengok tikus] aku was effected by accent on the final
syllable of the intonation group, namely kus. As in many other languages, accent is
realized by a combination of phonetic features including greater pitch variation,
greater intensity and greater duration. However, compared to some other languages,
the contribution of greater duration would appear to be relatively larger. Examples
(14) and (15) illustrate the phenomenon of phrase-final lengthening, with durations
indicated in milliseconds:

370 700

(14) Banyak se mut

many ant
[Eating newly bought fruit]
‘Lots of ants’

760 1230

(15) Aku main seo rang

1:SG play one-person
[Speaker squabbling over who gets to play on laptop computer]
‘I’m playing by myself’

Each of the above examples represents a single intonation group. As indicated by
the figures, in each example the final syllable is almost twice as long as all of the
preceding syllables combined. The prolongation of the final syllable of the
intonation group is not always quite as dramatic as in the above examples. However,
the above examples are quite typical of the way in which final lengthening may be
exaggerated in order to increase the affective expressiveness of the utterance.
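The durations given in (14) and (15) can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: it assumes that the first figure in each example covers all pre-final syllables combined and that the second figure is the final syllable, and the names `EXAMPLES` and `final_ratio` are hypothetical.

```python
# Durations in milliseconds, read off examples (14) and (15).
# Assumption: the first figure spans all pre-final syllables combined,
# the second figure is the final syllable of the intonation group.
EXAMPLES = {
    "(14) Banyak semut": (370, 700),
    "(15) Aku main seorang": (760, 1230),
}


def final_ratio(preceding_ms: int, final_ms: int) -> float:
    """Ratio of final-syllable duration to all preceding material."""
    return final_ms / preceding_ms


for label, (preceding, final) in EXAMPLES.items():
    ratio = final_ratio(preceding, final)
    print(f"{label}: final syllable is {ratio:.2f} times the preceding material")
```

On these figures the ratios come out at roughly 1.9 for (14) and 1.6 for (15).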
For the unwary investigator, one of the consequences of final prominence in
intonation groups is that it gives rise to the illusion of final word stress. For
example, in a situation involving elicitation, where the researcher asks what the
word for such-and-such is, the speaker typically responds with a one-word utterance
bearing the final-prominent intonation contour. This sounds like final word stress;
however, it is important to keep in mind that the suprasegmental pattern is not a
property of the word, but rather of the entire utterance, which just happens to consist
of a single word. Mistaken analyses of final-prominent intonation contours as word
stress are apparently responsible for the probably erroneous characterization of
many related Malayic language varieties of Sumatra as possessing final lexical
stress, for example Nurzuir et al. (1985:32-33) for Jambi, Umar et al. (1986:28) for
Muko-Muko, and Suwarni et al. (1989:80) for Lintang.

5.3. Focus Intonation

Final-prominent intonation groups provide the backdrop for an additional layer of
intonational organization, that of focus intonation.
Within each intonation group, a single word, which may occur in any position
within the group (initial, medial or final), may optionally be assigned focus
intonation. Focus intonation provides an expression for the semantic focus operator,
though many of the details remain to be worked out. (The term “focus” is thus used
here in the sense that is current in general semantic theory, not in the rather peculiar
sense that has gained acceptance among Austronesianists, where it refers to what is
known elsewhere as verbal voice.)
Focus intonation is realized through a bundle of phonetic properties associated
with the core foot of the word in focus, as shown below:
(16) Focus Intonation:
F

S S

O R O R
M A K AN ‘eat’
M I ‘noodles’
ke P I T ING ‘crab’
B E L I kan ‘buy ’
di C AT ‘paint’

Example (16) above illustrates the domain of focus intonation for the words
previously illustrated in (6) and (12); in this and subsequent examples, the domain of
focus intonation is indicated with upper case letters. As shown in (16), the domain
of focus intonation coincides precisely with the core foot, as supported by the
various phenomena discussed in Section 5.1 above.
The phonetic realizations of focus intonation are distributed unevenly over the
two syllables of the core foot. The most salient feature of focus intonation involves
the lengthening of the rhyme of the first syllable of the core foot, and sometimes
also the onset of the second syllable. (In some varieties of Riau Indonesian, the
onset of the second syllable may be lengthened if and only if it is other than an oral
stop, while for other varieties, more influenced by a Minangkabau substrate, the
onset of the second syllable may be lengthened no matter what its contents are.)
This lengthening is generally associated with a level pitch contour. At the same
time, focus intonation is also reflected by pitch prominence and secondary
lengthening on the rhyme of the second syllable of the core foot. Some examples of
focus intonation are given in (17) and (18) below, with durations again indicated in
milliseconds:

780 270 380 350

(17) PAY…AH budak i ni

bad child DEM-DEM:PROX
[Bantering with friends]
‘This kid’s really bad’

330 750 1030

(18) Rekam LA GI
record again
[Seeing me turn the laptop computer recorder on]
‘Recording again’

Each of the above examples consists of a single intonation group. In (17), focus
intonation falls on the first word, payah. In this example, focus intonation is
reflected primarily by the length of the first syllable plus second onset, pay, totalling
780 msec. The second rhyme, ah, is also relatively long, and in addition bears
salient pitch prominence. The remainder of the intonation group follows the usual
pattern of final prominence, with three short syllables followed by a final much
longer one, ni. In (18), focus intonation falls on the second word, lagi. Here, once
more, focus intonation is reflected by the length of the first syllable, la, totalling 750
msec., but in this case the second syllable gi is even longer, showing the combined
effect of secondary lengthening due to focus plus the regular final prominence of
the intonation group.7
This particular constellation of features, involving lengthening of a penultimate
syllable followed by some kind of pitch accent on the final syllable, is not peculiar
to Riau Indonesian. In the Jakarta dialect of Indonesian, focus intonation occurs
more frequently than in Riau Indonesian, and its phonetic realization is more
pronounced; so much so that when speakers from Riau attempt to imitate a Jakarta
accent, one of the things that they do is exaggerate the frequency and the phonetic
properties of focus intonation. Outside of Malay / Indonesian, penultimate
lengthening coupled with some kind of final accentuation has been reported, among
others, for the Formosan language Amis (Edmundson, Huang and Pahalaan 2001),
for various Micronesian languages (Rehg 1993), and for the Polynesian language
Marquesan (Margaret Mutu, personal communication), thereby suggesting that the
feature may be of considerable antiquity within the Austronesian language family.
Just as final prominence in intonation groups sometimes creates the illusion of
final word stress, so focus intonation and concomitant penultimate lengthening may
occasionally give rise to an unwarranted impression of penultimate word stress,
at least in those cases where penultimate lengthening is more salient to the
investigator’s ear than final pitch accent. For example, such a misanalysis is what
underlies some descriptions of Minangkabau, for example Zarbaliev (1987:23) and
Adelaar (1992:12), as having penultimate word stress, even though in reality the
suprasegmental patterns of Minangkabau are largely identical to those of Riau
Indonesian. In some other dialects, such as Jakarta Indonesian, focus intonation and
penultimate lengthening are often used in place of the final-prominent intonation
contour in the context discussed earlier, where, in response to being asked what the
word for such-and-such is, the speaker responds with a one-word utterance. This
use of focus intonation thus contributes further to a characterization of Malay /
Indonesian as having penultimate word stress. However, in actual fact, focus
intonation and the way in which duration and pitch prominence split across the two
syllables of the core foot provide additional support for the claim that in Riau
Indonesian, as in many other related varieties, word stress is present not at the
domain of the syllable but rather at the level of the entire foot, with respect to which
it occurs in fixed position, falling invariably on the core foot.

6. INTONATION AND THEMATIC ROLES

The description of the basic intonational patterns of Riau Indonesian presented in
the preceding section makes it possible for us now to address the main concern of
this paper, namely, the purported correlation between intonation and thematic roles.
In order to do this, we shall examine the distribution, in the naturalistic corpus,
of four basic intonation contours, associated with declarative statements and
imperatives.8 These four contours make reference to intonation groups, as
recognizable by the feature of final prominence, and to pauses, which may separate
successive intonation groups.

(19) Four Basic Intonation Contours:

(a) Intonation Contour A: Two intonation groups separated by pause, no focus

(b) Intonation Contour B: Single intonation group with no pause, no focus

(c) Intonation Contour C: Single intonation group with no pause, initial focus

(d) Intonation Contour D: Single intonation group with no pause, final focus

The above four contours span much of the variety that is in evidence in the
intonational patterns of Riau Indonesian, though of course they do not exhaust it.
For declarative statements and imperatives, additional intonation contours may
involve more complex configurations containing two or more intonation groups,
focus, and pauses; however, as complexity increases, these intonation contours
become less and less frequent. Alternatively, other intonation contours of a
qualitatively different nature include those associated with certain specific sentence-
final particles, and also with other kinds of speech acts such as polar and
information questions, and direct quotation. Nevertheless, the above four basic
intonation contours suffice to give the proponents of a correlation between
intonation and thematic roles a good run for their money: if such a correlation did
exist, it is most likely that it would involve at least one of the above four contours.
The four basic intonation contours are examined with respect to a set of basic
sentence patterns defined in terms of an activity in construction with a single
associated participant. The participant in question may either precede or follow the
activity, and it may be associated with the thematic roles of either actor or
undergoer. Resulting from these two binary choices are the following four basic
sentence patterns:

(20) Four Basic Sentence Patterns:

(a) Actor precedes activity

(b) Undergoer precedes activity

(c) Actor follows activity

(d) Undergoer follows activity

Again, the above four basic sentence patterns do not exhaust the inventory of
sentence patterns in Riau Indonesian. However, it is reasonable to suppose that if
intonation did distinguish thematic roles, its effect would be observable with respect
to at least some of the above basic sentence patterns.
The four basic intonation contours in (19) and the four basic sentence patterns in
(20) may be combined to yield sixteen potentially possible pairings of intonation
contours and sentence patterns. These sixteen pairings are represented in the sixteen
cells of Table 1. (In Table 1, letters a, p and v stand for actor, undergoer and
activity respectively, upper case letters denote focus intonation, while Ø represents a
pause between intonation groups.)

Table 1: Intonation Contours and Sentence Patterns

                          Participant precedes activity    Participant follows activity
                          Actor         Undergoer          Actor         Undergoer

Intonation Contour A:     aØv     ↔     pØv                vØa     ↔     vØp
Pause, no focus           (21a)         (21b)              (22a)         (22b)

Intonation Contour B:     av      ↔     pv                 va      ↔     vp
No pause, no focus        (23a)         (23b)              (24a)         (24b)

Intonation Contour C:     Av      ↔     Pv                 Va      ↔     Vp
No pause, initial focus   (25a)         (25b)              (26a)         (26b)

Intonation Contour D:     aV      ↔     pV                 vA      ↔     vP
No pause, final focus     (27a)         (27b)              (28a)         (28b)
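The sixteen cell labels of Table 1 follow mechanically from the definitions in (19) and (20). The sketch below is purely illustrative and makes this explicit; the names `CONTOURS`, `PATTERNS` and `cell_label` are hypothetical helpers introduced here, and the encoding of each contour as a pause flag plus a focus position is an assumption drawn from the wording of (19).

```python
# Notation of Table 1: a = actor, p = undergoer, v = activity,
# upper case = focus intonation, Ø = pause between intonation groups.
CONTOURS = {
    "A": {"pause": True, "focus": None},
    "B": {"pause": False, "focus": None},
    "C": {"pause": False, "focus": "initial"},
    "D": {"pause": False, "focus": "final"},
}

# (first word, second word) for the four sentence patterns in (20).
PATTERNS = [("a", "v"), ("p", "v"), ("v", "a"), ("v", "p")]


def cell_label(contour: str, pattern: tuple) -> str:
    """Build the Table 1 label for one contour/sentence-pattern pairing."""
    first, second = pattern
    spec = CONTOURS[contour]
    if spec["focus"] == "initial":
        first = first.upper()
    elif spec["focus"] == "final":
        second = second.upper()
    separator = "Ø" if spec["pause"] else ""
    return first + separator + second


table = {c: [cell_label(c, p) for p in PATTERNS] for c in CONTOURS}
for contour, row in table.items():
    print(contour, " ".join(row))
```

Each row of the resulting `table` corresponds to one intonation contour, and each column to one sentence pattern, in the same order as in Table 1.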

Table 1 provides a classificatory scheme for utterances in the naturalistic corpus. If
intonation does distinguish thematic roles, then one would expect to find an unequal
distribution of utterances across the table, with, crucially, some empty cells,
reflecting impossible pairings of intonation contours and sentence patterns.
Conversely, if intonation does not differentiate thematic roles, then one would
expect to find utterances exemplifying all of the potential pairings of intonation
contours and thematic roles, with no empty cells in the table.
The facts are quite clear. Even a cursory examination of a small subset of the
naturalistic corpus turns up examples of all sixteen potential pairings of intonation
contours and sentence patterns: there are no empty cells in the table. Thus, there is
no correlation between the intonation contours defined in (19) and the sentence
patterns represented in (20): intonation does not differentiate thematic roles in Riau
Indonesian.
In examples (21)-(28) below, each of the sixteen pairings of intonation contours
and sentence patterns is illustrated with an utterance from the naturalistic corpus; for
easy cross-referencing, the number of each example is shown in the appropriate cell
in the table. As in examples (2)-(4) previously, the activity word is in boldface,
while the relevant associated participant is in italics. (In some of the examples, the
pairing of intonation contour and sentence pattern extends over just part of a larger
utterance; in such cases, the remaining parts of the utterance are enclosed in
parentheses. Breaks between intonation groups, either within the relevant part of the
utterance or outside of it, are represented with commas.)

(21a) Kepala desa itu, (Intonation contour A)

head village DEM-DEM:DIST

pindah rumah papan itu

move house board DEM-DEM:DIST

[From narrative about peeping tom]

‘The village head moved into that wooden house’

(b) ( Vid, ) hilangkan ini, lupa

FAM|David disappear-EP DEM-DEM:PROX forget

dah Vid
PFCT FAM|David

[Playing billiards on laptop computer; speaker asking me to help him

get rid of the lines on the screen which show where the balls will go]
‘David, I’ve forgotten how to get rid of these, David’

(22a) (Sangkut situ ‘kan,

be.caught.on LOC-DEM-DEM:PROX Q

selamat dia,) tidur, dia,

safe 3 sleep 3

(anak si Yung tadi ini)

child PERS Yung PST:PROX DEM-DEM:PROX

[From narrative about village boy and sparrowhawk; boy has fallen
off a bridge into a mangrove tree]
‘He was caught there, he was safe, he fell asleep, the boy Yung’

(b) Jumpa, satu asap, (nampak asap dari

meet one smoke see smoke from

jauh ‘kan )
far Q

[From narrative about village boy and sparrowhawk; boy is

wandering through forest]
‘He noticed a plume of smoke, he saw smoke from afar, right’

(23a) Bola putih masuk (Intonation contour B)

ball white enter
[Playing billiards on laptop computer]
‘The white ball’s gone in’

(b) Rokoknya buang

cigarette-ASSOC throw
[Cleaning a room with friends]
‘Throw the cigarette stubs away’

(24a) Kawin dia, (David)

marry 3 David
[From narrative about boy who grows up, gets married, and learns the
facts of life]
‘Then he got married, David’

(b) Tutup pintu oy

close door EXCL
[Speaker wants to prevent other people from coming in to the room]
‘Hey, close the door’

(25a) Bola PUTIH masuk (Intonation contour C)

ball white enter
[Playing billiards on laptop computer]
‘The white ball’s gone in’

(b) INI pencet ha, (ini pencet)

DEM-DEM:PROX press DEIC DEM-DEM:PROX press
[Playing billiards on laptop computer; speaker showing friend which
key to press]
‘Press this one, press this one’

(26a) MASUK bola putih

enter ball white
[Playing billiards on laptop computer]
‘The white ball’s gone in’

(b) TUKUL dio ha, (macam mano, sakit die)

hammer 3 DEIC kind what hurt 3
[From narrative about peeping tom]
‘She hammered him, it hurt’

(27a) (E,) bola putih MASUK (Intonation contour D)

EXCL ball white enter
[Playing billiards on laptop computer]
‘The white ball’s gone in’

(b) Catur tak PANDAI itu, (Vid)

chess NEG know.how DEM-DEM:DIST FAM|David

[Discussing what game to play next on laptop computer; someone

suggests chess; speaker reacts]
‘I don’t know how to play chess, David’

(28a) Rekam DIE

record 3
[Speaker discovers I’ve been recording]
‘He’s recording’

(b) “Aku nak TANGAN dikau”, (katanya,

1:SG want hand 2 say-ASSOC

dia bilang)
3 say

[From horror story about ungrateful son who tries to rob his mother’s
tomb; at the end of the story, the mother’s ghost tries to snatch her
son’s hand]
‘“I want your hand” she said’

Each of the above eight numbered examples presents a near minimal pair, as close a
contrast as one is likely to find in a naturalistic corpus. Within each pair, the
intonation contours are the same, the relative orders of activity and participant are
the same, but the thematic role of the participant is different: whereas in the first, or
(a) example, the participant is an actor, in the second, or (b) example, it is an
undergoer. Thus, each of these minimal pairs shows that for a particular intonation
contour and a particular sentence pattern, the intonation contour in question fails to
differentiate between thematic roles, allowing a certain participant to be understood
either as an actor, in the first member of the pair, or as an undergoer, in the second.
For example, (21) shows that Intonation Contour A does not differentiate
between actors and undergoers when these occur in a position preceding an activity.
Similarly, (23) shows that Intonation Contour B does not distinguish between actors
and undergoers when these come before an activity. Thus, examples (21) and (23)
refute the “pause proposal”, discussed in Section 3 above, which suggests that when
an undergoer precedes an activity, it must be followed by a pause. Such, indeed, is
the case in (21b); however, in (23b), an undergoer also precedes an activity and here,
contrary to the pause proposal, there is no pause (and there are many more examples
like this in the corpus). Moreover, in (21a) there is a pause, even though here it is an
actor rather than an undergoer that precedes the activity. Thus, examples such as
these show that when the participant in question occurs before the activity, the
presence or absence of a pause plays no role whatsoever in distinguishing actors
from undergoers.
In conjunction, then, examples (21) - (28), and many others like them in the
corpus, show quite clearly that intonation plays no role in the differentiation of
thematic roles in Riau Indonesian. To the extent that the four basic sentence patterns
in (20) are representative of the variety of sentence patterns in the language, the
above examples provide overwhelming support for Hypothesis A, as formulated in
(5a), suggesting that for each sentence there is at least one intonation contour which
renders that sentence undifferentiated with respect to thematic roles. Moreover, to
the extent that the four basic intonation contours in (19) encompass the major
patterns of intonation that are available in the language, the above examples provide
substantial support for the stronger Hypothesis B, as formulated in (5b), asserting
that for each sentence, every intonation contour renders that sentence
undifferentiated with respect to thematic roles. In view of examples such as these, it is
hard to see how anybody can continue to maintain an uncritical position to the
effect that intonation can function to distinguish thematic roles in Riau Indonesian.
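The logical difference between the two hypotheses, existential versus universal quantification over intonation contours, can be made concrete with a small sketch. It rests on simplifying assumptions that are not part of the original formulation: "sentence" is approximated by the two word orders of Table 1, and a contour counts as leaving a word order "undifferentiated" when both an actor and an undergoer reading are attested for it. An unattested cell would of course be only suggestive, not proof of impossibility. All names below are hypothetical.

```python
from itertools import product

CONTOURS = ["A", "B", "C", "D"]
ORDERS = ["participant precedes activity", "participant follows activity"]
ROLES = ["actor", "undergoer"]

# Every (contour, order, role) cell of Table 1 is attested, by (21a)-(28b).
attested = set(product(CONTOURS, ORDERS, ROLES))


def undifferentiated(cells, contour, order):
    """True if both roles are attested for this contour and word order."""
    return all((contour, order, role) in cells for role in ROLES)


def hypothesis_a(cells):
    """For each order, at least one contour leaves the roles undifferentiated."""
    return all(
        any(undifferentiated(cells, c, o) for c in CONTOURS) for o in ORDERS
    )


def hypothesis_b(cells):
    """For each order, every contour leaves the roles undifferentiated."""
    return all(
        all(undifferentiated(cells, c, o) for c in CONTOURS) for o in ORDERS
    )


print(hypothesis_a(attested), hypothesis_b(attested))

# Hypothetical counterfactual (not the actual finding): one cell unattested.
gap = attested - {("A", ORDERS[0], "actor")}
print(hypothesis_a(gap), hypothesis_b(gap))
```

With all sixteen cells attested, both checks succeed. In the counterfactual with a single gap, only the stronger Hypothesis B fails, which is why a contour that differentiates roles would threaten B before it threatens A.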
Playing devil’s advocate for a moment, it could, admittedly, still turn out to be
the case, contrary to everything suggested in this paper, that intonation somehow
does differentiate thematic roles in Riau Indonesian. For example, there could exist
some additional intonation contours, not considered in this paper, which do
differentiate thematic roles: such intonation contours would provide a
counterexample to Hypothesis B, though they would not contradict the weaker Hypothesis A.
More far-reachingly, it could conceivably be the case that each of the four would-be
basic intonation contours defined in this paper actually lumps together two or more
distinct intonation contours which do differentiate thematic roles: if this were true,
then counterevidence would be provided even for the weaker Hypothesis A. Thus,
the claims made in this paper constitute explicit hypotheses for which it is easy to
imagine hypothetical counterevidence. Nevertheless, the results of this paper
suggest that such counterevidence is indeed no more than strictly hypothetical.
Accordingly, if anybody still wishes to claim that intonation can differentiate
thematic roles in Riau Indonesian, then the burden of the proof now rests solidly on
their shoulders: they must produce the data, and specify exactly which intonation
contours distinguish which thematic roles. (To assist in such a challenge, I would be
happy to share the naturalistic corpus, including the sound files, with anybody
wishing to examine them for scientific purposes.) In the meantime, in the absence
of such counterarguments, the only position that can reasonably be maintained is
that intonation does not and cannot differentiate thematic roles in Riau Indonesian.

7. CONCLUSION
The results of this paper underscore the need for linguistic descriptions to avoid
Eurocentric assumptions with regard to the expressive power of languages. Just
because thematic roles are central to the grammatical organization of many familiar
languages does not mean that they are of equal importance in all of the world’s
languages. Riau Indonesian shows how a language can manage just fine, fulfilling a
wide range of communicative functions, without any obligatory grammatical means
for distinguishing between thematic roles: word order, case marking, agreement, or
intonation.
More specifically, the absence of any relationship between intonation and
thematic roles in Riau Indonesian provides reinforcement for previous descriptions
of the language which have argued that it is lacking in many of the categories that
are considered to be central to the grammatical organization of most other languages.
The reader may have noted that no mention was made at any point in this paper of
parts of speech (such as noun and verb), syntactic categories (such as noun phrase
and verb phrase), or grammatical relations (such as subject, direct object, indirect
object, and so forth). Indeed, in Gil (1994, 1999, 2000, 2001a,b, 2002b, 2005)
it is argued that such categories are absent in Riau Indonesian. As statements of
non-existence, such claims can be readily refuted, by showing how a single
grammatical generalization makes reference to the category in question. Conversely,
such claims can be supported only in gradual incremental fashion, through the
examination, one after the other, of a wider and wider range of phenomena, each of
which can in turn be accounted for without reference to the categories in question.
In the case at hand, the absence of any correlation between intonation and thematic
roles adds further to the plausibility of the claim that Riau Indonesian does not
possess any categories whose definitions make reference to thematic roles, such as
grammatical relations, or whose prototypical characteristics involve thematic roles
in any way, such as parts of speech and syntactic categories.
How would the grammar of Riau Indonesian work in the absence of so many
commonplace grammatical categories? Following are syntactic and semantic
representations for a typical Riau Indonesian sentence, example (2a) above, ‘I’ll buy
a laser’. (For ease of exposition, the final particle ’kan in (2a) is ignored.)

(29) syntactic representation:

                 S

          S      S      S
          beli   aku    laser
          BUY    1:SG   LASER

     semantic representation:   A (BUY, 1:SG, LASER)

As argued in Gil (1994, 2000, 2001a, b, 2005), Riau Indonesian syntax
contains a single open syntactic category, S(entence). As shown above, beli, aku
and laser are all Ss, as is the construction as a whole; from a formal point of view,
beli aku laser is thus a coordination of three Ss. The semantics of Riau Indonesian
centers around the association operator, represented above with the letter A. In its
monadic, or one-place guise, the association operator provides a semantic
representation for markers of association, possession, and genitive case in many
languages. For example, in English, in an expression such as John’s, the possessive
’s is interpreted as the association operator A, applying to the denotation JOHN,
yielding the formula A (JOHN), which can be read as ‘entity associated with John’,
where the detailed nature of the association is left unspecified by the grammar and is
instead determined by context. However, in a typical Riau Indonesian sentence, the
association operator applies polyadically, to a sequence of items, and without any
overt morphosyntactic realization. For example, in (29) above, it applies to the three
meaning components of the sentence, yielding the formula A (BUY, 1:SG, LASER),
which may be read as ‘entity associated with buying, speaker and laser’, where,
again, the precise nature of the association is left unspecified, to be determined by
context. Accordingly, the sentence Beli aku laser is endowed with a single unitary
semantic representation which is indeterminate with respect to a variety of
categories such as number, definiteness, tense, aspect and thematic roles. In the right
context, it could thus mean anything from ‘I’ll buy a laser’, as in fact it did in (2a),
to ‘Somebody bought me the laser’, ‘Somebody’s buying something from me with
some lasers’, and so forth. Thus, as suggested in (29) above, the basic sentence
structure of Riau Indonesian is extremely impoverished, making no reference to
thematic roles. It is thus hardly surprising that intonation, too, fails to differentiate
thematic roles in Riau Indonesian.
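The way a single operator yields both the monadic formula A (JOHN) and the polyadic formula A (BUY, 1:SG, LASER) can be illustrated with a short sketch. This is only a toy rendering of the notation, not an implementation of the semantic theory: the class `Association` and the function `associate` are hypothetical names, and the sketch deliberately records nothing about thematic roles, number or tense, in line with the indeterminacy described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """A(x1, ..., xn): an entity associated with x1 ... xn.

    The nature of the association is left unspecified, so no thematic
    roles, number, or tense are stored: only the denotations themselves.
    """

    arguments: tuple

    def formula(self) -> str:
        return "A (" + ", ".join(self.arguments) + ")"


def associate(*denotations: str) -> Association:
    """Apply the association operator to one or more denotations."""
    if not denotations:
        raise ValueError("the association operator needs at least one argument")
    return Association(tuple(denotations))


possessive = associate("JOHN")               # monadic use, as in John's
sentence = associate("BUY", "1:SG", "LASER")  # polyadic use, as in (29)

print(possessive.formula())
print(sentence.formula())
```

The same `sentence` object stands for every contextual reading of Beli aku laser; choosing among them is left to context rather than to the representation.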
But what of most other languages, with more elaborate clause structure, in which
thematic roles play a central part? Prima facie, there might perhaps be more reason
to expect intonation to correlate with thematic roles in at least some such languages.
Imagine, for example, a language like Hebrew, with flexible word order, and in
which, for basic transitive clauses, there is evidence for hierarchical syntactic
structure of the kind commonly represented in terms of a VP containing the verb
and the object to the exclusion of the subject. Now imagine that in such a language,
intonational grouping were to reflect the VP constituent in sentences such as
Hebrew (1) in the same way that it reflects other kinds of constituency in Riau
Indonesian (13) and in many other examples in most or all languages. In such a
language, then, intonation would distinguish between thematic roles, albeit not
directly, but rather through the mediation of syntactic constituency, in accordance
with the alternative scenario suggested in the introduction. However, as noted in
Section 2, there are no clear documented cases of languages in which intonation
works in this way.
Can we thus conclude that intonation is in principle incapable of encoding
thematic roles in human language? At present, this is perhaps most appropriately
viewed as a conjecture still in need of serious further investigation, so that it may
either be refuted by means of one or more counterexamples, or else recognized as a
linguistic universal, a substantive constraint on what constitutes a possible human
language.

Max Planck Institute for Evolutionary Anthropology

8. NOTES

*
I would like to thank all my colleagues who asked whether intonation differentiates thematic roles in
Riau Indonesian, and/or insisted and perhaps still insist that it does, for providing me with the impetus to
write this paper. In particular, I am indebted to Peter Cole, Gabriella Hermon and Uri Tadmor for
numerous discussions on the issues dealt with in this paper, and to Matt Gordon for constructive
comments on an earlier draft. I am especially grateful to the many speakers of Riau Indonesian who
provided the naturalistic data on which this paper is based: Arief, Benny, Danzha Selpas, Desrul,
Ellyanto, Dwiarpianto, Fuad, Jumbro, Junaidi, Muchlis, Pai, Per, Riki, Rudy Chandra,
Septianbudiwibowo, Wira, Zainudin. Versions of this paper were presented at the Fifth International
Symposium on Malay/Indonesian Linguistics, Leipzig, Germany, 17 June 2001; at Topic and Focus: A
Workshop on Intonation and Meaning, University of California, Santa Barbara, CA, USA, 21 July 2001;
and at the Ninth Annual Meeting of the Austronesian Formal Linguistics Association, Cornell University,
Ithaca, NY, USA, 26 April 2002; I would like to thank participants at all three events for their helpful
comments and suggestions.
1
In addition to Riau Indonesian, some of the data cited in this paper show evidence for interference from
Siak Malay, the dialect of Malay spoken in the lower part of the Siak river basin, in Riau province. Riau
Indonesian and Siak Malay share a considerable degree of mutual intelligibility; in fact, in some cases it
is difficult to determine whether a given utterance is in one dialect or the other. Although this paper
focuses on Riau Indonesian, all of its main points are equally germane also for Siak Malay.
2
The interlinear glosses in this paper make use of the following abbreviations: AG ‘agent’; ASSOC
‘associative’; DEIC ‘deictic’; DEM ‘demonstrative’; DIST ‘distal’; DISTR ‘distributive’; EP ‘end point’; EXCL
‘exclamation’; FAM ‘familiar’; M ‘masculine’; NEG ‘negative’; PERS ‘personal’; PFCT ‘perfect’; PROX
‘proximal’; PST ‘past’; Q ‘question’; SG ‘singular’; 1 ‘first person’; 3 ‘third person’.
3
Readers familiar with Malay / Indonesian may be wondering about the well-known “voice markers” and
whether they might perhaps be involved in the differentiation of thematic roles. In Riau Indonesian, the
relevant forms di- and N- are indeed present; however, their use is optional, and, crucially, they do not
help to differentiate thematic roles: sentences with di- or N- (or even both) remain indeterminate with
respect to thematic roles (see Gil 1999, 2002b for examples and detailed discussion). Perhaps the most
productive means for differentiating thematic roles in Riau Indonesian is provided by the form sama,
which can mark participants in any thematic role except that of absolutive, thereby discriminating
between roles such as, for example, actor and undergoer, by overtly marking the former. However, even
this form is optional; moreover, it is only very weakly grammaticalized, and is actually more
appropriately considered as an ordinary “content” word with a very broad and abstract meaning centered
around the notion of togetherness (see Gil 2004, for examples and argumentation).
4
Another proposal occasionally mentioned in discussions of intonation and clause structure in Malay /
Indonesian is that of Chung (1978), pertaining to a language variety that she refers to as “informal
Indonesian”, but which is actually closer to Standard Indonesian than to any of the regional colloquial
varieties (including those of Jakarta and Bandung, from where her speakers hailed). Chung is concerned
with a particular sentence pattern of the form AVP (Agent - Activity - Patient), where the V is devoid of
any morphological voice marking. For a subset of such sentences, those in which the A is a pronoun or a
proper noun, she maintains that two distinct intonation contours are available, which she calls “normal
declarative” and “subject shifting”. She then claims that these two intonation contours correspond to two
different syntactic analyses of the sentence in question, as “active” and “passive” respectively. In the
latter case, her suggestion involves the following derivation. First, an active sentence with AVP order
undergoes passivization (of the variety known in Indonesian studies as the pasif semu, or “second
passive”), resulting in a structure of the form PAV, where the P assumes some subjecthood properties,
and the A is cliticized to the V. Next, the P undergoes subject shifting, a process which moves subjects to
the end of the sentence, in this case restoring the original AVP order. Although it may seem as though
we’re back where we started, Chung asserts that such sentences are passive, and cites as evidence the
purported “subject shifting” intonation contour associated with such constructions. Whether or not the
facts are as described, and whether or not the analysis provided is the most appropriate one to account for
such facts, Chung’s proposal does not involve any suggestion to the effect that intonation may
differentiate thematic roles, since both intonation contours are associated with the same assignment of
thematic roles. Indeed, this could hardly be otherwise, since, in the variety of Indonesian described by
Chung, there is no thematic role indeterminacy of the kind illustrated in (2) - (4), and in particular no
sentences of the form PVA such as in (3b).
5
In general, in the derivation of such monosyllabic forms, the lighter of the two syllables is omitted,
while the heavier one is retained – where the weight of the respective syllables is defined in terms of the
number of segments they contain and their position on the sonority hierarchy, greater sonority
corresponding to lesser weight. Thus, in the above example, pan is heavier than to by dint of the
additional coda segment n; hence the familiar form of Topan is Pan, not To.
6
The name of the ludling, Warasa, is derived by application of the ludling in question to the Malay /
Indonesian word bahasa ‘language’. This and other Riau Indonesian ludlings are described in detail in
Gil (2002a).
7
Occasionally, focus intonation occurs in a variant form, which might appropriately be referred to as
super-focus. Phonetically, super-focus has all the special features of ordinary focus, plus an additional
one, lip rounding on the lengthened penultimate syllable. Semantically, super-focus adds emphasis and
affective force; one common usage of super-focus is with scalar adjectives, where it lends itself to
translation into English with an accented intensifier such as “very”.
8
As far as I can tell, there are no systematic differences between the intonation contours of declarative
statements and imperatives. In fact, there would seem to be no grammatical differences whatsoever
distinguishing between sentences used to perform these two particular speech acts.

9. REFERENCES
Adelaar, K. Alexander. Proto-Malayic: The Reconstruction of Its Phonology and Parts of Its Lexicon and
Morphology, Pacific Linguistics Series C – 119. Canberra: The Australian National University, 1992.
Amran Halim. Intonasi dalam Hubungannya dengan Sintaksis Bahasa Indonesia, Seri ILDEP di bawah
Redaksi W.A.L. Stokhof. Jakarta: Penerbit Djambatan, 1984.
Chung, Sandra. “Stem Sentences in Indonesian.” In S.A. Wurm and L. Carrington (eds.), Second
International Conference on Austronesian Linguistics: Proceedings, Fascicle 1, Western
Austronesian, Pacific Linguistics Series C - No. 61, pp. 335-365. Canberra: Australian National
University, 1978.
Edmundson, Jerold A., Tung-Chiou Huang and Akiyo Pahalaan. “Phonological Strengthening in
Hsiukuluan Amis of Taiwan”, Paper presented at the Eleventh Annual Meeting of the Southeast
Asian Linguistics Society, Mahidol University, Bangkok, Thailand, 17 May 2001.
Gil, David. “The Structure of Riau Indonesian.” Nordic Journal of Linguistics 17 (1994): 179-200.
Gil, David. “Riau Indonesian as a Pivotless Language.” In E.V. Raxilina and Y.G. Testelec (eds.),
Tipologija i Teorija Jazyka, Ot Opisanija k Objasneniju, K 60-Letiju Aleksandra Evgen’evicha
Kibrika (Typology and Linguistic Theory, From Description to Explanation, For the 60th Birthday of
Aleksandr E. Kibrik), pp. 187-211. Moscow: Jazyki Russkoj Kul’tury, 1999.
Gil, David. “Syntactic Categories, Cross-Linguistic Variation and Universal Grammar.” In P. M. Vogel
and B. Comrie (eds.), Approaches to the Typology of Word Classes, Empirical Approaches to
Language Typology, pp. 173-216. New York: Mouton, 2000.
Gil, David. “Creoles, Complexity and Riau Indonesian.” Linguistic Typology 5 (2001a): 325-371.
Gil, David. “Escaping Eurocentrism: Fieldwork as a Process of Unlearning.” In P. Newman and M.
Ratliff (eds.), Linguistic Fieldwork, pp. 102-132. Cambridge: Cambridge University Press, 2001b.
Gil, David. “Ludlings in Malayic Languages: An Introduction.” In Bambang Kaswanti Purwo (ed.),
PELBBA 15, Pertemuan Linguistik (Pusat Kajian) Bahasa dan Budaya Atma Jaya: Kelima Belas,
Jakarta: Unika Atma Jaya, 2002a.
Gil, David. “The Prefixes di- and N- in Malay / Indonesian Dialects.” In F. Wouk and M. Ross (eds.), The
History and Typology of Western Austronesian Voice Systems, pp. 241-283. Canberra: Pacific
Linguistics, 2002b.
Gil, David. “Intonation Does Not Differentiate Thematic Roles in Riau Indonesian.” In A. Riehl and T.
Savella (eds.), Proceedings of the Ninth Annual Meeting of the Austronesian Formal Linguistics
Association (AFLA9), Cornell Working Papers in Linguistics 19 (2003): 64-78.
Gil, David. “Riau Indonesian sama, Explorations in Macrofunctionality.” In M. Haspelmath (ed.),
Coordinating Constructions (Typological Studies in Language 58), pp. 371-424. Amsterdam: John
Benjamins, 2004.
Gil, David. “Word Order Without Syntactic Categories: How Riau Indonesian Does It.” In A. Carnie, H.
Harley and S.A. Dooley (eds.), Verb First: On the Syntax of Verb-Initial Languages,
pp. 243-263. Amsterdam: John Benjamins, 2005.
Goedemans, Rob and Ellen van Zanten. “Stress and Accent in Indonesian.” In D. Gil (ed.), Studies in
Malay and Indonesian Linguistics. London: Curzon Press, to appear.
Kähler, Hans. Grammatik der Bahasa Indonesia. Wiesbaden: Otto Harrassowitz, 1956.
Nespor, Marina and Irene Vogel. Prosodic Phonology. Dordrecht: Reidel, 1986.
Nurzuir Husin, Zailoet, M. Atar Semi, Isma Nasrul Karim, Desmawati Radjab and Djurip. Struktur
Bahasa Melayu Jambi. Jakarta: Pusat Pembinaan dan Pengembangan Bahasa, 1985.
Rehg, Kenneth L. “Proto-Micronesian Prosody.” In J.A. Edmondson and K.J. Gregerson (eds.), Tonality
in Austronesian Languages, pp. 25-46. Oceanic Linguistics Special Publication No. 24. Honolulu:
University of Hawaii Press, 1993.
Stilo, Don. “Alternative Devices for Object Marking in Middle Eastern SOV Languages”, Paper
presented at the Middle East Studies Association of North America, San Francisco, CA, USA,
29 November - 1 December 1984.
Suwarni Nursato, Sutari Harifin, Zainin Wahab, Nangsari Ahmad and Homsen Nanung. Fonologi dan
Morfologi Bahasa Lintang. Jakarta: Pusat Pembinaan dan Pengembangan Bahasa, 1989.
Tadmor, Uri. “Can Word Accent Be Reconstructed in Malay?”, Paper presented at Third International
Symposium on Malay / Indonesian Linguistics, Amsterdam, The Netherlands, 24 August 1999.
Tadmor, Uri. “Rekonstruksi Aksen Kata Bahasa Melayu.” In Yassir Nasanius and Bambang Kaswanti
Purwo (eds.), PELBBA 13, Pertemuan Linguistik (Pusat Kajian) Bahasa dan Budaya Atma Jaya:
Ketiga Belas, Pusat Kajian Bahasa dan Budaya, pp. 153-167. Jakarta: Unika Atma Jaya, 2000.
Umar Manan, Zainuddin Amir, Nasroel Malano, Anas Syafei and Agustar Surin. Struktur Bahasa Muko-
Muko. Jakarta: Pusat Pembinaan dan Pengembangan Bahasa, 1986.
Van Ophuijsen, Ch. A. Maleische Spraakkunst. Leiden: van Doesburgh, 1915.
Zarbaliev, X.M. Jazyk Minangkabau. Moscow: Nauka, 1987.
MATTHEW GORDON

THE INTONATIONAL REALIZATION OF CONTRASTIVE FOCUS IN CHICKASAW

1. INTRODUCTION
While the realization of focus in languages which express focus either syntactically
or prosodically or through a combination of both prosody and syntax has been
studied relatively extensively, e.g. English (Beckman and Pierrehumbert 1986),
Korean (Cho 1990, Jun 1993), Chichewa (Kanerva 1990), Bengali (Hayes and
Lahiri 1991, Lahiri and Fitzpatrick-Cole 1999), Shanghai Chinese (Selkirk and Shen
1990), Hungarian (Horvath 1986, Kiss 1998), Hausa (Inkelas and Leben 1990),
there is very little work on languages which mark focus morphologically through
affixes or particles attached to or adjacent to focused elements. Of particular
interest is the question of whether languages with morphological marking of focus
also utilize prosodic cues to signal focus, much as languages with special word
orders associated with focus may redundantly use prosody to cue focus. In their
study of Wolof, a language which marks focus morphologically, Rialland and
Robert (2001) claim that Wolof does not use intonation to signal focus redundantly.
Beyond this study of Wolof, however, there is little phonetic literature dealing with
the prosodic manifestation of focus in languages with morphological expression of
focus. It is thus unclear to what extent languages that mark focus morphologically
tend to also employ prosodic cues to focus.1
This study attempts to broaden our understanding of the phonetics of focus by
examining prosodic cues to focus in Chickasaw, a language like Wolof with
morphological marking of focus. A number of potential pitch and duration cues to
contrastive focus are examined to determine whether Chickasaw redundantly uses
both prosody and morphology to mark focus.

2. BACKGROUND ON CHICKASAW
Chickasaw is a Western Muskogean language spoken by no more than a few
hundred predominantly elderly speakers in south-central Oklahoma. Chickasaw has
been the subject of extensive work by Pamela Munro and colleagues. Munro (2005)
provides a grammatical overview of Chickasaw and includes an analyzed text of a
traditional Chickasaw story. Munro and Willmond (1994) is a dictionary that also
contains a thorough description of Chickasaw grammar. Gordon et al. (2000)
provides a quantitative phonetic description of Chickasaw and Gordon (1999, 2005)
are descriptions of various aspects of the intonational system, including boundary
tones, prosodic phrasing, and pitch accents.

C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 69–82.
© 2007 Springer.

2.1. Intonation
Chickasaw utterances may be divided into a hierarchically ordered set of prosodic
constituents (Gordon 1999, 2005). The largest clearly defined intonational unit is the
Intonation Phrase, which is marked by an f0 excursion at its right edge, typically an f0
rise in statements and an f0 fall in questions. An Intonation Phrase consists of one or
more Accentual Phrases which are canonically associated with a [LHHL] tone
sequence when there is sufficient material in the phrase. The L tone is aligned with
the left edge of the Accentual Phrase, and the first H tone occurs early in the
Accentual Phrase, typically falling on or near the second sonorant mora, with
considerable gradience in its alignment. The final two tones usually associate with
the final syllable, yielding an f0 fall on the final syllable. Stressed final syllables, those
containing a coda consonant or a long vowel (see Gordon 2002 on stress in Chickasaw),
may not realize the final low tone, however. A short Accentual Phrase, one with fewer
than three sonorant moras, may also not realize all the tones of a canonical AP, with
deletion of the initial or final L being the typical strategy for truncating the AP. An
AP with three sonorant moras is usually sufficient to realize all tones, though a
two-syllable AP with three sonorant moras may not realize all its tones. Schematic
examples of the realization of tones in an AP appear in (1).

(1)
     a. Monomoraic first syllable    b. Bimoraic first syllable    c. Short AP
        L  H      H L                   L H           H L             H L
        [ µ  µµ   σ ]AP                 [ µµ  σ  σ   σ ]AP            [ µ  µ ]AP
        na ʃoː  baːt                    nam  bi laː maʔ               fa la

Chickasaw strongly tends to align Accentual Phrase boundaries with word
boundaries; thus, it is most common for a single word to constitute an entire
Accentual Phrase. It is possible, however, for two relatively short words, i.e. words
of one or two syllables, to group together into a single Accentual Phrase.

2.2. Focus
Chickasaw has at least two types of focus markers (Munro and
Willmond 1994) which are suffixed to focused nouns and differ according to
whether the focused element is a syntactic subject or an object. The first focus
suffix, -hoːt when attached to subjects and -ho when affixed to objects, is termed
a “focus/inferential case ending” by Munro and Willmond (1994:liv) and will not be
discussed further in this paper. The focus of this paper is the contrastive focus
suffix, which is realized as -akot with subjects and as -akõː with objects (Munro
and Willmond 1994:liii). Although the precise semantic conditions that give rise to the
contrastive focus are not completely understood, one of its primary functions is to
attract narrow focus to the noun which it modifies. There is no comparable suffix
affixed to verbs to signal narrow focus on the verb. Sentences exemplifying the
contrastive focus suffixes and their counterparts lacking contrastive focus marking
appear in (2).

(2) a. hatːak-at koni(ã) pisa.
       man-subj skunk sees
       The man sees the skunk.

    b. hatːak-akot koni(ã) pisa.
       man-cont.subj. skunk sees
       THE MAN sees the skunk.

    c. hatːak-at koni-akõː pisa.
       man-subj skunk-cont.obj. sees
       The man sees THE SKUNK.

As the sentences in (2) indicate, non-focused subjects are marked with the suffix
-at, while non-focused objects may either have no overt suffix or be marked with the
suffix -ãː. The unmarked word order in Chickasaw is SOV, though other orders
are possible under certain as yet not well-understood semantic conditions, including
focus, which may be associated with fronting of the focused element. For example,
sentence (2c) could appear with a fronted object, i.e. koniakõː hatːakat pisa ‘The
man sees THE SKUNK’.

3. PRESENT STUDY

3.1. Methodology
The present study examines the prosodic realization of sentences involving
contrastive focus on subjects and objects. Data were collected during elicitation
sessions with individual speakers. Subjects were presented with English sentences
containing a subject, object, and verb and instructed to give the Chickasaw
equivalent. Focus was elicited by offering English translations emphasizing the
focused element. Three different focus conditions were elicited: one involving
broad focus, i.e. no special focus on any particular element, one with narrow focus
on the subject and one with narrow focus on the object. Subjects repeated each
sentence between three and five times. The corpus used in the experiment appears
in Table 1.
Table 1. Corpus recorded for the focus experiment

NO FOCUS
Speakers 1-4
hatːakat naʃoːbai pisa                   The man sees the wolf.
hatːakat ampaska pisa                    The man sees my bread.
hatːakat waːkaʔ pisa                     The man sees the cow.
hatːakat hopaːjiʔ pisa                   The man sees the fortune teller.
Speaker 5
naːholːaːt naʃoːba pisaːtok              The white man saw the wolf.
naːholːaːt ampaska pisaːtok              The white man saw my bread.
naːholːaːt waːkaʔ pisaːtok               The white man saw the cow.
naːholːaːt hopaːjiʔ pisaːtok             The white man saw the fortune teller.

SUBJECT FOCUS
naːholːaːkot ãːnampakãːliʔ pisaːtok      THE WHITE MAN saw my flower.
naːholːaːkot minkoʔ pisa(ːtok)           THE WHITE MAN sees(saw) the chief.
naːholːaːkot ofõːlo pisaːtok             THE WHITE MAN saw the owl.

OBJECT FOCUS
naːholːaːt minkaːkõː pisaːtok            The white man saw THE CHIEF.
naːholːaːt amofõːlaːkõː pisaːtok         The white man saw MY OWL.
naːholːaːt satːibaːpiʃiakõː pisaːtok     The white man saw MY BROTHER.

Data were collected and analysed from a total of five female speakers. Four of the
speakers were recorded in Oklahoma in 1996 while the remaining speaker was
recorded in Los Angeles in 2002. Subjects were recorded on DAT tape using a
high-quality, head-mounted noise-cancelling microphone. Data were then
transferred onto computer using Scicon MacQuirer at a sampling rate of 22.5 kHz.
Two measurements that could potentially distinguish different focus conditions
prosodically were made using the MacQuirer software. First, the average
fundamental frequency for each of the three words comprising each sentence was
calculated to determine whether focused words are produced with heightened pitch
relative to postfocus elements, a common prosodic realization of focus cross-
linguistically. Second, the duration of the pause between the subject and object and
between the object and verb was measured to ascertain the degree of juncture
between different words under different focus conditions. Prosodic boundaries
between words in postfocus position in other languages are commonly eliminated,
reducing the level of temporal disjuncture between elements in postfocus position.
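
For illustration only, the two measurements can be sketched as follows. This is not the MacQuirer procedure: the data layout (an f0 track as time/value pairs, with unvoiced frames coded as 0) and the names mean_f0 and pause_durations are assumptions introduced here.

```python
from statistics import mean

def mean_f0(f0_track, start, end):
    """Average f0 (Hz) over the frames of one word, i.e. frames in [start, end).

    Assumption: f0_track is a list of (time_s, f0_hz) pairs and unvoiced
    frames are coded as 0, so they are skipped.
    """
    voiced = [f0 for t, f0 in f0_track if start <= t < end and f0 > 0]
    return mean(voiced) if voiced else None

def pause_durations(words):
    """Pause (ms) between the end of each word and the start of the next.

    Assumption: words is a list of dicts with 'start' and 'end' times in
    seconds, ordered subject, object, verb.
    """
    return [round((nxt["start"] - cur["end"]) * 1000)
            for cur, nxt in zip(words, words[1:])]

# Hypothetical values, not taken from the recordings.
track = [(0.1, 190.0), (0.2, 0.0), (0.3, 200.0), (0.7, 180.0)]
words = [{"start": 0.0, "end": 0.5},
         {"start": 0.6, "end": 1.0},
         {"start": 1.05, "end": 1.5}]

subject_f0 = mean_f0(track, 0.0, 0.5)   # mean of the two voiced frames
pauses = pause_durations(words)         # [post-subject, post-object]
```

Under these assumptions the first value returned by pause_durations corresponds to the post-subject pause and the second to the post-object pause.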

4. RESULTS

4.1. Fundamental Frequency

A two-factor (focus condition and syntactic category) analysis of variance pooling
together results from all speakers failed to indicate a significant effect of either focus
condition or syntactic category on f0 values: for syntactic category (subject, object,
verb), F(2, 349) = 1.706, p = .1831; for focus condition (no focus, subject focus,
object focus), F(2, 349) = .664, p = .5153. There was, however, a significant
interaction between focus condition and syntactic category: F(4, 349) = 3.280,
p = .0117. This interaction was attributed primarily to an overall raising of f0 for
subjects in sentences involving any type of focus, either subject or object focus.
This effect was only observed for certain speakers but not others. Another effect
contributing to the interaction between focus and syntactic category was a lowering
of f0 on verbs in sentences with a narrow focused noun. Again this effect was
speaker dependent, however. Given the considerable interspeaker variation in the
expression of focus, it is thus instructive to consider results for individual speakers.
Average f0 results for individual speakers are given in Table 2.
Speaker 1 displayed a significant raising of f0 for subjects in sentences with
either narrow focus on the subject or object. Unpaired t-tests for this speaker
revealed a significant difference between f0 values for subjects in broad focus
sentences and subjects in sentences with either narrow focus on the subject, t(2,13) =
2.824, p = .0144, or narrow focus on the object, t(2,14) = 3.146, p = .0072. F0
values did not differ reliably between subjects in sentences with subject focus and
those with object focus. Nor was there any significant difference in f0 values for
objects or verbs under the three focus conditions.
Results for speaker 2 were similar to those for speaker 1: f0 values for subjects
were higher in sentences involving narrow focus than those with broad focus. This
difference was only a trend, however, and did not quite reach statistical significance
in unpaired t-tests: broad focus vs. narrow subject focus, t(2, 20) = 1.899, p = .0721 ;
broad focus vs. narrow object focus, t(2, 9) = 1.963, p = .0662. F0 values for objects
and verbs did not differ significantly under different focus conditions.
Speaker 3 also displayed an overall raising of f0 in subjects in sentences with
narrow focus either on the subject or the object: broad focus vs. narrow subject focus,
t(2, 25) = 4.588, p<.0001; broad focus vs. narrow object focus, t(2,22) = 5.832,
p<.0001. In addition, f0 values were heightened for objects in sentences involving
narrow focus on either the subject or object: broad focus vs. narrow subject focus,
t(2,25) = 2.340, p = .0275; broad focus vs. narrow object focus, t(2,22) = 2.221,
p = .0369. Finally, verbs were found to have lower f0 values in sentences with
narrow subject focus than broad focus sentences: t(1,4) = 3.033, p = .0387. The
data recorded from this speaker did not allow for measurement of f0 values for verbs
in sentences with narrow object focus. Interestingly, a tendency to lower f0 of verbs
in sentences with narrow focus also was observed in speaker 1, though this effect
did not reach significance for this speaker.
Speaker 4 also raised f0 values for subjects in narrow focus sentences:
t(2,21) = 2.748, p = .0120. Sentences with narrow focus on the object were not
recorded from this speaker. Focus did not impact f0 values for either objects or
verbs for speaker 4.
Speaker 5 was the only speaker for whom subject narrow focus and object
narrow focus were differentiated both from each other and from broad focus along
the f0 dimension. Interestingly, for this speaker, f0 values for subjects were highest
in object focus sentences (184 Hz on average), and lowest in broad focus sentences
(158 Hz on average), with intermediate values obtaining in subject focus sentences
(165 Hz on average). Values differed significantly from each other between the
three focus conditions: broad focus vs. narrow subject focus, t(2,27) = 2.056,
p = .0495; broad focus vs. narrow object focus, t(2,22) = 3.919, p = .0007; narrow
subject focus vs. narrow object focus, t(2,21) = 2.811, p = .0105. Speaker 5 also
raised f0 for objects under focus relative to unfocused objects in both broad focus
sentences, t(2,23) = 3.176, p = .0042, and sentences with narrow focus on the
subject, t(2,23) = 2.456, p = .0220. Objects did not differ reliably in f0 between
broad focus and narrow subject focus sentences. Differences in focus condition did
not significantly affect f0 values for verbs.
Figures 1-3 illustrate sentences uttered by speaker 5 with three different focus
conditions. Figure 1 is realized with broad focus, Figure 2 with narrow focus on
the subject, and Figure 3 with narrow focus on the object. As the figures show, the
sentence with object focus (Figure 3) is associated with a blanket raising of f0 for the
subject and object (and to a lesser extent, the verb, though this is not a consistent
property of object focus). Subject focus (Figure 2) triggers a raising of f0 in the
subject relative to the subject in the broad focus sentence (Figure 1) but not relative
to the subject in the object focus sentence. It may also be observed that the broad
focus sentence in Figure 1 differs in prosodic constituency from the two sentences
with a narrow focused element. The subject and object together form a single
Accentual Phrase when neither is focused but belong to different Accentual Phrases
when either one is focused.

Figure 1. Pitch track for broad focus sentence naːholːaːt minkoʔ pisaːtok ‘The white man saw
the chief.’

Figure 2. Pitch track for subject focus sentence naːholːaːkot minkoʔ pisaːtok ‘THE WHITE
MAN saw the chief.’

Figure 3. Pitch track for object focus sentence naːholːaːt minkaːkõː pisaːtok ‘The white man
saw THE CHIEF.’

Table 2. Average f0 results for individual speakers (in Hertz; N = narrow focus)

                          Speaker
Word      Focus      1     2     3     4     5
Subject   Broad     191   192   160   192   158
          N-subj    205   201   183   210   165
          N-obj     204   204   188   ----  184
Object    Broad     199   189   166   202   159
          N-subj    195   196   177   203   164
          N-obj     205   198   180   ----  181
Verb      Broad     216   199   187   187   164
          N-subj    203   206   149   191   166
          N-obj     ----  ----  ----  ----  174
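
The amount of f0 raising on subjects can be read off Table 2 by subtracting each speaker's broad focus mean from the corresponding narrow focus mean. The sketch below does only this arithmetic on the tabulated averages; the names SUBJECT_F0 and raising are introduced here for illustration, and the differences say nothing about statistical significance.

```python
# Mean subject f0 (Hz) from Table 2, speakers 1-5.
# None marks the condition that was not recorded for speaker 4.
SUBJECT_F0 = {
    "broad":  [191, 192, 160, 192, 158],
    "n_subj": [205, 201, 183, 210, 165],
    "n_obj":  [204, 204, 188, None, 184],
}

def raising(condition):
    """Difference (Hz) between a narrow focus condition and broad focus, per speaker."""
    return [None if value is None else value - broad
            for value, broad in zip(SUBJECT_F0[condition], SUBJECT_F0["broad"])]

subject_focus_raising = raising("n_subj")
object_focus_raising = raising("n_obj")
```

Every available difference is positive, which matches the observation that subject f0 is higher under either type of narrow focus than under broad focus.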

In summary, both subject and object narrow focus consistently triggered raising
of f0 in subjects. One speaker also displayed raising of f0 in objects under both object narrow
focus and subject narrow focus sentences. In addition, f0 for verbs was also lowered
in sentences involving narrow focus for two speakers. Somewhat surprisingly,
object narrow focus and subject narrow focus were only differentiated for one
speaker in terms of average f0 values. For this speaker, object focus triggered
raising of f0 for the focused object, as one might expect. However, this speaker also
curiously displayed higher f0 values for subjects in sentences with object focus than
for subjects under narrow focus themselves.
4.2. Duration
A two-factor (syntactic category and focus condition) ANOVA pooling results from
all five speakers indicated a significant effect of both syntactic category and focus
on the pause duration between words in sentences: for syntactic category, F(1,284) =
6.200, p = .0133; for focus condition, F(2,284) = 11.242, p<.0001. There was also
a significant interaction between the two factors: F(2,284) = 23.029, p<.0001.
Overall, the pause between subject and object was shortest in broad focus sentences
and longest in sentences with narrow focus on the object. In contrast, the pause
between object and verb was shortest in object focus sentences and longest in
sentences with broad focus. Results averaged across speakers appear in Figure 4.

[Figure 4: bar chart of pause duration in milliseconds (vertical axis, 0 to 300) at the
post-subject and post-object positions for the broad focus, subject focus, and object
focus conditions.]

Figure 4. Pause durations under three different focus conditions (all speakers pooled
together, bars represent one standard deviation from mean)

A series of pairwise t-tests revealed a highly significant difference in the post-subject
pause between sentences with broad focus and both sentences with narrow
focus on the subject, t(1,120) = 5.404, p<.0001, and those with narrow focus on the
object, t(1,98) = 6.540, p<.0001. Only one of the three pairwise comparisons involving
the post-object pause, however, reached significance, the difference between the
broad focus and narrow object focus conditions: t(1,98) = 2.363, p = .0201.
Pause durations for individual speakers appear in Table 3. There was some
variation between speakers in their duration results. Looking first at the post-
subject pause, four of the five speakers displayed the shortest pause after subjects
in broad focus sentences, while speaker 5 did not reliably differentiate the broad
focus and narrow subject focus conditions in terms of post-subject pause duration.
Only speaker 5 had a reliable difference in the post-subject pause between the two
narrow focus sentence types: the pause in object focus sentences was greater than in
subject focus sentences. For the other speakers, the two narrow focus conditions
were not significantly different from each other in their post-subject pauses.
Turning to post-object pause duration, there was greater interspeaker variation,
with the most dominant pattern involving decreased duration following narrowly
focused objects. Speaker 2 had the shortest post-object pause in narrow object focus
sentences and roughly similar post-object pause durations in sentences with narrow
subject focus and those with broad focus, though none of the pairwise comparisons
reached significance in t-tests. Speaker 3 also displayed the shortest post-object
pause in sentences with object focus though differences between the three focus
conditions were quite small. Only the difference between object focus and subject
focus conditions reached statistical significance for this speaker: t(1,13) = 3.346,
p = .0053. Speaker 5 followed a similar pattern with shorter pauses following
focused objects with both pairwise comparisons involving narrow object focus
sentences reaching significance: narrow object focus vs. narrow subject focus,
t(1,24) = 2.652, p = .0139; narrow object focus vs. broad focus, t(1,24) = 2.518,
p = .0189. The close degree of juncture between a focused object and the following
verb can be observed in Figure 3 earlier in the paper.
Speaker 1 displayed a very different pattern: she had the shortest post-object
pause under the narrow subject focus condition, and the longest post-object pause
under the object focus condition, with intermediate values in the broad focus
condition. Only the comparisons involving narrow subject focus reached
significance: narrow subject focus vs. narrow object focus, t(1,7) = 3.961, p = .0055;
narrow subject focus vs. broad focus, t(1,14) = 3.639, p = .0194. Speaker 4, for whom
sentences with narrow object focus were not recorded, displayed shorter pauses after
objects in sentences with narrow focus on the subject, though this difference
narrowly missed significance: t(1,21) = 2.057, p = .0523.

Table 3. Pause duration results for individual speakers (in milliseconds; N = narrow focus)

                            Speaker
Pause       Focus      1     2     3     4     5
Post-subj   Broad      44    13     0     8    93
            N-subj    316   313    41   132    75
            N-obj     235   244    88   ----  143
Post-obj    Broad      66   144    97   125    64
            N-subj     25   123   104   104    69
            N-obj      96    61    90   ----   56
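
The per-speaker patterns described above can be checked directly against Table 3 by finding, for each speaker, the focus condition with the shortest mean pause. The following is a sketch over the tabulated means only; the names PAUSES and shortest_condition are introduced here, and the comparison ignores whether a difference reached significance.

```python
# Mean pause durations (ms) from Table 3, speakers 1-5.
# None marks the condition that was not recorded for speaker 4.
PAUSES = {
    "post_subj": {
        "broad":  [44, 13, 0, 8, 93],
        "n_subj": [316, 313, 41, 132, 75],
        "n_obj":  [235, 244, 88, None, 143],
    },
    "post_obj": {
        "broad":  [66, 144, 97, 125, 64],
        "n_subj": [25, 123, 104, 104, 69],
        "n_obj":  [96, 61, 90, None, 56],
    },
}

def shortest_condition(position, speaker):
    """Focus condition with the shortest mean pause for one speaker (numbered 1-5)."""
    values = {cond: row[speaker - 1]
              for cond, row in PAUSES[position].items()
              if row[speaker - 1] is not None}
    return min(values, key=values.get)

post_subj = [shortest_condition("post_subj", s) for s in range(1, 6)]
post_obj = [shortest_condition("post_obj", s) for s in range(1, 6)]
```

On these means, speakers 1 to 4 have their shortest post-subject pause under broad focus, and speakers 2, 3 and 5 have their shortest post-object pause under narrow object focus.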

In summary, broad focus was typically associated with a very close degree of
temporal juncture between subjects and objects (with zero or nearly zero pause after
the subject for speakers 2, 3, 4), while the two narrow focus conditions were not
consistently differentiated in terms of their effect on the duration of pauses after the
subject. The two narrow focus sentence types were, however, differentiated in their
effect on the level of juncture between object and verb. Objects carrying narrow
focus were followed by very short pauses relative to unfocused objects both in
sentences with broad focus and sentences with narrow focus on the subject. These
patterns, though dominant, were not entirely consistent across speakers.
Speaker 5 differed from the other speakers in terms of pause durations after the
subject, whereas speaker 1 differed from the other speakers in her results for post-
object pauses. It should also be noted that the increased temporal proximity between
a focused object and verb observed for most speakers is not associated with
elimination of the Accentual Phrase boundary typically separating most lexical
items greater than two syllables in sentences lacking any narrow focused element.
As Figures 1-3 show, the first syllable of the verb is realized with low tone, the initial
tone of a Chickasaw Accentual Phrase, which characteristically has the tonal pattern
[LHHL] (Gordon 1999, 2005).

5. DISCUSSION
This paper has shown that Chickasaw marks contrastive focus not only
morphologically but also through prosody. The strategies employed by Chickasaw
to mark focus prosodically resemble those exploited by other languages in some
respects but differ from them in others. Both narrow object
focus and narrow subject focus were characteristically associated with raised
f0 values for subjects, and, for one speaker, objects as well. Only one speaker
differentiated narrow object focus and narrow subject focus, however: for this
speaker, f0 values were higher for focused objects than non-focused objects. The
raising of f0 of subjects in both sentences with narrow subject focus and sentences
with narrow object focus is an unusual feature of Chickasaw, as increased f0 is
characteristically associated with only the focused element in most languages,
including English (Beckman and Pierrehumbert 1986), Korean (Jun 1993), and Hausa
(Inkelas and Leben 1990). The dominant cross-linguistic pattern entailing localized
raising of f0 under focus was found only for a single Chickasaw speaker. Even this
speaker, however, displayed higher f0 values for subjects in sentences with narrow
object focus than in sentences with narrow subject focus. It thus seems that raising
of f0 is a general strategy for signalling any type of focus in Chickasaw and is not a
reliable cue to picking out which element is being focused. It is also worth noting
that two speakers displayed lowering of f0 in verbs in sentences with narrow focus
on either the subject or object. This pattern may be viewed as similar to the
deaccenting of words in the same intermediate phrase following a focused element
in English (Beckman and Pierrehumbert 1986), though focus leads only to a blanket
lowering of f0 in verbs in Chickasaw and does not actually lead to suppression of
the nuclear pitch accent in an IP final verb.
Chickasaw’s use of duration to signal focus follows, in some respects, a pattern
typical of other languages. A focused object increases the temporal proximity of the
object and following verb, a pattern similar to that found in Korean (Cho 1990, Jun
1993). It is important to note, however, that while a focused object triggers deletion
of the Accentual Phrase boundary between an object and following verb in Korean,
the change in temporal proximity of object and verb in Chickasaw is not necessarily
associated with a change in prosodic constituency. An Accentual Phrase boundary
may also separate the verb from a preceding focused object, just as it typically
separates a verb from a preceding unfocused object. It is conceivable, however, that examination of
more data will reveal a statistically greater likelihood for focused objects to be
grouped together in an Accentual Phrase with the following verb. Thus, it is as yet
unknown whether the temporal effects induced by placing narrow focus on the
object in Chickasaw are purely phonetic or whether the increased temporal closeness
of a focused object and verb has ramifications for prosodic constituency.
Another temporal phonetic effect triggered by narrow focus is increased
separation between the subject and object. For all but one speaker, this enhanced
level of disjuncture is associated with narrow focus on either the subject or the object
and often has phonological ramifications for Accentual Phrase formation: the
subject and object are more likely to be grouped in the same Accentual Phrase when
neither carries narrow focus than when one or both does. Although the symmetry of
this effect under both narrow focus conditions, subject focus and object focus, is
atypical from a cross-linguistic standpoint, it serves to set off the focused element
from adjacent words, perhaps increasing its prominence. In the case of a focused
object, the increased pause before the object complements the decreased pause
following the object. For two speakers, the pause preceding a focused object is
greater than the pause preceding an unfocused object in both sentences without
narrow focus and sentences with a focused subject. The increased disjuncture
before a focused element for this speaker accords with other languages in which a
phonological phrase boundary is obligatory before a focused constituent, e.g.
Korean (Jun 1993), Hausa (Inkelas and Leben 1990), Japanese (Pierrehumbert and
Beckman 1988), and Greek (Condoravdi 1990).

5. SUMMARY
Results of this study suggest considerable diversity among Chickasaw speakers in
their prosodic realization of focus. More generally, the examined data suggest that
Chickasaw is less reliant on prosody to signal focus than other languages in which
focus is not signalled through morphology. While broad focus sentences are
characteristically differentiated from narrow focus through their lower f0 in nouns and,
for certain speakers, higher f0 in verbs, f0 does not, with the exception of one
speaker, distinguish sentences with narrow focus on the subject from those with
narrow focus on the object. Interword pause durations appear more reliable in
cueing focus, with both narrow focus conditions triggering increased temporal
disjuncture between subject and object, presumably a strategy for increasing the
salience of focused elements. For three speakers, narrow object focus was
associated with increased temporal proximity of the object and verb relative to the
other two focus conditions, broad focus and narrow focus on the subject, a trend
which parallels the dephrasing of post-focus elements in other languages. For one
speaker, narrow focused objects were preceded by a longer pause than objects not
under narrow focus.
The results for Chickasaw may be contrasted with the results of Rialland and
Robert’s study of Wolof, another language with morphological expression of focus.
Rialland and Robert do not report any use of prosodic cues to focus for Wolof,
though it should be noted that their study focused on intonation, i.e. f0, the
parameter which less reliably differentiated various focus conditions in Chickasaw.
It is thus conceivable that durational cues to focus are also present in Wolof. The
present study of Chickasaw suggests that, although the role of prosodic cues to focus
may be less consistently exploited in Chickasaw than in languages without overt
focus morphology, measurable phonetic cues to focus are potentially present even in
languages in which morphology carries the primary burden in signalling focus.

6. NOTES
1
A sincere thanks to the Chickasaw speakers, who so generously provided the data discussed in this
paper, and to Pam Munro for her assistance in preparing the corpus examined in this paper, and more
generally, for her insights and suggestions related to Chickasaw prosody. Portions of the data discussed
here were collected as part of an NSF grant awarded to Peter Ladefoged and Ian Maddieson to document
the phonetic properties of endangered languages.
2
Note that the long vowels in naʃoːba ‘wolf’, hopaːjiʔ ‘fortune teller’ and pisaːtok ‘saw’ are not phonemic
long vowels but are the result of a process of rhythmic lengthening targeting a non-final vowel in an
open syllable immediately preceded by a short vowel in an open syllable (see Munro and Willmond
1994, Munro 2005 for discussion of rhythmic lengthening). Rhythmically lengthened vowels behave
parallel to phonemic long vowels phonologically and are either, depending on the speaker, identical in
length or nearly identical in length to phonemic long vowels (see Gordon et al. 2000 for phonetic data).

7. REFERENCES
Beckman, Mary and Janet Pierrehumbert. “Intonational structure in Japanese and English.” Phonology
Yearbook 3 (1986): 255-310.
Cho, Young-Mee. “Syntax and Phrasing in Korean.” In Sharon Inkelas and Draga Zec (eds.), The
Phonology-Syntax Connection, pp. 47-62. Chicago: University of Chicago Press, 1990.
Condoravdi, Cleo. “Sandhi Rules of Greek and Prosodic Theory.” In Sharon Inkelas & Draga Zec (eds.),
The Phonology-Syntax Connection, pp. 63-84. Chicago: University of Chicago Press, 1990.
Gordon, Matthew. “The Intonational Structure of Chickasaw.” Proceedings of the 14th International
Congress of Phonetic Sciences (1999): 1993-1996.
Gordon, Matthew. “Intonational phonology of Chickasaw.” In Sun-Ah Jun (ed.), Prosodic Models and
Transcription: Towards Prosodic Typology, pp. 301-330. Oxford: Oxford University Press, 2005.
Gordon, Matthew, Pamela Munro and Peter Ladefoged. “Some Phonetic Structures of Chickasaw.”
Anthropological Linguistics 42 (2000): 366-400.
Hayes, Bruce and Aditi Lahiri. “Bengali Intonational Phonology.” Natural Language and Linguistic
Theory 9 (1991): 47-96.
Horvath, Julia. Focus in the Theory of Grammar and the Syntax of Hungarian. Dordrecht: Foris, 1986.
Inkelas, Sharon and William Leben. “Where Phonology and Phonetics Intersect: the Case of Hausa
Intonation.” In John Kingston and Mary Beckman (eds.), Papers in Laboratory Phonology I:
Between the Grammar and Physics of Speech, pp. 17-34. New York: Cambridge University Press,
1990.
Jun, Sun-Ah. The Phonetics and Phonology of Korean Prosody. The Ohio State University: Doctoral
dissertation, 1993.
Kanerva, Jonni. “Focusing on Phonological Phrases in Chichewa.” In Sharon Inkelas and Draga Zec
(eds.), The Phonology-Syntax Connection, pp. 145-162. Chicago: University of Chicago Press, 1990.
Kiss, Katalin. “Identificational Focus Versus Information Focus.” Language 74 (1998): 245-273.
Lahiri, Aditi and Jennifer Fitzpatrick-Cole. “Emphatic Clitics and Focus Intonation in Bengali.” In Kager,
René and Wim Zonneveld (eds.), Phrasal Phonology, pp. 119-144. Nijmegen: University of
Nijmegen Press, 1999.
Munro, Pamela. “Chickasaw.” In H. Hardy and J. Scancarelli (eds.) Native Languages of the
Southeastern United States, pp. 114-156. Lincoln: University of Nebraska Press, 2005.
Munro, Pamela and Catherine Willmond. Chickasaw: an Analytical Dictionary. Norman: University of
Oklahoma Press, 1994.
Pierrehumbert, Janet and Mary Beckman. Japanese Tone Structure. Cambridge, Mass.: MIT Press, 1988.
Rialland, Annie and Stéphane Robert. “The intonational system of Wolof.” Linguistics 39 (2001):
893-939.
Selkirk, Elisabeth and Tong Shen. “Prosodic Domains in Shanghai Chinese.” In Sharon Inkelas and
Draga Zec (eds.), The Phonology-Syntax Connection, pp. 313-338. Chicago: University of Chicago
Press, 1990.
CARLOS GUSSENHOVEN

TYPES OF FOCUS IN ENGLISH

1. INTRODUCTION
This chapter has two aims. First, section 1.0 intends to show that the way pitch
accents express information structure in English is subject to structural constraints.
This view is contrasted with one in which the pitch accent directly signals the
information status of the word it occurs on. The second aim, pursued in section 2.0,
is to show that there isn’t just a single semantic contrast between ‘old’ and ‘new’
information: languages express various kinds of focus meanings, like reactivating
focus, contingency focus and corrective focus.

1.1. The expression of focus by means of pitch accents in English

Intuitively, pitch accents in English indicate that the speaker means to stress the
importance of the words they appear on. Recapitulating earlier research, this section
endorses a research tradition in which this intuition is undermined by a
demonstration that there are structural constraints on the way pitch accents signal the
focus constituent of the sentence. That is, the connection between the pitch accent
and pragmatic ‘importance’ is not word-based. Depending on the syntactic structure,
an accented word may signal the focus of a larger constituent than that formed by
the word on its own. Before exploring the role of the syntactic structure in determining
the relation between accentuation and the focus constituent, the alternative, ‘direct’
position is presented as a background.
Generally, speakers conduct conversations so as to establish a common
understanding about some aspect of the world. They keep track of the development
of their common understanding in a ‘discourse model’, and indicate the way their
information relates to the hearer’s understanding. Pitch accents express this
‘information status’. The focus constituent may, in Ladd’s (1980, p. 77)
terminology, be ‘broad’ or ‘narrow’, depending on size. If a speaker takes someone
to task for making a pedantic remark, the sentence Even a nineteenth-century
professor of CLASSics wouldn’t have allowed himself to be so pedantic contains a
relatively broad focus on a nineteenth-century professor of classics. In Ladd’s
words, the addressee here ‘has nothing to do with classics, is not a professor, and is
more or less contemporary’, and a nineteenth-century professor of classics just so
happens to be the most pedantic type of person the speaker could think of. However,
if the speaker were trying to come up with what to him is a particularly clear case of
nineteenth-century pedantry, the focus would be narrowed down to professor of
classics, while the focus would be narrowed down further to just classics if
the discussion was more specifically about pedantry among nineteenth-century
professors.

C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 83–100.
© 2007 Springer.

The variation between ‘broad’ and ‘narrow’ focus which this example shows was
earlier discussed under the rubrics of ‘normal stress’, for the ‘broadest’ case, and
‘contrastive stress’, for other cases (where ‘stress’ is equivalent to ‘accent’)
(Newman 1946; Chomsky & Halle 1968; Bresnan 1971; Bresnan 1972). This older
view held that ‘normal’ accent patterns (which were never defined, but were
assumed to be the most natural pattern when reading out an isolated sentence) were
determined by syntactic factors, but that ‘contrastive’ accent patterns arose from
independently meaningful considerations. Thus, ‘normal’ stress was believed to
yield to formal linguistic rules, while ‘contrastive’ stress was not. This position
came under attack by Bolinger (1972) and Schmerling (1974). Bolinger stressed that
all accent placements are meaningful, and that it is impossible to draw a dividing
line between ‘normal’ and ‘contrastive’ accents. In this view, all new information
implicitly contrasts with other information: the sentence I’ll give you a BOOK does
not change its structure if the implication changes from ‘I won’t give you a cd-rom’
to ‘I won’t give you anything else’, or even to ‘I won’t behave in any other way’.
These differences are gradient and non-structural, Bolinger argued (1972), and in all
three cases the accent location is determined by the speaker’s informational bias
towards the concept ‘book’.
Bolinger’s point that ‘neutral’ and ‘contrastive’ accentuations should be
explained within a single conception of information structure was welcomed by later
researchers (Schmerling 1974; Ladd 1980; Gussenhoven 1983a; Selkirk 1984). His
position was otherwise vulnerable on two counts, however. One is that ‘contrastive’
focus may actually be expressed differently from ‘neutral’ focus. In fact, when
looking at languages other than English, these differences turn out to be of two
kinds. First, ‘contrastive’ may refer to a type of focus, to be discussed as ‘corrective’
focus in section 2.0. Even if English does not always distinguish between
‘presentational focus’ (Zubizarreta 1998; Selkirk 2002) or ‘information focus’ (Kiss
1998) and corrective focus, some languages, like European Portuguese, consistently
use different forms (Frota 1998). In such cases, the equivalents of (1a) and (1b) are
not homophonous.

(1) a. (A: Has she driven any other cars besides Fords and Chevrolets?)
       B: She used to drive [a RENault CLIO]FOC

    b. (A: Helen used to drive a Ford Capri)
       B: No, she used to drive [a RENault CLIO]FOC

Second, some languages appear to make a distinction between broad focus, in
which there is no particular constituent which is focused (or, alternatively, the entire
expression is considered the focus constituent) and narrow focus. For instance,
Bengali has different pitch accents in cases equivalent to (2a) and (2b) (Hayes &
Lahiri 1991). In such cases, the term ‘neutral’ is applied to the broad-focus form.

(2) a. (A: What else can you tell us about Helen?)
       B: She [used to drive a Renault CLIO]FOC

    b. (A: What kind of Renault did she drive?)
       B: She used to drive a Renault [CLIO]FOC

The other element in Bolinger’s position to be challenged is more directly relevant
to English. It was the belief that a pitch accent highlighted the informational value of
just the word it is placed on. Bolinger (1985, 1987) insisted that the relation between
accent and focus was direct, rejecting what is known as focus projection, the ability
of an accented word to signal the focus for a higher constituent, like the phrase or
clause, causing differently sized focus constituents to have the same form (Chomsky
1971; Jackendoff 1972; Selkirk 1984). Such ‘focus ambiguity’ is excluded under
Bolinger’s direct view of the relation between accent and focus, but almost
inherently part of approaches which see the ‘focus-to-accent’ relation, a term
introduced in Gussenhoven (1985), as indirect, and mediated by the linguistic
structure.
The structural nature of accentuation in English in fact stretches all the way from
the lexicon to the sentence. The phonology determines that in the adjective MANifest
the accent is on the first syllable. Morphological formations impose accentuation
patterns, as in SaTANic, where the adjectival suffix causes the accent to be on the
second syllable of the stem (cf. SAtan), or in LANguage consultant, where
consultant is unaccented because it is the second constituent of a compound (Burzio
1994, Hayes 1995, Zonneveld, Trommelen, Jessen, Rice, Bruce, & Arnason 1999,
and references therein). Post-lexically, the phonology determines the ‘shifted’
location of the pitch accent in the adjective in CHInese LANtern (cf. It’s ChiNESE).
Schmerling (1974) pointed to a further regularity at the level of the syntax, which
requires a predicate to be unaccented when paired with its subject or object in what
she called ‘news sentences’, i.e. when the focus is broad. Thus, while Schmerling
agreed with Bolinger on the untenability of the distinction between ‘contrastive’
stress and ‘normal’ stress, she disagreed on the role of structure. Her work formed
the basis of accounts of the focus-accent relation that rely on predicate-argument
structure (Gussenhoven 1978, 1983a, 1992, 1999a,b; Ladd 1983; Selkirk 1984,
1995). Ladd (1980) also emphasized the structural nature of accent distributions, but
related the fact that the verb often goes without an accent to a scale of accentability
applying to word classes; he later endorsed my 1983 proposal in Ladd (1983). Bolinger
continued to argue for the conflation of the syntactic regularity, and indeed of
the regularity expressed by Compound Stress, with deaccenting due to absence of
focus (Bolinger 1972, 1978, 1985, 1989).

1.2. The Focus-to-Accent relation in English: SAAR

The central observation is that English predicates that are surface-adjacent to an
accented argument need not be accented in order to be interpreted as focused
(Schmerling 1974; Gussenhoven 1983a; Gussenhoven 1992; Selkirk 1984; Selkirk
1995).1 Importantly, Schmerling observed that by the side of the SV ‘news sentence’
with its unaccented verb, (3), an unaccented predicate also appears after a non-
subject argument, as in (4). Accordingly, she formulated a principle stating that, in
news sentences, accents go to the argument (the subject and the object), but not to
the predicate. Thus, the lack of accent on died and grow has the same explanation, as
has the lack of accent on hit in (5). (All examples from Schmerling (1974)).

(3) JOHNSON died

(4) Great oaks from little ACORNS grow

(5) JOHN hit BILL

As I pointed out in a review of her book, Schmerling’s principle really extends to
all (presentational focus) sentences, provided the notion of ‘focus’ is brought in. To
account for the accentual pattern in her (6), she introduced the ‘Topic-comment’
sentence, and formulated the principle that in such sentences, both topic Truman and
comment died are accented. This is incorrect to the extent that when we reverse
topic and comment, the topic Truman remains unaccented, as in (6b), from
Gussenhoven (1978).

(6) a. TRUMAN DIED

b. The disease KILLED Truman

If we assume that ‘topic’ means ‘outside the focus’, things fall into place.
Comments are accented, topics are not; the topic in (6a) is accented
because of its position before the focus, where accents are optional. Not only do we
now account for (6a) and (6b), we can also generalize the instruction ‘accent the
comment’. Subjects and objects are ‘arguments’, as noted by Schmerling. That is,
they represent necessary elements in the semantics of the predicate, and as such
contrast with constituents that express circumstantial conditions on the predicate,
like time, space and manner adjuncts, henceforth ‘modifiers’. Schmerling’s principle
amounts to the generalization that any focused argument, predicate or modifier
forms its own comment, except the special case of the single comment formed by a
predicate that is adjacent to one of its arguments. The argument-predicate
connection seems especially clear from cases like (7a,b). Since direction adjuncts are
arguments of verbs of motion, as in (7a), no accent appears on the verb taken, but in
(7b), where the verb bury appears in combination with a place adjunct, there are two
‘comments’, and two accents appear. The fact that Independence occurs in a
Prepositional Phrase in both cases is immaterial. (Truman, again, is a topic.)
Following Schmerling’s strategy to employ German to demonstrate the same
regularities in a language with a different word order, I give Dutch translations to
bring out the difference more clearly (Gussenhoven 1978).

(7) a. Truman was taken to INDEPENDENCE
       Truman werd naar INDEPENDENCE gebracht

    b. Truman was BURIED in INDEPENDENCE
       Truman werd in INDEPENDENCE BEGRAVEN
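
The generalization just stated can be sketched procedurally. The following Python fragment is only an illustration under simplifying assumptions: a clause is treated as a flat list of constituents, the role labels "arg", "pred" and "mod" and the names Constituent and assign_accents are hypothetical, and optional accents before the focus, as on the topic in (6a), are not modelled.

```python
from dataclasses import dataclass


@dataclass
class Constituent:
    text: str
    role: str             # "arg", "pred" or "mod" (hypothetical labels)
    focused: bool = True  # False for topics and other material outside the focus


def assign_accents(clause):
    """Return the texts of the constituents that receive an obligatory accent."""
    accented = []
    for i, c in enumerate(clause):
        if not c.focused:
            continue  # material outside the focus gets no obligatory accent
        if c.role == "pred":
            # A focused predicate stays unaccented when it is surface-adjacent
            # to a focused argument, with which it forms a single comment.
            neighbours = [clause[j] for j in (i - 1, i + 1) if 0 <= j < len(clause)]
            if any(n.role == "arg" and n.focused for n in neighbours):
                continue
        accented.append(c.text)
    return accented


# (3) JOHNSON died
johnson_died = [Constituent("Johnson", "arg"), Constituent("died", "pred")]
# (7a) Truman was taken to INDEPENDENCE (direction adjunct counted as an argument)
taken = [Constituent("Truman", "arg", focused=False),
         Constituent("taken", "pred"),
         Constituent("Independence", "arg")]
# (7b) Truman was BURIED in INDEPENDENCE (place adjunct counted as a modifier)
buried = [Constituent("Truman", "arg", focused=False),
          Constituent("buried", "pred"),
          Constituent("Independence", "mod")]

print(assign_accents(johnson_died))  # ['Johnson']
print(assign_accents(taken))         # ['Independence']
print(assign_accents(buried))        # ['buried', 'Independence']
```

The only difference between the inputs for (7a) and (7b) is the role assigned to Independence, which is what the contrast in accentuation is claimed to follow from.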

The ‘comments’ were relabelled ‘focus domains’ in Gussenhoven (1983a), to mean
constituents that can be placed in focus by the accentuation of only one word, and
the generalization was given the status of a Sentence Accent Assignment Rule
(SAAR). It is comparable to the Compound Rule, which deletes accents on the
second constituent of compounds, but operates at a higher level of structure: a
predicate loses its accent when an adjacent accent appears on one of its arguments.
While the next paragraph discusses the way SAAR applies in complex sentences,
the structural nature of the relation can be shown in a number of ways even within
the clause. Interruption of the adjacency of predicate and argument by an accented
modifier breaks up the focus domain, causing the predicate to be accented, as in
(8c), which is to be compared with the uninterrupted focus domain in (8a) and with
the intervening unaccented time modifier just in (8b).

(8) a. [My TYRES have been cut]FOC

b. [My TYRES have just been cut]FOC ‘only a little while ago’
c. [My TYRES have JUST been CUT]FOC ‘without further ado’

Second, the semantically similar (9a) and (9b), from Gussenhoven
(1985), have accentuations that follow SAAR, not the semantics.

(9) a. [Your TROUSers are torn]FOC

b. [There’s a TEAR in your trousers]FOC

Third, the argument must have its head in focus. The focus constituent in (10a)
is black, and since the noun bird is outside it, the predicate cannot be deaccented. By
contrast, in (10b) the noun blackbird is included in the focus constituent, and the
pattern goes through. Both examples could be answers to a question about the well-
being of a group of birds, but only (10a) can count as a straightforward answer.
Example (10b) carries an implication of some awkward downplaying of the fact that
one of the birds has escaped (Gussenhoven 1985).

(10) a. ??The [BLACK]FOC bird [has escaped]FOC

(Cf. The BLACK bird has ESCAPED)
b. [The BLACKBIRD has escaped]FOC

Fourth, complex predicates behave like predicates. These include ‘natural
predicates’ like take advantage of, pay attention to, which Di Sciullo & Williams
(1987) argue are syntactically atomic. This explains why we can have (11a), but not
(11b). They support the syntactic difference between these structures by pointing out
that (11b) is in fact ill-formed, quite regardless of how it is accented. That is, take
great advantage of is not a multi-word verb, like take advantage of, but a syntactic
phrase.

(11) a. [BILL’s been taken advantage of]FOC

b. ?[BILL has been taken great advantage of]FOC

In Selkirk (1984, 1995), focus projection continues to higher levels of structure,
like the VP and the S. In my view of focus projection, only predicate focus can be
licensed by an accent on an argument. In fact, there is a further restriction to be
stipulated, which is that subjects can only license focus on the predicate if no further
lexical constituents follow. That is, Her HUSband beats the poor soul is not a well-
formed reply to Why has SHE come to this family refuge centre?, while Her
HUSband beats her is. In effect, the legitimate argument-predicate focus domains
are as in (12), where the constituents are lexical (i.e. function words are ignored).

(12) Possible predicate-argument focus domains:

[SUBJ-pred]]S, [pred-OBJ] ... ]S
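
The additional restriction on subjects can be made concrete with a small check. This is a sketch only: the tuple format (text, role, is_lexical), the role labels and the function name are hypothetical, and the classification of her as a function word follows the remark that function words are ignored.

```python
def subject_licenses_predicate(clause):
    """Check whether an accented subject may license focus on the predicate.

    `clause` is a list of (text, role, is_lexical) tuples in surface order,
    with role one of "subj", "pred" or "other" (hypothetical labels).
    """
    roles = [role for _, role, _ in clause]
    pred_index = roles.index("pred")
    # The subject must immediately precede the predicate.
    if pred_index == 0 or roles[pred_index - 1] != "subj":
        return False
    # No lexical constituent may follow the predicate; function words are ignored.
    return not any(is_lexical for _, _, is_lexical in clause[pred_index + 1:])


poor_soul = [("her husband", "subj", True),
             ("beats", "pred", True),
             ("the poor soul", "other", True)]
pronoun = [("her husband", "subj", True),
           ("beats", "pred", True),
           ("her", "other", False)]

print(subject_licenses_predicate(poor_soul))  # False
print(subject_licenses_predicate(pronoun))    # True
```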

1.3. SAAR in complex sentences

In complex sentences, SAAR applies as often as there are clause nodes in the
expression (Gussenhoven 1992). Consider the constructions in (13).

(13) a. Embedded nonfinite object clause (I heard a clock tick)
     b. Embedded nonfinite object clause plus indirect object (I forced a
        clock to tick)
     c. Resultative (I’ve painted the door green)
     d. Depictive (I drank the coffee cold)

In (13a), a nonfinite clause a clock tick functions as the argument of heard in the
main clause. SAAR requires that within the nonfinite clause, the argument a clock is
accented if both it and its predicate tick are included in the focus constituent. This is
shown in (14). In the main clause, the requirement is that the argument a clock tick
is accented and its predicate heard unaccented, if both constituents are included in
the focus constituent. Since the argument is accented, on clock, the condition has
been met. The accent on clock thus functions at two levels of structure, once at the
level of a clock (tick) and once at the level of (heard) a clock tick. In the same way,
lion and a lamb are arguments of the predicate devour in (15). At the higher level,
the requirement that a lion devour a lamb, the argument of saw, be accented here is
met through the presence of two accents.

(14) I [heard a clock tick]FOC

(I) [heard]Pred [[a CLOCK]Arg [tick]Pred]Arg

(15) I [saw a lion devour a lamb]FOC

(I) [saw]Pred [[a LION]Arg [devour]Pred [a LAMB]Arg]Arg

The structure of (13b) differs from (13a) in that the predicate (e.g. force,
promise, teach, tell) takes three arguments rather than two. In (13b), there is an
object to tick and an indirect object a clock, in addition to the subject. The indirect object
licenses the unaccented predicate forced, while to tick forms its own focus domain.
Therefore, two accents appear in (16). When the direct object is a clause, as in (17),
SAAR applies within it: the argument devour a lamb is a clause, which has an
accent on the argument a lamb and leaves the predicate devour unaccented.

(16) I [forced a clock to tick]FOC

(I) [forced]Pred [a CLOCK]Arg [[[to TICK]Pred]S]Arg

(17) I [taught a lion to devour a lamb]FOC

I [taught]Pred [a LION]Arg [[[to devour]Pred [a LAMB]Arg]S]Arg
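
The clause-by-clause application of SAAR in (14)-(17) can be sketched recursively. The fragment below is an illustration under stated assumptions, not a full implementation: the classes Word and Clause and the function accented_words are hypothetical, every constituent shown is taken to be inside the focus, the unfocused subject I is left out, and an embedded clause is always treated as an argument of the higher predicate.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Word:
    text: str
    role: str  # "arg" or "pred" (hypothetical labels)


@dataclass
class Clause:
    parts: List[Union["Word", "Clause"]]  # an embedded Clause counts as an argument


def accented_words(clause):
    """Apply the rule once per clause node; return accented words left to right."""
    # First pass: collect the accents contributed by each argument.
    inner = []
    for part in clause.parts:
        if isinstance(part, Clause):
            inner.append(accented_words(part))  # recurse into the embedded clause
        elif part.role == "arg":
            inner.append([part.text])
        else:
            inner.append(None)  # predicate: decided in the second pass

    # Second pass: a predicate is accented unless an adjacent argument is accented.
    result = []
    for i, part in enumerate(clause.parts):
        if inner[i] is not None:
            result.extend(inner[i])
            continue
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(clause.parts)]
        if not any(inner[j] for j in neighbours):
            result.append(part.text)
    return result


# (14) (I) heard [a CLOCK tick]
ex14 = Clause([Word("heard", "pred"),
               Clause([Word("clock", "arg"), Word("tick", "pred")])])
# (16) (I) forced a CLOCK [to TICK]
ex16 = Clause([Word("forced", "pred"), Word("clock", "arg"),
               Clause([Word("tick", "pred")])])
# (17) (I) taught a LION [to devour a LAMB]
ex17 = Clause([Word("taught", "pred"), Word("lion", "arg"),
               Clause([Word("devour", "pred"), Word("lamb", "arg")])])

print(accented_words(ex14))  # ['clock']
print(accented_words(ex16))  # ['clock', 'tick']
print(accented_words(ex17))  # ['lion', 'lamb']
```

In (14) the single accent on clock satisfies the rule at both clause levels, whereas in (16) the embedded clause contains no argument, so its predicate has to carry an accent of its own.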

Selkirk (1995) offers an alternative explanation for the difference between (13a)
and (13b), which relies on the presence of a subject trace for the verb to tick in (13a),
as shown in (18a). Assuming that a pitch accent in any event licenses focus on the
word it occurs on, her syntactic theory of focus projection, which also builds on
Rochemont (1984), postulates three projection relations that license focus for higher
constituents. First, heads license focus on phrases; second, objects (i.e. internal
arguments) license focus of the head; and third, moved constituents license their
trace (Selkirk 1995). Because subjects are assumed to be raised from their clause,
they leave a trace which is focus if the subject is focus, and the trace then projects
focus to the VP and ultimately to the whole clause. In effect, because a subject trace
is now treated as an internal argument, this procedure equates subjects with objects
for the purposes of the second projection relation. It has the additional effect of
explaining why to tick in (18a) can be unaccented, and yet be focus, since (18a) has
a trace, but (13b) has not, as shown in (18b), after Selkirk (1995). Her theory is
considerably less restrictive than the one defended here. The restriction to internal
arguments in the second projection clause would appear to be rendered vacuous by
the addition of the third clause. While in the original two-clause version subjects
were incapable of projecting focus at all, in the three-clause version subjects can
project focus to the entire clause, even in a sentence like JOHNson died of natural
causes. This seems incorrect; for discussion see Gussenhoven (1999a).

(18) a. [I heard [a CLOCK [[t] tick]]]

b. [I forced [the CLOCK [to [PRO TICK]]]

Moving to (13c,d), a summary of the syntactic analyses proposed for these
constructions is provided in Winkler (1996, ch.2). Two analyses would at first sight
be compatible with the accentuation facts. First, Di Sciullo & Williams (1987)
analyse sentences like (13c) as containing only one clause. The special feature is the
complex predicate: paint green is a single constituent, which can form a focus
domain with an argument like the door, thus remaining unaccented itself. (Within
the predicate, the accent goes to the adjective, just as it goes to the particle in phrasal
verbs like to look up.) This is shown in (19). As pointed out by Rod Walters
from the audience after a presentation of these data at the 2000 LAGB meeting in London,
this formation of complex predicates is probably subject to a size constraint, since
To paint the DOOR a bright green seems ill-formed.

(19) I [have painted the door green]FOC

(I) [have painted green]Pred [the DOOR]Arg

(20) (I) [have painted]Pred [[the DOOR]Arg [green]Pred]Arg

A second possible analysis of (19) is (20), where door would be accented
because it is a theme in the small clause the door (be) green. In this analysis, (be)
green is the predicate, while the small clause itself is a theme of paint. However, it
can be shown that resultatives don’t behave like argument-predicate structures. If the
door green is a clause, it should be able to be a focus constituent and have just an
accent on door. In (21), we see that this can be done with birds in birds sing, an
undoubted clause. By contrast, (22) cannot have the same accentuation. This is
explained if we assume that arguments like doors and walls do not have green, black
as predicates, but rather paint green, paint black. Under that interpretation, green,
black are headless fragments of predicates, and understandably arguments cannot
form focus domains with them (cf. also (10)).

(21) (A: So what have you seen in the nature reserve?)

I’ve seen BIRDS fight, I’ve seen TURTLES mate, I’ve seen
ELEPHANTS feed ...

(22) (A: So what have you ever painted?)

?? I’ve painted DOORS green, I’ve painted WALLS black, I’ve
painted FURNITURE red ...
I’ve painted DOORS GREEN, I’ve painted WALLS BLACK, I’ve
painted FURNITURE RED ...

Resultatives contrast with depictives of the type illustrated by (13d), which is
pronounced I drank the COFFEE COLD (Winkler 1996, p. 277 ff.). With respect to
the predicate drank, cold functions as a modifier requiring the proposition ‘I drank
the coffee’ to be valid under the condition ‘X be cold’. The possible interpretations
for ‘X’ are ‘the coffee’ and ‘I’. Since cold functions as a modifier in the clause I
drank the coffee, it has no argument at that level of structure with which it can form
a focus domain, even though the modifier itself may well be analyzed as a small clause
containing just cold as a predicate.
This concludes the section on the structural relation between pitch accents and
information structure in English. The next section discusses different types of focus.

2. TYPES OF FOCUS
As Dik (1980, 1997) makes clear, languages not only express information packaging
in different ways, they also express different focus meanings, or ‘focus types’.
Unlike Culicover & Rochemont (1983), I take formal characteristics rather than
contextual differences to be the criterion for recognizing a focus type. This section
lists a number of focus types that have been distinguished in English. In each case,
the form is described, and the meaning informally characterized.

2.1. Presentational focus

The term ‘focus’ is usually equivalent to ‘presentational focus’. A commonly used
diagnostic is questioning: the focus constituent is the part of the sentence that
corresponds to the answer to a question, either overt or implied (Kanerva 1989). In
the preceding section, many examples were given.

2.2. Corrective focus

When the focus marks a constituent that is a direct rejection of an alternative, either
spoken by the speaker himself (‘Not A, but B’) or by the hearer, the focus is
‘corrective’ (or ‘counterassertive’ cf. Dik 1980; Gussenhoven 1983a). As explained
in section 1.0, this type is often referred to as ‘contrastive’, as in Chafe (1974),
which term must not be confused with ‘narrow focus’. English bans pitch accents to
the right of the presentational focus within the intonational phrase, but to the left of
the focus, pitch accents are commonly used, as in (23a). However, with corrective
focus, deaccentuation would equally seem appropriate before the focus constituent,
as illustrated in (23b).

(23) a. (A: What’s the capital of Finland?)

B: The CAPital of FINland is [HELsinki]FOC

b. (A: The capital of Finland is OSlo)

B: (NO.) The capital of Finland is [HELsinki]CORRECTIVE

Languages that make a formal distinction between presentational focus and
corrective focus include Efik, where a focused answer to a WH-question is not
expressed in the same way as a focused correction, which requires a corrective focus
particle (de Jong 1980; Gussenhoven 1983a). Lekeitio Basque, too, expresses
corrective focus and presentational focus differently (Elordieta, this volume). Navajo
has a neutral negative, doo ... da, shown in (24a), and one that expresses corrective
focus, hanii, as shown in (24b,c) (Schauber 1978). The acute indicates high tone.

(24) a. Jáan doo chidí yiyííłchǫ’-da
John NEG car 3RD-PAST-wreck-NEG
‘John didn’t wreck the car’

b. Jáan hanii chidí yiyííłchǫ’
John NEG car 3RD-PAST-wreck
‘JOHN didn’t wreck the car (someone else did)’

c. Jáan chidí hanii yiyííłchǫ’
‘John didn’t wreck the CAR (he wrecked something else)’

2.3. Counterpresupposition focus

A third type, which may be rare, is ‘counterpresuppositional’ focus, which involves
a correction of information which the speaker detects in the hearer’s discourse
model. English has a special form for counterpresuppositional focus if the focus
constituent is the polarity of the sentence. In (25a), originally from Ladd (1980, 87),
the focus is on the negation, while John reads books is the background. If the focus
had been ‘corrective’, i.e. been a correction of new information brought in by a
preceding utterance, the expression would have been (25b), which has corrective
focus on the negation.2

(25) a. (A: Has John read Slaughterhouse Five?)

B: John does [n’t]COUNTERPRESUP READ books

b. (A: I’m telling you: John reads books!)

B: I’m sorry, John does [NOT]CORRECTIVE/ DOES[n’t]CORRECTIVE
read books

English counterpresuppositional polarity focus may be expressed by means of a
pitch accent on a preposition in a non-focused constituent. Such accentuation of
prepositions should be distinguished from (presentational or corrective) focus for the
preposition itself. Example (26a) contrasts with (26b) in this way (Gussenhoven
1983a). Importantly, they would have different translations in German or Dutch.

(26) a. (A: What other artistes have been in your car?)

B: Patty Grey was [never]COUNTERPRESUP IN my car

b. Patty Grey was never [IN]CORRECTIVE my car

(implying ‘but she may have been underneath it’)

2.4. Definitional focus

By means of ‘definitional’ focus, the speaker indicates that the information does not
refer to a change in the world, but informs the hearer of attendant circumstances.
While presentational focus in English subject-predicate sentences requires the
predicate to be unaccented, as in (27a), a definitional focus requires accents on both
constituents, as in (27b) (originally from Kraak (1970)). I termed the pattern in (27a)
‘eventive’ in Gussenhoven (1983a). Semantically, definitional focus seems special,
but phonologically it is the accentuation pattern used for the eventive meaning
which seems marked: the predicate is left unaccented, even though it is in the focus
constituent.3 Below, I will label the semantically default type with EVENTIVE
rather than plain FOC.

(27) a. [Your EYES are red]EVENTIVE

b. [Your EYES are BLUE]DEFINITIONAL

The eventive vs. definitional distinction is akin, but not identical to the distinction
between ‘individual level’ and ‘stage level’ predicates (Kratzer 1996). Stage level
predicates involve transient qualities, as in (27a), where the redness is due to swollen
eyelids, and individual level predicates to permanent qualities, as in (27b), where
blue refers to the colour of the iris. However, (28) shows that the eventive
interpretation may combine with the inherent colour interpretation. Genericity of the
subject, as suggested by Diesing (1992), does not explain the pattern either. In (29a),
an existential subject licenses focus on the verb, but the generic subject in (29b) does
not. However, generic subjects may occur in eventive sentences, as shown by (30),
which the keeper of the last dodo might have used to announce its demise, leaving
his listeners to infer the death of the last dodo from his communication that none in
fact survive (Gussenhoven 1983c).

(28) (A: Why have you chosen me?)

B: Your EYES are blue (eventive, but permanent property)

(29) a. FIREMEN are available

b. FIREMEN are ALTRUISTIC

(30) The DOdo is extinct (eventive; but generic subject)

Having ruled out equations between ‘eventive’ and ‘stage level’ and between
‘eventive’ and ‘non-generic’, we would of course like to have a semantic definition
of ‘eventive’ that will cover all instances of this pattern. An eventive sentence reports
a change in the world. However, there are two caveats. First, the pattern would appear
to carry some additional semantic feature of ‘non-agentive’ or ‘non-volitional’
(Faber 1987). Thus, The BAby’s crying is the expected accentuation in a reply to
‘Why are you getting up?’, but so is GRANDmother’s CRYing, where a volitional
involvement of the subject is somehow conveyed by the accent on the verb. Second,
in a case like (28), there is no change in the world to report, as observed by Daniel
Büring (personal communication, 2003). Here, the change would appear to lie in
the announcement of the relevance of blue eyes for mate selection, or for this
particular case of mate selection. Neither of these aspects seems at all easy to
incorporate in a definition of ‘eventive’.
Under definitional focus, objects retain their power to license definitional focus
on the predicate. Definitional focus thus differs from eventive focus in disallowing
subject-predicate focus domains. The accentuation of a broad-focus SOV sentence
like JOURnalists report the NEWS therefore corresponds to that of an eventive A
JOURnalist was reporting the NEWS. The next focus type, contingency focus, not
only bans Subject-Predicate focus domains, but also Predicate-Object ones, i.e. also
requires focused predicates with adjacent accented objects to be accented. The
situation can be summarized as in (31).

(31) Possible argument-predicate focus domains:

Eventive: [SUBJ-pred]]S, [pred-OBJ] ... ]S

Definitional: [pred-OBJ] ... ]S
Contingency: -
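
The summary in (31) can be read as a small lookup rule. The following Python sketch is only an illustration under stated assumptions: the labels `subj-pred` and `pred-obj`, the function name `accented_constituents`, and the restriction to broad-focus clauses with one subject and at most one object are all hypothetical simplifications, not part of the analysis itself. It assumes that arguments in focus are always accented and that a predicate stays unaccented only when it can share a focus domain with an adjacent accented argument.

```python
# Illustrative sketch only; names and data layout are assumptions.
# Argument-predicate focus domains permitted by each focus type, following (31).
ALLOWED_DOMAINS = {
    "eventive": {"subj-pred", "pred-obj"},
    "definitional": {"pred-obj"},
    "contingency": set(),
}


def accented_constituents(focus_type, has_object):
    """Return the accented constituents of a broad-focus clause, in English surface order."""
    domains = ALLOWED_DOMAINS[focus_type]
    # With an object, the predicate can only merge with the object;
    # without one, it can only merge with the subject.
    needed = "pred-obj" if has_object else "subj-pred"
    accents = ["SUBJ"]
    if needed not in domains:
        accents.append("PRED")  # predicate must carry its own accent
    if has_object:
        accents.append("OBJ")
    return accents


for focus_type in ALLOWED_DOMAINS:
    print(focus_type,
          accented_constituents(focus_type, has_object=False),
          accented_constituents(focus_type, has_object=True))
```

Under these assumptions the sketch reproduces the patterns of (27) and (35): only the contingency reading accents the verb when an accented object follows, while in subject-predicate clauses only the eventive reading leaves the predicate unaccented.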

2.5. Contingency focus

As with definitional focus, with ‘contingency focus’ the speaker indicates that the
information is not about a change in the world, but defines attendant circumstances,
but the difference is that the information is presented as potentially relevant.
Examples (32a,b) are from Halliday (1967, p. 38), who incorrectly explained (32a)
as being due to the status of ‘dogs’ as ‘old information’. In Gussenhoven (1983a) I
pointed out that dogs is in fact accented and ‘new’ in both interpretations, but that
the meaning of (32a) is ‘contingency’ (‘If there are dogs, they must be carried’),
while that in (32b) is ‘eventive’, and implies that the speaker might be ‘worried
because he had no dog’, to quote Halliday (cf. also the discussion in Ladd (1996, p.
199)). Similarly, eventive The King of FRANCE is bald carries an implication that
there is a King of France which is absent from the contingency sentence The King of
FRANCE is BALD (Gussenhoven 1983c).

(32) a. [DOGS must be CARRIED]CONTINGENCY

b. [DOGS must be carried]EVENTIVE

Unlike definitional focus, contingency focus is evident in SOV structures, as
illustrated in (33), where the proverbial interpretation (33a) is an example of
contingency focus, and contrasts with eventive (33b). The phonetic difference with
the eventive reading is less salient than in (32a,b), because of the non-final position
of the word carrying the accent in the contingency version.
The contingency of the proposition need not always be due to the conditional
status of the subject. In (34a), from Gussenhoven (1983a), the object is conditional
(‘If there are thieves’). The phonetic salience of the contrast is again low in English,
due to the non-final position of the predicate. However, in Dutch, which has the
accentable part (aan) of the phrasal verb aangeven ‘report’ in final position, the
difference is as salient as that in (32a,b).

(33) a. [TOO many COOKS SPOIL the BROTH]CONTINGENCY

b. [TOO many COOKS spoil the BROTH]EVENTIVE
(implying ‘we need to take soup off the menu’)

(34) a. [The MANagement rePORTS THIEVES]CONTINGENCY

Dutch: De DIRECTIE geeft DIEVEN AAN

b. [The MANagement reports THIEVES]EVENTIVE

(e.g. a caption for a cartoon)
Dutch: De DIRECTIE geeft DIEVEN aan

In broad-focus SOV structures, therefore, it is definitional and eventive that
contrast with contingency, as shown in (35), where the verb is accented only in
(35c).

(35) a. [The HUNters were shooting ANimals]EVENTIVE

b. [HUNters shoot ANimals]DEFINITIONAL
c. [These HUNters SHOOT ANimals]CONTINGENCY
(‘So don’t let your pets get near them!’)

In addition to the obligatory accent on the predicate, contingency sentences
obligatorily accent the negator, if there is one. A three-way contrast therefore arises
in negative SV structures, as shown in (36). In eventive (36a), the entire predicate is
unaccented, in definitional (36b), an accent is added to the verb, and in contingency
(36c), both the verb and the negation are accented. The presence of the accent on the
negation is more salient in German, where it appears post-verbally, as in (37): (37a)
is either eventive, which expression could be used to complain that someone doesn’t
blink, in spite of an earlier agreement that he would at that point in time, or
definitional, in which case it describes a state of affairs whereby someone just never
blinks. Contrast these with the contingency version (37b), which is a warning just in
case.

(36) a. (A: What seems to be the problem?)

B: [Our CUStomers aren’t admitted]EVENTIVE
b. [Our CUStomers aren’t adMITted]DEFINITIONAL
(‘That’s the way it is’)
c. [Our CUStomers AREN’T adMITted]CONTINGENCY
(‘In case you had forgotten’)

(37) a. You [don’t BLINK]EVENTIVE/DEFINITIONAL
German: Du ZWINKERST nicht!
b. You [DON’T BLINK]CONTINGENCY
German: Und du ZWINKerst NICHT!

2.6. Reactivating focus

Instead of, or in addition to, new information, languages may also mark old
information, an option referred to as ‘reactivating focus’. The term is somewhat
paradoxical, as it is the background information that is now marked for information
status. (That is, ‘focus’ is here used in the general sense of ‘structural marking of
information status’.) In (38), the constituent John has ‘reactivating focus’. Speaker B
considers the fact that she is not just acquainted with John, but actually dislikes him,
significant enough to single out the ‘given’ John by means of the syntactic device of
TOPICALIZATION. In English, constituents can be topicalized, giving the meaning
‘as for this constituent’.

(38) (A: Does she know JOHN?)

B: JOHN she DISLIKES

2.7. Identificational focus

English has the syntactic device ‘clefting’, the ‘It is X [who/that VP]’ construction,
where X is the subject of VP. It would appear that clefting causes the non-clefted
constituent to be reactivated information if it is accented, as in (39). Here, the
implication in B’s response is that Helen’s dislike of someone had been discussed
relatively recently. The clefted constituent is optionally accented. If the non-clefted
constituent is unaccented, it is old information, as in (40). The clefted constituent is
now obligatorily accented, and constitutes new information. It is impossible to have
both clefted and non-clefted constituents contain new information. That is, in It is the
POSTMAN who CAME either the postman or the notion of arriving must be in the
context.

(39) (A: Does Helen know JOHN?)

B: It is John/JOHN she DISLIKES

(40) (A: I wonder who she dislikes)

B: It is JOHN she dislikes

Clefting, therefore, presents a somewhat complex picture when viewed from the
perspective of information status. Since no ready generalization arises, its meaning
may not really be concerned with legitimacy or recency of information in the
background. Rather, the meaning is to exhaustively identify a constituent (Szabolcsi
1981; Kiss 1998). In the Hungarian example (41a), the focus constituent is egy kalapot ‘a hat+ACC’. The
sentence differs from that in (41b), which also has egy kalapot in focus, in that (41a)
entails that Mary bought nothing but a hat. By contrast, in (41b) the hat may be one
of a number of items that were bought by Mary. In other words, clefting expresses
identificational focus (Kiss 1998).

(41) a. Mari egy kalapot nézett ki magának

Mary a hat+ACC picked out herself+ACC
‘It was a HAT that Mary picked for herself’

b. Mari ki nézett magának egy kalapot

‘Mary picked a HAT for herself’

The difference between (41a) and (41b) is brought out by a test attributed by Kiss to
Szabolcsi (1981). Compare (42) with (43): (42b) is semantically incompatible with
(42a), since it claims that the hat in question is the only item bought by Mary, thus
denying (42a). By contrast, no such conflict arises in the case of (43a,b), even
though the speaker of (43b) may be accused of being parsimonious with the truth.
This is true regardless of the information status of the clefted constituent. All
examples could be answers to ‘What did Mary buy?’, so that the non-clefted
constituents (that Mary bought) are unaccented, but they can also be placed in a
context in which Mary has presentational focus and the clefted constituents are old
information (in which case the examples could be answers to I wonder why no one
bought a hat or a coat or a similar item of clothing).

(42) a. It was a hat and coat that Mary bought

b. It was a hat that Mary bought

(43) a. Mary/MARY bought a HAT and a COAT

b. Mary/MARY bought a HAT
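
The logic of the exhaustivity test in (42) and (43) can be made explicit with a small set-based sketch. This is an informal illustration, not a formal semantics: it assumes a toy universe of two purchasable items, treats a situation simply as the set of things Mary bought, and uses hypothetical function names. A cleft is taken to be true only if the clefted items are exactly what was bought, while a plain focus sentence is true whenever the mentioned items are among the things bought.

```python
from itertools import combinations

# Toy universe of things Mary might have bought (an assumption for illustration).
ITEMS = ("hat", "coat")


def situations():
    """Enumerate every possible set of purchases over the toy universe."""
    for size in range(len(ITEMS) + 1):
        for combo in combinations(ITEMS, size):
            yield set(combo)


def cleft_true(mentioned, bought):
    # Identificational focus: the mentioned items exhaust the purchases.
    return set(mentioned) == set(bought)


def plain_true(mentioned, bought):
    # Non-exhaustive focus: the mentioned items are among the purchases.
    return set(mentioned) <= set(bought)


def compatible(truth_condition, first, second):
    """Two sentences are compatible if some situation makes both true."""
    return any(truth_condition(first, s) and truth_condition(second, s)
               for s in situations())


print(compatible(cleft_true, {"hat", "coat"}, {"hat"}))  # pair (42a,b)
print(compatible(plain_true, {"hat", "coat"}, {"hat"}))  # pair (43a,b)
```

On these assumptions the cleft pair comes out as incompatible and the non-cleft pair as compatible, which is the contrast the test is meant to bring out.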

3. CONCLUSION
One dimension of meaning expressed by sentence-level pitch accents in languages
like English concerns the size of the focus constituent, which is expressed through
deaccentuation of constituents after the focus. Beginning with Schmerling (1974),
researchers have found that the relation between the pitch accent and the focus is
mediated through the predicate-argument structure of the sentence, which is evident
from the fact that predicates remain unaccented when they abut a focused argument.
In many cases, therefore, the accent on the argument is properly to be seen as an
accent on the predicate-argument combination, a regularity which obtains as many
times as there are clauses in the sentence.
A second dimension of meaning concerns the meaning of ‘information
packaging’ itself. The semantics would appear to involve a number of distinctions.

• Background vs. New information. This is the basic distinction which
has been referred to as ‘topic’ vs. ‘focus’, ‘old/given’ vs. ‘new’, etc.
Information that serves to further develop the discourse model was
discussed as ‘presentational focus’, while ‘reactivating focus’ was used for
information retrieved from the background.

• Development vs. Correction. If ‘development’ is the default situation
whereby speakers add information to the discourse model, correction
involves the removal of information. When applied to new information,
it is ‘corrective focus’ and when applied to the background, it is
‘counterpresuppositional focus’.

• Eventive vs. Non-eventive. The development of the discourse may involve
reports of changes in the world, or may further define the existing world.
In the former case, we have ‘eventive focus’; in the latter the focus is
non-eventive. Non-eventive focus subdivides into ‘definitional’ and
‘contingency’.

• Definitional vs. Contingency. In both cases, the information serves to
define the world, but for ‘contingency focus’ the speaker indicates that the
information is only potentially relevant to the discourse model.
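
The four distinctions above cross-classify, and one way to picture the result is as a small decision procedure. The Python sketch below is a reading aid only. The parameter names are hypothetical, and it assumes that the eventive, definitional and contingency subtypes subdivide the case in which new information develops the discourse model. Identificational focus is deliberately left out, since it is argued to concern information content rather than information status.

```python
# Illustrative sketch; the feature names are assumptions, not the author's terms.
BASIC_TYPES = {
    ("new", "development"): "presentational",
    ("background", "development"): "reactivating",
    ("new", "correction"): "corrective",
    ("background", "correction"): "counterpresuppositional",
}


def classify(status, operation, reports_change=True, only_potentially_relevant=False):
    """Map the semantic distinctions of the summary onto a focus-type label."""
    basic = BASIC_TYPES[(status, operation)]
    if basic != "presentational":
        return basic
    if reports_change:
        return "eventive"
    # Non-eventive focus defines the existing world rather than reporting a change.
    return "contingency" if only_potentially_relevant else "definitional"


print(classify("new", "development"))
print(classify("new", "development", reports_change=False))
print(classify("new", "development", reports_change=False, only_potentially_relevant=True))
print(classify("background", "correction"))
```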

The above summary suggests that the speaker indicates how the information in
his expression is to be related to the hearer’s information about the mini-world about
which they are together trying to reach a state of mutual understanding. The
meanings of the melodic aspects of the pitch accent in English proposed in Brazil
(1975) and Gussenhoven (1983b) as well as those proposed by Pierrehumbert &
Hirschberg (1990) fit this type of meaning well. The former include ‘Addition’
(Brazil’s ‘Proclaiming’), used for the commitment of information to the discourse
model and signalled by falling contours, and ‘Selection’ (Brazil’s ‘Referring’), used
for reference to information in the background and signalled by falling-rising
contours. A third meaning, ‘Testing’, expresses the speaker’s inability or refusal to
commit information to the discourse model and is signalled by rising contours
(Gussenhoven 1983b). ‘Identificational’ focus somehow doesn’t quite match these
other meanings. The information that John is the only person who caught a fish, as
conveyed by It’s John who caught a fish, concerns information content rather than
information status. Possibly, therefore, intonation can only be used for the
expression of information structure, implying that identificational focus can only be
expressed through the morphology or syntax.4

Centre for Language Studies

University of Nijmegen
The Netherlands

4. NOTES

1
One difference between my account and Selkirk’s theory is that the latter contains two indirectness
relations rather than one. First, there is a relation between ‘focus interpretation’ and ‘F-marked
constituent’ (the focus constituent), and there is a second relation between the F-marked constituent
and accent distribution. While in my account the first relation is trivial in the sense that the
interpretation of each focus constituent is that it is focused, and thus ‘new’, in Selkirk’s theory,
focus interpretation principles are applied to the focus constituent so as to establish which parts in it
are interpreted as ‘new’ and which as ‘given’. See also Gussenhoven (1999a).
2
I incorrectly analyzed (25) as having focus on the verb in Gussenhoven (1983a, note 5). The latter
would indeed have the same form, but is only appropriate in some context like ‘What doesn’t John
do with books?’. Ladd (1980) himself analyzed his example as having ‘default accentuation’ on the
verb, his point being that the accentuation signals that books is outside the focus, rather than that
read is included in it.

3
A class of ‘event’ sentences was independently identified by Cruttenden (1984) in connection
with the accentuation pattern SUBJECT-predicate. My definition referred to a focus type regardless
of syntax.
4
Recent treatments which have not been covered in this survey include Lambrecht (1994), Vallduví
& Engdahl (1996), Erteschik-Shir (1997), and Zubizarreta (1998).

5. REFERENCES
Bolinger, D. Intonation: Selected Readings. Harmondsworth: Penguin, 1972.
Bolinger, D. Review of Schmerling (1974). American Journal of Computational Linguistics (1978), 1-23.
Microfiche.
Bolinger, D. “Two views of accent.” Journal of Linguistics 21 (1985): 79-123.
Bolinger, D. More views on ‘Two views on Accent’. In On Accent, pp. 124-146. Bloomington, IN:
Reproduced by the Indiana University Linguistics Club, 1987.
Bolinger, D. Intonation and its Uses. Stanford, CA: Stanford University Press, 1989.
Brazil, D. Discourse Intonation I. Birmingham UK: English Language Research, Birmingham University,
1975.
Bresnan, J. “Sentence Stress and Syntactic Transformations.” Language 47 (1971): 257-281.
Bresnan, J. “Stress and Syntax: a Reply.” Language 48 (1972): 326-342.
Burzio, L. Principles of English Stress. Cambridge: Cambridge University Press, 1994.
Chafe, W. L. “Language and Consciousness.” Language 50 (1974): 111-133.
Chomsky, N. “Deep Structure, Surface Structure and Semantic Interpretation.” In D. D. Steinberg and L.
A. Jakobovits (eds.), Semantics: an Interdisciplinary Reader in Philosophy, Linguistics and
Psychology, pp. 183-216. Cambridge, UK: Cambridge University Press, 1971.
Chomsky, N. and M. Halle. The Sound Pattern of English. New York: Harper and Row, 1968.
Cruttenden, A. “The Relevance of Intonational Misfits.” In D. Gibbon and H. Richter (eds.), Intonation,
Accent and Rhythm. Studies in Discourse Phonology, pp. 67-76. Berlin: de Gruyter, 1984.
Culicover, P.W. and M. Rochemont. “Stress and Focus in English.” Language 59 (1983): 123-165.
De Jong, J. “On the Treatment of Focus in Functional Grammar.” GLOT, Leids Taalkundig Bulletin 3
(1980): 89-115.
Di Sciullo, A. and E. Williams. On the Definition of Word. Cambridge, MA: MIT Press, 1987.
Diesing, M. Indefinites. Cambridge University: Doctoral dissertation, 1992.
Dik, S. C. The Theory of Functional Grammar. Part 1: The Structure of the Clause. New York: Mouton
de Gruyter. Edited by Kees Hengeveld, 1997.
Dik, S. C. “On the Typology of Focus Phenomena.” GLOT, Leids taalkundig bulletin 3 (1980): 41-74.
Erteschik-Shir, N. The Dynamics of Focus Structure. Cambridge: Cambridge University Press, 1997.
Faber, D. “The Accentuation of Intransitive Sentences in English.” Journal of Linguistics 23 (1987):
341-358.
Frota, S. Prosody and Focus in European Portuguese. University of Lisbon: Doctoral dissertation, 1998.
[Published by Garland, New York, 2000.]
Gussenhoven, C. Review of Schmerling (1974). Dutch Quarterly Review of Anglo-American Letters
(DQR) 8 (1978): 233-240.
Gussenhoven, C. “Focus, Mode and the Nucleus.” Journal of Linguistics 19 (1983a): 377-417.
Gussenhoven, C. A Semantic Analysis of the Nuclear Tones of English. Distributed by Indiana University
Linguistics Club (IULC). Bloomington, Indiana, 1983b.
Gussenhoven, C. “Van focus naar zinsaccent: Een regel voor de plaats van het zinsaccent in het
Nederlands.” GLOT 6 (1983c): 131-155.
Gussenhoven, C. “Two views of accent: A reply.” Journal of Linguistics 21 (1985): 125-138.
Gussenhoven, C. “Sentence Accents and Argument Structure.” In I. M. Roca (ed.), Thematic Structure:
Its Role in Grammar, pp. 79-106. Berlin/New York: Foris, 1992.
Gussenhoven, C. “Discreteness and Gradience in Intonational Contrasts.” Language and Speech 42
(1999a): 281-305.
Gussenhoven, C. “On the Limits of Focus Projection in English.” In P. Bosch and R. van der Sandt (eds.),
Focus: Linguistic, Cognitive, and Computational Perspectives, pp. 43-55. Cambridge, UK:
Cambridge University Press, 1999b.
Halliday, M. A. Intonation and Grammar in British English. The Hague: Mouton, 1967.
Hayes, B. Metrical Theory: Principles and Case Studies. Chicago: Chicago University Press, 1995.
Hayes, B. and A. Lahiri. “Bengali Intonational Phonology.” Natural Language and Linguistic Theory 9
(1991): 47-96.
Jackendoff, R. S. Semantic Interpretation in Generative Grammar. Cambridge, Mass.: MIT Press, 1972.
Kanerva, J. M. Focus and Phrasing in Chichewa Phonology. Stanford University: Doctoral dissertation,
1989.
Kiss, K. E. “Identificational Focus and Information Focus.” Language 74 (1998): 245-273.
Kraak, R. “Zinsaccent en syntaxis.” Studia Neerlandica 4 (1970): 41-62.
Kratzer, A. “Stage-level and Individual-level Predicates.” In G. N. Carlson and F. J. Pelletier (eds.), The
Generic Book, pp. 125-175. Chicago: Chicago University Press, 1996.
Ladd, D. R. The Structure of Intonational Meaning: Evidence from English. Bloomington: Indiana
University Press, 1980.
Ladd, D. R. “Phonological Features of Intonational Peaks.” Language 59 (1983): 721-759.
Ladd, D. R. Intonational Phonology. Cambridge: Cambridge University Press, 1996.
Lambrecht, K. Information Structure and Sentence Form. Topic, Focus, and the Mental Representation of
Discourse Referents. Cambridge: Cambridge University Press, 1994.
Newman, S. “On the Stress System of English.” Word 2 (1946): 171-187.
Pierrehumbert, J. B. and J. Hirschberg. “The Meaning of Intonational Contours in the Interpretation of
Discourse.” In P. Cohen, J. Morgan, and M. Pollack (eds.), Intentions in Communication, pp. 271-
311. Cambridge MA: MIT Press, 1990.
Rochemont, M. Focus in Generative Grammar. Amsterdam: John Benjamins, 1984.
Schauber, E. “A Comparison of English Intonation and Navajo Particle Placement.” In D. J. Napoli (ed.),
Elements of Tone, Stress, and Intonation, pp. 144-173. Washington, DC: Georgetown University
Press, 1978.
Schmerling, S. F. Aspects of English Sentence Stress. Austin: Texas University Press, 1974.
Selkirk, E. Phonology and Syntax: The Relation between Sound and Structure. Cambridge, Mass.: MIT
Press, 1984.
Selkirk, E. “Sentence Prosody: Intonation, Stress and Phrasing.” In J. Goldsmith (ed.), The Handbook of
Phonological Theory, pp. 550-569. Oxford: Blackwell, 1995.
Selkirk, E. “Contrastive FOCUS vs. Presentational Focus: Prosodic Evidence from English.” In B. Bel
and I. Marlien (eds.), Speech Prosody 2002. An International Conference, Aix-en-Provence.
Laboratoire Parole et Langage, CNRS and Université de Provence, 2002.
Szabolcsi, A. “The Semantics of Topic-Focus Articulation.” In J. Groenendijk, T. Janssen, and M.
Stokhof (eds.), Formal Methods in the Study of Language, pp. 513-514. Amsterdam: University of
Amsterdam, Mathematisch Centrum, 1981.
Vallduví, E. and E. Engdahl. “The Linguistic Realization of Information Packaging.” Linguistics 34
(1996): 459-519.
Winkler, S. Focus and Secondary Predication. Berlin: Mouton de Gruyter, 1996.
Zonneveld, W., M. Trommelen, M. Jessen, C. Rice, G. Bruce, and K. Árnason. “Wordstress in West-
Germanic and North-Germanic languages.” In H. van der Hulst (ed.), Word Prosodic Systems in the
Languages of Europe, pp. 477-603. Berlin: Mouton de Gruyter, 1999.
Zubizarreta, M. L. Prosody, Focus, and Word Order. Cambridge, MA: MIT Press, 1998.
NANCY HEDBERG AND JUAN M. SOSA

THE PROSODY OF TOPIC AND FOCUS

IN SPONTANEOUS ENGLISH DIALOGUE*

1. INTRODUCTION
Our research addresses the interface between meaning and prosody. In particular, it
concerns the way intonation plays a part in the interpretation of an utterance. For
example, we are concerned with the extent to which a falling versus a falling-rising
intonation at the end of an utterance or an extra tonal height on a specific word or
phrase affects the way the utterance is interpreted.
Information structure categories such as topic and focus have been correlated
with specific types of contours. Many authors have stated that there is a peak
associated with focus, while others have stated that there is also a peak associated
with topic. Claims have been made as to the specific sequence of underlying tones
associated with these categories, at least for constructed examples; for instance, that
focus will be marked with H* and topic will be marked with L+H*. Here, we test
these claims by analyzing the intonation and information structure of a sample of
spontaneous dialogue in English.

2. DATA
The data were taken from six half-hour episodes of the PBS political discussion
television show, The McLaughlin Group, videotaped in April and May 2001. The
host, John McLaughlin, discusses current issues of the day with four journalist
guests. The journalists have widely differing political beliefs and therefore the
discussions get heated and the speakers produce speech that we believe to be quite
spontaneous. The guests vary somewhat from week to week. Each half-hour episode
consists of four issues discussed. For the first five episodes, we selected the first
issue because it was the longest. For the sixth episode, we analyzed a combination of
issues two and three. Each issue is introduced by John McLaughlin in a monologue.
We didn’t analyze these portions of the videotapes. All participants are native
speakers of American English.
An advantage to analyzing the McLaughlin Group as a source of data is that
transcripts of the sessions are available on the World-Wide Web. In the few cases
where we found discrepancies between the transcript and the videotape in the
portions of the transcript we were analyzing, we corrected the transcript.

3. INFORMATION STRUCTURE CODINGS

One of us, Hedberg, coded the transcripts for five information-structure categories
and then listened to the videotape to confirm these codings. The five
information-structure categories are contrastive focus, plain focus, contrastive topic, unratified
topic and ratified topic. We follow Gundel (1988) in defining topic, comment, and
focus.

C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 101–120.
© 2007 Springer.

Topic
An entity, E, is the topic of a sentence, S, iff, in using S, the speaker intends to
increase the addressee’s knowledge about, request information about or otherwise get
the addressee to act with respect to E.
Comment
A predication, P, is the comment of a sentence, S, iff, in using S the speaker intends
P to be assessed relative to the topic of S.
Focus
That part of the linguistic expression that realizes the comment.

The focus is very long in the majority of cases, and consists of multiple pitch
accents and sometimes multiple intonational phrases. For that reason, Hedberg
picked the final pitch-accented phrase to annotate, except in the case of it-clefts
where she picked the clefted constituent since all three it-clefts in the data were either
topic-clause it-clefts or all-comment it-clefts (Hedberg, 2000). To explain the five
categories, we’ll illustrate with examples from the passage shown in (1). Topics are
italicized and foci are bold-faced. Contrastive elements receive double underlines,
and unratified topics receive a single underline.

(1) Ms. Clift: Look, John McCain would be the first one to say this
doesn’t improve the system to perfection; it makes it
marginally better. And there’s still a possibility that
Tom DeLay, who is an enemy of the bill, will forge an
unholy alliance with Democrats in the House. Because
Democrats have figured out, they do worse under this
bill than the Republicans do. But the big thing that
comes out of this, to me, is that it’s John McCain who
gets the big legislative triumph so far in this first 100-
day period, while President Bush is looking rather
passive on a number of issues across the board,
especially foreign policy. (3/31/01)

Ratified Topic
Contrastive Topic
Unratified Topic
Contrastive Focus
Plain Focus

The topic of the entire issue is the McCain-Feingold bill on campaign finance
reform. John McCain has just gotten it passed through the Senate and the question is
how it will do in the House. John McCain is an unratified topic because Eleanor
Clift is re-establishing him as the topic here and thus he is not already established as
a topic. The bill itself is already established as the topic and thus references to it with
‘this’ and ‘it’ are coded as ratified topics. The terms ‘ratified’ and ‘unratified’ topic
come from Lambrecht and Michaelis (1998). Both ‘John McCain’ and ‘this’ are
marked as topics here because John McCain is the topic of the matrix clause and the
referent of ‘this’ is the topic of the embedded clause. The focus of both the matrix
clause and the embedded clause falls on ‘perfection.’ Plain foci are marked in bold-
face. Tom DeLay is a Republican representative and is the topic of the next sentence.
Here ‘Democrats in the House’ is marked as a contrastive focus because there is an
implicit contrast with ‘Republicans in the House’. Likewise ‘John McCain’ is a
contrastive focus because it explicitly contrasts with ‘President Bush.’ The whole it-
cleft expresses a comment here, and thus there is no topic indicated for this sentence.
In the next sentence, President Bush contrasts with John McCain and is a topic, and
hence the phrase denoting him is marked as a contrastive topic.
To help identify the topic, Hedberg used Gundel’s (1974) ‘as for’ test and
Reinhart’s (1981) ‘said about’ test. For example, in (2), ‘you’ is identified as the
topic because the sentence can be paraphrased, ‘As for you, what do you think?’.

(2) Mr. McLaughlin: What do you think? (6.16)

A total of 1,669 phrases were coded for information structure category,
distributed as shown in Table 1. As can be seen from the table, the distribution of the
five information structure types was roughly equivalent across the six transcripts.
This rough equality serves as a broad check on the reliability of the information-
structure coding. Ideally we would have two information-structure coders, so that we
could compare their coding and come up with an inter-coder reliability statistic. We
plan to adopt this methodology in future work on this project.

Table 1. Distribution of Information Structure Types across the Six Transcripts

Transcript   Ratified   Contrastive   Unratified   Contrastive   Plain    Total
             Topic      Topic         Topic        Focus         Focus
1            109        16            45           14            142      326
             33.4%      4.9%          13.8%        4.3%          43.6%
2            61         7             45           24            138      275
             22.2%      2.5%          16.4%        8.7%          50.2%
3            36         7             39           15            71       168
             21.4%      4.2%          23.2%        8.9%          42.3%
4            79         17            36           31            114      277
             28.5%      6.1%          13.0%        11.2%         41.2%
5            84         15            57           20            151      327
             25.7%      4.6%          17.4%        6.1%          46.2%
6            89         10            44           23            130      296
             30.1%      3.4%          14.9%        7.8%          43.9%
Total        458        72            266          127           746      1669
             27.4%      4.3%          15.9%        7.6%          44.7%
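The percentages in Table 1 follow directly from the raw counts. As a minimal sketch of that arithmetic (the names `COUNTS`, `CATEGORIES` and `shares` are illustrative and not part of the original study), the row shares and the spread of each category across transcripts can be recomputed as follows:

```python
CATEGORIES = ["Ratified Topic", "Contrastive Topic", "Unratified Topic",
              "Contrastive Focus", "Plain Focus"]

# Raw counts per transcript, copied from Table 1 (same order as CATEGORIES).
COUNTS = {
    1: [109, 16, 45, 14, 142],
    2: [61, 7, 45, 24, 138],
    3: [36, 7, 39, 15, 71],
    4: [79, 17, 36, 31, 114],
    5: [84, 15, 57, 20, 151],
    6: [89, 10, 44, 23, 130],
}


def shares(row):
    """Percentage of each category within one row, to one decimal place."""
    total = sum(row)
    return [round(100 * n / total, 1) for n in row]


per_transcript = {t: shares(row) for t, row in COUNTS.items()}
column_totals = [sum(col) for col in zip(*COUNTS.values())]
overall = shares(column_totals)

# Spread of each category's share across the six transcripts.
for name, col in zip(CATEGORIES, zip(*per_transcript.values())):
    print(f"{name:18s} min {min(col):4.1f}%  max {max(col):4.1f}%")
```

The printed minimum and maximum per category give a quick way to judge how even the distribution is across transcripts.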

We decided to select seven examples of each of the five categories from each
transcript for prosodic coding. For each transcript, Hedberg counted the total number
of each category and divided by seven. For example, there were 142 plain foci in
transcript 1. Division by 7 yields 20.3, so she selected every 20th example for
prosodic analysis. In this way, we acquired seven examples of each category spread
evenly across the transcript. She then printed a new copy of the transcript and
identified the 35 phrases to be analyzed with a highlighting pen, with no indication of
104 NANCY HEDBERG AND JUAN M. SOSA

information structure category. This transcript was given to Sosa, along with the
videotape, for prosodic coding. Because there were 6 transcripts, we subjected 210
phrases to prosodic coding. There were a total of 42 examples of each of the five
information-structure categories.
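The selection procedure described above can be sketched in a few lines. This is an illustrative reconstruction only: the function name `systematic_sample` is hypothetical, and the sketch assumes that counting starts at the first example of a category, so that the 20th, 40th, ... examples are picked. The text does not say where counting began.

```python
def systematic_sample(n_items, n_wanted=7):
    """Return 1-based positions of every k-th item, with k = n_items // n_wanted.

    Assumption: counting starts at the first item, so the first pick is
    item k itself.
    """
    if n_items < n_wanted:
        raise ValueError("category has fewer items than requested")
    step = n_items // n_wanted
    return [step * i for i in range(1, n_wanted + 1)]


# Transcript 1 had 142 plain foci: 142 / 7 = 20.3, so every 20th one is taken.
picks = systematic_sample(142)

# 6 transcripts x 5 categories x 7 picks per category.
total_phrases = 6 * 5 * len(picks)
```

Integer division reproduces the rounding down from 20.3 to 20, and the picks stay within the 142 available examples.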
Sosa then listened to the videotapes and digitized each of the 210 phrases along
with some of their surrounding context. Using the Kay Computerized Speech Lab
(CSL 4300), he then analyzed the target phrases prosodically and assigned an
autosegmental sequence of tones to each phrase. He used annotations for pitch
accents (H*, L*, L+H*, H*+!H, H*+L, L*+H and H+L*), boundary tones (L%,
H%), intermediate phrase tones (L, H), downstep (!H), upstep (¡H), and increased
range (↑H). Again, in future work on this project, we plan to have two prosodic
coders, so that we can calculate an intercoder reliability statistic, to be surer that the
prosodic coding is accurate.

4. INTONATIONAL CODINGS
The intonational analysis and annotation of all digitized utterances was performed
following closely the Guidelines for ToBI Labelling (Beckman and Ayers Elam,
1997) and taking into consideration other published materials on the intonational
structure of English, notably Pierrehumbert and Hirschberg (1990) as well as other
autosegmental-metrical approaches to the phonology of intonation. The ToBI
conventions and assumptions were followed, although we introduced two additional
pitch accents that we felt were necessary in order to account for certain distinct
patterns. For example, we rescued the H*+L pitch accent (which was originally
designed to trigger downstep) to generate a dip between two H* pitch accents, which
is not captured by the notation H* ... H* alone.

Figure 1. [Even] Dan Goldin (unratified topic, 5.18)

H* H* LL%

Our independent feature downstep !H allowed us to free the H*+L notation and use it
for this effect. An example of this distinction is shown in figure 1 versus figure 2 (see note 1).

Figure 2. Thirty years [of serious anthropological consideration] (plain focus, 3.19)
H*+L H* LL%

We noted that the sequence H* ...H* (equivalent to the high head in the British
tradition) is quite scarce in the data since the great majority of the utterances show
some kind of downdrifting pattern. The very few instances of straight H* ... H*
sequences may point to a contrast with British English, which is said to typically
have this recurring high-pitched pre-nuclear pattern.
As already mentioned, the rest of the pitch accents used in this paper were H*,
L*, L+H*, L*+H, H+L*, and H*+!H, all of them with the value assigned to them in
the ToBI notation and previous work on English intonation. Given the emphasis
on the L+H* pitch accent in this paper, we present two instances of it in figures 3
and 4.

Figure 3. Our voyeurism (plain focus, 6.3)

L+H* LL%

Figure 4. In Britain, in fact... (contrastive topic, 3.8)

L+H* LL%

The feature ‘increased range’ as well as the ‘upstep’ pitch accent ¡H* were added
to the tonal analysis, to specify high pitch excursions. Increased range is characterized by
higher peaks and lower valleys, as shown in figure 5.

Figure 5. Made in China (plain focus, 2.34)

↑H* ↑H* LL%

On the other hand, upstep is mostly an H* that is higher than any previous H*,
reversing any downdrift or declination effect, as shown in figure 6.

Figure 6. Not a PBS documentary (contrastive focus, 3.32)

H* ¡H* LL%

The overwhelming majority of our utterances showed a downdrift, most of the
time realized as one or more downstepped !H* in the tonal tier.
We noticed that many long utterances that were semantically coherent also had
overall prosodic patterns or designs that were larger than the intonational phrase. For
lack of a better term we tentatively called them ‘intonational macro-units.’ Two
examples are shown in figures 7 and 8.

Figure 7. Mr. McLaughlin: Can you handle that last question? Where do you think the
international community is? (2.14)

Figure 8. Ms. Clift: It requires a leap of faith, however, to believe that the historical
Jesus was, in fact, the son of God. (3.27)

This macro-unit doesn’t necessarily coincide with Nespor and Vogel’s (1986)
phonological utterance, and is certainly perceptible in oral discourse and visible as
such in pitch tracks.
After the intonational coding was completed, it was entered on the data
spreadsheet and we proceeded with correlating the intonational coding with the
information-structure coding.

5. TOPIC ACCENT VERSUS FOCUS ACCENT HYPOTHESES

One important issue is whether there is a special ‘topic accent.’ Jackendoff (1972)
was the first to propose a distinction between ‘topic accents’ and ‘focus accents’. He
proposed that topics receive a fall-rise (‘B’) accent and that foci receive a fall (‘A’)
accent. Gundel (1978) follows Jackendoff in distinguishing between comment
(focus) accents and topic accents, but points out that topic accents only fall
on unactivated or contrastive topics. Pierrehumbert (1980) follows up with the
observation that Jackendoff’s B (‘background’) accents receive an H*LH% tune and
that Jackendoff’s A (‘answer’) accents receive an H*LL% tune. See Table 2 for
Pierrehumbert’s (1980) hypothesis and also the hypotheses of researchers after her.

Table 2. Hypotheses of Researchers Concerning Topic Accent and Focus Accent

Researchers                                   Topic accent   Focus accent

Pierrehumbert 1980                            H* LH%         H* LL%
Steedman 1991                                 L+H* LH%       H* LL%
Vallduvi & Engdahl 1996, Gundel 1999,         L+H*           H*
  Steedman 2000a, Steedman 2000b,
  Gundel & Fretheim 2004
Lambrecht & Michaelis 1998                    H%             L%

Steedman (1991) states that foci (‘rhemes’) receive the H*LL% accent and tune
and that topics (‘themes’) receive an L+H*LH% (the so-called ‘scooped fall-rise’)
accent and tune. Vallduvi and Engdahl (1996) state that noncontrastive links
(Gundel’s (1978) ‘unactivated topics’ or Lambrecht and Michaelis’s (1998)
‘unratified topics’) receive an L+H* pitch accent, that contrastive links are
obligatorily so marked, and that foci are marked with the pitch accent H*. Gundel
(1999) claims that topics, both new and contrastive, are marked with L+H*, and that
her category of ‘semantic focus’ (contrastive or noncontrastive) is marked with H*.
Steedman (2000a, 2000b) and Gundel and Fretheim (2004) also claim that topics
are marked with L+H* and foci with H* pitch accents. Lambrecht and Michaelis
(1998) distinguish topic accents from focus accents but don’t claim that there is any
prosodic difference between them; however, they mention in a footnote that H% may
mark topics and L% mark foci.
Pierrehumbert and Hirschberg (1990) suggest that L+H* is used to mark contrast,
or in their terms, to mark the selection of an item on a contextually-evoked salient
scale. They don’t specify whether this contrastiveness is associated with the
information structures of topic and focus. Presumably either a topic or a focus can be
marked by L+H*, according to them, just so long as the category is contrastive in
their sense. We speculate that Gussenhoven’s (1983) fall-rise tone, which he says is
used to ‘select’ an entity from the background, corresponds to a topic accent, and that
his fall tone, which he says is used to introduce an entity into the ‘background’,
corresponds to a focus accent.
The major goal of our research was to put these hypotheses to the test.

6. PITCH ACCENTS
6.1. Does L+H* mark contrast, or topic?
With regard to L+H* marking information structure and/or contrast, we came up
with the results in Table 3:

Table 3. Distribution of L+H* Relative to Information-Structure Type

L+H* % out of 42
Ratified Topic 1 2%
Contrastive Topic 10 24%
Unratified Topic 13 31%
Contrastive Focus 11 26%
Plain Focus 6 14%

As can be seen from the table, we did find a significant number of L+H* pitch
accents marking contrastive topics or contrastive foci, e.g. the examples shown in (3)
and (4):

(3) Mr. Kudlow: And we need to drill oil and gas in the Rockies. And
Jeb Bush is wrong and George Bush is right; we need
L+H* !H* L+H* !H*
to drill in the Gulf of Mexico.
(contrastive topic, 6.27, 28)
(4) Mr. McLaughlin: This exit question may be superfluous, but I’m
going to hit you with it anyway. Tito cracked the
space barrier between civilians and professionals.
For the most part, was his way the right way, or
for the most part was his way the wrong way, as
L+H* LH%
Goldin would lead you to believe, Michael
Barone? (contrastive focus, 5.32)
However, Pierrehumbert and Hirschberg’s (1990) proposal that L+H* is
associated particularly with contrast does not seem to be borne out by the number of
noncontrastive topics (14) and noncontrastive foci (6) marked by this tone. Examples
of noncontrastive topics are shown in (5) and (6):

(5) Ms. Clift: A good working-class guy may well be what Jesus was.
And in fact, this is discussed in a documentary that was
produced in England. And there they can talk about
these kinds of things. I think in this country we’re still a
little nervous about suggesting that Jesus may not fit the
Westernized, romanticized ideal. In Britain, in fact, the
archbishop of Canterbury there has called Britain a
H* L+H* L* HH%
nation of atheists. In a country of 60 million people, only
a million people go to church. (unratified topic, 3.9)

(6) Mr. Barone: I used to be an editorial writer, and I’ll tell you
something, there’s a temptation to harumph when
you’re an editorial writer – (laughter) – and I’m
afraid that that was the New York Times
harumphing.
Mr. McLaughlin: Well, they could have pointed out that $20
L+H*
million given to Russia probably wound up with
Russian scientists, and that might keep them
from making Iranian nuclear bombs.
(unratified topic, 5.26)
Similarly, examples of noncontrastive, plain foci marked by L+H* are shown in
(7) and (8):
(7) Mr. McLaughlin: Well, what is – do you think that NASA has egg
on its face?
L+H* !H* HL% (plain focus, 5.29)

(8) Mr. Kudlow: I have a different view, with all respect. I think it
turns this guy into a celebrity, and I think that
L+H*LL%
actually encourages more of these heinous
actions. (plain focus, 6.5)

Example (7) is about NASA’s unwillingness to allow Mr. Tito to pay $20 million to
go up in the space station. Example (8) is about the pending execution of Timothy
McVeigh.
It is clear from Table 3 that L+H* is not significantly correlated with topic as
opposed to focus, since there are 11 contrastive foci marked by this pitch accent and
6 examples of plain foci, although the raw number of 17 foci versus 24 topics
represents a trend in this direction. Example (4) shows an L+H*-marked contrastive
focus, and (7) and (8) show L+H*-marked plain foci. We present in figure 9 a pitch
track for example (8):

Figure 9. I think it turns this guy into a celebrity. (plain focus 6.5)
L+H*LL%

The Information Structure category from the literature that seems to best fit the
data concerning L+H* is Gundel’s (1999) category of ‘Contrastive Focus’. Her
category of ‘Contrastive Focus’ encompasses our ‘Contrastive Topic’, ‘Unratified
Topic’ and ‘Contrastive Focus’. This composite category accounts for 83% of our
L+H* marked phrases (34 out of 41).
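The figures quoted in this section can be rederived from the counts in Table 3. The following is a minimal sketch of that arithmetic, with illustrative names (`LH_STAR`, `rate`, `composite`) that are not part of the original study; it makes no claim about statistical significance.

```python
PER_CATEGORY = 42  # phrases sampled per category (7 per transcript x 6 transcripts)

# Number of L+H* pitch accents per category, copied from Table 3.
LH_STAR = {
    "Ratified Topic": 1,
    "Contrastive Topic": 10,
    "Unratified Topic": 13,
    "Contrastive Focus": 11,
    "Plain Focus": 6,
}

# Share of each category's 42 phrases that carry L+H*, in whole percent.
rate = {cat: round(100 * n / PER_CATEGORY) for cat, n in LH_STAR.items()}

total = sum(LH_STAR.values())
topics = sum(n for cat, n in LH_STAR.items() if cat.endswith("Topic"))
foci = sum(n for cat, n in LH_STAR.items() if cat.endswith("Focus"))

# Composite of contrastive topic, unratified topic and contrastive focus.
composite = (LH_STAR["Contrastive Topic"] + LH_STAR["Unratified Topic"]
             + LH_STAR["Contrastive Focus"])
composite_share = round(100 * composite / total)
```

Note that the topic and focus totals rest on different bases: there are three topic categories (126 phrases) but only two focus categories (84 phrases), so the raw counts are not directly comparable.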

6.2. Which Pitch Accents Mark Information Structure Categories?

It is important to determine which pitch accents mark the information-structure
categories when they are not marked with L+H*. Table 4 shows the distribution of
primary pitch accent relative to information structure type.

Table 4. Distribution of Pitch Accents or their Absence Relative to Information Structure Type

                     H*    H*+L   H*+!H   L+H*   L*    L*+H   H+L*   ø
Ratified Topic       10    1      0       1      4     0      0      26
Contrastive Topic    23    1      0       10     1     2      0      5
Unratified Topic     19    4      0       13     0     3      1      2
Contrastive Focus    22    1      0       11     7     0      0      1
Plain Focus          26    1      1       6      8     0      0      0
TOTAL                100   8      1       41     20    5      1      34

Except for ratified topics, which tend to be unaccented, most phrases in each
information structure category are marked by H*. Except for H*+!H, we abstracted
away here from high tones further marked with increased range, upstep or downstep.
It is interesting that L+H* is the second most frequent pitch accent in the data, after
H*. This shows that the attention to this pitch accent exhibited in the literature has
not been misplaced.
Ratified topics, unsurprisingly, tend to be unaccented. 34 out of 42 ratified topics
were encoded as personal pronouns. Four ratified topics were coded as L*. In the
case of two of these, we were unsure as to whether they really received an L* pitch
accent, or simply exhibited an unaccented rhythmic beat.
Except for the four cases of ratified topics, L* tends to mark focus, either
contrastive or plain. The five cases of L*+H all mark topics. The other pitch accents,
except for L+H*, do not exhibit any particular pattern.
We were especially curious about the phrases coded as contrastive focus,
contrastive topic or unratified topic that did not receive the L+H* pitch accent. Is this
an error of our information structure coding, or does it represent the actual prosodic
marking system of English?
One interesting class of examples to check in this regard is cleft sentences, of
which there were three in our data. We coded the clefted constituent in each case as a
contrastive focus since the meaning of the cleft sentence involves an exhaustiveness
condition on the clefted constituent. For example, in (9), it is asserted that nobody
other than the Communist Chinese are behaving as a Cold War power right now; in
particular not the United States. The proposition that the United States has been
behaving as a Cold War power has been previously evoked.

(9) Mr. Buchanan: What the United States should do, John, is pull
the ambassador home right now. The president of
the United States should say, ‘I understand why
Americans are boycotting Chinese goods, and I
believe that if this thing is not resolved
satisfactorily, it will be time to suspend PNTR for
exactly one year.’ It is the Communist Chinese
↑H* !H* HL%
who are behaving as a Cold War power right
now. (contrastive focus, 2.23)

As in the other two it-clefts, the clefted constituent here is marked by some variant of
the H* pitch accent, even though it is contrastive. It is interesting that the three it-clefts are the
only examples in the data of a subject receiving narrow focus. All three are subject
clefts.
Some narrow foci were coded as contrastive, but perhaps were not treated as
contrastive by the prosodic system. For example, at the end of transcript 4,
participants were asked to grade President Bush on style and substance during his
first 100 days. Because there was a limited set of possible answers (the grades A, B,
C, D, and F), we coded the resulting narrow focus answer as a contrastive focus.
Perhaps a more refined definition of contrastive focus, one that requires the explicit
ruling out of alternatives, would exclude these cases. An example is shown in (10):
(10) Mr. McLaughlin: Yeah, what about substance?
Ms. Clift: Substance, C-minus.
H* !H* LL% (contrastive focus, 4.25)

There nevertheless are several cases of focus phrases coded as contrastive which
do rule out alternatives but are not marked L+H*. The examples shown in (11) and
(12) are explicitly contrastive in this way:

(11) Mr. Page: Thank you, I want to concur with my colleagues in
saying that I think – well, actually, Tito will be
remembered as a pioneer; the first space tourist. And this
is the wave of the future, and NASA, like most
bureaucracies, has a difficult time ‘turning around in the
water.’ It’s a big ship, not a speedboat.
H* !H* (contrastive focus, 5.17)
(12) Mr. McLaughlin: I think we’ve reached the end of our seminar here
today. Exit question: Will the Richard Neave
Jesus endure, Michael Barone?
Mr. Barone: No. This is just a guess.
Mr. McLaughlin: Eleanor?
Ms. Clift: I don’t think so. This is a BBC documentary, not
a PBS documentary. Republicans on Capitol Hill
¡H* LL%
would go nuts if this ever showed on PBS.
(contrastive focus, 3.32)

6.3. Can Topics be Marked H*?

It can be seen from Table 4 that topics are frequently marked with H*, contrary to
predictions made in the literature that topics or at least contrastive topics should be
marked L+H*. Examples of H*-marked contrastive topics are shown in (13) and
(14):
(13) Ms. Clift: And the stakes in this confrontation are huge for China.
They have 54,000 students in this country. They want to
get the Olympics. They want to keep trade going.
And the stakes for this country are also huge. We
H% H* !H* L*
don’t want to create an enemy where there is
none. (contrastive topic, 2.8)
(14) Mr. Page: What you call small, but which Democratic contributors
call $1,000 a lot of money. The Republicans have a lot
H* L
more of those kind of hard-money contributors and now
you’re going to raise that limit while killing soft money.
(contrastive topic, 1.18)

In general, it seems best to conclude that contrastive topics are only sometimes
marked L+H*. The same goes for non-contrastive topics, as examples (15) and (16)
show:
(15) Mr. McLaughlin: Can you handle that last question? Where do you
think the international community is, especially
the Third World?
Mr. O’Donnell: The international community is very
H* !H*
sympathetic to the Chinese. They’re wondering
what are we doing with the reflexive old Cold
War mentality of flying these missions in the
first place. (Unratified topic, 2.15)
(16) Mr. McLaughlin: Tony, what was his best move?
Mr. Blankley: I think there were two. One, coming off the
Florida event, establishing his legitimacy as
president….On a policy basis, his biggest success
is taxes….
Mr. McLaughlin: Do you see his best move as the tax cut’s
tenacity?
Mr. O’Donnell: Yes, I do. I agree with Eleanor it’s not a good tax
cut, it’s not a good policy; but it is an amazing
accomplishment to come from where it’s come
from….
Mr. McLaughlin: Actually, his best move was the handling of the
H* !H* !H*
China spy plane. He kept his cool; he kept the
country cool, he was measured and moderate.
And it worked. (unratified topic, 4.7)
In (15), ‘the international community’ expresses the topic, as it is repeated from the
question; similarly in (16), ‘his best move’ clearly expresses the topic. Indeed these
two phrases are so topical in their contexts that perhaps they should be considered
ratified topics. However, both are marked with H* (or !H*) instead of L+H*. We
present in figure 10 a pitch track for example (16):

Figure 10. Actually his best move was the handling of the China spy plane. (4.7)
H* !H* !H*

In our future work on this project, we will explicitly distinguish topic-comment
utterances from all-comment utterances (Gundel, 1988). Some of our unratified
topics and contrastive topics could have alternative codings. For example, the subject
in (17) was coded as a contrastive topic, but the utterance could probably have been
coded as an all-comment one, so that ‘Eisenhower’ would be coded as part of the
focus, and thus the H* which marks it would not constitute a counterexample to
theories that associate H* only with foci (see note 2).
(17) Mr. Buchanan: I’ll just remind you of one thing. Eisenhower
H* !H* HL%
refused to apologize for the U-2, and even blew
up a summit, and we were a lot more at fault then.
(contrastive topic, 2.25)

Here the entire event of Eisenhower’s refusal is being put forth as the ‘new
information’ in the discourse. The entire clause answers the question ‘What
happened?’ Nevertheless, we believe that the bold-faced constituents in (13)-(16) do
express topics, and are marked H*, contrary to predictions in the literature.

7. INCREASED PITCH RANGE, UPSTEP, AND DOWNSTEP

We believe that the L+H* pitch accent is a mechanism for emphatically highlighting
an element relative to its context. Two other prosodic devices for emphatic
highlighting are pronouncing a high-pitch tone with increased pitch range or
pronouncing it with upstep. Another variation on a high pitch tone is pronouncing it
with downstep relative to a previous high pitch tone. The distribution of these three
alternatives to a plain high tone across information structure categories is shown in
Table 5.

Table 5. Distribution of Increased Range, Upstep and Downstep Relative to Information Structure Type

                     Range   Upstep   Downstep
                     ↑H      ¡H       !H
Ratified Topic       0       0        3
Contrastive Topic    4       0        15
Unratified Topic     5       0        16
Contrastive Focus    5       4        12
Plain Focus          3       5        12
TOTAL                17      9        58

It is clear from the table that downstep is distributed across the four substantive
information structure categories approximately equally, as is increased range.
Upstep, however, seems to mark focus, either contrastive or plain, although the data
are few. It might be worth following up on this latter tentative conclusion in a more
detailed study.

8. BOUNDARY TONES
Some of the claims and suggestions in the literature concerning topic and focus
accents have involved boundary tones. For example, Lambrecht and Michaelis
(1998) suggest in a footnote that H% might mark topic and L% mark focus. Table 6
shows the distribution in our data of intermediate phrase + boundary tone relative to
information structure type.

Table 6. Distribution of Phrase Accents and Boundary Tones Relative to Information Structure Types

                     Fall   Level   Rise   Rise from Bottom   TOTAL
                     LL%    HL%     HH%    LH%
Ratified Topic       2      0       0      0                  2
Contrastive Topic    7      4       1      1                  13
Unratified Topic     12     2       6      0                  20
Contrastive Focus    29     1       4      5                  39
Plain Focus          26     4       4      5                  39
TOTAL                76     11      15     11                 113

It can be seen from Table 6 that LL% is associated primarily with foci, whether
contrastive or plain, and foci are most likely to be marked by this boundary tone. It is
not surprising that foci as opposed to topics are marked by LL% since this sequence
tends to come at the end of the sentence, and topics tend to precede foci in the
sentences of the data.
Some non-final topics are, nonetheless, marked by LL%, as shown in example
(18):

(18) Mr. Barone: … we’re going to reconsider this decision that Clinton
made that would apply in six years from now, or 2006.
So nobody’s putting any extra arsenic in the water, but
Bush has given the Democrats a good talking point.
H*LL% (Unratified topic, 4.8)

There were three wh-questions and four yes-no questions that ended in phrases
we examined. Interestingly none of them received H% boundary tones. Two wh-
questions and two yes-no questions ended in LL%, and one wh-question and two
yes-no questions ended in HL%. The one alternative question in our data did end in
LH%; see example (4).

8.1. Does H% mark topic?

Lambrecht and Michaelis’s (1998) hypothesis, in particular, is not borne out by the
data. Table 7 shows that three quarters of both topics and foci are marked by L%, so
there is no difference between them in this regard.

Table 7. Boundary Tones Relative to Topic and Focus

           L%          H%          TOTAL
Topic      27 (77%)    8 (23%)     35
Focus      60 (77%)    18 (23%)    78
TOTAL      87          26          113
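The counts in Table 7 are obtained by collapsing Table 6: LL% and HL% both end in L%, HH% and LH% both end in H%, and the three topic categories and two focus categories are pooled. A minimal sketch of that collapse, using illustrative names (`TABLE_6`, `collapse`) that are not part of the original study, is:

```python
# (LL%, HL%, HH%, LH%) counts per category, copied from Table 6.
TABLE_6 = {
    "Ratified Topic":    (2, 0, 0, 0),
    "Contrastive Topic": (7, 4, 1, 1),
    "Unratified Topic":  (12, 2, 6, 0),
    "Contrastive Focus": (29, 1, 4, 5),
    "Plain Focus":       (26, 4, 4, 5),
}


def collapse(kind):
    """Pool all categories ending in `kind` and return (L% count, H% count)."""
    rows = [row for cat, row in TABLE_6.items() if cat.endswith(kind)]
    low = sum(ll + hl for ll, hl, _, _ in rows)   # LL% and HL% end in L%
    high = sum(hh + lh for _, _, hh, lh in rows)  # HH% and LH% end in H%
    return low, high


topic_low, topic_high = collapse("Topic")
focus_low, focus_high = collapse("Focus")

topic_low_share = topic_low / (topic_low + topic_high)
focus_low_share = focus_low / (focus_low + focus_high)
```

Comparing `topic_low_share` with `focus_low_share` shows how closely the two groups pattern with respect to the final boundary tone.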

9. ENTIRE TUNES
Finally, Pierrehumbert (1980) and Steedman (1991) proposed that topics are
associated with entire tunes, H*LH% and L+H*LH%, respectively. Let us first look
at H*LH%.

9.1. Does H* LH% mark topic?

As Table 8 shows, there are, perhaps surprisingly, only four examples of H*LH% in
our data.

Table 8. H*LH% Tune Relative to Information Structure Type

H*LH%
Plain Focus 4

All four of these mark plain focus, and all seem to mark continuation. For example,
(19) is a rejection of a previous participant’s contribution. It is continued with a
correction:

(19) Mr. McLaughlin: Lawrence and ah two other members are correct.
His style rating is probably a B, but your analysis
of how much he should be doing in the first 100
days is absurd. He’s taking one piece at a time
H* LH%
and he’s being very successful. He gets an A on
substance. (plain focus, 4.35)

9.2. Does L+H* LH% mark topic?

Steedman’s (1991) hypothesis that the L+H*LH% tune is associated with topics in
particular is also not borne out by the data. Although the data are few, Table 9 shows
that the distribution of L+H*LH% primarily targets contrastive foci, instead of
topics.

Table 9. L+H*LH% Tune Relative to Information Structure Type

L+H*LH%
Contrastive Topic 1
Contrastive Focus 5
Plain Focus 1

It is interesting that four of the five contrastive focus examples of this tune
function as contradictions. See, for instance, examples (20) and (21):

(20) Ms. Clift: Well, I think definitions of beauty or
handsomeness change over the years, and I,
frankly, think this guy is pretty attractive. I don’t
find him unattractive.
L+H* LH%
(contrastive focus, 3. 5)

(21) Mr. McLaughlin: Well, he’s been a successful politician, and he’s
been a successful statesman, has he not?
Mr. O’Donnell: He’s done – the only thing – he was in a box with
China. He did the only thing you could do. He
hasn’t done anything extraordinary.
L+H* LH%
(contrastive focus, 4.20)

The speaker in (20) is contradicting the proposition expressed by other participants
that the likeness of Jesus being discussed is unattractive. The speaker in (21) is
contradicting the proposition evoked by other participants that Bush’s 63% approval
rating after his first 100 days was due to his behaving in an extraordinary fashion, in
particular with regard to his handling of the Chinese fighter plane crisis (see note 3). We present
in figure 11 a pitch track for example (21):

Figure 11. He hasn’t done anything extraordinary. (contrastive focus, 4.20)

L+H*LH%

In future work on this project, we intend to correlate information structure with
entire tunes, i.e. full intonational phrases with specific combinations of heads and
nuclei, according to the sentence type.

10. CONCLUSION
We conclude that while there are systematic correlations between intonation and
information structure categories, these correlations are not as straightforward as is
suggested in the literature. In particular we deny that there is any prosodic category
as distinctive as a ‘topic accent’ as opposed to a ‘focus accent.’
With regard to L+H*, we found that it falls on contrastive topics and unratified
topics and contrastive foci 24-31% of the time and on plain foci 14% of the time. It
doesn’t just fall on topics. L+H* occurred in 41 of our analyzed phrases, or
approximately 20%, which is a significant number. This shows that this accent
deserves the reputation it has received in the literature.
Minor conclusions, given the relative lack of data, are that L* tends to mark focus
and that L*+H tends to mark topic. Upstep also seems to mark focus, although again
the data are few.
Except for ratified topics which tend to be unaccented, all information structure
categories were extensively marked with H*, including unratified and contrastive
topics. The fact that pitch accents with some kind of H* occur six times more often
than L* (150 versus 26) shows that American English is an H* language, as opposed
to other languages such as Spanish in which L* predominates, at least in prenuclear
positions (Sosa, 1999).
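The 150 versus 26 comparison can be recovered from the column totals of Table 4 by grouping each pitch accent according to its starred tone. The sketch below assumes the column labels H*, H*+L, H*+!H, L+H*, L*, L*+H and H+L*, as listed in the running text, and treats the starred tone as what is meant by ‘some kind of H*’; the names `TABLE_4_TOTALS` and `starred_tone` are illustrative only.

```python
# Column totals from Table 4; the labels are an assumption (see lead-in).
TABLE_4_TOTALS = {
    "H*": 100, "H*+L": 8, "H*+!H": 1, "L+H*": 41,
    "L*": 20, "L*+H": 5, "H+L*": 1,
    "unaccented": 34,
}


def starred_tone(label):
    """Return 'H' or 'L' for the tone that carries the star, else None."""
    for part in label.split("+"):
        if part.endswith("*"):
            return part[0]
    return None


high = sum(n for a, n in TABLE_4_TOTALS.items() if starred_tone(a) == "H")
low = sum(n for a, n in TABLE_4_TOTALS.items() if starred_tone(a) == "L")
ratio = high / low
```

Under this grouping, L+H* counts as a high accent and H+L* as a low one, which is what yields the totals quoted in the text.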
Finally, given the fact that our results qualify the conclusions assumed in the
literature, it is clear that investigations into intonation should be carried out on
naturally-occurring spontaneous dialogue as well as on constructed examples and
experimentally induced speech.

11. NOTES
* Part of this research was funded by a SSHRC Small Grant from Simon Fraser University, 2001.
1. For the contour in figure 2 the ToBI Guidelines would prescribe a notation H* L+H*. We decided
to use H*+L instead because the salient fall is completely realized during the word ‘thirty’.
The point here is that there is an important descent during this word, not that there is a rise for H* on the
word ‘years’.
2. We thank Jeanette Gundel for pointing out this general problem to us.
3. In (20) and (21), it has been suggested to us by Mark Steedman and Chungmin Lee that an alternative
information structure analysis would treat the marked phrase as topic. Note that this alternative analysis
can be justified by the ‘as for’ test as follows: ‘As for whether he is unattractive, I don’t find him so’ and
‘As for whether he has done anything extraordinary, he hasn’t.’ The point here is that the questions of
whether or not the Christ image is attractive and whether or not Bush has done something extraordinary
are relevant in their contexts and to some extent are already under discussion. Büring (2003) would also
analyze the accents in (20) and (21) as contrastive topic accents since (20) and (21) can be seen as
answers to implied subquestions in the discourse, e.g. (21) in the context of the explicit question ‘Has
Bush been a successful politician?’ negatively answers the subquestion ‘Has he done anything
extraordinary?’.

12. REFERENCES
Beckman, Mary E. and Gayle Ayers Elam. Guidelines for ToBI Labelling. Version 3. Columbus: Ohio
State University, Department of Linguistics, 1997.
Büring, Daniel, “On D-Trees, Beans, and B-accents.” Linguistics and Philosophy 26.5 (2003): 511-545.
Gundel, Jeanette. The Role of Topic and Comment in Linguistic Theory. Ph.D. Dissertation, University of
Texas, Austin, 1974.
Gundel, Jeanette. “Stress, Pronominalization and the Given-New Distinction.” University of Hawaii
Working Papers in Linguistics 10.2 (1978): 1-13.
Gundel, Jeanette. “Universals of Topic-Comment Structure.” In Michael Hammond, Edith A. Moravcsik
and Jessica R. Wirth (eds.), Syntactic Universals and Typology, pp. 209-242. Amsterdam and
Philadelphia: John Benjamins, 1988.
Gundel, Jeanette K. “On Different Kinds of Focus.” In Peter Bosch and Rob van der Sandt (eds.), Focus:
Linguistic, Cognitive, and Computational Perspectives, pp. 293-305. Cambridge: Cambridge University
Press, 1999.
Gundel, Jeanette K. and Thorstein Fretheim. “Topic and Focus.” In Laurence Horn and Gregory Ward
(eds.), The Handbook of Contemporary Pragmatic Theory. Oxford: Blackwell, (2004): 175-196.
Gussenhoven, Carlos. “Focus, Mode and the Nucleus.” Journal of Linguistics 19 (1983): 377-417.
Hedberg, Nancy. “The Referential Status of Clefts.” Language 76 (2000): 891-920.
Jackendoff, Ray. Semantic Interpretation in Generative Grammar. Cambridge, Mass.: MIT Press, 1972.
Lambrecht, Knud and Laura Michaelis. “Sentence Accent in Information Questions: Default and Projection.”
Linguistics and Philosophy 21 (1998): 477-544.
Nespor, Marina and Irene Vogel. Prosodic Phonology. Dordrecht: Foris, 1986.
Pierrehumbert, Janet. The Phonology and Phonetics of English Intonation. MIT: Ph.D. Dissertation, 1980.
[Bloomington, IN: Indiana University Linguistics Club, 1987].
Pierrehumbert, Janet and Julia Hirschberg. “The Meaning of Intonational Contours in the Interpretation of
Discourse.” In Philip R. Cohen, Jerry Morgan, and Martha E. Pollack (eds.), Intentions in Communication,
pp. 271-311. Cambridge: MIT Press, 1990.
Reinhart, Tanya. “Pragmatics and Linguistics: an Analysis of Sentence Topics.” Philosophica 27 (1981):
53-94.
Sosa, Juan M. La Entonación de Español. Madrid: Cátedra, 1999.
Steedman, Mark. “Structure and Intonation.” Language 67 (1991): 260-296.
Steedman, Mark. The Syntactic Process. Cambridge, Mass.: MIT Press, 2000a.
Steedman, Mark. “Information Structure and the Syntax-Phonology Interface.” Linguistic Inquiry 31 (2000b):
649-689.
Vallduví, Enric and Elisabet Engdahl. “The Linguistic Realization of Information Packaging.” Linguistics
34 (1996): 459-510.
EMIEL KRAHMER (1) AND MARC SWERTS (1,2)

PERCEIVING FOCUS

1. INTRODUCTION
Many linguists approach intonational matters from a purely speaker-oriented
perspective1. For instance, in different studies, in as far as these are empirical in
nature, evidence for particular tonal distinctions is often solely based on acoustic
analyses of fundamental frequency (F0) traces. However, if one wants to gain full
insight into how intonation ‘functions’, such an approach is arguably incomplete.
That is, a prosodic feature, as any other linguistic feature, can only be said to be
communicatively relevant if it is not only encoded in the speech signal by a speaker,
but if it also has an impact on how an utterance is processed by a listener. In other
words, claims about important intonational categories and their respective meanings
are somewhat premature if they are not backed up with results that show that these
are also relevant at the receiving end of the communication chain. Ideally, such an
analysis should be more than an individual linguist’s interpretation of a prosodic
phenomenon.
Unfortunately, one cannot simply take it for granted that all prosodic detail really
matters to a listener. One obvious, but sometimes neglected, condition is that tonal
variation clearly needs to be above a perceptual threshold to be functionally
relevant. In that respect, it is striking to see that many researchers attach functional
load to particular tonal distinctions, which, from a purely phonetic point of view, are
only minimally separable or even highly overlapping in “tonal space”. For instance,
the difference between H* and L+H*, as defined in the ToBI framework, has been
claimed to indicate semantically distinct categories such as rheme and theme
(Steedman 2000) or new and contrastive information (Pierrehumbert & Hirschberg
1990). Yet these two intonational categories are often confused by labellers who are
instructed to transcribe intonation (e.g., Pitrelli et al. 1994), even to the extent that
some investigators simply give up on the distinction. In comparison, many vowel
systems of the world obey a contrast principle, which states that any two vowels
need to be optimally distinct in order to be appropriately applicable in speech
communication (the idea of vowel dispersion, see e.g., ten Bosch 1991). Also,
linguistic systems are highly redundant in that speakers have various strategies at
their disposal to signal particular meanings. Since tonal markers of semantic events
often covary with morpho-syntactic, lexical or other prosodic cues, it is theoretically
possible that their communicative function is ‘overruled’ by that of other resources,
or by the situational or linguistic context in which they occur.
In this chapter, we argue that controlled perceptual studies allow us to investigate
the communicative importance of intonational features. Rather than concentrating on

C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 121–137.
© 2007 Springer.
subtle differences between intonational categories, we will illustrate this viewpoint
with a series of studies on the cue value of pitch accents. In languages such as Dutch
and English, the distribution of accents has been claimed to be exploited as a means
to distinguish important bits of information in an utterance from unimportant ones.
That is, in such languages, pitch accents serve as a linguistic strategy to put ‘new’ or
‘contrastive’ information in focus, whereas speakers take care not to highlight
‘given’ information, i.e., information which is present (explicitly or implicitly) in the
preceding context. However, it is unclear whether such observations generalize to
other languages as well; in addition, most studies on pitch accent distribution have
not looked at the relation of such accents, which are encoded in the speech signal
itself, to visual cues speakers may send to a communication partner. Therefore, in
order to gain insight into the relative cue value of pitch accents for signaling focus,
we will tackle this question in different perception tests, and approach it from (1) a
multilingual and (2) a multimodal perspective. In the studies presented below we
shall ignore newness accents, for reasons that will become clear later, and hence in
the present study being contrastive is equivalent to being in focus.
First, we will present a cross-linguistic approach, given that the communicative
importance of pitch accents is likely to be different for different languages. Consider
the following utterances (after Cruttenden 1993), i.e., readings of the scores in
English and Italian football reports (capitalized words indicate accents). In both
examples, the football score is a tie so that a particular number (the words ‘one’ and
‘uno’) is repeated:

English
TOTTENHAM ONE - LIVERPOOL one

Italian
UDINESE UNO - ROMA UNO

In a typical English realization of such scores, the second instance of ‘one’ is
deaccented, since it has just been mentioned in the preceding phrase. In the Italian
scores, the second instance of ‘uno’ is typically accented again, even though it is
literally given from the preceding context. This second accent in the Italian case is
due to the fact that Italian strongly disfavours deaccentuation within NPs or other
syntactic constituents (Ladd 1996:177-178 & p.c.). More generally, Italian, like other
Romance languages such as Catalan or French, is an example of what Vallduví
(1992) would call a non-plastic language, i.e., a language which has a rather
constrained intonation structure to mark information status, but which more heavily
uses word order variation for that purpose. These languages are different from the
plastic ones, such as Dutch and English, whose prosodic pattern is “moulded” to fit
the information structure, so that intonation is used to mark information status.
These claims do not entail that deaccentuation is impossible in Italian, as Ladd
acknowledges that deaccentuation on sentence level in Italian is entirely possible,
e.g., repeated full NPs may be deaccented (see also Avesani et al. 1995; Hirschberg
& Avesani 1997; D’Imperio 1997), but they do mean that under certain conditions
deaccentuation is infelicitous. The current study aims to test whether differences
between accent structures in Italian and Dutch, as two cases of a non-plastic and a
plastic language, respectively, have implications for the way listeners perceive focus.
Second, apart from variation between languages, the importance of pitch accents
may also depend on the communicative setting in which they are used, in particular
if we compare communicative settings in which dialogue participants can or cannot
see each other during a spoken interaction. Different studies have suggested that
there exist specific visual cues to focus structure as well. In particular, like pitch
accents, rapid eyebrow movements have been claimed to play an accentuation role
(e.g., Birdwhistell 1970, Condon 1976). It has even been argued that there is a one-
to-one connection between the two; see, for instance, the so-called Metaphor of Up
and Down (Morgan 1953, Bolinger 1985:202ff ): when the pitch rises or falls, the
eyebrows follow the same pattern. In fact, to see that there is indeed a close
connection between pitch and eyebrows, one may try to utter a two word phrase, say
“blue square”, with a pitch accent (but no corresponding eyebrow movement) on the
word “blue” and a rapid eyebrow movement (but no corresponding pitch accent) on
the word “square”. Most people find this a difficult exercise.
One of the few empirical studies devoted to the relation between pitch accents
and eyebrow movements is Cavé et al. (1996), who report on a significant
correlation between the two (in particular, and surprisingly, for the left eyebrow). It
appears that rapid eyebrow movements often co-occur with pitch accents. The
opposite is not the case: people do more with their pitch than with their eyebrows.
Cavé and co-workers suggest that eyebrow movements and pitch do not link
automatically (e.g., due to muscular synergy), but coincide for communicative
reasons. Naturally, this raises the question of what these communicative reasons might
be. In the literature on Talking Heads (i.e., combinations of computer animations
with speech), there is no consensus on the timing and placement of eyebrow
movements. Pelachaud et al. (1996) note that the decision to raise the eyebrows is
affect dependent, but in the examples they discuss, pitch accents and eyebrows
coincide. Thus to the question I know that Harry prefers POTATO chips, but what
does JULIA prefer? the Talking Head of Pelachaud et al. (1996:19) would respond
with the following utterance, in which capitalized words again indicate an accent,
whereas overlined words are accompanied by a rapid eyebrow movement:

( JULIA prefers) theme ( POPCORN ) rheme

Cassell et al. (2001) use eyebrow raising (or “flashes” as they call them) more
sparingly. The eyebrows are raised when an object in the “rheme” is described. So in
reply to the question above, the algorithm of Cassell et al. would not produce a
‘flash’ on “Julia”. It should be noted that neither Pelachaud et al. (1996) nor Cassell
et al. (2001) report on evaluation: it is not known whether the animations are
effective in the way human listeners process the information. We get no insight into
the contribution of the eyebrow movement: its function remains unclear. Again, to
learn more about the relative importance of pitch accents and eyebrow movements,
this issue is tackled in the current study from a perceptual point of view, testing how
listeners detect focus in audiovisual stimuli.

To facilitate comparisons across languages and across modalities regarding the
cue value of pitch accents to signal focus, we have set up a particular experimental
paradigm which can be applied to different languages and to both speech-only and
audiovisual stimuli. The experiment consists of a perceptual task in which listeners
essentially have to detect the main focus in an utterance. More specifically, subjects
are instructed to decide, solely on the basis of a particular utterance, what the
information would have been in the preceding utterance, i.e., subjects have to
‘reconstruct the dialogue history’. Rather than using manipulated speech materials
(read-aloud or synthetic) with controlled prosodic properties, the stimuli for the
perceptual task discussed here consist of semi-spontaneous data whose intonational
features are untouched when used in the test. By using naturally elicited speech
materials, one avoids the risk that one tests the effect of intonation contours that are
not representative of real data. For that purpose, we developed a specific dialogue
game that triggers speakers’ productions of different focus distributions in particular
target sentences. The paradigm works for different languages so that it becomes
easier to make cross-linguistic prosodic comparisons. In addition, the resulting
utterances can be combined with visual cues, which makes it possible to study the
relative cue value of accents and visual information.
In the next section, we first describe the experimental design to elicit accent
patterns in both Dutch and Italian utterances, and the method to create audiovisual
stimuli to be used in a series of perception tests. The following sections then
describe the procedure and the results of the actual experiments on the perception of
focus in speech-only stimuli in Dutch (study 1) and in Italian (study 2), and in
multimodal stimuli in Dutch (study 3). We end with a general discussion and a
conclusion.

2. MATERIALS

2.1. Speech
For all three studies, utterances were used which were obtained in a semi-
spontaneous way via a simple dialogue game. The game was played each time by
two subjects, call them A and B, separated from each other by a screen. Figures 1
and 2 visualize the experimental set-up with a bird’s-eye perspective on the starting
situation of the game and the situation after the first turn in the game. In each game,
both players have an identical set of eight cards at their disposal, each card showing
a geometrical figure in a particular colour. Four of these cards are put on a stack in
front of them, the four other cards are in a row before them. The four cards in the
stack of A are the same as the four cards in the row of B, and vice versa. The game
consists of a series of turns in which one participant gives instructions to select a
card with a particular geometrical figure and the other follows these instructions. In
each consecutive turn, the participants switch roles so that the original instruction-
giver becomes the instruction-follower, and the other way around. In turn 1, the
instruction giver, say A, begins with describing the figure on the top of his stack (“a
blue square”). After he has described this figure, he removes it from his stack and
puts it behind number 1 on his list. The instruction follower, B, listens to the
description of A and removes that figure from his row of figures, and also puts it
behind number 1 on his list. Now, the participants switch roles, so that B describes
the figure that is on top of his stack (“a black triangle”), and A follows the
instructions of B which will prompt both A and B to place the card with this object
on the second place in the row with figures, and so on. The game is over when both
players have no cards left. Each pair of subjects played a sequence of eight games,
each time separated by a break of at least two minutes. Note that the players are
given the instruction to describe the figure on top of their stack in terms of its colour
and figure property. Speakers generally found it a very easy game to play, and as a
consequence there are no faulty descriptions in the respective data sets.

Figure 1. Visualization of the initial set-up of the experiment to elicit different referring
expressions. A and B represent the two participants in the dialogue game. In the actual
experiment, the different figures were given different colours. Further explanations in the text

The speech data thus obtained allow for an unambiguous operationalization of
the relevant contexts. A property is defined to be new (N) to the conversation if it is
mentioned in the first turn of the current dialogue game; it is given (G) if it was
mentioned in the previous turn; and, finally, a property is contrastive (C) if the object
described in the previous turn had a different value for the relevant property. We
define a property to be in focus, if it is not given. (In the three studies described
below, we will ignore newness and hence in these studies a property will be in focus
if, and only if, it is contrastive.) By systematically varying the order of the cards in
the stack, target descriptions (Dutch: “blauw vierkant” (blue square); Italian:
“triangolo nero” (black triangle)) could be collected in all contexts of interest: no
contrast (all new, NN), contrast in the prefinal word (CG), contrast in the final word
(GC), all contrast (CC). Notice that in the 2-letter abbreviations, the first letter
corresponds with the contextual status of the first word, and the second letter with
the contextual status of the second word. Table 1 summarizes the situation. It is
worth noting that in the Dutch elicited utterances the adjective always precedes the
noun, whereas in the Italian data it follows the noun. In other words, if we refer to
the first word in the elicited NPs, we mean the adjective in case of the Dutch data,
and we mean the noun in the case of the Italian data.

Table 1. Examples of the four contexts

Context                   Preceding turn          Target utterance
NN (beginning of game)    (none)                  B: “blue square”
CC                        A: “red circle”         B: “blue square”
CG                        A: “yellow square”      B: “blue square”
GC                        A: “blue triangle”      B: “blue square”
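
The context definitions given above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the representation of a description as a pair of words are hypothetical and do not come from the original study. A description whose two words both repeat the previous turn would be labelled "GG", a case that is not among the four contexts of interest.

```python
from typing import Optional, Tuple

# A description is a pair (first word, second word), e.g. ("blue", "square").
# This representation is an assumption made for illustration only.
Description = Tuple[str, str]


def property_status(current: str, previous: Optional[str]) -> str:
    """Return "N", "G" or "C" for one property of the current description."""
    if previous is None:       # first turn of the game: nothing precedes
        return "N"
    if current == previous:    # same value as in the previous turn
        return "G"
    return "C"                 # the previous object had a different value


def context_label(current: Description, previous: Optional[Description]) -> str:
    """Two-letter label: first letter for the first word, second for the second."""
    prev_first, prev_second = previous if previous is not None else (None, None)
    return (property_status(current[0], prev_first)
            + property_status(current[1], prev_second))


def in_focus(status: str) -> bool:
    """A property is in focus if it is not given."""
    return status != "G"


target = ("blue", "square")
for prev in [None, ("red", "circle"), ("yellow", "square"), ("blue", "triangle")]:
    print(prev, "->", context_label(target, prev))
```

The same function also describes the listeners' task in reverse: choosing a preceding utterance amounts to choosing which of the labels CG, GC or CC applies to the target.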

Figure 2. Visualization of the set-up of the experiment after A’s first move (“blue square”)

Eight Dutch speakers were recruited from students and colleagues from IPO,
speaking the variant of standard Dutch as spoken in the Netherlands; the eight
Italian speakers we recorded were all living in Italy, and were native speakers of the
Tuscan variety of Italian. The Dutch speech materials are used in studies 1 and 3, the
Italian ones in study 2.

2.2. Animations
For study 3 we combined the Dutch speech materials with an animated talking head.
Since this was a male head, we only used the four male voices collected for Dutch.
In addition, two synthetic male voices were used, copying the intonation contours of
two of the human voices. We use both synthetic and natural voices in order to see to
what extent the naturalness of the voice influences the perception of focus. A human
voice has more natural and better sounding prosody, but a synthetic voice might be
better suited to accompany the visual counterpart of a synthetic character. A Dutch
diphone speech synthesizer was used for the generation of the two synthetic
versions. The animations were produced with the CharToon environment (Ruttkay
et al. 1999). A 2D head of a male person formed the basis of the animations. Visual
speech is generated on the basis of a set of 48 visemes (elementary mouth positions).
Phonemes from the input are matched to corresponding visemes with a sampling
rate of 100 ms, while intermediate stages are computed using linear interpolation.
Rapid eyebrow movements coincide with the stressed syllable of either the first
(“blauwe”) or the second word (“vierkant”). Notice that these are the eyebrow
counterparts of focus on the adjective and focus on the noun respectively. We did
not include an eyebrow counterpart to “all focus”, since this would involve either a
raised eyebrow for a longer stretch of time or two rapid eyebrow movements in
succession. For Dutch subjects both of these primarily have a non-focus signalling
interpretation. It is worth stressing that in certain stimuli eyebrow movements are
associated with words which are not accented. Eyebrow movements always had the
following pattern: first, a 100 ms dynamic raising part, then a static raised part of
100 ms, and finally a 100 ms dynamic lowering part. The overall length of the
movement is comparable to the average duration of rapid eyebrow movements of
human speakers (± 375 ms, Cavé et al. 1996). We opted for slightly shorter
movements due to the overall short duration of the stimuli. Figure 3 shows two stills
from a typical animation used in the experiment.
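
The timing of the eyebrow movement described above can be written as a simple piecewise function. The sketch below is an illustration, not the CharToon implementation: the function name, the normalized amplitude (0 for rest, 1 for fully raised) and the linear shape of the raising and lowering parts are assumptions. Only the three 100 ms phases are taken from the text.

```python
def eyebrow_height(t_ms: float, onset_ms: float = 0.0,
                   rise_ms: float = 100.0, hold_ms: float = 100.0,
                   fall_ms: float = 100.0) -> float:
    """Normalized eyebrow displacement (0 = rest, 1 = fully raised) at time t_ms.

    The raising and lowering parts are assumed to be linear ramps.
    """
    t = t_ms - onset_ms
    total = rise_ms + hold_ms + fall_ms
    if t <= 0 or t >= total:
        return 0.0                      # before onset or after the movement
    if t < rise_ms:
        return t / rise_ms              # dynamic raising part
    if t <= rise_ms + hold_ms:
        return 1.0                      # static raised part
    return 1.0 - (t - rise_ms - hold_ms) / fall_ms  # dynamic lowering part


# Sample the movement every 50 ms; the whole gesture lasts 300 ms.
for t in range(0, 350, 50):
    print(t, eyebrow_height(t))
```

With the default values the gesture lasts 300 ms in total, which is the "slightly shorter" duration mentioned in the text relative to the human average of roughly 375 ms.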

Figure 3. Two stills from the Talking Head uttering “blauw vierkant” (blue square) with a
raised eyebrow on the first word (left) and no eyebrow action on the second word (right)

3. STUDY 1: FOCUS IN DUTCH

3.1. Preliminaries
The first study tests to what extent Dutch listeners are able to determine the main
focus in an utterance by means of pitch accent distribution. For this purpose, we
used data collected via the game described above. Before performing the dialogue
reconstruction experiment, a distributive analysis of the target utterance “blauw
vierkant” (blue square) was carried out. A consensus labelling was done by three
independent intonation experts. The results of the labelling can be summarized as
follows: in most cases, a property which is in focus receives a pitch accent.
Interestingly the only exceptions to this general rule can be attributed to speaker
differences among the eight speakers. One group of four speakers always end their
utterance on a low boundary tone and always associate focused properties with a
pitch accent. The four remaining speakers uniformly employ high boundary tones,
and they associate the CC utterances with a single accent on the noun.

3.2. Procedure
Dialogue reconstruction data were obtained from 25 native speakers of Dutch
(different from the eight speakers). The experiment was performed on an individual
basis and was self-paced. All three versions (CG, GC and CC) of the target utterance
(“blauw vierkant”) produced by the eight speakers were used, making a total of 24
stimuli. In studies 1 and 3, Dutch subjects are presented with speech realizations of
“blauw vierkant” taken from their original context, and the task is to determine by
forced choice whether the preceding utterance would be: (1) “rood vierkant” (red
square), (2) “blauwe driehoek” (blue triangle) or (3) “rode driehoek” (red triangle).
The corresponding contexts are (1) CG (focus on the first word), (2) GC (focus on
the second word) and (3) CC (all focus), respectively. The stimuli were presented in
two random orders, to compensate for potential learning effects. Before the actual
experiment started subjects entered a brief training session (consisting of three
stimuli) to make them acquainted with the materials and the setting of the
experiment. No feedback was given on the correctness of their answers, and there
was no communication with the experimenters. Notice that the all new situation
(NN) is not incorporated in the experiment, because there are no utterances
preceding the NN so that subjects cannot reconstruct the preceding utterance. The
NN utterances have been studied extensively in Krahmer & Swerts (2001), to find
out whether there are prosodic differences between newness and contrastive accents
in this setting2.
3.3. Results
Table 2 contains the results for all eight speakers taken together. The overall
distribution is significantly different from chance (χ² = 395.3, df = 4, p < 0.001). The
first thing to note is that for each line the highest numbers are on the diagonal. This
means that each context is most likely to be classified correctly. However, these
chances are much higher in the case of single focus, on contrastive items (CG and
GC) than in the all focus case (CC). Subjects are particularly good at reconstructing
the dialogue history when the adjective is the single focused item (note that these are
the classic cases of narrow scope), which stands out prosodically due to the
occurrence of a nuclear accent in non-default position. However, also when it is the
noun that is the single item in focus, subjects are generally capable of reconstructing
the context. Interestingly, the number of confusions with the all focus (double
contrast) context increases. This seems to imply that there is at least some amount of
broad focus / narrow focus ambiguity (but see below), although the narrow focus
interpretation is still prevalent. This result is compatible with earlier findings from
Gussenhoven (1983) and Rump & Collier (1996) that these ambiguous cases are
more confusable than the CG case, which only allows a narrow focus interpretation.
In the case of double contrast there appears to be a very substantial broad vs. narrow
focus confusion.
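
The kind of test reported above can be illustrated with the counts from Table 2. The sketch assumes that the statistic is a Pearson χ² test of independence between actual context and classification; the text does not spell out the computation, and the function name is hypothetical. Under that assumption the value comes out at roughly 395.1 with df = 4, close to but not identical with the reported 395.3, so the authors' exact procedure may have differed slightly.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if classification were independent of context.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Rows: actual context CC, GC, CG; columns: classified as CC, GC, CG (Table 2).
table2 = [
    [95, 83, 22],
    [60, 119, 21],
    [10, 6, 184],
]
chi2, df = chi_square_independence(table2)
print(f"chi2 = {chi2:.1f}, df = {df}")
```

For df = 4 the critical value at p = 0.001 is about 18.5, so a statistic of this size is far beyond the threshold.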
However, looking at the results for each speaker separately (all significantly
different from chance as well), reveals an interesting difference between high and
low boundary speakers. The main difference between speakers is found for the

Table 2. Summary of the results of Study 1: classification of all 24 stimuli, for all 25 listeners
(n=600). The vertical axis indicates the actual CONTEXT of the target utterance “blauw
vierkant” (blue square). The horizontal axis indicates how many subjects CLASSIFIED the
utterance in each of the three contexts

                 CLASSIFIED as
CONTEXT     CC     GC     CG     Total
CC          95     83     22     200
GC          60    119     21     200
CG          10      6    184     200

double contrast (CC) case. For low boundary speakers, utterances made in a CC
context are predominantly classified as CC. Strikingly, this is not the case for high
ending speakers, whose CC utterances are very frequently classified as GC
utterances, which matches the earlier observation that these speakers tend to produce
all-contrast utterances with a single accent on the noun. Thus, the fact that in table 2
CC utterances are often misclassified as GC utterances is essentially due to the
difference between low and high ending speakers rather than broad vs. narrow focus
interpretations.

4. STUDY 2: FOCUS IN ITALIAN

4.1 Preliminaries
The second study tests to what extent Italian listeners are able to determine the
main focus and reconstruct the dialogue history of an utterance using prosodic cues.
Before performing the dialogue reconstruction experiment, a distributive analysis of
the target utterances “triangolo nero” (black triangle) was performed. Three
independent intonation experts listened to all realizations of “triangolo nero”
produced by the eight speakers in the various contexts of interest, and decided on
which words they perceived an accent. The three judges were in full agreement:
every word is always accented, irrespective of context. All speakers produce the
same contour, namely a flat hat shape with the second accent downstepped with
respect to the first. Of course, it might be that different kinds of accents are realized
in different contexts. However, an analysis of the fundamental frequency did not
reveal any differences between contexts (see Swerts et al. 2002). In addition, we
found no evidence for a clear correlation between information status and the
perceived prominence of accents for the Italian data. Therefore, it seems a
reasonable hypothesis that, contrary to the Dutch subjects, Italian subjects will not
be able to reconstruct the dialogue history on the basis of prosodic cues.
4.2 Procedure
Subjects of the second dialogue reconstruction experiment were 25 native speakers
of Italian (different from the eight speakers), mostly from Tuscany. The experiment
was performed on an individual basis and was self-paced. All three versions (CG,
GC and CC) of the target utterance (“triangolo nero”) produced by the eight
speakers were used, making a total of 24 stimuli. In this study, Italian subjects hear
versions of “triangolo nero” (black triangle), and have to guess whether the
preceding utterance was (1) “rettangolo nero” (black rectangle), (2) “triangolo viola”
(violet triangle) or (3) “rettangolo viola” (violet rectangle), again representing the
following contexts: (1) CG (focus on the first word), (2) GC (focus on the second
word) and (3) CC (all focus) respectively. The stimuli were again presented in two
random orders, to compensate for potential learning effects. Before the actual
experiment started subjects entered a brief training session (consisting of three
stimuli) to make them acquainted with the materials and the setting of the
experiment. No feedback was given on the correctness of their answers, and there
was no communication with the experimenters.
4.3 Results
The results of the Italian reconstruction experiment on the basis of all eight speakers
are displayed in table 3. A χ² analysis reveals that the distribution is not
significantly different from chance. Looking at the results of the eight individual
speakers, we see that the results for seven of them are not significant.3 The picture is
significantly different from the one obtained for the Dutch data (Pearson χ² = 223.8,
df = 8, p < 0.001). Thus, as expected, Italian listeners are not able to reconstruct the
prior dialogue context on the basis of prosodic properties of the current utterance, in
contrast to Dutch listeners.
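
The contrast between the two listener groups can also be seen in the raw proportion of correct classifications, i.e., the diagonal of each table divided by the total. The sketch below uses the counts from Tables 2 and 3. The function name is hypothetical, and the chance level of 1/3 assumes that the three response options are equally likely under guessing.

```python
def proportion_correct(table):
    """Share of responses on the diagonal (classified context = actual context)."""
    correct = sum(table[i][i] for i in range(len(table)))
    total = sum(sum(row) for row in table)
    return correct / total


# Rows: actual context CC, GC, CG; columns: classified as CC, GC, CG.
dutch = [[95, 83, 22], [60, 119, 21], [10, 6, 184]]      # Table 2
italian = [[52, 70, 78], [53, 82, 65], [61, 73, 66]]     # Table 3

chance = 1 / 3  # assumption: three equally likely answers when guessing
print(f"Dutch:   {proportion_correct(dutch):.3f}")
print(f"Italian: {proportion_correct(italian):.3f}")
print(f"Chance:  {chance:.3f}")
```

On these counts the Dutch listeners are correct in 398 of 600 cases, whereas the Italian listeners are correct in 200 of 600, which coincides with the assumed chance level.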

Table 3. Summary of the results of Study 2: classification of all 24 stimuli, for all 25 listeners
(n=600). The vertical axis indicates the actual CONTEXT of the target utterance “triangolo
nero” (black triangle). The horizontal axis indicates how many subjects CLASSIFIED the
utterance in each of the three contexts

                 CLASSIFIED as
CONTEXT     CC     GC     CG     Total
CC          52     70     78     200
GC          53     82     65     200
CG          61     73     66     200

5. STUDY 3: FOCUS IN AUDIO-VISUAL SPEECH

5.1 Preliminaries
In the third study we investigate the relative contributions of pitch accents and
eyebrow movements to the perception of focus in Dutch. For this purpose, we use an
animated male Talking Head and six different male voices. Four of these voices are
human, and have also been used in study 1. The two remaining voices are synthetic,
with the respective intonation contours copied from two of the human speakers. This
makes it possible to compare the results of study 3 with those of study 1. The rapid
eyebrow movements have been shown to be clearly perceivable. A further test
indicated that the eyebrow movements boost the perceived prominence of words that
also receive a pitch accent, and downscale the prominence of unaccented words in
the direct context of the accented word (see Krahmer et al. 2002b). The question of
interest to us here is whether this also has functional ramifications.
5.2 Procedure
A total of 25 native speakers of Dutch participated in the audio-visual dialogue
history reconstruction experiment (different from the eight speakers, and also
different from the 25 listeners from study 1). The experiment was individually
performed and self-paced. Subjects watched and listened to the Talking Head
uttering the two-word phrase “blauw vierkant” (blue square), with a particular
intonation contour (taken from its original context; CG, GC or CC) and a rapid
eyebrow movement on either the first or the second word. Eyebrow movements are
indicated with a hat on the relevant item; the resulting six contexts are ĈG, CĜ, ĜC,
GĈ, ĈC and CĈ. Since six voices are used the total number of stimuli is 36. The
stimuli were displayed on a high-resolution color PC screen, sound came over the
loudspeakers to the left and the right of the screen. Dutch subjects had to perform
the same task as those of study 1, except that they were now presented with
audiovisual stimuli. The stimuli were presented in two different random orders, to
compensate for possible learning effects. Before the experiment started, subjects
entered a brief training session (consisting of three stimuli) to make them acquainted
with the material and the setting of the experiment. No feedback was given on the
‘correctness’ of their answers and there was no communication with the conductor
of the experiment.

Table 4. Summary of the results of Study 3: classification of all 36 stimuli, for all 25 listeners
(n= 900). The vertical axis indicates the actual CONTEXT of the target utterance “blauw
vierkant” (blue square) plus the word which is associated with a rapid eyebrow movement.
The horizontal axis indicates how many subjects CLASSIFIED the utterance in each of the
three contexts

                 CLASSIFIED as
CONTEXT    CC    GC    CG    Total
ĈC         64    41    45    150
CĈ         59    70    21    150
ĜC         34    91    25    150
GĈ         33    90    27    150
ĈG         16    22   112    150
CĜ         16    30   104    150

5.3 Results
Table 4 summarizes the results. The total distribution is significantly different from
chance: χ² = 292.2, df = 10, p < 0.001. First consider the cases with single pitch
accents, i.e., the cases with a single prosodic focus on either the adjective or the
noun. Notice that in these cases the majority of subjects indeed perceived the focus
on the adjective or the noun respectively, no matter which of the words is
accompanied by an eyebrow movement. Subjects are somewhat more likely to
classify the cases with the prosodic focus on the adjective correctly than those with
prosodic focus on the noun. Certainly for these single prosodic focus cases, the
distribution of pitch accents is more important for the perception of focus than the
placement of eyebrow movements. This is also reflected by the fact that in the post-
experiment interview, all subjects indicated that they paid most (if not all) attention
to information in the auditory channel. Nevertheless, there is an overall effect of
eyebrow movements: the distribution obtained with an eyebrow movement on the
first word is significantly different from the distribution with a movement on the
second word (χ² = 19, df = 8, p < 0.025). Closer inspection of Table 4 reveals that
this is primarily due to cases with a double pitch accent. If we compare the cases in
which the first word (the adjective “blauw”) is associated with a rapid eyebrow
movement with the cases in which the first word is not associated with such a
movement, we see that in the former case the focus is perceived on the first word in
45 instances, as opposed to 21 in the latter situation. And, conversely, when we
compare the cases in which the second word (the noun “vierkant”) is associated with
a rapid eyebrow movement with the cases in which it is not, we see that in the
former case 70 times a subject classified the noun as being in focus as opposed to
only 41 times in the latter case. In other words, when the intonation contour
provides fewer cues about the focus (since it contains two pitch accents), eyebrow
movements have relatively more impact. Overall, the results for the four human
voices are similar to the results for two synthetic voices, albeit that the effect of
eyebrow movements is a bit (but not significantly) more pronounced for the
synthetic ones. One subject explicitly indicated that she “trusted” the human voices
more than the synthetic ones, and thus paid special attention to pitch accents in the
former situation.
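
The first statistic reported above can be recomputed from the counts in Table 4. The sketch below assumes that the reported value is a Pearson chi-square test of independence over the 6 × 3 table, without continuity correction. The chapter does not name the exact test, so this is an assumption, and the function name is purely illustrative.

```python
# Counts from Table 4: one row per context (in table order),
# columns are the classifications CC, GC, CG.
TABLE_4 = [
    [64, 41, 45],
    [59, 70, 21],
    [34, 91, 25],
    [33, 90, 27],
    [16, 22, 112],
    [16, 30, 104],
]

def pearson_chi_square(rows):
    """Return (chi-square, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    total = sum(row_totals)
    chi2 = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            # Expected count if context and classification were independent
            expected = row_total * col_total / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, df

chi2, df = pearson_chi_square(TABLE_4)
print(f"chi2 = {chi2:.1f}, df = {df}")  # chi2 = 292.2, df = 10
```

Under this assumption the computation agrees with the value and the degrees of freedom given in the text.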

6. DISCUSSION AND FUTURE WORK

The perceptual approach to intonational phenomena has most strongly been
promoted in the so-called IPO school of intonation (’t Hart et al. 1990). The original
goal of this approach was to develop a formal metalanguage to describe the
intonational properties of Dutch and a few other languages. Starting from the
observation that perception acts as a filter that can stylize the acoustic signal, this
enterprise has led to a phonetically explicit specification of a few basic intonational
categories, i.e., a limited set of pitch rises and falls, that serve as building blocks out
of which larger intonation contours can be constructed. In the current chapter, we
have shown that such a perceptual approach is also useful to gain insight into
functional aspects of intonation. In particular, we have shown that it helps to
comprehend how useful pitch accents are as signals of focus. This research question
was tackled from a multilingual and multimodal perspective, applying a particular
experimental approach, which consists of a dialogue game to elicit target utterances
in different discourse contexts, and a series of perception tests to evaluate the
functions of accents in different languages and in different communicative settings.
As to the results of the current study, we have found that the two languages
investigated, Dutch and Italian, are markedly different regarding accent patterns
inside NPs. In Dutch, it appears that accent patterns are indeed used to mark
information status: accent distribution is the main discriminative factor with new
and contrastive information generally accented, while given information is
deaccented. Study 1 shows that our Dutch listeners are able, in the majority of
cases, to reconstruct the prior dialogue utterance on the basis of properties of
the current utterance. Italian differs from Dutch in terms of accent structure:
distribution is not a significant factor in this language, since within the elicited NPs
both adjective and noun are always accented, regardless of the information status.
As a result, it is not surprising that the Italian listeners fail completely to interpret
the target utterances in terms of the dialogue history (study 2). As noted in the
introduction, Italian, being a non-plastic language, has other means besides prosody
of marking information status. For instance, it has a freer word-order than plastic
languages such as Dutch, and it is known to exploit this freedom to mark
information status. However, the constraints of the experimental paradigm did not
offer any room for Italian speakers to use word-order as an indicator of information
status. Therefore it would be interesting to look for an experimental set-up in which
speakers have more freedom to describe a particular state of affairs. This might also
shed a different light on the deaccentuation debate, given that Ladd claims that
deaccentuation of complete NPs within a sentence is quite possible in languages like
Italian, which is supported by data from previous studies (Avesani, Hirschberg &
Prieto 1995, D’Imperio 1997, Hirschberg & Avesani 1997).
Regarding the outcome of the audiovisual test (study 3), we have found that both
auditory (accent distribution) and visual (eyebrow movement) cues can have a
significant effect on the perception of focus. However, the effect clearly differs in
magnitude; the impact of pitch accents is large, that of rapid eyebrow movements
comparatively small. The visual cues contribute more when the auditory cues are
inconclusive. Thus, for the condition which caused most confusion in study 1,
eyebrows contribute the most in study 3. One consequence of the overall dominance
of speech is that inconsistent cues go largely unnoticed (although a recent
experiment indicates that subjects have a preference for animations in which
eyebrow movements coincide with pitch accents, Krahmer et al. 2002b). That the
auditory cues appear to be more important for focus perception may —with
hindsight— be explained as follows: since human speakers do more with their pitch
than with their eyebrows, it is not unnatural that human listeners have learned to pay
more attention to changes in pitch than to eyebrow movements. It is interesting to
compare the results of study 3 with those of study 1. Since the auditory cues
dominate the visual ones, it is no surprise that the results basically confirm the
speech-only results of study 1. Nevertheless, there is clearly more confusion in the
audio-visual case. In part, the increase in confusion can be ascribed to the presence
of the eyebrow movements. Certainly, they account for much of the “confusion” in
the cases with a double pitch accent. However, eyebrows cannot account for the
slight increase in confusion for the cases with a single pitch accent. It might be that
the mere addition of a visual channel leads to more confusion (compare Doherty-
Sneddon et al. 2001).
As a possible follow-up study, it would be useful to investigate real speaker behaviour in
natural interactions to gain more insight into possible visual cues. For study 3, use
was made of an analysis-by-synthesis technique, creating stimuli whose visual
properties were systematically varied to learn more about the relative effect of this
parameter on focus perception. While the manipulations were inspired by claims in
the literature, it would be nice to supplement the current results with findings of
observations on real speakers to see whether they indeed use eyebrow movements
for signaling focus as suggested here, or whether these mainly signal other types of
information, if any. It would also be highly interesting to see what happens with
Talking Heads for non-Germanic languages such as Italian. As shown above, the
results of study 2 reveal that Italian listeners systematically fail to correctly classify
the Italian utterances in terms of dialogue history when confronted with speech-only
stimuli. We are currently planning to do the dialogue reconstruction experiment with
an Italian Talking Head lifting its eyebrows on either the first (“triangolo”) or the
second word (“nero”). We would expect that rapid eyebrow movements have more
impact for the Italian head than for the Dutch one, since the auditory cues are less
informative for Italian than for Dutch. This would be in line with one of the findings
of study 3, that eyebrow movements become more important when pitch cues are
less clear.4

(1) Tilburg University, Communication & Cognition

(2) Antwerp University, Center for Dutch Language and Speech

7. NOTES

1
This chapter presents an overview of our work on the perception of focus, a research topic that we have
been involved with since 1998. The studies focusing on the dialogue reconstruction for Dutch and Italian
are presented with more detail in Swerts et al. (2002). A preliminary version of the third, audiovisual
study is described in Krahmer et al. (2002a). Thanks are due to our colleagues Cinzia Avesani, Zsófia
Ruttkay and Wieger Wesselink for their help in carrying out these studies.
2
Superficially, newness accents and contrastive accents appear to differ in our data, but a closer look
reveals that this is not the case. In particular, at first sight it seems that (1) single contrastive items on the
adjective (CG) have a different shape from newness accents in the same position and (2) contrastive items
are judged to be more prominent than newness accents. However, (1) the difference in accent type is not
so much associated with a contrast-specific prosodic shape but with the occurrence of a nuclear accent in
a non-default position. And (2) the perceived prominence is not so much the result of inherent melodic
properties of contrastive accents but seems due to the fact that the prosodic context does not contain other
intonationally comparable pitch peaks. When the words are presented in isolation, contrastive accents are
not perceived as more prominent than newness accents.
3
The results for the eighth speaker were just above the significance threshold. This was due to the fact that
his CC utterance was often classified as CG. There is no obvious reason for this. Anyway, it is hard to see
how this can be related to information status.
4
POSTSCRIPT (2004) Since the first version of this chapter was written (2002), both follow-up studies
mentioned in the discussion have been carried out. Swerts and Krahmer (2004) report on a production
experiment in which subjects were asked to pronounce short utterances with one syllable marked for
focus. When the audio-visual recordings were analysed, it was indeed found that subjects may use
eyebrow movements to signal focus, but various other cues were found of which head movement and
visual articulatory emphasis were the strongest. Krahmer and Swerts (2004) describe a series of
experiments with an Italian Talking Head. Contrary to our expectations, Italian subjects made less
functional use of eyebrow movements than Dutch subjects. In general, we found a number of interesting
differences between subjects’ evaluation of Dutch and Italian Talking Heads, but all of these could be
reduced to prosodic differences between the two languages.

8. REFERENCES

Avesani C. “I Toni della RAI. Un Esercizio di Lettura Intonativa”. In Gli Italiani Trasmessi: la Radio,
pp. 659-727. Firenze : Accademia della Crusca, 1997.
Avesani, C., J. Hirschberg, and P. Prieto. “The Intonational Disambiguation of Potentially Ambiguous
Utterances in English, Italian and Spanish.” Proceedings of the 13th International Congress of
Phonetic Sciences, pp.174-177, 1995.
Birdwhistell, R. Kinesics and Context. Philadelphia: University of Pennsylvania Press, 1970.
Bolinger, D. Intonation and its Parts, London: Edward Arnold, 1986.
ten Bosch, L. On the Structure of Vowel Systems. Aspects of an Extended Vowel Model Using Effort and
Contrast. University of Amsterdam: Doctoral dissertation, 1991.
Cassell, J., H. Vilhjálmsson, and T. Bickmore. “BEAT: the Behavior Expression Animation Toolkit.”
Proceedings of SIGGRAPH'01, Los Angeles, CA, pp.477-486, 2001.
Cavé, C., I. Guaítella, R. Bertrand, S. Santi, F. Harlay, and R. Espesser. “About the Relationship between
Eyebrow Movements and F0 Variations.” Proceedings of the International Conference on Spoken
Language Processing (ICSLP), Philadelphia, pp. 2175-2179, 1996.
Condon, W. “An Analysis of Behavioral Organization.” Sign Language Studies 13 (1976): 285-318.
Cruttenden, A. “The De-accenting and Re-accenting of Repeated Lexical Items.” Proceedings of the
ESCA Workshop on Prosody, Lund, pp. 16-19, 1993.
Doherty-Sneddon, G., L. Bonner, and V. Bruce. “Cognitive Demands of Face Monitoring: Evidence for
Visuospatial Overload.” Memory and Cognition 29.7 (2001): 909-919.
Gussenhoven, C. “Testing the Reality of Focus Domains.” Language and Speech 26 (1983): 61-80.
’t Hart, H., R. Collier and A. Cohen. A Perceptual Study of Intonation: An Experimental-Phonetic
Approach to Speech Melody, Cambridge: Cambridge University Press, 1990.
Hirschberg, J. and C. Avesani. “The Role of Prosody in Disambiguating Potentially Ambiguous
Utterances in English and Italian.” Proceedings of the ESCA Workshop on Intonation, Athens, pp.
189-192, 1997.
D’Imperio, M. “Narrow Focus and Focal Accent in the Neapolitan Variety of Italian.” Proceedings of the
ESCA Workshop on Intonation, Athens, pp. 87-90, 1997.
Krahmer, E. and M. Swerts. “On the Alleged Existence of Contrastive Accents.” Speech Communication
34 (2001): 391-405.
Krahmer, E. and M. Swerts. “More about Brows.” In Zs. Ruttkay and C. Pelachaud (eds.), Evaluating
ECAs. Dordrecht: Kluwer Academic Publishers, 2004.
Krahmer, E., Zs. Ruttkay, M. Swerts, and W. Wesselink. “Pitch, Eyebrows, and the Perception of Focus.”
Proceedings of Speech Prosody, Aix-en-Provence, pp. 443-446, 2002a.
Krahmer, E., Zs. Ruttkay, M. Swerts, and W. Wesselink. “Perceptual Evaluation of Audiovisual Cues for
Prominence.” Proceedings of the International Conference on Spoken Language Processing
(ICSLP), Denver, CO, pp. 1933-1936, 2002b.
Ladd, D. Intonational Phonology. Cambridge: Cambridge University Press, 1996.
Morgan, B. “Question Melodies in American English.” American Speech 2 (1953): 181-191.
Pelachaud, C., N. Badler, and M. Steedman. “Generating facial expressions for speech.” Cognitive
Science 20 (1996): 1-46.
Pierrehumbert, J. and J. Hirschberg. “The Meaning of Intonational Contours in the Interpretation of
Discourse.” In P. Cohen, J. Morgan, and M. Pollack (eds.), Intentions in Communication, pp.
342-365. Cambridge MA: MIT Press, 1990.
Pitrelli, J.F., M. Beckman, and J. Hirschberg. “Evaluation of Prosodic Transcription Labeling Reliability
in the ToBI Framework.” Proceedings of the International Conference on Spoken Language
Processing (ICSLP), Yokohama, Japan, pp.123-126, 1994.
Rump, H.H. and R. Collier. “Focus Conditions and the Prominence of Pitch-Accented Syllables.”
Language and Speech 39 (1996): 1-15.
Ruttkay, Zs., P. ten Hagen, and H. Noot. “CharToon; a system to Animate 2D Cartoon Faces.”
Proceedings Eurographics, 1999.
Steedman, M. “Information Structure and the Syntax Phonology Interface.” Linguistic Inquiry 31.4
(2000): 649-689.
Swerts, M. and E. Krahmer. “Congruent and Incongruent Audiovisual Cues to Prominence.” Proceedings
of Speech Prosody, Nara, Japan, 2004.
Swerts, M., E. Krahmer, and C. Avesani. “Prosodic Marking of Information Status in Dutch and Italian:
A Comparative Analysis.” Journal of Phonetics 30.4 (2002): 629-654.
Vallduví, E. The Informational Component. University of Pennsylvania: Doctoral dissertation, 1990.
MANFRED KRIFKA

THE SEMANTICS OF QUESTIONS

AND THE FOCUSATION OF ANSWERS*

1. INTRODUCTION
In Krifka (2001) I argued that three distinct phenomena of question semantics –
alternative questions like Did it rain or not?, multiple constituent questions with
pair-list readings like Who bought what? and the focus patterns of answers to con-
stituent questions – cannot be dealt with adequately within the framework of Alter-
native Semantics. In Krifka (to appear) I argue that Alternative Semantics also is
problematic as a framework for focus semantics in general; in particular, it makes
wrong predictions in case focus occurs in syntactic islands.
In this paper I will take up an issue of Krifka (2001) again, concentrating spe-
cifically on focus patterns in answers to constituent questions. Büring (2002) argued
that the discussion of phenomena in Krifka (2001) was inconclusive, and that Alter-
native Semantics actually does not have problems with the data put forward there. I
agree with the first point, but I will also show that on closer inspection, Alternative
Semantics does not predict the correct patterns of answer focus. I will also show that
the same holds for the theory of Schwarzschild (1999) which works with Givenness
instead of a semantic notion of Focus. The Structured Meaning theory, on the other
hand, does not have these problems.

2. ALTERNATIVE SEMANTICS FOR QUESTIONS AND ANSWERS

I will start with summarizing the essentials of the Alternative Semantics approach to
the meaning of questions and the corresponding focus of answers. The crucial idea
is that the meaning of a question is the set of propositions that answer the question.
It goes back to Hamblin (1958, 1973); Karttunen (1977) proposed a variant of it,
and Groenendijk & Stokhof (1984) developed a version that is quite different with
respect to what questions mean and how questions meanings are derived composi-
tionally. The original version of Hamblin, which is also the one assumed by Rooth
(1992), can be illustrated with the following examples:

C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 139–150.
© 2007 Springer.

(1) [[Which student read Ulysses?]]

= { p | ∃x[STUDENT(x) ∧ p = λi[READ(i)(ULYSSES)(x)]] }
equivalently, { λi[READ(i)(ULYSSES)(x)] | STUDENT(x) }

(2) [[Which novel did John read?]]

= { λi[READ(i)(y)(JOHN)] | NOVEL(y) }

(3) [[Which student read which novel?]]

= { λi[READ(i)(y)(x)] | STUDENT(x), NOVEL(y) }
I represent propositions as functions from indices (possible worlds or times) i to
truth values. Predicates are dependent on indices; I make the simplifying assumption
that arguments are independent of indices. I also assume for simplification that noun
meanings are independent of indices. The meaning of the question Which student
read Ulysses? then is the set of propositions that can be described as ‘x read
Ulysses’, where x ranges over the set of students, cf. (1).
This representation of question meanings predicts that certain assertions are pos-
sible answers, whereas others are not. This is the criterion for congruent question-
answer pairs (to be extended later):

(4) A question-answer pair Q – A is congruent iff [[A]] ∈ [[Q]].

As an example, consider the following assertions as answers to (1). (5.a) is a
felicitous answer; its meaning is an element of the meaning of (1). (5.b,c) are
infelicitous answers, as their meanings are not elements of the meaning of (1).

(5) a. [[John read Ulysses.]]

= λi[READ(i)(ULYSSES)(JOHN)], where STUDENT(JOHN).
b. [[John read Moby-Dick.]]
= λi[READ(i)(MOBY-DICK)(JOHN)], where STUDENT(JOHN).
c. [[Jill read Ulysses.]]
= λi[READ(i)(ULYSSES)(JILL)], where ¬STUDENT(JILL).
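
Criterion (4) and the judgments in (5) can be illustrated with a small executable sketch. It is a toy model and not part of the original analysis: propositions are represented symbolically as (predicate, object, subject) tuples rather than as functions from indices to truth values, and the set of students and the function names are assumptions made for the example.

```python
# Toy model: a proposition is a (predicate, object, subject) tuple.
STUDENTS = {"john", "mary"}  # assumed extension of STUDENT; Jill is not in it

def which_student_read(novel):
    """Hamblin-style question meaning: one proposition per student."""
    return {("read", novel, x) for x in STUDENTS}

def congruent(answer, question):
    """Criterion (4): the answer meaning is an element of the question meaning."""
    return answer in question

question_1 = which_student_read("ulysses")
print(congruent(("read", "ulysses", "john"), question_1))    # (5.a) True
print(congruent(("read", "moby-dick", "john"), question_1))  # (5.b) False
print(congruent(("read", "ulysses", "jill"), question_1))    # (5.c) False
```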

The criterion that the meaning of the answer must be an element of the meaning
of the question is too crude to exclude answers that may express the right proposition
but whose prosody does not fit the question. The generalization is that the
position of the main accent must correspond to the wh-element of the question (see
Paul 1891 [1880]). With Jackendoff (1972) and many others, I assume that the main
accent is determined by a focus feature F in syntax. In modern terminology, we can
rephrase Paul’s observation as: The F-feature of the answer must correspond to the
wh-constituent of the question. This is illustrated by the following question-answer
pairs.

(6) a. Which student read Ulysses?

    b. JóhnF read Ulysses. / *John read UlýssesF.

(7) a. Which novel did John read?

    b. John read UlýssesF. / *JóhnF read Ulysses.

(8) a. Which student read which novel?

    b. JóhnF read UlýssesF (and MáryF read Moby-DíckF).
In the theory of Alternative Semantics, such correspondences have been captured
as follows (cf. von Stechow (1990), Rooth (1992)). Focus introduces alternatives to
regular question meanings; if [[α]] is the regular meaning of an expression α, then
[[α]]A is the set of its alternative meanings. The two propositions in (6.b) and (7.b) do
not differ in their regular meanings, but in their alternatives. In the following example,
De is the domain of entities or individuals.

(9) a. [[John read UlýssesF]] = [[JóhnF read Ulysses.]]

= λi[READ(i)(ULYSSES)(JOHN)]
b. [[John read UlýssesF.]]A
= { λi[READ(i)(x)(JOHN)] | x ∈ De }
c. [[JóhnF read Ulysses.]]A
= { λi[READ(i)(ULYSSES)(x)] | x ∈ De }

Felicitous question-answer pairs must satisfy the extended congruence criterion,
which says that the meaning of the question must be a subset of the alternatives of
the answer:

(10) A question-answer pair Q – A is congruent iff

i. [[A]] ∈ [[Q]]
ii. [[Q]] ⊆ [[A]]A.

The second clause of this congruence criterion, (10.ii), excludes answers with
focus in the wrong place, like the infelicitous answers of (6) and (7). To see this,
consider example (6):

(11) Which student read Ulysses? – JóhnF read Ulysses.

Well-formed, as
{ λi[READ(i)(ULYSSES)(x)] | STUDENT(x)}
⊆ { λi[READ(i)(ULYSSES)(x)] | x ∈ De}

(12) Which student read Ulysses? – *John read UlýssesF.

Not well-formed, as
{ λi[READ(i)(ULYSSES)(x)] | STUDENT(x)}
⊄ { λi[READ(i)(y)(JOHN)] | y ∈ De}

The question meaning and the alternative meaning of the answer in (12) share one
proposition, namely λi[READ(i)(ULYSSES)(JOHN)], but the question meaning is not
a subset of the alternative meaning of the answer.
The congruence criterion also predicts that answers must be focus marked, as
otherwise the alternative meaning is reduced to a singleton set, and the subset
requirement cannot be satisfied:

(13) Which student read Ulysses? – *John read Ulysses.

Not well-formed, as
{ λi[READ(i)(ULYSSES)(x)] | STUDENT(x)}
⊄ { λi[READ(i)(ULYSSES)(JOHN)] }

In general, the congruence criterion (10) ensures that there is enough focus
marking. For example, it rules out question-answer pairs like (14) but allows for
question-answer pairs like (15):

(14) Which student read which novel? – *JóhnF read Ulysses.

Not well-formed, as
{ λi[READ(i)(y)(x)] | STUDENT(x), NOVEL(y)}
⊄ { λi[READ(i)(ULYSSES)(x)] | x ∈ De}

(15) Which student read which novel? – JóhnF read UlýssesF.

Well-formed, as
{ λi[READ(i)(y)(x)] | STUDENT(x), NOVEL(y)}
⊆ { λi[READ(i)(y)(x)] | x, y ∈ De}
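
The checks in (11) to (15) can be replayed mechanically in a toy model. The sketch below is an illustration under stated assumptions, not the author's formalization: propositions are symbolic (predicate, object, subject) tuples, the domain of entities is a small finite set chosen for the example, and focus is recorded as a set containing "subj" and/or "obj".

```python
ENTITIES = {"john", "mary", "jill", "ulysses", "moby-dick"}  # assumed toy domain of entities
STUDENTS = {"john", "mary"}
NOVELS = {"ulysses", "moby-dick"}

def alternatives(pred, obj, subj, focus):
    """Alternative set: each focused argument ranges over the whole domain."""
    objs = ENTITIES if "obj" in focus else {obj}
    subjs = ENTITIES if "subj" in focus else {subj}
    return {(pred, o, s) for o in objs for s in subjs}

def congruent(answer, focus, question):
    """Criterion (10): (i) membership and (ii) question ⊆ alternatives."""
    return answer in question and question <= alternatives(*answer, focus)

which_student = {("read", "ulysses", x) for x in STUDENTS}
which_both = {("read", y, x) for x in STUDENTS for y in NOVELS}
answer = ("read", "ulysses", "john")

print(congruent(answer, {"subj"}, which_student))      # (11) True
print(congruent(answer, {"obj"}, which_student))       # (12) False
print(congruent(answer, set(), which_student))         # (13) False
print(congruent(answer, {"subj"}, which_both))         # (14) False
print(congruent(answer, {"subj", "obj"}, which_both))  # (15) True
```

Letting a focused argument range over all entities, including novels in subject position, mirrors the unrestricted domain used in the alternative sets of the examples.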

But it is evident that the congruence criterion, as it stands, does not rule out too much
focus marking. For example, it allows for infelicitous question-answer relations as
in (16):

(16) Which student read Ulysses? – *JóhnF read UlýssesF.

But we have:
{ λi[READ(i)(ULYSSES)(x)] | STUDENT(x)}
⊆ { λi[READ(i)(y)(x)] | x, y ∈ De }

This was the major point of criticism in Krifka (2001). In that paper, I also consi-
dered other possible congruence criteria within Alternative Semantics that assume
additional restrictions of the alternatives introduced by focus and by wh-elements,
but I concluded that they could not systematically exclude overfocused or under-
focused answers.

3. A PREFERENCE FOR MINIMAL FOCUS?

Büring (2002) proposes that we can exclude overfocused answers by requiring in
addition that focus be minimized. This option I dismissed in Krifka (2001) without
appropriate discussion, and I will turn to it here. The proposal is incorporated in the
following revised extended congruence criterion:

(17) A question-answer pair Q – A is congruent iff

i. [[A]] ∈ [[Q]]
ii. [[Q]] ⊆ [[A]]A
iii. There is no A′ that is like A except A′ has less focus marking than
A and still satisfies (i) and (ii).

With this congruence criterion, the problematic example (16) can be ruled out. To
see this, consider the following three potential answers to the question Which
student read Ulysses? and their alternative sets.

(18) Which student read Ulysses?

{ λi[READ(i)(ULYSSES)(x)] | STUDENT(x) }
a. *JóhnF read UlýssesF.
{ λi[READ(i)(y)(x)] | x, y ∈ De }
b. JóhnF read Ulysses.
{ λi[READ(i)(ULYSSES)(x)] | x ∈ De }
c. *John read Ulysses.
{ λi[READ(i)(ULYSSES)(JOHN)] }

All answers satisfy clause (i) of the congruence criterion. Answers (a) and (b) also
satisfy clause (ii), as [[(18)]] ⊆ [[(18.a)]]A and [[(18)]] ⊆ [[(18.b)]]A, but answer (c) is
ruled out by it, as [[(18)]] ⊄ [[(18.c)]]A. Clause (iii) rules out answer (a), as it has
more focus marking than (b): Where (a) has two F markings, (b) only has one.
The underlying idea is that focus marking has to be used sparingly, to achieve
the required purpose of ensuring that the question meaning is a subset of the alter-
native meanings of the answer. This could plausibly be modelled within optimality
theory by two constraints: A higher ranked one that requires the focus marking to
capture the meaning of the question (that is, [[Q]] ⊆ [[A]]A), and a lower ranked one
that prefers minimal focus marking.
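
A minimal sketch of how clause (iii) separates the three answers in (18) is given below. It rests on assumptions that go beyond the text: propositions are symbolic tuples over a small toy domain, and "less focus marking" is read simply as "a proper subset of the F-marks", which covers only the clear case of comparing (18.a) with (18.b).

```python
from itertools import combinations

ENTITIES = {"john", "mary", "ulysses", "moby-dick"}  # assumed toy domain of entities
STUDENTS = {"john", "mary"}

def alternatives(pred, obj, subj, focus):
    """Alternative set: each focused argument ranges over the whole domain."""
    objs = ENTITIES if "obj" in focus else {obj}
    subjs = ENTITIES if "subj" in focus else {subj}
    return {(pred, o, s) for o in objs for s in subjs}

def satisfies_i_ii(answer, focus, question):
    """Clauses (i) and (ii) of the congruence criterion."""
    return answer in question and question <= alternatives(*answer, focus)

def congruent_17(answer, focus, question):
    """Clauses (i)-(iii); 'less focus' is read as a proper subset of F-marks."""
    if not satisfies_i_ii(answer, focus, question):
        return False
    marks = sorted(focus)
    smaller = (set(c) for n in range(len(marks)) for c in combinations(marks, n))
    return not any(satisfies_i_ii(answer, f, question) for f in smaller)

question = {("read", "ulysses", x) for x in STUDENTS}
answer = ("read", "ulysses", "john")

print(congruent_17(answer, {"subj", "obj"}, question))  # (18.a) False, overfocused
print(congruent_17(answer, {"subj"}, question))         # (18.b) True
print(congruent_17(answer, set(), question))            # (18.c) False, fails clause (ii)
```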

Extending the congruence criterion by clause (iii) is a promising move, but no-
tice that (iii) contains a notion that is undefined so far, namely “less focus marking”.
It is clear what less focus marking means when comparing sentences like (18.a) and
(b): In (a), there is an additional focus feature that (b) lacks, and in this sense (b)
shows less focus marking. But there are cases in which it is not so clear what less
focus marking should mean. In particular, we should consider cases of broad and
narrow focus, and compare them with cases of more or less focus.
Let us start with the following case, in which the question asks for an activity,
indicated by the verb do. I again specify the meaning of the question and the alter-
native meanings of potential answers.

(19) What did John do?

{ λi[P(i)(JOHN)] | P ∈ Dset, P: activity }
a. John [read Ulýsses]F.
{ λi[P(i)(JOHN)] | P ∈ Dset}
b. *John [read UlýssesF].
{ λi[READ(i)(y)(JOHN)] | y ∈ De}
c. *John réadF Ulysses.
{ λi[R(i)(ULYSSES)(JOHN)] | R ∈ Dseet }
d. *[John read Ulýsses]F.
{ λi[p(i)] | p ∈ Dst }

The VP question (19) asks for any property of John that is an activity. Here, Dset
is the domain of meanings that are functions from indices to functions from entities
to truth values, that is, the domain of properties, type set (or, in another notation,
⟨s, ⟨e, t⟩⟩), and Dseet is the domain of relations-in-intension, type seet. If the answer is
formed with a transitive verb, as in (19.a), the accent on the object NP marks focus
on the whole VP, a case of so-called focus projection or accent percolation. The
answer (b) with object NP focus, which happens to be homophonous with (a), is
infelicitous. The same holds for answers like (c), with focus on the transitive verb.
Also, answer (d) is infelicitous; it would be felicitous in the context of a question
like What happened? Again, the marking is similar to (a) by focus projection, with
the main accent on the direct object.
Obviously, all answers satisfy clause (i) of the congruence criterion (17). Answers
(b) and (c) are ruled out by clause (ii), as we have [[(19)]] ⊄ [[(19.b)]]A,
[[(19.c)]]A. The question asks for activities of John in general; the alternatives of the
answer are restricted to reading activities by John and to relations of John to Ulysses,
respectively. Answers (a) and (d) satisfy clause (ii), as we have [[(19)]] ⊆
[[(19.a)]]A, [[(19.d)]]A. Answer (d) should then be excluded by clause (iii) if we interpret
“less” focus marking as “narrower” focus marking whenever two expressions
are compared that differ only insofar as one has a broader focus marking than
the other.

Consider now the following multiple constituent question and two potential
answers.

(20) What did John do with which novel?

{ λi[R(i)(y)(JOHN)] | R ∈ Dseet, R: activity, NOVEL(y) }
a. John réadF UlýssesF (... and críticizedF [Finnegan’s Wáke]F)
{ λi[R(i)(y)(JOHN)] | R ∈ Dseet, y ∈ De}
b. *John [read Ulýsses]F
{ λi[P(i)(JOHN)] | P ∈ Dset}

Multiple wh-questions are often supposed to be answered by a list answer, a fact that I
will disregard here. In the appropriate answer, each wh-element of the question corresponds
to a focus of the answer, cf. (20.a). This satisfies clause (ii); we have [[(20)]]
⊆ [[(20.a)]]A. Answer (20.b) is not felicitous, even though we have [[(20)]] ⊆
[[(20.b)]]A. Can (20.b) be ruled out by clause (iii) of the congruence criterion? We
have to decide what counts as less focusation: While (20.a) has more focus features,
(20.b) has a broader focus. If we want to keep up our general hypothesis, then we
must assume that broad focus is worse than having more foci:

(21) When two answers A and A′ compete, where both expressions are equal
except that A has more but smaller foci, and A′ has fewer but broader
foci, A is to be preferred over A′.

Consider now again question (19), repeated here, and two potential answers:

(22) What did John do?

{ λi[P(i)(JOHN)] | P ∈ Dset, P: activity }
a. John [read Ulýsses]F.
{ λi[P(i)(JOHN)] | P ∈ Dset}
b. *John réadF UlýssesF.
{ λi[R(i)(y)(JOHN)] | R ∈ Dseet, y ∈ De}

Notice that (22.a) is a good answer but (22.b) is infelicitous. Both answers satisfy
clause (ii) of the congruence criterion. In particular, answer (22.b) does, as we have
[[(22)]] ⊆ [[(22.b)]]A. To see this, we have to prove that each element of [[(22)]] is also
an element of [[(22.b)]]A. Take p to be an arbitrary element of [[(22)]]. This means
that p can be expressed as λi[P1(i)(JOHN)], where P1 is some constant of type set.
Now we can take an arbitrary constant y2 of type e and define a constant R2 of
type seet as follows: R2 := λiλyλx[P1(i)(x)]. Then we can express p as
λi[R2(i)(y2)(JOHN)], and hence we have p ∈ [[(22.b)]]A. As the choice of p was arbitrary,
we have [[(22)]] ⊆ [[(22.b)]]A, q. e. d.1
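
The subset claim just proved can also be checked by brute force in a small finite model. The following sketch is only an illustration under assumptions of my own: two indices, two entities, propositions identified with the set of indices at which they are true, and the restriction of the question to activities dropped (which makes the question set larger, so the subset check is at least as strong). The helper lift builds, for a given property, a relation that ignores its object argument, in the spirit of the construction used in the proof.

```python
from itertools import product

WORLDS = (0, 1)                 # toy set of indices
ENTITIES = ("john", "ulysses")  # toy domain of entities

def all_functions(domain):
    """Yield every function from `domain` to truth values, as a dict."""
    domain = list(domain)
    for values in product((False, True), repeat=len(domain)):
        yield dict(zip(domain, values))

def proposition(truth_at):
    """Represent a proposition by the set of indices at which it is true."""
    return frozenset(i for i in WORLDS if truth_at(i))

D_SET = list(all_functions(product(WORLDS, ENTITIES)))             # properties P(i)(x)
D_SEET = list(all_functions(product(WORLDS, ENTITIES, ENTITIES)))  # relations R(i)(y)(x)

# Question meaning of (22), without the restriction to activities
question = {proposition(lambda i, P=P: P[(i, "john")]) for P in D_SET}

# Alternatives of (22.b): both the relation and the object vary freely
alternatives_22b = {
    proposition(lambda i, R=R, y=y: R[(i, y, "john")])
    for R in D_SEET
    for y in ENTITIES
}

def lift(P):
    """Build a relation from P that ignores its object argument."""
    return {(i, y, x): P[(i, x)] for i, y, x in product(WORLDS, ENTITIES, ENTITIES)}

print(question <= alternatives_22b)  # True
print(all(
    proposition(lambda i, P=P: lift(P)[(i, "ulysses", "john")])
    == proposition(lambda i, P=P: P[(i, "john")])
    for P in D_SET
))  # True
```

In this model the alternatives of (22.b) comprise every proposition over the two indices, which is why the subset relation holds no matter which properties count as activities.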

The proof goes through if the choice of R2 is totally unrestricted, that is, if R2 is an
arbitrary element of Dseet. This might be criticized; we might only allow “natural”
relations. But, first, it is difficult to determine what “natural” relations are. And
second, restricting the domain of focus alternatives easily leads to situations in
which it is no longer guaranteed that the question meaning is a subset of the
alternatives of the answer; it might be just the other way round. See Krifka (2001)
for a discussion of alternative congruence criteria and their problems with excluding
over- and underfocused answers.
Can clause (iii) of the congruence criterion (17) decide between the two answers? Yes,
it can, but if we follow the preference rule (21) then it selects, incorrectly, (22.b)
over (22.a). And if we change the preference rule so that more foci are dispreferred
over broader foci, then clause (iii) would select, incorrectly, (20.b) over (20.a). This
means that the preference rule for less focusation cannot be spelled out in a general
way so that it always identifies the felicitous answer.

4. GIVENNESS AS AN ALTERNATIVE?
Büring (2002) also suggested switching to the theory of Schwarzschild (1999) as a
generally more adequate theory of the distribution of sentence accents. In particular,
Schwarzschild assumes a rule of focus avoidance that is, in essence, the same as the
preference rule for minimal focusation expressed by (17.iii).
Schwarzschild (1999) follows Selkirk (1984) in assuming that focus on the larger
constituent is licensed by focus projection. The general rule is that focus on an
argument licenses focus on the head, and focus on the head licenses focus on the
whole constituent. This is how VP focus is generated, step by step:

(23) a. John [read UlýssesF]. (focus licensed by accent)
b. John [readF UlýssesF]. (focus of head licensed by focus on arg.)
c. John [readF UlýssesF]F. (focus on VP licensed by focus on head)

According to this theory, VP focus in John read Ulýsses contains three focus
features. In contrast, multiple focus on the transitive verb and the object NP only
contains two focus features:

(24) John réadF UlýssesF.

Hence this theory makes a clear prediction for cases in which VP focus and V focus
+ NP focus are to be compared. VP focus as in (23.c) contains more focus marking
than multiple focus on the verb and on the object NP as in (24). Consequently, everything
else being equal, (24) should be preferred over (23.c), and in general having
more foci should be preferred over having broader foci. This gives us the correct
prediction for (20) but the false one for (22).
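
The comparison of (23.c) and (24) amounts to counting F-features in two bracketings. The following Python sketch is a minimal illustration, assuming a simple labelled-tree encoding; the class `Node` and the helpers `count_f` and `preferred_by_avoid_f` are hypothetical names introduced here. It assumes that both candidates have already passed the congruence (or Givenness) check, so that only the amount of F-marking decides.

```python
from dataclasses import dataclass

@dataclass
class Node:
    """A constituent with an optional F-feature (hypothetical encoding)."""
    label: str
    f_marked: bool = False
    children: tuple = ()

def count_f(node):
    """Number of F-features in the tree rooted at node."""
    return int(node.f_marked) + sum(count_f(child) for child in node.children)

def preferred_by_avoid_f(candidates):
    """Pick the candidate with the fewest F-features, everything else being equal."""
    return min(candidates, key=count_f)

# (23.c): John [readF UlyssesF]F, VP focus derived by focus projection
vp_focus = Node("S", children=(
    Node("John"),
    Node("VP", True, (Node("read", True), Node("Ulysses", True))),
))

# (24): John readF UlyssesF, separate foci on verb and object
multi_focus = Node("S", children=(
    Node("John"),
    Node("VP", False, (Node("read", True), Node("Ulysses", True))),
))

if __name__ == "__main__":
    print(count_f(vp_focus), count_f(multi_focus))
    print(preferred_by_avoid_f([vp_focus, multi_focus]) is multi_focus)
```

Under these assumptions the minimization step always selects the multiple-focus structure, which is the prediction the text identifies as correct for (20) and false for (22).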
Schwarzschild’s theory adds to Selkirk’s rule of recursive F-marking the follow-
ing assumptions:

(25) If a constituent α is not F-marked, then α is Given.

(26) Avoid F-marking.

The notion of Givenness is defined as follows:

(27) An utterance α is Given iff it has a salient antecedent β, and

i. If α denotes an entity, α and β corefer,
ii. or, modulo existential type shifting, β entails the existential
F-closure of α.

To see how this is supposed to work, consider the following example:

(28) Q: Who did John’s mother praise?

A: She praised HIMF.

F-marking on him is allowed, even though the pronoun has a salient antecedent,
John. Why is this so? Existential type shifting of the question Q gives us the propo-
sition
∃x[PERSON(x) ∧ PRAISE(x)(MOTHER(JOHN))],
for which I will write ∃Q, for short. The existential F-closure of the answer A is
what we get when we replace the focus, if there is any, by a variable which is bound
by an existential quantifier with wide scope. In the case at hand, this is
∃x[PRAISE(x)(MOTHER(JOHN))]. Note that this is entailed by ∃Q. This means that the
sentence She praised HIMF is Given. Similarly, the VP praised HIMF is Given, as its
existential F-closure, ∃y∃x[PRAISE(x)(y)], is also entailed by ∃Q. The object noun
phrase HIMF is also Given, as it has an antecedent, John’s. Now, (25) allows for F-
marked constituents that are given, and so it allows that HIMF is F-marked. But (26)
says that F-marking should be avoided if possible. Can F-marking on HIMF be
avoided? No, because then the existential F-closure of the sentence without focus
marking, She praised him, is PRAISE(JOHN)(MOTHER(JOHN)) (notice that there is no
existential closure because there is no F-marking), and this is not entailed by
∃Q. But the projection of F-marking as in She [praisedF HIMF]F can be avoided,
as it is not necessary to ensure that the resulting existential F-closure
∃P[P(MOTHER(JOHN))] is entailed by ∃Q. This is already achieved by less focus
marking, on HIMF. For similar reasons, additional focus marking as in SHEF praised
HIMF is not necessary, and hence avoided.
Schwarzschild’s account generally prefers narrow foci over broad foci, and few
foci over many. But as we have already seen, this makes wrong predictions. Con-
sider the following case again:

(29) What did John do?

a. He [readF UlýssesF]F.
b. *He [réadF UlýssesF].

The existential closure of the question is ∃P[P(JOHN)].² This entails all the possible
focus closures of (29.a), which are ∃P[P(JOHN)], ∃R[R(ULYSSES)(JOHN)],
∃x[READ(x)(JOHN)] and ∃R∃x[R(x)(JOHN)]. But it also entails the focus closure of
(29.b), which is ∃R∃x[R(x)(JOHN)]. As (29.b) has less focus marking according to
Selkirk, it should be preferred, but contrary to the theory, it is not.

5. THE STRUCTURED MEANING ACCOUNT

In concluding, let me point out that the Structured Meaning account of questions
and answers has no problems with over- or underfocused answers. The central idea
is that questions have functional interpretations:
(30) a. [[Which student read Ulysses?]]
= λx∈STUDENT λi[READ(i)(ULYSSES)(x)]
b. [[Which novel did John read?]]
= λy∈NOVEL λi[READ(i)(y)(JOHN)]
c. [[Which student read which novel?]]
= λ⟨x,y⟩ ∈ STUDENT × NOVEL λi[READ(i)(y)(x)]

Focus in answers leads to a background-focus structure that can be presented as
a pair:

(31) a. [[JóhnF read Ulysses.]] = ⟨λxλi[READ(i)(ULYSSES)(x)], JOHN⟩
b. [[John read UlýssesF.]] = ⟨λyλi[READ(i)(y)(JOHN)], ULYSSES⟩
c. [[JóhnF read UlýssesF]] = ⟨λ⟨x,y⟩ λi[READ(i)(y)(x)], ⟨JOHN, ULYSSES⟩⟩

The obvious congruence criterion in this representation is that the question
meaning should correspond to the background of the answer, in the sense that the
question meaning differs from the background of the answer only insofar as it might
have more restricted domains. I will write f ⊆ g if the function f is like g except that
the domain(s) of the argument(s) of g may be larger. In addition, the focus must be
an element of the domain of the question.

(32) A question-answer pair Q – A with meanings [[Q]] and [[A]] = ⟨B, F⟩ is
congruent iff:
i. [[Q]] ⊆ B
ii. F ∈ DOM([[Q]])

Clearly, the question-answer pairs (30.a) – (31.a), (30.b) – (31.b) and (30.c) –
(31.c) are congruent. We also find that the problematic cases considered above are
treated in the expected way. First, consider the following two questions:

(33) [[What did John do?]]

= λP ∈ [Dset ∩ Activities] λi[P(i)(JOHN)]

(34) [[What did John do with which novel?]]

= λR ∈ [Dseet ∩ Activities] λy ∈ NOVEL λi[R(i)(y)(JOHN)]

Now, consider the following two answers. For VP focus in (35) I do not assume
focus projection in the style of Selkirk; rather, I assume that focus is assigned
directly to the VP and expressed by accent on the object NP.

(35) [[John [read Ulýsses]F]]

= ⟨λPλi[P(i)(JOHN)], λi[READ(i)(ULYSSES)]⟩

(36) [[John réadF UlýssesF]]

= ⟨λRλy λi[R(i)(y)(JOHN)], READ, ULYSSES⟩

Clearly, only the combinations (33) – (35) and (34) – (36) satisfy the congruence
criterion (32); no other combinations do. No rule of minimization of focus is called
for; wrong focusation leads to a direct violation of clause (i) of the congruence crite-
rion.
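
The congruence criterion (32) can be tried out on (33) to (36) in a toy model. The Python sketch below is only an illustration under stated assumptions: functions are encoded as dictionaries from argument tuples to symbolic propositions, the small inventories of entities, relations and activities are invented for the example, and the names `congruent`, `q33`, `q34`, `a35` and `a36` are hypothetical. Clause (i) is checked as "the question function agrees with the background on the question's smaller domain", clause (ii) as membership of the focus in that domain.

```python
# Toy inventories (assumptions made for illustration only).
ENTITIES = ("john", "mary", "ulysses")
NOVELS = ("ulysses",)
RELATIONS = ("read", "own")          # stand-in for Dseet
ACTIVITY_RELATIONS = ("read",)       # stand-in for Dseet ∩ Activities
INTRANSITIVES = ("dance", "exist")
ACTIVITY_INTRANSITIVES = ("dance",)

# Stand-in for Dset: intransitive predicates plus saturated relations (R, y).
PROPERTIES = INTRANSITIVES + tuple((r, y) for r in RELATIONS for y in ENTITIES)
ACTIVITY_PROPERTIES = ACTIVITY_INTRANSITIVES + tuple(
    (r, y) for r in ACTIVITY_RELATIONS for y in ENTITIES)

def john(p):
    """Symbolic proposition 'property p holds of John'."""
    return ("john", p)

# Functions are dicts keyed by argument tuples: 1-tuples for λP, 2-tuples for λRλy.
q33 = {(p,): john(p) for p in ACTIVITY_PROPERTIES}                  # (33)
q34 = {(r, y): john((r, y))
       for r in ACTIVITY_RELATIONS for y in NOVELS}                 # (34)
a35 = ({(p,): john(p) for p in PROPERTIES},
       (("read", "ulysses"),))                                      # (35): background, focus
a36 = ({(r, y): john((r, y)) for r in RELATIONS for y in ENTITIES},
       ("read", "ulysses"))                                         # (36): background, foci

def congruent(question, answer):
    """Criterion (32): (i) [[Q]] ⊆ B, (ii) F ∈ DOM([[Q]])."""
    background, focus = answer
    clause_i = all(args in background and background[args] == value
                   for args, value in question.items())
    clause_ii = focus in question
    return clause_i and clause_ii

if __name__ == "__main__":
    for q_name, q in (("(33)", q33), ("(34)", q34)):
        for a_name, a in (("(35)", a35), ("(36)", a36)):
            print(q_name, a_name, congruent(q, a))
```

Because the one-place and two-place functions differ in the arity of their argument tuples, the mismatched pairings already fail clause (i), and no separate focus-minimization step is needed in this sketch.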
In conclusion, it appears that the careful consideration of focus in answers to
constituent questions argues against the alternative semantics account, and for the
structured meaning account, of questions and answers.

Zentrum für Allgemeine Sprachwissenschaft, Typologie, und Universalienforschung

Berlin and Humboldt-Universität zu Berlin

6. NOTES

* Thanks to Regine Eckardt, Andreas Haida and Kerstin Schwabe for discussion of the points of this
paper, and to Daniel Büring for pointing out problems in the argumentation in Krifka (2001).
1 As a matter of fact, we can also prove that [[(22)]] ⊇ [[(22.b)]]A, that is, the two sets are equal.
2 Or rather, ∃P[P(JOHN) ∧ P: activity], as the question asks for an activity. Then it is actually unclear
whether the existential closure of the question entails the existential F-closure of the answer because this
does not have to be restricted to activities.

7. REFERENCES
Büring, Daniel. Question-Answer Congruence - Unstructured Comments on Krifka (2001). Berlin: ZAS,
2002.
Groenendijk, Jeroen and Martin Stokhof. Studies on the semantics of questions and the pragmatics of
answers, Department of Philosophy, University of Amsterdam: Doctoral Dissertation, 1984.
Hamblin, C. L. “Questions.” The Australasian Journal of Philosophy 36 (1958): 159-168.
Hamblin, C. L. “Questions in Montague English.” Foundations of Language 10 (1973): 41-53.
Jackendoff, Ray. Semantic Interpretation in Generative Grammar. Cambridge, Mass.: MIT Press, 1972.
Karttunen, Lauri. “Syntax and Semantics of Questions.” Linguistics and Philosophy 1 (1977): 3-44.
Krifka, Manfred. “For a Structured Account of Questions and Answers.” In Audiatur Vox Sapientiae. A
Festschrift for Achim von Stechow, eds. Caroline Féry and Wolfgang Sternefeld, 287-319. Berlin:
Akademie-Verlag, 2001.
Krifka, Manfred. “Association with Focus Phrases.” In Valerie Molnar and Susanne Winkler, eds., The
architecture of focus. Berlin: Mouton de Gruyter, to appear.
Paul, Hermann. Principles of the History of Language [Prinzipien der Sprachgeschichte]. Translated
from the second edition of the original by H. A. Strong. London: Longmans, Green, and Co., 1891
Leipzig, [1880].
Rooth, Mats. “A Theory of Focus Interpretation.” Natural Language Semantics 1 (1992): 75-116.
Schwarzschild, Roger. “GIVENness, AvoidF and other Constraints on the Placement of Accent.” Natural
Language Semantics 7 (1999): 141-177.
Selkirk, Elisabeth O. Phonology and Syntax: The Relation between Sound and Structure: Current studies
in Linguistics. Cambridge, Mass.: MIT Press, 1984.
von Stechow, Arnim. “Focusing and Backgrounding Operators.” In Werner Abraham, ed., Discourse
Particles, 37-84. Amsterdam: John Benjamins, 1990.
CHUNGMIN LEE

CONTRASTIVE (PREDICATE) TOPIC, INTONATION,
AND SCALAR MEANINGS

1. INTRODUCTION
In this chapter I will consider Contrastive Topic (CT), Contrastive Predicate Topic
(CPT) and Focus in information structure and their relations to intonation and
meaning, as I have attempted to account for them in a series of papers on related topics.¹
In particular, I will examine the conventional scalar implicature meanings triggered
by CPT and CT in connection with their intonation. In dealing with these phenomena,
I will draw extensively on data from Korean, where CT is surprisingly clearly marked
morphologically and intonationally, in comparison with data from English.
Information structure, claimed to constitute a separate component from
phonological, syntactic and semantic components (Vallduví 1992), consists basically
of Topic – Comment or Background – Focus information. Whether or not it
constitutes a separate component in grammar, no one can deny that it is closely
interwoven with morphological structure (particularly in Korean and Japanese),
syntactic linear and hierarchical structure, semantic structure, and prosodic
phonological structure. That is why we came to organize the present workshop and
create a volume on Topic and Focus in connection with their meaning and
intonation. Recently the phenomenon of CT in particular has been well characterised.
Through common efforts of this kind we believe we can deepen our understanding
of the underlying principles governing related issues cross-linguistically.
The organization of the chapter is as follows. In Section 2, Contrastive Topic is
distinguished from non-contrastive Topic and from list contrastive topics, which do
not leave an implicature; CT is examined in a dialogue model and the notion of sum is
considered; Korean CT is shown on pitch tracks. In Section 3, scalar meanings are analyzed;
type-subtype scalarity and subtype scalarity are distinguished, and CT’s inherent
tendency toward subtype scalarity even in entities is advocated. In Section 4, scope relations
between scope bearers and CT and CT’s narrow-scope nature are discussed, together
with the non-narrow-scope topicalization effect. In Section 5, Contrastive Predicate Topic and
the scope relation between CT and REASON clauses are explored. Section 6 concludes the
chapter.

151
C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 151–175.
© 2007 Springer.

2. ASPECTS OF CONTRASTIVE TOPIC

2.1. Topic
We can view an utterance from a Topic perspective and get a Topic – Comment
structure, as follows (Topic here being a non-contrastive Topic):

(1) [Water]Topic [consists of oxygen and hydrogen]Comment.

(2) [kumsok hwalca]Top -nun [hankwukin-i palmyenghay-ss-ta]Comment
metal type -TOP Koreans-NOM invent-PAST-DEC
‘As for the metallic type, Koreans invented it.’
(3) Inswu -nun sosel chayk -ul sa-ss-e-yo
-TOP novel book-ACC buy-PAST-DEC(POLITE)
‘Inswu bought a novel.’ (to the question “What did Inswu buy?”)

Typically, a non-contrastive Topic is given, presupposed, or anchored in the speech
situation. It is something that is talked about by the Comment (or often predicate),
lacks contrastiveness, and is located at the initial, prominent position of a
sentence, with -nun (Korean) or -wa (Japanese) marking, though a null Topic or an
unaccented bare nominal Topic without a Topic marker is also possible. The natural kind in
(1) and the artifact kind in (2), which comes from an underlying object, are nominals in common
ground, both quantificational and proper name-like (though not placed in Prince’s
1989 or Gundel et al.’s familiarity or givenness hierarchies); together with the
previously mentioned proper name in (3), they function as Topics, being talked about by
the following Comment. The notion of unmarked, non-contrastive Topic is
psychologically and theoretically real, resting on categorical or double (as
opposed to thetic) judgment (Kuroda 1972, Brentano 1973, Marty 1918, Ladusaw
2000). The structure of Topic – Comment is most natural in information and
discourse structure. Thus, Roberts’ (1997) pessimism about the theoretical status of
Topic in information structure, and Büring’s (2003) exclusion of non-contrastive
Topic as a category in information structure, largely based on English, are not
tenable. Jackendoff (1972) failed to provide any intonational status for a
non-contrastive Topic, although Steedman (2000) assigned L to it. But Topic is a basic
category just like Focus. Null Topics in various languages have no phonetic (or
prosodic) manifestation but are conceptually real for propositional semantic
interpretations. CT is marked in meaning and intonation, constituting a complex
category, and has therefore come to draw wide attention rather recently.
First, the intonation pattern of (3), a Topic sentence, is distinct in pitch and
energy concentration, as in Fig. 1. This is a typical sentential intonation (IntP=IP)
in Korean, with a Topic and a preverbal Focus. The Focus constituent, answering a
previous wh-word, is informative (via intercategorial entailment (Zuber 2002) and
existential closure (Schwarzschild 1999 and Karttunen 1977)). The non-constituent
‘Inswu bought’ is given and relatively low in pitch compared to the Focus
constituent in the middle, and Inswu-nun in the given part is a Topic phrase. The 200 mh
peak comes on a novel at the end of the corresponding SVO English sentence Sam bought a
novel. Observe the intonation pattern of a Topic sentence in Korean in Fig. 1:

Figure 1. Non-contrastive Topic

We will shortly see how the above Topic intonation is sharply distinct from the CT
intonation shown in Figure 2.

2.2. The Nature of Contrastive Topic

Contrastive Topic, on the other hand, is also given, presupposed, or anchored in the
speech situation to a certain degree, like a non-contrastive Topic. It is controversial
whether it is also something that is talked about; Hetland (2003), for instance, does
not agree that those CT instances derived from predicate positions meet the
aboutness condition of Topic and calls them simply “Contrasts,” like some other
linguists. CT necessarily shows contrastiveness and is located typically in the middle
or sometimes at the initial position of a sentence, with morphological markers –nun
(Korean), -wa (Japanese), thi (Vietnamese) or nan (Thai), together with a high tone,
or with a contrastive contour alone such as B accent (L+H*LH%) (English). CT is
distinct from unmarked, non-contrastive Topic, but some linguists (Jackendoff partly,
Büring’s earlier works (though his 2003 adopts the term “Contrastive Topic” in
general for the first time), Steedman (2000), etc.) confusingly label it as Topic
(or variously as S-Topic) or Theme (though Steedman (in this volume) began to
incorporate kontrast). On the other hand, some syntacticians call it contrastive focus
(CF). I will address the distinction between CT and CF briefly later. “CT” is
basically used to mark Contrastive Topic in logical form, but here it will also be used as
an abbreviation of Contrastive Topic for convenience.

People often tend to forget that Jackendoff’s (1972) dialogue examples of A
accents and B accents are situated in a context of a given number of people eating a
given number of different foods. Sums (pluralities and mass-partitions with join
semilattices) are involved, and they or their parts function as potential Topics or CTs
in the relevant question for a CT answer. Therefore, when the speaker asks about
FRED in (4), HE in the second sentence cannot be assigned a pure Focus, as is done by
Kadmon (2001: 392) (with her ‘LarryFF’).

(4) A: Well, what about FRED? What did HE eat?

B: FREDB ate the BEANSA. (Jackendoff 1972)

Here HE must be marked CT (or Topic), not F, however its intonation may be
modified in the English question sentence (the fall-rise accent remains in an echo
question (O’Connor et al. 1973, Hetland 2003); in Hungarian, a CT in a question is
reported in Molnar 1998). It is one of those people in the context and was mentioned
or accommodated in the previous question sentence, thus being in the background as
given. If Focus is assigned, because of the preceding focal wh-word, the sentence
becomes a reclamatory question such as (5):

(5) What did you say HEF ate?

Similarly, MARY in (6), with alternative individuals in the speaker’s mind, i.e.
CT-alternatives, not Focus alternatives, must be marked CT, not F, contra Krifka (2003).

(6) What did John give MARYCT as a birthday present?

A multi-wh question (such as Who ate what? or Who kissed who?), appearing at
the top of discourse tree structures (Carlson 1983, Roberts 1995, Büring 2003),
typically requires a multi-narrow-focus answer such as ‘FREDA ate the BEANSA’ or
‘LarryA kissed NinaA’ (often a reciprocal alternative question), as an exhaustive
answer, a pair-list answer, etc. (cf. Krifka 2002). This will get the following dual
focal value, which Büring himself employed to criticize Roberts’ (1995)
characterization of CT as a set of propositions:

(7) {x ate y | x, y ∈ De}

In other words, the immediate daughters of the top multi-wh question are not warranted
to contain a person or a food. CT utterances cannot felicitously occur at the beginning
of a discourse, and they cannot felicitously be preceded abruptly by a multi-wh question.
There must be an appropriate way of introducing a topical element in the
question (Kadmon 2001 also criticized this point; see Krifka in this volume for a
structural account), and at least a D-linked wh-question may have to be given, such as
Which person ate what for subject CT question-answer daughters (What did Fred (and Sam)
eat? FredCT ate the beans) and Who ate which food for object CT question-answer
daughters, for real congruence in the tree. Otherwise, the derivation is
arbitrary and unpredictable, ignoring which element is previously given. Thus, a CT
is ‘about’ a given part in the previous discourse and locally ‘about’ the rest of the
CT sentence. Hence it is topical. A CT is a selection of one or part of the potential
sum Topic denotations, and it is focal in this local sense within the given potential Topic. In
the multi-foci case in Korean, the Nominative marker –ka and the Accusative
marker –rul, but not the Topic marker –nun, are employed (Lee 1999). The given or
accommodated part, as a potential Topic of the previous discourse context, must be
present to represent an appropriate CT (below in the tree), as with
FRED/HE in (4A). In Korean, a CT occurring in a question sentence has a tone
lower than a CT in a declarative sentence. The most natural and relevant question that
precedes a CT answer should include a potential Topic of a sum of individuals of
<e> type or properties of <e, t> type.
Büring’s claim, on the other hand, that the CT-value is a set of
sets of propositions, as against Roberts’ (1995) ‘a set of propositions’ (Kadmon 2001
also criticizes this), is surely an improvement. The CT-value of (4B), then, should be:

(8) {{x ate y | y ∈ De} | x ∈ De} = {{Fred ate the beans, Fred ate the peanuts,
Fred ate the eggplant}, {Sam ate the beans, Sam ate the peanuts, Sam ate the
eggplant}, {Mary ate the beans, Mary ate the peanuts, Mary ate the
eggplant}} (The variables can be equivalently bound by the λ operator.)
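
The difference between the flat focal value in (7) and the nested CT-value in (8) can be made concrete with a short Python sketch. This is an illustration only: the three people and three foods stand in for the contextually relevant part of De, propositions are represented as plain strings, and the function names `focus_value` and `ct_value` are hypothetical.

```python
PEOPLE = ("Fred", "Sam", "Mary")                       # contextually given eaters
FOODS = ("the beans", "the peanuts", "the eggplant")   # contextually given foods

def focus_value(people, foods):
    """Flat set of propositions, as in (7): {x ate y | x, y in De}."""
    return {f"{x} ate {y}" for x in people for y in foods}

def ct_value(people, foods):
    """Set of sets of propositions, as in (8): one subset per fixed subject x."""
    return {frozenset(f"{x} ate {y}" for y in foods) for x in people}

if __name__ == "__main__":
    for subset in sorted(ct_value(PEOPLE, FOODS), key=sorted):
        print(sorted(subset))
```

Flattening the nested value gives back the flat one, so the CT-value adds only the grouping by subject; that grouping is what the following paragraph interprets as the topic-hood of the fixed element.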

In each subset above, the subject happens to be fixed and functions as Topic for
alternative objects – foods. The choice of one of the alternative foods, i.e. the
beans here, is marked Focus at the outset because it is not relativized any further,
being exhaustive. The choice of one Topic from the alternative Topics – persons, i.e.
Fred here, is focal. The would-be Topic is relativized to become a CT, involving a
focal process. In this sense, CT is both topical and focal, but because of its Topic
base, the head of the term Contrastive Topic is Topic, not Focus, as in Contrastive
Focus. Focus does not have a Topic base. Furthermore, Contrastive Topic is more
marked than Topic in its term and content. Kadmon (2001) rightly criticized this
CT-value approach for relying too much on the Focus-value approach. The invariance
of an element in one subset, however, suggests its topic-hood. If it did not have a superset,
it would be a non-contrastive Topic. There would not be a choice involved.

2.2. List contrastive topics

A serious problem with the above and with the corresponding D-tree approach of Büring
(2003) is that it is partly good only for the phenomenon of “list contrastive topics”
(Lee 2000), when the exhaustive list of all the contrastive topics that constitute a big
Topic is uttered. But, then, the intonations for these listed contrastive topics are not
proper CT contours (L+H*LH%, roughly B accent or fall-rise) except in the
topicalized, initial position. Note that people do not accept (9) and (10) but accept
(11) and (12).

(9) *Fred ate the BEANS but Sam ate the PEANUTS.
L+H*LH% L+H*LH%

(10) *Fred ate the BEANS but he did not eat the PEANUTS.
L+H*LH% L+H*LH%
(11) FRED ate the beans but MARY ate the peanuts.
L+H*LH% L+H*LH%
(12) The BEANS, he doesn’t like; the EGGPLANT, he doesn’t
L+H*LH% L+H*LH%
like; and the PEANUTS, he doesn’t like, either.

In (12), many people do not like the last item having a CT contour of L+H*LH%
because they are aware that it exhausts the list of items with identical predicates.
Brown (1980) noted that a high boundary signals that there is more to come
on the current topic. If we consider topicalized CTs as special cases of CT requiring
a special syntactic position, the most natural and typical situation in which CT
occurs is a single sentence utterance with a CT in-situ like (4B), which unmistakably
involves a conventional implicature of but Sam did not eat the beans (or
but I don’t know about the rest of the people). The implicature is conventional
because it is evoked by the contrastive contour in English or by a morpheme plus a
high tone in Korean, although even without these linguistic devices the same
implicature can be evoked conversationally, purely from context; Steedman (in this
volume) largely came to take this position, but Büring (2003) views it as
conversational. This denial is the first evoked
implicature even when ‘Sam ate the peanuts’ is asserted, but then it is somewhat redundant and
trivial because the alternative that entails the denial is rather explicitly asserted.
This listing effect (with no implicature) occurs in a discourse even across speakers
or sentence boundaries. Consider Kadmon’s interesting observation in (13). The
only potentially relevant kissers are Larry and Bill.

(13) A: Who kissed who?

B: (Let’s see) LarryTF kissed NinaFF.
C: (Right, and) BillTF kissed Sue.

Therefore, the notion of “Contrastive” may better be understood as showing a
contrast between the said part and the polarity-reversed, implicated unuttered part of
the partly realized, contrastively conjunctive complex sentence. The conjunction, of
course, includes more directly contrasted elements, one in the first conjunct and the
other in the implicated second conjunct. List contrastive topics do not have the
implicature part of this nature because the said sentence is complete as a whole.
Thus explored, the CT contour (L+)H*LH% in English (and similarly L*H(H%) in
German (Féry 1993)), with the required implicated proposition, is used in rather
limited discourse contexts. Only syntactically topicalized contrastive topics, as list
contrastive topics, share the same CT contour with no argumentatively assertive
implicature of the kind seen in a typical CT utterance.

2.3. Contrastive Topic in Korean: Intonation

CT in Korean shares a remarkable number of features with CT in English. First,
a typical CT with implicature requires the topic marker –nun and a high tone
((L)H*). The topic marker –nun is shared by a non-contrastive Topic, as we have
seen. Second, list contrastive topics do not show the high tone required for a typical
CT, although they are marked by the same topic marker –nun. Let us first observe how
sharply the CT contour in Fig. 2 is distinguished from the non-contrastive Topic pitch
in Fig. 1.

(14) (After hearing that Inho didn’t come, regarding his friend Yengswu)
Yengswu-nun w-ass-e
-CT come-PAST-DEC
‘YengswuCT came.’

Figure 2. Contrastive Topic

There is a sharp difference in pitch height between the Topic –nun (Fig. 1) (150 mh)
and the CT –nun (Fig. 2) (over 200 mh). This is why I described the CT -nun phrase
as (L)H*(%). There occurs a direct rise from L on the final syllable of the nominal
or other lexical constituent (CT target) to the CT marker –nun, a non-lexical
function element, unlike in Indo-European languages (C. Lee 2000). This implies
that contrastive accent and contour in Korean and English are different from other
focus accents. In Japanese, according to Nakanishi (in this volume), a CT marker wa
on a Subject in initial position does not seem to be high, but mid-sentential CT wa
is high in tone according to my fieldwork. The marker -nun shows phrasal
boundaries, those of Intonational Phrase (IntP) or Accentual Phrase (AP).² In
naturally occurring speech, non-contrastive Topic and list Topic are so low in
pitch that marking H indiscriminately on their S-initial –nun in Jun’s (1998) K-ToBI
may have to be reconsidered, despite the tendency of LHLH AP in Korean. Because
of the phrase-final rise, CT has nothing to do with the dephrasing effect witnessed in
(non-phrase-final) Focus elements (Jun 1993). Therefore, Focus may follow it. Dephrasing
is analogous to de-accenting in English (Pierrehumbert 1980), e.g. Q: Who
did Anna marry? A: (Anna married) MANNYH*LL%. Because of the following Focus,
backward deaccenting occurs and no pitch accent or boundary is marked on the
string of the non-contrastive Topic and the verb in the background (a non-contrastive
Topic given in Korean is similar, as in Fig. 1). Typologically, in Italian
and Romanian given information is not de-accented, contrastively focused elements
already lacking accent (Ladd 1996). CT –nun is also the longest in duration among
different phrase-final elements. In contrast to the high pitch of the above typical CT,
observe the low pitches of the list contrastive topics in Fig. 3.

(15) A: ai-tul-un myet haknyen –i-ci-yo

children-TOP what grade –be-POLITE
‘What grades are your children in?’
B: kun ay nun sa-haknyen-i-ko cakun ay nun i-haknyen-i-ey-yo
older one –CT 4th grade-be-and younger one –CT 2nd grade-be-POLITE
‘The older one is in 4th grade and the younger one is in 2nd grade.’

Figure 3. List contrastive topics

2.4. Contrastive Topic to be Preceded by Potential Topic of Sum

The crucial requirement of CT is that potential Topic of sum must precede or be
assumed to precede it. If a sum is impossible, an entailing stronger element cannot
be marked CT. Consider:

(16) A: Did she give birth to a baby?

B: Yes, she got a daughterF.
B’: #She got a daughterCT.

In a join semilattice, a (local) top type is entailed by its lower types in the
ontological type/sort hierarchy, and is thus ‘given’ (Schwarzschild 1999) by the latter
if a lower type element occurs first, e.g. male/female→gendered,
gorilla/monkey→animal. Likewise, daughter/son→offspring (baby), but we cannot get the
idea of sum in the situation of ‘giving birth to a baby’ in (16A). Therefore, a
stronger daughter is informative and cannot be CT-marked but can be F-marked or
CF-marked (to be discussed shortly), because an assumed intervening direct question is
an alternative disjunctive question, ‘If yes, is it a daughter or a son?’ If the question is
(17A), we can get the notion of sum in children (or babies) and hence B.

(17) A: Do you have children?

B: I have sonsCT.

If B’s answer is ‘Yes, I have sonsF,’ then it is exhaustive (but it can still have the
conversational implicature of ‘but I don’t have daughters’ from the context). Once
(17B) is uttered, it by default evokes a scalar implicature, and I say it is conventional
because it has a special fall-rise pitch contour and is not readily cancellable without
epistemic contradiction. Even an explicitly asserted proposition may at times be
cancelled in a very roundabout way, with hedges and corrections. A conventional
implicature may not be an exception to this kind of roundabout situation. The
implicature of (17B) may initially be scalar, with something like ‘But I don’t have
daughters and I am not totally satisfied with this,’ tending to give more weight to
‘daughters’ on a pragmatically evoked scale. In a boy preference society, B’s answer
‘I have daughtersCT’ may evoke a reversed scale of {daughter < son}.
Often a question is used indirectly to induce the hearer’s response on his/her
possible involvement in the event in question, for instance ‘Who hit Mary?’ Then
‘someone hit Mary’ is derived as a presupposition via existential closure of the
interrogative (Karttunen 1977) such that λp∃x[p & p=hit(x, m)]. Next, a question,
‘Did you and other people hit Mary?’, is accommodated, and ‘ICT didn’t hit her’ is
naturally interpreted; here, I has more weight than other people (Lee 2003).

3. SCALAR MEANINGS

3.1. Subtype Scalarity

A ‘coin/bill → money’ situation (Lee 1999) evokes clearer scales. Although ‘money’
is a mass term, it can be partitioned into two equivalence classes: coins and bills.
When one is asked ‘Do you have money?’, a sum idea can be evoked, because having both
coins and bills at the same time is all right, unlike in the ‘baby birth’ situation, and a
typical answer can be (17a) on a contextual scale of <coins, bills> (bills with greater
weight); in this situation (17b) is infelicitous. In a very special context, however, e.g.
getting on a bus, (17b) is possible, on an opposite scale <bills, coins> (coins with
greater weight).

(17) a. I have coinsCT.

b. I have billsCT.

My claim, then, is stronger than previous accounts in that scales are dually evoked in
my account, first by the semantic relations of atom – sum, member – set, subset –
superset, and subtype – type, and secondly by pragmatic ordering relations between
alternative parts, i.e. atoms, members, subsets, and subtype elements, of larger units
or wholes in the query, when individuals are discussed, as exemplified above
({coins < bills}, {daughter < son}). In other words, it is not a simple ordering of
money – coin, baby – daughter as values in a basic scale ordered by a relation
between type in the query and subtype in the reply. When the query is by sum and
the reply is by subset or atom, the reply is not enough and generates the implicature
of ‘not sum’ but the reply has affirmed the subset or atom already and it leads to ‘not
the rest or its relevant part’ even conversationally without fall-rise. This kind of
relation has been well explored by Ward and Hirschberg (1985), although they
characterised fall-rise as implicating “uncertainty,” which is general and somewhat
vague but was called “conventional implicature.” They defined scale by poset
(partially ordered set) and included in it hierarchical and linear orderings such as
spatial or temporal orderings, stages of a process, and relationships of type/subtype,
or part-whole, in addition to Ladd’s (1980) hierarchical sets ordered from root to
leaf. They illustrate an ‘is a part of’ relation with dissertation - first chapter - first half.
They also provide a symmetric relation, ‘cousin of,’ which creates oddness with fall-rise. In my
account, one conjunct cannot be denied while the other is affirmed in ‘I am John’s cousin and
he is mine.’ Consider their example:

(18) A: Are you John’s cousin?

B: #He’s \mine/.
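
The oddness of (18) can be illustrated with a small sketch. It assumes, as a simplification of the poset requirement, that a relation can order alternatives on a scale only if it is antisymmetric; the function name and the pair sets below are hypothetical illustrations, not data from Ward and Hirschberg (1985).

```python
def is_antisymmetric(pairs):
    """Return True if no two distinct items are related in both directions."""
    rel = set(pairs)
    return all(a == b or (b, a) not in rel for (a, b) in rel)

# 'is a part of': first half < first chapter < dissertation (assumed encoding)
PART_OF = {
    ("first half", "first chapter"),
    ("first chapter", "dissertation"),
    ("first half", "dissertation"),
}

# 'cousin of' is symmetric: if B is John's cousin, John is B's cousin
COUSIN_OF = {("B", "John"), ("John", "B")}

print(is_antisymmetric(PART_OF))    # part-whole can order alternatives
print(is_antisymmetric(COUSIN_OF))  # a symmetric relation cannot
```

On this sketch, the part-whole relation qualifies as an ordering while ‘cousin of’ does not, so no higher or lower alternative is available for the fall-rise reply in (18B) to deny.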
The same kind of relation, which may be termed an abstract LARGER THAN
relation, holds in Topic formation: the Topic denotation must be LARGER THAN
its parts, and the parts in turn are ordered in the same way, LARGER first, in the
multiple nominative/accusative case construction, and only the largest can be Topic
(Lee 1989, 1994). In (19), ‘elephants’ is LARGER THAN its part ‘noses’ and
comes first, forming a Topic, as in (a). If the part nominal ‘noses’ takes a topic
marker, it comes to function as a CT, as in (b), implicating ‘but not other parts’ or
‘but they do not smell well.’ If the Topic marker in the initial position is replaced by
the nominative marker, the nominal is focused, as in (c).

(19) a. khokkiri-nun kho-ka kil-ta

elephant-TOP nose-NOM long-DEC
‘(As for) Elephants, their noses are long.’
b. khokkiri-nun kho-nun kil-ta

elephant-TOP nose-CT long-DEC
‘(As for) Elephants, their noses are long but ---.’
c. khokkiri-ka kho-ka kil-ta
elephant-NOM(FOC) nose-NOM long-DEC
‘It is elephants whose noses are long.’

My further claim is that the lower line sister alternatives in hierarchies may typically
form scales in CT. A typical CT with an appropriate contour evokes a scalar
implicature conventionally by default but a list alternatives reading may be forced
by certain nominals in certain contexts. Consider further examples from Ward and Hirschberg (1985):

(20) A: Is she taking any medication?

B: \Vi/tamins.
(21) A: Are you a doctor?
B: I have a Ph.\D/.

In (20B) a stronger kind of medication is denied and in (21) ‘a medical doctor,’
which has more weight on that particular pragmatic scale, may be denied. Note
that Ladd’s (1980) following example shows that there is a whole-part (poset)
relation between the locations in (A) and (B), unlike in (A) and (C). B does not
entirely agree, denying the wider range, whereas C agrees with A’s claim strongly,
leaving no room for skepticism but metalinguistically negating A’s expression covertly.

(22) A: Harry’s the biggest fool in the state of New York.

B: In ITHACACT , maybe.
C: In THE WHOLE WORLDF, maybe.

Consider van Rooy’s (2002) example of scalar interpretation of nominals. He does
not introduce fall-rise here.

(23) Q: Which Beatle’s autograph do you have?

A: George Harrison’s.
~> ¬John Lennon’s, though ◊Ringo Starr’s
“Standard” partition: 4 Beatles ~> 16 cells.
Autographic prestige:
Starr < Harrison < {Lennon, McCartney}
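
The figure of 16 cells can be checked with a short enumeration. This is only a sketch under the assumption that each cell of the “standard” partition corresponds to one possible set of Beatles whose autographs the answerer has; the names BEATLES and partition_cells are hypothetical.

```python
from itertools import combinations

BEATLES = ["Lennon", "McCartney", "Harrison", "Starr"]

def partition_cells(individuals):
    """List every subset of individuals; each subset stands for one cell."""
    cells = []
    for size in range(len(individuals) + 1):
        for subset in combinations(individuals, size):
            cells.append(frozenset(subset))
    return cells

cells = partition_cells(BEATLES)
print(len(cells))  # number of cells for four individuals
```

An exhaustive reading of “George Harrison’s” picks out the single cell containing only Harrison, whereas the scalar reading additionally consults the prestige ordering among the alternatives.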

Van Rooy does not distinguish between a semantic scale arising from the hierarchy
of the sum of Beatles’ autographs (this must be posited in the assumed query
preceding (23Q)) and the individual Beatles’ autographs and a pragmatic scale
arising from different weights among different alternative Beatles. He addresses the
latter type of scale. Without any CT contour on (23A), it may have an exhaustive
interpretation with “standard” partition and list reading, evoking no particular scale
among alternative Beatles. Herburger (2000) also indicates that “When a fall contour
on free focus is changed to fall-rise, a resulting ‘at least’ interpretation undermines
the exhaustivity of focus.” Alternatively, it can have a conversational scalar
implicature shown above, based on the given prestige scale in the context. If we use
the Contrastive (fall-rise) Contour on the answer “George Harrison’s,” preferably
with the question ‘Do you have John Lennon’s autograph?’ the scalar implicature is
unmistakable and because of the linguistic device used (a contrastive pitch contour
in English or a morpheme + a high tone in Korean) it is a conventional
implicature. Even without this contour or morpheme, the answer can have a
conversational implicature, depending on contexts or can be free of it, exhaustively
interpreted. Evolutionarily, those particular prosodic or morphological devices seem
to have come to regularly license fairly predictable Contrastive Topic meanings
associated with them from relevant contexts. The unuttered meanings of Contrastive
Topic developed from conversational implicatures arising without such special
devices and still co-exist with them. In a nutshell, Contrastive Topic is employed to
convey this kind of implicature, concessively admitting the uttered proposition.
What happens when an answer is uttered negatively with a CT? Let us consider
the following dialogue situation: The potential Topic of sum is given in the query
(Q) and the answer (A) is negatively uttered with a CT John Lennon’s, which may
be located highest in a scale of prestige. This pragmatic scale may be the speaker’s
presupposition or accommodated by the hearer’s scalar reply.

(24) Q: Do you have Beatles’ autographs?

A: I don’t have John Lennon’sCT.

Then, its conventional implicature is polarity-reversed, i.e. affirmative, with a
value on the scale that is lower, not higher, than the given value. Therefore, the
implicature in the given context turns out to be “But I have other Beatles’ (weaker
than John Lennon in the scale of prestige) autographs.” Often the context is more limited
than this, e.g. the speaker knows whether the hearer has Lennon’s and McCartney’s,
knows that the hearer is aware of the speaker’s knowledge of that fact, and asks,
“Do you have Harrison’s autograph?” The reply is “I don’t have Harrison’sCT.” Then
the relevant value element is the lower one, Harrison’s, generating the implicature
“But I have Starr’s.” This is the opposite of what happened in (23), where an
affirmative CT reply is uttered.
Now a generalization follows: if a sentence with a CT is uttered (as a reply),
then contrastively (“but”) a polarity-reversed proposition is conventionally implicated, with an
alternative value that is greater than the CT denotation on the pragmatic scale if the reply is
positive, and less than it if the reply is negative.
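
The generalization can be sketched computationally. The following is a minimal illustration rather than part of the analysis itself: it assumes the prestige scale of (23) encoded as numeric ranks, and the names PRESTIGE and ct_implicature as well as the rank values are hypothetical.

```python
# Assumed encoding of the scale in (23): Starr < Harrison < {Lennon, McCartney}
PRESTIGE = {"Starr": 0, "Harrison": 1, "Lennon": 2, "McCartney": 2}

def ct_implicature(prestige, ct_value, reply_is_positive):
    """Return the polarity of the implicated proposition and its alternatives.

    A positive CT reply implicates the denial of higher-ranked alternatives;
    a negative CT reply implicates the affirmation of lower-ranked ones.
    """
    rank = prestige[ct_value]
    if reply_is_positive:
        higher = sorted(n for n, r in prestige.items() if r > rank)
        return ("negative", higher)
    lower = sorted(n for n, r in prestige.items() if r < rank)
    return ("affirmative", lower)

# Positive reply as in (23): 'George Harrison's' with CT
print(ct_implicature(PRESTIGE, "Harrison", True))
# Negative reply as in (24): 'I don't have John Lennon's' with CT
print(ct_implicature(PRESTIGE, "Lennon", False))
```

The sketch treats the scale as given; in the account above the scale itself is supplied by context, either presupposed by the speaker or accommodated from the hearer’s scalar reply.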
Next, let us turn to what kinds of categories can be marked CT. In Korean (and
presumably crosslinguistically), basically most categories may be marked CT
including adverbs. In Korean, however, prenominal quantifying Determiners such as
motun ‘all’ cannot be marked CT, unlike in English. Instead, their adverbial forms
(motu, ta ‘all’) can. In (25), an adverb cal ‘well’ has been marked CT and a very
high tone far over 200 Hz is noticed in Fig. 5. (25) is negative and an affirmative
proposition with a weaker value than ‘well’ in the scale is implicated, such as ‘but I
know a little bit.’ This is sharply distinguished from an utterance without CT-
marking: cal molla ‘I don’t know it well,’ ‘I am not quite sure,’ which can be used
when the speaker knows (almost) nothing about it. Chierchia (2002) discusses a
similar, interesting point but does not have the idea of CT at all when it is required.
Observe:

(25) cal-un moll-a

well-CT not.know-DEC
‘(I) don’t know (it) wellCT.’

Figure 5. Adverb CT

Nominals in all grammatical relations or positions take CT in Korean, including
object CT, as in (26) and Fig. 6. An object CT fronted to the initial position of a
sentence tends to be more topical passively with wide scope than that in situ.

(26) sakwa-nun mek-ess-eyo

apple-CT eat-PAST-DEC
‘(I) ate apples.’ (with a null Topic)

Figure 6. Object CT

Nominals followed by the Possessive marker –uy cannot take the CT marker
either after the nominal or after –uy. Only predicatively used categories can take
CT (introducing the Nominalizer –ki in the prenominal modifier position, e.g.
yeyppu-ki-nun ha-n sonye ‘A prettyCT girl’). A postpositional phrase of DP + P takes
the CT marker after P but not after DP: Ku ai-nun [cip ‘house’-eyse ‘at’-nun] nul wu-n-ta
‘That child always cries at home.’ Contrastive Predicate Topic will be discussed
shortly. Hedberg’s (2003) example He hasn’t (H*) done anything (L+H*)
extraordinary (L+H* LH%). [4/27/01] shows a modifier CT in a negative sentence
and evokes an affirmative implicature with a lower value, such as ‘he may have done
something ordinary.’ Its counterpart in Korean gets CT-marking with –nun on
the nominal kes ‘thing,’ but the CT-marking is associated with the modifier
thekpyeha-n, which triggers its alternatives. This is a CT, and it seems that she has departed
from assigning a “Contrastive Focus” to this fall-rise case (Hedberg et al. in this
volume).
Let us further consider what types of sentences license CT in general. A simple
declarative sentence is a typical type and an interrogative sentence in Korean is
another. I demonstrated elsewhere (Lee 2002, etc) that in most languages CT is
licensed in relative and subordinate clauses, though restrictively crosslinguistically,
but that occurrence of non-contrastive Topic is impossible in Korean because the
relative clause head nominal comes through Topic in the relative clause during
relativization (Lee 1973) (and in Japanese as well). Complement clauses license CT
in them easily crosslinguistically, as in (27b).

(27) a. John knows a song that MARYCT sings well. (from Subject)
b. John knows that MARYCT sings the song well.

In Korean, a whole complement clause can take CT before a main clause attitude or
communication verb and it can be focally associated with either the predicate
(preferred) or the subject of the complement clause. Because (28) is negative, an
affirmative proposition with a weaker predicate in the scale than the complement
predicate ‘right’ is conventionally implicated. Observe:

(28) Yumi-nun ku-ka olh-ta-ko-nun po-ci anh-nun-ta

Yumi-TOP he-NOM right-DEC-COMP-CT think not
‘Yumi does not think [that he is right]CT.’

The contrastively implicated proposition may be ‘But Yumi thinks that he’s got a
point.’
Crosslinguistically, in English, German, and Korean, the pitch accent for
(information) Focus, H*(L), is distinct from the one for CT, roughly (L(+))H*(-),
whereas in Finnish and Norwegian, Focus and CT are not so distinct prosodically
(Vallduví and Vilkuna (1998:89), Fretheim (1992), Gundel (2002)).

4. CONTRASTIVE TOPIC AS A NARROW-SCOPE-BEARER?

In Korean, CT-marked universal quantifiers, universally quantifying time, degree
and frequency adverbials, as well as positively quantifying adverbials such as
‘often’ (cacu-nun) and ‘much/many’ (manhi-nun), always take narrow scope with respect to
negation. Observe:

(29) ta-nun an mek-ess-e

all-CT not eat-PAST-DEC
‘(I) didn’t eat all.’
(30) ta an mek-ess-e
all not eat-PAST-DEC
‘(I) didn’t eat all.’

In (29), the CT marker is attached to the universal quantifier (originally an adverb,
‘completely’) and we can see the high pitch of the CT marker –nun in Fig. 4, and in
(30) the CT marker has been deleted but its tone has been preserved and there is a
rising tone from ta ‘all’ to an ‘not’ because of the compensatory high tone coming
from the deleted CT marker, as in Fig. 5. Thus it is noted that the CT marker is
deletable, just as the non-contrastive Topic marker is, whereas the CT high tone,
which is largely responsible for the focality in CT, is not. Thus (29) and (30) are
identical in interpretation with the narrow-scope CT or wide-scope negation.
Compare it with the pitch track of a negative sentence with no CT marker or its
compensatory tone ta an wasse ‘All didn’t come’ in Fig. 6, with wide-scope universal.

Figure 4. Universal Quantifier with CT marker in Negation

Figure 5. Universal Quantifier with Compensatory Tone in Negation

Figure 6. Universal Quantifier with no CT or Compensatory Tone in Negation

Ladd (1980) and Jackendoff (1972) claim that fall-rise forces a narrow-scope
reading in (31) and (32) also in English.

(31) \All/ the men didn’t go.

(32) I didn’t see \all/ of the men.

Suppose (31) is interpreted as ∀¬; then all is exhaustive, with ¬ go, and there is no
continuation to a contrasted proposition with weaker affirmation (see (30) above),
‘but some men went,’ etc. The same applies to (32). Therefore, there is no scope
ambiguity in (31) and (32). Consider, however, the ‘ambiguity’ between the
narrow-scope CT and wide-scope CT readings of (32) below in English, advocated by
Büring (1999) and Kadmon (2001).

(32) Two thirdsCT of the politicians are not corrupt.

a. ¬ 2/3
b. 2/3 ¬ (not easy with fall-rise)

In (a), a typical CT reading of scalar, nonspecific, non-partition cardinality is given.
Roughly, (32), on this reading, is ‘it is not the case that up to two thirds of the
politicians are corrupt but a little less than that may be corrupt.’ This reading is
denial of the other party’s high value assertion, implicating a low value affirmation
on the scale. In (b), on the other hand, a topicalized partition reading is given and
this reading of (32) is roughly ‘two thirds of the politicians are non-corrupt (and one
third may be corrupt.)’ The latter reading is similar to a Topic reading, in which no
fall-rise is required. I claim that there occurs a topicalization effect for wide-scope
CT. This also occurs in Korean in the Topic position. Consider Korean. (33) is
ambiguous but a CT in the object position in (34) is not:

(33) cengchika-euy sam-pwun-euy i-nun pwuphay-ha-ci anh-ass-ta.

politician-of 3-part-of 2-CT corrupt not-PAST-DEC
‘Two thirdsCT of the politicians are not corrupt.’
a. ¬ 2/3 (non-partition, less than 2/3 corrupt – by polarity reversal
affirmative weaker value implicature)
b. 2/3 ¬ (partition, the rest=1/3 corrupt by implicature)
(34) euysa-euy sam-pwun-euy i-nun hayko-ha-ci anh-ass-ta.
doctor-of 3-part-of 2-CT fire not-PAST-DEC
‘(The Government) did not fire two thirds of the doctors.’
a. ¬ 2/3 (non-partition, with an assumed null or realized Topic in the initial
position)
b. (i) ¬ 2/3 (non-partition, with a subject ‘the Government’
after the CT phrase inserted and the CT high tone contour)
(ii) 2/3 ¬Focal subject; 2/3 ¬Focal verb; 2/3 ¬ (partition, with a subject,
say, ‘the Government’ inserted after the CT phrase and a CT high
tone which tends to be low) (with constituent negation on focused
subject or predicate, evoked by Contrastive Predicate Topic)
c. 2/3 ¬ (Topic reading with TOP marking and no high tone, partition,
specific, the rest = 1/3 may be fired) (this reading is also possible
with the Topic phrase with a low tone in the original object position)
(constituent negation readings evoked by Contrastive Predicate Topic
as in (bii) are also possible)

Exactly parallel readings arise in English; the 2/3 ¬ reading in (32) is a
topicalization effect and a non-scalar partition is denoted. Consider an object CT. In
(35), ¬ 2/3 seems natural. The Government did not fire up to 2/3. So, ‘---fired less
than two thirds’ is implicated.

(35) The Government did not fire two thirdsCT of the doctors. (With contrastive
fall-rise contour on ‘two thirds’)

How about the same object CT in the topicalized position?

(36) Two thirdsCT of the doctors the Government did not fire. (With contrastive
fall-rise contour on ‘two thirds’)

In this position, both a partition reading with topicalization effect (with constituent
negation possibilities as in Korean) and a scalar non-partition reading seem to be
available.
We can now see that fall-rise (in CT) in fact forces a narrow scope reading,
which is scalar, both in Korean and in English. A non-scalar partition reading is a
consequence of the topicalization effect.
When CT follows a scope-bearing element such as a quantified, focal expression,
it takes narrow scope with respect to the scope-bearing element. Observe:

(37) motu-ka/nwukwuna-ka sakwa sey kay –nun mek-ess-ta

all-NOM/everyone-NOM apple three CL-CT ate
‘Everyone ate three applesCT .’ ∀ > ∃ 3 (CL=Classifier)

The CT expression has narrow scope with respect to the preceding universal
quantifier in (37) with the meaning of ‘at least three but not more than three apples.’
It has the same effect of having a distributive marker –ssik ‘each’ attached to the
numeral classifier (sey kay-ssik-un). When the CT phrase is scrambled to the initial
position of the sentence, it still predominantly keeps narrow scope but opens the
possibility of wide scope rather marginally. Even when it comes to have wide scope
reading, ‘three apples as a whole’ is contrasted with other alternatives. Consider:

(38) sakwa sey kay –nun motu-ka/nwukwuna-ka mek-ess-ta

apple three CL-CT all-NOM/everyone-NOM ate
‘Everyone ate three applesCT .’ ∃3 < ∀ (∃3 >∀)
A Focus phrase Yumi-man-i ‘Yumi-only-NOM’ can replace the universal quantifier
phrases in (37) and (38), seemingly preserving the same scope relations. In
particular, if the CT phrase in (38) is replaced by [sakwa-rul sey kay-nun]
‘apple-ACC 3-CL-CT,’ then the narrow scope of CT is unmistakable, although the
acceptability of the sentence degrades slightly; this case-marker-intervening construction
lacks specificity. Also, in (38) if the predicate has a modal expression such as ‘can’
or ‘will,’ the CT narrow scope is unmistakable. If the ACC marker –rul replaces
the CT marker -nun in those sentences, both sentences get an ambiguous scope
relation.
This tendency of CT narrow scope is also reported in the CT initial position in
Hungarian (Gyuris 2004).

5. CONTRASTIVE PREDICATE TOPIC

5.1. Scalarity of Contrastive Predicate Topic

So far we have treated mainly entity type CTs. However, there are ample cases in
which properties (or predicates) become Contrastive Topic, which I call Contrastive
Predicate Topic (Lee 1999, 2000, 2002). Contrastive Predicate Topic is also a sort of
topic (topical) in the sense that it has been a potential Topic, discussed or assumed in
the previous discourse. In this sense, it is not Hetland’s (2003) “main news,”
although it is a predicate, typically used for Comment information. It is more
discoursal than sentential. Therefore, it may not fit the narrow definition of Topic by
means of ‘aboutness,’ in which the rest of the sentence talks about it. Steedman
(2000) strikingly coincides with my view, though he does not so clearly distinguish
between Contrastive Topic and his “unmarked theme” until this volume. Secondly,
it is scalar in a stronger sense than entity type CT. Consider (39), (40), in which
pragmatic scales are evoked:

(39) She ARRIVEDCT. ~> ¬She went on the stage.

(40) She PASSEDCT. ~> ¬She aced the exam.

(39) evokes a scale of {arrive < go on the stage} in context and (40) readily evokes
{pass < ace the exam}. Interestingly, the former scale is not semantic but pragmatic;
in other words, the larger value ‘go on the stage’ does not entail the lower one. But
if we consider a specific context in which ‘go on the stage’ requires ‘arrive’ as a
precondition, the former entails the latter in that context and we can call it a
pragmatic entailment. The latter scale is semantic; ‘ace the exam’ entails ‘pass the
exam.’ (Conventional) scalar implicatures are evoked by both pragmatic and
semantic entailments. On the predicate part we can also have a CT such as “All the
abstracts DID get accepted” ~> ‘but there may be withdrawals.’ Rooth’s (1996) simple
alternatives by F-marking cannot explain why fall-rise requires the relevant type of
scalar implicatures. See Lee (2000) for further examples of scalar Contrastive
Predicate Topic.
Then, a big question arises: Is a single CT sentence without Focus [Topic + CT]
possible, as in (39) and (40)? On the surface at least, it is a fact (Steedman 2000 agrees
on this, while some others claim there must be a Focus on the surface). If we consider,
however, why we talk without giving new information by focusing something, we
may want to ponder possible explanations: (1) There is a silent Focus in the
scalar implicature part. This phenomenon is not independent; identification focus is
silent with a rising Topic marker (-nun (Korean), wa (Japanese), shi (Chinese)) in a
question such as ney irum-un? or “Your name?”; (2) The yes/no (or verum)
question demands an answer with respect to whether or not, i.e. arrived or not;
passed or not. So, it may include a (Contrastive) Focus (Lee 2003). A partial
affirmative answer to this yes/no question is the concessively admitted CT sentence;
(3) CT itself is partially focal and we may assume that the implicature part is also
partially focal. Thus, the totality may be fully focal; (4) There is nothing beyond the
surface form [Topic + CT]. (1) and (3) above consider the implicature part and
are preferable to (2) and (4).
Focus is even neurologically real: Some ERP experiment results (Yuki 2004)
show striking brain responses to the lack of expected intonational prominence (A2)
in Figure 7 for focused words in Japanese. For the Subject wh-Q “Who lost the
key?” (Da’re-ga kagi’-o nakushita’-no?), A1 is Match: MA’SAYA-ga kag’i-o
nakushita’-N-da-yo and A2 is Mismatch: Ma’saya-ga KAGI’-o nakushita’-N-da-yo.
The Subject that lacks the expected intonational prominence (A2) is more positive
in the waveform than the properly prominent subject (A1). Observe:

Figure 7. ERP waveforms for Subject-focus WH-Q-answer pairs (A1 vs. A2)

5.3. REASON Adjunct Clause and Negation

A reason adjunct clause and negation interact scopally in various languages and
Korean is not an exception. But observe (39) first, which has a Contrastive Predicate
Topic. It has the wide-scope negation and the CT is focally associated with the
reason clause. If the CT marker is deleted but its compensatory tone is retained, its
interpretation is the same as (39). But if the same sentence has no CT marker and no
high tone, then its interpretation is the same as (40). In the written text without any
intonation marking, the sentence is ambiguous between the two opposite scopal
interpretations. Because the Contrastive Predicate Topic is associated with the
reason clause both in (39) and in its corresponding sentence with a null CT marker
but with a high tone and the reason clause comes to have the direct CT effect, the
interpretation is: [It is not because she is richCT that he married her]. Then, its
implicature may be: [He married her because she is nice], ‘nice’ being weaker than
‘rich’ in the pragmatic scale. In the narrow-scope reason clause sentences with the
CT marker or its compensatory high tone in its narrow-scope reason, the reason
clause is rather high and is immediately followed by the matrix clause intonationally,
whereas in the wide-scope reason clause sentences with no CT marker or tone the
reason clause falls and there arises a big pause before the main clause. There is an
exact correlation between intonation and interpretation.

(39) pwuca-yese kyelhon-ha-ci-nun anh-ass-e

rich-be-because marry-CT not
‘(He) didn’t marry (her) because she is rich.’ REASON < NEG

Figure 8. REASON Clause < Negation (CT-marked)

(40) pwuca-yese kyelhon an hay-ss-e

rich-be-because marry not do-DEC
‘(He) didn’t marry (her) because she is rich.’ REASON > NEG

Figure 9. REASON Clause > Negation

All the scope relations involving quantifier–negation and REASON-negation depend
on whether the sentences in question have inherently Contrastive Predicate Topic
(with a pitch accent or marker), related to the previous discourse context. If that is
the case, the sentences must take the wide-scope negation, with the Contrastive
Predicate Topic focally associated with the relevant quantifiers or REASON clause.
Thus viewed, scope ambiguity is not present. Constituent negation also involves
Contrastive Predicate Topic, with the latter being focally associated with the
relevant constituent (Lee 2006).

7. CONCLUDING REMARKS
Contrastive Topic is preceded by a question that includes a sum as a potential Topic
or a conjunctive question (or even if it is a disjunctive question, inclusive reading
must be possible). On the other hand, Contrastive Focus, which has not been treated
here, is preceded by an alternative disjunctive question which expects a choice of a
single answer (see Lee 2003). A typical CT, which necessarily evokes a
conventional implicature, must be distinguished from a type of list contrastive topics.
Not only type-subtype scalarity (based on poset) but also subtype scalarity must
be incorporated in any model of Contrastive Topic, although some entities in some
contexts are allowed to receive list reading.
Contrastive Topic basically behaves as a narrow-scope-bearer in interaction with
other scope bearers including a REASON clause. A Contrastive Predicate Topic
analysis is proposed for the wide-scope negation reading of the scope ambiguous
sentences.
Predicates are necessarily subtype-scalar when CT-marked and numerals and
quantifiers, which are semantically ordered, have the same nature when CT-marked.
We cannot miss the real intent of using a CT: it is to convey a conventionally
implicated proposition. If ‘CT(p)’ is given, then contrastively (‘but’) ‘not q’ (q: a
contextually higher stronger predicate) is conveyed and if ‘CT(not-q)’ is given, then
contrastively ‘p’ (contextually a lower weaker predicate) is conveyed (Lee 2002).
The rhetorical force of CT is placing more weight on the unuttered implicature
proposition. The CT utterance is concessive admission and its concessivity can be
shown by the near-paraphrase relation of (39) to (41):

(41) Even though/Even if/Although she ARRIVED, she didn’t go on the stage.

Although ‘even if’ is possible, it is not like a normal conditional, not licensing
contraposition. The truth of the consequent is urged, whatever the antecedent may
turn out to be in truth. The implicature of (39), i.e. the consequent of (41), is thus
forceful in rhetorical structure.
Steedman (2000) incorporates a CT tone (L+H*) in the specification of ‘married’
in the lexicon (from Anna MARRIED (L+H* LH%) MANNY (H*LL%)) but claims
that its implicature is “conversational” (this volume). But he emphasizes that
“kontrast, thematicity, and hearer responsibility are all elements of literal meaning,
and hence in your terms conventional implicature” (p.c.). Scalar implicatures
generated by CT marking, though their higher values are determined by context, are
not cancelable and are conventional. The intonational device may better be closer to its
meaning as conventional. Information structure must be able to show the relation
between intonation and meaning more closely by our further scrutiny.

Seoul National University

8. NOTES
1
I would like to express my gratitude to Klaus von Heusinger, Mark Steedman and Julia Hirschberg and
other audiences of the Workshop on Topic and Focus: Meaning and Intonation at the 2001 LSA
Linguistic Institute (UCSB) for their questions and encouragement. I am also grateful to my co-editors
Matt Gordon and Daniel Büring for their patience in organizing the workshop and leading it to this
volume eventually. For part of this research Sun-Ah Jun’s comments on intonation, Hyunkyung Hwang’s
assistance on pitch tracks from subjects, KRF grants and the SNU leave of absence for my staying at
UCLA were all helpful.
2
Mira Oh, in her recent experiments (in preparation), ‘Phonetic Realizations of Focus and Topic in
Korean’, observes that the Cheonnam dialect shows an IntPBoundary in contrast with the Seoul dialect.
3
Steedman’s (2000) example (1) can be given a similar scalar interpretation. A theatrical musical
performance is assumed in the previous query and under it a pragmatic scale <musical, opera> can be set
up.
(1) Q: Does Marcel love opera?
A: Marcel likes MUSICALS.
L+H* LH%
Therefore, if opera and musicals are substituted by each other, the answer Marcel likes OPERACT would
not be appropriate on the scalar reading. On a non-scalar reading, the implicature may be open to a list
alternatives reading and even roundabout affirmation.

9. REFERENCES
Bach, K. “The Myth of Conventional Implicature.” Linguistics and Philosophy 22.4 (1999): 327-366.
Brentano, Franz. Psychology from an Empirical Point of View, trans. A. C. Rancurello et al. London:
Routledge and Kegan Paul, 1973.
Büring, Daniel. “Topic.” In P. Bosch and R. van der Sandt (eds.), Focus and Natural Language
Processing 2, pp. 271-280. Cambridge: MIT Press, 1994.
Büring, Daniel. “On D-trees, Beans and B-accents.” Linguistics and Philosophy 26 (2003): 511-545.
Carlson, Lauri. Dialogue Games: An Approach to Discourse Analysis, Reidel: Dordrecht, 1983.
Chierchia, Gennaro. “Scalar Phenomena and Polarity.” Manuscript, 2002.
Choi, Hye-won. Optimizing Structure in Context: Scrambling and Information Structure. Stanford: CSLI,
1999.
Diesing, Molly. Indefinites [Linguistic Inquiry Monograph 20]. Cambridge: MIT Press, 1992.
von Fintel, Kai. Restrictions on Quantifier Domains. University of Massachusetts, Amherst: Doctoral
dissertation, 1994.
Féry, Caroline. German Intonational Patterns. Tübingen: Niemeyer, 1993.
Groenendijk, Jeroen and Martin Stokhof. Studies on the Semantics of Questions and the Pragmatics of
Answers. University of Amsterdam: Doctoral dissertation, 1984.
Hamblin, C. L. Fallacies. Bungay, Suffolk: Methuen, 1970.
Hedberg, Nancy and J. M. Sosa. “The Prosodic Structure of Topic and Focus in Spontaneous English
Dialogue.” This volume.
Hedberg, Nancy. “The Prosody of Contrastive Topic and Focus in Spoken English.” Talk presented at the
Workshop on Information Structure in Context, University of Stuttgart, 2002.
Hetland, Jorunn. “Contrast, the fall-rise accent, and Information Focus.” In Structures of Focus and
Grammatical Relations, pp. 1-39. Tübingen: Niemeyer Linguistische Arbeiten, 2003.
Horn, L. A Natural History of Negation. Chicago: Chicago University Press, 1989.
Ito, Kiwako and Susan M. Garnsey. “Brain Responses to Focus-Related Prosodic Mismatch in Japanese.”
at SP2004, Tokyo.
Jackendoff, R. Semantic Interpretation in Generative Grammar, Cambridge: MIT Press, 1972.
Jun, Sun-Ah. The Phonetics and Phonology of Korean Prosody. The Ohio State University: Doctoral
dissertation, 1993. [Published by Garland, 1996].
Kennedy, Chris. Projecting the Adjective: The Syntax and Semantics of Gradability and Comparison. UC
Santa Cruz: Doctoral dissertation, 1997.
Krifka, Manfred. “At Least Some Determiners aren’t Determiners.” In K. Turner (ed), The
Semantics/Pragmatics Interface from Different Points of View 1, pp. 257-91. London: Elsevier, 1999.
Krifka, Manfred. “The Semantics of Questions and Focusation of Answers.” This volume.
Ladd, D. R. The Structure of Intonational Meaning, Indiana University Press, 1980.
Ladusaw, William. “Thetic and categorical, stage and individual, weak and strong.” In Negation and
Polarity, L. Horn and Yasuhiro Kato (eds.), Oxford: Oxford University Press, 2000.
Lee, Chungmin. Abstract Syntax and Korean with Reference to English. Seoul: Thaehaksa, 1973.
Lee, Chungmin. “(In-)definites, Case Markers, Classifiers and Quantifiers in Korean.” In S. Kuno et al
(eds.), Harvard Studies in Korean Linguistics. Department of Linguistics, Harvard University, 1989.
Lee, Chungmin. “Definite/Specific and Case Marking in Korean.” In Y.-R. Kim (ed.), Theoretical Issues
in Korean Linguistics, CSLI, Stanford University, 1994.
Lee, Chungmin. “Generic Sentences are Topic Constructions.” In T. Fretheim and G. Gundel (eds.),
Reference and Referent Accessibility. Amsterdam/Philadelphia: John Benjamins, 1996.
Lee, Chungmin. “Contrastive topic: A locus of the interface.” In K. Turner (ed.), The
Semantics/Pragmatics Interface from Different Points of View 1, pp. 317-41. London: Elsevier, 1999.
Lee, Chungmin. “Types of NPIs and nonveridicality in Korean and other languages.” In G. Storto (ed.),
UCLA Working Papers in Linguistics 3: Syntax at Sunset 2, pp. 96-132. Department of Linguistics,
UCLA, 1999.
Lee, Chungmin. “Contrastive predicates and scales.” CLS 36 (2000): 243-257.
Lee, Chungmin. “Contrastive Topic and/or Contrastive Focus.” Japanese/Korean Linguistics, 2003.
Lee, Chungmin. “Contrastive Topic/Focus and Polarity in Discourse.” In K. von Heusinger and K. Turner
(eds.), Where Semantics Meets Pragmatics CRiSPI 16, pp. 381-420. London: Elsevier.
Marty, Anton. Gesammelte Schriften, II. Halle: Max Niemeyer, 1918.
Molnár, Valéria. “Topic in Focus: the Syntax, Phonology, Semantics, and Pragmatics of the So-called
‘Contrastive Topic’ in Hungarian and German”, Acta Linguistica Hungarica 45 (1998): 389-466.
Nakanishi, Kimiko. “Prosody and Scope Interpretations of the Topic Marker WA in Japanese.” This
volume.
Neale, S. “Coloring and composition.” In K. Murasugi and R. Stainton (eds.), Philosophy and
Linguistics. Westview Press, 1999.
O’Connor, J. D. and G. F. Arnold. Intonation of Colloquial English 2nd edition. London: Longmans,
1973.
Pierrehumbert, J. and J. Hirschberg. “The meaning of intonational contours in the interpretation of
discourse.” In P. Cohen, J. Morgan, and M. Pollack (eds.), Intentions in Communication, pp. 271-311.
Cambridge: MIT Press, 1990.
Rooth, M. “Focus.” In S. Lappin (ed.), The Handbook of Contemporary Semantic Theory, London:
Blackwell, 1996.
Roberts, C. “Information Structure in Discourse: Towards an Integrated Formal Theory of Pragmatics.”
Manuscript. The Ohio State University, 1996.
Van Rooy, Robert. “Questions and Relevance.” NASSLLI 4 handout, 2004.
Steedman, Mark. The Syntactic Process, Cambridge: MIT Press, 2000.
Steedman, Mark. “Information Structural Semantics of English Intonation.” This volume.
Ward, Gregory and Julia Hirschberg. “Implicating Uncertainty: The Pragmatics of Fall-Rise Intonation.”
Language 61 (1985): 747-776.
Wee, Hae-Kyung. “Semantics and pragmatics of Contrastive Topic in Korean and English.” Manuscript.
Indiana University, 1997.
KIMIKO NAKANISHI

PROSODY AND SCOPE INTERPRETATIONS

OF THE TOPIC MARKER WA IN JAPANESE*

1. INTRODUCTION: THE TOPIC MARKER WA

It is well known that intonational patterns influence pragmatic interpretations in
various languages (Bolinger 1965, Halliday 1967, Jackendoff 1972, Lambrecht
1994, Hirst and Di Cristo 1998, Ladd 1998, and Steedman 2000; McCawley 1968,
Poser 1984, Pierrehumbert and Beckman 1988, for Japanese in particular). Another
well-known fact is that intonation can have an effect on semantic interpretation. For
example, in German, different intonational patterns yield different scope readings
(Féry 1993, Büring 1997, and Krifka 1998, among others; see section 3 below). This
paper discusses how pragmatic information, prosody, and semantic interpretation are
related. The empirical domain on which I focus concerns the pragmatics, prosody,
and semantics of the topic marker wa in Japanese.
It has been claimed that the Japanese topic marker wa is used to convey
pragmatic information. In particular, it is said to have two functions, namely, to
mark a theme or to mark a contrasted element of a sentence, as shown in (1) (Kuno
1973, among others). In the following, I call occurrences of wa with the first
function thematic and examples of wa with the second function contrastive.1

(1) a. Thematic wa: “Speaking of ..., talking about ...”

John-wa gakusei desu.
John-TOP student is
‘Speaking of John, he is a student.’

b. Contrastive wa: “X ... but ... , as for X ...”

Ame-wa futte imasu ga, yuki-wa futte imas-en.
rain-TOP falling is but snow-TOP falling is-NEG
‘It is raining, but it is not snowing’ (cf. Kuno 1973:38)

In section 2, I address the question of whether prosody can express pragmatic
information in Japanese. In particular, I examine whether intonational patterns
influence the information structure created by the topic marker in a significant way. I
conducted an experiment to examine the pitch contours of sentences with the
thematic or the contrastive wa and I show that the two functions are realized by

C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 177–193.
© 2007 Springer.
different contours. Section 3 shows that the two functions of wa which are realized
by different intonational patterns are tied to different scope interpretations. This
correlation between the pragmatic functions of wa and their scope interpretations
can be formalized by applying Büring’s (1997) Alternative Semantics Approach. In
the last section, accepting a widely taken view that scope is expressed in syntax at
LF, I claim that the thematic wa and the contrastive wa must be syntactically
different at least at LF.

2. THE PROSODY-PRAGMATICS INTERFACE2

In the introduction, we saw that the topic marker wa has two interpretations: the
thematic wa and the contrastive wa. In this section, based on an experiment, I claim
that this difference in interpretation can be conveyed by different prosodic patterns.

2.1. Basics of Phonetics/Phonology in Japanese

Japanese has a pitch-accent system, where some words can be distinguished only by
accent. The location of the accent corresponds to the mora before the pitch drop, i.e.,
the accent is on the H immediately before L.3 Accent is a lexical property of some
morphemes; underlying accents are modified by rules of word-level phonology
(McCawley 1968, Haraguchi 1977, Poser 1984).4

(2) ŉ ňŉ ň
a. kaki-ga b. kaki-ga c. kaki-ga
oyster-NOM fence-NOM persimmon-NOM

Furthermore, as first claimed in Poser (1984), Japanese has Downstep (cf.
Pierrehumbert and Beckman 1988 and Kubozono 1993 for further discussions).
Downstep is the reduction in pitch range following an accented syllable, as
schematised in (3).

(3)

Downstep applies within the phonological domain of so-called ‘major phrases’.5
Poser (1984) claims that “the topic phrase (marked by the particle wa) is generally
set off from the rest of the sentence by a major phrase boundary, as indicated by the
fact that it seems to have no effect on the following material” (1984:101). The
dotted line in (4) expresses an expected Downstep, and the solid line expresses an
actual pitch contour.

(4)

However, Poser did not distinguish between thematic wa and contrastive wa.6 The
question is whether the prosodic patterns of thematic wa and of contrastive wa are
the same, which I explore in the next subsection.

2.2. Prosodic Patterns of Wa

An experiment was conducted to answer the question of whether the thematic and
the contrastive wa are prosodically distinct. First, I constructed examples with the
thematic wa and also with the contrastive wa. For simplicity, the structure of the
examples was ‘Subject-wa Predicate’. Both the subject and the verb were accented, were at least
three moras long, and had voiced segments in their accented syllables.7 The following are the
examples used for this experiment:

(5) a. Thematic Wa
ŉ ň ŉ
Naoya-wa nonbiri-si-teiru.8
Naoya-TOP relax-do-PROG
‘Naoya is relaxing.’

b. Contrastive Wa
ŉ ň ŉ ň ŉ ň ŉ
Naoya-wa nonbiri-si-teiru ga Maria-wa nonbiri-si-tei-nai.
Naoya-TOP relax-do-PROG but Maria-TOP relax-do-PROG-NEG
‘Naoya is relaxing, but Maria is not relaxing.’

Five native speakers of Japanese participated in this experiment (3 males, 2
females). The participants were provided with two cards on which the sentence in
(5a) or in (5b) was written. They were asked to read sentences on each card aloud
five times with an interval of a few seconds between each sentence. They were also
asked to read sentences at a natural speed without any pause during each sentence.
To see the prosodic patterns, I measured the fundamental frequency (F0). F0 is an
acoustic correlate of the psycho-acoustic percept of pitch of the voice. Specifically, I
measured the value of the F0 peak immediately before and after wa. P1 is the value
of the F0 peak immediately before wa, and P2 is the value of the F0 peak
immediately after wa. In the examples in (5), P1 is at ‘na’ in Naoya and P2 is at ‘n’
in nonbiri.
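
Purely as an illustration, the measurement of P1 and P2 can be sketched in Python. The sketch is not the procedure or software used in the experiment. It assumes a hypothetical F0 track given as (time, Hz) pairs with unvoiced frames already removed, a known time interval for wa, and a simple ‘Subject-wa Predicate’ sentence in which the highest F0 value on each side of wa is the relevant peak. The function name, the data layout, and all sample values are invented for the example.

```python
from typing import List, Tuple

# An F0 track as (time in seconds, F0 in Hz) samples; unvoiced frames are omitted.
F0Track = List[Tuple[float, float]]


def peaks_around_wa(track: F0Track, wa_start: float, wa_end: float) -> Tuple[float, float]:
    """Return (P1, P2): the highest F0 before wa and the highest F0 after wa.

    Assumption: the sentence has the shape 'Subject-wa Predicate', so the
    maximum on each side of wa stands in for the peak adjacent to wa.
    """
    before = [f0 for t, f0 in track if t < wa_start]
    after = [f0 for t, f0 in track if t > wa_end]
    if not before or not after:
        raise ValueError("need voiced samples on both sides of wa")
    return max(before), max(after)


# Hypothetical samples, invented for illustration only (not experimental data).
track = [
    (0.05, 118.0), (0.10, 127.0), (0.20, 121.0),  # subject
    (0.30, 110.0),                                # wa
    (0.40, 116.0), (0.50, 129.0), (0.60, 112.0),  # predicate
]
p1, p2 = peaks_around_wa(track, wa_start=0.28, wa_end=0.35)
```

For a longer utterance such as (5b), the same function would have to be applied clause by clause, since a later clause may contain a higher peak than the one that immediately follows wa.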

(6)

The following patterns are found: When wa is thematic, P2 is either slightly
higher than P1, or slightly lower than P1. Overall, P1 and P2 are about the same
value. When wa is contrastive, on the other hand, P2 is much lower than P1.9 The
contours in Figure 1 indicate typical patterns for thematic and contrastive wa.

Figure 1. F0 contours of thematic wa and contrastive wa10

Above: Thematic wa [P1: 127.7Hz, P2: 129.9Hz]
Below: Contrastive wa [P1: 159.7Hz, P2: 91.0Hz]
(Participant KO: male)

The distribution of the thematic and the contrastive cases for the five participants
is given in Figure 2. The X-axis indicates the value of P1 (Hz), and the Y-axis
indicates the value of P2 (Hz). As can be seen in the figure, the thematic cases
cluster around or above the P1 = P2 line; that is, P1 and P2 are roughly equal
or P1 is lower than P2. The contrastive cases, on the other hand, fall mostly
below the P1 = P2 line, indicating that P1 is higher than P2.

Figure 2. Distribution of thematic wa and contrastive wa
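
The position of a token relative to the P1 = P2 line can be made concrete with a short Python sketch. It expresses the fall from P1 to P2 in semitones, 12 * log2(P1/P2), and applies this to the two pairs of values reported under Figure 1. The semitone scale, the function names, and the 2-semitone tolerance are assumptions introduced here for illustration only; the text compares raw Hz values and states no threshold.

```python
import math


def semitone_drop(p1_hz: float, p2_hz: float) -> float:
    """Fall from P1 to P2 in semitones; positive when P2 is lower than P1."""
    return 12.0 * math.log2(p1_hz / p2_hz)


def clearly_below_diagonal(p1_hz: float, p2_hz: float, tolerance_st: float = 2.0) -> bool:
    """True when the token lies well below the P1 = P2 line.

    The tolerance is a hypothetical value chosen for this example.
    """
    return semitone_drop(p1_hz, p2_hz) > tolerance_st


# (P1, P2) in Hz for participant KO, as reported under Figure 1.
figure1 = {"thematic": (127.7, 129.9), "contrastive": (159.7, 91.0)}

drops = {label: semitone_drop(p1, p2) for label, (p1, p2) in figure1.items()}
flags = {label: clearly_below_diagonal(p1, p2) for label, (p1, p2) in figure1.items()}
```

On these two tokens the thematic case comes out as a rise of well under one semitone and the contrastive case as a fall of more than nine semitones, so only the contrastive token is flagged as lying below the line.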

2.3. Summary
In sum, the difference between thematic wa and contrastive wa is reflected in
different F0 patterns. That is, intonation can distinguish two pragmatic functions of
wa. Thus, intonational patterns are used in a significant way to convey different
pragmatic information.

3. THE PROSODY-SEMANTICS INTERFACE

In this section, I first show that prosodic patterns have some effects on quantifier
scope interpretations. In particular, the two prosodic patterns above distinguish two
different scope readings of wa with respect to negation. I argue that two pragmatic
functions of wa which are expressed by different prosodic patterns yield different
scope interpretations. The correlation between pragmatics and scope interpretations
can be captured by Büring’s (1997) Alternative Semantics Approach. It follows that
pragmatic information influences semantic interpretations by way of prosody.

3.1. The Topic Marker Wa and Quantifier Scope

There are two sets of data that I intend to show here: First, a universal quantifier
with wa exhibits scope interactions with negation, yielding scope ambiguity. Second,
the thematic wa and the contrastive wa correspond to different scope readings.

3.1.1. Interaction with Negation

In Japanese, it is claimed that a sentence with a universal quantifier followed by wa
shows scope ambiguity (Kato 1988, among others). For example, in (7), the
universal quantifier with wa in the subject position can take either wide or narrow
scope with respect to the negation, which appears in a verbal inflection.

(7) Minna-wa ne-nakat-ta.11
everyone-TOP sleep-NEG-PAST
‘Everyone didn’t sleep.’
√Total negation: ∀>¬ (No one slept.)
√Partial negation: ¬>∀ (It is not the case that everyone slept,
i.e. There is someone who didn’t sleep.)

Given this scope ambiguity, a question to be addressed is the following: Is there
a mapping between these two scope readings and the two prosodic patterns of wa
that we saw in the previous section? I show in the next subsection that this is the
case.

3.1.2. Prosodic Patterns and Quantifier Scope

We saw in the previous section that the two functions of wa are realized in different
intonational patterns: in sentences with the thematic wa, P1 is about as high as P2 or
somewhat lower, whereas, in contrastive cases, P1 is much higher than P2.
These results are schematised in (8).

(8) a. Thematic wa b. Contrastive wa

P1 wa P2 P1 wa P2

First, I read aloud the sentence in (7) using the two prosodic patterns in (8) and
tape-recorded it. Actual F0 contours are shown in Figure 3.

[Two F0 panels for Minna wa ne-nakat-ta (everyone TOP sleep-NEG-PAST), with P1 and P2 marked. Axes: F0 from 50 to 300 Hz; Time (s) from 0 to 1.37113 (above) and from 0 to 1.32972 (below).]

Figure 3. F0 contours of thematic wa and contrastive wa12

Above: Thematic wa [P1: 251.13Hz, P2: 251.79Hz]
Below: Contrastive wa [P1: 247.47Hz, P2: 174.52Hz]

Second, four Japanese informants (2 males, 2 females) were asked to listen to the
recordings and to judge whether each prosodic pattern corresponds to one of the two scope
interpretations. They all agreed that the prosodic pattern of thematic wa corresponds
to the ∀>¬ reading, whereas the pattern of contrastive wa corresponds to the ¬>∀
reading.

(9) a. Prosodic pattern of thematic wa --- ∀>¬ reading

b. Prosodic pattern of contrastive wa --- ¬>∀ reading

Thus, I conclude that two prosodic patterns correspond to different scope
interpretations. The next question to be addressed is why there is such a correlation,
which is discussed below.

3.2. Alternative Semantics Approach to Wa and Quantifier Scope

In this section, I show that the correlation between prosody and scope interpretations
can be captured by Büring’s (1997) Alternative Semantics Approach.

3.2.1. Büring’s (1997) Alternative Semantics Approach to German Scope Inversion

Let us first introduce Büring’s (1997) Alternative Semantics Approach to German
scope inversion. In German, it is claimed that a rise-fall accent contour has a
disambiguating effect with respect to scope interpretations (Féry 1993, Büring 1997,
and Krifka 1998, among others). For example, in (10a), a sentence with a universal
quantifier as a subject and a negation is scopally ambiguous. However, as in (10b),
when the subject is prosodically marked with a rising pitch accent (/) and the
negation is marked with a falling accent (\), only the ¬>∀ reading is available.

(10) a. Alle Politiker sind nicht korrupt.
all politicians are not corrupt
√∀>¬ (For all politicians, it is not the case that they are corrupt.)
√¬>∀ (It is not the case that all politicians are corrupt.)
b. / ALLE Politiker sind NICHT \ korrupt.
all politicians are not corrupt
*∀>¬, √¬>∀

Büring (1997) assumes that each sentence S derives three different semantic
objects, that is, the ordinary semantic value [[S]]o, the Focus value [[S]]f, and the Topic
value [[S]]t. The first two values are defined by Rooth (1985): According to Rooth,
the ordinary value is a proposition and the Focus value is a set of propositions. What
is new to Büring is that a Topic as well as a Focus evokes alternatives. In particular,
the Topic value is a set of sets of propositions. He claims that the Topic accent
marks a deviation from the original Discourse Topic: an element marked with the
Topic accent is interpreted as a sentence internal topic such as a contrastive topic.
Let us examine an actual example in (11), which includes a contrastive topic. In (11),
with the Topic accent, a topic is interpreted as contrastive, and thus evokes
alternatives. Following Büring, I represent Topic and Focus marking by using
subscripted brackets, [ ]T and [ ]F, respectively.

(11) Q: Which book would John buy?

A: [I]T would buy [The Hotel New HAMPshire]F.

a. [[(11Q)]]o = which book would John buy

b. [[(11A)]]f = {I would buy War and Peace, I would buy The Hotel
New Hampshire, I would buy Harry Potter, …}

c. [[(11A)]]t = {{I would buy War and Peace, I would buy The
Hotel New Hampshire, I would buy Harry Potter,
…},
{John would buy War and Peace, John would buy
The Hotel New Hampshire, John would buy Harry
Potter, …},
{Tom would buy War and Peace, Tom would buy
The Hotel New Hampshire, Tom would buy Harry
Potter, …}, … }

Büring further introduces the notion of Residual Topic, which is a set of disputable
propositions induced by the Topic. The definitions are given in (12) and (13).

(12) a. Given a question-answer sequence QA, [[Q]]o must be an element of
[[A]]t.
b. Given a sentence A containing Topic, there must be at least one
disputable element in [[A]]t after uttering A.

(13) Disputability: A set of propositions P is disputable wrt a set of worlds
CG (the Common Ground) if there is at least one element p in P such
that both p and ¬p could informatively and coherently be added to CG
(cf. Stalnaker 1978:325). (Büring 1997:178)

The example in (11) satisfies the requirements in (12): [[(11Q)]]o is an element of
[[(11A)]]t and there is at least one Residual Topic, e.g., which book would John buy?
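
The structure of the three values in (11) can be sketched in Python as nested sets. This is only a toy rendering under several assumptions made for the example: propositions are represented as plain strings, the alternatives are restricted to the three books and three individuals listed in (11), and the question in (11Q) is modelled as the set of its possible answers. The function names are hypothetical.

```python
from typing import FrozenSet, Iterable

Proposition = str  # toy stand-in for a proposition


def focus_value(subject: str, books: Iterable[str]) -> FrozenSet[Proposition]:
    """Focus value: a set of propositions, varying the focused book."""
    return frozenset(f"{subject} would buy {book}" for book in books)


def topic_value(subjects: Iterable[str], books: Iterable[str]) -> FrozenSet[FrozenSet[Proposition]]:
    """Topic value: a set of sets of propositions, varying the topic as well."""
    return frozenset(focus_value(subject, books) for subject in subjects)


books = ["War and Peace", "The Hotel New Hampshire", "Harry Potter"]
subjects = ["I", "John", "Tom"]

f_value = focus_value("I", books)       # corresponds to (11b)
t_value = topic_value(subjects, books)  # corresponds to (11c)

# (11Q), modelled as the set of its possible answers (an assumption of this sketch).
question_about_john = focus_value("John", books)

# Condition (12a): the question meaning is one member of the Topic value.
satisfies_12a = question_about_john in t_value
```

Using frozenset for both levels keeps the inner sets hashable, so that the Topic value can itself be a set whose members are Focus-value-like sets.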
With these semantic tools, Büring (1997) accounts for the unambiguity of
sentences with a rise-fall contour in German. The example is repeated in (14).

(14) [Alle]T Politiker sind [nicht]F korrupt.
all politicians are not corrupt
*∀>¬, √¬>∀

Büring assumes that the sentences are structurally ambiguous by LF at the latest.
The rise-fall contour gives rise to implicatures that differ between the two
LF representations. The unavailable reading is ruled out because its LF
representation does not yield reasonable implicatures. Thus, the LF for the ∀>¬
reading in (14) does not have reasonable implicatures, whereas the LF for the ¬>∀
reading does. His analysis for these two readings is summarized below.
First, let us discuss the ¬>∀ reading. As shown in (15), there are Residual
Topics: if not all politicians are corrupt, are there corrupt politicians at all? If so,
how many? Thus, this reading is available. In the following, non-disputable
propositions are crossed out.

(15) a. [[(14)]]o = it is not that all politicians are corrupt
b. [[(14)]]f = {all politicians are corrupt, it is not that all politicians
are corrupt}
c. [[(14)]]t = {{all politicians are corrupt, it is not that all politicians
are corrupt},
{most politicians are corrupt, it is not that most
politicians are corrupt},
{some politicians are corrupt, it is not that some
politicians are corrupt},
{no politicians are corrupt, it is not that no politicians
are corrupt}}
The ∀>¬ reading is, on the other hand, unavailable because there is no Residual
Topic: if all politicians are such that they are not corrupt, then it is true that most
politicians are such that they are not corrupt, and it is also true that some politicians
are such that they are not corrupt. Other elements of the sets express a contradiction.

(16) a. [[(14)]]o = all politicians are such that they are not corrupt
b. [[(14)]]f = {all politicians are such that they are not corrupt, all
politicians are such that they are corrupt}
c. [[(14)]]t = {{all politicians are such that they are not corrupt, all
politicians are such that they are corrupt},
{most politicians are such that they are not corrupt,
most politicians are such that they are corrupt},
{some politicians are such that they are not corrupt,
some politicians are such that they are corrupt},
{no politicians are such that they are not corrupt, no
politicians are such that they are corrupt}}
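
The contrast between (15) and (16) can be checked mechanically on a toy model. The following Python sketch is an illustration under stated assumptions, not Büring's formalization: the domain contains three hypothetical politicians, a world is identified with the set of politicians who are corrupt in it, 'most' is read as 'more than half', and the Common Ground contains nothing beyond the assertion itself. Disputability follows (13): a set of propositions is disputable if at least one member is true in some worlds of the Common Ground and false in others. All names in the code are invented for the example.

```python
from itertools import combinations

POLITICIANS = ("a", "b", "c")  # hypothetical three-member domain
N = len(POLITICIANS)

# A world is identified with the set of politicians who are corrupt in it.
WORLDS = frozenset(
    frozenset(group) for size in range(N + 1) for group in combinations(POLITICIANS, size)
)

# Quantifier alternatives as conditions on a count k out of n individuals.
QUANTIFIERS = {
    "all": lambda k, n: k == n,
    "most": lambda k, n: 2 * k > n,  # assumption: 'most' means more than half
    "some": lambda k, n: k >= 1,
    "no": lambda k, n: k == 0,
}


def prop(quantifier, corrupt=True):
    """Worlds where `quantifier` politicians are corrupt (or are not corrupt)."""
    def holds(world):
        count = len(world) if corrupt else N - len(world)
        return QUANTIFIERS[quantifier](count, N)
    return frozenset(world for world in WORLDS if holds(world))


def negate(p):
    """Sentential negation: the complement of p within the set of worlds."""
    return WORLDS - p


def disputable(question, cg):
    """(13): some p in the set is true in part of CG and false in the rest."""
    return any(bool(cg & p) and bool(cg - p) for p in question)


def residual_topics(reading):
    """Quantifier alternatives whose question is still open after asserting (14)."""
    if reading == "neg>all":
        asserted = negate(prop("all"))
        topic = {q: {prop(q), negate(prop(q))} for q in QUANTIFIERS}
    else:  # "all>neg"
        asserted = prop("all", corrupt=False)
        topic = {q: {prop(q, corrupt=False), prop(q)} for q in QUANTIFIERS}
    cg = asserted  # Common Ground after the assertion, with no prior information
    return sorted(q for q, question in topic.items() if disputable(question, cg))


open_under_neg_all = residual_topics("neg>all")
open_under_all_neg = residual_topics("all>neg")
```

In this model the ¬>∀ reading leaves the 'most', 'some', and 'no' questions open, whereas the ∀>¬ reading reduces the Common Ground to the single world in which nobody is corrupt, so that no member of the Topic value remains disputable.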

3.2.2. Alternative Semantics Approach to the Japanese Data

In this section, I account for the above Japanese data, which indicate a
correspondence between prosodic patterns and scope readings.
We saw that Büring (1997) captures German scope data by using its pragmatic
information, which is realized by a special prosodic pattern. I interpret this approach
in the following way: A sentence that conveys certain pragmatic information
corresponds to a certain scope reading, i.e., there is a one-to-one correspondence
between pragmatic information and scope interpretation. Pragmatic information can
be expressed by a certain prosodic pattern. For example, German rise-fall pitch
marks topic and focus. Büring’s approach captures a direct relation between
pragmatics, which is expressed by a certain prosody, and semantics. I apply this
approach to the Japanese data: Rather than examining the relation between prosody
and semantics, I examine the relation between pragmatics and semantics. That is, in
the relevant Japanese data, the thematic wa corresponds to the ∀>¬ reading,
whereas the contrastive wa corresponds to the ¬>∀ reading. Indeed, this approach
can account for the Japanese data.
First, I consider the correspondence between the thematic wa and the ∀>¬
reading, which is shown in (17).

(17) Thematic wa
Minna-wa ne-nakat-ta.
everyone-TOP sleep-NEG-PAST
‘Everyone didn’t sleep.’
√∀>¬, *¬>∀

Kuno (1973) claims that, when the topic marker wa is interpreted as a theme, the
element to which wa attaches must be either anaphoric or generic. If an element is
‘anaphoric’, it should have an antecedent in a previous context. In this sense, an
anaphoric element is definite. A ‘generic’ element does not have an antecedent, and
it denotes something that holds regardless of time or place of the utterance. Minna
‘everyone’ in (17) cannot be generic, because it is a subject of an eventive predicate,
which does not hold for a general time or place. Thus, minna ‘everyone’ in (17)
must be anaphoric. It is independently known that anaphoric definite elements do
not enter into a scopal relation with other scope-bearing elements in a sentence
(Fodor and Sag 1982, for example). In other words, anaphoric definite elements are
said to take the widest scope reading not because they take scope over other
elements by syntactic mechanisms such as quantifier raising (May 1985), but
because they are scopeless. For this reason, in (17), the universal quantifier with the
thematic wa has a wide scope interpretation only. In this way, the scope
interpretation of the sentence can be accounted for by its pragmatic information.
Let us move on to the contrastive wa. The correspondence between the
contrastive wa and the ¬>∀ reading can be straightforwardly captured by applying
Büring’s (1997) framework. Following Büring, I assume that the contrastive wa
always evokes alternatives. The question is where a Focus falls in sentences with
contrastive wa. Consider a possible context for a sentence with contrastive wa given
in (18). Note that (18b) is uttered using the intonational pattern for contrastive wa,
where P1 is much higher than P2.

(18) a. John-wa ne-ta?
John-TOP sleep-PAST
‘Did John sleep?’

b. Iie, John-wa ne-nakat-ta.
no John-TOP sleep-NEG-PAST
‘No, John didn’t sleep (but someone else slept).’
In the above context, the sentence ‘John didn’t sleep’ with contrastive wa implicates
that there is someone else who slept. Following Büring, the contrastive topic evokes
alternatives in that ‘John’ is contrasted with ‘someone else’, say, Mary and Bill. In
addition, alternatives are evoked with respect to the polarity of the predicate, i.e., ‘didn’t
sleep’ and ‘slept’.13 As stated earlier, the general function of Focus is to evoke
alternatives (Rooth 1985). Since the polarity of the predicate here evokes alternatives,
we can consider it as a Focus.14 Thus, a Topic and a Focus in (18b) are assigned in
the way shown in (19).
(19) [John-wa]T ne-[nakat]F-ta.
John-TOP sleep-NEG-PAST
‘John didn’t sleep.’
a. [[(19)]]o = John didn’t sleep
[[(19)]]f = {John didn’t sleep, John slept}
[[(19)]]t = {{John didn’t sleep, John slept},
{Mary didn’t sleep, Mary slept},
{Tom didn’t sleep, Tom slept}, … }
b. Residual Topic: Did Mary sleep?
With the above Topic-Focus assignment, Büring’s approach should
straightforwardly apply to the scope of contrastive wa with respect to negation.

(20) Contrastive wa
[Minna-wa]T ne-[nakat]F-ta.
everyone-TOP sleep-NEG-PAST
‘Everyone didn’t sleep.’
*∀>¬, √¬>∀
We can see that the example in (20) and the German rise-fall example discussed in
(14) above have the same Topic-Focus assignments. It follows that Büring’s
analysis for German should apply to the Japanese example. The ¬>∀ reading in (20)
is available because there are Residual Topics: if not everyone slept, is there anyone
who slept at all? If so, how many?

(21) a. [[(20)]]o = it is not that all people slept
b. [[(20)]]f = {all people slept, it is not that all people slept}

c. [[(20)]]t = {{all people slept, it is not that all people slept},
{most people slept, it is not that most people slept},
{some people slept, it is not that some people slept},
{no one slept, it is not that no one slept}}

The ∀>¬ reading in (20) is, on the other hand, unavailable because there is no
Residual Topic: if all people are such that they didn’t sleep, then, it is true that most
people are such that they didn’t sleep, and it is also true that some people are such
that they didn’t sleep. Other elements of the sets express a contradiction.

(22) a. [[(20)]]o = all people are such that they didn’t sleep
b. [[(20)]]f = {all people are such that they didn’t sleep,
all people are such that they slept}
c. [[(20)]]t = {{all people are such that they didn’t sleep, all people
are such that they slept},
{most people are such that they didn’t sleep, most
people are such that they slept},
{some people are such that they didn’t sleep, some
people are such that they slept},
{no people are such that they didn’t sleep, no people are
such that they slept}}

Thus, the scope interpretation of the sentence with contrastive wa as well as
thematic wa can be accounted for by its pragmatic information.

3.3. Summary
In Japanese, sentences with negation and the topic marker wa are subject to scope
ambiguity. I first showed that the two different prosodic patterns correspond to
different scope readings. In other words, the two pragmatic functions of wa
expressed by different prosodic patterns correspond to different scope interpretations.
I examined a correspondence between pragmatic functions of wa and scope
interpretations based on Büring (1997). The thematic wa corresponds to one reading
and the contrastive wa to the other. In other words, the scope ambiguity arises
because wa has two pragmatic functions.

4. DISCUSSION
In this paper, I presented two sets of empirical data: First, the two pragmatic
functions of the topic marker, that is, theme and contrast, are realized by different F0
patterns. Second, these two prosodic patterns correspond to two different scope
interpretations. The relevant findings are summarized in Table 1 below.

Table 1. Prosody, pragmatics, and semantics of the topic marker

Pragmatics    Thematic wa            Contrastive wa
Prosody       P1 is as high as P2    P1 is higher than P2
Semantics     √∀>¬, *¬>∀             *∀>¬, √¬>∀

Different prosodic patterns are used to make pragmatic distinctions between theme
and contrast. Those pragmatic distinctions, which are realized by distinct prosodic
patterns, are correlated with different scope readings. This correlation between
pragmatics and semantics is not arbitrary. As formalized in section 3, the correlation
between pragmatic functions of wa and scope readings can be captured by Büring’s
Alternative Semantics Approach, which uses a direct relation between pragmatics
and semantics. In this way, three properties of the topic marker, i.e., prosodic
patterns, pragmatic functions, and scope readings, are coherently related to each
other.
Finally, I would like to briefly address the question that many previous studies in
Japanese linguistics have discussed: Should the thematic wa and the contrastive wa
be distinguished in syntax? Some previous studies claim that they need not be
distinguished in syntax (Mihara 1996, for example). For these studies, theme and
contrast might be merely different in pragmatic interpretation, not syntactically
different. Others claim that they should (Hoji 1985, Saito 1985, Tateishi 1994, for
example). Their claim is based on the argument that two kinds of wa are base-
generated in different positions in a syntactic structure. For example, Tateishi (1994)
shows that the thematic wa violates Subjacency, whereas the contrastive wa obeys it.
This is because the thematic and the contrastive wa are base-generated in different
positions. The current study does not say anything about where the two kinds of wa
are base-generated. However, it shows that they have different syntax at least at LF,
since they correspond to different scope readings, which are expressed by syntactic
structures at LF. I interpret this fact as a piece of evidence that the thematic and the
contrastive wa should be distinguished in the syntax.

University of Pennsylvania

5. NOTES

* I would like to thank Mark Liberman, Bill Poser, Satoshi Tomioka, and Jennifer Venditti for valuable
discussions and their insights. I am also grateful to Daniel Büring, Elsi Kaiser, and Kazuaki Maeda.
Thanks are also due to the audience at the Topic, Focus and Intonation Workshop (University of California,
Santa Barbara, July, 2001).
1 To be precise, the distinction between the thematic and the contrastive wa is valid only when the topic
marker is attached to the subject in canonical word order, which is S-O-V. When the topic marker is
attached to the object in canonical position, the object is exclusively interpreted as a contrastive element,
as shown in (i).
(i) John-ga ringo-wa tabe-ta.
John-NOM apple-TOP eat-PAST
‘John ate apples (but there was some food that he didn’t eat).’
For this reason, I only consider the examples where wa is attached to a canonical subject.
2 An earlier version of this section was presented in Nakanishi (2002).
3 See Haraguchi (1999) for a recent survey of the Japanese pitch accent system.
4 ‘ň’ marks a low-high sequence of pitch, and ‘ŉ’ marks a high-low sequence of pitch.
5 For how to determine Major Phrases, see Selkirk and Tateishi (1991).
6 Finn (1984) claimed that thematic wa and contrastive wa were differentiated by pauses as well as
fundamental frequency (F0). Her claim is based on experimental studies in which she measured the peak
of F0 contours before wa and the valley F0 of wa, and also the pause between wa and the following word.
Her experimental methods, however, are problematic; unfortunately, I do not have space to discuss them
here.
7 Voiced segments exhibit smoother F0 contours than other consonants, without being disturbed much by
segmental effects.
8 The predicate nonbiri-si-teiru can describe either the current state ‘be relaxing’ or the permanent state
‘be laid-back’. In other words, it can be either a stage-level or an individual-level predicate (Carlson
1977). To avoid possible prosodic effects of this ambiguity, the participants were informed that the
sentences used in the experiment mean ‘be relaxing’, not ‘be laid-back’.
9
For the contrastive case, a question arises as to whether the low value of P2 is a result of Downstep or a
reduction of range for another reason. The result of the experiment suggests that the drop of P2 is not due
to Downstep, since the difference between P1 and P2 is much larger than the case of Downstep. I thank
Jennifer Venditti for discussions of this issue.
10
The first and second arrows in F0 contours indicate P1 and P2, respectively.
11
The negative morpheme is just -na, as we can see in forms such as -na-i ‘-NEG-PRES’. The status of a
suffix after negation -kat (or arguably, and certainly historically, -kar) is admittedly a problem. Bill Poser
(p.c.) pointed out to me that, synchronically -kat has to be analyzed as obligatorily affixed to adjectives
when certain suffixes, such as -ta ‘-PAST’, are added. Following Poser’s suggestion, for the purpose of
this study, I assume that -nakat is a suppletive form of the negative required by suffixes like -ta.
12
The first and second arrows in F0 contours indicate P1 and P2, respectively.
13
Related to this is Noda’s (1996) claim that, when a sentence with the contrastive wa is conjoined with
another sentence, the predicates of these two sentences tend to express opposite states. For example, in
(i) below, the predicate didn’t sleep is most naturally conjoined with the opposite predicate slept.
(i) John-wa ne-nakat-ta-ga Mary-wa ne-ta.
John-TOP sleep-NEG-PAST-but Mary-TOP sleep-PAST
‘John didn’t sleep, but Mary slept.’
14
In Japanese, the negation is a morpheme attached to a verb. For this reason, it seems impossible for the
negation alone to be accented. Thus, I assume that, although the negation is a Focus, it does not have a
special prosodic pattern as in German, where the focused negation is realized with a falling accent.

6. REFERENCES
Bolinger, Dwight. Forms of English: Accent, Morpheme, Order. Cambridge: Harvard University Press,
1965.
Büring, Daniel. “The Great Scope Inversion Conspiracy.” Linguistics and Philosophy 20 (1997):
175−194.
Carlson, Gregory. Reference to Kinds in English. Ph.D. dissertation, University of Massachusetts,
Amherst, 1977. [New York: Garland, 1980].
Féry, Caroline. German Intonational Patterns. Tübingen: Niemeyer, 1993.
Finn, A.N. “Intonational accompaniments of Japanese morphemes wa and ga.” Language and Speech
27:1 (1984): 47−57.
Fodor, Janet, and Ivan Sag. “Referential and Quantificational Indefinites.” Linguistics and Philosophy 5
(1982): 355−398.
Halliday, M.A.K. “Notes on Transitivity and Theme in English, Part II.” Journal of Linguistics 3 (1967):
199−244.
Haraguchi, Shosuke. The Tone Pattern of Japanese: An Autosegmental Theory of Tonology. Tokyo:
Kaitakusha, 1977.
Haraguchi, Shosuke. “Accent.” In N. Tsujimura (ed.), An Introduction to Japanese Linguistics, pp. 1−61.
Cambridge: Blackwell, 1999.
Hirst, Daniel, and A. Di Cristo. Intonation Systems: A Survey of Twenty Languages. Cambridge:
Cambridge University Press, 1998.
Hoji, Hajime. Logical Form Constraints and Configurational Structures in Japanese. University of
Washington: Doctoral dissertation, 1985.
Jackendoff, Ray. Semantic Interpretation in Generative Grammar. Cambridge: MIT Press, 1972.
Kato, Yasuhiko. “Negation and the Discourse-Dependent Property of Relative Scope in Japanese.”
Sophia Linguistica (1988): 23−24.
Krifka, Manfred. “Scope Inversion under Rise-Fall Contour in German.” Linguistic Inquiry 29:1 (1998):
75−112.
Kubozono, Haruo. The Organization of Japanese Prosody. Tokyo: Kuroshio, 1993.
Kuno, Susumu. The Structure of the Japanese Language. Cambridge: MIT Press, 1973.
Ladd, D. Robert. Intonational Phonology. Cambridge: Cambridge University Press, 1996.
Lambrecht, Knud. Information Structure and Sentence Form: Topic, Focus and the Mental
Representations of Discourse Referents. Cambridge: Cambridge University Press, 1994.
May, Robert. Logical Form: Its Structure and Derivation. Cambridge: MIT Press, 1985.
McCawley, James. The Phonological Component of a Grammar of Japanese. The Hague: Mouton, 1968.
Mihara, Ken-ichi. Nihongo-no Toogo Koozoo [Syntactic Structures in Japanese]. Tokyo: Syohakusya,
1996.
Nakanishi, Kimiko. “Prosody and Information Structure in Japanese: a Case Study of Topic Marker wa.”
Japanese/Korean Linguistics 10 (2002): 434−447. Stanford: CSLI.
Noda, Hisashi. Wa to Ga [Wa and Ga]. Tokyo: Kuroshio Syuppan, 1996.
Pierrehumbert, Janet, and Mary Beckman. Japanese Tone Structure. Cambridge: MIT Press, 1988.
Poser, William. The Phonetics and Phonology of Tone and Intonation in Japanese. MIT: Doctoral
dissertation, 1984.
Rooth, Mats. Association with Focus. University of Massachusetts, Amherst: Doctoral dissertation, 1985.
Saito, Mamoru. Some Asymmetries in Japanese and their Theoretical Implications. MIT: Doctoral
dissertation, 1985.
Selkirk, Elizabeth and Koichi Tateishi. “Syntax and Downstep in Japanese.” In C. Georgopoulos and R.
Ishihara (eds.), Interdisciplinary Approaches to Language: Essays in Honor of S.-Y. Kuroda, pp.
519−543. Dordrecht: Kluwer, 1991.
Stalnaker, Robert. “Assertion.” In P. Cole (ed.), Syntax and Semantics 9: Pragmatics, pp. 315−332. New
York: Academic Press, 1978.
Steedman, Mark. “Information Structure and the Syntax-Phonology Interface.” Linguistic Inquiry 31:4
(2000): 649−689.
Tateishi, Koichi. The Syntax of ‘Subjects’. Stanford: CSLI, 1994.
HO-HSIEN PAN

FOCUS AND TAIWANESE UNCHECKED TONES

Abstract. This study investigated how focus influences f0 contour and duration of Taiwanese lexical
tones. F0 and duration values were taken from pitch tracks and spectrograms generated from SVO
sentences with different focus conditions. The four focus conditions included a broad focus condition
with focus on the entire sentence, and three narrow focus conditions with narrow focus falling on the first,
second, and third words. Results of the duration data revealed that (1) the duration of narrow focused
syllables was longer than that of syllables in other focus conditions and (2) the duration of narrow focused syllables
varied as a function of their position within the phrase; penultimate focused syllables were longest.
Analysis of f0 minimum and maximum indicated that (1) f0 range of narrow focused syllables was
expanded and (2) together, mean f0 value and expansion of f0 range distinguish focus conditions.
Comparison between f0 and duration data showed that duration was more consistently used to distinguish
focus condition than f0 range and mean f0 value in Taiwanese.

1. INTRODUCTION
Focus, tone, and intonation are all manifested through fundamental frequency (f0)
contours and duration in Taiwanese. There is no one-to-one correspondence between
the surface acoustical realization and the deeper structure, nor do surface f0 contours
and duration directly reflect underlying features. To improve our understanding of
surface f0 and duration formation, the contribution of underlying global or local
factors to surface f0 and duration patterns must be investigated. The global factors
that contribute to f0 modulation can be divided into two categories, i.e. declination
and final lowering. The gradual decline of f0 over the course of an utterance is
called declination, while the f0 decline at the end of an utterance or phrase is called
final lowering (Liberman & Pierrehumbert, 1984; Pierrehumbert & Beckman, 1988;
Shih, 1988). Global effects also affect duration. For example, the duration of a
syllable varies according to the syllable’s position relative to a prosodic boundary.
Studies have shown that phrase-medial segments are shorter than those in phrase-initial
and phrase-final positions (Lindblom & Rapp, 1973). In addition to global
effects, f0 and duration are also affected by local factors such as tone and focus (Ho,
1976; Lin, 1988).
The contribution of tone to the duration of a tone-bearing unit has been observed
in languages such as Taiwanese. In Taiwanese, the rising tone is longer, and the
duration of checked syllables (CVC structures with final voiceless stops) is shorter
than the duration of unchecked syllables (CV or CVN structures) (Cheng, 1968,
1973; Lin, 1988). Focus also influences syllable duration. It was found that
narrow focus syllables are longer than broad focus syllables, which in
turn are longer than post-focus syllables (Jin, 1996; Xu, 1999).

C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 195–213.
© 2007 Springer.
Turning to f0, it was observed that local factors such as tone and focus both
affect the surface f0 pattern. There are unique intrinsic tonal targets that each
Taiwanese lexical tone possesses. These tonal targets determine the f0 height
(register), and f0 shape (contour) of tone bearing syllables. For example, a high level
tone has an intrinsic high and level f0 contour which coarticulates with surrounding
tones (Lin, 1988; Shih, 1988; Gandour et al., 1994; Xu, 1993, 1997).
The contribution of focus to surface f0 patterns has been reported in various
languages (Pierrehumbert, 1980; Cooper, Eady & Mueller, 1985; Eady & Cooper,
1986; Eady, Cooper, Klouda, Mueller & Lotts, 1986; Jin, 1996; Xu, 1999). Jin
(1996) found that in Mandarin the f0 range of narrow focus syllables was expanded.
In that study he varied the four lexical tones of the first two syllables (or words) in
sentences with the following structure, /___ miŋ15 nien15 liau15 yaŋ15/, ‘X is going
to the sanitarium next year.’ Each sentence was produced under four different focus conditions:
broad focus with focus on the entire sentence, and three narrow focus
conditions with focus placed on the first, second, or third word. Results
showed that (1) the duration of narrow focus syllables was the longest, (2) the f0
range of the narrow focus syllable was expanded, and (3) the f0 contours of the final
word in broad focus sentences were perceptually indistinguishable from those of narrow
focus final syllables.
Xu (1999) investigated local factors including tone and focus. He varied the
lexical tone values of the first three words in a sentence. Each of the three words could carry
any of the four Mandarin lexical tones, i.e. level, rising, falling, and falling rising tones. For the
sentence /mao55 mi35 mai51 mao55 mi55/ ‘Cat fan sells kitty’, four questions were
asked to elicit production with broad focus, or narrow focus on word one, two, or
three. For example, when the question ‘What is kitty doing?’ was asked, the narrow
focus was appropriately produced on word three, /mai/, in the target sentence.
Results further confirmed that duration increased and f0 range was expanded for
narrow focus syllables in Mandarin.
A tone language with its clear specification of local tonal targets on each syllable
is suitable for studying the contribution of global and local effects on surface f0
realization and duration. This study followed the line of research on the influence of
local effects, i.e. tone and focus, on the surface f0 formation and syllable duration in
a tone language, by controlling the intonation of each utterance and its syntactic
composition, while varying lexical tone and focus condition (Jin, 1996; Xu, 1999).
Lexical tones in a tone language are contrastive in terms of f0 height, contour,
and duration. In Mandarin, there are four lexical tones, namely high level (55), low
rising (15), high falling (51), and falling rising tones (315). These four tones are
distinguished mainly through f0 shapes. Each Mandarin tone has its own distinctive
f0 contour not shared by other tones. However, little is known about tone
languages such as Taiwanese, in which tones are distinguished not only by f0 contour
but also by f0 height. There are seven lexical tones in Taiwanese, i.e. high
level (55), low rising (24), high falling (51), mid falling (21), mid level (33), high
falling checked (51), and mid falling checked tones (21), as shown in Table 1.
There are pairs of lexical tones that differ only in tonal height. For example, high
falling and mid falling tones differ only in their relative f0 levels, as do high and mid
level tones. Compared with Mandarin, Taiwanese has a richer tone inventory. This
study adds to the limited data on the realization of focus in such a tone language.

Table 1. Taiwanese lexical tones

Tone                    Value   Example   Gloss
High level              55      /kun/     ‘army’
Low rising              24      /kun/     ‘skirt’
High falling            51      /kun/     ‘boiling’
Mid falling             21      /kun/     ‘batons’
Mid level               33      /kun/     ‘near’
High falling checked    51      /kut/     ‘slippery’
Mid falling checked     21      /kut/     ‘plow’

The present study reports on how focus contributes to the realization of f0 and
syllable duration of lexical tones in Taiwanese. Words containing different lexical
tones were produced in short sentences that controlled for the global effects of
intonation and prosodic tonal grouping, while varying the local effects of tonal value
and focus pattern. The purpose of this study was to examine the surface realization
of f0 and duration of Taiwanese lexical tones under different focus conditions, with
attention drawn to the following issues: (1) the effect of narrow focus on duration, (2)
the effect of a narrow focus syllable’s position in an utterance on its duration, (3) the
effect of narrow focus on f0 range and (4) the influence of focus on tone height
between high and mid falling tones, and between high and mid level tones.

2. METHOD

2.1. Corpus
Each Taiwanese syllable has two different lexical tones, i.e. a juncture
(underlying) tone and a context (sandhi) tone. The surface realization of tonal values
depends on a syllable’s position in a tone group. When a syllable is located at the
end of a tone group, that is the juncture position and so the juncture (underlying)
tone surfaces. Any other syllables that are not last in a tone group carry a context
tone. The juncture and context tone values that each syllable possesses are recursive
in nature. For example, a syllable that surfaces with either the tones 55 or 24 at the
juncture position of a tone group has a context (sandhi) tone value 33 at non-
juncture positions. The context tone for a syllable with a juncture tone 33 is tone 21,
while the context tone for a syllable with a juncture tone 21 is tone 51. A syllable
with a juncture tone 51 would carry a tone 55 in a non-juncture position, as shown in
Table 2. It should be noted that tone 24 only surfaces at juncture positions, and not
in initial or medial positions of a tone group. The domain of the tone group
boundary is prosodically determined and closely related to syntactic structures in
Taiwanese (Cheng, 1968, 1973; Chan, 1987; Lin, 1988).
In the corpus, the sentence type was a statement with SVO structure. The tone
group boundary in these short sentences was located between the first and second
words. That is, the first word (first and second syllables) formed one tone group, while
the second word (third syllable) and third word (fourth and fifth syllables) formed
another tone group.

Table 2. Taiwanese tonal sandhi rules

Unchecked (juncture tone → context tone):
55 → 33    24 → 33    33 → 21    21 → 51    51 → 55

Checked: 53, 21

(1) [ σcntxt σjnctr ]tone group [ σcntxt σcntxt σjnctr ]tone group

According to tone sandhi patterns, the second and fifth syllables which are the
last syllables in these tone groups carried a juncture tone, while the first, third, and
fourth syllables carried context tones. Since a low rising tone is not a possible
context tone, it was not used in the third and fourth syllables, as shown in Table 3.
In the corpus, only sonorants were used as initial consonants to minimize
perturbation in vocal fold vibration in order to ensure smooth pitch tracks, as shown
in Table 3.
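The sandhi pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mapping covers the unchecked tones given in the text, and the names `CONTEXT_TONE`, `surface_tones`, and `surface_sentence` are hypothetical helpers, not part of the study's tooling.

```python
# Juncture (underlying) tone -> context (sandhi) tone for unchecked syllables,
# as stated in the text: 55 -> 33, 24 -> 33, 33 -> 21, 21 -> 51, 51 -> 55.
CONTEXT_TONE = {55: 33, 24: 33, 33: 21, 21: 51, 51: 55}


def surface_tones(tone_group):
    """Return the surface tones of one tone group, given its underlying tones.

    Every non-final syllable takes its context tone; the final (juncture)
    syllable keeps its underlying tone.
    """
    if not tone_group:
        return []
    *non_final, final = tone_group
    return [CONTEXT_TONE[tone] for tone in non_final] + [final]


def surface_sentence(tone_groups):
    """Apply surface_tones to each tone group of a sentence."""
    return [surface_tones(group) for group in tone_groups]


# Underlying tones taken from Table 3: subject /55 55/ in the first tone group,
# verb /51/ plus object /51 51/ in the second tone group.
print(surface_sentence([[55, 55], [51, 51, 51]]))  # [[33, 55], [55, 55, 51]]
```

The printed surface forms agree with the bracketed forms in Table 3: [33 55] for the subject, [55] for the verb, and [55 51] for the object.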
The subject, comprising the first and second syllables of the sentence, was a
surname. The first syllable of the subject was the diminutive morpheme /a- 55/. The
second syllable carried one of five juncture tones: high level (55), low rising (24),
high falling (51), mid falling (21), and mid level (33). The third syllable carried one of
four context tones: high level (55), high falling (51), mid falling (21), and mid level
(33). Since a low rising tone was not a possible context tone, only these four tones
were used in the third and fourth syllables. The fourth and fifth syllables formed the
object. The fourth syllable carried tone 55, 51, 21, or 33. The fifth
syllable was the diminutive affix /-a 51/. Since it was not possible to find an object
carrying high falling tones on both the fourth and fifth syllables (i.e., 51 51), the lexical
item /a 51 ´ŋ 33/ ‘duck egg’ was chosen for the object with a high falling tone in the
fourth syllable. Checked syllables were not investigated in this study.
Table 3. Tones and syllables used as corpus. Tones are in underlying form within // and
surface forms within [ ].

Tone   Word 1 (1st and 2nd syllables)   Word 2 (3rd syllable)        Word 3 (4th and 5th syllables)
55     /55 55/ [33 55] [a me]           /51/ [55] [lam] ‘hug’        /51 51/ [55 51] [liu a] ‘button’
24     /55 24/ [33 24] [a mõ]
51     /55 51/ [33 51] [a mã]           /21/ [51] [liam] ‘pinch’     /21 33/ [51 33] [a ´ŋ] ‘duck egg’
21     /55 21/ [33 21] [a lun]          /33/ [21] [mã] ‘scold’       /33 51/ [21 51] [lua a] ‘comb’
33     /55 33/ [33 33] [a liaŋ]         /24/ [33] [law] ‘save’       /24 51/ [33 51] [nĩũ a] ‘silkworm’

Nine hundred and sixty sentences (5 first words × 4 second words × 4 third words
× 4 focus conditions × 3 repetitions) in the corpus were formed by combining the
five words in position 1 with the four alternating words in position 2 and the
four alternating words in position 3. There were four focus conditions. Narrow focus
was placed either on the first word (first and second syllables), the second word (third
syllable), or the third word (fourth and fifth syllables), while broad focus was placed
on the entire sentence. Each sentence was repeated three times. The order in which
the 960 sentences were produced was randomized. The sentences were written on a
list with no specification of the placement of focus. A question list corresponding to
the order of the corpus list was created to elicit focus on the desired part of the
sentence, as shown in (2). For example, to elicit broad focus on the sentence ‘A-mei
holds buttons’, the precursor question listed on the question list would be ‘What
happened?’ as shown in (2d).

(2) a. Who holds buttons? ‘A-MEI holds buttons.’

b. What did A-mei do to the buttons? ‘A-mei HOLDS buttons.’

c. What did A-mei hold? ‘A-mei holds BUTTONS.’

d. What happened? ‘A-mei holds buttons.’
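The corpus size given above (5 × 4 × 4 × 4 × 3 = 960) can be checked with a short sketch. The word lists follow Table 3, with ‘duck egg’ given by its gloss because its transcription is not fully legible in the source. The focus labels, the `build_corpus` helper, and the random seed are assumptions made for illustration; the study does not describe how its randomized list was generated.

```python
from itertools import product
import random

# Word lists follow Table 3; 'duck egg' is identified by its gloss only.
FIRST_WORDS = ["a me", "a mõ", "a mã", "a lun", "a liaŋ"]      # 5 subjects
SECOND_WORDS = ["lam", "liam", "mã", "law"]                     # 4 verbs
THIRD_WORDS = ["liu a", "duck egg", "lua a", "nĩũ a"]           # 4 objects
FOCUS_CONDITIONS = ["broad", "narrow-word-1", "narrow-word-2", "narrow-word-3"]
REPETITIONS = 3


def build_corpus(seed=0):
    """Return a shuffled list of (word1, word2, word3, focus, repetition) items."""
    items = [
        (w1, w2, w3, focus, rep)
        for w1, w2, w3, focus in product(
            FIRST_WORDS, SECOND_WORDS, THIRD_WORDS, FOCUS_CONDITIONS
        )
        for rep in range(1, REPETITIONS + 1)
    ]
    # The seed is an assumption, used only to make the order reproducible.
    random.Random(seed).shuffle(items)
    return items


corpus = build_corpus()
print(len(corpus))  # 960
```

A question list in the same order could then be derived from the focus label of each item, along the lines of the questions in (2).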

2.2. Speaker
Four male native Taiwanese speakers, CYS, LWS, LYK, and HYH, participated in the
experiment. They were all trilingual speakers of Taiwanese Min, Mandarin, and
English. HYH spoke a dialect variety in which the underlying low rising tone
changes into a mid falling surface tone. All speakers were students at National Chiao
Tung University at the time of the recordings. They were paid for their participation.

2.3. Instrumentation
Recordings were made in a sound-treated booth in the Department of Foreign
Languages and Literatures at National Chiao Tung University in Hsinchu, Taiwan.
A TEV TM-728II unidirectional dynamic microphone was placed 40 cm in front of
each speaker’s mouth and 1 m from the experimenter. A SONY MZS-R4ST Mini
Disk recorder captured the acoustic signals digitally. The digital signal was
transferred from the Mini Disk to a PC through an optical fiber at 22 kHz to the digital
input of a Creative Sound Blaster Live sound card, and saved in .wav format. The
ESPS xwaves program was used to generate fundamental frequency tracks for each
sentence.

2.4. Procedure
During the recording a female experimenter and a speaker were present in the sound
booth. Short dialogues between the experimenter and the speaker were exchanged to
ensure that each speaker produced the corpus in a conversational rather than a
citation manner and placed focus in the target position
naturally, as opposed to reading the sentence directly from the list. The
speakers read the sentences from a randomised corpus list that gave no indication of
the placement of focus. Each speaker waited until the experimenter had read a
precursor question from the question list and then responded by producing the
sentence, which he read from the corpus list, with focus on the specific part of the
sentence. Different questions elicited focus on different parts of the sentence, as
shown in (2). The experimenter judged whether each utterance carried focus at the
targeted position. If the experimenter decided that the desired
focus condition had not been produced, she repeated the precursor question and
asked for another production.

2.5. Data Analysis

An Emu labelling program (http://www.shlrc.mq.edu.au/emu/) was used to display
fundamental frequency (f0) patterns, spectrograms, and waveforms and to provide a
means for labelling relevant tonal and intonational aspects of the utterance using the
Taiwanese ToBI annotation conventions, currently under development. Syllabic
boundaries were determined by identifying spectrographic cues, such as the energy
difference between nasals and vowels and the formant transitions between
consonants and vowels. After syllable boundaries,
words, phones, tones, and the location of focused elements had been labelled, another Emu program
(Emuquery) was used to obtain the times of the onset and offset of the second,
third, and fourth syllables. The duration of each syllable was calculated
by subtracting the time of the syllable onset from the time of the syllable offset.
Next, the fundamental frequency was extracted for each syllable using
get_track and the Emu pitch extraction program. Fundamental frequency values at
5%, 20%, 40%, 60%, 80%, and 95% time points in the target syllables were obtained
from these pitch tracks. The average f0 and duration for the second, third, and fourth
syllables carrying the same tone in different focus conditions were compared.
One-way ANOVAs (focus position) were used to determine the effect of focus position
on peak f0, f0 range expansion, and duration.
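The duration and f0 measurements described above can be illustrated with a minimal Python sketch. It assumes that a pitch track is available as parallel lists of times (in seconds) and f0 values (in Hz), and that values at the proportional time points are obtained by linear interpolation; both are assumptions for illustration, since the study obtained its values with the Emu tools and does not state an interpolation method. The function names and the example numbers are hypothetical.

```python
from bisect import bisect_left

# Proportional time points within the syllable, as listed in the text.
TIME_POINTS = (0.05, 0.20, 0.40, 0.60, 0.80, 0.95)


def syllable_duration(onset, offset):
    """Duration = time of syllable offset minus time of syllable onset."""
    return offset - onset


def interpolate(times, values, t):
    """Linearly interpolate the track at time t (times must be increasing)."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_left(times, t)
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def sample_f0(times, values, onset, offset, points=TIME_POINTS):
    """Return f0 at each proportional time point of the syllable."""
    duration = syllable_duration(onset, offset)
    return [interpolate(times, values, onset + p * duration) for p in points]


def f0_range(samples):
    """f0 range = maximum minus minimum of the sampled values."""
    return max(samples) - min(samples)


# Hypothetical syllable from 1.00 s to 1.20 s with a three-point pitch track.
track_times = [1.00, 1.10, 1.20]
track_f0 = [150.0, 170.0, 160.0]
samples = sample_f0(track_times, track_f0, onset=1.00, offset=1.20)
print([round(v, 1) for v in samples], round(f0_range(samples), 1))
```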

3. RESULTS

3.1. Duration

3.1.1. Effect of focus

Table 4 shows the results of 51 one-way ANOVAs (focus position) on the duration
of the syllable carrying the same tone at the same position produced by the same
speaker.
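As a reminder of what each one-way ANOVA computes, the following sketch derives the F statistic for one syllable (same tone, same position, same speaker) grouped by focus condition. The token durations below are invented solely to make the example run; they are not data from this study, and `one_way_anova_f` is a hypothetical helper. The p value is not computed here, since that requires the F distribution.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: group size times squared deviation of the
    # group mean from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: squared deviations from each group's own mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Invented durations (ms) for one syllable under the four focus conditions.
durations_by_focus = [
    [212.0, 208.0, 215.0],  # broad focus
    [241.0, 238.0, 244.0],  # narrow focus on this syllable
    [199.0, 203.0, 197.0],  # narrow focus elsewhere (earlier word)
    [191.0, 195.0, 189.0],  # narrow focus elsewhere (later word)
]
f_value, df_b, df_w = one_way_anova_f(durations_by_focus)
print(round(f_value, 2), df_b, df_w)
```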
For CYS, there was a significant difference in the duration of the syllable
carrying the same tone at the same position across focus conditions. The mean
duration of the narrow focus syllable was the longest among syllables carrying the
same tone at the same position in different focus conditions. For speaker HYH, there
was a significant difference in the duration of the syllable carrying the same
tone at the same position across focus conditions, except for tone 55 in the
fourth syllable, tone 51 in the third syllable, and tone 21 in the second syllable.
Mean duration showed that the narrow focus syllable was the longest
among syllables carrying the same tone at the same position in different focus
conditions, except for tone 55 in the fourth syllable, tone 21 in the second
syllable, and tone 33 in the fourth syllable. For speaker LWS, there was a significant
effect of focus condition on the duration of the syllable carrying the same tone in the
same position. Mean duration showed that the
narrow focus syllable was the longest among syllables carrying the same tone in the
same position, except for tones 55, 51, and 33 in the fourth syllable. For speaker
LYK, the durations of the same syllable in different focus conditions were
significantly different, except for tone 21 in the fourth syllable. Mean duration
showed that, apart from tones 55 and 21 in the fourth syllable and tone 33 in the second
syllable, narrow focus syllables were the longest among syllables
carrying the same tone at the same position under different focus conditions. Overall, there
was a trend for the narrow focus syllable to be the longest.
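The comparison of mean durations can be made concrete with the Table 4 values for speaker CYS, tone 55. The sketch below simply asks, for each syllable position, which focus condition has the longest mean duration; the dictionary layout and the `longest_condition` helper are illustrative assumptions, while the numbers are copied from Table 4.

```python
# Mean durations (ms) for speaker CYS, tone 55, copied from Table 4.
# Outer keys: focus condition; inner keys: syllable position.
CYS_TONE_55 = {
    "broad": {2: 211.8, 3: 240.6, 4: 257.4},
    "NF2": {2: 240.2, 3: 248.4, 4: 255.9},
    "NF3": {2: 199.0, 3: 306.2, 4: 280.6},
    "NF4": {2: 191.2, 3: 250.4, 4: 322.5},
}


def longest_condition(table, position):
    """Return the focus condition with the longest mean duration at a position."""
    return max(table, key=lambda condition: table[condition][position])


for position in (2, 3, 4):
    print(position, longest_condition(CYS_TONE_55, position))
# 2 NF2
# 3 NF3
# 4 NF4
```

For this speaker and tone, the syllable under narrow focus is the longest at every position, in line with the description of CYS above.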
3.1.2. Effect of syllable position on duration

Table 4 displays mean duration of syllables in the same position carrying the same
lexical tone with different focus conditions across speakers. As shown in Table 4,
the duration of narrow focus second syllables was longer than broad focus, pre-
narrow focus, or post-narrow focus second syllables. Duration of narrow focus third
syllables was also longer than broad focus, pre-narrow focus, and post-narrow focus
third syllables.

Table 4. One-way ANOVAs (4 focus conditions) on mean duration (ms), ** p < .001, * p < .05, NF:
Narrow Focus

Speaker              CYS                                  HYH
Syllable position    2          3          4              2          3          4

Tone 55
Broad focus          211.8 **   240.6 **   257.4 **       195.8 **   219.1 *    210.2
NF on syllable 2     240.2      248.4 **   255.9 **       242.1      210.7 *    216.8
NF on syllable 3     199.0 **   306.2      280.6 **       202.3 **   233.8      224.8
NF on syllable 4     191.2      250.4      322.5          205.6 **   225.7 *    221.3

Tone 24
Broad focus          236.1 *                              214.5 **
NF on syllable 2     248.3                                270.8
NF on syllable 3     226.2 *                              228.1 **
NF on syllable 4     226.9 *                              224.1 **

Tone 51
Broad focus          213.3      234.7      199.3          205.8      193.3      209.8 **
NF on syllable 2     230.4      239.8 **   197.7 **       243.2      196.4      198.8 **
NF on syllable 3     204.9 **   280.6      223.3 **       214.8 **   207.4      216.0 **
NF on syllable 4     207.8      243.9      251.4          216.5 **   206.0      225.5

Tone 21
Broad focus          227.3 **   246.7 **   223.3 **       262.4      210.4 **   207.1 *
NF on syllable 2     244.9      257.6 **   227.1 **       262.2      217.4 **   190.1 *
NF on syllable 3     222.7 **   328.6      220.2 **       255.9      255.4      205.5 *
NF on syllable 4     211.3      257.0      257.4          261.1      218.6 **   216.9

Tone 33
Broad focus          219.4 *    244.7 **   268.7 **       167.6 **              219.9
NF on syllable 2     232.2      248.4 **   273.6 **       211.4                 223.3
NF on syllable 3     221.4 *    310.2      281.9 **       177.0 **              224.3
NF on syllable 4     201.7 *    254.3      320.9          166.9                 219.4

Speaker              LWS                                  LYK
Syllable position    2          3          4              2          3          4

Tone 55
Broad focus          265.3 **   233.5 **   227.5 *        225.7 **   233.7 **   276.3 **
NF on syllable 2     294.0      255.2 **   226.1 *        241.3      229.9 **   268.7 **
NF on syllable 3     246.4 **   320.6      250.3 *        221.8 **   263.8      290.9 **
NF on syllable 4     252.3      273.4      234.0          207.3      245.4      297.7

Tone 24
Broad focus          255.1                                242.0
NF on syllable 2     296.3                                280.0
NF on syllable 3     249.3 **                             239.5 **
NF on syllable 4     250.7                                218.4

Tone 51
Broad focus          252.0 **   245.8 **   182.7 **       234.4 **   246.4 **   193.4 **
NF on syllable 2     295.8      265.4 **   186.7 **       258.6      231.2 **   177.1 **
NF on syllable 3     244.3 **   329.5      219.8 **       228.0 **   266.0      199.4 **
NF on syllable 4     242.1      253.0      202.8          208.6      251.9      212.4

Tone 21
Broad focus          251.3      250.8      213.2 *        189.3      245.1      247.6
NF on syllable 2     278.0      271.8 **   218.4 *        205.2      230.9 **   239.4
NF on syllable 3     246.7 **   342.2      225.0 *        186.6 **   295.7      264.6
NF on syllable 4     252.5      251.2      231.6          165.6      241.1      255.0

Tone 33
Broad focus          223.7 *    221.6 **   223.0 **       199.3 **   225.0 **   292.6 *
NF on syllable 2     251.8      251.9 **   214.8 **       213.2      213.8 **   274.8 *
NF on syllable 3     232.3 *    309.7      259.8 **       215.0 **   255.1      296.9 *
NF on syllable 4     242.9 *    231.8      237.3          185.7      228.8 **   302.7

Among the narrow focus syllables carrying the same tone in different syllable
positions, the duration of the narrow focus third syllables was the longest, compared
with the duration of the narrow focus syllables in the second and fourth syllable
position. The effect of position on syllable duration was confounded by the vowel
quality and the syllable structure (closed vs. open) which were not controlled in the
corpus.
Although the duration of narrow focus second and third syllables was generally longer
than that of the same syllable in other focus conditions, the narrow focus fourth syllable
was not always the longest, as shown in Table 4. According to Table 4, the duration of
the narrow focus tones 55 and 33 in the fourth syllable produced by HYH, was not
the longest when compared to the same syllable produced in the other focus
conditions. This was also the result for syllables with tones 55, 51, and 33 in the
fourth syllable produced by LWS, and for syllables produced with tones 55 and 21
in the fourth syllable produced by LYK. The duration of the narrow focus fourth
syllable was similar to that of the post-focus fourth syllable, as shown in Table 4.
In summary, increased duration for narrow focus syllables was most obvious in
second and third syllable position and least noticeable in the fourth syllable position.

3.2. F0
3.2.1. Tonal register (f0 level) contrast
The f0 contours were averaged across speakers to reveal a potential contrast in tonal
register between high level vs. mid level tones and between high falling vs. mid
falling tones, as shown in Figure 1. A comparison of the f0 onset and the f0 peak of narrow
focus tones 55 and 33 revealed that the f0 onset of
both tones was between 140 and 160 Hz. However, the f0 peak was
between 170 and 190 Hz for tone 55 and remained below 160 Hz for tone 33.
The only exceptions were the 20% point of tone 33 in the second syllable and the 95%
point of tone 33 in the fourth syllable, which were slightly above 160 Hz.
Turning to tones 51 and 21, we see that the f0 peak of the narrow focus tone 51
was between 180 and 200 Hz, while the f0 peak of the narrow focus tone 21 was
between 120 and 140 Hz. As for the lowest point of f0, it was between 150 and 170 Hz
for tone 51 and below 140 Hz for tone 21. The f0 level difference between tones 51
and 21 and between tones 55 and 33 was maintained for narrow focus syllables even
after the f0 range was expanded under the narrow focus condition.
[Figure 1 panels: HH tone f0 average, ML tone f0 average, LH tone f0 average, MM tone f0 average, HL tone f0 average. Vertical axis: f0 (Hz), 100 to 220. Horizontal axis: syllable position (syllable 2, syllable 3, syllable 4). Plot symbols: 0 = broad focus, 2 = narrow focus on syllable 2, 3 = narrow focus on syllable 3, 4 = narrow focus on syllable 4.]

Figure 1. F0 of five tones in the second, third, and fourth syllable position receiving four
different focus conditions

3.2.2. Tonal shapes

Observation of f0 movement within the vowel nuclei revealed both assimilatory and
anticipatory tonal coarticulation in Taiwanese (Peng, 1997). The corpus of the
present study was composed of sonorants and vowels; therefore, f0 movements
during consonants surrounding the vowel nuclei were also included. Since
surrounding lexical tones influenced f0 movement of different lexical tones, the
averaged tonal contexts for each syllable should be discussed first.
Lexical tones in the second syllable were preceded by tone 33 with mid offset at
the first syllable and followed by tones 55, 51, 21, and 33 at the third syllable with
an averaged onset realized at a slightly above mid average. For the third syllable, it
was preceded by tones 55, 24, 51, 21, and 33 at the second syllable and produced
with a slightly below mid average f0 onset. The third syllable was followed by tones
55, 51, 21, and 33 and produced with a slightly above mid average f0 offset. The
fourth syllable was preceded by tones 55, 51, 21, and 33 and realized at a mid
average onset. The fourth syllable was followed by the suffix /a51/ in seventy-five
percent of the tokens and by the morpheme /ls´ŋ 33/ ‘egg’ in twenty-five percent of
the tokens. On average, the offset of the fourth syllable was realized at an upper
mid to high average.
Due to preservatory tonal coarticulation, tone 55 at the second, third, and fourth
syllables started around the mid tonal range following the averaged mid offset of the
first, second, and third syllables. The f0 contours of tone 55 at second and third
syllables then gradually rose to a higher offset target at 80% into the syllables then
slightly declined to coarticulate anticipatorily with the following mid onset of third
and fourth syllables. The f0 contours of tone 55 at the fourth syllable did not decline
at the end of the syllable, since they were followed by an upper mid to high onset at
the fifth syllable. Both preservatory and anticipatory tonal coarticulation was
observed on tone 55. The gradual decrease of the high offset of tone 55 from the
second, to third and fourth syllables was a sign of global declination.
The onset of tone 24 at the second syllable started from the mid offset of
preceding syllable then moved downward to the low onset target of rising tone 24.
The low onset target of tone 24 was reached around the 60% time point into the
syllable and then the f0 pattern began to take on the rising contour of tone 24.
Preservatory tonal coarticulation can be observed at the beginning of tone 24.
The onset of tone 51 at the second, third, and fourth syllables began around the
mid tonal range then began to rise toward the high onset target. The high target was
reached at the 60% time point in the second syllable and the 40% time point in the
third and fourth syllables. After this, the f0 pattern began to move downward toward
the low offset target of falling tone 51. Effects of declination can be observed by
comparing the f0 height of high onset targets that gradually decreased from the
second to the third and to the fourth syllable. Preservatory tonal coarticulation was
observed at the beginning of tone 51.
The onset of tone 21 began around the mid tonal range for the second and third
syllables. The onset of tone 21 in the fourth syllable was much lower due to global
declination and the lower averaged offset of the third syllable. F0 moved downward
toward the target and then began to rise at the 95% time point for the third syllable
and the 60% time point for the fourth syllable. Effects of declination were observed
on the f0 height of the low offset target between the second and third syllables. The
rising contour of tone 21 at the fourth syllable was due to anticipatory tonal
coarticulation with the high to mid onset of the following fifth syllable.
The onset of tone 33 gradually declined from the second, third to the fourth
syllable. The rising f0 of tone 33 at the fourth syllable was due to anticipatory tonal
coarticulation with averaged upper mid to high onset of the following fifth syllable.
Both anticipatory and preservatory tonal coarticulation was observed here.
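
The time points cited above (for example, a high target reached at the 40% or 60% point of a syllable) can be read as normalized positions within the syllable. The following is a minimal sketch, assuming f0 is sampled at equally spaced points across each syllable. The sampling scheme, the function name `extremum_time_point`, and the contour values are illustrative assumptions, not the measurement procedure of this study.

```python
def extremum_time_point(f0_samples, kind="max"):
    """Return the normalized time (0-100 %) at which the f0 extremum occurs.

    f0_samples: f0 values (Hz) taken at equal intervals across one syllable.
    kind: "max" for a high target, "min" for a low target.
    """
    samples = list(f0_samples)
    if len(samples) < 2:
        raise ValueError("need at least two f0 samples")
    pick = max if kind == "max" else min
    # Index of the first sample that reaches the extreme value.
    idx = samples.index(pick(samples))
    return 100.0 * idx / (len(samples) - 1)


# Hypothetical contour for a falling tone: rise to a high target, then fall.
falling_tone = [150, 158, 166, 160, 148, 135]
print(extremum_time_point(falling_tone, "max"))  # 40.0
```

Expressing the target as a percentage of syllable duration allows syllables of different lengths to be compared directly, which is how the time points in this section are stated.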

3.2.3. Effect of focus on f0 range

Fifty-one one-way ANOVAs (focus position) were used to analyse individual
speakers’ f0 range of syllables carrying the same tone in the same sentence position.
F0 range was the difference between the highest and lowest f0 values for a given
syllable. Results are shown in Table 5. There was missing data for the narrow focus
tone 55 in the second syllable, since HYH produced this syllable with tone 33.
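
The two f0 measures used in this section and the next can be stated compactly. The sketch below follows the definition given above (f0 range as the difference between the highest and lowest f0 values of a syllable) and takes mean f0 to be the arithmetic mean of the sampled values, which is an assumption. The function names and sample values are hypothetical.

```python
def f0_range(f0_samples):
    """f0 range (Hz): highest minus lowest f0 value within one syllable."""
    return max(f0_samples) - min(f0_samples)


def mean_f0(f0_samples):
    """Mean f0 (Hz): arithmetic mean of the sampled f0 values (assumed)."""
    return sum(f0_samples) / len(f0_samples)


# Hypothetical f0 samples (Hz) for a single syllable.
syllable = [150.0, 160.0, 170.0, 140.0]
print(f0_range(syllable))  # 30.0
print(mean_f0(syllable))   # 155.0
```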
Results indicated that a significant effect of focus on f0 range was consistently
observed on tones 24 and 51, but not on level tones (55, 33) or the low falling tone
(21). The exceptions were tone 55 in the second and fourth syllables, tone 21 in the
third syllables, and tone 33 in the third and fourth syllables produced by CYS; tone
55 in the second syllables produced by HYH; tone 55 in the third syllables and tone 33
in the second syllables produced by LWS; tone 55 in the second syllables, and tones
21 and 33 in the third syllables produced by LYK.
The mean f0 range of syllables carrying the same tone in the same position and
produced by the same speaker, but under different focus conditions, revealed that the
f0 range of narrow focus syllables was the greatest, as shown in Table 5. However,
there were some exceptions. These included the f0 range of tone 24 in the second
syllables and tone 33 in the fourth syllables produced by CYS; tones 55 and 33 in the
second syllables, tone 21 in the third syllables, and tone 33 in the fourth syllables
produced by HYH; tones 24 and 33 in the second syllables produced by LWS; tones
24 and 33 in the second syllables, tone 33 in the third syllables, and tones 21 and 33
in the fourth syllables produced by LYK.
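
For readers unfamiliar with the test, the F statistic of a one-way ANOVA over the four focus conditions can be sketched as follows. This is an illustration in plain Python, not the statistical software used in the study. The function name and the f0 range values are hypothetical, and the conversion of F to a p-value (which requires the F distribution) is deliberately left out.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: one list of measurements per condition, e.g. the f0 ranges of a
    given tone and syllable position under each of the four focus conditions.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the condition means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual tokens around their own condition mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical f0 ranges (Hz): broad focus, then narrow focus on syllables 2, 3, 4.
conditions = [[10, 12, 14], [11, 13, 15], [20, 22, 24], [12, 14, 16]]
f_value, df_b, df_w = one_way_anova_f(conditions)
print(f_value, df_b, df_w)
```

A large F indicates that the condition means differ by more than the token-to-token variation within each condition would lead one to expect.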

3.2.4. Effect of focus on mean f0

In addition to differences in f0 range, a significant effect of focus was observed in
the mean f0 value of syllables carrying the same tone in the same position but with
different focus conditions. For syllables that did not have a significant effect of
focus on f0 range, a significant effect of focus on mean f0 height was usually observed,
as shown in Table 6. This is illustrated in production of tone 55 in the second
and fourth syllables, and tones 21 and 33 in the third syllables produced by CYS;
productions of tone 55 in the second syllables produced by HYH; productions of tone
55 in the second syllables, and tones 21 and 33 in the third syllables produced by
LYK; and in productions of tone 55 in the third syllables, and tone 33 in the second
syllables produced by LWS.
Table 7 summarizes the significant effect of focus on duration, f0 range, and
mean f0 on each syllable. Duration was more consistent than f0 range and mean f0
in distinguishing focus conditions produced by CYS, LWS, and LYK, but not HYH.
A significant effect of focus was found on either f0 range or mean f0, and sometimes
both f0 range and mean f0 of most syllables. The exceptions occurred mainly on
tones 33 and 21 in either the third or fourth syllable.
Table 5. One-way ANOVAs (4 focus conditions) on f0 range (Hz), ** p < .001, * p < .05,
NF: Narrow focus, bold face: narrow focus syllable

Tone  Focus condition    CYS: syllable position        HYH: syllable position
                         2         3         4         2         3         4
55    Broad focus        10.95     11.11*    10.89     49.08     22.25**   22.81**
      NF on syllable 2   13.08     12.31*    11.08     35.56     25.05**   22.42**
      NF on syllable 3   12.14     15.15     11.56     37.71     40.71     24.29**
      NF on syllable 4   11.52     12.47*    12.56     33.82     25.50**   41.05
24    Broad focus        7.22**                        13.97**
      NF on syllable 2   13.11
      NF on syllable 3   16.90                         36.89
      NF on syllable 4   10.62**                       14.34**
51    Broad focus        20.49     23.19     17.56     37.52     34.04**   40.37*
      NF on syllable 2   26.88     24.71**   21.11**   56.04     42.61**   37.65*
      NF on syllable 3   22.49     34.17     26.05     38.23**   49.38     45.86*
      NF on syllable 4   20.11**   29.04**   29.44     37.18**   33.59**   52.81
21    Broad focus        18.35*    19.72     14.73*    30.06**   34.85*    36.24**
      NF on syllable 2   21.12     22.17     16.57*    40.11     47.18*    28.21**
      NF on syllable 3   19.43*    24.90     15.20*    25.02     38.80     36.01
      NF on syllable 4   17.54*    22.98     13.28     26.74**   32.33*    42.04
33    Broad focus        7.22**    13.11     16.90     13.97*              36.89*
      NF on syllable 2   10.62     13.30     17.29     14.34               33.48*
      NF on syllable 3   6.63**    15.58     19.27     15.05*              42.00*
      NF on syllable 4   6.39**    13.06     17.65     10.69*              41.56

Tone  Focus condition    LWS: syllable position        LYK: syllable position
                         2         3         4         2         3         4
55    Broad focus        27.91**   19.47     24.21**   29.54     25.63*    38.09**
      NF on syllable 2   27.68     19.56     21.68**   30.80     27.60*    25.85**
      NF on syllable 3   33.72     25.16     29.80     34.39     37.44     36.50**
      NF on syllable 4   30.69**   23.19     41.93     30.13     26.25*    47.08
24    Broad focus        9.31**                        12.17**
      NF on syllable 2   14.86                         18.05
      NF on syllable 3   26.38                         30.96
      NF on syllable 4   7.97**                        10.52**
51    Broad focus        29.25     26.36     18.73     29.25     26.36     18.73
      NF on syllable 2   35.13     24.24**   16.41**   35.13     24.24**   16.41**
      NF on syllable 3   34.33     45.66     21.25     34.33     45.66     20.91
      NF on syllable 4   29.86**   28.36**   27.69     29.86**   28.36**   27.69
21    Broad focus        22.59**   25.81**   17.33*    23.85**   39.20     25.88
      NF on syllable 2   27.23     25.51**   17.64*    31.00     37.38     24.17
      NF on syllable 3   20.08**   38.62     22.10*    23.62**   41.77     29.53
      NF on syllable 4   17.36**   24.60**   23.76     16.96**   37.76     26.30
33    Broad focus        9.31      14.86*    26.38**   12.17*    18.05     30.96
      NF on syllable 2   7.97      13.71*    19.10**   10.52     19.22     25.80
      NF on syllable 3   8.63      18.44     29.66**   11.55*    16.67     30.52
      NF on syllable 4   9.89      15.07*    33.98     13.50*    15.63     29.86
Table 6. One-way ANOVAs (4 focus conditions) on mean f0 (Hz), ** p < .001, * p < .05,
NF: Narrow focus, bold face: narrow focus syllable

Tone  Focus condition    CYS: syllable position        HYH: syllable position
                         2         3         4         2         3         4
55    Broad focus        145.0**   139.2     133.9**   184.7*    176.7*    177.5**
      NF on syllable 2   133.9     141.2     134.8**   177.5     169.6*    159.2**
      NF on syllable 3   150.5     141.9     137.1     176.0*    176.5     171.1**
      NF on syllable 4   147.4**   140.7     138.5     171.4*    166.5*    178.0
24    Broad focus        137.3**                       171.6**
      NF on syllable 2   131.3                         151.8**
      NF on syllable 3   129.7                         178.6
      NF on syllable 4   142.8**
51    Broad focus        151.3     144.0     134.0     187.4     179.9     163.2**
      NF on syllable 2   157.8     145.7     136.2**   189.0     169.1**   145.4**
      NF on syllable 3   155.5     146.5     135.8     174.4     177.7     153.8
      NF on syllable 4   155.1**   147.1     139.1     172.6**   165.7**   173.7
21    Broad focus        127.1     123.9*    125.6*    138.6**   144.0*    147.8**
      NF on syllable 2   127.7     124.6*    127.5*    144.2     136.1*    132.0**
      NF on syllable 3   127.7     127.7     127.9*    130.8     143.3     141.3
      NF on syllable 4   128.1     127.6*    129.4     135.4**   136.6*    142.3
33    Broad focus        137.3**   131.3**   129.7     171.6**             151.8**
      NF on syllable 2   142.8     132.7**   129.1     178.6               138.4**
      NF on syllable 3   139.6     126.1     130.3     152.6               145.2**
      NF on syllable 4   139.1**   126.0**   131.8     153.5**             152.7

Tone  Focus condition    LWS: syllable position        LYK: syllable position
                         2         3         4         2         3         4
55    Broad focus        180.3     178.0**   180.7**   170.9*    168.1*    170.9*
      NF on syllable 2   180.7     173.2**   173.9**   170.9     161.7*    162.5*
      NF on syllable 3   180.7     184.1     176.5**   166.4*    167.0     169.3*
      NF on syllable 4   176.9     178.6**   188.8     164.6*    158.7*    174.0
24    Broad focus        170.1**                       157.6**
      NF on syllable 2   164.9                         144.2
      NF on syllable 3   171.0                         141.7
      NF on syllable 4   166.5**                       155.9**
51    Broad focus        180.7*    182.1     176.3     180.7*    182.1     176.3
      NF on syllable 2   182.6     171.8**   167.5**   182.6     171.8**   167.5**
      NF on syllable 3   184.7*    179.1     170.7**   184.7*    179.1     170.7**
      NF on syllable 4   181.5*    179.7**   178.6     181.5*    179.7**   178.6
21    Broad focus        162.1*    154.2     162.7     136.6**   138.1     138.7
      NF on syllable 2   159.4     150.0     160.1     137.7     133.9     133.1
      NF on syllable 3   160.4*    154.2     159.4     133.3**   138.9     137.2
      NF on syllable 4   159.0*    152.7     161.6     134.7**   136.3     137.2
33    Broad focus        170.1**   164.9*    171.0**   157.6**   144.2     157.6**
      NF on syllable 2   166.5     161.8*    161.4**   155.9     144.0     155.9**
      NF on syllable 3   166.4     164.4     165.0     151.9     142.1     151.9
      NF on syllable 4   164.3**   162.2*    170.8     150.6**   143.9     150.6
Table 7. Summary of significant effect of focus on duration (D), f0 range (R), and mean f0 (M)

Tone  CYS                 HYH                 LWS                 LYK
      2     3     4       2     3     4       2     3     4       2     3     4
55    DM    DR    DM      DM    DRM   RM      DR    DM    DRM     DM    DRM   DRM
24    DRM                 DRM                 DRM                 DRM
51    DRM   DR    DRM     DRM   RM    DRM     DRM   DRM   DRM     DRM   DRM   DRM
21    DR    DM    DRM     RM    DRM   DRM     DRM   DR    DR      DRM   D     D
33    DRM   DM    D       DRM   D     RM      DM    DRM   DRM     DRM   D     DM
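
The claim that duration is the most consistent cue for some speakers can be checked by tallying the codes in Table 7. The sketch below does this for CYS and HYH, with cell codes transcribed from the table (13 tone-by-position cells per speaker). The dictionary and function names are hypothetical.

```python
from collections import Counter

# Table 7 codes per speaker, in the order: tone 55 (syllables 2, 3, 4),
# tone 24 (syllable 2), tone 51, tone 21, tone 33 (syllables 2, 3, 4 each).
TABLE7 = {
    "CYS": ["DM", "DR", "DM", "DRM", "DRM", "DR", "DRM",
            "DR", "DM", "DRM", "DRM", "DM", "D"],
    "HYH": ["DM", "DRM", "RM", "DRM", "DRM", "RM", "DRM",
            "RM", "DRM", "DRM", "DRM", "D", "RM"],
}


def count_cues(codes):
    """Count the cells in which each cue (D, R, M) shows a significant effect."""
    counts = Counter()
    for code in codes:
        counts.update(code)  # each letter of a code such as "DRM" counts once
    return {cue: counts[cue] for cue in "DRM"}


print(count_cues(TABLE7["CYS"]))  # {'D': 13, 'R': 8, 'M': 9}
print(count_cues(TABLE7["HYH"]))  # {'D': 9, 'R': 11, 'M': 12}
```

For CYS, duration is significant in all 13 cells, whereas for HYH it is significant in fewer cells than either f0 measure, in line with the summary given before the table.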

3.2.5. Mandarin vs. Taiwanese

Jin (1996) found that perceptually it is difficult to distinguish between broad focus
sentences and sentences with narrow focus on the last word. However, Xu (1999)
found that the duration of the same syllable under broad focus and narrow focus was
significantly different at all five syllable positions. The f0 range of the same word
under broad and narrow focus conditions also differed significantly in Mandarin. The
discrepancy between production and
perceptual data in Mandarin focus studies was not investigated.
To examine the production distinctiveness between broad focus, narrow focus,
and post-focus final words in Taiwanese, a post-hoc Duncan test was used to analyse
the duration and f0 range of the penultimate syllables of the final words carrying the
same tone, but under different focus conditions, as shown in Table 8. Results of
post-hoc Duncan tests shown in Table 8 indicated that the duration difference
between narrow and broad focus penultimate syllables was significant regardless of
the following syllables, e.g. tone 55 produced by HYH and LWS, tone 21 produced
by HYH and LYK, tone 33 produced by HYH and LYK. As for the f0 range, narrow
and broad focus penultimate syllables were distinct, except for tone 55 produced by
CYS and LYK, tone 21 produced by CYS and LYK, and tone 33 produced by HYH
and LYK. In summary, both the duration
and f0 range of broad focus and narrow focus penultimate syllables in the final word
were significantly different when the penultimate syllable carried tone 51, but not
when they carried a level tone (55, 33) or low falling tone (21). Speaker-wise, either
the duration or the f0 range was significantly different between narrow focus and
broad focus fourth syllables produced by CYS and LWS. Narrow focused and broad
focused penultimate syllables carrying tone 33 produced by HYH and LYK, or
carrying tone 21 produced by LYK were not significantly different from each other
in terms of either duration or f0 range.
In Mandarin narrow focused final words and final words in broad focus
sentences were perceptually indistinguishable, but acoustically distinguishable.
According to the Taiwanese acoustical data observed here, narrow focus final words
were distinguishable from final words in broad focus sentences produced by CYS and
LWS, but not for LYK and HYH. The discrepancy between production and
perceptual data in Mandarin can be further explored by comparing the results of
future production and perceptual studies in Taiwanese.

Table 8. Post-hoc Duncan tests on the mean duration and f0 range of the penultimate (fourth)
syllable. Means of the fourth syllable in different focus conditions produced by the same
speaker were significantly different from each other when followed by different letters.
Means followed by the same letter were not significantly different from each other.
p < .05.

DURATION
Tone  Focus condition   CYS        HYH        LWS        LYK
55    Narrow Focus      322.5 A    221.3 A    234.0 B    297.7 A
      Post-Focus        280.6 B    224.8 A    250.3 A    290.9 AB
      Broad Focus       257.4 C    210.2 A    227.5 B    276.3 B
51    Narrow Focus      251.4 A    225.5 A    202.8 B    212.4 A
      Post-Focus        223.3 B    216.0 AB   219.8 A    199.4 B
      Broad Focus       199.3 C    209.8 B    182.7 C    193.4 B
21    Narrow Focus      257.4 A    216.9 A    231.6 A    255.0 A
      Post-Focus        220.2 B    205.5 A    225.0 AB   264.6 A
      Broad Focus       223.3 B    207.1 A    213.2 B    247.6 A
33    Narrow Focus      320.9 A    219.4 A    237.3 B    302.7 A
      Post-Focus        281.9 B    224.3 A    259.8 A    296.9 A
      Broad Focus       268.7 B    219.9 A    222.9 C    292.6 A

F0 RANGE
Tone  Focus condition   CYS        HYH        LWS        LYK
55    Narrow Focus      12.6 A     41.0 A     41.9 A     47.1 A
      Post-Focus        11.6 A     24.3 B     29.8 B     36.5 B
      Broad Focus       10.9 A     22.8 B     24.2 B     38.1 AB
51    Narrow Focus      29.4 A     52.8 A     27.7 A     27.7 A
      Post-Focus        26.0 B     45.9 AB    21.2 B     20.9 B
      Broad Focus       17.6 C     40.4 B     18.7 B     18.7 B
21    Narrow Focus      13.3 B     42.0 A     23.8 A     26.3 A
      Post-Focus        15.2 B     36.0 B     22.1 A     29.5 A
      Broad Focus       14.7 B     36.2 B     17.3 B     25.9 A
33    Narrow Focus      17.7 A     41.6 A     34.0 A     29.7 A
      Post-Focus        19.3 A     42.0 A     29.7 B     30.5 A
      Broad Focus       16.9 B     36.9 A     26.4 B     31.0 A

4. DISCUSSION
The f0 and duration data produced by Taiwanese speakers in the present study
revealed five major results. First, the duration of narrow focus syllables was longer
than syllables under other focus conditions. Second, the degree of lengthening due to
narrow focus was affected by a syllable’s position in a sentence. Third, the f0 range
of the narrow focus syllable was expanded. Fourth, the tonal register (f0 level)
contrasts between narrow focus high falling vs. mid falling tones, and between
narrow focus high level vs. mid level tones, were maintained even when f0 range was
expanded. Fifth, duration was a more consistent cue than either f0 range or mean f0
values in signaling focus condition in Taiwanese. F0 range and mean f0 value
complement each other in distinguishing focus conditions.
In addition to the effect of focus, tonal coarticulation also influenced the f0
contour in Taiwanese. In Taiwanese the f0 offset target of a dynamic tone occurred
after the offset boundary of a tone bearing unit, while the f0 offset target of a level
tone occurred before the syllable boundary (Pan, 2002). By using only sonorants at
either the beginning or end of a syllable, both anticipatory and preservatory tonal
coarticulation was observed in this study. Preservatory tonal coarticulation was
observed in tones 55, 24, and 51, while anticipatory tonal coarticulation was found
in tones 55, 21, and 33. It was proposed that the preservatory tonal coarticulation
took place during the initial consonant of the syllable, as found in Mandarin (Xu,
1999). To support the claim that preservatory tonal coarticulation occurred during
the initial consonant of the syllable in Taiwanese, further studies with various
syllable structures are necessary.
Among narrow focus second, third, and fourth syllables, the duration of the narrow
focus third syllable was the longest, while the duration of the fourth syllable was the
shortest. In Mandarin the duration of the narrow focus third syllable was also the
longest; however, the shortest syllable was the second syllable (Xu, 1999). The effect
of focus lengthening was the strongest on the third syllable in both Mandarin and
Taiwanese. According to global final lengthening rules, the duration of the narrow
focus fourth syllable should be longer than the duration of the narrow focus third
syllable; however, local focus lengthening interacts with final lengthening here to
determine the surface syllable duration. Focus lengthening exerts a strong effect on
the third syllable but not on the fourth syllable. Narrow focused fourth syllables
appeared to be shorter than narrow focused third syllables in both Taiwanese and
Mandarin data. Further investigations with more variable sentence structures are
needed to explore possible factors such as syllable position, part of speech, and
syntactic or prosodic structures that contribute to the longer duration of the narrow
focus third syllable.
In Mandarin with four distinctive f0 contours for each lexical tone, f0 range
expansion was used as the major cue for signaling narrow focus. In Taiwanese,
duration lengthening is a more consistent cue for narrow focus. The fact that there
are two tonal pairs in Taiwanese contrasted mainly by f0 height and not by f0
contour may contribute to the limited manipulation of f0 range in different focus
conditions. To further explore this potential cause, studies on other tonal languages
with tonal pairs contrasting mainly by f0 height are needed.
The study here concentrated only on the effect of focus on Taiwanese unchecked
tones. Taiwanese checked tones are known for their shorter syllable duration and
glottalized voice quality in contrast with unchecked tones. To fully understand the
influence of focus on duration contrasts between checked and unchecked syllables in
Taiwanese and the influence of focus on voice quality in Taiwanese, further studies
are necessary. The interaction between focus conditions, final and initial lengthening
in different prosodic domains, and tonal coarticulation should also be investigated to
fully understand the interaction of prosodic effects on surface duration and f0
contour in tonal languages.
Department of Foreign Languages and Literatures, National Chiao Tung University,
Hsinchu, TAIWAN.

NOTES
This research was supported by grants from National Science Council in Taiwan. Thanks to
Professor Anne Chao and Pi-chiang Li for assistance in statistical analysis.

REFERENCES
Beckman, Mary E., and Jan Edwards. (1990) “Lengthenings and Shortenings and the Nature of Prosodic
Constituency.” In Papers in Laboratory Phonology I: Between the Grammar and Physics of Speech
(J. Kingston and M. E. Beckman, editors): 152-178. Cambridge: Cambridge University Press.
Berkovits, Rochele. (1993) “Utterance-Final Lengthening and Duration of Final-Stop Closures.” Journal
of Phonetics 21 (4): 479-489.
Chao, Yuen Ren. (1968) A Grammar of Spoken Chinese. University of California Press.
Cheng, Robert. (1968) “Tone sandhi in Taiwanese.” Linguistics 41: 19-42.
Cheng, Robert. (1973) “Some notes on tone sandhi in Taiwanese.” Linguistics 100: 5-25.
Cooper, William E., Stephen J. Eady, and Pamela R. Mueller. (1985) “Acoustical Aspects of Contrastive
Stress in Question-Answer Contexts.” Journal of the Acoustical Society of America 77: 2142-2156.
Eady, Stephen J., and William E. Cooper. (1986) “Speech Intonation and Focus Location in Matched
Statements and Questions.” Journal of the Acoustical Society of America 80: 402-416.
Eady, Stephen J., William E. Cooper, Gayle V. Klouda, Pamela R. Mueller, and Dan W. Lotts. (1986)
“Acoustic Characteristics of Sentential Focus: Narrow vs. Broad and Single vs. Dual Focus
Environments.” Language and Speech 29: 233-251.
Fougeron, Cecile. (1999) “Articulatory Properties of Initial Segments in Several Prosodic Constituents in
French.” UCLA Working Papers in Phonetics 97: 74-99.
Gandour, Jack, Siripong Potisuk, and Sumalee Dechongkit. (1994) “Tonal coarticulation in Thai.” Journal
of Phonetics 22: 477-492.
Ho, Aichen T. (1976) “Mandarin Tones in Relation to Sentence Intonation and Grammatical Structure.”
Journal of Chinese Linguistics 4: 1-13.
Jin, Shunde. (1996) An Acoustic Study of Sentence Stress in Mandarin Chinese. Ph.D. dissertation, The
Ohio State University.
Liberman, Mark, and Janet Pierrehumbert. (1984) “Intonational Invariance under Changes in Pitch Range
and Length.” In Language Sound Structure (M. Aronoff and R. T. Oehrle, editors): 157-233.
Cambridge, MA: MIT Press.
Lin, Hui-Bin. (1988) Contextual Stability of Taiwanese tones. Ph.D. dissertation, The University of
Connecticut.
Lindblom, Bjorn, and K Rapp. (1973) “Some Temporal Regularities of Spoken Swedish.” Papers from
the Institute of Linguistics, University of Stockholm 21: 1-58.
Pan, Ho-hsien. (2002) “The location of F0 offset for Taiwanese Long Tones.” In Speech Prosody 2002:
Proceedings of the first International Conference on Speech Prosody: 555-558.
Peng, Shu-hui. (1997) “Production and Perception of Taiwanese Tones in Different Tonal and Prosodic
Contexts.” Journal of Phonetics 25 (3): 371-400.
Pierrehumbert, Janet. (1980) The Phonology and phonetics of English Intonation. Ph.D. dissertation,
Massachusetts Institute of Technology.
Pierrehumbert, Janet, and Mary E. Beckman. (1988) Japanese Tone Structure. Cambridge, MA: MIT
Press.
Shen, Xiaonan Susan. (1992) “A Pilot Study on the Relation between the Temporal and Syntactic
Structures in Mandarin.” Journal of the International Phonetic Association 22 (1-2): 35-43.
Shih, Chi-Lin. (1988) “Tone and Intonation in Mandarin.” Working Papers Cornell Phonetics Laboratory
No. 3: 83-109.
Shih, Chi-Lin, and Benjamin Ao. (1994) “Duration Study for the AT&T Mandarin Text-to-Speech
System.” In Conference Proceedings of the second ESCA/IEEE Workshop on Speech Synthesis:
29-32.
Xu, Yi. (1999) “Effects of Tone and Focus on the Formation and Alignment of f0 Contours.” Journal of
Phonetics 27: 55-107.
Xu, Yi. (1997) “Contextual Tonal Variations in Mandarin.” Journal of Phonetics, 25: 61-83.
ELISABETH SELKIRK

BENGALI INTONATION REVISITED:

An Optimality Theoretic Analysis in which FOCUS Stress Prominence
Drives FOCUS Phrasing*

1. INTRODUCTION
In this paper, I want to investigate the consequences of an idea about focus prosody
that was first put forward by Jackendoff 1972, namely the hypothesis that the focus-
phonology interface in grammar is expressed as a relation between focus-marked
syntactic constituents on the one hand, and prosodic stress prominence on the other.
A strong form of the hypothesis, advocated in Truckenbrodt’s 1995 thesis and
pursued here and in other recent work of mine (e.g. Selkirk 2002), is that the focus-
phonology interface consists only of interface constraints on the relation between
syntactic focus and prosodic prominence. All the other predictable, non-
morphological, phonological properties of focus are claimed to be derived as a
consequence of phonological markedness constraints on the relation between
prosodic prominence and other aspects of phonological representation. This
proposal can be called the Focus-Prominence theory of the focus-phonology
interface. I think this theory provides an insightful account of the array of
phonological properties that are associated with focus crosslinguistically, and at the
same time explains the observed generalizations about focus projection and the
distribution of focus-related prominence within the sentence. The question of focus
projection is not addressed in this paper (but see Selkirk 1999, 2000; Selkirk and
Katz, in preparation). What I want to show here is that Focus Prominence theory
provides the basis for an understanding of focus-related phonological phrasing. In
this I am following a path first charted out by Truckenbrodt 1995.
Focus constituents are claimed to display a variety of prosodic properties
crosslinguistically:

i. appearance of special tonal morphemes¹
ii. appearance of default pitch accent²
iii. demarcation by a prosodic phrase edge³
iv. presence of main stress of a prosodic phrase⁴
v. appearance in a higher pitch range⁵
vi. vowel length under main phrasal stress⁶
(This list should not be taken to be exhaustive.)

C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 215–244.
© 2007 Springer.
Should there be distinct focus-prosody interface constraints to account for each of
the diverse non-morphemic prosodic properties listed above? I think not. The Focus
Prominence hypothesis holds that there is a prevalent commonality to the
phonological expression of focus, in languages of diverse types, and that it lies in
the level of stress prominence assigned within a focus constituent. The appeal of this
hypothesis is that stress prominence, at the appropriate designated level, is quite
plausibly responsible for the various other reported phonological reflexes of focus,
be it the appearance of default pitch accents to mark stress prominence, the
lengthening of vowels under that prominence, or the appearance of a phonological
phrase edge adjacent to that prominence.
So under the Focus Prominence theory there are no constraints directly relating
predictable pitch accent or prosodic phrasing to the focus-marking of constituents in
the interface syntactic structure. For example there would be no constraints of the
form Align L/R (Focus, π) where π is a prosodic constituent of a selected level.
Rather, following Truckenbrodt’s 1995 proposal, the presence of a π edge flanking a
focus would be the consequence of a constraint calling for the focus constituent to
contain a prosodic prominence together with a prosodic alignment constraint
calling for a prominence to be located at the edge of the prosodic constituent of
which it is the head⁷.
Bengali presents an apparent counterexample to the claim made by Focus
Prominence theory that the phonological phrase edge alignment that appears with
focus can be derived through the markedness-driven alignment of a prosodic phrase
edge with the stress prominence of that phrase. The prominence-based theory of
focus phrasing predicts a phonological phrase edge at only one edge of a focus
constituent, the edge where the focus prominence is located. But according to Hayes
and Lahiri in their classic 1991 article on Bengali intonation, a focus constituent in
Bengali is flanked by phonological phrase edges at both the right and the left edges
of the focus. The stress prominence of a phonological phrase in Bengali is claimed
by Hayes and Lahiri to be located at the left of the phrase. So within the Focus
Prominence theory, the appearance of a phonological phrase edge at the left edge of
a focus constituent could be derived through an instance of the familiar sort of
surface phonological markedness constraint Align R/L (π-prom, π), which aligns a
π-prominence with a π-edge (π-prom is the prominent daughter constituent of π (its
head)). It is the right phrase edge with focus that poses the problem. There is no
evidence elsewhere in the language for the alignment of a phonological phrase with
the right edge of a constituent. So Hayes and Lahiri propose a focus interface
alignment constraint—formulable as Align R (Focus, ϕ)-- to account for the right
phrase edge (ϕ stands for phonological phrase). The present theory, which seeks to
eliminate focus-phrasing alignment constraints from the universal interface
constraint repertoire and to reduce all nonmorphemic, phonological, reflexes of
focus to reflexes of stress prominence, will require some principled non-prominence
based explanation for the right phrase edge with focus in Bengali. The purpose of
this paper is to put forward such an explanation.
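
The contrast between the two accounts can be made concrete with a small sketch. It assumes a deliberately simplified representation in which a phrasing is a list of phonological phrases and each phrase is a list of words. The function names, this representation, and the plain ASCII word spellings are illustrative assumptions, not part of the analysis itself. The first function counts violations of a left-aligning prominence constraint of the Align (π-prom, π) type; the second reports whether a focused word sits at the left and right edges of its phrase.

```python
def align_prom_left_violations(phrase, head):
    """Count the words separating the prominent word (head) from the left phrase edge."""
    return phrase.index(head)


def focus_edges(phrases, focus):
    """Return (at_left_edge, at_right_edge) for the phrase containing the focus."""
    for phrase in phrases:
        if focus in phrase:
            return (phrase[0] == focus, phrase[-1] == focus)
    raise ValueError("focus word not found in any phrase")


# Phrasing expected if only prominence-driven left alignment applies:
# the focused word begins a phrase, but nothing forces a right edge after it.
predicted = [["ami", "rajar"], ["chobir", "jonno", "taka", "anlam"]]
# Phrasing with the focus flanked by phrase edges on both sides.
flanked = [["ami", "rajar"], ["chobir"], ["jonno", "taka", "anlam"]]

print(align_prom_left_violations(predicted[1], "chobir"))  # 0
print(focus_edges(predicted, "chobir"))  # (True, False)
print(focus_edges(flanked, "chobir"))    # (True, True)
```

Both phrasings satisfy the left-aligning prominence constraint equally well, so that constraint alone cannot distinguish them. This is the gap that the right phrase edge in Bengali exposes.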
An example of the focus phrasing seen in Bengali appears in sentence (2) below.
(2) is a sentence with a sentence-medial contrastive focus appearing on a medial
constituent within a left branching object noun phrase. The surface syntactic
structure which we tentatively assume for this focus-marked sentence structure is as
in (1). The prosodic phrasing structure in (3), which is an all-new, out of the blue,
utterance of the same sentence structure, but minus the focus marking, should be
contrasted to that in (2)⁸.

(1) [S [NP ami] [VP [NP [PP [NP [NP raj‡a-r] [N-FOC c‡hobi-r]] [P j‡onno]] [N ˇaka]] [V anlam]]]
    ami raj‡a-r c‡hobi-r j‡onno ˇaka anlam
    I king’s PICTURES for money gave
    ‘I gave money for the king’s PICTURES.’

(2)
phrase-edge not prominence-related
L* HP L* HP LI
((ami raj‡ar) ( c‡hobir ) j‡onno ˇaka anlam )IP
I king’s PICTURES for money gave.

(3) LHP L <HP> H* LI
((ami) (raj‡ar c‡hobir j‡onno ˇaka) (anlam))IP
I king’s pictures for money gave

These are both declarative utterances. The phrasing of the neutral focus sentence (3)
puts the subject, the complex NP object, and the verb each in a separate
phonological phrase. (Nonfinal phonological phrases are in general marked by two
tonal events--the presence of a L* pitch accent on the main stressed syllable in the
phrase and the presence of a HP peripheral tone at the right edge of the phrase.) The
focus sentence (2) alters the otherwise default phrasing in flanking the focus
constituent, here a head noun internal to the complex noun phrase, with the left and
right edges of a phonological phrase. The arrow marks the problematic right
phonological phrase edge found at the right edge of the focus, the phrase edge that
the Focus Prominence hypothesis can’t account for.
Aside from the flanking of a contrastive focus constituent with phonological
phrase edges, there is another important property of sentences with focus in Bengali,
namely the absence of any phonological phrase following the focus constituent. This
is visible in (2) through the absence of any pitch accent or nonfinal peripheral tones
following the focus. We will see that this apparent “dephrasing” can also be given
an explanatory account by Focus Prominence theory, in terms already suggested by
Truckenbrodt 1995 for Japanese (section 2).

2. SKETCHING OUT THE FOCUS PROMINENCE THEORY

I am going to assume that an utterance is simultaneously analyzed in terms of two
discrete types of structure—morphosyntactic and phonological. Specifically, the
assumption is that the two output representations defined by the grammar for a
sentence, namely the surface morphosyntactic representation (PF) and the surface
phonological representation (PR), share a terminal string. This assumption about the
interfacing output representations predicts three general types of constraint that
would be defined on output representations alone: morphosyntactic markedness
constraints, phonological markedness constraints, and interface constraints relating
morphosyntactic and phonological properties of the output. Syntactic structure-
prosodic structure alignment constraints such as Align-L (XP, MaP) are a classic
type of interface constraint. They have a demarcative function, in calling for the
edge of a designated category in the syntax to correspond to the edge of a designated
category of prosodic structure (cf. Selkirk 1986 et seq). In addition, the family of
Wrap constraints proposed by Truckenbrodt 1995, 1999 has a cohesive function in
requiring that a syntactic constituent of a particular level be entirely contained
within a prosodic phrase of a particular level. These Align and Wrap constraints on
the syntax-phonology interface clearly have the function of carrying over into the
hierarchically organized phonological/prosodic representation of the sentence
salient, landmark, properties of the morphosyntactic phrase structure constituency.
These constraints, which are apparently cross-categorial, ignore any featural
properties of the morphosyntactic representation. Constraints relating focus and
prosodic prominence of the sort being proposed in Focus-Prominence theory belong
to a distinct class of syntax-phonology interface constraints. A morphosyntactic
constituent with the property of being a focus is assumed to be focus-marked
(Jackendoff 1972, Selkirk 1984, 1995, Rooth 1992, 1995, Schwarzschild 1999 and
many others), so that any constraint calling for a focus-marked constituent in PF to
contain a certain level of prosodic prominence in PR is a syntax-phonology interface
constraint. But the difference between this and the Align/Wrap constraints is that
information structural salience, represented as a property of individual
morphosyntactic constituents, is being translated into prosodic structure salience, or
prominence. Thus the two defining properties of prosodic structure-- prosodic
grouping structure and prosodic prominence, or headedness— correspond to the two
faces of surface morphosyntactic structure—phrase structural grouping and an
encoding of information structural prominence.
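
As a rough illustration of how the demarcative Align constraints and the cohesive Wrap constraints differ, the following Python sketch counts violations over spans of the shared terminal string. The span encoding, the function names, and the choice of which word sequences count as XPs are assumptions made for this example and are not part of the analysis; the two phrasings loosely follow the neutral phrasing in (3) and the focus phrasing in (2).

```python
from typing import List, Tuple

# A span is a half-open interval [start, end) over word positions:
# 0 I, 1 king's, 2 pictures, 3 for, 4 money, 5 gave
Span = Tuple[int, int]

def align_violations(xps: List[Span], phrases: List[Span], edge: str = "L") -> int:
    """Count XPs whose left (or right) edge coincides with no prosodic phrase edge."""
    i = 0 if edge == "L" else 1
    phrase_edges = {p[i] for p in phrases}
    return sum(1 for xp in xps if xp[i] not in phrase_edges)

def wrap_violations(xps: List[Span], phrases: List[Span]) -> int:
    """Count XPs that are not entirely contained within a single prosodic phrase."""
    return sum(
        1 for (s, e) in xps
        if not any(ps <= s and e <= pe for (ps, pe) in phrases)
    )

# Assumed XPs: the subject (word 0) and the complex object (words 1-4).
xps = [(0, 1), (1, 5)]

# Neutral phrasing, as in (3): (I) (king's pictures for money) (gave)
neutral = [(0, 1), (1, 5), (5, 6)]
# Focus phrasing, as in (2): (I king's) (PICTURES) for money gave
focus = [(0, 2), (2, 3)]

for name, phrasing in [("neutral", neutral), ("focus", focus)]:
    print(name,
          "Align-L:", align_violations(xps, phrasing, "L"),
          "Wrap:", wrap_violations(xps, phrasing))
```

Under these assumptions the neutral phrasing incurs no violations, while the focus phrasing violates both constraints once, because the object XP neither begins at a phrase edge nor fits inside a single phrase.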
What then might be the formulation of the Focus-Prominence interface
constraints that are being given responsibility for at least some of the focus phrasing
properties of Bengali? Providing a fully motivated answer to this question is a topic
of ongoing research, but it is possible to say something here of the ideas under
consideration. Truckenbrodt 1995 and Rooth 1996 propose a Focus Prominence
constraint that is essentially syntagmatic in character:

(4) Focus Prominence Constraint, syntagmatic (Truckenbrodt 1995, Rooth
1996): A focus is more prominent than any other element within the focus
domain. [where focus and focus domain are syntactic/semantic constituents]

In its definition of focus prominence this theory does not distinguish between types
of focus (e.g. contrastive vs. presentational) and their associated types of domain
constituent. Nor does the theory assure a regular prosodic level of prominence for
the different focus types. In other work, however, this simplicity is shown to be
problematic for the characterization of at least a certain range of focus phenomena
(see Selkirk 2000, 2002; Sugahara 2002, 2003). So in what follows I will assume a
paradigmatic theory of Focus Prominence, leaving open the question whether the
syntagmatic version above is also required in grammar.
The paradigmatic theory of Focus Prominence that I am entertaining posits a
family of Focus Prominence constraints of the general form in (5), according to
which a focused constituent of a particular morphosyntactic structure type must
contain a phonological prominence of a particular prosodic structure type:

(5) Focus-Prominence Constraint Family, paradigmatic (Selkirk 2000a, 2002)

ƒ(Xn) ⊂ ∆(π)

“The terminal string of an ƒ-focussed syntactic constituent of level Xn in PF (the
interface morphosyntactic representation) is a terminal string of PR (the interface
phonological representation) which contains the designated terminal element ∆
of a prosodic constituent of level π.”
i. ƒ is a variable over focus types (contrastive, presentational, …)
ii. Xn is a variable over syntactic constituent types (word, phrase, …)
iii. ∆ stands for “designated terminal element of” (see below), and
iv. π is a variable over prosodic constituent types

Of particular relevance to the current paper is a constraint relating the presence
of a contrastively focused constituent in the syntax to the presence of a prosodic
prominence of the Intonational Phrase level in the phonology. The formulation in (6)
appears to achieve the correct results for Bengali.

(6) FOCUS Prominence: FOCUS(α) ⊂ ∆IP

“The terminal string of a contrastively focused (“big” FOCUS ) constituent of
level α in PF (=α FOC) is a terminal string of PR which contains the designated
terminal element ∆ of an Intonational Phrase.”

Contrastive focus invokes a set of alternatives and its semantics can be
characterized by alternatives semantics (Rooth 1992, 1995). I am suggesting here
that it is an intonational phrase-level prominence that is called for in a contrastive
focus constituent (notated with big caps as FOCUS and referred to as ‘big’ focus).
As for other Focus Prominence constraints, they would include, at a minimum,
constraint(s) relating words or phrases that are in presentational focus to presumably
lower levels of prosodic prominence, for example:

(7) Focus XP Prominence: Focus(XP) ⊂ ∆MaP

“The terminal string of a presentationally focussed (“small” Focus) constituent
of level XP in PF (=XP Foc) is a terminal string of PR which contains the
designated terminal element ∆ of a Major Phrase.”

A presentational focus has the property of newness in the discourse, and its
semantics is characterizable in terms of the theory proposed by Schwarzschild
(1999). It will sometimes be notated with initial caps as Focus and nicknamed as
‘small’ focus. As for the prosodic category name ‘major phrase’, this is the level of
phrasing immediately below the intonational phrase, sometimes also referred to as
‘intermediate phrase.’ I have chosen the term ‘major phrase’ for its mnemonic value,
since the level of prosodic major phrase is identified by its alignment with the
morphosyntactic maximal projection phrase.
Notice that these hypothesized constraints of the paradigmatic Focus Prominence
theory make the felicitous prediction that the phonological properties of big,
contrastive, FOCUS are either a superset of those of small, presentational, Focus, or,
if different, then are characteristic of a higher level of prominence than those of
small focus. This is because, given the nature of prosodic structure, the ∆IP called
for in a big focus constituent is necessarily also a ∆MaP, and ∆MaP is what is called
for in a presentational focus phrase. That is, both contrastive and presentational
focus will be called on by constraints to show the properties of a ∆MaP, but only
contrastive focus will be called on to show the properties of the higher level ∆IP.
Call this prediction big focus-small focus containment. This point becomes clear
when we examine the definitions for designated terminal element and prosodic head
and apply them to an example.

(8) Def: A head of a prosodic constituent π is (i) the most prominent prosodic
constituent immediately dominated by π (the π-prom of π) or (ii) the most
prominent prosodic constituent immediately dominated by a head of π.

(9) Def. The designated terminal element (DTE, or ∆) of a prosodic constituent
π is that mora in the terminal string of π that is dominated by the chain of heads
of π.

Note that the sample representation (10) satisfies the Focus Prominence constraint in
(6), which requires that the contrastively focused word Mississippi contain the
designated terminal element of an intonational phrase IP.

According to the recursive definition of head given above, the boldfaced head
constituents are all heads of IP. Assuming that moras are part of the terminal string,
the penultimate mora in Mississippi is the designated terminal element of IP. This is
because it is the head mora of the head syllable of the head foot of the head prosodic
word of the head minor phrase of the head major phrase of the intonational phrase.
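
The recursive definitions in (8) and (9) can be made concrete with a small Python sketch that follows the chain of heads down a prosodic tree for "visit [Mississippi]FOC". The `Node` class, the one-mora-per-syllable simplification, and the particular head choices (trochaic feet, final foot as head of the focused word) are assumptions introduced for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    """A prosodic constituent; `head` is the index of its most prominent daughter."""
    label: str
    children: List["Node"] = field(default_factory=list)
    head: Optional[int] = None  # None for terminal moras

def dte(node: Node) -> Node:
    """Descend through the chain of heads until a terminal mora is reached."""
    while node.children:
        node = node.children[node.head]
    return node

def syllable(name: str) -> Node:
    """A syllable dominating a single mora (a simplifying assumption)."""
    return Node("σ", [Node(name)], head=0)

# 'visit': one trochaic foot
visit = Node("PWd", [Node("Ft", [syllable("vi"), syllable("sit")], head=0)], head=0)
# 'Mississippi': two trochaic feet, the second being the head foot
mississippi = Node("PWd", [
    Node("Ft", [syllable("Mi"), syllable("ssi-1")], head=0),
    Node("Ft", [syllable("ssi-2"), syllable("ppi")], head=0),
], head=1)

mip = Node("MiP", [visit, mississippi], head=1)
major = Node("MaP", [mip], head=0)
ip = Node("IP", [major], head=0)

print(dte(ip).label)           # the penultimate mora of Mississippi
print(dte(ip) is dte(major))   # the DTE of IP is also the DTE of its head MaP
```

The second line of output reflects the containment point made above: because the head chain of IP passes through its head MaP, any mora that is a ∆IP is necessarily also a ∆MaP.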
Turning to Bengali, we will assume that the focus type whose prosodic
properties are being described in the Hayes and Lahiri paper is big, contrastive,
focus. Their examples of focus involve cases of explicitly contrastive focus or
answers to wh-questions. So we will be investigating in Bengali the consequences of
assuming that a big focus (FOCUS) constituent contains the DTE of an Intonational
Phrase, as called for by the big FOCUS Prom constraint in (6). The properties of
presentational focus in Bengali have not yet been submitted to a systematic
investigation.

(10) IP
|
MaP π-prom of IP = MaP
|
MiP π-prom of MaP = MiP

PWd PWd π-prom of MiP = PWd

Ft Ft Ft π-prom of PWd = Ft

σ σ σ σ σ σ π-prom of Ft = σ
| | | | | |
µ µ µ µ µ µ π-prom of σ = µ
v Ι s Ι t [M Ι ss Ι ss Ι pp Ι ]FOC

= ∆IP (dom. by the head σ, Ft, PWd, MiP, MaP of IP)

(Underlining will be consistently taken to denote head status.)

Let’s look at the general shape of the analysis I am proposing for the flanking of
a contrastive FOCUS-marked syntactic constituent by phonological phrase edges in
Bengali. First, FOCUS-Prom (6) calls for a ∆IP within the FOCUS constituent. This
has the consequence that the ∆IP is dominated by the head MaP of IP, the head of
that head MaP, and so on, as seen in the partial representation in (11) below:

(11) [Partial prosodic structure 1]

FOCUS-Prom ⇒ IP
|
MaP
|
PWd
|
Ft
|
σ
|
µ = ∆IP
[[ami] [[[[raj‡a-r] [c‡hobi-r]FOC] j‡onno] ˇaka] [anlam]]]
I king’s PICTURES for money gave
‘I gave money for the king’s PICTURES’
In meeting the requirements of the FOCUS-Prominence constraint, head
constituents are defined at all prosodic levels lower than IP. Now, the grammar
contains a class of prosodic markedness constraints that call for the alignment of
these prosodic head constituents with the right or left edge of their mother prosodic
constituents (McCarthy and Prince 1993) such as the well-attested Align R/L (Ft,
PWd). Hayes and Lahiri argue that a phonological phrase has its head at the left
edge of the phrase, giving a pattern of left edge phonological phrase prominence. I
will express this constraint as Align L (PWd, MaP), assuming that the phonological
phrase appealed to in the constraint is at the level of the major phrase and that it is a
prosodic word level head-constituent that is aligned with the MaP left edge. (This
analysis ignores for reasons of expository convenience the possibility that there may
be an additional level of phonological phrase (the Minor Phrase) intervening
between PWd and MaP, as does the analysis of Hayes and Lahiri.) Following
Truckenbrodt’s 1995 analysis of the left phonological phrase edge that appears with
FOCUS in Japanese, my analysis of Bengali gives this prosodic alignment constraint
the responsibility for the flanking of Bengali FOCUS with a left phonological phrase
edge, as shown in (12a). [Note that (12a) is only a partial prosodic tree and (12b) is a
partial prosodic labelled bracketing.]

(12) [Partial prosodic structure2]

Align L (PWd, MaP) ⇒

a. IP
|
MaP MaP
|
PWd
|
Ft
|
σ
|
… µ …
| [H]FOC [L]DECL
[[ami] [[[[raj‡a-r [c‡hobi-r]FOC] j‡onno] ˇaka] [anlam]]]
I king’s PICTURES for money gave
‘ I gave money for the king’s PICTURES’

b.
IP ((ami raj‡ar)MaP MaP(c‡hobir j‡onno ˇaka anlam)IP
I king’s PICTURES for money gave.
On this proposal, then, a constraint like AlignL (PWd, MaP) has in general two
functions. Here it induces the presence of a phonological phrase edge at the edge of
a prosodic prominence whose position with respect to the syntactic structure is fixed
by the FOCUS-Prom constraint. In cases where the location of the prominence is not
fixed by an interface constraint, the same constraint predicts that the prominence
will fall wherever the grammar determines that the left edge of a phonological
phrase might appear. This two-fold effect follows from the fact that the locus of
prosodic prominence may either be fixed independently in which case the edge
comes to align with it, or the locus may not be fixed independently, in which case
the prominence locates itself wherever the grammar may call for a phrase edge.
As for the appearance of the right phonological phrase edge seen in (13) at
the right edge of FOCUS, I argue in the following section that it is to be ascribed to
the presence of the tonal morpheme [H]FOC at the right edge of the FOCUS
constituent in morphosyntactic structure.

(13) [Partial prosodic structure3]

Align R ([H]FOC, MaP) ⇒

a. IP
|
MaP MaP
|
PWd PWd …………..
|
Ft
|
σ
|
… µ …
| [H]FOC [L]DECL
[[ami] [[[[raj‡a-r [c‡hobi-r]FOC] j‡onno] ˇaka] [anlam]]]
I king’s PICTURES for money gave
‘ I gave money for the king’s PICTURES’

b.                         [H]FOC               [L]DECL
IP((ami raj‡ar)MaP MaP(c‡hobir)MaP j‡onno ˇaka anlam)IP

The (a) examples of these partial representations contain the morphosyntactic
labelled bracketing of the sentence, which includes the marking for contrastive big
FOCUS on the phrase-medial noun pictures, as well as what I will argue below are
the tonal morphemes for FOCUS and DECLARATIVE, [H]FOC and [L]DECL
respectively. The hypothesis is that the morphemic contrastive FOCUS tone is
lexically specified to appear at the right edge of a phonological phrase, through the
effect of a morpheme-specific alignment constraint Align R ([H]FOC, MaP). This
constraint induces the presence of the phonological phrase edge at the position at the
right edge of the FOCUS constituent that the FOCUS morpheme is hypothesized to
occupy in morphosyntactic structure. Hayes and Lahiri in fact argue that the H
phrase-edge tone of FOCUS is morphemic in Bengali; the present proposal simply
draws the consequences of that morphemic status within the framework of
assumptions adopted here. In section 3 I give arguments for the morphemic status of
the focus H tone.
There is a final phrasing property of big FOCUS sentences, one that is also
arguably a prosodic prominence alignment effect, namely a “dephrasing” to the right
of the FOCUS constituent. No tones appear between the right edge of the FOCUS
constituent and the end of the sentence in Bengali. This can be seen in example (2).
The post-FOCUS stretch is demarcated at the beginning by the [H] morphemic tone
that appears at the right edge of the FOCUS and at the end by the sentence-final
illocutionary tonal morpheme. Between them, there are no prominence-marking
pitch accents, nor any phrase-edge-marking H peripheral tones. Since tones mark
these prosodic structure landmarks of a phonological phrase by default, the absence
of the tones is most straightforwardly explained by the post-FOCUS absence of the
phonological phrasing and prominence that trigger the presence of these tones. This
sort of post-FOCUS “dephrasing” is argued by Truckenbrodt 1995 to result from a
constraint which calls for the prosodic head of an intonational phrase to align with
the right edge of the IP. Any phonological phrase intervening between the FOCUS
phrase and the right edge of the intonational phrase would be disaligning and so
produce a non-optimal prosodic representation for the sentence. In particular, after
FOCUS one never sees the appearance of the phrasing normally associated with the
matrix verb. So the provisional constraint “Verb-ϕ Align” (see footnote 8) must be
dominated by the IP-level prosodic alignment constraint. I will assume that the
“dephrasing” observed in the optimal candidate moreover constitutes a violation of
Exhaustivity (IP), hence:

(14) Align R (MaP, IP) >> Exhaustivity (IP), “Verb-ϕ Align” ⇒

MaP MaP

PWd PWd PWd PWd PWd PWd

L* H L* [H]FOC [L]DECL
IP((ami raj‡ar )MaP MaP (c‡hobi-r)MaP j‡onno ˇaka anlam)IP

The section below is devoted to establishing that the account I have proposed for
the appearance of a phonological phrase edge at the right of the FOCUS constituent
is well founded. It will rely on establishing the morpheme status of the H tone that
flanks the FOCUS constituent on the right as well as establishing the existence of a
morpheme-specific alignment constraint that may induce the presence of a
phonological phrase edge at the edge of the FOCUS tonal morpheme.

3. TONAL MORPHEMES IN BENGALI SENTENCE TONOLOGY

The preceding analysis of Bengali FOCUS prosody has adopted many of the
assumptions of Hayes and Lahiri’s masterful (1991) account: the notion that
phrasing is central to an account of the distribution of tones; the notion that phrase
stress is leftmost in the phonological phrase, while stress prominence in the
intonational phrase is rightmost; the notion that a [H]FOC tonal morpheme must be
posited. Where the account proposed here crucially differs from Hayes and Lahiri’s
is in giving the [H]FOC morpheme responsibility for the FOCUS-related phrasing. A
more general difference is that the account offered here is a constraint-based
optimality theoretic account which seeks to provide an explicit, exhaustive, analysis
of all the relevant tonal patterns in Bengali as well as of all the relevant phrasing
patterns in the language. Specifically, the aim is to provide a complete account of the
tonological differences between declarative and question utterances under both
“neutral” and contrastive focus conditions. We will see that the H tone that appears
at the right edge of a FOCUS constituent has a significantly different behavior from
the peripheral default H tone that is the regular marker of the right edge of a
phonological phrase.

3.1. The intonation of neutral focus sentences

3.1.1. The default L* HP pattern for phonological phrases
To begin, we will look at Bengali sentences with so-called neutral or broad focus,
starting with a treatment of the default L* pitch accent and H edge tones that mark
nonfinal phonological phrases, as seen in (3). A pitch accent is simply a tone whose
distribution is defined with respect to a prosodic prominence. The insertion of a
default, non-lexically specified, pitch accent is a type of prosodic enhancement and
must be the consequence of a phonological markedness constraint calling for the
designated terminal element of a prosodic constituent to be associated with some
tone. Constraints of this type are known to play a role in the world’s languages (footnote 9). The
insertion of a default edge tone is also a variety of prosodic enhancement, this time
serving to demarcate prosodic phrase edges. It must be the result of a markedness
constraint calling for the edge of a phrase to be aligned with a tone. Again, such
constraints are attested crosslinguistically (footnote 10).
It is likely not a coincidence that the pitch accent and peripheral tone of the
Bengali phonological phrase have polar tonal values, and indeed Hayes and Lahiri
argue that the Obligatory Contour Principle (OCP) has a central place in Bengali
tonology. For reasons to be seen right below, the OCP-based analysis offered here
picks out the pitch accent as the tonal element whose polar value is a function of the
other. The High value of the peripheral tone will be specified by an edge-tone
alignment constraint, as in (15a). With the High edge specified by constraint, the
introduction of the polar Low value for the pitch accent can be achieved by a
combination of the prominence-tone association constraint in (15b) and the OCP.

(15) a. Align R (MaP, H tone)

“Align the R edge of a major phonological phrase with the R edge of a H tone.”
[= the source of the default High edge tone]

b. Associate (∆MaP, Tone)

“Associate the designated terminal element of a major phonological phrase, i.e.
the head mora of the MaP, with some Tone.” [= the source of a pitch accent on
∆MaP, which is realized as either L or H, as dictated by the OCP]

The tableau in (16) illustrates the role for these constraints in deriving the tones
of the initial phonological phrase from the sentence in (2):

(16)
Input: …. [ ami ] [[ raj‡a-r]...

    Candidate                 Tones | OCP | Align R  | Assoc     | *Tone
                                    |     | (MaP, H) | (∆MaP, T) |
    a. … (ami raj‡ar)MaP ….   ø     |     | *!       | *         |
 ⇒  b. … (ami raj‡ar)MaP      L H   |     |          |           | **
    c. … (ami raj‡ar)MaP      H H   | *!  |          |           | **

The two constraints in (15), which call for the presence of tone in the representation,
crucially outrank the constraint *Tone, which minimizes the presence of tone in the
representation. The OCP adjudicates the choice of tone, and is not crucially ranked
with respect to the others. (The non-ranking among the higher constraints is
provisional.)
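
The evaluation in (16) can be mimicked by a short Python sketch, given here only as an illustration. It represents a candidate as a pair (pitch accent, edge tone), generates all nine combinations, and picks the one with the lexicographically smallest violation profile. The encoding and the fixed order imposed on the three unranked top constraints are assumptions of the sketch; the order among them does not affect the outcome here, since the winner violates none of them.

```python
from itertools import product

TONES = (None, "L", "H")  # None means that no tone is present

def ocp(c):
    """OCP: the pitch accent and the edge tone may not be identical."""
    accent, edge = c
    return int(accent is not None and accent == edge)

def align_r_h(c):
    """Align R (MaP, H): the right edge of the phrase must carry a H tone."""
    return int(c[1] != "H")

def assoc(c):
    """Assoc (DTE of MaP, Tone): the head mora must bear some tone."""
    return int(c[0] is None)

def star_tone(c):
    """*Tone: one violation per tone in the candidate."""
    return sum(t is not None for t in c)

# {OCP, Align R, Assoc} >> *Tone; the order within the top stratum is arbitrary.
RANKING = [ocp, align_r_h, assoc, star_tone]

def profile(c):
    """Violation vector, compared lexicographically under strict domination."""
    return tuple(constraint(c) for constraint in RANKING)

candidates = list(product(TONES, TONES))
winner = min(candidates, key=profile)
print(winner, profile(winner))  # ('L', 'H') (0, 0, 0, 2)
```

Candidates a. and c. of the tableau correspond to `(None, None)` and `('H', 'H')`, with profiles `(0, 1, 1, 0)` and `(1, 0, 0, 2)` respectively.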
There is another role for the OCP. In addition to assuring the non-identical
character of the tones introduced by default into the representation, as here, Hayes
and Lahiri also propose that it is responsible for the failure of the default H edge
tone to appear in the first place, when it is followed by another H tone in the
utterance. This effect is seen in (3), where the bracketed peripheral <H> tone is
actually not realized, because of the H* that follows in the next phrase. The absence
of that peripheral H will be analyzed below.

3.1.2. The tonal patterns of final phrases

The default L* H phrasal tone pattern is preempted in the final phrase of the
sentence by the tonal morphemes expressing the illocutionary force of the sentence.
The patterning of tones in the final phonological phrase of the sentence is
contrastive, and is a function of the declarative vs. interrogative status of the
utterance, together with the FOCUS status of the elements within the final phrase.
The basic, neutral focus, declarative ends in a H* pitch accent followed by a L
boundary tone, while the basic, neutral focus, yes-no interrogative ends in a L* plus
HL boundary tone combination.
Compare the declarative non-FOCUS sentence in (17) with the non-FOCUS
interrogative in (18).

(17) LH L H* [L]DECL

(ami) (raj‡ar c‡hobir j‡onno ˇaka) (anlam)
I king’s pictures for money gave
“I gave money for the king’s pictures.”

(18) LH L H L* [HL]QUES

(ami) (raj‡ar c‡hobir j‡onno ˇaka) (anlam)
I king’s pictures for money gave?
“Did I give money for the king’s pictures?”

According to Hayes and Lahiri, the H* L of the declarative is composed of an
underlying H* declarative morpheme followed by a L sentence-final ‘neutral’
morpheme, while the L* HL of the yes-no interrogative is composed of an
underlying L* interrogative morpheme and a peripheral HL ‘yes-no’ morpheme.
Operating within a pre-OT framework of assumptions, Hayes and Lahiri propose
that the OCP has a role to play in determining possible tonal contours in Bengali,
but they do not exploit this idea fully in the analysis of these contours. I would like
to suggest, as an alternative OT-based analysis, that the final boundary L of the
declarative is the declarative morpheme itself, and that the H* pitch accent
preceding the declarative morpheme [L]DECL is not morphemic. Rather that H* is a
default pitch accent whose quality is determined by the OCP on the basis of the L
quality of the declarative morpheme, as shown schematically in (19a). Similarly, the
HL boundary combination can be taken to be the morpheme for the yes-no
interrogative and the preceding L* pitch accent in the final phrase can be determined
by OCP-respecting default, as shown in (19b):

(19) PF PR
(morphosyntactic interface) (surface phonological representation)

a. [[…….]InflP[ L ]DECL ]ForceP (….. ( ∆H….. µ L)ϕ)IP

b. [[…....]InflP[ HL ]QUES]ForceP (….. ( ∆L….. µ HL)ϕ)IP

In other words, I am suggesting that the illocutionary tonal morphemes in Bengali
consist only of boundary tones, as is the case in many tone languages, for example.
In the interface PF representation, these illocutionary morphemes are the functional
heads of a syntactic projection that, following Rizzi 1997, will be referred to as a
Force Phrase. In the surface phonological representation PR, the presence of the
default pitch accent is determined by Assoc (∆MaP, Tone) and the quality of the
pitch accent tone is determined by the quality of the illocutionary force morpheme
and the OCP:

(20)
                                          | Realize | Realize  | OCP | Assoc     | *Tone
                                          | [L]DECL | [HL]QUES |     | (∆MaP, T) |
 Declarative:
 […...[anlam]V ]InflP[ L ]DECL ]ForceP
 ⇒  a. H* [L]    (…… (anlam)MaP)IP        |         |          |     |           | **
    b. L* [H]    (…… (anlam)MaP)IP        | *!      |          |     |           | **
    c. L* [L]    (…… (anlam)MaP)IP        |         |          | *!  |           | **
    d.    [L]    (…… (anlam)MaP)IP        |         |          |     | *!        | *
 Interrogative:
 […..[anlam]V]InflP[ HL ]QUES ]ForceP
 ⇒  a. L* [HL]   (…… (anlam)MaP)IP        |         |          |     |           | ***
    b. H* [L]    (…… (anlam)MaP)IP        |         | *!       |     |           | **
    c. H* [HL]   (…… (anlam)MaP)IP        |         |          | *!  |           | ***
    d.    [HL]   (…… (anlam)MaP)IP        |         |          |     | *!        | **

The constraints Realize [L]DECL and Realize [HL]QUES mentioned in the tableau
assure that the tones of a tonal morpheme in the input are maintained in the output,
in the quality specified in the input; these constraints are members of the family of
constraints which require that a morpheme have some phonological realization in
the output. I will assume that the general character of these Realize constraints for
tonal morphemes is as in (21).

(21) Realize [Tone(s)]M ( = a constraint schema)

The tone(s) of a tonal morpheme [T1 (T2)]M in the morphosyntactic input
representation must be realized as such in the output phonological
representation.

Together with the OCP, these faithfulness constraints assure that the default pitch
accent in the final phrase is the polar opposite of the following lexically specified
boundary tone morpheme. So just as the quality of the L* pitch accent in nonfinal
phrases is determined by constraint, so is the quality of the pitch accents in the final
phrase.
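
The polarity effect described here can be sketched in a few lines of Python: the boundary morpheme is fixed by Realize, and the default pitch accent comes out as its polar opposite. The function name `final_contour`, the candidate space, and the simplification that the OCP compares the pitch accent only with the first tone of the boundary morpheme are assumptions of this illustration.

```python
def final_contour(morpheme):
    """Return the optimal (pitch accent, boundary) pair for a final phrase.

    `morpheme` is the input illocutionary morpheme: "L" (declarative)
    or "HL" (yes-no interrogative).
    """
    accents = (None, "L", "H")
    boundaries = ("L", "H", "HL")

    def profile(candidate):
        accent, boundary = candidate
        realize = int(boundary != morpheme)        # Realize [Tone(s)]M
        ocp = int(accent is not None and accent == boundary[0])
        assoc = int(accent is None)                # Assoc (DTE of MaP, Tone)
        star_tone = len(boundary) + int(accent is not None)
        return (realize, ocp, assoc, star_tone)    # top three >> *Tone

    candidates = [(a, b) for a in accents for b in boundaries]
    return min(candidates, key=profile)

print(final_contour("L"))   # ('H', 'L'): H* [L], the declarative contour
print(final_contour("HL"))  # ('L', 'HL'): L* [HL], the interrogative contour
```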
Note that the constraint MaxTone, which calls for an input tone to have a
corresponding tone in the output (McCarthy and Prince 1995), cannot be given the
function of maintaining the tonal morphemes in the output. Bengali is not a tone
language, in which tonal contrasts in morphemes which also have segmental content
are preserved on the surface. Rather, assuming Richness of the Base (Prince and
Smolensky 1993), *Tone must be ranked above Max Tone in order to ensure that
any nonmorphemic tones are eliminated in the output. But *Tone must be ranked
below the morpheme realization constraints of the form Realize [Tone]M. An
intonational language, which lacks lexical tone contrasts except for those found in
tonal morphemes, is thus characterized by the ranking Realize [Tone]M >> *Tone
>> MaxTone.
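
A minimal Python sketch can show what this ranking does to a rich-base input containing one nonmorphemic tone and one tonal morpheme. The input, the triple encoding of tones, and the comparison ranking labelled `TONE_LANGUAGE` (MaxTone >> *Tone) are assumptions made for illustration; only the first ranking is the one proposed in the text.

```python
from itertools import combinations

# Each input tone is (position, tone, is_tonal_morpheme). Hypothetical input:
# a nonmorphemic H followed by a morphemic L.
inp = ((0, "H", False), (1, "L", True))

def realize_m(inp, out):
    """Realize [Tone]M: one violation per morphemic input tone missing from the output."""
    return sum(1 for t in inp if t[2] and t not in out)

def star_tone(inp, out):
    """*Tone: one violation per tone in the output."""
    return len(out)

def max_tone(inp, out):
    """MaxTone: one violation per input tone without an output correspondent."""
    return sum(1 for t in inp if t not in out)

def optimal(inp, ranking):
    """Candidates are all subsets of the input tones; pick the best profile."""
    candidates = [c for r in range(len(inp) + 1) for c in combinations(inp, r)]
    return min(candidates, key=lambda out: tuple(con(inp, out) for con in ranking))

INTONATION = [realize_m, star_tone, max_tone]      # ranking given in the text
TONE_LANGUAGE = [realize_m, max_tone, star_tone]   # assumed, for contrast only

print(optimal(inp, INTONATION))     # only the morphemic L survives
print(optimal(inp, TONE_LANGUAGE))  # both input tones survive
```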

3.1.3. The absence of default edge H in the penultimate phrase in declaratives

A minor ranking adjustment to the constraint system developed so far will allow an
account of a further property of declarative intonation. Hayes and Lahiri report that
no default peripheral H tone appears on the penultimate phonological phrase in the
case of the declarative, as seen in (17) (=(3)). They ascribe this to the OCP, which
disallows a H tone sequence consisting of a phrase-final H followed by the pitch
accent H* of the declarative. Since, by hypothesis, both of these tones are default,
none of the faithfulness constraints seen above can decide which one of the H tones
is realized. Rather the ranking of Assoc (∆MaP, T) and the OCP over Align R
(MaP, H) will derive the result that it is the edge tone, not the pitch accent, which
fails to appear. In other words, it is more important in Bengali to maintain a pitch
accent than to maintain a peripheral tone, when the identical qualities of these would
produce an OCP violation. The tableau in (22), which illustrates the analysis,
contains a version of sentence (17), which, for the sake of the exposition, lacks the
overt subject noun phrase:

(22)
Input: [[raj‡a-r c‡hobi-r j‡onno ˇaka] [anlam]][ L ]DECL] α
Candidates: (raj‡ar c‡hobir j‡onno ˇaka) (anlam))

    Tones                  | Realize | OCP | Assoc        | Align R  | *Tone
                           |         |     | (∆MaP, Tone) | (MaP, H) |
    a. L* H   H* [L]DECL   |         | *!  |              |          | ****
    b. L* H      [L]DECL   |         |     | *!           |          | ***
 ⇒  c. L* ø   H* [L]DECL   |         |     |              | *        | ***
    d. H* L   H* [L]DECL   |         |     |              | *        | ****!

The optimal candidate c. shows a violation of the constraint Align R (MaP, H);
the ranking of this constraint below the OCP and Assoc (∆MaP, Tone) allows for
this candidate to emerge as the winner. Candidate c. shares an Align R (MaP, H)
violation with the nonoptimal candidate d, because both the absence of a H tone and
the appearance of a L tone instead of H constitute violations of this constraint.
Candidate d. is therefore ruled out by its greater number of violations of the
structure-minimizing constraint *Tone. No higher ranked constraint calls for the
presence of a default peripheral L at the edge of major phrase, so *Tone rules it out.
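
The reasoning behind tableau (22) can be checked against a larger candidate set with the following Python sketch. A candidate is a triple (penultimate pitch accent, penultimate edge tone, final pitch accent); the final [L]DECL is held constant, so Realize is trivially satisfied. The encoding, the treatment of the OCP as a count of adjacent identical tones, and the restriction of Align R (MaP, H) to the penultimate phrase (as in the tableau) are assumptions of the sketch.

```python
from itertools import product

TONES = (None, "L", "H")  # None means that no tone is present
BOUNDARY = "L"            # the declarative morpheme [L]DECL, always realized

def profile(candidate):
    accent1, edge, accent2 = candidate
    tier = [t for t in (accent1, edge, accent2, BOUNDARY) if t is not None]
    ocp = sum(x == y for x, y in zip(tier, tier[1:]))    # adjacent identical tones
    assoc = int(accent1 is None) + int(accent2 is None)  # missing pitch accents
    align_r = int(edge != "H")                           # penultimate phrase only
    star_tone = len(tier)
    # {OCP, Assoc} >> Align R (MaP, H) >> *Tone
    return (ocp, assoc, align_r, star_tone)

candidates = list(product(TONES, repeat=3))
winner = min(candidates, key=profile)
print(winner, profile(winner))  # ('L', None, 'H') (0, 0, 1, 3)
```

On this encoding, candidate d. of the tableau is `('H', 'L', 'H')` with profile `(0, 0, 1, 4)`, so it ties with the winner down to Align R (MaP, H) and loses only on *Tone.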

3.1.4. The absence of default edge H in the final phrase in declaratives

There is one last property of the tonal patterns of the final phrase that remains to be
explained, namely the absence of the default H right edge tone that is normally
found in nonfinal phrases (except, of course, for the case just described). The default
peripheral H tone simply does not appear preceding the L boundary tone of the
declarative, as shown in (23a). As for the interrogative, which ends in a [HL] tonal
morpheme, as in (18), there is no way of telling whether the default peripheral H
tone is present as well.

(23) Neutral focus declaratives lack a phrase-peripheral High tone in the final
phrase:

a. ok             H*       [L]
        …….. ( ∆….. µ)MaP)IP
b. *              L*      H[L]
        …….. ( ∆….. µ )MaP)IP

If the default H were to surface, the tonal pattern to be predicted by the OCP would
be identical to that found in interrogatives, namely a L* pitch accent followed by a
HL boundary sequence, as in (23b). Homophony avoidance is transparently not a
factor in ruling out this candidate for the declarative pattern, however, since
homophony of distinct sentence types is not avoided in Bengali. As we will see
below, a declarative with a contrastive FOCUS in the final phrase has exactly the L*
HL pitch pattern found in the interrogative. Rather, the impossibility of the pattern
in (23b) is analyzable as a consequence of the constraint system. Basically, the
proposal is that the tonal alignment constraint Align R (MaP, H), which is violated
in the optimal candidate (17), is dominated by the morpheme-specific alignment
constraint Align R ([L]DECL, IP) and the well-known tonal markedness constraint
*Contour Tone. (24) gives the ranking that will derive the absence of the default
peripheral H tone in the final phrase and (25) is the tableau that shows it. (The pitch
accents of the final phrase are not shown in the schematic phrase-final
representations in (25).)

(24) Realize [L]DECL, Align R ([L]DECL, IP), *Contour Tone >>

Align R (MaP, H) >> *Tone

(25)
Input: …[anlam]] [ L]DECL ]ForceP

    Tones      Candidate            | Realize | Align R       | *Contour | Align R  | *Tone
                                    | [L]DECL | ([L]DECL, IP) | Tone     | (MaP, H) |
    a. H       …. (… µ µ)MaP)IP     | *!      |               |          |          | *
    b. [H]     …. (… µ µ)MaP)IP     | *!      |               |          |          | *
    c. [L] H   …. (… µ µ )MaP)IP    |         | *!            |          |          | **
    d. H[L]    …. (… µ µ )MaP)IP    |         |               | *!       |          | **
    e. H [L]   …. (… µ µ )MaP)IP    |         |               |          | *        | **!
 ⇒  f. [L]     …. (… µ µ )MaP)IP    |         |               |          | *        | *

Candidate f., with its simple declarative [L] morpheme, is the optimal one. It
violates Align R (MaP, H), but does not show the violations of the higher ranked
constraints seen in candidates a.-d., and has fewer violations of *Tone than
candidate e. has. The constraint *Contour Tone introduced here is a tonal
markedness constraint familiar from much previous research. Its essential role is to
disallow the case where both the peripheral default H and the tonal morpheme
[L]DECL are associated to the same tone-bearing unit, i.e. the same mora. As for the
constraint Align R ([L]DECL, IP), it has the function of ruling out candidate c. in this
tableau, in which the declarative morpheme is associated to the penultimate mora of
the phrase rather than to the edge mora, to which the default H edge tone is
associated here. Morpheme-specific subcategorizational alignment constraints like
Align R ([L]DECL, IP) are made explicit or presupposed in the literature
(Gussenhoven 2000, Grice et al. 2000), where they are given the function of
linearizing tonal morphemes within the prosodic representation.

(26) Align R ([L]DECL, IP)

Align [L]DECL with the rightmost tone-bearing unit of an Intonational Phrase.
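
The evaluation in tableau (25) can be reproduced mechanically. The following is a minimal sketch, not part of the original analysis: it assumes that strict domination can be modeled as lexicographic comparison of violation profiles, and it fixes an arbitrary order among the constraints that (24) places in the same stratum. The violation counts are one reading of tableau (25) and its discussion; all Python names are hypothetical.

```python
# Ranking (24), highest-ranked constraint first. Constraints that (24) lists
# in the same stratum get an arbitrary relative order here (an assumption).
RANKING_24 = [
    "Realize [L]DECL",
    "Align R ([L]DECL, IP)",
    "*Contour Tone",
    "Align R (MaP, H)",
    "*Tone",
]

# Violation marks per candidate, as read off tableau (25) and its discussion.
# Constraints not listed for a candidate have zero marks.
TABLEAU_25 = {
    "a": {"Realize [L]DECL": 1, "*Tone": 1},
    "b": {"Realize [L]DECL": 1, "*Tone": 1},
    "c": {"Align R ([L]DECL, IP)": 1, "*Tone": 2},
    "d": {"*Contour Tone": 1, "*Tone": 2},
    "e": {"Align R (MaP, H)": 1, "*Tone": 2},
    "f": {"Align R (MaP, H)": 1, "*Tone": 1},
}


def profile(marks, ranking):
    """Violation counts ordered from highest- to lowest-ranked constraint."""
    return tuple(marks.get(constraint, 0) for constraint in ranking)


def optimal(tableau, ranking):
    """Strict domination: the lexicographically smallest profile wins."""
    return min(tableau, key=lambda cand: profile(tableau[cand], ranking))


winner_25 = optimal(TABLEAU_25, RANKING_24)
print(winner_25)
```

Under these assumptions the sketch selects candidate f., and candidate e. loses to it only on the lowest-ranked constraint *Tone, as in the discussion of the tableau.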

Note that an alternative analysis based on the metathesis-banning input-output
correspondence constraint Linearity (McCarthy and Prince 1995) cannot do the job
of ruling out candidate c., since the H, as a default tone, is not in the input
representation, and so its position with respect to input tones is not regulated by the
constraint.
This completes my constraint-based analysis of the tonal contours found in
declarative and interrogative sentence types under conditions of neutral focus. The
full constraint ranking motivated so far, (27), shows the role for totally familiar
types of constraints from the tonal and intonational literature in accounting for
neutral intonation in Bengali.
BENGALI INTONATION REVISITED 233

(27)
Realize [L]DECL, AlignR ([L]DECL, IP)

*ContourTone, OCP, Assoc (¨MaP, Tone)

AlignR (MaP, H)

*Tone

In the next section an analysis of tonal contours in sentences with contrastive
FOCUS will be provided which draws on this constraint ranking and adds to it just
the constraints relevant to realizing and linearizing the FOCUS morpheme, namely
Realize [H]FOC and Align R ([H]FOC, MaP).

3.2 The intonation of sentences with contrastive FOCUS

A declarative sentence containing a FOCUS constituent lacks the H* LI contour of
the basic declarative. Instead what one finds in the FOCUS declarative is a final
contour consisting of a L* pitch accent followed by a H peripheral tone followed by
the L peripheral tone of the declarative morpheme. There are two cases to
distinguish:

(28) FOCUS constituent is final in the declarative sentence (on the verb)

L*HP L* HP L* H [L]
(ami) (raj‡ar c‡hobir j‡onno ˇaka) (anlam)
I king’s pictures for money GAVE

(29) FOCUS constituent is not final in the declarative sentence

L* HP L H [L]
((ami raj‡ar) (c‡hobir) j‡onno ˇaka anlam)
I king’s PICTURES for money gave.
The H peripheral tone of a FOCUS, marked in bold italics, always flanks the right
edge of the morphosyntactic FOCUS constituent and so differs in its distribution
from the declarative morpheme [L], which is confined to the right edge of the
sentence. If the FOCUS constituent is not final in the sentence, the H appears at its
non-final right edge, at a distance from the L tone at sentence end.

3.2.1. Final FOCUS

Let’s review the Hayes and Lahiri argument that the H tone appearing with final
FOCUS in declaratives is a morpheme, rather than merely the default H peripheral
tone seen in nonfinal phonological phrases. Hayes and Lahiri base the argument on
the contrast between the final tonal patterns of nonFOCUS declaratives like (17) and
FOCUS declaratives like (28). The contrast is not in the final L tone, which is
common to both forms of the declarative. The contrast is also not in the tonal value
of the pitch accents, which are predictable on the basis of the quality of the
following peripheral tone. It is the presence of the peripheral H tone in final FOCUS
declaratives like (28) which is contrastive. That peripheral H in (28) must be
morphemic. As we saw above, it cannot be an instance of the default peripheral H
tone, which simply fails to appear in nonFOCUS declaratives like (17). So we must
posit a FOCUS tonal morpheme--[H]FOC, an entity whose presence in the
representation can be assured by a morpheme-realization constraint. As we will see,
it is this morphemic status which permits an explanation for the distribution of this
H tone in (28) and (29), and for the appearance of the right edge of phonological
phrase at the right edge of the FOCUS constituent.
A contour tone consisting of the [H]FOC morpheme and the [L]DECL morpheme is
formed at sentence edge in the final FOCUS case. The simple presence of these
tones in the representation is guaranteed by morpheme realization constraints, but
faithfulness does not guarantee the joint positioning of the tonal morphemes at the
right extreme of the utterance, in violation of *Contour Tone. The creation of the
contour tone must be forced by constraints requiring that these morphemes appear at
a phrase edge. Such an alignment constraint was proposed above for the declarative
morpheme, namely (26), Align R ([L]DECL, IP). For the FOCUS morpheme, the
constraint should be formulated as an alignment with the edge of a phonological
phrase:

(30) Align R ([H]FOC, MaP)

Align [H]FOC with the rightmost tone-bearing unit of a Major Phrase.

The ranking above *Contour Tone of these two morpheme-specific alignment
constraints in (31) explains why they form an illicit contour at phrase edge, as we
see in tableau (32).

(31) Realize [H]FOC, Realize [L]DECL, Align R ([H]FOC, MaP), Align R ([L]DECL, IP) >> *Contour Tone

(32) Input: ...[anlam][H]FOC] [L]DECL ]FroP

                                  Realize    Realize    Align R          Align R          *Contour
                                  [L]DECL    [H]FOC     ([H]FOC, MaP)    ([L]DECL, IP)    Tone
⇒  a.  [H] [L]  ….(… µ µ)MaP)IP                                                           *
   b.  [H]      ….(… µ µ)MaP)IP   *!
   c.  [L]      ….(… µ µ)MaP)IP              *!
   d.  [H][L]   ….(… µ µ)MaP)IP                         *!

Note that this analysis assumes that the alignment constraints for both [H] and [L]
tonal morphemes are satisfied by an association to the final tone-bearing unit of the
phrase, as seen in candidate a. In other words, the [H] in the optimal candidate a. is
considered to be right-aligned even if it precedes the [L] within the phrase.
Candidate d. shows a real misalignment of the [H], however, in being associated to
the penultimate mora. In candidates b. and c., it is the disappearance of the input
tonal morphemes, in violation of the morpheme realization constraints, which
accounts for the ungrammaticality of the forms. What we don’t yet have an
explanation for is the ungrammaticality of an additional candidate where the order
of the morphemes in the final contour tone is simply the opposite of what we see in
candidate a. Some additional principle would be required to account for the
optimality of candidate a. over this alternative. In the spirit of Pierrehumbert and
Beckman 1988, one might assume that an IP-aligned edge tone must lie outside a
MaP-aligned edge tone. But there is also a possible explanation based on the
positioning of these tonal morphemes in the morphosyntactic structure, where the
sentence-peripheral [L] declarative tone lies higher up and to the right of the focus
[H] tone, which marks a constituent lower down in the sentence.
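
The way the violation marks in tableau (32) follow from the representations themselves can be sketched as follows. This is an illustration under stated assumptions, not the analysis itself: a candidate is modeled as a mapping from tonal morpheme to the mora it is associated with in the schematic two-mora phrase edge, alignment counts as satisfied by association to the final mora (as the text assumes), alignment is taken to be vacuously satisfied when the tone is absent, and the constraints of the top stratum of (31) are given an arbitrary relative order. All names are hypothetical.

```python
PENULT, FINAL = 0, 1  # mora positions at the schematic phrase edge

# Each candidate maps a tonal morpheme to the mora it is associated with.
CANDIDATES_32 = {
    "a": {"[H]FOC": FINAL, "[L]DECL": FINAL},   # contour on the final mora
    "b": {"[H]FOC": FINAL},                     # [L]DECL not realized
    "c": {"[L]DECL": FINAL},                    # [H]FOC not realized
    "d": {"[H]FOC": PENULT, "[L]DECL": FINAL},  # [H]FOC misaligned
}


def realize(tone):
    """One mark if the input morpheme is missing from the output."""
    return lambda cand: 0 if tone in cand else 1


def align_right(tone):
    """One mark if the tone is present but not on the final mora."""
    return lambda cand: 1 if tone in cand and cand[tone] != FINAL else 0


def no_contour(cand):
    """One mark per mora that bears more than one tone."""
    moras = list(cand.values())
    return sum(1 for m in set(moras) if moras.count(m) > 1)


# Ranking (31), highest first; order inside the top stratum is arbitrary.
CONSTRAINTS_31 = [
    ("Realize [H]FOC", realize("[H]FOC")),
    ("Realize [L]DECL", realize("[L]DECL")),
    ("Align R ([H]FOC, MaP)", align_right("[H]FOC")),
    ("Align R ([L]DECL, IP)", align_right("[L]DECL")),
    ("*Contour Tone", no_contour),
]


def marks(cand):
    return tuple(evaluate(cand) for _, evaluate in CONSTRAINTS_31)


winner_32 = min(CANDIDATES_32, key=lambda name: marks(CANDIDATES_32[name]))
print(winner_32, marks(CANDIDATES_32[winner_32]))
```

Because the mapping records only which mora each tone is associated with, the sketch does not represent the order of the two tones inside the contour. It therefore cannot distinguish candidate a. from the reversed-order alternative, which matches the observation that some additional principle is needed for that case.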
To sum up, the two morpheme-specific constraints for the FOCUS morpheme--
Realize [H]FOC and AlignR ([H]FOC, MaP)-- have been brought into play in this
section and the constraint ranking has been refined. The full constraint ranking is
now as in (33).

(33)

Realize [L]DECL, AlignR ([L]DECL, IP), Realize [H]FOC, AlignR ([H]FOC, MaP)

Assoc (¨MaP, Tone), *Contour Tone, OCP

Align R (MaP, H)

*Tone

MaxTone

In the next section we will see that the constraints motivated here will also enable us
to account for the characteristic tone and phrasing properties of nonfinal FOCUS in
Bengali.

3.2.2. Nonfinal FOCUS

Now we are at the point where we can understand the apparently problematic fact
with which this paper began, namely the fact that a FOCUS is always flanked by a
MaP edge on its right, even when it is not final in the sentence. Sentence (2),
repeated here in (34), contains an example of a non-final FOCUS. The interface
syntactic representation is (35).

(34) L* H L* H LI
((ami raj‡ar) (c‡hobir) j‡onno ˇaka anlam)IP

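(35) [Syntactic tree, flattened in extraction. Node labels: S, VP, NP, PP. The word-level
categories are aligned with the words below; each NP dominates N, and NFOC dominates N
together with the suffixed morpheme [H]FOC.]

NP     NP         NFOC        P         N       V
ami    raj‡a-r    c‡hobi-r    j‡onno    ˇaka    anlam
I      king’s     PICTURES    for       money   gave
‘I gave money for the king’s pictures’
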
What immediately meets the eye (and ear) is that the right edge of the
nonfinal FOCUS constituent is marked by a H tone. We must presume that this is the
same focus morpheme [H]FOC that is observed when the FOCUS is final in the
sentence. For explicitness, let’s take the FOCUS morpheme to be adjoined as a
suffix to a word (as in (35)) or a larger phrase, where it licenses the FOCUS
property on the dominating node, which in turn gets interpreted as FOCUS in the
semantics. Given the position of the [H]FOC morpheme as a suffix of the FOCUS
constituent in the syntax, the interface and markedness constraints in (33) will
guarantee that in the phonological representation of the sentence the right edge of
the FOCUS constituent will correspond to the right edge of a major phrase in the
declarative case given in (34). The constraint AlignR ([H]FOC, MaP) plays a crucial
role in deriving this result.
The analysis goes as follows. The FOCUS morpheme [H]FOC is forced by
faithfulness to the syntactic representation (call this “Syntax Faith” for short) to
remain in its syntactically specified position as a suffix at the right edge of the
FOCUS constituent. Confined to that position, the FOCUS morpheme is nonetheless
required to satisfy its own morpheme-specific interface alignment constraint,
AlignR ([H]FOC, MaP), which calls for the morpheme to appear at the right edge of a
major phrase in phonological representation. Since the position of the [H]FOC is fixed
by the syntax in a context in which the right edge of phonological phrase may not be
called for, satisfaction of the alignment constraint may require that the phrase edge
be introduced into the representation. In other words, AlignR ([H]FOC, MaP) may in
effect induce the presence of the phrase edge. This is the case in (34)/(35), as seen in
(36):

(36) Nonfinal FOCUS in the declarative

Input: [[raj‡a-r [[c‡hobi] [H]FOC-r] j‡onno ˇaka] [anlam]] [L]DECL]

                                Syntax   AlignR          AlignR      Exh
                                Faith    ([H]FOC, MaP)   (MaP, IP)   (IP)
⇒  a.  L* [H]FOC [L]DECL                                             *
       ( ….. (c‡hobir)MaP j‡onno ˇaka anlam)IP
   b.  L* [H]FOC [L]DECL                 *!
       ( ….. (c‡hobir j‡onno ˇaka anlam)MaP)IP
   c.  L* [H][L]                *!
       ( ….. (c‡hobir j‡onno ˇaka anlam)MaP)IP
   d.  L* [H]FOC L* H* [L]DECL                           *!
       ( ….. (c‡hobir)MaP (j‡onno ˇaka)MaP (anlam)MaP)IP

Candidate c. moves the [H]FOC to coincide with a MaP edge at the end of the
sentence, and so violates Syntax Faith. Candidate b. lacks a right edge of MaP at the
[H]FOC in situ position, and so violates AlignR([H]FOC,MaP). Candidate a., which
respects both these constraints, is the optimal one. As for candidate d., it contains
the phonological phrase edge that is otherwise always present at the left edge of the
verb, as well as the edge induced by the FOCUS morpheme, all organized into a
prosodic structure respecting Exhaustivity. But, as was proposed in section 2, this
post-FOCUS phrasing is ruled out by the markedness constraint which aligns the
head MaP with the right edge of IP. The optimal candidate a. lacks any Major
Phrase intervening between the head MaP of the FOCUS and the end of the
sentence, and so is not considered to count as a violation of AlignR (MaP, IP). It
does show a violation of the lower ranked constraint Exhaustivity (IP) (Selkirk
1995), which requires the Intonational Phrase to immediately dominate only major
phrases, i.e. constituents at the next level down in the prosodic hierarchy.
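
The phrasing constraints at work in tableau (36) can also be stated over explicit representations. The sketch below is only one possible formalization, and every modeling choice in it is an assumption: the visible part of the sentence is reduced to four words written with ASCII stand-ins, a candidate is a list of MaP spans plus the word on which [H]FOC is realized, each constraint assigns at most one mark, and AlignR (MaP, IP) is read as penalizing any MaP that follows the MaP containing the FOCUS word.

```python
WORDS = ["chobir", "jonno", "taka", "anlam"]  # ASCII stand-ins (assumption)
FOCUS_WORD = 0  # index of the FOCUS word, where the syntax places [H]FOC

# Candidate = (MaP spans as (first, last) word indices, word bearing [H]FOC).
CANDIDATES_36 = {
    "a": ([(0, 0)], 0),
    "b": ([(0, 3)], 0),
    "c": ([(0, 3)], 3),
    "d": ([(0, 0), (1, 2), (3, 3)], 0),
}


def syntax_faith(spans, h_word):
    """[H]FOC must stay on the word where the syntax put it."""
    return int(h_word != FOCUS_WORD)


def align_r_hfoc_map(spans, h_word):
    """[H]FOC must sit at the right edge of some MaP."""
    return int(all(last != h_word for _, last in spans))


def align_r_map_ip(spans, h_word):
    """No MaP may follow the MaP that contains the FOCUS word."""
    head = next(i for i, (first, last) in enumerate(spans)
                if first <= FOCUS_WORD <= last)
    return int(head < len(spans) - 1)


def exhaustivity_ip(spans, h_word):
    """Every word of the IP should be parsed into some MaP."""
    parsed = {w for first, last in spans for w in range(first, last + 1)}
    return int(len(parsed) < len(WORDS))


# Column order of tableau (36), highest-ranked first.
RANKED_36 = [syntax_faith, align_r_hfoc_map, align_r_map_ip, exhaustivity_ip]


def profile_36(name):
    spans, h_word = CANDIDATES_36[name]
    return tuple(constraint(spans, h_word) for constraint in RANKED_36)


winner_36 = min(CANDIDATES_36, key=profile_36)
print(winner_36, profile_36(winner_36))
```

With these definitions candidate a. wins while carrying a single Exhaustivity (IP) mark, and candidates b., c., and d. each fail on exactly one higher-ranked constraint, in line with the discussion above.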
So this, then, is the explanation for the presence of the right edge of phonological
phrase at the right edge of a nonfinal FOCUS constituent in Bengali. The FOCUS
morpheme, through its own, independently motivated, subcategorizational prosodic
alignment constraint AlignR ([H]FOC, MaP), induces the presence of the phrase edge
observed. This means that there is no reason to follow Hayes and Lahiri in positing a
FOCUS-prosody interface constraint which aligns the right edge of a FOCUS
syntactic constituent with the right edge of phonological phrase. The Hayes and
Lahiri analysis is incompatible with the Focus Prominence theory of the interface of
focus and phonology, so it is a welcome result that there is an alternative to that
theory which falls out from the independently motivated analysis of Bengali
intonation that has been proposed here.
While the current proposal might be preferable on the grounds of theoretical
economy, given that it successfully excludes the class of Focus-Phrasing interface
alignment constraints from universal grammar, it would be desirable to clinch the case
on the basis of empirical fact. Fortunately, the facts are in principle available,
though they have not yet been investigated. In a current collaborative project with
Aditi Lahiri, we hope to bring the facts to light.
The theory proposed here predicts that if the [H]FOC morpheme is for some
reason absent at the right edge of a FOCUS constituent in surface representation,
there should be no right edge of phonological phrase at that location. The Hayes and
Lahiri theory predicts on the other hand that, regardless of the presence or absence
of the FOCUS morpheme, a phonological phrase edge should appear at the right
edge of a syntactic FOCUS constituent. Now there happens to be a case of nonfinal
FOCUS in Bengali where the [H]FOC morpheme fails to be realized in the output.
This occurs in interrogatives, where <H> indicates the deleted FOCUS [H] tone:

(37) L* H L*<H> [HL]QUES

(ami raj‡ar) (c‡hobir j‡onno ˇaka anlam)
I king’s PICTURES for money gave.
‘Did I give money for the PICTURESFOC of the king?’

The tonal morpheme for interrogatives is [HL]QUES, and, as Hayes and Lahiri point
out, the absence of the FOCUS morpheme [H]FOC at the right edge of the FOCUS
constituent could be attributed to the OCP. Given the Hayes and Lahiri analysis of
FOCUS phrasing, there are no implications of this tonal deletion for the phrasing.
But in the analysis of Bengali intonation that I have proposed, the loss of the tonal
morpheme implies an absence of phonological phrase edge at the right edge of the
FOCUS constituent, since there is no other constraint that would produce that
phrasing. Now it turns out that there is a way of probing this difference in phrasing
predictions in Bengali.
As Hayes and Lahiri demonstrate with admirable systematicity, the phonological
phrase organization of Bengali is reflected not just in the patterning of tones within
the sentence, but also in the segmental phonology. Interword assimilations like the
complete assimilation of final r to a word-initial coronal are found within the
phonological phrase, but are blocked at phrase edges. So, for example, the sequence
/c‡Hobi-r j‡onno / is differently realized in sentences (2) and (3). In (3) where the
sequence is phrase-internal, the /r j‡/ sequence is realized on the surface as [j‡ j‡],
while in (2), where the first word is a FOCUS and followed by a phonological phrase
edge marked by the [H] focus morpheme, the sequence remains unchanged. Segmental
assimilation patterns thus provide a means of diagnosing the presence of
phonological phrase edges independent of tone, and it turns out that they may choose
between my theory of the appearance of phonological phrase edge at FOCUS
right edge and the one proposed by Hayes and Lahiri. My analysis predicts that there
should be assimilation in the sequence /r j‡/ in (37), since the sequence is
phrase-internal. Hayes and Lahiri predict that assimilation should be blocked, since they
posit a phrase edge there even in the absence of [H]. The assimilation facts for
this case are not reported in Hayes and Lahiri 1991, and are unavailable to me at
this writing, but hopefully will emerge soon from the joint investigation of such cases
currently planned.
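
The two competing predictions for (37) can be made concrete with a small sketch. It is illustrative only and says nothing about the actual facts, which remain to be established: words are written with ASCII stand-ins, the set of coronals is a simplified, assumed inventory, and the post-FOCUS material under the Hayes and Lahiri phrasing is simply placed in its own list. The rule applies inside a phonological phrase and never across a phrase edge.

```python
CORONALS = set("tdsnlcj")  # assumed, simplified ASCII inventory


def apply_r_assimilation(phrases):
    """Assimilate word-final r to a following word-initial coronal,
    but only when both words belong to the same phonological phrase."""
    output = []
    for phrase in phrases:
        words = list(phrase)  # copy, so the input phrasing is not modified
        for i in range(len(words) - 1):
            following = words[i + 1]
            if words[i].endswith("r") and following[0] in CORONALS:
                words[i] = words[i][:-1] + following[0]
        output.append(words)
    return output


# Phrasing of (37) if no phrase edge follows the FOCUS word.
NO_EDGE_AFTER_FOCUS = [["ami", "rajar"], ["chobir", "jonno", "taka", "anlam"]]
# Phrasing of (37) if a phrase edge follows the FOCUS word regardless of tone.
EDGE_AFTER_FOCUS = [["ami", "rajar"], ["chobir"], ["jonno", "taka", "anlam"]]

print(apply_r_assimilation(NO_EDGE_AFTER_FOCUS)[1][:2])
print(apply_r_assimilation(EDGE_AFTER_FOCUS)[1])
```

Under the first phrasing the FOCUS word surfaces with its final r assimilated to the following consonant; under the second it surfaces unchanged. This is the difference that the segmental diagnostic is meant to detect.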
I want to complete this section by showing just how it is that my analysis will
select the representation in (37) as optimal for a case of nonfinal FOCUS in an
interrogative sentence. The input representation for (37) contains two tonal
morphemes--[H]FOC and [HL]QUES. If the nonrealization of the [H]FOC morpheme is
the consequence of the OCP, then it must be the case that both the OCP and the
constraint Realize [HL]QUES dominate the constraint Realize [H]FOC, which is
violated in the representation, as in (38). The tableau in (39) illustrates the analysis.

(38) Realize [HL]QUES, OCP >> Realize [H]FOC

(39) Nonfinal FOCUS in the interrogative

Input: [….. […..[[c‡hobi][H]FOC-r]…..][HL]QUES]

                            OCP   Realize     Realize    AlignR          Exh
                                  [HL]QUES    [H]FOC     ([H]FOC, MaP)   (IP)
   a.  L* [H]FOC [HL]QUES   *!                                           *
       ( ….. (c‡hobir)MaP ….. )IP
   b.  L* [H]FOC                  *!                                     *
       ( ….. (c‡hobir)MaP ….. )IP
   c.  L* [H]FOC [L]QUES          *!                                     *
       ( ….. (c‡hobir)MaP ….. )IP
⇒  d.  L* [HL]QUES                            *
       ( ….. (c‡hobir ….. )MaP)IP
   e.  L* [HL]QUES                            *                          *!
       ( ….. (c‡hobir)MaP ….. )IP

The optimal candidate d. lacks the FOCUS morpheme, and in so doing respects the
higher ranked OCP and Realize [HL]QUES, while incurring a violation of Realize
[H]FOC. In this optimal candidate, there is no phrase edge at the right of the FOCUS
since there is no [H]FOC to require it. Note that candidate e. has the same tones as the
optimal d. but differs in having a phrase edge present at the right edge of the
FOCUS. In this particular case a phrase edge in that medial position would be ruled
out by the constraint Exhaustivity (IP), since the stretch between the major phrase it
demarcates and the end of the intonation phrase is not itself parsed into major
phrase.
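
The ranking argument behind (38) can be checked by brute force. The sketch below is a simplification made for illustration: it abstracts away from phrasing and considers only three hypothetical tonal outcomes, each violating exactly one of the three constraints involved, and then enumerates every total ranking to see which ones select the attested outcome in which [H]FOC is not realized.

```python
from itertools import permutations

CONSTRAINTS = ("OCP", "Realize [HL]QUES", "Realize [H]FOC")

# Simplified outcomes (assumption): each violates exactly one constraint.
OUTCOMES = {
    "keep both tones": {"OCP": 1},
    "drop [HL]QUES": {"Realize [HL]QUES": 1},
    "drop [H]FOC": {"Realize [H]FOC": 1},
}


def winner(ranking):
    """Outcome with the lexicographically smallest violation profile."""
    return min(
        OUTCOMES,
        key=lambda outcome: tuple(OUTCOMES[outcome].get(c, 0) for c in ranking),
    )


# Total rankings under which the attested outcome is optimal.
compatible = [r for r in permutations(CONSTRAINTS) if winner(r) == "drop [H]FOC"]

for ranking in compatible:
    print(" >> ".join(ranking))
```

Only the rankings in which Realize [H]FOC is at the bottom survive, that is, those in which both the OCP and Realize [HL]QUES dominate it. The relative ranking of the OCP and Realize [HL]QUES is left open by this evidence.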
Observe that the new ranking in (38) is consistent with the other rankings
motivated above for Bengali tonology. (40) is the summary ranking in (33),
modified in virtue of (38). Here the OCP is promoted from the lower rank it had
been given in (33) for want of any further evidence.

(40) Realize [HL]QUES, OCP

Realize [L]DECL, AlignR ([L]DECL, IP), Realize [H]FOC, AlignR ([H]FOC, MaP)

Assoc (¨MaP, Tone), *Contour Tone

Align R (MaP, H)

*Tone

MaxTone

The claim embodied by exploiting a tonal grammar of this sort is that the
tonal/intonational patterns of sentences -- in any language -- must be seen as deriving
from the interaction of different types of constraints, including morpheme-specific
realization and alignment constraints, generic faithfulness constraints like MaxTone,
prosodic enhancement constraints calling for (default) pitch accent or edge tones,
and classic tonal markedness constraints like the OCP and *Contour Tone. Of
course these tonal constraints interact with the constraints of the grammar which
define the prosodic structure of sentences. They may either collaborate within a
prosodic structure that is independently defined, or, as in the case of the morpheme-
specific constraint AlignR ([H]FOC, MaP), may in fact be responsible for the
presence of some aspect of prosodic structure.

4. SUMMARY
In the early sections of the paper, I sketched out a theory of Bengali FOCUS-related
phrasing that would be consistent with the Focus Prominence hypothesis, and in the
last section this theory was further fleshed out, and shown to be viable. To
summarize, the constraints and rankings crucially involved in the analysis of
Bengali FOCUS phrasing patterns are:

(i) The FOCUS-Prominence interface constraint: FOCUS (α) ⊂ ¨IP

(ii) Phonological markedness constraints of the prosodic prominence
prosodic edge alignment family:
-- AlignL (PWd, MaP)
-- AlignR (MaP, IP)
(iii) The ranking hierarchy FOC-Prom, AlignR (PWd, MaP) >> *StrucMaP
(collectively responsible for the phrase edge at the left of FOCUS)
(iv) The morpheme-specific alignment constraint AlignR ([H]FOC, MaP)
(responsible for the phrase edge at the right of FOCUS)
(v) The ranking hierarchy
FOC-Prom, AlignR (MaP, IP), AlignR ([H]FOC, MaP) >> Exh (IP)
(collectively responsible for absence of phrasing to the right of the
FOCUS phrase)

The FOCUS-Prominence interface constraint makes appeal to the FOCUS properties
of syntactic constituents in the interface representation, and is seconded in
producing its prosodic phrasing consequences by familiar prosodic markedness
constraints, as proposed by Truckenbrodt 1995. The additional right-edge phrasing
effect in Bengali is produced by a constraint which calls on a specific morpheme in
the interface syntactic representation to be aligned with a prosodic phrase edge in
phonological representation, namely AlignR ([H]FOC, MaP). The existence of this
latter sort of constraint, which relates the FOCUS morpheme to prosodic phrasing, is
consistent with the Focus Prominence theory of the focus-phonology interface.
Focus Prominence theory does not exclude subcategorizational constraints that are
restricted to specific morphemes like the FOCUS morpheme. The theory limits only
the nature of interface constraints which appeal to the semantically interpreted focus
feature marking of higher order constituents in the syntactic representation. This
focus marking of higher order constituents may of course be projected from focus
morphemes like that in Bengali, but the morpheme itself is not a focus(sed)
constituent in this sense. The facts of Bengali focus intonation therefore do not
challenge the hypothesis that the only focus-phonology interface constraints in a
grammar are those which relate focus-marked constituents of surface PF to prosodic
stress prominence in surface PR.
The Hayes and Lahiri claim for the centrality of the OCP is supported in the
present optimality theoretic analysis, which relies on the OCP for an explanation of
the polar character of pitch accents and following peripheral tones within a phrase,
as well as for an explanation of the absence of peripheral tones (whether default tone
or underlying tonal morpheme) when a following tone (whether default pitch accent
or underlying boundary tone morpheme) would be of identical tone quality. This
long-distance application of the OCP, between tones of disparate provenance and
surface association type is noteworthy, and demands notice in a typology of possible
conditions of OCP application across languages. In the particular case of Bengali,
assuming that the OCP governs possible output representations has permitted a
pared down theory of what the tonal morphemes of Bengali are in the first place,
restricting them in this language to sentence-final morphemes, as in the case of the
declarative [L] and interrogative [HL] illocutionary force morphemes, or to the
constituent-final [H] FOCUS morpheme. All other tones in Bengali intonation are
analyzable as default tones, whose presence, and quality, is determined by
phonological markedness constraints.
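
The two roles attributed to the OCP here, polarity of the default pitch accent and exclusion of adjacent identical tones, can be illustrated with a short sketch. It assumes that the OCP can be checked on a flat, linear sequence of tones, that the bitonal [HL] morpheme can be split into H followed by L for this purpose, and that tone labels are plain strings; the function names are hypothetical.

```python
def tone_value(tone):
    """Strip pitch-accent and morpheme marking: 'L*' -> 'L', '[H]' -> 'H'."""
    return tone.strip("[]*")


def ocp_violations(tones):
    """Count adjacent pairs of tones that have the same value."""
    values = [tone_value(t) for t in tones]
    return sum(1 for left, right in zip(values, values[1:]) if left == right)


def polar_pitch_accent(following_tone):
    """Default pitch accent whose value is opposite to the following tone."""
    return "L*" if tone_value(following_tone) == "H" else "H*"


# Neutral declarative: the accent before the [L] morpheme comes out as H*.
print(polar_pitch_accent("[L]"))
# Final FOCUS declarative: L* [H] [L] has no adjacent identical tones.
print(ocp_violations(["L*", "[H]", "[L]"]))
# Interrogative with [H]FOC retained before [HL], split here into H and L.
print(ocp_violations(["L*", "[H]", "[H]", "[L]"]))
```

The last sequence incurs one OCP mark, and removing the first [H] from it removes the mark, which corresponds to the nonrealization of the FOCUS tone in the interrogative.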

University of Massachusetts Amherst

5. NOTES

* The research for this paper was supported in part by National Science Foundation grant BCS000438
The Reflexes of Focus in Phonology, Principal Investigator: Elisabeth Selkirk.
1. [H*+L]FOC pitch accent in European Portuguese (Frota 2000), [H]FOC phrase-edge tone in Bengali
(Hayes and Lahiri 1991), [H]FOC accent-tropic tone in Swedish (Bruce 1977).
2. Selkirk 1984, 1995 proposes that pitch accents are a default reflex of the presentational Focus status of a
word in English. Selkirk 2002 suggests that it is the L+H* which appears by default with contrastive
FOCUS.
3. Hungarian (Vogel and Kenesei 1987), Japanese (Pierrehumbert and Beckman 1988), Chichewa
(Kanerva 1989), Shanghai Chinese (Selkirk and Shen 1990), and others.
4. Jackendoff 1972, Hayes and Lahiri 1991, Reinhart 1995, Roberts 1996.
5. Pierrehumbert and Beckman 1988, Inkelas and Leben 1990.
6. European Portuguese (Frota 2000).
7. Note that I am not saying that there are no alignment constraints at all which characterize the
syntax-phonology interface. Indeed, there is evidence that, independent of focus, you do need interface
constraints aligning the edges of syntactic constituents defined in X-bar level terms with prosodic
constituents at a designated level, e.g. Align R/L (XP, MaP) (see Selkirk 1986 et seq., Nespor and Vogel
1986, Chen 1987, Truckenbrodt 1998, Sugahara 2003, among others).
8. The position of the verb in the surface representation of these sentences is particularly in need of
clarification. Given the structure in (1), there can be no principled explanation for the systematic
appearance of a phonological phrase break at the left edge of the verb, seen in nonFOCUS sentences such
as (3). But since this aspect of Bengali phonological phrasing is not of immediate concern, I will
continue to assume the structure in (1). It at least shows the analysis in terms of noun phrases that will
survive regardless of the ultimate decision about their position in a higher order syntactic structure.
9. For example, in some languages with lexical pitch accent, words lacking pitch accents in their input
form receive a default pitch accent on the main stressed syllable of the output form. See Zec 1999 on
Serbo-Croatian, Lahiri 2002 on Swedish.
10. Beginning with the analysis of “initial lowering” in Japanese as an alignment of L and H peripheral
tones (Poser 1984, Pierrehumbert and Beckman 1988), there have been a variety of languages analyzed as
showing default, constraint-introduced edge tones, including the medial MaP-edge L phrase tone of
English (Selkirk 2000), the LH phrase edge tone of Korean (Jun 1993), etc.

6. REFERENCES
Bruce, Gösta. Swedish Word Accents in Sentence Perspective. Lund: Gleerup, 1977.
Chen, Matthew. “The syntax of Xiamen tone sandhi.” Phonology 4 (1987): 109-150.
Frota, Sonya. Prosody and Focus in European Portuguese: Phonological Phrasing and Intonation. New
York: Garland Publishing, 2000.
Grice, Martine, D.R. Ladd and Amalia Arvaniti. “On the Place of Phrase Accent in Intonational
Phonology.” Phonology 17 (2000): 143-185.
Gussenhoven, Carlos. “The Lexical Tone Contrast of Roermond Dutch in Optimality Theory.” In M. Horne
(ed.), Prosody: Theory and Experiment. Dordrecht: Kluwer Publishing, 2001.
Hayes, Bruce and Aditi Lahiri. “Bengali Intonational Phonology.” Natural Language and Linguistic
Theory 9 (1991): 47-96.
Inkelas, Sharon and W. R. Leben. “Where Phonology and Phonetics Intersect: The Case of Hausa
Intonation.” In J. Kingston and M. Beckman (eds.), Papers in Laboratory Phonology 1: Between the
Grammar and Physics of Speech, pp. 17-34. Cambridge: Cambridge University Press, 1990.
Jackendoff, Ray. Semantic Interpretation in Generative Grammar. Cambridge, Mass.: MIT Press, 1972.
Jun, Sun-Ah. The Phonetics and Phonology of Korean Prosody. New York: Garland Publishing, 1995.
Lahiri, Aditi, A. Wetterlin, and E. Steiner. “Unmarked Tone in Scandinavian.” Manuscript. Fachbereich
Allgemeine Sprachwissenschaft, University of Konstanz, 2002.
Kanerva, Jonni. Focus and Phrasing in Chichewa Phonology. Stanford University: Doctoral dissertation,
1989.
Kanerva, Jonni. “Focusing on Phonological Phrases in Chichewa.” In S. Inkelas and D. Zec (eds.), The
Phonology-Syntax Connection, pp. 145-162. Chicago: University of Chicago Press, 1990.
McCarthy, John and Alan Prince. “Generalized alignment.” In G. Booij and J. van Marle (eds.), Yearbook
of Morphology, pp. 79-153. Dordrecht: Kluwer, 1993.
Nespor, Marina and Irene Vogel. Prosodic Phonology. Dordrecht: Foris, 1986.
Pierrehumbert, Janet and Mary Beckman. Japanese Tone Structure. Cambridge, Mass.: MIT Press, 1988.
Poser, William. The Phonetics and Phonology of Tone and Intonation in Japanese. MIT: Doctoral
dissertation, 1984.
Prince, Alan and Paul Smolensky. Optimality theory: Constraint Interaction in Generative Grammar.
Manuscript, Rutgers University and Johns Hopkins University, 1993.
Reinhart, Tanya. “Interface Strategies.” OTS Working Papers, OTS-WP-TL-95-002, Utrecht University,
1995.
Rizzi, Luigi. “The Fine Structure of the Left Periphery.” In L. Haegeman (ed.), Elements of Grammar.
Handbook of Generative Syntax, pp. 281-337. Dordrecht: Kluwer, 1997.
Roberts, Craige. “Focus, Information Flow and Universal Grammar.” In P. Culicover and L. McNally
(eds.), The Limits of Syntax, pp. 109-160. New York, Academic Press, 1998.
Rooth, Mats. “A Theory of Focus Interpretation.” Natural Language Semantics 1 (1992): 75-116.
Rooth, Mats. “Focus.” In S. Lappin (ed.), The Handbook of Contemporary Semantic Theory. London:
Blackwell, 1996a.
Rooth, Mats. “On the Interface Principles for Intonational Focus.” Proceedings of SALT VI, pp. 202-226.
Ithaca, NY: Cornell University, 1996b.
Schwarzschild, Roger. “Givenness, Avoid F, and Other Constraints on the Placement of Accent.” Natural
Language Semantics 7 (1999): 141-177.
Selkirk, Elisabeth. Phonology and Syntax: The Relation between Sound and Structure. Cambridge, Mass.:
MIT Press, 1984.
Selkirk, Elisabeth. “Sentence Prosody: Intonation, Stress, and Phrasing.” In John Goldsmith (ed.), The
Handbook of Phonological Theory, pp. 550-569. Cambridge: Blackwell Publishers, 1995.
Selkirk, Elisabeth. “Interface Constraints on Focus.” Talk delivered at the Workshop on Syntax-
Phonology Interface, Linguistic Society of Japan, Tokyo, November 1999.
Selkirk, Elisabeth. “Focus Types and Tone.” Paper presented at the First North American Phonology
Conference, Concordia University, Montreal, 2000.
Selkirk, Elisabeth. “The Interaction of Constraints on Prosodic Phrasing.” In M. Horne (ed.), Prosody:
Theory and Experiment, Dordrecht: Kluwer Publishing, 2001.
Selkirk, Elisabeth. “Contrastive FOCUS vs. Presentational focus: Prosodic Evidence from Right Node
Raising in English.” In B. Bel and I. Marlin (eds.), Speech Prosody 2002: Proceedings of the First
International Speech Prosody Conference, pp. 643-646. Laboratoire Parole et Langage, Université de
Provence, Aix-en-Provence, 2002.
Selkirk, E. and J. Katz (in preparation) Phrasal stress and focus types. Ms. UMass Amherst and MIT.
Selkirk, Elisabeth and Tong Shen. “Prosodic domains in Shanghai Chinese.” In S. Inkelas and D. Zec
(eds.), The Phonology-Syntax Connection, pp. 313-338. Chicago, University of Chicago Press, 1990.
Selkirk, E. and K. Tateishi. “Syntax and downstep in Japanese.” In C. Georgopoulos and R. Ishihara
(eds.), Interdisciplinary Approaches to Language. Essays in Honor of S.-Y. Kuroda. Dordrecht,
Kluwer, 1991.
Sugahara, M. Downtrends and Post-FOCUS Intonation in Tokyo Japanese. University of Massachusetts,
Amherst: Doctoral dissertation, in preparation.
Truckenbrodt, Hubert. Phonological Phrases: Their Relation to Syntax, Focus and Prominence. MIT:
Doctoral dissertation, 1995.
Truckenbrodt, Hubert. “On the Relation between Syntactic Phrases and Phonological Phrases.” Linguistic
Inquiry 30 (1999): 219-255.
Vogel, Irene and István Kenesei. “Syntax and Semantics in Phonology.” In S. Inkelas and D. Zec (eds.),
The Phonology-Syntax Connection, pp. 339-364. Chicago, University of Chicago Press, 1990.
Zec, Draga. “Footed Tones and Tonal Feet: Rhythmic Constituency in a Pitch-Accent Language.”
Phonology 16 (1999): 225-264.
MARK STEEDMAN

INFORMATION-STRUCTURAL SEMANTICS FOR ENGLISH INTONATION*

1. INTRODUCTION

Selkirk (1984), Hirschberg and Pierrehumbert (1986), Pierrehumbert and Hirschberg
(1990), and the present author, have offered different but related accounts of
intonation structure in English and some other languages. These accounts share the
assumption that the system of tones identified by Pierrehumbert (1980), as modified
by Pierrehumbert and Beckman (1988) and Silverman et al. (1992), has as
transparent and type-driven a semantics in these languages as do their words and
phrases. While the semantics of intonation in English concerns information structure
and propositional attitude, rather than the predicate-argument relations and operator-
scope relations that are familiar from standard semantics in the spirit of the papers
collected as Montague 1974, this information-structural semantics is fully
compositional, and can be regarded as a component of the same semantic system.
The present paper builds on Steedman (1991) and Steedman (2000a) to develop a
new semantics for intonation structure, which shares with the earlier versions the
property of being fully integrated into Combinatory Categorial Grammar (CCG, see
Steedman 2000b, hereafter SP). This grammar integrates intonation structure into
surface derivational structure and the associated Montague-style compositional
semantics, even when the intonation structure departs from the restrictions of
traditional surface structure. Many of the diverse discourse meanings that have been
attributed to intonational tunes are shown to arise via conversational implicature
from more primitive literal meanings distinguished along the three dimensions of
information structure, speaker/hearer commitment, and contentiousness.

2. TONES AND INFORMATION STRUCTURE

It is standard to assume, following Bolinger (1958, 1961) and Halliday (1963,
1967a,b), that pitch-accents, high or low, simple or compound, are in the first place
properties of the words that they fall on, and that they mark the interpretations of
those words as contributing to the distinction between the speaker’s actual utterance
and other things that they might be expected to have said in the context to hand, as
in the “Alternative Semantics” of Karttunen (1976), Karttunen and Peters (1979),
Rooth (1985, 1992), and Büring (1997a,b).1 In this sense, all pitch accents are
contrastive. For example, in response to the question “Which finger did he bite?”,
the word that contributes to distinguishing the following answer from other possible

C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 245–264.
© 2007 Springer.
answers via reference is the deictic “this”, so the following intonation is
appropriate.2

(1) He bit THIS one.
           H*   LL%

It is important to be clear from the start that the set of alternative utterances from
which the actual utterance is distinguished by the tune is in no sense the set of all
possible utterances appropriate to this context, a set which includes infinitely many
things like “Mind your own business,” “That was no finger,” “What are you talking
about?” and “Lovely weather we’re having.” Rather, the presupposed set of
(presumably, ten) alternative utterances is accommodated by the hearer in the sense
of Lewis (1979) and Thomason (1990), like any speaker presupposition that is not
actually inconsistent with their beliefs. This does not imply that such alternative sets
are confined to things that have been mentioned, or that they are mentally
enumerated by the participants—or indeed that they are even finite.
In terms of Halliday’s given/new distinction, pitch-accents are markers of “new”
information, although the words that receive pitch-accents may have been recently
mentioned, and it might be better to call them markers of “not given” information.
That seems a little cumbersome, so I will use the term “kontrast” from Vallduví and
Vilkuna 1998 for this property of English words bearing pitch-accents, spelling the
corresponding verb “k-contrast”.3
I’ll further attempt to argue that there are just two independent semantic binary-
valued dimensions along which the literal meanings of the various pitch-accent types
are further distinguished. The first of these dimensions has been identified in the
literature under various names, and distinguishes between what I’ll continue to call
“theme” and “rheme” components of the utterance, using these terms in the sense of
Bolinger (1958, 1961) rather than Halliday. Theme can be thought of informally as
the part of the sentence corresponding to a question or topic that is presupposed by
the speaker, and rheme is the part of the utterance that constitutes the speaker’s
novel contribution on that question or topic. However, it will become clear below
that the notion of theme differs from that of topic as defined by, for example, Gundel
(1974); Gundel and Fretheim (2001) in being speaker-defined rather than text-based.
A great deal of the huge and ramifying literature on information structure can be
summarized as distinguishing two dimensions corresponding to the given/kontrast
and theme/rheme distinctions, although the consensus has tended to be obscured by
the very different nomenclatures that have been applied. (See discussion by
Steedman and Kruijff-Korbayová (2001), which summarizes the terminology and
its lines of descent, along with some contiguous semantic influences.)
However, there is a further dimension of discourse meaning along which the
pitch-accent types are distinguished which has not usually been identified in this
literature. It concerns whether or not the particular theme or rheme to hand is
mutually agreed–that is, uncontentious. This notion is related to various notions of
Mutual Belief or Common Ground proposed by Lewis (1969), Cohen (1978), Clark
and Marshall (1981) and Clark (1996).4
Both of these components of meaning are projected by the process of
grammatical derivation from the words that carry the pitch-accent to the prosodic
phrase corresponding to these information units, following Steedman 2000a, along
lines briefly summarized in section 6.
I’ll also try to argue that the intonational boundaries such as those sometimes
referred to as “continuation rises,” which delimit the prosodic phrase, fall into two
classes respectively distinguishing the speaker or the hearer as responsible for, or (in
terms of the related accounts of Gussenhoven (1983, p. 201) and Gunlogson (2001,
2002)) committed to, the corresponding information unit.5
I’ll assume that the speaker’s knowledge can be thought of as a database or set of
propositions in a logic (second-order, since themes etc. may be functions), divided
into two subdomains, namely: a set S of information units that the speaker claims to
be committed to, and a set H of information units which the speaker claims the
hearer to be committed to. Information units are further distinguished on a
dimension ±AGREED according to whether the speaker claims them to be
uncontentious or contentious. The set of +AGREED information units is not merely
the intersection of S and H: the speaker may attribute uncontentiousness to an
information unit and responsibility for it to the hearer whilst knowing that in fact
they do not regard themselves as so committed. In Steedman 2000a, S and H are
treated as modalities [S] and [H] of a modal logic, and Stone (1998) has proposed a
similar modality for mutual belief. In the present paper we will combine the feature
±AGREED with the speaker/hearer modalities, writing it as a superscript ±, as in
[H+].
These classifications can be set out diagrammatically as in tables 1 and 2, in
which θ signifies theme, ρ signifies rheme, + indicates +AGREED, - indicates
-AGREED, and [S] and [H] respectively denote speaker and hearer commitment. If
a theme or a rheme is marked as agreed, then it’s in AGREED, whoever is explicitly
claimed to be committed to it. If it is not so marked, then it is not in AGREED, even
if speaker and hearer in fact both believe it. This last possibility arises because H is
only the speaker’s attribution of commitment to the hearer, not the hearer’s actual
belief. It follows that a theme or rheme may be believed by the speaker, and asserted
by the speaker to be something that the hearer is committed to, without the hearer’s
actually agreeing to it. We will come to a case of this later on.

Table 1: The Meanings of the Pitch-accents

         +              -
    θ    L+H*           L*+H
    ρ    H*, (H*+L)     L*, (H+L*)

Table 2: The Meanings of the Boundaries

    [S]  L, LL%, HL%
    [H]  H, HH%, LH%
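
As an informal illustration, tables 1 and 2 can be read as a pair of lookup tables from tones to feature values. The following is only a minimal Python sketch under that reading: the names PITCH_ACCENTS, BOUNDARIES, and literal_meaning are hypothetical and are not part of the analysis itself, and the sketch covers only tunes with a single pitch-accent.

```python
# Table 1: pitch-accent -> (information unit, +AGREED?)
PITCH_ACCENTS = {
    "L+H*": ("theme", True),
    "L*+H": ("theme", False),
    "H*":   ("rheme", True),
    "H*+L": ("rheme", True),
    "L*":   ("rheme", False),
    "H+L*": ("rheme", False),
}

# Table 2: boundary -> participant whom the speaker marks as committed
BOUNDARIES = {
    "L": "S", "LL%": "S", "HL%": "S",
    "H": "H", "HH%": "H", "LH%": "H",
}

def literal_meaning(accent, boundary):
    """Return a label such as '[H-]θ' for a single-accent tune."""
    unit, agreed = PITCH_ACCENTS[accent]
    who = BOUNDARIES[boundary]
    sign = "+" if agreed else "-"
    symbol = "θ" if unit == "theme" else "ρ"
    return f"[{who}{sign}]{symbol}"
```

On this encoding, literal_meaning("L*+H", "LH%") yields the label [H-]θ, a theme that is not agreed and that the speaker attributes to the hearer.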

At first glance, this proposal might appear to miss the point entirely. Where are
notions like “topic continuation” (Brown, Currie and Kenworthy 1980) and
“evaluation with respect to subsequent material” (Pierrehumbert and Hirschberg
1990), or the latter authors’ scales of commitment and belief? I’m going to argue
that many of the effects that have been associated with intonational tunes arise as
conversational implicatures from the interaction with context of literal meanings
made up of the above simple components. To consider this claim we need some
examples.

3. AN EXAMPLE: PITCH-ACCENTS
The first example commemorates Miles Davis’ response to Dave Brubeck’s question
concerning his reason for playing E♮ as the final note of In Your Own Sweet Way, in
place of E♭ as written by Brubeck:6

(2) DB: Why did you play E-natural?

    MD: (Why didn’t YOU)       (WRITE E-natural?)
                    L+H* LH%    H*            LL%
         background kontrast    kontrast background
         theme                  rheme

The LH% boundary splits the utterance into two intonational phrases and two
information units. The L+H* accent marks the first of these units as theme (L*+H
would also be appropriate). It falls on the word you because its referent (Brubeck) is
the element that distinguishes this theme from the other themes that are available.
(Lambrecht and Michaelis (1998) in a related approach call such “marked” or
contrastive themes “ratified topics”. Ratification certainly presupposes some
alternative. However, the example to hand suggests that ratification is only one of
many things that you can do with a contrastive theme or topic.)
The set of available themes, which we will call the “Theme Alternative Set”
(ThAS) is pre-supposed by Davis and accommodated by Brubeck as including just
two possible themes. These can be thought of informally as “Why did/didn’t Davis
do x?” and “Why did/didn’t Brubeck do x?” More formally we can think of the
Theme Alternative Set as a set of λ-terms, which for this context is as follows, in
which ± stands for polarity:

(3) { λvp.λreason.cause′ reason(± do′ vp brubeck′),
      λvp.λreason.cause′ reason(± do′ vp davis′) }

(It’s assumed here that the fragment Why didn’t you is assigned a meaning which is
a function from VP interpretations to why-question interpretations—the latter being
themselves functions from adverbial interpretations to causal propositions. This is in
fact what the CCG grammars outlined in SP and below actually deliver, given an
appropriate lexicon.)
Other themes and Theme Alternative Sets are possible. For example, a further
L+H* pitch-accent on didn’t is possible:

(4) (Why DIDN’T YOU)       (WRITE E-natural?)
         L+H*   L+H* LH%    H*            LL%

By saying (4), Davis presupposes, and Brubeck accommodates, a Theme Alternative
Set which informally can be thought of as “Why did Davis do x” and “Why didn’t
Brubeck do x”, and can be written as:

(5) { λvp.λreason.cause′ reason(– do′ vp brubeck′),
      λvp.λreason.cause′ reason(+ do′ vp davis′) }

In both cases, words whose interpretation distinguishes the intended theme from
the others—which is how “k-contrasted” or “not given” is defined in the present
system—bear pitch-accents, while those that do not contribute to the distinction—
which is how we define “background” or “given”—do not. (See Prevost and
Steedman 1994; Prevost 1995 for further detail on the determination of pitch-accent
placement in sentence generation.)
We do not need to think of the Theme Alternative Sets as closed under terms that
are already in play in the conversation. A more general representation of the ThAS
for (2) reminiscent of the “Structured Meaning” approach of Cresswell (1973, 1985)
and von Stechow (1981) can be obtained by abstracting over the element(s)
corresponding to accented words, thus:

(6) λsubj.λvp.λ reason.cause′ reason(± do′ vp subj)

Similarly, the ThAS for (4) can be written as follows:

(7) λpolarity.λsubj.λ vp.λreason.cause′ reason(polarity(do′ vp subj))

Of course, themes including this one may not, and in fact usually do not, bear
any pitch-accent at all, as in:

(8) (Why didn’t you)       (WRITE E-natural?)
                            H*            LL%

Such noncontrastive or “unmarked themes” presuppose or are accommodated to a
singleton ThAS, in this case the following:

(9) {λvp.λreason.cause′ reason(– do′ vp brubeck′)}

Thus according to the present theory, as Halliday and Brown insisted, what is
“new”, “not given,” or k-contrasted vs. what is “given” or background is in part
determined by the speaker, not a property of a text or context alone (Brown
1983:67). By the same token, the notion of theme is also partly speaker-determined,
not text-based as is the notion of topic of Gundel (1974); Gundel and Fretheim
(2001).
Similar considerations govern the effect of the rheme-tune in (2) and (4). The H*
accent marks the second information unit as a rheme, and it falls on the word write
because it is the interpretation of that word that distinguishes this rheme from the
others that the context affords. This set of available rhemes, which we will call
the “Rheme Alternative Set” (RhAS), is again presupposed/accommodated by the
participants to include only doing things to E♮. In this particular case we can think of
the RhAS as being closed under the things that have actually been mentioned—that
is as

(10) { λx.play′ e′ x,
       λx.write′ e′ x }

Again, we can think of the RhAS more generally by abstracting over the
transitive predicate in structured-meaning style:

(11) λtv.λx. tv e′ x
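
The relation between an abstraction such as (6) or (11) and the alternative sets it stands for, such as (3) or (10), can be sketched computationally. The following Python fragment is only an illustration under simplifying assumptions: logical forms are flattened to tuples of strings, slots for k-contrasted elements are written with a leading "?", and the function name alternative_set and the candidate lists are hypothetical.

```python
from itertools import product

def alternative_set(template, candidates):
    """Instantiate every k-contrasted slot of `template`.

    template   -- tuple of constants and slot names (strings starting with '?')
    candidates -- dict from each slot name to the values the context affords
    """
    slots = [t for t in template if t.startswith("?")]
    results = set()
    for values in product(*(candidates[s] for s in slots)):
        binding = dict(zip(slots, values))
        results.add(tuple(binding.get(t, t) for t in template))
    return results

# (11): abstract over the transitive predicate, keeping e' fixed.
rheme_template = ("?tv", "e'", "x")
rhas = alternative_set(rheme_template, {"?tv": ["play'", "write'"]})

# An unmarked theme as in (8) has no slots, so its alternative set
# is a singleton, as in (9).
unmarked_theme = ("cause'", "reason", "-", "do'", "vp", "brubeck'")
singleton = alternative_set(unmarked_theme, {})
```

Here rhas contains exactly the two alternatives of (10), and singleton contains only the unmarked theme itself. Nothing in the sketch requires the candidate lists to be finite in principle; they are enumerated here only for the sake of the example.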

We have so far passed over the role of the particular boundary tones in (2) and
(4). Earlier we identified this role as assigning responsibility for theme/rheme status
to either speaker or hearer. Thus the claim must be that in the above examples the
theme is marked by Miles Davis as Brubeck’s responsibility, whereas the rheme is
marked as his own. To see what this means, and to understand the implication of
table 1 that both are “agreed”, we must look more broadly at the function of the
boundary tones.

4. AN EXAMPLE: BOUNDARIES
Brown (1980:30) identifies the role of high boundaries as indicating that there is
more to come on the current topic from some participant. Pierrehumbert and
Hirschberg (1990:304-308), from whom the following example is adapted, make a
related claim concerning interpretation with respect to succeeding material (again,
this may come from either participant):

(12) a. Attach the jumper cables to the car that’s running,

L+H* L+H* LH%

b. Attach them to the car you want to start,

L+H* L+H* LH%

c. Try the ignition,

L+H* L+H* LH%

d. If you’re lucky,
L+H* LH%

e. You’ve started your car.

H* LL%

Pierrehumbert and Hirschberg don’t actually specify the pitch-accent types for
this example, but L+H* seems appropriate for all accents except those in the last
clause — in fact, H* accents sound quite odd, for reasons we’ll come to. In present
terms, this means that the earlier clauses are all themes, and illustrates the fact that
multiple themes, and in fact isolated themes without any rheme, are all possible.
It is interesting to consider the effect of replacing the LH% boundaries by LL%
boundaries, retaining the L+H* accents. This manipulation does not affect the
coherence of the example very much. The main effect is to make the speaker’s
prescription seem somewhat abrupt and discouraging of any interruption, and to be
generally unconcerned with whether or not it is making any sense to the hearer. In
comparison, the original (12) seems more attentive, and to invite the hearer to take
control of the discourse if they want to.
I’m going to claim that in both cases the forward motion of the discourse is the
same, and is brought about, not by the inclusion of high boundaries as such, but by
the rheme-expectation stemming from the theme-marking L+H* pitch-accents. The
specific “kinder, gentler” effect of the version with LH% boundaries arises from
their primary meaning of marking hearer-commitment. By marking the themes as, in
the speaker’s view, the hearer’s responsibility (although in fact they may be
completely new to the hearer), the possibility of the latter taking control of the
discourse is maintained at every turn.
These claims are borne out by considering the effect of substituting H* rheme
accents for L+H* in both high- and low- boundary versions. With high boundaries,
the instructions become quite irritating, and seem to imply that the hearer knows all
this already. With low boundaries, the effect is again abrupt and not hearer-oriented.
In both cases, coherence (though inferable from world knowledge) is reduced.
I’m further going to claim that all the related effects of high boundaries, which
have been variously described in the descriptive literature as “other-directed”, “turn-
yielding”, “discourse-structuring,” or “continuation” are similarly indirect
implicatures that follow from the basic sense of high boundaries, which is to identify
the hearer as in the speaker’s view committed to the relevant information unit.

5. THE FULL SYSTEM

We are now ready to look at the entire system laid out in tables 1 and 2, via some
simpler minimal pairs of examples in which tones including the L* pitch-accents
and boundaries are systematically varied across the same text.
If we limit ourselves for the sake of simplicity to tunes with a single pitch-
accent, assume that H*+L and H+L* are not distinct from H* and L*, and take LL%
and LH% as representative of the two classes of boundary, then the classification in
tables 1 and 2 allows eight tunes which exemplify the 2^3 = 8 possible combinations
of these three binary features. It is instructive to consider the effect of these tunes
when applied to the same sentence “I’m a millionaire,” uttered in response to
various prompts. It’s important to realize that all these responses are indirect, and
their force depends on whether the participants regard being a millionaire as
counting as being rich.

(13) H: You appear to be rich.

S: I’m a MILLIONAIRE.
H* LL%
[S+]ρ millionaire′me′ (S committed to an agreed rheme.)

(14) H: You appear to be poor.

S: I’m a MILLIONAIRE.
L* LL%
[S-]ρ millionaire′me′ (S committed to a non-agreed rheme.)

(15) H: Congratulations. You’re a millionaire.

S: I’m a MILLIONAIRE?
H* LH%
[H+]ρ millionaire′me′ (H committed to an agreed rheme.)

(16) H: Congratulations. You’re a millionaire.

S: I’m a MILLIONAIRE?
L* LH%
[H-]ρ millionaire′me′ (H committed to a non-agreed rheme.)

The above four responses can be assumed to consist of a single rheme.7 The ones
involving an L* pitch-accent mark the rheme as being not agreed. However, the
pitch-accent itself does not distinguish who the opposition is coming from. This is
not an ambiguity in the pitch-accent itself. Rather, the identification of the source of
the conflict and the entire illocutionary force of the response depends on inference
on the basis of what else is known about the participants’ beliefs. Thus, in (14), the
one who appears to doubt the proposition in the second utterance is the hearer, but in
(16) it is the speaker. In different contexts, the difference could be reversed or
eliminated.

A similar pattern can be observed for the theme pitch-accents:

(17) H: You appear to be rich.

S: I’m a MILLIONAIRE.
L+H* LL%
[S+]θ millionaire′me′ (S committed to an agreed theme.)

(18) H: You appear to be poor.

S: I’m a MILLIONAIRE.
L*+H LL%
[S-]θ millionaire′me′ (S committed to a non-agreed theme.)

(19) H: You appear to be a complete jerk.

S: I’m a MILLIONAIRE.
L+H* LH%
[H+]θ millionaire′me′ (H committed to an agreed theme.)

(20) H: You appear to be a complete jerk.

S: I’m a MILLIONAIRE.
L*+H LH%
[H-]θ millionaire′me′ (H committed to a non-agreed theme.)

At first encounter, it may appear that these tunes must mark rhemes, like those in
(13) to (16). However in Steedman 2000a, I show that these are in fact isolated
themes, of the kind we have already noticed in connection with example (12). These
isolated themes achieve the effect of a response (as well as various other
implicatures of impatience, diffidence, incompleteness, etc.) via the indirect speech
act of leaving the hearer to generate the rheme for themselves.
As before, the tunes involving L*+H accents imply disagreement or absence
from mutual belief. Once again, the source of the disagreement can only be
identified from the full discourse context. In the case of (19) and (20), it is important
to remember that the speaker’s LH% boundary means only that the speaker views
the hearer as committed to these themes. As far as the hearer is concerned, that is
not the same as an actual commitment. Thus the L*+H in (20) simply has the effect
of correctly excluding from the mutual belief set AGREED this theme which the
boundary marks as in H, in spite of the fact that it can also be inferred to be in the
speaker’s own beliefs S. This is the possibility that was noticed in the discussion of
tables 1 and 2: it seems a fundamental property of the system that there is a
distinction between a proposition merely being in both S and H and it actually being
in AGREED. The former amounts to a claim by the speaker that both participants
ought to be committed to it. The latter is a claim by the speaker that both actually
are committed.
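
The eight tunes of (13) to (20), and the point that membership in both S and H does not entail membership in AGREED, can be sketched as follows. This is a minimal illustration only: the dictionary-of-sets representation of the speaker's database, the function names tunes and update, and the string standing in for the proposition are all assumptions made for the example.

```python
from itertools import product

ACCENT = {("rheme", True): "H*",   ("rheme", False): "L*",
          ("theme", True): "L+H*", ("theme", False): "L*+H"}
BOUNDARY = {"S": "LL%", "H": "LH%"}

def tunes():
    """Enumerate the 2^3 single-accent tunes in the order of (13)-(20)."""
    for unit, who, agreed in product(("rheme", "theme"), ("S", "H"), (True, False)):
        yield ACCENT[(unit, agreed)], BOUNDARY[who], unit, who, agreed

def update(state, content, who, agreed):
    """Record an information unit under the commitment the tune claims."""
    state[who].add(content)
    if agreed:
        state["AGREED"].add(content)
    return state

# Example (20): assume the speaker already holds the proposition (it is in S);
# the L*+H LH% tune adds it to H but keeps it out of AGREED.
state = {"S": {"millionaire' me'"}, "H": set(), "AGREED": set()}
update(state, "millionaire' me'", "H", agreed=False)
in_both = "millionaire' me'" in state["S"] & state["H"]
in_agreed = "millionaire' me'" in state["AGREED"]
```

After the update, in_both is true while in_agreed is false, which is the configuration discussed above for (20).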
Example (20) is identical in information structural terms to the following
example, extensively discussed by Ward and Hirschberg (1985) (see Pierrehumbert
and Hirschberg 1990:295, (26)):

(21) H: Harry’s such a klutz.

S: He’s a good BADMINTON player.
L*+H LH%

In terms of the present theory, the response is an isolated theme, which achieves its
effect of contradiction by: a) claiming via an LH% boundary that the hearer is
committed to the proposition (even though in fact they may not be); b) claiming via
the L*+H pitch-accent that the theme is not (yet) mutually agreed (even though the
hearer may in fact believe its content already); and c) leaving the hearer to infer for
themselves on the basis of their world knowledge about badminton players the
implicated rheme, that Harry is not in fact a total klutz. The contradiction is
particularly effective, because a and b between them further implicate that H’s
original remark was pretty stupid, and thereby force the hearer to infer this intended
further conclusion for themselves, without the speaker needing to explicitly utter
it. However, this effect of the utterance is an indirect speech-act or conversational
implicature, not part of the literal meaning of the words or the tones.
As an aside, it is striking that within the present theory, such conversational
implicatures can be analyzed solely in terms of knowledge and modality, without
appealing explicitly to notions of cooperation, flouting, or to speech-act types and
illocutionary force recognition. Many of the examples discussed by Grice (1975)
and Searle (1975) seem to be susceptible to similar knowledge-based analysis,
making Speech-act-theoretic analyses merely emergent, as in Steedman and
Johnson-Laird (1980) and Cohen and Levesque (1990).
For example, consider Grice’s famous analysis of the sarcastic or ironic
conversational implicature achieved by saying “You’re a fine friend!” in a situation
where the hearer has actually done the speaker a disservice. His analysis requires the
hearer to detect that the speaker has flouted a conversational maxim (Quality), to
assume that the speaker is still cooperating and therefore (by a step that is not quite
clear), to infer that the speaker must mean the opposite of what they said. It is
interesting however, to observe that one intonation contour with which such
sarcastic comments are characteristically uttered is the following:

(22) You’re a FINE FRIEND!
              L*   L*  LH%

This all-rheme utterance is marked by the L* pitch-accents as not agreed or in
Mutual Belief, and by the LH% boundary as being something the hearer is in the
speaker’s view committed to. It is the latter marking that makes the hearer compare
the speaker’s proposition with their own beliefs, and identify the Rheme Alternative
Set as something like the following:

(23) { (– fine′(friend′self′)),
       (+ fine′(friend′self′)) }

At this point, the speaker has achieved their goal of making the hearer aware of
their own misdeed, and the indirect speech-act is complete, without any appeal to
cooperation, maxims, or rules explicitly associating maxim-violating utterances with
their negation. Indeed the effectiveness of the indirect accusation is greatly increased
by the fact that the speaker has, so to speak, got under the hearer’s guard, forcing
them into coming up with this thought for themselves, rather than stating it as a
speaker commitment, which the hearer might reject. We as linguists may identify
this as illocutionary uptake of an act of sarcasm, but the participants don’t need to
know about any of this.

6. INTONATION IN COMBINATORY CATEGORIAL GRAMMAR

CCG is a form of lexicalized grammar in which grammatical categories are made up
of a syntactic type defining valency and order of combination, and a logical form.
For example, the English intransitive verb walks has the following category, which
identifies it as a function from (subject) NPs (which the backward slash identifies as
on the left, and the feature-value indicated by subscript SG identifies as bearing
singular agreement) into sentences S:

(24) walks := S\NP_SG : λx.walk′x

Its interpretation is written as a λ-term associated with the syntactic category by the
operator “:”. The transitive verb admires has the category of a function from (object)
noun phrases (which the forward slash identifies as on the right) into predicates or
intransitive verbs:

(25) admires := (S\NP_SG)/NP : λx.λy.admire′xy

In this case the syntactic type is simply the SVO directional form of the semantic
type. (Juxtaposition of function and argument symbols in logical forms as in
admire′x indicates function application. A convention of left association holds,
according to which admire′xy is equivalent to (admire′x)y).
In other cases categories may “wrap” arguments into the logical form, as in the
analysis of Bach (1979, 1980), Dowty (1982), and Jacobson (1992). For example,
the following is the category of the English ditransitive verb showed, which reverses
the dominance/command relation of indirect and direct object x and y between
syntactic derivation and the logical form:8

(26) showed := ((S\NP_SG)/NP)/NP : λx.λy.λz.show′yxz

(The reason for doing this is to capture at the level of logical form the binding theory
and its dependence on the c-command hierarchy in which subject outscopes direct
object, which outscopes indirect (dative) object, which outscopes more oblique
arguments—see Steedman 1996 for discussion).
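
The logical forms in (24) to (26) are curried functions, and the convention of left association amounts to applying them one argument at a time. The following Python fragment is a minimal sketch of that reading; the use of tuples to stand in for uninterpreted logical forms, and the constants such as "book'", are illustrative assumptions only.

```python
def walks(x):
    # (24) λx.walk′x
    return ("walk'", x)

def admires(x):
    # (25) λx.λy.admire′xy: the object x is taken first, then the subject y
    return lambda y: ("admire'", x, y)

def showed(x):
    # (26) λx.λy.λz.show′yxz: arguments are consumed in the order x, y, z,
    # but x and y exchange places in the logical form ("wrapping")
    return lambda y: lambda z: ("show'", y, x, z)

# admire′xy is (admire′x)y: apply to the object, then to the subject.
verb_phrase = admires("louise'")
sentence = verb_phrase("harry'")

# The indirect object is consumed first but appears second in the logical form.
shown = showed("louise'")("book'")("harry'")
```

Here sentence is the tuple for admire′louise′harry′, and shown places the second argument before the first, as the logical form in (26) requires.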

The syntactic operations of CCG by which such interpretations are assembled are
distinguished by being strictly type-dependent, rather than structure-dependent. For
present purposes they can be regarded as limited to operations of type-raising
(corresponding to the combinator T) and composition (corresponding to the
combinator B ).
Type-raising turns argument categories such as NP into functions over the
functions that take them as arguments, such as the verbs above, into the results of
such functions. Thus NPs like Harry can take on categories such as the following:

(27) a. S/(S\NP_SG) : λp.p harry′
     b. S\(S/NP) : λp.p harry′
     c. (S\NP)/((S\NP)\NP) : λp.p harry′
     d. etc.

This operation has to be strictly limited to argument categories. One way to do so is
to specify it in the lexicon, in the categories for proper names, determiners, and the
like.
The inclusion of composition rules like the following, as well as simple
functional application and lexicalized type-raising, engenders a potentially very
free “reordering and rebracketing” calculus, and with it a generalized notion of
surface or derivational constituency.

(28) Forward composition (>B)

     X/Y : f   Y/Z : g   ⇒B   X/Z : λx.f(gx)

For example, the simple transitive sentence of English has two equally valid
surface constituent derivations, each yielding the same logical form:

(29)  Harry            admires               Louise
      ____________>T   _________________     ______________<T
      S/(S\NP_SG)      (S\NP_SG)/NP          S\(S/NP)
      : λf.f harry′    : λx.λy.admire′xy     : λp.p louise′
      ____________________________________>B
      S/NP : λx.admire′x harry′
      ______________________________________________________<
      S : admire′louise′ harry′

(30)  Harry            admires               Louise
      ____________>T   _________________     ____________________<T
      S/(S\NP_SG)      (S\NP_SG)/NP          (S\NP)\((S\NP)/NP)
      : λf.f harry′    : λx.λy.admire′xy     : λp.p louise′
                       __________________________________________<
                       S\NP_SG : λy.admire′ louise′y
      ___________________________________________________________<
      S : admire′louise′ harry′

In the first of these, Harry and admires compose as indicated by the annotation > B
to form a non-standard constituent of type S/NP. In the second, there is a more
traditional derivation involving a verb phrase of type S\NP. Both yield identical
logical forms, and both are legal surface or derivational constituent structures. More
complex sentences may have many semantically equivalent derivations, a fact whose
implications for processing are discussed in SP.
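
That derivations (29) and (30) deliver the same logical form can be checked directly by treating the combinators T and B as higher-order functions. The following Python fragment is a minimal sketch of the semantic side only: it ignores the syntactic types and directionality, and the tuple encoding of logical forms and the names type_raise and compose are assumptions made for illustration.

```python
def type_raise(a):
    """T: turn an argument a into λf.f a."""
    return lambda f: f(a)

def compose(f, g):
    """B: λx.f(g x)."""
    return lambda x: f(g(x))

# (25) λx.λy.admire′xy
admires = lambda x: lambda y: ("admire'", x, y)
harry = type_raise("harry'")
louise = type_raise("louise'")

# (29): Harry and admires compose first, giving λx.admire′x harry′,
# to which the type-raised object then applies.
harry_admires = compose(harry, admires)
lf_29 = louise(harry_admires)

# (30): the type-raised object applies to the verb, giving λy.admire′louise′y,
# to which the type-raised subject then applies.
admires_louise = louise(admires)
lf_30 = harry(admires_louise)
```

Both lf_29 and lf_30 come out as the same tuple, corresponding to admire′louise′harry′, although the intermediate constituents differ.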
This theory has been applied to the linguistic analysis of coordination,
relativization, and intonational structure in English and many other languages. For
example, since substrings like Harry admires are now fully interpreted derivational
constituents, they can undergo coordination via the schematised rule (31), allowing a
movement- and deletion- free account of right node raising, as in (32):

(31) Simplified coordination rule (<Φ>)

     X  CONJ  X′  ⇒  X′′

(32)  [Harry admires]    and     [Louise detests]    a saxophonist
      ______________>B           ________________>B  _____________<T
      S/NP               CONJ    S/NP                S\(S/NP)
      ___________________________________________<Φ>
      S/NP
      ____________________________________________________________<
      S

This type-dependent account of extraction, as opposed to the standard account using
structure-dependent rules, makes the across-the-board condition on extractions from
coordinate structures a prediction or theorem, rather than a stipulation, as
consideration of the types involved in the following examples will reveal:

(33) a. A saxophonist [that_(N\N)/(S/NP) [[Harry admires]_S/NP and [Louise
        detests]_S/NP]_S/NP]_N\N
     b. A saxophonist that_(N\N)/(S/NP) *[[Harry admires]_S/NP and [Louise
        detests him]_S]
     c. A saxophonist that_(N\N)/(S/NP) *[[Harry admires him]_S and [Louise
        detests]_S/NP]
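
The way the across-the-board condition falls out of the types in (33) can be sketched as a simple type check. In the following Python fragment categories are represented as plain strings, and the function names coordinate and relativize are hypothetical; it is an illustration of the reasoning under the simplified rule (31), not an implementation of the grammar.

```python
def coordinate(left_cat, right_cat):
    """Simplified rule (31): X CONJ X gives X; None if the types differ."""
    return left_cat if left_cat == right_cat else None

def relativize(body_cat):
    """that := (N\\N)/(S/NP): it applies only to a body of type S/NP."""
    return "N\\N" if body_cat == "S/NP" else None

# (33a): both conjuncts lack an object, so they coordinate as S/NP.
case_a = relativize(coordinate("S/NP", "S/NP"))

# (33b) and (33c): one conjunct is a complete S, so the types do not match
# and coordination fails before the relative pronoun can apply.
case_b = coordinate("S/NP", "S")
case_c = coordinate("S", "S/NP")
```

Only case_a yields a noun modifier N\N; case_b and case_c fail at the coordination step, with no separate stipulation about extraction.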

The availability of fully interpreted nonstandard derivational constituents
corresponding to substrings like Harry admires was originally motivated by their
participation in constructions like relativization and coordination and the desire to
capture those constructions with a grammar obeying a very strict form of the
Constituent Condition on Rules (SP, chapter 1). However, a theory that allows
alternative derivations like (29) and (30) is clearly immediately able to capture the
fact that prosody can make exactly the same non-standard constituents into
intonational phrases, as in (34a), as easily as the standard constituents in (34b):

(34) a. HARRY admires    LOUISE
        L+H*       LH%   H*   LL%
     b. HARRY      admires LOUISE
        H*   L             L+H*  LH%

The way that CCG derivation is made sensitive to the presence of tones is as
follows (adapted from Steedman 1999). The presence of a pitch-accent on a word
infects its whole category with themehood or rhemehood, via a pair of feature-values
θ/ρ and ±AGREED, the latter here abbreviated as superscript +/-. For example the
transitive verb admires bearing an H* pitch-accent has the following category:9

(35) admires := (Sρ+ \NPρ+)/NPρ+ :λx.λy.*admire′xy

The feature ρ ensures that a verb so marked can only combine with arguments
that are compatible with rheme marking—that is, which do not bear the theme
marking feature value θ—and marks its result as rheme marked as well. The element
in the logical form corresponding to the accented word itself is marked for k-contrast
with the asterisk operator.
Boundaries, by contrast are not properties of words or phrases, but independent
string elements in their own right. They bear a category which, by mechanisms
parallel to those discussed in more detail in SP, “freezes” θ± /ρ± -marked
constituents as complete information-/intonation-structural units, making them
unable to combine further with anything except similarly complete prosodic units.
For example, the hearer-responsibility signaling LH% boundary bears the
following category:

(36) LH% := S$φ\S$η± : λf .[H±]η′ f

where S$ is a variable ranging over S and syntactic function categories into S, η is
a variable ranging over syntactic features θ/ρ, η′ ranges over the corresponding
semantic translation θ′/ρ′ defined in terms of the alternative semantics discussed in
section 2, superscript ± is a variable ranging over ±AGREE, and φ marks the result
as a complete phonological phrase.

The derivation of (34a) then appears as follows:

(37) Harry admires Louise, with tones L+H* LH% H* LL%

     Lexical categories:
       Harry (L+H*), type-raised (>T):  Sθ+/(Sθ+\NPθ+) : λf.f *harry′
       admires:                         (S\NP)/NP : λx.λy.admire′xy
       LH%:                             S$φ\S$η± : λf.[H±]η′f
       Louise (H*), type-raised (<T):   Sρ+\(Sρ+/NPρ+) : λp.p *louise′
       LL%:                             S$φ\S$η± : λg.[S±]η′g

     Derivation steps:
       Harry + admires (>B):            Sθ+/NPθ+ : λx.admire′x *harry′
       Harry admires + LH% (<):         Sφ/NPφ : [H+]θ′(λx.admire′x *harry′)
       Louise + LL% (<):                Sφ\(Sφ/NPφ) : [S+]ρ′(λp.p *louise′)
       theme + rheme (<):               Sφ : [S+]ρ′(λp.p *louise′)([H+]θ′(λx.admire′x *harry′))
       evaluation and β-reduction:      S : admire′louise′harry′

In the last step of the derivation, the markers of speaker/hearer commitment,
agreement/disagreement, and theme/rheme are evaluated with respect to the
database, to check that the associated presuppositions hold or can be accommodated.
In the latter case this includes support or accommodation for the relevant alternative
sets, and will include updates corresponding to the new theme and rheme. If any of
these presuppositions fails, then processing will block and incomprehension will
result. If it succeeds, then the two core λ-terms can β-reduce to give the canonical
proposition as the result of the derivation.
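
The final β-reduction can be followed step by step in a short sketch. The Python fragment below is only an illustration under simplifying assumptions: it ignores the prosodic markers ([H+], [S+], θ′, ρ′) and the asterisk operator altogether, represents propositions as plain tuples, and uses the hypothetical helper names `type_raise`, `compose` and `admire`.

```python
def type_raise(individual):
    """T: turn an individual a into the function λf.f a."""
    return lambda f: f(individual)


def compose(f, g):
    """B: forward composition, λx.f(g x)."""
    return lambda x: f(g(x))


def admire(x):
    """Curried admire′: it takes the object x first, then the subject y."""
    return lambda y: ("admire", x, y)


# Theme "Harry admires": λx.admire′ x harry′, built by type-raising and composition.
theme = compose(type_raise("harry"), admire)

# Rheme "Louise": λp.p louise′, built by type-raising.
rheme = type_raise("louise")

# Applying the rheme to the theme β-reduces to the canonical proposition.
proposition = rheme(theme)
print(proposition)  # ('admire', 'louise', 'harry')
```

The same proposition results if the sentence is divided into the standard constituents of (34b); only the intermediate λ-terms differ.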

7. EMPIRICAL ISSUES
The present paper has laid a considerable burden of meaning on the distinction
between pitch-accent types, and in particular that between H* and L+H*, which
according to the present theory are respectively the most frequent rheme accent and
theme accent. It might therefore appear to be an embarrassment that there is
controversy in the literature over the reality of this distinction.
Part of this controversy stems from the fact that trained ToBI annotators show
quite low inter-annotator reliability in drawing this particular distinction (John
Pitrelli, p.c.). When the characteristics of the actual pitch-accents annotated by them
as H* and L+H* are plotted in terms of objective TILT parameters, there is very
considerable overlap between the two categories (Taylor 2000).
However, this seems to be a problem with the definitions of the relevant pitch
contours that are provided in the ToBI annotation conventions (Beckman and
Hirschberg 1999). The distinguishing characteristic of the L+H* accent is that the
rise to the pitch maximum is late, typically beginning no earlier than onset of the
vowel in the accented syllable. H* accents typically begin to rise earlier, in many
cases much earlier. The definition of L+H* in the manual as “a high peak target on
the accented syllable which is immediately preceded by relatively sharp rise from a
valley in the lowest part of the speaker’s pitch range” does not make this entirely
clear. Indeed it is likely that the distinction can only be drawn reliably if syllable
boundary alignment is taken into account, and this information is not provided in the
ToBI annotation system.
It is also important to recall in using ToBI-annotated material that the manual
explicitly instructs the annotator to use H* as the “default” accent type, explicitly
instancing L+H* accents as examples that when in doubt should be annotated as
H*.10
These characteristics of the ToBI annotation scheme mean that, useful though it
is for other purposes, extreme caution has to be exercised in drawing strong
conclusions concerning the reality of the H*/L+H* distinction from ToBI annotated
corpora. In particular, while Taylor's conclusion that the H*/L+H* distinction as
drawn in the annotation to the relevant section of the Boston News Corpus is not
phonetically real may be correct, it does not follow that the pitch-accent types
themselves are not distinct.
It is similarly unsafe to assess the present claim that L+H* is distinctively
associated with theme by applying text-based criteria for identifying topics in free
text such as those proposed by Gundel (1988). The only definition of a theme that is
possible under the present proposal is in terms of contextually established or
accommodated alternative sets. While the definitions in Steedman 2000a would
allow restricted contexts to be manipulated to control the available alternatives, and
allow the predictions concerning tune to be tested, identifying themes in free
discourse is not easy, because of the pervading involvement of accommodation and
inference in human discourse. For example, as Hedberg notes in her paper in the
present volume, some of the L+H* accents which she finds not to be associated with
topics in Gundel’s sense would be classified as isolated themes in the terms of the
present theory (see Hedberg and Sosa 2001, note 3; Hedberg and Sosa 2002).11

8. CONCLUSION
The system proposed here reduces the literal meaning of the tones to just three
semantically grounded binary oppositions. Crucially, it grammaticalizes a distinction
between the beliefs that the speaker claims by their utterance that the speaker is
committed to, and those that the hearer actually is committed to. It is only the latter
set that includes Mutual Beliefs. It is therefore consistent for the speaker to claim
and/or implicate that both they and the hearer are committed to a proposition, but
that it is not mutually believed. This is a move in the present theory that is forced by
examples like (21) and the minimal pairs in (13)-(20).
The theory places a correspondingly greater emphasis on the role of speaker-
presupposition (and its dual, hearer-accommodation), and of inference and
implicature. To that extent, the present theory follows the tradition of Halliday and
Brown, in claiming that it is the speaker who, within the constraints imposed by the
context and the participants’ beliefs and intentions, determines what is theme and
rheme, and what contrasts they embody, and not the text.

University of Edinburgh

9. NOTES

*
Thanks to Betina Braun, Daniel Büring, Klaus von Heusinger, Stephen Isard, Alex Lascarides, and
Bonnie Webber for comments on the draft. An earlier version of some parts of the paper appears as
Steedman (2002). The work was supported in part by EPSRC grants GR/M96889 and GR/R02450, and
EU FET grant MAGICSTER and EU IST grant PACO-PLUS.
1
The term “pitch-accent” is here restricted to what Ladd (1996) calls “primary” pitch-accents, sometimes
called “nuclear” pitch accents (although there may be more than one in a sentence). Ladd follows
Bolinger and many others in distinguishing primary accents from certain other accents that arise from the
interaction of lexical stress with the metrical grid. While there is still no objective measure to
distinguish the two varieties, it is the primary accents that are perceived as emphatic or contrastive.
2
The notation for tunes is Pierrehumbert’s, see Pierrehumbert and Hirschberg 1990 for details including
characteristic pitch-contours.
3
In Steedman 2000a and earlier work I called this property “focus”, following the “narrow” sense of
Selkirk (1984). However, this term invites confusion with the “broad” sense intended by Hajičová and
Sgall (1988) and Vallduví (1990), which is closer to the term “rheme” as used in the present system, and
in Steedman 2000a and Vallduví and Vilkuna 1998.
4
Hobbs (1990), who proposes a very different revision of Pierrehumbert and Hirschberg (1990) to the
present one, also gives a central role to Mutual Belief.
5
In Steedman 2000a, I called this dimension “ownership”.
6
The story comes from Dave Brubeck. Miles was of course absolutely right. The tones shown in the
example remain conjectural, however, given his complete lack of any trackable F0.
7
Under the proposal in Steedman 2000a, they could also be analyzed as an unmarked theme “I’m” and a
rheme “a millionaire”. In this particular context it makes very little difference, and we’ll ignore these
readings.
8
The present analysis differs from that of Bach and colleagues in making Wrap a lexical combinatory
operation, rather than a syntactic combinatory rule. One advantage of this analysis, which is discussed
further in Steedman 1996, is that phenomena depending on Wrap, such as anaphor binding and control,
are immediately predicted to be bounded phenomena.
9
Number agreement is suppressed in the interests of reducing formal clutter.
10
“Implicit in our discussion of the five pitch-accents is the notion that H* is the ‘default’ accent type. So,
if there is any uncertainty about how low the F0 is before the peak, as in some cases of possible L+H*
near the beginning of an utterance, the transcriber should mark ‘H*’ rather than ‘L+H*’.” (Beckman and
Hirschberg 1999).
11
Similarly, the fact that non-native speakers often obliterate pitch-accent type distinctions, and yet
manage to be understood, should no more lead one to conclude that the distinctions are not real than does
the possibility of written communication.

10. REFERENCES
Bach, Emmon. “Control in Montague Grammar.” Linguistic Inquiry 10 (1979): 513–531.
Bach, Emmon. “In Defense of Passive.” Linguistics and Philosophy 3 (1980): 297–341.
Beckman, Mary, and Julia Hirschberg. “The ToBI Annotation Conventions.” Manuscript, URL
http://ling.ohio-state.edu/~tobi/ame_tobi/annotation_conventions.html. Ohio State University, 1999.
Bolinger, Dwight. “A Theory of Pitch Accent in English.” Word 14 (1958): 109–149. Reprinted in
Bolinger (1965), pp. 17-56.
Bolinger, Dwight. “Contrastive Accent and Contrastive Stress.” Language 37 (1961): 83–96. Reprinted in
Bolinger (1965), pp. 101-117.
Bolinger, Dwight. Forms of English. Cambridge, Mass.: Harvard University Press, 1965.
Brown, Gillian. “Prosodic Structure and the Given/New Distinction.” In Anne Cutler, D. Robert Ladd,
and Gillian Brown (eds.), Prosody: Models and Measurements, pp. 67–77. Berlin: Springer-Verlag,
1983.
Brown, Gillian, Karen Currie, and Joanne Kenworthy. Questions of Intonation. London: Croom Helm,
1980.
Büring, Daniel. “The Great Scope Inversion Conspiracy.” Linguistics and Philosophy 20 (1997a): 175–
194.
Büring, Daniel. The Meaning of Topic and Focus: The 59th Street Bridge Accent. London: Routledge,
1997b.
Clark, Herbert. Using Language. Cambridge: Cambridge University Press, 1996.
Clark, Herbert, and Catherine Marshall. “Definite Reference and Mutual Knowledge.” In Aravind Joshi,
Bonnie Webber, and Ivan Sag (eds.), Elements of Discourse Understanding, pp. 10–63. Cambridge:
Cambridge University Press, 1981.
Cohen, Philip. On Knowing What to Say: Planning Speech Acts. University of Toronto: Doctoral
dissertation, 1978.
Cohen, Philip and Hector Levesque. “Rational Interaction as the Basis for Communication.” In Philip
Cohen, Jerry Morgan, and Martha Pollack (eds.), Intentions in Communication, pp. 221–255.
Cambridge, Mass.: MIT Press, 1990.
Cresswell, M.J. Logics and Languages. London: Methuen, 1973.
Cresswell, M.J. Structured Meanings. Cambridge, Mass.: MIT Press, 1985.
Dowty, David. “Grammatical Relations and Montague Grammar.” In Pauline Jacobson and Geoffrey K.
Pullum (eds.), The Nature of Syntactic Representation, pp. 79–130. Dordrecht: Reidel, 1982.
Grice, Herbert. “Logic and Conversation.” In Peter Cole and Jerry Morgan (eds.), Speech Acts, vol. 3 of
Syntax and Semantics, 41–58. New York: Seminar Press, 1975 [Written in 1967].
Gundel, Janet. The Role of Topic and Comment in Linguistic Theory. University of Texas, Austin:
Doctoral dissertation, 1974.
Gundel, Janet. “Universals of Topic-Comment Structure.” In Michael Hammond, Edith Moravcsik, and
Jessica Wirth (eds.), Syntactic Universals and Typology, pp. 209–242. Amsterdam: John Benjamins,
1988.
Gundel, Janet, and Torsten Fretheim. “Topic and Focus.” In Laurence Horn and Gregory Ward (eds.),
Handbook of Pragmatic Theory. Oxford: Blackwell, 2001.
Gunlogson, Christine. True to Form: Rising and Falling Declaratives in English. University of California
at Santa Cruz: Doctoral dissertation, 2001.
Gunlogson, Christine. “Declarative Questions.” In Brendan Jackson (ed.), Proceedings of Semantics and
Linguistic Theory XII, pp. 144–163. Ithaca, NY: Cornell University, 2002.
Gussenhoven, Carlos. On the Grammar and Semantics of Sentence Accent. Dordrecht: Foris, 1983.
Hajičová, Eva and Petr Sgall. “Topic and Focus of a Sentence and the Patterning of a Text.” In Jánös
Petöfi (ed.), Text and Discourse Constitution, pp. 70–96. Berlin: de Gruyter, 1988.
Halliday, Michael. “The Tones of English.” Archivum Linguisticum 15 (1963): 1.
Halliday, Michael. Intonation and Grammar in British English. The Hague: Mouton, 1967a.
Halliday, Michael. “Notes on Transitivity and Theme in English, Part II.” Journal of Linguistics 3
(1967b): 199–244.
Hedberg, Nancy and Juan Sosa. “The Prosodic Structure of Topic and Focus in Spontaneous English
Dialogue.” This volume.
Hedberg, Nancy and Juan Sosa. “The Prosody of Questions in Natural Discourse.” In Proceedings of
Speech Prosody, Aix en Provence, April. To appear.
Hirschberg, Julia and Janet Pierrehumbert. “Intonational Structuring of Discourse.” In Proceedings of the
24th Annual Meeting of the Association for Computational Linguistics, New York, pp. 136–144. San
Francisco, CA: Morgan Kaufmann, 1986.
Hobbs, Jerry. “The Pierrehumbert-Hirschberg Theory of Intonational Meaning Made Simple: Comments
on Pierrehumbert and Hirschberg.” In Philip Cohen, Jerry Morgan, and Martha Pollack (eds.),
Intentions in Communication, pp. 313–323. Cambridge, Mass.: MIT Press, 1990.
Jacobson, Pauline. “Flexible Categorial Grammars: Questions and Prospects.” In Robert Levine (ed.),
Formal Grammar, pp. 129–167. Oxford: Oxford University Press, 1992.
Karttunen, Lauri, and Stanley Peters. “Conventional Implicature.” In Choon-Kyu Oh and David Dinneen
(eds.), Syntax and Semantics 11: Presupposition, pp. 1–56. New York: Academic Press, 1979.
Karttunen, Lauri. “Discourse Referents.” In J. McCawley (ed.), Syntax and Semantics, vol. 7, pp. 363–
385. New York: Academic Press, 1976.
Ladd, D. Robert. Intonational Phonology. Cambridge: Cambridge University Press, 1996.
Lambrecht, Knud, and Laura Michaelis. “Sentence Accent in Information Questions: Default and
Projection.” Linguistics and Philosophy (1998): 477–544.
Lewis, David. Convention: a Philosophical Study. Cambridge Mass.: Harvard University Press, 1969.
Lewis, David. “Scorekeeping in a Language Game.” Journal of Philosophical Logic 8 (1979): 339–359.
Montague, Richard. Formal Philosophy: Papers of Richard Montague. Richmond H. Thomason (ed.).
New Haven, CT: Yale University Press, 1974.
Pierrehumbert, Janet. The Phonology and Phonetics of English Intonation. MIT: Doctoral dissertation,
1980.
Pierrehumbert, Janet, and Mary Beckman. Japanese Tone Structure. Cambridge, Mass.: MIT Press, 1988.
Pierrehumbert, Janet, and Julia Hirschberg. “The Meaning of Intonational Contours in the Interpretation
of Discourse.” In Philip Cohen, Jerry Morgan, and Martha Pollack (eds.), Intentions in
Communication, pp. 271–312. Cambridge, Mass.: MIT Press, 1990.
Prevost, Scott. A Semantics of Contrast and Information Structure for Specifying Intonation in Spoken
Language Generation. University of Pennsylvania: Doctoral dissertation, 1995.
Prevost, Scott and Mark Steedman. “Specifying Intonation from Context for Speech Synthesis.” Speech
Communication 15 (1994): 139–153.
Rooth, Mats. Association with Focus. University of Massachusetts, Amherst: Doctoral
dissertation, 1985.
Rooth, Mats. “A Theory of Focus Interpretation.” Natural Language Semantics 1 (1992): 75–116.
Searle, John. “Indirect Speech Acts.” In Peter Cole and Jerry Morgan (eds), Speech Acts, vol. 3 of Syntax
and Semantics, pp. 59–82. New York: Seminar Press, 1975.
Selkirk, Elisabeth. Phonology and Syntax. Cambridge, Mass.: MIT Press, 1984.
Silverman, Kim, Mary Beckman, John Pitrelli, Marie Ostendorf, Colin Wightman, Patti Price, Janet
Pierrehumbert, and Julia Hirschberg. “ToBI: A Standard for Labeling English Prosody.” In
Proceedings of the International Conference on Spoken Language Processing, Banff, Alberta, pp.
867–870. Edmonton: University of Alberta, 1992.
Steedman, Mark. “Structure and Intonation.” Language 67 (1991): 262–296.
Steedman, Mark. Surface Structure and Interpretation. Cambridge, Mass.: MIT Press, 1996.
Steedman, Mark. “Connectionist Sentence Processing in Perspective.” Cognitive Science 23 (1999): 615–
634.
Steedman, Mark. “Information Structure and the Syntax-Phonology Interface.” Linguistic Inquiry 31
(2000a): 649–689.
Steedman, Mark. The Syntactic Process. Cambridge, Mass.: MIT Press, 2000b.
Steedman, Mark. “Towards a Compositional Semantics for English Intonation.” Manuscript, URL
http://www.cogsci.ed.ac.uk/~steedman/papers.html. University of Edinburgh, 2002.
Steedman, Mark, and Philip Johnson-Laird. “Utterances, Sentences, and Speech-Acts: Have Computers
Anything to say?” In Brian Butterworth (ed.), Language Production 1: Speech and Talk, pp. 111–
141. London: Academic Press, 1980.
Steedman, Mark, and Ivana Kruijff-Korbayová. “Two Dimensions of Information Structure in Relation to
Discourse Semantics and Discourse Structure.” Journal of Logic, Language, and Information,
Introduction to the Special Issue on Information Structure, Discourse Semantics, and Discourse
Structure, to appear.
Stone, Matthew. Modality in Dialogue: Planning Pragmatics and Computation. University of
Pennsylvania: Doctoral dissertation, 1998.
Taylor, Paul. “Analysis and Synthesis of Intonation Using the Tilt Model.” Journal of the Acoustical
Society of America 107 (2000): 1697–1714.
Thomason, Richmond. “Accommodation, Meaning, and Implicature.” In Philip Cohen, Jerry Morgan, and
Martha Pollack (eds.), Intentions in Communication, pp. 325–363. Cambridge, Mass.: MIT Press,
1990.
Vallduví, Enric. The Information Component. University of Pennsylvania: Doctoral dissertation, 1990.
Vallduví, Enric, and Maria Vilkuna. “On Rheme and Kontrast.” In Peter Culicover and Louise McNally
(eds.), Syntax and Semantics, Vol. 29: The Limits of Syntax, pp. 79–108. San Diego, CA: Academic
Press, 1998.
Von Stechow, Arnim. “Topic, Focus and Local Relevance.” In Wolfgang Klein and Willem Levelt (eds.),
Crossing the Boundaries in Linguistics, pp. 95–130. Dordrecht: Reidel, 1981.
Ward, Gregory, and Julia Hirschberg. “Implicating Uncertainty: the Pragmatics of Fall-Rise Intonation.”
Language 61 (1985): 747–776.
KLAUS VON HEUSINGER

DISCOURSE STRUCTURE AND INTONATIONAL PHRASING*

1. INTRODUCTION
Theories that relate discourse structure and intonational structure often concentrate
on the discourse functions of pitch accents and boundary tones. Intonational
phrasing, however, is less prominently investigated. This paper focuses on
intonational phrasing and its contribution to the construction of a discourse
representation. I argue that intonational phrasing determines minimal discourse units
which serve as the building blocks in a discourse representation. Even though
minimal discourse units often correspond to syntactic constituents, sometimes they
cross constituent boundaries. The problem can be illustrated by the very first
sentence from the novel Das Parfum by Patrick Süskind, in (1).
H* !H* H* L% H*
| | | | |
(1) [Im achtzehnten Jahrhundert | lebte in Frankreich] [ein Mann, |
‘In the eighteenth century lived in France a man
H* H* !H*
| | |
der zu den genialsten | und abscheulichsten Gestalten dieser an
who was one of the most gifted and abominable personages
(H*) (H*) !H* H* !H* L%
| | | | | |
genialen und abscheulichen Gestalten nicht armen Epoche gehörte.]
in an era that knew no lack of gifted and abominable personages.’

We analyzed a read version of the novel with respect to intonational clues. The
novel was professionally read by the artist Gert Westphal in 1995. The text was
analyzed and intonationally segmented by Braunschweiler et al. (1988ff) in a project
on spoken text in Konstanz. Parts of the text were then labeled for the following
intonational properties: pitch accents (H*, L* or bitonal versions of it), boundary
tones (H%, L%), and intonational phrasing (intonational phrases “[...]”, and
intermediate phrases: “|...|”). We checked part of the labeling with Jennifer
Fitzpatrick.1
(1) is phrased into two intonational phrases, and both further into intermediate
phrases. The length of the different phrases differs quite remarkably. For example,
the second intonational phrase consists of the three intermediate phrases | ein Mann |
der zu den genialsten | und abscheulichsten Gestalten dieser an genialen und

C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 265–290.
© 2007 Springer.
abscheulichen Gestalten nicht armen Epoche gehörte |. At first glance, it is not

straightforward to assign well-formed syntactic constituents to these intonational
units, e.g. | der zu den genialsten |. Intonational phrasing depends on different
parameters, including Selkirk’s (1984) “sense unit”. For Selkirk, an intonational
phrase must be a sense unit. However, she does not give a definition of sense unit.
The paper presents a new approach that defines sense units in terms of discourse
structure. A sense unit corresponds to a discourse unit that establishes a certain
discourse relation to the already established discourse universe.
The paper is organized as follows: In section 2, I discuss different elements of
discourse representation in terms of Discourse Representation Theory (DRT) and
extend the formalism to segmented DRT, which is an attempt to integrate discourse
relations into DRT. In section 3, I discuss the different elements of the intonational
structure and their function with respect to the discourse structure. While pitch
accents and boundary tones have received various functions, the discourse function
of intonational phrasing has rarely been investigated. In section 4, I discuss the
different parameters that determine the intonational phrasing. Besides metrical,
phonological and syntactic parameters, semantics plays an important role. This
function has been termed differently: Halliday (1967) introduced the term
informational unit, while Selkirk (1984) uses sense unit. However, there is no
semantic account of these terms. I argue that the semantics of intonational phrasing
can be best accounted for in terms of discourse units. Discourse units are defined by
their function to serve as arguments in discourse relations.
In section 5, I describe different discourse relations, in particular I introduce new
discourse relations that are relations between subclausal units. While discourse
relations are defined between propositions, I show that there are also discourse
relations between smaller units. Section 6 gives a short summary. Throughout this
paper, I try to illustrate the arguments with examples from the novel Das Parfum.
Die Geschichte eines Mörders (‘Perfume: The Story of a Murderer.’) by Patrick
Süskind.2 Examples from the novel are quoted by chapter and sentence, e.g. 13-022.
The intonational phrasing always relates to the German text, even though the
English translation is often used for the discourse representation. The translation
itself is from the English version of the novel.

2. DISCOURSE STRUCTURE
Discourse structure is a cover term for different properties of a coherent text or
discourse. In the following I focus on (i) reference and anaphora, (ii) information
structure (topic-comment, or focus-background), and (iii) discourse relations between
different discourse units. There are different families of theories treating discourse
structure, each of which focuses on a different aspect. Discourse Representation Theory
(Kamp 1981, Kamp & Reyle 1993) concentrates on representing the conditions for
anaphoric reference. The discourse is incrementally (re)constructed. There is in
principle no difference between parts of sentences and whole sentences since the
construction algorithm does not recognize a special category of sentences (even though
such a category is determined by the syntactic categories of the input). A second family
of approaches (Klein & von Stutterheim 1987, Hobbs 1990, van Kuppevelt
1995, Roberts 1996, Büring 1997, 2003) understands a discourse structure as
representing the relations between propositions. Here the structure is represented as

a tree of propositions. Such theories focus on the relation between sentences (or
clauses), rather than on the relation between parts of sentences (or clauses). Neither
view – except for Roberts (1996) and Büring (1997) – integrates aspects of
information structure (topic-comment, or focus-background) in the analyses. These
concepts are often used in the description of an additional level of sentence
structure. Only the Prague School (Sgall & Hajičová & Benešová 1973) integrates
information structure into the analysis of texts and discourses (see von Heusinger
2004 for a discussion of different approaches to information structure).

2.1 Reference and anaphora in discourse

The initial problem that motivated discourse representation theories is the
interpretation of nominal and temporal anaphora in discourse. The phenomenon of
cross-sentential anaphora forces semantics to extend its limits from the sentence to
the discourse. The key idea in the approach to semantics of discourse, exemplified in
Heim (1982) and Kamp (1981), is that each new sentence or phrase is interpreted as
an addition or ‘update’ of the context in which it is used. This update often involves
connections between elements from the sentence or phrase and elements from the
context. Anaphoric relations and definite expressions are captured by links between
objects in this representation. In order to derive the truth condition of the sentence,
the representation is embedded into a model. The best way to get acquainted with
DRSs is to look at the example (2).
(2) Im achtzehnten Jahrhundert lebte in Frankreich ein Mann.
‘In eighteenth-century France there lived a man.’
(2a)  +---------------+
      | t, u, x       |
      +---------------+
      | 18th cent(t)  |
      | France(u)     |
      | Man(x)        |
      | live(x,u,t)   |
      +---------------+

(2b) {t,u,x | 18th cent(t) & France(u) & Man(x) & live(x,u,t)}

The box in (2a) graphically describes a discourse representation structure (DRS)
with two parts. One part is called the universe of the DRS, the other its condition set.
A DRS is an ordered pair consisting of its universe and condition set, which can also
be represented in set notation as in (2b); this set describes all possible instances for
the discourse referents such that the conditions hold of them. The DRS in (2a) or
(2b) has three discourse referents t, u, x in its universe and the conditions that the
discourse referent t is a time point in the 18th century, the discourse referent u a
location in France, the discourse referent x a man, and that the predicate live holds
of x at the location u and at the time t. To obtain the truth condition, we have to
map the DRS onto a model M by an embedding function f that maps the discourse
referents onto elements of the domain of M such that the elements are in the
extension of the predicates that are ascribed to the discourse referents. For example,
the DRS (2a) or (2b) is true just in case that f(t) is in the 18th century, f(u) is in
France, f(x) is a man and f(x) lives in f(u) at f(t).
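
The truth definition can be made concrete with a small sketch. The Python fragment below is an illustration under stated assumptions, not an implementation of DRT: the model is a toy one, the individuals `time0`, `place0` and `man0` and the function names `verifying_embeddings` and `is_true` are invented for the example, and the embedding function f is found by brute-force search over the domain.

```python
from itertools import product


def verifying_embeddings(universe, conditions, model):
    """Yield every embedding f (a dict) that verifies all conditions in the model."""
    domain = sorted(model["domain"])
    for values in product(domain, repeat=len(universe)):
        f = dict(zip(universe, values))
        if all(
            tuple(f[ref] for ref in args) in model["extension"].get(pred, set())
            for pred, args in conditions
        ):
            yield f


def is_true(universe, conditions, model):
    """A DRS is true in a model if at least one verifying embedding exists."""
    return next(verifying_embeddings(universe, conditions, model), None) is not None


# The DRS (2a)/(2b) as an ordered pair of universe and condition set.
universe = ("t", "u", "x")
conditions = [
    ("18th cent", ("t",)),
    ("France", ("u",)),
    ("Man", ("x",)),
    ("live", ("x", "u", "t")),
]

# A toy model (hypothetical individuals, for illustration only).
model = {
    "domain": {"time0", "place0", "man0"},
    "extension": {
        "18th cent": {("time0",)},
        "France": {("place0",)},
        "Man": {("man0",)},
        "live": {("man0", "place0", "time0")},
    },
}

print(is_true(universe, conditions, model))
```

Removing the single tuple from the extension of live leaves no verifying embedding, so the same DRS is false in that model.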
The sequence or conjunction of two sentences as in (3) receives a DRS
incrementally. We start with the already established DRS for the first conjunct in
(2a), and build the new DRS (3b) by inserting the new discourse referents for the
pronoun er and the NP Jean-Baptiste Grenouille, and a condition for the predicate
hieß. The anaphoric link of the pronoun is graphically represented as y = ?,
indicating that the reference of the pronoun is still unresolved. The discourse
referent which stands for an anaphoric expression must be identified with another
accessible discourse referent in the universe. In the given context, y is identified
with x, as in (3c). This mini-discourse is true if there is an embedding function f onto
a model such that f(t) is in the 18th century, f(u) is in France, f(x) is a man, f(x) lives
in f(u) at f(t), f(y) = f(x), f(z) is Jean-Baptiste Grenouille, and f(y) was named f(z).
(3) Im achtzehnten Jahrhundert lebte in Frankreich ein Mann. Er hieß
Jean-Baptiste Grenouille.
‘In eighteenth-century France there lived a man. His name was Jean-
Baptiste Grenouille.’
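(3a)  +---------------+
      | t, u, x       |
      +---------------+
      | 18th cent(t)  |
      | France(u)     |
      | Man(x)        |
      | live(x,u,t)   |
      +---------------+

(3b)  +---------------------+
      | t, u, x, y, z       |
      +---------------------+
      | 18th cent(t)        |
      | France(u)           |
      | Man(x)              |
      | live(x,u,t)         |
      | y = ?               |
      | z = J.B. Grenouille |
      | name(y,z)           |
      +---------------------+

(3c)  +---------------------+
      | t, u, x, y, z       |
      +---------------------+
      | 18th cent(t)        |
      | France(u)           |
      | Man(x)              |
      | live(x,u,t)         |
      | y = x               |
      | z = J.B. Grenouille |
      | name(y,z)           |
      +---------------------+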
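
The incremental construction from (3a) via (3b) to (3c) can be sketched as two operations on the ordered pair of universe and condition set. The Python fragment below is a minimal illustration and not the DRT construction algorithm itself: the function names `update` and `resolve` and the encoding of conditions as tuples of strings are assumptions made for the example.

```python
def update(drs, new_referents, new_conditions):
    """Extend a DRS (universe, conditions) with new referents and conditions."""
    universe, conditions = drs
    return (universe + tuple(new_referents), conditions + tuple(new_conditions))


def resolve(drs, pronoun, antecedent):
    """Replace the open condition 'pronoun = ?' by 'pronoun = antecedent'."""
    universe, conditions = drs
    if antecedent not in universe:
        raise ValueError(f"{antecedent} is not a discourse referent of this DRS")
    resolved = tuple(
        (pronoun, "=", antecedent) if cond == (pronoun, "=", "?") else cond
        for cond in conditions
    )
    return (universe, resolved)


# (3a): the DRS for the first sentence.
drs_3a = (
    ("t", "u", "x"),
    (("18th cent", "t"), ("France", "u"), ("Man", "x"), ("live", "x", "u", "t")),
)

# (3b): add the referents and conditions contributed by the second sentence.
drs_3b = update(
    drs_3a,
    ("y", "z"),
    (("y", "=", "?"), ("z", "=", "J.B. Grenouille"), ("name", "y", "z")),
)

# (3c): identify the pronoun's referent y with the accessible referent x.
drs_3c = resolve(drs_3b, "y", "x")
print(drs_3c)
```

Both operations return a new pair rather than changing the old one, so the DRS for the first sentence remains available as the context of the second.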

The new discourse referent introduced by the pronoun must be linked with an
already established and accessible discourse referent. DRT defines accessibility in
terms of structural relations, i.e. the discourse referent must be in the same (or in a
higher) DRS. With this concept of accessibility, the contrast between (4) and (5) can
be described by the difference in the set of discourse referents that are accessible for
the discourse referent v of the pronoun er in (4) and (5). The construction rule for
the negation in (4) creates an embedded discourse universe with the discourse
referent u and the conditions scent(u) and x gave u to the world. The anaphoric
pronoun er in the third (hypothetical) sentence cannot find a suitable discourse
referent since it has no access to the embedded discourse universe with the only
fitting discourse referent u. In (5a), however, the pronoun er in the second sentence
is represented by the discourse referent v and the condition v = ?. This referent can
be linked to the accessible discourse referent x, licensing the anaphoric link.

(4) So ein Zeck war das Kind Grenouille. An die Welt gab es nichts ab
(...) nicht einmal einen Duft1 . (04-061) #Er1 war stark.
‘The young Grenouille was such a tick. He gave the world nothing (...)
not even his own scent. #It was strong.’

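(4a)  +-------------------------------+
      | x, y, z, v                    |
      +-------------------------------+
      | Tick(x)                       |
      | young Gr(y)                   |
      | x is y                        |
      | z = x                         |
      |     +-----------------------+ |
      | not | u                     | |
      |     +-----------------------+ |
      |     | scent(u)              | |
      |     | z gave u to the world | |
      |     +-----------------------+ |
      | v = ?                         |
      | strong(v)                     |
      +-------------------------------+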

(5) Ein anderes Parfum aus seinem Arsenal war ein mitleiderregender
Duft1 , der sich bei Frauen mittleren und höheren Alters bewährte.
Er1 roch nach dünner Milch und sauberem weichem Holz. (38-015)
‘Another perfume in his arsenal was a scent for arousing sympathy
that proved effective with middle-aged and elderly women. It smelled
of watery milk and fresh soft wood.’

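(5a)  +-------------------------------------+
      | x, y, v                             |
      +-------------------------------------+
      | scent for arousing sympathy         |
      |   that proved effective with        |
      |   middle-aged and elderly women(x)  |
      | Another perfume in his arsenal(y)   |
      | x is y                              |
      | v = x                               |
      | v smelled of watery milk            |
      |   and fresh soft wood               |
      +-------------------------------------+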
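
The structural notion of accessibility behind the contrast between (4) and (5) can be sketched as a walk from a DRS upwards through the DRSs that contain it. The Python fragment below is an illustration only: the class name `Box` and the function `antecedents` are hypothetical, and the sketch deliberately ignores gender, number and plausibility, which also constrain the choice of antecedent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Box:
    """A DRS reduced to its universe and a link to the DRS that contains it."""
    universe: List[str]
    parent: Optional["Box"] = None

    def accessible(self) -> List[str]:
        """Referents of this DRS and of every DRS above it, never of those below."""
        refs = []
        box = self
        while box is not None:
            refs.extend(box.universe)
            box = box.parent
        return refs


def antecedents(box, pronoun):
    """Structurally possible antecedents for a pronoun introduced in box."""
    return [ref for ref in box.accessible() if ref != pronoun]


# (4a): the scent u lives in the DRS embedded under negation.
main_4a = Box(["x", "y", "z", "v"])
negated_4a = Box(["u"], parent=main_4a)

# (5a): the scent x lives in the main DRS.
main_5a = Box(["x", "y", "v"])

print(antecedents(main_4a, "v"))  # ['x', 'y', 'z']
print(antecedents(main_5a, "v"))  # ['x', 'y']
```

In (4a) the referent u is visible only from inside the negated DRS, so the pronoun's referent v in the main DRS cannot be identified with it; in (5a) the referent x is in the same DRS as v and is therefore available.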

2.2 Information structure and discourse structure

Information structure is generally understood as an additional linguistic level to
describe sentence structure. Information structure often does not map onto syntactic
structure, and this was the main reason for introducing this level of description in the
19th century. It subsequently received different terms, such as theme-rheme, topic-comment,
focus-background (see Sgall et al. 1973 for an overview). The theoretical
basis for this additional structure varies according to the background theory of the
researcher. But in most approaches information structure is defined by the
contribution of the informational units to the sentence meaning.
This is illustrated by the next two examples. In (6) the time of the reported event
is fronted – since the time was already introduced, one can also say that this phrase
is discourse-linked or backgrounded. In (7), however, the exclamation gut ‘good’ is
fronted for focusing, while the given reference of the pronoun is backgrounded.
(6) Zu der Zeit, von der wir reden, herrschte in den Städten ein für uns
moderne Menschen kaum vorstellbarer Gestank.
‘In the period of which we speak, there reigned in the cities a stench
barely conceivable to us modern men and women.’
(7) Gut schaut er aus.
‘He looks good.’

In general, theories assume that one unit is linked to the established discourse, while
the other is said to express the new information in the sentence. Because of space
limitations, I cannot present a full survey of the different approaches and a
general criticism (see von Heusinger 2004). I only want to stress the point that
information structure is often understood as a sentence structure and not as part of a
discourse structure. Therefore, it is not included in discourse representation theories.

2.3 Sentence and discourse relation

A discourse consists of sentences that are related to each other by discourse relations such as
causation, explanation, coherence, elaboration, continuation. This can be illustrated
by the following two discourse segments. In (8) the question is followed by a
continuation, which in itself consists of a causation and a conjunction. This is best
represented in an annotated tree, as in (8a). Similarly, the sentence (9) can be split
into its clauses, which can then be represented in a tree, as in (9a).

(8) “Was ist das?” sagte Terrier und beugte sich über den Korb und
schnupperte daran, denn er vermutete Eßbares. (02-002)
‘“What's that?” asked Terrier, bending down over the basket and
sniffing at it, in the hope that it was something edible.’

(8a) Continuation
       What's that?
       asked Terrier
       Causation
         Conjunction
           bending down over the basket
           sniffing at it
         in the hope that it was something edible.

(9) Technische Einzelheiten waren ihm sehr zuwider, denn Einzelheiten
bedeuteten immer Schwierigkeiten, und Schwierigkeiten bedeuteten
eine Störung seiner Gemütsruhe, und das konnte er gar nicht
vertragen. (02-015)
‘He despised technical details, because details meant difficulties, and
difficulties meant ruffling his composure, and he simply would not put
up with that.’
(9a) Causation
       He despised technical details,
       Elaboration
         because details meant difficulties
         Elaboration
           and difficulties meant ruffling his composure
           and he simply would not put up with that.
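
An annotated tree of this kind can be written down as a nested data structure. The following Python sketch encodes one reading of the tree (8a); the nesting and the helper names leaves and relations are assumptions made for illustration only.

```python
# A node is (relation, [children]); a leaf is a plain string (one clause).
tree_8a = ("Continuation", [
    "What's that?",
    "asked Terrier",
    ("Causation", [
        ("Conjunction", ["bending down over the basket", "sniffing at it"]),
        "in the hope that it was something edible.",
    ]),
])


def leaves(node):
    """Return the clauses of the tree in their linear order."""
    if isinstance(node, str):
        return [node]
    _, children = node
    return [leaf for child in children for leaf in leaves(child)]


def relations(node):
    """Return the discourse relations of the tree, top-down."""
    if isinstance(node, str):
        return []
    relation, children = node
    return [relation] + [r for child in children for r in relations(child)]


print(relations(tree_8a))  # ['Continuation', 'Causation', 'Conjunction']
print(len(leaves(tree_8a)))  # 5
```

In this encoding the leaves are always clauses, which makes visible the limitation noted below: there is no place for units smaller than a proposition.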

Recent approaches to discourse structure (Hobbs 1990, van Kuppevelt 1995,
Roberts 1996, Büring 1997, 2003) use annotated trees that relate propositions to
each other. However, such approaches do not relate the internal structure to the
propositions, nor do they assume discourse units smaller than propositions.
Only Asher (1993, 2004) combines insights from DRT and discourse relations in his
theory of segmented DRT (= SDRT), which is not confined to the incremental
composition of DRSs, but also captures discourse relations between the sentences in
the discourse. He revises the classical DRT of Kamp (1981) and Kamp & Reyle
(1993). The classical version describes the dynamic meaning of words or phrases
with respect to a discourse structure. There is, however, no means to compare the
dynamic potential of a full sentence with the discourse so far established. Asher
(1993, 256) notes that
the notion of semantic updating in the original DRT fragment of Kamp (1981) (...) is
extremely simple, except for the procedures for resolving pronouns and temporal
elements, which the original theory did not spell out. To build a DRS for the discourse
as a whole and thus to determine its truth conditions, one simply adds the DRS
constructed for each constituent sentence to what one already had. (...) This procedure is
hopelessly inadequate, if one wants to build a theory of discourse structure and
discourse segmentation.

In SDRT, each sentence Si is first represented as a particular segmented DRS for
that sentence. The segmented DRS can then interact with the already established
DRS reconstructing a discourse relation R, such as Causation, Continuation,
Conjunction, Elaboration, etc. as informally sketched in (8b) and (8c) for the tree
structure (8a). First the clause receives its DRS, which can then be related to the
already established DRS, and then the representation can be integrated into the
already established representation. In (8b), the already established DRS contains
among other elements the discourse referents for the basket and for Terrier. The first
two sentences from the tree (8a) are translated into DRSs which establish the
discourse relation of Continuation, while the rest remains in the tree. In (8c) these
two DRSs are integrated into the main DRS and the other three clauses are translated
into segmented DRSs which again establish certain discourse relations with the main
DRS. The sentence in (8) is represented as the DRS in (8b), with the box for the
discourse information. The relation between the sentences (or propositions) is Cont.
The remaining structure is given in (8b), and the DRSs for that structure are given in
(8c):

(8) “Was ist das?” sagte Terrier und beugte sich über den Korb und
schnupperte daran, denn er vermutete Eßbares. (02-002)
(8b) Main DRS: [x, y, z, ... | basket(x), Terrier(y), .....]
       Cont: [u, p | What(u) = p, u = ]
       Cont: [v | v = ?, Terrier(v), asked(v, p)]
     Remaining tree:
       Causation
         Conjunction
           bending down over the basket
           sniffing at it
         in the hope that it was something edible.

(8c) Main DRS: [x, y, z, u, p, v | basket(x), Terrier(y), ....., What(u) = p, u = x, v = y, Terrier(v), asked(v, p)]
       Cont: [w | y bending down over the basket(w), w = ?]
       Conj: [k | y sniffing at k, k = ?]
       Caus: [l | in the hope that l was something edible, l = ?]
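
The integration step from (8b) to (8c) can be sketched as follows. This is a deliberately minimal illustration, not Asher's formal definition: DRSs are reduced to Python dictionaries, the function name integrate is hypothetical, and the antecedent for the unresolved referent is supplied by hand instead of being computed.

```python
def integrate(main, segment, bindings):
    """Merge a segmented DRS into the main DRS.

    main, segment: dicts with 'referents' (list) and 'conditions' (list of str).
    bindings: maps each unresolved referent of the segment to its antecedent.
    """
    resolved = []
    for cond in segment["conditions"]:
        for ref, antecedent in bindings.items():
            if cond == f"{ref} = ?":
                cond = f"{ref} = {antecedent}"  # resolve the anaphoric condition
        resolved.append(cond)
    return {
        "referents": main["referents"] + segment["referents"],
        "conditions": main["conditions"] + resolved,
    }


# Main DRS after the question has been integrated (values as in (8c)).
main = {
    "referents": ["x", "y", "u", "p"],
    "conditions": ["basket(x)", "Terrier(y)", "What(u) = p", "u = x"],
}
# Segmented DRS for "asked Terrier", with the still unresolved referent v.
asked = {"referents": ["v"], "conditions": ["v = ?", "Terrier(v)", "asked(v, p)"]}

updated = integrate(main, asked, {"v": "y"})
print(updated["conditions"][-3:])  # ['v = y', 'Terrier(v)', 'asked(v, p)']
```

The discourse relation itself (here Cont) is left out of the sketch; it would label the link between the segment and the main DRS before the two are merged.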

To summarize this very short presentation of DRT, the discourse structure of DRT
not only provides a new structure but also introduces new semantic objects:
discourse referents, conditions, and discourse domains (“boxes”). DRT explains
semantic categories such as definiteness and anaphora in terms of interaction
between these representations. Furthermore, the extension to SDRT allows us to
express discourse relations between whole propositions, as well. These new tools,
objects, and representations form the basis for a new semantic analysis of
information structure. In the next section, this approach is sketched briefly.

3. INTONATIONAL STRUCTURE
Intonation contours are represented by phonologists as a sequence of abstract tones
consisting of pitch accents and two types of boundary tones. Pierrehumbert &
Hirschberg (1990, 308) assign discourse functions to the particular tones: “Pitch
accents convey information about the status of discourse referents (...). Phrase
accents [= boundary tones of intermediate phrases] convey information about the
relatedness of intermediate phrases (...). Boundary tones convey information about
the directionality of interpretation for the current intonational phrase (...).” The
status of discourse referents can be accounted for in terms of given vs. new; the
boundary tones of intonational phrases indicate how the proposition expressed by
the whole phrase is integrated into the discourse. Similarly, boundary tones of
intermediate (or phonological) phrases that correspond to a full proposition indicate
the way these propositions are interpreted with respect to the linguistic context, as
illustrated in (10) and (11). While in (10), the L-boundary tone indicates that the two
clauses have no relation to each other, the H-boundary tone in (11) indicates that the
first clause is related to the second, suggesting a discourse relation of causation.
L L L%
| | |
(10) [(George ate chicken soup) | (and got sick) ]
H L L%
| | |
(11) [(George ate chicken soup) | (and got sick)]

However, in this view there is no way of treating phrases that correspond to units
below the clause level, such as the modification im achtzehnten Jahrhundert (‘in the
eighteenth century’), the unsaturated phrase lebte in Frankreich (‘lived in France’)
or the first part of the complex noun der zu den genialsten (‘one of the most gifted’)
in example (1), repeated as (12).
(12) [Im achtzehnten Jahrhundert | lebte in Frankreich] [ein Mann, | der zu
den genialsten | und abscheulichsten Gestalten dieser an genialen und
abscheulichen Gestalten nicht armen Epoche gehörte.]

All these phrases can constitute intermediate phrases in German. Even though
English and many other languages mark their intermediate phrases by boundary
tones, in German there is no evidence for boundary tones for intermediate phrases
(Féry 1993, 59-79). Evidence for intermediate phrases in German must be taken
from other criteria. I argue on the basis of discourse structure and discourse relations
that intonational phrasing (intonational and intermediate phrases) can sufficiently be
defined by its function in building a discourse structure. Before I give a
characterization of intonational phrasing for intonational phrases and intermediate
phrases, I first present some approaches to the functions of pitch accents and
boundary tones.

3.1 Pitch accents and reference

Each intonational unit (intermediate phrase or intonational phrase) must have at least
one pitch accent. Pitch accents are associated with prosodically prominent
expressions in that phrase. Often they are associated with focus and thus indicate
new (or not-given) information. Pitch accents themselves are often said to express
the discourse status of their associated expressions (Hobbs 1990, Gussenhoven
1984, Ladd 1996). This can be illustrated by (13) and (14). (13) is the first sentence
of the novel and introduces the time, the place and the person by phrases marked
with a H* pitch accent. (14) is the first sentence of the second chapter. The wet
nurse Jeanne Bussie was already introduced in the first chapter; so the L* indicates
that she is discourse-old.
H* !H* H* L% H*
| | | | |
(13) [Im achtzehnten Jahrhundert | lebte in Frankreich] [ein Mann,
In the eighteenth century lived in France a man
L* H% H* L* LH* H%
| | | | | |
(14) [Einige Wochen später] [stand die Amme | Jeanne Bussie] ...(02-001)
Few weeks later stood the wet nurse Jeanne Bussie

The pitch accent can also indicate contrast between two referents or unexpected
relations between two referents, as illustrated in the often quoted example (15) and a
sentence from our novel (16):
(15) First HE called HIM a Republican and then HE offended HIM.
(16) Grenouille folgte ihm, mit bänglich pochendem Herzen, denn er ahnte,
daß nicht ER DEM DUFT folgte, sondern daß DER DUFT IHN
gefangengenommen hatte und nun unwiderstehlich zu sich zog. (08-036)
‘Grenouille followed it, his fearful heart pounding, for he suspected
that it was not he who followed the scent, but the scent that had
captured him and was drawing him irresistibly to it.’

3.2 Tune representing information structure

Steedman (1991, 2000) interprets Halliday's thematic structure (see section 4.2) in
terms of combinatory categorial grammar (CCG). This can be illustrated with the
following example, whose information structure is partitioned into theme and rheme. Both
thematic units are further divided into given material and new material; the latter is
associated with a pitch accent.
(17) Q: I know that Mary's FIRST degree is in PHYSICS.
But what is the subject of her DOCTORATE?
L+H*LH% H* LL%
A: [Mary's DOCTORATE | is in CHEMISTRY]
Given New Given New
Theme Rheme
The basic informational units are the theme and the utterance. All other parts are
defined with respect to these basic elements. For example, the rheme is a function
that takes the theme as an argument to yield the utterance. Steedman now defines the
syntactic function of the pitch accent L+H* as a theme that lacks a boundary tone,
i.e. as a function that needs a boundary tone to yield a theme. Analogously, the pitch
accent H* indicates a function that needs a boundary tone in order to yield a rheme.
Thus in the description of tones, Steedman assumes the boundary tones and the
whole tune as the primary units, while the pitch accents define the informational
status as theme or rheme (cf. Hayes & Lahiri 1991 for a similar approach with
respect to sentence type).
(18) Categorial functions of tones for English (Steedman 1991)
a LH% boundary tone simple argument
b LL% boundary tone simple argument
c L+H* pitch accent function from boundary tone into theme
d H* pitch accent function from boundary tones into rheme
e L+H*LH% contour simple argument: theme
f H* LL% contour function from themes into utterance
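
The functional view in (18) can be made concrete with a small Python sketch. It is a toy model, not Steedman's CCG: the names PITCH_ACCENT, tune and combine are hypothetical, categories are plain strings, and the theme is assumed to precede the rheme as in (17).

```python
# (18c, d): each pitch accent is a function that needs a boundary tone.
PITCH_ACCENT = {"L+H*": "theme", "H*": "rheme"}
# (18a, b): boundary tones are simple arguments.
BOUNDARY_TONES = {"LH%", "LL%"}


def tune(pitch_accent, boundary):
    """Apply a pitch accent to a boundary tone, yielding a theme or a rheme."""
    if boundary not in BOUNDARY_TONES:
        raise ValueError(f"not a boundary tone: {boundary}")
    return PITCH_ACCENT[pitch_accent]


def combine(first, second):
    """(18f): the rheme takes the theme as its argument and yields the utterance."""
    if first == "theme" and second == "rheme":
        return "utterance"
    raise ValueError("expected a theme followed by a rheme")


theme = tune("L+H*", "LH%")   # Mary's DOCTORATE
rheme = tune("H*", "LL%")     # is in CHEMISTRY
print(combine(theme, rheme))  # utterance
```

The sketch makes one property of the analysis explicit: a pitch accent without a boundary tone is an incomplete function and cannot enter the theme-rheme combination.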

Steedman uses the terms theme and rheme as well as given and new. The first pair
can be defined with respect to the sentence under analysis. Yet the second pair can
only be defined by the discourse in which the sentence is embedded.
Even though the tones and their functions are different for German, the
following example from our novel may illustrate Steedman’s analysis. The first
phrase ends with a H% boundary tone representing the theme (with the global
contour of L*H%, cf. (18e)), while the second intonational phrase ends with L%
expressing the rheme (with the global contour ...H*L%, cf. (18f)).
L* H%
| |
(19) [Zu der Zeit, von der wir reden,] [herrschte in den Städten
‘In the period of which we speak, there reigned in the cities
H*L H* !H* L%
| | | |
ein für uns moderne Menschen | kaum vorstellbarer Gestank.]
to us modern men and women a stench barely conceivable’

However, not all sentences can be divided into one theme and one rheme, as in (20):
L* H% H* L* LH* H%
| | | | | |
(20a) [Einige Wochen später] [stand die Amme | Jeanne Bussie]
‘Few weeks later stood the wet nurse Jeanne Bussie
H* H%
| |
b [mit einem Henkelkorb in der Hand]
with a market basket in the hand
L* H* H%
| | |
c [vor der Pforte des Klosters von Saint-Merri]
at the gate of the cloister of Saint-Merri
H* !H* !H* L%
| | | |
d [und sagte dem öffnenden Pater Terrier,]
and said to the opening Father Terrier’

The first four intonational phrases end with an H% boundary tone, and only the last
phrase with an L% boundary tone. This is difficult to explain in terms of a view of
information structure that is sentence bound. In such a view we must assume several
themes before we get to the rheme, and the final sentence. The example suggests
that the boundary tones indicate the relation of the phrase to the already established
discourse on the one hand, and to the subsequent discourse on the other.

3.3 Tones representing different discourse functions

Pierrehumbert & Hirschberg (1990) give a list of functions of pitch accents and
boundary tones. The latter indicate whether the phrase to which the boundary tone is
associated should be interpreted with respect to the preceding discourse or to the
following discourse. Pierrehumbert & Hirschberg (1990, 304) illustrate this point in
the following contrast between (21) and (22). The low boundary tone L% in (21a)
indicates that this sentence as a unit is related to the discourse on its own, while the
high boundary tone H% in (22a) indicates that it is to be interpreted with respect to
the following sentence forming a large unit which then can be inserted into or
related to the discourse. This difference influences the choice of the antecedent of
the pronoun it in (21b) and (22b). In (21) it refers to the following proposition I
spent two hours figuring out how to use the jack, while in (22) it refers back to the
new car manual.
L L%
(21a) My new car manual is almost unreadable.
L H%
b It's quite annoying.
L L%
c I spent two hours figuring out how to use the jack.
L H%
(22a) My new car manual is almost unreadable.
L H%
b It's quite annoying.
L L%
c I spent two hours figuring out how to use the jack.

Pierrehumbert & Hirschberg (1990, 308) assign the following discourse functions to
the particular tones:
Pitch accents convey information about the status of discourse referents, modifiers,
predicates, and relationships specified by accented lexical items. Phrase accents convey
information about the relatedness of intermediate phrases–in particular, whether (the
propositional content of) one intermediate phrase is to form part of a larger
interpretative unit with another. Boundary tones convey information about the
directionality of interpretation for the current intonational phrase–whether it is
“forward-looking” or not.

In explaining the function of intonational phrasing (intonational and intermediate
phrases), they refer to the “propositional content” of the corresponding phrase. This
can also be illustrated by the following fragment from our novel. The low boundary
tones in (23a) and (23b) indicate that the content of the utterance can be added to the
discourse without relating it to subsequent utterances. However, the high boundary
tone in (23c) indicates that the utterance (“But I've put a stop to that”) must be
related to the next utterance (23d) (“Now you can feed him yourselves”).
H* L%
| |
(23a) [Weil er sich an mir vollgefressen hat.]
‘Because he himself on me stuffed has
H* L% H* L%
| | | |
b [Weil er mich leergepumpt hat] [bis auf die Knochen.]
Because he's pumped me dry down to the bones.
H* H%
| |
c [Aber damit ist jetzt Schluß.]
But with that is now end
H* !H* L%
| | |
d [Jetzt könnt Ihr ihn selber weiterfüttern]
Now can you him yourselves feed.’
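
The directionality function of boundary tones in (23) can be sketched as a grouping procedure. The following Python fragment is an illustration under a strong simplification: H% is taken to mean "interpret together with the next phrase" and L% to mean "close the current unit"; the function name interpretive_units is hypothetical.

```python
def interpretive_units(phrases):
    """Group intonational phrases: H% links a phrase forward, L% closes a unit."""
    units, current = [], []
    for text, boundary in phrases:
        current.append(text)
        if boundary == "L%":
            units.append(current)
            current = []
    if current:  # a final forward-looking phrase with nothing to attach to
        units.append(current)
    return units


example_23 = [
    ("Weil er sich an mir vollgefressen hat.", "L%"),
    ("Weil er mich leergepumpt hat", "L%"),
    ("bis auf die Knochen.", "L%"),
    ("Aber damit ist jetzt Schluß.", "H%"),
    ("Jetzt könnt Ihr ihn selber weiterfüttern", "L%"),
]

units = interpretive_units(example_23)
print(len(units))  # 4
print(units[-1])   # (23c) and (23d) form one unit
```

Applied to (20), where the first four phrases end in H% and only the last in L%, the same procedure returns a single unit comprising all five phrases.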

However, not all intonational phrases can be associated with a propositional
content; some intonational units might only refer to modifications such as im
achtzehnten Jahrhundert (‘in the eighteenth century’) or the unsaturated phrase lebte
in Frankreich (‘lived in France’) of example (1), repeated as (12). Thus, the
functions of boundary tones must be redefined with respect to these “sub-propositional”
units. Intonational phrasing doesn’t always correspond to propositions or to simple
discourse referents. Therefore, we need a more fine-grained discourse structure that
allows us to construct corresponding discourse segments.

Summarizing, pitch accents may indicate the discourse status of their respective
discourse referents. They can also form the nucleus of an informational unit, as in
Steedman's approach, which is, however, limited to the sentence. Pierrehumbert &
Hirschberg define the function of boundary tones with respect to the relations
between clauses. However, they can only deal with phrases that are associated with
propositions. None of these approaches accounts for the discourse function of
subclausal units. Before I develop such an approach in section 5, I give a sketch of
the description of intonational phrasing in the next section.

4. INTONATIONAL PHRASING AND ITS FUNCTION

4.1 Phrasing
The term intonational phrase (IP) is usually applied to spans of the utterance which
are delimited by boundary tones: “Like other researchers, we will take the melody
for an ‘intonational phrase’ to be the tune whose internal makeup is to be described.
As a rule of thumb, an intonational phrase boundary (transcribed here as %) can be
taken to occur where there is a non-hesitation pause or where a pause could be
felicitously inserted without perturbing the pitch contour” (Pierrehumbert 1980, 19).
In (24) from Selkirk (1995, 566), there are three intonational phrases, such that the
relative clause corresponds to one, while each part of the matrix sentence to the right
and to the left constitutes one. In (25) from the novel Das Parfum (02-125), one
intonational phrase marks the direct speech, while the two others are associated with
the two conjuncts of the assertion. The second conjunct is further divided into two
intermediate phrases.
(24) H% H% L%
| | |
[Fred,]IP [who's a volunteer fireman,]IP [teaches third grade]IP
(25) H* L% L* H% (L*) H* L%
| | | | | | |
[“Na?”] [bellte Terrier] [und knipste ungeduldig | an seinen Fingernägeln.]
‘“Well?” barked Terrier, clicking his fingernails impatiently.’

The terms in which we can define an intonational phrase are not very clearly
understood. There are phonetic, syntactic and semantic criteria for forming an
intonational phrase:

(26) Linguistic criteria for defining an intonational phrase (IP)

(i) Timing: An IP can be preceded and followed by a pause.
(ii) Metrical: The metrical structure provides an additional clue,
viz., the presence of a most prominent accent.
(iii) Tonal: The boundary of an IP is sometimes tonally marked
by a boundary tone. Pitch range adjustment plays a
role, as well.
(iv) Junctural: The boundary of an IP can block certain junctural
phenomena (cf. Nespor & Vogel (1986)).
(v) Syntactic-prosodic: The boundaries of an IP correspond to
those of some syntactic constituents.
(vi) Semantic: The material in the IP must constitute an
informational unit or sense unit.

The conflict between different criteria can be illustrated with the first sentence of
our novel (1), repeated as (27).
(27) H* !H* H* L% H*
| | | | |
[Im achtzehnten Jahrhundert |580 lebte in Frankreich]300[ein Mann,|590
In the eighteenth century lived in France a man

The subscripts indicate the duration of the pauses: the pause between the two
intonational phrases is shorter than the pauses inside either of them. We rather assume the boundary tone
as a robust criterion for an intonational phrase. Unfortunately, German does not
show boundary tones for intermediate phrases (Féry 1993, 59-79). They can,
however, be detected by other criteria such as pauses, lengthening of the final
syllable and a pitch accent for each intermediate phrase. I argue that the discourse
function of the intermediate phrase is one of the most reliable criteria.
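
The conflict between the timing criterion and the tonal criterion in (27) can be checked mechanically. In the following Python sketch the three boundaries and their pause durations are taken from (27); the unit of the durations is not stated in the example, and the helper names are hypothetical.

```python
# Boundaries in (27): (position, pause duration from the subscripts,
# whether a boundary tone closes an intonational phrase at this point).
boundaries = [
    ("Jahrhundert | lebte", 580, False),
    ("Frankreich] [ein", 300, True),
    ("Mann, | der", 590, False),
]


def strongest_by_pause(items):
    """The boundary that a pause-only criterion would treat as the major one."""
    return max(items, key=lambda item: item[1])


def tonally_marked(items):
    """The boundaries that carry a boundary tone."""
    return [item for item in items if item[2]]


print(strongest_by_pause(boundaries)[0])  # Mann, | der
print(tonally_marked(boundaries)[0][1])   # 300
```

The longest pause falls inside an intonational phrase, while the only tonally marked boundary has the shortest pause, so the two criteria pick out different boundaries.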
There are very short and very long intonational phrases, which means that the
phrases do not depend on length. They rather depend on their appropriateness for
building a coherent discourse. A discourse is coherent if at least the following two
requirements are met: (i) anaphoric relations can be established; (ii) discourse
relations hold between the discourse units, as argued in section 4.4.

4.2 Halliday: information units and information structure

Halliday postulates an independent level for information structure and is the first
one to introduce the term “informational unit”. He is in fact the first who uses the
term information structure and establishes it as an independent concept. His main
preoccupation was to account for the structure of intonation in English. Since phrasing
does not always correspond to syntactic constituent structure, Halliday (1967, 200)
postulates a different structural level as the correlate to phrasing (his “tonality”):

Any text in spoken English is organized into what may be called ‘information units’.
(...) this is not determined (...) by constituent structure. Rather could it be said that the
distribution of information specifies a distinct structure on a different plane. (...)
Information structure is realized phonologically by ‘tonality’, the distribution of the text
into tone groups.

The utterance is divided into different tone groups, which are roughly equivalent to
intermediate phrases. These phrases exhibit an internal structure. Analogously,
Halliday assumes two structural aspects of information structure: the informational
partition of the utterance, and the internal organization of each informational unit.
He calls the former aspect the thematic structure (theme-rheme), and the latter
aspect is treated under the title givenness. The thematic structure organizes the linear
ordering of the informational units, which corresponds to the Praguian view of
theme-rheme (or topic-comment, or topic-focus, see section 2.2). The theme refers
to that informational unit that comprises the object the utterance is about, while the
rheme refers to what is said about it. Halliday (1967, 212) assumes that the theme
always precedes the rheme. Thus theme and rheme are closely connected with word
order, theme being used as a name for the first noun group in the sentence, and
rheme for the following: “The theme is what is being talked about, the point of
departure for the clause as a message; and the speaker has within certain limits the
option of selecting any element in the clause as thematic.”
The second aspect refers to the internal structure of an informational unit, where
elements are marked with respect to their discourse anchoring. Halliday (1967, 202)
writes: “At the same time the information unit is the point of origin for further
options regarding the status of its components: for the selection of point of
information focus which indicates what new information is being contributed.”
Halliday calls the center of informativeness of an information unit information
focus. The information focus contains new material that is not already available in
the discourse. The remainder of the intonational unit consists of given material, i.e.
material that is available in the discourse or in the shared knowledge of the discourse
participants. Halliday (1967, 202) illustrates the interaction of the two systems of
organization with the following example (using bold type to indicate information
focus; // to indicate phrasing). Sentence (28a) contrasts with (28b) only in the
placement of the information focus in the second phrase. The phrasing, and thus the
thematic structure, is the same. On the other hand, (28a) contrasts with (28c) in
phrasing, but not in the placement of the information focus. However, since the
information focus is defined with respect to the information unit, the effect of the
information focus is different.
(28)a //Mary//always goes to town on Sundays.//
b //Mary//always goes to town on Sundays.//
c //Mary always goes to //town on Sundays.//

Halliday does not connect the sentence perspective with the discourse perspective,
even though he makes some vague comments on it:
The difference can perhaps be best summarized by the observation that, while ‘given’
means ‘what you were talking about’ (or ‘what I was talking about before’), ‘theme’
means ‘what I am talking about’ (or ‘what I am talking about now’); and, as any student
of rhetoric knows, the two do not necessarily coincide. (Halliday 1967, 212)

The main progress initiated by the work of Halliday is the assumption of an
independent level of information structure. This structure is closely related to the
discourse and assigns the features given or new to the expressions in a sentence.
However, he does not provide a criterion for informational units in terms of
discourse.

4.3 Selkirk: sense units and argument structure

Selkirk (1984) has argued that the intonational phrase (IP) constitutes a domain
relevant to various aspects of the phonetic implementation of the sentence, including
timing effects like constituent-final lengthening. Selkirk (1984, 286) employs the
notion of the sense unit, since she argues that the intonational phrase cannot be defined
by phonetics or by syntax alone but needs additional semantic constraints:
Our position, then – again following Halliday 1967 – is that there are no strictly
syntactic conditions on intonational phrasing. Any apparently syntactic conditions on
where ‘breaks’ in intonational phrasing may occur are, we claim, ultimately to be
attributed to the requirement that the elements of an intonational phrase must make a
certain kind of semantic sense.

Selkirk (1984, 286ff) defines the correlation between intonational phrase and the
sense unit in (29), and in (30) she determines the sense unit as a complex of
constituents that stand either in a modifier-head or argument-head relation:
(29) The Sense Unit Condition on intonational phrasing
The immediate constituents of an intonational phrase must together
form a sense unit.
(30) Two constituents Ci, Cj form a sense unit if (a) or (b) is true of the
semantic interpretation of the sentence:
(a) Ci modifies Cj (a head)
(b) Ci is an argument of Cj (a head)

This can be illustrated with (31). The first intermediate phrase im achtzehnten
Jahrhundert modifies the head lebte in Frankreich, and ein Mann ... is
an argument of the complex predicate im achtzehnten Jahrhundert lebte in
Frankreich.

(31) Head + Argument (licensed by (30b))
       Head: Modifier + Head (licensed by (30a))
         Modifier: [Im achtzehnten Jahrhundert |
         Head: lebte in Frankreich]
       Argument: [ein Mann, ...
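
The Sense Unit Condition in (29) and (30) amounts to a simple check over semantic relations. The Python sketch below lists the two relations assumed for (31) by hand; the names RELATIONS and forms_sense_unit are hypothetical, and the check ignores which of the two constituents is the head.

```python
# Semantic relations assumed for (31): (dependent, head, kind of relation).
RELATIONS = {
    ("Im achtzehnten Jahrhundert", "lebte in Frankreich", "modifier"),
    ("ein Mann", "Im achtzehnten Jahrhundert lebte in Frankreich", "argument"),
}


def forms_sense_unit(ci, cj):
    """(30): one constituent modifies the other or is an argument of it."""
    return any((dep, head) in {(ci, cj), (cj, ci)} for dep, head, _ in RELATIONS)


print(forms_sense_unit("Im achtzehnten Jahrhundert", "lebte in Frankreich"))  # True
print(forms_sense_unit("Im achtzehnten Jahrhundert", "ein Mann"))             # False
```

The second call fails because the temporal modifier and the subject are related only through the predicate, so they could not form an intonational phrase on their own under (29).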

Selkirk herself (1984, 295f) notes that the Sense Unit Condition is very closely
related to argument structure, so it does not cover cases where material is preposed
or occurs in nonrestrictive modifiers such as nonrestrictive relative clauses. The latter are a
typical instance of backgrounding, which expresses a discourse relation rather than
an argument-head relation, as illustrated by (32):
(32) [und sagte dem öffnenden Pater Terrier,] [einem etwa fünfzigjährigen
| kahlköpfigen, | leicht nach Essig riechenden Mönch:] [“Da!” ]
‘... and the minute they were opened by Father Terrier, a bald
monk of about fifty, with a faint odour of vinegar about him, she
said “There!”’

While the background information about Father Terrier is “embedded” into an
independent intonational phrase, this phrase itself is divided into three intermediate
phrases that each give one characteristic property of the person. Thus, it is not the
argument structure that triggers the intonational phrasing, but rather the discourse
relation of backgrounding.
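
The bracket notation used for phrasing in (32) can be read off mechanically, which makes the two levels of phrasing explicit. The following Python sketch assumes the conventions of the examples in this chapter (square brackets for intonational phrases, | for intermediate phrase boundaries); the function name parse_phrasing is hypothetical.

```python
import re


def parse_phrasing(annotated):
    """Split '[a | b] [c]' into intonational phrases (outer list),
    each given as a list of intermediate phrases (inner list)."""
    phrases = re.findall(r"\[([^\]]*)\]", annotated)
    return [[part.strip() for part in ip.split("|")] for ip in phrases]


example_32 = (
    "[und sagte dem öffnenden Pater Terrier,] "
    "[einem etwa fünfzigjährigen | kahlköpfigen, | "
    "leicht nach Essig riechenden Mönch:] [“Da!”]"
)

units = parse_phrasing(example_32)
print(len(units))     # 3 intonational phrases
print(len(units[1]))  # 3 intermediate phrases in the backgrounded phrase
```

Each inner element is a candidate discourse unit; the three elements of the second intonational phrase correspond to the three properties attributed to Father Terrier.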

4.4 Intonational phrasing and discourse units

The discussion in the last two sections has shown that intonational phrasing is
partly determined by informational units. However, neither Halliday's concept of
informational unit nor Selkirk's definition of sense unit succeeded in covering all
cases. It already became clear that intonational phrasing must be described in terms
of discourse units, which serve as arguments for discourse relations. This can be
illustrated in the discourse tree (32a) for the sentence (32).

(32a) Backgrounding
        [und sagte dem öffnenden Pater Terrier,]
        Enumeration
          [einem etwa fünfzigjährigen |
          kahlköpfigen, |
          leicht nach Essig riechenden Mönch]

We can assign different discourse relations to the discourse units associated with the
intonational phrasing. A discourse unit is defined by its appropriateness to serve as
an argument in a discourse relation, rather than by its content or some other intrinsic
property. This means that we can only define discourse units by defining discourse
relations that operate on them.

5. DISCOURSE UNITS AND DISCOURSE RELATIONS

Discourse relations are generally described in terms of relations between
propositions. Therefore, the arguments for discourse relations are associated with
clauses (or other linguistic phrases that express a proposition). This can be
illustrated with (8), repeated as (33).
(33) “Was ist das?” sagte Terrier und beugte sich über den Korb und
schnupperte daran, denn er vermutete Eßbares. (02-002)
‘“What's that?” asked Terrier, bending down over the basket and
sniffing at it, in the hope that it was something edible.’
(33a) Continuation
        “What's that?” asked Terrier
        Causation
          Conjunction
            bending down over the basket
            sniffing at it
          in the hope that it was something edible.
The relation between the first two sentences can be described by Continuation,
while the relation between the last clauses is Causation. Approaches to discourse
or text structure that use this kind of discourse relation are fairly widespread (e.g.
Mann & Thompson 1987, 1988 for Rhetorical Structure Theory (RST) or Asher
1993, 2004 for segmented DRT).
None of these approaches allow for subclausal discourse units and relations between
them. However, we have seen in the last sections that intonational phrasing often
corresponds to subclausal units. We have also said that discourse units are defined
by the relations they establish. If we assume subclausal discourse units we must also
define discourse relations that hold between them. In the following I discuss five
discourse relations: (i) non-restrictive modification, (ii) backgrounding, (iii)
enumeration, (iv) topicalization, and (v) frame-setting. While the first four are
discussed in the literature, the relation of frame-setting is new.

5.1 Non-restrictive modification

The relative clause in (34) consists of two intermediate phrases which correspond to
der zu den genialsten (Gestalten gehörte) and to (der zu den) abscheulichsten
Gestalten... gehörte. These two modifications are independent of each other, even
though they both modify the same discourse referent x for a man. The point is that
the main character of the book is not only one of the most gifted and abominable
personages, but he is at the same time one of the most gifted personages and one of
the most abominable personages. This is difficult to express in a purely linear way.
However, if we assume two independent discourse representations, we can capture
these two relations.

(34) [ein Mann | der zu den genialsten | und abscheulichsten Gestalten ....
gehörte]
‘a man who was one of the most gifted and abominable personages’

(34a) Main DRS:         [t, u, x | 18th cent(t), France(u), Man(x), live(x,u,t)]
      non-restr. Mod:   [y | y = x, y ∈ most gifted personages]
      non-restr. Mod:   [y | y = x, y ∈ most abominable personages]

5.2 Backgrounding
In the example (35) below, a more general type of backgrounding can be found.
Actually, there are even two levels of backgrounding: First the phrase in contrast to
the names of other gifted abominations and second the actual names. The discourse
relation of backgrounding relates these discourse units directly to the already
established main DRS — there is no need to wait for the interpretation of the actual
sentence. This is informally represented in (35a).
(35) [Er hieß | Jean-Baptiste Grenouille,] [und wenn sein Name]
His name was Jean-Baptiste Grenouille, and if his name –
[im Gegensatz zu den Namen | anderer genialer Scheusale,]

in contrast to the names of other gifted abominations,
[wie etwa de Sades, | Saint-Justs, | Fouchés, | Bonapartes | undsoweiter,]
de Sade’s, for instance, or Saint-Just’s, Fouché’s, Bonaparte’s etc. –
[heute in Vergessenheit geraten ist,]
has been forgotten today,

(35a) Main DRS:          [t, u, x, y, z | 18th cent(t), France(u), Man(x), live(x,u,t),
                          y = x, z = J.B. Grenouille, name(y,z)]
      Backgrounded DRS:  [l, m | name of l(m), l = x, ?]
                         in contrast to the names of other gifted abominations
      Backgrounded DRS:  [a, b, c, d | de Sade(a), Saint-Just(b), Fouché(c), Bonaparte(d)]

5.3 Enumeration
A classical case of independent units is enumeration, which is here illustrated by
(36). The intonational phrasing suggests that the discourse structure is constructed
via independent representations for each predicate NP with goat's milk, with pap,
and with beet juice, as given in (36a).
(36) [Jetzt könnt Ihr ihn selber weiterfüttern]
‘Now can you him yourselves feed
[mit Ziegenmilch, | mit Brei, | mit Rübensaft.]
with goat's milk, with pap, with beet juice.’

(36a) Main DRS:                     [x, y | feed(x,y,z), y = a, x = you]
      Independent representations:  [goat's milk(z)]
                                    [pap(z)]
                                    [beet juice(z)]

Once we have an independent representation of each of the conjuncts, we can
compare them and establish additional relations of gradation between them. This
works particularly well for the following example (02-121), where we can compare
the different representations according to a scale of intimacy.

(37a) [x, y | nurse(x), dozen-babies(y)]
      fed(x,y) < tended(x,y) < cradled(x,y) < kissed(x,y)
      (more intimate activity →)

5.4 Topicalization
Topicalization or thematization is one of the central concepts of the functional
sentence perspective of the Prague School, which was later adapted by Halliday and
others (see section 4.2). Steedman's analysis of the thematic structure of a sentence
focuses exactly on this aspect (see section 2.2 for discussion). The fragment (38)
(02-126) illustrates this. The theme-rheme or the topic-comment establishes a
functor-argument structure on a sentence that is independent from the grammatical
relations. Since this issue is repeatedly discussed, I will continue to the next
subclausal discourse relation.
(38) [also an den Füßen zum Beispiel | da riechen sie wie ein glatter | warmer | Stein]
‘Their feet for instance, they smell like a smooth warm stone
[wie frische Butter riechen sie.] [Und am Körper] [riechen sie wie... ]
They smell like fresh butter. And their bodies smell like...’

5.5 Frame-setting
The discourse relation of “frame-setting” is illustrated by the first sentence of the
second chapter (14), repeated as (39). The phrase einige Wochen später cannot be
the topic, since the topic is the introduced person or the thing the sentence is about.
However, it stands in its own phrase. I therefore assume the discourse relation of
frame-setting. The phrase “sets the frame” for what is to come. Here it shifts
the reference time. The phrase can be integrated into the already established
discourse before the rest of the sentence is interpreted, as illustrated in (39a) (see
Maienborn 2003 for a related concept with the same name):

(39) [Einige Wochen später] [stand die Amme | Jeanne Bussie] ...(02-001)
‘Few weeks later stood the wet nurse Jeanne Bussie...’

(39a) Main DRS:  [x, y, z, t1, ..., t2 | t2 = few weeks later than t1, wet nurse(x), ...]
      New DRS:   [u | stood(u), u = x, the wet nurse Jeanne Bussie(u)]
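
The incremental integration described for (39a) can be mimicked with a toy model of DRS merging. This is a minimal sketch under simplifying assumptions: a DRS is reduced to a set of discourse referents plus a list of condition strings, `merge` is the familiar union-style merge of DRT rather than any specific SDRT update rule, and all names (`DRS`, `merge`, the condition strings) are illustrative only.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class DRS:
    """Toy DRS: a universe of referents and a list of conditions."""
    referents: Set[str] = field(default_factory=set)
    conditions: List[str] = field(default_factory=list)


def merge(context: DRS, update: DRS) -> DRS:
    """Merge an update into the context: union of the referents,
    concatenation of the conditions (context first)."""
    return DRS(context.referents | update.referents,
               context.conditions + update.conditions)


# Discourse established so far, heavily abbreviated (assumed content).
context = DRS({"x", "t1"}, ["wet nurse(x)"])

# Step 1: the frame-setting phrase [Einige Wochen später] is merged
# on its own and introduces the shifted reference time t2.
frame = DRS({"t2"}, ["t2 = few weeks later than t1"])
context = merge(context, frame)

# Step 2: only then is the rest of the sentence interpreted.
rest = DRS({"u"}, ["stood(u)", "u = x",
                   "the wet nurse Jeanne Bussie(u)"])
context = merge(context, rest)

print(sorted(context.referents))
print(context.conditions)
```

The order of the two `merge` calls is the point of the sketch: the condition on t2 is already part of the context when the conditions contributed by the rest of the sentence are added.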

6. SUMMARY
The analysis presented here associates intonational phrasing with discourse units. I have
proposed an extension of Asher's SDRT with smaller discourse representations and
new relations between subclausal discourse representations. This allows us to assign
discourse functions to intonational phrases, including phrases that do not correspond
to entire clauses. Many more discourse relations must be defined, and I am
convinced they can be defined in terms of discourse construction rules.
Universität Stuttgart

7. NOTES
*
The paper is a revised version of a talk given at the Topic/Focus Workshop at UC Santa Barbara,
July 2001, and at the Linguistic Circle at the University of Edinburgh, October 2002. I would like to
thank the audiences for their comments. In particular I would like to thank Jennifer Fitzpatrick, Carlos
Gussenhoven, Bob Ladd, Aditi Lahiri, and Mark Steedman for discussion of earlier versions of this paper,
and Daniel Büring, Matthew Gordon, and Chungmin Lee for editing this volume and for the very helpful
and constructive review of this paper. The research was supported by a Heisenberg Fellowship of the
German Science Foundation and by a research grant (HE 2259/9-2).
1
An intonational phrase boundary always coincides with an intermediate phrase boundary; we therefore
shorten “[|...|...|]” to “[...|...]”. Even though English and many other languages mark their intermediate
phrases by boundary tones, it is highly controversial whether German provides evidence for boundary tones
at intermediate phrases (Féry 1993, 59-79).
2
A short summary of the novel: “In the slums of 18th-century Paris a baby is born and abandoned, passed
over to monks as a charity case. But the monks can find no one to care for the child—he is too
demanding, and he doesn't smell the way a baby should smell. In fact, he has no scent at all.
Jean-Baptiste Grenouille clings to life with an iron will, growing into a dark and sinister young man
who, although he has no scent of his own, possesses an incomparable sense of smell. Never having
known human kindness, Grenouille lives only to decipher the odors around him, the complex swirl of
smells—ashes and leather, rancid cheese and fresh-baked bread—that is Paris. He apprentices himself to
a perfumer, and quickly masters the ancient art of mixing flowers, herbs, and oils. Then one day he
catches a faint whiff of something so exquisite he is determined to capture it. Obsessed, Grenouille
follows the scent until he locates its source—a beautiful young virgin on the brink of womanhood. As his
demented quest to create the “ultimate perfume” leads him to murder, we are caught up in a rising storm
of terror until his final triumph explodes in all of its horrifying consequences.” (Short description of the
English translation of the novel, Süskind 1987)

8. REFERENCES
Asher, Nicholas. Reference to Abstract Objects in Discourse. Dordrecht: Kluwer Academic, 1993.
Asher, Nicholas. “From Discourse Macro-Structure to Micro-Structure and Back Again: Discourse
Semantics and the Focus/Background Distinction.” In H. Kamp, and B. Partee (eds.), Context
Dependence in the Analysis of Linguistic Meaning. Amsterdam: Elsevier, 2004.
Braunschweiler, Norbert, Jennifer Fitzpatrick and Aditi Lahiri. The Konstanz Intonation Database:
German, Swiss German, American English, British English, East Bengali, West Bengali. University
of Konstanz, 1988ff.
Büring, Daniel. The 59th Street Bridge Accent. On the Meaning of Topic and Focus. London: Routledge,
1997.
Büring, Daniel. “On D-Trees, Beans, and B-Accents.” Linguistics and Philosophy 26 (2003): 511-545.
Féry, Caroline. German Intonational Patterns. Tübingen: Niemeyer, 1993.
Gussenhoven, Carlos. On the Grammar and Semantics of Sentence Accents. Dordrecht: Foris, 1984.
Halliday, Michael. “Notes on transitivity and theme in English. Part 1 and 2.” Journal of Linguistics 3
(1967): 37-81, 199-244.
Hayes, Bruce, and Aditi Lahiri. “Bengali intonational phonology.” Natural Language and Linguistic
Theory 9 (1991): 47-96.
Heim, Irene. The Semantics of Definite and Indefinite Noun Phrases. University of Massachusetts,
Amherst. Ann Arbor: University Microfilms, 1982.
Hobbs, Jerry. “The Pierrehumbert-Hirschberg Theory of Intonational Meaning Made Simple. Comments
on Pierrehumbert and Hirschberg.” In P. R. Cohen, J. Morgan, and M. E. Pollack (eds.), Intentions in
Communication, 313-323. Cambridge, Mass.: MIT, 1990.
Kamp, Hans. “A theory of truth and semantic interpretation.” In J. Groenendijk, T. Janssen, and M.
Stokhof (eds.), Formal Methods in the Study of Language, pp. 277-322. Amsterdam: Amsterdam
Center, 1981.
Kamp, Hans, and Uwe Reyle. From Discourse to Logic. Introduction to Modeltheoretic Semantics of
Natural Language, Formal Logic and Discourse Representation Theory. Dordrecht: Kluwer, 1993.
Klein, Wolfgang and Christiane von Stutterheim. “Quaestio und referentielle Bewegung in Erzählungen.”
Linguistische Berichte 109 (1987): 163-183.
Ladd, Robert. Intonational Phonology. Cambridge: Cambridge University Press, 1996.
Maienborn, Claudia. Die logische Form von Kopula-Sätzen. Berlin: Akademie Verlag, 2003.
Mann, William, and Sandra Thompson. “Rhetorical Structure Theory: Description and Construction of
Text Structures.” In G. Kempen (ed.), Natural Language Generation. New Results in Artificial
Intelligence, Psychology, and Linguistics, 85-95. Dordrecht: Nijhoff, 1987.
Mann, William, and Sandra Thompson. “Rhetorical Structure Theory: Towards a Functional Theory of
Text Organisation.” Text 8.3 (1988): 243-281.
Nespor, Marina, and Irene Vogel. Prosodic Phonology. Dordrecht: Foris, 1986.
Pierrehumbert, Janet. The Phonology and Phonetics of English Intonation. Ph.D. Dissertation.
Cambridge, Mass.: MIT, 1980.
Pierrehumbert, Janet, and Julia Hirschberg. “The Meaning of Intonational Contours in the Interpretation
of Discourse.” In P. R. Cohen, J. Morgan, and M. E. Pollack (eds.), Intentions in Communication,
pp. 271-311. Cambridge, Mass.: MIT, 1990.
Roberts, Craige. “Information Structure in Discourse. Towards an Integrated Formal Theory of
Pragmatics.” In J.-H. Yoon, and A. Kathol (eds.), Ohio State University [=OSU] Working Papers in
Linguistics. vol. 49, 91-136. Columbus, Ohio, 1996.
Selkirk, Elisabeth. Phonology and Syntax. The Relation between Sound and Structure. Cambridge, Mass.:
MIT, 1984.
Selkirk, Elisabeth. “Sentence Prosody: Intonation, Stress, and Phrasing.” In J. Goldsmith (ed.), The
Handbook of Phonological Theory, pp. 550-569. Oxford: Blackwell, 1995.
Sgall, Petr, Eva Hajičová, and Eva Benešová. Topic, Focus and Generative Semantics.
Kronberg/Taunus: Scriptor, 1973.
Steedman, Mark. “Structure and Intonation.” Language 67 (1991): 260-296.
Steedman, Mark. The Syntactic Process. Cambridge, Mass.: MIT, 2000.
Süskind, Patrick. Das Parfum. Die Geschichte eines Mörders. Zürich: Diogenes, 1985.
Süskind, Patrick. Das Parfum. Die Geschichte eines Mörders. Gelesen von Gert Westphal. Hamburg:
Litraton, 1995.
Süskind, Patrick. Perfume: The Story of a Murderer. Translated from the German by John E. Woods.
New York: Vintage Books, 2001.
Van Kuppevelt, Jan. “Discourse Structure, Topicality and Questioning.” Linguistics 31 (1995): 109-147.
Von Heusinger, Klaus. “Focus particles, sentence meaning, and discourse structure.” In W. Abraham, and
A. ter Meulen (eds.), The Composition of Meaning. From Lexeme to Discourse, 167-193. Amsterdam:
Benjamins.
Studies in Linguistics and Philosophy

1. H. Hiż (ed.): Questions. 1978 ISBN 90-277-0813-4; Pb: 90-277-1035-X

2. W. S. Cooper: Foundations of Logico-Linguistics. A Unified Theory of Information, Language,
and Logic. 1978 ISBN 90-277-0864-9; Pb: 90-277-0876-2
3. A. Margalit (ed.): Meaning and Use. 1979 ISBN 90-277-0888-6
4. F. Guenthner and S.J. Schmidt (eds.): Formal Semantics and Pragmatics for Natural Languages.
1979 ISBN 90-277-0778-2; Pb: 90-277-0930-0
5. E. Saarinen (ed.): Game-Theoretical Semantics. Essays on Semantics by Hintikka, Carlson,
Peacocke, Rantala, and Saarinen. 1979 ISBN 90-277-0918-1
6. F.J. Pelletier (ed.): Mass Terms: Some Philosophical Problems. 1979
ISBN 90-277-0931-9
7. D. R. Dowty: Word Meaning and Montague Grammar. The Semantics of Verbs and Times in
Generative Semantics and in Montague’s PTQ. 1979 ISBN 90-277-1008-2; Pb: 90-277-1009-0
8. A. F. Freed: The Semantics of English Aspectual Complementation. 1979
ISBN 90-277-1010-4; Pb: 90-277-1011-2
9. J. McCloskey: Transformational Syntax and Model Theoretic Semantics. A Case Study in
Modern Irish. 1979 ISBN 90-277-1025-2; Pb: 90-277-1026-0
10. J. R. Searle, F. Kiefer and M. Bierwisch (eds.): Speech Act Theory and Pragmatics. 1980
ISBN 90-277-1043-0; Pb: 90-277-1045-7
11. D. R. Dowty, R. E. Wall and S. Peters: Introduction to Montague Semantics. 1981; 5th printing
1987 ISBN 90-277-1141-0; Pb: 90-277-1142-9
12. F. Heny (ed.): Ambiguities in Intensional Contexts. 1981
ISBN 90-277-1167-4; Pb: 90-277-1168-2
13. W. Klein and W. Levelt (eds.): Crossing the Boundaries in Linguistics. Studies Presented to
Manfred Bierwisch. 1981 ISBN 90-277-1259-X
14. Z. S. Harris: Papers on Syntax. Edited by H. Hiż. 1981
ISBN 90-277-1266-0; Pb: 90-277-1267-0
15. P. Jacobson and G. K. Pullum (eds.): The Nature of Syntactic Representation. 1982
ISBN 90-277-1289-1; Pb: 90-277-1290-5
16. S. Peters and E. Saarinen (eds.): Processes, Beliefs, and Questions. Essays on Formal Semantics
of Natural Language and Natural Language Processing. 1982 ISBN 90-277-1314-6
17. L. Carlson: Dialogue Games. An Approach to Discourse Analysis. 1983; 2nd printing 1985
ISBN 90-277-1455-X; Pb: 90-277-1951-9
18. L. Vaina and J. Hintikka (eds.): Cognitive Constraints on Communication. Representation and
Processes. 1984; 2nd printing 1985 ISBN 90-277-1456-8; Pb: 90-277-1949-7
19. F. Heny and B. Richards (eds.): Linguistic Categories: Auxiliaries and Related Puzzles. Volume
I: Categories. 1983 ISBN 90-277-1478-9
20. F. Heny and B. Richards (eds.): Linguistic Categories: Auxiliaries and Related Puzzles. Volume
II: The Scope, Order, and Distribution of English Auxiliary Verbs. 1983 ISBN 90-277-1479-7
21. R. Cooper: Quantification and Syntactic Theory. 1983 ISBN 90-277-1484-3
22. J. Hintikka (in collaboration with J. Kulas): The Game of Language. Studies in Game-
Theoretical Semantics and Its Applications. 1983; 2nd printing 1985
ISBN 90-277-1687-0; Pb: 90-277-1950-0
23. E. L. Keenan and L. M. Faltz: Boolean Semantics for Natural Language. 1985
ISBN 90-277-1768-0; Pb: 90-277-1842-3
24. V. Raskin: Semantic Mechanisms of Humor. 1985 ISBN 90-277-1821-0; Pb: 90-277-1891-1

Volumes 1–26 formerly published under the Series Title: Synthese Language Library.
25. G. T. Stump: The Semantic Variability of Absolute Constructions. 1985

ISBN 90-277-1895-4; Pb: 90-277-1896-2
26. J. Hintikka and J. Kulas: Anaphora and Definite Descriptions. Two Applications of Game-
Theoretical Semantics. 1985 ISBN 90-277-2055-X; Pb: 90-277-2056-8
27. E. Engdahl: Constituent Questions. The Syntax and Semantics of Questions with Special
Reference to Swedish. 1986 ISBN 90-277-1954-3; Pb: 90-277-1955-1
28. M. J. Cresswell: Adverbial Modification. Interval Semantics and Its Rivals. 1985
ISBN 90-277-2059-2; Pb: 90-277-2060-6
29. J. van Benthem: Essays in Logical Semantics 1986 ISBN 90-277-2091-6; Pb: 90-277-2092-4
30. B. H. Partee, A. ter Meulen and R. E. Wall: Mathematical Methods in Linguistics. 1990;
Corrected second printing of the first edition 1993 ISBN 90-277-2244-7; Pb: 90-277-2245-5
31. P. Gärdenfors (ed.): Generalized Quantifiers. Linguistic and Logical Approaches. 1987
ISBN 1-55608-017-4
32. R. T. Oehrle, E. Bach and D. Wheeler (eds.): Categorial Grammars and Natural Language
Structures. 1988 ISBN 1-55608-030-1; Pb: 1-55608-031-X
33. W. J. Savitch, E. Bach, W. Marsh and G. Safran-Naveh (eds.): The Formal Complexity of
Natural Language. 1987 ISBN 1-55608-046-8; Pb: 1-55608-047-6
34. J. E. Fenstad, P.-K. Halvorsen, T. Langholm and J. van Benthem: Situations, Language and
Logic. 1987 ISBN 1-55608-048-4; Pb: 1-55608-049-2
35. U. Reyle and C. Rohrer (eds.): Natural Language Parsing and Linguistic Theories. 1988
ISBN 1-55608-055-7; Pb: 1-55608-056-5
36. M. J. Cresswell: Semantical Essays. Possible Worlds and Their Rivals. 1988
ISBN 1-55608-061-1
37. T. Nishigauchi: Quantification in the Theory of Grammar. 1990
ISBN 0-7923-0643-0; Pb: 0-7923-0644-9
38. G. Chierchia, B.H. Partee and R. Turner (eds.): Properties, Types and Meaning. Volume I:
Foundational Issues. 1989 ISBN 1-55608-067-0; Pb: 1-55608-068-9
39. G. Chierchia, B.H. Partee and R. Turner (eds.): Properties, Types and Meaning. Volume II:
Semantic Issues. 1989 ISBN 1-55608-069-7; Pb: 1-55608-070-0
Set ISBN (Vol. I + II) 1-55608-088-3; Pb: 1-55608-089-1
40. C.T.J. Huang and R. May (eds.): Logical Structure and Linguistic Structure. Cross-Linguistic
Perspectives. 1991 ISBN 0-7923-0914-6; Pb: 0-7923-1636-3
41. M.J. Cresswell: Entities and Indices. 1990 ISBN 0-7923-0966-9; Pb: 0-7923-0967-7
42. H. Kamp and U. Reyle: From Discourse to Logic. Introduction to Modeltheoretic Semantics
of Natural Language, Formal Logic and Discourse Representation Theory. 1993
ISBN 0-7923-2403-X; Student edition: 0-7923-1028-4
43. C.S. Smith: The Parameter of Aspect. (Second Edition). 1997
ISBN 0-7923-4657-2; Pb 0-7923-4659-9
44. R.C. Berwick (ed.): Principle-Based Parsing. Computation and Psycholinguistics. 1991
ISBN 0-7923-1173-6; Pb: 0-7923-1637-1
45. F. Landman: Structures for Semantics. 1991 ISBN 0-7923-1239-2; Pb: 0-7923-1240-6
46. M. Siderits: Indian Philosophy of Language. 1991 ISBN 0-7923-1262-7
47. C. Jones: Purpose Clauses. 1991 ISBN 0-7923-1400-X
48. R.K. Larson, S. Iatridou, U. Lahiri and J. Higginbotham (eds.): Control and Grammar. 1992
ISBN 0-7923-1692-4
49. J. Pustejovsky (ed.): Semantics and the Lexicon. 1993
ISBN 0-7923-1963-X; Pb: 0-7923-2386-6
50. N. Asher: Reference to Abstract Objects in Discourse. 1993 ISBN 0-7923-2242-8

51. A. Zucchi: The Language of Propositions and Events. Issues in the Syntax and the Semantics
of Nominalization. 1993 ISBN 0-7923-2437-4
52. C.L. Tenny: Aspectual Roles and the Syntax-Semantics Interface. 1994
ISBN 0-7923-2863-9; Pb: 0-7923-2907-4
53. W.G. Lycan: Modality and Meaning. 1994 ISBN 0-7923-3006-4; Pb: 0-7923-3007-2
54. E. Bach, E. Jelinek, A. Kratzer and B.H. Partee (eds.): Quantification in Natural Languages.
1995 ISBN Vol. I: 0-7923-3128-1; Vol. II: 0-7923-3351-9; set: 0-7923-3352-7;
Student edition: 0-7923-3129-X
55. P. Lasersohn: Plurality, Conjunction and Events. 1995 ISBN 0-7923-3238-5
56. M. Pinkal: Logic and Lexicon. The Semantics of the Indefinite. 1995 ISBN 0-7923-3387-X
57. P. Øhrstrøm and P.F.V. Hasle: Temporal Logic. From Ancient Ideas to Artificial Intelligence.
1995 ISBN 0-7923-3586-4
58. T. Ogihara: Tense, Attitudes, and Scope. 1996 ISBN 0-7923-3801-4
59. I. Comorovski: Interrogative Phrases and the Syntax-Semantics Interface. 1996
ISBN 0-7923-3804-9
60. M.J. Cresswell: Semantic Indexicality. 1996 ISBN 0-7923-3914-2
61. R. Schwarzschild: Pluralities. 1996 ISBN 0-7923-4007-8
62. V. Dayal: Locality in WH Quantification. Questions and Relative Clauses in Hindi. 1996
ISBN 0-7923-4099-X
63. P. Merlo: Parsing with Principles and Classes of Information. 1996 ISBN 0-7923-4103-1
64. J. Ross: The Semantics of Media. 1997 ISBN 0-7923-4389-1
65. A. Szabolcsi (ed.): Ways of Scope Taking. 1997 ISBN 0-7923-4446-4; Pb: 0-7923-4451-0
66. P.L. Peterson: Fact Proposition Event. 1997 ISBN 0-7923-4568-1
67. G. Păun: Marcus Contextual Grammars. 1997 ISBN 0-7923-4783-8
68. T. Gunji and K. Hasida (eds.): Topics in Constraint-Based Grammar of Japanese. 1998
ISBN 0-7923-4836-2
69. F. Hamm and E. Hinrichs (eds.): Plurality and Quantification. 1998 ISBN 0-7923-4841-9
70. S. Rothstein (ed.): Events and Grammar. 1998 ISBN 0-7923-4940-7
71. E. Hajičová, B.H. Partee and P. Sgall: Topic-Focus Articulation, Tripartite Structures, and
Semantic Content. 1998 ISBN 0-7923-5289-0
72. K. von Heusinger and U. Egli (Eds.): Reference and Anaphoric Relations. 1999
ISBN 0-7923-6070-2
73. H. Bunt and R. Muskens (eds.): Computing Meaning. Volume 1. 2000
ISBN 0-7923-6108-3; Pb: ISBN 1-4020-0290-4
74. S. Rothstein (ed.): Predicates and their Subjects. 2000 ISBN 0-7923-6409-0
75. K. Kabakčiev: Aspect in English. A "Common-Sense" View of the Interplay between Verbal
and Nominal Referents. 2000 ISBN 0-7923-6538-0
76. F. Landman: Events and Plurality. The Jerusalem Lectures. 2000
ISBN 0-7923-6568-2; Pb: 0-7923-6569-0
77. H. Bunt, R. Muskens and E. Thijsse: Computing Meaning. Volume 2. 2001
ISBN 0-7923-0175-4; Pb: 1-4020-0451-6
78. R. Musan: The German Perfect. Its Semantic Composition and Its Interactions with Temporal
Adverbials. 2002 ISBN 1-4020-0719-1
79. G. Grewendorf and G. Meggle (eds.): Speech Acts, Mind, and Social Reality. Discussions with
R. Searle. 2002 ISBN 1-4020-0853-8; Pb: 1-4020-0861-9
80. G.-J.M. Kruijff and R.T. Oehrle (eds.): Resource-Sensitivity, Binding and Anaphora. 2003
ISBN 1-4020-1691-3; Pb: 1-4020-1692-1
81. R. Elugardo and R.J. Stainton (eds.): Ellipsis and Nonsentential Speech. 2005
ISBN 1-4020-2299-9; Pb: 1-4020-2300-6
82. C. Lee, M. Gordon and D. Büring (eds.): Topic and Focus: Cross-linguistic Perspectives on
Meaning and Intonation. 2006 ISBN 1-4020-4795-9