The document discusses 'The Auditory System at the Cocktail Party,' a comprehensive volume that explores how the auditory system processes complex auditory scenes, such as those found in social settings. It includes contributions from various experts on topics like auditory object formation, masking in speech recognition, and the effects of age and hearing impairments on auditory processing. The volume aims to provide insights into the neural mechanisms underlying auditory scene analysis and the challenges faced by different age groups and individuals with hearing loss.
The Auditory System at the Cocktail Party



John C. Middlebrooks, Jonathan Z. Simon, Arthur N. Popper, Richard R. Fay
Editors

The Auditory System at the Cocktail Party
With 41 Illustrations

Editors

John C. Middlebrooks
Department of Otolaryngology, Department of Neurobiology & Behavior,
Department of Cognitive Sciences, Department of Biomedical Engineering,
Center for Hearing Research
University of California
Irvine, CA
USA

Arthur N. Popper
Department of Biology
University of Maryland
College Park, MD
USA

Richard R. Fay
Loyola University of Chicago
Chicago, IL
USA

Jonathan Z. Simon
Department of Electrical & Computer Engineering, Department of Biology,
Institute for Systems Research
University of Maryland
College Park, MD
USA

ISSN 0947-2657 ISSN 2197-1897 (electronic)


Springer Handbook of Auditory Research
ISBN 978-3-319-51660-8 ISBN 978-3-319-51662-2 (eBook)
DOI 10.1007/978-3-319-51662-2
Library of Congress Control Number: 2017930799

© Springer International Publishing AG 2017


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made. The publisher remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by Springer Nature


The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
The Acoustical Society of America

On 27 December 1928 a group of scientists and engineers met at Bell Telephone
Laboratories in New York City to discuss organizing a society dedicated to the field
of acoustics. Plans developed rapidly and the Acoustical Society of America
(ASA) held its first meeting 10–11 May 1929 with a charter membership of about
450. Today ASA has a world-wide membership of 7,000.
The scope of this new society incorporated a broad range of technical areas that
continues to be reflected in ASA’s present day endeavors. Today, ASA serves the
interests of its members and the acoustics community in all branches of acoustics,
both theoretical and applied. To achieve this goal, ASA has established technical
committees charged with keeping abreast of the developments and needs of
membership in specialized fields as well as identifying new ones as they develop.
The Technical Committees include: acoustical oceanography, animal bioacous-
tics, architectural acoustics, biomedical acoustics, engineering acoustics, musical
acoustics, noise, physical acoustics, psychological and physiological acoustics,
signal processing in acoustics, speech communication, structural acoustics and
vibration, and underwater acoustics. This diversity is one of the Society’s unique and
strongest assets since it so strongly fosters and encourages cross-disciplinary learn-
ing, collaboration, and interactions.
ASA publications and meetings incorporate the diversity of these Technical
Committees. In particular, publications play a major role in the Society. The
Journal of the Acoustical Society of America (JASA) includes contributed papers
and patent reviews. JASA Express Letters (JASA-EL) and Proceedings of Meetings
on Acoustics (POMA) are online, open-access publications, offering rapid publi-
cation. Acoustics Today, published quarterly, is a popular open-access magazine.
Other key features of ASA’s publishing program include books, reprints of classic
acoustics texts, and videos.
ASA’s biannual meetings offer opportunities for attendees to share information,
with strong support throughout the career continuum, from students to retirees.
Meetings incorporate many opportunities for professional and social interactions
and attendees find the personal contacts a rewarding experience. These experiences
result in building a robust network of fellow scientists and engineers, many of
whom become lifelong friends and colleagues.
From the Society’s inception, members recognized the importance of developing
acoustical standards with a focus on terminology, measurement procedures, and
criteria for determining the effects of noise and vibration. The ASA Standard Program
serves as the Secretariat for four American National Standards Institute Committees
and provides administrative support for several international standards committees.
Throughout its history and to the present day, ASA’s strength resides in attracting the interest
and commitment of scholars devoted to promoting the knowledge and practical
applications of acoustics. The unselfish activity of these individuals in the development
of the Society is largely responsible for ASA’s growth and present stature.

Series Preface

The following preface is the one that we published in Volume 1 of the Springer
Handbook of Auditory Research back in 1992. As anyone reading the original
preface, or the many users of the series, will note, we have far exceeded our original
expectation of eight volumes. Indeed, with books published to date and those in the
pipeline, we are now set for over 60 volumes in SHAR, and we are still open to new
and exciting ideas for additional books.
We are very proud that there seems to be consensus, at least among our friends
and colleagues, that SHAR has become an important and influential part of the
auditory literature. While we have worked hard to develop and maintain the quality
and value of SHAR, the real value of the books is very much because of the
numerous authors who have given their time to write outstanding chapters and to
our many coeditors who have provided the intellectual leadership to the individual
volumes. We have worked with a remarkable and wonderful group of people, many
of whom have become great personal friends of both of us. We also continue to
work with a spectacular group of editors at Springer. Indeed, several of our past
editors have moved on in the publishing world to become senior executives. To our
delight, this includes the current president of Springer US, Dr. William Curtis.
But the truth is that the series would and could not be possible without the support
of our families, and we want to take this opportunity to dedicate all of the SHAR
books, past and future, to them. Our wives, Catherine Fay and Helen Popper, and our
children, Michelle Popper Levit, Melissa Popper Levinsohn, Christian Fay, and
Amanda Fay Seirra, have been immensely patient as we developed and worked on
this series. We thank them and state, without doubt, that this series could not have
happened without them. We also dedicate the future of SHAR to our next generation
of (potential) auditory researchers—our grandchildren—Ethan and Sophie
Levinsohn, Emma Levit, and Nathaniel, Evan, and Stella Fay.


Preface 1992

The Springer Handbook of Auditory Research presents a series of comprehensive
and synthetic reviews of the fundamental topics in modern auditory research. The
volumes are aimed at all individuals with interests in hearing research including
advanced graduate students, postdoctoral researchers, and clinical investigators.
The volumes are intended to introduce new investigators to important aspects of
hearing science and to help established investigators to better understand the fun-
damental theories and data in fields of hearing that they may not normally follow
closely.
Each volume presents a particular topic comprehensively, and each serves as a
synthetic overview and guide to the literature. As such, the chapters present neither
exhaustive data reviews nor original research that has not yet appeared in
peer-reviewed journals. The volumes focus on topics that have developed a solid
data and conceptual foundation rather than on those for which a literature is only
beginning to develop. New research areas will be covered on a timely basis in the
series as they begin to mature.
Each volume in the series consists of a few substantial chapters on a particular
topic. In some cases, the topics will be ones of traditional interest for which there is
a substantial body of data and theory, such as auditory neuroanatomy (Vol. 1) and
neurophysiology (Vol. 2). Other volumes in the series deal with topics that have
begun to mature more recently, such as development, plasticity, and computational
models of neural processing. In many cases, the series editors are joined by a
co-editor having special expertise in the topic of the volume.
Arthur N. Popper, College Park, MD, USA
Richard R. Fay, Chicago, IL, USA

SHAR logo by Mark B. Weinberg, Bethesda, Maryland, used with permission.


Volume Preface

The cocktail party is the archetype of a complex auditory scene: multiple voices
compete for attention; glasses clink; background music plays. Other situations of
daily life, including busy offices, crowded restaurants, noisy classrooms, and
congested city streets, are no less acoustically complex. The normal auditory sys-
tem exhibits a remarkable ability to parse these complex scenes. Even relatively
minor hearing impairment, however, can disrupt this auditory scene analysis.
This volume grew out of the Presidential Symposium, “Ears and Brains at the
Cocktail Party,” at the Midwinter Meeting of the Association for Research in
Otolaryngology, held in 2013 in Baltimore, Maryland. In this volume, the authors
describe both the conditions in which the auditory system excels at segregating
signals of interest from distractors and the conditions in which the problem is
insoluble, all the time attempting to understand the neural mechanisms that underlie
both the successes and the failures. In Chap. 1, Middlebrooks and Simon introduce
the volume and provide an overview of the cocktail party problem, putting it into
the perspective of broader issues in auditory neuroscience. In Chap. 2,
Shinn-Cunningham, Best, and Lee further set the stage by elaborating on the key
concept of an auditory object, which can be thought of as the perceptual correlate of
an external auditory source and the unit on which target selection and attention
operate. In Chap. 3, Culling and Stone address the challenges of low-level sepa-
ration of signal from noise and consider the mechanisms by which those challenges
may be overcome. They introduce the distinction between energetic and informa-
tional masking. Next, in Chap. 4, Kidd and Colburn develop the concept of
informational masking by focusing on speech-on-speech masking.
Computational models can aid in formalizing the basic science understanding of
a problem as well as in generating algorithms that exploit biological principles for
use in solution of practical engineering problems. In Chap. 5, Elhilali considers the
challenges of creating useful computational models of the cocktail party problem.
Then, in Chap. 6, Middlebrooks considers the importance of spatial separation of
sound sources for stream segregation and reviews the psychophysics and physio-
logical substrates of spatial stream segregation. Next, in Chap. 7, Simon reviews
new developments in the field of experimental human auditory neuroscience.


A cocktail party is no place for infants and children. The auditory scene,
however, is easily as acoustically complex on a noisy playground or in a crowded
classroom. Young people apprehend these scenes with immature auditory systems
and not-yet-crystallized language recognition. Werner, in Chap. 8, considers mul-
tiple stages and levels of development. Next, in Chap. 9, Pichora-Fuller, Alain, and
Schneider consider older adults in whom maturity of language skills and stores of
knowledge can to some degree compensate for senescence of the peripheral and
central auditory systems. Finally, in Chap. 10, Litovsky, Goupell, Misurelli, and
Kan consider the consequences of hearing impairment and the ways in which
hearing can be at least partially restored.
Successful communication at the eponymous cocktail party as well as in other,
everyday, complex auditory scenes demands all the resources of the auditory sys-
tem, from basic coding mechanisms in the periphery to high-order integrative
processes. The chapters of this volume are intended to be a resource for exploration
of these resources at all levels: in normal mature hearing, in early development, in
aging, and in pathology.
John C. Middlebrooks, Irvine, CA, USA
Jonathan Z. Simon, College Park, MD, USA
Arthur N. Popper, College Park, MD, USA
Richard R. Fay, Chicago, IL, USA
Contents

1 Ear and Brain Mechanisms for Parsing the Auditory Scene
John C. Middlebrooks and Jonathan Z. Simon
2 Auditory Object Formation and Selection
Barbara Shinn-Cunningham, Virginia Best, and Adrian K.C. Lee
3 Energetic Masking and Masking Release
John F. Culling and Michael A. Stone
4 Informational Masking in Speech Recognition
Gerald Kidd Jr. and H. Steven Colburn
5 Modeling the Cocktail Party Problem
Mounya Elhilali
6 Spatial Stream Segregation
John C. Middlebrooks
7 Human Auditory Neuroscience and the Cocktail Party Problem
Jonathan Z. Simon
8 Infants and Children at the Cocktail Party
Lynne Werner
9 Older Adults at the Cocktail Party
M. Kathleen Pichora-Fuller, Claude Alain, and Bruce A. Schneider
10 Hearing with Cochlear Implants and Hearing Aids in Complex Auditory Scenes
Ruth Y. Litovsky, Matthew J. Goupell, Sara M. Misurelli, and Alan Kan
Contributors

Claude Alain Department of Psychology, The Rotman Research Institute,
University of Toronto, Toronto, ON, Canada
Virginia Best Department of Speech, Language and Hearing Sciences, Boston
University, Boston, MA, USA
H. Steven Colburn Department of Biomedical Engineering, Hearing Research
Center, Boston University, Boston, MA, USA
John F. Culling School of Psychology, Cardiff University, Cardiff, UK
Mounya Elhilali Laboratory for Computational Audio Perception, Center for
Speech and Language Processing, Department of Electrical and Computer
Engineering, The Johns Hopkins University, Baltimore, MD, USA
Matthew J. Goupell Department of Hearing and Speech Sciences, University of
Maryland, College Park, MD, USA
Alan Kan Waisman Center, University of Wisconsin–Madison, Madison, WI,
USA
Gerald Kidd Jr. Department of Speech, Language and Hearing Sciences, Hearing
Research Center, Boston University, Boston, MA, USA
Adrian K.C. Lee Department of Speech and Hearing Sciences, Institute for
Learning and Brain Sciences (I-LABS), University of Washington, Seattle, WA,
USA
Ruth Y. Litovsky Waisman Center, University of Wisconsin–Madison, Madison,
WI, USA
John C. Middlebrooks Department of Otolaryngology, Department of
Neurobiology & Behavior, Department of Cognitive Sciences, Department of
Biomedical Engineering, Center for Hearing Research, University of California,
Irvine, CA, USA

Sara M. Misurelli Department of Communication Sciences and Disorders,
University of Wisconsin–Madison, Madison, WI, USA
M. Kathleen Pichora-Fuller Department of Psychology, University of Toronto,
Mississauga, ON, Canada
Bruce A. Schneider Department of Psychology, University of Toronto,
Mississauga, ON, Canada
Barbara Shinn-Cunningham Center for Research in Sensory Communication
and Emerging Neural Technology, Boston University, Boston, MA, USA
Jonathan Z. Simon Department of Electrical & Computer Engineering,
Department of Biology, Institute for Systems Research, University of Maryland,
College Park, MD, USA
Michael A. Stone Manchester Centre for Audiology and Deafness, School of
Health Sciences, University of Manchester, Manchester, UK
Lynne Werner Department of Speech and Hearing Sciences, University of
Washington, Washington, USA
Chapter 1
Ear and Brain Mechanisms for Parsing
the Auditory Scene

John C. Middlebrooks and Jonathan Z. Simon

Abstract The cocktail party is a popular metaphor for the complex auditory scene
that is everyday life. In busy offices, crowded restaurants, and noisy streets, a
listener is challenged to hear out signals of interest—most often speech from a
particular talker—amid a cacophony of competing talkers, broadband machine
noise, room reflections, and so forth. This chapter defines the problems that the
auditory system must solve and introduces the ensuing chapters, which explore the
relevant perception and physiology at all levels: in normal mature hearing, in early
development, in aging, and in pathology.

 
Keywords Auditory object · Auditory scene analysis · Cocktail party problem ·
Energetic masking · Grouping · Informational masking · Stream segregation ·
Streaming

J.C. Middlebrooks (✉)
Department of Otolaryngology, Department of Neurobiology & Behavior,
Department of Cognitive Sciences, Department of Biomedical Engineering,
Center for Hearing Research, University of California, Irvine, CA 92697-5310, USA
e-mail: [email protected]
J.Z. Simon
Department of Electrical & Computer Engineering, Department of Biology,
Institute for Systems Research, University of Maryland, College Park,
MD 20742, USA
e-mail: [email protected]

© Springer International Publishing AG 2017
J.C. Middlebrooks et al. (eds.), The Auditory System
at the Cocktail Party, Springer Handbook of Auditory Research 60,
DOI 10.1007/978-3-319-51662-2_1

1.1 Introduction

The cocktail party is the archetype of a complex auditory scene: multiple voices vie
for attention; glasses clink; background music plays; all of which are shaken, not
stirred, by room reflections. Colin Cherry (1953) brought hearing science to the
cocktail party when he introduced the term “cocktail party problem.” Cherry’s
cocktail party was rather dry: just two talkers reading narratives at the same time,
either with one talker in each of two earphones or with the two talkers mixed and played
to both earphones. Real-life cocktail parties are far more acoustically complex, as
are other auditory situations of daily life, such as busy offices, crowded restaurants,
noisy classrooms, and congested city streets. Albert Bregman (1990) has referred to
people’s efforts to solve these everyday cocktail party problems as “auditory scene
analysis.”
The normal auditory system exhibits a remarkable ability to parse these complex
scenes. As pointed out by Shinn-Cunningham, Best, and Lee (Chap. 2), the best
efforts of present-day technology pale compared to the ability of even a toddler to
hear out a special voice amid a crowd of distractors. Conversely, even a relatively
minor hearing impairment can disrupt auditory scene analysis. People with mild to
moderate hearing loss report that their inability to segregate multiple talkers or to
understand speech in a noisy background is one of their greatest disabilities
(Gatehouse and Noble 2004).

1.2 Some Central Concepts

In attempting to make sense of the auditory scene, a listener must form distinct
perceptual images—auditory objects—of one or more sound sources, where the
sound sources might be individual talkers, musical lines, mechanical objects, and so
forth. Formation of an auditory object requires grouping of the multiple sound
components that belong to a particular source and segregation of those components
from those of other sources. Grouping can happen instantaneously across fre-
quencies, such as grouping of all the harmonics of a vowel sound or of all the
sounds resulting from the release of a stop consonant. Grouping must also happen
across time, such as in the formation of perceptual streams from the sequences of
sounds from a particular source. In the cocktail party example, the relevant streams
might be the sentences formed by the successions of phonemes originating from the
various competing talkers. To a large degree, segregation of auditory objects takes
place on the basis of low-level differences in sounds, such as fundamental fre-
quencies, timbres, onset times, or source locations. Other, higher-level, factors for
segregation include linguistic cues, accents, and recognition of familiar voices.
Failure to segregate the components of sound sources can impair formation of
auditory objects: this is masking. When a competing sound coincides in frequency
and time with a signal of interest, the resulting masking is referred to as energetic.
Energetic masking is largely a phenomenon of the auditory periphery, where signal
and masker elicit overlapping patterns of activity on the basilar membrane of the
cochlea and compete for overlapping auditory nerve populations. There is an
extensive literature on the characteristics of energetic masking and on brain
mechanisms that can provide some release from energetic masking.
Another form of masking can occur in situations in which there is no spec-
trotemporal overlap of signal and masker: this is referred to as informational
masking. In cases of informational masking, listeners fail to identify the signal amid
the confusion of masking sounds. The magnitude of informational masking, tens of
decibels in some cases, is surprising inasmuch as the spectral analysis by the
cochlea presumably is doing its normal job of segregating activity from signal and
masker components that differ in frequency. Given the presumed absence of
interference in the cochlea, one assumes that informational masking somehow
arises in the central auditory pathway. Chapters of this volume review the phe-
nomena of informational masking and the possible central mechanisms for release
from informational masking.
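
As a rough illustration of the distinction drawn above, the following Python sketch treats signal and masker as sets of occupied time-frequency cells and reports the fraction of signal cells that the masker also occupies. The grid representation, the function name `spectrotemporal_overlap`, and the example values are assumptions made for illustration only; they are not taken from the chapters of this volume, and real energetic masking depends on cochlear filter shapes and sound levels that this sketch ignores.

```python
def spectrotemporal_overlap(signal_cells, masker_cells):
    """Return the fraction of signal time-frequency cells also occupied by the masker.

    Each cell is a (time_frame, frequency_band) pair (a hypothetical, simplified
    stand-in for a spectrogram); both arguments are iterables of such cells.
    """
    signal = set(signal_cells)
    if not signal:
        raise ValueError("signal must occupy at least one cell")
    return len(signal & set(masker_cells)) / len(signal)


# Hypothetical example: a signal in bands 2 and 3 over four time frames.
signal = [(t, band) for t in range(4) for band in (2, 3)]

# Masker A shares band 3 with the signal, so energetic masking is possible.
masker_a = [(t, band) for t in range(4) for band in (3, 4)]

# Masker B sits in bands 6 and 7, so there is no overlap; any masking it
# produced would be informational under the definition in the text.
masker_b = [(t, band) for t in range(4) for band in (6, 7)]

overlap_a = spectrotemporal_overlap(signal, masker_a)  # half of the signal cells
overlap_b = spectrotemporal_overlap(signal, masker_b)  # no shared cells
```

An overlap of zero does not mean the signal will be heard; it only means that, on this simplified picture, any remaining masking cannot be attributed to spectrotemporal coincidence.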

1.3 Overview of the Volume

The present volume addresses conditions in which the auditory system succeeds at
segregating signals from distractors and conditions in which the cocktail party
problem cannot be solved. Shinn-Cunningham, Best, and Lee (Chap. 2) set the
stage by introducing the notion of the auditory object, which can be thought of as
the perceptual correlate of an external auditory source and the unit on which target
selection and attention operate. Sequences of auditory objects that are extended in
time form auditory streams. Parsing of the auditory scene, then, consists of selection
of particular auditory objects through some combination of bottom-up object sal-
ience and top-down attention, filtered by experience and expectation.
Culling and Stone (Chap. 3) address the challenges of low-level formation of
auditory objects and consider some mechanisms by which those challenges can be
overcome. They introduce the notion of energetic masking, in which interfering
sounds disrupt the representation of speech signals at the level of the auditory
nerve. Release from energetic masking can be achieved by exploiting differences
between target and masker, such as differences in their harmonic structure or
interaural time differences. In some conditions a listener can circumvent energetic
masking by “listening in the dips,” where “the dips” are moments at which masker
amplitude is minimal. In addition, a listener might exploit the acoustic shadow of
the head by attending to the ear at which the target-to-masker ratio is higher.
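
Two of the strategies just described, attending to the ear with the higher target-to-masker ratio and listening in the dips, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names, the sample values, and the fixed dip threshold are hypothetical, and the real head shadow is frequency dependent, which this broadband calculation does not capture.

```python
import math


def level_db(samples):
    """Root-mean-square level of a sequence of samples, in decibels re 1.0.

    Assumes at least one nonzero sample (the logarithm of zero is undefined).
    """
    mean_square = sum(x * x for x in samples) / len(samples)
    return 10.0 * math.log10(mean_square)


def target_to_masker_ratio_db(target, masker):
    """Target-to-masker ratio in dB: target level minus masker level."""
    return level_db(target) - level_db(masker)


def better_ear(tmr_left_db, tmr_right_db):
    """Pick the ear with the higher target-to-masker ratio."""
    return "left" if tmr_left_db >= tmr_right_db else "right"


def dip_frames(masker_envelope, threshold):
    """Indices of frames in which the masker envelope falls below a threshold."""
    return [i for i, amp in enumerate(masker_envelope) if amp < threshold]


# Hypothetical scene: the target reaches both ears equally, while the head
# shadows the masker at the right ear, reducing its amplitude tenfold there.
target = [1.0, -1.0, 1.0, -1.0]
masker_left = [1.0, -1.0, 1.0, -1.0]
masker_right = [0.1, -0.1, 0.1, -0.1]

tmr_left = target_to_masker_ratio_db(target, masker_left)    # about 0 dB
tmr_right = target_to_masker_ratio_db(target, masker_right)  # about 20 dB
chosen = better_ear(tmr_left, tmr_right)

# Frames where a fluctuating masker is momentarily weak ("the dips").
dips = dip_frames([0.9, 0.2, 0.8, 0.1, 0.7], threshold=0.3)
```

In this toy scene the right ear is favored and the dips fall in the second and fourth frames; a listener, unlike the sketch, has no separate access to target and masker and must infer both from the mixture.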
Understanding of a speech target can be impaired by the presence of a competing
speech source even in the absence of energetic masking, that is, when there is no
spectral or temporal overlap of target and masker. That residual informational
masking is the topic of Chap. 4, by Kidd and Colburn. Focusing on
speech-on-speech masking, the authors contrast and compare energetic and
