Multimodality and Cognitive Linguistics

The document is a publication titled 'Multimodality and Cognitive Linguistics' edited by María Jesús Pinar Sanz, featuring 13 papers that explore the intersection of multimodality and cognitive linguistics. It includes contributions on various topics such as multimodal metaphors, social semiotics, and multimodal interactional analysis. The introduction summarizes the main approaches and objectives of the papers included in the volume.

Copyright 2015. John Benjamins Publishing Company. All rights reserved. May not be reproduced in any form without permission from the publisher, except fair uses permitted under U.S. or applicable copyright law.

EBSCOhost - printed on 2/10/2023 2:18 AM via . All use subject to https://www.ebsco.com/terms-of-use


Benjamins Current Topics
issn 1874-0081

Special issues of established journals tend to circulate within the orbit of the
subscribers of those journals. For the Benjamins Current Topics series a number
of special issues of various journals have been selected containing salient topics of
research with the aim of finding new audiences for topically interesting material,
bringing such material to a wider readership in book format.
For an overview of all books published in this series, please see
http://benjamins.com/catalog/bct

Volume 78
Multimodality and Cognitive Linguistics
Edited by María Jesús Pinar Sanz
These materials were previously published in Review of Cognitive Linguistics 11:2
(2013).

Multimodality
and Cognitive Linguistics

Edited by

María Jesús Pinar Sanz


University of Castilla-La Mancha

John Benjamins Publishing Company


Amsterdam / Philadelphia

The paper used in this publication meets the minimum requirements of
the American National Standard for Information Sciences – Permanence
of Paper for Printed Library Materials, ANSI Z39.48-1984.

doi 10.1075/bct.78
Cataloging-in-Publication Data available from Library of Congress:
lccn 2015021640 (print) / 2015028826 (e-book)
isbn 978 90 272 4266 2 (Hb)
isbn 978 90 272 6801 3 (e-book)

© 2015 – John Benjamins B.V.


No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any
other means, without written permission from the publisher.
John Benjamins Publishing Co. · https://benjamins.com

Table of contents

About the contributors

Multimodality and Cognitive Linguistics: Introduction
María Jesús Pinar Sanz

Part I. Cognitive Linguistics and multimodal metaphor

Cross-modal resonances in creative multimodal metaphors: Breaking out of conceptual prisons
Elisabeth El Refaie

Metaphor and symbol: Searching for one’s identity is looking for a home in animation film
Charles Forceville

Woven emotions: Visual representations of emotions in medieval English textiles
Javier E. Díaz Vera

Approaching the utopia of a global brand: The relevance of image schemas as multimodal resources for the branding industry
Lorena Pérez Hernández

Multimodal metaphors in political entertainment
Diana E. Popa

Part II. Multimodality, Cognitive and Systemic Functional Linguistics

The visual representation of metaphor: A social semiotic approach
Dezheng Feng and Kay L. O’Halloran

Visual metonymy in children’s picture books
A. Jesús Moya Guijarro

The establishment of interpretative expectations in film
John A. Bateman and Chiaoi Tseng

Multimodal digital storytelling: Integrating information, emotion and social cognition
Isabel Alonso, Silvia Molina and María Dolores Porto

Part III. Cognitive Linguistics and multimodal interaction

Intermedial cognitive semiotics: Some examples of multimodal cueing in virtual environments
Asunción López-Varela

Multimodality in conversational humor
Salvatore Attardo, Lucy Pickering, Fofo Lomotey and Shigehito Menjo

Image schemas and mimetic schemas in cognitive linguistics and gesture studies
Alan Cienki

Index

About the contributors

María Jesús Pinar Sanz is a Lecturer in Linguistics and Discourse Analysis at the
University of Castilla-La Mancha (Spain). Her research interests are in multimod-
al discourse analysis and, more specifically, in aspects related to the analysis of
election campaigns, political advertising and ethnic humour. She has published
several articles on the generic structure of political ads, ethnic humour and the
relationship between the verbal and visual elements not only in political texts but
also in children’s narratives.

Elisabeth El Refaie is a Senior Lecturer in Language and Communication at
Cardiff University. The focus of her research is on new literacies and
visual/multimodal forms of metaphor, narrative, and humour. Her work has appeared in
several edited volumes and scholarly journals, including Visual Communication,
Visual Studies, and Studies in Comics. Her research monograph, Autobiographical
comics: Life writing in pictures, was published by the University Press of
Mississippi in 2012.

Charles Forceville is associate professor in the Media Studies department at the
University of Amsterdam (http://home.medewerker.uva.nl/c.j.forceville/). He
authored Pictorial metaphor in advertising (Routledge 1996) and co-edited, with
Eduardo Urios-Aparisi, Multimodal metaphor (Mouton de Gruyter 2009). The
volume Creativity and the agile mind, co-edited with Tony Veale and Kurt Feyaerts,
appeared in 2003, also with Mouton de Gruyter. Committed to cognitive,
socio-biological, and relevance-theoretical approaches, his work is expanding from
multimodal metaphor to multimodal rhetoric and narrative more generally. Genres
and media he finds pertinent include animation, comics, documentary, fiction
film, and advertising. For more information: http://muldisc.wordpress.com/.

Javier E. Díaz Vera is a Lecturer of English and Linguistics in the Department of
Modern Languages of the University of Castilla-La Mancha (Spain). His research
interests focus on historical sociolinguistics and language change in the history
of English, with special attention to diachronic metaphor and the expression of
emotions in different diachronic and dialectal varieties of English.

Lorena Pérez Hernández, PhD, has worked as an Associate Professor at the
University of La Rioja (Spain) since 2001. She is an Associate Editor of The Metaphor
and Metonymy Bibliography (John Benjamins), and a member of the editorial
board of Journal of English Studies. Her research has been published in
international journals such as Metaphor and Symbol, Journal of Pragmatics, Applied
Linguistics, and Language and Communication. Since 1996, she has collaborated as a
linguistics consultant with the marketing company Lexicon Branding, Co. (San
Francisco, USA).

Diana Elena Popa is Associate Professor in the Department of English,
Dunarea de Jos University of Galati, Romania. Her research interests are primarily in
pragmatic, cognitive, and sociolinguistic mechanisms of humour. She has
recently co-edited a book with Villy Tsakona titled Studies in political humour (John
Benjamins, 2011). She is also one of the editors for The European Journal of
Humour Research.

Dezheng Feng, PhD, is Research Assistant Professor in the Department of
English, The Hong Kong Polytechnic University. His research interests include the
critical analysis of multimodal discourse, social semiotic theory and cognitive
linguistics. His recent publications include “Representing emotion in visual images:
A social semiotic approach” in Journal of Pragmatics and “Intertextual voices and
engagement in TV advertisements” in the journal Visual Communication.

Kay O’Halloran is Associate Professor in the School of Education at Curtin
University, Australia. Her main research areas include a social semiotic approach to
multimodal discourse analysis with a particular interest in mathematics and
scientific texts, and the development of interactive digital media technologies for
multimodal analysis of (multimedia) data. Further information is available at
http://multimodal-analysis-lab.org/.

A. Jesús Moya Guijarro is Professor of Language and Linguistics at the Faculty
of Education, University of Castilla-La Mancha, Spain. He does research in
discourse and text analysis. He has published several articles on information,
thematicity and multimodal discourses in international journals such as Word, Text,
Functions of Language and Journal of Pragmatics. He has co-edited The world told
and the world shown: Multisemiotic issues. His research interests are also in
Children’s Literature and Applied Linguistics.

John Bateman is a full professor of applied linguistics at the University of Bremen
and has been applying mechanisms of discourse interpretation to film for several
years. He obtained his PhD in Artificial Intelligence from the University of
Edinburgh in 1986 and has worked in various areas of multimodal computational and
functional linguistics since the early 1990s. He is currently head of the doctoral
training research group on the ‘Textuality of Film’ at the University of Bremen, as
well as several third-party funded projects on the application of linguistic
methods to filmic analysis.

Chiaoi Tseng is an assistant researcher at the Faculty of Linguistics and
Literary Science, Bremen University. Her research interests include film analysis,
multimodal discourse and genre. She completed her dissertation entitled
Cohesion in film, and the construction of filmic thematic configurations: A functional
perspective in 2009. Dr Tseng currently works within a project exploring the
development of automatic support for high-level narrative analysis of films using
image-processing techniques.

Isabel Alonso Belmonte is Associate Professor at the Universidad Autónoma de
Madrid. Her research concerns media discourse analysis and pragmatics, areas in
which she has extensively published.

Silvia Molina is now Associate Professor at the Universidad Politécnica de
Madrid. Her research interests are discourse analysis, pragmatics and lexis.

María Dolores Porto is a lecturer at the Universidad de Alcalá, in Madrid. Her
research is mostly related to the processes of interpretation of discourse, whether
literary, technical, academic or spontaneous.

López-Varela’s research interests include socio-semiotics, intermedial studies
and comparative literature and cultural studies. A member of the Executive
Committee of the European Network of Comparative Literary Studies ENCLS and
the Harvard Institute of World Literatures IWL, she coordinates the research
program Studies on Intermediality and Intercultural Mediation SIIM at Universidad
Complutense Madrid, and serves on the boards of journals such as Cultura.
International Journal of Philosophy of Culture and Axiology, Comparative
Literature and Culture CLCWeb, the Cypriot Journal of Educational Sciences CJES,
HyperCultura Journal, the International Journal of the Humanities, and the Southern
Semiotic Review. López-Varela was a visiting scholar at Brown University in
2010 and Harvard University in 2013. More information at: <http://www.ucm.es/sim/>.

Shigehito Menjo is a doctoral student in Literature and Languages at Texas A&M
University-Commerce, focusing on Applied Linguistics. He received his MA in
Japanese Language and Pedagogy and BA in Linguistics from the University of
Oregon. His research interests include prosody in humor and the cross-linguistic
analysis of prosody acquisition in second language, especially the acquisition of
timing and intonation in discourse.

Lucy Pickering is Associate Professor of Applied Linguistics and Director of the
Applied Linguistics Laboratory at Texas A&M-Commerce. She received her PhD
from the University of Florida in 1999. Her research interests include prosody and
humor, cross-linguistic transfer of prosodic features and discourse intonation.

Charlotte Fofo Lomotey is a PhD student and a Research Assistant in the Applied
Linguistics Laboratory at Texas A&M University-Commerce. She received her
MPhil. (Applied Linguistics) at the University of Education, Winneba, in Ghana.
Prior to coming to Commerce, she taught Linguistics courses at the UEW. Her
research interests include dialectal differences, ELF/ESL, Language Documenta-
tion, Discourse and Prosody.

Salvatore Attardo holds a PhD in English Linguistics from Purdue University and
is Professor of Linguistics and Dean of the College of Humanities, Social Sciences,
and Arts at Texas A&M University-Commerce. His research is focused primarily
on humor studies and pragmatics.

Alan Cienki is Associate Professor in the Department of Language and
Communication at the Vrije Universiteit (VU) in Amsterdam, Netherlands. He
co-edited the volumes Conceptual and discourse factors in linguistic structure (2001)
and Metaphor and gesture (2008), and is currently working on a monograph on
gesture and cognitive linguistics. He is Associate Editor of the journal Cognitive
Linguistics, Chair of the international Association for Researching and Applying
Metaphor (RaAM), and Director of the Amsterdam Gesture Center.

Multimodality and Cognitive Linguistics
Introduction

María Jesús Pinar Sanz


University of Castilla-La Mancha

This volume includes 13 papers dealing with Multimodality and Cognitive Lin-
guistics. The introduction provides an overview of three of the main approaches
dealing with multimodality – Cognitive Linguistics and multimodal metaphors
(Forceville & Urios-Aparisi, 2009), social semiotics and systemic functional lin-
guistics, and multimodal interactional analysis (Jewitt, 2009, p. 29). The paper
summarizes the contributions to the volume, highlighting the main objectives
and conclusions of each of the papers.

Keywords: Multimodality, Cognitive Linguistics, multimodal metaphor,
systemic functional grammar, multimodal interactional analysis

1. Introduction

The turn of the millennium has brought an increasing interest in multimodality,
i.e. the relationship between different semiotic modes in human communication
and their ‘textual’ instantiation. The analysis of multimodal discourse involves
looking into the kind of information provided by different modes – image,
gesture, gaze, posture, and so on – and their interplay. The starting assumption is
that the overall effect is more than the sum of the parts since communication
is achieved through all modes interacting both separately and simultaneously
(Kress & van Leeuwen, 2001). At present, a growing number of studies deal
with different aspects of multimodal analysis; however, there seems to be a gap
when it comes to relating this increasing interest in multimodality to
Cognitive Linguistics. In this regard, the aim of this volume is to advance our
theoretical and empirical understanding of the relationship between Multimodality and
Cognitive Linguistics.

doi 10.1075/bct.78.01pin
© 2015 John Benjamins Publishing Company

Among the current scholars from different persuasions dealing with
multimodality, we find Gibbons (2011), who provides a set of critical tools for
analyzing the cognitive impact of multimodal literature. Bateman (2011) presents an
approach to analyzing page-based documents that combine text, graphics and
pictures in different layouts. In turn, Jewitt (2009) surveys a variety of theoretical
approaches which have looked at multimodal communication and
representation, including visual studies, anthropology and socio-linguistics, among other
disciplines. The papers in Ventola and Moya (2009) discuss the relationship
between the discourses that “tell” and visuals that “show”. Jones and Ventola (2008)
explore the ways in which multimodality influences the work of linguists,
linguistic description and application. O’Halloran (2011) proposes a distinct
multimodal studies field as both the mapping of a domain of enquiry, and as the site
of the development of theories, descriptions and methodologies specific to and
adapted for the study of multimodality. Corpus Linguistics and multimodality
are addressed in Knight (2013), who looks at possible directions in the
construction and use of multimodal corpus linguistics. Along similar lines, Sindoni (2013)
reconsiders underlying linguistic and semiotic frameworks of analysis of spoken and
written discourse, in keeping with a multimodal corpus linguistics theoretical
framework. Finally, within Cognitive Linguistics, the papers compiled in Forceville and
Urios-Aparisi (2009) discuss metaphors drawing on combinations of visuals,
language, gestures, sound, and music.
The innovative nature of this volume in comparison to those existing in
the field lies in the fact that it brings together contributions from three of the
main approaches dealing with multimodality – Cognitive Linguistics and
multimodal metaphors (Forceville & Urios-Aparisi, 2009), social semiotics and
systemic functional grammar, and multimodal interactional analysis (Jewitt, 2009,
p. 29) – highlighting the importance of multimodal resources, and showing the
close relationship between this field of study and Cognitive Linguistics applied to
a variety of genres, including comics, films, cartoons and visuals in tapestry,
to name a few.
The present volume is structured in three parts. The first one is rooted in
Cognitive Linguistics and focuses on non-verbal and multimodal metaphor – for
a state-of-the-art panorama, see the papers in Forceville & Urios-Aparisi (2009).
The second part follows Hallidayan Systemic Functional Linguistics in a double
perspective: Social semiotic multimodality and multimodal discourse analysis –
for detailed discussion see Jewitt (2009); and the third part draws upon Norris’
model of multimodal interaction (Jewitt, 2009; Norris, 2004, 2011).

2. Cognitive Linguistics and multimodal metaphors

According to Forceville (2010, p. 59), analyzing multimodal metaphor and
metonymy is a productive way to gain insight into multimodal discourse, since in
their prototypical manifestations target and source occur in different modalities.
Thus, the first part of the volume deals with multimodal metaphor research
applied to genres such as ‘alternative’ comics, films, automobile brands, political
entertainment or images in tapestry. The aim is to shed some light on the
meanings and realizations of those conceptual metaphors which are not solely
instantiated in linguistic form. The papers in this first section lend support to Forceville and
Urios-Aparisi’s (2009, p. 5) claim that a healthy theory of metaphor must study
non-verbal and multimodal metaphor. The authors explore various genres, and
all the papers start from the assumption that the socio-cultural dimension
must be taken into account when looking into the creation and interpretation
of multimodal metaphor and metonymy.
In “Cross-modal resonances in creative multimodal metaphors: Breaking out
of conceptual prisons”, El Refaie provides examples from three different genres –
an autobiographical comic, a television commercial and a political cartoon – to
develop a new understanding of the nature of creativity in metaphor. El Refaie’s
claim is that multimodality provides opportunities for metaphor creativity “by
exploiting the unique affordances of the different semiotic modes and the
possibility of combining them in unexpected ways”. The paper takes a critical
view of mainstream Lakoffian Conceptual Metaphor Theory (henceforth CMT)
(Lakoff & Johnson, 1980) and its customary neglect of metaphors based on
novel connections between different areas of experience. At the same time, it
offers an original answer to the question of how to theorize striking instances of
metaphor creativity while remaining committed to a view of metaphor as an
essential aspect of common, everyday thought patterns. The author suggests that
multimodality increases the opportunity for creativity at the level of representation,
encouraging novel thought patterns, even in cases where the metaphorical mappings are
relatively conventional. The notion of “cross-modal resonances” is introduced to
emphasize the role of the unconscious, preverbal, intuitive understanding and
the emotions in producing and interpreting creative multimodal metaphors. The
examples used illustrate and develop the central arguments of the paper.
“Metaphor and symbol: Searching for one’s identity is looking for
a home in animation film” explores the conceptual metaphor SEARCHING FOR
ONE’S IDENTITY IS LOOKING FOR A HOME in a number of animation films. This is
relevant as CMT has rarely been applied to film. Forceville claims that
investigating the animation medium has various advantages, including the fact
that short animations seldom use language – which helps counter criticisms that
CMT is ultimately a language-based theory. The author intends to aid CMT and
multimodality scholarship by examining the concept HOME in a variety of
instantiations of the metaphor PURPOSIVE ACTIVITY IS MOVEMENT TOWARD A
DESTINATION in animation films. His contribution shows that (a) analysing the metaphor
under study presupposes understanding “home” as a symbol; (b) animation has
medium-specific affordances to implement the metaphor; (c) the metaphor
combines embodied and cultural dimensions.
In “Woven emotions: Visual representations of emotions in medieval English
textiles”, Díaz Vera explores how the same conceptual metaphors underlie the
expression of Old English emotions in both the language and the visual modes.
The author analyzes the pictorial representations of emotions in the Bayeux
Tapestry, an 11th-century embroidered cloth that narrates and depicts the events that
led up to the Norman Conquest of England and the invasion itself. His analysis
shows that (1) Anglo-Norman artists used a well-organised set of visual stimuli to
convey emotion-related meanings in a patterned way, that (2) the same idealized
conceptual models are shared by verbal and visual modalities and that
(3) whereas verbal expressions of emotions regularly draw on non-embodied behavioural
concepts, visual representations show a clear preference for embodied container
concepts.
In “Approaching the utopia of a global brand: the relevance of image schemas
as multimodal resources for the branding industry”, Pérez Hernández explores
the relevance of image schemas and related multimodal image schematic meta-
phors and metonymies in the branding industry. The author argues that image
schemas represent an efficient cognitive tool for the purpose of creating global
brands, as they have an experiential basis and are largely pervasive across cultures
and languages. In addition, she argues that the universal nature of image schemas
can be maximized through their multimodal expression. Her main claim is that
the multimodal and systematic use of image schemas in the process of brand cre-
ation and the final output may provide branding professionals with an inventory
of sound and ready-to-use multimodal resources for the design of global brands.
Diana E. Popa’s contribution “Multimodal metaphors in political entertain-
ment” attempts to shed light on the issue of multimodal metaphor in political
entertainment, with special attention to the ways in which the verbal, visual and
auditory modalities employed contribute to the construal of the multimodal met-
aphor and the functions of multimodal metaphors in animated political cartoons.
The paper deals with the way entertaining politics relies on multimodal meta-
phors to (a) explain the significance of real life events and characters through the
means of imaginary scenarios, (b) persuade people, (c) propagate a critical stance
towards somebody or something, and (d) provide information about political is-
sues, events and players that no other medium could openly transmit.

3. Multimodality, Cognitive and Systemic Functional Linguistics

The second part of the volume delves into multimodality and its relationship with
Cognitive Linguistics and Systemic Functional Linguistics (henceforth SFL). Al-
though it may be argued that the latter approach “is in many respects too heavily
biased by its roots in linguistics” (Forceville, 2010, p. 59), recent studies show
that there have been significant advances in the development of other modalities
apart from language in Systemic Functional Linguistics (Böck & Pachler, 2013;
Jewitt, 2009; Jones & Ventola, 2008; O’Halloran, 2011; Ventola & Moya, 2009,
among others). The papers in this section deal with the construction of visual
metaphor from a Social Semiotic Approach, the use of semiotic metaphors and
visual metonymies within the framework of Systemic Functional Linguistics, the
cognitive mechanisms involved in the creation and interpretation of multimodal
texts and the way in which different semiotic channels provide different kinds of
information.
The first contribution in this section, “The visual representation of metaphor:
A social semiotic approach”, by Feng and O’Halloran, combines El Refaie’s (2003)
views of visual metaphor as “the pictorial expression of metaphorical thinking”
and Carroll’s (1996) and Forceville’s (1996) definition of visual metaphors in terms
of “their surface realization or formal characteristics”. While Feng and O’Halloran
agree with both definitions, they further ask (a) how metaphors are visually ex-
pressed or realized, (b) what the metaphor resources in visual images are, and
(c) how these resources work to construct metaphor. The authors consider that
visual images do not build spatial relations but are complex metafunctional con-
structs (Halliday & Matthiessen, 2004), integrating representational, interactive
and compositional meanings (Kress & van Leeuwen, 2006). In their approach, the
meta-functional resources are seen as metaphor potential and they explore how
these construct pictorial metaphors. Feng & O’Halloran employ the social semi-
otic theory of intersemiotic relations to explain the complex image-text interac-
tion in visual metaphor, and conclude that (a) social semiotic visual grammar
can provide a comprehensive account of the visual construction of metaphor, and
(b) conceptual metaphor theory lends epistemological status to such a grammar.
While Feng and O’Halloran explore the way visual metaphors are construct-
ed, in “Visual metonymy in children’s picture books”, Moya explores visual me-
tonymies, the other main trope within Cognitive Linguistics alongside metaphor.
After outlining the main features of the concept of visual metonymy, the author
examines the discourse functions of the metonymies and interprets the data in
functional terms. The data make it evident how visual metonymies are useful
strategies to convey representational meaning and create engagement in picture
books. All in all, the aim of the paper is to show how the use of visual metonymies
in picture books contributes to children’s understanding of the stories in them
and, in turn, attracts their attention towards relevant aspects of the plot. The two
picture books selected are intended for children under 9. A multimodal and cog-
nitive perspective is adopted when applying the non-verbal trope of visual me-
tonymy to the picture books under analysis. The results of his analysis show that
the visual metonymies are used in children’s tales with a double function: firstly,
to facilitate the understanding of the story to ‘first time readers’ and, secondly, to
create narrative tension in certain stages of the plot, and, in turn, to establish a
bond between the represented participants and the child-viewer.
Whereas the previous contribution demonstrates how visual metonymies
contribute to the representation of reality in two picture books, “The
establishment of interpretative expectations in film”, by Bateman and Tseng, shows that
some notions from the textual organization of verbal texts also appear to offer
insights into the organization of films. Bateman and Tseng compare the beginnings
of films with the macro-theme, hyper-theme and theme organization discussed
by Martin (1992), which establish a scaffold of expectations that help the text’s
recipient negotiate the complex textual structures being constructed. Bateman and
Tseng demonstrate that film beginnings exhibit differing organizational features
that correlate with the overall narrative strategies pursued in films as a whole.
These features, Bateman and Tseng argue, may then function as “useful
indicators for viewers concerning just what interpretative challenges they will face later
in the text”.
In “Multimodal digital storytelling: integrating information, emotion and so-
cial cognition”, Alonso, Molina and Porto explore how diverse semiotic channels
provide different kinds of information (factual, emotional, cultural, etc.) which are
finally integrated to construct the global meaning of the narrative. They combine
different analytical tools to achieve that goal. In the study of multimodality they
follow Kress and van Leeuwen’s work (2006) for the analysis of images, voice qual-
ity and other issues related to multimodal representation. On the cognitive side,
the authors make use of some notions of Mental Spaces and Conceptual Integration
Theory and apply them to narratives in order to explain how the different modes
can be regarded as providing separate narrative-input spaces which interact both
among themselves and with the social knowledge shared by the participants in
the discourse event to be finally integrated in making sense of the narrative. The
results are of interest for those scholars concerned with the representational and
communicational modes of semiotic resources in making meaning.

4. Cognitive Linguistics and multimodal interaction

The third part focuses on Cognitive Linguistics and multimodal interaction and is
based on the analysis of human interaction. Multimodal interaction is concerned
with what individuals express and how others react or perceive in interaction
(Norris, 2004, p. 4; Norris, 2011). Norris studies the embodied (language, gesture
and gaze) and disembodied modes (music, print and layout) used by people in
interaction. She claims that human interaction cannot be explained if the human
mind is not taken into consideration since “a person always thinks, perceives,
and/or feels something when interacting with others, and at least some of these
thoughts, perceptions, and/or feelings are communicated through a person’s ac-
tions” (Norris, 2004, p. xi). The papers in this section will show that different
modes of communication are structured in different ways. Thus, some of the is-
sues explored are the perception and marking of humour in conversational texts,
human perception and the intersubjective coordinated patterns that move hu-
mans to interaction.
In López-Varela’s “Intermedial Cognitive Semiotics: some examples of multimodal
cueing in virtual environments”, intermediality is studied from cognitive-semiotic
concerns and insights from digital environments. Human perception
is revised as well as the role of shared attention in communication. The paper
explores spatial and temporal cueing (eye-contact and the sonic modality) from
a task-oriented and social interactive dimension that highlights their importance
in intersubjective communication. The importance of multimodal mirror-neuron
mappings, index assignment and pointers in cognition and discourse, and the
role of affective phenomena in engaging intersubjectivity are highlighted. The ex-
amples proposed explore vision and sound in online collaboration and show the
importance of mediating channels on the spatiotemporal axis of perception. This
is relevant as the slightest cues can have significant impact on communicative
situations.
Attardo, Pickering, and Taherzadeh’s paper “Multimodality in conversational
humor” addresses the issue of multimodal markers of humour in conversational
texts. Their paper seeks to determine whether humor is marked in the texts under
analysis and how. In particular, the paper examines the hypothesis put forth in
Attardo, Pickering, and Baker (2011) that the only consistent marker of conver-
sational humor is smiling. A further hypothesis is investigated: the “marking” of
humor consists of a combination of multimodal items such as prosodic, gestural
and facial expressions having the purpose of framing the interaction as humor-
ous, rather than mechanically “marking” the occurrence of humor. This paper fills
a gap in the field, since very little has been written about the prosodic and multimodal
markers of humor, with the exception of the markers of irony. The tentative
conclusion is that the prosodic features investigated and smiling/laughter are not
markers, both because they are not consistently associated with the phenomenon
and because they lack integration.
Finally, in “Image schemas and mimetic schemas in cognitive linguistics and
gesture studies”, Cienki explores how cognitive linguistics and gesture studies ap-
proach the study of schemas. This author states that, rather than simply applying
the theoretical constructs from cognitive linguistics to gestures as data, gesture
research raises new questions for schema research and provides new
insights into the role of schemas in cognition. The distinction between image and
mimetic schemas is looked into as well as the way in which each kind supports
lexical semantic analysis of a different kind. The paper explores to what degree
image schemas provide a useful explanatory tool for researching the concrete,
physically embodied details of gestures. The author considers that the research on
‘mimetic schemas’ has a great potential for thinking about some known phenom-
ena of gesture in a new way, and thus schema research provides a useful means
to analyze behavior in another modality involved in spoken language use, namely
the visual.
The collection of papers in this volume is a step towards developing the relationship
between multimodality in its three main forms – social semiotic analysis
and multimodal discourse analysis, multimodal interactional analysis and mul-
timodal metaphors – and cognitive linguistics. Attention has been focused on
language and images, but also on gestures, posture, gazes or voice quality, among
others, how they interplay and the final effect of the interaction of the different
modes, since they are sometimes integrated in unprecedented ways, enacting new
interactional patterns and new systems of interpretation. The contributors have
discussed the need to integrate multimodality in its various forms with cognitive
linguistics in a variety of genres and situations and have highlighted the impor-
tance of studying contextual, socio-cultural backgrounds in both verbal and non-
verbal manifestations.

References

Attardo, S., Pickering, L., & Baker, A. (2011). Prosodic and multimodal markers of humor in
conversation. Pragmatics and Cognition, 19(2), 224–247. DOI: 10.1075/pc.19.2.03att
Bateman, J. (2011). Multimodality and genre: A foundation for the systematic analysis of multi-
modal documents. Basingstoke: Palgrave.
Böck, M., & Pachler, N. (2013). Multimodality and social semiosis: Communication, meaning
making, and learning in the work of Gunther Kress. London/New York: Routledge.
Carroll, N. (1996). A note on film metaphor. Journal of Pragmatics, 26, 809–822.
DOI: 10.1016/S0378-2166(96)00021-5
Forceville, C. (1996). Pictorial metaphor in advertising. London/New York: Routledge.
DOI: 10.4324/9780203272305
Forceville, C. (2010). Why and how study metaphor, metonymy, and other tropes in multimod-
al discourse? In R. Caballero & M.J. Pinar (Eds.), Ways and modes of human communica-
tion (pp. 57–76). Cuenca: Ediciones de la Universidad de Castilla-La Mancha.
Forceville, C., & Urios-Aparisi, E. (Eds.). (2009). Multimodal metaphor. Berlin/New York:
Mouton de Gruyter. DOI: 10.1515/9783110215366
Gibbons, A. (2011). Multimodality, cognition, and experimental literature. London/New York:
Routledge.
El Refaie, E. (2003). Understanding visual metaphor: The example of newspaper cartoons. Vi-
sual Communication, 2(1), 75–96. DOI: 10.1177/1470357203002001755
Halliday, M.A.K., & Matthiessen, C. (2004). An introduction to functional grammar. London:
Arnold.
Jewitt, C. (Ed.). (2009). The Routledge handbook of multimodal analysis. London: Routledge.
Jones, C., & Ventola, E. (Eds.). (2008). From language to multimodality. London: Equinox.
Knight, D. (2013). Multimodality and active listenership: A corpus approach. London: T. & T.
Clark Ltd.
Kress, G., & van Leeuwen, T. (2001). Multimodal discourse: The modes and media of contempo-
rary communication. London: Arnold.
Kress, G., & van Leeuwen, T. (2006). Reading images: The grammar of visual design. London:
Routledge.
Lakoff, G., & Johnson, M. (1980). Metaphors we live by. Chicago: University of Chicago Press.
Martin, J. (1992). English text: Systems and structures. Amsterdam: John Benjamins.
DOI: 10.1075/z.59
Norris, S. (2004). Analysing multimodal interaction: A methodological framework. London:
Routledge.
Norris, S. (2011). Identity in interaction: Introducing (Inter)action Multimodal Analysis. New
York: Mouton de Gruyter. DOI: 10.1515/9781934078280
O’Halloran, K. (2011). Multimodal representation and knowledge. London: Routledge.
Sindoni, M.G. (2013). Spoken and written discourse in online interactions: A multimodal ap-
proach. London: Routledge.
Ventola, E., & Moya, J. (Eds.). (2009). The world told and the world shown: Multisemiotic issues.
Basingstoke: Palgrave. DOI: 10.1057/9780230245341

Part I

Cognitive Linguistics
and multimodal metaphor

Cross-modal resonances in creative
multimodal metaphors
Breaking out of conceptual prisons

Elisabeth El Refaie
Cardiff University

This article uses examples of multimodal metaphors from three different genres
in order to develop a new understanding of the nature of creativity in metaphor.
I argue that multimodality provides distinctive opportunities for metaphor cre-
ativity by exploiting the unique affordances of the different semiotic modes and
the possibility of combining them in unexpected ways. Such innovation at the
level of representation may encourage novel thought patterns, I suggest, even
in such cases where the underlying metaphorical mappings are relatively con-
ventional. The notion of “cross-modal resonances” is introduced to emphasize
the role of unconscious, preverbal, intuitive understanding and the emotions in
producing and interpreting creative multimodal metaphors.

Keywords: advertisements, cartoons, comics, creativity, multimodal metaphor,
social semiotics

1. Introduction

According to Arthur Koestler’s (1964, p. 38) famous definition, creativity involves
the restructuring of habitual thought patterns through the bisociation of two or
more apparently incompatible frames of reference.1 Creativity is believed to have
the potential to free us from “the concept prisons of old ideas” (de Bono, 1990,
p. 11) and reveal new solutions to long-standing problems. Since metaphor en-
ables us to modify our understanding of two distinct conceptual domains as we
consider one entity in terms of another, many scholars believe that “constructing
and using a metaphor is by its very nature a creative process” (Finke, Ward, &
Smith, 1992, p. 105).

doi 10.1075/bct.78.02elr
© 2015 John Benjamins Publishing Company

For many centuries, academic interest in metaphor focused almost exclusively
on particularly creative metaphorical expressions and their poetic or persuasive
effects. This emphasis on literary or rhetorical metaphor changed in the
late 1970s, when scholars began to notice the ubiquity of metaphor in everyday
discourses and to challenge the notion of an essential difference between the way
we understand literal and non-literal forms of language: “The use of language is
an essentially creative activity, as is its comprehension. To be sure, metaphors and
other figures of speech may sometimes require a little more creativity than literal
language, but the difference is quantitative, not qualitative” (Ortony, 1979, p. 2).
Conceptual metaphor theory (CMT) took these ideas one step further, argu-
ing that metaphor is a perfectly common way of thinking that allows us to un-
derstand abstract areas of experience in terms of more concrete, embodied ones
(Lakoff & Johnson, 1980). Such metaphors are deeply ingrained in our everyday
thought-processes and generate large numbers of ordinary, mundane metaphori-
cal expressions, which are typically processed without much conscious awareness
on the part of their users. Indeed, Lakoff & Turner (1989) go so far as to claim that
even the most illustrious examples of metaphors in poetry are based on the same
entrenched thought patterns that govern our everyday thinking.
More recently, however, the pendulum has started to swing back again, as
scholars within the CMT paradigm have rediscovered the creative dimension of
metaphor in literature, poetry, and political discourse, for example (e.g. Crisp,
2008; Fludernik, 2011; Musolff, 2006; Semino, 2008; Steen, 2008). Analysts of
non-verbal and multimodal genres, in particular, often come across instances of
highly original metaphors that cannot be described easily in terms of entrenched
conceptual mappings between a concrete source and a more abstract target (e.g.
Forceville & Urios-Aparisi, 2009). The question for these scholars is how to the-
orize such striking instances of metaphor creativity, while remaining commit-
ted to a view of metaphor as an essential aspect of common, everyday thought
patterns.
The aim of my article is to provide an original answer to this question, by
combining insights from metaphor theory with ideas developed by social semio-
ticians about the unique properties of the different modes and the ways in which
they can be combined to form new meanings. As I argue in the first section of this
article, we must take seriously the now widely accepted premise that the way we
represent the world does not simply reflect our thinking but may also shape it.
Consequently, the various creative forms that metaphors can take at the expres-
sion level have to be regarded as more than just decorative flourishes. Even the
most conventional conceptual mappings between one area of experience and an-
other may be reinvigorated or completely transformed when they are represented
in a new, original form. Drawing on the notion of semiotic “affordances” (Kress,
2009; Kress & van Leeuwen, 2001), I suggest that multimodality dramatically
increases the opportunity for creativity at the level of representation, by exploiting
the distinct characteristics and meaning potentials of the various modes and
their combinations. The resulting “cross-modal resonances”, I argue, may encour-
age new insight, but this insight is often of a preverbal, emotional, and intuitive
nature, rather than involving logical processes of mapping knowledge from one
conceptual domain to another.
The second section of this article uses several examples to illustrate and de-
velop my central arguments. In keeping with the idea that creativity enables us
to break out of the mental prison imposed by entrenched patterns of thinking,
my examples of the multimodal metaphors used in an autobiographical comic,
a television commercial, and a political cartoon all use the source domain of in-
carceration to re-conceptualize a particular area of human experience. While the
underlying metaphorical mappings of my examples are all rather conventional,
the particular ways in which they are expressed are unique and highly creative.

2. Reassessing the creative potential of metaphors at the expression level

According to CMT, metaphors provide “sets of mappings between a more concrete
or physical source domain and a more abstract target domain” (Kövecses,
2002, p. 67). For example, since we all feel hot as a result of physical exertion or
excitement, metaphors that are rooted in the concept of INTENSITY IS HEAT seem
entirely natural to us (Kövecses, 2005, p. 18). Similarly, we all experience a connection
between our movement through space and the passing of time, which
accounts for metaphors such as LIFE IS A JOURNEY. These conceptual mappings
are thought to generate lots of metaphorical expressions, many of which are en-
tirely unremarkable and which are produced and interpreted intuitively, typically
at the level of unconscious or barely conscious thought processes. This does not
prevent them from impacting upon our thinking and behavior, however. Indeed,
it is precisely the unreflected, apparently commonsensical nature of conventional
metaphors that gives them so much power over the way we understand and act
towards particular areas of our lives (Thibodeau & Boroditsky, 2011).
In what is probably their most direct rebuttal of the traditional belief in the
uniquely creative nature of metaphor, Lakoff & Turner (1989) claim that even lit-
erary metaphors are typically based on the same conventional metaphorical pat-
terns that underlie our ordinary thinking, although poets are often able to find
fresh, idiosyncratic extensions, elaborations or combinations of these metaphors,
thereby guiding us beyond their “automatic and unconscious everyday use”
(Lakoff & Turner, 1989, p. 72). When, for example, Hamlet considers his own
death in terms of sleeping – “To sleep? Perchance to dream!” (III, 1, 65) – William
Shakespeare can be said to be using a conventional mapping but extending it to
include the concept of death as a form of nightmare. Elaboration refers to cases
where existing elements of a conventional metaphor are specified in an unusual
way, for example, when a dangerous but exhilarating period in a person’s life is
described as tobogganing down a sheer mountain face. Novel combinations of
two or more metaphors provide another important means of achieving creativity
in poetry.
According to Lakoff & Turner (1989, pp. 89–96), most examples of metaphor
creativity fall into the three categories described above, but they also discovered
some cases of “one-shot” image metaphors, which work by superimposing one
image onto another in the mind of the reader. The mental imagery that results
from such metaphors can be more or less creative, depending on whether or not
it involves the mapping of visual properties that are normally perceived to be
similar. When, for instance, a woman’s body is likened to an hourglass, the result-
ing mental image is conventional, whereas comparing it to the trunk of a weeping
willow would result in a more original visual mapping.2
Lakoff & Turner’s claims regarding the essentially conventional basis of most
poetic metaphors have been criticized for failing to pay sufficient attention to the
essential role of linguistic and textual choices in people’s perceptions of meta-
phor creativity. One of the main premises of critical approaches towards language
and discourse is that our representations of the world have constitutive effects,
in the sense that they shape the way we attend to and understand social reality
(e.g. Mayr, 2008). If we apply the same argument to the analysis of metaphor, we
need to accept that “how we think about a given topic is altered by the metaphors
we regularly hear and employ, so that our mental representations are not whol-
ly antecedent to and independent of metaphorical talk” (Camp, 2006, p. 160).
Thus, the idea that we can limit our consideration of metaphor creativity to the
level of conceptual mappings becomes untenable. Instead, we must accept that
the particular form of a metaphor may have profound effects upon the ways in
which it is conceived and understood. As Semino (2008) points out, even the
most entrenched conceptual metaphor may be reactivated by a uniquely creative
word or by novel patterns of related expressions that may occur within a particu-
lar passage, throughout a text, or across different texts. Metaphor creativity thus
“needs to be considered both in terms of the novelty or otherwise of underlying
conceptual mappings, and in terms of the salience and originality of individual
metaphorical choices and patterns” (Semino, 2008, p. 54).
The opportunities for creativity are vastly increased in the case of multimodal
metaphors. Because of differences in the potentials and limitations of their mate-
rial properties and the way they have been used over many generations in specific

EBSCOhost - printed on 2/10/2023 2:18 AM via . All use subject to https://www.ebsco.com/terms-of-use


Cross-modal resonances in creative multimodal metaphors 17

cultures and contexts, semiotic modes have developed distinct ways of expressing
similar meanings, as well as displaying a tendency towards specialization (Kress,
2009; Kress & van Leeuwen, 2001). Language is perhaps more suited to the rep-
resentation of actions and causality, for instance, while the spatial organization of
images may “lend itself with greater facility to the representation of elements and
their relation to each other” (Kress, 2000, p. 147). In Western culture, the word-
image distinction has also long been freighted with strong value judgments, with
writing being associated with high culture and learning, and images with popular
culture and illiteracy.
For these reasons, meanings from one mode cannot be translated exactly into
another: “No text has the exact same set of meaning-affordances as any image.
No image or visual representation means in all and only the same ways that some
text can mean” (Lemke, 2002, p. 304). It is this essential “incommensurability” of
semiotic modes that allows genuinely new meanings to be created through their
combination. According to Lemke (1998, p. 92), when several semiotic modes are
brought together, the possible meanings are thus multiplied rather than simply
added together.
The affordances of the different modes and the meanings that can be created
through their combination clearly have a bearing on the creation and interpreta-
tion of metaphors. For example, pictures are able to exploit more effectively than
language the visual similarities that may exist between the size, shape, texture, or
color of two entities. Consequently, the “image metaphors” that Lakoff & Turner
(1989) have identified as the most creative instances of poetic metaphor are not
at all unusual in visual genres.3 Indeed, metaphors of the CONCRETE IS CONCRETE
variety are prevalent in all non-verbal semiotic modes, where perceptual resem-
blance (i.e., looking, sounding, smelling, feeling similar) often provides the cue
for constructing metaphorical meaning (Forceville, 2009, p. 27).
Apart from perceptual resemblance, we can distinguish between at least the
following other forms that non-verbal or multimodal metaphors may take: Fu-
sion/simultaneous cueing refers to cases where the source and the target of a meta-
phor are fused together into one amalgamated whole, or, in a temporal mode such
as sound or moving images, presented at the same moment. Many multimodal
metaphors are also based on some form of incongruity, where an object or living
being appears to be out of place in its context, thereby challenging expectations
of the “real” or “natural”. The term perceptual echo, finally, may be used to refer to
instances of metaphorical meaning that emerges from the representation of one
entity in a way that strongly calls to mind a different entity (terms adapted and
extended from Forceville, 2009).
The multiplication of meanings that can be achieved through the combination
of semiotic modes is also often exploited in the construction and interpretation of

EBSCOhost - printed on 2/10/2023 2:18 AM via . All use subject to https://www.ebsco.com/terms-of-use


18 Elisabeth El Refaie

multimodal metaphors. For instance, one mode may represent the source and tar-
get domains of a metaphor on its own, rendering the other mode(s) wholly or at
least partially redundant. More often, however, the other modes have an augment-
ing or modifying effect, in that they are able to reinforce metaphorical meaning, or
change it by drawing in additional connotations. In complementary relationships
between the semiotic modes, each supplies essential information about either the
source or the target of a metaphor. The many possibilities of evoking metaphori-
cal meanings through the combination of different semiotic modes, each with
their own, unique affordances, thus provide countless opportunities for metaphor
creativity at the level of representation, which, I suggest, may also enable new
connections to be made at the conceptual level.
Rachel Giora’s work (Giora, 1999; Giora et al., 2004) provides a useful way of
theorizing the role of creativity at the expression level of metaphors. In a series
of empirical studies, Giora et al. (2004) discovered that the degree of pleasure
we derive from both verbal and visual metaphors depends on whether they are
“optimally innovative.” Optimal innovation involves the automatic recovery of a
familiar, salient meaning, which is at the forefront of our mind due to frequency,
familiarity, conventionality, and/or prototypicality, while also inviting a non-sa-
lient, qualitatively different interpretation: “It is not a sheer surprise, then, that is
pleasing, but a somewhat novel response assignable to or involving a salient, alto-
gether different response that is gratifying” (Giora et al., 2004, p. 118). This strat-
egy can be considered as a way of “foregrounding” or “estranging” entrenched
salient responses, thereby opening metaphors up to new meaning (Giora et al.,
2004, p. 120). The choice of an unusual form to express a metaphor may thus
be seen as an effective way of foregrounding conceptual mappings that are so
entrenched that they may otherwise be overlooked. This allows the creators of
multimodal metaphors to achieve optimal innovation and encourage increased
mental activity on the part of their audiences.
As mentioned above, the mental activity involved in understanding meta-
phors is described by CMT scholars as a form of “mapping.” This notion, which
is itself a metaphor, suggests a rather mechanistic thought process, whereby each
element of a set is matched up with an element in another set, following certain
logical rules of correspondence. In my view, the intuitive flash of understanding
involved in the interpretation of creative metaphors is captured better through
Max Black’s (1979, p. 20) concept of “resonance,” which, in its most common lit-
eral meaning, refers to the “sound produced by a body vibrating in sympathy
with a neighbouring source of sound” (Collins English Dictionary, p. 1376).4 In
the following discussion I will use the term “multimodal resonances” to describe
the way creative multimodal metaphors are often grasped intuitively and imaginatively,
through a process that involves a sort of sympathetic vibration, both
between the source and target domain and between the distinct semiotic modes
that are used to represent a metaphor.

3. Cross-modal resonances in metaphors of incarceration

I will start my discussion of metaphor creativity with an example of a multimodal
metaphor from an autobiographical comic. In this genre, metaphors often fulfill
the function of enabling readers to draw on their own embodied experience in or-
der to imagine the thoughts and feelings of the protagonists in a story (El Refaie,
2012, p. 207). Persepolis records Marjane Satrapi’s memories of growing up in Iran
during the 1970s and 80s. After a period of exile in Vienna as a teenager, Marjane
returns to Teheran and tries to readapt to life under the repressive Iranian regime.
At the age of twenty-one, she marries her boyfriend and fellow art student, Reza.
The autobiographical narrator recalls the moment she and her husband arrive
back home after the wedding party, describing the “bizarre feeling” she has when
the apartment door closes behind them (Plate 1). The picture shows a head-and-
shoulders view of Marjane looking out at the reader from behind bars, her hands
gripping the bars on either side of her face. In a textbox at the bottom of the panel
the autobiographical narrator explains her feelings: “…I was already sorry! I had
suddenly become ‘a married woman.’ I had conformed to society, while I had
always wanted to remain in the margins” (Satrapi, 2006, p. 319). Since it is clear
that the couple’s apartment is unlikely to be equipped with prison bars (in other
words, that it is incongruous), the reader is encouraged to interpret this sequence
metaphorically.

Plate 1. Marjane Satrapi (2006) Persepolis, p. 319 / © 2002, 2003 L’Association;
translation © 2003 L’Association and 2004 Anjali Singh

At first sight, the metaphor Satrapi employs in this example seems to follow a
highly conventional pattern, since it is not at all unusual for people to understand
the emotional experience of being in an unhappy, constraining relationship in
terms of the more concrete, physical experience of incarceration. On closer in-
spection, however, it turns out to be a lot more creative than it would have been if
the author had expressed the same idea in a purely verbal form.
One way of determining how creative a metaphor is, Hanks (2006) suggests,
is by examining the degree to which the two concepts share semantic properties.
Metaphors that bring together a concrete and an abstract concept, for example,
are always more resonant than those where target and source are closely related,
because the reader must work harder to establish a relevant interpretation. Images
differ from words in that it is simply not possible to represent abstract meaning
visually without recourse to some form of symbolism or metaphor (El Refaie,
2009a, p. 177). In our example (Plate 1) the target domain – the concept of an un-
happy marriage – could not have been expressed literally in pictorial form at all.
The full metaphorical meaning of the drawing only emerges in unison with the
verbal narration, which supplies the target of the metaphor and thus performs a
complementary function. To be precise, the same source domain of incarceration
is first used to represent Marjane’s feelings towards the couple’s shared apartment,
but then, in a second process of mapping, the target of the metaphor shifts onto
the more abstract level of her marriage to an Iranian man and all the cultural
expectations that this entails. So, in the space of just one panel, the conceptual do-
mains that are being brought together in this metaphor have been slightly modi-
fied, moving from a closer to a more distant semantic relation between source and
target, and thereby, according to Hanks’s (2006) theory, acquiring increasing resonance.
This may invoke in the reader a range of associations between the home,
personal relationships, and cultural traditions on the one hand and incarceration
on the other.
Metaphor creativity is also related to meaning change over time. In verbal
metaphors, expressions whose meaning was originally acquired through a figura-
tive process gradually lose their metaphorical meaning as they become lexical-
ized, moving from being “active,” i.e. having no fixed meaning and being able to
generate lots of different resonances, via “inactive”, where metaphorical meaning
may still be switched on in particular contexts, to “dead”, where the original
metaphorical meaning of a word is no longer accessible to the average speaker of a
language (Goatly, 1997, pp. 30–38). A similar mapping between moral constraints
and physical confinement can thus be expressed in a conventional form, by talk-
ing about being caught or trapped in a relationship, but it can also be discussed in
more unusual language, by describing a marriage as being thrown into a deep, dark
dungeon, for instance.


In non-verbal modes, it is less common for metaphors to become inactive
through repeated use over time, because they are not as standardized as language
is. One possible example of an inactive metaphorical representation in the visual
mode is a simple drawing of a light bulb above a person’s head in cartoons to
signify that he or she is having an idea. Since this device has become convention-
alized in certain genres, it is unlikely to evoke a lot of novel connections between
the domain of thinking and the domain of electric light. The drawing of Marjane
behind bars may also be described as somewhat clichéd, since it uses abstraction
and simplification to signify the “essence” of incarceration. However, because im-
ages always represent a particular instance of someone or something, they are
typically more specific than words, capturing nuances of meaning that would be
hard to convey through language. In this case, it is a recognizable self-portrait, in
Satrapi’s own inimitable style, and her facial expression and hands gripping the
bars suggest a particular attitude towards her sudden loss of freedom, which add
color and detail to the metaphor at the representational level. Satrapi can thus be
said to be estranging the reader’s entrenched salient responses through a concrete
visual “literalization” (El Refaie, 2003, p. 89) of a rather conventional conceptual
metaphor, thereby opening it up to new meaning.
Television commercials provide another rich source of creative metaphors,
since the advertising industry is always trying to find new ways of attracting the
attention of potential customers and creating (implicit) cognitive links between
the product and some desirable qualities. From 1999 onwards, a UK television
campaign for a well-known brand of breakfast cereals featured a cartoon personi-
fication of “Hunger” as a bright blue monster with big teeth which assails hungry
individuals (e.g., Tarzan, Father Christmas, Romeo and Juliet) by drumming on
their stomachs with a pair of silver spoons. When the afflicted person starts to
tuck into a bowl of the cereal, which is made up of square shapes of interlocking
shreds of wheat, four huge sections of the cereal fly in from all sides to trap Hun-
ger between them. The sound of heavy metal grids clanging shut – an excellent
example of simultaneous cueing – supports the source domain prison. The advert
ends with the now famous slogan that exhorts consumers to “keep hunger locked
up ‘til lunch.”
While the idea of hunger as a wild creature that makes our stomach “growl”
and must be “kept at bay” may again be described as a reasonably conventional
metaphor, here it is extended to appeal to several senses, including vision, hear-
ing, and physical sensation. It is also elaborated to include a specific, detailed
mini-narrative, complete with a dramatic plot, characters, and a happy ending.
According to Semino (2008, pp. 219–222) the use of such a specific “scenario” as
a source domain is characteristic of many of the most creative metaphors. What
is particularly striking about this example, however, is that its multimodal form
clearly played a key role in the emergence of the prison scenario: Without the
perceptual resemblance cues provided by the grill-like visual appearance of the
cereals, the concept of locking hunger up inside a foodstuff would probably not
have occurred to the producers of this advert. It would also have made a lot less
sense to audiences and is unlikely to have achieved optimal relevance.
My final example, a political cartoon by the American artist Clay Bennett, il-
lustrates the creative potential of multimodal metaphors particularly well. Gener-
ally, the purpose of a political cartoon is to represent an aspect of social, cultural,
or political life in a way that condenses reality and transforms it in a striking,
original, and/or humorous way (El Refaie, 2009b). In one of Bennett’s cartoons
(Plate 2), the letters spelling out the name that was given to the 2003 invasion of
Iraq by the United States military, “Operation Iraqi Freedom,” have become part
of the picture plane. A shadowy figure looks out at us through the gaps formed
by the letters, as if peering through the barred windows of a prison. It is just pos-
sible to make out the words “Abu Ghraib Prison” on the man’s shirt.
Part of the creative impact of the cartoon comes from the way Bennett ex-
ploits and subverts ingrained assumptions about the differences between the ver-
bal and the visual mode. On the face of it, the distinction between words and
images is perfectly straightforward, but it is becoming increasingly clear that the
boundaries between the two modes are in fact fuzzy. As Mitchell’s (2009, p. 118)
concept of imagetext suggests, much visual art includes writing in some form,
and, by the same token, all texts “incorporate visuality quite literally the moment
they are written or printed in visible form.” The size, weight, form, and regularity
of type often convey a vast amount of connotative meaning. In Bennett’s cartoon
(Plate 2) the typography suggests the kind of stenciled writing, much favored by
the military, which can be found on crates and packing cases. It also demonstrates
that written words can assume more explicit pictorial qualities, thereby reminding
us of the artificiality of the word/image dichotomy and, by extension, of all
rigidly dichotomous thinking.

Plate 2. Clay Bennett / © 2004 The Christian Science Monitor (www.CSMonitor.com). Reprinted with permission

The most interesting thing about this cartoon is that its central metaphor is
impossible to translate into words. We might say that it represents a political slo-
gan as a form of prison in order to castigate the hypocrisy inherent in the professed
aim of the US government and military to liberate the Iraqi people while robbing
many individuals of their freedom and human rights. However, such paraphrases
do not even come close to the shock of recognition that people are likely to expe-
rience upon viewing this cartoon. In my view, this is because the metaphor works
not so much by initiating logical thinking, but rather by encouraging pre-verbal,
intuitive cross-modal resonances in the mind of the reader.

4. Conclusion

Artists, authors and scientists often describe how their most creative insights seem
to come out of nowhere, when they are least expecting them; indeed, conscious-
ness actually seems to hamper their ability to form novel connections (Gibbs,
2011; Tardif & Sternberg, 1988). Similarly, the interpretation of metaphors typi-
cally relies on our human ability to grasp creative metaphorical meaning intui-
tively and imaginatively, often involving several of our senses. For this reason, the
full breadth of a creative metaphor’s meanings cannot be captured by trying
to translate it into an explicit comparative statement:
When the metaphor is paraphrased or replaced, whatever had been extralingual,
unconscious, and therefore potentially new and alive in the collision of these two
entities gets reconstructed, this time in terms only of what is familiar. The point
of metaphor is to bring together the whole of one thing with the whole of another,
so that each is looked at in a different light. (McGilchrist, 2009, p. 117)

McGilchrist’s use of a visual analogy to describe the power of some metaphors
to make us “look at” two entities “in a different light” draws attention to the unconscious,
non-verbal thought processes involved in understanding the vast net
of associations invoked by a truly creative metaphor. Like McGilchrist, I believe
that creative metaphors facilitate an understanding that is often achieved seren-
dipitously, by allowing unexpected connections to emerge in our unconscious
mind and engaging the emotions. I have used the acoustic analogy of cross-modal
resonances to try and capture the ways in which multimodal metaphors are
sometimes able to open up such new, creative avenues of thought by expressing a
relatively conventional union in novel and unexpected ways.
In all the examples of metaphors I have discussed in this article, their multi-
modal form is likely to have inspired the initial conceptualization of a similarity
between two distinct entities, as well as shaping the way these metaphors are then
understood and interpreted by others. It is thus essential to take representational
aspects of creativity just as seriously as creativity at the level of conceptual map-
pings. Perhaps in future we will be able to find ways of exploiting the creative
potential of multimodal metaphor to address some of the seemingly intractable
problems facing us today, thereby allowing us to move beyond the “concept pris-
ons” of our old, established patterns of thinking.

Notes

1. Conceptual blending theory (Fauconnier & Turner, 2003) is heavily indebted to Koestler’s
work.

2. There is a lot of anecdotal and empirical evidence for the important role of visualization
in creative thinking (Finke, Ward, & Smith, 1992, pp. 45–63). Other empirical research (Gibbs
& Bogdonovich, 1999) has apparently confirmed Lakoff & Turner’s (1989) intuition that “one-
shot” image metaphors do not usually persist as part of people’s ordinary conceptualization of
their experience.

3. A large proportion of metaphors in advertising (Forceville, 2009, p. 28) and film (Rohdin,
2009, p. 422) are of this type, for example.

4. Black (1979, p. 27) originally used “resonance” to describe instances of metaphor that “sup-
port a high degree of implicative elaboration.” Hanks (2006) has also adopted this terminol-
ogy.

References

Black, M. (1979). More about metaphor. In A. Ortony (Ed.), Metaphor and thought (pp. 19–43).
Cambridge: Cambridge University Press.
Camp, E. (2006). Metaphor in the mind: The cognition of metaphor. Philosophy Compass,
1(2), 154–170. Available at: http://upenn.academia.edu/ElisabethCamp/Papers/1047417/Metaphor_in_the_Mind_The_Cognition_of_Metaphor1 (accessed 10.11.2012).
DOI: 10.1111/j.1747-9991.2006.00013.x
Collins English Dictionary (2005). 7th edition. Glasgow: HarperCollins.
Crisp, P. (2008). Between extended metaphor and allegory: Is blending enough? Language and
Literature, 17(4), 291–308. DOI: 10.1177/0963947008095960
de Bono, E. (1990). Lateral thinking: A textbook of creativity. London: Penguin.
El Refaie, E. (2003). Understanding visual metaphor: The example of newspaper cartoons. Visual
Communication, 2(1), 75–96. DOI: 10.1177/1470357203002001755
El Refaie, E. (2009a). Metaphor in political cartoons: Exploring audience responses. In
C. Forceville & E. Urios-Aparisi (Eds.), Multimodal metaphor (pp. 173–196). Berlin/New
York: Mouton de Gruyter.
El Refaie, E. (2009b). Multiliteracies: How readers interpret political cartoons. Visual Commu-
nication, 8(2), 181–205. DOI: 10.1177/1470357209102113
El Refaie, E. (2012). Autobiographical comics: Life writing in pictures. Jackson: University Press
of Mississippi. DOI: 10.14325/mississippi/9781617036132.001.0001
Fauconnier, G., & Turner, M. (2003). The way we think: Conceptual blending and the mind’s
hidden complexities. New York: Basic Books.
Finke, R.A., Ward, T.B., & Smith, S.M. (1992). Creative cognition: Theory, research, and applications.
Cambridge/London: Massachusetts Institute of Technology.
Fludernik, M. (Ed.). (2011). Beyond cognitive metaphor theory: Perspectives on literary meta-
phor. New York/London: Routledge.
Forceville, C. (2009). Non-verbal and multimodal metaphor in a cognitivist framework: Agen-
das for research. In C. Forceville & E. Urios-Aparisi (Eds.), Multimodal metaphor (pp. 19–
42). Berlin/New York: Mouton de Gruyter. DOI: 10.1515/9783110215366
Forceville, C., & Urios-Aparisi, E. (Eds.). (2009). Multimodal metaphor. Berlin/New York:
Mouton de Gruyter. DOI: 10.1515/9783110215366
Gibbs, R.W. Jr. (2011). Are ‘deliberate’ metaphors really deliberate? Metaphor and the Social
World, 1(1), 26–52. DOI: 10.1075/msw.1.1.03gib
Gibbs, R.W. Jr. & Bogdonovich, J. (1999). Mental imagery in interpreting poetic metaphor. Met-
aphor and Symbolic Activity, 14(1), 37–44. DOI: 10.1207/s15327868ms1401_4
Giora, R. (1999). On the priority of salient meanings: Studies of literal and figurative language.
Journal of Pragmatics, 31(7), 919–929. DOI: 10.1016/S0378-2166(98)00100-3
Giora, R., Fein, O., Kronrod, A., Elnatan, I., Shuval, N., & Zur, A. (2004). Weapons of mass dis-
traction: Optimal innovation and pleasure ratings. Metaphor and Symbol, 19(2), 115–141.
DOI: 10.1207/s15327868ms1902_2
Goatly, A. (1997). The language of metaphors. London/New York: Routledge.
DOI: 10.4324/9780203210000
Hanks, P. (2006). Metaphoricity is gradable. In A. Stefanowitsch & S. Gries (Eds.), Corpus-based
approaches to metaphor and metonymy (pp. 17–35). Berlin: Mouton de Gruyter.
Koestler, A. (1964). The act of creation. New York: Penguin Books.
Kövecses, Z. (2002). Metaphor: A practical introduction. Oxford: Oxford University Press.
Kövecses, Z. (2005). Metaphor in culture: Universality and variation. Cambridge: Cambridge
University Press. DOI: 10.1017/CBO9780511614408
Kress, G. (2000). Text as the punctuation of semiosis: Pulling at some of the threads. In U.H.
Meinhof & J. Smith (Eds.), Intertextuality and the media: From genre to everyday life
(pp. 132–154). Manchester: Manchester University Press.
Kress, G. (2009). What is mode? In C. Jewitt (Ed.), The Routledge handbook of multimodal analysis
(pp. 54–67). London/New York: Routledge.
Kress, G., & van Leeuwen, T. (2001). Multimodal discourse: The modes and media of contemporary
communication. London: Arnold.
Lakoff, G., & Johnson, M. (1980). Metaphors we live by. Chicago/London: University of Chicago
Press.
Lakoff, G., & Turner, M. (1989). More than cool reason: A field guide to poetic metaphor. Chicago/
London: University of Chicago Press. DOI: 10.7208/chicago/9780226470986.001.0001
Lemke, J.L. (1998). Multiplying meaning: Visual and verbal semiotics in scientific text. In J.R.
Martin & R. Veel (Eds.), Reading science: Critical and functional perspectives (pp. 87–113).
London: Routledge.
Lemke, J.L. (2002). Travels in hypermodality. Visual Communication, 1(3), 299–325.
DOI: 10.1177/147035720200100303
Mayr, A. (2008). Language and power: An introduction to institutional discourse. London/New
York: Continuum.
McGilchrist, I. (2009). The master and his emissary: The divided brain and the making of the
western world. Yale: Yale University Press.
Mitchell, W.J.T. (2009). Beyond comparison. In J. Heer & K. Worcester (Eds.), A comics studies
reader (pp. 116–123). Jackson: University Press of Mississippi.
Musolff, A. (2006). Metaphor scenarios in public discourse. Metaphor and Symbol, 21(1), 23–
38. DOI: 10.1207/s15327868ms2101_2
Ortony, A. (1979). Metaphor, language, and thought. In A. Ortony (Ed.), Metaphor and thought
(pp. 1–16). Cambridge: Cambridge University Press.
Rohdin, M. (2009). Multimodal metaphor in classical film theory from the 1920s to the 1950s.
In C. Forceville & E. Urios-Aparisi (Eds.), Multimodal metaphor (pp. 403–428). Berlin/
New York: Mouton de Gruyter.
Satrapi, M. (2006). Persepolis. London: Jonathan Cape.
Semino, E. (2008). Metaphor in discourse. Cambridge: Cambridge University Press.
Steen, G. (2008). The paradox of metaphor: Why we need a three-dimensional model of meta-
phor. Metaphor and Symbol, 23(4), 213–241. DOI: 10.1080/10926480802426753
Tardif, T.Z., & Sternberg, R.J. (1988). What do we know about creativity? In R.J. Sternberg
(Ed.), The nature of creativity: Contemporary psychological perspectives (pp. 429–440).
Cambridge: Cambridge University Press.
Thibodeau, P.H., & Boroditsky, L. (2011). Metaphors we think with: The role of metaphor in reasoning.
PLoS ONE, 6(2), 1–11. Available at: http://www.plosone.org (accessed 11.10.2011).


Metaphor and symbol
SEARCHING FOR ONE’S IDENTITY IS LOOKING FOR A HOME in animation film

Charles Forceville
University of Amsterdam

The quickly growing discipline of multimodality has hitherto primarily found
its inspirational models in semiotics and in Systemic Functional Linguistics.
However, Cognitive Linguistics, and specifically its Conceptual Metaphor
Theory branch, has over the past years proved a store of knowledge and methods
of analysis that can benefit the further advance of the young discipline. In
this paper the metaphor SEARCHING FOR ONE’S IDENTITY IS LOOKING FOR A
HOME in animation films is examined. It is shown that (a) analysing this metaphor
presupposes understanding “home” as a symbol; (b) animation has medium-specific
affordances to implement the metaphor; (c) the metaphor combines
embodied and cultural dimensions.

Keywords: conceptual metaphor, symbolism, LIFE IS A JOURNEY, SOURCE-PATH-GOAL, animation film, multimodality

1. Introduction

The journal Metaphor and Symbol (formerly called Metaphor and Symbolic Activ-
ity) has in the more than 25 years of its existence been true to one half of its name
by publishing a vast number of papers with the word (or root) “metaphor” in the
title. In fact, a count of titles including the word or root “metaphor-” at least once
in the first 25 volumes (1986–2010) yields no less than 275 instances. By contrast,
“symbol” or one of its derivations occurs only 8 times in that same period.1
Since the very name of the journal suggests that “metaphor” and “symbol” are
closely related tropes, this is a somewhat surprising finding. Perhaps one reason
for the scarcity of work on symbolic activity is that Conceptual Metaphor Theory
(CMT), with its mission to lay bare structural “metaphors we live by” (Lakoff &
Johnson, 1980), has long focused predominantly on the embodied dimension of
metaphors. CMT only later developed more interest in metaphors’ cultural dimensions
(e.g., Fludernik, 2011; Gibbs & Steen, 1999; Kövecses, 2005; Semino &
Culpeper, 2002; Yu, 1998) – and symbolism is a cultural phenomenon par excellence.
Another reason may be that whereas studies linking poetic and conceptual
metaphor (Lakoff & Turner, 1989; Turner, 1996) pertain only to the verbal realm,
the study of symbolism has a long tradition in art history scholarship, a discipline
that has hitherto not engaged much with CMT, or vice versa (but see Rothenberg,
2008). This is unsurprising precisely because of CMT’s penchant for studying verbal
manifestations of non-literal thinking.

doi 10.1075/bct.78.03for
© 2015 John Benjamins Publishing Company

But it is important to investigate how metaphor and symbol are related. In
formulating more precisely how the poetics of the mind (Gibbs, 1994) govern the
production and interpretation not just of verbal, but also of visual and multimod-
al discourse, CMT can complement and refine the work on multimodality done
in the Systemic Functional Linguistics (SFL) tradition (e.g., Jewitt, 2009; Kress &
Van Leeuwen, 1996, 2001; Royce & Bowcher, 2007). More specifically, CMT can
contribute to the discipline of multimodality by its theoretical explicitness and
its greater commitment to methodological rigour (for more discussion of these
problems in SFL, see e.g., Forceville, 2010a). In this paper I intend to aid both CMT
and multimodality scholarship by examining the concept HOME in a variety
of the metaphor PURPOSIVE ACTIVITY IS MOVEMENT TOWARD A DESTINATION in
animation films, showing how metaphoric and symbolic dimensions interact in
the creation of meaning. I will first briefly consider the relation between “meta-
phor” and “symbol.” Subsequently I will discuss five animation films that feature,
I argue, the SEARCHING FOR ONE’S IDENTITY IS LOOKING FOR A HOME metaphor,
concentrating on the visual and verbal modalities. Finally I will draw tentative
conclusions and make suggestions for broadening this project.

2. Metaphor and symbol

Lakoff & Johnson’s (1980, p. 5) “understanding and experiencing one kind of
thing in terms of another” remains a useful description of metaphor. That is, we
comprehend target domain A as source domain B. Making sense of a metaphor is
done by mapping salient properties (and where possible: relations between those
properties) from source to target. Importantly, the mapping is to be understood as
including salient connotations adhering to the source, and also typical emotional
responses to it (Forceville, 1996). Target and source belong to semantic domains
or categories that in the context in which the metaphor occurs are understood
as different. By contrast, in symbolism we understand a source domain B, in a
given (sub)cultural community, to stand for a target domain A (see e.g., Beckson
& Ganz, 1975, p. 246; Wales, 2001, p. 379). Some well-known examples are the
following: a rose stands for love, a cross for suffering, a skull for death, an hour-
glass for mortality. In these examples, B stands in a metonymical relationship
to A, where in contrast to metaphor both target and source belong to the same
semantic domain. The lover gives (red) roses to his beloved; Christ died on the
cross; the skull is a part of the human body’s remains after death; and the hour-
glass visualizes the passing of time. I suspect that most symbols are rooted in
metonymy rather than in arbitrary convention.
To avoid terminological confusion, it is crucially important to emphasize
that I use the word “symbol” for a sign that typically exemplifies a non-arbitrary
relation between signifier (e.g., cross) and signified (suffering). By contrast, in
his famous triad “icon-index-symbol” Charles Sanders Peirce reserves the word
“symbol” precisely for signs with an arbitrary relation between signifier and sig-
nified. In his theory, the verbal signifier “dog” refers to the signified (or concept)
DOG purely by an arbitrary convention adopted in the English language. Given
my view that a symbol is a special kind of metonym, my “symbol” is closer to
Peirce’s “index.” Whether symbolism can also be based on an arbitrary link be-
tween source and target is a difficult question. After all, what appears as arbitrary
may once have been a motivated, metonymic connection that is now no longer
retrievable. (Has the metonymic motivation for the one-time symbolizing of gay-
ness by wearing a single earring been lost, or was it a symbol arising out of an
arbitrary convention in the first place?)
If this reasoning is correct, we could say that – always: within a given cultural
group – in symbolism one metonym of a concept has become so salient at the ex-
pense of other (existing or unrealized or possible) metonyms of that concept that
this privileged metonym suffices to evoke that concept on its own, even with no
or minimal context. The test for this is to provide the members of a cultural group
(country, club, party, gang …) with the word for the metonym (“cross,” “rose”)
and ask them to provide connotations. If the members of the group significantly
often mention the same connotations, the metonym can be said to serve as a sym-
bol for that salient connotation (“[Christ’s] suffering,” “[romantic] love”).
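
The test just described can be made concrete as a simple tally. The following is a minimal sketch in Python, not a validated procedure: the function name, the sample answers, and in particular the 50% cut-off standing in for “significantly often” are assumptions introduced purely for illustration. A real study would need a proper elicitation design and a statistical criterion.

```python
from collections import Counter


def dominant_connotation(responses, threshold=0.5):
    """Return (connotation, share) if a single connotation is named by more
    than `threshold` of the respondents, otherwise None.

    `responses` holds one list of free-text connotations per group member.
    The threshold is a hypothetical stand-in for "significantly often".
    """
    if not responses:
        return None
    counts = Counter()
    for answer in responses:
        # Count each connotation at most once per respondent.
        counts.update({c.strip().lower() for c in answer})
    if not counts:
        return None
    top, n = counts.most_common(1)[0]
    share = n / len(responses)
    return (top, share) if share > threshold else None


# Hypothetical answers of four group members to the prompt "cross".
sample = [
    ["suffering", "church"],
    ["Suffering"],
    ["jewellery"],
    ["suffering", "death"],
]
print(dominant_connotation(sample))  # ('suffering', 0.75)
```

On this invented sample three of the four respondents name “suffering,” so under the assumed cut-off “cross” would count as a symbol of suffering for this group; with evenly split answers the function returns None.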

3. House/home as symbol

The concept “house” is a phenomenon with a wide network of connotations. A
house is a usually man-made structure that ideally provides protection against
extreme temperatures and other unpleasant weather conditions (Brown, 2010,
p. 89). In addition, the house protects humans against hostile creatures, whether
animals or unfriendly fellow humans. A suitable house thus helps human beings
to survive. Inasmuch as houses are often places where humans live together in
groups, often as (extended) families, houses are typically places where they live
out, or perform, a large part of what they consider their identities. Intimate re-
lationships flourish or derail in houses, one entertains friends there, and people
are born, love, and die in houses. The connotations of a house as a place where
one can be oneself, where important events take place, and where one feels safe,
adhere more specifically to the concept that, in English, is referred to by the word
“home”: “home” = “house” + positive connotations. This transpires from expres-
sions such as “my home is my castle,” “home is where the heart is,” “there’s no
place like home,” “make yourself at home,” and “East, west, home’s best.” In short,
human beings strive to have some sort of house-as-home. (Even though other
languages do not make the house/home distinction, I will henceforward assume
that the positive connotations adhering to “house” are widespread, possibly uni-
versal; I will thus go on using the word “home” as shorthand for “house + positive
connotations”). The material conditions of such a home can differ: it can be made
of stone, wood, clay, ice, or cloth; it can be a hut or a mansion; and while homes
are usually man-made, existing natural conditions such as caves or bowers can be
adapted to function as homes, too.
The house-as-home is often used as a symbol for safety, intimacy with kin
and friends, and thus for experiencing a sought-after identity. In this paper I will
examine the metaphor SEARCHING FOR ONE’S IDENTITY IS LOOKING FOR A HOME,
which is a special case of the more general metaphor PURPOSIVE ACTIVITY IS
MOVEMENT TOWARD A DESTINATION. The popular version of the latter is X IS A
JOURNEY – where X can for instance be LIFE, A RELATIONSHIP, or A CAREER. The
JOURNEY metaphor is probably one of the most deep-rooted metaphors in human
thinking (see Forceville, 2006, 2011a, 2011b; Forceville & Jeulink, 2011; Johnson,
1987; Katz & Taylor, 2008; Ritchie, 2008; Yu, 2009). Here, my central claim is that
the search for a/the home has such strong symbolical connotations that artistic
discourses exemplifying it evoke the metaphor SEARCHING FOR ONE’S IDENTITY
IS LOOKING FOR A HOME.
In Max Black’s (1979) terms, SEARCHING FOR ONE’S IDENTITY IS LOOKING FOR
A HOME would be a “strong” metaphor: it is emphatic in that it would be very diffi-
cult to replace the “home” part of the source domain by another concept without
affecting the potential mappings from source to target, and it is resonant in that
it allows for a wide range of mappings. These mappings in most contexts do not
consist of isolated features, but of structured networks of features in which the
relations between the features are co-mapped (this is discussed in terms of “structure
mapping” by Dedre Gentner; see e.g., Gentner & Jeziorski, 1993, p. 448; I
will here assume that connotations, too, can be part and parcel of “structure mappings”).
It is to a considerable extent the mappability of these relations between
the pertinent features in the source domain that makes the metaphor emphatic.
Thus the source domains LOOKING FOR/GOING TO CHURCH/THE OFFICE/THE MU-
SEUM, for instance, while all potentially giving rise to emphatic metaphors in their
own right, cannot serve as replacements for LOOKING FOR A HOME because what
people typically do at home is very different from what they do at these other
buildings. Another way of putting this is that the symbolic connotations (if any)
evoked by these other buildings do not coincide with those of “home.”

4. Case studies

As in Forceville and Jeulink (2011) and Forceville (2011b), the case studies ana-
lyzed here are animation films. One reason for focusing on film is CMT’s central
tenet of “embodied cognition,” which entails that humans typically conceptualize
the abstract in terms of the concrete – where the concrete is that which is percepti-
ble or pertains to the body’s motor functions (Forceville 2011a, p. 282). However,
hitherto little work has applied the CMT framework to film (exceptions
are Coëgnarts & Kravanja, 2012; Fahlenbrach, 2007, 2010; Forceville, 2006, 2011a;
Kappelhoff & Müller, 2012; Yu, 2009). Animation is a specifically interesting type
of film because the visuals of animation usually are entirely made (rather than the
result of registering a pro-filmic reality), and can for instance easily make use of
transformation and exaggeration without requiring a realistic motivation. They
are thus to an unusually large extent under the control of the creator. This enables
the exploitation of embodied schemata such as BALANCE, PATH and CONTAINMENT
for metaphorical purposes more easily than for instance in live-action film.
Since in terms of resources (money, time), the making of animation is moreover a
costly procedure, it is a medium that requires careful planning of each detail that
is to end up in the final film. Perhaps more than in live-action photography or
film, in animation (as in comics) we are encouraged to find each single element
meaningful. A further reason for the focus on animation films is that they are
often short (say, between 1 and 15 minutes): meaning appears in condensed form.
Finally, short animation films often have no language, so that demonstrating how
structural metaphors are the motor for their interpretation helps show that “the
locus of metaphor is thought, not language” (Lakoff, 1993, p. 204).

4.1 Hoppity Goes to Town/Mr. Bug Goes to Town


(Max & Dave Fleischer, USA 1941, 78’)

Summary
After a long trip, Hoppity the grasshopper returns to the small “Lowlands”
world where his fellow bugs live, commenting “There’s still no place like
home.” However, the Lowlands, a patch of urban garden in the middle of a
metropolis, is under threat by “the human ones,” who carelessly drop their
garbage on the bugs’ houses and disturb their territory. Hoppity is shocked:
“Nobody’s safe in their own homes – or out of them. How long has this been
going on? … There’s only one thing that we can do, we’re in a groove, we got
to move.” Together with Mr. Bumble, he goes scouting for a new place, but on
this expedition Mr. Bumble is almost drowned. He is rescued by the lady of
the house, who says, “There you are, Mr. Bumble, this is where you belong,
right out here in the garden.” Eventually Hoppity finds the bug community’s
new home: in the garden next to a cottage on top of a skyscraper.

The variety of the central metaphor at work here could be formulated as SURVIVAL
IS LOOKING FOR A HOME. That the garden where the bug community will settle is
where they “belong” was anticipated by the lady of the house’s rescue of Mr. Bumble.
The notion of looking for a safe house runs through the entire film. Hoppity is in
love with Honey Bee, but his rival, Bagley Beetle, puts pressure on Mr. Bumble to let
him, Beetle, marry his daughter by promising that the two of them can live with
him in the vase-house that adorns the fence surrounding the Lowlands. Tellingly,
this vase-house is located higher than the houses of the other bugs; and it is no
less significant that the place where the bugs eventually find their new abode is
high up, exemplifying the metaphor GOOD IS UP (Plate 1). The animation medium
is well-suited to depict, with due exaggerations, the embodied strains and
difficulties of climbing toward a craved-for home on a high location.

Plate 1. The bug community tries to find a safer place to live, higher up.
Still from Hoppity Goes to Town

4.2 Arrietty the Borrower


(Hiromasa Yonebayashi, Japan 2010, 94’)

Summary
Arrietty is a miniature girl who lives with her parents in the basement below a
country house, unbeknownst to its owners. Normal, big people are considered
dangerous enemies (like the “human ones” in Hoppity Goes to Town) intent
on getting rid of miniature people. The family survives because the father
every now and then undertakes a nightly expedition to the big people’s home
to “borrow” things they don’t need or won’t miss, such as a lump of sugar, or
a lost pin. However, once the ill-disposed servant Haru has discovered the
family’s existence and whereabouts, Arrietty and her parents need to move
house.

As in Hoppity Goes to Town, the home where the protagonists live is no longer safe.
Moving towards a new home, then, is again primarily a matter of SURVIVAL IS LOOK-
ING FOR A HOME, but by extension the new home is where the miniature people
can be themselves and peacefully live out their true identity. In the film’s final shot
(Plate 2) the family is seen travelling down a stream toward the sunshine, in search
of a new home. The film medium, with its depth-of-field, here optimally exploits the
spatial dimensions of the journey via the TIME IS SPACE metaphor: the family travels
not just forward into the distance; they travel towards the future.

Plate 2. Final shot: Arrietty’s family travels down a stream in a tea kettle boat in search
of a new home. Still from Arrietty the Borrower

4.3 The Village of Idiots


(Eugene Fedorenko & Rose Newlove, Canada 1999, 13’)
http://www.youtube.com/watch?v=TLGnekh2y6k
[last accessed May 2012]

Summary
Shmendrick, living in the small Polish village of Chelm, has “a thirst for more
knowledge” – as his voice-over tells us – and leaves wife and children for
Warsaw “to see the big city.” On the way he takes a nap and waking up, without
realizing, takes the same road back. He is surprised to find a village which is
precisely like Chelm, with people very much resembling those he knew in
Chelm and a woman and children shockingly similar to his own family. Only
he himself is not there … After some qualms, he decides to stay, believing that
his alter ego is now in the village that he left, and that in the end “perhaps the
entire world is simply one enormous Chelm.”

Plate 3. After his nap, Shmendrick thinks he walks on to Warsaw, while actually
retracing his steps back to Chelm. Still from The Village of Idiots

Plate 4. Shmendrick’s dream of transporting his house and the rest of the village from
one place to another. Still from The Village of Idiots

Although Shmendrick’s ostensible goal is to gain knowledge and see the big city,
the idea of “going home” as going to the place where one can live out one’s true
identity is strongly present. Unlike Shmendrick himself, the audience knows that
he has simply returned home. The idea of making a journey toward where you
are already in order to find your identity is cued in an interesting manner in the
opening of the film: we see Shmendrick on the roof of his house with a pile of soles
with holes in them. He addresses one of them, and says, “an old sole [punning on
“old soul,” ChF] must have travelled far, having seen many places.” He puts corks
in the holes of the soles, and then hammers the soles over the holes in the roof. In
this context, that is, the soles are metonymically tied to both shoes, and therefore
to journeying, and to the home – the symbol of identity and the destination of
the journey. Again, spatial dimensions are made to work metaphorically: in Plate
3 Shmendrick walks toward a future that is simultaneously his past. The visual-
ization of Shmendrick’s dream, too, draws on movement: the journey toward his
“new self” is a circular one, and shows him carrying his native village, the locus of
his home and thus his identity, on his back (Plate 4).

4.4 The Lost Thing


(Shaun Tan & Andrew Ruhemann, 2010, 15’)
http://www.youtube.com/watch?v=4EMzzJhH1Ec and
http://www.youtube.com/watch?v=ODdagZUp4tI
[last accessed May 2012]

Summary
A boy, bottle cap collecting on the beach, runs into a large machine-like
creature with which he plays. At the end of the day he realizes it has nowhere
to go and decides to take care of it. After investigating this “Lost Thing,” his
scientifically minded friend Pete says that he “didn’t think the Lost Thing
came from anywhere, and didn’t belong anywhere either.” The boy then takes
it home. His parents are not very interested and he installs the Lost Thing in
the shed behind the house, where it “seemed happy.” But this can only be a
temporary solution, and the next day he takes the creature to the “Federal
Department of Odds and Ends.” However, in this depressingly dark building,
the Lost Thing would just be stored away and forgotten. A cleaner advises
the boy to take his ward to a place he can find by following a wobbly arrow
sign. Eventually, they arrive at “what seemed to be the right place, in a dark
little gap, off some anonymous little street.” After opening a door, a brightly
lit world appears where all kinds of oddly-shaped “lost things” happily play
around. The boy takes leave of his friend, who from now on lives in this haven
for lost things.

Plate 5. The bright and happy world which will become the home of the Lost Thing.
Still from The Lost Thing

Plate 6. The Lost Thing moves to screen-right, whereas other people move to screen-left.
Still from The Lost Thing (thanks to reviewer 1 for pointing this out)

While the place the Lost Thing ends up living in is not, in the strict sense, a house,
the world in which it is finally “home” has house-like qualities: its entrance is a
door-like porch, and it is partly “enclosed” (Plate 5). The fact that the Lost Thing is
obviously too big to live in a normal house (as transpires from its size when it sits
on the roof of Pete’s house, and when it occupies too much space in the parental
living room) further supports the idea that the bright world is its new, and definitive,
home.
Here, too, spatial dimensions are important. In the search for a home, the Lost
Thing generally moves screen-right (Plate 6), and the camera’s panning (= mov-
ing horizontally) reinforces this left-right direction. Thus, the past is screen-left,
and the future is screen-right.

4.5 La Maison en Petits Cubes/Tsumiki No Ie


(Kunio Kato, Japan 2008, 12’)
http://www.youtube.com/watch?v=50-fWCXvhAY
[last accessed September 2013]

Summary
In this wordless film (which won an Oscar in 2009), an old man lives alone in
a house that stands in a sea, along with many other houses (Plate 7). But due to
persistent rain the water keeps rising, so at regular intervals he needs to add
a new floor to his house (Plate 8). Each floor is separated from a lower one by
a trapdoor. One time, he drops his pipe, which floats down through the open
trapdoor. He swims down in a diver’s suit to retrieve it, but then decides to
go even further down, through more trapdoors. At each underwater floor he
relives the period of his life spent there – with his ageing wife, with his young
wife, the birth of their daughter, his marriage … – the shift to memory-status
being signalled by a warm yellow glow as opposed to the real, greyish blue of
the underwater world. At the bottom of the sea we see how he meets his future
wife and together with her builds the first story of their house. He also finds a
wine glass there. When he is up again, he pours two glasses of wine, toasting his
now dead wife.

Plate 7. All houses stand in the sea; their inhabitants live in the top story.
Still from La Maison en Petits Cubes

Plate 8. The old man regularly needs to build a new story on his house to be safe from
the rising water. Still from La Maison en Petits Cubes

The home symbolizes the old man’s identity, each floor representing an episode in
his life. No words are needed, because we understand this, again, thanks to the visu-
ally presented metaphor TIME IS SPACE (see Forceville, 2011b; Forceville & Jeulink,
2011). Interestingly, TIME/SPACE is represented on a vertical, not the more custom-
ary horizontal dimension. In this orientation, PAST IS DOWN and FUTURE IS UP. So
the man needs to literally descend into his past (cf. “digging into the past”). The
homes the man is diving into are earlier versions of the home he is currently living
in. In order to understand the film, we need to recruit both the REMEMBERING THE
PAST IS GOING DOWNWARDS metaphor as a specific instantiation of the TIME IS SPACE
metaphor, and the home as symbol of identity. The old man thus literally dives into
his earlier identities as husband, father, young man, and child. These two conceptual
schemas (TIME-AS-SPACE and HOME-FOR-IDENTITY) are productive throughout the
film. The higher the story the old man builds, the older he is (the fact that, as Plate 7
shows, some other houses are under sea level suggests their owners are now dead).
The rising sea level thus exemplifies the inexorable progress of time; the moment
the man can no longer summon the strength or will to build a new story on top of
his house, he will drown in the sea of time. The TIME IS SPACE metaphor is also sup-
ported by the fact that, in the first scene, we see the man fishing through the trap-
door in his house, presumably angling for memories of the past. It is also telling that
each time he has to move to a higher story, he takes part of his furniture with him.
But as we can witness during his diving to lower floors, he also left some furniture
behind – a chair, the bed in which his wife was ill and possibly died and, lower down
yet, a couch where he remembers photographing his daughter and son-in-law with
their child. Moreover, his initial motivation for diving down is that he lost his pipe,
and although he at first considers the option of buying a new pipe from a travelling
salesman, he instead dives down in the hope of retrieving his beloved old pipe. These
events reinforce the idea that the home and the objects used in it are closely related
to the man’s identity: the bed is tied to his identity as married man, and when his
wife is dead he no longer wants the bed; but the pipe is part of an older identity he is
not yet ready to relinquish.

5. Discussion

The five animation films discussed all draw on the house-as-home as the sym-
bolical locus of literal survival and, by extension, of identity. Inasmuch as hu-
man beings (or their anthropomorphized animal or fabled counterparts) are
typically in search of their identity, it is unsurprising that the structural
metaphor PURPOSIVE ACTIVITY IS MOVEMENT TOWARD A DESTINATION has as one
of its recurring manifestations SEARCHING FOR ONE’S IDENTITY IS LOOKING FOR
A HOME.
While the two feature-length mainstream films discussed, Hoppity Goes to
Town and Arrietty, primarily emphasize the search for a new home as a strategy
for literal survival, here too, there are overtones of the home as symbol of identity.
For instance, it wouldn’t feel right for Mr. Bumble and Honey Bee to go and live
in Beetle’s house, since the price would be Honey’s forced marriage with Beetle,
whereas she loves Hoppity. Living in Beetle’s house would be a violation of Honey
Bee’s identity. In Arrietty, the grandfather of the big people’s family long ago made
a doll’s house for the miniature people, hoping that one day they would realize
that some human beings are friendly to them, accept his present, and start living
there. That is, he wanted to provide a home allowing them to live out their identity
peacefully. By contrast, in the three short art animations, the home shifts from
being primarily a resort of protection against physical harm to being the locus of
identity in a more spiritual sense.
In both Hoppity Goes to Town and La Maison en Petits Cubes the movement
takes place along a vertical dimension, the UP/DOWN orientation being important
here. But it is important to realize that the source domain here is linked
to different target domains. In Hoppity there is little doubt that both FUTURE IS
UP and GOOD IS UP (and PAST/BAD IS DOWN) apply. By contrast, in La Maison, only
FUTURE IS UP (and PAST IS DOWN) applies; a healthy reminder that a given source
domain – here spatial image schemas – can occur with different targets (Kövecses,
2010, p. 136, calls this the “scope” of metaphors; see also Hampe, 2005).
The case studies show that the metaphors governing the animation films,
while crucial for the stories, are no more than very basic templates for the so-
phisticated refinements that can only be appreciated by audiences familiar with
symbols, intertexts, and genres: basic, embodied templates acquire rich meaning
only by being enhanced and nuanced by aesthetic and cultural details. It is im-
portant to be aware of the continuum from deep-rooted, embodied, presumably
universal image schemas and metaphors, via culturally specific knowledge, to the
idiosyncrasies of unique texts. Cognitivist scholars should never forget that the
convention to write conceptual metaphors in small capitals enables unequivocal
references to the CONCEPTUAL level of metaphor as distinct from its verbal level –
but that this is no more than convenient shorthand whose precise formulation
is of little consequence. I completely agree with Pettersson who, in a demonstra-
tion of conceptual metaphors’ role in poetry, warns that a healthy development
of CMT scholarship requires sensitivity to stylistic elements: “In terms of cog-
nitive literary theory […] one ignores essential thematic and formal qualities if
one reduces literary works to cognitive patterns or techniques” (Pettersson, 2011,
p. 108) – a point that pertains no less to the animations discussed here. There is
always the danger that the small-capitals version of conceptual metaphors is taken
as a somehow “correct” rendering of what happens in the mind. But if Lakoff and
Johnson are right – as I think they are – that metaphors are “primarily a matter of
thought and action and only derivatively a matter of language” (Lakoff & Johnson,
1980, p. 153), the verbal rendering of metaphors’ conceptual level is no more than
an approximation of our minds’ activities. A discourse, particularly an artistic
story, can be informed by or even depend on certain metaphors, but it can never
be reduced to them. So in the end, analysis of conceptual metaphors in artistic
discourse requires the analyst’s attentive and sensitive eye and ear not only to the
skeletal metaphors and symbols that structure it, but also to the medium-specific
stylistic and narrative choices made by its maker to present them afresh. The same
principle, incidentally, pertains to the genre of political cartoons (see Bounegru
& Forceville, 2011). Much more is going on in any multimodal discourse than
whatever can be captured by the conceptual metaphor that may trigger its central
strategy of interpretation.

6. Concluding remarks

In this paper I have made the following points – all of which require further in-
vestigation and empirical (dis)confirmation:

1. The PURPOSIVE ACTIVITY IS MOVEMENT TOWARD A DESTINATION metaphor
has as one of its most interesting and possibly most significant variants
SEARCHING FOR ONE’S IDENTITY IS LOOKING FOR A HOME.
2. Adequate analysis of this metaphor requires considering “home” as a symbol.
If my proposal to consider a symbol as a special kind of metonym makes
sense, this will facilitate future examination of symbolism within a CMT
framework. Acceptance of the proposal also means that symbols can play a
role in conceptual metaphors.
3. The focus on a structural metaphor in a non-verbal/multimodal medium
such as animation film further supports the CMT tenet that we think meta-
phorically; simultaneously it shows that to further ground this tenet CMT
must expand and broaden the fledgling research on structural metaphor in
visual and multimodal discourse (film, comics, cartoons, gestures, music).
4. Whereas SEARCHING FOR ONE’S IDENTITY IS LOOKING FOR A HOME occurs in
other media (e.g., language, live-action film), animation has medium-specific
affordances to express it. Since it can easily defy laws of gravity and realism,
animation can (audio) visually exaggerate and emphasize embodied elements
of metaphors that may be more difficult to convey in other media. Specifically,
the TIME IS SPACE metaphor has been exploited to a lesser or greater
degree in all five films discussed; in The Village of Idiots and La Maison en
Petits Cubes the to-and-fro of spatial movements moreover maps onto the
notion of circularity in the TIME dimension (for more discussion of this issue,
see Forceville & Jeulink, 2011).

Further work on the topic discussed here could branch out in several directions.
In the first place, it is worthwhile to test other animations featuring LOOKING FOR
A HOME in light of the claims made here. We may for instance ask: are there other
target domains to which LOOKING FOR A HOME is systematically connected? A
systematic investigation of the direction of movement as well as of the vehicle
of movement is also worth pursuing. My hunch is that walking or other ways of
progressing depending on protagonists’ own muscle activity is privileged over
transportation in cars, planes, trains, motorboats, etc., since this reinforces the
physical and existential nature of LOOKING FOR A HOME.
It is to be expected that the home will not only feature as a symbol for identity
in journeys toward it, but also in building, repairing, extending, and changing the
home. Are there animation films which feature these other activities pertaining to
house-as-home, and if so, how are these metaphorically exploited? It could also be
insightful to investigate different kinds of buildings. I could imagine that X IS GO-
ING TO/BUILDING/REPAIRING A HOUSE/A CHURCH/A CASTLE/A MUSIC HALL might
occur, and that, given the symbolic potential of these buildings, they might func-
tion in conceptual metaphors as well. If so, it would be interesting to see whether
they are perhaps systematically linked to specific target domains.
Of course, there is no reason to limit such investigations to animation films.
Many live-action road movies and comics, too, feature the SEARCHING FOR ONE’S
IDENTITY IS LOOKING FOR A HOME metaphor non-verbally and multimodally –
and the alternatives suggested in the preceding paragraph (different buildings,
different activities) are no less worth examining. Indeed one reviewer of this
paper, stressing the “containment” dimension of the home, suggested that it
makes sense to investigate the Western genre, “whereby the inside of the home
often symbolizes civilisation, family values, and the outside symbolizes danger,
Indians, cruel nature.”
And finally, this paper may further spur on the work to be done on charting
with more precision how different pictorial and multimodal varieties of tropes
(metaphor, metonymy, symbol, irony, hyperbole, oxymoron …) relate to one an-
other (see Forceville, 2010b).

Acknowledgments

I thank Marloes Jeulink for insights about Hoppity Goes to Town and The Village
of Idiots, and Galen Campbell for alerting me to, and discussing, The Lost Thing. I
am furthermore deeply indebted to the thoughtful and detailed comments and
criticisms by two anonymous reviewers of an earlier draft of this paper.

Notes

1. The 25 volumes comprise 474 papers and book reviews. Occurrences of the two keywords
were counted in the titles of the reviewed books if these were indicated in the online database
at http://www.tandfonline.com/toc/hmet20/current (accessed January 2012). The word “irony”
occurred in 30, and the root “figur-” in 28 titles. The root “metonym-” appeared fewer than 10
times.

References

Beckson, K., & Ganz, A. (1975). Literary terms: A dictionary. New York: Farrar, Straus and
Giroux.
Black, M. (1979). More about metaphor. In A. Ortony (Ed.), Metaphor and thought (pp. 19–43).
Cambridge: Cambridge University Press.
Bounegru, L., & Forceville, C. (2011). Metaphors in editorial cartoons representing the global
financial crisis. Visual Communication, 10, 209–229.
DOI: 10.1177/1470357211398446
Brown, D.E. (2010 [1991]). The universal people. In B. Boyd, J. Carroll & J. Gottschall (Eds.),
Evolution, literature & film (pp. 83–95). New York: Columbia University Press.
Coëgnarts, M., & Kravanja, P. (2012). From thought to modality: A theoretical framework for
analysing structural-conceptual metaphor and image metaphor in film. Image [&] Narrative,
13(1), 96–113.

Fahlenbrach, K. (2007). Embodied spaces: Film spaces as (leading) audiovisual metaphors. In
J.D. Anderson & B. Fisher Anderson (Eds.), Narration and spectatorship in moving images
(pp. 105–124). Cambridge: Cambridge Scholar Press.
Fahlenbrach, K. (2010). Audiovisuelle Metaphern: Zur Körper- und Affektästhetik in Film und
Fernsehen. Marburg: Schüren.
Fludernik, M. (Ed.). (2011). Beyond cognitive metaphor theory: Perspectives on literary meta-
phor. London: Routledge.
Forceville, C. (1996). Pictorial metaphor in advertising. London: Routledge.
DOI: 10.4324/9780203272305
Forceville, C. (2006). The source-path-goal schema in the autobiographical journey documen-
tary: McElwee, Van der Keuken, Cole. New Review of Film and Television Studies, 4, 241–
261. DOI: 10.1080/17400300600982023
Forceville, C. (2010a). Review of C. Jewitt (Ed.), The Routledge handbook of multimodal analysis
(2009). Journal of Pragmatics, 42, 2604–2608. DOI: 10.1016/j.pragma.2010.03.003
Forceville, C. (2010b). Why and how study metaphor, metonymy, and other tropes in
multi-modal discourse? In R. Caballero & M.J. Pinar (Eds.), Ways and modes of human
communication (pp. 57–76). Cuenca: Ediciones de la Universidad de Castilla La Mancha.
Forceville, C. (2011a). The journey metaphor and the source-path-goal schema in Agnès
Varda’s autobiographical gleaning documentaries. In M. Fludernik (Ed.), Beyond cognitive
metaphor theory: Perspectives on literary metaphor (pp. 281–297). London: Routledge.
Forceville, C. (2011b). Structural pictorial and multimodal metaphor. Lecture 7/8 of the Course
in pictorial and multimodal metaphor.
http://semioticon.com/sio/courses/pictorial-multimodal-metaphor/
Forceville, C., & Jeulink, M. (2011). The flesh and blood of embodied understanding: The
source-path-goal schema in animation film. Pragmatics & Cognition, 19, 37–59.
DOI: 10.1075/pc.19.1.02for
Gentner, D., & Jeziorski, M. (1993). The shift from metaphor to analogy in Western science. In
A. Ortony (Ed.), Metaphor and thought (2nd ed.) (pp. 447–480). Cambridge: Cambridge
University Press. DOI: 10.1017/CBO9781139173865.022
Gibbs, R.W. Jr. (1994). The poetics of the mind: Figurative thought, language, and understanding.
Cambridge: Cambridge University Press.
Gibbs, R.W. Jr., & Steen, G.J. (Eds.). (1999). Metaphor in cognitive linguistics. Amsterdam: John
Benjamins. DOI: 10.1075/cilt.175
Hampe, B. (2005). When down is not bad, and up not good enough. A usage-based assessment
of the plus-minus parameter in image-schema theory. Cognitive Linguistics, 16, 81–112.
DOI: 10.1515/cogl.2005.16.1.81
Jewitt, C. (Ed.). (2009). The Routledge handbook of multimodal analysis. London: Routledge.
Johnson, M. (1987). The body in the mind: The bodily basis of meaning, imagination and reason.
Chicago: University of Chicago Press.
Kappelhoff, H., & Müller, C. (2012). Embodied meaning construction: Multimodal metaphor
and expressive movement in speech, gesture, and feature film. Metaphor and the Social
World, 1(2), 121–153. DOI: 10.1075/msw.1.2.02kap
Katz, A.N., & Taylor, T.E. (2008). The journeys of life: Examining a conceptual metaphor with
semantic and episodic memory recall. Metaphor and Symbol, 23, 148–173.
DOI: 10.1080/10926480802223051

Kövecses, Z. (2005). Metaphor in culture: Universality and variation. Cambridge: Cambridge
University Press. DOI: 10.1017/CBO9780511614408
Kövecses, Z. (2010). Metaphor: A practical introduction (2nd ed.). Oxford: Oxford University
Press.
Kress, G., & van Leeuwen, T. (1996). Reading images: The grammar of visual design (2nd ed.
published in 2006). London: Routledge.
Kress, G., & van Leeuwen, T. (2001). Multimodal discourse. London: Arnold.
Lakoff, G. (1993). The contemporary theory of metaphor. In A. Ortony (Ed.), Metaphor and
thought (2nd ed.) (pp. 202–251). Cambridge: Cambridge University Press.
DOI: 10.1017/CBO9781139173865.013
Lakoff, G., & Johnson, M. (1980). Metaphors we live by. Chicago: University of Chicago Press.
Lakoff, G., & Turner, M. (1989). More than cool reason: A field guide to poetic metaphor. Chica-
go: University of Chicago Press. DOI: 10.7208/chicago/9780226470986.001.0001
Pettersson, B. (2011). Literary criticism writes back to metaphor theory: Exploring the relation
between extended metaphor and narrative in literature. In M. Fludernik (Ed.), Beyond cog-
nitive metaphor theory: Perspectives on literary metaphor (pp. 94–112). London: Routledge.
Ritchie, D.L. (2008). X is a journey: Embodied simulation in metaphor interpretation. Meta-
phor and Symbol, 23, 174–199. DOI: 10.1080/10926480802223085
Rothenberg, A. (2008). Rembrandt’s creation of the pictorial metaphor of self. Metaphor and
Symbol, 23, 108–129. DOI: 10.1080/10926480801944269
Royce, T.D., & Bowcher, W.L. (Eds.). (2007). New directions in the analysis of multimodal dis-
course. Mahwah, NJ: Lawrence Erlbaum.
Semino, E., & Culpeper, J.V. (Eds.). (2002). Cognitive stylistics: Language and cognition in text
analysis. Amsterdam: John Benjamins. DOI: 10.1075/lal.1
Turner, M. (1996). The literary mind. New York: Oxford University Press.
Wales, K. (2001). A dictionary of stylistics (2nd ed.). Harlow, England: Pearson.
Yu, N. (1998). The contemporary theory of metaphor: A perspective from Chinese. Amsterdam:
John Benjamins. DOI: 10.1075/hcp.1
Yu, N. (2009). Nonverbal and multimodal manifestations of metaphors and metonymies: A
case study. In C. Forceville & E. Urios-Aparisi (Eds.), Multimodal metaphor (pp. 119–143).
Berlin: Mouton de Gruyter.



Woven emotions
Visual representations of emotions
in medieval English textiles

Javier E. Díaz Vera


University of Castilla-La Mancha

Following Forceville (2005, 2011), in this paper I show that the same conceptual
models underlie the expression of Old English emotions in both the language
and the visual modes. Kövecses (2000, 2005) and Stefanowitsch (2004, 2006)
have shown that verbal expressions and idioms used to describe emotions can
be traced back to a limited number of conceptual metaphors. In the light of
these findings, I will analyze here the pictorial representations of emotions
in the Bayeux Tapestry, an 11th century embroidered cloth that narrates and
depicts the events that led up to the Norman Conquest of England and the
invasion itself. The tapestry, which has been described as an example of early
narrative art (McCloud, 1993, pp. 12–14), shows hundreds of human figures in
an astounding range of poses and circumstances.
My analysis of the set of pictorial signals used in the Anglo-Norman
Bayeux Tapestry to represent emotion types such as ‘anger’, ‘grief’ and ‘fear’
shows that (1) Anglo-Norman artists used a well-organized set of visual stimuli
to convey emotion-related meanings in a patterned way, that (2) the same ide-
alised conceptual models are shared by verbal and visual modalities, and that
(3) whereas verbal expressions of emotions regularly draw on non-embodied,
behavioural concepts, visual representations show a clear preference for embod-
ied container concepts.

Keywords: Old English, emotions, metaphor, visual representation, pictorial
runes, textiles

doi 10.1075/bct.78.04dia
© 2015 John Benjamins Publishing Company

1. Aims and scope

The analysis and description of the different ways language mediates our conceptualization
of emotions has received a growing amount of attention from researchers
in the field of Conceptual Metaphor Theory (henceforth CMT; Fesmire, 1994;
Kövecses, 1986, 1988, 1990, 2000; Lakoff, 1987; Lakoff & Johnson, 1980; Lakoff
& Kövecses, 1987). Within this framework, a central claim of CMT is that the
conceptualization of human emotions largely depends on embodiment, that is, on
what one believes to be occurring in and to one’s body in a given emotional state.
Although CMT has traditionally focused on the verbal manifestations of folk
models of emotions, the attestation of their manifestations in non-verbal repre-
sentations and, especially, in the visual modality, has recently attracted the interest
of a growing number of scholars. Forceville (2005, 2011), Abbott and Forceville
(2011) and Eerden (2009) use comic strips and cartoons in order to demonstrate
that the visual representations of emotions used by comic artists from different
cultural areas can not only confirm previous analyses based on exclusively linguistic
data, but also complement and enrich these, throwing new light on concrete
aspects of our conceptualization of each emotional state. As a matter of fact, in
the same way as languages, comics and cartoons make use of a well-defined set
of stereotypical exaggerations and of a rudimentary “sign-system” which play a
pervasive role in helping readers to cue emotions (Forceville, 2005, p. 71).
Based on these claims, in this article I present a study of the visual representa-
tions of fear, anger and grief used in medieval visual arts and how these can
be related to the corresponding verbal expressions.

2. Methodology and data

The multimodal corpus used in this study is based on Foys’ (2003) digital edition
of the Bayeux Tapestry (henceforth BT; Fowke, 1913; Messant, 1999; Musset, 2005;
Stenton, 1957; Wilson, 1985), an embroidered cloth illustrating, in 32 scenes, the
events leading to the 1066 Norman Conquest of England and the invasion itself.
The BT provides a fitting corpus for the study proposed here; in fact, the tapestry
depicts up to 626 human figures in different actions and poses, accompanied by
57 Latin texts narrating the historical facts. Although the tapestry was commis-
sioned and produced after the Norman Conquest, it was probably designed and
constructed in England by Anglo-Saxon artists (Coatsworth, 2005) and, conse-
quently, is expected to illustrate Anglo-Saxon artistic traditions and techniques.
The BT is referred to by Scott McCloud in Understanding comics (1993) as
an example of early narrative art, whereas British comic book artist Bryan Talbot
(2007, p. 5) describes it as “the first known British comic strip”. In fact, just like a
comic strip, the BT is a hybrid medium having a verbal side tied to a visual side in
order to convey narrative, seeking synergy by using both visual (non-verbal) and
verbal sides in interaction.

In this study, I am especially interested in exploring how these emotions were
construed by Anglo-Saxon weavers through the use of a well-defined set of facial
expressions and bodily gestures, and the role of embodiment in that construal.
With this aim, I have made an inventory of the pictorial signals used in the BT in
order to suggest one of the three emotions under scrutiny here: fear, anger and
grief. All these signals are indexical in nature (Danaher, 1998; Forceville, 2005,
p. 74), in so far as they are metonymically motivated. Once this inventory was
completed, I analyzed each single panel and scene in the tapestry featuring
one or more characters unambiguously affected by one of these emotions. The
decision whether a character represents one of these emotional states or not was
triggered (i) in part by the narration itself and (ii) in part by the use of visual signals
by the embroiderers. There is indeed a certain degree of circularity implicit
in this argument (Forceville, 2005, p. 75), which I have tried to resolve using textual
and historical information. In the following section, I provide a survey of the
pictorial signs identified in this study, followed by some examples from the BT
featuring them.

3. Pictorial signals for emotions in the Bayeux Tapestry

The complete list of pictorial signals for fear, anger and grief consists of nine-
teen different tokens. Of these, seven signals indicate facial expressions (eyes,
eyebrows and mouth), whereas ten correspond to bodily gesture (head, neck,
shoulders, trunk, arms, hands and legs) and the last two refer to body size.1

3.1 Facial expressions

3.1.1 Eyes
I have analyzed three different types of eyes in this study: bulging eyes, semi-
closed eyes and tightly closed eyes. Bulging eyes are the most commonly used pic-
torial signal for fear in the BT (32 occurrences) and, less frequently, they are also
used in order to express anger (4 occurrences) and grief (1 occurrence). They
denote an enlarged, black pupil located on the edge of wide-open eyes, normally
delimited by two thick, black lines: upper (eyebrow) and lower (pouch). Plate 1 illustrates
two different types of bulging eyes used in the BT: black pupil with round
hole (left) and black pupil without hole (right).
For example, in scene 12 (panel 45), we find an image of King Harold strug-
gling to get to the shore of river Couesnon, carrying one foot-soldier on his back
and pulling another soldier by his hand. Both King Harold and one of the two
soldiers are depicted with bulging eyes, as can be seen in Plate 2. Based on the
context of the scene (i.e., two persons trying to escape from drowning), one can
confidently argue that this pictorial signal is being used in order to express fear
on the part of these two characters.
Semi-closed eyes (6 occurrences) as a sign of grief are indicated by a lowered
upper eyelid, which is normally represented by a straight, horizontal line. Finally,
tightly closed eyes illustrating anger (7 occurrences) are indicated by a long line
indicating an enlarged pouch, beside or under the pupil, represented by a small
dot, as in the two characters on the left in Plate 3, where a messenger informs an
enraged Duke William of Harold’s usurpation of the English throne.

Plate 1. Bulging eyes in BT

Plate 2. Et hic transierunt flumen Cosnonsis hic Harold dux trahebat eos de arena
(scene 12, panel 45, detail from the Bayeux Tapestry – By special permission of the City
of Bayeux)

3.1.2 Eyebrows
Raised eyebrows (as in Plate 4) are very frequently associated with fear (29 occurrences)
and grief (11 occurrences) in the tapestry, whereas no occurrence of
anger has been found. In the first case, they are normally combined with bulging
eyes, whereas in the second they are accompanied by semi-closed eyes. Raised
eyebrows are thicker and longer than non-raised eyebrows, and have an inverted-C
shape.
In scene 4, panel 10, Harold’s ship is crossing the Channel driven solely by
the wind (as indicated by a billowing sail and empty oarlocks). The Latin text
notes that the wind filled the ship’s sails. Contemporary records indicate that
Harold’s ship was blown off course by a heavy storm, and that he was shipwrecked
in Picardy, where he was taken captive by Guy I. The panel depicts fourteen men
trying to regain control of the ship: as can be seen in Plate 5, raised eyebrows are
used for all the characters but one, whose face is hidden behind a rope. Based on
the context of the scene (i.e., a shipwreck), it can be argued that this pictorial signal
is being used in order to express fear on the part of these characters.
Frowned eyebrows are preferred in the BT in order to express anger (10 occurrences).
In this paper, eyebrows count as frowned when they are joined to each
other (frontal view) or to the upper part of the nose (view from the side).

Plate 3. Willelm dux iussit naves edificare (scene 24, panel 80, detail from the Bayeux
Tapestry – By special permission of the City of Bayeux)

Plate 4. Raised eyebrows in BT

Plate 5. Mare navigavit et velis vento plenis venit in terra Widonis comitis (scene 4,
panel 10, detail from the Bayeux Tapestry – By special permission of the City of Bayeux)

3.1.3 Mouth
A very thin, closed mouth is frequently found in the BT in order to express fear
(13 occurrences). The upper lip covers the lower lip, as if the character was biting
it with his upper teeth. Less frequently, we find a large mouth represented by a
thin, curved line. For example, the two characters to the right in scene 8, panel 23
(see Plate 6) represent two messengers of Duke William of Normandy, who implore
Guy I to release Harold. Closed, thin mouths are combined here with raised
eyebrows and hunched shoulders (see below), indicating a mixture of fear and
respect.
Where a thin, closed mouth indicates anger (13 occurrences), the line is
frequently curved downwards. In two different cases (as, for example, the representation
of Guy de Ponthieu in the centre of Plate 7), the lower lip protrudes
markedly as the upper one recedes.
Finally, grief is indicated by a large, closed mouth (5 occurrences) represented
by an inverted C-shaped line running from cheek to cheek.

Plate 6. Ubi nuntii Willelmi ducis venerunt at Wido (scene 8, panel 23, detail from
the Bayeux Tapestry – By special permission of the City of Bayeux)

Plate 7. Ubi Harold 7 Wido parabolant (scene 7, panel 20, detail from the Bayeux
Tapestry – By special permission of the City of Bayeux)

3.2 Gesture

3.2.1 Head
A downcast face is frequently used in the BT as an indication of grief (15 occur-
rences). The notion of looking downwards is normally reinforced by the position
of the hair, the eyes and, less frequently, the hands and shoulders of the character
affected by grief.

3.2.2 Neck
An exaggeratedly long, upright neck is used in different scenes (5 occurrences) in
order to indicate anger. This is especially frequent in those cases where a charac-
ter expresses his anger towards a social inferior and can consequently be consid-
ered a symbol of superiority.

Plate 8. Reversus est ad anglicam terram et venit ad Edwardu(m) regem (scene 19,
panel 64, detail from the Bayeux Tapestry – By special permission of the City of Bayeux)

3.2.3 Shoulders
Hunched shoulders are used in the BT in order to indicate fear (9 occurrences).
The neck and the head are horizontally aligned with the shoulders, which are
emphatically raised. As described above, downward inclination of the body or
the head is considered a sign of fear, veneration, submission and reverence in
the Anglo-Saxon world (Díaz Vera, 2011). In fact, Anglo-Saxon writers use the
predicates feallan and creopan in order to refer to the physical expression of these
feelings, showing the existence of a close connection between them. This is clearly
illustrated by Plate 8, where Harold humbly stands before King Edward of Eng-
land, who shows his dissatisfaction with him.
Lowered shoulders (sometimes accompanied by lowered arms) are an indica-
tor of grief (7 occurrences). In some cases, lowered shoulders are exaggeratedly
narrow (that is, as wide as the character’s head, or even less).

3.2.4 Trunk
Turning the trunk backwards (face to the left, body to the right, or both face and
body to the left) is also used in the BT to indicate that someone is escaping or hiding
from the source of fear (6 occurrences in all). For example, scene 13, panel 47
(see Plate 9) depicts a Breton soldier trying to escape from the fortress of Dol, which is being
attacked by the Normans.

Plate 9. Et venerunt at Dol et Conan fuga vertit (scene 13, panel 47, detail from
the Bayeux Tapestry – By special permission of the City of Bayeux)

3.2.5 Arms
Fear can also be indicated by an upper arm emphatically close to the body, where-
as the lower arm is slightly separated from it, sometimes placed around the belly,
or with the hand on the chest (26 occurrences). This can be clearly seen in the
group of characters represented in the famous comet scene (scene 22, panel 74;
see Plate 10 below), whose right arms and left upper arms are completely adhered
to their bodies as a sign of fear provoked by the spectacular vision of a comet
(frequently identified with Halley’s comet; Wright, 1999) with a tail flying through
the English skies by the beginning of the year 1066.
Moreover, widely extended arms (4 occurrences) are used in the BT as an
indication of anger. The upper arm is clearly separated from the trunk and horizontally
aligned with the shoulders, as can be seen in Duke William’s portrait in
Plate 3 above.

Plate 10. Isti mirant(ur) stella(m) (scene 22, panel 74, detail from the Bayeux Tapestry –
By special permission of the City of Bayeux)

Plate 11. Portatur corpus Eadwardi regis ad ecclesiam Sci Petri ap(osto)li hic Eadwardus
rex in lecto alloquit(ur) fideles et hic defunctus est (scene 20, panel 68–69, detail from
the Bayeux Tapestry – By special permission of the City of Bayeux)

3.2.6 Hands and fingers
Hands and fingers are a very powerful tool for the expression of emotions in the
BT. Hand position is especially relevant in the case of grief, which can be indi-
cated by an open hand with the palm turned upwards (6 occurrences) or placed
on the heart (3 occurrences). Both signals are illustrated by the first two clerics on
the right side of scene 20, panels 68–69 (see Plate 11), depicting King Edward’s burial.
The index finger is frequently pointing at the source of fear (the upper arm
remains stuck to the body; 12 occurrences in all) or anger (fully extended arm; 5
occurrences). It should be noted here that, in the BT, the right hand is preferred in
order to indicate anger, whereas the left hand is normally used to express fear.
This can be illustrated by Plate 8 above, where an angry King Edward points with
his right hand towards a scared Harold, who humbly moves his left hand towards
the king.

Plate 12. Hic ceciderunt qui erant cum Haroldo (scene 38, panel 166, detail from
the Bayeux Tapestry – By special permission of the City of Bayeux)

3.2.7 Limbs
Hanging limbs are used to indicate total paralysis of the body in a situation of
extreme fear in three different panels. For example, in scene 38, panel 166, a
Norman soldier holds an unarmoured Anglo-Saxon by the hair and prepares to
decapitate him. As can be seen in Plate 12, both the arms and the legs of the
Anglo-Saxon soldier are hanging, as well as his head.

3.3 Body size

Body scale size in the BT is highly conditioned by the physical size of the panels
and by the number of characters represented on each one of them. Furthermore,
different body sizes can also be used in order to try to represent the effects of visual
perspective. However, it has been noted that, in the tapestry, someone who is
scared is at times represented on a much smaller scale, as compared to the other
characters in the same scene (13 occurrences). This is the case of the Anglo-Saxon
woman and her child trying to escape from their burning house in scene 30, panel 118
(see Plate 13).
Similarly, anger can be indicated by a relative increase in body size, especial-
ly when the angry character is a social superior (see, for example, King Edward’s
portrait in Plate 8 above).
Table 1 indicates the number of occurrences of each signal as a possible indi-
cator of fear, anger or grief in the BT.

Plate 13. Hic domus incenditur (scene 30, panel 118, detail from the Bayeux Tapestry –
By special permission of the City of Bayeux)

Table 1. Pictorial signals of fear, anger and grief in the Bayeux Tapestry

Category            Pictorial signal in BT      fear   anger   grief
Facial expression   bulging eyes                  32       4       1
                    tightly closed eyes            -       7       -
                    semi-closed eyes               -       -       6
                    raised eyebrows               29       -      11
                    frowned eyebrows               -      10       -
                    small, closed mouth           13      13       -
                    large, closed mouth            -       -       5
Gesture             downcast face                  -       -      15
                    long neck                      -       5       -
                    hunched shoulders              9       -       -
                    lowered shoulders/arm(s)       -       -       7
                    trunk backwards (left)         6       -       -
                    arm(s) stuck to body          26       -       -
                    extended arm(s)                -       5       -
                    open hand(s)                   -       -       9
                    pointing with the finger      12       5       -
                    hanging limbs                  3       -       -
Body size           smaller scale size            13       -       -
                    bigger scale size              -       6       -
Total                                            143      55      54
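
The counts in Table 1 lend themselves to a simple cross-check. What follows is a minimal Python sketch and not part of the original analysis: the names TABLE_1, column_totals and shared_signals are hypothetical, and an empty cell in the table is assumed to stand for zero occurrences. It recomputes the column totals and lists the signals attested for more than one emotion.

```python
# Counts transcribed from Table 1. Assumption: an empty cell means 0 occurrences.
# Tuple order for every signal: (fear, anger, grief).
EMOTIONS = ("fear", "anger", "grief")

TABLE_1 = {
    "bulging eyes": (32, 4, 1),
    "tightly closed eyes": (0, 7, 0),
    "semi-closed eyes": (0, 0, 6),
    "raised eyebrows": (29, 0, 11),
    "frowned eyebrows": (0, 10, 0),
    "small, closed mouth": (13, 13, 0),
    "large, closed mouth": (0, 0, 5),
    "downcast face": (0, 0, 15),
    "long neck": (0, 5, 0),
    "hunched shoulders": (9, 0, 0),
    "lowered shoulders/arm(s)": (0, 0, 7),
    "trunk backwards (left)": (6, 0, 0),
    "arm(s) stuck to body": (26, 0, 0),
    "extended arm(s)": (0, 5, 0),
    "open hand(s)": (0, 0, 9),
    "pointing with the finger": (12, 5, 0),
    "hanging limbs": (3, 0, 0),
    "smaller scale size": (13, 0, 0),
    "bigger scale size": (0, 6, 0),
}


def column_totals(table):
    """Sum the occurrences of all signals for each emotion."""
    return {
        emotion: sum(counts[i] for counts in table.values())
        for i, emotion in enumerate(EMOTIONS)
    }


def shared_signals(table):
    """Return, in alphabetical order, the signals attested for more than one emotion."""
    return sorted(
        signal
        for signal, counts in table.items()
        if sum(1 for c in counts if c > 0) > 1
    )


totals = column_totals(TABLE_1)
shared = shared_signals(TABLE_1)
print(totals)  # {'fear': 143, 'anger': 55, 'grief': 54}
print(shared)
```

Under these assumptions the recomputed totals match the last row of the table, and four signals (bulging eyes, raised eyebrows, small closed mouth and pointing with the finger) are shared by at least two emotions.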

4. Interpreting pictorial signals in the Bayeux Tapestry

4.1 Facial expressions

The pictorial signals analysed here are all highly coherent with existing folk models
for emotions (Díaz Vera, 2011; Gevaert, 2002; Kövecses, 1990). In fact, most of
the Old English metaphors described in earlier research on the verbal expression
of emotions have a non-verbal counterpart in the BT. It is widely known
that many of our everyday words for emotions stem from a combination of two
ontological metaphors: THE BODY IS A CONTAINER FOR EMOTIONS and EMOTIONS
ARE TEMPERATURE CHANGES. Within this context, bulging eyes (indicating the
three emotions analyzed here) can be interpreted as a sign of the interior pressure
aspect of the FLUID IN CONTAINER metaphor. Alternatively, bulging eyes could
indicate increased body heat, which is in any case related to the SUBSTANCE IN
CONTAINER metaphor. Similarly, tightly closed eyes can suggest either the pressure
on the body-container in the stage of suppression or a bodily accompaniment of
released anger (Forceville, 2005, p. 81). Semi-closed eyes, which are
used in the BT as an indicator of grief, can be interpreted as a signal of lack
of vitality (Kövecses, 1990, p. 25).
Raised eyebrows indicate fear and grief in the tapestry, and can also be interpreted
as a representation of the interior pressure aspect encoded in the SUBSTANCE
IN CONTAINER metaphor. Furthermore, a mouth emphatically closed (13
occurrences) could be indicating that the character is preventing his fear or his
anger from coming out of the body/container, whereas a mouth emphatically
wide and large is, again, an indicator of lack of vitality.

4.2 Gesture

Downcast face and lowered shoulders and arms are another manifestation of the
lack of vitality that accompanies grief, and square with the general SADNESS IS
DOWN metaphor.
As for trunk backwards and hunched shoulders, both are to be connected to the
figurative expressions FEAR IS TURNING BACK and FEAR IS BECOMING SMALLER.
Furthermore, arms stuck to the body indicate that the body is rigid and hence
square with the FEAR IS CHANGE OF FLEXIBILITY metonymy (Díaz Vera, 2011).
Less frequently, paralysis is represented by hanging limbs in the BT.
Long neck and extended arms can be metaphorically motivated by the mapping
ANGER IS AN OPPONENT IN A STRUGGLE (Kövecses, 1990, p. 21), as they imply
an increase in body size. Similarly, pointing towards the source of anger with the
index finger of the right hand is a sign of aggressive behaviour, whereas pointing
towards the source of fear with the index finger of the left hand indicates a defensive
reaction and, consequently, illustrates the FEAR IS AN OPPONENT metaphor.
Finally, open hands are another indicator of lack of vitality, a recurrent metaphor
for grief.

4.3 Body size

Finally, as in the case of hand and finger position, body size is to be related to
the aggressive and defensive behaviours that accompany, respectively, anger and
fear, and squares well with the ANGER/FEAR IS AN OPPONENT metaphors.

5. Conclusion

In this paper, I have examined the different ways anger, fear and grief are rep-
resented in the Bayeux Tapestry. As a result of this analysis, I have reconstructed
a whole set of pictorial signals used to represent these three emotions in the tap-
estry. The pictorial signals described in this chapter are commensurate with the
results of previous research on the linguistic expression of emotions (Díaz Vera,
2011; Gevaert, 2002; Kövecses, 2000). This set of pictorial signals explains how
medieval ‘readers’ of the BT were able to identify in a simple and straightforward
way what kind of emotion is expressed in each panel from only pictorial signals
of the types described here. Furthermore, this set constitutes a more or less rudi-
mentary sign-system where facial expressions and gesture play an important role
in helping observers to cue emotions. It should also be noted that, apparently,
none of these pictorial signals alone is able to convey the idea of fear, anger or
grief. It is the combination of different pictorial signals in the same character that
is used by medieval embroiderers in order to suggest one of these three emotions.
Conversely, as has been seen here, many of the signals described in this paper are
shared by more than one emotion, which indicates that a particular signal is not
necessarily reserved for the expression of one concrete emotion but may, in com-
bination with other signals, suggest a completely different emotion.
How precise and universal this sign-system is can only be assessed after the
model presented here has been applied to a wider number of visual represen-
tations of emotions in different historical periods and cultural areas, and the
multiple relationships between these visual representations and the correspond-
ing linguistic expressions for emotions have been reconstructed (with special
attention to metaphoric and metonymic expressions). The primary aim of this
research was to outline, on the basis of a selection of pictorial signals used in the
Bayeux Tapestry, how these signals can be distinguished, classified and interpret-
ed. I hope that this analysis has demonstrated how theorists of visual metaphor
and metonymy may benefit from the systematic examination of the use of picto-
rial signals for the expression of emotions. Furthermore, cognitivists in various
disciplines should find the existence of stable sets of pictorial signals and their use
as communication systems pertinent to their research.

Notes

1. Only four of these pictorial signals appear in Forceville’s analysis of the representation of
anger in the Asterix album La Zizanie (Forceville, 2005, pp. 75–77): bulging eyes, tightly closed
eyes, tightly closed mouth and arm position.

References

Abbott, M., & Forceville, C.J. (2011). Visual representations of emotion in manga: loss of control
is loss of hands in Azumanga Daioh volume 4. Language and Literature, 20(2), 91–112.
DOI: 10.1177/0963947011402182
Coatsworth, E. (2005). Stitches in time: Establishing a history of Anglo-Saxon embroidery. In
R. Netherton & G.R. Owen-Crocker (Eds.), Medieval clothing and textiles, Vol. 1 (pp. 1–27).
Woodbridge: Boydell & Brewer.
Danaher, D. (1998). Peirce’s semiotic and conceptual metaphor theory. Semiotica, 119(1/2), 171–
207.
Díaz Vera, J.E. (2011). Reconstructing the Old English cultural model for fear. Atlantis, 33(1),
85–103.
Eerden, B. (2009). Anger in Asterix: The metaphorical representation of anger in comics and
animated films. In C.J. Forceville & E. Urios-Aparisi (Eds.), Multimodal metaphor (pp. 246–
264). Berlin: Mouton de Gruyter.
Fesmire, S.A. (1994). Aerating the mind: The metaphor of mental functioning as bodily func-
tioning. Metaphor and Symbolic Activity, 9, 31–44. DOI: 10.1207/s15327868ms0901_2
Forceville, C.J. (2005). Visual representations of the idealized cognitive model of anger in the
Asterix album La Zizanie. Journal of Pragmatics, 37, 69–88.
DOI: 10.1016/j.pragma.2003.10.002
Forceville, C.J. (2011). Pictorial runes in Tintin and the Picaros. Journal of Pragmatics, 43, 875–
890. DOI: 10.1016/j.pragma.2010.07.014
Fowke, F.R. (1913). The Bayeux Tapestry: A history and description. London: G. Bell & Sons.
Foys, M.K. (2003). The Bayeux Tapestry digital edition. Woodbridge: Boydell & Brewer.
Gevaert, C. (2002). The evolution of the lexical and conceptual field of anger in Old and Middle
English. In J.E. Díaz Vera (Ed.), A changing world of words: Studies in English historical
lexicology, lexicography and semantics (pp. 275–299). Amsterdam: Rodopi.
Kövecses, Z. (1986). A figure of thought. Metaphor and Symbolic Activity, 6, 29–46.
DOI: 10.1207/s15327868ms0601_2
Kövecses, Z. (1988). The language of love: The semantics of passion in conversational English.
Lewisburg, PA: Bucknell University Press.
Kövecses, Z. (1990). Emotion concepts. New York: Springer-Verlag.
Kövecses, Z. (2000). Metaphor and emotion. Cambridge: Cambridge University Press.
Kövecses, Z. (2005). Metaphor in culture: Universality and variation. Cambridge: Cambridge
University Press. DOI: 10.1017/CBO9780511614408
Lakoff, G. (1987). Women, fire, and dangerous things: What categories reveal about the mind.
Chicago: University of Chicago Press. DOI: 10.7208/chicago/9780226471013.001.0001
Lakoff, G., & Johnson, M. (1980). Metaphors we live by. Chicago: University of Chicago Press.
Lakoff, G., & Kövecses, Z. (1987). The cognitive model of anger inherent in American English.
In D. Holland & N. Quinn (Eds.), Cultural models in language and thought (pp. 195–221).
Cambridge: Cambridge University Press. DOI: 10.1017/CBO9780511607660.009
McCloud, S. (1993). Understanding comics: The invisible art. Northampton, MA: Kitchen Sink
Press.
Messant, J. (1999). Bayeux Tapestry embroiderers’ story. Thirsk: Madeira Threads.
Musset, L. (2005). The Bayeux Tapestry. Woodbridge: Boydell & Brewer.
Stefanowitsch, A. (2004). Happiness in English and German: A metaphorical-pattern analysis.
In K. Achard & S. Kemmer (Eds.), Language, culture, and mind (pp. 137–149). Stanford:
CSLI.
Stefanowitsch, A. (2006). Words and their metaphors: A corpus-based approach. In A.
Stefanowitsch & S. Th. Gries (Eds.), Corpus-based approaches to metaphor and metonymy
(pp. 63–105). Berlin/New York: Mouton de Gruyter. DOI: 10.1515/9783110199895
Stenton, F. (1957). The Bayeux Tapestry. A comprehensive survey. London: Phaidon.
Talbot, B. (2007). The history of the British comic. The Guardian Guide, September 8, 2007,
p. 5. <http://forbiddenplanet.co.uk/blog/2011/history-of-british-comics-2/> (Accessed 10
December, 2011).
Wilson, D.M. (1985). The Bayeux Tapestry. London: Thames and Hudson.
Wright, P.P. (1999). Hastings. Gloucestershire: Windrush.

Approaching the utopia of a global brand
The relevance of image schemas as multimodal
resources for the branding industry

Lorena Pérez Hernández


University of La Rioja

Increasingly global markets impose strains on the branding industry for the
design of trademarks with a worldwide appeal. This paper explores the potential
benefits of the exploitation of embodied schemata for this purpose. A corpus of
international automobile brands is analyzed in search of the image schemas at
work in the conceptualization of different car categories (i.e. minis, family cars,
sports cars, and off-road 4 × 4s). Our findings evince that, together with other
well-known strategies (i.e. sound symbolism), multimodal image schemas can
be added to the inventory of branding tools which help to imbue brands with a
globally comprehensible semantics. In the context of branding, it is also attested
that the structure of the general schemas is fleshed out through their interaction
with the most salient attributes of the target product/service named by a partic-
ular brand, rather than in relation to other contextual or cultural facts.

Keywords: branding, Cognitive Linguistics, image schemas, metaphor,
metonymy

1. Introduction

Creating a new brand for the already crowded present-day market is a complex
task, as shown by the increasing number of branding and naming companies
offering such services. New brands must not only comply with basic marketing
needs (i.e. ease of pronunciation/spelling, lack of negative unfortunate semantic
associations, displaying a catchy and distinctive character, etc.), but they are also
expected to be meaningful to consumers worldwide. This paper explores the role
of image schemas and related multimodal image schematic metaphors and me-
tonymies in the process of creating globally-valid trademarks.

doi 10.1075/bct.78.05per
© 2015 John Benjamins Publishing Company

In order to connect with potential buyers brought up in different cultures,
and speaking different languages, branding professionals have gradually departed
from purely descriptive brands. Simultaneously, they have turned their attention
to certain resources that yield more evocative, cross-culturally significant
trademarks. One such strategy hinges on the ability of phonemes to invoke spe-
cific feelings and notions, thus causing similar emotional reactions in the audi-
ence regardless of their diverse linguistic or cultural backgrounds (Klink, 2001;
Lowrey & Shrum, 2007). As noted in Pereltsvaig’s (2011) interesting article on the
use of sound symbolism as a branding strategy, obstruent sounds like [t] and [k]
in names such as taketa are perceived as harder and sharper and hence better suit
angular-shaped patterns and objects, while the sonorants [n], [l], [m] in naluma
are perceived as softer and smoother, more suited to name rounder shapes.
The semantic effects derived from the sound symbolism of brand names are,
however, somewhat limited and the branding industry remains in need of alternative
tools for the purpose of expediting the creation of universal brands.
In this regard, the multimodal texture of cognitive operations, such as conceptual
metaphors and metonymies, has already proved useful (Forceville, 2007,
2008; Koller, 2009; Hidalgo-Downing & Kraljevic-Mujic, 2011). The recent litera-
ture shows how such conceptual operations have made their way into trademark
logos and marketing campaigns, allowing the branding specialist to overcome the
limitations of a particular language and, in turn, to reach a more global market. A
significant number of such mappings, however, are closely tied to specific cultures
and, even when adopting a visual mode, their semantics may fail to be universally
understood.
More recently, cognitive approaches to advertising have also taken into con-
sideration the workings of image schemata: bodily-grounded recurring patterns
derived from our sensory-motor interaction with the surrounding environment.
Image schemas have an experiential basis and, as a consequence, they are largely
pervasive across cultures and languages. The literature on this topic is to date
fairly scarce, but includes related proposals such as Forceville’s (1998), Cortés de
los Ríos’ (2001), and Ortiz’s (2010) research on how primary metaphors (i.e. those
whose primary source concept is an image schema) might generate visual constructs
often used in pictorial advertising. Shifting the focus from image schemas
as source domains of primary metaphors to image schemas as explanatory tools
in themselves, Núñez Perucha (2003) reinterprets the whole communicative domain
of persuasion in terms of the force image schema. In her account, image schemas
are used as persuasive strategies by means of which the advertiser tries “to
force” the receiver to buy. More specifically, Umiker-Sebeok (1996) and Velasco
Sacristán and Cortés de los Ríos (2009) explore how image schemas are used in
the construction of gender spaces, as well as with the purpose of introducing sex-
ism, in advertising. In a similar vein, Felices Lago and Cortés de los Ríos (2009)
make use of image schemas in their account of the dominant values in print ad-
vertisements announcing different types of environmentally friendly products
and services from energy (oil, electricity, etc.) and heavy industry corporations.
In this paper, we further contend that image schemas, and image-schematic
mappings (Johnson, 1987; Lakoff, 1987; Lakoff & Johnson, 1999), by virtue of
their aforementioned pre-conceptual, experiential, and cross-cultural nature, also
qualify as an efficient, yet still largely overlooked, tool for the more specific pur-
pose of creating global brands (Pérez Hernández, 2011). In addition, this paper
argues that the universal nature of image schemas can be maximized through
their multimodal expression. In fact, the systematic exploitation of multimodal
image schemas in the process of brand design is expected (1) to ease the gen-
eration of global brands and (2) to increase their suggestiveness and conceptual
richness, both through the fixed implications of the internal logic of the sche-
mas, and also via metaphoric and metonymic extensions making use of those
embodied schemata as their source domains. With this aim in mind, a corpus of
international automobile brands/logos is analyzed in search of the image schemas
at work in their conceptualization; the extent of their use and its multimodal
nature is assessed; and a description of the semantic shades added by each image
schema is offered.
In so doing, our aim is to add one more strategy to the inventory of multimod-
al resources for the design of internationally appealing and meaningful brands.
Additionally, our research into the image-schematic foundations of brands al-
lows a parallel incursion into the current debate about the situatedness of image
schemas themselves (Gibbs, 1999; Johnson, 2005; Kimmel, 2005, 2008; Sinha &
Jensen de Lopez, 2000). In the context of branding, our findings suggest that the
structure of the general schemas is fleshed out through their interaction with the
most salient attributes of the target product/service named by a particular brand,
rather than in relation to other contextual or cultural facts.
The outline of this paper is as follows. Section 2 tackles the issue of the defini-
tion and scope of image schemas, and image-schematic mappings. Additionally,
the current debate about the universal versus contextually-situated nature of em-
bodied schemata is revisited from the perspective of the branding industry and
the specific needs of brand designers. Section 3 describes the corpus of analysis.
Section 4 offers a description of the main image schemas at work in four automo-
bile categories. Finally, Section 5 considers the implications of our findings for the
process of brand design and provides some suggestions for future research.

2. Image schemas: Definition, scope, and relevance for the branding industry

In this paper we adopt the original definition of image schemas as mental
patterns of bodily and perceptual (including kinetic) experience (Johnson, 1987;
Lakoff, 1987):1 a category of “recurring patterns of our sensory-motor experience
by means of which we can make sense of that experience and reason about it”
(Johnson, 2005, pp. 18–19). As pointed out by Lakoff and Johnson (1999), this
type of meaning structure constitutes a sort of “cognitive unconscious”, operating
beneath the level of our perceptive awareness. Furthermore, given that all human
bodies share several quite specific sensory-motor capacities constrained by their
own size and constitution, as well as by the common traits of the diverse environ-
ments they inhabit, image schemas are expected to be the same across different
cultures. Thus, through the relative bilateral symmetry of our bodies we experi-
ence the image schema of left-right; our constant interaction with containers
leads us to structure such experiences in terms of the container schema; etc.
It is precisely their directly meaningful, preconceptual, and universal nature
that emerges as an interesting trait for the branding industry. If the essential
embodied semantics of image schemas could somehow be incorporated into the
content of brands, the latter would by default convey a basic, cross-culturally
shared meaning, which could be straightforwardly apprehended by consumers
worldwide.
This global semantics of trademarks would then be necessarily enriched and
completed through their embeddedness within specific cultural, emotional, and
interactional experiences. As Gibbs (1999), Johnson (2005), and Kimmel (2005,
2008), among others, have rightly argued, attending only to the structure of image
schemas as recurring patterns of organism-environment sensory motor interac-
tions and ignoring their nonstructural, more qualitative aspects stemming from
socio-cultural facts, prevents us from accounting for their variation both across
cultures and in situated cognition. Different instantiations of the container
schema, for example, illustrate the need to “put flesh on image-schematic skel-
etons” (Johnson, 2005, p. 27). More specifically, there is a varied range of feel-
ings associated with the actual experience of containment: from the coziness and
warmth caused by a tight hug to the constraint prompted by the confines of a
small room; and from the freedom or, in some cases, fear people may feel when
leaving a closed area to the phobia or the sense of autonomy which can derive
from entering an open territory.
The present account of the image-schematic foundation of brands does not
entertain as one of its goals the description of related “situated image schemas”.2
Since the branding industry focuses on the generation of globally understandable
messages, our analysis is geared towards those universal aspects of image schemas
serving this purpose. Nevertheless, it would be unwise to overlook the mounting
evidence in favor of the situated and context-sensitive dimension of embodied
schemata. As far as the branding industry is concerned, our findings indicate that
it is possible to strike a balance between the use of overly general schemas devoid of
anything context-bound, on the one hand, and the positing of isolated, culturally-
tied, situated schemas devoid of a universally shared semantic core, on the other.
This can be achieved by limiting the situated enrichment of the initial general
schemas to those facts deriving from the key attributes of the product/service un-
der consideration. In other words, the flesh needed to complete the image-sche-
matic skeleton upon which a particular brand is built stems from the interaction
between the structure of the image schema at work and the concrete ‘affordances’
offered by the target product which the brand names. By way of illustration, the
force image schema, in the context of automobile brands, may be instantiated
differently in relation to a sports car, in which case those aspects of the schema
related to ‘speed’ are activated; or in relation to a jeep, whose affordances bring to
the forefront those features of the force schema related to the notion of ‘power’.3

3. Data selection

Which image schemas underlie the semantic configuration of different car
brands? Do these schemas relate to and help to communicate the key attributes
(i.e. affordances) offered by different car categories? The data collected to answer
these questions include automobile brands (and related logos) corresponding to
vehicles manufactured in different countries, but marketed with a global audience
in mind.4 More specifically, this study is based on a corpus of 200 brand names
and logos corresponding to four different categories of cars. We have made use of
a simplified typology borrowed from the European New Car Assessment Programme
(Euro NCAP),5 thus distinguishing between minis, family cars, sports cars, and
off-road 4 × 4s. Executive/luxury cars have not been included in the analysis as an
independent category due to their cross-sectional nature. Rough descriptions of
each subgroup and their associated key attributes are offered below:
(1) Minis are those vehicles whose physical dimensions and engine capacities are
smaller than those of family cars. Their key attribute is that of “small dimensions”,
from which other positive traits derive (i.e. “nimbleness in traffic” and
“ease of parking”, among others). E.g. Smart.
(2) Family cars are normally-sized cars suitable to carry a whole family. Their
essential key attributes are “space”, “comfort”, and “safety”. E.g. Ford Focus.
(3) Sports cars are medium-size, usually two-seat, two-door vehicles designed for
high speed driving. Their focal affordances include those of “speed”, “smooth
driving”, and “high performance”. E.g. Audi TT.
(4) Off-road 4 × 4s: Basic key attributes characterizing off-road 4 × 4s are “power”,
“solidness”, and “robustness”. E.g. Land Rover.

Plate 1. Suzuki and Suzuki Jimny logos. Reproduced with permission.

A particular car brand is always a conglomerate of elements with varying degrees
of complexity. In most cases it includes the general brand/logo representing the
manufacturer plus the more specific brand/logo for the specific car (e.g. Suzuki
Jimny).
The general brand usually serves a referential function, helping customers to
identify and remember the manufacturer and the central characteristics of their
range of cars (economical vs. luxury cars; urban vs. off road, etc.). The specific
brand highlights the particular traits of each model. Some companies, though,
have traditionally specialized in the production of a precise type of vehicle, as is
the case with Jaguar, which is readily associated with sports cars. As shown in the
ensuing discussion, the general or umbrella brand of these manufacturers is gen-
erally richer in detail, reflecting most of the key attributes of the specific models
by itself.
The image schemas at work in the brands under study have been analyzed in
order to gather evidence supporting (1) the relation between the product category
and the image schema used in their logo/text characterization; (2) how well the
image schema relates to the affordances or key attributes of each product catego-
ry; (3) the existence of cognitive dissonance between the basic meaning provided
by the schema and the nature of the product category.

4. Image schemas at work in automobile brands and logos

4.1 Case study 1: Minis

Minis display characteristically small dimensions, and therefore, “size” and the
related attribute schema (small-big) become essential to their marketing. In
spite of the fact that some cultures may share primary metaphors which assign an
axiological value to this notion (i.e. important is big/unimportant is small),
the fact is that smallness, in itself, cannot be considered negative. As shown in
Pérez Hernández (2011), the Propositional Idealized Cognitive Model (ICM) of Size
captures a shared feeling, among Western cultures, that small entities are harmless,
charming, and therefore, desirable. In the context of automobiles, smallness
is associated with nimbleness in traffic and ease of parking, both positive traits.
Alternatively, small dimensions can also correlate in this context with lack of space
and discomfort. The logos of minis show an interesting exploitation of the image
schema of attribute (big-small), leading the audience to focus precisely on their
most salient attribute (smallness), while at the same time somewhat minimizing its
potential negative implications. Our data reveal that this may be achieved in three
different ways:

a. Through the use of a proportionally bigger typeface size. Big logos/letters do
not equal big cars, but in the context under consideration they may lead the
audience to reconsider the size of the car in a more positive light. Significant
examples are those of Mini ONE, Ford KA, Mitsubishi COLT, and
Citroën C1 and C2 series, where the brand name typeface visually stands out
as bigger than that of the manufacturers’ brand itself.
b. Through the exclusive use of lowercase letters which, by iconically resembling
the actual smallness of the car, may also function as potential cues pointing to
the ICM of Size. According to this cultural model small entities can either be
perceived as harmless, charming and desirable, or alternatively as worthless
and unimportant. In the context of a marketing campaign, the audience is by
pure logic led to entertain the positive interpretation. Brands which exploit
the image schema of attribute (big-small) in this way are, among others,
those of Kia picanto and smart.
c. Through the alternation of capital and lowercase letters, or different typeface
sizes for each of the letters of the brand name. This is the case with Mitsubishi
i-MiEV, Hyundai i1O, Peugeot iON, and Alfa Romeo MiTo. In the context of
minis, such alternation of big-small typeface sizes may set off a parallel inter-
pretation of the actual size of the car (i.e. small, but big: small, but probably
more spacious than it seems).

In all cases, the positive affordances of small cars are still preserved, but the nega-
tive ones (i.e. lack of space/comfort) may somehow be challenged by the interac-
tion of the image schema of attribute (hinted at by the diverse visual features
of the brand name) and the ICM of Size, on the one hand, and the positive in-
terpretation which is, by default, expected in a marketing/branding context, on
the other.
As shall become apparent in Sections 4.2 and 4.4, the attribute schema will
be found at work in different, sometimes diametrically opposed, car categories
such as minis, family cars, and 4 × 4s. The image schema of attribute describes
basic properties of objects related to their size, strength, weight, and tempera-
ture, among others. Most of these properties involve a scale between two extreme
points (i.e. big-small). Which of these gets to be activated, or to what degree,
will be determined by the intrinsic properties of the target product in interaction
with the cue provided by the visual inputs included in its brand and/or marketing
campaign. This cognitive process, known as parametrization (Ruiz de Mendoza,
2011), consists in adapting the basic conceptual layout provided by the expression
to other textual and contextual clues (i.e. in the case of branding, the relevant con-
text is provided by the nature of the target product itself). If we consider a wine
brand such as Imperial, for instance, it is our knowledge that emperors lived in a
world of luxury that allows us to interpret this wine as a high quality product in
terms of taste and aroma. If the same brand were to be used to name a horse or a
car instead, its parametrization would trigger different interpretations, probably
along the lines of a purebred, competitive horse, and of a luxurious and expensive
car, respectively. The same brand (i.e. linguistic cue) is parametrized differently
depending on the product that it names. Likewise, the use of the attribute
image schema as a branding strategy may be parametrized as required depending
on the nature of the target product (i.e. car category). Different visual inputs will
serve as cues for the achievement of a felicitous parametrization.

4.2 Case study 2: Family cars

Specific brands and logos of family cars stand out as the most neutral in our cor-
pus. They make a standard use of capitalization and typeface size. Alternations
of capital/lowercase letters or the exclusive use of small letters observed in
Section 4.1 are not recurrent features of the brands in this segment.6 Thirty-four
out of a total of 55 family car brands are rendered in full capital letters, a feature which
is largely compatible with the key attribute “space”, typical of this car category,
and which points to a potential exploitation of the image schema of attribute
(big-small). The rest are presented with an initial capital letter as is mandatory
for proper names, thus making no relevant contribution to the semantics of the
brand as far as the key attribute of “space” is concerned.
In most cases, branding specialists rely on the manufacturers’ general brand
for the commercialization of family cars. When considering the embodied foun-
dations of general automobile trademarks, our analysis reveals that the most
recurrent image schema is that of container, one of the most ubiquitous and
central cognitive templates. It comes as no surprise that this schema is so pervasive
in this context, vehicles being containers in themselves and exhibiting the
same structural elements: an interior, an exterior and a boundary. With the only
exceptions of Mitsubishi and Citroën, all the brands in the subcorpus of family
cars display schematic container forms, inherited from their corresponding general
manufacturers’ brand logos. Most of them display one-dimensional forms of
enclosures, such as a circle or an oval figure. A few make use of other, more elaborate
kinds of closed areas. By way of illustration, consider the logos for BMW,
Honda, and Lexus (Plate 2).

Plate 2. BMW, Honda, and Lexus logos. Logos reproduced with permission.
All Lexus logos, trademarks, service marks and copyrights are solely and exclusively
owned by Toyota Motor Corporation.

The semantics of these brands automatically partake of the internal logic of
the container image schema, especially of its first entailment, which is fully
compatible with one of the key affordances of family cars (i.e. safety/protection):
(i) The experience of containment typically involves protection from, or resistance to,
external forces (Johnson, 1987, p. 22).
In the context under scrutiny, this entailment straightforwardly maps onto the
sense of safety and protection against forceful impacts or crashes, which the auto-
mobile industry aims to convey through its brands and marketing campaigns.
In addition, the closed areas found in most logos can also be interpreted in
terms of the cycle image schema, which is subsidiary to the more general path
schema. According to Peña Cervel (2000, p. 377), a cycle is a circular path con-
sisting of a source, a terminal point, and a directionality. Its internal logic includes
the general idea of motion along the path, which is compatible with the nature of
the product under scrutiny. It also captures a more specific consequence of mo-
tion along circular paths to the effect that once the end point of the path has been
reached, the latter becomes the source again. In the context of driving, the idea
of returning to the departing point goes hand in hand with the aforementioned
expectations of safety, thus highlighting and reinforcing this essential attribute of
family cars.

4.3 Case study 3: Sports cars

“Speed”, “smooth driving”, and “high performance” are some of the focal features
of sports cars. Our corpus reveals that the first of them finds its way into the
brands/logos of this car category through a metonymic exploitation of the image
schema of force. As Johnson points out (1987, p. 43) “our experience of force
usually involves the movement of some object (mass) through space in some di-
rection. In other words, force has a vector quality, a directionality”. In addition,
“forces have degrees of power or intensity” (Johnson, 1987, p. 43). As a car moves,
it traces a path that can be described by a force vector. In turn, the speed of the
moving car is a reliable indicator of the intensity of the force it exerts. The vector
element of the force image schema is visually cued in many of the sports cars
brands in our corpus. In some cases this is done by means of a subtle transforma-
tion of the traditionally circular/oval frames of car logos (see Plate 2) into other
types of closed areas incorporating some kind of pointed element resembling a
vector (e.g. >).
Plate 3 is a schematic representation of the type of logo frame found in some
of the most representative manufacturers of sports cars, including Porsche,
Lamborghini, Gumpert, and Ferrari. Other sports car companies, such as McLaren,
Maserati, Corvette, and Tesla, also display logos with some kind of vector image
built into them. Plate 4 offers a schematic representation of the different graphics
representing force vectors in the aforementioned trademarks.
Nevertheless, the most pervasive and subtle strategy for the metonymic activation
of the force schema in two-seater brands is by far the use of italics. 18
out of 29 brands in this category display letters whose upper part is slightly tilted
towards the right, thus suggesting a forward movement. Honda CR-Z and Hyundai
Veloster are just two examples (Plate 5).

Plate 3. Schematic representation of sports cars logo frames.

Plate 4. Schematic representations of force vectors in sports car logos.

Plate 5. Honda CR-Z and Hyundai Veloster logos. Reproduced with permission.

Plate 6. Chrysler Crossfire logo. Reproduced with permission.

The Honda CR-Z logo displays another recurrent feature of sports cars trade-
marks, namely the use of elongated letters, especially in their horizontal strokes.7
Such visual lengthening of the horizontal lines in car brand names iconically
cues the path image schema, to which the force schema itself is subsidiary, and
which, in the context under consideration, may evoke the felicitous notion of
movement towards a destination. Several other brands achieve the same effect by
linking two or several characters of the brand name with an underlying line which
again visually suggests a path. Veloster above is a good example, together with
other well-known trademarks like Ferrari, Lotus Evora, and Lexus LF-A. Opel GT
combines both strategies, so that the elongated upper parts of the G and T charac-
ters are virtually fused into a long curved stroke resembling an open and smooth
curve. As a result, the brand logo is especially apt for suggesting the third of the
essential attributes of sports cars, that is, their high performance in curved paths.
Chrysler Crossfire deserves closer attention. This brand name and its associ-
ated logo point to the image schematic gestalt of force through a double met-
onymic mapping of the effect for cause and attribute for entity types,
respectively. This brand name exhibits horizontal strokes lingering after each of
its characters as a sort of tail or hair blowing in the wind. Thus, the visual effects
of speed (hair/clothes lingering backwards in the wind) stand for their cause (i.e.
speed). In turn, speed is but one of the intrinsic characteristics of the smooth
movement of an entity (a car in the case under consideration) along a path. As
a result, this logo succeeds in conveying the central key attributes of sports cars
(i.e. “speed” and “smooth movement”) in a novel and singular way, consequently
increasing the distinctiveness of its associated brand.

4.4 Case study 4: Off-road 4 × 4s

Off-road 4 × 4s are characterized by their “power”, “robustness”, and “solidness”.
Since they are big, heavy, tough vehicles, the image schema of attribute (big-small)
plays once more a relevant role in their conceptualization. In the case of
minis, the activation of this embodied schema has been shown to highlight the
benefits of small cars by making the audience reinterpret their apparent small-
ness under a more positive light. On the contrary, off-road brands incorporating
the attribute schema do so in order to underline their sturdiness. Visually, this
effect is achieved through a generalized use of capital letters. Lowercase letters
have been partially found in only one example within this category (i.e. Hyundai
iX-35). In addition, italics, which have been shown to be a common trait of sports
cars brands, are present in only two 4 × 4s brand names (i.e. LAND ROVER and
Toyota RAV4). Still, the degree of inclination is much softer than that of the italics
found in the sports car segment. Two brands that deserve closer attention are
those of the Honda CR-V and CR-Z series. The limited use of italics in the former,
corresponding to the off-roads series, contrasts with its intensive use in the latter,
which names the sports cars category.
The absence of italics and the use of capital letters is often coupled with a
tendency towards the use of slightly broader vertical strokes and the squaring of
rounded characters, as has been observed in off-road brands like Audi Q7, BMW
X1, and Volvo S70, S80. See the Suzuki Jimny logo in Plate 1 and also Plate 8 for a
schematic representation of these features.

Plate 7. Honda CR-V and CR-Z brand logos. Reproduced with permission.

Plate 8. Broader vertical strokes and squared circular characters in off-road brand names.

Plate 9. Hyundai Tucson and Chevrolet Captiva brand logos. Reproduced with permission.

All in all, non-italicized, sturdy, capital letters turn up as pervasive features
of 4 × 4s brands. These visual cues help to parametrize the schema of attribute
(big-small) in a way compatible with the nature of 4 × 4s, and in turn, to highlight
their characteristic dimensions and robustness. Finally, an interesting exploitation
of the path and force image schemas at work in off-road brands involves the use
of broken letters.
Discontinuous paths or gaps along a path block the force exerted by the
moving vehicle and are considered impediments to travel. The broken letters in
Hyundai Tucson and the discontinuous, but still straight path running along the
Chevrolet Captiva logo visually activate these notions, which are also defining
characteristics of the type of bumpy, irregular roads on which 4 × 4s are expected
to perform. This visual allusion to the difficulties of the path and the robustness
of the chosen typeface combine to convey a clear positive message as
to the good performance of these vehicles in the adverse driving circumstances
under consideration.

5. Conclusions

While it is not possible to make full generalizations on the basis of the analysis of
a limited number of brands, the four case studies presented in Section 4 license
the formulation of the following observations, which are subject to confirmation
through further research:

1. Incorporating image schemas into the design of brands grounds them in em-
bodied experience and, as a result, may make them more extensively mean-
ingful across cultures. By way of illustration, consider the incorporation of the
container schema in many of the car brands in our corpus. Humans with
different cultural backgrounds will all share a certain general understanding
of the semantics of those brands, since they all have a common experience
of what it means for something/someone to be inside or outside a container,
especially in relation to the affordances offered by the target product (i.e. vehicle).
2. Image schemas have been shown to be incorporated into car brands mainly
through visual means. Due to its iconic, cross-cultural nature, the use of the
visual mode, either in isolation or in conjunction with others (linguistic, au-
ditory) is expected to result in the generation of brands with a wider global
scope. Special attention has, therefore, been paid to the visual aspects of the
brands analyzed in this study, including their associated logos and the font
type, size and shape of their brand names.
3. Just as generic image schemas operate within bodies which interact with en-
vironments offering specific affordances, the specific schemata found in the
semantic make-up of car brands have proved to be largely compatible
with the “affordances” offered by each specific car category. The interaction
between the schema and the product-affordances gives rise to enriched,
contextually-situated, but still globally-meaningful brands.

Further research should look for empirical confirmation of the theoretical find-
ings of the present analysis on the image-schematic foundation of car brands. In
this respect, a survey is at present being conducted among a group of multicultural
consumers in order to assess (1) to what extent car brands with specific image
schemas built into their semantics are more straightforwardly associated with
the key attributes identified by the present study, and (2) whether such an
association is established equally by all consumers independently of their cultural
and linguistic backgrounds.
Another area of interest for future studies would be to examine how image
schemas in combination with primary metaphors may allow branding specialists
to boost their brands with additional shades of suggestive meaning tied to more
abstract domains of experience.8 Given their grounding in sensory-motor physi-
cal experience, such metaphorical extensions of the meaning of brands would still
retain a conveniently cross-cultural shared semantic core.

Acknowledgements

The research on which this paper is based has been funded by the Ministry of Sci-
ence and Innovation, Spain, Project No. FFI2010-17610. This research has been
carried out within the Center for Research in the Applications of Language (CRAL),
University of La Rioja (Spain).

Notes

1. This view has found support in simultaneous research carried out within psychology and
other related fields (Damasio, 1994; Finke, Pinker, & Farah, 1989). It differs, however, from
more recent characterizations of image schemas by authors such as Sweetser (1990), Turner
(1991), and Clausner and Croft (1999). Some of these accounts envisage image schemas as
structures which are too complex to count as basic dimensions of perceptual representation.
Thus, Turner (1991, pp. 176–177) has identified image schemas for notions such as a “cup” or
a particular “phoneme”. Other proposals include within the category of image schemas certain
nonperceptual representations (i.e. schemas which are not tied to any particular aspect of bodi-
ly experience). The schemas of “complexity”, “ceasing to exist” (Turner, 1991), or “sharpness”
(Clausner & Croft, 1999) fall within this class.

2. Schemas which capture commonalities of a limited set of (typically culture-specific) expe-
riential settings, as outlined in Kimmel (2005, p. 300).

3. Image schemas do not merely exist in our brains, but rather they operate within bodies of
a particular physiological make-up which interact with environments that offer very specific
“affordances” (Gibson, 1979). An “affordance”, according to Gibson, is a pattern of potential
engagement and interaction with parts of our environment. A chair, for instance, “affords” sit-
on-ability for human beings.
In the context of marketing, the affordances offered by those products/services named by
brands are closely linked to their key attributes. Thus, a car, for example, affords driving-abil-
ity. In turn, different cars will afford diverse types of driving-ability: fast driving, safe driving,
power driving etc. Such specific affordances help to flesh out the full semantic interpretation of
the initial image schemas at work in the semantic make-up of their related brand names.

4. The choice of automobile brands as data for this study was motivated by the extensive and
varied number of image-schematic visual inputs they display, as well as by the straightforward
connection observed between the basic semantics of the image schemas of attribute, force,
and container, on the one hand, and the target attributes of the four car categories under
scrutiny, on the other.

5. The full classification of cars by Euro NCAP can be accessed through its web page at
[http://www.euroncap.com]. For the full list of manufacturers whose brands have been included in
this research, see the Appendix. Due to space constraints and copyright issues we cannot include
a visual image for every brand logo included in the analysis. The vast majority of them can be
found at the web page Brands of the World, available at [http://www.brandsoftheworld.com/].

6. Only one exception (i.e. cee’d) has been found in a list of 55 brands of family cars. This low
frequency of occurrence contrasts sharply with the generalized use of this type of strategy
observed within the category of minis.

7. This trait is remarkably evident in brands like Porsche, Lotus Elise, Mitsubishi Eclipse, Jaguar
XJ and XK, which unfortunately cannot be reproduced here due to lack of permission.

8. Johnson (2005, pp. 24–27) has amply dealt with the pivotal role of image schemas in ab-
stract reasoning. Together with Lakoff and Johnson (1980, 1999), Lakoff (1987), and Lakoff and
Núñez (2000), among others, he argues in favor of an embodied logic, which recruits body-
based image-schematic structures for the understanding of abstract concepts and the drawing
of inferences. Such mappings, labeled as “primary metaphors”, provide perceptual anchors to
target notions which can be equally basic and central to human interaction and understanding
as their corresponding embodied source domains, but which are not directly apprehensible
through our perceptual systems (Grady, 1997, 2005).

References

Clausner, T.C., & Croft, W. (1999). Domains and image schemas. Cognitive Linguistics, 10, 1–
31. DOI: 10.1515/cogl.1999.001
Cortés de los Ríos, M.E. (2001). Nuevas perspectivas lingüísticas en la publicidad impresa
anglosajona. Almería: Servicio de Publicaciones de la Universidad de Almería.
Damasio, A. (1994). Descartes’ error. New York: Grosset/Putnam.
Felices Lago, A., & Cortés de los Ríos, M.E. (2009). A cognitive-axiological approach to print
eco-advertisements in The Economist: The energy sector under scrutiny. Revista de Lingüísti-
ca y Lenguas Aplicadas, 4, 59–78. DOI: 10.4995/rlyla.2009.735
Finke, R., Pinker, S., & Farah, M. (1989). Reinterpreting visual patterns in mental imagery.
Cognitive Science, 13, 41–78.
Forceville, C. (1998). Pictorial metaphor in advertising. London: Routledge.
Forceville, C. (2007). Multimodal metaphor in ten Dutch TV commercials. Public Journal of
Semiotics, 1, 19–51.
Forceville, C. (2008). Pictorial and multimodal metaphor in commercials. In E.F. McQuarrie
& B.J. Phillips (Eds.), Go figure!: New directions in advertising rhetoric (pp. 272–310).
Armonk, NY: M.E. Sharpe.
Gibbs, R.W. (1999). Taking metaphor out of our heads and putting it into the cultural world. In
R. Gibbs & G. Steen (Eds.), Metaphor in cognitive linguistics (pp. 145–166). Philadelphia/
Amsterdam: John Benjamins. DOI: 10.1075/cilt.175.09gib
Gibson, J. (1979). The ecological approach to visual perception. Boston: Houghton-Mifflin.
Grady, J. (1997). Foundations of meaning: Primary metaphors and primary scenes. Ph.D. disser-
tation, University of California, Berkeley.
Grady, J. (2005). Image schemas and perception. In B. Hampe (Ed.), From perception to mean-
ing: Image schemas in cognitive linguistics (pp. 35–56). Berlin: Mouton de Gruyter.
DOI: 10.1515/9783110197532.1.35
Hidalgo-Downing, L., & Kraljevic-Mujic, B. (2011). Multimodal metonymy and metaphor as
complex discourse resources for creativity in ICT advertising discourse. In F. Gonzálvez
García, S. Peña Cervel & L. Pérez Hernández (Eds.), Metaphor and metonymy revisited
beyond the Contemporary Theory of Metaphor (pp. 153–178). Special issue of Review of
Cognitive Linguistics. Amsterdam: John Benjamins. DOI: 10.1075/bct.56.08hid
Johnson, M. (1987). The body in the mind: The bodily basis of meaning, imagination, and reason.
Chicago: The University of Chicago Press.
Johnson, M. (2005). The philosophical significance of image schemas. In B. Hampe (Ed.), From
perception to meaning: Image schemas in cognitive linguistics (pp. 15–34). Berlin: Mouton
de Gruyter. DOI: 10.1515/9783110197532.1.15
Kimmel, M. (2005). Culture regained: Situated and compound image schemas. In B. Hampe
(Ed.), From perception to meaning: Image schemas in cognitive linguistics (pp. 285–312).
Berlin: Mouton de Gruyter. DOI: 10.1515/9783110197532.4.285
Kimmel, M. (2008). Properties of cultural embodiment: Lessons from the anthropology of the
body. In R.M. Frank, R. Dirven, T. Ziemke & E. Bernárdez (Eds.), Body, language and
mind. Vol. 2. Socio-cultural situatedness (pp. 77–108). Berlin: Mouton de Gruyter.
Klink, R.R. (2001). Creating meaningful new brand names: A study of semantics and sound
symbolism. Journal of Marketing Theory and Practice, 9, 27–34.
Koller, V. (2009). Brand images: Multimodal metaphors in corporate branding. In C.J. Forceville
& E. Urios-Aparisi (Eds.), Multimodal metaphor (pp. 45–73). Berlin: Mouton de Gruyter.
Lakoff, G. (1987). Women, fire, and dangerous things: What categories reveal about the mind.
Chicago: University of Chicago Press. DOI: 10.7208/chicago/9780226471013.001.0001
Lakoff, G., & Johnson, M. (1980). Metaphors we live by. Chicago: University of Chicago Press.
Lakoff, G., & Johnson, M. (1999). Philosophy in the flesh: The embodied mind and its challenge
to western thought. New York: Basic Books.
Lakoff, G., & Núñez, R. (2000). Where mathematics comes from: How the embodied mind brings
mathematics into being. New York: Basic Books.
Lowrey, T.M., & Shrum, L.J. (2007). Phonetic symbolism and brand name preference. Journal
of Consumer Research, 34, 406–414. DOI: 10.1086/518530
Núñez Perucha, B. (2003). Esquemas de imágenes y modelos populares: Un estudio del lenguaje
de la victimización en textos narrativos en lengua inglesa. Logroño: AESLA.
Ortiz, M.J. (2010). Visual rhetoric: Primary metaphors and symmetric object alignment. Meta-
phor and Symbol, 25(3), 162–180. DOI: 10.1080/10926488.2010.489394
Peña Cervel, M.S. (2000). A cognitive approach to the image-schematic component in the meta-
phorical expression of emotions in English. PhD Dissertation. Universidad de La Rioja.
Pereltsvaig, A. (2011). “What’s in a name?”. Languages of the World (October 6, 2011).
http://languagesoftheworld.info/etymology/whatsinaname.html.
Pérez Hernández, L. (2011). Cognitive tools for successful branding. Applied Linguistics, 32,
369–388. DOI: 10.1093/applin/amr004
Ruiz de Mendoza, F.J. (2011). Metonymy and cognitive operations. In R. Benczes, A. Barcelona
& F.J. Ruiz de Mendoza (Eds.), What is metonymy?: An attempt at building a consensus view
on the delimitation of the notion of metonymy in Cognitive Linguistics (pp. 45–76). Amster-
dam/Philadelphia: John Benjamins. DOI: 10.1075/hcp.28
Sinha, C., & Jensen de Lopez, K. (2000). Language, culture and the embodiment of spatial cog-
nition. Cognitive Linguistics, 11, 17–41.
Sweetser, E. (1990). From etymology to pragmatics: Metaphorical and cultural aspects of seman-
tic structure. Cambridge: Cambridge University Press. DOI: 10.1017/CBO9780511620904
Turner, M. (1991). Reading minds: The study of English in the age of Cognitive Science. Princeton:
Princeton University Press.
Umiker-Sebeok, J. (1996). Power and the construction of gendered spaces. International Review
of Sociology, 6(3), 389–403. DOI: 10.1080/03906701.1996.9971211
Velasco Sacristán, M.S., & Cortés de los Ríos, M.E. (2009). Persuasive nature of image schemat-
ic devices in advertising: Their use for introducing sexisms. Revista Alicantina de Estudios
Ingleses, 22, 239–270.

Appendix

Car manufacturers included in the analysis:

Alfa Romeo, Audi, BMW, Chevrolet, Chrysler, Citroën, Dacia, Daewoo, Daihatsu, Dodge,
Ferrari, Fiat, Ford, Honda, Hyundai, Jaguar, Jeep, Kia, Lancia, Land Rover, Lexus, Lotus,
Mazda, Mercedes-Benz, MG, MINI, Mitsubishi, Nissan, Opel, Peugeot, Porsche, Renault,
Toyota, VOLVO, Volkswagen.


Multimodal metaphors
in political entertainment

Diana E. Popa
Dunarea de Jos University of Galati

Metaphor is one of the primary ways we accommodate and assimilate informa-
tion and experience into our conceptual organization of the world. The present
study investigates the internal cognitive mechanisms of multimodal-metaphor
construction in the television genre of animated political cartoons. Taking a
cognitive-semantic approach, we analyze how zoomorphs are constructed by
the audience when they first appear. The study concludes by describing the po-
tential of multimodal-metaphor analysis as a methodological tool.

Keywords: multimodal metaphor, political cartoons, cognitive mechanisms

1. Introduction

Metaphors are the essential core of human thought and creativity (Bronowski,
1972, p. 108). Seen as a central feature of human cognition that has evolved with
the development of language (Pinker, 1993), the ability to conceptualize one en-
tity in terms of another allows us to communicate through metaphor (Hart &
Long, 2011, p. 53).
From Richards (1936) to Black (1962) and Lakoff and Johnson (1980), theo-
rists have asserted metaphor’s irreducible cognitive force and the fact that many
of our actions are based on our metaphorical conceptions. A metaphor has such
cognitive force not because it provides new information about the world but be-
cause it reconceptualizes information that is already available (Kittay, 1989; Mio
& Katz, 1996). Moreover, we most often use metaphors to understand and experi-
ence the intangibles of a culture (Kövecses, 2005, p. 2). Metaphors, therefore, are
integral to public life, particularly to politics, not only at the verbal level but also
visually (Punter, 2007, p. 43).

doi 10.1075/bct.78.06pop
© 2015 John Benjamins Publishing Company

The combination of verbal discourse with pictorial/visual and auditory meta-
phors has resulted in the fairly new and still-under-researched category of mul-
timodal metaphors, which have been defined as “metaphors whose target and
source are rendered exclusively or predominantly in two different modes/modali-
ties” (Forceville & Urios-Aparisi, 2009, p. 4). The modes may involve written or
spoken language, static or moving images, music, sound, or gesture.
In political life, the dynamics and consequences of politics are neither tan-
gible, self-evident, nor simple (see Thompson, 1996, p. 186). Thus, multimodal
metaphor functions as a link between the individual and the political by “provid-
ing a way of seeing relations, reifying abstractions, and framing complexity in
manageable terms” (Thompson, 1996, p. 186).
The aim of the present paper is to shed light on the issue of multimodal meta-
phor in political entertainment, with special attention to two distinct aspects:

1. the ways in which the verbal, visual and auditory modalities employed con-
tribute to the construal of the multimodal metaphor; and
2. the functions of multimodal metaphors in political entertainment, particu-
larly in animated political cartoons, as they are constructed in the examples
under analysis.

By approaching the issue of multimodal metaphor in the institutionalized genre
of animated political cartoons within a cognitive-semantic framework, we hope
to contribute to the growing body of literature investigating the idea that meta-
phors can occur nonverbally and multi-modally as well as purely verbally (see
Forceville, 1996, 2002; Forceville & Urios-Aparisi, 2009 and references therein;
this volume). In pursuing that central idea, we use the following definition of
metaphor:

1. a metaphor is an identity relationship created between two phenomena that,
in the given context, belong to different categories (Bounegru & Forceville,
2011, p. 213);
2. the phenomena are to be understood as target and source, and the mapping of
features between them is unidirectional, i.e., non-reversible (Forceville, 2002,
p. 5);
3. at least one characteristic or connotation associated with the source domain
is to be mapped onto the target domain, and, in many instances, all possible
connotations are to be mapped (Bounegru & Forceville, 2011; Punter, 2007).

Section 2 provides a more detailed discussion of the relationship between poli-
tics and entertainment, focusing on the fact that the humorous effect of political
entertainment is to provide an opportunity for socially acceptable laughter. Sec-
tion 3 describes the constraints and style of animated political cartoons in more
detail. Sections 4, 5 and 6 are dedicated to outlining the criteria used to select
the corpus and the sample for the current study as well as the analytical method
employed. Finally, Section 7 includes concluding remarks and a discussion of po-
tential topics for further research in the field of multimodal metaphor.

2. Politics and entertainment

Since the writings of Walter Lippmann (1922/1965), it has come to seem obvi-
ous that politics is too complex and abstract to be experienced directly by ordi-
nary people. For many people, it is easier to access the political world (Mio, 1997,
p. 114) through television, which has been the site where politics and public life
are played out and where the meanings of public life are generated and debated
(Craig, 2004, p. 4). Political entertainment in the media, particularly on television,
can therefore serve two distinct, but not necessarily mutually exclusive, purposes
in addition to its overtly expressed mission to entertain. The first possible purpose
is to oversimplify politics and its internal processes so that it can reach a much
broader audience, perhaps even reaching the passively uninterested. The second
possible purpose is to deliberately manipulate public opinion or, at the least, to
influence public perceptions of certain politicians and their political acts. Either
of these effects, however, may stimulate citizen engagement with politics only tex-
tually rather than in an organizational or participatory manner (see Jones, 2010;
Popa, 2011a).
Political entertainment is no exception to the uses-and-gratifications model
developed by Pierce, Beatty, and Hagnar (1982), which asserts that people get
what they want to get out of the media. Therefore, if people watch political en-
tertainment to understand politics and politicians and their actions, humorous
political programs will have an effect on their political awareness. If people con-
sume political entertainment to confirm their preexisting opinions, the programs
will not have any effect on their opinions. However, there may be people who
enjoy political entertainment purely for its entertainment value. For such view-
ers, political entertainment functions as an institutionalized genre that exposes
politicians’ foibles and hypocrisy but also guarantees that the target of the humor
will not truly be injured. The relief function of such conventionally permitted
laughter is also supported by the fact that the target of the humor is a member of
a different class than the audience. Thus, the humorous effect of political enter-
tainment is triggered both by the internal humor devices of a program and, more
importantly, by the very fact that politicians, as a class, are being disparaged. In
this way, humor strengthens in-group solidarity for those who share the same
political views.

However, despite the fact that trivialization, personalization and commercial-
ization are important phenomena in contemporary media (Popa, 2007, p. 494),
not everyone has access to political entertainment because more often than not,
appreciating political entertainment requires a sophisticated sense of both poli-
tics and humor. Political sophistication is determined by the combination of a
subject’s political interests and his or her consumption of political information
(see Bosman & Hagendoorn, 1991). Sophisticated humor is to be understood in
Raskin’s (1988) sense of the term, as involving both limited-access, or allusive, knowl-
edge and complex processing. Sophisticated humor is hence based on the use
of nuanced devices, such as metaphor, and requires background knowledge to be
understood, as opposed to the use of less sophisticated devices such as exaggera-
tion and repetition.

3. Animated political cartoons as a genre

At a time when television and the Internet are predominantly visual media, politi-
cal cartoons have adapted to a much more dynamic format that can easily com-
bine bright, funny cartoons with hard-hitting political analysis to appeal to all
types of audiences.
As Keefe (2004) and Fiore (2004) note, the attraction of this genre lies in its
graphics: the motion, sound, color and interactivity of animated political
cartoons set them apart from their print siblings. The goal of any cartoon, wheth-
er print or animated, is, as Fiore (2004, p. 5) notes, to get a message across in an
engaging and entertaining way. The message comes first and the humor second,
but ideally, these elements affect the viewer simultaneously.
Moss (2007) highlights the fact that, despite technological progress, the
themes of political cartoons have remained unaltered; they all involve domes-
tic politics, social themes and, in some cases, foreign affairs. Most commonly,
cartoons address a current political issue or event, a social trend, or a famous
personality in a way that takes a stand or presents a particular point of view (El
Refaie, 2009a, pp. 184–185).
Tsakona (2009, p. 1172) notes that, because of their condensed form and the
interaction between language and image, cartoons are often considered to be a
direct and easily processed means of communicating a message and influencing
public opinion. As part of televised political culture, animated political cartoons
are a medium that combines satirical verbal discourse with a visual representation
of opinion.

In the specific context of Romania, but in an analysis that can easily be ex-
tended to other cultures, Popa (2011a, p. 140) identifies four distinct functions of
animated political cartoons:

1. they function as a tool for deliberation on present conditions;
2. they constitute a source of information, and they sometimes bluntly commu-
nicate what no other medium could openly express;
3. they contribute to an ongoing scrutiny of public life; and
4. if nothing else, they function as a medium for protest and critique.

As Teng (2009, p. 207) notes, one of the generic conventions of cartooning is the
inclusion of a critical stance toward a particular socio-political situation, event
or person. Thus, an individual seeking or occupying a position of authority, the
institution housing that position, the policies that the individual promotes or the
political system as a whole may all be targeted by animated political cartoons.
However, cartoons are not themselves obligated to provide a viable solution for
the problem identified.
At a functional level, print and animated cartoons operate in similar ways;
however, with respect to their internal mechanisms, there are noticeable differ-
ences between the genres. In print cartoons, pictorial metaphor and its more
complex form, the visual pun (as understood by Attardo and Chabanne (1992), Hempelmann and
Samson (2007), and Mitchell (2007)), seem to
be the most common mechanisms of humor; in animated political cartoons, one
of the most common mechanisms is the multimodal metaphor. This is not surpris-
ing given that, as Edelman (1971, p. 65) argues, metaphors are generally devices
for simplifying and giving meaning to complex or confusing sets of observations
that evoke concern. Because the field of politics is complex and confusing, the
use of multimodal metaphor in animated political cartoons is a way of explaining
the significance of real-life events and characters through an imaginary scenario
(El Refaie, 2009b, p. 176; Schilperoord & Maes, 2009). An imaginary scenario that
relies on an extended metaphor (see, among others, Crisp, 2005 and references
therein) that, in turn, sets up blended spaces (Fauconnier & Turner, 2003, p. 39)
results in both the creation of new meaning and the processing of information.
This effect may occur because metaphors, on the surface, simply draw a compari-
son between one thing and another, but in a more subtle way, they usually imply
an entire narrative and a prescription for action (Stone, 1988, p. 118).
The prescription for action may also be determined by the fact that meta-
phor, whether multimodal or monomodal, is argumentative (Moss, 2007;
Schilperoord & Maes, 2009). Consequently, metaphor can facilitate the persuasion
process through the exchange of information, further influencing the attitudes of
the audience. Although it does so implicitly, metaphor also derives its argumentative
force from its ability to convey both cognitive and emotional meaning within a
single framework. Cognitively, metaphor creates a feeling of enlightenment
during the processing of information; emotionally, it indirectly influences
attitudes by establishing a connection between the sender and the receiver
of the information.

4. The corpus

The sample analyzed in this paper was chosen from a larger corpus consisting
of 14 episodes of the 2007–2008 season of The Animated Planet Show (originally
titled Animat Planet Show). The script from 230 minutes of 2D computer anima-
tion was manually transcribed, stored and archived. Although the program was
discontinued at the end of 2008, most of the 2007–2008 season is still freely avail-
able at www.youtube.com.
As with all print and animated political cartoons, the show is meant to be
amusing, but its purpose is not solely humor for its own sake. Political cartoons
include a stance toward a particular socio-political situation, event or person, and
The Animated Planet Show is no exception. However, the show can only aim to af-
fect points of view, beliefs and perspectives on those socio-political affairs, not to
change or influence behavior. Thus, as with non-animated political cartoons, this
Romanian television program operates on two distinct levels: it tells an imaginary
story about a make-believe world, and, concomitantly, it refers to real-life events
and characters. The relationship between these levels of meaning is essentially
metaphorical.
We decided to select our examples from this particular corpus because the
program was representative of its genre in terms of structure and was successful
as indicated by its record-high ratings during its run (for a detailed analysis of the
corpus as well as a comprehensive description of the television phenomenon in-
ternationally, see Popa, 2011a, pp. 140–145).

5. Sample selection

We selected the specific examples to be analyzed based on the following criteria.

1. Formal criteria: we targeted examples that included multimodal metaphors
(in El Refaie’s (2009b, p. 191) sense of the term).
2. Content-related criteria: we chose examples whose target domain was politi-
cians as public figures, itself one of the common topics of animated political
cartoons.
3. Practical criteria: we favored Romanian examples of the genre over their in-
ternational equivalents because we believed that this provided the greatest
likelihood that we would be able to understand as precisely as possible the
public events and persons referred to in the cartoons. However, we do not
claim to have achieved an exhaustive analysis, partly as a result of space
constraints and partly because the process of making meaning relies on
individual creativity and might evoke different mythic cognitive structures
for other analysts.

The level of abstraction in the examples to be analyzed was perhaps less than that
found in most printed cartoons for the following reasons:

a. establishing the real-world referents for this animated cartoon was facilitated
by its producers in that all public figures were represented by a complex mix-
ture of real images and drawings, with photographs of their faces attached to
caricatured representations of their bodies;
b. the audience did not have to impose a narrative on the cartoon because the
narrative was explicitly and overtly constructed by a narrator or character in
the show; and
c. the text–image relationship was much more easily comprehended by the au-
dience in conjunction with simultaneous modes of cognitive processing for
sound, music, color, movement, and other modes, which were important re-
sources for signaling narrative sequencing.

The sample was taken from the episode aired on May 11, 2008. The entire episode
was 20 minutes long, which was the average duration for episodes of this pro-
gram. The examples given here are taken from one of the mini-narratives within
the episode; the mini-narrative has a duration of approximately 8 minutes and 50
seconds, including the time allocated for the passage from one mini-narrative to
another.
The root metaphor is POLITICS IS A FOREST. This metaphor is rich with con-
notations, powerful and flexible in application, yet familiar to all. The rhetorical
fantasia (Edwards, 2001) of the animal tale is used to domesticate the political
world (Conners, 2005). The program relies on animal metaphors that are deeply
rooted in popular culture. As Haslam, Loughnan, and Sun (2011, p. 312) note,
the animal kingdom is a bountiful source domain and provides a rich metaphori-
cal vocabulary. Moreover, animal metaphors are particularly vivid and therefore
reduce ambiguity and increase the likelihood that the intended meaning of a criti-
cal stance will be understood.
In the following section, we focus in more detail on the analysis of the multi-
modal-metaphor examples, paying special attention to the way the verbal, visual,
and auditory modalities employed contribute to the construction of the multi-
modal metaphor.

6. Analysis of the sample

In this analysis, we take a cognitive-semantic approach. Thus, our intent is to ac-
count for the metaphorical conceptualizations and critical stances of the example
animated cartoon in terms of what happened in that cartoon rather than in terms
of prior contextual or pragmatic knowledge, but we cannot completely ignore
the pragmatic side of the analysis because its contribution to the interpretation
is essential.
The analytical process can be described as follows.

1. We defined the conceptual content of each given frame of the cartoon by
paraphrasing the metaphor using the ‘X is Y’ template, thereby delineating
the target and source domains and the aspects to be mapped. This study fo-
cused exclusively on the ‘X is Y’ template, ignoring similes and expressions of
likeness such as ‘X is like Y’ or ‘X is as…as Y’.
2. We then determined whether and how the two domains involved in each
metaphor were realized simultaneously in multiple modalities.
3. Given the argumentative function of metaphor, we tried to reconstruct as ob-
jectively as possible the critical stance addressed by the animated political
cartoon.
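
For readers who wish to replicate this coding procedure, the three steps can be sketched as a simple annotation record. The Python fragment below is only an illustrative sketch and not part of the original study: the names MODES, MetaphorAnnotation and annotate are hypothetical, the three-way inventory of modes is an assumption based on the modalities named in this paper, and treating any metaphor cued in two or more modes as multimodal simplifies the definition quoted in Section 1.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable

# Assumed inventory of modes, taken from the modalities named in the paper.
MODES = ("verbal", "pictorial", "auditory")


@dataclass(frozen=True)
class MetaphorAnnotation:
    """One coded metaphor (hypothetical record type, for illustration only)."""
    target: str              # X in the 'X is Y' paraphrase (step 1)
    source: str              # Y in the 'X is Y' paraphrase (step 1)
    modes: FrozenSet[str]    # modes in which the two domains are cued (step 2)
    stance: str = ""         # analyst's reconstruction of the stance (step 3)

    def paraphrase(self) -> str:
        """Return the 'X is Y' paraphrase used in step 1."""
        return f"{self.target} is {self.source}"

    def is_multimodal(self) -> bool:
        """Simplifying assumption: two or more modes count as multimodal."""
        return len(self.modes) >= 2


def annotate(target: str, source: str, modes: Iterable[str],
             stance: str = "") -> MetaphorAnnotation:
    """Build an annotation, rejecting modes outside the assumed inventory."""
    mode_set = frozenset(modes)
    unknown = mode_set - set(MODES)
    if unknown:
        raise ValueError(f"unknown mode(s): {sorted(unknown)}")
    return MetaphorAnnotation(target, source, mode_set, stance)


# First row of Table 1, coded as auditory-pictorial; the stance is left
# empty here because it has to be reconstructed by the analyst.
example = annotate("the politician (Oprescu)", "a hare",
                   {"auditory", "pictorial"})
print(example.paraphrase())  # prints: the politician (Oprescu) is a hare
```

Keeping the critical stance as free text reflects the fact that step 3 is an interpretive judgment rather than a mechanical one.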

Because POLITICS IS A FOREST, POLITICIANS ARE ANIMALS. As with all metaphors,
zoomorphs involve meaning transfer: the attributes and ac-
tions associated with an animal are transferred to the person who is metaphori-
cally associated with that animal (see Ọlátéjú, 2005).
In our analysis of the examples provided below, we address personality char-
acteristics rather than physical attributes or demographic characteristics. We
demonstrate that the meanings and interpretations assigned to a particular ani-
mal metaphor are mostly culture- and context-dependent.
In Table 1, we have identified examples of the politician is an animal
metaphor and divided them into types according to the modalities involved at
the moment when the audience would have first perceived them. Interestingly,
though not surprisingly, the pictorial component was a constant; it was used to
construct the animal representation of each of the politicians included in the
animal tale set in the forest.

Table 1. Examples of animal metaphors from the Animat Planet Show, episode of
May 11, 2008, http://www.youtube.com/watch?v=iPFMGE6oyis (last accessed
January 7, 2012)

Metaphor                                      Time point   Type of metaphor           Fig. no.
1. the politician (Oprescu) is a hare         1:21         auditory-pictorial         1
2. the politician (Orban) is a cricket        1:34         auditory-verbo-pictorial   2
3. the politician (Tăriceanu) is a hedgehog   2:25         verbo-pictorial            3
4. the politician (Băsescu) is a lion         3:44         auditory-pictorial         4
5. the politician (Blaga) is a bulldog                     auditory-pictorial
6. the politician (Udrea) is a squirrel                    auditory-pictorial
7. the politician (Boc) is a woodpecker                    auditory-pictorial
8. the politician (Geoană) is a donkey        6:32         auditory-pictorial         5
9. the politician (Diaconescu) is a stork     6:45         verbo-pictorial            6
10. the politician (Iliescu) is a dinosaur    8:02         verbo-pictorial            7

As Sommer and Sommer (2011, p. 243) show, when used as metaphors for
human personalities, the generic terms ‘animal’ and ‘beast’ generally have exclu-
sively negative connotations. To refer to someone as ‘an animal’ or ‘a beast’ im-
plies that they are an ugly, uncouth, unpleasant individual. Sommer and Sommer
(2011) take their argument further and assert that most zoomorphs are negative
and that animals and beasts, including birds and insects, are uncomplimentary
when applied to people. In the specific context of animated political cartoons,
animal metaphors identify attributes that are socio-politically disapproved of. It
could be said that, under the circumstances, zoomorphs are “nothing but a polite
form of a more mean-spirited joke or putdown” (Pollio, 1996, p. 233). However,
animal metaphors should not be perceived by their targets as offensive because
their illocutionary force is satirical; in other words, they are part of a television
genre that qualifies as an institutionalized form of humor. Institutionalized hu-
mor has an acknowledged release function involving defunctionalization and lib-
eration from social constraints (Popa, 2011b).
The metaphorical representations in animated political cartoons can be clas-
sified as complimentary representations (examples 1, 4, and 9 in Table 1), uncom-
plimentary representations (examples 2, 5, 7, 8, and 10) (see Sommer & Sommer,
2011; Haslam et al., 2011) or ambivalent representations (examples 3 and 6).
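
The classification above can be tallied mechanically against Table 1. The sketch below is only an illustration: the tuple layout and the names `TABLE_1`, `VALENCE` and `by_valence` are assumptions made here, while the example numbers, animals and type labels are copied from Table 1 and from the classification in the preceding paragraph.

```python
from collections import Counter

# (example number, source animal, type of metaphor), copied from Table 1.
TABLE_1 = [
    (1, "hare", "auditory-pictorial"),
    (2, "cricket", "auditory-verbo-pictorial"),
    (3, "hedgehog", "verbo-pictorial"),
    (4, "lion", "auditory-pictorial"),
    (5, "bulldog", "auditory-pictorial"),
    (6, "squirrel", "auditory-pictorial"),
    (7, "woodpecker", "auditory-pictorial"),
    (8, "donkey", "auditory-pictorial"),
    (9, "stork", "verbo-pictorial"),
    (10, "dinosaur", "verbo-pictorial"),
]

# Valence classes as listed in the text, keyed by example number.
VALENCE = {
    "complimentary": {1, 4, 9},
    "uncomplimentary": {2, 5, 7, 8, 10},
    "ambivalent": {3, 6},
}


def by_valence(valence):
    """Return the source animals whose example number falls in a class."""
    return [animal for num, animal, _ in TABLE_1 if num in VALENCE[valence]]


# How often each modality combination occurs in the sample.
type_counts = Counter(kind for _, _, kind in TABLE_1)

print(by_valence("ambivalent"))           # ['hedgehog', 'squirrel']
print(type_counts["auditory-pictorial"])  # 6
```

Every type label in the tally contains "pictorial", which matches the observation that the pictorial component was a constant in the sample.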

Plate 1. The politician is an animal: hare (Animat Planet Show, 2008)

In the first example (Plate 1), the animal metaphor was of the auditory-pic-
torial type. Pictorially, the target was cued using the same technique that was
used for all of the animal characters, namely, a caricatured body in the shape of
a particular animal to which the animators attached a photograph of the face of
a politician. The auditory modality used a stereotypical sound representing an
animated hopping hare.
The high-priority semantic markers in all animal metaphors profile areas of
general similarity among animals’ features; they are significant in determining
the cognitive/conceptual meaning of those animal metaphors (Ọlátéjú, 2005). In
this case, the high-priority semantic markers are [+animate] and [+animal], or in
some cases, variations such as [+animate] or [+bird] and [+animal].
The low-priority semantic markers contribute minimally to the cognitive/
conceptual meaning of the animal metaphors but contribute significantly to their
secondary or metaphorical meanings. In other words, low-priority semantic
markers determine the connotative interpretations of animal metaphors. In the
first example, the relevant low-priority semantic markers are [+speedy], [+un-
skilled] and [+novice].
As stated above, all animal-metaphor interpretation relies heavily on contex-
tual and cultural factors. Although the hare metaphor could be interpreted as
indicating someone is a novice player in the political campaign, in this case, it in-
dicated not that the individual was unskilled but that he was a winning candidate
who would outrun all of the other candidates in the Bucharest mayoral elections.
This interpretation was later verbally mediated by the dinosaur, Iliescu. Because
the hare was a relatively complimentary use of an animal metaphor, its critical
stance was not established by its mere presence but was constructed later through
the implied contrast with other, more uncomplimentary, animal metaphors such
as the dinosaur.

Plate 2. The politician is an insect: cricket (Animat Planet Show, 2008)

Plate 3. The politician is an animal: hedgehog (Animat Planet Show, 2008)

In the second example (Plate 2) as well, the target and the source were rep-
resented partially using different modes. The target was shown in a caricatured
cricket body, playing a guitar and singing at the same time. Crickets are not a
highly zoomorphic species; therefore, the use of a cricket in the analyzed episode
relied heavily on viewers’ prior pragmatic knowledge regarding internal political
affairs. However, the mapped connotation that can be inferred from the cultural
reference may be that of a deceiver. The politician was being criticized for his
empty promises and ineffective actions.
The third example (Plate 3) was of the verbo-pictorial type. The target was rep-
resented partially in the verbal mode and partially in the pictorial mode; the verbal
mode anchors introduced the character along with the other characters in the situ-
ation. The two domains involved in this metaphor would likely have been recog-
nizable without the verbal message, but the verbal aspect confirmed the identity of
the target. The hedgehog is a highly zoomorphic species in Romanian culture, and
its association with a politician is not especially surprising. The ambivalent nature
of the hedgehog metaphor as it was used in this cartoon – i.e., that it had both
positive and negative connotations constructed in the verbal and pictorial
modalities – rendered it more complex in this case. Even if the pictorial
representation tended to associate the hedgehog with a positive character, the verbal
message, “What do you mean? You just hang in there and the rest doesn’t matter,”
representing the hedgehog’s attitude in the face of danger, depicted it in a rather
unfavorable light. The implied connotations were of self-protection and obstinacy to
facilitate a point that the cartoonists wanted to make through the animal metaphors
they introduced in the episode.

Plate 4. The politician is an animal: lion, bulldog and squirrel and the politician is a bird:
woodpecker (Animat Planet Show, 2008)

Examples 4, 5, 6 and 7 (Plate 4 above) were all introduced at the same time;
they were all of the auditory-pictorial type. Except in example 7, the target and
source would still have been identifiable even if the auditory component were
ignored. The auditory modality involved stereotypical cartoon background music,
which perhaps reinforced the general situational theme rather than adding to the
representations of the specific zoomorphs. Given this analysis, examples 4, 5 and
6, as they would have been initially perceived by the audience, might qualify as
monomodal pictorial metaphors. The only complimentary zoomorph was the lion
(example 4). The target of this metaphor was not just any politician, but the Presi-
dent himself (the president is a lion). The particular connotations mapped
from the source to the target would be those of being a notable person and being a
powerful, aggressive player. Although initially complimentary in nature, the lion
zoomorph was later used to express criticism, when the other characters empha-
sized the lion’s intent to rule over not only the jungle but the forest as well (where
the country is a jungle and the capital is a forest).

Plate 5. The politician is an animal: donkey (Animat Planet Show, 2008)

Both the woodpecker and the bulldog metaphors were uncomplimentary;
they connoted the inferiority of the political players portrayed. However, their
secondary connotations were complimentary; whereas the woodpecker is ‘tena-
cious’ but ‘impressionable’, the bulldog is ‘aggressive’ and ‘easily manipulated’ but
‘useful.’ In example 7, the target was cued in two modalities, with the usual picto-
rial representation and an auditory representation of the noise made by a wood-
pecker pecking on wood.
The squirrel metaphor in example 6 was ambivalent; this was reflected in its
fairly negative connotations of ‘frivolousness’ and ‘meddling’.
Example 8 (Plate 5) was also an easily identifiable multimodal metaphor of
the auditory-pictorial type. Without the donkey’s bray, the connection between
the target and the source would certainly have been less vivid. The connotations
were that the target was ‘ignorant’ and ‘obstinate’.
In example 9 (Plate 5), the target was cued both visually and verbally. The
verbal cue involved the line “But I do not want to put my bill in it. I am clean.”
(‘Put my bill in it’ is a play on words similar to ‘get my hands dirty’.) Although
it appears complimentary at first glance, the bird metaphor connoted a lack of
intuition and tact.
Example 10 (Plate 6) was also of the verbo-pictorial type. The visual technique
used was similar to that in the previous examples, whereas the verbal dimension
was expressed by the line ‘Great, you animal!’ addressed to the hare. This was an-
other example that could be considered a simple pictorial metaphor because the
words only reinforced the theme of the visual animation. Highly zoomorphic, this
metaphor was uncomplimentary in that it connoted old age and inadequacy and,
in that context, the abuse of power.

Plate 6. The politician is an animal: dinosaur (Animat Planet Show, 2008)

7. Discussion and preliminary conclusions

This study investigated the concept of multimodal metaphor in the context of
animated political cartoons, which has proven to be a promising field of enquiry.
After briefly discussing the communicative goals of multimodal metaphor in ani-
mated cartoons, including simplification, argumentation and, above all, propagat-
ing a critical stance toward a given topic, we discussed in more detail the criteria
we used in selecting our sample and the steps we took in analyzing our data.
The aim of the present study was to focus only on how a multimodal meta-
phor is constructed by an audience when it is first presented to the viewer. We
have concentrated here primarily on the three main modalities: visual, verbal and
auditory. Due to space constraints, we cannot introduce specific analyses of ges-
ture, color and movement, but, in practice, we have implicitly taken those aspects
to be part of the broader visual category.
The examples discussed here involve zoomorphs, which are used to represent
personality characteristics because they are vivid, less ambiguous and cutting.
Based on our cognitive-semantic analysis, we concluded that, although the (ex-
tra) auditory and verbal modalities might not be perceived as providing essential
information for the identification of target and source, if they were omitted, the
whole metaphorical identity relationship would be diluted, and their [+animate]
high-priority semantic markers would be directly affected. Especially in the case
of verbal anchors, subtracting the secondary modality from the construction of a
given metaphor would have noticeable implications for the low-priority markers
because those markers generally provide the contextual and pragmatic information
that is essential to information processing. Consequently, we believe that the
target and source could still be identified in some of the cases described in this
paper but that cognitive meaning and connotative interpretation would be
impaired.
In conclusion, a considerable amount of research has been carried out that
has highlighted the need to acknowledge the existence of multimodal metaphor.
Research in this area should perhaps be extended further to explore the effects
of multimodal metaphor in other forms of representation, such as postmodern-
ist theater. In the case of animated political cartoons, we would like to pursue a
cross-cultural investigation. Empirical data and case studies on instances in which
multimodal metaphor has been used for purposes other than simplification, ar-
gumentation or the propagation of a critical stance would provide the necessary
evidence to support the claim that multimodal metaphor is indeed operative and
that it is therefore a viable concept when compared with other methodologi-
cal tools.

Primary source

Animat Planet Show. 2008. Episode of May 11th, part 1 of 3.
http://www.youtube.com/watch?v=iPFMGE6oyis (accessed January 7, 2012).

References

Attardo, S., & Chabanne, J.C. (1992). Jokes as a text type. Humour, 5, 165–176.
Black, M. (1962). Models and metaphors. Ithaca, NY: Cornell University Press.
Bosman, J., & Hagendoorn, L. (1991). Effects of Literal and Metaphorical Persuasive Messages.
Metaphor and Symbolic Activity, 6(4), 271–292. DOI: 10.1207/s15327868ms0604_3
Bounegru, L., & Forceville, C. (2011). Metaphors in editorial cartoons representing the global
financial crisis. Visual Communication, 10(2), 209–229. DOI: 10.1177/1470357211398446
Bronowski, J. (1972). Science and human values. New York: Harper Torchbooks.
Conners, J. (2005). Visual representations of the 2004 presidential campaign: Political cartoons
and popular culture references. American Behavioral Scientist, 49, 479–487.
DOI: 10.1177/0002764205280920
Craig, G. (2004). The media, politics and public life. Australia: Allen and Unwin Academic.
Crisp, P. (2005). Allegory, blending and possible situations. Metaphor and Symbol, 20(2), 115–
131. DOI: 10.1207/s15327868ms2002_2
Edelman, M. (1971). Politics as symbolic action: Mass arousal and quiescence. Chicago: Markham.
Edwards, J. (2001). Running in the shadows in the Campaign 2000: Candidate metaphors in
editorial cartoons. American Behavioral Scientist, 44, 2140–2151.
DOI: 10.1177/00027640121958249
El Refaie, E. (2009a). Multiliteracies: how readers interpret political cartoons. Visual Commu-
nication, 8(2), 181–205. DOI: 10.1177/1470357209102113

El Refaie, E. (2009b). Metaphor in political cartoons: Exploring audience responses. In
C. Forceville & E. Urios-Aparisi (Eds.), Multimodal metaphor (pp. 173–196). Berlin/New
York: Mouton de Gruyter.
Fauconnier, G., & Turner, M. (2003). The way we think: Conceptual blending and the mind’s
hidden complexities. New York: Basic Books.
Fiore, M. (2004). Animation and political cartoon. Nieman Report, 58(4), 41.
Forceville, C. (1996). Pictorial metaphor in advertising. London: Routledge.
DOI: 10.4324/9780203272305
Forceville, C. (2002). The identification of target and source in pictorial metaphors. Journal of
Pragmatics, 34, 1–14. DOI: 10.1016/S0378-2166(01)00007-8
Forceville, C., & Urios-Aparisi, E. (Eds.). (2009). Multimodal metaphor. Berlin/New York:
Mouton de Gruyter. DOI: 10.1515/9783110215366
Haslam, N., Loughnan, S., & Sun, P. (2011). Beastly: What makes animal metaphors offensive?
Journal of Language and Social Psychology, 30(3), 311–335.
DOI: 10.1177/0261927X11407168
Hart, K., & Long, J. (2011). Animal metaphors and metaphorizing animals: An integrated lit-
erary, cognitive, and evolutionary analysis of making and partaking of stories. Evolution:
Education and Outreach, 4(1), 52–63. DOI: 10.1007/s12052-010-0301-6
Hempelmann, C., & Samson, A. (2007). Visual puns and verbal puns: Descriptive analogy or
false analogy? In D. Popa & S. Attardo (Eds.), New approaches to the linguistics of humour
(pp. 180–196). Galati: Academica.
Jones, J. (2010). Entertaining politics. Satiric television and political engagement
(Communication, Media and Politics), 2nd edition. New York/Toronto/Plymouth: Rowman and
Littlefield.
Keefe, M. (2004). How about animated cartoons? Cutting-edge cartoons have script, storyboard.
The Masthead, 56(4).
Kittay, E.F. (1989). Metaphor: Its cognitive force and linguistic structure. Oxford: Clarendon
Press.
Kövecses, Z. (2005). Metaphor in culture: Universality and variation. Cambridge: Cambridge
University Press. DOI: 10.1017/CBO9780511614408
Lakoff, G., & Johnson, M. (1980). Metaphors we live by. Chicago: University of Chicago Press.
Lippmann, W. (1922/1965). Public opinion. New York: Free Press.
Mio, J.S. (1997). Metaphor and Politics. Metaphor and Symbol, 12(2), 113–133.
DOI: 10.1207/s15327868ms1202_2
Mio, J.S., & Katz, A. (Eds.). (1996). Metaphor: Implications and applications. Mahwah, NJ:
Lawrence Erlbaum Associates.
Mitchell, A. (2007). Ancient Greek visual puns: a case study in visual humour. In D. Popa &
S. Attardo (Eds.), New approaches to the linguistics of humour (pp. 180–196). Galati:
Academica.
Moss, D. (2007). The animated persuader. PS: Political Science and Politics, 40(2), 241–244.
DOI: 10.1017/S1049096507070369
Ọlátéjú, A. (2005). The Yorùbá animal metaphors: Analysis and interpretation. Nordic Journal of
African Studies, 14(3), 368–383.
Pierce, J.C., Beatty, K.M., & Hagnar, P.R. (1982). The dynamics of American public opinion.
Glenview, IL: Scott Foresman.
Pinker, S. (1993). The language instinct. New York: William Morrow.

Pollio, H. (1996). Boundaries in humour and metaphor. In J.S. Mio & A. Katz (Eds.), Metaphor:
Implications and applications (pp. 231–253). Mahwah, NJ: Lawrence Erlbaum Associates.
Popa, D. (2007). Media and the public sphere. In Y. Pasedeos (Ed.), International dimensions of
mass media research (pp. 493–504). Athens: Athens Institute for Education and Research.
Popa, D. (2011a). Political satire dies last: A study on democracy, opinion formation, and po-
litical satire. In V. Tsakona & D. Popa (Eds.), Studies in political humour (pp. 137–166).
Amsterdam/Philadelphia: John Benjamins. DOI: 10.1075/dapsac.46.10pop
Popa, D. (2011b). Political humour and the ritual of rebellion in computer-mediated
communication. Mélanges francophones, 5(6), 344–362.
Punter, D. (2007). Metaphor. New York: Routledge.
Raskin, V. (1988). Sophisticated jokes. In F. Shaun, D. Hughes & V. Raskin (Eds.), WHIMSY VII.
W. Lafayette (pp. 125–127). IN-Tempe, AZ: Purdue University International Society of
Humor Studies.
Richards, I.A. (1936). The philosophy of rhetoric. London: Oxford University Press.
Schilperoord, J., & Maes, A. (2009). Visual metaphoric conceptualization in editorial cartoons.
In C. Forceville & E. Urios-Aparisi (Eds.), Multimodal metaphor (pp. 213–240). Berlin/
New York: Mouton de Gruyter.
Sommer, R., & Sommer, B.A. (2011). Zoomorphology: Animal metaphors for human personality.
Anthrozoös, 24(3), 237–248. DOI: 10.2752/175303711X13045914865024
Stone, D.A. (1988). Policy paradox and political reason. Glenview, IL: Scott, Foresman.
Teng, N.Y. (2009). Image alignment in multimodal metaphor. In C. Forceville & E. Urios-Aparisi
(Eds.), Multimodal metaphor (pp. 197–212). Berlin/New York: Mouton de Gruyter.
Thompson, S. (1996). Politics without metaphors is like a fish without water. In J.S. Mio &
A. Katz (Eds.), Metaphor: Implications and applications (pp. 185–201). Mahwah, NJ:
Lawrence Erlbaum Associates.
Tsakona, V. (2009). Language and image interaction in cartoons: Towards a multimodal theory
of humour. Journal of Pragmatics, 41, 1171–1188. DOI: 10.1016/j.pragma.2008.12.003

Part II

Multimodality, Cognitive and Systemic Functional Linguistics

The visual representation of metaphor
A social semiotic approach

Dezheng Feng and Kay L. O’Halloran


The Hong Kong Polytechnic University / Curtin University

Complementing cognitive theories which attribute the understanding of visual
metaphors to situational and cultural contexts, this study adopts a social semi-
otic perspective to investigate how visual images themselves are constructed
to cue conceptual metaphors. The visual realization of metaphors in represen-
tational, interactive and compositional meaning structures is elucidated based
on Kress and van Leeuwen’s (2006) visual grammar. It is found that most types
of visual metaphor identified by cognitive linguists can be explained within
the framework. Instances of visual metaphor in advertisements are analyzed in
terms of their persuasive effects. It is concluded that the social semiotic frame-
work is able to provide a comprehensive account of the visual realization of
metaphor, and in addition, the study also offers a cognitive explanation of how
resources like camera positioning and composition acquire meanings.

Keywords: visual metaphor, conceptual metaphor theory, social semiotics,
visual grammar, metafunctions

1. Introduction

The central argument of conceptual metaphor theory is that metaphor is a
conceptual phenomenon (Lakoff & Johnson, 1980), realized in both language and
other communication modes, such as visual image, gesture and architecture (e.g.
Forceville, 2009; Goatly, 2007; Kövecses, 2002). Recently, the study of
non-linguistic realization of metaphor has attracted much attention, following the
pioneering works of Forceville (1994, 1996), Carroll (1996), Morris (1993), and
others. However, these early attempts define visual metaphors in terms of “their
surface realization or formal characteristics” (El Refaie, 2003, p. 78). El Refaie
(2003) argues that visual metaphors should be seen as the pictorial expression
of metaphorical thinking. While agreeing with this definition, we further ask, as
semioticians, how metaphors are visually expressed. In other words, we are
interested in the visual mechanisms which are used to construct metaphors. From
this perspective, Carroll (1996) and Forceville (1996) can be viewed as efforts
to describe the visual realization of conceptual metaphors, which accord with
El Refaie’s (2003) cognitive definition.

doi 10.1075/bct.78.07fen
© 2015 John Benjamins Publishing Company

However, the descriptions are inadequate, as “there seems to be a whole range
of different forms through which metaphorical concepts can be expressed visu-
ally” (El Refaie, 2003, p. 80). We argue that the inadequacy is due to the lack
of understanding and systematic description of meaning making mechanisms in
visual images. As a result, cognitive linguists attribute the understanding of vi-
sual metaphors to the situational/cultural context, but pay less attention to the
text-internal mechanisms of visual images. From a semiotic point of view, while
acknowledging the role of context and human cognition, we argue that visual
images themselves are constructed in certain ways to cue metaphors. Therefore,
our aim is to provide a systematic account of the visual mechanisms for the re-
alization of metaphor, based on Kress and van Leeuwen’s (2006) social semiotic
visual grammar. This endeavor complements Feng’s (2011a) explanation of visual
grammar with conceptual metaphor theory.
The data analyzed in this study comprise 100 car advertisements, which
are chosen for their creative use of visual images (cf. Forceville, 1996). The main
conceptual framework of the social semiotic approach is presented in Section 2.
Then we discuss how metaphors are visually realized in representational mean-
ing structures in Section 3. The metaphorical meaning of interactive resources is
investigated in Section 4, and in Section 5 we examine how conceptual metaphors
are realized in the composition of visual images. Finally, we describe how a social
semiotic visual grammar can provide a comprehensive account of the visual con-
struction of metaphors and how conceptual metaphor theory lends epistemologi-
cal status to such a grammar in Section 6.

2. Visual metaphor from a social semiotic perspective

Alongside Forceville’s (1994, 1996) theory of pictorial metaphor, the field of mul-
timodal semiotics emerged, building on Halliday’s (1994) social semiotic theory
of language (known as systemic functional linguistics). In this approach, language
is modeled as sets of inter-related systems of choices which are metafunctionally
organized. The “systemic” principle regards grammar as systems of paradigmatic
choices which are represented as system networks. The “functional” principle
states that language simultaneously provides resources for constructing three
metafunctions: ideational meaning, interpersonal meaning and textual meaning.

Social semioticians argue that these principles are applicable to non-linguistic
resources as well, which results in the development of metafunctional frameworks
for semiotic resources such as visual image, architecture and mathematical
symbols (e.g. Kress & van Leeuwen, 2006; O’Toole, 2010; O’Halloran, 2005). Ac-
cording to Kress and van Leeuwen (2006), visual images, like language, fulfill the
metafunctions of representing the experiential world (representational meaning),
interacting with viewers (interactive meaning), and arranging the visual resources
(compositional meaning).
Representational meaning is realized by the configuration of processes (e.g.
actions), participants (e.g. actors), and circumstances (e.g. locations). Kress and
van Leeuwen (2006, pp. 45–113) further identify two types of structure in terms of
representation: narrative and conceptual. These structures are defined in terms of
the relationship between the image participants, that is, whether it is based on the
“unfolding of actions and events, processes of change” (i.e. narrative), or based on
“generalized, stable and timeless essence” (i.e. conceptual). Interactive meaning
involves the four parameters of symbolic contact, social distance, power relations,
and involvement between viewers and visual participants. Contact is constructed
by the nature of the visual participants’ gaze at viewers; social distance is con-
structed by shot distance (e.g. close or long shot); power relation is constructed
by vertical camera angle (i.e. high or low angles); involvement is constructed by
horizontal camera angle (i.e. frontal or oblique angles). Compositional meaning
relates the representational and interactive meanings into a meaningful whole
through three interrelated systems: information value, salience, and framing
(Kress & van Leeuwen, 2006, p. 177). Information value is realized by the place-
ment of visual elements (e.g. top or bottom, left or right); salience deals with the
prominence of visual elements, through size, sharpness of focus, color contrast,
and so on; framing is concerned with the connection between visual elements.
In what follows, we discuss how this social semiotic framework can explain the
construction of different types of visual metaphors.
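
The realization relations summarized above for interactive and compositional meaning can be kept in a simple lookup table. This is a minimal Python sketch assuming a nested dictionary layout; the names `REALIZED_BY`, `systems_of` and `metafunction_of` are hypothetical, and only the pairings themselves are taken from the summary of Kress and van Leeuwen (2006) given in this section. Representational meaning is left out because it is described in terms of structures (narrative and conceptual) rather than one-to-one pairings.

```python
# Metafunction -> {system: visual resource that realizes it}.
# The pairings follow the summary of Kress and van Leeuwen (2006) in the
# text; the dictionary layout itself is an illustrative assumption.
REALIZED_BY = {
    "interactive": {
        "contact": "gaze of the visual participant at the viewer",
        "social distance": "shot distance (close or long shot)",
        "power relations": "vertical camera angle (high or low)",
        "involvement": "horizontal camera angle (frontal or oblique)",
    },
    "compositional": {
        "information value": "placement of elements (top/bottom, left/right)",
        "salience": "size, sharpness of focus, color contrast",
        "framing": "connection between visual elements",
    },
}


def systems_of(metafunction):
    """List, alphabetically, the systems described for one metafunction."""
    return sorted(REALIZED_BY[metafunction])


def metafunction_of(system):
    """Find which metafunction a given system belongs to."""
    for metafunction, systems in REALIZED_BY.items():
        if system in systems:
            return metafunction
    raise KeyError(system)


print(metafunction_of("salience"))  # compositional
print(REALIZED_BY["interactive"]["power relations"])
```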
According to Lakoff and Johnson (1980), many concepts (e.g. ‘a’) are under-
stood in terms of other concepts (e.g. ‘b’), the process of which constitutes a con-
ceptual metaphor a is b. ‘A’ is termed the target domain and ‘b’ the source domain.
In cognitive metaphor theory, metaphor is classified into two broad categories:
conventional and creative (Lakoff & Johnson, 1980), or inactive and active (Goatly,
1997). Conventional metaphors are those that structure the ordinary conceptual
system of our culture, while creative metaphors are those which give us new un-
derstandings of our experience (Lakoff & Johnson, 1980, p. 139). For example,
happy is up is a conventional metaphor based on correlations in our bodily ex-
perience and shoe is tie is a creative metaphor which offers a new perspective
of conceptualizing shoes (Forceville, 1996, p. 110). Conventional metaphors, as
well as many creative metaphors, serve the purpose of understanding through
domestication, a process in which abstract ideas and unfamiliar persons or events
are converted into something close, familiar, and concrete (Morris, 1993, p. 201).
However, creative metaphors may also defamiliarize the target domain for rhe-
torical or decorational purposes, especially in poetry and art. In visual images,
both the target and the source of the metaphor are usually concrete objects, thus
constituting the concrete is concrete metaphor (Forceville, 2009, p. 27).
Forceville’s (1996) examples mostly belong to this type (e.g. shoe is tie, popcorn
is wine, ticket is deck chair, etc.) (see also El Refaie, 2009, p. 175). However,
abstract concepts can also be metaphorically represented in visual images, and
in this study, the visual realization of both defamiliarization and domestication
metaphors is investigated.
In terms of visual realization, Forceville (1996) distinguishes three kinds of
pictorial metaphors: MP1 (only the source or the target is present), MP2 (the
source and the target are present and integrated) and pictorial simile (the source
and the target are juxtaposed). Forceville’s (1996) three types of pictorial meta-
phor are based on the systemic choices of spatial relations between the “meta-
phorical subject” (typically the target domain, that is, the primary subject) and
the “pictorial context”. From the social semiotic perspective, Forceville’s “meta-
phorical subject” and “pictorial context” belong to one unified grammatical unit
in the representational meaning structure. Meanwhile, aside from representa-
tional meaning, visual images also have interactive and compositional meanings,
which are important resources for the visualization of abstract concepts. In this
paper, the metafunctional resources are seen as metaphor potential and we shall
explore how they realize visual metaphors, building on Feng (2011b).
The role of context (e.g. linguistic context, discourse purpose, cultural back-
ground, etc.) is also acknowledged in the social semiotic approach, in this case
for the identification of the source and target domains. However, countering
Forceville’s (1994, p. 7) claim that “invoking the pictorial context helps little to
determine the order of the terms,” we argue that the structural features of repre-
sentation provide essential cues for the determination of visual metaphors. There-
fore, the aim of the present study is to see what the social semiotic visual grammar
can offer in modeling the representation of metaphor.

3. Representational meaning and the visual construction of metaphor

In this section, a framework is proposed to model how both defamiliarization
metaphors and domestication metaphors are realized in representational structures.

3.1 Defamiliarization metaphors

In this sub-section, we propose a social semiotic model for the “object is object”
metaphor which is the focus of Forceville (1994, 1996). As mentioned above,
Forceville’s “metaphorical subject” and “pictorial context” are seen as belonging to
one unified grammatical unit in representational meaning structures. Represen-
tational meaning in visual images is modeled in terms of processes, participants
and circumstances, and each image is a configuration of choices from these three
categories. In narrative structures, the metaphorical subject relates to other ele-
ments through actional, verbal, or mental processes; in conceptual structures, it
relates to other elements through relational processes in the form of taxonomic
relations (classification processes), part-whole relations (analytical processes) or
identifying relations (symbolic processes).
Defamiliarization metaphors are mainly constructed by anomaly, or un-
conventionality, of visual elements in the representational structure, in a similar
manner to the colligational interpretation of metaphor in language (Goatly, 1997,
p. 111). As there is variation in the conventions associated with different process
structures, it is important to examine the different types of anomaly. In what fol-
lows, we shall investigate how metaphors are realized in actional, classificational
and analytical processes.
In actional processes, conventional participants (i.e. actors or goals) or cir-
cumstantial elements associated with certain actions are substituted by uncon-
ventional ones, with the former as the source domain and the latter as the target
domain. For example, in a car advertisement in Feng (2011a, p. 63), the car is
worn on a man’s wrist like a watch. Apparently, the car takes the place of a watch,
which results in colligational anomaly. By taking the place of a watch, the car
adopts its attributes, constituting the metaphor CAR IS WATCH. The medium of
an action (e.g. the tools which are used to perform the action, see Halliday, 1994,
p. 154) can also be substituted. In Forceville’s (1994, p. 10) example, a person is
killing himself by pointing a gas nozzle at his head. The metaphor GAS NOZZLE
IS GUN is constructed because the gas nozzle adopts the role of a gun. El Refaie’s
(2003, p. 79) example in which a group of Kurdish refugees are holding a flag
with the inscription “New Kurdistan” can also be explained with participant substitution.
Conventionally, the army carries the flag and claims sovereignty after
conquering a place. Here, the conventional actor is substituted by refugees, which
results in the metaphor REFUGEES ARE ARMY/INVADERS.
In classificational processes, metaphor is constructed in two ways. First, entity
A is an unconventional member of a category whose conventional member is
entity B. As a result, A borrows the salient features of B and the metaphor A IS B is
formed. For example, in a car advertisement from Qilu Evening Paper on 19 July
2008, the image shows five athletes ready to run a 100-meter race, but the middle
track is occupied by a car. As a result, the car adopts the most salient feature of the
athletes, that is, speed. In Teng’s (2009, p. 198) example, where an American newspaper
is put among horror books on a bookshelf labeled “horror”, the resultant
metaphor AMERICAN NEWS IS LIKE HORROR NOVELS is another case of this type of
realization. Second, two entities may be put together unconventionally to form a
covert category (Kress & van Leeuwen, 2006). The formation of covert categories
requires a crucial visual feature – that is, symmetry in composition, such as equal-
ity in size, framing and arrangement (Kress & van Leeuwen, 2006, p. 79). This
process is similar to the visual simile in Forceville’s (1996) categorization, but the
conceptualization of two juxtaposed entities as forming an unconventional covert
category helps to explain the metaphorical mapping – the unconventionality
of the category alerts us to the metaphor and being members of the same category
makes the mapping of attributes possible. However, the source and target do-
mains cannot be structurally determined in this case because they are represented
on the premise that they are equal, and we have to draw upon other cues like the
linguistic context and the discourse purpose. The advertisement in Plate 1 is a
good case in point. The minivans are juxtaposed with weight-lifting champions.
They form a covert category by being identical in number and arrangement. Since
it is an advertisement for the minivan, the minivan is the target and the metaphor
thus formed is MINIVANS ARE WEIGHT-LIFTING CHAMPIONS. The salient feature of
the athletes, that is, strength, is mapped onto the minivans.

Plate 1. Wuling minivan, from Qilu Evening Paper, 19th July, 2008, A10
(reproduced with permission)

Anomaly in analytical processes occurs when there is an unconventional part
in the whole. This can happen in two ways. First, the unconventional part A takes
the place of the conventional part B and hence inherits its salient features. The
well-known example from Forceville (1996, p. 110), which shows a man’s torso
with a suit but with the tie substituted by a shoe, illustrates this type of metaphor.
By taking the place of the tie, the shoe inherits the salient features of the tie and
the metaphor SHOE IS TIE is formed (see Forceville, 1996, p. 10 for detailed analysis).
However, there are also rare cases in which the substituted part is the target,
as in Forceville’s (1996, p. 123) car advertisement in which the life buoys take the
place of car tires. The metaphor formed is CAR TIRES ARE LIFE BUOYS, in which the
unconventional part is the source. This is where the structural cue of realization
contradicts the contextual cue, and we have to resort to the latter to identify the
metaphor.
Second, an entity (or part of it) is superimposed on another entity (or part
of it). The superimposition may or may not change the conventional identity of
the entity. If it doesn’t, the superimposed entity becomes an unconventional part
of the whole, and as in the case of substitution, the unconventional part is the
target. However, this case differs from substitution, because the superimposed
entity inherits the attributes of the whole of which it forms a part. We can call this
type Superimposition 1 (S1). An example is found in Yus (2009, p. 162), where a
saucepan has an image of the continents of the earth superimposed upon it. The
superimposition doesn’t change the identity of the saucepan. As part of the saucepan,
the earth inherits one of its attributes, that is, warming up gradually.
The superimposed entity may also change the identity of the original image
and they together form an unconventional whole, or a hybrid, similar to the for-
mation of covert categories in classificational processes. In this case, the superim-
posing part is the source and its salient features are added to the whole. We shall
call this type Superimposition 2 (S2). For example, in an advertisement in which
a pair of butterfly wings is added to a motorbike, the salient features of butterflies
such as beauty and lightness are projected onto the motorbike, which produces
the metaphor MOTORBIKE IS BUTTERFLY. However, this example can also be seen
as the motorbike substituting the body of the butterfly, which results in the same
metaphor. Yus (2009, p. 164) provides a similar example in which dice dots are
superimposed on a ballot box. The superimposing part is the source which lends
attributes to the entity it is superimposed upon. It can also be interpreted as the
ballot box substituting the body of the dice. Either way, the ballot box borrows
the features of the dice, resulting in the metaphor BALLOT BOX IS DICE, which
further stands for ELECTION IS GAMBLING.
To summarize, we have examined the visual mechanisms for realizing met-
aphors in representational structures, as illustrated in Figure 1. The slanted ar-
row denotes realization. For example, visual anomaly in narrative structures is
“realized” as participant substitution and circumstance substitution. The analysis
shows that the source and target domains of defamiliarization metaphors can
mostly be identified by examining the way they are represented (i.e., anomaly in
different process structures). However, since metaphors are not constructed by
decontextualized visual components, representational resources alone may not
be able to specify the source and the target. Moreover, the cues from the process
of construction may contradict the more explicit contextual cues of interpretation.
Such awareness of context makes our approach social semiotic, whereby
representational anomalies are seen as resources for metaphor realization, rather
than as rigid semiotic codes. In this sense, our framework only describes the metaphor
potential in the representational structure, without claiming that structural
anomalies are able to determine all visual metaphors independently.

Visual anomaly
    Narrative anomaly ↘ Participant substitution; Circumstance substitution
    Classificational anomaly ↘ Member substitution; Unconventional covert category
    Analytical anomaly ↘ Part substitution; Part superimposition

Figure 1. Visual anomaly in the representational structure
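
The default structural cues described in this sub-section can be read as a small decision table. The following Python sketch is only an illustration under stated assumptions: the names `Mechanism`, `Reading`, `DEFAULT_ROLE` and `default_reading` are hypothetical and not part of the framework, part superimposition is split into S1 and S2 as in the discussion above, and contextual cues are reduced to a single optional argument that overrides the structural default.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Mechanism(Enum):
    # Illustrative labels for the anomaly types discussed in Section 3.1.
    PARTICIPANT_SUBSTITUTION = "participant substitution"
    CIRCUMSTANCE_SUBSTITUTION = "circumstance substitution"
    MEMBER_SUBSTITUTION = "member substitution"
    COVERT_CATEGORY = "unconventional covert category"
    PART_SUBSTITUTION = "part substitution"
    SUPERIMPOSITION_1 = "superimposition 1"  # host keeps its identity
    SUPERIMPOSITION_2 = "superimposition 2"  # a hybrid is formed


class Reading(NamedTuple):
    target: str
    source: str


# Default role of the anomalous element (the unconventional or superimposed
# one). None means the structure alone does not decide the direction.
DEFAULT_ROLE = {
    Mechanism.PARTICIPANT_SUBSTITUTION: "target",
    Mechanism.CIRCUMSTANCE_SUBSTITUTION: "target",
    Mechanism.MEMBER_SUBSTITUTION: "target",
    Mechanism.COVERT_CATEGORY: None,
    Mechanism.PART_SUBSTITUTION: "target",
    Mechanism.SUPERIMPOSITION_1: "target",
    Mechanism.SUPERIMPOSITION_2: "source",
}


def default_reading(mechanism: Mechanism, anomalous: str, other: str,
                    context_target: Optional[str] = None) -> Optional[Reading]:
    """Return the (target, source) pair suggested by structure and context."""
    if context_target is not None:
        # Assumption: contextual cues take precedence over structural ones.
        if context_target not in (anomalous, other):
            raise ValueError("context_target must be one of the two entities")
        source = other if context_target == anomalous else anomalous
        return Reading(context_target, source)
    role = DEFAULT_ROLE[mechanism]
    if role is None:
        return None  # structurally undetermined, e.g. covert categories
    if role == "target":
        return Reading(anomalous, other)
    return Reading(other, anomalous)
```

Under these assumptions, the car worn as a watch is read with the car as target and the watch as source, the covert category of minivans and champions stays undetermined until the advertising context names the minivans as target, and the life-buoy example is resolved by context against the structural default.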

3.2 Domestication metaphors

In this sub-section, we examine the visual realization of creative and conventional
metaphors which serve the purpose of understanding abstract concepts. We shall
call them domestication metaphors without distinguishing between creative and
conventional metaphors, because the working mechanism remains the same, re-
gardless of how conventionalized the mappings are. Domestication metaphors
play a significant role in the representation of abstract meaning because it is not
possible to represent abstract meaning visually without recourse to symbols, me-
tonymies or metaphors (El Refaie, 2009, p. 177). However, visual images seldom
work alone in the process of domestication, and as our main focus is on visual
metaphor, we only discuss the process of visual domestication briefly.
A typical strategy of domestication metaphors is that the image shows the
source domain and the linguistic context specifies the target by labeling the image.
In social semiotic terms, they constitute a symbolic attributive process (Kress
& van Leeuwen, 2006, p. 105), in which the image is the token and the verbal label
endows it with a value. In cognitive terms, the “value” is understood in terms of
the “token”. This kind of verbal-visual metaphor works in the same way as Super-
imposition 1 (S1) metaphor in analytical processes, except in this case, it is the
verbal text that is superimposed. Most researchers talk about multimodal meta-
phor in this sense (e.g. El Refaie, 2003; Forceville, 2009). For example, in a cartoon
in El Refaie (2003, p. 83), the image shows a fortress with the word “EUROPA” on
it. The token (i.e. the fortress) only refers to Europe because the value (EUROPA)
is superimposed on it. The metaphor thus formed is EUROPE IS FORTRESS.
The “value” of a visual “token” may not be explicitly labeled by linguistic text,
but is sometimes implicit in the cultural context. This is the case with conven-
tional metaphors, such as orientational metaphors in which spatial orientations
(e.g. up, down) are endowed with metaphorical meanings based on our embodied
experience (e.g. HAPPY IS UP) (see Lakoff & Johnson, 1980). For example, the
windows with lights on form an up-pointing vector in Plate 2, which echoes the
cultural practice that a person’s office, salary and benefits are adjusted to a higher
level when he/she gets promoted. The visual process of “moving up” is also rep-
resented in the verbal text. The metaphorical target, or the symbolic meaning,
of this visual process is clearly “becoming more powerful or wealthy”, based on
cultural knowledge. In this way, the advertisement works by implicitly linking the
car with gaining power.

Plate 2. Toyota Camry, from The Straits Times, 4th October, 2008, C2
(reproduced with permission)

4. Interactive meaning and the visual representation of metaphor

Cognitive studies of visual metaphor mostly focus on what is in the image, instead
of how the image is represented. In social semiotic terms, only representational
resources are investigated, while interactive and compositional resources remain
largely implicit. As an exception, El Refaie (2009) discusses the visual realization of
orientational metaphors in political cartoons, associating spatial orientations with
concepts like power and time. However, a systematic account of the metaphori-
cal meaning of spatial orientations in visual images is not yet available. Kress and
van Leeuwen’s (2006) semiotic model of interactive and compositional meaning
resources provides a comprehensive framework for systemizing such metaphors.
Building on Feng (2011a, 2011b), we discuss the visual realization of metaphor in
interactive and compositional meaning structures in Sections 4 and 5.
Interactive and compositional meaning resources construct conventional
metaphors. The mappings between the source and target domains in conventional
metaphors are not based on similarities, but on correlations derived from our ba-
sic experience of the world (Lakoff & Johnson, 1980, p. 155). The interpretation
of such metaphors does not depend on immediate context, but on physical and
cultural experiences that are common to human beings in general or to specific
cultural communities (Lakoff & Johnson, 1980, p. 14). Therefore, to prove the
validity of the conventional metaphors realized by interactive and compositional
resources, we need to provide their experiential bases.
Interactive meanings include contact, social distance and subjectivity. Accord-
ing to Kress and van Leeuwen (2006), contact is realized by gaze, social distance
by shot distance and subjectivity by camera angle (see also Dyer, 1989; Messaris,
1994). Since gaze and camera angle converge in most cases (i.e. gaze normally
converges with front angle and absence of gaze with oblique angle), we shall only
discuss the resources for social distance and subjectivity, under the term camera
positioning.
From a cognitive perspective, the relation between camera positioning and
interactive meaning is the metaphorical mapping between the source domain and
the target domain (Feng, 2011a). This mapping can be considered as a master
metaphor which entails all the sub-mappings between camera positioning and in-
teractive meaning. In this way, Kress and van Leeuwen’s (2006) descriptive gram-
mar is reformulated as a conceptual metaphor system that is visually realized, as
shown in Figure 2.

Image-viewer relation is camera positioning
    Social distance is shot distance
        Close relation is close shot
        Distant relation is long shot
    Power relation is vertical angle
        Image power is low angle
        Equality is eye-level angle
        Viewer power is high angle
    Involvement is horizontal angle
        Involvement is frontal view
        Detachment is back view

Figure 2. The visual realization of metaphor in interactive resources (Feng, 2011b, p. 27)

To prove the validity of the metaphor system, we need to provide experiential
bases for the mappings. The metaphorical meaning of camera positioning is premised
on the iconic nature of visual images. That is, shot distance reproduces the
structural features of physical distance in real life and camera angle reproduces
features of the ways we look at and interact with people. The basis of the mapping
between physical distance (hence shot distance) and social distance is well estab-
lished in the study of proxemics (e.g. Hall, 1969) and will not be elaborated here.
The mapping between image-viewer power relation and vertical camera angle is
based on the structural features of real-life situations in which we “look up” to
powerful people and “look down” upon weak people (Messaris, 1994, p. 9). The
mapping between involvement and horizontal camera angle is based on real life
situations where we face the person we want to interact with and gaze at him/her,
and turn our face (gaze) away if we don’t want to interact.
Through these experiential bases, it can be argued that these metaphors do
exist and are conventionalized in our ordinary conceptual system. However, these
conventional or default interpretations of camera positioning may be overridden
by other factors in specific contexts. For example, Dick (2005, p. 53) points out
that sometimes film scripts require a high or low angle shot for the sake of consis-
tency rather than for symbolism. For this reason, social semiotic interpretations
are often criticized for being too rigid, while in reality the connections are fluid
and subject to change. From the cognitive perspective, this is because certain se-
miotic choices (e.g. low angle) are not motivated by the default experiential basis,
but by other factors (e.g. intertextual and discursive consistency). In such cases,
the overriding factors are usually more salient and point to one specific interpre-
tation. In the social semiotic approach, we do not consider camera positioning, as
well as composition, as rigid semiotic rules, but as resources for making meaning.
Ambiguity may arise as a result, particularly as metaphors are by their very nature
open to more than one interpretation (El Refaie, 2009, p. 182).
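
The distinction between default interpretations and contextual overrides can be sketched as a lookup with an optional override. This is a minimal illustration, not an implementation of the grammar: the mappings are taken from Figure 2, while the names `CAMERA_METAPHORS` and `interpret` and the free-text `override` argument are assumptions introduced here.

```python
from typing import Optional

# Default mappings as listed in Figure 2 (dimension, choice) -> meaning.
CAMERA_METAPHORS = {
    ("shot distance", "close shot"): "close relation",
    ("shot distance", "long shot"): "distant relation",
    ("vertical angle", "low angle"): "image power",
    ("vertical angle", "eye-level angle"): "equality",
    ("vertical angle", "high angle"): "viewer power",
    ("horizontal angle", "frontal view"): "involvement",
    ("horizontal angle", "back view"): "detachment",
}


def interpret(dimension: str, choice: str,
              override: Optional[str] = None) -> str:
    """Return the default meaning unless a more salient factor overrides it."""
    if override is not None:
        # Hypothetical: a contextual motivation such as shot consistency.
        return override
    return CAMERA_METAPHORS[(dimension, choice)]
```

Treating the table as a set of defaults rather than rules reflects the point that camera positioning is a resource for meaning making; for example, `interpret("vertical angle", "low angle")` gives the default reading, while passing an `override` models a shot chosen for consistency rather than symbolism.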
In the corpus of 100 print car advertisements, 67% use a high camera angle,
which suggests the advertisers’ intention to build consumer power (Feng, 2011a).
However, a low camera angle is sometimes used to construct product superiority,
for example, in the case of expensive cars such as Mercedes-Benz and BMW.
Viewers look up to the cars (and the characters), as if they were superior. Such advertisements
thus persuade viewers by subliminally guiding their interpretation of
the car as a symbol of high status.

5. Compositional meaning and the visual representation of metaphor

Lakoff and Johnson (1980, p. 126) point out that linguistic forms are endowed
with content by virtue of spatial metaphors. This observation is certainly appli-
cable to visual images where space plays an even more important role across a
larger number of dimensions. For example, visual semiotic resources include
the spatial positioning of different elements, their relative size, and the distance
between them. These semiotic resources construct compositional meanings of
information value, salience and framing (Kress & van Leeuwen, 2006). The infor-
mation values of given/new, ideal/real and important/unimportant are realized by
the spatial orientations of left/right, up/down and central/marginal respectively.
Salience and framing are not abstract concepts and will not be discussed, but the
size and the distance between elements are included in the visual metaphor sys-
tem, as shown in Figure 3.

Information value is spatial position | Time is space | Importance is size | Social closeness is physical closeness
Given is left / New is right | Ideal is up / Real is down | Important is central / Unimportant is marginal | Important is foreground / Unimportant is background

Figure 3. The visual realization of metaphor in compositional resources

Similar to interactive meanings, visual compositional meanings are also derived
from our embodied experience. GIVEN IS LEFT/NEW IS RIGHT is based on
the experience that in most cultures, people write and read from left to right, so
we take the left as given information and the right as new. In IDEAL IS UP, “ideal”
has two different but related entailments, that is, desirable and unrealistic (Feng,
2011a, p. 59). DESIRABLE IS UP is synonymous with the well-established metaphor
GOOD IS UP (Lakoff & Johnson, 1980) and will not be further explained. UNREALISTIC
IS UP uses a different sense of “up” – that is, high. It is difficult or unrealistic
to get things that are too high (e.g. stars). Therefore, ideal things, while desirable,
may be unrealistic. REAL IS DOWN is based on the same experience as UNREALISTIC
IS UP. The association between central and important is so conventionalized
that “important” has become a lexical meaning of “central”. It may arise from our
biological composition whereby the most vital organs (e.g. heart and lungs) are
located near the center of our bodies (Goatly, 2007, p. 40). In terms of foreground/
background, Feng (2011a, p. 70) explains their meanings in relation to the notion
of “depth”, which is “the distance between the viewers’ eyes and any point in the
visual field” (Messaris, 1994, p. 51). The foreground is perceived as nearer to the
viewer than the background. Our biological feature of vision results in different
visual impacts of the objects at different distances: we notice what is in the fore-
ground first (most likely for reasons of survival) and take it as more important
than that which appears in the background.
Aside from information value, spatial orientations may also construct other
abstract concepts. Left/right and foreground/background orientation can also
represent the concept of time, resulting in the general TIME IS SPACE metaphor
(El Refaie, 2009, p. 179; Lakoff & Johnson, 1980). Taking left/right orientation as
an example, it includes the sub-mappings of PAST IS LEFT/PRESENT IS RIGHT or
PRESENT IS LEFT/FUTURE IS RIGHT based on our experience that the information
to the left is processed before the information to the right in many cultures. How-
ever, if human beings are in the image, then their front represents future and their
back represents past (El Refaie, 2009, p. 179).
The other two conceptual metaphors we identify are IMPORTANCE IS SIZE and
SOCIAL CLOSENESS IS PHYSICAL CLOSENESS. El Refaie (2009, p. 176) notices that
the size of objects in an image does not just construct salience, but also importance.
The mapping is based on the association between physical power and social
power. Goatly (2007, pp. 35–39) provides a detailed study on its realization in both
language and buildings. The SOCIAL CLOSENESS IS PHYSICAL CLOSENESS metaphor
states that the physical distance between visual participants is perceived as an in-
dex of their social distance. It is similar to the mapping between shot distance and
image-viewer closeness, and it reproduces real life situations more directly.
Aside from conceptualizing abstract concepts, in visual media such as adver-
tisements, compositional resources also construct implicit meaning which influ-
ences readers’ attitudes. For example, in Plate 3, the car image is salient because
of its large size and central position, and is therefore perceived as the most im-
portant object. The price, in contrast, is positioned at the bottom left with a small
font size and thus is interpreted as real, given and unimportant (i.e. not the focus
of attention). The price is strategically downplayed because it is high, and is not
the selling point. This is manipulative because for many people, price may be the
most important information.
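
The reading of the price in Plate 3 combines several compositional cues at once. A minimal sketch of that combination, assuming the default mappings of Section 5 and using the hypothetical names `COMPOSITION_METAPHORS` and `read_placement`, could look as follows; the cue labels are illustrative and would in practice come from an analyst's annotation of the image.

```python
from typing import List

# Default compositional mappings discussed in Section 5 (cue -> value).
COMPOSITION_METAPHORS = {
    "left": "given",
    "right": "new",
    "up": "ideal",
    "down": "real",
    "central": "important",
    "marginal": "unimportant",
    "foreground": "important",
    "background": "unimportant",
    "large": "important",    # IMPORTANCE IS SIZE
    "small": "unimportant",  # IMPORTANCE IS SIZE
}


def read_placement(*cues: str) -> List[str]:
    """Map each annotated compositional cue to its default value, in order."""
    return [COMPOSITION_METAPHORS[cue] for cue in cues]
```

With these assumptions, `read_placement("down", "left", "small")` returns the values real, given and unimportant for the price, and `read_placement("large", "central")` marks the car as important on both counts.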

Plate 3. Alfa-Romeo, from The Straits Times, 4th October, 2008, C8
(reproduced with permission)

6. Conclusion

As a new development of the conceptual metaphor theory, the working mechanism
of visual metaphor needs further exploration. Complementing current approaches
which attribute the understanding of visual metaphor to our cognitive
capacity and situational/cultural context, this study argues that the structural
features of visual images themselves play an essential role in the construction of
visual metaphor. Visual representations of metaphor are modeled with respect to
the metafunctions of visual images, namely, the representational, interactive and
compositional meaning structures. It is found that defamiliarization metaphors,
which include those in both Carroll’s (1996) and Forceville’s (1996) definitions,
can be explained by colligational anomalies in representational structures. Vi-
sual representational resources may also serve to domesticate abstract concepts,
either in a creative or conventional way, where the metaphor is constructed by
superimposing a verbal “value” on a visual “token”. Interactive and compositional
resources, whose meanings are derived from correlations in our physical and cul-
tural experience, realize conventional metaphors. Through the analysis of car ad-
vertisements, this study also demonstrates how the visual resources which invoke
metaphorical interpretations are exploited as tools of persuasion.
We conclude that the social semiotic framework can provide a comprehen-
sive account of the visual realization of both creative and conventional meta-
phors. Meanwhile, the approach also offers a cognitive explanation of how visual
resources like camera positioning and composition acquire meanings. However,
as the first step toward the application of semiotic theory in cognitive studies, the
analysis is limited in depth and comprehensiveness. More convincing conclusions
should be based on the systematic analysis of a large corpus of data. Nonetheless,
this study has demonstrated that the integration of social semiotics and cognitive
metaphor theory is significant for the understanding and explanation of visual
semiosis. Therefore, we conclude with the hope that these two theoretical ap-
proaches will be combined in further explorations of multimodal discourse.

Acknowledgement

The research for this article was supported by the Interactive Digital Media Program
Office (IDMPO) in Singapore under the National Research Foundation’s
(NRF) Interactive Digital Media R&D Program (Grant Number: NRF2007IDM-IDM002-066).

References

Carroll, N. (1996). A note on film metaphor. Journal of Pragmatics, 26(6), 809–822.
DOI: 10.1016/S0378-2166(96)00021-5
Dick, B.F. (2005). Anatomy of film. Boston, Mass: Bedford/St. Martins.
Dyer, G. (1989). Advertising as communication. London: Routledge.
El Refaie, E. (2003). Understanding visual metaphor. Visual Communication, 2(1), 75–95.
DOI: 10.1177/1470357203002001755
El Refaie, E. (2009). Metaphor in political cartoons: Exploring audience responses. In
C. Forceville & E. Urios-Aparisi (Eds.), Multimodal metaphor (pp. 173–196). Berlin:
Mouton de Gruyter.
Feng, D. (2011a). Visual space and ideology: A critical cognitive analysis of spatial orientations
in advertising. In K.L. O’Halloran & B. Smith (Eds.), Multimodal studies: Exploring issues
and domains (pp. 55–75). London: Routledge.
Feng, D. (2011b). The construction and categorization of multimodal metaphor: A systemic
functional approach. Foreign Language Research, 1, 24–29.
Forceville, C. (1994). Pictorial metaphor in advertisements. Metaphor and Symbolic Activity,
9(1), 1–29.
Forceville, C. (1996). Pictorial metaphor in advertising. London: Routledge.
DOI: 10.4324/9780203272305
Forceville, C. (2009). Nonverbal and multimodal metaphor in a cognitivist framework: agendas
for research. In C. Forceville & E. Urios-Aparisi (Eds.), Multimodal metaphor (pp. 19–42).
Berlin: Mouton de Gruyter. DOI: 10.1515/9783110215366
Goatly, A. (1997). The language of metaphors. London: Routledge. DOI: 10.4324/9780203210000
Goatly, A. (2007). Washing the brain: Metaphor and hidden ideology. Amsterdam: John Benjamins.
DOI: 10.1075/dapsac.23
Hall, E. (1969). The hidden dimension. London: Bodley Head.
Halliday, M.A.K. (1994). An introduction to functional grammar. London: Arnold.
Kövecses, Z. (2002). Metaphor: A practical introduction. Oxford: Oxford University Press.
Kress, G., & van Leeuwen, T. (2006). Reading images: The grammar of visual design (2nd ed.).
London: Routledge.
Lakoff, G., & Johnson, M. (1980). Metaphors we live by. Chicago: University of Chicago Press.
Messaris, P. (1994). Visual literacy: Image, mind and reality. Boulder: Westview Press.
Morris, R. (1993). Visual rhetoric in political cartoons: A structuralist approach. Metaphor and
Symbolic Activity, 8(3), 195–210. DOI: 10.1207/s15327868ms0803_5
O’Halloran, K.L. (2005). Mathematical discourse: Language, symbolism and visual images.
London: Continuum.
O’Toole, M. (2010). The language of displayed art (2nd ed.). London: Routledge.
Teng, N.Y. (2009). Image alignment in multimodal metaphor. In C. Forceville & E. Urios-Aparisi
(Eds.), Multimodal metaphor (pp. 197–211). Berlin: Mouton de Gruyter.
Yus, F. (2009). Visual metaphor versus verbal metaphor: a unified account. In C. Forceville &
E. Urios-Aparisi (Eds.), Multimodal metaphor (pp. 147–172). Berlin: Mouton de Gruyter.

Visual metonymy in children’s picture books

A. Jesús Moya Guijarro


University of Castilla-La Mancha

This article aims to explore how the use of visual metonymies in picture books
contributes to children’s understanding of stories and, in turn, attracts their at-
tention towards relevant aspects of the plot. The two picture books selected for
analysis are Gorilla, by Browne, and The Tale of Peter Rabbit, by Potter, intended
for children under 9 years of age.
A multimodal and cognitive perspective is adopted here to apply the non-
verbal trope of visual metonymy to the two picture books that form the sample
texts (Forceville, 2009, 2010; Forceville & Urios-Aparisi, 2009).
The results of the analysis show that visual metonymies are essentially used
in children’s tales to create narrative tension in certain stages of the plot and, in
turn, to establish a bond between the represented participants and the child-
viewer.

Keywords: multimodality, visual metonymy, picture books, systemic functional linguistics

1. Aims and scope of the study

Two fundamental tropes within cognitive linguistics are the metaphor and the
metonymy, defined as phenomena of thought used to conceptualise reality by
means of the relationships that are established between a source and a target do-
main. Cognitive scholars have been interested essentially in the metaphor as a
figure of thought which may be used to represent abstract entities in terms of
more concrete phenomena, but in recent decades verbal metonymy has also
been a focus of study by renowned linguists such as Barcelona (2000), Dirven and
Pörings (2002), Gibbs (1994), Panther and Thornburg (2003), Ruiz de Mendoza
(2000, 2002, 2011) and Taylor (2002), among others. While these authors have
based their research on verbal manifestations of language, Forceville (1996, 2002,
2006, 2009, 2010) has extensively studied the potential of visual metaphor and
metonymy in multimodal discourses.

doi 10.1075/bct.78.08moy
© 2015 John Benjamins Publishing Company

In his several works on multimodal tropes, Forceville (1996, 2006, 2009, 2010)
has emphasized repeatedly that visual tropes may behave differently depending
on the genre in which they are used. Therefore, it is important for the theorization
of visual metaphors and metonymies to consider their occurrence in a variety of
genres and analyze how these create meaning in specific contexts. Specifically,
this article aims to explore the occurrence of visual metaphors/metonymies in
children’s picture books in order to study how visual tropes may contribute to
children’s understanding of the stories and, in turn, draw their attention towards
relevant aspects of the plot.
Forceville (1996, 2002) has analysed the manifestations of non-verbal met-
aphors in advertising discourse within a cognitive linguistic framework, high-
lighting that conceptual metaphors can be manifested not only verbally, but also
non-verbally and multimodally. He defines a multimodal metaphor as a cognitive
process in which the target and the source domains are represented completely
or predominantly in different modes (Forceville, 2006, p. 384). Through meta-
phor an entity or concept provides mental access to another phenomenon which
belongs to a different conceptual domain. In the sample texts that are studied in
this article, no visual metaphors have been identified. This is most likely due to
the fact that they are picture books intended for children under nine (Cerrillo &
Yubero, 2007; Moya & Ávila, 2009), whose cognitive abilities are still in the process
of development. Consequently, these young readers are not yet cognitively mature
enough to understand the communicative potential of metaphorical constructions.
Nevertheless, the illustrators have used monomodal visual metonymies to
transmit relevant aspects of the narrative content of the stories to the young readers
through the visual mode1 (Forceville, 2009). Two well-known 20th-century
children’s picture books, which respond to a standard of literary quality (Cerrillo
& Yubero, 2007), have been selected as sample texts to study the trope of visual
metonymy. In the selected stories, the texts, as well as the illustrations, play a
fundamental role in the construction of the plot, as is usually the case in tales
intended for young children between 0 and 9 years of age (Cerrillo & Yubero,
2007; Hunt, 2004; Moya & Ávila, 2009). The two books selected for analysis are
Gorilla (Browne, 2002[1983]), written and illustrated by Browne, and The Tale of
Peter Rabbit (Potter, 2002[1902]), written and illustrated by Potter. These stories
are contemporary classics, that is, works that can be considered models to imitate
due to their notable literary quality. They have remained successful among
children for generations after they were created.
The article is structured in the following way. After the introduction, in
Section 2 the main features of the concept of visual metonymy are outlined.

Forceville’s (2009) notion of visual metonymy, studied within the framework of
multimodal cognitive linguistics, is the main approach adopted here to demonstrate
the occurrence of this non-verbal trope in the two tales that form the sample
texts (Forceville, 2009, 2010; Forceville & Urios-Aparisi, 2009). In Section 3,
I explore the discourse functions of these metonymies in the two picture books
selected for analysis. Finally, in the conclusion, the data extracted from the em-
pirical study are interpreted in functional terms. This last part makes evident how
visual metonymies are useful strategies to convey representational meaning and
create engagement in picture books.

2. The concept of visual metonymy

There is a general consensus among scholars working in cognitive linguistics
that human thought is essentially metaphorical; we systematically understand
and experience abstract phenomena in terms of concrete elements (Lakoff &
Johnson, 1980). In recent decades metonymy has also attracted the interest of
linguists, who have become more aware of the potential of this trope, still considered
the younger sister of metaphor (Forceville, 2009), to convey meaning and
represent reality. Through metaphor, an entity or a fact is understood and represented
mentally via another referent or concept that belongs to a different domain
(Barcelona, 2000; Dirven & Pörings, 2002; Gibbs, 1994; Lakoff, 1987; Lakoff &
Johnson, 1980, 1999; Panther & Radden, 1999; Turner, 1996; Yu, 1998, 2009).
In the case of metonymy, the source and target, however, are part of the same
conceptual domain. As Forceville (2009, p. 56) points out: “In short, in meta-
phor we get A-as-B; in metonymy B-for-A.” Renowned cognitive linguists such as
Barcelona (2000) and Taylor (2002) have studied metonymy in verbal manifestations
of language. This work, together with the research carried out by
Barcelona (2000), Kövecses and Radden (1999) and Ruiz de Mendoza and Díez
(2002), who have shown the full potential that arises from the interaction
of metaphor and metonymy in discourse, has increased research on metonymy
within the cognitive framework.
Whilst cognitive scholars have based their research on metaphor and metonymy
within verbal manifestations of language, the meaning potential of
non-verbal manifestations of these tropes still remains largely unexplored. One
exception to this is Forceville’s (1996, 2002, 2006, 2009, 2010) work on pictorial
metaphor and visual metonymy in advertising discourse and films. As Forceville
points out, metaphor and metonymy do not only occur in language; these tropes
can also come about in other semiotic representations which are beyond verbal
modes. He explains the necessity of dealing with non-verbal tropes as follows: “If,
as Lakoff and Johnson famously claim, ‘the essence of metaphor is understand-
ing and experiencing one kind of thing in terms of another’ (1980, p. 5), it is
inevitable that scholars investigate other modes/modalities than language alone”
(Forceville, 2010, p. 57). In line with Lakoff and Johnson (1980, p. 36), Forceville
(2009, p. 69) considers that metonymy fulfils a referential function as it involves
a cognitive process by means of which, in a specific context, we use an entity to
stand for another that belongs to the same conceptual domain. In his analysis of
advertising discourse, Forceville (2009, p. 69) demonstrates that context is crucial
to understand the metonymic relationship that is established between a target and
a source domain.
Following Bartsch (2002, pp. 50–51) and Warren (2002, p. 123), Forceville
(2009, pp. 57–58) also acknowledges that there is always a reason for a speaker/
writer to use a metonymy in a specific context and that this reason can be ex-
plained in terms of relevance (Sperber & Wilson, 1995) and communicative in-
tentions (Gibbs, 1999). In fact, metonymies are frequently used to highlight some
aspect of the message and attract the reader’s attention to relevant parts of a mul-
timodal ensemble. Forceville (2009, p. 58) affirms that, “the choice of metonymic
source makes salient one or more aspects of the target that otherwise would not,
or not as clearly, have been noticeable, and thereby makes accessible the target
under a specific perspective […].” The use of a metonymy often implies a change
in salience or perspective.
Along this line of thought, Yu (2009) and Urios-Aparisi (2009) have
studied the non-verbal manifestations of metaphor and metonymy in educational
TV advertisements and TV commercials, respectively. In turn, Hidalgo &
Kraljevic-Mujic (2011) have analyzed the role of multimodal metaphors and
metonymies in ICT advertising discourse. In addition, Pérez (2011, 2013a, 2013b,
2014) attempts to theorise the concept of metonymy in combination with
notions from picture/film theory by analysing greenwashing and environmental
campaigns in advertising discourse. Although she is in the initial stages of
her research, by adopting Ruiz de Mendoza & Díez’s (2002) and Urios-Aparisi’s
(2009) accounts of metaphor-metonymy interaction as well as Forceville’s (2002)
multimodal proposal on pictorial metaphor, she has found that metaphor and
metonymy play a key role in the construction of persuasive meaning in advertising
discourse. The current article attempts to contribute to the research on non-verbal
metonymy in discourse by exploring the potential of this trope in
a sample of two picture books intended for children. My interest resides essentially
in researching how visual metonymy contributes to representing the narrative
reality that visual artists try to transmit to their young readers in picture books.

3. The analysis

In this section, I will analyze the visual metonymies used in the sample texts
to represent the narrative reality. Once the metonymies utilised
are identified, I will discuss the communicative functions they fulfil in their specific
stories and the effect they have in conveying representational and interactive
meanings. Due to space restrictions, only four plates have been reproduced here.

3.1 Gorilla

Gorilla by Anthony Browne is the first picture book analysed in this article.
The story is constructed on the basis of the relationship established between
Hannah, the protagonist, her real father and a gorilla, who acts as a stand-in father.
Hannah is a lonely girl who is looking forward to seeing a real gorilla. Her
father, however, does not have enough time to take her to the zoo. The night be-
fore her birthday, something amazing happens. During the night, the toy gorilla
she gets as a birthday present comes to life and takes Hannah to the zoo to see the
gorillas. In the morning her real father also invites Hannah to go to the zoo, add-
ing a happy ending to the story.
The trope of visual metonymy plays a key role in the construction of the
narrative reality transmitted in this picture book. Sometimes, the protagonists
referred to in the verbal language are reflected in the illustrations through part/
whole metonymies. Eight visual metonymies have been identified in Gorilla, and
they help to emphasize some important aspects of the story’s plot. In each of
these, the target and the source belong to the same conceptual domain. In double
spread 9, for example, reproduced here as Plate 1, there are two close-ups that
show the faces of two primates.2 The primates are represented through a visual metonymy,
as one part (their heads) stands for the whole (the animals themselves),
attracting the reader’s attention in a special way. The use of visual metonymy in
these demand images, those that establish eye-contact between the represented
participants and the viewer (Kress & van Leeuwen, 2006, p. 118), encourages the
child-reader’s empathy with the two characters depicted, in this case an orang-utan
and a chimpanzee. The faces of the primates are the focus of the two illustrations,
and through them the illustrator reveals their sadness behind the bars of
their cages. Although they are wild creatures, the illustrator gives them humanised
features, as they show a certain sadness in their eyes. This may be due to
their lack of freedom. In both cases, Browne uses frames within frames since the
Represented Participants (RPs) are presented within square and rectangular enclosures
and covered by bars, which in turn are metonyms for cages. The pictures
clearly reflect that both animals are looking directly at the viewer, seeking support
and perhaps even imploring the reader to free them from their imprisonment. As
Gill (2002, pp. 57–58) acknowledges, “the effect of their direct engagement with the
viewer is evocative, with the viewer almost feeling the animals beseeching them
to free them. The fact the animals are depicted at close range enhances this effect,
since the reader is positioned to engage with them on an intimate level.” In picture
books, this type of metonymic demand image is not common, as its utilization
usually interrupts the development of the narrative plot. However, Browne has
used them to achieve a strong engagement between the RP and the child, and
forge the identification of the latter with the animals.
Apart from these two part-whole metonymies (face-character), which show
the sad faces of the two primates locked in cages, there are also another two close-
ups in double spread 13 that reveal the face of Hannah smiling with the gorilla. In
this case, the reference to their faces is again a metonymy for the characters them-
selves, who communicate through visual language the way they feel. Here Hannah
is looking at a toy gorilla and the viewer can only see the character’s head. Hannah
is lying down in bed almost covered up to her head. Only her hair, one of her eyes
and her nose can be seen. Through this part-whole metonymy, Browne places
emphasis on the visual contact between Hannah and the gorilla, since the met-
onymic representation of the characters establishes the gazes that are exchanged
between them. This clearly contrasts with the lack of visual and verbal communi-
cation that characterizes the relationship between the main character and her real
father, as they never make eye contact with each other (Gill, 2002; Moya, 2011a).
In this way, the metonymy suggests an intimate/personal relationship that pro-
duces the effect of close proximity between Hannah and her substitute father, the
gorilla. Thus, while Hannah, the primates at the zoo and the gorilla are sometimes
depicted showing their feelings to the child-reader, Hannah’s father never appears
in a close-up or reveals his emotions. In fact, he is never depicted in a demand
image or through a metonymy that focuses the attention of the child-viewer on
his facial expression. By contrast, the gorilla, the primates at the zoo and Hannah
are all depicted in demand and metonymic images, which give the reader access
to their feelings. It seems evident that the trope of metonymy contributes to the
establishment of greater engagement between the main characters in the story,
as well as between them and the young reader since it allows for Hannah’s or the
gorilla’s facial expressions to be shown clearly.

Plate 1. The orang-utan and the chimpanzee in their cages3

Another metonymy that is worthy of comment is found on the right-hand side
of double spread 5, reproduced here as Plate 2. Hannah is represented through the
mound that her knees form under her bedding, a part for the character in her
totality. This metonymy emphasizes Hannah’s small size and lack of power, espe-
cially if she is compared with the huge dimensions of the gorilla that is emerging
at the foot of her bed. In addition, through this metonymy Browne focuses the at-
tention of the child on the real gorilla who, after this illustration, shares a leading
role in the story with Hannah. By depicting the animal as a huge primate, Browne
makes the power of the enormous gorilla over Hannah evident and emphasises
the visual contact established between the two characters – the girl and the substitute
father. This prominence given to the gorilla is achieved through the use of low
angles, monomodal visual metonymies and visual focalisations,4 which present
the character to the reader from the protagonist’s visual perspective, as a powerful
primate of great size (Moya, 2011a). The gorilla is seen from Hannah’s perspective
as a huge and powerful creature embodying the characteristics of protection that
are typically associated with fatherhood.

Plate 2. The toy gorilla has become real5

The main protagonist is on one occasion represented in an abstract way in the
illustrations through metonymy. On the left-hand side of double spread 14, reproduced
here as Plate 3, for example, when Hannah is rushing downstairs to tell
her father what had happened during the night, the metonymy, part of Hannah’s
robe for Hannah, represents the protagonist in a more abstract way. The function
of the trope in this specific case is to accentuate the sensation of speed (a flash of
her red robe speeding down the stairs is enough to refer to Hannah in her totality),
supported by the material process “rush”, referred to in the verbal component:
“Hannah rushed downstairs to tell her father what had happened.” The reader can
identify the blur of red as Hannah, since in the following illustration printed on
the same double spread, Hannah appears sitting on a chair in front of her father
with a red robe on.

Plate 3. Hannah is rushing downstairs6

3.2 The Tale of Peter Rabbit

I will now identify the metonymies that play a key role in the construction of real-
ity in the thirty-two double spreads that are contained in The Tale of Peter Rabbit,
in which verbal and visual elements are intertwined in the verso and the recto of
the pages that make up the story. Before identifying the visual metonymies found
in them, I will briefly refer to the plot of the story. Peter disobeys his mother and
trespasses on Mr McGregor’s garden where he greedily enjoys a feast of vegetables
until he feels sick. He is almost caught by the owner of the farm, Mr McGregor,
and runs the risk of losing his life on several occasions. After some risky encoun-
ters with Mr McGregor, Peter manages to escape from the garden and comes back
home where his mother gives him a dose of chamomile tea to relieve his pain.
Meanwhile, Peter’s sisters, Flopsy, Mopsy, and Cotton-tail, who have been good
girls, have bread, milk and blackberries for supper.
The first metonymy is identified in the first illustration of the tale and fulfils
the discourse function of introducing the main character to the reader, making a
specific feature of his personality evident. In the first double spread, the rabbits are
located near the tree trunk where they live. Mrs Rabbit, the mother of the litter,
looks directly at the viewer, inviting him into the story and introducing
her children, as the verbal language also does through a there-construction:
“ONCE UPON A TIME there were four little Rabbits, and their names were –
Flopsy, Mopsy, Cotton-tail and Peter”. In this scene, although both the verbiage
and the illustration are essential to the creation of the story, the illustration
conveys more relevant information about Peter’s personality than the words do
(Moya, 2010). Peter is represented through a visual metonymy: Peter’s tail is metonymic
for Peter. The picture clearly reflects that Flopsy, Mopsy and Cotton-tail
do not share the same personality as the main character. While his sisters show
their heads to the reader, Peter is playing in the burrow and only his backside can
be seen. He is absorbed in his own world and reveals a different attitude. Thus, the
visual component seems to anticipate the rebellious nature of the main character, who
later trespasses on Mr McGregor’s garden, not heeding his mother’s advice.
Another interesting metonymy, the ears of the rabbit for Peter, is found in
double spread 19. The illustration reflects that Mr McGregor and Peter are in the
shed. Peter is hidden in a watering can. Only his ears can be seen. Presumably
the part-whole metonymy, ears for Peter, shows that the rabbit, aware of the risk
he is running, is trying to hide from the old man, who wants to capture him and
probably turn him into a rabbit pie. At least this was what happened to his father
when he broke into the farmer’s fields in the past. Mrs. Rabbit reminds Peter of
this ordeal before she sends her litter into the fields: “Your Father had an accident
there; he was put in a pie by Mrs. McGregor” (double spread 3). In turn, this
metonymy puts the child-viewer in a position of dominance as he knows more
about what is going on than either Peter, the main character, or Mr McGregor.
The enemy of the rabbit, Mr McGregor, unlike the child-viewer, does not know
where the protagonist is hidden (Moya, 2010). Through this metonymy, Potter
manages to put the child-viewer on the protagonist’s side, since the main charac-
ter involves him directly in the plot. This picture book was initially intended for
children of the English middle class in the Victorian era. This period was charac-
terised by strict and conservative manners in court and in children’s education. In
line with the moralising literature addressed to children that Potter was familiar
with, the author, who also doubles as illustrator, probably followed the ideolog-
ical requirements of the Victorian period. And, indeed, some moralistic values
predominate in the verbal narrative: whilst Peter’s sisters behave properly and,
consequently, are rewarded at the end of the tale with a nice supper, Peter, after
having disobeyed his mother, is almost killed by Mr McGregor and ends up with a
stomach-ache. However, the tale seems to be more than a story in which a charac-
ter has exposed himself to a risky situation by disobeying his mother’s advice. As
has been observed in the critical literature (Carpenter, 1989, p. 279; Scott, 2001,
p. 29), Potter’s voice seems to be that of a rebel in defence of liberty and natural
instinct (Moya, 2010). What the author may not be able to express in words is re-
flected in the visual component of the story. Despite Peter’s disobedience, Potter’s
visual techniques move the reader to be on the side of the defenceless and scared
rabbit and also perhaps to wish for his escape from his oppressor. Therefore, the
narrative function of this second metonymy is to encourage the child’s empathy
towards the main character in the story.
Finally, illustration 20, reproduced here as Plate 4, contains another metonymy,
also used by Potter to bring the child-viewer onto the side of the protagonist.
The large foot of Mr McGregor is just about to step on the little rabbit.
Once again there is a metonymy that reflects a part (Mr McGregor’s boot) for a
whole (the old man), while at the same time intensifying the narrative tension
and the immediacy of the threat. Peter is about to be stepped on by his aggressor
near a small window, also reflected metonymically. The fact that Mr McGregor is
reflected through metonymy gives a greater sense of tension in the plot, as the
young reader can contemplate how close the protagonist, the little rabbit, is to
being trapped by the adult. If Mr McGregor had been represented completely, the
image, without a doubt, would offer less narrative tension and the danger would
be less imminent. In turn, the part-whole metonymy, foot for Mr. McGregor,
is probably used by Potter to dehumanize the old man, who is depicted more
as an entity of destruction than as a human being. The studs on the sole of Mr
McGregor’s boot contribute to representing the farmer as a dangerous man who
is utterly devoid of human feelings.

Plate 4. Peter is in danger7

4. Conclusions

In this article, I aimed to demonstrate how visual metonymies contribute to the
representation of reality in two picture books intended for young readers. The
results of the analysis show that visual metonymies of the type a part (face, boot,
garment) for the whole (the fictional characters) are essentially used in children’s
tales with two main functions: firstly, to create narrative tension and emphasise
some stages of the story and, secondly, to establish a bond between the characters
of fiction and the child-viewer.
In Gorilla, for example, the part-whole metonymy, heads of the primates at
the zoo for the animals as a whole, fulfils the function of encouraging the child-
viewer’s empathy for the two imprisoned creatures. As the animals are reflected
through metonymies and demand images, the illustrator attracts the viewers’ at-
tention to their corresponding faces and through them he reveals their feelings
of sadness. Similarly, when Hannah and the gorilla are represented in the visual
component by their faces looking at each other, this metonymy creates a visual
link between Hannah and her substitute father and produces the effect of an in-
timate relationship between them. This also simultaneously establishes a contrast
with the lack of contact that exists between the girl protagonist and her real father,
who is always presented as busy and tired and unable to keep eye-contact with his
daughter or devote time to her (Moya, 2011a). The engagement that is established
between Hannah and the viewer of the tale is also achieved through monomodal
visual metonymies. When the toy gorilla becomes real, Hannah is represented
through the bumps her knees make under the duvet of her bed. This metonymic
relationship shows Hannah’s fears and emphasizes the girl’s small size compared
to the gorilla that gets bigger and bigger, suggesting the power of the primate
over the frightened girl. In this way, the illustrator encourages the identification
between the child-reader and the main character in the story, depicted as a young
girl, probably of the same age as the prospective readers of the tale.
In The Tale of Peter Rabbit, Beatrix Potter also uses monomodal visual me-
tonymies to bring the young reader closer to the fictional world of the characters
in the story. So, when Peter is hiding in the watering can, a part-whole metonymy,
ears for Peter, manages to create engagement between the character of fiction and
the young child. This source-in-target metonymy gives the viewer power over
Mr McGregor, the adult that wishes to end the life of the story’s protagonist. The
young reader can identify Peter and knows the exact place where he is hiding,
since his large ears stick out of the watering can where the rabbit is. Nonetheless,
Mr McGregor appears to be looking for him inside the tool shed without any
luck. In this manner Potter encourages the readers to empathise with the protagonist
of the story, since she makes them accomplices in the predicament
that Peter is in at this narrative moment. At other times, Peter’s antagonist is
represented through a visual metonymy in order to create narrative tension. Evidence
of this is the metonymy Mr McGregor’s boot for Mr McGregor, when he is
just about to step on Peter before he escapes from the tool shed where he tried to
hide to avoid being caught by the old man. This metonymy intensifies the narrative
tension in the story as it shows that Peter is just about to be stepped on by one
of the old man’s big feet.
To the two main functions of visual metonymies already referred to, another
function may be added. The utilization of visual metonymies may encourage, in various
ways, the interaction of the child-viewer with the adult, be it teacher, parent
or grandparent. The kind of metonymic depiction used in the tales studied in
this article may lead to questions such as: (i) “Where is Peter in this picture? What is
this (shoe) on the right-hand side of the picture, and whose is it?” (The Tale of Peter
Rabbit), and (ii) “What is this red blob in the left-hand part of the illustration? Is
it Hannah or someone else, and how do you know?” (Gorilla). This way the child
will be socialized into the reading experience and will adopt an active role in the
understanding of visual language. In addition, by being involved in this dialogic
linguistic experience the child reinforces his apprenticeship into the dialogic nature
of language, where the mutual completion of utterances plays an important
role (Purver, Howes, Healey, & Gregoromichelaki, 2009). Thus, the pictures in
the two tales do not serve to merely illustrate the stories, as they do in books for
older children, but to involve the child-viewer in dialogic interaction with the
adult (Moya, 2011b).
The analysis carried out also reveals that, although Forceville (2009, 2010)
distinguishes different types of metonymies (producer for product, object for
user, controller for controlled, institution for people responsible, the place for the
institution, the place for the event, etc.), the most common, at least in our sample
texts, is that in which a part stands for the whole. The part-whole metonymy is
based on the premise that we perceive paintings or photographs not by looking
at their individual elements, but discursively, that is, their elements are contem-
plated as arranged into a visual syntax and constituting a whole (Bal, 2006, p. 69).
Thus, when we look at a visual composition, any element can act metonymically
and one of its parts can stand for the whole. The characters are not always repre-
sented totally in the visual mode; in some stages of the tales, only some parts of
their bodies are drawn to refer to them in general. These visual representations
emphasise some key aspects of the plot of the stories, making them more notice-
able, as the analyses carried out in Section 3 demonstrate.


In addition, the results of the study also show that, in agreement with
Forceville’s (2009, p. 71) findings, the non-verbal metonymies found in the sample
of picture books are of the type source-in-target rather than target-in-source me-
tonymies. Ruiz de Mendoza (2000) and Ruiz de Mendoza and Díez (2002) propose
a distinction between target-in-source and source-in-target metonymies. Whilst
in the former a superordinate or matrix domain (source) stands for a subdomain
(target), in the latter a subdomain (source) establishes a metonymic relationship
with a matrix domain (target) through an expansion process. Ruiz de Mendoza
and Díez (2002, pp. 495–499) point out that the first type of metonymy involves a
“domain reduction” while the second, source-in-target metonymy, is understood in
terms of “domain expansion.” Picture books seem to adapt to the model of source-
in-target domain, as parts of the body of the represented participants depicted in
the stories stand for their whole. By means of these “domain expansions” (Ruiz
de Mendoza & Díez, 2002, pp. 495–499), the illustrator highlights some relevant
aspects of the plot and creates a bond between the characters of fiction and the
child-viewer.
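
The distinction between the two metonymy types can be made concrete with a minimal Python sketch. It assumes, purely for illustration, that domains are named by strings and that part-whole membership is recorded in a hypothetical `SUBDOMAINS` table. Neither the table nor the function name comes from Ruiz de Mendoza and Díez (2002), and the only entry is the boot example from The Tale of Peter Rabbit.

```python
# Illustrative sketch only: the table and the function name are assumptions,
# not part of Ruiz de Mendoza and Díez's (2002) account.

# Hypothetical record of a matrix domain and one of its subdomains.
SUBDOMAINS = {"Mr McGregor": {"Mr McGregor's boot"}}


def classify_metonymy(source: str, target: str) -> str:
    """Classify a metonymy in which `source` stands for `target`."""
    if source in SUBDOMAINS.get(target, set()):
        # A subdomain stands for its matrix domain: domain expansion.
        return "source-in-target"
    if target in SUBDOMAINS.get(source, set()):
        # A matrix domain stands for one of its subdomains: domain reduction.
        return "target-in-source"
    return "unclassified"


print(classify_metonymy("Mr McGregor's boot", "Mr McGregor"))  # source-in-target
```

Reversing the two arguments yields "target-in-source", the pattern that the sample picture books do not appear to use.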
Writers and illustrators need to know how visual metonymies may be ex-
ploited to create interaction and represent the narrative reality without surpass-
ing the cognitive capacities of the children for whom the stories are written and
illustrated. Knowing how to create narrative tension and how to generate interac-
tion between the characters in tales and the child-viewer by means of visual me-
tonymies can stir children’s interest in picture books and, in turn, help educators
encourage children to read.

Notes

1. Forceville (2009, pp. 62–63) distinguishes between monomodal and multimodal metonymies.
While the former are purely visual, the latter involve a referential process in which the
source and the target domains are manifested in different modes. This distinction derives from
a previous classification that he applied to the trope of pictorial metaphor (Forceville, 1996).

2. Close-ups show the head and shoulders or even less; sometimes only the face is visible
(Kress & van Leeuwen, 2006).

3. Illustration from GORILLA by Anthony Browne. Illustrations © 1983 Anthony Browne.
Reproduced by permission of Walker Books Ltd, London SE11 5HJ. www.walker.co.uk

4. It is interesting to note that while Hannah and her father are often depicted from a high
angle, the gorilla is the only represented participant that is shown from a low angle, suggest-
ing the power he has over Hannah. The use of high angles proves Hannah’s vulnerability at
home when she feels lonely and is not capable of attracting her father’s attention. The vertical
angle transmits two types of power relationships, that between the RPs and the viewer, and that
between the RPs within an image (Kress & van Leeuwen, 2006). We can be positioned from a
high, low or eye-level angle. When the point of view is arranged upwards or downwards along a
vertical axis, an increase or a diminution of power over the RPs can be experienced: the viewer
has power over the RP if it is projected from a high angle. However, the RP has power over the
viewer if seen from a low angle. Finally, RPs aligned at eye level angles have equal power status
with their viewers (Kress & van Leeuwen, 2006).

5. Illustration from GORILLA by Anthony Browne. Illustrations © 1983 Anthony Browne.
Reproduced by permission of Walker Books Ltd, London SE11 5HJ. www.walker.co.uk

6. Illustration from GORILLA by Anthony Browne. Illustrations © 1983 Anthony Browne.
Reproduced by permission of Walker Books Ltd, London SE11 5HJ. www.walker.co.uk

7. Illustration from THE TALE OF PETER RABBIT by Beatrix Potter. Copyright ©
Frederick Warne & Co. 1902, 2002. Reproduced by permission of Frederick Warne & Co.
www.peterrabbit.com

References

Barcelona, A. (Ed.). (2000). Metaphor and metonymy at the crossroads: A cognitive perspective.
Berlin/New York: Mouton de Gruyter.
Bartsch, R. (2002). Generating polysemy: Metaphor and metonymy. In R. Dirven & R. Pörings
(Eds.), Metaphor and metonymy in comparison and contrast (pp. 49–74). Berlin/New York:
Mouton de Gruyter.
Bal, M. (2006). Reading ‘Rembrandt’: Beyond the word-image opposition. Amsterdam: Amster-
dam University Press.
Browne, A. (2002[1983]). Gorilla. London: Walker Books.
Carpenter, H. (1989). Excessively impertinent bunnies: The subversive element in Beatrix Pot-
ter. In G. Avery & J. Briggs (Eds.), Children and their books: A celebration of the work of Iona
and Peter Opie (pp. 271–298). Oxford: Clarendon Press.
Cerrillo, P., & Yubero, S. (2007). Qué leer y en qué momento [What to read and when]. In
P. Cerrillo & S. Yubero (Eds.), La formación de mediadores para la promoción de la lectu-
ra. Segunda Edición [Training specialists in the promotion of reading. Second edition]
(pp. 285–293). Cuenca: Servicio de Publicaciones de la Universidad de Castilla-La Mancha.
Dirven, R., & Pörings, R. (Eds.). (2002). Metaphor and metonymy in comparison and contrast.
Berlin/New York: Mouton de Gruyter.
Forceville, C. (1996). Pictorial metaphors in advertising. London/New York: Routledge.
DOI: 10.4324/9780203272305
Forceville, C. (2002). The identification of target and source in pictorial metaphors. Journal of
Pragmatics, 34, 1–14. DOI: 10.1016/S0378-2166(01)00007-8
Forceville, C. (2006). Non-verbal and multimodal metaphor in a cognitive framework. Agen-
das for research. In G. Kristiansen, M. Achard, R. Dirven & F. Ruiz de Mendoza (Eds.),
Cognitive linguistics: Current applications and future perspectives (pp. 379–402). Berlin/
New York: Mouton de Gruyter.


Forceville, C. (2009). Metonymy in visual and audiovisual discourse. In E. Ventola & A.J. Moya
(Eds.), The world told and the world shown: Multisemiotic Issues (pp. 57–74). Basingstoke/
New York: Palgrave Macmillan.
Forceville, C. (2010). Why and how study metaphor, metonymy, and other tropes in multimodal
discourse? In R. Caballero & M.J. Pinar (Eds.), Ways and modes of human communication
(pp. 57–76). Cuenca: Ediciones de la Universidad de Castilla-La Mancha.
Forceville, C., & Urios-Aparisi, E. (Eds.). (2009). Multimodal metaphor. Berlin/New York:
Mouton de Gruyter. DOI: 10.1515/9783110215366
Gibbs, R. (1994). The poetics of mind: Figurative thought, language, and understanding.
Cambridge: Cambridge University Press.
Gibbs, R. (1999). Intentions in the experience of meaning. Cambridge: Cambridge University
Press.
Gill, T. (2002). Visual and verbal playmates: An exploration of visual and verbal modalities in
children’s picture books. Unpublished B.A. (Honours), University of Sydney, Australia.
Hidalgo, L., & Kraljevic-Mujic, B. (2011). Multimodal metonymy and metaphor as complex
discourse resources for creativity in ICT advertising discourse. Review of Cognitive Lin-
guistics, 9(1), 153–178. DOI: 10.1075/rcl.9.1.08hid
Hunt, P. (Ed.). (2004). International companion encyclopedia of children’s literature. Second Edi-
tion. Volume I. London: Routledge.
Radden, G., & Kövecses, Z. (1999). Towards a theory of metonymy. In K.U. Panther &
G. Radden (Eds.), Metonymy in Language and Thought. Amsterdam: John Benjamins.
DOI: 10.1075/hcp.4.03rad
Kress, G., & van Leeuwen, T. (2006 [1996]). Reading images. The grammar of visual design.
London: Routledge.
Lakoff, G. (1987). Women, fire, and dangerous things: What categories reveal about the mind.
Chicago: University of Chicago Press. DOI: 10.7208/chicago/9780226471013.001.0001
Lakoff, G., & Johnson, M. (1980). Metaphors we live by. Chicago: University of Chicago Press.
Lakoff, G., & Johnson, M. (1999). Philosophy in the flesh: The embodied mind and its challenge
to western thought. New York: Basic Books.
Moya, A.J. (2010). A multimodal analysis of Tale of Peter Rabbit within the interpersonal meta-
function. ATLANTIS, 32(1), 123–140.
Moya, A.J. (2011a). Engaging readers through language and pictures. A case study. Journal of
Pragmatics, 43(12), 2982–2991. DOI: 10.1016/j.pragma.2011.05.012
Moya, A.J. (2011b). A bimodal and systemic-functional study of Dear Zoo within the textual
metafunction. Revista Canaria de Estudios Ingleses, 62, 123–138.
Moya, A.J., & Ávila, J.A. (2009). Thematic progression of children’s stories as related to different
stages of cognitive development. Text and Talk, 29(6), 755–774.
DOI: 10.1515/TEXT.2009.038
Panther, K.U., & Radden, G. (Eds.). (1999). Metonymy in language and thought. Amsterdam/
Philadelphia: John Benjamins. DOI: 10.1075/hcp.4
Panther, K.U., & Thornburg, L.L. (Eds.). (2003). Metonymy and pragmatic inferencing.
Amsterdam/Philadelphia: John Benjamins. DOI: 10.1075/pbns.113
Pérez, P. (2011). Don’t be so green. Analysis of the interaction between multimodal metaphor
and metonymy in greenwashing advertisements. Available at:
http://sites.google.com/site/metaphormetaphor2011/abstractscontributedspeakers/metaphorandmedia.
Logroño: University of La Rioja.


Pérez-Sobrino, P. (2013a). Metaphor use in advertising: analysis of the interaction between
multimodal metaphor and metonymy in a greenwashing advertisement. In E. Gola &
F. Ervas (Eds.), Metaphor in Focus. Philosophical Perspectives on Metaphor Use (pp. 67–82).
Cambridge: Cambridge University Press.
Pérez-Sobrino, P. (2013b). Humanimals. What do multimodal metaphor and metonymy reveal
about meaning creation in environmental advertising? Proceedings of XXX International
Conference of the Spanish Association of Applied Linguistics (pp. 402–408). Lerida: Univer-
sity of Lerida UP.
Potter, B. (2002[1902]). The Tale of Peter Rabbit. In F. Warne (Ed.), Beatrix Potter. The complete
tales. The original and authorized edition (pp. 9–20). London: Penguin.
Purver, M., Howes, C., Healey, P.G., & Gregoromichelaki, E. (2009). Split utterances in dialogue:
a corpus study. Available at:
http://www.eecs.qmul.ac.uk/~mpurver/papers/purveretal09sigdialcorpus.pdf.
Ruiz de Mendoza, F.J. (2000). The role of mapping and domains in understanding metonymy.
In A. Barcelona (Ed.), Metaphor and metonymy at the crossroads: A cognitive perspective
(pp. 109–132). Berlin/New York: Mouton de Gruyter.
Ruiz de Mendoza, F.J. (2002). From semantic underdetermination, via metaphor and metony-
my to conceptual interaction. Theoria et Historia Scientiarum, An International Journal of
Interdisciplinary Studies, 6(1), 107–143.
Ruiz de Mendoza, F.J. (2011). Metonymy and cognitive operations. In R. Benczes, A. Barcelona
& F.J. Ruiz de Mendoza (Eds.), Defining metonymy in cognitive linguistics: Towards a con-
sensus view (pp. 103–124). Amsterdam/Philadelphia: John Benjamins.
DOI: 10.1075/hcp.28.06rui
Ruiz de Mendoza, F.J., & Díez, O.I. (2002). Patterns of conceptual interaction. In R. Dirven & R.
Pörings (Eds.), Metaphor and metonymy in comparison and contrast (pp. 489–532). Berlin/
New York: Mouton de Gruyter.
Scott, C. (2001). An unusual hero: Perspective and point of view in The Tale of Peter Rabbit.
In M. Mackey (Ed.), Beatrix Potter’s Peter Rabbit: A children’s classic at 100 (pp. 19–30).
Lanham, MD: Scarecrow.
Sperber, D., & Wilson, D. (1995). Relevance: Communication and cognition. Second Edition.
Oxford: Blackwell.
Taylor, J.R. (2002). Category extension by metonymy and metaphor. In R. Dirven & R. Pörings
(Eds.), Metaphor and metonymy in comparison and contrast (pp. 323–334). Berlin/New
York: Mouton de Gruyter.
Turner, M. (1996). The literary mind. Oxford: Oxford University Press.
Urios-Aparisi, E. (2009). Interaction of multimodal metaphor and metonymy in TV commercials:
Four case studies. In C. Forceville & E. Urios-Aparisi (Eds.), Multimodal metaphor
(pp. 95–117). Berlin/New York: Mouton de Gruyter. DOI: 10.1515/9783110215366
Warren, B. (2002). An alternative account on the interpretation of referential metonymy and
metaphor. In R. Dirven & R. Pörings (Eds.), Metaphor and metonymy in comparison and
contrast (pp. 113–130). Berlin/New York: Mouton de Gruyter.
Yu, N. (1998). The contemporary theory of metaphor: A perspective from Chinese. Amsterdam/
Philadelphia: John Benjamins. DOI: 10.1075/hcp.1
Yu, N. (2009). Nonverbal and multimodal manifestations of metaphor and metonymies: A case
study. In C. Forceville & E. Urios-Aparisi (Eds.), Multimodal metaphor (pp. 119–143).
Berlin/New York: Mouton de Gruyter.



The establishment of interpretative
expectations in film

John A. Bateman and Chiao-I Tseng


Bremen University, Bremen, Germany

In this paper we show that some notions from the textual organisation of verbal
texts also appear to give insights into the organisation of films. In particular, the
beginnings of films are suggested to operate as indicators of those films’ ‘meth-
od of development’ and so serve to set up expectations for guiding hypotheses
and selective attention during film viewing. By means of a small exploratory
study, we demonstrate that film beginnings exhibit differing organisational fea-
tures that correlate with the overall narrative strategies pursued in the films as
a whole. These features may then function as useful indicators for viewers con-
cerning just what interpretative challenges they will face later in the film.

Keywords: film, text organisation, filmic discourse, thematic structure, film
beginnings, interpretation

1. Introduction: Macro-themes and film

It is a commonplace that one of the open problems in dealing with strongly mul-
timodal artefacts such as film is their reliance on combinations of very different
information channels – channels traditionally listed in the case of film as spo-
ken language, written language, visual image, music and ambient sound. This be-
comes even more challenging when we move further away from a sensory-based
view of the information contributions and consider instead, or in addition, the
multitude of semiotic modes potentially involved, such as gesture, proxemics,
colour schemes, clothing, spatial relationships, and many more (Metz, 1974;
Monaco, 2009). Given this, it is clear that viewers must be quite selective
in their allocation of attention: the dynamic unfolding of audiovisual representa-
tions in real-time would otherwise overwhelm the viewer rather than giving rise
to the broadly similar responses to film actually observed. That selectivity should
play a role here is in line with results in perceptual psychology that indicate that
“perception is selective: we attend to objects that bear salient meaning for certain
goals” (Gibson, 1979), a notion which has now also been applied in psychologically-oriented
approaches to film (Anderson, 1996; Persson, 2003; Smith, Levin,
& Cutting, 2012; Wuss, 2009). Moreover, because of the nature of film as a designed
artefact, there is good reason to believe that the perceptual guidance that
films exhibit is in many respects intended: that is, film makers explicitly construct
films precisely so that the attention of viewers is directed along paths that contribute
to desired affects.

doi 10.1075/bct.78.09bat
© 2015 John Benjamins Publishing Company

A further research question that must be raised is just how this selection and
guidance of attention comes about. Naturally the audiovisual properties of the
filmic material being experienced can guide attention in various ways – Smith
and Henderson (2008), for example, show how various aspects of movement at-
tract viewers’ attention at a very early stage in processing. The role of particular
distinctive combinations of properties for heightening suspense or emotional af-
fect has also been considered (Carroll, 2008; Grodal, 1999). Both facets build on
a long-assumed connection between film perception and natural perception (cf.
Münsterberg, 1916). While by no means denying that this connection plays a sig-
nificant role when interpreting film, the phenomenon that will be explored in this
article is rather different. We will suggest that films not only combine contribu-
tions to meaning analogous to sensory perception in the real world but that this
combination is itself strongly guided by further mechanisms that are reminis-
cent of aspects of discourse organisation revealed in the study of natural language
texts. These two perspectives on the mechanisms involved in film interpretation
and understanding have been characterized in Bateman and Schmidt (2012,
p. 142) in terms of two contrasting families of ‘codes’: the reality codes, which
build on film’s audiovisual iconic nature as perceptually real and bring to bear
all the potential interpretative practices available in the interpretation of real-life
events, and the representation codes, which are specific to artefacts which employ
textual structuring for pro-actively shaping and guiding audience response and
interpretation.
The former family of codes has naturally been given most attention by re-
searchers concerned with psychological processes and the cognitive modelling
of film; indeed, as Smith et al. (2012) make clear, the connection drawn between
film and natural perception has been cited as one of the principal motivations
for psychologists to concern themselves with film at all. In this paper, however,
we will suggest that the other family of codes, although often overlooked, is also
highly relevant for explaining viewers’ responses to films. It is not only the mo-
ment-by-moment assessment of a perceptual input that is significant, but also the
entire scaffold of potential interpretations constituting textual organisation that
must be considered.


To support this position, we will consequently consider attention guidance in
film as having more in common with linguistic information structuring, particularly
structuring at the text level, than is generally assumed. More specifically, we
will compare the beginnings of films with the textual construction of macro-themes
discussed by Martin (1992) – a possibility also suggested, for example, by Baldry
and Thibault (2006, p. 188). Thematic organisation in general, macro-themes included,
is seen within the functional approach of Martin as a general property of texts,
regardless of text genre and usage: it is one of the means available to all genres by
which particular communicative functions may be structured to be supportive of
their uptake by hearers and readers. Analogously, and as we will see below, the be-
ginnings of films have similarly been considered in terms that range across all film
genres and types of filmic artefact. We can then also explore them as a potential
site of functional discrimination that may operate for films in general.
According to the account that Martin develops, texts are organised into hier-
archically nested thematic blocks. The ‘beginning’ of each such block is made up
of a corresponding theme. Themes at higher levels ‘predict’, i.e., make expected for
recipients, the kinds of themes that will be taken up at lower levels. Crucially, this
organisation is a textual organisation rather than a content one: thematic organi-
sation provides a “scaffold” that indicates the textual method of development that a
text will follow (Fries, 1995; Ghadessy, 1995). As an example, we might consider a
segment of text beginning “There were already attempts to find a new passage to
India and China in the fifteenth century.” The unmarked new information of this
sentence includes the temporal extent in the fifteenth century and so this ‘predicts’
that one likely method of development for the text following might concern the
indicated attempts organised by their time of occurrence. This temporal method
of development can then be carried out with sentences such as: “In 1488, …”,
“In 1497, …”, etc. The leftmost elements in these sentences constitute temporal
grammatical themes in the sense of Hallidayan systemic-functional grammatical
analysis (Halliday, 1967; Halliday & Matthiessen, 2004).
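
The temporal method of development in this example can be sketched in a few lines of Python. The sketch assumes a deliberately crude stand-in for grammatical Theme analysis: a sentence counts as having a temporal theme if it opens with "In" followed by a four-digit year. The pattern and the function name are illustrative assumptions, not part of Martin's or Halliday's accounts, and the elided sentence content is left elided.

```python
import re

# Assumption of this sketch: a temporal theme is approximated as a
# sentence-initial "In <four-digit year>". Real Theme analysis is grammatical.
TEMPORAL_THEME = re.compile(r"^In \d{4}\b")


def follows_temporal_method(sentences: list[str]) -> bool:
    """Return True if every sentence opens with a temporal theme."""
    return all(TEMPORAL_THEME.match(s) is not None for s in sentences)


# The opening sentence whose new information predicts a temporal development.
opening = ("There were already attempts to find a new passage to "
           "India and China in the fifteenth century.")

# Continuations from the example above; their content remains elided.
continuations = ["In 1488, …", "In 1497, …"]

print(follows_temporal_method(continuations))  # True
```

A continuation that opened with a non-temporal element would make the check fail, which is one rough way of stating that the predicted method of development has been abandoned.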
Within Martin’s account, individual sentences have their themes predicted by
higher level ‘hyper-themes’, each of which predicts an entire sequence of thematic
selections within the individual sentences of the thematic block. Hyper-themes
may in turn be predicted by higher-level hyper-themes, continuing until we reach
the ‘text as a whole’. Martin designates the thematic material of this last, most
encompassing level, the macro-theme. It corresponds, according to Martin, to the
‘topic paragraph’ of traditional composition but, in addition to the presentation
of any content material that might be classified as the topic, also serves the crucial
textual function of predicting a range of methods of development that will be
used by the text for its textual organisation. In effect, the macro-theme, hyper-theme
and theme organisation establish a scaffold of expectations that help the
text’s recipient negotiate the complex textual structures being constructed. This
signposting function is most pronounced in more complex, written language,
since it is there that there is most need for guidance concerning how a text is go-
ing to develop – there is little opportunity in written text to interrupt and reorient
oneself as to where the text producer is attempting to go.
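
The nesting of macro-theme, hyper-theme and theme can be pictured as a simple tree. The following Python sketch is one possible rendering under stated assumptions: the class name `ThematicBlock`, the `scaffold` function and the short labels are hypothetical, and the example reuses the passage-to-India illustration rather than any analysed text.

```python
from dataclasses import dataclass, field


@dataclass
class ThematicBlock:
    """A block whose theme predicts the themes of the blocks nested in it."""
    level: str   # "macro-theme", "hyper-theme" or "theme"
    label: str   # hypothetical shorthand for the thematic material
    children: list["ThematicBlock"] = field(default_factory=list)


def scaffold(block: ThematicBlock, depth: int = 0) -> list[str]:
    """Flatten the hierarchy into indented lines, outermost level first."""
    lines = ["  " * depth + f"{block.level}: {block.label}"]
    for child in block.children:
        lines.extend(scaffold(child, depth + 1))
    return lines


text = ThematicBlock("macro-theme", "text as a whole", [
    ThematicBlock("hyper-theme", "attempts in the fifteenth century", [
        ThematicBlock("theme", "In 1488"),
        ThematicBlock("theme", "In 1497"),
    ]),
])

print("\n".join(scaffold(text)))
```

Reading the printed lines from top to bottom follows the direction of prediction: each level sets up expectations for the level indented beneath it.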
Transferring this kind of organisation back to the situation with film is sug-
gestive for several reasons. Given the broader range of material available for sig-
nalling structural relationships in film, one logical hypothesis is that this might
be put to good effect, particularly at the beginnings of films, for establishing a
scaffold of expectations for recipients concerning how the film is going to unfold
subsequently. These expectations could then be used to pre-structure interpreta-
tions and to serve as a convenient source of guidance concerning where attention
should be paid, thus marking a bridge, or cross-over point, between the textual
organisation provided by the representation codes and the operation of the per-
ceptual system.
A special role for the beginnings of films has often been suggested in film
theory. As Bordwell, for example, states:
The sequential nature of narrative makes the initial portions of a text crucial for
the establishment of hypotheses. A character initially described as virtuous will
tend to be considered so even in the face of some contrary evidence; the initial
hypotheses will be qualified but not demolished unless very strong evidence is
brought forward. (Bordwell, 1985, p. 38)

This function of the initial portions of a film is also often seen in terms of psy-
chological processes such as the primacy effect and priming. This is presumed to
charge the material which is first encountered with a particular salience for pro-
viding a frame for interpretation for what follows (cf. Luchins & Luchins, 1962).
The broadest and most detailed account of film beginnings to date is that set
out by Hartmann (2009). Hartmann identifies several perspectives that have been
taken on film beginnings and provides extensive discussion and examples, each
offering different insights on the functions being performed for viewers when a
film begins. These perspectives include: the ‘point of attack’ for the exposition, i.e.,
that particular event, moment, etc. chosen to lead into the narrative; an exempla-
ry microcosm reflecting the world of the film; a point of equilibrium/disturbance
leading into the narrative arc as set out by Propp (1968) and subsequently sug-
gested as a standard model for film screenplays by Vogler (1998); a densely coded
matrix of connections symptomatic for the rest of the film and so functional for
the viewers’ subsequent comprehension processes; a textual prelude, similar in
role to preludes or overtures in music; the ‘threshold’ between everyday reality
and the world of the film; ‘instructions for use’ or training material for how to
interpret the film; and, finally, the place where a ‘communication contract’ is made
between producer and audience on what they are to expect.
Several of these perspectives clearly overlap with the functions suggested for
macro-themes. Just as macro-themes set out expectations for methods of devel-
opment, the use of film beginnings as ‘training material’ or the dense establish-
ment of the techniques to be employed for the remainder of some film similarly
establish clear predictions of what kind of development strategies that film will
employ and so may help direct the interpretative hypotheses that viewers enter-
tain. Moreover, early on in film semiotics, theorists such as Metz (1974) and Heath
(1975) suggested that each film, as an aesthetic artefact, creates its own ‘system’
as well as deploying semiotic codes established external to the film. Building on
this, Bordwell (1985, p. 38), Hartmann (2009, p. 106) and others suggest that it is
in precisely this respect that film beginnings may come to play a central narrative
function. This is echoed very closely in the role claimed for macro-themes above:
film beginnings as macro-themes may then also be taken as providing guidance
into the filmic system of each individual film just as in text they provide guidance
and predictions concerning the development of each individual text.
One of the motivations for accepting macro-theme/hyper-theme analyses for
natural language texts, however, is that it is possible to find repercussions of this
textual structure in the actual linguistic forms and structures that are selected.
Unless there were some identifiable consequences in the observed linguistic ma-
terial, there would be no grounds for assuming that macro-themes represent a
significant linguistic abstraction. In language, this works particularly through the
deployment of marked themes and other textual constructions as suggested in
our example of preposed temporal prepositional phrases above. The question arises,
therefore, of whether similar arguments can be made for presumed filmic
macro-themes. In the rest of this paper we will accordingly explore this further,
suggesting how a linguistically-motivated characterisation of filmic organisation
indeed allows us to see filmic macro-themes at work.

2. Filmic discourse organisation

To carry out an empirical study, we need first to identify the filmic properties that
we are to analyse. Here we draw on previous work in which we have argued that
there are several quite specific kinds of discourse organisation at work in film,
sometimes similar to those of language, sometimes interestingly distinct. We have
described two of these kinds of discourse organisation at length elsewhere: filmic
discourse relations (Bateman, 2007) and filmic cohesion (Tseng, 2008, 2012); a
brief overview and application of both these aspects of filmic discourse is also
provided in Tseng and Bateman (2012). For current purposes, we will simply suggest
by example the kinds of textual organisations that these aspects of filmic discourse
organisation employ so that the subsequent discussion can be followed.

[Frames shown: S1a, S1b, S1c, S1d, S2, S3a, S3b, S4]

Figure 1. The first four shots of Alfred Hitchcock’s The Birds (1963)

We illustrate both kinds of analysis briefly with respect to the opening of
Alfred Hitchcock’s The Birds, as shown in Figure 1. The first shot (S1) is a complex
tracking shot following a character that gradually comes into closer focus against
the backdrop of busy streets, going behind a poster of San Francisco (S1c) along
the way. She looks up at the sky, seeing massed birds (S2) and then proceeds into
a pet store (S3). The fragment ends with her going up some stairs inside the pet
store (S4).
Filmic discourse relations are relations postulated to hold between film seg-
ments in terms of temporality, spatiality, epistemic status and audiovisual struc-
tural dependence. They draw on the notion of conjunctive relations proposed for
verbal language by Martin (1983) and as extended for the moving audiovisual
image by van Leeuwen (1991). Building on this, Bateman (2007) and Bateman
and Schmidt (2012) argue that relations of this kind can also be taken as the basis
for constructing filmic discourse structures, although the specificity of the filmic
medium necessitates some changes with respect to the relations employed for ver-
bal language. These relations are seen to operate at a discourse level of analysis,
which means that the description may well cut across shot-boundaries; in this
respect, analyses employing filmic discourse relations are somewhat different to
traditional notions of inter-shot relations pursued in treatments of filmic montage
and relate more to notions of ‘events’ as explored in the cognitive study of film
(cf. Zacks, Speer, & Reynolds, 2009) or to ‘subphases’ in the discourse account of
Baldry and Thibault (2006). The discourse relation approach has also now been
taken further by Wildfeuer (2013), who provides a more formal framework for
capturing discourse relations for film that relates directly to the formal account
of verbal structured discourse representation theory developed by Asher and
Lascarides (2003). For present purposes it will suffice to describe this aspect of
filmic discourse interpretation in terms of the relations that the film material suggests
for relating successive film segments, rather than detailing the discourse
structure itself and the mechanisms employed for building it.

[Diagram: fragment sequence S1a S1b [S1c] S1d [S2] S3a S3b S4. Arc labels above the line: temporal: continuous, with spatial: narrowing, overlapping or extending. Labels below the line: temporal: continuous, spatial: ?, hypotactic: insert; temporal: continuous, spatial: broadening, projection.]

Figure 2. The filmic discourse relations holding among the first four shots of Alfred
Hitchcock’s The Birds (1963). Hypotactically embedded fragments (inserts) are marked
with square brackets and their relation types are shown below the main line of
fragments; other relations are shown above, labelling the arcs

Filmic discourse relation analysis operates in a similar fashion to the corre-
sponding resource for verbal language. As each putative ‘event’ is encountered, it
is necessary to find a relation that holds from the (small) set of relations defined.
The relations are cross-classified according to temporal, spatial and mental state
(seen, heard, imagined, etc.) and to whether they are dependent (hypotactic) or
independent (paratactic). The relations that apply to our illustrative opening se-
quence are shown in Figure 2. Hypotactic relations can also dynamically con-
struct inserted sequences that interrupt the unfolding sequence around them, as
is here the case with fragment S1c and shot S2. Since the sequence as a whole
follows the well-known principles of ‘invisible continuity’, there is little problem
in allocating appropriate discourse relations in this case; a more challenging ex-
ample is given below.
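The cross-classification just described can be pictured as a small data structure. The following Python sketch is purely illustrative and is not taken from the original analysis: the class and field names are assumptions, only the two inserts named in the text (S1c and S2) are encoded, the fragments to which they attach (S1b and S1d) are assumed for the sake of the example, and reading the projection in S2 as a 'seen' mental state is likewise an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Dependency(Enum):
    PARATACTIC = "paratactic"  # independent: continues the main line
    HYPOTACTIC = "hypotactic"  # dependent: e.g. an insert


@dataclass(frozen=True)
class FilmicRelation:
    """One discourse relation between two fragments (hypothetical schema)."""
    source: str                         # fragment the relation starts from
    target: str                         # fragment being related
    temporal: Optional[str]             # e.g. "continuous"; None if undecided
    spatial: Optional[str]              # e.g. "broadening"; None if undecided
    dependency: Dependency
    mental_state: Optional[str] = None  # e.g. "seen" for a projection


def main_line(fragments: List[str], relations: List[FilmicRelation]) -> List[str]:
    """Return the fragments that are not hypotactically embedded inserts."""
    inserts = {r.target for r in relations if r.dependency is Dependency.HYPOTACTIC}
    return [f for f in fragments if f not in inserts]


fragments = ["S1a", "S1b", "S1c", "S1d", "S2", "S3a", "S3b", "S4"]
relations = [
    # Attachment points S1b and S1d are assumptions made for this sketch.
    FilmicRelation("S1b", "S1c", "continuous", None, Dependency.HYPOTACTIC),
    FilmicRelation("S1d", "S2", "continuous", "broadening",
                   Dependency.HYPOTACTIC, mental_state="seen"),
]

print(main_line(fragments, relations))
```

Removing the hypotactically attached fragments leaves the sequence that the inserts interrupt, which is one simple way of making the distinction between dependent and independent relations operational.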
In contrast to relations between fragments, filmic cohesion sets out how char-
acters, objects and settings in coherent film narratives are presented, re-identified
and tracked throughout a film. These tracks are combined into cohesive chains.
The reoccurrence of elements across chains then indicates particular patterns of
textual density that serve to bind together information across the semiotic modes
at work in a film. Looking at chains of cohesion appears to provide important
guidance for keeping viewers on intended paths of interpretation. Here also, it
will suffice for current purposes to identify the cohesive relations, or ‘ties’, that
films exhibit during their opening sections.

Fragment Melanie Setting (a): SF street view Setting (b): petshop Birds
S1a [v] [v] (squawking)
S1b [v] [v] (squawking)
S1c “San Francisco” [v: poster] (squawking)
S1d [v] [v] “Davidson’s pet shop” [v] (squawking)
S2 [v] [v]
S3a [v] [v] (squawking)
S3b [v] “Davidson’s pet shop” [v]
S4 [v] [v] (chirping)

Figure 3. The filmic cohesive ties holding between the first four shots of Alfred
Hitchcock’s The Birds (1963) organised into cohesive chains. [v] indicates visual
elements, (…) indicates aural elements, and “…” indicates printed or verbal linguistic
elements.

We illustrate this for the opening sequence from The Birds in Figure 3.
This figure shows the cohesive ties between filmic elements across the sequence.
Such elements can be drawn from any modality employed within the film. The
constructed ties then build up cohesive chains, which we then investigate for
their ‘interactions’ in order to show how filmic texture is created (cf. Tseng,
2008, 2012). In the current case, for example, the main character, Melanie (Tippi
Hedren) is shown visually in almost every fragment, while the general street set-
ting is shown visually in all the external shots and, most probably, is also labelled
textually in the poster shown in S1c – all such discourse attributions are expressed
as abductive hypotheses in the style of linguistic discourse representation theory
and so may turn out to require revision by a viewer when more material concern-
ing the film becomes available. In the move from outside to inside, the setting is
also labelled textually by the printed name of the pet shop and, throughout the
sequence, there is an acoustic indication of birds being present. Because cohesion
analysis operates by picking out just those filmic elements that are repeated, it
functions to ‘self-select’ chains that are being constructed by the film – that is, it is
generally unproblematic to take too many elements (for example, individual cars,
other passers-by, fire hydrants or whatever else may be visible in any particular
shot) at first, because the fact that they do not reappear means that they will not
participate in cohesive ties. The backgrounded presentation of “Davidson’s Pet
Shop” already in fragment S1d is of this nature; had the main character simply
walked on following S1d, the cohesive chain for this named pet shop as part of the
setting would never have been established. The building of ties into chains and,
subsequently, the description of chains that are brought together in particular
actions or events, thus reveals quite clearly what the film itself is constructing as
textually significant for its development.
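The 'self-selecting' behaviour of cohesion analysis described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the procedure used in the study: the function name, the annotation format and the simplified sample annotations (including the one-off fire hydrant, taken from the examples mentioned in the text) are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Occurrence = Tuple[str, str]  # (fragment, mode), e.g. ("S1a", "visual")


def build_chains(
    annotations: Dict[str, List[Tuple[str, str]]]
) -> Dict[str, List[Occurrence]]:
    """Group (element, mode) annotations into one chain per element,
    keeping only elements that occur in at least two fragments."""
    occurrences: Dict[str, List[Occurrence]] = defaultdict(list)
    for fragment, items in annotations.items():
        for element, mode in items:
            occurrences[element].append((fragment, mode))
    return {
        element: occs
        for element, occs in occurrences.items()
        if len({fragment for fragment, _ in occs}) >= 2
    }


# Simplified, hypothetical annotations for three fragments.
annotations = {
    "S1a": [("Melanie", "visual"), ("birds", "aural"), ("fire hydrant", "visual")],
    "S1d": [("Melanie", "visual"), ("pet shop", "printed"), ("birds", "aural")],
    "S3b": [("Melanie", "visual"), ("pet shop", "printed")],
}

chains = build_chains(annotations)
for element, occs in chains.items():
    print(element, occs)
```

Because the fire hydrant is annotated only once, it never forms a tie and drops out without any explicit filtering decision by the analyst, whereas the pet shop becomes a chain only because it reappears in a later fragment.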

3. Application of the analysis methods for empirical investigation
of a corpus

Our study now applies these analytic schemes to a selection of films in order to
explore whether the structuring effect of the opening sections of a film can be
seen in how these resources are deployed. For this we employ a small film corpus
and annotate their beginnings according to the organisations suggested by our
framework. The corpus consists of 20 films selected to include both ‘mainstream’,
traditional narrative organisations and some less traditional, non-linear narrative
structures. The purpose of including films varying in this way is to see whether
the very different ‘methods of development’ that such films make use of can al-
ready be detected in the film beginnings.
The non-linear films selected have been variously described. They are
sometimes characterised as ‘puzzle films’ (Buckland, 2009), ‘forking narratives’
(Bordwell, 2002), ‘multiple draft’ films (Branigan, 2002) and similar. Less extreme
cases are commonly described using the narratological concept of ‘unreliable nar-
ration’ (e.g., Booth, 1961; Hansen, 2007; Koch, 2011). There is therefore consid-
erable variation within the groupings discussed, as well as many open questions
concerning their definitions and demarcation. The idea behind our selection is
that for the majority of non-linear films, it is nevertheless the case that the film is
considered narratively coherent by viewers. This raises the question as to how this
is achieved and the role that a film’s beginning may play in bringing this about.
If a film’s beginning functions in any way similarly to macro-themes, it would
be expected that guidance for following even non-traditionally structured films
would still be forthcoming.
The film corpus is listed in Table 1, grouped into two subcorpora reflecting a
pre-theoretical classification according to their linearity. An analysis of the first 5
minutes was undertaken for each film. Within the discourse relation dimension,
relations between shots were classified according to whether they were spatial,
temporal, ‘projective’, comparison, structuring relations or unidentifiable as
suggested above. Within the cohesion dimension, the specificity of characters, objects
and settings was compared, including both multimodal cohesive devices – verbal
(spoken or written) text for naming/specifying characters, objects and settings
visually shown on screen – and visual cohesive devices – without verbal cues.

Table 1. Film corpus used in the exploratory study; for production details, the reader is
referred to the invaluable Internet Movie Database (IMDB): http://www.imdb.com
Non-linear
  Blind Chance (1981)             Hana-bi (1997)
  Run Lola Run (1998)             Code: Unknown (2000)
  Memento (2000)                  Vanilla Sky (2001)
  Oldboy (2003)                   Reconstruction (2003)
  2046 (2004)                     The Fountain (2006)
Linear
  The Thin Red Line (1998)        The Sixth Sense (1999)
  Donnie Darko (2001)             Black Hawk Down (2001)
  Synecdoche, New York (2006)     The Prestige (2006)
  Once (2006)                     Juno (2007)
  My Blueberry Nights (2007)      Mr Brooks (2007)

When films unfold linearly this is carried most straightforwardly by the filmic
discourse relations between shots and the scenes which build themselves on top
of these shots: these filmic discourse relations can therefore be expected to be
dominantly chronological, just as was the case in the analysis of The Birds
opening shown in Figure 2 above. This particular aspect of the method of de-
velopment should then be signalled in the openings of films also. For non-linear
films the filmic discourse relations will generally be more diverse. Thus as a first
hypothesis to be explored empirically, we can suggest that:
Hypothesis 1
The beginnings of linear films should signal their linear method of devel-
opment by employing generally chronological filmic discourse relations,
whereas the beginnings of non-linear films withhold chronological filmic
discourse relations.

Moreover, since non-linear films are still generally perceived as coherent by view-
ers, it might be expected that there is actually a trade-off between the two filmic
discourse organisations: the very fact of non-linear narrative will mean that the
discourse relations, and particularly scene transitions, will violate chronological
development and, as a consequence, the cohesive ties established may need to
take on a greater functional load in helping to guide viewers across underspecified
transitions; an example of this trade-off for one particular case is discussed
at greater length in Tseng and Bateman (2012), where the opening sequence of
Christopher Nolan’s Memento (2000) is addressed.
If a film’s beginning is then indeed to function as an indicator of the method
of development, then there should also be observable differences across the cor-
pus concerning the structuring employed within the beginnings of non-linear
films. That is, according to the hypothesised function of macro-themes, differing
methods of development should also bring about recognisable differences in the
film’s beginnings. Two further hypotheses are then:
Hypothesis 2
The beginnings of non-linear puzzle films should signal a non-linear method
of development by relying less on spatial and temporal regularity in the filmic
discourse relations.
Hypothesis 3
The beginnings of non-linear films should employ a higher degree of cohesive
organisation than linear films.

This means that in order to maintain coherence it should be the case that the re-
spective beginnings of non-linear films signal to the viewer that non-linear, more
cohesively based interpretative schemes are to be applied during the film.

4. Results of the exploratory study

The results of performing the filmic discourse relation analysis on the beginnings
of the 20 films of the corpus are suggested graphically in Figure 4. To improve
reliability the analysis in each case adopted shot boundaries as a common level
of granularity throughout. The graph shows how many transitions between shots
needed to be taken before the discourse relations were reliably identifiable, order-
ing the films analysed along the horizontal axis according to the increasing delay
holding before reliable identification becomes possible. For the last film in the
list, Christoffer Boe’s Reconstruction (2003), the uncertainty lasts a considerable
time into the film, persisting well beyond what might reasonably be considered
the ‘beginning’. For the linear films, the discourse relations are in fact never seri-
ously in doubt. Each shot follows on the other in the manner that we saw with
The Birds. This distribution is certainly compatible with our hypotheses 1 and
2 above.
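The measure plotted in Figure 4 can be made concrete with a short Python sketch. It assumes, as one possible reading of the description above, that the delay is the number of shot transitions at the start of a film for which no discourse relation can be identified. The function name and the three sets of annotations are invented for illustration and are not data from the corpus.

```python
from itertools import takewhile
from typing import Dict, List, Optional


def delay_before_clear(transitions: List[Optional[str]]) -> int:
    """Count the shot transitions at the start of a film for which no
    discourse relation could be identified (None marks 'unclear')."""
    return sum(1 for _ in takewhile(lambda label: label is None, transitions))


# Invented annotations for three hypothetical openings, not corpus data.
openings: Dict[str, List[Optional[str]]] = {
    "hypothetical linear film": ["temporal", "spatial", "temporal"],
    "hypothetical puzzle film A": [None, None, "temporal", None],
    "hypothetical puzzle film B": [None, "projective", "temporal"],
}

# Order the films by increasing delay, as on the horizontal axis of Figure 4.
ranking = sorted(openings, key=lambda title: delay_before_clear(openings[title]))
for title in ranking:
    print(title, delay_before_clear(openings[title]))
```

Under this reading a linear opening receives a delay of zero, and films are ranked by how long the viewer has to wait before the relations between shots can first be pinned down.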


[Figure 4: bar chart. Vertical axis: number of unclear transitions (0 to 16).
Horizontal axis: the 10 linear films as a single group and the non-linear films
individually, ordered by increasing delay, with Reconstruction last.]

Figure 4. Number of unclear filmic discourse relation transitions in the opening
sequences of the films of the corpus

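[Figure 5 diagram: scenes A to H, laid out as A, B, C on the first row, G, F, E, D
on the second row and H on the third; the arcs marking the relations between the
scenes are not recoverable.]

Figure 5. Filmic discourse relations in the opening sequence of Blind Chance (1981)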

Illustrative of the complexity involved in the non-linear portion of the corpus
is the beginning of Kieślowski’s Blind Chance (1981), shown in Figure 5. Here
there is very little opportunity for the viewer to decode the filmic discourse rela-
tions. We take this as strongly suggesting to the viewer that the film will be em-
ploying some method of development that is not straightforwardly linear, which
indeed turns out to be the case.


[Figure 6 diagram: cohesive chains for three characters (1. Witek, 2. Father,
3. Daniel) across scenes A, C, D and H. Recoverable verbal ties: in scene D, “Witek”
(called by Daniel) and “Daniel” (called by Witek); in scene H, “He is my son Witek”
(spoken by father). The visual ties are not recoverable.]

Figure 6. Cohesive ties in the opening sequence of Blind Chance (abbreviated)

Turning to the cohesive analysis, there are several possibilities for describ-
ing the various textures that are constructed. Indeed, in both linear and non-
linear films, cohesive structures are well established from the beginning of each
film. Considering again the opening from Blind Chance we find with respect to its
cohesive analysis a relatively dense interweaving of reoccurring filmic elements.
This is shown graphically in Figure 6. We suggest here that it is particularly sig-
nificant that the texture is created cross-modally: i.e., there are reoccurrences of
visuals and verbal elements which are bound together, often again employing es-
tablished continuity techniques. Thus, the boy that is shown in Scene B is identi-
fied as ‘Witek’ by being called verbally in Scene C. This verbal identification is
then picked up again in Scene D. From this we can see that cohesive structures
tracking the main characters’ identities are strongly established and may then be
available as a counterbalance to the severe uncertainty in the filmic discourse rela-
tions that apply.
When we explore the distribution of purely visual cohesion and of visual cohesion
combined with verbal cohesion across the entire corpus (cf. Table 2), the results
support a systematic trade-off between cohesion and underspecified discourse
relations. They show a suggestive difference between how cohesion patterns
operate in the linear and non-linear films. Although both linear and non-linear
films adopt cohesive patterns relying on visual and verbal ties, non-linear films
rely much less on purely visual elements and additionally employ multimodal
cohesive ties, such as naming, in order to carry the viewer over material for
which it is difficult, or impossible, to find visual relations. The sample size
here is too small to offer statements of statistical significance but the trend is
clear: the openings of non-linear films appear to exhibit patterns of filmic
discourse relations and filmic cohesion that distinguish them from the linear
films, thus supporting hypothesis 3. This difference may well provide a more or
less explicit message to the viewer that a different kind of method of
development than usual is to be expected for the film that is to follow.

Table 2. Cohesion results across the beginnings of the films of the corpus
Cohesive texture                                              Linear   Non-linear
Cohesive chains with specific verbal cues specifying
names and identities of characters                            5        9
Cohesive chains with no verbal cues but with visual
reappearance to confirm main characters                       5        1
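The caution about statistical significance can be illustrated with the counts in Table 2. The following Python sketch is not part of the original study: it assumes that each film contributes exactly one observation to the table and applies a two-sided Fisher exact test, implemented directly with the standard library so that no external package is needed.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(k: int) -> float:
        # Hypergeometric probability of observing k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is no more probable than the observed one.
    return sum(
        p for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Rows: chains with verbal cues; chains with visual reappearance only.
# Columns: linear, non-linear (counts taken from Table 2).
p_value = fisher_exact_two_sided(5, 9, 5, 1)
print(round(p_value, 3))  # prints 0.141
```

On these assumptions the p-value is about 0.14, which is in line with treating the difference between the subcorpora as a suggestive trend rather than as a statistically established effect.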

5. Conclusion

As a summary of the analytic results reported above, we can therefore suggest the
following general film properties:

– Non-linear films require highly specific, multimodal cohesive chains to direct
viewers’ recognition of elements across underspecified interpretations of discourse
relations between sequences.
– The sooner the discourse relations are made predictable, the less demanding
subsequent interpretation paths become and less work is required of the co-
hesive texture.

In addition, all three hypotheses above were supported, although further studies
need to be undertaken to evaluate the extent to which the tendencies revealed are
robust with respect to larger samples. Nevertheless, the apparent availability of
clear indicators for the kinds of filmic narrative strategies that are going to be em-
ployed within a film suggests that this source of guidance needs to be taken into
consideration whenever exploring viewers’ interpretative activities when watch-
ing film.

References

Anderson, J.D. (1996). The reality of illusion: An ecological approach to cognitive film theory.
Carbondale/Edwardsville: Southern Illinois University Press.
Asher, N., & Lascarides, A. (2003). Logics of conversation. Cambridge: Cambridge University
Press.
Baldry, A., & Thibault, P. (2006). Multimodal transcription and text analysis. London: Equinox.
Bateman, J.A. (2007). Towards a grande paradigmatique of film: Christian Metz reloaded. Se-
miotica, 167(1/4), 13–64.
Bateman, J.A., & Schmidt, K.H. (2012). Multimodal film analysis: How films mean. London:
Routledge.
Booth, W. (1961). Rhetoric of fiction. Chicago: Chicago University Press.
Bordwell, D. (1985). Narration in the fiction film. Madison, WI: University of Wisconsin Press.
Bordwell, D. (2002). Film futures. SubStance, 31(1), 88–104. DOI: 10.1353/sub.2002.0004
Branigan, E. (2002). Nearly true: forking plots, forking interpretations. A response to David
Bordwell’s ‘Film Futures’. SubStance, 31(1), 105–114.
Buckland, W. (Ed.). (2009). Puzzle films: Complex storytelling in contemporary cinema.
Chichester, U.K.: Wiley-Blackwell.
Carroll, N. (2008). The philosophy of motion pictures. Oxford: Oxford University Press.
Fries, P.H. (1995). Themes, methods of development, and texts. In R. Hasan & P.H. Fries (Eds.),
On subject and theme: A discourse functional perspective (pp. 317–359). Amsterdam: John
Benjamins. DOI: 10.1075/cilt.118.10fri
Ghadessy, M. (Ed.). (1995). Thematic development in English texts. London: Pinter Publishers.
Gibson, J.J. (1979). The ecological approach to visual perception. Boston: Houghton Mifflin.
Grodal, T. (1999). Emotions, cognitions, and narrative patterns in film. In C. Plantinga & G.M.
Smith (Eds.), Passionate views: Film, cognition, and emotion (pp. 127–145). Baltimore/
London: The Johns Hopkins University Press.
Halliday, M.A.K. (1967). Notes on transitivity and theme in English – Parts 1 and 2. Journal of
Linguistics, 3, 37–81 and 199–244. DOI: 10.1017/S0022226700012949
Halliday, M.A.K., & Matthiessen, C.M.I.M. (2004). An introduction to functional grammar.
Third edition. London: Edward Arnold.
Hansen, P.K. (2007). Reconsidering the unreliable narrator. Semiotica, 165(1/4), 227–246.
Hartmann, B. (2009). Aller Anfang: Zur Initialphase des Spielfilms. Marburg: Schüren.
Heath, S. (1975). Film and system: terms of analysis. Part I. Screen, 16(1), 7–77.
Koch, J. (2011). Unreliable and discordant film narration. Journal of Literary Theory, 5(1), 57–
80. DOI: 10.1515/JLT.2011.006
Luchins, A.S., & Luchins, E.H. (1962). Primary-recency in communications reflecting attitudes
toward segregation. Journal of Social Psychology, 58, 357–369.
DOI: 10.1080/00224545.1962.9712387
Martin, J.R. (1983). Conjunction: the logic of English text. In J.S. Petöfi & E. Sözer (Eds.), Micro
and macro connexity of discourse (pp. 1–72). Hamburg: Helmut Buske Verlag.
Martin, J.R. (1992). English text: Systems and structure. Amsterdam: John Benjamins.
DOI: 10.1075/z.59
Metz, C. (1974). Language and Cinema. The Hague: Mouton. Translated by D.J. Umiker-Sebeok.
DOI: 10.1515/9783110816044
Monaco, J. (2009). How to read a film: Movies, media and beyond. 30th edition. Oxford, U.K.:
Oxford University Press.
Münsterberg, H. (1916). The photoplay: A psychological study. New York: D. Appleton and
Company.
Persson, P. (2003). Understanding cinema. Cambridge: Cambridge University Press.
DOI: 10.1017/CBO9780511497735
Propp, V. (1968). The morphology of the folktale. Austin, Texas: University of Texas Press.
Originally published in Russian in 1928.
Smith, T.J., & Henderson, J.M. (2008). Edit blindness: The relationship between attention and
global change blindness in dynamic scenes. Journal of Eye Movement Research, 2(2)(6),
1–17.
Smith, T.J., Levin, D.T., & Cutting, J.E. (2012). A window on reality: Perceiving edited moving
images. Current Directions in Psychological Science, 21(2), 107–113.
DOI: 10.1177/0963721412437407
Tseng, C. (2008). Cohesive harmony in filmic text. In L. Unsworth (Ed.), Multimodal semiotics:
Functional analysis in contexts of education (pp. 87–104). London: Continuum.
Tseng, C. (2012). Audiovisual texture in scene transition. Semiotica, 192, 123–160.
Tseng, C., & Bateman, J.A. (2012). Multimodal narrative construction in Christopher Nolan’s
Memento: A description of method. Journal of Visual Communication, 11(1), 91–119.
DOI: 10.1177/1470357211424691
van Leeuwen, T. (1991). Conjunctive structure in documentary film and television. Continuum:
Journal of Media and Cultural Studies, 5(1), 76–114. DOI: 10.1080/10304319109388216
Vogler, C. (1998). The writer’s journey: Mythic structure for writers. Studio City, CA: Michael
Wiese Productions.
Wildfeuer, J. (2013). Film discourse interpretation: Towards a new paradigm for multimodal film
analysis. London: Routledge.
Wuss, P. (2009). Cinematic narration and its psychological impact: functions of cognition, emo-
tion and play. Newcastle: Cambridge Scholars.
Zacks, J.M., Speer, N.K., & Reynolds, J.R. (2009). Segmentation in reading and film comprehension.
Journal of Experimental Psychology: General, 138, 307–327. DOI: 10.1037/a0015305



Multimodal digital storytelling
Integrating information, emotion
and social cognition

Isabel Alonso, Silvia Molina and María Dolores Porto


Universidad Autónoma de Madrid / Universidad Politécnica de Madrid /
Universidad de Alcalá de Henares

Digital stories are a very recent multimedia practice by which ordinary people
construct short narratives on personal affairs combining voice, images and
sometimes music. This paper contributes to the description of this new emer-
gent genre from both a multimodal and a cognitive point of view, by exploring
how diverse semiotic channels in digital storytelling provide different kinds of
information (factual, emotional, cultural, etc.) which are finally integrated to
construct the global meaning of the narrative. For this purpose, we combine
Kress and Van Leeuwen’s (1996) scholarly work related to multimodal repre-
sentation, with the use of some notions of the Mental Spaces and Conceptual
Integration theory (Dancygier, 2008; Fauconnier & Turner, 2002). The results of
this study are of interest to those concerned with the representational and com-
municational modes of semiotic resources in storytelling.

Keywords: digital storytelling, multimodality, mental spaces, conceptual
integration, narrative

1. Introduction1

Multimodal digital storytelling is an expanding activity by which ordinary people
all over the world relate their personal experiences in the form of a multimodal
story and upload it onto the Internet. These stories are always very short – ranging
from 4 to 5 minutes – and combine a narrating voice with still images and, some-
times, a musical soundtrack. As narrators are not professional writers or technol-
ogy experts, digital stories are frequently composed in workshops where they are
advised on how to construct their story, step by step, from selecting, scanning and
editing the images, reading aloud and recording the texts, to the more technical

doi 10.1075/bct.78.10alo
© 2015 John Benjamins Publishing Company

aspects of how to synchronise images, voice and background music, or even mat-
ters of copyright when using published material for their stories.
This paper’s main aim is to contribute to the multimodal description of this
new emerging genre from a cognitive standpoint. To our knowledge, this combined
approach of multimodal analysis and cognitive linguistics has not yet been
explored in previous research on digital storytelling. Thus, our objective is to
explore how both the Internet and the technical resources for integrating sound,
image and text affect the receivers’ interpretation of these digital narratives and
contribute to the interpersonal (emotional) build-up of the receiver’s mental image
of the stories. For this purpose, we have analysed thirty randomly chosen digital
stories from several organizations and project websites.
From a cognitive perspective, narratives must be regarded as more than a
mere succession of causal/temporal events. Some recent theories see narratives as
a complex network of mental spaces which combine and blend in order to reach
a final emergent mental representation (Dancygier, 2008; Semino, 2009; Turner,
2008). Listeners and readers are thus not mere recipients of the stories, but active
participants in the co-construction of the narrative. Multimodality adds com-
plexity to both the production and interpretation of digital stories. In a very short
time span, the information provided by three different channels – verbal, visual
and auditory – must be processed and combined to construct a global representa-
tion of the story. Therefore, the interaction between the different modes and the
way it affects the processing of the narrative is essential for the analysis of the
stories and for the understanding of the effects they produce on the receivers. The
content of the narrative, that is, the instructions given to the interpreter in order
to construct the mental representation of the story, is distributed across those
three channels, or “information tracks” (Herman, 2009, p. 80), and contains not
only factual information, but also cultural and emotional aspects that play a main
role in the final representation. Therefore, the present analysis can be inscribed
within the recent trend of transmedial narratology (Herman, 2004; Ryan, 2005),
which claims that the constraints and affordances associated with a given medium
affect the narrative and the construction of the receiver’s mental image.
In the first section of this work, we present some of the main features of digital
stories as a new emergent genre. This is followed by a description of the sample,
in which we explain the methodological tools we have combined
in order to account for the multiple dimensions of digital stories. On the one
hand, we carried out a joint application of Unsworth and Cléirigh’s model (2009,
pp. 151–163) and Kress & van Leeuwen’s (2006) grammar of visual design in or-
der to advance understanding of how images and language interact to construct
ideational meaning. On the other, we make use of some notions of the Mental
Spaces and Conceptual Integration theory as applied for narratives (Dancygier,
2008; Fauconnier & Turner, 2002; Porto & Romano, 2010; Semino, 2009; Turner,
2008) in order to explain how the different modes are integrated with one another
and also with the socio-cultural knowledge of participants to construct a global
meaning of the narrative. We believe these cognitively and functionally oriented tools
are complementary since both seek to explain linguistic forms in terms of their
meaning, in strict relation with usage and cognition. After introducing the study’s
theoretical framework, the analysis of the three modes is presented and the way
in which the different meanings provided are integrated is examined in detail in
Sections 4 and 5, respectively.

2. Digital stories as a new emergent genre

Digital storytelling was initiated at the Center for Digital Storytelling (CDS) in
California in the early 1990s. Digital stories are made in workshops where people
can learn both about the software and technical aspects of the process and about
effective ways to construct their story. The genre and workshop methods have
spread to large parts of the world, and we can read, for example, how projects have
been carried out in Australia, Wales, and in South African secondary schools.
It is possible to achieve a general view of the main features of the genre by
reading the various guides available on websites that publish digital stories.2 In
these tutorials, potential storymakers are advised on the several steps needed to
create their own story, from script to publication. Among the main features that
define this new emergent genre, the following can be highlighted:

– Digital stories cover a wide range of topics in different contexts (educational,
social, historical and cultural), but all of them are written from a personal
point of view and transmit emotional content. Thus, some intimate experi-
ences of anonymous citizens from various countries can reach millions of
equally anonymous readers/listeners in distant cultures. It is probably for this
reason that most of these stories have a didactic, exemplary purpose, to help
other people.
– In general, the stories produced are not given the same status as professional
productions or works of art. They are “home-made”, with choices made from
a variety of well-designed templates and/or following instructions received in
workshops.
– They are very short. Most tutorials recommend two to five minutes as the right
length, about 250 words and a dozen or so pictures. Brevity is a challenge for
story makers, who must compress sometimes highly emotional events in such
a short time span and so reduce it to “its essence”.
– The textual narrative is usually told in the first person or in a fictional third person
and is frequently well structured in the traditional schema of abstract, orien-
tation, complication, resolution and coda (Labov & Waletzky, 1967). The text
is the main, leading mode and images and sounds tend to be supportive. All
tutorials consulted reveal that the verbal narrative is always the first step in
the construction of the multimodal story.
– Most images are still, but there are some advanced digital storymakers who in-
clude short video clips in their work, or at least add some zooming and travelling
to the images. Images can range from drawings to family photos (colour, sepia
or black and white), in addition to collages, common cultural signs (the red
ribbon for AIDS), symbols and visual metaphors.
– Quite often, digital stories contain personal images of the narrators them-
selves, as children or at the present time. Guides for digital storytelling encour-
age authors to use this kind of image, partly because of the personal content
of the stories, but also in order to avoid any conflicts of copyright. Images
mainly provide the emotional and cultural content of the digital story.
– The narrating voice is another defining feature of digital stories, as it
lends a sense of veracity to the story that enhances its emotional content.
In fact, the voice conveys information about the author, such as gender, age,
socio-cultural background and status, etc. On the other hand, the general
tone of the narratives is usually neutral, as if trying to sound objective in the
narrated events. In addition, the voice also provides rhythm and pace in the
display of images.
– The soundtrack is not always present. It can vary from commercial songs to
traditional, folk music or just some background music or sounds. It frames
the whole story in a context and can also help to mark shifts in the narrative.
Moreover, as it tends to become louder at the end of the story, it may act as a
coda for the story.

The stories that make up our sample were uploaded by non-governmental
and non-profit organizations which periodically hold digital storytelling work-
shops in various parts of the world to encourage people to share their experi-
ences.3

3. Sample description and methodological approach

This section describes the data and the combination of methodological tools used
to depict multimodal digital stories from a cognitive point of view. For the pur-
pose of this study, thirty digital narratives were analysed. As for their selection, no
variables of sex, age, or cultural origin were considered. No attention was given
to particular topics either. The sample was selected at random from various pub-
lic, well-known digital storytelling web sites across continents to guarantee that
different cultural approaches to narratives are considered (see the Annex for the
complete list of stories). Consequently, our sample does not constitute a formal
corpus, but a first approach to the genre which is expected to provide a prelimi-
nary multimodal characterization of digital storytelling from a cognitive point
of view. All digital narrations were coded with a number and the acronym of the
organization that promoted their production, e.g. Engender Health: EH. The
audio was also transcribed.
Concerning the article’s methodology, we drew some basic concepts from the
theories of Mental Spaces and Conceptual Integration Theory. Very recently, these
theories have been applied to the study of discourse (Oakley & Coulson, 2008)
and more specifically to narratives, both fictional (Dancygier, 2008; Semino,
2009) and non-fictional (Porto & Romano, 2010). According to these theories,
the meaning construction of a narrative is a process by which several input mental
spaces, i.e. “partial assemblies constructed as we think and talk, for purposes of
local understanding” (Fauconnier, 1997, pp. 1–2), are set up and then activated or
deactivated as the story unfolds by means of several devices such as linguistic and
pragmatic markers that act as space builders. Some elements in these input spac-
es are then cross-mapped and selectively projected onto a blended space whose
structure is emergent, i.e. dynamically elaborated and not just deriving from the
inputs. In addition, the selection of the elements that will be projected onto the
final blend depends on the generic space, which is not given by the narrative, but
belongs to the shared knowledge of the participants and is activated by the dis-
course. Therefore, the information, not only factual, but also modal, emotional,
cultural, etc. provided by the input spaces is just the prompt for the construction
of the global final meaning of the narrative – the emergent story. In Section 4
readers will find further explanations on the theory and examples taken from the
stories analysed.
This process becomes far more complex in multimodal narratives, where the
process of conceptual integration can be seen as working at different levels:

i. Since the story is presented in sequential visual frames with a synchronised
text and sound, there is a “microprocess of conceptual integration” of the in-
formation simultaneously provided by the various modes at every picture.
ii. Each channel – verbal, visual and acoustic – constitutes a narrative itself, and
each one, as in monomodal narratives, is composed of several narrative spac-
es (Dancygier, 2008) or contextual frameworks (Emmott, 1997) that must be
integrated to construct the emergent story.
iii. The three narratives, each of them a different “information track” (Herman,
2009), must be integrated into the one and only emergent narrative to which
they all contribute.

In this framework, we decided to carry out a qualitative study that depicts the
main characteristics of the verbal, visual and acoustic channels as they are made
manifest in digital stories and describes how these three individual channels in-
teract to construct the meaning of this narrative genre.
For the multimodal study of the verbal mode, apart from the oral text, we
drew from two signifying systems proposed by Kress and van Leeuwen (1996) in
their grammar of visual design: salience and framing. Salience is determined on
the basis of the identification of visual cues – size, sharpness of focus, or amount
of detail or texture shown, tonal contrast, colour contrast, placement in the visual
field, perspective, and any cultural symbolism associated with the image – that
enhance verbal language. Besides all these elements, we also looked at the mean-
ing of any possible written text on screen (titles, introductory slides and credits),
and described its position on the image, its font (e.g. white fonts on black) and
its size. We also paid attention to edited images, for example, bubbles on pho-
tographs, cartoons, etc. As for the notion of framing, it is essential to unveil the
narrative structure of digital stories, since frames can connect and disconnect dif-
ferent narrative elements. So we analysed the structure of our stories, looking for
borders between elements, connective vectors, repetition of shapes, colours, etc.,
which provide textual coherence to the story.
Regarding the visual mode, we first identified the type of image which accom-
panies the verbal mode in our stories (photos, cartoons, drawings, collages, etc.)
and its main features (edited or not, public, private, etc.). As we will show, most
slides are static or pictorial images. To identify the construction of ideational or
representational meaning through images, we relied on Unsworth and Cléirigh’s
(2009) model. It is a systemic functionally oriented tool (Halliday & Matthiessen,
2004) which provides a complete basis for the modelling of image-text relations
that jointly construct experiential meaning. From the point of view of the im-
age that visualises the language, Unsworth and Cléirigh (pp. 156ff.) classify pic-
tures into three types: (a) images which elaborate the (verbalised or unverbalised)
qualities of the main participants and/or events in the story; (b) pictures which
visualise the (verbalised or unverbalised) parts of the main participants and/or
events in the story; and (c) pictures which visualise (verbalised or unverbalised)
geographical locations.
Identifying the construction of other meanings, like emotional or cultural,
was also very important in our analysis. As we will see in Sections 4 and 5, the
integration between image and text in these narratives through metaphors often
has a persuasive function, that is, to appeal to the listener.
Finally, regarding the acoustic mode, we analysed the songs and melodies that
go with the stories to unveil their contribution to the narrative meaning in the
digital narration. Although the narrators’ voices are also acoustic, we regard them
as belonging to the verbal mode, as these are oral stories. Thus the description
of voice gender, types of accents, etc. will be found in the verbal mode. As for
the acoustic mode, we distinguished between the music that creates a context for
the narration (introducing the story, setting the cultural context) and the music
which marks a significant moment in the story, or a change in context.

4. Multimodal analysis

In this section we present the results of a preliminary multimodal analysis of how
the three channels – verbal, visual and acoustic – present in our sample of digital
stories interact to construe narrative meaning.
We start here with the verbal mode. As previously mentioned, all sample sto-
ries are verbally structured, that is, the oral narration we hear is the main, leading
mode and images and sounds tend to be supportive. The narrators’ voices provide
additional information to the audience about their location, cultural background,
socio-cultural status, age, gender, etc. which is supported or even reinforced by the
images. Interestingly, the narrative voices analysed do not convey any emotional
meaning. We believe this is so because most narrators are advised to read their
stories, not to interpret them. The written mode on screen is scarcely used in our
sample stories, except for the titles and introductory slides at the beginning, and
the credits at the end. However, when used, it is an extremely powerful resource
for focusing the listener’s attention on the meaning and impact of those words.
In such cases, the colours of the background and the size and types of fonts
are also significant. For instance, in Stop Bullying (DS1) several slides are shown
with one or two words in big white fonts on a black background. The messages
(a look, name calling, sly comments) refer to the ways in which the narrator was
bullied at school. Prominence is thus given to the noun phrases dominating the
entire page by means of colours. Black and white in the European context cultur-
ally connote authority and seriousness. Their connotative value is thus contextual
and ideologically determined. The choice of white typeface also creates salience:
it highlights the processes involved in bullying. The figure/ground perception
caused by the combination of the black and white colours then serves not only the
interpersonal metafunction, but the ideational and textual metafunctions as well
(O’Halloran & Lim Fei, 2009, p. 145). The stark contrasts of the black and white
colours enhance the message against bullying and allow the receiver to focus on
its different expressions.
As for the visual channel, the narratives are structured via a sequence of static
and very often private images which belong to the narrator’s family archive and
are used to speak about him/herself, his/her family and his/her past and pres-
ent personal experiences. The narrator usually appears at the very centre of most
images, as the nucleus of information to which other elements (other people,
objects, etc.) are ancillary. In many instances, these photographs are black and
white or sepia, which reinforces the idea of times long past (SC5, SC6), and they
alternate with colour images referring to present times. In some other cases (EH8,
SC2, SC5), black and white photographs bring forth dark emotions (poverty, de-
pression) versus colour pictures, which represent positive emotions (enthusiasm
and joy).
Edited images with cartoon speech bubbles on photographs (CN3) or car-
toons (UM1) are very few, but their use, again, is multimodally marked since
they attract the audience’s attention to the sometimes socio-culturally determined
message they convey. This is the case of the slide taken from EH8 (see Plate 1),
which reproduces graffiti of a man beating his wife with a cartoon speech bubble
on the picture saying: I paid lobola for her.

Plate 1. The use of written mode in edited images (EH8)

Graffiti is generally considered a youthful form of artistic expression that rebels
against authority. We believe the narrator uses it to challenge the traditional
African notion of masculinity and to transmit this message among male African
youngsters. As for the speech bubble, “lobola” is a set amount paid by a prospec-
tive husband to the bride’s family among certain peoples in Southern Africa.
It is clear then that images can perform different functions in digital stories.
Also, from the structural point of view, images guide the listener as signposts
through the story. Picture repetitions are used to provide textual coherence to
the story (WL1, SC8). At other times, they are used for the opposite purpose, to
signal a topic change. That is, in some cases, it is also possible to find the same
image edited in different ways in order to show a change in the topic. The
preservation of the main features guarantees the identification of the topic,
although the addition of elements (bubbles, other images such as graffiti,
different background…) may suggest a change, a step forward (see Plate 2).

Plate 2. Sequence of edited images for visual coherence (CN3)

It is not only pictures that play a distinctive role in structuring the story; most
of the analysed narrations are also framed by the background music. In some
stories, music goes on all the time, from beginning to end, merely accompany-
ing the story without extra meaning (EH8). Indeed, it can even go unnoticed. In
some other cases, it plays only at the beginning and at the end (EH5). And as with
pictures, music can be used by narrators to highlight a significant change in the
setting of the story (SC3, EH3…).
Apart from structuring, images perform other functions in digital stories. Pho-
tographs also help the audience to visualise all the narrative elements. Most pic-
tures elaborate the unverbalised qualities of the main participants in the story, for
example, information about the people’s ethnicity (SC6, SC7, CN3, WL1, WL2),
about their condition (women with HIV in WL2, a lesbian and immigrant in
SC8, a transgender child in SC1), or about some other circumstances that sur-
round the narrator’s life: poverty and rural life (CN3), etc. This type of intermodal
identification between text and image is defined by Unsworth and Cléirigh (2009)
as intensive identification. Less frequently, we have also found cases of circum-
stantial identification, by which the image visualises (unverbalised) geographical
locations. In the narratives from the channels “living in Bristol” or “stories from
India” or “stories from Namibia”, the location is established from the very first
moment. However, in some other cases the verbal mode does not usually give a
proper physical setting for the story, since, as already pointed out, these stories are
condensed in a few minutes and reduced to their essence. So, images provide this
kind of information, sometimes intentionally, i.e. when a landscape or a map is
shown (SC8), and some other times unintentionally, by showing some cues in the
background of a photograph, e.g. the style of buildings (UM1), or the way people
dress (pendant with Ethiopia’s map in BS1, narrator’s Peruvian hat in SC8, Indian
clothes in EH6).
Finally, images can be used by digital narrators to effectively transmit additional
emotional and/or cultural meanings that are not conveyed by the oral
mode. Thus, pictures of celebrations and different lifestyles (e.g. children’s birth-
days (SC7), photos of weddings (UM1), etc.) help to situate the audience in a
given socio-cultural context. This is reinforced by features such as the accents
or English varieties spoken by the narrators and/or the music which frames the
story: we can find rap music in the story of an urban black boy who breaks the law
(SC2), traditional folk music in stories about immigration (SC8, CN1), or the
sound of drums in the story of a Rastafarian living in Bristol (BS1). Gospel music
appears in stories of black people who speak of hope and overcoming difficulties
(CN3, WL1).
Special mention should be made of the presence of metaphors in our sample
stories. The integration of verbal and visual modes gives rise to the production
of metaphors which are created by narrators to verbalise new ideas and emo-
tions (EH3, CN3, SC6 among others). Metaphors are used in digital stories when
an unknown or difficult situation needs explanation and other linguistic devices
prove insufficient (Cameron, 1999; Lakoff & Turner, 1989), or else in order to en-
able the compression of complex ideas and meanings in such a brief format.
Similarly, symbols can accompany the text to intensify its meaning, or be
added to it if the verbal mode does not mention the concept represented by the
symbol. In Nelao’s story (EH1), for example, the HIV ribbon, a group of ques-
tion marks, or the icon of sound crossed out to mean silence are all displayed at
different points of the narrative in order to convey the struggles that the narrator
faces living with HIV. Also, at the macro level, the symbolic use of colours, red,
black and white, has a role in establishing visual coherence during the narration:
red refers to passion (Hodge & Kress, 1988) and broken hearts to HIV, black indicates
worries, and white evokes innocence. Other bright colours, appearing at the end of
the story, show a shift towards happiness and joy.


5. Integrating meanings

After having described the different modes and the way they interact, we will
now show how the verbal, visual and acoustic meanings are integrated into a final
global construction of the story. Since this integration works at several levels, the
number of mental spaces created during the production and interpretation of a
story is extremely high. Thus, in order to glimpse some of the hundreds of mental
spaces that compose the network and of the processing strategies that take place
during the process of understanding a digital multimodal narration, we will show
the analysis of a fragment of one of the stories at the micro level and then extend
the results to the macro level of integration of the whole narrative.
James’ story, Rock Bottom (EH3), tells how the narrator, after many love expe-
riences, found a girl he really loved and wanted a stable relationship with her, but
she finally broke up with him. He then became aware of the pain he had inflicted
upon his previous girlfriends and now feels ready to be a different kind of man
and to challenge traditional gender roles. Figure 1 presents the description of two
slides in this story.

Slide     Narrator’s voice                            Picture
number
#7        I am who I am because of her                [photograph of lions]
          [narrator’s mother]. However, there
          are times I wish I had someone to
          look up to, as a boy, to support me,
          someone who could help me and
          guide me towards becoming a man
          in life one day.
#8        The absence of a father figure in my        [black]
          life played itself off during my years
          at university.

Figure 1. Meaning integration in EH3

In the 7th slide, the verbal mode constitutes input space 1. It provides, on the
one hand, the basic factual information: he grew up without a father figure and
was brought up by his mother, and on the other, some emotional meaning: he
regrets the lack of a father figure. The photo of the lions prompts the construction
of input space 2, conveying unmistakable cultural meanings, since lions are repre-
sentative of an African setting. Also, the attitude of the animals transmits feelings
of comfort and protection, as they are sitting together under the sunlight, resting
but watching attentively. The tinkling music matched with these words and photo
conveys mental images of childhood, innocence and light-heartedness,
which are part of input space 3 (see Plate 3).

GENERIC SPACE: parenthood; lions are strong; lions protect their brood;
    father and mother bring up their children together; children must feel safe
INPUT SPACE 1 (verbal): I had a strong mother; I missed a father as a child;
    I missed guidance; I missed support and help
INPUT SPACE 2 (visual): lion; young lions; sitting together; stillness; safety; Africa
INPUT SPACE 3 (acoustic): tinkling; childish; light-heartedness; happiness
EMERGENT FINAL BLEND: African setting; a father is strong and supportive;
    a father provides protection; a father transmits a sense of peace;
    narrator did not feel protected and supported by a male parent

Plate 3. Conceptual integration in slide 7 (EH3)

The generic space activated by the elements of these three input spaces in-
cludes several cognitive and cultural models not explicit in the story: an abstract
schema of parenthood and what it involves; the image of lions as strong ani-
mals and the common knowledge that lions fiercely protect their brood;
the conventional models that children are brought up by both parents,
male and female and that children must feel safe and happy.
By cross-mapping the information provided by the input spaces, as enabled
by the commonalities evidenced by the models activated in the generic space, the
narrator as a child is projected onto the image of the lion brood and the absent
father figure onto the lion; the strength and fierceness commonly associated with
lions are matched with the support and care of parents for their children; the still-
ness of the lions in the image corresponds to the peace of a child who feels secure;
and all that is framed by a general feeling of safety, peace and innocent happiness
prompted by the music.
Thus, a final blended space is constructed, with an emergent structure that
integrates elements from the input spaces, selectively projected and fine-tuned
by the previous knowledge of the discourse participants. That blend shows the
lion as a metaphor for a good father and states that the narrator feels he would
have become a better man if he had had a strong and supportive father, who had
protected and guided him as a boy and stayed by him and granted him a happy
and carefree childhood.
The next slide is just black, which means that the lions are gone, and the father
image is missing, and so are the feelings of safety and protection provided by him.
Besides, black means darkness, which stops us from seeing, from knowing what
comes next, and so it is culturally associated with cognitive models of fear and
loneliness. Also there is a change in the music, indicating that childhood is over
and there is a new stage in the narration of James’s life.
The whole story is composed of thirty-nine slides, and the blends for each
of these also act as input spaces in the whole network that will generate the final
blend of the narrative. Moreover, the mappings between input spaces can also be
done between different slides. For instance, the African setting is provided not only
by the image of the lions in the 7th frame, but also by the slides that display
his house on a street in Africa (4th, 5th and 6th), another (33rd) with a symbolic
image of a pregnant womb with a map of Africa on it and several others where
black African people appear in different attitudes – laughing (34th), attending
a lesson (20th), at a graduation ceremony with a white robe (9th) – or just some
landscapes (26th, 35th…). Therefore, the whole visual mode can be regarded as a
narrative itself where several input spaces are cross-mapped for the construction
of a blended space. Similarly, the verbal and the acoustic modes blend into spaces
that contribute to the final emergent one for the whole narrative.

6. Concluding remarks

In this paper, the meaning potential of thirty multimodal digital narratives has
been explored from both a multimodal perspective (verbal, pictorial and musical
resources) and a cognitive approach (Mental Spaces and Conceptual Integration
Theory). The oral channel has proved to be the leading mode, supported by im-
ages and music. Visual metaphors, cultural symbols and identifications provide
additional meanings beyond the mere factual content of the texts. The music,
when present, frames and contextualises the whole story. A highly complex men-
tal image of the story is achieved by integrating factual, cultural and emotional
meanings in an emergent blend that makes it possible to convert a minor, private
story, often culturally specific and only individually relevant, into a universal,
exemplary story. Our analysis thus attempted to show how different meanings and
modes integrate to convey a very complex and cognitively demanding message to
society, because digital stories are typically personal events narrated by an
individual which, when posted on a website, are intended to become universal,
reaching thousands of people who may feel mirrored in them. It is in this sense
that digital stories constitute a good example of narratives as a tool for thinking, or a cognitive artifact
(Herman, 2003), that is, something used by humans for the purpose of aiding,
enhancing or improving cognition.
We are cautious about the interpretations provided here on the basis of the
study of the sample. Further research on a more extended corpus of digital stories
in English is needed. In-depth interviews with digital narrators, investigating
their narrative processes, and with viewers are also encouraged in order to test
whether the explanations provided in this paper can be confirmed. In addition, the
importance of parameters such as individual narrative style should be studied to
determine their influence on the narrative structure. It is important to remember
that the stories were created by non-professional, non-expert storytellers who, on
the one hand, followed the instructions of monitors in workshops and, on the
other, may have felt constrained by the technical equipment or the software they
used for the creation of the stories, or even by their unequal abilities for the differ-
ent stages of the process (editing images, reading aloud, playing music, etc.).
However, the fact that some of the features here described and analysed were
not intended by the narrator does not invalidate the conclusions on how those
features affect the interpretation and influence the construction of the final
emergent story. For example, in SC5, instead of background music there is some-
one humming a song first and later singing it aloud. This is probably the result of
some advice given by the workshop organisers about the use of original music or
sounds, or even of people singing instead of using copyrighted material. But the
point is that this humming makes the whole story far more personal and the emo-
tional meaning attached to it stronger than it would have been with pre-recorded,
commercial music.
Despite these limitations, it is hoped that the multimodal
and cognitive analysis will contribute to deepening our understanding of the com-
plex ways in which different modes and resources combine to form meaning in
these dynamic digital stories.


Notes

1. This paper forms part of a research project on Narration, Discourse and Cognition financed
by the Spanish Ministry of Science and Innovation (FFI2009-13582). The authors would like to
thank Profs. Anne McCabe, Carmen Pena and the two anonymous reviewers for their review
of and constructive comments on an earlier version of this chapter.

2. http://www.storycenter.org/cookbook.pdf,
http://www.bbc.co.uk/wales/audiovideo/sites/yourvideo/pdf/aguidetodigitalstorytellingbbc.pdf,
http://thecollaboratory.wikidot.com/creating-a-digital-story,
http://net.educause.edu/ir/library/pdf/ELI08167B.pdf, http://www.inms.umn.edu/elements.

3. Engender Health: http://www.engenderhealth.org,
Just Associates: http://www.justassociates.org,
The Center for Digital Storytelling: http://www.storycenter.org,
The BBC project Telling Lives: http://www.bbc.co.uk/tellinglives,
Digistories: http://www.youtube.com/digistoriesuk,
Creative Narrations: http://www.creativenarrations.net,
Bristol Stories: http://www.bristolstories.org,
Digital stories@UMBC: http://www.umbc.edu/oit/newmedia/studio/digitalstories.

References

Cameron, L. (1999). Operationalising metaphor for applied linguistic research. In L. Cameron
& G. Low (Eds.), Researching and applying metaphor (pp. 3–28). Cambridge: Cambridge
University Press. DOI: 10.1017/CBO9781139524704.004
Dancygier, B. (2008). The text and the story: levels of blending in fictional narratives. In
T. Oakley & A. Hougaard (Eds.), Mental spaces in discourse and interaction (pp. 51–78).
Amsterdam: John Benjamins. DOI: 10.1075/pbns.170.03dan
Emmott, C. (1997). Narrative comprehension: A discourse perspective. Oxford: Oxford Univer-
sity Press.
Fauconnier, G. (1997). Mappings in thought and language. Cambridge, MA: Cambridge Univer-
sity Press. DOI: 10.1017/CBO9781139174220
Fauconnier, G., & Turner, M. (2002). The way we think: Conceptual blending and the mind’s
hidden complexities. New York: Basic Books.
Halliday, M.A.K., & Matthiessen, C.M. (2004). An introduction to functional grammar. Third
revised edition. London: Edward Arnold.
Herman, D. (2003). Stories as a tool for thinking. In D. Herman (Ed.), Narrative theory and the
cognitive sciences (pp. 163–192). Stanford: CSLI Publications.
Herman, D. (2004). Toward a transmedial narratology. In M.L. Ryan (Ed.), Narrative across
media: The languages of storytelling (pp. 47–75). Lincoln: University of Nebraska Press.
Herman, D. (2009). Cognitive approaches to narrative analysis. In G. Brône & J. Vandaele
(Eds.), Cognitive poetics: Goals, gains and gaps (pp. 79–118). Berlin: Mouton de Gruyter.
Hodge, R., & Kress, G. (1988). Social semiotics. Cambridge: Polity Press.

162 Isabel Alonso, Silvia Molina and María Dolores Porto

Kress, G., & van Leeuwen, T. (1996) (2nd edition 2006). Reading images: The grammar of visual
design. London: Routledge.
Labov, W., & Waletzky, J. (1967). Narrative analysis: Oral versions of personal experience. In
J. Helm (Ed.), Essays on the verbal and visual arts (pp. 12–44). Seattle, WA: University of
Washington Press.
Lakoff, G., & Turner, M. (1989). More than cool reason: A field guide to poetic metaphor.
Chicago: University of Chicago Press. DOI: 10.7208/chicago/9780226470986.001.0001
O’Halloran, K., & Lim Fei, V. (2009). Sequential visual discourse frames. In E. Ventola & A.J.
Moya Guijarro (Eds.), The world told and the world shown: Multisemiotic issues (pp. 139–
156). London: Palgrave Macmillan.
Oakley, T., & Coulson, S. (2008). Connecting the dots: Mental spaces and metaphoric language
in discourse. In T. Oakley & A. Hougaard (Eds.), Mental spaces in discourse and interaction
(pp. 27–50). Amsterdam: John Benjamins. DOI: 10.1075/pbns.170.02cou
Porto, D., & Romano, M. (2010). Conceptual integration in natural oral narratives. Actes des
journées d’étude Narratology and the new social dimension of narrative (01–02 Février
2010). At http://narratologie.ehess.fr/index.php?681 (last accessed 8th March 2013).
Ryan, M.L. (2005). On the theoretical foundations of transmedial narratology. In J.C.
Meister (Ed.), Narratology beyond literary criticism: Mediality, disciplinarity (pp. 1–23).
Berlin/New York: Walter de Gruyter. DOI: 10.1515/9783110201840.1
Semino, E. (2009). Text worlds. In G. Brône & J. Vandaele (Eds.), Cognitive poetics: Goals, gains
and gaps (pp. 33–37). Berlin: Mouton de Gruyter.
Turner, M. (2008). Conceptual integration. In D. Geeraerts & H. Cuyckens (Eds.), The Oxford
handbook of cognitive linguistics (pp. 264–293). Oxford: Oxford University Press.
Unsworth, L., & Cleirigh, C. (2009). Multimodality and reading: The construction of meaning
through image-text interaction. In C. Jewitt (Ed.), The Routledge handbook of multimodal
analysis (pp. 151–163). London: Routledge.


Annex (last accessed March 2013)

Code  Title  Link
BS1  The Home Land  http://www.bristolstories.org/story/111
BS2  Despite my fears  http://www.bristolstories.org/story/73
BW1  Never Too Old to Learn  http://www.bbc.co.uk/wales/audiovideo/sites/yourvideo/pages/catherine_collins_01.shtml
CN1  What the water gave me  http://www.creativenarrations.net/node/16
CN2  Bad Choices  http://www.creativenarrations.net/node/76
CN3  To every child  http://www.creativenarrations.net/node/78
DS1  Stop Bullying  http://www.youtube.com/digistoriesuk#p/c/E2571E3579DEA1CF/5/6YsFZjbH6_Y
DS2  Culture Clash  http://www.youtube.com/digistoriesuk#p/c/E2571E3579DEA1CF/0/2wv37Co_aRU
EH1  Nelao’s Story  http://www.youtube.com/user/engenderhealth#p/u/32/ryAZlNjGot8
EH2  Ngamane’s Story  http://www.youtube.com/user/engenderhealth#p/u/31/30UuSNy4-pY
EH3  James’s Story. Rock Bottom.  http://www.youtube.com/user/engenderhealth#p/u/36/15Tb3vYgQ1U

Multimodal digital storytelling


EH4  Naresh’s story. I’m not alone.  http://www.youtube.com/user/engenderhealth#p/u/5/rW4m8K--WS4
EH5  Sibongiseni’s Story  http://www.youtube.com/user/engenderhealth#p/u/20/BI7DOgX6sAY
EH6  Manoj’s Story. It’s 9 am.  http://www.youtube.com/user/engenderhealth#p/u/28/pn2Tb7sQuPk
EH7  Gary’s Story. Mission  http://www.youtube.com/user/engenderhealth#p/u/34/mKHzMfhnpSw
EH8  Lillo’s Story  http://www.youtube.com/user/engenderhealth#p/c/920999EC5CA7C216/5/Vs4PHB62TPs
SC1  Untitled- a transgender boy  http://www.storycenter.org/PEdined.php?cat=7

SC2  Sacrificios  http://www.storycenter.org/stories/index.php?cat=4
SC3  Privilege  http://www.storycenter.org/stories/index.php?cat=3
SC4  Bruises  http://www.storycenter.org/stories/index.php?cat=3
SC5  Memories of a Political Prisoner from Worcester  http://www.storycenter.org/stories/
SC6  The Balcony  http://www.storycenter.org/stories/
SC7  Mixed Race Me  http://www.storycenter.org/stories/index.php?cat=7
SC8  My Shoes  http://www.storycenter.org/stories/
TL1  Stop this Madness  http://www.bbc.co.uk/humber/telling_lives/ram_files/roz_carr.ram
TL2  Whatever happened to Miss Pears?  http://www.bbc.co.uk/humber/telling_lives/humber_intermediary4.shtml
TL3  McGuire  http://www.bbc.co.uk/humber/telling_lives/40_humber_intermediary.shtml
UM1  Real Men Do Housework  http://www.umbc.edu/oit/newmedia/studio/digitalstories/projects.php?movie=ELC054_S09ByungchangKim.flv
WL1  The day I Made Him Stop  http://www.youtube.com/user/WomenCrossingtheLine#p/search/0/hneAZCEl5v4
WL2  Power, HIV and the Feminist Movement Building  http://www.youtube.com/user/WomenCrossingtheLine#p/u/1/KLI4N0XelSM


Part III

Cognitive Linguistics and multimodal interaction

Intermedial cognitive semiotics
Some examples of multimodal cueing
in virtual environments

Asunción López-Varela
Universidad Complutense Madrid

In this chapter, intermediality is explored from an interdisciplinary perspective
that draws on neuroscience, cognitive-semiotic concerns, and insights from
online digital communication, presenting it as a process in which biophysical,
technological, and interpersonal factors interact. Shared attention as well as
spatial and temporal cueing – eye contact and the sonic modality – are explored
from a task-oriented and social interactive dimension. The spatiotemporal impact
of the mediating context is highlighted by moving from the role of visual
cueing, in a brief reference to Al Davison’s (2003) autobiographical graphic
novel The spiral cage, to a more detailed analysis of Annie Abrahams’ (2010) online
project A fragmented relation, where cueing is dependent not just on spatial
frames but also on the temporal dynamics introduced by the aural dimension
recorded in an online environment. The paper tangentially touches upon the
role of affect in communication.

Keywords: Al Davison, deictic cueing, cognitive semiotics, intermedial studies,
Annie Abrahams

1. Intermedial semiosis and cross-modal mappings

During the 20th century, research in a number of disciplines, from semiotics
(work by Peirce, 1932) to cognitive linguistics (from Fauconnier, 1994, 1997 to
Forceville & Urios-Aparisi, 2009), has indicated that what seems to us to be external
reality is in fact shaped in great part by our embodied selves. “There is a certain set
of facts which are ordinarily regarded as external, while others are regarded as
internal… thus the sensation of redness is as it is, owing to the constitution of the
mind” (Peirce, 1991, p. 27). In this way, brain activity takes the form of representations
(signs) that correlate with the neural processes initiated by sensory input.

doi 10.1075/bct.78.11lop
© 2015 John Benjamins Publishing Company

For Charles Sanders Peirce and the Harvard School of Pragmatism, a sign
is an object that stands for another so that an experience of the former affords
knowledge of the latter in some respect or capacity. This includes sounds, images,
gestures, scents, tastes, textures, words, etc. The sign creates in the mind a
more developed sign, a mental effect or thought that Peirce calls ‘interpretant’
and which gives the sign significance or meaning, becoming in turn a sign in a
dynamic process ad infinitum. Interestingly, for Peirce, “cognition involves something
represented, that of which we are conscious, and some action or passion of
the self whereby it becomes represented. The former shall be termed the objective,
the latter the subjective, elements of cognition” (Peirce, 1991, p. 46; my emphasis).
Alongside the deictic aspects of intermedial semiotics, this paper seeks to unveil
some of the intersubjective aspects of this ‘passion of the self’.
The most basic mechanism for selecting information is to process stimuli
from a limited portion of space. This function is mediated by spatial attention,
which captures information by means of different sensory modalities
(sight, sound, etc.). For example, when trying to follow a conversation in a noisy
environment, attending to relevant lip movements and gestures may be as important
as attending to the speaker’s voice (Magosso, Serino, Pellegrino, & Ursino,
2010). It has been found (Blake, Sobel, & James, 2004; Kemmerer, 2006) that spatiotemporal
contiguity in the processing of information across sensory modalities
facilitates cross-modal coordination, so that, for instance, the sense of balance
located in the vestibular portion of the human inner ear can provide information
about the position of the human body. The senses of hearing and smell also
signal positions of other objects in relation to the body, even when visibility is
excluded.
Recent research on mirror neuron structures appears to indicate that different
senses encode proto-objects in similar ways. These neurons, located mainly
in brain areas that involve sense perception and motor activities (premotor
cortex, primary somatosensory cortex, inferior parietal cortex), respond not
only to one’s own bodily acts, but also to actions performed by others, thus providing
mechanisms for learning by means of imitation. Jenson and Iacoboni (2011) argue
that mirror neurons form the neural basis of the human capacity for emotional
engagement, with emotions having a very strong impact on cognition and
representation. However, despite the excitement generated by these findings, no
consensus has been reached on the kind of mirror neuron activity that supports
cognitive functions such as understanding the intentions of others and the imitation
of their actions (Churchland, 2012). Therefore, some neuroscience aspects
discussed here remain at the level of hypothesis.
Researchers seem to agree, however, that from one unconscious fixation
to another, working memory only retains certain basic properties and relative
locations of a small number of objects. Quick index assignment, without categorization,
seems to precede sensorial (visual, auditory, etc.) attention to a proto-object.
The spatiotemporal variables involved in working memory index proto-objects as
future targets for motor commands, so that objects are detected without being
conceptualized. The encoding of sensorial properties takes place at a later stage,
as long as the indexed object remains within perception (visual, auditory, etc.)
and perhaps for a short time thereafter. This is possibly an unconscious strategy
that maximizes the amount of information in rapidly changing environments
(Glickstein & Doron, 2008; Rensink, 2000). Focal attention (attentive to properties
such as colour, shape, etc. in the case of vision; pitch, tone, timbre, and so on,
in the case of sound) is employed later to individualize temporarily indexed items.
This means that orientation is first directed towards the temporal and dynamic
rather than the spatial aspects that involve marks, tags and categorization.
For some researchers from the field of neuroscience (Kurylowicz, 1964,
p. 180; see also Wind, 1989), the functioning of mental pointers, which establish
provisional spatiotemporal positions (directionality, localization), seems akin to
the textual deictic markers used in intra- or extra-discursive reference. Charles J.
Fillmore found evidence that the accusative or direct object case in Indo-European
languages may have its origin in pointing and gestures, and that certain inflections
are also used with a locative/directional meaning (e.g. in English phrasal verbs
such as ‘point out’). In his research on deixis, Fillmore described causal connections
enabled by pronouns that point back and forth within texts and also outside
them, offering information about “the identity of the conversation partners, the
nature of the social context or the social relations between partners” (Fillmore,
1975, p. 75). More recently, Grishakova (2012) traces the workings of indexical
pointing from ‘speech events’ to ‘narrated events’ and their possible ontogenetic
ability to adopt another person’s perspective, acknowledging another as a self.
Her paper focuses on the impact of fictional world figuration in cognitive semiosis
as applied to visual narratives in particular. However, despite rapid progress on
the topic, more research is needed in order to unveil the relationships between
human perceptual modes, their corresponding mental mappings, discourse, and
the cognitive complexities that computer-mediated environments add to human
communication.
Although sensory inputs from different modalities function together, vision
remains, for most researchers, the most important perceptual mechanism. Many studies
in cognitive linguistics have focused on establishing the relationship between
discourse and mental mappings in the form of images (on this see Gestalt psychology;
Fauconnier, 1994, 1997). In neuroscience, Event-Related brain Potential
(ERP) tests, such as poking volunteers’ forearms while letting them look or not,
have indeed shown that when one modality is blocked, volunteers show shifts in
attention and sensorial response. For instance, activity elicited by touch in the
somatosensory cortex increases when the volunteers have just been looking at
their arm. Other experiments have revealed cross-modal interactions between
vision and touch and between vision and audition. To give other examples, sound
can influence the perceived roughness of a touched surface (Guest, Catmur, Lloyd, &
Spence, 2002), and touch can influence visual perception of surface texture and
surface slant (Blake et al., 2004, p. 397; Ernst, Banks, & Bülthoff, 2000). Some
people experience specific colours in relation to differences in musical pitch, so
that musical patterns are converted into visual experience (Critchley, 1977).
As mentioned above, selective attention is the first step in multimodal perception,
followed by an analysis of information to produce separable data (colour,
shape, etc.), incorporation of the experience in the brain by means of body-schemas,
generalization of features from repetition of the experience, abstraction and
establishment of a concept-structure by means of the elimination of accidental
features found in differing circumstances and, finally, the relation of this
concept-structure to specific patterns in order to activate a functional response. Due to
space constraints, the following section of this paper will offer a glimpse of how
visual cueing captures attention in fictional/virtual contexts such as graphic novels.
The last part will explore multimodal cueing in online communication.

2. Intersubjective communication and pointers

Research on the relation of language, phasing (temporal coding), motor experiences
and bodily actions by means of collections of subroutines began in the
1980s (e.g. work by Schmidt, 1982, 1991; Tresilian, 2012). Findings revealed that
human agentive consciousness is task-oriented and that certain factors such as
novelty, frequency and emotional and affective input, together with multimodal
perception, have a high impact on long-term memory consolidation (e.g. Bach,
Schachinger, Neuhoff, Esposito, & Di Salle, 2008).
As noted, initial deictic references do not provide cognitive encoding in terms
of absolute properties of the proto-object or event. Instead, in order to function in
dynamic scenes, they refer to the relations taking place between the objects and
the perceiver/actor and “allow the system to map a newly perceived property onto
a representation of the object that had been previously (incompletely) encoded”
(Pylyshyn, 2000, p. 200). Zlatev (2008) distinguishes five levels of what he terms
‘bodily mimesis’ in the early stages of human development. Acts such as neonatal
mirroring (e.g. mutual gaze) are representational and gradually intentional
(‘proto-mimesis’). Action imitation, shared attention and mirror self-recognition
make up the second stage (‘dyadic-mimesis’). In terms of somatosensory development,
these two stages involve cross-modal mappings between exteroception
(i.e. perception of the environment, generally dominated by vision and sound
until the infant is able to move and touch), interoception (sense receptors which
are sensitive to internal conditions of the body), and proprioception (i.e. through
kinesthetic sense perception encoded in receptors which provide information regarding
muscle tension). It is not clear if wielded objects, encoded as tools, become
incorporated into the body schema (largely innate, in the sense of being
present at birth), so that the end of the tool effectively becomes a bodily extension.
Some recent studies have disclosed the existence of tactile input to neurons
in object-selective visual areas (on this see work by Gallese & Lakoff, 2005, and
Gallese, 2008).
Importantly, Zlatev (2008) argues that the focus on shared attention in ‘dyadic
mimesis’ goes a long way towards the construction of a consensual reality
because it depends on a multi-tasking ability to shift one’s perspective across distinct
axes – the common focal point, the other’s attention to the same, and also
the other’s attending to one’s own attending. The third stage, ‘triadic-mimesis’,
involves declarative pointing, iconic gestures and full joint attention. This means
that human interactions require not only an understanding of the representational
relation between one’s bodily motion and the object, action or event it
corresponds to, but also the realization that such a representational relation can be
used communicatively because it has a similar meaning for the receiver as for the
sender. In other words, sign-representations are not just related to objects but to
the actions and events that include both agent and observer/actor situated in the
world in terms of interpersonal relations (see also Gallese, 2008).
As pointed out earlier in this paper, Peirce had already established the basis
for intersubjective cognition leading to levels of systemic complexity in
message-formation and reception which engage both natural and cultural contexts
and where signs are not merely mediators in communication but building-blocks
in creation, inference and interpretation: “[…] signs require at least two
Quasi-minds; a Quasi-utterer and a Quasi-interpreter… Accordingly, it is not merely a
fact of human Psychology, but a necessity of Logic, that every logical evolution of
thought should be dialogic” (Peirce, 1906, pp. 523–524).
In the case of human discourse, Paul Grice also identified some features of
cooperation which he summarized in a well-known set of maxims. Work by
Searle (1995), or Sperber and Wilson’s (1986) notion of ‘relevance’, also placed
an emphasis on this common underlying cooperative principle, which occurred
particularly in face-to-face communication, requiring the sharing of the same
spatiotemporal coordinates, but also in written communication, which recreates an
absent spatiotemporal context.

In Zlatev’s classification, ‘post-mimesis’ involves an understanding of others
as proto-agents whose motor representations are understood in a task-oriented
manner that enables their actions to be interpreted. Human ability for long-term
recollection helps to contextualize individual behaviour (desires) in terms of
collective, shared goals (beliefs, values, norms, the representations of which are
termed ‘culture’). This process might be achieved by means of the development
of empathy, enabled by the artificial or virtual construction of fictional worlds
in storytelling (on this see Grishakova, 2012; López-Varela, 2010). These studies
also show that the position from which each person perceives the world is essentially
semiotic and represents desires of future interaction and cooperation.
Psychological research also appears to support the view that self-conception is inseparable
from spatiotemporal coordinates that contribute to self-organization (Gillespie &
Cornish, 2010; Stolorow & Atwood, 1992). The discourses and actions of other
people challenge self-responses by directing attention (perceptual, cognitive and
metacognitive) to the object, person or event, by triggering a functional action
tendency, and by affective involvement. Emotions are only aroused when signs
are mobilized in such a way as to affect pre-existing dispositions to action and
reaction. Gestures, perceptual cues (eye-contact, facial expressions), stylistic elements
in the case of spoken and written language, etc. determine the functional
capability for the perceived signs to fulfil a particular expectation, need or desire.
In terms of the visual mode, mutual gaze is probably the most basic indexical
element in engaging and sharing attention by empathic means. Eye contact is not
just used in human-to-human communication in ‘real’ environments. It can also
occur in fictional/virtual contexts (note: virtual is not just digital; the virtual mode
functions in contexts that do not share the same spatiotemporal coordinates).
Al Davison’s autobiographical graphic novel The spiral cage is a good example
of how mutual gaze is employed as a pointer to capture and direct the audience’s
attention and empathic engagement. The brain is very rapidly engaged by someone’s
direct gaze (e.g. George & Conty, 2008; Wade, 2010). Throughout his story,
Davison makes use of scenic and cinematographic close-ups where readers find
themselves staring into his eyes as a young disabled man. Once eye-contact is
achieved, empathy is triggered and the audience is able to move into Davison’s
own space-time. As I have explained elsewhere (López-Varela, 2011), Davison’s
work does not follow the source-path-goal structure based on horizontal movement,
one of the most basic mental schemas. Instead, the narrative focuses on the
category of force to present Davison’s autobiographical struggle with mobility. By
means of empathy, the audience re-enacts his specific perceptual sensations and
behaviour as a sufferer of spina bifida.

3. Intermedial attunement

The expression ‘attunement’ originates in the field of music and refers to the measures
taken to solve discrepancies and dissonances among different instruments
so that they can play in tune. Attunement is achieved by means of a correlation of
simultaneity. In the visual mode, simultaneity is favoured within spatial contiguity,
so that the use of text and image in written formats (e.g. in comics and graphic
novels) helps convey information more easily. Along with simultaneity, temporal
contiguity is also important in multimodal crossings between, for instance,
vision and sound, as the online example analyzed in this paper shows.
In the case of human hearing and its relation to speech, several attempts have
been made to relate sound deixis, motor action and task-oriented phoneme selection
(Abry, Vilain, & Schwartz, 2004). There is evidence that mirror neuron mappings
have allowed connections between pointing and language (Gallese, 2006;
Iacoboni, 2005). Auditory sensation involves the ability to perceive and localize
sound sources in space by means of acoustic waves, vibrations and pulses that act
on the body. Once intensity, frequency and duration are sensed, auditory perception
requires a higher level of cognitive processing in association with previous
experience, and the formation of auditory images, that is, psychological representations
of sounds exhibiting internal coherence. Attention correlates with fluctuations
in loudness, pitch, perceived duration, spatiality and timbre.
In hearing, deictic pointers also hold a position as mental indexes, carrying
information about sound sources, their interrelationships, and the environment
(e.g. sound movement in the horizontal and vertical plane, distance, depth, etc.).
The difference in pitch sensation and degree of correlation between the left and
right ear (binaural hearing) improves sound perception (for some species of
birds, such as owls, hearing asymmetry is the main localization cue in the vertical
direction). Distance can also be estimated on this basis, since distant sounds have a
lower frequency than the sounds radiated from proximal sound sources. Motion
is located by changes in the angle of sound perception. In humans, the timbre of
a continuous sound can be perceived at 60 m and its variations are the basis for
sound classification in music. In the absence of vision, humans can also determine
a person’s size by changes in pitch sensation and by calculating the mass of
air displaced when the person walks (for further information see Lopez-Poveda,
Palmer, & Meddis, 2010).
In human-to-human communication, the case of speech sound is different
from the perception of all other complex sounds because it is cognitively tied
to the process of speech production (where the stream of air is controlled and
processed by the lungs, the larynx, the vocal folds and tract, and the articulators
located in the mouth cavity − tongue, teeth, and lips − whose movements divide
the vocal tract into a series of resonance tubes and cavities). If the speech sounds
are unfamiliar to the listener (e.g., an unknown foreign language), the
coupling between perception and production may be lost, and sounds are not recognized as
language. At a party, for instance, where speech might not be audible, a listener
might be able to hear his or her own name because awareness does not only depend
on sonic qualities but also on co-operating semiotic sources such as attention,
gestures, and the cultural and emotional aspects of communication.
The case of human-to-human communication by means of computer interaction
has particular characteristics because it does not take place in spatial contiguity
(and sometimes not in temporal simultaneity, as we shall see below). The
digital medium of computer communication increases the weight of touch, vision
and sound. Tactile information can also relate to visual information from distant
objects because tools (in this case the keyboard, the mouse or touch-screens)
are used to extend possible reaching space (Bouchardon & López-Varela, 2011;
Maravita, Spence, Kennett, & Driver, 2002).
The notion of interactivity in digital environments means that performativity
is about physical ‘actions’ performed by participants (clicking the mouse or touching
the screen; looking through the Webcam; speaking through the microphone),
not just mental interpretations or readings. But not all digital formats are modifiable,
and when they are, only certain manipulations are allowed. Some formats,
such as pdf, protected images, certain graphics, and some sound and video files,
do not allow copy-paste, TV capture, or Photoshop editing. Some applications (Apple
products such as iPad, iPodTouch, iPhone, or the Android Open Source Project
AOSP) encourage the use of touch, as does Braille translation software, which
converts visual text to tactile format for visually impaired users (among Screen
Reading Software and Text to Speech programs the most widely used is ‘Jaws’).
There is also software that allows the incorporation of biophysical environmental
elements (temperature, light, body heat, eye-tracking movements and other touch
and motor activities).
Since the early 1990s, the widespread use of computer-mediated communication
on the World Wide Web has encouraged arguments in favour of the potential use of
these technologies to connect people and empower social groups. In the 21st
century, user/participant interactions have increased dramatically, enabled by social
software applications such as blogs and platforms like Twitter or Facebook.
The debate on the power of such networks upon public opinion remains open,
and research on the workings of online interpersonal relations and communication
may shed some light upon the ongoing discussions.

4. Online multimodal communication

In order to offer a glimpse of the spatiotemporal constraints involved in online
communication, the following lines explore Annie Abrahams’s networked performance
entitled A fragmented relation (http://vimeo.com/4812716). It questions
online collaboration by presenting two people, a man and a woman (Abrahams),
attempting to cooperate online. The project explores the significance of silences in
a video-recorded display that lasts over 14 minutes. The performance begins with
a human hand making shadow-shapes on a wall. It recalls Plato’s allegory of the
cave, used to describe the material world of perceptions and sensations as illusory
and inferior to the world of Forms or Ideas. The importance of achieving a relativistic
vantage point that enables meta-cognition is emphasized in this project.
The silent hand-shadow of a pink rabbit becomes a male voice saying hello, and
a hand touching the rugged surface of the wall. Viewed through the camera-eye,
the perceived object (hand) works as a metonymical pointer. Presumably, there is
a human owner attached to the hand. Three perceptual modes meet in this opening
sequence: vision, sound, and touch, all mediated through the camera-eye. The
audience maintains visual contact and watches the hand stroking the wall while
the colour of an artificial light changes, conveying distinct emotional hues. The
sonic modality suddenly stops, reconnects and stops again. The screen goes black
for a brief moment. The male voice returns, and two French words are simultaneously
heard and visualised in English transcription on the bottom left of the
screen: “impressed even”. Black and silence follow before a female voice in French
is heard, and the accompanying English transcription is shown on the bottom
right: “very very very happy”.
In both cases, sound is stereophonic, but visuality acts as bridging modality
and the audience associates the male voice with the left side of the screen, and
the female voice with the right. This is also enabled by the deictic cueing coming
from the spatial location of the English transcriptions. Silence and black return
to the screen, creating an uncomfortable communication gap. More speech in
French transcribed in the centre and bottom: “Is it pleasant to touch. What can
be pleasant to touch?” Another window opens on the left and the hand strokes a
pink teddy-bear while the female voice is again heard in French, transcribed in
English on the bottom right: “I don’t know” “What can be nice to touch?” “The
skin of a person may be?” The hand continues to stroke the toy while sound is
again interrupted. In the absence of the soothing voice, the familiar gesture be-
comes strangely uncanny (strokes on the wall and the toy bring different empathic
responses because the toy’s features are apprehended in their similarities to
living things). Visual connection is again interrupted and the camera moves oddly
until it focuses on the face of a man. The speaker is obviously sitting in front of a
webcam that follows his movements. He talks through a head-set. An audience
familiar with the language would identify his speech as French. He explains that
sometimes human reactions cannot be anticipated. As he talks, his gaze moves
deictically towards the right where another window opens and a female voice
speaks: “et moi” (English transcription on the bottom-right). For a non-bilingual
speaker, this doubling effect in two different languages undoubtedly contributes
to the estrangement of the situation. The cueing process only works if one is able
to recognize the language and the words uttered as French, and identify them
with the English transcription. The focus of the webcams begins to fluctuate again
and to move in several directions, as if attempting to point towards a particular
place. The sonic modality sets the audience’s expectations to rest: “I don’t know
what to do”. This sentence provides reassurance that communication has not been
interrupted and that cooperation continues. The uncanny effect returns when si-
lence is again prolonged and the camera moves around the male speaker. “How to
manage a fragmented relation” (une relation entre-coupé) says the female voice
as the screen goes black once again (except for the English transcription). She
laughs. Information exchange between the two voices seems to continue, but it is
not perceived as a dialogue.
In this piece, interruptions take place at the visual and sonic level. If one of the
two perceptual modes works, the audience is able to maintain connection. How-
ever, both channels are intermittently interrupted, and the uncanny communica-
tion gaps create a desire to fill the voids by performing some kind of intervention,
for instance moving the cursor of the video-recording to see what happens next.
At approximately minute 4, both begin to speak as if they are unable to hear each
other. At first it sounds like two simultaneous monologues, but one soon realizes that
they are answering each other, although the responses are slightly
delayed. This shows the importance of temporal contiguity.
The speakers reflect on their previous relationship. The fact that they know
each other comes as a relief and a guarantee of perhaps greater connection between
them, even if the technological context does not enable it. Reassurance is
achieved when all modal cues are activated, that is, when the two windows are
displayed simultaneously and their faces are visible, showing occasional eye-contact
with the audience; when sound is synchronized; and when there are signs
of emotional connection, such as smiles or laughs, between the speakers. In the
absence of touch, physical contact is achieved through mutual gaze, agreement in
speech (e.g. emphasis on the word ‘sweetness’), and acts such as slowly stroking
the teddy-bear. All of these gestures convey tranquillity and reassurance.
Communication in Abrahams’ piece is not broken but syncopated. This creates
occasional frustration in the speakers. They discuss who conveys more ‘hope’,
but they do not seem to achieve consensus. However, this is not felt as a
communication breach. Anxiety arises from the interruptions that take place in the
channel, rather than from the speakers’ actions and discourses. This
is probably because the audience perceives them as friends. Nevertheless,
weariness grows as the channel fails to enable connectivity.
In minute 8, when the male speaker blows the smoke of his cigar into the
microphone, the audience experiences a synaesthetic moment triggered by the
blowing sound and the visual perception of smoke. Both deictic cues create a
mental association in which one is almost able to smell the cigar. The
last part of the video explores other aspects that impair or facilitate online com-
munication. A very important one is speed. Digital communication increases
the speed of information processing at all levels. However, this is not necessarily
translated into an improvement in communication. The last section of the project
shows how rapid changes create distractions and anxiety because the audience is
unable to track down information accurately.
As a summary, we can say that digital environments allow easy access to tracks
and segments of information by means of instantaneous captures of sound, text
and image which can be frozen for subsequent inspection, for instance, in video-
capture, as in the example analyzed here. However, navigation through sound
perception – whether speech or music – involves a kind of representation of the
complete piece (this is also true of text, which loses its coherence and narrativity
when too many links are introduced in hypermedia environments). Unlike visual
images which are held in space, sound has a temporal structure where deictic cues
are processed in their continuity as well as in their simultaneity.

5. Conclusions

The paper has traced the communicative impact of multimodal mappings, index
assignment and pointers both in cognition and discourse. It has also introduced
the potential role of affective phenomena and intersubjectivity in engaging coop-
eration. After a succinct reference to visual indexes and the role of eye-contact
in capturing attention and conveying empathic responses in Al Davison’s auto-
biographical graphic novel The Spiral Cage, the paper has offered a more detailed
reading of Annie Abrahams’ “A Fragmented Relation”, which explores both vision
and sound in online-collaboration. The case study shows the importance of me-
diating channels (whether analogue or digital) upon the spatiotemporal axis of
perception, with the slightest cues (spatial visual frames in the case of the graphic
novel, and time-lag in the digital setting) having a significant impact on communicative
situations.


References

Abrahams, A. (2010). A fragmented relation. Produced by <www.bram.org> CAC Centre d’Art
Contemporain, Le Quartier, Quimper. <http://aabrahams.wordpress.com/2009/05/25/
video-a-fragmented-relation-on-line/> <http://www.bram.org/info/cv_aabrahams_
english.pdf>.
Abry, C., Vilain, A., & Schwartz, J.L. (2004). Vocalize to localize? A call for better crosstalk be-
tween auditory and visual communication systems. Interaction Studies, 5, 313–325.
DOI: 10.1075/is.5.3.01abr
Bach, D., Schachinger, H., Neuhoff, J., Esposito F., & Di Salle, F. (2008). Rising sound intensity:
An intrinsic warning cue activating the amygdala. Cerebral Cortex, 18, 145–150.
DOI: 10.1093/cercor/bhm040
Blake, R., Sobel, K.V., & James, T.W. (2004). Neural synergy between kinetic vision and touch.
Psychological Science, 15(6), 397–402. DOI: 10.1111/j.0956-7976.2004.00691.x
Bouchardon, S., & López-Varela, A. (2011). Making sense of the digital as embodied experience.
CLCWeb: Comparative Literature and Culture, 13(3). DOI: 10.7771/1481-4374.1793
Churchland, P. (2012). Braintrust: What neuroscience tells us about morality. Princeton:
Princeton University Press.
Critchley, M. (1977). Ecstatic and synaesthetic experiences during musical perception. In M.
Critchley & R.A. Henson (Eds.), Music and the brain: Studies in the neurology of music
(pp. 217–230). London: Heinemann.
Davison, A. (2003). The spiral cage. London: Active Images.
Ernst, M., Banks, M., & Bülthoff, H. (2000). Touch can change visual slant perception. Nature
Neuroscience, 3(1), 69–73. DOI: 10.1038/71140
Fauconnier, G. (1994). Mental spaces. New York: Cambridge University Press.
DOI: 10.1017/CBO9780511624582
Fauconnier, G. (1997). Mappings in thought and language. New York: Cambridge University
Press. DOI: 10.1017/CBO9781139174220
Fillmore, C. (1975). Santa Cruz lectures on deixis. Bloomington: Indiana University Linguistics
Club.
Forceville, C., & Urios-Aparisi, E. (2009). Multimodal metaphor. Berlin/New York: Mouton de
Gruyter. DOI: 10.1515/9783110215366
Gallese, V., & Lakoff, G. (2005). The brain’s concepts: The role of the sensory-motor system in
reason and language. Cognitive Neuropsychology, 22, 455–479.
DOI: 10.1080/02643290442000310
Gallese, V. (2006). Intentional attunement: A neurophysiological perspective on social cogni-
tion and its disruption in autism. Brain Res. Cog. Brain Res, 1079, 15–24.
Gallese, V. (2008). Mirror neurons and the social nature of language: The neural exploitation
hypothesis. Social Neuroscience, 3, 317–333. DOI: 10.1080/17470910701563608
George, N., & Conty, L. (2008). Facing the gaze of others. Clinical Neurophysiology, 38(3), 197–
207. DOI: 10.1016/j.neucli.2008.03.001
Glickstein, M., & Doron, K. (2008). Cerebellum: Connections and functions. The Cerebellum,
7(4), 589–594. DOI: 10.1007/s12311-008-0074-4
Gillespie, A., & Cornish, F. (2010). Intersubjectivity: Towards a dialogical analysis. Journal for
the Theory of Social Behaviour, 40, 19–46. DOI: 10.1111/j.1468-5914.2009.00419.x


Grishakova, M. (2012). On cognitive and semiotic functions of shifters. Chinese Semiotic Stud-
ies, 8(2), 227–238. DOI: 10.1515/css-2012-0041
Guest, S., Catmur, C., Lloyd, D., & Spence, C. (2002). Audiotactile interactions in roughness
perception. Experimental Brain Research, 146(2), 161–171.
DOI: 10.1007/s00221-002-1164-z
Iacoboni, M. (2005). Understanding others: imitation, language, empathy. In S. Hurley &
N. Chater (Eds.), Perspectives on imitation: From cognitive neuroscience to social science
(pp. 77–99). Cambridge, MA: MIT Press.
Jenson, D., & Iacoboni, M. (2011). Literary biomimesis: Mirror neurons and the ontological
priority of representation. California Italian Studies, 2(1). <http://escholarship.org/uc/
item/3sc3j6dj>.
Kemmerer, D. (2006). The semantics of space: Integrating linguistic typology and cognitive
neuroscience. Neuropsychologia, 44, 1607–1621. DOI: 10.1016/j.neuropsychologia.2006.01.025
Kurylowicz, J. (1964). The inflectional categories of Indo-European. Heidelberg: Carl Winter
Universitätsverlag.
Lopez-Poveda, E., Palmer, A.R., & Meddis, R. (2010). The neurophysiological bases of auditory
perception. New York: Springer Verlag. DOI: 10.1007/978-1-4419-5686-6
López-Varela, A. (2010). Exploring intercultural relations from the intersubjective perspectives
offered through creative art in multimodal formats. Lexia, 5–6, 125–147.
López-Varela, A. (2011). Multimodal metaphor and intersubjective experiences: the impor-
tance of eye-contact in Davison’s graphic novel ‘The Spiral Cage’ and in Annie Abrahams
Net-Project ‘On Collaboration’. In L. Masucci & G. Di Rosario (Eds.), Lavori del Convegno
Palazzo degli Artista Italiani (pp. 307–324). Naples: Oficina di Letterature Electrónica.
Maravita, A., Spence, Ch., Kennett, S., & Driver, J. (2002). Tool-use changes multimodal spatial
interactions between vision and touch in normal humans. Cognition, 83, 25–34.
DOI: 10.1016/S0010-0277(02)00003-3
Magosso, E., Serino, A., Pellegrino, G.D., & Ursino, M. (2010). Crossmodal links between
vision and touch in spatial attention: A computational modeling study. Computational
Intelligence and Neuroscience. DOI: 10.1155/2010/304941 <http://www.hindawi.com/journals/
cin/2010/304941/>
Peirce, C.S. (1906). Prolegomena to an apology for pragmaticism, The Monist, 5(4), 492–546.
Reprinted in C. Hartshorne & A. W. Weiss (Eds.), Collected papers of C. S. Peirce. Vol-
ume V (pp. 530–572). Cambridge, MA: Harvard University Press.
DOI: 10.5840/monist190616436
Peirce, C.S. (1932). Division of signs. In C. Hartshorne & A.W. Weiss (Eds.), Collected papers of
C. S. Peirce. Volume II. Cambridge, MA: Harvard University Press.
Peirce, C.S. (1991). On the nature of signs. In J. Hoopes (Ed.), Peirce on signs. Chapel Hill: Uni-
versity of North Carolina Press.
Pylyshyn, Z.W. (2000). Situating vision in the world. Trends in Cognitive Sciences, 4(5), 197–
207. DOI: 10.1016/S1364-6613(00)01477-7
Rensink, R. (2000). The dynamic representation of scenes. Visual Cognition, 7, 17–42.
DOI: 10.1080/135062800394667
Searle, J.R. (1995). The construction of social reality. London: Allen Lane.
Schmidt, R.A. (1982). Motor control and learning: A behavioural emphasis. Champaign, IL: Hu-
man Kinetics Publishers Inc.


Schmidt, R.A. (1991). Motor learning and performance. Champaign, IL: Human Kinetics Pub-
lishers Inc.
Tresilian, J. (2012). Sensorimotor control and learning: An introduction to the behavioral neuro-
science of action. London: Palgrave Macmillan.
Sperber, D., & Wilson, D. (1986). Relevance: Communication and cognition. Oxford: Blackwell.
Stolorow, R.D., & Atwood, G.E. (1992). Contexts of being: The intersubjective foundations of
psychological life. Hillsdale, NJ & London: The Analytic Press.
Wade, N.J. (2010). Pioneers in eye movement. i-Perception, 1, 33–68.
<http://i-perception.perceptionweb.com/fulltext/i01/i0389.pdf>. DOI: 10.1068/i0389
Wind, J. (1989). Studies in language origins. Amsterdam/Philadelphia: John Benjamins.
DOI: 10.1075/z.los1
Zlatev, J. (2008). The co-evolution of intersubjectivity and bodily mimesis. In J. Zlatev, T.
Racine, C. Sinha & E. Itkonen (Eds.), The shared mind: Perspectives on intersubjectivity
(pp. 215–244). Amsterdam/Philadelphia: John Benjamins. DOI: 10.1075/celcr.12.13zla



Multimodality in conversational humor

Salvatore Attardo, Lucy Pickering, Fofo Lomotey and Shigehito Menjo
Texas A&M University-Commerce

The paper presents the analysis of the humor found in four dyadic conversa-
tions. The results of the conversational data match those of previous studies
(Pickering et al., 2009): no differences were found in volume or speech-rate
between humorous pause units and non-humorous ones. Similarly, pauses
were not found to mark humorous turns. However, the result that punch-lines
showed lower pitch than non-humorous parts of the text was not replicated:
humorous pause units showed no significant differences in pitch from non-hu-
morous ones. Smiling is found to mark humor only in a general sense of “setting
the frame” and is not integrated (i.e., co-extensive) with the humor.

Keywords: humor, conversation, prosody, multimodality, smiling, laughter

1. Introduction

The purpose of this paper is to begin to address the issue of multimodal1 mark-
ers of humor in conversational humor. We present an extension of the results we
found in previous studies that show that speakers do not consistently mark punch
lines prosodically or with pauses. We also present, as a first tentative hypothesis, the
view that smiling and laughter co-occurring with humorous turns may mark the local
discourse as being in a playful, humorous frame.
The field is in clear need of research, as very little has been written about the
prosodic and multimodal markers of humor, with the exception of the markers
of irony (which, for the purposes of this study, we consider to be a case of humor;
the issue is, needless to say, much more complex; see Hidalgo Downing & Iglesias
Recuero, 2009). Moreover, what little research is extant on the subject of markers
of humor and irony is primarily focused and based on laboratory data and does

doi 10.1075/bct.78.12att
© 2015 John Benjamins Publishing Company

not concern itself with discourse data. For a review of the extant literature on markers
of humor, see Pickering et al. (2009) and Attardo, Pickering, and Baker (2011);
for markers of irony, see for example Attardo (2000) and Rockwell (2006).
Thus, for example, in Pickering et al. (2009), the first study to systematically
test various hypotheses about prosodic marking of humor in canned jokes, we
presented evidence showing that untrained speakers did not mark as prosodically
salient the punch lines of jokes. In fact, we found that punch lines, which occur at
the end of narrative texts, were produced with significantly lower pitch, consistent
with their occurrence at the end of a paratone. Moreover, we found that speakers
did not mark a significant number of punch lines with pauses, nor did they con-
sistently use smiling and laughter to do so.2
In a follow-up study (Attardo et al., 2011), we found that humor in a conversational
setting, as opposed to the narrative setting of the jokes studied in
Pickering et al. (2009), did not display the significantly lower pitch, as we would
have expected, since the humor did not occur at the end of a narrative. The study
in Attardo et al. (2011), however, was only a pilot, consisting of the analysis of one
dyadic conversation, making broader generalizations impossible.
The present study broadens the analysis to include four dyadic conversations,
in which speakers were instructed to tell each other a joke and then were left free
to converse for five minutes. The setup allowed for the collection of ten minutes
(both sides of the conversation were recorded) of high quality sound files and
video files of the faces and torso of the speakers, for each conversation. There were
39 instances of humor in the corpus. A detailed description of the setup for the
data collection and of the transcription conventions can be found in Attardo et al.
(2011). Syllables in all caps are stressed. A full discussion of the data is forthcom-
ing. We gratefully acknowledge the help of several students on this project.
For the purposes of this study, we treated smiling and laughter as points on
a continuum of intensity for mirthful displays. This is purely a matter of conve-
nience and we take no position as to whether smiling and laughter are genetically
related. The continuum can be visualized as ranging from a very low intensity
smile, to a high intensity smile, and then transitioning into laughter. We also
assume as given the distinction between canned narrative (prepared, rehearsed)
jokes, which end in a punch line, which is the locus of the humor, and discourse
humor, un-rehearsed, generally non-narrative, which occurs in jab lines. Further
discussion can be found in Attardo (1994, 2001) and Pickering et al. (2009).

Table 1. The four dyads (all names are pseudonyms)

Tamara      Mary
Carmen      Marina
Miranda     Paul
Courtney    Melinda

The organization of the paper is as follows: Section 2 presents a brief
overview of the results concerning the prosodic features analyzed in Pickering
et al. (2009) for the dyadic conversation corpus, i.e., in the context of conversational
humor. Section 3 discusses in more detail the results concerning smiling
and laughter, also in the dyadic conversation corpus. The final section
reviews the significance of these results and addresses some methodological issues.

2. Prosodic markers of conversational humor

The difference between canned jokes (rehearsed narratives with a humorous end-
ing, the punch line) and conversational humor (improvised, often non-narrative)
has been mentioned. Despite these differences, humor, regardless of its position
in a narrative or occurring in an isolated turn of a conversation, is assumed to be
salient (Giora, 2003). Thus, when we decided to examine how speakers marked
the salient passages of narrative humor (canned jokes), we expected that the se-
mantically and pragmatically salient punch line would be marked with the pro-
sodic counterparts of salience (high pitch, high volume, slow speech rate) and
we expected to find the punch lines marked by significant pauses and marked
by smiling and/or laughter. The view of laughter and smiling as reactive to humor
was rejected early on by the pioneering work of Jefferson (1979), which
showed that laughter could precede and accompany the production of humor to
signal its presence and effectively “invite” more laughter.
However, the results of the Pickering et al. (2009) study rejected most of these
expectations. The punch lines of narrative jokes were found to be significantly
lower in pitch than the rest of the text. There was no significant difference in
volume between the punch lines and the rest of the texts. There also was no sig-
nificant difference in speech rate, although speakers delivered punch lines slower
than the rest of the text. There were no significantly longer pauses around punch
lines. Finally, only 60% of punch lines were accompanied by laughter and/or smil-
ing voice.
The results just described naturally bring about the question of whether they
apply to conversational, improvised humor as well. The ongoing study of a small
corpus of dyadic conversations was designed to answer this question.


2.1 Pitch

We found no significant difference between the pitch of the humorous pause
units and the pitch of the non-humorous pause units. The mean of the humorous
pause-based units was 200.86, whereas that of the non-humorous units was 201.32.
The standard deviation for the humor data was 48.98; for the non-humor data, 53.10.
We tested the significance using the non-parametric Mann-Whitney test and
found a two-tailed P value of 0.7489, considered not significant.
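
As a rough illustration of the kind of comparison reported here, the following Python sketch computes a Mann-Whitney U statistic and a two-tailed p value using the normal approximation (with tie correction, without continuity correction). The function name and the sample values are hypothetical; the raw pitch measurements are not given in this paper, and the software actually used for the analysis is not stated, so this should be read as one possible way of running such a test rather than as the authors' procedure.

```python
import math
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Two-tailed Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U, p), where U is the smaller of the two U statistics.
    """
    n1, n2 = len(x), len(y)
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j (1-based).
        midrank = (i + 1 + j) / 2
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return min(u1, u2), 1.0
    z = (u1 - mean_u) / math.sqrt(var_u)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return min(u1, u2), p


# Hypothetical pitch values for a handful of pause units (not the study's data).
humorous = [198.0, 210.5, 187.2, 205.1, 199.9]
non_humorous = [201.3, 195.4, 208.8, 190.0, 203.7, 200.2]
u, p = mann_whitney_u(humorous, non_humorous)
print(f"U = {u:.1f}, two-tailed p = {p:.4f}")
```

A rank-based test of this kind is a natural choice when, as noted later in the paper, the data are not normally distributed. For small samples an exact p value would be preferable to the normal approximation used in this sketch.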

2.2 Volume

We also did not find any significant difference between the volume of the humorous
pause units and the volume of the non-humorous pause units. The mean
volume was 74.77 for the humor data and 74.75 for the non-humor data. The standard
deviation was 6.79 for the humor data and 6.50 for the non-humor data. The
two-tailed P value is 0.9437, considered not significant, using the Mann-Whitney test.

2.3 Speech rate

We measured the speech rate in syllables per second (sps). The mean speech rate was
0.36 sps for serious speech and 0.24 sps for humorous speech. This difference is not
significant (Mann-Whitney two-tailed p = 0.3014).

2.4 Pauses

As discussed in detail in Pickering et al. (2009) and Attardo and Pickering (2011),
the traditional folk-theory of humor predicts that punch lines should be set apart
from the preceding text by noticeable pauses. A pause was operationalized as
a “substantial” pause > .6 seconds (Brown, Currie, & Kenworthy, 1980, p. 56),
since shorter pauses are not generally noticeable and hence would not function as
markers. Significantly, a pause of .6 seconds is roughly double the mean length of
pauses in three out of four conversations, in our data. The fourth conversation has
a mean length of pauses of .7 seconds, but this is clearly an exception, as the two
speakers had some difficulty getting started with the conversation.
Since we used pause-based units, obviously every unit is preceded by a pause.
Therefore we examined whether the prediction that all instances of humor
are preceded by a substantial pause was borne out, using both a strict definition
of punch/jab line and a broader definition, i.e., whether the unit in which
the humor occurs was preceded by a substantial pause.


Our results match those in Pickering et al. (2009): no jab line (including iro-
ny) occurs immediately after a substantial pause (strict definition). 11 out of 39
(roughly 26%) instances of humor occur in a pause-based unit preceded by a
substantial pause (broad definition). These results clearly falsify the folk-theory of
humor’s claim that humor should be preceded by a substantial pause.
The following example shows that even in an instance in which there
is a substantial pause before the pause-based unit, the pause does not mark the jab
line.

233 P 0.3
234 P //a Pep Boys and an Aldi’s GROcery [100] [73] store //
235 0.81
236 M //wh:::ich [((laughs)) that’s very IRISH [245] [80] [((laughs))] //
237 0.32

Note how speaker M begins his turn with an elongated vowel, a bout of laughter,
followed by a sentence, in which the direct object is the jab line. Thus, in a strict
definition of jab line, the pause should have been between “that’s very” and “Irish,”
whereas there is no audible pause at all. It is notable how laughter occurs before
and after the sentence in which the jab line occurs. The presence of laughter is not
unusual, as it is found in about 50% of the instances of humor.
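
The broad definition used in this section (is the pause-based unit preceded by a pause longer than .6 seconds?) can be sketched in a few lines of Python. The data layout, the function name and the use of transcript line numbers as unit identifiers are assumptions made for illustration; only the .6 second threshold and the pause durations of the excerpt above come from the text.

```python
SUBSTANTIAL_PAUSE_S = 0.6  # threshold for a "substantial" pause, as in the text


def units_after_substantial_pause(transcript, threshold=SUBSTANTIAL_PAUSE_S):
    """Return the IDs of units immediately preceded by a pause > threshold.

    `transcript` is an ordered list of ("pause", seconds) and
    ("unit", unit_id) items (a hypothetical layout, chosen for this sketch).
    """
    flagged = []
    last_pause = None
    for kind, value in transcript:
        if kind == "pause":
            last_pause = value
        elif kind == "unit":
            if last_pause is not None and last_pause > threshold:
                flagged.append(value)
            last_pause = None  # a pause only counts for the unit right after it
    return flagged


# Pause durations and line numbers taken from the excerpt above.
excerpt = [
    ("pause", 0.3),
    ("unit", 234),
    ("pause", 0.81),
    ("unit", 236),
    ("pause", 0.32),
]
print(units_after_substantial_pause(excerpt))  # [236]
```

Such a check only captures the broad definition. Under the strict definition the pause would have to fall inside unit 236, immediately before the jab line itself, which requires word-level timing rather than unit-level pauses.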

2.5 General discussion

Table 2 summarizes the results for conversational humor, contrasted with the
Pickering et al. (2009) study on canned jokes.
A fuller discussion of these data would take too much space, in this context,
so we will limit ourselves to explaining the only difference between the two sets
of data, namely the absence of a significantly lower pitch for the conversational
humor, as compared with the canned humor.

Table 2. A comparison of the differences between canned jokes and conversational humor

              Canned jokes                          Conversational humor
Pitch         Significantly lower in punch lines    No significant difference
Volume        No significant difference             No significant difference
Speech rate   No significant difference             No significant difference
Pauses        No significant difference             No significant difference


As we argued in Attardo et al. (2011), the difference is not surprising: by
definition, the punch line of a joke occurs at the end of a narrative. A narrative is
prototypically delivered as one extended turn in conversation, and indeed in the
experimental setting of Pickering et al. (2009), the speakers were not interrupted
while performing the jokes. Hence, prosodically, the speakers will tend to produce
a spoken paragraph, or paratone (Hirst & Di Cristo, 1998, p. 71; Wennerstrom,
2001, p. 130). The definition of paratones is a complex issue, which, however, need
not concern us in this context. The relevant issue is that paratones are marked by
extra-low pitch at the end. As we argued in Pickering et al. (2009), the extra-low
pitch caused by the position at the end of a narrative, and hence of a paratone, is
the likely explanation for the results showing that punch lines have significantly
lower pitch. Naturally, then, when other instances of humor occur in a text, since
they do not occur at the end of a paratone, there is no reason for them to be
marked by extra-low pitch.
We should conclude this short discussion of the results on the markers of hu-
mor in conversation, with a discussion of the results in Archakis, Giakoumelou,
Papazachariou, and Tsakona (2010). Archakis et al. (2010) analyzed a corpus of
three conversations, each lasting about one hour, among 6 speakers (girls aged
15–17), who were friends. Their corpus included 170 jab lines.
Archakis et al. (2010) did not report pitch data, so no comparison is possible.
They report a significant difference in speech rate: the jab lines were produced at
161 ms/syllable (6.2 sps), while the preceding turns were produced at 136 ms/syllable
(7.3 sps). The slower speech rate for the humorous passages is consistent
with our data, except that our differences are not significant.
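
The two ways of expressing speech rate used above (milliseconds per syllable and syllables per second) are reciprocals of each other. The short Python sketch below shows the conversion for the two values cited from Archakis et al. (2010); the function name is hypothetical.

```python
def ms_per_syllable_to_sps(ms_per_syllable):
    """Convert a syllable duration in milliseconds to syllables per second."""
    return 1000.0 / ms_per_syllable


for label, ms in [("jab lines", 161), ("preceding turns", 136)]:
    print(f"{label}: {ms} ms/syllable = {ms_per_syllable_to_sps(ms):.2f} sps")
```

The first value rounds to the 6.2 sps given in the text; the second comes out at about 7.35 sps, which the text gives as 7.3. A longer syllable duration corresponds to a lower rate, so the jab lines are the slower of the two.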
Table 3 reports a comparison between the data reported in Archakis et al.
(2010; row 1) and the data in Pickering et al. (2009; row 2) and in the current
study (row 3). The differences are reported to be significant in Archakis et al.
(2010) but not in the other two studies.

Table 3. Volume measurements for the Greek and American data

Values in dB                 Humor    Non-humor
Greek                        70.75    67.64
American (canned)            73.99    73.20
American (conversational)    74.78    74.75

Finally, Archakis et al. (2010) report significant differences in the placement
of pauses, which would contradict our findings (see Table 4).

Table 4. Placement of pauses in the Archakis et al. (2010) data

            Before   After   Before and after   None   Total
Serious     15       25      16                 44     100
Jab lines   25       37      104                4      170
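
One way to check whether counts such as those in Table 4 differ between serious phrases and jab lines is a Pearson chi-square test of independence. The Python sketch below computes the statistic from the Table 4 counts. The choice of test is an assumption made for illustration: which test Archakis et al. (2010) actually used is not stated here, and the function and variable names are hypothetical.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, dof


# Counts from Table 4; columns: before, after, before and after, none.
pause_counts = [
    [15, 25, 16, 44],   # serious intonation phrases (total 100)
    [25, 37, 104, 4],   # jab lines (total 170)
]
chi2, dof = chi_square_independence(pause_counts)
CRITICAL_05_DF3 = 7.815  # standard chi-square critical value, alpha = .05, df = 3
print(f"chi2 = {chi2:.2f}, df = {dof}, above critical value: {chi2 > CRITICAL_05_DF3}")
```

A statistic above the critical value indicates only that pause placement and phrase type are associated in these counts. It says nothing about how the serious phrases were sampled or how pauses and units were defined.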
However, we find some potential methodological problems in Archakis et al.
(2010):

– They define pauses as interruptions of the flow of speech longer than .3 seconds;
for the reasons discussed in Pickering et al. (2009), we used substantial
pauses (> .6 s).
– They use “intonation phrases”, whose boundaries are marked by, among other things,
pauses (Archakis et al., 2010, p. 195). Hence intonation phrases have a higher
probability of being preceded and/or followed by pauses than other
non-pause-based units.
– They selected “semi-randomly” 100 serious intonation phrases from narratives;
this may have reduced the probability that they be framed by pauses,
unlike jab lines. The only way of avoiding potential bias is either to utilize all
non-humorous turns or to select a sample randomly.

Because of these potential methodological issues, we cannot at this point discuss
these differences in the results. There is furthermore a general issue: Archakis et
al. (2010) use in most cases parametric statistics. However, our data were not
normally distributed, and Archakis et al. (2010) do not discuss
the distribution of their data. Another general issue
might be that the participants in the Greek study were younger, all female, and
close friends, unlike the American subjects, who were older (college age), of mixed
gender, and not as closely connected. We are planning to further investigate these
differences in our results.
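When normality cannot be assumed, a rank-based comparison is one common alternative to a parametric test. The sketch below computes the Mann-Whitney U statistic by direct pair counting. It is an assumption that this test suits the present design, and the text above does not state which non-parametric test was used; the function name and the decibel values are invented for illustration.

```python
def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: pairs (a, b) with a > b, ties counted as 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Invented volume values in dB, for illustration only (not data from any study).
humor = [74.1, 75.3, 73.8, 76.0]
non_humor = [73.9, 74.6, 73.2, 75.1]

u_humor = mann_whitney_u(humor, non_humor)
u_non_humor = mann_whitney_u(non_humor, humor)
print(u_humor, u_non_humor)
```

The two U values always sum to the number of pairs, and the statistic depends only on the ordering of the observations, not on their distribution.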

3. Smiling and laughter as markers of the humorous frame

Let us note that, even when looking at only four conversations, the conversations
differ in interesting ways, at a very direct impressionistic level. Some of
the conversations are very easygoing and the participants are clearly enjoying
themselves, whereas in others they are much less relaxed. This is reflected in the
quantity of smiling and laughter. For example, Tamara and Mary smile and laugh
almost continuously throughout the conversation, while Carmen and Marina
smile and laugh a lot less. In this context, we will not try to characterize these
differences and instead we will focus on the intensity of smiles and their use in
back channels.

Screwed up easy mac and stuff like THAT you know

Figure 1. Mary and Tamara’s facial expressions during the jab line occurring on line 215
of the transcription “I’ve screwed up Easy Mac and stuff like that, you know?”

3.1 Intensity of smiling

Intensity of smiling was assessed impressionistically, essentially on a binary scale
(reduced smile/full smile). Other far more sophisticated approaches exist (such
as Harker & Keltner, 2001, which uses a five-point scale of intensity of the activation
of two action units in Ekman’s FACS (AU 12 and 6, corresponding to the
zygomatic major and the orbicularis oculi; see e.g., Ekman, Davidson, & Friesen,
1990)). An analysis of our data using FACS is being planned. Despite the bluntness
of our instrument, some interesting interpretations emerge.
The context of the exchange is Mary’s self-deprecating sequence about her
inability to cook, which has caused a fire in her kitchen. Mary is the speaker (top
row of images) and Tamara the hearer. The jab line is “Easy Mac” (strict defini-
tion; on strict vs. broad definition of jab and punch lines, see Attardo et al., 2011)
and occurs in the second image from the left (placement of facial expressions in
the transcription is necessarily approximate). Note that Mary signals through a
low intensity smile the humorous frame, until she breaks out into a full smile, well
after the jab line. Note also that Tamara listens with a low intensity smile match-
ing Mary’s and, likewise, breaks into a full smile well after the jab line. The best
way to describe this interplay is that Mary tentatively offers the statement that
she’s “screwed up Easy Mac” as a humorous turn, while Tamara waits to assess
where the story is going. Once Tamara is satisfied that the anecdote has ended
(the tag question “you know?”) and that it was meant humorously (Mary’s full
smile in the fourth image from the left), she accepts the humorous framing and
joins in with a full smile. Both speakers then laugh.

Clearly, the high intensity smiles do not mark the punch line (in its strict
definition) in the sense of accompanying it or of announcing it. The function of
smiling and laughter seems to be that of providing clues that lead to the framing
of segments of the exchange as humorous.
It should be noted that some speakers smile and laugh also in turns where no
humor takes place, most notably Tamara and Mary. Thus obviously smiling is not
a clear-cut marker of humorous intent. Smiles can have many different meanings
(Ekman, 1985). Conversely, there are instances of humor not accompanied by
smiling (for example, Carmen’s ironical turns).
The occurrence of laughter when no humor occurs can possibly be explained
by the framing of a multi-turn conversational stretch as humorous. Once the
speakers have agreed that the current exchange is in the humor frame, subsequent
utterances that have no humorous value in and of themselves are “interpreted” as
humorous (i.e., the speakers react with laughter). The presence of laughter is itself
an ostensive marker of humor (the speakers would be “reasoning” along the
lines of “we are laughing, we must be being funny”). Significantly, these laughing
stretches do not continue indefinitely but eventually decay: the speakers return to
neutral expressions or low intensity smiles fairly rapidly.

3.2 Smiling as back channel

From the exchanges in our data exemplified in Figure 1, it is fairly clear that speak-
ers use reduced (low intensity) smiles to back channel and to “test” the humorous
intention (see stills 1–3 of Figure 1), whereas they break out in full smiles (high
intensity) either to spontaneously show enjoyment (often followed by laughter) or
to signify uptake of the humor.
Smiling has been likened to back channel (Brunner, 1979). Obviously, not all
smiles are back channel devices and, regardless of that fact, it remains to be deter-
mined what message is being sent through the back channel. Traditionally, back
channel messages signify that the attention of the hearer is engaged and provide
clues to the uptake of the message in the predominant channel. We wish to put
forward the hypothesis that back channel smiling during humorous turns has a
richer, more specific value.
Specifically, we hypothesize that back channel smiling indicates agreement
with the humor. “Agreement” is here used as a technical term, introduced in Hay
(2001). Hay (2001, p. 67) distinguished four levels of uptake of a humorous in-
stance in discourse: recognition, understanding, appreciation, and agreement.
The first three levels are self-explanatory: the hearer must become aware of the intention
of the speaker to be funny (there are exceptions, which need not concern
us in this context), must understand the humor, and must appreciate it (i.e., enjoy
it). The fourth is more complex: agreement implies “agreement with the message,
including any attitudes, presuppositions or implicatures contained in the humor.”
(Hay, 2001, p. 72). Significantly, the levels are the object of a scalar implicature:
agreement implies appreciation, which in turn implies understanding, and the
latter implies recognition. For the record, we simplify Hay’s sophisticated discus-
sion, but not so much as to distort her argument.
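The entailment chain among the four levels can be made explicit in a small sketch. The names `Uptake` and `implied_levels` are ours and purely illustrative; the ordering is the one described above, in which each level implies all the weaker ones.

```python
from enum import IntEnum


class Uptake(IntEnum):
    """Hay's (2001) four levels of humor uptake, ordered from weakest to strongest."""
    RECOGNITION = 1
    UNDERSTANDING = 2
    APPRECIATION = 3
    AGREEMENT = 4


def implied_levels(level):
    """Return the given level together with every weaker level it entails."""
    return [lower for lower in Uptake if lower <= level]


print([lvl.name for lvl in implied_levels(Uptake.APPRECIATION)])
```

On this reading, a hearer who signals agreement thereby also signals recognition, understanding, and appreciation, whereas appreciation leaves agreement open.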
Our hypothesis, then, is that by smiling during and after the production of
humor, the hearer signals not only their attention to the message, but that they
“get it” and moreover that they agree with the speaker, in the sense above. We wish
to stress that we are not arguing that the above is the only meaning of smiling in
conversation. On the contrary, we argue that it is one of many meanings it may
convey. Our work should be seen as exploratory in nature, as it examines discursive
exchanges and seeks to avoid some methodological problems, such as cherry
picking of the data, which plague other approaches, as we discuss below.

4. Methodological issues

4.1 Cherry picking of data

This is a very complex issue, which we cannot hope to tackle in all its complex-
ity. Let us just note two general points: (1) a sufficiently large corpus will yield
enough examples to illustrate just about any theory. A study should analyze either
all the data or a representative sample thereof. (2) Investigating the occurrence of
a feature and its correlation to another is not sufficient. For example, we know that
smiling and laughter occur also when no humor is present. Therefore, the mere
occurrence of smiling and/or laughter cannot be said to mark a discourse passage
as humorous. Even if every humorous instance in one’s corpus co-occurred with smiling/laughter,
the occurrence of smiling and/or laughter outside of humorous turns needs to
be explained and/or discriminated somehow from smiling/laughter in humorous
turns, if one wants to speak of “marking.”
There have been attempts to circumvent this problem by coining the tauto-
logical term “laughable” (a laughable is a stretch of discourse in which laughter
occurs; Glenn, 2003) but clearly the concept has no explanatory power at all (see
Attardo, 2005).

4.2 Markers of humor

We have seen that, while it is possible to find examples of humor accompanied by
prosodic features associated with saliency, and it is possible to find several examples
of humor preceded by significant pauses, and while it is also possible to find
many examples of humor followed and/or preceded by laughter, we cannot claim
that any of these features, with the exception of smiling, is a marker of humor.
This is an important methodological issue. When can we say that any
prosodic feature is a reliable marker of any particular discourse feature? The question
is best answered by considering a good example of a reliable marker. The use
of a low pitch has been shown to indicate asides or parenthetical remarks within
spoken instructional discourse, also termed a paradiscourse subtext (Coulthard
& Montgomery, 1981; Pickering, 1999). Speakers routinely signal these utterances
as commentary on the discourse with the use of a marked pitch change followed
by a return to a high or mid pitch level.
In this sense, none of the prosodic features we have analyzed can be said to
be reliable markers of humor, both because they do not co-occur reliably with the
jab lines or punch lines, and because when considered in the context of the entire
discourse they are not marked. This relates to the cherry picking of examples is-
sue, discussed above (4.1). Moreover, there is some evidence that markers, such
as the parenthetical intonation, co-occur closely with their referent (the part of
discourse marked). Bavelas and Chovil (2000) call this feature “integration” and
describe it as “simultaneous audible and visible elements of a message” (Bavelas
& Chovil, 2000, p. 184). As can be seen clearly in Figure 1, the punch line and the
full smile are not integrated.
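One way to operationalize “integration” is as temporal overlap between the marked stretch of speech and the visible display. This operationalization is our assumption, not a procedure given by Bavelas and Chovil (2000), and the timings below are invented rather than measured from Figure 1; `overlap_seconds` is a hypothetical helper.

```python
def overlap_seconds(interval_a, interval_b):
    """Length of the temporal overlap between two (start, end) intervals, in seconds."""
    start = max(interval_a[0], interval_b[0])
    end = min(interval_a[1], interval_b[1])
    return max(0.0, end - start)


# Invented timings in seconds, for illustration only (not measured from Figure 1).
jab_line = (2.0, 2.6)
full_smile = (3.4, 4.5)

# Treat the two as integrated only if they overlap in time at all.
integrated = overlap_seconds(jab_line, full_smile) > 0
print(integrated)
```

Under this sketch, a full smile that begins well after the jab line has ended shows no overlap and so would not count as integrated with it.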
If smiles and full smiles are not symbolic (in the sense of conveying a mean-
ing), what Bavelas and Chovil (2000) would call an “act of meaning,” they could
nonetheless “mean,” in a weaker, ostensive (Wittgenstein, 1953), natural (Grice,
1957) sense. This is the approach taken by Owren and Bachorowski (2003), for example,
who claim that laughter provides insight into the speaker’s emotional state. It
is clear that from some perspectives this approach is correct: indeed laughter and
smiling, when spontaneous, reveal the emotional state or frame of mind of the
speaker. This information may then be used by the hearer/observer to assess the
discursive situation, in a fairly broad sense: obviously if the speaker just laughed
during the production of a turn, assuming that the laughter is genuine, the hearer
can infer that that turn was meant humorously and that hence the speaker was in
a play frame. However, speakers can revert to the serious mode without transition
in the next turn or even within the same turn (e.g., Rees & Monrouxe, 2010). So
the ostensive information would not extend beyond the past and current turns.

However, we feel that this approach underestimates the meta-discursive nature
of smiling and laughter. We wish to suggest an exploratory hypothesis: we
found that very often, although not always, a manifestation on the smile-laughter
continuum was used to “frame” a segment of the discourse as humorous. Typically,
for narrative jokes, we found that the teller started the narrative with a clear
smile, matched by the hearer (we make no firm claims as to the sequential nature
of the smiles at this time; such analysis requires more sophisticated instrumental
analysis). The setup phase of the joke was interspersed with low intensity smiles,
likewise matched by the hearer. Finally, after delivering the punch line, the speaker
smiled, usually with a fuller smile, and/or laughed. The
hearer also matched these displays. In non-narrative humor (jab lines), the jab
line or irony was almost always accompanied by smiling and/or laughter by both
interactants.
It should be stressed that we are not simply saying that the presence of smil-
ing and/or laughter frames the entire exchange as humorous. On the contrary,
we interpret these findings as being in accordance with Eisterhold, Attardo, and
Boxer’s (2006) approach, which assumes that speakers negotiate with each other
when they operate in a humorous mode (or frame) and return to a default serious
mode when they no longer signal their humorous intention. Independent support
for this approach comes from research by Tsakona (2011), who found that
the ironical turns in a Greek parliament discussion did not exceed three utterances
and that only two instances were cases of mode adoption (i.e., the hearer continuing
the speaker’s irony), and by Ruiz-Gurillo (2009), who found that 48.5% of her corpus
of ironical exchanges consisted of single turns and 68.5% consisted of three or
fewer turns.
In other words, the speakers in our data use smiles and laughter to mark the
immediate context as belonging to the humorous non-bona-fide mode of com-
munication. When they stop doing so, they revert to the serious mode. By using
low intensity smiles, speakers and hearers can negotiate extended turns or multi-
turn passages of playful framing. Whether these conclusions can be extended out-
side our data is an empirical question.
Our tentative conclusion is that the prosodic features we investigated and
smiling/laughter are not markers, in the technical sense seen above, both because
they are not consistently associated with the phenomenon and because they lack
“integration.” Smiling/laughter is a local framing device for short stretches of dis-
course, serving the function of negotiating between the speakers the switch to the
joking mode. It is quite possible that noticeable pauses, or notably low/high pitch,
volume, or speech rate occasionally draw attention to a punch line or jab line (and
particularly irony).

We are deliberately avoiding, at this point, the issue of what units of discourse
get “framed.” There are different approaches, ranging from framing as humorous
entire conversations, to single turns. We might label the former “global” and the
latter “local.” Clearly, the hypothesis outlined in this paper is that the framing of
discourse as humorous (play frame) is local. Further research will need to exam-
ine this hypothesis.

Notes

1. The term “multimodality” refers to multiple modes of communication being used concur-
rently. Thus speech uses the auditory channel and smiling (primarily) the visual channel. In this
paper we only consider a few aspects of multimodality in what is also known as “paralanguage”
(Trager, 1958). Multimodality can include gestures, the posture of the body during speech, the
way the speakers locate themselves in relation to each other, their clothing, etc. Clearly, only
a few features can be analyzed at a time. In this article we focus on prosody and smiling and
laughter.

2. One of the anonymous referees asks us to speculate as to why speakers do not mark punch
lines. Our best guess, at this time, would be that they do not feel the need to do so, insofar as
their hearers can be relied upon to recognize the humorous intention.

References

Archakis, A., Giakoumelou, M., Papazachariou, D., & Tsakona, V. (2010). The prosodic framing
of humour in conversational narratives: Evidence from Greek data. Journal of Greek Lin-
guistics, 10, 187–212. DOI: 10.1163/156658410X531375
Attardo, S. (1994). Linguistic theories of humor. Berlin: Mouton de Gruyter.
Attardo, S. (2000). Irony markers and functions: Towards a goal-oriented theory of irony and
its processing. RASK, 12, 3–20.
Attardo, S. (2001). Humorous texts. Berlin: Mouton de Gruyter. DOI: 10.1515/9783110887969
Attardo, S. (2005). Review of Glenn (2003). HUMOR: International Journal of Humor Research,
18(4), 422–429.
Attardo, S., & Pickering, L. (2011). Timing in the performance of jokes. HUMOR: International
Journal of Humor Research, 24(2), 233–250.
Attardo, S., Pickering, L., & Baker, A. (2011). Prosodic and multimodal markers of humor in
conversation. Pragmatics and Cognition, 19(2), 224–247. DOI: 10.1075/pc.19.2.03att
Bavelas, J.B., & Chovil, N. (2000). Visible acts of meaning. An integrated message model of lan-
guage use in face-to-face dialogue. Journal of Language and Social Psychology, 19, 163–194.
DOI: 10.1177/0261927X00019002001
Brown, G., Currie, K.L., & Kenworthy, J. (1980). Questions of intonation. Baltimore, MD: Uni-
versity Park Press.
Brunner, L.J. (1979). Smiles can be back channels. Journal of Personality and Social Psychology,
37(5), 728–734. DOI: 10.1037/0022-3514.37.5.728

Coulthard, M., & Montgomery, M. (Eds.). (1981). Studies in discourse analysis. London:
Thomas Litho Press.
Eisterhold, J., Attardo, S., & Boxer, D. (2006). Reactions to irony in discourse: evidence for the
least disruption principle. Journal of Pragmatics, 38(8), 1239–1256.
DOI: 10.1016/j.pragma.2004.12.003
Ekman, P. (1985). Telling lies: Clues to deceit in the marketplace, politics, and marriage. New
York, NY: Norton.
Ekman, P., Davidson, R.J., & Friesen, W.V. (1990). Duchenne’s smile: Emotional expression and
brain physiology II. Journal of Personality and Social Psychology, 58, 342–353.
DOI: 10.1037/0022-3514.58.2.342
Giora, R. (2003). On our mind: Salience, context and figurative language. Oxford: Oxford Uni-
versity Press. DOI: 10.1093/acprof:oso/9780195136166.001.0001
Glenn, P. (2003). Laughter in interaction. Cambridge: Cambridge University Press.
DOI: 10.1017/CBO9780511519888
Grice, P.H. (1957). Meaning. The Philosophical Review, 66(3), 377–388. DOI: 10.2307/2182440
Harker, L., & Keltner, D. (2001). Expressions of positive emotion in women’s college yearbook
pictures and their relationship to personality and life outcomes across adulthood. Journal
of Personality and Social Psychology, 80(1), 112–124. DOI: 10.1037/0022-3514.80.1.112
Hay, J. (2001). The pragmatics of humor support. HUMOR: International Journal of Humor
Research, 14(1), 55–82. DOI: 10.1515/humr.14.1.55
Hidalgo Downing, R., & Iglesias Recuero, S. (2009). Humor e ironía: una relación compleja. In
L. Ruiz Gurillo & X.A. Padilla García (Eds.), Dime cómo ironizas y te diré quién eres: Una
aproximación pragmática a la ironía (pp. 423–455). Frankfurt: Peter Lang.
Hirst, D., & Di Cristo, A. (1998). Intonation systems: A survey of twenty languages. Cambridge:
Cambridge University Press.
Jefferson, G. (1979). A technique for inviting laughter and its subsequent acceptance declina-
tion. In G. Psathas (Ed.), Everyday language: Studies in Ethnomethodology (pp. 79–96).
New York: Irvington.
Owren, M.J., & Bachorowski, J.A. (2003). Reconsidering the evolution of non-linguistic communication:
the case of laughter. Journal of Nonverbal Behavior, 27, 183–200.
DOI: 10.1023/A:1025394015198
Pickering, L. (1999). An analysis of prosodic systems in the classroom discourse of native speaker
and nonnative speaker teaching assistants. Unpublished dissertation. University of Florida.
Pickering, L., Corduas, M., Eisterhold, J., Seifried, B., Eggleston, A., & Attardo, S. (2009). Pro-
sodic markers of saliency in humorous narratives. Discourse Processes, 46, 517–540.
DOI: 10.1080/01638530902959604
Rees, C.E., & Monrouxe, L.V. (2010). I should be lucky ha ha ha ha: The construction of power,
identity and gender through laughter within medical workplace learning encounters. Jour-
nal of Pragmatics, 42, 3384–3399. DOI: 10.1016/j.pragma.2010.05.004
Rockwell, P.A. (2006). Sarcasm and other mixed messages: The ambiguous ways people use language.
Lewiston, NY/Queenston/Lampeter: Edwin Mellen.
Ruiz-Gurillo, L. (2009). ¿Cómo se gestiona la ironía en la conversación? RILCE, 23(2), 363–377.
Trager, G.L. (1958). Paralanguage: A first approximation. Studies in Linguistics, 13, 1–12.
Tsakona, V. (2011). Irony beyond criticism: Evidence from Greek parliamentary discourse.
Pragmatics and Society, 2(1), 57–86. DOI: 10.1075/ps.2.1.04tsa
Wennerstrom, A. (2001). The music of everyday speech. Oxford: Oxford University Press.
Wittgenstein, L. (1953). Philosophical investigations. New York: The Macmillan Company.


Image schemas and mimetic schemas
in cognitive linguistics and gesture studies

Alan Cienki
Vrije Universiteit (VU) / Free University

Image schemas have been a fundamental construct in cognitive linguistics, providing
grounds for psychological, philosophical, as well as linguistic research.
Given the focus in cognitive linguistics on embodied experience as a fundamen-
tal basis for language structure and meaning, the employment of image schemas
in the analysis of gesture with speech is a logical extension. However, given their
level of abstraction, to what degree do image schemas provide a useful explan-
atory tool for researching the concrete, physically embodied details of gestures?
This article considers the answer to this question and then turns to a more
recent theoretical development that complements the picture by encompassing
a different realm of cognitive and linguistic phenomena. This research, on ‘mi-
metic schemas’, is shown to have great potential for thinking about some known
phenomena of gesture in a new way. Schema research on these different levels
thus provides a useful means to analyze behavior in another modality involved
in spoken language use, namely the visual.

Keywords: image schema, mimetic schema, gesture, metaphor

1. Introduction

The idea of schemas is certainly not novel to the field of cognitive linguistics.
While research on schemas has its origins in various disciplines, the work which
was among the most influential in the early years of cognitive linguistics (as noted
in Lakoff, 1987) came from cognitive psychology (originating most notably with
Rumelhart’s [1980] schema theory), computer science (dating back to Minsky’s
[1975] frames with defaults and Schank and Abelson’s [1977] scripts), and, of
course, philosophy: Johnson (1987) discusses Kant ([1781] 1968) as the particular
inspiration for his idea of ‘image schemas’, and he notes (Johnson, 2005) the
predecessor of this notion in the works of James (1890), Dewey ([1925] 1958),
and Merleau-Ponty ([1945] 1962).

doi 10.1075/bct.78.13cie
© 2015 John Benjamins Publishing Company

Within cognitive linguistics, Langacker (1987, p. 132) provides the follow-
ing definition: “The notion of schematicity pertains to levels of specificity, i.e. the
fineness of detail with which something is characterized; … A schema is thus
abstract relative to its … elaborations in the sense of providing less information
and being compatible with a broader range of options…”. He concludes that the
fact that we are able to conceptualize situations at different levels of specificity,
and express some characterizations of these different levels linguistically, means
schematicity has huge implications for how language is structured and used: “The
linguistic significance of this ability is hard to overstate” (Langacker, 1987, p. 135).
Various categories of schemas have been proposed in cognitive linguistics
within different frameworks of analysis. These include the construct of schema as
it is used in cognitive grammar – “an abstract characterization that is fully com-
patible with all the members of the category it defines” (Langacker, 1987, p. 371)
and as discussed in terms of syntactic constructional schemas (Goldberg, 1995;
Tomasello, 1992). One of the most influential uses of the term has been in the
work on image schemas (Hampe, 2005; Johnson, 1987; Lakoff, 1987). Another
proposal, that of mimetic schemas (Zlatev, 2005, and elsewhere), is a more recent
addition to the field.
The latter two particular notions will be the focus of this article, as they are
two types which have particular significance for research in a different but related
field – the study of spontaneous gesture with speech. While scholarship on ges-
ture dates back at least to the time of the Roman orators (Kendon, 2004, Ch. 3),
research on it from a linguistic point of view was suppressed in the Anglo-Amer-
ican tradition by the dominance of a modular view of language, promulgated
within generative theories of linguistics, which entailed the, since much-lament-
ed, separation in the scholarship of the study of the mind from that of the body.
Work on gesture from a more psychological perspective only took hold among
(cognitive) linguists in the 1990s and thereafter upon the publication of McNeill
(1992) and the development of this work by psycholinguists.
Now we see an enthusiastic in-corpor-ation of gesture research into cognitive
linguistics, evidenced by its inclusion as a topic in cognitive linguistic conferences
and journals. This move, however, raises some interesting questions for concepts
that have been developed for cognitive linguistics. How do they apply to a broad-
er notion of language, beyond that found in written and spoken words? The as-
sumption has been that the embodied approach of cognitive linguistics should
be readily amenable to co-speech embodied behavior. The research on schemas
provides a good case study to ascertain the relevance of, and possible problems

EBSCOhost - printed on 2/10/2023 2:18 AM via . All use subject to https://www.ebsco.com/terms-of-use


Image and mimetic schemas in cognitive linguistics and gesture studies 197

with, constructs from cognitive linguistics as applied to the visuo-motoric modal-


ity of gesture. In addition, given the tenet in cognitive linguistics that embodied
patterns, such as the ones represented by image schemas, can play an important
role in our metaphoric understanding of abstract concepts, some attention will
be given to research on image schemas as source domains, an idea which dates
back to both Johnson’s (1987) and Lakoff’s (1987) work, with precursors found in
Lakoff and Johnson (1980).
As is clear from the starting point of the paper, we are concerned here mainly
with spoken language; signed language, given its nature as a visuo-spatial medi-
um of communication, involves a rather different set of questions in relation to
gesture, which go beyond the scope of this article (see, for example, Liddell, 2003).
More specifically, the scope of the claims being made here is that of certain Ger-
manic and Romance languages since the published research on the topic of this
article has mainly drawn upon examples from English, but some also concerns
French, German, Spanish, and Swedish. Given the close relation of these West
European Indo-European languages to each other, many questions remain to be
explored as to how the concepts discussed below pertain to languages that are
typologically different, particularly with regard to their spatial reference systems
(see, e.g., Levinson, 2003).

2. Image schemas

2.1 Image schemas in cognitive linguistics

Though having implicit origins in Lakoff & Johnson (1980), as discussed below,
the construct of image schemas was explicitly proposed in cognitive linguistics
in the same year by Johnson (1987) and Lakoff (1987). It was most notably char-
acterized as follows: “An image schema is a recurring, dynamic pattern of our
perceptual interactions and motor programs that gives coherence and structure
to our experience” (Johnson, 1987, p. xiv). Some examples include path¹ and
related notions such as cycle; container and patterns related to containers and
areas, such as full-empty and center-periphery; and force relations such as
compulsion, attraction, and enablement. Johnson (1987, p. 126) provides a
list of some 27, including those listed above, which he counts as among the most
important, but it is not meant to be an exhaustive list, and others have been pro-
posed in subsequent research, such as straight (Cienki, 1998a) and self-mo-
tion and animate motion for animals and caused motion and inanimate
motion for artifacts (Mandler, 1992).

As Oakley (2007, p. 214) sums it up, the basic notion grew as an instrumental
part of the epistemology and moral philosophy that Johnson developed as well as
part of Lakoff’s articulation of a theory of categorization. Lakoff provides two fundamental
questions which gave rise to (among other ideas) the theorizing about
image schemas: “What kind of preconceptual structure is there to our experience
that could give rise to conceptual structure?” and “How can abstract concepts and
abstract reason be based on bodily experience?” (Lakoff, 1987, p. 267). Thus the
notion of image schemas arose within the establishment of a new philosophical
basis for a particular strand of cognitive science, one which led to Lakoff and
Johnson’s (1999) treatise on “the embodied mind and its challenge to Western
thought.”
Various attempts have been made to refine or qualify the notion of what im-
age schemas are, or the scope of what they encompass, including accounting for
their static and dynamic qualities, and the differing levels at which their schema-
ticity is relevant (Cienki, 1997; Quinn, 1991). One of the most important of these
sets of distinctions has been Grady’s differentiation between patterns which are
claimed to constitute “mental representations of fundamental units of sensory ex-
perience” (Grady, 2005, p. 44, emphasis in original) and fundamental units which
“relate to our interpretations of and responses to the world, our assessments of the
physical situations we encounter, their nature and their meaning” (Grady, 2005,
p. 47). Grady proposes limiting the former under the term ‘image schemas’ and
acknowledging the different status of the latter under a separate term; he proposes
‘response schemas’ (Grady, 2005, p. 46). While his category of image schemas in-
cludes, for example, center-periphery, container, and balance, others such
as cycle, scale, and process fall under his response schemas.
Within linguistics, image schemas have proven to be a useful notion in theo-
ries of grammar (Langacker, 1987), in psycholinguistic research (Gibbs & Colston,
1995), and most of all in numerous applied linguistic studies, particularly those
accounting for the polysemy of individual or related words or constructions and
semantic change (see Oakley, 2007, pp. 219–223 for an overview). The relevance of
image schemas in these linguistic analyses has been taken by many cognitive lin-
guists as supporting evidence for the reality of image schemas in some way on the
cognitive and experiential levels, even if the status being claimed for that reality is
still acknowledged to be complex and subject to revision (Gibbs, 2005).
In sum we can say that the thinking about image schemas developed primar-
ily from philosophical reasoning within a subgroup of cognitive science oriented
toward the analysis of conceptual structure, and its successful uptake in linguistic
analyses bolstered (a) the credence in image schemas as a cognitive construct of
language users and (b) the attractiveness of image schemas as an explanatory tool
which linguists could use, particularly for semantic analyses. Interestingly, despite
the claims about the embodied basis of image schemas, the theorizing about them
originated in the 1980s without reference to gesture. One major reason
could be that gesture studies only gained wide reception in cognitive psycholo-
gy, and later in cognitive linguistics, after the publication of McNeill’s Hand and
Mind in 1992. However, another important factor is the research methodology
used: much of the fundamental work on image schemas was based on intuitive
analysis of linguistic examples – phrases which were often invented by the authors
as plausible, but which were not drawn from corpora of actual usage.
If we consider talk as used in context by hearing, seeing co-participants, the
canonical encounter (Clark, 1973) of human communication is face-to-face inter-
action. This has led Kendon (1980, 2004) to approach gesture and speech as two
aspects of the process of utterance and McNeill (1985, 1992) to claim that gesture
and language are one system. Numerous heirs of their research tradition in ges-
ture studies now consider gesture not as nonverbal “body language” but rather as
co-verbal behavior. In light of the important role that image schemas have played
in cognitive linguistic theory and analysis, to what degree do image schemas pro-
vide a useful explanatory tool for researching the concrete, physically embodied
details of gestures?

2.2 Image schemas in gesture studies

While there is a solid tradition of experimental work on gesture, especially in
the field of cognitive psychology, the bulk of the research on schemas in gesture
comes from the tradition of observational (micro)analysis and interpretation.
These kinds of studies are fundamental to the field and are needed to lay the
groundwork, to ascertain the scope of the phenomena of natural gestural behav-
ior. Perhaps not surprisingly, some of the same image-schematic patterns that
Johnson argued as being fundamental to our embodied experience have been ob-
served in spontaneous gestural behavior (Roth & Lawless, 2002). Cienki (2005)
found a number of image schemas (container, cycle, force, object, path)
were reliably used to categorize gestures observed from natural conversations. In
Ladewig (2011), the cycle image schema provides the basis for analyzing a cate-
gory of gestures among speakers of German (but also other languages) involving a
repeated circular movement of the hand, rotating outward at the wrist. Variations
were found in the location in which the gesture was performed, corresponding
to different functions, e.g., use in the central space in front of the speaker during
word searches, but on the side when making requests. The coordinated varia-
tions in form and meaning lead to the characterization of cyclic gestures as what
Kendon (2004) calls a ‘gesture family’. Harrison (2009, Ch. 3) also observes the
same gesture type accompanying use of the progressive aspect in English, ex-
pressed with be + -ing, e.g., “there’s something going on in the city that…”, sug-
gesting that it can play a grammatical role and not simply occur with or in place
of lexical items. Williams (2008) argues that the path image schema – or more
precisely, the source-path-goal image schematic structure – underlies gestures
involving tracing, usually with one’s extended finger(s). Tracing of a path can sup-
port the speaker’s cognition but also, in contexts of demonstrating something or
when teaching, it can be used to guide the addressee’s conceptualization. Part
of the basis of arguing for straight as an image schema (Cienki, 1998a) had
to do with the distinctive recurring pattern of experience of muscular tension
and control involved in effortful, non-curvilinear movement of body parts. While
discussed in isolation in the studies mentioned above, image schematic patterns
often co-occur in gesture, e.g., movements involving path and iteration, or
path, straight, and up-down (Bressem, 2008). This resonates with claims about
the co-occurrence of certain image schemas in other aspects of our experience
(Cienki, 1997).

2.3 Image schemas in metaphor research

The study of schemas in cognitive linguistics has some close ties with research on
conceptual metaphors. With regard to image schemas, some might even argue that
we can see some circularity in the history of their development as a construct in
relation to metaphor studies. Even though image schemas were not named in the
1980 book as such, Oakley (2007, p. 214) notes that “[t]he locus classicus of image
schema theory is Lakoff and Johnson’s (1980) conceptual theory of metaphor.”
Generalizing over patterns of metaphors found in language led to conclusions
about underlying conceptual metaphors (mappings of target domain in terms
of source domain) that provided the structure for the linguistic expressions, and
the kinds of source domains that were showing up in the most fundamental
types of conceptual metaphors (such as more is up) provided answers to the ques-
tions posed by Lakoff (above). Indeed the convention of using small capital letters
to name image schemas follows naturally from many of them having been named
in Lakoff and Johnson (1980) as common metaphoric source domains.
The process of deriving image schemas from the analysis of metaphors in lan-
guage and then justifying those image schemas through later application of them
in the analysis of metaphors in linguistic data relies on reasoning that has been
critiqued by psychologists as circular (Gibbs & Colston, 1995, p. 354). However,
as we see below, the vicious cycle of reasoning – that verbal metaphoric expres-
sions provide evidence for conceptual metaphors and that we know that because
we see conceptual metaphors expressed verbally – can be broken by analyzing
metaphor in a type of behavior other than speech itself, such as manual gesture
(Cienki, 1998b, p. 190).
As noted earlier, research on image schemas in cognitive linguistics has often
been closely connected to metaphor research. In gesture studies, too, we find the
link of the schematic imagery in gesture serving as a source domain for various
kinds of metaphoric expression. For example, the gesture involving cyclic ro-
tation of the hand is analyzed in Ladewig (2006, 2011) and Harrison (2009) as
connected with various types of processes, either in the content of the speech (on-
going actions) or in the speech situation itself (such as when a speaker is trying
to retrieve a word or concept). Here the gesture involves a partial (metonymic)
representation embodying something in rotation, such as a gear or wheel, thus
actually performing the metaphor of process as object in rotational mo-
tion. Considering the gesture of a hand tracing a path, discussed by Williams
(2008), it can physically instantiate the metaphorical linearity of logical thought
(Emanatian, 1997) as movement through space, whether or not coordinate verbal
expressions of this metaphor (e.g., do you follow my line of thinking?) are used.
Some gestures moving with short, tense motion in a straight line forward (away
from the speaker) have been analyzed (in the proper speech context) as reflect-
ing honest behavior as straight (Cienki, 1999). But since such gestural met-
aphoric expressions do not necessarily always occur with metaphorically used
words (Cienki, 1998b, 2008), they can sometimes be considered possible evidence
of cognitive activation of conceptual metaphors on some level.

2.4 A note on image schemas from a developmental perspective

The research discussed above all concerns the behavior of adult speakers. Image
schemas have been claimed to play an important role in early development as well
(e.g., Gibbs & Colston, 1995; Mandler, 1992, 2005), for example, as patterns which
infants may realize and thereby be capable of generalizing across perceptions. Im-
age schemas might therefore be expected to appear in children’s early gestural
behavior. However, Andrén (2008) argues that at least up to 27 months of age, chil-
dren “are not doing abstract and refined image schema-like gestures of the kind
that can be seen in adults until, possibly, the very end of [that time] period.” This
suggests that “performing refined schema-like gestures is not simply a question of
these abstracted image schematic structures of thought ‘spontaneously’ coming
out of the hands in the form of expression” (Andrén, 2008). Andrén supports the
position that less abstract schemas of actions provide a more fruitful option for
characterizing young children’s gestures, and that later in development, schematic
patterns more like those of image schemas become relevant in the structuring of
gestures (and thought). The importance, for children and adults, of patterns on
a less schematic level than that of image schemas has led Zlatev and colleagues
(see below) to investigate what they call mimetic schemas, patterns which relate
more closely to the basic level of categorization (Rosch, Mervis, Gray, Johnson,
& Boyes-Braem, 1976) of actions in human experience than image schemas do.

3. Mimetic schemas

3.1 Background on mimetic schemas

The notion of mimetic schemas had a rather different starting point than that of
image schemas. One difference is that it arose from discussion among a team of
researchers coming from a variety of interrelated theoretical perspectives: “an in-
terdisciplinary group of linguists, semioticians, cognitive scientists and philoso-
phers” who have taken both “a phylogenetic and ontogenetic perspective” (Zlatev,
2005, p. 315), importantly including developmental research in the scope of their
work. In addition, the research has been multimodal in nature from the start,
concerning both the audio and visual modalities by considering the interrelation-
ship between language, gestures, and pictures (Zlatev, 2005, p. 315). Finally, it is
more recent in origin than the theory of image schemas. It has been developed
in Zlatev, Persson, & Gärdenfors (2005a, 2005b) and Zlatev (2005, 2007a, 2007b),
building on a key concept of bodily mimesis from Donald (1991). Therefore mi-
metic schemas were developed with image schemas as background knowledge, in
fact in comparison and contrast with them (Zlatev, 2005, §3).
Let us consider the specifics of mimetic schemas. Some examples that Zlatev
(2005, p. 317) proposes are eat, sit, kiss, hit, put in, take out, run, crawl, fly,
and fall.2 The difference from image schemas is clear, in that while the following
properties are possible descriptors of image schemas, they are definitional char-
acteristics of mimetic schemas. Zlatev (2005, p. 318) characterizes mimetic sche-
mas as bodily, representational (not just abstract patterns), dynamic, accessible to
consciousness, specific (relating to bodily acts), and pre-reflectively shared (since
they derive from culturally salient actions). As opposed to the potentially static
nature of some image schemas (Cienki, 1997), mimetic schemas are all about ac-
tions, and thus dynamic. In this way, they concern a different level of specificity
than image schemas. Thus, while each applies to a narrower range of phenomena,
it also is more information-rich (to return to the quote from Langacker, 1987,
with which we began). In this regard, they are argued to provide a strong ba-
sis for language development in children. Zlatev (2005, pp. 327–328) notes
the close correspondence of the claimed mimetic schemas to the first verbs that
Tomasello (1992) observed from an English-speaking child between the ages of
16 to 24 months, such as hammer, kick, jump, swim, get-out. Note, however, that
detailed research has yet to be carried out on the metaphoric use of verbs express-
ing mimetic schemas, although there is interesting potential for this topic, given
that the greater specificity of mimetic schematic structures raises questions about
the ways and contexts in which they might be extended metaphorically.

3.2 Mimetic schemas in gesture

While the notion of mimetic schemas is still new and has not yet been explic-
itly employed in gesture studies as a construct, some existing studies implicitly
support further exploration of it. Of particular interest here is work which con-
cerns gestures involving schematized versions of manual actions. Calbris’ (2003)
study, for example, concerns a family of gestures involving a flat hand making a
tense, straight movement either down or horizontally across. The horizontal vari-
ants, with the palm facing down, are often used by speakers of French, as Calbris
observes, but also in many other European cultures, when refusing or negating
something. One can see the possible origins in action via the kind of sweeping
motion made when removing small unwanted objects (such as dust particles or
water droplets) from a flat surface by wiping it. We can see how a mimetic rep-
resentation of this could be used in other contexts in which no physical object
was present that required sweeping. In this sense, the unwanted or refused idea is
metaphorically wiped away.
This kind of gesture is what is described by Müller (1998a, 1998b) as the
mode of representation in which the hand imitates or enacts an action it would
actually do, such as when one depicts writing with a pen by moving one’s empty
hand horizontally in the air with the hand shape gripped as if holding a pen. Sim-
ilarly, Streeck (2009) describes handling and mimesis (depicting action) as two
forms by which gestures can depict. These kinds of gestures appear to represent
mimetic schemas through motor patterns that are informationally rich. In some
contexts, enactment gestures are performed with a different kind of function than
a referential depiction of some action of the hand. Teßendorf and Ladewig (2008),
for instance, discuss the brushing away gesture used by Spanish speakers (but
again, also observed elsewhere) in which the slightly curled fingers of one hand
quickly flick outward, often done two or three times; the same gestural form to
brush something small and unwanted (such as crumbs or lint) off of one’s clothes
is also sometimes used in the air, not against any surface. In these cases, it can play
the role of indicating dismissiveness towards an idea that has been mentioned. In
this sense, the schematic action takes on a pragmatic function. In this regard it is
interesting to compare Traugott’s (1988) discussion of lexical semantic change in
some cases, extending the use of an expression from physical contexts to what is
more abstract, subjective, and related to the discourse context. Apparently in ges-
ture as in verbal language, semantic ‘weakening’ may go hand in hand with what
Traugott calls pragmatic strengthening.

4. Discussion

We see from this overview how aspects of the current state of play in both sche-
ma research and gesture research manifest themselves more saliently as they are
brought into contact with each other. Image schemas and mimetic schemas have
been argued to perform different kinds of functions in cognitive terms for lan-
guage users, and have proven to be useful analytically for the researcher as tools
for linguistic analysis on different levels. Similarly, for gesture research, the two
notions provide different tools for analyzing, and different levels of explanation
for, gesture forms and functions.
For example, we saw above that while both types of schemas provide patterns
which can be used in gestures as source domains of metaphors, the target domains
of the metaphors involved appear to be different in the two cases. Metaphorically
used gestures based on image schemas seem to relate to ideas on the general level
of types of processes, reasoning, or behavior, while those based on mimetic sche-
mas, at least in the examples considered above, concerned more particular ideas,
like negation or dismissiveness. This could have to do with different types of sche-
maticity involved in the gestures expressing the source domains: with simple mo-
tions with less specific handshapes being characteristic of the image-schematic
type of gestures (such as path and cycle), and with handshapes more specifically
associated with basic level actions in the case of mimetic-schematic type gestures
(such as wipe and brush away). The greater schematicity of the gestures realizing
image schemas may allow for a wider variety of possible metaphoric extensions,
while the information richness of the mimetic schemas in gestures may constrain
their scope for metaphoric extension. However, confirmation of this hypothesis
will have to await further research on image schematic and mimetic schematic
structures in gestures as source domains for metaphors.
One question that arises from the discussion above is: at what cognitive level
are these schemas operating? Lakoff and Johnson (1999) place image schemas on
the level of the “cognitive unconscious,” though, Zlatev (2005, p. 322) observes,
Johnson (2005, p. 22) qualifies this by saying that the level at which image sche-
mas have meaning for us “typically operates beneath the level of our conscious
awareness” (emphasis added). Mimetic schemas, however, with their greater level
of experiential specificity, are claimed to be accessible to consciousness, even if
not normally in focal consciousness (Zlatev, 2005, p. 318). In relation to gesture,
much spontaneous gesturing during talk (what Kendon, 1980, calls ‘gesticulation’)
is done without the speaker being aware of it. In light of this, we can postulate that
the image schemas that may structure many gestures are normally playing this
role below the speaker’s level of consciousness. Indications of the inadvertentness
of gesture use can be seen in their production in a low space by the abdomen and
with relaxed hands, and from some co-gestural behaviors, such as the lack of eye
gaze at them. (Contrast Müller, 2008, on the opposite types of behaviors – such
as use of a high gesture space, tension in the hands, and eye gaze at the gestures –
which are argued to provide cues of gestural awareness). However, the gestures
discussed above pertaining to mimetic schemas, especially those involving enac-
tion of a specific behavior, do appear more carefully articulated, more effortful,
and more closely involved in detailed, intentionally communicative depiction of
the content of the talk. These provide some cues that mimetic schemas may more
readily be invoked on a more conscious level.
The gestures discussed in this article involve types which have been observed
across different contexts of use, but of course not all gestures can be seen like
these as having a basis in image schemas or mimetic schemas. Some contexts
call for more idiosyncratic use of gestural forms. Mittelberg (2010), for example,
discusses how this plays out in lectures by linguistics professors, in which the no-
tions of syntactic constituents are not only drawn in triangular diagrams on the
board in the classroom but are also embodied in the ways in which the professors
hold their arms and hands to demonstrate the analyses they are talking about. It
is therefore worth bearing in mind that the process of iconic representation of
specific physical objects, images, or actions (especially of inanimate objects) lends
itself to geometric representation in diagrammatic fashion, rather than in terms
of image schemas or schemas mimetic of bodily actions.
A closing point is that this line of research also brings some challenges. One
is the current lack of an appropriate meta-language or heuristic tool for describ-
ing semantics in this dynamic, multi-modal way (Cienki, 2012) (though see
Fricke [2008] 2012, for one proposal). Another is that much of the research on
image schemas that has made claims about linguistic semantics has been based
on written (sometimes constructed) examples. An approach to semantics which
can handle gesture as part of the act of utterance (Kendon, 1980) needs not only
to draw upon spoken language data, but also to be based on appropriate units
of analysis for spoken language, such as intonation units (Chafe, 1994), rather
than the traditional level in linguistics, that of the sentence. We see that the writ-
ten-language bias found in mainstream linguistics (Linell, 2005) persists even in
much of cognitive linguistic theorizing.

Acknowledgements

This article has its origins in a commentary for a theme session on “Motivation
in gesture: Image and motor schemas and their metaphorical extensions” at the
Third International Conference of the German Cognitive Linguistics Association,
held in Leipzig in September, 2008. I am grateful to Irene Mittelberg and Cornelia
Müller for having organized that panel, to Doris Schönefeld for valuable discus-
sion of ideas leading to this paper, to María Jesús Pinar Sanz for undertaking
the compilation of this special issue, and to two anonymous reviewers for their
helpful comments.

Notes

1. I will follow the convention in the literature of using small capital letters to indicate names
of image schemas.

2. The convention of also identifying mimetic schemas with small capital letters that Zlatev
uses will be followed here.

References

Andrén, M. (2008). Iconicity, object manipulation, and schematicity in the development of
children’s bodily communication. Talk presented at the third conference of the German
Cognitive Linguistics Association, Leipzig, Germany, September 2008.
Bressem, J. (2008). Clusters of image schematic patterns in coverbal gestures. Talk presented at
the third conference of the German Cognitive Linguistics Association, Leipzig, Germany,
September 2008.
Calbris, G. (2003). From cutting an object to a clear cut analysis: Gesture as the representa-
tion of a preconceptual schema linking concrete actions to abstract notions. Gesture, 3(1),
19–46. DOI: 10.1075/gest.3.1.03cal
Chafe, W. (1994). Discourse, consciousness, and time. Chicago: University of Chicago Press.
Cienki, A. (1997). Some properties and groupings of image schemas. In M. Verspoor, K.D. Lee
& E. Sweetser (Eds.), Lexical and syntactical constructions and the construction of meaning
(pp. 3–15). Amsterdam: John Benjamins. DOI: 10.1075/cilt.150.04cie
Cienki, A. (1998a). Straight: An image schema and its metaphorical extensions. Cognitive Lin-
guistics, 9, 107–149. DOI: 10.1515/cogl.1998.9.2.107
Cienki, A. (1998b). Metaphoric gestures and some of their relations to verbal metaphoric ex-
pressions. In J.P. Koenig (Ed.), Discourse and cognition: Bridging the gap (pp. 189–204).
Stanford: CSLI Publications.
Cienki, A. (1999). Metaphors and cultural models as profiles and bases. In R.W. Gibbs, Jr. & G.J.
Steen (Eds.), Metaphor in cognitive linguistics (pp. 189–203). Amsterdam: John Benjamins.
DOI: 10.1075/cilt.175.11cie
Cienki, A. (2005). Image schemas and gesture. In B. Hampe (Ed.), From perception to meaning:
Image schemas in cognitive linguistics (pp. 421–441). Berlin: Mouton de Gruyter.
DOI: 10.1515/9783110197532.5.421
Cienki, A. (2008). Why study metaphor and gesture? In A. Cienki & C. Müller (Eds.), Metaphor
and gesture (pp. 5–25). Amsterdam: John Benjamins. DOI: 10.1075/gs.3
Cienki, A. (2012). Usage events of spoken language and the symbolic units we (may) abstract
from them. In J. Badio & K. Kosecki (Eds.), Cognitive processes in language (pp. 149–158).
Bern: Peter Lang.
Clark, H. (1973). Space, time, semantics, and the child. In T.E. Moore (Ed.), Cognitive develop-
ment and the acquisition of language (pp. 27–63). New York: Academic Press.
Dewey, J. ([1925] 1958). Experience and nature. New York: Dover Publications. [Original edi-
tion: Chicago/London: Open Court].
Donald, M. (1991). Origins of the modern mind: Three stages in the evolution of culture and
cognition. Cambridge, MA: Harvard University Press.
Emanatian, M. (1997). The spatialization of judgment. In W.A. Liebert, G. Redeker, & L. Waugh
(Eds.), Discourse perspective in cognitive linguistics (pp. 131–147). Amsterdam: John Benja-
mins. DOI: 10.1075/cilt.151.11ema
Fricke, E. ([2008] 2012). Grundlagen einer multimodalen Grammatik des Deutschen: Syntak-
tische Strukturen und Funktionen. Habilitationsschrift, Frankfurt/Oder: Europa-Universi-
tät Viadrina. Republished as Grammatik multimodal (Berlin: Walter de Gruyter).
Gibbs, R.W., Jr. (2005). The psychological status of image schemas. In B. Hampe (Ed.), From
perception to meaning: Image schemas in cognitive linguistics (pp. 113–135). Berlin: Mouton
de Gruyter. DOI: 10.1515/9783110197532.2.113
Gibbs, R.W., Jr., & Colston, H. (1995). The psychological reality of image schemas and their
transformations. Cognitive Linguistics, 6, 347–378. DOI: 10.1515/cogl.1995.6.4.347
Goldberg, A. (1995). Constructions: A construction grammar approach to argument structure.
Chicago: University of Chicago Press.
Grady, J. (2005). Image schemas and perception: Refining a definition. In B. Hampe (Ed.), From
perception to meaning: Image schemas in cognitive linguistics (pp. 35–55). Berlin: Mouton
de Gruyter. DOI: 10.1515/9783110197532.1.35
Hampe, B. (Ed.). (2005). From perception to meaning: Image schemas in cognitive linguistics.
Berlin: Mouton de Gruyter. DOI: 10.1515/9783110197532
Harrison, S. (2009). Grammar, gesture, and cognition: The case of negation in English. PhD dis-
sertation. Bordeaux, France: Université Michel de Montaigne.
James, W. (1890). The principles of psychology. New York: Dover. DOI: 10.1037/11059-000
Johnson, M. (1987). The body in the mind: The bodily basis of meaning, imagination, and reason.
Chicago: University of Chicago Press.
Johnson, M. (2005). The philosophical significance of image schemas. In B. Hampe (Ed.), From
perception to meaning: Image schemas in cognitive linguistics (pp. 15–33). Berlin: Mouton
de Gruyter. DOI: 10.1515/9783110197532.1.15
Kant, I. ([1781] 1968). Critique of pure reason. Translated by N.K. Smith. New York: St. Martin’s
Press.
Kendon, A. (1980). Gesticulation and speech: Two aspects of the process of utterance. In M.R.
Key (Ed.), The relation between verbal and nonverbal communication (pp. 207–227). The
Hague: Mouton.
Kendon, A. (2004). Gesture: Visible action as utterance. Cambridge: Cambridge University
Press. DOI: 10.1017/CBO9780511807572
Ladewig, S.H. (2006). Die Kurbelgeste: Konventionalisierte Markierung einer kommunikativen
Aktivität. Freie Universität Berlin, Unpublished MA thesis.
Ladewig, S.H. (2011). Putting the cyclic gesture on a cognitive basis. CogniTextes, 6. Consulted
30 December 2011. URL: http://cognitextes.revues.org/406.
Lakoff, G. (1987). Women, fire, and dangerous things: What categories reveal about the mind.
Chicago: University of Chicago Press. DOI: 10.7208/chicago/9780226471013.001.0001
Lakoff, G., & Johnson, M. (1980). Metaphors we live by. Chicago: University of Chicago Press.
Lakoff, G., & Johnson, M. (1999). Philosophy in the flesh: The embodied mind and its challenge
to Western thought. New York: Basic Books.
Langacker, R. (1987). Foundations of cognitive grammar. Vol. 1. Stanford: Stanford University
Press.
Levinson, S.C. (2003). Space in language and cognition: Explorations in cognitive diversity.
Cambridge: Cambridge University Press. DOI: 10.1017/CBO9780511613609
Liddell, S. (2003). Grammar, gesture, and meaning in American Sign Language. Cambridge:
Cambridge University Press. DOI: 10.1017/CBO9780511615054
Linell, P. (2005). The written language bias in linguistics: Its nature, origins, and transformations.
London: Routledge. DOI: 10.4324/9780203342763
Mandler, J. (1992). How to build a baby: II. Conceptual primitives. Psychological Review, 99,
587–604. DOI: 10.1037/0033-295X.99.4.587
Mandler, J. (2005). How to build a baby: III. Image schemas and the transition to verbal
thought. In B. Hampe (Ed.), From perception to meaning: Image schemas in cognitive lin-
guistics (pp. 137–163). Berlin: Mouton de Gruyter. DOI: 10.1515/9783110197532.2.137
McNeill, D. (1985). So you think gestures are nonverbal? Psychological Review, 92(3), 350–371.
DOI: 10.1037/0033-295X.92.3.350
McNeill, D. (1992). Hand and mind: What gestures reveal about thought. Chicago: University
of Chicago Press.
Merleau-Ponty, M. ([1945] 1962). Phenomenology of perception. Translated by Colin Smith.
London: Routledge.
Minsky, M. (1975). A framework for representing knowledge. In P.H. Winston (Ed.), The psy-
chology of computer vision (pp. 211–277). New York: McGraw-Hill.
Mittelberg, I. (2010). Geometric and image-schematic patterns in gesture space. In V. Evans
& P. Chilton (Eds.), Language, cognition and space: The state of the art and new directions
(pp. 351–385). London: Equinox.
Müller, C. (1998a). Iconicity and gesture. In S. Santi, Guaïtella, Cavé & Konopczynski (Eds.),
Oralité et gestualité: Communication multimodale, interaction (pp. 321–328). Paris:
L’Harmattan.
Müller, C. (1998b). Redebegleitende Gesten. Kulturgeschichte – Theorie – Sprachvergleich. Berlin:
Berlin Verlag A. Spitz.
Müller, C. (2008). What gestures reveal about the nature of metaphor. In A. Cienki & C. Müller
(Eds.), Metaphor and gesture (pp. 219–245). Amsterdam: John Benjamins.
DOI: 10.1075/gs.3
Oakley, T. (2007). Image schemas. In D. Geeraerts & H. Cuyckens (Eds.), The Oxford handbook
of cognitive linguistics (pp. 214–235). Oxford: Oxford University Press.
Quinn, N. (1991). The cultural basis of metaphor. In J.W. Fernandez (Ed.), Beyond tropes: The
theory of tropes in anthropology (pp. 56–93). Stanford: Stanford University Press.
Rosch, E., Mervis, C.B., Gray, W.D., Johnson, D.M., & Boyes-Braem, P. (1976). Basic objects in
natural categories. Cognitive Psychology, 8, 382–439. DOI: 10.1016/0010-0285(76)90013-X


Roth, W.M., & Lawless, D.V. (2002). How does the body get into the mind? Human Studies, 25,
333–358. DOI: 10.1023/A:1020127419047
Rumelhart, D.E. (1980). Schemata: The building blocks of cognition. In R.J. Spiro, B.C. Bruce
& W.F. Brewer (Eds.), Theoretical issues in reading comprehension (pp. 38–58). Hillsdale,
NJ: Erlbaum.
Schank, R.C., & Abelson, R.P. (1977). Scripts, plans, goals, and understanding. Hillsdale, NJ:
Lawrence Erlbaum Associates.
Streeck, J. (2009). Gesturecraft: The manu-facture of meaning. Amsterdam: John Benjamins.
DOI: 10.1075/gs.2
Teßendorf, S., & Ladewig, S. (2008). The brushing-aside and the cyclic gesture–reconstructing
their underlying patterns. Talk presented at the third conference of the German Cognitive
Linguistics Association, Leipzig, Germany, September 2008.
Tomasello, M. (1992). First verbs: A case study of early grammatical development. Cambridge:
Cambridge University Press. DOI: 10.1017/CBO9780511527678
Traugott, E.C. (1988). Pragmatic strengthening and grammaticalization. Proceedings of the
Fourteenth Annual Meeting of the Berkeley Linguistics Society, 406–416.
Williams, R.F. (2008). Path schemas in gesturing for thinking and teaching. Talk presented at
the third conference of the German Cognitive Linguistics Association, Leipzig, Germany,
September 2008.
Zlatev, J. (2005). What’s in a schema? Bodily mimesis and the grounding of language. In B.
Hampe (Ed.), From perception to meaning: Image schemas in cognitive linguistics (pp. 313–
342). Berlin: Mouton de Gruyter. DOI: 10.1515/9783110197532.4.313
Zlatev, J. (2007a). Language, embodiment and mimesis. In T. Ziemke, J. Zlatev & R. Frank
(Eds.), Body, language and mind, vol. 1: Embodiment (pp. 297–337). Berlin: Mouton de
Gruyter.
Zlatev, J. (2007b). Intersubjectivity, mimetic schemas and the emergence of language.
Intellectica, 2–3(46–47), 123–152.
Zlatev, J., Persson, T., & Gärdenfors, P. (2005a). Bodily mimesis as the missing link in human
cognitive evolution. LUCS 121. Lund: Lund University Cognitive Studies.
Zlatev, J., Persson, T., & Gärdenfors, P. (2005b). Triadic bodily mimesis is the difference.
Commentary to Tomasello, et al. Behavioral and Brain Sciences, 28, 720–721.
DOI: 10.1017/S0140525X05530127

Index

A
acoustic meaning 157
actional process 103
    see also mental process
advertisement(s) 13, 63, 99–100, 103–105, 107, 109–113, 118, 129–130
    see also print advertisement 63
affordances 3–4, 13–14, 17–18, 27, 41, 65–67, 69, 73–75, 148
auditory modality 88, 90

B
branding 4, 61–65, 67–68, 74, 76–77
brands 3–4, 61–75

C
cartoon(s) 2, 3, 4, 9, 13, 15, 21–23, 25, 40–42, 46, 79–80, 82–87, 90, 92–95, 107–108, 113–114, 152, 154
    see also political cartoons
cognitive linguistics 1–3, 5, 7–8, 27, 43, 61, 76–77, 115, 117, 128–130, 148, 162, 167, 169, 195–197, 199–201, 206–209
cognitive mechanism(s) 5, 79
comics 2–3, 13, 25–26, 31, 41, 46, 59–60, 173
compositional meaning 99, 101, 108, 110, 112
conceptual structures 103
Conceptual
    Integration Theory 6, 147–148, 151, 159
    Integration 6, 147–148, 151, 158–159, 162
    mappings 14–16, 18, 24
    metaphor theory 3, 5, 14, 27, 46, 59, 99–100, 112
conventional metaphor 16, 21, 101
conversation 8, 144, 168–169, 181–184, 186–187, 190, 193
creative metaphor 23, 101
    multimodal metaphors 3, 13, 18
    see also metaphor creativity 3, 13–14, 16, 18–20
creativity 3, 13–16, 18–20, 24, 26, 76, 79, 85, 129
cross-modal resonances 3, 13, 15, 19, 23

D
defamiliarization metaphor 102–103, 105, 112
digital storytelling 6, 147–151, 161
domestication metaphor 102, 106–107

E
El Refaie 3, 9, 13, 19–22, 25, 82, 93–94, 99–100, 102, 106–109, 111, 113
emotions 3–4, 13, 23, 45–47, 54, 57–59, 77, 120, 145, 154, 156, 168, 172

F
film 3, 6, 8, 24, 26–27, 31–33, 35, 37–38, 41–43, 109, 113, 118, 131–146
    beginnings 6, 131, 134–135, 139
filmic discourse 131, 135–137, 140–144
Forceville 1–3, 5, 8–9, 14, 17, 24–28, 30–31, 38, 40–47, 57, 59, 62, 76, 80, 93–95, 99–103, 105, 107, 113–118, 126–130, 167, 178

G
gesture 1, 7–8, 43, 47, 51, 56–58, 80, 92, 99, 131, 175, 195–197, 199–201, 203–209
gesture studies 8, 195, 199, 201, 203

H
human interaction 7, 75
humour 7, 82, 93–95, 193
hyper-theme 6, 133, 135

I
image schema(s) 4, 8, 39, 61–76, 195–202, 204–209
    see also schemas
image schematic metaphors and metonymies 4, 61
index assignment 7, 169, 177
interaction 2, 5, 7–9, 46, 61–65, 67–68, 74–75, 82, 95, 117–118, 126–127, 129–130, 148, 161–162, 172, 174, 178, 194, 199, 208
interactive meaning 101, 108
Intermedial Cognitive Semiotics 7, 167
Intermediality 7, 167
interpersonal metafunction 129, 153
    see also textual metafunction

L
Lakoff and Johnson 40, 64, 75, 79, 101, 110, 118, 197, 200, 204
laughter 8, 80–81, 181–183, 185, 187, 189–194

M
macro-theme 6, 133, 135
mental process 103
Mental Spaces 6, 147–148, 151, 157, 159, 161–162, 178
metaphor
    creative metaphor 23, 101
    defamiliarization metaphor 102–103, 105, 112
    domestication metaphor 102, 106–107
    pictorial metaphor(s) 5, 8, 43–44, 76, 83, 90–91, 94, 100, 102, 113, 117–118, 127–128
    verbal metaphor(s) 20, 114
    visual metaphor(s) 5, 18, 59, 99–102, 106–108, 110, 112, 115–116, 150, 159
metaphor creativity 3, 13–14, 16, 18–20
    see also creative metaphor 23, 101
metaphorical mappings 3, 13, 15
method of development 133, 140–142, 144
mimetic schema(s) 8, 195–196, 202–206, 209
mirror neuron mapping 7, 173
multimodal
    communication 2, 175
    interaction 2, 7, 9
    interactional analysis 1–2, 8
    mappings 177
    metaphor analysis 79
    metaphor(s) 1–4, 8–9, 13, 15–19, 22–26, 43–44, 59, 76, 79–81, 83–84, 86, 91–95, 107, 113–114, 116, 118, 128–130, 178–179
    representation 6, 9, 147
    resonance 18
multimodality 1–9, 13, 15, 27–28, 115, 147–148, 162, 181, 193

N
narrative 6, 40, 42, 44–46, 83, 85, 101, 103, 105, 115–116, 118–120, 123–127, 131, 134–135, 139–140, 144–153, 155–157, 159–162, 172, 182–183, 186, 192

P
parametrization 68
pictorial metaphor(s) 5, 8, 43–44, 76, 83, 90–91, 94, 100, 102, 113, 117–118, 127–128
picture books 5–6, 115–118, 120, 125, 127
pointers 7, 169–170, 173, 177
political cartoons 3–4, 15, 22, 25, 40, 79–80, 82–87, 92–94, 108, 113–114
    see also cartoons
prosody 181, 193
punch lines 181–184, 186, 188, 191, 193

R
representational meaning 5, 100–103, 117, 152

S
schemas 4, 8, 38–39, 61–66, 73–76, 172, 195–209
    see also image schemas 4, 8, 39, 61–76, 195–202, 204–209
schemata 31, 61–63, 65, 70, 74, 209
semiotic modes 1, 3, 13, 17–19, 131, 137
smiling 7–8, 120, 181–183, 187–193
social semiotics 1–2, 13, 99, 113, 161
social semiotic perspective 99–100, 102
Systemic Functional Linguistics 1–2, 5, 27–28, 100, 115

T
textual
    metafunction 129
    see also interpersonal metafunction
    organization 6
thematic
    organisation 133
    structure 131
theme organization 6
    see also hyper-theme, macro-theme

V
verbal
    metaphor(s) 20, 114
    mode(s) 89, 117, 152–153, 156–157
visual
    grammar 5, 99–100, 102
    metaphor(s) 5, 9, 18, 25, 59, 99–102, 106, 108, 110, 112–116, 150, 159
    metonymy(-ies) 5–6, 115–119, 121–123, 125–127
    modality 46
    mode(s) 4, 21–22, 45, 62, 74, 126, 152, 156, 159, 172–173
