Li Deng · Yang Liu
Editors

Deep Learning in Natural Language Processing
Editors

Li Deng
AI Research at Citadel
Chicago, IL, USA
and
AI Research at Citadel
Seattle, WA, USA

Yang Liu
Tsinghua University
Beijing, China

ISBN 978-981-10-5208-8 ISBN 978-981-10-5209-5 (eBook)


https://doi.org/10.1007/978-981-10-5209-5
Library of Congress Control Number: 2018934459

© Springer Nature Singapore Pte Ltd. 2018


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made. The publisher remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
part of Springer Nature
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Foreword

“Written by a group of the most active researchers in the field, led by Dr. Deng, an
internationally respected expert in both NLP and deep learning, this book provides
a comprehensive introduction to and up-to-date review of the state of the art in applying
deep learning to solve fundamental problems in NLP. Further, the book is highly
timely, as demands for high-quality and up-to-date textbooks and research refer-
ences have risen dramatically in response to the tremendous strides in deep learning
applications to NLP. The book offers a unique reference guide for practitioners in
various sectors, especially the Internet and AI start-ups, where NLP technologies
are becoming an essential enabler and a core differentiator.”
Hongjiang Zhang (Founder, Sourcecode Capital; former CEO of KingSoft)
“This book provides a comprehensive introduction to the latest advances in deep
learning applied to NLP. Written by experienced and aspiring deep learning and
NLP researchers, it covers a broad range of major NLP applications, including
spoken language understanding, dialog systems, lexical analysis, parsing, knowl-
edge graph, machine translation, question answering, sentiment analysis, and social
computing.
The book is clearly structured and moves from major research trends, to the
latest deep learning approaches, to their limitations and promising future work.
Given its self-contained content, sophisticated algorithms, and detailed use cases,
the book offers a valuable guide for all readers who are working on or learning
about deep learning and NLP.”
Haifeng Wang (Vice President and Head of Research, Baidu; former President
of ACL)
“In 2011, at the dawn of deep learning in industry, I estimated that in most speech
recognition applications, computers still made 5 to 10 times more errors than human
subjects, and highlighted the importance of knowledge engineering in future
directions. Within only a handful of years since, deep learning has nearly closed the
gap in the accuracy of conversational speech recognition between humans and
computers. Edited and written by Dr. Li Deng—a pioneer in the recent speech
recognition revolution using deep learning—and his colleagues, this book elegantly
describes this part of the fascinating history of speech recognition as an important
subfield of natural language processing (NLP). Further, the book expands this
historical perspective from speech recognition to more general areas of NLP,
offering a truly valuable guide for the future development of NLP.
Importantly, the book puts forward a thesis that the current deep learning trend is
a revolution from the previous data-driven (shallow) machine learning era, although
ostensibly deep learning appears to be merely exploiting more data, more com-
puting power, and more complex models. Indeed, as the book correctly points out,
the current state of the art of deep learning technology developed for NLP appli-
cations, despite being highly successful in solving individual NLP tasks, has not
taken full advantage of rich world knowledge or human cognitive capabilities.
Therefore, I fully embrace the view expressed by the book’s editors and authors that
more advanced deep learning that seamlessly integrates knowledge engineering will
pave the way for the next revolution in NLP.
I highly recommend speech and NLP researchers, engineers, and students to read
this outstanding and timely book, not only to learn about the state of the art in NLP
and deep learning, but also to gain vital insights into what the future of the NLP
field will hold.”
Sadaoki Furui (President, Toyota Technological Institute at Chicago)
Preface

Natural language processing (NLP), which aims to enable computers to process
human languages intelligently, is an important interdisciplinary field crossing
artificial intelligence, computing science, cognitive science, information processing,
and linguistics. Concerned with interactions between computers and human lan-
guages, NLP applications such as speech recognition, dialog systems, information
retrieval, question answering, and machine translation have started to reshape the
way people identify, obtain, and make use of information.
The development of NLP can be described in terms of three major waves:
rationalism, empiricism, and deep learning. In the first wave, rationalist approaches
advocated the design of handcrafted rules to incorporate knowledge into NLP
systems based on the assumption that knowledge of language in the human mind is
fixed in advance by genetic inheritance. In the second wave, empirical approaches
assume that rich sensory input and the observable language data in surface form are
required and sufficient to enable the mind to learn the detailed structure of natural
language. As a result, probabilistic models were developed to discover the regu-
larities of languages from large corpora. In the third wave, deep learning exploits
hierarchical models of nonlinear processing, inspired by biological neural systems
to learn intrinsic representations from language data, in ways that aim to simulate
human cognitive abilities.
The intersection of deep learning and natural language processing has resulted in
striking successes in practical tasks. Speech recognition is the first industrial NLP
application that deep learning has strongly impacted. With the availability of
large-scale training data, deep neural networks achieved dramatically lower
recognition errors than the traditional empirical approaches. Another prominent
successful application of deep learning in NLP is machine translation. End-to-end
neural machine translation that models the mapping between human languages
using neural networks has proven to improve translation quality substantially.
Therefore, neural machine translation has quickly become the new de facto tech-
nology in major commercial online translation services offered by large technology
companies: Google, Microsoft, Facebook, Baidu, and more. Many other areas of
NLP, including language understanding and dialog, lexical analysis and parsing,
knowledge graph, information retrieval, question answering from text, social
computing, language generation, and text sentiment analysis, have also seen
significant progress using deep learning, riding on the third wave of NLP.
Nowadays, deep learning is the dominant method applied to practically all
NLP tasks.
The main goal of this book is to provide a comprehensive survey on the recent
advances in deep learning applied to NLP. The book presents state of the art of
NLP-centric deep learning research, and focuses on the role of deep learning played
in major NLP applications including spoken language understanding, dialog sys-
tems, lexical analysis, parsing, knowledge graph, machine translation, question
answering, sentiment analysis, social computing, and natural language generation
(from images). This book is suitable for readers with a technical background in
computation, including graduate students, post-doctoral researchers, educators,
industrial researchers, and anyone interested in getting up to speed with the
latest techniques of deep learning associated with NLP.
The book is organized into eleven chapters as follows:
• Chapter 1: A Joint Introduction to Natural Language Processing and to Deep
Learning (Li Deng and Yang Liu)
• Chapter 2: Deep Learning in Conversational Language Understanding (Gokhan
Tur, Asli Celikyilmaz, Xiaodong He, Dilek Hakkani-Tür, and Li Deng)
• Chapter 3: Deep Learning in Spoken and Text-Based Dialog Systems
(Asli Celikyilmaz, Li Deng, and Dilek Hakkani-Tür)
• Chapter 4: Deep Learning in Lexical Analysis and Parsing (Wanxiang Che and
Yue Zhang)
• Chapter 5: Deep Learning in Knowledge Graph (Zhiyuan Liu and Xianpei Han)
• Chapter 6: Deep Learning in Machine Translation (Yang Liu and Jiajun Zhang)
• Chapter 7: Deep Learning in Question Answering (Kang Liu and Yansong Feng)
• Chapter 8: Deep Learning in Sentiment Analysis (Duyu Tang and Meishan
Zhang)
• Chapter 9: Deep Learning in Social Computing (Xin Zhao and Chenliang Li)
• Chapter 10: Deep Learning in Natural Language Generation from Images
(Xiaodong He and Li Deng)
• Chapter 11: Epilogue (Li Deng and Yang Liu)
Chapter 1 first reviews the basics of NLP as well as the main scope of NLP
covered in the following chapters of the book, and then goes in some depth into the
historical development of NLP summarized as three waves and future directions.
Subsequently, in Chaps. 2–10, an in-depth survey on the recent advances in deep
learning applied to NLP is organized into nine separate chapters, each covering a
largely independent application area of NLP. The main body of each chapter is
written by leading researchers and experts actively working in the respective field.
The origin of this book was the set of comprehensive tutorials given at the 15th
China National Conference on Computational Linguistics (CCL 2016) held in
October 2016 in Yantai, Shandong, China, where both of us, editors of this book,
were active participants and took leading roles. We thank Springer’s senior
editor, Dr. Celine Lanlan Chang, who kindly invited us to create this book
and who has provided much of the timely assistance needed to complete this
book. We are grateful also to Springer’s Assistant Editor, Jane Li, for offering
invaluable help through various stages of manuscript preparation.
We thank all authors of Chaps. 2–10 who devoted their valuable time carefully
preparing the content of their chapters: Gokhan Tur, Asli Celikyilmaz, Dilek
Hakkani-Tur, Wanxiang Che, Yue Zhang, Xianpei Han, Zhiyuan Liu, Jiajun Zhang,
Kang Liu, Yansong Feng, Duyu Tang, Meishan Zhang, Xin Zhao, Chenliang Li,
and Xiaodong He. The authors of Chaps. 4–9 are CCL 2016 tutorial speakers. They
spent a considerable amount of time in updating their tutorial material with the
latest advances in the field since October 2016.
Further, we thank numerous reviewers and readers, Sadaoki Furui, Andrew Ng,
Fred Juang, Ken Church, Haifeng Wang, and Hongjiang Zhang, who not only gave
us much-needed encouragement but also offered many constructive comments
that substantially improved earlier drafts of the book.
Finally, we express our appreciation to our organizations, Microsoft Research and
Citadel (for Li Deng) and Tsinghua University (for Yang Liu), which provided
excellent environments, support, and encouragement that have been instrumental
in completing this book. Yang Liu is also supported by the National Natural
Science Foundation of China (Nos. 61522204, 61432013, and 61331013).

Seattle, USA    Li Deng
Beijing, China  Yang Liu
October 2017
Contents

1  A Joint Introduction to Natural Language Processing and to Deep Learning
   Li Deng and Yang Liu
2  Deep Learning in Conversational Language Understanding
   Gokhan Tur, Asli Celikyilmaz, Xiaodong He, Dilek Hakkani-Tür and Li Deng
3  Deep Learning in Spoken and Text-Based Dialog Systems
   Asli Celikyilmaz, Li Deng and Dilek Hakkani-Tür
4  Deep Learning in Lexical Analysis and Parsing
   Wanxiang Che and Yue Zhang
5  Deep Learning in Knowledge Graph
   Zhiyuan Liu and Xianpei Han
6  Deep Learning in Machine Translation
   Yang Liu and Jiajun Zhang
7  Deep Learning in Question Answering
   Kang Liu and Yansong Feng
8  Deep Learning in Sentiment Analysis
   Duyu Tang and Meishan Zhang
9  Deep Learning in Social Computing
   Xin Zhao and Chenliang Li
10 Deep Learning in Natural Language Generation from Images
   Xiaodong He and Li Deng
11 Epilogue: Frontiers of NLP in the Deep Learning Era
   Li Deng and Yang Liu

Glossary
Contributors

Asli Celikyilmaz Microsoft Research, Redmond, WA, USA
Wanxiang Che Harbin Institute of Technology, Harbin, China
Li Deng Citadel, Seattle & Chicago, USA
Yansong Feng Peking University, Beijing, China
Dilek Hakkani-Tür Google, Mountain View, CA, USA
Xianpei Han Institute of Software, Chinese Academy of Sciences, Beijing, China
Xiaodong He Microsoft Research, Redmond, WA, USA
Chenliang Li Wuhan University, Wuhan, China
Kang Liu Institute of Automation, Chinese Academy of Sciences, Beijing, China
Yang Liu Tsinghua University, Beijing, China
Zhiyuan Liu Tsinghua University, Beijing, China
Duyu Tang Microsoft Research Asia, Beijing, China
Gokhan Tur Google, Mountain View, CA, USA
Jiajun Zhang Institute of Automation, Chinese Academy of Sciences, Beijing,
China
Meishan Zhang Heilongjiang University, Harbin, China
Yue Zhang Singapore University of Technology and Design, Singapore
Xin Zhao Renmin University of China, Beijing, China

Acronyms

AI Artificial intelligence
AP Averaged perceptron
ASR Automatic speech recognition
ATN Augmented transition network
BiLSTM Bidirectional long short-term memory
BiRNN Bidirectional recurrent neural network
BLEU Bilingual evaluation understudy
BOW Bag-of-words
CBOW Continuous bag-of-words
CCA Canonical correlation analysis
CCG Combinatory categorial grammar
CDL Collaborative deep learning
CFG Context free grammar
CYK Cocke–Younger–Kasami
CLU Conversational language understanding
CNN Convolutional neural network
CNNSM Convolutional neural network based semantic model
cQA Community question answering
CRF Conditional random field
CTR Collaborative topic regression
CVT Compound value typed
DA Denoising autoencoder
DBN Deep belief network
DCN Deep convex net
DNN Deep neural network
DSSM Deep structured semantic model
DST Dialog state tracking
EL Entity linking
EM Expectation maximization
FSM Finite state machine
GAN Generative adversarial network
GRU Gated recurrent unit
HMM Hidden Markov model
IE Information extraction
IRQA Information retrieval-based question answering
IVR Interactive voice response
KBQA Knowledge-based question answering
KG Knowledge graph
L-BFGS Limited-memory Broyden–Fletcher–Goldfarb–Shanno
LSI Latent semantic indexing
LSTM Long short-term memory
MC Machine comprehension
MCCNN Multicolumn convolutional neural network
MDP Markov decision process
MERT Minimum error rate training
METEOR Metric for evaluation of translation with explicit ordering
MIRA Margin infused relaxed algorithm
ML Machine learning
MLE Maximum likelihood estimation
MLP Multiple layer perceptron
MMI Maximum mutual information
M-NMF Modularized nonnegative matrix factorization
MRT Minimum risk training
MST Maximum spanning tree
MT Machine translation
MV-RNN Matrix-vector recursive neural network
NER Named entity recognition
NFM Neural factorization machine
NLG Natural language generation
NMT Neural machine translation
NRE Neural relation extraction
OOV Out-of-vocabulary
PA Passive aggressive
PCA Principal component analysis
PMI Point-wise mutual information
POS Part of speech
PV Paragraph vector
QA Question answering
RAE Recursive autoencoder
RBM Restricted Boltzmann machine
RDF Resource description framework
RE Relation extraction
RecNN Recursive neural network
RL Reinforcement learning
RNN Recurrent neural network
ROUGE Recall-oriented understudy for gisting evaluation
RUBER Referenced metric and unreferenced metric blended evaluation routine
SDS Spoken dialog system
SLU Spoken language understanding
SMT Statistical machine translation
SP Semantic parsing
SRL Semantic role labeling
SRNN Segmental recurrent neural network
STAGG Staged query graph generation
SVM Support vector machine
UAS Unlabeled attachment score
UGC User-generated content
VIME Variational information maximizing exploration
VPA Virtual personal assistant
Chapter 1
A Joint Introduction to Natural
Language Processing and to Deep
Learning

Li Deng and Yang Liu

Abstract In this chapter, we set up the fundamental framework for the book. We
first provide an introduction to the basics of natural language processing (NLP) as an
integral part of artificial intelligence. We then survey the historical development of
NLP, spanning over five decades, in terms of three waves. The first two waves arose
as rationalism and empiricism, paving ways to the current deep learning wave. The
key pillars underlying the deep learning revolution for NLP consist of (1) distributed
representations of linguistic entities via embedding, (2) semantic generalization due
to the embedding, (3) long-span deep sequence modeling of natural language, (4)
hierarchical networks effective for representing linguistic levels from low to high,
and (5) end-to-end deep learning methods to jointly solve many NLP tasks. After
the survey, several key limitations of current deep learning technology for NLP are
analyzed. This analysis leads to five research directions for future advances in NLP.

1.1 Natural Language Processing: The Basics

Natural language processing (NLP) investigates the use of computers to process or to
understand human (i.e., natural) languages for the purpose of performing useful tasks.
NLP is an interdisciplinary field that combines computational linguistics, computing
science, cognitive science, and artificial intelligence. From a scientific perspective,
NLP aims to model the cognitive mechanisms underlying the understanding and pro-
duction of human languages. From an engineering perspective, NLP is concerned
with how to develop novel practical applications to facilitate the interactions between
computers and human languages. Typical applications in NLP include speech recog-
nition, spoken language understanding, dialogue systems, lexical analysis, parsing,
machine translation, knowledge graph, information retrieval, question answering,

L. Deng (B)
Citadel, Seattle & Chicago, USA
e-mail: [email protected]
Y. Liu
Tsinghua University, Beijing, China
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2018
L. Deng and Y. Liu (eds.), Deep Learning in Natural
Language Processing, https://doi.org/10.1007/978-981-10-5209-5_1

sentiment analysis, social computing, natural language generation, and natural lan-
guage summarization. These NLP application areas form the core content of this
book.
Natural language is a system constructed specifically to convey meaning or seman-
tics, and is by its fundamental nature a symbolic or discrete system. The surface or
observable “physical” signal of natural language is called text, always in a sym-
bolic form. The text “signal” has its counterpart—the speech signal; the latter can
be regarded as the continuous correspondence of symbolic text, both entailing the
same latent linguistic hierarchy of natural language. From NLP and signal processing
perspectives, speech can be treated as “noisy” versions of text, imposing additional
difficulties in its need of “de-noising” when performing the task of understanding the
common underlying semantics. Chapters 2 and 3 as well as current Chap. 1 of this
book cover the speech aspect of NLP in detail, while the remaining chapters start
directly from text in discussing a wide variety of text-oriented tasks that exemplify
the pervasive NLP applications enabled by machine learning techniques, notably
deep learning.
The symbolic nature of natural language is in stark contrast to the continuous
nature of language’s neural substrate in the human brain. We will defer this discussion
to Sect. 1.6 of this chapter when discussing future challenges of deep learning in NLP.
A related contrast is how the symbols of natural language are encoded in several
continuous-valued modalities, such as gesture (as in sign language), handwriting
(as an image), and, of course, speech. On the one hand, the word as a symbol is
used as a “signifier” to refer to a concept or a thing in real world as a “signified”
object, necessarily a categorical entity. On the other hand, the continuous modalities
that encode symbols of words constitute the external signals sensed by the human
perceptual system and transmitted to the brain, which in turn operates in a continuous
fashion. While of great theoretical interest, the subject of contrasting the symbolic
nature of language versus its continuous rendering and encoding goes beyond the
scope of this book.
In the next few sections, we outline and discuss, from a historical perspective, the
development of general methodology used to study NLP as a rich interdisciplinary
field. Much like several closely related sub- and super-fields such as conversational
systems, speech recognition, and artificial intelligence, the development of NLP can
be described in terms of three major waves (Deng 2017; Pereira 2017), each of which
is elaborated in a separate section next.

1.2 The First Wave: Rationalism

NLP research in its first wave lasted for a long time, dating back to the 1950s. In 1950,
Alan Turing proposed the Turing test to evaluate a computer’s ability to exhibit intelli-
gent behavior indistinguishable from that of a human (Turing 1950). This test is based
on natural language conversations between a human and a computer designed to gen-
erate human-like responses. In 1954, the Georgetown-IBM experiment demonstrated
the first machine translation system capable of translating more than 60 Russian sen-
tences into English.
The approaches, based on the belief that knowledge of language in the human
mind is fixed in advance by genetic inheritance, dominated most of NLP research
between about 1960 and late 1980s. These approaches have been called rationalist
ones (Church 2007). The dominance of rationalist approaches in NLP was mainly
due to the widespread acceptance of arguments of Noam Chomsky for an innate
language structure and his criticism of N-grams (Chomsky 1957). Postulating that
key parts of language are hardwired in the brain at birth as a part of the human
genetic inheritance, rationalist approaches endeavored to design hand-crafted rules
to incorporate knowledge and reasoning mechanisms into intelligent NLP systems.
Up until the 1980s, the most notably successful NLP systems, such as ELIZA for
simulating a Rogerian psychotherapist and MARGIE for structuring real-world
information into concept ontologies, were based on complex sets of handwritten rules.
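Such handcrafted-rule systems can be illustrated with a short sketch in the spirit of ELIZA; the patterns and response templates below are invented for illustration and are not the original ELIZA script.

```python
# A toy ELIZA-style rule engine: each rule pairs a regular-expression
# pattern with a response template; the first matching rule fires.
# These rules are illustrative inventions, not ELIZA's actual script.
import re

RULES = [
    (re.compile(r"i need (.*)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"i am (.*)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"my (.*)", re.IGNORECASE), "Tell me more about your {0}."),
]
DEFAULT = "Please go on."  # fallback when no handwritten rule matches

def respond(utterance: str) -> str:
    """Return the response of the first matching rule, else a default."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(match.group(1).rstrip(".!?"))
    return DEFAULT

print(respond("I am sad"))     # -> How long have you been sad?
print(respond("Hello there"))  # -> Please go on.
```

The brittleness noted in the text is visible even in this sketch: any input outside the handwritten patterns falls through to the default response, and nothing is learned from data.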
This period coincided approximately with the early development of artificial
intelligence, characterized by expert knowledge engineering, where domain experts
devised computer programs according to the knowledge about the (very narrow)
application domains they have (Nilsson 1982; Winston 1993). The experts designed
these programs using symbolic logical rules based on careful representations and
engineering of such knowledge. These knowledge-based artificial intelligence sys-
tems tend to be effective in solving narrow-domain problems by examining the
“head” or most important parameters and reaching a solution about the appropriate
action to take in each specific situation. These “head” parameters are identified in
advance by human experts, leaving the “tail” parameters and cases untouched. Since
they lack learning capability, they have difficulty in generalizing the solutions to new
situations and domains. The typical approach during this period is exemplified by
the expert system, a computer system that emulates the decision-making ability of a
human expert. Such systems are designed to solve complex problems by reasoning
about knowledge (Nilsson 1982). The first expert systems were created in the 1970s
and proliferated in the 1980s. The main “algorithm” used was inference rules in the
form of “if-then-else” (Jackson 1998). The main strength of these first-generation
artificial intelligence systems was their transparency and interpretability in their
(limited) capability for logical reasoning. Like NLP systems such as ELIZA and
MARGIE, the general expert systems in the early days used hand-crafted expert
knowledge which was often effective in narrowly defined problems, although the
reasoning could not handle uncertainty that is ubiquitous in practical applications.
In specific NLP application areas of dialogue systems and spoken language under-
standing, to be described in more detail in Chaps. 2 and 3 of this book, such ratio-
nalistic approaches were represented by the pervasive use of symbolic rules and
templates (Seneff et al. 1991). The designs were centered on grammatical and onto-
logical constructs, which, while interpretable and easy to debug and update, had
experienced severe difficulties in practical deployment. When such systems worked,
they often worked beautifully; but unfortunately this happened just not very often
and the domains were necessarily limited.

Likewise, speech recognition research and system design, another long-standing
NLP and artificial intelligence challenge, during this rationalist era were based
heavily on the paradigm of expert knowledge engineering, as elegantly analyzed
in (Church and Mercer 1993). During the 1970s and early 1980s, the expert system
approach to speech recognition was quite popular (Reddy 1976; Zue 1985). How-
ever, the lack of abilities to learn from data and to handle uncertainty in reasoning was
acutely recognized by researchers, leading to the second wave of speech recognition,
NLP, and artificial intelligence described next.

1.3 The Second Wave: Empiricism

The second wave of NLP was characterized by the exploitation of data corpora and
of (shallow) machine learning, statistical or otherwise, to make use of such data
(Manning and Schütze 1999). As much of the structure of and theory about natural
language were discounted or discarded in favor of data-driven methods, the main
approaches developed during this era have been called empirical or pragmatic ones
(Church and Mercer 1993; Church 2014). With the increasing availability of machine-
readable data and steady increase of computational power, empirical approaches have
dominated NLP since around 1990. One of the major NLP conferences was even
named “Empirical Methods in Natural Language Processing (EMNLP)” to reflect
most directly the strongly positive sentiment of NLP researchers during that era
toward empirical approaches.
In contrast to rationalist approaches, empirical approaches assume that the human
mind only begins with general operations for association, pattern recognition, and
generalization. Rich sensory input is required to enable the mind to learn the detailed
structure of natural language. Prevalent in linguistics between 1920 and 1960, empiri-
cism has been undergoing a resurgence since 1990. Early empirical approaches to
NLP focused on developing generative models such as the hidden Markov model
(HMM) (Baum and Petrie 1966), the IBM translation models (Brown et al. 1993),
and the head-driven parsing models (Collins 1997) to discover the regularities of
languages from large corpora. Since the late 1990s, discriminative models have become
the de facto approach in a variety of NLP tasks. Representative discriminative mod-
els and methods in NLP include the maximum entropy model (Ratnaparkhi 1997),
support vector machines (Vapnik 1998), conditional random fields (Lafferty et al.
2001), maximum mutual information and minimum classification error (He et al.
2008), and perceptron (Collins 2002).
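To make the generative-model idea concrete, the following sketch shows Viterbi decoding for a toy two-state HMM of the kind named above. The states, observation symbols, and all probabilities are invented purely for illustration, not taken from any cited system.

```python
import numpy as np

# Toy HMM (hypothetical numbers): 2 hidden states, 3 observation symbols.
states = ["Rainy", "Sunny"]
start_p = np.array([0.6, 0.4])           # P(state at t=0)
trans_p = np.array([[0.7, 0.3],          # P(state_t | state_{t-1}), rows = from
                    [0.4, 0.6]])
emit_p = np.array([[0.1, 0.4, 0.5],      # P(observation | state)
                   [0.6, 0.3, 0.1]])

def viterbi(obs):
    """Return the most likely hidden-state sequence for an observation sequence."""
    T, N = len(obs), len(states)
    delta = np.zeros((T, N))               # best log-prob of any path ending here
    psi = np.zeros((T, N), dtype=int)      # backpointers
    delta[0] = np.log(start_p) + np.log(emit_p[:, obs[0]])
    for t in range(1, T):
        scores = delta[t - 1][:, None] + np.log(trans_p)   # scores[from, to]
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + np.log(emit_p[:, obs[t]])
    # Backtrack from the best final state.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return [states[s] for s in reversed(path)]

print(viterbi([0, 1, 2]))  # -> ['Sunny', 'Rainy', 'Rainy']
```

Dynamic programming of exactly this shape, with transition and emission probabilities estimated from corpora, underlies the HMM-based taggers and recognizers of this era.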
Again, this era of empiricism in NLP was paralleled with corresponding approaches
in artificial intelligence as well as in speech recognition and computer vision. It came
about after clear evidence that learning and perception capabilities are crucial for
complex artificial intelligence systems but missing in the expert systems popular in
the previous wave. For example, when DARPA opened its first Grand Challenge for
autonomous driving, most vehicles then relied on the knowledge-based artificial intel-
ligence paradigm. Much like speech recognition and NLP, the autonomous driving and
1 A Joint Introduction to Natural Language Processing and to Deep Learning 5

computer vision researchers immediately realized the limitation of the knowledge-
based paradigm due to the necessity for machine learning with uncertainty handling
and generalization capabilities.
The empiricism in NLP and speech recognition in this second wave was based
on data-intensive machine learning, which we now call “shallow” due to the general
lack of abstractions constructed by many-layer or “deep” representations of data
that would come in the third wave, to be described in the next section. In machine
learning, researchers do not need to concern themselves with constructing precise and
exact rules as required for the knowledge-based NLP and speech systems during the first wave.
Rather, they focus on statistical models (Bishop 2006; Murphy 2012) or simple neural
networks (Bishop 1995) as an underlying engine. They then automatically learn or
“tune” the parameters of the engine using ample training data to make them handle
uncertainty, and to attempt to generalize from one condition to another and from one
domain to another. The key algorithms and methods for machine learning include EM
(expectation-maximization), Bayesian networks, support vector machines, decision
trees, and, for neural networks, the backpropagation algorithm.
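The flavor of the EM algorithm named above can be conveyed with the classic two-coin example: several sequences of coin flips are observed, each generated by one of two coins of unknown bias, and EM alternates between soft assignment of sequences to coins and re-estimation of the biases. The head counts and initial guesses below are invented for illustration.

```python
import numpy as np

# Five sequences of 10 coin flips each, from one of two unknown coins
# (only the head counts matter; the data are invented for illustration).
heads = np.array([5, 9, 8, 4, 7])
flips = 10

theta = np.array([0.6, 0.5])   # initial guesses for the two coin biases
for _ in range(50):
    # E-step: responsibility of each coin for each sequence (binomial likelihood).
    like = theta[None, :] ** heads[:, None] * \
           (1 - theta[None, :]) ** (flips - heads[:, None])
    resp = like / like.sum(axis=1, keepdims=True)
    # M-step: re-estimate each bias from its expected head and flip counts.
    theta = (resp * heads[:, None]).sum(axis=0) / (resp * flips).sum(axis=0)

print(theta.round(2))  # one coin settles near 0.8, the other near 0.5
```

The same alternation of expected sufficient statistics and parameter re-estimation, scaled up, is what trained the HMMs and translation models of this era.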
Generally speaking, the machine learning based NLP, speech, and other artificial
intelligence systems perform much better than the earlier, knowledge-based counter-
parts. Successful examples include almost all artificial intelligence tasks in machine
perception—speech recognition (Jelinek 1998), face recognition (Viola and Jones
2004), visual object recognition (Fei-Fei and Perona 2005), handwriting recognition
(Plamondon and Srihari 2000), and machine translation (Och 2003).
More specifically, in a core NLP application area of machine translation, as to be
described in detail in Chap. 6 of this book as well as in (Church and Mercer 1993), the
field has switched rather abruptly around 1990 from rationalistic methods outlined in
Sect. 1.2 to empirical, largely statistical methods. The availability of sentence-level
alignments in the bilingual training data made it possible to acquire surface-level
translation knowledge not by rules but from data directly, at the expense of discarding
or discounting structured information in natural languages. The most representative
work during this wave is that empowered by various versions of IBM translation
models (Brown et al. 1993). Subsequent developments during this empiricist era of
machine translation further significantly improved the quality of translation systems
(Och and Ney 2002; Och 2003; Chiang 2007; He and Deng 2012), but not at the
level of massive real-world deployment (which would come after the next, deep
learning wave).
In the dialogue and spoken language understanding areas of NLP, this empiri-
cist era was also marked prominently by data-driven machine learning approaches.
These approaches were well suited to meet the requirement for quantitative evalua-
tion and concrete deliverables. They focused on broader but shallow, surface-level
coverage of text and domains instead of detailed analyses of highly restricted text
and domains. The training data were used not to design rules for language under-
standing and response action from the dialogue systems but to learn parameters of
(shallow) statistical or neural models automatically from data. Such learning helped
reduce the cost of hand-crafted complex dialogue manager’s design, and helped
improve robustness against speech recognition errors in the overall spoken language
understanding and dialogue systems; for a review, see He and Deng (2013). More
specifically, for the dialogue policy component of dialogue systems, powerful rein-
forcement learning based on Markov decision processes had been introduced during
this era; for a review, see Young et al. (2013). And for spoken language understand-
ing, the dominant methods moved from rule- or template-based ones during the first
wave to generative models like hidden Markov models (HMMs) (Wang et al. 2011)
to discriminative models like conditional random fields (Tur and Deng 2011).
Similarly, in speech recognition, over close to 30 years from the early 1980s to around
2010, the field was dominated by the (shallow) machine learning paradigm using the
statistical generative model based on the HMM integrated with Gaussian mixture
models, along with various versions of its generalization (Baker et al. 2009a, b;
Deng and O’Shaughnessy 2003; Rabiner and Juang 1993). Among many versions of
the generalized HMMs were statistical and neural-network-based hidden dynamic
models (Deng 1998; Bridle et al. 1998; Deng and Yu 2007). The former adopted EM
and switching extended Kalman filter algorithms for learning model parameters (Ma
and Deng 2004; Lee et al. 2004), and the latter used backpropagation (Picone et al.
1999). Both of them made extensive use of multiple latent layers of representations for
the generative process of speech waveforms following the long-standing framework
of analysis-by-synthesis in human speech perception. More significantly, inverting
this “deep” generative process to its counterpart of an end-to-end discriminative
process gave rise to the first industrial success of deep learning (Deng et al. 2010,
2013; Hinton et al. 2012), which formed a driving force of the third wave of speech
recognition and NLP that will be elaborated next.

1.4 The Third Wave: Deep Learning

While the NLP systems, including speech recognition, language understanding, and
machine translation, developed during the second wave performed a lot better and
with higher robustness than those during the first wave, they were far from human-
level performance and left much to be desired. With a few exceptions, the (shallow)
machine learning models for NLP often did not have sufficient capacity to
absorb the large amounts of training data. Further, the learning algorithms, methods,
and infrastructures were not powerful enough. All this changed several years ago,
giving rise to the third wave of NLP, propelled by the new paradigm of deep-structured
machine learning or deep learning (Bengio 2009; Deng and Yu 2014; LeCun et al.
2015; Goodfellow et al. 2016).
In traditional machine learning, features are designed by humans and feature
engineering is a bottleneck, requiring significant human expertise. Concurrently,
the associated shallow models lack the representation power and hence the ability
to form levels of decomposable abstractions that would automatically disentangle
complex factors in shaping the observed language data. Deep learning overcomes
the above difficulties through the use of deep, layered model structures, often in the form of
neural networks, and the associated end-to-end learning algorithms. The advances in
deep learning are one major driving force behind the current NLP and more general
artificial intelligence inflection point and are responsible for the resurgence of neural
networks with a wide range of practical, including business, applications (Parloff
2016).
More specifically, despite the success of (shallow) discriminative models in a
number of important NLP tasks developed during the second wave, they suffered from
the difficulty of covering all regularities in languages by designing features manually
with domain expertise. Besides the incompleteness problem, such shallow models
also face the sparsity problem as features usually only occur once in the training
data, especially for highly sparse high-order features. Therefore, feature design
became one of the major obstacles in statistical NLP before deep learning came
to the rescue. Deep learning brings hope for addressing the human feature engineering
problem, with a view called “NLP from scratch” (Collobert et al. 2011), which
in the early days of deep learning was considered highly unconventional. Such deep learning
approaches exploit the powerful neural networks that contain multiple hidden layers
to solve general machine learning tasks dispensing with feature engineering. Unlike
shallow neural networks and related machine learning models, deep neural networks
are capable of learning representations from data using a cascade of multiple layers of
nonlinear processing units for feature extraction. As higher level features are derived
from lower level features, these levels form a hierarchy of concepts.
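The power of stacked nonlinear layers over shallow models can be illustrated on the XOR problem, which no linear model can solve but which a single hidden layer of nonlinear units learns easily. The network size, learning rate, and iteration count below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: not linearly separable, so a shallow linear model cannot fit it.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)   # hidden layer parameters
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)   # output layer parameters

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(10000):
    h = np.tanh(X @ W1 + b1)       # hidden layer: learned nonlinear features
    p = sigmoid(h @ W2 + b2)       # output layer: prediction from those features
    # Backpropagation of the squared error through both layers.
    dp = (p - y) * p * (1 - p)
    dh = (dp @ W2.T) * (1 - h ** 2)
    W2 -= lr * h.T @ dp; b2 -= lr * dp.sum(axis=0)
    W1 -= lr * X.T @ dh; b1 -= lr * dh.sum(axis=0)

print(p.round(2).ravel())  # predictions approach [0, 1, 1, 0]
```

The hidden units here play the role of the lower-level learned features from which the output decision is composed, the smallest instance of the hierarchy of concepts described above.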
Deep learning originated from artificial neural networks, which can be viewed as
cascading models of cell types inspired by biological neural systems. With the advent
of the backpropagation algorithm (Rumelhart et al. 1986), training deep neural networks
from scratch attracted intensive attention in the 1990s. In these early days, without large
amounts of training data and without proper design and learning methods, during
neural network training the learning signals vanish exponentially with the number
of layers (or more rigorously the depth of credit assignment) when propagated from
layer to layer, making it difficult to tune connection weights of deep neural networks,
especially the recurrent versions. Hinton et al. (2006) initially overcame this problem
by using unsupervised pretraining to first learn generally useful feature detectors.
Then, the network is further trained by supervised learning to classify labeled data.
As a result, it is possible to learn the distribution of a high-level representation using
low-level representations. This seminal work marks the revival of neural networks. A
variety of network architectures have since been proposed and developed, including
deep belief networks (Hinton et al. 2006), stacked auto-encoders (Vincent et al. 2010),
deep Boltzmann machines (Hinton and Salakhutdinov 2012), deep convolutional
neural networks (Krizhevsky et al. 2012), deep stacking networks (Deng et al. 2012),
and deep Q-networks (Mnih et al. 2015). Capable of discovering intricate structures
in high-dimensional data, deep learning has since 2010 been successfully applied to
real-world tasks in artificial intelligence including notably speech recognition (Yu
et al. 2010; Hinton et al. 2012), image classification (Krizhevsky et al. 2012; He et al.
2016), and NLP (all chapters in this book). Detailed analyses and reviews of deep
learning have been provided in a set of tutorial survey articles (Deng 2014; LeCun
et al. 2015; Juang 2016).
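The vanishing-gradient difficulty described above can be demonstrated numerically: backpropagation through a chain of sigmoid layers multiplies the error signal by the sigmoid derivative at each layer, which is at most 0.25, so the signal shrinks exponentially with depth. The sketch below evaluates this best-case bound.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# The sigmoid derivative peaks at z = 0, where it equals exactly 0.25.
z = 0.0
deriv = sigmoid(z) * (1 - sigmoid(z))   # = 0.25

# Even in this best case, the backpropagated signal through `depth` layers
# scales like deriv ** depth and vanishes rapidly.
for depth in (5, 10, 20):
    print(depth, deriv ** depth)   # ~1e-3, ~1e-6, ~1e-12
```

This simple calculation explains why, before pretraining and later architectural remedies, learning signals effectively failed to reach the lower layers of deep (and especially recurrent) networks.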
As speech recognition is one of the core tasks in NLP, we briefly discuss it here due to
its importance as the first real-world industrial NLP application strongly impacted
by deep learning. Industrial applications of deep learning to large-scale speech recog-
nition started to take off around 2010. The endeavor was initiated with a collaboration
between academia and industry, with the original work presented at the 2009 NIPS
Workshop on Deep Learning for Speech Recognition and Related Applications. The
workshop was motivated by the limitations of deep generative models of speech, and
the possibility that the big-compute, big-data era warrants a serious exploration of
deep neural networks. It was believed then that pretraining DNNs using generative
models of deep belief nets based on the contrastive divergence learning algorithm
would overcome the main difficulties of neural nets encountered in the 1990s (Dahl
et al. 2011; Mohamed et al. 2009). However, early into this research at Microsoft, it
was discovered that without contrastive divergence pretraining, but with the use of
large amounts of training data together with the deep neural networks designed with
corresponding large, context-dependent output layers and with careful engineering,
dramatically lower recognition errors could be obtained than then-state-of-the-art
(shallow) machine learning systems (Yu et al. 2010, 2011; Dahl et al. 2012). This
finding was quickly verified by several other major speech recognition research
groups in North America (Hinton et al. 2012; Deng et al. 2013) and subsequently
overseas. Further, the nature of recognition errors produced by the two types of sys-
tems was found to be characteristically different, offering technical insights into how
to integrate deep learning into the existing highly efficient, run-time speech decod-
ing system deployed by major players in speech recognition industry (Yu and Deng
2015; Abdel-Hamid et al. 2014; Xiong et al. 2016; Saon et al. 2017). Nowadays,
the backpropagation algorithm, applied to deep neural nets of various forms, is uniformly
used in all current state-of-the-art speech recognition systems (Yu and Deng 2015;
Amodei et al. 2016; Saon et al. 2017), and all major commercial speech recogni-
tion systems—Microsoft Cortana, Xbox, Skype Translator, Amazon Alexa, Google
Assistant, Apple Siri, Baidu and iFlyTek voice search, and more—are based on
deep learning methods.
The striking success of speech recognition in 2010–2011 heralded the arrival of
the third wave of NLP and artificial intelligence. Quickly following the success of
deep learning in speech recognition, computer vision (Krizhevsky et al. 2012) and
machine translation (Bahdanau et al. 2015) were taken over by the similar deep
learning paradigm. In particular, while the powerful technique of neural embedding
of words was developed as early as 2001 (Bengio et al. 2001), it was not until more
than 10 years later that it was shown to be practically useful at scale
(Mikolov et al. 2013), thanks to the availability of big data and faster computation.
In addition, a large number of other real-world NLP applications, such as image
captioning (Karpathy and Fei-Fei 2015; Fang et al. 2015; Gan et al. 2017), visual
question answering (Fei-Fei and Perona 2016), speech understanding (Mesnil et al.
2013), web search (Huang et al. 2013b), and recommendation systems, have been
made successful due to deep learning, in addition to many non-NLP tasks including
drug discovery and toxicology, customer relationship management, recommendation
systems, gesture recognition, medical informatics, advertisement, medical image
analysis, robotics, self-driving vehicles, board and eSports games (e.g., Atari, Go,
Poker, and the latest, DOTA2), and so on. For more details, see
https://en.wikipedia.org/wiki/deep_learning.
In more specific text-based NLP application areas, machine translation is perhaps
impacted the most by deep learning. Advancing from the shallow statistical machine
translation developed during the second wave of NLP, the current best machine
translation systems in real-world applications are based on deep neural networks. For
example, Google announced the first stage of its move to neural machine translation
in September 2016 and Microsoft made a similar announcement 2 months later.
Facebook has been working on the conversion to neural machine translation for
about a year, and by August 2017 it was at full deployment. Details of the deep learning
techniques in these state-of-the-art large-scale machine translation systems will be
reviewed in Chap. 6.
In the area of spoken language understanding and dialogue systems, deep learning
is also making a huge impact. The current popular techniques maintain and expand
the statistical methods developed during the second-wave era in several ways. Like the
empirical, (shallow) machine learning methods, deep learning is also based on data-
intensive methods to reduce the cost of hand-crafted complex understanding and
dialogue management, to be robust against speech recognition errors under noise
environments and against language understanding errors, and to exploit the power
of Markov decision processes and reinforcement learning for designing dialogue
policy, e.g., (Gasic et al. 2017; Dhingra et al. 2017). Compared with the earlier
methods, deep neural network models and representations are much more powerful
and they make end-to-end learning possible. However, deep learning has not yet
solved the problems of interpretability and domain scalability associated with earlier
empirical techniques. Details of the deep learning techniques popular for current
spoken language understanding and dialogue systems as well as their challenges
will be reviewed in Chaps. 2 and 3.
Two important recent technological breakthroughs brought about in applying deep
learning to NLP problems are sequence-to-sequence learning (Sutskever et al. 2014)
and attention modeling (Bahdanau et al. 2015). The sequence-to-sequence learning
introduces a powerful idea of using recurrent nets to carry out both encoding and
decoding in an end-to-end manner. While attention modeling was initially developed
to overcome the difficulty of encoding a long sequence, subsequent developments
significantly extended its power to provide highly flexible alignment of two arbitrary
sequences that can be learned together with neural network parameters. The key
concepts of sequence-to-sequence learning and of the attention mechanism boosted the
performance of neural machine translation, based on distributed word embeddings, over
the best systems based on statistical learning and local representations of words and
phrases. Soon after this success, these concepts have also been applied successfully
to a number of other NLP-related tasks such as image captioning (Karpathy and
Fei-Fei 2015; Devlin et al. 2015), speech recognition (Chorowski et al. 2015), meta-
learning for program execution, one-shot learning, syntactic parsing, lip reading, text
understanding, summarization, question answering, and more.
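The core attention computation described above can be sketched in a few lines: a decoder query is scored against every position of the encoded source, the scores are normalized by a softmax into alignment weights, and a context vector is formed as the weighted average. The random vectors below merely stand in for encoder states and a decoder state, and the dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def attention(query, keys, values):
    """Scaled dot-product attention: a soft alignment of one query
    against every position of the encoded source sequence."""
    scores = keys @ query / np.sqrt(query.shape[0])   # one score per position
    weights = np.exp(scores - scores.max())           # numerically stable softmax
    weights /= weights.sum()                          # weights are a distribution
    return weights, weights @ values                  # context = weighted average

T, d = 6, 8                        # source length and hidden size (illustrative)
keys = rng.normal(size=(T, d))     # stand-ins for encoder hidden states
values = keys                      # in basic attention, keys double as values
query = rng.normal(size=d)         # stand-in for a decoder state

weights, context = attention(query, keys, values)
print(weights.round(3), context.shape)
```

Because the alignment weights are differentiable functions of the states, they are learned jointly with all other network parameters, which is precisely what makes the flexible alignment of two arbitrary sequences possible.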
Setting aside their huge empirical successes, models of neural-network-based
deep learning are often simpler and easier to design than the traditional machine
learning models developed in the earlier wave. In many applications, deep learning
is performed simultaneously for all parts of the model, from feature extraction all
the way to prediction, in an end-to-end manner. Another factor contributing to the
simplicity of neural network models is that the same model building blocks (i.e., the
different types of layers) are generally used in many different applications. Using
the same building blocks for a large variety of tasks makes the adaptation of models
used for one task or data to another task or data relatively easy. In addition, software
toolkits have been developed to allow faster and more efficient implementation of
these models. For these reasons, deep neural networks are nowadays a prominent
method of choice for a large variety of machine learning and artificial intelligence
tasks over large datasets including, prominently, NLP tasks.
Although deep learning has proven effective in reshaping the processing of speech,
images, and videos in a revolutionary way, the effectiveness is less clear-cut in inter-
secting deep learning with text-based NLP despite its empirical successes in a number
of practical NLP tasks. In speech, image, and video processing, deep learning effec-
tively addresses the semantic gap problem by learning high-level concepts from raw
perceptual data in a direct manner. However, in NLP, stronger theories and structured
models on morphology, syntax, and semantics have been advanced to distill the under-
lying mechanisms of understanding and generation of natural languages, which have
not been as easily compatible with neural networks. Compared with speech, image,
and video signals, it seems less straightforward to see that the neural representations
learned from textual data can provide equally direct insights into natural language.
Therefore, applying neural networks, especially those having sophisticated hierar-
chical architectures, to NLP has received increasing attention and has become the
most active area in both the NLP and deep learning communities, with highly visible
progress made in recent years (Deng 2016; Manning and Socher 2017). Surveying
the advances and analyzing the future directions in deep learning for NLP form the
main motivation for us to write this chapter and to create this book, with the desire
to help NLP researchers accelerate the research further at the current fast pace of
progress.

1.5 Transitions from Now to the Future

Before analyzing the future directions of NLP with more advanced deep learning, here
we first summarize the significance of the transition from the past waves of NLP to
the present one. We then discuss some clear limitations and challenges of the present
deep learning technology for NLP, to pave a way to examining further development
that would overcome these limitations for the next wave of innovations.

1.5.1 From Empiricism to Deep Learning: A Revolution

On the surface, the rising wave of deep learning discussed in Sect. 1.4 of this chapter
appears to be a simple push of the second, empiricist wave of NLP (Sect. 1.3) into
an extreme end with bigger data, larger models, and greater computing power. After
all, the fundamental approaches developed during both waves are data-driven and
are based on machine learning and computation, and have dispensed with human-
centric “rationalistic” rules that are often brittle and costly to acquire in practical
NLP applications. However, if we analyze these approaches holistically and at a
deeper level, we can identify aspects of conceptual revolution moving from empiricist
machine learning to deep learning, and can subsequently analyze the future directions
of the field (Sect. 1.6). This revolution, in our opinion, is no less significant than the
revolution from the earlier rationalist wave to the empiricist one, as analyzed at the
beginning (Church and Mercer 1993) and at the end of the empiricist era (Charniak
2011).
Empiricist machine learning and linguistic data analysis during the second NLP
wave started in the early 1990s by cryptanalysts and computer scientists working
on natural language sources that are highly limited in vocabulary and application
domains. As we discussed in Sect. 1.3, surface-level text observations, i.e., words
and their sequences, are counted using discrete probabilistic models without relying
on deep structure in natural language. The basic representations were “one-hot” or
localist, where no semantic similarity between words was exploited. With restric-
tions in domains and associated text content, such structure-free representations and
empirical models are often sufficient to cover much of what needs to be covered.
That is, the shallow, count-based statistical models can naturally do well in limited
and specific NLP tasks. But when the domain and content restrictions are lifted for
more realistic NLP applications in the real world, count-based models would necessarily
become ineffective, no matter how many tricks of smoothing have been invented
in an attempt to mitigate the problem of combinatorial counting sparseness. This
is where deep learning for NLP truly shines—distributed representations of words
via embedding, semantic generalization due to the embedding, longer span deep
sequence modeling, and end-to-end learning methods have all contributed to beat-
ing empiricist, count-based methods in a wide range of NLP tasks as discussed in
Sect. 1.4.
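The contrast drawn above between localist and distributed representations can be made concrete in a few lines. The three-dimensional embedding vectors below are hand-crafted toy values, not learned ones, used only to show how distributed representations expose similarity structure that one-hot vectors cannot.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

vocab = ["king", "queen", "banana"]

# Localist ("one-hot") representations: every pair of distinct words is
# orthogonal, so no semantic similarity can be exploited or generalized.
one_hot = np.eye(len(vocab))
print(cosine(one_hot[0], one_hot[1]))   # king vs queen -> 0.0

# Distributed representations (toy, hand-crafted vectors): related words
# share directions in the space, so similarity carries over to new contexts.
emb = {
    "king":   np.array([0.90, 0.80, 0.10]),
    "queen":  np.array([0.85, 0.90, 0.15]),
    "banana": np.array([0.05, 0.10, 0.95]),
}
print(round(cosine(emb["king"], emb["queen"]), 3))    # high: related words
print(round(cosine(emb["king"], emb["banana"]), 3))   # low: unrelated words
```

Under one-hot coding, an event seen with "king" tells a count-based model nothing about "queen"; under embeddings, the shared directions let evidence generalize across semantically similar words, which is the source of the semantic generalization noted above.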

1.5.2 Limitations of Current Deep Learning Technology

Despite the spectacular successes of deep learning in NLP tasks, most notably in
speech recognition/understanding, language modeling, and in machine translation,
there remain huge challenges. The current deep learning methods based on neural
networks as a black box generally lack interpretability, and are even further from
explainability, in contrast to the “rationalist” paradigm established during the first
NLP wave where the rules devised by experts were naturally explainable. In practice,
however, it is highly desirable to explain the predictions from a seemingly “black-
box” model, not only for improving the model but for providing the users of the
prediction system with interpretations of the suggested actions to take (Koh and
Liang 2017).
In a number of applications, deep learning methods have proved to give recog-
nition accuracy close to or exceeding humans, but they require considerably more
training data, power consumption, and computing resources than humans. Also,
the accuracy results are statistically impressive but often unreliable on an individual
basis. Further, most of the current deep learning models have no reasoning and
explaining capabilities, making them vulnerable to disastrous failures or attacks with-
out the ability to foresee and thus to prevent them. Moreover, the current NLP models
have not taken into account the need for developing and executing goals and plans
for decision-making via ultimate NLP systems. A more specific limitation of current
NLP methods based on deep learning is their poor ability to understand and
reason about inter-sentential relationships, although huge progress has been made
in modeling relations among words and phrases within sentences.
As discussed earlier, the success of deep learning in NLP has largely come from a
simple strategy thus far—given an NLP task, apply standard sequence models based
on (bidirectional) LSTMs, add attention mechanisms if information required in the
task needs to flow from another source, and then train the full models in an end-to-
end manner. However, while sequence modeling is naturally appropriate for speech,
human understanding of natural language (in text form) requires more complex
structure than sequence. That is, current sequence-based deep learning systems for
NLP can be further advanced by exploiting modularity, structured memories, and
recursive, tree-like representations for sentences and larger text (Manning 2016).
To overcome the challenges outlined above and to achieve the ultimate success
of NLP as a core artificial intelligence field, both fundamental and applied research
are needed. The next new wave of NLP and artificial intelligence will not come until
researchers create new paradigmatic, algorithmic, and computation (including hard-
ware) breakthroughs. Here, we outline several high-level directions toward potential
breakthroughs.

1.6 Future Directions of NLP

1.6.1 Neural-Symbolic Integration

A potential breakthrough is in developing advanced deep learning models and meth-
ods that are more effective than current methods in building, accessing, and exploit-
ing memories and knowledge, including, in particular, common-sense knowledge.
It is not clear how to best integrate the current deep learning methods, centered
on distributed representations (of everything), with explicit, easily interpretable, and
localist-represented knowledge about natural language and the world and with related
reasoning mechanisms.
One path to this goal is to seamlessly combine neural networks and symbolic
language systems. These NLP and artificial intelligence systems will aim to discover
by themselves the underlying causes or logical rules that shape their prediction and
decision-making processes, and to make these interpretable to human users in symbolic natural language
forms. Recently, very preliminary work in this direction made use of an integrated
neural-symbolic representation called tensor-product neural memory cells, capable
of decoding back to symbolic forms. This structured neural representation is provably
lossless in the coded information after extensive learning within the neural-tensor
domain (Palangi et al. 2017; Smolensky et al. 2016; Lee et al. 2016). Extensions
of such tensor-product representations, when applied to NLP tasks such as machine
reading and question answering, are aimed to learn to process and understand mas-
sive natural language documents. After learning, the systems will be able not only to
answer questions sensibly but also to truly understand what they read, to the extent that
they can convey such understanding to human users by providing clues as to what steps
have been taken to reach the answer. These steps may be in the form of logical reason-
ing expressed in natural language which is thus naturally understood by the human
users of this type of machine reading and comprehension systems. In our view, natu-
ral language understanding is not just to accurately predict an answer from a question
with relevant passages or data graphs as its contextual knowledge in a supervised
way after seeing many examples of matched questions–passages–answers. Rather,
the desired NLP system equipped with real understanding should resemble human
cognitive capabilities. As an example of such capabilities (Nguyen et al. 2017)—
after an understanding system is trained well, say, in a question answering task
(using supervised learning or otherwise), it should master all essential aspects of the
observed text material provided to solve the question answering tasks. What such
mastering entails is that the learned system can subsequently perform well on other
NLP tasks, e.g., translation, summarization, recommendation, etc., without seeing
additional paired data such as raw text data with its summary, or parallel English and
Chinese texts, etc.
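The core binding operation behind the tensor-product representations cited above can be sketched minimally: symbolic fillers are bound to structural roles by outer products, the bindings are summed into a single memory tensor, and fillers are recovered exactly by unbinding with the role vectors. The vectors below are toy values chosen to be orthonormal; this is an illustration of the idea, not the cited systems' implementation.

```python
import numpy as np

# Toy tensor-product binding: filler vectors for symbols, orthonormal
# vectors for structural roles (values invented for illustration).
fillers = {"John": np.array([1.0, 0.0, 0.0]),
           "Mary": np.array([0.0, 1.0, 0.0])}
roles = {"agent":   np.array([1.0, 0.0]),
         "patient": np.array([0.0, 1.0])}

# Encode "John loves Mary": bind each filler to its role by an outer
# product and superimpose the bindings into one memory tensor.
memory = np.outer(fillers["John"], roles["agent"]) + \
         np.outer(fillers["Mary"], roles["patient"])

# Unbind: projecting the memory tensor onto a role vector recovers the
# bound filler exactly, because the role vectors are orthonormal.
recovered_agent = memory @ roles["agent"]
print(recovered_agent)   # -> John's filler vector, recovered losslessly
```

The lossless recovery shown here is the toy analogue of the provable decodability of the tensor-product neural memory cells: distributed vectors carry the content, while the role structure keeps the symbolic form recoverable.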
One way to examine the nature of such powerful neural-symbolic systems is
to regard them as ones incorporating the strength of the “rationalist” approaches
marked by expert reasoning and structure richness popular during the first wave of
NLP discussed in Sect. 1.2. Interestingly, prior to the rising of deep learning (third)
wave of NLP, Church (2007) argued that the pendulum from rationalist to empiricist
approaches had swung too far at almost the peak of the second NLP wave, and
predicted that the new rationalist wave would arrive. However, rather than swinging
back to a renewed rationalist era of NLP, the deep learning era arrived in full force in just
a short period from the time of writing by Church (2007). Instead of adding the ratio-
nalist flavor, deep learning has been pushing empiricism of NLP to its pinnacle with
big data and big compute, and with conceptually revolutionary ways of representing
a sweeping range of linguistic entities by massive parallelism and distributedness,
thus drastically enhancing the generalization capability of new-generation NLP
models. Only after the sweeping successes of current deep learning methods for NLP
(Sect. 1.4) and subsequent analyses of a series of their limitations, do researchers
look into the next wave of NLP—not swinging back to rationalism while abandoning
empiricism but developing more advanced deep learning paradigms that would
organically integrate the missing essence of rationalism into the structured neural
methods that are aimed to approach human cognitive functions for language.

1.6.2 Structure, Memory, and Knowledge

As discussed earlier in this chapter as well as in the current NLP literature (Manning
and Socher 2017), NLP researchers at present still have very primitive deep
learning methods for exploiting structure and for building and accessing memories
or knowledge. While LSTM (with attention) has been pervasively applied to NLP
tasks to beat many NLP benchmarks, LSTM is far from a good memory model
for human cognition. In particular, LSTM lacks adequate structure for simulating
episodic memory, a key component of human cognitive ability: the capacity to retrieve
and re-experience aspects of a past novel event or thought. This ability gives rise
to one-shot learning skills and can be crucial in reading comprehension of natural
language text or speech understanding, as well as reasoning over events described by
natural language. Many recent studies have been devoted to better memory modeling,
including external memory architectures with supervised learning (Vinyals et al.
2016; Kaiser et al. 2017) and augmented memory architectures with reinforcement
learning (Graves et al. 2016; Oh et al. 2016). However, they have not shown general
effectiveness, and have suffered from a number of limitations, notably poor
scalability (arising from the use of attention, which has to access every stored element
in the memory). Much work remains in the direction of better modeling of memory
and exploitation of knowledge for text understanding and reasoning.
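The scalability point above can be made concrete: a content-based (attention) read must score and blend every slot of the memory, so its cost grows linearly with the number of stored elements. The following is a minimal illustrative sketch in NumPy, not taken from any of the cited architectures:

```python
import numpy as np

def attention_read(memory, query):
    """Content-based read over an external memory.

    memory: (N, d) array of N stored slot vectors
    query:  (d,) query vector
    Every one of the N slots is scored and blended, which is the
    linear-in-memory-size cost discussed in the text.
    """
    scores = memory @ query                  # (N,) similarity of query to each slot
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ memory                  # (d,) weighted average of all slots

# Toy usage: 1,000 slots of dimension 8 are all touched by a single read.
rng = np.random.default_rng(0)
memory = rng.normal(size=(1000, 8))
read_out = attention_read(memory, rng.normal(size=8))
```

One direction for avoiding this full sweep, explored in some of the memory-augmented work cited above, is to index the memory with approximate nearest-neighbour lookup instead of exact attention over all slots.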

1.6.3 Unsupervised and Generative Deep Learning

Another potential breakthrough in deep learning for NLP is in new algorithms for
unsupervised deep learning, which ideally makes use of no direct teaching signals
paired with inputs (token by token) to guide the learning. Word embedding discussed
in Sect. 1.4 can be viewed as a weak form of unsupervised learning, making use of
adjacent words as “cost-free” surrogate teaching signals, but for real-world NLP
prediction tasks, such as translation, understanding, summarization, etc., such
embedding obtained in an “unsupervised manner” has to be fed into another supervised
architecture which requires costly teaching signals. In truly unsupervised learning
which requires no expensive teaching signals, new types of objective functions and
new optimization algorithms are needed, e.g., the objective function for unsupervised
learning should not require explicit target label data aligned with the input data as
in cross entropy that is most popular for supervised learning. Development of
unsupervised deep learning algorithms has been significantly behind that of supervised
and reinforcement deep learning where backpropagation and Q-learning algorithms
have been reasonably mature.
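To make the surrogate-signal idea above concrete, the sketch below trains a tiny skip-gram-style embedding whose only teaching signals are the neighbouring words themselves, optimized with the softmax cross entropy mentioned in the text. The corpus, dimensions, and learning rate are illustrative assumptions, not taken from the book:

```python
import numpy as np

corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
word_id = {w: i for i, w in enumerate(vocab)}
V, d = len(vocab), 8

# "Cost-free" surrogate labels: each word is paired with its neighbours.
pairs = [(word_id[corpus[i]], word_id[corpus[j]])
         for i in range(len(corpus))
         for j in (i - 1, i + 1) if 0 <= j < len(corpus)]

rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(V, d))   # the embeddings being learned
W_out = rng.normal(scale=0.1, size=(d, V))  # output projection

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

losses = []
for epoch in range(200):
    total = 0.0
    for c, o in pairs:
        h = W_in[c]
        p = softmax(h @ W_out)
        total -= np.log(p[o])
        grad = p.copy()
        grad[o] -= 1.0                 # gradient of cross entropy w.r.t. logits
        grad_h = W_out @ grad          # backprop to the embedding
        W_out -= 0.1 * np.outer(h, grad)
        W_in[c] -= 0.1 * grad_h
    losses.append(total)
```

After 200 passes the total cross entropy falls below its starting value, showing that co-occurrence structure alone, with no external labels, is enough to shape the embedding space.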
The most recent preliminary development in unsupervised learning takes the
approach of exploiting sequential output structure and advanced optimization
methods to alleviate the need for labels in training prediction systems (Russell and
Stefano 2017; Liu et al. 2017). Future advances in unsupervised learning are promising,
exploiting new sources of learning signals including the structure of the input data
and the mapping relationships from input to output and vice versa. Exploiting the
relationship from output to input is closely connected to building conditional generative
models. To this end, the recently popular topic in deep learning—generative adversarial
networks (Goodfellow et al. 2014)—is a highly promising direction where the
long-standing concept of analysis-by-synthesis in pattern recognition and machine
learning is likely to return to spotlight in the near future in solving NLP tasks in new
ways.
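As a numerical illustration of the adversarial setup, the sketch below evaluates the GAN value function V(D, G) from Goodfellow et al. (2014) for a logistic discriminator against a deliberately frozen toy generator, then performs a few steps of discriminator gradient ascent. The distributions, sample sizes, and step size are all hypothetical choices for this sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
x_real = rng.normal(loc=3.0, size=256)   # samples from the "data" distribution
x_fake = rng.normal(loc=0.0, size=256)   # outputs of a frozen, untrained generator

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def value(a, b):
    """GAN value function V(D, G) for a logistic discriminator D(x) = sigmoid(a*x + b)."""
    return (np.mean(np.log(sigmoid(a * x_real + b))) +
            np.mean(np.log(1.0 - sigmoid(a * x_fake + b))))

a, b = 0.0, 0.0
v0 = value(a, b)                         # equals 2*log(0.5) at the uninformative start
for _ in range(100):                     # discriminator gradient ascent on V
    d_real = sigmoid(a * x_real + b)
    d_fake = sigmoid(a * x_fake + b)
    grad_a = np.mean((1.0 - d_real) * x_real) - np.mean(d_fake * x_fake)
    grad_b = np.mean(1.0 - d_real) - np.mean(d_fake)
    a += 0.1 * grad_a
    b += 0.1 * grad_b
v1 = value(a, b)                         # ascent raises the discriminator's payoff
```

Holding the generator fixed isolates the inner maximization of the minimax game; in a full GAN the generator would then be updated to decrease the same quantity.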
Generative adversarial networks have been formulated as neural nets, with dense
connectivity among nodes and with no probabilistic setting. On the other hand,
probabilistic and Bayesian reasoning, which often takes computational advantage
of sparse connections among “nodes” as random variables, has been one of the
principal theoretical pillars of machine learning and has been responsible for many
NLP methods developed during the empiricist wave of NLP discussed in Sect. 1.3.
What is the right interface between deep learning and probabilistic modeling? Can
probabilistic thinking help understand deep learning techniques better and motivate
new deep learning methods for NLP tasks? How about the other way around? These
issues are widely open for future research.

1.6.4 Multimodal and Multitask Deep Learning

Multimodal and multitask deep learning are related learning paradigms, both
concerning the exploitation of latent representations in deep networks pooled from
different modalities (e.g., audio, speech, video, images, text, source code, etc.) or
from multiple cross-domain tasks (e.g., point and structured prediction, ranking,
recommendation, time-series forecasting, clustering, etc.). Before the deep learning
wave, multimodal and multitask learning had been very difficult to make effective,
due to the lack of intermediate representations shared across modalities or tasks.
A most striking example of this contrast in multitask learning is multilingual
speech recognition during the empiricist wave (Lin et al. 2008) versus during the deep
learning wave (Huang et al. 2013a).
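The role of shared latent representations can be sketched as a single encoder feeding multiple task-specific heads. The task names, dimensions, and architecture below are hypothetical, a minimal illustration rather than any system cited above:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_shared = 16, 8

# One shared encoder producing the intermediate representation...
W_shared = rng.normal(scale=0.1, size=(d_in, d_shared))
# ...feeding lightweight task-specific heads (task names are made up).
W_head = {"tagging": rng.normal(scale=0.1, size=(d_shared, 5)),
          "sentiment": rng.normal(scale=0.1, size=(d_shared, 2))}

def forward(x, task):
    h = np.tanh(x @ W_shared)   # shared latent representation, reused by every task
    return h @ W_head[task]     # cheap task-specific projection

x = rng.normal(size=d_in)
tag_scores = forward(x, "tagging")          # shape (5,)
sentiment_scores = forward(x, "sentiment")  # shape (2,)
```

Because the encoder parameters are shared, gradients from every task would shape one common representation, which is exactly the kind of sharing the deep learning wave made practical.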
Multimodal information can be exploited as low-cost supervision. For instance,
standard speech recognition, image recognition, and text classification methods make
use of supervision labels within each of the speech, image, and text modalities
separately. This, however, is far from how children learn to recognize speech and
images and to classify text. For example, children often get the distant “supervision” signal for
On Corpus Christi Day that year Alice and I, having received our
invitations from the Bishop of Salford, of happy pilgrimage memory, to join
in the services and procession in honour of the Blessed Sacrament at the
Missionary College, Mill Hill, went thither that glorious midsummer
day. At page 127 of the Diary I have put down certain sentiments about the
practice of the Catholic faith in England, and I express a longing to see the
Host carried through English fields. I little thought in one year to see my
hope realised; yet so it was at Mill Hill. After vespers in the little church,
the procession was formed, and I shall long remember the choristers, in
their purple cassocks, passing along a field of golden buttercups and the
white and gold banners at the head of the procession floating out against a
typical English sky as their bearers passed over a little hillock which
commands a lovely view of the rich landscape. The bishop bore the Host,
and six favoured men held the canopy. Franciscan nuns in the procession
sang the hymns.
The early days of that July had their pleasant festivities, such as a dinner,
with Alice, at Lady Londonderry’s (she who was our mother’s godmother
on the occasion of her reception into the Catholic Church) and the Academy
soirée, where Mrs. Tait invited me and Dr. Pollard to a large garden party at
Lambeth Palace. There I note: “The Royalties were in full force, the
Waleses, as I have heard the Prince and Princess called, and many others. It
was amusing and very pleasant in the gardens, though provokingly windy. I
had a curiously uncomfortable and oppressed feeling, though, in that
headquarters of the—what shall I call it?—Opposition? The Archbishop and
Mrs. Archbishop, particularly Mrs., rather appalled me. But dear Dr.
Pollard, that stout Protestant, must have been very gratified.”
On July 4th Colonel Browne, C.B., R.E., who took the keenest interest in
my “Quatre Bras,” and did all in his power to help me with the military part
of it, had a day at Chatham for me. He, Mrs. B. and daughters called for me
in the morning, and we set forth for Chatham, where some 300 men of the
Royal Engineers were awaiting us on the “Lines.” Colonel Browne had
ordered them beforehand, and had them in full dress, with knapsacks, as I
desired.
They first formed the old-fashioned four-deep square for me, and not
only that, but the beautiful parade dressing was broken and accidenté by my
directions, so as to have a little more the appearance of the real thing. They
fired in sections, too, as I wished, but, unfortunately, the wind was so strong
that the smoke was whisked away in a twinkling, and what I chiefly wished
to study was unobtainable, i.e., masses of men seen through smoke. After
they had fired away all their ammunition, the whole body of men were
drawn up in line, and, the rear rank having been distanced from the front
rank, I, attended by Colonel Browne and a sergeant, walked down them
both, slowly, picking out here and there a man I thought would do for a
“Quatre Bras” model (beardless), and the sergeant took down the name of
each man as I pointed him out very unobtrusively, Colonel Browne
promising to have these men up at Brompton, quartered there for the time I
wanted them. So I write: “I shall not want for soldierly faces, what with
those sappers and the Scots Fusilier Guards, of whom I am sure I can have
the pick, through Colonel Hepburn’s courtesy. After this interesting
‘choosing a model’ was ended, we all repaired to Colonel Galway’s
quarters, where we lunched. After that I went to the guard-room to see the
men I had chosen in the morning, so as to write down their personal
descriptions in my book. Each man was marched in by the sergeant and
stood at attention with every vestige of expression discharged from his
countenance whilst I wrote down his personal peculiarities. I had chosen
eight out of the 300 in the morning, but only five were brought now by the
sergeant, as I had managed to pitch upon three bad characters out of the
eight, and these could not be sent me. We spent the rest of the day very
pleasantly listening to the band, going over the museum, etc. I ought to see
as much of military life as possible, and I must go down to Aldershot as
often as I can.
“July 16th.—Mamma and I went to Henley-on-Thames in search of a
rye field for my ‘Quatre Bras.’ Eagerly I looked at the harvest fields as we
sped to our goal to see how advanced they were. We had a great difficulty
in finding any rye at Henley, it having all been cut, except a little patch
which we at length discovered by the direction of a farmer. I bought a piece
of it, and then immediately trampled it down with the aid of a lot of
children. Mamma and I then went to work, but, oh! horror, my oil brushes
were missing. I had left them in the chaise, which had returned to Henley.
So Mamma went frantically to work with two slimy water-colour brushes to
get down tints whilst I drew down forms in pencil. We laughed a good deal
and worked on into the darkness, two regular ‘Pre-Raphaelite Brethren,’ to
all appearances, bending over a patch of trampled rye.”
I seem to have felt to the utmost the exhilaration produced by the
following episode. Let the young Diary speak: “The grand and glorious
Lord Mayor’s banquet to the stars of literature and art came off to-day, July
21st, and it was to me such a delightful thing that I felt all the time in a
pleasant sort of dream. I was mentioned in two speeches, Lord Houghton’s
(‘Monckton Milnes’) and Sir Francis Grant’s, P.R.A. As the President spoke
of me, he said his eye rested with pleasure on me at that moment! Papa
came with me. Above all the display of civic splendour one felt the
dominant spirit of hospitality in that ever-to-me-delightful Mansion House.
It was a unique thing because such aristocrats as were there were those of
merit and genius. The few lords were only there because they represented
literature, being authors. Patti was there. She wished to have a talk with me,
and went through little Italian dramatic compliments, like Neilson. Old
Cruikshank was a strange-looking old man, a wonder to me as the illustrator
of ‘Oliver Twist’ and others of Dickens’s works—a unique genius. He said
many nice things about me to Papa. I wished the evening could have lasted
a week.”
The next entries are connected with the “Quatre Bras” cartoon:
“Dreadful misgivings about a vital point. I have made my front rank men
sitting on their heels in the kneeling position. Not so the drill book. After
my model went, most luckily came Colonel Browne. Shakes his head at the
attitudes. Will telegraph to Chatham about the heel and let me know in the
morning.
“July 23rd.—Colonel Browne came, and with him a smart sergeant-
major, instructor of musketry. Alas! this man and telegram from Chatham
dead against me. Sergeant says the men at Chatham must have been sitting
on their heels to rest and steady themselves. He showed me the exact
position when at the ‘ready’ to receive cavalry. To my delight I may have
him to-morrow as a model, but it is no end of a bore, this wasted time.”
“July 24th.—The musketry instructor, contrary to my sad expectations,
was by no means the automaton one expects a soldier to be, but a
thoroughly intelligent model, and his attitudes combined perfect drill-book
correctness with great life and action. He was splendid. I can feel certain of
everything being right in the attitudes, and will have no misgivings. It is
extraordinary what a well-studied position that kneeling to resist cavalry is.
I dread to think what blunders I might have committed. No civilian would
have detected them, but the military would have been down upon me. I feel,
of course, rather fettered at having to observe rules so strict and imperative
concerning the poses of my figures, which, I hope, will have much action. I
have to combine the drill book and the fierce fray! I told an artist the other
day, very seriously, that I wished to show what an English square looks like
viewed quite close at the end of two hours’ action, when about to receive a
last charge. A cool speech, seeing I have never seen the thing! And yet I
seem to have seen it—the hot, blackened faces, the set teeth or gasping
mouths, the bloodshot eyes and the mocking laughter, the stern, cool,
calculating look here and there; the unimpressionable, dogged stare! Oh!
that I could put on canvas what I have in my mind!
“July 25th.—A glorious day at Chatham, where again the Engineers
were put through field exercises, and I studied them with all my faculties. I
got splendid hints to-day. Went with Colonel Browne and Papa.
“July 28th.—My dear musketry instructor for a few more attitudes. He
has put me through the process of loading the ‘Brown Bess’—a flint-lock—
so that I shall have my soldiers handling their arms properly. Galloway has
sold the copyright of this picture to Messrs. Dickenson for £2,000! They
must have faith in my doing it well.”
On August 11th I see I took a much-needed holiday at home, at Ventnor;
and, as I say, “gave myself up to fresh air, exercise, a little out-of-door
painting, and Napier’s ‘Peninsular War,’ in six volumes.” Shortly before I
left for home I received from Queen Victoria a very splendid bracelet set
with pearls and a large emerald. My mother and good friend Dr. Pollard
were with me in the studio when the messenger brought it, and we formed a
jubilant trio.
It was pleasant to be amongst my old Ventnor friends who had known
me since I was little more than a child. But on September 10th I had to bid
them and the old place goodbye, and on September 11th I re-entered my
beloved studio.
“September 12th.—An eventful day, for my ‘Quatre Bras’ canvas was
tackled. The sergeant-major and Colonel Browne arrived. The latter, good
man, has had the whole Waterloo uniform made for me at the Government
clothing factory at Pimlico. It has been made to fit the sergeant-major, who
put on the whole thing for me to see. We had a dress rehearsal, and very
delighted I was. They have even had the coat dyed the old ‘brick-dust’ red
and made of the baize cloth of those days! Times are changed for me. It will
be my fault if the picture is a fiasco.”
During the painting of “Quatre Bras” I was elected a member of the
Royal Institute of Painters in Water Colour, and I contributed to the Winter
Exhibition that large sketch of a sowar of the 10th Bengal Lancers which I
called “Missed!” and which the Graphic bought and published in colours.
This reproduction sold to such an extent that the Graphic must have been
pleased! The sowar at “tent-pegging” has missed his peg and pulls at his
horse at full gallop. I had never seen tent-pegging at that time, but I did this
from description, by an Anglo-Indian officer of the 10th, who put the thing
vividly before me. How many, many tent-peggings I have seen since, and
what a number of subjects they have given me for my brush and pencil!
Those captivating and pictorial movements of men and horses are
inexhaustible in their variety.
I had more models sent to me than I could put into the big picture—
Guardsmen, Engineers and Policemen—the latter being useful as, in those
days, the police did not wear the moustache, and I had difficulty in finding
heads suitable for the Waterloo time. Not a head in the picture is repeated. I
had a welcome opportunity of showing varieties of types such as gave me
so much pleasure in the old Florentine days when I enjoyed the Andrea del
Sartos, Masaccios, Francia Bigios, and other works so full of characteristic
heads.
On November 7th my sister and I went for a weekend to Birmingham,
where the people who had bought “The Roll Call” copyright were
exhibiting that picture. They particularly wished me to go. We were very
agreeably entertained at Birmingham, where I was curious to meet the
buyer of my first picture sold, that “Morra” which I painted in Rome.
Unfortunately I inquired everywhere for “Mr. Glass,” and had to leave
Birmingham without seeing him and the early work. No one had heard of
him! His name was Chance, the great Birmingham glass manufacturer.
“November 27th.—In the morning off with Dr. Pollard to Sanger’s
Circus, where arrangements had been made for me to see two horses go
through their performances of lying down, floundering on the ground, and
rearing for my ‘Quatre Bras’ foreground horses. It was a funny experience
behind the scenes, and I sketched as I followed the horses in their
movements over the arena with many members of the troupe looking on,
the young ladies with their hair in curl-papers against the evening’s
performance. I am now ripe to go to Paris.”
So to Paris I went, with my father. We were guests of my father’s old
friends, Mr. and Mrs. Talmadge, Boulevard Haussmann, and a complete
change of scene it was. It gave my work the desired fillip and the fresh
impulse of emulation, for we visited the best studios, where I met my most
admired French painters. The Paris Diary says:
“December 3rd.—Our first lion was Bonnat in his studio. A little man,
strong and wiry; I didn’t care for his pictures. His colouring is dreadful.
What good light those Parisians get while we are muddling in our smoky art
centre. We next went to Gérôme, and it was an epoch in my life when I saw
him. He was at work but did not mind being interrupted. He is a much
smaller man than I expected, with wide open, quick black eyes, yet with
deep lids, the eyes opening wide only when he talks. He talked a great deal
and knew me by name and ‘l’Appel,’ which he politely said he heard was
‘digne’ of the celebrity it had gained. We went to see an exhibition of
horrors—Carolus Duran’s productions, now on view at the Cercle
Artistique. The talk is all about this man, just now the vogue. He illustrates
a very disagreeable present phase of French Art. At Goupil’s we saw De
Neuville’s ‘Combat on the Roof of a House,’ and I feasted my eyes on some
pickings from the most celebrated artists of the Continent. I am having a
great treat and a great lesson.
“December 4th.—Had a supposed great opportunity in being invited to
join a party of very mondaines Parisiennes to go over the Grand Opera,
which is just being finished. Oh, the chatter of those women in the carriage
going there! They vied with each other in frivolous outpourings which
continued all the time we explored that dreadful building. It is a pile of
ostentation which oppressed me by the extravagant display of gilding,
marbles and bronze, and silver, and mosaic, and brocade, heaped up over
each other in a gorged kind of way. How truly weary I felt; and the
bedizened dressing-rooms of the actresses and danseuses were the last
straw. Ugh! and all really tasteless.”
However, I recovered from the Grand Opera, and really enjoyed the
lively dinners where conversation was not limited to couples, but flowed
with great esprit across the table and round and round. Still, in time, my
sleep suffered, for I seemed to hear those voices in the night. How graceful
were the French equivalents to the compliments I received in London. They
thought I would like to know that the fame of “l’Appel” had reached Paris,
and so I did.
We visited Detaille’s beautiful studio. He was my greatest admiration at
that time. Also Henriette Browne’s and others, and, of course, the
Luxembourg, so I drew much profit from my little visit. But what a change
I saw in the army! I who could remember the Empire of my childhood, with
its endless variety of uniforms, its buglings, and drummings, and
trumpetings; its chic and glitter and swagger: 1870 was over it all now.
Well, never mind, I have lived to see it in the “bleu d’horizon” of a new and
glorious day. My Paris Diary winds up with: “December 14th.—Papa and I
returned home from our Paris visit. My eye has been very much sharpened,
and very severe was that organ as it rested on my ‘Quatre Bras’ for the first
time since a fortnight ago. Ye Gods! what a deal I have to do to that picture
before it will be fit to look at! I continue to receive droll letters and poems
(!). One I must quote the opening line of:

‘Go on, go on, thou glorious girl!’

Very cheering.”
CHAPTER X

MORE WORK AND PLAY

SO I worked steadily at the big picture, finding the red coats very trying.
What would I have thought, when studying at Florence, if I had been told to
paint a mass of men in one colour, and that “brick-dust”? However, my
Aldershot observations had been of immense value in showing me how the
British red coat becomes blackish-purple here, pale salmon colour there,
and so forth, under the influence of the weather and wear and tear. I have all
the days noted down, with the amount of work done, for future guidance,
and lamentations over the fogs of that winter of 1874-5. I gave nicknames
in the Diary to the figures in my picture, which I was amused to find, later
on, was also the habit of Meissonier; one of my figures I called the
“Gamin” and he, too, actually had a “Gamin.” Those fogs retarded my work
cruelly, and towards the end I had to begin at the studio at 9.30 instead of
10, and work on till very late. The porter at No. 76 told me mine was the
first fire to be lit in the morning of all in The Avenue.
Practising for “Quatre Bras.”

One day the Horse Guards, directed by their surgeon, had a magnificent
black charger thrown down in the riding school at Knightsbridge (on deep
sawdust) for me to see, and get hints from, for the fallen horse in my
foreground. The riding master strapped up one of the furious animal’s
forelegs and then let him go. What a commotion before he fell! How he
plunged and snorted in clouds of dust till the final plunge, when the riding
master and a trooper threw themselves on him to keep him down while I
made a frantic sketch. “What must it be,” I ask, “when a horse is wounded
in battle, if this painless proceeding can put him into such a state?”
The spring of 1875 was full of experiences for me. I note that “at the
Horse Guards’ riding school a charger was again ‘put down’ for me, but
more gently this time, and without the risk, as the riding master said, of
breaking the horse’s neck, as last time. I was favoured with a charge, two
troopers riding full tilt at me and pulling up at within two yards of where I
stood, covering me with the sawdust. I stood it bravely the second time, but
the first I got out of the way. With ‘Quatre Bras’ in my head, I tried to fancy
myself one of my young fellows being charged, but I fear my expression
was much too feminine and pacific.” March 22nd gave me a long day’s
tussle with the grey, bounding horse shot in mid-career. I say: “This is a
teaser. I was tired out and faint when I got home.” If that was a black day,
the next was a white one: “The sculptor, Boehm, came in, and gave me the
very hints I wanted to complete my bounding horse. Galloway also came.
He says ‘Quatre Bras’ beats ‘The Roll Call’ into a cocked hat! He gave me
£500 on account. Oh! the nice and strange feeling of easiness of mind and
slackening of speed; it is beginning to refresh me at last, and my seven
months’ task is nearly accomplished.” Another visitor was the Duke of
Cambridge, who, it appears, gave each soldier in my square a long scrutiny
and showed how well he understood the points.
On “Studio Monday” the crowds came, so that I could do very little in
the morning. The novelty, which amused me at first, had worn off, and I
was vexed that such numbers arrived, and tried to put in a touch here and
there whenever I could. Millais’ visit, however, I record as “nice, for he was
most sincerely pleased with the picture, going over it with great gusto. It is
the drawing, character, and expression he most dwells on, which is a
comfort. But I must now try to improve my tone, I know. And what about
‘quality’? To-day, Sending-in Day, Mrs. Millais came, and told me what her
husband had been saying. He considers me, she said, an even stronger artist
than Rosa Bonheur, and is greatly pleased with my drawing. That (the
‘drawing’) pleased me more than anything. But I think it is a pity to make
comparisons between artists. I may be equal to Rosa Bonheur in power, but
how widely apart lie our courses! I was so put out in the morning, when I
arrived early to get a little painting, to find the wretched photographers in
possession. I showed my vexation most unmistakably, and at last bundled
the men out. They were working for Messrs. Dickinson. So much of my
time had been taken from me that I was actually dabbing at the picture
when the men came to take it away; I dabbing in front and they tapping at
the nails behind. How disagreeable!”
After doing a water colour of a Scots Grey orderly for the “Institute,”
which Agnew bought, I was free at last to take my holiday. So my Mother
and I were off to Canterbury to be present at the opening of St. Thomas’s
Church there.
“April 11th, Canterbury.—To Mass in the wretched barn over a stable
wherein a hen, having laid an egg, cackled all through the service. And this
has been our only church since the mission was first begun six years ago, up
till now, in the city of the great English Martyr. But this state of things
comes to an end on Tuesday.”
This opening of St. Thomas’s Church was the first public act of Cardinal
Manning as Cardinal, and it went off most successfully. There were rows of
Bishops and Canons and Monsignori and mitred Abbots, and monks and
secular priests, all beautifully disposed in the Sanctuary. The sun shone
nearly the whole time on the Cardinal as he sat on his throne. After Mass
came the luncheon at which much cheering and laughter were indulged in.
Later on Benediction, and a visit to the Cathedral. I rather winced when a
group of men went down on their knees and kissed the place where the
blood of St. Thomas à Becket is supposed to still stain the flags. The
Anglican verger stared and did not understand.
On Varnishing Day at the Academy I was evidently not enchanted with
the position of my picture. “It is in what is called ‘the Black Hole’—the
only dark room, the light of which looks quite blue by contrast with the
golden sun-glow in the others. However, the artists seemed to think it a
most enviable position. The big picture is conspicuous, forming the centre
of the line on that wall. One academician told me that on account of the
rush there would be to see it they felt they must put it there. This ‘Lecture
Room’ I don’t think was originally meant for pictures and acts on the
principle of a lobster pot. You may go round and round the galleries and
never find your way into it! I had the gratification of being told by R.A.
after R.A. that my picture was in some respects an advance on last year’s,
and I was much congratulated on having done what was generally believed
more than doubtful—that is, sending any important picture this year with
the load and responsibility of my ‘almost overwhelming success,’ as they
called it, of last year on my mind. And that I should send such a difficult
one, with so much more in it than the other, they all consider ‘very plucky.’
I was not very happy myself, although I know ‘Quatre Bras’ to be to ‘The
Roll Call’ as a mountain to a hill. However, it was all very gratifying, and I
stayed there to the end. My picture was crowded, and I could see how it was
being pulled to pieces and unmercifully criticised. I returned to the studio,
where I found a champagne lunch spread and a family gathering awaiting
me, all anxiety as to the position of my magnum opus. After that hilarious
meal I sped back to the fascination of Burlington House. I don’t think,
though, that Mamma will ever forgive the R.A.’s for the ‘Black Hole.’
“April 30th.—The private view, to which Papa and I went. It is very
seldom that an ‘outsider’ gets invited, but they make a pet of me at the
Academy. Again this day contrasted very soberly with the dazzling P.V. of
‘74. There were fewer great guns, and I was not torn to pieces to be
introduced here, there, and everywhere, most of the people being the same
as last year, and knowing me already. The same furore cannot be repeated;
the first time, as I said, can never be a second. Papa and I and lots of others
lunched over the way at the Penders’ in Arlington Street, our hosts of last
night, and it was all very friendly and nice, and we returned in a body to the
R.A. afterwards. I was surprised, at the big ‘At Home’ last night, to find
myself a centre again, and people all so anxious to hear my answers to their
questions. Last year I felt all this more keenly, as it had all the fascination of
novelty. This year just the faintest atom of zest is gone.
“May 3rd.—To the Academy on this, the opening day. A dense, surging
multitude before my picture. The whole place was crowded so that before
‘Quatre Bras’ the jammed people numbered in dozens and the picture was
most completely and satisfactorily rendered invisible. It was chaos, for
there was no policeman, as last year, to make people move one way. They
clashed in front of that canvas and, in struggling to wriggle out, lunged right
against it. Dear little Mamma, who was there nearly all the time of our visit,
told me this, for I could not stay there as, to my regret, I find I get
recognised (I suppose from my latest photos, which are more like me than
the first horror) and the report soon spreads that I am present. So I wander
about in other rooms. I don’t know why I feel so irritated at starers. One can
have a little too much popularity. Not one single thing in this world is
without its drawbacks. I see I am in for minute and severe criticism in the
papers, which actually give me their first notices of the R.A. The Telegraph
gives me its entire article. The Times leads off with me because it says
‘Quatre Bras’ will be the picture the public will want to hear about most. It
seems to be discussed from every point of view in a way not usual with
battle pieces. But that is as it should be, for I hope my military pictures will
have moral and artistic qualities not generally thought necessary to military
genre.
“May 4th.—All of us and friends to the Academy, where we had a lively
lunch, Mamma nearly all the time in ‘my crowd,’ half delighted with the
success and half terrified at the danger the picture was in from the eagerness
of the curious multitude. I just furtively glanced between the people, and
could only see a head of a soldier at a time. A nice notion the public must
have of the tout ensemble of my production!”
I was afloat on the London season again, sometimes with my father, or
with Dr. Pollard. My dear mother did not now go out in the evenings, being
too fatigued from her most regrettable sleeplessness. There was a dinner or
At Home nearly every day, and occasionally a dance or a ball. At one of the
latter my partner informed me that Miss Thompson was to be there that
evening. All this was fun for the time. At a crowded afternoon At Home at
the Campanas’, where all the singers from the opera were herded, and
nearly cracked the too-narrow walls of those tiny rooms by the concussion
of the sound issuing from their wonderful throats, I met Salvini. “Having
his ‘Otello,’ which we saw the other night, fresh in my mind, I tried to
enthuse about it to him, but became so tongue-tied with nervousness that I
could only feebly say ‘Quasi, quasi piangevo!’ ‘O! non bisogna piangere,’
poor Salvini kindly answered. To tell him I nearly cried! To tell the truth, I
was much too painfully impressed by the terrific realism of the murder of
Desdemona and of Othello’s suicide to cry. I have been told that, when
Othello is chasing Desdemona round the room and finally catches her for
the murder, women in the audience have been known to cry out ‘Don’t!’
And I told him I nearly cried! Ugh!”
After this I went to Great Marlow for fresh air with my mother, and
worked up an oil picture of a scout of the 3rd Dragoon Guards whom I saw
at Aldershot, getting the landscape at Marlow. It has since been engraved.
By the middle of June I was at work in the studio once more. The
evenings brought their diversions. Under Mrs. Owen Lewis’s chaperonage I
went to Lady Petre’s At Home one evening, where 600 guests were
assembled “to meet H.E. the Cardinal.”[5] I record that “I enjoyed it very
much, though people did nothing but talk at the top of their voices as they
wriggled about in the dense crowd which they helped to swell. They say it
is a characteristic of these Catholic parties that the talk is so loud, as
everybody knows everybody intimately! I met many people I knew, and my
dear chaperon introduced lots of people to me. I had a longish talk with
H.E., who scolded me, half seriously, for not having come to see him. I was
aware of an extra interest in me in those orthodox rooms, and was much
amused at an enthusiastic woman asking, repeatedly, whether I was there.
These fleeting experiences instruct one as they fly. Now I know what it
feels like to be ‘the fashion.’” Other festivities have their record: “I went to
a very nice garden party at the house of the great engineer, Mr. Fowler,
where the usual sort of thing concerning me went on—introductions of
‘grateful’ people in large numbers who, most of them, poured out their
heartfelt(!) feelings about me and my work. I can stand a surprising amount
of this, and am by no means blasée yet. Mr. Fowler has a very choice
collection of modern pictures, which I much enjoyed.” Again: “The dinner
at the Millais’ was nice, but its great attraction was Heilbuth’s being there,
one of my greatest admirations as regards his particular line—characteristic
scenes of Roman ecclesiastical life such as I so much enjoyed in Rome. I
told Millais I had had Heilbuth’s photograph in my album for years. ‘Do
you hear that, Heilbuth?’ he shouted. To my disgust he was portioned off to
some one else to go in to dinner, but I had de Nittis, a very clever
Neapolitan artist, and, what with him and Heilbuth and Hallé and Tissot, we
talked more French and Italian than English that evening. Millais was so
genial and cordial, and in seeing me into the carriage he hinted very broadly
that I was soon to have what I ‘most t’oroughly deserved’—that is, my
election as A.R.A. He pronounced the ‘th’ like that, and with great
emphasis. Was that the Jersey touch?”
In July I saw de Neuville’s remarkable “Street Combat,” which made a
deep impression on me. I went also to see the field day at Aldershot, a great
success, with splendid weather. After the “battle,” Captain Cardew took us
over several camps, and showed us the stables and many things which
interested me greatly and gave me many ideas. The entry for July 17th says:
“Arranging the composition for my ‘Balaclava’ in the morning, and at
1.30 came my dear hussar,[6] who has sat on his fiery chestnut for me
already, on a fine bay, for my left-hand horse in the new picture. I have been
leading such a life amongst the jarring accounts of the Crimean men I have
had in my studio to consult. Some contradict each other flatly. When Col.
C. saw my rough charcoal sketch on the wall, he said no dress caps were
worn in that charge, and coolly rubbed them off, and with a piece of
charcoal put mean little forage caps on all the heads (on the wrong side,
too!), and contentedly marched out of the door. In comes an old 17th Lancer
sergeant, and I tell him what has been done to my cartoon. ‘Well, miss,’
says he, ‘all I can tell you is that my dress cap went into the charge and my
dress cap came out of it!’ On went the dress caps again and up went my
spirits, so dashed by Col. C. To my delight this lancer veteran has kept his
very uniform—somewhat moth-eaten, but the real original, and he will lend
it to me. I can get the splendid headdress of the 17th, the ‘Death or Glory
Boys,’ of that period at a military tailor’s.”
The Lord Mayor’s splendid banquet to the Royal Academicians and
distinguished “outsiders” was in many respects a repetition of the last but
with the difference that the assembly was almost entirely composed of
artists. “I went with Papa, and I must say, as my name was shouted out and
we passed through the lane of people to where the Lord Mayor and Lady
Mayoress were standing to receive their guests, I felt a momentary stroke of
nervousness, for people were standing there to see who was arriving, and
every eye was upon me. I was mentioned in three or four speeches. The
Lord Mayor, looking at me, said that he was honoured to have amongst his
guests Miss Thompson (cheers), and Major Knollys brought in ‘The Roll
Call’ and ‘Quatre Bras’ amidst clamour, while Sir Henry Cole’s allusion to
my possible election as an A.R.A. was equally well received. I felt very
glad as I sat there and heard my present work cheered; for in that hall, last
year, I had still the great ordeal to go through of painting, and painting
successfully, my next picture, and that was now a fait accompli.”
A rainy July sadly hindered me from seeing as much as I had hoped to
see of the Aldershot manœuvres. On one lovely day, however, Papa and I
went down in the special train with the Prince of Wales, the Duke of
Cambridge, and all the “cocked hats.” In our compartment was Lord
Dufferin, who, on hearing my name, asked to be introduced and proved a
most charming companion, and what he said about “Quatre Bras” was nice.
He was only in England on a short furlough from Canada, and did not see
my “Roll Call.”
“At the station at Farnborough the picturesqueness began with the gay
groups of the escort, and other soldiers and general officers, all in war trim,
moving about in the sunshine, while in the background slowly passed,
heavily laden, the Army on the march to the scene of action. Papa and I and
Major Bethune took a carriage and slowly followed the march, I standing
up to see all I could.
“We were soon overtaken by the brilliant staff, and saluted as it flashed
past by many of its gallant members, including the dashing Baron de
Grancey in his sky blue Chasseurs d’Afrique uniform. Poor Lord Dufferin
in civilian dress—frock coat and tall hat—had to ride a rough-trotting troop
horse, as his own horse never turned up at the station. A trooper was
ordered to dismount, and the elegant Lord Dufferin took his place in the
black sheepskin saddle. He did all with perfect grace, and I see him now, as
he passed our carriage, lift his hat with a smiling bow, as though he was
riding the smoothest of Arabs. The country was lovely. All the heather out
and the fir woods aromatic. In one village regiments were standing in the
streets, others defiling into woods and all sorts of artillery, ambulance, and
engineer waggons lumbering along with a dull roll very suggestive of real
war. At this village the two Army Corps separated to become enemies, the
one distinguished from the other by the men of one side wearing broad
white bands round their headdresses. This gave the wearers a rather savage
look which I much enjoyed. It made their already brown faces look still
grimmer. Of course, our driver took the wrong road and we saw nothing of
the actual battle, but distant puffs of smoke. However, I saw all the march
back to Aldershot, and really, what with the full ambulances, the men lying
exhausted (sic) by the roadside, or limping along, and the cheers and songs
of the dirty, begrimed troops, it was not so unlike war. At the North Camp
Sir Henry de Bathe was introduced, and Papa and I stood by him as the
troops came in.” A day or two later I was in the Long Valley where the most
splendid military spectacle was given us, some 22,000 being paraded in the
glorious sunshine and effective cloud shadows in one of the most striking
landscapes I have seen in England. “It was very instructive to me,” I write,
“to see the difference in the appearance of the men to-day from that which
they presented on Thursday. Their very faces seemed different; clean, open
and good-looking, whereas on Thursday I wondered that British soldiers
could look as they did. The infantry in particular, on that day, seemed
changed; they looked almost savage, so distorted were their faces with
powder and dirt and deep lines caused by the glare of the sun. I was well
within the limits when I painted my 28th in square. I suppose it would not
have done to be realistic to the fullest extent. The lunch at the Welsh
Fusiliers’ mess in a tent I thought very nice. Papa came down for the day. It
is very good of him. I don’t think he approves of my being so much on my
own hook. But things can’t help being rather abnormal.”
Here follows another fresh air holiday at my grandparents’ at Worthing
(where I rode with my grandfather), finishing up with a visit which I shall
always remember with pleasure—I ought to say gratitude—not only for its
own sake, but for all the enjoyment it obtained for me in Italy. That August
I was a guest of the Higford-Burrs at Aldermaston Court, an Elizabethan
house standing in a big Berkshire park. “I arrived just as the company were
finishing dinner. I was welcomed with open arms. Mrs. Higford-Burr
embraced me, although I have only seen her twice before, and I was made
to sit down at table in my travelling dress, positively declining to recall
dishes, hating a fuss as I do. The dessert was pleasant because every one
made me feel at home, especially Mrs. Janet Ross, daughter of the Lady
Duff Gordon whose writings had made me long to see the Nile in my
childhood. There are five lakes in the Park, and one part is a heather-
covered Common, of which I have made eight oil sketches on my little
panels, so that I have had the pleasure of working hard and enjoying the
society of most delightful people. There were always other guests at dinner
besides the house party, and the average number who sat down was
eighteen. Besides Mrs. Ross were Mr. and Mrs. Layard, he the Nineveh
explorer, and now Ambassador at Madrid, the Poynters, R.A., the Misses
Duff Gordon, and others, in the house. Mrs. Burr with her great tact allowed
me to absent myself between breakfast and tea, taking my sandwiches and
paints with me to the moor.”
Days at Worthing followed, where my mother and I painted all day on
the Downs, I with my “Balaclava” in view, which required a valley and low
hills. My mother’s help was of great value, as I had not had much time to
practise landscape up to then. Then came my visit, with Alice, to Newcastle,
where “Quatre Bras” was being exhibited, to be followed by our visit at
Mrs. Ross’s Villa near Florence, whither she had invited us when at
Aldermaston, to see the fêtes in honour of Michael Angelo.
“We left for Newcastle by the ‘Flying Scotchman’ from King’s Cross at
10 a.m., and had a flying shot at Peterborough and York Cathedrals, and a
fine flying view of Durham. Newcastle impressed us very much as we
thundered over the iron bridge across the Tyne and looked down on the
smoke-shrouded, red-roofed city belching forth black and brown smoke and
jets of white steam in all directions. It rises in fine masses up from the
turbid flood of the dark river, and has a lurid grandeur quite novel to us. I
could not help admiring it, though, as it were, under protest, for it seems to
me something like a sin to obscure the light of Heaven when it is not
necessary. The laws for consuming factory smoke are quite disregarded
here. Mrs. Mawson, representing the firm at whose gallery ‘Quatre Bras’ is
being exhibited, was awaiting our arrival, and was to be our hostess. We
were honoured and fêted in the way of the warm-hearted North. Nothing
could have been more successful than our visit in its way. These
Northerners are most hospitable, and we are delighted with them. They
have quite a cachet of their own, so cultured and well read on the top of
their intense commercialism—far more responsive in conversation than
many society people I know ‘down South.’ We had a day at Durham under
Mrs. Mawson’s wing, visiting that finest of all English Cathedrals (to my
mind), and the Bishop’s palace, etc. We rested at the Dean’s, where, of
course, I was asked for my autograph. I already find how interested the
people are about here, more even than in other parts where I have been.
Durham is a place I loved before I saw it. The way that grand mass of
Norman architecture rises abruptly from the woods that slope sheer down to
the calm river is a unique thing. Of course, the smoky atmosphere makes
architectural ornament look shallow by dimming the deep shadows of
carvings, etc.—a great pity. On our return we took another lion en passant
—my picture at Newcastle, and most delighted I was to find it so well
lighted. I may say I have never seen it properly before, because it never
looked so well in my studio, and as to the Black Hole——! What people
they are up here for shaking hands! When some one is brought up to me the
introducer puts it in this way: ‘Mr. So-and-So wishes very much to have the
honour of shaking hands with you, Miss Thompson.’ There is a straight-
forward ring in their speech which I like.”
We were up one morning at 4.30 to be off to Scotland for the day. At
Berwick the rainy weather lifted and we were delighted by the look of the
old Border town on its promontory by the broad and shining Tweed.
Passing over the long bridge, which has such a fine effect spanning the
river, we were pleased to find ourselves in a country new to us. Edinburgh
struck us very much, for we had never quite believed in it, and thought it
was “all the brag of the Scotch,” but we were converted. It is so like a fine
old Continental city—nothing reminds one of England, and yet there is a
Scotchiness about it which gives it a sentiment of its own. Our towns are, as
a rule, so poorly situated, but Edinburgh has the advantage of being built on
steep hills and of being back-grounded by great crags which give it a most
majestic look. The grey colour of the city is fine, and the houses, nearly all
gabled and very tall, are exceedingly picturesque, and none have those vile,
black, wriggling chimney pots which disfigure what sky lines our towns
may have. I was delighted to see so many women with white caps and
tartan shawls and the children barefoot; picturesque horse harness; plenty of
kilted soldiers.
We did all the lions, including the garrison fortress where the Cameron
Highlanders were, and where Colonel Miller, of Parkhurst memory, came
out, very pleased to speak to me and escort us about. He had the water
colour I gave him of his charger, done at Parkhurst in the old Ventnor days.
Our return to Newcastle was made in glorious sunshine, and we greedily
devoured the peculiarly sweet and remote-looking scenes we passed
through. I shall long remember Newcastle at sunset on that evening. Then, I
will say, the smoke looked grand. They asked me to look at my picture by
gas light. The sixpenny crowd was there, the men touching their caps as I
passed. In the street they formed a lane for me to pass to the carriage.
“What nice people!” I exclaim in the Diary.
All the morning of our departure I was employed in sitting for my
photograph, looking at productions of local artists and calling on the Bishop
and the Protestant Vicar. One man had carved a chair which was to be
dedicated to me. I was quaintly enthroned on it. All this was done on our
way to the station, where we lunched under dozens of eyes, and on the
platform a crowd was assembled. I read: “Several local dignitaries were
introduced and ‘shook hands,’ as also the ‘Gentlemen of the local Press.’ As
I said a few words to each the crowd saw me over the barriers, which made
me get quite hot and I was rather glad when the train drew up and we could
get into our carriage. The farewell handshakings at the door may be
imagined. We left in a cloud of waving handkerchiefs and hats. I don’t
know that I respond sufficiently to all this. Frankly, my picture being made
so much of pleases me most satisfactorily, but the personal part of the
tribute makes me curiously uncomfortable when coming in this way.”
Ruskin wrote a pamphlet on that year’s Academy in which he told the
world that he had approached “Quatre Bras” with “iniquitous prejudice” as
being the work of a woman. He had always held that no woman could paint,
and he accounted for my work being what he found it as being that of an
Amazon. I was very pleased to see myself in the character of an Amazon.
CHAPTER XI

TO FLORENCE AND BACK

WE started on our most delightful journey to Florence early in
September of that year to assist at the Michael Angelo fêtes as the guests of
dear Mrs. Janet Ross and the Marchese della Stufa, who, with Mr. Ross,
inhabited in the summer the delicious old villa of Castagnolo, at Lastra a
Signa, six miles on the Pisan side of my beloved Florence. Of course, I give
page after page in the Diary to our journey across Italy under the Alps and
the Apennines. To the modern motorist it must all sound slow, though we
did travel by rail! Above all the lovely things we saw on our way by the
Turin-Bologna line, I think Parma, rising from the banks of a shallow river,
glowing in sunshine and palpitating jewel-like shade, holds pride of place
for noontide beauty. After Modena came the deeper loveliness of the
afternoon, and then Bologna, mellowed by the rosy tints of early evening.
Then the sunset and then the tender moon.
By moonlight we crossed the Apennines, and to the sound of the droning
summer beetle—an extraordinarily penetrating sound, which I declared
makes itself heard above the railway noises, we descended into the Garden
of Italy, slowly, under powerful brakes. At ten we reached Florence, and in
the crowd on the platform a tall, distinguished-looking man bowed to me.
“Miss Thompson?” “Yes.” It was the Marchese, and lo! behind him, who
should there be but my old master, Bellucci. What a warm welcome they
gave us. Of course, our luggage had stuck at the douane at Modane, and
was telegraphed for. No help for it; we must do without it for a day or two.
We got into the carriage which was awaiting us, and the Marchese into his
little pony trap, and off we went flying for a mysterious, dream-like drive in
misty moonlight, we in front and our host behind, jingle-jingling merrily
with the pleasant monotony of his lion-maned little pony’s canter. We could
not believe the drive was a real one. It was too much joy to be at Florence—
too good to be true. But how tired we were!
At last we drove up to the great towered villa, an old-fashioned
Florentine ancestral place, which has been the home of the Della Stufas for
generations, and there, in the great doorway, stood Mrs. Ross, welcoming
us most cordially to “Castagnolo.” We passed through frescoed rooms and
passages, dimly lighted with oil lamps of genuine old Tuscan patterns, and
were delighted with our bedrooms—enormous, brick-paved and airy. There
we made a show of tidying ourselves, and went down to a fruit-decked
supper, though hardly able to sit up for sleep. How kind they were to us! We
felt quite at home at once.
“September 12th.—After Mass at the picturesque little chapel which,
with the vicario’s dwelling, abuts on the fattoria wing of the villa, we drove
into Florence with Mrs. Ross and the Marchese, whom we find the typical
Italian patrician of the high school. We were rigged out in Mrs. Ross’s
frocks, which didn’t fit us at all. But what was to be done? Provoking girls!
It was a dear, hot, dusty, dazzling old Florentine drive, bless it! and we were
very pleased. Florence was en fête and all imbandierata and hung with the
usual coloured draperies, and all joyous with church bells and military
bands. The concert in honour of Michael Angelo (the fêtes began to-day)
was held in the Palazzo Vecchio, and very excellent music they gave us, the
audience bursting out in applause before some of the best pieces were quite
finished in that refreshingly spontaneous way Italians have. After the
concert we loitered about the piazza looking at the ever-moving and
chattering crowd in the deep, transparent shade and dazzling sunshine. It
was a glorious sight, with the white statues of the fountain rising into the
sunlight against houses hung mostly with very beautiful yellow draperies. I
stood at the top of the steps of the Loggia de’ Lanzi, and, resting my book
on the pedestal of one of the lions, I made a rough sketch of the scene,
keeping the Graphic engagement in view. I subsequently took another of
the Michael Angelo procession passing the Ponte alle Grazie on its way
from Santa Croce to the new ‘Piazzale Michel Angelo,’ which they have
made since we were here before, on the height of San Miniato. It was a
pretty procession on account of the rich banners. A day full of charming
sights and melodious sounds.”
The great doings of the last day of the fêtes were the illuminations in the
moonlit evening. They were artistically done, and we had a feast of them,
taking a long, slow drive to the piazzale by the new zigzag. Michael Angelo
was remembered at every turn, and the places he fortified were especially
marked out by lovely lights, all more or less soft and glowing. Not a vile
gas jet to be seen anywhere. The city was not illuminated, nor was anything,
with few exceptions, save the lines of the great man’s fortifications. The old
white banner of Florence, with the Giglio, floated above the tricolour on the
heights which Michael Angelo defended in person. The effect, especially on
the church of San Miniato, of golden lamps making all the surfaces aglow,
as if the walls were transparent, and of the green-blue moonlight above, was
a thing as lovely as can be seen on this earth. It was a thoroughly Italian
festival. We were charmed with the people; no pushing in the crowds,
which enjoyed themselves very much. They made way for us when they
saw we were foreigners.
We stayed at Castagnolo nearly all through the vintage, pressed from one
week to another to linger, though I made many attempts to go on account of
beginning my “Balaclava.” The fascination of Castagnolo was intense, and
we had certainly a happy experience. I sketched hard every day in the
garden, the vineyards, and the old courtyard where the most picturesque
vintage incidents occurred, with the white oxen, the wine pressing, and the
bare-legged, merry contadini, all in an atmosphere scented with the
fermenting grapes. Everything in the Cortile was dyed with the wine in the
making. I loved to lean over the great vats and inhale that wholesome
effluence, listening to the low sea-like murmur of the fermentation. On the
days when we helped to pick the grapes on the hillside (and “helping
ourselves” at the same time) we had collazione there, a little picnic, with the
indispensable guitar and post-prandial cigarette. Every one made the most
of this blessed time, as such moments should be made the most of when
they are given us, I think. Young Italians often dined at the villa, and the
evenings were spent in singing stornelli and rispetti until midnight to the
guitar, every one of these young fellows having a nice voice. They were
merry, pleasant creatures.
One of the Balaklava Six-hundred.

Nothing but the stern necessity of returning to work could have kept me
from seeing the vintage out. We left most regretfully on October 4th, taking
Genoa and our dear step-sister on the way. Even as it was our lingering in
Italy made me too late, as things turned out, for the Academy!
October 19th has this entry: “Began my ‘Balaclava’ cartoon to-day.
Marked all the positions of the men and horses. My trip to Italy and the
glorious and happy and healthy life I have led there, and the utter change of
scene and occupation, have done me priceless good, and at last I feel like
going at this picture con amore. I was in hopes this happy result would be
obtained.” “Balaclava” was painted for Mr. Whitehead, of Manchester. I
had owed him a picture from the time I exhibited “Missing.” It was to be
the same size, and for the same price as that work, and I was in honour
bound to fulfil my contract! So I again brought forward the “Dawn of
Sedan,” although my prices were now so enlarged that £80 had become
quite out of proportion, even for a simple subject like that. However, after
long parleys, and on account of Mr. Whitehead’s repudiation of the Sedan
subject, it was agreed that “Balaclava” should be his, at the new scale
altogether. The Fine Art Society (late Dickenson & Co.) gave Mr.
Whitehead £3,000 for the copyright, and engaged the great Stacpoole, as
before, to execute the engraving.
I was very sorry that the picture was not ready for Sending-in Day at the
Academy. No doubt the fuss that was made about it, and my having begun a
month too late, put me off; but, be that as it may, I was a good deal
disturbed towards the end, and had to exhibit “Balaclava” at the private
gallery of the purchasers of the copyright in Bond Street. This gave me
more time to finish. I had my own Private View on April 20th, 1876: “The
picture is disappointing to me. In vain I call to mind all the things that
judges of art have said about this being the best thing I have yet painted.
Can one never be happy when the work is done? This day was only for our
friends and was no test. Still, there was what may be called a sensation.
Virginia Gabriel, the composer, was led out of the room by her husband in
tears! One officer who had been through the charge told a friend he would
never have come if he had known how like the real thing it was. Curiously
enough, another said that after the stress of Inkermann a soldier had come
up to his horse and leant his face against it exactly as I have the man doing
to the left of my picture.
“April 22nd.—An enormous number of people at the Society’s Private
View and some of the morning papers blossoming out in the most beautiful
notices, ever so long, and I getting a little reassured.” A day later: “Went to
lunch at Mrs. Mitchell’s, who invited me at the Private View, next door to
Lady Raglan’s, her great friend. Two distinguished officers were there to
meet me, and we had a pleasant chat.” And this is all I say! One of the two
was Major W. F. Butler, author of “The Great Lone Land.”
The London season went by full of society doings. Our mother had long
been “At Home” on Wednesdays, and much good music was heard at “The
Boltons,” South Kensington. Ruskin came to see us there. He and our
mother were often of the same way of thinking on many subjects, and I
remember seeing him gently clapping his hands at many points she made.
He was displeased with me on one occasion when, on his asking me which
of the Italian masters I had especially studied, I named Andrea del Sarto.
“Come into the corner and let me scold you,” were his disconcerting words.
Why? Of course, I was crestfallen, but, all the same, I wondered what could
be the matter with Andrea’s “Cenacolo” at San Salvi, or his frescoes at the
SS. Annunziata, or his “Madonna with St. Francis and St. John,” in the
Tribune of the Uffizi. The figure of the St. John is, to me, one of the most
adorable things in art. That gentle, manly face; that dignified pose; the
exquisite modelling of the hand, and the harmonious colours of the drapery
—what could be the matter with such work? I remember, at one of the
artistic London “At Homes,” Frith, R.A., coming up to me with a long face
to say, if I did not send to the Academy, I should lose my chance of election.
But I think the difficulties of electing a woman were great, and much
discussion must have been the consequence amongst the R.A.’s. However,
as it turned out, in 1879 I lost my election by two votes only! Since then I
think the door has been closed, and wisely. I returned to the studio on May
18th, for I could not lay down the brush for any amount of society doings.
Besides, I soon had to make preparations for “Inkermann.”
“Saturday, June 10th.—Saw Genl. Darby-Griffith, to get information
about Inkermann. I returned just in time to dress for the delightful Lord
Mayor’s Banquet to the Representatives of Art at the Mansion House, a
place of delightful recollections for me. Neither this year’s nor last year’s
banquet quite came up to the one of ‘The Roll Call’ year in point of
numbers and excitement, but it was most delightful and interesting to be in
that great gathering of artists and hear oneself gracefully alluded to in The
Lord Mayor’s speech and others. Marcus Stone sat on my left, and we had
really a thoroughly good conversation all through dinner such as I have
seldom embarked on, and I found, when I tried it, that I could talk pretty
well. He is a fine fellow, and simple-minded and genuine. My vis-à-vis was
Alma Tadema, with his remarkable-looking wife, like a lady out of one of
his own pictures; and many well-known heads wagged all around me. After
dinner and the speeches, Du Maurier, of Punch, suggested to the Lord
Mayor that we should get up a quadrille, which was instantly done, and the
friskier spirits amongst us had a nice dance. Du Maurier was my partner;
and on my left I had John Tenniel, so that I may be said to have been
supported by Punch both at the beginning and end of dinner, this being Du
Maurier’s simple and obvious joke, vide the post-turtle indulgence peculiar
to civic banquets. After a waltz we laggards at last took our departure in the
best spirits.”
I remember that in June we went to a most memorable High Mass, to
wit, the first to be celebrated in the Old Saxon Church of St. Etheldreda
since the days of the Reformation. This church was the second place of
Christian worship erected in London, if not in England, in the old Saxon
times. We were much impressed as the Gregorian Mass sounded once more
in the grey-stoned crypt. The upper church was not to be ready for years.
Those old grey stones woke up that morning which had so long been
smothered in the London clay.
Here follow too many descriptions in the Diary of dances, dinners and
other functions. They are superfluous. There were, however, some Tableaux
Vivants at an interesting house—Mrs. Bishop’s, a very intellectual woman,
much appreciated in society in general, and Catholic society in particular—
which may be recorded in this very personal narrative, for I had a funny
hand in a single-figure tableau which showed the dazed 11th Hussar who
figures in the foreground of my “Balaclava.” The man who stood for him in
the tableau had been my model for the picture, but to this day I feel the
irritation caused me by that man. In the picture I have him with his busby
pushed back, as it certainly would and should have been, off his heated
brow. But, while I was posing him for the tableau, every time I looked away
he rammed it down at the becoming “smart” angle. I got quite cross, and
insisted on the necessary push back. The wretch pretended to obey, but, just
before the curtain rose, rammed the busby down again, and utterly
destroyed the meaning of that figure! We didn’t want a representation of
Mr. So-and-so in the becoming uniform of a hussar, but my battered
trooper. The thing fell very flat. But tableaux, to my mind, are a mistake, in
many ways.
I often mention my pleasure in meeting Lord and Lady Denbigh, for they
were people after my own heart. Lady Denbigh was one of those women
one always looks at with a smile; she was so simpatica and true and
unworldly.
July 18th is noted as “a memorable day for Alice, for she and I spent the
afternoon at Tennyson’s! I say ‘for Alice’ because, as regards myself, the
event was not so delightful as a day at Aldershot. Tennyson has indeed
managed to shut himself off from the haunts of men, for, arrived at
Haslemere, a primitive little village, we had a six-mile drive up, up, over a
wild moor and through three gates leading to narrow, rutty lanes before we
dipped down to the big Gothic, lonely house overlooking a vast plain, with
Leith Hill in the distance. Tennyson had invited us through Aubrey de Vere,
the poet, and very apprehensive we were, and nervous, as we neared the
abode of a man reported to be such a bear to strangers. We first saw Mrs.
Tennyson, a gentle, invalid lady lying on her back on a sofa. After some
time the poet sent down word to ask us to come up to his sanctum, where he
received us with a rather hard stare, his clay pipe and long, black, straggling
hair being quite what I expected. He got up with a little difficulty, and when
we had sat down—he, we two and his most deferential son—he asked
which was the painter and which was the poet. After our answer, which
struck me as funny, as though we ought to have said, with a bob, ‘Please,
sir, I’m the painter,’ and ‘Please, sir, I’m the poet,’ he made a few
commonplace remarks about my pictures in a most sepulchral bass voice.
But he and Alice, in whom he was more interested, naturally, did most of
the talking; there was not much of that, though, for he evidently prefers to
answer a remark by a long look, and perhaps a slightly sneering smile, and
then an averted head. All this is not awe-inspiring, and looks rather put on.
We ceased to be frightened.
“There is no grandeur about Tennyson, no melancholy abstraction; and,
if I had made a demi-god of him, his personality would have much
disappointed me. Some of his poetry is so truly great that his manner seems
below it. The pauses in the conversation were long and frequent, and he did
not always seem to take in the meaning of a remark, so that I was relieved
when, after a good deal of staring and smiling at Alice in a way rather
trying to the patience, he acceded to her request and read us ‘The Passing of
Arthur.’ He was so long in finding the place, when his son at last found him
a copy of the book which suited him, and the tone he read in so deep and
monotonous, that I was much bored and longed for the hour of our
departure. He was vexed with Alice for choosing that poem, which he
seemed to think less of than of his later works, and he took the poor child to
task in a few words meant to be caustic, though they made us smile. But the
ice was melting. He seemed amused at us and we gratefully began to laugh
at some quaint phrases he levelled at us. Then he dropped the awe-inspiring
tone, and took us all over the grounds and gave us each a rose. He pitched
into us for our dresses which were too fashionable and tight to please him.
He pinned Alice against a pillar of the entrance to the house on our re-entry
from the garden to watch my back as I walked on with his son, pointing the
walking-stick of scorn at my skirt, the trimming of which particularly
roused his ire. Altogether I felt a great relief when we said goodbye to our
curious host with whom it was so difficult to carry on conversation, and to
know whether he liked us or not. Away, over the windy, twilight heath
behind the little ponies—away, away!”
At the beginning of August I began my studies for “The Return from
Inkermann.” The foreground I got at Worthing; and I had another visit to
Aldershot and many further conversations with Inkermann survivors—
officers of distinction. I am bound to say that these often contradicted each
other, and the rough sketches I made after each interview had to be re-
arranged over and over again. I read Dr. Russell’s account (The Times
correspondent) and sometimes I returned to my own conception, finding it
on the whole the most likely to be true.
I laugh even now at the recollection of two elderly sabreurs, one of them
a General in the Indian Army, who had a hot discussion in my studio, à
propos of my “Balaclava,” about the best use of the sabre. The Indian, who
was for slashing, twirled his umbrella so briskly, to illustrate his own
theory, that I feared for the picture which stood close by his sword arm. The
opposition umbrella illustrated “the point” theory.
Having finally fixed the whole composition of “Inkermann,” in sepia
on tinted paper the size of the future picture, I closed the studio on
August 25th and turned my face once more to Italy.
CHAPTER XII

AGAIN IN ITALY

MY sister and I tarried at Genoa on our way to Castagnolo where we
were to have again the joys of a Tuscan vintage. But between Genoa and
Florence lay our well-loved Porto Fino and, having an invitation from our
old friend Monty Brown, the English Consul and his young wife, to stay at
their castello there, we spent a week at that Eden. We were alone for part of
the time and thoroughly relished the situation, with only old Caterina, the
cook, and the dog, “Bismarck,” as company. Two Marianas in a moated
grange, with a difference. “He” came not, and so allowed us to clasp to our
hearts our chief delights—the sky, the sea, the olives and the joyous vines.
In those early days many of the deep windows had no glass, and one night,
when a staggering Mediterranean thunderstorm crashed down upon us, we
really didn’t like it and hid the knives under the table at dinner. Caterina
was saying her Rosary very loud in the kitchen. As we went up the winding
stairs to bed I carried the lamp, and was full of talk, when a gust of wind
blew the lamp out, and Alice laughed at my complete silence, more
eloquent than any words of alarm. We had every evening to expel curious
specimens of the lizard tribe that had come in, and turn over our pillows,
remembering the habits of the scorpion.
But that storm was the only one, and as to the sea, which three-parts
enveloped our little Promontory, its blue utterly baffled my poor paints. But
paint I did, on those little panels that we owe to Fortuny, so nicely fitting
into the box he invented. There was a little cape, crowned with a shrine to
Our Lady—“the Madonnetta” it was called—where I used to go daily to
inhale the ozone off the sea which thundered down below amongst the
brown “pudding-stone” rocks, at the base of a sheer precipice. The
“sounding deep.” Oh, the freshness, the health, the joy of that haunt of
mine! Our walks were perilous sometimes, the paths which almost
overhung the deep foaming sea being slippery with the sheddings of the
pines. At the “nasty bits” we had to hold on by shrubs and twigs, and haul
ourselves along by these always aromatic supports.
Admirable is the industry of the peasants all over Italy. Here on the
extreme point of Porto Fino wherever there was a tiny “pocket” of clay, a
cabbage or two or a vine with its black clusters of grapes toppling over the
abyss found foot-hold. We came one day upon a pretty girl on the very
verge of destruction, “holding on by her eyelids,” gathering figs with a
hooked stick, a demure pussy keeping her company by dozing calmly on a
branch of the fig tree. The walls built to support these handfuls of clay on
the face of the rock are a puzzle to me. Where did the men stand to build
them? It makes me giddy to think of it.
Paragi, the lovely rival of Monty’s robber stronghold, belonged to his
brother, and a fairer thing I never saw than Fred’s loggia with the slender
white marble columns, between which one saw the coast trending away to
La Spezzia. But “goodbye,” Porto Fino! On our way to Castagnolo, at
lovely Lastre a Signa, we paused at Pisa for a night.
“Pisa is a bald Florence, if I may say so; beautiful, but so empty and
lifeless. There are houses there quite peculiar, however, to Pisa, most
interesting for their local style. Very broad in effect are those flat blank
surfaces without mouldings. The frescoes on them, alas! are now merely
very beautiful blotches and stains of colour. We had ample time for a good
survey of the Duomo, Baptistery, Camposanto and Leaning Tower, all
vividly remembered from when I saw them as a little child. But I get very
tired by sight-seeing and don’t enjoy it much. What I like is to sit by the
hour in a place, sketching or meditating. Besides, I had been kept nearly all
night awake at the Albergo Minerva by railway whistles, ducks, parrots,
cats, dogs, cocks and hens, so that I was only at half power and I slept most
of the way to Signa.
“At the station a carriage fitted, for the heat, with cool-looking brown
holland curtains was awaiting us on the chance of our coming, and we were
soon greeted at dear Castagnolo by Mrs. Ross. Very good of her to show so
much happy welcome seeing we had been expected the evening before, not
to say for many days, and only our luggage had turned up! The Marchese,
who had to go into Florence this morning for the day, had gone down to
meet us last evening, and returned with the disconcerting announcement
that, whereas we had arrived last year without our luggage, this year the
luggage had arrived without us. ‘I bauli sono giunti ma le bambine—Chè!’”