Affective Computing
Focus on Emotion Expression,
Synthesis and Recognition
Affective Computing
Focus on Emotion Expression,
Synthesis and Recognition
Edited by
Jimmy Or
Abstracting and non-profit use of the material is permitted with credit to the source. Statements and
opinions expressed in the chapters are those of the individual contributors and not necessarily those of
the editors or publisher. No responsibility is accepted for the accuracy of information contained in the
published articles. The publisher assumes no responsibility or liability for any damage or injury to persons
or property arising out of the use of any materials, instructions, methods or ideas contained inside. After
this work has been published by Advanced Robotic Systems International, authors have the right to
republish it, in whole or in part, in any publication of which they are an author or editor, and to make
other personal use of the work.
A catalogue record for this book is available from the Austrian Library.
Affective Computing, Emotion Expression, Synthesis and Recognition, Edited by Jimmy Or
p. cm.
ISBN 978-3-902613-23-3
1. Affective Computing. 2. Or, Jimmy.
Preface
Affective Computing is a branch of artificial intelligence that deals with the design
of systems and devices that can recognize, interpret, and process emotions. Since
the introduction of the term "affective computing" by Rosalind Picard at MIT in
1997, the research community in this field has grown rapidly. Affective Computing
is an important field because computer systems have become part of our daily
lives. As we now live in the Age of Information Overload, and computer systems
are becoming more complex, there is a need for more natural user interfaces for
overwhelmed computer users. Given that humans communicate with each other
by using not only speech but also, implicitly, their facial expressions and body
postures, machines that can understand human emotions and display affect
through these multimodal channels could be beneficial. If virtual agents and robots
are able to recognize and express emotions through these channels, the result will
be more natural human-machine communication. This will allow human users to
focus more on their tasks at hand.
This volume provides an overview of state of the art research in Affective Comput-
ing. It presents new ideas, original results and practical experiences in this increas-
ingly important research field. The book consists of 23 chapters categorized into
four sections. Since one of the most important means of human communication is
facial expression, the first section of this book (Chapters 1 to 7) presents research
on the synthesis and recognition of facial expressions. Given that we use not only the
face but also body movements to express ourselves, in the second section (Chap-
ters 8 to 11) we present research on the perception and generation of emotional ex-
pressions by using full-body motions. The third section of the book (Chapters 12 to
16) presents computational models on emotion, as well as findings from neurosci-
ence research. In the last section of the book (Chapters 17 to 22) we present applica-
tions related to affective computing.
A brief introduction to the book chapters follows:
Chapter 1 presents a probabilistic neural network classifier for 3D analysis of facial
expressions. By using 11 facial features and taking symmetry of the human face
into consideration, the 3D distance vectors based recognition system can achieve a
high recognition rate of over 90%. Chapter 2 provides a set of deterministic and
stochastic techniques that allow efficient recognition of facial expressions from
video sequences showing head motion. Chapter 3 reviews recent findings on
human-human interaction and demonstrates that the tangential aspects of an emo-
tional signal (such as gaze and the type of face that shows the expression) can af-
fect the perceived meaning of the expression. Findings displayed in this chapter
could contribute to the design of avatars and agents used in the human computer
interface. Chapter 4 presents an approach that uses a genetic algorithm and a neural
network for the recognition of emotion from the face. In particular, it focuses on
the eye and lip regions for the study of emotions. Chapter 5 proposes a system that
analyzes facial expressions based on topographic shape structure (eyebrow, eye,
nose and mouth) and the active texture.
Chapter 6 proposes a model of layered fuzzy facial expression generation (LFFEG)
to create expressive facial expressions for an agent in the affective human com-
puter interface. In this model, social, emotional and physiological layers contribute
to the generation of facial expression. Fuzzy theory is used to produce rich facial
expressions and personality for the virtual character. Based on recent findings that
the dynamics of facial expressions (such as timing, duration and intensity) play an
important role in the interpretation of facial expressions, Chapter 7 examines the
analysis of facial expressions from both computer vision and behavioral science
points of view. A technique that allows the synthesis of photo-realistic expressions of
various intensities is described.
In recent years, humanoid robots and simulated avatars have gained popularity.
Researchers try to develop both real and simulated humanoids that can behave
and communicate with humans more naturally. It is believed that a real humanoid
robot situated in the real world could better interact with humans. Given that we
also use whole body movements to express emotions, the next generation human-
oid robots should have a flexible spine and be able to express themselves by using
full body movements. Chapter 8 points out some of the challenges in developing
flexible spine humanoid robots for emotional expressions. Then, the chapter pre-
sents the development of emotional flexible spine humanoid robots based on find-
ings from research on belly dance. Results of psychological experiments on the
effect of a full-body spine robot on human perceptions are presented.
Chapter 9 provides a review of the cues that we use in the perception of affect
from body movements. Based on findings from psychology and neuroscience, the
authors raise the issue of whether giving a machine the ability to experience emo-
tions might help to accomplish reliable and efficient emotion recognition. Given
that human communications are multimodal, Chapter 10 reviews recent research
on systems that are capable of multiple input modalities and the use of alternative
channels to perceive affects. This is followed by a presentation of systems that are
capable of analyzing spontaneous input data in real world environments. Chapter
11 applies findings from art theory to the synthesis of emotional expressions for
virtual humans. Lights, shadows, composition and filters are used as part of the
expression of emotions. In addition, the chapter proposes the use of genetic algo-
rithms to map affective states to multimodal expressions.
Since the modeling of emotion has become important in affective computing,
Chapter 12 presents a computational model of emotion. The model is capable of in-
Acknowledgements
This book would not have been possible without the support of my colleagues and
friends. I owe a great debt to Atsuo Takanishi of Waseda University. He gave me
Jimmy Or
May 2008
Center for High-Performance Integrated Systems
Korea Advanced Institute of Science and Technology
Daejeon, Republic of Korea
Facial Expression Recognition Using 3D Facial Feature Distances
1. Introduction
Face plays an important role in human communication. Facial expressions and gestures
incorporate nonverbal information which contributes to human communication. By
recognizing the facial expressions from facial images, a number of applications in the field of
human computer interaction can be facilitated. Over the last two decades, developments
and prospects in the field of multimedia signal processing have attracted the attention
of many computer vision researchers to the problems of facial expression
recognition. The pioneering studies of Ekman in the late 1970s provided evidence for the
classification of the basic facial expressions. According to these studies, the basic facial
expressions are those representing happiness, sadness, anger, fear, surprise, disgust and
neutral. Facial Action Coding System (FACS) was developed by Ekman and Friesen to code
facial expressions in which the movements on the face are described by action units. This
work inspired many researchers to analyze facial expressions in 2D by means of image and
video processing, where by tracking of facial features and measuring the amount of facial
movements, they attempt to classify different facial expressions. Recent work on facial
expression analysis and recognition has used these seven basic expressions as the basis for
the proposed systems.
Almost all of the methods developed use the 2D distribution of facial features as input to a
classification system, and the outcome is one of the facial expression classes. They differ
mainly in the facial features selected and the classifiers used to distinguish among the
different facial expressions. Information extracted from 3D face models is rarely used in
facial expression recognition. This chapter considers techniques that use information
extracted from 3D space to analyze facial images for the recognition of facial
expressions.
The first part of the chapter introduces the methods of extracting information from 3D
models for facial expression recognition. The 3D distributions of the facial feature points
and the estimation of characteristic distances in order to represent the facial expressions are
explained by using a rich collection of illustrations including graphs, charts and face images.
The second part of the chapter introduces 3D distance-vector based facial expression
recognition. The architecture of the system is explained by the block diagrams and
flowcharts. Finally 3D distance-vector based facial expression recognition is compared with
the conventional methods available in the literature.
Fig.1. Emotion-specified facial expression [Yin et al., 2006]: 1-Neutral, 2-Anger, 3-Sadness, 4-
Surprise, 5- Happiness, 6- Disgust, 7- Fear.
Fig. 2. The 3D orientation of the facial feature points [Pandzic & Forchheimer, 2002].
in a consistent way. As a result, description schemes that utilize FAPs produce reasonable
results in terms of expression and speech related postures.
Table 2. Muscle Actions involved in the six basic expressions [Karpouzis et al.,2000].
In general, facial expressions and emotions can be described as a set of measurements (FDPs
and derived features) and transformations (FAPs) that can be considered atomic with
respect to the MPEG-4 standard. In this way, one can describe the anatomy of a human face,
as well as any animation parameters with the change in the positions of the facial feature
points, thus eliminating the need to explicitly specify the topology of the underlying
geometry. These facial feature points can then be mapped to automatically detected
measurements and indications of motion on a video sequence and thus help analyse or
reconstruct the emotion or expression recognized by the system.
MPEG-4 specifies 84 feature points on the neutral face. The main purpose of these feature
points is to provide spatial references to key positions on a human face. These 84 points
were chosen to best reflect the facial anatomy and movement mechanics of a human face.
The location of these feature points has to be known for any MPEG-4 compliant face model.
The feature points on the model should be located according to the points illustrated in
Figure 2. After a series of analyses of faces, we have concluded that mainly 15 FAPs are
affected by these expressions [Soyel et al., 2005].
These facial features move due to the contraction and expansion of facial muscles
whenever a facial expression changes. Table 2 illustrates the description of the basic
expressions using the MPEG-4 FAPs terminology.
Although muscle actions [Ekman & Friesen, 1978] are of high importance with respect
to facial animation, one is unable to track them analytically without resorting to explicit
electromagnetic sensors. However, a subset of them can be deduced from their visual
results, that is, the deformation of the facial tissue and the movement of some facial surface
points. This reasoning resembles the way that humans visually perceive emotions, by
noticing specific features in the most expressive areas of the face, the regions around the
eyes and the mouth. The seven basic expressions, as well as intermediate ones, employ facial
deformations strongly related to the movement of some prominent facial points that can
be automatically detected. These points can be mapped to a subset of the MPEG-4 feature
point set. It should be noted that MPEG-4 defines the neutral face as one in which all face
muscles are relaxed.
Fig. 3. 11-facial feature points: 1-Left corner of outer-lip contour, 2-Right corner of outer-lip
contour, 3-Middle point of outer upper-lip contour, 4- Middle point of outer lower-lip
contour, 5-Right corner of the right eye, 6-Left corner of the right eye, 7-Centre of upper
inner-right eyelid, 8-Centre of lower inner-right eyelid, 9-Uppermost point of the right
eyebrow, 10-Outermost point of right-face contour, 11- Outermost point of left-face contour.
Feature Point Group     Selected Feature Points
2 - Chin, inner lip     -
3 - Eyes                3.10 - centre of lower inner-right eyelid
                        3.11 - left corner of the right eye
                        3.12 - right corner of the right eye
                        3.14 - centre of upper inner-right eyelid
4 - Eyebrows            4.4 - uppermost point of the right eyebrow
5 - Cheek               -
6 - Tongue              -
7 - Spine               -
8 - Outer lip           8.1 - middle point of outer upper-lip contour
                        8.2 - middle point of outer lower-lip contour
                        8.3 - left corner of outer-lip contour
                        8.4 - right corner of outer-lip contour
9 - Nose, nostrils      -
10 - Ear                10.9 - outermost point of left-face contour
                        10.10 - outermost point of right-face contour
11 - Hair line          -
Table 3. Selected facial feature points.
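To make the distance-based representation concrete, the following sketch shows how a scale-normalized vector of pairwise 3D distances could be assembled from detected feature points such as those listed in Fig. 3. This is a minimal illustration, not the authors' implementation; in particular, the normalization by the largest distance is an assumption made here for clarity.

import numpy as np

def distance_vector(points_3d):
    """Build a normalized 3D distance vector from facial feature points.

    points_3d: (N, 3) array of 3D feature point coordinates
               (e.g., the 11 points listed in Fig. 3).
    Returns all pairwise Euclidean distances, divided by the largest
    distance so that the descriptor is scale-invariant (an assumption).
    """
    points_3d = np.asarray(points_3d, dtype=float)
    n = len(points_3d)
    dists = []
    for i in range(n):
        for j in range(i + 1, n):
            dists.append(np.linalg.norm(points_3d[i] - points_3d[j]))
    dists = np.array(dists)
    return dists / dists.max()

# Example with random points standing in for the 11 detected features.
rng = np.random.default_rng(0)
feature_points = rng.normal(size=(11, 3))
print(distance_vector(feature_points).shape)   # (55,) pairwise distances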
Table 5. Average confusion matrix using the NN classifier (BU-3DFE database) [Soyel &
Demirel, 2007].
When we compare the results of the proposed system with the results reported in [Wang et
al., 2006] which use the same 3D database through an LDA classifier, we can see that our
method outperforms the recognition rates in Table 6 for all of the facial expressions except
the Happy case. Both systems give the same performance for the “Happy” facial expression.
Note that the classifier in [Wang et al., 2006] does not consider the Neutral case as an
expression, which gives an advantage to the approach.
The average recognition rate of the proposed system is 91.3%, whereas the average
performance of the method given in [Wang et al., 2006] remains at 83.6% for the recognition
of facial expressions on the same 3D database.
Table 6. Average confusion matrix of the LDA-based classifier in [Wang et al., 2006].
5. Conclusion
In this chapter we have shown that a probabilistic neural network classifier can be used for
the 3D analysis of facial expressions without relying on all of the 84 facial feature points or on
an error-prone face pose normalization stage. Face deformation as well as facial muscle
contraction and expansion are important indicators of facial expression, and by using only 11
facial feature points and the symmetry of the human face, we are able to extract enough
information from a face image. Our results show that 3D distance-vector based recognition
outperforms similar systems based on 2D and 3D facial feature analysis. The average facial
expression recognition rate of the proposed system reaches 91.3%. The quantitative results
clearly suggest that the proposed approach produces encouraging results and opens a
promising direction for higher rate expression analysis.
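For readers unfamiliar with the classifier family used above, a probabilistic neural network is essentially a Parzen-window (kernel density) classifier. The following minimal sketch, with an assumed Gaussian kernel width sigma, illustrates the idea; it is not the authors' implementation.

import numpy as np

def pnn_classify(x, train_X, train_y, sigma=0.1):
    """Probabilistic neural network (Parzen-window) classification.

    x:        (d,) feature vector to classify (e.g., a 3D distance vector).
    train_X:  (n, d) training feature vectors.
    train_y:  (n,) integer class labels (e.g., the seven expressions).
    sigma:    Gaussian kernel width (an assumed value, tuned in practice).
    Returns the label whose summed kernel response is largest.
    """
    train_X = np.asarray(train_X, dtype=float)
    train_y = np.asarray(train_y)
    sq_dists = np.sum((train_X - x) ** 2, axis=1)
    kernels = np.exp(-sq_dists / (2.0 * sigma ** 2))
    classes = np.unique(train_y)
    scores = [kernels[train_y == c].sum() for c in classes]
    return classes[int(np.argmax(scores))]

# Toy usage: two well-separated classes.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.05, size=(20, 5)),
               rng.normal(1, 0.05, size=(20, 5))])
y = np.array([0] * 20 + [1] * 20)
print(pnn_classify(np.full(5, 0.9), X, y, sigma=0.2))   # expected: 1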
6. References
Ekman, P. & Friesen, W. (1976). Pictures of Facial Affect, Consulting Psychologists Press,
Palo Alto, CA
Ekman, P. & Friesen, W. (1978). The Facial Action Coding System: A Technique for the
Measurement of Facial Movement, Consulting Psychologists Press, San Francisco
Rumelhart, D.; Hinton, G. & Williams, R. (1986). Learning internal representations by error
propagation, In: Parallel Distributed Processing, D. Rumelhart and J. McClelland (Eds.),
pp. 318-362, The MIT Press, Cambridge, MA
Donato, G.; Bartlett, M.; Hager, J.; Ekman, P. & Sejnowski, T. (1999). Classifying facial
actions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(10), pp. 974-989
Lyons, M.; Budynek, J. & Akamatsu, S. (1999). Automatic classification of single facial
images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21, pp. 1357-1362
Karpouzis, K.; Tsapatsoulis, N. & Kollias, S. (2000). Moving to Continuous Facial Expression
Space using the MPEG-4 Facial Definition Parameter (FDP) Set, In: Proceedings of
Electronic Imaging, San Jose, USA
Braathen, B.; Bartlett, M.; Littlewort, G.; Smith, E. & Movellan, J. (2002). An approach to
automatic recognition of spontaneous facial actions, In: Proceedings of the International
Conference on FGR, pp. 345-350, USA
Pandzic, I. & Forchheimer, R. (Eds.) (2002). MPEG-4 Facial Animation: The Standard,
Implementation and Applications, Wiley
Fasel, B. & Luettin, J. (2003). Automatic facial expression analysis: A survey. Pattern
Recognition, 36(1), pp. 259-275
Pantic, M. & Rothkrantz, L. (2004). Facial action recognition for facial expression analysis
from static face images. IEEE Transactions on Systems, Man, and Cybernetics - Part B:
Cybernetics, 34, pp. 1449-1461
Soyel, H.; Yurtkan, K.; Demirel, H.; Ozkaramanli, H.; Uyguroglu, E. & Varoglu, M. (2005).
Face Modeling and Animation for MPEG Compliant Model Based Video Coding, In:
Proceedings of the IASTED International Conference on Computer Graphics and Imaging
Yin, L.; Wei, X.; Sun, Y.; Wang, J. & Rosato, M. (2006). A 3D facial expression database for
facial behavior research, In: Proceedings of the International Conference on FGR, pp.
211-216, UK
Wang, J.; Yin, L.; Wei, X. & Sun, Y. (2006). 3D Facial Expression Recognition Based on
Primitive Surface Feature Distribution, In: Proceedings of IEEE CVPR'06, Volume 2, pp.
1399-1406
Soyel, H. & Demirel, H. (2007). Facial Expression Recognition Using 3D Facial Feature
Distances, Lecture Notes in Computer Science (ICIAR 07), vol. 4633, pp. 831-838
Facial Expression Recognition in the Presence of Head Motion
1. Introduction
The human face has attracted attention in a number of areas including psychology,
computer vision, human-computer interaction (HCI) and computer graphics (Chandrasiri et
al, 2004). As facial expressions are the direct means of communicating emotions, computer
analysis of facial expressions is an indispensable part of HCI designs. It is crucial for
computers to be able to interact with the users, in a way similar to human-to-human
interaction. Human-machine interfaces will require an increasingly good understanding of a
subject's behavior so that machines can react accordingly. Although humans detect and
analyze faces and facial expressions in a scene with little or no effort, development of an
automated system that accomplishes this task is rather difficult.
One challenge is to construct robust, real-time, fully automatic systems to track the facial
features and expressions. Many computer vision researchers have been working on tracking
and recognition of the whole face or parts of the face. Within the past two decades, much
work has been done on automatic recognition of facial expression. The initial 2D methods
had limited success, mainly because of their dependency on the camera viewing angle. One of
the main motivations behind 3D methods for face or expression recognition is to enable a
broader range of camera viewing angles (Blanz & Vetter, 2003; Gokturk et al., 2002; Lu et
al., 2006; Moreno et al., 2002; Wang et al., 2004; Wen & Huang, 2003; Yilmaz et al., 2002).
To classify expressions in static images many techniques have been proposed, such as those
based on neural networks (Tian et al., 2001), Gabor wavelets (Bartlett et al., 2004), and
Adaboost (Wang et al., 2004). Recently, more attention has been given to modeling facial
deformation in dynamic scenarios, since it is argued that information based on dynamics is
richer than that provided by static images. Static image classifiers use feature vectors related
to a single frame to perform classification (Lyons et al., 1999). Temporal classifiers try to
capture the temporal pattern in the sequence of feature vectors related to each frame. These
include the Hidden Markov Model (HMM) based methods (Cohen et al., 2003) and Dynamic
Bayesian Networks (DBNs) (Zhang & Ji, 2005). In (Cohen et al., 2003), the authors introduce
facial expression recognition from live video input using temporal cues. They propose a
new HMM architecture for automatically segmenting and recognizing human facial
expression from video sequences. The architecture performs both segmentation and
recognition of the facial expressions automatically using a multi-level architecture
composed of an HMM layer and a Markov model layer. In (Zhang & Ji, 2005), the authors
present a new approach to spontaneous facial expression understanding in image
sequences. Facial feature detection and tracking is based on active infrared
illumination. Modeling the dynamic behavior of facial expressions in image sequences falls
within the framework of information fusion with DBNs. In (Xiang et al., 2008), the authors
propose a temporal classifier based on fuzzy C-means clustering, where the features are
given by the Fourier transform.
Surveys of facial expression recognition methods can be found in (Fasel & Luettin, 2003;
Pantic & Rothkrantz, 2000). A number of earlier systems were based on facial motion
encoded as a dense flow between successive image frames. However, flow estimates are
easily disturbed by illumination changes and non-rigid motion. In (Yacoob & Davis, 1996),
the authors compute optical flow of regions on the face, then they use a rule-based classifier
to recognize the six basic facial expressions. Extracting and tracking facial actions in a video
can be done in several ways. In (Bascle & Black, 1998), the authors use active contours for
tracking the performer's facial deformations. In (Ahlberg, 2002), the author retrieves facial
actions using a variant of Active Appearance Models. In (Liao & Cohen, 2005), the authors
used a graphical model for modeling the interdependencies of defined facial regions for
characterizing facial gestures under varying pose. The dominant paradigm involves
computing a time-varying description of facial actions/features from which the expression
can be recognized; that is to say, the tracking process is performed prior to the recognition
process (Dornaika & Davoine, 2005; Zhang & Ji, 2005).
However, the results of both processes affect each other in various ways. Since these two
problems are interdependent, solving them simultaneously increases reliability and
robustness of the results. Such robustness is required when perturbing factors such as
partial occlusions, ultra-rapid movements and video streaming discontinuity may affect the
input data. Although the idea of merging tracking and recognition is not new, our work
addresses two complicated tasks, namely tracking the facial actions and recognizing
expression over time in a monocular video sequence.
In the literature, simultaneous tracking and recognition has been used in simple cases. For
example, (North et al., 2000) employs a particle-filter-based algorithm for tracking and
recognizing the motion class of a juggled ball in 2D. Another example is given in (Zhou et
al., 2003); this work proposes a framework allowing the simultaneous tracking and
recognition of human faces using a particle filtering method. The recognition consists of
determining a person's identity, which is fixed for the whole probe video. The authors use a
mixed state vector formed by the 2D global face motion (affine transform) and an identity
variable. However, this work does not address either facial deformation or facial expression
recognition.
In this chapter, we describe two frameworks for facial expression recognition given natural
head motion. Both frameworks are texture- and view-independent. The first framework
exploits the temporal representation of tracked facial action in order to infer the current
facial expression in a deterministic way. The second framework proposes a novel paradigm
in which facial action tracking and expression recognition are simultaneously performed.
The second framework consists of two stages. First, the 3D head pose is estimated using a
deterministic approach based on the principles of Online Appearance Models (OAMs).
Second, the facial actions and expression are simultaneously estimated using a stochastic
approach based on a particle filter adopting mixed states (Isard & Blake, 1998). This
proposed framework is simple, efficient and robust with respect to head motion given that
(1) the dynamic models directly relate the facial actions to the universal expressions, (2) the
learning stage does not deal with facial images but only concerns the estimation of auto-
regressive models from sequences of facial actions, which is carried out using closed- from
solutions, and (3) facial actions are related to a deformable 3D model and not to entities
measured in the image plane.
g(τs, τa) = ḡ + S τs + A τa                                                      (1)
where ḡ is the standard shape of the model, τs and τa are the shape and animation control
vectors, respectively, and the columns of S and A are the Shape and Animation Units. A
Shape Unit provides a means of deforming the 3D wireframe so as to be able to adapt eye
width, head width, eye separation distance, etc. Thus, the term S τ s accounts for shape
variability (inter-person variability) while the term A τ a accounts for the facial animation
(intra-person variability). The shape and animation variabilities can be approximated well
enough for practical purposes by this linear relation. Also, we assume that the two kinds of
variability are independent. With this model, the ideal neutral face configuration is
represented by τ a = 0. The shape modes were created manually to accommodate the
subjectively most important changes in facial shape (face height/width ratio, horizontal and
vertical positions of facial features, eye separation distance). Even though a PCA was
initially performed on manually adapted models in order to compute the shape modes, we
preferred to consider the Candide model with manually created shape modes with semantic
signification that are easy to use by human operators who need to adapt the 3D mesh to
facial images. The animation modes were measured from pictorial examples in the Facial
Action Coding System (FACS) (Ekman & Friesen, 1977).
In this study, we use twelve modes for the facial Shape Units matrix S and six modes for the
facial Animation Units (AUs) matrix A. Without loss of generality, we have chosen the six
following AUs: lower lip depressor, lip stretcher, lip corner depressor, upper lip raiser,
eyebrow lowerer and outer eyebrow raiser. These AUs are enough to cover most common
facial animations (mouth and eyebrow movements). Moreover, they are essential for
conveying emotions. The effects of the Shape Units and the six Animation Units on the 3D
wireframe model are illustrated in Figure 1.
Figure 1: First row: Facial Shape units (neutral shape, mouth width, eyes width, eyes vertical
position, eye separation distance, head height). Second and third rows: Positive and
negative perturbations of Facial Action Units (Brow lowerer, Outer brow raiser, Jaw drop,
Upper lip raiser, Lip corner depressor, Lip stretcher).
In equation (1), the 3D shape is expressed in a local coordinate system. However, one should
relate the 3D coordinates to the image coordinate system. To this end, we adopt the weak
perspective projection model. We neglect the perspective effects since the depth variation of
the face can be considered as small compared to its absolute depth. Therefore, the mapping
between the 3D face model and the image is given by a 2×4 matrix, M, encapsulating both
the 3D head pose and the camera parameters.
Thus, a 3D vertex Pi = (Xi, Yi, Zi)T ∈ g will be projected onto the image point pi = (ui, vi)T
given by:
pi = M [Xi, Yi, Zi, 1]T                                                          (2)
For a given subject, τs is constant. Estimating τs can be carried out using either feature-based
(Lu et al., 2001) or featureless approaches (Ahlberg, 2002). In our work, we assume that the
control vector τs is already known for every subject, and it is set manually using for instance
the face in the first frame of the video sequence (the Candide model and target face shapes
are aligned manually). Therefore, Equation (1) becomes:
g(τa) = gs + A τa                                                                (3)
where gs represents the static shape of the face, i.e., the neutral face configuration. Thus, the state
of the 3D wireframe model is given by the 3D head pose parameters (three rotations and
three translations) and the animation control vector τa. This is given by the 12-dimensional
vector b:
b = [θx, θy, θz, tx, ty, tz, τaT]T                                               (4)
  = [hT, τaT]T                                                                   (5)
where the vector h represents the six degrees of freedom associated with the 3D head pose.
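A minimal sketch of how this model state could be applied in practice is given below: the neutral shape is deformed by the animation units and each vertex is projected with a 2×4 matrix. The specific numbers (vertex count, number of animation units, the matrix M) are placeholders chosen for illustration; the real Candide model and projection matrix come from the tracker.

import numpy as np

def deform_and_project(g_s, A, tau_a, M):
    """Apply the shape deformation of Eq. (3) and the projection of Eq. (2).

    g_s:   (3*n,) stacked neutral (static) shape, n vertices.
    A:     (3*n, k) animation unit matrix (k animation modes).
    tau_a: (k,) animation control vector.
    M:     (2, 4) weak perspective projection matrix (pose + camera).
    Returns an (n, 2) array of image points.
    """
    g = g_s + A @ tau_a                       # deformed 3D shape
    vertices = g.reshape(-1, 3)
    homog = np.hstack([vertices, np.ones((len(vertices), 1))])
    return (M @ homog.T).T                    # 2D projection

# Placeholder model with 5 vertices and 6 animation units.
n, k = 5, 6
rng = np.random.default_rng(2)
g_s = rng.normal(size=3 * n)
A = rng.normal(scale=0.1, size=(3 * n, k))
tau_a = np.zeros(k)                           # neutral configuration
M = np.array([[1.0, 0.0, 0.0, 160.0],
              [0.0, 1.0, 0.0, 120.0]])        # assumed scale/translation only
print(deform_and_project(g_s, A, tau_a, M).shape)   # (5, 2)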
Figure 2: (a) an input image with correct adaptation of the 3D model. (b) the corresponding
shape-free facial image.
x = W(y, b)                                                                      (6)
where x denotes the shape-free patch and b denotes the geometrical parameters. Several
resolution levels can be chosen for the shape-free patches. The reported results are obtained
with a shape-free patch of 5392 pixels. Regarding photometric transformations, a zero-mean
unit-variance normalization is used to partially compensate for contrast variations. The
complete image transformation is implemented as follows: (i) transfer the raw brightness
facial patch y using the piece-wise affine transform associated with the vector b, and (ii)
perform the gray-level normalization of the obtained patch.
(7)
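As a small sketch of step (ii) above, the gray-level normalization could look as follows; step (i), the piece-wise affine warp, depends on the mesh and is only assumed here, so the sketch works directly on an already warped patch.

import numpy as np

def normalize_patch(patch):
    """Zero-mean, unit-variance normalization of a shape-free facial patch.

    patch: array of raw brightness values obtained after the piece-wise
           affine warp (the warp itself is not shown here).
    """
    patch = np.asarray(patch, dtype=float)
    std = patch.std()
    if std < 1e-8:                      # guard against a constant patch
        return patch - patch.mean()
    return (patch - patch.mean()) / std

raw_patch = np.random.default_rng(3).uniform(0, 255, size=5392)
x = normalize_patch(raw_patch)
print(round(x.mean(), 6), round(x.std(), 6))   # ~0.0 and ~1.0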
The estimation of b̂ t from the sequence of images will be presented in Section 2.4. b̂ 0 is set
manually, according to the face in the first video frame. The facial texture model
(appearance model) associated with the shape-free facial patch at time t is time-varying in
that it models the appearances present in all observations x̂ up to time t - 1. This may be
required as a result, for instance, of illumination changes or out-of-plane rotated faces.
By assuming that the pixels within the shape-free patch are independent, we can model the
appearance using a multivariate Gaussian with a diagonal covariance matrix Σ. In other
words, this multivariate Gaussian is the distribution of the facial patches x̂ t. Let μ be the
Gaussian center and σ the vector containing the square root of the diagonal elements of the
covariance matrix Σ. μ and σ are d-vectors (d is the size of x).
In summary, the observation likelihood is written as:
(8)
(9)
We assume that the appearance model summarizes the past observations under an
exponential envelope with a forgetting factor α = 1 − exp(−log 2 / nh), where nh represents the
half-life of the envelope in frames (Jepson et al., 2003).
When the patch x̂ t is available at time t, the appearance is updated and used to track in the
next frame. It can be shown that the appearance model parameters, i.e., the μi's and σi's can
be updated from time t to time (t + 1) using the following equations (see (Jepson et al., 2003)
for more details on OAMs):
μt+1 = (1 − α) μt + α x̂t                                                        (10)
σ²t+1 = (1 − α) σ²t + α (x̂t − μt)²                                               (11)
This technique is simple, time-efficient and therefore suitable for real-time applications. The
appearance parameters reflect the most recent observations within a roughly L = 1 / α
window with exponential decay.
Note that μ is initialized with the first patch x̂ 0. However, equation (11) is not used with α
being a constant until the number of frames reaches a given value (e.g., the first 40 frames).
For these frames, the classical variance is used, that is, equation (11) is used with α being set
to 1/t.
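The recursive appearance update described above can be sketched as follows; the update equations and the α = 1/t start-up behaviour follow the text, but this is an illustrative reading of the scheme, not the authors' code.

import numpy as np

class OnlineAppearanceModel:
    """Per-pixel Gaussian appearance model with exponential forgetting."""

    def __init__(self, first_patch, half_life_frames=40):
        self.mu = np.asarray(first_patch, dtype=float).copy()
        self.var = np.ones_like(self.mu)        # assumed initial variance
        self.alpha = 1.0 - np.exp(-np.log(2.0) / half_life_frames)
        self.t = 1

    def update(self, patch, burn_in=40):
        """Update mu and sigma^2 with the current shape-free patch."""
        patch = np.asarray(patch, dtype=float)
        self.t += 1
        # For the first frames use the classical running estimate (alpha = 1/t).
        a = 1.0 / self.t if self.t <= burn_in else self.alpha
        # Update the variance with the old mean, then the mean itself.
        self.var = (1.0 - a) * self.var + a * (patch - self.mu) ** 2
        self.mu = (1.0 - a) * self.mu + a * patch

# Toy usage on random 5392-pixel patches.
rng = np.random.default_rng(4)
oam = OnlineAppearanceModel(rng.uniform(0, 1, 5392), half_life_frames=40)
for _ in range(100):
    oam.update(rng.uniform(0, 1, 5392))
print(oam.mu.shape, float(oam.var.mean()))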
Here we used a single Gaussian to model the appearance of each pixel in the shape-free
template. However, modeling the appearance with Gaussian mixtures can also be used at
the expense of an additional computational load (e.g., see (Lee, 2005; Zhou et al., 2004)).
e(bt) = Σi=1..d [ (xi(bt) − μi) / σi ]²                                          (12)
The above criterion can be minimized using an iterative gradient descent method where the
starting solution is set to the previous solution b̂ t-1. Handling outlier pixels (caused for
instance by occlusions) is performed by replacing the quadratic function with Huber's cost
function (Huber, 1981). The gradient matrix is computed for each input frame. It is
approximated by numerical differences. More details about this tracking method can be
found in (Dornaika & Davoine, 2006).
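A hedged sketch of this registration idea is shown below: a few gradient-descent steps on the normalized residual between the warped patch and the appearance mean, with Huber weighting to reduce the influence of outlier pixels. The warp function, step size and toy data are stand-ins, and the real tracker uses a numerically estimated gradient matrix of the Candide parameters rather than this generic routine.

import numpy as np

def huber_weights(r, k=1.345):
    """Per-residual Huber weights: 1 inside [-k, k], k/|r| outside."""
    r = np.abs(r)
    w = np.ones_like(r)
    w[r > k] = k / r[r > k]
    return w

def register(warp, b0, mu, sigma, steps=200, lr=0.01, eps=1e-5):
    """Minimize a robust, normalized matching error over the parameters b.

    warp(b) -> shape-free patch for geometric parameters b (user supplied).
    mu, sigma -> current appearance mean and standard deviation per pixel.
    """
    b = np.asarray(b0, dtype=float).copy()
    for _ in range(steps):
        r = (warp(b) - mu) / sigma                  # normalized residuals
        w = huber_weights(r)                        # down-weight outliers
        # Numerical Jacobian of the residuals with respect to b.
        J = np.empty((len(r), len(b)))
        for j in range(len(b)):
            db = np.zeros_like(b)
            db[j] = eps
            J[:, j] = ((warp(b + db) - mu) / sigma - r) / eps
        b -= lr * (J.T @ (w * r))                   # gradient descent step
    return b

# Toy example: the "patch" is a linear function of b.
rng = np.random.default_rng(5)
W = rng.normal(size=(50, 3))
b_true = np.array([0.3, -0.2, 0.1])
mu, sigma = W @ b_true, np.ones(50)
print(register(lambda b: W @ b, np.zeros(3), mu, sigma))  # close to b_true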
action parameters τ a (a 6-vector) associated with each training sequence, that is, the
temporal trajectories of the action parameters.
Figure 3 shows six videos belonging to the CMU database. The first five images depict the
estimated deformable model associated with the high magnitude of the five basic
expressions. Figure 4 shows the computed facial action parameters associated with three
training sequences: surprise, joy and anger. The training video sequences have an
interesting property: all performed expressions go from the neutral expression to a high
magnitude expression by going through a moderate magnitude around the middle of the
sequence.
Figure 3: Six video examples associated with the CMU database (Surprise, Sadness, Joy,
Disgust, Anger, Neutral). The first five images depict the high magnitude of the five basic
expressions.
Figure 4: Three examples (sequences) of learned facial action parameters as a function of
time. (a) Surprise expression. (b) Joy expression. (c) Anger expression.
Table 1: Confusion matrix for the dynamical facial expression classifier using the DTW
technique (the smallest average similarity). The learned trajectories were inferred from the
CMU database while the used test videos were created at our laboratory. The recognition
rate of dynamical expressions was 100% for all basic expressions except for the disgust
expression for which the recognition rate was 44%.
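For readers unfamiliar with Dynamic Time Warping, the sketch below computes the standard DTW distance between two facial-action trajectories (each a sequence of 6-vectors); a classifier of the kind summarized in Table 1 would assign the expression whose learned trajectory is the least dissimilar to the test trajectory. This is an illustrative implementation under those assumptions, not the authors' code.

import numpy as np

def dtw_distance(seq_a, seq_b):
    """Dynamic Time Warping distance between two multivariate sequences.

    seq_a: (Ta, d) trajectory of facial action vectors.
    seq_b: (Tb, d) trajectory of facial action vectors.
    """
    seq_a, seq_b = np.asarray(seq_a, float), np.asarray(seq_b, float)
    Ta, Tb = len(seq_a), len(seq_b)
    D = np.full((Ta + 1, Tb + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, Ta + 1):
        for j in range(1, Tb + 1):
            cost = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[Ta, Tb]

def classify_by_dtw(test_traj, learned_trajs):
    """Return the expression label with the smallest DTW distance."""
    return min(learned_trajs,
               key=lambda lab: dtw_distance(test_traj, learned_trajs[lab]))

# Toy usage with 6-dimensional action vectors.
rng = np.random.default_rng(6)
learned = {"joy": rng.normal(0.5, 0.1, (30, 6)),
           "surprise": rng.normal(-0.5, 0.1, (30, 6))}
test = rng.normal(0.5, 0.1, (25, 6))
print(classify_by_dtw(test, learned))   # expected: "joy"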
Note that T represents the duration of the aligned trajectories which will be fixed for all
examples. For example, a nominal duration of 18 frames for the aligned trajectories makes
the dimension of all examples eij (all i and j) equal to 108.
Applying a Principal Component Analysis to the set of all training trajectories yields the
mean trajectory ē as well as the principal modes of variation. Any training trajectory e can
be approximated by the principal modes Ul associated with the q largest eigenvalues,
e ≈ ē + Σl=1..q cl Ul.
In our work, the number of principal modes is chosen such that the variability of the
retained modes corresponds to 99% of the total variability. The vector c can be seen as a
parametrization of any input trajectory, ê , in the space spanned by the q basis vectors Ul.
The vector c is given by:
c = [U1, U2, …, Uq]T (ê − ē)                                                     (13)
Thus, all training trajectories eij can now be represented by the vectors cij (using (13)) on
which a Linear Discriminant Analysis can be applied. This gives a new space (the
Fisherspace) in which each training video sequence is represented by a vector of dimension
l -1 where l is the number of expression classes. Figure 6 illustrates the learning results
associated with the CMU data.
Recognition. The recognition scheme follows the main steps of the learning stage. We infer
the facial expression by considering the estimated facial actions provided by our face tracker
(Dornaika & Davoine, 2006). We consider the one-dimensional vector e’ (the concatenation
of the facial actions τa(t)) within a temporal window of size T centered at the current frame t.
Note that the value of T should be the same as in the learning stage. This vector is projected
onto the PCA space, then the obtained vector is projected onto Fisherspace in which the
classification occurs. The expression class whose mean is the closest to the current trajectory
is then assigned to this trajectory (current frame).
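A compact sketch of this two-stage projection (PCA, then LDA, then nearest class mean) is given below using scikit-learn. The trajectory length, the number of classes and the synthetic data are placeholders; the real system uses the facial actions produced by the tracker and the CMU training trajectories.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def train_trajectory_classifier(trajectories, labels, variance=0.99):
    """trajectories: (N, T, 6) aligned facial-action trajectories; labels: (N,)."""
    X = np.asarray(trajectories, float).reshape(len(trajectories), -1)  # T*6 vectors
    pca = PCA(n_components=variance).fit(X)          # keep 99% of the variability
    lda = LinearDiscriminantAnalysis().fit(pca.transform(X), labels)
    # Class means in Fisherspace, for nearest-mean classification.
    F = lda.transform(pca.transform(X))
    means = {c: F[np.asarray(labels) == c].mean(axis=0) for c in set(labels)}
    return pca, lda, means

def classify_trajectory(traj, pca, lda, means):
    f = lda.transform(pca.transform(np.asarray(traj, float).reshape(1, -1)))[0]
    return min(means, key=lambda c: np.linalg.norm(f - means[c]))

# Toy usage: 3 classes, T = 18 frames, 6 facial actions (dimension 108).
rng = np.random.default_rng(7)
trajs = np.concatenate([rng.normal(m, 0.1, (12, 18, 6)) for m in (-1, 0, 1)])
labs = ["sad"] * 12 + ["neutral"] * 12 + ["joy"] * 12
pca, lda, means = train_trajectory_classifier(trajs, labs)
print(classify_trajectory(rng.normal(1, 0.1, (18, 6)), pca, lda, means))  # "joy"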
Performance evaluation. Table 2 shows the confusion matrix for the dynamical facial
expression classifier using Eigenspace and Fisherspace. The learned trajectories were
inferred from the CMU database while the used test videos were created at our laboratory.
The recognition rate of dynamical expressions was 100% for all basic expressions except for
the disgust expression for which the recognition rate was 55%. Therefore, for the above
experiment, the overall recognition rate is 92.3%. One can notice the slight improvement in
the recognition rate over the classical recognition scheme based on the DTW.
Table 2: Confusion matrix for the dynamical facial expression classifier using Eigenspace
and Fisherspace. The learned trajectories were inferred from the CMU database while the
used test videos were created at our laboratory. The recognition rate of dynamical
expressions was 100% for all basic expressions except for the disgust expression for which
the recognition rate was 55%.
Figure 6: The 35 trajectory examples associated with five universal facial expressions
depicted in Fisherspace. In this space, each trajectory example is represented by a 5-vector.
Here, we use six facial expression classes: Surprise, Sadness, Joy, Disgust, Anger, and
Neutral. (a) displays the second component versus the first one, and (b) displays the fourth
component versus the third one. In this space, the neutral trajectory (a sequence of zero
vectors) is represented by a star.
(14)
where γ t ∈ ε = { 1, 2,…,Nγ} is the discrete component of the state, drawn from a finite set of
integer labels. Each integer label represents one of the six universal expressions: surprise,
disgust, fear, joy, sadness and anger. In our study, we adopt these facial expressions
together with the neutral expression, that is, Nγ is set to 7. There is another useful
representation of the mixed state which is given by:
(15)
where ht denotes the 3D head pose parameters, and at the facial actions appended with the
expression label γ t, i.e. at = [ τ aT(t ) , γ t]T .
This separation is consistent with the fact that the facial expression is highly correlated with
the facial actions, while the 3D head pose is independent of the facial actions and
expressions. The remainder of this section is organized as follows. Section 4.1 provides some
backgrounds. Section 4.2 describes the proposed approach for the simultaneous tracking
and recognition. Section 4.3 describes experiments and provides evaluations of performance
to show the feasibility and robustness of the proposed approach.
4.1 Backgrounds
4.1.1 Facial action dynamic models
Corresponding to each basic expression class, γ, there is a stochastic dynamic model
describing the temporal evolution of the facial actions τ a(t), given the expression. It is
assumed to be a Markov model of order K. For each basic expression γ, we associate a
Gaussian Auto-Regressive Process defined by:
τa(t) = Σk=1..K Aγ,k τa(t−k) + dγ + Bγ wt                                        (16)
where wt is a vector of independent standard Gaussian random variables; the
predicted value at time t obeys a multivariate Gaussian centered at the deterministic value
of (16), with BγBγT being its covariance matrix. In our study, we are interested in second-
order models, i.e. K = 2. The reason is twofold. First, these models are easy to estimate.
Second, they are able to model complex dynamics. For example, these models have been
used in (Blake & Isard, 2000) for learning the 2D motion of talking lips (profile contours),
beating heart, and writing fingers.
τa(t) = Aγ,1 τa(t−1) + Aγ,2 τa(t−2) + dγ + Bγ wt                                 (17)
The parameters of each auto-regressive model can be computed from temporal facial action
sequences. Ideally, the temporal sequence should contain several instances of the
corresponding expression.
More details about auto-regressive models and their computation can be found in (Blake &
Isard, 2000; Ljung, 1987; North et al., 2000). Each universal expression has its own second-
order auto-regressive model given by Eq.(17). However, the dynamics of facial actions
associated with the neutral expression can be simpler and are given by:
τ a(t) = τ a(t-1)+Dwt
where D is a diagonal matrix whose elements represent the variances around the ideal
neutral configuration τa = 0. The right-hand side of the above equation is constrained to
belong to a predefined interval, since a neutral configuration and expression is characterized
by both the lack of motion and the closeness to the ideal static configuration. In our study,
the auto-regressive models are learned using a supervised learning scheme. First, we asked
volunteer students to perform each basic expression several times in approximately 30-
second sequences. Each video sequence contains several cycles depicting a particular facial
expression: Surprise, Sadness, Joy, Disgust, Anger, and Fear. Second, for each training
video, the 3D head pose and the facial actions τa(t) are tracked using our deterministic
appearance-based tracker (Dornaika & Davoine, 2006) (outlined in Section 2). Third, the
parameters of each auto-regressive model are estimated using the Maximum Likelihood
Estimator.
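The maximum-likelihood fit of a second-order auto-regressive model reduces to linear least squares on the training trajectory, as the sketch below illustrates; the noise matrix B can then be taken from a square root of the residual covariance. This is a hedged illustration of Equation (17), not the authors' estimation code, and the synthetic training signal is a placeholder.

import numpy as np

def fit_ar2(tau):
    """Fit tau(t) = A1 tau(t-1) + A2 tau(t-2) + d + B w(t) by least squares.

    tau: (T, 6) sequence of facial action vectors for one expression.
    Returns A1, A2, d and B (a square root of the residual covariance).
    """
    tau = np.asarray(tau, float)
    Y = tau[2:]                                           # targets tau(t)
    X = np.hstack([tau[1:-1], tau[:-2], np.ones((len(Y), 1))])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)          # (13, 6) coefficients
    A1, A2, d = coef[:6].T, coef[6:12].T, coef[12]
    resid = Y - X @ coef
    cov = np.cov(resid.T)
    B = np.linalg.cholesky(cov + 1e-9 * np.eye(6))        # noise "square root"
    return A1, A2, d, B

# Toy training sequence: a noisy oscillation in the 6 facial actions.
rng = np.random.default_rng(8)
t = np.arange(300)
tau = 0.5 * np.sin(0.1 * t)[:, None] * np.ones(6) + rng.normal(0, 0.01, (300, 6))
A1, A2, d, B = fit_ar2(tau)
print(A1.shape, A2.shape, d.shape, B.shape)   # (6, 6) (6, 6) (6,) (6, 6)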
Figure 7 illustrates the value of the facial actions, τ a(t), associated with six training video
sequences. For clarity purposes, only two components are shown for a given plot. For a
given training video, the neutral frames are skipped from the original training sequence
used in the computation of the auto-regressive models.
transition matrix T whose entries Tγ',γ describe the probability of transition between two
expression labels γ’ and γ. The transition probabilities need to be learned from training video
sequences. In the literature, the transition probabilities associated with states (not
necessarily facial expressions) are inferred using supervised and unsupervised learning
techniques. However, since we are dealing with high level states (the universal facial
expressions), we have found that a realistic a priori setting works very well. We adopt a 7×7
symmetric matrix whose diagonal elements are close to one (e.g. Tγ,γ = 0.8, that is, 80% of the
transitions occur within the same expression class). The rest of the percentage is distributed
equally among the expressions. In this model, transitions from one expression to another
expression without going through the neutral one are allowed. Furthermore, this model
adopts the most general case where all universal expressions have the same probability.
However, according to the context of the application, one can adopt other transition
matrices in which some expressions are more likely to happen than others.
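The a priori transition matrix described above is straightforward to construct; a minimal sketch, assuming the 7 labels and the 0.8 self-transition probability mentioned in the text, is:

import numpy as np

def expression_transition_matrix(n_labels=7, self_prob=0.8):
    """Symmetric transition matrix: self_prob on the diagonal, the remaining
    probability mass shared equally among the other expressions."""
    off_prob = (1.0 - self_prob) / (n_labels - 1)
    T = np.full((n_labels, n_labels), off_prob)
    np.fill_diagonal(T, self_prob)
    return T

T = expression_transition_matrix()
print(T.shape, T.sum(axis=1))   # rows sum to 1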
4.2 Approach
Since at any given time, the 3D head pose parameters can be considered as independent of
the facial actions and expression, our basic idea is to split the estimation of the unknown
parameters into two main stages. For each input video frame yt, these two stages are
invoked in sequence in order to recover the mixed state [ htT , atT ]T . Our proposed approach
is illustrated in Figure 8. In the first stage, the six degrees of freedom associated with the 3D
head pose (encoded by the vector ht) are obtained using a deterministic registration
technique similar to that proposed in (Dornaika & Davoine, 2006). In the second stage, the
facial actions and the facial expression (encoded by the vector at = [ τ aT( t ) , γt]T ) are
simultaneously estimated using a stochastic framework based on a particle filter. Such
models have been used to track objects when different types of dynamics can occur (Isard &
Blake, 1998). Other examples of auxiliary discrete variables beside the main hidden state of
interest are given in (Perez & Vermaak, 2005). Since τa(t) and γt are highly correlated, their
simultaneous estimation will give results that are more robust and accurate than results
obtained with methods estimating them in sequence. In the following, we present the
parameter estimation process associated with the current frame yt. Recall that the head pose
is computed using a deterministic approach, while the facial actions and expressions are
estimated using a probabilistic framework.
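A schematic particle-filter step for the mixed state (facial actions plus a discrete expression label) might look like the sketch below: each particle's label is propagated with the transition matrix, its actions are propagated with the dynamics of that label, and the particles are re-weighted by an observation likelihood. The dynamics, the likelihood and the toy dimensions are placeholders; the real system uses the learned auto-regressive models of Section 4.1 and the appearance-based likelihood.

import numpy as np

def particle_filter_step(particles, labels, weights, T, dynamics, likelihood, rng):
    """One resample/predict/update step of a mixed-state particle filter.

    particles: (N, 2, 6) last two facial action vectors per particle (2nd-order dynamics).
    labels:    (N,) discrete expression label per particle.
    weights:   (N,) normalized particle weights.
    T:         (K, K) expression transition matrix.
    dynamics:  dynamics(label, tau_prev, tau_prev2, rng) -> new tau (6,).
    likelihood: likelihood(tau) -> observation likelihood of a facial action vector.
    """
    N = len(particles)
    # Resample according to the current weights.
    idx = rng.choice(N, size=N, p=weights)
    particles, labels = particles[idx], labels[idx]
    # Propagate the discrete label, then the continuous actions.
    new_labels = np.array([rng.choice(len(T), p=T[g]) for g in labels])
    new_taus = np.array([dynamics(g, p[0], p[1], rng)
                         for g, p in zip(new_labels, particles)])
    new_particles = np.stack([new_taus, particles[:, 0]], axis=1)
    # Re-weight by the observation likelihood and normalize.
    w = np.array([likelihood(t) for t in new_taus])
    w = w / w.sum()
    # The recognized expression is the label with the largest total weight.
    probs = np.bincount(new_labels, weights=w, minlength=len(T))
    return new_particles, new_labels, w, probs

# Toy usage with random-walk dynamics and a likelihood peaked at zero motion.
rng = np.random.default_rng(9)
N, K = 200, 7
particles = np.zeros((N, 2, 6))
labels = rng.integers(0, K, N)
weights = np.full(N, 1.0 / N)
T = np.full((K, K), 0.2 / (K - 1)); np.fill_diagonal(T, 0.8)
dyn = lambda g, t1, t2, rng: t1 + rng.normal(0, 0.01, 6)
lik = lambda tau: np.exp(-50 * np.sum(tau ** 2))
particles, labels, weights, probs = particle_filter_step(
    particles, labels, weights, T, dyn, lik, rng)
print(probs)   # per-expression probability mass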
(18)
In the above experiment, the total number of particles is set to 200. Figure 12 illustrates the
same facial actions when the number of particles is set to 100. We have found that there is no
significant difference in the estimated facial actions and expressions when the tracking is
performed with 100 particles (see Figures 11.a and 12), which is due to the use of learned
multi-class dynamics.
Figure 13 shows the tracking results associated with another 600-frame test video sequence
depicting significant out-of-plane head movements. The recognition results were correct.
Recall that the facial actions are related to the deformable 3D model and thus the recognition
based on them is independent from the viewing angle.
A challenging example. We have dealt with a challenging test video. For this 1600-frame
test video, we asked our subject to adopt arbitrarily different facial gestures and expressions
for an arbitrary duration and in an arbitrary order. Figure 14 (Top) illustrates the probability
mass distribution as a function of time. As can be seen, surprise, joy, anger, disgust, and fear
are clearly and correctly detected. Also, we find that the facial actions associated with the
subject's conversation are correctly tracked using the dynamics of the universal expressions.
The tracked facial actions associated with the subject's conversation are depicted in nine
frames (see the lower part of Figure 14). The whole video can be found at
http://www.hds.utc.fr/~fdavoine/MovieTrackingRecognition.wmv.
able to provide the correct state (both the discrete and the continuous components) almost
instantaneously (see the correct alignment between the 3D model and the region of the lips
and mouth in Figure 16.b).
Low resolution video sequences. In order to assess the behavior of our developed approach
when the resolution and/or the quality of the videos is low, we downloaded several low-
quality videos used in (Huang et al., 2002). In each 42-frame video, one universal expression
is displayed. Figure 17 shows our recognition results (the discrete probability distribution)
associated with three such videos. The left images display the 25th frame of each video. Note
that the neutral curve is not shown for reasons of clarity. As can be seen, the recognition
obtained with our stochastic approach was very good despite the low quality of the videos
used. The resolution of these videos is 320×240 pixels.
Impact of noisy estimated 3D head pose. The estimated appearance-based 3D head pose
may suffer from some inaccuracies associated with the out-of-plane movements, which is
the case for all monocular systems. It would seem reasonable to fear that these inaccuracies
might lead to a failure in facial action tracking. In order to assess the effect of 3D head pose
inaccuracies on the facial action tracking, we conducted the following experiment. We
acquired a 750-frame sequence and performed our approach twice. The first was a
straightforward run. In the second run, the estimated out-of-plane parameters of the 3D
head pose were perturbed by a uniform noise, then the perturbed 3D pose was used by the
facial action tracking and facial expression recognition. Figure 18 shows the value of the
tracked actions in both cases: the noise-free 3D head pose (solid curve) and the noisy 3D
head pose (dotted curves). In this experiment, the two out-of-plane angles were perturbed
with additive uniform noise belonging to [-7 degrees, +7 degrees] and the scale was
perturbed by an additive noise belonging to [-2%, +2%]. As can be seen, the facial actions are
almost not affected by the introduced noise. This can be explained by the fact that the 2D
projection of out-of-plane errors produces very small errors in the image plane, such that the
2D alignment between the model and the regions of lips and eyebrows is still good enough
to capture their independent movements correctly.
Robustness to lighting conditions. The appearance model used was given by one single
multivariate Gaussian with parameters slowly updated over time. The robustness of this
model is improved through the use of robust statistics that prevent outliers from
deteriorating the global appearance model. This relatively simple model was adopted to
allow real-time performance. We found that the tracking based on this model was successful
even in the presence of temporary occlusions caused by a rotated face and occluding hands.
Figure 19 illustrates the tracking results associated with a video sequence provided by the
Polytechnic University of Madrid2, depicting head movements and facial expressions under
significant illumination changes (Buenaposada et al., 2006). As can be seen, even though
possible brief perturbations caused temporary tracking inaccuracies with our simple
appearance model, the track is never lost. Moreover, whenever the perturbation
disappears, the tracker once more provides accurate parameters.
2 http://www.dia.fi.upm.es/~pcr/downloads.html
5. Conclusion
This chapter provided a set of recent deterministic and stochastic (robust) techniques that
perform efficient facial expression recognition from video sequences. More precisely, we
described two texture- and view-independent frameworks for facial expression recognition
given natural head motion. Both frameworks use temporal classification and do not require
any learned facial image patch since the facial texture model is learned online. The latter
property makes them more flexible than many existing recognition approaches. The
proposed frameworks can easily include other facial gestures in addition to the universal
expressions.
The first framework (Tracking then Recognition) exploits the temporal representation of
tracked facial actions in order to infer the current facial expression in a deterministic way.
Within this framework, we proposed two different recognition methods: i) a method based
on Dynamic Time Warping, and ii) a method based on Linear Discriminant Analysis. The
second framework (Tracking and Recognition) proposes a novel paradigm in which facial
action tracking and expression recognition are simultaneously performed. This framework
consists of two stages. In the first stage, the 3D head pose is recovered using a deterministic
registration technique based on Online Appearance Models. In the second stage, the facial
actions as well as the facial expression are simultaneously estimated using a stochastic
framework based on multi-class dynamics.
We have shown that possible inaccuracies affecting the out-of-plane parameters associated
with the 3D head pose have no impact on the stochastic tracking and recognition. The
developed scheme lends itself nicely to real-time systems. We expect the approach to
perform well in the presence of perturbing factors, such as video discontinuities and
moderate illumination changes. The developed face tracker was successfully tested with
moderate rapid head movements. Should ultra-rapid head movements break tracking, it is
possible to use a re-initialization process or a stochastic tracker that propagates a probability
distribution over time, such as the particle-filter-based tracking method presented in our
previous work (Dornaika & Davoine, 2006). The out-of-plane face motion range is limited
within the interval [-45 deg, 45 deg] for the pitch and the yaw angles. Within this range, the
obtained distortions associated with the facial patch are still acceptable to estimate the
correct pose of the head. Note that the proposed algorithm does not require that the first
frame should be a neutral face since all universal expressions have the same probability.
The current work uses an appearance model given by one single multivariate Gaussian
whose parameters are slowly updated over time. The robustness of this model is improved
through the use of robust statistics that prevent outliers from deteriorating the global
appearance model. This relatively simple model was adopted to allow real-time
performance. We found that the tracking based on this model was successful even in the
presence of occlusions caused by a rotated face and occluding hands. The current
appearance model can be made more sophisticated through the use of Gaussian mixtures
(Zhou et al., 2004; Lee, 2005) and/or illumination templates to take into account sudden and
significant local appearance changes due for instance to the presence of shadows.
Figure 7: The automatically tracked facial actions, τ a(t), using the training videos. Each video
sequence corresponds to one expression. For a given plot, only two components are
displayed.
Figure 8: The proposed two-stage approach. In the first stage (Section 4.2.1), the 3D head
pose is computed using a deterministic registration technique. In the second stage (Section
4.2.2), the facial actions and expression are simultaneously estimated using a stochastic
technique involving multi-class dynamics.