Matching, regression discontinuity, difference in differences, and beyond 1st Edition Lee pdf download

The document is a comprehensive overview of various statistical methods including matching, regression discontinuity, and difference in differences, authored by Myoung-jae Lee and published by Oxford University Press. It covers fundamental concepts, implementation techniques, and applications of these methods in treatment effect analysis. The book also includes detailed chapters on advanced topics and practical examples to illustrate the methodologies discussed.

Matching, Regression
Discontinuity, Difference in
Differences, and Beyond

Myoung-jae Lee

Oxford University Press is a department of the University of Oxford. It furthers
the University’s objective of excellence in research, scholarship, and education
by publishing worldwide. Oxford is a registered trade mark of Oxford University
Press in the UK and in certain other countries.

Published in the United States of America by Oxford University Press
198 Madison Avenue, New York, NY 10016, United States of America.

© Oxford University Press 2016

All rights reserved. No part of this publication may be reproduced, stored in
a retrieval system, or transmitted, in any form or by any means, without the
prior permission in writing of Oxford University Press, or as expressly permitted
by law, by licence or under terms agreed with the appropriate reproduction
rights organization. Inquiries concerning reproduction outside the scope of the
above should be sent to the Rights Department, Oxford University Press, at the
address above.

You must not circulate this work in any other form
and you must impose this same condition on any acquirer.

A copy of this book’s Catalog-in-Publication Data is on file with the Library of Congress
ISBN 978–0–19–025874–0 (pbk);
ISBN 978–0–19–025873–3 (hbk)

1 3 5 7 9 8 6 4 2
Printed by Webcom, Canada
To the memory of my father,
Kang-hee Lee,
on July 10, 2015, and to the memorable ride days after.
CONTENTS

Preface

1 Basics of Treatment Effect Analysis
1.1 Counterfactual, Intervention, and Causal Relation
1.1.1 Potential Outcomes and Intervention
1.1.2 Causality and Association
1.1.3 Partial Equilibrium Analysis and Remarks
1.2 Various Treatment Effects and No Effects
1.2.1 Various Effects
1.2.2 Three No-Effect Concepts
1.2.3 Remarks
1.3 Group-Mean Difference and Randomization
1.3.1 Group-Mean Difference and Mean Effect
1.3.2 Consequences of Randomization
1.3.3 Checking Out Covariate Balance
1.4 Overt Bias, Hidden Bias, and Selection Problems
1.4.1 Overt and Hidden Biases
1.4.2 Selection on Observables and Unobservables
1.4.3 Linear Models and Biases
1.5 Estimation with Group Mean Difference and LSE
1.5.1 Group-Mean Difference and LSE
1.5.2 Job Training Example
1.5.3 Linking Counterfactuals to Linear Models
1.6 Structural Form, Assignment, and Marginal Model
1.6.1 Structural versus Reduced Forms for Response
1.6.2 Treatment Structural Form and Assignment
1.6.3 Marginal Structural Model
1.7 Simpson’s Paradox and False Covariate Control

2 Matching
2.1 Basics of Matching and Various Effects
2.1.1 Main Idea
2.1.2 Effect on Treated and Effect on Population
2.1.3 Dimension and Support Problems
2.1.4 Variables to Control
2.2 Implementing Matching
2.2.1 Decisions to Make in Matching
2.2.2 Matching Estimators
2.2.3 Asymptotic Variance Estimation
2.2.4 Labor Union Effect on Wage
2.3 Propensity Score Matching (PSM)
2.3.1 Propensity Score as a Balancing Score
2.3.2 Removing Overt Bias with Propensity Score
2.3.3 Implementing PSM and Bootstrap
2.3.4 PSM Empirical Examples
2.3.5 Propensity Score Specification Issues*
2.4 Further Remarks
2.4.1 Covariate Balance Check
2.4.2 Matching for Hidden Bias
2.4.3 Prognostic Score and More*

3 Nonmatching and Sample Selection
3.1 Weighting
3.1.1 Weighting Estimator for Effect on Population
3.1.2 Other Weighting Estimators and Remarks
3.1.3 Asymptotic Distribution of Weighting Estimators*
3.1.4 Job Training Effect on Unemployment
3.1.5 Doubly Robust Estimator*
3.1.6 Weighting for Missing Data*
3.2 Regression Imputation
3.2.1 Linear Regression Imputation
3.2.2 Regression Imputation with Propensity Score
3.2.3 Regression Imputation for Multiple Treatment
3.2.4 Regression Imputation for Continuous Treatment*
3.2.5 Military Service Effect on Wage
3.3 Complete Pairing with Double Sum
3.3.1 Discrete Covariates
3.3.2 Continuous Covariates
3.3.3 Nonparametric Distributional Effect Tests*
3.4 Treatment Effects under Sample Selection
3.4.1 Difficulties with Sample Selection Models
3.4.2 Participation, Invisible, and Visible Effects
3.4.3 Identification of Three Effects with Mean Differences
3.4.4 Religiosity Effect on Affairs
3.5 Effect Decomposition in Sample Selection Models*
3.5.1 Motivation for Decomposition
3.5.2 Decomposition with Linear Selection Model
3.5.3 Four Special Models
3.5.4 Race Effect on Wage

4 Regression Discontinuity
4.1 Introducing RD with Before-After
4.1.1 BA Examples
4.1.2 BA Identification Assumption
4.1.3 From BA to RD
4.2 RD Identification and Features
4.2.1 Sharp RD (SRD) and Fuzzy RD (FRD)
4.2.2 Identification at Cutoff
4.2.3 RD Main Features
4.2.4 Class Size Effect on Test Score
4.3 RD Estimators
4.3.1 LSE for Level Equation
4.3.2 IVE for Right-Left Differenced Equation
4.3.3 Bandwidth Choice and Remarks
4.3.4 High School Completion Effect on Fertility
4.4 Specification Tests
4.4.1 Breaks in Conditional Means
4.4.2 Continuity in Score Density
4.5 RD Topics*
4.5.1 Spatial Breaks
4.5.2 RD for Limited Dependent Variables
4.5.3 Measurement Error in Score
4.5.4 Regression Kink (RK) and Generalization
4.5.5 SRD with Multiple Scores
4.5.6 Quantile RD

5 Difference in Differences
5.1 DD Basics
5.1.1 Examples for DD
5.1.2 Time-Constant and Time-Varying Qualifications
5.1.3 Data Requirement and Notation
5.2 DD with Repeated Cross-Sections
5.2.1 Identification
5.2.2 Identification with Parametric Models
5.2.3 Schooling Effect on Fertility: ‘Fuzzy DD’
5.2.4 Linear Model Estimation for Two Periods or More
5.2.5 Earned Income Tax Credit Effect on Work
5.2.6 Time-Varying Qualification*
5.3 DD with Panel Data
5.3.1 Identification
5.3.2 Identification and Estimation with Parametric Models
5.3.3 Daylight Saving Time Effect on Energy
5.4 Panel Stayer DD for Time-Varying Qualification
5.4.1 Motivation
5.4.2 Effect on In-Stayers Identified by Stayer DD
5.4.3 Identification and Estimation with Panel Linear Models
5.4.4 Pension Effect on Health Expenditure

6 Triple Difference and Beyond
6.1 TD Basics and More
6.2 TD with Repeated Cross-Sections
6.2.1 Identification
6.2.2 Identification and Estimation with Linear Models
6.2.3 Mandated Benefit Effect on Wage
6.3 TD with Panel Data
6.3.1 Identification
6.3.2 Estimation with Panel Linear Model
6.3.3 Tax-Inclusive Price Effect on Demand
6.4 GDD and Beyond
6.4.1 Motivation for GDD and Beyond
6.4.2 Identification for GDD and QD
6.4.3 Identified Effects When Panel Linear Model Holds
6.4.4 LSE for DD and GDD and Testing for DD Condition
6.4.5 Sulfa Drug Effect on Mortality: Is DD Trustworthy?
6.5 Clustering Problems and Inference for DD and TD
6.5.1 Single Clustering
6.5.2 Clustering in Panel Data
6.5.3 DD and TD with Cluster-Specific Treatment
6.5.4 Details on Cluster Variance Estimator*

A APPENDIX
A.1 Kernel Density and Regression Estimators
A.1.1 Histogram-Type Density Estimator
A.1.2 Kernel Density Estimator
A.1.3 Kernel Regression Estimator
A.1.4 Local Linear Regression
A.2 Bootstrap
A.2.1 Review on Usual Asymptotic Inference
A.2.2 Bootstrap to Find Quantiles
A.2.3 Percentile-t and Percentile Methods
A.2.4 Nonparametric, Parametric, and Wild Bootstraps
A.3 Confounder Detection, IVE, and Selection Correction
A.3.1 Coherence Checks
A.3.2 IVE and Complier Effect
A.3.3 Selection Correction Approach
A.4 Supplements for DD Chapter
A.4.1 Nonparametric Estimators for Repeated Cross-Section DD
A.4.2 Nonparametric Estimation for DD with Two-Wave Panel Data
A.4.3 Panel Linear Model Estimation for DD with One-Shot Treatment
A.4.4 Change in Changes

References
Index

Online GAUSS Programs:


Pair Matching with PS for Union on Wage (PairMatchUnionOnWage)
Regression Imputation with PS-Based Nonparametrics (RegImpPsNprSim)
Complete Pairing with PS for Union on Wage (CpUnionOnWage)
RD Program (RdSim)
Repeated Cross-Section DD (DdReCroVary4WavesSim)
Panel DD for Differenced Model (DdPanel6WavesSim)
Repeated Cross-Section TD (TdReCro2WavesSim)
Panel DD and GDD for Differenced Model (DdGddPanel5WavesSim)
Panel DD, GDD and QD for Sulfa Drug (DdGddQdSulfaDrug)
Bootstrap for Sample Mean (BootAvgSim)
Selection Correction for Work on Doctor Visits (SelCorcWorkOnVisit)
Panel LSE, WIT and BET (PanelLseWitBetSim)
ABRIDGED CONTENTS

Preface

1 Basics of Treatment Effect Analysis
1.1 Counterfactual, Intervention, and Causal Relation
1.2 Various Treatment Effects and No Effects
1.3 Group-Mean Difference and Randomization
1.4 Overt Bias, Hidden Bias, and Selection Problems
1.5 Estimation with Group Mean Difference and LSE
1.6 Structural Form, Assignment, and Marginal Model
1.7 Simpson’s Paradox and False Covariate Control

2 Matching
2.1 Basics of Matching and Various Effects
2.2 Implementing Matching
2.3 Propensity Score Matching (PSM)
2.4 Further Remarks

3 Nonmatching and Sample Selection
3.1 Weighting
3.2 Regression Imputation
3.3 Complete Pairing with Double Sum
3.4 Treatment Effects under Sample Selection
3.5 Effect Decomposition in Sample Selection Models*

4 Regression Discontinuity
4.1 Introducing RD with Before-After
4.2 RD Identification and Features
4.3 RD Estimators
4.4 Specification Tests
4.5 RD Topics*

5 Difference in Differences
5.1 DD Basics
5.2 DD with Repeated Cross-Sections
5.3 DD with Panel Data
5.4 Panel Stayer DD for Time-Varying Qualification

6 Triple Difference and Beyond
6.1 TD Basics and More
6.2 TD with Repeated Cross-Sections
6.3 TD with Panel Data
6.4 GDD and Beyond
6.5 Clustering Problems and Inference for DD and TD

A APPENDIX
A.1 Kernel Density and Regression Estimators
A.2 Bootstrap
A.3 Confounder Detection, IVE, and Selection Correction
A.4 Supplements for DD Chapter

References
Index
Online GAUSS Programs
PREFACE

Treatment effect analysis is widely used in various disciplines of science, because
any controllable variable for a policy/program/medicine can be called a ‘treatment’.
For instance, a well-known treatment effect analysis method ‘propensity score
matching’ (Rosenbaum and Rubin 1983) has been cited more than 14,500 times
in Google Scholar. In economics, the key word ‘propensity score matching’ brings
up no fewer than 93 papers in Labour Economics, 84 in Health Economics, 81
in Journal of Public Economics, and even 34 in Journal of International Economics
and 9 in Journal of Monetary Economics. Given that propensity score matching
is just one of many available methods, these numbers amply demonstrate that
treatment effect analysis is popular in economics and will become more so in coming
years.
I wrote a book on treatment effects in 2005, titled Micro-Econometrics for Policy,
Program, and Treatment Effects. Ten years have passed since then, during which much
progress has been made; in fact, most of the above-mentioned papers in the economic
journals came after 2005. Thus it seems about time to revise the 2005 book for a
second edition. But covering the entire field is too demanding due to the universality
of treatment effect analysis, which is the same as causality analysis. Hence I decided
to cover only the most popular and applicable methods in a new book: matching,
regression discontinuity (RD), difference in differences (DD), and some others. Both
graduate students and researchers in economics (and other disciplines of science) will
benefit from this book.
The prerequisites for this book are the least squares estimator (LSE) and instrumental
variable estimator (IVE) for linear models and some basic knowledge of maximum
likelihood estimators for binary and ordered responses (probit/logit and ordered
probit/logit). Also, exposure to kernel nonparametric regression will be helpful; this
as well as IVE is reviewed in the appendix.
This book consists of six chapters and an appendix. Starred sections are digressive
or technical mostly in the sense of requiring nonparametrics. They as well as the
appendix are optional, although it is still recommended to cover the confounder/IVE
section in the appendix. Representative programs written in GAUSS will be available
at ‘https://sites.google.com/site/mjleeku/’. Many programs there will use simulated
data because many data sets in the main text are either proprietary or being used in
ongoing research, but the programs with simulated data can be easily modified for
actual data use. The online data and programs will be updated, and more data will
be released as time passes.

Chapter 1 is a slightly revised version of Chapter 2 of my 2005 book; it introduces
terminology and lays out the basics of treatment effect (i.e., causality) analysis. Casual
readers may want to browse this chapter to meet issues arising in treatment effect
analysis and to see whether those issues are appealing enough to go on to the remaining
chapters. Readers familiar with the basics may skip this chapter.
Chapter 2 is for ‘matching’, where the mean difference across treatment and control
groups is examined with covariates controlled by matching. Although matching is
nonparametric in its nature, much of it can be covered without a formal recourse to
nonparametrics. This chapter overlaps a lot with Chapters 3 and 4 of my 2005 book,
but the reader will find the updated literature since then.
Chapter 3 is for ‘nonmatching’ methods that examine the mean difference across
treatment and control groups as matching does, but nonmatching methods control
covariates differently from matching: they are ‘weighting’, ‘regression imputation’ and
‘complete pairing’. Chapter 3 also studies treatment effects in ‘sample selection models’
where a treatment can affect the participation decision in an activity and the ensuing
performance in the activity. The main difficulty in sample selection models is that the
performance is observed only when one participates.
Chapter 4 introduces ‘regression discontinuity (RD)’. RD accords ‘local random-
ization’, which makes it a particularly attractive study design in observational studies
where randomization is rare. RD, however, requires the treatment to be determined
in a specific way: a continuous variable crossing a threshold (e.g., a test score crossing
a cutoff). This restricts the applicability of RD somewhat, as not all treatments are
determined this way.
Chapter 5 is for ‘difference in differences (DD)’ or double difference. Possibly the
most basic treatment effect analysis method is ‘before-and-after (BA)’ which compares
before and after the treatment, but BA is inappropriate when the time gap is too
long so that variables other than the treatment change. DD solves this problem by
combining BA with matching. Among the changes in the other variables, those due
to observed variables are accounted for by controlling covariates in DD, and those due
to unobserved variables are negated to some extent by the second layer of difference.
Digressive discussions on DD are relegated to the appendix to keep this chapter within
a reasonable length.
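
The "second layer of difference" can be illustrated with a minimal Python sketch. The four group means below are made-up numbers, and the function names are illustrative only (they are not taken from the book's GAUSS programs). The sketch assumes that the change over time unrelated to the treatment is the same in the T and C groups, and it ignores covariates.

```python
# Minimal sketch of BA versus DD with hypothetical group means.

def before_after(mean_before, mean_after):
    """Before-and-after (BA) difference for one group."""
    return mean_after - mean_before


def difference_in_differences(t_before, t_after, c_before, c_after):
    """BA difference of the T group minus BA difference of the C group."""
    return before_after(t_before, t_after) - before_after(c_before, c_after)


# T group (treated between the two periods): mean outcome goes from 10 to 15.
# C group (never treated): mean outcome goes from 9 to 12.
ba_treated = before_after(10.0, 15.0)  # treatment effect mixed with the time change
ba_control = before_after(9.0, 12.0)   # time change only
dd = difference_in_differences(10.0, 15.0, 9.0, 12.0)

print("BA for T group:", ba_treated)
print("BA for C group:", ba_control)
print("DD:", dd)
```

Here BA alone would attribute the whole change in the T group to the treatment, whereas DD subtracts the change that the C group experienced over the same period.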
Chapter 6 is for ‘triple difference (TD) and beyond’, where one extra differencing
is done from DD; the differencing can be time-wise or cross-sectional group-wise.
TD is used when the requisite assumptions for single and double differences are
questionable. TD can also provide a test for the underlying assumptions for DD. Going
further than TD, ‘quadruple difference’ will appear, which can estimate the treatment
effect under weaker assumptions than TD and provide tests for the TD assumptions.
The appendix contains a review on kernel nonparametric regression, a supplemen-
tary discussion for DD to prevent the chapter in the main body from getting too long,
and an introduction to bootstrap. The appendix also has discussions on how to detect
unobserved confounders, a review on IVE, and selection correction approaches to deal
with unobserved confounders. These are drawn from Lee (2005, Chapters 5 and 6)
and put in the appendix despite their importance, as they are not the main theme of
this book.
There are many empirical examples, differing in terms of coverage length. One
extreme is a brief mention in a sentence or two, and the other is a detailed examination
in a separate section; in the latter, the section title will look like “D effect on Y” where D
is the treatment and Y is the response variable. There are also intermediate coverages
in a paragraph or two. Some empirical examples are interesting on their own to be
discussed with the relevant literature cited, whereas some are “clichés” just meant to
be an illustration and not much more.
There are a couple of topics that I hoped to cover in this book but could not due to
various constraints: ‘quantile treatment effect’ and ‘dynamic treatment effect’, although
the first is examined briefly for RD. Here I list some recent references for interested
readers. Quantile treatment effect generalizes the pervasive mean effect (Machado
and Mata 2005, Chernozhukov and Hansen 2008, Firpo et al. 2009, Rothe 2012,
and references therein). Dynamic treatment effect considers multiple treatments over
time that are adjusted based on interim responses (Robins and Hernán 2009, Lee and
Huang 2012, Chakraborty and Moodie 2013, and references therein). Also, there are
‘direct and indirect/mediation effect’ issues (Pearl 2010, Imai et al. 2010, VanderWeele
2015, and references therein).
As for notation, treatment and response will be denoted as D and Y, respectively;
covariates will be W, X, Z, or M. In most cases, a random variable will be denoted by an
uppercase letter, and its realized value by the lowercase. ‘Indicator function’ is defined
as 1[A] = 1 if A holds and 0 otherwise. E(·|X = x) will be often abbreviated just as
E(·|x) if it is clear which random variable is referred to; analogously, $E(\cdot|X = X_i)$ will
be written just as $E(\cdot|X_i)$. Since E(·|X = x) is a function, say, g(x), E(·|x) is a fixed
number, and g(X) and $g(X_i)$ are random variables obtained by replacing x in g(x) with
X and $X_i$.
Treatment and control groups will be often called simply ‘T group’ and ‘C group’.
Since we will assume iid (independent and identically distributed) observations across
individuals i = 1, . . . , N, often the subscript i in $D_i$ and $Y_i$ will be omitted. ‘Covariates’ and
‘confounders’ in their wide sense can be any variables other than D and Y, observed
or not. In their narrow sense, however, covariates refer to only observed variables, and
confounders refer to unobserved variables (‘errors’). ‘Covariate balance’ across the T
and C groups means the same covariate distributions across the two groups, although
only the mean equality is checked typically in practice. ‘With respect to’ may be
abbreviated as ‘wrt’, and distribution function as ‘df ’.
The variance and standard deviation of Y are denoted as V(Y) and SD(Y). Correlation
and covariance between X and Y will be denoted as COR(X, Y) and COV(X, Y).
We use f and F for density and distribution functions. Density/probability of Y will
be denoted typically as $f_Y(y)$ or $P_Y(y)$, or just as f(y) or P(y). Its conditional version
given X = x will be denoted as $f_{Y|X}(y|x)$, $f_{Y|x}(y)$ or $f_Y(y|x)$, and $P_{Y|X}(y|x)$, $P_{Y|x}(y)$
or $P_Y(y|x)$. Sometimes, both density and probability may be denoted using f. The
triple line ‘≡’ is used for definitional equality. Convergence in law is denoted with
‘⇝’, whereas ‘∼’ denotes the distribution, for example, Y ∼ N(0, 1) for Y following
the standard normal distribution. The N(0, 1) distribution function and density are
Φ and φ.
The conditional independence between Y and W given X is denoted as ‘Y ⨿ W|X’,
which is symmetric because Y ⨿ W|X ⇐⇒ W ⨿ Y|X. In contrast, conditional mean
independence E(Y|W, X) = E(Y|X) is not symmetric because E(Y|W, X) = E(Y|X)
does not necessarily imply E(W|Y, X) = E(W|X); indeed, E(W|X) may not even exist when
E(Y|X) does. When Y is a vector, ‘E(Y|W, X) = E(Y|X)’ applies to each component
of Y separately. Often mean independence suffices, but we use mostly (statistical)
independence except in Chapter 1, which makes exposition simpler by avoiding this
kind of complication in E(Y|W, X) = E(Y|X). When W equals the binary D with
0 < P(D = 1|X) < 1, E(Y|D, X) = E(Y|X) is equivalent to the symmetric condition
COR(Y, D|X) = 0 ⇐⇒ E(YD|X) = E(Y|X)E(D|X), as can be seen in Lee (2005, 36);
this is denoted as ‘Y ⊥ D|X’. In short, ⨿ is used for independence and ⊥ for zero correlation,
with symmetry holding for both.
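
The equivalence for binary D cited from Lee (2005, 36) can be checked directly. The following is a sketch of one way to see it; the shorthand $p(X)$, $\mu_1(X)$, and $\mu_0(X)$ is introduced here only for illustration and is not the book's notation.

```latex
% Shorthand (illustrative): p(X) = P(D = 1 | X), \mu_d(X) = E(Y | D = d, X), d = 0, 1.
\begin{align*}
E(YD \mid X) &= p(X)\,\mu_1(X), \\
E(Y \mid X)  &= p(X)\,\mu_1(X) + \{1 - p(X)\}\,\mu_0(X), \qquad E(D \mid X) = p(X), \\
E(YD \mid X) - E(Y \mid X)\,E(D \mid X)
             &= p(X)\,\mu_1(X) - p(X)\bigl[p(X)\,\mu_1(X) + \{1 - p(X)\}\,\mu_0(X)\bigr] \\
             &= p(X)\,\{1 - p(X)\}\,\{\mu_1(X) - \mu_0(X)\}.
\end{align*}
% With 0 < p(X) < 1, the last line is zero if and only if \mu_1(X) = \mu_0(X),
% that is, if and only if E(Y | D, X) = E(Y | X).
```

The condition 0 < P(D = 1|X) < 1 is what allows the factor p(X){1 − p(X)} to be divided out.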
When I started as an econometrician, I was busy with inventing new estimators
and tests. But after a while, it seemed vacuous to keep inventing things that are hardly
used. Just like anybody else, it dawned on me that I should do something “useful.”
I came across treatment effect analysis, which turned out to be the right topic for me.
Working on it has been joyous as well as rewarding. But it seems that the field is being
pulled back to where I wanted to escape from. This book may be taken as a small step
to change the unfortunate trend, so that econometricians and statisticians can relate
better to empirical researchers and make meaningful contributions to the world.
Finally, in writing this book, I have benefited from the comments made by
Donghwa Bae, Jin-young Choi, Hyeon-joon Hwang, Young-min Ju, Young-sook Kim,
and anonymous reviewers. Without implicating any of these scholars, a monograph
like this book, intended to introduce the latest developments on the research front, is
bound to have some errors, as those developments are yet to be thoroughly field-tested.
For this, I hope the reader will be understanding. The research for this book has
been supported by the National Research Foundation Grant funded by the Korean
Government (NRF-2009-342-B00008).
1
BASICS OF TREATMENT EFFECT ANALYSIS

For a treatment and a response variable, it is desired to find a causal effect of the former on
the latter. This can be done using the ‘potential responses’ corresponding to the treatment
on/off. The basic way of identifying the effect is comparing the average difference between
the treatment and control (i.e., untreated) groups. For this to work, the treatment should
determine which potential response is realized, but otherwise be unrelated to the potential
responses. When this condition is not met due to some observed and unobserved variables
affecting both the treatment and response, biases can arise. Avoiding such biases is the
main task in causal analysis with observational data. Causality using potential responses
gives a new look to the old workhorse ‘structural-form regression analysis’, enabling the
interpretation of the regression parameters as causal parameters.

1.1 COUNTERFACTUAL, INTERVENTION, AND CAUSAL RELATION
1.1.1 Potential Outcomes and Intervention
In many science disciplines, it is desired to know the effects of a treatment or cause
on a response (or outcome) variable $Y_i$, where i = 1, . . . , N indexes individuals; the
effects are called ‘treatment effects’ or ‘causal effects’. The following are examples of
treatments and responses:

Treatment:   exercise         job training   college education   drug
Response:    blood pressure   wage           lifetime earnings   cholesterol

It is important to be specific on the treatment and response. For the drug-cholesterol
example, we need to know the quantity of the drug taken and how it is administered,
and when and how cholesterol is measured. The same drug may become different treatments
if taken in different dosages at different frequencies. Cholesterol levels measured
one week and one month after treatment are two different response variables. For job
training, classroom-type training certainly differs from mere assistance for job search,
and wages one and two years after the training are two different outcome variables.

Consider a binary treatment taking on 0 or 1 (this can be generalized to multiple
treatments). Let $Y_i^d$, d = 0, 1, denote the ‘potential outcome’ when individual i
receives treatment d exogenously (i.e., when the treatment is forced in (d = 1) or out
(d = 0), as opposed to being self-selected by the individual). For the exercise
example,

$Y_i^1$: blood pressure with exercise “forced in”;
$Y_i^0$: blood pressure with exercise “forced out.”

Although it is a little difficult to imagine exercise forced in or out, the expressions
‘forced-in’ and ‘forced-out’ reflect the notion of intervention. A better example would
be that the price of a product is determined in the market, but the government may
intervene to set the price at a level exogenous to the market to see how the demand
changes. Another example is that a person chooses to take a drug (self-selection),
rather than the drug being injected regardless of the person’s will (intervention).
When we want to know a treatment effect, we want to know the effect of a treatment
intervention, not the effect of treatment self-selection, on a response variable. With
this information, we can adjust (or manipulate) the treatment exogenously to attain
the desired level of response. This is what policies are all about. Left alone, people will
self-select a treatment, and the effect of a self-selected treatment can be analyzed easily,
whereas the effect of an intervened treatment cannot. Using the effect of a self-selected
treatment to guide a policy decision can be misleading, if the policy is an intervention.
Not all policies are interventions though; for example, a policy to encourage exercise.
Even in this case, however, before the government decides to encourage exercise, it
may want to know what the effects of exercise are; here, the effects may well be the
effects of exercise intervened.
Some treatments cannot be interventions; for example, college education cannot
be forced. In this case, we can think of two individuals with identical characteristics.
Suppose one goes to college and the other does not, for a reason unrelated to the
outcome (lifetime earnings). The reason can be proximity to a college or parents being
college graduates; readers familiar with the instrumental variable estimator (IVE) may
“smell” an instrument here (see Lee et al. 2007 and the references therein for using a
distance as an instrument). The outcomes of the two individuals can be denoted as $Y^1$
and $Y^0$ despite the treatment not being an intervention. The self-selection here differs
from the usual self-selection in that the selection was done due to an “independent
reason,” and in this sense the notation $(Y^0, Y^1)$ pertaining to intervention can still be
used in this nonenforceable treatment case.
Between the two potential outcomes, only one outcome is observed, while the
other (called ‘counterfactual’) is not, which is the fundamental problem in treatment
effect analysis. In the example of the effect of college education on lifetime earnings,
only one outcome (earnings with college education or without) is available per person.
One may argue that for some other cases, say, the effect of a drug on cholesterol,
both $Y^0$ and $Y^1$ could be observed sequentially. Strictly speaking, however, if two
treatments (i.e., no treatment and treatment) are administered one by one sequentially,
we cannot say that we observe both $Y^0$ and $Y^1$, as the subject changes over time,
although the change may be very small. Some scholars are against the notion of
counterfactuals, but it is well entrenched in statistics; in econometrics, it is called
‘switching regression’.

1.1.2 Causality and Association


Define Yi1 − Yi0 as the treatment (or causal) effect for individual i. In this definition,
there is no uncertainty about the cause and the response variable. This way of defining
causal effect using two potential responses is called ‘counterfactual causality’. This is
in sharp contrast to the so-called ‘probabilistic causality’ which tries to uncover the
real cause(s) for a response variable; there, no counterfactual is needed. Although
probabilistic causality is also a prominent causal concept, when we say ‘causal effect’
in this book, we will always mean counterfactual causality. In a sense, everything in
this world is related to everything else. As somebody put it aptly, a butterfly’s flutter
on one side of an ocean may cause a storm on the other side. Trying to find the real
cause could be a futile exercise. Counterfactual causality fixes the causal and response
variables and then tries to estimate the magnitude of the causal effect.
Let the observed treatment be Di , and the observed response Yi be

Yi = (1 − Di ) · Yi0 + Di · Yi1 , i = 1, . . . , N.

Causal relation is different from associative relation such as correlation or covariance:
we need (D, Y 0 , Y 1 ) in the former to get Y 1 − Y 0 , whereas we need only (D, Y) in
the latter; of course, an associative relation suggests a causal relation. COR(D, Y) is
an association; also COV(D, Y)/V(D) is an association. The latter shows that the least
squares estimator (LSE), also called ‘ordinary LSE (OLS)’, is only for association,
although we tend to interpret LSE findings in practice as if they were causal findings.
When an association between two variables D and Y is found, it is helpful to think
of the following three cases:

1. D influences Y unidirectionally (D −→ Y).
2. Y influences D unidirectionally (D ←− Y).
3. There are third variables W influencing both D and Y unidirectionally, although
there is no direct relationship between D and Y (D ←− W −→ Y).

In treatment effect analysis, as mentioned already, we fix the causal and response
variables, and then try to find the effect; thus case 2 is ruled out. What is difficult is to
tell case 1 from case 3, the ‘common factor’ case (W denotes the variables common to D and
Y). Let X and ε denote the observed and unobserved variables, respectively, that can
affect both D and (Y 0 , Y 1 ); usually X is called ‘covariates’, but sometimes both X and ε
are called covariates. The variables X and ε are candidates for the common factors W.
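The difficulty of telling case 1 from case 3 can be illustrated with a small simulation. The sketch below is only an illustration under assumed ingredients that are not in the text: a standard normal common factor W, a threshold rule for D, and an individual effect Y 1 − Y 0 set to zero for everyone. The function names are hypothetical.

```python
import random
import statistics


def simulate_common_factor(n=20000, seed=0):
    """Simulate case 3: W drives both D and Y, while D has no effect on Y."""
    rng = random.Random(seed)
    d_obs, y_obs, effects = [], [], []
    for _ in range(n):
        w = rng.gauss(0.0, 1.0)                       # common factor W (assumed normal)
        d = 1 if w + rng.gauss(0.0, 1.0) > 0 else 0   # larger W makes D = 1 more likely
        y0 = w + rng.gauss(0.0, 1.0)                  # larger W raises the untreated outcome
        y1 = y0                                       # zero treatment effect for everyone
        d_obs.append(d)
        y_obs.append((1 - d) * y0 + d * y1)           # observed response
        effects.append(y1 - y0)
    return d_obs, y_obs, effects


def lse_slope(d_obs, y_obs):
    """Return COV(D, Y) / V(D), the LSE slope of Y on D."""
    d_bar = statistics.fmean(d_obs)
    y_bar = statistics.fmean(y_obs)
    cov = statistics.fmean((d - d_bar) * (y - y_bar) for d, y in zip(d_obs, y_obs))
    var = statistics.fmean((d - d_bar) ** 2 for d in d_obs)
    return cov / var


d_obs, y_obs, effects = simulate_common_factor()
print("mean causal effect:", statistics.fmean(effects))
print("COV(D, Y)/V(D):   ", lse_slope(d_obs, y_obs))
```

In this assumed design the mean causal effect is exactly zero, yet the association COV(D, Y)/V(D) comes out clearly positive, because W raises both the chance of treatment and the outcome.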
It may be a little awkward, but we need to imagine that each individual has
(D, Y 0 , Y 1 , X, ε) to reveal either Y 0 or Y 1 depending on D = 0 or 1; X is revealed always
and ε is never. To simplify exposition, usually we ignore X and ε at the beginning of a
4 Matching, RD, DD, and Beyond

discussion and later look at how to deal with them. In a given data set, the group with
D = 1 that reveals only (X, Y 1 ) is called the treatment group (or T group), and the group
with D = 0 that reveals only (X, Y 0 ) is called the control group (or C group).
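As a minimal sketch of how the observed data arise from the potential outcomes, the snippet below applies Y = (1 − D) · Y 0 + D · Y 1 to a few hypothetical individuals and splits them into the T and C groups. The numbers and the name observed_response are illustrative assumptions, not taken from the text.

```python
def observed_response(d, y0, y1):
    """Y = (1 - D) * Y0 + D * Y1: D selects which potential outcome is revealed."""
    return (1 - d) * y0 + d * y1


# Hypothetical individuals as (D, Y0, Y1, X); the numbers are illustrative only.
population = [
    (1, 10.0, 12.0, 30),
    (0, 8.0, 11.0, 45),
    (1, 15.0, 15.0, 52),
    (0, 9.0, 7.0, 28),
]

# What a data set actually contains: (D, X, Y), never both Y0 and Y1.
data = [(d, x, observed_response(d, y0, y1)) for d, y0, y1, x in population]

t_group = [(x, y) for d, x, y in data if d == 1]   # reveals (X, Y1)
c_group = [(x, y) for d, x, y in data if d == 0]   # reveals (X, Y0)

print("T group:", t_group)
print("C group:", c_group)
```

The individual effects Y 1 − Y 0 are visible in the population list only because it is simulated; in the data list the counterfactual is gone.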

1.1.3 Partial Equilibrium Analysis and Remarks


Unless otherwise mentioned, assume that the observations are independent and
identically distributed (iid) across i; often the subscript i in the variables will be
omitted. The iid assumption—particularly the independence part—may not be as
innocuous as it looks at first glance. For instance, considering effects of a vaccine
against a contagious disease, one person’s improved immunity to the disease reduces
the other persons’ chance of contracting the disease. Some people’s improved lifetime
earnings due to college education may have positive effects on other people’s lifetime
earnings. That is, the iid assumption does not allow ‘externality’ of the treatment,
and in this sense, the iid assumption restricts our treatment effect analysis to be
microscopic or of ‘partial equilibrium’ nature.
The effects of a large-scale treatment that has far-reaching consequences do not
fit our partial equilibrium framework. For example, a large-scale expensive job training
may have to be funded by a tax that may lead to a reduced demand for workers, which
in turn weakens the job-training effect. Findings from a small-scale job-training study
where the funding aspect could be ignored (thus, ‘partial equilibrium’) would not
apply to a large-scale job training where every aspect of the treatment would have to
be considered (i.e., ‘general equilibrium’). In the former, untreated people would not
be affected by the treatment. For them, their untreated state with the treatment given
to other people would be the same as their untreated state without the existence of
the treatment. In the latter, the untreated people would be affected indirectly by the
treatment (either by paying the tax or by the reduced demand for workers). For them,
their untreated state when the treatment is present would not be the same as their
untreated state in the absence of treatment.
As this example illustrates, a partial equilibrium analysis may exaggerate the
general equilibrium treatment effect that takes into account all consequences if there
is a negative externality. However, considering all the consequences would be too
ambitious and would require far more assumptions and models than is necessary in
partial equilibrium analysis. The gain in general equilibrium analysis could be negated
by false assumptions or misspecified models. In this book, therefore, we stick to
microscopic partial-equilibrium type analysis.
This chapter is an introduction to treatment effect analysis. We owe parts of this
chapter to Rubin (1974), Holland (1986), Rosenbaum (2002), Pearl (2009), and
other papers in the treatment effect literature, although it is often hard to point out
exactly which papers, as the origin of the treatment effect idea itself is unclear. The reader
will also benefit by consulting other books and papers on treatment effect analysis,
such as Shadish et al. (2002), Imbens and Wooldridge (2009), Pearl (2010), Morgan
and Winship (2014), and Imbens and Rubin (2015).
1.2 VARIOUS TREATMENT EFFECTS AND NO EFFECTS


1.2.1 Various Effects
The individual treatment effect (of Di on Yi ) is defined as

Yi1 − Yi0

which is, however, not identified. If there were two identical individuals, we might
assign them to treatment 0 and 1, respectively, to get Yi1 − Yi0 , but this is impossible.
The closest thing would be monozygotic (identical) twins who share the same genes
and are likely to grow up in similar environments. Even in this case, however, the
environments in their adult lives could be quite different. The study of twins is popular
in social sciences, and some examples will appear later where the inter-twin difference
is used for Yi1 − Yi0 .
Giving up on observing both Yi1 and Yi0 , i = 1, . . . , N, one may desire to know only
the joint distribution of (Y 0 , Y 1 ), which still is a difficult task. A less ambitious goal
would be to know the distribution of Y 1 −Y 0 , but even this is hard. Then we could look
for some aspects of the Y 1 − Y 0 distribution, and the most popular choice is the mean
effect E(Y 1 − Y 0 ). There are other effects, such as the median effect Med(Y 1 − Y 0 ) or
more generally the α quantile effect Qα (Y 1 − Y 0 ), where Med and Qα denote median
and α quantile, respectively; obviously, Q0.5 (·) = Med(·).
Instead of differences as in Y 1 − Y 0 , we may use ‘ratios’ to define effects (e.g., Lee
and Kobayashi 2001):
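\[ \frac{E(Y^1 - Y^0)}{E(Y^0)} = \frac{E(Y^1)}{E(Y^0)} - 1 \quad \text{(proportional effect relative to } E(Y^0)\text{)}; \]
\[ E\left(\frac{Y^1 - Y^0}{Y^0}\right) = E\left(\frac{Y^1}{Y^0}\right) - 1 \quad \text{(if } Y^0 \text{ does not take on 0)}. \]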
Replacing E(·) with Qα (·) yields a proportional effect relative to Qα (Y 0 ):
\[ \frac{Q_\alpha(Y^1)}{Q_\alpha(Y^0)} - 1 \quad \text{and} \quad Q_\alpha\left(\frac{Y^1}{Y^0}\right) - 1. \]
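The two ratio-based mean effects generally give different numbers. The following sketch computes both for two hypothetical individuals whose (Y 0 , Y 1 ) pairs are assumed purely for illustration; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

# Hypothetical (Y0, Y1) pairs for two individuals; Y0 never takes the value 0.
pairs = [(Fraction(1), Fraction(2)), (Fraction(4), Fraction(4))]


def mean(values):
    """Arithmetic mean of an iterable of Fractions."""
    values = list(values)
    return sum(values) / len(values)


# E(Y1)/E(Y0) - 1: proportional effect relative to E(Y0).
ratio_of_means = mean(y1 for _, y1 in pairs) / mean(y0 for y0, _ in pairs) - 1

# E(Y1/Y0) - 1: mean of the individual proportional effects.
mean_of_ratios = mean(y1 / y0 for y0, y1 in pairs) - 1

print("E(Y1)/E(Y0) - 1 =", ratio_of_means)
print("E(Y1/Y0) - 1    =", mean_of_ratios)
```

Here the first individual doubles the outcome while the second is unaffected, so the mean of individual ratios is pulled up more than the ratio of means.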
Despite many treatment effects, in practice, the mean effect is the most popular.
The popularity of the mean effect is owing to the important equation

E(Y 1 − Y 0 ) = E(Y 1 ) − E(Y 0 ) :

the mean of the difference Y 1 − Y 0 can be found from the two marginal means of the T
and C groups. This is thanks to the linearity of E(·), which does not hold in general for
other location measures; for example, Qα (Y 1 − Y 0 ) ≠ Qα (Y 1 ) − Qα (Y 0 ) in general.
To appreciate the difference between Qα (Y 1 −Y 0 ) and Qα (Y 1 )−Qα (Y 0 ), consider
Q0.5 (·) = Med(·) for an income policy:

Med(Y 1 − Y 0 ) > 0 : at least 50% of the population have Y 1 − Y 0 > 0;


Med(Y 1 ) − Med(Y 0 ) > 0 : the median person’s income increases.
For instance, imagine five persons ordered in terms of Y 0 . With D = 1, their income
changes such that the ordering of Y 1 ’s is the same as that of Y 0 ’s, and everybody but
the median person loses by one unit while the median person gains by four units:

Person:               1     2        3        4     5
Y 0 −→ Y 1 :          ←−    ←−    −→−→−→−→    ←−    ←−
Y 10 ≡ Y 1 − Y 0 :    −1    −1       4        −1    −1

In this case, Med(Y 1 − Y 0 ) = Med(Y 10 ) = −1, whereas Med(Y 1 ) − Med(Y 0 ) = 4
as the median person gains by four units. Due to this kind of difficulty, we focus
on E(Y 1 − Y 0 ) and its variations among many location measures of the Y 1 − Y 0
distribution.
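The five-person example can be checked numerically. In the sketch below, the income levels in y0 are assumed for illustration (the text specifies only the changes and that the ordering is preserved); the changes −1, −1, 4, −1, −1 are those of the example.

```python
import statistics

# Assumed baseline incomes, ordered by Y0; only the changes come from the example.
y0 = [10, 20, 30, 40, 50]
change = [-1, -1, 4, -1, -1]                  # Y1 - Y0 for persons 1..5
y1 = [a + b for a, b in zip(y0, change)]      # [9, 19, 34, 39, 49]

# The ordering of the Y1's is the same as that of the Y0's.
assert y1 == sorted(y1)

median_of_diff = statistics.median(change)
diff_of_medians = statistics.median(y1) - statistics.median(y0)
mean_of_diff = statistics.mean(change)
diff_of_means = statistics.mean(y1) - statistics.mean(y0)

print("Med(Y1 - Y0)      =", median_of_diff)
print("Med(Y1) - Med(Y0) =", diff_of_medians)
print("E(Y1 - Y0)        =", mean_of_diff)
print("E(Y1) - E(Y0)     =", diff_of_means)
```

The two median-based quantities disagree (−1 versus 4), whereas the two mean-based quantities coincide, as the linearity of E(·) guarantees.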
A generalization (or a specialization, depending on how one sees it) of the
(marginal) mean effect E(Y 1 − Y 0 ) is a conditional mean effect E(Y 1 − Y 0 |X = xo ),
where X = xo denotes a subpopulation characterized by the observed variables X
taking on xo (e.g., male, aged 30, college-educated, married). The conditional mean
effect shows that the treatment effect can be heterogeneous depending on X, which is
also said to be ‘treatment interacting with X’. It is also possible that the treatment effect
is heterogeneous depending on the unobservable ε.
For X-heterogeneous effects, we may present all effects as a function of X. Alternatively,
we may summarize the multiple heterogeneous effects with some summary
measures. A natural thing to look at would be a weighted average ∫ E(Y 1 − Y 0 |
X = x)ω(x)∂x of E(Y 1 − Y 0 |X = x) with the weight ω(x) being the population
density of X. If there is a reason to believe that a certain subpopulation is more
important than others, we could assign a higher weight to it. That is, there could be
many versions of marginal mean effect depending on the weighting function. We could
also use E{Y 1 − Y 0 |X = E(X)} instead of the integral. For ε-heterogeneous effects
E(Y 1 − Y 0 |ε), since ε is unobserved, ε has to be either integrated out or replaced with
a known number. Heterogeneous effects will appear from time to time, but thinking of
constant effects will make reading this book easier in most cases.
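A discrete analog may help to see how the summary measures for X-heterogeneous effects can differ. The sketch below assumes a covariate X taking three values with known probabilities and a conditional effect E(Y 1 − Y 0 |X = x) = x²; both are hypothetical choices made only so that the effect is nonlinear in x, and the integral becomes a sum.

```python
from fractions import Fraction

# Assumed population distribution of a discrete X (the weight omega).
density = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}


def conditional_effect(x):
    """Hypothetical E(Y1 - Y0 | X = x); nonlinear in x on purpose."""
    return Fraction(x) ** 2


def weighted_effect(weights):
    """Sum over x of E(Y1 - Y0 | X = x) * weight(x); weights should sum to 1."""
    return sum(conditional_effect(x) * w for x, w in weights.items())


# Weighting by the population distribution gives the marginal mean effect.
population_average = weighted_effect(density)

# Evaluating the conditional effect at E(X) is a different summary.
mean_x = sum(x * p for x, p in density.items())
effect_at_mean_x = conditional_effect(mean_x)

# Giving a higher weight to the subpopulation X = 2 yields yet another version.
emphasis = {0: Fraction(0), 1: Fraction(1, 4), 2: Fraction(3, 4)}
emphasized_average = weighted_effect(emphasis)

print(population_average, effect_at_mean_x, emphasized_average)
```

The three summaries differ here because the assumed conditional effect is nonlinear in x; with an effect linear in x, the first two would coincide.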

1.2.2 Three No-Effect Concepts


Having seen many different effects, one might ask what it means to have no treatment
effect, since it is possible to have a zero mean effect but a nonzero median effect, for
instance. The strongest version of no effect is Yi1 = Yi0 ∀i, which is analytically convenient
and used often in the literature. But for a “weighty” treatment (e.g., college education),
it is hard to imagine the response variable (e.g., lifetime earnings) being exactly the
same for all i with or without the treatment. Possibly the weakest version of no effect
is a zero location measure type such as E(Y 1 − Y 0 ) = 0 or Med(Y 1 − Y 0 ) = 0, where
Y 1 and Y 0 can differ considerably despite zero mean/median of Y 1 − Y 0 .
An appealing no treatment-effect concept is that Y 1 and Y 0 are exchangeable:
P(Y 0 ≤ y0 , Y 1 ≤ y1 ) = P(Y 0 ≤ y1 , Y 1 ≤ y0 ) ∀ y0 , y1
which allows a relation between Y 0 and Y 1 but implies the same marginal distribution.
For instance, if Y 0 and Y 1 are jointly normal with the same mean and variance, then
Y 0 and Y 1 are exchangeable. Another example is Y 0 and Y 1 being iid. Since Y 0 = Y 1
trivially implies exchangeability, exchangeability is weaker than Y 0 = Y 1 . Because
exchangeability implies the symmetry of Y 1 − Y 0 , exchangeability is stronger than the
zero mean/median of Y 1 − Y 0 . In short, the implication arrows of the three no-effect
concepts are

Y 0 = Y 1 =⇒ Y 0 and Y 1 exchangeable =⇒ zero mean/median of Y 1 − Y 0 .

Since the relation between Y 0 and Y 1 can never be identified, in practice, we
examine the main implication of exchangeability that Y 0 and Y 1 follow the same
distribution: F1 = F0 where Fd denotes the marginal distribution function of
Y d , d = 0, 1. When F1 = F0 means no effect, a positive effect can be defined with
stochastic dominance of F1 over F0 :

F1 (y) ≡ P(Y 1 ≤ y) ≤ P(Y 0 ≤ y) ≡ F0 (y) ∀y (with strict inequality holding for some y).

Here, Y 1 tends to be greater than Y 0 , meaning a positive treatment effect.
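With samples from the T and C groups, the dominance condition can be checked on empirical distribution functions. The sketch below is one simple way to do so, not a formal test: it ignores sampling error, and the function names ecdf and dominates are hypothetical. Because empirical distribution functions are step functions that jump only at observed values, comparing them at the pooled sample points is enough.

```python
import bisect


def ecdf(sample):
    """Return the empirical distribution function F(y) = share of sample <= y."""
    ordered = sorted(sample)
    n = len(ordered)

    def distribution(y):
        return bisect.bisect_right(ordered, y) / n

    return distribution


def dominates(sample1, sample0):
    """True if F1(y) <= F0(y) at all pooled points, with '<' at some point."""
    f1, f0 = ecdf(sample1), ecdf(sample0)
    grid = sorted(set(sample1) | set(sample0))
    weakly_below = all(f1(y) <= f0(y) for y in grid)
    strictly_somewhere = any(f1(y) < f0(y) for y in grid)
    return weakly_below and strictly_somewhere


print(dominates([2, 3, 4], [1, 2, 3]))   # Y1 sample shifted up by one unit
print(dominates([1, 2, 3], [1, 2, 3]))   # identical samples: F1 = F0
print(dominates([0, 5], [1, 2]))         # distribution functions cross
```

Only the first comparison satisfies the condition; identical samples correspond to F1 = F0 (no effect), and crossing distribution functions give no dominance in either direction.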


In some cases, only the marginal distributions of Y 0 and Y 1 matter. Suppose that
Y is income and U(·) is an income utility function. A social planner may prefer policy
1 to 0 if the mean utility under policy 1 is greater:
\[ \int_{-\infty}^{\infty} U(y)\,\partial F_0(y) \le \int_{-\infty}^{\infty} U(y)\,\partial F_1(y) \iff E\{U(Y^0)\} \le E\{U(Y^1)\}. \]

Here, the difference Y 1 − Y 0 is not a concern, nor is the joint distribution of (Y 0 , Y 1 );
instead, only the two marginal distributions matter.
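To see that the planner's comparison depends only on the marginals, the sketch below builds two hypothetical joint distributions of (Y 0 , Y 1 ) that share the same marginals but pair the outcomes differently. The income values and the logarithmic utility are assumptions made for illustration.

```python
import math
import statistics

# Two assumed joint distributions with identical marginals:
# Y0 takes values {1, 2, 3} and Y1 takes values {2, 3, 4} in both.
comonotone = [(1.0, 2.0), (2.0, 3.0), (3.0, 4.0)]
shuffled = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0)]


def mean_utility(values, utility=math.log):
    """E{U(Y)} under equal weights; log utility is an illustrative choice."""
    return statistics.fmean(utility(v) for v in values)


def summarize(joint):
    """Mean utilities under each policy and the sorted individual differences."""
    y0 = [a for a, _ in joint]
    y1 = [b for _, b in joint]
    return {
        "EU0": mean_utility(y0),
        "EU1": mean_utility(y1),
        "diffs": sorted(b - a for a, b in joint),
    }


first = summarize(comonotone)
second = summarize(shuffled)
print(first)
print(second)
```

Both pairings give the same E{U(Y 0 )} and E{U(Y 1 )}, so the planner's ranking is the same, even though the distributions of Y 1 − Y 0 differ.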
So long as we focus on the mean effect, E(Y 1 − Y 0 ) = 0 is the appropriate no-effect
concept. But there will be cases where a stronger version, Y 0 = Y 1 or F1 = F0 , is
adopted.

1.2.3 Remarks
The effects of a drug on health can be multidimensional, given the nature of health.
For instance, the benefit of a drug can be a lower cholesterol level, lower blood
pressure, lower blood sugar level, and so on, while the cost of the drug could be its
bad side effects. In another example, the benefits of a job training could be a shorter
unemployment duration or greater post-training wage, while the cost could be the
actual training cost and the opportunity cost of taking the training. Taking E(Y 1 − Y 0 )
as the treatment effect is different from the traditional cost-benefit analysis that tries to
account for all benefits and costs associated with the treatment. In E(Y 1 −Y 0 ), the goal
is much narrower, examining only one outcome measure instead of multiple outcomes.
The cost side is often ignored. If all benefits and costs could be converted into the same
monetary unit, however, and if Y is the net benefit (gross benefit minus cost), then the
treatment effect analysis would be the same as the cost-benefit analysis.
fantasy in her answers. She is literally radiant with youth,
imagination, and the joy of loving so passionately and being so
passionately beloved. And it is marvellous how thoroughly feminine
is her wit. Too many of the witty women in books written by men
have a man's intelligence. Rosalind's wit is tempered by feeling.
She has no monopoly of wit in this Arcadia of Arden. Every one in
the play is witty, even the so-called simpletons. It is a festival of wit.
At some points Shakespeare seems to have followed no stricter
principle than the simple one of making each interlocutor outbid the
other in wit (see, for example, the conversation between Touchstone
and the country wench whom he befools). The result is that the
piece is bathed in a sunshiny humour. And amid all the gay and airy
wit-skirmishes, amid the cooing love-duets of all the happy youths
and maidens, the poet intersperses the melancholy solos of his
Jaques:—
"I have neither the scholar's melancholy, which is emulation; nor the
musician's, which is fantastical; nor the courtier's, which is proud;
nor the soldier's, which is ambitious; nor the lawyer's, which is
politic; nor the lady's, which is nice; nor the lover's, which is all
these; but it is a melancholy of mine own, compounded of many
simples, extracted from many objects."
This is the melancholy which haunts the thinker and the great
creative artist; but in Shakespeare it as yet modulated with ease into
the most engaging and delightful merriment.

[1] Reprinted in Hazlitt's Shakespeare's Library, ed. 1875, part i. vol. ii.

XXIX

CONSUMMATE SPIRITUAL HARMONY—TWELFTH NIGHT—


JIBES AT PURITANISM—THE LANGUISHING CHARACTERS—
VIOLA'S INSINUATING GRACE—FAREWELL TO MIRTH

If the reader would picture to himself Shakespeare's mood during


this short space of time at the end of the old century and beginning
of the new, let him recall some morning when he has awakened with
the sensation of complete physical well-being, not only feeling no
definite or indefinite pain or uneasiness, but with a positive
consciousness of happy activity in all his organs: when he drew his
breath lightly, his head was clear and free, his heart beat peacefully:
when the mere act of living was a delight: when the soul dwelt on
happy moments in the past and dreamed of joys to come. Recall
such a moment, and then conceive it intensified an hundredfold—
conceive your memory, imagination, observation, acuteness, and
power of expression a hundred times multiplied—and you may divine
Shakespeare's prevailing mood in those days, when the brighter and
happier sides of his nature were turned to the sun.
There are days when the sun seems to have put on a new and festal
splendour, when the air is like a caress to the cheek, and when the
glamour of the moonlight seems doubly sweet; days when men
appear manlier and wittier, women fairer and more delicate than
usual, and when those who are disagreeable and even odious to us
appear, not formidable, but ludicrous—so that we feel ourselves
exalted above the level of our daily life, emancipated and happy.
Such days Shakespeare was now passing through.
It is at this period, too, that he makes sport of his adversaries the
Puritans without bitterness, with exquisite humour. Even in As You
Like It (iii. 2), we find a little allusion to them, where Rosalind says,
"O most gentle Jupiter!—what tedious homily of love have you
wearied your parishioners withal, and never cried, 'Have patience,
good people!'" In his next play, the typical, solemn, and self-
righteous Puritan is held up to ridicule in the Don Quixote-like
personage of the moralising and pompous Malvolio, who is launched
upon a billowy sea of burlesque situations. Of course the poet goes
to work with the greatest circumspection. Sir Toby has made some
inquiry about Malvolio, to which Maria answers (ii. 3):—
"Maria. Marry, sir, sometimes he is a kind of Puritan.
"Sir Andrew. O! if I thought that, I'd beat him like a dog.
"Sir Toby. What, for being a Puritan? thy exquisite reason, dear
knight?
"Sir And. I have no exquisite reason for't, but I have reason
good enough.
"Mar. The devil a Puritan that he is, or anything constantly but a
time-pleaser; an affectioned ass, that cons state without book,
and utters it by great swarths."

Not otherwise does Molière expressly insist that Tartuffe is not a


clergyman, and Holberg that Jacob von Tyboe is not an officer.
A forged letter, purporting to be written by his noble mistress, is
made to fall into Malvolio's hands, in which she begs for his love,
and instructs him, as a sign of his affection towards her, always to
smile, and to wear cross-gartered yellow stockings. He "smiles his
face into more lines than are in the new map [of 1598] with the
augmentation of the Indies;" he wears his preposterous garters in
the most preposterous fashion. The conspirators pretend to think
him mad, and treat him accordingly. The Clown comes to visit him
disguised in the cassock of Sir Topas the curate. "Well," says the
mock priest (not without intention on the poet's part), when Maria
gives him the gown, "I'll put it on, and I will dissemble myself in't;
and I would I were the first that ever dissembled in such a gown."
It is to Malvolio, too, that the merry and mellow Sir Toby, amid the
applause of the Clown, addresses the taunt:—

"Sir Toby. Dost thou think, because thou art virtuous, there shall
be no more cakes and ale?
"Clown. Yes, by Saint Anne; and ginger shall be hot i' the mouth
too."
In these words, which were one day to serve as a motto to Byron's
Don Juan, there lies a gay and daring declaration of rights.
Twelfth Night, or What you Will, must have been written in 1601, for
in the above-mentioned diary kept by John Manningham, of the
Middle Temple, we find this entry, under the date February 2, 1602:
"At our feast wee had a play called Twelve Night, or what you will,
much like the commedy of errores, or Menechmi in Plautus, but
most like and neere to that in Italian called Inganni. A good practise
in it to make the steward beleeve his lady widdowe was in love with
him," &c. That the play cannot have been written much earlier is
proved by the fact that the song, "Farewell, dear heart, since I must
needs be gone," which is sung by Sir Toby and the Clown (ii. 3), first
appeared in a song-book (The Booke of Ayres) published by Robert
Jones, London, 1601. Shakespeare has altered its wording very
slightly. In all probability Twelfth Night was one of the four plays
which were performed before the court at Whitehall by the Lord
Chamberlain's company at Christmastide, 1601-2, and no doubt it
was acted for the first time on the evening from which it takes its
name.
Among several Italian plays which bore the name of Gl'Inganni there
is one by Curzio Gonzaga, published in Venice in 1592, in which a
sister dresses herself as her brother and takes the name of Cesare—
in Shakespeare, Cesario—and another, published in Venice in 1537,
the action of which bears a general resemblance to that of Twelfth
Night. In this play, too, passing mention is made of one "Malevolti,"
who may have suggested to Shakespeare the name Malvolio.
The matter of the play is found in a novel of Bandello's, translated in
Belleforest's Histoires Tragiques; and also in Barnabe Rich's
translation of Cinthio's Hecatomithi, published in 1581, which
Shakespeare appears to have used. The whole comic part of the
action, and the characters of Malvolio, Sir Toby, Sir Andrew
Aguecheek, and the Clown, are of Shakespeare's own invention.
There occurs in Ben Jonson's Every Man out of his Humour a speech
which seems very like an allusion to Twelfth Night; but as Jonson's
play is of earlier date, the speech, if the allusion be not fanciful,
must have been inserted later.[1]
As was to be expected, Twelfth Night became exceedingly popular.
The learned Leonard Digges, the translator of Claudian, enumerating
in his verses, "Upon Master William Shakespeare" (1640), the poet's
most popular characters, mentions only three from the comedies,
and these from Much Ado and Twelfth Night. He says:—
"Let but Beatrice
And Benedicke be seene, loe in a trice
The Cockpit, Galleries, Boxes, all are full
To hear Malvoglio, that crosse garter'd Gull."
Twelfth Night is perhaps the most graceful and harmonious comedy
Shakespeare ever wrote. It is certainly that in which all the notes the
poet strikes, the note of seriousness and of raillery, of passion, of
tenderness, and of laughter, blend in the richest and fullest concord.
It is like a symphony in which no strain can be dispensed with, or
like a picture veiled in a golden haze, into which all the colours
resolve themselves. The play does not overflow with wit and gaiety
like its predecessor; we feel that Shakespeare's joy of life has
culminated and is about to pass over into melancholy; but there is
far more unity in it than in As You Like It, and it is a great deal more
dramatic.
A. W. Schlegel long ago made the penetrating observation that, in
the opening speech of the comedy, Shakespeare reminds us how the
same word, "fancy," was applied in his day both to love and to fancy
in the modern sense of the term; whence the critic argued, not
without ingenuity, that love, regarded as an affair of the imagination
rather than of the heart, is the fundamental theme running through
all the variations of the play. Others have since sought to prove that
capricious fantasy is the fundamental trait in the physiognomy of all
the characters. Tieck has compared the play to a great iridescent
butterfly, fluttering through pure blue air, and soaring in its golden
glory from the many-coloured flowers into the sunshine.
Twelfth Night, in Shakespeare's time, brought the Christmas
festivities of the upper classes to an end; among the common
people they usually lasted until Candlemas. On Twelfth Night all
sorts of sports took place. The one who chanced to find a bean
baked into a cake was hailed as the Bean King, chose himself a Bean
Queen, introduced a reign of unbridled frivolity, and issued whimsical
commands, which had to be punctually obeyed. Ulrici has sought to
discover in this an indication that the play represents a sort of
lottery, in which Sebastian, the Duke, and Maria chance to win the
great prize. The bibulous Sir Toby, however, can scarcely be
regarded as a particularly desirable prize for Maria; and the second
title of the play, What you Will, indicates that Shakespeare did not
lay any stress upon the Twelfth Night.
This comedy is connected by certain filaments with its predecessor,
As You Like It. The passion which Viola, in her male attire, awakens
in Olivia, reminds us of that with which Rosalind inspires Phebe. But
the motive is quite differently handled. While Rosalind gaily and
unfeelingly repudiates Phebe's burning love, Viola is full of tender
compassion for the lady whom her disguise has led astray. In the
admirably worked-up confusion between Viola and her twin brother
Sebastian, an effect from the Comedy of Errors is repeated; but the
different circumstances and method of treatment make this motive
also practically new.
With a careful and even affectionate hand, Shakespeare has
elaborated each one of the many characters in the play.
The amiable and gentle Duke languishes, sentimental and fancy-sick,
in hopeless enamourment. He is devoted to the fair Countess Olivia,
who will have nothing to say to him, and whom he none the less
besieges with his suit. An ardent lover of music, he turns to it for
consolation; and among the songs sung to him by the Clown and
others, there occurs the delicate little poem, of wonderful rhythmic
beauty, "Come away, come away, death." It exactly expresses the
soft and melting mood in which his days pass, lapped in a nerveless
melancholy. To the melody abiding in it we may apply the lovely
words spoken by Viola of the melody which preludes it:—
"It gives a very echo to the seat
Where love is throned."
In his fruitless passion, the Duke has become nervous and excitable,
inclined to violent self-contradictions. In one and the same scene (ii.
4) he first says that man's love is
"More giddy and unfirm,
More longing, wavering, sooner lost and worn"
than woman's; and then, a little further on, he says of his own love

"There is no woman's sides
Can bide the beating of so strong a passion
As love doth give my heart; no woman's heart
So big to hold so much: they lack retention."
The Countess Olivia forms a pendant to the Duke; she, like him, is
full of yearning melancholy. With an ostentatious exaggeration of
sisterly love, she has vowed to pass seven whole years veiled like a
nun, consecrating her whole life to sorrow for her dead brother. Yet
we find in her speeches no trace of this devouring sorrow; she jests
with her household, and rules it ably and well, until, at the first sight
of the disguised Viola, she flames out into passion, and, careless of
the traditional reserve of her sex, takes the most daring steps to win
the supposed youth. She is conceived as an unbalanced character,
who passes at a bound from exaggerated hatred for all worldly
things to total forgetfulness of her never-to-be-forgotten sorrow. Yet
she is not comic like Phebe; for Shakespeare has indicated that it is
the Sebastian type, foreshadowed in the disguised Viola, which is
irresistible to her; and Sebastian, we see, at once requites the love
which his sister had to reject. Her utterance of her passion,
moreover, is always poetically beautiful.
Yet while she is sighing in vain for Viola, she necessarily appears as
though seized with a mild erotic madness, similar to that of the
Duke: and the folly of each is parodied in a witty and delightful
fashion by Malvolio's entirely ludicrous love for his mistress, and vain
confidence that she returns it. Olivia feels and says this herself,
where she exclaims (iii. 4)—
"Go call him hither.—I am as mad as he
If sad and merry madness equal be."
Malvolio's figure is drawn in very few strokes, but with incomparable
certainty of touch. He is unforgetable in his turkey-like pomposity,
and the heartless practical joke which is played off upon him is
developed with the richest comic effect. The inimitable love-letter,
which Maria indites to him in a handwriting like that of the Countess,
brings to light all the lurking vanity in his nature, and makes his self-
esteem, which was patent enough before, assume the most
extravagant forms. The scene in which he approaches Olivia, and
triumphantly quotes the expressions in the letter, "yellow stockings,"
and "cross-gartered," while every word confirms her in the belief
that he is mad, is one of the most effective on the comic stage. Still
more irresistible is the scene (iv. 2) in which Malvolio is imprisoned
as a madman in a dark room, while the Clown outside now assumes
the voice of the Curate, and seeks to exorcise the devil in him, and
again, in his own voice, converses with the supposed Curate, sings
songs, and promises Malvolio to carry messages for him. We have
here a comic jeu de théâtre of the first order.
In harmony with the general tone of the play, the Clown is less witty
and more musical than Touchstone in As You Like It. He is keenly
alive to the dignity of his calling: "Foolery, sir, does walk about the
orb like the sun: it shines everywhere." He has many delightful
sayings, as for example, "Many a good hanging prevents a bad
marriage," or the following demonstration (v. 1) that one is the
better for one's foes, and the worse for one's friends:—
"Marry, sir, my friends praise me, and make an ass of me; now, my
foes tell me plainly I am an ass: so that by my foes, sir, I profit in
the knowledge of myself, and by my friends I am abused: so that,
conclusions to be as kisses, if your four negatives make your two
affirmatives, why then, the worse for my friends, and the better for
my foes."
Shakespeare even departs from his usual practice, and, as though to
guard against any misunderstanding on the part of his public, makes
Viola expound quite dogmatically that it "craves a kind of wit" to play
the fool (iii. 1):—
"He must observe their mood on whom he jests,
The quality of persons, and the time,
And, like the haggard, check at every feather
That comes before his eye. This is a practice
As full of labour as a wise man's art."
The Clown forms a sort of connecting-link between the serious
characters and the exclusively comic figures of the play—the pair of
knights, Sir Toby Belch and Sir Andrew Aguecheek, who are entirely
of Shakespeare's own invention. They are sharply contrasted. Sir
Toby, sanguine, red-nosed, burly, a practical joker, always ready for
"a hair of the dog that bit him," a figure after the style of Bellman;[2]
Sir Andrew, pale as though with the ague, with thin, smooth, straw-
coloured hair, a wretched little nincompoop, who values himself on
his dancing and fencing, quarrelsome and chicken-hearted, boastful
and timid in the same breath, and grotesque in his every movement.
He is a mere echo and shadow of the heroes of his admiration, born
to be the sport of his associates, their puppet, and their butt; and
while he is so brainless as to think it possible he may win the love of
the beautiful Olivia, he has at the same time an inward suspicion of
his own stupidity which now and then comes in refreshingly:
"Methinks sometimes I have no more wit than a Christian or an
ordinary man has; but I am a great eater of beef, and, I believe, that
does harm to my wit" (i. 3). He does not understand the simplest
phrase he hears, and is such a mere reflex and parrot that "I too" is,
as it were, the watchword of his existence. Shakespeare has
immortalised him once for all in his reply when Sir Toby boasts that
Maria adores him (ii. 3), "I was adored once too." Sir Toby sums him
up in the phrase:
"For Andrew, if he were opened, and you find so much blood in his
liver as will clog the foot of a flea, I'll eat the rest of the anatomy."
The central character in Twelfth Night is Viola, of whom her brother
does not say a word too much when, thinking that she has been
drowned, he exclaims, "She bore a mind that envy could not but call
fair."
Shipwrecked on the coast of Illyria, her first wish is to enter the
service of the young Countess; but learning that Olivia is
inaccessible, she determines to dress as a page (a eunuch) and
approach the young unmarried Duke, of whom she has heard her
father speak with warmth. He at once makes the deepest impression
upon her heart, but being ignorant of her sex, does not dream of
what is passing within her; so that she is perpetually placed in the
painful position of being employed as a messenger from the man
she loves to another woman. She gives utterance to her love in
carefully disguised and touching words (ii. 4):—
"My father had a daughter lov'd a man,
As it might be, perhaps, were I a woman,
I should your lordship.
Duke. And what's her history?
Vio. A blank, my lord. She never told her love,—
But let concealment, like a worm i' the bud,
Feed on her damask cheek: she pin'd in thought:
And, with a green and yellow melancholy,
She sat like Patience on a monument,
Smiling at grief."
But the passion which possesses her makes her a more eloquent
messenger of love than she designs to be. To Olivia's question as to
what she would do if she loved her as her master does, she answers
(i. 5):—
"Make me a willow cabin at your gate,
And call upon my soul within the house;
Write loyal cantons of contemned love,
And sing them loud even in the dead of night;
Holla your name to the reverberate hills,
And make the babbling gossip of the air
Cry out, Olivia! O! you should not rest
Between the elements of air and earth,
But you should pity me."
In short, if she were a man, she would display all the energy which
the Duke lacks. No wonder that, against her own will, she awakens
Olivia's love. She herself, as a woman, is condemned to passivity;
her love is wordless, deep, and patient. In spite of her sound
understanding, she is a creature of emotion. It is a very
characteristic touch when, in the scene (iii. 4) where Antonio, taking
her for Sebastian, recalls the services he has rendered, and begs for
assistance in his need, she exclaims that there is nothing, not even
"lying vainness, babbling drunkenness, or any taint of vice," that she
hates so much as ingratitude. However bright her intelligence, her
soul from first to last outshines it. Her incognito, which does not
bring her joy as it does to Rosalind, but only trouble and sorrow,
conceals the most delicate womanliness. She never, like Rosalind or
Beatrice, utters an audacious or wanton word. Her heart-winning
charm more than makes up for the high spirits and sparkling humour
of the earlier heroines. She is healthful and beautiful, like these her
somewhat elder sisters; and she has also their humorous eloquence,
as she proves in her first scene with Olivia. Yet there rests upon her
lovely figure a tinge of melancholy. She is an impersonation of that
"farewell to mirth" which an able English critic discerns in this last
comedy of Shakespeare's brightest years.[3]

[1] There is some (ironic) discussion of a possible criticism that might be brought
against a playwright: "That the argument of his comedy might have been of some
other nature, as of a duke to be in love with a countess, and that countess to be
in love with the duke's son, and the son to love the lady's waiting-maid; some
such cross wooing, with a clown to their serving-man...."
[2] See footnote 7 in chapter XXII:
"A dance of all the gods upon Olympus,
With fauns and graces and the muses twined."
From a poem by Tegnér on Bellman, the Swedish convivial lyrist.
[3] "It is in some sort a farewell to mirth, and the mirth is of the finest quality, an
incomparable ending. Shakespeare has done greater things, but he has never
done anything more delightful."—Arthur Symons.

XXX

THE REVOLUTION IN SHAKESPEARE'S SOUL—THE
GROWING MELANCHOLY OF THE FOLLOWING PERIOD—
PESSIMISM, MISANTHROPY

For the time is now approaching when mirth, and even the joy of
life, are extinguished in his soul. Heavy clouds have massed
themselves on his mental horizon—their nature we can only divine—
and gnawing sorrows and disappointments have beset him. We see
his melancholy growing and extending; we observe its changing
expressions, without knowing its causes. This only we know, that the
stage which he contemplates with his mind's eye, like the material
stage on which he works, is now hung with black. A veil of
melancholy descends over both.
He no longer writes comedies, but sends a train of gloomy tragedies
across the boards which so lately echoed to the laughter of Beatrice
and Rosalind.
From this point, for a certain period, all his impressions of life and
humanity become ever more and more painful. We can see in his
Sonnets how even in earlier and happier years a restless
passionateness had been constantly at war with the serenity of his
soul, and we can note how, at this time also, he was subject to
accesses of stormy and vehement unrest. As time goes on, we can
discern in the series of his dramas how not only what he saw in
public and political life, but also his private experience, began to
inspire him, partly with a burning compassion for humanity, partly
with a horror of mankind as a breed of noxious wild animals, partly,
too, with loathing for the stupidity, falsity, and baseness of his
fellow-creatures. These feelings gradually crystallise into a large and
lofty contempt for humanity, until, after a space of eight years,
another revolution occurs in his prevailing mood. The extinguished
sun glows forth afresh, the black heaven has become blue again,
and the kindly interest in everything human has returned. He attains
peace at last in a sublime and melancholy clearness of vision. Bright
moods, sunny dreams from the days of his youth, return upon him,
bringing with them, if not laughter, at least smiles. High-spirited
gaiety has for ever vanished; but his imagination, feeling itself less
constrained than of old by the laws of reality, moves lightly and at
ease, though a deep earnestness now underlies it, and much
experience of life.
But this inward emancipation from the burthen of earthly life does
not occur, as we have said, until about eight years after the point
which we have now reached.
For a little time longer the strong and genial joy of life is still
dominant in his mind. Then it begins to darken, and, after a short
tropical twilight, there is night in his soul and in all his works.
In the tragedy of Julius Cæsar there still reigns only a manly
seriousness. The theme seems to have attracted him on account of
the analogy between the conspiracy against Cæsar and the
conspiracy against Elizabeth. Despite the foolish precipitancy of their
action, the leaders of this conspiracy, men like Essex and his
comrade Southampton, had Shakespeare's full personal sympathy;
and he transferred some of that sympathy to Brutus and Cassius. He
created Brutus under the deeply-imprinted conviction that
unpractical magnanimity, like that of his noble friends, is unfitted to
play an effective part in the drama of history, and that errors of
policy revenge themselves at least as sternly as moral delinquencies.
In Hamlet Shakespeare's growing melancholy and bitterness take the
upper hand. For the hero, as for the poet, youth's bright outlook
upon life has been overclouded. Hamlet's belief and trust in mankind
have gone to wreck. Under the disguise of apparent madness, the
melancholy life-lore which Shakespeare, at his fortieth year, had
stored up within him, here finds expression in words of spiritual
profundity such as had not yet been thought or uttered in Northern
Europe.
We catch a glimpse at this point of one of the subsidiary causes of
Shakespeare's melancholy. As actor and playwright he stands in a
more and more strained relation to the continually growing Free
Church movement of the age, to Puritanism, which he comes to
regard as nothing but narrow-mindedness and hypocrisy. It was the
deadly enemy of his calling; it secured, even in his lifetime, the
prohibition of theatrical performances in the provinces, a prohibition
which after his death was extended to the capital. From Twelfth
Night onwards, an unremitting war against Puritanism, conceived as
hypocrisy, is carried on through Hamlet, through the revised version
of All's Well that Ends Well, and through Measure for Measure, in
which his wrath rises to a tempestuous pitch, and creates a figure to
which Molière's Tartuffe can alone supply a parallel.
What struck him so forcibly in these years was the pitifulness of
earthly life, exposed as it is to disasters, not allotted by destiny, but
brought about by a conjunction of stupidity with malevolence.
It is especially the power of malevolence that now looms large
before his eyes. We see this in Hamlet's astonishment that it is
possible for a man "to smile and smile and be a villain." Still more
strongly is it apparent in Measure for Measure (v. 1):—
"Make not impossible
That which but seems unlike. 'Tis not impossible,
But one, the wicked'st caitiff on the ground,
May seem as shy, as grave, as just, as absolute,
As Angelo; even so may Angelo,
In all his dressings, characts, titles, forms,
Be an arch-villain."
It is this line of thought that leads to the conception of Iago, Goneril,
and Regan, and to the wild outbursts of Timon of Athens.
Macbeth is Shakespeare's first attempt, after Hamlet, to explain the
tragedy of life as a product of brutality and wickedness in
conjunction—that is, of brutality multiplied and raised to the highest
power by wickedness. Lady Macbeth poisons her husband's mind.
Wickedness instils drops of venom into brutality, which, in its inward
essence, may be either weakness, or brave savagery, or stupidity of
manifold kinds. Whereupon brutality falls a-raving, and becomes
terrible to itself and others.
The same formula expresses the relation between Othello and Iago.
Othello was a monograph. Lear is a world-picture. Shakespeare
turns from Othello to Lear in virtue of the artist's need to
supplement himself, to follow up every creation with its counterpart
or foil.
Lear is the greatest problem Shakespeare had yet proposed to
himself, all the agonies and horrors of the world compressed into
five short acts. The impression of Lear may be summed up in the
words: a world-catastrophe. Shakespeare is no longer minded to
depict anything else. What is echoing in his ears, what is filling his
mind, is the crash of a ruining world.
This becomes even clearer in his next play, Antony and Cleopatra.
This subject enabled him to set new words to the music within him.
In the history of Mark Antony he saw the deep downfall of the old
world-republic—the might of Rome, austere and rigorous, collapsing
at the touch of Eastern luxury.
By the time Shakespeare had written Antony and Cleopatra, his
melancholy had deepened into pessimism. Contempt becomes his
abiding mood, an all-embracing scorn for mankind, which
impregnates every drop of blood in his veins, but a potent and
creative scorn, which hurls forth thunderbolt after thunderbolt.
Troilus and Cressida strikes at the relation of the sexes, Coriolanus at
political life; until all that, in these years, Shakespeare has endured
and experienced, thought and suffered, is concentrated into the one
great despairing figure of Timon of Athens, "misanthropos," whose
savage rhetoric is like a dark secretion of clotted blood and gall,
drawn off to assuage pain.

BOOK SECOND

INTRODUCTION—THE ENGLAND OF ELIZABETH IN
SHAKESPEARE'S YOUTH

Everything had flourished in the England of Elizabeth while
Shakespeare was young. The sense of belonging to a people which,
with great memories and achievements behind it, was now making a
decisive and irresistible new departure—the consciousness of living
in an age when the glorious culture of antiquity was being
resuscitated, and when great personalities were vindicating for
England a lofty and assured position, alike in the practical and in the
intellectual departments of life—these feelings mingled in his breast
with the vernal glow of youth itself. He saw the star of his fatherland
ascending, with his own star in its train.
It seemed to him as though men and women had in that day richer
abilities, a more daring spirit, and fuller powers of enjoyment than
they had possessed in former times. They had more fire in their
blood, more insatiable longings, a keener appetite for adventure,
than the men and women of the past. They knew how to rule with
courage and wisdom, like the Queen and Lord Burghley; how to live
nobly and fight gloriously, to love with passion and sing with
enthusiasm, like the beautiful hero of the younger generation, Sir
Philip Sidney, who found an early Achilles-death. They were bent on
enjoying existence with all their senses, comprehending it with all
their powers, revelling in wealth and splendour, in beauty and wit; or
they set forth to voyage round the world, to see its marvels, conquer
its treasures, give their names to new countries, and display the flag
of England on unknown seas.
Statesmanship and generalship were represented among them by
the men who, in these years, had humbled Spain, rescued Holland,
held Scotland in awe. They were sound and vigorous natures.
Although they all had the literary proclivities of the Renaissance,
they were before everything practical men, keen observers of the
signs of the times, firm and wary in adversity, in prosperity prudent
and temperate.
Shakespeare had seen Spenser's faithful friend, Sir Walter Raleigh,
next to himself and Francis Bacon the most brilliant and interesting
Englishman of his day, after covering himself with renown as a
soldier, a viking, and a discoverer, win the favour of Elizabeth as a
courtier, and the admiration of the people as a hero and poet.
Shakespeare no doubt laid to heart these lines in his elegy on
Sidney:—
"England doth hold thy limbs, that bred the same;
Flanders thy valour, where it last was tried;
The camp thy sorrow, where thy body died:
Thy friends thy want; the world thy virtues' fame."
For Raleigh, too, was a poet, as well as an orator and historian. "We
picture him to ourselves," says Macaulay, "sometimes reviewing the
Queen's guard, sometimes giving chase to a Spanish galleon, then
answering the chiefs of the country party in the House of Commons,
then again murmuring one of his sweet love-songs too near the ears
of her Highness's maids of honour, and soon after poring over the
Talmud, or collating Polybius with Livy."[1]
And Shakespeare had seen the young Robert Devereux, Earl of
Essex, who in 1577, when only ten years old, had made a sensation
at court by wearing his hat in the Queen's presence and denying her
request for a kiss; at the age of eighteen win renown for himself as
a cavalry general under Leicester in the Netherlands, and at the age
of twenty depose Raleigh from the highest place in Elizabeth's
favour. He played "cards or one game or another with her ... till
birds sing in the morning." She shut herself up with him in the
daytime, while the Venetian and French ambassadors, who had
already learnt to wait at locked doors in the time of his step-father,
Leicester, jested with each other in the anteroom as to whether
mounting guard in this fashion ought to be called tener la mula or
tenir la chandelle. And Essex demanded that Raleigh should be
sacrificed to his youthful devotion. As captain of the guard, Raleigh
had to stand at the door with a drawn sword, in his brown and
orange uniform, while the handsome youth whispered to the spinster
Queen of fifty-four, things which set her heart beating. He made all
the mischief he could between her and Raleigh. She assured him
that he had no reason to "disdain" a man like that. But Essex asked
her—so he himself writes—"Whether he could have comfort to give
himself over to the service of a mistress that was in awe of such a
man;" "and," he continues, "I think he, standing at the door, might
very well hear the worst I spoke of him."
This impetuosity characterised Essex throughout his career; but he
soon developed great qualities, of which his first appearances gave
no promise; and when Shakespeare made his acquaintance,
probably in the year 1590, his personality must have been extremely
winning. Himself a poet, he no doubt knew how to value A
Midsummer Night's Dream, and its author. In all probability,
Shakespeare even at this time found a protector in the young
nobleman, and afterwards made acquaintance through him with his
kinsman Southampton, six years younger than himself. Essex had
already distinguished himself as a soldier. In May 1589 he had been
the first Englishman to wade ashore upon the coast of Portugal, and
in the lines before Lisbon he had challenged any of the Spanish
garrison to single combat in honour of his queen and mistress. In
July 1591 he joined the standard of Henry of Navarre with an
auxiliary force of 4000 men; he shared all the hardships of the
common soldiers; during the siege of Rouen he challenged the
leader of the enemy's forces to single combat; and then by his
incapacity he dissipated all the results of the campaign. His army
melted away to almost nothing.
He was at home during the following years, when Shakespeare
probably came to know him well, and to appreciate his chivalrous
nature, his courage and talent, his love of poetry and science, and
his helpfulness towards men of ability, such as Francis Bacon and
others. He therefore, no doubt, followed with more than the ordinary
patriotic interest the expedition of the English fleet to Cadiz in 1596,
in which the two old antagonists, Raleigh and Essex, were to fight
side by side. Raleigh here won a brilliant victory over the great
galleons of the Spanish fleet, burning them all except two, which he
captured; while on the following day, when a severe wound in the
leg prevented Raleigh from taking part in the action, Essex, at the
head of his troops, stormed and sacked the town of Cadiz. In his
despatches to Elizabeth, Raleigh praised Essex for this exploit. He
became the hero of the day; his name was in every mouth, and he
was even eulogised from the pulpit of St. Paul's.
It was indeed a great age. England's world-wide power was founded
at the expense of defeated and humiliated Spain; England's world-
wide commerce and industry came into existence. Before Elizabeth
came to the throne, Antwerp had been the metropolis of commerce;
during her reign, London took that position. The London Exchange
was opened in 1571; and twenty years later, English merchants all
the world over had appropriated to themselves the commerce which
had formerly been almost entirely in the hands of the Hanseatic
Towns. London urchins hung about the wharves of the Thames,
listening to the marvels related by seamen who had made the
voyage round the Cape of Good Hope to Hindostan. Sunburnt,
scarred, and bearded men haunted the taverns; they had crossed
the ocean, lived in the Bermuda Islands, and brought negroes and
Red Indians and great monkeys home with them. They told tales of
the golden Eldorado, and of real and imaginary perils in distant
quarters of the globe.
This peaceful development of commerce and industry had taken
place simultaneously with the development of naval and military
power. And the scientific and poetical culture of England advanced
with equal strides. While mariners had brought home tidings of
many an unknown shore, scholars also had made voyages of
discovery in Greek and Roman letters; and while they praised and
translated authors unheard of before, dilettanti brought forward and
interpreted Italian and Spanish poets who served as models of
invention and delicacy. The world, which had hitherto been a little
place, had suddenly grown vast; the horizon, which had been
narrow, widened out all of a sudden, and every mind was filled with
hopes for the days to come.
It had been a vernal season, and it was a vernal mood that had
uttered itself in the songs of the many poets. In our days, when the
English language is read by hundreds of millions, the poets of
England may be quickly counted. In those days the country
possessed something like three hundred lyric and dramatic poets,
who, with potent productivity, wrote for a reading public no larger
than that of Denmark to-day; for of the six millions of the
population, four millions could not read. But the talent for writing
verses was as widespread among the Englishmen of that time as the
talent for playing the piano among German ladies of to-day. The
power of action and the gift of song did not exclude each other.
But the blossoming springtide had been short, as springtide always
is.

[1] Macaulay, Essays—"Burleigh and his Times."


II

ELIZABETH'S OLD AGE

At the dawn of the new century the national mood had already
altered.
Elizabeth herself was no longer the same. There had always been a
dark side to her nature, but it had passed almost unnoticed in the
splendour which national prosperity, distinguished men, great
achievements and fortunate events had shed around her person.
Now things were changed.
She had always been excessively vain; but her coquettish pretences
to youth and beauty reached their height after her sixtieth year. We
have seen how, when she was sixty, Raleigh, from his prison,
addressed a letter to Sir Robert Cecil, intended for her eyes, in which
he sought to regain her favour by comparing her to Venus and
Diana. When she was sixty-seven, Essex's sister, in a supplication for
her brother's life, wrote of that brother's devotion to "her beauties,"
which did not merit so hard a punishment, and of her "excellent
beauties and perfections," which "ought to feel more compassion."
In the same year the Queen took part, masked, in a dance at Lord
Herbert's marriage; and she always looked for expressions of
flattering astonishment at the youthfulness of her appearance.
When she was sixty-eight, Lord Mountjoy wrote to her of her "faire
eyes," and begged permission to "fill his eyes with their onely deere
and desired object." This was the style which every one had to
adopt who should have the least prospect of gaining, preserving, or
regaining her favour.
In 1601 Lord Pembroke, then twenty-one years old, writes to Cecil
(or, in other words, to Elizabeth, in her sixty-eighth year) imploring
permission once more to approach the Queen, "whose incomparable
beauty was the onely sonne of my little world."
When Sir Roger Aston, about this time, was despatched with letters
from James of Scotland to the Queen, he was not allowed to deliver
them in person, but was introduced into an ante-chamber from
which, through open door-curtains, he could see Elizabeth dancing
alone to the music of a little violin,—the object being that he should
tell his master how youthful she still was, and how small the
likelihood of his succeeding to her crown for many a long day.[1] One
can readily understand, then, how she stormed with wrath when
Bishop Rudd, so early as 1596, quoted in a sermon Kohélet's verses
as to the pains of age, with unmistakable reference to her.
She was bent on being flattered without ceasing and obeyed without
demur. In her lust of rule, she knew no greater pleasure than when
one of her favourites made a suggestion opposed to one of hers,
and then abandoned it. Leicester had employed this means of
confirming himself in her favour, and had bequeathed it to his
successors. So strong was her craving to enjoy incessantly the
sensation of her autocracy, that she would intrigue to set her
courtiers up in arms against each other, and would favour first one
group and then the other, taking pleasure in their feuds and cabals.
In her later years her court was one of the most corrupt in the
world. The only means of prospering in it were those set forth in
Roger Ascham's distich:
"Cog, lie, flatter and face
Four ways in court, to win men grace."
The two main parties were those of Cecil and Essex. Whoever
gained the favour of one of these great lords, be his merits what
they might, was opposed by the other party with every weapon in
their power.
In some respects, however, Elizabeth in her later years had made
progress in the art of government. So weak had been her faith in the
warlike capabilities of her country, and so potent, on the other hand,
her avarice, that she had neglected to make preparation for the war
with Spain, and had left her gallant seamen inadequately equipped;
but after the victory over the Spanish Armada she ungrudgingly
devoted all the resources of her treasury to the war, which survived
her and extended well into the following century. This war had
forced Elizabeth to take a side in the internal religious dissensions of
the country. She was the head of the Church, regarded ecclesiastical
affairs as subject to her personal control, and, so far as she was
able, would suffer no discussion of religious questions in the House
of Commons. Like her contemporary Henri Quatre of France, she
was in her heart entirely indifferent to religion, had a certain general
belief in God, but thought all dogmas mere cobwebs of the brain,
and held one rite neither better nor worse than another. They both
regarded religious differences exclusively from the political point of
view. Henry ended by becoming a Catholic and assuring his former
co-religionists freedom of conscience. Elizabeth was of necessity a
Protestant, but tolerance was an unknown doctrine in England. It
was an established principle that every subject must accept the
religion of the State.
Authoritarian to her inmost fibre, Elizabeth had a strong bent
towards Catholicism. The circumstances of her life had placed her in
opposition to the Papal power, but she was fond of describing herself
to foreign ambassadors as a Catholic in all points except subjection
to the Pope. She did not even make any secret of her contempt for
Protestantism, whose head she was, and whose support she could
not for a moment dispense with. She felt it a humiliation to be
regarded as a co-religionist of the French, Scotch, or Dutch heretics.
She looked down upon the Anglican Bishops whom she had herself
appointed, and they, in their worldliness, deserved her scorn. But
still deeper was her detestation of all sectarianism within the limits
of her Church, and especially of Puritanism in all its forms. If she did
not in the first years of her reign indulge in open persecution of the
Puritans, it was only because she was as yet dependent on their
support; but as soon as she felt herself firmly seated on her throne,
she established, in spite of the stiff-necked opposition of Parliament,
the jurisdiction of the Bishops on all matters of ecclesiastical politics,
and suffered Puritan writers to be condemned to death or lifelong