
Extreme Changes in Changes∗

Yuya Sasaki† and Yulong Wang‡


arXiv:2211.14870v2 [econ.EM] 20 May 2023

Abstract

Policy analysts are often interested in treating the units with extreme outcomes,

such as infants with extremely low birth weights. Existing changes-in-changes (CIC)

estimators are tailored to middle quantiles and do not work well for such subpopulations.

This paper proposes a new CIC estimator to accurately estimate treatment

effects at extreme quantiles. Having established its asymptotic normality, we also propose a method

of statistical inference that is simple to implement. Based on simulation studies, we

recommend using our extreme CIC estimator for extreme quantiles, such as those below 5% and above

95%, and the conventional CIC estimator for intermediate

quantiles. Applying the proposed method, we study the effects of income gains from

the 1993 EITC reform on infant birth weights for those in the most critical conditions.

This paper is accompanied by a Stata command.

Keywords: quantile treatment effect, extreme quantile, Pareto exponent

JEL Code: C21


∗ We thank the editor, associate editor, two anonymous referees, Alfonso Flores-Lagunes, Hilary Hoynes,
Doug Miller, and David Simon for useful advice about our empirical application. We benefited from
discussions with Jon Roth. All remaining errors are ours. A Stata command, ecic (extreme changes in
changes), associated with this paper can be installed from the SSC archive with the following command line:
ssc install ecic.

† Associate professor of economics, Vanderbilt University. Email: [email protected]

‡ Assistant professor of economics, Syracuse University. Email: [email protected].

1 Introduction

The difference-in-differences (DID) approach is a widely employed empirical strategy for

program evaluation in the presence of policy events in time. The common DID methods

critically depend on parallel trend assumptions and focus on identifying (conditional) average

effects. An alternative empirical strategy is the changes in changes (CIC) method proposed

by Athey and Imbens (2006). At the cost of alternative distributional assumptions, the

CIC gets around the common trend assumption and further can identify distributions of

counterfactual outcomes as opposed to just their averages. Thus, CIC can be used to analyze

heterogeneous individuals via quantile treatment effects under rank invariance.

As is the case with other quantile-based estimands, however, the existing CIC estimator

only works for intermediate quantiles in theory. In practice, such an

estimator is accurate for intermediate quantile levels such as q ∈ (0.05, 0.95), that is, between the

fifth and the ninety-fifth percentiles. This limitation of the existing CIC estimator rules out

causal inference for those individuals at the extreme top and extreme bottom quantiles.

Yet, it is sometimes rather at extreme quantiles that treatment effects are more relevant

to social policy analysis. For instance, policymakers often care about treating economically

disadvantaged subpopulations like the poorest individuals characterized by the limit q → 0.

The treatment effect for these tail subpopulations could be substantially larger than that

for the mid-sample subpopulations, and hence it is imperative for such policymakers to have

methods with which they can accurately assess treatment effects at the subpopulations in

the tail.

In this paper, we propose an alternative CIC estimator that more accurately estimates the

treatment effects at the tails, technically in the limits as q → 0 and q → 1. We also develop

asymptotic normality for this estimator and propose an easy-to-construct confidence interval.

Based on our simulation studies, we provide the following practical recommendation. For the

intermediate quantiles, use the existing estimator by Athey and Imbens (2006) along with

its standard error. For the extreme quantiles, on the other hand, use our proposed estimator

along with its standard error. We suggest using the log-log plot to choose the switching

point and demonstrate a combined use of both estimators in our empirical application.

With the proposed econometric method, we revisit the study by Hoynes, Miller, and Si-

mon (2015) in which they use the 1993 event of EITC reform to evaluate the effects of income

gains on infant birth weights. While they analyze average effects via the DID, we focus on

the effects at the low quantiles to see if such income gains can improve infant birth weights,

particularly for those at the most critical birth weight conditions. This empirical question

is of interest because low infant birth weight is known to have long-lasting impacts on the

health and economic well-being in adulthood (e.g., Currie, 2011) as well as an immediate

impact on infant mortality.

Literature. In contrast to the nowadays extensive body of literature on DID, the liter-

ature on CIC is relatively thin. Since its first proposal by Athey and Imbens (2006), the

CIC framework has been extended to fuzzy treatment assignments (de Chaisemartin and

D’Haultfœuille, 2014), models with covariates (Melly and Santangelo, 2015), continuous

treatments (D’Haultfœuille, Hoderlein, and Sasaki, 2022), and correction of attrition bias

(Ghanem, Hirshleifer, Kedagni, and Ortiz-Becerra, 2022). To the best of our knowledge,

however, no preceding paper investigates extreme quantiles in the context of CIC, despite the

aforementioned policy relevance. On the other hand, there are a few papers that investigate

treatment effects at extreme quantiles outside the context of CIC – see Chernozhukov (2005),

Chernozhukov and Fernández-Val (2011), D’Haultfœuille, Maurel, and Zhang (2018), Zhang

(2018), and Deuber, Li, Engelke, and Maathuis (2021) to list but a few. None of the existing

papers on extremal treatment effects consider DID or CIC frameworks.

Organization. Section 2 provides a review of CIC. Section 3 introduces the proposed

method, and Section 4 derives asymptotic properties. Section 5 discusses some practical

issues, and Section 6 extends the proposed method to allow for covariates. Section 7 shows

simulation studies, and Section 8 presents the empirical application. Section 9 presents

additional simulation results calibrated to the empirical dataset, and Section 10 concludes.

Stata Command. This paper is accompanied by a Stata command, ecic (extreme changes

in changes). The package can be installed from SSC archive with the following command

line: ssc install ecic. After the installation, run help ecic for usage of the command.

2 The Changes in Changes

This section briefly reviews the CIC estimator following Athey and Imbens (2006). The goals

here are to introduce the data-generating model and the treatment parameter of interest, as

well as to fix notations to be used in the rest of this paper.

Individual i belongs to group Gi ∈ {0, 1}, where value of 0 (respectively, 1) indicates

the control (respectively, treatment) group. Each individual is observed in one of the two

time periods T i ∈ {0, 1}. For each draw i = 1, ..., n from the population, the group identity

Gi and time period T i are treated as random variables. Letting Y i denote a continuous

outcome, econometricians observe a random sample of (Y i , Gi , T i ).

The underlying structure to generate Y i is as follows. Let YNi (respectively, YIi ) denote

the potential outcome for individual i under no treatment (respectively, under treatment).

The potential outcome YN under no treatment is generated by

YN = h (U, T ) , (1)

where U represents unobserved characteristics, h (·, t) is strictly increasing for each t ∈ {0, 1},

and U ⊥ T |G. Let I i ∈ {0, 1} indicate that individual i receives a treatment. In the two-

group-two-period setting, we have I i = Gi T i . The realized outcome Y i is generated by

\[ Y^i = Y^i_N \, (1 - I^i) + Y^i_I \cdot I^i . \]

We now introduce the following short-hand notations:

\[ Y^N_{gt} \sim Y_N \mid G = g, T = t, \]

\[ Y^I_{gt} \sim Y_I \mid G = g, T = t, \]

\[ Y_{gt} \sim Y \mid G = g, T = t. \]

For any distribution function F , we define its left-inverse F −1 by F −1 (q) = inf{y : F (y) ≥

q}. In this setup and with these notations, Athey and Imbens (2006, Theorem 3.1) establish

\[ F_{Y^N_{11}}(y) = F_{Y_{10}} \circ F^{-1}_{Y_{00}} \circ F_{Y_{01}}(y) \]

for all y, provided that the support of U | G = 1 is a subset of the support of U | G = 0.

For each quantile q ∈ (0, 1), the quantile effect of the treatment is thus identified by

\[ \tau^{CIC}_q := F^{-1}_{Y^I_{11}}(q) - F^{-1}_{Y^N_{11}}(q) = F^{-1}_{Y_{11}}(q) - F^{-1}_{Y_{01}} \circ F_{Y_{00}} \circ F^{-1}_{Y_{10}}(q). \tag{2} \]
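The identifying formula (2) can be turned into a plug-in computation by replacing each distribution function with its empirical counterpart. The following Python sketch illustrates that idea for intermediate quantiles; it is not the exact estimator of Athey and Imbens (2006), and the function names `ecdf`, `left_inverse`, and `cic_quantile_effect` are hypothetical, introduced only for illustration.

```python
import bisect
import math

def ecdf(sample, y):
    """Empirical distribution function: share of observations <= y."""
    s = sorted(sample)
    return bisect.bisect_right(s, y) / len(s)

def left_inverse(sample, q):
    """Left-inverse inf{y : F(y) >= q} of the empirical distribution function."""
    s = sorted(sample)
    # smallest order statistic whose empirical CDF value reaches q
    idx = max(math.ceil(q * len(s)) - 1, 0)
    return s[idx]

def cic_quantile_effect(y00, y01, y10, y11, q):
    """Plug-in version of formula (2) at an intermediate quantile q."""
    y = left_inverse(y10, q)                 # F^{-1}_{Y10}(q)
    p = ecdf(y00, y)                         # F_{Y00} evaluated at that point
    counterfactual = left_inverse(y01, p)    # F^{-1}_{Y01} of the result
    return left_inverse(y11, q) - counterfactual

# Toy data (an assumption for illustration): a time shift of +2 for everyone
# and a constant treatment effect of +5 in the treated group after the event.
y00 = [float(v) for v in range(1, 11)]
y01 = [v + 2.0 for v in y00]
y10 = [float(v) for v in range(1, 11)]
y11 = [v + 2.0 + 5.0 for v in y10]
effect = cic_quantile_effect(y00, y01, y10, y11, q=0.5)
```

In this toy design the time shift is removed by the counterfactual mapping, so the median effect returned is the constant treatment effect.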

3 The Extreme Changes in Changes

The conventional estimator (Athey and Imbens, 2006, page 464) for τqCIC performs well in

middle quantiles, such as q ∈ (0.05, 0.95), but may perform less desirably in extreme quantiles

(e.g., q ∈ (0.00, 0.05] ∪ [0.95, 1.00)), as is the case with common quantile estimators. Indeed,

the asymptotic theory for the conventional estimator rules out extreme values of q. In this

section, we present our proposed method of estimating τqCIC as q = qn → 1 in the right tail.

A symmetric argument applies to the limit on the other side of the distribution in the left

tail as q → 0. To stress the drifting sequence of limiting parameters of our interest, we use

the notation τqeCIC for extreme CIC. In other words, ‘e’ in “eCIC” is used to remind readers

that this parameter drifts with q as the sample size increases.

Suppose that the distribution function $F_{Y_{gt}}$ of $Y_{gt}$ has regularly varying tails for each

$\{g, t\} \in \{0, 1\}^2$. Specifically, we assume

\[ \frac{1 - F_{Y_{gt}}(ty)}{1 - F_{Y_{gt}}(t)} \to y^{-\alpha_{gt}} \quad \text{as } t \to \infty \]

for each $\{g, t\} \in \{0, 1\}^2$. Here, the parameter $\alpha_{gt} > 0$ is referred to as the Pareto exponent.

Our extreme CIC estimation is built on estimating the Pareto exponent. We emphasize that

this assumption is quite mild and most of the common families of parametric distributions

as well as a large class of nonparametric distributions satisfy it. For example, the Student-

t distribution with ν degrees of freedom satisfies this condition with ν being the Pareto

exponent. See, for example, de Haan and Ferreira (2007, Chapter 1) and Resnick (2007,

Chapter 2) for reviews on this condition.


Let $Y^{(1)}_{gt} \ge Y^{(2)}_{gt} \ge \dots \ge Y^{(n_{gt})}_{gt}$ denote the order statistics of the realized outcomes in the

group {g, t}, where $n_{gt}$ denotes the subsample size in this group. Choose the largest $k_{gt} + 1$

of them, that is,

\[ Y^{(1)}_{gt} \ge Y^{(2)}_{gt} \ge \dots \ge Y^{(k_{gt}+1)}_{gt}. \]

Then, αgt can be estimated by the Hill estimator (Hill, 1975)


\[ \hat\alpha_{gt} = \left( \frac{1}{k_{gt}} \sum_{i=1}^{k_{gt}} \left[ \log Y^{(i)}_{gt} - \log Y^{(k_{gt}+1)}_{gt} \right] \right)^{-1}. \tag{3} \]
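As an illustration of the Hill estimator (3), the sketch below computes it from the k + 1 largest observations of a simulated sample. The function name `hill_estimator`, the simulated exact Pareto data, and the values of n and k are assumptions chosen only for this example.

```python
import math
import random

def hill_estimator(sample, k):
    """Hill estimate of the Pareto exponent from the k+1 largest observations."""
    top = sorted(sample, reverse=True)[: k + 1]    # Y^(1) >= ... >= Y^(k+1)
    if k < 1 or len(top) < k + 1 or top[k] <= 0:
        raise ValueError("need 1 <= k < n and a positive threshold Y^(k+1)")
    # average log excess of the top k observations over the threshold Y^(k+1)
    mean_log_excess = sum(math.log(y) - math.log(top[k]) for y in top[:k]) / k
    return 1.0 / mean_log_excess

rng = random.Random(0)
alpha_true = 3.0
# exact Pareto draws: 1 - F(y) = y^(-alpha) for y >= 1
data = [rng.paretovariate(alpha_true) for _ in range(20000)]
alpha_hat = hill_estimator(data, k=2000)
```

With exact Pareto data the estimate should land near the true exponent; for distributions that are only approximately Pareto in the tail, the choice of k matters more.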

As q → 1, $F^{-1}_{Y_{gt}}(q)$ is estimated by

\[ \hat F^{-1}_{Y_{gt}}(q) = Y^{(k_{gt}+1)}_{gt} \left( \frac{k_{gt}}{n_{gt}(1 - q)} \right)^{1/\hat\alpha_{gt}}. \tag{4} \]

Moreover, the tail probability can be estimated by


\[ 1 - \hat F_{Y_{gt}}(y) = \frac{k_{gt}}{n_{gt}} \left( \frac{y}{Y^{(k_{gt}+1)}_{gt}} \right)^{-\hat\alpha_{gt}} \tag{5} \]

as q → 1. See, for example, de Haan and Ferreira (2007, Chapter 4).
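The two extrapolation formulas (4) and (5) are inverses of each other, which the following sketch checks numerically. The helper names `hill`, `extreme_quantile`, and `tail_probability`, as well as the simulated Pareto sample and the values of n, k, and q, are assumptions made for illustration.

```python
import math
import random

def hill(desc, k):
    """Hill estimate (3) from a sample sorted in descending order."""
    return k / sum(math.log(desc[i] / desc[k]) for i in range(k))

def extreme_quantile(desc, k, q):
    """Formula (4): extrapolated quantile for q close to one."""
    n = len(desc)
    return desc[k] * (k / (n * (1.0 - q))) ** (1.0 / hill(desc, k))

def tail_probability(desc, k, y):
    """Formula (5): extrapolated value of 1 - F(y) for large y."""
    n = len(desc)
    return (k / n) * (y / desc[k]) ** (-hill(desc, k))

rng = random.Random(1)
alpha_true, n, k, q = 3.0, 20000, 2000, 0.999
desc = sorted((rng.paretovariate(alpha_true) for _ in range(n)), reverse=True)

q_hat = extreme_quantile(desc, k, q)
q_true = (1.0 - q) ** (-1.0 / alpha_true)      # exact Pareto quantile
round_trip = tail_probability(desc, k, q_hat)   # recovers 1 - q by construction
```

Note that `desc[k]` is the (k+1)-th largest observation because Python lists are zero-indexed.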

By combining the identifying formula (2) with the component estimators (3)–(5), we

obtain the following estimator for the extreme CIC, τqeCIC .

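\[
\begin{aligned}
\hat\tau^{eCIC}_q
&= \hat F^{-1}_{Y_{11}}(q) - \hat F^{-1}_{Y_{01}} \circ \hat F_{Y_{00}} \circ \hat F^{-1}_{Y_{10}}(q) \\
&= Y^{(k_{11}+1)}_{11} \left( \frac{k_{11}}{n_{11}(1-q)} \right)^{1/\hat\alpha_{11}}
 - Y^{(k_{01}+1)}_{01} \left( \frac{k_{01}}{n_{01} \left[ 1 - \hat F_{Y_{00}} \circ \hat F^{-1}_{Y_{10}}(q) \right]} \right)^{1/\hat\alpha_{01}} \\
&= Y^{(k_{11}+1)}_{11} \left( \frac{k_{11}}{n_{11}(1-q)} \right)^{1/\hat\alpha_{11}}
 - Y^{(k_{01}+1)}_{01} \left[ \frac{k_{01} n_{00}}{n_{01} k_{00}}
   \left( \frac{Y^{(k_{10}+1)}_{10} \left( \frac{k_{10}}{n_{10}(1-q)} \right)^{1/\hat\alpha_{10}}}{Y^{(k_{00}+1)}_{00}} \right)^{\hat\alpha_{00}} \right]^{1/\hat\alpha_{01}} .
\end{aligned}
\]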

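By simple algebraic manipulations, this expression simplifies as

\[
\begin{aligned}
\hat\tau^{eCIC}_q = {} & Y^{(k_{11}+1)}_{11} \left( \frac{k_{11}}{n_{11}} \right)^{1/\hat\alpha_{11}} (1-q)^{-1/\hat\alpha_{11}} \\
& - Y^{(k_{01}+1)}_{01} \left( \frac{Y^{(k_{10}+1)}_{10}}{Y^{(k_{00}+1)}_{00}} \right)^{\hat\alpha_{00}/\hat\alpha_{01}}
  \left( \frac{k_{01} n_{00}}{n_{01} k_{00}} \right)^{1/\hat\alpha_{01}}
  \left( \frac{k_{10}}{n_{10}} \right)^{\hat\alpha_{00}/(\hat\alpha_{10}\hat\alpha_{01})}
  (1-q)^{-\hat\alpha_{00}/(\hat\alpha_{10}\hat\alpha_{01})} .
\end{aligned}
\tag{6}
\]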
n10

We thus propose (6) as the extreme CIC estimator, which is quite simple to implement. The

next section presents asymptotic properties based on kgt → ∞ as ngt → ∞ for all g and t.
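To make the closed form (6) concrete, the sketch below evaluates it on simulated data and compares it with the value obtained by composing (4), (5), and (4) as in formula (2). This is an illustrative implementation rather than the authors' Stata command; the function names, the dictionary layout keyed by (g, t), and the simulated design (treatment doubles an exact Pareto outcome) are assumptions.

```python
import math
import random

def hill(desc, k):
    """Hill estimate (3) from a sample sorted in descending order."""
    return k / sum(math.log(desc[i] / desc[k]) for i in range(k))

def prepare(samples, k):
    """Sample sizes, Hill estimates, and thresholds Y^(k+1) for each (g, t)."""
    desc = {gt: sorted(v, reverse=True) for gt, v in samples.items()}
    n = {gt: len(v) for gt, v in desc.items()}
    a = {gt: hill(desc[gt], k[gt]) for gt in desc}
    thr = {gt: desc[gt][k[gt]] for gt in desc}
    return n, a, thr

def ecic_closed_form(samples, k, q):
    """Closed-form expression (6)."""
    n, a, thr = prepare(samples, k)
    first = thr[1, 1] * (k[1, 1] / n[1, 1]) ** (1 / a[1, 1]) * (1 - q) ** (-1 / a[1, 1])
    expo = a[0, 0] / (a[1, 0] * a[0, 1])
    second = (
        thr[0, 1]
        * (thr[1, 0] / thr[0, 0]) ** (a[0, 0] / a[0, 1])
        * (k[0, 1] * n[0, 0] / (n[0, 1] * k[0, 0])) ** (1 / a[0, 1])
        * (k[1, 0] / n[1, 0]) ** expo
        * (1 - q) ** (-expo)
    )
    return first - second

def ecic_by_composition(samples, k, q):
    """Same estimate from composing (4), (5), and (4) as in formula (2)."""
    n, a, thr = prepare(samples, k)
    y = thr[1, 0] * (k[1, 0] / (n[1, 0] * (1 - q))) ** (1 / a[1, 0])   # (4) for group 10
    tail = (k[0, 0] / n[0, 0]) * (y / thr[0, 0]) ** (-a[0, 0])         # (5) for group 00
    counterfactual = thr[0, 1] * (k[0, 1] / (n[0, 1] * tail)) ** (1 / a[0, 1])
    return thr[1, 1] * (k[1, 1] / (n[1, 1] * (1 - q))) ** (1 / a[1, 1]) - counterfactual

rng = random.Random(2)
size, q = 20000, 0.999
samples = {gt: [rng.paretovariate(3.0) for _ in range(size)] for gt in [(0, 0), (0, 1), (1, 0)]}
samples[1, 1] = [2.0 * rng.paretovariate(3.0) for _ in range(size)]   # treatment doubles the outcome
k = {gt: 2000 for gt in samples}

tau_closed = ecic_closed_form(samples, k, q)
tau_composed = ecic_by_composition(samples, k, q)
tau_true = (1 - q) ** (-1 / 3.0)   # quantile effect implied by this simulated design
```

The two routes agree up to floating-point error, which is a convenient check when implementing (6) by hand.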

We close this section with a discussion of the identifiability of the extreme CIC, τqeCIC .

While we informally reviewed the identification result of Athey and Imbens (2006, Theorem

3.1) in Section 2, we should emphasize that it relies on a common support condition (Athey

and Imbens, 2006, Assumption 3.4). Namely, for the identifying equality (2) to hold for all

q ∈ (0, 1), the support of U |G = 1 needs to be a subset of the support of U |G = 0. If this

condition fails, then FY11N remains unidentified outside of the support of Y |G = 0, T = 1

(Athey and Imbens, 2006, Corollary 3.1). Such an unidentified region of q generally contains

extreme quantiles. Hence, the common support condition is crucial especially in the context

of extreme quantiles. If FY11N is bounded away from zero and one on the support of Y |G =

0, T = 1, then we can deduce that the common support condition may be violated.

4 Asymptotic Theory

In this section, we derive a limit distributional property for the proposed extreme CIC

estimator. This result paves the way for statistical inference about the extreme CIC.

Let $\{Y^i_{gt}\}_{i=1}^{n_{gt}}$ denote the subsample of observed outcomes in group g and time t. We state

the following set of conditions, followed by discussions of each piece.

Conditions

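1. $Y^i_{gt}$ is i.i.d. across i within each g and t. $\{Y^1_{gt}, \dots, Y^{n_{gt}}_{gt}\}$ are independent across g and t.

2. $F_{gt}(\cdot)$ is regularly varying at infinity with Pareto exponent $\alpha_{gt}$. Moreover, for some

constant $\rho_{gt} > 0$, $1 - F_{gt}(y) = c_1 y^{-\alpha_{gt}} + c_2 y^{-\alpha_{gt} - \alpha_{gt}\rho_{gt}} (1 + o(1))$ as $y \to \infty$.

3. $n_{11}/n_{gt} \to \eta_{11/gt} \in (0, \infty)$ and $k_{11}/k_{gt} \to \lambda_{11/gt} \in (0, \infty)$ for each $g, t \in \{0, 1\}^2$.

4. $k_{gt} \to \infty$ and $k_{gt} = o\left( n_{gt}^{2\rho_{gt}/(1+2\rho_{gt})} \right)$ for each $g, t \in \{0, 1\}^2$.

5. $n_{gt}(1 - q) = o(k_{gt})$ and $\log[n_{gt}(1 - q)] = o\left( \sqrt{k_{gt}} \right)$ for each $g, t \in \{0, 1\}^2$.

6. $F^{-1}_{Y_{11}}(q) \, / \, F^{-1}_{Y_{01}} \circ F_{Y_{00}} \circ F^{-1}_{Y_{10}}(q) \to \varsigma \in (0, \infty)$.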

We provide some discussions about these conditions. Following Athey and Imbens (2006),

Condition 1 assumes random sampling within each time and treatment group, and indepen-

dence across time periods and groups. Thus, it presumes repeated cross sections rather

than panel data. Condition 2 imposes the regularly varying tail conditions on all four conditional

distributions of the outcome. More generally, the regularly varying tail condition

is equivalent to the underlying distribution belonging to the domain of attraction of the

extreme value distribution with a positive tail index. See, for example, de Haan and Ferreira

(2007, Chapter 1). Since we derive the convergence rate, the second-order Pareto tail

approximation is inevitable. The second-order parameter $\rho_{gt}$ governs the distance between

the true underlying distribution and the Pareto one. As remarked previously, this condition

imposes a rather mild restriction and also satisfies the common support condition. Condition

3 requires that the sample sizes of all subsamples are asymptotically of the same order of

magnitude.

Condition 4 specifies the order of the tail thresholds used in estimation. For simplicity

of illustration, we select $k_{gt}$ to be of a smaller order than $n_{gt}^{2\rho_{gt}/(1+2\rho_{gt})}$ so that the estimators

incur negligible asymptotic biases relative to variances. This requirement is similar in spirit

to under-smoothing bandwidths in kernel estimation or under-smoothing dimensions in sieve

estimation. On the other hand, if we select $k_{gt}$ to be of the same order as $n_{gt}^{2\rho_{gt}/(1+2\rho_{gt})}$, the

asymptotic bias becomes non-negligible, whose expression is complicated. In particular, the

asymptotic bias involves the second-order parameter ρgt (e.g., de Haan and Ferreira, 2007,

Chapter 3). Estimation of this parameter is challenging since it requires further restrictions

on the underlying distribution (e.g., Cheng and Peng, 2001; Haeusler and Segers, 2007;

Carpentier and Kim, 2014), which are hard to interpret and hard to justify. Furthermore,

such bias estimators entail slower rates of convergence. Given these limitations, we focus on

the asymptotics based on undersmoothing for a better statistical inference.

Condition 5 imposes restrictions on the rate at which the quantile level q under investigation

tends to one in the drifting sequence. In particular, q should tend to one sufficiently

fast so that the quantile under investigation is extreme. Otherwise, the q quantile is not in

the tail and can be better estimated by the standard CIC method. This condition is also

common in the extreme quantile literature (e.g., de Haan and Ferreira, 2007, Chapter 4).

Note that this condition allows for $n_{gt}(1 - q) \to 0$. When this happens, the other part

of this condition implicitly imposes a lower bound on $1 - q$ and, equivalently, requires that the

extrapolation not be pushed too far in the right tail (e.g., de Haan and Ferreira, 2007,

Remark 4.3.4). To see this, observe that the condition $\log[n_{gt}(1 - q)] = o\left(\sqrt{k_{gt}}\right)$ implies

$1 - q > n_{gt}^{-1} \exp\left(-\varepsilon \sqrt{k_{gt}}\right)$ for each $\varepsilon > 0$.

Condition 6 requires that the limit of the counterfactual outcome ratio is finite as q tends

to one. For simplicity, we consider ς ∈ (0, ∞). If ς is 0 or ∞, however, the estimator

$\hat F^{-1}_{Y_{gt}}(\cdot)$ has a different convergence rate across g and t, and consequently, we could ignore the

estimation error for some pairs of g and t.

The following theorem establishes the asymptotic normality for the extreme CIC estima-

tor (6) under these conditions.

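Theorem 1 If Conditions 1-6 are satisfied, then

\[ \frac{k_{11}^{1/2}}{F^{-1}_{Y_{11}}(q) \, \log d_{11}} \left( \hat\tau^{eCIC}_q - \tau^{eCIC}_q \right) \xrightarrow{d} N(0, \Omega) \]

holds, where $d_{gt} = k_{gt}/(n_{gt}(1 - q))$ and

\[ \Omega = \alpha_{11}^{-2} + \left( \frac{1}{\varsigma} \right)^2 \left( \frac{\lambda_{11/10}}{\eta_{11/10}} \right)^2 \left[ \lambda_{11/00} + \lambda_{11/10} + \lambda_{11/01} \right] \frac{\alpha_{00}^2}{\alpha_{10}^2 \alpha_{01}^2} . \]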

A proof is relegated to Appendix A.

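In finite samples, the asymptotic variance can be estimated by substituting $\hat\alpha_{gt}$, $\hat\lambda_{11/gt} = k_{11}/k_{gt}$,

and $\hat\varsigma = \hat F^{-1}_{Y_{11}}(q) \, / \, \hat F^{-1}_{Y_{01}} \circ \hat F_{Y_{00}} \circ \hat F^{-1}_{Y_{10}}(q)$ for $\alpha_{gt}$, $\lambda_{11/gt}$, and ς, respectively, in the

formula of the asymptotic variance Ω provided in the statement of Theorem 1. Under the

same conditions, this estimator of Ω is also consistent. The 95% confidence interval is then

constructed as

\[
\hat\tau^{eCIC}_q \pm 1.96 \, k_{11}^{-1/2} \log d_{11}
\left[ \hat F^{-1}_{Y_{11}}(q)^2 \, \hat\alpha_{11}^{-2}
+ \left( \hat F^{-1}_{Y_{01}} \circ \hat F_{Y_{00}} \circ \hat F^{-1}_{Y_{10}}(q) \right)^2
\left( \frac{\hat\lambda_{11/10}}{\hat\eta_{11/10}} \right)^2
\left[ \hat\lambda_{11/00} + \hat\lambda_{11/10} + \hat\lambda_{11/01} \right]
\frac{\hat\alpha_{00}^2}{\hat\alpha_{10}^2 \hat\alpha_{01}^2} \right]^{1/2} .
\tag{7}
\]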

In practice, it is recommended to replace d11 by d11 ∨ d for some d > 1 to ensure a

positive value of the logarithm in (7). We set d = 10 in the subsequent simulation studies

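and empirical application. Finally, we remark that Ω simplifies to

\[ \Omega = \alpha^{-2} \left[ 1 + \left( \frac{1}{\varsigma} \right)^2 \left( \frac{\lambda_{11/10}}{\eta_{11/10}} \right)^2 \left( \lambda_{11/00} + \lambda_{11/10} + \lambda_{11/01} \right) \right] \]

in the special case where $\alpha_{gt}$ is the same across {g, t}, say $\alpha_{gt} = \alpha$ for all {g, t}, although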

we do not impose this restriction in the subsequent numerical analyses. This could happen

if the treatment effect is a constant shift of the outcome. Given that α̂g,t is asymptotically

independent across g and t, we can perform the standard t-test for their equivalence.
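The confidence interval (7) only needs the four Hill estimates, the tuning parameters, the subsample sizes, and the two estimated quantiles. The sketch below is one possible way to code it; the function name and argument layout are hypothetical, the plug-in for η11/10 is taken to be n11/n10 (an assumption suggested by Condition 3, since the text does not define the estimator explicitly), and the numerical inputs are made up for illustration.

```python
import math

def ecic_confidence_interval(tau_hat, q11_hat, cf_hat, alpha, k, n, q,
                             d_floor=10.0, z=1.96):
    """Interval (7). alpha, k, n are dicts keyed by (g, t).

    q11_hat is the estimate of F^{-1}_{Y11}(q); cf_hat is the estimated
    counterfactual quantile, i.e. the composed term in (2).
    """
    lam = {gt: k[1, 1] / k[gt] for gt in k}       # lambda_hat_{11/gt} = k11 / kgt
    eta_10 = n[1, 1] / n[1, 0]                    # assumed plug-in for eta_{11/10}
    # replace d11 by max(d11, d) so that the logarithm stays positive
    d11 = max(k[1, 1] / (n[1, 1] * (1.0 - q)), d_floor)
    bracket = (
        q11_hat ** 2 / alpha[1, 1] ** 2
        + cf_hat ** 2
        * (lam[1, 0] / eta_10) ** 2
        * (lam[0, 0] + lam[1, 0] + lam[0, 1])
        * alpha[0, 0] ** 2 / (alpha[1, 0] ** 2 * alpha[0, 1] ** 2)
    )
    half_width = z * math.log(d11) / math.sqrt(k[1, 1]) * math.sqrt(bracket)
    return tau_hat - half_width, tau_hat + half_width

# Made-up inputs, chosen only so that the arithmetic is easy to follow.
groups = [(0, 0), (0, 1), (1, 0), (1, 1)]
alpha = {gt: 2.0 for gt in groups}
k = {gt: 100 for gt in groups}
n = {gt: 1000 for gt in groups}
lower, upper = ecic_confidence_interval(
    tau_hat=10.0, q11_hat=20.0, cf_hat=10.0, alpha=alpha, k=k, n=n, q=0.999
)
```

The default `d_floor=10.0` mirrors the choice d = 10 stated in the text.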

5 Practical Issues

This section collects discussions on the remaining practical issues.

5.1 Choice of kgt

The number kgt of order statistics is the key tuning parameter in our method. We propose to

use the empirical choice rule proposed by Guillou and Hall (2001). We present the detailed

procedure here for convenience of readers.

Since the identical algorithm applies to each pair of g and t, we suppress these subscripts

in this subsection for notational simplicity. Given a random sample {Y 1 , Y 2 , . . . , Y n }, we

first sort them descendingly and denote the order statistics by Y (1) ≥ Y (2) ≥ . . . ≥ Y (n) .

Define Zi = i log(Y (i) /Y (i+1) ) for i = 1, . . . , n − 1. For each k = 1, . . . , n − 1, construct


\[ T_k \equiv \left( \sum_{i=1}^{k} w_i^2 \right)^{-1/2} \hat\xi^{-1} U_k , \]

where $w_i = \operatorname{sgn}(k - 2i + 1) \, |k - 2i + 1|$, $\hat\xi = 1/\hat\alpha$, and $U_k \equiv \sum_{i=1}^{k} w_i Z_i$. When the Pareto


tail approximation performs well, Tk should have its mean close to zero and variance close to

one. Accordingly, we can minimize the following criterion based on a moving average of $T_k^2$:

\[ C_k = \left( (2\lfloor k/2 \rfloor + 1)^{-1} \sum_{j=-\lfloor k/2 \rfloor}^{\lfloor k/2 \rfloor} T_{k+j}^2 \right)^{1/2} . \]

The optimal value k ∗ of k is

\[ k^* = \min_{1 \le k \le n-1} \{ k : C_t > 1 \text{ for all } t \ge k \} . \tag{8} \]
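The selection rule above can be coded with running sums, as in the following sketch. It follows the description in this subsection rather than any reference implementation of Guillou and Hall (2001). Several details are assumptions made here because the text leaves them open: the function name `choose_k` is hypothetical, T_1 is set to zero (its weight w_1 is zero), C_k is only evaluated where the whole moving window fits inside 1, ..., n − 1, and the largest admissible k is returned when no k satisfies the rule.

```python
import math
import random

def choose_k(sample):
    """Data-driven choice of k following the rule (8) as described in the text."""
    y = sorted(sample, reverse=True)
    n = len(y)
    if n < 4 or y[-1] <= 0:
        raise ValueError("need at least 4 positive observations")
    # z[i] = Z_i = i * log(Y^(i) / Y^(i+1)) for i = 1, ..., n-1 (index 0 unused)
    z = [0.0] + [i * math.log(y[i - 1] / y[i]) for i in range(1, n)]
    t_sq = [0.0] * n          # t_sq[k] = T_k^2; T_1 = 0 by convention
    s1 = s2 = 0.0             # running sums of Z_i and of i * Z_i
    for k in range(1, n):
        s1 += z[k]
        s2 += k * z[k]
        if k >= 2 and s1 > 0:
            xi_hat = s1 / k                        # 1 / alpha_hat based on k
            u_k = (k + 1) * s1 - 2.0 * s2          # sum_i (k - 2i + 1) * Z_i
            sum_w_sq = k * (k * k - 1) / 3.0       # sum_i (k - 2i + 1)^2
            t_sq[k] = (u_k / xi_hat) ** 2 / sum_w_sq
    prefix = [0.0] * n        # prefix[k] = T_1^2 + ... + T_k^2
    for k in range(1, n):
        prefix[k] = prefix[k - 1] + t_sq[k]
    # the window k - l, ..., k + l with l = floor(k/2) must stay within 1, ..., n-1
    k_max = max(k for k in range(1, n) if k + k // 2 <= n - 1)
    k_star = k_max            # fallback if the rule selects nothing
    for k in range(k_max, 0, -1):
        l = k // 2
        c_k = math.sqrt((prefix[k + l] - prefix[k - l - 1]) / (2 * l + 1))
        if c_k > 1.0:
            k_star = k        # C_t > 1 still holds for every t >= k
        else:
            break
    return k_star

rng = random.Random(5)
data = [rng.paretovariate(2.0) for _ in range(500)]
k_star = choose_k(data)
```

The identity used for `xi_hat` is that the average of Z_1, ..., Z_k equals the average log excess in (3), so no separate call to the Hill estimator is needed.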

5.2 Extreme Quantiles

We now discuss how to define the domain [q, 1) of q on which one may use this extreme

CIC estimator, as opposed to the conventional CIC estimator. We suggest making a scatter

plot of $\{\log Y^{(i)}_{gt}\}_{i=1}^{n_{gt}}$ against $\{\log i\}_{i=1}^{n_{gt}}$, called the log-log plot. This plot is linear near small

values of i if the tail is approximately Pareto, and our estimator is accurate where it appears

linear. In this light, one can choose the boundary point q such that this log-log plot appears

linear for $i \in \{1, \dots, \lfloor n_{gt}(1 - q) \rfloor\}$. We concretely illustrate this procedure in our empirical

application in Section 8.
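The coordinates of the log-log plot are easy to compute directly. The sketch below builds them for the top m order statistics and, as a numerical companion to visual inspection, fits a least-squares line; under an approximately Pareto tail the slope is roughly −1/α. The least-squares fit is an addition for illustration and not part of the procedure described above, and the function names and simulated data are assumptions.

```python
import math
import random

def loglog_points(sample, m):
    """Pairs (log i, log Y^(i)) for the m largest observations."""
    desc = sorted(sample, reverse=True)
    return [(math.log(i), math.log(desc[i - 1])) for i in range(1, m + 1)]

def ols_slope(points):
    """Least-squares slope of the second coordinate on the first."""
    count = len(points)
    mean_x = sum(x for x, _ in points) / count
    mean_y = sum(y for _, y in points) / count
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx

rng = random.Random(3)
alpha_true = 2.0
data = [rng.paretovariate(alpha_true) for _ in range(20000)]
points = loglog_points(data, m=2000)
slope = ols_slope(points)   # roughly -1 / alpha_true for Pareto data
```

The list `points` can be passed to any plotting tool to reproduce the visual check; a clear bend in the plot beyond some index suggests placing the switching point before it.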

6 Extension: Covariates

Our proposed method can be easily extended to allow for covariates. Similarly to Athey and

Imbens (2006, pages 465-466), we first regress the outcome variable on the covariates and

then apply the proposed extreme CIC estimator to the regression residuals. We formalize

this procedure as follows.

Consider the linear model


\[ W^i_{gt} = X^{i\prime}_{gt} \beta_{gt} + Y^i_{gt} , \tag{9} \]

where $W^i_{gt}$ denotes the outcome variable for the i-th individual in group g and time t, and

$X^i_{gt}$ denotes the covariate vector. The coefficient $\beta_{gt}$ can be different across g and t, and

hence the above regression can be conducted separately for each g and t. For notational

simplicity, we continue using $Y^i_{gt}$ to denote the error term, which is now unobserved. Given

an estimate $\hat\beta_{gt}$, we treat the residuals $\hat Y^i_{gt} = W^i_{gt} - X^{i\prime}_{gt} \hat\beta_{gt}$ as effective observations and

construct the proposed extreme CIC estimator based on them. Specifically, we order the

residuals as

\[ \hat Y^{(1)}_{gt} \ge \hat Y^{(2)}_{gt} \ge \dots \ge \hat Y^{(k_{gt}+1)}_{gt} \]

and replace $\{Y^{(i)}_{gt}\}_{i=1}^{k_{gt}+1}$ with $\{\hat Y^{(i)}_{gt}\}_{i=1}^{k_{gt}+1}$ in (3)–(6).
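The residual-based procedure can be sketched for a single group-period cell as follows: regress the outcome on the covariates, sort the residuals, and feed the largest ones into the Hill estimator (3). For brevity the sketch uses one covariate plus an intercept, which is a simplification of model (9); the function names and the simulated design (a symmetric error with exact Pareto tails, so that its mean is zero) are assumptions.

```python
import math
import random

def ols_residuals(w, x):
    """Residuals from regressing w on an intercept and a single covariate x."""
    count = len(w)
    mean_x, mean_w = sum(x) / count, sum(w) / count
    slope = (
        sum((xi - mean_x) * (wi - mean_w) for xi, wi in zip(x, w))
        / sum((xi - mean_x) ** 2 for xi in x)
    )
    intercept = mean_w - slope * mean_x
    return [wi - intercept - slope * xi for xi, wi in zip(x, w)]

def hill(desc, k):
    """Hill estimate (3) from a sample sorted in descending order."""
    return k / sum(math.log(desc[i] / desc[k]) for i in range(k))

rng = random.Random(4)
size, k, alpha_true, beta = 20000, 1000, 3.0, 1.5
x = [rng.random() for _ in range(size)]
# unobserved error: symmetric, with right tail P(Y > y) = 0.5 * y^(-alpha) for y >= 1
y = [rng.choice((-1.0, 1.0)) * rng.paretovariate(alpha_true) for _ in range(size)]
w = [beta * xi + yi for xi, yi in zip(x, y)]

residuals_desc = sorted(ols_residuals(w, x), reverse=True)
alpha_hat = hill(residuals_desc, k)   # Hill estimate based on the residuals
```

The same sorted residuals would then replace the raw order statistics in (4)–(6).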

In addition to Conditions 1-6, we require the following additional condition.

Condition

7. $\sqrt{k_{gt}} \, \max_{1 \le i \le n_{gt}} \dfrac{|\hat Y^i_{gt} - Y^i_{gt}|}{1 + |Y^i_{gt}|} = o_p(1)$ for all g and t.

Condition 7 was recently proposed by Girard, Stupfler, and Usseglio-Carleve (2021), who

study an estimator of tail features in a more general setup. This condition is mild and

satisfied by the least squares estimator in the linear model (9) (cf. Girard et al., 2021, Section

3.1). In particular, when $X^i_{gt}$ has a compact support and the regression estimator $\hat\beta_{gt}$ is $\sqrt{n}$-consistent,

$|\hat Y^i_{gt} - Y^i_{gt}|$ is bounded by $\|X^i_{gt}\| \cdot \|\hat\beta_{gt} - \beta_{gt}\| = O_p\left(n_{gt}^{-1/2}\right)$. Then Condition 7 follows

from the fact that $k_{gt}/n_{gt} \to 0$. In summary, this condition requires that the estimation error is

sufficiently small and consequently the CIC estimator based on $\{\hat Y^{(i)}_{gt}\}_{i=1}^{k_{gt}+1}$ is asymptotically

the same as that based on $\{Y^{(i)}_{gt}\}_{i=1}^{k_{gt}+1}$.

The following corollary summarizes the result.

Corollary 1 Consider the linear regression model (9). If Conditions 1-7 are satisfied, then

the estimator $\hat\tau^{eCIC}_q$ based on $\{\hat Y^{(i)}_{gt}\}_{i=1}^{k_{gt}+1}$ has the same asymptotic distribution as in Theorem

1.

A proof is relegated to Appendix A.

7 Simulations

We use the following data generating design based on our baseline model. Generated first

are the group and time period indicators according to

Gi ∼ Bernoulli(πG ) and T i ∼ Bernoulli(πT ).

To allow for the endogenous dependence between the group Gi and the unobservables U i ,

we in turn generate U i conditionally on Gi as follows.

\[
U^i \sim
\begin{cases}
\text{Beta}(\pi_A, \pi_B) & \text{if } G^i = 0, \\
\text{Uniform}(0, 1) & \text{if } G^i = 1.
\end{cases}
\]

Here, we use the uniform distribution under Gi = 1 for ease of analytic tractability of both

the Pareto exponent and the quantile treatment effects and for the purpose of accurate

evaluations of simulation results with analytically known true parameter values. We also

remark that the conditional independence assumption U i ⊥ T i |Gi of Athey and Imbens

(2006) is satisfied in this design by construction. In this two-group-two-period setting, the

treatment indicator is in turn defined by I i = Gi T i .

The potential outcomes are generated through the model

\[ Y^i_N = h_N(U^i, T^i) = F^{-1}_{t_\alpha}(U^i) + T^i \quad \text{and} \tag{10} \]

\[ Y^i_I = h_I(U^i, T^i) = F^{-1}_{t_\alpha}(U^i) + U^i + 1 , \tag{11} \]

where $F^{-1}_{t_\alpha}$ denotes the quantile function of the Student-t distribution with α degrees of

freedom. Now, the observed outcomes are in turn generated by

\[ Y^i = Y^i_N \, (1 - I^i) + Y^i_I \cdot I^i . \]


There are three notable features in this data generating process. First, $F_{Y^N_{11}}$ and $F_{Y^I_{11}}$ in

(10)–(11) have Pareto exponents of α. Second, the second term U on the right-hand side of

(11), but not of (10), causes heterogeneous treatment effects characterized as follows

\[ \tau^{CIC}_q = F^{-1}_{Y^I_{11}}(q) - F^{-1}_{Y^N_{11}}(q) = q . \]

Finally, we remark that the monotonicity assumption of Athey and Imbens (2006) for the

identification is satisfied in this model.
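The closed form of the quantile effect in this design can be verified in one step. In the treated group after the event (G = 1, T = 1), U is Uniform(0, 1), and both h_N(u, 1) and h_I(u, 1) are strictly increasing in u, so the q-th quantile of each potential outcome is obtained by evaluating the corresponding function at u = q:

```latex
\begin{aligned}
F^{-1}_{Y^N_{11}}(q) &= h_N(q, 1) = F^{-1}_{t_\alpha}(q) + 1, \\
F^{-1}_{Y^I_{11}}(q) &= h_I(q, 1) = F^{-1}_{t_\alpha}(q) + q + 1, \\
\tau^{CIC}_q &= F^{-1}_{Y^I_{11}}(q) - F^{-1}_{Y^N_{11}}(q) = q .
\end{aligned}
```

The Student-t quantile terms cancel, which is why the true effect is known exactly and increases linearly in q.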

We evaluate the finite sample performance of our proposed extreme CIC estimator τ̂qeCIC

given in (6) along with its standard error estimator (7). The numbers kgt of order statistics are chosen

based on Guillou and Hall (2001) for each subsample (g, t) as described in Section 5.1.

We also present simulation results for the conventional estimator τ̂qCIC of Athey and Imbens

(2006, page 464) with its standard error estimator (Athey and Imbens, 2006, pages 464-465).

For the standard error estimation for τ̂qCIC , we use Epanechnikov kernel and Silverman’s rule

of thumb for bandwidth selection. Before presenting the results, we want to stress that we

focus on the extreme quantiles q ∈ [0.90, 1.00) on which comparisons are necessarily unfair

for the estimator τ̂qCIC of Athey and Imbens (2006), which presumes intermediate quantiles

in theory. We confirm and acknowledge that the estimator τ̂qCIC of Athey and Imbens (2006)

performs better in the intermediate quantiles q ∈ (0.10, 0.90).

Figure 1 shows Monte Carlo averages and inter-quartile ranges of the estimates under the

design with (πG , πT , πA , πB , α) = (0.1, 0.5, 1.0, 2.0, 10). The dashed curves on the left column

of the figure indicate the average estimates based on the conventional estimator τ̂qCIC . The

dotted curves on the right column of the figure indicate the average estimates based on

our proposed estimator τ̂qeCIC . In each panel, the shaded regions indicate the inter-quartile

ranges of the estimates by the respective methods. The results are shown at the extreme

quantiles q ∈ [0.90, 1.00) and for sample sizes N ∈ {2500, 5000}. The solid curves indicate

the true treatment effects. Observe that the conventional estimator τ̂qCIC tends to give biased

estimates as q → 1. On the other hand, our proposed estimator τ̂qeCIC yields significantly

less biased estimates even in the limit q → 1. We ran many additional simulations with

varying design parameter values (πG , πT , πA , πB , α), and the results indicate similar patterns

across sets of simulations.

Figure 2 shows Monte Carlo frequencies that the true treatment effects are covered by the

95% confidence intervals. The dashed curves indicate the results based on the conventional

estimator τ̂qCIC and the dotted curves indicate the results based on our proposed estimator

Figure 1: Monte Carlo averages and inter-quartile ranges (shaded) of the estimates based
on the conventional estimator τ̂qCIC (dashed curves on the left column) and our proposed
estimator τ̂qeCIC (dotted curves on the right column) at the extreme quantiles q ∈ [0.90, 1.00)
under the design with (πG , πT , πA , πB , α) = (0.1, 0.5, 1.0, 2.0, 10). The true treatment effects
are indicated by the solid curves.
Figure 2: Monte Carlo frequencies of coverage of the true treatment effects by the 95%
confidence intervals at the extreme quantiles q ∈ [0.90, 1.00) under the design with
(πG , πT , πA , πB , α) = (0.1, 0.5, 1.0, 2.0, 10). The dashed and dotted curves indicate the results
based on the conventional estimator τ̂qCIC and our proposed estimator τ̂qeCIC , respectively.

τ̂qeCIC . The results are shown at the extreme quantiles q ∈ [0.90, 1.00) and for sample sizes

N ∈ {2500, 5000}. Observe that the coverage frequency based on the conventional method

deviates away from the nominal probability of 0.95 as q → 1. In contrast, the coverage

frequency based on our proposed method is close to the nominal probability of 0.95 at each

point q ∈ [0.90, 1.00) in the extreme quantiles. We remark again that we ran many additional

simulations with varying design parameter values (πG , πT , πA , πB , α), and the results indicate

similar patterns across sets of simulations.

In light of these simulation results, we provide the following practical recommendation.

Use the conventional estimator τ̂qCIC of Athey and Imbens (2006, page 464) along with its

standard error estimator (Athey and Imbens, 2006, pages 464-465) for intermediate quantiles.

On the other hand, use our proposed estimator τ̂qeCIC in (6) along with the standard error

estimator (7) for extreme quantiles. The switching point can be chosen by using the log-

log plot described in Section 5.2. We also follow this practical guideline for the empirical

application to be presented in the next section.

8 EITC and Extremely Low Birth Weights

There is a long history of research in health economics on the causes and prevention of low

infant birth weight. It is an important topic from a policy viewpoint because low infant birth

weight has been identified to have long-lasting impacts on health and economic well-being

in adulthood (e.g., Currie, 2011) as well as an immediate impact on

infant mortality. Some economic and behavioral factors affecting infant birth weight include

maternal smoking (e.g., Almond, Chay, and Lee, 2005; Currie, Neidell, and Schmieder, 2009),

maternal stress (e.g., Aizer, Stroud, and Buka, 2009; Camacho, 2008; Evans and Garthwaite,

2014), and economic resources (e.g., Hoynes et al., 2015), among others.

With studies of average effects as in most of the existing empirical studies, it still remains

unknown if these causal factors would have positive impacts on the most vulnerable subpop-

ulation, namely those infants born with extremely low birth weights. There are a few papers

(Chernozhukov and Fernández-Val, 2011; Sasaki and Wang, 2022) that study extreme quantiles

of infant birth weights, but causal interpretations of their estimation results require

assuming exogeneity of the explanatory variable of interest conditional on other observed covariates.

In empirical settings admitting a changes-in-changes design, on the other hand, we

can handle flexible endogeneity in the treatment choice and study treatment effects for the

most vulnerable subpopulation at the extremely low quantiles using the method proposed

in this paper.

Hoynes et al. (2015) use the difference-in-differences (DID) design based on EITC reform

(Omnibus Reconciliation Act of 1993, OBRA93) to evaluate the effects of income gains

through the EITC on infant health outcomes. They find significant average effects of income

shocks on the incidence of low birth weight and the average infant birth weight. In this paper,

we aim to complement the work of Hoynes et al. (2015) by analyzing the heterogeneous effects

of the income gains through the EITC on infant birth weight at extremely low quantiles, as

opposed to those on average.

Following the prior work by Hoynes et al. (2015), we use the U.S. Vital Statistics Natality

Data, 1989–1999. We also adopt their DID design for our extreme CIC analysis by following

their two key assumptions. First, the effects of the EITC on infant birth weights run through

the cash available to the family which arrives through tax refunds and the cash is spent over

the subsequent 12 months. Second, we focus on the effects during the sensitive development

stage in the three months prior to birth. Consequently, following the cash-in-hand assignment

rule of Hoynes et al. (2015, Table 1), we include births in May 1994 or after in the “Post”

group (T = 1) associated with the policy event of OBRA93. The eligibility criteria for the

EITC include the requirement that a taxpayer has a qualifying child. In this light, we include all the second- or higher-order live births as the treatment group (G = 1). The sample sizes are $n_{00} = 2372001$, $n_{01} = 1287185$, $n_{10} = 2652321$, and $n_{11} = 1325598$.
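The group and period assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `assign_cell` and the record layout (birth year, birth month, birth order) are hypothetical, and first births are assumed to form the comparison group (G = 0).

```python
def assign_cell(birth_year: int, birth_month: int, birth_order: int) -> tuple:
    """Return (G, T) for one birth record (hypothetical helper, not from the paper).

    G = 1 for second- or higher-order live births (assumed treatment group).
    T = 1 for births in May 1994 or later (the "Post" period).
    """
    g = 1 if birth_order >= 2 else 0
    t = 1 if (birth_year, birth_month) >= (1994, 5) else 0
    return g, t


# Illustrative records: (birth_year, birth_month, birth_order).
records = [(1993, 12, 1), (1994, 4, 2), (1994, 5, 1), (1996, 1, 3)]
cells = [assign_cell(*record) for record in records]
print(cells)  # [(0, 0), (1, 0), (0, 1), (1, 1)]
```

Each of the four (G, T) cells then yields one of the samples of sizes $n_{00}$, $n_{01}$, $n_{10}$, and $n_{11}$ reported above.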

Hoynes et al. (2015) define subpopulations by year, state, parity, education, race, and

mother’s age. Then, they treat such a subpopulation as a unit of observation, and use the

average birth weight within a subpopulation as the outcome value for the unit. However, this

procedure will not allow us to analyze individual heterogeneity with the quantile treatment

effect because aggregation eliminates individual heterogeneity. Hence, we use each birth

as a unit of observation unlike Hoynes et al. (2015). Otherwise, we follow their empirical

approach as follows. First, we use year and state fixed effects. Since Hoynes et al. (2015) use

parity, education, race, and mother’s age to define their subgroups of aggregation, we instead

use this list of variables as covariates in our analysis. To accommodate these covariates, the

extended method introduced in Section 6 is employed. Second, we focus on single women

with high school education or less as in Hoynes et al. (2015).

To determine the switching point $q$ between our extreme CIC estimator and the conventional CIC estimator, we draw the log-log plots for $-\hat Y_{00}$, $-\hat Y_{01}$, $-\hat Y_{10}$, and $-\hat Y_{11}$ in Figure 3. Observe in each panel that the plot is reasonably linear up to around the 2.5th or 5th percentile, and thereafter starts to curve downward. In light of the discussion in Section 5.2 and noting that our current focus is on the left tail, we choose the switching point $q$ such that the log-log plot is linear for $i \in \{1, \cdots, \lfloor n_{gt} q \rfloor\}$. To guarantee a good Pareto tail approximation, we set the 2.5th percentile as our switching point.
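A minimal Python sketch of the linearity check behind this choice is given below. It is not the paper's procedure: the helper name `loglog_tail_fit`, the simulated Pareto sample, and the use of a least-squares $R^2$ as a summary of linearity are all illustrative assumptions. For a left-tail analysis, one would pass the negated residuals so that the left tail becomes the right tail.

```python
import math
import random


def loglog_tail_fit(sample, q):
    """Least-squares line through the log-log plot of the top floor(n*q) points.

    Hypothetical helper (not from the paper). Returns (slope, r_squared).
    Under an exact Pareto tail with exponent alpha, the slope is about -alpha.
    """
    n = len(sample)
    m = math.floor(n * q)                       # number of tail observations
    top = sorted(sample, reverse=True)[:m]      # largest m values, descending
    xs = [math.log(y) for y in top]             # tail values must be positive
    ys = [math.log((i + 1) / n) for i in range(m)]  # log empirical tail probability
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx, sxy ** 2 / (sxx * syy)


random.seed(0)
# Simulated stand-in for the negated residuals: exact Pareto, exponent 3 (assumption).
sample = [random.paretovariate(3.0) for _ in range(20000)]
slope, r_squared = loglog_tail_fit(sample, 0.025)
print(f"slope = {slope:.2f}, R^2 = {r_squared:.3f}")
```

Repeating the fit for several candidate values of $q$ and watching where the fit starts to deteriorate mimics the visual inspection of Figure 3; the visual check remains the criterion actually used in the text.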

Figure 4 illustrates estimates and confidence intervals for $\tau_q^{CIC}$. The estimates by our proposed method for the extreme quantiles $q \in (0.000, 0.025]$ are indicated by dotted curves, and the estimates by Athey and Imbens (2006) for the intermediate quantiles $q \in (0.025, 0.200]$ are indicated by dashed curves. The gray shades indicate pointwise 95 percent confidence intervals.

Observe that the point estimates are unambiguously positive for all the quantiles $q \in (0.000, 0.200]$. Furthermore, these income effects are statistically significant at each quantile $q \in (0.000, 0.200]$. Therefore, we conclude that income gains causally improve infant birth weights at low quantiles.

While Hoynes et al. (2015) discover positive effects of the EITC income gains on average, we further find positive effects at the low quantiles in particular. This progress in empirical research is important because causal effects for extremely low infant birth weights are more relevant to policy analysis. Low infant birth weight is known to have long-lasting impacts on health and economic well-being in adulthood (e.g., Currie, 2011), as well as an immediate impact on infant mortality. Our findings focusing on the low quantiles imply that income support during pregnancy may help mitigate these adverse health and economic outcomes. We want to stress that, for us to reach this important empirical conclusion, both the conventional estimator $\hat\tau_q^{CIC}$ by Athey and Imbens (2006) and our proposed estimator $\hat\tau_q^{eCIC}$ along with their standard errors are indispensable.

Figure 3: The log-log plots for $-\hat Y_{00}$, $-\hat Y_{01}$, $-\hat Y_{10}$, and $-\hat Y_{11}$.

Figure 4: Estimates and 95 percent confidence intervals for $\tau_q^{CIC}$ of infant birth weight for $q \in (0.000, 0.200]$. The sample consists of infants born between 1989 and 1999 to unmarried black mothers who have completed 12 years of education. The results for the extreme quantiles $q \in (0.000, 0.025]$ are based on the proposed method. The results for the middle quantiles $q \in (0.025, 0.200]$ are based on Athey and Imbens (2006).

9 Simulations Based on Empirical Data

Section 7 presents simulation studies based on data generated from an artificial design. In

this section, we present additional simulation studies with resamples from the empirical data

which we use in Section 8.


Let $\{\hat Y_{gti}\}_{i=1}^{n_{gt}}$ denote the residualized sample we obtain in Section 8 for each $g$ and $t$. For each $g \in \{0, 1\}$, we draw a one-percent subsample of size $\lfloor 0.01 \cdot n_{g0} \rfloor$ from $\{\hat Y_{00i}\}_{i=1}^{n_{00}} \cup \{\hat Y_{10i}\}_{i=1}^{n_{10}}$ with replacement, and define this subsample as a simulated sample of $Y_{g0}$. Similarly, for each $g \in \{0, 1\}$, we draw a one-percent subsample of size $\lfloor 0.01 \cdot n_{g1} \rfloor$ from $\{\hat Y_{01i}\}_{i=1}^{n_{01}} \cup \{\hat Y_{11i}\}_{i=1}^{n_{11}}$ with replacement, and define this subsample as a simulated sample of $Y_{g1}$. Since we pool the source samples between the control and the treatment groups for each $t$, the true quantile treatment effect $\tau_q^{CIC}$ is zero for all $q$ by construction. Recall from Section 8 that the original sample sizes are $n_{00} = 2372001$, $n_{01} = 1287185$, $n_{10} = 2652321$, and $n_{11} = 1325598$. Hence, the simulation sample sizes are $\lfloor 0.01 \cdot n_{00} \rfloor = 23720$, $\lfloor 0.01 \cdot n_{01} \rfloor = 12871$, $\lfloor 0.01 \cdot n_{10} \rfloor = 26523$, and $\lfloor 0.01 \cdot n_{11} \rfloor = 13255$. Under this empirical Monte Carlo design, we run the same set of estimation and inference procedures as in Section 7, except that we focus on the left tail $q \in (0.00, 0.10]$ as opposed to the right tail $q \in [0.90, 1.00)$.
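The pooled resampling design described above can be sketched as follows. This is an illustrative Python sketch rather than the authors' code: the function name `simulate_null_samples`, its arguments, and the synthetic Gaussian inputs are assumptions made for the example.

```python
import math
import random


def simulate_null_samples(y00, y01, y10, y11, frac=0.01, rng=None):
    """Draw one simulated data set under a zero treatment effect.

    Hypothetical helper: for each period t, the control and treatment
    samples are pooled, and each (g, t) cell is drawn with replacement
    from the pool of its own period, with size floor(frac * n_gt).
    """
    rng = rng or random.Random()
    pooled = {0: list(y00) + list(y10), 1: list(y01) + list(y11)}
    sizes = {(0, 0): len(y00), (0, 1): len(y01),
             (1, 0): len(y10), (1, 1): len(y11)}
    return {
        (g, t): rng.choices(pooled[t], k=math.floor(frac * n))
        for (g, t), n in sizes.items()
    }


# Subsample sizes implied by the sample sizes reported in Section 8.
paper_sizes = [math.floor(0.01 * n) for n in (2372001, 1287185, 2652321, 1325598)]
print(paper_sizes)  # [23720, 12871, 26523, 13255]

# Small synthetic illustration (the Gaussian data are placeholders).
rng = random.Random(0)
y00 = [rng.gauss(0, 1) for _ in range(1000)]
y01 = [rng.gauss(0, 1) for _ in range(500)]
y10 = [rng.gauss(0, 1) for _ in range(1200)]
y11 = [rng.gauss(0, 1) for _ in range(700)]
sim = simulate_null_samples(y00, y01, y10, y11, rng=rng)
sim_sizes = {cell: len(draws) for cell, draws in sim.items()}
```

Because both groups in a given period are drawn from the same pool, any estimated quantile treatment effect in such a draw reflects sampling noise only.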

The top row of Figure 5 shows Monte Carlo averages and inter-quartile ranges of the estimates, analogously to Figure 1 in Section 7. The dashed curve on the left panel indicates the average estimates based on the conventional estimator $\hat\tau_q^{CIC}$. The dotted curve on the right panel indicates the average estimates based on our proposed estimator $\hat\tau_q^{eCIC}$. In each panel, the shaded region indicates the inter-quartile ranges of the estimates by the respective methods. The solid curves indicate the true treatment effects. Since the true treatment effects are homogeneously zero for all $q$ under the current data generating design, there is little bias in either estimator. Therefore, the inter-quartile ranges are nicely symmetric for both estimators. This feature of the results contrasts with that in Section 7, where non-trivial biases exist for the conventional estimator $\hat\tau_q^{CIC}$ at the extreme quantiles.

The bottom row of Figure 5 shows Monte Carlo frequencies with which the true treatment effects are covered by the 95% confidence intervals, analogously to Figure 2 in Section 7. The dashed curve indicates the results based on the conventional estimator $\hat\tau_q^{CIC}$ and the dotted curve indicates the results based on our proposed estimator $\hat\tau_q^{eCIC}$. The results are shown at the extreme quantiles $q \in (0.00, 0.10]$. Although the conventional estimator does not suffer from bias under the current design, its statistical inference still suffers from size distortions. Our proposed extreme CIC estimator $\hat\tau_q^{eCIC}$ yields substantially smaller size distortions than the conventional estimator $\hat\tau_q^{CIC}$.
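For concreteness, a coverage frequency of this kind can be computed as in the following Python sketch. It assumes symmetric normal-approximation intervals of the form estimate ± 1.96 × standard error, which is an assumption about the interval construction; the function name `coverage_frequency` and the toy inputs are hypothetical.

```python
def coverage_frequency(estimates, std_errors, truth=0.0, z=1.959963984540054):
    """Share of Monte Carlo draws whose interval est +/- z*se contains `truth`.

    Hypothetical helper: `estimates` and `std_errors` hold one entry per
    Monte Carlo iteration at a fixed quantile q; z is the 97.5% standard
    normal quantile, giving nominal 95% intervals.
    """
    covered = sum(
        1 for est, se in zip(estimates, std_errors)
        if est - z * se <= truth <= est + z * se
    )
    return covered / len(estimates)


# Toy illustration with four iterations and unit standard errors.
freq = coverage_frequency([0.0, 1.0, 3.0, -2.5], [1.0, 1.0, 1.0, 1.0])
print(freq)  # 0.5
```

Under the pooled design of this section the true effect is zero at every $q$, so a well-calibrated procedure should give a frequency close to 0.95; the gap between the two is the size distortion discussed above.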

10 Summary and Discussions

In this paper, we propose a new CIC estimator to accurately estimate the treatment effects

at extreme/tail quantiles. We also derive its asymptotic normality result for statistical

inference. Our proposal of these new methods is motivated by the fact that policy analysts

are often interested in treating subpopulations near tails of the distributions of outcome

variables (e.g., extremely poor individuals and infants with extremely low birth weights)

while existing CIC estimators are tailored to middle quantiles.

Simulation studies demonstrate that the new extreme CIC estimator along with its standard error estimator performs better than the conventional method in the tails. Based on these results, we recommend using our extreme CIC estimator for extreme quantiles, while the conventional CIC estimator should be used for intermediate quantiles.

Applying the proposed method to U.S. Vital Statistics Natality Data, we study the effects of income gains from the 1993 EITC reform on infant birth weights for those in the most critical conditions. We find significant positive effects of the income gains on infant birth weights for the subpopulation at the low quantiles of birth weight.

Figure 5: Top: Monte Carlo averages and inter-quartile ranges (shaded) of the estimates based on the conventional estimator $\hat\tau_q^{CIC}$ (dashed curves on the left column) and our proposed estimator $\hat\tau_q^{eCIC}$ (dotted curves on the right column) at the extreme quantiles $q \in (0.00, 0.10]$. The true treatment effects are indicated by the solid curves. Bottom: Monte Carlo frequencies of coverage of the true treatment effects by the 95% confidence intervals at the extreme quantiles $q \in (0.00, 0.10]$. The dashed and dotted curves indicate the results based on the conventional estimator $\hat\tau_q^{CIC}$ and our proposed estimator $\hat\tau_q^{eCIC}$, respectively.

Finally, we remind readers that this paper is accompanied by a Stata command, ecic (extreme changes in changes). The package can be installed from the SSC archive with the following command line: ssc install ecic. After installation, run help ecic for usage of the command.

References

Aizer, A., L. Stroud, and S. Buka (2009): “Maternal stress and child well-being:

Evidence from siblings,” Unpublished Manuscript, Brown University, Providence, RI.

Almond, D., K. Y. Chay, and D. S. Lee (2005): “The costs of low birth weight,”

Quarterly Journal of Economics, 120, 1031–1083.

Athey, S. and G. W. Imbens (2006): “Identification and inference in nonlinear difference-

in-differences models,” Econometrica, 74, 431–497.

Camacho, A. (2008): “Stress and birth weight: evidence from terrorist attacks,” American

Economic Review, 98, 511–15.

Carpentier, A. and A. K. H. Kim (2014): “Adaptive and minimax optimal estimation

of the tail coefficient,” Statistica Sinica, 25, 1133–1144.

Cheng, S. and L. Peng (2001): “Confidence intervals for the tail index,” Bernoulli, 7,

751–760.

Chernozhukov, V. (2005): “Extremal quantile regression,” Annals of Statistics, 33, 806–839.

Chernozhukov, V. and I. Fernández-Val (2011): “Inference for extremal conditional

quantile models, with an application to market and birthweight risks,” Review of Economic

Studies, 78, 559–589.

Currie, J. (2011): “Inequality at birth: some causes and consequences,” American Eco-

nomic Review, 101, 1–22.

Currie, J., M. Neidell, and J. F. Schmieder (2009): “Air pollution and infant health:

Lessons from New Jersey,” Journal of Health Economics, 28, 688–703.

de Chaisemartin, C. and X. D’Haultfœuille (2014): “Fuzzy changes-in-changes,”

Unpublished Manuscript.

de Haan, L. and A. Ferreira (2007): Extreme Value Theory: An Introduction, Springer

Science & Business Media.

Deuber, D., J. Li, S. Engelke, and M. H. Maathuis (2021): “Estimation and infer-

ence of extremal quantile treatment effects for heavy-tailed distributions,” arXiv preprint

arXiv:2110.06627.

D’Haultfœuille, X., S. Hoderlein, and Y. Sasaki (2022): “Nonparametric

difference-in-differences in repeated cross-sections with continuous treatments,” Journal

of Econometrics, forthcoming.

D’Haultfœuille, X., A. Maurel, and Y. Zhang (2018): “Extremal quantile regres-

sions for selection models and the black–white wage gap,” Journal of Econometrics, 203,

129–142.

Evans, W. N. and C. L. Garthwaite (2014): “Giving mom a break: The impact

of higher EITC payments on maternal health,” American Economic Journal: Economic

Policy, 6, 258–90.

Ghanem, D., S. Hirshleifer, D. Kedagni, and K. Ortiz-Becerra (2022): “Cor-

recting Attrition Bias using Changes-in-Changes,” arXiv preprint arXiv:2203.12740.

Girard, S., G. Stupfler, and A. Usseglio-Carleve (2021): “Extreme conditional

expectile estimation in heavy-tailed heteroscedastic regression models,” Annals of Statis-

tics, 49, 3358–3382.

Guillou, A. and P. Hall (2001): “A diagnostic for selecting the threshold in extreme

value analysis,” Journal of the Royal Statistical Society: Series B (Statistical Methodology),

63, 293–305.

Haeusler, E. and J. Segers (2007): “Assessing confidence intervals for the tail index by

Edgeworth expansions for the Hill estimator,” Bernoulli, 13, 175–194.

Hill, B. M. (1975): “A simple general approach to inference about the tail of a distribution,” Annals of Statistics, 3, 1163–1174.

Hoynes, H., D. Miller, and D. Simon (2015): “Income, the earned income tax credit,

and infant health,” American Economic Journal: Economic Policy, 7, 172–211.

Melly, B. and G. Santangelo (2015): “The changes-in-changes model with covariates,”

Unpublished Manuscript, Universität Bern, Bern.

Resnick, S. I. (2007): Heavy-tail phenomena: probabilistic and statistical modeling,

Springer Science & Business Media.

Sasaki, Y. and Y. Wang (2022): “Fixed-k inference for conditional extremal quantiles,”

Journal of Business & Economic Statistics, 40, 829–837.

Zhang, Y. (2018): “Extremal quantile treatment effects,” Annals of Statistics, 46, 3707–

3740.

Appendix

A Proof of Theorem 1

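Proof. For succinctness, we use the short-hand notation $F_{gt}(\cdot)$ for $F_{Y_{gt}}(\cdot)$, and accordingly use the short-hand notation $F_{gt}^{-1}(\cdot)$ for $F_{Y_{gt}}^{-1}(\cdot)$. Under Conditions 1, 2, and 4, we have

\[
\sqrt{k_{gt}} \left( \hat\alpha_{gt} - \alpha_{gt} \right) \equiv \Gamma_{gt} \overset{d}{\to} N\left( 0, \alpha_{gt}^{2} \right) \tag{12}
\]

for all $(g, t) \in \{0, 1\}^2$; see Hill (1975). Moreover, under Conditions 1, 2, 4, and 5, we have

\[
\frac{\sqrt{k_{gt}}}{\log d_{gt}} \left( \frac{\hat F_{gt}^{-1}(q)}{F_{gt}^{-1}(q)} - 1 \right) \equiv \Lambda_{gt} \overset{d}{\to} N\left( 0, \alpha_{gt}^{-2} \right) \tag{13}
\]

for all $(g, t) \in \{0, 1\}^2$ by Theorem 4.3.8 in de Haan and Ferreira (2007), where $d_{gt} \equiv k_{gt} / (n_{gt}(1 - q))$. Given the independence of $\{Y_{gt}\}$ across $g$ and $t$ under Condition 1, $\{\hat F_{Y_{gt}}^{-1}(q), \hat\alpha_{gt}\}$ are also independent across $(g, t)$. Thus, it suffices to derive the limit of the second item in (6), that is,

\[
\begin{aligned}
\hat A_n &\equiv \hat F_{01}^{-1}\left( \hat F_{00}\left( \hat F_{10}^{-1}(q) \right) \right) \\
&= Y_{01}^{(k_{01}+1)} \left( \frac{Y_{10}^{(k_{10}+1)}}{Y_{00}^{(k_{00}+1)}} \right)^{\hat\alpha_{00}/\hat\alpha_{01}} \left( \frac{k_{01} n_{00}}{n_{01} k_{00}} \right)^{1/\hat\alpha_{01}} \left( \frac{k_{10}}{n_{10}} \right)^{\hat\alpha_{00}/(\hat\alpha_{10}\hat\alpha_{01})} (1 - q)^{-\hat\alpha_{00}/(\hat\alpha_{10}\hat\alpha_{01})} .
\end{aligned}
\]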

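We also write the population counterpart as

\[
A_n = F_{01}^{-1}\left( 1 - \frac{k_{01}}{n_{01}} \right) \left( \frac{F_{10}^{-1}\left( 1 - \frac{k_{10}}{n_{10}} \right)}{F_{00}^{-1}\left( 1 - \frac{k_{00}}{n_{00}} \right)} \right)^{\alpha_{00}/\alpha_{01}} \left( \frac{k_{01} n_{00}}{n_{01} k_{00}} \right)^{1/\alpha_{01}} \left( \frac{k_{10}}{n_{10}} \right)^{\alpha_{00}/(\alpha_{10}\alpha_{01})} (1 - q)^{-\alpha_{00}/(\alpha_{10}\alpha_{01})} ,
\]

and we are going to linearize $\hat A_n / A_n - 1$ around zero. First, note that we have

\[
\sqrt{k_{gt}} \left( \frac{Y_{gt}^{(k_{gt}+1)}}{F_{gt}^{-1}\left( 1 - k_{gt}/n_{gt} \right)} - 1 \right) \equiv \Delta_{gt} \overset{d}{\to} N\left( 0, \alpha_{gt}^{-2} \right) \tag{14}
\]

from Theorem 2.4.8 in de Haan and Ferreira (2007) and our Condition 4. Second, we decompose $\hat A_n / A_n$ as

\[
\begin{aligned}
\log\left( \hat A_n / A_n \right) &= \log\left( \frac{Y_{01}^{(k_{01}+1)}}{F_{01}^{-1}\left( 1 - \frac{k_{01}}{n_{01}} \right)} \right) \\
&\quad + \left[ \frac{\hat\alpha_{00}}{\hat\alpha_{01}} \log\left( \frac{Y_{10}^{(k_{10}+1)}}{Y_{00}^{(k_{00}+1)}} \right) - \frac{\alpha_{00}}{\alpha_{01}} \log\left( \frac{F_{10}^{-1}\left( 1 - \frac{k_{10}}{n_{10}} \right)}{F_{00}^{-1}\left( 1 - \frac{k_{00}}{n_{00}} \right)} \right) \right] \\
&\quad + \left( \frac{1}{\hat\alpha_{01}} - \frac{1}{\alpha_{01}} \right) \log\left( \frac{k_{01} n_{00}}{n_{01} k_{00}} \right) \\
&\quad + \left( \frac{\hat\alpha_{00}}{\hat\alpha_{10}\hat\alpha_{01}} - \frac{\alpha_{00}}{\alpha_{10}\alpha_{01}} \right) \log\left( \frac{k_{10}}{n_{10}(1 - q)} \right) \\
&\equiv I_{1n} + I_{2n} + I_{3n} + I_{4n} .
\end{aligned}
\]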

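For the first term, $I_{1n}$, we have

\[
I_{1n} = \log\left( \frac{Y_{01}^{(k_{01}+1)}}{F_{01}^{-1}\left( 1 - \frac{k_{01}}{n_{01}} \right)} \right) = \log\left( 1 + k_{01}^{-1/2} \Delta_{01} \right) = k_{01}^{-1/2} \Delta_{01} + o_p\left( k_{01}^{-1/2} \right)
\]

by (14). For the second term, $I_{2n}$, we decompose it as

\[
\begin{aligned}
I_{2n} &= \left( \frac{\hat\alpha_{00}}{\hat\alpha_{01}} - \frac{\alpha_{00}}{\alpha_{01}} \right) \log\left( \frac{F_{10}^{-1}\left( 1 - \frac{k_{10}}{n_{10}} \right)}{F_{00}^{-1}\left( 1 - \frac{k_{00}}{n_{00}} \right)} \right) \\
&\quad + \frac{\hat\alpha_{00}}{\hat\alpha_{01}} \left[ \log\left( \frac{Y_{10}^{(k_{10}+1)}}{Y_{00}^{(k_{00}+1)}} \right) - \log\left( \frac{F_{10}^{-1}\left( 1 - \frac{k_{10}}{n_{10}} \right)}{F_{00}^{-1}\left( 1 - \frac{k_{00}}{n_{00}} \right)} \right) \right] \\
&= \left[ \frac{\hat\alpha_{00} - \alpha_{00}}{\hat\alpha_{01}} - \frac{\alpha_{00}}{\hat\alpha_{01}\alpha_{01}} \left( \hat\alpha_{01} - \alpha_{01} \right) \right] \log\left( \frac{F_{10}^{-1}\left( 1 - \frac{k_{10}}{n_{10}} \right)}{F_{00}^{-1}\left( 1 - \frac{k_{00}}{n_{00}} \right)} \right) \\
&\quad + \frac{\hat\alpha_{00}}{\hat\alpha_{01}} \left[ \log\left( \frac{Y_{10}^{(k_{10}+1)}}{F_{10}^{-1}\left( 1 - \frac{k_{10}}{n_{10}} \right)} \right) - \log\left( \frac{Y_{00}^{(k_{00}+1)}}{F_{00}^{-1}\left( 1 - \frac{k_{00}}{n_{00}} \right)} \right) \right] \\
&= \left[ \frac{k_{00}^{-1/2} \Gamma_{00}}{\alpha_{01} + O_p\left( k_{01}^{-1/2} \right)} - \frac{\alpha_{00}}{\alpha_{01}^{2} + O_p\left( k_{01}^{-1/2} \right)} k_{01}^{-1/2} \Gamma_{01} \right] \log\left( \frac{F_{10}^{-1}\left( 1 - \frac{k_{10}}{n_{10}} \right)}{F_{00}^{-1}\left( 1 - \frac{k_{00}}{n_{00}} \right)} \right) \\
&\quad + \left( \frac{\alpha_{00}}{\alpha_{01}} + O_p\left( k_{00}^{-1/2} + k_{01}^{-1/2} \right) \right) \left[ k_{10}^{-1/2} \Delta_{10} - k_{00}^{-1/2} \Delta_{00} + o_p\left( k_{10}^{-1/2} + k_{00}^{-1/2} \right) \right]
\end{aligned}
\]

by (12) and (14). For term $I_{3n}$, we rewrite it as

\[
I_{3n} = \left[ k_{01}^{-1/2} \Gamma_{01} + o_p\left( k_{01}^{-1/2} \right) \right] \log\left( \frac{k_{01} n_{00}}{n_{01} k_{00}} \right)
\]

by (12). For term $I_{4n}$, we rewrite it as

\[
I_{4n} = \left[ \frac{k_{00}^{-1/2} \Gamma_{00}}{\alpha_{10}\alpha_{01}} - \frac{\alpha_{00}}{\alpha_{10}^{2}\alpha_{01}} k_{10}^{-1/2} \Gamma_{10} - \frac{\alpha_{00}}{\alpha_{10}\alpha_{01}^{2}} k_{01}^{-1/2} \Gamma_{01} + o_p\left( k_{00}^{-1/2} + k_{10}^{-1/2} + k_{01}^{-1/2} \right) \right] \log\left( \frac{k_{10}}{n_{10}(1 - q)} \right)
\]

by (12).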

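Conditions 4 and 5 imply that $d_{gt} = k_{gt} / [n_{gt}(1 - q)] \to \infty$ for all $(g, t) \in \{0, 1\}^2$. Moreover, Condition 2 implies that $F_{gt}^{-1}\left( 1 - \frac{k_{gt}}{n_{gt}} \right) = O\left( (k_{gt}/n_{gt})^{-1/\alpha_{gt}} \right)$ for all $(g, t)$. Then using L'Hospital's rule, Condition 3, and $d_{gt} \to \infty$, we obtain that

\[
\begin{aligned}
\frac{1}{\log(d_{11})} \log\left( \frac{F_{10}^{-1}\left( 1 - \frac{k_{10}}{n_{10}} \right)}{F_{00}^{-1}\left( 1 - \frac{k_{00}}{n_{00}} \right)} \right)
&= O\left( - \frac{\alpha_{10}^{-1} \log(k_{10}/n_{10})}{\log(k_{11}/n_{11}) - \log(1 - q)} + \frac{\alpha_{00}^{-1} \log(k_{00}/n_{00})}{\log(k_{11}/n_{11}) - \log(1 - q)} \right) \\
&= o(1) .
\end{aligned}
\]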

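Now using the above derivations, we obtain

\[
\begin{aligned}
\frac{\sqrt{k_{11}}}{\log(d_{11})} I_{1n} &= O_p\left( \frac{1}{\log(d_{11})} \right) = o_p(1) , \\
\frac{\sqrt{k_{11}}}{\log(d_{11})} I_{2n} &= O_p\left( \frac{1}{\log(d_{11})} \log\left( \frac{F_{10}^{-1}\left( 1 - \frac{k_{10}}{n_{10}} \right)}{F_{00}^{-1}\left( 1 - \frac{k_{00}}{n_{00}} \right)} \right) \right) = o_p(1) , \\
\frac{\sqrt{k_{11}}}{\log(d_{11})} I_{3n} &= O_p\left( \frac{1}{\log(d_{11})} \log\left( \frac{k_{01} n_{00}}{n_{01} k_{00}} \right) \right) = o_p(1) , \\
\frac{\sqrt{k_{11}}}{\log(d_{11})} I_{4n} &= \frac{\log(d_{10})}{\log(d_{11})} \left[ \left( \frac{k_{11}}{k_{00}} \right)^{1/2} \frac{\Gamma_{00}}{\alpha_{10}\alpha_{01}} - \left( \frac{k_{11}}{k_{10}} \right)^{1/2} \frac{\alpha_{00} \Gamma_{10}}{\alpha_{10}^{2}\alpha_{01}} - \left( \frac{k_{11}}{k_{01}} \right)^{1/2} \frac{\alpha_{00} \Gamma_{01}}{\alpha_{10}\alpha_{01}^{2}} \right] + o_p(1) .
\end{aligned}
\]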

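Now, combining $I_{1n}$, $I_{2n}$, $I_{3n}$, and $I_{4n}$, and using the fact that $\exp(x) = 1 + x + O(x^2)$ as $x \to 0$, we obtain

\[
\begin{aligned}
\frac{\sqrt{k_{11}}}{\log(d_{11})} \left( \frac{\hat A_n}{A_n} - 1 \right)
&= \frac{\log(d_{10})}{\log(d_{11})} \left[ \left( \frac{k_{11}}{k_{00}} \right)^{1/2} \frac{\Gamma_{00}}{\alpha_{10}\alpha_{01}} - \left( \frac{k_{11}}{k_{10}} \right)^{1/2} \frac{\alpha_{00} \Gamma_{10}}{\alpha_{10}^{2}\alpha_{01}} - \left( \frac{k_{11}}{k_{01}} \right)^{1/2} \frac{\alpha_{00} \Gamma_{01}}{\alpha_{10}\alpha_{01}^{2}} \right] + o_p(1) \\
&\overset{d}{\to} N\left( 0, \left( \frac{\lambda_{11/10}}{\eta_{11/10}} \right)^{2} \left( \lambda_{11/00} \frac{\alpha_{00}^{2}}{\alpha_{10}^{2}\alpha_{01}^{2}} + \lambda_{11/10} \frac{\alpha_{00}^{2}}{\alpha_{10}^{2}\alpha_{01}^{2}} + \lambda_{11/01} \frac{\alpha_{00}^{2}}{\alpha_{10}^{2}\alpha_{01}^{2}} \right) \right)
\end{aligned}
\]

by independence among $\Gamma_{00}$, $\Gamma_{10}$, and $\Gamma_{01}$, Condition 3, and (12).

Finally, using (13) with $g = t = 1$ and the condition that $F_{11}^{-1}(q) / A_n \to \varsigma$, we obtain

\[
\begin{aligned}
\frac{k_{11}^{1/2}}{\log d_{11}} \, \frac{\hat\tau_q^{eCIC} - \tau_q^{eCIC}}{F_{11}^{-1}(q)}
&= \frac{k_{11}^{1/2}}{\log d_{11}} \left( \frac{\hat F_{11}^{-1}(q) - F_{11}^{-1}(q)}{F_{11}^{-1}(q)} - \frac{\hat A_n - A_n}{F_{11}^{-1}(q)} \right) \\
&= \frac{k_{11}^{1/2}}{\log d_{11}} \left( \frac{\hat F_{11}^{-1}(q)}{F_{11}^{-1}(q)} - 1 \right) - \frac{k_{11}^{1/2}}{\log d_{11}} \left( \frac{\hat A_n}{A_n} - 1 \right) \frac{A_n}{F_{11}^{-1}(q)} \\
&\overset{d}{\to} N(0, \Omega) ,
\end{aligned}
\]

where

\[
\Omega = \alpha_{11}^{-2} + \left( \frac{1}{\varsigma} \right)^{2} \left( \frac{\lambda_{11/10}}{\eta_{11/10}} \right)^{2} \left( \lambda_{11/00} + \lambda_{11/10} + \lambda_{11/01} \right) \frac{\alpha_{00}^{2}}{\alpha_{10}^{2}\alpha_{01}^{2}} .
\]

This completes the proof.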
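As a numerical sanity check on the closed form of $\hat A_n$ used at the start of this proof, the following Python sketch composes three Pareto-type tail extrapolations and compares the result with the product formula. It rests on assumptions that are not confirmed by this appendix: the Hill estimator is taken as $\hat\alpha = \left( k^{-1} \sum_{i=1}^{k} \log( Y^{(i)} / Y^{(k+1)} ) \right)^{-1}$, the extrapolated quantile as $Y^{(k+1)} (k / (n(1-q)))^{1/\hat\alpha}$, and the extrapolated distribution function as $1 - (k/n)(y / Y^{(k+1)})^{-\hat\alpha}$. All function names and the simulated Pareto samples are hypothetical.

```python
import math
import random


def hill_fit(sample, k):
    """Return (n, k, threshold, alpha_hat); threshold is the (k+1)-th largest value."""
    ordered = sorted(sample, reverse=True)
    threshold = ordered[k]                       # Y^{(k+1)}
    mean_log_excess = sum(math.log(y / threshold) for y in ordered[:k]) / k
    return len(sample), k, threshold, 1.0 / mean_log_excess


def quantile_extrap(fit, q):
    """Assumed Pareto-type extrapolation of the q-th quantile (right tail)."""
    n, k, threshold, alpha = fit
    return threshold * (k / (n * (1.0 - q))) ** (1.0 / alpha)


def cdf_extrap(fit, y):
    """Assumed Pareto-type extrapolation of the distribution function at y."""
    n, k, threshold, alpha = fit
    return 1.0 - (k / n) * (y / threshold) ** (-alpha)


def a_hat_composed(f00, f01, f10, q):
    """Composition F01^{-1}(F00(F10^{-1}(q))) of the extrapolated functions."""
    return quantile_extrap(f01, cdf_extrap(f00, quantile_extrap(f10, q)))


def a_hat_closed_form(f00, f01, f10, q):
    """Product formula for A_hat_n as displayed in the proof."""
    n00, k00, y00, a00 = f00
    n01, k01, y01, a01 = f01
    n10, k10, y10, a10 = f10
    return (y01
            * (y10 / y00) ** (a00 / a01)
            * (k01 * n00 / (n01 * k00)) ** (1.0 / a01)
            * (k10 / n10) ** (a00 / (a10 * a01))
            * (1.0 - q) ** (-a00 / (a10 * a01)))


rng = random.Random(1)
# Simulated Pareto samples (placeholders) for the cells (0,0), (0,1), (1,0).
f00 = hill_fit([rng.paretovariate(2.0) for _ in range(5000)], k=200)
f01 = hill_fit([rng.paretovariate(2.5) for _ in range(4000)], k=200)
f10 = hill_fit([rng.paretovariate(3.0) for _ in range(6000)], k=200)
composed = a_hat_composed(f00, f01, f10, q=0.99)
closed = a_hat_closed_form(f00, f01, f10, q=0.99)
print(composed, closed)
```

Under these assumed functional forms the two numbers agree up to floating-point error, which is the algebraic identity that the linearization of $\hat A_n / A_n - 1$ starts from.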

B Proof of Corollary 1

Proof. The proof follows once we establish (12)–(14). Our Condition 7 is the same as Girard et al. (2021, eq. (2)). Our Condition 2 is sufficient for their second-order Pareto tail condition $\mathcal{C}_2(\gamma, \rho, A)$. Then (12) and (14) directly follow from their Corollary 2.1. Using the same argument as in the proof of Theorem 4.3.8 in de Haan and Ferreira (2007), (13) further follows from (12), (14), and our Condition 2.
