Matching to Remove Bias in Observational Studies

Author(s): Donald B. Rubin


Source: Biometrics, Vol. 29, No. 1 (Mar., 1973), pp. 159-183
Published by: International Biometric Society

Stable URL: https://www.jstor.org/stable/2529684

BIOMETRICS 29, 159-183
March 1973

MATCHING TO REMOVE BIAS IN OBSERVATIONAL STUDIES

DONALD B. RUBIN¹

Department of Statistics, Harvard University, Cambridge, Massachusetts 02138, USA

¹ Present address: Educational Testing Service, Princeton, New Jersey 08540.

SUMMARY

Several matching methods that match all of one sample from another larger sample on a
continuous matching variable are compared with respect to their ability to remove the bias
of the matching variable. One method is a simple mean-matching method and three are
nearest available pair-matching methods. The methods' abilities to remove bias are also
compared with the theoretical maximum given fixed distributions and fixed sample sizes.
A summary of advice to an investigator is included.

1. INTRODUCTION

Matched sampling is a method of data collection and organization designed to reduce bias and increase precision in observational studies, i.e.
in those studies in which the random assignment of treatments to units
(subjects) is absent. Although there are examples of observational studies
which could have been conducted as properly randomized experiments, in
many other cases the investigator could not randomly assign treatments
to subjects. For example, consider the Kihlberg and Robinson [1968] study
comparing severity of injury in automobile accidents for motorists using
and not using seatbelts. One would not want to randomly assign subjects
to "seatbelt" and "no seatbelt" treatments and then have them collide at
varying speeds, angles of impact, etc. Neither, however, would one want to
simply compare the severity of injury in "random" samples of motorists in
accidents using and not using seatbelts; important variables such as "speed
of automobile at time of accident" may be differently distributed in the two
groups (i.e. seatbelted motorists are generally more cautious and therefore
tend to drive more slowly). Hence, in observational studies, methods such
as matched sampling or covariance adjustment are often needed to control
bias due to specific variables.
We will investigate matched sampling on one continuous matching variable
X (e.g., speed of automobile at time of accident) and two treatment popula-
tions, P1 and P2 (e.g., motorists in accidents using and not using seatbelts).
Several articles have previously considered this situation. However, most of
these have assumed that the average difference in the dependent variable


between the matched samples is an unbiased estimate of the effect of the treatment and thus were interested in the ability of matching to increase the precision of this estimate. See, for example, Wilks [1932], Cochran [1953], Greenberg [1953], and Billewicz [1965]. Here, we will investigate the ability
of matched sampling to reduce the bias of this estimate due to a matching
variable whose distribution differs in P1 and P2 (e.g., to reduce the bias due
to "speed at time of accident").
We assume that there is a random sample of size N from P1 , say G1
and a larger random sample of size rN, r > 1, from P2, say G2 . All subjects
in G1 and G2 are assumed to have recorded scores on the matching variable X.
Using these scores, a subsample of G2 of size N will be chosen according to
some "matching method"; we call this subsample G2* . The effect of the
treatment will then be estimated from the G1 and G2* samples both of size N.
If r is one, G2* would be a random sample from P2 , and matching could not
remove any bias due to X; if r is infinite, perfect matches could always be
obtained, and all of the bias due to X could be removed. We will study
moderate ratios of sample sizes, basically r = 2, 3, 4, although some results
are given for r = 6, 8, 10.
Following Cochran [1968], we will use "the percent reduction in the
bias of X due to matched sampling" as the measure of the ability of a match-
ing method to reduce the bias of the estimated effect of the treatment;
justification for this choice is given in section 2. Then section 3 states and
proves a theorem giving the maximum obtainable percent reduction in bias
given fixed distributions of X in P1 and P2 and fixed sample sizes N and rN.
In section 4, the ability of a simple mean-matching method to reduce bias
will be compared with the theoretical maximum. In section 5, we compare
three "nearest available" pair-matching methods with respect to their
ability to reduce bias. Section 6 serves to present practical advice to an
investigator.

2. TERMINOLOGY; PERCENT REDUCTION IN BIAS

Suppose that we want to determine the effect of a dichotomous treatment variable on a continuous dependent variable, Y, given that the effect of a
continuous matching variable, X, has been removed.2 The dichotomous
treatment variable is used to form two populations P1 and P2 . In P1 and P2
X and Y have joint distributions which in general differ from P1 to P2 .
In Pi the conditional expectation of the dependent variable Y given a particular value of X is called the response surface for Y in Pi, and at X = x is denoted Ri(x).

² As Cochran [1968] points out, if the matching variable X is causally affected by the treatment variable, some of the real effect of the treatment variable will be removed in the adjustment process.

The difference in response surfaces at X = x, R1(x) − R2(x), is the effect
of the treatment variable at X = x. If this difference between response
surfaces is constant and so independent of the values of the matching variable,


the response surfaces are called parallel, and the objective of the study is
the estimation of the constant difference between them. See Figure 1. For
linear response surfaces, "parallel response surfaces" is equivalent to "having
the same slope".

FIGURE 1
PARALLEL UNIVARIATE RESPONSE SURFACES
[Plot of R1(x) and R2(x) against the matching variable X.]


If R1(x) − R2(x) depends on x, then there is no single parameter that completely summarizes the effect of the treatment variable. In this case we will assume that the average effect of the treatment variable (the average difference between the response surfaces) over the P1 population is desired. Such a summary is often of interest, especially when P1 consists of subjects exposed to an agent and P2 consists


of controls not exposed to the agent; see for example Belsen's [1956] study of the effect of an educational television program.³

FIGURE 2
NONPARALLEL UNIVARIATE RESPONSE SURFACES
[Plot of R1(x) and R2(x) against the matching variable X.]

³ In other cases, however, this average difference may not be of primary interest. Consider for example the previously mentioned study of the efficacy of seatbelts. Assume that if automobile speed is high seatbelts reduce the severity of injury, while if automobile speed is low seatbelts increase the severity of injury. (See Figure 2, where P1 = motorists using seatbelts, P2 = motorists not using seatbelts, X = automobile speed, and Y = severity of injury.) A report of this result would be more interesting than a report that there was no effect of seatbelts on severity of injury when averaged over the seatbelt wearer population. Since such a report may be of little interest if the response surfaces are markedly nonparallel, the reader should generally assume "nonparallel" to mean "moderately nonparallel." If the response surfaces are markedly nonparallel and the investigator wants to estimate the effect of the treatment variable averaged over P2 (the population from which he has the larger sample), the methods and results presented here are not relevant and a more complex method such as covariance analysis would be more appropriate than simple matching. (See Cochran [1969] for a discussion of covariance analysis in observational studies.)
The average difference between non-parallel response surfaces over the
P1 population or the constant difference between parallel response surfaces
will be called the (average) effect of the treatment variable or more simply
"the treatment effect" and will be designated r:

r = EI{Rl(x) - R2(X)}, (2.1)

where E1 is the expectation over the distribution of X in P1.


Let y1j and x1j be the values of Y and X for the jth subject in G1, and similarly let y2j and x2j be the values of Y and X for the jth subject in G2*, j = 1, ..., N. Using the response surface notation we can write

yij = Ri(xij) + eij,   i = 1, 2;  j = 1, ..., N,   (2.2)

where Ec(eij) = 0 and Ec is the conditional expectation given the xij.

We assume that the difference between dependent variable averages in G1 and G2* will be used to estimate τ:

τ̂0 = (1/N) Σj y1j − (1/N) Σj y2j = ȳ1· − ȳ2·.   (2.3)

Let E represent the expectation over the distributions of X in matched samples and E2* represent the expectation over the distribution of X in matched G2* samples. Then using (2.3) and (2.1) we have that the expected bias of τ̂0 over the matched sampling plan is

E Ec(τ̂0 − τ) = E1R2(x) − E2*R2(x)   (2.4)

since E Ec(ȳ2·) = E2*R2(x) and E Ec(ȳ1·) = E1R1(x). If the distribution of X in matched G2* samples is identical to that in random G1 samples then E1R2(x) = E2*R2(x) and τ̂0 has zero expected bias. If r = 1, that is if the G2* sample is a random sample from P2, then the expected bias of τ̂0 is E1R2(x) − E2R2(x), where E2 is the expectation over the distribution of X in P2.

In order to indicate how much less biased τ̂0 based on matched samples is than τ̂0 based on random samples, Cochran [1968] uses "the percent reduction in bias" or more precisely "the percent reduction in expected bias": 100 × (1 − expected bias for matched samples/expected bias for random samples), which from (2.4) is


100 [1 − (E1R2(x) − E2*R2(x)) / (E1R2(x) − E2R2(x))] = 100 (E2*R2(x) − E2R2(x)) / (E1R2(x) − E2R2(x)).   (2.5)

Notice that the percent reduction in bias due to matched sampling depends
only on the distribution of X in P1 , P2 and matched G2* samples, and the
response surface in P2 . If the response surface in P2 is linear,

R2(x) = μ2 + β2(x − η2)


where

μ2 = mean of Y in P2,

ηi = mean of X in Pi,
and

β2 = regression coefficient of Y on X in P2,


we have for the denominator of (2.5) β2(η1 − η2) and for the numerator of (2.5) β2(η2* − η2), where η2* is the expected value of X in matched G2* samples, E2*(x) (equivalently, η2* is the expected average X in G2* samples, E(x̄2·)).
Thus, if G1 is a random sample and the response surface in P2 is linear,
the percent reduction in bias due to matched sampling is

θ = 100 (η2* − η2) / (η1 − η2),   (2.6)

which is the same as the percent reduction in bias of the matching variable
X. Even though only an approximation if the P2 response surface is not linear,
we will use θ, the percent reduction in the bias of the matching variable, to
measure the ability of a matching method to remove bias.
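As a brief illustration (the numbers here are invented for this note, not taken from the paper): if the mean of X is η1 = 60 in P1 and η2 = 50 in P2, and a matching method produces matched G2* samples with expected mean η2* = 58, then

θ = 100 (58 − 50) / (60 − 50) = 80,

so matching would be expected to remove 80% of the initial bias in X.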

3. THE MAXIMUM PERCENT REDUCTION IN BIAS GIVEN FIXED DISTRIBUTIONS AND FIXED SAMPLE SIZES

Assume that in Pi X has mean ηi (without loss of generality let η1 > η2), variance σi², and (X − ηi)/σi ~ fi, i = 1, 2. Define the initial bias in X to be

B = (η1 − η2) / [(σ1² + σ2²)/2]^½ > 0,

which if σ1 = σ2 is simply the number of standard deviations between the means of X in P1 and P2.

Then if θ is the percent reduction in bias of X due to some matching method that selects a matched sample, G2*, of N subjects from a random sample, G2, of rN P2 subjects, we have

θ ≤ θmax = 100 Ω2(r, N) / (B [(1 + σ1²/σ2²)/2]^½),   (3.1)


where Ω2(r, N) = the expected value of the average of the N largest observations from a random sample of size rN from f2.

Since a reduction in bias greater than 100% is clearly less desirable than a 100% reduction in bias, if B, σ1/σ2, and Ω2(r, N) are such that θmax ≥ 100, this should be interpreted as implying the existence of a matching method that obtains 100% reduction in expected bias.⁴

⁴ A matching method that has as its percent reduction in expected bias min{100, θmax} may be of little practical interest. For example, consider the following matching method. With probability P = min{1, 100/θmax} choose the N G2 subjects with the largest observations as the G2* sample, and with probability 1 − P choose a random sample of size N as the G2* sample. It is easily checked that the percent reduction in expected bias using this method is min{100, θmax}.
This result follows immediately from (2.6): since η1 > η2, θ is largest when η2* (i.e. E(x̄2·)) is largest, which is clearly achieved when the N subjects in G2 with the largest X values are always chosen as matches. The expected value of these N largest values from a sample of rN is η2 + σ2 Ω2(r, N). Hence, the maximum value of θ is

θmax = 100 σ2 Ω2(r, N) / (η1 − η2) = 100 Ω2(r, N) / (B [(1 + σ1²/σ2²)/2]^½).
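As a worked illustration using Table 3.2 values for f2 normal (the parameter combinations are chosen here only for illustration): with B = 1, σ1²/σ2² = 2, r = 2, and N large, Ω2(2, ∞) = 0.80 and

θmax = 100 (0.80) / (1 × [(1 + 2)/2]^½) ≈ 65,

so no matching method could be expected to remove more than about 65% of the bias; with B = 1/2, σ1²/σ2² = 1, and r = 3, Ω2(3, ∞) = 1.09 gives θmax = 100 (1.09)/(1/2) = 218, so a method achieving 100% reduction in expected bias exists.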

The result in (3.1) is of interest here for two reasons. First, for fixed distributions and sample sizes and given a particular matching method, a comparison of θ and min{100, θmax} clearly gives an indication of how well that matching method does at obtaining a G2* sample whose expected X mean is close to η1. In addition, the expression for θmax will be used to help explain trends in Monte Carlo results. When investigating matching methods that might be used in practice to match finite samples, properties such as percent reduction in bias are generally analytically intractable. Hence, Monte Carlo methods must be used on specific cases. From such Monte Carlo investigations it is often difficult to generalize to other cases or explain trends with much confidence unless there is some analytic or intuitive reason for believing the trends will remain somewhat consistent. It seems clear that if θmax is quite small (e.g. 20) no matching method will do very well, while if θmax is large (e.g. 200) most reasonable matching methods should do moderately well. Hence, we will use trends in θmax to help explain trends in the Monte Carlo results that follow.
Two trends for θmax are immediately obvious from (3.1).

(1) Given fixed r, N, f2 and σ1²/σ2², θmax decreases as B increases.

(2) Given fixed r, N, f2 and B, θmax decreases as σ1²/σ2² increases.

Given fixed f2, B, and σ1²/σ2², two other trends are derivable from simple properties of the order statistics and the fact that θmax is directly proportional to Ω2(r, N) (see Appendix A for proofs).

(3) Given fixed B, σ1²/σ2², f2 and N, θmax increases as r increases: Ω2(r, N) ≤ Ω2(r + a, N), a > 0; N, rN, aN integers.


(4) Given fixed B, σ1²/σ2², f2 and r, θmax increases as N increases: Ω2(r, N) ≤ Ω2(r, N + b), b > 0; N, rN, rb integers.

From the fourth trend, we have Ω(r, 1) ≤ Ω(r, N) ≤ Ω(r, ∞). Values of Ω(r, 1) have been tabulated in Sarhan and Greenberg [1962] for several distributions as the expected value of the largest of r observations. Ω(r, ∞) can easily be calculated by using the asymptotic result

Ω(r, ∞) = r ∫ from z0 to ∞ of z f(z) dz,   where z0 satisfies ∫ from z0 to ∞ of f(z) dz = 1/r.

Values of Ω(r, 1) and Ω(r, ∞) are given in Table 3.1 for X ~ +χ²ν and −χ²ν (ν = 2(2)10) and X normal, and for r = 2, 3, 4, 6, 8, 10.

TABLE 3.1
Ω(r, 1) AND Ω(r, ∞) FOR f = +χ²ν AND −χ²ν (ν = 2, 4, 6, 8, 10) AND f NORMAL
[Values illegible in this copy.]

Table 3.1 can be summarized as follows.

(a) For fixed r and ν, the results for +χ²ν are more similar to those for the normal than are those for −χ²ν. This result is expected since the largest N observations come from the right tail of the distribution and the right tail of +χ²ν is more normal than the right tail of −χ²ν, which is finite.
(b) Given a fixed distribution, as r gets larger the results differ more from those for the normal, especially for −χ²ν. Again this is not surprising because the tails of low degree of freedom χ² are not very normal, especially the finite tail.
(c) For r = 2, 3, 4, and moderately normal distributions (±χ²ν, ν ≥ 8) the results for the normal can be considered somewhat representative. This conclusion is used to help justify the Monte Carlo investigations of a normally distributed matching variable in the remainder of this article.
(d) Given a fixed distribution and fixed r, the values for Ω(r, 1) are generally within 20% of those for Ω(r, ∞), suggesting that when dealing with moderate sample sizes as might commonly occur in practice, we would expect the fourth trend (θmax an increasing function of N) to be rather weak.

In Table 3.2 values of Ω(r, N) are given assuming f normal, the same values of r as in Table 3.1, and N = 1, 2, 5, 10, 25, 50, 100, ∞. Values were found with the aid of Harter [1960]. For fixed r, the values of Ω(r, N) for N ≥ 10 are very close to the asymptotic value Ω(r, ∞), especially when r > 2. Even Ω(2, 10) is within about 3% of Ω(2, ∞). These results indicate that the values for Ω(r, ∞) given in Table 3.1 may be quite appropriate for moderate sample sizes.

TABLE 3.2
Ω(r, N); f NORMAL

            r = 2      3      4      6      8     10
  N =   1    0.56   0.85   1.03   1.27   1.42   1.54
        2    0.66   0.96   1.14   1.38   1.53   1.64
        5    0.74   1.03   1.22   1.45   1.60   1.70
       10    0.77   1.06   1.24   1.47   1.62   1.72
       25    0.78   1.08   1.26   1.49   1.64   1.74
       50    0.79   1.08   1.27   1.50   1.65   1.75
      100    0.80   1.09   1.27   1.50   1.65   1.75
        ∞    0.80   1.09   1.27   1.50   1.65   1.75
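The asymptotic values for f normal are easy to verify. The following short program is a sketch added in this transcription (it is not part of the original appendix, and the program name is invented); it evaluates Ω(r, ∞) = r ∫ z f(z) dz over (z0, ∞), with z0 the (1 − 1/r) quantile found by bisection, using the fact that for the standard normal that integral equals the density at z0. Run with a modern Fortran compiler, it reproduces the N = ∞ row of Table 3.2 to two decimals.

! Sketch (added in this transcription, not from the original paper):
! evaluates Omega(r, infinity) = r * integral over (z0, infinity) of z f(z) dz,
! where the integral of f over (z0, infinity) equals 1/r, for f standard normal.
! For the normal this reduces to r * phi(z0), with z0 = Phi^{-1}(1 - 1/r)
! located by bisection on the upper-tail probability.
program omega_normal_asymptotic
  implicit none
  integer, parameter :: rvals(6) = (/ 2, 3, 4, 6, 8, 10 /)
  real(8) :: r, z0, lo, hi, mid, phi0
  integer :: i, it

  do i = 1, 6
     r = dble(rvals(i))
     lo = -8.0d0
     hi =  8.0d0
     do it = 1, 60
        mid = 0.5d0 * (lo + hi)
        ! upper-tail probability P(Z > mid) for the standard normal
        if (0.5d0 * erfc(mid / sqrt(2.0d0)) > 1.0d0 / r) then
           lo = mid
        else
           hi = mid
        end if
     end do
     z0 = 0.5d0 * (lo + hi)
     ! standard normal density at the cutoff z0
     phi0 = exp(-0.5d0 * z0 * z0) / sqrt(8.0d0 * atan(1.0d0))
     write (*, '(a, i3, a, f5.2)') ' r =', rvals(i), '   Omega =', r * phi0
  end do
end program omega_normal_asymptotic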

4. MEAN-MATCHING

Thus far we have not specified any particular matching method. Under
the usual linear model "mean-matching" or "balancing" (Greenberg [1953])
methods are quite reasonable but appear to be discussed rarely in the litera-
ture. In this section we will obtain Monte Carlo percent reductions in bias


for a simple mean-matching method and compare these with the theoretical maximums given by (3.1).

Assuming linear response surfaces it is simple to show from (2.3) that the bias of τ̂0 for estimating τ is β2(x̄1· − x̄2·) + (β1 − β2)(x̄1· − η1), where βi is the regression coefficient of Y on X in Pi and x̄i· is the average X in the matched samples. Using x̄1· to estimate η1, or assuming parallel response surfaces (β1 = β2), one would minimize the estimated bias of τ̂0 by choosing the N G2 subjects such that |x̄1· − x̄2·| is minimized. A practical argument against using this mean-matching method is that finding such a subset requires the use of some time consuming algorithm designed to solve the transportation problem. Many compromise algorithms can of course be defined that approximate this best mean-match.
We will present Monte Carlo percent reductions in bias only for the following very simple mean-matching method. At the kth step, k = 1, ..., N, choose the G2 subject such that the mean of the current G2* sample of k subjects is closest to x̄1·. Thus, at step 1 choose the G2 subject closest to x̄1·; at step 2 choose the G2 subject such that the average of the first G2* subject and the additional G2 subject is closest to x̄1·; continue until N G2 subjects are chosen.
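Equivalently, at the kth step the method picks the available G2 subject whose score is closest to k x̄1· minus the sum of the k − 1 scores already chosen. For instance (with invented numbers): if x̄1· = 10.0 and the subject chosen at step 1 has score 9.4, then at step 2 one seeks the G2 subject whose score x minimizes |(9.4 + x)/2 − 10.0|, i.e. the available subject closest to 2(10.0) − 9.4 = 10.6.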
In Table 4.1 we present Monte Carlo values of θMN, the percent reduction in bias for this simple mean-matching method.⁵ We assume X normal; B = 1/4, 1/2, 3/4, 1; σ1²/σ2² = 1/2, 1, 2; N = 25, 50, 100; and r = 2, 3, 4. Some limited experience indicates that these values are typical of those that might occur in practice. In addition, values of r and N were chosen with the results of
Tables 3.1 and 3.2 in mind: values for percent reduction in bias may be moderately applicable for nonnormal distributions, especially when r = 2, and values given when N = 100 may be quite representative for N > 100.

⁵ The standard errors for all Monte Carlo values given in Tables 4.1, 5.1, 5.2, and 5.3 are generally less than 0.5% and rarely greater than 1%.

TABLE 4.1
MONTE CARLO VALUES OF θMN, THE PERCENT REDUCTION IN BIAS FOR THE SIMPLE MEAN-MATCHING METHOD; X NORMAL
[Values illegible in this copy.]

θMN exhibits the four trends given in section 3 for θmax.

(1) Given fixed N, r, and σ1²/σ2², θMN decreases as B increases.

(2) Given fixed N, r, and B, θMN decreases as σ1²/σ2² increases.
(3) Given fixed B, σ1²/σ2², and N, θMN increases as r increases.
(4) Given fixed B, σ1²/σ2², and r, except for one value (67% for N = 50, σ1²/σ2² = 2, r = 2, B = 1), θMN increases as N increases.

In Table 4.2 we present values of min{100, θmax} for the same range of N, B and σ1²/σ2² as in Table 4.1. Note first that the 67% for N = 50, σ1²/σ2² = 2, r = 2, B = 1 mentioned above is larger than the theoretical maximum and thus suspect. Comparing the corresponding entries in Table 4.1 and Table 4.2 we see that the values for N = 100 always attain at least 96% of min{100, θmax}, while the values for N = 50 always attain at least 91% of min{100, θmax}, and those for N = 25 always attain at least 87% of min{100, θmax}. Hence this simple method appears to be a very reasonable mean-matching method, especially for large samples.

TABLE 4.2
VALUES OF min{100, θmax} FOR THE SAME RANGE OF N, B, AND σ1²/σ2² AS IN TABLE 4.1
[Values illegible in this copy.]

5. PAIR-MATCHING

Even though a simple mean-matching method can be quite successful at removing the bias of X, matched samples are generally not mean-matched.
Usually matched samples are "individually" (Greenwood [1945]), "precision"
(Chapin [1947]), or "pair" (Cochran [1953]) matched, subject by subject.
The main reason is probably some intuitive feeling on the part of investigators
that pair-matched samples are superior. One theoretical justification is that
τ̂0 based on exactly mean-matched samples has zero expected bias only if the P2 response surface really is linear, while τ̂0 based on exactly pair-matched
samples has zero expected bias no matter what the form of the response
surface. Since an investigator rarely knows for sure that the P2 response
surface is linear, if the choice is between exactly pair-matched samples
and exactly mean-matched samples of the same size, obviously he would
choose the exactly pair-matched samples.
The ease of constructing confidence limits and tests of significance is
a second reason for using a pair-matching method rather than a mean-
matching method. Significance tests and confidence limits that take advantage
of the increased precision in matched samples are easily constructed with
pair-matched data by using matched pair differences, while such tests and
limits for mean-matched data must be obtained by an analysis of covariance
(Greenberg [1953]).
Another reason for the use of pair-matching methods is that each matched pair could be considered a study in itself. Thus, the investigator might assume the response surfaces are nonparallel and use the difference y1j − y2j to estimate the response surface difference at x1j. It follows from (2.2) that the


bias of y1j − y2j for estimating R1(x1j) − R2(x1j) is R2(x1j) − R2(x2j). Assuming this bias to be some unknown increasing function of |x1j − x2j|, one minimizes the bias of each estimate, y1j − y2j, by minimizing each |x1j − x2j| rather than |x̄1· − x̄2·|.
If each G1 subject is closest to a different G2 subject, assigning matches to minimize each |x1j − x2j| is easily done. However, if two or more G1 subjects are closest to the same G2 subject, the best way to assign individual matches is not obvious, unless the investigator decides upon some criterion to be minimized, such as one proportional to the average squared bias of the N individual estimates assuming parallel linear response surfaces, (1/N) Σ (x1j − x2j)². As was already mentioned, in order to find the G2* sample that minimizes any such quantity, some rather time consuming algorithm designed to solve the transportation problem must be used.
Even though more complex pair-matching methods often may be superior, we will investigate three simple "nearest available" pair-matching methods. A nearest available pair-matching method assigns the closest match for each G1 subject from the yet unmatched G2 subjects and thus is completely defined if the order for matching the G1 subjects is specified. The three orderings of the G1 subjects to be considered here are: (1) the subjects are randomly ordered (random), (2) the subject not yet matched with the lowest score on X is matched next (low-high), and (3) the subject not yet matched with the highest score on X is matched next (high-low). The results will depend on our assumption that η1 > η2, for if η1 were less than η2, the values for the low-high and high-low pair-matching methods would be interchanged.
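A small invented example (not from the paper) shows how the ordering matters. Suppose G1 has scores {4, 5} and G2 has scores {3, 4.8, 6}. Under the low-high ordering the subject with score 4 is matched first, to 4.8, and the subject with score 5 is then matched to 6, so the matched G2* scores are {4.8, 6} with mean 5.4. Under the high-low ordering the subject with score 5 is matched first, to 4.8, and the subject with score 4 is then matched to 3, giving matched scores {4.8, 3} with mean 3.9. With η1 > η2, the low-high ordering thus tends to leave the low G2 scores unused and to produce the larger x̄2·.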
In Tables 5.1, 5.2, and 5.3 we present Monte Carlo values for the percent reduction in bias for random ordering (θRD), low-high ordering (θLH), and high-low ordering (θHL). We assume the same range of conditions as given in Table 4.1 for θMN.

TABLES 5.1, 5.2, AND 5.3
PERCENT REDUCTION IN BIAS FOR THE RANDOM, LOW-HIGH, AND HIGH-LOW ORDER NEAREST AVAILABLE PAIR-MATCHING METHODS; SAME CONDITIONS AS TABLE 4.1
[Values illegible in this copy.]

θRD and θHL exhibit the four trends given in section 3 for θmax and exhibited in Table 4.1 for θMN.

(1) Given fixed N, r, and σ1²/σ2², θRD and θHL decrease as B increases.
(2) Given fixed N, r, and B, θRD and θHL decrease as σ1²/σ2² increases.
(3) Given fixed B, σ1²/σ2², and N, θRD and θHL increase as r increases.
(4) Given fixed B, σ1²/σ2², and r, θRD and θHL generally increase as N increases.

These same four trends hold for all orderings if "θRD and θHL increase" is replaced by "θRD, θHL, and θLH get closer to 100%". Values of θ greater than 100% indicate that η2* > η1, which is of course not as desirable as η2* = η1, which implies θ = 100.
Comparing across Tables 5.1, 5.2, and 5.3 we see that given fixed B, σ1²/σ2², r and N, θLH ≥ θRD ≥ θHL. This result is not surprising for the following reason. The high-low ordering will have a tendency not to use those G2 subjects with scores above the highest G1 score while the low-high ordering will have a tendency not to use those G2 subjects with scores below the lowest G1 scores. Since we are assuming B > 0 (η1 > η2), the low-high


ordering should yield the most positive x̄2·, followed by the random ordering and then the high-low ordering. When σ1²/σ2² = 1/2 and B ≤ 1/2, θLH can be somewhat greater than 100 (e.g. 113) while 100 ≥ θRD ≥ 94. In all other cases, θLH is closer to 100% than θRD or θHL. In general the results for θRD, θLH, and θHL are quite similar for the conditions considered.
Comparing the results in this section with those in section 4, it is easily checked that if σ1²/σ2² ≤ 1 the three pair-matching methods generally attain more than 85% of min{100, θmax} in Table 4.2, indicating that they can be reasonable methods of matching the means of the samples. However, if σ1²/σ2² = 2, the pair-matching methods often attain less than 70% of the corresponding θMN in Table 4.1, indicating that when σ1²/σ2² > 1 these pair-matching methods do not match the means very well compared to the simple mean-matching method.
Remembering that pair-matching methods implicitly sacrifice closely
matched means for good individual matches, we also calculated a measure
of the quality of the individual matches. These results presented in Appendix
C indicate that, in general, the high-low ordering yields the closest individual
matches followed by the random ordering. This conclusion is consistent with
the intuition to match the most difficult subjects first in order to obtain
close individual matches.

6. ADVICE TO AN INVESTIGATOR

In review, we assume there are two populations, P1 and P2, defined by two levels of a treatment variable. There is a sample, G1, of size N from P1 and a larger sample, G2, of size rN from P2, both of which have recorded scores on the matching variable X. The objective of the study is to estimate τ, the average effect of the treatment variable on a dependent variable Y over the P1 population. We assume that τ̂0 = ȳ1· − ȳ2· will be used to estimate τ, where ȳ1· is the average Y in the G1 sample and ȳ2· is the average Y in an N-size subsample of G2 matched to G1, G2*.
Depending upon the particular study, the investigator may be able,
within limits, to control three "parameters".

(a) N, the size of the smaller initial sample (G1); equivalently, the size of each of the final samples.
(b) r, the ratio of the sizes of the larger initial sample (G2) and the smaller initial sample (G1).
(c) The matching rule used to obtain the G2* sample of size N from the G2 sample of size rN.

Below we present advice for choosing these "parameters" in the order first N, then r, and then the matching method.

(a) Choosing N

We will use a standard method for estimating N (Cochran [1963]) which


assumes that the investigator wants τ̂0 to be within Δ of τ with probability 1 − α: Prob{|τ̂0 − τ| > Δ} = α. Letting s/√N be the standard error of τ̂0, we would choose

N = z²s²/Δ²,   (6.1)

where z is the standard normal deviate corresponding to 1 − α confidence limits (e.g. if α = 0.05, z ≈ 2).⁶ In order to use (6.1) we must have an estimate of the standard error of τ̂0, s/√N.

Suppose that the response surfaces are linear with slopes β1 and β2 and that x̄2· will be exactly matched to x̄1· in the final samples by using one of the matching methods discussed in sections 4 and 5. Thus, E Ec(τ̂0) = τ, and it is easy to show that

s²/N = E Ec(τ̂0 − τ)²
     = E Ec[β2(x̄1· − x̄2·) + (β1 − β2)(x̄1· − η1) + ē1· − ē2·]².   (6.2)

Setting x̄2· = x̄1· and assuming the usual independent error model where Ec(e²ij) = σ²ei, i = 1, 2, (6.2) becomes

s²/N = σ²e1/N + σ²e2/N + (σ1²/N)(β1 − β2)².   (6.3)

Rarely in practice can one estimate the quantities in (6.3). Generally, however, the investigator has some rough estimate of an average variance of Y, say σ̂Y², and of an average correlation between Y and X, say ρ̂. Using these he can approximate (1/N){σ²e1 + σ²e2} by (2/N)σ̂Y²(1 − ρ̂²).
Approximating (σ1²/N)(β1 − β2)² is quite difficult unless one has estimates of β1 and β2. The following rough method may be useful when the response surfaces are at the worst moderately nonparallel. If the response surfaces are parallel, (σ1²/N)(β1 − β2)² is zero and thus minimal. If the response surfaces are at most moderately nonparallel, one could assume (β1 − β2)² ≤ 2β2² in most uses.⁷ Hence, in many practical situations one may find that 0 ≤ (σ1²/N)(β1 − β2)² ≤ 2(σ1²/N)β2², where the upper bound can be approximated by 2ρ̂²σ̂Y²/N. Hence, a simple estimated range for s² is

2σ̂Y²(1 − ρ̂²) ≤ s² ≤ 2σ̂Y².   (6.4)

If the investigator believes that the response surfaces are parallel and linear, the value of s² to be used in (6.1) can be chosen to be near the minimum of this interval. Otherwise, a value of s² nearer the maximum would be appropriate.
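A worked example with invented inputs (not from the paper): suppose the rough estimates are σ̂Y² = 100 and ρ̂ = 0.6, the desired accuracy is Δ = 5, and α = 0.05 (z ≈ 2). Then (6.4) gives

2(100)(1 − 0.36) = 128 ≤ s² ≤ 200,

and (6.1) gives N between about (4)(128)/25 ≈ 20 and (4)(200)/25 = 32; believing the response surfaces nearly parallel and linear, one might take N near the lower end, and otherwise nearer 32.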

(b) Choosing r

First assume that mean-matching is appropriate, i.e. assume an essentially linear response surface in P2, and that the sole objective is to estimate τ.

⁶ Moderate samples (N ≥ 20) are assumed. For small samples N = t²N−1 s²/Δ², where tN−1 is the Student t deviate with N − 1 degrees of freedom corresponding to 1 − α confidence limits.
⁷ A less conservative assumption is (β1 − β2)² ≤ β2².


We will choose r large enough to expect 100% reduction in bias using the simple mean-matching method of section 4.
(1) Estimate γ = B[(1 + σ1²/σ2²)/2]^½ and the approximate shape of the distribution of X in P2. In order to compensate for the decreased ability of the mean-matching method to attain the theoretical maximum reduction in bias in small or moderate samples (see section 4), if N is small or moderate (N ≤ 100) increase γ by 5 to 15% (e.g. 10% for N = 50, 5% for N = 100).
(2) Using Table 3.1 find the row corresponding to the approximate shape of the distribution of X in P2. Now find approximate values of r1 and r∞ such that Ω(r1, 1) ≈ γ and Ω(r∞, ∞) ≈ γ. If N is very small (N < 5), r should be chosen to be close to r1; otherwise, results in Table 3.2 suggest that r can be chosen to be much closer to r∞. r should probably be chosen to be greater than two and in most practical applications will be less than four.
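For example (with illustrative values): if B = 0.6, σ1²/σ2² = 1, and X is roughly normal, then γ = 0.6[(1 + 1)/2]^½ = 0.6, which for N = 50 might be increased by about 10% to 0.66. From the normal entries of Tables 3.1 and 3.2, Ω(2, ∞) = 0.80 already exceeds 0.66, so r∞ ≤ 2, while Ω(2, 1) = 0.56 < 0.66 < 0.85 = Ω(3, 1) puts r1 between 2 and 3; for N = 50, Table 3.2 suggests r = 2 should be adequate and r = 3 would be conservative.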
Now assume pair-matches are desired, i.e. the response surfaces may be nonlinear, nonparallel, and each y1j − y2j may be used to estimate the treatment effect at x1j. We will choose r large enough to expect 95%+ reduction in bias using the random order nearest available pair-matching method of section 5. Perform steps (1) and (2) as above for mean-matching. However, since in section 5 we found that if σ1²/σ2² ≥ 1 nearest available pair-matching did not match the means of the samples very well compared to the simple mean-matching method, r should be increased. The following is a rough estimate (based on Tables 5.1 and 4.1) of the necessary increase:

if σ1²/σ2² = 1/2, r remains unchanged;
if σ1²/σ2² = 1, increase r by about 50%;
if σ1²/σ2² = 2, at least double r.

(c) Choosing a Matching Method

We assume G1 and G2 (i.e. r and N) are fixed and the choice is one of a matching method. If the investigator knows the P2 response surface is linear and wants only to estimate τ, the results in section 4 suggest that he can use the simple mean-matching method described in section 4 and be confident in many practical situations of removing most of the bias whenever r ≥ 2.
If confidence in the linearity of the P2 response surface is lacking and/or
the investigator wants to use each matched pair to estimate the effect of
the treatment variable at a particular value of X, he would want to obtain
close individual matches as well as closely matched means. Results in section 5
indicate that in many practical situations the random order nearest available
pair-matching method can be used to remove a large proportion of the bias
in X while assigning close individual matches. The random order nearest
available pair-matching is extremely easy to perform since the G1 subjects
do not have to be ordered; yet, it does not appear to be inferior to either
high-low or low-high orderings and thus seems to be a reasonable choice in
practice.
If a computer is available, a matching often superior to that obtained
with the simple mean-matching or one random order nearest available


pair-matching may be easily obtained by performing the simple mean-matching and several nearest available pair-matchings (i.e. several random orderings, low-high ordering, high-low ordering) and choosing the "best" matching. There should be no great expense in performing several matchings. Using Fortran IV subroutines given in Appendix B for the simple mean-matching method and nearest available pair-matching methods, a matching of 100 G1 subjects from 400 G2 subjects takes about 1½ seconds on an IBM 360/65.
In order to decide which matching is "best", record for all matched samples d = x̄1· − x̄2· and d2 = (1/N) Σ (x1j − x2j)². Pair-matches (and thus d2) for the mean-matched sample can be found by using a nearest available pair-matching method on the final samples. If several matchings give equally small values of d, choose the matching that gives the smallest value of d2. If d for one matched sample is substantially smaller than for any of the other matched samples but d2 for that sample is quite large, the
investigator must either (1) make a practical judgement as to whether closely
matched means or close individual matches are more important for his
study, or (2) attempt to find matches by a matching method more complex
than the ones considered here.
Admittedly, the practical situations and methods of estimating τ covered above are quite limited. The following article extends this work to include regression (covariance) adjusted estimates of τ and nonlinear parallel response
surfaces. Rubin [1970] includes extensions to the case of many matching
variables. Althauser and Rubin [1970] give a nontechnical discussion of some
problems that arise with many matching variables.

ACKNOWLEDGMENTS

This work was supported by the Office of Naval Research under contract
N00014-67A-0298-0017, NR-042-097 at the Department of Statistics, Harvard
University.
I wish to thank Professor William G. Cochran for many helpful suggestions
and criticisms on earlier drafts of this article. I would also like to thank the
referees for their helpful comments.

MATCHING TO REMOVE BIAS IN OBSERVATIONAL STUDIES

RÉSUMÉ

Several matching methods that match all of one sample from another, larger sample on a continuous matching variable are compared with respect to their ability to remove the bias of the matching variable. One of the methods is a mean-matching method and three are nearest-pair matching methods. The abilities of the methods to remove the bias are also compared with the theoretical maximum given fixed distributions and fixed sample sizes. A summary to aid the investigator is included.


REFERENCES

Althauser, R. P. and Rubin, D. B. [1970]. The computerized construction of a matched sample. Amer. J. Soc. 76, 325-46.
Belsen, W. A. [1956]. A technique for studying the effects of a television broadcast. Appl.
Statist. 5, 195-202.
Billewicz, W. Z. [1965]. The efficiency of matched samples: an empirical investigation.
Biometrics 21, 623-43.
Chapin, F. S. [1947]. Experimental Designs in Sociological Research. Harper and Brothers,
New York.
Cochran, W. G. [1953]. Matching in analytical studies. Amer. J. Pub. Health 43, 684-91.
Cochran, W. G. [1963]. Sampling Techniques. Wiley, New York.
Cochran, W. G. [1968]. The effectiveness of adjustment by subclassification in removing
bias in observational studies. Biometrics 24, 295-313.
Cochran, W. G. [1969]. The use of covariance in observational studies. Appl. Statist. 18,
270-5.
Cox, D. R. [1957]. The use of a concomitant variable in selecting an experimental design. Biometrika 44, 150-8.
Greenberg, B. G. [1953]. The use of covariance and balancing in analytical surveys. Amer.
J. Pub. Health 43, 692-9.
Greenwood, E. [1945]. Experimental Sociology: A Study in Method. King's Crown Press, New York.
Harter, H. L. [1960]. Expected values of normal order statistics. Aeronautical Research
Laboratories Technical Report, 60-292.
Kihlberg, J. K. and Robinson, S. J. [1968]. Seat belt use and injury patterns in automobile
accidents. Cornell Aeronautical Laboratory Report No. VJ-1823-R30.
Peters, C. C. and Van Voorhis, W. R. [1940]. Statistical Procedures and Their Mathematical
Bases. McGraw-Hill, New York.
Rubin, D. B. [1970]. The use of matched sampling and regression adjustment in observational studies. Statistics Department Report CP-4, Harvard University.
Sarhan, A. E. and Greenberg, B. G. [1962]. Contributions to Order Statistics. Wiley, New
York.
Yinger, J., Milton, I. K., and Laycock, F. [1967]. Treating matching as a variable in a
sociological experiment. Amer. Soc. Review 32, 801-12.
Wilks, S. S. [1932]. On the distributions of statistics in samples from a normal population
of two variables with matched sampling of one variable. Metron 9, 87-126.

APPENDIX A: PROOFS OF TRENDS (3) AND (4) IN SECTION 3

We prove the intuitively obvious trend (3) by considering a random sample of size (a + r)N from f. Call the order statistics x(1), ..., x(N), ..., x((a+r)N), where x(1) is the largest observation. The average of the N largest observations from these (a + r)N is (1/N)[x(1) + ··· + x(N)]. By randomly discarding aN of the original observations, we have a random sample of size rN from f. But in any such subset the average of the N largest observations is less than or equal to (1/N)[x(1) + ··· + x(N)]. Averaging over repeated random samples we have that Ω(r, N) ≤ Ω(r + a, N), N, rN, aN positive integers.
We prove trend (4) by a similar but more involved argument. Consider a random sample of size r(N + b) and let x(1), ..., x(N), ..., x(N+b), ..., x(r(N+b)) be the order statistics, where x(1) is the largest observation. The average of the N + b largest from these r(N + b) is (1/(N + b))[x(1) + ··· + x(N+b)]. Choosing a random rN-size subset of these observations, we have that the expected value of the average of the N largest from such a subset is

(1/(nN)) Σ over S (total of the largest N observations from S),

where

S = the collection of all distinct rN-size subsets of the original r(N + b) observations,
n = C(r(N + b), rN) = the number of elements of S.

This expression can be rewritten as

(1/(nN)) Σ{i = 1, ..., r(N+b)} λi x(i),

where λi = the number of elements of S in which x(i) is one of the N largest observations, Σ λi = Nn.

For i = 1, ..., N, λi = the number of subsets in which x(i) occurs = m = C(r(N + b) − 1, rN − 1). For all i > N, λi ≤ m. Consider the above summation as a weighted sum of the x(i) where the weights λi/(nN) are nonnegative and add to one (Σ λi/(nN) = 1). Increasing the weights on the largest x(i) while keeping the sum of the weights the same cannot decrease the total value of the sum. Thus,

(1/(nN)) Σ{i = 1, ..., r(N+b)} λi x(i)
    ≤ (1/(nN)) { m [x(1) + ··· + x(N+b−1)] + [nN − m(N + b − 1)] x(N+b) }
    ≤ (m/(nN)) [x(1) + ··· + x(N+b)]
    = (1/(N + b)) [x(1) + ··· + x(N+b)],

since m/(nN) = 1/(N + b).

Hence, the expected average of the top N from a random rN-size subset is less than or equal to the average of the top N + b from the original r(N + b); thus averaging over repeated random samples we have

Ω(r, N) ≤ Ω(r, N + b),   N, rN, r(N + b) positive integers.

APPENDIX B

FORTRAN SUBROUTINES FOR NEAREST AVAILABLE PAIR MATCHING AND SIMPLE MEAN
MATCHING

Notation used in the subroutines

N1 = N = size of G1


N2 = rN = size of initial G2
X1 = vector of length N giving matching variable scores for G1 sample, i.e. 1st entry is first G1 subject's score
X2 = vector of length rN giving scores for G2 on matching variable
AV1 = x̄1·
D = x̄1· − x̄2·; output for matched samples
D2 = (1/N) Σ (x1j − x2j)²; output for matched samples
IG1 = vector giving ordering of G1 sample for nearest available matching (a permutation of 1, ..., N1)
IG2 = "current" ordering of G2 sample. After each call to a matching subroutine, the G1 subject having subject number IG1(K) is matched to the G2 subject having subject number IG2(K), K = 1, ..., N1. Subject number, of course, refers to order in vectors X1 and X2. Before the first call to a matching subroutine one should set IG2(K) = K, K = 1, ..., N2. After this initialization IG2 should be considered output of the matching routines.

      SUBROUTINE NAMTCH(D,D2,IG2,IG1,N1,N2,X1,X2)
C     SUBROUTINE TO PERFORM NEAREST AVAILABLE MATCHING
C     NECESSARY INPUTS ARE IG2, IG1, N1, N2, X1, X2
      DIMENSION IG1(1),IG2(1),X1(1),X2(1)
      D=0.
      D2=0.
      DO 200 I=1,N1
      K=IG1(I)
  200 CALL MATCH(D,D2,IG2,X1(K),I,N2,X2)
      D=D/FLOAT(N1)
      D2=D2/FLOAT(N1)
      RETURN
      END

      SUBROUTINE MNMTCH(D,IG2,N1,N2,X2,AV1)
C     SUBROUTINE TO PERFORM SIMPLE MEAN MATCHING TO AV1
C     NECESSARY INPUTS ARE IG2, N1, N2, X2, AV1
      DIMENSION IG2(1),X2(1)
      D=0.
      D2=0.
      DO 200 I=1,N1
C     XX IS THE TARGET SCORE PASSED TO MATCH FOR THE NEXT SUBJECT
      XX=AV1+D
  200 CALL MATCH(D,D2,IG2,XX,I,N2,X2)
      D=D/FLOAT(N1)
      RETURN
      END

      SUBROUTINE MATCH(D,D2,IG2,X1,K1,N2,X2)
C     SUBROUTINE PICKS G2 SUBJECT BETWEEN (INCLUSIVE) K1 AND N2 IN LIST
C     IG2 WHO HAS SCORE (IN X2) CLOSEST TO VALUE X1
C     HIS SUBJECT NUMBER IS PUT IN IG2(K1) AND PREVIOUS ENTRY IN IG2(K1)
C     IS MOVED BEYOND K1 ENTRY
      DIMENSION X2(1),IG2(1)
      LL=IG2(N2)
      IF (K1 .EQ. N2) GO TO 410
      DMIN=ABS(X1-X2(LL))
      K2=N2-K1
C     SCAN AVAILABLE SUBJECTS FROM POSITION N2 DOWN TO K1, KEEPING THE
C     CLOSEST SO FAR IN LL AND SHIFTING THE OTHERS UP ONE POSITION
      DO 400 LK=1,K2
      K=N2-LK
      L=IG2(K)
      IF (ABS(X1-X2(L)) .LT. DMIN) GO TO 300
      IG2(K+1)=L
      GO TO 400
  300 IG2(K+1)=LL
      LL=L
      DMIN=ABS(X1-X2(LL))
  400 CONTINUE
  410 CONTINUE
      IG2(K1)=LL
      D=D+X1-X2(LL)
      D2=D2+(X1-X2(LL))**2
      RETURN
      END
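The following short driver is a usage sketch added in this transcription (it is not part of the original appendix; the sample sizes and scores are invented). It illustrates the initialization IG2(K) = K described above and a call to NAMTCH in a given (here arbitrary) G1 order.

C     EXAMPLE DRIVER (SKETCH ADDED IN THIS TRANSCRIPTION; DATA INVENTED)
C     MATCHES N1 = 3 G1 SUBJECTS FROM N2 = 6 G2 SUBJECTS
      DIMENSION X1(3), X2(6), IG1(3), IG2(6)
      DATA X1 /1.2, 0.4, 2.0/
      DATA X2 /0.1, 0.9, 1.5, 0.3, 2.2, 1.1/
      DATA IG1 /1, 2, 3/
      N1 = 3
      N2 = 6
C     INITIALIZE IG2(K) = K BEFORE THE FIRST CALL TO A MATCHING ROUTINE
      DO 100 K = 1, N2
  100 IG2(K) = K
      CALL NAMTCH(D, D2, IG2, IG1, N1, N2, X1, X2)
C     AFTER THE CALL, G1 SUBJECT IG1(K) IS MATCHED TO G2 SUBJECT IG2(K)
      WRITE (6, 900) D, D2, (IG2(K), K = 1, N1)
  900 FORMAT (' D=', F8.4, '  D2=', F8.4, '  MATCHES:', 3I4)
      STOP
      END

MNMTCH is called analogously, e.g. CALL MNMTCH(D, IG2, N1, N2, X2, AV1) with AV1 set to the mean of the X1 scores, after the same initialization of IG2.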


APPENDIX C
THE QUALITY OF INDIVIDUAL MATCHES

Monte Carlo values of a measure of the quality of the individual matches for the three orderings of nearest available pair-matching, with X normal; the measure is scaled so that it equals 0 if the samples are perfectly matched and 100 if randomly matched from random samples. [Values illegible in this copy.]

Received January 1971, Revised July 1972


Key Words: Matching; Matched sampling; Observational studies; Quasi-experimental
studies; Controlling bias; Removing bias; Blocking.
