
Appendix A: Probability Concepts Used in Sampling

I recollect nothing that passed that day, except Johnson’s quickness, who, when Dr. Beattie observed,
as something remarkable which had happened to him, that he had chanced to see both No. 1, and
No. 1000, of the hackney-coaches, the first and the last; “Why, Sir, (said Johnson,) there is an equal
chance for one’s seeing those two numbers as any other two." He was clearly right; yet the seeing of
the two extremes, each of which is in some degree more conspicuous than the rest, could not but strike
one in a stronger manner than the sight of any other two numbers.’’

—James Boswell, The Life of Samuel Johnson

The essence of probability sampling is that we can calculate the probability with which
any subset of observations in the population will be selected as the sample. Most of
the randomization theory results used in this book depend on probability concepts
for their proof. In this appendix we present a brief review of some of the basic ideas
used. The reader should consult a more comprehensive reference on probability, such
as Ross (2006) or Durrett (1994), for more detail and for derivations and proofs.
Because all work in randomization theory concerns discrete random variables, only
results for discrete random variables are given in this section. We use the results
in Sections A.1–A.3 in Chapters 2–4, and the results in Sections A.3–A.4 in Chapters 5
and 6.

Copyright © 2019. CRC Press LLC. All rights reserved.

A.1 Probability
Consider performing an experiment in which you can write out all of the outcomes
that could possibly happen, but you do not know exactly which one of those outcomes
will occur. You might flip a coin, or draw a card from a deck, or pick three names out
of a hat containing 20 names. Probabilities are assigned to the different outcomes and
to sets composed of outcomes (called events), in accordance with the likelihood that
the events will occur. Let Ω be the sample space, the list of all possible outcomes. For
flipping a coin, Ω = {heads, tails}. Probabilities in finite sample spaces have three
basic properties:

1 P(Ω) = 1.
2 For any event A, 0 ≤ P(A) ≤ 1.
3 If the events A_1, . . . , A_k are disjoint, then P(A_1 ∪ · · · ∪ A_k) = P(A_1) + · · · + P(A_k).

Lohr, Sharon L.. Sampling : Design and Analysis, CRC Press LLC, 2019. ProQuest Ebook Central, http://ebookcentral.proquest.com/lib/pitt-ebooks/detail.action?docID=5748873.

In sampling, we have a population of N units and use a probability sampling
scheme to select n of those units. We can think of those N units as balls in a box,
labelled 1 through N, and we draw n balls from the box. For illustration, suppose
N = 5 and n = 2. Then we draw two labelled balls out of the box:

[Figure: a box containing five balls labelled 1 through 5.]

If we take a simple random sample (SRS) of one ball, each ball has an equal probability
1/N of being chosen as the sample.

A.1.1 Simple Random Sampling with Replacement


In a simple random sample with replacement (SRSWR), we put a ball back after it is
chosen, so the same population is used on successive draws from the population. For
the box with N = 5, there are 25 possible samples (a, b) in Ω, where a represents the
first ball chosen and b represents the second ball chosen:

(1, 1) (2, 1) (3, 1) (4, 1) (5, 1)


(1, 2) (2, 2) (3, 2) (4, 2) (5, 2)
(1, 3) (2, 3) (3, 3) (4, 3) (5, 3)
(1, 4) (2, 4) (3, 4) (4, 4) (5, 4)
(1, 5) (2, 5) (3, 5) (4, 5) (5, 5)

Since we are taking a random sample, each of the possible samples has the same
probability, 1/25, of being the one chosen. When we take a sample, though, we usually
do not care whether we chose unit 4 first and unit 5 second, or the other way around.
Instead, we are interested in the probability that our sample consists of units 4 and 5
in either order, which we write as S = {4, 5}. By the third property in the definition
of a probability,
P({4, 5}) = P[(4, 5) ∪ (5, 4)] = P[(4, 5)] + P[(5, 4)] = 2/25.

Suppose we want to find P(unit 2 is in the sample). We can either count that nine
of the outcomes above contain 2, so the probability is 9/25, or we can use the addition
formula:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B). (A.1)
Here, let A = {unit 2 is chosen on the first draw} and let B = {unit 2 is chosen on the
second draw}. Then,
P(unit 2 is in the sample) = P(A) + P(B) − P(A ∩ B) = 1/5 + 1/5 − 1/25 = 9/25.
Note that, for this example,
P(A ∩ B) = P(A) × P(B).
That occurs in this situation because events A and B are independent, that is, whatever
happens on the first draw has no effect on the probabilities of what will happen on the
second draw. Independence of the draws occurs in finite population sampling when
we sample with replacement.
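These with-replacement calculations can be verified by enumerating all 25 ordered samples; a short sketch in Python (the variable names below are ours, not the book's):

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5]

# All 25 ordered with-replacement samples (a, b), each with probability 1/25.
samples = list(product(population, repeat=2))
assert len(samples) == 25

p = Fraction(1, len(samples))

# P(unit 2 is in the sample): count the samples containing unit 2.
p_2_in_sample = sum(p for s in samples if 2 in s)
assert p_2_in_sample == Fraction(9, 25)

# Independence of draws: P(A ∩ B) = P(A) P(B) for
# A = {unit 2 on first draw} and B = {unit 2 on second draw}.
p_A = sum(p for s in samples if s[0] == 2)     # 1/5
p_B = sum(p for s in samples if s[1] == 2)     # 1/5
p_AB = sum(p for s in samples if s == (2, 2))  # 1/25
assert p_AB == p_A * p_B
```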

A.1.2 Simple Random Sampling without Replacement


Most of the time, we sample without replacement because it is more efficient—if
Heather is already in the sample, why should we use resources by sampling her again?
If we plan to take an SRS (recall that SRS refers to a simple random sample without
replacement) of size n = 2 from our population of N = 5 balls, the ten possible samples
(ignoring the ordering) are

{1, 2} {1, 3} {1, 4} {1, 5} {2, 3}


{2, 4} {2, 5} {3, 4} {3, 5} {4, 5}

Since there are ten possible samples and we are sampling with equal probabilities,
the probability that a given sample will be chosen is 1/10.
In general, there are

(N choose n) = N!/[n!(N − n)!]     (A.2)

possible samples of size n that can be drawn without replacement and with equal
probabilities from a population of size N, where

k! = k(k − 1)(k − 2) · · · 1 and 0! = 1.

For our example, there are

(5 choose 2) = 5!/[2!(5 − 2)!] = (5 × 4 × 3 × 2 × 1)/[(2 × 1)(3 × 2 × 1)] = 10

possible samples of size 2, as we found when we listed them.
Note that in sampling without replacement, successive draws are not independent.
For this example,
P(2 chosen on first draw, 4 chosen on second draw) = 1/20.

But P(2 chosen on first draw) = 1/5, and P(4 chosen on second draw) = 1/5, so
P(2 chosen on first draw, 4 chosen on second draw) ≠ P(2 chosen on first draw) ×
P(4 chosen on second draw).
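The lack of independence can likewise be checked by listing all 20 ordered without-replacement samples; a brief sketch under the same setup (names are ours):

```python
from fractions import Fraction
from itertools import permutations

population = [1, 2, 3, 4, 5]

# All 20 ordered without-replacement samples, each with probability 1/20.
ordered = list(permutations(population, 2))
assert len(ordered) == 20

p = Fraction(1, len(ordered))
p_2_then_4 = sum(p for s in ordered if s == (2, 4))
assert p_2_then_4 == Fraction(1, 20)

# The draws are dependent: 1/20 differs from (1/5)(1/5) = 1/25.
p_2_first = sum(p for s in ordered if s[0] == 2)   # 1/5
p_4_second = sum(p for s in ordered if s[1] == 4)  # 1/5
assert p_2_then_4 != p_2_first * p_4_second
```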

EXAMPLE A.1 Players of the Arizona State Lottery game “Fantasy 5” choose 5 numbers without
replacement from the numbers 1 through 35. If the 5 numbers you choose match the 5
official winning numbers, you win $50,000. What is the probability you win $50,000?
You could select a total of

(35 choose 5) = 35!/(5! 30!) = 324,632

possible sets of 5 numbers. But only

(5 choose 5) = 1

of those sets will match the official winning numbers, so your probability of winning
$50,000 is 1/324,632.
Cash prizes are also given if you match three or four of the numbers. To match
four, you must select four numbers out of the set of five winning numbers, and the
remaining number out of the set of 30 non-winning numbers, so the probability is

P(match exactly 4 balls) = [(5 choose 4) × (30 choose 1)]/(35 choose 5) = 150/324,632. ■
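Both counts follow from the binomial coefficient in (A.2); Python's math.comb reproduces them exactly (a quick check, not part of the text):

```python
from math import comb

# Total number of possible selections of 5 numbers from 35.
total = comb(35, 5)
assert total == 324632

# Probability of matching all 5 winning numbers.
p_win = 1 / total

# P(match exactly 4): 4 of the 5 winners and 1 of the 30 non-winners.
p_match4 = comb(5, 4) * comb(30, 1) / total
assert comb(5, 4) * comb(30, 1) == 150
```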

EXERCISE A.1 What is the probability you match exactly 3 of the numbers? That you match at least
one of the numbers? ■

EXERCISE A.2 Calculating the sampling distribution in Example 2.4


A box has eight balls; three of the balls contain the number 7. You select an SRS
(without replacement) of size 4. What is the probability that your sample contains no
7s? Exactly one 7? Exactly two 7s? ■

A.2 Random Variables and Expected Value
A random variable is a function that assigns a number to each outcome in the sample
space. Which number the random variable will actually assume is only determined
after we conduct the experiment and depends on a random process: Before we conduct
the experiment, we only know probabilities with which the different outcomes can
occur. The set of possible values of a random variable, along with the probability
with which each value occurs, is called the probability distribution of the random
variable. Random variables are denoted by capital letters in this book to distinguish

them from the fixed values yi . If X is a random variable, then P(X = x) is the
probability that the random variable X takes on the value x. The quantity x is sometimes
called a realization of the random variable X; x is one of the values that could occur
if we performed the experiment.

EXAMPLE A.2 In the game “Fantasy 5,” let X be the amount of money you will win from your
selection of numbers. You win $50,000 if you match all 5 winning numbers, $500
if you match 4, $5 if you match 3, and nothing if you match fewer than 3. Then the
probability distribution of X is given in the following table:

x           0                 5               500             50,000
P(X = x)    320,131/324,632   4350/324,632    150/324,632     1/324,632 ■

If you played “Fantasy 5” many, many times, what would you expect your average
winnings per game to be? The answer is the expected value of X, defined by

E(X) = EX = Σ_x x P(X = x).     (A.3)

For “Fantasy 5,”

E(X) = 0 × (320,131/324,632) + 5 × (4350/324,632) + 500 × (150/324,632)
     + 50,000 × (1/324,632) = 146,750/324,632 ≈ 0.45.
Think of a box containing 324,632 balls, in which 1 ball contains the number 50,000,
150 balls contain the number 500, 4350 balls contain the number 5, and the remaining
320,131 balls contain the number 0. The expected value is simply the average of the
numbers written inside all the balls in the box. One way to think about expected
value is to imagine repeating the experiment over and over again and calculating the
long-run average of the results. If you play “Fantasy 5” many, many times, you would
expect to win about 45 cents per game, even though 45 cents is not one of the possible
realizations of X.
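The expectation in (A.3) is a one-line sum over the distribution; a sketch using exact fractions (note that the numerator works out to 146,750, consistent with the value of about 0.45; the dictionary name is ours):

```python
from fractions import Fraction

# Probability distribution of winnings X in "Fantasy 5".
dist = {
    0: Fraction(320131, 324632),
    5: Fraction(4350, 324632),
    500: Fraction(150, 324632),
    50000: Fraction(1, 324632),
}
assert sum(dist.values()) == 1  # probabilities sum to one

# E(X) = sum over x of x * P(X = x), as in (A.3).
EX = sum(x * p for x, p in dist.items())
assert EX == Fraction(146750, 324632)
assert abs(float(EX) - 0.45) < 0.01  # about 45 cents per game
```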
Variance, covariance, and the coefficient of variation are defined directly in terms
of the expected value:
V(X) = E[(X − EX)²] = Cov(X, X)     (A.4)

Cov(X, Y) = E[(X − EX)(Y − EY)]     (A.5)

Corr(X, Y) = Cov(X, Y)/√[V(X) V(Y)]     (A.6)

CV(X) = √V(X)/E(X), for E(X) ≠ 0.     (A.7)

Expected value and variance have a number of properties that follow directly from
the definitions above.

Properties of Expected Value

1 If g is a function, then E[g(X)] = Σ_x g(x) P(X = x).
2 If a and b are constants, then E(aX + b) = aE(X) + b.
3 If X and Y are independent, then E(XY) = (EX)(EY).
4 Cov(X, Y) = E(XY) − (EX)(EY).
5 Cov(Σ_{i=1}^n (a_i X_i + b_i), Σ_{j=1}^m (c_j Y_j + d_j)) = Σ_{i=1}^n Σ_{j=1}^m a_i c_j Cov(X_i, Y_j).
6 V(X) = E(X²) − (EX)².
7 V(X + Y) = V(X) + V(Y) + 2 Cov(X, Y).
8 −1 ≤ Corr(X, Y) ≤ 1.

EXERCISE A.3 Prove properties 1 through 8 using the definitions in (A.3) through (A.7). ■

In sampling, we often use estimators that are ratios of two random variables. But
E[Y /X] usually does not equal EY /EX. To illustrate this, consider the following
probability distribution for X and Y :

x    y    y/x    P(X = x, Y = y)
1    2    2      1/4
2    8    4      1/4
3    6    2      1/4
4    8    2      1/4

Then EY/EX = 6/2.5 = 2.4, but E[Y/X] = 2.5. In this example, the values are close
but not equal.
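The gap between E[Y/X] and EY/EX can be confirmed with exact arithmetic; a short sketch (the pairs list transcribes the table above, and the names are ours):

```python
from fractions import Fraction

# Joint distribution from the table: (x, y) pairs, each with probability 1/4.
pairs = [(1, 2), (2, 8), (3, 6), (4, 8)]
p = Fraction(1, 4)

EX = sum(Fraction(x) * p for x, y in pairs)       # 2.5
EY = sum(Fraction(y) * p for x, y in pairs)       # 6
E_ratio = sum(Fraction(y, x) * p for x, y in pairs)

assert EY / EX == Fraction(24, 10)  # 2.4
assert E_ratio == Fraction(5, 2)    # 2.5
assert E_ratio != EY / EX           # E[Y/X] is not EY/EX
```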
The random variable we use most frequently in this book is

Z_i = 1 if unit i is in the sample, and Z_i = 0 if unit i is not in the sample.     (A.8)
This indicator variable tells us whether the ith unit is in the sample or not. In an SRS,
n of the random variables Z1 , Z2 , . . . , ZN will take on the value 1, and the remaining
N − n will be 0. For Zi to equal 1, one of the units in the sample must be unit i, and

the other n − 1 units must come from the remaining N − 1 units in the population, so

P(Z_i = 1) = P(ith unit is in the sample)
           = [(1 choose 1) × (N − 1 choose n − 1)]/(N choose n)
           = n/N.     (A.9)

Thus,

E[Z_i] = 0 × P(Z_i = 0) + 1 × P(Z_i = 1) = P(Z_i = 1) = n/N.

Similarly, for i ≠ j,

P(Z_i Z_j = 1) = P(Z_i = 1 and Z_j = 1)
              = P(ith unit is in the sample and jth unit is in the sample)
              = [(2 choose 2) × (N − 2 choose n − 2)]/(N choose n)
              = n(n − 1)/[N(N − 1)].

Thus for i ≠ j,

E[Z_i Z_j] = 0 × P(Z_i Z_j = 0) + 1 × P(Z_i Z_j = 1) = P(Z_i Z_j = 1) = n(n − 1)/[N(N − 1)].

EXERCISE A.4 Show that

V(Z_i) = Cov(Z_i, Z_i) = n(N − n)/N²

and that, for i ≠ j,

Cov(Z_i, Z_j) = −n(N − n)/[N²(N − 1)]. ■

The properties of expectation and covariance may be used to prove many results
in finite population sampling. In Chapter 4, we use the covariance of x̄ and ȳ from an
SRS. Let

x̄_U = (1/N) Σ_{i=1}^N x_i,    ȳ_U = (1/N) Σ_{j=1}^N y_j,

x̄ = (1/n) Σ_{i=1}^N Z_i x_i,    ȳ = (1/n) Σ_{j=1}^N Z_j y_j,

and

R = [Σ_{i=1}^N (x_i − x̄_U)(y_i − ȳ_U)]/[(N − 1) S_x S_y].

Then,

Cov(x̄, ȳ) = (1 − n/N) R S_x S_y / n.     (A.10)
We use properties 5 and 6 of expected value, along with the results of
Exercise A.4, to show (A.10):
Cov(x̄, ȳ) = (1/n²) Cov(Σ_{i=1}^N Z_i x_i, Σ_{j=1}^N Z_j y_j)

= (1/n²) Σ_{i=1}^N Σ_{j=1}^N x_i y_j Cov(Z_i, Z_j)

= (1/n²) Σ_{i=1}^N x_i y_i V(Z_i) + (1/n²) Σ_{i=1}^N Σ_{j≠i} x_i y_j Cov(Z_i, Z_j)

= (1/n) [(N − n)/N²] Σ_{i=1}^N x_i y_i − (1/n) [(N − n)/(N²(N − 1))] Σ_{i=1}^N Σ_{j≠i} x_i y_j

= (1/n) [(N − n)/N² + (N − n)/(N²(N − 1))] Σ_{i=1}^N x_i y_i − (1/n) [(N − n)/(N²(N − 1))] Σ_{i=1}^N Σ_{j=1}^N x_i y_j

= (1/n) [(N − n)/(N(N − 1))] Σ_{i=1}^N x_i y_i − (1/n) [(N − n)/(N − 1)] x̄_U ȳ_U

= (1/n) [(N − n)/(N(N − 1))] Σ_{i=1}^N (x_i − x̄_U)(y_i − ȳ_U)

= (1/n) (1 − n/N) R S_x S_y.
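Identity (A.10) can be spot-checked by brute force: enumerate every SRS of size n from a tiny population, compute the exact sampling covariance of (x̄, ȳ), and compare it with the formula. The population values below are made-up illustration data, not from the text:

```python
from fractions import Fraction
from itertools import combinations

# Made-up population values, purely for illustration.
xs = [1, 2, 4, 4, 7]
ys = [3, 5, 4, 9, 8]
N, n = 5, 3

xbarU = Fraction(sum(xs), N)
ybarU = Fraction(sum(ys), N)
# R * Sx * Sy equals the corrected cross-product sum divided by N - 1.
RSxSy = sum((x - xbarU) * (y - ybarU) for x, y in zip(xs, ys)) / (N - 1)

# Exact sampling distribution of (xbar, ybar) over all SRSs of size n.
samples = list(combinations(range(N), n))
p = Fraction(1, len(samples))
stats = [(Fraction(sum(xs[i] for i in s), n),
          Fraction(sum(ys[i] for i in s), n)) for s in samples]
Ex = sum(xb * p for xb, yb in stats)
Ey = sum(yb * p for xb, yb in stats)
cov = sum((xb - Ex) * (yb - Ey) * p for xb, yb in stats)

assert cov == (1 - Fraction(n, N)) * RSxSy / n  # formula (A.10)
```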

EXERCISE A.5 Show that

Corr(x̄, ȳ) = R.     (A.11) ■

A.3 Conditional Probability
In sampling without replacement, successive draws from the population are dependent:
The unit we choose on the first draw changes the probabilities of selecting the
other units on subsequent draws. When taking an SRS from our box of five balls in

Section A.1, each ball has probability 1/5 of being chosen on the first draw. If we
choose ball 2 on the first draw and sample without replacement, then

P(select ball 3 on second draw | select ball 2 on first draw) = 1/4.

(Read as “the conditional probability that ball 3 is selected on the second draw given
that ball 2 is selected on the first draw equals 1/4.”) Conditional probability allows us
to adjust the probability of an event if we know that a related event occurred.
The conditional probability of A given B is defined to be

P(A | B) = P(A ∩ B)/P(B).     (A.12)
In sampling we usually use this definition the other way around:

P(A ∩ B) = P(A | B)P(B). (A.13)

If events A and B are independent—that is, knowing whether A occurred gives us
absolutely no information about whether B occurred—then P(A | B) = P(A) and
P(B | A) = P(B).
Suppose we have a population with 8 households (HHs) and 15 persons living in
the households, as follows:

Household Persons

1 1, 2, 3
2 4
3 5
4 6, 7
5 8
6 9, 10
7 11, 12, 13, 14
8 15

In a one-stage cluster sample, as discussed in Chapter 5, we might take an SRS
of two households, then interview each person in the selected households. Then,

P(select person 10) = P(select HH 6) P(select person 10 | select HH 6) = (2/8)(2/2) = 2/8.

In fact, for this example the probability that any individual in the population is inter-
viewed is the same value, 2/8, because each household is equally likely to be chosen
and the probability a person is selected is the same as the probability that the household
is selected.
Suppose now that we take a two-stage cluster sample instead of a one-stage cluster
sample, and we interview only one randomly selected person in each selected household.
Then, in this example, we are more likely to interview persons living alone than

those living with others:

P(select person 4) = P(select HH 2) P(select person 4 | select HH 2) = (2/8)(1/1) = 2/8,

but

P(select person 12) = P(select HH 7) P(select person 12 | select HH 7) = (2/8)(1/4) = 2/32.
These calculations extend to multistage cluster sampling because of the general
result
P(A_1 ∩ A_2 ∩ · · · ∩ A_k) = P(A_1 | A_2, . . . , A_k) P(A_2 | A_3, . . . , A_k) · · · P(A_k).     (A.14)
Suppose we take a three-stage cluster sample of grade school students. First, we take
an SRS of schools, then an SRS of classes within schools, then an SRS of students
within classes. Then the event {Joe is selected in the sample} is the same as {Joe’s
school is selected ∩ Joe’s class is selected ∩ Joe is selected} and we can find Joe’s
probability of inclusion by
P(Joe in sample) = P(Joe’s school is selected)
× P(Joe’s class is selected | Joe’s school is selected)
× P(Joe is selected | Joe’s school and class are selected).
If we sample 10% of the schools, 20% of classes within selected schools, and 50%
of students within selected classes, then
P(Joe in sample) = (0.10)(0.20)(0.50) = 0.01.
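The household inclusion probabilities above follow the same multiplication rule; a small sketch (the `households` table transcribes the text, and the function name is ours):

```python
from fractions import Fraction

# The 8-household, 15-person population from the text.
households = {1: [1, 2, 3], 2: [4], 3: [5], 4: [6, 7],
              5: [8], 6: [9, 10], 7: [11, 12, 13, 14], 8: [15]}

# Two-stage design: SRS of 2 of the 8 households, then 1 person per
# selected household, so P(person) = (2/8) * (1/household size).
def p_person(k):
    hh_size = next(len(v) for v in households.values() if k in v)
    return Fraction(2, 8) * Fraction(1, hh_size)

assert p_person(4) == Fraction(2, 8)    # person living alone
assert p_person(12) == Fraction(2, 32)  # person in a 4-person household
```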

A.4 Conditional Expectation
Conditional expectation is used extensively in the theory of cluster sampling. Let X
and Y be random variables. Then, using the definition of conditional probability,

P(Y = y | X = x) = P(Y = y ∩ X = x)/P(X = x).     (A.15)

This gives the conditional distribution of Y given that X = x. The conditional
expectation of Y given that X = x simply follows the definition of expectation using
the conditional distribution:

E(Y | X = x) = Σ_y y P(Y = y | X = x).     (A.16)

The conditional variance of Y given that X = x is defined similarly:

V(Y | X = x) = Σ_y [y − E(Y | X = x)]² P(Y = y | X = x).     (A.17)

EXAMPLE A.3 Consider a box with two balls, A and B:

[Figure: ball A contains the numbers 1, 3, 4, and 4; ball B contains the numbers 2 and 6.]

Choose one of the balls at random, then randomly select one of the numbers inside
that ball. Let Y = the number that we choose and let

Z = 1 if we choose ball A, and Z = 0 if we choose ball B.
Then,

P(Y = 1 | Z = 1) = 1/4,
P(Y = 3 | Z = 1) = 1/4,
P(Y = 4 | Z = 1) = 1/2,

and

E(Y | Z = 1) = 1 × (1/4) + 3 × (1/4) + 4 × (1/2) = 3.
Similarly,

P(Y = 2 | Z = 0) = 1/2 and P(Y = 6 | Z = 0) = 1/2,

so

E(Y | Z = 0) = 2 × (1/2) + 6 × (1/2) = 4.
In short, if we know that ball A is picked, then the conditional expectation of Y is
the average of numbers in ball A since an SRS of size 1 is taken from the ball; the
conditional expectation of Y given that ball B is picked is the average of the numbers
in ball B. ■
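Since the second stage is an SRS of size 1, E(Y | Z = z) is just the average of the numbers in the chosen ball, which a few lines of Python can confirm (the variable and function names are ours):

```python
from fractions import Fraction

# Ball A holds {1, 3, 4, 4} and ball B holds {2, 6}; the dictionary key
# is Z (1 = ball A, 0 = ball B).
balls = {1: [1, 3, 4, 4], 0: [2, 6]}

def cond_expectation(z):
    # An SRS of size 1 from the ball makes each number equally likely,
    # so E(Y | Z = z) is the ball's average.
    nums = balls[z]
    return Fraction(sum(nums), len(nums))

assert cond_expectation(1) == 3  # ball A
assert cond_expectation(0) == 4  # ball B
```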

Note that E(Y | X = x) is a function of x; call it g(x). Define the conditional
expectation of Y given X, E(Y | X), to be g(X), the same function but of the random
variable instead. E(Y | X) is a random variable and gives us the conditional expected
value of Y for the general random variable X: for each possible value of x, the value
E(Y | X = x) occurs with probability P(X = x).

EXAMPLE A.4 In Example A.3, we know the probability distribution of Z and can thus use the
conditional expectations calculated to write the probability distribution of E(Y | Z):

z    E(Y | Z = z)    Probability
0    4               1/2
1    3               1/2 ■

In sampling, we need this general concept of conditional expectation largely so
we can use the following properties of conditional expectation to find expected values
and variances in cluster samples.

Properties of Conditional Expectation

1 E(X | X) = X.
2 E[f(X) Y | X] = f(X) E(Y | X).
3 If X and Y are independent, then E(Y | X) = E(Y).
4 E(Y) = E[E(Y | X)].
5 V[Y] = V[E(Y | X)] + E[V(Y | X)].

Conditional expectation can be confusing, so let’s talk about what these properties
mean. The interested reader should see Ross (2006) or Durrett (1994) for proofs of
these properties.

1 E(X | X) = X. If we know what X is already, then we expect X to be X. The
probability distribution of E(X | X) is the same as the probability distribution of X.
2 E[f(X) Y | X] = f(X) E(Y | X). If we know what X is, then we know X², or log X,
or any function f(X) of X.
3 If X and Y are independent, then E(Y | X) = E(Y). If X and Y are independent,
then knowing X gives us no information about Y. Thus the expected value of Y,
the average of all the possible outcomes of Y in the experiment, is the same no
matter what X is.
4 E(Y) = E[E(Y | X)]. This property, called successive conditioning, and property 5
are the ones we use the most in sampling; we use them to find the bias and
variance of estimators in cluster sampling. Successive conditioning simply says
that if we take the weighted average of the conditional expected value of Y given
that X = x, with weights P(X = x), the result is the expected value of Y. You
use successive conditioning every time you take a weighted average of a quantity
over subpopulations: If a population has 60 women and 40 men, and if the average
height of the women is 64 inches and the average height of the men is 69 inches,
then the average height for the population is

(64 × 0.6) + (69 × 0.4) = 66 inches.

In this example, 64 is the conditional expected value of height given that the person
is a woman, 69 is the conditional expected value of height given that the person
is a man, and 66 is the expected value of height for all persons in the population.
5 V[Y] = V[E(Y | X)] + E[V(Y | X)]. This property gives an easy way of calculating
variances in two-stage cluster samples. It says that the total variability has two
parts: (a) the variability that arises because E(Y | X = x) varies with different
values of x, and (b) the variability that arises because there can be different values
of y associated with the same value of x. Note that, using property 6 of Expected
Value in Section A.2,

V(Y | X) = E{[Y − E(Y | X)]² | X} = E[Y² | X] − [E(Y | X)]²     (A.18)

and

V[E(Y | X)] = E({E(Y | X) − E[E(Y | X)]}²)
            = E({E(Y | X) − E(Y)}²)
            = E{[E(Y | X)]²} − [E(Y)]².     (A.19)

EXAMPLE A.5 Here’s how conditional expectation properties work in Example A.3. Successive
conditioning implies that

E(Y) = E(Y | Z = 0) P(Z = 0) + E(Y | Z = 1) P(Z = 1) = 4 × (1/2) + 3 × (1/2) = 3.5.

We can find the distribution of V (Y | Z) using (A.18):

V(Y | Z = 0) = E(Y² | Z = 0) − [E(Y | Z = 0)]² = 2² × (1/2) + 6² × (1/2) − 4² = 4,

V(Y | Z = 1) = E(Y² | Z = 1) − [E(Y | Z = 1)]² = 1² × (1/4) + 3² × (1/4) + 4² × (1/2) − 3² = 1.5.

These calculations give the following probability distribution for V (Y | Z):

z    V(Y | Z = z)    Probability
0    4               1/2
1    1.5             1/2

Thus, using (A.19),

V[E(Y | Z)] = E([E(Y | Z) − E(Y)]²)
            = [E(Y | Z = 0) − E(Y)]² P(Z = 0) + [E(Y | Z = 1) − E(Y)]² P(Z = 1)
            = (4 − 3.5)² × (1/2) + (3 − 3.5)² × (1/2)
            = 0.25.

Using the probability distribution of V(Y | Z),

E[V(Y | Z)] = 4 × (1/2) + 1.5 × (1/2) = 2.75.

Consequently,

V(Y) = V[E(Y | Z)] + E[V(Y | Z)] = 0.25 + 2.75 = 3.00. ■
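Properties 4 and 5 for Example A.5 can be verified numerically from the two-ball setup; a sketch with exact arithmetic (the helper names are ours):

```python
from fractions import Fraction

# Z = 1 means ball A = {1, 3, 4, 4}; Z = 0 means ball B = {2, 6}.
balls = {1: [1, 3, 4, 4], 0: [2, 6]}
pZ = {1: Fraction(1, 2), 0: Fraction(1, 2)}

def mean(v):
    return Fraction(sum(v), len(v))

def var(v):
    # V = E(Y^2) - (EY)^2, property 6 of expected value.
    return mean([x * x for x in v]) - mean(v) ** 2

# Successive conditioning (property 4): E(Y) = E[E(Y | Z)].
EY = sum(mean(balls[z]) * pZ[z] for z in balls)
assert EY == Fraction(7, 2)  # 3.5

# Variance decomposition (property 5): V(Y) = V[E(Y|Z)] + E[V(Y|Z)].
V_of_E = sum((mean(balls[z]) - EY) ** 2 * pZ[z] for z in balls)
E_of_V = sum(var(balls[z]) * pZ[z] for z in balls)
assert V_of_E == Fraction(1, 4)    # 0.25
assert E_of_V == Fraction(11, 4)   # 2.75
assert V_of_E + E_of_V == 3        # V(Y) = 3.00
```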

If we did not have the properties of conditional expectation, we would need to
find the unconditional probability distribution of Y to calculate its expectation and
variance—a relatively easy task for the small number of options in Example A.3 but
cumbersome to do for general multistage cluster sampling.

EXERCISE A.6 Consider the box below, with 3 balls labelled 1, 2, and 3:

[Figure: three balls, each containing several of the numbers 1, 3, 4, 5, 6, 6, 7, 8, 9.]

Suppose we take an SRS of one ball, then subsample an SRS of one number from
the selected ball. Let Z represent the number of the ball chosen, and let Y represent
the number we choose from the ball. Use the properties of conditional expectation to
find E(Y) and V(Y). ■
