
Comparisons_of_Several_Multivariate_Means (1)

The document discusses paired comparisons in statistical analysis, covering univariate and multivariate responses, including methods for hypothesis testing and confidence intervals. It details the use of t-tests and F-tests for analyzing treatment effects, as well as repeated measures designs. Additionally, it addresses comparing mean vectors from two populations and provides results for simultaneous confidence intervals and testing for equal treatments.

Uploaded by

vanjunxin

1 Paired Comparisons

1.1 Paired Comparisons: Univariate Response


• Let $X_{j1}$ denote the response to treatment 1, and let $X_{j2}$ denote the response to treatment
2 for the $j$th trial.
• Let $D_j = X_{j1} - X_{j2}$, $j = 1, 2, \dots, n$, reflect the differential effects of the treatments.
• Assume $D_1, D_2, \dots, D_n$ are i.i.d. from $N(\delta, \sigma_d^2)$. Then

\[ t = \frac{\bar{D} - \delta}{s_d / \sqrt{n}} \sim t_{n-1}, \]

where $\bar{D} = \frac{1}{n}\sum_{j=1}^{n} D_j$ and $s_d^2 = \frac{1}{n-1}\sum_{j=1}^{n} (D_j - \bar{D})^2$.
• An α-level test of $H_0: \delta = 0$ v.s. $H_1: \delta \ne 0$ may be conducted by comparing $|t|$ with
$t_{n-1}(\alpha/2)$. A $100(1-\alpha)\%$ confidence interval for the mean difference $\delta$ is given by

\[ \bar{d} - t_{n-1}(\alpha/2)\frac{s_d}{\sqrt{n}} \le \delta \le \bar{d} + t_{n-1}(\alpha/2)\frac{s_d}{\sqrt{n}}. \]
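
The paired t statistic and interval above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function names `paired_t` and `paired_ci` and the data are hypothetical, and the critical value $t_{n-1}(\alpha/2)$ is assumed to be looked up by the caller.

```python
import math
from statistics import mean, stdev

def paired_t(x1, x2, delta0=0.0):
    """Return (d_bar, s_d, t) for paired samples under H0: delta = delta0."""
    d = [a - b for a, b in zip(x1, x2)]      # D_j = X_j1 - X_j2
    n = len(d)
    d_bar = mean(d)
    s_d = stdev(d)                           # sample s.d., divisor n - 1
    t = (d_bar - delta0) / (s_d / math.sqrt(n))
    return d_bar, s_d, t

def paired_ci(d_bar, s_d, n, t_crit):
    """Interval d_bar -/+ t_crit * s_d / sqrt(n).

    t_crit stands for t_{n-1}(alpha/2) and is supplied by the caller
    (an assumption of this sketch; it is not computed here).
    """
    half = t_crit * s_d / math.sqrt(n)
    return d_bar - half, d_bar + half

# Hypothetical data, for illustration only: the differences are 1, 2, 3, 4.
d_bar, s_d, t = paired_t([5, 7, 9, 11], [4, 5, 6, 7])
```

The null hypothesis is rejected at level α when `abs(t)` exceeds the supplied critical value.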

1.2 Paired Comparisons: Multivariate Responses


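• We label the $p$ responses within the $j$th unit as

\[ D_j = \begin{bmatrix} D_{j1} \\ D_{j2} \\ \vdots \\ D_{jp} \end{bmatrix}
 = \begin{bmatrix} X_{1j1} \\ X_{1j2} \\ \vdots \\ X_{1jp} \end{bmatrix}
 - \begin{bmatrix} X_{2j1} \\ X_{2j2} \\ \vdots \\ X_{2jp} \end{bmatrix} \]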

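• Assume $D_1, D_2, \dots, D_n$ are i.i.d. from $N_p(\delta, \Sigma)$. Then

\[ T^2 = n(\bar{D} - \delta)' S_d^{-1} (\bar{D} - \delta) \sim \frac{p(n-1)}{n-p} F_{p,\,n-p}, \]

where $\bar{D}$ and $S_d$ are the sample mean vector and sample covariance matrix of the $D_j$.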

• An α-level test of $H_0: \delta = 0$ v.s. $H_1: \delta \ne 0$ is conducted by comparing

\[ T^2 = n\,\bar{d}' S_d^{-1} \bar{d} \quad \text{with} \quad \frac{p(n-1)}{n-p} F_{p,\,n-p}(\alpha). \]

1.3 Paired Comparisons: Multivariate Responses


• A $100(1-\alpha)\%$ confidence region for $\delta$ consists of all $\delta$ such that

\[ (\bar{d} - \delta)' S_d^{-1} (\bar{d} - \delta) \le \frac{p(n-1)}{n(n-p)} F_{p,\,n-p}(\alpha) \]

• $100(1-\alpha)\%$ simultaneous confidence intervals for the individual mean differences
$\delta_i$ are given by

\[ \bar{d}_i \pm \sqrt{\frac{(n-1)p}{n-p} F_{p,\,n-p}(\alpha)} \, \sqrt{\frac{s_{d_i}^2}{n}} \]

where $\bar{d}_i$ is the $i$th element of $\bar{d}$ and $s_{d_i}^2$ is the $i$th diagonal element of $S_d$.

• The Bonferroni $100(1-\alpha)\%$ simultaneous confidence intervals for the individual
mean differences $\delta_i$ are

\[ \bar{d}_i \pm t_{n-1}\!\left(\frac{\alpha}{2p}\right) \sqrt{\frac{s_{d_i}^2}{n}} \]

where $t_{n-1}(\frac{\alpha}{2p})$ is the upper $100(\alpha/2p)$th percentile of a t-distribution with $n-1$
d.f.
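
The multivariate paired statistic can be computed directly from the difference vectors. The following is a minimal sketch using only the Python standard library; the helper names (`mean_vector`, `sample_cov`, `solve`, `paired_hotelling_t2`) and the data are hypothetical, and the $F_{p,n-p}(\alpha)$ quantile is assumed to come from a table or statistics package.

```python
def mean_vector(rows):
    """Column means of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sample_cov(rows):
    """Sample covariance matrix with divisor n - 1."""
    n, m = len(rows), mean_vector(rows)
    p = len(m)
    return [[sum((r[a] - m[a]) * (r[b] - m[b]) for r in rows) / (n - 1)
             for b in range(p)] for a in range(p)]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    M = [list(A[i]) + [b[i]] for i in range(k)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, k):
            f = M[r][c] / M[c][c]
            for j in range(c, k + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * k
    for i in reversed(range(k)):
        x[i] = (M[i][k] - sum(M[i][j] * x[j] for j in range(i + 1, k))) / M[i][i]
    return x

def paired_hotelling_t2(d_rows, delta0=None):
    """Return (T^2, p(n-1)/(n-p)) for difference vectors d_rows."""
    n, p = len(d_rows), len(d_rows[0])
    d_bar, S_d = mean_vector(d_rows), sample_cov(d_rows)
    if delta0 is None:
        delta0 = [0.0] * p
    dev = [d_bar[i] - delta0[i] for i in range(p)]
    w = solve(S_d, dev)                      # w = S_d^{-1} (d_bar - delta0)
    t2 = n * sum(dev[i] * w[i] for i in range(p))
    return t2, p * (n - 1) / (n - p)

# Hypothetical difference vectors (n = 4, p = 2), for illustration only.
t2, scale = paired_hotelling_t2([[1, 0], [0, 1], [2, 2], [1, 1]])
```

$H_0: \delta = 0$ is rejected when `t2` exceeds `scale` times the chosen $F_{p,n-p}(\alpha)$ quantile.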

1.4 Checking For a Mean Difference With Paired Observations

Example: Municipal wastewater treatment plants are required by law to monitor their
discharges into rivers and streams on a regular basis. Concern about the reliability
of data from one of these self-monitoring programs led to a study in which samples
of effluent were divided and sent to two laboratories for testing. One half of each
sample was sent to the Wisconsin State Laboratory of Hygiene, and one-half was sent
to a commercial laboratory routinely used in the monitoring program. Measurements
of biochemical oxygen demand (BOD) and suspended solids (SS) were obtained, for
n = 11 sample splits, from the two laboratories. The data are as follows:

              Commercial lab                    State lab of hygiene
Sample j      $x_{1j1}$ (BOD)   $x_{1j2}$ (SS)     $x_{2j1}$ (BOD)   $x_{2j2}$ (SS)
1             6                 27                 25                15
2             6                 23                 28                13
3             18                64                 36                22
⋮             ⋮                 ⋮                  ⋮                 ⋮
11            20                14                 39                21

Discussion:
• Randomized assignment of treatments can enhance the statistical analysis

• Contrast matrix C, each row is a contrast vector.
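
The contrast-matrix remark can be made concrete with a short derivation. As a sketch, assume the $2p$ responses on unit $j$ are stacked into one vector $x_j$ with sample mean $\bar{x}$ and sample covariance matrix $S$ (the stacking order and the symbol $I_p$ for the $p \times p$ identity are notation introduced here for illustration):

```latex
x_j = [x_{1j1}, \dots, x_{1jp},\; x_{2j1}, \dots, x_{2jp}]', \qquad
C = [\, I_p \;\; -I_p \,] \quad (p \times 2p)

d_j = C x_j, \qquad \bar{d} = C\bar{x}, \qquad S_d = C S C'

T^2 = n\,\bar{d}' S_d^{-1} \bar{d} = n\,\bar{x}' C' (C S C')^{-1} C \bar{x}
```

Each row of $C$ sums to zero, so each row is a contrast vector, and $T^2$ can be computed from the full data without first forming the differences.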

1.5 Repeated Measure Design


• Another generalization of the univariate paired t-statistic arises in situations when
q treatments are compared with respect to a single response variable.
• Each subject or experimental unit receives each treatment once over successive
periods of time. The $j$th observation is $X_j = [X_{j1}, X_{j2}, \dots, X_{jq}]'$, $j =
1, 2, \dots, n$, where $X_{ji}$ is the response to the $i$th treatment on the $j$th unit.

• The name repeated measures stems from the fact that all treatments are adminis-
tered to each unit.
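• For comparative purposes, we consider contrasts of the components of $\mu =
E(X_j)$. These could be

\[ \begin{bmatrix} \mu_1 - \mu_2 \\ \mu_1 - \mu_3 \\ \vdots \\ \mu_1 - \mu_q \end{bmatrix}
 = \begin{bmatrix} 1 & -1 & 0 & \cdots & 0 \\ 1 & 0 & -1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 0 & 0 & \cdots & -1 \end{bmatrix}
 \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_q \end{bmatrix} = C_1 \mu \]

or

\[ \begin{bmatrix} \mu_2 - \mu_1 \\ \mu_3 - \mu_2 \\ \vdots \\ \mu_q - \mu_{q-1} \end{bmatrix}
 = \begin{bmatrix} -1 & 1 & 0 & \cdots & 0 & 0 \\ 0 & -1 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & -1 & 1 \end{bmatrix}
 \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_q \end{bmatrix} = C_2 \mu \]

Both $C_1$ and $C_2$ are called contrast matrices, because their $q - 1$ rows are linearly
independent and each is a contrast vector.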
• Consider an $N_q(\mu, \Sigma)$ population, and let $C$ be a contrast matrix. An α-level test of
$H_0: C\mu = 0$ v.s. $H_1: C\mu \ne 0$ is: Reject $H_0$ if

\[ T^2 = n(C\bar{x})'(CSC')^{-1}(C\bar{x}) > \frac{(n-1)(q-1)}{n-q+1} F_{q-1,\,n-q+1}(\alpha) \]

where $\bar{x}$ and $S$ are the sample mean vector and sample covariance matrix of the $X_j$'s.
• A confidence region for the contrasts $C\mu$ is

\[ n(C\bar{x} - C\mu)'(CSC')^{-1}(C\bar{x} - C\mu) \le \frac{(n-1)(q-1)}{n-q+1} F_{q-1,\,n-q+1}(\alpha) \]

• Simultaneous $100(1-\alpha)\%$ confidence intervals for the single contrasts $c'\mu$, valid for any contrast
vector $c$, are given by:

\[ c'\bar{x} \pm \sqrt{\frac{(n-1)(q-1)}{n-q+1} F_{q-1,\,n-q+1}(\alpha)} \, \sqrt{\frac{c'Sc}{n}} \]
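
The two contrast matrices and the scaled cutoff used in the test can be built mechanically. A minimal sketch follows; the function names are hypothetical, and `f_crit` stands for an $F_{q-1,n-q+1}(\alpha)$ quantile that the caller is assumed to look up.

```python
def contrast_vs_first(q):
    """C1: rows are mu_1 - mu_i for i = 2, ..., q."""
    C = []
    for i in range(1, q):
        row = [0] * q
        row[0], row[i] = 1, -1
        C.append(row)
    return C

def contrast_successive(q):
    """C2: rows are mu_{i+1} - mu_i for i = 1, ..., q - 1."""
    C = []
    for i in range(q - 1):
        row = [0] * q
        row[i], row[i + 1] = -1, 1
        C.append(row)
    return C

def apply_contrasts(C, v):
    """Matrix-vector product C v, e.g. C applied to a mean vector."""
    return [sum(c * x for c, x in zip(row, v)) for row in C]

def t2_cutoff(n, q, f_crit):
    """(n-1)(q-1)/(n-q+1) * f_crit, with f_crit supplied by the caller."""
    return (n - 1) * (q - 1) / (n - q + 1) * f_crit

C1 = contrast_vs_first(4)
C2 = contrast_successive(4)
```

Every row of either matrix sums to zero, which is the defining property of a contrast vector.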

1.6 Testing for equal treatments in a repeated measures design


Example: Improved anesthetics are often developed by first studying their effects on
animals. In one study, 19 dogs were initially given the drug pentobarbital. Each dog
was then administered carbon dioxide (CO2) at each of two pressure levels. Next,
halothane (H) was added, and the administration of CO2 was repeated. The response,
milliseconds between heartbeats, was measured for the four treatment combinations:
• Treatment 1 = high CO2 pressure without H
• Treatment 2 = low CO2 pressure without H
• Treatment 3 = high CO2 pressure with H
• Treatment 4 = low CO2 pressure with H
Analyze the anesthetizing effects of CO2 pressure and halothane from this repeated-measures
design.
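
One natural set of contrasts for this 2 × 2 treatment structure is sketched below. The choice of $C$ is an illustration under the treatment ordering listed above, not necessarily the only reasonable one; the cutoff simply substitutes $n = 19$ and $q = 4$ into the repeated measures test.

```latex
(\mu_3 + \mu_4) - (\mu_1 + \mu_2) \quad \text{halothane contrast (with H vs. without H)}
(\mu_1 + \mu_3) - (\mu_2 + \mu_4) \quad \text{CO}_2 \text{ contrast (high vs. low pressure)}
(\mu_1 + \mu_4) - (\mu_2 + \mu_3) \quad \text{H--CO}_2 \text{ interaction contrast}

C = \begin{bmatrix} -1 & -1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix},
\qquad
\text{reject } H_0: C\mu = 0 \text{ if } T^2 > \frac{(18)(3)}{16} F_{3,16}(\alpha)
```

If $H_0$ is rejected, simultaneous intervals for the three rows of $C\mu$ indicate which effect is responsible.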

2 Comparing Mean Vectors From Two Populations
• Consider a random sample of size $n_1$ from population 1 and a sample of size $n_2$
from population 2.

• The observations on $p$ variables from populations 1 and 2 are $[x_{11}, x_{12}, \dots, x_{1n_1}]'$
and $[x_{21}, x_{22}, \dots, x_{2n_2}]'$.
• Question: $\mu_1 - \mu_2 = \delta_0$ v.s. $\mu_1 - \mu_2 \ne \delta_0$
• The sample statistics from the two populations are, for $k = 1, 2$,

\[ \bar{x}_k = \frac{1}{n_k}\sum_{j=1}^{n_k} x_{kj}, \qquad
   S_k = \frac{1}{n_k - 1}\sum_{j=1}^{n_k} (x_{kj} - \bar{x}_k)(x_{kj} - \bar{x}_k)' \]
• Assume:

– the sample $X_{k1}, X_{k2}, \dots, X_{kn_k}$, $k = 1, 2$, is i.i.d. with mean vector $\mu_k$ and
positive definite covariance matrix $\Sigma_k > 0$.
– $X_{11}, \dots, X_{1n_1}$ are independent of $X_{21}, \dots, X_{2n_2}$.

Further Assumptions When n1 and n2 Are Small


• Both populations are multivariate normal: $X_{k1}, \dots, X_{kn_k}$ i.i.d. $\sim N_p(\mu_k, \Sigma_k)$, $k = 1, 2$
• Also, $\Sigma_1 = \Sigma_2 = \Sigma$ (same covariance matrix)

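Then we get
• $\bar{X}_1 - \bar{X}_2$ is the MLE of $\delta = \mu_1 - \mu_2$, and $\Sigma$ is estimated by the pooled covariance matrix

\[ S_{pooled} = \frac{1}{n_1 + n_2 - 2}\left[(n_1 - 1)S_1 + (n_2 - 1)S_2\right], \]

which is unbiased for $\Sigma$ (the MLE of $\Sigma$ uses the divisor $n_1 + n_2$ instead of $n_1 + n_2 - 2$).

• $\bar{X}_1 - \bar{X}_2 \sim N_p\left(\mu_1 - \mu_2, \left(\frac{1}{n_1} + \frac{1}{n_2}\right)\Sigma\right)$

• $\bar{X}_1 - \bar{X}_2$ and $S_{pooled}$ are independent.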


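Result: If $X_{k1}, \dots, X_{kn_k}$ i.i.d. $\sim N_p(\mu_k, \Sigma)$, $k = 1, 2$, then

\[ T^2 = [\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)]' \left[\left(\frac{1}{n_1} + \frac{1}{n_2}\right) S_{pooled}\right]^{-1} [\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)] \]

is distributed as $\frac{(n_1 + n_2 - 2)p}{n_1 + n_2 - p - 1} F_{p,\,n_1 + n_2 - p - 1}$.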

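• An α-level test of $H_0: \mu_1 - \mu_2 = \delta_0$ v.s. $H_1: \mu_1 - \mu_2 \ne \delta_0$ is to reject $H_0$ if

\[ T^2 = (\bar{x}_1 - \bar{x}_2 - \delta_0)' \left[\left(\frac{1}{n_1} + \frac{1}{n_2}\right) S_{pooled}\right]^{-1} (\bar{x}_1 - \bar{x}_2 - \delta_0)
 > \frac{(n_1 + n_2 - 2)p}{n_1 + n_2 - p - 1} F_{p,\,n_1 + n_2 - p - 1}(\alpha) \]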

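• A $100(1-\alpha)\%$ confidence region for $\delta = \mu_1 - \mu_2$ consists of all $\delta$ satisfying

\[ (\bar{x}_1 - \bar{x}_2 - \delta)' \left[\left(\frac{1}{n_1} + \frac{1}{n_2}\right) S_{pooled}\right]^{-1} (\bar{x}_1 - \bar{x}_2 - \delta)
 \le \frac{(n_1 + n_2 - 2)p}{n_1 + n_2 - p - 1} F_{p,\,n_1 + n_2 - p - 1}(\alpha) \]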
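
The pooled two-sample statistic can be assembled step by step from the two samples. The following stdlib-only sketch uses hypothetical helper names and made-up data; the F quantile needed for the final comparison is assumed to be supplied separately.

```python
def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def scatter(rows, m):
    """Sum over j of (x_j - m)(x_j - m)', which equals (n_k - 1) S_k."""
    p = len(m)
    return [[sum((r[a] - m[a]) * (r[b] - m[b]) for r in rows)
             for b in range(p)] for a in range(p)]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    M = [list(A[i]) + [b[i]] for i in range(k)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, k):
            f = M[r][c] / M[c][c]
            for j in range(c, k + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * k
    for i in reversed(range(k)):
        x[i] = (M[i][k] - sum(M[i][j] * x[j] for j in range(i + 1, k))) / M[i][i]
    return x

def two_sample_t2(x1, x2, delta0=None):
    """Return (T^2, scale, S_pooled); reject H0 if T^2 > scale * F quantile."""
    n1, n2, p = len(x1), len(x2), len(x1[0])
    m1, m2 = mean_vector(x1), mean_vector(x2)
    A1, A2 = scatter(x1, m1), scatter(x2, m2)
    s_pooled = [[(A1[a][b] + A2[a][b]) / (n1 + n2 - 2) for b in range(p)]
                for a in range(p)]
    if delta0 is None:
        delta0 = [0.0] * p
    dev = [m1[i] - m2[i] - delta0[i] for i in range(p)]
    factor = 1 / n1 + 1 / n2
    w = solve([[factor * s for s in row] for row in s_pooled], dev)
    t2 = sum(dev[i] * w[i] for i in range(p))
    scale = (n1 + n2 - 2) * p / (n1 + n2 - p - 1)
    return t2, scale, s_pooled

# Hypothetical samples (n1 = n2 = 4, p = 2), for illustration only.
sample1 = [[1, 0], [0, 1], [2, 2], [1, 1]]
sample2 = [[0, -1], [-1, 0], [1, 1], [0, 0]]
t2, scale, s_pooled = two_sample_t2(sample1, sample2)
```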

3 Simultaneous Confidence Intervals
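Result:
Let $c^2 = [(n_1 + n_2 - 2)p/(n_1 + n_2 - p - 1)]F_{p,\,n_1 + n_2 - p - 1}(\alpha)$. Then

\[ P\left( a'(\mu_1 - \mu_2) \in a'(\bar{x}_1 - \bar{x}_2) \pm c\sqrt{\left(\frac{1}{n_1} + \frac{1}{n_2}\right) a' S_{pooled}\, a} \ \text{ for all } a \ne 0 \right) = 1 - \alpha. \]

In particular, for $i = 1, 2, \dots, p$, the intervals

\[ \mu_{1i} - \mu_{2i} \in (\bar{x}_{1i} - \bar{x}_{2i}) \pm c\sqrt{\left(\frac{1}{n_1} + \frac{1}{n_2}\right) S_{ii,pooled}} \]

hold simultaneously with probability at least $1 - \alpha$.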
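
The reason a single constant $c$ covers every linear combination can be sketched as follows. The shorthand $u$ and $M$ is introduced here only for this derivation; the key step is the extended Cauchy–Schwarz inequality.

```latex
u = \bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2), \qquad
M = \left(\frac{1}{n_1} + \frac{1}{n_2}\right) S_{pooled}

\max_{a \ne 0} \frac{(a'u)^2}{a' M a} = u' M^{-1} u = T^2

\left\{ T^2 \le c^2 \right\}
\iff
\left\{ |a'u| \le c\sqrt{a' M a} \ \text{ for all } a \ne 0 \right\}
```

Since $P(T^2 \le c^2) = 1 - \alpha$, all intervals hold at once with the stated probability.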

3.1 The Two-Sample Situation When Σ1 ̸= Σ2


Result: Let the sample sizes be such that $n_1 - p$ and $n_2 - p$ are large. Then, an
approximate $100(1-\alpha)\%$ confidence ellipsoid for $\mu_1 - \mu_2$ is given by all $\mu_1 - \mu_2$
satisfying

\[ [\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)]' \left[\frac{1}{n_1}S_1 + \frac{1}{n_2}S_2\right]^{-1} [\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)] \le \chi_p^2(\alpha) \]

where $\chi_p^2(\alpha)$ is the upper $(100\alpha)$th percentile of a chi-square distribution with $p$ d.f.
Also, approximate $100(1-\alpha)\%$ simultaneous confidence intervals for all linear combinations $a'(\mu_1 - \mu_2)$
are provided by

\[ a'(\mu_1 - \mu_2) \in a'(\bar{x}_1 - \bar{x}_2) \pm \sqrt{\chi_p^2(\alpha)} \sqrt{a'\left(\frac{1}{n_1}S_1 + \frac{1}{n_2}S_2\right) a} \]
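
A short check shows how this large-sample procedure relates to the pooled one when the sample sizes are equal. Assuming $n_1 = n_2 = n$ (an assumption made only for this illustration):

```latex
\frac{1}{n}S_1 + \frac{1}{n}S_2
 = \frac{1}{n}(S_1 + S_2)
 = \frac{2}{n} \cdot \frac{(n-1)S_1 + (n-1)S_2}{2n - 2}
 = \left(\frac{1}{n} + \frac{1}{n}\right) S_{pooled}
```

So with equal sample sizes the quadratic form is the same as in the pooled $T^2$, and only the critical value differs (chi-square instead of scaled $F$).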

• Example 6.4 and 6.5


• Testing hypothesis: $H_0: \Sigma_1 = \Sigma_2 = \cdots = \Sigma_k$ v.s. $H_1: \exists\, j \ne l$ s.t. $\Sigma_j \ne \Sigma_l$

4 Comparing Several Multivariate Population Means


4.1 Review: A summary of Univariate ANOVA
$X_{\ell 1}, X_{\ell 2}, \dots, X_{\ell n_\ell}$ is a random sample from an $N(\mu_\ell, \sigma^2)$ population, $\ell = 1, 2, \dots, g$,
and the random samples are independent.
Question: $H_0: \mu_1 = \cdots = \mu_g$ v.s. $H_1: \exists\, k \ne l$ s.t. $\mu_k \ne \mu_l$
The reparameterization: $\mu_\ell = \mu + \tau_\ell$, where $\mu$ is the overall mean and $\tau_\ell$ is the $\ell$th
population treatment effect. To define uniquely the model parameters and their least
squares estimates, it is customary to impose the constraint $\sum_{\ell=1}^{g} n_\ell \tau_\ell = 0$.
The null hypothesis becomes $H_0: \tau_1 = \cdots = \tau_g = 0$, and $X_{\ell j} \sim N(\mu + \tau_\ell, \sigma^2)$ can be expressed as $X_{\ell j} = \mu + \tau_\ell + e_{\ell j}$,
where the $e_{\ell j}$ are i.i.d. $N(0, \sigma^2)$.

4.2 One-Way ANOVA
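Motivated by the decomposition of $X_{\ell j}$, the analysis of variance is based upon an
analogous decomposition of the observations:

\[ \underbrace{x_{\ell j}}_{\text{observation}}
 = \underbrace{\bar{x}}_{\text{overall sample mean}}
 + \underbrace{(\bar{x}_\ell - \bar{x})}_{\text{estimated treatment effect}}
 + \underbrace{(x_{\ell j} - \bar{x}_\ell)}_{\text{residual}} \]

where $\bar{x}$ is an estimate of $\mu$, $\hat{\tau}_\ell = \bar{x}_\ell - \bar{x}$ is an estimate of $\tau_\ell$, and $x_{\ell j} - \bar{x}_\ell$ is an estimate
of the error $e_{\ell j}$.

\[ \sum_{\ell=1}^{g}\sum_{j=1}^{n_\ell} (x_{\ell j} - \bar{x})^2
 = \sum_{\ell=1}^{g} n_\ell (\bar{x}_\ell - \bar{x})^2
 + \sum_{\ell=1}^{g}\sum_{j=1}^{n_\ell} (x_{\ell j} - \bar{x}_\ell)^2 \]

total corrected SS = between samples SS + within samples SS

\[ SS_{cor} = SS_{tr} + SS_{res} \]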

ANOVA Table For Comparing Univariate Population Means

Source of variation    Sum of squares (SS)    d.f.
Treatments             $SS_{tr}$              $g - 1$
Residual               $SS_{res}$             $\sum_{\ell=1}^{g} n_\ell - g$
Total                  $SS_{cor}$             $\sum_{\ell=1}^{g} n_\ell - 1$

The F-test rejects $H_0: \tau_1 = \tau_2 = \cdots = \tau_g = 0$ at level α if

\[ F = \frac{SS_{tr}/(g - 1)}{SS_{res}/\left(\sum_{\ell=1}^{g} n_\ell - g\right)} > F_{g-1,\,\sum n_\ell - g}(\alpha) \]
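
The sum-of-squares decomposition and the F ratio can be verified numerically. Below is a minimal sketch with a hypothetical function name `one_way_anova` and a small illustrative data set with three groups of sizes 3, 2 and 3.

```python
def one_way_anova(groups):
    """Return (SS_tr, SS_res, SS_cor, F) for a list of univariate samples."""
    g = len(groups)
    n = sum(len(s) for s in groups)
    grand = sum(sum(s) for s in groups) / n           # overall sample mean
    means = [sum(s) / len(s) for s in groups]          # group sample means
    ss_tr = sum(len(s) * (m - grand) ** 2 for s, m in zip(groups, means))
    ss_res = sum((x - m) ** 2 for s, m in zip(groups, means) for x in s)
    ss_cor = sum((x - grand) ** 2 for s in groups for x in s)
    f_ratio = (ss_tr / (g - 1)) / (ss_res / (n - g))
    return ss_tr, ss_res, ss_cor, f_ratio

# Small illustrative data set (g = 3 groups, 8 observations in total).
ss_tr, ss_res, ss_cor, f_ratio = one_way_anova([[9, 6, 9], [0, 2], [3, 1, 2]])
# Here SS_tr = 78, SS_res = 10, SS_cor = 88 and F = 19.5.
```

The value of `f_ratio` is then compared with $F_{g-1,\sum n_\ell - g}(\alpha)$, here an $F_{2,5}$ quantile.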

4.3 One-Way MANOVA


MANOVA model for comparing $g$ population mean vectors:
\[ X_{\ell j} = \mu + \tau_\ell + e_{\ell j}, \quad j = 1, 2, \dots, n_\ell \ \text{ and } \ \ell = 1, 2, \dots, g \]
where the $e_{\ell j}$ are independent $N_p(0, \Sigma)$ variables, $\mu$ is an overall mean, and $\tau_\ell$ represents the $\ell$th
treatment effect with $\sum_{\ell=1}^{g} n_\ell \tau_\ell = 0$.

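A vector of observations may be decomposed as $x_{\ell j} = \bar{x} + (\bar{x}_\ell - \bar{x}) + (x_{\ell j} - \bar{x}_\ell)$, which leads to the
decomposition of the total sum of squares and cross products:

\[ \sum_{\ell=1}^{g}\sum_{j=1}^{n_\ell} (x_{\ell j} - \bar{x})(x_{\ell j} - \bar{x})'
 = \sum_{\ell=1}^{g} n_\ell (\bar{x}_\ell - \bar{x})(\bar{x}_\ell - \bar{x})'
 + \sum_{\ell=1}^{g}\sum_{j=1}^{n_\ell} (x_{\ell j} - \bar{x}_\ell)(x_{\ell j} - \bar{x}_\ell)'
 = B + W \]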

MANOVA Table For Comparing Population Mean Vectors

Source of variation    Matrix of SSP    d.f.
Treatment              $B$              $g - 1$
Residual               $W$              $\sum_{\ell=1}^{g} n_\ell - g$
Total                  $B + W$          $\sum_{\ell=1}^{g} n_\ell - 1$

One test of $H_0: \tau_1 = \cdots = \tau_g = 0$ is to reject $H_0$ if Wilks' $\Lambda^*$ is too small:

\[ \Lambda^* = \frac{|W|}{|B + W|} \]
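Distribution of Wilks' $\Lambda^*$ (here $n = \sum_{\ell=1}^{g} n_\ell$)

No. of variables    No. of groups    Sampling distribution
$p = 1$             $g \ge 2$        $\left(\frac{n-g}{g-1}\right)\left(\frac{1-\Lambda^*}{\Lambda^*}\right) \sim F_{g-1,\,n-g}$
$p = 2$             $g \ge 2$        $\left(\frac{n-g-1}{g-1}\right)\left(\frac{1-\sqrt{\Lambda^*}}{\sqrt{\Lambda^*}}\right) \sim F_{2(g-1),\,2(n-g-1)}$
$p \ge 1$           $g = 2$          $\left(\frac{n-p-1}{p}\right)\left(\frac{1-\Lambda^*}{\Lambda^*}\right) \sim F_{p,\,n-p-1}$
$p \ge 1$           $g = 3$          $\left(\frac{n-p-2}{p}\right)\left(\frac{1-\sqrt{\Lambda^*}}{\sqrt{\Lambda^*}}\right) \sim F_{2p,\,2(n-p-2)}$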

When $n = \sum_{\ell} n_\ell$ is large, approximately

\[ -\left(n - 1 - \frac{p + g}{2}\right) \ln \Lambda^* \sim \chi^2_{p(g-1)} \]
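
The matrices $B$ and $W$, Wilks' $\Lambda^*$ and the large-sample statistic can be computed as follows. This is a stdlib-only sketch with hypothetical names (`det`, `manova_wilks`) and a small illustrative bivariate data set; the chi-square or F quantile for the final decision is not computed here.

```python
import math

def det(A):
    """Determinant by Gaussian elimination with partial pivoting."""
    M = [list(r) for r in A]
    k, d = len(M), 1.0
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(M[r][c]))
        if M[piv][c] == 0:
            return 0.0
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d                                   # a row swap flips the sign
        d *= M[c][c]
        for r in range(c + 1, k):
            f = M[r][c] / M[c][c]
            for j in range(c, k):
                M[r][j] -= f * M[c][j]
    return d

def manova_wilks(groups):
    """groups: list of g samples, each a list of p-vectors.

    Returns (B, W, Wilks' lambda, large-sample statistic).
    """
    g, p = len(groups), len(groups[0][0])
    rows = [x for s in groups for x in s]
    n = len(rows)
    grand = [sum(c) / n for c in zip(*rows)]
    B = [[0.0] * p for _ in range(p)]
    W = [[0.0] * p for _ in range(p)]
    for s in groups:
        m = [sum(c) / len(s) for c in zip(*s)]       # group mean vector
        for a in range(p):
            for b in range(p):
                B[a][b] += len(s) * (m[a] - grand[a]) * (m[b] - grand[b])
                W[a][b] += sum((x[a] - m[a]) * (x[b] - m[b]) for x in s)
    T = [[B[a][b] + W[a][b] for b in range(p)] for a in range(p)]
    lam = det(W) / det(T)
    stat = -(n - 1 - (p + g) / 2) * math.log(lam)    # approx. chi-square, p(g-1) d.f.
    return B, W, lam, stat

# Small illustrative data set: g = 3 groups, p = 2 variables.
data = [[[9, 3], [6, 2], [9, 7]], [[0, 4], [2, 0]], [[3, 8], [1, 9], [2, 7]]]
B, W, lam, stat = manova_wilks(data)
```

With only 8 observations the chi-square approximation is not appropriate for this data set; for $p = 2$ the exact F form from the table would be the natural choice.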

5 Simultaneous Confidence Intervals For Treatment Effects

• When the hypothesis of equal treatment effects is rejected, those effects that led
to the rejection of the hypothesis are of interest.
• For pairwise comparisons, the Bonferroni approach can be used to construct simultaneous
confidence intervals for the components of the differences $\tau_k - \tau_\ell$.
• For the model $X_{\ell j} = \mu + \tau_\ell + e_{\ell j}$, with $n = \sum_{\ell=1}^{g} n_\ell$, for all $i = 1, \dots, p$ and all $\ell < k =
1, 2, \dots, g$,

\[ P\left( \tau_{ki} - \tau_{\ell i} \in \bar{x}_{ki} - \bar{x}_{\ell i} \pm t_{n-g}\!\left(\frac{\alpha}{pg(g-1)}\right) \sqrt{\frac{\omega_{ii}}{n - g}\left(\frac{1}{n_k} + \frac{1}{n_\ell}\right)} \right) \ge 1 - \alpha \]

here $\omega_{ii}$ is the $i$th diagonal element of $W$.
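
The Bonferroni bookkeeping behind the level $\alpha/(pg(g-1))$ is sketched below: there are $m = pg(g-1)/2$ component-wise pairwise comparisons, and each two-sided interval uses tail area $\alpha/(2m)$. The function names and inputs are hypothetical, and the t quantile is assumed to be supplied by the caller.

```python
import math

def bonferroni_setup(p, g, alpha):
    """Number of comparisons m and the per-interval upper-tail area."""
    m = p * g * (g - 1) // 2          # p components times g(g-1)/2 pairs
    tail = alpha / (2 * m)            # equals alpha / (p g (g - 1))
    return m, tail

def effect_interval(xbar_k, xbar_l, w_ii, n, g, n_k, n_l, t_crit):
    """Interval for tau_ki - tau_li.

    t_crit stands for t_{n-g}(alpha / (p g (g - 1))) and is supplied by
    the caller (an assumption of this sketch).
    """
    half = t_crit * math.sqrt(w_ii / (n - g) * (1 / n_k + 1 / n_l))
    diff = xbar_k - xbar_l
    return diff - half, diff + half

m, tail = bonferroni_setup(p=2, g=3, alpha=0.06)
```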

6 Two-Way Multivariate Analysis of Variance


Univariate Two-Way Fixed-Effects Model with Interaction
• Suppose there are $g$ levels of factor 1 and $b$ levels of factor 2, and that $n$ independent
observations can be observed at each of the $gb$ combinations of levels.
• Specify the univariate two-way model as

\[ X_{\ell k r} = \mu + \tau_\ell + \beta_k + \gamma_{\ell k} + e_{\ell k r} \]

\[ \ell = 1, \dots, g, \quad k = 1, \dots, b, \quad r = 1, \dots, n \]

\[ \sum_{\ell=1}^{g} \tau_\ell = \sum_{k=1}^{b} \beta_k = \sum_{\ell=1}^{g} \gamma_{\ell k} = \sum_{k=1}^{b} \gamma_{\ell k} = 0, \qquad e_{\ell k r} \ \text{i.i.d.} \sim N(0, \sigma^2) \]

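ANOVA for comparing effects of two factors and their interaction

Source of variation    Sum of squares                                                                                                  d.f.
Factor 1               $SS_{fac1} = \sum_{\ell=1}^{g} bn(\bar{x}_{\ell\cdot} - \bar{x})^2$                                              $g - 1$
Factor 2               $SS_{fac2} = \sum_{k=1}^{b} gn(\bar{x}_{\cdot k} - \bar{x})^2$                                                   $b - 1$
Interaction            $SS_{int} = \sum_{\ell,k} n(\bar{x}_{\ell k} - \bar{x}_{\ell\cdot} - \bar{x}_{\cdot k} + \bar{x})^2$             $(g - 1)(b - 1)$
Residual               $SS_{res} = \sum_{\ell,k,r} (x_{\ell k r} - \bar{x}_{\ell k})^2$                                                 $gb(n - 1)$
Total                  $SS_{cor} = \sum_{\ell,k,r} (x_{\ell k r} - \bar{x})^2$                                                          $gbn - 1$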

• The F-ratios of the mean squares $SS_{fac1}/(g - 1)$, $SS_{fac2}/(b - 1)$ and $SS_{int}/((g - 1)(b - 1))$
to the residual mean square $SS_{res}/(gb(n - 1))$ can be used to test for the effects of factor 1,
factor 2 and the factor 1-factor 2 interaction, respectively.
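
The two-way decomposition can be checked numerically on a small balanced layout. The sketch below uses a hypothetical function name `two_way_anova` and made-up data with $g = 2$, $b = 2$ and $n = 2$ replicates per cell.

```python
def two_way_anova(x):
    """x[l][k] is the list of n replicates at level l of factor 1, level k of factor 2.

    Returns dictionaries of sums of squares and degrees of freedom.
    """
    g, b, n = len(x), len(x[0]), len(x[0][0])
    cell = [[sum(x[l][k]) / n for k in range(b)] for l in range(g)]
    row = [sum(cell[l]) / b for l in range(g)]                        # factor 1 level means
    col = [sum(cell[l][k] for l in range(g)) / g for k in range(b)]   # factor 2 level means
    grand = sum(row) / g
    ss = {
        "fac1": sum(b * n * (row[l] - grand) ** 2 for l in range(g)),
        "fac2": sum(g * n * (col[k] - grand) ** 2 for k in range(b)),
        "int": sum(n * (cell[l][k] - row[l] - col[k] + grand) ** 2
                   for l in range(g) for k in range(b)),
        "res": sum((v - cell[l][k]) ** 2
                   for l in range(g) for k in range(b) for v in x[l][k]),
        "cor": sum((v - grand) ** 2
                   for l in range(g) for k in range(b) for v in x[l][k]),
    }
    df = {"fac1": g - 1, "fac2": b - 1, "int": (g - 1) * (b - 1),
          "res": g * b * (n - 1), "cor": g * b * n - 1}
    return ss, df

# Hypothetical balanced data, for illustration only.
ss, df = two_way_anova([[[1, 3], [5, 7]], [[2, 4], [10, 12]]])
f_fac1 = (ss["fac1"] / df["fac1"]) / (ss["res"] / df["res"])
```

The four component sums of squares add up to the corrected total, mirroring the table above.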

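Two-Way fixed-effects model for a vector response consisting of $p$ components:

\[ X_{\ell k r} = \mu + \tau_\ell + \beta_k + \gamma_{\ell k} + e_{\ell k r} \]

\[ \ell = 1, \dots, g, \quad k = 1, \dots, b, \quad r = 1, \dots, n \]

\[ \sum_{\ell=1}^{g} \tau_\ell = \sum_{k=1}^{b} \beta_k = \sum_{\ell=1}^{g} \gamma_{\ell k} = \sum_{k=1}^{b} \gamma_{\ell k} = 0, \qquad e_{\ell k r} \ \text{i.i.d.} \sim N_p(0, \Sigma) \]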

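MANOVA for comparing effects of two factors and their interaction

Source of variation    SSP                                                                                                                                                                                          d.f.
Factor 1               $SSP_{fac1} = \sum_{\ell=1}^{g} bn(\bar{x}_{\ell\cdot} - \bar{x})(\bar{x}_{\ell\cdot} - \bar{x})'$                                                                                            $g - 1$
Factor 2               $SSP_{fac2} = \sum_{k=1}^{b} gn(\bar{x}_{\cdot k} - \bar{x})(\bar{x}_{\cdot k} - \bar{x})'$                                                                                                   $b - 1$
Interaction            $SSP_{int} = \sum_{\ell,k} n(\bar{x}_{\ell k} - \bar{x}_{\ell\cdot} - \bar{x}_{\cdot k} + \bar{x})(\bar{x}_{\ell k} - \bar{x}_{\ell\cdot} - \bar{x}_{\cdot k} + \bar{x})'$                    $(g - 1)(b - 1)$
Residual               $SSP_{res} = \sum_{\ell,k,r} (x_{\ell k r} - \bar{x}_{\ell k})(x_{\ell k r} - \bar{x}_{\ell k})'$                                                                                             $gb(n - 1)$
Total                  $SSP_{cor} = \sum_{\ell,k,r} (x_{\ell k r} - \bar{x})(x_{\ell k r} - \bar{x})'$                                                                                                               $gbn - 1$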

• A test of $H_0: \gamma_{11} = \cdots = \gamma_{gb} = 0$ v.s. $H_1$: at least one $\gamma_{\ell k} \ne 0$ is conducted by
rejecting $H_0$ for small values of the ratio:

\[ \Lambda^* = \frac{|SSP_{res}|}{|SSP_{int} + SSP_{res}|} \]

For large samples, reject $H_0$ at α level if

\[ -\left[gb(n - 1) - \frac{p + 1 - (g - 1)(b - 1)}{2}\right] \ln \Lambda^* > \chi^2_{(g-1)(b-1)p}(\alpha) \]

• A test of $H_0: \tau_1 = \cdots = \tau_g = 0$ v.s. $H_1$: at least one $\tau_\ell \ne 0$ is conducted by
rejecting $H_0$ for small values of the ratio:

\[ \Lambda^* = \frac{|SSP_{res}|}{|SSP_{fac1} + SSP_{res}|} \]

For large samples, reject $H_0$ at α level if

\[ -\left[gb(n - 1) - \frac{p + 1 - (g - 1)}{2}\right] \ln \Lambda^* > \chi^2_{(g-1)p}(\alpha) \]

• A test of $H_0: \beta_1 = \cdots = \beta_b = 0$ v.s. $H_1$: at least one $\beta_k \ne 0$ is conducted by
rejecting $H_0$ for small values of the ratio:

\[ \Lambda^* = \frac{|SSP_{res}|}{|SSP_{fac2} + SSP_{res}|} \]

For large samples, reject $H_0$ at α level if

\[ -\left[gb(n - 1) - \frac{p + 1 - (b - 1)}{2}\right] \ln \Lambda^* > \chi^2_{(b-1)p}(\alpha) \]

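• The $100(1-\alpha)\%$ simultaneous confidence intervals for $\tau_{\ell i} - \tau_{m i}$ are

\[ (\bar{x}_{\ell\cdot i} - \bar{x}_{m\cdot i}) \pm t_\nu\!\left(\frac{\alpha}{pg(g-1)}\right) \sqrt{\frac{E_{ii}}{\nu} \cdot \frac{2}{bn}} \]

where $\nu = gb(n - 1)$ and $E_{ii}$ is the $i$th diagonal element of $E = SSP_{res}$.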

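• The $100(1-\alpha)\%$ simultaneous confidence intervals for $\beta_{ki} - \beta_{qi}$ are

\[ (\bar{x}_{\cdot k i} - \bar{x}_{\cdot q i}) \pm t_\nu\!\left(\frac{\alpha}{pb(b-1)}\right) \sqrt{\frac{E_{ii}}{\nu} \cdot \frac{2}{gn}} \]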
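
The three large-sample tests in this section share one template that differs only in the effect's degrees of freedom. A minimal sketch follows; the function name `bartlett_two_way` and the effect labels are hypothetical, and Wilks' $\Lambda^*$ is assumed to have been computed from the SSP matrices beforehand.

```python
import math

def bartlett_two_way(wilks_lambda, g, b, n, p, effect):
    """Large-sample statistic and chi-square d.f. for a two-way MANOVA effect.

    effect is one of "interaction", "factor1", "factor2" (labels chosen
    for this sketch). Reject H0 when the statistic exceeds the upper
    alpha quantile of a chi-square with the returned d.f.
    """
    df_effect = {
        "interaction": (g - 1) * (b - 1),
        "factor1": g - 1,
        "factor2": b - 1,
    }[effect]
    multiplier = g * b * (n - 1) - (p + 1 - df_effect) / 2
    stat = -multiplier * math.log(wilks_lambda)
    return stat, df_effect * p

# Hypothetical inputs, for illustration only.
stat, chi2_df = bartlett_two_way(math.exp(-1), g=2, b=2, n=5, p=3, effect="factor1")
```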
