568 Asst 1


STAT 568 — Assignment 1 — due date is on course outline

For each question that is carried out in R, please e-mail me your R programs.
These should be in one file, named yourname_asst1.R, with the various questions clearly
separated in the file. I should be able to run them exactly as I receive them, and see the
output nicely displayed (see the help files for the cat function). For these questions your
written assignment then need contain only the discussions of your solutions. Also: I will be
quite unhappy if you merely recycle my own programs, from the course web site, with only
the data sets changed.

1. Here you will investigate hypothesis testing in a ‘canonical form’ which greatly simplifies
the theory. The method yields the same F-tests as Cochran’s Theorem, but in a
different way. Suppose the problem begins as follows. You are given a linear model of
the data, in which observations $y_1, \dots, y_n$ are independently and normally distributed,
with common variance $\sigma^2$, and with means depending on certain predictors $\{x_{ij}\}$. In
matrix form, with $\mathbf{y} = (y_1, \dots, y_n)'$, the model is

$$\mathbf{y} = \mathbf{X}\boldsymbol{\theta} + \boldsymbol{\varepsilon},$$

where $\mathbf{X}$ is an $n \times p$ matrix of rank $p \le n$, $\boldsymbol{\theta}$ is a vector of unknown parameters
ranging over all of $\mathbb{R}^p$, and $\boldsymbol{\varepsilon} \sim N(\mathbf{0}, \sigma^2 \mathbf{I}_n)$. You wish to estimate the parameters by
maximum likelihood, and to then carry out the likelihood ratio test of

$$H_0 : \mathbf{C}\boldsymbol{\theta} = \mathbf{0},$$

where $\mathbf{C}$ is a $q \times p$ matrix of rank $q \le p$.

(a) Consider the following situation, and write it in the manner described above.
Specify the matrices $\mathbf{X}$ and $\mathbf{C}$, and the values of $n$, $p$ and $q$.
There are $n$ subjects in an experiment, $n_1$ of whom receive a certain drug and
$n_2 = n - n_1$ of whom are in a control group and receive nothing. A linear
regression model is fitted to the responses $y_i$, with one independent variable $x$:
the indicator of the event that the subject receives the treatment. We are to test
the hypothesis that the treatment has no effect.
Parts (b) to (e) are to be done in general, not just in the context of the example
in (a), to which you will return in part (f).
(b) Define $\boldsymbol{\xi} = E[\mathbf{y}]$. Show that the problem is equivalent to the following. We
observe $\mathbf{y} \sim N(\boldsymbol{\xi}, \sigma^2 \mathbf{I}_n)$, where $\boldsymbol{\xi}$ lies in a given vector space $\Pi$ of dimension $p$.
The hypothesis specifies that $\boldsymbol{\xi}$ lies in a particular $r$-dimensional subspace $\Pi_0$ of
$\Pi$, where $r \le p - q$.
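(c) Recall the Gram-Schmidt method (for instance from the Stat 312 notes); by
this one can construct an orthonormal basis $\{\boldsymbol{\pi}_1, \dots, \boldsymbol{\pi}_r\}$ of $\Pi_0$, extend it to an
orthonormal basis $\{\boldsymbol{\pi}_1, \dots, \boldsymbol{\pi}_r; \boldsymbol{\pi}_{r+1}, \dots, \boldsymbol{\pi}_p\}$ of $\Pi$, and then to an orthonormal basis
$\{\boldsymbol{\pi}_1, \dots, \boldsymbol{\pi}_r; \boldsymbol{\pi}_{r+1}, \dots, \boldsymbol{\pi}_p; \boldsymbol{\pi}_{p+1}, \dots, \boldsymbol{\pi}_n\}$ of $\mathbb{R}^n$. Let $\mathbf{Q}_0$ be the $n \times r$ matrix with
columns $\{\boldsymbol{\pi}_1, \dots, \boldsymbol{\pi}_r\}$, $\mathbf{Q}_1$ the $n \times (p - r)$ matrix with columns $\{\boldsymbol{\pi}_{r+1}, \dots, \boldsymbol{\pi}_p\}$, and
$\mathbf{Q}_2$ the $n \times (n - p)$ matrix with columns $\{\boldsymbol{\pi}_{p+1}, \dots, \boldsymbol{\pi}_n\}$. Then $\mathbf{Q}_i'\mathbf{Q}_i$ is an identity
matrix for each $i$, and $\mathbf{Q}_i'\mathbf{Q}_j$ is a zero matrix if $i \ne j$. Now the model states
that $\boldsymbol{\xi}$ lies in the column space of $(\mathbf{Q}_0 \,\vdots\, \mathbf{Q}_1)$, and the hypothesis is that it lies in
the column space of $\mathbf{Q}_0$. Define

$$\mathbf{Q} = \left(\mathbf{Q}_0 \,\vdots\, \mathbf{Q}_1 \,\vdots\, \mathbf{Q}_2\right),$$

an $n \times n$ orthogonal matrix, and $\mathbf{z} = \mathbf{Q}'\mathbf{y}$. Let $\boldsymbol{\eta}$ be the mean vector of $\mathbf{z}$. Show that
$\mathbf{z} \sim N(\boldsymbol{\eta}, \sigma^2 \mathbf{I}_n)$, that the model specifies that

$$\boldsymbol{\eta} = \begin{pmatrix} \boldsymbol{\eta}_0 \\ \boldsymbol{\eta}_1 \\ \mathbf{0} \end{pmatrix} \begin{matrix} \leftarrow r \\ \leftarrow p - r \\ \leftarrow n - p \end{matrix}$$

and that the hypothesis specifies that $\boldsymbol{\eta}_1 = \mathbf{0}_{(p-r) \times 1}$.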


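(d) Partition $\mathbf{z}$ as

$$\mathbf{z} = \begin{pmatrix} \mathbf{z}_0 \\ \mathbf{z}_1 \\ \mathbf{z}_2 \end{pmatrix} \begin{matrix} \leftarrow r \\ \leftarrow p - r \\ \leftarrow n - p \end{matrix}$$

Show that the likelihood ratio test of the hypothesis rejects for large values of

$$F = \frac{\|\mathbf{z}_1\|^2 / (p - r)}{\|\mathbf{z}_2\|^2 / (n - p)},$$

and that

$$F \sim F^{\,p-r}_{\,n-p}\left(\lambda^2 = \frac{\|\boldsymbol{\eta}_1\|^2}{\sigma^2}\right).$$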

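(e) Define $S(\boldsymbol{\theta}) = \|\mathbf{y} - \mathbf{X}\boldsymbol{\theta}\|^2$. Show that, in terms of the original parameterization,

$$F = \frac{\left(\min_{H_0} S(\boldsymbol{\theta}) - \min S(\boldsymbol{\theta})\right) / \nabla \mathrm{df}}{\min S(\boldsymbol{\theta}) / \mathrm{df}(\mathrm{MSE})},$$

where the minima are taken with and without the restrictions imposed by the
hypothesis, $\mathrm{df}(\mathrm{MSE})$ is the df of the mse in the unrestricted model, and $\nabla \mathrm{df}$
is the difference in the degrees of freedom of the mses with and without the
hypothesis. Show further that

$$\sigma^2 \lambda^2 = \min_{\mathbf{t} \in \Pi_0} \|\boldsymbol{\xi} - \mathbf{t}\|^2,$$

where $\boldsymbol{\xi}$ is the true mean vector and the minimum is evaluated over mean vectors
$\mathbf{t}$ as specified by the hypothesis; i.e. $\sigma^2 \lambda^2$ is the squared distance from $\boldsymbol{\xi}$ to the closest
member of $\Pi_0$ (and so is 0 under $H_0$).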
(f) Identify $\Pi$, $\Pi_0$ and the noncentrality parameter, in the example from (a). As a
check, you should notice that the ncp coincides with the F test statistic when the
parameters are replaced by estimates.
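The canonical-form construction in parts (c) to (e) can be checked numerically. What follows is a minimal sketch in Python (not R, and not a model solution), assuming a toy two-group layout like the one in (a): the responses are made up for illustration, and the names `gram_schmidt`, `Q_cols`, `f_canonical` and `f_rss` are hypothetical. It builds the orthonormal basis by Gram-Schmidt, forms z = Q'y, and compares the F ratio computed from the blocks of z with the one computed from restricted and unrestricted residual sums of squares.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors, tol=1e-10):
    """Orthonormalize `vectors` in order, skipping numerically dependent ones."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis

n, p, r = 5, 2, 1                      # assumed dimensions for the toy layout
ones = [1.0] * n                       # assumed to span Pi_0 (no treatment effect)
x = [1.0, 1.0, 0.0, 0.0, 0.0]          # hypothetical treatment indicator
y = [5.1, 4.7, 3.9, 4.2, 3.6]          # made-up responses, illustration only
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Spanning vectors ordered as: Pi_0, then the rest of Pi, then all of R^n.
Q_cols = gram_schmidt([ones, x] + identity)

z = [dot(col, y) for col in Q_cols]    # z = Q'y
z1, z2 = z[r:p], z[p:]
f_canonical = (dot(z1, z1) / (p - r)) / (dot(z2, z2) / (n - p))

def sse(values):
    """Residual sum of squares about the sample mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

s_restricted = sse(y)                  # one common mean under the hypothesis
s_full = sse(y[:2]) + sse(y[2:])       # separate means for the two groups
f_rss = ((s_restricted - s_full) / (p - r)) / (s_full / (n - p))

print(f_canonical, f_rss)
```

The two printed values should agree up to rounding error, which is the content of part (e) for this toy layout. Appending the standard basis vectors and letting Gram-Schmidt discard the dependent ones is only one convenient way to complete the basis.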
2. Suppose that $x_1, \dots, x_n$ are independent, with $x_i \sim N(\nu_i, 1)$, so that $\chi^2 = \sum_{i=1}^{n} x_i^2 \sim \chi^2_n(\lambda^2)$, with $\lambda^2 = \sum_{i=1}^{n} \nu_i^2$.

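(a) Show that we can assume that $x_1 \sim N(\lambda, 1)$ and that $x_2, \dots, x_n \sim N(0, 1)$.
[Hint: Let $\mathbf{x}_{n \times 1}$ have elements $x_i$, and write $\chi^2 = \|\mathbf{x}\|^2 = \|\mathbf{Q}\mathbf{x}\|^2$ for any
orthogonal $\mathbf{Q}$. Choose $\mathbf{Q}$ to have first row $\boldsymbol{\nu}' / \|\boldsymbol{\nu}\|$.]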
(b) Use (a) to show that $\chi^2 \sim \chi_1^2 + \chi_{n-1}^2$, where $\chi_1^2 \sim \chi^2_1(\lambda^2)$ independently of
$\chi_{n-1}^2 \sim \chi^2_{n-1}$ (central). [This is continued, so as to obtain the density of $\chi^2$, on
the Stat 575 web site.]
(c) Show that $\chi^2$ is ‘stochastically increasing in $\lambda$’, in that the function $P(\chi^2 > c)$ is
an increasing function of $\lambda$, for any $c > 0$. [Hint: Show that $\chi_1^2$ has this property,
and then condition on $\chi_{n-1}^2$.]
Note: The same conditioning approach applies to a singly non-central $F$ r.v.;
it too is stochastically increasing in its ncp, implying that the power of the LR
test of $H_0 : \lambda^2 = 0$ (the hypothesis from the previous problem) increases as one
moves away from the null hypothesis.
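Two of the hints in this problem can be explored numerically before being proved. This is a minimal Python sketch under stated assumptions, not a proof and not the intended solution: the helper names `householder_first_row` and `tail_prob_chi1` are hypothetical, the vector `nu` is an arbitrary example, and a Householder reflection is just one possible choice of orthogonal Q with first row ν'/‖ν‖. The second helper evaluates P((Z + λ)² > c) for a standard normal Z, which is the one-degree-of-freedom case of the tail probability in (c).

```python
import math

def householder_first_row(nu):
    """Householder reflection whose first row is nu'/||nu|| (assumes nu != 0)."""
    n = len(nu)
    norm = math.sqrt(sum(v * v for v in nu))
    u = list(nu)
    u[0] -= norm                       # u = nu - ||nu|| e_1
    uu = sum(v * v for v in u)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    if uu == 0.0:                      # nu is already a positive multiple of e_1
        return identity
    return [[identity[i][j] - 2.0 * u[i] * u[j] / uu for j in range(n)]
            for i in range(n)]

def normal_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def tail_prob_chi1(c, lam):
    """P((Z + lam)^2 > c) for Z ~ N(0, 1) and c > 0."""
    root = math.sqrt(c)
    return 1.0 - normal_cdf(root - lam) + normal_cdf(-root - lam)

nu = [3.0, 4.0, 0.0]                   # arbitrary example mean vector
Q = householder_first_row(nu)
Q_nu = [sum(Q[i][j] * nu[j] for j in range(3)) for i in range(3)]

# Tail probabilities at a fixed cutoff c = 2 for increasing lambda.
probs = [tail_prob_chi1(2.0, lam) for lam in (0.0, 0.5, 1.0, 2.0)]

print(Q[0], Q_nu, probs)
```

For this `nu`, the rotated mean vector `Q_nu` has all of its length in the first coordinate, which is what lets (a) reduce the problem to a single non-central component. The list `probs` can be inspected for monotonicity in λ at the chosen cutoff.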

3. From the text: ch. 2 #11. (In all testing situations, state the p-value.)

4. From the text: ch. 2 #12.

5. In the pulp/operator data set, represent the model as a regression model with response

$$E[y] = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_3,$$

using indicators $x_i = I(\text{treatment } i + 1)$ for $i = 1, 2, 3$. Fit this model to the data.
Represent and test the hypothesis of no treatment effects, and verify that the $F$ statistic has
the same value as in the anova formulation of the model.
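The indicator coding described in this problem can be sketched as follows. This is a small Python illustration rather than an R program, and it assumes four treatment levels labelled 1 to 4; the function name `indicator_row` and the label vector `treatments` are hypothetical and are not the pulp data.

```python
def indicator_row(treatment, levels=4):
    """Design row (1, x_1, ..., x_{levels-1}) with x_i = I(treatment == i + 1)."""
    return [1.0] + [1.0 if treatment == i + 1 else 0.0 for i in range(1, levels)]

treatments = [1, 1, 2, 3, 4]           # hypothetical labels, not the pulp data
X = [indicator_row(t) for t in treatments]

for t, row in zip(treatments, X):
    print(t, row)
```

Under this coding the intercept is the mean response for treatment 1, and each remaining coefficient is the difference between a later treatment's mean and that baseline.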
6. Obtain the ANOVA Table 3.19 in R.

7. Obtain the ANOVA Table 3.25 in R, and make the multiple comparisons: which
compounds are significantly different at the 5% level? Plot the 95% confidence intervals.

8. From the text: ch. 3 #9. Include a discussion of how blocking might be incorporated.

9. From the text: ch. 3 #33. ‘Compare the results’ by carrying out a suitable test.

10. From the text: ch. 3 #35. Compare the methods at the 95% level.
