Wiley and Royal Statistical Society are collaborating with JSTOR to digitize, preserve and extend access to
Journal of the Royal Statistical Society. Series C (Applied Statistics).
Modelling Variance Heterogeneity in Normal Regression Using GLIM
By MURRAY AITKIN†
University of Lancaster, UK
SUMMARY
This paper describes and presents simple GLIM macros for the modelling of variance heterogeneity
in normal regression analysis, using a log-linear regression model for the variance. The procedure is
illustrated with two examples.
Keywords: Variance heterogeneity; Score test; GLIM; Maximum likelihood
1. Introduction
Homogeneity of variance is one of the standard assumptions of normal regression analysis.
The failure of this assumption can sometimes be rectified by a Box-Cox (1964) transformation
of the response variable, which may provide the added benefits of closer approximations to
additivity and normality. This does not necessarily occur, however, and it is often of interest
to allow explicitly for heterogeneity of variance in the analysis. The difficulty is to find and
fit a satisfactory model for the heterogeneity.
Modelling approaches using the score test are proposed and used by Cook and Weisberg
(1983). For the normal regression model
\[ y_i = \beta' x_i + e_i \qquad (1) \]
with \( e_i \sim N(0, \sigma^2) \) under homogeneity, they propose the model for heterogeneity
\[ \mathrm{var}(e_i) = \sigma_i^2 = \exp(\lambda' z_i), \qquad (2) \]
where \( z_i \) may contain some or all of the variables in \( x_i \) and other variables not included in
\( x_i \); \( z_i \) is assumed to contain a constant 1. The log-linear form ensures that \( \sigma_i^2 \) remains positive.
The results of Chen (1983) suggest that the score test used by Cook and Weisberg is not very
sensitive to the functional form of the relationship between \( \sigma_i^2 \) and \( \lambda' z_i \). The advantage of the
score test is that the model (2) does not have to be explicitly fitted; the disadvantage is that
the curvature of the likelihood at the null hypothesis may be very different from that at the
maximum likelihood estimate of \( \lambda \), leading to poor agreement between the score and likelihood
ratio tests.
The purpose of this note is to point out that the model (2) can be fitted by maximum
likelihood in GLIM3 (Baker and Nelder, 1978) and that the mean regression parameter \( \beta \) can
be estimated simultaneously, allowing for the model (2) for the dispersion, using a simple
macro. This result provides a powerful general regression modelling approach in which
variance heterogeneity is explicitly modelled rather than being (hopefully) transformed away.
† Address for correspondence: Division of Statistical and Psychometric Research and Services, Educational Testing
Service, Princeton N.J. 08541, U.S.A.
© 1987 Royal Statistical Society 0035-9254/87/36332 $2.00
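with
\[ \log \sigma_i^2 = \lambda' z_i. \]
The log-likelihood is
\[ l(\beta, \lambda) = -\tfrac{1}{2} \sum \log \sigma_i^2 - \tfrac{1}{2} \sum (y_i - \beta' x_i)^2 / \sigma_i^2 \]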
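and the first and second derivatives with respect to the parameters are
\[ \frac{\partial l}{\partial \beta} = \sum x_i (y_i - \beta' x_i) / \sigma_i^2, \]
\[ \frac{\partial^2 l}{\partial \beta \, \partial \beta'} = -\sum x_i x_i' / \sigma_i^2, \]
\[ \frac{\partial l}{\partial \lambda} = -\tfrac{1}{2} \sum z_i + \tfrac{1}{2} \sum (y_i - \beta' x_i)^2 z_i / \sigma_i^2 = \tfrac{1}{2} \sum (e_i^2 / \sigma_i^2 - 1) z_i, \]
\[ \frac{\partial^2 l}{\partial \lambda \, \partial \lambda'} = -\tfrac{1}{2} \sum (e_i^2 / \sigma_i^2) z_i z_i', \]
\[ \frac{\partial^2 l}{\partial \beta \, \partial \lambda'} = -\sum (e_i / \sigma_i^2) x_i z_i', \]
where
\[ e_i = y_i - \beta' x_i. \]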
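The first derivatives above can be checked numerically. The following is a minimal Python sketch (not part of the paper's GLIM macros) that codes the log-likelihood and the two score vectors and compares them with central finite differences. The small data set, the parameter values and all function names are made up for illustration.

```python
import math

def loglik(beta, lam, X, Z, y):
    """Log-likelihood l(beta, lambda), constant term omitted as in the text."""
    total = 0.0
    for xi, zi, yi in zip(X, Z, y):
        log_var = sum(l * z for l, z in zip(lam, zi))      # log sigma_i^2 = lambda'z_i
        resid = yi - sum(b * x for b, x in zip(beta, xi))  # e_i = y_i - beta'x_i
        total += -0.5 * log_var - 0.5 * resid ** 2 / math.exp(log_var)
    return total

def score(beta, lam, X, Z, y):
    """Analytic first derivatives (dl/dbeta, dl/dlambda)."""
    g_beta = [0.0] * len(beta)
    g_lam = [0.0] * len(lam)
    for xi, zi, yi in zip(X, Z, y):
        var = math.exp(sum(l * z for l, z in zip(lam, zi)))
        resid = yi - sum(b * x for b, x in zip(beta, xi))
        for j, x in enumerate(xi):
            g_beta[j] += x * resid / var
        for k, z in enumerate(zi):
            g_lam[k] += 0.5 * (resid ** 2 / var - 1.0) * z
    return g_beta, g_lam

def numeric_gradient(f, theta, h=1e-6):
    """Central finite-difference gradient of f at theta."""
    grad = []
    for j in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[j] += h
        down[j] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad

# Made-up data and parameter values, for illustration only.
X = [[1.0, 0.5], [1.0, 1.0], [1.0, 1.5], [1.0, 2.0], [1.0, 2.5]]
Z = X                      # here the variance model uses the same covariates
y = [1.1, 1.9, 3.4, 3.8, 5.6]
beta = [0.2, 2.0]
lam = [-1.0, 0.5]

g_beta, g_lam = score(beta, lam, X, Z, y)
n_beta = numeric_gradient(lambda b: loglik(b, lam, X, Z, y), beta)
n_lam = numeric_gradient(lambda l: loglik(beta, l, X, Z, y), lam)
max_error = max(abs(a - b) for a, b in zip(g_beta + g_lam, n_beta + n_lam))
print(max_error)
```

Agreement to within finite-difference error is a quick way to confirm the signs and the factors of one half in the score for \( \lambda \).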
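The observed information matrix for the complete parameter vector \( (\beta, \lambda) \) is
\[ \begin{bmatrix} X' W_{11} X & X' W_{12} Z \\ Z' W_{12} X & Z' W_{22} Z \end{bmatrix} \]
where
\[ X' = [x_1, \ldots, x_n], \quad Z' = [z_1, \ldots, z_n], \]
\[ W_{11} = \mathrm{diag}(1/\hat\sigma_i^2), \quad W_{12} = \mathrm{diag}(\hat e_i/\hat\sigma_i^2), \quad W_{22} = \tfrac{1}{2}\,\mathrm{diag}(\hat e_i^2/\hat\sigma_i^2), \]
\[ \hat e_i = y_i - \hat\beta' x_i. \]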
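The estimated expected information matrix is
\[ \begin{bmatrix} X' W_{11} X & 0 \\ 0 & \tfrac{1}{2} Z'Z \end{bmatrix} \]
since \( E(e_i^2) = \sigma_i^2 \). Thus a Fisher scoring algorithm for the simultaneous maximum likelihood
estimation of \( \beta \) and \( \lambda \) reduces to two separate algorithms for \( \beta \) and \( \lambda \). Since however \( W_{11} \)
depends on \( \lambda \) and \( e_i \) depends on \( \beta \), the complete scoring algorithm can be formulated as a
successive relaxation algorithm.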
For given \( \sigma_i^2 \), \( \hat\beta \) is a weighted least squares estimate with weights \( 1/\sigma_i^2 \), and for given \( \beta \), \( \hat\lambda \)
is the maximum likelihood estimate obtained from a gamma distribution of \( e_i^2 \), with degrees
of freedom \( \nu = 1 \) and hence GLIM "scale" parameter (squared coefficient of variation) \( 2/\nu = 2 \),
and a log link function for the gamma mean \( \sigma_i^2 \).
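Written out, the two steps follow directly from the score vectors and the block-diagonal expected information. The iteration superscript \( (t) \) and the working residual \( r_i \) are notation introduced here for illustration, not taken from the paper.

```latex
% Mean step: weighted least squares for given \sigma_i^2
\hat\beta = (X' W_{11} X)^{-1} X' W_{11} y,
\qquad W_{11} = \mathrm{diag}(1/\sigma_i^2)

% Dispersion step: one Fisher scoring update for given \beta,
% with \sigma_i^2 = \exp(\lambda^{(t)\prime} z_i)
\lambda^{(t+1)} = \lambda^{(t)}
  + \left(\tfrac{1}{2} Z'Z\right)^{-1} \tfrac{1}{2} \sum (e_i^2/\sigma_i^2 - 1)\, z_i
  = \lambda^{(t)} + (Z'Z)^{-1} Z' r,
\qquad r_i = e_i^2/\sigma_i^2 - 1

% Gamma connection: e_i^2 = \sigma_i^2 \chi^2_1 has a gamma distribution with
% shape 1/2 and mean \sigma_i^2, whose log-likelihood in \sigma_i^2 is
-\tfrac{1}{2} \log \sigma_i^2 - \tfrac{1}{2}\, e_i^2/\sigma_i^2 + \text{const}
```

The last line is the contribution of observation \( i \) to \( l(\beta, \lambda) \) for fixed \( \beta \), which is why a gamma fit with log link to the squared residuals maximizes the normal likelihood in \( \lambda \).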
The algorithm can conveniently begin with an initial unweighted normal regression of y
on x, taking \( \sigma_i^2 \equiv \sigma^2 \). The deviance (\( -2 \log L_{\max} \)) is calculated for this model, to provide a
global test of homogeneity of variance. The squared residuals from the least squares fit are
defined as a new response variable with a gamma distribution with scale parameter 2. The
linear predictor \( \lambda' z \) is then fitted using a log link function, and the deviance calculated for
the initial estimate of \( (\beta, \lambda) \). A weighted normal regression of y on x is now fitted, with scale
parameter 1 and weights given by the reciprocals of the fitted values from the gamma model. The
squared residuals are again fitted using the gamma error model, and the normal and gamma
models are alternated until the deviance converges. At this point the standard errors from
both models are correct (those for the gamma model in GLIM are based on the Fisher scoring
algorithm which uses the expected information matrix). This analysis assumes that the
parameters \( \beta \) and \( \lambda \) are functionally independent, which will generally be the case.
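The alternation described above can be sketched in a few lines of Python. This is an illustrative re-implementation under stated assumptions, not the GLIM macros of the paper: the function names are invented, the data are simulated, the first column of Z is assumed to be the constant, and the deviance here includes the \( n \log 2\pi \) term, so its values are not comparable with GLIM output. The tolerance of 0.001, the limit of 15 outer iterations and the 20 gamma cycles follow the settings reported for the macros.

```python
import math
import random

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def wls(X, y, w):
    """Weighted least squares: solve (X'WX) b = X'Wy."""
    p = len(X[0])
    A = [[sum(wi * xi[j] * xi[k] for xi, wi in zip(X, w)) for k in range(p)]
         for j in range(p)]
    b = [sum(wi * xi[j] * yi for xi, yi, wi in zip(X, y, w)) for j in range(p)]
    return solve(A, b)

def deviance(beta, lam, X, Z, y):
    """-2 log L for the normal model with var(e_i) = exp(lambda'z_i)."""
    total = len(y) * math.log(2.0 * math.pi)
    for xi, zi, yi in zip(X, Z, y):
        eta = dot(lam, zi)
        total += eta + (yi - dot(beta, xi)) ** 2 * math.exp(-eta)
    return total

def fit(X, Z, y, tol=0.001, max_iter=15, var_cycles=20):
    n = len(y)
    beta = wls(X, y, [1.0] * n)                  # initial unweighted regression
    e2 = [(yi - dot(beta, xi)) ** 2 for xi, yi in zip(X, y)]
    # Homogeneous start; assumes the first column of Z is the constant 1.
    lam = [math.log(sum(e2) / n)] + [0.0] * (len(Z[0]) - 1)
    history = [deviance(beta, lam, X, Z, y)]
    for _ in range(max_iter):
        # Dispersion step: gamma errors, log link, fitted by Fisher scoring.
        e2 = [(yi - dot(beta, xi)) ** 2 for xi, yi in zip(X, y)]
        for _cycle in range(var_cycles):
            r = [d * math.exp(-dot(lam, zi)) - 1.0 for d, zi in zip(e2, Z)]
            step = wls(Z, r, [1.0] * n)          # (Z'Z)^{-1} Z'r
            lam = [l + s for l, s in zip(lam, step)]
            if max(abs(s) for s in step) < 1e-8:
                break
        # Mean step: weights are reciprocals of the fitted variances.
        beta = wls(X, y, [math.exp(-dot(lam, zi)) for zi in Z])
        history.append(deviance(beta, lam, X, Z, y))
        if abs(history[-2] - history[-1]) < tol:
            break
    return beta, lam, history

# Simulated data (illustrative only, not from the paper): the log variance
# increases linearly in x.
random.seed(1)
n = 200
x = [i / (n - 1) for i in range(n)]
X = [[1.0, xi] for xi in x]
Z = X
y = [1.0 + 2.0 * xi + math.exp(0.5 * (-2.0 + 3.0 * xi)) * random.gauss(0.0, 1.0)
     for xi in x]
beta_hat, lam_hat, history = fit(X, Z, y)
print(beta_hat, lam_hat, history)
```

The first entry of `history` is the deviance of the homogeneous model, so the drop to the final entry plays the role of the global test of homogeneity mentioned above. Each outer pass here runs the dispersion step before the mean step, which is one of several equivalent ways to order the alternation.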
We will illustrate with two examples.
3. Examples
We consider first a 3 × 4 factorial design with four replicates, the well-known example in
Box and Cox (1964). The response is the survival time of rats after poisoning, classified by
three types of poison and four treatments. Box and Cox noted that for the untransformed
survival time, within-treatment normality is acceptable but common dispersion and additivity
are not. If a model with common dispersion is fitted, however, the interaction appears only
marginally significant. The Box-Cox transformation points strongly to the reciprocal scale,
and we proceed to model on this scale, using rate = 1/time as response variable.
The general procedure followed is to fit a saturated model for the mean, and use this to find
an appropriate model for the dispersion, beginning with the saturated model and using
backward elimination. When a final model for the dispersion has been found, the mean model
is then simplified in a similar way.
Fitting the saturated models for both mean and dispersion we find that none of the interaction
parameters in the dispersion model is large, and omitting the interaction, i.e. fitting the main
effects model for the dispersion, gives a deviance increase of 5.91 on 6 df. The further omission
of the treatment main effect gives a deviance increase of 3.19 on 3 df. Omitting the poison
type main effect gives an increase of 5.68 on 2 df, and this is almost all concentrated in the
contrast of Type 2 with the other two types: equating the Type 1 and 3 dispersions increases
the deviance by only 0.12 on 1 df, the contrast of Type 2 with the other two increasing it by
5.56 on 1 df.
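To put these deviance increases on a common footing, one can refer each to a chi-square distribution on the stated degrees of freedom. This assumes the usual large-sample approximation for deviance differences between nested models, which the text does not state explicitly. The Python sketch below uses only the standard library; `chi2_sf` and the step labels are names chosen here.

```python
import math

def chi2_sf(x, df):
    """Upper tail P(chi2_df > x) for integer df >= 1, via the standard recurrence."""
    if df % 2 == 1:
        q = math.erfc(math.sqrt(x / 2.0))   # exact for df = 1
        k = 1
    else:
        q = math.exp(-x / 2.0)              # exact for df = 2
        k = 2
    while k < df:
        # Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) exp(-x/2) / Gamma(k/2 + 1)
        q += (x / 2.0) ** (k / 2.0) * math.exp(-x / 2.0) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

# Deviance increases and degrees of freedom as reported in the text.
steps = [
    ("omit interaction", 5.91, 6),
    ("omit treatment main effect", 3.19, 3),
    ("omit poison type main effect", 5.68, 2),
    ("equate Types 1 and 3", 0.12, 1),
    ("Type 2 versus Types 1 and 3", 5.56, 1),
]
p_values = {label: chi2_sf(dev, df) for label, dev, df in steps}
for label, p in p_values.items():
    print(f"{label}: p = {p:.3f}")
```

On this approximation only the Type 2 contrast gives a small tail probability, which is consistent with the dispersion model chosen in the text.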
The final model for the dispersion on the rate scale thus specifies a common variance for
Types 1 and 3 but a larger variance for Type 2 (the GLIM code is $CALC TYP2 = %EQ(TYPE,2)
$FIT TYP2 $).
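In terms of model (2), this dispersion model has a \( z_i \) containing the constant and a single Type 2 indicator. A small Python analogue of the GLIM indicator calculation is sketched below. The factor levels and the values of \( \lambda \) are hypothetical, chosen only to show how the two variances arise.

```python
import math

poison_type = [1, 2, 3, 1, 2, 3]                       # hypothetical factor levels
typ2 = [1.0 if t == 2 else 0.0 for t in poison_type]   # analogue of %EQ(TYPE,2)
Z = [[1.0, t2] for t2 in typ2]                         # constant + Type 2 indicator

lam = [-1.0, 0.8]                                      # hypothetical lambda values
variances = [math.exp(lam[0] * z[0] + lam[1] * z[1]) for z in Z]
print(variances)
```

Types 1 and 3 share the variance \( \exp(\lambda_0) \), while Type 2 has variance \( \exp(\lambda_0 + \lambda_1) \), which is larger whenever \( \lambda_1 > 0 \).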
The parameter estimates for the saturated mean model do not depend on the dispersion
model fitted, but their standard errors do depend on this model. The largest interaction of
-0.9137 is for TYPE(3).TREA(4), and its standard error is 0.490 in the common dispersion
TABLE 1
Cell means for rate, with (fitted values) from main effects common variance model,
and [fitted values] from main effects + D34, Type 2 variance model

                        Treatment (note order)
TYPE (note order)       1          3          4          2
3                       4.803      4.265      3.092      3.029
                        (4.694)    (4.122)    (3.336)    (3.037)
                        [4.758]    [4.181]    [3.092]†   [3.160]
2                       3.268      2.714      1.702      1.393
                        (3.166)    (2.594)    (1.808)    (1.509)
                        [3.072]    [2.495]    [2.039]    [1.474]
TABLE 2
Mean and dispersion models for \( V^{1/3} \), MINITAB tree data
const   H   D   const   H   D   H^2   HD   D^2
4. Discussion
The macros MEAN and VAR do the fitting of the mean and dispersion models respectively.
The model formulae have to be specified by the user in macros MODM and MODV; these
macros allow for offsets which have to be assigned values by the user. If no offsets are to be
fitted, zero values must be specified:
$CALC OFSM = OFSV = 0 $
The response variable to be modelled is specified as an argument to the VMOD macro which
does the overall analysis. VMOD calls an initialising macro SETUP which sets the maximum
number of iterations %N to 15 and defines the weight variate for the first fit. VMOD passes
the y-variate name to a macro DRIVER which alternates MEAN and VAR macros and
calculates the deviance at each iteration. In VAR the gamma fit has CYCLE set to 20 as
convergence may take more than 10 cycles. Iterations cease when successive deviances differ
by less than 0.001 or when 15 iterations have been performed. Output is switched off during
iterations except for the printing of the deviance and iteration number. After convergence
output is switched on again and one further iteration performed, with printing of parameter
estimates and standard errors from mean and dispersion models.
A listing including the GLIM macros is given for the tree data with the log V, log H, log D
model in an Appendix to the paper. The output from each iteration has been suppressed;
successive deviances are -79.8937, -87.4507, -88.3032, -88.3067.
5. Acknowledgement
I am grateful to the referees for suggestions for improving the paper, and to Gordon Smyth
for extensive discussions of computational issues. This paper was revised while I was visiting
the Department of Statistics, University of Tel Aviv, to whom I am grateful for support.
References
Atkinson, A. C. (1982) Regression diagnostics, transformations and constructed variables (with Discussion). J. R.
Statist. Soc. B, 44, 1-36.
Baker, R. J. and Nelder, J. A. (1978) Generalized Linear Interactive Modelling: Release 3. Oxford: Numerical Algorithms
Group.
Box, G. E. P. and Cox, D. R. (1964) An analysis of transformations (with Discussion). J. R. Statist. Soc. B, 26, 211-252.
Chen, C. F. (1983) Score tests for regression models. J. Amer. Statist. Ass., 78, 158-161.
Cook, R. D. and Weisberg, S. (1983) Diagnostics for heteroscedasticity in regression. Biometrika, 70, 1-10.
Pregibon, D. (1984) Book review of P. McCullagh and J. A. Nelder, "Generalized Linear Models". Ann. Statist.,
12, 1589-1596.
Ryan, T. A., Joiner, B. L. and Ryan, B. F. (1976) Minitab Student Handbook. North Scituate, Mass: Duxbury Press.
Smyth, G. K. (1985) Coupled and separable iterations in nonlinear estimation. Ph.D. Thesis, Australian National
University.
Appendix
$UNIT 31
$DATA D H V
$READ
8.3 70 10.3 8.6 65 10.3 8.8 63 10.2 10.5 72 16.4
10.7 81 18.8 10.8 83 19.7 11.0 66 15.6 11.0 75 18.2
11.1 80 22.6 11.2 75 19.9 11.3 79 24.2 11.4 76 21.0
11.4 76 21.4 11.7 69 21.3 12.0 75 19.1 12.9 74 22.2
12.9 85 33.8 13.3 86 27.4 13.7 71 25.7 13.8 64 24.9
14.0 78 34.5 14.2 80 31.7 14.5 74 36.3 16.0 72 38.3
16.3 77 42.6 17.3 81 55.4 17.5 82 55.7 17.9 80 58.3
18.0 80 51.5 18.0 80 51.0 20.6 87 77.0
VARIABLES:
D - DIAMETER (IN INCHES) OF 31 CHERRY TREES
AT A HEIGHT OF 4.5 FEET FROM THE GROUND
$output$!
$endmac!
Mean model
      estimate     s.e.      parameter
  1    -6.390      0.2526    1
  2     1.080      0.06806   LH
  3     1.955      0.02281   LD
scale parameter taken as 1.000
-- model changed
scaled deviance = 45.16 at cycle 11
d.f. = 28