Lerman - 1980 - Fitting Segmented Regression Models by Grid Search-Annotated
Author(s): P. M. Lerman
Source: Journal of the Royal Statistical Society. Series C (Applied Statistics), Vol. 29, No. 1 (1980), pp. 77-84
Published by: Wiley for the Royal Statistical Society
Stable URL: http://www.jstor.org/stable/2346413
Accessed: 20-06-2015 03:40 UTC
This content downloaded from 192.122.237.41 on Sat, 20 Jun 2015 03:40:33 UTC
All use subject to JSTOR Terms and Conditions
Appl. Statist. (1980), 29, No. 1, pp. 77-84
Fitting Segmented Regression Models by Grid Search
By P. M. LERMAN
Department of Applied Biology, University of Cambridge, Cambridge CB2 3DX, England
[Received March 1977. Final revision October 1979]
SUMMARY
A grid-search method of fitting segmented regression curves with unknown transition points is described and compared with a standard method. It is shown to be suitable for a wider class of models than the standard fitting method and to provide as a by-product a way of making reliable inferences about the abscissae of the transitions.
Keywords: MULTIPHASE; SEGMENTED REGRESSION; TRANSITION POINT
1. INTRODUCTION
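WE examine an alternative to the standard method of fitting segmented (sometimes called multiphase) regression models of the type

$$
\begin{aligned}
E(y \mid x) &= f_1(x, \boldsymbol{\beta}_1) && \text{for } X_0 \le x \le X_1 \\
            &= f_2(x, \boldsymbol{\beta}_2) && \text{for } X_1 < x < X_2
\end{aligned}
$$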
4. EXAMPLES
A few case studies will illustrate the grid-search technique and bring out various practical points. In the first study we look at a special case of the "two intersecting straight lines" model for which Hudson's fitting technique is unsuitable. The second study considers a model with a non-linear sub-model. Finally we look at a three-phase model, and see how to proceed with more than two segments.
Example 1. Shoot-apex development in cereals
Fig. 1(a) presents data from an experiment on wheat performed by Dr E. J. M. Kirby of the Plant Breeding Institute, Cambridge (Kirby, 1974). The independent variable, x, is number of days since sowing, and y is the natural logarithm of the number of primordia, a measure of shoot-apex development. Two intersecting straight lines represent the relationship between x and y quite well; that is

$$
\begin{aligned}
E(y \mid x) &= \alpha_1 + \beta_1 x && \text{for } x < X_1 \\
            &= \alpha_2 + \beta_2 x && \text{for } x > X_1
\end{aligned}
\qquad (7)
$$

subject to $\alpha_1 + \beta_1 X_1 = \alpha_2 + \beta_2 X_1$. If no other constraints are placed upon the model, it can be fitted straightforwardly by Hudson's technique. However, there are good biological grounds for believing that the transition occurs at an identifiable stage in the plant's development, namely at the end of spikelet initiation. Furthermore, an accurate estimate of the ordinate of this point is
[FIG. 1. (a) y against days since sowing, x; (b) $S_m(X_1)$ against $X_1$.]
FIG. 2. (a) Data points and best-fitting line; (b) relation between $S_m(X_1)$ and $X_1$, for Example 2.
FIG. 3. (a) Data points and best-fitting line; (b) relations between $S_m(X_1)$ and $X_1$, shown as a solid line, and between $S_m(X_2)$ and $X_2$, shown as a broken line, for Example 3.
FIG. 4. Contours of $S_m(X_1, X_2)$ for Example 3. Values of $S_m$ defining the contours are (1) 0.208, (2) 0.282, (3) 0.403, (4) 0.673, (5) 1.5, (6) 2.5. Contours 1, 2, 3 and 4 are the boundaries of an 80, 95, 99 and 99.9 per cent confidence region, respectively, for $(X_1, X_2)$.
curve is shown in Fig. 3(a). Fig. 3(b) displays plots of approximations to $S_m(X_1)$ and $S_m(X_2)$, the minimum values of S for fixed $X_1$ and $X_2$ respectively. These approximations were obtained from the map of $S_m(X_1, X_2)$, and from Fig. 3(b) the 95 per cent confidence regions for $X_1$ and $X_2$ separately are obtained as those values satisfying $S_m(X_i) \le 0.227$ (i = 1, 2), namely

$$1.25 < X_1 < 1.68 \quad \text{and} \quad 1.72 < X_2 < 2.35.$$
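Reading a confidence region off a gridded profile can be sketched in a few lines. This is an illustration only, not the author's program: the function name `profile_interval` and all grid and profile values are hypothetical, and only the cut-off 0.227 is taken from the text. The sketch reports the outermost grid values at which the profile does not exceed the cut-off.

```python
def profile_interval(grid, profile, cutoff):
    """Return (low, high): the smallest and largest grid values whose
    profile value S_m does not exceed the cutoff, or None if none does."""
    inside = [x for x, s in zip(grid, profile) if s <= cutoff]
    if not inside:
        return None
    return min(inside), max(inside)


# Hypothetical profile of S_m(X_1) on a coarse grid (illustrative numbers only).
grid = [1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8]
profile = [0.300, 0.240, 0.215, 0.200, 0.205, 0.220, 0.235, 0.290]

interval = profile_interval(grid, profile, cutoff=0.227)
print(interval)
```

The end points are only as precise as the grid spacing, and if the profile has several local minima the set of accepted values need not be a single interval, so the plotted profile should be inspected as well.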
5. DISCUSSION
Calculating $S_m(X)$
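If all $f_i$ in the model (1) are linear, an efficient way of performing the necessary calculations is to fit the model first without imposing constraints (2), and then to use Lagrangian methods to obtain the adjustments to the residual sum of squares, and to the values of $\boldsymbol{\beta}_1, \ldots, \boldsymbol{\beta}_r$, when continuity at a specific X is imposed. The method is efficient because the same unconstrained fit will apply to a number of X-values on the grid. Thus, it need only be carried out once for those X, and several values of $S_m(X)$ may be obtained by the relatively quick constraining process. The necessary calculations are as follows. If the least-squares parameter estimates and residual sum of squares for a linear model $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{e}$ are denoted by $\hat{\boldsymbol{\beta}}$ and $S$, then the adjusted values when constraints $H\boldsymbol{\beta} = \mathbf{c}$ are imposed are (Plackett, 1960, p. 53)

$$
\begin{aligned}
\tilde{\boldsymbol{\beta}} &= \hat{\boldsymbol{\beta}} - (\mathbf{X}'W\mathbf{X})^{-1} H' \left[ H (\mathbf{X}'W\mathbf{X})^{-1} H' \right]^{-1} (H\hat{\boldsymbol{\beta}} - \mathbf{c}), \\
\tilde{S} &= S + (H\hat{\boldsymbol{\beta}} - \mathbf{c})' \left[ H (\mathbf{X}'W\mathbf{X})^{-1} H' \right]^{-1} (H\hat{\boldsymbol{\beta}} - \mathbf{c}),
\end{aligned}
\qquad (10)
$$

where $W$ is the diagonal matrix of weights ($W_{jj} = w_j$).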
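The adjustment in (10) can be sketched numerically for two straight lines joined at a trial transition point. This is a minimal sketch under stated assumptions, not the author's implementation: the function names `unconstrained_fit` and `impose_constraints`, the data, the unit weights and the trial join points are all hypothetical. The unconstrained fit is done once and reused for every trial join lying between the same pair of neighbouring observations.

```python
import numpy as np


def unconstrained_fit(X, y, w):
    """Weighted least squares: return beta_hat, S and (X'WX)^{-1}."""
    W = np.diag(w)
    A_inv = np.linalg.inv(X.T @ W @ X)
    beta_hat = A_inv @ X.T @ W @ y
    resid = y - X @ beta_hat
    return beta_hat, float(resid @ W @ resid), A_inv


def impose_constraints(beta_hat, S, A_inv, H, c):
    """Adjust beta_hat and S for the linear constraints H @ beta = c."""
    gap = H @ beta_hat - c                        # H beta_hat - c
    lam = np.linalg.solve(H @ A_inv @ H.T, gap)   # [H (X'WX)^{-1} H']^{-1} gap
    beta_adj = beta_hat - A_inv @ H.T @ lam
    S_adj = S + float(gap @ lam)
    return beta_adj, S_adj


# Hypothetical data; observations with x <= 4 belong to the first segment.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([1.1, 1.9, 3.2, 3.9, 4.4, 5.1, 5.4, 6.0])
first = x <= 4.0

# Block design matrix with columns (alpha_1, beta_1, alpha_2, beta_2).
X = np.column_stack([first, first * x, ~first, ~first * x]).astype(float)
w = np.ones_like(y)                               # unit weights (assumption)

beta_hat, S, A_inv = unconstrained_fit(X, y, w)   # carried out once

for join in (4.2, 4.5, 4.8):                      # trial joins between x = 4 and x = 5
    # Continuity: alpha_1 + beta_1 * join = alpha_2 + beta_2 * join.
    H = np.array([[1.0, join, -1.0, -join]])
    c = np.array([0.0])
    beta_adj, S_adj = impose_constraints(beta_hat, S, A_inv, H, c)
    print(join, round(S, 4), round(S_adj, 4))
```

Only the small system involving `H` changes from one trial join to the next, which is the saving the paragraph describes.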
If any of the $f_i$ are non-linear, even when X is fixed, the value of $S_m(X)$ will usually need to be determined by iterative minimization methods. In such cases the usual problems associated with non-linear regression will be encountered. Ross (1970) gives a good discussion of the efficient use of iterative methods in non-linear least squares.
The imposition of continuity constraints with non-linear $f_i$ can be handled in a variety of ways (see Adby and Dempster, 1974, Chapter 5). A simple method is to reduce the problem to an unconstrained minimization by incorporating continuity directly into the model (see Example 2 above), but this is not always possible.
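Building continuity into the model can be illustrated with the simplest case, two straight lines, rather than with the model of Example 2. The sketch below assumes the parametrization $E(y \mid x) = \gamma + \beta_1 \min(x - X_1, 0) + \beta_2 \max(x - X_1, 0)$, which is continuous at $X_1$ by construction and linear in its parameters once $X_1$ is fixed. The function names `s_m` and `grid_search`, the synthetic data and the grid are hypothetical choices made for the illustration.

```python
import numpy as np


def s_m(x, y, join):
    """Minimum residual sum of squares for two lines meeting at `join`."""
    design = np.column_stack([
        np.ones_like(x),
        np.minimum(x - join, 0.0),   # slope acting only to the left of the join
        np.maximum(x - join, 0.0),   # slope acting only to the right of the join
    ])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)


def grid_search(x, y, grid):
    """Evaluate S_m at every grid value; return the best join and the profile."""
    profile = np.array([s_m(x, y, g) for g in grid])
    return float(grid[np.argmin(profile)]), profile


# Synthetic, noise-free data with a true join at x = 5 (illustrative only).
x = np.linspace(0.0, 10.0, 21)
y = 1.0 + 0.5 * np.minimum(x - 5.0, 0.0) + 2.0 * np.maximum(x - 5.0, 0.0)

grid = np.linspace(2.0, 8.0, 25)     # spacing 0.25, so 5.0 is a grid value
best, profile = grid_search(x, y, grid)
print(best, profile.min())
```

With a non-linear sub-model the call to `np.linalg.lstsq` would be replaced by an iterative minimization at each grid value, while the surrounding grid loop would stay the same.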
Displaying $S_m(X)$
The shape of $S_m(X)$ is easily apprehended if it is displayed graphically rather than in tabular form. This presents no difficulties if the number of transition points is one or two; $S_m(X_1)$ can be plotted against $X_1$ if there is only one transition (Figs. 1(b), 2(b)), and if there are two, a contour map of $S_m(X_1, X_2)$ can be produced (Fig. 4). For three or more breaks it is simplest to display lower-dimensional extracts of $S_m(X)$. A sensible choice is $S_m(V)$, the minimum value of S for the fixed value of a subvector, V, of transition points (Fig. 3(b)), since it is the function required for deriving confidence regions for V. The value of $S_m(V)$ can be approximated from the map of $S_m(X)$ provided the grid is suitably fine and spans the required minimum.
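Approximating $S_m(V)$ from the map amounts to minimizing the gridded values over the remaining transition points. A minimal sketch for two transition points follows; the function name `profiles_from_map`, the grids and the map values are hypothetical and chosen only to show the operation.

```python
def profiles_from_map(s_map):
    """Given s_map[i][j] = S_m(X1_grid[i], X2_grid[j]), return the two
    one-dimensional profiles obtained by minimizing over the other axis."""
    profile_x1 = [min(row) for row in s_map]           # minimize over X2
    profile_x2 = [min(col) for col in zip(*s_map)]     # minimize over X1
    return profile_x1, profile_x2


# Hypothetical 3 x 4 map of S_m(X1, X2); the numbers are illustrative only.
x1_grid = [1.2, 1.4, 1.6]
x2_grid = [1.8, 2.0, 2.2, 2.4]
s_map = [
    [0.31, 0.26, 0.25, 0.30],
    [0.27, 0.21, 0.22, 0.28],
    [0.33, 0.24, 0.26, 0.35],
]

profile_x1, profile_x2 = profiles_from_map(s_map)
print(profile_x1)
print(profile_x2)
```

Each profile value is an upper bound on the true minimum over the other transition point, which is why the grid needs to be fine and to span the minimum.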
Final refinement of $\hat{X}$
Since $\hat{X}$ may not be a stationary point of S, its location from the approximation revealed by the map is best achieved by a technique which does not employ derivatives. Linear search (Adby and Dempster, 1974) is suitable if there is only one transition point, and pattern search (Powell, 1964; Adby and Dempster, 1974) for two or more breaks.
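One derivative-free line search that could serve for a single transition point is golden-section search. The text does not say which linear search was used, so the choice here is an assumption, as are the function name `golden_section`, the kinked test profile and the bracket taken from neighbouring grid values.

```python
import math


def golden_section(f, low, high, tol=1e-6):
    """Minimize a unimodal function f on [low, high] without derivatives."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0          # about 0.618
    a, b = low, high
    c = b - ratio * (b - a)
    d = a + ratio * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:                                # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - ratio * (b - a)
            fc = f(c)
        else:                                      # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + ratio * (b - a)
            fd = f(d)
    return (a + b) / 2.0


# Hypothetical profile with a kink (a non-stationary minimum) at 1.47.
def profile(x1):
    slope = 0.5 if x1 < 1.47 else 0.8
    return 0.2 + slope * abs(x1 - 1.47)


# Bracket formed by the grid values on either side of the best grid value.
refined = golden_section(profile, 1.4, 1.6)
print(round(refined, 4))
```

The search needs only function values, so it is unaffected by the kink at the minimum, but it does assume the profile is unimodal within the bracket.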
REFERENCES
ADBY, P. R. and DEMPSTER, M. A. H. (1974). Introduction to Optimization Methods. London: Chapman and Hall.
BACON, D. W. and WATTS, D. G. (1971). Estimating the transition between two intersecting straight lines. Biometrika, 58, 525-534.
BEALE, E. M. L. (1960). Confidence regions in non-linear estimation. J. R. Statist. Soc. B, 22, 41-81.
DRAPER, N. R. and SMITH, H. (1966). Applied Regression Analysis. London: Wiley.
DUNICZ, B. L. (1969). Discontinuities in the surface structure of alcohol-water mixtures. Sond. aus der Kolloid-Zeitschrift und Zeitschrift für Polymere, 230, 346-357.
FEDER, P. I. (1975a). The log likelihood ratio in segmented regression. Ann. Statist., 3, 84-97.
FEDER, P. I. (1975b). On asymptotic distribution theory in segmented regression problems - identified case. Ann. Statist., 3, 49-83.
HAWKINS, D. M. (1976). Point estimation of the parameters of piecewise regression models. Appl. Statist., 25, 51-57.
HINKLEY, D. V. (1969). Inference about the intersection in two-phase regression. Biometrika, 56, 495-504.
HINKLEY, D. V. (1971). Inference in two-phase regression. J. Amer. Statist. Ass., 66, 736-743.
HUDSON, D. J. (1966). Fitting segmented curves whose join points have to be estimated. J. Amer. Statist. Ass., 61, 1097-1129.
KIRBY, E. J. M. (1974). Ear development in spring wheat. J. agric. Sci., Camb., 82, 437-447.
OWEN, J. B., DAVIES, D. A. R. and RIDGMAN, W. J. (1969). The control of voluntary food intake in ruminants. Animal Production, 11, 511-520.
PLACKETT, R. L. (1960). Principles of Regression Analysis. London: Oxford University Press.
POWELL, M. J. D. (1964). An efficient method for finding the minimum of a function of several variables without calculating derivatives. Computer J., 7, 155-162.
ROSS, G. J. S. (1970). The efficient use of function minimization in non-linear maximum likelihood estimation. Appl. Statist., 19, 205-221.
SINGPURWALLA, N. D. (1974). Estimation of the join point in a heteroscedastic regression model arising in accelerated life tests. Commun. Statist., 3 (9), 853-863.
SPRENT, P. (1961). Some hypotheses concerning two-phase regression lines. Biometrics, 17, 634-645.
WILLIAMS, D. A. (1970). Discrimination between regression models to determine the pattern of enzyme synthesis in synchronous cell cultures. Biometrics, 26, 23-32.