Experiments in Modelling Recent Danish Fertility Curves: Demography, Volume 18, Number 2, May 1981
Jan M. Hoem
Dan Mad ....
Jørgen Løvgreen Ni.l....
Else-Marie Ohl....
Laboratory of Actuarial Mathematics, University of Copenhagen, Universitetsparken 5,
DK-2100 Copenhagen, Denmark
Bo Rennermalm
National Central Bureau of Statistics, Stockholm, Sweden
for five-year age groups. Compare, e.g., Hansen (1976), Brouard (1977), McNeil et al. (1977), and Rogers et al. (1978). It can also be applied for the purpose of graduating curves of rates subject to substantial random variation, as has been demonstrated in a series of analyses of regional fertility in all three Scandinavian countries (Berge, 1974b, 1977; Holmbeck, 1975; Hoem and Torpe, 1977; Rennermalm and Strandberg, 1978; Woll, 1978), where, for purposes of description and comparison, it has been important to bring out the structure of the underlying, "true" curves by removing the effects of random fluctuations. Compare Hoem (1976, p. 183). We know from the general theory of analytic graduation (Hoem, 1976, Sec. 3) that in the latter case, the most efficient curve fitting procedure is that of weighted least squares, with weights equal to the reciprocals of the estimated (asymptotic) variances of the "raw" (ungraduated) rates, or something equivalent. Moment methods are less efficient, but they are valuable in that they frequently provide good starting values for the iterative numerical algorithms needed to find the least squares fits.

In the situation which we focus on, random variation is negligible and the weights described above are not really appropriate because the variances are essentially zero. If standard functions are fitted by the best graduation method anyway, with "variances" estimated in the usual manner, the fit is occasionally nice, but systematic deviations can be seen in most data sets we know of, both our own and those of others (Berge, 1974a; Holmbeck, unpublished Swedish data; Hoem, 1976, Figure 9). The reason must be that the weighting procedure usually gives too much attention to the low fertility in the tails, particularly at the high ages in the upper tail, and too little attention to the high fertility ages in the middle. No general results are available concerning the optimality of any single fitting procedure when there is no random variation, so one is essentially free to choose any reasonable criterion, in the manner of Coale and Trussell (1978). We have settled for the unweighted least squares method because it has well-known, nice mathematical properties, because curve plots reveal good fits, and because ready-made computer programs based on it have been available to us.

The account below summarizes our main findings in the next section and explains each of our experiments in two subsequent sections. We give relevant references to the literature on fertility curve fitting as we go along. Additional references have been given above and by Hoem (1972, Sect. 1.2), Hoem and Berge (1974, 1975), and Coale and Trussell (1978, p. 203). Some further references are Duchene and Gillet-de Stefano (1974), Berge and Hoem (1975), Suchindran et al. (1977), Wunsch (1980), and Bloom (1980). Additional curve plots for our data have appeared in a working paper from this investigation.

GENERAL DEFINITIONS AND MAIN FINDINGS

Generalities

A fertility curve is a plot of a set of, say, age-specific empirical fertility rates $f_x$. We assume that such rates are available for each of a number of single-year age groups. Invariably, the parametric function which is to be fitted to the empirical curve can be written in the form

$$g(x; R, \theta_2, \ldots, \theta_r) = R\,h(x; \theta_2, \ldots, \theta_r), \tag{1}$$

where $h(\cdot\,; \theta_2, \ldots, \theta_r)$ is a probability density function on the real line, with some $r - 1$ parameters $\theta_2, \ldots, \theta_r$, while $R\,(= \theta_1)$ is an rth parameter representing the model total fertility rate (or the gross reproduction rate, as the case may be). In many cases, $h(x) = h(x; \theta_2, \ldots, \theta_r)$ will be defined so as to be positive only for x in the fertile age span, as when $h(\cdot)$ is the beta density, but provided $h(x)$ is small enough outside of that span, it need not be confined in such a manner, as it is not when $h(\cdot)$ is the gamma density. A least squares
fit results from minimizing

$$Q(\theta) = \sum_x \{f_x - g(x; \theta)\}^2$$

with respect to $\theta = (R, \theta_2, \ldots, \theta_r)$. In most cases, the minimization must be made on a computer.

Note that we count R as a parameter to be fitted and that $Q(\theta)$ is minimized with respect to R as well as with respect to the other parameters. Thus, we do not first estimate R by $R^* = \sum_x f_x$ and then minimize

$$Q(R^*, \theta_2, \ldots, \theta_r) = \sum_x \{f_x - R^* h(x; \theta_2, \ldots, \theta_r)\}^2$$

with respect to the other parameters. Some trial calculations suggest that the latter method need not give much loss of fit, but there is no need to sacrifice accuracy since the work is done by computer anyway. There is no need to restrict oneself to $R^*$ as an estimator of the model parameter R. On the other hand, $R^*$ provides a convenient moment estimate as a starting value for the iterative computer algorithm which produces the least squares estimates $\hat{R}, \hat{\theta}_2, \ldots, \hat{\theta}_r$.

Note that minimizing $Q(R^*, \theta_2, \ldots, \theta_r)$ with respect to $(\theta_2, \ldots, \theta_r)$ is equivalent to minimizing $\sum_x \{f_x^* - h(x; \theta_2, \ldots, \theta_r)\}^2$ for the "normed" rates $f_x^* = f_x / R^*$, as some authors do. Note also that $g(\cdot)$ is an intensity function, not a density, even though its form is represented via the density function $h(\cdot)$ in (1).

Throughout, $Q(\hat{\theta})$ is our measure of the quality of the fit. If we had been more concerned about intertemporal comparability of the fit to the age structure of fertility, we would have reported $Q(\hat{\theta}) / \hat{R}^2$ instead.

Some authors like to specify age $x + \tfrac{1}{2}$ as the one to which the fertility rate $f_x$ is said to refer. One might then write $h(x + \tfrac{1}{2})$ for $h(x)$ in (1). Mathematically speaking, it is more convenient to absorb the "$+\tfrac{1}{2}$" element into the function symbol, as we have done, but this convention should be kept in mind when particular formulas are specified for $h(x)$ and when the results are interpreted. In the empirical parts below, half a year should be added to any age designation to get "real" age.

Data

We have fitted a number of functions to the age-specific fertility rates for ages 15 to 46, inclusive, for Denmark for each of the calendar years 1962 to 1971, as reported by Danmarks Statistik (1973, Table 5). These rates constitute a convenient basis for our purposes, for we have been interested mostly in the analysis of modern Scandinavian data, and they provide an opportunity to check the usefulness of some procedures suggested by Brass (1974 and elsewhere). The age structure of Danish fertility has changed only moderately over the period in question, but after a stable period of almost twenty years, the total fertility rate started falling dramatically in 1967 and quickly fell to well below replacement level. Compare Columns 1 to 3 in Table 1, which lists $R^* = \sum_x f_x$, $M = \sum_x x f_x / R^*$, and S for $S^2 = \sum_x x^2 f_x / R^* - M^2$ for each of the years in question.

Strictly speaking, we are reporting only on the outcome of experiments with a rather limited data set. However, we believe that the tenor of our main results will hold up when similar computations are carried out on a broader scale, for our findings correspond well with all of our earlier experience, both with Scandinavian and non-Scandinavian data, and with whatever similar comparisons we have found in the literature, so they cement previous more fragmentary impressions.

Main findings

For our experiments, we have selected a number of fitting functions, some because they have shown promise in previous work, and others because they have appeared to open new avenues. The functions can be divided into two groups:

(1). For various choices of the function $h(\cdot)$ in (1), we have selected the gamma density, the beta density, the Hadwiger
Table 1

            Raw Parameters                  10^6 Q(θ̂) (a) for Various Functions g
                    Mean  Standard              Coale-                                   Polynomial               Brass
           TFR   Age (b) Deviation    Spline  Trussell     Gamma  Hadwiger      Beta         1         2     logit   log log
Year       (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)      (10)      (11)      (12)
1962      2.54     26.36      5.60     140.6     274.5     269.7     347.5    1401.9     497.8    1151.3    1762.4    1834.9
1963      2.64     25.37      5.58      74.1     260.9     263.5     378.2    1281.3     428.7    1057.5    1661.6    1720.8
1964      2.60     26.27      5.56      73.3     201.6     217.0     325.6    1243.1     636.8     953.5    1391.1    1469.7
1965      2.61     26.27      5.55     128.4     204.0     297.6     438.4    1174.6     460.8    1070.4    1467.8    1556.5
1966      2.61     26.14      5.50      56.8     186.4     206.5     361.3    1064.9     575.3     961.6    1078.1    1210.9
1967      2.35     26.01      5.49      91.6     230.6     214.5     319.3     936.5     346.3     780.9     772.3     943.6
1968      2.12     26.00      5.48      65.4     242.6     206.5     302.7     596.9     169.1     402.9     442.7     517.5
1969      2.00     26.09      5.40      86.2     330.7     227.7     271.3     631.7     321.5     394.0     420.4     440.0
1970      1.95     26.22      5.28      47.0     288.9     192.5     174.1     905.2     517.0     531.0     534.1     507.9
1971      2.04     26.21      5.07      60.8     270.6     177.9     149.9     947.2     494.7     761.9     603.6     485.4
Number of parameters fitted               10         4         4         4         5         6         4         3         3

(a) Computed as described in Section 2B.
(b) Add 0.5 to get "real" age.
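Columns 1 to 3 of Table 1 are plain moments of the rate schedule, so they can be reproduced in a few lines once the single-year rates are at hand. The following is a minimal sketch, not the authors' program: the function name `raw_moments` and the bell-shaped rate vector are hypothetical stand-ins, since the Danish rates themselves are not reproduced in this text.

```python
import math

def raw_moments(ages, rates):
    """Return (R*, M, S) for age-specific rates f_x indexed by age x."""
    r_star = sum(rates)                                           # R* = sum of f_x
    mean_age = sum(x * f for x, f in zip(ages, rates)) / r_star   # M
    second = sum(x * x * f for x, f in zip(ages, rates)) / r_star
    sd = math.sqrt(second - mean_age ** 2)                        # S, from S^2 = sum x^2 f_x / R* - M^2
    return r_star, mean_age, sd

# Hypothetical bell-shaped rates for ages 15..46 (NOT the Danish data).
ages = list(range(15, 47))
rates = [0.18 * math.exp(-0.5 * ((x - 26.0) / 5.5) ** 2) for x in ages]

r_star, mean_age, sd = raw_moments(ages, rates)
print(f"R* = {r_star:.3f}  M = {mean_age:.2f}  S = {sd:.2f}")
# Under the paper's age convention, half a year is added to get "real" age.
print(f"mean in 'real' age: {mean_age + 0.5:.2f}")
```

With the actual rates from Danmarks Statistik (1973, Table 5) in place of the synthetic vector, the three printed values would correspond to Columns 1 to 3 of the table.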
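The difference between fixing R at the moment estimate R* and treating R as a free parameter can be seen in isolation. For a fixed density shape h, the criterion Q is a quadratic in R, so its minimizing value has a closed form. The sketch below illustrates only this one step and is not the iterative algorithm used in the paper; the normal density used as the shape and the skewed rate vector are hypothetical choices made for the illustration.

```python
import math

def q(rates, h_vals, r):
    """Q = sum over ages of (f_x - R * h(x))**2 for a fixed density shape h."""
    return sum((f - r * h) ** 2 for f, h in zip(rates, h_vals))

def best_r(rates, h_vals):
    """R minimising Q for fixed h: setting dQ/dR = 0 gives sum(f*h) / sum(h*h)."""
    return sum(f * h for f, h in zip(rates, h_vals)) / sum(h * h for h in h_vals)

def normal_pdf(x, mu, sigma):
    """Stand-in density shape (hypothetical; not one of the paper's functions)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

ages = list(range(15, 47))
# Hypothetical right-skewed rates (NOT the Danish data).
rates = [0.02 * (x - 14) ** 2 * math.exp(-(x - 14) / 4.0) for x in ages]
h_vals = [normal_pdf(x, 26.0, 5.5) for x in ages]

r_star = sum(rates)             # moment estimate R*
r_hat = best_r(rates, h_vals)   # least squares value of R for this h
print(f"R*    = {r_star:.4f}  Q = {q(rates, h_vals, r_star):.6f}")
print(f"R-hat = {r_hat:.4f}  Q = {q(rates, h_vals, r_hat):.6f}")
```

Because R-hat minimizes the quadratic, the second Q can never exceed the first. How large the gap is depends on the data and on the shape, which is consistent with the remark that fixing R at R* "need not give much loss of fit".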
density, the density essentially specified by Coale and Trussell, and densities essentially defined by Brass's logit functional relation and log log relational models. Among these, the Coale-Trussell procedure and the gamma density come out about equal as the best in terms of accuracy, both fitting the data very well. (Compare Columns 5 and 6 in Table 1.) Coale and Trussell based their function on extensive data experiments and on some demographic theory. Although we find it a bit elusive, they also offer an interpretation of their parameters, perhaps something which may be of value in fertility forecasts. For these reasons, we believe that their procedure frequently may be preferred to the gamma density (and others) in future research.

The Hadwiger function seems next best (Column 7 of Table 1). Judging from our data (and by previous experience), its fit is not as reliably close as are the two best functions, but it is very good most of the time, and the very best in its class on occasion. Cf. the $Q(\hat{\theta})$ of $1.5 \cdot 10^{-4}$ for 1971 in Table 1. We would not hesitate to use a conveniently available computer program for Hadwiger graduation in a situation where producing one based on the Coale-Trussell procedure or the gamma density were bothersome.

These three functions give much better fits than the beta density and the two Brass procedures. Brass never seems to have suggested that his procedures would give more than a "fair description of observed distributions" (Brass, 1978, p. 8), which they do in our case as well, so this part of our conclusion corresponds to his findings.

It seems more surprising that the beta density fares so badly. Several authors using fitting methods inferior to ours have apparently been largely satisfied with fits comparable to ours. This is true in spite of the fact that we suspect that our computer algorithm may not have given us the globally minimizing parameter values but may have reached some local minimum instead. In view of the greatly positive results found with other fitting functions, we have refrained from pursuing this question further.

(2). Our second group of fitting functions consists of two sixth-degree polynomials (with different numbers of free parameters) and a cubic spline function with three free knots. To fit these functions, it is not convenient to split off a multiplicative parameter R as in (1), and we have not done so. Our report on the polynomials can be made brief (Columns 9 and 10 in Table 1). One polynomial, which has six free parameters, gives some very nice fits, but it cannot compete with the four-parameter Hadwiger function in terms of accuracy, and it has turned out to be much less convenient numerically. Two of its parameters can be interpreted as the lower and upper limits of the fertile age span, and the other polynomial results from fixing these at reasonable values, in our case at ages 15 and 47. The resulting four-parameter polynomial gave much less satisfactory fits, frequently on a par with the Brass procedures.

So long as we only compare least squares accuracy, curve plots, and similar measures of fit, the cubic spline comes out much better than all competitors (Column 4 of Table 1). Its worst fit (for the year 1962) is even better than the best of the fits of any of the other functions (Hadwiger, for the year 1971).

Results of this character are what we could expect on the basis of well-known properties of spline functions, including the fact that the one we have used has as many as ten parameters (two for each knot and four others). Probably without much loss in accuracy, this number could be reduced by three by fixing the knots at convenient reasonable ages (say 20, 24, and 29), for the fitted knots do not vacillate much over the years, and the exact position of the knots is not very important anyway. Even so, the function would have seven free parameters, none of which has a substantive interpretation, properties which may not be to the liking of many demographers. Nevertheless,
when descriptive power is more important than (a possibly elusive) parameter interpretability, the spline seems to be a very useful instrument, as it has proved to be in many other situations.

Experiments at fitting the Gompertz curve proved unpromising and were quickly abandoned.

SOME PROPER DENSITY FUNCTIONS

The Hadwiger function

It is convenient to start the review of our experiments with the density named for Hadwiger (1940) who first proposed its use for the present purpose,

$$h(x) = \frac{H}{T\sqrt{\pi}} \left(\frac{T}{x - D}\right)^{3/2} \exp\left\{-H^2\left(\frac{T}{x - D} + \frac{x - D}{T} - 2\right)\right\} \quad \text{for } x > D. \tag{2}$$

1971; Jagers, 1974; Hoem and Holmbeck, 1975), as is also the case for similar parameters of all the densities below. The original parameters (H, T, and D) have no demographic interpretation. In particular, D is not even the lower age of fertility (Hoem and Berge, 1975).

As is illustrated in Figure 1, the Hadwiger function $g(x) = R\,h(x)$ fitted by least squares gives a good representation of our empirical fertility curves over most of the age range in question. By Column 7 of Table 1, the fit of the Hadwiger function for the year in Figure 1 (1965) is less good than for any of the other years. The diagram demonstrates a common problem with the Hadwiger fits, namely, the heavy upper tail of the fitted function.

The gamma density

We have used the following representation of the gamma density:

$$h(x) = \frac{1}{\Gamma(b)\,c^{b}}\,(x - d)^{b-1} \cdots$$
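Moment estimates give convenient starting values for an iterative least squares fit of the gamma density. The sketch below rests on an assumption that the source does not confirm, because the tail of the formula is not legible here: that the representation is the usual shifted gamma density with factor exp{-(x - d)/c}, so that the mean is d + bc and the variance is bc². The function name `gamma_start_values` and the choice d = 15 are likewise hypothetical.

```python
def gamma_start_values(mean_age, sd, d):
    """Moment starting values (b, c) for a gamma density shifted to start at age d.

    Assumes mean = d + b*c and variance = b*c**2 (standard shifted gamma).
    """
    shift = mean_age - d       # equals b * c under the assumption
    b = (shift / sd) ** 2      # shape
    c = sd ** 2 / shift        # scale
    return b, c

# Raw 1962 moments from Table 1 (M = 26.36, S = 5.60); d = 15 is an assumed lower limit.
b0, c0 = gamma_start_values(26.36, 5.60, 15.0)
print(f"b0 = {b0:.3f}, c0 = {c0:.3f}")
```

These values would only seed the minimization of Q; the least squares estimates themselves generally differ from the moment values.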
[Two figures omitted: curve plots against age, horizontal axis 15 to 45; captions not recoverable.]
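The Hadwiger density (2) is straightforward to evaluate, and a numerical check shows two of its properties: it integrates to one, and under this parameterisation its mean works out to D + T. The following is a minimal sketch with hypothetical parameter values (they are not fitted to the Danish data); the helper `trapezoid` is a plain trapezoid rule written out here to keep the example self-contained.

```python
import math

def hadwiger(x, H, T, D):
    """Hadwiger density (2); zero at or below the location parameter D."""
    u = x - D
    if u <= 0.0:
        return 0.0
    return (H / (T * math.sqrt(math.pi))) * (T / u) ** 1.5 \
        * math.exp(-H * H * (T / u + u / T - 2.0))

def trapezoid(func, lo, hi, n):
    """Composite trapezoid rule with n equal steps."""
    step = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, n):
        total += func(lo + i * step)
    return total * step

H, T, D = 1.4, 11.4, 15.0   # hypothetical values, chosen only for illustration

area = trapezoid(lambda x: hadwiger(x, H, T, D), D, D + 200.0, 20000)
mean = trapezoid(lambda x: x * hadwiger(x, H, T, D), D, D + 200.0, 20000)
above_47 = 1.0 - trapezoid(lambda x: hadwiger(x, H, T, D), D, 47.0, 20000)

print(f"area = {area:.6f}, mean = {mean:.4f}, D + T = {D + T:.4f}")
print(f"share of the density above age 47: {above_47:.4f}")
```

The last line gives a rough feel for the upper tail mentioned in the text: the density is positive at all ages above D, so some of its mass always falls beyond the fertile age span.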
Table 2. Fitted Values of the Range (α, β) of Ages of Positive Fertility for the Beta Density and for a Polynomial Representation

                   α                       β
            Beta   Polyn. 1        Beta   Polyn. 1
Year         (1)        (2)         (3)        (4)
1962       16.55      15.30       45.93      44.59
1963       16.52      15.25       45.96      44.58
1964       16.44      14.43       45.93      45.50
1965       16.52      15.31       45.91      44.53
1966       15.87      15.42       45.99      44.91
1967       15.75      14.91       46.02      44.57
1968       15.63      15.07       46.04      44.64
1969       15.44      15.24       46.04      45.86
1970       15.31      14.89       46.04      46.76
1971       15.63      15.61       46.07      44.64

Add half a year to get "real" age.

Table 3. Parameter Values for 1962 for the Beta Density

                  Raw   Standard fit   Refined fit
Parameter         (1)            (2)           (3)
R              2.5436         2.4762        2.5652
α                 15-          16.55         14.95
β                 46+          45.93           204
ν               26.35          26.09         26.58
τ²              31.37          26.48         36.00
10^6 LSQ                      1401.9         264.9

Add half a year to get "real" age for α, β, and ν.

$$p(x) = (x - 15)(47 - x)^2 \sum_{k=0}^{3} a_k x^k \quad \text{for } 15 \le x < 47, \tag{3}$$
dure would then reduce the sum of squares of deviations further by between a tenth and a third.

fitted this by weighted least squares to our curve for 1969, and let

$$H_0(x) = \frac{1}{R} \int_{15}^{x} p(s)\,ds.$$

$$H(x; a, b) = \Psi\{a + b\,\Psi^{-1}[H_0(x)]\}$$

552-553) to suggest his log log model. For a = 1, $h(\cdot)$ is the Gumbel density (Johnson and Kotz, 1970, Chapter 21). Initial attempts for this particular case fit our data badly, and the only moderate

for $\alpha < x < \beta$. (5)

When α and β are fixed at natural values, as Brass probably intended and as we have done in (3) above, $g(x; a_0, a_1, a_2, a_3)$

Figure 3. The Brass Log Log Density Fitted to the Danish Fertility Curve for 1962 by Ordinary Least Squares

[Table fragments: only the column headings survive. One table has columns Year, R, a_0, k, m; another has columns a and b, each split into logit and log log.]
Schedules: Variations in the Age Structure of Childbearing in Human Populations. Population Index 40:185-258.
---. 1978. Technical Note: Finding the Two Parameters that Specify a Model Schedule of Marital Fertility. Population Index 44:203-213.
Danmarks Statistik. 1973. Befolkningens bevægelser 1971. Statistiske Meddelelser 1973:10.
Duchene, J. and S. Gillet-de Stefano. 1974. Ajustement analytique des courbes de fécondité générale. Population et Famille 32:53-93.
Farid, S. M. 1973. On the Pattern of Cohort Fertility. Population Studies 27:159-168.
Gilje, E. 1969. Fitting Curves to Age-Specific Fertility Rates: Some Examples. Statistical Review of the Swedish National Central Bureau of Statistics III 7:118-134.
Hadwiger, H. 1940. Eine analytische Reproduktionsfunktion für biologische Gesamtheiten. Skandinavisk Aktuarietidskrift 23:101-113.
Hansen, Hans Oluf. 1976. The Collecting and Compiling of Demographic Data to Assist in Policy Making, Planning, Public Administration and Demographic Research between 1950 and 1975. Pp. 5-17 in Proceedings of Northern Population Workshop I (Paula Weston Wells, ed.), The Arctic Institute of North America.
Hoem, Britta and Carsten Torpe. 1977. Regionale fertilitetsforskelle 1971-1976. Danmarks Statistik, Statistiske Undersøgelser #35.
Hoem, Jan M. 1971. On the Interpretation of the Maternity Function as a Probability Density. Theoretical Population Biology 2:319-327 and 3:240.
---. 1972. On the Statistical Theory of Analytic Graduation. Pp. 569-600 in Proceedings of the 6th Berkeley Symposium on Mathematical Statistics and Probability.
---. 1976. The Statistical Theory of Demographic Rates: A Review of Current Developments (with Discussion). Scandinavian Journal of Statistics 3:169-185.
Hoem, Jan M. and Erling Berge. 1974. Theoretical and Empirical Results on the Analytic Graduation of Fertility Curves. Pp. 363-371 in Proceedings of the 8th International Biometric Conference. Constanta, Romania.
---. 1975. Some Problems in Hadwiger Fertility Graduation. Scandinavian Actuarial Journal. Pp. 129-144.
Hoem, Jan M. and Britta Holmbeck. 1975. The Demographic Interpretation of the Basic Parameters in Hadwiger Fertility Graduation. Statistical Review of the Swedish National Central Bureau of Statistics III 13:369-375.
Hoem, Jan M. and Bo Rennermalm. 1978. On the Statistical Theory of Graduation by Splines. University of Copenhagen, Laboratory of Actuarial Mathematics, Working Paper No. 14.
Holmbeck, Britta. 1975. Fruktsamhetens regionala variationer 1968-1973. Swedish National Central Bureau of Statistics, Information i prognosfrågor 1975:2. (With a separately bound appendix of diagrams and tables.)
Jagers, Peter. 1974. Convergence of General Branching Processes and Functionals Thereof. Journal of Applied Probability 11:471-478.
Johnson, N. L. and S. Kotz. 1970. Continuous Univariate Distributions I. Boston: Houghton Mifflin.
McNeil, D. R., T. J. Trussell and J. C. Turner. 1977. Spline Interpolation of Demographic Data. Demography 14:245-252.
Mitra, S. and A. Romaniuk. 1973. Pearsonian Type I Curve and Its Fertility Potentials. Demography 10:351-365.
Murphy, E. M. and D. N. Nagnur. 1972. A Gompertz Fit that Fits: Applications to Canadian Fertility Patterns. Demography 9:35-50.
Rennermalm, Bo. 1978. Utjämning av demografiska kvoter med kubiska splinefunktioner. Statistical Review of the Swedish National Central Bureau of Statistics III 17:432-438.
Rennermalm, Bo and Arne Arvidsson. 1977. Utjämning av svenska fruktsamhetskurver 1968-1973 med kubiska splinefunktioner. Statistical Review of the Swedish National Central Bureau of Statistics III 15:389-401.
Rennermalm, Bo and Margit Strandberg. 1978. Regionala fruktsamhetstal för perioderna 1968-1970, 1971-1973, 1974-1976. Swedish National Central Bureau of Statistics, Information i prognosfrågor, preliminary version. (With a separately bound supplement of diagrams and tables.)
Retherford, Robert. 1979. The Brass Fertility Polynomial. Asian and Pacific Census Forum 5:15-22.
Rogers, A., R. Raquillet and L. J. Castro. 1978. Model Migration Schedules and Their Applications. Environment and Planning A 10:475-502.
Romaniuk, A. 1973. A Three-Parameter Model for Birth Projections. Population Studies 27:467-478.
Ross, G. J. S. 1970. The Efficient Use of Function Minimization in Non-Linear Maximum-Likelihood Estimation. Applied Statistics 19:205-221.
---. 1974. Fitting Models to Ecological Data. Proceedings of the 8th International Biometric Conference, Constanta, Romania.
Suchindran, C. M., N. K. Namboodiri and K. K. West. 1977. Analysis of Fertility by Increment-Decrement Life Tables. Proceedings of the Social Statistics Section, American Statistical Association 1:431-436.
Woll, Claus A. 1978. Fertilitetsforskelle i København 1971-76. Københavns statistiske kontor, Undersøgelse No. 18.
Wunsch, G. 1966. Courbes de Gompertz et perspectives de fécondité. Recherches Économiques de Louvain 6:457-468.
---. 1980. Linéarisation de la fonction de fécondité générale par âge. Université Catholique, Louvain. Département Démographie, Document de Recherche No. 47.
Yntema, L. 1969. On Hadwiger's Fertility Function. Statistical Review of the Swedish National Central Bureau of Statistics III 7:113-117.