Bartlett-TheoreticalSpecificationSampling-1946
Author(s): M. S. Bartlett
Source: Supplement to the Journal of the Royal Statistical Society, Vol. 8, No. 1 (1946), pp.
27-41
Published by: Oxford University Press for the Royal Statistical Society
Stable URL: https://www.jstor.org/stable/2983611
Accessed: 04-01-2025 08:51 UTC
Royal Statistical Society, Oxford University Press are collaborating with JSTOR to
digitize, preserve and extend access to Supplement to the Journal of the Royal Statistical
Society
This content downloaded from 91.170.220.105 on Sat, 04 Jan 2025 08:51:17 UTC
All use subject to https://about.jstor.org/terms
SYMPOSIUM ON AUTOCORRELATION IN TIME SERIES
CONTENTS
On the Theoretical Specification and Sampling Properties of Autocorrelated Time-Series. By M. S. BARTLETT
Some Instruments for the Analysis of Time Series and Their Application to Textile Research. By G. A. R. FOSTER

ON THE THEORETICAL SPECIFICATION AND SAMPLING PROPERTIES OF AUTOCORRELATED TIME-SERIES
By M. S. BARTLETT
[Read before the RESEARCH SECTION OF THE ROYAL STATISTICAL SOCIETY, January 29th, 1946,
DR. J. WISHART in the Chair.]
CONTENTS
1. Preliminary remarks.
2. Standard error formulae for the autocorrelations of discrete time-series. (a) The Markoff
process. (b) The general process.
3. The specification of continuous time-series by their autocorrelation functions.
4. Standard error formulae for the autocorrelations of continuous time-series. (a) The Markoff
process. (b) The general process.
5. Detailed specification of the second-order process.
6. The estimation problem; theoretical information available on the unknown parameters.
Application to Wolfer's sunspot numbers.
7. Concluding remarks.
8. References.
1. Preliminary remarks
It was suggested at the R.S.S. meeting at which Mr. M. G. Kendall's recent paper (9) on the
analysis of time-series was read that further discussion should be given to the problem of the
arduous labour of calculating correlograms. While I understand that later speakers will describe
new methods of calculating auto- or other serial correlations,* the purpose of the present
symposium I interpret to be wider, for it is no use knowing how to calculate correlation coefficients
if we do not know what they mean. Now, their interpretation depends on two interdependent
things: the appropriateness of the theoretical scheme assumed and the magnitude of sampling
fluctuations. Kendall, following Yule (18), has stressed that for most time-series an autoregressive
or autocorrelation scheme is more relevant than the assumption of exact harmonic oscillations
detectable by periodogram analysis. He has also pointed out the need for pooling theoretical
results on this problem from the various fields of research where it has arisen, e.g., in economics,
meteorology, gunnery or in the theory of electrical fluctuations. Nevertheless, as in certain
respects I felt he stopped short in his presentation of autocorrelation theory, both generally and
on the particular question of sampling errors, my purpose here will be twofold:
(i) I shall amplify some suggestions I made in the discussion on his paper about the
sampling errors of a correlogram. The formulae I obtain are rather crude, and in some cases
not new, but they serve to indicate the order of magnitude of the errors.
(ii) I shall try to link up the work of the "English school" with work which it has rather
neglected, namely, the important mathematical work developed of recent years on the
autocorrelation theory of continuous time-series. I shall not attempt any comprehensive review,
but some acquaintance with it seems essential to anyone researching in the theory of
time-series. In particular, I shall show that for the second-order oscillatory process considered
by Yule and Kendall it leads to a more fundamental grasp of the dual problem of specification
and sampling errors.
* Kendall has employed the term autocorrelation to denote a true value of which the observed value
is the serial correlation. I shall use what I think to be a more logical terminology, viz., serial correlation
for any correlation of one time-series with another, and autocorrelation for the particular serial correlation
of a series with itself. The standard notation of $\rho$ for a true correlation coefficient, and $r$ for the sample
value, will be used.
where $\gamma$ is the measure of non-normality $E\{x^4\} - 3$. Since to the same order, if $r_s$ denotes the
correlation with lag $s$ obtained from the series,
$$\operatorname{var}(r_s) \sim \operatorname{var}(\mathrm{cov}) + \rho_s^2 \operatorname{var}(\mathrm{var}) - 2\rho_s \operatorname{cov}(\mathrm{var}, \mathrm{cov}),$$
we obtain finally, independently of $\gamma$,
a result which is more easily obtained from the formula for var (cov).
(b) The general process.-The above formulae are given for reference, but they are of limited
use in practice because we usually have to deal with time-series of more complicated character.
For the generalization of (1) to a linear autoregressive scheme of any order, or in fact for any
time-series, we can, of course, write down formal sums which correspond to the above results for
the Markoff process. The most general result we require is ($x_s$ standardized)
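$$\operatorname{cov}(\mathrm{cov}_s, \mathrm{cov}_{s+t}) = E\Big\{\Big(\sum x_r x_{r+s}\Big)\Big(\sum x_r x_{r+s+t}\Big)\Big\}\Big/n^2 - \rho_s\rho_{s+t}$$
which to the same order of approximation as before becomes
$$\frac{1}{n}\sum_{v=-\infty}^{\infty}\left(\rho_v\rho_{v+t} + \rho_{v-s}\rho_{v+s+t} + \kappa_{v,s,t}\right) \qquad (5)$$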
where $\kappa_{v,s,t}$ is the first seminvariant involving all the four variables $x_r$, $x_{r+s}$, $x_{r+v}$, $x_{r+v+s+t}$,
and is a function of their intervals apart characterized by the suffixes $v$, $s$, $t$.
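When the $x_r$ are normally distributed, formula (5) reduces to
$$\frac{1}{n}\sum_{v=-\infty}^{\infty}\left(\rho_v\rho_{v+t} + \rho_{v-s}\rho_{v+s+t}\right) \qquad (6)$$
a useful result previously given by Daniell (5).*
To obtain var (var); var (cov); cov (var, cov), we put $s = t = 0$; $t = 0$; $s = 0$ and $t = s$
respectively in (5) or (6). Thus we obtain from (6):
$$\operatorname{var}(r_s) \sim \frac{1}{n}\sum_{v=-\infty}^{\infty}\left(\rho_v^2 + \rho_{v-s}\rho_{v+s} + 2\rho_s^2\rho_v^2 - 4\rho_s\rho_v\rho_{v-s}\right) \qquad (7)$$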
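As an illustration of formula (7), the following minimal Python sketch evaluates the sum numerically for a Markoff process, for which $\rho_v = \rho^{|v|}$. The function names, the truncation limit `vmax` and the value $\rho = 0.6$ are assumptions made for the illustration only; they are not part of the paper.

```python
def markoff_rho(rho, v):
    """True autocorrelation rho_v = rho**|v| of a Markoff (first-order) process."""
    return rho ** abs(v)


def n_var_r(rho_fn, s, vmax=2000):
    """n * var(r_s) from formula (7), with the infinite sum truncated at |v| <= vmax.

    rho_fn(v) must return the true autocorrelation at lag v (any sign of v).
    """
    rs = rho_fn(s)
    total = 0.0
    for v in range(-vmax, vmax + 1):
        pv = rho_fn(v)
        total += (pv * pv
                  + rho_fn(v - s) * rho_fn(v + s)
                  + 2.0 * rs * rs * pv * pv
                  - 4.0 * rs * pv * rho_fn(v - s))
    return total


if __name__ == "__main__":
    rho = 0.6  # illustrative value only
    for s in (1, 2, 5, 10):
        print(s, round(n_var_r(lambda v: markoff_rho(rho, v), s), 4))
```

For large $s$ the printed value settles at the sum of $\rho_v^2$ over all $v$, which is the special case taken up next in the text.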
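An important special case is when the true value $\rho_s$ has become small. The sampling errors of
correlations are then, as noted for the Markoff process, approximately equivalent to those of the
corresponding covariances. We obtain from (6) when $\rho_w$ is negligible for $w > s$,
$$\operatorname{var}(r_s) \sim \frac{1}{n}\sum_{v=-\infty}^{\infty}\rho_v^2, \qquad \operatorname{cov}(r_s, r_{s+t}) \sim \frac{1}{n}\sum_{v=-\infty}^{\infty}\rho_v\rho_{v+t} \qquad (8)$$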
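where the function $g(s - w) = 0$ for $w > s$. We have had a simple example of such a linear
process in the Markoff process (1). From (9) it follows at once from the properties of
seminvariants that the simultaneous cumulant or seminvariant function
$$K(\tau_1, \tau_2, \tau_3, \tau_4) = \log E\{\exp i(\tau_1 x_r + \tau_2 x_{r+s} + \tau_3 x_{r+v} + \tau_4 x_{r+v+s+t})\}$$
$$\operatorname{cov}(r_s, r_{s+t}) \sim \operatorname{cov}(\mathrm{cov}_s, \mathrm{cov}_{s+t}) + \rho_s\rho_{s+t}\operatorname{var}(\mathrm{var}) - \rho_s\operatorname{cov}(\mathrm{var}, \mathrm{cov}_{s+t}) - \rho_{s+t}\operatorname{cov}(\mathrm{var}, \mathrm{cov}_s)$$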
* I am indebted to Dr. H. E. Daniels for this reference, which has not been generally published.
† In his paper Slutsky also refers to a previous paper "On the standard error of the correlation
coefficient in the case of homogeneous, coherent chance series" (in Russian), Transactions of the
Conjuncture Institute 2 (1929), 94. Unfortunately I have not been able to locate this paper anywhere in this
country.
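the result that for any linear process of the type (9) cov $(r_s, r_{s+t})$ or var $(r_s)$ are to the present
order of approximation independent of the distribution of $x_s$; thus cov $(r_s, r_{s+t})$ becomes
$$\operatorname{cov}(r_s, r_{s+t}) \sim \frac{1}{n}\sum_{v=-\infty}^{\infty}\left(\rho_v\rho_{v+t} + \rho_v\rho_{v+2s+t} + 2\rho_s\rho_{s+t}\rho_v^2 - 2\rho_s\rho_v\rho_{s+v+t} - 2\rho_{s+t}\rho_v\rho_{v+s}\right) \qquad (13)$$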
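A short numerical sketch may help in reading (13). It takes the summand to be $\rho_v\rho_{v+t} + \rho_v\rho_{v+2s+t} + 2\rho_s\rho_{s+t}\rho_v^2 - 2\rho_s\rho_v\rho_{s+v+t} - 2\rho_{s+t}\rho_v\rho_{v+s}$, which is the form obtained by combining (6) with the expression for cov $(r_s, r_{s+t})$ in terms of var (var), var (cov) and cov (var, cov). The Markoff autocorrelation $\rho^{|v|}$, the value 0.6, the lag 3 and the function names are assumptions chosen only for illustration.

```python
def rho_markoff(rho, v):
    """Illustrative true autocorrelation rho**|v| (Markoff process)."""
    return rho ** abs(v)


def n_cov_r(p, s, t, vmax=500):
    """n * cov(r_s, r_{s+t}) from formula (13); p(v) returns the true rho_v.

    The infinite sum is truncated at |v| <= vmax.
    """
    ps, pst = p(s), p(s + t)
    total = 0.0
    for v in range(-vmax, vmax + 1):
        pv = p(v)
        total += (pv * p(v + t)
                  + pv * p(v + 2 * s + t)
                  + 2.0 * ps * pst * pv * pv
                  - 2.0 * ps * pv * p(s + v + t)
                  - 2.0 * pst * pv * p(v + s))
    return total


if __name__ == "__main__":
    p = lambda v: rho_markoff(0.6, v)
    s = 3
    var_s = n_cov_r(p, s, 0)        # t = 0 gives n * var(r_s), as in (7)
    var_s1 = n_cov_r(p, s + 1, 0)
    cov = n_cov_r(p, s, 1)
    print("n var(r_3)     =", round(var_s, 4))
    print("corr(r_3, r_4) =", round(cov / (var_s * var_s1) ** 0.5, 4))
```

Putting $t = 0$ is a convenient check on any implementation, since the result must then agree with (7).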
The rather curious result in (8) that the sampling properties of $r_s$ when $\rho_s$ has become small
depend on the "variance" and covariances of $\rho_v$ in the correlogram seems sufficient to explain
the reluctance of an observed correlogram to damp down to zero with $\rho_s$, a point which worried
Kendall when he came across it empirically. From (8) we see that the standard error of $r_s$ will
always be larger than $1/\sqrt{n}$, and that the observed correlogram will preserve a misleading
regularity even when $\rho_s$ is zero, the correlogram for neighbouring values of $r_s$ being the "correlogram"
of the true correlogram.
For example, let us consider Kendall's artificial series (see 8, Table 3)
$$x_{s+2} = 1.1x_{s+1} - 0.5x_s + \varepsilon_{s+2} \qquad (14)$$
for which
$$\rho_{s+2} = 1.1\rho_{s+1} - 0.5\rho_s \qquad (15)$$
Kendall, giving values of $r_s$ up to $s = 30$, obtained an $r_s$ of $-0.57$ for $s = 25$ (for an $n$ of 65),
with $r_{26} = -0.56$ and $r_{24} = -0.43$, values which appeared unexpectedly high compared with
the true values $\rho_s$, which have effectively dropped to zero after $s = 10$. But from the true values
of $\rho_s$, most easily obtained in succession from (15), and recently given by Kendall (Table II in his
Appendix to 18), we obtain var $(r_s) \sim 2.44/n$, and a "correlogram" of the correlogram as shown
in Table I.
TABLE I
Correlations $a_t$ of the correlations $\rho_s$

  t     a_t        t     a_t        t     a_t
  1   +0.832       7   -0.118      13   -0.015
  2   +0.434       8   +0.022      14   -0.027
  3   +0.002       9   +0.096      15   -0.024
  4   -0.286      10   +0.102      16   -0.012
  5   -0.364      11   +0.071      17   -0.010
  6   -0.276      12   +0.019      18   +0.005
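The figures quoted for Kendall's series can be checked with a few lines of Python. The sketch below assumes the series has coefficients 1.1 and -0.5, so that $\rho_1 = 1.1/1.5$ and the later $\rho_s$ follow from the recurrence $\rho_{s+2} = 1.1\rho_{s+1} - 0.5\rho_s$; it then forms the sum of $\rho_v\rho_{v+t}$ over all $v$, and the ratios $a_t$. The function names and the truncation at lag 200 are illustrative assumptions.

```python
def kendall_rho(max_lag=200):
    """rho_0 .. rho_max_lag from rho_{s+2} = 1.1*rho_{s+1} - 0.5*rho_s.

    rho_1 = 1.1 / 1.5 follows from the same recurrence with rho_{-1} = rho_1.
    """
    rho = [1.0, 1.1 / 1.5]
    while len(rho) <= max_lag:
        rho.append(1.1 * rho[-1] - 0.5 * rho[-2])
    return rho


def two_sided_sum(rho, t):
    """Sum over all v of rho_v * rho_{v+t}, using rho_{-v} = rho_v.

    Lags beyond the end of the list are treated as zero.
    """
    last = len(rho) - 1
    total = 0.0
    for v in range(-last, last + 1):
        w = abs(v + t)
        if w <= last:
            total += rho[abs(v)] * rho[w]
    return total


if __name__ == "__main__":
    rho = kendall_rho()
    c0 = two_sided_sum(rho, 0)                       # n * var(r_s) for large s
    a = [two_sided_sum(rho, t) / c0 for t in range(61)]
    print("sum of rho_v^2 =", round(c0, 2))
    for t in range(1, 19):
        print(t, f"{a[t]:+.3f}")
    print("sum of a_t^2 over all t =", round(1.0 + 2.0 * sum(x * x for x in a[1:]), 2))
```

The printed $a_t$ may be compared with Table I; small differences in the third decimal place are to be expected, since the tabulated values were presumably computed from rounded $\rho_s$.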
If we consider $r_s$ for $s = 11$ to 30, $s$ is no longer small compared with the total number (65)
of observations. It is therefore a rather better approximation for each $s$ to consider $n$ as the
number of pairs of observations actually correlated (cf. Daniell, 5). This gives for the same range
of $s$ an average value of var $(r_s)$ of 0.053. The observed value was computed to be 0.083, with an
effective number of degrees of freedom less than 20 because of formula (8). If we suppose that
we can treat the terms $r_s$ analogously to terms in the original time-series, but with the correlogram
of Table I, for which $\sum a_t^2$ (summed over all $t$) is 3.42, the effective number of degrees of freedom
will be more like $20/3.42 \approx 6$. A ratio 0.083/0.053 with 6 d.f. would not reach the 5 per cent.
significance level. This adaptation of standard tests is admittedly rough, but a test based on the
highest absolute value observed, 0.57 (for which $n = 40$, and the effective size of sample of which
it is the largest member again about 6), would yield a similar conclusion. Thus it may be
concluded that the observed values of $r_s$ have come out a little high, but not significantly so. With
correlograms we must evidently take care not to allow the tail to wag the dog!
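The arithmetic of this rough test can be laid out as below. The sketch assumes that the comparison intended is that of a variance estimate on about 6 degrees of freedom with a known variance, so that the relevant critical value is the upper 5 per cent. point of chi-square divided by its degrees of freedom; the tabulated value 12.592 for 6 d.f. is entered as a constant, and the variable names are illustrative.

```python
# Values quoted in the text above.
nominal_terms = 20          # r_s for s = 11 to 30
sum_a_squared = 3.42        # sum of a_t**2 over all t (Table I)
observed_var = 0.083
expected_var = 0.053

effective_df = nominal_terms / sum_a_squared
ratio = observed_var / expected_var

# Upper 5 per cent point of chi-square with 6 d.f. (standard table value),
# turned into a variance ratio by dividing by the degrees of freedom.
CHI2_6_UPPER_5PC = 12.592
critical_ratio = CHI2_6_UPPER_5PC / 6

print("effective d.f.      :", round(effective_df, 2))
print("observed/expected   :", round(ratio, 2))
print("5 per cent critical :", round(critical_ratio, 2))
print("significant         :", ratio > critical_ratio)
```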
For a Yule-Kendall process like (14), $\rho_s$ is of the form $\lambda\mu_1^s + (1 - \lambda)\mu_2^s$ and $\sum\rho_s^2$ theoretically
summable; similarly for any more general linear process. But in practice it is often simpler to
evaluate $\sum\rho_s^2$ directly from the numerical values as above. If the $\rho_s$ are not known, it is, however,
meaningless to consider $\sum r_s^2$, since var $(r_s)$ for large $n$ is of order $1/n$, and the series cannot possibly
converge. The only valid procedure would appear to be to fit a theoretical scheme containing
one or two unknown parameters, such as the autoregressive scheme above, and obtain $\sum\rho_s^2$ from
the corresponding theoretical correlogram. It may be that a purely autoregressive analysis is
sufficient, and this then has the advantage that the usual regression tests of significance, though
not exactly applicable, will be approximately valid for $n$ large (see 12). If, for example, a scheme
like (14) were correct, the multiple regression of $x_{s+2}$ on $x_{s-1}$, $x_{s-2}$, . . . when $x_{s+1}$ and $x_s$ are
held constant, should be zero. Unfortunately, as Yule and Kendall have pointed out,
superposed error complicates such an analysis. Further complications which arise when such an
analysis is applied to continuous time-series are discussed in sections 5 and 6.
In some of the preceding variance formulae for autocorrelations, the effective number of
degrees of freedom has been reduced by the factor $1/\sum\rho_s^2$. This result may be compared with
Yule's factor $1/\sum\rho_s$ for the variance of the mean (19), but, unlike the latter factor, it is essentially
less than one. In the discussion on Kendall's paper (9), Champernowne appears to have
suggested the use of the Yule factor, or at least its value of $(1 - \rho)/(1 + \rho)$ if the series is a Markoff
process, for testing significance in periodogram analysis. But in such an analysis we test the
significance of a weighted mean, the weights being the appropriate harmonic coefficients; the
factor will correspondingly be a function of these coefficients. It may, it is true, be shown that
for a Markoff process the minimum value of the factor is $(1 - |\rho|)/(1 + |\rho|)$, which is equal to
the Yule factor if $\rho$ is positive. But it is also not clear what the interpretation of such a test
would be. On the null hypothesis that there is no harmonic term we might identify the
autocorrelations in this factor with those in the series (with appropriate precautions, as for $\sum\rho_s^2$ above).
But if there is a harmonic term, $\sum\rho_s$ even for an infinite series would not converge (cf. Wold 21,
section 17). Unless we adopt the laborious procedure of isolating the residuals for separate
study, we are in danger of eliminating the bias of finding harmonic terms when they do not exist
at the cost of never finding them when they do exist. And of course it is still important to study
the oscillations intrinsic in the autocorrelated series, which has corresponding to its correlogram
a "periodogram" of an entirely different character (see section 3 of this paper). We must not
throw away the baby with the bath-water!
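The contrast between the two factors can be seen numerically for a Markoff process. The sketch below sums $\rho^{|v|}$ and $\rho^{2|v|}$ over all $v$ and takes reciprocals; the function names, the truncation limit and the two values of $\rho$ are assumptions for illustration only.

```python
def two_sided_power_sum(rho, power, vmax=5000):
    """Sum over v = -vmax..vmax of (rho**|v|)**power for a Markoff process."""
    return sum((rho ** abs(v)) ** power for v in range(-vmax, vmax + 1))


def yule_factor(rho):
    """1 / sum(rho_s): the factor attached to the variance of the mean."""
    return 1.0 / two_sided_power_sum(rho, 1)


def autocorrelation_factor(rho):
    """1 / sum(rho_s**2): the factor appearing in the variance of r_s."""
    return 1.0 / two_sided_power_sum(rho, 2)


if __name__ == "__main__":
    for rho in (0.5, -0.5):
        print(rho,
              round(yule_factor(rho), 4),             # equals (1 - rho) / (1 + rho)
              round(autocorrelation_factor(rho), 4))  # equals (1 - rho**2) / (1 + rho**2)
```

For negative $\rho$ the first factor exceeds one, whereas the second lies below one for either sign of $\rho$.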
To see its relevance to the schemes considered in section 2, suppose first we generalize the
autocorrelation for the Markoff process to the autocorrelation function
Since $f(\omega)$ is a valid distribution function in this case, so is $\rho(s)$ a valid autocorrelation function.
It is also known (e.g., Rice, 14, Part II, where the important mathematical work by Wiener, 20, on
this aspect of the theory is referred to) that the function $f(\omega)$ gives the intensities for different
"frequencies" $\omega/2\pi$ corresponding to a harmonic analysis of the original time-series $x_t$, so that there
is a unique relation between the harmonic and correlation analyses of a time-series (for the
corresponding relation for discrete series, see Wold loc. cit.). The above spectrum for the Markoff
process gives a continuous band of frequencies $\omega/2\pi$, thus stressing the possible irrelevance of a
standard periodogram analysis for such processes. It is only when the integrated function $F(\omega)$
is a step-function that discrete frequencies and corresponding periods in the classical sense
exist.
While the above theory enables us to study various permissible autocorrelation functions, it
still appears to me important to set up if possible a more detailed theoretical mechanism to
represent a time-series. If we can do this, we not only ensure automatically that the
autocorrelation function is valid, but we find what it is, and perhaps obtain also further knowledge
about the distributional properties of the process. For we have seen in the case of linear processes
that the autocorrelation function does not exhaust the distributional properties unless the process
is "normal." General consistency conditions for the higher product-moments are not, as far
as I am aware, known. And further our postulated autocorrelation function, while at first sight
reasonable, may turn out to be incorrect for the particular process we have in mind. For
example, it is common to postulate the next simplest function to (16) for a continuous process as
(a) The Markoff process.-By such methods we obtain for the Markoff process when T is
large,
var (var) + 2
var (cov) + Ps 2 + T (19)
cov (var, cov) pf +2 + 2 ?
take a finite set of observations?" The point is that they do indicate the intrinsic sampling
accuracy of a series of length $T$ in contrast with that in the arbitrary number of observations we
happen to have made. We can always generate a discrete series as a set of observations made
at regular intervals on the continuous series (in general the converse is not true, e.g., if $\rho < 0$ in
(1); cf. Wold loc. cit.). For the Markoff process the increment $\varepsilon_{s+1}$ then becomes the sum of
increments in the time-interval $(s, s + 1)$. We now have the relation $\rho = e^{-\mu T/n}$, whence
$\mu T = -n \log \rho$. Thus from formulae (4) and (21), as $\rho_s \to 0$, we obtain a relative efficiency in
estimating $\rho_s$ from the discrete set $n$ of observations, of
The corresponding ratio from (3) and (20) depends on $\rho_s$, but in the case $s = 1$, we have
$$E_1 = \frac{1 - \rho^2(1 + \log 1/\rho^2)}{(1 - \rho^2)\log 1/\rho} \qquad (23)$$
The values of $E_0$ and $E_1$ are plotted against $\rho^2$ in Fig. 1.
FIG. 1.-Efficiency ratios $E_0$ ($\rho_s = 0$) and $E_1$ ($s = 1$) for estimating $\rho_s$ from observations made at regular
intervals, plotted against $\rho^2$, where $\rho$ is the true correlation $\rho_1$ and the process is a continuous Markoff
one.
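The curves of Fig. 1 can be regenerated approximately from the following sketch. It takes $E_1$ in the form $[1 - \rho^2(1 + \log 1/\rho^2)]/[(1 - \rho^2)\log 1/\rho]$. For $E_0$ it assumes the ratio of the continuous-record variance $1/\mu T$ to the discrete variance $(1 + \rho^2)/n(1 - \rho^2)$, with $\mu T = n\log 1/\rho$; this form of $E_0$ is an assumption of the sketch, inferred from the relations quoted in the text rather than copied from it. The function names are illustrative.

```python
import math


def e1(rho2):
    """E_1 as a function of rho^2, for 0 < rho2 < 1 (lag s = 1)."""
    log_inv_rho = 0.5 * math.log(1.0 / rho2)          # log(1/rho)
    numerator = 1.0 - rho2 * (1.0 + math.log(1.0 / rho2))
    return numerator / ((1.0 - rho2) * log_inv_rho)


def e0(rho2):
    """Assumed form of E_0 (true rho_s negligible); see the lead-in for its origin."""
    log_inv_rho = 0.5 * math.log(1.0 / rho2)
    return (1.0 - rho2) / ((1.0 + rho2) * log_inv_rho)


if __name__ == "__main__":
    for rho2 in (0.2, 0.4, 0.6, 0.8):
        print(rho2, round(e0(rho2), 3), round(e1(rho2), 3))
```

Both ratios tend to unity as $\rho^2 \to 1$, that is, as the observational interval shrinks.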
(b) The general process.-For the general process it will be sufficient to note the integral
corresponding to (5); this is
$$\operatorname{cov}(\mathrm{cov}_s, \mathrm{cov}_{s+t}) \sim \frac{1}{T}\int_{-\infty}^{\infty}\left(\rho_v\rho_{v+t} + \rho_{v-s}\rho_{v+s+t} + \kappa_{v,s,t}\right)dv \qquad (24)$$
from which other formulae may be deduced. For a continuous linear process, analogous to the
discrete linear process (9) I shall write formally (cf. the next section)
where $\int^t I_v\,dv$ represents the total random increment up to time $t$ corresponding to $\sum^s \varepsilon_w$ in (9).
Then analogously to (10) we have
where $\kappa_I(\cdot)$ is the rate of increase of the cumulant function of the total random increment $\int^t I_v\,dv$.
From (25) we have formulae analogous to (11) and (12), so that cov $(r_s, r_{s+t})$ is again independent,
to our order of approximation, of $\kappa_{v,s,t}$. Incidentally we note from (25) that $x_t$ cannot be normal
unless the variable $I$ represented in $\kappa_I(\cdot)$ is normal, i.e., the increments are intrinsically normal,
or else (roughly speaking) the individual increments $I_v$ are sufficiently small and numerous for
their sum, in a small interval of time for which $g(t)$ is constant, to have become normal.
The results for the mean and variance of $x_t$ contained in (25) are known as Campbell's
theorem; the generalization to other seminvariants has been given by Rice (14, sections 1.5 and
3.11), equation (25) above representing a further extension required for the theoretical
developments of this paper.
Omitting the term $\kappa_{v,s,t}$, we may from the theory of Fourier transforms write (24) in the
alternative form
where $f(\omega)$ was defined in section 3. Specific formulae for the second-order linear process are
recorded later.
5. Detailed specification of the second-order process
Coming now to a detailed examination of the continuous second-order process, I shall
re-consider the problem which Yule used in his pioneering paper (18) as a basis for the
second-order difference equation of the type (14). He imagined a swinging pendulum subject to
bombardment by boys equipped with peashooters. This problem in another guise is of practical
importance, for equally we may think of a sensitive instrument disturbed by impulses of a
Brownian motion character (e.g., a galvanometer with suspended mirror whose torsional
oscillations are disturbed by impacts from gas molecules).* As in the case of the Markoff process, this
form of Brownian motion has usually been studied with the assumption that even in a small
interval of time the disturbances are infinitesimal but numerous; the distribution of $x_t$ then
becomes normal. But again I shall not impose this restriction here, but leave the distribution of
$x_t$ unspecified in general. This allows an exact representation of Yule's problem of the swinging
pendulum subject to any type of instantaneous random impulse; and of similar or more com-
where dots denote differentiation with respect to the time $t$. In this equation $I_t$ represents a
random impulse function which changes $\dot{x}_t$ discontinuously, but $\ddot{x}_t$ may be regarded as formally
defined by (26) in terms of $I_t$, which is an improper function possessing a proper integral $\int I\,dv$.†
where $\mu_1$ and $\mu_2$ are the roots of $x^2 + \alpha x + \beta = 0$. Hence, if $E\{I_t\} = 0$, $E\{I_t I_v\} = 0$ when
$t \ne v$, and $\sigma^2(I)$ is a finite quantity representing the rate of increase of variance of the integrated
impulse $\int^t I\,dv$, we obtain
where X2 = - _ 2, tan 0 = x/aVr., Equation (28) shows that if the series has been generated
a long time ago, so that the effect of initial conditions has become negligible, the series stabilizes
at $\sigma^2(x)$ given by $\sigma^2(I)/2\alpha\beta$ (cf. Kendall, 8, equation (12), for the corresponding result for a discrete
process). The formula (30) for $\rho_s$ should be contrasted with (18), and also with Kendall's result
for a discrete process (8, equation (13)). It is, I find, not new, having been given in the case of
f() = I a.(30)
Brownian oscillations by Zernike (22, equation (6), p. 518). *
The corresponding frequency spectrum is obtained by inverting the function ps as
TO(t)ela = a + 3cos )t + x (5r3 - x2) sin Xt + 'xt sin Xt + (2p _-M2)t cos Xt (33)
OCP 8 X3P ~~~~2Xs 4X2 cs?t.(3
In the "aperiodic" case $\lambda = 0$, we obtain the comparatively simple result from (24a),
From the exact solution (27) for $x_t$ we may investigate whether any exact difference equation for
$x_t$ exists in place of the differential equation (26). We obtain
equation for $\rho_s$ given by Kendall, its solution is different because it does not hold for $s = -h$
but only for $s > 0$, owing to the dependence of $x_{t+h}$ on $[J]_t^{t+2h}$. In fact
FIG. 2.-Values of $\lambda$ estimated from the Yule-Kendall finite difference equation plotted against the
observational interval, when the process is a continuous second-order one with true frequency
$\lambda/2\pi = 0$ (and $\alpha = 2$). Curves: spurious frequency, true frequency; horizontal axis: interval.
To check this, I considered the solution of (26) in this aperiodic case, for which $\beta = \tfrac{1}{4}\alpha^2$.
The solution is
$$x_t = \int_{-\infty}^{t} I_v\, e^{-\frac{1}{2}\alpha(t-v)}(t - v)\,dv,$$
$$\rho_s = (1 + \tfrac{1}{2}\alpha s)\,e^{-\frac{1}{2}\alpha s}, \quad (s \ge 0) \qquad (39)$$
If for such an autocorrelation, we attempted to estimate the frequency by means of (36), we
should obtain the spurious frequency shown in Fig. 2, where $\lambda$ is plotted as a function of the
value $\alpha h$, where $h$ is the interval between observations. For definiteness $\alpha$ was taken to be 2, so
that the limiting value of $\lambda$ is $\sqrt{5}/3 = 0.745$.
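The limiting value quoted can be checked numerically. The sketch below takes the aperiodic autocorrelation $\rho_s = (1 + \tfrac{1}{2}\alpha s)e^{-\frac{1}{2}\alpha s}$ with $\alpha = 2$. It assumes that the finite difference estimate is obtained by fitting $\rho_{s+2h} = a\rho_{s+h} - b\rho_s$ to the first two autocorrelations $\rho_h$ and $\rho_{2h}$, and by reading the frequency from $a = 2\sqrt{b}\cos\lambda h$. That fitting rule is an assumption standing in for equation (36), and the function names are illustrative.

```python
import math


def rho_aperiodic(s, alpha=2.0):
    """Autocorrelation (1 + alpha*s/2) * exp(-alpha*s/2) of the aperiodic case, s >= 0."""
    c = 0.5 * alpha
    return (1.0 + c * s) * math.exp(-c * s)


def spurious_lambda(h, alpha=2.0):
    """Angular frequency implied by a second-order difference equation fitted at spacing h.

    Assumed fitting rule: solve rho_h = a - b*rho_h and rho_2h = a*rho_h - b
    for a and b, then use a = 2*sqrt(b)*cos(lambda*h).
    """
    r1 = rho_aperiodic(h, alpha)
    r2 = rho_aperiodic(2.0 * h, alpha)
    b = (r1 * r1 - r2) / (1.0 - r1 * r1)
    a = r1 * (1.0 + b)
    return math.acos(a / (2.0 * math.sqrt(b))) / h


if __name__ == "__main__":
    for h in (0.01, 0.2, 0.5, 1.0):
        print(h, round(spurious_lambda(h), 3))
    print("sqrt(5)/3 =", round(math.sqrt(5.0) / 3.0, 3))
```

Although the true frequency is zero, the fitted equation yields a non-zero $\lambda$ at every interval tried, approaching $\sqrt{5}/3$ as $h$ becomes small.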
* This bias appears related to that obtained by Spencer Smith (17), who considered continuous periodic
time-series subject to independent disturbances in amplitude, phase and trend. I am doubtful, however,
of the possibility of analysing a disturbed oscillatory series into components corresponding to independent
disturbances of this kind, when for natural disturbances of the type considered here the effects are
necessarily related. From his concluding remarks Spencer Smith appears to recognize these limitations of
his method.
The invalidity of (36) raises the question whether any other finite difference equations (apart
from the relation (35)) exist for the process (27). We have seen that for this process $\dot{x}_t$ exists as
well as $x_t$; in fact, from (27) we have
J' I L,1e1,tI-")-W [2e2O-v)} dv .(40)
whence
Ex9= 3Ext2} 1
p(., x+) = + Ps I t ? . . . . (41)
p(,t x1+,) = p(xt x,) = - *a J
where
As $h \to 0$, these two equations reduce respectively to (26) and to $\dot{x}_t = \mathrm{Lt}\,(x_{t+h} - x_t)/h$, showing
that for $h$ small enough it is sufficient to consider the formal equation (26).
JT IfdtT Tc2dt }
-JoT d/f~,t d 0 J(43)
We know further that these least-squares estimates, which would have minimum standard errors
in orthodox regression analysis, will have asymptotically minimum errors in the present case,
irrespective of the distribution of $I_t$ (cf. 12 and section 2 of the present paper), given by
$$\operatorname{var}(\alpha_e) \sim \sigma^2(I)\Big/\int_0^T \dot{x}_t^2\,dt \sim 2\alpha/T \qquad (44)$$
In some problems, where a continuous track of the time-series is available (in the case of torsional
Brownian oscillations continuous records from an oscillating mirror system are reproduced in 6,
Fig. 79) it is possible that direct optical or electrical devices could be invented to measure the
quantities occurring in (43), where it should be noted that owing to the existence of $\ddot{x}_t$ being only
formal, $\int \ddot{x}_t\dot{x}_t\,dt$ is to be interpreted as $\mathrm{Lt}\,\frac{1}{h}\int(\dot{x}_t\dot{x}_{t+h} - \dot{x}_t^2)\,dt$. In other cases, the formulae (44)
are a gauge by which the efficiency of any actual method of estimation used can be investigated.
The principles involved are best illustrated first for the simpler Markoff process, since we
have seen that the use of discrete series for the second-order process raises special difficulties.
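For the Markoff process we have corresponding to the equation (26) for the second-order process,
the formal equation
$$\dot{x}_t + \mu x_t = I_t \qquad (45)$$
whence
$$x_t = \int_{-\infty}^{t} I_v\, e^{-\mu(t-v)}\,dv, \qquad \sigma^2(x) = \sigma^2(I)/2\mu, \qquad \rho_s = e^{-\mu s} \ (s \ge 0) \qquad (46)$$
The least-squares estimate of $\mu$ is
$$\mu_e = -\int_0^T \dot{x}_t x_t\,dt \Big/ \int_0^T x_t^2\,dt \qquad (47)$$
where
$$\operatorname{var}(\mu_e) \sim \sigma^2(I)\Big/\int_0^T x_t^2\,dt \sim 2\mu/T \qquad (48)$$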
FIG. 3.-Efficiency ratios: (i) $E_2$ and (ii) $E_1E_2$, plotted against $\rho^2$ ($\rho = \rho_h$), for estimating the unknown
parameter in a continuous Markoff process from the autocorrelation obtained (i) from a continuous
record, (ii) from observations made at intervals $h$.
In (47) $\dot{x}_t$ is to be replaced by $(x_{t+h} - x_t)/h$, but if we do not proceed to the limit $h = 0$, we are
obviously estimating $\mu$ by means of the autocorrelation $\rho_h$. Thus for finite $h$, bias may be
introduced if we keep to formula (47), and it will be more consistent to use the formula for $\rho_h$ directly,
i.e.,
$$h\mu_e = -\log r_h \qquad (49)$$
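A small simulation shows the estimate $h\mu_e = -\log r_h$ at work on observations taken at interval $h$. The sketch assumes normal increments (the text itself leaves the distribution of $I_t$ open) and generates the sampled series through its exact discrete form with lag-one correlation $e^{-\mu h}$. The parameter values, the seed and the function names are illustrative assumptions.

```python
import math
import random


def simulate_markoff(mu, h, n, seed=0):
    """n observations at spacing h of a stationary normal Markoff process, rho_s = exp(-mu*s)."""
    rng = random.Random(seed)
    rho = math.exp(-mu * h)
    innovation_sd = math.sqrt(1.0 - rho * rho)   # keeps the variance of x equal to one
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        x.append(rho * x[-1] + innovation_sd * rng.gauss(0.0, 1.0))
    return x


def lag1_autocorrelation(x):
    """Sample autocorrelation r_h between neighbouring observations."""
    n = len(x)
    mean = sum(x) / n
    d = [xi - mean for xi in x]
    numerator = sum(d[i] * d[i + 1] for i in range(n - 1))
    denominator = sum(di * di for di in d)
    return numerator / denominator


def mu_estimate(x, h):
    """Estimate of mu from h * mu_e = -log r_h (requires r_h > 0)."""
    return -math.log(lag1_autocorrelation(x)) / h


if __name__ == "__main__":
    mu, h = 0.5, 0.2                      # illustrative values
    x = simulate_markoff(mu, h, n=100_000)
    print("true mu =", mu, " estimate =", round(mu_estimate(x, h), 3))
```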
For this estimate, if $r_h$ were still obtained from a continuous record as $\int_0^T x_t x_{t+h}\,dt \big/ \int_0^T x_t^2\,dt$, we
have var $(\mu_e) \sim$ var $(r_h/h\rho_h)$ or from (20), for any distribution of $I_t$,
$$\operatorname{var}(\mu_e) \sim [1 - \rho_h^2(2h\mu + 1)]/\mu T h^2\rho_h^2 \qquad (50)$$
The ratio of (48) to (50) can be written as
$$E_2 = \frac{(\log 1/\rho^2)^2}{2(1/\rho^2 - 1 - \log 1/\rho^2)} \qquad (51)$$
which is plotted in Fig. 3. If further $r_h$ were obtained from a discrete set of observations with
interval $h$, we should have var $(r_h)$ given by (3) instead of by (21), and the overall efficiency is
reduced to $E_1E_2$, where $E_1$ was shown in Fig. 1. $E_1E_2$ is also plotted in Fig. 3; it will be seen
that the fall in efficiency as $h$ increases is fairly rapid.
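The two curves of Fig. 3 can be tabulated with the sketch below, which takes $E_2 = (\log 1/\rho^2)^2/[2(1/\rho^2 - 1 - \log 1/\rho^2)]$ and $E_1 = [1 - \rho^2(1 + \log 1/\rho^2)]/[(1 - \rho^2)\log 1/\rho]$, with $\rho = \rho_h$. These are the forms assumed here for (51) and (23); the function names and the grid of $\rho^2$ values are illustrative.

```python
import math


def e1(rho2):
    """E_1 as a function of rho^2 (0 < rho2 < 1)."""
    log_inv_rho2 = math.log(1.0 / rho2)
    numerator = 1.0 - rho2 * (1.0 + log_inv_rho2)
    return numerator / ((1.0 - rho2) * 0.5 * log_inv_rho2)


def e2(rho2):
    """E_2 as a function of rho^2 (0 < rho2 < 1)."""
    log_inv_rho2 = math.log(1.0 / rho2)
    return log_inv_rho2 ** 2 / (2.0 * (1.0 / rho2 - 1.0 - log_inv_rho2))


if __name__ == "__main__":
    for rho2 in (0.2, 0.4, 0.6, 0.8):
        print(rho2, round(e2(rho2), 3), round(e1(rho2) * e2(rho2), 3))
```

With these forms the product simplifies algebraically to $E_1E_2 = \rho^2\log(1/\rho^2)/(1 - \rho^2)$, which gives a quick check on the tabulated values.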
Thus in fitting the Markoff process (45) with finite h we revert from the regression estimate
to a consistent estimate obtained from the first available autocorrelation in the correlogram.
The remainder of the correlogram would be used in conjunction with the known magnitude of
its sampling errors to consider the adequacy of the fit.
Let us try to consider now the appropriate procedure for the second-order process. We
first of all make the inevitable substitution $(x_{t+h} - x_t)/h$ for $\dot{x}_t$. If the nature of the observations
prohibits the direct use of $\ddot{x}_t$, we further substitute $(\dot{x}_{t+k} - \dot{x}_t)/k$ for $\ddot{x}_t$, where for the moment
$k$ ($\ge h$) is not assumed equal to $h$. The limiting estimate of $\beta$ presents no difficulty, the regression
estimate becoming
$$\beta_e \sim 2(1 - r_k)/k^2 \qquad (52)$$
this being valid for small $k$ as can be seen from the expansion
cov (r,, r,+,) s(s + t) cos Xt t sin ?.t + (2s + t) sin X (2s + t) sin Xs sin X(s + t)
Application to Wolfer's sunspot numbers.-In view of Yule's development of the finite
difference equation with specific regard to the analysis of the Wolfer sunspot numbers, it seems
desirable (without attempting here any complete discussion of these figures) to illustrate the present
theory on the same data. Following Yule, I have assumed first that the series of annual numbers
quoted might be represented by a second-order process. The two unknown constants are now
estimated from the first two correlations $r_1$ and $r_2$, this being fairly rapidly done by an
interpolatory method. The result is given in Table II. While I have stressed that the large standard
errors for $r_s$ when $\rho_s$ is small allow comparatively large departures from expectation, the estimated
damping factor appears excessive, being even greater than in Yule's analysis. Yule suggested
that the data were affected by random observational errors, and while this is not very apparent
from the annual averages he depicts, it is much more evident in the original quarterly figures
(see 11, Fig. 1). For comparison I include also therefore an analysis for Yule's smoothed figures.
Of course the use of averages and graduated figures is highly dangerous in analysing time-series
for periods, and it would seem more satisfactory to use the original quarterly figures with
appropriate inclusion in the estimation equations of the effect of any observational error. However,
the analysis for the smoothed figures is of some interest. The estimated period still comes out a
little lower than that usually accepted, but the discrepancy appears trivial compared with the
estimated period's standard error of the order of 11 per cent. The asymptotic formula used,
corresponding to 100 per cent. efficiency of estimation (which is certainly not reached), was
The present analysis does not refute Yule's original analysis, the bias noted in section 5 being
apparently negligible owing to the small damping for this series, and the estimated period actually
less than Yule's for both ungraduated and graduated data. The apparent adequate fit (for the
graduated series) does not, of course, prove that the theoretical model is correct, but it does place
the onus of proof on those who claim more complicated schemes or more accurate estimates to
provide sampling errors and tests in support of their claims.
The alternative suggestion by Yule that the sunspot series might represent the square $x_t^2$ of
the amplitude $x_t$ of an oscillating series rather than $x_t$ itself is one which seems to merit further
investigation, but the following notes indicate the difficulty of handling such a theory.
(i) The autocorrelation function $\rho_s(x^2)$ is no longer independent of the nature of the
distribution of $I_t$.
(ii) The autocorrelation function for various $\alpha$ may have a discontinuity at $\alpha = 0$, being
given by $\cos 2\lambda s$ when $\alpha = 0$, and by $\rho_s^2(x)$ for $\alpha \ne 0$ for $I_t$ normal.
(iii) The expected value of $r_s(x^2)$ as $T$ increases converges more and more slowly to $\rho_s(x^2)$
as $\alpha$ approaches 0.
These comments show that no simple $x^2$ theory, such as assuming the disturbances $I_t$ to be
normal, will fit the observed autocorrelations, which have both positive and negative values.
8. References
1. Barnes, R. B., and Silverman, S. "Brownian motion as a natural limit to all measuring processes,"
Reviews of Modern Physics, 6 (1934), 162.
2. Bartlett, M. S. "Some aspects of the time-correlation problem in regard to tests of significance," J.
Roy. Stat. Soc., 98 (1935), 536.
3. Chandrasekhar, S. "Stochastic problems in physics and astronomy," Reviews of Modern Physics, 15
(1943), 1.
4. Cramér, H. Random variables and probability distributions (Cambridge, 1937).
5. Daniell, P. J. "Sampling errors of the lag-covariance of fluctuating-time-series" (unpublished note).
6. Fowler, R. H. Statistical mechanics (Cambridge, 2nd ed., 1936).
7. Kendall, M. G. "Oscillatory movements in English agriculture," J. Roy. Stat. Soc., 106 (1944), 91.
8. Kendall, M. G. "On autoregressive time-series," Biometrika, 33 (1944), 105.
9. Kendall, M. G. "On the analysis of oscillatory time-series," J. Roy. Stat. Soc., 108 (1945), 93.
10. Khintchine, A. "Korrelationstheorie der stationären stochastischen Prozesse," Math. Annalen, 109
(1933-4), 604.
11. Larmor, J., and Yamaga, N. "On permanent periodicity in sunspots," Proc. Roy. Soc., A93 (1917),
493.
12. Mann, H. B., and Wald, A. "On the statistical treatment of linear stochastic difference equations,"
Econometrica, 11 (1943), 173.
13. Moyal, J. E. "Theory of random functions" (paper not yet published).
14. Rice, S. O. "Mathematical analysis of random noise," Bell System Tech. J., 23 (1944), 282; and 24
(1945), 46.
15. Slutsky, E. "Sur les fonctions éventuelles continues, intégrables et dérivables dans le sens
stochastique," Comptes Rendus, 187 (1928), 878.
16. Slutsky, E. "The summation of random causes as the source of cyclic processes," Econometrica, 5 (1937),
105.
17. Spencer Smith, J. L. "The specification of disturbed periodic time-series of the type of Wolfer's
sunspot numbers," J. Roy. Stat. Soc., 107 (1944), 231.
18. Yule, G. U. "On a method of investigating periodicities in disturbed series, with special reference to
Wolfer's sunspot numbers," Phil. Trans., A226 (1927), 267.
19. Yule, G. U. "On a method of studying time-series based on their internal correlations" (with "Note on
Mr. Yule's paper" by M. G. Kendall), J. Roy. Stat. Soc., 108 (1945), 208.
20. Wiener, N. "Generalized harmonic analysis," Acta Mathematica, 55 (1930), 117.
21. Wold, H. A study in the analysis of stationary time-series (Uppsala, 1938).
22. Zernike, F. "Die Brownsche Grenze für Beobachtungsreihen," Zeits. f. Physik, 79 (1932), 516.