
A Simple Method of Resolution of a Distribution into Gaussian Components

Author(s): C. G. Bhattacharya
Source: Biometrics, Vol. 23, No. 1 (Mar., 1967), pp. 115-135
Published by: International Biometric Society
Stable URL: http://www.jstor.org/stable/2528285
Accessed: 25/06/2014 04:14


This content downloaded from 91.229.229.205 on Wed, 25 Jun 2014 04:14:36 AM

All use subject to JSTOR Terms and Conditions
A SIMPLE METHOD OF RESOLUTION OF A
DISTRIBUTION INTO GAUSSIAN COMPONENTS

C. G. BHATTACHARYA
Central Inland Fisheries Research Institute, Barrackpore, India¹

SUMMARY
An approximate method of solution is given of the problem of resolution of a
distribution into Gaussian components when the component distributions are
adequately separated. Illustrative examples are given.

RESUME
Une solution approchée du problème de la résolution d'une distribution en
composantes gaussiennes est établie lorsque les distributions composantes sont
convenablement séparées. La méthode est illustrée par des exemples.

INTRODUCTION
The distribution of a morphometric character in a biological population
is a mixture of components corresponding to different species, broods,
sexes, etc. A problem which frequently arises is to find the relative
frequencies and the frequency distribution of such components by an
analysis of the observed frequency distribution. The frequency distribution
of any such component is usually assumed to be normal: hence
the problem is one of resolution of a distribution into Gaussian components.
For a population of fish such an analysis has been found to be very
helpful for population studies, particularly when determination of age
of a fish is difficult. The frequency distribution of length obtained
from a sample of fish is usually skew and polymodal: in many cases, the
modes correspond to individual age-groups and are very helpful for
separating them. Buchanan-Wollaston and Hodgeson [1929] disapproved
of the smoothing out of 'bumpy' distributions, as practiced
by the early fishery biologists, even for small samples. They suggested
that the individual 'humps' indicate meaningful modes around which
normal curves ought to be fitted.
The problem of resolution of a distribution into two Gaussian components,
and some particular cases of it, have been considered by several
authors using the methods of moments (Pearson [1894; 1915], Rao
[1948]), incomplete moments (Pearson and Lee [1908-09]), half moments
(Gottschalk [1948]), and maximum likelihood (Rao [1948]), and a
graphical procedure based on the relationship between skewness and
kurtosis (Preston [1953]). The difficulties encountered with these
methods increase at a tremendous rate as the number of components
increases, and the general problem in which the number of components
is unknown, and may be more than two, does not seem to have been
considered in the statistical literature.

¹ Present address: Institute of Statistics, University of Ghana, Legon, Accra, Ghana.

Various approximate methods have been suggested by fishery workers
for situations in which the components are adequately separated. The
probability paper method (Harding [1949], Cassie [1954]) involves
dissection of the distribution at the point of inflexion of the probit
plot, followed by correction for overlap of the components. The other
methods (Buchanan-Wollaston and Hodgeson [1929], Oka [1954],
Tanaka [1962]) depend on equating the class-frequency to the ordinate
at the midpoint of the class so that the logarithm of class frequency
is a quadratic function of the mid-point of the class in a region where the
effect of all but one component is negligible. The underlying idea,
similar to that of Pearson and Lee [1908-09], is to attempt to determine
a particular component from the region where the effect of all other
components is negligible.
While Buchanan-Wollaston and Hodgeson [1929] and Tanaka [1962]
fitted these parabolas directly to estimate the proportions of mixture
along with mean and s.d., Oka [1954] attempted to estimate only mean
and s.d. by fitting straight lines (representing derivatives of these
parabolas) to the differential coefficients of log class-frequency, estimated
by average divided differences for two consecutive classes. In his
numerical example Oka considered a constant class interval. It may
be observed that, if the class intervals are unequal, the parabolas will
vary from group to group, and approximating differential coefficients
by divided differences may involve large errors.
In this paper I consider a cubic approximation to density within
a class, but approximate the logarithm of class frequency by a quad-
ratic. This introduces some corrections for grouping. Further, the
class interval has been assumed to be constant, and since simple dif-
ferencing reduces a quadratic to a straight line, I have used direct
differences instead of approximate differential coefficients as used by
Oka. When the class intervals are unequal the differences may be
corrected by an iterative procedure. Finally, methods are suggested
and applied for estimation of the proportions in the mixture, which is
an important part of the analysis, unfortunately ignored in Oka's paper.


METHOD
Let $y(x)$ denote the observed frequency in the class with $x$ as its
mid-point and let $h$ denote the class interval. We plot $y(x + h)/y(x)$
against $x$ on semi-log paper, or $\Delta \log y = \log y(x + h) - \log y(x)$ against
$x$ on ordinary graph paper, and look for the regions where the graph
looks like a straight line with negative slope. Subject to certain conditions
(Appendix A), the number of such regions is the number of components.
We now take a translucent paper with a straight line drawn
on it and match the straight lines, noting for each such region the angle
(say $\theta_r$ for the $r$th line) it makes with the negative direction of the axis
of $x$, and the $x$-intercept (say $\lambda_r$ for the $r$th line). As shown in Appendix
A, the mean and s.d. of the $r$th component may be estimated by

$$\hat\mu_r = \hat\lambda_r + h/2 \tag{1}$$

$$\hat\sigma_r^2 = (dh \cot \hat\theta_r / b) - (h^2/12) \tag{2}$$

where $b$ and $d$ denote the relative scales for $x$ and $\Delta \log y$ respectively.
While matching the straight line it is better to fit closely to the points
where the frequency is large even if the apparent discrepancy becomes
somewhat large where the frequency is small.
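
The graphical step can also be mimicked numerically. The following is a minimal Python sketch, not part of the original method description: the function names `delta_log10` and `mean_sd_from_line` are illustrative assumptions, and the inputs are the first four frequencies of Table 1 and the third fitted line of Example 1.

```python
import math


def delta_log10(freqs):
    """Differences log10 y(x + h) - log10 y(x) for consecutive classes."""
    logs = [math.log10(y) for y in freqs]
    return [nxt - cur for cur, nxt in zip(logs, logs[1:])]


def mean_sd_from_line(theta_deg, lam, h, b, d):
    """Equations (1) and (2): mean and s.d. from the angle (in degrees) and
    x-intercept of a fitted line; b and d are the plotting scales."""
    mu = lam + h / 2.0
    variance = d * h / (b * math.tan(math.radians(theta_deg))) - h * h / 12.0
    if variance <= 0:
        raise ValueError("fitted line is too steep for this class interval")
    return mu, math.sqrt(variance)


# First four observed frequencies of Table 1 (mid-points 9.5 to 12.5).
diffs = delta_log10([509, 2240, 2341, 623])

# Third line of Example 1: angle 73.0 degrees, x-intercept 19.36.
h, b, d = 1.0, 10.0, 200 * math.log10(math.e)
mu3, sd3 = mean_sd_from_line(73.0, 19.36, h, b, d)
print([round(v, 3) for v in diffs], round(mu3, 2), round(sd3, 2))
```

The angle is taken in degrees because that is how the worked examples report it; the guard on the variance simply rejects a line so steep that equation (2) would give a non-positive value.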
Several methods may be used for estimation of the proportions of
mixture, after estimation of the $\mu_i$ and $\sigma_i$. Writing
$Y(x)$ = expected frequency in the class with $x$ as its mid-point,
$N_i$ = total frequency of the $i$th component,
$k$ = number of components,
$P(x)$ = distribution function of a standard normal deviate,
and

$$P_i(x) = P\left(\frac{x + \tfrac{1}{2}h - \mu_i}{\sigma_i}\right) - P\left(\frac{x - \tfrac{1}{2}h - \mu_i}{\sigma_i}\right) \tag{3}$$

we may consider methods based on the following four formulae.

(i) $Y = \sum_{i=1}^{k} N_i P_i$.
This relationship may be easily fitted, since an estimate $\hat P_i$ of $P_i$ is
obtainable by substituting $\hat\mu_i$ and $\hat\sigma_i$ for $\mu_i$ and $\sigma_i$ in (3). Instead of
going into the complications of fitting the above relationship considering
both variables subject to error, it seems easier to use ordinary
regression methods, treating $\hat P_i$ as fixed. Then the estimated $N_i$ are
solutions of

(ia) $$\sum_{i=1}^{k} \hat N_i \sum \hat P_i \hat P_j = \sum y \hat P_j, \qquad j = 1, \dots, k \tag{4}$$

where summation is over all classes in the range of the distribution.

Alternatively, the following simpler formula may be used:

(ib) $$\sum_{i=1}^{k} \hat N_i \sum_r \hat P_i = \sum_r y, \qquad r = 1, \dots, k \tag{5}$$

where $\sum_r$ stands for summation over the classes fitted by the $r$th line.
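(ii) $Y \approx N_r P_r + N_{r+1} P_{r+1}$
for the classes fitted by the $r$th or $(r + 1)$th line or which lie in between
them. Estimates of $N_r$ and $N_{r+1}$ from this, denoted by $\hat N_{r(r+1)}$ and
$\hat N_{r+1(r)}$, may be obtained from

$$\hat N_{r(r+1)} \sum \hat P_r^2 + \hat N_{r+1(r)} \sum \hat P_r \hat P_{r+1} = \sum y \hat P_r$$

$$\hat N_{r(r+1)} \sum \hat P_r \hat P_{r+1} + \hat N_{r+1(r)} \sum \hat P_{r+1}^2 = \sum y \hat P_{r+1} \tag{6}$$

where summation is restricted to the region under consideration. The
proportions of the various components in the mixture, $p_i$ say
($i = 1, \dots, k$), may then be estimated from the relations

$$\hat p_{i+1}/\hat p_i = \hat N_{i+1(i)}/\hat N_{i(i+1)}, \quad i = 1, \dots, k-1; \qquad \sum_{i=1}^{k} \hat p_i = 1. \tag{7}$$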
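
As an illustration of method (ii), the sketch below solves the pair of equations (6) for two neighbouring components by Cramer's rule. It is a hedged example only: the helper names (`phi`, `class_prob`, `pair_totals`) and the artificial two-component frequencies are assumptions made for the demonstration and do not come from the paper.

```python
import math


def phi(z):
    """Distribution function of a standard normal deviate."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def class_prob(x, mu, sigma, h):
    """Equation (3): probability of the class with mid-point x and width h."""
    return phi((x + h / 2 - mu) / sigma) - phi((x - h / 2 - mu) / sigma)


def pair_totals(xs, ys, comp_r, comp_s, h):
    """Solve equations (6) for the totals of two neighbouring components.
    comp_r and comp_s are (mean, s.d.) pairs."""
    pr = [class_prob(x, comp_r[0], comp_r[1], h) for x in xs]
    ps = [class_prob(x, comp_s[0], comp_s[1], h) for x in xs]
    a11 = sum(p * p for p in pr)
    a12 = sum(p * q for p, q in zip(pr, ps))
    a22 = sum(q * q for q in ps)
    b1 = sum(y * p for y, p in zip(ys, pr))
    b2 = sum(y * q for y, q in zip(ys, ps))
    det = a11 * a22 - a12 * a12
    return (b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det


# Artificial check: exact class frequencies from two known components.
h = 1.0
xs = [8.5 + i for i in range(11)]
comp_r, comp_s = (11.0, 0.8), (15.0, 1.1)
ys = [5000 * class_prob(x, comp_r[0], comp_r[1], h)
      + 4000 * class_prob(x, comp_s[0], comp_s[1], h) for x in xs]
n_r, n_s = pair_totals(xs, ys, comp_r, comp_s, h)
ratio = n_s / n_r  # estimate of p_(r+1) / p_r, as in equations (7)
print(round(n_r, 3), round(n_s, 3), round(ratio, 4))
```

Because the artificial frequencies are exact expected values, the two totals are recovered essentially without error; with observed frequencies the same function returns the least-squares estimates.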

(iii) $Y \approx N_r P_r$
for the classes fitted by the $r$th line; $N_r$ may thus be estimated by either

(iiia) $$\hat N_r = \sum y \hat P_r \Big/ \sum \hat P_r^2 \tag{8}$$

or

(iiib) $$\hat N_r = \sum y \Big/ \sum \hat P_r \tag{9}$$

where summation is restricted to the region under consideration.
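
Equations (8) and (9) are short enough to evaluate directly. The sketch below is an illustrative Python rendering (the function names are assumptions); it uses the observed frequencies of Table 1 for the classes around the third component, together with the estimates $\hat\mu_3 = 19.86$ and $\hat\sigma_3 = 1.60$ from Example 1.

```python
import math


def phi(z):
    """Distribution function of a standard normal deviate."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def class_prob(x, mu, sigma, h):
    """Equation (3): probability of the class with mid-point x and width h."""
    return phi((x + h / 2 - mu) / sigma) - phi((x - h / 2 - mu) / sigma)


def total_iiia(xs, ys, mu, sigma, h):
    """Equation (8): sum(y * P) / sum(P ** 2) over the chosen classes."""
    p = [class_prob(x, mu, sigma, h) for x in xs]
    return sum(y * q for y, q in zip(ys, p)) / sum(q * q for q in p)


def total_iiib(xs, ys, mu, sigma, h):
    """Equation (9): sum(y) / sum(P) over the chosen classes."""
    return sum(ys) / sum(class_prob(x, mu, sigma, h) for x in xs)


mu3, sd3, h = 19.86, 1.60, 1.0
# Two classes near the centre of the third straight region (Table 1).
n3_two = total_iiib([19.5, 20.5], [719, 673], mu3, sd3, h)
# All four classes in the range 18-22.
xs4, ys4 = [18.5, 19.5, 20.5, 21.5], [512, 719, 673, 445]
n3_a = total_iiia(xs4, ys4, mu3, sd3, h)
n3_b = total_iiib(xs4, ys4, mu3, sd3, h)
print(round(n3_two), round(n3_a), round(n3_b))
```

Formula (9) weights every class equally, whereas (8) gives more weight to classes where the component's probability is large, which is why the two can differ slightly on the same region.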
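(iv) As shown in Appendix A,

$$\log y \approx \log \frac{hN_r}{\sigma_r \sqrt{2\pi}} - \frac{h^2}{24\sigma_r^2} - \frac{[\sigma_r^2 - (h^2/12)]}{2\sigma_r^4} (x - \mu_r)^2$$

for the classes fitted by the $r$th line; an estimate of $N_r$ is given by

$$\log \hat N_r = \frac{1}{n} \sum \log y + \frac{[\hat\sigma_r^2 - (h^2/12)]}{2n\hat\sigma_r^4} \sum (x - \hat\mu_r)^2 + \frac{h^2}{24\hat\sigma_r^2} + \log \frac{\hat\sigma_r}{h} + \tfrac{1}{2} \log 2\pi \tag{10}$$

where summation is restricted to the region under consideration, and
$n$ denotes the number of classes in the region. (If common logarithms
are used, the second and third terms on the R.H.S. of (10) must be multiplied
by $\log_{10} e$, as illustrated in Appendix B.)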
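
Method (iv) needs no normal tables, only logarithms. The following Python sketch evaluates the estimate with natural logarithms; it assumes the form of equation (10) that follows from the expression for $\log y$ in Appendix A, and the function name `total_iv` and the artificial single-component data are illustrative assumptions rather than material from the paper.

```python
import math


def total_iv(xs, ys, mu, sigma, h):
    """Estimate N_r from the mean log frequency (method (iv), natural logs)."""
    n = len(xs)
    mean_log = sum(math.log(y) for y in ys) / n
    quad = (sigma ** 2 - h ** 2 / 12) / (2 * n * sigma ** 4) * sum(
        (x - mu) ** 2 for x in xs)
    grouping = h ** 2 / (24 * sigma ** 2)
    const = math.log(sigma / h) + 0.5 * math.log(2 * math.pi)
    return math.exp(mean_log + quad + grouping + const)


def phi(z):
    """Distribution function of a standard normal deviate."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Artificial check: exact class frequencies of one component with N = 1000.
mu, sigma, h, n_true = 19.86, 1.60, 1.0, 1000.0
xs = [18.5, 19.5, 20.5, 21.5]
ys = [n_true * (phi((x + h / 2 - mu) / sigma) - phi((x - h / 2 - mu) / sigma))
      for x in xs]
n_hat = total_iv(xs, ys, mu, sigma, h)
print(round(n_hat, 1))
```

The small remaining discrepancy from the true total in such a check comes from the neglected terms in $h^4$ and higher powers, so it grows as the class interval becomes large relative to the s.d.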


It may be observed that all the methods described above, with the
exception of (ii), yield a single estimate of the total frequency of each
component; from these the proportions of the mixture may be estimated
by

$$\hat p_i = \hat N_i \Big/ \sum_{i=1}^{k} \hat N_i. \tag{11}$$

The use of equations (1) and (2) for estimation of mean and s.d. is
illustrated in examples 1 and 2, where, for simplicity, the $N_i$ are estimated
by equation (9), but considering only two classes near the centre
of each straight region. Equations (4)-(11) are illustrated in Appendix
B, using the data of example 1.
Method (i) is very laborious when the number of components is
large. If the components are well separated the diagonal terms dominate
in the coefficient matrix of equations (4), so that an iterative procedure
(Bodewig [1956]) is very suitable. Method (ib) is a good substitute
for method (ia) and is less laborious, in that computation of sums of
squares and products is replaced by computation of partial sums, and
that the diagonal terms in the coefficient matrix are more dominant, so
that the iteration process converges more rapidly. Methods (iii) and (iv)
are simple, (iv) being particularly useful when statistical tables are
not available. Method (ii) appears to be a good compromise between
methods (i) and (iii). There seems little point in undertaking a laborious
calculation to estimate the $N_i$ with high apparent precision when the
$\hat\mu_i$ and $\hat\sigma_i$ are themselves subject to error.

SOME SPECIAL CASES

The method described above is strictly valid under certain precise
conditions. In some special cases these conditions are not satisfied,
but the method may still prove useful.
Figure 1a illustrates a two-component situation where the component
a is completely overlapped by b. In such a situation the graph of the
logarithmic difference of the class-frequency against the mid-point of
the class will indicate a straight line corresponding to component b.
Component a can then be determined by subtraction of the frequencies
due to component b from the frequency distribution of the mixture.
Figure 1b illustrates a three-component situation where the component
b is completely overlapped by a and c: in the middle region all
three components overlap. In such a situation the graph will reveal
two straight lines from which a and c can be determined: component
b can then be determined by subtraction.
In Figure 1c (three components) the components a and c are completely
overlapped by b. In this case the graph will show a straight
line corresponding to component b: subtracting the frequencies due to
b from the frequency distribution of the mixture then leaves a simple
mixture (without overlap) of a and c.

FIGURE 1
i) TWO NORMAL DISTRIBUTIONS: a COMPLETELY OVERLAPPED BY b
ii) THREE NORMAL DISTRIBUTIONS: b COMPLETELY OVERLAPPED BY a AND c
iii) THREE NORMAL DISTRIBUTIONS: BOTH a AND c ARE COMPLETELY OVERLAPPED

In the general situation, where it is not known whether the assump-
tions underlying the present method are satisfied, the validity of the
method may be roughly examined from the data. One first determines
those components which are revealed in the graph of logarithmic
difference of the class frequency against the mid-point of the class.
The frequencies due to these components may then be subtracted from
the observed frequency distribution: if all the components have been
determined, the residual frequency should be negligible. A non-
negligible residual frequency indicates that the conditions underlying
the present method are not satisfied: however, the process may be
repeated with the residual frequency, and further components possibly
so determined.
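
The subtraction step can be sketched in a few lines of Python. This is an illustrative example only (the function names are assumptions): it removes the third and fifth components of Example 1, with the totals 2984 and 516 quoted there, from the observed frequencies of the classes 22-26.

```python
import math


def phi(z):
    """Distribution function of a standard normal deviate."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def class_prob(x, mu, sigma, h):
    """Equation (3): probability of the class with mid-point x and width h."""
    return phi((x + h / 2 - mu) / sigma) - phi((x - h / 2 - mu) / sigma)


def residual_frequencies(xs, ys, components, h):
    """Observed frequency minus the expected contribution of the components
    already determined; components is a list of (N, mean, s.d.) triples."""
    return [y - sum(n * class_prob(x, mu, sd, h) for n, mu, sd in components)
            for x, y in zip(xs, ys)]


xs = [22.5, 23.5, 24.5, 25.5]
ys = [341, 310, 228, 168]
fitted = [(2984, 19.86, 1.60), (516, 26.62, 1.47)]
res = residual_frequencies(xs, ys, fitted, 1.0)
print([round(r, 1) for r in res])
```

If the residuals are clearly positive over a run of classes, as here, their logarithmic differences can be plotted again to look for a further component.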

EXAMPLES

Example 1
The data (Table 1) for this example are taken from Tanaka [1962]
and relate to the frequency distribution of forkal length of Porgy caught
by the pair-trawl fishery of the East China Sea.
The graph (Figure 2) of logarithmic differences of class frequency
against the midpoint of the class (circles) shows four approximately
straight regions with negative slope, indicating four distinct components.
The presence of a further component between the last two of these,
but substantially overlapped by them, is also suggested (the solid
points, and the line through them, are not yet available at this stage
of the argument). The parameters of this component cannot be esti-
mated directly, and the approach indicated in the previous section
must be adopted.
After matching straight lines with each of the approximately straight
regions mentioned above, we get
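$h = 1$, $b = 10$, $d = 200 \times \log_{10} e = 86.858$,
$\hat\lambda_1 = 10.53$, $\hat\lambda_2 = 14.78$, $\hat\lambda_3 = 19.36$, $\hat\lambda_5 = 26.12$,
$\hat\theta_1 = 85.25°$, $\hat\theta_2 = 81.25°$, $\hat\theta_3 = 73.0°$, $\hat\theta_5 = 75.5°$.
Hence, from (1) and (2),
$\hat\mu_1 = 11.03$, $\hat\mu_2 = 15.28$, $\hat\mu_3 = 19.86$, $\hat\mu_5 = 26.62$,
$\hat\sigma_1 = .81$, $\hat\sigma_2 = 1.13$, $\hat\sigma_3 = 1.60$, $\hat\sigma_5 = 1.47$.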


TABLE 1
EXAMPLE 1: FREQUENCY DISTRIBUTION OF FORKAL LENGTH OF PORGIES

Class range   Mid-point   Observed frequency (y)   log10 y   Δ log10 y

 9-10           9.5            509                  2.707       .643
10-11          10.5           2240                  3.350       .019
11-12          11.5           2341                  3.369      -.575
12-13          12.5            623                  2.794      -.116
13-14          13.5            476                  2.678       .412
14-15          14.5           1230                  3.090       .068
15-16          15.5           1439                  3.158      -.194
16-17          16.5            921                  2.964      -.313
17-18          17.5            448                  2.651       .058
18-19          18.5            512                  2.709       .148
19-20          19.5            719                  2.857      -.029
20-21          20.5            673                  2.828      -.180
21-22          21.5            445                  2.648      -.115
22-23          22.5            341                  2.533      -.042
23-24          23.5            310                  2.491      -.133
24-25          24.5            228                  2.358      -.133
25-26          25.5            168                  2.225      -.079
26-27          26.5            140                  2.146      -.089
27-28          27.5            114                  2.057      -.251
28-29          28.5             64                  1.806      -.464
29-30          29.5             22                  1.342

For simplicity, an estimate of the total frequency of each component
was found by using formula (9), but considering only two classes near
the centre of the straight region. Thus,

$$\hat N_1 = \frac{y(10.5) + y(11.5)}{\hat P_1(10.5) + \hat P_1(11.5)} = 5811;$$

$$\hat N_2 = \frac{y(14.5) + y(15.5)}{\hat P_2(14.5) + \hat P_2(15.5)} = 4381;$$

$$\hat N_3 = \frac{y(19.5) + y(20.5)}{\hat P_3(19.5) + \hat P_3(20.5)} = 2984;$$

$$\hat N_5 = \frac{y(26.5) + y(27.5)}{\hat P_5(26.5) + \hat P_5(27.5)} = 516.$$

The contributions of the third and fifth components are now subtracted
from the observed frequencies in the intermediate region
(Table 2).

FIGURE 2
EXAMPLE 1: GRAPH OF LOGARITHMIC DIFFERENCES OF THE CLASS-FREQUENCIES
AGAINST THE MID-POINTS OF THE CLASSES
(Horizontal axis: mid-point of class.)


TABLE 2
EXAMPLE 1: CALCULATION OF RESIDUAL FREQUENCIES IN REGION BETWEEN
THIRD AND FIFTH COMPONENTS

Here $y_R = y - \hat N_3 \hat P_3 - \hat N_5 \hat P_5$.

x       y     P̂3       P̂5       y_R   log10 y_R   Δ log10 y_R

22.5   341   .06568   .00606   142    2.1523       .2169
23.5   310   .02002   .03045   234    2.3692      -.1517
24.5   228   .00417   .09788   165    2.2175      -.4251
25.5   168   .00060   .20139    62    1.7924

The new graph (solid points in Figure 2) shows an approximately
straight region with negative slope, clearly pointing out the intermediate
component. For this component,
$\hat\lambda_4 = 23.12$, $\hat\theta_4 = 82.0°$.
Hence
$\hat\mu_4 = 23.62$, $\hat\sigma_4 = 1.07$,

$$\hat N_4 = \frac{y_R(23.5) + y_R(24.5)}{\hat P_4(23.5) + \hat P_4(24.5)} = 639.$$

Finally, from (11),
$\hat p_1 = .4065$, $\hat p_2 = .3067$, $\hat p_3 = .2087$, $\hat p_4 = .0420$, $\hat p_5 = .0361$.

The results obtained by the present method, without using trial
and error, are in close agreement with those obtained by Tanaka [1962]
using other methods involving trial and error to get improved results
(Table 3).
Example 2
In example 1 the method was applied to real data on a fish popula-
tion. Here it is applied to a known mixture of Gaussian distributions
(Table 4) with adequate separation of the components.
The distribution has three distinct modes corresponding to the three
components. The troughs between the modes, the corresponding points
of inflexion of the ogive as well as the points of inflexion of the probit
plot (Figure 3) yield proportions of mixture close to the actual values.
The graph of logarithmic difference of frequency against the mid-
point of the class (Figure 4) shows three approximately straight regions
with negative slope, indicating the presence of the three components.


TABLE 3
EXAMPLE 1: COMPARISON OF THE RESULTS OBTAINED BY FOUR METHODS
A. BUCHANAN-WOLLASTON  B. CASSIE  C. TANAKA  D. AUTHOR

                                       Components
Parameter                  Method   1        2        3        4        5

Mean (cm.)                 A        11.05    15.32    19.85    23.58    26.82
                           B        11.02    15.33    19.85    23.46    26.92
                           C        10.99    15.26    19.84    23.50    26.82
                           D        11.03    15.28    19.86    23.62    26.62

Standard deviation (cm.)   A        .844     1.161    1.412    1.212    1.443
                           B        .76      1.15     1.32     1.29     1.54
                           C        .8       1.2      1.4      1.2      1.4
                           D        .81      1.13     1.60     1.07     1.47

Proportions of mixture     A        .4072    .3110    .1860    .0642    .0316
                           B        .4049    .3164    .1788    .0693    .0307
                           C        .4007    .3194    .1873    .0598    .0328
                           D        .4065    .3067    .2087    .0420    .0361

TABLE 4
EXAMPLE 2: ARTIFICIAL MIXTURE OF THREE GAUSSIAN DISTRIBUTIONS

Class-range (cm.)   Mid-point (x)   Frequency (y)   loge y    Δ loge y

 8-9                  8.5               31          3.4340     2.84
 9-10                 9.5              532          6.2766     1.42
10-11                10.5             2198          7.6953      .04
11-12                11.5             2297          7.7394    -1.21
12-13                12.5              685          6.5294     -.33
13-14                13.5              494          6.2025      .88
14-15                14.5             1188          7.0800      .22
15-16                15.5             1479          7.2991     -.46
16-17                16.5              938          6.8438     -.66
17-18                17.5              486          6.1882      .10
18-19                18.5              537          6.2860      .27
19-20                19.5              702          6.5539     -.06
20-21                20.5              664          6.4983     -.43
21-22                21.5              431          6.0661     -.81
22-23                22.5              192          5.2575    -1.18
23-24                23.5               59          4.0775    -1.59
24-25                24.5               12          2.4849    -1.79
25-26                25.5                2           .6932

FIGURE 3
EXAMPLE 2: PROBIT PLOT OF THE FREQUENCY DISTRIBUTION
(Horizontal axis: cumulative frequency, probit scale.)

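Here
$h = 1$, $b = 50$, $d = 10$,
$\hat\lambda_1 = 10.54$, $\hat\lambda_2 = 14.81$, $\hat\lambda_3 = 19.35$,
$\hat\theta_1 = 82.0°$, $\hat\theta_2 = 74.0°$, $\hat\theta_3 = 62.0°$.
Hence,
$\hat\mu_1 = 11.04$, $\hat\mu_2 = 15.31$, $\hat\mu_3 = 19.85$,
$\hat\sigma_1 = .78$, $\hat\sigma_2 = 1.16$, $\hat\sigma_3 = 1.60$,

$$\hat N_1 = \frac{y(10.5) + y(11.5)}{\hat P_1(10.5) + \hat P_1(11.5)};$$

$$\hat N_2 = \frac{y(14.5) + y(15.5)}{\hat P_2(14.5) + \hat P_2(15.5)} = 4496;$$

$$\hat N_3 = \frac{y(19.5) + y(20.5)}{\hat P_3(19.5) + \hat P_3(20.5)} = 2930;$$

$\hat p_1 = .4311$, $\hat p_2 = .3444$, $\hat p_3 = .2245$.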

FIGURE 4
EXAMPLE 2: GRAPH OF LOGARITHMIC DIFFERENCES OF THE CLASS FREQUENCIES
AGAINST THE MID-POINTS OF THE CLASSES
(Horizontal axis: mid-point of class.)


TABLE 5
EXAMPLE 2: COMPARISON OF THE RESULTS OBTAINED BY AUTHOR'S METHOD
WITH THE VALUES USED TO CONSTRUCT THE 'DATA' (IN PARENTHESES)

                                    Component
Parameters                   1          2          3

mean                        11.03      15.28      19.86
                           (11.04)    (15.31)    (19.85)

standard deviation           .80       1.12       1.60
                            (.78)     (1.16)     (1.60)

proportions of mixture       .4404      .3334      .2262
                            (.4311)    (.3444)    (.2245)

Table 5 compares the values of the parameters with the actual
values (in parentheses).

DISCUSSION
A satisfactory practical solution of the problem under investiga-
tion continues to elude mathematical statisticians. Although the
problem admits a neat theoretical solution, very great difficulties would
be encountered in the practical application of the theoretical results.
The methods so far adopted by fishery workers, as well as that presented
in this paper, are all approximate in the sense that they are applicable
only when the components are adequately separated.
The mathematical basis of the probability paper method is not
very clear. The other methods have a clear mathematical basis, are
effective even with considerable overlap of the components provided
the sample is sufficiently large, and when applicable require no correc-
tion for truncation of the components.
The use of differences, as Tanaka [1962] remarked, involves the
danger of magnifying errors in the frequency distribution. This is the
reason why numerical differentiation is generally viewed with much
concern (Nielson [1956]). It may, however, be recalled that Fisher
[1950] used logarithmic differences in place of relative rates for fitting
a logistic curve, and that a similar approach by Bhattacharya [1964-65]
has been found to be quite satisfactory for fitting a more general class
of growth curves which includes the logistic curve as a particular case.
In the present case, the differences are used directly rather than as
substitutes for differential coefficients. Hence, in view of the simplifying
assumptions already made, the use of first differences does not seem
to be objectionable provided due care is taken with small frequencies.
The advantage of a linear transform in any applied research can
hardly be overemphasized. In the present case it has the great ad-
vantage that it reduces subjective elements to a minimum and is
certainly the quickest and simplest of all the existing methods.
The assumption that the class-range should be small is important,
and the sample should be sufficiently large so that the class frequencies
do not become very small in the regions of interest. This point may
be taken into consideration at the stage of collection and compilation
of data. If it is felt that the original class-width is too large for the
method to be applicable, it may sometimes be useful to divide the
original class into an odd number of subclasses and work with the fre-
quency of the central subclass, which may be estimated by some
smoothing formula such as King's formulae (Willers [1948]). This would
require a distinction between the class-width and the class interval.

ACKNOWLEDGEMENTS
The author wishes to express his gratitude to Dr. B. S. Bhimachar,
Director of the Institute, for his constant encouragement during the
course of the study, to Shri V. R. Pantulu for inspiring the study and
his constant help and to Prof. H. K. Nandy of the University of Calcutta
for his helpful criticism and valuable advice during the progress of the
work. Thanks are also due to Shri P. Datta for his useful suggestions
in connection with the study.

REFERENCES
Bhattacharya, C. G. [1966]. Fitting a class of growth curves. Sankhya B 28, 1-10.
Bodewig, E. [1956]. Matrix calculus. 1st Edn. Amsterdam: North Holland Publ. Co.
Buchanan-Wollaston, H. G. and Hodgeson, W. C. [1929]. A new method of treating
frequency curves in fishery statistics, with some results. J. Cons. 4, 207-25.
Cassie, R. M. [1954]. Some uses of probability paper for the graphical analysis of
polymodal frequency distributions. Aust. J. Mar. Freshw. Res. 5, 513-22.
Fisher, R. A. [1950]. Statistical methods for research workers. 11th Edn. Edinburgh:
Oliver and Boyd.
Fisher, R. A. and Yates, F. [1963]. Statistical tables for biological, agricultural and
medical research. 6th Edn. London: Oliver and Boyd.
Gottschalk, V. H. [1948]. Symmetrical bimodal frequency curves. J. Franklin
Inst. 245, 245-52.
Harding, J. F. [1949]. The use of probability paper for the graphical analysis of
polymodal frequency distributions. J. Mar. biol. Ass. U. K. 28, 141-53.
Nielson, K. L. [1956]. Methods in numerical analysis. 1st Edn. New York: The
Macmillan Company.
Oka, M. [1954]. Ecological studies on the kidai by the statistical method II. On the
growth of kidai (Taius tumifrons). Bull. Fac. Fish. Nagasaki 2, 8-25.
Pearson, E. S. and Hartley, H. O. [1958]. Biometrika tables for statisticians. Vol. 1.
2nd Edn. London: Cambridge University Press.
Pearson, K. [1894]. Contribution to the mathematical theory of evolution. Phil.
Trans. A 185, 71-110.
Pearson, K. and Lee, A. [1908-09]. On the generalized probable error in multiple
normal correlation. Biometrika 6, 59-68.
Pearson, K. [1915]. On the problem of sexing osteometric material. Biometrika 10,
479-87.
Preston, E. J. [1953]. A graphical method for analysis of statistical distributions
into normal components. Biometrika 40, 460-64.
Rao, C. R. [1948]. The utilisation of multiple measurements in problems of biological
classification. J. R. Statist. Soc. B 10, 159-93.
Tanaka, S. [1962]. A method of analysing of polymodal frequency distribution and
its application to the length distribution of the Porgy, Taius tumifrons (J. and
S.). J. Fish. Res. Bd. Can. 19, 1143-59.
Willers, F. A. [1948]. Practical analysis: Graphical and numerical methods. Tr. by
Robert T. Beyer. 1st Edn. New York: Dover Publications.

APPENDIX A
THE UNDERLYING MATHEMATICAL ASSUMPTIONS

Let the frequency function be a mixture of $k$ Gaussian distributions
with parameters $(N_i, \mu_i, \sigma_i)$, $i = 1, \dots, k$. We assume that the component
distributions are sufficiently separated for there to exist for
each component a sufficiently broad region where the effect of all other
components is comparatively negligible. We further assume that the
class-range is sufficiently small. Let $h$ denote the class interval and
$y$ denote the frequency in the class with $x$ as its mid-point. Then,

$$y = \int_{x-h/2}^{x+h/2} \sum_{i=1}^{k} \frac{N_i}{\sigma_i \sqrt{2\pi}} e^{-(v-\mu_i)^2/2\sigma_i^2} \, dv \approx \frac{N_r}{\sigma_r \sqrt{2\pi}} \int_{x-h/2}^{x+h/2} e^{-(v-\mu_r)^2/2\sigma_r^2} \, dv$$

in the region where the effect of all except the $r$th component is negligible.
Writing $v = x + \sigma_r u$, this becomes

$$y \approx N_r \int_{-h/2\sigma_r}^{h/2\sigma_r} Z(t_r + u) \, du = N_r \int_{-h/2\sigma_r}^{h/2\sigma_r} \sum_{s=0}^{\infty} \frac{Z^{(s)}(t_r)}{s!} u^s \, du = N_r \sum_{s=0}^{\infty} \frac{Z^{(s)}(t_r)}{s!} \int_{-h/2\sigma_r}^{h/2\sigma_r} u^s \, du$$

where $Z$ stands for the density function of a standard normal deviate
and $Z^{(s)}$ for its $s$th derivative, and $t_r = (x - \mu_r)/\sigma_r$.
Carrying out the integration w.r.t. $u$ and expressing $Z^{(s)}(t)$ as
product of $Z(t)$ and Hermite polynomial of the $s$th degree, we have

$$y \approx \frac{hN_r}{\sigma_r \sqrt{2\pi}} e^{-t_r^2/2} \left\{ 1 + \frac{h^2}{24\sigma_r^2} (t_r^2 - 1) \right\}$$

neglecting terms involving $h^5$ and higher powers of $h$.
Taking logarithms and neglecting terms involving $h^4$ and higher
powers of $h$,

$$\log y \approx \log \frac{hN_r}{\sigma_r \sqrt{2\pi}} - \frac{h^2}{24\sigma_r^2} - \frac{\sigma_r^2 - (h^2/12)}{2\sigma_r^2} t_r^2.$$

Now,

$$\Delta t_r^2 = 2h(x - \mu_r + h/2)/\sigma_r^2$$

whence

$$\Delta \log y \approx -h(\sigma_r^2 - h^2/12)(x - \mu_r + h/2)/\sigma_r^4.$$

This shows that the graph of $\Delta \log y$ against $x$ is a straight line
with negative slope equal to $-h[\sigma_r^2 - (h^2/12)]/\sigma_r^4$.
When plotting it may be necessary to choose different scales for
$x$ and $\Delta \log y$. If $b$ and $d$ denote the scales for $x$ and $\Delta \log y$ respectively,
the slope becomes $-dh(\sigma_r^2 - h^2/12)/b\sigma_r^4$.
Let $\theta_r$ be the acute angle made by the line with the negative direction
of the axis of $x$. Then, if $a = b \tan \theta_r / dh$,

$$a\sigma_r^4 - \sigma_r^2 + \frac{h^2}{12} = 0,$$

i.e.,

$$\sigma_r^2 = \frac{1 \pm \sqrt{1 - (ah^2/3)}}{2a} \approx \frac{1 \pm [1 - (ah^2/6)]}{2a}$$

neglecting terms involving $h^4$ and higher powers of $h$: thus
$\sigma_r^2 \approx h^2/12$ or $1/a - h^2/12$.
The solution $\sigma_r^2 \approx h^2/12$ has obviously to be rejected, and hence we find

$$\sigma_r^2 \approx 1/a - h^2/12 = dh \cot \theta_r / b - h^2/12.$$

Let $\lambda_r$ be the value of $x$ corresponding to which $\Delta \log y$ is zero: then
from the expression for $\Delta \log y$ given earlier,

$$\lambda_r - \mu_r + h/2 = 0,$$

$$\mu_r = \lambda_r + h/2.$$
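
The linearity of $\Delta \log y$ can be checked numerically on a single Gaussian component. The sketch below is an illustration under stated assumptions, not part of the original derivation: it replaces the graphical fit by an ordinary least-squares line (the helper `fit_line` is hypothetical), works with natural logarithms and unit plotting scales ($b = d = 1$), and uses artificial class frequencies generated from known parameters.

```python
import math


def phi(z):
    """Distribution function of a standard normal deviate."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fit_line(xs, zs):
    """Ordinary least-squares slope and intercept of z on x."""
    n = len(xs)
    mx, mz = sum(xs) / n, sum(zs) / n
    slope = (sum((x - mx) * (z - mz) for x, z in zip(xs, zs))
             / sum((x - mx) ** 2 for x in xs))
    return slope, mz - slope * mx


# Exact class frequencies of one component (assumed values for the check).
mu, sigma, h, total = 19.86, 1.60, 1.0, 1000.0
mids = [17.5 + i * h for i in range(6)]
y = [total * (phi((x + h / 2 - mu) / sigma) - phi((x - h / 2 - mu) / sigma))
     for x in mids]

# Delta log y at each mid-point x is log y(x + h) - log y(x).
xs = mids[:-1]
dlog = [math.log(b) - math.log(a) for a, b in zip(y, y[1:])]

slope, intercept = fit_line(xs, dlog)
a = -slope / h                         # plays the role of b*tan(theta)/(d*h)
lam = -intercept / slope               # x at which delta log y is zero
mu_hat = lam + h / 2
sigma_hat = math.sqrt(1.0 / a - h * h / 12.0)
print(round(mu_hat, 3), round(sigma_hat, 3))
```

With a class interval that is small relative to the s.d., the recovered mean and s.d. agree closely with the values used to generate the frequencies, which is what the approximation above predicts.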

APPENDIX B
EXAMPLE

The various methods suggested in §2 for estimation of the proportions
of mixture will be illustrated on the data of example 1. Since
these data do not conform to the assumptions of Appendix A, with the
result that the 4th component cannot be isolated before the total
frequencies of the other components are determined, we restrict the
illustration to the first three components, and consider only the classes
in the range 9-22, in which we may neglect the effect of the 4th and
5th components. As suggested in §2, the systems of linear equations
encountered in Methods (ia) and (ib) are conveniently solved by a
common iteration procedure called iteration II by Bodewig [1956].
This method starts with an initial solution, obtained by ignoring all
except the diagonal terms of the coefficient matrix: the $l$th approximation
is then obtained by adjusting the right hand sides for the off-diagonal
terms, calculated using the $(l - 1)$th approximation.
Preliminary calculations are shown in Table 6: values of $\hat\mu_i$ and $\hat\sigma_i$
are from Table 3.

TABLE 6
COMPARISON OF METHODS (i)-(iv) (APPENDIX B): PRELIMINARY CALCULATIONS

Class range   Mid-point      P̂1       P̂2       P̂3
(cm.)         of class (x)   × 10^5   × 10^5   × 10^5    y      log10 y   x − μ̂1   x − μ̂2   x − μ̂3

 8-9            8.5            550
 9-10           9.5           9338                        509    2.707     -1.53
10-11          10.5          38608        7              2240    3.350      -.53
11-12          11.5          40230      163              2341    3.369       .47
12-13          12.5          10576     1919        1      623    2.794      1.47
13-14          13.5            680    10565       11      476    2.678               -1.78
14-15          14.5             10    27475      107     1230    3.090                -.78
15-16          15.5                   33853      673     1439    3.158                 .22
16-17          16.5                   19787     2901      921    2.964                1.22
17-18          17.5                    5473     8559      448    2.651
18-19          18.5                     713    17294      512    2.709                         -1.36
19-20          19.5                      44    23940      719    2.857                          -.36
20-21          20.5                       1    22706      673    2.828                           .64
21-22          21.5                            14755      445    2.648                          1.64

Method (ia) Summing over classes in the range 9-22, equations (4)
become
.330854308 $N_1$ + .003458204 $N_2$ + .000001913 $N_3$ = 1923.38220
.003458204 $N_1$ + .243821904 $N_2$ + .014349321 $N_3$ = 1102.03553
.000001913 $N_1$ + .014349321 $N_2$ + .168761528 $N_3$ = 555.26670


The iterative solution, with the results of the successive iterations
arranged columnwise, is

$\hat{N}_1$   5813.38   5766.12   5769.01   5768.76   5768.78
$\hat{N}_2$   4519.83   4243.75   4267.04   4265.62   4265.74
$\hat{N}_3$   3290.24   2905.87   2929.35   2927.36   2927.49.

Hence
$\hat{N}_1 = 5769$; $\hat{N}_2 = 4266$; $\hat{N}_3 = 2927$; $\sum \hat{N}_i = 12962$,
and from equation (11),

$\hat{p}_1 = .4451$, $\hat{p}_2 = .3291$, $\hat{p}_3 = .2258$.
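
The iteration described in this appendix (start from the diagonal-only solution, then repeatedly adjust the right hand sides using the previous approximation) has the form of what is now commonly called Jacobi iteration; that identification is an assumption here, since the text names the procedure only as iteration II of Bodewig [1956]. A minimal Python sketch applied to the Method (ia) system follows. The function name `jacobi_history` is hypothetical, and the proportions are obtained simply by dividing each $\hat{N}_i$ by their sum.

```python
def jacobi_history(A, rhs, sweeps):
    """Return the initial diagonal-only solution followed by `sweeps` updates."""
    n = len(rhs)
    # Initial solution: ignore all except the diagonal terms.
    x = [rhs[i] / A[i][i] for i in range(n)]
    history = [x]
    for _ in range(sweeps):
        # Adjust each right hand side for the off-diagonal terms,
        # evaluated with the previous approximation.
        x = [
            (rhs[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
        history.append(x)
    return history


# Coefficients and right hand sides of the Method (ia) equations.
A = [
    [0.330854308, 0.003458204, 0.000001913],
    [0.003458204, 0.243821904, 0.014349321],
    [0.000001913, 0.014349321, 0.168761528],
]
rhs = [1923.38220, 1102.03553, 555.26670]

history = jacobi_history(A, rhs, sweeps=4)   # five columns in all
N = history[-1]
total = sum(N)
p = [value / total for value in N]

for column in history:
    print([round(value, 2) for value in column])
print([round(value, 4) for value in p])
```

Four updates after the starting solution give the five columns tabulated above; the off-diagonal coefficients are small relative to the diagonal, which is presumably why the iteration settles so quickly.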

Method (ib) Summing over classes in the ranges 9-13, 13-17 and
18-22, corresponding to the 1st, 2nd and 3rd line respectively, equations
(5) become
.98752 $N_1$ + .02089 $N_2$ + .00001 $N_3$ = 5713
.00690 $N_1$ + .91680 $N_2$ + .03692 $N_3$ = 4066
0 $N_1$ + .00758 $N_2$ + .78695 $N_3$ = 2349.
The solution is
$\hat{N}_1 = 5695$; $\hat{N}_2 = 4274$; $\hat{N}_3 = 2944$; $\sum \hat{N}_i = 12913$
$\hat{p}_1 = .4410$; $\hat{p}_2 = .3310$; $\hat{p}_3 = .2280$.
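
The coefficients and right hand sides of the Method (ib) equations appear to be plain column sums of Table 6 over the classes fitted by each line. The Python sketch below reproduces them under that reading. The variable names are hypothetical, the $P$ values are kept as integers in units of $10^{-5}$, and blank table entries are taken as zero, which is an assumption.

```python
# Table 6 columns for the classes 9-10, ..., 21-22 (mid-points 9.5, ..., 21.5).
# P values are in units of 10^-5; blank entries are treated as 0 (assumption).
mid = [9.5 + k for k in range(13)]
P1 = [9338, 38608, 40230, 10576, 680, 10, 0, 0, 0, 0, 0, 0, 0]
P2 = [0, 7, 163, 1919, 10565, 27475, 33853, 19787, 5473, 713, 44, 1, 0]
P3 = [0, 0, 0, 1, 11, 107, 673, 2901, 8559, 17294, 23940, 22706, 14755]
y = [509, 2240, 2341, 623, 476, 1230, 1439, 921, 448, 512, 719, 673, 445]

# Class ranges (cm) fitted by the 1st, 2nd and 3rd lines.
ranges = [(9, 13), (13, 17), (18, 22)]


def rows_in(lo, hi):
    """Indices of the classes whose mid-point lies between lo and hi."""
    return [k for k, x in enumerate(mid) if lo < x < hi]


coeff = []   # one row of summed P_1, P_2, P_3 per line
rhs = []     # summed observed frequencies per line
for lo, hi in ranges:
    idx = rows_in(lo, hi)
    coeff.append([sum(P[k] for k in idx) for P in (P1, P2, P3)])
    rhs.append(sum(y[k] for k in idx))

for row, total in zip(coeff, rhs):
    print([c / 1e5 for c in row], total)
```

Note that the class 17-18 falls in none of the three ranges and so contributes to no equation.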

TABLE 7
ESTIMATION OF THE $N_i$ BY METHOD (iii)

             Method (iiia) using equation (8)                      Method (iiib) using equation (9)
Component
(i)          $\sum P_i^2$   $\sum yP_i$   $\hat{N}_i$   $\hat{p}_i$      $\sum P_i$   $\sum y$   $\hat{N}_i$   $\hat{p}_i$
1            .330808058     1920.02240    5804          .4401            .98752       5713       5785          .4381
2            .240404583     1057.61484    4399          .3336            .91680       4066       4435          .3359
3            .160547850      479.14501    2984          .2263            .78695       2349       2985          .2260
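
The entries of Table 7 are consistent with $\hat{N}_i$ being the ratio $\sum yP_i / \sum P_i^2$ for Method (iiia) and $\sum y / \sum P_i$ for Method (iiib). Equations (8) and (9) are not reproduced in this appendix, so that reading is inferred from the tabulated numbers. A short Python check under this assumption, with hypothetical variable names:

```python
# Column sums copied from Table 7, one tuple per component:
# (sum of P_i^2, sum of y*P_i, sum of P_i, sum of y)
sums = [
    (0.330808058, 1920.02240, 0.98752, 5713),
    (0.240404583, 1057.61484, 0.91680, 4066),
    (0.160547850, 479.14501, 0.78695, 2349),
]

# Method (iiia): ratio of sum(y * P_i) to sum(P_i^2) (inferred form).
N_iiia = [syp / sp2 for sp2, syp, _, _ in sums]
# Method (iiib): ratio of sum(y) to sum(P_i) (inferred form).
N_iiib = [sy / sp for _, _, sp, sy in sums]


def proportions(values):
    """Scale the estimated totals so that they sum to one."""
    total = sum(values)
    return [v / total for v in values]


p_iiia = proportions(N_iiia)
p_iiib = proportions(N_iiib)

print([round(v) for v in N_iiia], [round(v, 4) for v in p_iiia])
print([round(v) for v in N_iiib], [round(v, 4) for v in p_iiib])
```

The rounded totals agree with the $\hat{N}_i$ columns of the table, and the proportions agree with the tabulated $\hat{p}_i$ to within about one unit in the fourth decimal place.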

Method (ii) We consider the classes in the ranges 9-17 and 13-22 for
estimating $p_2/p_1$ and $p_3/p_2$ respectively. For $p_2/p_1$ equations (5) become
.330854308 $N_1(2)$ + .003458204 $N_2(1)$ = 1923.38220
.003458204 $N_1(2)$ + .240775501 $N_2(1)$ = 1073.54284.


[Table 8: computations for Method (iv) using equation (10); the table body is illegible in the source.]


The solution is

$\hat{N}_1(2) = 5768$, $\hat{N}_2(1) = 4376$,

whence, from equation (6),

$\hat{p}_2/\hat{p}_1 = \hat{N}_2(1)/\hat{N}_1(2) = .7587$.

Similarly $\hat{p}_3/\hat{p}_2 = .6820$, and finally,
$\hat{p}_1 = .4394$, $\hat{p}_2 = .3333$, $\hat{p}_3 = .2273$.
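
The last step of Method (ii), passing from the successive ratios to the proportions themselves, can be sketched in Python as follows. The sketch assumes that the three proportions are scaled to sum to one, as the final figures suggest; the function name `proportions_from_ratios` is hypothetical.

```python
def proportions_from_ratios(ratios):
    """Turn ratios p2/p1, p3/p2, ... into proportions that sum to one."""
    # Express every component relative to the first one.
    weights = [1.0]
    for r in ratios:
        weights.append(weights[-1] * r)
    total = sum(weights)
    return [w / total for w in weights]


# Ratios quoted in the text for p2/p1 and p3/p2.
p = proportions_from_ratios([0.7587, 0.6820])
print([round(value, 4) for value in p])
```

Because the ratios are entered here rounded to four decimals, the result can differ from the quoted proportions by about one unit in the fourth decimal place.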
Methods (iii) and (iv) The classes used are the same as for method (ib).
Computations using equations (8) and (9) are presented in Table 7,
and those using equation (10) are presented in Table 8: the estimated
$\hat{p}_i$ obtained are very similar among themselves, and to those obtained
by methods (i) and (ii).
