Business Stat 10 12 .PDF
correlation analysis. The sign of correlation coefficient indicates the nature (direct or inverse) of
relationship between two variables, while the absolute value of correlation coefficient indicates
the extent of relationship.
2. Correlation analysis determines an association between two variables x and y but not that they
have a cause-and-effect relationship. Regression analysis, in contrast to correlation, determines
the cause-and-effect relationship between x and y, that is, a change in the value of independent
variable x causes a corresponding change (effect) in the value of dependent variable y if all other
factors that affect y remain unchanged.
3. In linear regression analysis one variable is considered as dependent variable and other as
independent variable, while in correlation analysis both variables are considered to be
independent.
4. The coefficient of determination r² indicates the proportion of total variance in the dependent variable that
is explained or accounted for by the variation in the independent variable. Since the value of r² is determined
from a sample, its value is subject to sampling error. Even if the value of r² is high, the assumption
of a linear regression may be incorrect because it may represent a portion of the relationship that
actually is in the form of a curve.
Remarks
1. The relationship between the dependent variable y and independent variable x exists and is
linear. The average relationship between x and y can be described by a simple linear regression
equation y = a + bx + e, where e is the deviation of a particular value of y from its expected value
for a given value of independent variable x.
2. For every value of the independent variable x, there is an expected (or mean) value of the dependent
variable y and these values are normally distributed. The mean of these normally distributed
values falls on the line of regression.
3. The dependent variable y is a continuous random variable, whereas values of the independent
variable x are fixed values and are not random.
4. The sampling error associated with the expected value of the dependent variable y is assumed to
be an independent random variable distributed normally with mean zero and constant standard
deviation. The errors are not related with each other in successive observations.
5. The standard deviation and variance of expected values of the dependent variable y about the
regression line are constant for all values of the independent variable x within the range of the
sample data.
338 BUSINESS S T A T I S T I C S
6. The value of the dependent variable cannot be estimated for a value of an independent variable
lying outside the range of values in the sample data.
Remarks
1. When variables x and y are correlated perfectly (either positive or negative) these lines coincide, that is, we
have only one line.
2. Higher the degree of correlation, nearer the two regression lines are to each other.
3. Lesser the degree of correlation, more the two regression lines are away from each other. That is, when r
= 0, the two lines are at right angle to each other.
4. Two linear regression lines intersect each other at the point of the average value of variables x and y.
To determine the value of ŷ for a given value of x, this equation requires the determination of two
unknown constants a (intercept) and b (also called regression coefficient). Once these constants are
calculated, the regression line can be used to compute an estimated value of the dependent variable
y for a given value of independent variable x.
REGRESSION ANALYSIS 339
The particular values of a and b define a specific linear relationship between x and y based on
sample data. The coefficient ‘a’ represents the level of fitted line (i.e., the distance of the line above or
below the origin) when x equals zero, whereas coefficient ‘b’ represents the slope of the line (a measure
of the change in the estimated value of y for a one-unit change in x).
The regression coefficient ‘b’ is also denoted as:
• byx (regression coefficient of y on x) in the regression line, y = a + bx
• bxy (regression coefficient of x on y) in the regression line, x = c + dy
1. The correlation coefficient is the geometric mean of two regression coefficients, that is, r = ±√(byx × bxy).
2. If one regression coefficient is greater than one, then other regression coefficient must be less than one,
because the value of correlation coefficient r cannot exceed one. However, both the regression coefficients
may be less than one.
3. Both regression coefficients must have the same sign (either positive or negative). This property rules out
the case of opposite sign of two regression coefficients.
4. The correlation coefficient will have the same sign (either positive or negative) as that of the two regression
coefficients. For example, if byx = – 0.664 and bxy = – 0.234, then r = – √(0.664 × 0.234) = – 0.394.
5. The absolute value of the arithmetic mean of regression coefficients bxy and byx is more than or equal to the
absolute value of the correlation coefficient r, that is, |(byx + bxy)/2| ≥ |r|. For example, if byx = – 0.664 and
bxy = – 0.234, then the arithmetic mean of these two values is (– 0.664 – 0.234)/2 = – 0.449, and its absolute
value 0.449 is more than |r| = 0.394.
6. Regression coefficients are independent of origin but not of scale.
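The first five properties can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function name `correlation_from_coefficients` is hypothetical, and the input values are the ones quoted in property 4.

```python
import math

def correlation_from_coefficients(b_yx: float, b_xy: float) -> float:
    """Return r as the signed geometric mean of the two regression coefficients."""
    product = b_yx * b_xy
    if product < 0:
        # Property 3: the two coefficients can never have opposite signs.
        raise ValueError("regression coefficients must have the same sign")
    if product > 1:
        # Property 2: r cannot exceed one in absolute value.
        raise ValueError("product of regression coefficients cannot exceed one")
    sign = -1.0 if b_yx < 0 else 1.0
    return sign * math.sqrt(product)

b_yx, b_xy = -0.664, -0.234          # values quoted in property 4
r = correlation_from_coefficients(b_yx, b_xy)
arithmetic_mean = (b_yx + b_xy) / 2

print(round(r, 3))                     # -0.394
print(abs(arithmetic_mean) >= abs(r))  # True
```

The comparison in the last line is made on absolute values, since both quantities are negative here.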
where n is the total number of pairs of values of x and y in a sample data. The equations (10.1) are
called normal equations with respect to the regression line of y on x. After solving these equations for
a and b, the values of a and b are substituted in the regression equation, y = a + bx.
Similarly if we have a least squares line x̂ = c + dy of x on y, where x̂ is the estimated mean value
of dependent variable x, then the normal equations will be
Σ x = nc + d Σ y
Σ xy = c Σ y + d Σ y²
These equations are solved in the same manner as described above for constants c and d. The values
of these constants are substituted to the regression equation x = c + dy.
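$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2$$
$$\text{and}\quad d = \frac{S_{yx}}{S_{yy}}, \quad\text{where}\quad S_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} y_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} y_i\right)^2$$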
Since the regression line passes through the point (x̄, ȳ), that is, the mean values of x and y, the
regression equations can be used to find the value of constants a and c as follows:
a = ȳ − b x̄ for regression equation of y on x
c = x̄ − d ȳ for regression equation of x on y
The calculated values of a, b and c, d are substituted in the regression line y = a + bx and
x = c + dy respectively to determine the exact relationship.
Example 10.1: Use least squares regression line to estimate the increase in sales revenue expected
from an increase of 7.5 per cent in advertising expenditure.
Solution: Assume sales revenue ( y) is dependent on advertising expenditure (x). Calculations for
regression line using following normal equations are shown in Table 10.1
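Σ y = na + bΣ x and Σ xy = aΣ x + bΣ x²
Solving these equations gives b = Sxy / Sxx = 93/132 = 0.704, where
$$S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 373 - \frac{40 \times 56}{8} = 93$$
$$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 524 - \frac{(56)^2}{8} = 132$$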
The intercept ‘a’ on the y-axis is calculated as:
$$a = \bar{y} - b\bar{x} = \frac{40}{8} - 0.704 \times \frac{56}{8} = 5 - 0.704 \times 7 = 0.072$$
Substituting the values of a = 0.072 and b = 0.704 in the regression equation, we get
y = a + bx = 0.072 + 0.704 x
For x = 0.075, we have y = 0.072 + 0.704 (0.075) = 0.1248 or 12.48%.
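The arithmetic of this example can be reproduced from the quoted totals alone. The sketch below is illustrative Python, not part of the original solution; the variable names are assumptions, and the totals (Σx = 56, Σy = 40, Σxy = 373, Σx² = 524, n = 8) are read off the computations above.

```python
# Summary totals quoted in Example 10.1 (n = 8 observations).
n = 8
sum_x, sum_y = 56, 40          # advertising expenditure (x), sales revenue (y)
sum_xy, sum_x2 = 373, 524

s_xy = sum_xy - sum_x * sum_y / n   # corrected sum of cross-products
s_xx = sum_x2 - sum_x ** 2 / n      # corrected sum of squares of x

b = s_xy / s_xx                     # slope of y on x
a = sum_y / n - b * sum_x / n       # intercept, a = y_bar - b * x_bar

print(s_xy, s_xx)                   # 93.0 132.0
print(round(b, 4), round(a, 4))     # 0.7045 0.0682
```

With the unrounded slope the intercept is about 0.068; the value 0.072 in the text arises because b is rounded to 0.704 before a is computed.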
Example 10.2: The owner of a small garment shop is hopeful that his sales are rising significantly
week by week. Treating the sales for the previous six weeks as a typical example of this rising trend,
he recorded them in Rs. 1000’s and analysed the results.
Week : 1 2 3 4 5 6
Sales : 2.69 2.62 2.80 2.70 2.75 2.81
Fit a linear regression equation to suggest to him the weekly rate at which his sales are rising and
use this equation to estimate expected sales for the 7th week.
Solution: Assume sales ( y) is dependent on weeks (x). Then the normal equations for regression
equation: y = a + bx are written as:
Σ y = na + bΣ x and Σxy = aΣ x + bΣx2
Calculations for sales during various weeks are shown in Table 10.2.
$$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 91 - \frac{(21)^2}{6} = 17.5$$
The intercept ‘a’ on the y-axis is calculated as
$$a = \bar{y} - b\bar{x} = \frac{16.37}{6} - 0.025 \times \frac{21}{6} = 2.728 - 0.025 \times 3.5 = 2.64$$
Substituting the values a = 2.64 and b = 0.025 in the regression equation, we have
y = a + bx = 2.64 + 0.025x
For x = 7, we have y = 2.64 + 0.025 (7) = 2.815
Hence the expected sales during the 7th week are likely to be Rs. 2,815 (that is, 2.815 in Rs. 1000’s).
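As a cross-check of this example, the least squares line can be fitted directly to the six weekly figures. This is a minimal Python sketch under the assumption that week number is x and sales (in Rs. 1000's) is y; the variable names are illustrative only.

```python
weeks = [1, 2, 3, 4, 5, 6]
sales = [2.69, 2.62, 2.80, 2.70, 2.75, 2.81]   # in Rs. 1000's

n = len(weeks)
mean_x = sum(weeks) / n
mean_y = sum(sales) / n

# Corrected sums of cross-products and squares about the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weeks, sales))
s_xx = sum((x - mean_x) ** 2 for x in weeks)

b = s_xy / s_xx                 # weekly rate at which sales rise
a = mean_y - b * mean_x         # intercept
forecast_week_7 = a + b * 7

print(round(b, 4), round(a, 3), round(forecast_week_7, 3))
```

Carrying full precision gives a slope of roughly 0.0254 and a week-7 forecast of roughly 2.817, which agrees with the rounded figures in the solution to two decimal places.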
(a) Deviations Taken from Actual Mean Values of x and y If deviations of actual values of variables x and
y are taken from their mean values x̄ and ȳ, then the regression equations can be written as:
• Regression equation of y on x
$$y - \bar{y} = b_{yx}(x - \bar{x})$$
where byx = regression coefficient of y on x. The value of byx can be calculated using the formula
$$b_{yx} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$$
• Regression equation of x on y
$$x - \bar{x} = b_{xy}(y - \bar{y})$$
where bxy = regression coefficient of x on y. The value of bxy can be calculated using the formula
$$b_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2}$$
(b) Deviations Taken from Assumed Mean Values for x and y If the mean value of either x or y or both is a
fraction, then it is preferable to take deviations of actual values of variables x and y from their
assumed means.
• Regression equation of y on x
$$y - \bar{y} = b_{yx}(x - \bar{x}), \quad\text{where}\quad b_{yx} = \frac{n \sum d_x d_y - (\sum d_x)(\sum d_y)}{n \sum d_x^2 - (\sum d_x)^2}$$
• Regression equation of x on y
$$x - \bar{x} = b_{xy}(y - \bar{y}), \quad\text{where}\quad b_{xy} = \frac{n \sum d_x d_y - (\sum d_x)(\sum d_y)}{n \sum d_y^2 - (\sum d_y)^2}$$
Here n = number of observations; dx = x – A, where A is the assumed mean of x; and dy = y – B, where B is the
assumed mean of y.
(c) Regression Coefficients in Terms of Correlation Coefficient If deviations are taken from actual mean
values, then the values of regression coefficients can be alternatively calculated as follows:
$$b_{yx} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{\text{Covariance}(x, y)}{\sigma_x^2} = r \cdot \frac{\sigma_y}{\sigma_x}$$
$$b_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2} = \frac{\text{Covariance}(x, y)}{\sigma_y^2} = r \cdot \frac{\sigma_x}{\sigma_y}$$
Example 10.3: Compute the two regression coefficients using the value of actual mean value of X and
Y from the data given below and then work out the value of r.
X : 7 4 8 6 5
Y : 6 5 9 8 2 [GJU (Hisar), BBA, 2004]
Solution: Calculations for two regression coefficients are given below:
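   X     Y    x = X – X̄    y = Y – Ȳ     xy     x²     y²
   7     6       +1            0          0      1      0
   4     5       –2           –1          2      4      1
   8     9       +2           +3          6      4      9
   6     8        0           +2          0      0      4
   5     2       –1           –4          4      1     16
  30    30        0            0         12     10     30   (totals)
$$\bar{X} = \frac{\sum X}{n} = \frac{30}{5} = 6 \quad\text{and}\quad \bar{Y} = \frac{\sum Y}{n} = \frac{30}{5} = 6$$
Regression coefficient of X on Y is:
$$b_{xy} = \frac{\sum xy}{\sum y^2} = \frac{12}{30} = 0.4$$
Regression coefficient of Y on X:
$$b_{yx} = \frac{\sum xy}{\sum x^2} = \frac{12}{10} = 1.2$$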
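The example also asks for the value of r, which follows from the two coefficients as their signed geometric mean. A short Python sketch of the whole calculation is given below; it is an illustration added here, with variable names chosen for this sketch only.

```python
import math

X = [7, 4, 8, 6, 5]
Y = [6, 5, 9, 8, 2]

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
x = [v - x_bar for v in X]      # deviations of X from its mean
y = [v - y_bar for v in Y]      # deviations of Y from its mean

sum_xy = sum(p * q for p, q in zip(x, y))
sum_x2 = sum(p * p for p in x)
sum_y2 = sum(q * q for q in y)

b_xy = sum_xy / sum_y2          # regression coefficient of X on Y
b_yx = sum_xy / sum_x2          # regression coefficient of Y on X
# r takes the common sign of the coefficients (the sign of sum_xy).
r = math.copysign(math.sqrt(b_xy * b_yx), sum_xy)

print(b_xy, b_yx, round(r, 3))
```

With these data r = √(0.4 × 1.2) = √0.48, which is approximately 0.693.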
Example 10.4: Use following data to find out the two lines of regression and compute the Karl
Pearson’s coefficient of correlation.
Σ x = 250, Σ y = 300, Σ x y = 7900, Σ x2 = 6500, Σ y2 = 10000, n = 10
Solution: Calculate x̄, ȳ, bxy and byx with the help of given information as follows:
$$\bar{x} = \frac{\sum x}{n} = \frac{250}{10} = 25; \quad \bar{y} = \frac{\sum y}{n} = \frac{300}{10} = 30$$
$$b_{xy} = \frac{n \sum xy - \sum x \sum y}{n \sum y^2 - (\sum y)^2} = \frac{10(7900) - (250)(300)}{10(10000) - (300)^2} = 0.4$$
Let the regression line of x on y be:
x − x̄ = bxy (y − ȳ)
x – 25 = 0.4(y – 30)
x = 0.4y – 12 + 25 = 0.4y + 13
Also
$$b_{yx} = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2} = \frac{10(7900) - (250)(300)}{10(6500) - (250)^2} = 1.6$$
Let the regression line of y on x be
y − ȳ = byx (x − x̄)
y – 30 = 1.6(x – 25)
y = 1.6x – 40 + 30 = 1.6x – 10
Hence, the coefficient of correlation is: r = √(bxy × byx) = √(0.4 × 1.6) = 0.8.
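When only the totals are available, as in this example, both lines follow from the same few sums. The helper below is a hedged Python sketch; `regression_from_sums` is a hypothetical name introduced for illustration, not a library function.

```python
import math

def regression_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Return (b_yx, b_xy, r, a, c) for the lines y = a + b_yx*x and x = c + b_xy*y."""
    cross = n * sum_xy - sum_x * sum_y
    b_yx = cross / (n * sum_x2 - sum_x ** 2)
    b_xy = cross / (n * sum_y2 - sum_y ** 2)
    r = math.copysign(math.sqrt(b_yx * b_xy), cross)
    x_bar, y_bar = sum_x / n, sum_y / n
    a = y_bar - b_yx * x_bar        # intercept of y on x
    c = x_bar - b_xy * y_bar        # intercept of x on y
    return b_yx, b_xy, r, a, c

b_yx, b_xy, r, a, c = regression_from_sums(10, 250, 300, 7900, 6500, 10000)
print(round(b_yx, 3), round(b_xy, 3), round(r, 3), round(a, 3), round(c, 3))
```

For the totals of Example 10.4 this reproduces y = 1.6x – 10, x = 0.4y + 13 and r = 0.8.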
Example 10.5: A departmental store gives training to its salesman which is followed by a test. The
store is considering whether it should terminate the service of any salesman who does not do well in
the test. The following data gives the scores and sales made by nine salesmen during a certain period:
Test Scores : 14 19 24 21 26 22 15 20 19
Sales (‘00 Rs.) : 31 36 48 37 50 45 33 41 39
Calculate the correlation coefficient between test scores and the sales. Does it indicate that the
termination of services of salesmen with low test scores is justified? If the firm wants a minimum sales
volume of Rs. 3000, what is the minimum test score that will ensure continuation of service? Also estimate the
most probable sales volume of a salesman making a score of 28. [Delhi Univ., B.Com(Hons), 1998, 2002]
Solution: Let test scores be x and sales be y. The calculations required to determine the correlation
coefficient are shown below:
   X    dx = X – 20    dx²     Y    dy = Y – 40    dy²    dx dy
  14        –6          36     31       –9          81      54
  19        –1           1     36       –4          16       4
  24         4          16     48        8          64      32
  21         1           1     37       –3           9      –3
  26         6          36     50       10         100      60
  22         2           4     45        5          25      10
  15        –5          25     33       –7          49      35
  20         0           0     41        1           1       0
  19        –1           1     39       –1           1       1
 180         0         120    360        0         346     193
$$\bar{x} = \frac{\sum x}{n} = \frac{180}{9} = 20 \quad\text{and}\quad \bar{y} = \frac{\sum y}{n} = \frac{360}{9} = 40$$
$$r = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \times \sum d_y^2}} = \frac{193}{\sqrt{120 \times 346}} = 0.9472$$
The value of r shows that there is a high degree of correlation between test scores (x) and sales (y).
This implies that persons having low test scores will not be able to make good sales; hence
termination is justified.
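Regression equation of X on Y is:
$$x - \bar{x} = b_{xy}(y - \bar{y}); \quad\text{where}\quad b_{xy} = \frac{n \sum d_x d_y - (\sum d_x)(\sum d_y)}{n \sum d_y^2 - (\sum d_y)^2} = \frac{9(193) - 0}{9(346) - 0} = \frac{193}{346} = 0.557$$
x – 20 = 0.557 (y – 40)
x = 0.557y – 22.28 + 20 = 0.557y – 2.28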
When sales is Rs. 3000, i.e. y = 30, then x = 0.557 (30) – 2.28 = 14.43
Hence, test score = 14.43 when sales volume is Rs. 3000.
Regression equation of Y on X:
$$y - \bar{y} = b_{yx}(x - \bar{x}); \quad\text{where}\quad b_{yx} = \frac{\sum d_x d_y}{\sum d_x^2} = \frac{193}{120} = 1.6$$
∴ y – 40 = 1.6 (x – 20)
y = 1.6x – 32 + 40 = 1.6x + 8
When test score is 28, i.e. x = 28, then y= 1.6(28) + 8 = 52.8
Thus sales volume = 52.8 × 100 = Rs. 5280.
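The three quantities asked for in this example (r, the minimum test score, and the expected sales for a score of 28) can be verified with a short script. This is an illustrative Python sketch; variable names are assumptions, and sales are kept in the units of the data ('00 Rs.).

```python
import math

scores = [14, 19, 24, 21, 26, 22, 15, 20, 19]
sales = [31, 36, 48, 37, 50, 45, 33, 41, 39]      # in '00 Rs.

n = len(scores)
x_bar, y_bar = sum(scores) / n, sum(sales) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(scores, sales))
s_xx = sum((x - x_bar) ** 2 for x in scores)
s_yy = sum((y - y_bar) ** 2 for y in sales)

r = s_xy / math.sqrt(s_xx * s_yy)
b_yx = s_xy / s_xx       # sales on test score
b_xy = s_xy / s_yy       # test score on sales

min_score = x_bar + b_xy * (30 - y_bar)     # score for sales of Rs. 3000 (y = 30)
sales_at_28 = y_bar + b_yx * (28 - x_bar)   # sales ('00 Rs.) for a score of 28

print(round(r, 4), round(min_score, 2), round(sales_at_28, 2))
```

Without intermediate rounding the results are close to those in the solution: a minimum score of about 14.4 and expected sales of about 52.9 (in '00 Rs.), compared with 14.43 and 52.8 obtained from the rounded coefficients.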
Example 10.6: The following data relate to the scores obtained by 9 salesmen of a company in an
intelligence test and their weekly sales (in Rs. 1000’s)
Salesmen : A B C D E F G H I
Test scores : 50 60 50 60 80 50 80 40 70
Weekly sales : 30 60 40 50 60 30 70 50 60
(a) Obtain the regression equation of sales on intelligence test scores of the salesmen.
(b) If the intelligence test score of a salesman is 65, what would be his expected weekly sales.
[HP Univ., MCom, 1996]
Solution: Assume weekly sales (y) as dependent variable and test scores (x) as independent variable.
Calculations for the following regression equation are shown in Table 10.3.
y – ȳ = byx (x – x̄)
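(a)
$$\bar{x} = \frac{\sum x}{n} = \frac{540}{9} = 60; \quad \bar{y} = \frac{\sum y}{n} = \frac{450}{9} = 50$$
$$b_{yx} = \frac{n \sum d_x d_y - (\sum d_x)(\sum d_y)}{n \sum d_x^2 - (\sum d_x)^2} = \frac{9(1200) - 0}{9(1600) - 0} = \frac{1200}{1600} = 0.75$$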
Job : A B C D E F G H I
Points : 5 25 7 19 10 12 15 28 16
Pay (Rs.) : 3.0 5.0 3.25 6.5 5.5 5.6 6.0 7.2 6.1
(a) Find the least squares regression line for linking pay scales to points.
(b) Estimate the monthly pay for a job graded by 20 points.
Solution: Assume monthly pay (y) as the dependent variable and job grade points (x) as the independent
variable. Calculations for the following regression equation are shown in Table 10.4.
y – ȳ = byx (x – x̄)
(a)
$$\bar{x} = \frac{\sum x}{n} = \frac{137}{9} = 15.22; \quad \bar{y} = \frac{\sum y}{n} = \frac{48.15}{9} = 5.35$$
Since the mean values x̄ and ȳ are not integers, deviations are taken from assumed means
as shown in Table 10.4.
$$b_{yx} = \frac{n \sum d_x d_y - (\sum d_x)(\sum d_y)}{n \sum d_x^2 - (\sum d_x)^2} = \frac{9 \times 65.40 - 2 \times 3.15}{9 \times 484 - (2)^2} = \frac{582.3}{4352} = 0.133$$
Substituting values in the regression equation, we have
y – ȳ = byx (x – x̄) or y – 5.35 = 0.133 (x – 15.22), that is, y = 3.326 + 0.133x
(b) For job grade point x=20, the estimated average pay scale is given by
y = 3.326 + 0.133x = 3.326 + 0.133 (20) = 5.986
Hence, likely monthly pay for a job with grade points 20 is Rs. 5986.
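The assumed-mean shortcut used here gives the same slope whatever origins are chosen, which is property 6 (independence of origin) in action. The Python sketch below illustrates this; the assumed means A = 15 and B = 5 are inferred from the quoted totals Σdx = 2 and Σdy = 3.15 and are an assumption of this sketch, as is the function name.

```python
points = [5, 25, 7, 19, 10, 12, 15, 28, 16]
pay = [3.0, 5.0, 3.25, 6.5, 5.5, 5.6, 6.0, 7.2, 6.1]

def slope_from_assumed_means(xs, ys, A, B):
    """b_yx computed from deviations about assumed means A (for x) and B (for y)."""
    n = len(xs)
    dx = [x - A for x in xs]
    dy = [y - B for y in ys]
    numerator = n * sum(p * q for p, q in zip(dx, dy)) - sum(dx) * sum(dy)
    denominator = n * sum(p * p for p in dx) - sum(dx) ** 2
    return numerator / denominator

b_yx = slope_from_assumed_means(points, pay, A=15, B=5)
b_yx_other_origin = slope_from_assumed_means(points, pay, A=0, B=0)

n = len(points)
x_bar, y_bar = sum(points) / n, sum(pay) / n
pay_at_20 = y_bar + b_yx * (20 - x_bar)     # estimate for 20 grade points

print(round(b_yx, 4), round(b_yx_other_origin, 4), round(pay_at_20, 3))
```

Both calls return a slope of about 0.1338, and the estimate for 20 points is about 5.99, in line with the 5.986 obtained in the solution from rounded values.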
Example 10.8: Find two regression equations from the data given below:
x : 57 58 59 59 60 61 62 64
y : 77 78 75 78 82 82 79 81 [GJU (Hisar), BBA, 2005]
Solution: Calculations of two regression equations as shown below:
   x     y    dx = x – 59    dy = y – 82    dx dy    dx²    dy²
  57    77        –2             –5          +10       4     25
  58    78        –1             –4           +4       1     16
  59    75         0             –7            0       0     49
  59    78         0             –4            0       0     16
  60    82        +1              0            0       1      0
  61    82        +2              0            0       4      0
  62    79        +3             –3           –9       9      9
  64    81        +5             –1           –5      25      1
$$\bar{x} = \frac{\sum x}{n} = \frac{480}{8} = 60 \quad\text{and}\quad \bar{y} = \frac{\sum y}{n} = \frac{632}{8} = 79$$
Regression equation of x on y: x − x̄ = bxy (y − ȳ)
x – 60 = 0.55 (y – 79)
x = 0.55y – 43.45 + 60 = 0.55y + 16.55
Regression equation of y on x: y − ȳ = byx (x − x̄)
y – 79 = 0.67 (x – 60)
y = 0.67x – 40.2 + 79 = 0.67x + 38.8.
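The two regression coefficients used in this solution can be recovered from the column totals of the table. The totals below are computed here from the listed rows (they are not printed in the table), and the formulas are the assumed-mean formulas stated earlier in the chapter:

$$n = 8,\quad \sum d_x = 8,\quad \sum d_y = -24,\quad \sum d_x d_y = 0,\quad \sum d_x^2 = 44,\quad \sum d_y^2 = 116$$
$$b_{xy} = \frac{n \sum d_x d_y - (\sum d_x)(\sum d_y)}{n \sum d_y^2 - (\sum d_y)^2} = \frac{8(0) - (8)(-24)}{8(116) - (-24)^2} = \frac{192}{352} \approx 0.55$$
$$b_{yx} = \frac{n \sum d_x d_y - (\sum d_x)(\sum d_y)}{n \sum d_x^2 - (\sum d_x)^2} = \frac{8(0) - (8)(-24)}{8(44) - (8)^2} = \frac{192}{288} \approx 0.67$$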
Example 10.9: The following data give the ages and blood pressure of 10 women:
Age : 56 42 36 47 49 42 60 72 63 55
Blood pressure : 147 125 118 128 145 140 155 160 149 150
(a) Find the correlation coefficient between age and blood pressure.
(b) Determine the least squares regression equation of blood pressure on age.
(c) Estimate the blood pressure of a woman whose age is 45 years.
Solution: Assume blood pressure (y) as the dependent variable and age (x) as the independent variable.
Calculations for regression equation of blood pressure on age are shown in Table 10.5.
Table 10.5 Calculations for Regression Equation
We may conclude that there is a high degree of positive correlation between age and blood pressure.
(b) The regression equation of blood pressure on age is given by
y – ȳ = byx (x – x̄)
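$$\bar{x} = \frac{\sum x}{n} = \frac{522}{10} = 52.2; \quad \bar{y} = \frac{\sum y}{n} = \frac{1417}{10} = 141.7$$
$$\text{and}\quad b_{yx} = \frac{n \sum d_x d_y - \sum d_x \sum d_y}{n \sum d_x^2 - (\sum d_x)^2} = \frac{10(1115) - 32(-33)}{10(1202) - (32)^2} = \frac{12206}{10996} = 1.11$$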
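Parts (a) to (c) of this example can be sketched in a few lines of Python. This is an illustration only: it assumes the ages are 56, 42, 36, 47, 49, 42, 60, 72, 63 and 55, which are the values that reproduce the totals Σx = 522 and Σy = 1417 used in the solution, and the variable names are chosen for this sketch.

```python
import math

ages = [56, 42, 36, 47, 49, 42, 60, 72, 63, 55]           # assumed ages (sum = 522)
pressure = [147, 125, 118, 128, 145, 140, 155, 160, 149, 150]

n = len(ages)
sum_x, sum_y = sum(ages), sum(pressure)
sum_xy = sum(x * y for x, y in zip(ages, pressure))
sum_x2 = sum(x * x for x in ages)
sum_y2 = sum(y * y for y in pressure)

# These three quantities do not depend on the choice of origin.
cross = n * sum_xy - sum_x * sum_y
ss_x = n * sum_x2 - sum_x ** 2
ss_y = n * sum_y2 - sum_y ** 2

r = cross / math.sqrt(ss_x * ss_y)                  # (a) correlation coefficient
b_yx = cross / ss_x                                 # (b) slope of blood pressure on age
estimate_at_45 = sum_y / n + b_yx * (45 - sum_x / n)  # (c) estimate at age 45

print(round(r, 3), round(b_yx, 3), round(estimate_at_45, 1))
```

The numerator and denominator of b_yx come out as 12206 and 10996, matching the figures in the solution, and the estimate at age 45 is roughly 133.7.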
Example 10.10: The General Sales Manager of Kiran Enterprises—an enterprise dealing in the sale
of readymade men’s wear—is toying with the idea of increasing his sales to
Rs. 80,000. On checking the records of sales during the last 10 years, it was found that the annual
sale proceeds and advertisement expenditure were highly correlated to the extent of 0.8. It was
further noted that the annual average sale has been Rs. 45,000 and annual average advertisement
expenditure Rs. 30,000, with a variance of Rs. 1600 in sales and Rs. 625 in advertisement expenditure,
respectively.
In view of the above, how much expenditure on advertisement would you suggest the General
Sales Manager of the enterprise to incur to meet his target of sales?
Solution: Assume advertisement expenditure (y) as the dependent variable and sales (x) as the
independent variable. Then the regression equation advertisement expenditure on sales is given by
$$(y - \bar{y}) = r \frac{\sigma_y}{\sigma_x} (x - \bar{x})$$
Given r = 0.8, σx = 40, σy = 25, x̄ = 45,000, ȳ = 30,000. Substituting these values in the above
equation, we have
$$(y - 30{,}000) = 0.8 \times \frac{25}{40} (x - 45{,}000) = 0.5 (x - 45{,}000)$$
y = 30,000 + 0.5x – 22,500 = 7500 + 0.5x
When a sales target is fixed at x = 80,000, the estimated amount likely to be spent on advertisement
would be
y = 7500 + 0.5×80,000 = 7500 + 40,000 = Rs. 47,500
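When only means, variances and r are known, the prediction reduces to one formula. The following Python sketch restates the calculation above; `predict_from_summary` is a hypothetical helper name used only for this illustration.

```python
import math

def predict_from_summary(x, x_bar, y_bar, var_x, var_y, r):
    """Estimate y at a given x from the line y - y_bar = r*(sd_y/sd_x)*(x - x_bar)."""
    b_yx = r * math.sqrt(var_y) / math.sqrt(var_x)
    return y_bar + b_yx * (x - x_bar)

# x = sales, y = advertisement expenditure, as assumed in the solution.
advertisement = predict_from_summary(
    x=80_000, x_bar=45_000, y_bar=30_000, var_x=1600, var_y=625, r=0.8
)
print(advertisement)   # 47500.0
```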
Example 10.11: You are given the following information about advertising expenditure and sales:
Arithmetic mean, x 10 90
Standard deviation, σ 3 12
Example 10.12: For 100 students of a class, the regression equation of marks in statistics (x) on marks
in economics (y) is: 3y – 5x + 180 = 0. If mean marks in economics is 50 and variance of marks in statistics
is (4/9) of variance of marks in economics, find mean marks in statistics and the coefficient of correlation
between them. [Delhi Univ., BCom (Hons), 2005]
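Also it is given that ȳ = 50. To find x̄, put y = 50 in the regression equation; we get
3(50) – 5x + 180 = 0 or x = 66, i.e. x̄ = 66
It is given that
$$\sigma_x^2 = \frac{4}{9}\sigma_y^2 \quad\text{i.e.}\quad \sigma_x = \frac{2}{3}\sigma_y \quad\text{or}\quad \frac{\sigma_x}{\sigma_y} = \frac{2}{3}$$
$$\text{Since}\quad b_{xy} = \frac{3}{5}, \quad\text{we have}\quad r \cdot \frac{\sigma_x}{\sigma_y} = \frac{3}{5}$$
$$r\left(\frac{2}{3}\right) = \frac{3}{5} \quad\text{or}\quad r = \frac{3}{5} \times \frac{3}{2} = \frac{9}{10} = 0.9$$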
Solution: Given, regression equation of y on x : y = 20 + 0.4x. This implies that byx = 0.4.
Also, x̄ = 30; putting x = 30 in the regression equation of y on x, we get
y = 20 + 0.4 (30) = 20 + 12 = 32
Thus ȳ = 32
We know that, r² = bxy × byx
(0.8)² = bxy × 0.4 [since r = 0.8, byx = 0.4]
$$b_{xy} = \frac{0.64}{0.4} = 1.6$$
Regression equation of x on y is:
x − x̄ = bxy (y − ȳ)
x – 30 = 1.6 (y – 32) or x = 1.6y – 51.2 + 30 = 1.6y – 21.2.
Mean 18 100
Standard deviation 14 20
$$Y - 100 = \frac{(0.8)(20)}{14}(X - 18) = 1.14 (X - 18)$$
Y = 1.14 X – 20.52 + 100 = 1.14 X + 79.48
If X = 70, then Y = 1.14 (70) + 79.48 = 159.28.
Example 10.16: In a partially destroyed laboratory record of an analysis of regression data, the
following results only are legible:
Variance of x = 9
Regression equations : 8x – 10y + 66 = 0 and 40x – 18y = 214
Find on the basis of the above information:
(a) The mean values of x and y,
(b) Coefficient of correlation between x and y, and
(c) Standard deviation of y.
Solution: (a) Since two regression lines always intersect at a point (x̄, ȳ) representing mean values of the
variables involved, we solve the given regression equations to get the mean values x̄ and ȳ as shown below:
8x – 10y = – 66
40x – 18y = 214
Multiplying the first equation by 5 and subtracting from the second, we have
32y = 544 or y = 17, i.e. ȳ = 17
Substituting the value of y in the first equation, we get
8x – 10(17) = – 66 or x = 13, that is, x̄ = 13
(b) To find correlation coefficient r between x and y, we need to determine the regression coefficients
bxy and byx.
We rewrite the given regression equations in such a way that the regression coefficient
is less than one in at least one equation.
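$$8x - 10y = -66 \quad\text{or}\quad 10y = 66 + 8x \quad\text{or}\quad y = \frac{66}{10} + \frac{8}{10}x$$
That is, byx = 8/10 = 0.80
$$40x - 18y = 214 \quad\text{or}\quad 40x = 214 + 18y \quad\text{or}\quad x = \frac{214}{40} + \frac{18}{40}y$$
That is, bxy = 18/40 = 0.45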
Hence coefficient of correlation r between x and y is given by
r = √(bxy × byx) = √(0.45 × 0.80) = 0.60
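All three parts of this example follow mechanically from the two equations and the variance of x. The sketch below is illustrative Python under the assumption, made in the solution, that the first equation is the line of y on x and the second the line of x on y; coefficient names such as `a1` and `b1` are introduced here only for the sketch.

```python
import math

# Regression lines written as a1*x + b1*y = c1 and a2*x + b2*y = c2.
a1, b1, c1 = 8, -10, -66      # assumed line of y on x
a2, b2, c2 = 40, -18, 214     # assumed line of x on y
var_x = 9

# (a) The lines intersect at (x_bar, y_bar); solve by Cramer's rule.
det = a1 * b2 - a2 * b1
x_bar = (c1 * b2 - c2 * b1) / det
y_bar = (a1 * c2 - a2 * c1) / det

# (b) Regression coefficients and r.
b_yx = -a1 / b1               # slope after solving the first line for y
b_xy = -b2 / a2               # slope after solving the second line for x
r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)

# (c) Standard deviation of y from b_yx = r * sd_y / sd_x.
sd_y = b_yx * math.sqrt(var_x) / r

print(x_bar, y_bar, round(r, 2), round(sd_y, 2))
```

If the product b_yx × b_xy had exceeded one, the roles of the two equations would have to be swapped, since r² cannot exceed one.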
Example 10.17: There are two series of index numbers, P for price index and S for stock of a
commodity. The mean and standard deviation of P are 100 and 8 and of S are 103 and 4 respectively.
The correlation coefficient between the two series is 0.4. With these data, work out a linear equation
to read off values of P for various values of S. Can the same equation be used to read off values of S for
various values of P?
Solution: The regression equation to read off values of P for various values S is given by
$$P = a + bS \quad\text{or}\quad (P - \bar{P}) = r \frac{\sigma_p}{\sigma_s} (S - \bar{S})$$
Given P̄ = 100, S̄ = 103, σp = 8, σs = 4, r = 0.4. Substituting these values in the above equation,
we have
$$P - 100 = 0.4 \times \frac{8}{4} (S - 103) \quad\text{or}\quad P = 17.6 + 0.8 S$$
This equation cannot be used to read off values of S for various values of P. Thus to read off values of
S for various values of P we use another regression equation of the form:
$$S = c + dP \quad\text{or}\quad S - \bar{S} = r \frac{\sigma_s}{\sigma_p} (P - \bar{P})$$
Substituting given values in this equation, we have
$$S - 103 = 0.4 \times \frac{4}{8} (P - 100) \quad\text{or}\quad S = 83 + 0.2P$$
Example 10.18: A panel of Judges A and B graded seven debaters and independently awarded the
following marks:
Debater : 1 2 3 4 5 6 7
Marks of A : 40 34 28 30 44 38 31
Marks of B : 32 39 26 30 38 34 28
An eighth debater was awarded 36 marks by Judge A while Judge B was not present. If Judge B
had also been present, how many marks would you expect him to award to the eighth debater, assuming the same
degree of relationship exists in judgement? [Delhi Univ., BCom (Hons) 1993]
Solution: Let marks of A be denoted by x and those of B by y. Let A = 30 and B = 30 be the assumed mean
values of the x-series and y-series, respectively. The calculations required for regression equations are
shown below:
   x       dx = x – 30    dx²      y       dy = y – 30    dy²    dx dy
  40           10         100     32            2           4      20
  34            4          16     39            9          81      36
  28           –2           4     26           –4          16       8
  30 ← A        0           0     30 ← B        0           0       0
  44           14         196     38            8          64     112
  38            8          64     34            4          16      32
  31            1           1     28           –2           4      –2
 245           35         381    227           17         185     206
$$\bar{x} = \frac{\sum x}{n} = \frac{245}{7} = 35 \quad\text{and}\quad \bar{y} = \frac{\sum y}{n} = \frac{227}{7} = 32.43$$
Regression coefficient of y on x:
$$b_{yx} = \frac{n \sum d_x d_y - \sum d_x \sum d_y}{n \sum d_x^2 - (\sum d_x)^2} = \frac{7(206) - (35)(17)}{7(381) - (35)^2} = \frac{1442 - 595}{2667 - 1225} = \frac{847}{1442} = 0.587$$
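The expected marks from Judge B for the eighth debater follow by substituting x = 36 into the regression line of y on x. The sketch below is illustrative Python, not the printed solution; it uses the assumed means A = B = 30 stated above, and its variable names are chosen for this sketch.

```python
marks_a = [40, 34, 28, 30, 44, 38, 31]   # x: marks by Judge A
marks_b = [32, 39, 26, 30, 38, 34, 28]   # y: marks by Judge B
A = B = 30                                # assumed means used in the table

n = len(marks_a)
dx = [x - A for x in marks_a]
dy = [y - B for y in marks_b]

numerator = n * sum(p * q for p, q in zip(dx, dy)) - sum(dx) * sum(dy)
denominator = n * sum(p * p for p in dx) - sum(dx) ** 2
b_yx = numerator / denominator            # 847 / 1442

x_bar = sum(marks_a) / n
y_bar = sum(marks_b) / n
expected_b_marks = y_bar + b_yx * (36 - x_bar)   # estimate for x = 36

print(numerator, denominator, round(b_yx, 3), round(expected_b_marks, 2))
```

With these figures the estimate is about 33 marks.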
$$r = \sqrt{b_{yx} \times b_{xy}} = \sqrt{\left(\frac{-8}{14}\right) \times (-2)} = 1.07$$
Since value of r cannot exceed one, consider another set of regression equations:
Regression equation of x on y is
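$$14y - 208 = -8x \quad\text{or}\quad x = \frac{208}{8} - \frac{14}{8}y$$
Thus bxy = –14/8
Regression equation of y on x is:
$$5x - 145 = -10y \quad\text{or}\quad 10y = 145 - 5x \quad\text{or}\quad y = \frac{145}{10} - \frac{5}{10}x$$
Thus byx = –5/10 = –1/2
$$\text{Now}\quad r = \sqrt{b_{yx} \times b_{xy}} = \sqrt{\left(-\frac{1}{2}\right) \times \left(-\frac{14}{8}\right)} = \sqrt{0.875} = \pm 0.93$$
Since bxy and byx both are negative, r should also be negative. Hence r = – 0.93.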
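$$\text{We know that}\quad b_{yx} = \frac{r \sigma_y}{\sigma_x} \quad\text{or}\quad -\frac{1}{2} = \frac{-0.93\, \sigma_y}{2}, \quad\text{where}\quad \sigma_x = 2$$
$$\text{or}\quad \sigma_y = \frac{2}{0.93 \times 2} = 1.075$$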
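$$\text{We know that}\quad b_{xy} = r \frac{\sigma_x}{\sigma_y} = \frac{6}{5} \quad\text{and}\quad b_{yx} = r \frac{\sigma_y}{\sigma_x} = \frac{768}{1000}, \quad\text{therefore}$$
$$b_{xy}\, b_{yx} = r^2 = \frac{6}{5} \times \frac{768}{1000} = 0.9216$$
Hence r = √0.9216 = 0.96.
Since both bxy and byx are positive, the correlation coefficient is positive and hence
r = 0.96.
$$\text{Probable error of } r = 0.6745\, \frac{1 - r^2}{\sqrt{n}} = 0.6745\, \frac{1 - (0.96)^2}{\sqrt{60}} = \frac{0.0528}{7.7459} = 0.0068$$
Solving the given regression equations for x and y, we get x̄ = 6 and ȳ = 1 because regression
lines pass through the point (x̄, ȳ).
$$\text{Since}\quad r \frac{\sigma_x}{\sigma_y} = \frac{6}{5} \quad\text{or}\quad 0.96\, \frac{\sigma_x}{\sigma_y} = \frac{6}{5} \quad\text{or}\quad \frac{\sigma_x}{\sigma_y} = \frac{6}{5 \times 0.96} = \frac{5}{4}$$
$$\text{Also the ratio of the coefficient of variability} = \frac{\sigma_x / \bar{x}}{\sigma_y / \bar{y}} = \frac{\bar{y}}{\bar{x}} \cdot \frac{\sigma_x}{\sigma_y} = \frac{1}{6} \times \frac{5}{4} = \frac{5}{24}.$$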
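The probable error used above is a one-line formula. A minimal Python sketch, with the function name `probable_error` chosen for illustration, reproduces the figure for r = 0.96 and n = 60:

```python
import math

def probable_error(r: float, n: int) -> float:
    """Probable error of r: 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)

pe = probable_error(0.96, 60)
print(round(pe, 4))   # 0.0068
```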
Self-Practice Problems 10A
10.1 The following calculations have been made for prices of twelve stocks (x) at the Calcutta Stock
Exchange on a certain day along with the volume of sales in thousands of shares (y). From
these calculations find the regression equation of price of stocks on the volume of sales of
shares.
Σ x = 580, Σ y = 370, Σ xy = 11494,
Σ x² = 41658, Σ y² = 17206.
10.2 A survey was conducted to study the relationship between expenditure (in Rs.) on
accommodation (x) and expenditure on food and entertainment (y) and the following results
were obtained:
                                               Mean     Standard Deviation
• Expenditure on accommodation                173.1          63.15
• Expenditure on food and entertainment        47.8          22.98
• Coefficient of correlation r = 0.57
Write down the regression equation and estimate the expenditure on food and
entertainment if the expenditure on accommodation is Rs. 200.
10.3 The following data give the experience of machine operators and their performance
ratings given by the number of good parts turned out per 100 pieces:
Operator : 1 2 3 4 5 6 7 8
Experience (x) : 16 12 18 4 3 10 5 12
Performance ratings (y) : 87 88 89 68 78 80 75 83
Calculate the regression lines of performance ratings on experience and estimate the probable
performance if an operator has 7 years experience.
10.4 A study of prices of a certain commodity at Delhi and Mumbai yields the following data:
                                       Delhi     Mumbai
• Average price per kilo (Rs.)         2.463     2.797
• Standard deviation                   0.326     0.207
• Correlation coefficient between prices at Delhi and Mumbai r = 0.774
Estimate from the above data the most likely price (a) at Delhi corresponding to the price of Rs.
2.334 per kilo at Mumbai (b) at Mumbai corresponding to the price of 3.052 per kilo at
Delhi
10.5 The following table gives the aptitude test scores and productivity indices of 10 workers
selected at random:
Aptitude scores (x) : 60 62 65 70 72 48 53 73 65 82
Productivity index (y) : 68 60 62 80 85 40 52 62 60 81
Calculate the two regression equations and estimate (a) the productivity index of a worker
whose test score is 92, (b) the test score of a worker whose productivity index is 75.
10.6 A company wants to assess the impact of R&D expenditure (Rs. in 1000’s) on its annual profit
(Rs. in 1000’s). The following table presents the information for the last eight years:
Year     R & D expenditure     Annual profit
1991             9                  45
1992             7                  42
1993             5                  41
1994            10                  60
1995             4                  30
1996             5                  34
1997             3                  25
1998             2                  20
Estimate the regression equation and predict the annual profit for the year 2002 for an
allocated sum of Rs. 1,00,000 as R&D expenditure.
10.7 Obtain the two regression equations from the following bivariate frequency distribution:
Sales Revenue        Advertising Expenditure (Rs. in thousand)
(Rs. in lakh)        5–15     15–25     25–35     35–45
75–125                 3        4         4         8
125–175                8        6         5         7
175–225                2        2         3         4
225–275                3        3         2         2
Estimate (a) the sales corresponding to advertising expenditure of Rs. 50,000, (b) the
advertising expenditure for a sales revenue of Rs. 300 lakh, (c) the coefficient of correlation.
10.8 The personnel manager of an electronic manufacturing company devises a manual test
for job applicants to predict their production rating in the assembly department. In order to
do this he selects a random sample of 10 applicants. They are given the test and later
assigned a production rating. The results are as follows:
Worker : A B C D E F G H I J
Test score : 53 36 88 84 86 64 45 48 39 69
Production rating : 45 43 89 79 84 66 49 48 43 76
Fit a linear least squares regression equation of production rating on test score.
10.9 Find the regression equation showing the capacity utilization on production from the
following data:
                                            Average     Standard Deviation
• Production (in lakh units) :               35.6            10.5
• Capacity utilization (in percentage) :     84.8             8.5
• Correlation coefficient r = 0.62
Estimate the production when the capacity utilization is 70 per cent.
10.10 Suppose that you are interested in using past expenditure on R&D by a firm to predict current
expenditures on R&D. You got the following data by taking a random sample of firms, where
x is the amount spent on R&D (in lakh of rupees) 5 years ago and y is the amount spent on R&D
(in lakh of rupees) in the current year:
x : 30 50 20 180 10 20 20 40
y : 50 80 30 110 20 20 40 50
(a) Find the regression equation of y on x.
(b) If a firm is chosen randomly and x = 10, can you use the regression to predict the
value of y? Discuss.
10.11 The following data relate to the scores obtained by the salesmen of a company in an
intelligence test and their weekly sales (in Rs. 1000’s):
Salesman : A B C D E F G H I
Intelligence test score : 50 60 50 60 80 50 80 40 70
Weekly sales : 30 60 40 50 60 30 70 50 60
(a) Obtain the regression equation of sales on intelligence test scores of the salesmen.
(b) If the intelligence test score of a salesman is 65, what would be his expected weekly sales?
[HP Univ., M.Com., 1996]
10.12 Two random variables have the regression equations:
3x + 2y – 26 = 0 and 6x + y – 31 = 0
(a) Find the mean values of x and y and coefficient of correlation between x and y.
(b) If the variance of x is 25, then find the standard deviation of y from the data.
10.13 For a given set of bivariate data, the following results were obtained
x̄ = 53.2, ȳ = 27.9,
Regression coefficient of y on x = – 1.5, and
Regression coefficient of x on y = – 0.2.
Find the most probable value of y when x = 60.
10.14 In trying to evaluate the effectiveness of its advertising campaign, a firm compiled the
following information:
Year     Adv. expenditure     Sales
         (Rs. 1000’s)         (in lakhs Rs.)
1996          12                  5.0
1997          15                  5.6
1998          17                  5.8
1999          23                  7.0
2000          24                  7.2
2001          38                  8.8
2002          42                  9.2
2003          48                  9.5
Calculate the regression equation of sales on advertising expenditure. Estimate the probable
sales when advertisement expenditure is Rs. 60 thousand.
$$x - \bar{x} = r \frac{\sigma_x}{\sigma_y} (y - \bar{y})$$
$$x - 35.6 = 0.62 \times \frac{10.5}{8.5} (y - 84.8)$$
x = – 29.3483 + 0.7659y
3x + 2y = 26 or y = 13 – (3/2)x,
So byx = – 3/2
6x + y = 31 or x = 31/6 – (1/6)y,
So bxy = – 1/6
Correlation coefficient,
Conceptual Questions
1. (a) Explain the concept of regression and point out its usefulness in dealing with business
problems.
(b) Distinguish between correlation and regression. Also point out the properties of
regression coefficients.
2. Explain the concept of regression and point out its importance in business forecasting.
3. Under what conditions can there be one regression line? Explain.
4. Why should a residual analysis always be done as part of the development of a regression model?
5. What are the assumptions of simple linear regression analysis and how can they be
evaluated?
6. What is the meaning of the standard error of estimate?
7. What is the interpretation of y-intercept and the slope in a regression model?
8. What are regression lines? With the help of an example illustrate how they help in business
decision-making.
9. Point out the role of regression analysis in business decision-making. What are the important
properties of regression coefficients?
10. (a) Distinguish between correlation and regression analysis.
(b) The coefficient of correlation and coefficient of determination are available as measures of
association in correlation analysis. Describe the different uses of these two measures of
association.
11. What are regression coefficients? State some of the important properties of regression
coefficients.
12. What is regression? How is this concept useful to business forecasting?
13. What is the difference between a prediction interval and a confidence interval in regression
analysis?
14. Explain what is required to establish evidence of a cause-and-effect relationship between y and x
with regression analysis.
15. What technique is used initially to identify the kind of regression model that may be appropriate?
16. (a) What are regression lines? Why is it necessary to consider two lines of regression?
(b) In case the two regression lines are identical, prove that the correlation coefficient is either + 1
or – 1. If two variables are independent, show that the two regression lines cut at right angles.
Formulae Used
1. Simple linear regression model
y = β0 + β1x + e
2. Simple linear regression equation based on sample data
ŷ = a + bx
3. Regression coefficients in sample regression equation
b = (Σxy − n x̄ ȳ) / (Σx² − n(x̄)²), a = ȳ − b x̄
362 BUSINESS S T A T I S T I C S
True or False
1. A statistical relationship between two variables does not indicate a perfect relationship.
2. A dependent variable in a regression equation is a continuous random variable.
3. The residual value is required to estimate the amount of variation in the dependent variable
with respect to the fitted regression line.
4. Standard error of estimate is the conditional standard deviation of the dependent variable.
5. Standard error of estimate is a measure of scatter of the observations about the regression line.
6. If one of the regression coefficients is greater than one, the other must also be greater than one.
7. The signs of the regression coefficients are always the same.
8. Correlation coefficient is the geometric mean of regression coefficients.
9. If the sign of two regression coefficients is negative, then the sign of the correlation coefficient is
positive.
10. Correlation coefficient and regression coefficient are independent.
11. The point of intersection of two regression lines represents the average value of two variables.
12. The two regression lines are at right angles when the correlation coefficient is zero.
13. When the value of correlation coefficient is one, the two regression lines coincide.
14. The product of regression coefficients is always more than one.
15. The regression coefficients are independent of the change of origin but not of scale.
1. T 2. T 3. T 4. T 5. T 6. F 7. T 8. T 9. F
10. F 11. T 12. T 13. T 14. F 15. T
Review Self-Practice Problems
10.15 Given the following bivariate data:
x: – 1 5 3 2 1 1 7 3
y: – 6 1 0 0 1 2 1 5
(a) Fit a regression line of y on x and predict y if x = 10.
(b) Fit a regression line of x on y and predict x if y = 2.5.
10.16 Find the most likely production corresponding to a rainfall of 40 inches from the following data:
                     Rainfall       Production
                     (in inches)    (in quintals)
Average              30             50
Standard deviation   5              10
Coefficient of correlation r = 0.8.
10.17 The coefficient of correlation between the ages of husbands and wives in a community was
found to be + 0.8, the average of husbands' age was 25 years and that of wives' age 22 years.
Their standard deviations were 4 and 5 years respectively. Find with the help of regression
equations:
(a) the expected age of husband when wife's age is 16 years, and
(b) the expected age of wife when husband's age is 33 years.
10.18 You are given below the following information about advertisement expenditure and sales:
                     Adv. Exp. (x)    Sales (y)
                     (Rs. in crore)   (Rs. in crore)
Mean                 20               120
Standard deviation   5                25
REGRESSION ANALYSIS 363
11.1 INTRODUCTION
The increasing complexity of the business environment, together with changing demands and
expectations, implies that every organization needs to know the future values of its key decision
variables. Forecasting takes historical data and projects them into the future to predict the
occurrence of uncertain events. This may help organizations to assess the future consequences of
existing decisions and to evaluate the consequences of decisions (actions or strategies). For example,
inventory is ordered without certainty of future sales; new equipment is purchased despite
uncertainty about the demand for products; investments are made without knowing future profits;
the staff mix is altered without knowing the increase in the level of service that can be
provided, and so on.
Forecasting is essential to make reliable and accurate estimates of what will happen in the
future in the face of uncertainty. A flow chart of forecasts and the decision-making process is
shown in Fig. 11.1. In general, the decisions are influenced by the chosen strategy with regard
to an organization’s future priorities and activities. Once decisions are taken, the consequences are
measured in terms of expectation to achieve the desired products/services levels.
FORECASTING AND TIME SERIES ANALYSIS 367
1994 2
1995 3
1996 6
1997 10
1998 8
1999 7
2000 12
2001 14
2002 14
2003 18
2004 19
where y is the forecast variable at period t; pattern is the mean value of the forecast variable at
period t and represents the underlying pattern, and e is the random fluctuation from the pattern
that occurs of the forecast variable at period t.
Seasonal It is a special case of a cycle component of time series in which fluctuations are repeated
usually within a year (e.g. daily, weekly, monthly, quarterly) with a high degree of regularity. For
example, average sales for a retail store may increase greatly during festival seasons.
Irregular Irregular variations are rapid changes or blips in the data caused by short-term
unanticipated and non-recurring factors. Irregular fluctuations can happen as often as day to day.
There are many models by which a time series can be analysed; two models commonly used for decomposition
of a time series are discussed below.
Conceptual Questions 11A
1. Briefly describe the steps that are used to develop a forecasting system.
2. What is forecasting? Discuss in brief the various theories and methods of business forecasting.
3. For what purpose do we apply time series analysis to data collected over a period of
time?
4. How can one benefit from determining past patterns?
5. What is the difference between a causal model and a time series model?
6. What is a judgmental forecasting model, and when is it appropriate?
7. Explain clearly the different components into which a time series may be analysed. Explain
any method for isolating trend values in a time series.
8. Explain what you understand by time series. Why is time-series considered to be an effective
tool of forecasting?
9. Explain briefly the additive and multiplicative models of time series. Which of these models is
more popular in practice and why?
10. Identify the four principal components of a time-series and explain the kind of change, over time,
to which each applies.
11. What is the advantage of reducing a time series into its four components?
12. Despite great limitations of statistical forecasting, forecasting techniques are invaluable to the
economist, the businessman, and the government. Explain.
13. (a) Why are forecasts important to organizations?
(b) Explain the difference between the terms: seasonal variation and cyclical variation.
(c) Give reasons why the seasonal component in the time-series is not constant. Give
examples where you believe the seasonality may change.
14. Identify the classical components of a time series and indicate how each is accounted for
in forecasting.
Example 11.1: Fit a trend line to the following data by using the freehand method.
Year : 1997 1998 1999 2000 2001 2002 2003 2004
Sales turnover : 80 90 92 83 94 99 92 104.
(Rs. in lakh)
Solution: Figure 11.5 presents the freehand graph of sales turnover (Rs. in lakh) from 1997 to
2004. Forecast can be obtained simply by extending the trend line
Moving Averages
If we attempt to observe the movement of some variable values over a period of time and try to
project this movement into the future, then it is essential to smooth out first the irregular pattern
in the historical values of the variable, and later use this as the basis for a future projection. This
can be done by using the technique of moving averages.
This method is a subjective method and depends on the length of the period chosen for
calculating moving averages. To remove the effect of cyclical variations, the period chosen should
be an integer value that corresponds to or is a multiple of the estimated average length of a cycle
in the series.
The moving average, which serves as an estimate of the next period's value of a variable given
a period of length n, is expressed as:
Moving average, MA_{t+1} = (D_t + D_{t−1} + D_{t−2} + ... + D_{t−n+1}) / n
where t = current time period
D = actual data, which changes each period
n = length of time period
In this method, the term ‘moving’ is used because it is obtained by summing and averaging the
values from a given number of periods, each time deleting the oldest value and adding a new
value.
The major advantage of a moving average is the opportunity it provides to focus on the long-
term trend (and cyclical) movements in a time series without the obscuring effect of short-term
‘noise’ influences.
The limitation of this method is that it is highly subjective and dependent on the length of
period chosen for constructing the averages. Moving averages have the following limitations:
(i) As the size of n (the number of periods averaged) increases, it smoothens the variations
better, but it also makes the method less sensitive to real changes in the data.
(ii) It is difficult to choose the optimal length of time for which to compute the moving
average. Moving averages can not be found for the first and last k/2 periods in a k-period
moving average.
(iii) Moving averages cannot pick-up trends very well. Since these are averages, it will always
stay within past levels and will not predict a change to either a higher or lower level.
(iv) It causes a loss of information (data values) at either end of the original time series.
(v) Moving averages do not usually adjust for such time-series effects as trend, cycle or
seasonality.
Example 11.2: Shown is production volume (in ’000 tonnes) for a product. Use these data to
compute a 3-year moving average for all available years. Also determine the trend and short-term
error.
Solution: The first average is computed for the first 3 years as follows:
Moving average (year 1–3) = (21 + 22 + 23)/3 = 22
The first 3-year moving average can be used to forecast the production volume in fourth year,
1998. Because 25,000 tonnes production was made in 1998, the error of the forecast is
Error_1998 = 25,000 – 22,000 = 3,000 tonnes.
Similarly, the moving average calculation for the next 3 years is:
Moving average (year 2–4) = (22 + 23 + 25)/3 = 23.33
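The two averages above can be reproduced with a short sketch. The function name `moving_averages` is an illustrative choice, and only the four production values quoted in the solution are used, since the full data table is not reproduced here.

```python
def moving_averages(data, n):
    """Return every n-period moving average of data.

    Entry i (0-based) averages periods i+1 .. i+n and can serve as the
    forecast for period i+n+1.
    """
    return [sum(data[i:i + n]) / n for i in range(len(data) - n + 1)]


# First four yearly production values ('000 tonnes) quoted in the solution.
production = [21, 22, 23, 25]

averages = moving_averages(production, 3)
forecast_year4 = averages[0]                  # forecast for the fourth year
error_year4 = production[3] - forecast_year4  # actual minus forecast

print([round(a, 2) for a in averages], error_year4)
```

The error is in thousands of tonnes, matching the 3,000 tonnes found above.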
Odd and Even Number of Years When the chosen period of length n is an odd number, the moving
average period is centred on i (middle period in the consecutive sequence of n periods). For
instance with n = 5, MA3(5) is centred on the third year, MA4(5) is centred on the fourth year...,
and MA9(5) is centred on the ninth year.
No moving average can be obtained for the first (n – 1)/2 years or the last (n – 1)/2 years of
the series. Thus for a 5-year moving average, we cannot make computations for the first two years
or the last two years of the series.
When the chosen period of length n is an even number, equal parts can easily be formed
and an average of each part is obtained. For example, if n = 4, then the first moving average M3
(placed at period 3) is an average of the first four data values, and the second moving average M4
(placed at period 4) is the average of data values 2 through 5. The average of M3 and M4 is placed
at period 3 because it is an average of data values for periods 1 through 5.
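A minimal sketch of the centring step for an even period follows. The function name `centred_moving_average` is hypothetical, and the sales figures of Example 11.1 are borrowed purely as sample input.

```python
def centred_moving_average(data, n):
    """Centred moving average for an even period n.

    Returns a dict mapping the 1-based period to its centred average,
    i.e. the mean of two successive n-period moving averages.
    """
    if n % 2 != 0:
        raise ValueError("n must be even")
    plain = [sum(data[i:i + n]) / n for i in range(len(data) - n + 1)]
    centred = {}
    for i in range(len(plain) - 1):
        period = i + n // 2 + 1  # e.g. n = 4: the first pair is centred on period 3
        centred[period] = (plain[i] + plain[i + 1]) / 2
    return centred


# Sales turnover from Example 11.1, used here only as illustrative data.
sales = [80, 90, 92, 83, 94, 99, 92, 104]
result = centred_moving_average(sales, 4)
print(result)
```

With n = 4 the first centred value sits at period 3 and uses the data of periods 1 through 5, as described above.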
Example 11.3: Assume a four-year cycle and calculate the trend by the method of moving average
from the following data relating to the production of tea in India:
Example 11.4: Vacuum cleaner sales for 12 months is given below. The owner of the supermarket
decides to forecast sales by weighting the past three months as follows:
Months : 1 2 3 4 5 6 7 8 9 10 11 12
Actual sales : 10 12 13 16 19 23 26 30 28 18 16 14
(in units)
Solution: The results of 3-month weighted average are shown in Table 11.3
x̄_weighted = (1/6) [3M_{t–1} + 2M_{t–2} + 1M_{t–3}]
= (1/6) [3 × Sales last month + 2 × Sales two months ago + 1 × Sales three months ago]
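The weighting scheme above (3 for the latest month, 2 and 1 for the two before it) can be sketched as follows. The function name `weighted_moving_average` is illustrative. The data are the vacuum cleaner sales given in the example.

```python
def weighted_moving_average(history, weights=(3, 2, 1)):
    """Forecast the next period from the latest len(weights) values.

    weights[0] applies to the most recent value, weights[1] to the one
    before it, and so on.
    """
    recent = history[-len(weights):][::-1]  # most recent value first
    return sum(w * v for w, v in zip(weights, recent)) / sum(weights)


sales = [10, 12, 13, 16, 19, 23, 26, 30, 28, 18, 16, 14]

forecast_month4 = weighted_moving_average(sales[:3])  # uses months 1 to 3
forecast_month13 = weighted_moving_average(sales)     # uses months 10 to 12

print(round(forecast_month4, 2), round(forecast_month13, 2))
```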
Example 11.5: A food processor uses a moving average to forecast next month’s demand. Past
actual demand (in units) is shown below:
Month : 43 44 45 46 47 48 49 50 51
Actual
demand : 105 106 110 110 114 121 130 128 137
(a) Compute a simple five-month moving average to forecast demand for month 52.
(b) Compute a weighted three-month moving average where the weights are highest for the latest
months and descend in order of 3, 2, 1.
Solution: Calculations for five-month moving average are shown in Table 11.4.
Table 11.4 Five-month Moving Average
Semi-Average Method
The semi-average method permits us to estimate the slope and intercept of the trend line quite easily
if a linear function will adequately describe the data. The procedure is simply to divide the data into
two parts and compute their respective arithmetic means. These two points are plotted corresponding
to the midpoint of the time interval covered by the respective part and then these points are joined
by a straight line, which is the required trend line. The arithmetic mean of the first part is the
intercept value, and the slope is determined by the ratio of the difference in the arithmetic means to
the number of years between them, that is, the change per unit time. The result is a trend line
of the form: ŷ = a + bx. Here ŷ is the calculated trend value and a and b are the intercept and slope
values respectively. The equation should always be stated completely with reference to the year where
x = 0 and a description of the units of x and y.
The semi-average method of developing a trend equation is relatively easy to compute and may
be satisfactory if the trend is linear. If the data deviate much from linearity, the forecast will be
biased and less reliable.
Example 11.6: Fit a trend line to the following data by the method of semi-average and forecast
the sales for the year 2002.
Solution: Since the number of years is odd, divide the data into two equal parts
(A and B) of 3 years each, ignoring the middle year (1996). The averages of parts A and B are
107 and 112, centred on 1994 and 1998 respectively.
Slope = b = ∆y/∆x = (change in sales)/(change in year) = (112 − 107)/(1998 − 1994) = 5/4 = 1.25
Intercept = a = 107 units at 1994
Thus, the trend line is: ŷ = 107 + 1.25x
Since 2002 is 8 years distant from the origin (1994), we have
ŷ = 107 + 1.25(8) = 117
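The semi-average calculation can be sketched as a small helper. The name `semi_average_trend` and its argument layout are assumptions made for illustration. Its inputs are the two part means and their mid-years from the solution above.

```python
def semi_average_trend(mean_a, mid_a, mean_b, mid_b):
    """Return (intercept, slope, trend) for the semi-average method.

    The origin (x = 0) is the mid-year of the first part, mid_a.
    """
    slope = (mean_b - mean_a) / (mid_b - mid_a)
    intercept = mean_a

    def trend(year):
        return intercept + slope * (year - mid_a)

    return intercept, slope, trend


a, b, trend = semi_average_trend(107, 1994, 112, 1998)
forecast_2002 = trend(2002)
print(a, b, forecast_2002)
```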
Example 11.7: In the study of sales, a company obtained the following trend equation: yc = 16 + 2x
(Origin 1995, x unit = 1 year, y = total number of units sold).
The company has the physical facilities to provide only 30 units in a year and it believes that at
least for the next decade trend will continue as before. Find:
(a) What is the average annual increase in the number of units sold?
(b) By which year will the company's expected sales equal its present capacity?
(c) Estimate the sales for the year 1998. [Delhi Univ., BCom(Hons), 2003]
Solution: (a) Trend equation is yc = 16 + 2x. Since slope of this line is b = 2, therefore average annual
increase is 2 units.
(b) Since the company's present capacity is 30 units, substituting y = 30 in the trend equation, we
get 30 = 16 + 2x or x = 7. Thus, in seven years the company's expected sales will equal
the present capacity. Since 1995 is taken as origin, the required year is 1995 +
7 = 2002.
(c) Since 1995 is origin, therefore, for estimating sales of 1998, putting x = 1998–1995 = 3 in the
trend equation we get y = 16 + 2(3) = 22 units.
Example 11.8: Trend equation for yearly sales (in ‘000 Rs.) for a commodity is : y = 81.6 + 28.8 x
(unit of x = 1 year, origin is July 16, 1991). Adjust the trend equation to find the monthly trend
values with Jan. 1992 as origin and find the trend values for March 1992.[Delhi Univ., B Com (Hons),
2003]
Solution: Annual trend equation is y = 81.6 + 28.8x. Therefore, monthly trend equation is
y = 81.6/12 + (28.8/(12 × 12)) x = 6.8 + 0.2x
Here x unit = one month and origin = July 16, 1991
Since the required origin is Jan. 1992, i.e., Jan. 16, 1992, the trend equation is obtained by
increasing x by 6 months because Jan. 16, 1992 – July 16, 1991 = 6 months:
y = 6.8 + 0.2 (x + 6) = 0.2x + 8
To find the trend value for March 1992, we put x = 2 in the trend equation, as March 16, 1992 – Jan.
16, 1992 = 2 months:
y = 0.2 (2) + 8 = 8.4 (Rs. in thousand)
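The two adjustments used in this solution are dividing the intercept by 12 and the slope by 12 × 12, then shifting the origin. They can be sketched as below. The helper names `annual_to_monthly` and `shift_origin` are hypothetical.

```python
def annual_to_monthly(a, b):
    """Convert y = a + b*x (annual totals, x in years) to monthly values, x in months."""
    return a / 12, b / (12 * 12)


def shift_origin(a, b, k):
    """Move the origin forward by k time units: y = a + b*(x + k)."""
    return a + b * k, b


a_month, b_month = annual_to_monthly(81.6, 28.8)  # origin July 16, 1991
a_jan, b_jan = shift_origin(a_month, b_month, 6)  # origin Jan. 16, 1992
march_1992 = a_jan + b_jan * 2                    # two months after the new origin

print(round(a_month, 2), round(b_month, 2), round(a_jan, 2), round(march_1992, 2))
```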
Example 11.9: Given below is the quarterly trend equation for sales (Rs. in thousand) of a commodity:
yC = 130 + 1.8x
[Origin: first quarter of 2002; x unit = 1 quarter, y unit = average quarterly sales (Rs. in thousand)]
Convert the above equation to annual trend equation and estimate the sales for the year 2006.
[Delhi Univ., B Com (Hons), 2005]
Solution: Quarterly trend equation is: yC = 130 + 1.8x; the origin is the first quarter of 2002, i.e. February
15, 2002. To convert it into an annual trend, shift the origin to June 30, 2002 (middle of year 2002). That is,
shift x by June 30, 2002 – February 15, 2002 = 4.5 months or 1.5 quarters.
Thus, the trend equation with June 30, 2002 as origin becomes:
yC = 130 + 1.8 (x + 1.5) = 132.7 + 1.8x
The annual trend equation then is
yC = 132.7 × 4 + (1.8)(16x) = 530.8 + 28.8x
Putting x = 4 to get expected sales for 2006: yC = 530.8 + 28.8(4) = 530.8 + 115.2 = Rs. 646 thousand.
Self-Practice Problems 11A
1995 20    2000 25
1996 22    2001 23
1997 24    2002 26
1998 21    2003 25
1999 23
11.3 A State Govt. is studying the number of traffic fatalities in the state resulting from drunken
driving for each of the last 12 months:
Find the trend line that describes the trend by using the method of semi-averages.
11.4 Calculate the three-month moving averages from the following data:
Jan.  Feb.  March  April  May  June
57    65    63     72     69   78
July  Aug.  Sept.  Oct.   Nov. Dec.
82    81    90     92     95   97
[Osmania Univ., B.Com, 1996]
11.5 Gross revenue data (Rs. in million) for a Travel Agency for a 11-year period is as follows:

Calculate a 5-year moving average for the unit cost of the product.
11.1
11.2
11.3
Month   Accidents
1       280
2       300
3       280
4       280
5       270
6       240
7       230
8       230
9       220
10      200
11      210
12      200
Average of first 6 months = 1650/6 = 275
Average of last 6 months = 1290/6 = 215
Slope = (215 − 275)/6 = − 10 per month
Trend line y = 275 − 10x (origin: middle of the first six months; x unit = 1 month).
11.4
Month   Values   3-month        3-month
                 Moving Total   Moving Average
Jan.    57       -              -
Feb.    65       185            185/3 = 61.67
March   63       200            200/3 = 66.67
April   72       204            204/3 = 68.00
May     69       219            73.00
June    78       229            76.33
July    82       241            80.33
Aug.    81       253            84.33
Sept.   90       263            87.67
Oct.    92       277            92.33
Nov.    95       284            94.67
Dec.    97       -              -
11.5
11.6
Year   Per Unit Cost   5-year         5-year
                       Moving Total   Moving Average
1995   332             -              -
1996   317             -              -
1997   357             1800           1800/5 = 360.0
1998   392             1873           1873/5 = 374.6
1999   402             1966           1966/5 = 393.2
2000   405             2036           407.2
2001   410             2049           409.8
2002   427             2085           417.0
2003   405             -              -
2004   438             -              -
11.7
Year   Number        5-year   5-year    7-year   7-year
       of Failures   Moving   Moving    Moving   Moving
                     Total    Average   Total    Average
1989   23            -        -         -        -
1990   26            -        -         -        -
1991   28            129      25.8      -        -
1992   32            118      23.6      153      21.9
1993   20            104      20.8      140      20.0
1994   12            86       17.2      123      17.6
1995   12            63       12.6      108      15.4
1996   10            56       11.2      87       12.4
1997   9             55       11.0      81       11.6
1998   13            57       11.4      81       11.6
1999   11            59       11.8      78       11.1
The trend line of best fit has the properties that (i) the summation of all vertical deviations about
it is zero, that is, Σ(y – ŷ) = 0, (ii) the summation of all vertical deviations squared is a minimum, that
is, Σ(y – ŷ)² is least, and (iii) the line goes through the mean values of variables x and y. For linear
equations, it is found by the simultaneous solution for a and b of the two normal equations:
Σy = na + bΣx and Σxy = aΣx + bΣx²
When the data are coded so that Σx = 0, two terms in these equations drop out and we have
Σy = na and Σxy = bΣx²
Coding is easily done with time-series data. For coding the data, we choose the centre of the time
period as x = 0 and have an equal number of plus and minus periods on each side of it,
which sum to zero.
Alternatively, we can also find the values of constants a and b for any regression line as:
b = (Σxy − n x̄ ȳ) / (Σx² − n(x̄)²) and a = ȳ − b x̄
Example 11.10: Below are given the figures of production (in thousand quintals) of a sugar factory:
Year : 1995 1996 1997 1998 1999 2000 2001
Production : 80 90 92 83 94 99 92
(a) Fit a straight line trend to these figures.
(b) Plot these figures on a graph and show the trend line.
(c) Estimate the production in 2004. [Bangalore Univ., B.Com, 1998]
Solution: (a) Using normal equations and the sugar production data we can compute constants a and
b as shown in Table 11.5:
x̄ = Σx/n = 28/7 = 4, ȳ = Σy/n = 630/7 = 90
b = (Σxy − n x̄ ȳ) / (Σx² − n(x̄)²) = (2576 − 7(4)(90)) / (140 − 7(4)²) = 56/28 = 2
a = ȳ – b x̄ = 90 – 2(4) = 82
Therefore, linear trend component for the production of sugar is:
ŷ = a + bx = 82 + 2x
The slope b = 2 indicates that over the past 7 years, the production of sugar had an average growth
of about 2 thousand quintals per year.
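As a check on the arithmetic, the formulas for b and a can be applied directly to the production figures. This is a minimal sketch. The function name `fit_linear_trend` is illustrative, and the coding x = 1 for 1995 is inferred from x̄ = 4 in the solution.

```python
def fit_linear_trend(x, y):
    """Least-squares line y = a + b*x using
    b = (Σxy − n·x̄·ȳ) / (Σx² − n·x̄²) and a = ȳ − b·x̄."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar
    sxx = sum(xi * xi for xi in x) - n * x_bar ** 2
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


x = [1, 2, 3, 4, 5, 6, 7]                  # 1995 .. 2001 coded as 1 .. 7
production = [80, 90, 92, 83, 94, 99, 92]  # thousand quintals

a, b = fit_linear_trend(x, production)
estimate_2004 = a + b * 10                 # 2004 corresponds to x = 10 under this coding
print(a, b, estimate_2004)
```

Under this coding the trend line extended to 2004 gives the estimate asked for in part (c).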
Example 11.11: The following table relates to the tourist arrivals (in millions) during 1994 to 2000
in India:
Year : 1994 1995 1996 1997 1998 1999 2000
Tourists arrivals : 18 20 23 25 24 28 30
Fit a straight line trend by the method of least squares and estimate the number of tourists that
would arrive in the year 2004.
Solution: Using normal equations and the tourists arrival data we can compute constants a and b as
shown in Table 11.6:
Table 11.6 Calculations for Least Squares Equation
x̄ = Σx/n = 0, ȳ = Σy/n = 168/7 = 24
b = (Σxy − n x̄ ȳ) / (Σx² − n(x̄)²) = 53/28 = 1.893;
a = ȳ – b x̄ = 24 – 1.893(0) = 24
Therefore, the linear trend component for arrival of tourists is
ŷ = a + bx = 24 + 1.893x
The estimated number of tourists that would arrive in the year 2004 is:
ŷ = 24 + 1.893 (7) = 37.251 million (measured from 1997 = origin)
Example 11.12: From the following data, calculate trend by method of least squares:
Year : 1970 1971 1972 1973 1974 1975 1976
Profit (’000 Rs.) : 300 700 600 800 900 700 1000
[Delhi Univ., BCom (P) 1985]
Solution: Using normal equations the calculations required to determine trend are shown below:
Let the straight line trend be: y = a + bx, where a = Σy/n = 100/5 = 20, and b = Σxy/Σx² = 35/10 = 3.5
Hence, y = 20 + 3.5x. Putting x = 3 in the trend line to estimate sales for year 1991:
Y_1991 = 20 + 3.5(3) = Rs. 30.5 crore.
Example 11.14: Fit a straight line trend to the following data by least squares method after summing
the given quarterly data to yearly data. Also tabulate short term fluctuations.
Export of Cotton Textile (Million Rs.)
Year   I    II   III   IV
1998   10   13   14    12
1999   12   14   15    13
2000   13   15   18    14
2001   15   18   21    18
2002   15   22   23    20
Plot the trend values and actual values and draw the trend line. [Delhi Univ., BCom (Hons), 1989]
Solution: Let the year 2000 be origin. Also x represents time in years and y represents exports in
millions of rupees.
Convert the quarterly data into yearly data as follows:
Year   I    II   III   IV   Yearly total
1998   10   13   14    12   49
1999   12   14   15    13   54
2000   13   15   18    14   60
2001   15   18   21    18   72
2002   15   22   23    20   80
Let the straight line trend be y = a + bx, where a = Σy/n = 315/5 = 63 and b = Σxy/Σx² = 80/10 = 8
Hence, y = a + bx = 63 + 8x
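The steps of this solution can be sketched in a few lines: sum the quarters, fit the coded trend, then tabulate the short-term fluctuations. The short-term fluctuation is taken here as actual minus trend value, which is an assumption about the intended definition.

```python
quarterly = {
    1998: [10, 13, 14, 12],
    1999: [12, 14, 15, 13],
    2000: [13, 15, 18, 14],
    2001: [15, 18, 21, 18],
    2002: [15, 22, 23, 20],
}

years = sorted(quarterly)
yearly = [sum(quarterly[yr]) for yr in years]  # yearly totals
x = [yr - 2000 for yr in years]                # coded time, so that Σx = 0

# With Σx = 0 the normal equations reduce to a = Σy/n and b = Σxy/Σx².
a = sum(yearly) / len(yearly)
b = sum(xi * yi for xi, yi in zip(x, yearly)) / sum(xi * xi for xi in x)

trend = [a + b * xi for xi in x]
fluctuations = [yi - ti for yi, ti in zip(yearly, trend)]  # actual minus trend

for yr, yi, ti, fi in zip(years, yearly, trend, fluctuations):
    print(yr, yi, ti, fi)
```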
Plotting trend values on a graph, the trend line so obtained is shown below:
In order to find out the values of constants a and b in the exponential function, the two normal
equations to be solved are
Σ log y = n log a + log b Σx
Σ x log y = log a Σx + log b Σx²
When the data is coded so that Σx = 0, the two normal equations become
Σ log y = n log a or log a = (1/n) Σ log y
and Σ x log y = log b Σx² or log b = (Σ x log y) / Σx²
Coding is easily done with time-series data by simply designating the centre of the time period as
x = 0, and having an equal number of plus and minus periods on each side which sum to zero.
Example 11.15: The sales (Rs. in million) of a company for the years 1995 to 1999 are:
Year : 1997 1998 1999 2000 2001
Sales : 1.6 4.5 13.8 40.2 125.0
Find the exponential trend for the given data and estimate the sales for 2004.
Solution: The computational time can be reduced by coding the data. For this consider
u = x – 3. The necessary computations are shown in Table 11.7.
Table 11.7 Calculation for Least Squares Equation
log a = (1/n) Σ log y = (1/5)(5.6983) = 1.1397
log b = (Σ u log y) / Σu² = 4.7366/10 = 0.4737
Therefore log y = log a + u log b = 1.1397 + 0.4737u, where u = x − 3
For sales during 2004, u = 3, and we obtain
log y = 1.1397 + 0.4737 (3) = 2.5608
or y = antilog (2.5608) = 363.80
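The coded normal equations for the exponential trend can be verified with a brief sketch using base-10 logarithms, as in the solution. The function name `fit_exponential_trend` is illustrative, and the coded values u = −2, ..., 2 follow from u = x − 3 with x = 1, ..., 5.

```python
import math


def fit_exponential_trend(u, y):
    """Fit y = a * b**u by least squares on log10(y).

    Assumes the time variable is coded so that sum(u) == 0, in which case
    log a = Σ log y / n and log b = Σ u·log y / Σu².
    """
    if sum(u) != 0:
        raise ValueError("u must be coded so that its sum is zero")
    logs = [math.log10(v) for v in y]
    log_a = sum(logs) / len(logs)
    log_b = sum(ui * li for ui, li in zip(u, logs)) / sum(ui * ui for ui in u)
    return log_a, log_b


u = [-2, -1, 0, 1, 2]
sales = [1.6, 4.5, 13.8, 40.2, 125.0]

log_a, log_b = fit_exponential_trend(u, sales)
estimate_u3 = 10 ** (log_a + 3 * log_b)  # trend value at coded time u = 3

print(round(log_a, 4), round(log_b, 4), round(estimate_u3, 1))
```

Small differences from the hand calculation are to be expected, since the worked solution rounds the logarithms to four decimal places.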
Example 11.16: Fit an exponential trend to the following data:
Year : 2001 2002 2003 2004 2005 2006 2007
Sales (in lakhs of Rs.) : 32 47 65 92 132 190 275
Solution: Calculations to fit an exponential trend to the given data are shown below:
Year x Sales (y) log y x2 x log y
Let year 2004 be the origin and the exponential trend equation be y = ab^x. Then normal equations
are
Σ log y = n log a + log b Σx or 13.7926 = 7 log a or log a = 1.9704
Σ x log y = log a Σx + log b Σx² or 4.3237 = 0 + 28 log b or log b = 0.154
Then log y = 1.9704 + 0.154x
For sales in 2008, x = 4. Thus log y = 1.9704 + 0.154(4) = 2.5864. Hence
y = Antilog (2.5864) = 385.9
Self-Practice Problems 11B
11.9 The general manager of a building materials production plant feels that the demand for
plasterboard shipments may be related to the number of construction permits issued in the
country during the previous quarter. The manager has collected the data shown in the table.
Construction   Plasterboard
Permits        Shipments
15             6
9              4
40             16
20             6
25             13
25             9
15             10
35             16
(a) Use the normal equations to derive a regression forecasting equation.
(b) Determine a point estimate for plasterboard shipments when the number
of construction permits is 30.
11.10 A company that manufactures steel observed the production of steel (in metric tonnes)
represented by the time-series:
Year                : 1996 1997 1998 1999 2000 2001 2002
Production of steel : 60 72 75 65 80 85 95
(a) Find the linear equation that describes the trend in the production of steel by the
company.
(b) Estimate the production of steel in 2003.
11.11 Fit a straight line trend by the method of least squares to the following data. Assuming that
the same rate of change continues, what would be the predicted earning (Rs. in lakh) for the
year 2004?
Year     : 1995 1996 1997 1998 1999 2000 2001 2002
Earnings : 38 40 65 72 69 60 87 95
[Agra Univ., BCom 1996; MD Univ., BCom, 1998]
11.12 The sales (Rs. in lakh) of a company for the years 1990 to 1996 are given below:
Year  : 1998 1999 2000 2001 2002 2003 2004
Sales : 32 47 65 88 132 190 275
Find trend values by using the equation yc = ab^x and estimate the value for 2005.
[Delhi Univ., B.Com, 1996]
11.13 A company that specializes in the production of petrol filters has recorded the following
production (in ’000 units) over the last 7 years.
Years      : 1995 96 97 98 99 00 01
Production : 42 49 62 75 92 122 158
(a) Develop a second-degree estimating equation that best describes these data.
(b) Estimate the production in 2005.
11.14 In 1996 a firm began downsizing in order to reduce its costs. One of the results of these
cost cutting measures has been a decline in the percentage of private industry jobs that
are managerial. The following data show the percentage of females who are managers
from 1996 to 2003.
Years      : 1996 97 98 99 00 01 02 03
Percentage : 6.7 5.3 4.3 6.1 5.6 7.9 5.8 6.1
(a) Develop a linear trend line for this time series through 2001 only.
(b) Use this trend to estimate the percentage of females who are managers in 2004.
11.15 A company develops, markets, manufactures, and sells integrated wide-area network access
products. The following are annual sales (Rs. in million) data from 1998 to 2004.
Year  : 1998 1999 2000 2001 2002 2003 2004
Sales : 16 17 25 28 32 43 50
(a) Develop the second-degree estimating equation that best describes these data.
(b) Use the trend equation to forecast sales for 2005.
= (x_i / x̄) × 100 (i = 1, 2, ..., 12)
It is important to note that the average of the indexes will always be 100, that is, sum of the
indexes should be 1200 for 12 months, and sum should be 400 for 4 quarterly data. If the sum of
these 12 months percentages is not 1200, then the monthly percentage so obtained are adjusted by
multiplying these by a suitable factor [1200 ÷ (sum of the 12 values)].
Example 11.18: The seasonal indexes of the sale of readymade garments in a store are given below:
If the total sales of garments in the first quarter is worth Rs. 1,00,000, determine how much
worth of garments of this type should be kept in stock to meet the demand in each of the remaining
quarters. [Delhi Univ., B.Com, 1996]
Solution: Calculations of seasonal index for each quarter and estimated stock (in Rs.) is shown in
Table 11.8.
Table 11.8 Calculation of Estimated Stock
Example 11.19: Use the method of monthly averages to determine the monthly indexes for the data
of production of a commodity for the years 2002 to 2004.
Solution: Computation of seasonal index by average percentage method based on the data is shown
in Table 11.9.
Monthly Average : 1080/12 = 90; 360/12 = 30; 1200/12 = 100
The average of monthly averages is obtained by dividing the total of monthly averages by 12. In
column 7 each monthly average for 3 years has been expressed as a percentage of the average of
monthly averages. For example, the monthly indexes are:
January = (21/30) × 100 = 70;
February = (21/30) × 100 = 70
March = (27/30) × 100 = 90, and so on
Example 11.20: The data on prices (Rs. in per kg) of a certain commodity during 2000 to 2004 are
shown below:
Quarter Years
2000 2001 2002 2003 2004
I 45 48 49 52 60
II 54 56 63 65 70
III 72 63 70 75 84
IV 60 56 65 72 66
Compute the seasonal indexes by the average percentage method and obtain the deseasonalized
values.
Solution: Calculations for quarterly averages are shown in Table 11.10.
Year Quarters
I II III IV
2000 45 54 72 60
2001 48 56 63 56
2002 49 63 70 65
2003 52 65 75 72
2004 60 70 84 66
Quarterly total 254 308 364 319
Quarterly average 50.8 61.6 72.8 63.8
Seasonal index 81.60 98.95 116.94 102.48
The average of the quarterly averages is (50.8 + 61.6 + 72.8 + 63.8)/4 = 62.25. Thus,
Seasonal index for quarter I = (50.8/62.25) × 100 = 81.60
Seasonal index for quarter II = (61.6/62.25) × 100 = 98.95
Seasonal index for quarter III = (72.8/62.25) × 100 = 116.94
Seasonal index for quarter IV = (63.8/62.25) × 100 = 102.48
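The average percentage method used in Table 11.10 can be sketched as follows. The function name `seasonal_indexes` is illustrative, and the price table is the one given in the example.

```python
def seasonal_indexes(table):
    """Seasonal indexes by the average percentage method.

    table maps each year to its list of seasonal values (here 4 quarters).
    Each seasonal average is expressed as a percentage of the mean of
    all seasonal averages.
    """
    rows = list(table.values())
    k = len(rows[0])
    averages = [sum(row[q] for row in rows) / len(rows) for q in range(k)]
    grand_average = sum(averages) / k
    return [100 * avg / grand_average for avg in averages]


prices = {
    2000: [45, 54, 72, 60],
    2001: [48, 56, 63, 56],
    2002: [49, 63, 70, 65],
    2003: [52, 65, 75, 72],
    2004: [60, 70, 84, 66],
}

indexes = seasonal_indexes(prices)
print([round(i, 2) for i in indexes], round(sum(indexes), 2))
```

The printed values may differ from the table in the second decimal place, because the table appears to truncate rather than round. The four indexes sum to 400.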
Deseasonalized Values Seasonal influences are removed from a time-series data by dividing the actual
y value for each quarter by its corresponding seasonal index:
Deseasonalized value = (Actual quarterly value / Seasonal index of corresponding quarter) × 100
The deseasonalized y values which are measured in the same unit as the actual values, reflect the
collective influence of trend, cyclical and irregular forces. The deseasonalized values are given in
Table 11.11.
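A minimal sketch of the deseasonalizing step, applied to the prices for the year 2000 with the seasonal indexes from Table 11.10. The function name `deseasonalize` is illustrative.

```python
def deseasonalize(values, indexes):
    """Divide each actual value by its seasonal index and multiply by 100."""
    return [100 * v / s for v, s in zip(values, indexes)]


seasonal_index = [81.60, 98.95, 116.94, 102.48]  # quarters I to IV
prices_2000 = [45, 54, 72, 60]

deseasonalized_2000 = deseasonalize(prices_2000, seasonal_index)
print([round(v, 2) for v in deseasonalized_2000])
```

The results agree with the first row of Table 11.11 up to rounding in the last digit.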
Limitations of the method of simple averages This method is the simplest of all the methods for
measuring seasonal variation. However, the limitation of this method is that it assumes that
there is no trend component in the series, that is, C ⋅ S ⋅ I = 0 or trend is assumed to have
little impact on the time-series. This assumption is not always justified.
Year Quarters
I II III IV
2000 55.14 54.57 61.57 58.54
2001 58.82 56.59 53.87 54.64
2002 60.00 63.66 59.85 63.42
2003 63.72 65.68 64.13 70.25
2004 73.52 70.74 71.83 64.40
Year Quarters
I II III IV
2000 60 80 72 68
2001 68 104 100 88
2002 80 116 108 96
2003 108 152 136 124
2004 160 184 172 164
Calculate the seasonal index for each of the four quarters using the ratio-to-trend method.
Solution: Calculations to obtain annual trend values from the given quarterly data using the method
of least-squares are shown in Table 11.12.
Solving the following normal equations, we get
Σ y = na + bΣ x or 560 = 5a or a = 112
Σ xy = aΣ x + bΣ x² or 240 = 10b or b = 24
Thus the yearly fitted trend line is: y = 112 + 24x. The value of b = 24 indicates yearly increase in
sales. Thus the quarterly increment will be 24/4 = 6.
To calculate quarterly trend values, consider first the year 2000. The trend value for this year is
64. This is the value for the middle of the year 2000, that is, half of the 2nd quarter and half of the
3rd quarter. Since quarterly increment is 6, the trend value for the 2nd quarter of 2000 would be
64 – (6/2) = 61 and for the 3rd quarter it would be 64 + (6/2) = 67. The value for the 1st quarter of
2000 would be 61 – 6 = 55 and for the 4th quarter it would be 67 + 6 = 73. Similarly, trend values
of the various quarters of other years can be calculated as shown in Table 11.13.
Year Quarters
I II III IV
2000 55 61 67 73
2001 79 85 91 97
2002 103 109 115 121
2003 127 133 139 145
2004 151 157 163 169
After getting the trend values, the given data values in the time-series are expressed as percentages
of the corresponding trend values in Table 11.13. Thus for the 1st quarter of 2000, this percentage
would be (60/55) × 100 = 109.09; for the 2nd quarter it would be (80/61) × 100 = 131.15, and
so on. Other values can be calculated in the same manner as shown in Table 11.14.
The total of average of seasonal indexes is 403.12 (>400). Thus we apply the correction factor
(400/403.12) = 0.992. Now each quarterly average is multiplied by 0.992 to get the adjusted seasonal
index as shown in Table 11.14.
The seasonal index 92.02 in the first quarter means that on average sales tend to be depressed by
the presence of seasonal forces to the extent of approx. (100 – 92.02) = 7.98%. Alternatively, values of
the time series would be approx. (7.98/92.02) × 100 = 8.67% higher had seasonal influences not been
present.
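The ratio-to-trend steps (fit a yearly trend line, spread it over quarters, express each value as a percentage of trend, average, then adjust to a total of 400) can be sketched as follows. This is a minimal illustration on the quarterly data of this example; the variable names are chosen here, and the unrounded correction factor is used, so the adjusted indexes differ slightly from those obtained with the rounded factor 0.992.

```python
# Quarterly data of the ratio-to-trend example, one row per year 2000-2004.
sales = [
    [60, 80, 72, 68],
    [68, 104, 100, 88],
    [80, 116, 108, 96],
    [108, 152, 136, 124],
    [160, 184, 172, 164],
]

n = len(sales)
x = [i - (n - 1) / 2 for i in range(n)]      # -2, -1, 0, 1, 2 (origin: middle year)
y = [sum(row) / 4 for row in sales]          # yearly average of the quarterly values

# Least-squares line y = a + b x; with sum(x) = 0 the normal equations decouple.
a = sum(y) / n
b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
quarterly_step = b / 4

# The yearly trend value sits at mid-year, so quarters lie at -1.5, -0.5, +0.5, +1.5 steps.
trend = [[a + b * xi + quarterly_step * (q - 1.5) for q in range(4)] for xi in x]

# Actual values as percentages of trend, then the average for each quarter.
ratios = [[100 * v / t for v, t in zip(vals, tr)] for vals, tr in zip(sales, trend)]
averages = [sum(row[q] for row in ratios) / n for q in range(4)]

# Scale the averages so that the four indexes add up to 400.
factor = 400 / sum(averages)
adjusted = [avg * factor for avg in averages]

print(a, b, trend[0])
print([round(v, 2) for v in averages], round(sum(averages), 2))
print([round(v, 2) for v in adjusted])
```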
Year Quarters
I II III IV
2000 109.09 131.15 107.46 93.15
2001 86.08 122.35 109.89 90.72
2002 77.67 106.42 93.91 79.34
2003 85.04 114.29 97.84 85.52
2004 105.96 117.20 105.52 97.04
Total 463.84 591.41 514.62 445.77
Average 92.77 118.28 102.92 89.15 (total = 403.12)
Adjusted seasonal index 92.02 117.33 102.09 88.43
Example 11.22: The production of a commodity during 1993-98 is given below. Fit the second
degree parabola to these data and estimate the production for the year 2000:
Year : 1993 1994 1995 1996 1997 1998
Production : 10 12 13 15 18 20
(’000 tonnes) [Delhi Univ., B.Com, 2002]
Solution: Second degree parabolic trend equation is given by
$$y_C = a + bx + cx^2$$
To find the values of constants a, b and c, the normal equations are:
$$\Sigma y = na + b\Sigma x + c\Sigma x^2$$
$$\Sigma xy = a\Sigma x + b\Sigma x^2 + c\Sigma x^3$$
$$\Sigma x^2 y = a\Sigma x^2 + b\Sigma x^3 + c\Sigma x^4$$
Calculations required to calculate values of constants considering 1995 as origin are shown
below:
Year y x x² x³ x⁴ xy x²y
1993 10 –2 4 –8 16 –20 40
1994 12 –1 1 –1 1 –12 12
1995 13 0 0 0 0 0 0
1996 15 1 1 1 1 15 15
1997 18 2 4 8 16 36 72
1998 20 3 9 27 81 60 180
Total 88 3 19 27 115 79 319
Substituting the totals from the table (with n = 6) into the normal equations, we get
88 = 6a + 3b + 19c (i)
79 = 3a + 19b + 27c (ii)
319 = 19a + 27b + 115c (iii)
Multiply eqn. (i) by 19 and (ii) by 3 and subtract, we get 105a + 280c = 1435 or 3a + 8c = 41 (iv)
Multiply eqn. (iii) by 19 and (ii) by 27 and subtract, we get 280a + 1456c = 3928 or 35a + 182c = 491 (v)
Multiply eqn. (iv) by 35 and (v) by 3 and subtract, we get 266c = 38 or c = 1/7 = 0.143
Putting c = 1/7 in (iv), we get 3a = 41 – 8/7 or a = 93/7 = 13.286
Putting the values of a and c in eqn. (i), we get
3b = 88 – 6(93/7) – 19(1/7) = 39/7 or b = 13/7 = 1.857
Hence the parabolic equation becomes:
y = 13.286 + 1.857x + 0.143x²
Since the origin is 1995, the year 2000 corresponds to x = 5. Thus
y2000 = 13.286 + 1.857(5) + 0.143(5)² = 13.286 + 9.285 + 3.575 = 26.146, that is, about 26.1 thousand tonnes.
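As a cross-check, the three normal equations for a second degree parabola can be solved exactly with a few lines of Python. The sketch below is illustrative only: `fit_parabola` is a name chosen here, it builds the normal equations from the data of Example 11.22 with 1995 as origin, and it uses exact fractions so that no rounding enters the elimination.

```python
from fractions import Fraction

years = [1993, 1994, 1995, 1996, 1997, 1998]
production = [10, 12, 13, 15, 18, 20]   # '000 tonnes
origin = 1995


def fit_parabola(xs, ys):
    """Solve the normal equations of y = a + b x + c x^2 by Gauss-Jordan elimination."""
    def sx(power):
        return sum(Fraction(x) ** power for x in xs)

    def sxy(power):
        return sum(Fraction(x) ** power * y for x, y in zip(xs, ys))

    # Augmented matrix of the three normal equations.
    m = [
        [Fraction(len(xs)), sx(1), sx(2), sxy(0)],
        [sx(1), sx(2), sx(3), sxy(1)],
        [sx(2), sx(3), sx(4), sxy(2)],
    ]
    for col in range(3):
        pivot = next(r for r in range(col, 3) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(3):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return m[0][3], m[1][3], m[2][3]


xs = [year - origin for year in years]
a, b, c = fit_parabola(xs, production)
x_2000 = 2000 - origin
estimate_2000 = a + b * x_2000 + c * x_2000 ** 2

print(float(a), float(b), float(c))
print(float(estimate_2000))
```

The exact solution is a = 93/7, b = 13/7 and c = 1/7, which gives an estimate of 183/7, or roughly 26.1 thousand tonnes, for the year 2000 (x = 5).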
Example 11.23: The prices of a commodity during 2001-2006 are given below. Fit a parabola
Y = a + bx + cx2 to these data. Estimate the price for the year 2007.
Year : 2001 2002 2003 2004 2005 2006
Price (Rs.) : 100 107 128 144 181 192
Year y x x2 x3 x4 xy x2 y
Example 11.24: (a) The trend equation for the yearly sales of a commodity with 1st July, 1991 as
origin is yC = 96 + 28.8x + 4x2, where x unit = 1 year. Determine the monthly trend equation with
Jan 1992 as origin.
(b) Compute trend values for August 1991. [Delhi Univ., B.Com(Hons), 2002]
Solution: Given trend equation yC = 96 + 28.8x + 4x2 (origin: 1st July, 1991, x unit = one year)
(a) To obtain the monthly trend equation, divide 96 (i.e. a) by 12, 28.8 (i.e. b) by (12×12) and 4 (i.e. c) by
(12×12×12):
$$y_C = \frac{96}{12} + \frac{28.8}{12 \times 12}x + \frac{4}{12 \times 12 \times 12}x^2 = 8 + 0.2x + 0.0023x^2 \quad \text{(i)}$$
(origin: 1st July, 1991, x unit = 1 month)
To change the origin from 1st July 1991 to January 1992, x shall be increased by 6.5. That is
$$\begin{aligned}
y_C &= 8 + 0.2(x + 6.5) + 0.0023(x + 6.5)^2 \\
&= 8 + 0.2(x + 6.5) + 0.0023(x^2 + 13x + 42.25) \\
&= 8 + 0.2x + 1.3 + 0.0023x^2 + 0.03x + 0.097 \\
&= 9.397 + 0.23x + 0.0023x^2 \quad \text{(ii)}
\end{aligned}$$
(b) To get the trend value for August 1991, replace x by 1.5 in (i):
yC = 8 + 0.2(1.5) + 0.0023 × (1.5)² = 8.305.
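The two manipulations used in this example (rescaling an annual parabola to monthly units, then shifting the origin) are easy to mechanize. The sketch below follows the convention of the example, dividing a, b and c by 12, 12² and 12³ respectively; the function names are chosen here for illustration.

```python
def annual_to_monthly(a, b, c):
    """Convert y = a + b x + c x^2 from annual totals (x in years) to monthly values (x in months)."""
    return a / 12, b / 12 ** 2, c / 12 ** 3


def shift_origin(a, b, c, k):
    """Coefficients after replacing x by (x + k), i.e. moving the origin k units forward."""
    return a + b * k + c * k ** 2, b + 2 * c * k, c


def trend_value(a, b, c, x):
    return a + b * x + c * x ** 2


monthly = annual_to_monthly(96, 28.8, 4)          # origin: 1st July 1991
january_1992 = shift_origin(*monthly, 6.5)        # mid-January 1992 is 6.5 months later
august_1991 = trend_value(*monthly, 1.5)          # mid-August 1991 is 1.5 months later

print([round(v, 4) for v in monthly])
print([round(v, 4) for v in january_1992])
print(round(august_1991, 3))
```

Small differences from the hand calculation arise because the code keeps c = 4/1728 unrounded instead of using 0.0023.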
Example 11.25: Calculate the seasonal index by the ratio-to-moving average method from the following data:
Year Quarters
I II III IV
2001 75 60 53 59
2002 86 65 63 80
2003 90 72 66 85
2004 100 78 72 93
Solution: Calculations for 4 quarterly moving averages and ratio-to-moving averages are shown in
Tables 11.15 and 11.16.
Year Quarters
I II III IV
2001 — — 85.21 90.25
2002 128.12 91.71 85.13 106.14
2003 117.45 92.75 85.13 104.29
2004 120.48 92.03 — —
Total 366.05 276.49 255.47 300.68
Seasonal average 91.51 69.13 63.87 75.17 (total = 299.68)
Adjusted seasonal index 122.07 92.22 85.20 100.30 (total ≅ 400)
The total of seasonal averages is 299.68. Therefore the corresponding correction factor would be
400/299.68 = 1.334. Each seasonal average is multiplied by the correction factor 1.334 to get the
adjusted seasonal indexes shown in Table 11.17.
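The adjustment step is the same in every method of this section: the quarterly averages are rescaled so that they add up to 400 (or 1200 for monthly data). A minimal sketch, with `adjust_to_total` as an illustrative name and the seasonal averages of this example as input:

```python
def adjust_to_total(averages, target=400.0):
    """Return the correction factor and the averages rescaled to sum to `target`."""
    factor = target / sum(averages)
    return factor, [avg * factor for avg in averages]


seasonal_averages = [91.51, 69.13, 63.87, 75.17]
factor, adjusted = adjust_to_total(seasonal_averages)

print(round(factor, 4))
print([round(v, 2) for v in adjusted], round(sum(adjusted), 2))
```

Using the unrounded factor makes the adjusted indexes sum to exactly 400, whereas a factor rounded to three decimals leaves a small residual.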
Example 11.26: Calculate the seasonal indexes by the ratio-to-moving average method from the
following data:
Rearranging the percentages to moving averages, the seasonal indexes are calculated as shown in
Table 11.18.
Since the total of average indexes is less than 400, the adjustment of the seasonal index has been
done by calculating the grand mean value as follows:
Advantages and Disadvantages of Ratio-to-Moving Average Method This is the most widely used method for
measuring seasonal variations because it eliminates both trend and cyclical variations from the time-
series. However, if cyclical variations are not regular, then this method is not capable of eliminating
them completely. Seasonal indexes calculated by this method will contain some effect of cyclical
variations.
The only other disadvantage of this method is that, when a 12-month moving average is used, six data
values at the beginning and six data values at the end are not taken into consideration for the
calculation of seasonal indexes.
Year Quarters
I II III IV
1999 68 62 61 63
2000 65 58 56 61
2001 68 63 63 67
2002 70 59 56 62
2003 60 55 51 58
Solution: Computations of link relatives (L.R.) are shown in Table 11.19 by using the following
formula:
$$\text{Link relative of any quarter} = \frac{\text{Data value of current quarter}}{\text{Data value of preceding quarter}} \times 100$$
Year Quarters
I II III IV
1999 — 91.18 98.39 103.28
2000 103.18 89.23 96.55 108.93
2001 111.48 92.65 100.00 106.35
2002 104.48 84.29 94.91 110.71
2003 96.78 91.67 92.73 113.73
Total of L.R. 415.92 449.02 482.58 543.00
Arithmetic mean of L.R. 103.98 89.80 96.52 108.60
Chain relatives (C.R.) 100 89.80 86.67 94.12
where the C.R. for quarter II = (89.80 × 100)/100 = 89.80, for quarter III = (96.52 × 89.80)/100 = 86.67,
and for quarter IV = (108.60 × 86.67)/100 = 94.12.
The new chain relative for the first quarter on the basis of the last quarter is calculated as follows:
$$\text{New C.R.} = \frac{\text{L.R. of first quarter} \times \text{C.R. of previous quarter}}{100} = \frac{103.98 \times 94.12}{100} = 97.9$$
Since the new C.R. is not equal to 100, we need to apply a quarterly correction factor:
$$d = \frac{1}{4}(\text{New C.R. of first quarter} - 100) = \frac{1}{4}(97.9 - 100) = -0.53$$
Thus the corrected (or adjusted) C.R. for other quarters is shown in Table 11.20. For this we use the
formula:
Corrected C.R. for kth quarter = Original C.R. of kth quarter – (k – 1) d
where k = 1, 2, 3, 4.
Quarter : Corrected C.R.
I : 100
II : 89.80 – (–0.53) = 90.33
III : 86.67 – 2(–0.53) = 87.73
IV : 94.12 – 3(–0.53) = 95.71
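The chain of calculations in the link relative method (link relatives, their quarterly means, chain relatives, the correction d, and the corrected chain relatives) can be sketched as below. The code is an illustration on the 1999 to 2003 data of this example; `link_relative_method` is a name chosen here, and full precision is kept throughout, so the last decimal may differ from the hand-rounded figures.

```python
# Quarterly data of the example, one row per year 1999-2003.
data = [
    [68, 62, 61, 63],
    [65, 58, 56, 61],
    [68, 63, 63, 67],
    [70, 59, 56, 62],
    [60, 55, 51, 58],
]


def link_relative_method(rows):
    flat = [v for row in rows for v in row]
    # Link relative: current quarter as a percentage of the preceding quarter.
    # The very first quarter has no predecessor, so its entry is None.
    links = [None] + [100 * cur / prev for prev, cur in zip(flat, flat[1:])]
    by_quarter = [
        [links[i] for i in range(q, len(links), 4) if links[i] is not None]
        for q in range(4)
    ]
    mean_lr = [sum(vals) / len(vals) for vals in by_quarter]

    # Chain relatives: quarter I is fixed at 100, each later quarter builds on the previous one.
    chain = [100.0]
    for q in range(1, 4):
        chain.append(mean_lr[q] * chain[q - 1] / 100)

    # Chain relative of quarter I obtained from quarter IV, and the correction per quarter.
    new_cr_first = mean_lr[0] * chain[3] / 100
    d = (new_cr_first - 100) / 4
    corrected = [chain[k] - k * d for k in range(4)]
    return mean_lr, chain, new_cr_first, d, corrected


mean_lr, chain, new_cr_first, d, corrected = link_relative_method(data)
print([round(v, 2) for v in mean_lr])
print([round(v, 2) for v in chain])
print(round(new_cr_first, 2), round(d, 2))
print([round(v, 2) for v in corrected])
```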
Example 11.28: Apply the method of link relatives to the following data and calculate the seasonal
index:
Year Quarters
I II III IV
2000 45 54 72 60
2001 48 56 63 56
2002 49 63 70 65
2003 52 65 75 72
2004 60 70 84 66
Solution: Computations of link relatives (L.R.) using the following formula are shown in
Table 11.21.
$$\text{L.R. of any quarter} = \frac{\text{Data value of current quarter}}{\text{Data value of preceding quarter}} \times 100$$
Year Quarters
I II III IV
2000 — 120 133.33 83.33
2001 80.00 116.67 112.50 88.89
2002 87.50 128.57 111.11 92.86
2003 80.00 125.00 115.38 96.00
2004 85.71 116.67 120.00 78.57
Total of L.R. 333.21 606.91 592.32 439.65
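Arithmetic mean of L.R. 83.30 121.38 118.46 87.93
Chain relatives (C.R.) 100 121.38 143.78 126.42
where the C.R. for quarter II = (121.38 × 100)/100 = 121.38, for quarter III = (118.46 × 121.38)/100 = 143.78,
and for quarter IV = (87.93 × 143.78)/100 = 126.42.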
The new chain relative for the first quarter on the basis of the preceding quarter is calculated as
follows:
$$\text{New C.R.} = \frac{\text{L.R. of first quarter} \times \text{C.R. of previous quarter}}{100} = \frac{83.30 \times 126.42}{100} = 105.30$$
Since the new C.R. is more than 100, we need to apply a quarterly correction factor:
$$d = \frac{1}{4}(\text{New C.R. of first quarter} - 100) = \frac{1}{4}(105.30 - 100) = 1.325$$
Thus the corrected (or adjusted) C.R. for other quarters is shown in Table 11.22. For this we use the
formula
Corrected C.R. for kth quarter = Original C.R. of kth quarter – (k – 1) d
where k = 1, 2, 3, 4.
Quarter : Corrected C.R.
I : 100
II : 121.38 – 1.32 = 120.06
III : 143.78 – 2(1.32) = 141.14
IV : 126.42 – 3(1.32) = 122.46
Advantages and Disadvantages of Link Relative Method This method is much simpler than the ratio-to-
trend or the ratio-to-moving average methods. In this method only the L.R. of the first quarter (or month)
is lost, as compared to the ratio-to-moving average method, where 6 values each at the
beginning and at the end (for monthly data) are lost.
This method eliminates the trend, but only if there is a straight line (linear) trend in the
time-series, which is generally not found in business and economic series.
Conceptual Questions 11B
15. (a) Under what circumstances can a trend equation be used to forecast a value in a series
in the future? Explain.
(b) What are the advantages and disadvantages of trend analysis? When would you use this
method of forecasting?
16. What effect does seasonal variability have on a time-series? What is the basis for this variability
for an economic time-series?
17. What is measured by a moving average? Why are 4-quarter and 12-month moving averages
used to develop a seasonal index?
18. Briefly describe the moving average and least squares methods of measuring trend in time-series.
19. Explain the simple average method of calculating indexes in the context of time-series analysis.
20. Distinguish between ratio-to-trend and ratio-to-moving average as methods of measuring
seasonal variations. Which is better and why?
21. Distinguish between trend, seasonal variations, and cyclical variations in a time-series. How
can trend be isolated from variations?
Self-Practice Problems 11C
11.16 Apply the method of link relatives to the following data and calculate seasonal indexes.
11.19 Calculate seasonal index numbers from the following data:
Year Quarters
I II III IV
1999 — 108.3 120.0 111.5
2000 62.1 146.3 106.3 89.9
2001 93.2 95.6 143.1 68.8
2002 112.5 80.6 129.3 113.3
2003 77.6 110.6 109.6 88.8
Formulae Used
1. Secular trend line
• Linear trend model
$y = a + bx$, where $a = \bar{y} - b\bar{x}$; $b = \dfrac{\Sigma xy - n\bar{x}\bar{y}}{\Sigma x^2 - n(\bar{x})^2}$
• Exponential trend model
$y = ab^x$; $\log a = \dfrac{1}{n}\Sigma \log y$; $\log b = \dfrac{\Sigma x \log y}{\Sigma x^2}$
• Parabolic trend model
$y = a + bx + cx^2$, where $a = \dfrac{\Sigma y - c\Sigma x^2}{n}$; $b = \dfrac{\Sigma xy}{\Sigma x^2}$
Review Self-Practice Problems
11.22 A sugar mill is committed to accepting beets from local producers and has experienced the
following supply pattern (in thousands of tons/year and rounded).
Year Tonnes Year Tonnes
1990 100 1995 400
1991 100 1996 400
1992 200 1997 600
1993 600 1998 800
1994 500 1999 800
The operations manager would like to project a trend to determine what facility additions will
be required by 2004.
(a) Sketch a freehand curve and extend it to 2004. What would be your 2004 forecast
based upon the curve?
(b) Compute a three-year moving average and plot it as a dotted line on your graph.
11.23 Use the data of Problem 11.22 and the normal equations to develop a least squares line of best
fit. Omit the year 1990.
(a) State the equation when the origin is 1995.
(b) Use your equation to estimate the trend value for 2004.
11.24 A forecasting equation is of the form:
yc = 720 + 144x
[2003 = 0, x unit = 1 year, y = annual sales]
(a) Forecast the annual sales rate for 2003 and also for one year later.
(b) Change the time (x) scale to months and forecast the annual sales rate at July 1,
2003, and also at one year later.
(c) Change the sales (y) scale to monthly and forecast the monthly sales rate at July 1,
2003, and also at one year later.
11.25 Data collected on the monthly demand for an item were as shown below:
January 100
February 90
March 80
April 150
May 240
June 320
July 300
August 280
September 220
(a) What conclusion can you draw with respect to the length of moving average versus
smoothing effect?
(b) Assume that the 12-month moving average centred on July was 231. What is the
value of the ratio-to-moving average that would be used in computing a seasonal
index?
11.26 Consider the following time-series data:
Week : 1 2 3 4 5 6
Value : 8 13 15 17 16 9
(a) Develop a 3-week moving average for this time-series. What is the forecast for week 7?
(b) Use α = 0.2 to compute the exponential smoothing values for the time-series. What
is the forecast for week 7?
11.27 Below are given the figures of production (in million tonnes) of a cement factory:
Year : 1990 1992 1993 1994 1995 1996 1999
Production : 77 88 94 85 91 98 90
(a) Fit a straight line trend by the ‘least squares method’ and tabulate the trend values.
(b) Eliminate the trend. What components of the time series are thus left over?
(c) What is the monthly increase in the production of cement?
11.28 The sale of a commodity in tonnes varied from January 2000 to December 2000 in the
following manner:
280 300 280 280 270 240
230 230 220 200 210 200
Fit a trend line by the method of semi-averages.
11.29 Fit a parabolic curve of the second degree to the data given below and estimate the value for
2002 and comment on it.
Year : 1996 1997 1998 1999 2000
Sales (Rs. in ’000) : 10 12 13 10 8
11.30 Given below are the figures of production (in 1000 quintals) of a sugar factory:
Year : 1991 1992 1993 1994 1995 1996 1997
Production : 40 45 46 42 47 49 46
Fit a straight line trend by the method of least squares and estimate the value for 2001.
Glossary of Terms
Causal forecasting methods: Forecasting methods that relate a time-series to other variables which are used to explain
cause and effect relationship.
Delphi method: A qualitative forecasting method that obtains forecasts through group consensus.
Time-series: A set of observations measured at successive points in time or over successive periods of time.
Trend: A type of variation in time-series that reflects the long-term movement of the series.
Cyclical variation: A type of variation in time-series in which the value of the variable fluctuates above and below a
trend line, with each cycle lasting more than one year.
Seasonal variation: A type of variation in time-series that shows a periodic pattern of change in time-series within a
year; patterns tend to be repeated from year to year.
Irregular variation: A type of variation in time-series that reflects the random variation of the time-series values which
is completely unpredictable.
Moving averages: A quantitative method of forecasting or smoothing a time-series by averaging each successive group
of data values.
Weighted moving average: A quantitative method of forecasting or smoothing a time-series by computing a weighted
average of past data values; sum of weights must equal one.
Deseasonalization: A statistical process used to remove the effect of seasonality from a time-series by dividing each
original series observation by the corresponding seasonal index.
LEARNING OBJECTIVES
12.1 INTRODUCTION
We know that most values change, and we may therefore want to know how much change has taken place
over a period of time. For example, we may want to know how much the prices of different items
essential to a household have increased or decreased so that necessary adjustments can be made in
the monthly budget. An organization may be concerned with the way in which prices paid for raw
materials, annual income and profit, commodity prices, share prices, production volume, advertising
budget, wage bills, and so on, have changed over a period of time. However, while prices of a few
items may have increased, others may have decreased over a given period of time. Consequently in all
such situations, an average measure needs to be defined to compare such differences from one time
period to another. Index numbers are yardsticks for describing such differences.
An index number can be defined as a relative measure describing the average changes in any
quantity over time. In other words, an index number measures the changing value of prices, quantities,
or values over a period of time in relation to its value at some fixed point in time, called the base
period. This resulting ratio of the current value to a base value is multiplied by 100 to express the
index as a percentage. Since an index number is constructed as a ratio of a measure taken during one
time period to that same measure taken during another time period (called base period), it has no
unit and is always expressed as a percentage term as follows:
$$\text{Index number} = \frac{\text{Current period value}}{\text{Base period value}} \times 100$$
Indexes may be based at any convenient period, which is occasionally adjusted, and these
are published at any convenient frequency. Examples of some indexes are:
Daily Stock market prices
Monthly Unemployment figures
Yearly Gross National Product (GNP)
Index numbers were originally developed by economists for monitoring and comparing
different groups of goods. For decision-making in business, it is sometimes essential to understand
and manipulate different published index series and to construct one’s own index series.
From Table 12.1, it is observed that the price relative of 113.5 in 2003 shows an increase of
13.5% in wage bill compared to the base year 2000.
A composite price index measures the average price change for a basket of related items from a base
period to the current period. For example, the wholesale price index reflects the general price level
for a group of items (or a basket of items) taken as a whole.
The retail price index reflects the general changes in the retail prices of various items including
food, housing, clothing, and so on. In India, the Bureau of Labour Statistics publishes the retail price
index. The consumer price index, a special type of retail price index, is the primary measure of the
cost of living in a country. The consumer price index is a weighted average price index with fixed
weights. The weightage applied to each item in the basket is derived from the consumption patterns
of urban and rural families.
Quantity Index A quantity index measures the relative changes in quantity levels of a group (or
basket) of items consumed or produced, such as agricultural and industrial production, imports and
exports, between two time periods. The method of constructing quantity indexes is the same as that of
price index except that it is the quantities, rather than the prices, that vary from period to period.
The two most common quantity indexes are the weighted relative of aggregates and the weighted
average of quantity relative index.
Value Index A value index measures the relative changes in total monetary worth of an item, such as
inventories, sales, or foreign trade, between the current and base periods. The value of an item is
determined by multiplying its unit price by the quantity under consideration. The value index can
also be used to measure differences in a given variable in different locations. For example, the
comparative cost of living shows that in terms of cost of goods and services, it is cheaper to live in a
small city than in metro cities.
Special Purpose Indexes A few index numbers such as industrial production, agricultural production,
productivity, etc. can also be constructed separately depending on the nature and degree of relationship
between groups and items.
• Index number, almost alone in the domain of social sciences, may truly be called an exact
science, if it be permissible to designate as science the theoretical foundations of a useful art.
—Irving Fisher.
4. Index numbers measure the effect of changes in relation to time or place : Index numbers are used to
compare changes which take place over periods of time, between locations, and in categories. For
example, the cost of living may be different at two different places at the same time, or the cost of
living in one city can be compared across two periods of time.
The price index number is helpful in deflating the national income to remove the effect of
inflation over a long term, so that we may understand whether there is any change in the real
income of the people or not. The retail price index is often used to compute real changes in
earnings and expenditure as it compares the purchasing power of money at different points in
time. It is generally accepted as a standard measure of inflation even though calculated from a
restricted basket of goods.
Conceptual Questions 12A
1. Explain the significance of index numbers.
2. Explain the differences among the three principal types of indexes: price, quantity, and value.
3. How are index numbers constructed? What is their purpose?
4. What is an index number? Describe briefly its applications in business and industry.
5. What does an index number measure? Explain the nature and uses of index numbers.
6. Index numbers are economic barometers. Explain this statement and mention the limitations of index
numbers (if any).
7. What are the basic characteristics of an index number?
8. Since value of the base year is always 100, it does not make any difference which period is selected
as the base on which to construct an index. Comment.
9. What are the main uses of an index number?
10. What is meant by the term deflating a value series?
2002 26.00 (26.00/24.60) × 100 = 105.69 2.65
2003 26.50 (26.50/24.60) × 100 = 107.72 2.03
(b) The change in the price relative is divided by the index it has come from and multiplied
by 100 to find the percentage increase.
$$\text{For year 2001: } \frac{103.04 - 100}{100} \times 100 = 3.04 \text{ per cent}$$
$$\text{For year 2002: } \frac{105.69 - 103.04}{103.04} \times 100 = 2.57 \text{ per cent}$$
$$\text{For year 2003: } \frac{107.72 - 105.69}{105.69} \times 100 = 1.92 \text{ per cent}$$
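The difference between a percentage points change and a percentage change is easy to see in code. The sketch below uses the price relatives quoted in this example; labelling the base period as the year 2000 is an assumption made here for illustration, as are the function names.

```python
# Price relatives from the example; the base-year label 2000 is an assumption.
price_relatives = {2000: 100.00, 2001: 103.04, 2002: 105.69, 2003: 107.72}


def point_changes(series):
    """Percentage points change: simple difference between consecutive index values."""
    years = sorted(series)
    return {cur: series[cur] - series[prev] for prev, cur in zip(years, years[1:])}


def percentage_changes(series):
    """Percentage change: difference divided by the index it has come from, times 100."""
    years = sorted(series)
    return {
        cur: (series[cur] - series[prev]) / series[prev] * 100
        for prev, cur in zip(years, years[1:])
    }


points = point_changes(price_relatives)
percents = percentage_changes(price_relatives)
for year in sorted(percents):
    print(year, round(points[year], 2), round(percents[year], 2))
```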
Calculate the simple aggregate price index for 2002 using 2000 as the base year.
Solution: Calculations for aggregate price index are shown in Table 12.3.
The unweighted aggregate price index for expenses on a few food items in 2002 is given by
$$P_{01} = \frac{\Sigma p_1}{\Sigma p_0} \times 100 = \frac{199}{162} \times 100 = 122.83$$
The value P01 = 122.83 implies that the price of food items included in the price index has
increased by 22.83% over the period 2000 to 2002.
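$$\text{Average price relative index } P_{01} = \frac{1}{n}\Sigma\left(\frac{p_1}{p_0}\right) \times 100 \qquad (12\text{-}2)$$
When the geometric mean is used to average the price relatives, the index is obtained from
$$\log P_{01} = \frac{1}{n}\Sigma \log\left(\frac{p_1}{p_0} \times 100\right) = \frac{1}{n}\Sigma \log P; \qquad P = \frac{p_1}{p_0} \times 100$$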
Advantages: This index has the following advantages over the aggregate price index:
(i) The value of this index is not affected by the units in which prices of commodities are
quoted. The price relatives are pure numbers and therefore are independent of the original
units in which they are quoted.
(ii) Equal importance is given to each commodity and extreme commodities do not influence the
index number.
Limitations: Despite the few advantages mentioned above, this index is not popular on account of the
following limitations.
(i) Since it is an unweighted index, therefore each price relative is given equal importance.
However in actual practice a few price relatives are more important than others.
(ii) Although arithmetic mean is often used to calculate the average of price relatives, it also has
a few biases. The use of geometric mean is computationally difficult. Other measures of
central tendency such as median, mode and harmonic mean are almost never used for
calculating this index.
(iii) Index of price relatives does not satisfy all criteria such as identity, time reversal, and circular
properties, laid down for an ideal index. These criteria will be discussed later in the chapter.
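The unweighted indexes discussed so far can be compared side by side in a short Python sketch. The prices below are those of six commodities for 2003 and 2004 taken from Self-Practice Problem 12.4 of this chapter; the function names are illustrative and not part of the text.

```python
import math

# Prices of commodities A-F (Self-Practice Problem 12.4): base year 2003, current year 2004.
p0 = [45, 60, 20, 50, 85, 120]
p1 = [55, 70, 30, 75, 90, 130]


def simple_aggregate_index(base, current):
    """Sum of current prices as a percentage of the sum of base prices."""
    return sum(current) / sum(base) * 100


def price_relatives(base, current):
    return [c / b * 100 for b, c in zip(base, current)]


def arithmetic_mean_index(base, current):
    rel = price_relatives(base, current)
    return sum(rel) / len(rel)


def geometric_mean_index(base, current):
    """Antilog of the mean of the logarithms of the price relatives."""
    rel = price_relatives(base, current)
    return 10 ** (sum(math.log10(r) for r in rel) / len(rel))


aggregate = simple_aggregate_index(p0, p1)
am_index = arithmetic_mean_index(p0, p1)
gm_index = geometric_mean_index(p0, p1)
print(round(aggregate, 2), round(am_index, 2), round(gm_index, 2))
```

The aggregate index depends on the units in which prices are quoted, while both averages of price relatives do not; the geometric mean is never larger than the arithmetic mean of the same relatives.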
Example 12.3: From the data given below, construct the index of price relatives for the year 2002
taking 2001 as base year using (a) arithmetic mean and (b) geometric mean.
Solution: (a) Calculations of the index number using arithmetic mean (A.M.) are shown in Table 12.4.
$$\text{Average of price relative index } P_{01} = \frac{1}{n}\Sigma\left(\frac{p_1}{p_0}\right) \times 100 = \frac{1}{5}(627.54) = 125.508$$
Hence, we conclude that prices of items included in the calculation of index have increased by
25.508% in 2002 as compared to the base year 2001.
(b) Index number using geometric mean (G.M.) is shown in Table 12.5
Table 12.5: Calculations of Index Using G.M.
Self-Practice Problems 12A
12.1 The following data concern monthly salaries for the different classes of employees within a
small factory over a 3-year period.
Employee Class Salary per Month
 1998 1999 2000
A 2300 2500 2600
B 1900 2000 2300
C 1700 1700 1800
D 1000 1100 1300
Using 1998 as the base year, calculate the simple aggregate price index for the years 1999 and
2000.
12.2 The following data describe the average salaries (Rs. in ’000) for the employees in a company
over ten consecutive years.
Year : 1 2 3 4 5
Average salary : 10.9 11.4 12.0 12.7 13.6
Year : 6 7 8 9 10
Average salary : 14.4 15.0 15.5 16.3 13.6
(a) Calculate an index for these average salaries using year 5 as the base year.
(b) Calculate the percentage points change between consecutive years.
12.3 A State Govt. had compiled the information shown below regarding the price of the three
essential commodities: wheat, rice, and sugar. From the commodities listed, the corresponding
price indicates the average price for that year. Using 1998 as the base year, express the price
for the years 2000 to 2002, in terms of unweighted aggregate index.
Commodity 1998 1999 2000 2001 2002
Wheat 4 6 8 10 12
Rice 16 20 24 30 36
Sugar 8 10 16 20 24
12.4 Following are the prices of commodities in 2003 and 2004. Calculate a price index based on price
relatives, using the geometric mean.
Year Commodity
 A B C D E F
2003 45 60 20 50 85 120
2004 55 70 30 75 90 130
$$P_{01} = \text{antilog}\left\{\frac{1}{n}\Sigma \log P\right\} = \text{antilog}\left\{\frac{1}{6}(12.5659)\right\} = \text{antilog}(2.0943) = 124.25$$
12.5 Let expenditure on clothing be x and on house rent be y. Then as per conditions given, we have
3500 = 1400 + x + y + 560 + 630
or x + y = 910 (i)
Multiplying expenditure with group index and equating it to 136, we get
$$136 = \frac{(1400 \times 180) + (x \times 150) + (y \times 100) + (560 \times 110) + (630 \times 80)}{3500}$$
$$136 = \frac{2,52,000 + 150x + 100y + 61,600 + 50,400}{3500}$$
4,76,000 = 2,52,000 + 150x + 100y + 61,600 + 50,400
150x + 100y = 1,12,000 (ii)
Multiplying Eqn. (i) by 150 and subtracting (ii) from it, we get
50y = 24,500 or y = Rs. 490 (house rent)
Substituting the value of y in Eqn. (i):
x + 490 = 910 or x = Rs. 420 (clothing)

Item   Weight (Q)   p0   p1   P = (p1/p0) × 100   PQ
A   100   8   12.00   150   15,000
B   25   6   7.50   125   3,125
C   10   5   5.25   105   1,050
D   20   48   60.00   125   2,500
E   25   15   16.50   110   2,750
F   30   9   27.00   300   9,000
Total   210               33,425
$$\text{Index number} = \frac{\Sigma PQ}{\Sigma Q} = \frac{33,425}{210} = 159.17$$

12.8
Year   Income (Rs.)   Index   Real Income (Rs.)   Real Income Index
1990   4000   100   (4000/100) × 100 = 4000.00   100.00
1991   4400   130   (4400/130) × 100 = 3384.62   84.62
1992   4800   160   (4800/160) × 100 = 3000.00   75.00

12.6 Let the rise in price of cloth be x.
Advantages: The main advantage of this method is that it uses only one quantity measure based on the
base period and therefore we need not keep record of quantity consumed in each period. Moreover,
having used the same base period quantity, we can compare the index of one period with another
directly.
Disadvantages: We know that the consumption of commodities decreases with relatively large increases
in price and vice versa. Since in this index the fixed quantity weights are determined from the base
period usage, it does not adjust for such changes in consumption and therefore tends to result in a bias
in the value of the composite price index.
Example 12.4: Compute the cost of living index number using Laspeyre’s method, from the following
information:
Solution: Calculation of cost of living index by Laspeyre’s method is shown in Table 12.6.
$$\text{Cost of living index} = \frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times 100 = \frac{3085}{2270} \times 100 = 135.9$$
Moreover, each year the index number for the previous year requires recomputation to reflect the
effect of the new quantity weights. Thus, it is difficult to compare indexes of different periods when
calculated by the Paasche’s method.
Example 12.5: For the following data, calculate the price index number of 1999 with 1998 as the
base year, using: (a) Laspeyre’s method, and (b) Paasche’s method.
Solution: Table 12.7 presents the information necessary for both Laspeyre’s and Paasche’s methods.
Advantages: Fisher’s method is also called ideal method due to following reasons:
(i) The formula is based on geometric mean which is considered to be the best average for
constructing index numbers.
(ii) The formula takes into account both base year and current year quantities as weights. Thus
it avoids the bias associated with the Laspeyre’s and Paasche’s indexes.
(iii) This method satisfies essential tests required for an index, that is, time reversal test and factor
reversal test.
Disadvantages: The calculation of index using this method requires more computation time. Although
the index number is theoretically better than others discussed previously, it is not fit for common use
because it requires current quantity weights every time an index is calculated.
Example 12.6: Compute index number from the following data using Fisher’s ideal index formula.
Solution: Table 12.8 presents the information necessary for Fisher’s method to calculate the index.
1999 2000
Commodity Price Expenditure on Quantity Price Expenditure on Quantity
(Rs.) Consumed (Rs.) (Rs.) Consumed (Rs.)
A 8 200 65 1950
B 20 1400 30 1650
C 5 80 20 900
D 10 360 15 300
E 27 2160 10 600
Solution: Table 12.9 presents the information necessary for Fisher’s method to calculate the index.
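Commodity p0 q0 p1 q1 p1q0 p0q0 p1q1 p0q1
A 4 8 9 10 72 32 90 40
B 3 7 4 8 28 21 32 24
C 4 6 8 7 48 24 56 28
D 2 5 4 5 20 10 20 10
Total 168 87 198 102
$$\text{Laspeyre's index} = \frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times 100 = \frac{168}{87} \times 100 = 193.10$$
$$\text{Paasche's index} = \frac{\Sigma p_1 q_1}{\Sigma p_0 q_1} \times 100 = \frac{198}{102} \times 100 = 194.12$$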
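The same calculation, together with Fisher's ideal index (the geometric mean of the Laspeyre's and Paasche's indexes), can be sketched in Python. The prices and quantities are those of the four commodities tabulated above; the helper names are chosen here for illustration.

```python
import math

# Base-year and current-year prices and quantities for commodities A-D.
p0 = [4, 3, 4, 2]
q0 = [8, 7, 6, 5]
p1 = [9, 4, 8, 4]
q1 = [10, 8, 7, 5]


def weighted_sum(prices, quantities):
    return sum(p * q for p, q in zip(prices, quantities))


def laspeyres(p0, q0, p1):
    """Current prices against base prices, both weighted by base-year quantities."""
    return weighted_sum(p1, q0) / weighted_sum(p0, q0) * 100


def paasche(p0, p1, q1):
    """Current prices against base prices, both weighted by current-year quantities."""
    return weighted_sum(p1, q1) / weighted_sum(p0, q1) * 100


def fisher(p0, q0, p1, q1):
    """Geometric mean of the Laspeyre's and Paasche's indexes."""
    return math.sqrt(laspeyres(p0, q0, p1) * paasche(p0, p1, q1))


L = laspeyres(p0, q0, p1)
P = paasche(p0, p1, q1)
F = fisher(p0, q0, p1, q1)
print(round(L, 2), round(P, 2), round(F, 2))
```

Fisher's index always lies between the other two, which is one way to see how it avoids the bias of using only base-year or only current-year weights.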
Example 12.9: Given that Σ p1 q1 = 250, Σ p0 q0 = 150; Paasche’s index number = 150 and Dorbish
and Bowley’s index number = 145. Find out Fisher’s ideal index number and Marshall-Edgeworth
index number. [Delhi Univ., B.Com (Hons), 1992, 2005]
Σ p1 q1 250 × 100
Paasche’s index number = × 100 or 150 =
Σ p0 q1 Σ p0 q1
25000 500
or 150 Σ p0 q1 = 25000, i.e. Σ p0 q1 = = =167 (approx)
150 3
Σ p1 q1 + Σ p1 q1 × 100
Σ p q
0 0 Σ p0 q1
Dorbish-Bowley’s index number =
2
Σ p q 250 145 Σ p1 q0
or 145 = 1 0 + 50 or = + 1.497
150 167 50 150
Σp1 q0 Σp1 q0
or 2.90 – 1.50 = or 1.40 = [Taking 1.497 as 1.50]
150 150
or Σ p1 q0 = 1.40×150 = 210
250 500
or Σ p0 q1 = × 100 = =166.6 ~ 167
150 3
L+P L + 150
Dorbish and Bowley’s index number = or 145 = , i.e. L = 140
2 2
Σp1 q0 Σp1 q0
But L = × 100 , or 140 = × 100
Σp0 q0 150
140 × 150
or Σ p1 q0 = = 210
100
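The derivation above stops at Σ p1 q0 = 210. A minimal Python sketch of the remaining arithmetic follows. It assumes, as the solution does, that the given 150 is Σ p0 q0, and it uses the standard definitions of Fisher’s index (geometric mean of L and P) and the Marshall-Edgeworth index; the variable names are illustrative.

```python
from math import sqrt

# Given quantities (the 150 is taken as sum of p0*q0, as the solution uses it)
sum_p1q1 = 250
sum_p0q0 = 150
paasche = 150
dorbish_bowley = 145

sum_p0q1 = sum_p1q1 * 100 / paasche        # from Paasche's formula, about 166.67
laspeyres = 2 * dorbish_bowley - paasche   # from (L + P) / 2 = 145
sum_p1q0 = laspeyres * sum_p0q0 / 100      # from Laspeyre's formula

fisher = sqrt(laspeyres * paasche)
marshall_edgeworth = (sum_p1q0 + sum_p1q1) / (sum_p0q0 + sum_p0q1) * 100

print(round(fisher, 2), round(marshall_edgeworth, 2))
```

Carrying Σ p0 q1 at full precision (500/3) rather than the rounded 167 changes the Marshall-Edgeworth result only in the second decimal place.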
Marshall-Edgeworth Method
In this method the sum of the base year and current year quantities is used as the weight to
calculate the index. The formula for constructing the index is:
\[ \text{Marshall-Edgeworth price index} = \frac{\Sigma (q_0 + q_1) p_1}{\Sigma (q_0 + q_1) p_0} \times 100 = \frac{\Sigma q_0 p_1 + \Sigma q_1 p_1}{\Sigma q_0 p_0 + \Sigma q_1 p_0} \times 100 \]
where notations have their usual meaning.
The disadvantage with this formula is the same as that of Paasche index and Fisher’s ideal index
in the sense that it also needs current quantity weights every time an index is constructed.
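As a rough illustration of the Marshall-Edgeworth formula, the sketch below weights each price by q0 + q1. The function name and the four-commodity data set are illustrative assumptions, not part of the text.

```python
def marshall_edgeworth(p0, p1, q0, q1):
    """Marshall-Edgeworth price index: each price is weighted by q0 + q1."""
    numerator = sum(p * (a + b) for p, a, b in zip(p1, q0, q1))
    denominator = sum(p * (a + b) for p, a, b in zip(p0, q0, q1))
    return numerator / denominator * 100

# Illustrative data for four commodities (assumed values)
p0 = [4, 3, 4, 2]
q0 = [8, 7, 6, 5]
p1 = [9, 4, 8, 4]
q1 = [10, 8, 7, 5]

me_index = marshall_edgeworth(p0, p1, q0, q1)
print(round(me_index, 2))
```

Because the weights combine both periods, the result falls between the Laspeyre’s and Paasche’s indexes computed from the same data.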
Walsch’s Method
In this method the quantity weight used is the geometric mean of the base and current year quantities.
The formula for constructing the index is
\[ \text{Walsch's price index} = \frac{\Sigma p_1 \sqrt{q_0 q_1}}{\Sigma p_0 \sqrt{q_0 q_1}} \times 100 \]
Although this index satisfies the time reversal test, it needs current quantity weight every time an
index is constructed.
Kelly’s Method
The method suggested by T.L. Kelly for the construction of index number is
\[ \text{Kelly's price index} = \frac{\Sigma p_1 q}{\Sigma p_0 q} \times 100 \]
where q = fixed weight.
This method is also called the fixed weight aggregate method because instead of using base period or
current period quantities as weights, it uses weights from a representative period. The representative
weights are referred to as fixed weight. The fixed weights and the base period prices do not have to
come from the same period.
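The Walsch and Kelly formulas differ only in the quantity weight applied to the prices, which a short sketch makes visible. The data and the fixed weights `q_fixed` below are hypothetical, and the function names are illustrative.

```python
from math import sqrt

def walsh_index(p0, p1, q0, q1):
    """Walsch's index: weight is the geometric mean of q0 and q1."""
    weights = [sqrt(a * b) for a, b in zip(q0, q1)]
    num = sum(p * w for p, w in zip(p1, weights))
    den = sum(p * w for p, w in zip(p0, weights))
    return num / den * 100

def kelly_index(p0, p1, q_fixed):
    """Kelly's index: weight is a fixed quantity from a representative period."""
    num = sum(p * q for p, q in zip(p1, q_fixed))
    den = sum(p * q for p, q in zip(p0, q_fixed))
    return num / den * 100

# Hypothetical data for four commodities
p0, q0 = [4, 3, 4, 2], [8, 7, 6, 5]
p1, q1 = [9, 4, 8, 4], [10, 8, 7, 5]
q_fixed = [9, 7, 6, 5]  # assumed representative-period quantities

walsh = walsh_index(p0, p1, q0, q1)
walsh_reversed = walsh_index(p1, p0, q1, q0)  # base and current year swapped
kelly = kelly_index(p0, p1, q_fixed)

# Time reversal: the product of the two ratios (factor 100 removed) is 1
print(round(walsh, 2), round(walsh * walsh_reversed / 10000, 6), round(kelly, 2))
```

Swapping the two periods leaves the Walsch weights unchanged, which is why the time reversal test holds for it.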
Advantages and Disadvantages of Kelly’s Method
Advantages: An important advantage of this index is that it does not need yearly changes in the
weights. One can select a different period for fixed weight other than base period. This can improve
the accuracy of the index. Moreover, the base period can also be changed without changing the fixed
weight. The weights should be appropriate and should indicate the relative importance of various
commodities. This weight may be kept fixed until new data are available to revise the index.
Disadvantages: One disadvantage with this index is that it does not take into account the weight either
of the base year or of the current year.
Example 12.11: It is stated that the Marshall-Edgeworth index number is a good approximation of
the ideal index number. Verify this statement using the following data:
Solution: Table 12.10 presents the information necessary to calculate Fisher and Marshall-Edgeworth
indexes.
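Commodity   p0   q0   p1q1 (current year expenditure)   q1
A           25   40   2,000                             50
B           22   18   1,200                             30
C           54   16   1,320                             44
D           20   40   1,350                             45
E           18   30     630                             15

\[ \text{Fisher's quantity index number, } Q_{01} = \sqrt{\frac{\Sigma q_1 p_0}{\Sigma q_0 p_0} \times \frac{\Sigma q_1 p_1}{\Sigma q_0 p_1}} \times 100 \]
\[ = \sqrt{\frac{5456}{3600} \times \frac{6500}{5260}} \times 100 = \sqrt{\frac{35{,}464}{18{,}936}} \times 100 = 136.85 \]
\[ \text{Marshall-Edgeworth quantity index, } Q_{01} = \frac{\Sigma q_1 (p_1 + p_0)}{\Sigma q_0 (p_1 + p_0)} \times 100 \]
\[ = \frac{\Sigma q_1 p_1 + \Sigma q_1 p_0}{\Sigma q_0 p_1 + \Sigma q_0 p_0} \times 100 = \frac{6500 + 5456}{5260 + 3600} \times 100 = 134.94 \]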
Example 12.13: Compute Laspeyre’s, Paasche’s, Fisher’s, and Marshall-Edgeworth’s index num-
bers from the following data:
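\[ \text{Laspeyre's price index} = \frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times 100 = \frac{224}{209} \times 100 = 107.17 \]
\[ \text{Paasche's price index} = \frac{\Sigma p_1 q_1}{\Sigma p_0 q_1} \times 100 = \frac{259}{246} \times 100 = 105.28 \]
\[ \text{Marshall-Edgeworth's price index} = \frac{\Sigma p_1 (q_0 + q_1)}{\Sigma p_0 (q_0 + q_1)} \times 100 = \frac{\Sigma p_1 q_0 + \Sigma p_1 q_1}{\Sigma p_0 q_0 + \Sigma p_0 q_1} \times 100 \]
\[ = \frac{224 + 259}{209 + 246} \times 100 = 106.15 \]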
\[ \text{Weighted average of price relative index, } P_{01} = \frac{\Sigma \{(p_1 / p_0) \times 100\}(p_0 q_0)}{\Sigma p_0 q_0} = \frac{\Sigma PV}{\Sigma V} = \frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times 100 \]
where V (= p0q0) = base period value
P (= (p1/p0) × 100) = price relative
This formula is equivalent to Laspeyre’s method for any given problem.
If we wish to compute a weighted average of price relative using V = p0q1, then the above formula
becomes
\[ P_{01} = \frac{\Sigma \{(p_1 / p_0) \times 100\}(p_0 q_1)}{\Sigma p_0 q_1} = \frac{\Sigma p_1 q_1}{\Sigma p_0 q_1} \times 100 \]
A 3 20 4.0
B 1.5 40 1.6
C 1.0 10 1.5
Solution: The following table presents the necessary information to calculate the weighted average
price relative index.
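Commodity   q0   p0    p1    V = p0q0   P = (p1/p0) × 100            PV
A           20   3     4     60         (4/3) × 100 = 133.33         8000
B           40   1.5   1.6   60         (1.6/1.5) × 100 = 106.67     6400
C           10   1     1.5   10         (1.5/1) × 100 = 150          1500
Total                        130                                     15,900

\[ \text{Weighted average of price relative index, } P_{01} = \frac{\Sigma PV}{\Sigma V} = \frac{15{,}900}{130} = 122.31 \]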
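The equivalence with Laspeyre’s method noted earlier can be confirmed numerically. The sketch below recomputes the index from the three commodities in this example in both ways; the variable names are illustrative.

```python
# Base year quantities, base year prices, current year prices (commodities A, B, C)
q0 = [20, 40, 10]
p0 = [3, 1.5, 1.0]
p1 = [4.0, 1.6, 1.5]

values = [p * q for p, q in zip(p0, q0)]               # V = p0 * q0
relatives = [c / b * 100 for b, c in zip(p0, p1)]      # P = (p1 / p0) * 100

weighted_relatives = sum(P * V for P, V in zip(relatives, values)) / sum(values)
laspeyres = sum(p * q for p, q in zip(p1, q0)) / sum(values) * 100

print(round(weighted_relatives, 2), round(laspeyres, 2))
```

The base year price cancels in each term P × V, which is why the two routes give the same figure apart from rounding of the relatives.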
Example 12.15: A large manufacturer purchases an identical component from three different suppliers
that differ in unit price and quantity supplied. The relevant data for 2000 and 2001 are given below:
Construct a weighted average price relative index using (a) arithmetic mean and (b) geometric mean.
Solution: Table 12.12 presents the information necessary to calculate the weighted average price relative
index.
(a) Weighted average of price relative index
\[ P_{01} = \frac{\Sigma [(p_1 / p_0) \times 100] \, p_0 q_0}{\Sigma p_0 q_0} = \frac{1{,}12{,}001.70}{990} = 113.13 \]
The value of P01 implies that there has been a 13.13% increase in price from year 2000 to 2001.
(b)
Table 12.13 Calculations of Weighted Geometric Mean of Price Relatives
\[ P_{01} = \text{antilog} \left[ \frac{\Sigma V \times \log P}{\Sigma V} \right] = \text{antilog} \left[ \frac{2032.92}{990} \right] = \text{antilog} \, (2.0535) = 113.11 \]
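\[ \text{Laspeyre's quantity index } Q_L = \frac{\Sigma V_0 (q_1 / q_0)}{\Sigma V_0} \times 100 = \frac{\Sigma q_1 p_0}{\Sigma q_0 p_0} \times 100 \]
\[ \text{Similarly, Paasche's quantity index } Q_P = \frac{\Sigma V_1 (q_1 / q_0)}{\Sigma V_1} \times 100 = \frac{\Sigma q_1 p_1}{\Sigma q_0 p_1} \times 100 \]
\[ \text{Fisher's quantity index } Q_F = \sqrt{Q_L \times Q_P} = \sqrt{\frac{\Sigma q_1 p_0}{\Sigma q_0 p_0} \times \frac{\Sigma q_1 p_1}{\Sigma q_0 p_1}} \times 100 \]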
The formula for computing a weighted average of quantity relative index is also the same as used
to compute a price index. The formula for this type of quantity index is
\[ \text{Weighted average of quantity relative index} = \frac{\Sigma \{(q_1 / q_0) \times 100\}(q_0 p_0)}{\Sigma q_0 p_0} \]
where q1 = quantities for the current period
q0 = quantities for the base period
Example 12.16: Obtain Laspeyre’s price index number and Paasche’s quantity index number from
the following data:
Table 12.14 Calculations on Laspeyre’s Price Index and Paasche’s Quantity Index
         Price        Quantity
Item     p0    p1     q0    q1     p1q0   p0q0   q1p1   q0p1
1        2     5      20    15     100    40     75     100
2        4     8      4     5      32     16     40     32
3        1     2      10    12     20     10     24     20
4        5     10     5     6      50     25     60     50
Total                              202    91     199    202

\[ \text{Laspeyre's price index} = \frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times 100 = \frac{202}{91} \times 100 = 221.98 \]
\[ \text{Paasche's quantity index} = \frac{\Sigma q_1 p_1}{\Sigma q_0 p_1} \times 100 = \frac{199}{202} \times 100 = 98.51 \]
Example 12.17: Compute the quantity index by using Fisher’s formula from the data given below:
Solution: The base year quantity q0 and current year quantity q1 for individual commodity can be
calculated as follows (Table 12.15):
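\[ q_0 \text{ (for 2002)} = \frac{\text{Total value}}{\text{Price}}: \quad \frac{50}{5} = 10; \quad \frac{48}{8} = 6; \quad \frac{18}{6} = 3 \]
\[ q_1 \text{ (for 2003)} = \frac{\text{Total value}}{\text{Price}}: \quad \frac{48}{4} = 12; \quad \frac{49}{7} = 7; \quad \frac{20}{5} = 4 \]
\[ \text{Fisher's quantity index, } Q_{01} = \sqrt{\frac{\Sigma q_1 p_0}{\Sigma q_0 p_0} \times \frac{\Sigma q_1 p_1}{\Sigma q_0 p_1}} \times 100 = \sqrt{\frac{140}{116} \times \frac{117}{97}} \times 100 = 120.65 \]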
Example 12.18: Calculate the weighted average of quantity relative index from the following data:
Quantity (Units) Price (Rs./Unit)
Commodity
2000 2002 2000
A 10 12 100
B 15 20 75
C 8 10 80
D 20 25 60
E 50 60 500
Solution: Table 12.16 presents the information necessary to calculate the weighted average of quantity
relative index.
Table 12.16 Calculations of a Weighted Average of Quantity Relatives Index
\[ \text{Weighted average of quantity relatives index} = \frac{\Sigma \{(q_1 / q_0) \times 100\}(q_0 p_0)}{\Sigma q_0 p_0} = \frac{34{,}99{,}996.25}{28{,}965} = 120.835 \]
Self-Practice Problems 12B
12.9 The following table contains information from the raw material purchase records of a small
factory for the years 2002 and 2003:

Commodity        2002                        2003
            Price         Total         Price         Total
            (Rs./Unit)    Value         (Rs./Unit)    Value
A            5            50             6            72
B            7            84            10            80
C           10            80            12            96
D            4            20             5            30
E            8            56             8            64

Calculate Fisher’s ideal index number.
12.10 The subgroup indexes of the consumer price index number for urban non-manual
employees of an industrial centre for a particular year (with base 1990 = 100) were:

Food                  200
Clothing              130
Fuel and Lighting     120
Rent                  150
Miscellaneous         140

The weights are 60, 8, 7, 10, and 15 respectively. It is proposed to fix dearness allowance in such
a way as to compensate fully the rise in the prices of food and house rent. What should
be the dearness allowance, expressed as a percentage of wage?
\[ = \sqrt{\frac{357}{290} \times \frac{342}{284}} \times 100 = 121.96 \]
12.10 Let the income of the consumer be Rs. 100. He spent Rs. 60 on food and Rs. 10 on house
rent in 1990. The index of food is 200 and that of house rent is 150 for the particular year for
which the data are given. In order to maintain the same consumption standard regarding these two
items, he will have to spend Rs. 120 on food and Rs. 15 on house rent. Further, the expenditure
on the other items (weights 8, 7, and 15) is held constant; in order to maintain the same standard
he will have to spend 120 + 8 + 7 + 15 + 15 = Rs. 165. Hence the dearness allowance should be
65 per cent.
12.11
Item    p0   q0   p1   q1   p1q0      p0q0   p1q1      p0q1
A       1    10   2    5    20        10     10        5
B       1    5    x    2    5x        5      2x        2
Total                       20 + 5x   15     10 + 2x   7

\[ \text{Laspeyre's index} = \frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} = \frac{20 + 5x}{15}; \qquad \text{Paasche's index} = \frac{\Sigma p_1 q_1}{\Sigma p_0 q_1} = \frac{10 + 2x}{7} \]
\[ \text{Given } \frac{(20 + 5x)/15}{(10 + 2x)/7} = \frac{28}{27} \quad \text{or} \quad \frac{20 + 5x}{15} \times \frac{7}{10 + 2x} = \frac{28}{27} \quad \text{or} \quad x = 4 \]

\[ P_{01} = \sqrt{\frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times \frac{\Sigma p_1 q_1}{\Sigma p_0 q_1}} \times 100 = \sqrt{\frac{191}{150} \times \frac{234}{185}} \times 100 = 126.9 \]
12.14
Commodity   Price (Rs.)      Quantity (in 1000 kg)
            1989    1998     1989    1998
            p0      p1       q0      q1      p0q0      p1q0     p0q1     p1q1
Rice        9.3     4.5      100     90      930.0     450.0    837.0    405.0
Wheat       6.4     3.7      11      10      70.4      40.7     64.0     37.0
Pulses      5.1     2.7      5       3       25.5      13.5     15.3     8.1
Total                                        1025.9    504.2    916.3    450.1

\[ \text{Laspeyre's index} = \frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times 100 = \frac{504.2}{1025.9} \times 100 = 49.15 \]
\[ \text{Paasche's index} = \frac{\Sigma p_1 q_1}{\Sigma p_0 q_1} \times 100 = \frac{450.1}{916.3} \times 100 = 49.12 \]
\[ \text{Fisher's index} = \sqrt{\frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times \frac{\Sigma p_1 q_1}{\Sigma p_0 q_1}} \times 100 = \sqrt{49.15 \times 49.12} = 49.134 \]
12.15
Commodity      1996           2000
             p0     q0      p1     q1      p0q0    p0q1    p1q0    p1q1
A            2      74      3      82      148     164     222     246
B            5      125     4      140     625     700     500     560
C            7      40      6      33      280     231     240     198
Total                                      1053    1095    962     1004

\[ \text{Marshall-Edgeworth index} = \frac{\Sigma p_1 q_0 + \Sigma p_1 q_1}{\Sigma p_0 q_0 + \Sigma p_0 q_1} \times 100 = \frac{962 + 1004}{1053 + 1095} \times 100 = 91.53 \]
\[ \text{Fisher's Ideal index} = \sqrt{\frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times \frac{\Sigma p_1 q_1}{\Sigma p_0 q_1}} \times 100 = 91.523 \]
12.16
Type of Crime        2000    2001    Weight, W    Crime Relative, R           RW
Robberies            13      8       6            (8/13) × 100 = 61.54        369.24
Car thefts           15      22      5            (22/15) × 100 = 146.70      733.50
Cycle thefts         249     185     4            (185/249) × 100 = 74.29     297.16
Pocket picking       328     259     1            (259/328) × 100 = 78.96     78.96
Thefts by servants   497     448     2            (448/497) × 100 = 90.15     180.28
Total                                18                                       1659.14

\[ \text{Crime index} = \frac{\Sigma RW}{\Sigma W} = \frac{1659.14}{18} = 92.17 \]
12.17
Commodity     Quantity          Price
            1985     1993       1993
            q0       q1         p1               q1p1    q0p1
A           100      150        900/150 = 6      900     600
B           80       100        500/100 = 5      500     400
C           60       72         360/72 = 5       360     300
D           30       33         297/33 = 9       297     270
Total                                            2057    1570

\[ \text{Paasche's quantity index} = \frac{\Sigma q_1 p_1}{\Sigma q_0 p_1} \times 100 = \frac{2057}{1570} \times 100 = 131.02 \]
12.18
Commodity     Quantity        Price    Percent Relatives    Base Value    Weighted Relatives
            1995    1999      1995
            q0      q1        p0       (q1/q0) × 100        q0p0          (6) = (4) × (5)
            (1)     (2)       (3)      (4)                  (5)
Wheat       29      24        3.80     83                   110.20        9,146.60
Corn        3       2.5       2.91     83                   8.73          724.59
Soyabeans   12      14        6.50     117                  78.00         9,126.00
Total                                                       196.93        18,997.19

\[ \text{Weighted average of relative quantity index} = \frac{\Sigma \{(q_1 / q_0) \times 100\}(q_0 p_0)}{\Sigma q_0 p_0} = \frac{18{,}997.19}{196.93} \approx 96.47 \]
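\[ P_{01} \times P_{10} = \sqrt{\frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times \frac{\Sigma p_1 q_1}{\Sigma p_0 q_1} \times \frac{\Sigma p_0 q_1}{\Sigma p_1 q_1} \times \frac{\Sigma p_0 q_0}{\Sigma p_1 q_0}} = 1 \]
Since P01 × P10 = 1, Fisher’s ideal index satisfies the test.
\[ P_{01} \times Q_{01} = \frac{\Sigma p_1 q_1}{\Sigma p_0 q_0} \]
The factor reversal test is satisfied only by the Fisher’s ideal price index as shown below:
\[ P_{01} = \sqrt{\frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times \frac{\Sigma p_1 q_1}{\Sigma p_0 q_1}} \]
Changing p to q and q to p, we get the quantity index:
\[ Q_{01} = \sqrt{\frac{\Sigma q_1 p_0}{\Sigma q_0 p_0} \times \frac{\Sigma q_1 p_1}{\Sigma q_0 p_1}} \]
Multiplying P01 and Q01, we get
\[ \sqrt{\frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times \frac{\Sigma p_1 q_1}{\Sigma p_0 q_1} \times \frac{\Sigma q_1 p_0}{\Sigma q_0 p_0} \times \frac{\Sigma q_1 p_1}{\Sigma q_0 p_1}} = \sqrt{\frac{\Sigma p_1 q_1}{\Sigma p_0 q_0} \times \frac{\Sigma q_1 p_1}{\Sigma q_0 p_0}} = \frac{\Sigma p_1 q_1}{\Sigma p_0 q_0} \]
This result implies that the Fisher’s index formula could be used for constructing both price and
quantity indexes.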
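Both tests can be verified numerically. The sketch below is one possible check, using a five-commodity data set of the kind tabulated in this section; the helper `fisher_ratio` and its argument order are illustrative assumptions.

```python
from math import sqrt

def fisher_ratio(a0, a1, b0, b1):
    """Fisher's ideal index of variable a, weighted by b, as a ratio (no factor 100)."""
    def agg(x, y):
        return sum(i * j for i, j in zip(x, y))
    return sqrt(agg(a1, b0) / agg(a0, b0) * agg(a1, b1) / agg(a0, b1))

# Base year (0) and current year (1) quantities and prices, commodities A-E
q0 = [20, 13, 12, 8, 5]
p0 = [12, 14, 10, 6, 8]
q1 = [30, 15, 20, 10, 5]
p1 = [14, 20, 15, 4, 6]

p01 = fisher_ratio(p0, p1, q0, q1)   # price index, base year 0
p10 = fisher_ratio(p1, p0, q1, q0)   # price index with the periods interchanged
q01 = fisher_ratio(q0, q1, p0, p1)   # quantity index: p and q interchanged

value_ratio = (sum(p * q for p, q in zip(p1, q1))
               / sum(p * q for p, q in zip(p0, q0)))

print(round(p01 * p10, 6))                          # time reversal test
print(round(p01 * q01, 6), round(value_ratio, 6))   # factor reversal test
```

Interchanging the argument roles, rather than writing three separate formulas, mirrors the way the tests themselves are defined.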
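\[ \text{Fisher's index, } P_{01} = \sqrt{\frac{\Sigma p_1 q_0 \times \Sigma p_1 q_1}{\Sigma p_0 q_0 \times \Sigma p_0 q_1}} \times 100 = \sqrt{\frac{1060 \times 1200}{740 \times 840}} \times 100 = 143 \]
Time Reversal Test, i.e. P01 × P10 = 1
\[ P_{01} = \sqrt{\frac{\Sigma p_1 q_0 \times \Sigma p_1 q_1}{\Sigma p_0 q_0 \times \Sigma p_0 q_1}} = \sqrt{\frac{1060 \times 1200}{740 \times 840}} \]
\[ \text{and } P_{10} = \sqrt{\frac{\Sigma p_0 q_1}{\Sigma p_1 q_1} \times \frac{\Sigma p_0 q_0}{\Sigma p_1 q_0}} = \sqrt{\frac{840 \times 740}{1200 \times 1060}} \]
\[ P_{01} \times Q_{01} = \sqrt{\frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times \frac{\Sigma p_1 q_1}{\Sigma p_0 q_1} \times \frac{\Sigma q_1 p_0}{\Sigma q_0 p_0} \times \frac{\Sigma q_1 p_1}{\Sigma q_0 p_1}} \]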
Solution: Table 12.17 presents all the necessary information for constructing the Fisher’s ideal index
number.
               2002          2003
Commodity    q0     p0     q1     p1     p1q0    p0q0    q1p1    p0q1
A            20     12     30     14     280     240     420     360
B            13     14     15     20     260     182     300     210
C            12     10     20     15     180     120     300     200
D            8      6      10     4      32      48      40      60
E            5      8      5      6      30      40      30      40
Total                                    782     630     1090    870

\[ P_{10} = \sqrt{\frac{\Sigma p_0 q_1}{\Sigma p_1 q_1} \times \frac{\Sigma p_0 q_0}{\Sigma p_1 q_0}} = \sqrt{\frac{870}{1090} \times \frac{630}{782}} = 0.8019 \]
Solution: Table 12.18 presents information necessary for Fisher’s method to calculate the index.
2000 2001
Commodity p0 q0 p1 q1 p1 q 0 p0 q 0 p1 q 1 p0 q 1
A 10 49 12 50 588 490 600 500
B 12 25 15 20 375 300 300 240
C 18 10 20 12 200 180 240 216
D 20 5 40 2 200 100 80 40
1363 1070 1220 996
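\[ P_{10} = \sqrt{\frac{\Sigma p_0 q_1}{\Sigma p_1 q_1} \times \frac{\Sigma p_0 q_0}{\Sigma p_1 q_0}} \]
\[ \text{Thus } P_{01} \times P_{10} = \sqrt{\frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times \frac{\Sigma p_1 q_1}{\Sigma p_0 q_1} \times \frac{\Sigma p_0 q_1}{\Sigma p_1 q_1} \times \frac{\Sigma p_0 q_0}{\Sigma p_1 q_0}} \]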
Example 12.22: With the help of the following data, show that the index number calculated on the
basis of arithmetic mean is not reversible while the index number calculated on the basis of geometric
mean is reversible. Make comparison between arithmetic mean and geometric mean:
A 40 60
B 50 80
C 20 40
D 20 10
[Delhi Univ., B Com (Hons), 2003]
Solution: Calculations for price relatives using 1998 and 1999 as base year are shown in the table
below:
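Commodity   Price in 1998   Price in 1999   Price relative        Price relative
                                            (1998 as base)        (1999 as base)
A           40              60              150                   66.67
B           50              80              160                   62.50
C           20              40              200                   50.00
D           20              10              50                    200.00
Total                                       560                   379.17

\[ \text{A.M. of price relatives [with 1998 as base]} = \frac{560}{4} = 140 \; (= P_{01}) \]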
\[ \text{A.M. of price relatives [with 1999 as base]} = \frac{379.17}{4} = 94.79 \; (= P_{10}) \]
Time Reversal Test is said to be satisfied if P01 × P10 = 1 [ignoring the factor 100]. In the given
situation, for the A.M. of price relatives: P01 × P10 = 1.4 × 0.9479 ≠ 1
G.M. of price relatives (ignoring the factor 100) with 1998 as base is
\[ (1.5 \times 1.6 \times 2 \times 0.5)^{1/4} = 1.244666 \; (= P_{01}) \]
G.M. of price relatives (ignoring the factor 100) with 1999 as base is
\[ (0.6667 \times 0.6250 \times 0.5 \times 2)^{1/4} = 0.8034384 \; (= P_{10}) \]
In the case of the G.M., P01 × P10 = 1.244666 × 0.8034384 = 1, so the relation holds. Further it may
be noted that the index number using A.M. is higher than the index number using G.M.
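The comparison in this example can be reproduced with a few lines of Python. This is only a sketch; the function names are illustrative and the factor 100 is ignored throughout, as in the solution.

```python
from math import prod

prices_1998 = [40, 50, 20, 20]   # commodities A-D
prices_1999 = [60, 80, 40, 10]

def arithmetic_mean_index(base, current):
    """Unweighted A.M. of price relatives (as a ratio)."""
    relatives = [c / b for b, c in zip(base, current)]
    return sum(relatives) / len(relatives)

def geometric_mean_index(base, current):
    """Unweighted G.M. of price relatives (as a ratio)."""
    relatives = [c / b for b, c in zip(base, current)]
    return prod(relatives) ** (1 / len(relatives))

am_product = (arithmetic_mean_index(prices_1998, prices_1999)
              * arithmetic_mean_index(prices_1999, prices_1998))
gm_product = (geometric_mean_index(prices_1998, prices_1999)
              * geometric_mean_index(prices_1999, prices_1998))

print(round(am_product, 4))  # differs from 1: A.M. is not reversible
print(round(gm_product, 4))  # equals 1: G.M. is reversible
```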
Example 12.23: If the ratio between Laspeyre’s and Paasche’s index numbers is 28 : 27, find the missing
figure in the following table.
Base Year Current Year
Commodities Price Quantity Price Quantity
X 1 10 2 5
Y 1 5 – 2
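Solution: Calculations required to find the missing figure are shown below:

Commodity   p0   q0   p1   q1   p0q0   p0q1   p1q0   p1q1
X           1    10   2    5    10     5      20     10
Y           1    5    x    2    5      2      5x     2x

\[ \text{Laspeyre's index, } L = \frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times 100 = \frac{20 + 5x}{15} \times 100 \]
\[ \text{Paasche's index, } P = \frac{\Sigma p_1 q_1}{\Sigma p_0 q_1} \times 100 = \frac{10 + 2x}{7} \times 100 \]
Since L/P = 28/27, therefore
\[ \frac{\dfrac{20 + 5x}{15} \times 100}{\dfrac{10 + 2x}{7} \times 100} = \frac{28}{27} \]
\[ \frac{20 + 5x}{15} \times \frac{7}{10 + 2x} = \frac{28}{27} \]
\[ \frac{4 + x}{3} \times \frac{7}{2(5 + x)} = \frac{28}{27} \]
\[ \frac{4 + x}{5 + x} = \frac{8}{9} \]
9(4 + x) = 8(5 + x)
36 + 9x = 40 + 8x or x = 40 – 36 = 4
Hence, the missing figure is: 4.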
Example 12.24: Calculate Fisher’s price index for the following data and prove that it satisfies both
Time Reversal and Factor Reversal test:
Wheat 8 10 20 30
Sugar 6 9 14 18
Tea 2 5 15 20
\[ \text{Factor Reversal Test: } P_{01} \times Q_{01} = \frac{\Sigma p_1 q_1}{\Sigma p_0 q_0} = V_{01} \]
method, the data of each period is related with that of the immediately preceding period and not with
any fixed period. This means that for the index of 2000 the base would be 1999, for the index of 1999
the base would be 1998, and similarly for the index of 1998 the base would be 1997. Such index numbers
are very useful in comparing current period data with the preceding period’s data. A fixed base index in
such a case does not give an appropriate comparison, because all prices are based on the fixed base
period, which may be far away from the current period and the preceding period.
For constructing an index by the chain base method, a series of indexes is computed for each
period with the preceding period as the base. These indexes are known as link indexes or link relatives. The
steps of calculating link relatives are summarized below:
(i) Express the data of a particular period as a percentage of the preceding period’s data. This
is called the link relative.
(ii) These link relatives can be chained together. This is done by multiplying the link relative of
the current year by the chain index of the previous year and dividing the product by 100.
Thus
\[ \text{Chain index for current year} = \frac{\text{Link relative of current year} \times \text{Chain index of previous year}}{100} \]
The chain index is useful for long-term comparison whereas link relatives are used for a comparison
with the immediately preceding period. The fixed base indexes compiled from the original data and
the chain indexes compiled from link relatives give the same value of index provided there is only
one commodity whose indexes are being constructed.
Remarks Chain relatives differ from fixed base relatives in computation. Chain relatives are computed
from link relatives whereas fixed base relatives are computed directly from original data.
Disadvantage: The main disadvantage of the chain base index is that it is not useful for long term
comparisons of chained percentages in a time series. The process of chaining link relatives is
computationally difficult.
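The two steps of the chain base method, and the remark that chain and fixed base indexes coincide for a single commodity, can be sketched as follows. The price series is hypothetical and the function names are illustrative.

```python
def link_relatives(series):
    """Each period as a percentage of the preceding period (first period = 100)."""
    return [100.0] + [cur / prev * 100 for prev, cur in zip(series, series[1:])]

def chain_indexes(links):
    """Chain index = link relative x previous chain index / 100."""
    chain = [links[0]]
    for link in links[1:]:
        chain.append(link * chain[-1] / 100)
    return chain

prices = [2, 3, 4, 5, 6]  # hypothetical price of one commodity over five years

links = link_relatives(prices)
chain = chain_indexes(links)
fixed_base = [p / prices[0] * 100 for p in prices]

print([round(x, 2) for x in links])
print([round(x, 2) for x in chain])
print([round(x, 2) for x in fixed_base])
```

With several commodities the link relatives are averaged before chaining, and the chained series then generally departs from the fixed base series.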
Solution: Computation of the chain base index number is shown in Table 12.19.
Example 12.26: Prepare fixed base index numbers from the chain base index numbers given below:
Solution: Computation of fixed base indexes is shown in Table 12.20 using the following formula:
\[ \text{Fixed base index (FBI)} = \frac{\text{Current period CBI} \times \text{Previous period FBI}}{100} \]
Example 12.27: Calculate the chain base index number and fixed base index number from the
following data:
Solution: Computation of the chain index number and fixed base index number is shown in Tables
12.21 and 12.22.
Example 12.28: Shift the base from 2004 to 2006 in the data given below:
Year Index (2004 = 100)
2001 87.27
2002 90.91
2003 95.40
2004 100.00
2005 104.00
2006 106.00
2007 112.00
Solution: Calculations required to shift the base from 2004 to 2006 are shown below:
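Base shifting divides every index in the series by the index of the new base year and multiplies by 100. A minimal sketch for the data of this example follows; the function name `shift_base` is an illustrative assumption.

```python
# Index numbers with 2004 = 100, as given in the example
old_series = {2001: 87.27, 2002: 90.91, 2003: 95.40, 2004: 100.00,
              2005: 104.00, 2006: 106.00, 2007: 112.00}

def shift_base(series, new_base_year):
    """Re-express every index relative to the index of the new base year."""
    base_value = series[new_base_year]
    return {year: value / base_value * 100 for year, value in series.items()}

shifted = shift_base(old_series, 2006)
for year, value in shifted.items():
    print(year, round(value, 2))
```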
Example 12.29: Given below are two sets of indices. For the purpose of continuity of records, you
are required to construct a combined series with the year 1993 as the base:
1990 100 –
1991 120 –
1992 125 –
1993 150 –
1994 – 110
1995 – 120
1996 – 95
1997 – 105
Year   Index (1990 = 100)               Combined index (1993 = 100)
1990   100                              100 × (100/150) = 66.66
1991   120                              120 × (100/150) = 80
1992   125                              125 × (100/150) = 83.33
1993   150                              150 × (100/150) = 100
1994   (150 × 110)/100 = 165            165 × (100/150) = 110
1995   (165 × 120)/100 = 198            198 × (100/150) = 132
1996   (198 × 95)/100 = 188.1           188.1 × (100/150) = 125.4
1997   188.1 × (105/100) = 197.5        197.5 × (100/150) = 131.66
Example 12.30: Prepare a spliced series of index numbers with 2003 as base from the following
series:
Year      1998   1999   2000   2001   2002   2003   2004
Index A   100    120    135    –      –      –      –
Index B   –      –      100    115    125    145    –
Index C   –      –      –      –      –      100    110
Solution: The spliced series of index numbers with 2003 as base is shown below:
Year   Index A   Index B   Index C   A spliced to B               Spliced to C (2003 = 100)
1998   100       –         –         (100/135) × 100 = 74.07      (100/145) × 74.07 = 51.082
1999   120       –         –         (100/135) × 120 = 88.89      (100/145) × 88.89 = 61.3
2000   135       100       –         100                          (100/145) × 100 = 68.97
2001   –         115       –         115                          (100/145) × 115 = 79.3
2002   –         125       –         125                          (100/145) × 125 = 86.2
2003   –         145       100       145                          100
2004   –         –         110       –                            110
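The splicing above amounts to rescaling each older series so that it agrees with the newer one in the overlapping year. A sketch under that reading follows; `splice_backward` is an illustrative helper, not a standard function.

```python
def splice_backward(old, new, overlap_year):
    """Rescale the old series to the level of the new one at the overlap year."""
    factor = new[overlap_year] / old[overlap_year]
    merged = {year: value * factor for year, value in old.items()}
    merged.update(new)  # newer series takes over from the overlap year onward
    return merged

index_a = {1998: 100, 1999: 120, 2000: 135}
index_b = {2000: 100, 2001: 115, 2002: 125, 2003: 145}
index_c = {2003: 100, 2004: 110}

# Splice A to B at 2000, then the result to C at 2003 (base 2003 = 100)
spliced = splice_backward(splice_backward(index_a, index_b, 2000), index_c, 2003)
for year in sorted(spliced):
    print(year, round(spliced[year], 2))
```

Carrying full precision through both steps gives figures that can differ from the hand-worked table in the last decimal place, since the table rounds the intermediate values.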
Example 12.31: A price index number series was started in 1997 as base. By 2001, it rose by 15%.
The link relative for year 2002 was 95. In this year, a new series was started. This new series rose by
25% in next year. During year 2004, the price level was 5% higher than 2003. However in 2005, they
were 6% higher than 2004. Splice the two series and calculate the index number for various years by
shifting base to 2003.
Solution: Calculations required for splicing of indices are shown below:
Year   Old series (1997 = 100)                New series (2002 = 100)                  Spliced series (2002 = 100)
1997   100                                    –                                        (100/109.25) × 100 = 91.53
2001   100 + (15/100) × 100 = 115             –                                        (100/109.25) × 115 = 105.26
2002   (95 × 115)/100 = 109.25*               100                                      100
2003   –                                      100 + (25/100) × 100 = 125               125
2004   –                                      125 + (5/100) × 125 = 131.25             131.25
2005   –                                      131.25 + (6/100) × 131.25 = 139.125      139.125
Example 12.32: The following table gives the annual income of a person and the general price index
number for the period 1988 to 1996. Prepare index number to show the changes in the real income
of the person.
Year   Annual Income   Price Index   Real Wage =                          Real Wage Index
       (Rs.)           Number        (Annual income/Price index) × 100    (1988 = 100)
1988   36,000          100           (36,000/100) × 100 = 36,000          100
1989   42,000          120           (42,000/120) × 100 = 35,000          (35,000/36,000) × 100 = 97.22
1990   50,000          145           (50,000/145) × 100 = 34,482.7        (34,482.7/36,000) × 100 = 95.78
1991   55,000          160           (55,000/160) × 100 = 34,375          (34,375/36,000) × 100 = 95.49
1992   60,000          250           (60,000/250) × 100 = 24,000          (24,000/36,000) × 100 = 66.67
1993   64,000          320           (64,000/320) × 100 = 20,000          (20,000/36,000) × 100 = 55.56
1994   68,000          450           (68,000/450) × 100 = 15,111.1        (15,111.1/36,000) × 100 = 41.97
1995   72,000          530           (72,000/530) × 100 = 13,584.9        (13,584.9/36,000) × 100 = 37.74
1996   75,000          600           (75,000/600) × 100 = 12,500          (12,500/36,000) × 100 = 34.72
Self-Practice Problems 12C
12.19 Calculate Fisher’s Ideal index from the data given below and show that it satisfies the time
reversal and factor reversal tests.

Commodity      Base Year              Current Year
             Quantity    Price      Quantity    Price
A            12          10         15          12
B            15          7          20          5
C            24          5          20          9
D            5           16         5           14

12.20 Splice the following two index number series, continuing series A forward and series B
backward.

Year     :   1998   1999   2000   2001   2002   2003
Series A :   100    120    150    –      –      –
Series B :   –      –      100    110    120    150

12.21 Calculate the chain base index number chained to 1994 from the average price of following
three commodities:
12.22 The following table gives the annual income of a clerk and the general index number of price
during 1994–2002. Prepare the index number to show the changes in the real income of the
clerk.

Year   Income (Rs.)   Price Index No.
1994   36,000         100
1995   42,000         104
1996   50,000         115
1997   55,000         160
1998   60,000         280
1999   64,000         290
2000   68,000         300
2001   72,000         320
2002   75,000         330

12.23 From the following average price of groups of commodities given in rupees per unit, find
the chain base index number with 1994 as the base year:

Group 1994 1995 1996 1997 1998
12.19
Commodity   q0   p0   q1   p1   p1q0   p0q0   p1q1   p0q1
A           12   10   15   12   144    120    180    150
B           15   7    20   5    75     105    100    140
C           24   5    20   9    216    120    180    100
D           5    16   5    14   70     80     70     80
Total                           505    425    530    470

\[ P_{10} = \sqrt{\frac{\Sigma p_0 q_1}{\Sigma p_1 q_1} \times \frac{\Sigma p_0 q_0}{\Sigma p_1 q_0}} = \sqrt{\frac{470}{530} \times \frac{425}{505}} = 0.8639 \]
Time Reversal Test
\[ P_{01} \times P_{10} = \sqrt{\frac{505}{425} \times \frac{530}{470} \times \frac{470}{530} \times \frac{425}{505}} = \sqrt{1} = 1 \]
Factor Reversal Test
\[ Q_{01} = \sqrt{\frac{\Sigma q_1 p_0}{\Sigma q_0 p_0} \times \frac{\Sigma q_1 p_1}{\Sigma q_0 p_1}} = \sqrt{\frac{470}{425} \times \frac{530}{505}} = 1.0773 \]
\[ P_{01} \times Q_{01} = \sqrt{\frac{505}{425} \times \frac{530}{470} \times \frac{470}{425} \times \frac{530}{505}} = \sqrt{\frac{530}{425} \times \frac{530}{425}} = \frac{530}{425}, \text{ which is equal to } \frac{\Sigma p_1 q_1}{\Sigma p_0 q_0} \]
12.20
Year   Series A   Series B   Series B Spliced to A       Series A Spliced to B
1998   100        –          –                           (100/150) × 100 = 66.66
1999   120        –          –                           (100/150) × 120 = 80.00
2000   150        100        (150/100) × 100 = 150       (100/150) × 150 = 100.00
2001   –          110        (150/100) × 110 = 165       –
2002   –          120        (150/100) × 120 = 180       –
2003   –          150        (150/100) × 150 = 225       –
12.21
Commodity                     Relatives Based on the Preceding Year
                              1999   2000                          2001                             2002                          2003
Wheat                         100    150                           133.33                           125                           120
Rice                          100    125                           120.00                           125                           120
Sugar                         100    125                           160.00                           125                           120
Total                         300    400                           413.33                           375                           360
Average of link relatives     100    133.33                        137.78                           125                           120
Chain index (1999 = 100)      100    (133.33 × 100)/100 = 133.33   (137.78 × 133.33)/100 = 183.70   (125 × 183.70)/100 = 229.63   (120 × 229.63)/100 = 275.55
12.22
Year   Income (Rs.)   Price Index   Real Income                    Real Income Index
1994   360            100           (360/100) × 100 = 360.00       100.00
1995   420            104           (420/104) × 100 = 403.85       112.18
1996   500            115           (500/115) × 100 = 434.78       120.77
1997   550            160           (550/160) × 100 = 343.75       95.49
1998   600            280           (600/280) × 100 = 214.29       59.52
1999   640            290           (640/290) × 100 = 220.69       61.30
2000   680            300           (680/300) × 100 = 226.67       62.96
2001   720            320           (720/320) × 100 = 225.00       62.52
2002   750            330           (750/330) × 100 = 227.27       63.13
12.23
Group                        1994              1995              1996              1997              1998
                             Price   Link      Price   Link      Price   Link      Price   Link      Price   Link
                                     Relative          Relative          Relative          Relative          Relative
I                            2       100       3       150       4       133.3     5       125       6       120
II                           8       100       10      125       12      120.0     15      125       18      120
III                          4       100       5       125       8       160.0     10      125       12      120
Total                                300               400               413.3             375               360
Average of link relatives            100               133.33            137.77            125               120
Chain index (1994 = 100)             100               (133.33/100) × 100 = 133.33   (137.77/100) × 133.33 = 183.69   (125/100) × 183.69 = 229.61   (120/100) × 229.61 = 275.53
12.24 (a) Average weekly wage can be obtained by using the following formula:
\[ \text{Real wage} = \frac{\text{Money wage}}{\text{Price index}} \times 100 \]

Year   Weekly Take-home Pay (Rs.)   Consumer Price Index   Real wages
1998   109.50                       112.8                  (109.5/112.8) × 100 = 97.07
1999   112.20                       118.2                  (112.2/118.2) × 100 = 94.92
2000   116.40                       127.4                  (116.4/127.4) × 100 = 91.37
2001   125.08                       138.2                  (125.08/138.2) × 100 = 90.51
2002   135.40                       143.5                  (135.4/143.5) × 100 = 94.36
2003   138.10                       149.8                  (138.10/149.8) × 100 = 92.19

(b) Since real wage was maximum in the year 1998, the employees had the greatest buying power in
that year.
(c) The percentage increase in the weekly wages for the year 2003 required to provide the same
buying power that the employees had in 1998: Absolute difference = 97.07 – 92.19 = 4.88.
12.25
Comm   p0   p1   q0   q1   p1q0   p0q0   p1q1   p0q1
A      6    8    10   12   80     60     96     72
B      10   10   5    8    50     50     80     80
C      5    7    8    10   56     40     70     50
D      15   20   12   15   240    180    300    225
E      20   25   15   10   375    300    250    200
Total                      801    630    796    627

\[ P_{01} = \sqrt{\frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times \frac{\Sigma p_1 q_1}{\Sigma p_0 q_1}}; \qquad P_{10} = \sqrt{\frac{\Sigma p_0 q_1}{\Sigma p_1 q_1} \times \frac{\Sigma p_0 q_0}{\Sigma p_1 q_0}} \]
Time Reversal Test: P01 × P10 = 1
\[ P_{01} \times P_{10} = \sqrt{\frac{801}{630} \times \frac{796}{627} \times \frac{627}{796} \times \frac{630}{801}} = \sqrt{1} = 1 \]
\[ \text{Factor Reversal Test: } P_{01} \times Q_{01} = \frac{\Sigma p_1 q_1}{\Sigma p_0 q_0} = \frac{796}{630} \]
\[ Q_{01} = \sqrt{\frac{\Sigma q_1 p_0}{\Sigma q_0 p_0} \times \frac{\Sigma q_1 p_1}{\Sigma q_0 p_1}} = \sqrt{\frac{627}{630} \times \frac{796}{801}} \]
\[ P_{01} \times Q_{01} = \sqrt{\frac{801}{630} \times \frac{796}{627} \times \frac{627}{630} \times \frac{796}{801}} = \frac{796}{630}, \text{ which is equal to } \frac{\Sigma p_1 q_1}{\Sigma p_0 q_0} \]
12.26
            1998          1999
Comm      p0    q0      p1    q1      p1q0   p0q0   p1q1   p0q1
A         12    20      14    30      280    240    420    360
B         14    13      20    15      260    182    300    210
C         10    12      15    20      180    120    300    200
D         6     8       4     10      32     48     40     60
E         8     5       6     5       30     40     30     40
Total                                 782    630    1090   870

Fisher’s Ideal Index:
\[ P_{01} = \sqrt{\frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times \frac{\Sigma p_1 q_1}{\Sigma p_0 q_1}} \times 100 = \sqrt{\frac{782}{630} \times \frac{1090}{870}} \times 100 = 124.7 \]
Time Reversal Test:
\[ P_{01} \times P_{10} = \sqrt{\frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times \frac{\Sigma p_1 q_1}{\Sigma p_0 q_1} \times \frac{\Sigma p_0 q_1}{\Sigma p_1 q_1} \times \frac{\Sigma p_0 q_0}{\Sigma p_1 q_0}} = \sqrt{\frac{782}{630} \times \frac{1090}{870} \times \frac{870}{1090} \times \frac{630}{782}} = 1 \]
12.27
Comm   p1   q1   p0   q0   p1q1   p0q1   p1q0   p0q0
A      5    14   3    8    70     42     40     24
B      8    18   6    25   144    108    200    150
C      3    25   1    40   75     25     120    40
D      15   36   12   48   540    432    720    576
E      9    14   7    18   126    98     162    126
F      7    13   5    19   91     65     133    95
Total                      1046   770    1375   1011
the same. The percentage of expenditure on different commodities by an average family constitutes
the individual weights assigned to the corresponding price relatives, and the percentage expenditure
on five well-accepted groups of commodities namely: (i) food, (ii) clothing, (iii) fuel and lighting, (iv)
house rent, (v) miscellaneous.
The weight applied to each commodity in the market basket is derived from a usage survey of
families throughout the country. The consumer price index or cost of living index numbers are
constructed by the following two methods:
Aggregate expenditure method or weighted aggregate method
This method is similar to the Laspeyre’s method of constructing a weighted index. To apply this
method, the quantities of various commodities consumed by a particular class of people are assigned
weights on the basis of quantities consumed in the base year. Mathematically it is stated as:
\[ \text{Consumer price index} = \frac{\text{Total expenditure in current period}}{\text{Total expenditure in base period}} \times 100 = \frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times 100 \]
where p1 and p0 = prices in the current period and base period, respectively
q0 = quantities consumed in the base period
Family budget method or method of weighted average of price relatives
To apply this method the family budgets of a large number of people, for whom the index is meant, are
carefully studied. Then the aggregate expenditure of an average family on various commodities is
estimated. These values constitute the weights. Mathematically, consumer price index is stated as:
\[ \text{Consumer price index} = \frac{\Sigma PV}{\Sigma V} \]
where P = price relatives, (p1/p0) × 100
V = value weight, p0q0
Since each price relative P already carries the factor 100, no further multiplication by 100 is needed.
Example 12.33: Owing to a change in prices the consumer price index of the working class in a
certain area rose in a month by one quarter of what it was before, to 225. The index of food became 252
from 198, that of clothing went from 185 to 205, of fuel and lighting from 175 to 195, and that of
miscellaneous from 138 to 212. The index of rent, however, remained unchanged at 150. It was
known that the weights of clothing, rent, and fuel and lighting were the same. Find out the exact
weight of all the groups.
Solution: Suppose the weights of items included in the group are as follows:
• Food x • Fuel and Lighting z • Rent z
• Miscellaneous y • Clothing z
Therefore, the weighted index in the beginning of the month would be:
Index Weight IW
I W
Food 198 x 198x
Clothing 185 z 185z
Fuel and Lighting 175 z 175z
Rent 150 z 150z
Miscellaneous 138 y 138y
x + y + 3z 198x+138y+510z
x + y + 3z 252x+212y+550z
Example 12.34: Incomplete information obtained from a partially destroyed records on cost of living
analysis is given below:
Group Group Index Percent of Total
Expenditure
Food 268 60
Clothing 280 Not available
Housing 210 20
Fuel and Electricity 240 5
Miscellaneous 260 Not available
The cost of living index with percent of total expenditure as weight was found to be 255.8.
Estimate the missing weights. [Delhi Univ., B.Com (Hons) 2005]
Solution: Let the weights for clothing be x1 and for miscellaneous be x2. Then
60 + x1 + 20 + 5 + x2 = 100
x1 + x2 = 15 ...(i)
\[ 255.8 = \frac{268 \times 60 + 280 x_1 + 210 \times 20 + 240 \times 5 + 260 x_2}{100} \]
25580 = 16080 + 280 x1 + 4200 + 1200 + 260 x2
= 21480 + 280 x1 + 260 x2
4100 = 260 (x1 + x2) + 20 x1
= 260 × 15 + 20 x1 = 3900 + 20 x1 [Since, x1 + x2 = 15]
200 = 20 x1 or x1 = 10
x2 = 15 – 10 = 5. Hence x1 = 10 and x2 = 5.
Example 12.35: Calculate the index number using (a) Aggregate expenditure method, and (b) Family
budget method for the year 2000 with 1990 as the base year from the following data:
Commodity Quantity (in Units) Price (in Rs./Unit) Price (in Rs./Unit)
1990 1990 2000
A 100 8.00 12.00
B 25 6.00 7.50
C 10 5.00 5.25
D 20 48.00 52.00
E 25 15.00 16.50
F 30 9.00 27.00
Solution: Calculations of cost of living index are shown in Tables 12.23 and 12.24.
Table 12.23 Index Number by Aggregative Expenditure Method
Commodity   Price (Rs. per unit) in    Quantity (in units)   Price Relatives     Weights
            1990        2000           in 1990               P = (p1/p0) × 100   W = p0q0    PW
            p0          p1             q0
A           8.00        12.00          100                   150.00              800         1,20,000
B           6.00        7.50           25                    125.00              150         18,750
C           5.00        5.25           10                    105.00              50          5,250
D           48.00       52.00          20                    108.33              960         1,03,996.80
E           15.00       16.50          25                    110.00              375         41,250
F           9.00        27.00          30                    300.00              270         81,000
Total                                                                            2605        3,70,246.8

By the aggregate expenditure method,
\[ \text{Cost of living index} = \frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times 100 = \frac{3702.5}{2605} \times 100 = 142.13 \]
By the family budget method,
\[ \text{Cost of living index} = \frac{\Sigma PW}{\Sigma W} = \frac{3{,}70{,}246.8}{2605} = 142.129 \]
The small difference observed between the index by the Aggregative Method (142.13) and the
index by the Family Budget Method (142.129) is due to the approximation in the value of price
relatives (= 108.33) in commodity D.
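The remark about rounding can be checked by carrying the price relatives at full precision, in which case the two methods give the same value. The sketch below uses the data of this example; the dictionary layout and variable names are illustrative.

```python
# commodity: (p0 in 1990, p1 in 2000, q0 in 1990)
data = {
    "A": (8.00, 12.00, 100),
    "B": (6.00, 7.50, 25),
    "C": (5.00, 5.25, 10),
    "D": (48.00, 52.00, 20),
    "E": (15.00, 16.50, 25),
    "F": (9.00, 27.00, 30),
}

# (a) Aggregate expenditure method: sum(p1*q0) / sum(p0*q0) x 100
aggregate_index = (sum(p1 * q0 for _, p1, q0 in data.values())
                   / sum(p0 * q0 for p0, _, q0 in data.values()) * 100)

# (b) Family budget method: sum(P*W) / sum(W), with unrounded price relatives
weights = {k: p0 * q0 for k, (p0, _, q0) in data.items()}
relatives = {k: p1 / p0 * 100 for k, (p0, p1, _) in data.items()}
family_budget_index = (sum(relatives[k] * weights[k] for k in data)
                       / sum(weights.values()))

print(round(aggregate_index, 3), round(family_budget_index, 3))
```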
Example 12.36: The monthly income of a person is Rs. 10,500. It is given that cost of living index for
a particular month is 136. Find out the money spent by that person on food and on clothing.
Item Expenditure (Rs.) Index
Food — 180
Rent 1470 100
Clothing — 150
Fuel and Power 1680 110
Misc 1890 80
\[ \text{Consumer Price Index (CPI)} = \frac{\Sigma I \times E}{\Sigma E} \]
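One way to complete this example is sketched below. It assumes that the whole monthly income of Rs. 10,500 is spent, so that the five expenditures add up to the income; that assumption is not stated explicitly in the problem. Two linear conditions then fix the two unknown expenditures.

```python
income = 10_500            # assumed equal to total expenditure
cpi = 136
known = {"Rent": (1470, 100), "Fuel and Power": (1680, 110), "Misc": (1890, 80)}
index_food, index_clothing = 180, 150

# Condition 1: food + clothing = income - known expenditures
remaining = income - sum(e for e, _ in known.values())

# Condition 2: 180*food + 150*clothing = CPI * total expenditure - known I*E
target = cpi * income - sum(e * i for e, i in known.values())

# Substitute clothing = remaining - food into condition 2 and solve for food
food = (target - index_clothing * remaining) / (index_food - index_clothing)
clothing = remaining - food

print(food, clothing)
```

Substituting the two results back into Σ(I × E)/ΣE returns the stated index of 136, which serves as a check on the assumption.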
Example 12.37: The consumer price index in a particular town and the weights according to differ-
ent groups of items were as follows:
Food 55, Fuel 15, Clothing 10, Rent 12 and Miscellaneous 8. In October 1999, the dearness
allowance was fixed by a mill of that town at 182 per cent of worker’s wages which fully compensated
for the rise in the prices of food and rent but did not compensate for anything else. Another mill of
the same town paid dearness allowance of 46.5 per cent which compensated for the rise in fuel and
miscellaneous groups. It is known that the rise in food is double the rise in fuel and the rise in
miscellaneous group is double the rise in rent. Find the rise in food, fuel, rent and miscellaneous
groups. [Delhi Univ., B.Com (Hons), 2002]
Solution: Let rise in fuel be x and rise in rent be y. Then, rise in food will be 2x and rise in
miscellaneous group will be 2y.
First mill compensated fully for rise in food and rent but did not compensate for anything else.
Dearness allowance was fixed at 182%, i.e. Rs. 282 paid against Rs. 100.
Index after rise for first mill is 282.
Group            Index (I)    Weight (W)    W × I
Food               2x            55         110x
Fuel               100           15         1500
Clothing           100           10         1000
Rent               y             12         12y
Miscellaneous      100            8          800
Total                           100         3300 + 110x + 12y
$$\text{Index} = \frac{\sum W \times I}{\sum W}$$
$$282 = \frac{3300 + 110x + 12y}{100}$$
28200 – 3300 = 110x + 12y
110x + 12y = 24900 (i)
Second mill paid dearness allowance at the rate of 46.5%.
$$\text{Index} = \frac{\sum W \times I}{\sum W}$$
$$146.5 = \frac{7700 + 15x + 16y}{100}$$
14650 – 7700 = 15x + 16y
15x + 16y = 6950 (ii)
Multiplying (i) by 4 and (ii) by 3 and subtracting the second result from the first, we get
440x + 48y = 99600
45x + 48y = 20850
395x = 78750, i.e. x = 199.367
Substituting x = 199.367 in (ii), we get 16y = 6950 – 2990 = 3960 or y = 247.5
(a) Hence the rise in fuel is 199.37 and the rise in food is 398.74.
(b) The rise in rent is 247.5 and the rise in the miscellaneous group is 495.
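The pair of equations 110x + 12y = 24900 and 15x + 16y = 6950 can also be solved mechanically. The sketch below uses Cramer's rule; the helper name `solve_2x2` is an assumption for illustration and is not from the text.

```python
def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the system has no unique solution")
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y

# Equation (i):  110x + 12y = 24900   (first mill)
# Equation (ii):  15x + 16y = 6950    (second mill)
x, y = solve_2x2(110, 12, 24900, 15, 16, 6950)

fuel, food = x, 2 * x
rent, misc = y, 2 * y

# Substitute back into the two weighted-index expressions as a check.
first_mill_index = (3300 + 110 * x + 12 * y) / 100
second_mill_index = (7700 + 15 * x + 16 * y) / 100
print(round(fuel, 2), round(food, 2), round(rent, 2), round(misc, 2))
print(first_mill_index, second_mill_index)
```

Without intermediate rounding, y comes out marginally below the 247.5 of the hand calculation, because the text rounds 15x to 2990 before dividing by 16.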
Conceptual Questions 12B

11. (a) Discuss the various problems faced in the construction of index numbers.
    (b) Explain the problem faced in the construction of cost of living index.
12. Discuss the importance and use of weights in the construction of general price index numbers.
13. What is Fisher’s Ideal index? Why is it called ideal? Show that it satisfies both the time
    reversal test as well as the factor reversal test.
14. Laspeyre’s price index generally shows an upward trend in the price changes while Paasche’s
    method shows a downward trend on them. Elucidate the statement.
15. Explain the Time Reversal Test and Factor Reversal Test with the help of suitable examples.
16. Distinguish between deflating and splicing of index numbers.
17. What is the cost of living index number? Is it the same as the consumer price index number?
18. What is the chain base method of construction of index numbers and how does it differ from
    the fixed base method?
19. It is said that index numbers are a specialized type of averages. How far do you agree with
    this statement? Explain briefly the Time Reversal and Factor Reversal Tests.
20. What are the Factor Reversal and Circular tests of consistency in the selection of an
    appropriate index formula? Verify whether Fisher’s Ideal Index satisfies such tests.
21. What is the major difference between a weighted aggregate index and a weighted average of
    relatives index?
22. What are the tests to be satisfied by a good index number? Examine how far they are met by
    Fisher’s Ideal index number.
23. What are the tests prescribed for a good index number? Describe the index number which
    satisfies these tests.
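Formulae Used

1. Relatives in period n
   Price relative: $P_{0n} = \frac{p_n}{p_0} \times 100$
   Quantity relative: $Q_{0n} = \frac{q_n}{q_0} \times 100$
   Value relative: $V_{0n} = \frac{\sum p_n q_n}{\sum p_0 q_0} \times 100$

2. Unweighted aggregate price index in period n
   $P_{0n} = \frac{\sum p_n}{\sum p_0} \times 100$
   Simple average of price relatives: $P_{0n} = \frac{1}{n} \sum \left( \frac{p_n}{p_0} \times 100 \right)$
   Simple G.M. of price relatives: $P_{0n} = \text{antilog} \left[ \frac{1}{n} \sum \log \left( \frac{p_n}{p_0} \times 100 \right) \right]$
   Simple aggregate quantity index: $Q_{0n} = \frac{\sum q_n}{\sum q_0} \times 100$

3. Weighted aggregate price indexes
   (a) Weighted aggregate method in period n: $P_{0n} = \frac{\sum p_n q}{\sum p_0 q} \times 100$
       Laspeyre’s index: $I_p(L) = \frac{\sum p_n q_0}{\sum p_0 q_0} \times 100$
       Paasche’s index: $I_p(P) = \frac{\sum p_n q_n}{\sum p_0 q_n} \times 100$
       Marshall-Edgeworth’s index: $I_p(\text{M-E}) = \frac{\sum p_n (q_0 + q_n)}{\sum p_0 (q_0 + q_n)} \times 100$
       Dorbish and Bowley’s index: $I_p(\text{D-B}) = \frac{1}{2}(L + P) \times 100$
       Fisher’s ideal index: $I_p(F) = \sqrt{L \times P} \times 100$
       In the last two formulae, L and P denote the Laspeyre’s and Paasche’s ratios before
       multiplication by 100.
   (b) Weighted average of price relatives in period n: $P_{0n} = \frac{\sum \left( \frac{p_n}{p_0} \times 100 \right) W}{\sum W}$
       With base year value as weights: $P_{0n} = \frac{\sum \left( \frac{p_1}{p_0} \times 100 \right) (p_0 q_0)}{\sum p_0 q_0}$
       With current year value as weights: $P_{0n} = \frac{\sum \left( \frac{p_1}{p_0} \times 100 \right) (p_1 q_1)}{\sum p_1 q_1}$

4. Quantity indexes
   (a) Unweighted quantity index in period n: $Q_{0n} = \frac{\sum q_n}{\sum q_0} \times 100$
       Simple average of quantity relatives: $Q_{0n} = \frac{1}{n} \sum \left( \frac{q_n}{q_0} \times 100 \right)$
   (b) Weighted quantity index in period n: $Q_{0n} = \frac{\sum q_n W}{\sum q_0 W} \times 100$

5. Tests for adequacy or consistency
   Time reversal test: $P_{0n} \times P_{n0} = 1$
   Factor reversal test: $P_{0n} \times Q_{0n} = \frac{\sum p_n q_n}{\sum p_0 q_0}$
   Circular test: $P_{01} \times P_{12} \times P_{23} \times \dots \times P_{(n-1)n} \times P_{n0} = 1$

6. Link relative and chain index
   $\text{Link relative} = \frac{\text{Current period price}}{\text{Price of the preceding period}} \times 100$
   $\text{Chain index} = \frac{\text{Current period's link relative} \times \text{Preceding period's chain index}}{100}$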
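As an illustration of the weighted aggregate formulae and the reversal tests listed above, here is a minimal Python sketch. The function names are assumptions, the indexes are returned as ratios (multiply by 100 for the usual form), and the current-period quantities `qn` are made-up values used only to exercise the formulae.

```python
from math import isclose, sqrt

def _dot(a, b):
    """Sum of element-wise products, e.g. sum(p * q)."""
    return sum(x * y for x, y in zip(a, b))

def laspeyres(p0, pn, q0, qn):
    """Base-period quantities as weights (qn is unused, kept for a uniform signature)."""
    return _dot(pn, q0) / _dot(p0, q0)

def paasche(p0, pn, q0, qn):
    """Current-period quantities as weights."""
    return _dot(pn, qn) / _dot(p0, qn)

def fisher(p0, pn, q0, qn):
    """Geometric mean of the Laspeyres and Paasche ratios."""
    return sqrt(laspeyres(p0, pn, q0, qn) * paasche(p0, pn, q0, qn))

# Prices and base quantities of commodities A to C from Table 12.23.
p0 = [8.00, 6.00, 5.00]
pn = [12.00, 7.50, 5.25]
q0 = [100, 25, 10]
qn = [90, 30, 12]  # HYPOTHETICAL current-period quantities

# Time reversal test: swap the roles of the two periods.
forward = fisher(p0, pn, q0, qn)
backward = fisher(pn, p0, qn, q0)
time_reversal_ok = isclose(forward * backward, 1.0)

# Factor reversal test: swap the roles of prices and quantities.
quantity_index = fisher(q0, qn, p0, pn)
value_ratio = _dot(pn, qn) / _dot(p0, q0)
factor_reversal_ok = isclose(forward * quantity_index, value_ratio)

# For comparison, the Laspeyres formula applied forward and backward.
laspeyres_round_trip = laspeyres(p0, pn, q0, qn) * laspeyres(pn, p0, qn, q0)
print(time_reversal_ok, factor_reversal_ok, laspeyres_round_trip)
```

On this data the Fisher index passes both tests, while the Laspeyres round trip differs from 1, which is the usual reason Fisher's formula is called ideal.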
True or False

1. Like all statistical tools, index numbers must be used with great caution.
2. For constructing index numbers, the best method on theoretical grounds is not the best method
   from practical point of view.
3. The Fisher Ideal Index number is a compromise between two well known indexes, not a right
   compromise, economically, for the statistician.
4. The real problem while constructing an index number is whether one should leave weighting to
   chance or seek to rationalize it.
5. Link relatives are based on the idea that one series can be converted into another because
   time reversibility holds.
6. Index numbers are the signs and guide-posts along the business highway that indicate to the
   businessman how he should drive or manage.
7. Index numbers measure change in magnitude of a group of distinct but related variables.
8. Prices should be for the same unit of quantity in index numbers.
9. Quantity relatives are used to measure changes in the volume of consumption.
10. Weighting of index numbers makes them more representative.
11. Cost of living index numbers are based on retail prices of items of consumption.
12. Splicing means constructing one continuous series from two index series on the basis of a
    common base.
13. Chain indexes give the same result as do fixed base index numbers.
14. Weighted average of relatives and weighted aggregative methods render the same result.
15. Paasche’s formula is a weighted aggregate index with quantity weights in the base year.
Concepts Quiz Answers
1. T 2. F 3. F 4. F 5. T 6. T 7. T 8. F 9. T
10. T 11. T 12. T 13. T 14. T 15. F
Review Self-Practice Problems

12.28 Construct an index number for each year from the following average annual price of cotton
with 1989 as the base year.

Year    Price (Rs.)     Year    Price (Rs.)
1989        75          1994        70
1990        50          1995        69
1991        65          1996        75
1992        60          1997        84
1993        72          1998        80

12.29 The price quotations of four different commodities for 1996 and 1997 are given below.
Calculate the index number for 1997 with 1996 as base by using (i) the simple average of price
relatives and (ii) the weighted average of price relatives.

Comm    Unit       Weight        Price (in Rs. per unit)
                   (Rs. 1000)    1996      1997
A       Kg             5         2.00      4.50
B       Quintal        7         2.50      3.20
C       Dozen          6         3.00      4.50
D       Kg             2         1.00      1.80

12.30 From the chain base index numbers given below, find the fixed base index numbers.

Year             :  1996   1997   1998   1999   2000
Chain base index :    80    110    120     90    140

12.31 The following are the group index numbers and the group weights of an average working
class family’s budget. Construct the cost of living number.
Items                        Quantity Consumed per Year    Price (in Rs. per Unit)
                             in the Given Year             Base Year    Given Year
Rice (qtl)                   2.50 × 12                       12           25
Pulses (kg)                  3 × 12                           0.4          0.6
Oil (litre)                  2 × 12                           1.5          2.2
Clothing (metres)            6 × 12                           0.75         1.0
Housing (per month)          —                               20           30
Miscellaneous (per month)    —                               10           15

12.34 Compute the Consumer Price Index number from the following:

Group            Base Year      Current Year    Weight
                 Price (Rs.)    Price (Rs.)     (Per cent)
Food                400            550             35
Rent                250            300             25
Clothing            500            600             15
Fuel                200            350             20
Entertainment       150            225              5

House rent          13    100
Miscellaneous       14    236
[Vikram Univ., MBA, 1996]

12.37 During a certain period the cost of living index goes up from 110 to 200 and the salary of
a worker is also raised from Rs. 3250 to Rs. 5000. Does the worker really gain, and if so, by how
much in real terms?

12.38 An enquiry into the budgets of middle class families in a certain city gave the following
information.

Expenses             Food    Fuel    Clothing    Rent    Miscellaneous
                     35%     10%     20%         15%     20%
Prices (Rs.) 1990 :  150     25      75          30      40
Prices (Rs.) 1991 :  145     23      65          30      45

What is the Cost of Living Index number of 1991 as compared with that of 1990?
The index for 1998 is 160. Thus the sum of the index numbers of the four commodities would
be 160 × 4 = 640. Hence, 500 + 5x = 640 or x = 28. The rise in the price of cloth was Rs. 8 per
metre.

12.33
Items                        Quantity         p0       p1      p1 q1     p0 q1
                             Consumed q1
Rice (qtl)                   2.50 × 12        12.00    25.0    750.0     360.0
Pulses (kg)                  3 × 12            0.40     0.6     21.6      14.4
Oil (litres)                 2 × 12            1.50     2.2     52.8      36.0
Clothing (mt)                6 × 12            0.75     1.0     72.0      54.0
Housing (per month)          —                20       30      360.0     240.0
Miscellaneous (per month)    —                10       15      180.0     120.0
Total                                                         1436.4     824.4

$$\text{Cost of living index} = \frac{\sum p_1 q_1}{\sum p_0 q_1} \times 100 = \frac{1436.4}{824.4} \times 100 = 174.24$$

12.34
Group            p0     p1     P = (p1/p0) × 100    Weight W    PW
Food             400    550        137.5               35       4812.5
Rent             250    300        120.0               25       3000.0
Clothing         500    600        120.0               15       1800.0
Fuel             200    350        175.0               20       3500.0
Entertainment    150    225        150.0                5        750.0
Total                                                 100     13,862.5

$$\text{Consumer price index} = \frac{\sum PW}{\sum W} = \frac{13{,}862.5}{100} = 138.63$$

12.35
Group             Average Per cent     Group Index    Weight    PW
                  Increase in Price    P              W
Food                   32                132            15      1980
Clothing               54                154             3       462
Rent                   47                147             4       588
Fuel and light         78                178             2       356
Miscellaneous          58                158             1       158
Total                                                   25      3544

$$\text{Cost of living index} = \frac{\sum PW}{\sum W} = \frac{3544}{25} = 141.76$$

For maintaining the same standard, the business executive should get
$$\frac{2050 \times 141.76}{100} = \text{Rs. } 2906.08.$$

12.36
Group                Weights (W)    Group Index (P)    PW
Food                     47             247            11609
Fuel and lighting         7             293             2051
Clothing                  8             289             2312
House rent               13             100             1300
Miscellaneous            14             236             3304
Total                    89                            20,576

$$\text{Cost of living index} = \frac{\sum PW}{\sum W} = \frac{20{,}576}{89} = 231.19$$

12.37
$$\text{Real wage of Rs. 3250} = \frac{\text{Actual wage}}{\text{Cost of living index}} \times 100 = \frac{3250}{110} \times 100 = \text{Rs. } 2954.54$$
$$\text{Real wage of Rs. 5000} = \frac{5000}{200} \times 100 = \text{Rs. } 2500$$
which is less than Rs. 2954.54. Since the real wage of Rs. 5000 is less than that of Rs. 3250,
the worker does not really gain; the real wage decreases by Rs. (2954.54 – 2500) = Rs. 454.54.

12.38
Expenses on      1990 (p0)    1991 (p1)    P = (p1/p0) × 100    W      PW
Food                150          145            96.67           35     3383.45
Fuel                 25           23            92.00           10      920.00
Clothing             75           65            86.67           20     1733.40
Rent                 30           30           100.00           15     1500.00
Miscellaneous        40           45           112.50           20     2250.00
Total                                                          100     9786.85

$$\text{Cost of living index for 1991} = \frac{\sum PW}{\sum W} = \frac{9786.85}{100} = 97.87$$
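The real-wage comparison in answer 12.37 is an instance of deflating a money wage by a cost of living index. A small Python sketch of that arithmetic follows; the function name `real_wage` is illustrative and not from the text.

```python
def real_wage(money_wage, cost_of_living_index):
    """Deflate a money wage: actual wage / cost of living index * 100."""
    return money_wage / cost_of_living_index * 100

# Figures from problem 12.37.
before = real_wage(3250, 110)   # salary Rs. 3250 when the index is 110
after = real_wage(5000, 200)    # salary Rs. 5000 when the index is 200
change = after - before         # negative means a loss in real terms

print(round(before, 2), round(after, 2), round(change, 2))
```

A negative `change` indicates that the rise in salary did not keep pace with the rise in the index.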