Unit-6
RESIDUAL ANALYSIS
Structure
6.1 Introduction
6.4.2 Studentised Residuals
6.4.3 PRESS Residuals
6.4.4 R-student
6.1 INTRODUCTION
In Unit 1, you have discovered how a scatter plot helps to understand the
relationship between response and explanatory variables before fitting the
regression model. You have also learnt how to fit a simple and multiple linear
regression model using the ordinary least squares method in Unit 2.
Once you fit the regression model, you need to go ahead with diagnostics of
the fitted regression model to verify the underlying assumptions of the model.
We explore how to use residuals (the difference between the observed and
predicted values of the response variable) or scaled residuals to verify some of
the model assumptions in this unit. The essential point to focus on here is that
the errors or disturbances (εi) and the residuals (ri) are not the same but are
closely related. As we have already discussed, the error of an observed value
is the difference between the observed value and the actual value (or expected
value based on the whole population) of the response variable for the given
data. The error term (εi) is an unobservable random variable which cannot be
measured or observed directly. The error terms are assumed to be normally
distributed with zero mean and common variance σ² as well as uncorrelated with
each other (see Fig. 6.1).
Block 2 Model Adequacy Checking
(a) (b)
Fig. 6.2: Scatter Plot.
SAQ 1
State the assumptions of a linear regression model.
Fig. 6.3: Residuals.

Note: Sometimes, rounding errors may change the result when we approximate to a few decimal places.
Ideally, according to the properties of residuals, the sum of the residuals for all
n given observations should always be equal to zero, i.e., Σ_{i=1}^n ri = 0. So, from
the properties of residuals, we know that the residuals have zero mean and
variance σ². The mean of the residuals is obtained as:

E(ri) = E(yi − ŷi) = E(yi) − E(ŷi) = yi − yi = 0

The variance of the residuals is estimated as:

σ̂² = Σ_{i=1}^n (ri − r̄)²/(n − k − 1) = Σ_{i=1}^n ri²/(n − k − 1) = Σ_{i=1}^n (yi − ŷi)²/(n − k − 1) = MSS_Res … (5)

where r̄ = (Σ_{i=1}^n ri)/n = 0 and k is the number of explanatory variables in the model.
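These quantities are easy to check numerically. The sketch below (illustrative data, not part of the unit; numpy and an OLS fit via `numpy.linalg.lstsq` assumed) computes the residuals of a fitted model and the estimate σ̂² = MSS_Res of equation (5):

```python
import numpy as np

# Illustrative data: n = 20 observations, k = 2 explanatory variables.
rng = np.random.default_rng(0)
n, k = 20, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])   # design matrix with intercept
y = X @ np.array([5.0, 2.0, -1.0]) + rng.normal(scale=0.5, size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)             # ordinary least squares
r = y - X @ beta_hat                                         # residuals r_i = y_i - y^_i

mss_res = np.sum(r**2) / (n - k - 1)                         # equation (5)
print(round(float(r.sum()), 6), round(float(mss_res), 4))    # sum of residuals is ~0
```

Because the model contains an intercept, the residual sum is zero up to floating-point error, and MSS_Res estimates the error variance σ².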
(i) The sum of the residuals will be zero, i.e.,
Σ_{i=1}^n ri = Σ_{i=1}^n (yi − ŷi) = 0
(ii) The sum of the cross-products between the values of each one of the
explanatory variables and the residuals will be zero. In other words, if we
consider the corresponding values of an explanatory variable as weight,
the weighted sum of the residuals will be equal to zero, i.e.,
Σ_{i=1}^n xji ri = 0 ; j = 1, 2, …, k.
(iii) The sum of the cross-products between the predicted values of the response
variable and the residuals will be zero, i.e., Σ_{i=1}^n ŷi ri = 0.
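The three properties can be verified for any OLS fit. A minimal sketch with illustrative data (numpy assumed):

```python
import numpy as np

# Fit an OLS model and verify the three residual properties numerically.
rng = np.random.default_rng(1)
n = 15
x1, x2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
y = 3 + 2 * x1 - x2 + rng.normal(scale=0.3, size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat
r = y - y_hat

print(np.isclose(r.sum(), 0))                         # (i)   sum of residuals
print(np.isclose(x1 @ r, 0), np.isclose(x2 @ r, 0))   # (ii)  cross-products with each x_j
print(np.isclose(y_hat @ r, 0))                       # (iii) cross-products with y^_i
# each printed value is True
```

All three hold because the OLS residual vector is orthogonal to every column of the design matrix, and hence to any linear combination of them such as ŷ.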
Table 6.1: Data on the yield (Y), annual rainfall (X1) and area under crop (X2)

S. No.  Yield (Y)  Rainfall (X1)  Area (X2)
1       2340       1340           49
2       2310       1210           49
3       2040       1420           45
4       2630       1200           47
5       2340       1310           46
6       2330       1250           48
7       2340       1310           48
8       2340       1190           48
9       2380       1280           50
10      2360       1340           46
11      2330       1340           41
12      2180       1390           50
13      2100       1430           52
14      3160       1210           51
15      2400       1320           53
16      2390       1260           53
17      2400       1320           53
18      2400       1200           55
19      2440       1290           51
20      2420       1350           46
Determine the predicted values of the response variable and residuals for the
given data.
Unit 6 Residual Analysis
Solution: We can fit the regression model for the given data, as discussed in
Unit 2. We have obtained the best-fitted multiple regression model using the
method of ordinary least squares given as:
Ŷ = 4605.52767 − 1.79313 X1 + 2.10913 X2
We can also determine the predicted values of the response variable as given
in Column 3 of Table 6.2. The observed and predicted values of the response
variable corresponding to the first observation are 2340 and 2306.0832,
respectively (see Columns 2 and 3 of Table 6.2). We now determine the value
of the first residual using the formula given in equation (3) shown as follows:
r1 = 2340 − 2306.0832 = 33.9168
In the same way, we compute other values of the residuals for all observations
given in the data and arrange them in Column (4) of Table 6.2 as follows:
Table 6.2: Computation of the residuals
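The first-residual computation above can be reproduced in a few lines; the coefficients are those of the fitted model, and the small differences from Table 6.2 arise from rounding them:

```python
# Coefficients of the fitted model Y^ = b0 + b1*X1 + b2*X2 (rounded values).
b0, b1, b2 = 4605.52767, -1.79313, 2.10913
x1, x2, y1 = 1340, 49, 2340          # first observation: rainfall, area, yield

y_hat_1 = b0 + b1 * x1 + b2 * x2     # predicted yield for the first observation
r1 = y1 - y_hat_1                    # first residual, equation (3)
print(round(y_hat_1, 2), round(r1, 2))   # 2306.08 33.92
```

Looping the same two lines over all 20 rows of Table 6.1 reproduces Column (4) of Table 6.2.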
It would be helpful if you now solved the following Self Assessment Question
to revise the computation of the residuals.
SAQ 2
Suppose a store manager wants to evaluate the effect of a factor such as the
price of a product (X) on the number of products sold monthly (Y). For this
purpose, the following data on 25 products are recorded to explore the
relationship between the monthly sales and the price of a product:
The standardised residuals (si) have mean zero and variance approximately
equal to one, i.e., E(si) = 0 and Var(si) ≈ 1.

Note that the value of hij when we have only one explanatory variable is given
as:

hij = 1/n + (xi − x̄)(xj − x̄)/SSx ; i, j = 1, 2, …, n … (9)

Since the matrices H and (I − H) are symmetric, i.e., (I − H)′ = (I − H), and
idempotent, i.e., (I − H)(I − H) = (I − H), we have

V(R) = (I − H)σ²

where 0 ≤ hii ≤ 1.
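A sketch of these quantities with illustrative data (numpy assumed): the hat matrix H = X(X′X)⁻¹X′ is formed explicitly, its diagonal is compared with equation (9) for the single-regressor case, and the standardised residuals are computed.

```python
import numpy as np

# Hat matrix, its diagonal h_ii, and standardised residuals for a simple fit.
rng = np.random.default_rng(2)
n, k = 12, 1
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 1 + 2 * x + rng.normal(scale=0.4, size=n)

H = X @ np.linalg.inv(X.T @ X) @ X.T          # H = X (X'X)^-1 X'
h = np.diag(H)

r = y - H @ y                                 # residuals, since y^ = H y
s = r / np.sqrt(np.sum(r**2) / (n - k - 1))   # standardised residuals s_i = r_i / sigma^

# Equation (9) with i = j: h_ii = 1/n + (x_i - x_bar)^2 / SS_x
h_eq9 = 1 / n + (x - x.mean())**2 / np.sum((x - x.mean())**2)
print(np.allclose(h, h_eq9), bool((h > 0).all() and (h < 1).all()))   # True True
```

The diagonal computed from the full matrix agrees with equation (9), and every hii lies strictly between 0 and 1 for non-degenerate data.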
Note that the studentised residuals are usually larger than the ordinary
residuals (ri) or the standardised residuals (si). Hence, it may be easier to
diagnose any violation of the model's assumptions by examining the
studentised residuals rather than the ordinary and standardised residuals. The ith
studentised residual di is also known as the ith internally studentised residual.
Note that the variance of the residuals becomes stable, especially in the case of
large data. In many situations, the values of si and di may not differ
much, and they often communicate similar information. It is also
essential to remember that if the ith data point has a large
residual as well as a large hii, this point will possibly be highly influential on the
fitted regression model. So, in that situation, we generally advise assessing
the studentised residual di.
The variance of the ith residual in the case of only one explanatory variable can be
obtained as:

Var(ri) = MSS_Res [1 − (1/n + (xi − x̄)²/SSx)] … (14)
From equation (14), we can observe that the estimated variance of ri will be
larger if the xi's are close to x̄. On the contrary, it will be smaller if the xi's are
towards the extreme ends of the data. Moreover, if the value of n is large enough,
it controls the impact of the term (xi − x̄)² and makes it comparatively smaller.
As a result, ri may not be considerably different from di for large data.
where Y(i), X(i)j's and the regression parameters β̂(i)j's have their usual meanings,
except that the subscript (i) denotes that the ith observation has been deleted
from the data.
If the ith fitted value of the response variable using the regression model
defined in equation (16) is denoted by ŷ (i) , we define the ith PRESS residual
as:
r(i) = yi − ŷ(i) ; i = 1, 2, ..., n … (17)

In terms of the ordinary residuals,

r(i) = ri/(1 − hii) … (18)
From equation (18), notice that it is not necessary to fit separate
regression models, each excluding one observation, to obtain the
PRESS residuals. We can compute them directly with the help of the ordinary
residuals (ri). From equation (18), we can consider PRESS residuals as
weighted ordinary residuals, where the weights correspond to the diagonal
elements of H. After looking at the relationship between ri and r(i), we notice that the
residuals (corresponding to some points) with large hii will also have large
PRESS residuals, and these points will usually be high-influence points.
Remember, a large difference between ri and r(i) signals a point where the
model fits the data well when that point is included but predicts poorly without it.
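Equation (18) can be verified numerically: the shortcut ri/(1 − hii) reproduces the residuals of equation (17) obtained by actually deleting each observation and refitting. A sketch with illustrative data (numpy assumed):

```python
import numpy as np

# PRESS residuals two ways: the shortcut r_i/(1 - h_ii) versus refitting
# the model n times with one observation deleted each time.
rng = np.random.default_rng(3)
n = 10
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 2 + x + rng.normal(scale=0.5, size=n)

H = X @ np.linalg.inv(X.T @ X) @ X.T
r = y - H @ y
press_shortcut = r / (1 - np.diag(H))            # equation (18)

press_refit = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i                     # delete the ith observation
    b, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    press_refit[i] = y[i] - X[i] @ b             # equation (17): y_i - y^_(i)

print(np.allclose(press_shortcut, press_refit))  # True
```

The agreement is exact (up to floating point), which is why PRESS residuals are cheap to compute even for large n.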
The expected value (mean) of the PRESS residuals will be

E(r(i)) = 0 … (19)

Its variance is

Var(r(i)) = Var[ri/(1 − hii)] = Var(ri)/(1 − hii)² … (20)

Var(r(i)) = σ²(1 − hii)/(1 − hii)² = σ²/(1 − hii) … (21)

The variance of the ith PRESS residual for unknown σ², using the estimate
σ̂² = MSS_Res, is given as:

Var(r(i)) = MSS_Res/(1 − hii) … (22)
From equations (13) and (24), we can say that the formula of the standardised
PRESS residual is the same as that of the studentised residual when we choose the
estimate σ̂² = MSS_Res.
6.4.4 R-student
As mentioned in Section 6.4.2, we conventionally estimate σ² with MSS_Res to
calculate the studentised residual (di), which is generally used to detect an
outlier. You should note that we determine MSS_Res by internally fitting
the regression model based on all n observations. This kind of scaling of the
residuals is termed internal scaling. There is another scaling, termed the externally
studentised residual, also known as R-student, in which we estimate σ²
after removing the ith observation from the given data to obtain the ith
R-student. Since the estimated value of σ² excludes the ith observation,
it is called the externally studentised residual. The R-student residuals are more
powerful in detecting outliers and checking the assumption of equal variance.
The estimate MSS_Res(i) of σ², obtained after deleting the ith observation, can be
computed without refitting as:

MSS_Res(i) = [(n − k − 1)MSS_Res − ri²/(1 − hii)]/(n − k − 2) … (25)
Notice that MSS_Res(i) will be significantly different from MSS_Res if the ith
observation is influential. In this case, the R-student will be more
responsive (sensitive) to that point. Otherwise, there will not be much
difference between d(i) and di. Also note that d(i) follows a t-distribution with
(n − k − 2) degrees of freedom under the usual regression model
assumptions.
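A sketch of the R-student computation with illustrative data (numpy assumed). It relies on the leave-one-out identity SSE(i) = SSE − ri²/(1 − hii), and cross-checks MSS_Res(i) by actually refitting without each observation:

```python
import numpy as np

# External studentisation: re-estimate sigma^2 with the ith observation removed.
rng = np.random.default_rng(4)
n, k = 15, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 0.5, -0.5]) + rng.normal(scale=0.3, size=n)

H = X @ np.linalg.inv(X.T @ X) @ X.T
h = np.diag(H)
r = y - H @ y
sse = np.sum(r**2)                                  # SSE = (n - k - 1) * MSS_Res

mss_res_i = (sse - r**2 / (1 - h)) / (n - k - 2)    # leave-one-out estimate of sigma^2
d_ext = r / np.sqrt(mss_res_i * (1 - h))            # R-student d_(i)

# Cross-check MSS_Res(i) by deleting each observation and refitting:
# the refit has n - 1 observations and k + 1 parameters, so n - k - 2 df.
for i in range(n):
    keep = np.arange(n) != i
    b, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    rr = y[keep] - X[keep] @ b
    assert np.isclose(np.sum(rr**2) / (n - k - 2), mss_res_i[i])
print(np.round(d_ext[:3], 4))
```

The in-loop assertions confirm that the closed-form MSS_Res(i) matches a genuine refit, so the n extra regressions are never needed in practice.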
It is always desirable to check the behaviour of the residuals. Residuals
possibly explain any irregularity of the fitted regression model that might have
occurred and misled us. Before discussing the different residual plots, let us
first delve into the computation of the scaled residuals using the following
example.
Example 2: For the data mentioned in Example 1, obtain the standardised,
studentised, PRESS and R-student residuals. Also, check the presence of any
outliers in the given data.
Solution: It is given that n = 20 and k = 2. We have determined the values of
the residuals in Example 1. From the solution of Example 1, we compute the
estimate of σ using equation (5), shown as follows:

σ̂ = √MSS_Res = √[595657.5015/(20 − 2 − 1)] = √35038.6766 = 187.1862
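The arithmetic can be reproduced in a couple of lines:

```python
import math

# sigma^ of Example 2: n = 20 observations, k = 2 explanatory variables.
ss_res = 595657.5015                     # sum of squared residuals from Example 1
mss_res = ss_res / (20 - 2 - 1)          # equation (5)
sigma_hat = math.sqrt(mss_res)
print(round(mss_res, 4), round(sigma_hat, 4))   # 35038.6766 187.1862
```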
Now, with the help of this example, you should try to understand the above-
described theoretical framework for computing the scaled residuals. We obtain
the scaled residuals at different data points by constructing Table 6.4 with
eight columns. The description of these columns is as follows:
1. The second column contains the residuals determined in Table 6.3 for all
observations of the response variable.
2. We compute the standardised residuals using the formula si = ri/√MSS_Res
given in equation (6) and arrange them in Column 3.
3. Before computing other scaled residuals, we determine the ith diagonal
element of the hat matrix H = X(X′X)⁻¹X′, i.e., hii, in Column 4.
Since the value of the standardised residual for the 14th observation is more
than 3, it indicates an outlier in the given data.
We now give the following Self Assessment Question to practice computing
the scaled residuals for a simple regression model.
SAQ 3
For the data on the sales and price of the product given in SAQ 2, determine
the scaled residuals.
After obtaining the residuals and scaled residuals, we now discuss how to
check the underlying assumptions, namely linearity, constant variance, and
normality, for the fitted regression model. We use graphical methods to verify
the validity of these assumptions. Let us verify how well the regression model
fits the given data. For this purpose, we draw the residual and normal
probability plots. We will discuss these plots in the following sections, one at
a time.

Note: Usually, we prefer to plot the studentised residuals because they have
constant variance.
The residual plot will not be randomly scattered when the fitted model
deviates from any of the assumptions. Some common types of variation in the
residual plots are discussed as follows:
(ii) Variation from the assumption of constant variance
Suppose we observe that the points of a residual plot are likely to create
a non-constant, i.e., an increasing or decreasing band. It indicates a
deviation from the constant variance assumption. When we have the
heteroscedastic data (non-constant variance), we obtain the residual plot
similar to the pattern shown in Figs. 6.5 to 6.7. The models depicting
these patterns may have a poor ability to account for the variability.
Fig. 6.5 shows an outward (right opening) funnel pattern. It indicates that
the variance of the error terms is not constant and increases with the
response variable. Fig. 6.6 has an inward opening (left opening) funnel
pattern, which shows that the error variance decreases with the
response variable.
(a) (b)
Fig. 6.9: Non-linear and non-constant variance patterns.
A trend may also be exhibited in the residual plot, as shown in Fig. 6.10.
In this type of plot, there is likely to be an error in the computation of the
regression model. In this situation, additional regression analysis may be
needed in the model formulation. Note that we also use the residual
plots to identify outliers (large residual values). These outliers occur
towards the extreme ends, away from most of the points. However,
sometimes the outliers may indicate either non-constant variance or a
non-linear relationship between Y and X. So, you should keep these
possibilities in mind and examine them instead of considering such points
only as outliers. You know, sometimes some outliers influence the
regression estimates more, and the modified model may change the status of
the outliers.
Fig. 6.10: (a) Upward trend shape and (b) Downward trend shape.
You may now like to solve the following example to understand the residual
plots better.
Example 3: For the yield data given in Example 1, construct and interpret the
residual plots considering the scaled residuals against (i) the predicted values of
yield (ŷi), (ii) rainfall (X1) and (iii) area under crop (X2).
Solution: (i) To obtain a residual plot, we consider the predicted values (yˆ i ) on
the horizontal axis and any scaled residuals on the vertical axis. We now mark
the points corresponding to the scaled residuals (computed in Table 6.4)
against the fitted values of the response variable obtained in Table 6.3.
In this way, we obtain the residual plots for scaled residuals with respect to the
predicted values of the response variable, which are shown in Fig. 6.11.
Fig. 6.11: Residual plots of the standardised, studentised, PRESS and R-student
residuals against the predicted yield.
Similarly, we can plot the rainfall (X1) against the scaled residuals. The
resulting residual plots are shown in Fig. 6.12.
Fig. 6.12: Residual plots of the standardised, studentised, PRESS and R-student
residuals against the annual rainfall.
The residual plots for scaled residuals corresponding to the area under crop
(X2) are presented in Fig. 6.13.
Fig. 6.13: Residual plots of the standardised, studentised, PRESS and R-student
residuals against the area under crop.
The scaled residual plots shown in Figs. 6.11 and 6.12 appear to follow a
curvilinear pattern approximately. Hence, the assumption of linear regression
does not seem valid, or we can say that the linear regression model does not
fit the given data well. Fig. 6.13 shows a slight deviation from the
assumption of constant variance of the error terms. You can also infer that
Figs. 6.11 to 6.13 indicate a possible outlier corresponding to the 14th
observation.
You may like to pause here and check your understanding of the construction
of residual plots by trying the following Self Assessment Question.
SAQ 4
For the scaled residuals computed in SAQ 3 for the data given on sales and
price of a product, construct the residual plots corresponding to the predicted
values of sales versus (i) standardised residuals and (ii) studentised residuals.
Now that you are familiar with the residual plots, we will discuss the normal
plots used to check the normality assumption.
We can assume that the distribution of residuals will
be approximately normal for a sufficiently large sample. However, you should
be cautious about possible violations of the normality assumption while
dealing with small samples. As you know, the assumption of normality (i.e.,
error terms should be normally distributed) plays an important role in
regression analysis in small samples. You have learnt in "MST-016: Statistical
Inference" that statistical tests rely upon certain assumptions about the
variables, including normality, which are essential for the validity of the results.
When these assumptions are not satisfied, the results may not be reliable.
In regression analysis, the assumption of normality of error terms is essential
since the t-test, F-test, and confidence intervals depend on it. In this section,
we construct the normal plots to verify the normality assumption. Suppose for
a sample of size n, we divide the area of a normal curve into n equal parts, in
which each part consists of an observation with cumulative probability 1/n.
The normality assumption of error terms is essential while fitting a linear regression
model. We can plot the histogram to get an idea of normality. However, the normal
probability plot is widely used to validate the assumption of normality, and it is
easier to interpret than the histogram. You will come across various normal plots
in the literature depending on the variables considered on the X and Y axes of
the plots. They are:
▪ Normal probability plot: In a normal probability plot, we plot the ordered
residuals/standardised residuals against the theoretical cumulative
probabilities of a normal distribution (which is known as cumulative area
under the normal curve or theoretical cumulative density function (CDF)).
We consider either theoretical cumulative probability (pi) or percentile
theoretical cumulative probabilities (Pi) on the vertical axis (Y-axis) and the
ordered standardised residuals on the horizontal axis (X-axis) to obtain the
normal probability plot.
▪ Normal probability-probability plot: This plot is also known as the P-P
plot. This plot marks sample cumulative probabilities of the observed data
(CDF of the ordered residuals/standardised residuals) against the
theoretical cumulative probabilities on the X and Y axes, respectively. Note
that the sample cumulative and theoretical cumulative probabilities are also
known as empirical and expected cumulative probabilities, respectively.
▪ Normal quantile plot: It is similar to the normal probability plot. The main
difference is that the ordered residuals/standardised residuals are plotted
against the theoretical quantiles (expected normal values). This plot is also
known as the Q-Q plot.
The resulting points should lie approximately along a straight line (Y = X) if the
sample satisfies the assumption of normality (see Fig. 6.15). If the considered
sample comes from the normal distribution, the points on the Q-Q plot will closely
follow the straight line (Y = X). In the normal probability plot, we visually
determine the straight line that passes through the central points of all values
rather than the extreme points at both ends. You may come across different
plot patterns while dealing with a normal probability plot.
(i) If all points lie approximately along a straight line, as shown in Fig. 6.16,
the normal probability plot is considered ideal and will satisfy the
normality assumption. A substantial amount of departure from the
straight line indicates non-normality.
(ii) The normal probability plots shown in Figs. 6.17 (a) and (b) do not lie
along a straight line. These patterns indicate that there is a problem with
the tails of the underlying distribution, i.e., it has heavier tails than the
normal distribution.

(a) (b)
Fig. 6.17: Heavier-tailed normal probability plots.
(iii) In Figs. 6.18 (a) and (b), the normal probability plots show sudden
upward and downward changes in the direction of the trend. These sharp
changes in direction indicate positively and negatively skewed
underlying distributions, respectively.
(a) (b)
Fig. 6.18: (a) Positively and (b) negatively skewed normal probability plots.
Example 4: Construct the normal probability plot, P-P plot and normal
quantile plot for the data on the crop yield considered in Example 1.
Solution: In Examples 1 and 2, we have already computed the residuals and
the standardised residuals for the data mentioned in Table 6.1. To draw the
normal probability and quantile plots, we calculate the ordered standardised
residuals, their ranks, percentiles and expected normal values (theoretical
quantile). We arrange the standardised residuals computed in Column 3 of
Table 6.4 in increasing order (up to two decimal places for simplicity), as
shown in Column 2 of Table 6.5. You should note that we can also assign
ranks to the standardised residuals without arranging them in increasing order.
After arranging the ranks in Column 3, we calculate the cumulative
probabilities and percentile cumulative probabilities using equations (27) and
(28) for the corresponding ordered standardised residuals and enter them in
Columns 4 and 5, respectively. We obtain the theoretical quantiles (expected
normal values) using equation (29) in Column 6. We then compute the sample
cumulative probabilities corresponding to the ordered standardised residuals
(Column 2) using negative and positive cumulative z-values given in Tables III
and IV of the appendix given at the end of this volume. The computed sample
cumulative probabilities are placed in Column 7 of Table 6.5.
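The tabulated quantities can be sketched using only the Python standard library, assuming the plotting position pi = (rank − 0.5)/n for the theoretical cumulative probability (an assumption on our part, but one that reproduces the tabulated values):

```python
from statistics import NormalDist

# Theoretical cumulative probabilities and quantiles for ordered residuals.
n = 20
std_norm = NormalDist()                    # standard normal distribution

for rank in (11, 19):
    p = (rank - 0.5) / n                   # assumed plotting position p_i
    q = std_norm.inv_cdf(p)                # theoretical quantile (expected normal value)
    print(rank, round(p, 2), round(q, 4))
```

For rank 11 this gives a cumulative probability of about 0.53 and a theoretical quantile of about 0.0627; for rank 19, about 0.93 and 1.4395, matching the computed tables.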
Table 6.5: Computation of percentile cumulative probability and theoretical quantiles
Fig. 6.19: Normal probability plot and P-P plot of the ordered standardised residuals.
In the same way, we can plot the normal quantile plot (or Q-Q plot) as shown
in Fig. 6.20.
Fig. 6.20: Normal quantile (Q-Q) plot of the ordered standardised residuals against
the expected normal values.
Note that all the resulting points do not seem to lie along a straight line, as shown
in Figs. 6.19 and 6.20. Notice from these figures that some points
deviate slightly from the straight line, and one point lying far away
is an outlier. It indicates that the distribution of the error terms (residuals)
deviates from the normal distribution.
You can now practice constructing the normal probability and normal quantile
plots by trying the following Self Assessment Question.
SAQ 5
For the exercise given in SAQ 4, construct the normal probability and normal
quantile plots.
We now end this unit by summarising what you have learnt in it.
6.7 SUMMARY
• The difference between the ith observed and the corresponding predicted
(fitted) values of the response variable is defined as residual. The ith
residual is given as:
ri = yi − ŷi ; i = 1, 2, ..., n

In matrix form, R = Y − Ŷ, where R = (r1, r2, ..., rn) and Ŷ = (ŷ1, ŷ2, ..., ŷn).
• The estimate of σ² is given as:

σ̂² = MSS_Res = Σ_{i=1}^n ri²/(n − k − 1) = Σ_{i=1}^n (yi − ŷi)²/(n − k − 1)
• The ith standardised residual for the unknown σ² is given as:

si = ri/σ̂, where σ̂² = MSS_Res

• The ith studentised residual is given as:

di = ri/√[MSS_Res(1 − hii)] ; i = 1, 2, ..., n, where 0 ≤ hii ≤ 1
• The ith PRESS residual, also known as the deleted residual, is given as:

r(i) = yi − ŷ(i) ; i = 1, 2, ..., n

In terms of the ordinary residuals, r(i) = ri/(1 − hii).
• The variance of the ith PRESS residual for unknown σ² is given as:

Var(r(i)) = MSS_Res/(1 − hii)
• The ith R-student is computed as:

d(i) = ri/√[MSS_Res(i)(1 − hii)] ; i = 1, 2, ..., n
heights (in meters) of 20 trees. The resulting data are recorded in the
following table:
Table 6.6: Diameter and height data of 20 trees
3. The predicted values of the response variable and residuals are given in
Table 6.7. As mentioned in Unit 3, we can compute the value of σ̂ as:

σ̂ = √41732863.5406 = 6460.0978

We now compute the diagonal elements of the hat matrix (hii), MSS_Res(i),
standardised residuals, studentised residuals, PRESS residuals and R-student
for all observations given in the data and arrange them in Columns (3) to (8) of
Table 6.8 as follows:
Table 6.8: Computation of the scaled residuals
[Residual plots of (a) the standardised residuals and (b) the studentised
residuals against the predicted values.]
[Normal probability plot of the ordered standardised residuals.]
The ordered standardised residuals are plotted against the expected normal
values to obtain a normal quantile plot, as shown in Fig. 6.23.
Fig. 6.23: Normal quantile plot of the ordered standardised residuals against the
theoretical quantiles.
Terminal Questions(TQs)
1. Refer to Sections 6.4.2 and 6.4.4.
2. Refer to Section 6.6.
3. (i) The simple regression model for the given data can be fitted as:

Ŷ = 1.4239 + 0.5945 X
Since the values of standardised residuals for all observations are less
than 3, there is no indication of an outlier in the given data.
(ii) The residual plot of standardised residuals obtained in Table 6.10
against the given values of the explanatory variable (X) is shown in
Fig. 6.24.
Fig. 6.24: Residual plot of the standardised residuals against the diameter.
S. No.  Ordered Standardised Residual  Rank  Theoretical Cumulative Probability  Theoretical Quantile  Sample Cumulative Probability
11      0.06                           11    0.53                                0.0627                0.5239
12      0.23                           12    0.58                                0.1891                0.5910
13      0.65                           13    0.63                                0.3186                0.7422
14      0.69                           14    0.68                                0.4538                0.7549
15      0.74                           15    0.73                                0.5978                0.7704
16      0.75                           16    0.78                                0.7554                0.7734
17      1.02                           17    0.83                                0.9346                0.8461
18      1.14                           18    0.88                                1.1503                0.8729
19      1.29                           19    0.93                                1.4395                0.9015
20      –2.18                          20    0.98                                1.9600                0.9192
Fig. 6.26: Normal probability-probability (P-P) plot.
The ordered standardised residuals are plotted against the expected normal
values to obtain a normal quantile plot, as shown in Fig. 6.27.
Fig. 6.27: Normal quantile (Q-Q) plot of the sample quantiles against the
theoretical quantiles.
Note that the resulting points on the P-P and Q-Q plots are deviating slightly
from the straight lines (see Figs. 6.26 and 6.27). It indicates that the
distribution of error terms (residuals) slightly differs from the normal
distribution.