The Uncertainty of A Result From A Linear Calibration
Abstract
The standard error of a result obtained from a straight-line calibration is given by a
well-known, ISO-endorsed expression. Its derivation and use are explained, and the approach
is extended to any function that is linear in the coefficients, with an example of a weighted
quadratic calibration in ICPAES. When calculating the standard error of an estimate, it is
recommended that, if QC data are available, the repeatability of the instrumental response
be used in the equation rather than the standard error of the regression.
Key words:
Standard error, calibration, regression, measurement uncertainty
Introduction
Calibration of a measuring system is at the heart of many chemical measurements. It has
direct relevance to the traceability of the measurement and contributes to the
measurement uncertainty. A measurement can be seen as a two-step process in which an
instrument is calibrated using one or more standards, followed by presentation of a
sample to the instrument and the assignment of the value of the measurand. Instrumental
analytical methods, particularly chromatographic, spectroscopic and electrochemical
methods, are usually calibrated over a range of concentrations of the analyte. Often the
calibrations are assumed (or arranged to be) linear and in the past, a graph was prepared
by drawing the best straight line by eye through the points. Having obtained a response
from the instrument from the sample to be analysed, the concentration of this sample was
read off the graph, going from the instrument response on the y-axis to the concentration
on the x-axis. Although calibration graphs are no longer drawn by hand, and a spreadsheet
now performs a least squares regression to obtain the equation of the best straight line,
the calibration function is often still referred to as a ‘calibration line’ or
‘calibration curve’.
In this paper the commonly used expression for the standard error of a result obtained
from a straight line calibration is extended to a quadratic calibration, and to the case
where weighted regression is necessary. Spreadsheet recipes are given to accomplish these
calculations.
$$\hat{a} = \bar{y} - \hat{b}\,\bar{x} \tag{4}$$
where the sum is over all data pairs, and $\bar{x}$ and $\bar{y}$ are the average values of
x and y in the calibration set. These estimates minimise the standard error of the
regression, $s_{y/x}$ (also known as the residual standard deviation)
$$s_{y/x} = \sqrt{\frac{\sum_i (y_i - \hat{y}_i)^2}{n-2}} \tag{5}$$
where $\hat{y}_i$ is the value of y obtained from equation 2.
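The estimates above can be sketched in a few lines of Python. This is a minimal illustration, not the spreadsheet recipe of this paper: the function name `fit_line` and the data are hypothetical, and the slope is computed from the standard least-squares expression $b = \sum_i (x_i-\bar{x})(y_i-\bar{y}) / \sum_i (x_i-\bar{x})^2$, which is assumed here because the corresponding equation is not reproduced in this section.

```python
import math

def fit_line(x, y):
    """Unweighted least-squares straight line y = a + b*x.

    Returns (a, b, s_yx): intercept, slope and the standard error of the
    regression (residual standard deviation, equation 5).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three (x, y) pairs of equal length")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                        # standard least-squares slope (assumed form)
    a = y_bar - b * x_bar                # equation 4
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(ss_res / (n - 2))   # equation 5
    return a, b, s_yx

# Hypothetical calibration data (not taken from the paper)
conc = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
resp = [0.1, 2.1, 3.9, 6.2, 7.9, 10.1]
a, b, s_yx = fit_line(conc, resp)
```

The divisor n − 2 reflects the two coefficients estimated from the data, which is why at least three calibration points are required before $s_{y/x}$ is defined.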
Having determined a calibration function, the equation must be inverted to assign a
concentration ($\hat{x}_0$) given a response ($y_0$) from an unknown test sample.
$$\hat{x}_0 = \frac{y_0 - a}{b} \tag{6}$$
Note that the carets on a and b will now be omitted. Equation 6 can be written in terms of
the mean x and y values from the calibration, to remove the constant term a and its
correlation with b when the standard error is calculated.
$$\hat{x}_0 = \frac{y_0 - \bar{y}}{b} + \bar{x} \tag{7}$$
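The step from equation 6 to equation 7 can be written out explicitly. The lines below are a short sketch that substitutes equation 4 (with the carets omitted) for the constant term; no symbols beyond those already defined are introduced.

```latex
\hat{x}_0 = \frac{y_0 - a}{b}
          = \frac{y_0 - (\bar{y} - b\,\bar{x})}{b}
          = \frac{y_0 - \bar{y}}{b} + \bar{x}
```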
The standard error of the estimate of the concentration from the mean of m responses, $y_0$,
is usually given as
$$s_{\hat{x}_0} = \frac{s_{y/x}}{b}\sqrt{\frac{1}{m} + \frac{1}{n} + \frac{(y_0 - \bar{y})^2}{b^2 \sum_{i=1}^{n} (x_i - \bar{x})^2}} \tag{8}$$
where there are n points in the calibration, and $\bar{x}$ and $\bar{y}$ are the means of
the calibration data [5]. Equation 8 is quoted with the caveat that it is an approximation,
which stems from the statistical difficulties of applying an error model to the inversion
of equation 2 [6, 7]. A rigorous derivation of the confidence interval on $\hat{x}_0$ was
given by Fieller in 1954 [8].
base assumptions of classical least squares (all error in y). This is why equation 7 is
used rather than equation 6: a and b are correlated, but b and $\bar{y}$ are not.
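$$
\begin{aligned}
V(\hat{x}_0) &= \left(\frac{\partial \hat{x}_0}{\partial y_0}\right)^2 V(y_0)
             + \left(\frac{\partial \hat{x}_0}{\partial \bar{y}}\right)^2 V(\bar{y})
             + \left(\frac{\partial \hat{x}_0}{\partial b}\right)^2 V(b) \\
             &= \frac{V(y_0)}{b^2} + \frac{V(\bar{y})}{b^2}
             + \frac{V(b)}{b^2} \times \frac{(y_0 - \bar{y})^2}{b^2}
\end{aligned} \tag{11}
$$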
The variance in the response to the unknown ($y_0$) is usually estimated by the variance of
the regression $s_{y/x}^2$. If $y_0$ is the mean of m independent observations then
$$V(y_0) = \frac{s_{y/x}^2}{m} \tag{12}$$
Similarly the variance of the mean of the calibration responses ($\bar{y}$) is
$$V(\bar{y}) = \frac{s_{y/x}^2}{n} \tag{13}$$
The variance of the slope is [9]
$$V(b) = \frac{s_{y/x}^2}{\sum_i (x_i - \bar{x})^2} \tag{14}$$
and therefore
$$V(\hat{x}_0) = \frac{s_{y/x}^2}{b^2}\left(\frac{1}{m} + \frac{1}{n} + \frac{(y_0 - \bar{y})^2}{b^2 \sum_{i=1}^{n} (x_i - \bar{x})^2}\right) \tag{15}$$
which is equation 8 squared. Note that V(y0) is estimated by equation 12, that is, from
the standard error of the regression. Calibration measurements may, however, have been
made under different conditions from routine measurements, for which a separate estimate
of the standard deviation of the responses ($s_r$) may well be available from in-house QC
measurements of repeatability. If such data are available then, instead of equation 8, it
is clearer and better to calculate $s_{\hat{x}_0}$ as
$$s_{\hat{x}_0} = \frac{1}{b}\sqrt{\frac{s_r^2}{m} + \frac{s_{y/x}^2}{n} + \frac{s_{y/x}^2\,(y_0 - \bar{y})^2}{b^2 \sum_{i=1}^{n} (x_i - \bar{x})^2}} \tag{16}$$
which distinguishes between the variance of the response when the instrument is
presented with the sample (first term) and the component due to the lack of fit of the
calibration line (second term).
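Equations 8 and 16 differ only in the first term under the square root, so both can be illustrated with one function. The sketch below is a hedged example rather than the spreadsheet implementation of Box 1: the function name `se_x0`, the optional argument `s_r` and the numerical values are hypothetical. The absolute value of the slope is used so that a negative slope still gives a positive standard error, which is an assumption added here.

```python
import math

def se_x0(y0, m, b, s_yx, x_cal, y_cal, s_r=None):
    """Standard error of x0 estimated from the mean y0 of m responses.

    With s_r=None this evaluates equation 8; with a repeatability standard
    deviation s_r (for example from QC data) it evaluates equation 16.
    """
    n = len(x_cal)
    x_bar = sum(x_cal) / n
    y_bar = sum(y_cal) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x_cal)
    s_resp = s_yx if s_r is None else s_r      # spread of the sample response
    var = (s_resp ** 2 / m                     # response to the unknown
           + s_yx ** 2 / n                     # mean of the calibration responses
           + s_yx ** 2 * (y0 - y_bar) ** 2 / (b ** 2 * sxx))  # slope term
    return math.sqrt(var) / abs(b)

# Hypothetical calibration data; b and s_yx are rounded values from an
# ordinary least-squares fit of these data.
conc = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
resp = [0.1, 2.1, 3.9, 6.2, 7.9, 10.1]
se_regression = se_x0(5.0, 1, 0.9957, 0.136, conc, resp)          # equation 8
se_with_qc = se_x0(5.0, 1, 0.9957, 0.136, conc, resp, s_r=0.05)   # equation 16
```

Keeping `s_r` separate from `s_yx` makes it explicit which estimate of the response variability enters the first term.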
have been shown to be a better function than linear for some HPLC applications [10]. The
observed instrumental response y (usually a number of ‘counts’ of a detector, or the
absorbance of a spectrophotometric detector) is
$$y = a + b_1 x + b_2 x^2 \tag{17}$$
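As with the straight line calibration, the constant, a, is eliminated by moving the origin
of the calibration to the means of the calibration data
$$y - \bar{y} = b_1 (x - \bar{x}) + b_2 \left(x^2 - \overline{x^2}\right) \tag{18}$$
where $\overline{x^2}$ is the mean of the squared calibration concentrations. In the
analysis of a sample, a response ($y_0$) allows calculation of a concentration ($\hat{x}_0$)
$$\hat{x}_0 = \frac{-b_1 + \sqrt{b_1^2 - 4 b_2 \left(\bar{y} - y_0 - b_1 \bar{x} - b_2 \overline{x^2}\right)}}{2 b_2} \tag{19}$$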
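Applying equation 10 to the variance of $\hat{x}_0$ from equation 19
$$
\begin{aligned}
V(\hat{x}_0) ={}& \left(\frac{\partial \hat{x}_0}{\partial b_1}\right)^2 V(b_1)
              + \left(\frac{\partial \hat{x}_0}{\partial b_2}\right)^2 V(b_2)
              + \left(\frac{\partial \hat{x}_0}{\partial \bar{y}}\right)^2 V(\bar{y})
              + \left(\frac{\partial \hat{x}_0}{\partial y_0}\right)^2 V(y_0) \\
              & + 2\,\frac{\partial \hat{x}_0}{\partial b_1}\,\frac{\partial \hat{x}_0}{\partial b_2}\,C(b_1, b_2)
\end{aligned} \tag{20}
$$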
The assumptions of the regression give $V(\bar{x}) = 0$ and $V(\overline{x^2}) = 0$, and the
further assumption of independence between the indications and the parameters of the
regression is made. Note that equation 19 can be differentiated, and that for a linear
system the covariance matrix of the coefficients (b) is given by
$\sigma^2 (\mathbf{x}^T \mathbf{x})^{-1}$, where $\sigma^2$ is the variance of y, which can
be estimated by $s_{y/x}^2$, and the matrix x is the design matrix of the calibration
(a column of 1’s, followed by columns of the x-values and x²-values used in the
calibration). Table 1 gives expressions for the differentials in equation 20. Kirkup and
Mulholland [10] have derived a similar expression but retained the constant term. In the
practical implementation of their scheme, three covariance terms must be calculated
(C(a,b), C(a,c), C(b,c)), in contrast to the single term in equation 20.
Table 1: Differentials in equation 20 for the calculation of the variance of an estimated
value from a quadratic calibration
$y - \bar{y} = b_1 (x - \bar{x}) + b_2 (x^2 - \overline{x^2})$. The discriminant of the
solution for x is $D = b_1^2 - 4 b_2 (\bar{y} - y_0 - b_1 \bar{x} - b_2 \overline{x^2})$.

X            $\partial \hat{x}_0 / \partial X$
$b_1$        $\dfrac{-1 + \tfrac{1}{2} D^{-1/2} (2 b_1 + 4 b_2 \bar{x})}{2 b_2}$
$b_2$        $\dfrac{b_1 - D^{1/2}}{2 b_2^2} + \dfrac{\tfrac{1}{2} D^{-1/2} (4 y_0 - 4 \bar{y} + 4 b_1 \bar{x} + 8 b_2 \overline{x^2})}{2 b_2}$
$\bar{y}$    $-D^{-1/2}$
$y_0$        $D^{-1/2}$
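The entries of Table 1 can be checked numerically. The sketch below compares the analytical differentials with central finite differences of equation 19; the function names and the numerical values are hypothetical and serve only to exercise the formulas.

```python
import math

def x0_centred(b1, b2, y_bar, y0, x_bar, x2_bar):
    """Equation 19: root of the centred quadratic calibration."""
    d = b1 ** 2 - 4 * b2 * (y_bar - y0 - b1 * x_bar - b2 * x2_bar)
    return (-b1 + math.sqrt(d)) / (2 * b2)

def table1_differentials(b1, b2, y_bar, y0, x_bar, x2_bar):
    """Analytical partial derivatives of x0 as listed in Table 1."""
    d = b1 ** 2 - 4 * b2 * (y_bar - y0 - b1 * x_bar - b2 * x2_bar)
    rd = math.sqrt(d)
    return {
        "b1": (-1 + 0.5 / rd * (2 * b1 + 4 * b2 * x_bar)) / (2 * b2),
        "b2": ((b1 - rd) / (2 * b2 ** 2)
               + 0.5 / rd * (4 * y0 - 4 * y_bar + 4 * b1 * x_bar
                             + 8 * b2 * x2_bar) / (2 * b2)),
        "y_bar": -1 / rd,
        "y0": 1 / rd,
    }

def numeric_differentials(args, h=1e-6):
    """Central finite differences with respect to b1, b2, y_bar and y0."""
    out = {}
    for i, name in enumerate(("b1", "b2", "y_bar", "y0")):
        up = list(args)
        dn = list(args)
        up[i] += h
        dn[i] -= h
        out[name] = (x0_centred(*up) - x0_centred(*dn)) / (2 * h)
    return out

# Hypothetical values in the order b1, b2, y_bar, y0, x_bar, mean of x^2
args = (2.0, -0.01, 45.0, 60.0, 25.0, 850.0)
analytical = table1_differentials(*args)
numerical = numeric_differentials(args)
```

Agreement between the two dictionaries is a quick guard against sign or factor-of-two slips when the differentials are typed into a spreadsheet.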
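$$\mathbf{b} = \left(\mathbf{x}^T \mathbf{W} \mathbf{x}\right)^{-1} \left(\mathbf{x}^T \mathbf{W} \mathbf{y}\right) \tag{21}$$
$$\mathbf{V} = \left(\mathbf{x}^T \mathbf{W} \mathbf{x}\right)^{-1} \frac{\mathbf{y}^T \mathbf{W} \mathbf{y} - \mathbf{b}^T \mathbf{x}^T \mathbf{W} \mathbf{y}}{n - p} \tag{22}$$
with p the number of coefficients in the model and n the number of independent
concentrations.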
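Equations 21 and 22 can be sketched without a matrix library when the weight matrix is diagonal. The code below is an illustration under that assumption (W = diag(w)); the function names `invert` and `weighted_fit` and the data are hypothetical, and a small Gauss-Jordan routine stands in for the spreadsheet function MINVERSE.

```python
def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def weighted_fit(design, y, w):
    """Equations 21 and 22 for a diagonal weight matrix W = diag(w).

    design: rows of the design matrix x; y: responses; w: weights.
    Returns (b, V): coefficients and their variance/covariance matrix.
    """
    n, p = len(design), len(design[0])
    xtwx = [[sum(w[k] * design[k][i] * design[k][j] for k in range(n))
             for j in range(p)] for i in range(p)]
    xtwy = [sum(w[k] * design[k][i] * y[k] for k in range(n)) for i in range(p)]
    inv = invert(xtwx)
    b = [sum(inv[i][j] * xtwy[j] for j in range(p)) for i in range(p)]  # eq. 21
    ytwy = sum(w[k] * y[k] ** 2 for k in range(n))
    s2 = (ytwy - sum(bi * ti for bi, ti in zip(b, xtwy))) / (n - p)
    V = [[inv[i][j] * s2 for j in range(p)] for i in range(p)]          # eq. 22
    return b, V

# Hypothetical data for a quadratic through the origin, weights 1/s^2
x = [10.0, 25.0, 50.0, 100.0, 150.0]
y = [2.9e5, 7.2e5, 1.41e6, 2.72e6, 3.95e6]
s = [3.0e3, 6.0e3, 1.2e4, 2.5e4, 4.0e4]
b, V = weighted_fit([[xi, xi ** 2] for xi in x], y, [1 / si ** 2 for si in s])
```

The off-diagonal element of `V` is the covariance C(b1, b2) needed for the propagation of variance.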
Confidence intervals
To obtain a confidence interval, the standard error of the estimate, $s_{\hat{x}_0}$, is
multiplied by an appropriate point on the distribution function. In the case of
normally-distributed data this is the two-tailed Student's t value for the degrees of
freedom of the calibration (n − 1 or n − 2). At least five calibration solutions of
different concentrations should be used: a straight-line calibration with five points has
only three degrees of freedom, for which the Student’s t value at α = 0.05 is 3.18, and
with fewer points the value rises steeply (4.30 for two degrees of freedom). This value
multiplies $s_{\hat{x}_0}$ to give a wider confidence interval than is obtained with more
points. For example, ten points with eight degrees of freedom have a Student’s t value of
2.31.
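As a small worked illustration of this effect, taking the tabulated two-tailed values t ≈ 3.18 (3 degrees of freedom) and t ≈ 2.31 (8 degrees of freedom), and assuming for the sake of comparison that $s_{\hat{x}_0}$ itself is unchanged, the half-width of the 95% confidence interval scales as follows:

```latex
\hat{x}_0 \pm t_{0.05,\nu}\, s_{\hat{x}_0},
\qquad
\frac{t_{0.05,3}}{t_{0.05,8}} = \frac{3.18}{2.31} \approx 1.38
```

In practice $s_{\hat{x}_0}$ also decreases as n increases (through the 1/n and sum-of-squares terms), so the interval from five points is wider than this ratio alone suggests.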
Example calculations
Two systems are given here to illustrate the methods described above, a linear calibration
of sulfite using a channel biosensor and the weighted quadratic calibration of the ICPAES
analysis of K+. An illustrative spreadsheet is given in the supplementary material to this
paper.
Box 1a
=B3-$B$9
=SUMSQ(E3:E8)
=LINEST(C5:C10,B5:B10,1,1)
Box1b
=(G2-$C$13)/$B$13
=$C$15*(G2-$C$9)/$B$13^2/SQRT($E$9)
=$C$15/SQRT(COUNT($B$3:$B$8))/$B$13
=$C$15/$B$13
=SQRT(SUMSQ(I2:K2))
=TINV(0.05,$C$16)*L2
=H2-M2
Box 1: Calculation of standard error and 95% confidence interval for an estimated concentration in
a linear calibration. (a) Data and calculations for error formula including output from LINEST. (b)
First rows of a calculation of the 95% confidence interval on an estimated concentration.
$$y = b_1 x + b_2 x^2 \tag{23}$$
$$\hat{x}_0 = \frac{-b_1 + \sqrt{b_1^2 + 4 b_2 y_0}}{2 b_2} \tag{24}$$
with V(b1), V(b2) and C(b1,b2) from the covariance matrix equation (22), and V(y0)
estimated from QC or validation data, or the weights. Table 2 gives the equations of the
differentials in equation 25.
Table 2: Differentials in equation 25 for the calculation of the variance of an estimated
value from a quadratic calibration forced through the origin $y = b_1 x + b_2 x^2$. The
discriminant of the function for an indication $y_0$ is $D = b_1^2 + 4 b_2 y_0$.

X          $\partial \hat{x}_0 / \partial X$
$b_1$      $\dfrac{-1 + D^{-1/2}\, b_1}{2 b_2}$
$b_2$      $\dfrac{b_1 - D^{1/2}}{2 b_2^2} + \dfrac{D^{-1/2}\, y_0}{b_2}$
$y_0$      $D^{-1/2}$
The regression line and 95% confidence interval of estimates of concentration are
calculated in the spreadsheet shown in Box 2, and are graphed in Figure 1. The
confidence intervals are quite dependent on the errors in the calibration points, but Figure
1 is typical of a number of data sets processed. The confidence intervals diverge with
increasing concentration, as the contribution of the uncertainty of the quadratic term (b2)
increases (Figure 2). The effect is partly offset by the covariance term, which becomes
increasingly negative.
[Figure 1: plot of detector response (axis scaled in millions) against [K] / mg L⁻¹]
Figure 1: Calibration for the routine ICPAES analysis of potassium. Five calibration points,
measured in triplicate, blank-corrected and fitted with a weighted quadratic regression through zero.
Error bars are the 95% confidence interval of the mean of each point. Dashed lines are the 95%
confidence interval on estimated concentrations from the calibration.
[Figure 2: plot against [K] / mg L⁻¹; the only surviving series label is cov(b1,b2)]
Figure 2: The fractional contributions of the components in the calibration function to the standard
error of the estimates in Figure 1.
Box 2a
=AVERAGE(C7:E7)
=STDEV(C7:E7)
=F7-$B$1
=SQRT(SUMSQ(I3:I7)/5)
Box 2b
=1/$G7^2
=SQRT(B26)
=MMULT(MINVERSE(MMULT(B11:F12,MMULT(B15:F19,A3:B7))),MMULT(B11:F12,MMULT(B15:F19,H3:H7)))
=MINVERSE(MMULT(B11:F12,MMULT(B15:F19,A3:B7)))*(MMULT(B13:F13,MMULT(B15:F19,H3:H7))-MMULT(E22:F22,MMULT(B11:F12,MMULT(B15:F19,H3:H7))))/3
Box 2c
=SQRT(SUMSQ(F30,H30,J30)+K30)
=E30*SQRT(B$26)
=2*E30*G30*C$26
=(-1+0.5/SQRT(C30)*(2*B$22))/2/B$23
=B30*I30
=((B$22-SQRT(C32))/2/B$23^2+A32/SQRT(C32)/B$23)
=(-$B$22+SQRT(C32))/2/$B$23
=G32*SQRT(C$27)
=1/SQRT(C32)
=$B$22^2+4*$B$23*A32
=A32*$I$8
Box 2: Calculation of standard error for a concentration of potassium from an ICPAES analysis
using a weighted quadratic calibration. The data is blank corrected and fitted through the origin. (a)
Data and calculation of standard deviations for the weights for each point. (b) Matrix calculations for
the variance/covariance matrix of the coefficients. (c) Calculation of first rows for standard error of
the estimate of concentration. The 95% confidence intervals (not shown) are calculated from s(x0) as
in Box 1b.
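The propagation in Box 2c can also be sketched in Python. The combination of terms used below (three squared differentials from Table 2 plus one covariance term) is an assumption made by analogy with equation 20 and with the spreadsheet formulas in Box 2c; the function names and all numerical values are hypothetical and are not the ICPAES data of this paper.

```python
import math

def x0_through_origin(b1, b2, y0):
    """Equation 24: concentration from y = b1*x + b2*x**2."""
    return (-b1 + math.sqrt(b1 ** 2 + 4 * b2 * y0)) / (2 * b2)

def se_x0_through_origin(b1, b2, y0, var_b1, var_b2, cov_b1b2, var_y0):
    """Standard error of x0 from the Table 2 differentials.

    Assumed form: squared differentials times V(b1), V(b2) and V(y0),
    plus twice the product of the b1 and b2 differentials times C(b1, b2).
    """
    rd = math.sqrt(b1 ** 2 + 4 * b2 * y0)      # square root of the discriminant D
    d_b1 = (-1 + b1 / rd) / (2 * b2)
    d_b2 = (b1 - rd) / (2 * b2 ** 2) + y0 / (rd * b2)
    d_y0 = 1 / rd
    var = (d_b1 ** 2 * var_b1 + d_b2 ** 2 * var_b2 + d_y0 ** 2 * var_y0
           + 2 * d_b1 * d_b2 * cov_b1b2)
    return math.sqrt(var)

# Hypothetical coefficients, response and (co)variances
b1, b2, y0 = 30000.0, -20.0, 2.0e6
x0 = x0_through_origin(b1, b2, y0)
se = se_x0_through_origin(b1, b2, y0, var_b1=2.5e5, var_b2=25.0,
                          cov_b1b2=-2000.0, var_y0=1.0e8)
```

With these illustrative numbers the negative covariance reduces the combined variance, which mirrors the behaviour described for Figure 2.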
Acknowledgements
The author thanks Dr Michael Wu of the National Measurement Institute, Australia, for
the ICPAES data used here, and Dr Edith Chow for the sulfite data.
References
1 S. De Jong and A. Phatak, in Partial Least Squares Regression, ed. S. Van Huffel,
SIAM, Philadelphia, 1997, pp. 25-36.
2 À. Martínez, J. Riu and F. X. Rius, Chemometrics and Intelligent Laboratory
Systems, 2000, 54, 61-73.
3 W. Bremser and W. Hasselbarth, Analytica Chimica Acta, 1997, 348, 61-69.
4 M. Mulholland and D. B. Hibbert, Journal of Chromatography, 1997, A, 73-82.
5 ISO 11095, Linear calibration using reference materials, International
Organization for Standardization, Geneva, 1996.
6 J. N. Miller and J. C. Miller, Statistics and Chemometrics for Analytical
Chemistry, Pearson Education Ltd, 2005.
7 P. D. Lark, B. R. Craven and R. L. L. Bosworth, The handling of chemical data,
Pergamon Press, Oxford, 1968.
8 E. C. Fieller, Journal of the Royal Statistical Society Series B, 1954, 16, 175-183.
9 D. B. Hibbert and J. J. Gooding, Data Analysis for Chemistry, Oxford University
Press, New York, 2005.
10 L. Kirkup and M. Mulholland, Journal of Chromatography, A, 2004, 1029, 1-11.
11 D. B. Hibbert, Accreditation and Quality Assurance, 2005, 10, 300-301.
12 W. Huber, Accreditation and Quality Assurance, 2004, 9, 726.
13 S. L. R. Ellison, Accreditation and Quality Assurance, 2006, 11, 146-152.