trends in analytical chemistry, vol. 3, no. 10, 1984
Multivariate calibration*.
II. Chemometric methods

Tormod Naes and Harald Martens
Aas, Norway

* For part I see the September issue: Trends Anal. Chem., 3 (1984) 204. In part I there is a printing error: page 206, first column, line 15, "c = cK" should read "x = cK".

In this outline of new approaches to multivariate calibration in chemistry the following topics are treated: Advantages of multivariate calibration over conventional univariate calibration: detect and eliminate selectivity problems. Multivariate calibration methods based on selection of some variables vs. methods based on data compression of all the variables. Direct vs. indirect calibration: pure constituents or known samples for calibration? Calibration methods based on data compression by physical modelling: Beer's law. Use of Beer's law in controlled and natural calibration: the generalized least-squares fit and the best linear predictor. Extending Beer's law to handle unknown selectivity problems. Calibration methods based on data compression by factor modelling: principal component regression and partial least-squares regression. Methods for detecting abnormal samples (outliers). Pre-treatments to linearize data.

The purpose of quantitative measurements (e.g. from a rapid spectroscopic instrument) is to predict some useful information. A mathematical formula (a prediction equation) is required in order to transform the measured data (e.g. spectroscopic readings) into relevant information (e.g. chemical concentrations).

The simplest prediction equation is the univariate linear one, i.e.

ĉ = b̂₀ + x b̂

where ĉ is the estimate of e.g. the concentration c, x is the measured variable and b̂₀ and b̂ are the calibration coefficients. The 'hats' over b₀ and b indicate that their values have been determined in a previous calibration step.

A preceding paper¹ showed how contributions from foreign chemical constituents with similar instrument signals, or other systematic interferences in the measured signals x, can create gross errors in the predicted concentrations ĉ if this conventional univariate calibration approach is used. Therefore much effort has traditionally been devoted to elaborate sample preparation prior to the measurement, purifying the samples in order to allow specific (or selective) measurements free of interferences.

Multivariate calibration offers an alternative solution to this specificity problem: by combining several different measurements (e.g. light absorbance at several wavelengths) interferences can be modelled and eliminated. The multivariate linear prediction equation can be written as

ĉ = b̂₀ + x b̂   (1)

where x = (x₁, ..., xₚ) represents a row vector of p different measured variables, and b̂ = (b̂₁, ..., b̂ₚ)' is the corresponding column vector of calibration coefficients. Non-linear formulae may also be used, but in this paper we give main attention to linear ones. In other words, we assume that the instrument response readings, x, have been linearized in advance, for instance as x = absorbance = log(1/transmission).

The determination of the calibration coefficients b̂₁, b̂₂, ..., b̂ₚ is termed 'multivariate calibration' and can be done in many different ways. By direct multivariate calibration (direct multicomponent analysis) they are obtained as a result of prior measurements of all the individual pure constituents' spectra, k₁, k₂, ... For estimating their concentrations in an unknown mixture, these directly measured spectra are used in the linear mixture model. Several modern instruments, e.g. for UV-Vis spectroscopy, use such direct multicomponent analysis to eliminate known chemical interferences.

The present paper deals mostly with indirect calibration, which may be more laborsome but is also more general than the direct method, because it also allows elimination of interferences from unknown constituents and unknown physical phenomena. In the indirect approach one needs a 'representative' set of 'reference samples' with measured values for both the spectrum and the concentrations of the constituents to be predicted. This set of calibration data is used for statistical estimation of the model parameters that after a mathematical transformation yield the desired calibration coefficients b̂₁, b̂₂, ..., b̂ₚ.

In the present paper some different approaches to multivariate calibration will be described from a theoretical point of view. The importance of 'bad data points' (outliers) and data linearization are also treated.

Multivariate calibration

Models and methods

It is realistic in many analytical situations to assume that the instrument response (spectrum x) is a linear function of the concentrations of the constituents of the mixture plus measurement noise. This linear multicomponent mixture model can be used for both direct and indirect calibration. In spectroscopy it is known as Beer's law. To emphasize its empirical nature we prefer to call it Beer's model. It is identical to an ordinary statistical linear model. From a mathematical point of view nothing is better than linear models. They are easy to understand and can be used both for direct and indirect calibration. This type of method will be considered in more detail in the next section.

For indirect multivariate calibration various regression techniques with c as the dependent variable (regressand) can be used as alternatives to Beer's model. The simplest of these calibration methods is the ordinary multiple linear regression (MLR) method, based on a model directly compatible with the multivariate prediction equation (eqn. 1), i.e. the model

c = b₀ + x b + f   (2)

where f is a random error term. (The relation of MLR to the methods based on Beer's model is discussed in the next section.) The PCR and PLSR methods will also be described. Both the PLSR and PCR methods have been shown to give good results in calibration problems and they have valuable interpretation possibilities. Both are based on regressing the concentration c on linear combinations or projections of the spectral variables in x. They differ in the way these linear combinations are found. We term these linear combinations 'regression factors'.

In Fig. 1 we have illustrated some aspects of the differences between the most important calibration methods. The methods are computationally quite different. But when the relationship between x and c is nicely linear and the measurement noise in x and c is very low, the different methods yield similar results. However, they make different statistical distribution assumptions and allow for different error types, and therefore appear to have somewhat different applications.

Calibration methods based on the linear multicomponent mixture model (Beer's model)

The model. The linear relationship between spectrum and concentrations can be written as

x = cK + e   (3)
where c = (c₁, ..., cq) contains the concentrations of the q constituents, the rows of K are the spectra of the pure constituents, and e is a random noise vector with covariance matrix equal to V. The model is sometimes extended by adding an intercept, and will later be extended further by a term tP representing unknown interferences.

The calibration methods based on Beer's model can be used to calibrate for several constituents simultaneously. It is therefore convenient to use a q-dimensional generalization of the prediction equation (eqn. 1). This predictor can be written as

ĉ = b̂₀ + x B̂   (4)

where b̂₀ = (b̂₀₁, ..., b̂₀q) is the row vector of intercepts and B̂ = {b̂ₖⱼ}, j = 1, ..., q and k = 1, ..., p, represents the calibration coefficients for the q different constituents at the p different instrument response variables. These are derived from the physical constants, K.

As discussed in our previous paper¹, there is a distinction between so-called natural (or random) and controlled calibration. In the random case information about the population of samples is available and incorporated; in the other case no such information exists. Both cases are possible in Beer's model. In the natural case we denote the concentration mean of the sample population [i.e. E(c)] by u and the dispersion matrix of the population [i.e. cov(c)] by D. In controlled calibration situations no population information about the distribution of mixtures is available, and we cannot use or utilize u and D.

Direct calibration. Direct calibration in eqn. 3 requires that the model parameters K and V (and u and D in the random case) are known. In the random calibration case it is natural⁵ to predict unknown concentrations from calibration coefficients estimated by

B̂ = (K'DK + V)⁻¹ K'D   (5)

and

b̂₀ = u(I − KB̂)

where I is the q-dimensional identity matrix. Notice that the expressions contain u and D, the distribution parameters for the constituent concentrations.

In the controlled case the calibration coefficients are usually estimated by⁵

B̂ = V⁻¹K'(KV⁻¹K')⁻¹   (6)

and

b̂₀ = 0

Notice that no information about the population is involved. When used in the prediction equation, eqn. 5 yields the best linear predictors (BLP) for c and eqn. 6 yields the generalized least-squares (GLS) estimator for c. Both are optimal in the sense that they minimize the mean squared error (MSE, i.e. the square of the standard deviation of the difference between the measured and predicted c) in the natural and controlled case, respectively.

Indirect calibration. When using Beer's model for indirect calibration, the model parameters K and V (and u and D in the random case) are unknown and must be estimated from observations of x and c in the set of calibration data, i.e. (x₁, c₁), ..., (xₙ, cₙ), where n is the number of samples. Several alternative methods for estimating the parameters exist. In Naes⁶ the maximum-likelihood (ML) estimators under normality assumptions are used, but other alternatives may also be envisioned. Irrespective of the choice of estimators, the estimated versions of the BLP and GLS estimators are here called EBLP and EGLS, respectively (E for estimated). If we use the ML estimators for the parameters, it is easy to see from established regression theory that EBLP is equal to the ordinary MLR predictor for c on x (see the previous section). In addition, EBLP and EGLS with ML estimates for the model parameters are multivariate generalizations of the so-called inverse and classical methods of univariate calibration, respectively. The inverse and classical methods are derived by using the regression of c on x and of x on c, respectively.

In practice, the dimension of x is often large compared with that of c, and the matrix of calibration data for x is often highly multicollinear. To incorporate this information in the type of methods treated here it is assumed⁶ that e has linear factor structure, i.e. that e = tP + e*, where e* has uncorrelated elements. Such 'rank reduction' models have been shown to be suitable for multicollinear data. The model is then

x = cK + e = cK + tP + e*   (7)

The rows of P may be regarded as spectra of unknown constituents, while K contains spectra of the known constituents. Estimates of the residuals e can be obtained after the estimation of K, by fitting x to c (see eqn. 3) in the calibration data, i.e.

ê = x − cK̂   (8)

The covariance matrix of tP + e* may then be estimated by factor analysis of the residuals ê and finally inserted in eqns. 5 and 6 (instead of e.g. the ML estimator of V). The number of factors in P can be estimated by cross-validation or prediction testing on an independent dataset (see Fig. 2). The methods derived with the factor assumption on the residuals have given good results in e.g. NIR spectroscopy⁶.

The basic distinction between natural and controlled calibration is unimportant if the residuals after fitting the calibration model to the calibration data are very small⁵. But if appreciable errors exist, the distinction is important: the calibration methods for the two cases will yield calibration coefficients b̂₀, ..., b̂ₚ that are optimal for different types of unknown samples (Fig. 3). In the controlled case, the calibration coefficients (e.g. from EGLS) will yield acceptable predictions over a wide range of concentrations and may even allow extrapolations outside the calibration range. In the natural case, the calibration coefficients (e.g. from EBLP) contain estimates of u and D and will be especially good for 'typical' samples near the center of the assumed population, represented e.g. by the mean of the set of calibration samples (c̄). They will be worse for more 'atypical' samples outside this narrow range. Averaged over samples selected from the population, the natural philosophy will usually give the best predictions.

For practical applications of the methods presented in this section we refer to Naes⁶ and Brown⁷. The methods have been found to perform well on chemical data.

Methods based on regression on factors from x

In this section we will describe another important type of calibration technique, namely those methods which are based on regressing the chemical concentrations on linear combinations or projections of x, called 'regression factors'. The regression factors are found by projecting the spectra x onto a space spanned by row vectors p̂₁, ..., p̂A (estimated factor loadings). Two methods of this kind, PCR and PLSR, will be considered.

PCR. In PCR, the regression factors are described by loadings p̂₁, p̂₂, ..., p̂A, which are simply defined as the normalized eigenvectors corresponding to the A largest eigenvalues of the covariance matrix defined by the x's. The p̂ vectors are orthonormal. The dimensionality A is selected such that the space spanned by P̂' = (p̂₁, p̂₂, ..., p̂A) contains most of the information in x. The projection of x into this space, t̂ = (t̂₁, ..., t̂A), is found by least squares using the equation

x = x̄ + t̂P̂ + e   (9)

This is illustrated in Fig. 4.

The 'scores' t̂ are then used as regressors in the equation

c = c̄ + t̂q + f   (10)

obtaining an estimate of the 'chemical loadings' q by the least-squares method. The symbols x̄ and c̄ denote the averages of x and c over the calibration samples.
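As an illustration, the PCR steps of eqns. 9 and 10 can be sketched in a few lines of NumPy. This is our own generic sketch, not code from the paper; the function names are invented for the example.

```python
# Minimal PCR sketch (eqns. 9 and 10). The loadings are taken from the
# SVD of the centered calibration matrix, which gives the same
# eigenvectors as the covariance matrix of the x's.
import numpy as np

def pcr_fit(X, c, A):
    """Regress c on the first A principal-component scores of X."""
    x_mean = X.mean(axis=0)                # x-bar
    c_mean = c.mean()                      # c-bar
    Xc = X - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:A]                             # A x p loading matrix P-hat
    T = Xc @ P.T                           # scores t-hat (eqn. 9)
    # chemical loadings q-hat by least squares (eqn. 10)
    q, *_ = np.linalg.lstsq(T, c - c_mean, rcond=None)
    return x_mean, c_mean, P, q

def pcr_predict(x, x_mean, c_mean, P, q):
    """Project a new spectrum onto the loadings and predict c."""
    t = (x - x_mean) @ P.T
    return c_mean + t @ q
```

Because the loadings are orthonormal, the projection by `Xc @ P.T` is exactly the least-squares fit of eqn. 9.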
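The choice of the dimensionality by prediction testing (the procedure illustrated in Fig. 2) can be sketched in the same spirit. Again this is our generic illustration with invented names, using principal-component factors as the regressors.

```python
# Choosing the number of factors A by prediction testing (cf. Fig. 2):
# calibrate with A = 1, 2, ... and keep the A giving the lowest mean
# squared error (MSE) on an independent test set.
import numpy as np

def mse_per_factor(Xcal, ccal, Xtest, ctest, A_max):
    x_mean, c_mean = Xcal.mean(axis=0), ccal.mean()
    _, _, Vt = np.linalg.svd(Xcal - x_mean, full_matrices=False)
    mses = []
    for A in range(1, A_max + 1):
        P = Vt[:A]                                    # A loading vectors
        T = (Xcal - x_mean) @ P.T                     # calibration scores
        q, *_ = np.linalg.lstsq(T, ccal - c_mean, rcond=None)
        pred = c_mean + ((Xtest - x_mean) @ P.T) @ q  # predict test set
        mses.append(float(np.mean((ctest - pred) ** 2)))
    return mses                                       # smallest MSE -> best A
```

In Fig. 2 the corresponding curve has its minimum at six factors; the same minimum-MSE rule applies here.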
Fig. 2. Prediction ability of MLR and EBLP. This artificial example illustrates a typical situation when predictors are tested on an independent dataset (prediction testing). The straight line illustrates the mean squared error of prediction (MSE) of MLR, while the plotted curve illustrates how the MSE of EBLP changes with the assumed number of rows (factors) in P. In this artificial example the maximum number of rows is 10 and the best predictions are obtained for 6 rows. We refer to Martens and Naes⁹ for further details.

Fig. 3. Mean squared error (MSE) of the BLP and the GLS estimator for c. From the statistical literature⁵ it is known that the GLS estimator is superior to the BLP outside an ellipsoid, while the opposite holds inside the ellipsoid, where 'typical' samples are situated. The figure gives a univariate illustration of this. We note that in this case the ellipsoid is an interval. The BLP curve follows a quadratic function with minimum at u (the mean of c). A similar figure holds for EBLP and EGLS, at least when the calibration data set is large.
Outlier detection methods
diffuse near infrared reflectance (NIRR) spectroscopy. In such cases linearizing transformations of the spectral reflectance measurements are needed to make them suitable for linear models. Many such transformations may be envisioned.

In NIRR a logarithmic linearization of the reflectance measurement is often used. In the statistical literature the Box-Cox transformation² has become popular for fitting a linear model to the data, and it works well in many applications. Norris¹¹ and Martens and co-workers¹²,¹³ have considered further optical corrections for NIRR data to eliminate multiplicative light scatter effects. This type of transformation is based on separation of the measured, linearized spectrum into a multiplicative light scatter noise contribution and an additive light absorption contribution proportional to the chemical composition.

Conclusion

Multivariate calibration is a young discipline, and many of its aspects have not yet been fully tested. But it is our belief that multivariate calibration will

References

4 S. Wold, H. Martens and H. Wold, in A. Ruhe and B. Kågström (Editors), Proc. Conf. Matrix Pencils, Piteå, Sweden, March 1982, Lecture Notes in Mathematics, Springer Verlag, Heidelberg, 1982, p. 286.
5 T. Naes, Biom. J., in press.
6 T. Naes, Biom. J., in press.
7 P. J. Brown, J. R. Stat. Soc. B, 44 (1982) 287.
8 H. Martens and S. A. Jensen, in J. Holas and J. Kratochvil (Editors), Proc. 7th World Cereal and Bread Congr., Prague, June 1982, Developments in Food Science, 5, Elsevier, Amsterdam, 1983, p. 607 (p. 613 and 614 are interchanged).
9 T. Naes and H. Martens, submitted for publication.
10 D. Cook, Technometrics, 19 (1977) 15.
11 K. H. Norris, in H. Martens and H. Russwurm (Editors), Proc. IUFoST Symp. Food Research and Data Analysis, Oslo, September 1982, Applied Science Publishers, 1983, p. 95.
12 H. Martens, S. A. Jensen and P. Geladi, in O. H. J. Christie (Editor), Proc. Symp. Applied Statistics, Stavanger, June 12-14, 1983, Stokkand Forlag Publ., Stavanger, 1983, p. 205.
13 H. Martens and T. Naes, in P. Williams (Editor), a book on Near Infrared Spectroscopy, American Association of Cereal Chemists, St. Paul, 1985, in press.

T. Naes and H. Martens are at the Norwegian Food Research Institute, P.O. Box 50, N-1432 Aas-NLH, Norway.
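The two pre-treatments described in the linearization section can be sketched as follows. The scatter correction shown (regressing each spectrum on the mean spectrum and removing the fitted offset and slope) is one simple realization of the multiplicative/additive separation, chosen by us purely for illustration; the function names are invented.

```python
# Logarithmic linearization of transmission/reflectance readings, and a
# simple multiplicative/additive scatter correction: fit each spectrum
# x_i = a_i + b_i * ref by least squares and return (x_i - a_i) / b_i.
import numpy as np

def absorbance(T):
    """x = log(1/transmission), the logarithmic linearization."""
    return np.log10(1.0 / np.asarray(T))

def scatter_correct(X):
    """Remove additive offset and multiplicative scaling per spectrum,
    using the mean spectrum of X as the reference."""
    X = np.asarray(X, dtype=float)
    ref = X.mean(axis=0)
    out = np.empty_like(X)
    for i, xi in enumerate(X):
        b, a = np.polyfit(ref, xi, 1)   # slope b_i, intercept a_i
        out[i] = (xi - a) / b
    return out
```

After the correction, spectra that differ only by an offset and a scaling factor coincide, so the remaining variation can be attributed to chemical composition.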