
266 trends in analytical chemistry, vol. 3, no. 10, 1984

Multivariate calibration*.
II. Chemometric methods

Tormod Naes and Harald Martens
Aas, Norway

In this outline of new approaches to multivariate calibration in chemistry the following topics are treated: Advantages of multivariate calibration over conventional univariate calibration: detect and eliminate selectivity problems. Multivariate calibration methods based on selection of some variables vs. methods based on data compression of all the variables. Direct vs. indirect calibration: pure constituents or known samples for calibration? Calibration methods based on data compression by physical modelling: Beer's law. Use of Beer's law in controlled and natural calibration: the generalized least-squares fit and the best linear predictor. Extending Beer's law to handle unknown selectivity problems. Calibration methods based on data compression by factor modelling: the principal component regression and partial least-squares regression. Methods for detecting abnormal samples (outliers). Pre-treatments to linearize data.

The purpose of quantitative measurements (e.g. from a rapid spectroscopic instrument) is to predict some useful information. A mathematical formula (a prediction equation) is required in order to transform the measured data (e.g. spectroscopic readings) to relevant information (e.g. chemical concentrations).

The simplest prediction equation is the univariate linear one, i.e.

ĉ = b̂0 + xb̂

where ĉ is the estimate of e.g. the concentration c, x is the measured variable and b̂0 and b̂ are the calibration coefficients. The 'hats' over b0 and b indicate that their values have been determined in a previous calibration step.

A preceding paper¹ showed how contributions from foreign chemical constituents with similar instrument signals or other systematic interferences in the measured signals x can create gross errors in the predicted concentrations ĉ if this conventional univariate calibration approach is used. Therefore much effort has traditionally been devoted to elaborate sample preparation prior to the measurement, purifying the samples in order to allow specific (or selective) measurements free of interferences.

Multivariate calibration offers an alternative solution to this specificity problem: by combining several different measurements (e.g. light absorbance at several wavelengths) interferences can be modelled and eliminated. The multivariate linear prediction equation can be written as

ĉ = b̂0 + xb̂   (1)

where x = (x1, ..., xp) represents a row vector of p different measured variables, and b̂ = (b̂1, ..., b̂p)' is the corresponding column vector of calibration coefficients. Non-linear formulae may also be used, but in this paper we give main attention to linear ones. In other words, we assume that the instrument response readings, x, have been linearized in advance, for instance as x = absorbance = log (1/transmission).

The determination of the calibration coefficients b̂1, b̂2, ..., b̂p is termed 'multivariate calibration' and can be done in many different ways. By direct multivariate calibration (direct multicomponent analysis) they are obtained as a result of prior measurements of all the individual pure constituents' spectra, k1, k2, ... For estimating their concentrations in an unknown mixture, these directly measured spectra are used in the linear mixture model. Several modern instruments, e.g. for UV-Vis spectroscopy, use such direct multicomponent analysis to eliminate known chemical interferences.

The present paper deals mostly with indirect calibration, which may be more laborsome, but also more general than the direct method, because it also allows elimination of interferences from unknown constituents and unknown physical phenomena. In the indirect approach one needs a 'representative' set of 'reference samples' with measured values for both spectrum and concentrations of the constituents to be predicted. This set of calibration data is used for statistical estimation of the model parameters that after a mathematical transformation yield the desired calibration coefficients b̂0, b̂1, ..., b̂p.
In the present paper some different approaches to multivariate calibration will be described from a theoretical point of view. The importance of 'bad data points' (outliers) and data linearization are also treated.

* For part I see the September issue: Trends Anal. Chem., 3 (1984) 204. In part I there is a printing error: page 206, first column, line 15, "c = cK" should read "x = cK".
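As a numerical illustration (not from the original paper) of the linearization x = log(1/transmission) together with the prediction equation (eqn. 1), here is a minimal Python sketch; all readings and coefficients are invented for the example:

```python
import numpy as np

# Hypothetical use of eqn. 1, c_hat = b0 + x @ b, after linearizing raw
# transmission readings to absorbance. All numbers are invented.
transmission = np.array([0.50, 0.25, 0.10])   # raw readings at p = 3 wavelengths
x = np.log10(1.0 / transmission)              # linearization: x = log(1/T)

b0 = 0.1                                      # hypothetical intercept from a previous calibration
b = np.array([0.8, 0.5, 0.2])                 # hypothetical calibration coefficient vector

c_hat = b0 + x @ b                            # predicted concentration (eqn. 1)
print(round(float(c_hat), 4))
```

The 'hats' in the paper's notation correspond to `b0` and `b` having been determined in a previous calibration step.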




Multivariate calibration

Models and methods
It is realistic in many analytical situations to assume that the instrument response (spectrum x) is a linear function of the concentrations of the constituents of the mixture plus measurement noise. This linear multicomponent mixture model can be used for both direct and indirect calibration. In spectroscopy it is known as Beer's law. To emphasize its empirical nature we prefer to call it Beer's model. It is identical to an ordinary statistical linear model. From a mathematical point of view nothing is better than linear models. They are easy to understand and can be used both for direct and indirect calibration. This type of methods will be considered in more detail in the next section.

For indirect multivariate calibration various regression techniques with c as dependent variable (regressand) can be used as alternatives to Beer's model. The simplest of these calibration methods is the ordinary multiple linear regression (MLR) method based on a model directly compatible with the multivariate prediction equation (eqn. 1), i.e. the model is

c = b0 + xb + f   (2)

where f is an error term assumed to have expectation zero. The parameters b0 and b = (b1, ..., bp)' are estimated by fitting c to x over the set of calibration samples. Once estimated, they are used directly as calibration coefficients in eqn. 1.

In practice, some of the instrument response variables in the spectrum x often approximate linear combinations of other ones. This is called multicollinearity and results in numerical problems and often also in poor performance of the predictor based on MLR. It is then better to use some other calibration technique that can reduce this problem. There exist two types of such methods, namely those methods that select a subset of the elements of the spectrum x and those methods that utilize the whole spectrum, "full-spectrum" methods.

Stepwise multiple linear regression² (SMLR) and stepwise derivative linear regression³ (here called SDLR) are examples of the selection type of methods used in commercial near infrared (NIR) instruments at present. To the second type, the full-spectrum methods, belong the predictors based on ridge regression² (RR), principal-component regression² (PCR), partial least-squares regression⁴ (PLSR) and some of the methods developed in Beer's model (see next section). The PCR and PLSR methods will also be described. Both the PLSR and PCR methods have been shown to give good results in calibration problems and they have valuable interpretation possibilities. Both are based on regressing the concentration c on linear combinations or projections of the spectral variables in x. They differ in the way these linear combinations are found. We term these linear combinations 'regression factors'.

In Fig. 1 we have illustrated some aspects of the difference between the most important calibration methods. The methods are computationally quite different. But when the relationship between x and c is nicely linear and the measurement noise in x and c is very low, the different methods yield similar results. However, they make different statistical distribution assumptions and allow for different error types, and therefore appear to have somewhat different applications.

Multiple linear regression (MLR)
  Chemical concentration c =
    = f (instrument responses x1, x2, ...) + error

Beer's model methods
  Instrument responses x1, x2, ..., xp =
    = f (chemical concentrations c1, c2, ..., cq) + errors

PLSR and PCR
  Chemical concentration c =
    = f (regression factors a1, a2, ...) + error
  Instrument responses x1, x2, ..., xp =
    = g (regression factors a1, a2, ...) + errors

Fig. 1. The data models used in different groups of methods for indirect multivariate calibration. The figure illustrates how measured chemical concentration data c are related to instrument response data x1, x2, ..., xp in the calibration sample set. Letters f ( ) and g ( ) symbolize 'function of'.

Calibration methods based on the linear multicomponent mixture model (Beer's model)
The model. The linear relationship between spectrum and concentrations can be written as

x = cK + e   (3)

Here K is a fixed matrix of physical constants (for instance the matrix of absorbance spectra of the pure constituents), c = (c1, ..., cq) is a row vector of concentrations of one or more chemical constituents, and e is a row vector of residuals (random noise, model errors, etc.) with expectation zero and covariance matrix equal to V.
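To make the MLR route concrete, here is a sketch (not from the paper) of indirect calibration by eqn. 2 on data simulated from Beer's model; the spectra `k`, `k_interf` and the concentration ranges are invented for the illustration:

```python
import numpy as np

# Sketch of indirect calibration by MLR (eqn. 2) on simulated Beer's-model
# data with one analyte and one interfering constituent.
rng = np.random.default_rng(2)

n, p = 40, 3
k = np.array([0.9, 0.4, 0.1])               # spectrum of the analyte (invented)
k_interf = np.array([0.2, 0.5, 0.8])        # spectrum of an interferent (invented)
c = rng.uniform(0, 2, size=n)               # analyte concentrations
c_interf = rng.uniform(0, 2, size=n)        # interferent concentrations
X = np.outer(c, k) + np.outer(c_interf, k_interf) + 0.001 * rng.normal(size=(n, p))

# Fit c = b0 + x b + f by least squares over the calibration samples
Xa = np.column_stack([np.ones(n), X])       # prepend a column for the intercept b0
coef, *_ = np.linalg.lstsq(Xa, c, rcond=None)
b0, b = coef[0], coef[1:]

c_hat = b0 + X @ b                          # eqn. 1 with the estimated coefficients
print(np.abs(c_hat - c).max())              # small: the interference is modelled away
```

Because the regression uses several wavelengths, the interferent's contribution is separated out, which a univariate calibration at any single wavelength could not do here.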

The model is sometimes extended by adding an intercept, and will later be extended further by a term tP representing unknown interferences.

The calibration methods based on Beer's model can be used to calibrate for several constituents simultaneously. It is therefore convenient to use a q-dimensional generalization of the prediction equation (eqn. 1). This predictor can be written as

ĉ = b̂0 + xB̂   (4)

where b̂0 = (b̂01, ..., b̂0q) is the row vector of intercepts and B̂ = {b̂kj}, j = 1, ..., q and k = 1, ..., p, represents the calibration coefficients for the q different constituents at the p different instrument response variables. These are derived from the physical constants, K.

As discussed in our previous paper¹, there is a distinction between so-called natural (or random) and controlled calibration. In the random case information about the population of samples is available and incorporated, and in the other case no such information exists. Both cases are possible in Beer's model. In the natural case we denote the concentration mean of the sample population [i.e. E(c)] by u and the dispersion matrix of the population [i.e. cov(c)] by D. In controlled calibration situations no population information about the distribution of mixtures is available, and we can not utilize u and D.

Direct calibration. Direct calibration in eqn. 3 requires that the model parameters K and V (and u and D in the random case) are known. In the random calibration case it is natural⁵ to predict unknown concentrations from calibration coefficients estimated by

B = (K'DK + V)⁻¹ K'D   (5)

and

b0 = u(I − KB)

where I is the q-dimensional identity matrix. Notice that the expressions contain u and D, the distribution parameters for the constituent concentrations.

In the controlled case the calibration coefficients are usually estimated by⁵

B = V⁻¹ K' (KV⁻¹K')⁻¹   (6)

and

b0 = 0

Notice that no information about the population is involved. When used in the prediction equation, eqn. 5 yields the best linear predictor (BLP) for c and eqn. 6 yields the generalized least-squares (GLS) estimator for c. Both are optimal in the sense that they minimize the mean squared error (MSE, i.e. the square of the standard deviation of the difference between the measured and predicted c) in the natural and controlled case, respectively.

Indirect calibration. When using Beer's model for indirect calibration, the model parameters K and V (and u and D in the random case) are unknown and must be estimated by observations of x and c in the set of calibration data, i.e. (x1, c1), ..., (xn, cn) where n is the number of samples. Several alternative methods for estimating the parameters exist. In Naes⁶ the maximum-likelihood (ML) estimators under normality assumptions are used, but also other alternatives may be envisioned. Irrespective of the choice of estimators, the estimated versions of the BLP and GLS estimators are here called EBLP and EGLS, respectively (E for estimated). If we use the ML estimators for the parameters, it is easy to see from established regression theory that EBLP is equal to the ordinary MLR predictor for c on x (see previous section). In addition, EBLP and EGLS with ML estimates for the model parameters are multivariate generalizations of the so-called inverse and classical methods of univariate calibration, respectively. The inverse and classical methods are derived by using the regression of c on x and of x on c, respectively.

In practice, the dimension of x is often large compared with that of c and the matrix consisting of calibration data for x is often highly multicollinear. To incorporate this information in the type of methods treated here it is assumed⁶ that e has linear factor structure, i.e. that e = tP + e* where e* has uncorrelated elements. Such 'rank reduction' models have been shown to be suitable for multicollinear data. The model is then

x = cK + e = cK + tP + e*   (7)

The rows of P may be regarded as spectra of unknown constituents, while K contains spectra of the known constituents. Estimates of the residuals e can be obtained after the estimation of K, by fitting x to c (see eqn. 3) in the calibration data, i.e.

ê = x − cK̂   (8)

The covariance matrix of tP + e* may then be estimated by factor analysis of the residuals ê and finally inserted in eqns. 5 and 6 (instead of e.g. the ML estimator of V). The number of factors in P can be estimated by cross-validation or prediction testing on an independent dataset (see Fig. 2).
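Eqns. 5 and 6 are direct matrix computations once K, V, u and D are given. The following sketch (not from the paper; all matrices are invented) builds both sets of calibration coefficients and checks the GLS identity that a noiseless spectrum x = cK is mapped back exactly onto c:

```python
import numpy as np

# Hypothetical direct calibration with known K, V, u, D (eqns. 5 and 6).
q, p = 2, 4
K = np.array([[1.0, 0.5, 0.2, 0.1],      # pure-constituent spectra (q x p), invented
              [0.1, 0.3, 0.9, 0.4]])
V = 0.01 * np.eye(p)                     # residual covariance of e
u = np.array([2.0, 1.0])                 # population mean E(c)
D = np.array([[0.30, 0.05],              # population dispersion cov(c)
              [0.05, 0.20]])

# Eqn. 5 (natural case): best linear predictor (BLP)
B_blp = np.linalg.solve(K.T @ D @ K + V, K.T @ D)   # (K'DK + V)^-1 K'D, a p x q matrix
b0_blp = u @ (np.eye(q) - K @ B_blp)                # b0 = u(I - KB)

# Eqn. 6 (controlled case): generalized least-squares (GLS) estimator
B_gls = np.linalg.inv(V) @ K.T @ np.linalg.inv(K @ np.linalg.inv(V) @ K.T)
b0_gls = np.zeros(q)

c_true = np.array([1.5, 0.8])
x = c_true @ K                                      # noiseless spectrum (eqn. 3 with e = 0)
print(b0_gls + x @ B_gls)                           # GLS recovers c exactly when e = 0
```

With noise present, the GLS predictor stays unbiased over the whole concentration range, while the BLP coefficients shrink predictions toward u, which is the natural/controlled trade-off discussed below (Fig. 3).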

The methods derived with the factor assumption on the residuals have given good results in e.g. NIR spectroscopy⁶.

The basic distinction between natural and controlled calibration is unimportant if the residuals after fitting the calibration model to the calibration data are very small⁵. But if appreciable errors exist, the distinction is important: the calibration methods for the two cases will yield calibration coefficients b̂0, ..., b̂p that are optimal for different types of unknown samples (Fig. 3). In the controlled case, the calibration coefficients (e.g. from EGLS) will yield acceptable predictions over a wide range of concentrations and may even allow extrapolations outside the calibration range. In the natural case, the calibration coefficients (e.g. from EBLP) contain estimates of u and D and will be especially good for 'typical' samples near the center of the assumed population, represented e.g. by the mean in the set of calibration samples (c̄). They will be bad for more 'atypical' samples outside this narrow range. Averaged over samples selected from the population, the natural philosophy will usually give the best predictions.

For practical applications of the methods presented in this section we refer to Naes⁶ and Brown⁷. The methods have been found to perform well on chemical data.

Methods based on regression on factors from x
In this section we will describe another important type of calibration techniques, namely those methods which are based on regressing the chemical concentrations on linear combinations or projections of x, called 'regression factors'. The regression factors are found by projecting the spectra x onto a space spanned by row vectors p̂1, ..., p̂A (estimated factor loadings). Two methods of this kind, PCR and PLSR, will be considered.

PCR. In PCR, the regression factors are described by loadings p̂1, p̂2, ..., p̂A which are simply defined as the normalized eigenvectors corresponding to the A largest eigenvalues of the covariance matrix defined by the x's. The p̂ vectors are orthonormal. The dimensionality A is selected such that the space spanned by P̂' = (p̂1, p̂2, ..., p̂A) contains most of the information in x. The projection of x into this space, t̂ = (t̂1, ..., t̂A), is found by least squares using the equation

x = x̄ + t̂P̂ + ê   (9)

This is illustrated in Fig. 4.

The 'scores' t̂ are then used as regressors in the equation

c = c̄ + t̂q + f   (10)

obtaining an estimate of the 'chemical loadings' q by the least-squares method. The symbols x̄ and c̄ are here the mean of x and c in the calibration data set.

Fig. 2. Prediction ability of MLR and EBLP. [Graph: MSE of prediction vs. number of factors (0-10); a curve for EBLP with minimum at 6 factors and a horizontal line for MLR.] This artificial example illustrates a typical situation when predictors are tested on an independent dataset (prediction testing). The straight line illustrates the mean squared error of prediction (MSE) of MLR while the plotted curve illustrates how the MSE of EBLP changes with the assumed number of rows (factors) in P. In this artificial example the maximum number of rows is 10 and the best predictions are obtained for 6 rows. We refer to Martens and Naes¹ for further details.

Fig. 3. Mean squared error (MSE) of the BLP and the GLS estimator for c. [Graph: MSE vs. c; a quadratic BLP curve with minimum at u and a horizontal GLS line.] From the statistical literature⁵ it is known that the GLS estimator is superior to BLP outside an ellipsoid, while the opposite holds inside the ellipsoid where 'typical' samples are situated. The figure gives a univariate illustration of this. We note that in this case the ellipsoid is an interval. The BLP curve follows a quadratic function with minimum at u (the mean of c). A similar figure holds for EBLP and EGLS, at least when the calibration data set is large.
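The PCR recipe of eqns. 9 and 10 can be sketched directly in code (not from the paper; the data are simulated, and only the first constituent is predicted):

```python
import numpy as np

# Sketch of PCR (eqns. 9-10) on simulated data: scores t are the projections
# of centered spectra onto the top-A eigenvectors P of the covariance matrix
# of x; c is then regressed on t by least squares.
rng = np.random.default_rng(0)

n, p, A = 50, 6, 2
K = rng.normal(size=(A, p))                  # two underlying constituent spectra, invented
C = rng.uniform(1, 3, size=(n, A))           # concentrations of the two constituents
X = C @ K + 0.001 * rng.normal(size=(n, p))  # nearly noise-free Beer's-model spectra

x_bar, c_bar = X.mean(axis=0), C[:, 0].mean()
Xc = X - x_bar
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
P = eigvecs[:, np.argsort(eigvals)[::-1][:A]].T   # loadings: A largest eigenvectors (A x p)

T = Xc @ P.T                                  # scores t (least-squares projection, eqn. 9)
q, *_ = np.linalg.lstsq(T, C[:, 0] - c_bar, rcond=None)   # chemical loadings (eqn. 10)

b = P.T @ q                                   # b = P'q, so c_hat = c_bar + (x - x_bar) b
c_hat = c_bar + (X - x_bar) @ b
print(np.abs(c_hat - C[:, 0]).max())          # small: two factors capture two constituents
```

Here A = 2 suffices because the simulated spectra have rank two; in practice A would be chosen by cross-validation, as discussed for Fig. 2.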

Fig. 4. Prediction by the PCR and PLSR methods. [Graph: a point x in x1-x2-x3 space projected onto the plane spanned by the factor axes.] In both cases the observation x is projected into the subspace, represented by the plane spanned by the factor axes whose directions from the mean, x̄, are defined by p̂1 and p̂2. The coordinates (factor scores, here t̂1 and t̂2) are used as predictor variables when predicting c, while the residuals, ê, may be used to reveal outliers.

Prediction of an unknown c from a measured x can proceed by estimating t̂ using P̂ in eqn. 9 and ĉ from t̂ using eqn. 10. In this case ĉ can be written as ĉ = b̂0 + xb̂, where b̂ = P̂'q̂ and b̂0 = c̄ − x̄b̂.

PLSR. PLSR determines the parameters P and q in a slightly different way. While PCR estimates the loadings P̂ using only the information in x, the PLSR method estimates P̂ using both c and x. Thus PLSR is more parsimonious than PCR since it primarily selects vectors with predictive relevance for c. Hence, PLSR often gives better results with fewer factors than PCR. The PLSR algorithm for finding P̂ and q̂ is very efficient on the computer and we refer to Wold et al.⁴ and Martens and Jensen⁸ for a description of it. The method can be modified and extended in several ways. Prediction in the PLSR model is similar to that of PCR (Fig. 4). In this case b̂ is a little more complicated than for PCR but it still has the form of eqn. 1.

Since c is used as regressand in these methods (eqn. 10), PCR and PLSR are natural calibration methods⁵. They do not share the advantage of the Beer's model approach, which is able to handle both natural and controlled calibration. On the other hand, since c is used as regressand, the PCR and PLSR approaches are probably better suited to handle the problem of measurement noise in both x and c in the calibration data. This problem is often overlooked in the literature.

Outlier detection methods
Outlier detection when predicting unknown samples
Once the model parameters have been estimated, the resulting calibration coefficients b̂0 and b̂ can be used for prediction. However, such predictions can give quite misleading concentration results if used on samples for which the model and its parameters do not apply. Such samples may be called abnormal observations or outliers. Unexpected errors may always occur in chemical or instrumental analysis. In a previous paper¹ we illustrated how multivariate analysis can reveal such outliers. Outlier detection can be used to ensure that every unknown prediction sample satisfies the calibration assumptions. For all the calibration methods described in the previous sections outlier detection criteria have been developed. In Naes and Martens⁹ various methods for detecting outliers in Beer's model are considered. A distinction is made between, on the one hand, observations that fit the estimated mixture model (eqn. 3) but have concentrations far from the mean of the calibration sample set, and on the other hand, observations that do not even fit eqn. 3 itself. The criteria for outlier detection based on eqn. 3 are based on weighted sums of the residuals derived from EBLP and EGLS.

Outlier detection for the PLSR and PCR methods is considered by Martens and Jensen⁸. The same distinction between different types of outliers as above is made. The detection in this case is based on testing the fit of the spectral data to the space spanned by the matrix P̂, the residual lack of fit, ê. The residuals can be studied by e.g. Fourier analysis or visual inspection, and tested statistically to detect abnormal residuals. These outlier methods were tested successfully on NIR measurements. They distinguished clearly between wheat and barley products. Recently improved outlier detection methods are described¹³.

Outlier detection in the calibration step
To obtain calibration coefficients that give good predictions it is necessary to eliminate erroneous calibration samples. This requires outlier detection also in the calibration phase. To detect outliers in the calibration, ordinary regression outlier detection statistics such as the DFFITS¹⁰ or modifications of these may be used.
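A minimal sketch of the residual lack-of-fit idea (this is an illustration of the principle only, not the exact criteria of the cited papers; loadings and mean are simulated): a new spectrum is projected onto the estimated loading space, and a large residual norm flags a sample that does not fit the calibration model.

```python
import numpy as np

# Residual-based outlier screening against the span of the loadings P.
rng = np.random.default_rng(1)

p, A = 6, 2
P = np.linalg.qr(rng.normal(size=(p, A)))[0].T   # orthonormal loadings (A x p), simulated
x_bar = rng.normal(size=p)                       # calibration mean spectrum, simulated

def spectral_residual(x):
    xc = x - x_bar
    t = xc @ P.T                                 # scores: projection onto the loading space
    e = xc - t @ P                               # residual lack of fit
    return np.linalg.norm(e)

x_normal = x_bar + rng.normal(size=A) @ P        # lies in the model plane: tiny residual
x_outlier = x_normal + 0.5 * rng.normal(size=p)  # unknown interference pushes it off-plane

print(spectral_residual(x_normal), spectral_residual(x_outlier))
```

In practice the residual statistic would be compared against a threshold estimated from the calibration samples rather than judged by eye.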

Linearization of input data
It should be emphasized that linear models only represent approximations to the real world and usually the linearity can only be justified in a restricted range of sample quality variations. An example is diffuse near infrared reflectance (NIRR) spectroscopy. In such cases linearizing transformations of the spectral reflectance measurements are needed to make them suitable for linear models. Many such transformations may be envisioned.

In NIRR a logarithmic linearization of the reflectance measurement is often used. In the statistical literature the Box-Cox transformation² has become popular to fit a linear model to the data and it works well in many applications. Norris¹¹ and Martens and co-workers¹²,¹³ have considered further optical corrections for NIRR data to eliminate multiplicative light scatter effects. This type of transformation is based on separation of the measured, linearized spectrum into a multiplicative light scatter noise contribution and an additive light absorption contribution proportional to the chemical composition.

Conclusion
Multivariate calibration is a young discipline, and many of its aspects have not yet been fully tested. But it is our belief that multivariate calibration will give better use of today's advanced analytical instruments in chemistry. It may allow easy and fast analysis of 'dirty' samples, predicting relevant information in spite of interferences, and it provides reliability via its pattern recognition error warnings.

In order to make indirect multivariate calibration techniques easily accessible for non-specialists, we are now developing a computer program package for multivariate calibration, to be used in 16-bit microcomputers. The program is intended for general laboratory use.

It is our belief that multivariate calibration and other chemometric data analytic techniques can improve interdisciplinary contact, by allowing quantitative coupling of data from widely different scientific fields. Such interdisciplinary interpretation may cause a sorely needed shift from 'atomistic' specialization to more 'holistic' systems analysis. Various branches in chemistry may thereby get better communication with each other and with external fields like geology, biology or economy.

References
1 H. Martens and T. Naes, Trends Anal. Chem., 3 (1984) 204.
2 N. R. Draper and H. Smith, Applied Regression Analysis, New York, 2nd ed., 1980.
3 K. H. Norris and P. C. Williams, Cereal Foods World, 22 (1977) 461.
4 S. Wold, H. Martens and H. Wold, in A. Ruhe and B. Kågström (Editors), Proc. Conf. Matrix Pencils, Piteå, Sweden, March 1982, Lecture Notes in Mathematics, Springer Verlag, Heidelberg, 1982, p. 286.
5 T. Naes, Biom. J., in press.
6 T. Naes, Biom. J., in press.
7 P. J. Brown, J. R. Stat. Soc. B, 44 (1982) 287.
8 H. Martens and S. A. Jensen, in J. Holas and J. Kratochvil (Editors), Proc. 7th World Cereal and Bread Congr., Prague, June 1982, Developments in Food Science, 5, Elsevier, Amsterdam, 1983, p. 607 (p. 613 and 614 are interchanged).
9 T. Naes and H. Martens, submitted for publication.
10 D. Cook, Technometrics, 19 (1977) 15.
11 K. H. Norris, in H. Martens and H. Russwurm (Editors), Proc. IUFoST Symp. Food Research and Data Analysis, Oslo, September 1982, Applied Science Publishers, 1983, p. 95.
12 H. Martens, S. A. Jensen and P. Geladi, in O. H. J. Christie (Editor), Proc. Symp. Applied Statistics, Stavanger, June 12-14, 1983, Stokkand Forlag Publ., Stavanger, 1983, p. 205.
13 H. Martens and T. Naes, in P. Williams (Editor), Near Infrared Spectroscopy, American Association of Cereal Chemistry, St. Paul, 1985, in press.

T. Naes and H. Martens are at the Norwegian Food Research Institute, P.O. Box 50, N-1432 Aas-NLH, Norway.