
Proceedings of

2001 ASME International Mechanical Engineering Congress & Exposition


November 11-16, 2001, New York, NY

Quantitative Validation of Mathematical Models


Kevin Dowding
Sandia National Laboratories
Validation and Uncertainty Quantification Processes
P.O. Box 5800 Mail Stop 0828
Albuquerque, NM 87185-0828
e-mail: [email protected]

ABSTRACT

Validation is a process to compare a mathematical model with a set of physical experiments to quantify the accuracy of the model to represent the physical world (experiment). Because the goal is to use experiments to quantify the accuracy of the mathematical model, the interaction of the model and experiment must be carefully studied. Advancing the comparison beyond a qualitative nature requires consideration of the errors in the process and the effect of these errors on the comparison. The mathematical model, in conjunction with sensitivity analysis, uncertainty analysis, and statistical analysis, provides tools for studying the interaction of the model and experiment and quantifying the effect of errors. A model for steady state heat conduction is used to discuss issues associated with the errors in the validation process and demonstrate a quantitative process to study validation of mathematical models.

NOMENCLATURE
D_j,i     experimental measurement, sensor j and experiment i, °C
ê         residual, °C
E[ ]      expected value
k         thermal conductivity, W/m·°C
L         length, m
q_0       heat flux, W/m²
r²        consistency statistic, °C²
r²_crit   critical value of consistency statistic, °C²
S         model prediction, °C
S_p       scaled sensitivity coefficient, °C
t         Student's t-distribution
T         t-statistic
T         temperature, °C
T_L       boundary temperature, °C
V         covariance matrix
x         boundary/initial condition vector
X         experiment dependence vector
z         spatial coordinate, m
ε         prediction error
ε_T       temperature measurement error, °C
ε_q0      heat flux measurement error, W/m²
σ         standard deviation
φ_p       physical/material property vector
φ_n       numerical parameter vector
χ²        Chi-square distribution

superscript
ˆ         estimate/measurement

subscript
p         parameter
S         model
D         experiment
T         total

Copyright © 2001 by ASME


INTRODUCTION

Model validation is a process by which we study how accurately a model represents the physical world. Traditionally, validation has involved comparing results of a mathematical model of a physical phenomenon to results of a physical experiment. In many cases the outcome of the comparison is described in qualitative terms: good, reasonable, excellent, etc. Most engineering journals contain examples that illustrate this paradigm. As we rely less on physical experiments, and more on models, in the engineering design process, we need a quantitative assessment of how accurately a model reflects the physical world. Furthermore, the paradigm to assess how well a model represents the physical world, including the use of the model to design and assess the errors, needs to change in light of the increased reliance on models.

A field of study with overlap to the validation process is the area of inverse analysis. Inverse analysis is typically concerned with identifying physical parameters or boundary conditions, "causes," given experimental measurements, "effects" [1,2]. The inverse problem entails more than a residual minimization to match the measurements to a model and thus infer the "causes." (Just as the validation process is more than comparing a deterministic model with a single experiment.) Issues such as what is the best experimental configuration, what experimental data are needed, whether the "causes" can be reliably identified from the measurements, and how accurate the estimated "causes" are, are studied. There are several parallels between the inverse problem and the validation process. In general, both seek to conduct experiments for the purpose of identifying characteristics of the mathematical model. In this paper we shall draw upon the tools and concepts used in the inverse area, and additional tools, to demonstrate their use in the validation process.

The goal of this paper is to discuss issues associated with the validation process, identify analysis tools that may help in the process, and, by way of example, demonstrate the use of the tools to address validation.

Validation Process

The validation process should be driven from the perspective of quantifying the accuracy of the model for a planned application of the model. In this paper we assume that we have a mathematical model we would like to compare with an experiment to quantitatively measure the accuracy of the model. This starting point is actually several steps into the validation process in that there are (application) drivers that set requirements on the model. We shall defer to Trucano et al. [3] for a discussion of application requirements that drive the model validation process.

The basic validation process is to take model predictions and compare them to experimental measurements to quantify the accuracy of the model to represent the physical world. Denote the model predictions as S and the experimental measurements as D. We discuss the functional dependencies of both next.

The model predictions come from the solution of a mathematical model (differential equation), for example, the solution of the heat conduction equation or the Navier-Stokes equations. The mathematical model requires specifying physical/empirical parameters, which are denoted as φ_p, and boundary/initial conditions, which are denoted x. Examples of φ_p are physical dimensions and physical properties like thermal conductivity, kinematic viscosity, or contact resistance. Examples of x are the initial temperature, wall heat flux, and exit pressure. In many cases a numerical solution is required to obtain model predictions. Parameters associated with the numerical solution are denoted φ_n. Examples of φ_n are the number of finite elements and the nonlinear convergence tolerance. To summarize, the model predictions have the following dependence

    S = f(φ_p, x, φ_n). (1)

The intended application of the mathematical model is used to select the range of φ_p and x at which to compare the model predictions and experiments.

The experiment should physically simulate the parameter range, φ_p, and apply boundary conditions, x, of interest to assess accuracy of the model. The experiment may have additional dependencies beyond parameters or boundary conditions. We may study the accuracy of the model as conditions in the physical experiment that are representative of how the model is applied are varied. Denote these conditions as X. Examples of X could be effects that may be neglected in the model, like orientation and ambient pressure. The model does not depend explicitly on X, and by comparing with experiments we can assess the dependence. We can write the dependence of the experimental measurements as

    D = f(φ_p, x, X). (2)

The experiment doesn't explicitly depend on the value of physical properties in φ_p. Instead, hardware is selected that will physically respond consistent with the values in φ_p.

The dependencies for the model predictions and experimental measurements are shown in Eq. (1) and Eq. (2). Both depend on parameters and boundary/initial conditions. In addition, the model predictions have dependence on numerical parameters associated with solving the mathematical model, and the experiment may have dependence on conditions not reflected in the model. In general the experimental measurements and model predictions are vectors. The vector may have measurements that are a single value from multiple



experiments, time/space resolved from a single experiment, or time/space resolved from multiple experiments. Similarly, φ_p, x, X are generally vectors.

Statistical Model

To quantitatively study the accuracy of the model, differences between the experimental measurements and model predictions should be studied

    ε ≡ D – S. (3)

Call this the prediction error. The differences are a sensitive indicator of the relationship between an experiment and a model. The differences have the combined functional dependence of the experiment and model

    ε = f(φ_p, x, X, φ_n). (4)

A fundamental difficulty with validation is that the data we have for the experimental measurements and model predictions are corrupted with error. Denote the corrupted experimental measurements and model predictions as D̂ and Ŝ, respectively. Hence we have data to quantify the differences; call them residuals

    ê = D̂ – Ŝ. (5)

Equation (5) can be expanded to show the relationship of the data corrupted with error to the "errorless" values

    ê = (D̂ – D) + (S – Ŝ) + ε, (6)

where ê is the residual, (D̂ – D) the experimental error, (S – Ŝ) the model error, and ε the prediction error. Equation (6) is conceptual because in practice we don't know the "errorless" values of D and S. The point is that the residual data are a function of the experimental, model, and prediction errors.

Experimental error in Eq. (6) is due to the effect of instrumentation to measure D̂. The model error in Eq. (6) is with regard to the ability of the model to predict the specific validation experiment. For a model to predict the outcome of a validation experiment requires physical and material properties, φ_p, and boundary and initial conditions, x, as inputs to the model. These inputs are usually a combination of measurements from the present validation experiment and estimates from previous experiments; identify the measurements/estimates as φ̂_p, x̂ for the true values φ_p, x. The model error is due to the effect of (φ_p – φ̂_p, x – x̂) on the model predictions. Model error may also depend on φ_n. The prediction error is the approximation error of the model. A key point concerning Eq. (6) is that the residual is not equal to the difference between "errorless" experiment and model values

    ê ≠ ε. (7)

For clarification, the errors in Eq. (6) are separated into two categories based on the source of the error.

• Inherent errors are related to the ability of the model to predict the present validation experiment and are due to lack of knowledge, measurement error, or numerical solution error. The experimental error and model error given in Eq. (6) are the result of inherent error, that is, error that is inherent in the comparison between a model and experiment. Examples of inherent error are imprecisely known physical properties and boundary and initial conditions for the experiment, measurement error, and numerical solution error. These errors can be reduced, at least to some degree, by additional experiments to better characterize the physical properties, improved experimental measuring capabilities, or reduced numerical error. It is impossible to remove all the effects of inherent error.

• Prediction errors are the result of the model not representing the physics of the experiment. Prediction errors are not a consequence of inherent error (lack of knowledge, measurement error, or numerical error concerning the present experiment). Prediction errors are the approximation error of the mathematical model to represent the physics in the experiment.

In general, errors are anticipated to be a collection of random and systematic type errors. Reliably separating the effect of these errors on the residuals may be quite difficult, in particular if the inherent errors are relatively large compared to the prediction errors and have a systematic effect on the residuals. In most validation experiments, we are interested in identifying the prediction error, that is, the effect of the model not representing the physical world, independent of inherent errors. Because inherent errors are convolved with prediction errors when a model and experiment are compared, we need to understand and quantify the effect of inherent errors to understand the presence and magnitude of the prediction errors.

The effects of inherent error could be quantified by conducting additional experiments. With adequate resources, in principle, we could conduct a designed set of validation experiments to quantify and separate the effect of inherent (variable and systematic) errors. Then conceptually we could quantify the total error (inherent and prediction) and remove all components of error that we are not interested in to quantify the prediction error, if it exists. Experiments to quantify all potential errors would be needed. Taking this approach could be costly, and for all but simple models the experimental requirements could be too exhaustive to undertake.

The approach suggested here is based on using the model and



analysis techniques, as opposed to relying solely on the experimental data. The techniques are related to inverse and uncertainty analysis. In many cases we will not have experimental data to directly assess the effect of inherent error on the model. For example, we may have experiments conducted on a single apparatus and not know the effect of uncertainty in a material property. In the absence of data we can use the model and analysis to study the effect of this inherent error. The approach is to study the model, in conjunction with existing experimental data, to identify potential contributors to the errors between the model and experiment and provide direction for deciding how resources should be allocated to improve our understanding of differences between the model and validation experiment.

Using the model and analysis will not replace experiments. Validation requires experimental data, perhaps of a higher quality and better characterized than what we are accustomed to. The idea is to leverage the model for conducting a smarter experimental test suite to develop greater insight into the source of the differences between the model and experiment and include the potential effect of errors for which we do not have data.

MODEL VALIDATION ANALYSIS TOOLS

The validation process requires a combination of analytical, numerical, statistical, and experimental skills. This speaks to the difficulty of the process. There is not a unique set of steps one follows to study validation. The goal of validation is insight into, and quantification of, how a model represents the physical world. In order to achieve this goal we need to leverage the model, experiments, and analyses to maximize the available resources. The process of attempting to validate a model will certainly provide valuable insight into the weaknesses in both the model and experiment.

The mathematical model can be a resource to gain insight into the validation process. Since validation is being undertaken, a mathematical model exists and can be exercised to study the validation process. If the model is deemed so poor as to be unreliable, then the model is probably not ready for validation. The experimental data provide valuable information to not only quantify the accuracy of the model, but further enhance our understanding of the model and potentially improve it. Using the model in conjunction with sensitivity analysis and uncertainty analysis provides additional insight to the validation process. Carefully studying the model can:

1. assess what attributes of the model are tested by comparing with data
2. quantify the impact of measurement errors and/or uncertainties
3. provide insight to the structure of the errors and/or uncertainties
4. design a validation experiment to maximize usefulness of the information about the model

Points 1-3 relate to assessing what is learned about the model by comparing to experimental data and what is the effect of error/uncertainty. Point 4 relates to using the model in designing the physical experiment to maximize the information content of an experiment for addressing validation issues. The model should play a key role in designing the experiment. Due to space limitations this important point is only briefly discussed.

Assessing the validity of the model requires sensitivity, uncertainty, and statistical analyses. We briefly discuss each topic next. Then we apply these analyses to an example problem to demonstrate their use.

Sensitivity Analysis

Sensitivity analysis is concerned with quantifying the effect of model inputs, (φ_p, x), on the model output. Scaled sensitivity coefficients are found to be useful for this purpose. They are the partial derivatives of the model output, S, with respect to a parameter p (evaluated at a point in the parameter space), multiplied by the parameter

    S_p ≡ p (∂S/∂p). (8)

Typically, scaled sensitivity coefficients are evaluated at the mean value of the parameters. There are several techniques to calculate the sensitivity coefficients, and Blackwell et al. [4] provide a discussion of this topic.

Scaled sensitivity coefficients are an indicator of the mathematical model's dependence on parameters (coefficients) and boundary/initial conditions, shown as φ_p, x in Eq. (1). Scaled sensitivity coefficients are useful to identify and rank the important model parameters and boundary conditions. Additionally, the dependence on parameters in a mathematical model is an indication of the dependence on physical mechanisms in the model. For example, consider a mathematical model with thermal diffusion and radiation. Diffusion requires the parameter thermal conductivity and radiation requires the parameter emissivity. If the model is less sensitive to thermal conductivity than to emissivity, diffusion is not as important as radiation. Furthermore, a comparison with experimental data provides less information about the validity of the diffusion mechanism than it does for the radiation mechanism. Finally, the more sensitive a parameter, the greater the effect an error in its value has on error in the model prediction.

In addition to identifying the important parameters/boundary conditions and physical mechanisms in the mathematical model, sensitivity coefficients are useful for estimating correlation structure in a mathematical model. Even when parameters in a math model are independent, model predictions of space/time dependent data or multiple experiments can be correlated. The



correlation structure for model predictions can be estimated using the model.

Uncertainty Analysis

When faced with inputs to the model that we know are in error to some degree, either due to measurement error from the present experiment or a value taken from a previous experiment, some quantification of this effect is needed. Sensitivity analysis may indicate that the model is insensitive to the input, in which case we may choose to neglect the effect of error in its value. If the model is sensitive to the input we can use uncertainty propagation techniques to quantify the effect on the model output. There are several techniques for propagating uncertainty/variability through a model [5,6].

Propagation techniques, regardless of the specific method, require an estimate of the error in the model parameter. Typically, in the absence of data we assume characteristics of the error, e.g., the error is random with a normal distribution, mean equal to zero, and standard deviation equal to σ. Assuming characteristics of the error and quantifying the effect is better than ignoring the potential error. However, the uncertainty on the model prediction calculated using an assumed distribution must be taken in the context that it is an estimate conditioned on the assumed distribution. If the quantified accuracy of the model hinges on assumptions, additional experiments need to be conducted to quantify the (assumed) errors.

Uncertainty analysis (and sensitivity analysis) operates on an assumed correct (valid) model. Therefore, neither will indicate the presence of a prediction error; only comparing with experimental data will serve this purpose. Sensitivity and uncertainty analyses are useful for quantifying the magnitude and (correlation) structure of errors in the model predictions due to inherent errors in the validation process.

Statistical Analysis

Statistical analysis provides tools for analyzing data with variable and systematic errors. An excellent reference that discusses the application of statistical analysis to the validation process is Easterling [7]. The basic approach is to statistically characterize the residuals. Statistical quantities such as the mean, or bias, and variance can be estimated. Statistical tests can be performed to assess the significance of the bias and provide confidence limits on the estimated variance. In the following sections different statistical analyses are summarized and applied to study validation. The analyses can be found in a statistical methods textbook [8].

VALIDATION METRICS

Two approaches to quantify validation are discussed. The approaches are thought to be complementary by the author. The first approach is to use the experimental measurements and model predictions to estimate characteristics of the error, Easterling [7]. Identify this approach as error model estimation. This approach asks the question: How accurate is the model given the experimental measurements? From this metric inferences can be made about how good the model is for nontested conditions by assuming attributes about the error model.

A second approach compares the experimental measurements and model predictions including estimates of the error in both values. This approach asks the question: Are there additional errors in the model and/or experiment not attributable to the estimated error? Identify this approach as consistency testing.

Error Model Estimation

This metric aims to quantify the error observed between the model predictions and experimental measurements. Because errors are due to random and systematic effects, estimating the mean and variance is of interest

    μ̂_e = E[ê_i] = (1/n) Σ_{i=1}^{n} ê_i (9)

    σ̂_e² = E[(ê_i – E[ê_i])²] = (1/(n–1)) Σ_{i=1}^{n} (ê_i – μ̂_e)². (10)

The mean represents a bias while the variance represents scatter in the error. Estimates of these values depend on the sample size, n. When we have a small number of samples the estimates may not be accurate. Estimates of the mean and variance can represent an error model describing differences between the experiment and mathematical model.

Given estimates for the bias and variance, statistical tests can be performed to test hypotheses concerning the error model, calculate confidence limits on the estimates, and make inferences about the magnitude of the error in other, e.g., untested, applications. For illustration we summarize a few tests here; refer to Easterling [7] for additional discussion.

A test that may be of interest is that the error model (and hence the mathematical model) is unbiased (μ_e = 0). To test the hypothesis that μ_e = 0 when we have an estimate of σ̂_e², the t-statistic is used

    T = μ̂_e / (σ̂_e/√n). (11)

If T < –t_{1–α/2}(n–1) or T > t_{1–α/2}(n–1), where t_{1–α/2}(n–1) is the t-distribution for n–1 degrees of freedom at a probability level of 1–α/2, we would have evidence to reject the hypothesis that μ_e = 0 and have statistical evidence that the differences between the experimental measurements and model



predictions are biased.

Another useful application of statistical methods is to calculate confidence intervals for estimates in the error model. For example, an upper confidence bound for the estimated variance is [8]

    σ_e² < σ̂_e² (n–1)/χ²_{1–α/2}(n–1), (12)

where χ²_{1–α/2}(n–1) is the Chi-squared distribution for n–1 degrees of freedom and a probability level of 1–α/2. This confidence bound estimates the largest we would expect the variance to be given the stated probability level and the data used to estimate the variance. From the confidence limit we can infer the magnitude of the errors in other situations. Assuming the errors are normally distributed, we would expect errors to be less than 2σ_e with a probability of 95 percent, i.e., ê < 2σ_e.

Consistency Testing

A statistically based metric suggested by Hills and Trucano [9] is discussed. The idea of this metric is to compare the residual with an uncertainty estimate for errors in the model and experiment. The metric is based on the statistic

    r² = (D̂ – Ŝ)ᵀ V_T⁻¹ (D̂ – Ŝ) = êᵀ V_T⁻¹ ê, (13)

where Ŝ and D̂ are vectors of data from the model and experiment and V_T is the estimated covariance matrix of the differences. The vector data could be time/space dependent data from a single experiment, a single value from multiple experiments, or a combination of both. This example compares a single experiment/model prediction pair for multiple experiments.

Applying this metric requires the covariance matrix for the residuals. If experiment and model are independent we can write the covariance matrix for the residuals as a sum

    V_T = V_S + V_D, (14)

where V_S and V_D are (error) covariance matrices for the model predictions and experimental measurements. These matrices indicate the effect of inherent error on the model and experiment. We can estimate the covariance matrices V_S and V_D using uncertainty propagation and statistical analysis. If the errors between separate model prediction/experimental measurement pairs are independent, the covariance matrix only has diagonal entries and we can rewrite Eq. (13) as

    r² = Σ_i (Ŝ_i – D̂_i)²/σ̂²_{T,i} = Σ_i (ê_i)²/σ̂²_{T,i}, (15)

where Ŝ_i, D̂_i are elements of Ŝ, D̂ and

    σ̂²_{T,i} = σ̂²_{S,i} + σ̂²_{D,i} (16)

is the ith element along the diagonal of V_T.

Equation (15) can help interpret the metric. The statistic is a test of zero prediction error relative to nonuniform variance in the residuals. Assuming the residuals are jointly normally distributed and the prediction error is zero, r² has a Chi-square distribution

    r² ~ χ²(n). (17)

The critical value at which we would reject the (hypothesis) that the residuals are jointly normally distributed and have zero prediction error is selected to be the value at

    r²_crit = χ²_{1–α}(n), (18)

where χ²_{1–α}(n) is the value associated with the 100(1–α)% percentile for n degrees of freedom. The number of degrees of freedom, n, is the number of measurement/prediction pairs. The consistency test compares the magnitude of r² in Eq. (13), calculated with model predictions, experimental measurements, and an estimated covariance matrix of the errors in these values, with a critical value, r²_crit, that is based on the differences being distributed as χ².

The interpretation of this test is that if the residuals, ê_i, are small relative to potential errors in the values, σ̂²_{T,i} (assuming independent errors for discussion), there is no statistical evidence to suggest that the observed differences are the result of any source other than errors contributing to σ̂²_{T,i}. Comparing the residual to a quantity similar to σ̂²_{T,i} is a test suggested by Coleman and Stern [10].

Frequently, the errors in model predictions for multiple experiments and/or space/time dependent measurements are not independent and the covariance matrix V_T has off-diagonal entries. In this case, in addition to the magnitude, the correlation structure of the observed error is important. The structure of the residual must be consistent with the correlation structure of V_T. Hence, what Hills and Trucano [9] test is whether there is statistical evidence of nonzero prediction error. An important point to note about the test is: a positive outcome, indicating that there is no statistical evidence to suggest prediction error is significant, does not mean it is zero. It means the prediction error, relative to the model and experiment error due to the inherent error, is not



statistically significant. This suggests that to identify the prediction error we need to reduce the effect of inherent errors. The implication is that when we have a validation exercise with large errors in the model or experiment due to inherent error in the process, it will be difficult to quantify the prediction error.

VALIDATION EXAMPLE PROBLEM

We shall discuss the validation process by way of a simple example. The example selected is heat conduction. Heat conduction is not selected because we believe it needs validating. The basis for heat conduction is conservation of energy and Fourier's law, which have consensus in the engineering community as being valid, except for exceptional circumstances. The choice of studying heat conduction/Fourier's law is to begin with a model for which there is consensus on its validity as a starting point to understand issues in the validation process.

Mathematical Model

The mathematical model studied is one dimensional steady state thermal diffusion

    d/dz (k dT/dz) = 0,  0 < z < L. (19)

The boundary conditions are a prescribed flux on one surface and a prescribed temperature on the other

    –k ∇T|_{z=0} = q_0 (20)

    T|_{z=L} = T_L. (21)

Assuming constant properties, we can analytically solve Eq. (19) through Eq. (21) for the temperature distribution

    T(z; φ_p, x) = (q_0 L/k)(1 – z/L) + T_L. (22)

The additional dependencies identified on the left side of Eq. (22) are the physical/material property dependence, φ_p = {k, L}, and boundary conditions, x = {q_0, T_L}. The relationship of the model to this particular experiment is through the values in φ_p, x. The values that are ultimately used in the model will have error relative to the "true" (unknown) values for the given experimental apparatus. These errors contribute to the inherent error and affect the model predictions.

Model values to be studied in this case are differences between the dependent variable (temperature) of the mathematical equations and the boundary temperature

    Ŝ ≡ T(z, φ̂_p, x̂) – T_L = (q̂_0 L̂/k̂)(1 – z/L̂). (23)

Ŝ is used to identify model values because we use estimates/measurements of parameters, φ̂_p, and boundary conditions, x̂, for the experiment. Although the model studied in this paper is highly idealized, it contains the pertinent attributes of studying a more complex model. In particular, the model has an input that is estimated from a previous experiment, k̂, which has an (inherent) error due to uncertainty in its value, and an input that is measured in the validation experiment, q̂_0, which has an (inherent) random error in its value. Nearly all validation experiments must deal with model inputs with these two types of error (uncertain and random). A complex model would have many more inputs, but generally they could be classified according to the error in the input as defined in the simple model. An important issue not covered by this model, which has an analytical solution, is the effect of error due to numerically solving the model equations.

Validation Data

To validate the model we need an experiment that is representative of steady state thermal diffusion. Instead of dealing with the complexities of an actual experiment, data are simulated using the model. Experimental data are simulated directly from the model in Eq. (22)

    D = T(z, φ_p, x) – T_L = S. (24)

Since the experimental values are equal to the "true" model values, one expects the data should support validity of the model.

In practice experimental data contain measurement errors. To simulate realistic experimentally measured data, additive measurement error is introduced

    D̂ = D + ε_T, (25)

where ε_T is the experimental measurement error.

Experiments intended for validating a model require more than physical measurements to compare with a model. The experiment must provide material/physical properties and boundary/initial conditions for input to the model. The material/physical properties and boundary conditions relate the model to (the specific) experiment. The values used in a model are either measured during the validation experiment or taken from another source. Examples of a source other than the validation experiment are previous experiments or expert opinion. Material properties are generally not measured directly in a validation experiment, while boundary conditions are typically measured. Therefore, in addition to measuring temperature differences, which we will compare to model values, we simulate measuring the boundary condition. By measuring a temperature difference, we only need the heat flux boundary condition for the model.

7 Copyright  2001 by ASME


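As a concrete illustration, the data-generation scheme of Eqs. (22) through (25), together with the flux-measurement error introduced next, can be sketched in a few lines of Python. This is our sketch, not part of the original analysis; NumPy and the seed are assumptions, and the "true" parameter values and noise levels are those stated in the following section.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed is arbitrary

# "True" values, unknown to the analyst, used only to generate the data.
k_true, q0_true, L_true = 15.0, 1500.0, 1.0   # W/m C, W/m^2, m
z = 0.0

def temp_difference(z, k, q0, L):
    """Eq. (22) minus T_L: T(z) - T_L = (q0 L / k)(1 - z/L)."""
    return q0 * L / k * (1.0 - z / L)

n_exp, n_sensors = 10, 4
S = temp_difference(z, k_true, q0_true, L_true)             # Eq. (24): D = S
D_hat = S + rng.normal(0.0, 3.0, size=(n_exp, n_sensors))   # Eq. (25), 4 sensors
q0_hat = q0_true + rng.normal(0.0, 100.0, size=n_exp)       # flux measurement error

print(S)                   # 100.0 (exact, since 1500*1/15 = 100)
print(D_hat.mean(axis=1))  # per-experiment means, analogous to Table 1
```

A different seed produces a different realization of Table 1; the structure, a common true value of 100 °C, independent sensor noise, and one flux measurement per experiment, is the point.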
We simulate errors in the measured heat flux as

    q̂0 = q0 + εq0 ,    (26)

where εq0 is the measurement error in the heat flux.

The following values are used to generate the simulated validation data: k = 15 W/m°C, q0 = 1500 W/m², L = 1 m, and z = 0 m. These parameters are the unknown "true" values for the validation experiment. Random error in the temperature-difference measurement is simulated assuming a normal distribution, εT ~ N(0.0, 3 °C). Measurements are simulated at four sensors with the same error distribution, but independent error. Measurement error in the heat flux is also assumed to be normal, εq0 ~ N(0.0, 100 W/m²). Hence the (simulated) experiments provide measurements of the temperature difference between z = 0 and z = L with four sensors, plus a surface heat-flux measurement. Simulated data for 10 replicate experiments are listed in Table 1. From this point, we do not assume we know the "true" values of the model inputs. We operate as one would in a real validation exercise, using estimates and measurements in our model.

Table 1: Experimental (simulated) data at z = 0. D̂i is the mean of the four sensors, D̂i = (1/4) Σj D̂j,i.

Exp   D̂1,i (°C)  D̂2,i (°C)  D̂3,i (°C)  D̂4,i (°C)  D̂i (°C)   q̂0,i (W/m²)
 1     99.37     103.57      96.65     101.91     100.37    1439.9
 2    101.65      96.70     100.26      93.99      98.15    1450.7
 3    101.39      99.04     103.71      98.11     100.56    1267.5
 4     96.30     103.17      99.66     101.14     100.07    1594.4
 5     93.64      98.07      97.89      96.95      96.63    1481.8
 6    104.56      99.89     103.68      97.91     101.51    1500.8
 7     97.65     101.76      99.25     101.44     100.02    1566.8
 8     99.77     102.67     106.93     101.57     102.73    1498.8
 9    102.74     100.17      96.68     101.46     100.26    1499.5
10     99.17     103.83     105.59      98.43     101.76    1510.3

Model Predictions

The model prediction, Ŝ(z), is the solution of the mathematical equations in Eq. (19) through Eq. (21) with estimated/measured values of the properties and boundary conditions. In this simple example an analytical solution exists, as given in Eq. (23). The parameters for the model are the physical parameters φ̂p = {k̂, L̂}. Typically the values either can be directly measured in the experiment, as is the case for physical dimensions, or come from previous experiments, as is typically the case for material properties. The value for thermal conductivity is (assumed) to be taken from a value estimated in a previous experiment, k̂ = 14 W/m°C. The thickness is (assumed) to be measured in the experiment, L̂ = 1.0 m. The heat flux is measured during the validation experiment and is given in Table 1. Having estimates/measurements for the model inputs, we can evaluate the model for each experiment. These predictions are given in Table 2.

At this point we have pairs of model predictions and experimental measurements that we can compare to quantify the accuracy of the model in representing the physical world (experiment). The error that we quantify by comparing the experimental measurement and model prediction pairs is only as good as the individual accuracy of each value and may provide limited information about the validity of the model. Additional analysis of the model will help indicate what we can learn from comparing the model and experiment.

Table 2: Model predictions for each validation experiment, k̂ = 14 W/m°C, L̂ = 1 m, and q̂0,i from Table 1.

Exp   Ŝ (°C)     Exp   Ŝ (°C)
 1    102.84      6    107.19
 2    103.62      7    111.91
 3     90.53      8    107.05
 4    113.88      9    107.10
 5    105.84     10    107.88

ANALYSIS

Sensitivity Analysis

The data in Table 1 are available to study the accuracy of the mathematical model in Eq. (19) through Eq. (21). An obvious question is: What does a comparison of these data with the model predictions in Table 2 demonstrate about the model? The value of experimental measurements for gaining insight into a model is not intuitively obvious. Potentially, the experimental data are not suitable for assessing certain characteristics of the model. By studying the sensitivity of the model we can learn about the dependence of the model when predicting the experimental data.

If model predictions of the experimental data are insensitive to the model, or equivalently to parameters in the model, little is learned about the validity of the model by comparing with the validation data. It may be possible to conclude that the model is (fortuitously) valid, when in fact good agreement is achieved because there is a lack of dependence on the model. Likewise, if
model predictions depend more on (measured) boundary conditions than on the model itself, the data may be of limited utility for validation. The scaled sensitivity coefficient provides a quantitative measure of the dependence of model predictions on model parameters and boundary conditions.

Sensitivities of the model predictions can be derived by taking the partial derivative of Eq. (23) with respect to each model input:

    k̂ ∂Ŝ/∂k̂ = -Ŝ(z) = -(q̂0 L̂ / k̂)(1 - z/L̂)    (27)

    q̂0 ∂Ŝ/∂q̂0 = Ŝ(z)    (28)

    L̂ ∂Ŝ/∂L̂ = q̂0 L̂ / k̂ = Ŝ(0) .    (29)

With such a simple model we can assess the model's dependence without plotting the sensitivities. For more complex models, plots of the sensitivity variables would be studied.

The sensitivities indicate that the model at z = 0 has equal dependence on the thermal conductivity, heat flux, and thickness, with magnitudes of the scaled sensitivities comparable to the magnitude of the model. A p% change in any one of these will have the same effect on the model prediction. Model predictions will have an appreciable dependence on Fourier's law, as indicated by the sensitivity to thermal conductivity. The model is equally sensitive to heat flux and thickness. This is an important point: the model's dependence on the measured boundary flux and thickness is as important as its dependence on thermal conductivity. Hence, to assess the adequacy of Fourier's law we will need accurate measurements of heat flux and thickness. Notice also that validating Fourier's law requires an accurate value of the model parameter thermal conductivity.

Uncertainty Analysis

Three inputs are required for the model, φ̂p = {k̂, L̂} and x̂ = {q̂0}. Sensitivity analysis has demonstrated that the model prediction depends equally on the three model inputs at z = 0. We assume that the physical dimension, L̂, can be measured accurately enough to neglect error in its value. The values used for thermal conductivity and measured heat flux are expected to have error. In most cases we must rely on analysis of previous experimental data or expert opinion to estimate the error in physical properties. Repeat experiments, as in this example, can aid in quantifying the error in measured boundary conditions.

Table 3: Parameter uncertainty/variability

Parameter φp or BC x   Source                   Mean estimate   Std error estimate   True value (φp, x)
k (W/m°C)              Previous experiment      14              1                    15
q0 (W/m²)              Measured in validation   (Table 1)       90                   1500
                       experiment
L (m)                  Assumed known            1               0                    1

We will use the estimates listed in Table 3 for the parameters/boundary conditions in the model predictions. The standard deviation of the thermal conductivity is 1.0 W/m°C, meaning we expect the "true" value for this experiment to be within about 15% of the mean, assuming a probability of 95% and a normal distribution. We select a standard deviation of the heat flux based on the variability of the 10 experiments, which is roughly 90 W/m². We neglect error in the thickness, and because we are predicting a temperature difference the model does not require the boundary temperature.

The model predictions listed in Table 2 use the mean parameters listed in Table 3. The accuracy of the model depends on the accuracy of the estimated values (φ̂p, x̂) in representing (φp, x) for the experiment. We note that the true values for material properties, φp, usually vary from sample to sample. The experiment represents a single realization from a random distribution of the material properties. We represent the material properties in the model with values taken from a previous experiment, which are corrupted with estimation error from the process used to measure the material property.

The effect of error, or potential error, in our estimates k̂ and q̂0 on the model predictions can be estimated. Using a Taylor series we expand about the true, but unknown, values,

    Ŝi = S + (∂Ŝi/∂k̂)(k̂ - k) + (∂Ŝi/∂q̂0)(q̂0,i - q0,i) ,    (30)

where the derivatives are evaluated at the estimated values and, recall, Ŝ = S(k̂, q̂0). With this linear approximation we can estimate the effect of the errors (k̂ - k, q̂0,i - q0,i) on (Ŝi - S). Taking the expected value and variance of Eq. (30), assuming uncorrelated errors in the parameters, gives

    E[Ŝi - S] = E[k̂ - k] ∂Ŝi/∂k̂ + E[q̂0,i - q0,i] ∂Ŝi/∂q̂0 ,    (31)

    σ̂Si² = E[(Ŝi - S)²] = E[(k̂ - k)²](∂Ŝi/∂k̂)² + E[(q̂0,i - q0,i)²](∂Ŝi/∂q̂0)²
         = (σ̂k/k̂)² Ŝi² + (σ̂q0/q̂0,i)² Ŝi² .    (32)
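The propagation in Eq. (32) reduces to simple arithmetic for this model and can be checked directly. The sketch below is ours, not the paper's; note that Table 3 lists an estimated flux standard error of 90 W/m², while the Table 4 values quoted later appear to be computed with the 100 W/m² generating value, so we use 100 here for comparability.

```python
import numpy as np

k_hat, sig_k = 14.0, 1.0   # W/m C: conductivity estimate and its standard error
sig_q = 100.0              # W/m^2: flux standard deviation (see note above)

# Measured fluxes from Table 1; predictions S_hat = q0_hat * L_hat / k_hat at z = 0.
q0_hat = np.array([1439.9, 1450.7, 1267.5, 1594.4, 1481.8,
                   1500.8, 1566.8, 1498.8, 1499.5, 1510.3])
S_hat = q0_hat / k_hat     # L_hat = 1 m

# Eq. (32): sigma_Si = S_hat_i * sqrt((sig_k/k_hat)^2 + (sig_q/q0_hat_i)^2)
sig_S = S_hat * np.sqrt((sig_k / k_hat) ** 2 + (sig_q / q0_hat) ** 2)
print(np.round(sig_S, 2))  # about 10 C each; compare the diagonal of Table 4
```

The first value comes out near 10.2 °C, matching the first diagonal entry of Table 4.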
A plot of the model predictions, with uncertainty bars calculated as two times the standard deviation due to input uncertainty, is shown in Fig. 1. These uncertainty limits demonstrate the range within which we would expect the "true" value of our model to reside if the "true" values k and q0 are bounded by the standard errors given in Table 3. Given that we have used estimates of these values, the applicability of the range shown to contain the "true" value is conditioned upon the accuracy of the estimates and the standard errors.

[Figure 1. Model prediction Ŝ with 2σ uncertainty bars, plotted versus experiment number.]

Consider predicting multiple experiments, say experiments i and j. The covariance matrix for model predictions of the two experiments is

    cov(Ŝi - S, Ŝj - S) = E[(Ŝi - S)(Ŝj - S)]
        = ρkk σ̂k σ̂k (∂Ŝi/∂k̂)(∂Ŝj/∂k̂) + ρkq0 σ̂k σ̂q0,j (∂Ŝi/∂k̂)(∂Ŝj/∂q̂0)
        + ρq0k σ̂q0,i σ̂k (∂Ŝi/∂q̂0)(∂Ŝj/∂k̂) + ρq0q0 σ̂q0,i σ̂q0,j (∂Ŝi/∂q̂0)(∂Ŝj/∂q̂0) .    (33)

In Eq. (33) a correlation coefficient has been defined,

    ρkq0 ≡ E[(k̂ - k)(q̂0,j - q0,j)] / (σ̂k σ̂q0,j) ,    (34)

which represents the relationship between the error k̂ - k of experiment i and the error q̂0,j - q0,j of experiment j. In most cases we would not expect the errors in model inputs to be related. When we have a model input that is a constant for each experiment, but unknown, the error due to not knowing the constant affects the prediction for all experiments. In our example, if the estimated thermal conductivity is too high, it will impact all model predictions in a similar manner. Consequently, the error in thermal conductivity is correlated between experiments, ρkk = 1 in Eq. (33). There is no reason to believe the errors in the other model inputs are correlated, ρkq0 = ρq0k = ρq0q0 = 0 (i ≠ j). We can simplify Eq. (33) to

    VS ≡ cov(Ŝi - S, Ŝj - S) =  Ŝi² (σ̂k/k̂)² + Ŝi² (σ̂q0/q̂0,i)²    (i = j)
                                 Ŝi Ŝj (σ̂k/k̂)²                    (i ≠ j) .    (35)

The values for the covariance matrix of the ten experiments are listed in Table 4. Note that the σ̂Si are the values along the diagonal of VS. Since the model has (assumed) independent measurements of heat flux, there are no off-diagonal contributions from the heat flux, as shown in Eq. (35). The model predictions of the 10 experiments are seen to be correlated,

    ρS1,S2 = Cov(Ŝ1 - S, Ŝ2 - S) / (σ̂S1 σ̂S2) = (7.37)² / ((10.24)(10.28)) = 0.52 .    (36)

Table 4: Covariance matrix of model predictions, VS (entries are square roots of the covariances, in °C)

Exp    1      2      3      4      5      6      7      8      9      10
 1   10.24   7.37   6.89   7.73   7.45   7.50   7.66   7.49   7.49   7.52
 2    7.37  10.28   6.91   7.75   7.48   7.52   7.69   7.52   7.52   7.55
 3    6.89   6.91   9.63   7.25   6.99   7.03   7.18   7.03   7.03   7.05
 4    7.73   7.75   7.25  10.82   7.84   7.89   8.06   7.72   7.88   7.91
 5    7.45   7.48   6.99   7.84  10.40   7.60   7.77   7.60   7.60   7.63
 6    7.50   7.52   7.03   7.89   7.60  10.47   7.82   7.65   7.65   7.68
 7    7.66   7.69   7.18   8.06   7.77   7.82  10.72   7.81   7.82   7.84
 8    7.49   7.52   7.02   7.88   7.60   7.65   7.81  10.46   7.64   7.67
 9    7.49   7.52   7.03   7.88   7.60   7.65   7.82   7.64  10.46   7.67
10    7.52   7.55   7.05   7.91   7.63   7.68   7.84   7.67   7.67  10.50

The correlation is due to the dependence of the model predictions on thermal conductivity and the fact that all experiments have the same thermal conductivity error, k̂ - k. Hence, an error in thermal conductivity will affect all model predictions in the same manner.
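The structure of Eq. (35), a fully correlated conductivity contribution plus an independent flux contribution on the diagonal, can be assembled directly (our sketch, again with the 100 W/m² flux standard deviation):

```python
import numpy as np

k_hat, sig_k, sig_q = 14.0, 1.0, 100.0
q0_hat = np.array([1439.9, 1450.7, 1267.5, 1594.4, 1481.8,
                   1500.8, 1566.8, 1498.8, 1499.5, 1510.3])
S_hat = q0_hat / k_hat   # predictions at z = 0 with L_hat = 1 m

# Eq. (35): the shared conductivity error fills the whole matrix, while the
# independent flux-measurement errors contribute only on the diagonal.
V_S = np.outer(S_hat, S_hat) * (sig_k / k_hat) ** 2
V_S += np.diag(S_hat ** 2 * (sig_q / q0_hat) ** 2)

# Eq. (36): correlation between the predictions of experiments 1 and 2.
rho_12 = V_S[0, 1] / np.sqrt(V_S[0, 0] * V_S[1, 1])
print(round(rho_12, 2))  # 0.52, as in Eq. (36)
```

Table 4 tabulates the square roots of these entries in °C; for example, the square root of the (1, 2) entry is about 7.37.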
Statistical Analysis of Validation Data

Establishing error estimates for the experimental measurements is a difficult task. Repeat experiments and/or redundant measurements, as in this example, provide data to quantify the error in the measurements, and combined with statistical analysis they are an important avenue to quantifying the measurement error in the experimental data. A word of caution: repeated experiments may not be independent, measurement error may be correlated, and measurements may be biased. For our simulated experiment we have both redundant measurements and repeated experiments. In general, both provide valuable insight into the measurement error. The measured temperature data (Table 1) are plotted in Fig. 2. Each experiment has 4 measurements, and there are 10 experiments with nominally the same conditions. In this example, the mean and standard deviation (of the mean) for the four measurements from each experiment are given in Table 5.

[Figure 2. Experimental (validation) temperature measurements: the four sensor readings and the mean value for each of the 10 experiments.]

Table 5: Experimental mean and standard deviation (of the mean) for each validation experiment

Exp   D̂i (°C)   σ̂Di (°C)     Exp   D̂i (°C)   σ̂Di (°C)
 1    100.37    1.51          6    101.51    1.57
 2     98.14    1.73          7    100.02    0.96
 3    100.55    1.25          8    102.73    1.52
 4    100.06    1.44          9    100.26    1.30
 5     96.63    1.02         10    101.75    1.74

The analysis and text, with minor editing, that follows was provided by Easterling [7]. Since the experimental data were generated in an experiment-by-sensor arrangement, first consider an analysis of variance of the temperature measurements according to this structure. The analysis of variance partitions the total variation of the Table 1 measured temperatures by potential sources of that variation. The sources in this case are experiments and sensors. Table 6 gives the resulting two-way analysis of variance [8] for the temperature measurements. This table shows no significant source of variation among either experiments or sensors. Thus, the data can be regarded as 40 random measurements of the same thing, which indeed is how the data were generated. The temperature measurement error has a standard deviation estimated by the square root of the total mean square, namely 3.02 °C, which is essentially equal to the sigma value of 3.0 °C used to generate the data. This estimate will be used in the following statistical analysis of the residuals.

Table 6: 2-way ANOVA for experimental data

Source of variation   dof   SS       MS      F      P
Total                 39    356.69    9.15
Experiments            9    110.74   12.30   1.49   0.20
Sensors                3     23.19    7.73   0.94   ~0.5
Residual              27    222.76    8.25

When data are not available for statistical analysis, at a minimum one should have an estimate of what confidence to give the experimental data. This may be an expert's opinion of the accuracy of the data or a (crude) estimate. An opinion/estimate is better than assuming the data are without error, because an estimate sets a level to which we believe the data; below this level we may have reason to be suspicious of the data's accuracy. Relying on expert opinion is more qualitative than quantitative. However, it forces us to consider the accuracy of the data and prescribes the confidence we put in the data for the validation.

COMPARISON BETWEEN EXPERIMENT AND MODEL

The steps to this point have been leading to a comparison of the experiment and model. The preceding steps are aimed at quantifying or estimating the effect of the inherent errors that we know are a part of the validation process on the experiment and model. The inherent errors and their effect on the model and experiment would appear to be a reasonable context in which to judge the residuals. The magnitude and correlation structure of the errors can be compared to the residuals to establish whether there is evidence to suggest that inherent errors cannot be the source of the residuals.
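The two-way analysis of variance reported in Table 6 can be reproduced directly from the Table 1 measurements. The sketch below is ours, using the standard partitioning for a two-way layout without replication:

```python
import numpy as np

# Temperature-difference measurements from Table 1 (10 experiments x 4 sensors).
data = np.array([
    [ 99.37, 103.57,  96.65, 101.91],
    [101.65,  96.70, 100.26,  93.99],
    [101.39,  99.04, 103.71,  98.11],
    [ 96.30, 103.17,  99.66, 101.14],
    [ 93.64,  98.07,  97.89,  96.95],
    [104.56,  99.89, 103.68,  97.91],
    [ 97.65, 101.76,  99.25, 101.44],
    [ 99.77, 102.67, 106.93, 101.57],
    [102.74, 100.17,  96.68, 101.46],
    [ 99.17, 103.83, 105.59,  98.43],
])
n_exp, n_sen = data.shape
grand = data.mean()

# Partition the total sum of squares into experiment, sensor, and residual parts.
ss_total = ((data - grand) ** 2).sum()
ss_exp = n_sen * ((data.mean(axis=1) - grand) ** 2).sum()
ss_sen = n_exp * ((data.mean(axis=0) - grand) ** 2).sum()
ss_res = ss_total - ss_exp - ss_sen

ms_res = ss_res / ((n_exp - 1) * (n_sen - 1))
f_exp = (ss_exp / (n_exp - 1)) / ms_res   # F statistic for experiments
f_sen = (ss_sen / (n_sen - 1)) / ms_res   # F statistic for sensors
print(round(ss_total, 2), round(f_exp, 2), round(f_sen, 2))
```

The sums of squares and F statistics agree with Table 6 to within the rounding of the two-decimal data, and neither F value is significant, supporting the treatment of the data as 40 random measurements of the same quantity.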
Figure 3 plots the model predictions from Table 2 and the experimental measurements from Table 5, with error bars approximated as two times the estimated standard deviation (values from the diagonal of Table 4 for the model and from Table 5 for the experiment). Figure 3 indicates that the differences between the model and experiment are within the estimated error bounds of the model, but not within the error bounds of the experiment. If errors are normally distributed, we would expect them to be within 2 standard deviations 95% of the time. A noticeable trend is that all experimental measurements are less than the corresponding model predictions except for experiment 3. We have assumed the experimental errors are independent (and in fact for this simulated case we know they are). It is possible that the model has correlated errors between experiments, which could account for the bias. For example, if the conductivity value is too high or too low it would impact all experiments and bias the difference. The covariance matrix for the model estimates this effect.

[Figure 3. Comparison of experiment and model values with 2σ uncertainty bars on each.]

Table 7: Residual and total uncertainty assuming independence

Exp   êi (°C)   êi/D̂i (%)   σ̂T,i (°C)     Exp   êi (°C)   êi/D̂i (%)   σ̂T,i (°C)
 1     -2.47     -2.46       10.35          6     -5.68     -5.60       10.58
 2     -5.47     -5.57       10.43          7    -11.89    -11.88       10.76
 3     10.03      9.96        9.71          8     -4.32     -4.21       10.57
 4    -13.81    -13.81       10.92          9     -6.84     -6.82       10.54
 5     -9.21     -9.52       10.45         10     -6.12     -6.02       10.65

The residuals are an indication of the accuracy of the model in representing this validation experiment. Table 7 tabulates the residual, the normalized residual, and the univariate total error of Eq. (16) for each experiment. We have acknowledged that there are (inherent) errors in the model and experiment due to lack of knowledge about the experiment and measurement error. Through sensitivity analysis, uncertainty propagation, and statistical analysis we have tried to identify and estimate the effect of these errors. Next consider quantifying the errors.

Error Model Estimation

The analysis and text, with minor editing, that follows was provided by Easterling [7]. Summary statistics for the residual data are: µ̂e = -5.58 °C and σ̂e = 6.48 °C. It is fairly obvious from the fact that nine of the ten residuals are negative that the model is biased. A t-test for bias (Eq. (11)) yields a value of T = -2.72, which is significant at about the 0.02 level (two-tailed), so there is fairly strong evidence of bias. There is only about a 2 percent chance of getting a t-value this large or larger in absolute value just by chance if the model prediction were unbiased. We know from the way that the data were generated that this bias is due to the biased estimate of k that was used in the model predictions. Let us see if the analysis leads us to that conclusion.

The standard deviation of the residuals has contributions due to measurement error, both in the experiment's temperature measurements and in the flux measurements. To measure "true" prediction error we need to remove these extraneous sources of variation. That is, we want to make predictions for flux values that are specified, say by system requirements, rather than estimated, and we want to predict actual temperature, not measured temperature. The decomposition given in Eq. (6) provides a basis for removing measurement-error-related sources of variation. In Eq. (6), D̂ - D is the difference between the average measured temperature for an experiment and the actual temperature, Ŝ - S is the difference between the model calculation at the measured flux and at the actual flux, and D - S, which was called the prediction error, is the difference between the actual experimental temperature and the predicted temperature based on the actual flux. Thus, in words, Eq. (6) says that the residual is the sum of measurement error, model error derived from measurement error of experimental parameters, and prediction error. These terms cannot be separated individually, because D and S are not known, but they can be separated statistically by their variances.

If we assume that temperature measurement errors are independent of flux measurement errors and of prediction errors, which is generally plausible and which we know to be the case in this example, then Eq. (6) leads to the relationship

    var(D̂ - Ŝ) = var(D̂ - D) + var(Ŝ - S) + var(D - S) ,    (37)

where var(.) denotes the variance of the parenthetic quantity. The square of the standard deviation of the residuals gives an estimate of the left-hand variance. The experimental data and flux measurements provide the means of estimating the first two right-hand variances. By subtraction, the variance of the "true" prediction error, which is the third right-hand variance in Eq. (37), can then be estimated.
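Numerically, the subtraction implied by Eq. (37) is simple bookkeeping. The sketch below is ours, using the standard-deviation estimates derived in the next few paragraphs:

```python
import numpy as np

# Estimated standard deviations, in degrees C:
sig_resid = 6.48            # std dev of the ten residuals D_hat - S_hat
sig_D = 3.02 / np.sqrt(4)   # temperature error of a four-sensor average: 1.51
sig_S = 88.4 / 14.0         # flux error propagated through S = q0*L/k: about 6.31

# Eq. (37) solved for the "true" prediction-error variance:
var_pred = sig_resid ** 2 - sig_D ** 2 - sig_S ** 2
print(round(var_pred, 2))   # -0.16, essentially zero
```

A small negative estimate is truncated to zero in practice; measurement error accounts for all of the observed residual variance, which is the conclusion reached below.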

The analysis of variance of the Table 1 data indicated that the observed variability of D̂ was all due to measurement error. This analysis provided an estimated temperature measurement error standard deviation of 3.02 °C for individual temperature measurements. Because D̂ is the average of four measurements, its standard deviation is estimated by 3.02/√4 = 1.51 °C.

The variance of Ŝ - S over the 10 experiments is due to flux measurement error. Note that because the same thermal conductivity estimate, k̂ = 14 W/m°C, was used in all the Ŝ values, its estimation error contributes only to bias in the observed prediction errors, not to their variance. Thus, it is inappropriate to include any variance component associated with thermal conductivity estimation error at this stage of the analysis.

The fact that the analysis of variance of the experimental data showed no experiment-to-experiment variability is an indication that the actual flux was constant over the 10 experiments, which indeed was the case. Thus the variability of the measured flux values is due strictly to flux measurement error, not to actual flux variation across experiments. If that were not the case, we would have to find some way to separate actual flux variation and flux measurement error. Summary statistics for the measured flux values are: µ̂q0 = 1481.1 W/m² and σ̂q0 = 88.4 W/m². Because Ŝ = q̂0 L/k̂, the standard deviation of Ŝ - S is estimated by σ̂S = 88.4/14 = 6.31 °C.

We thus have the following estimated standard deviations: σ̂e = 6.48 °C, σ̂D = 1.51 °C, and σ̂S = 6.31 °C. The root sum of squares of the two measurement-error contributors to the variance of the residuals is √(6.31² + 1.51²) = 6.49 °C, which is essentially equal to the standard deviation of the observed prediction errors, 6.48 °C. Thus, measurement error accounts for all of the observed variability. There is no evidence in these data of extraneous, extra-model, unmodeled physics contributing to the variability of the residuals. That is, this analysis leads to the conclusion that var(D - S) = 0. This finding captures the way that the data and predictions were in fact generated: prediction error in this example is all bias, no variance.

A finding of bias should lead to an analysis aimed at determining the source of the bias and a means of eliminating it in subsequent predictions. In this case, as is clear from Eq. (23), bias in the temperature data, the measured fluxes, and the estimated thermal conductivity could all contribute to the observed bias in the residuals, as could an erroneous mathematical model. An infinite combination of biases in all of these potential sources (exempting L from our investigation) could result in the observed bias. These potential sources cannot be separated on the data alone; additional information is needed.

Additional analyses to estimate an updated value of thermal conductivity, calculate an estimation error, and make subsequent predictions with that value can be performed [11].

Consistency Testing

Assuming the experiment and model are independent, the covariance matrix for the residuals is given in Eq. (14). The experimental error is assumed independent for each experiment, and VD has the values of σ̂Di² from Table 5 along its diagonal. The covariances of the model predictions are not independent, and VS is given in Table 4. We can evaluate Eq. (13) and Eq. (18) to test the consistency of the residuals in Table 7 with the (estimated) total, model, and experimental errors, as given by VT, VS, and VD, respectively.

Table 8: Consistency test

Covariance matrix   r² (°C²)   r²crit (°C²)          Pr
VT                    7.02     18.3 (95%, 10 dof)    0.72
VS                    7.23                           0.70
VD                  459.6                            0

Values of r² calculated using the different error (covariance) matrices are listed in Table 8. By comparing the value of r² with the critical value, we have statistical evidence as to whether the residuals are statistically significant relative to the error in the model and experiment. For the total error, VT, or the error in the model, VS, we see that the value is less than the critical value. Hence we do not have statistical evidence to reject the residuals as being consistent with these errors. The outcome is different when we only account for error in the experiment: we would reject the residuals as being consistent with the experimental error alone, because the value exceeds the critical value.

Discussion

The two metrics studied in this example are aimed at answering different questions. Error model estimation uses the residuals to quantify the accuracy of the model for predicting the experiment [7]. Consistency testing [9] (statistically) compares the residuals to an independent estimate of the covariance of the residuals, based on uncertainty/variability in the model and experiment, to test for evidence that the prediction error is nonzero.
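The consistency statistic of Table 8 can be sketched as a residual-weighted quadratic form, r² = êᵀ V⁻¹ ê, compared against a chi-square critical value. This is our reconstruction in the spirit of [9] (the paper's Eqs. (13) and (18) are not reproduced in this section), using the residuals of Table 7, the experimental standard deviations of Table 5, and the model covariance of Eq. (35) with the 100 W/m² flux standard deviation; treat the numbers as illustrative.

```python
import numpy as np

# Residuals e_i = D_hat_i - S_hat_i from Table 7, in degrees C.
e = np.array([-2.47, -5.47, 10.03, -13.81, -9.21,
              -5.68, -11.89, -4.32, -6.84, -6.12])

k_hat, sig_k, sig_q = 14.0, 1.0, 100.0
q0_hat = np.array([1439.9, 1450.7, 1267.5, 1594.4, 1481.8,
                   1500.8, 1566.8, 1498.8, 1499.5, 1510.3])
S_hat = q0_hat / k_hat

# Model covariance V_S (Eq. (35)) and experimental covariance V_D (Table 5).
V_S = np.outer(S_hat, S_hat) * (sig_k / k_hat) ** 2 \
    + np.diag(S_hat ** 2 * (sig_q / q0_hat) ** 2)
sig_D = np.array([1.51, 1.73, 1.25, 1.44, 1.02, 1.57, 0.96, 1.52, 1.30, 1.74])
V_D = np.diag(sig_D ** 2)

V_T = V_S + V_D
r2 = e @ np.linalg.solve(V_T, e)   # consistency statistic against total error
print(round(r2, 2), r2 < 18.3)     # about 7, well below the 18.3 critical value
```

The result lands near the 7.02 reported for the VT row of Table 8. Repeating the calculation with V_D alone produces a much larger statistic (cf. 459.6 in Table 8): the correlated, much larger model uncertainty is what makes the biased residuals statistically unsurprising.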

The error model analysis indicated:

• a residual error with mean -5.58 °C and standard deviation 6.48 °C;
• fairly strong evidence of bias;
• that the variability in the residuals is accounted for by variability due to the experimental measurements, which impacts the model prediction.

The consistency test indicated:

• that the residuals are consistent with the estimated covariance matrix (rigorously, that there was no evidence to reject the residuals as being consistent with the estimated covariance matrix). Hence there is no evidence that the prediction error is nonzero.

The intent of this paper was not a rigorous comparison of validation metrics. Nevertheless, comments are given concerning the two metrics studied. Both metrics correctly conclude that the data do not indicate a significant prediction error. The error model separates the variance components to conclude that the prediction error does not contribute to the residual variance, and then indicates evidence of a bias. The consistency test does not explicitly indicate a bias, but indicates that the biased residuals are consistent (or not inconsistent) with the estimated covariance matrix.

SUMMARY AND CONCLUSIONS

A process to quantify the accuracy of mathematical models in representing the physical world (experiments) was presented. The basic process and associated issues were discussed, and tools to aid in the process, such as sensitivity analysis, uncertainty analysis, and statistical analysis, were presented. A simple example problem using simulated experimental data was used for demonstration. Sensitivity analysis, uncertainty analysis, statistical analysis, and two validation metrics were applied to the demonstration problem.

A process to quantify the accuracy of a mathematical model by comparison with an experiment will evolve as it is applied over time. The process, however, should involve using the mathematical model, leveraging sensitivity and uncertainty propagation techniques, and statistical methods.

ACKNOWLEDGEMENTS

The author acknowledges Ben Blackwell, Rob Easterling, and Tom Paez, all from Sandia National Laboratories (SNL), for carefully reviewing the paper and providing suggestions for improvement. Thanks to Brian Rutherford (SNL) and Rich Hills from New Mexico State University for helpful discussions concerning data in the paper. Special thanks to Rob Easterling for providing the statistical analysis included in the paper. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under Contract DE-AC04-94AL85000.

REFERENCES

[1] Beck, J. V. and Arnold, K. J., 1977, Parameter Estimation in Engineering and Science, Wiley, New York.

[2] Kurpisz, K. and Nowak, A. J., 1995, Inverse Thermal Problems, Computational Mechanics Publications, Southampton, U.K.

[3] Trucano, T. G., Easterling, R. G., Dowding, K. J., Paez, T. L., Urbina, A., Romero, V. J., Rutherford, B. M., and Hills, R. G., 2001, "Description of the Sandia Validation Metrics Project," Sandia National Laboratories, Report SAND2001-1339, Albuquerque, NM.

[4] Blackwell, B. F., Cochran, R. J., and Dowding, K. J., 1999, "Development and Implementation of Sensitivity Coefficient Equations for Heat Conduction Problems," Numerical Heat Transfer, Part B, Vol. 36, No. 1, pp. 15-32.

[5] Coleman, H. W. and Steele, W. G., 1989, Experimentation and Uncertainty Analysis for Engineers, Wiley, New York.

[6] Red-Horse, J., Paez, T. L., Field, R. V., Jr., and Romero, V., 2000, "Nondeterministic Analysis of Mechanical Systems," Sandia National Laboratories, Albuquerque, NM, Report SAND2000-0890.

[7] Easterling, R. G., 2001, "Measuring the Predictive Capability of Computational Models: Principles and Methods, Issues and Illustrations," Sandia National Laboratories, Albuquerque, NM, Report SAND2001-0243.

[8] Bethea, R. M., Duran, B. S., and Boullion, T. L., 1995, Statistical Methods for Engineers and Scientists, Third edition, Marcel Dekker, New York, NY.

[9] Hills, R. G. and Trucano, T. G., 1999, "Statistical Validation of Engineering and Scientific Models: Background," Sandia National Laboratories, Report SAND99-1256, Albuquerque, NM.

[10] Coleman, H. W. and Stern, F., 1997, "Uncertainties and CFD Code Validation," Journal of Fluids Engineering, Vol. 119, pp. 795-803.

[11] Easterling, R. G., 2001, "Statistical Analysis of Predictive Capability," internal memorandum to Kevin Dowding, dated July 5, 2001, Sandia National Laboratories, Albuquerque, NM.