Statistics 622: Calibration
Module 4
Contents
OVERVIEW ............................................................ 3
METHODOLOGY ......................................................... 3
INTRODUCTION ........................................................ 4
CALIBRATION ......................................................... 4
ANOTHER REASON FOR CALIBRATION ...................................... 6
ROLE OF CALIBRATION IN REGRESSION ................................... 6
CHECKING THE CALIBRATION OF A REGRESSION ............................ 7
CALIBRATION IN SIMPLE REGRESSION (DISPLAY.JMP) ...................... 8
TESTING FOR A LACK OF CALIBRATION ................................... 10
CALIBRATION PLOT .................................................... 12
CHECKING THE CALIBRATION ............................................ 15
CHECKING CALIBRATION IN MULTIPLE REGRESSION ......................... 17
CALIBRATING A MODEL ................................................. 20
PREDICTING WITH A SMOOTHING SPLINE .................................. 21
BUNDLING THE CALIBRATION INTO ONE REGRESSION ........................ 22
OTHER FORMS OF CALIBRATION .......................................... 25
DISCUSSION .......................................................... 25
Prediction lies at the heart of statistical modeling. Statistical models extrapolate the data and conditions that we've observed into new situations and forecast results in future time periods. Ideally, these predictions are right on target, but that's wishful thinking. Since predictions only estimate what's going to happen, at least we can be sure that they are right on average. For example, suppose we're trying to predict how well sales representatives perform in the field based on their success in a training program. People vary widely, so we cannot expect to predict exactly how each will perform; too many other random factors come into play.

We should, though, be right on average. Suppose our model predicts the sales volumes generated weekly by sales reps. Among those predicted to book, say, $10,000 in sales next week, we ought to demand that the average sales of these reps is in fact $10,000. At least then the total predicted sales volume will come close to the actual total, even if we under- or over-predict the sales of individuals.

That's what calibration is all about: being right on average. Calibration does not ask much of a model; it's a minimal but essential requirement. Models that are not calibrated, even if they have a high R2, are misleading and miss the opportunity to be better. There's no excuse for using an uncalibrated model because it's a problem that's easily fixed, once you know to look for it.
Statistics 622
4-2
Fall 2011
Overview
This lecture introduces calibration and methods for checking whether a model is calibrated.

Calibration
Calibration is a fundamental property of any predictive model. If a model is not calibrated, its predictions are systematically biased under some conditions, too high or too low on average, and lead to poor decisions. The test for calibration uses the plot of Y on the fitted values. It is similar to checking for a nonlinear pattern in the scatterplot of Y versus X when choosing a transformation.

Outline
Simple regression provides the initial setting. We'll calibrate a simple regression in two ways. The first is more familiar, but won't work in multiple regression. The second generalizes to multiple regression. Then we'll move to multiple regression. The second calibration method used in simple regression works fine in multiple regression.
Methodology
Smoothing spline
Smoothing splines fit smooth, gradually changing trends that may not be linear. Calibrating a model using splines improves its predictions, albeit without explaining what was wrong in the original model.
Introduction
Better models produce better predictions in several ways:
The better the model, the more closely its predictions track the average of the response.
The better the model, the more precisely its predictions match the response (e.g., smaller prediction intervals).
The better the model, the more likely it is that the prediction errors are normally distributed (as assumed by the MRM).

Consequences of systematic errors
To be right on average is critical. Unless predictions from a model are right on average, the model cannot be used for economic decisions. Calibration is about being right on average.
A high R2 does not imply calibration. Calibration is neither a consequence of nor a precursor to a large R2. A model with R2 = 0.05 may be calibrated, and a model with R2 = 0.999 need not be calibrated.
In simple regression, calibration is related to the choice of transformations that capture nonlinear relationships.
Calibration
Definition
A model is calibrated if its predictions are correct on average:
ave(Response | Predicted value) = Predicted value,
or in symbols, E(Y | Yhat) = Yhat.
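The definition can be checked numerically by grouping cases with similar predicted values and comparing each group's average response to its average prediction. Here is a minimal sketch in Python on simulated data (not part of the course's JMP workflow; the curved "true" relation is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a response that is curved in x, then fit a straight line to it.
n = 500
x = rng.uniform(0, 10, n)
y = 20 * np.sqrt(x) + rng.normal(0, 3, n)   # the true relation is curved
b1, b0 = np.polyfit(x, y, 1)                # a miscalibrated linear fit
pred = b0 + b1 * x

# Group cases by predicted value; a calibrated model has
# mean(y) close to mean(pred) within every group, not just overall.
bins = np.quantile(pred, np.linspace(0, 1, 11))
which = np.clip(np.digitize(pred, bins) - 1, 0, 9)
for g in range(10):
    m = which == g
    print(f"bin {g}: mean pred = {pred[m].mean():6.2f}, mean y = {y[m].mean():6.2f}")
```

The overall averages of `pred` and `y` agree exactly (least squares guarantees that), but within the extreme bins the linear fit is off by several units: right on average overall, yet not calibrated.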
Calibration in Simple Regression (display.jmp)
[Figure: scatterplot of Sales versus Display Feet with the linear fit.]
The linear fit misses the average amount sold for each
amount of shelf space devoted to its display. The objective
is to identify a smooth curve that captures the relationship
between the display feet and the amount sold at the stores.
Smoothing spline
We can show the average sales for each number of display feet by adding a smoothing spline to the plot.2
A spline is a smooth curve that connects the average values of Y over subsets of cases identified by intervals of adjacent values of X.
1 This example begins in BAUR on page 12 and shows up several times in that casebook. This example illustrates the problem of resource allocation: to display more of one item requires showing less of other items. The resource is the limited shelf space in stores.
2 In the Fit Y by X view of the data, choose the Fit Spline option from the tools revealed by the red triangle in the upper left corner of the window. Pick the flexible option.
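The idea behind the spline, averaging Y over adjacent X values, can be sketched directly. With replicated X values like the display data, the simplest version is the mean of Y at each X (the numbers below are hypothetical, not the actual display.jmp data):

```python
from collections import defaultdict

# Hypothetical (display feet, sales) pairs; the real data are in display.jmp.
data = [(1, 70), (1, 105), (2, 120), (2, 145), (3, 190), (3, 155),
        (4, 200), (4, 230), (5, 215), (5, 245), (6, 235), (6, 225)]

groups = defaultdict(list)
for feet, sales in data:
    groups[feet].append(sales)

# The crude "spline" here is just the sequence of group means over adjacent X values.
means = {feet: sum(v) / len(v) for feet, v in sorted(groups.items())}
for feet, avg in means.items():
    print(f"{feet} ft: average sales = {avg:.1f}")
```

A real smoothing spline interpolates smoothly between such averages and lets a slider (lambda in JMP) control how closely the curve chases them.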
[Figure: scatterplot of Sales versus Display Feet with the smoothing spline added.]
The spline tool is easy to control. Use the slider to vary the smoothness of the curve shown in the plot. See the BAUR casebook, p. 13, for further discussion of splines and these data.
4 Replications (several cases with matching values of all of the Xs) are a great asset if your data has them. Usually, they only appear in simple regression because there are too many unique combinations of predictors in multiple regression. Because the display data has replications, JMP's Lack of Fit test checks for calibration. See the BAUR casebook, page 247.
To add a polynomial to a scatterplot, click the red triangle in the upper left corner of the output
window and pick Fit Polynomial. To match the spline often requires 5 or 6 powers.
[Figure: scatterplot of Sales versus Display Feet with the 5th-degree polynomial fit.]

Summary of Fit
RSquare                      0.8559
Root Mean Square Error      38.22968

Parameter Estimates
Term                        Estimate     Std Error   t Ratio   Prob>|t|
Intercept                  109.52749     70.2328       1.56     0.1266
Display Feet                37.9317      15.51394      2.45     0.0189
(Display Feet-4.40426)^2    12.189222     9.094144     1.34     0.1875
(Display Feet-4.40426)^3    -2.36853      6.066507    -0.39     0.6982
(Display Feet-4.40426)^4    -2.070732     1.236305    -1.67     0.1016
(Display Feet-4.40426)^5     0.108778     0.560066     0.19     0.8470
Partial F-test
Do not use the t-statistics to assess the polynomial because of collinearity (even with centering). We're not interpreting the coefficients; all we want to learn is whether R2 significantly improved.
The partial F-test does what we need. It finds a very significant increase for the 4 added predictors.
The degrees-of-freedom divisor in the numerator of the ratio is 4 (not 5) because the polynomial adds 4 more predictors. The d.f. for the denominator of F is n - 2 (for the original fit) - 4 (added by the polynomial).
F = ((0.8559 - 0.712)/4) / ((1 - 0.8559)/41) = (41 x 0.1439)/(4 x 0.1441) = 10.2
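The same arithmetic can be packaged in a small function (a sketch, not JMP output; the critical value still has to come from an F table):

```python
# Partial F-test for adding q predictors to a model that already has
# k_reduced slopes, using the R-squared values of the two fits.
def partial_f(r2_full, r2_reduced, n, k_reduced, q):
    df_resid = n - k_reduced - q - 1        # residual d.f. of the full model
    return ((r2_full - r2_reduced) / q) / ((1 - r2_full) / df_resid)

# Numbers from the display-feet example: linear fit R2 = 0.712,
# 5th-degree polynomial R2 = 0.8559, n = 47, with 4 added powers.
f = partial_f(0.8559, 0.712, n=47, k_reduced=1, q=4)
print(f"partial F = {f:.1f}")   # prints: partial F = 10.2
```

Compare the result to an F(4, 41) distribution; a value above roughly 2.6 is significant at the 5% level, so 10.2 is decisive.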
[Figure: scatterplot of Sales versus Display Feet with the fitted curve.]
[BTW, it does not matter which log you use (unless you're interpreting by taking derivatives). Logs are proportional to each other. For instance, log10(x) = loge(x)/loge(10).]
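The proportionality is easy to verify numerically:

```python
import math

# Logs in different bases differ only by a constant factor,
# so log10(x) = loge(x) / loge(10) for every positive x.
for x in (2.0, 250.0, 1e6):
    print(x, math.log10(x), math.log(x) / math.log(10))  # equal up to round-off
```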
Calibration Plot
Generalize to multiple regression
A calibration plot offers an equivalent test of calibration, one that generalizes to multiple regression. We need this generalization because testing directly for a lack of calibration is less convenient in models with more than one predictor.
The improvement is statistically significant even though none of the t-stats for the added coefficients is significant. There's too much collinearity. When you add several terms at once, use the partial F to judge the increase.
7 F = ((0.8559 - 0.815)/4) / ((1 - 0.8559)/41) = 2.9.
Definition
A calibration plot is a scatterplot of the actual values of
the response on the y-axis and the predicted values on the
x-axis. (This plot is part of the default output produced by
Fit Model.)
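The calibration-plot check can be sketched on simulated data (not JMP output). One fact worth knowing: when you regress the actual response on least-squares predictions, the straight-line part of the plot always has intercept 0 and slope 1 by construction, so the real test looks for curvature around that line, not at the line itself:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 400)
y = 3 + 2 * x + rng.normal(0, 1, 400)   # truly linear, so OLS is calibrated

b1, b0 = np.polyfit(x, y, 1)            # ordinary least-squares line
pred = b0 + b1 * x

# Regress actual on predicted: intercept and slope are 0 and 1 exactly
# (up to round-off) because residuals are orthogonal to fitted values.
slope, intercept = np.polyfit(pred, y, 1)
print(f"intercept = {intercept:.4f}, slope = {slope:.4f}")
```

This is why the JMP output for the linear fit in a calibration plot shows an intercept of 0 and a slope of 1: those estimates carry no information about calibration.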
[Figure: calibration plot of Sales Actual versus Pred Formula Sales, alongside the scatterplot of Sales versus Display Feet.]
Linear Fit8
Summary of Fit
RSquare                       0.712
Root Mean Square Error       51.591
Observations (or Sum Wgts)       47

Parameter Estimates
Term                 Estimate   Std Error   t Ratio   Prob>|t|
Intercept                   0       26.51      0.00     1.0000
Pred Formula Sales          1        0.09     10.55     <.0001

8 JMP sometimes uses scientific notation for numbers very close to zero, something like 1.0e-13, which translates to 1 x 10^-13. The difference from zero in this output is due to round-off errors in the underlying numerical calculations; computers use binary arithmetic that does not represent exactly every number that you write down.
[Figure: scatterplot of Sales versus Display Feet comparing the polynomial and smoothing-spline fits.]

Why use a 5th-order polynomial, you might ask? It's arbitrary, but it seems to match the fit of a smoothing spline so long as we stay away from the edges of the plot.
Summary of Fit
RSquare                        0.8559
Root Mean Square Error        38.2297
Mean of Response             268.1300
Observations (or Sum Wgts)    47.0000

Parameter Estimates
Term                             Estimate     Std Error   t Ratio   Prob>|t|
Intercept                       20.765257    106.1972       0.20     0.8459
Pred Formula Sales               0.9541011     0.390224     2.45     0.0189
(Pred Formula Sales-268.13)^2    0.0077119     0.005754     1.34     0.1875
(Pred Formula Sales-268.13)^3   -0.000038      0.000097    -0.39     0.6982
(Pred Formula Sales-268.13)^4   -8.29e-7       4.9e-7      -1.67     0.1016
(Pred Formula Sales-268.13)^5    1.1e-9        5.6e-9       0.19     0.8470
10 BS, p. 699. The example of modeling a time series using a polynomial illustrates what can happen when you extrapolate a polynomial of high degree (degree 6) outside the range of the observed data. See Figure 27.8 on page 701.
This model is part of the JMP data file; run the Uncalibrated regression script.
Checking Calibration in Multiple Regression

Summary of Fit
RSquare                    0.689
Root Mean Square Error     4.951
Mean of Response          22.248
Observations                 500

Parameter Estimates
Term            Estimate   Std Error   t Ratio   Prob>|t|
Intercept          72.10        3.11     23.18     <.0001
NOx               -20.83        3.25     -6.41     <.0001
Distance           -1.55        0.19     -8.18     <.0001
Lower class        -0.74        0.04    -17.66     <.0001
Pupil/Teacher      -1.28        0.12    -10.88     <.0001
Zoning              0.05        0.01      3.48     0.0006
Calibration plot
We find curvature in the plot of the housing values on the
fitted values from the regression.
As happens often, a lack of calibration is most evident at
the extremes with the smallest and largest fitted values.
[Figure: calibration plot of Value versus Pred Formula Value.]

Summary of Fit
RSquare                   0.689
Root Mean Square Error     4.93

Parameter Estimates
Term                 Estimate   Std Error   t Ratio   Prob>|t|
Intercept            -7.5e-14       0.706      0.00     1.0000
Pred Formula Value          1       0.030     33.18     <.0001
[Figure: calibration plot of Value versus its predictions with the 5th-degree polynomial fit.]

Summary of Fit
RSquare                       0.742
Root Mean Square Error        4.506
Observations (or Sum Wgts)      500

F = ((0.742 - 0.689)/4) / ((1 - 0.742)/(500 - 6 - 4)) = (490 x 0.053)/(4 x 0.258) = 25.2

12 The degrees of freedom in the F are 4 (for the 4 nonlinear terms in the polynomial) and n - 6 - 4 (6 for the intercept and slopes in the original fit plus 4 for the non-linear terms).
Calibrating a Model
What should be done when a model is not calibrated?
Simple regression. Ideally, find a substantively motivated
transformation, such as a log, that captures the curvature.
Routine adjustments such as those for calibration are
no substitute for knowing the substance of the problem.
Multiple regression. Again, find the right transformation or
a missing predictor. This can be hard to do, but some
methods can find these (and indeed work for this data set
and model).
If you only need predictions, calibrate the fit.
(a) Use predictions from the polynomial used to test for
calibration, or better yet
(b) Use a spline that matches the polynomial fit in the test
for calibration. The spline avoids the edge effects that
make polynomials go wild when extrapolating.
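Option (b) can be sketched on simulated data, with a crude binned-means smoother standing in for JMP's spline (the variables and numbers below are made up, not the Boston data): learn a map from the model's predictions to the average response, then predict with the smoothed value instead of the raw prediction.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 600)
y = 20 * np.sqrt(x) + rng.normal(0, 3, 600)   # curved truth
b1, b0 = np.polyfit(x, y, 1)                  # uncalibrated linear model
pred = b0 + b1 * x

# Fit the calibration map: average y within bins of pred, then interpolate.
edges = np.quantile(pred, np.linspace(0, 1, 21))
mids, means = [], []
for lo, hi in zip(edges[:-1], edges[1:]):
    m = (pred >= lo) & (pred <= hi)
    mids.append(pred[m].mean())
    means.append(y[m].mean())
calibrated = np.interp(pred, mids, means)     # recalibrated predictions

print("RMSE before:", np.sqrt(np.mean((y - pred) ** 2)))
print("RMSE after: ", np.sqrt(np.mean((y - calibrated) ** 2)))
```

Interpolating between bin means keeps the map flat beyond the extreme bins, which mimics the spline's tame behavior at the edges rather than a polynomial's wild extrapolation.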
Example
The matching spline (lambda=238.43) has similar R2 to the
fit of the polynomial model used in the test (74.2%), but the
spline controls the behavior at the edges of the plot.
[Figure: calibration plot with the matching spline. R-Square 0.748328, Sum of Squares Error 9784.732.]
13 JMP makes this easy. Recall the fitted multiple regression, set the Degree item to 5, select the predicted value column, and finally use the Macro > Polynomial to Degree button to add powers of the predicted values. Just remember to remove the 1st power. Easier done than said.
Bundling the Calibration into One Regression

Collinearity
The addition of these powers adds quite a bit of collinearity that will obscure the effects of the original explanatory variables, so you will need to return to the original equation to interpret them. Be sure centering is turned on for a bit less collinearity. It would be better to add so-called orthogonal polynomials to the regression, but that requires more manual calculations, so we'll not go there.

Example
To illustrate, I will redo the prior calibration for the 5-predictor model for the Boston data. I relabeled the powers of the predictions to make the output fit. The fit is slightly better than the prior calibration regression since it gets to modify the original slopes in addition to calibrating the fit.
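The bundled fit can be sketched in a few lines on simulated predictors (not the Boston variables): refit the original X columns together with centered powers 2 through 5 of the first-stage predictions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400
X = rng.normal(size=(n, 3))                        # three generic predictors
y = np.exp(0.5 * X @ np.array([1.0, 0.7, -0.4])) + rng.normal(0, 0.2, n)

def fit(A, y):
    """OLS with an intercept; returns coefficients and fitted values."""
    A1 = np.column_stack([np.ones(len(A)), A])
    beta, *_ = np.linalg.lstsq(A1, y, rcond=None)
    return beta, A1 @ beta

_, pred = fit(X, y)                                # first-stage linear fit
c = pred - pred.mean()                             # center before taking powers
X2 = np.column_stack([X, c**2, c**3, c**4, c**5])  # bundle powers into one design
_, pred2 = fit(X2, y)

def r2(yhat):
    return 1 - np.sum((y - yhat)**2) / np.sum((y - y.mean())**2)

print(f"R2 original: {r2(pred):.3f}, R2 bundled: {r2(pred2):.3f}")
```

Because the bundled design contains the original columns, its R2 can only increase; the gain here comes from letting the powers of the prediction absorb the curvature while the original slopes adjust.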
Summary of Fit
RSquare                      0.772
RSquare Adj                  0.768
Root Mean Square Error       4.256
Mean of Response            22.248
Observations (or Sum Wgts)     500

Parameter Estimates
Term                 Estimate    Std Error   t Ratio   Prob>|t|
Intercept            45.12621      4.1677     10.83     <.0001*
NOx                 -11.52448      2.9539     -3.90     0.0001*
Distance             -0.18500      0.2103     -0.88     0.3794
Lower class          -0.64874      0.0710     -9.14     <.0001*
Pupil/Teacher        -0.50426      0.1310     -3.85     0.0001*
Zoning               -0.05706      0.0143     -3.99     <.0001*
(Pred-22.2482)^2      0.04167      0.0088      4.74     <.0001*
(Pred-22.2482)^3      0.00401      0.0010      3.84     0.0001*
(Pred-22.2482)^4      0.00003      0.0000      0.80     0.4249
(Pred-22.2482)^5     -0.00001      0.0000     -2.04     0.0419*
Calibration plot
The calibration plot for the revised model looks fine. This figure shows the calibration plot for the predictions from this multiple regression. There's a line and a spline in the plot; the two are basically indistinguishable.
[Figure: calibration plot of Value versus Pred Formula Value for the revised model.]
Profile plot
The profile plot shows that the calibration has a substantial
impact on the fit.
You can see how it only impacts the fit for tracts with
either high or low property values.
Prediction
Now that we have one regression model, we can use the
methods in the next lecture to anticipate the accuracy of
predictions.
Discussion
Some points to keep in mind
The reason to calibrate a model is so that the predictions are correct, on average. Economic decision-making requires calibrated predictions. As a by-product, the model also has a smaller RMSE.
We fit the polynomial to test for calibration only because JMP's spline tool does not tell us the information needed to use the partial F-test (i.e., the number of added variables).
Splines are subject to personal impressions. Methods are available (not in JMP) that provide a more objective measure of how much to smooth the fit.