Statistics, Data Analysis, and Decision Modeling
FOURTH EDITION
James R. Evans
9780558689766
Chapter 7 Forecasting
Introduction
QUALITATIVE AND JUDGMENTAL METHODS
Historical Analogy
The Delphi Method
Indicators and Indexes for Forecasting
STATISTICAL FORECASTING MODELS
FORECASTING MODELS FOR STATIONARY TIME SERIES
Moving Average Models
Error Metrics and Forecast Accuracy
Exponential Smoothing Models
FORECASTING MODELS FOR TIME SERIES WITH TREND
AND SEASONALITY
Models for Linear Trends
Models for Seasonality
Models for Trend and Seasonality
CHOOSING AND OPTIMIZING FORECASTING MODELS
USING CB PREDICTOR
REGRESSION MODELS FOR FORECASTING
Autoregressive Forecasting Models
Incorporating Seasonality in Regression Models
Regression Forecasting with Causal Variables
THE PRACTICE OF FORECASTING
BASIC CONCEPTS REVIEW QUESTIONS
SKILL-BUILDING EXERCISES
PROBLEMS AND APPLICATIONS
CASE: ENERGY FORECASTING
APPENDIX: ADVANCED FORECASTING MODELS—
THEORY AND COMPUTATION
Double Moving Average
Double Exponential Smoothing
Additive Seasonality
Multiplicative Seasonality
Holt–Winters Additive Model
Holt–Winters Multiplicative Model
INTRODUCTION
One of the major problems that managers face is forecasting
future events in order to make good decisions. For example,
forecasts of interest rates, energy prices, and other economic
indicators are needed for financial planning; sales forecasts are
needed to plan production and workforce capacity; and forecasts
of trends in demographics, consumer behavior, and
technological innovation are needed for long-term strategic
planning. The government also invests significant resources on
predicting short-run U.S. business performance using the Index
of Leading Indicators. This index focuses on the performance of
individual businesses, which often is highly correlated with the
performance of the overall economy, and is used to forecast
economic trends for the nation as a whole. In this chapter, we
introduce some common methods and approaches to forecasting,
including both qualitative and quantitative techniques.
Managers may choose from a wide range of forecasting
techniques. Selecting the appropriate method depends on the
characteristics of the forecasting problem, such as the time
horizon of the variable being forecast, as well as available
information on which the forecast will be based. Three major
categories of forecasting approaches are qualitative and
judgmental techniques, statistical time-series models, and
explanatory/causal methods.
Qualitative and judgmental techniques rely on experience and
intuition; they are necessary when historical data are not
available or when the decision maker needs to forecast far into
the future. For example, a forecast of when the next generation
of a microprocessor will be available and what capabilities it
might have will depend greatly on the opinions and expertise of
individuals who understand the technology.
Statistical time-series models find greater applicability for
short-range forecasting problems. A time series is a stream of
historical data, such as weekly sales. Time-series models
assume that whatever forces have influenced sales in the recent
past will continue into the near future; thus, forecasts are
developed by extrapolating these data into the future.
Explanatory/causal models seek to identify factors that explain
statistically the patterns observed in the variable being forecast,
usually with regression analysis. While time-series models use
only time as the independent variable, explanatory/causal
models generally include other factors. For example, forecasting
the price of oil might incorporate independent variables such as
the demand for oil (measured in barrels), the proportion of oil
stock generated by OPEC countries, and tax rates. Although we
can never prove that changes in these variables actually cause
changes in the price of oil, we often have evidence that a strong
influence exists.
Surveys of forecasting practices have shown that both
judgmental and quantitative methods are used for forecasting
sales of product lines or product families, as well as for broad
company and industry forecasts. Simple time-series models are
used for short- and medium-range forecasts, whereas regression
analysis is the most popular method for long-range forecasting.
However, many companies rely on judgmental methods far more
than quantitative methods, and almost half judgmentally adjust
quantitative forecasts.
In this chapter, we focus on these three approaches to
forecasting. Specifically, we will discuss the following:
Historical analogy and the Delphi method as approaches to
judgmental forecasting
Moving average and exponential smoothing models for time-
series forecasting, with a discussion of evaluating the quality of
forecasts
A brief discussion of advanced time-series models and the use
of Crystal Ball (CB) Predictor for optimizing forecasts
The use of regression models for explanatory/causal forecasting
Some insights into practical issues associated with forecasting
Qualitative and Judgmental Methods
Qualitative, or judgmental, forecasting methods are valuable in
situations for which no historical data are available or for those
that specifically require human expertise and knowledge. One
example might be identifying future opportunities and threats as
part of a SWOT (Strengths, Weaknesses, Opportunities, and
Threats) analysis within a strategic planning exercise. Another
use of judgmental methods is to incorporate nonquantitative
information, such as the impact of government regulations or
competitor behavior, in a quantitative forecast. Judgmental
techniques range from such simple methods as a manager’s
opinion or a group-based jury of executive opinion to more
structured approaches such as historical analogy and the Delphi
method.
Historical Analogy
One judgmental approach is historical analogy, in which a
forecast is obtained through a comparative analysis with a
previous situation. For example, if a new product is being
introduced, the response of similar previous products to
marketing campaigns can be used as a basis to predict how the
new marketing campaign might fare. Of course, temporal
changes or other unique factors might not be fully considered in
such an approach. However, a great deal of insight can often be
gained through an analysis of past experiences. For example, in
early 1998, the price of oil was about $22 a barrel. However, in
mid-1998, the price of a barrel of oil dropped to around $11.
The reasons for this price drop included an oversupply of oil
from new production in the Caspian Sea region, high production
in non-OPEC regions, and lower-than-normal demand. In
similar circumstances in the past, OPEC would meet and take
action to raise the price of oil. Thus, from historical analogy,
we might forecast a rise in the price of oil. OPEC members did
in fact meet in mid-1998 and agreed to cut their production, but
nobody believed that they would actually cooperate effectively,
and the price continued to drop for a time. Subsequently, in
2000, the price of oil rose dramatically, falling again in late
2001. Analogies often provide good forecasts, but you need to
be careful to recognize new or different circumstances. Another
analogy is international conflict relative to the price of oil.
Should war break out, the price would be expected to rise,
analogous to what it has done in the past.
The Delphi Method
A popular judgmental forecasting approach, called the Delphi
method, uses a panel of experts, whose identities are typically
kept confidential from one another, to respond to a sequence of
questionnaires. After each round of responses, individual
opinions, edited to ensure anonymity, are shared, allowing each
to see what the other experts think. Seeing other experts’
opinions helps to reinforce those in agreement and to influence
those who did not agree to possibly consider other factors. In
the next round, the experts revise their estimates, and the
process is repeated, usually for no more than two or three
rounds. The Delphi method promotes unbiased exchanges of
ideas and discussion and usually results in some convergence of
opinion. It is one of the better approaches to forecasting long-
range trends and impacts.
Indicators and Indexes for Forecasting
Indicators and indexes generally play an important role in
developing judgmental forecasts. Indicators are measures that
are believed to influence the behavior of a variable we wish to
forecast. By monitoring changes in indicators, we expect to gain
insight about the future behavior of the variable to help forecast
the future. For example, one variable that is important to the
nation’s economy is the Gross Domestic Product (GDP), which
is a measure of the value of all goods and services produced in
the United States. Despite its shortcomings (for instance, unpaid
work such as housekeeping and child care is not measured;
production of poor-quality output inflates the measure, as does
work expended on corrective action), it is a practical and useful
measure of economic performance. Like most time series, the
GDP rises and falls in a cyclical fashion. Predicting future
trends in the GDP is often done by analyzing leading
indicators—series that tend to rise and fall some predictable
length of time prior to the peaks and valleys of the GDP. One
example of a leading indicator is the formation of business
enterprises; as the rate of new businesses grows, one would
expect the GDP to increase in the future. Other examples of
leading indicators are the percent change in the money supply
(M1) and net change in business loans. Other indicators, called
lagging indicators, tend to have peaks and valleys that follow
those of the GDP. Some lagging indicators are the Consumer
Price Index, prime rate, business investment expenditures, or
inventories on hand. The GDP can be used to predict future
trends in these indicators.
Indicators are often combined quantitatively into an index. The
directions of movement of the selected indicators are
weighted and combined, providing an index of overall
expectation. For example, financial analysts use the Dow Jones
Industrial Average as an index of general stock market
performance. Indexes do not provide a complete forecast, but
rather a better picture of direction of change, and thus play an
important role in judgmental forecasting.
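The idea of weighting and combining indicator movements can be sketched in a few lines of Python. The indicator names, directions, and weights below are hypothetical illustrations, not the components or weights of any official index:

```python
# Hypothetical composite index: weight the direction of movement
# (+1 = rising, -1 = falling, 0 = unchanged) of each indicator and sum.
indicators = {
    "building_permits": (+1, 0.30),  # (direction, weight) -- illustrative values
    "money_supply":     (+1, 0.25),
    "initial_claims":   (-1, 0.20),  # rising claims signal contraction
    "stock_prices":     (+1, 0.25),
}

composite = sum(direction * weight for direction, weight in indicators.values())
print(composite)  # a positive value suggests an overall expectation of expansion
```

A real index such as the Index of Leading Indicators uses standardized component changes rather than simple directions, but the weighted-combination logic is the same.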
The Department of Commerce began an Index of Leading
Indicators to help predict future economic performance.
Components of the index include the following:
•average weekly hours, manufacturing
•average weekly initial claims, unemployment insurance
•new orders, consumer goods and materials
•vendor performance—slower deliveries
•new orders, nondefense capital goods
•building permits, private housing
•stock prices, 500 common stocks (Standard & Poor)
•money supply
•interest rate spread
•index of consumer expectations (University of Michigan)
Business Conditions Digest included more than 100 time series
in seven economic areas. This publication was discontinued in
March 1990, but information related to the Index of Leading
Indicators was continued in Survey of Current Business. In
December 1995, the U.S. Department of Commerce sold this
data source to The Conference Board, which now markets the
information under the title Business Cycle Indicators;
information can be obtained at its Web site (www.conference-
board.org). The site includes excellent current information
about the calculation of the index, as well as its current
components.
Statistical Forecasting Models
Many forecasts are based on analysis of historical time-series
data and are predicated on the assumption that the future is an
extrapolation of the past. We will assume that a time series
consists of T periods of data, At, t = 1, 2, …, T. A naive
approach is to eyeball a trend—a gradual shift in the value of
the time series—by visually examining a plot of the data. For
instance, Figure 7.1 shows a chart of total energy production
from the data in the Excel file Energy Production &
Consumption. We see that energy production was rising quite
rapidly during the 1960s; however, the slope appears to have
decreased after 1970. It appears that production is increasing by
about 500,000 each year and that this can provide a reasonable
forecast provided that the trend continues.
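A quick numerical check on such an eyeballed trend is to fit a least-squares line to the series. The figures below are made-up stand-ins for the energy production data, used only to show the mechanics:

```python
import numpy as np

# Hypothetical annual production figures (not the actual Excel file data)
years = np.array([1970, 1971, 1972, 1973, 1974])
production = np.array([62.0, 62.4, 63.1, 63.5, 64.1])  # e.g., quadrillion Btu

# Fit production = slope * year + intercept by least squares
slope, intercept = np.polyfit(years, production, 1)
print(round(slope, 3))                      # estimated increase per year
print(round(slope * 1975 + intercept, 2))   # naive extrapolation to 1975
```

The slope plays the role of the "about 500,000 each year" estimate; extrapolating the fitted line gives the same kind of forecast as the eyeball method, but reproducibly.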
Figure 7.1 Total Energy Production Time Series
Figure 7.2 Federal Funds Rate Time Series
Time series may also exhibit short-term seasonal effects (over a
year, month, week, or even a day) as well as longer-term
cyclical effects or nonlinear trends. At a neighborhood grocery
store, for instance, short-term seasonal patterns may occur over
a week, with the heaviest volume of customers on weekends,
and even during the course of a day. Cycles relate to much
longer-term behavior, such as periods of inflation and recession
or bull and bear stock market behavior. Figure 7.2 shows a chart
of the data in the Excel file Federal Funds Rate. We see some
evidence of long-term cycles in the time series.
Of course, unscientific approaches such as the “eyeball method”
may be a bit unsettling to a manager making important
decisions. Subtle effects and interactions of seasonal and
cyclical factors may not be evident from simple visual
extrapolation of data. Statistical methods, which involve more
formal analyses of time series, are invaluable in developing
good forecasts. A variety of statistically based forecasting
methods for time series are commonly used. Among the most
popular are moving average methods, exponential smoothing,
and regression analysis. These can be implemented very easily
on a spreadsheet using basic functions available in Microsoft
Excel and its Data Analysis tools; these are summarized in
Table 7.1. Moving average and exponential smoothing models
work best for stationary time series. For time series that involve
trends and/or seasonal factors, other techniques have been
developed. These include double moving average and
exponential smoothing models, seasonal additive and
multiplicative models, and Holt–Winters additive and
multiplicative models. We will review each of these types of
models. This book provides an Excel add-in, CB Predictor, that
applies these methods and incorporates some intelligent
technology. We will describe CB Predictor later in this chapter.
Table 7.1 Excel Support for Forecasting

Excel Function                                   Description
TREND(known_y’s, known_x’s, new_x’s, constant)   Returns values along a linear trend line
LINEST(known_y’s, known_x’s, constant, stats)    Returns an array that describes a straight line that best fits the data
FORECAST(x, known_y’s, known_x’s)                Calculates a future value along a linear trend

Analysis ToolPak Tool    Description
Moving Average           Projects forecast values based on the average value of the variable over a specific number of preceding periods
Exponential Smoothing    Predicts a value based on the forecast for the prior period, adjusted for the error in that prior forecast
Regression               Used to develop a model relating time-series data to a set of variables assumed to influence the data
Forecasting Models for Stationary Time Series
Two simple approaches that are useful over short time periods
when trend, seasonal, or cyclical effects are not significant are
moving average and exponential smoothing models.
Moving Average Models
The simple moving average method is based on the idea of
averaging random fluctuations in the time series to identify the
underlying direction in which the time series is changing.
Because the moving average method assumes that future
observations will be similar to the recent past, it is most useful
as a short-range forecasting method. Although this method is
very simple, it has proven to be quite useful in stable
environments, such as inventory management, in which it is
necessary to develop forecasts for a large number of items.
Specifically, the simple moving average forecast for the next
period is computed as the average of the most recent k
observations. The value of k is somewhat arbitrary, although its
choice affects the accuracy of the forecast. The larger the value
of k, the more the current forecast is dependent on older data;
the smaller the value of k, the quicker the forecast responds to
changes in the time series. (In the next section, we discuss how
to select k by examining errors associated with different
values.)
For instance, suppose that we want to forecast monthly
burglaries from the Excel file Burglaries since the citizen-police
program began. Figure 7.3 shows a chart of these data. The time
series appears to be relatively stable, without trend, seasonal, or
cyclical effects; thus, a moving average model would be
appropriate. Setting k = 3, the three-period moving average
forecast for month 59 is the average of the observations for
months 56 through 58:
F59 = (A56 + A57 + A58)/3
Moving average forecasts can be generated easily on a
spreadsheet. Figure 7.4 shows the computations for a three-
period moving average forecast of burglaries. Figure 7.5 shows
a chart that contrasts the data with the forecasted values.
Moving average forecasts can also be obtained from Excel’s
Data Analysis options (see Excel Note: Forecasting with
Moving Averages).
Figure 7.3 Monthly Burglaries Chart
In the simple moving average approach, the data are weighted
equally. This may not be desirable because we might wish to put
more weight on recent observations than on older observations,
particularly if the time series is changing rapidly. Such models
are called weighted moving averages. For example, you might
assign a 60% weight to the most recent observation, 30% to the
second most recent observation, and the remaining 10% of the
weight to the third most recent observation. In this case, the
three-period weighted moving average forecast for month 59
would be:
F59 = 0.6A58 + 0.3A57 + 0.1A56
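Both the simple and weighted calculations are easy to sketch in Python. The monthly counts below are illustrative stand-ins, not the actual burglary data:

```python
def moving_average_forecast(data, k):
    """Simple k-period moving average forecast for the next period."""
    return sum(data[-k:]) / k

def weighted_moving_average_forecast(data, weights):
    """Weighted moving average; weights[0] applies to the most recent value."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    recent = data[::-1]  # most recent observation first
    return sum(w * x for w, x in zip(weights, recent))

burglaries = [82, 71, 50, 49, 87, 55]  # hypothetical monthly counts

print(moving_average_forecast(burglaries, k=3))                       # mean of last 3
print(weighted_moving_average_forecast(burglaries, [0.6, 0.3, 0.1]))  # 64.0
```

Note that the weighted version with equal weights (1/k each) reduces to the simple moving average.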
EXCEL NOTE Forecasting with Moving Averages
From the Analysis group, select Data Analysis then Moving
Average. Excel displays the dialog box shown in Figure 7.6.
You need to enter the Input Range of the data, the Interval (the
value of k), and the first cell of the Output Range. To align the
actual data with the forecasted values in the worksheet, select
the first cell of the Output Range to be one row below the first
value. You may also obtain a chart of the data and the moving
averages, as well as a column of standard errors, by checking
the appropriate boxes. However, we do not recommend using
the chart or error options because the forecasts generated by
this tool are not properly aligned with the data (the forecast
value aligned with a particular data point represents the forecast
for the next month) and, thus, can be misleading. Rather, we
recommend that you generate your own chart as we did in
Figure 7.5. Figure 7.7 shows the results produced by the
Moving Average tool (with some customization of the forecast
chart to show the months on the x-axis). Note that the forecast
for month 59 is aligned with the actual value for month 58 on
the chart. Compare this to Figure 7.5 and you can see the
difference.
Figure 7.6 Excel Moving Average Tool Dialog
Figure 7.7 Results of Excel Moving Average Tool (note
misalignment of forecasts with actual in the chart)
Different weights can easily be incorporated into Excel
formulas. This leads us to the questions of how to measure
forecast accuracy and also how to select the best parameters for
a forecasting model.
Error Metrics and Forecast Accuracy
The quality of a forecast depends on how accurate it is in
predicting future values of a time series. The error in a forecast
is the difference between the forecast and the actual value of the
time series (once it is known!). In Figure 7.5, the forecast error
is simply the vertical distance between the forecast and the data
for the same time period. In the simple moving average model,
different values for k will produce different forecasts. How do
we know, for example, if a two- or three-period moving average
forecast or a three-period weighted moving average model
(or others) would be the best predictor for burglaries? We might
first generate different forecasts using each of these models, as
shown in Figure 7.8, and compute the errors associated with
each model.
Figure 7.8 Alternative Moving Average Forecasting Models
To analyze the accuracy of these models, we can define error
metrics, which compare quantitatively the forecast with the
actual observations. Three metrics that are commonly used are
the mean absolute deviation, mean square error, and mean
absolute percentage error. The mean absolute deviation (MAD)
is the absolute difference between the actual value and the
forecast, averaged over a range of forecasted values:
MAD = Σ|At − Ft| / n
where At is the actual value of the time series at time t, Ft is the
forecast value for time t, and n is the number of forecast values
(not the number of data points since we do not have a forecast
value associated with the first k data points). MAD provides a
robust measure of error and is less affected by extreme
observations.
Mean square error (MSE) is probably the most commonly used
error metric. It penalizes larger errors because squaring larger
numbers has a greater impact than squaring smaller numbers.
The formula for MSE is:
MSE = Σ(At − Ft)^2 / n
Again, n represents the number of forecast values used in
computing the average. Sometimes the square root of MSE,
called the root mean square error (RMSE), is used.
Table 7.2 Error Metrics for Moving Average Models of
Burglary Data

          k = 2      k = 3      3-Period Weighted
MAD       13.63      14.86      13.70
MSE      254.38     299.84     256.31
MAPE      23.63%     26.53%     24.46%
A third commonly used metric is mean absolute percentage
error (MAPE), which is the average of the absolute errors divided
by the actual observation values, expressed as a percentage:
MAPE = [Σ(|At − Ft| / At) / n] × 100%
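The three error metrics translate directly into code. The actual and forecast values below are illustrative, not taken from the burglary example:

```python
def mad(actual, forecast):
    """Mean absolute deviation of forecasts from actuals."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mse(actual, forecast):
    """Mean square error; penalizes large errors more heavily."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    """Mean absolute percentage error; scale-free, so comparable across series."""
    return sum(abs(a - f) / a for a, f in zip(actual, forecast)) / len(actual) * 100

actual   = [60, 72, 55, 80]   # hypothetical observations
forecast = [65, 70, 60, 74]   # forecasts for the same periods

print(mad(actual, forecast))              # 4.5
print(mse(actual, forecast))              # 22.5
print(round(mape(actual, forecast), 2))   # 6.93
```

RMSE, when needed, is simply `mse(actual, forecast) ** 0.5`. Note that MAPE is undefined if any actual value is zero.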
The values of MAD and MSE depend on the measurement scale
of the time-series data. For example, forecasting profit in the
range of millions of dollars would result in very large MAD and
MSE values, even for very accurate forecasting models. On the
other hand, market share is measured in proportions; therefore,
even bad forecasting models will have small values of MAD and
MSE. Thus, these measures have no meaning except in
comparison with other models used to forecast the same data.
Generally, MAD is less affected by extreme observations and is
preferable to MSE if such extreme observations are considered
rare events with no special meaning. MAPE is different in that
the measurement scale is eliminated by dividing the absolute
error by the time-series data value. This allows a better relative
comparison. Although these comments provide some guidelines,
there is no universal agreement on which measure is best.
These measures can be used to compare the moving average
forecasts in Figure 7.8. The results, shown in Table 7.2, verify
that the two-period moving average model provides the best
forecast among these alternatives.
Exponential Smoothing Models
A versatile, yet highly effective approach for short-range
forecasting is simple exponential smoothing. The basic simple
exponential smoothing model is:
Ft + 1 = (1 − α)Ft + αAt
       = Ft + α(At − Ft)
where Ft + 1 is the forecast for time period t + 1, Ft is the
forecast for period t, At is the observed value in period t, and α
is a constant between 0 and 1, called the smoothing constant. To
begin, the forecast for period 2 is set equal to the actual
observation for period 1.
Using the two forms of the forecast equation just given, we can
interpret the simple exponential smoothing model in two ways.
In the first model, the forecast for the next period, Ft + 1, is a
weighted average of the forecast made for period t, Ft, and the
actual observation in period t, At. The second form of the
model, obtained by simply rearranging terms, states that the
forecast for the next period, Ft + 1, equals the forecast for the
last period, plus a fraction α of the forecast error made in period
t, At − Ft. Thus, to make a forecast once we have selected the
smoothing constant, we need only know the previous forecast
and the actual value. By repeated substitution for Ft in the
equation, it is easy to demonstrate that Ft + 1 is a decreasingly
weighted average of all past time-series data. Thus, the forecast
actually reflects all the data, provided that α is strictly between 0
and 1.
For the burglary data, the forecast for month 43 is 88, the actual
observation for month 42. Suppose we choose α = 0.7; then the
forecast for month 44 would be:
The actual observation for month 44 is 60; thus, the forecast for
month 45 would be:
Since the simple exponential smoothing model requires only the
previous forecast and the current time-series value, it is very
easy to calculate; thus, it is highly suitable for environments
such as inventory systems where many forecasts must be made.
The smoothing constant is usually chosen by experimentation in
the same manner as choosing the number of periods to use in the
moving average model. Different values of α affect how quickly
the model responds to changes in the time series. For instance, a
value of α = 0 would simply repeat last period’s forecast, while
α = 1 would forecast last period’s actual demand. The closer α
is to 1, the quicker the model responds to changes in the time
series because it puts more weight on the actual current
observation than on the forecast. Likewise, the closer α is to 0,
the more weight is put on the prior forecast, so the model would
respond to changes more slowly.
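The recursive update is straightforward to code, initializing the forecast at the first observation as the text describes. The series values here are illustrative:

```python
def exponential_smoothing(data, alpha):
    """Return one-step-ahead forecasts [F2, F3, ..., F(T+1)] for the series."""
    forecasts = [data[0]]  # F2 = A1
    for actual in data[1:]:
        prev = forecasts[-1]
        # F(t+1) = Ft + alpha * (At - Ft)
        forecasts.append(prev + alpha * (actual - prev))
    return forecasts

series = [88, 60, 70, 65]   # hypothetical observations
print(exponential_smoothing(series, alpha=0.7))
```

With α = 0.7, each new forecast moves 70% of the way toward the latest observation, so the model tracks changes quickly but retains some memory of past data.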
An Excel spreadsheet for evaluating exponential smoothing
models for the burglary data using values of α between 0.1 and
0.9 is shown in Figure 7.9. A smoothing constant of α = 0.6
provides the lowest error for all three metrics. Excel has a Data
Analysis tool for exponential smoothing (see Excel Note:
Forecasting with Exponential Smoothing).
EXCEL NOTE Forecasting with Exponential Smoothing
From the Analysis group, select Data Analysis then Exponential
Smoothing. In the dialog (Figure 7.10), as in the Moving
Average dialog, you must enter the Input Range of the time-
series data, the Damping Factor (1 − α)—not the smoothing
constant as we have defined it (!)—and the first cell of the
Output Range, which should be adjacent to the first data point.
You also have options for labels, to chart output, and to obtain
standard errors. As opposed to the Moving Average tool, the
chart generated by this tool does correctly align the forecasts
with the actual data, as shown in Figure 7.11. You can see that
the exponential smoothing model follows the pattern of the data
quite closely, although it tends to lag with an increasing trend
in the data.
Figure 7.10 Exponential Smoothing Tool Dialog
Figure 7.11 Exponential Smoothing Forecasts for α = 0.6
Forecasting Models for Time Series with Trend and Seasonality
When time series exhibit trend and/or seasonality, different
techniques provide better forecasts than the basic moving
average and exponential smoothing models we have described.
The computational theory behind these models is presented in
the appendix to this chapter, as they are quite a bit more
complicated than the simple moving average and exponential
smoothing models. However, a basic understanding of these
techniques is useful in order to apply CB Predictor software for
forecasting, which we introduce in the next section.
Models for Linear Trends
For time series with a linear trend but no significant seasonal
components, double moving average and double exponential
smoothing models are more appropriate. Both methods are
based on the linear trend equation:
Ft + k = at + bt k
This may look familiar from simple linear regression. That is,
the forecast for k periods into the future from period t is a
function of a base value at, also known as the level, and a trend,
or slope, bt. Double moving average and double exponential
smoothing differ in how the data are used to arrive at
appropriate values for at and bt.
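One common formulation of double exponential smoothing (Holt's linear trend method) updates the level and trend recursively. This is a sketch of the general idea with hypothetical data and parameter values; the exact equations and initialization used in the chapter appendix may differ in detail:

```python
def holt_forecast(data, alpha, beta, k=1):
    """Holt's linear trend method: k-step-ahead forecast past the end of data."""
    level, trend = data[0], data[1] - data[0]  # simple initialization
    for actual in data[1:]:
        prev_level = level
        # Level: blend the new observation with the trend-adjusted old level
        level = alpha * actual + (1 - alpha) * (level + trend)
        # Trend: blend the latest level change with the old trend estimate
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + k * trend  # F(t+k) = at + bt * k

series = [10, 12, 15, 16, 19]   # hypothetical data with an upward trend
print(round(holt_forecast(series, alpha=0.5, beta=0.3, k=1), 2))
```

The returned value is exactly the linear trend equation above, with the level and trend estimated by smoothing rather than by regression.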
Models for Seasonality
Seasonal factors (with no trend) can be incorporated into a
forecast by adjusting the level, at, in one of two ways. The
seasonal additive model is:
Ft + k = at + st − s + k
and the seasonal multiplicative model is:
Ft + k = at × st − s + k
In both models, st − s + k is the seasonal factor for period t − s
+ k and s is the number of periods in a season. A “season” can
be a year, quarter, month, or even a week, depending on the
application. In any case, the forecast for period t + k is adjusted
up or down from a level (at) by the seasonal factor. The
multiplicative model is more appropriate when the seasonal
factors are increasing or decreasing over time. This is evident
when the amplitude of the time series changes over time.
Models for Trend and Seasonality
Many time series exhibit both trend and seasonality. Such might
be the case for growing sales of a seasonal product. The
methods we describe are based on the work of two researchers,
C.C. Holt, who developed the basic approach, and P.R. Winters,
who extended Holt’s work. Hence, these approaches are
commonly referred to as Holt–Winters models. These models
combine elements of both the trend and seasonal models
described above. The Holt–Winters additive model is based on
the equation:
Ft + k = at + bt k + st − s + k
Table 7.3 Forecasting Model Choice

             No Seasonality                     Seasonality
No Trend     Single moving average or           Seasonal additive or
             single exponential smoothing       seasonal multiplicative model
Trend        Double moving average or           Holt–Winters additive or
             double exponential smoothing       Holt–Winters multiplicative model
and the Holt–Winters multiplicative model is:
Ft + k = (at + bt k) × st − s + k
The additive model applies to time series with relatively stable
seasonality, while the multiplicative model applies to time
series whose amplitude increases or decreases over time.
Table 7.3 summarizes the choice of models based on
characteristics of the time series.
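A Holt–Winters additive forecast can be sketched as follows. The quarterly data and smoothing parameters are hypothetical, and the initialization is simplified; the appendix presents the full update equations:

```python
def holt_winters_additive(data, s, alpha, beta, gamma, k=1):
    """Additive Holt-Winters forecast k periods past the end of `data`.

    s is the number of periods in a season (e.g., 4 for quarterly data).
    """
    # Simplified initialization from the first two seasons
    level = sum(data[:s]) / s
    trend = (sum(data[s:2 * s]) - sum(data[:s])) / s ** 2
    seasonals = [data[i] - level for i in range(s)]

    for t in range(s, len(data)):
        prev_level = level
        # Deseasonalize the observation before updating the level
        level = alpha * (data[t] - seasonals[t % s]) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # Update the seasonal factor for this position in the season
        seasonals[t % s] = gamma * (data[t] - level) + (1 - gamma) * seasonals[t % s]

    # F(t+k) = level + k * trend + matching seasonal factor
    return level + k * trend + seasonals[(len(data) + k - 1) % s]

# Hypothetical quarterly sales exhibiting both trend and seasonality
sales = [30, 40, 50, 36, 34, 45, 56, 41]
print(round(holt_winters_additive(sales, s=4, alpha=0.3, beta=0.2, gamma=0.3), 2))
```

The multiplicative variant replaces the additive seasonal adjustments with ratios, which is the appropriate choice when the seasonal swings grow with the level of the series.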
Choosing and Optimizing Forecasting Models Using CB
Predictor
CB Predictor is an Excel add-in for forecasting that is part of
the Crystal Ball suite of applications. We introduced Crystal
Ball for distribution fitting in Chapter 3. CB Predictor can be
used as a stand-alone program for forecasting, and can also be
integrated with Monte Carlo simulation, which we discuss in
Chapter 10. CB Predictor includes all the time-series
forecasting approaches we have discussed. See Excel Note:
Using CB Predictor for basic information on using the add-in.
We will illustrate the use of CB Predictor first for the data in
the worksheet Burglaries after the citizen-police program
commenced. Only the single moving average and single
exponential methods were chosen in the Method Gallery for this
example. CB Predictor creates a worksheet for each of the
results checked in the Results dialog. Figure 7.16 shows the
Methods Table, which summarizes the forecasting methods used
and ranks them according to the lowest RMSE error criterion. In
this example, CB Predictor found the best fit to be a 2-period
moving average. This method was also the best for the MAD
and MAPE error metrics. The Durbin–Watson statistic checks
for autocorrelation (see the discussion of autocorrelation in
regression in Chapter 6), with values of 2 indicating no
autocorrelation. Theil’s U statistic is a relative error measure
that compares the results with a naive forecast. A value less
than 1 means that the forecasting technique is better than
guessing, a value equal to 1 means that the technique is about as
good as guessing, and a value greater than 1 means that the
forecasting technique is worse than guessing. Note that CB
Predictor identifies the best number of periods for the moving
average or the best smoothing constants as appropriate. For
instance, in Figure 7.16, we see that the best-fitting single
exponential smoothing model has alpha = 0.631.
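The model comparison that CB Predictor automates can be sketched by hand. The sketch below uses a short hypothetical series (not the chapter's burglaries data, which lives in an Excel file) to fit a 2-period moving average and simple exponential smoothing with the α = 0.631 found above, then ranks them by RMSE:

```python
import math

# Hypothetical monthly counts standing in for the burglaries series.
data = [42, 38, 51, 47, 44, 50, 46, 41]

def moving_average_errors(series, k=2):
    # Forecast for period t = mean of the previous k observations.
    return [series[t] - sum(series[t - k:t]) / k
            for t in range(k, len(series))]

def exp_smoothing_errors(series, alpha=0.631):
    # F(t+1) = F(t) + alpha * (A(t) - F(t)); initialize F with A(1).
    f, errors = series[0], []
    for actual in series[1:]:
        errors.append(actual - f)
        f += alpha * (actual - f)
    return errors

def rmse(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))

for name, errs in [("2-period moving average", moving_average_errors(data)),
                   ("exp. smoothing (alpha=0.631)", exp_smoothing_errors(data))]:
    print(f"{name}: RMSE = {rmse(errs):.2f}")
```

The same error lists feed MAD (mean absolute error) and MAPE (mean absolute percentage error), which is how the Methods Table can rank one model first on all three metrics.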
EXCEL NOTE Using CB Predictor
After Crystal Ball has been installed, CB Predictor may be
accessed in Excel from the Crystal Ball tab. Click on the Tools
menu and then CB Predictor. CB Predictor guides you through
four dialog boxes, the first of which is shown in Figure 7.12.
These can be selected by clicking the Next button or by clicking
on the tabs. Input Data allows you to specify the data range on
which to base your forecast; Data Attributes allows you to
specify the type of data and whether or not seasonality is
present (see Figure 7.13); Method Gallery allows you to select
one or more of eight time-series methods—single moving
average, double moving average, single exponential smoothing,
double exponential smoothing, seasonal additive, seasonal
multiplicative, Holt–Winters additive, or Holt–Winters
multiplicative (see Figure 7.14). The charts shown in the
Method Gallery suggest the method that is best suited for the
data similar to Table 7.3. However, CB Predictor can run each
method you select and will recommend the one that best
forecasts your data. Not only does it select the best type of
model, it also optimizes the forecasting parameters to minimize
forecasting errors. The Advanced button allows you to change
the error metric on which the models are ranked. The final
dialog, Results, allows you to specify a variety of reporting
options (see Figure 7.15). The Preferences button allows you to
customize these results.
Figure 7.12 CB Predictor Input Data Dialog
Figure 7.13 CB Predictor Data Attributes Dialog
Figure 7.14 CB Predictor Method Gallery Dialog
Figure 7.15 CB Predictor Results Dialog
Figure 7.16 CB Predictor Output—Methods Table
Figure 7.17 CB Predictor Output—Results Table
The Results Table (Figure 7.17) provides the historical data,
fitted forecasts, and residuals. For future forecasts, it also
provides a confidence interval based on Step 8 in the Results
dialog. Thus, the forecast for month 59 is 60.5, with a 95%
confidence interval between 34.26 and 86.74. CB Predictor also
creates a chart showing the data and fitted forecasts, and a
summary report of all results.
As a second example, the Excel file Gas & Electric provides
two years of data on natural gas and electric usage for
a residential property (see Figure 7.18). In the Data Attributes
tab of CB Predictor, we select a seasonality of 12 months.
Although the data are clearly seasonal, we will select all the
time-series methods in the Method Gallery tab. Figure 7.19
shows the results. In this example the Seasonal Multiplicative
method was ranked first, although you will notice that the top
four methods provide essentially the same quality of results.
Figure 7.20 shows the forecasts generated for the next 12
months.
Figure 7.18 Gas & Electric Data
Figure 7.19 Methods Table for Gas Use
Figure 7.20 Gas Use Forecasts
Regression Models for Forecasting
We introduced regression in the previous chapter as a means of
developing relationships between dependent and independent
variables. Simple linear regression can be applied to forecasting
using time as the independent variable. For example, Figure
7.21 shows a portion of the Excel file Coal Production, which
provides data on total tons produced from 1960 through 2007. A
linear trendline shows an R2 value of 0.969 (the fitted model
assumes that the years are numbered 1 through 48, not as actual
dates). The actual values of the coefficients give the model:
Tons = 416,896,322.7 + 16,685,398.57 × Year
Thus, a forecast for 2008 (Year = 49) would be:
Tons = 416,896,322.7 + 16,685,398.57 × 49 = 1,234,480,852.6
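The trendline forecast follows directly from the fitted coefficients; a minimal sketch (the function name is ours, not the chapter's):

```python
def coal_trend_forecast(year_index):
    # Coefficients from the fitted trendline; year_index = 1 is 1960,
    # so 2008 corresponds to year_index = 49.
    return 416_896_322.7 + 16_685_398.57 * year_index

print(round(coal_trend_forecast(49)))  # forecast for 2008, in tons
```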
CB Predictor can also use linear regression for forecasting, and
provides additional information. To apply it, first add a column
to the spreadsheet to number the years beginning with 1
(corresponding to 1960). In Step 1 of the Input Data tab, select
the ranges of both this new Year column and Total Tons. In the
Data Attributes tab, check the box for multiple linear regression
in Step 5, and click the Select Variables button; this will allow
you to specify which are the independent and dependent
variables. Figure 7.22 shows a portion of the output showing
forecasts for the next 5 years and 95% confidence intervals.
However, note that the Durbin–Watson statistic (see Chapter 6)
suggests that the data are autocorrelated, indicating that other
approaches, called autoregressive models, are more appropriate.
Figure 7.21 Portion of Coal Production
Autoregressive Forecasting Models
An autoregressive forecasting model incorporates correlations
between consecutive values in a time series. A first-order
autocorrelation refers to the correlation among data values one
period apart, a second-order autocorrelation refers to the
correlation among data values two periods apart, and so on.
Autoregressive models improve forecasting when
autocorrelation is present in data. A first-order autoregressive
model is:
Yi = a0 + a1Yi−1 + δi
where Yi is the value of the time series in period i and δi is a
nonautocorrelated random error term having 0 mean and
constant variance. A second-order autoregressive model is:
Yi = a0 + a1Yi−1 + a2Yi−2 + δi
Additional terms may be added for higher-order models.
To build an autoregressive model using multiple linear
regression, we simply add additional columns to the data matrix
for the dependent variable that lag the original data by some
number of periods. Thus, for a second-order autoregressive
model, we add columns that lag the dependent variable by one
and two periods. For the coal production data, a portion of this
data matrix is shown in Figure 7.23. Using these additional
columns as independent variables, we run the multiple
regression tool, obtaining the results shown in Figure 7.24.
Figure 7.22 Portion of CB Predictor Output for Regression
Forecasting
Note that the p-value for the second-order term exceeds 0.05
(although not by much), indicating that this variable is not
significant. Dropping it and rerunning the regression using only
the first-order term results in the model shown in Figure 7.25.
However, the adjusted R2 is less than that of the second-order
model, indicating a poorer fit. Thus, we use the second-order
model:
Tons = 136,892,640 + 0.608 × Tons(Year − 1) + 0.259 × Tons(Year − 2)
A forecast for year 49 (2008) would be:
Tons = 136,892,640 + 0.608 × 1,162,749,659 + 0.259 ×
1,131,498,099 = 1,136,902,440
A forecast for year 50 (2009) would be:
Tons = 136,892,640 + 0.608 × 1,136,902,440 + 0.259 ×
1,162,749,659 = 1,129,281,485
Figure 7.23 Portion of Data Matrix for Autoregressive
Forecasting of Coal Production Data
Incorporating Seasonality in Regression Models
Quite often time-series data exhibit seasonality, especially on
an annual basis, as we saw in the Gas & Electric data. Multiple
linear regression models with categorical variables can be used
for time series with seasonality. To do this, we use dummy
categorical variables for the seasonal components. With
monthly data, as we have for natural gas usage, we have a
seasonal categorical variable with k = 12 levels. As discussed in
Chapter 6, we construct the regression model using dummy
variables. We will use January as the reference month;
therefore, this variable does not appear in the model:
Gas Use = β0 + β1 Time + β2 Feb + β3 Mar + β4 Apr + β5 May +
β6 Jun + β7 Jul + β8 Aug + β9 Sep + β10 Oct + β11 Nov + β12 Dec
Figure 7.25 First-Order Autoregressive Forecasting Model
This coding scheme results in the data matrix shown in Figure
7.26. This model picks up trends from the regression coefficient
for time, and seasonality from the dummy variables for each
month. The forecast for the next January will be β0 + β1(25).
The variable coefficients (betas) for each of the other 11 months
will show the adjustment relative to January. For example, the
forecast for next February would be β0 + β1(25) + β2(1), and so
on.
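A minimal sketch of this dummy-variable coding (the helper below is illustrative, not part of the chapter's spreadsheet):

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def seasonal_row(t, month):
    # One row of the data matrix: Time, then 11 dummies (Feb..Dec).
    # January is the reference month, so it has no dummy of its own.
    return [t] + [1 if month == m else 0 for m in MONTHS[1:]]

# February of the first year: Time = 2, Feb dummy = 1, all others 0.
print(seasonal_row(2, "Feb"))
```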
Figure 7.27 shows the results of using the Regression tool in
Excel after eliminating insignificant variables (Time and Feb).
Because the data shows no clear linear trend, the variable Time
could not explain any significant variation in the data. The
dummy variable for February was probably insignificant
because the historical gas usage for both January and February
were very close to each other. The R2 for this model is 0.971,
which is very good. The final regression model is:
Figure 7.26 Data Matrix for Seasonal Regression Model
Regression Forecasting with Causal Variables
In many forecasting applications, other independent variables
such as economic indexes or demographic factors may influence
the time series, and can be incorporated into a regression model.
For example, a manufacturer of hospital equipment might
include such variables as hospital capital spending and changes
in the proportion of people over the age of 65 in building
models to forecast future sales.
To illustrate the use of multiple linear regression for forecasting
with causal variables, suppose that we wish to forecast gasoline
sales. Figure 7.28 shows the sales over 10 weeks during June
through August along with the average price per gallon and a
chart of the gasoline sales time series with a fitted trendline
(Excel file Gasoline Sales). During the summer months, it is not
unusual to see an increase in sales as more people go on
vacations. The chart shows a linear trend, although R2 is not
very high. The trendline is:
Sales = 4790.1 + 812.99 Week
Figure 7.27 Final Regression Model for Forecasting Gas Use
Figure 7.28 Gasoline Sales Data and Trendline
Using this model, we would predict sales for week 11 as:
Sales = 4790.1 + 812.99 (11) = 13,733 gallons
However, we also see that the average price per gallon changes
each week, and this may influence consumer sales. Therefore,
the sales trend might not simply be a factor of steadily
increasing demand, but might also be influenced by the average
price per gallon. The average price per gallon can be considered
as a causal variable. Multiple linear regression provides a
technique for building forecasting models that incorporate not
only time, but other potential causal variables also. Thus, to
forecast gasoline sales, we propose a model using two
independent variables (Week and Price/Gallon).
Figure 7.29 Regression Results for Gas Sales
Sales = β0 + β1 Week + β2 Price/Gallon
The results are shown in Figure 7.29 and the regression model
is:
Sales = 72333.08 + 508.67 Week − 16463.2 Price/Gallon
This makes sense: the negative coefficient on Price/Gallon
indicates that sales fall as the price rises. Notice that the R2
value is higher when both variables are included, explaining
more than 86% of the variation in the data. If the company
estimates that the average price for the next week will drop to
$3.80, the model would forecast the sales for week 11 as:
Sales = 72333.08 + 508.67 (11) − 16463.2 (3.80) = 15,368 gallons
Notice that this is higher than the pure time-series forecast
because of the sensitivity to the price per gallon.
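The week-11 forecast follows directly from the fitted causal model; a minimal sketch:

```python
def gasoline_sales(week, price_per_gallon):
    # Coefficients from the fitted model in Figure 7.29.
    return 72333.08 + 508.67 * week - 16463.2 * price_per_gallon

print(round(gasoline_sales(11, 3.80)))  # forecast for week 11 at $3.80
```

Lowering the assumed price raises the forecast above the pure time-series prediction of 13,733 gallons, which is exactly the sensitivity the text describes.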
The Practice of Forecasting
In practice, managers use a variety of judgmental and
quantitative forecasting techniques. Statistical methods alone
cannot account for such factors as sales promotions, unusual
environmental disturbances, new product introductions, large
one-time orders, and so on. Many managers begin with a
statistical forecast and adjust it to account for intangible
factors. Others may develop independent judgmental and
statistical forecasts then combine them, either objectively by
averaging or in a subjective manner. It is impossible to provide
universal guidance as to which approaches are best, for they
depend on a variety of factors, including the presence or
absence of trends and seasonality, the number of data points
available, length of the forecast time horizon, and the
experience and knowledge of the forecaster. Often, quantitative
approaches will miss significant changes in the data, such as
reversal of trends, while qualitative forecasts may catch them,
particularly when using indicators as discussed earlier in this
chapter.
Here we briefly highlight three practical examples of
forecasting and encourage you to read the full articles cited for
better insight into the practice of forecasting.
•Allied-Signal’s Albuquerque Microelectronics Operation
(AMO) produced radiation-hardened microchips for the U.S.
Department of Energy (DOE). In 1989 a decision was made to
close a plant, but operations at AMO had to be phased out over
several years because of long-term contractual obligations.
AMO experienced fairly erratic yields in the production of some
of its complex microchips, and accurate forecasts of yields were
critical. Overestimating yields could lead to an inability to meet
contractual obligations in a timely manner, requiring the plant
to remain open longer. Underestimates would cause AMO to
produce more chips than actually needed. AMO's yield
forecasts had previously been made by simply averaging all
historical data. More sophisticated forecasting techniques were
implemented, resulting in improved forecasts of wafer
fabrication. Using more accurate yield forecasts and
optimization models, AMO was able to close the plant sooner,
resulting in significant cost savings.1
•More than 70% of the total sales volume at L.L. Bean is
generated through orders to its call center. Calls to the L.L.
Bean call center are classified into two types: telemarketing
(TM), which involves placing an order, and telephone inquiry
(TI), which involves customer inquiries such as order status or
order problems. Accurately forecasting TM and TI calls helps
the company better plan the number of agents to have on hand at
any point in time. Analytical forecasting models for both types
of calls take into account historical trends, seasonal factors, and
external explanatory variables such as holidays and catalog
mailings. The estimated benefit from better precision from the
two forecasting models is approximately $300,000 per year.2
•DIRECTV was founded in 1991 to provide subscription
satellite television. Prior to launching this product, it was vital
to forecast how many homes in the United States would
subscribe to satellite television. Models forecasting the
diffusion of this new technology were developed prior to the
product launch.3
1 D.W. Clements and R.A. Reid, “Analytical MS/OR Tools
Applied to a Plant Closure,” Interfaces 24, no. 2 (March–April,
1994): 1–12.
2 B.H. Andrews and S.M. Cunningham, “L.L. Bean Improves
Call-Center Forecasting,” Interfaces 25, no. 6 (November–
December, 1995): 1–13.
3 Frank M. Bass, Kent Gordon, and Teresa L. Ferguson,
“DIRECTV: Forecasting Diffusion of a New Technology Prior
to Product Launch,” Interfaces 31, no. 3 (May–June 2001): Part
2 of 2, S82–S93.
Basic Concepts Review Questions
1. Explain the differences between qualitative and judgmental,
statistical time-series, and explanatory/causal forecasting
models.
2. Describe some common forecasting approaches for judgmental
forecasting.
3. How are indicators and indexes used in judgmental forecasting?
4. What are the primary components of time series?
5. Summarize statistical methods used in forecasting and the types
of time series to which they are most appropriate.
6. Explain how a simple moving average is calculated.
7. List and define the three principal ways of measuring forecast
accuracy. What are the key differences among them?
8. Explain the differences between moving average and
exponential smoothing models.
9. What types of forecasting models are best for time series with
trends and/or seasonality?
10. What are the advantages of using CB Predictor for forecasting?
11. What are autoregressive models, and when should they be used?
12. How are dummy variables used in regression forecasting models
with seasonality?
13. What is a causal variable in forecasting? Provide an example
from your experience of some applications where causal
variables might be used in a forecast.
14. Summarize some of the practical issues in using forecasting
tools and approaches.
Skill-Building Exercises
1. Find a 4-period moving average forecast for the monthly
burglaries data, compute MAD, MSE, and MAPE error metrics,
and determine if this model is better than the 2-period moving
average discussed in the chapter (Table 7.2).
2. Try to identify the best set of weights for a 3-period moving
average model for the burglary data that minimizes the MAD
error metric.
3. Find the best value of the smoothing constant between 0.5 and
0.7 (in increments of 0.05) for exponential smoothing for the
burglary data.
4. Use CB Predictor to find the best forecasting model for
Electric Use in the Gas & Electric Excel file.
5. Set up and fit a third-order autoregressive model for the coal
production example. Compare the results to the example in the
chapter. What do you find?
6. Find the best multiple regression model for Electric Use in
the Gas & Electric Excel file using the approach for
incorporating seasonality.
Problems and Applications
1. The Excel file Closing Stock Prices provides data for four
stocks over a six-month period.
a. Develop spreadsheet models for forecasting each of the
stock prices using single moving average and single exponential
smoothing.
b. Using MAD, MSE, and MAPE as guidance, find the best
number of moving average periods and best smoothing constant
for exponential smoothing. (You might consider using data
tables to facilitate your search.)
c. Compare your results to the best moving average and
exponential smoothing models found by CB Predictor.
2. For the data in the Excel file Baseball Attendance do the
following:
a. Develop spreadsheet models for forecasting attendance using
single moving average and single exponential smoothing.
b. Using MAD, MSE, and MAPE as guidance, find the best
number of moving average periods and best smoothing constant
for exponential smoothing.
c. Compare your results to the best moving average and
exponential smoothing models found by CB Predictor.
3. For the data in the Excel file Ohio Prison Population do the
following:
a. Develop spreadsheet models for forecasting both male and
female populations using single moving average and single
exponential smoothing.
b. Using MAD, MSE, and MAPE as guidance, find the best
number of moving average periods and best smoothing constant
for exponential smoothing.
c. Compare your results to the best moving average and
exponential smoothing models found by CB Predictor.
4. For the data in the Excel file Gasoline Prices do the
following:
a. Develop spreadsheet models for forecasting prices using
single moving average and single exponential smoothing.
b. Using MAD, MSE, and MAPE as guidance, find the best
number of moving average periods and best smoothing constant
for exponential smoothing.
c. Compare your results to the best moving average and
exponential smoothing models found by CB Predictor.
5. Construct a line chart for the data in the Excel file Arizona
Population.
a. Suggest the best-fitting functional form for forecasting these
data.
b. Use CB Predictor to find the best forecasting model.
6. Construct a line chart for each of the variables in the data file
Death Cause Statistics, and suggest the best forecasting
technique. Then apply CB Predictor to find the best forecasting
models for these variables.
7. The Excel file Olympic Track and Field Data provides the gold
medal–winning distances for the high jump, discus, and long
jump for the modern Olympic Games. Develop forecasting
models for each of the events. What does the model predict for
the next Olympics and what are the confidence intervals?
8. Use CB Predictor to find the best forecasting model for the data
in the following Excel files:
a. New Car Sales
b. Housing Starts
c. Coal Consumption
d. DJIA December Close
e. Federal Funds Rates
f. Mortgage Rates
g. Prime Rate
h. Treasury Yield Rates
9. Consider the data in the Excel file Consumer Price Index.
a. Use simple linear regression in CB Predictor to forecast the
data. What would be the forecasts for the next six months?
b. Are the data autocorrelated? Construct first- and
second-order autoregressive models and compare the results to
part (a).
10. Consider the data in the Excel file Nuclear Power.
CASE Energy Forecasting
The Excel file Energy Production & Consumption provides data
on energy production, consumption, imports, and exports. You
have been hired as an analyst for a government agency and have
been asked to forecast these variables over the next 10 years.
Apply forecasting tools and appropriate visual aids, and write a
formal report to the agency director that explains these data and
the future forecasts.
Appendix Advanced Forecasting Models—Theory and
Computation
In this appendix, we present computational formulas for
advanced models for time-series forecasting. The calculations
are somewhat complex, but can be implemented on spreadsheets
with a bit of effort.
Double Moving Average
Double moving average involves taking averages of averages.
Let Mt be the simple moving average for the last k periods
(including period t):
Mt = (At + At−1 + … + At−k+1)/k
The double moving average, Dt, for the last k periods (including
period t) is the average of the simple moving averages:
Dt = (Mt + Mt−1 + … + Mt−k+1)/k
Using these values, the double moving average method
estimates the values of at and bt in the linear trend model
Ft+k = at + btk as:
at = 2Mt − Dt
bt = 2(Mt − Dt)/(k − 1)
These equations are derived essentially by minimizing the sum
of squared errors using the last k periods of data. Once these
parameters are determined, forecasts beyond the end of the
observed data (time period T) are calculated using the linear
trend model with values of aT and bT. That is, for k periods
beyond period T, the forecast is FT+k = aT + bTk. For
instance, the forecast for the next period would be FT+1 = aT
+ bT(1).
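A sketch of the double moving average computation, using k = 3 on a short hypothetical series with a perfectly linear trend (for which the method recovers the slope exactly):

```python
def double_moving_average(series, k):
    # Simple moving averages M_t of the last k observations.
    M = [sum(series[t - k + 1:t + 1]) / k for t in range(k - 1, len(series))]
    # Double moving averages D_t: averages of the last k values of M.
    D = [sum(M[t - k + 1:t + 1]) / k for t in range(k - 1, len(M))]
    a = 2 * M[-1] - D[-1]               # a_T = 2M_T - D_T
    b = 2 * (M[-1] - D[-1]) / (k - 1)   # b_T = 2(M_T - D_T)/(k - 1)
    return lambda steps: a + b * steps  # F(T+k) = a_T + b_T * k

series = [10, 12, 14, 16, 18, 20, 22]  # perfectly linear, slope 2
f = double_moving_average(series, 3)
print(f(1))  # → 24.0 (the method recovers a linear trend exactly)
```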
Double Exponential Smoothing
Like double moving average, double exponential smoothing is
also based on the linear trend equation, Ft+k = at + btk, but
the estimates of at and bt are obtained from the following
smoothing equations:
at = αAt + (1 − α)(at−1 + bt−1)
bt = β(at − at−1) + (1 − β)bt−1
where α and β are smoothing constants.
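These level and trend updates are easy to implement. The sketch below uses illustrative smoothing constants and one simple initialization choice (level = first observation, trend = first observed change):

```python
def double_exp_smoothing(series, alpha=0.5, beta=0.3):
    # Initialize the level to the first observation and the trend to
    # the first observed change (one simple choice among several).
    a, b = series[0], series[1] - series[0]
    for y in series[2:]:
        a_prev = a
        a = alpha * y + (1 - alpha) * (a_prev + b)  # level update
        b = beta * (a - a_prev) + (1 - beta) * b    # trend update
    return lambda k: a + b * k  # F(T+k) = a_T + b_T * k

f = double_exp_smoothing([10, 12, 14, 16])
print(f(1))  # next-period forecast
```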
Additive Seasonality
The level and seasonal factors are estimated in the additive
model using the following equations:
at = α(At − St−s) + (1 − α)at−1
St = γ(At − at) + (1 − γ)St−s
where α and γ are smoothing constants. The first equation
estimates the level for period t as a weighted average of the
deseasonalized data for period t, (At − St − s), and the previous
period’s level. The seasonal factors are updated as well using
the second equation. The seasonal factor is a weighted average
of the estimated seasonal component for period t, (At − at) and
the seasonal factor for the last period of that season type. Then
the forecast for the next period is Ft+1 = at + St−s+1. For k
periods out from the final observed period T, the forecast is:
FT+k = aT + ST−s+k
To initialize the model, we need to estimate the level and
seasonal factors for the first s periods (e.g., for an annual
season with quarterly data this would be the first 4 periods; for
monthly data, it would be the first 12 periods, etc.). We will use
the following approach:
at = (A1 + A2 + … + As)/s, for t = 1, 2, …, s
St = At − at, for t = 1, 2, …, s
That is, we initialize the level for the first s periods to the
average of the observed values over these periods and the
seasonal factors to the difference between the observed data and
the estimated levels. Once these have been initialized, the
smoothing equations can be implemented for updating.
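The initialization and smoothing recursion can be sketched as follows (illustrative smoothing constants; s is the season length):

```python
def additive_seasonal(series, s, alpha=0.3, gamma=0.4):
    # Initialization: level = mean of the first s observations;
    # seasonal factor = observation minus that level.
    base = sum(series[:s]) / s
    a = [base] * s
    S = [y - base for y in series[:s]]
    for t in range(s, len(series)):
        a.append(alpha * (series[t] - S[t - s]) + (1 - alpha) * a[-1])
        S.append(gamma * (series[t] - a[-1]) + (1 - gamma) * S[t - s])
    # F(T+1) = a_T + S_(T-s+1)
    return a[-1] + S[len(series) - s]

print(additive_seasonal([10, 20, 10, 20], 2))
```

On this tiny alternating series the level settles at 15 and the seasonal factors at ±5, so the next-period forecast is 10, as expected.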
Multiplicative Seasonality
The seasonal multiplicative model is:
Ft+1 = atSt−s+1
The level and seasonal factors are estimated using:
at = α(At/St−s) + (1 − α)at−1
St = γ(At/at) + (1 − γ)St−s
where α and γ are again the smoothing constants. Here, At/St−s
is the deseasonalized estimate for period t. Large values of α
put more emphasis on this term in estimating the level for
period t. The term At/at is an estimate of the seasonal factor for
period t. Large values of γ put more emphasis on this in the
estimate of the seasonal factor.
For k periods out from the final observed period T, the forecast is:
FT+k = aTST−s+k
As in the additive model, we need initial values for the level
and seasonal factors. We do this as follows:
at = (A1 + A2 + … + As)/s, for t = 1, 2, …, s
St = At/at, for t = 1, 2, …, s
Once these have been initialized, the smoothing equations can
be implemented for updating.
Holt–Winters Additive Model
The Holt–Winters additive model is based on the equation:
Ft+k = at + btk + St−s+k
This model is similar to the additive model incorporating
seasonality that we described in the previous section, but it also
includes a trend component. The smoothing equations are:
at = α(At − St−s) + (1 − α)(at−1 + bt−1)
bt = β(at − at−1) + (1 − β)bt−1
St = γ(At − at) + (1 − γ)St−s
Here, α, β, and γ are the smoothing parameters for level, trend,
and seasonal components, respectively. The forecast for period t
+ 1 is:
Ft+1 = at + bt + St−s+1
The forecast for k periods beyond the last period of observed
data (period T) is:
FT+k = aT + bTk + ST−s+k
The initial values of level and seasonal factors are estimated in
the same fashion as in the additive model for seasonality. The
initial values for the trend are bt = bs, for t = 1, 2, …, s, where:
bs = (1/s)[(As+1 − A1)/s + (As+2 − A2)/s + … + (A2s − As)/s]
Note that each term inside the brackets is an estimate of the
trend over one season. We average these over the first 2s
periods.
Holt–Winters Multiplicative Model
The Holt–Winters multiplicative model is:
Ft+k = (at + btk)St−s+k
This model has the same basic smoothing structure as the
additive model but is more appropriate for seasonal time series
whose amplitude increases over time. The smoothing equations
parallel those of the additive model:
at = α(At/St−s) + (1 − α)(at−1 + bt−1)
bt = β(at − at−1) + (1 − β)bt−1
St = γ(At/at) + (1 − γ)St−s
The forecast for period t + 1 is:
Ft+1 = (at + bt)St−s+1
The forecast for k periods beyond the last period of observed
data (period T) is:
FT+k = (aT + bTk)ST−s+k
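A sketch of the full Holt–Winters multiplicative recursion, with illustrative smoothing constants and the season-based initialization described above:

```python
def holt_winters_mult(series, s, alpha=0.3, beta=0.2, gamma=0.4):
    # Initialize level and seasonal factors from the first season,
    # and the trend from average change across the first two seasons.
    base = sum(series[:s]) / s
    a, S = [base] * s, [y / base for y in series[:s]]
    b = sum((series[s + i] - series[i]) / s for i in range(s)) / s
    bt = [b] * s
    for t in range(s, len(series)):
        level = alpha * series[t] / S[t - s] + (1 - alpha) * (a[-1] + bt[-1])
        trend = beta * (level - a[-1]) + (1 - beta) * bt[-1]
        a.append(level)
        bt.append(trend)
        S.append(gamma * series[t] / level + (1 - gamma) * S[t - s])
    # F(T+k) = (a_T + b_T * k) * S_(T-s+k), valid for k = 1..s
    return lambda k: (a[-1] + bt[-1] * k) * S[len(series) - s + k - 1]

f = holt_winters_mult([100] * 8, 4)
print(f(1))
```

On a constant series the trend initializes to zero and every seasonal factor to one, so all forecasts stay at the series value.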
 
Professionalism Assignment I would like for you to put together yo.docx
dessiechisomjj4
 
Professor Drebins Executive MBA students were recently discussing t.docx
dessiechisomjj4
 
Professional Legal Issues with Medical and Nursing Professionals  .docx
dessiechisomjj4
 
Prof Washington, ScenarioHere is another assignment I need help wi.docx
dessiechisomjj4
 
Prof James Kelvin onlyIts just this one and simple question 1.docx
dessiechisomjj4
 
Product life cycle for album and single . sales vs time ( 2 pa.docx
dessiechisomjj4
 
Produce the following components as the final draft of your health p.docx
dessiechisomjj4
 
Produce a preparedness proposal the will recommend specific steps th.docx
dessiechisomjj4
 
Ad

Recently uploaded (20)

PDF
Horarios de distribución de agua en julio
pegazohn1978
 
PDF
I3PM Case study smart parking 2025 with uptoIP® and ABP
MIPLM
 
PPTX
ENG8_Q1_WEEK2_LESSON1. Presentation pptx
marawehsvinetshe
 
PDF
Is Assignment Help Legal in Australia_.pdf
thomas19williams83
 
PPTX
How to Configure Re-Ordering From Portal in Odoo 18 Website
Celine George
 
PDF
Vietnam Street Food & QSR Market 2025-1.pdf
ssuserec8cd0
 
PDF
Introduction presentation of the patentbutler tool
MIPLM
 
PDF
IMPORTANT GUIDELINES FOR M.Sc.ZOOLOGY DISSERTATION
raviralanaresh2
 
PPTX
Ward Management: Patient Care, Personnel, Equipment, and Environment.pptx
PRADEEP ABOTHU
 
DOCX
Lesson 1 - Nature and Inquiry of Research
marvinnbustamante1
 
PPTX
Identifying elements in the story. Arrange the events in the story
geraldineamahido2
 
PPTX
Controller Request and Response in Odoo18
Celine George
 
PDF
Council of Chalcedon Re-Examined
Smiling Lungs
 
PPTX
Light Reflection and Refraction- Activities - Class X Science
SONU ACADEMY
 
PPTX
Introduction to Indian Writing in English
Trushali Dodiya
 
PPTX
Different types of inheritance in odoo 18
Celine George
 
PDF
Governor Josh Stein letter to NC delegation of U.S. House
Mebane Rash
 
PPTX
Difference between write and update in odoo 18
Celine George
 
PPTX
infertility, types,causes, impact, and management
Ritu480198
 
PPTX
Post Dated Cheque(PDC) Management in Odoo 18
Celine George
 
Horarios de distribución de agua en julio
pegazohn1978
 
I3PM Case study smart parking 2025 with uptoIP® and ABP
MIPLM
 
ENG8_Q1_WEEK2_LESSON1. Presentation pptx
marawehsvinetshe
 
Is Assignment Help Legal in Australia_.pdf
thomas19williams83
 
How to Configure Re-Ordering From Portal in Odoo 18 Website
Celine George
 
Vietnam Street Food & QSR Market 2025-1.pdf
ssuserec8cd0
 
Introduction presentation of the patentbutler tool
MIPLM
 
IMPORTANT GUIDELINES FOR M.Sc.ZOOLOGY DISSERTATION
raviralanaresh2
 
Ward Management: Patient Care, Personnel, Equipment, and Environment.pptx
PRADEEP ABOTHU
 
Lesson 1 - Nature and Inquiry of Research
marvinnbustamante1
 
Identifying elements in the story. Arrange the events in the story
geraldineamahido2
 
Controller Request and Response in Odoo18
Celine George
 
Council of Chalcedon Re-Examined
Smiling Lungs
 
Light Reflection and Refraction- Activities - Class X Science
SONU ACADEMY
 
Introduction to Indian Writing in English
Trushali Dodiya
 
Different types of inheritance in odoo 18
Celine George
 
Governor Josh Stein letter to NC delegation of U.S. House
Mebane Rash
 
Difference between write and update in odoo 18
Celine George
 
infertility, types,causes, impact, and management
Ritu480198
 
Post Dated Cheque(PDC) Management in Odoo 18
Celine George
 
Ad

Statistics, Data Analysis, and Decision ModelingFOURTH EDITION.docx

introduce some common methods and approaches to forecasting,
including both qualitative and quantitative techniques.

Managers may choose from a wide range of forecasting techniques. Selecting the appropriate method depends on the characteristics of the forecasting problem, such as the time horizon of the variable being forecast, as well as the information available on which the forecast will be based. Three major categories of forecasting approaches are qualitative and judgmental techniques, statistical time-series models, and explanatory/causal methods.

Qualitative and judgmental techniques rely on experience and intuition; they are necessary when historical data are not available or when the decision maker needs to forecast far into the future. For example, a forecast of when the next generation of a microprocessor will be available, and what capabilities it might have, will depend greatly on the opinions and expertise of individuals who understand the technology.

Statistical time-series models find greater applicability for short-range forecasting problems. A time series is a stream of historical data, such as weekly sales. Time-series models assume that whatever forces have influenced sales in the recent past will continue into the near future; thus, forecasts are developed by extrapolating these data into the future.

Explanatory/causal models seek to identify factors that explain statistically the patterns observed in the variable being forecast, usually with regression analysis. While time-series models use only time as the independent variable, explanatory/causal models generally include other factors. For example, forecasting the price of oil might incorporate independent variables such as the demand for oil (measured in barrels), the proportion of oil stock generated by OPEC countries, and tax rates. Although we can never prove that changes in these variables actually cause changes in the price of oil, we often have evidence that a strong influence exists.

Surveys of forecasting practices have shown that both
judgmental and quantitative methods are used for forecasting sales of product lines or product families, as well as for broad company and industry forecasts. Simple time-series models are used for short- and medium-range forecasts, whereas regression analysis is the most popular method for long-range forecasting. However, many companies rely on judgmental methods far more than quantitative methods, and almost half judgmentally adjust quantitative forecasts.

In this chapter, we focus on these three approaches to forecasting. Specifically, we will discuss the following:
•Historical analogy and the Delphi method as approaches to judgmental forecasting
•Moving average and exponential smoothing models for time-series forecasting, with a discussion of evaluating the quality of forecasts
•A brief discussion of advanced time-series models and the use of Crystal Ball (CB) Predictor for optimizing forecasts
•The use of regression models for explanatory/causal forecasting
•Some insights into practical issues associated with forecasting

QUALITATIVE AND JUDGMENTAL METHODS
Qualitative, or judgmental, forecasting methods are valuable in situations for which no historical data are available or that specifically require human expertise and knowledge. One example might be identifying future opportunities and threats as part of a SWOT (Strengths, Weaknesses, Opportunities, and Threats) analysis within a strategic planning exercise. Another use of judgmental methods is to incorporate nonquantitative information, such as the impact of government regulations or competitor behavior, into a quantitative forecast. Judgmental techniques range from such simple methods as a manager's opinion or a group-based jury of executive opinion to more structured approaches such as historical analogy and the Delphi method.

Historical Analogy
One judgmental approach is historical analogy, in which a forecast is obtained through a comparative analysis with a
previous situation. For example, if a new product is being introduced, the response of similar previous products to marketing campaigns can be used as a basis to predict how the new marketing campaign might fare. Of course, temporal changes or other unique factors might not be fully considered in such an approach. However, a great deal of insight can often be gained through an analysis of past experiences.

For example, in early 1998, the price of oil was about $22 a barrel. However, in mid-1998, the price of a barrel of oil dropped to around $11. The reasons for this price drop included an oversupply of oil from new production in the Caspian Sea region, high production in non-OPEC regions, and lower-than-normal demand. In similar circumstances in the past, OPEC would meet and take action to raise the price of oil. Thus, from historical analogy, we might forecast a rise in the price of oil. OPEC members did in fact meet in mid-1998 and agreed to cut their production, but nobody believed that they would actually cooperate effectively, and the price continued to drop for a time. Subsequently, in 2000, the price of oil rose dramatically, falling again in late 2001. Analogies often provide good forecasts, but you need to be careful to recognize new or different circumstances. Another analogy is international conflict relative to the price of oil. Should war break out, the price would be expected to rise, analogous to what it has done in the past.

The Delphi Method
A popular judgmental forecasting approach, called the Delphi method, uses a panel of experts, whose identities are typically kept confidential from one another, to respond to a sequence of questionnaires. After each round of responses, individual opinions, edited to ensure anonymity, are shared, allowing each expert to see what the others think. Seeing other experts' opinions helps to reinforce those in agreement and to influence those who did not agree to possibly consider other factors.
In the next round, the experts revise their estimates, and the process is repeated, usually for no more than two or three
rounds. The Delphi method promotes unbiased exchanges of ideas and discussion and usually results in some convergence of opinion. It is one of the better approaches to forecasting long-range trends and impacts.

Indicators and Indexes for Forecasting
Indicators and indexes generally play an important role in developing judgmental forecasts. Indicators are measures that are believed to influence the behavior of a variable we wish to forecast; by monitoring changes in indicators, we expect to gain insight into the future behavior of the variable. For example, one variable that is important to the nation's economy is the Gross Domestic Product (GDP), which is a measure of the value of all goods and services produced in the United States. Despite its shortcomings (for instance, unpaid work such as housekeeping and child care is not measured; production of poor-quality output inflates the measure, as does work expended on corrective action), it is a practical and useful measure of economic performance.

Like most time series, the GDP rises and falls in a cyclical fashion. Predicting future trends in the GDP is often done by analyzing leading indicators, series that tend to rise and fall some predictable length of time prior to the peaks and valleys of the GDP. One example of a leading indicator is the formation of business enterprises; as the rate of new businesses grows, one would expect the GDP to increase in the future. Other examples of leading indicators are the percent change in the money supply (M1) and the net change in business loans. Other indicators, called lagging indicators, tend to have peaks and valleys that follow those of the GDP. Some lagging indicators are the Consumer Price Index, the prime rate, business investment expenditures, and inventories on hand. The GDP can be used to predict future trends in these indicators.

Indicators are often combined quantitatively into an index.
The directions of movement of all the selected indicators are
weighted and combined, providing an index of overall expectation. For example, financial analysts use the Dow Jones Industrial Average as an index of general stock market performance. Indexes do not provide a complete forecast, but rather a better picture of the direction of change, and thus play an important role in judgmental forecasting.

The Department of Commerce began an Index of Leading Indicators to help predict future economic performance. Components of the index include the following:
•average weekly hours, manufacturing
•average weekly initial claims, unemployment insurance
•new orders, consumer goods and materials
•vendor performance—slower deliveries
•new orders, nondefense capital goods
•building permits, private housing
•stock prices, 500 common stocks (Standard & Poor)
•money supply
•interest rate spread
•index of consumer expectations (University of Michigan)

Business Conditions Digest included more than 100 time series in seven economic areas. This publication was discontinued in March 1990, but information related to the Index of Leading Indicators was continued in Survey of Current Business. In
December 1995, the U.S. Department of Commerce sold this data source to The Conference Board, which now markets the information under the title Business Cycle Indicators; information can be obtained at its Web site (www.conference-board.org). The site includes excellent current information about the calculation of the index, as well as its current components.

STATISTICAL FORECASTING MODELS
Many forecasts are based on analysis of historical time-series data and are predicated on the assumption that the future is an extrapolation of the past. We will assume that a time series consists of T periods of data, At, t = 1, 2, …, T. A naive approach is to eyeball a trend, a gradual shift in the value of the time series, by visually examining a plot of the data. For instance, Figure 7.1 shows a chart of total energy production from the data in the Excel file Energy Production & Consumption. We see that energy production was rising quite rapidly during the 1960s; however, the slope appears to have decreased after 1970. It appears that production is increasing by about 500,000 each year, and this can provide a reasonable forecast provided that the trend continues.

Figure 7.1 Total Energy Production Time Series
Figure 7.2 Federal Funds Rate Time Series

Time series may also exhibit short-term seasonal effects (over a year, month, week, or even a day) as well as longer-term cyclical effects or nonlinear trends. At a neighborhood grocery store, for instance, short-term seasonal patterns may occur over a week, with the heaviest volume of customers on weekends, and even during the course of a day. Cycles relate to much
longer-term behavior, such as periods of inflation and recession or bull and bear stock market behavior. Figure 7.2 shows a chart of the data in the Excel file Federal Funds Rate. We see some evidence of long-term cycles in the time series.

Of course, unscientific approaches such as the "eyeball method" may be a bit unsettling to a manager making important decisions. Subtle effects and interactions of seasonal and cyclical factors may not be evident from simple visual extrapolation of data. Statistical methods, which involve more formal analyses of time series, are invaluable in developing good forecasts.

A variety of statistically based forecasting methods for time series are commonly used. Among the most popular are moving average methods, exponential smoothing, and regression analysis. These can be implemented very easily on a spreadsheet using basic functions available in Microsoft Excel and its Data Analysis tools; these are summarized in Table 7.1. Moving average and exponential smoothing models work best for stationary time series. For time series that involve trends and/or seasonal factors, other techniques have been developed. These include double moving average and double exponential smoothing models, seasonal additive and multiplicative models, and Holt–Winters additive and multiplicative models. We will review each of these types of models. This book provides an Excel add-in, CB Predictor, that applies these methods and incorporates some intelligent technology. We will describe CB Predictor later in this chapter.

Table 7.1 Excel Support for Forecasting
Excel Functions | Description
TREND(known_y's, known_x's, new_x's, constant) | Returns values along a linear trend line
LINEST(known_y's, known_x's, constant, stats) | Returns an array that describes a straight line that best fits the data
FORECAST(x, known_y's, known_x's) | Calculates a future value along a linear trend

Analysis Toolpak | Description
Moving Average | Projects forecast values based on the average value of the variable over a specific number of preceding periods
Exponential Smoothing | Predicts a value based on the forecast for the prior period, adjusted for the error in that prior forecast
Regression | Used to develop a model relating time-series data to a set of variables assumed to influence the data

FORECASTING MODELS FOR STATIONARY TIME SERIES
Two simple approaches that are useful over short time periods when trend, seasonal, or cyclical effects are not significant are moving average and exponential smoothing models.

Moving Average Models
The simple moving average method is based on the idea of averaging out random fluctuations in the time series to identify the underlying direction in which the time series is changing. Because the moving average method assumes that future observations will be similar to the recent past, it is most useful as a short-range forecasting method. Although this method is very simple, it has proven to be quite useful in stable environments, such as inventory management, in which it is necessary to develop forecasts for a large number of items. Specifically, the simple moving average forecast for the next
period is computed as the average of the most recent k observations. The value of k is somewhat arbitrary, although its choice affects the accuracy of the forecast. The larger the value of k, the more the current forecast depends on older data; the smaller the value of k, the more quickly the forecast responds to changes in the time series. (In the next section, we discuss how to select k by examining the errors associated with different values.)

For instance, suppose that we want to forecast monthly burglaries from the Excel file Burglaries since the citizen-police program began. Figure 7.3 shows a chart of these data. The time series appears to be relatively stable, without trend, seasonal, or cyclical effects; thus, a moving average model would be appropriate. Setting k = 3, the three-period moving average forecast for month 59 is:

F59 = (A56 + A57 + A58)/3

Moving average forecasts can be generated easily on a spreadsheet. Figure 7.4 shows the computations for a three-period moving average forecast of burglaries. Figure 7.5 shows a chart that contrasts the data with the forecasted values. Moving average forecasts can also be obtained from Excel's Data Analysis options (see Excel Note: Forecasting with Moving Averages).

Figure 7.3 Monthly Burglaries Chart

In the simple moving average approach, the data are weighted equally. This may not be desirable because we might wish to put more weight on recent observations than on older observations, particularly if the time series is changing rapidly. Such models are called weighted moving averages. For example, you might assign a 60% weight to the most recent observation, 30% to the second most recent observation, and the remaining 10% to the third most recent observation. In this case, the three-period weighted moving average forecast for month 59 would be:

F59 = 0.6A58 + 0.3A57 + 0.1A56
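The moving-average calculations described above are straightforward to express in code. Here is a minimal Python sketch of both the simple and the weighted variants; the observation values are illustrative placeholders, not the textbook's burglary data:

```python
def moving_average_forecast(data, k):
    """Simple moving average: the forecast for the next period is the
    mean of the most recent k observations."""
    return sum(data[-k:]) / k

def weighted_moving_average_forecast(data, weights):
    """Weighted moving average: weights are listed most-recent first
    and should sum to 1."""
    recent = data[-len(weights):][::-1]  # most recent observation first
    return sum(w * a for w, a in zip(weights, recent))

observations = [60, 72, 66, 80, 74]  # placeholder data
print(moving_average_forecast(observations, 3))
print(weighted_moving_average_forecast(observations, [0.6, 0.3, 0.1]))
```

As the text notes, a larger k makes the forecast depend more on older data and respond more slowly; trying several values of k and comparing error metrics is the usual way to choose one.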
EXCEL NOTE: Forecasting with Moving Averages
From the Analysis group, select Data Analysis and then Moving Average. Excel displays the dialog box shown in Figure 7.6. You need to enter the Input Range of the data, the Interval (the value of k), and the first cell of the Output Range. To align the actual data with the forecasted values in the worksheet, select the first cell of the Output Range to be one row below the first value. You may also obtain a chart of the data and the moving averages, as well as a column of standard errors, by checking the appropriate boxes. However, we do not recommend using the chart or error options, because the forecasts generated by this tool are not properly aligned with the data (the forecast value aligned with a particular data point represents the forecast for the next month) and thus can be misleading. Rather, we recommend that you generate your own chart as we did in Figure 7.5. Figure 7.7 shows the results produced by the Moving Average tool (with some customization of the forecast chart to show the months on the x-axis). Note that the forecast for month 59 is aligned with the actual value for month 58 on the chart. Compare this to Figure 7.5 and you can see the difference.

Figure 7.6 Excel Moving Average Tool Dialog
Figure 7.7 Results of Excel Moving Average Tool (note misalignment of forecasts with actual data in the chart)

Different weights can easily be incorporated into Excel formulas. This leads us to the questions of how to measure forecast accuracy and how to select the best parameters for a forecasting model.

Error Metrics and Forecast Accuracy
The quality of a forecast depends on how accurate it is in predicting future values of a time series. The error in a forecast
is the difference between the forecast and the actual value of the time series (once it is known!). In Figure 7.5, the forecast error is simply the vertical distance between the forecast and the data for the same time period.

In the simple moving average model, different values for k will produce different forecasts. How do we know, for example, whether a two- or three-period moving average forecast or a three-period weighted moving average model (or others) would be the best predictor for burglaries? We might first generate different forecasts using each of these models, as shown in Figure 7.8, and compute the errors associated with each model.

Figure 7.8 Alternative Moving Average Forecasting Models

To analyze the accuracy of these models, we can define error metrics, which compare the forecasts quantitatively with the actual observations. Three metrics that are commonly used are the mean absolute deviation, the mean square error, and the mean absolute percentage error. The mean absolute deviation (MAD) is the absolute difference between the actual value and the forecast, averaged over a range of forecasted values:

MAD = (Σ |At − Ft|)/n

where At is the actual value of the time series at time t, Ft is the forecast value for time t, and n is the number of forecast values (not the number of data points, since we do not have a forecast value associated with the first k data points). MAD provides a robust measure of error and is less affected by extreme observations.

Mean square error (MSE) is probably the most commonly used error metric. It penalizes larger errors because squaring larger numbers has a greater impact than squaring smaller numbers. The formula for MSE is:

MSE = (Σ (At − Ft)²)/n

Again, n represents the number of forecast values used in
computing the average. Sometimes the square root of MSE, called the root mean square error (RMSE), is used.

Table 7.2 Error Metrics for Moving Average Models of Burglary Data
Metric | k = 2 | k = 3 | 3-Period Weighted
MAD | 13.63 | 14.86 | 13.70
MSE | 254.38 | 299.84 | 256.31
MAPE | 23.63% | 26.53% | 24.46%

A third commonly used metric is mean absolute percentage error (MAPE). MAPE is the average of the absolute errors divided by the actual observation values:

MAPE = (Σ |At − Ft|/At)/n × 100%

The values of MAD and MSE depend on the measurement scale of the time-series data. For example, forecasting profit in the range of millions of dollars would result in very large MAD and MSE values, even for very accurate forecasting models. On the other hand, market share is measured in proportions; therefore, even bad forecasting models will have small values of MAD and MSE. Thus, these measures have no meaning except in comparison with other models used to forecast the same data.
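These error metrics are simple to compute directly. The following Python sketch implements MAD, MSE, RMSE, and MAPE as defined above; the actual and forecast values are made-up placeholders rather than the burglary series:

```python
import math

def error_metrics(actual, forecast):
    """Return MAD, MSE, RMSE, and MAPE for paired actual/forecast values."""
    n = len(actual)
    errors = [a - f for a, f in zip(actual, forecast)]
    mad = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mape = sum(abs(e) / a for e, a in zip(errors, actual)) / n * 100
    return {"MAD": mad, "MSE": mse, "RMSE": math.sqrt(mse), "MAPE": mape}

actual = [88, 60, 75, 70]    # placeholder observations
forecast = [80, 70, 70, 75]  # placeholder forecasts
print(error_metrics(actual, forecast))
```

Because MAD and MSE carry the scale of the data while MAPE does not, a comparison like this is meaningful only across models fitted to the same series, as the text emphasizes.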
Generally, MAD is less affected by extreme observations and is preferable to MSE if such extreme observations are considered rare events with no special meaning. MAPE is different in that the measurement scale is eliminated by dividing the absolute error by the time-series data value; this allows a better relative comparison. Although these comments provide some guidelines, there is no universal agreement on which measure is best.

These measures can be used to compare the moving average forecasts in Figure 7.8. The results, shown in Table 7.2, verify that the two-period moving average model provides the best forecast among these alternatives.

Exponential Smoothing Models
A versatile yet highly effective approach for short-range forecasting is simple exponential smoothing. The basic simple exponential smoothing model is:

Ft+1 = αAt + (1 − α)Ft = Ft + α(At − Ft)

where Ft+1 is the forecast for time period t + 1, Ft is the forecast for period t, At is the observed value in period t, and α is a constant between 0 and 1, called the smoothing constant. To begin, the forecast for period 2 is set equal to the actual observation for period 1.

Using the two forms of the forecast equation just given, we can interpret the simple exponential smoothing model in two ways. In the first form, the forecast for the next period, Ft+1, is a weighted average of the forecast made for period t, Ft, and the actual observation in period t, At. The second form of the model, obtained by simply rearranging terms, states that the forecast for the next period, Ft+1, equals the forecast for the last period plus a fraction α of the forecast error made in period t, At − Ft. Thus, to make a forecast once we have selected the smoothing constant, we need to know only the previous forecast and the actual value. By repeated substitution for Ft in the equation, it is easy to demonstrate that Ft+1 is a decreasingly
weighted average of all past time-series data. Thus, the forecast actually reflects all the data, provided that α is strictly between 0 and 1.

For the burglary data, the forecast for month 43 is 88, the actual observation for month 42. Suppose we choose α = 0.7; then the forecast for month 44 would be:

F44 = 0.7A43 + 0.3(88)

The actual observation for month 44 is 60; thus, the forecast for month 45 would be:

F45 = 0.7(60) + 0.3F44

Since the simple exponential smoothing model requires only the previous forecast and the current time-series value, it is very easy to calculate; thus, it is highly suitable for environments such as inventory systems, where many forecasts must be made. The smoothing constant is usually chosen by experimentation, in the same manner as choosing the number of periods to use in the moving average model. Different values of α affect how quickly the model responds to changes in the time series. For instance, a value of α = 0 would simply repeat last period's forecast, while α = 1 would forecast last period's actual demand. The closer α is to 1, the more quickly the model responds to changes in the time series, because it puts more weight on the actual current observation than on the forecast. Likewise, the closer α is to 0, the more weight is put on the prior forecast, so the model responds to changes more slowly.

An Excel spreadsheet for evaluating exponential smoothing models for the burglary data using values of α between 0.1 and 0.9 is shown in Figure 7.9. A smoothing constant of α = 0.6 provides the lowest error for all three metrics. Excel has a Data Analysis tool for exponential smoothing (see Excel Note: Forecasting with Exponential Smoothing).

EXCEL NOTE: Forecasting with Exponential Smoothing
From the Analysis group, select Data Analysis and then Exponential Smoothing. In the dialog (Figure 7.10), as in the Moving
Average dialog, you must enter the Input Range of the time-series data, the Damping Factor, which is 1 − α and not the smoothing constant as we have defined it, and the first cell of the Output Range, which should be adjacent to the first data point. You also have options for labels, for charting output, and for obtaining standard errors. Unlike the Moving Average tool, the chart generated by this tool correctly aligns the forecasts with the actual data, as shown in Figure 7.11. You can see that the exponential smoothing model follows the pattern of the data quite closely, although it tends to lag when the data exhibit an increasing trend.
Figure 7.10 Exponential Smoothing Tool Dialog
Figure 7.11 Exponential Smoothing Forecasts for α = 0.6

Forecasting Models for Time Series with Trend and Seasonality
When time series exhibit trend and/or seasonality, different techniques provide better forecasts than the basic moving average and exponential smoothing models we have described. The computational theory behind these models is presented in the appendix to this chapter, as the models are quite a bit more complicated than simple moving average and exponential smoothing. However, a basic understanding of these techniques is useful in order to apply the CB Predictor software for forecasting, which we introduce in the next section.

Models for Linear Trends
For time series with a linear trend but no significant seasonal components, double moving average and double exponential smoothing models are more appropriate. Both methods are based on the linear trend equation:
Ft + k = at + btk
This may look familiar from simple linear regression. That is, the forecast for k periods into the future from period t is a function of a base value at, also known as the level, and a trend,
or slope, bt. Double moving average and double exponential smoothing differ in how the data are used to arrive at appropriate values for at and bt.

Models for Seasonality
Seasonal factors (with no trend) can be incorporated into a forecast by adjusting the level, at, in one of two ways. The seasonal additive model is:
Ft + k = at + St − s + k
and the seasonal multiplicative model is:
Ft + k = atSt − s + k
In both models, St − s + k is the seasonal factor for period t − s + k and s is the number of periods in a season. A "season" can be a year, quarter, month, or even a week, depending on the application. In any case, the forecast for period t + k is adjusted up or down from the level (at) by the seasonal factor. The multiplicative model is more appropriate when the seasonal factors are increasing or decreasing over time; this is evident when the amplitude of the time series changes over time.

Models for Trend and Seasonality
Many time series exhibit both trend and seasonality. Such might be the case for growing sales of a seasonal product. The methods we describe are based on the work of two researchers, C.C. Holt, who developed the basic approach, and P.R. Winters, who extended Holt's work. Hence, these approaches are commonly referred to as Holt–Winters models. These models combine elements of both the trend and seasonal models described above. The Holt–Winters additive model is based on the equation:
Ft + k = at + btk + St − s + k
and the Holt–Winters multiplicative model is:
Ft + k = (at + btk)St − s + k
The additive model applies to time series with relatively stable seasonality, while the multiplicative model applies to time series whose amplitude increases or decreases over time. Table 7.3 summarizes the choice of models based on characteristics of the time series.

Table 7.3 Forecasting Model Choice
              No Seasonality                       Seasonality
No Trend      Single moving average or             Seasonal additive or seasonal
              single exponential smoothing         multiplicative model
Trend         Double moving average or             Holt–Winters additive or
              double exponential smoothing         Holt–Winters multiplicative model

Choosing and Optimizing Forecasting Models Using CB Predictor
CB Predictor is an Excel add-in for forecasting that is part of the Crystal Ball suite of applications. We introduced Crystal Ball for distribution fitting in Chapter 3. CB Predictor can be used as a stand-alone program for forecasting, and it can also be integrated with Monte Carlo simulation, which we discuss in Chapter 10. CB Predictor includes all the time-series forecasting approaches we have discussed. See Excel Note: Using CB Predictor for basic information on using the add-in. We will illustrate the use of CB Predictor first for the data in the worksheet Burglaries after the citizen-police program commenced. Only the single moving average and single exponential smoothing methods were chosen in the Method Gallery for this example. CB Predictor creates a worksheet for each of the results checked in the Results dialog. Figure 7.16 shows the
Methods Table, which summarizes the forecasting methods used and ranks them according to the lowest RMSE error criterion. In this example, CB Predictor found the best fit to be a 2-period moving average. This method was also the best for the MAD and MAPE error metrics. The Durbin–Watson statistic checks for autocorrelation (see the discussion of autocorrelation in regression in Chapter 6), with values near 2 indicating no autocorrelation. Theil's U statistic is a relative error measure that compares the results with a naive forecast: a value less than 1 means that the forecasting technique is better than guessing, a value equal to 1 means that the technique is about as good as guessing, and a value greater than 1 means that the technique is worse than guessing. Note that CB Predictor identifies the best number of periods for the moving average or the best smoothing constants, as appropriate. For instance, in Figure 7.16, we see that the best-fitting single exponential smoothing model has α = 0.631.

EXCEL NOTE
Using CB Predictor
After Crystal Ball has been installed, CB Predictor may be accessed in Excel from the Crystal Ball tab. Click on the Tools menu and then CB Predictor. CB Predictor guides you through four dialog boxes, the first of which is shown in Figure 7.12. These can be selected by clicking the Next button or by clicking on the tabs. Input Data allows you to specify the data range on which to base your forecast; Data Attributes allows you to specify the type of data and whether or not seasonality is present (see Figure 7.13); Method Gallery allows you to select one or more of eight time-series methods—single moving average, double moving average, single exponential smoothing, double exponential smoothing, seasonal additive, seasonal multiplicative, Holt–Winters additive, or Holt–Winters multiplicative (see Figure 7.14). The charts shown in the Method Gallery suggest the method that is best suited for the data, similar to Table 7.3.
However, CB Predictor can run each
method you select and will recommend the one that best forecasts your data. Not only does it select the best type of model, it also optimizes the forecasting parameters to minimize forecasting errors. The Advanced button allows you to change the error metric on which the models are ranked. The final dialog, Results, allows you to specify a variety of reporting options (see Figure 7.15). The Preferences button allows you to customize these results.
Figure 7.12 CB Predictor Input Data Dialog
Figure 7.13 CB Predictor Data Attributes Dialog
Figure 7.14 CB Predictor Method Gallery Dialog
Figure 7.15 CB Predictor Results Dialog
Figure 7.16 CB Predictor Output—Methods Table
Figure 7.17 CB Predictor Output—Results Table
The Results Table (Figure 7.17) provides the historical data, fitted forecasts, and residuals. For future forecasts, it also provides a confidence interval based on Step 8 in the Results dialog. Thus, the forecast for month 59 is 60.5, with a 95% confidence interval between 34.26 and 86.74. CB Predictor also creates a chart showing the data and fitted forecasts, and a summary report of all results.
As a second example, the Excel file Gas & Electric provides two years of data on natural gas and electric usage for a residential property (see Figure 7.18). In the Data Attributes tab of CB Predictor, we select a seasonality of 12 months. Although the data are clearly seasonal, we will select all the time-series methods in the Method Gallery tab. Figure 7.19 shows the results. In this example, the Seasonal Multiplicative method was ranked first, although you will notice that the top four methods provide essentially the same quality of results. Figure 7.20 shows the forecasts generated for the next 12 months.
Figure 7.18 Gas & Electric Data
Figure 7.19 Methods Table for Gas Use
Figure 7.20 Gas Use Forecasts

Regression Models for Forecasting
We introduced regression in the previous chapter as a means of developing relationships between dependent and independent variables. Simple linear regression can be applied to forecasting using time as the independent variable. For example, Figure 7.21 shows a portion of the Excel file Coal Production, which provides data on total tons produced from 1960 through 2007. A linear trendline shows an R2 value of 0.969 (the fitted model assumes that the years are numbered 1 through 48, not as actual dates). The actual values of the coefficients in the model are:
Tons = 416,896,322.7 + 16,685,398.57 × Year
Thus, a forecast for 2008 (Year = 49) would be:
Tons = 416,896,322.7 + 16,685,398.57 × 49 = 1,234,480,852.6
CB Predictor can also use linear regression for forecasting, and provides additional information. To apply it, first add a column to the spreadsheet to number the years beginning with 1 (corresponding to 1960). In Step 1 of the Input Data tab, select the ranges of both this new Year column and Total Tons. In the Data Attributes tab, check the box for multiple linear regression in Step 5, and click the Select Variables button; this will allow you to specify which are the independent and dependent variables. Figure 7.22 shows a portion of the output with forecasts for the next 5 years and 95% confidence intervals. However, note that the Durbin–Watson statistic (see Chapter 6) suggests that the data are autocorrelated, indicating that other approaches, called autoregressive models, are more appropriate.
Figure 7.21 Portion of Coal Production

Autoregressive Forecasting Models
An autoregressive forecasting model incorporates correlations between consecutive values in a time series. A first-order
autocorrelation refers to the correlation among data values one period apart, a second-order autocorrelation refers to the correlation among data values two periods apart, and so on. Autoregressive models improve forecasting when autocorrelation is present in data. A first-order autoregressive model is:
Yi = a0 + a1Yi − 1 + δi
where Yi is the value of the time series in period i and δi is a nonautocorrelated random error term having 0 mean and constant variance. A second-order autoregressive model is:
Yi = a0 + a1Yi − 1 + a2Yi − 2 + δi
Additional terms may be added for higher-order models. To build an autoregressive model using multiple linear regression, we simply add additional columns to the data matrix that lag the dependent variable by some number of periods. Thus, for a second-order autoregressive model, we add columns that lag the dependent variable by one and two periods. For the coal production data, a portion of this data matrix is shown in Figure 7.23. Using these additional columns as independent variables, we run the multiple regression tool, obtaining the results shown in Figure 7.24.
Figure 7.22 Portion of CB Predictor Output for Regression Forecasting
Note that the p-value for the second-order term exceeds 0.05 (although not by much), indicating that this variable is not significant. Dropping it and rerunning the regression using only the first-order term results in the model shown in Figure 7.25. However, that model's adjusted R2 is less than that of the second-order model, indicating a poorer fit. Thus, we use the second-order model:
Tons(Year) = 136,892,640 + 0.608 × Tons(Year − 1) + 0.259 × Tons(Year − 2)
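The lag-and-regress recipe just described is easy to automate outside a spreadsheet. Below is a minimal Python sketch, our own illustrative helper rather than anything from the text or CB Predictor, which fits an order-p autoregressive model by ordinary least squares and returns a one-step-ahead forecast:

```python
import numpy as np

def ar_fit_forecast(series, order=2):
    """Fit y_t = a0 + a1*y_(t-1) + ... + ap*y_(t-p) by least squares
    and return (coefficients, one-step-ahead forecast)."""
    y = np.asarray(series, dtype=float)
    n = len(y)
    # Design matrix: an intercept column plus one lagged copy of the
    # series per autoregressive term (lag 1, lag 2, ..., lag p).
    X = np.column_stack(
        [np.ones(n - order)] + [y[order - k:n - k] for k in range(1, order + 1)]
    )
    coeffs, *_ = np.linalg.lstsq(X, y[order:], rcond=None)
    # The forecast uses the most recent `order` observations, newest first.
    next_x = np.concatenate(([1.0], y[:-order - 1:-1]))
    return coeffs, float(next_x @ coeffs)
```

Applied to the coal production series with order=2, this plays the same role as the Excel Regression tool on the lagged data matrix; the coefficients it produces depend, of course, on the data supplied.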
A forecast for year 49 (2008) would be:
Tons = 136,892,640 + 0.608 × 1,162,749,659 + 0.259 × 1,131,498,099 = 1,136,902,440
A forecast for year 50 (2009) would be:
Tons = 136,892,640 + 0.608 × 1,136,902,440 + 0.259 × 1,162,749,659 = 1,129,281,485
Figure 7.23 Portion of Data Matrix for Autoregressive Forecasting of Coal Production Data

Incorporating Seasonality in Regression Models
Quite often time-series data exhibit seasonality, especially on an annual basis, as we saw in the Gas & Electric data. Multiple linear regression models with categorical variables can be used for time series with seasonality. To do this, we use dummy categorical variables for the seasonal components. With monthly data, as we have for natural gas usage, we have a seasonal categorical variable with k = 12 levels. As discussed in Chapter 6, we construct the regression model using k − 1 = 11 dummy variables. We will use January as the reference month; therefore, this variable does not appear in the model:
Gas Usage = β0 + β1Time + β2Feb + β3Mar + … + β12Dec
Figure 7.25 First-Order Autoregressive Forecasting Model
This coding scheme results in the data matrix shown in Figure 7.26. This model picks up trends from the regression coefficient for time, and seasonality from the dummy variables for each month. With 24 months of data, the forecast for the next January will be β0 + β1(25). The variable coefficients (betas) for each of the other 11 months show the adjustment relative to January. For example, the forecast for next February would be β0 + β1(26) + β2. Figure 7.27 shows the results of using the Regression tool in Excel after eliminating insignificant variables (Time and Feb). Because the data show no clear linear trend, the variable Time could not explain any significant variation in the data. The dummy variable for February was probably insignificant
because historical gas usage for January and February was very similar. The R2 for this model is 0.971, which is very good. The final regression model is given in Figure 7.27.
Figure 7.26 Data Matrix for Seasonal Regression Model

Regression Forecasting with Causal Variables
In many forecasting applications, other independent variables such as economic indexes or demographic factors may influence the time series and can be incorporated into a regression model. For example, a manufacturer of hospital equipment might include such variables as hospital capital spending and changes in the proportion of people over the age of 65 in building models to forecast future sales. To illustrate the use of multiple linear regression for forecasting with causal variables, suppose that we wish to forecast gasoline sales. Figure 7.28 shows the sales over 10 weeks during June through August, along with the average price per gallon, and a chart of the gasoline sales time series with a fitted trendline (Excel file Gasoline Sales). During the summer months, it is not unusual to see an increase in sales as more people go on vacations. The chart shows a linear trend, although R2 is not very high. The trend line is:
Sales = 4790.1 + 812.99 × Week
Figure 7.27 Final Regression Model for Forecasting Gas Use
Figure 7.28 Gasoline Sales Data and Trendline
Using this model, we would predict sales for week 11 as:
Sales = 4790.1 + 812.99(11) = 13,733 gallons
However, we also see that the average price per gallon changes each week, and this may influence consumer sales. Therefore, the sales trend might not simply be a factor of steadily increasing demand, but might also be influenced by the average price per gallon. The average price per gallon can be considered a causal variable. Multiple linear regression provides a technique for building forecasting models that incorporate not only time, but other potential causal variables as well. Thus, to forecast gasoline sales, we propose a model using two independent variables (Week and Price/Gallon):
Sales = β0 + β1 Week + β2 Price/Gallon
Figure 7.29 Regression Results for Gas Sales
The results are shown in Figure 7.29, and the regression model is:
Sales = 72,333.08 + 508.67 Week − 16,463.2 Price/Gallon
The negative coefficient on price makes sense: as the price rises, sales typically fall. Notice that the R2 value is higher when both variables are included, explaining more than 86% of the variation in the data. If the company estimates that the average price for the next week will drop to $3.80, the model would forecast the sales for week 11 as:
Sales = 72,333.08 + 508.67(11) − 16,463.2(3.80) = 15,368 gallons
Notice that this is higher than the pure time-series forecast because of the sensitivity to the price per gallon.

The Practice of Forecasting
In practice, managers use a variety of judgmental and quantitative forecasting techniques. Statistical methods alone cannot account for such factors as sales promotions, unusual environmental disturbances, new product introductions, large one-time orders, and so on. Many managers begin with a statistical forecast and adjust it to account for intangible factors. Others may develop independent judgmental and
statistical forecasts and then combine them, either objectively by averaging or in a subjective manner. It is impossible to provide universal guidance as to which approaches are best, for they depend on a variety of factors, including the presence or absence of trends and seasonality, the number of data points available, the length of the forecast time horizon, and the experience and knowledge of the forecaster. Often, quantitative approaches will miss significant changes in the data, such as reversals of trends, while qualitative forecasts may catch them, particularly when using indicators as discussed earlier in this chapter. Here we briefly highlight three practical examples of forecasting and encourage you to read the full articles cited for better insight into the practice of forecasting.
•Allied-Signal's Albuquerque Microelectronics Operation (AMO) produced radiation-hardened microchips for the U.S. Department of Energy (DOE). In 1989 a decision was made to close a plant, but operations at AMO had to be phased out over several years because of long-term contractual obligations. AMO experienced fairly erratic yields in the production of some of its complex microchips, and accurate forecasts of yields were critical. Overestimating yields could lead to an inability to meet contractual obligations in a timely manner, requiring the plant to remain open longer; underestimates would cause AMO to produce more chips than actually needed. AMO's yield forecasts had previously been made by simply averaging all historical data. More sophisticated forecasting techniques were implemented, resulting in improved forecasts of wafer fabrication. Using more accurate yield forecasts and optimization models, AMO was able to close the plant sooner, resulting in significant cost savings.1
•More than 70% of the total sales volume at L.L. Bean is generated through orders to its call center. Calls to the L.L.
Bean call center are classified into two types: telemarketing (TM), which involves placing an order, and telephone inquiry
(TI), which involves customer inquiries such as order status or order problems. Accurately forecasting TM and TI calls helps the company better plan the number of agents to have on hand at any point in time. Analytical forecasting models for both types of calls take into account historical trends, seasonal factors, and external explanatory variables such as holidays and catalog mailings. The estimated benefit from the better precision of the two forecasting models is approximately $300,000 per year.2
•DIRECTV was founded in 1991 to provide subscription satellite television. Prior to launching this product, it was vital to forecast how many homes in the United States would subscribe to satellite television.3
1 D.W. Clements and R.A. Reid, "Analytical MS/OR Tools Applied to a Plant Closure," Interfaces 24, no. 2 (March–April 1994): 1–12.
2 B.H. Andrews and S.M. Cunningham, "L.L. Bean Improves Call-Center Forecasting," Interfaces 25, no. 6 (November–December 1995): 1–13.
3 Frank M. Bass, Kent Gordon, and Teresa L. Ferguson, "DIRECTV: Forecasting Diffusion of a New Technology Prior to Product Launch," Interfaces 31, no. 3 (May–June 2001): Part 2 of 2, S82–S93.

Basic Concepts Review Questions
1. Explain the differences between qualitative and judgmental, statistical time-series, and explanatory/causal forecasting models.
2. Describe some common forecasting approaches for judgmental forecasting.
3. How are indicators and indexes used in judgmental forecasting?
4. What are the primary components of time series?
5. Summarize statistical methods used in forecasting and the types of time series to which they are most appropriate.
6. Explain how a simple moving average is calculated.
7. List and define the three principal ways of measuring forecast accuracy. What are the key differences among them?
8. Explain the differences between moving average and exponential smoothing models.
9. What types of forecasting models are best for time series with trends and/or seasonality?
10. What are the advantages of using CB Predictor for forecasting?
11. What are autoregressive models, and when should they be used?
12. How are dummy variables used in regression forecasting models with seasonality?
13. What is a causal variable in forecasting? Provide an example from your experience of some applications where causal variables might be used in a forecast.
14. Summarize some of the practical issues in using forecasting tools and approaches.

Skill-Building Exercises
1. Find a 4-period moving average forecast for the monthly burglaries data, compute MAD, MSE, and MAPE error metrics,
and determine if this model is better than the 2-period moving average discussed in the chapter (Table 7.2).
2. Try to identify the best set of weights for a 3-period moving average model for the burglary data that minimizes the MAD error metric.
3. Find the best value of the smoothing constant between 0.5 and 0.7 (in increments of 0.05) for exponential smoothing for the burglary data.
4. Use CB Predictor to find the best forecasting model for Electric Use in the Gas & Electric Excel file.
5. Set up and fit a third-order autoregressive model for the coal production example. Compare the results to the example in the chapter. What do you find?
6. Find the best multiple regression model for Electric Use in the Gas & Electric Excel file using the approach for incorporating seasonality.

Problems and Applications
1. The Excel file Closing Stock Prices provides data for four stocks over a six-month period.
a. Develop spreadsheet models for forecasting each of the stock prices using single moving average and single exponential smoothing.
b. Using MAD, MSE, and MAPE as guidance, find the best number of moving average periods and the best smoothing constant for exponential smoothing. (You might consider using data tables to facilitate your search.)
c. Compare your results to the best moving average and exponential smoothing models found by CB Predictor.
2. For the data in the Excel file Baseball Attendance do the following:
a. Develop spreadsheet models for forecasting attendance using single moving average and single exponential smoothing.
b. Using MAD, MSE, and MAPE as guidance, find the best number of moving average periods and the best smoothing constant for exponential smoothing.
c. Compare your results to the best moving average and exponential smoothing models found by CB Predictor.
3. For the data in the Excel file Ohio Prison Population do the following:
a. Develop spreadsheet models for forecasting both male and female populations using single moving average and single exponential smoothing.
b. Using MAD, MSE, and MAPE as guidance, find the best number of moving average periods and the best smoothing constant for exponential smoothing.
c. Compare your results to the best moving average and exponential smoothing models found by CB Predictor.
4. For the data in the Excel file Gasoline Prices do the following:
a. Develop spreadsheet models for forecasting prices using single moving average and single exponential smoothing.
b. Using MAD, MSE, and MAPE as guidance, find the best number of moving average periods and the best smoothing constant for exponential smoothing.
c. Compare your results to the best moving average and exponential smoothing models found by CB Predictor.
5. Construct a line chart for the data in the Excel file Arizona Population.
a. Suggest the best-fitting functional form for forecasting these data.
b. Use CB Predictor to find the best forecasting model.
6. Construct a line chart for each of the variables in the data file Death Cause Statistics, and suggest the best forecasting technique. Then apply CB Predictor to find the best forecasting models for these variables.
7. The Excel file Olympic Track and Field Data provides the gold medal–winning distances for the high jump, discus, and long jump for the modern Olympic Games. Develop forecasting models for each of the events. What does the model predict for the next Olympics and what are the confidence intervals?
8. Use CB Predictor to find the best forecasting model for the data in the following Excel files:
a. New Car Sales
b. Housing Starts
c. Coal Consumption
d. DJIA December Close
e. Federal Funds Rates
f. Mortgage Rates
g. Prime Rate
h. Treasury Yield Rates
9. Consider the data in the Excel file Consumer Price Index.
a. Use simple linear regression in CB Predictor to forecast the data. What would be the forecasts for the next six months?
b. Are the data autocorrelated? Construct first- and second-order autoregressive models and compare the results to part (a).
10. Consider the data in the Excel file Nuclear Power.

CASE: Energy Forecasting
The Excel file Energy Production & Consumption provides data on energy production, consumption, imports, and exports. You have been hired as an analyst for a government agency and have been asked to forecast these variables over the next 10 years. Apply forecasting tools and appropriate visual aids, and write a formal report to the agency director that explains these data and the future forecasts.

Appendix: Advanced Forecasting Models—Theory and
Computation
In this appendix, we present computational formulas for advanced models for time-series forecasting. The calculations are somewhat complex, but can be implemented on spreadsheets with a bit of effort.

Double Moving Average
Double moving average involves taking averages of averages. Let Mt be the simple moving average for the last k periods (including period t):
Mt = (At − k + 1 + At − k + 2 + … + At)/k
The double moving average, Dt, for the last k periods (including period t) is the average of the simple moving averages:
Dt = (Mt − k + 1 + Mt − k + 2 + … + Mt)/k
Using these values, the double moving average method estimates the values of at and bt in the linear trend model Ft + k = at + btk as:
at = 2Mt − Dt
bt = 2(Mt − Dt)/(k − 1)
These equations are derived essentially by minimizing the sum of squared errors using the last k periods of data. Once these parameters are determined, forecasts beyond the end of the observed data (time period T) are calculated using the linear trend model with the values of aT and bT. That is, for k periods beyond period T, the forecast is FT + k = aT + bTk. For instance, the forecast for the next period would be FT + 1 = aT + bT(1).
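These formulas can be checked with a short script. The following is a minimal Python sketch, our own helper rather than anything from the text or CB Predictor, which assumes the series has at least 2k − 1 observations:

```python
def double_moving_average_forecast(series, k, horizon=1):
    """Forecast F_(T+h) = a_T + b_T*h using double moving averages,
    where a_T = 2*M_T - D_T and b_T = 2*(M_T - D_T)/(k - 1)."""
    def moving_avg(values):
        # Simple k-period moving averages, one per full window.
        return [sum(values[i - k + 1:i + 1]) / k for i in range(k - 1, len(values))]

    M = moving_avg(list(series))  # simple moving averages M_t
    D = moving_avg(M)             # double moving averages D_t
    a = 2 * M[-1] - D[-1]                # level estimate a_T
    b = 2 * (M[-1] - D[-1]) / (k - 1)    # trend estimate b_T
    return a + b * horizon               # F_(T+horizon)
```

On a perfectly linear series the method recovers the trend exactly; with noisy data, the choice of k trades responsiveness against smoothing, just as for the single moving average.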
Double Exponential Smoothing
Like double moving average, double exponential smoothing is also based on the linear trend equation, Ft + k = at + btk, but the estimates of at and bt are obtained from the following equations:
at = αAt + (1 − α)(at − 1 + bt − 1)
bt = β(at − at − 1) + (1 − β)bt − 1

Additive Seasonality
The level and seasonal factors are estimated in the additive model using the following equations:
at = α(At − St − s) + (1 − α)at − 1
St = γ(At − at) + (1 − γ)St − s
where α and γ are smoothing constants. The first equation estimates the level for period t as a weighted average of the deseasonalized data for period t, (At − St − s), and the previous period's level. The seasonal factors are updated as well using the second equation. The seasonal factor is a weighted average of the estimated seasonal component for period t, (At − at), and the seasonal factor for the last period of that season type. Then the forecast for the next period is Ft + 1 = at + St − s + 1. For k periods out from the final observed period T, the forecast is:
FT + k = aT + ST − s + k
To initialize the model, we need to estimate the level and seasonal factors for the first s periods (e.g., for an annual season with quarterly data this would be the first 4 periods; for monthly data, it would be the first 12 periods, etc.). We will use the following approach:
at = (A1 + A2 + … + As)/s, for t = 1, 2, …, s
and
St = At − at, for t = 1, 2, …, s
That is, we initialize the level for the first s periods to the average of the observed values over these periods and the seasonal factors to the difference between the observed data and the estimated levels. Once these have been initialized, the smoothing equations can be implemented for updating.

Multiplicative Seasonality
The seasonal multiplicative model uses the smoothing equations:
at = α(At/St − s) + (1 − α)at − 1
St = γ(At/at) + (1 − γ)St − s
where α and γ are again the smoothing constants. Here, At/St − s is the deseasonalized estimate for period t. Large values of α put more emphasis on this term in estimating the level for period t. The term At/at is an estimate of the seasonal factor for period t. Large values of γ put more emphasis on this in the estimate of the seasonal factor. The forecast for period t + 1 is Ft + 1 = atSt − s + 1. For k periods out from the final observed period T, the forecast is:
FT + k = aTST − s + k
As in the additive model, we need initial values for the level and seasonal factors. We do this as follows:
at = (A1 + A2 + … + As)/s, for t = 1, 2, …, s
and
St = At/at, for t = 1, 2, …, s
Once these have been initialized, the smoothing equations can be implemented for updating.

Holt–Winters Additive Model
The Holt–Winters additive model is based on the equation:
Ft + k = at + btk + St − s + k
This model is similar to the additive model incorporating seasonality that we described in the previous section, but it also includes a trend component. The smoothing equations are:
at = α(At − St − s) + (1 − α)(at − 1 + bt − 1)
bt = β(at − at − 1) + (1 − β)bt − 1
St = γ(At − at) + (1 − γ)St − s
Here, α, β, and γ are the smoothing parameters for the level, trend, and seasonal components, respectively. The forecast for period t + 1 is:
Ft + 1 = at + bt + St − s + 1
The forecast for k periods beyond the last period of observed data (period T) is:
FT + k = aT + bTk + ST − s + k
The initial values of the level and seasonal factors are estimated in the same
fashion as in the additive model for seasonality. The initial values for the trend are bt = bs, for t = 1, 2, …, s, where:
bs = [(As + 1 − A1)/s + (As + 2 − A2)/s + … + (A2s − As)/s]/s
Note that each term inside the brackets is an estimate of the trend over one season. We average these over the first 2s periods.

Holt–Winters Multiplicative Model
The Holt–Winters multiplicative model is:
Ft + k = (at + btk)St − s + k
This model has the same basic smoothing structure as the additive seasonal model but is more appropriate for seasonal time series that increase in amplitude over time. The smoothing equations are:
at = α(At/St − s) + (1 − α)(at − 1 + bt − 1)
bt = β(at − at − 1) + (1 − β)bt − 1
St = γ(At/at) + (1 − γ)St − s
The forecast for period t + 1 is:
Ft + 1 = (at + bt)St − s + 1
The forecast for k periods beyond the last period of observed data (period T) is:
FT + k = (aT + bTk)ST − s + k
The initial values of the level and trend are estimated in the same fashion as in the additive model, with the seasonal factors initialized as St = At/at.

Reference
Evans, J. R. (2010). Statistics, data analysis, and decision modeling (4th ed.). New Jersey: Pearson. Retrieved from http://digitalbookshelf.argosy.edu/ (pages 235–268).