Chapter 1: Introduction Your Notes

1.3. MONTE CARLO SIMULATION

Like regression analysis, Monte Carlo simulation is a general term that
represents a particular method that has many applications and flavors.
The word “simulation” refers to the fact that we build an artificial model
of a real system in order to study and understand the system. The “Monte
Carlo” part of the name alludes to the randomness inherent in the analysis:

The name “Monte Carlo” was coined by [physicist Nicholas]
Metropolis (inspired by [Stanislaw] Ulam's interest in poker)
during the Manhattan Project of World War II, because of the
similarity of statistical simulation to games of chance, and because
the capital of Monaco was a center for gambling and similar
pursuits. Monte Carlo is now used routinely in many diverse
fields, from the simulation of complex physical phenomena such
as radiation transport in the earth's atmosphere and the simulation
of the esoteric subnuclear processes in high energy physics
experiments, to the mundane, such as the simulation of a Bingo
game or the outcome of Monty Hall's vexing offer to the
contestant in “Let's Make a Deal.”1

In this book, we will use the Monte Carlo method to teach you about
inference and the properties of different econometric estimators. The
basic idea is to have a computer run a chance process over and over so
that we can see the results. We’ll set up an artificial environment where
we know important parameters so that we can explore and check statistical
properties.

Do not be confused about the role of Monte Carlo simulation in
econometrics. It cannot be used to estimate unknown parameters (like the
average or a slope coefficient) since the whole idea is based upon
knowledge of these parameters. Instead, Monte Carlo simulation is used
to understand the properties of estimators themselves. Econometric
researchers use the Monte Carlo approach as a way to test-drive
estimators, i.e., to figure out how estimators perform under different
circumstances.

1
Introduction to Monte Carlo Methods, http://csep1.phy.ornl.gov/mc/node1.html.

To further explain the role of Monte Carlo simulation in econometrics, we
point out that the field itself is divided into applied and theoretical areas.
Applied econometrics is concerned with actually estimating parameters
like the rate of return on education or the marginal propensity to consume.
Monte Carlo simulation is not used here. Theoretical econometrics is
where different estimators are tested and evaluated. This can be done via
analytical methods that usually use mathematical theorems and logic or
with Monte Carlo simulation. Recent advances in personal computing
make it possible for you to take advantage of Monte Carlo simulation as
an alternative or corroborating technique to the conventional approach
based on statistical theory.

Let’s take a look at a concrete example of how Monte Carlo simulation
can be used. Suppose we know that Larry Bird, the legendary basketball
player, is a 90% free throw shooter. That is, the chance of his making any
given free throw is 90%, regardless of whether he made or missed his
previous free throw.2 Suppose further that we want to know how well the
sample percentage will perform as an estimator of Bird’s free throw
accuracy if we have a sample of 100 free throws. Put another way,
assume we have Bird, who we know is truly a 90% free throw shooter,
take 100 free throw attempts. What percentage of the 100 attempts will he
be likely to hit? We know that we should see something around 90%,
because that is his true long-run performance. But the fact that chance
plays a role in free throw shooting means that we may well get something
different from 90%. Now, the possibilities are anywhere from 0 to 100%,
but what are the likely results? Is it likely that we could see him make
only 72 out of 100 attempts for a sample percentage of 72%? Is making
every shot (100 straight free throws), giving him a 100% sample
percentage, something that we might see every once in a while? Or, are
results like 72% and 100% extremely rare and results like 88%, 93%, or
91% much more likely?

2
We cannot resist telling you that according to the web site, http://www.larrybird.com/stats.html, which
we visited on January 13, 1998, Bird’s lifetime NBA free throw percentage was 88.6% in the regular
season (3960 made out of 4471 attempts) and 89.0% in the playoffs (901 out of 1012). A question: was Bird
better in the playoffs? Trivial facts: Bird played in 164 playoff games and on average he attempted more
free throws per game in the playoffs than in the regular season (6.2 versus 5.8 per game).

In statistics, “rare” and “likely” are important words, while “possible” is
not too interesting.3 If results like 72% were quite common, we’d
conclude that a single sample percentage of made shots out of 100 free
throws is a bad way to gauge Bird’s true skill. After all, if we did not
know his true percentage and we had only one sample with which to guess
his true, but unknown, shooting percentage, we might get a result like
72% and we’d be way off. If, on the other hand, we consistently get a
sample percentage within, say, 1 percentage point of 90%, then we’d
argue that the sample percentage of made shots out of 100 free throws is a
pretty good gauge of Bird’s true skill.

What we’re trying to do, of course, is to evaluate the likely size of the
spread in the sample percentage of a sample of 100 free throws. Each free
throw has some chance built into it and so the sample percentage of 100
free throws also has a chance component. We need to figure out how
much variation there is in the sample percentage of 100 free throws. In
other words, we need to find the SE (standard error) of the sample
percentage. A small SE of the sample percentage is good—it means that
the observed sample percentages are unlikely to stray far from 90%.

3
It’s “possible” that a 90% free throw shooter would miss 100 in a row. The likelihood of this outcome,
$0.1^{100}$, is so remote that we ignore it completely. The chances of making every shot aren’t so great either:
$0.9^{100} \approx 0.00266\%$.


There are two routes to figuring out the variation in the sample
percentage. The first is statistical theory.4 The second route is the Monte
Carlo approach: this consists of producing a simulation of the data
generating process, generating a series of replications of that process, and
analyzing the results of the experiment. How do we implement this
strategy?

[Margin diagram: Two ways to find the SE: Statistical Theory and Monte Carlo Simulation]

Open the Excel file called MonteCarlo.xls now.


Make sure you click the Enable Macros button when opening the
file in order for the workbook to function properly.

Read the brief description in the sheet Introduction, then go to the Rand
sheet (by clicking on the sheet tab at the bottom of the screen) to learn
about Excel’s random number generation capability. When you are
finished, you should understand how Excel generates random numbers
and how the Excel functions RAND() and IF(expression, true, false) can
be used to create a virtual Larry Bird shooting machine.

Read the Introduction and Rand sheets now.

You have learned that we can use Excel to simulate the result of a single
free throw by having it draw a random number uniformly between 0 and
1. If the number is below 0.9, Excel says, “hit;” if the randomly drawn
number is equal to or above 0.9, it says, “miss.” We can have Excel
register “1” for a hit and “0” for a miss.
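
For readers who want to see the same logic outside the workbook, here is a minimal sketch in Python. It is our own illustration, not a transcription of MonteCarlo.xls: the function name shoot, the constant HIT_PROBABILITY, and the seed are assumptions made for the example. Python's random() plays the role of RAND(), and the conditional expression plays the role of IF(expression, true, false).

```python
import random

HIT_PROBABILITY = 0.9  # assumed true free throw accuracy, as in the text


def shoot(rng: random.Random, p: float = HIT_PROBABILITY) -> int:
    """Simulate one free throw: return 1 for a hit, 0 for a miss."""
    # rng.random() is uniform on [0, 1), much like Excel's RAND().
    # A draw below p counts as a hit; a draw at or above p is a miss.
    return 1 if rng.random() < p else 0


rng = random.Random(12345)  # fixed seed (arbitrary) so the run can be repeated
shots = [shoot(rng) for _ in range(10)]
print(shots)  # a list of ten 1s and 0s
```

Changing or removing the seed gives a different sequence of hits and misses, which is the Python counterpart of recalculating the sheet.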

To simulate Bird shooting 100 free throws is simple: just repeat the
formula in 100 cells as we show in the sheet called Sample. Call the result
of 100 “shots” a single repetition of the simulation. The key information
from a single repetition would be the sample percentage of 1’s. You
should press F9, per the instructions in the Sample sheet, to make sure you
understand that the sample percentage of 100 attempts varies: press F9
again and again and watch how the sample percentage bounces around.
Sometimes Larry does exceptionally well, maybe 94% or 95%, but every
once in a while he does quite badly. Well, never as poorly as, say, Shaq.5
Extremely badly for Larry is 85%, and below 80% is really rare. You
might repeatedly press F9 for 20 minutes and not see 80%.

4
We review exactly how statistical theory can be used to solve this problem in the next chapter.

Now that you understand how the success or failure of a single free throw
is determined via Excel’s RAND() function and IF statement and how we
calculate the sample percentage from 100 free throws, we can turn to
actually creating and interpreting Monte Carlo simulation results.
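
A single repetition, 100 shots and one sample percentage, can be sketched the same way in Python. This is an illustrative stand-in for the Sample sheet, not its actual contents; the function name sample_percentage and its default arguments are our assumptions. Each pass through the loop below corresponds to one press of F9.

```python
import random


def sample_percentage(rng: random.Random, n_shots: int = 100, p: float = 0.9) -> float:
    """Shoot n_shots free throws and return the percentage made."""
    # Each draw below p is a hit, exactly as in the single-shot rule.
    hits = sum(1 for _ in range(n_shots) if rng.random() < p)
    return 100.0 * hits / n_shots


rng = random.Random()  # unseeded, so every run differs, like pressing F9
for press in range(1, 6):
    print(f"F9 press {press}: {sample_percentage(rng):.0f}% made")
```

Running the script several times should show the sample percentage bouncing around 90%, usually within a few points.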

To figure out the spread of the sample percentage in the Larry Bird
example, we simply conduct lots of repetitions and examine the resulting
histogram of results. Let’s say we do 1,000 repetitions. Now we have
1,000 sample percentages. We can find the mean of these sample
percentages and their SD (standard deviation). You are almost certain to get
an average close to 0.90 (90%). The question is, “How much spread is
there in the 1,000 sample percentages?” The SD of the 1,000 sample
percentages is a Monte Carlo-generated approximation to the true, exact
SE of the sample percentage and the histogram of the 1,000 sample
percentages approximates the probability histogram (or sampling
distribution).
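
The procedure just described (many repetitions, then the mean and SD of the resulting sample percentages) might look like this in Python. Treat it as a sketch under stated assumptions: the names monte_carlo and sample_percentage and the seed are ours, and we use the population SD (statistics.pstdev) without knowing which SD formula the workbook applies. With 1,000 values the two SD formulas differ only negligibly.

```python
import random
import statistics


def sample_percentage(rng: random.Random, n_shots: int = 100, p: float = 0.9) -> float:
    """One repetition: the percentage made out of n_shots free throws."""
    hits = sum(1 for _ in range(n_shots) if rng.random() < p)
    return 100.0 * hits / n_shots


def monte_carlo(n_reps: int, seed=None, n_shots: int = 100, p: float = 0.9) -> list:
    """Run n_reps repetitions and return the list of sample percentages."""
    rng = random.Random(seed)
    return [sample_percentage(rng, n_shots, p) for _ in range(n_reps)]


results = monte_carlo(1000, seed=1)       # seed chosen arbitrarily
mean_pct = statistics.mean(results)       # should land near 90
sd_pct = statistics.pstdev(results)       # Monte Carlo approximation to the SE
print(f"Average = {mean_pct:.2f}%  SD = {sd_pct:.3f}%")
print(f"Min = {min(results):.0f}%  Max = {max(results):.0f}%")
```

The list results holds the raw material for the empirical histogram: counting how often each percentage occurs gives the bar heights.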

Monte Carlo simulation will always be an approximation to the exact truth
because the truth in a sampling context is based on an infinite number of
repetitions. One thousand repetitions will usually generate a pretty good
approximation, but ten thousand would be even closer to the truth. No
finite number of repetitions, no matter how large, will give the exact
answer. Monte Carlo simulation cannot be used to get the exact right
answer, but it can give an increasingly good approximation as the number
of repetitions rises.

5
Shaquille O’Neal is a tremendously gifted seven-foot-one-inch athlete in the NBA. See his web site:
http://www.shaq.com/.
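
To watch the approximation improve as repetitions increase, one can compare the Monte Carlo SD at several repetition counts against the value given by the standard formula for the SE of a sample percentage, 100 × sqrt(p(1 − p)/n). The sketch below is our illustration; the function name mc_sd, the repetition counts, and the seed are assumptions. Because chance is involved, the error will not necessarily shrink at every step, only as a tendency.

```python
import math
import random
import statistics


def mc_sd(n_reps: int, rng: random.Random, n_shots: int = 100, p: float = 0.9) -> float:
    """SD of n_reps simulated sample percentages (in percentage points)."""
    pcts = []
    for _ in range(n_reps):
        hits = sum(rng.random() < p for _ in range(n_shots))
        pcts.append(100.0 * hits / n_shots)
    return statistics.pstdev(pcts)


# Exact SE from the textbook formula, in percentage points (equals 3).
exact_se = 100.0 * math.sqrt(0.9 * 0.1 / 100)

rng = random.Random(2024)  # arbitrary seed for a repeatable run
for n_reps in (100, 1000, 10000):
    approx = mc_sd(n_reps, rng)
    print(f"{n_reps:>6} repetitions: SD = {approx:.3f}%  (error {approx - exact_se:+.3f})")
```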

We ran a Monte Carlo analysis of the sample percentage of 100 attempts
with our simulated Larry Bird shooting free throws. Here are our results:

[Figure: Empirical Histogram for 10000 Repetitions. Horizontal axis: sample percentage made, 78% to 100%; vertical axis: number of samples, 0 to 1400.]

Summary Statistics
Average = 89.99%
SD = 2.995%
Max = 99%
Min = 79%

Figure 13.1: Monte Carlo Simulation of Percentage Made (MonteCarlo.xls)

The bars in the histogram show how many samples of 100 free throws had
a particular percentage made. Of the 10,000 repetitions of 100 free
throws, the lowest sample percentage was 79% and the highest was 99%.
In almost 1400 samples, the computer simulation of Larry Bird made
exactly 90 out of 100 free throw attempts. The mean of the 10,000
sample percentages was 89.99% with a standard deviation of 2.995%.
This analysis says that the likely size of chance error for the sample
percentage of 100 free throws is about 3%. Thus, we should not be
surprised to find that Larry Bird sinks 87% or 93% of his free throws
when he takes 100 attempts. It would be very surprising, however, if he
hit all 100, or if he hit only 80 out of 100 since these values are more than
3 standard deviations away, and in most cases that means such outcomes
are rare indeed.

Now it’s your turn. From the Sample sheet, click on the button. A new
sheet called MCSim appears in the workbook and you are looking at the
results of a previous Monte Carlo simulation of the sample percentage of
100 free throws. There is one extremely important difference between the
graph above and the graph on the MCSim sheet: the former is dead and the
latter is alive. That is, the graph on the Excel sheet will change as the
values in column B change. That means you can easily run your own Monte
Carlo simulation and do so as many times as you wish. Simply click on the
button in order to run your own Monte Carlo simulation.

A dialog box like this one will appear:

After clicking the OK button, you will be able to watch the progress of the
simulation. So, how did your simulation turn out? Is your histogram
similar to ours?


A more subtle implication of the Monte Carlo analysis just performed is
that the empirical histogram of the Monte Carlo simulation for Larry Bird
appears slightly skewed to the left, which you can see by looking closely
at the picture. This is not an accident of our particular run. Look at your
simulation results carefully. Is the left tail a little longer than the right? Is
the histogram symmetrical around the expected value of 90%? In other
words, is the fraction of samples with 91% made free throws roughly
equal to the fraction of samples with 89%? How about the fraction of
samples with 88% free throws made versus that for 92%? Two points
can be made here. First, it is not possible to do better than 100%, while
79% and below are possible outcomes. Second, our histogram, and probably
yours as well, is clearly asymmetric. The histogram of the sample
percentage of 100 free throws approximately follows the normal
distribution, but it is not distributed exactly normally. We’ll discuss this
point in greater depth while reviewing statistical inference. For now, we
remind you that the Central Limit Theorem tells us that the sampling
distribution of the sample percentage comes to resemble the normal
distribution more closely as the sample size increases.
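
The asymmetry questions above can also be checked against exact probabilities rather than one simulation run. The sketch below uses the binomial formula for the chance of exactly k hits in 100 independent shots, a standard result that this section has not introduced, so treat it as an outside check rather than part of the workbook. The function name binom_pmf and the particular distances compared are our choices.

```python
from math import comb


def binom_pmf(k: int, n: int = 100, p: float = 0.9) -> float:
    """Chance of exactly k hits in n independent shots, each made with chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Compare outcomes the same distance below and above the expected value of 90.
for distance in (1, 2, 4, 6, 8):
    below, above = 90 - distance, 90 + distance
    print(f"P({below}%) = {binom_pmf(below):.4f}   P({above}%) = {binom_pmf(above):.4f}")

# Total chance in each tail, six or more points away from 90%.
left_tail = sum(binom_pmf(k) for k in range(0, 85))
right_tail = sum(binom_pmf(k) for k in range(96, 101))
print(f"P(84% or less) = {left_tail:.4f}   P(96% or more) = {right_tail:.4f}")
```

If the two columns matched at every distance the distribution would be symmetric; the comparison of the two tails is where the longer left tail should show up.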

Let’s summarize the Larry Bird free throw shooting example. We wanted
to know how much spread there was in the sample percentage. Instead of
traditional, analytical methods based on the theory of probability and
statistics, we adopted the Monte Carlo simulation strategy. We repeatedly
resampled and thereby obtained an approximation to the SE of the sample
percentage of 100 attempts. Our run gave us a value of about 3%. What
did you get? The formula for the SE of the sample percentage gives us
precisely 3%.6 It is, of course, no accident that our Monte Carlo
experiments yield results close to the standard formulas of statistical
theory.
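6
The appropriate formula is: $\mathrm{SE} = \sqrt{p(1-p)/n} \times 100\% = \sqrt{0.9 \times 0.1 / 100} \times 100\% = 3\%$,
where $p = 0.9$ is the chance of making a free throw and $n = 100$ is the number of attempts.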


Why bother then with Monte Carlo simulation? First, it enables you to
see clearly the source of chance error and variation in a problem.
Formulas often make it difficult to see what’s really going on. While
some people quickly understand and accept the notion of randomness and
variation, we believe most people learn much better when they actually
see variation. We believe many more people will really get it when they
hit F9 to draw another sample and see that sample percentage bouncing
around. By hitting F9, the student is doing and understanding instead of
passively reading or listening.

Second, Monte Carlo techniques drive home the concept of the Standard
Error, surely one of the most difficult ideas in statistics and econometrics
for beginning students. The SE measures the spread of outcomes of
chance processes. Visually, it is the spread of the probability histogram of
the different outcomes of the chance process. The Monte Carlo method
allows us to approximate the probability histogram and therefore the SE
just by running numerous repetitions of the same data generating process.

While our primary use of Monte Carlo is to teach you econometrics, we
also would like to point out that there are many random variable problems
with no analytical solutions. That is, traditional statistical theory cannot
solve them. This happens in econometrics often when small or finite
sample sizes are under consideration. The advent of extremely fast
computers has opened a new avenue for solving these problems. Thus,
it’s not merely a question of a neat alternative to a tried and true approach
—Monte Carlo methods offer solutions to previously impossible
problems.


To see another example of the method of Monte Carlo, click on the
button (on the Sample sheet near cell D17) a few times.
Our simulated Larry Bird exhibits variation in the longest streak of made
free throws in each sample of 100 attempts. What’s the average longest
streak? What’s the spread in the distribution of longest streaks? As
before, we forgo analytical solutions to these questions in favor of Monte
Carlo analysis.7

Click on the button (on the Sample sheet near cell D22)
to see a demonstration of how a Monte Carlo simulation can be used to
determine approximately the average and spread of the Max Streak
sampling distribution. As before, a new sheet, this time named Streak,
appears in the workbook with results of 1000 repetitions available for
your inspection. Notice that Max Streak is not normally distributed: it
has a long right-hand tail.
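
The Max Streak experiment can be sketched in Python as well. As with the earlier examples, this is our own illustration rather than the workbook's macro: the function names longest_streak and simulate_max_streaks, the default of 1,000 repetitions, and the seed are assumptions.

```python
import random
import statistics


def longest_streak(shots) -> int:
    """Length of the longest run of consecutive hits (truthy values) in shots."""
    best = current = 0
    for made in shots:
        current = current + 1 if made else 0  # a miss resets the running streak
        best = max(best, current)
    return best


def simulate_max_streaks(n_reps: int = 1000, n_shots: int = 100,
                         p: float = 0.9, seed=None) -> list:
    """Return the longest streak from each of n_reps samples of n_shots shots."""
    rng = random.Random(seed)
    return [
        longest_streak(rng.random() < p for _ in range(n_shots))
        for _ in range(n_reps)
    ]


streaks = simulate_max_streaks(seed=7)  # arbitrary seed for a repeatable run
print(f"Average max streak = {statistics.mean(streaks):.2f}")
print(f"SD of max streak   = {statistics.pstdev(streaks):.2f}")
print(f"Shortest = {min(streaks)}, longest = {max(streaks)}")
```

Tabulating streaks, for example with collections.Counter, gives the empirical histogram of the longest streak, which can be compared with the shape described above.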

You might want to try your own Monte Carlo analysis by clicking the
button. Once again the dialog box will describe the simulation and the
progress bar will keep you updated on where the simulation stands. The
progress bar is more useful this time because the simulation takes longer
(calculating the longest streak in a stretch of 100 free throws is a lot
harder than calculating the percentage made). You can do other work while
the simulation is running, but this may slow down the simulation itself
(after all, your computer will be busy doing other tasks instead of
grinding out the next repetition). If your screen saver comes on, this will
also slow down the simulation. Notice also that you can interrupt the
simulation by pressing the Esc (escape) key in the upper left-hand corner
of your keyboard. Excel will prompt you with a dialog box and you can
click the End button to stop the simulation.

7
For an analytical approximation to the exact distribution of the max streak problem, see William Feller,
An Introduction to Probability Theory and Its Applications, Vol. 1, 3rd Edition, Revised Printing, John
Wiley and Sons, p. 325. Our Monte Carlo results agree with Feller's approximation.

Of course, if you happen to be running on the latest generation chip, these
suggestions are moot since the simulation will fly through 10,000
repetitions.

Monte Carlo Simulation Summary

This concludes our explanation of Monte Carlo simulation. We’ll use
Monte Carlo analysis repeatedly to examine the properties of statistical
estimators and to explain a variety of ideas and concepts in econometrics.

With Excel’s RAND() function, it will be fast and easy to draw many
random samples and then examine the resulting distribution. This will
provide a visual, concrete demonstration of difficult, abstract ideas. In
addition, with Excel, you will be able to run your own simulations and
compare your results to ours. If a point is unclear, you can always run the
simulation again and keep doing so until it makes sense.
