Continuous Distributions
V. Bardis
Theory: Consider the case where the variable of interest, X, is continuous, that is, it can take on any
value in an interval (or union of intervals).
Example: X = family wealth, X = rhino height, X = running speed, X = guitar weight, etc.
Theory: Further suppose that there exists an interval (or intervals) of values of X such that a
positive portion of the population has values inside every subinterval of positive length. Also
assume that the portion for every interval of size zero is equal to zero, i.e., the relative frequency
(or probability) of any single value is zero. Then X is a continuous variable.
Example: The following is an example that fits the theoretical model of a continuous distribution quite
well.
Suppose we have 1 kg of gold dust (shown in purple below) that we spread evenly inside a fish tank of
length 2.
The fish tank is put on a graph where the horizontal axis is used to identify different parts of the
fish tank corresponding to intervals of its base (location) and the vertical axis is used to record
the height (or density) of the mass of gold at every point inside the tank.
Since the gold is evenly spread inside the tank, the height (or density) is the same at every
point.
The total mass of gold inside the tank is equal to " Height x Base " = (1/2) (2) = 1, i.e., the
purple mass we see is 100% of the gold.
Suppose we insert a straw of width α and extract the gold inside the straw. What portion of
gold will we extract?
Answer: (1/2) α = α /2
Observe then that as α increases the portion of the gold we extract increases.
Likewise, as α decreases the portion of the gold we extract decreases, so that as the straw
becomes very thin, i.e., α -> 0, we end up with a near-zero portion of the gold.
Suppose α = 0.10 and we apply the straw starting at location 1, which means it will cover up
to location 1.10, an interval equal to the width of the straw. Since the density is 1/2 from 1 to
1.10, we will extract (1/2)(0.10) = 0.05 or 5% of the gold. Thus we say that between 1 and 1.10
we find 5% of the gold. More formally, P(1 ≤ X ≤ 1.10) = (1/2)(1.10 − 1) = 0.05.
Because the gold is evenly spread inside the tank, the portion of gold that can be extracted at
any location inside the tank with a straw of width α is the same. This is the case of a uniform
continuous distribution.
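The straw calculation can be sketched in a few lines of Python. This is only an illustration, assuming the tank spans locations 0 to 2 with constant density 1/2; the function name `straw_portion` is hypothetical and not part of the notes.

```python
# Illustrative sketch: portion of gold extracted by a straw from an evenly filled tank.
# Assumption: the tank spans [0, 2] and the density is 1/2 everywhere inside it.

TANK_START, TANK_END = 0.0, 2.0
DENSITY = 1.0 / (TANK_END - TANK_START)  # 1/2, so that the total area equals 1


def straw_portion(start, width):
    """Portion of the gold between `start` and `start + width`, clipped to the tank."""
    left = max(start, TANK_START)
    right = min(start + width, TANK_END)
    return DENSITY * max(right - left, 0.0)


print(straw_portion(1.0, 0.10))  # straw covering 1 to 1.10
print(straw_portion(0.3, 0.10))  # same width at a different location
```

Both calls return about 0.05 (up to floating-point rounding), which is the "same portion at any location" property of the uniform case.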
As shown below, in the case of a uniform continuous distribution we have a "density curve"
(the top boundary of the purple mass) that is flat (i.e., parallel to the x axis).
For every continuous distribution, the area under the density curve over a given interval is the
relative frequency (or probability) of that interval.
Formally, the curve is the graph of the frequency (or probability) density function of the distribution,
denoted by f. In our example, f is given by
\[
f(x) = \begin{cases} 1/2, & 0 \le x \le 2 \\ 0, & \text{otherwise} \end{cases}
\]
Note that f(x) is NOT a relative frequency (or probability) in the case of a continuous distribution.
It is simply the "height of the density curve at x".
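To make the distinction between height and area concrete, here is a minimal Python sketch. It assumes the fish-tank density (1/2 on the interval from 0 to 2) and approximates areas with a simple midpoint rule; the helper name `area` is hypothetical.

```python
def f(x):
    """Density of the fish-tank example: 1/2 on [0, 2], 0 elsewhere."""
    return 0.5 if 0.0 <= x <= 2.0 else 0.0


def area(a, b, n=10_000):
    """Approximate the area under f between a and b with the midpoint rule."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h


print(f(1.0))           # height of the density curve at x = 1
print(area(1.0, 1.0))   # area over an interval of width zero
print(area(1.0, 1.10))  # portion of the gold between 1 and 1.10
print(area(0.0, 2.0))   # total area under the density curve
```

The height at x = 1 is 1/2, yet the area over the single point x = 1 is zero. Only areas over intervals of positive width represent portions of the gold.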
As with discrete distributions, we can trace cumulative frequencies (or probabilities) from the smallest to
the largest values of X using what is known as the cumulative frequency (or distribution) function F(x).
Think of the "integral" sign in what follows as "the area under the curve".
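\[
F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt
\]
and
\[
P(a \le X \le b) = \int_{a}^{b} f(x)\,dx = F(b) - F(a)
\]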
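For the fish-tank example the cumulative function has a simple closed form, because the area from 0 up to x is a rectangle of height 1/2 and base x. The sketch below assumes the usual definition F(x) = P(X ≤ x); the helper `portion_between` is a hypothetical name.

```python
def F(x):
    """Cumulative distribution function of the fish-tank example."""
    if x < 0.0:
        return 0.0          # no gold to the left of the tank
    if x > 2.0:
        return 1.0          # all of the gold lies to the left of x
    return 0.5 * x          # area of the rectangle from 0 to x


def portion_between(a, b):
    """Portion of the population with a <= X <= b, computed as F(b) - F(a)."""
    return F(b) - F(a)


print(F(1.0))                      # half of the gold lies to the left of 1
print(portion_between(1.0, 1.10))  # about 0.05, the straw example
```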
Basic Properties of a Continuous Distribution
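(F(x) is non-decreasing in x)
Population Mean
\[
\mu = \int_{-\infty}^{\infty} x\, f(x)\,dx
\]
Population Variance
\[
\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx
\]
Population Median
The median m is the value for which F(m) = 0.5.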
In the example, the median is easy to find: 0.5 = F(x) => 0.5 = (1/2) x => x = 1, i.e., m = 1.
Because the distribution is symmetric, the mean must equal the median, so the mean is also 1.
The variance is equal to 1/3 and so the standard deviation is equal to sqrt(1/3), approximately 0.577.
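These values can be checked numerically. The sketch below assumes the usual definitions of the population mean (the integral of x f(x)) and the population variance (the integral of (x − mean)² f(x)), and approximates both integrals with a midpoint rule; `integrate` is a hypothetical helper.

```python
def f(x):
    """Density of the fish-tank example: 1/2 on [0, 2], 0 elsewhere."""
    return 0.5 if 0.0 <= x <= 2.0 else 0.0


def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g from a to b."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h


mean = integrate(lambda x: x * f(x), 0.0, 2.0)
variance = integrate(lambda x: (x - mean) ** 2 * f(x), 0.0, 2.0)
std_dev = variance ** 0.5

print(mean, variance, std_dev)
```

The printed values should be close to 1, 1/3 and sqrt(1/3), in agreement with the values stated above.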
[Equations missing from the source: (density function), (population mean), (population median), (population variance).]
The gold is no longer uniformly distributed inside the tank. The height (or density) varies with
location. It follows that a straw of width α will extract different amounts at different locations.
But note that, as before, the area of the triangle is equal to (1/2) x Height x Base = (1/2) (1) (2) = 1.
The area under the density curve must always be equal to 1.
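The exact shape of the triangle is not recoverable from the text. Purely for illustration, the sketch below assumes a symmetric triangle on the base from 0 to 2 with peak height 1 at x = 1. Under that assumption it shows that the total area is still 1, while a straw of the same width extracts different portions at different locations.

```python
def f_triangle(x):
    """Hypothetical triangular density: base [0, 2], peak height 1 at x = 1 (an assumption)."""
    if 0.0 <= x <= 1.0:
        return x
    if 1.0 < x <= 2.0:
        return 2.0 - x
    return 0.0


def area(a, b, n=10_000):
    """Approximate the area under f_triangle between a and b with the midpoint rule."""
    h = (b - a) / n
    return sum(f_triangle(a + (i + 0.5) * h) for i in range(n)) * h


total = area(0.0, 2.0)        # whole triangle
near_edge = area(0.0, 0.10)   # straw of width 0.10 at the left edge
near_peak = area(0.95, 1.05)  # straw of width 0.10 centred on the peak

print(total, near_edge, near_peak)
```

With this assumed shape the straw at the peak extracts far more gold than the straw at the edge, even though both have width 0.10.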
As before, we are not concerned with the exact calculations of the population parameters here. Instead we wish to focus
on how to identify "frequencies" or "portions" (and eventually "probabilities") as areas under the density curve and how to
identify what are known as quantiles using the cumulative distribution function.
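A quantile can be read off the cumulative distribution function by solving F(x) = p for x; the median is the case p = 0.5. The sketch below does this by bisection for the uniform fish-tank example, where F(x) = x/2 between 0 and 2. The function name `quantile` and its parameters are hypothetical, and the method assumes F is continuous and increasing on the search interval.

```python
def F(x):
    """Cumulative distribution function of the uniform fish-tank example."""
    return min(max(0.5 * x, 0.0), 1.0)


def quantile(p, lo=0.0, hi=2.0, tol=1e-10):
    """Find x with F(x) = p by bisection on the interval [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if F(mid) < p:
            lo = mid   # the quantile lies to the right of mid
        else:
            hi = mid   # the quantile lies at or to the left of mid
    return (lo + hi) / 2.0


print(quantile(0.5))   # the median, close to 1
print(quantile(0.25))  # the first quartile, close to 0.5
```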
The Normal Distribution (the "Bell Curve")
The "Standard Normal Distribution" is the Normal distribution with zero mean and variance equal to 1.
The Density Function and the Cumulative Distribution Function of the Normal Distribution
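For a Normal distribution with mean μ and variance σ²,
\[
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left( -\frac{(x-\mu)^2}{2\sigma^2} \right)
\]
and
\[
F(x) = \int_{-\infty}^{x} f(t)\,dt
\]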
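Both functions of the Normal distribution can be evaluated with Python's standard library. The sketch below uses the standard formula for the Normal density and expresses the cumulative function through the error function `math.erf`; the function names `normal_pdf` and `normal_cdf` are hypothetical, and the defaults correspond to the Standard Normal case.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the Normal density curve at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Area under the Normal density curve to the left of x."""
    z = (x - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


print(normal_pdf(0.0))   # peak height of the standard bell curve
print(normal_cdf(0.0))   # half of the area lies to the left of the mean
print(normal_cdf(1.96))  # close to 0.975
```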
The Z-Table: The Table for the Cumulative Frequencies of the Standard Normal Distribution
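The entries of such a table can be regenerated rather than looked up. The sketch below is one possible layout, assuming the common convention that rows give z to one decimal place and columns add the second decimal; the names `phi` and `z_table_row` are hypothetical.

```python
import math


def phi(z):
    """Cumulative frequency of the Standard Normal distribution at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def z_table_row(z_tenths):
    """Entries for z = z_tenths/10 + 0.00, 0.01, ..., 0.09, rounded to 4 decimals."""
    return [round(phi(z_tenths / 10 + hundredths / 100), 4) for hundredths in range(10)]


header = "  z  " + " ".join(f"{h / 100:.2f}  " for h in range(10))
print(header)
for tenths in range(0, 31, 5):  # every fifth row only, to keep the output short
    cells = " ".join(f"{v:.4f}" for v in z_table_row(tenths))
    print(f"{tenths / 10:.1f}  {cells}")
```

For example, the entry in row 1.9 and column 0.06 is the cumulative frequency at z = 1.96.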
- The Normal distribution is often used as the "distribution of errors" in measurement (Gauss 1801/1809), with the
properties that
* Positive and negative Errors of equal magnitude are equally likely, i.e., Errors are symmetric about 0, and so
Expected Error = 0
* Large Errors are less likely than Small Errors
* In the presence of several measurements, the most likely value is their average
- It is often used to construct models of risk and uncertainty due to its simplicity.
- It is a good approximation for other important distributions such as the Binomial Distribution and
the distribution of the "sum" of independent and identically distributed random
variables, and therefore of the "distribution of the sample mean".
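The quality of the Normal approximation to the Binomial can be checked directly. The sketch below compares an exact Binomial cumulative probability with its Normal approximation. The values n = 100, p = 0.5 and k = 55 are hypothetical, chosen only for illustration. Adding 0.5 to k is the usual continuity correction, a refinement not discussed in these notes.

```python
import math


def normal_cdf(x, mu, sigma):
    """Area under the Normal density curve to the left of x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X following a Binomial(n, p) distribution."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n, p, k = 100, 0.5, 55               # hypothetical values chosen for illustration
mu = n * p                           # mean of the Binomial
sigma = math.sqrt(n * p * (1 - p))   # standard deviation of the Binomial

exact = binomial_cdf(k, n, p)
approx = normal_cdf(k + 0.5, mu, sigma)  # +0.5 is the continuity correction

print(exact, approx)
```

For these values the two numbers agree to about three decimal places.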