ProbabilityStatistics_Probability3 (1)
Random variables
\[
f(x) = \frac{1}{b-a}\,\mathbb{1}_{(a,b)}(x) = \begin{cases} \dfrac{1}{b-a} & \text{if } a < x < b,\\[4pt] 0 & \text{otherwise.} \end{cases}
\]
This type of law appears when we want to take a point at random in an interval.
The distribution function is easily calculated. For x ∈ (a, b) we have
\[
F(x) = \int_a^x \frac{1}{b-a}\,dy = \frac{x-a}{b-a}.
\]
So,
\[
F(x) = \begin{cases} 0 & \text{if } x < a,\\[2pt] \dfrac{x-a}{b-a} & \text{if } a \le x < b,\\[4pt] 1 & \text{if } x \ge b. \end{cases}
\]
This tells us that the probability that the value of X is in a certain subinterval (c, d) of
the interval (a, b) is proportional to the length of this subinterval, specifically
\[
P(c < X < d) = \int_c^d \frac{1}{b-a}\,dx = \frac{d-c}{b-a}.
\]
Note that the expectation E(X) = (a + b)/2 is the midpoint of the interval (a, b).
Slightly longer calculations would give us the following value of variance:
\[
\operatorname{Var}(X) = \frac{(b-a)^2}{12}.
\]
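As a quick numerical sanity check (a sketch, not part of the text's derivation), the formulas for the expectation and the variance of a uniform variable can be verified by Monte Carlo sampling; the interval (2, 10) is an arbitrary choice for the check.

```python
import random

# Monte Carlo check of E(X) = (a + b)/2 and Var(X) = (b - a)^2 / 12
# for X ~ U(a, b); the interval (2, 10) is an arbitrary choice.
a, b = 2.0, 10.0
random.seed(0)
samples = [random.uniform(a, b) for _ in range(200_000)]

mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)

print(mean)  # close to (a + b)/2 = 6
print(var)   # close to (b - a)^2 / 12 = 5.333...
```

With 200 000 samples the empirical mean and variance agree with the theoretical values to about two decimal places.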
Example. To make a safety study we choose a random point on the AVE route between
Barcelona and Madrid, which is 506 km long. What is the probability that the point is less
than 150 km from Barcelona?
We define the variable X = "Distance from Barcelona (in km)". We have X ∼ U(0, 506). Its density is
\[
f(x) = \frac{1}{506}\,\mathbb{1}_{(0,506)}(x)
\]
and its distribution function is
\[
F(x) = \begin{cases} 0 & \text{if } x < 0,\\[2pt] \dfrac{x}{506} & \text{if } 0 \le x < 506,\\[4pt] 1 & \text{if } x \ge 506. \end{cases}
\]
So,
\[
P(X < 150) = F(150) = \frac{150}{506} \approx 0.2964.
\]
Note that if we are asked for the probability of any particular point, for example 183.4276, we must answer that it is zero:
P (X = 183.4276) = 0.
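The example's calculation can be reproduced with a short Python sketch; the helper name `uniform_cdf` is chosen here for illustration and does not come from the text.

```python
def uniform_cdf(x, a=0.0, b=506.0):
    # Distribution function of U(a, b); the defaults match the
    # Barcelona-Madrid route of the example (506 km).
    if x < a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

p = uniform_cdf(150)   # P(X < 150) = 150/506
print(round(p, 4))     # 0.2964
```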
\[
f(x) = \begin{cases} \lambda e^{-\lambda x} & \text{if } x \ge 0,\\[2pt] 0 & \text{otherwise.} \end{cases}
\]
We write X ∼ Exp(λ).
This distribution is a good model for many situations where the random variable is a waiting time. For example, the time between two consecutive particle emissions of a radioactive substance fits the exponential distribution well.
Expectation and variance. Expectation is given by
\[
E(X) = \int_0^\infty x \cdot \lambda e^{-\lambda x}\,dx = \frac{1}{\lambda}.
\]
The value of the integral is obtained by integration by parts.
Analogous calculations would give us the following value of variance:
\[
\operatorname{Var}(X) = \frac{1}{\lambda^2}.
\]
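The value E(X) = 1/λ can also be checked numerically (a sketch, not from the text) by approximating the integral above with a trapezoidal sum; λ = 2 is an arbitrary rate chosen for the check.

```python
import math

lam = 2.0  # arbitrary rate chosen for this check

def integrand(x):
    # Integrand of E(X): x * lam * exp(-lam * x)
    return x * lam * math.exp(-lam * x)

# Trapezoidal sum on [0, T]; for T = 40 the neglected tail of the
# integral is negligibly small.
T, n = 40.0, 200_000
h = T / n
integral = h * (0.5 * integrand(0.0)
                + sum(integrand(i * h) for i in range(1, n))
                + 0.5 * integrand(T))

print(integral)  # ≈ 1/lam = 0.5
```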
\[
f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
\]
Figure 1.2: Density functions of a Normal with µ = 0 and different values of σ.
The density of this v.a. has a particularly simple expression:
\[
f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}.
\]
Next we will see that given any normal distribution we can turn it into a standard normal
distribution by means of a linear transformation and solve the calculations of probabilities
with this one.
• Standard Normal Distribution Tables. The tables of the standard normal distri-
bution give us the function
\[
\Phi(z) = P(Z \le z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\,dx,
\]
for all z ≥ 0. That is, the area below the Gaussian bell, between −∞ and z, as shown in
Figure ??.
Using the symmetry of the normal law and that the area below the function from −∞ to
∞ is equal to 1, we have that if z ≤ 0, then
Φ(z) = P (Z ≤ z) = 1 − Φ(−z).
Note that if z is negative, −z will be positive and Φ(−z) is tabulated. Using these
two properties (symmetry and that the total area is 1) we can calculate any probability
involving the standard normal distribution.
• Calculations with any normal distribution. Calculations involving non-standard
normal distributions can be performed using the following theorem:
Theorem 1 If X ∼ N(µ, σ²), then the v.a. Z built as
\[
Z = \frac{X - \mu}{\sigma}
\]
has standard normal distribution.
Note that since σ² is the variance of X, the quantity that appears in the denominator of the above expression is the standard deviation σ. So, to transform a normal v.a. with a given mean and variance into a standard normal distribution, we subtract the mean and divide by the standard deviation. This process is called standardization.
Example. Suppose that X ∼ N(3, 4), that is, µ = 3 and σ² = 4. We want to calculate P(2 < X < 6).
First, we standardize by subtracting the mean and dividing by the standard deviation:
\[
P(2 < X < 6) = P\!\left( \frac{2-3}{2} < \frac{X-3}{2} < \frac{6-3}{2} \right).
\]
We have that
\[
Z = \frac{X-3}{2} \sim N(0, 1).
\]
Therefore, the above expression is equal to
\[
P(2 < X < 6) = P(-0.5 < Z < 1.5) = \Phi(1.5) - \Phi(-0.5) = \Phi(1.5) - \bigl(1 - \Phi(0.5)\bigr) = \Phi(1.5) + \Phi(0.5) - 1 = 0.93319 + 0.69146 - 1 = 0.62465.
\]
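The table look-ups above can be reproduced with Python's standard library via the identity Φ(z) = (1 + erf(z/√2))/2; the function name `Phi` is chosen here for illustration.

```python
import math

def Phi(z):
    # Standard normal CDF via the error function:
    # Phi(z) = (1 + erf(z / sqrt(2))) / 2
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma = 3.0, 2.0  # X ~ N(3, 4), so sigma = sqrt(4) = 2
p = Phi((6 - mu) / sigma) - Phi((2 - mu) / sigma)
print(p)  # ≈ 0.6247; the table value 0.62465 differs only by rounding
```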
• Reverse calculations with normal distribution. For inferential statistics the in-
verse problems of what we did in the previous example are of great interest. For example,
it is interesting to know what the number k is such that
P (X ≤ k) = 0.95,
or
P (X ≥ k) = 0.01
or also
P (|X − µ| ≤ k) = 0.90
and other similar problems. Let’s take an example of how to solve such problems.
Example. Let X ∼ N (3, 4). We want to find a number k such that
P (|X − 3| ≤ k) = 0.95.
We take out the absolute value and standardize:
\[
P(|X - 3| \le k) = P(-k \le X - 3 \le k) = P\!\left( \frac{-k}{2} \le \frac{X-3}{2} \le \frac{k}{2} \right) = P\!\left( \frac{-k}{2} \le Z \le \frac{k}{2} \right).
\]
To find k, we observe that, by symmetry and because the total area under the Gaussian bell is 1, the area between −∞ and k/2 must be 0.975 (make a drawing!). Thus
\[
\Phi\!\left( \frac{k}{2} \right) = 0.975.
\]
In the tables we find the value of z corresponding to the probability 0.975, namely k/2 = 1.96. This implies that k = 2 · 1.96 = 3.92.
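The inverse look-up can also be done numerically (a sketch, not the text's method): a bisection search for the z with Φ(z) = 0.975, where `Phi` is again the erf-based standard normal CDF.

```python
import math

def Phi(z):
    # Standard normal CDF: Phi(z) = (1 + erf(z / sqrt(2))) / 2
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Bisection on [0, 5]: Phi is increasing, so we shrink the bracket
# around the solution of Phi(z) = 0.975.
lo, hi = 0.0, 5.0
for _ in range(60):
    mid = (lo + hi) / 2
    if Phi(mid) < 0.975:
        lo = mid
    else:
        hi = mid
z = (lo + hi) / 2
k = 2 * z          # sigma = 2 in this example
print(round(z, 2))  # 1.96
print(round(k, 2))  # 3.92
```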
Note. Many calculators and software packages designed for statistical calculations have the function Φ(z) built in.
where the symbol ≈ indicates that the v.a. has this approximate distribution.
The approximation given in the theorem works very well if the condition is satisfied
np(1 − p) ≥ 18.
Some authors use the approximation if np(1 − p) ≥ 5, or even under a still weaker condition: np ≥ 5 and n(1 − p) ≥ 5 (both inequalities must hold at the same time). It should be noted that under these weaker conditions the approximation is not very accurate.
Example. The probability of a male child being born is 0.515 in a certain geographic
area. If a hospital in this area has 1000 births in a certain year, what is the probability
that the number of boys will not exceed the number of girls during this year?
The random variable
\[
X = \text{Number of male births per 1000}
\]
has distribution B(1000, 0.515). In addition, the condition np(1 − p) = 249.775 ≥ 18 is met.
Then, by Theorem ?? we know that X ≈ N(np, np(1 − p)) = N(515, 249.775). Therefore, we can proceed as follows:
\[
P(X \le 500) = P\!\left( \frac{X - 515}{\sqrt{249.775}} \le \frac{500 - 515}{\sqrt{249.775}} \right) \approx P(Z \le -0.95),
\]