
Lecture - 01

The document provides an introduction to probability theory and random variables in the context of econometrics. It defines key concepts such as: 1) Random variables take values that depend on random outcomes and are used in econometrics to model empirical relationships that cannot be fully explained by observable variables. 2) Probability density functions (PDFs) describe the probability of continuous random variables falling within a given interval based on the area under the curve of the PDF. 3) Cumulative distribution functions (CDFs) give the probability that a random variable will be less than or equal to a specified value by integrating the PDF from negative infinity to that value.

Uploaded by

Abdulla Shaheed
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
57 views31 pages

Lecture - 01

The document provides an introduction to probability theory and random variables in the context of econometrics. It defines key concepts such as: 1) Random variables take values that depend on random outcomes and are used in econometrics to model empirical relationships that cannot be fully explained by observable variables. 2) Probability density functions (PDFs) describe the probability of continuous random variables falling within a given interval based on the area under the curve of the PDF. 3) Cumulative distribution functions (CDFs) give the probability that a random variable will be less than or equal to a specified value by integrating the PDF from negative infinity to that value.

Uploaded by

Abdulla Shaheed
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 31

Lecture 1: Introduction to probability theory

ECON4038: Pre-Sessional Econometrics

Marit Hinnosaar

University of Nottingham
School of Economics

Fall 2020

1 / 31
Random variables

2 / 31
Random variables

- Econometrics uses statistical methods to study empirical questions in economics
- In empirical analysis in economics, both the explanatory variables and the outcome variables are typically treated as random variables
- Informally: a random variable is a variable whose values depend on random outcomes
- Classical examples of such random outcomes: weather, gambling (which, by definition, is a game of chance)
- In empirical analysis in economics, there is randomness because empirical models cannot capture all relationships
- You could say that a firm's profit isn't really random. But in the analysis, we treat it as random because we cannot observe all the things that affect profit
- Similarly for wage, price, GDP...

3 / 31
Random variables in empirical analysis in economics

- Consider a simple linear regression model yi = α + βxi + εi, where i = 1, . . . , N, y is the outcome variable, x is the explanatory variable, and ε is the error term
- In introductory econometrics classes, you may have seen the explanatory variable x treated as non-random in this model
- This has important implications for the relationship between the explanatory variable x and the error term ε: if xi is non-random, then εi and xi are independent of one another
- This is unrealistic in most situations (in all situations that do not involve a randomized experiment)
- Therefore, we treat the explanatory variables as random
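The point can be illustrated with a short simulation (a Python sketch, not part of the lecture; all parameter values and variable names are illustrative choices of mine). Both x and ε are drawn at random, so y is random as well, and drawing them independently mirrors the independence assumption:

```python
import numpy as np

# Simulate the simple linear model y_i = alpha + beta*x_i + eps_i with a
# random explanatory variable x (illustrative values throughout).
rng = np.random.default_rng(0)

alpha, beta = 1.0, 2.0
N = 1_000
x = rng.normal(size=N)        # explanatory variable treated as random
eps = rng.normal(size=N)      # unobserved error term
y = alpha + beta * x + eps    # the outcome variable is then random too

# Because x and eps were drawn independently, their sample correlation
# should be small in magnitude by construction.
print(float(np.corrcoef(x, eps)[0, 1]))
```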

4 / 31
Random variables: discrete and continuous

- A random (or stochastic) variable is any variable whose value is a real number that cannot be predicted exactly and can be viewed as the outcome of chance
- A discrete random variable takes a limited number of distinct values
  - Product rating (number of stars: 0, 1, 2, 3, 4, 5), number of children, years of education
- A continuous random variable takes any value over a continuum (e.g. temperature)
- Most economic variables (e.g. GDP, wages, profits, prices, exchange rates) are considered to be continuous random variables
- Hence, our main focus in this module is on continuous variables
- We now consider some basic probability theory associated with continuous random variables

5 / 31
Probability Density Function

6 / 31
Probability Density Function: Definition
- The probability density function (PDF) of a continuous random variable is a function that governs how probabilities are assigned to intervals of values of the random variable
- Let X be a continuous random variable defined on the interval −∞ ≤ x ≤ ∞. Then if f(x) is the PDF of X, we have:

  P(a ≤ X ≤ b) = ∫_{a}^{b} f(x) dx

- The integral of the PDF over a given range gives the probability that the random variable falls in that interval
- To be a valid PDF, f(x) must satisfy the following conditions:
  (i) f(x) ≥ 0 for −∞ ≤ x ≤ ∞
  (ii) ∫_{−∞}^{∞} f(x) dx = 1
  so that all probabilities are non-negative, and the continuous sum of probabilities over all possible outcomes is one
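The two validity conditions can be checked numerically (a sketch, with helper names of my own choosing; the trapezoid rule approximates the integral on a finite support):

```python
import numpy as np

# Check the two PDF conditions on a finite support [lo, hi]:
# (i) f(x) >= 0 on a fine grid, (ii) the area under f is (approximately) 1.
def is_valid_pdf(f, lo, hi, n=100_001):
    x = np.linspace(lo, hi, n)
    fx = f(x)
    nonneg = bool(np.all(fx >= 0))
    # Trapezoid rule: sum of average heights times grid spacings
    area = float(((fx[:-1] + fx[1:]) / 2 * np.diff(x)).sum())
    return nonneg and abs(area - 1.0) < 1e-6

# Example: the U(0, 100) density f(x) = 1/100 on [0, 100]
uniform_pdf = lambda x: np.full_like(x, 1 / 100)
print(is_valid_pdf(uniform_pdf, 0, 100))                         # True
print(is_valid_pdf(lambda x: np.full_like(x, 1 / 200), 0, 100))  # False: area is 0.5
```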
7 / 31
Probability Density Function: Definition continued

- Note that the probability that a continuous random variable takes any particular value equals 0:

  P(X = a) = P(a ≤ X ≤ a) = ∫_{a}^{a} f(x) dx = 0

- Probabilities for a continuous random variable are only non-zero when measured over an interval

8 / 31
Probability Density Function: Example of Sales Data
- This example is the probability density function of alcohol sales (in US dollars) by restaurant/bar. Data source: Texas Comptroller of Public Accounts.

Example of data: first ten rows

      City         State  Zipcode  Sales
   1. CYPRESS      TX     77429    31138
   2. DENTON       TX     76205    38664
   3. WILMER       TX     75172    26414
   4. AMARILLO     TX     79106    21509
   5. SAN ANTONIO  TX     78217    49968
   6. NEVADA       TX     75173    19014
   7. SAN ANTONIO  TX     78209     8935
   8. ROUND ROCK   TX     78681    23130
   9. DALLAS       TX     75219    57482
  10. GRANBURY     TX     76048    30372

Figure: PDF of alcohol sales in restaurants/bars in Texas, US (density on the y-axis; total alcohol sales in USD, 0 to 100,000, on the x-axis).

- The probability that a restaurant has alcohol sales of about 1 million USD is much smaller than the probability of having about 10,000 USD

9 / 31
Probability Density Function: Example of Spinner
- Let's look at a spinner; the behavior of a spinner is described by a simple distribution

https://www.google.com/search?q=spinner

10 / 31
Probability Density Function: Example of Spinner continued

- Consider a spinner on a 0–100 dial
- Let X denote the value the spinner lands on
- Suppose the spinner has four equal sections (i.e. dividing lines at 0/100, 25, 50 and 75)
- The spinner can land on an infinite number of positions
- Therefore, the probability of it landing on any particular position is zero, e.g. P(X = 32.93794873974) = 0
- However, the probability that the spinner lands in a specified interval is easily established, e.g. P(0 ≤ X ≤ 50) = 0.5, P(75 ≤ X ≤ 100) = 0.25

11 / 31
Probability Density Function: Example of Spinner continued
- The PDF of X here is given by:

  f(x) = 1/100 for 0 ≤ x ≤ 100
  f(x) = 0 otherwise

- Probabilities can be calculated using this formula, e.g.:

  P(75 ≤ X ≤ 100) = ∫_{75}^{100} (1/100) dx = [x/100]_{75}^{100} = 1 − 0.75 = 0.25

- This example is a case of a uniform distribution

Figure: Probability density function of uniform distribution U(0, 100), values on x-axes, density
on y-axes
12 / 31
Probability Density Function: Uniform Distribution

- More generally, the uniform distribution is defined by X ∼ U(a, b) with PDF:

  f(x) = 1/(b − a) for a ≤ x ≤ b
  f(x) = 0 otherwise

- Here we can calculate the probability of any interval from v to w such that a ≤ v ≤ w ≤ b:

  P(v ≤ X ≤ w) = ∫_{v}^{w} 1/(b − a) dx = [x/(b − a)]_{v}^{w} = (w − v)/(b − a)

- Notice that P(v ≤ X ≤ w) depends only on the width of the interval w − v relative to the total range b − a, not on its position relative to a and b; hence the terminology uniform distribution
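The formula translates directly into a one-line helper (a sketch; the function name is my own, not from the slides):

```python
# P(v <= X <= w) for X ~ U(a, b), assuming a <= v <= w <= b:
# only the width w - v relative to b - a matters, not the position.
def uniform_interval_prob(a: float, b: float, v: float, w: float) -> float:
    assert a <= v <= w <= b, "interval must lie inside the support"
    return (w - v) / (b - a)

print(uniform_interval_prob(0, 100, 75, 100))  # 0.25, as in the spinner example
print(uniform_interval_prob(0, 100, 25, 75))   # 0.5
print(uniform_interval_prob(0, 100, 40, 90))   # 0.5: same width, same probability
```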

13 / 31
Cumulative Distribution Function

14 / 31
Cumulative Distribution Function: Definition

- The cumulative distribution function (CDF) of a random variable X gives the probability that X takes a value less than or equal to a specified number x
- The CDF is a monotonically non-decreasing function obtained from the probability density function as:

  F(x) = P(−∞ ≤ X ≤ x) = ∫_{−∞}^{x} f(t) dt

  where f(t) is the PDF

- Note that F(−∞) = 0 and F(∞) = 1
- Also note that the PDF is the derivative of the CDF

15 / 31
Cumulative Distribution Function: Example of Sales Data
- This example is the cumulative distribution function of alcohol sales (in US dollars) by restaurant/bar

Figure: PDF of alcohol sales data in restaurants/bars in Texas, US. Source: Texas Comptroller of Public Accounts
Figure: CDF of alcohol sales data in restaurants/bars in Texas, US (probability from 0 to 1 on the y-axis; total alcohol sales in USD, 0 to 100,000, on the x-axis). Source: Texas Comptroller of Public Accounts

16 / 31
Cumulative Distribution Function: Example
- Consider again the example of the spinner on a 0–100 dial
- The probability density function was given by:

  f(x) = 1/100 for 0 ≤ x ≤ 100
  f(x) = 0 otherwise

- From this we can obtain the CDF by integrating the PDF:

  F(x) = ∫_{−∞}^{x} f(t) dt = ∫_{0}^{x} (1/100) dt = [t/100]_{0}^{x} = x/100

- Then we can calculate the probability that X is less than, e.g., 70: P(X ≤ 70) = 0.7

Figure: Cumulative distribution function of uniform distribution U(0, 100), values on x-axes, probability on y-axes

17 / 31
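The spinner's CDF F(x) = x/100 can be transcribed directly (a sketch; I clip to [0, 1] outside the support so that F(−∞) = 0 and F(∞) = 1 hold):

```python
# CDF of the U(0, 100) spinner: F(x) = x/100 on [0, 100], 0 below, 1 above.
def spinner_cdf(x: float) -> float:
    return min(max(x / 100, 0.0), 1.0)

print(spinner_cdf(70))                      # 0.7 = P(X <= 70)
print(spinner_cdf(100) - spinner_cdf(75))   # 0.25 = P(75 <= X <= 100), via the CDF
```

Note that interval probabilities come out of the CDF as a difference, F(b) − F(a), which is the same number the PDF integral gave earlier.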
Cumulative Distribution Function: Uniform Distribution Example

- More generally, for the uniform distribution X ∼ U(a, b), the CDF is:

  F(x) = ∫_{a}^{x} 1/(b − a) dt = [t/(b − a)]_{a}^{x} = (x − a)/(b − a)

18 / 31
Joint Probability Density Function

19 / 31
Joint Probability Density Function: Definition
- So far we have considered the distribution of a single continuous random variable
- In econometrics, we are often concerned with the joint distribution of more than one continuous random variable
- EXAMPLE: wage together with level of education
- Let's consider the joint distribution of two random variables
- Let X and Y be continuous random variables defined on the intervals −∞ ≤ X ≤ ∞ and −∞ ≤ Y ≤ ∞. Then if f(x, y) is the joint PDF of X and Y, we have:

  P(a ≤ X ≤ b, c ≤ Y ≤ d) = ∫_{c}^{d} ∫_{a}^{b} f(x, y) dx dy

- Thus the double integral of the PDF over given ranges gives the probability that both X and Y fall in the specified intervals
- To be a valid PDF, f(x, y) must satisfy the following conditions:
  (i) f(x, y) ≥ 0 for −∞ ≤ x ≤ ∞, −∞ ≤ y ≤ ∞
  (ii) ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1
20 / 31
Joint Probability Density Function: Example
- Consider random variables X and Y with the joint PDF:

  f(x, y) = 1/((b − a)(d − c)) for a ≤ x ≤ b and c ≤ y ≤ d
  f(x, y) = 0 otherwise

- In this case X and Y are said to have a bivariate uniform distribution on [a, b], [c, d]
- We can establish the probabilities of X and Y lying in given intervals using this PDF, e.g.:

  P(v ≤ X ≤ w, e ≤ Y ≤ f) = ∫_{e}^{f} ∫_{v}^{w} 1/((b − a)(d − c)) dx dy
                          = ∫_{e}^{f} (w − v)/((b − a)(d − c)) dy
                          = [(w − v)y/((b − a)(d − c))]_{e}^{f}
                          = (w − v)(f − e)/((b − a)(d − c))
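A Monte Carlo check of this bivariate-uniform formula (a Python sketch; the box bounds and interval endpoints are arbitrary choices of mine, and independent uniform draws on each axis give a bivariate uniform on the box):

```python
import numpy as np

rng = np.random.default_rng(1)
a, b, c, d = 0.0, 2.0, 0.0, 5.0     # support box [a, b] x [c, d]
v, w, e, f = 0.5, 1.5, 1.0, 3.0     # sub-rectangle [v, w] x [e, f]

# Independent uniform draws on each axis => bivariate uniform on the box
x = rng.uniform(a, b, size=1_000_000)
y = rng.uniform(c, d, size=1_000_000)

mc = float(np.mean((v <= x) & (x <= w) & (e <= y) & (y <= f)))
exact = (w - v) * (f - e) / ((b - a) * (d - c))
print(exact)                      # 0.2
print(abs(mc - exact) < 0.005)    # True, up to simulation noise
```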

21 / 31
Marginal Probability Density Function

22 / 31
Marginal Probability Density Function: Definition
- Given a joint distribution for a pair of random variables X and Y, it is possible to work out the univariate probability density function of each individual variable, regardless of the values that the other variable might take
- When the PDF of X or Y is obtained from the joint distribution, we refer to it as the marginal PDF
- The marginal PDF of X is obtained by marginalizing out the Y variable
- The marginal PDFs of X and Y are defined as:

  f(x) = ∫_{−∞}^{∞} f(x, y) dy,   f(y) = ∫_{−∞}^{∞} f(x, y) dx

- EXAMPLE: we might have a joint PDF of wage and education, from which we can get the wage distribution by aggregating over all education levels
- Note that f(x) (or f(y)) is a function of x (or y) alone, and both are legitimate univariate PDFs in their own right
23 / 31
Marginal Probability Density Function: Definition continued

- The marginal PDF of X is used to assign probabilities to a range of values of X irrespective of the range of values in which Y is located, i.e.

  P(a ≤ X ≤ b) = P(a ≤ X ≤ b, −∞ ≤ Y ≤ ∞)
               = ∫_{a}^{b} ( ∫_{−∞}^{∞} f(x, y) dy ) dx
               = ∫_{a}^{b} f(x) dx

24 / 31
Marginal Probability Density Function: Example
- Consider again the bivariate uniform random variables X and Y with probability density function:

  f(x, y) = 1/((b − a)(d − c)) for a ≤ x ≤ b and c ≤ y ≤ d
  f(x, y) = 0 otherwise

- The marginal PDF of X is obtained by integrating over y:

  f(x) = ∫_{−∞}^{∞} f(x, y) dy = ∫_{c}^{d} 1/((b − a)(d − c)) dy
       = [y/((b − a)(d − c))]_{c}^{d}
       = (d − c)/((b − a)(d − c))
       = 1/(b − a)

  so that the marginal distribution of X is X ∼ U(a, b)
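The marginalization step can be reproduced numerically (a sketch; the box bounds and grid size are illustrative choices of mine): integrate the constant joint density over y with the trapezoid rule and compare with 1/(b − a):

```python
import numpy as np

a, b, c, d = 0.0, 2.0, 0.0, 5.0
joint = 1.0 / ((b - a) * (d - c))   # f(x, y), constant on the box

# f(x) = integral of f(x, y) over y in [c, d], at any fixed x in [a, b]
y = np.linspace(c, d, 100_001)
fy = np.full_like(y, joint)
marginal = float(((fy[:-1] + fy[1:]) / 2 * np.diff(y)).sum())  # trapezoid rule

print(abs(marginal - 1.0 / (b - a)) < 1e-9)   # True: marginal is 1/(b - a) = 0.5
```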


25 / 31
Conditional Probability Density Function

26 / 31
Conditional Probability Density Function: Definition
- We are sometimes interested in the distribution of one random variable given that another takes a certain value
- EXAMPLE: the wage distribution given that a person has a Masters degree
- Given the joint distribution f(x, y), the conditional PDFs of X and Y are defined as:

  f(x|y) = f(x, y)/f(y),   f(y|x) = f(x, y)/f(x)

- This is analogous to the conditional probability of event A occurring given that event B occurs, specified as P(A|B) = P(A and B)/P(B)
- The conditional PDF of X is used to assign probabilities to a range of values of X given that Y takes the value Y = y, i.e.:

  P(a ≤ X ≤ b | Y = y) = ∫_{a}^{b} f(x|y) dx

27 / 31
Conditional Probability Density Function: Example
- Consider again the bivariate uniform random variables X and Y with joint PDF:

  f(x, y) = 1/((b − a)(d − c)) for a ≤ x ≤ b and c ≤ y ≤ d
  f(x, y) = 0 otherwise

- The conditional PDF of X is obtained as:

  f(x|y) = f(x, y)/f(y) = [1/((b − a)(d − c))] / [1/(d − c)] = 1/(b − a)

- Note that in this particular case, the conditional distribution of X given Y is the same as the marginal distribution of X, f(x)
- Although this result holds for our bivariate uniform example, it is not a general result; it is in fact quite unusual
- It arises only because, in the bivariate uniform example, the random variables X and Y are independent of each other
- If random variables are not independent, the conditional distributions will differ from the corresponding marginal distributions
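The division f(x|y) = f(x, y)/f(y) for this example takes a few lines (a sketch with an illustrative box of my own; variable names are not from the slides):

```python
a, b, c, d = 0.0, 2.0, 0.0, 5.0

joint = 1.0 / ((b - a) * (d - c))   # f(x, y) on the box
f_y = 1.0 / (d - c)                 # marginal PDF of Y
f_x = 1.0 / (b - a)                 # marginal PDF of X

conditional = joint / f_y           # f(x|y), constant in this example
print(abs(conditional - f_x) < 1e-12)   # True: conditional equals marginal here
```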
28 / 31
Statistical Independence

29 / 31
Statistical Independence: Definition
- The notion of the independence of two events is that knowledge of one event occurring has no effect on the probability of the other event occurring
- Given two random variables X and Y with joint PDF f(x, y), X and Y are said to be statistically independent if and only if the joint PDF can be expressed as the product of the marginal PDFs:

  f(x, y) = f(x) f(y)

- Under independence, the conditional PDF equals the marginal PDF:

  f(x|y) = f(x, y)/f(y) = f(x) f(y)/f(y) = f(x)

  and similarly, f(y|x) = f(y)

- EXAMPLE: the event of getting a 6 the first time you roll a die and the event of getting a 6 the second time you roll it are independent
- What you get the second time does not depend on what you got the first time
30 / 31
Statistical Independence: Example

- In the bivariate uniform example above, we found that the conditional distribution equals the marginal distribution: f(x|y) = f(x) and f(y|x) = f(y), showing that X and Y are independent
- This can also be confirmed by demonstrating that the joint PDF is the product of the marginal PDFs:

  f(x) f(y) = [1/(b − a)] · [1/(d − c)] = 1/((b − a)(d − c)) = f(x, y)
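Independence can also be checked empirically (a Monte Carlo sketch; bounds and events are illustrative choices of mine): for independent X and Y, the joint frequency of two events should match the product of their marginal frequencies:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 2, size=1_000_000)   # X ~ U(0, 2)
y = rng.uniform(0, 5, size=1_000_000)   # Y ~ U(0, 5), drawn independently

in_a = x <= 1.0                 # event about X alone, P = 0.5
in_b = y <= 1.0                 # event about Y alone, P = 0.2
p_joint = float(np.mean(in_a & in_b))
p_prod = float(np.mean(in_a)) * float(np.mean(in_b))

print(abs(p_joint - p_prod) < 0.005)   # True: the joint frequency factorizes, up to noise
```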

31 / 31
