Probabilities and Statistics


MSDA - Bootcamp 2024 Summer

KT Wong
Faculty of Social Sciences, HKU
2024-08-27

The materials in this topic are drawn from Stachurski (2016)

Motivation

Social processes are not deterministic


▶ the “effects” of social causes are difficult to isolate and estimate

we need a framework for communicating our uncertainty about the inferences we draw in our empirical work

probability theory
▶ helps us do inference in modeling
▶ is the root of social statistics

it may even be helpful for case selection and small-n inference in qualitative research

Basic Concepts
some set theory

set: a collection of elements

subset: a set composed entirely of elements of another set

▶ e.g. set A is a subset of set B if every element of A is also an element of B

▪ i.e. set B contains set A

union: the union of two sets contains all the elements that belong to either set

intersection: the intersection of two sets contains only those elements found in both sets

complement: the complement of a given set is the set that contains all elements not in the original set

disjoint: two sets are disjoint when their intersection is empty

[Venn diagram: intersection]

[Venn diagram: union]

[Venn diagram: complement]

Basic Concepts
probability space

Let Ω be a set of possible underlying outcomes - the sample space

Let ω ∈ Ω be a particular underlying outcome

Let A ⊂ Ω be a subset of Ω - an event

Let ℱ be a collection of such subsets A ⊂ Ω

the pair (Ω, ℱ) forms a probability space

Exercises
Consider the universal set Ω = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
Given the sets A = {2, 4, 6, 8, 10} and B = {3, 6, 9}

Find the complement of set A

Find the complement of set B

Calculate the intersection of sets A and B

Compute the union of the complements of sets A and B

Find the intersection of the complement of set A with set B
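
For those working in Python, a minimal sketch for checking your answers (the variable names are my own):

```python
# Python's built-in set type supports these operations directly.
omega = set(range(1, 11))   # universal set
A = {2, 4, 6, 8, 10}
B = {3, 6, 9}

print(omega - A)                  # complement of A: {1, 3, 5, 7, 9}
print(omega - B)                  # complement of B
print(A & B)                      # intersection of A and B: {6}
print((omega - A) | (omega - B))  # union of the complements
print((omega - A) & B)            # complement of A intersected with B: {3, 9}
```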

Basic Concepts
probability measure

A probability measure P maps an event A ∈ ℱ into a scalar number between 0 and 1

▶ this is the “probability” that event A happens, denoted by P(A)

The probability measure must satisfy the following properties:

1. P(∅) = 0
2. P(Ω) = 1
3. If A₁, A₂, … are disjoint, then P(⋃_{i=1}^{∞} Aᵢ) = ∑_{i=1}^{∞} P(Aᵢ)

Basic Concepts
Law of Total Probability

The probability of event A can be decomposed into n parts

▶ one part that intersects with B₁, another part that intersects with B₂, and so on, where B₁, …, Bₙ partition the sample space

we can state the probability formally

P(A) = P(A ∩ B₁) + P(A ∩ B₂) + … + P(A ∩ Bₙ)

Basic Concepts
Conditional Probability

Conditional probability statements recognize that some prior information bears on the determination of subsequent probabilities

The conditional probability of event A given event B is denoted by P(A|B) and is defined as

P(A|B) = P(A ∩ B) / P(B)

we can use conditional probability statements to derive the law of total probability

P(A) = P(A|B₁)P(B₁) + P(A|B₂)P(B₂) + … + P(A|Bₙ)P(Bₙ)

where B₁, B₂, …, Bₙ form a partition of the sample space

Basic Concepts
Bayes’ rule

Bayes’ rule can be regarded as a way to reverse conditional probabilities

let A and B be two events with P(B) > 0

▶ then Bayes’ rule states that

P(A|B) = P(B|A)P(A) / [P(B|A)P(A) + P(B|Aᶜ)P(Aᶜ)]

Exercises
Suppose you work in a building that has a fire alarm system. The fire alarm is designed
to go off when there is a fire, and it’s also known that sometimes the alarm can go off
due to smoke from a malfunctioning HVAC system.

there is a 1% chance that there is a fire: P(Fire) = 0.01

the alarm system works pretty well and there is a 95% chance it goes off when there
is an actual fire
▶ P(Alarm goes off | Fire) = 0.95

there is a 10% chance that the alarm goes off due to smoke without a fire
▶ P(Alarm goes off | No Fire) = 0.1

what’s the probability that there is actually a fire, given that the alarm goes off?
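
One way to check your answer, sketched in Python from the quantities above:

```python
# Bayes' rule with the numbers given in the exercise.
p_fire = 0.01            # P(Fire)
p_alarm_fire = 0.95      # P(Alarm | Fire)
p_alarm_no_fire = 0.10   # P(Alarm | No Fire)

# Denominator via the law of total probability: P(Alarm)
p_alarm = p_alarm_fire * p_fire + p_alarm_no_fire * (1 - p_fire)

# P(Fire | Alarm)
print(p_alarm_fire * p_fire / p_alarm)  # ~0.0876: under 9%, despite the alarm
```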

Basic Concepts
Independence
Intuition: Information about the outcome of event A doesn’t change the probability of event B happening

Two events A and B are independent if

P(A ∩ B) = P(A)P(B)

we can deduce that if A and B are independent, then (whenever the conditional probabilities are defined)

▶ P(A) = P(A|B)
▶ P(B) = P(B|A)

When there are more than two events, we say that they are mutually independent if, for every subcollection of the events, the probability of the intersection equals the product of the probabilities

Reminder:

▶ pairwise independence does not imply mutual independence
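
A classic counterexample, sketched here in Python (the construction is standard but not from the slides): flip two fair coins and let A = “first flip is heads”, B = “second flip is heads”, C = “the two flips differ”.

```python
from itertools import product

# Sample space of two fair coin flips: 4 equally likely outcomes.
omega = list(product("HT", repeat=2))

def prob(event):
    return sum(1 for w in omega if event(w)) / len(omega)

A = lambda w: w[0] == "H"    # first flip heads
B = lambda w: w[1] == "H"    # second flip heads
C = lambda w: w[0] != w[1]   # flips differ

# Pairwise independence holds: each pair has P = 1/4 = 1/2 * 1/2.
print(prob(lambda w: A(w) and B(w)), prob(A) * prob(B))  # 0.25 0.25
print(prob(lambda w: A(w) and C(w)), prob(A) * prob(C))  # 0.25 0.25
print(prob(lambda w: B(w) and C(w)), prob(B) * prob(C))  # 0.25 0.25

# Mutual independence fails: A, B, and C cannot all happen at once.
print(prob(lambda w: A(w) and B(w) and C(w)))            # 0.0, not 0.125
```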

Basic Concepts
random variable

A random variable X(ω) is a function of the underlying outcome ω ∈ Ω

▶ X(ω) has a probability distribution that is induced by the underlying probability measure P and the function X(ω):

Prob(X ∈ A) = ∫_B P(ω) dω

where B is the subset of Ω for which X(ω) ∈ A

Probability Distributions
A probability distribution Prob(X ∈ A) can be described by its cumulative distribution function (CDF)

F_X(x) = Prob{X ≤ x}

A continuous-valued random variable can be described by a density function f(x) that is related to its CDF by

Prob{X ∈ B} = ∫_{t∈B} f(t) dt

F(x) = ∫_{−∞}^{x} f(t) dt

Probability Distributions

For a discrete-valued random variable

the number of possible values of X is finite or countably infinite

we replace the density with a probability mass function (pmf), a non-negative sequence that sums to one

we replace integration with summation in the formula that relates a CDF to a pmf

let us discuss some common distributions for illustration

Common distributions
Discrete distributions
A discrete distribution is defined by a set of numbers S = {x₁, …, xₙ} and a probability mass function (pmf) on S, which is a function p from S to [0, 1] with the property

∑_{i=1}^{n} p(xᵢ) = 1

a random variable X has distribution p if X takes value xᵢ with probability p(xᵢ)

ℙ{X = xᵢ} = p(xᵢ) for i = 1, …, n

Common distributions
Discrete distributions

The mean or expected value of a random variable X with distribution p is

𝔼[X] = ∑_{i=1}^{n} xᵢ p(xᵢ)

Expectation is often called the first moment of the distribution in statistics

▶ we call it the mean of the distribution represented by p

The variance of X is defined as

𝕍[X] = ∑_{i=1}^{n} (xᵢ − 𝔼[X])² p(xᵢ)

Variance is often called the second central moment of the distribution in statistics

Common distributions
Discrete distributions

The cumulative distribution function (CDF) of X is defined by

F(x) = ℙ{X ≤ x} = ∑_{i=1}^{n} 𝟙{xᵢ ≤ x} p(xᵢ)

Here 𝟙{statement} = 1 if “statement” is true and zero otherwise

Hence the sum takes all xᵢ ≤ x and adds up their probabilities
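
A minimal sketch of this formula in Python (the support and pmf values are made up for illustration):

```python
import numpy as np

xs = np.array([1, 2, 3, 4])          # support S = {x_1, ..., x_n}
ps = np.array([0.1, 0.2, 0.3, 0.4])  # pmf on S; sums to one

def cdf(x):
    # the boolean mask plays the role of the indicator 1{x_i <= x}
    return ps[xs <= x].sum()

print(cdf(2.5))  # 0.1 + 0.2 = 0.3
```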

Common distributions
the uniform distribution

p(xᵢ) = 1/n for all i

▶ when S = {1, …, n}, the mean is (n + 1)/2
▶ and the variance is (n² − 1)/12

Bernoulli distribution

the Bernoulli distribution on S = {0, 1} has pmf:

p(i) = θ^i (1 − θ)^{1−i}  (i = 0, 1)

θ ∈ [0, 1] is a parameter
▶ p(1) = θ means that the trial succeeds (takes value 1) with probability θ
▶ the mean is θ; the variance is θ(1 − θ)

Common distributions
Binomial distribution
the binomial distribution on S = {0, …, n} has pmf

p(x) = (n choose x) θ^x (1 − θ)^{n−x}

θ ∈ [0, 1] is a parameter
▶ the mean is nθ
▶ the variance is nθ(1 − θ)
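
A quick check of these formulas, assuming SciPy is available (n and θ are illustrative):

```python
from scipy.stats import binom

n, theta = 10, 0.3
print(binom.pmf(3, n, theta))  # P(X = 3) ~ 0.2668
print(binom.mean(n, theta))    # n * theta = 3.0
print(binom.var(n, theta))     # n * theta * (1 - theta) = 2.1
```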

Common distributions
Poisson distribution
The Poisson distribution on S = {0, 1, …} with parameter λ > 0 has pmf

p(x) = (λ^x / x!) e^{−λ}

The interpretation of p(x): the probability of x events in a fixed time interval, where the events occur independently at a constant rate λ

▶ the mean is λ and the variance is λ
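
A simulation sketch with NumPy (the seed, rate, and sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 4.0
draws = rng.poisson(lam, size=100_000)
print(draws.mean(), draws.var())  # both close to lambda = 4
```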

Common distributions
Normal distribution
the most famous distribution is the normal distribution, which has density

p(x) = (1 / (√(2π) σ)) exp(−(x − μ)² / (2σ²))

it has two parameters, μ ∈ ℝ and σ ∈ (0, ∞)
▶ the mean is μ and the variance is σ²
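
The same kind of check for the normal distribution, sketched with NumPy (parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 2.0, 1.5
x = rng.normal(mu, sigma, size=100_000)
print(x.mean(), x.var())  # close to mu = 2 and sigma^2 = 2.25

# the density formula above, evaluated at its peak x = mu
peak = 1 / (np.sqrt(2 * np.pi) * sigma)
print(peak)  # ~0.266
```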

Common distributions
Continuous distributions
A continuous distribution is represented by a probability density function (pdf), which is a function p over ℝ such that p(x) ≥ 0 for all x and

∫_{−∞}^{∞} p(x) dx = 1

We say that random variable X has distribution p if

ℙ{a < X < b} = ∫_{a}^{b} p(x) dx for all a ≤ b

Common distributions
Lognormal distribution
The lognormal distribution is a distribution on (0, ∞) with density

p(x) = (1 / (σx√(2π))) exp(−(log x − μ)² / (2σ²))

It has two parameters, μ and σ
▶ the mean is exp(μ + σ²/2)
▶ the variance is [exp(σ²) − 1] exp(2μ + σ²)
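
A simulation check of the two moment formulas, sketched with NumPy (parameters and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma = 0.5, 0.75
x = rng.lognormal(mu, sigma, size=500_000)

print(x.mean(), np.exp(mu + sigma**2 / 2))                        # both ~2.18
print(x.var(), (np.exp(sigma**2) - 1) * np.exp(2*mu + sigma**2))  # both ~3.6
```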

Common distributions
Gamma distribution
The gamma distribution is a distribution on (0, ∞) with density

p(x) = (β^α / Γ(α)) x^{α−1} exp(−βx)

It has two parameters, α > 0 and β > 0
▶ the mean is α/β
▶ the variance is α/β²

Common distributions
Beta distribution
The beta distribution is a distribution on (0, 1) with density

p(x) = (Γ(α + β) / (Γ(α)Γ(β))) x^{α−1} (1 − x)^{β−1}

where Γ is the gamma function

it has two parameters, α > 0 and β > 0
▶ the mean is α/(α + β)
▶ the variance is αβ / [(α + β)² (α + β + 1)]
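
A check of the gamma and beta moments, assuming SciPy is available; note that SciPy parameterizes the gamma by shape α and scale 1/β:

```python
from scipy.stats import gamma, beta

a, b = 2.0, 3.0
print(gamma.mean(a, scale=1/b), a / b)             # ~0.667 both
print(gamma.var(a, scale=1/b), a / b**2)           # ~0.222 both
print(beta.mean(a, b), a / (a + b))                # 0.4 both
print(beta.var(a, b), a*b / ((a+b)**2 * (a+b+1)))  # 0.04 both
```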

Bivariate Probability Distribution


Two discrete random variables

Let X, Y be two discrete random variables that take values:

X ∈ {0, …, I − 1},  Y ∈ {0, …, J − 1}

their joint distribution is described by an I × J matrix

F_{I×J} = [fᵢⱼ], i ∈ {0, …, I − 1}, j ∈ {0, …, J − 1}

whose elements are

fᵢⱼ = P{X = i, Y = j} ≥ 0

where ∑ᵢ ∑ⱼ fᵢⱼ = 1

Bivariate Probability Distribution


Two discrete random variables
The joint distribution induces marginal distributions

P{X = i} = ∑_{j=0}^{J−1} fᵢⱼ = μᵢ,  i = 0, …, I − 1

P{Y = j} = ∑_{i=0}^{I−1} fᵢⱼ = νⱼ,  j = 0, …, J − 1

Bivariate Probability Distribution


Two discrete random variables

for example, let a joint distribution over (X, Y) be

F_{X,Y} = [ P(X = 0, Y = 0)  P(X = 0, Y = 1) ] = [ .25  .1 ]
          [ P(X = 1, Y = 0)  P(X = 1, Y = 1) ]   [ .15  .5 ]

the marginal distributions are:

P{X = 0} =
P{X = 1} =
P{Y = 0} =
P{Y = 1} =
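
A NumPy sketch for checking the blanks: row sums give the marginal of X, column sums the marginal of Y.

```python
import numpy as np

# joint matrix from the slide: rows index X, columns index Y
F = np.array([[0.25, 0.10],
              [0.15, 0.50]])

print(F.sum(axis=1))  # P{X = 0}, P{X = 1}
print(F.sum(axis=0))  # P{Y = 0}, P{Y = 1}
```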

Summary Statistics
Suppose we have an observed distribution with values {x₁, …, xₙ}

The sample mean of this distribution is defined as

x̄ = (1/n) ∑_{i=1}^{n} xᵢ

The sample variance is defined as

(1/n) ∑_{i=1}^{n} (xᵢ − x̄)²
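
In NumPy (the data are made up; note that `var` with `ddof=0` matches the 1/n definition above):

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(x.mean())       # sample mean: 5.0
print(x.var(ddof=0))  # sample variance with 1/n: 4.0
```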

LLN and CLT

two of the most important results in probability and statistics

1. the law of large numbers (LLN)


2. the central limit theorem (CLT)
Let X₁, …, Xₙ be independent and identically distributed scalar random variables, with common distribution F and common mean μ and variance σ²

Law of large numbers

P(|X̄ₙ − μ| ≥ ε) → 0 as n → ∞, ∀ε > 0

Central limit theorem

√n (X̄ₙ − μ) →ᵈ N(0, σ²) as n → ∞, where →ᵈ denotes convergence in distribution
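
A simulation sketch of both results with Bernoulli draws (seed and sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
theta = 0.3
mu, sigma2 = theta, theta * (1 - theta)   # mean and variance of Bernoulli

n = 10_000
# 2,000 replications of the sample mean of n draws
means = rng.binomial(1, theta, size=(2_000, n)).mean(axis=1)

print(means.mean())                       # ~0.3: means cluster at mu (LLN)
print((np.sqrt(n) * (means - mu)).var())  # ~0.21 = sigma2           (CLT)
```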

References
Stachurski, John. 2016. A Primer in Econometric Theory. Cambridge,
Massachusetts: The MIT Press.
