L06 Inference

This document introduces statistical inference and discusses using sample data to make inferences about a population. It provides examples of using sample proportions and means to estimate population proportions and means. The key points are: 1) Statistical inference uses sample data to learn about an unknown population in a rigorous way. 2) Sample proportions and means can be used as estimators of population proportions and means. 3) The sampling distribution considers all possible sample means that could arise from the population and allows evaluating how representative a sample is of the population.

Uploaded by

Tenebrae Lux

ECON10005

Quantitative Methods 1
Introduction to Statistical Inference

Faculty of Business and Economics


Department of Economics
Statistical Inference

Statistical inference is a collection of procedures designed so that data in a sample can be informative in a rigorous way about a specified population.

Examples:

• There is a survey about the federal election. What can we learn about the real election’s outcome?
• The scores from students in QM1 Tutorial 25 are known. Can we infer the average score of the whole QM1 class?
• A bucket of water from Lake Ontario is examined. What is the water quality of the lake?
• 100 volunteers’ results from a clinical trial for a new vaccine are observed. What is the vaccine’s effectiveness?

2
Simple Random Sample
Simple random sample

A sampling scheme in which


• every individual in the population has an equal probability of being in the sample
• every individual is chosen (or not chosen) independently of every other individual.

• A simple random sample provides a “representative sample” for “reliable inference” about the population.
• There are other sampling schemes (not covered in QM1).

Is this a simple random sample?

1. ABC Radio National / Sky News ask for people to dial in and say who they are going to vote for.
2. Telephone polling for the federal election.
3. Check the soldiers in the hospital to evaluate the effectiveness of the new helmet.
3
Statistical Inference and Random Variables
We use a random variable to represent a numerical measure of the population.

Recall a random variable is described by


• the possible outcomes
• the probabilities (or pdf) of those outcomes

Example: Who will win the election?

Consider the population of all U.S.A. voters. Consider a randomly selected voter and define the random variable $X$ as

$$X = \begin{cases} 1, & \text{voter prefers Democratic}, \quad P(X = 1) = p \\ 0, & \text{voter prefers Republican}, \quad P(X = 0) = 1 - p \end{cases}$$

If $p > 0.5$, Democratic wins; if $p < 0.5$, Republican wins. The value of $p$ is unobserved.

Q: We have a sample of 900 voters. Can we say something about the real value of $p$?
4
Estimation of a Proportion

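Suppose $X_1, X_2, \dots, X_n$ take only values 0 or 1.

This is a sample with $n$ observations.

The number of 1’s in the sample is $\sum_{i=1}^{n} X_i$,

e.g., $\sum_{i=1}^{n} X_i = 2$ when exactly two observations equal 1.

The sample proportion of 1’s is $\frac{1}{n}\sum_{i=1}^{n} X_i$.

• It measures the share of 1’s in the sample.
• It is an estimator of the population probability $p$.
• It is denoted by $\bar{X}$ (it is the sample mean of the 0/1 observations).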

5
Estimation of a Proportion

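The sample proportion of 1’s is $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$.

Idea: use the sample proportion $\bar{X}$ to estimate the population proportion $p$.

E.g., $n = 900$ with $X_1, \dots, X_{900}$.

Suppose 540 observations equal 1 and 360 equal 0. Then $\bar{X} = 540/900 = 0.6$.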

This is NOT the true p, but we hope that it is a good approximation.


Is it really so? In short YES, details are coming.
6
Estimation of the Mean

Example: Average Mark

Suppose QM1 marks are distributed between 0 and 100. Let $X$ denote the mark of a randomly selected student and denote the probability distribution of $X$ by $p(x)$. The mean QM1 mark would be $\mu = E(X) = \sum_{x} x\,p(x)$.

Suppose we have a randomly selected sample of marks of 20 students from QM1. Can we use the mean of these 20 marks to estimate $\mu$?

• $\mu$ is unknown to us.
• $p(x)$ does not matter as we only care about the mean in this example.
• The answer is yes.
• We use the sample mean to estimate the population mean.
7
Estimation of the Mean
General framework to estimate the population mean:

• Consider a population whose distribution has mean $\mu$.
• Take a random sample of $n$ observations from this population.

Let $X_1$ denote the random variable for the first observation. It is random because until the sample is taken its value is uncertain. Another sample would (generally) give a different value for $X_1$.

Similarly let $X_2, \dots, X_n$ correspond to the other observations.

The sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is an estimator for the population mean $\mu$.

• Estimator: the sample mean $\bar{X}$ is random before the data is sampled/observed. It is a methodology/rule to transform a sample ($X_1, \dots, X_n$) into a value $\bar{X}$. Therefore, it is a random variable.
• Estimate: the value of the sample mean (unfortunately, same name😭) that you calculated. It is a realization from the random variable.

8
Estimation of the Mean
Example. QM1 marks

The mean mark for semester 1 2021 is 69% (you have the data from Week 1), i.e., $\mu = 69$ (we are lucky this time; usually $\mu$ is an unknown number).
Only a simple random sample of marks may be available,
e.g., $X_1, \dots, X_{20}$ = 63, 68, 57, 51, 63, 73, 58, 57, 63, 71, 50, 76, 84, 71, 51, 74, 50, 86, 70, 66

The sample mean $\bar{X}$ is an estimator of the population mean $\mu$.
The number 65.1 is an estimate of the population mean $\mu$.

This is NOT the true $\mu$, but we hope that it is a good approximation. Is it really good? Yes, more details are coming.

• Consider $\bar{X}$ as a random variable; 65.1 is one random draw from it.
• This is the same idea if you see
• A different sample will give a different value of $\bar{X}$ in general.
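As a quick check of the arithmetic on this slide, here is a minimal Python sketch that computes the sample mean of the 20 listed marks. The variable names are illustrative and not part of the lecture.

```python
# The 20 sampled QM1 marks listed on the slide
marks = [63, 68, 57, 51, 63, 73, 58, 57, 63, 71,
         50, 76, 84, 71, 51, 74, 50, 86, 70, 66]

n = len(marks)
sample_mean = sum(marks) / n  # the estimate: one realization of the estimator

print(n, sample_mean)
```

The estimator is the rule "add up the marks and divide by n"; the printed number is the estimate produced by that rule for this particular sample.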
9
M&M Colors

According to figures on the site in 2008, the proportions changed to favor blue, orange, and green over yellow, red, and brown: 24% blue, 20% orange, 16% green, 14% yellow, 13% red, and 13% brown.

Then, sometime around 2008, the online color distribution was just…gone. Through a spokesman, Mars refused to say when or why the information was removed.

See “A statistician got curious about M&M colors and went on an endearingly geeky quest for answers”
10
Sampling Distribution and Unbiased Estimation

Example: Population
In a quiz in semester 1, we asked QM1 students:

"Were the rankings of the University of Melbourne an important factor in your choice to study
here?”

Out of 1021 responses, 884 said "yes", the rest will be treated as "no".
For now, think of these 1021 students as the population.

If we code "yes" = 1, "no" = 0, then the data in a spreadsheet look like

[Figure: spreadsheet of the coded responses]

• There are 884 1’s and 137 0’s.

The population proportion of 1's is $p = 884/1021 \approx 0.866$.

• This value is usually unknown in real-world applications.

12
Population Proportion and Population Mean
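Define a random variable $X$: $X = 1$ if a randomly selected student answered "yes", and $X = 0$ otherwise.

The probability $P(X = 1) = p = 884/1021$. This is the population proportion.

$E(X) = 1 \cdot p + 0 \cdot (1 - p) = p$. This is the population mean.

In this example, when the random variable only takes the value 1 or 0,

• the population proportion of 1’s is the same as the population mean;
• the population proportion of 0’s is NOT.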

13
Sample and Sample Mean
Define a random variable $X$ as before ($X = 1$ for "yes", $X = 0$ for "no").

A simple random sample draw of 20 students is denoted by $X_1, \dots, X_{20}$ (obviously, $n = 20$).

With 20 observations,

$\bar{X} = \frac{1}{20}(1 + 1 + 0 + 1 + \dots)$

• Each different sample drawn can give a different value for $\bar{X}$.
• We use $\bar{X}$ (the sample mean) to estimate the population mean $p$.
• Recall: $\bar{X}$ is also the sample proportion.

14
Estimation
Question: can we use the observed (data) to infer the unobserved (e.g., the population mean)?

• Our interest is the population mean $\mu$ (also the population proportion $p$ in the example).
• We can use the sample mean $\bar{X}$ to estimate the population mean $\mu$ (also $p$ in the example).

Can we verify it is a good idea?

Keep in mind that $\bar{X}$ is random. So, any value you see is obtained by chance.

• Is $\bar{X}$ always likely to be below or above $\mu$ (the population mean)?

• Could $\bar{X}$ be unreasonably far from $\mu$? If so, what is our chance of being so unlucky?

• Can we tell whether 20 observations is enough? How to express estimation uncertainty?

• What would happen if we increase the sample size $n$?


15
Sampling Distribution
• All previous questions can be answered by investigating the sampling distribution of $\bar{X}$.

The sampling distribution of $\bar{X}$ is the probability distribution of all of the possible values of $\bar{X}$ that could arise from the population.

• Note: $\bar{X}$ is random (before the random sample is drawn).

If it is random, then it could have a mean… this helps us to understand the central tendency of $\bar{X}$.

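$$
\begin{aligned}
E(\bar{X}) &= E\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = E\left(\frac{1}{n}X_1 + \frac{1}{n}X_2 + \dots + \frac{1}{n}X_n\right) \\
&= \frac{1}{n}E(X_1) + \frac{1}{n}E(X_2) + \dots + \frac{1}{n}E(X_n) \\
&= \frac{1}{n}\mu + \frac{1}{n}\mu + \dots + \frac{1}{n}\mu = \mu
\end{aligned}
$$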
16
Sampling Distribution

$E(\bar{X}) = \mu$

Read it as: the expected value of the sample mean is the population mean
Or
The mean of the sample mean is the population mean.

Interpretation: the sample mean (as a random variable) is centered around the true
value of the population mean.

Heuristic example:
1. You and MANY of your classmates each take a random sample of 20 observations.
2. Each of you calculates a sample mean from your own data.
3. In general, each of you has a different value of the sample mean.
4. The average of the sample means from you and your classmates is approximately the true mean, and it gets closer as more classmates take part.
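The heuristic example can be imitated on a computer. The sketch below is only an illustration under stated assumptions: it reuses the QM1 survey population from the earlier slides (884 ones and 137 zeros), draws each sample independently with replacement, and uses a made-up number of classmates (20,000) and a fixed random seed.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Population from the QM1 survey example: 884 "yes" (1) and 137 "no" (0)
population = [1] * 884 + [0] * 137
p = sum(population) / len(population)  # population mean = population proportion

n = 20              # sample size per classmate
classmates = 20000  # hypothetical number of classmates ("MANY")

# Each classmate draws n students independently (with replacement) and averages
sample_means = [
    sum(random.choices(population, k=n)) / n
    for _ in range(classmates)
]

average_of_means = sum(sample_means) / classmates
print(round(p, 4), round(average_of_means, 4))
```

Individual sample means differ from classmate to classmate, but their average should land very close to the population mean, which is what unbiasedness says.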

18
Unbiased Estimator
Definition

An estimator (e.g., $\bar{X}$) of a population parameter (e.g., $\mu$) is unbiased if the expected value of the estimator is equal to the population parameter (e.g., $E(\bar{X}) = \mu$).

An unbiased estimator is one that does not tend to be too large or too small. It is, on average, just right!

Important: unbiasedness does NOT mean $\bar{X}$ equals $\mu$. Why?

19
Consistency

Beyond Unbiasedness
Unbiasedness sounds good, is it enough?

Three statisticians go deer hunting with bows and arrows. They spot a big buck and take aim. One shoots and his arrow flies off three metres to the right. The second shoots and his arrow flies off three metres to the left. The third statistician jumps up and down yelling, “We got him! We got him!”

Recall: In our example, $\bar{X}$ is random, so we can find its mean to understand its central tendency.

• We can also check its variance to evaluate its dispersion from its mean.

• If the variance is large, a single $\bar{X}$ is more likely to be far away from the true mean $\mu$.

• If the variance is small, a single $\bar{X}$ is more likely to be close to the true mean $\mu$.
21
Variance of the Sampling Distribution

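Example: $X_1, \dots, X_n$ are independent with $\mathrm{var}(X_i) = \sigma^2$.

$$
\begin{aligned}
\mathrm{var}(\bar{X}) &= \mathrm{var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \mathrm{var}\left(\frac{1}{n}X_1 + \frac{1}{n}X_2 + \dots + \frac{1}{n}X_n\right) \\
&= \frac{1}{n^2}\mathrm{var}(X_1) + \frac{1}{n^2}\mathrm{var}(X_2) + \dots + \frac{1}{n^2}\mathrm{var}(X_n) \\
&= \frac{1}{n^2}\sigma^2 + \frac{1}{n^2}\sigma^2 + \dots + \frac{1}{n^2}\sigma^2 \\
&= \frac{\sigma^2}{n}
\end{aligned}
$$

• The second line holds because each pair has $\mathrm{cov}(X_i, X_j) = 0$ for $i \neq j$, by independence.
22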
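The result $\mathrm{var}(\bar{X}) = \sigma^2/n$ can be checked numerically. The following is a rough sketch, not part of the lecture: it assumes the QM1 survey population (884 ones, 137 zeros), independent draws with replacement, and an arbitrary number of repeated samples.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

population = [1] * 884 + [0] * 137          # QM1 survey population
sigma2 = statistics.pvariance(population)   # population variance sigma^2

n = 20        # sample size
reps = 20000  # hypothetical number of repeated samples

# One sample mean per repetition, drawing independently with replacement
means = [sum(random.choices(population, k=n)) / n for _ in range(reps)]

simulated_var = statistics.pvariance(means)  # spread of the sample means
theoretical_var = sigma2 / n                 # sigma^2 / n from the derivation
print(round(simulated_var, 5), round(theoretical_var, 5))
```

The two printed numbers should agree closely; changing `n` shows the variance of the sample mean shrinking in proportion to `1/n`.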
Summary
For a simple random sample $X_1, \dots, X_n$, where $E(X_i) = \mu$ and $\mathrm{var}(X_i) = \sigma^2$, the sample mean $\bar{X}$

• is a random variable with a sampling distribution
• $E(\bar{X}) = \mu$ (unbiasedness)
• $\mathrm{var}(\bar{X}) = \sigma^2 / n$
  • Larger $n$ is more precise
  • Larger $\sigma^2$ is less precise

The sample size n does not affect unbiasedness.

E.g., even if you have only 1 observation $X_1$, it is still an unbiased estimator for the population mean $\mu$.
But using only one observation does NOT seem a good idea.

More observations will reduce the variance, so you will have a better chance to be close to the true mean.

24
Consistency
$\bar{X}$ is said to be consistent because

• It is unbiased (for QM1 now and can be relaxed in more advanced subjects)
• Its variance converges towards zero as n grows

Recall the QM1 survey example: a random variable $X$ with $X = 1$ for "yes" and $X = 0$ for "no".

The probability $P(X = 1) = p = 884/1021$. This is the population mean (proportion).

• A random sample with 20 observations may give $\bar{X} = 17/20 = 0.85$ (17 answer “yes”)

• Another random sample with 20 (different) observations may give $\bar{X} = 18/20 = 0.9$ (18 answer “yes”)

• Do this 10,000 times to have a view of the sampling distribution of $\bar{X}$


25
Consistency
$\bar{X}$ is said to be consistent because

• It is unbiased
• Its variance converges towards zero as n grows

[Figure: sampling distribution of $\bar{X}$ for $n = 20$, annotated “Maybe your value is here”, “Maybe your friend's value is here”, and “Maybe the value from another trial of yours is here”.]

26
Consistency
$\bar{X}$ is said to be consistent because

• It is unbiased
• Its variance converges towards zero as n grows

[Figure: sampling distributions of $\bar{X}$ for $n = 20$, $n = 50$, and $n = 100$.]

As the sample size increases, the sampling distribution of $\bar{X}$ is more concentrated around the true mean $\mu$.

• A small sample may end up with a value closer to the truth.
• A large sample may end up with a value far from the truth.

Still, a large sample is preferred because we have a better chance to be closer to the truth!
27
Consistency
$\bar{X}$ is said to be consistent because

• It is unbiased
• Its variance converges towards zero as n grows

[Figure: sampling distributions of $\bar{X}$ for $n = 20$, $n = 50$, and $n = 100$.]

As the sample size goes to infinity, the sampling distribution of $\bar{X}$ becomes a spike, and ALL possible outcomes will be arbitrarily close to the truth!

This is consistency.
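To put numbers on "more concentrated", the sketch below computes, for several sample sizes, the variance of the sample mean and the exact probability that it lands within a tolerance of the truth. It is an illustration under assumptions: the population proportion is taken from the QM1 survey example, draws are independent so the count of 1's is Binomial(n, p), and the tolerance 0.05 and the helper name `prob_within` are made up for this example.

```python
from math import comb

p = 884 / 1021   # population proportion from the QM1 survey example
tol = 0.05       # hypothetical tolerance for "close to the truth"

def prob_within(n, p, tol):
    """P(|X_bar - p| <= tol) when the number of 1's is Binomial(n, p)."""
    return sum(
        comb(n, k) * p**k * (1 - p)**(n - k)
        for k in range(n + 1)
        if abs(k / n - p) <= tol
    )

for n in (20, 50, 100, 1000):
    variance = p * (1 - p) / n  # var(X_bar) = sigma^2 / n
    print(n, round(variance, 5), round(prob_within(n, p, tol), 3))
```

As `n` grows the variance column falls toward zero and the probability column rises toward one, which is the numerical face of consistency.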
28
Central Limit Theorem

Dice: 1

30
Dice: sum of 2

31
Dice: sum of 3

32
Dice: sum of 4

33
Dice: sum of 5

34
Dice: sum of 6

35
Dice: summary

• The distribution of the sum becomes a bell shape that looks like a normal distribution

• The mean increases

• The variance increases

• Can we shift and squeeze (or stretch) it into a standard normal?
• If so, we can use the probability from a normal distribution to approximate the distribution that is difficult to derive analytically.
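The dice pictures can be reproduced exactly. The sketch below builds the distribution of the sum of n fair six-sided dice by adding one die at a time, then reports its mean and variance. The function name `dice_sum_pmf` is illustrative, and exact fractions are used so no rounding is involved.

```python
from fractions import Fraction

def dice_sum_pmf(n):
    """Exact pmf of the sum of n fair six-sided dice, as {total: probability}."""
    pmf = {0: Fraction(1)}
    for _ in range(n):
        new_pmf = {}
        for total, prob in pmf.items():
            for face in range(1, 7):
                # Add one more die: each face has probability 1/6
                new_pmf[total + face] = new_pmf.get(total + face, 0) + prob / 6
        pmf = new_pmf
    return pmf

for n in range(1, 7):
    pmf = dice_sum_pmf(n)
    mean = sum(t * q for t, q in pmf.items())
    var = sum((t - mean) ** 2 * q for t, q in pmf.items())
    print(n, float(mean), float(var))
```

Both the mean and the variance of the sum grow in proportion to n, which is why a shift and a rescaling are needed before comparing with a standard normal.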

37
Central Limit Theorem

Some preparation first:

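If $X_1, \dots, X_n$ are independent with $E(X_i) = \mu$ and $\mathrm{var}(X_i) = \sigma^2$,

we have derived that $E(\bar{X}) = \mu$ and $\mathrm{var}(\bar{X}) = \sigma^2 / n$.

Then, the standard deviation of $\bar{X}$ is $\sigma / \sqrt{n}$.

Shift $\bar{X}$ toward 0 and squeeze (or stretch) it to have unit variance by constructing the “Z-score”

$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

• You can verify that Z has zero mean and its variance is 1.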
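One way to carry out this verification, with $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$, uses only $E(\bar{X}) = \mu$ and $\mathrm{var}(\bar{X}) = \sigma^2/n$ together with the standard rules $E(aY + b) = aE(Y) + b$ and $\mathrm{var}(aY + b) = a^2\,\mathrm{var}(Y)$ for constants $a$ and $b$:

```latex
\begin{aligned}
E(Z) &= E\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\right)
      = \frac{E(\bar{X}) - \mu}{\sigma/\sqrt{n}}
      = \frac{\mu - \mu}{\sigma/\sqrt{n}} = 0, \\
\mathrm{var}(Z) &= \mathrm{var}\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\right)
      = \frac{\mathrm{var}(\bar{X})}{\sigma^2/n}
      = \frac{\sigma^2/n}{\sigma^2/n} = 1.
\end{aligned}
```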
38
Central Limit Theorem
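If $X_1, \dots, X_n$ are independent with $E(X_i) = \mu$ and $\mathrm{var}(X_i) = \sigma^2$,

define $Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$.

Central Limit Theorem

As n increases, the distribution of Z converges towards N(0, 1).

• The distribution of $Z$ becomes approximately standard normal, regardless of the distribution of $X_i$.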

39
Central Limit Theorem
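If $X_1, \dots, X_n$ are independent with $E(X_i) = \mu$ and $\mathrm{var}(X_i) = \sigma^2$, define $Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$.

• $\bar{X}$ is random, so $Z$, as a function of $\bar{X}$, is also random.

Example:
1. You roll $n$ dice and record their average $\bar{X}$. We can derive that one die has $\mu = 3.5$ and $\sigma^2 = 35/12$.

2. Your friend will do the same as you did. She could have a different value of $\bar{X}$, hence a different $Z$.

3. You have MANY friends, and each does the same experiment.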
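The three steps of this example translate directly into a short simulation. The sketch below is an illustration only: the number of dice per friend (30), the number of friends (20,000), the random seed, and the function name `z_score` are all assumptions made for the example.

```python
import random
from math import sqrt

random.seed(2)  # fixed seed so the sketch is reproducible

mu = 3.5               # mean of one fair die
sigma = sqrt(35 / 12)  # standard deviation of one fair die
faces = range(1, 7)

def z_score(n):
    """Roll n dice, average them, and standardize the average."""
    rolls = random.choices(faces, k=n)
    x_bar = sum(rolls) / n
    return (x_bar - mu) / (sigma / sqrt(n))

n = 30           # dice per friend (hypothetical)
friends = 20000  # hypothetical number of friends ("MANY")
z_values = [z_score(n) for _ in range(friends)]

mean_z = sum(z_values) / friends
var_z = sum((z - mean_z) ** 2 for z in z_values) / friends
share_within = sum(abs(z) <= 1.96 for z in z_values) / friends
print(round(mean_z, 3), round(var_z, 3), round(share_within, 3))
```

If the normal approximation is working, the mean of the Z values should be near 0, their variance near 1, and roughly 95% of them should fall between -1.96 and 1.96.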

40
n=2
• The distribution of a die’s outcome is discrete, hence NOT continuous
• n=2 is a very small sample size

• We do not expect the CLT to work well, as shown below

[Figure: distribution of Z for n = 2 compared with the standard normal.]

This is a “claw” shape from the Z value of tossing 2 dice.

The discrete probabilities are connected in order to compare with the standard normal.

41
n=3

• You and MANY of your friends toss 3 dice each and compute the Z score.

Feel smoother? Because more outcome values of the discrete variable are available.

The discrete probabilities are connected in order to compare with the standard normal.

42
n=10

• You and MANY of your friends toss 10 dice each and compute the Z score.

CLT seems to work.

$n = 10$ is not a big number from daily practice and common sense.

The discrete probabilities are connected in order to compare with the standard normal.

43
From 2 to 30

Note:

• This is the distribution of the Z score, NOT of $\bar{X}$, because $\bar{X}$ has a non-zero mean and a shrinking variance.

• CLT is very general. It applies even when the underlying distribution of each random variable is far from a normal distribution.

• Note: For each data set (a sample) you have ONLY one value of Z, which is a random draw from its sampling distribution. (A random draw Z=z from the sampling distribution of Z)
44
CLT: Application

Harris Poll: 42% of American adults believe in ghosts!

Structure of poll: survey $n$ randomly chosen American adults, ask if they believe in ghosts.

Let $p$ denote the population proportion of belief in ghosts (we have shown that it is also the population mean).

If person $i$ believes in ghosts, $X_i = 1$.
If person $i$ does not, $X_i = 0$.
In general, $\hat{p} = \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$.

We (econometricians and statisticians) often give a hat to an estimator. E.g., $\hat{p}$ is an estimator for $p$ in this application. And $\hat{\beta}$ is an estimator for $\beta$ in regression (later this semester).

45
CLT: Application

• Each $X_i$ is a Bernoulli random variable.

• The sum of the $X_i$’s, $\sum_{i=1}^{n} X_i$, is a Binomial random variable.

• The sample mean $\bar{X}$ is not a Binomial random variable. We can approximate it by a normal distribution through CLT.
From the dice example, it won’t be surprising that the Z score associated with $\bar{X}$ converges to the standard normal distribution.

46
CLT: Application
Z-score’s distribution for different sample sizes

Underlying $X_i$ is a Bernoulli random variable.


in this example.

1. Randomly draw $n$ observations, calculate $\bar{X}$.

2. Calculate the Z-score.

3. Repeat 1-2 MANY times to reveal the distribution of Z.
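For Bernoulli data the repetition can be replaced by an exact calculation, because the number of 1's in the sample is Binomial(n, p) and $Z = (\bar{X} - p)/\sqrt{p(1-p)/n}$. The sketch below measures the largest gap between the exact distribution function of Z and the standard normal one. It is an illustration under assumptions: p = 0.42 simply borrows the poll figure as if it were the true proportion, and the function names are made up.

```python
from math import comb, erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def max_cdf_gap(n, p):
    """Largest gap between the exact cdf of Z and the N(0, 1) cdf."""
    sd = sqrt(p * (1 - p) / n)  # standard deviation of the sample mean
    cdf_before = 0.0            # P(Z < z_k), just before the jump at z_k
    gap = 0.0
    for k in range(n + 1):
        z = (k / n - p) / sd    # Z value when k of the n answers are 1
        phi = normal_cdf(z)
        cdf_after = cdf_before + comb(n, k) * p**k * (1 - p)**(n - k)
        gap = max(gap, abs(cdf_before - phi), abs(cdf_after - phi))
        cdf_before = cdf_after
    return gap

p = 0.42  # assumption: the poll figure treated as the true proportion
for n in (10, 100, 1000):
    print(n, round(max_cdf_gap(n, p), 4))
```

The gap shrinks as n grows, which matches the pictures: the Z-score of a Bernoulli sample mean looks more and more like a standard normal.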

49
CLT: Another Application

For random variables $X_1, \dots, X_n$, each

1. Randomly draw $n$ observations, calculate $\bar{X}$.

2. Calculate the Z-score.

3. Repeat 1-2 MANY times to reveal the distribution of Z.

👆This is the SAME as the dice or the Ghost survey problem

50
CLT and related randomness

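This is the unknown universe:

• $X_i$ is a random variable.
• $\bar{X}$ is a random variable. It has a sampling distribution, and it is unbiased and consistent.
• $Z$ is a random variable. It approaches N(0,1) when $n$ grows.

This is what you see:

• A value of $X_i$ is observed.
• $\bar{x}$ is calculated based on these values. Now $\bar{X}$’s value is observed.
• Construct the z-score from $\bar{x}$. Now Z’s value is observed.

The missing link (marked on the slide for both $\bar{X}$ and $Z$) is between the unknown universe and what you see.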


51
Central Limit Theorem (repeated)
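If $X_1, \dots, X_n$ are independent with $E(X_i) = \mu$ and $\mathrm{var}(X_i) = \sigma^2$,

define $Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$.

Central Limit Theorem

As n increases, the distribution of Z converges towards N(0, 1).

• The distribution of $Z$ becomes approximately standard normal, regardless of the distribution of $X_i$.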

52
CLT Q&A

$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
Q: Why CLT?

A: To make life easier… indeed.


E.g., How do you calculate the probability of the sum of rolling 1000 dice being less than 2000?
You can use CLT to calculate it and it is very precise.
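The dice question can be worked through as follows. The sum of n dice equals $n\bar{X}$, so its Z-score is (sum - n·3.5)/√(n·35/12). This is a minimal sketch: the function names are illustrative, no continuity correction is applied, and the second threshold (3450) is a made-up value added only to show a less extreme case.

```python
from math import erfc, sqrt

def normal_cdf(z):
    """Standard normal cdf; erfc keeps precision in the far left tail."""
    return 0.5 * erfc(-z / sqrt(2))

def clt_prob_sum_below(threshold, n, mu=3.5, sigma2=35 / 12):
    """CLT approximation to P(sum of n fair dice < threshold)."""
    z = (threshold - n * mu) / sqrt(n * sigma2)
    return normal_cdf(z)

p_slide = clt_prob_sum_below(2000, 1000)  # the question asked on the slide
p_near = clt_prob_sum_below(3450, 1000)   # hypothetical, less extreme threshold
print(p_slide, p_near)
```

With 1000 dice the expected sum is 3500 and the standard deviation is about 54, so 2000 sits far out in the left tail and the approximation returns a probability that is zero for all practical purposes.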

Q: Since I can ONLY see one value from my data, does it mean I have only one $Z$-score?

A: Yes.

Q: What is the point of learning a distribution of $\bar{X}$ or $Z$ if I can only see one value?

A: To make sure that we use the correct methodology.


To tell whether the value that we have is some true discovery or purely by chance.

53
