MIT2 854F10 Stats

1. The document provides an overview of statistical inference concepts including probability distributions, sampling, estimation, and hypothesis testing.
2. Key distributions discussed include the binomial, Poisson, normal, t, chi-square, and F distributions, and how they relate to sampling and estimating population parameters.
3. Examples are given of how to construct confidence intervals for means and variances using these distributions, depending on whether the population variance is known or unknown.

Uploaded by

Tesfaye Tefera


Statistical Inference

Lecturer: Prof. Duane S. Boning

Agenda
1. Review: Probability Distributions & Random Variables
2. Sampling: Key distributions arising in sampling
  Chi-square, t, and F distributions
3. Estimation: Reasoning about the population based on a sample
4. Some basic confidence intervals
  Estimate of mean with variance known
  Estimate of mean with variance not known
  Estimate of variance
5. Hypothesis tests

Discrete Distribution: Bernoulli

Bernoulli trial: an experiment with two outcomes

Probability mass function (pmf):

$f(x) = p$ for $x = 1$ (success), $f(x) = 1 - p$ for $x = 0$ (failure)

Discrete Distribution: Binomial

Repeated random Bernoulli trials

n is the number of trials
p is the probability of "success" on any one trial
x is the number of successes in n trials

$f(x) = \binom{n}{x} p^x (1-p)^{n-x}$, $x = 0, 1, \ldots, n$

[Figure: Binomial Distribution. Axes: probability vs. number of "successes".]

Discrete Distribution: Poisson

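pmf: $f(x) = \dfrac{e^{-\lambda} \lambda^x}{x!}$, $x = 0, 1, 2, \ldots$

Mean: $\lambda$
Variance: $\lambda$

Example applications:
  Number of misprints on page(s) of a book
  Number of transistors which fail on first day of operation

Poisson is a good approximation to Binomial when n is large and p is small (< 0.1), with $\lambda = np$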
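The Poisson approximation to the binomial can be checked numerically. The sketch below compares the two pmfs for one large-n, small-p case; the values n = 1000 and p = 0.005 are illustrative assumptions, not taken from the slides.

```python
import math


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**x / math.factorial(x)


# Hypothetical case: n large, p well under 0.1.
n, p = 1000, 0.005
lam = n * p  # Poisson rate matched to the binomial mean

# Largest pointwise gap between the two pmfs over x = 0..30.
max_abs_diff = max(
    abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(31)
)

for x in range(0, 11, 2):
    print(x, round(binomial_pmf(x, n, p), 5), round(poisson_pmf(x, lam), 5))
print("largest pmf gap for x = 0..30:", max_abs_diff)
```

Raising p toward 0.1 while holding np fixed should make the gap visibly larger, which is one way to see why the rule of thumb asks for small p.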

[Figure: Poisson Distributions, including panels for $\lambda = 20$ and $\lambda = 30$. Axes: probability vs. events per unit.]

Continuous Distributions
Uniform Distribution
Normal Distribution
Unit (Standard) Normal Distribution

Continuous Distribution: Uniform

probability density function* (pdf)

cumulative distribution function** (cdf)

*also sometimes called a probability distribution function
**also sometimes called a cumulative density function

Standard Questions You Should Be Able To Answer (For a Known cdf or pdf)

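Probability x less than or equal to some value: $P(x \le x_0) = F(x_0) = \int_{-\infty}^{x_0} f(t)\,dt$

Probability x sits within some range: $P(x_1 \le x \le x_2) = F(x_2) - F(x_1) = \int_{x_1}^{x_2} f(t)\,dt$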
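Both standard questions reduce to evaluating the cdf. A minimal sketch for a uniform distribution on [a, b] follows; the function names and the bounds 0 and 10 are illustrative assumptions.

```python
def uniform_pdf(x: float, a: float, b: float) -> float:
    """Density of a uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_cdf(x: float, a: float, b: float) -> float:
    """P(X <= x) for X uniform on [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def prob_at_most(x0: float, a: float, b: float) -> float:
    """Probability that x is less than or equal to x0."""
    return uniform_cdf(x0, a, b)


def prob_between(x1: float, x2: float, a: float, b: float) -> float:
    """Probability that x sits within [x1, x2], as a difference of cdf values."""
    return uniform_cdf(x2, a, b) - uniform_cdf(x1, a, b)


# Hypothetical bounds a = 0, b = 10.
print(prob_at_most(2.5, 0.0, 10.0))
print(prob_between(2.0, 5.0, 0.0, 10.0))
```

The same two helper shapes work for any distribution once its cdf is available; only `uniform_cdf` is specific to the uniform case.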

Continuous Distribution: Normal (Gaussian)

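pdf: $f(x) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\dfrac{(x-\mu)^2}{2\sigma^2}\right)$

cdf: $F(x) = P(X \le x)$, with values marked on the plot:

  $x$:     $\mu - 3\sigma$   $\mu - 2\sigma$   $\mu - \sigma$   $\mu$   $\mu + \sigma$   $\mu + 2\sigma$   $\mu + 3\sigma$
  $F(x)$:  0.00135   0.0227   0.16   0.5   0.84   0.977   0.99865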
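The cdf values marked on the normal plot (0.00135, 0.0227, 0.16, 0.5, 0.84, 0.977, 0.99865) are the cdf evaluated at whole numbers of standard deviations from the mean, from -3 to +3. A short sketch using the error function from the Python standard library reproduces them; the helper name `normal_cdf` is an assumption.

```python
import math


def normal_cdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """P(X <= x) for X ~ N(mu, sigma^2), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


# cdf at mu + k*sigma for k = -3..3 (unit normal, so mu = 0 and sigma = 1).
cdf_at_k_sigma = {k: normal_cdf(float(k)) for k in range(-3, 4)}

for k, value in cdf_at_k_sigma.items():
    print(f"k = {k:+d}: F = {value:.5f}")
```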

Continuous Distribution: Unit Normal

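Normalization: $z = \dfrac{x - \mu}{\sigma}$

Mean: $\mu_z = 0$
Variance: $\sigma_z^2 = 1$
pdf: $\phi(z) = \dfrac{1}{\sqrt{2\pi}} e^{-z^2/2}$

cdf: $\Phi(z) = \int_{-\infty}^{z} \phi(t)\,dt$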

Using the Unit Normal pdf and cdf

We often want to talk about "percentage points" of the distribution: the portion in the tails

[Figure: unit normal cdf with the 0.1, 0.5, and 0.9 levels marked]
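Percentage points are inverse-cdf lookups: the value z that leaves a chosen tail area to its right. A minimal sketch using `statistics.NormalDist` from the Python standard library; the helper name `upper_percentage_point` is an assumption.

```python
from statistics import NormalDist


def upper_percentage_point(tail_area: float) -> float:
    """Return z such that P(Z > z) = tail_area for the unit normal Z."""
    return NormalDist().inv_cdf(1.0 - tail_area)


for tail_area in (0.10, 0.05, 0.025, 0.005):
    print(f"tail area {tail_area}: z = {upper_percentage_point(tail_area):.4f}")
```

For a two-sided interval at confidence $(1-\alpha)$, the tail area passed in is $\alpha/2$, so 95% confidence uses the 0.025 point.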

Philosophy

The field of statistics is about reasoning in the face of uncertainty, based on evidence from observed data

Beliefs:
  Distribution or model form
  Distribution/model parameters

Evidence:
  Finite set of observations or data drawn from a population

Models:
  Seek to explain data

Moments of the Population vs. Sample Statistics

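Mean: population $\mu = E(x)$; sample $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Variance: population $\sigma^2 = E[(x-\mu)^2]$; sample $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
Standard Deviation: population $\sigma$; sample $s$
Covariance: population $\sigma_{xy} = E[(x-\mu_x)(y-\mu_y)]$; sample $s_{xy} = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$
Correlation Coefficient: population $\rho_{xy} = \dfrac{\sigma_{xy}}{\sigma_x \sigma_y}$; sample $r_{xy} = \dfrac{s_{xy}}{s_x s_y}$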
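The sample statistics (mean, variance, standard deviation, covariance, correlation coefficient) can all be computed directly from data. The sketch below uses the n - 1 divisor for variance and covariance; the data values and helper names are made up for illustration.

```python
import statistics
from math import sqrt


def sample_covariance(xs: list[float], ys: list[float]) -> float:
    """Sample covariance with the n - 1 divisor."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)


def sample_correlation(xs: list[float], ys: list[float]) -> float:
    """Sample correlation coefficient r = s_xy / (s_x * s_y)."""
    s_x = sqrt(statistics.variance(xs))
    s_y = sqrt(statistics.variance(ys))
    return sample_covariance(xs, ys) / (s_x * s_y)


# Hypothetical paired observations.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]

print("mean:", statistics.fmean(xs))
print("variance:", statistics.variance(xs))  # n - 1 divisor
print("std dev:", statistics.stdev(xs))
print("covariance:", sample_covariance(xs, ys))
print("correlation:", sample_correlation(xs, ys))
```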

Sampling and Estimation

Sampling: act of making observations from populations
Random sampling: when each observation is independently and identically distributed (IID)
Statistic: a function of sample data; a value that can be computed from data (contains no unknowns)
  average, median, standard deviation

SticiGui: Statistics Tools for Internet and Classroom Instruction with a Graphical User Interface
http://stat-www.berkeley.edu/~stark/SticiGui

Sampling Demo

Population vs. Sampling Distribution

[Figure: population probability density function, with sampling distributions of the sample mean (a statistic) for n = 1, 2, 10, and 20]
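The narrowing of the sampling distribution as n grows can be seen by simulation. The sketch below draws many samples of size n from a population assumed, for illustration only, to be N(0, 1), and compares the spread of the sample means with $\sigma/\sqrt{n}$. The seed and number of repetitions are arbitrary choices.

```python
import random
import statistics
from math import sqrt


def simulate_sample_means(n: int, num_samples: int, mu: float, sigma: float,
                          rng: random.Random) -> list[float]:
    """Draw num_samples samples of size n and return the mean of each one."""
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]


rng = random.Random(0)   # fixed seed so the run is repeatable
mu, sigma = 0.0, 1.0     # hypothetical population parameters

spread_by_n = {}
for n in (1, 2, 10, 20):
    means = simulate_sample_means(n, 5000, mu, sigma, rng)
    spread_by_n[n] = statistics.stdev(means)
    print(f"n = {n:2d}: std of sample mean = {spread_by_n[n]:.3f}, "
          f"sigma/sqrt(n) = {sigma / sqrt(n):.3f}")
```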

Sampling and Estimation, cont.

Sampling
Random sampling
Statistic
A statistic is a random variable, which itself has a sampling distribution
  I.e., if we take multiple random samples, the value for the statistic will be different for each set of samples, but will be governed by the same sampling distribution

If we know the appropriate sampling distribution, we can reason about the population based on the observed value of a statistic
  E.g. we calculate a sample mean from a random sample; in what range do we think the actual (population) mean really sits?

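Sampling and Estimation: An Example

Suppose we know that the thickness of a part is normally distributed with std. dev. of 10: $x \sim N(\mu, \sigma^2)$, $\sigma = 10$

We sample n = 50 random parts and compute the mean part thickness: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

First question: What is the distribution of $\bar{x}$? For IID normal observations, $\bar{x} \sim N(\mu, \sigma^2/n)$, so here the variance of $\bar{x}$ is 100/50 = 2.

Second question: can we use knowledge of the distribution of $\bar{x}$ to reason about the actual (population) mean $\mu$ given the observed (sample) mean?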

Estimation and Confidence Intervals

Point Estimation:
  Find best values for parameters of a distribution
  Should be
    Unbiased: expected value of estimate should be true value
    Minimum variance: should be estimator with smallest variance

Interval Estimation:
  Give bounds that contain actual value with a given probability
  Must know sampling distribution!

Confidence Interval Demo

Confidence Intervals: Variance Known

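We know $\sigma$, e.g. from historical data
Estimate mean in some interval to $(1-\alpha)100\%$ confidence:

$\bar{x} - z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$

Remember the unit normal percentage points
Apply to the sampling distribution for the sample mean

[Figure: unit normal cdf with the 0.1, 0.5, and 0.9 levels marked]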
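The known-variance interval, $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$, can be sketched in a few lines. The values $\sigma = 10$ and n = 50 come from the part-thickness example; the observed mean of 100.0 is a placeholder assumption, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci_known_sigma(xbar: float, sigma: float, n: int,
                        confidence: float = 0.95) -> tuple[float, float]:
    """Two-sided confidence interval for the mean when sigma is known."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 percentage point
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# sigma and n from the example; xbar = 100.0 is a hypothetical observed mean.
low, high = mean_ci_known_sigma(xbar=100.0, sigma=10.0, n=50)
print(f"95% CI for the mean: [{low:.2f}, {high:.2f}]")
```

The half-width depends only on $\sigma$, n, and the confidence level, so it can be computed before any data are collected.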

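Example, Cont'd
Second question: can we use knowledge of the distribution of $\bar{x}$ to reason about the actual (population) mean $\mu$ given the observed (sample) mean?

n = 50
95% confidence interval, $\alpha = 0.05$
~95% of the distribution lies within $\pm 2\sigma_{\bar{x}}$ of the mean, where $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 10/\sqrt{50} \approx 1.41$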

Reasoning & Sampling Distributions

Example shows that we need to know our sampling distribution in order to reason about the sample and population parameters

Other important sampling distributions:
  Student-t: use instead of the normal distribution when we don't know the actual variation, $\sigma$
  Chi-square: use when we are asking about variances
  F: use when we are asking about ratios of variances

Sampling: The Chi-Square Distribution

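Typical use: find distribution of variance when mean is known
Ex: for $x_i \sim N(\mu, \sigma^2)$, $i = 1, \ldots, n$, with sample variance $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$:

$\dfrac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$

So if we calculate $s^2$, we can use knowledge of the chi-square distribution to put bounds on where we believe the actual (population) variance sits

[Figure: chi-square probability density function]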
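One way to see the chi-square connection is to simulate the scaled sample variance $(n-1)s^2/\sigma^2$ for normal data and check it against the known moments of a chi-square distribution with n - 1 degrees of freedom (mean n - 1, variance 2(n - 1)). This is a simulation sketch, not a derivation; the seed, the population mean of 0, and the repetition count are arbitrary assumptions.

```python
import random
import statistics


def scaled_sample_variance(n: int, mu: float, sigma: float,
                           rng: random.Random) -> float:
    """Draw one sample of size n from N(mu, sigma^2); return (n-1) s^2 / sigma^2."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    return (n - 1) * statistics.variance(sample) / sigma**2


rng = random.Random(1)           # fixed seed for repeatability
n, mu, sigma = 50, 0.0, 10.0     # n and sigma as in the part-thickness example

draws = [scaled_sample_variance(n, mu, sigma, rng) for _ in range(4000)]
sim_mean = statistics.fmean(draws)
sim_var = statistics.variance(draws)

print(f"simulated mean {sim_mean:.2f} vs n - 1 = {n - 1}")
print(f"simulated variance {sim_var:.2f} vs 2(n - 1) = {2 * (n - 1)}")
```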

Sampling: The Student-t Distribution

Typical use: Find distribution of average when $\sigma$ is NOT known
For $k \to \infty$, $t_k \to N(0,1)$
Consider $x_i \sim N(\mu, \sigma^2)$. Then

$\dfrac{\bar{x} - \mu}{s/\sqrt{n}} \sim t_{n-1}$

This is just the "normalized" distance from mean (normalized to our estimate of the sample variance)

Back to our Example

Suppose we do not know either the variance or the mean in our parts population: $x \sim N(\mu, \sigma^2)$

We take our sample of size n = 50, and calculate the sample mean $\bar{x}$ and the sample variance $s^2$

Best estimate of population mean and variance (std. dev.)? $\hat{\mu} = \bar{x}$, $\hat{\sigma}^2 = s^2$

If we had to pick a range where $\mu$ would be 95% of the time?

Have to use the appropriate sampling distribution: in this case the t distribution (rather than the normal distribution)

Confidence Intervals: Variance Unknown

Case where we don't know variance a priori
Now we have to estimate not only the mean based on our data, but also estimate the variance
Our estimate of the mean to some interval with $(1-\alpha)100\%$ confidence becomes

$\bar{x} - t_{\alpha/2,\,n-1} \dfrac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1} \dfrac{s}{\sqrt{n}}$

Note that the t distribution is slightly wider than the normal distribution, so that our confidence interval on the true mean is not as tight as when we know the variance.

Example, Cont'd
Third question: can we use knowledge of the distribution of $\bar{x}$ to reason about the actual (population) mean $\mu$ given the observed (sample) mean, even though we weren't told $\sigma$?

n = 50
95% confidence interval
t distribution is slightly wider than gaussian distribution
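A sketch of the unknown-variance interval, $\bar{x} \pm t_{\alpha/2,\,n-1}\,s/\sqrt{n}$, follows. The Python standard library has no t distribution, so the percentage point is approximated here with a two-term Cornish-Fisher style expansion around the normal quantile; that approximation is a stand-in for a t table or a statistics package, not part of the slides. The sample variance 102.3 and n = 50 are the example's values; the observed mean of 100.0 is a placeholder assumption.

```python
from math import sqrt
from statistics import NormalDist


def t_quantile_approx(p: float, dof: int) -> float:
    """Approximate the p-quantile of Student's t with dof degrees of freedom.

    Series expansion around the normal quantile z; reasonable for moderate
    to large dof, not a replacement for exact tables at very small dof.
    """
    z = NormalDist().inv_cdf(p)
    g1 = (z**3 + z) / 4.0
    g2 = (5.0 * z**5 + 16.0 * z**3 + 3.0 * z) / 96.0
    return z + g1 / dof + g2 / dof**2


def mean_ci_unknown_sigma(xbar: float, s2: float, n: int,
                          confidence: float = 0.95) -> tuple[float, float]:
    """Two-sided confidence interval for the mean using the sample variance s2."""
    alpha = 1.0 - confidence
    t = t_quantile_approx(1.0 - alpha / 2.0, n - 1)
    half_width = t * sqrt(s2 / n)
    return xbar - half_width, xbar + half_width


xbar = 100.0   # hypothetical observed mean
s2 = 102.3     # sample variance from the example
n = 50

low, high = mean_ci_unknown_sigma(xbar, s2, n)
z_half_width = NormalDist().inv_cdf(0.975) * sqrt(s2 / n)

print(f"t-based 95% CI: [{low:.2f}, {high:.2f}]")
print(f"t half-width {(high - low) / 2:.3f} vs normal half-width {z_half_width:.3f}")
```

With 49 degrees of freedom the t-based interval is only a few percent wider than the normal one; the gap grows quickly as n shrinks.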

Once More to Our Example

Fourth question: how about a confidence interval on our estimate of the variance of the thickness of our parts, based on our 50 observations?

Confidence Intervals: Estimate of Variance

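The appropriate sampling distribution is the Chi-square:

$\dfrac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$

where $\chi^2_{p,\,n-1}$ is the upper percentage point (tail area p to the right).

Because $\chi^2$ is asymmetric, c.i. bounds are not symmetric.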

Example, Cont'd
Fourth question: for our example (where we observed $s_T^2 = 102.3$) with n = 50 samples, what is the 95% confidence interval for the population variance?

[Figure: chi-square probability density function, plotted over the range 0 to 100]
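The fourth question can be worked numerically from the observed variance 102.3 and n = 50. Because the Python standard library has no chi-square quantile function, the sketch below substitutes the Wilson-Hilferty normal approximation for a chi-square table lookup; that substitution and the function names are assumptions, and the resulting bounds are approximate.

```python
from math import sqrt
from statistics import NormalDist


def chi2_quantile_approx(p: float, dof: int) -> float:
    """Approximate the value with lower-tail probability p for chi-square(dof).

    Wilson-Hilferty approximation; a stand-in for an exact table lookup.
    """
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * sqrt(c)) ** 3


def variance_ci(s2: float, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Two-sided confidence interval for the population variance."""
    alpha = 1.0 - confidence
    dof = n - 1
    chi2_large = chi2_quantile_approx(1.0 - alpha / 2.0, dof)  # right-tail area alpha/2
    chi2_small = chi2_quantile_approx(alpha / 2.0, dof)        # left-tail area alpha/2
    return dof * s2 / chi2_large, dof * s2 / chi2_small


s2, n = 102.3, 50  # values from the example
low, high = variance_ci(s2, n)

print(f"approximate 95% CI for the variance: [{low:.1f}, {high:.1f}]")
print(f"distance below s^2: {s2 - low:.1f}, distance above s^2: {high - s2:.1f}")
```

The two printed distances differ, which is the asymmetry of the chi-square distribution showing up in the interval.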

Sampling: The F Distribution

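Typical use: compare the spread of two populations
Example:
  $x \sim N(\mu_x, \sigma_x^2)$, from which we sample $x_1, x_2, \ldots, x_n$
  $y \sim N(\mu_y, \sigma_y^2)$, from which we sample $y_1, y_2, \ldots, y_m$
Then

$\dfrac{s_x^2/\sigma_x^2}{s_y^2/\sigma_y^2} \sim F_{n-1,\,m-1}$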

Concept of the F Distribution

Assume we have a normally distributed population
We generate two different random samples from the population
In each case, we calculate a sample variance $s_i^2$
What range will the ratio of these two variances take? ⇒ F distribution
Purely by chance (due to sampling) we get a range of ratios even though drawing from same population

Example:
  Assume $x \sim N(0,1)$
  Take samples of size n = 20
  Calculate $s_1^2$ and $s_2^2$ and take ratio
  95% confidence interval on ratio
  Large range in ratio!
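The N(0, 1), n = 20 example can be reproduced by simulation: draw two independent samples from the same population many times, take the ratio of sample variances, and read off the central 95% of the ratios. This is an empirical sketch rather than an exact F table lookup; the seed and the number of repetitions are arbitrary assumptions.

```python
import random


def sample_variance(values: list[float]) -> float:
    """Sample variance with the n - 1 divisor."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def variance_ratio(n: int, rng: random.Random) -> float:
    """Ratio s1^2 / s2^2 for two independent samples of size n from N(0, 1)."""
    first = [rng.gauss(0.0, 1.0) for _ in range(n)]
    second = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return sample_variance(first) / sample_variance(second)


rng = random.Random(2)  # fixed seed for repeatability
num_trials = 20000
ratios = sorted(variance_ratio(20, rng) for _ in range(num_trials))

# Empirical 2.5% and 97.5% points of the simulated ratios.
ratio_low = ratios[int(0.025 * num_trials)]
ratio_high = ratios[int(0.975 * num_trials)]

print(f"central 95% of variance ratios: [{ratio_low:.2f}, {ratio_high:.2f}]")
```

Even though both samples come from the same population, the upper end of the range is several times the lower end, which is the point of the "large range in ratio" remark.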

Hypothesis Testing
A statistical hypothesis is a statement about the parameters of a probability distribution

H0 is the "null hypothesis"
  E.g. $H_0: \mu = \mu_0$
  Would indicate that the machine is working correctly

H1 is the "alternative hypothesis"
  E.g. $H_1: \mu \ne \mu_0$
  Indicates an undesirable change (mean shift) in the machine operation (perhaps a worn tool)

In general, we formulate our hypothesis, generate a random sample, compute a statistic, and then seek to reject H0 or fail to reject (accept) H0 based on probabilities associated with the statistic and level of confidence we select

Which Population is Sample x From?

Two error probabilities in decision:
  Type I error: "false alarm", $\alpha = P(\text{reject } H_0 \mid H_0 \text{ true})$
  Type II error: "miss", $\beta = P(\text{fail to reject } H_0 \mid H_0 \text{ false})$
  Power of test ("correct alarm"): $1 - \beta$

Consider H0 the "normal" condition
Consider H1 an "alarm" condition

Set decision point (and sample size) based on acceptable $\alpha$, $\beta$ risks

Control charts are hypothesis tests:
  Is my process "in control" or has a significant change occurred?
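The trade-off between false alarms, misses, and sample size can be made concrete with a one-sided z-test on the mean with known $\sigma$: the decision point is set from the false-alarm risk, and the miss probability then follows from the size of the mean shift. The sketch below assumes that test form; the shift of 5 units is hypothetical, while $\sigma = 10$ and n = 50 echo the part-thickness example.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_z_test_power(shift: float, sigma: float, n: int,
                           alpha: float = 0.05) -> float:
    """Power of an upper one-sided z-test when the true mean exceeds mu0 by shift.

    The test rejects H0 when (xbar - mu0) / (sigma / sqrt(n)) > z_alpha.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha)   # decision point set by Type I risk
    beta = std_normal.cdf(z_alpha - shift * sqrt(n) / sigma)  # Type II risk (miss)
    return 1.0 - beta


sigma, n = 10.0, 50

# With no shift, the rejection rate is the false-alarm rate alpha.
print("no shift:", round(one_sided_z_test_power(0.0, sigma, n), 4))

# Hypothetical mean shift of 5 units.
print("shift of 5:", round(one_sided_z_test_power(5.0, sigma, n), 4))

# Smaller samples miss the same shift more often.
for sample_size in (5, 10, 20, 50):
    print(sample_size, round(one_sided_z_test_power(5.0, sigma, sample_size), 4))
```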

Summary
1. Review: Probability Distributions & Random Variables
2. Sampling: Key distributions arising in sampling
  Chi-square, t, and F distributions
3. Estimation: Reasoning about the population based on a sample
4. Some basic confidence intervals
  Estimate of mean with variance known
  Estimate of mean with variance not known
  Estimate of variance
5. Hypothesis tests

Next Time:
1. Are effects (one or more variables) significant? ⇒ ANOVA (Analysis of Variance)
2. How do we model the effect of some variable(s)? ⇒ Regression modeling

MIT OpenCourseWare
http://ocw.mit.edu

2.854 / 2.853 Introduction to Manufacturing Systems

Fall 2010

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
