MECH 262 - Notes (Statistics)

Introduction

Terminology:
● Population: Entire data set
● Sample: Subset of population
● Sample Space: all possible outcomes of data set
● Discrete: Fixed number of options
● Continuous: Infinite number of options
● Random/Stochastic variable: a number assigned to identify each outcome
● Distributions
○ Symmetric
○ Uniform
○ Bimodal
○ Skewed
○ J-Shaped
● Stochastic process
○ Random process
Describing a data set:
● Central tendency
○ Mean
○ Median
○ Mode
● Dispersion
○ Standard deviation (root mean square of the deviations from the mean); see the sketch below
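A minimal MATLAB sketch of these descriptive statistics, using a short made-up data vector (the values are illustrative only):

x = [2.1 2.4 2.4 2.7 3.0 3.3];   % illustrative made-up data
xbar  = mean(x)      % central tendency: mean
xmed  = median(x)    % central tendency: median
xmode = mode(x)      % central tendency: mode (most frequent value)
s     = std(x)       % dispersion: sample standard deviation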


Normal distribution:
● Also called gaussian distribution or bell curve
● Relates mean to standard deviation


Probability axioms:
● Probability: Likelihood that event will happen
● Axiom 1: probability is between 0 and 1
● Axiom 2: P=1 means event must happen
● Axiom 3: the probabilities of all outcomes in the sample space sum to 1
Probability rules:
● Mutually exclusive
○ Events cannot occur at same time
○ RULE: 𝑃(𝐴∪𝐵) = 𝑃(𝐴) + 𝑃(𝐵)
○ eg. Flipping coin and getting heads or tails
● Mutually inclusive
○ Two events may or may not occur together
○ RULE: 𝑃(𝐴∪𝐵) = 𝑃(𝐴) + 𝑃(𝐵) − 𝑃(𝐴∩𝐵)
■ So we don’t double count overlap
● Independent events
○ Outcome of A doesn’t influence B
○ RULE: 𝑃(𝐴∩𝐵) = 𝑃(𝐴) × 𝑃(𝐵)
○ eg. Flipping two coins separately
● Dependent events:
○ Also called conditional probabilities
○ Probability of A given B happening
○ Denoted: 𝑃(𝐴|𝐵)
○ RULE: 𝑃(𝐴∩𝐵) = 𝑃(𝐵) 𝑃(𝐴|𝐵)
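A quick numeric check of these rules in MATLAB, using a hypothetical fair six-sided die (not an example from the notes): let A = "roll is even" = {2,4,6} and B = "roll is 5 or 6" = {5,6}.

pA = 3/6;  pB = 2/6;  pAandB = 1/6;   % only {6} is in both events
pAorB = pA + pB - pAandB;             % mutually inclusive rule -> 4/6
% Independent events: two separate fair coin flips
pBothHeads = 0.5 * 0.5;               % multiplication rule -> 0.25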
Probability distribution:
● Probability mass functions (PMF)
○ For discrete random variables
○ Mean: µ = Σ𝑥𝑖 𝑃(𝑥𝑖)
○ Variance: σ² = Σ(𝑥𝑖 − µ)² 𝑃(𝑥𝑖)
● Probability density function (PDF)
○ Since infinite number of outcomes, probability of given outcome is 0
■ ∴intervals must be used
○ Use integral instead of sums
○ 𝑃(𝑎 ≤ 𝑥 ≤ 𝑏) = ∫ₐᵇ 𝑓(𝑥) 𝑑𝑥
○ Mean: µ = ∫𝑥 𝑓(𝑥) 𝑑𝑥
○ Variance: σ² = ∫(𝑥 − µ)² 𝑓(𝑥) 𝑑𝑥
● Cumulative distribution function (CDF)
○ Use probability function to determine probability of event in certain range
■ Adjust bounds of integration (see the sketch below)
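A short MATLAB sketch of the PDF formulas above, using the uniform distribution on [0, 1] as an assumed example (f(x) = 1 on that interval):

f  = @(x) ones(size(x));                        % uniform PDF on [0,1]
mu = integral(@(x) x.*f(x), 0, 1);              % mean = 0.5
s2 = integral(@(x) (x - mu).^2.*f(x), 0, 1);    % variance = 1/12
P  = integral(f, 0.2, 0.6);                     % P(0.2 <= x <= 0.6) = 0.4 (CDF idea: adjust the bounds)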

Discrete probability distributions


● Types
○ Binomial distribution
○ Poisson distribution
Choose notation:
● Permutations
○ Denoted: nPk
○ Order matters
○ Number of unique ordered sets
● Combinations
○ Denoted: nCk
○ Order independent
○ Number of unique non-ordered sets
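The standard counting formulas are nPk = n!/(n−k)! and nCk = n!/(k!(n−k)!). A minimal MATLAB check (n and k below are assumed values; nchoosek is built in):

n = 5; k = 2;
nPk = factorial(n)/factorial(n - k);   % permutations: 5!/3! = 20 ordered sets
nCk = nchoosek(n, k);                  % combinations: 5!/(2!3!) = 10 non-ordered sets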
Binomial distribution:
● Used for only 2 possible outcomes
○ Eg. pass or fail
○ Probability of a certain outcome occurring k times when the process is repeated n times
● 𝑃(𝑘) = 𝑛𝐶𝑘 𝑝ᵏ(1 − 𝑝)ⁿ⁻ᵏ where p is the probability of the desired event, k is the # of desired events, and n is the total number of events (see the sketch below)
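A minimal MATLAB sketch of this PMF (binopdf is a Statistics and Machine Learning Toolbox function; the values of n, k, and p are assumed, not from the notes):

n = 10; k = 3; p = 0.2;                              % assumed values
P_formula = nchoosek(n, k) * p^k * (1 - p)^(n - k);  % binomial PMF written out
P_binopdf = binopdf(k, n, p);                        % same value via the toolbox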
Poisson distribution:
● Probability of given events in fixed interval of time or space
● Used for
○ Constant mean rate
○ Events independent of time since last event
○ Problems that are discrete counts in time or space
● Assumptions:
○ Probability of event is independent of time interval (given same length of time)
○ Probability of event is independent of other events
● 𝑃(𝑘) = λᵏ 𝑒^(−λ) / 𝑘! where k is the number of times the desired event occurs and λ is the expected number of events in the given time interval of interest (see the sketch below)
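A quick MATLAB check of the Poisson formula (poisspdf is a Statistics Toolbox function; λ = 4 events per interval and k = 2 are assumed values):

lambda = 4; k = 2;                                    % assumed values
P_formula = lambda^k * exp(-lambda) / factorial(k);   % Poisson PMF written out
P_poisspdf = poisspdf(k, lambda);                     % same value via the toolbox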

Continuous probability distributions


Types:
● Normal/gaussian distribution
● Standard log-normal distribution
● Exponential distribution
Normal/gaussian distribution:
● Symmetric; values can be both positive and negative
● Models random error
○ PDF: 𝑓(𝑥) = (1/(σ√(2π))) 𝑒^(−(𝑥−µ)²/(2σ²))
○ Integrating this PDF has no closed-form solution, so use a change of variables
○ 𝑧 = (𝑥 − µ)/σ
■ How many standard deviations away from mean
○ Set up the standard integral: 𝑃(𝑧₁ ≤ 𝑧 ≤ 𝑧₂) = (1/√(2π)) ∫ 𝑒^(−𝑧²/2) 𝑑𝑧 over [𝑧₁, 𝑧₂]
■ Set z₁ = 0 to make it a single-variable function
■ Use tables
● Using MATLAB to find the probability (see the sketch below)
○ p = normcdf(z) where z is the change of variable, OR
■ From -∞ to the point (NOT from 0)
○ p = normcdf(x, µ, σ) to work with x directly
■ From -∞ to the point (NOT from 0)
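For example, a small sketch of getting an interval probability this way (the mean, standard deviation, and limits are assumed numbers; normcdf needs the Statistics Toolbox):

mu = 50; sigma = 2;                                      % assumed population parameters
z1 = (48 - mu)/sigma;   z2 = (53 - mu)/sigma;            % change of variables
P_z = normcdf(z2) - normcdf(z1);                         % P(48 <= x <= 53) via z
P_x = normcdf(53, mu, sigma) - normcdf(48, mu, sigma);   % same result, working with x directly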
Standard lognormal distribution:
● Strictly positive and occasionally very large
○ eg. Lifetime of equipment
● Logarithm that is normally distributed
○ Take the ln of the variable, then apply the standard normal distribution
Exponential distribution
● Likelihood of an event increases or decreases exponentially with time
● PDF: 𝑓(𝑥, λ) = λ𝑒^(−λ𝑥) where λ is the rate parameter
● Mean (µ) = standard deviation (σ) = 1/λ
● CDF: 𝑃(𝑥₁ ≤ 𝑥 ≤ 𝑥₂) = 𝑒^(−λ𝑥₁) − 𝑒^(−λ𝑥₂) (example below)
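A small MATLAB sketch of the exponential CDF expression above (the rate λ and the interval limits are assumed values):

lambda = 0.5;  x1 = 1;  x2 = 3;             % assumed rate and interval
P = exp(-lambda*x1) - exp(-lambda*x2);      % P(x1 <= x <= x2) from the CDF above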

Test for normality:


● Test how normal an experimental distribution is
● 2 measures
○ Skewness: asymmetry of the data
■ Denoted S
■ 𝑆 = 0 is perfectly symmetric
■ |𝑆| > 1 ⇒ very skewed
○ Kurtosis: sharpness of data (also indication of tails/extremes)
■ Denoted K
■ 𝐾 = 3 is perfect normal distribution
■ note: Some programs subtract 3 automatically
■ |𝐾 − 3| > 3 ⇒ very sharp/blunt
● Standardized moment chart to describe data relative to normal distribution
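A minimal sketch of checking both measures in MATLAB (skewness and kurtosis are Statistics Toolbox functions; the data here are synthetic):

x = randn(1000, 1);     % synthetic data drawn from a standard normal
S = skewness(x)         % ~0 for symmetric data
K = kurtosis(x)         % ~3 for normal data (this function does NOT subtract 3)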

Population and samples


Distribution parameter estimation:
● Impractical to measure the entire population
○ ∴ usually measure only a sample
● How do we determine whether our sample accurately represents the population mean?
● Central limit theorem
○ Distribution of sample means is normally distributed
■ For n>30 (#of data points per sample)
■ Regardless of population distribution
○ ↗ # of data points (n) per sample → ↘ standard deviation of the sample means
○ Properties
■ If the population is normally distributed → the sample mean (x̄) is normally distributed
■ If the # of data points per sample (n) > 30 → the sample mean (x̄) is normally distributed

■ If n > 30 → the standard deviation of the sample means is σ/√𝑛
Interval estimation of mean:
● Determining error in our sample mean

○ δ is the confidence interval: the population mean is estimated as x̄ ± δ
○ A 95% confidence interval is the standard (ASME standard)
● Confidence level (C)
○ Probability that population mean (μ) lies within confidence interval
○ 𝐶 = 1 − α where α is level of significance
○ C is % chance event will happen
○ α is % chance event will not happen
● Assume standard deviation of sample is equal to standard deviation of population
○ 𝑆= σ
● δ = 𝑧_(α/2) 𝑆 / √𝑛 where α is the level of significance
○ To find zα/2 reverse the z process using tables
○ Using matlab
■ z = norminv(p)
■ Links probability to z value
○ z1 is at norminv(α/2) AND z2 is at norminv(C+α/2)
● One-sided intervals
○ Only interested in the upper or lower limit

■ Upper: µ ≤ x̄ + 𝑧_α 𝑆/√𝑛
■ Lower: µ ≥ x̄ − 𝑧_α 𝑆/√𝑛
○ DON’T divide α by 2 since all area (probability) is on one side
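A sketch of the two-sided 95% interval in MATLAB, following the δ formula above (the sample mean, standard deviation, and size are assumed numbers; norminv needs the Statistics Toolbox):

xbar = 10.2;  S = 0.8;  n = 40;        % assumed sample mean, std dev, size (n > 30)
C = 0.95;  alpha = 1 - C;
z = norminv(C + alpha/2);              % z_(alpha/2), ~1.96 for 95% confidence
delta = z*S/sqrt(n);                   % confidence interval half-width
CI = [xbar - delta, xbar + delta]      % population mean lies here with 95% confidence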
Student’s t-distribution:
● Use when n<30
● Same procedure as normal distribution BUT
○ Use t instead of z
○ Matlab:
■ tinv(p, nu) where 𝑝_𝑢𝑝𝑝𝑒𝑟 = 𝐶 + α/2 (see the sketch below)
■ ν (nu) is degree of freedom
● As ν→∞, distribution approaches normal distribution
● As ν decreases (small n), the distribution flattens and widens
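The same interval for a small sample simply swaps tinv in for norminv; a minimal sketch with assumed numbers:

xbar = 10.2;  S = 0.8;  n = 12;          % assumed small sample (n < 30)
C = 0.95;  alpha = 1 - C;  nu = n - 1;   % degrees of freedom
t = tinv(C + alpha/2, nu);               % t value replaces z for small samples
delta = t*S/sqrt(n);
CI = [xbar - delta, xbar + delta]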
Estimation of population variance:
● Use the chi-squared (χ²) distribution
● Use matlab
○ chi2inv(p, nu)
● χ² is only positive, therefore the bounds are:
○ 𝑝₁ = α/2 and 𝑝₂ = 𝐶 + α/2 (see the sketch below)
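A minimal MATLAB sketch using these bounds (chi2inv is a Statistics Toolbox function; the sample variance and size are assumed numbers, and the (n−1)S²/χ² interval shown in the comments is the standard textbook form, not spelled out in these notes):

S2 = 0.64;  n = 12;                    % assumed sample variance and size
C = 0.95;  alpha = 1 - C;  nu = n - 1;
c1 = chi2inv(alpha/2, nu);             % lower chi-squared bound (p1)
c2 = chi2inv(C + alpha/2, nu);         % upper chi-squared bound (p2)
CI = [nu*S2/c2, nu*S2/c1]              % (n-1)S^2/chi2 interval for the population variance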

Correlation
Linear correlation:
● Linear correlation coefficient (rx, y)
○ 𝑟 = 1 → strong positive correlation
○ 𝑟 =− 1 → strong negative correlation
○ 𝑟 ≈ 0 (e.g. ±0.1) → no correlation
● Only provides data on correlation
○ NO slope
○ NO non-linear correlation
● Matlab:
○ corr(x, y) where x and y are arrays of values
● Significance of linear correlation coefficient
○ ↗ data points → ↗ significance
○ Table gives minimum correlation coefficient needed to accept correlation
○ Depends on
■ #of data points sampled
■ Significance level wanted (α) (% chance that the correlation is due to pure chance); see the sketch below
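A minimal MATLAB sketch of computing r (corr is a Statistics Toolbox function; the data here are made up):

x = (1:10)';                     % made-up data
y = 2*x + randn(10, 1);          % roughly linear in x, with added noise
r = corr(x, y)                   % close to +1 -> strong positive correlation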

Correlation and causation:


● Correlation NOT causation
