
Statistical Methods

Lecture 1
Census: collection of data from every member of the population (usually the population is too large to do this).
Sample: sub-collection from the population.
Different sample → different data → different conclusions about the population.
A sample should be representative and unbiased.

Parameter: numerical measurement describing a population’s characteristic, often denoted by Greek symbols.

Statistic: numerical measurement describing a sample’s characteristic, often denoted by small (Latin) letters.

Sampling methods (sketch after this list):
o Voluntary response sample: subjects decide themselves to be included in sample.
o Random sample: each member of population has equal probability of being selected.

o Simple random sample: each sample of size n has equal probability of being chosen
o Systematic sampling: after starting point, select every k-th member.
o Stratified sampling: divide population into subgroups such that subjects within groups have
same characteristics, then draw a (simple) random sample from each group.
o Cluster sampling: divide population into clusters, then randomly select some of these
clusters.
o Convenience sampling: use results that are easily available.
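A minimal Python sketch contrasting three of these methods on a toy population (the population, group labels and sample sizes below are made up for illustration):

import numpy as np

rng = np.random.default_rng(seed=1)

# Toy population: 1000 subjects, each belonging to one of three strata (made-up labels).
population = np.arange(1000)
strata = rng.choice(["A", "B", "C"], size=1000, p=[0.5, 0.3, 0.2])

# Simple random sample: every sample of size n is equally likely.
simple = rng.choice(population, size=10, replace=False)

# Systematic sample: random starting point, then every k-th member.
k = 100
start = rng.integers(0, k)
systematic = population[start::k]

# Stratified sample: a simple random sample from each subgroup.
stratified = np.concatenate(
    [rng.choice(population[strata == s], size=5, replace=False) for s in ["A", "B", "C"]]
)
print(simple, systematic, stratified, sep="\n")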

Variable: varying quantity


o Response (dependent) variable: representing the effect to study
o Explanatory (independent) variable: possibly causing that effect
o Confounding: mixing influence of several explanatory variables on response
Types of studies
o Observational study: characteristics of subjects are observed, subjects are not modified.
 Retrospective (case-control): data from past
 Cross-sectional: data from one point in time
 Prospective (longitudinal): data are to be collected
o Experiment: subjects receive some treatment
 Sometimes a control group and a treatment group; single-blind or double-blind
 To measure the placebo effect or the experimenter effect

Types of data
o Qualitative (categorical): names or labels (not numbers representing counts or measurements)
 Nominal: names, labels, categories (no ordering)
 Gender, eye color
 Ordinal: categories with ordering, but no (meaningful) differences
 U.S. grades, opinions
o Quantitative (numerical): numbers represent counts/measurements
 Interval: ordering possible and differences between numbers are meaningful. No
natural zero starting point
 Year of birth, temperatures
 Ratio: ordering possible, differences are meaningful and there is a natural starting
point.
 body length, marathon times

 Discrete: countable number of possible values


 Continuous: uncountably many possible values

Summarizing data:
o Graphical: tables, graphs, other figures
o Descriptive:
 Qualitative: describe shape, location and dispersion/variation
 Quantitative: numerical summaries of location and variation

Graphical summaries:
o Frequency distribution (table)
Count occurrences of category
o Bar chart
Spaces in between the categories
o Pareto bar chart
Bar chart, but categories are ordered w.r.t. frequency.
Data at the nominal measurement level is required
o Pie chart
Pie piece sizes determined by relative frequency of category
o Histogram
Bar areas are proportional to the frequency in the respective interval. No white space between the bars.
Only used for quantitative data
o Time series
Visualization of time-varying quantity

Qualitative description:
o Shape:
Make smooth approximation of histogram
Shape of smooth curve relates data distribution to familiar distributions.
 Symmetrical
 Left- or Right-skewed
 Uniform
o Location:
Position on x-axis
Same shape, different location
o Dispersion (spread/variation):
Measure of variation within dataset
Same shape and location; different dispersion
 Small or large dispersion

Measures of center: value at the center/middle of a dataset (sketch after this list)


o Mean

Average
Every data value is used
Strongly affected by extreme values
 Sample mean denoted by x̄
 Population mean denoted by μ
o Median
Middle value after sorting
Not much affected by extreme values
o Mode
Value with highest frequency
Bimodal (2), multimodal (>2)
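A quick numeric sketch of these three measures on made-up data values (numpy and the standard library only):

import numpy as np
from statistics import multimode

data = [2, 3, 3, 4, 5, 5, 5, 40]   # made-up values; 40 is an extreme value

print(np.mean(data))      # 8.375 -> pulled up strongly by the extreme value 40
print(np.median(data))    # 4.5   -> barely affected by the extreme value
print(multimode(data))    # [5]   -> value with the highest frequency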

(sample) standard deviation s: common measure of variation (deviation from x̄)

Measures how much the values deviate from the sample mean.
 Square root of the sample variance (“mean quadratic deviation from x̄”): s = √( Σ(xᵢ − x̄)² / (n − 1) )

Population standard deviation: σ

Population variance: σ²
Range = maximum − minimum (numeric sketch below)
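A short sketch computing these sample measures of variation for made-up data (note that ddof=1 gives the sample versions, which divide by n − 1):

import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])   # made-up sample

s2 = np.var(x, ddof=1)       # sample variance (divides by n - 1)
s = np.std(x, ddof=1)        # sample standard deviation = sqrt(sample variance)
data_range = x.max() - x.min()   # range = maximum - minimum

print(s2, s, data_range)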

Percentiles: measures of location and dispersion (Pk = value below which approximately k% of the data fall)


Quartiles:
o Q1= P25
o Q2= P50 = median
o Q3= P75

5 number summary:
1. Minimum
2. Q1
3. Median, Q2
4. Q3
5. Maximum
Interquartile range = Q3 – Q1
 Boxplot: provides information about the distribution (sketch after this list)
top value: maximum
top of box: Q3
thick line: median
bottom of box: Q1
lowest value: minimum

whiskers: lines extending from the box


outliers: all points not included between whiskers
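A small sketch of the five-number summary and IQR using numpy percentiles (the data are made up, and quartile conventions differ slightly between textbooks and software, so the numbers are indicative):

import numpy as np

x = np.array([1, 3, 4, 5, 5, 6, 7, 8, 9, 12])   # made-up sample

q1, q2, q3 = np.percentile(x, [25, 50, 75])
five_number = (x.min(), q1, q2, q3, x.max())
iqr = q3 - q1

print(five_number)   # (minimum, Q1, median, Q3, maximum)
print(iqr)           # interquartile range = Q3 - Q1
# matplotlib's plt.boxplot(x) draws the corresponding boxplot.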

Lecture 2
Probability experiment: production of (random) outcome.
 dice roll, coin toss
Sample space Ω: set of all possible outcomes
 Ω = { 1,2,3,4,5,6}
Event A, B, …: collection of outcomes
 A = {even number thrown} = { 2,4,6}
Simple event: consist of 1 outcome
Probability measure: function P(.) assigning values between 0 and 1 to events
 P(A) = P({2,4,6}) = ½
Interpretation of probabilities:
o P(A) = 0  occurrence of A is impossible
o P(A) = 1  occurrence of A is certain
o P(A) = small e.g. <0.05  occurrence of A is unlikely

3 ways to determine probability P(A) of event A:


1. Estimate with relative frequency:
P(A) = (number of times A occurred) / (number of times the procedure was repeated)

2. Classical (theoretical) approach


Make probability model (outcome space, probability measure..), compute P(A) by using
properties of P
3. Subjective approach
Estimate P(A), based on intuition/experience

With the relative-frequency approach, many trials make the relative frequency almost equal to the real
value of P(A) → Law of Large Numbers: suppose a procedure is repeated independently; then the
relative frequency of an event A tends towards the true P(A).

Determining P(A) if all outcomes are equally likely (sketch below):

P(A) = (number of ways A can occur) / (total number of different simple events)
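A sketch comparing the relative-frequency estimate with the classical computation for A = {even number thrown} on a fair die (the number of simulated rolls is arbitrary):

import numpy as np

rng = np.random.default_rng(seed=2)

# Classical approach: 3 of the 6 equally likely outcomes are even.
p_classical = 3 / 6

# Relative-frequency approach: repeat the experiment many times.
rolls = rng.integers(1, 7, size=100_000)
p_estimate = np.mean(rolls % 2 == 0)

print(p_classical, p_estimate)   # the estimate is close to 0.5 (Law of Large Numbers)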

Counting principle:
Suppose 2 probability experiments are performed:
experiment 1 → a possible outcomes;
experiment 2 → b possible outcomes.
Combined: a × b possible outcomes.

How to find P(A) in discrete case


o Find sample space Ω
o Determine probabilities P(ω) for all ω in Ω
Finite case with N equally likely outcomes: P(ω) = 1/N
o Determine which outcomes belong to A
o Compute P(A) as the sum of P(ω) over all outcomes ω in A

 Example biased dice: probability of even number

Addition rule:
o P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Notation:
A ∪ B = A or B: union, set of outcomes which are in A or B (or both)
A ∩ B = A and B: intersection, set of outcomes which are both in A and B

o A and B are disjoint if they exclude each other, A ∩ B = ∅


Addition rule for 2 disjoint events:
P(A ∪ B) = P(A) + P(B)

General addition rule for disjoint events A1, …, Am: P(A1 ∪ … ∪ Am) = P(A1) + … + P(Am)

o Ā: complement of A, the set of outcomes which are not in A

Complement rule: P(Ā) = 1 − P(A)

Multiplication rule
o P(B|A): conditional probability that B occurs given that A has occurred.
Conditional probability:
If P(A) > 0, then: P(B|A) = P(A ∩ B) / P(A)
BUT P(B|A) ≠ P(A|B)

o Multiplication rule:
P(A ∩ B) = P(A) · P(B|A).

o Independence:
Two events A and B are independent if P(A ∩ B) = P(A) · P(B)
 P(B) = P(B|A) when A and B are independent
 Independence ≠ disjointness (enumeration sketch below)
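A tiny sketch checking the multiplication rule and independence by enumerating two dice; the events A and B below are chosen only for illustration:

from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))   # all 36 equally likely outcomes of two dice

def prob(event):
    # P(event) with equally likely outcomes: #favourable / #total
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

def A(w): return w[0] % 2 == 0        # first die shows an even number
def B(w): return w[0] + w[1] == 7     # the two dice sum to 7

p_A, p_B = prob(A), prob(B)
p_AB = prob(lambda w: A(w) and B(w))

print(p_AB == p_A * p_B)    # True: A and B are independent here
print(p_AB / p_A == p_B)    # P(B|A) = P(A and B)/P(A) equals P(B)
# Note p_AB > 0, so A and B are independent but not disjoint.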

Two different sampling methods:


1. Sampling with replacement: selections are independent events
2. Sampling without replacement: selections are dependent events
 When drawing a small sample from a large population, the selections may be treated as independent
events.

o Complement of at least one:


P(≥ 1 occurrence of . . . ) = 1 − P(no occurrence of . . .)

Lecture 3
Addition rule for disjoint events: P(A ∪ B) = P(A) + P(B) when A ∩ B = ∅
Then, multiplication rule: P(A ∩ B) = P(A) · P(B|A)
 Simple law of total probability:
Let A and B be events, then: P(B) = P(B ∩ A) + P(B ∩ Ā)
Combine with the multiplication rule: P(B) = P(B|A) · P(A) + P(B|Ā) · P(Ā)


 Bayes’ Theorem (simple): P(A|B) = P(B|A) · P(A) / P(B)

!! P(A|B) ≠ P(B|A), but Bayes’ Theorem relates the two

o Partition:
Events A1, …, Am are called partition if
 They are pairwise disjoint: Ai ∩ Aj = ∅, if i ≠ j;
 Their union is entire sample space: : A1 ∪ A2 ∪ . . . ∪ Am = Ω.

o Law of total probability
Let A1, …, Am be a partition, then for any event B:
P(B) = P(B|A1) · P(A1) + … + P(B|Am) · P(Am)
o Bayes’ Theorem (numeric sketch below)
Let A1, …, Am be a partition, then for r ∈ {1, …, m}:
P(Ar|B) = P(B|Ar) · P(Ar) / ( P(B|A1) · P(A1) + … + P(B|Am) · P(Am) )

EXAMPLE:
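A small numerical sketch of the law of total probability and Bayes’ Theorem for a two-event partition; the prevalence and test accuracies are made-up illustration numbers, not the lecture’s example:

# Partition: A1 = "has condition", A2 = "does not"; B = "test is positive".
p_A1 = 0.01                 # made-up prior P(A1); P(A2) = 1 - P(A1)
p_A2 = 1 - p_A1
p_B_given_A1 = 0.95         # made-up sensitivity P(B | A1)
p_B_given_A2 = 0.05         # made-up false-positive rate P(B | A2)

# Law of total probability: P(B) = sum over the partition of P(B | Ai) * P(Ai)
p_B = p_B_given_A1 * p_A1 + p_B_given_A2 * p_A2

# Bayes' Theorem: P(A1 | B) = P(B | A1) * P(A1) / P(B)
p_A1_given_B = p_B_given_A1 * p_A1 / p_B
print(p_B, p_A1_given_B)    # note P(A1 | B) is far smaller than P(B | A1)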

A random variable is a variable that assigns a numerical value to each outcome of a probability
experiment.
Notation: X, Y, …
X: random variable; x: a value of the random variable
EXAMPLE

o Probability distribution:
Determines probabilities of values of a random variable
Given by table, formula or graph

o Discrete random variable


Has finitely (or countably) many different values
Its probability distribution: the collection of the individual probabilities of these values
Total sum of probabilities: 1

o Continuous random variable


Has uncountably many different values
Its probability distribution: given by probability density function; probabilities computed by
area under this function.
Total area: 1

Outcomes ω in Ω and probability measure P determine probability distribution of X:


P(X = x) = P({ω ∈ Ω : X(ω) = x}).

Find probability distribution of discrete random variable:


1. Determine sample space of underlying probability experiment and probabilities of outcomes
ω (see lecture 2)
2. List values X(ω) for all ω in Ω
3. For each value x of X, find all simple events {ω} with value x
Unify: {X = x} = {ω : X(ω) = x}
4. The probabilities P({ω}) determine the probability of {X = x}:
P(X = x) = sum of P({ω}) over all ω with X(ω) = x
5. Table (sketch below). Left column: all values x of X; right column: the probabilities P(X = x)
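A sketch of this recipe for X = sum of two fair dice, with exact probabilities via Fraction (the choice of X is for illustration):

from collections import Counter
from fractions import Fraction
from itertools import product

# Step 1: sample space and outcome probabilities (36 equally likely outcomes).
omega = list(product(range(1, 7), repeat=2))
p_outcome = Fraction(1, len(omega))

# Steps 2-4: list X(w) for every outcome and collect outcomes with the same value x.
counts = Counter(sum(w) for w in omega)
distribution = {x: n * p_outcome for x, n in sorted(counts.items())}

# Step 5: tabulate x against P(X = x); the probabilities sum to 1.
for x, p in distribution.items():
    print(x, p)
print(sum(distribution.values()))    # 1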

EXAMPLE:
o Expected value (expectation/mean)
The expected value of a discrete random variable X with possible values x1, …, xk:
Weighted average of all possible values of X:
E(X) = µ = x1 · P(X = x1) + … + xk · P(X = xk)
EXAMPLE
o Variance
The variance of a discrete random variable X with values x1, …, xk:
Var(X) = σ² = (x1 − µ)² · P(X = x1) + … + (xk − µ)² · P(X = xk)
o The standard deviation of X is SD(X) = σ = √Var(X) (sketch below)

EXAMPLE
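A sketch computing expectation, variance and standard deviation of a discrete random variable from its table of values and probabilities; here X is one fair die roll:

import math

values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6            # fair die; probabilities sum to 1

mu = sum(x * p for x, p in zip(values, probs))                 # E(X) = 3.5
var = sum((x - mu) ** 2 * p for x, p in zip(values, probs))    # Var(X) ≈ 2.9167
sd = math.sqrt(var)                                            # SD(X) ≈ 1.7078
print(mu, var, sd)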
Law of Large Numbers (simulation sketch below):
Let X1, …, Xn be n independent versions of a random variable X; let µ = E(X).
Their mean (1/n)·(X1 + … + Xn) tends to approach µ.
LLN of Lecture 2: take the random variable Xi = 1 if A occurs and Xi = 0 if A does not.
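A simulation sketch of the Law of Large Numbers: the running mean of independent die rolls approaches µ = E(X) = 3.5 (the number of rolls and the seed are arbitrary):

import numpy as np

rng = np.random.default_rng(seed=3)
rolls = rng.integers(1, 7, size=100_000)

running_mean = np.cumsum(rolls) / np.arange(1, len(rolls) + 1)
print(running_mean[99], running_mean[-1])   # after 100 rolls vs. after 100 000 rolls
# The second value is very close to the true mean 3.5.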

Find probability distribution of discrete random variable


1. Determine sample space and probabilities of underlying probability experiment.
2. List the numerical values X(ω) for each outcome ω ∈ Ω.
3. Find the collection of outcomes which have the same numerical value x.
4. Determine P(X = x) for each value x.
5. Tabulate the results.

Lecture 4
Probability density function
A curve p(x) such that
o p(x) ≥ 0 for all x
o total area under curve = 1
P(X ∈ [a, b]) = area under the curve p(x) between a and b

Normal distribution
A random variable X has a normal distribution if its probability density p(x) is continuous, bell-shaped and symmetric.

If E(X) = µ and SD(X) = σ, the density is p(x) = 1 / (σ√(2π)) · e^(−(x − µ)² / (2σ²)).

Notation: N(µ, σ²) for the normal distribution with mean µ and variance σ²

Standard normal distribution: N(0,1)


Determine probabilities of normal distribution
P(X ≤ z) = area under density to the left of z

P(X ∈ [a, b]) = P(X ≤ b) − P(X ≤ a)


P(X ≥ b) = 1 − P(X ≤ b)

EXAMPLE

o Probability density function


A curve p(x) such that p(x) ≥ 0 and the total area under the curve is 1.
The probability that X takes a value between a and b: P( X ∈ [a, b]), can be obtained by
determining the area under the curve p(x) between a and b.
o Normal distribution
A random variable X has a normal distribution if its probability density p(x) is continuous,
bell-shaped and symmetric.
Notation: N(µ, σ²) for a normal distribution with mean µ and variance σ².
The standard normal distribution has mean 0 and standard deviation 1: N(0, 1).
o Determine probabilities of the standard normal distribution
Let X have the N(0, 1) distribution.
Probability P(X ≤ x): use the table which shows the cumulative area under the curve to the left of a z-score, P(X ≤ z).
Z-score of value x
Let x be a (data) value of interest, related to a population distribution with mean µ and standard deviation σ. The z-score of x is z = (x − µ) / σ
 Number of standard deviations away from the mean
Let X ∼ N(µ, σ²). Since P(X ≤ x) = P(Z ≤ z), where Z = (X − µ) / σ ∼ N(0, 1), use Table 2 (sketch below)
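A sketch of the z-score recipe, checked against scipy; the values of µ, σ and x below are made up:

from scipy.stats import norm

mu, sigma = 100.0, 15.0        # made-up population mean and standard deviation
x = 120.0                      # made-up value of interest

z = (x - mu) / sigma           # z-score: number of standard deviations from the mean
p_table = norm.cdf(z)          # P(Z <= z), what the standard normal table gives
p_direct = norm.cdf(x, loc=mu, scale=sigma)   # same probability without standardizing

print(z, p_table, p_direct)    # the two probabilities agree
print(1 - p_table)             # P(X >= x) = 1 - P(X <= x)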

EXAMPLE

The Central Limit Theorem (CLT)


Take a sample of size n > 30 from a population with mean µ and standard deviation σ.
Then the sample mean x̄ has approximately the N(µ, σ²/n) distribution.

The population can have any distribution (simulation sketch below)

The Central Limit Theorem for normal population (special case)


Take a sample of size n from a normal population with mean µ and standard deviation σ.
Then the sample mean x̄ has exactly the N(µ, σ²/n) distribution.

n can be any number (no minimum sample size required)

EXAMPLE
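A simulation sketch of the CLT: sample means from a clearly non-normal (exponential) population behave approximately normally for n > 30; the population mean, sample size and number of repetitions are made up:

import numpy as np

rng = np.random.default_rng(seed=4)
mu, n, repeats = 2.0, 40, 10_000                 # exponential population with mean 2

sample_means = rng.exponential(scale=mu, size=(repeats, n)).mean(axis=1)

print(sample_means.mean())                       # close to mu
print(sample_means.std(ddof=1))                  # close to sigma / sqrt(n) = 2 / sqrt(40)
# A histogram of sample_means is approximately bell-shaped, as the CLT predicts.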
A model distribution
Probability distribution for describing the unknown true population distribution
Examples (continuous variables): normal, uniform, t, χ², exponential.

The variable <…> is (modelled as) a random variable
having a <model distribution>
with <relevant parameters>

Example: The variable ‘Date of birth - Due date’ is a random variable having a normal distribution
with mean 0 and standard deviation 10

Assessing normality
Consider a dataset x1, …, xn. When is the model distribution N(µ, σ²) reasonable?
o Shape of histogram
Bell-shaped curve
Strong deviation from bell shape? Then N(µ, σ²) is unlikely
o Normal QQ plot
Approximately straight line
EXAMPLE

What is a QQ plot
There are QQ plots other than “normal QQ plots”: use theoretical quantiles of other continuous
distributions.

Sample size
Small n: more variation
 histogram / QQ plot could deviate (from bell shape / straight line), even if N(µ, σ²) is true.
Large n: histogram and QQ plot are more reliable

A location-scale family of probability distributions


Each member is obtained by
o Shifting (change in location) and/or
o Stretching/squeezing (change in scale)

Random variables X and Y have probability distributions in the same location-scale family if and only
if their QQ plot shows a straight line y = a + bx

Normal distributions form a location-scale family


3 types of QQ-plots
1. X-axis: theoretical quantiles of a probability distribution.
y-axis: sample quantiles of this dataset
used to assess whether that particular distribution could be used as a model distribution.
2. X-axis: theoretical quantiles of a probability distribution.
y-axis: theoretical quantiles of another probability distribution
used to compare the shape of two probability distributions, for instance to verify whether
they belong to the same location-scale family.
3. X-axis: sample quantiles of a dataset
y-axis: sample quantiles of another dataset
used to compare the shape of the two data distributions and assess whether they could
possibly originate from two model distributions belonging to the same location-scale family.

How to interpret QQ plots?


Draw straight line through middle of QQ plot
o Points on left side below straight line?
 left tail of sample is heavier than left tail of N(0, 1).
o Points on left side above straight line?
 left tail of N(0, 1) is heavier than left tail of sample.
o Points on right side above straight line?
 right tail of sample is heavier than right tail of N(0, 1).
o Points on right side below straight line?
 right tail of N(0, 1) is heavier than right tail of sample.

How to assess normality of data with QQ plot


o Make a normal QQ plot (sketch below)
o If the points follow approximately a straight line y = a + bx (with slope b > 0), then N(a, b²) is
reasonable as a model distribution.
o If the points don’t follow a straight line: the sample is most likely not from a normal distribution.
In the latter case: the sample is most likely from a location-scale family with lighter or heavier tails than
those of the normal distribution, depending on the shape of the QQ plot.
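A sketch of a normal QQ plot with scipy/matplotlib (assumed available); scipy.stats.probplot puts theoretical N(0, 1) quantiles on the x-axis and sample quantiles on the y-axis, i.e. QQ-plot type 1 above. The data here are made up and truly normal:

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(seed=5)
sample = rng.normal(loc=10, scale=2, size=200)        # made-up data from N(10, 2^2)

# Theoretical N(0, 1) quantiles on the x-axis, sample quantiles on the y-axis.
(osm, osr), (slope, intercept, r) = stats.probplot(sample, dist="norm", plot=plt)
plt.title("Normal QQ plot")
plt.show()

# An approximately straight line y = a + b x (slope b > 0) makes N(a, b^2) a
# reasonable model distribution; here intercept ≈ 10 and slope ≈ 2.
print(intercept, slope)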
