Lecture 5_ MUST
Inferential Statistics
Harold C Banda
We use sample data to make generalizations about an
unknown population.
This part of statistics is called inferential statistics.
The sample data help us to estimate a population parameter.
For example:
1 The sample mean, x̄, is the point estimate for the population
mean, µ.
2 The sample standard deviation, s, is the point estimate for the
population standard deviation, σ.
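As a small illustration of these two point estimates, the sketch below computes x̄ and s with Python's standard library. The sample values are hypothetical, invented only for this example.

```python
import statistics

# Hypothetical sample values, invented purely for illustration.
sample = [4, 8, 6, 5, 7]

x_bar = statistics.mean(sample)  # point estimate of the population mean µ
s = statistics.stdev(sample)     # point estimate of σ (uses the n - 1 divisor)

print(f"x-bar = {x_bar}, s = {s:.4f}")
```

Note that `statistics.stdev` is the sample standard deviation; `statistics.pstdev` would divide by n instead and is meant for a full population.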
There are two estimation techniques:
1 Point estimation
2 Interval estimation
Point Estimation
Definition
A point estimate of a population characteristic is a single number
that is based on sample data and represents a plausible value of
the characteristic.
Definition
A statistic whose mean value is equal to the value of the
population characteristic being estimated is said to be an unbiased
statistic. A statistic that is not unbiased is said to be biased.
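One way to see what "unbiased" means is to list every possible sample and average the statistic over all of them. The sketch below does this for a hypothetical three-value population, assuming samples of size 2 drawn with replacement (so all ordered samples are equally likely). It compares the sample mean and two versions of the sample variance against the population values.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3]  # hypothetical tiny population, for illustration only
N = len(population)
mu = Fraction(sum(population), N)                         # population mean
sigma2 = sum((x - mu) ** 2 for x in population) / N       # population variance

n = 2
# Every ordered sample of size n drawn with replacement (all equally likely).
samples = list(product(population, repeat=n))

def mean(xs):
    return Fraction(sum(xs), len(xs))

def var(xs, divisor):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / divisor

avg_of_means = mean([mean(s) for s in samples])
avg_var_n_minus_1 = mean([var(s, n - 1) for s in samples])
avg_var_n = mean([var(s, n) for s in samples])

print("mean of sample means:", avg_of_means, "vs mu =", mu)
print("mean of variances (n - 1 divisor):", avg_var_n_minus_1, "vs sigma^2 =", sigma2)
print("mean of variances (n divisor):", avg_var_n)
```

Under these assumptions the sample mean and the variance with the n − 1 divisor average out exactly to the population values, while the variance with the n divisor averages out too low, so it is a biased statistic.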
Definition
A confidence interval (CI) for a population characteristic is a
range of plausible values for the characteristic.
Definition
The confidence level associated with a confidence interval
estimate is the success rate of the method used to construct the
interval.
Confidence Interval
Example
Suppose we have collected data from a sample. We know the
sample mean but we do not know the mean for the entire
population. The sample mean is 7 and the error bound for the
mean is 2.5.
From the example above, x̄ = 7 and EBM = 2.5.
The confidence interval is (7 − 2.5, 7 + 2.5); calculating the
values gives (4.5, 9.5).
If the confidence level (CL) is 95%, then we say: "We
estimate with 95% confidence that the true value of the
population mean is between 4.5 and 9.5."
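The arithmetic in this example can be written out as a short sketch. The variable names are only illustrative.

```python
x_bar = 7.0  # sample mean from the example
ebm = 2.5    # error bound for the mean (EBM) from the example

# The interval is centred on the sample mean and extends EBM on each side.
lower, upper = x_bar - ebm, x_bar + ebm

print(f"Confidence interval: ({lower}, {upper})")
```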
Calculating the Confidence Interval
Probability Levels and interval estimates
Normal Curve
Confidence Intervals
Example
A study is conducted on the blood pressure of 60-year-old
women with glaucoma. In the study, 200 60-year-old women with
glaucoma are randomly selected. The sample mean systolic
blood pressure is 140 mm Hg and the sample standard deviation is
25 mm Hg. Calculate a 95% confidence interval for the true mean
systolic blood pressure among the population of 60-year-old women
with glaucoma.
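One possible way to work this example is sketched below. It assumes that, because the sample is large (n = 200), the standard normal critical value can be used with s in place of σ, so that EBM = z · s / √n. This is one reasonable approach, not necessarily the method intended in the lecture.

```python
from math import sqrt
from statistics import NormalDist

n = 200          # sample size
x_bar = 140.0    # sample mean systolic blood pressure (mm Hg)
s = 25.0         # sample standard deviation (mm Hg)
confidence = 0.95

# Two-sided critical value: 2.5% of the area is left in each tail.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Error bound for the mean, using s as a stand-in for sigma (large-sample assumption).
ebm = z * s / sqrt(n)
lower, upper = x_bar - ebm, x_bar + ebm

print(f"z = {z:.3f}, EBM = {ebm:.2f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f}) mm Hg")
```

Under these assumptions the interval comes out at roughly 136.5 to 143.5 mm Hg.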
Hypotheses and Test Procedures
Definition
A statistical hypothesis is a conjecture about a population
parameter. This conjecture may or may not be true.
Definition
A test of hypotheses or test procedure is a method that uses
sample data to decide between two competing claims (hypotheses)
about a population characteristic.
Definition
The null hypothesis, denoted by H0, is a claim about a
population characteristic that is initially assumed to be true.
The alternative hypothesis, denoted by Ha, is the competing
claim.
Alternative Hypothesis
For example, suppose we want to test the hypothesis that a
population mean (µ) is equal to 55. The null hypothesis is:
H0 : µ = 55,
where µ is the true population mean and 55 is the hypothesized value.
The three possible alternative hypotheses are:
1 Ha : µ ≠ 55 ⇒ two-tailed test.
2 Ha : µ > 55 ⇒ right-tailed test.
3 Ha : µ < 55 ⇒ left-tailed test.
A statistical test uses the data obtained from a sample to
make a decision about whether the null hypothesis should be
rejected.
Alternative Hypothesis...cont’d
A one-tailed test is either right-tailed, when the inequality
sign in the alternative hypothesis is >, or left-tailed, when the
inequality sign is <. H0 should be rejected when the test value is
in the critical region on one side of the mean.
In a two-tailed test, the null hypothesis should be rejected
when the test value (numerical value obtained from a
statistical test) is in either of the two critical regions.
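The rejection rules for the three kinds of test can be sketched as a small function. It assumes a z test value and a chosen significance level alpha. The function name `reject_h0` and the `tail` labels are hypothetical, introduced only for this illustration.

```python
from statistics import NormalDist

def reject_h0(z, alpha, tail):
    """Return True if the test value z falls in the critical region.

    tail is "two", "right" or "left" (labels chosen for this sketch).
    """
    std_normal = NormalDist()
    if tail == "two":
        # Critical regions in both tails, each with area alpha / 2.
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    if tail == "right":
        # Single critical region in the upper tail with area alpha.
        return z > std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        # Single critical region in the lower tail with area alpha.
        return z < std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'right' or 'left'")

# The same test value can lead to different decisions depending on the tail.
print(reject_h0(1.8, 0.05, "right"))  # compares 1.8 with about 1.645
print(reject_h0(1.8, 0.05, "two"))    # compares |1.8| with about 1.960
```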
Note:
It is important to establish a criterion for rejecting or not
rejecting the null hypothesis. The level of risk you are willing
to accept of rejecting a true null hypothesis is the level of
significance (α).
It is also worthy of note that theoretically a test never proves
that a hypothesis is true but merely provides statistical
evidence for not rejecting a hypothesis.
When you reject the Null Hypothesis, the five possibilities are:
Note:...cont’d
There may be a complexity of interrelationships among many
variables.
The relationship may be coincidental.
Definition
A Type I error is rejecting H0 when it is true. The probability of
a Type I error is denoted by α and is called the level of
significance of the test.
Thus, a test with α = 0.01 is said to have a level of significance of
0.01 or to be a level 0.01 test.
A Type II error is failing to reject H0 when it is false. The
probability of a Type II error is denoted by β.
Errors in Hypothesis Testing
Power of a test
Definition
The probability of correctly rejecting H0 when it is false is called
the power of the test.
For any particular value of µ, the power is 1 − β.
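To make the relationship power = 1 − β concrete, the sketch below computes the power of a right-tailed z test for a given true mean. It assumes a known population standard deviation and a normally distributed sample mean. The function name and the numbers passed to it are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def power_right_tailed(mu_true, mu0, sigma, n, alpha):
    """Power of a right-tailed z test of H0: mu <= mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)          # rejection threshold on the z scale
    shift = (mu_true - mu0) / (sigma / sqrt(n))     # how far the true mean moves the z statistic
    beta = std_normal.cdf(z_crit - shift)           # probability of failing to reject H0
    return 1 - beta

# Illustrative values only: mu0 = 12, sigma = 3.2, n = 40, alpha = 0.05.
for mu_true in (12.0, 12.5, 13.0, 13.5):
    print(mu_true, round(power_right_tailed(mu_true, 12.0, 3.2, 40, 0.05), 3))
```

When the true mean equals the hypothesized value, the computed probability equals α, and it rises as the true mean moves further into the alternative. Plotting these values against µ gives a power curve.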
Power Curve
Hypothesis Tests for a Population Mean
Example
A major hospital provides one of the most comprehensive
emergency medical services in Malawi. Operating in a multiple
hospital system with approximately 40 mobile medical units, the
service goal is to respond to medical emergencies with a mean time
of 12 minutes or less. The response times for a random sample of
40 medical emergencies were tabulated. The sample mean is 13.25
minutes. The population standard deviation is believed to be 3.2
minutes. Perform a hypothesis test, with a 0.05 level of
significance, to determine whether the service goal of 12 minutes
or less is being achieved.
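A possible worked sketch of this example follows. It assumes the hypotheses H0 : µ ≤ 12 and Ha : µ > 12 (a right-tailed test) and uses a z test because the population standard deviation is given. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 12.0     # service goal (minutes), boundary value under H0
n = 40         # sample size
x_bar = 13.25  # sample mean response time (minutes)
sigma = 3.2    # population standard deviation (minutes)
alpha = 0.05   # level of significance

# Test statistic: how many standard errors the sample mean lies above mu0.
z = (x_bar - mu0) / (sigma / sqrt(n))

# Right-tailed p-value: area under the standard normal curve above z.
p_value = 1 - NormalDist().cdf(z)
reject = p_value <= alpha

print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Under these assumptions z is about 2.47, which is beyond the 0.05 critical value of about 1.645, so H0 is rejected. The sample gives evidence that the 12-minute goal is not being met.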
Statistical tests: Parametric tests
Testing A Hypothesis Involving A Mean
Example
A State Highway Patrol periodically samples vehicle speeds at
various locations on a particular roadway. The sample of vehicle
speeds is used to test the hypothesis H0 : µ ≤ 65. The locations
where H0 is rejected are deemed the best locations for speed traps.
At Location F , a sample of 25 vehicles shows a mean speed of
66.2 km/h with a standard deviation of 4.2 km/h. Use α = 0.05 to
test the hypothesis.
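One way to work this example is sketched below. It assumes a right-tailed t test with n − 1 = 24 degrees of freedom, since only the sample standard deviation is available and the sample is small. Python's standard library has no t distribution, so the critical value 1.711 is hard-coded from a standard t table (upper-tail area 0.05, 24 degrees of freedom).

```python
from math import sqrt

mu0 = 65.0    # hypothesized mean speed under H0: mu <= 65
n = 25        # sample size
x_bar = 66.2  # sample mean speed (km/h)
s = 4.2       # sample standard deviation (km/h)

# Test statistic for a one-sample t test.
t_stat = (x_bar - mu0) / (s / sqrt(n))

# Critical value for alpha = 0.05, right tail, df = 24 (taken from a t table).
t_crit = 1.711
reject = t_stat > t_crit

print(f"t = {t_stat:.3f}, critical value = {t_crit}, reject H0: {reject}")
```

Under these assumptions t is about 1.43, which is below 1.711, so H0 is not rejected at Location F.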
Hypothesis Tests
Example
For a Christmas and New Year’s week, the National Safety Council
estimated that 500 people would be killed and 25,000 injured on
the nation’s roads. The NSC claimed that 50% of the accidents
would be caused by drunk driving. A sample of 120 accidents
showed that 67 were caused by drunk driving. Use these data to
test the NSC’s claim with α = 0.05.
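A possible sketch of this test follows. It assumes the hypotheses H0 : p = 0.50 and Ha : p ≠ 0.50 (two-tailed) and uses the large-sample z test for a proportion. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.50     # claimed proportion of accidents caused by drunk driving
n = 120       # number of accidents sampled
x = 67        # accidents in the sample caused by drunk driving
alpha = 0.05  # level of significance

p_hat = x / n                          # sample proportion
std_error = sqrt(p0 * (1 - p0) / n)    # standard error computed under H0
z = (p_hat - p0) / std_error

# Two-tailed p-value: area in both tails beyond |z|.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject = p_value <= alpha

print(f"p-hat = {p_hat:.4f}, z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Under these assumptions z is about 1.28, which lies inside ±1.96, so the data do not give enough evidence to reject the NSC's claim.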