Lecture 5_ MUST

The document provides an overview of inferential statistics, focusing on estimation techniques such as point estimation and interval estimation. It explains the concepts of confidence intervals, hypothesis testing, and the significance of Type I and Type II errors in statistical analysis. Additionally, it outlines the steps involved in hypothesis testing and provides examples to illustrate these concepts.


Inferential Statistics


Harold C Banda

Phone: +265 9997-733-78/8893-733-57


Email : [email protected]
Estimation Using a Single Sample

Inferential Statistics
We use sample data to make generalizations about an
unknown population.
This part of statistics is called inferential statistics.
The sample data help us to make an estimate of a population
parameter. For example:
1 The sample mean, x̄, is the point estimate for the population
mean, µ.
2 The sample standard deviation, s, is the point estimate for the
population standard deviation, σ.

There are two estimation techniques,
1 point estimation and
2 interval estimation.

Point Estimation

Definition
A point estimate of a population characteristic is a single number
that is based on sample data and represents a plausible value of
the characteristic.

A point estimate is obtained by first selecting an appropriate
statistic.
The estimate is then the value of the statistic for the given
sample.
For example, the computed value of the sample mean x̄
provides a point estimate of a population mean µ.
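
As a minimal sketch of point estimation, the snippet below computes x̄ and s for a small sample. The eight data values are hypothetical and chosen only for illustration; they do not come from the lecture.

```python
import statistics

# Hypothetical sample of eight measurements (illustrative values only).
sample = [6.1, 7.4, 6.8, 7.9, 6.5, 7.2, 7.0, 7.1]

x_bar = statistics.mean(sample)   # point estimate of the population mean µ
s = statistics.stdev(sample)      # point estimate of σ (uses the n − 1 divisor)

print(f"sample mean = {x_bar:.3f}, sample standard deviation = {s:.3f}")
```

Each statistic gives a single number, which is what distinguishes a point estimate from an interval estimate.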

Choosing a Statistic for Computing an Estimate


The statistic used should be one that tends to yield an
accurate estimate—that is, an estimate close to the value of
the population characteristic.
Information about the accuracy of estimation for a particular
statistic is provided by the statistic’s sampling distribution.

Definition
A statistic whose mean value is equal to the value of the
population characteristic being estimated is said to be an unbiased
statistic. A statistic that is not unbiased is said to be biased.
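
The idea of an unbiased statistic can be checked by simulation. The sketch below draws many samples from an assumed normal population (mean 50, standard deviation 10, both hypothetical) and averages three statistics over the repeated samples: the sample mean, the sample variance with divisor n − 1, and the variance with divisor n.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 50.0, 10.0   # hypothetical normal population (assumed values)
N, REPS = 5, 5000               # sample size and number of repeated samples

means, var_n_minus_1, var_n = [], [], []
for _ in range(REPS):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    means.append(statistics.mean(sample))
    var_n_minus_1.append(statistics.variance(sample))   # divisor n − 1
    var_n.append(statistics.pvariance(sample))          # divisor n

avg_mean = statistics.mean(means)
avg_var_unbiased = statistics.mean(var_n_minus_1)
avg_var_biased = statistics.mean(var_n)
print(avg_mean, avg_var_unbiased, avg_var_biased)
```

Under these assumptions the average of the sample means should land close to 50 and the average of the n − 1 variances close to 100 (the population variance), while the divisor-n version tends to fall short, which is what "biased" means here.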


Confidence Interval

Large-Sample Confidence Interval for a Population Proportion

Definition
A confidence interval (CI) for a population characteristic is a
range of plausible values for the characteristic.

It is constructed so that, with a chosen degree of confidence,
the value of the characteristic will be captured between the
lower and upper endpoints of the interval.

Definition
The confidence level associated with a confidence interval
estimate is the success rate of the method used to construct the
interval.

The confidence level provides information on how much
“confidence” we can have in the method used to construct
the interval estimate (not our confidence in any one particular
interval).
Usual choices for confidence levels are 90%, 95%, and 99%,
although other levels are also possible.
If we are constructing a 95% confidence interval, then we
would be using a method that is “successful” 95% of the time.

Calculating the Confidence Interval


To construct a confidence interval for a single unknown
population mean µ , where the population standard deviation
is known, we need x̄ as an estimate for µ and we need the
margin of error. Here, the margin of error is called the
error bound for a population mean (abbreviated EBM).
Because the sample mean x̄ is the point estimate of the unknown
population mean µ, the confidence interval estimate has
the form (point estimate − error bound, point estimate
+ error bound) or, in symbols, (x̄ − EBM, x̄ + EBM).
The margin of error depends on the confidence level
(abbreviated CL). The confidence level is often considered the
probability that the calculated confidence interval estimate
will contain the true population parameter.

However, it is more accurate to state that the confidence level
is the percent of confidence intervals that contain the true
population parameter when repeated samples are taken.
Most often, the person constructing the confidence interval
chooses a confidence level of 90% or higher, in order to be
reasonably certain of the conclusions.
There is another probability called alpha (α). α is related to
the confidence level CL. α is the probability that the interval
does not contain the unknown population parameter.
Mathematically, α + CL = 1.

Example
Suppose we have collected data from a sample. We know the
sample mean but we do not know the mean for the entire
population. The sample mean is 7 and the error bound for the
mean is 2.5.
From the example above, x̄ = 7 and EBM = 2.5.
The confidence interval is (7 − 2.5, 7 + 2.5); calculating the
values gives (4.5, 9.5).
If the confidence level (CL) is 95%, then we say: "We
estimate with 95% confidence that the true value of the
population mean is between 4.5 and 9.5."
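
The arithmetic of this example can be written as a short sketch. The function name confidence_interval is a hypothetical helper introduced here for illustration.

```python
def confidence_interval(point_estimate, error_bound):
    """Return (lower, upper) = (estimate − bound, estimate + bound)."""
    return point_estimate - error_bound, point_estimate + error_bound

lower, upper = confidence_interval(7, 2.5)   # x̄ = 7, EBM = 2.5 from the example
alpha = 1 - 0.95                             # α + CL = 1 with CL = 0.95
print(lower, upper)
```

With CL = 0.95, the relation α + CL = 1 gives α = 0.05.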

Probability Levels and interval estimates


Probability level (α)   Confidence coefficient (%)   Z-value
0.10                    90                           1.64
0.05                    95                           1.96
0.01                    99                           2.58

The significant difference is the degree of difference
between the sample mean (x̄) and the population mean (µ) that leads
to the rejection of the null hypothesis. This is because, at
α = 0.05, a difference that large has only a 5% or smaller
chance of occurring if the null hypothesis is true.
The critical or rejection region is the range of values of the
test statistic that indicates that there is a significant
difference and that the null hypothesis should be rejected. This
tells us that the degree of difference between the two means
cannot wholly be explained by chance.
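
The z-values in the table can be reproduced from the standard normal distribution. The sketch below uses Python's statistics.NormalDist; the helper name two_sided_z is an assumption made for this illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def two_sided_z(confidence_level):
    """z such that the central area under the curve equals confidence_level."""
    alpha = 1 - confidence_level
    return std_normal.inv_cdf(1 - alpha / 2)

for cl in (0.90, 0.95, 0.99):
    print(f"CL = {cl:.2f}  z = {two_sided_z(cl):.3f}")
```

To three decimals the values are 1.645, 1.960 and 2.576; the table lists them rounded to two decimals.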

Normal Curve

Figure: Normal curve showing acceptance and rejection regions with a
significance level (α) of 0.05

Confidence Intervals

Example
A study is conducted concerning the blood pressure of 60-year-old
women with glaucoma. In the study, 200 60-year-old women with
glaucoma are randomly selected. The sample mean systolic
blood pressure is 140 mm Hg and the sample standard deviation is
25 mm Hg. Calculate a 95% confidence interval for the true mean
systolic blood pressure among the population of 60-year-old women
with glaucoma.
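
One way to work this example is sketched below. It assumes that, because n = 200 is large, the sample standard deviation s can stand in for σ and the z-value for 95% confidence can be used, so that EBM = z · s/√n.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, s = 200, 140.0, 25.0        # values given in the example
confidence_level = 0.95

z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)   # about 1.96
ebm = z * s / sqrt(n)                                       # error bound for the mean
lower, upper = x_bar - ebm, x_bar + ebm
print(f"EBM = {ebm:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Under these assumptions the error bound is about 3.5 mm Hg, giving an interval of roughly (136.5, 143.5) mm Hg.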

Hypotheses and Test Procedures

Definition
A statistical hypothesis is a conjecture about a population
parameter. This conjecture may or may not be true.

Definition
A test of hypotheses or test procedure is a method that uses
sample data to decide between two competing claims (hypotheses)
about a population characteristic.

Null and Alternative Hypotheses

Definition
The null hypothesis, denoted by H0 , is a claim about a
population characteristic that is initially assumed to be true.
The alternative hypothesis, denoted by Ha , is the competing
claim.

In carrying out a test of H0 versus Ha (or H1), the hypothesis
H0 will be rejected in favor of Ha only if sample evidence
strongly suggests that H0 is false.
If the sample does not provide such evidence, H0 will not be
rejected.
The two possible conclusions are then reject H0 or fail to
reject H0 .

Alternative Hypothesis
For example, suppose we want to test the hypothesis that a
population mean (µ) is equal to 55. The null hypothesis is:
H0 : µ = 55.
Here µ is the true value and 55 is the assumed value.
The three possible alternative hypotheses are therefore:
1 H1 : µ ≠ 55 ⇒ two-tailed test.
2 H1 : µ > 55 ⇒ right-tailed test.
3 H1 : µ < 55 ⇒ left-tailed test.
A statistical test uses the data obtained from a sample to
make a decision about whether the null hypothesis should be
rejected.

Alternative Hypothesis...cont’d
A one-tailed test is either right-tailed, when the inequality
sign in the alternative hypothesis is >, or left-tailed, when the
inequality sign is <. It indicates that H0 should be rejected
when the test value is in the critical region on one side of
the mean.
In a two-tailed test, the null hypothesis should be rejected
when the test value (the numerical value obtained from a
statistical test) is in either of the two critical regions.

Note:
It is important to establish a criterion for rejecting or not
rejecting the null hypothesis. In that regard, the level of
risk you are willing to accept of rejecting a true null
hypothesis is the level of significance (α).
It is also worth noting that, theoretically, a test never proves
that a hypothesis is true but merely provides statistical
evidence for not rejecting a hypothesis.
When you reject the null hypothesis, there are five possible
explanations:

1 There is a direct cause-and-effect relationship between the
variables. For example, an increase in the height of a maize
plant brings about an increase in its yield.
2 There is a reverse cause-and-effect relationship between the
variables.
3 The relationship between the variables may be caused by a
third variable.
4 There may be a complexity of interrelationships among many
variables.
5 The relationship may be coincidental.

Type I and Type II Errors


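In a hypothesis testing situation, there are four possible
outcomes.
That is, two possibilities of incorrect decisions, together with
the two possibilities for correct decisions.
These are listed in the table below.

                      H0 is true          H0 is false
Reject H0             Type I error        Correct decision
Fail to reject H0     Correct decision    Type II error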
Errors in Hypothesis Testing

A Type I error occurs if one rejects the null hypothesis when
it is true; that is, a true hypothesis is rejected.
A Type II error occurs if you do not reject the null
hypothesis when it is false; that is, a false hypothesis is
erroneously accepted as true.

Definition
The probability of a Type I error is denoted by α and is called
the level of significance of the test.
Thus, a test with α = 0.01 is said to have a level of significance of
0.01 or to be a level 0.01 test.
The probability of a Type II error is denoted by β.

Power of a test

Definition
The probability of correctly rejecting H0 when it is false is called
the power of the test.
For any particular value of µ, the power is 1 − β.
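
The relation power = 1 − β can be made concrete for an upper-tail z test with known σ. The sketch below is one possible calculation; the function name power_upper_tail_z and all numeric inputs (µ0 = 12, σ = 3.2, n = 40, α = 0.05, and the list of "true" means) are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

def power_upper_tail_z(mu0, mu_true, sigma, n, alpha):
    """Power of the upper-tail z test of H0: µ ≤ mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)
    x_crit = mu0 + std_normal.inv_cdf(1 - alpha) * se     # reject H0 when x̄ ≥ x_crit
    beta = std_normal.cdf((x_crit - mu_true) / se)        # P(fail to reject | mu_true)
    return 1 - beta

# Hypothetical inputs chosen only for illustration.
for mu_true in (12.5, 13.0, 13.5, 14.0):
    print(mu_true, round(power_upper_tail_z(12, mu_true, 3.2, 40, 0.05), 3))
```

Plotting power against the true value of µ gives a power curve: the further the true mean lies from the hypothesized value, the closer the power gets to 1.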
Power Curve

[Figure: power curve]

Hypothesis Tests for a Population Mean


The test procedures in this case are based on the two results
that lead to the z and t confidence intervals.
1 When either n ≥ 30 (n is large) or the population distribution is
approximately normal, then z = (x̄ − µ)/(σ/√n) has approximately a
standard normal distribution.
2 When n < 30 (n is small) and the population distribution is
approximately normal, then t = (x̄ − µ)/(s/√n) has a
t−distribution with df = n − 1.
Statistical tests: Parametric tests

Approaches to Hypothesis Testing


1 p−value approach.

2 Critical value approach.

The p−value approach


1 The p−value is the probability, computed using the test
statistic, that measures the support (or lack of support)
provided by the sample for the null hypothesis.
2 If the p−value is less than or equal to the level of significance
α, the value of the test statistic is in the rejection region.
3 Reject H0 if the p−value ≤ α.

The Critical Value Approach


1 The test statistic z = (x̄ − µ)/(σ/√n), n ≥ 30, has a standard normal
probability distribution.
2 We can use the standard normal probability distribution table
to find the z−value with an area of α in the lower (or upper)
tail of the distribution.
3 The value of the test statistic that establishes the boundary
of the rejection region is called the critical value for the test.
4 The Rejection rule is:
a Lower tail: Reject H0 if z ≤ −zα
b Upper tail: Reject H0 if z ≥ zα .
Note: The above procedures are for a one-tailed test. You need to
read about how the same approaches are used in two-tailed tests.
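
Both approaches can be sketched in one function. The code below is an illustrative one-sample z test, not a library routine: the name z_test, its tail argument and the sample numbers are all assumptions. For the two-tailed case it uses the usual rule of rejecting H0 when z ≤ −zα/2 or z ≥ zα/2.

```python
from math import sqrt
from statistics import NormalDist

def z_test(x_bar, mu0, sigma, n, alpha, tail):
    """One-sample z test; tail is 'lower', 'upper', or 'two'.

    Returns (z, p_value, reject_by_p_value, reject_by_critical_value).
    """
    std_normal = NormalDist()
    z = (x_bar - mu0) / (sigma / sqrt(n))
    if tail == "lower":
        p_value = std_normal.cdf(z)
        reject_cv = z <= -std_normal.inv_cdf(1 - alpha)
    elif tail == "upper":
        p_value = 1 - std_normal.cdf(z)
        reject_cv = z >= std_normal.inv_cdf(1 - alpha)
    elif tail == "two":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        reject_cv = abs(z) >= std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("tail must be 'lower', 'upper', or 'two'")
    return z, p_value, p_value <= alpha, reject_cv

# Hypothetical numbers: x̄ = 52, µ0 = 50, σ = 6, n = 36, α = 0.05.
print(z_test(52, 50, 6, 36, 0.05, "two"))
```

The p−value rule and the critical value rule are two views of the same rejection region, so the last two returned values should agree for any input.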

Steps in Hypothesis Testing


Every hypothesis testing situation begins with the statement
of hypothesis.
1 Develop the null and alternative hypotheses.
2 Specify the level of significance α.
3 Collect the sample data and compute the test statistic.
4 Choose either the p−value approach or the critical value approach
to test the null hypothesis.
5 Make a conclusion.

Testing A Hypothesis Involving A Mean (σ known)

Example
A major hospital provides one of the most comprehensive
emergency medical services in Malawi. Operating in a multiple
hospital system with approximately 40 mobile medical units, the
service goal is to respond to medical emergencies with a mean time
of 12 minutes or less. The response times for a random sample of
40 medical emergencies were tabulated. The sample mean is 13.25
minutes. The population standard deviation is believed to be 3.2
minutes. Perform a hypothesis test, with a 0.05 level of
significance, to determine whether the service goal of 12 minutes
or less is being achieved.
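
A possible solution sketch for this example follows. It assumes the hypotheses H0: µ ≤ 12 versus Ha: µ > 12, which makes this an upper-tail z test with σ known.

```python
from math import sqrt
from statistics import NormalDist

# H0: µ ≤ 12 (goal met) versus Ha: µ > 12 (goal not met): an upper-tail test.
mu0, x_bar, sigma, n, alpha = 12.0, 13.25, 3.2, 40, 0.05

z = (x_bar - mu0) / (sigma / sqrt(n))
p_value = 1 - NormalDist().cdf(z)
z_alpha = NormalDist().inv_cdf(1 - alpha)

print(f"z = {z:.2f}, p-value = {p_value:.4f}, critical value = {z_alpha:.3f}")
print("Reject H0" if p_value <= alpha else "Do not reject H0")
```

Here z is about 2.47, which exceeds the critical value of about 1.645, and the p−value is roughly 0.007, well below 0.05. Under the stated assumptions H0 is rejected, so the sample suggests the 12-minute goal is not being met.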

Testing A Hypothesis Involving A Mean (σ unknown)


We use the t−test when the standard deviation for the
population is unknown.
The t−test is also used when n < 30.
Test statistic: t = (x̄ − µ)/(s/√n).
This test statistic has a t distribution with n − 1 degrees of
freedom.
Rejection Rule: p−Value Approach: Reject H0 if p−value ≤ α.
Rejection Rule: Critical Value Approach
1 H0 : µ ≥ µ0 : Reject H0 if t ≤ −tα .
2 H0 : µ ≤ µ0 : Reject H0 if t ≥ tα .
3 H0 : µ = µ0 : Reject H0 if t ≤ −tα/2 or t ≥ tα/2 .

Example
A State Highway Patrol periodically samples vehicle speeds at
various locations on a particular roadway. The sample of vehicle
speeds is used to test the hypothesis H0 : µ ≤ 65. The locations
where H0 is rejected are deemed the best locations for speed traps.
At Location F , a sample of 25 vehicles shows a mean speed of
66.2 km/h with a standard deviation of 4.2 km/h. Use α = 0.05 to
test the hypothesis.
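
A sketch of the critical value approach for Location F is given below. The critical value 1.711 for an upper-tail area of 0.05 with 24 degrees of freedom is taken from a standard t table and hard-coded, since Python's standard library has no t distribution.

```python
from math import sqrt

# H0: µ ≤ 65 versus Ha: µ > 65 (upper-tail test), values from the example.
mu0, x_bar, s, n, alpha = 65.0, 66.2, 4.2, 25, 0.05

t = (x_bar - mu0) / (s / sqrt(n))
df = n - 1

# Upper-tail critical value t_0.05 with 24 degrees of freedom, read from a
# standard t table (hard-coded because the standard library has no t distribution).
T_CRITICAL = 1.711

reject = t >= T_CRITICAL
print(f"t = {t:.3f}, df = {df}, critical value = {T_CRITICAL}")
print("Reject H0" if reject else "Do not reject H0")
```

The statistic is about 1.43, below the critical value, so H0 is not rejected at α = 0.05. On this evidence Location F would not be singled out for a speed trap.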
Hypothesis Tests

Hypothesis Tests: Population Proportion


The equality part of the hypotheses always appears in the null
hypothesis.
In general, a hypothesis test about the value of a population
proportion p must take one of the following three forms
(where p0 is the hypothesized value of the population
proportion):
1 H0 : p ≥ p0 vs Ha : p < p0 , one-tailed (lower tail).
2 H0 : p ≤ p0 vs Ha : p > p0 , one-tailed (upper tail).
3 H0 : p = p0 vs Ha : p ≠ p0 , two-tailed.

Tests About a Population Proportion


Test statistic: z = (p̄ − p0)/σp̄ , where σp̄ = √(p0 (1 − p0)/n)
(assuming np > 5 and n(1 − p) > 5). A common error is using p̄
in place of p0 in the formula for σp̄ .
Rejection Rule: p−Value Approach: Reject H0 if p−value ≤ α.
Rejection Rule: Critical Value Approach
1 H0 : p ≤ p0 , Reject H0 if z ≥ zα
2 H0 : p ≥ p0 , Reject H0 if z ≤ −zα
3 H0 : p = p0 , Reject H0 if z ≤ −zα/2 or z ≥ zα/2 .

Example
For a Christmas and New Year’s week, the National Safety Council
estimated that 500 people would be killed and 25,000 injured on
the nation’s roads. The NSC claimed that 50% of the accidents
would be caused by drunk driving. A sample of 120 accidents
showed that 67 were caused by drunk driving. Use these data to
test the NSC’s claim with α = 0.05.
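
One way to carry out this test is sketched below. It assumes the two-tailed form H0: p = 0.50 versus Ha: p ≠ 0.50, because the claim is an exact value of 50%.

```python
from math import sqrt
from statistics import NormalDist

# H0: p = 0.50 versus Ha: p ≠ 0.50 (two-tailed), values from the example.
p0, successes, n, alpha = 0.50, 67, 120, 0.05

p_bar = successes / n
sigma_p_bar = sqrt(p0 * (1 - p0) / n)      # uses p0, not the sample proportion
z = (p_bar - p0) / sigma_p_bar
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
z_half_alpha = NormalDist().inv_cdf(1 - alpha / 2)

print(f"sample proportion = {p_bar:.4f}, z = {z:.2f}, p-value = {p_value:.3f}")
print("Reject H0" if abs(z) >= z_half_alpha else "Do not reject H0")
```

The sample proportion is about 0.558 and z is about 1.28, inside the interval (−1.96, 1.96), with a p−value of roughly 0.20. Under these assumptions H0 is not rejected, so the sample does not contradict the NSC's claim.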
