
Lecture 7 Test

This lecture discusses hypothesis testing procedures for several population parameters: 1. Tests about a population mean, including situations where the population standard deviation is known or unknown, and for large or small sample sizes. 2. Tests concerning a population proportion. 3. Tests for the difference between two population means and between two population proportions. 4. Analysis of paired data and simple linear regression. Key aspects of hypothesis testing such as the null and alternative hypotheses, types of errors, and rejection regions are reviewed. Examples are provided to demonstrate applications of the procedures.


Nguyễn Ngọc Tứ

Lecture 7

Nguyễn Ngọc Tứ Lecture 7 2023 - 2024 1 / 34


Lecture outline

• Tests About a Population Mean
• Tests Concerning a Population Proportion
• Tests for a Difference Between Two Population Means
• Inferences Concerning a Difference Between Population Proportions
• Analysis of Paired Data
• Simple Linear Regression


Hypotheses and Test Procedures

A statistical hypothesis is a claim or assertion either about the value of a single parameter (population characteristic or characteristic of a probability distribution), about the values of several parameters, or about the form of an entire probability distribution.
The null hypothesis, denoted by $H_0$, is the claim that is initially assumed to be true.
The alternative hypothesis, denoted by $H_a$, is the assertion that is contradictory to $H_0$.
The alternative to the null hypothesis will look like one of the following three assertions:
1. $H_a: \theta > \theta_0$ (in which case the implicit null hypothesis is $H_0: \theta \le \theta_0$)
2. $H_a: \theta < \theta_0$ (in which case the implicit null hypothesis is $H_0: \theta \ge \theta_0$)
3. $H_a: \theta \ne \theta_0$


Hypotheses and Test Procedures

A test of hypotheses is a method for using sample data to decide whether the null hypothesis $H_0$ should be rejected. A test procedure is specified by the following:
1. A test statistic, a function of the sample data on which the decision (reject $H_0$ or do not reject $H_0$) is to be based
2. A rejection region, the set of all test statistic values for which $H_0$ will be rejected
The null hypothesis $H_0$ will then be rejected if and only if the observed or computed test statistic value falls in the rejection region.


Hypotheses and Test Procedures
Errors in Hypothesis Testing
1. A type I error consists of rejecting the null hypothesis $H_0$ when it is true.
2. A type II error involves not rejecting $H_0$ when $H_0$ is false.
Table of Errors
                          Actual population value
Decision                  $H_0$ true            $H_a$ true
Reject $H_0$              Type I error          Correct decision
Fail to reject $H_0$      Correct decision      Type II error

The significance level, $\alpha$, is the probability of making a type I error when the null hypothesis is true.

$$P(\text{Type I error}) = \alpha$$
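The meaning of the significance level can be checked by simulation. The sketch below is an illustration only, not part of the lecture: it assumes a normal population with known $\sigma$, uses the two-tailed z test for a mean, and repeatedly draws samples for which $H_0$ is true. The function name `type_one_error_rate` and all parameter values are hypothetical choices.

```python
import random
from statistics import NormalDist, fmean


def type_one_error_rate(alpha=0.05, mu0=0.0, sigma=1.0, n=9,
                        trials=20000, seed=0):
    """Estimate P(reject H0 | H0 true) for a two-tailed z test by simulation."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # critical value z_{alpha/2}
    rejections = 0
    for _ in range(trials):
        # Draw a sample for which H0: mu = mu0 really is true.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / n ** 0.5)
        if abs(z) >= z_crit:                       # statistic falls in the rejection region
            rejections += 1
    return rejections / trials


rate = type_one_error_rate()
print(f"estimated type I error rate: {rate:.3f}")
```

The estimated rate should be close to the chosen $\alpha$, since every rejection in this simulation is a type I error by construction.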


Tests About a Population Mean

A Normal Population with Known σ


Let $X_1, \dots, X_n$ represent a random sample of size $n$ from a normal population with expected value $\mu$ and standard deviation $\sigma$. Then

$$\bar{X} \sim N\left(\mu_{\bar{X}} = \mu,\ \sigma_{\bar{X}}^2 = \sigma^2/n\right)$$

Null hypothesis $H_0: \mu = \mu_0$
Test statistic value:
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
Alternative hypothesis      Rejection region for level $\alpha$ test
$H_a: \mu > \mu_0$          $z \ge z_\alpha$ (upper-tailed test)
$H_a: \mu < \mu_0$          $z \le -z_\alpha$ (lower-tailed test)
$H_a: \mu \ne \mu_0$        $|z| \ge z_{\alpha/2}$ (two-tailed test)


A Normal Population with Known σ - Example
A manufacturer of sprinkler systems used for fire protection in office buildings claims that the true average system-activation temperature is 130°F. A sample of $n = 9$ systems, when tested, yields a sample average activation temperature of 131.08°F. If the distribution of activation temperatures is normal with standard deviation 1.5°F, does the data contradict the manufacturer's claim at significance level $\alpha = 0.01$?
Solution. Parameter of interest: $\mu$ = true average activation temperature.
Null hypothesis: $H_0: \mu = 130$ (null value $\mu_0 = 130$).
Alternative hypothesis: $H_a: \mu \ne 130$
Test statistic value:
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{\bar{x} - 130}{1.5/\sqrt{n}} = \frac{131.08 - 130}{1.5/\sqrt{9}} = 2.16$$
Rejection region: for $\alpha = 0.01$ we have $z_{\alpha/2} = z_{0.005} = 2.58$, so $H_0$ is rejected if $|z| \ge 2.58$.
Comparison: $|z| = 2.16 < z_{\alpha/2} = 2.58$, so we do not reject $H_0$.
The data does not give strong support to the claim that the true average differs from the design value of 130.

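The z test for a normal population with known $\sigma$ can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `z_test_mean` and its `tail` argument are hypothetical, and the critical value comes from `statistics.NormalDist` rather than a printed table.

```python
from statistics import NormalDist


def z_test_mean(xbar, mu0, sigma, n, alpha, tail="two"):
    """Return (z, critical value, reject H0?) for a level-alpha z test on a mean."""
    z = (xbar - mu0) / (sigma / n ** 0.5)
    std_normal = NormalDist()
    if tail == "two":            # Ha: mu != mu0
        crit = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) >= crit
    elif tail == "upper":        # Ha: mu > mu0
        crit = std_normal.inv_cdf(1 - alpha)
        reject = z >= crit
    elif tail == "lower":        # Ha: mu < mu0
        crit = std_normal.inv_cdf(1 - alpha)
        reject = z <= -crit
    else:
        raise ValueError("tail must be 'two', 'upper' or 'lower'")
    return z, crit, reject


# Sprinkler example: n = 9, sample mean 131.08, sigma = 1.5, alpha = 0.01.
z, crit, reject = z_test_mean(131.08, 130, 1.5, 9, 0.01)
print(f"z = {z:.2f}, critical value = {crit:.2f}, reject H0: {reject}")
```

For the sprinkler data the statistic stays below the two-tailed critical value, which agrees with the decision not to reject $H_0$.
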
A Normal Population with Large Sample and Unknown σ² (n > 40)

Let $X_1, \dots, X_n$ represent a random sample of size $n$ from a normal population with expected value $\mu$ and standard deviation $\sigma$. Then

$$\bar{X} \sim N\left(\mu_{\bar{X}} = \mu,\ \sigma_{\bar{X}}^2 = \sigma^2/n\right)$$

When $\sigma$ is unknown and $n$ is large, the sample standard deviation $s$ replaces $\sigma$ in the test statistic.

Null hypothesis $H_0: \mu = \mu_0$
Test statistic value:
$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
Alternative hypothesis      Rejection region for level $\alpha$ test
$H_a: \mu > \mu_0$          $z \ge z_\alpha$ (upper-tailed test)
$H_a: \mu < \mu_0$          $z \le -z_\alpha$ (lower-tailed test)
$H_a: \mu \ne \mu_0$        $|z| \ge z_{\alpha/2}$ (two-tailed test)




A Normal Population with Large Sample and Unknown σ² (n > 40)
Example. A machine is supposed to pack products weighing 1 kg. Because the machine is suspected of not working properly, a sample of 100 packages is weighed, giving the following:
Weight (kg)           0.95   0.97   0.99   1.01   1.03   1.05
Number of packages       9     31     40     15      3      2
Using a significance level $\alpha = 0.05$, test whether this suspicion is justified.
Solution
Let $\mu$ be the true average weight of a package.
Null hypothesis $H_0: \mu = 1$. Alternative hypothesis $H_a: \mu \ne 1$.
From the data: $n = 100$, $\bar{x} = 0.9856$, $s \approx 0.02$.
Test statistic value:
$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{0.9856 - 1}{0.02/\sqrt{100}} = -7.2$$
Conclusion: at significance level $\alpha = 0.05$ we have $z_{\alpha/2} = z_{0.025} = 1.96$. Since $|z| = 7.2 > 1.96$, $H_0$ is rejected. Because $\bar{x} = 0.9856 < 1$, the machine appears to be packing less than 1 kg per package.

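The sample mean and standard deviation in the packaging example come from a frequency table, so each weight has to be counted as many times as it occurs. The following sketch shows one way to do that in plain Python; the variable names are illustrative. It keeps $s$ unrounded, so the statistic it prints is a little smaller in magnitude (about -6.9) than the value obtained with $s$ rounded to 0.02, and the conclusion is unchanged.

```python
from statistics import NormalDist

weights = [0.95, 0.97, 0.99, 1.01, 1.03, 1.05]   # package weight (kg)
counts = [9, 31, 40, 15, 3, 2]                   # number of packages at each weight
mu0, alpha = 1.0, 0.05

n = sum(counts)
# Weighted mean: each weight contributes once per package.
xbar = sum(w * c for w, c in zip(weights, counts)) / n
# Sample variance with the usual n - 1 divisor.
s2 = sum(c * (w - xbar) ** 2 for w, c in zip(weights, counts)) / (n - 1)
s = s2 ** 0.5

z = (xbar - mu0) / (s / n ** 0.5)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # two-tailed critical value
reject = abs(z) >= z_crit

print(f"n = {n}, xbar = {xbar:.4f}, s = {s:.4f}")
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
```
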
A Normal Population with Small Sample and Unknown σ² (n ≤ 40)

Let $X_1, \dots, X_n$ represent a random sample of size $n$ from a normal population with expected value $\mu$ and standard deviation $\sigma$. Then

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has a t distribution with $n - 1$ degrees of freedom (df).

Null hypothesis $H_0: \mu = \mu_0$
Test statistic value:
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
Alternative hypothesis      Rejection region for level $\alpha$ test
$H_a: \mu > \mu_0$          $t \ge t_{\alpha,n-1}$ (upper-tailed test)
$H_a: \mu < \mu_0$          $t \le -t_{\alpha,n-1}$ (lower-tailed test)
$H_a: \mu \ne \mu_0$        $|t| \ge t_{\alpha/2,n-1}$ (two-tailed test)


A Normal Population with Small Sample and Unknown σ² (n ≤ 40)
Example. Glycerol is a major by-product of ethanol fermentation in wine production and contributes to the sweetness, body, and fullness of wines. The article "A Rapid and ···" includes the following observations on glycerol concentration (mg/mL) for samples of standard-quality (uncertified) white wines: 2.67, 4.62, 4.14, 3.81, 3.83. Suppose the desired concentration value is 4. Does the sample data suggest that the true average concentration is something other than the desired value, at a significance level of 0.05?
Solution. $\mu$ = true average glycerol concentration.
Null hypothesis $H_0: \mu = 4$. Alternative hypothesis $H_a: \mu \ne 4$.
From the data, we have $n = 5$, $\bar{x} = 3.814$, $s = 0.718$.
Test statistic value:
$$t = \frac{\bar{x} - 4}{s/\sqrt{n}} = \frac{3.814 - 4}{0.718/\sqrt{5}} = -0.58$$
With a significance level of 0.05, we have $t_{\alpha/2,n-1} = t_{0.025,4} = 2.776$.
Conclusion: $|t| = 0.58 < t_{\alpha/2,n-1} = 2.776$, so $H_0$ cannot be rejected. The data does not suggest that the true average concentration differs from 4.

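The one-sample t statistic for the glycerol data can be reproduced with the standard library. Python's `statistics` module has no t distribution, so this sketch takes the critical value $t_{0.025,4} = 2.776$ from the example itself instead of computing it; that is an assumption of the sketch, and the variable names are illustrative.

```python
from statistics import mean, stdev

glycerol = [2.67, 4.62, 4.14, 3.81, 3.83]   # glycerol concentration (mg/mL)
mu0 = 4.0                                   # desired concentration
t_crit = 2.776                              # t_{0.025, 4}, taken from the example's t table

n = len(glycerol)
xbar = mean(glycerol)
s = stdev(glycerol)                         # sample standard deviation (n - 1 divisor)
t = (xbar - mu0) / (s / n ** 0.5)
reject = abs(t) >= t_crit                   # two-tailed rejection rule

print(f"xbar = {xbar:.3f}, s = {s:.3f}, t = {t:.2f}, reject H0: {reject}")
```
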
Summary
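CASE 1: $\sigma$ known
$H_0: \mu = \mu_0$, test statistic $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
$H_a$                       Rejection region
$H_a: \mu > \mu_0$          $z \ge z_\alpha$
$H_a: \mu < \mu_0$          $z \le -z_\alpha$
$H_a: \mu \ne \mu_0$        $|z| \ge z_{\alpha/2}$

CASE 2: $\sigma$ unknown, $n > 40$
$H_0: \mu = \mu_0$, test statistic $z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
$H_a$                       Rejection region
$H_a: \mu > \mu_0$          $z \ge z_\alpha$
$H_a: \mu < \mu_0$          $z \le -z_\alpha$
$H_a: \mu \ne \mu_0$        $|z| \ge z_{\alpha/2}$

CASE 3: $\sigma$ unknown, $n \le 40$
$H_0: \mu = \mu_0$, test statistic $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
$H_a$                       Rejection region
$H_a: \mu > \mu_0$          $t \ge t_{\alpha;n-1}$
$H_a: \mu < \mu_0$          $t \le -t_{\alpha;n-1}$
$H_a: \mu \ne \mu_0$        $|t| \ge t_{\alpha/2;n-1}$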


Tests Concerning a Population Proportion

Let $p$ denote the proportion of individuals or objects in a population who possess a specified property, and let $X$ be the number of individuals in a sample of size $n$ who possess it.
The estimator $\hat{p} = X/n$ is unbiased ($E(\hat{p}) = p$), has approximately a normal distribution, and its standard deviation is $\sigma_{\hat{p}} = \sqrt{p(1 - p)/n}$. Then
Null hypothesis $H_0: p = p_0$.
Test statistic value:
$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$$
Alternative hypothesis      Rejection region
$H_a: p > p_0$              $z \ge z_\alpha$ (upper-tailed)
$H_a: p < p_0$              $z \le -z_\alpha$ (lower-tailed)
$H_a: p \ne p_0$            $|z| \ge z_{\alpha/2}$ (two-tailed)
These test procedures are valid provided that $np_0 \ge 10$ and $n(1 - p_0) \ge 10$.


Tests Concerning a Population Proportion

Example. Natural cork in wine bottles is subject to deterioration, and as a result wine in such bottles may experience contamination. The article "Effects of Bottle Closure Type on Consumer Perceptions of Wine Quality" reported that, in a tasting of commercial chardonnays, 16 of 91 bottles were considered spoiled to some extent by cork-associated characteristics. Does this data provide strong evidence for concluding that more than 15% of all such bottles are contaminated in this way? Let's carry out a test of hypotheses using a significance level of 0.10.


Tests Concerning a Population Proportion

Solution.
$p$ = the true proportion of all commercial chardonnay bottles considered spoiled to some extent by cork-associated characteristics.
Null hypothesis $H_0: p = 0.15$. Alternative hypothesis $H_a: p > 0.15$.
With $n = 91$ and $\hat{p} = 16/91 = 0.1758$, the test statistic value is
$$z = \frac{\hat{p} - 0.15}{\sqrt{(0.15)(0.85)/n}} = \frac{0.1758 - 0.15}{\sqrt{(0.15)(0.85)/91}} = 0.69$$

With a significance level of $\alpha = 0.1$, we have $z_\alpha = 1.28$.

Conclusion: $z = 0.69 < z_\alpha = 1.28$, so $H_0$ cannot be rejected.
Although the percentage of contaminated bottles in the sample somewhat exceeds 15%, the sample percentage is not large enough to conclude that the population percentage exceeds 15%.
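The large-sample test for a proportion, including the validity condition $np_0 \ge 10$ and $n(1 - p_0) \ge 10$, can be sketched as follows. The helper name `proportion_z` is hypothetical; the critical value is computed with `statistics.NormalDist` instead of being read from a table.

```python
from statistics import NormalDist


def proportion_z(successes, n, p0):
    """z statistic for H0: p = p0 in the large-sample proportion test."""
    p_hat = successes / n
    return (p_hat - p0) / (p0 * (1 - p0) / n) ** 0.5


# Cork example: 16 spoiled bottles out of 91, H0: p = 0.15 vs Ha: p > 0.15.
successes, n, p0, alpha = 16, 91, 0.15, 0.10

# Validity condition quoted in the lecture.
valid = n * p0 >= 10 and n * (1 - p0) >= 10

z = proportion_z(successes, n, p0)
z_crit = NormalDist().inv_cdf(1 - alpha)     # upper-tailed critical value z_alpha
reject = z >= z_crit

print(f"valid: {valid}, z = {z:.2f}, z_alpha = {z_crit:.2f}, reject H0: {reject}")
```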


Tests for a Difference Between Two Population Means
Basic Assumptions
1. $X_1, \dots, X_m$ is a random sample from a distribution with mean $\mu_1$ and variance $\sigma_1^2$.
2. $Y_1, \dots, Y_n$ is a random sample from a distribution with mean $\mu_2$ and variance $\sigma_2^2$.
3. The X and Y samples are independent of one another.

The expected value of $\bar{X} - \bar{Y}$ is $\mu_1 - \mu_2$, so $\bar{X} - \bar{Y}$ is an unbiased estimator of $\mu_1 - \mu_2$. The standard deviation of $\bar{X} - \bar{Y}$ is
$$\sigma_{\bar{X} - \bar{Y}} = \sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}.$$


Test Procedures for Normal Populations with
Known Variances
Null hypothesis $H_0: \mu_1 - \mu_2 = \Delta_0$
Test statistic value:
$$z = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}}$$
Alternative hypothesis               Rejection region for level $\alpha$ test
$H_a: \mu_1 - \mu_2 > \Delta_0$      $z \ge z_\alpha$ (upper-tailed)
$H_a: \mu_1 - \mu_2 < \Delta_0$      $z \le -z_\alpha$ (lower-tailed)
$H_a: \mu_1 - \mu_2 \ne \Delta_0$    $|z| \ge z_{\alpha/2}$ (two-tailed)
Example. Analysis of a random sample consisting of $m = 20$ specimens of cold-rolled steel to determine yield strengths resulted in a sample average strength of $\bar{x} = 29.8$. A second random sample of $n = 25$ two-sided galvanized steel specimens gave a sample average strength of $\bar{y} = 34.7$. Assume that the two yield-strength distributions are normal with $\sigma_1 = 4.0$ and $\sigma_2 = 5.0$. Does the data indicate that the corresponding true average yield strengths $\mu_1$ and $\mu_2$ are different? Let's carry out a test at significance level 1%.

Test Procedures for Normal Populations with
Known Variances
Solution.
The parameter of interest is $\mu_1 - \mu_2$, the difference between the true average strengths for the two types of steel.
Null hypothesis $H_0: \mu_1 - \mu_2 = 0$.
Alternative hypothesis $H_a: \mu_1 - \mu_2 \ne 0$
The test statistic value is
$$z = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}} = \frac{29.8 - 34.7}{\sqrt{\dfrac{16}{20} + \dfrac{25}{25}}} = -3.65$$
For $\alpha = 0.01$, we have $z_{\alpha/2} = z_{0.005} = 2.58$.
Conclusion: $|z| = 3.65 \ge z_{\alpha/2} = 2.58$, so $H_0$ is rejected.
The sample data strongly suggests that the true average yield strength for cold-rolled steel differs from that for galvanized steel.

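The two-sample z statistic with known variances is a one-line formula, sketched below for the steel example. The function name `two_sample_z` is hypothetical, and the critical value is computed with `statistics.NormalDist` rather than looked up.

```python
from statistics import NormalDist


def two_sample_z(xbar, ybar, sigma1, sigma2, m, n, delta0=0.0):
    """z statistic for H0: mu1 - mu2 = delta0 with known standard deviations."""
    std_error = (sigma1 ** 2 / m + sigma2 ** 2 / n) ** 0.5
    return (xbar - ybar - delta0) / std_error


# Steel example: m = 20, xbar = 29.8, sigma1 = 4.0; n = 25, ybar = 34.7, sigma2 = 5.0.
z = two_sample_z(29.8, 34.7, 4.0, 5.0, 20, 25)
z_crit = NormalDist().inv_cdf(1 - 0.01 / 2)   # two-tailed critical value for alpha = 0.01
reject = abs(z) >= z_crit

print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
```
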
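Tests for a Difference Between Two Population Means

CASE 1: $\sigma_1^2$, $\sigma_2^2$ known
$H_0: \mu_1 - \mu_2 = \Delta_0$, test statistic
$$z = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}}$$
$H_a$                                Rejection region
$H_a: \mu_1 - \mu_2 > \Delta_0$      $z \ge z_\alpha$
$H_a: \mu_1 - \mu_2 < \Delta_0$      $z \le -z_\alpha$
$H_a: \mu_1 - \mu_2 \ne \Delta_0$    $|z| \ge z_{\alpha/2}$

CASE 2: $\sigma_1^2$, $\sigma_2^2$ unknown, large samples ($m, n > 40$)
$H_0: \mu_1 - \mu_2 = \Delta_0$, test statistic
$$z = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}$$
$H_a$                                Rejection region
$H_a: \mu_1 - \mu_2 > \Delta_0$      $z \ge z_\alpha$
$H_a: \mu_1 - \mu_2 < \Delta_0$      $z \le -z_\alpha$
$H_a: \mu_1 - \mu_2 \ne \Delta_0$    $|z| \ge z_{\alpha/2}$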


Tests for a Difference Between Two Population Means

CASE 3: $\sigma_1^2$, $\sigma_2^2$ unknown, small samples ($m, n \le 40$)
$H_0: \mu_1 - \mu_2 = \Delta_0$, test statistic
$$t = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}$$
$H_a$                                Rejection region
$H_a: \mu_1 - \mu_2 > \Delta_0$      $t \ge t_{\alpha;\nu}$
$H_a: \mu_1 - \mu_2 < \Delta_0$      $t \le -t_{\alpha;\nu}$
$H_a: \mu_1 - \mu_2 \ne \Delta_0$    $|t| \ge t_{\alpha/2;\nu}$
where
$$\nu = \frac{\left(\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}\right)^2}{\dfrac{(s_1^2/m)^2}{m-1} + \dfrac{(s_2^2/n)^2}{n-1}}$$
(round $\nu$ down to the nearest integer).
For example, $m = n = 10$, $s_1 = 0.79$, $s_2 = 3.59$ gives $\nu = 9.87$, which is rounded down to $\nu = 9$, so $t_{\alpha/2;\nu} = t_{0.025;9} = 2.262$.
Reading: Examples 9.1 and 9.2 (p. 348); Example 9.4 (p. 351); Example 9.7 (p. 359).
Homework: 2b, 3, 6a, 7, 8a (p. 354); 19, 28, 32 (p. 362).

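The degrees-of-freedom formula for the two-sample t test is easy to get wrong by hand, so a short sketch may help. It evaluates $\nu$ from the two sample standard deviations and sample sizes and then rounds down, as the slide instructs. The function name `approx_df` is hypothetical.

```python
import math


def approx_df(s1, s2, m, n):
    """Approximate degrees of freedom nu for the two-sample t test (before rounding)."""
    a = s1 ** 2 / m          # s1^2 / m
    b = s2 ** 2 / n          # s2^2 / n
    return (a + b) ** 2 / (a ** 2 / (m - 1) + b ** 2 / (n - 1))


# Values from the slide: m = n = 10, s1 = 0.79, s2 = 3.59.
nu = approx_df(0.79, 3.59, 10, 10)
nu_rounded = math.floor(nu)  # round down to the nearest integer

print(f"nu = {nu:.2f}, rounded down to {nu_rounded}")
```
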
Tests for a Difference Between Two Population Means
Exercise 1. In a sample of 75 morning shifts, the sample mean output was 806 products/shift and the standard deviation was 187. In a sample of 100 evening shifts, the sample mean output was 725 products/shift and the standard deviation was 168. Does it appear that the true average output of the morning shift is greater than that of the evening shift?
Test the appropriate hypotheses at significance level 0.01.
Exercise 2. To study the effect of applying fertilizer according to formula A on maize yield, an experiment was run on 5 pairs of adjacent plots. Each pair included a control plot (without fertilizer) and a plot fertilized according to formula A. The yields obtained are as follows (ton/ha).
Without fertilizer    55   53   30   37   49
With fertilizer       60   58   28   39   47
Does it appear that the true average maize yield of plots without fertilizer is less than that of plots with fertilizer? Test the appropriate hypotheses at significance level 0.05.

Inferences Concerning a Difference Between
Population Proportions

Let $p_1$, $p_2$ be the proportions of successes (S's) in population #1 and #2, respectively.

Let $X \sim \text{Bin}(m, p_1)$ and $Y \sim \text{Bin}(n, p_2)$ with $X$ and $Y$ independent variables.
Set $\hat{p}_1 = X/m$ and $\hat{p}_2 = Y/n$. Then

$$E(\hat{p}_1 - \hat{p}_2) = p_1 - p_2,$$

so $\hat{p}_1 - \hat{p}_2$ is an unbiased estimator of $p_1 - p_2$, and

$$V(\hat{p}_1 - \hat{p}_2) = \frac{p_1 q_1}{m} + \frac{p_2 q_2}{n}, \quad \text{with } q_i = 1 - p_i.$$


Inferences Concerning a Difference Between
Population Proportions

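Null hypothesis $H_0: p_1 - p_2 = 0$
Test statistic value:
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{m} + \dfrac{1}{n}\right)}}$$
where $\hat{p} = \dfrac{X + Y}{m + n}$ and $\hat{q} = 1 - \hat{p}$. The pooled estimate $\hat{p}$ is appropriate because $p_1 = p_2$ under $H_0$.
Alternative hypothesis      Rejection region for approximate level $\alpha$ test
$H_a: p_1 - p_2 > 0$        $z \ge z_\alpha$
$H_a: p_1 - p_2 < 0$        $z \le -z_\alpha$
$H_a: p_1 - p_2 \ne 0$      $|z| \ge z_{\alpha/2}$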


Inferences Concerning a Difference Between
Population Proportions
Example 1. Among 529 helmets of company A, there are 95 helmets not meeting quality standards. Among 400 helmets of company B, there are 95 helmets not meeting quality standards. Carry out a test of hypotheses at the 3% significance level to decide whether the proportion of helmets not meeting quality standards differs between company A and company B.
Solution.
Let $p_1$, $p_2$ be the proportions of helmets not meeting quality standards for company A and company B, respectively.
The null hypothesis $H_0: p_1 - p_2 = 0$. The alternative hypothesis $H_a: p_1 - p_2 \ne 0$.
With the significance level $\alpha = 3\%$, $z_{\alpha/2} = z_{0.015} = 2.17$.
$$m = 529;\quad n = 400;\quad \hat{p}_1 = \frac{95}{529};\quad \hat{p}_2 = \frac{95}{400};\quad \hat{p} = \frac{95 + 95}{529 + 400};\quad \hat{q} = 1 - \hat{p}$$
The test statistic value:
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{m} + \dfrac{1}{n}\right)}} = -2.16706$$
Conclusion: $|z| = 2.16706 < z_{\alpha/2} = 2.1701$, so $H_0: p_1 = p_2$ is not rejected.
At the 3% level, the data does not provide sufficient evidence that the quality of the helmets of company A and company B differs. The statistic lies very close to the critical value.

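Because the helmet example is decided by a margin in the third decimal place, it is worth computing the statistic and the critical value without rounding. The sketch below does this with the pooled two-proportion z statistic; the function name `two_proportion_z` is hypothetical.

```python
from statistics import NormalDist


def two_proportion_z(x, m, y, n):
    """Pooled z statistic for H0: p1 - p2 = 0 from counts x of m and y of n."""
    p1_hat, p2_hat = x / m, y / n
    p_pooled = (x + y) / (m + n)                 # common estimate under H0
    std_error = (p_pooled * (1 - p_pooled) * (1 / m + 1 / n)) ** 0.5
    return (p1_hat - p2_hat) / std_error


# Helmet example: 95 of 529 (company A) and 95 of 400 (company B), alpha = 0.03.
z = two_proportion_z(95, 529, 95, 400)
z_crit = NormalDist().inv_cdf(1 - 0.03 / 2)      # two-tailed critical value z_{0.015}
reject = abs(z) >= z_crit

print(f"z = {z:.5f}, critical value = {z_crit:.5f}, reject H0: {reject}")
```
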
Inferences Concerning a Difference Between
Population Proportions
Example 2. Among 550 male students, there are 45 good students. Among
450 female students, there are 50 good students. Does it appear that the
proportion of good students among male students is less than among female
students? Test the appropriate hypotheses at significance level 0.02.
Solution.



Analysis of Paired Data

The data consists of $n$ independently selected pairs

$$(X_1, Y_1), (X_2, Y_2), \dots, (X_n, Y_n),$$

with $E(X_i) = \mu_1$ and $E(Y_i) = \mu_2$, $i = 1, \dots, n$.

Let
$$D_1 = X_1 - Y_1,\quad D_2 = X_2 - Y_2,\quad \dots,\quad D_n = X_n - Y_n,$$
so the $D_i$ are the differences within pairs.
Then the $D_i$ are assumed to be normally distributed with mean value $\mu_D$ and variance $\sigma_D^2$ (this is usually a consequence of the $X_i$'s and $Y_i$'s themselves being normally distributed).


Analysis of Paired Data

Null hypothesis: $H_0: \mu_D = \Delta_0$
(where $D = X - Y$ is the difference between the first and second observations within a pair, and $\mu_D = \mu_1 - \mu_2$).
Test statistic value:
$$t = \frac{\bar{d} - \Delta_0}{s_D/\sqrt{n}}$$
(where $\bar{d}$ and $s_D$ are the sample mean and standard deviation, respectively, of the $d_i$'s).
Alternative hypothesis      Rejection region for level $\alpha$ test
$H_a: \mu_D > \Delta_0$     $t \ge t_{\alpha,n-1}$
$H_a: \mu_D < \Delta_0$     $t \le -t_{\alpha,n-1}$
$H_a: \mu_D \ne \Delta_0$   $|t| \ge t_{\alpha/2,n-1}$


Analysis of Paired Data
Example 1. The length of the left hand X and the length of the right hand Y are measured for some randomly selected people:
X (cm)   60   61   61.5   60.8   63.2   62.4   61.5   63.2
Y (cm)   61   63   62.1   61.3   61.9   61.1   61.4   62.8
At a significance level of 1%, test whether the average lengths of the left hand and the right hand differ.
Example 2. Surveying the height (unit: cm) of some 4th grade elementary students before and after 3 months of summer vacation in region A, we collect the following data:
Y   132.1   135.2   128.5   122.8   128.5   138.1   140.2   136.3
X   132.8   135.2   128.9   123.5   129.3   138.9   140.5   136.5
where Y: before summer and X: after summer.
Test the claim that after 3 months of summer vacation, the height of elementary school students in grade 4 increased by at least 0.3 cm, at a significance level of 5%.

Analysis of Paired Data
1. Let $\mu_X$, $\mu_Y$ be the average lengths of the left and right hand.
Let $\mu_D = \mu_X - \mu_Y$. Then
X (cm)       60    61    61.5   60.8   63.2   62.4   61.5   63.2
Y (cm)       61    63    62.1   61.3   61.9   61.1   61.4   62.8
d = X − Y    −1    −2    −0.6   −0.5    1.3    1.3    0.1    0.4

Null hypothesis: $H_0: \mu_D = 0$. Alternative hypothesis: $H_a: \mu_D \ne 0$.

From the data, we have $n = 8$, $\bar{d} = -0.125$, $s_D = 1.136$.
Test statistic value:
$$t = \frac{\bar{d} - \Delta_0}{s_D/\sqrt{n}} = \frac{-0.125 - 0}{1.136/\sqrt{8}} = -0.3112$$

With $\alpha = 0.01$ and $n = 8$, $t_{\alpha/2;n-1} = t_{0.005;7} = 3.499$.

Conclusion: $|t| = 0.3112 < t_{\alpha/2;n-1} = 3.499$, so $H_0$ is not rejected.
The data does not show a significant difference between the average lengths of the left and right hands.

Analysis of Paired Data
2. Let $\mu_X$, $\mu_Y$ be respectively the average height after summer and before summer.
Let $\mu_D = \mu_X - \mu_Y$. Then
Y            132.1   135.2   128.5   122.8   128.5   138.1   140.2   136.3
X            132.8   135.2   128.9   123.5   129.3   138.9   140.5   136.5
d = X − Y      0.7     0       0.4     0.7     0.8     0.8     0.3     0.2

Null hypothesis: $H_0: \mu_D = 0.3$. Alternative hypothesis: $H_a: \mu_D > 0.3$.

From the data, we have $n = 8$, $\bar{d} = 0.4875$, $s_D = 0.3044$.
Test statistic value:
$$t = \frac{\bar{d} - \Delta_0}{s_D/\sqrt{n}} = \frac{0.4875 - 0.3}{0.3044/\sqrt{8}} = 1.742$$

With $\alpha = 0.05$, $t_{\alpha;n-1} = t_{0.05;7} = 1.895$.

Conclusion: $t = 1.742 < t_{\alpha;n-1} = 1.895$, so $H_0$ is not rejected.
At the 5% level, the data does not provide sufficient evidence that the average height increased by more than 0.3 cm.

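Both paired-data examples reduce to a one-sample t test on the within-pair differences. The sketch below recomputes the two statistics from the raw measurements; the helper name `paired_t` is hypothetical, and the critical values 3.499 and 1.895 are taken from the examples because the standard library has no t distribution.

```python
from statistics import mean, stdev


def paired_t(x, y, delta0=0.0):
    """t statistic for H0: mu_D = delta0, where D = X - Y within each pair."""
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    return (mean(d) - delta0) / (stdev(d) / n ** 0.5)


# Example 1: left (X) and right (Y) hand lengths in cm, two-tailed test.
left = [60, 61, 61.5, 60.8, 63.2, 62.4, 61.5, 63.2]
right = [61, 63, 62.1, 61.3, 61.9, 61.1, 61.4, 62.8]
t1 = paired_t(left, right)
reject1 = abs(t1) >= 3.499            # t_{0.005, 7} from the example

# Example 2: heights after (X) and before (Y) summer in cm, upper-tailed test.
before = [132.1, 135.2, 128.5, 122.8, 128.5, 138.1, 140.2, 136.3]
after = [132.8, 135.2, 128.9, 123.5, 129.3, 138.9, 140.5, 136.5]
t2 = paired_t(after, before, delta0=0.3)
reject2 = t2 >= 1.895                 # t_{0.05, 7} from the example

print(f"Example 1: t = {t1:.4f}, reject H0: {reject1}")
print(f"Example 2: t = {t2:.3f}, reject H0: {reject2}")
```
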
Linear Regression

Let X and Y be two random variables that take the corresponding values $x_1, \dots, x_n$ and $y_1, \dots, y_n$:
X    $x_1$   $x_2$   $x_3$   ...   $x_n$
Y    $y_1$   $y_2$   $y_3$   ...   $y_n$

We find the relationship between X and Y by establishing a linear regression model.

Model equation:
$$y = a + bx,$$
where
$$b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{S_{xy}}{S_{xx}}$$
and
$$a = \frac{\sum y_i - b \sum x_i}{n} = \bar{y} - b\bar{x}$$


Linear Regression

The correlation coefficient of $n$ pairs $(x_1, y_1), \dots, (x_n, y_n)$ is given by

$$r = \frac{S_{xy}}{\sqrt{S_{xx}}\sqrt{S_{yy}}}$$

where $S_{yy} = \sum (y_i - \bar{y})^2$.
Properties.
i. $r \in [-1; 1]$
ii. If $r = 0$ then the two variables have no linear correlation.
iii. The strength of the linear dependence of the two variables is shown in the following table
Weak               Moderate               Strong
$|r| \le 0.5$      $0.5 < |r| < 0.8$      $|r| \ge 0.8$


Linear Regression

Example. Observe a sample of (X, Y):

X   132.0   129.0   120.0   113.2   105.0    92.0    84.0
Y    46.0    48.0    51.0    52.1    54.0    52.0    59.0
X    83.2    88.4    59.0    80.0    81.5    71.0    69.2
Y    58.7    61.6    64.0    61.4    54.6    58.8    58.0

Find the linear regression equation of Y on X.

From the data, we have

$$b = -0.20938742 \quad \text{and} \quad a = 75.212432$$

and the linear equation

$$y = 75.212 - 0.209x$$
Correlation coefficient: $r = -0.8892$. Strong negative linear correlation.
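The regression coefficients and the correlation coefficient can also be computed directly from the definitions of $S_{xx}$, $S_{yy}$ and $S_{xy}$. The sketch below does this for the 14 observed pairs; the function name `linear_regression` is an illustrative choice and the code uses no external libraries.

```python
def linear_regression(x, y):
    """Return (a, b, r) for the least-squares line y = a + b*x and the correlation r."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx                     # slope
    a = ybar - b * xbar               # intercept
    r = sxy / (sxx * syy) ** 0.5      # correlation coefficient
    return a, b, r


x = [132.0, 129.0, 120.0, 113.2, 105.0, 92.0, 84.0,
     83.2, 88.4, 59.0, 80.0, 81.5, 71.0, 69.2]
y = [46.0, 48.0, 51.0, 52.1, 54.0, 52.0, 59.0,
     58.7, 61.6, 64.0, 61.4, 54.6, 58.8, 58.0]

a, b, r = linear_regression(x, y)
print(f"y = {a:.3f} + ({b:.3f})x, r = {r:.4f}")
```

The result can be compared with the calculator output: the slope corresponds to B, the intercept to A, and the last value to r.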


Using calculator to find regression equation

fx-570
1. (frequency column) Shift → Mode → ⇓ 4 → 1.on
2. Mode → 3:STAT → 2.
Input data, then press AC.
3. Shift → 1 → 5. Reg → 1. A
Shift → 1 → 5. Reg → 2. B
Shift → 1 → 5. Reg → 3. r: correlation
fx-580
1. MODE → 6 → 2
Input data, then press AC.
2. OPTN → 3
Note: The linear regression equation is Y = A + BX .
