Problem #1

This problem involves testing the hypothesis that the average annual tourist arrival from Hong Kong is higher for Penang than for Johor using a two sample t-test. The datasets provided are annual tourist arrivals (in thousands) from Hong Kong to Penang from 2014 to 2019 (x1) and to Johor from 2014 to 2019 (x2). The null hypothesis is that the mean difference between Penang and Johor is equal to 0, while the alternative hypothesis is that the mean for Penang is greater than the mean for Johor. A Welch two sample t-test is conducted to test this hypothesis at a 0.05 significance level.

Problem #1

H0: μ = 65 kg (no significant difference in weight)
Hα: μ ≠ 65 kg (there is a significant difference in weight)
Hypothesized mean, μ (mu) = 65 kg (stated); population standard deviation, σ (sigma.x) = 12
Significance level, α = 0.05

Input

# Data sets for testing
x1 <- c(41.2, 31.8, 46.0, 84.9, 66.1, 70.4, 82.0, 67.7, 37.2, 59.7,
        76.9, 72.8, 53.3, 54.4, 68.1, 60.6, 47.5, 69.5, 54.7, 78.3,
        71.6, 67.4, 50.6, 62.4, 84.7, 73.8, 69.0, 62.8, 52.4, 62.6)
x2 <- c(52.6, 59.4, 80.4, 70.7, 86.3, 75.1, 62.3, 68.6, 86.0, 80.8,
        67.1, 63.8, 74.7, 71.6, 84.3, 77.1, 69.9, 71.1, 74.2, 56.7,
        62.9, 81.7, 53.1, 59.6, 77.2, 48.0, 68.0, 91.3, 84.7, 62.5)

# Check for normality of samples using visual techniques
# qqPlot() is not in base R; it is assumed here to come from the car package
library(car)
qqPlot(x1)
qqPlot(x2)

Output

[Figures: normal Q-Q plots of x1 and x2]

The z-test is allowed since both samples look approximately normal on the Q-Q plots.
The z-test is used (rather than the t-test) since the population standard deviation is known (σ = 12); each sample has n = 30 observations.

Input

# Hypothesis testing for x1 and x2
# z.test() with a sigma.x argument is assumed here to come from the BSDA package
library(BSDA)
z.test(x = x1, alternative = "two.sided", mu = 65, sigma.x = 12, conf.level = 0.95)
z.test(x = x2, alternative = "two.sided", mu = 65, sigma.x = 12, conf.level = 0.95)

Output

data: x1
z = -1.0589, p-value = 0.2896
alternative hypothesis: true mean is not equal to 65
95 percent confidence interval:
58.38593 66.97407
sample estimates:
mean of x
62.68

data: x2
z = 2.6123, p-value = 0.008993
alternative hypothesis: true mean is not equal to 65
95 percent confidence interval:
66.42927 75.01740
sample estimates:
mean of x
70.72333

z statistic and its p-value :

For x1, z = -1.0589, p-value = 0.2896
For x2, z = 2.6123, p-value = 0.008993

For the first sample, taken in 1990: since p = 0.2896 > α = 0.05, H0 is not rejected. There is no significant difference in the mean body weight of male UTM students in 1990 compared to 1970.

For the second sample, taken in 2010: since p = 0.008993 < α = 0.05, H0 is rejected. There is a significant difference in the mean body weight of male UTM students in 2010 compared to 1970.
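
As a cross-check on the reported z statistics, the same quantities can be recomputed with base R only. This is a minimal sketch, assuming the data vectors and σ = 12 given above; the helper name z_by_hand is hypothetical and is not part of the original solution or of any package.

```r
# Data as given in Problem 1
x1 <- c(41.2, 31.8, 46.0, 84.9, 66.1, 70.4, 82.0, 67.7, 37.2, 59.7,
        76.9, 72.8, 53.3, 54.4, 68.1, 60.6, 47.5, 69.5, 54.7, 78.3,
        71.6, 67.4, 50.6, 62.4, 84.7, 73.8, 69.0, 62.8, 52.4, 62.6)
x2 <- c(52.6, 59.4, 80.4, 70.7, 86.3, 75.1, 62.3, 68.6, 86.0, 80.8,
        67.1, 63.8, 74.7, 71.6, 84.3, 77.1, 69.9, 71.1, 74.2, 56.7,
        62.9, 81.7, 53.1, 59.6, 77.2, 48.0, 68.0, 91.3, 84.7, 62.5)

# Hypothetical helper: one-sample z statistic and two-sided p-value
# for a known population standard deviation sigma
z_by_hand <- function(x, mu, sigma) {
  z <- (mean(x) - mu) / (sigma / sqrt(length(x)))
  p <- 2 * pnorm(-abs(z))  # two-sided tail area under the standard normal
  c(z = z, p.value = p)
}

r1 <- z_by_hand(x1, mu = 65, sigma = 12)
r2 <- z_by_hand(x2, mu = 65, sigma = 12)
print(round(r1, 4))
print(round(r2, 4))
```

The values should agree with the z and p-value lines of the output above, which confirms that the test is driven by the known σ and the sample size rather than by the sample standard deviation.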
Problem 2 - The Total Suspended Particulate (TSP) level, measured in micrograms per cubic meter (µg/m3), is considered excessive if it exceeds 90 µg/m3. Air sample readings taken from 13 locations around Pasir Gudang are given in Table 2. At 0.05 level of significance, decide if the air quality in Pasir Gudang is above the maximum acceptable level of TSP.

From this sample, the air is considered harmful if the mean TSP over the 13 locations is greater than 90 µg/m3. The significance level is α = 0.05.

H0: μ ≤ 90 µg/m3 (average TSP is equal to or below the maximum acceptable level)
Hα: μ > 90 µg/m3 (average TSP is above the maximum acceptable level)

Input

# Air sample readings taken from 13 locations around Pasir Gudang
x1 <- c(89, 103, 85, 96, 98, 93, 85, 98, 97, 86, 105, 91, 100)

# Normality test
shapiro.test(x1)

# At 0.05 level of significance, decide if the air quality in Pasir Gudang
# is above the maximum acceptable level of TSP using a t-test
t.test(x = x1, mu = 90, alternative = "greater", conf.level = 0.95)

Output

Shapiro-Wilk normality test

data: x1
W = 0.94108, p-value = 0.4711

The Shapiro-Wilk test does not reject normality (p-value = 0.4711 > 0.05), so the sample is treated as normally distributed.
The t-test is used since the sample size is smaller than 30 (n = 13) and the population standard deviation is unknown.

One Sample t-test

data: x1
t = 2.3094, df = 12, p-value = 0.01976
alternative hypothesis: true mean is greater than 90
95 percent confidence interval:
90.98322 Inf
sample estimates:
mean of x
94.30769

Answer :

p-value = 0.01976 < α = 0.05
H0 is rejected. The average Total Suspended Particulate (TSP) level in Pasir Gudang is above the maximum acceptable level.
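
The one-sample t statistic behind this decision can be reproduced step by step in base R. The following is a sketch under the assumptions already stated (hypothesized mean 90, one-sided alternative, α = 0.05); the variable names are chosen here for illustration.

```r
# Air sample readings from the 13 locations (Problem 2)
x1 <- c(89, 103, 85, 96, 98, 93, 85, 98, 97, 86, 105, 91, 100)

n  <- length(x1)
se <- sd(x1) / sqrt(n)             # standard error of the sample mean

t_stat  <- (mean(x1) - 90) / se    # distance of the mean from 90, in standard errors
p_value <- pt(t_stat, df = n - 1, lower.tail = FALSE)  # upper-tail p-value

# Lower bound of the one-sided 95% confidence interval for the mean
lower <- mean(x1) - qt(0.95, df = n - 1) * se

print(c(t = t_stat, p.value = p_value, lower = lower))
```

Because the lower confidence bound lies above 90, the interval and the p-value lead to the same decision.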

Problem 3 - A sample of 12 residents was asked about their satisfaction level with garbage collection services before and after privatization by their local authority. Their ratings on a 1 to 10 Likert scale, where 1 represents “Least Satisfied” and 10 denotes “Completely Satisfied”, are given in Table 3:

At 97.5% confidence level, can we conclude that the residents are more satisfied with the garbage collection
after the privatization?

Significance level, α = 1 - confidence level = 1 - 0.975 = 0.025

To find out whether the residents are more satisfied, we compare the mean scores:
H0: μ(after) - μ(before) = 0 (no change in satisfaction)
Hα: μ(after) - μ(before) > 0 (more satisfied)

The data are paired (the same 12 residents rated the service before and after), thus a paired t-test is used.

Input

# Data sets for testing
bfrpriv <- c(4, 7, 6, 6, 7, 5, 7, 6, 8, 8, 6, 6)
aftpriv <- c(4, 8, 7, 9, 8, 5, 7, 8, 9, 8, 6, 4)

# Normality test
shapiro.test(bfrpriv)
shapiro.test(aftpriv)

# Hypothesis test
# Note: with bfrpriv as the first argument, the difference tested is before - after
t.test(bfrpriv, aftpriv, mu = 0, alternative = "greater", paired = TRUE, conf.level = 0.975)

Output

Shapiro-Wilk normality test

data: bfrpriv
W = 0.91998, p-value = 0.2857

Shapiro-Wilk normality test

data: aftpriv
W = 0.88032, p-value = 0.08848

Neither test rejects normality (both p-values are higher than 0.025), so both samples are treated as normally distributed.
The t-test is used since the sample size is smaller than 30.

Paired t-test

data: bfrpriv and aftpriv
t = -1.6295, df = 11, p-value = 0.9343
alternative hypothesis: true mean difference is greater than 0
97.5 percent confidence interval:
-1.371263 Inf
sample estimates:
mean difference
-0.5833333

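Answer :
p-value = 0.9343 > α = 0.025
Hence, H0 is retained. At the 97.5% confidence level, we cannot conclude that the residents are more satisfied with the garbage collection after the privatization.
Note that t.test(bfrpriv, aftpriv, ...) takes the difference as before - after, so the reported mean difference of -0.5833 corresponds to after - before = 0.5833, and the call tests the opposite direction to Hα. For the intended direction, t = 1.6295 and, by symmetry of the t distribution, the one-sided p-value is 1 - 0.9343 = 0.0657. This is still greater than α = 0.025, so the conclusion is unchanged: the ratings are higher on average after privatization, but the increase is not statistically significant.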
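
The alternative hypothesis is stated as μ(after) - μ(before) > 0, while t.test(x, y, paired = TRUE) works on the differences x - y. A minimal sketch of the call with the arguments in the matching order, assuming the same data and confidence level as above:

```r
# Satisfaction ratings of the same 12 residents (Problem 3)
bfrpriv <- c(4, 7, 6, 6, 7, 5, 7, 6, 8, 8, 6, 6)
aftpriv <- c(4, 8, 7, 9, 8, 5, 7, 8, 9, 8, 6, 4)

# aftpriv first, so the tested difference is after - before,
# matching the alternative "after - before > 0"
res <- t.test(aftpriv, bfrpriv, mu = 0, alternative = "greater",
              paired = TRUE, conf.level = 0.975)
print(res)

# Decision at alpha = 0.025
alpha <- 0.025
reject_h0 <- res$p.value < alpha
print(reject_h0)
```

With this ordering the sign of the t statistic and of the mean difference agrees with the direction of the hypothesis, which makes the output easier to read against Hα.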
Problem 4 - A study on the annual arrival of Hong Kong tourists to Penang and Johor was conducted. For each of
these states, tourist arrivals (in thousand persons/year) were recorded for six years, between 2014 and 2019.

At 0.05 level of significance, is the average annual tourist arrival from Hong Kong higher for Penang than for Johor?

Significance level, α = 0.05
H0: μ(Penang) - μ(Johor) = 0 (no difference between Penang and Johor)
Hα: μ(Penang) - μ(Johor) > 0 (higher for Penang)

Input

# Datasets for testing
x1 <- c(35, 42, 42, 43, 47, 52)  # Penang
x2 <- c(26, 30, 32, 32, 32, 34)  # Johor

# Hypothesis test
t.test(x1, x2, mu = 0, alternative = "greater", paired = FALSE, var.equal = FALSE, conf.level = 0.95)

Output

Welch Two Sample t-test

data: x1 and x2
t = 4.8473, df = 7.2295, p-value = 0.0008502
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
7.637572 Inf
sample estimates:
mean of x mean of y
43.5 31.0

Mean of x (Penang) is 43.5
Mean of y (Johor) is 31.0
p-value = 0.0008502 < α = 0.05

Thus, H0 is rejected. The average annual tourist arrival from Hong Kong is higher for Penang than for Johor.
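
The fractional degrees of freedom in the Welch test come from the Welch-Satterthwaite approximation, which does not assume equal variances. A minimal base R sketch of that calculation, assuming the same two samples; the variable names are chosen here for illustration.

```r
# Annual tourist arrivals in thousands, 2014 to 2019 (Problem 4)
x1 <- c(35, 42, 42, 43, 47, 52)  # Penang
x2 <- c(26, 30, 32, 32, 32, 34)  # Johor

# Per-sample variance of the mean
v1 <- var(x1) / length(x1)
v2 <- var(x2) / length(x2)

# Welch t statistic: difference in means over its standard error
t_stat <- (mean(x1) - mean(x2)) / sqrt(v1 + v2)

# Welch-Satterthwaite approximate degrees of freedom
df_welch <- (v1 + v2)^2 /
  (v1^2 / (length(x1) - 1) + v2^2 / (length(x2) - 1))

# One-sided (upper-tail) p-value
p_value <- pt(t_stat, df = df_welch, lower.tail = FALSE)

print(c(t = t_stat, df = df_welch, p.value = p_value))
```

The Penang sample is considerably more variable than the Johor sample, which is why the approximate degrees of freedom fall below the 10 that a pooled-variance test would use.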
