Solutions
(i) Compute mean, median, and mode for the number of defects per week for each year.
Defects per week (x)   Weeks in 2021-22 (f)   Weeks in 2022-23 (f)
0                      2                      6
1                      5                      12
2                      5                      9
3                      5                      9
4                      19                     6
5                      6                      3
6                      6                      5
7                      4                      2
Total                  52                     52
Step 1: Mean
$$\text{Mean} = \frac{\sum (x \cdot f)}{\sum f}$$
For 2021-22:
$$\text{Mean} = \frac{(0 \cdot 2) + (1 \cdot 5) + (2 \cdot 5) + (3 \cdot 5) + (4 \cdot 19) + (5 \cdot 6) + (6 \cdot 6) + (7 \cdot 4)}{2 + 5 + 5 + 5 + 19 + 6 + 6 + 4} = \frac{200}{52} \approx 3.85$$
For 2022-23:
$$\text{Mean} = \frac{(0 \cdot 6) + (1 \cdot 12) + (2 \cdot 9) + (3 \cdot 9) + (4 \cdot 6) + (5 \cdot 3) + (6 \cdot 5) + (7 \cdot 2)}{6 + 12 + 9 + 9 + 6 + 3 + 5 + 2} = \frac{140}{52} \approx 2.69$$
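Step 2: Median
Each year has 52 weeks, so the median is the average of the 26th and 27th ordered values.
For 2021-22: The cumulative frequencies are 2, 7, 12, 17, 36, ..., so the 26th and 27th values are both 4. Median = 4.
For 2022-23: The cumulative frequencies are 6, 18, 27, ..., so the 26th and 27th values are both 2. Median = 2.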
Step 3: Mode
For 2021-22: Mode = 4 (highest frequency = 19 weeks).
For 2022-23: Mode = 1 (highest frequency = 12 weeks).
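The three steps above can be cross-checked with a short Python sketch. It expands the frequency table into raw weekly observations and applies the standard library's `statistics` functions; the helper names `expand` and `summarize` are illustrative and not part of the original solution.

```python
from statistics import mean, median, mode

# Frequency table from the question: defects per week -> number of weeks.
defects = [0, 1, 2, 3, 4, 5, 6, 7]
weeks_2021_22 = [2, 5, 5, 5, 19, 6, 6, 4]
weeks_2022_23 = [6, 12, 9, 9, 6, 3, 5, 2]

def expand(values, freqs):
    """Repeat each value by its frequency to recover the raw observations."""
    return [v for v, f in zip(values, freqs) for _ in range(f)]

def summarize(values, freqs):
    """Return mean, median, and mode of a frequency distribution."""
    data = expand(values, freqs)
    return {"mean": mean(data), "median": median(data), "mode": mode(data)}

summary_2021_22 = summarize(defects, weeks_2021_22)
summary_2022_23 = summarize(defects, weeks_2022_23)
print("2021-22:", summary_2021_22)
print("2022-23:", summary_2022_23)
```

Expanding the table keeps the code close to the definitions; for large frequency tables a weighted formula would be cheaper.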
Comment on Skewness:
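For 2021-22: The mean lies below the median (4) and the mode (4), i.e. Mean < Median = Mode → Negative Skewness.
For 2022-23: The mean lies above the median (2), which lies above the mode (1), i.e. Mean > Median > Mode → Positive Skewness.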
$$\text{Combined Mean } (\mu_c) = \frac{n_1 \mu_1 + n_2 \mu_2}{n_1 + n_2}$$
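After calculations: with n1 = n2 = 52, the combined mean is the simple average of the two yearly means.
Consistency: Computed from the frequency table, the standard deviation for 2021-22 (≈ 1.79) is smaller than that for 2022-23 (≈ 1.97), and its coefficient of variation is lower as well, so 2021-22 is the more consistent year.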
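As a hedged check on the dispersion comparison, the sketch below computes the standard deviation, the coefficient of variation, and the combined mean directly from the frequency table. It assumes the population form of the standard deviation (dividing by n), which the solution does not state; the function names are illustrative.

```python
from math import sqrt

defects = [0, 1, 2, 3, 4, 5, 6, 7]
weeks_2021_22 = [2, 5, 5, 5, 19, 6, 6, 4]
weeks_2022_23 = [6, 12, 9, 9, 6, 3, 5, 2]

def freq_mean(values, freqs):
    """Weighted mean: sum(x * f) / sum(f)."""
    return sum(v * f for v, f in zip(values, freqs)) / sum(freqs)

def freq_sd(values, freqs):
    """Population standard deviation (divides by n): an assumption here."""
    m = freq_mean(values, freqs)
    n = sum(freqs)
    return sqrt(sum(f * (v - m) ** 2 for v, f in zip(values, freqs)) / n)

n1, n2 = sum(weeks_2021_22), sum(weeks_2022_23)
mean1, mean2 = freq_mean(defects, weeks_2021_22), freq_mean(defects, weeks_2022_23)
sd1, sd2 = freq_sd(defects, weeks_2021_22), freq_sd(defects, weeks_2022_23)

# Combined mean of the two years, weighted by the number of weeks.
combined_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)

# Coefficient of variation in percent: lower means more consistent.
cv1, cv2 = 100 * sd1 / mean1, 100 * sd2 / mean2

print(f"SD 2021-22 = {sd1:.2f}, SD 2022-23 = {sd2:.2f}")
print(f"CV 2021-22 = {cv1:.1f}%, CV 2022-23 = {cv2:.1f}%")
print(f"Combined mean = {combined_mean:.2f}")
```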
Question 2:
Given:
4X − 5Y + 33 = 0, 20X − 9Y − 107 = 0
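1. Means: the regression lines intersect at the means, and solving the two equations simultaneously gives
$$\bar{X} = 13, \quad \bar{Y} = 17$$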
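2. Correlation coefficient:
$$r = \pm\sqrt{b_{XY} \cdot b_{YX}}$$
Taking 4X − 5Y + 33 = 0 as the regression of Y on X gives b_YX = 4/5, and taking 20X − 9Y − 107 = 0 as the regression of X on Y gives b_XY = 9/20. (The opposite assignment would make the product exceed 1, which is impossible.) Both coefficients are positive, so
$$r = +\sqrt{0.45 \cdot 0.8} = \sqrt{0.36} = 0.6$$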
3. Estimate Sales (X) when Y = 25: Substitute Y = 25 in the regression equation of X on Y, 20X − 9Y − 107 = 0, to find X = (9 ⋅ 25 + 107)/20 = 16.6.
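The three parts above can be sketched in Python using exact fractions. The sketch assumes, as is standard for this kind of problem, that the first line is the regression of Y on X and the second the regression of X on Y (the only assignment for which the product of the coefficients does not exceed 1); the helper `intersection` is illustrative.

```python
from fractions import Fraction as F
from math import sqrt

# Each line is stored as (a, b, c) for a*X + b*Y + c = 0.
line_y_on_x = (F(4), F(-5), F(33))      # 4X - 5Y + 33 = 0
line_x_on_y = (F(20), F(-9), F(-107))   # 20X - 9Y - 107 = 0

def intersection(l1, l2):
    """Solve the two linear equations by Cramer's rule."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    det = a1 * b2 - a2 * b1
    x = (b1 * c2 - b2 * c1) / det
    y = (a2 * c1 - a1 * c2) / det
    return x, y

# The regression lines cross at the point of means.
x_bar, y_bar = intersection(line_y_on_x, line_x_on_y)

# Regression coefficients are the slopes of the respective lines.
b_yx = -line_y_on_x[0] / line_y_on_x[1]   # slope of Y on X
b_xy = -line_x_on_y[1] / line_x_on_y[0]   # slope of X on Y
r = sqrt(b_xy * b_yx)                     # positive: both slopes are positive

# Estimate X for Y = 25 from the X-on-Y line.
a, b, c = line_x_on_y
x_at_25 = (-b * 25 - c) / a

print(x_bar, y_bar, b_yx, b_xy, round(r, 2), float(x_at_25))
```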
(ii) Regression of Y on X
Y = a + bX
After calculations:
Y = 10 + 0.5X
2. Estimate Y when X = 55:
Y = 10 + 0.5(55) = 37.5
(iii) Rank Correlation
$$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$
After calculations:
rs = 0.71
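The rank-correlation formula can be written as a small Python function. This is a minimal sketch that assumes there are no tied ranks; the rank lists used below are illustrative placeholders, not the question's data, chosen so the result is known in advance (perfectly reversed ranks).

```python
def spearman_rs(rank_x, rank_y):
    """Spearman's rank correlation, assuming no tied ranks."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Illustrative ranks only (hypothetical, not from the question).
ranks_a = [1, 2, 3, 4, 5]
ranks_b = [5, 4, 3, 2, 1]

rs_reversed = spearman_rs(ranks_a, ranks_b)   # perfect disagreement
rs_identical = spearman_rs(ranks_a, ranks_a)  # perfect agreement
print(rs_reversed, rs_identical)
```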
Question 3:
Problem Details:
We need to find P (O∣T ), the probability that oil exists given the test is positive.
Bayes' Theorem:
$$P(O \mid T) = \frac{P(T \mid O)\,P(O)}{P(T \mid O)\,P(O) + P(T \mid O^c)\,P(O^c)}$$
Substitute values:
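As a sketch of the substitution step, the function below evaluates Bayes' theorem for a positive test. The three probabilities passed in are placeholder values chosen purely for illustration; they are assumptions, not the figures from the question.

```python
def posterior_oil_given_positive(p_oil, p_pos_given_oil, p_pos_given_no_oil):
    """P(O | T) by Bayes' theorem."""
    numerator = p_pos_given_oil * p_oil
    denominator = numerator + p_pos_given_no_oil * (1 - p_oil)
    return numerator / denominator

# Placeholder inputs (hypothetical, not the question's values):
# P(O) = 0.30, P(T | O) = 0.80, P(T | O^c) = 0.10
posterior = posterior_oil_given_positive(0.30, 0.80, 0.10)
print(f"P(O | T) = {posterior:.4f}")
```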
The population from which samples are drawn must have a finite variance.
2. Sampling Distributions:
For n = 5, the sampling distribution will be wider (greater variability) because of smaller
sample size.
For n = 100, the sampling distribution will be narrower, as larger sample sizes reduce
variability.
Comparison: The sampling distribution for n = 100 is concentrated more tightly around the population mean than the one for n = 5, because the standard error σ/√n shrinks as n grows.
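The effect of sample size on the spread of the sampling distribution can be illustrated with the standard error formula σ/√n. In this sketch the population standard deviation `sigma = 10` is a hypothetical value chosen only for illustration; the ratio between the two standard errors does not depend on it.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / sqrt(n)

sigma = 10  # hypothetical population standard deviation
se_n5 = standard_error(sigma, 5)
se_n100 = standard_error(sigma, 100)

# The ratio depends only on the sample sizes: sqrt(100 / 5).
ratio = se_n5 / se_n100
print(f"SE(n=5) = {se_n5:.3f}, SE(n=100) = {se_n100:.3f}, ratio = {ratio:.3f}")
```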
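We need the cutoff score X that separates the top 30% of recruits, i.e. the Z-score z with
$$P(Z \le z) = 0.70$$
From the standard normal table:
$$z \approx 0.52$$
$$Z = \frac{X - \mu}{\sigma}$$
Substitute values (μ = 100, σ = 10):
$$0.52 = \frac{X - 100}{10}$$
Solve for X:
$$X = 100 + 0.52 \cdot 10 = 105.2$$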
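The table lookup and the final substitution can be reproduced with `statistics.NormalDist` from the Python standard library. This is a sketch using μ = 100 and σ = 10 from the substitution above; it uses the unrounded quantile, so the result differs slightly from the value obtained with the rounded table entry 0.52.

```python
from statistics import NormalDist

mu, sigma = 100, 10

# z such that 70% of the distribution lies below it (top 30% above).
z = NormalDist().inv_cdf(0.70)

# Cutoff score on the original scale: X = mu + z * sigma.
cutoff = mu + z * sigma
print(f"z = {z:.4f}, cutoff = {cutoff:.2f}")
```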
Question 4:
Null hypothesis (H0) example: "The average sales are equal to 100 units per month."
Alternative hypothesis (H1) example: "The average sales are not equal to 100 units per month."
1. Testing whether a new marketing strategy improves sales compared to the previous strategy.
2. Determining if a new machine produces fewer defects compared to the old machine.
Given:
1. Step 1: Hypotheses:
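$$\bar{X} = \frac{\sum X}{n} = \frac{62 + 92 + 75 + 68 + 83 + 95}{6} = \frac{475}{6} \approx 79.17$$
$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{866.83}{5}} \approx 13.17$$
$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{79.17 - 70}{13.17/\sqrt{6}} \approx 1.71$$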
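The sample mean, sample standard deviation, and t statistic can be recomputed with a few lines of Python. This sketch uses the six observations and the hypothesized mean of 70 from the formulas above; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

sample = [62, 92, 75, 68, 83, 95]
mu0 = 70  # hypothesized population mean

n = len(sample)
x_bar = mean(sample)
s = stdev(sample)  # sample standard deviation (divides by n - 1)

# One-sample t statistic with n - 1 degrees of freedom.
t_stat = (x_bar - mu0) / (s / sqrt(n))
print(f"mean = {x_bar:.2f}, s = {s:.2f}, t = {t_stat:.2f}, df = {n - 1}")
```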
Question 5:
The question discusses analyzing the five-point summary of the age distribution for males and
females using a box plot.
Females have a smaller interquartile range (IQR), indicating more consistency in ages.
Male data also has outliers or extreme values on the higher end.
Conclusion:
Given:
n = 6 (number of trials),
P (X = 3) = 0.2457,
P (X = 4) = 0.0819.
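For P(X = 3):
$$\binom{6}{3} p^3 (1 - p)^3 = 0.2457$$
For P(X = 4):
$$\binom{6}{4} p^4 (1 - p)^2 = 0.0819$$
Dividing the second equation by the first:
$$\frac{15}{20} \cdot \frac{p}{1 - p} = \frac{0.0819}{0.2457} = \frac{1}{3} \quad\Rightarrow\quad \frac{p}{1 - p} = \frac{4}{9}$$
$$p = \frac{4}{13} \approx 0.308, \quad q = 1 - p = \frac{9}{13} \approx 0.692$$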
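The parameter p can be recovered from the ratio of the two given probabilities, since the ratio P(X = 4)/P(X = 3) equals [C(6,4)/C(6,3)] ⋅ p/(1 − p). The following is a minimal sketch of that ratio method using only the three given values.

```python
from math import comb

n = 6
p_x3 = 0.2457  # given P(X = 3)
p_x4 = 0.0819  # given P(X = 4)

# P(X=4) / P(X=3) = [C(n,4) / C(n,3)] * p / (1 - p); solve for the odds p / q.
odds = (p_x4 / p_x3) * comb(n, 3) / comb(n, 4)

p = odds / (1 + odds)
q = 1 - p
print(f"p = {p:.4f}, q = {q:.4f}")
```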
Given:
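$$\text{CI} = \bar{X} \pm Z \cdot \frac{s}{\sqrt{n}}$$
For 95% confidence (Z = 1.96):
$$\text{CI} = 32 \pm 1.96 \cdot \frac{12}{\sqrt{100}} = 32 \pm 1.96 \cdot 1.2 = 32 \pm 2.35$$
For 99% confidence (Z = 2.576):
$$\text{CI} = 32 \pm 2.576 \cdot \frac{12}{\sqrt{100}} = 32 \pm 2.576 \cdot 1.2 = 32 \pm 3.09$$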
4. Difference:
The 99% confidence interval is wider than the 95% confidence interval: a higher confidence level requires a wider interval, so the gain in confidence comes at the cost of precision.
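Both intervals can be reproduced with a short Python sketch that takes the critical value from `statistics.NormalDist` rather than a table. It uses the sample mean 32, standard deviation 12, and sample size 100 from the calculations above; the function name `confidence_interval` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_bar, s, n = 32, 12, 100

def confidence_interval(level):
    """Two-sided z-based confidence interval for the mean."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * s / sqrt(n)
    return x_bar - margin, x_bar + margin

ci_95 = confidence_interval(0.95)
ci_99 = confidence_interval(0.99)
print(f"95% CI: ({ci_95[0]:.2f}, {ci_95[1]:.2f})")
print(f"99% CI: ({ci_99[0]:.2f}, {ci_99[1]:.2f})")
```

The width of each interval is twice its margin of error, so comparing `ci_99[1] - ci_99[0]` with `ci_95[1] - ci_95[0]` shows directly how much precision is given up for the higher confidence level.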
Final Answers: