
EXAM (1)


Souhaila majdouf 3115

Problem 1:

• Data: Sales data split into 6 classes.


• Results:

Class Interval   Midpoint   Frequency   Relative Frequency   Cumulative Frequency
1000−1949        1474.5     7           33.33%               7
1950−2899        2424.5     6           28.57%               13
2900−3849        3374.5     3           14.29%               16
3850−4799        4324.5     2           9.52%                18
4800−5749        5274.5     1           4.76%                19
5750−6699        6224.5     2           9.52%                21
• Greatest Frequency: 1000−1949 (7 sales).
• Least Frequency: 4800−5749 (1 sale).
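
The midpoint, relative frequency, and cumulative frequency columns can be rebuilt from the class limits and counts alone. The following is a minimal sketch, assuming the class limits and frequencies listed in the table; the variable names are illustrative and not part of the original solution.

```python
# Class limits and frequencies as listed in the frequency table.
classes = [(1000, 1949), (1950, 2899), (2900, 3849),
           (3850, 4799), (4800, 5749), (5750, 6699)]
freqs = [7, 6, 3, 2, 1, 2]

total = sum(freqs)          # number of sales in the sample
rows = []
cumulative = 0
for (lo, hi), f in zip(classes, freqs):
    cumulative += f                     # running total of frequencies
    midpoint = (lo + hi) / 2            # average of the class limits
    rel = round(100 * f / total, 2)     # relative frequency in percent
    rows.append((lo, hi, midpoint, f, rel, cumulative))

for lo, hi, mid, f, rel, cum in rows:
    print(f"{lo}-{hi}  {mid:7.1f}  {f:2d}  {rel:6.2f}%  {cum:2d}")

# Classes with the greatest and least frequency.
greatest = classes[freqs.index(max(freqs))]
least = classes[freqs.index(min(freqs))]
```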

Problem 2: Descriptive Statistics

Data Set:

5816, 6045, 5612, 6341, 6106, 7361, 6320, 6265,


7220, 7439, 5395, 6908, 5561, 5710, 5538, 6632

(1) Five-Number Summary & Other Statistics

Step 1: Sort the Data

Sorted data: 5395, 5538, 5561, 5612, 5710, 5816, 6045, 6106, 6265, 6320, 6341, 6632, 6908, 7220, 7361, 7439

Step 2: Calculate the Five-Number Summary

• Minimum: 5395
• Maximum: 7439

Median (Q2): The median is the middle value of the dataset. Since there are 16
data points (an even number), the median is the average of the 8th and 9th
values:

Median = (6106 + 6265)/2 = 6185.5

First Quartile (Q1): The median of the lower half of the data (first 8 values),
which is the average of the 4th and 5th values:

Q1 = (5612 + 5710)/2 = 5661

Third Quartile (Q3): The median of the upper half of the data (last 8 values),
which is the average of the 12th and 13th values:

Q3 = (6632 + 6908)/2 = 6770

So the five-number summary is:

• Minimum = 5395
• Q1 = 5661
• Median (Q2) = 6185.5
• Q3 = 6770
• Maximum = 7439
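
The five-number summary can be checked with a short script. This is a sketch that assumes the median-of-halves convention used above (with n = 16 the two halves share no value); other quartile conventions can give slightly different Q1 and Q3. The variable names are illustrative.

```python
from statistics import median

data = [5816, 6045, 5612, 6341, 6106, 7361, 6320, 6265,
        7220, 7439, 5395, 6908, 5561, 5710, 5538, 6632]

s = sorted(data)
half = len(s) // 2          # n is even, so lower and upper halves are disjoint

minimum, maximum = s[0], s[-1]
q1 = median(s[:half])       # median of the 8 smallest values
q2 = median(s)              # average of the 8th and 9th sorted values
q3 = median(s[half:])       # median of the 8 largest values

five = (minimum, q1, q2, q3, maximum)
print(five)
```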

Step 3: Mean, Mode, Range, Variance, and Standard Deviation

• Mean:
The mean is the sum of all the values divided by the number of data points:

Mean = (5395+5538+5561+5612+5710+5816+6045+6106+6265+6320+6341+6632+6908+7220+
7361+7439)/16 = 100269/16 ≈ 6266.81

• Mode:
Since no values repeat in the dataset, there is no mode.

• Range:
The range is the difference between the maximum and minimum values:
Range=7439−5395=2044

• Variance:
Variance is the average of the squared differences from the mean:

Variance = (1/16) ∑ (xi − μ)^2, summed over i = 1 to 16

Where xi are the data points and μ ≈ 6266.81. The squared differences sum to about
6839164.44, so averaging gives:

Variance ≈ 6839164.44/16 ≈ 427447.78

This divides by n = 16, which treats the data as a population. Dividing by n − 1 = 15
instead gives the sample variance, approximately 455944.30.

• Standard Deviation:
The standard deviation is the square root of the variance:

Standard Deviation ≈ √427447.78 ≈ 653.79

To describe the shape of the distribution:

• Skewness:
The distribution is slightly skewed to the right (positively skewed) because the mean
(6266.81) is greater than the median (6185.5), indicating that there are a few higher
values that pull the mean to the right.
• Reasoning:
In a right-skewed distribution, the tail on the right side is longer or fatter than the left
side. In this case, the values towards the higher end (such as 7361, 7439) contribute to the
right skew.
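
The mean, range, variance, and standard deviation can be verified with Python's standard `statistics` module. This is a sketch, not part of the original solution: `pvariance`/`pstdev` divide by n (the formula used above), while `variance`/`stdev` divide by n − 1 and are shown only for comparison.

```python
from statistics import mean, median, pvariance, pstdev, variance, stdev

data = [5816, 6045, 5612, 6341, 6106, 7361, 6320, 6265,
        7220, 7439, 5395, 6908, 5561, 5710, 5538, 6632]

mu = mean(data)                      # sum of values divided by n
data_range = max(data) - min(data)   # maximum minus minimum

pop_var = pvariance(data)            # divides by n (population formula)
pop_sd = pstdev(data)
samp_var = variance(data)            # divides by n - 1 (sample formula)
samp_sd = stdev(data)

# A mean above the median is a common informal sign of right skew.
mean_above_median = mu > median(data)

print(round(mu, 2), data_range, round(pop_var, 2), round(pop_sd, 2))
```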

Problem 3: Confidence Interval

To construct a 99% confidence interval for the population mean number of days the car model
sits on the dealership’s lot, we'll use the following information:

• Sample mean (xˉ) = 9.75 days


• Sample standard deviation (s) = 2.39 days
• Sample size (n) = 36
• Degrees of freedom (df) = 35 (since df =n−1)
• t-value for 99% confidence (t) = 2.724 (from the given values for df=35)

Step 1: Confidence Interval Formula

The formula for a confidence interval for the population mean when the population standard
deviation is unknown (so the sample standard deviation s and the t-distribution are used) is:

Confidence Interval = xˉ ± t ×s/√𝑛


Where:

• xˉ is the sample mean
• t is the t-value for the given confidence level and degrees of freedom
• s is the sample standard deviation
• n is the sample size

Step 2: Plug in the Values

• xˉ=9.75
• t=2.724 (for 99% confidence and df = 35)
• s=2.39
• n=36

Now, calculate the standard error of the mean:

Standard Error = s/√n = 2.39/√36 = 2.39/6 ≈ 0.3983

Step 3: Calculate the Margin of Error

The margin of error is:

Margin of Error = t × Standard Error = 2.724 × 0.3983 ≈ 1.085

Step 4: Construct the Confidence Interval

Now, calculate the confidence interval:

Confidence Interval = 9.75±1.085

So:
Lower limit = 9.75−1.085=8.665

Upper limit = 9.75+1.085=10.835

Final Answer:

The 99% confidence interval for the population mean number of days the car model sits on the
dealership’s lot is:

(8.665,10.835)
This means we are 99% confident that the true population mean for the number of days the car
model sits on the dealership’s lot falls between 8.665 and 10.835 days.
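
The interval arithmetic can be reproduced in a few lines. This sketch takes the critical value t = 2.724 as given in the problem rather than recomputing it; the variable names are illustrative.

```python
import math

x_bar = 9.75      # sample mean (days)
s = 2.39          # sample standard deviation (days)
n = 36            # sample size
t_crit = 2.724    # critical t for 99% confidence, df = 35, as quoted in the text

std_error = s / math.sqrt(n)          # s / sqrt(n)
margin = t_crit * std_error           # margin of error
lower, upper = x_bar - margin, x_bar + margin

print(round(std_error, 4), round(margin, 3), round(lower, 3), round(upper, 3))
```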

Problem 4: Hypothesis Test

Given Information:

• Claim: The mean annual cost of raising a child is μ=14,050


• Sample mean: xˉ=13,795
• Population standard deviation: σ=2875
• Sample size: n=500
• Significance level: α=0.10

Step 1: Identify the Null and Alternative Hypotheses

The hypotheses are:

• Null Hypothesis (H0): μ=14,050 (The mean cost is $14,050.)


• Alternative Hypothesis (Ha): μ≠14,050 (The mean cost is not $14,050.)

Step 2: Find the Critical Value

At α=0.10 the significance level is split between the two tails of the normal distribution
(α/2=0.05 in each tail). Using a Z-table or standard normal critical values:

• Critical values: Z = ±1.645

Step 3: Calculate the Standardized Test Statistic

The formula for the z-test statistic is:

Z = (xˉ − μ) / (σ/√n)

Substitute the given values:

Z = (13,795 − 14,050) / (2875/√500)

First, calculate the denominator (σ/√n):

σ/√n = 2875/√500 ≈ 2875/22.3607 ≈ 128.57

Now calculate Z:

Z = (13,795 − 14,050)/128.57 = −255/128.57 ≈ −1.98

Step 4: Decision Rule

• Reject H0 if the test statistic falls in the rejection region (Z<−1.645 or Z>1.645).
• In this case, Z=−1.98, which is less than −1.645.

Decision: Reject H0.

Step 5: Interpretation in Context

At the α=0.10 significance level, there is sufficient evidence to reject the claim that the mean
annual cost of raising a child by married-couple families in the U.S. is $14,050. Instead, the
sample suggests that the actual mean cost differs from $14,050.
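
The two-tailed z-test can be sketched with the standard library's `NormalDist`, which supplies both the critical value and a p-value. The p-value is an addition for illustration only; the solution above uses the critical-value approach. Variable names are assumptions.

```python
import math
from statistics import NormalDist

mu0 = 14050       # claimed population mean
x_bar = 13795     # sample mean
sigma = 2875      # population standard deviation
n = 500           # sample size
alpha = 0.10      # significance level

std_error = sigma / math.sqrt(n)
z = (x_bar - mu0) / std_error

# Two-tailed test: alpha/2 in each tail of the standard normal distribution.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject = abs(z) > z_crit

# Two-tailed p-value (not used in the original solution, shown for comparison).
p_value = 2 * NormalDist().cdf(-abs(z))

print(round(z, 2), round(z_crit, 3), reject)
```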

Problem 5:

Given Information:

• The data for vertical jump heights before and after training is as follows:

Athlete   Before (X1)   After (X2)   Difference (D = X2 − X1)
1         24            26            2
2         22            25            3
3         25            25            0
4         28            29            1
5         35            33           −2
6         32            34            2
7         30            35            5
8         27            30            3
• Significance level: α=0.10
• This is a paired t-test since the same athletes are measured before and after.

Step 1: Identify the Hypotheses

• Null Hypothesis (H0): The mean difference in jump heights is zero (μD=0).
There is no significant improvement in vertical jump heights.
• Alternative Hypothesis (Ha): The mean difference in jump heights is greater than zero
(μD>0).
The training shoes improve vertical jump heights.

Step 2: Calculate the Critical Value

The degrees of freedom (df) is:

df= n−1 =8−1=7


At α=0.10, using a one-tailed t-distribution table, the critical value t0.10,7 is:

tcritical=1.415

Step 3: Compute the Test Statistic

The test statistic for a paired t-test is:

t = Dˉ / (sD/√n)

Where:

• Dˉ= Mean of the differences.


• sD = Standard deviation of the differences.
• n= Number of pairs.

Step 3.1: Calculate Dˉ (Mean of Differences)

Dˉ = ∑D/n = (2+3+0+1−2+2+5+3)/8 = 14/8 = 1.75

Step 3.2: Calculate sD (Standard Deviation of Differences)

sD = √( ∑(Di − Dˉ)^2 / (n − 1) )

First, calculate the deviations:


(Di−Dˉ)=[ 2−1.75,3−1.75,0−1.75,1−1.75,−2−1.75,2−1.75,5 −1.75,3−1.75]

= [0.25,1.25,−1.75,−0.75,−3.75,0.25,3.25,1.25]

Square the deviations:

(Di−Dˉ)^2=[0.0625,1.5625,3.0625,0.5625,14.0625,0.0625,10.5625,1.5625]

Sum the squared deviations:

∑(Di − Dˉ)^2 = 31.5

Calculate sD:

sD = √(31.5/(8 − 1)) = √4.5 ≈ 2.12

Step 3.3: Calculate the Test Statistic

t = Dˉ/(sD/√n) = 1.75/(2.12/√8) ≈ 1.75/0.75 ≈ 2.33

Step 4: Decision

Compare the test statistic t=2.33 with the critical value tcritical=1.415:

• Since t=2.33>1.415, reject the null hypothesis (H0).

Step 5: Interpretation

At the α=0.10 significance level, there is enough evidence to support the claim that the training
shoes significantly increase athletes' vertical jump heights.
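
The paired t-test steps can be reproduced as follows. This sketch uses the before/after jump heights from the table and the one-tailed critical value 1.415 quoted in the solution; variable names are illustrative.

```python
import math
from statistics import mean, stdev

before = [24, 22, 25, 28, 35, 32, 30, 27]
after = [26, 25, 25, 29, 33, 34, 35, 30]

# Differences D = After - Before for each athlete.
d = [a - b for a, b in zip(after, before)]
n = len(d)

d_bar = mean(d)                                 # mean of the differences
ss = sum((di - d_bar) ** 2 for di in d)         # sum of squared deviations
s_d = stdev(d)                                  # sample SD (divides by n - 1)
t = d_bar / (s_d / math.sqrt(n))                # paired t statistic

t_crit = 1.415    # one-tailed, alpha = 0.10, df = 7, from the t-table
reject = t > t_crit

print(d, d_bar, ss, round(s_d, 2), round(t, 2), reject)
```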

Problem 6:

Step 1: Display the Data in a Scatter Plot and Describe Correlation

The data for hours studied (x) and test scores (y) is:

Hours (x) Scores (y)


0 40
2 51
4 64
5 69
5 73
5 75
6 93
7 90
8 95
9 97
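Scatter Plot

To create the scatter plot:

1. Plot hours studied (x) on the x-axis and test scores (y) on the y-axis.
2. Mark the data points (e.g., (0,40), (2,51), ...).
3. Look for a trend. Here the points rise steadily from lower left to upper right, which
suggests a strong positive linear correlation.

Step 2: Calculate the Correlation Coefficient

The correlation coefficient r is:

r = (n∑xy − ∑x∑y) / √( [n∑x^2 − (∑x)^2] [n∑y^2 − (∑y)^2] )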

Step 2.1: Compute the Sums


x    y    x^2   y^2    xy
0    40   0     1600   0
2    51   4     2601   102
4    64   16    4096   256
5    69   25    4761   345
5    73   25    5329   365
5    75   25    5625   375
6    93   36    8649   558
7    90   49    8100   630
8    95   64    9025   760
9    97   81    9409   873

∑x=51, ∑y=747, ∑x^2=325, ∑y^2=59195, ∑xy=4264

Step 2.2: Calculate r

Substitute into the formula (n = 10):

r = [10(4264) − (51)(747)] / √( [10(325) − (51)^2] [10(59195) − (747)^2] )

Simplify the terms:

r = (42640 − 38097) / √( (3250 − 2601)(591950 − 558009) ) = 4543 / √(649 × 33941)

r = 4543 / √22027709

≈ 0.968

Step 2.3: Test the Significance of r

Hypotheses:

• H0: ρ=0 (no correlation).
• Ha: ρ≠0 (correlation exists).

The test statistic is:

t = r√(n−2) / √(1−r^2) = 0.968√(10−2) / √(1−(0.968)^2) = 0.968√8 / √(1−0.937)

= (0.968 × 2.828)/0.251 ≈ 10.91

Degrees of freedom: df=n−2=8. At α=0.05 (two-tailed), the critical value
from a t-table is t0.05,8=2.306.

• Since t=10.91>2.306, reject H0. There is a significant linear correlation between hours
studied and test scores.

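
The sums, r, and the t statistic can be checked directly from the raw data. This is a sketch with illustrative variable names; because it carries r at full precision instead of rounding to 0.968 first, its t value differs from the hand calculation in the second decimal place.

```python
import math

x = [0, 2, 4, 5, 5, 5, 6, 7, 8, 9]              # hours studied
y = [40, 51, 64, 69, 73, 75, 93, 90, 95, 97]    # test scores
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_x2 = sum(v * v for v in x)
sum_y2 = sum(v * v for v in y)
sum_xy = sum(a * b for a, b in zip(x, y))

# Pearson correlation coefficient from the computational formula.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

# t statistic for H0: rho = 0, with df = n - 2.
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
t_crit = 2.306    # two-tailed, alpha = 0.05, df = 8, from the t-table
reject = abs(t) > t_crit

print(sum_x, sum_y, sum_x2, sum_y2, sum_xy, round(r, 3), reject)
```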
Step 3: Find the Equation of the Regression Line

The regression equation is:

y=mx+b

Where:

• m = (n∑xy − ∑x∑y) / (n∑x^2 − (∑x)^2)
• b = (∑y − m∑x) / n

Step 3.1: Calculate m

m = [10(4264) − (51)(747)] / [10(325) − (51)^2] = 4543/649 = 7.0

Step 3.2: Calculate b

b = (747 − (7.0)(51)) / 10 = 390/10 = 39.0

Regression Equation:
y=7.0x+39.0
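
The slope and intercept follow from the same sums used for r. The sketch below recomputes them and adds a small `predict` helper, which is a hypothetical convenience function and not part of the original solution.

```python
# Sums from the Step 2.1 calculations (n = 10 data points).
n = 10
sum_x, sum_y = 51, 747
sum_x2, sum_xy = 325, 4264

# Least-squares slope and intercept.
m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b = (sum_y - m * sum_x) / n


def predict(hours):
    """Predicted test score for a given number of study hours (hypothetical helper)."""
    return m * hours + b


print(m, b, predict(5))
```

Predictions are only meaningful for study times within the observed range of 0 to 9 hours.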

Problem 7: Chi-Square Test for Independence

Given Data:

Gender   Cup   Cone   Sundae   Sandwich   Other   Total
Male     504   287    182      43         53      1069
Female   474   401    158      45         50      1128
Total    978   688    340      88         103     2197


(1) Expected Frequencies

The formula for expected frequency (Eij) is:

Eij = (Row Total)(Column Total) / Grand Total

For each cell, calculate (row total × column total)/2197. For example,
E(Male, Cup) = 1069×978/2197 ≈ 475.87 and E(Female, Cone) = 1128×688/2197 ≈ 353.24.

Gender   Cup      Cone     Sundae   Sandwich   Other   Total
Male     475.87   334.76   165.43   42.82      50.12   1069
Female   502.13   353.24   174.57   45.18      52.88   1128

The formula for the test statistic is:

χ2 = ∑ (Oij−Eij)^2/Eij

Where Oij are the observed frequencies, and Eij are the expected
frequencies. Calculate (Oij−Eij)^2/Eij for each cell. For example, for Male/Cup:
(504 − 475.87)^2/475.87 ≈ 1.663.

Gender   Cup     Cone    Sundae   Sandwich   Other   Total Contribution
Male     1.663   6.814   1.659    0.001      0.166   10.303
Female   1.576   6.458   1.572    0.001      0.157   9.764

Total χ2: 10.303 + 9.764 ≈ 20.07

Degrees of Freedom:

df=(Rows−1)(Columns−1)=(2−1)(5−1)=4

Critical Value:

At α=0.01, from the chi-square distribution table, the critical value
for df=4 is 13.277.

Decision:

• Since χ2=20.07>13.277, reject H0 (the hypothesis that gender and favorite way to eat
ice cream are independent).

Interpretation:

There is enough evidence to conclude that gender and favorite way to
eat ice cream are related.
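
The expected frequencies and the chi-square statistic can be computed from the observed table alone. This is a sketch using only built-in Python; the critical value 13.277 is taken from the table lookup above rather than computed, and the variable names are illustrative.

```python
# Observed counts: rows are Male, Female; columns are
# Cup, Cone, Sundae, Sandwich, Other.
observed = [
    [504, 287, 182, 43, 53],
    [474, 401, 158, 45, 50],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell: (row total * column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (O - E)^2 / E over all cells.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)
chi2_crit = 13.277    # alpha = 0.01, df = 4, from the chi-square table
reject = chi2 > chi2_crit

print(round(chi2, 2), df, reject)
```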

Problem 8: One-Way ANOVA Test

Given Data:

• Very Good: 0.47,0.49,0.41,0.37,0.48,0.51


• Good: 0.60,0.64,0.58,0.75,0.46
• Fair: 0.34,0.46,0.44,0.60

(1) Hypotheses

• Null Hypothesis (H0): All group means are equal (μ1=μ2=μ3).


• Alternative Hypothesis (Ha): At least one group mean is different

Step 1: Compute Means and Variances

Group       Values                          n   Mean (xˉ)   Variance (s2)
Very Good   0.47,0.49,0.41,0.37,0.48,0.51   6   0.455       0.00287
Good        0.60,0.64,0.58,0.75,0.46        5   0.606       0.01098
Fair        0.34,0.46,0.44,0.60             4   0.460       0.01147

Step 2: Compute Overall Mean

Overall Mean = ∑x / total observations = (0.47+0.49+⋯+0.60)/15 = 7.60/15 ≈ 0.507

Step 3: Compute SSBetween and SSWithin

1. Between-Group Variation:

SSBetween = ∑ ni(xˉi − xˉoverall)^2

2. Within-Group Variation:

SSWithin = ∑(ni − 1)s^2i

After calculations:

• SSBetween ≈ 0.0741
• SSWithin ≈ 0.0927

Step 4: Compute F-Statistic

F = MSBetween/MSWithin = [SSBetween/(k−1)] / [SSWithin/(N−k)]

Where:

• k=3 (number of groups),
• N=15 (total observations).

After calculations:

F = (0.0741/2)/(0.0927/12) ≈ 0.0370/0.00772 ≈ 4.80

(3) Decision

At α=0.05, critical F2,12=3.88. Since F≈4.80>3.88, reject H0. There is enough evidence that
at least one group mean differs from the others.
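
The sums of squares and the F statistic can be recomputed directly from the three samples. This is a sketch using only the standard library; the critical value 3.88 is taken from the F-table lookup above, and the variable names are illustrative.

```python
from statistics import mean

groups = {
    "Very Good": [0.47, 0.49, 0.41, 0.37, 0.48, 0.51],
    "Good": [0.60, 0.64, 0.58, 0.75, 0.46],
    "Fair": [0.34, 0.46, 0.44, 0.60],
}

all_values = [v for g in groups.values() for v in g]
grand_mean = mean(all_values)
k = len(groups)             # number of groups
n_total = len(all_values)   # total number of observations

# Between-group sum of squares: n_i * (group mean - overall mean)^2.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())

# Within-group sum of squares: squared deviations from each group's own mean.
ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)

ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f_stat = ms_between / ms_within

f_crit = 3.88               # alpha = 0.05, df = (2, 12), from the F-table
reject = f_stat > f_crit

print(round(ss_between, 4), round(ss_within, 4), round(f_stat, 2), reject)
```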
