Homework 9 Answers

The document outlines a statistical analysis assignment involving multiple problems, including ANOVA tests to compare sales across stores, the effect of box colors on cereal sales, and tune-up times for cars using different analyzers. Each problem requires hypothesis testing, data analysis, and conclusions based on statistical methods. Additionally, a case study on an online travel agency's website visitor data is included, focusing on the impact of background color and font on visitor engagement.

Assignment-9 POM 500 Statistical Analysis Note: Attempt all questions as per rubric.

Each problem, including the case study, carries a weightage of 10 marks. The maximum you can score is
50. Use Excel functions wherever possible.

Problem-1 Halls, Inc. has three stores located in three different areas. Random samples of the
sales of the three stores (in $1,000) are shown below. Store 1: 46, 47, 45, 42, 45. Store 2: 34,
36, 35, 39. Store 3: 33, 31, 35. At 95% confidence, test to see if there is a significant
difference in the average sales of the three stores.

This problem calls for an Analysis of Variance (ANOVA) test. ANOVA is a statistical method used
to test differences between two or more means. It may seem odd that we are using ANOVA to test a
hypothesis about means (averages). However, ANOVA works by comparing the variance between the
group means with the variance within the groups.

Before we start, let's organize the data in a table:


Store 1   Store 2   Store 3
46        34        33
47        36        31
45        35        35
42        39
45

Steps to perform ANOVA test
1. State the hypotheses. The null hypothesis will be that all means (average sales of the
three stores) are equal. The alternative hypothesis is that at least one mean is
different.
2. Calculate the F statistic. The F statistic is a ratio of two variances. Variances are a
measure of dispersion, or how far the data are scattered from the mean. Larger
values represent greater dispersion.
3. Find the F critical value. The F critical value is a cut-off point which the F statistic is
compared against. If the F statistic is larger than the F critical value, you can reject
the null hypothesis.
4. Make a decision. If the F statistic is larger than the F critical value, you can reject
the null hypothesis; otherwise, you fail to reject it.

Please note that you will need statistical software or a calculator to calculate the F
statistic and the F critical value. Also, remember to check the assumptions of ANOVA:
independence, normality, and equality of variances.
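
The steps above can be sketched in a few lines of Python. This is a minimal illustration that uses only the standard library; the function name `one_way_anova` is hypothetical and not part of the assignment. It computes the sums of squares, degrees of freedom, and F statistic for the Problem-1 store data. The critical value still has to come from an F table or from Excel's F.INV.RT.

```python
def one_way_anova(groups):
    """Return (ss_between, ss_within, df_between, df_within, f_stat)."""
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total

    # Between-group variation: group size times squared distance of each group mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: squared distance of each value from its own group mean
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat


stores = [
    [46, 47, 45, 42, 45],  # Store 1
    [34, 36, 35, 39],      # Store 2
    [33, 31, 35],          # Store 3
]
ssb, ssw, dfb, dfw, f_stat = one_way_anova(stores)
print(ssb, ssw, dfb, dfw, f_stat)
```

The groups are passed as separate lists because the samples are of unequal size (5, 4, and 3 observations).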

Problem-2 A manufacturer of cereal is considering 3 alternative box colors – red, yellow, and
blue. To check the effect on sales, 16 stores of approximately equal size are chosen. Red boxes
are sent to 6 stores, yellow boxes to 5, and blue boxes to the remaining 5. The following results
(in tens of boxes) are obtained:

Red   Yellow   Blue
43    52       61
52    37       29
59    38       38
76    64       53
61    74       79
81

Analyze this data and draw appropriate conclusions.

Problem-3 An automobile dealer conducted a test to determine whether the time needed to
complete a minor engine tune-up depends on whether a computerized engine analyzer or an
electronic analyzer is used. Because tune-up time varies among compact, intermediate, and full-sized
cars, the three types of cars were used as blocks in the experiment. The data (time in
minutes) was obtained as follows.

                Car
Analyzer        Compact   Intermediate   Full-sized
Computerized    50        55             63
Electronic      42        44             46

Use α = .05 to test for any significant differences. What is the p-value? What is your
conclusion?

Problem-4 An agricultural experiment designed to assess differences in yields of corn for 4
different varieties, using 3 different fertilizers, produced the results (in bushels per acre) as
below:

              Variety
Fertilizer    A     B     C     D
I             86    88    77    84
              85    89    80    81
II            92    91    81    93
              90    94    77    94
III           75    80    83    79
              71    77    83    78

Analyze this data and draw appropriate conclusions.

1. Sample Mean

The sample mean is the average of the data points. It is calculated by adding all the
data points and dividing by the number of data points.
For the given data:
20, -20.5, 12.2, 12.6, 10.5, -5.8, -18.7, 15.3

The sample mean is calculated as follows:


(20 - 20.5 + 12.2 + 12.6 + 10.5 - 5.8 - 18.7 + 15.3) / 8 = 3.2

The interpretation of the sample mean is that it represents the average quarterly
percent total return for General Electric over the sample period.

2. Sample Variance and Standard Deviation

The sample variance is a measure of how spread out the numbers in the data set
are. It is calculated by summing the squared differences from the mean and dividing by n - 1.
The sample standard deviation is the square root of the variance. It is a measure of
the amount of variation or dispersion of a set of values.
For the given data, the sample variance and standard deviation can be calculated
as follows:
Variance = [(20-3.2)^2 + (-20.5-3.2)^2 + (12.2-3.2)^2 + (12.6-3.2)^2 + (10.5-3.2)^2 +
(-5.8-3.2)^2 + (-18.7-3.2)^2 + (15.3-3.2)^2] / (8-1) = 1773.6 / 7 = 253.37

Standard Deviation = sqrt(Variance) = sqrt(253.37) = 15.92
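
The calculation can be reproduced with Python's standard `statistics` module, whose `variance` function uses the n - 1 divisor for a sample. This is a minimal sketch for checking the arithmetic, not part of the original answer.

```python
import math
import statistics

# Quarterly percent total returns from the text
returns = [20, -20.5, 12.2, 12.6, 10.5, -5.8, -18.7, 15.3]

mean = statistics.mean(returns)
variance = statistics.variance(returns)  # sample variance, divides by n - 1
std_dev = math.sqrt(variance)

print(round(mean, 2), round(variance, 2), round(std_dev, 2))
```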

3. Confidence Interval for Population Variance

The 95% confidence interval for the population variance can be calculated using
the Chi-Square distribution. The formula is:
(n-1)*s^2 / χ^2(α/2, n-1) < σ^2 < (n-1)*s^2 / χ^2(1-α/2, n-1)

Where:

● n is the sample size

● s^2 is the sample variance

● χ^2(α/2, n-1) and χ^2(1-α/2, n-1) are the chi-square values with n-1 degrees of
freedom that have upper-tail areas of α/2 and 1-α/2, respectively (α = 0.05 for a 95% interval).

4. Confidence Interval for Population Standard Deviation

The 95% confidence interval for the population standard deviation can be
calculated by taking the square root of the confidence interval for the variance. The
formula is:
sqrt((n-1)*s^2 / χ^2(α/2, n-1)) < σ < sqrt((n-1)*s^2 /
χ^2(1-α/2, n-1))

Where:

● n is the sample size

● s^2 is the sample variance

● χ^2(α/2, n-1) and χ^2(1-α/2, n-1) are the chi-square values with n-1 degrees of
freedom that have upper-tail areas of α/2 and 1-α/2, respectively (α = 0.05 for a 95% interval).
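
As a rough numerical illustration of both interval formulas, the sketch below applies them to the eight returns listed earlier. The two chi-square constants are an assumption read from a standard chi-square table for 7 degrees of freedom (in Excel, CHISQ.INV.RT would give the same values); they are hard-coded so the example needs no statistics library.

```python
import math
import statistics

returns = [20, -20.5, 12.2, 12.6, 10.5, -5.8, -18.7, 15.3]
n = len(returns)
s2 = statistics.variance(returns)  # sample variance (n - 1 divisor)

# Table values for n - 1 = 7 degrees of freedom (assumed, from a chi-square table)
CHI2_UPPER_0025 = 16.013  # upper-tail area 0.025
CHI2_UPPER_0975 = 1.690   # upper-tail area 0.975

# Dividing by the larger chi-square value gives the lower limit
var_low = (n - 1) * s2 / CHI2_UPPER_0025
var_high = (n - 1) * s2 / CHI2_UPPER_0975

# The interval for sigma is the square root of the interval for sigma^2
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)

print(round(var_low, 2), round(var_high, 2))
print(round(sd_low, 2), round(sd_high, 2))
```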

Case Study: TourisTopia Travel

TourisTopia Travel (Triple T) is an online travel agency that specializes in trips to exotic
locations around the world for groups of ten or more travelers. Triple T’s marketing manager has
been working on a major revision of the homepage of Triple T’s website. The content for the
homepage has been selected and the only remaining decisions involve the selection of the
background color (white, green, or pink) and the type of font (Arial, Calibri, or Tahoma). Triple T
has designed prototype homepages featuring every combination of these background colors and
fonts, and it has implemented computer code that will randomly direct each Triple T website
visitor to one of these prototype homepages. For three weeks, the prototype homepage to which
each visitor was directed and the amount of time in seconds spent at Triple T’s website during
each visit were recorded. Ten visitors to each of the prototype homepages were then selected
randomly; the complete data set for these visitors is available in the datafile named
TourisTopia. Triple T wants to use these data to determine if the time spent by visitors to
Triple T’s website differs by background color or font. It would also like to know if the time
spent by visitors to the Triple T website differs by different combinations of background color
and font.

Managerial Report: Prepare a managerial report that addresses the following issues.
1. Use descriptive statistics to summarize the data from Triple T’s study. Based on descriptive
statistics, what are your preliminary conclusions about whether the time spent by visitors to the
Triple T website differs by background color or font? What are your preliminary conclusions
about whether time spent by visitors to the Triple T website differs by different combinations
of background color and font?
2. Has Triple T used an observational study or a controlled experiment? Explain.
3. Use the data from Triple T’s study to test the hypothesis that the time spent by visitors to
the Triple T website is equal for the three background colors. Include both factors and their
interaction in the ANOVA model, and use α = 0.05.
4. Use the data from Triple T’s study to test the hypothesis that the time spent by visitors to
the Triple T website is equal for the three fonts. Include both factors and their interaction in
the ANOVA model, and use α = 0.05.
5. Use the data from Triple T’s study to test the hypothesis that time spent by visitors to the
Triple T website is equal for the nine combinations of background color and font. Include both
factors and their interaction in the ANOVA model, and use α = 0.05.
6. Do the results of your analysis of the data provide evidence that the time spent by visitors
to the Triple T website differs by background color, font, or combination of background color
and font? What is your recommendation?

Feedback to Learner 12/11/24 10:37 PM

1) It has one variable, so you will have to use Anova: Single Factor.

ANOVA

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F ratio   P-value    F critical
Between Groups        324              2                    162           40.5      3.16E-05   4.26
Within Groups         36               9                    4
Total                 360              11

Test statistic: F (Fisher ratio) = MSC/MSE = 162/4 = 40.5
P-value = F.DIST.RT(40.5,2,9) = 1-F.DIST(40.5,2,9,TRUE) = 3.16E-05
Since p-value < α = 0.05, we reject H0.
Critical value = F.INV(0.95,2,9) = F.INV.RT(0.05,2,9) = 4.26
Since F ratio (40.5) > F critical (4.26), we reject H0.
At the 95% confidence level, there is sufficient evidence to conclude a significant
difference in the average sales of the three stores.
2) It has one variable, so you will have to use Anova: Single Factor.

ANOVA

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F ratio   P-value   F critical
Between Groups        340.94           2                    170.47        0.614     0.556     3.81
Within Groups         3608             13                   277.54
Total                 3948.94          15

Test statistic: F (Fisher ratio) = MSC/MSE = 170.47/277.54 = 0.614
P-value = F.DIST.RT(0.614,2,13) = 1-F.DIST(0.614,2,13,TRUE) = 0.556
Since p-value (0.556) > α (0.05), we fail to reject H0.
Critical value = F.INV(0.95,2,13) = F.INV.RT(0.05,2,13) = 3.81
Since F ratio (0.614) < F critical (3.81), we fail to reject H0.
At the 5% significance level, there isn’t sufficient evidence to conclude that the
color of the box has any effect on the sale of cereal.
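
Both single-factor tests have 2 numerator degrees of freedom, and in that special case the F upper-tail probability has a closed form, P(F > x) = (1 + 2x/d2)^(-d2/2), where d2 is the denominator degrees of freedom. The sketch below uses that identity to cross-check the Excel p-values and critical values without any statistics library. The helper names are hypothetical, and the shortcut does not apply to other numerator degrees of freedom.

```python
def f_upper_tail_df1_2(x, d2):
    """P(F > x) for an F distribution with (2, d2) degrees of freedom."""
    return (1 + 2 * x / d2) ** (-d2 / 2)


def f_critical_df1_2(alpha, d2):
    """Value x with P(F > x) = alpha for (2, d2) degrees of freedom."""
    return (d2 / 2) * (alpha ** (-2 / d2) - 1)


# Problem 1: F = 40.5 with (2, 9) degrees of freedom
p1 = f_upper_tail_df1_2(40.5, 9)
crit1 = f_critical_df1_2(0.05, 9)

# Problem 2: F = 170.47 / 277.54 with (2, 13) degrees of freedom
p2 = f_upper_tail_df1_2(170.47 / 277.54, 13)
crit2 = f_critical_df1_2(0.05, 13)

print(f"{p1:.3g} {crit1:.2f}")
print(f"{p2:.3f} {crit2:.2f}")
```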
3) It has two variables, so you will have to use Two-Way ANOVA without Replication.

SUMMARY        Count   Sum   Average   Variance
Compact        2       92    46        32
Intermediate   2       99    49.5      60.5
Full-sized     2       109   54.5      144.5

Computerized   3       168   56        43
Electronic     3       132   44        4

ANOVA

Source of Variation   SS    df   MS     F       P-value   F crit
Car Type              73    2    36.5   3.48    0.223     19
Analyzer              216   1    216    20.57   0.045     18.51
Error                 21    2    10.5
Total                 310   5

Test statistics:
F ratio (Car Type) = MSB/MSE = 36.5/10.5 = 3.476
F ratio (Analyzer) = MSC/MSE = 216/10.5 = 20.571
P-value (Car Type) = F.DIST.RT(3.476,2,2) = 1-F.DIST(3.476,2,2,TRUE) = 0.223
P-value (Analyzer) = F.DIST.RT(20.571,1,2) = 1-F.DIST(20.571,1,2,TRUE) = 0.045
Since p-value (Analyzer) < α = 0.05, we reject H0.
Since p-value (Car Type) > α = 0.05, we fail to reject H0.
Since the p-value (Car Type) > α = 0.05 (or F ratio (3.476) < F critical (19)), we fail to
reject the null hypothesis, and so we do not have enough evidence to suggest that the
tune-up time varies among compact, intermediate, and full-sized cars.
Since the p-value (Analyzer) < α = 0.05 (or F ratio (20.571) > F critical (18.513)), we
reject the null hypothesis, and so at the 95% level of confidence we conclude there is a
significant difference in the time needed to complete a minor engine tune-up between the
computerized and electronic analyzers.
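
The sums of squares and F ratios for this randomized block layout can be reproduced with a short standard-library sketch. The function name `two_way_anova_no_replication` is hypothetical; rows are taken to be the blocks (car types) and columns the treatments (analyzers), with one observation per cell.

```python
def two_way_anova_no_replication(table):
    """Rows are blocks, columns are treatments; one observation per cell."""
    n_rows, n_cols = len(table), len(table[0])
    grand_mean = sum(sum(row) for row in table) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(row[j] for row in table) / n_rows for j in range(n_cols)]

    ss_rows = n_cols * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n_rows * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols  # what the two factors leave unexplained

    df_rows, df_cols = n_rows - 1, n_cols - 1
    ms_error = ss_error / (df_rows * df_cols)
    return {
        "ss_rows": ss_rows,
        "ss_cols": ss_cols,
        "ss_error": ss_error,
        "f_rows": (ss_rows / df_rows) / ms_error,
        "f_cols": (ss_cols / df_cols) / ms_error,
    }


tune_up = [
    [50, 42],  # Compact: computerized, electronic
    [55, 44],  # Intermediate
    [63, 46],  # Full-sized
]
result = two_way_anova_no_replication(tune_up)
print({name: round(value, 3) for name, value in result.items()})
```

Each F ratio divides a factor's mean square by the error mean square, which is why both ratios share the same denominator.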
4) It has two variables and two observations for each combination of fertilizer and variety,
so you will have to use Two-Way ANOVA with Replication.

SUMMARY    A       B      C      D       Total

I
Count      2       2      2      2       8
Sum        171     177    157    165     670
Average    85.5    88.5   78.5   82.5    83.75
Variance   0.5     0.5    4.5    4.5     17.07

II
Count      2       2      2      2       8
Sum        182     185    158    187     712
Average    91      92.5   79     93.5    89
Variance   2       4.5    8      0.5     41.14

III
Count      2       2      2      2       8
Sum        146     157    166    157     626
Average    73      78.5   83     78.5    78.25
Variance   8       4.5    0      0.5     16.21

Total
Count      6       6      6      6
Sum        499     519    481    509
Average    83.17   86.5   80.17  84.83
Variance   70.17   43.5   7.37   49.37

ANOVA

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F ratio   P-value    F critical
Fertilizer            462.33           2                    231.17        73        1.92E-07   3.89
Variety               131.33           3                    43.78         13.82     0.00034    3.49
Interaction           351.67           6                    58.61         18.51     2.02E-05   3.00
Error                 38               12                   3.17
Total                 983.33           23

Test statistics:
F ratio (Fertilizer) = MSB/MSE = 231.17/3.17 = 73
F ratio (Variety) = MSC/MSE = 43.78/3.17 = 13.82
F ratio (Interaction) = MSBC/MSE = 58.61/3.17 = 18.51
P-value (Fertilizer) = F.DIST.RT(73,2,12) = 1-F.DIST(73,2,12,TRUE) = 1.92E-07
P-value (Variety) = F.DIST.RT(13.82,3,12) = 1-F.DIST(13.82,3,12,TRUE) = 0.00034
P-value (Interaction) = F.DIST.RT(18.51,6,12) = 1-F.DIST(18.51,6,12,TRUE) = 2.02E-05
Since p-value (Interaction) < α = 0.05, we reject H0.
Since the interaction effect has a significant impact on the yield of corn, the main
factors’ significance cannot be interpreted separately, because the effect of one factor
depends on the level of the other. In other words, the combined effect of corn varieties and
fertilizer types differs from the sum of their individual effects, or there could be a third
variable influencing the relationship between independent and dependent variables.
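
For readers who want to verify the two-factor table by hand, here is a minimal standard-library sketch of the sums of squares and F ratios for a balanced design with replication. The function name `two_way_anova_with_replication` and the nested-list layout are assumptions made for illustration; rows are fertilizers, columns are varieties, and each cell holds the two replicate yields.

```python
def two_way_anova_with_replication(cells):
    """cells[i][j] lists the replicate observations for row i, column j (balanced)."""
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    all_obs = [x for row in cells for cell in row for x in cell]
    grand = sum(all_obs) / len(all_obs)
    row_means = [sum(x for cell in row for x in cell) / (b * n) for row in cells]
    col_means = [sum(x for row in cells for x in row[j]) / (a * n) for j in range(b)]

    ss_rows = b * n * sum((m - grand) ** 2 for m in row_means)
    ss_cols = a * n * sum((m - grand) ** 2 for m in col_means)
    # Error: variation of replicates around their own cell mean
    ss_error = sum((x - sum(cell) / n) ** 2 for row in cells for cell in row for x in cell)
    ss_total = sum((x - grand) ** 2 for x in all_obs)
    ss_inter = ss_total - ss_rows - ss_cols - ss_error

    ms_error = ss_error / (a * b * (n - 1))
    return {
        "ss_rows": ss_rows,
        "ss_cols": ss_cols,
        "ss_inter": ss_inter,
        "ss_error": ss_error,
        "f_rows": (ss_rows / (a - 1)) / ms_error,
        "f_cols": (ss_cols / (b - 1)) / ms_error,
        "f_inter": (ss_inter / ((a - 1) * (b - 1))) / ms_error,
    }


corn = [
    [[86, 85], [88, 89], [77, 80], [84, 81]],  # Fertilizer I, varieties A-D
    [[92, 90], [91, 94], [81, 77], [93, 94]],  # Fertilizer II
    [[75, 71], [80, 77], [83, 83], [79, 78]],  # Fertilizer III
]
anova = two_way_anova_with_replication(corn)
print({name: round(value, 2) for name, value in anova.items()})
```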
Case Problem: TourisTopia Travel - It has two variables and ten observations for each
combination of background color and font, so you will have to use Two-Way ANOVA with Replication.

ANOVA

Source of Variation         SS         df   MS        F      p-Value   F crit
Sample (background color)   24246.3    2    12123.1   5.75   0.005     3.11
Columns (font)              22426.3    2    11213.1   5.32   0.007     3.11
Interaction                 12182.2    4    3045.5    1.44   0.227     2.48
Within                      170788.8   81   2108.5
Total                       229643.6   89

2) Triple T randomly assigned the prototypes featuring the three background colors and
the three fonts to visitors to the Triple T website, so it is using a controlled
experiment.
3) The rows in the file TourisTopia correspond to the three background colors, so the p-value
for the hypothesis test that the mean time spent by visitors is equal for the three
background colors is 0.0046. Because the p-value < 0.05, we reject H0; the background
color is significant.
4) The columns in the file TourisTopia correspond to the three fonts, so the p-value for
the hypothesis test that the mean time spent by visitors is equal for the three fonts is
0.0068. Because the p-value < 0.05, we reject H0; the font is significant.
5) The p-value for the hypothesis test that there is no interaction between the
background color and font is 0.2269. Because the p-value > 0.05, we fail to reject H0;
the interaction between the background color and font is not significant.
6) The results of the analysis of variance suggest that there are differences between the
three background colors in the mean time spent by visitors to the Triple T website. The
results of the analysis of variance also suggest that there are differences between the
three fonts in the mean time spent by visitors. The descriptive statistics suggest that the
white background and the Arial font will generate the greatest mean time spent by
visitors to the Triple T website. Final Recommendation: Triple T should adopt the white
background and Arial font prototype.
You are required to show the Excel functions in the Excel sheet for all the questions as
per the rubric. Excel functions for all the questions are missing in the Excel sheet. In the
absence of the Excel functions in the Excel sheet, I am unable to give you complete
feedback, and you lose 20% for repeated nonsubmission of Excel functions.
Next time, submit the Excel file with Excel functions as a supporting document; otherwise,
you will lose 25% of the grade.
Late Submission (-2); next time you will lose marks as per the syllabus.
