Skittles Project
Report Introduction
For this project, we took a sample of regular-size bags of Skittles. Each person counted the
number of candies of each color in their bag: red, orange, yellow, green, and purple. As a class,
we compiled our data by recording each person's color counts in a Google Sheets document, and
each person's totals were added up to get an overall total for each color. Using my personal data
and the class data, I will estimate the true proportion of yellow Skittles and the true mean
number of candies per bag. I used the data from my own bag for the color confidence interval (CI)
and test, and the data from all of our bags for the number-of-candies-per-bag CI and test, so
that the larger sample size gives a more reliable result. Below is my interpretation of that
data.
Data Collection
You put your data and the class data or at least a summary of the class data
My bag: red 12, orange 8, yellow 19, green 12, purple 10 (61 candies total)
A confidence interval is a range of plausible values used to estimate a population parameter.
Using a formula, we find an interval that has a high likelihood of containing the true value.
A confidence interval does not guarantee that it contains the true value of the parameter;
rather, if the sampling were repeated many times, about the stated percentage of the resulting
intervals (the confidence level) would capture it.
Purpose: Market research is about reducing risk, and confidence intervals are about quantifying
risk. They consider the sample size and the potential variation in the population and give us an
estimate.
Construct a 99% confidence interval estimate for the true proportion of yellow candies---USE
THE DATA FROM YOUR BAG TO MAKE YOUR PHAT
1. We are trying to estimate p = the true proportion of yellow Skittles. Our best guess is
p̂ = .31, but because of sampling variability it is unlikely to be exactly correct. So, we will
calculate a 99% z-interval for p.
2. Conditions
b. Independence condition: Yes, my one bag of Skittles is less than 5% of all Skittles.
Sample size: n = 61 candies in my bag
c. Normality condition: n·p̂ = 19 and n(1 − p̂) = 42 are both at least 10, so the sampling
distribution of p̂ is approximately normal.
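Steps 3 and 4 of this interval can be checked numerically. The following is a minimal Python sketch, assuming 19 yellow candies out of n = 61 (which matches p̂ = .31); the function name `proportion_z_interval` is illustrative and not part of the assignment.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_interval(successes, n, confidence):
    """Return (p_hat, lower, upper) for a one-proportion z-interval."""
    p_hat = successes / n
    # Critical value z* leaves (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin


# ASSUMPTION: 19 yellow candies in a bag of 61.
p_hat, lower, upper = proportion_z_interval(19, 61, 0.99)
print(f"p_hat = {p_hat:.3f}, 99% CI = ({lower:.3f}, {upper:.3f})")
```

With these inputs the interval runs from roughly .16 to .46, which is wide because a single bag is a small sample.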
Construct a 95% confidence interval estimate for the true mean number of candies per
bag--- USE THE CLASS DATA TO MAKE YOUR XBAR
1. We are trying to estimate μ = the true mean number of candies per bag. Our best guess is
x̄ = 60.55, but because of sampling variability it is unlikely to be exactly correct. So, we will
calculate a 95% t-interval for μ.
2. Conditions
Again fill in the formula, create interval and draw a curve and list the df.
4. Thus, based on the class data, I am 95% confident that the interval from .07 to .12 captures
the true mean number of candies per bag.
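An interval for μ should be centered at x̄ = 60.55, so its endpoints can be sanity-checked with a short calculation. The sketch below is only illustrative: the standard error is backed out of the test results reported later in this report (t = 8.11 against a claimed mean of 55), and the critical value t* = 2.000 assumes df = 60, which is a placeholder because the class sample size is not given here.

```python
def t_interval(xbar, standard_error, t_star):
    """Return (lower, upper) for a one-sample t-interval: xbar +/- t* times SE."""
    margin = t_star * standard_error
    return xbar - margin, xbar + margin


xbar = 60.55                      # class sample mean from the report
implied_se = (xbar - 55) / 8.11   # SE implied by the reported t = 8.11 (H0 mean of 55)
t_star = 2.000                    # ASSUMPTION: 95% critical value for df = 60 (t table)

lower, upper = t_interval(xbar, implied_se, t_star)
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")
```

With the real df from the class data, replace `t_star` with the matching value from a t table; the endpoints should land on either side of 60.55.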
Discuss and interpret the results of each of your TWO interval estimates. Include neatly written
and scanned copies of your work.
Hypothesis Tests
A hypothesis test is a method of statistical inference. Hypothesis tests are used when
determining what outcomes of a study would lead to a rejection of the null hypothesis
for a pre-specified level of significance.
Use a 0.05 significance level to test the claim that 20% of all Skittles candies are red.
---USE THE DATA FROM YOUR BAG TO MAKE YOUR PHAT
1. At first glance, it appears that the true proportion p of red candies is greater than .20 since
p̂ = .39. However, it is also possible that the true proportion is p = .20 and we got a sample
proportion this high because of sampling variability. To decide, we will conduct a one-sample
z test for p (α = .05).
2. Ho: p = .20 Ha: p ≠ .20
3. Conditions:
a) Random Sample? Yes
b) Independence n<=.05N? Yes, my bag is less than 5% of all Skittles
c) Normality? Yes, n·p = 61(.20) = 12.2 and n(1 − p) = 61(.80) = 48.8 are both at least 10
4. P-value = 2P(p̂ ≥ .39) = 2P(z ≥ (p̂ − p)/√(p(1 − p)/n)) = 2P(z ≥ (.39 − .20)/√(.20(.80)/61))
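The test statistic and two-sided P-value for the claim that 20% of Skittles are red can be computed as follows. This is a hedged sketch using the sample proportion stated in this section (p̂ = .39), the claimed proportion of .20, and an assumed n = 61 candies from the bag; `one_proportion_z_test` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(p_hat, p0, n):
    """Return (z, two-sided P-value) for H0: p = p0 vs Ha: p != p0."""
    # The standard error uses the null value p0, not p_hat.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# ASSUMPTIONS: p_hat = .39 as reported, claimed p0 = .20, n = 61 candies.
z, p_value = one_proportion_z_test(0.39, 0.20, 61)
alpha = 0.05
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

The decision follows from comparing the P-value with α = .05; with these inputs the P-value falls below .05.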
Use a 0.01 significance level to test the claim that the mean number of candies in a bag
of Skittles is 55. ---USE THE CLASS DATA TO MAKE YOUR XBAR
Same steps for this but you use the TTest and a t curve so your standardized test statistic will
be a t and you will need to give degrees of freedom.
Discuss and interpret the results of each of your two hypothesis tests. Include neatly written and
scanned copies of your work.
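P-value ≈ .000
x̄ = 60.55
t = 8.11
Because the P-value is less than α = .01, we reject Ho: μ = 55; the class data give convincing
evidence that the mean number of candies per bag is not 55. The class data are treated as a
random sample and the bags are independent.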
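The t statistic above comes from the class mean, standard deviation, and number of bags. As a sketch of that calculation, the snippet below computes t and df from a list of per-bag totals; the list `bag_totals` is made-up placeholder data, not the class data, so its output will not match t = 8.11.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, mu0):
    """Return (t, df) for H0: mu = mu0 using the sample mean and sample SD."""
    n = len(values)
    standard_error = stdev(values) / sqrt(n)  # stdev uses the n - 1 denominator
    t = (mean(values) - mu0) / standard_error
    return t, n - 1


# HYPOTHETICAL per-bag totals for illustration only (not the class data).
bag_totals = [61, 58, 62, 60, 63, 59, 61, 60]
t, df = one_sample_t(bag_totals, 55)
print(f"t = {t:.2f}, df = {df}")
```

The resulting t is then compared with a t curve with df degrees of freedom to find the P-value.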
Reflection
Discuss the conditions for doing interval estimates and hypothesis tests and discuss whether or
not your samples met these conditions. SINCE YOU ARE USING MY 5 STEPS FROM THE
NOTES YOU HAVE ALREADY CHECKED ALL THE CONDTIONS JUST DISCUSS THEM
AGAIN HERE. How could we improve the normality condition for the color problems? What
possible errors could have been made by using this data? How could the sampling method be
improved? State what conclusions you have drawn from your statistical research.
The conditions for doing interval estimates and hypothesis tests are that the sample is random,
that it is less than 5% of the population, and that the normality condition is checked. For a
mean, the sample size should be at least 30 (or the population should be roughly normal); for a
proportion, the sample should contain at least 10 successes and 10 failures. For each of my
estimates, the samples were treated as random and were less than 5% of the population, and
everything checked out when I made my calculations. We could improve the normality condition for
the color problems by checking it separately for each color and comparing the results. Possible
errors come from the fact that each bag contains a different number of candies, which adds
variation and makes the data more work to handle. My conclusions concern two claims: that 20% of
all Skittles are red and that the mean number of candies per bag is 55. For the mean, the class
data gave x̄ = 60.55 with a P-value below .01, so the bags appear to hold more than the claimed
55 candies on average. We can trust the amount Skittles provides based on the calculations I made.