Math1040skittlestermproject Tomac-Dylan
Dylan Tomac
7/09/16
Math 1040-11 Ping Yu
Introduction:
The goal of this project is to catalog data from 29 bags of Skittles, one from each
student in my Math 1040 class. Each student was required to buy a standard-size bag
of Skittles and report the number of each color (along with the total number) to the
professor so that we would have a reasonable amount of data to draw from. Once the
data was compiled, we created several charts and made observations about the candy.
Our findings are as follows.
Number of Candies by Color:
Upon seeing the total number of candies, I assumed that the ratio of each color
would reflect what I found in my own standard-size bag. But comparing the two
data sets, it seems that my bag might have been an anomaly.
In the overall data, the ratio of each color seems to be reasonably predictable. In
order from greatest to least, it went Red, Orange, Yellow, Green, and Purple. Seeing
as Red is the primary color of the Skittles brand, I expected it to have the highest
frequency, followed by the remaining colors in color-wheel order: Orange, Yellow,
Green, and Purple.
But in my individual bag, Red was actually the least frequent color and Yellow was
the most frequent. This offers an interesting perspective on data and how a small
sample can misrepresent the overall population.
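The gap between one bag and the pooled data can be illustrated with a short simulation. This is only a sketch under stated assumptions, not the class data: it assumes every candy is equally likely to be any of the five colors, and it uses a made-up bag size of 60. It draws 29 simulated bags and prints the color proportions in the first bag next to the proportions across all bags.

```python
import random
from collections import Counter

COLORS = ["Red", "Orange", "Yellow", "Green", "Purple"]
BAG_SIZE = 60   # assumed bag size, not taken from the class data
NUM_BAGS = 29   # one bag per student, as in the project

def simulate_bag(bag_size, rng):
    """Draw one bag, assuming each candy is equally likely to be any color."""
    return Counter(rng.choice(COLORS) for _ in range(bag_size))

rng = random.Random(1040)  # fixed seed so the run is repeatable
bags = [simulate_bag(BAG_SIZE, rng) for _ in range(NUM_BAGS)]

# Pool all bags, the way the class data set pools every student's counts.
pooled = sum(bags, Counter())
total = sum(pooled.values())

for color in COLORS:
    single = bags[0][color] / BAG_SIZE
    overall = pooled[color] / total
    print(f"{color:<7} first bag: {single:.2f}   all bags: {overall:.2f}")
```

In runs like this, the single-bag proportions tend to wander further from 0.20 than the pooled proportions do, which is the same effect I saw with my own bag.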
Part II
Hypothesis Tests:
The purpose of a Hypothesis Test is to determine whether the sample data provide
enough statistical evidence to reject a claim about a population parameter. In most
situations you compare the sample data to the hypothesized value for the overall
population and conclude either that the data are consistent with the hypothesis or
that there is enough evidence to reject it.
See Attached Paper for the following:
1. 0.05 Significance Level to test the claim that 20% of Skittles candies are Red
2. 0.01 Significance Level to test the claim that the mean number of Skittles in
a bag is 55.
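The actual calculations are on the attached paper. As a rough sketch of what the two tests involve, the code below runs a two-tailed z-test for a proportion and computes a t statistic for a mean. The function names are my own, and every count in it is a hypothetical placeholder, not the class data. The critical value 3.499 is the t-table value for 7 degrees of freedom at the 0.01 level (two-tailed), matching the 8 made-up bags; with 29 bags the table value for 28 degrees of freedom would be used instead.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def proportion_z_test(successes, n, p0):
    """Two-tailed z-test of H0: p = p0. Returns (z, p_value)."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

def mean_t_statistic(values, mu0):
    """t statistic for H0: mean = mu0, with len(values) - 1 degrees of freedom."""
    n = len(values)
    return (mean(values) - mu0) / (stdev(values) / sqrt(n))

# Claim 1: 20% of Skittles are red, tested at the 0.05 level.
red_count, candy_total = 350, 1700   # hypothetical counts, NOT the class data
z, p_value = proportion_z_test(red_count, candy_total, 0.20)
print(f"z = {z:.3f}, p-value = {p_value:.3f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")

# Claim 2: the mean number of candies per bag is 55, tested at the 0.01 level.
bag_totals = [58, 61, 59, 60, 57, 62, 60, 59]   # hypothetical bag totals
t = mean_t_statistic(bag_totals, 55)
T_CRITICAL = 3.499   # t table: df = 7, two-tailed, alpha = 0.01
print(f"t = {t:.3f}")
print("reject H0" if abs(t) > T_CRITICAL else "fail to reject H0")
```

The proportion test uses a p-value and the mean test uses a critical value from the table only to show both ways of reaching a decision; either approach works for either test.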
Reflection:
A Confidence Interval is used to estimate a population parameter from a sample
while stating the degree of uncertainty in that estimate. For example, suppose you
randomly sample a 48-pack of batteries, measure the run times, and calculate that
the 95% confidence interval for the mean is 13 to 14 hours. This would indicate that
you are 95% confident that the mean run time for the entire population of batteries
falls within that range.
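A confidence interval like the one in the battery example can be sketched in a few lines. The run times below are hypothetical, and the function name is my own. The sketch uses the normal (z) critical value, which is a large-sample approximation; with a sample this small, the critical value from a t table would be the more appropriate choice.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(values, confidence=0.95):
    """Approximate confidence interval for a mean, using a z critical value."""
    n = len(values)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * stdev(values) / sqrt(n)
    center = mean(values)
    return center - margin, center + margin

# Hypothetical battery run times in hours, for illustration only.
run_times = [13.2, 13.8, 13.5, 14.1, 13.0, 13.6, 13.9, 13.4]

low, high = mean_confidence_interval(run_times)
print(f"95% confidence interval: {low:.2f} to {high:.2f} hours")
```

Raising the confidence level (for example, `confidence=0.99`) widens the interval, since being more confident requires allowing a larger margin of error.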
A Hypothesis Test is used to decide whether sample data support or contradict a
given Hypothesis about a population. You use the sample to evaluate whether the
data are consistent with your Hypothesis about the population parameter. For
example, if you sampled a neighborhood in a given city to see how many houses have
garages, you would then use this sample to test a Hypothesis about the overall
population of homes within the city.
In both procedures, human error is a real risk when recording and entering data
points. While working on several problems, I could have recorded data incorrectly
from the beginning, left out a number, or read the wrong probability from the
table. I found that when I checked my data thoroughly and used the correct tools,
my findings usually came out accurate. But it required attention and proofreading
on every problem.
To improve the sampling method, it would be beneficial to have tighter control over
the actual samples. Having 29 different people count Skittles at various times and
locations introduces a large possibility for error. Having a smaller number of
people count the candies, or having everyone count in the same room, could
dramatically decrease the possibility of skewed data.
As for conclusions, I have found that it's far easier than I first thought to create
a statistical study on a given subject, especially if the data is easily accessible,
like the Skittles were. I was also surprised to find how much information you can
get from a sample, and the conclusions you can draw from a solid Hypothesis.