Project1 - Cold Storage Case Study
Report
Table of Contents
1 Project Objective
2 Assumptions
3 Exploratory Data Analysis – Step by step approach
3.1 Environment Set up and Data Import
3.1.1 Install necessary Packages and Invoke Libraries
3.1.2 Set up working Directory
3.1.3 Import and Read the Dataset
3.2 Variable Identification
3.2.1 Variable Identification – Inferences
3.3 Univariate Analysis
3.4 Bi-Variate Analysis
3.5 Missing Value Identification
3.6 Outlier Identification
3.7 Variable Transformation / Feature Creation
4 Conclusion
5 Appendix A – Source Code
1 Project Objective
The objective of this report is to explore the Cold Storage data sets
(“Cold_Storage_Temp_Data” and “Cold_Storage_Mar2018”) in R and to generate
insights about the data. This exploration report will consist of the following:
2 Assumptions
The hypothesis tests that we perform rest on the following assumptions:
2) The sample is reasonably large (n = 35), so by the central limit theorem
the sampling distribution of the mean will be approximately normal,
regardless of the distribution being sampled.
3) The z-test and the t-test both assume that the data are independently sampled
from a normal distribution.
4) A z-test assumes that σ (the population standard deviation) is known; a
t-test does not.
3.1.3 Import and Read the Dataset
The given dataset is in .csv format. Hence, the command ‘read.csv’ is used for
importing the file.
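The import step can be sketched as follows. This is only an illustration: the file name, the column names (Season, Month, Date, Temperature) and the three rows are assumptions made so that the example runs without the real data, since the report does not show the CSV layout.

```r
# Hypothetical file name and columns; the actual CSV layout is not shown in the report.
csv_path <- file.path(tempdir(), "Cold_Storage_Temp_Data.csv")

# Write a tiny stand-in file so the example runs without the real data.
writeLines(c(
  "Season,Month,Date,Temperature",
  "Winter,Jan,1,2.4",
  "Summer,Apr,1,3.1",
  "Rainy,Jul,1,3.3"
), csv_path)

# read.csv() parses the header row and returns a data frame.
cold_storage <- read.csv(csv_path, header = TRUE, stringsAsFactors = TRUE)

str(cold_storage)      # column names and types
summary(cold_storage)  # quartiles and mean for numeric columns, counts for factors
```

With the real file, only `csv_path` would change; `stringsAsFactors = TRUE` is chosen here so that Season is read as a categorical variable.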
1. The following box plot shows the overall temperature distribution. It has one
outlier, at 5 deg C, which lies above the upper limit of the box plot:
Q3 + 1.5 × IQR = 3.30 + (1.5 × 0.8) = 4.5 deg C.
As shown below, the histogram of temperature for the whole year is roughly
bell-shaped, which is consistent with a normal distribution.
2. The bar plot given below shows that the three seasons under observation
(Rainy, Summer and Winter) cover approximately equal numbers of days:
Rainy: 122
Summer: 120
Winter: 123
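The outlier limit quoted in item 1 can be reproduced with a few lines of R. This is a minimal sketch that uses only the values stated in the text (Q3 = 3.30, IQR = 0.8, an observation of 5 deg C); the helper name `upper_fence` is hypothetical and not part of the original analysis.

```r
# Upper limit for box plot outliers: values above Q3 + k * IQR are flagged (k = 1.5).
upper_fence <- function(q3, iqr, k = 1.5) {
  q3 + k * iqr
}

# Values quoted in the report: Q3 = 3.30, IQR = 0.8.
fence <- upper_fence(q3 = 3.30, iqr = 0.8)

# The observation of 5 deg C is compared against the limit.
is_outlier <- 5 > fence

print(round(fence, 2))
print(is_outlier)
```

On the full data set, `boxplot.stats(x)$out` returns the points drawn beyond the whiskers directly; its limits are based on hinges, so they may differ slightly from quartiles computed with `quantile()`.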
3.4 Bi-Variate Analysis
The following box plot shows the relationship between season and temperature.
It gives a pictorial representation of the distribution of temperatures
within each season.
The given data set contains outliers in the Winter and Rainy seasons:
1. 3 outliers towards the right in the Winter season.
2. 1 outlier towards the right in the Rainy season.
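A per-season box plot and outlier count of this kind could be produced roughly as below. The data here are synthetic: only the season counts (122, 120, 123) come from the report, while the means, spreads and column names are assumptions, so the outlier counts printed will not match the ones listed above.

```r
set.seed(1)  # reproducible synthetic data, NOT the report's data

# Hypothetical stand-in using the season counts quoted in the report.
cold_storage <- data.frame(
  Season = factor(rep(c("Rainy", "Summer", "Winter"), times = c(122, 120, 123))),
  Temperature = c(rnorm(122, 3.1, 0.4), rnorm(120, 3.2, 0.4), rnorm(123, 2.7, 0.4))
)

# One horizontal box per season; points beyond the whiskers are drawn individually.
boxplot(Temperature ~ Season, data = cold_storage, horizontal = TRUE,
        main = "Temperature by season (synthetic data)")

# Count the points beyond the whiskers within each season.
outliers_by_season <- tapply(
  cold_storage$Temperature, cold_storage$Season,
  function(x) length(boxplot.stats(x)$out)
)
print(outliers_by_season)
```

Drawing the boxes horizontally places high-temperature outliers on the right, which matches the way the outliers are described in the text.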
4 Conclusion
In our Cold Storage case study, it is given that the optimal temperature to
prevent any change of texture, body or appearance, or separation of fats, is
between 2 deg - 4 deg C.
In March 2018, customers started complaining that the dairy products were going
sour and often smelling. According to the supervisor, he has been vigilant in
maintaining the temperature below 3.9 deg C. So according to the problem, we
can formulate our null and alternative hypotheses for the test as follows:
After studying the sample of 35 days pulled out by the supervisor, we
calculated the following:
3. Standard deviation of the sample (sd), assumed for the z-test as given in
the question: 0.508
Z-test Statistics Analysis: For the one-tailed z-test, we obtain:
1. Zstat = 0.8617
The Zstat lies in the non-rejection region, and the p-value is greater than the
significance level (alpha = 0.1). We therefore fail to reject the null
hypothesis at the 0.1 level of significance. This does not prove that the null
hypothesis is true; it means only that the sample does not provide sufficient
evidence against it.
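The comparison between the p-value and alpha can be checked from the reported Zstat alone. The sketch below uses only the two numbers given above (Zstat = 0.8617, alpha = 0.1). Because the text does not state which tail the z-test used, both one-tailed p-values are computed; the variable names are illustrative.

```r
z_stat <- 0.8617  # Z statistic reported above
alpha  <- 0.1     # significance level used in the report

# pnorm() is the standard normal cumulative distribution function.
p_lower <- pnorm(z_stat)                      # P(Z <= z), left-tailed test
p_upper <- pnorm(z_stat, lower.tail = FALSE)  # P(Z >= z), right-tailed test

# The null hypothesis is rejected only when the p-value falls below alpha.
reject_lower <- p_lower < alpha
reject_upper <- p_upper < alpha

print(c(p_lower = p_lower, p_upper = p_upper))
print(c(reject_lower = reject_lower, reject_upper = reject_upper))
```

Both p-values are above 0.1, so the decision not to reject the null hypothesis does not depend on which one-tailed alternative was chosen.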
T-test Statistics Analysis: For the one-tailed (left tail) t-test, R's t.test
gives a p-value of 0.9953, which is far above the significance level. The
sample mean is above 3.9, so the test statistic lies in the non-rejection
region and the null hypothesis cannot be rejected. Note that the p-value is not
a confidence level: it is the probability, assuming the null hypothesis holds,
of a test statistic at least as extreme as the one observed.
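A left-tailed one-sample t-test of this kind can be run with `t.test()` from base R. The sketch below is illustrative only: the 35 readings are randomly generated (their mean and spread are assumptions), so the printed statistic and p-value will differ from the 0.9953 reported above.

```r
set.seed(42)  # synthetic sample, NOT the supervisor's 35 readings

# Hypothetical readings: 35 daily temperatures centred slightly above 3.9.
temps <- rnorm(35, mean = 3.97, sd = 0.16)

# alternative = "less" tests H0: mu >= 3.9 against H1: mu < 3.9 (left tail).
result <- t.test(temps, mu = 3.9, alternative = "less")

print(result$statistic)  # t value
print(result$parameter)  # degrees of freedom (n - 1)
print(result$p.value)    # compare against alpha = 0.1
```

Unlike the z-test, `t.test()` estimates the standard deviation from the sample itself, which is why no value of σ is passed in.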
Therefore, after performing both hypothesis tests, we conclude that the data do
not support the supervisor's claim of maintaining the temperature below
3.9 deg C. The Cold Storage Plant should take corrective measures to strictly
monitor and maintain the temperature of the dairy products between
2 deg - 4 deg C to ensure delivery of quality products to the customers.
PROBLEM 1:
PROBLEM 2: