Asm 2 STT
Introduction
II. Apply a range of statistical methods used in business planning for quality, inventory and
capacity management (P4)
Formula:
Range = largest value – smallest value
Example:
Given the data set X = {2, 4, 5, 6, 7, 8, 9, 12, 15}.
The largest value in X is Xmax = 15 and the smallest value is Xmin = 2.
=> The range R is:
R = 15 - 2 = 13
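The calculation above can be sketched in a few lines of Python. This is only an illustrative sketch; the function name value_range is an assumption introduced here, not something defined in the assignment.

```python
def value_range(values):
    """Return the range: largest value minus smallest value."""
    return max(values) - min(values)


# Data set X from the example above
X = [2, 4, 5, 6, 7, 8, 9, 12, 15]
print(value_range(X))  # 13
```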
1.1.2. Variance
The variance is a measure of variability that utilizes all the data.
It is based on the difference between the value of each observation (xi) and the mean (x̄ for a
sample, μ for a population). The variance is useful in comparing the variability of two or more
variables. The variance is the average of the squared deviations between each data value and
the mean. Large variance indicates that there is more variation in the values of the data set and
that there may be a larger gap between the values of observations. If all observations are close
together, the variance will be small.
There are two formulas for the variance depending on whether you are calculating the variance
for an entire population or using a sample to estimate the population variance.
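The two formulas are not reproduced in the text, so the standard textbook forms are given here as a reference. The symbols are assumptions chosen to match the notation above: N is the population size, n the sample size, \mu the population mean and \bar{x} the sample mean.

```latex
% Population variance
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2

% Sample variance (used to estimate the population variance)
s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2
```

The sample formula divides by n - 1 rather than n, which is why the worked example that follows divides by 16 for 17 observations.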
Example:
I’ll work through an example using the formula for a sample on a dataset with 17 observations
in the table below. The numbers in parentheses represent the corresponding table column
number. The procedure involves taking each observation (1), subtracting the sample mean (2) to
calculate the difference (3), and squaring that difference (4). Then, I sum the squared
differences at the bottom of the table. Finally, I take the sum and divide by 16 because I’m using
the sample variance equation with 17 observations (17 – 1 = 16). The variance for this dataset is
201.
Because the calculations use the squared differences, the variance is in squared units rather than the
original units of the data. While higher values of the variance indicate greater variability, there is
no intuitive interpretation for specific values. Despite this limitation, various statistical tests use
the variance in their calculations.
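The step-by-step procedure described in the example (subtract the mean, square the difference, sum, divide by n - 1) can be sketched in Python as follows. The 17-observation table is not available here, so the sketch uses the small data set X from the range example as a stand-in; the function name sample_variance is hypothetical. The result is compared with statistics.variance from the Python standard library, which also computes the sample variance.

```python
import statistics


def sample_variance(observations):
    """Sample variance: sum of squared deviations divided by n - 1."""
    n = len(observations)
    mean = sum(observations) / n
    # Columns (3) and (4) of the procedure: difference, then squared difference
    squared_diffs = [(x - mean) ** 2 for x in observations]
    return sum(squared_diffs) / (n - 1)


# Stand-in data (data set X from the range example, not the 17-observation table)
data = [2, 4, 5, 6, 7, 8, 9, 12, 15]
print(sample_variance(data))
print(statistics.variance(data))  # library result for comparison
```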
1.1.3. Standard deviation
The standard deviation of a data set is the positive square root of the variance. It is measured in
the same units as the data, making it more easily interpreted than the variance. Standard
deviation is a measure of the dispersion of values in a given data set from their mean. It shows
how far each value lies on average from the mean.
Because it is easier to visualize and apply, the standard deviation is often used as a primary
measure of the variability of the data in a data set. Standard deviation is used in a number of
areas such as product quality control, weather forecasting, and volatility risk measurement in
financial markets. In addition, the standard deviation is used to standardize the values of
different data series so that they can be compared on a common scale.
Example:
Sample data of time (seconds) running 500m and 1500m for a group of 5 people:
T500 = {55.2, 58.8, 62.4, 54, 59.4}
T1500 = {271.2, 261, 276, 282, 270}
Calculate the standard deviation of the running times for the two distances, 500m and 1500m.
- Standard deviation:
S500=3.38
S1500=7.77
=> Conclusion: The standard deviation for the 500m distance shows that the 500m running times
of these 5 people differ from the mean 500m time of 57.96s by about 3.38s on average. The
standard deviation for the 1500m distance is 7.77s, which shows that over the longer distance
the running times of the five people are more spread out in absolute terms than over 500m.
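The figures in this example can be checked with a short Python sketch. It assumes the sample formula (dividing by n - 1), which is what statistics.stdev in the standard library uses; the variable names are chosen here for illustration.

```python
import statistics

# Running times in seconds, taken from the example above
t500 = [55.2, 58.8, 62.4, 54, 59.4]
t1500 = [271.2, 261, 276, 282, 270]

mean500 = statistics.mean(t500)
s500 = statistics.stdev(t500)    # sample standard deviation
s1500 = statistics.stdev(t1500)

print(round(mean500, 2))                 # 57.96
print(round(s500, 2), round(s1500, 2))   # 3.38 7.77
```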
1.1.4. Coefficient of variation
The coefficient of variation is a descriptive statistic that indicates how large the standard
deviation is in relation to the mean. It is the ratio of the standard deviation to the mean of a
variable, and it is used to measure the relative variability of data sets with different mean
values. This makes it a useful statistical tool for comparing variability from one data series to
another, even when their means differ significantly.
Formula:
Coefficient of variation = standard deviation / mean
Based on the calculations above, Fred wants to invest in the ETF because it offers the lowest
coefficient of variation and therefore the best risk-to-reward ratio.
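As a further illustration of the ratio of standard deviation to mean, the sketch below applies it to the running-time data from section 1.1.3 (the investment figures for Fred's example are not available here). The function name coefficient_of_variation is hypothetical, and the sample standard deviation is assumed.

```python
import statistics


def coefficient_of_variation(sample):
    """Ratio of the sample standard deviation to the sample mean."""
    return statistics.stdev(sample) / statistics.mean(sample)


# Running times in seconds from section 1.1.3
t500 = [55.2, 58.8, 62.4, 54, 59.4]
t1500 = [271.2, 261, 276, 282, 270]

cv500 = coefficient_of_variation(t500)
cv1500 = coefficient_of_variation(t1500)

print(f"{cv500:.2%}")   # 5.84%
print(f"{cv1500:.2%}")  # 2.86%
```

Although the 1500m times have the larger standard deviation, their coefficient of variation is smaller, so relative to the mean the 1500m times are less variable than the 500m times.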
1.2. Probability distributions and application to business operations and processes.
1.2.1. Definition
Probability is a numerical measure of the likelihood that an event will occur. Probability values
are always assigned on a scale from 0 to 1. A probability near zero indicates an event is quite
unlikely to occur. A probability near one indicates an event is almost certain to occur.
Probability distributions are used to describe real-life variables, such as the toss of a coin or
the weight of a chicken egg. They are also used during hypothesis testing to determine the
value
1.2.2. Types of probability distributions:
Discrete probability distributions: