Statistical Analysis Mean and Standard Deviations
STATISTICAL ANALYSIS
Statistical analysis is often used to explain variations in experimental data. It is the basis on
which predictions can be made from measurements (as in extrapolation). The statistical measures
most familiar to students are probably the mean (or average), which describes the center or
location of a sample, and the standard deviation, which measures the spread of the sample.
The mean is defined as
$$\mu = \frac{\sum_{i} y_i}{n}$$
where y is the variable of interest for each member of the sample and n is the number of
observations in the sample. The standard deviation is the square root of the variance
$$\text{Standard Deviation} = s = \sqrt{s^2}$$
where
$$s^2 = \frac{\sum_{i} y_i^2 - \left(\sum_{i} y_i\right)^2 / n}{n}$$
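The numerator of the variance is the sum of squared deviations from the mean, written in expanded form. A short check, assuming only the definition of the mean given above (µ equal to the sum of the y values divided by n):

```latex
\begin{aligned}
\sum_{i} (y_i - \mu)^2
  &= \sum_{i} y_i^2 - 2\mu \sum_{i} y_i + n\mu^2 \\
  &= \sum_{i} y_i^2 - \frac{2\left(\sum_{i} y_i\right)^2}{n} + \frac{\left(\sum_{i} y_i\right)^2}{n} \\
  &= \sum_{i} y_i^2 - \frac{\left(\sum_{i} y_i\right)^2}{n}
\end{aligned}
```

The expanded form is convenient for hand calculation because it needs only the running sums of y and y squared.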
Example: A total of 58 AISI 1018 cold-drawn steel bars were tested to determine the 0.2 percent
offset yield strength Sy in kpsi. The results were:
Sy (kpsi)    m
64           2
68           6
72           6
76           9
80           19
84           10
88           4
92           2
m is the number of measurements at the given value.
[Figure: histogram of Frequency versus Yield Strength Sy, kpsi]
$$\mu = \frac{\sum S_y \, m}{\sum m} = 78.41$$
$$s^2 = \frac{\sum S_y^2 \, m - \left(\sum S_y \, m\right)^2 / \left(\sum m\right)}{\sum m} = 42.45$$
$$s = \sqrt{s^2} = 6.52$$
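The arithmetic in this example can be checked with a short script. This is a minimal sketch, not part of the original handout: the function name `grouped_stats` and the variable names are illustrative, and the variance is divided by the total count ∑m, as in the example.

```python
import math

# Yield strength values (kpsi) and the number of measurements m at each value,
# copied from the example table.
strengths = [64, 68, 72, 76, 80, 84, 88, 92]
counts = [2, 6, 6, 9, 19, 10, 4, 2]


def grouped_stats(values, freqs):
    """Return (mean, variance, std) for frequency-grouped data.

    The variance divides by the total count sum(freqs), matching the example.
    """
    n = sum(freqs)                                            # sum of m
    sum_y = sum(v * m for v, m in zip(values, freqs))         # sum of Sy * m
    sum_y2 = sum(v * v * m for v, m in zip(values, freqs))    # sum of Sy^2 * m
    mean = sum_y / n
    variance = (sum_y2 - sum_y ** 2 / n) / n
    return mean, variance, math.sqrt(variance)


mean, variance, std = grouped_stats(strengths, counts)
print(f"mean = {mean:.2f}, s^2 = {variance:.2f}, s = {std:.2f}")
# prints: mean = 78.41, s^2 = 42.45, s = 6.52
```

Dividing by ∑m − 1 instead of ∑m would give the slightly larger sample estimate; the script follows the divisor used in the example.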
MAE244 ANALYSIS c.2
The least squares method is used to fit a polynomial of nth degree to data. Because our
experiments will be conducted in the linear range of linear elastic materials, only the fit of a
straight line needs to be considered. The least squares fit determines the slope, m, and the
intercept, b, of the straight line
y = mx + b
that best represents the experimental data (x1, y1), (x2, y2), ..., (xn, yn).
The least squares fit shows how changes in x affect changes in y, where x is the independent
variable and y is the dependent variable.
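[Figure: data points scattered about the fitted line y = mx + b, with an individual point located at y = mx + b + ε. Horizontal axis: x variable (independent). Vertical axis: y variable (dependent).]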
The term ε is added to define the actual location of the points (i.e., ε is an error term). For n
data points (x, y), the slope, m, and the intercept, b, are calculated using the following equations:
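$$\bar{x} = \frac{\sum x}{n}, \qquad \bar{y} = \frac{\sum y}{n}$$
$$S_{xx} = \sum x^2 - \frac{\left(\sum x\right)^2}{n}, \qquad S_{xy} = \sum xy - \frac{\left(\sum x\right)\left(\sum y\right)}{n}$$
$$m = \frac{S_{xy}}{S_{xx}}, \qquad b = \bar{y} - m\bar{x}$$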
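The slope and intercept equations above translate directly into a few lines of code. The following is a minimal sketch; the function name `least_squares_line` and the sample points are hypothetical and chosen so that the result is easy to check by hand.

```python
def least_squares_line(xs, ys):
    """Return (m, b) for the least squares line y = m*x + b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    # Sxx = sum(x^2) - (sum x)^2 / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    # Sxy = sum(x*y) - (sum x)(sum y) / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    m = s_xy / s_xx
    b = sum_y / n - m * (sum_x / n)   # b = y_bar - m * x_bar
    return m, b


# Hypothetical points that lie exactly on y = 2x + 1
m, b = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])
print(m, b)  # prints: 2.0 1.0
```

For these points Sxx = 5 and Sxy = 10, so m = 2 and b = 4 − 2(1.5) = 1. The function assumes the x values are not all identical, since Sxx would then be zero.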
Correlation
The main use of regression is prediction. The sample correlation coefficient, r, is the
statistic used to determine the strength of the correlation (and hence of the prediction). It is found using
$$r = \frac{S_{xy}}{\sqrt{S_{xx} \, S_{yy}}}$$
where $S_{yy} = \sum y^2 - (\sum y)^2 / n$ is defined in the same way as $S_{xx}$, r = 1 is a perfect
positive fit, and r = −1 is a perfect negative fit. The coefficient of determination, r², is often
used to indicate the proportion of the variability in y explained by the linear bivariate
association with x.
Example: r = 0.89, therefore r² = 0.79. Then 79% of the variability in y is explained by its
linear relationship with x.
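The correlation coefficient can be computed from the same sums used for the slope. The sketch below is illustrative only: the function name `correlation_coefficient` and the data points are hypothetical, and Syy is taken to be the y counterpart of Sxx.

```python
import math


def correlation_coefficient(xs, ys):
    """Return the sample correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n   # assumed analogue of Sxx
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data with a moderate positive trend
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation_coefficient(xs, ys)
print(f"r = {r:.4f}, r^2 = {r ** 2:.2f}")
# prints: r = 0.7746, r^2 = 0.60
```

For this made-up data Sxx = 10, Syy = 6, and Sxy = 6, so about 60% of the variability in y is explained by the linear relationship with x.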
Statistical analysis can be performed using Excel or Lotus 1-2-3 so it is not necessary to perform
hand calculations using the above equations.
The regression analysis tool in Excel performs linear regression by using the "least squares"
method to fit a line through a set of observations. Students can analyze how a single dependent
variable is affected by the values of one or more independent variables, for example, how an
athlete's performance is affected by such factors as age, height, and weight. Students can
apportion shares in the performance measure to each of these three factors, based on a set of
performance data, and then use the results to predict the performance of a new, untested athlete.
For the following set of experimental data, regression analysis was performed using Excel.
Strain (µmm/mm)    y
0                  0
180                5
570                10
700                15
1075               20
1300               25
1600               30
1690               35
[Figure: Experimental Data and Linear Regression of Data, plotted against Strain (µmm/mm)]