1-7 Least-Square Regression

This document discusses techniques for fitting curves to discrete data in order to obtain intermediate estimates. There are two general approaches to curve fitting: deriving a single curve that represents the overall trend of scattered data, or passing a curve (or series of curves) through precise data points. The least-squares method minimizes the sum of the squared residuals between measured and calculated values to obtain the best-fit curve. Regression analysis in Excel can perform linear and non-linear regression to model the relationship between two variables and quantify their correlation.


Curve Fitting (Regression)

• Describes techniques to fit curves (curve fitting) to discrete
  data in order to obtain intermediate estimates.

• There are two general approaches to curve fitting:
  – The data exhibit a significant degree of scatter. The strategy
    is to derive a single curve that represents the general trend
    of the data.
  – The data are very precise. The strategy is to pass a curve, or
    a series of curves, through each of the points.
Least Square Method - Curve Fitting

[Figure: linear interpolation and curvilinear interpolation through
the same set of data points]
Mathematical Background
• Arithmetic mean, ybar: the sum of the individual data points (yi)
  divided by the number of points (n):

      ybar = (Σ yi) / n

• Standard deviation, Sy: the most common measure of spread for a
  sample:

      Sy = sqrt( St / (n - 1) ),   where St = Σ (yi - ybar)^2

  or, equivalently,

      Sy = sqrt( [ Σ yi^2 - (Σ yi)^2 / n ] / (n - 1) )
• Variance, Sy^2: representation of spread as the square of the
  standard deviation:

      Sy^2 = St / (n - 1)

  where (n - 1) is the number of degrees of freedom.

• Coefficient of variation, c.v.: quantifies the spread of the data
  relative to the mean, expressed as a percentage:

      c.v. = (Sy / ybar) × 100%
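A minimal Python sketch of these descriptive statistics (the function
name describe and the sample values in the usage line are illustrative,
not from the slides):

import math

def describe(y):
    """Return the mean, standard deviation, variance, and c.v. (%) of a sample."""
    n = len(y)
    ybar = sum(y) / n                            # arithmetic mean
    St = sum((yi - ybar) ** 2 for yi in y)       # total sum of squares about the mean
    Sy2 = St / (n - 1)                           # variance (n - 1 degrees of freedom)
    Sy = math.sqrt(Sy2)                          # standard deviation
    cv = 100.0 * Sy / ybar                       # coefficient of variation, percent
    return ybar, Sy, Sy2, cv

# Example: describe([6.4, 7.1, 6.8, 7.5, 7.0])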
Least Squares Regression

Linear Regression
• Fitting a straight line to a set of paired
  observations: (x1, y1), (x2, y2), …, (xn, yn):

      y = a0 + a1 x + e

  a1 - slope
  a0 - intercept
  e  - error, or residual, between the model and the observations
Criteria for a “Best” Fit
• One option is to minimize the sum of the residual errors for all
  available data:

      Σ ei = Σ (yi - a0 - a1 xi),   i = 1, …, n

  where n = total number of points.

• However, this is an inadequate criterion, and so is the sum of the
  absolute values of the residuals:

      Σ |ei| = Σ |yi - a0 - a1 xi|
[Figure: examples of (a) minimizing the sum of the residual errors and
(b) minimizing the sum of the absolute values of the residual errors;
both criteria are inadequate]
• The best strategy is to minimize the sum of the squares of the
  residuals between the measured y and the y calculated with the
  linear model:

      Sr = Σ ei^2 = Σ (yi,measured - yi,model)^2 = Σ (yi - a0 - a1 xi)^2

• This strategy yields a unique line for any given set of data.
Least-Squares Fit of a Straight Line

Setting the derivatives of Sr with respect to a0 and a1 to zero gives
the normal equations, which can be solved simultaneously:

      n a0 + (Σ xi) a1 = Σ yi
      (Σ xi) a0 + (Σ xi^2) a1 = Σ xi yi

Solving for the coefficients:

      a1 = [ n Σ xi yi - (Σ xi)(Σ yi) ] / [ n Σ xi^2 - (Σ xi)^2 ]
      a0 = ybar - a1 xbar

where xbar and ybar are the mean values of x and y.
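A quick worked example with hypothetical data, xi = 1, 2, 3, 4 and
yi = 1, 3, 5, 7 (not from the slides):

      n = 4,  Σ xi = 10,  Σ yi = 16,  Σ xi yi = 50,  Σ xi^2 = 30
      a1 = (4×50 - 10×16) / (4×30 - 10^2) = 40 / 20 = 2
      a0 = ybar - a1 xbar = 4 - 2×2.5 = -1

so the least-squares line is y = -1 + 2x; here the fit is exact because
the data happen to lie on a straight line.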
“Goodness” of Our Fit
If
• the total sum of the squares around the mean for the dependent
  variable, y, is St, and
• the sum of the squares of the residuals around the regression
  line is Sr,
then St - Sr quantifies the improvement, or error reduction, due to
describing the data in terms of a straight line rather than as an
average value.

      r^2 = (St - Sr) / St   - coefficient of determination
      r = sqrt(r^2)          - correlation coefficient
r^2 - Coefficient of Determination

      r^2 = (St - Sr) / St
• For a perfect fit, Sr = 0 and r = r^2 = 1, signifying that the
  line explains 100 percent of the variability of the data.
• For r = r^2 = 0, Sr = St and the fit represents no improvement.
• A correlation coefficient, r, greater than 0.8 is generally
  described as strong, whereas a correlation below 0.5 is generally
  described as weak.
Algorithm for Least-Square Linear
Regression
Sumx = 0 : Sumy = 0 : St = 0
Sumxy = 0 : Sumx2 = 0 : Sr = 0

' Calculate Sumx, Sumy, Sumxy, and Sumx2
For i = 1 To n
  Sumx = Sumx + x(i)
  Sumy = Sumy + y(i)
  Sumxy = Sumxy + x(i) * y(i)
  Sumx2 = Sumx2 + x(i) ^ 2
Next i

' Calculate the mean values of x and y
xm = Sumx / n : ym = Sumy / n

' Calculate the constants of the line equation, a0 and a1
a1 = (n * Sumxy - Sumx * Sumy) / (n * Sumx2 - Sumx ^ 2)
a0 = ym - a1 * xm
Algorithm for Least-Square Linear
Regression (Cont.)
' Calculate St and Sr
For i = 1 To n
  St = St + (y(i) - ym) ^ 2
  Sr = Sr + (y(i) - a1 * x(i) - a0) ^ 2
Next i

' Calculate r2
r2 = (St - Sr) / St
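For readers who want a runnable version, here is a direct Python
transcription of the pseudocode above (a sketch; the function name
linear_regression and the data in the usage line are illustrative):

def linear_regression(x, y):
    """Least-squares fit of y = a0 + a1*x; returns a0, a1, and r2."""
    n = len(x)
    Sumx = sum(x)
    Sumy = sum(y)
    Sumxy = sum(xi * yi for xi, yi in zip(x, y))
    Sumx2 = sum(xi ** 2 for xi in x)

    xm = Sumx / n
    ym = Sumy / n

    # Constants of the line equation
    a1 = (n * Sumxy - Sumx * Sumy) / (n * Sumx2 - Sumx ** 2)
    a0 = ym - a1 * xm

    # Goodness of fit
    St = sum((yi - ym) ** 2 for yi in y)
    Sr = sum((yi - a1 * xi - a0) ** 2 for xi, yi in zip(x, y))
    r2 = (St - Sr) / St
    return a0, a1, r2

# Example: linear_regression([1, 2, 3, 4], [1, 3, 5, 7]) returns (-1.0, 2.0, 1.0)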
Linearization of Non-linear Regression

Exponential model (non-linear):   y = a e^(b x)
Taking logarithms gives the linear form:   ln y = ln a + b x

Perform linear regression for xi and (ln yi): the slope of the fitted
line gives b and the intercept gives ln a.
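A short Python sketch of this transformation; it reuses the
linear_regression function given above, and fit_exponential is an
illustrative name rather than something defined in the slides:

import math

def fit_exponential(x, y):
    """Fit y = a * exp(b*x) by regressing ln(y) on x."""
    a0, a1, r2 = linear_regression(x, [math.log(yi) for yi in y])
    return math.exp(a0), a1, r2   # a = exp(intercept), b = slope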
Linearization of Non-linear Regression

Power model (non-linear):   y = a x^b
Taking logarithms gives the linear form:   ln y = ln a + b ln x

Perform linear regression for ln(xi) and (ln yi): the slope of the
fitted line gives b and the intercept gives ln a.
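The analogous sketch for the power model (again reusing
linear_regression; fit_power is an illustrative name):

import math

def fit_power(x, y):
    """Fit y = a * x**b by regressing ln(y) on ln(x)."""
    a0, a1, r2 = linear_regression([math.log(xi) for xi in x],
                                   [math.log(yi) for yi in y])
    return math.exp(a0), a1, r2   # a = exp(intercept), b = slope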
Linearization of Non-linear Regression (Assignment)

Non-linear:  ?
Linear:      ?

Perform regression for:  ?
Regression Analysis in Excel
Excel has several built-in regression models for correlating two
variables.
Regression Analysis in Excel
Here is an example of a sample data set and the plot of a "best-fit"
straight line through the data.

[Figure: scatter plot of the sample data with the fitted straight line]
Regression Analysis in Excel

Correlation Coefficient, r
Coefficient of Determination, r^2 (or R^2)
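For reference, the same quantities can be computed with Excel's
built-in worksheet functions; the cell ranges below (x values in
A2:A11, y values in B2:B11) are purely illustrative:

=SLOPE(B2:B11, A2:A11)       → slope a1 of the least-squares line
=INTERCEPT(B2:B11, A2:A11)   → intercept a0
=CORREL(A2:A11, B2:B11)      → correlation coefficient, r
=RSQ(B2:B11, A2:A11)         → coefficient of determination, r^2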
Questions

