Chapter 5 Curve Fitting PART 1

Curve fitting describes techniques to fit curves through discrete data points to estimate intermediate values. The two main approaches are least-squares regression and interpolation. Least-squares regression finds the best-fit linear or polynomial curve by minimizing the residuals between the data and curve. It is commonly used when data is variable or nonlinear. Linear regression fits a straight line using slope and y-intercept derived from minimizing the sum of squared residuals. The standard error of estimate and correlation coefficient quantify the goodness of fit.


Chapter 5

Curve fitting
• Linear regression (exponential model, power equation and saturation-growth-rate equation)
• Polynomial regression
• Polynomial interpolation (linear interpolation, quadratic interpolation, Newton divided differences)
• Lagrange interpolation
• Spline interpolation

• Curve fitting describes techniques to fit curves through discrete data points so that intermediate estimates can be obtained.
• Two general approaches for curve fitting:
a) Least-Squares Regression - fits the shape or general trend of the data by sketching a best line without necessarily matching the individual points (Figure PT5.1, pg 426).
- 2 types of fitting:
i) Linear Regression
ii) Polynomial Regression
b) Interpolation - fits curves that pass directly through each of the data points.
Figure PT5.1, pg 439 shows sketches developed from the same set of data by 3 engineers:
a) Least-squares regression - did not attempt to connect the points, but characterized the general upward trend of the data with a straight line.
b) Linear interpolation - used straight-line segments (linear interpolation) to connect the points; a very common practice in engineering. If the values are close to being linear, such approximation provides estimates that are adequate for many engineering calculations. However, if the data is widely spaced, significant errors can be introduced by such linear interpolation.
c) Curvilinear interpolation - used curves to try to capture the trend suggested by the data.
• Our goal here is to develop systematic and objective methods for deriving such curves.
a) Least-Squares Regression: i) Linear Regression
• Used to minimize the discrepancy between the data points and the fitted curve. Sometimes polynomial interpolation is inappropriate and may yield unsatisfactory results when used to predict intermediate values (see Fig. 17.1, pg 455).
• Fig. 17.1 a) shows 7 experimentally derived data points exhibiting significant variability (data exhibiting significant error).

• Linear Regression is fitting a 'best' straight line through the points.
• The mathematical expression for the straight line is:

$y = a_0 + a_1 x + e$ (Eq 17.1)

where $a_1$ is the slope, $a_0$ the intercept, and $e$ the error, or residual, between the model and the observations.
• Rearranging the equation above gives:

$e = y - a_0 - a_1 x$

• Thus, the error, or residual, is the discrepancy between the true value $y$ and the approximate value, $a_0 + a_1 x$, predicted by the linear equation.

Criteria for a 'Best' Fit

• One way to obtain a 'best' fit line through the data is to minimize the sum of the residual errors:

$\sum_{i=1}^{n} e_i = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)$ (Eq 17.2)

where $n$ is the total number of points.

• This criterion is inadequate because positive and negative errors can cancel. A strategy that overcomes this shortcoming is to minimize the sum of the squares of the errors between the measured $y$ and the $y$ calculated with the linear model, as shown in Eq 17.3:

$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(y_{i,\text{measured}} - y_{i,\text{model}}\right)^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$ (Eq 17.3)

Least-Squares Fit for a Straight Line

• To determine values for $a_0$ and $a_1$: i) differentiate Equation 17.3 with respect to each coefficient, ii) set the derivatives equal to zero (minimizing $S_r$), iii) use $\sum a_0 = n \cdot a_0$ to obtain Equations 17.4 and 17.5, called the normal equations (refer to the textbook), which can be solved simultaneously for $a_1$ and $a_0$:

$a_1 = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$ (Eq 17.6)

$a_0 = \bar{y} - a_1 \bar{x}$ (Eq 17.7)

where $\bar{y}$, $\bar{x}$ are the means of $y$ and $x$.
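As a concrete illustration, the following is a minimal Python sketch of Eqs 17.6 and 17.7 (the helper name fit_line and the use of NumPy are our own choices, not from the text):

```python
import numpy as np

def fit_line(x, y):
    """Least-squares straight line y = a0 + a1*x (Eqs 17.6 and 17.7)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Eq 17.6: slope from the normal equations
    a1 = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
    # Eq 17.7: intercept from the means
    a0 = y.mean() - a1 * x.mean()
    return a0, a1
```

For the data of Example 1 below, this should reproduce the textbook values a0 ≈ 0.0714 and a1 ≈ 0.8393 (Example 17.1, pg 444).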
EXAMPLE 1

Use least-squares regression to fit a straight line to:

x 1 2 3 4 5 6 7

y 0.5 2.5 2.0 4.0 3.5 6.0 5.5


• Two criteria that make least-squares regression provide the best estimates of $a_0$ and $a_1$ (the maximum likelihood principle in statistics):
i. The spread of the points around the line is of similar magnitude along the entire range of the data.
ii. The distribution of these points about the line is normal.

• If these criteria are met, a 'standard deviation' for the regression line is given by:

$s_{y/x} = \sqrt{\frac{S_r}{n-2}}$ (Eq. 17.9)

where:
$s_{y/x}$ : standard error of the estimate
'y/x' : the error is for a predicted value of $y$ corresponding to a particular value of $x$
$n-2$ : two data-derived estimates, $a_0$ and $a_1$, were used to compute $S_r$ (we have lost 2 degrees of freedom)
• Equation 17.9 is derived from the standard deviation $s_y$ about the mean:

$s_y = \sqrt{\frac{S_t}{n-1}}$ (PT5.2, pg 442)

$S_t = \sum (y_i - \bar{y})^2$ (PT5.3, pg 442)

where $S_t$ is the total sum of the squares of the residuals between the data points and the mean.

• Just as in the case of the standard deviation, the standard error of the estimate quantifies the spread of the data.
Estimation of Errors in Summary

1. Standard deviation:

$s_y = \sqrt{\frac{S_t}{n-1}}$ (PT5.2, pg 442)

$S_t = \sum (y_i - \bar{y})^2$ (PT5.3, pg 442)

2. Standard error of the estimate:

$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$ (Eq 17.8)

$s_{y/x} = \sqrt{\frac{S_r}{n-2}}$ (Eq 17.9)

where $y/x$ designates that the error is for a predicted value of $y$ corresponding to a particular value of $x$.
3. Coefficient of determination:

$r^2 = \frac{S_t - S_r}{S_t}$ (Eq 17.10)

4. Correlation coefficient:

$r = \sqrt{\frac{S_t - S_r}{S_t}}$ or $r = \frac{n\sum x_i y_i - (\sum x_i)(\sum y_i)}{\sqrt{n\sum x_i^2 - (\sum x_i)^2}\,\sqrt{n\sum y_i^2 - (\sum y_i)^2}}$ (Eq 17.11)
EXAMPLE 2

Use least-squares regression to fit a straight line to:

x 1 2 3 4 5 6 7

y 0.5 2.5 2.0 4.0 3.5 6.0 5.5

Compute the standard deviation, the standard error of estimate, and the correlation coefficient for the data above (use the Example 1 result).
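A sketch of these error measures in Python, assuming the fit_line helper defined earlier (the function and variable names are ours):

```python
import numpy as np  # assumes fit_line from the earlier sketch is in scope

def fit_stats(x, y, a0, a1):
    """Goodness-of-fit measures for the fitted line y = a0 + a1*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    St = np.sum((y - y.mean())**2)     # PT5.3: spread about the mean
    Sr = np.sum((y - a0 - a1 * x)**2)  # Eq 17.8: spread about the line
    sy = np.sqrt(St / (n - 1))         # PT5.2: standard deviation
    syx = np.sqrt(Sr / (n - 2))        # Eq 17.9: standard error of estimate
    r = np.sqrt((St - Sr) / St)        # Eqs 17.10-17.11
    return sy, syx, r
```

For the Example 2 data this should give roughly sy = 1.946, sy/x = 0.774, and r = 0.932, the values worked out in the text's Example 17.2.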
QUIZ 1

Use least-squares regression to fit a straight line to:

x 1 2 3 4 5 6 7 8 9

y 1 1.5 2 3 4 5 8 10 13

Compute the standard error of estimate and the correlation coefficient.


QUIZ 2 (DIY)
Compute the standard error of the estimate and the
correlation coefficient.

x 0.25 0.75 1.25 1.50 2.00

y -0.45 -0.60 0.70 1.88 6.00



Linearization of Nonlinear Relationships

• Linear regression provides a powerful technique for fitting the best line to data where the relationship between the dependent and independent variables is linear.
• But this is not always the case; thus, the first step in any regression analysis should be to plot the data and visually inspect whether a linear model applies.
Figure 17.8: a) data that is ill-suited for linear regression; b) a parabola is preferable.
Figure 17.9: Types of nonlinear equations and their linearized versions, respectively.
• Fig. 17.9, pg 453 shows population-growth or radioactive-decay behavior.

Fig. 17.9 (a): the exponential model

$y = \alpha_1 e^{\beta_1 x}$ (17.12)

where $\alpha_1$, $\beta_1$ are constants, $\beta_1 \neq 0$.

• This model is used in many fields of engineering to characterize quantities:
Quantities increase: $\beta_1$ positive
Quantities decrease: $\beta_1$ negative
Example 3

Fit an exponential model $y = a e^{bx}$ to:

x 0.4 0.8 1.2 1.6 2.0 2.3

y 750 1000 1400 2000 2700 3750

Solution
• Linearize the model into:
$\ln y = \ln a + bx$
which has the straight-line form $y = a_0 + a_1 x$ (Eq. 17.1).
• Build the table of the parameters used in Eqs 17.6 and 17.7, as in Example 17.1, pg 444.

xi yi ln yi xi² (xi)(ln yi)


0.4 750 6.620073 0.16 2.648029
0.8 1000 6.900775 0.64 5.520620
1.2 1400 7.244228 1.44 8.693074
1.6 2000 7.600902 2.56 12.161443
2.0 2700 7.901007 4.00 15.802014
2.3 3750 8.229511 5.29 18.927875
Σ 8.3 44.496496 14.09 63.753055

n6
n n

x
i 1
i  8.3  ln y
i 1
i  44.496496
n n

x
i 1
2
i
 14.09  ( x )(ln y )  63.753055
i 1
i i

8.3 44.496496
x  1.383333 ln y 
Curve fitting
 7.416083
6 6
24

n( xi )(ln yi )  xi (ln yi )


a1  b 
nxi2  (xi ) 2
(6)(63.753055)  (8.3)(44.496496)
b  0.843
(6)(14.09)  (8.3) 2

a0  ln a  ln y  b x  7.416083  (0.843)(1.383333)
ln a  6.25

Straight-line:
ln y  ln a  bx
 ln y  6.25  0.843x

Exponential: y  a ebx

ln a  6.25  a  e6.25  518


 y  a ebx  518 e0.843x Curve fitting
25
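Scripted with the fit_line sketch from earlier, the same transformation looks like this (variable names are ours):

```python
import numpy as np  # assumes fit_line from the earlier sketch is in scope

x = np.array([0.4, 0.8, 1.2, 1.6, 2.0, 2.3])
y = np.array([750, 1000, 1400, 2000, 2700, 3750])

ln_a, b = fit_line(x, np.log(y))  # fit the linearized model ln y = ln a + b*x
a = np.exp(ln_a)                  # back-transform; expect roughly a = 518, b = 0.843
```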


Power Equation

• Equation (17.13) can be linearized by taking the base-10 logarithm to yield Equation (17.16):

$y = \alpha_2 x^{\beta_2}$ (17.13)

$\log y = \log \alpha_2 + \beta_2 \log x$ (17.16)

• A plot of $\log y$ versus $\log x$ will yield a straight line with a slope of $\beta_2$ and an intercept of $\log \alpha_2$.
Example 4

Linearize the power equation (17.13) and fit it to the data in the table below using a logarithmic transformation of the data.

x 1 2 3 4 5

y 0.5 1.7 3.4 5.7 8.4
Solution:

xi yi log xi log yi (log xi)² (log xi)(log yi)

1 0.5 0 -0.301 0 0
2 1.7 0.301 0.226 0.090601 0.068026
3 3.4 0.477 0.534 0.227529 0.254718
4 5.7 0.602 0.753 0.362404 0.453306
5 8.4 0.699 0.922 0.488601 0.644478
Σ 2.079 2.134 1.169135 1.420528

n5
n n

 log x
i 1
i  2.079  log y
i 1
i  2.134
n n

 (log x )
i 1
i
2
 1.169135  (log x )(log y )  1.420528
i 1
i i

2.079 2.134
log x   0.4158 log y   0.4268
5 5

n(log x i )(log y i )  (log x i )(log y i )


b
n(log x i ) 2  (log x i ) 2
(5)(1.420528)  (2.079)(2.134)
b  1.75
(5)(1.169135)  (2.079) 2

 Curve fitting
31

$\log a = \overline{\log y} - b\,\overline{\log x} = 0.4268 - (1.75)(0.4158) = -0.3$

Straight-line: $\log y = \log a + b\log x \;\Rightarrow\; \log y = -0.3 + 1.75\log x$

Power: $y = a x^b$, with $\log a = -0.3 \Rightarrow a = 10^{-0.3} = 0.5$

$\Rightarrow y = a x^b = 0.5\,x^{1.75}$

• Fig. 17.10 a), pg 455, is a plot of the original data in its untransformed state, while Fig. 17.10 b) is a plot of the transformed data.
• The intercept is $\log \alpha_2 = -0.300$, and by taking the antilogarithm, $\alpha_2 = 10^{-0.3} = 0.5$.
• The slope is $\beta_2 = 1.75$; consequently, the power equation is $y = 0.5\,x^{1.75}$.
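The same log-log fit, scripted with the earlier fit_line sketch (names are ours):

```python
import numpy as np  # assumes fit_line from the earlier sketch is in scope

x = np.array([1, 2, 3, 4, 5])
y = np.array([0.5, 1.7, 3.4, 5.7, 8.4])

log_a, b = fit_line(np.log10(x), np.log10(y))  # log y = log a + b*log x
a = 10 ** log_a                                # expect roughly a = 0.5, b = 1.75
```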


Saturation-Growth-Rate Equation

• Equation (17.14) can be linearized by inverting it to yield Equation (17.17):

$y = \alpha_3 \frac{x}{\beta_3 + x}$ (17.14)

$\frac{1}{y} = \frac{\beta_3}{\alpha_3}\,\frac{1}{x} + \frac{1}{\alpha_3}$ (17.17)

• A plot of $1/y$ versus $1/x$ will yield a straight line with a slope of $\beta_3/\alpha_3$ and an intercept of $1/\alpha_3$.
• In their transformed forms, these models are fit using linear regression in order to evaluate the constant coefficients.
• They can then be transformed back to their original state and used for predictive purposes, as the following examples illustrate.

Example 5

Linearize the saturation-growth-rate equation (17.14) and fit it to the data in the table below.

x 0.75 2 2.5 4 6 8 8.5

y 0.8 1.3 1.2 1.6 1.7 1.8 1.7

Solution

n = 7

$\sum_{i=1}^{n} \frac{1}{x_i} = 2.8926 \qquad \sum_{i=1}^{n} \frac{1}{y_i} = 5.2094$

$\sum_{i=1}^{n} \left(\frac{1}{x_i}\right)^2 = 2.3074 \qquad \sum_{i=1}^{n} \left(\frac{1}{x_i}\right)\left(\frac{1}{y_i}\right) = 2.7774$

$\overline{\left(\frac{1}{x}\right)} = \frac{2.8926}{7} = 0.4132 \qquad \overline{\left(\frac{1}{y}\right)} = \frac{5.2094}{7} = 0.7442$

xi yi 1/xi 1/yi (1/xi)² (1/xi)(1/yi)

0.75 0.8 1.33333 1.25000 1.7777 1.6666


2 1.3 0.50000 0.76923 0.2500 0.3846
2.5 1.2 0.40000 0.83333 0.1600 0.3333
4 1.6 0.25000 0.62500 0.0625 0.1562
6 1.7 0.16667 0.58823 0.0278 0.0981
8 1.8 0.12500 0.55555 0.0156 0.0694
8.5 1.7 0.11765 0.58823 0.0138 0.0692
Σ 2.89260 5.20940 2.3074 2.7774

$\frac{b}{a} = \frac{n\sum \left(\frac{1}{x_i}\right)\left(\frac{1}{y_i}\right) - \sum \frac{1}{x_i} \sum \frac{1}{y_i}}{n\sum \left(\frac{1}{x_i}\right)^2 - \left(\sum \frac{1}{x_i}\right)^2}$

$\frac{b}{a} = \frac{(7)(2.7774) - (2.8926)(5.2094)}{(7)(2.3074) - (2.8926)^2} = 0.5617$

1  1 b1
         0.7442  (0.5935)(0.4132)
a  y a x
1
   0.4990
a
1 1 b1
Straight-line:  
y a ax

1 1
  0.4990  0.5935
y x

 x 
Saturation-growth: y  a  b  x 
1
 0.4990   a  2
a
b
 0.5935   b  (0.5935)(2)  1.187
a
 x 
 y  2
1.187  x  Curve fitting
39
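And as a script, reusing the fit_line sketch from earlier (names are ours):

```python
import numpy as np  # assumes fit_line from the earlier sketch is in scope

x = np.array([0.75, 2, 2.5, 4, 6, 8, 8.5])
y = np.array([0.8, 1.3, 1.2, 1.6, 1.7, 1.8, 1.7])

intercept, slope = fit_line(1 / x, 1 / y)  # 1/y = 1/a + (b/a)*(1/x)
a = 1 / intercept                          # expect roughly a = 1.95
b = slope * a                              # expect roughly b = 1.10
```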

Quiz 3

Fit a power equation and a saturation-growth-rate equation to:

x 1 2 3 4 5 6 7

y 2.1 2.2 2.3 2.4 2.5 2.6 2.7


Polynomial Regression (pg 470)

• Another alternative is to fit polynomials to the data using polynomial regression.

• The least-squares procedure can be readily extended to fit the data to a higher-order polynomial.

• For example, to fit a second-order polynomial, or quadratic:

$y = a_0 + a_1 x + a_2 x^2 + e$

• The sum of the squares of the residuals is:

$S_r = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i - a_2 x_i^2\right)^2$

where $n$ is the total number of points.
• Then, take the derivative of Equation (17.18) with respect to each of the unknown coefficients $a_0$, $a_1$, and $a_2$ of the polynomial:

$\frac{\partial S_r}{\partial a_0} = -2\sum \left(y_i - a_0 - a_1 x_i - a_2 x_i^2\right)$

$\frac{\partial S_r}{\partial a_1} = -2\sum x_i\left(y_i - a_0 - a_1 x_i - a_2 x_i^2\right)$

$\frac{\partial S_r}{\partial a_2} = -2\sum x_i^2\left(y_i - a_0 - a_1 x_i - a_2 x_i^2\right)$

• Setting these equations equal to zero and rearranging (using $\sum a_0 = n \cdot a_0$) develops the following set of normal equations:

$(n)a_0 + \left(\sum x_i\right)a_1 + \left(\sum x_i^2\right)a_2 = \sum y_i$

$\left(\sum x_i\right)a_0 + \left(\sum x_i^2\right)a_1 + \left(\sum x_i^3\right)a_2 = \sum x_i y_i$ (17.19)

$\left(\sum x_i^2\right)a_0 + \left(\sum x_i^3\right)a_1 + \left(\sum x_i^4\right)a_2 = \sum x_i^2 y_i$

• The above 3 equations are linear in the 3 unknown coefficients ($a_0$, $a_1$, and $a_2$), which can be calculated directly from the observed data.
• In matrix form:

$\begin{bmatrix} n & \sum x_i & \sum x_i^2 \\ \sum x_i & \sum x_i^2 & \sum x_i^3 \\ \sum x_i^2 & \sum x_i^3 & \sum x_i^4 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \end{Bmatrix}$

• The two-dimensional case can be easily extended to an mth-order polynomial:

$y = a_0 + a_1 x + a_2 x^2 + \dots + a_m x^m + e$

• Thus, the standard error for an mth-order polynomial is:

$s_{y/x} = \sqrt{\frac{S_r}{n - (m+1)}}$ (17.20)
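A sketch of this procedure for a general order m (the helper name fit_poly and the use of NumPy are our own, not from the text), building and solving the same normal-equation system:

```python
import numpy as np

def fit_poly(x, y, m):
    """Least-squares polynomial y = a0 + a1*x + ... + am*x^m (normal equations 17.19)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.vander(x, m + 1, increasing=True)  # columns: 1, x, x^2, ..., x^m
    # Normal equations (A^T A) a = A^T y, solved directly (e.g., by Gauss elimination)
    return np.linalg.solve(A.T @ A, A.T @ y)  # coefficients a0, a1, ..., am
```

Solving the normal equations directly mirrors the derivation above; for high orders this system can be ill-conditioned, and a library routine such as np.polyfit, which solves the same least-squares problem more stably, would be preferable in practice.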
Example 6

Fit a second-order polynomial to the data in the first 2 columns of Table 17.4:

xi yi xi² xi³ xi⁴ xiyi xi²yi

0 2.1 0 0 0 0 0
1 7.7 1 1 1 7.7 7.7
2 13.6 4 8 16 27.2 54.4
3 27.2 9 27 81 81.6 244.8
4 40.9 16 64 256 163.6 654.4
5 61.1 25 125 625 305.5 1527.5
Σ 15 152.6 55 225 979 585.6 2488.8

• From the given data:

m = 2, n = 6, x̄ = 2.5, ȳ = 25.433
Σxi = 15, Σyi = 152.6, Σxi² = 55, Σxi³ = 225, Σxi⁴ = 979
Σxiyi = 585.6, Σxi²yi = 2488.8

$\begin{bmatrix} n & \sum x_i & \sum x_i^2 \\ \sum x_i & \sum x_i^2 & \sum x_i^3 \\ \sum x_i^2 & \sum x_i^3 & \sum x_i^4 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \end{Bmatrix}$

• Therefore, the simultaneous linear equations are:

$\begin{bmatrix} 6 & 15 & 55 \\ 15 & 55 & 225 \\ 55 & 225 & 979 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} 152.6 \\ 585.6 \\ 2488.8 \end{Bmatrix}$

• Solving these equations through a technique such as Gauss elimination gives:
$a_0 = 2.47857$, $a_1 = 2.35929$, and $a_2 = 1.86071$

• Therefore, the least-squares quadratic equation for this case is:

$y = 2.47857 + 2.35929x + 1.86071x^2$

• To calculate $S_t$ and $S_r$, build columns 3 and 4 of Table 17.4.

xi yi (yi - ȳ)² (yi - a0 - a1xi - a2xi²)²
0 2.1 544.44 0.14332
1 7.7 314.47 1.00286
2 13.6 140.03 1.08158
3 27.2 3.12 0.80491
4 40.9 239.22 0.61951
5 61.1 1272.11 0.09439
Σ 152.6 2513.39 3.74657

$S_t = \sum (y_i - \bar{y})^2 = 2513.39$

$S_r = \sum (y_i - a_0 - a_1 x_i - a_2 x_i^2)^2 = 3.74657$

The standard error of the regression polynomial:

$s_{y/x} = \sqrt{\frac{S_r}{n - (m+1)}} = \sqrt{\frac{3.74657}{6 - (2+1)}} = 1.12$

• The correlation coefficient can be calculated using Equations 17.10 and 17.11, respectively:

$r^2 = \frac{S_t - S_r}{S_t}$

$r = \sqrt{\frac{S_t - S_r}{S_t}}$ or $r = \frac{n\sum x_i y_i - (\sum x_i)(\sum y_i)}{\sqrt{n\sum x_i^2 - (\sum x_i)^2}\,\sqrt{n\sum y_i^2 - (\sum y_i)^2}}$

Therefore, $r^2 = \frac{S_t - S_r}{S_t} = \frac{2513.39 - 3.74657}{2513.39} = 0.99851$

The correlation coefficient is $r = 0.99925$.

• The results indicate that 99.851% of the original uncertainty has been explained by the model. This result supports the conclusion that the quadratic equation represents an excellent fit, as is evident from Fig. 17.11.
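As a quick check, the fit_poly sketch from earlier should reproduce these coefficients:

```python
import numpy as np  # assumes fit_poly from the earlier sketch is in scope

x = np.array([0, 1, 2, 3, 4, 5])
y = np.array([2.1, 7.7, 13.6, 27.2, 40.9, 61.1])

a0, a1, a2 = fit_poly(x, y, 2)  # expect roughly 2.47857, 2.35929, 1.86071
```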
Figure 17.11: Fit of a second-order polynomial.
