10 Matlab Fitting
Curve Fitting
Data Approximation with a Function
• Data points (xi, yi) can be approximated by a function y = f(x) such that the function passes “close” to the data points but does not necessarily pass through them.
• Data fitting is necessary to model data with fluctuations, such as experimental measurements.
• The most common form of data fitting is the least squares method.
¾ The most common form of data fitting is the
least squares method.
Fitting Polynomials to Data
• Fitting analytical curves to experimental data is a common task in engineering.
• The simplest is linear interpolation (straight lines between data points). Polynomials are often used to define curves, and there are several options:
– Fit a curve through all data points (polynomial or spline interpolation),
– Fit a curve through some data points,
– Fit a curve as close to the points as possible (polynomial regression, also called “least squares fitting”).
(Figures: polynomial interpolation vs. polynomial regression.)
Multi‐Valued Data Points
Fitting Data with a Linear Function
• Given data points, fit a straight line y = ax + b: determine a and b.
Fitting Data with a Linear Function
• Linear interpolation with a polynomial of degree one
• Input: two nodes (x1, y1) and (x2, y2)
• Output: the linear polynomial y(x) = p1·x + p2, whose coefficients satisfy
p1·x1 + p2 = y1
p1·x2 + p2 = y2
or, in matrix form,
[ x1  1 ] [ p1 ]   [ y1 ]
[ x2  1 ] [ p2 ] = [ y2 ]
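Although the slides use MATLAB, the same two-node solve can be sketched in Python with NumPy; the node values below are made up for illustration.

```python
import numpy as np

# Two illustrative nodes (x1, y1) = (1, 2) and (x2, y2) = (3, 8)
x1, y1 = 1.0, 2.0
x2, y2 = 3.0, 8.0

# Coefficient matrix [[x1, 1], [x2, 1]] and right-hand side [y1, y2]
A = np.array([[x1, 1.0], [x2, 1.0]])
rhs = np.array([y1, y2])

p = np.linalg.solve(A, rhs)  # p[0] = slope p1, p[1] = intercept p2
# Here the line y(x) = 3x - 1 passes through both nodes
```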
Fitting Data with a Quadratic Function
• Quadratic interpolation with a polynomial of degree two
• Input: three nodes (x1, y1), (x2, y2), (x3, y3)
• Output: the quadratic polynomial y(x) = p1·x^2 + p2·x + p3, whose coefficients satisfy
p1·x1^2 + p2·x1 + p3 = y1
p1·x2^2 + p2·x2 + p3 = y2
p1·x3^2 + p2·x3 + p3 = y3
or, in matrix form,
[ x1^2  x1  1 ] [ p1 ]   [ y1 ]
[ x2^2  x2  1 ] [ p2 ] = [ y2 ]
[ x3^2  x3  1 ] [ p3 ]   [ y3 ]
Fitting Data with a Polynomial Function
• Polynomial interpolation of degree n requires n + 1 nodes (x1, y1), …, (x(n+1), y(n+1))
• Output: y(x) = p1·x^n + … + pn·x + p(n+1), whose coefficients satisfy, in matrix form,
[ x1^n      …  x1      1 ] [ p1     ]   [ y1     ]
[ x2^n      …  x2      1 ] [ p2     ]   [ y2     ]
[  ⋮        ⋱   ⋮      ⋮ ] [  ⋮     ] = [  ⋮     ]
[ x(n+1)^n  …  x(n+1)  1 ] [ p(n+1) ]   [ y(n+1) ]
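The system above is a Vandermonde system. As a sketch in Python/NumPy with made-up nodes, `np.vander` builds the same matrix, highest power first:

```python
import numpy as np

# n + 1 nodes determine a unique polynomial of degree n
x = np.array([-1.0, 0.0, 1.0, 2.0])   # illustrative nodes
y = np.array([ 1.0, 0.0, 1.0, 4.0])   # samples of y = x^2 at those nodes
n = len(x) - 1

# Row i of the Vandermonde matrix is [x_i^n, ..., x_i, 1]
V = np.vander(x, n + 1)
p = np.linalg.solve(V, y)             # coefficients p1..p(n+1), highest power first
```

Because the samples come from y = x^2, the cubic interpolant collapses to that quadratic.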
Higher‐Order Polynomials
Linear Least Squares Algorithm
• Data points (xi, yi), i = 1, …, n
• Compact notation:
z = Σ (i = 1 to n) [yi − (a·xi + b)]²
• A least squares linear fit minimizes the sum of the squared vertical distances between the data points and the line of best fit.
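A quick numerical sketch of this minimization in Python/NumPy, with made-up data: the line returned by a least squares fit attains an objective value z no larger than any other line.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.1, 1.9, 4.2, 5.8])     # roughly y = 2x, with noise (illustrative)

def z(a, b):
    # Sum of squared vertical distances from the data to the line y = a*x + b
    return float(np.sum((y - (a * x + b)) ** 2))

a_best, b_best = np.polyfit(x, y, 1)   # least squares line
z_best = z(a_best, b_best)
```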
polyfit Function
• Polynomial curve fitting:
coef = polyfit(x,y,n)
• A least-squares fit with a polynomial of degree n.
• x and y are data vectors, and coef is the vector of coefficients of the polynomial, highest power first:
coef = [p1, p2, …, pn, pn+1]
Polynomial Curve Fitting
• x and y are vectors containing the x and y data to be fitted, and n is the order of the polynomial to return. For example, consider the x–y test data:
x = [-1.0, -0.7, -0.2, 0.6, 1.0];
y = [-2.1, 0.3, 1.7, 1.9, 2.1];
• A fourth-order (n = 4) polynomial that approximately fits the data is
p = polyfit(x,y,4)
• In other words, the polynomial function would be:
y(x) = p(1)·x^4 + p(2)·x^3 + p(3)·x^2 + p(4)·x + p(5)
Polyfit & Polyval
• MATLAB provides built-in functionality for fitting data to polynomial equations.
– the polyfit function performs the fit and returns the coefficients
– the polyval function evaluates the polynomial using the polyfit coefficients

p = polyfit(x,y,n)
where
x – vector of x data
y – vector of y data
n – order of the polynomial to use (1 = linear, 2 = quadratic, 3 = cubic, …)
p – coefficients of the polynomial

y = polyval(p,x)
where
x – vector of x data
p – coefficients of the polynomial
y – vector of fitted y data
Polynomial Interpolation in Matlab
polyfit - to find the polynomial (the coefficients)
x = [-1.0, -0.7, -0.2, 0.6, 1.0];
y = [-2.1, 0.3, 1.7, 1.9, 2.1];
% For 5 points, we need a 4th degree polynomial
p = polyfit(x, y, 4);
Polynomial Curve Fitting
x = [-1.0, -0.7, -0.2, 0.6, 1.0];
y = [-2.1, 0.3, 1.7, 1.9, 2.1];
% For 5 points, we need a 4th degree polynomial
p = polyfit(x, y, 4)
interp_at_one_fourth = polyval(p, 0.25)
% Evaluate at a range of values for a smooth plot
xx = -1.0:0.1:1.0;
yy = polyval(p, xx);
plot(x,y,'or',xx,yy,'b-',0.25,interp_at_one_fourth,'sk');
p = polyfit(x, y, 4);
(Plot: the data points with the 4th-degree polynomial fit, x from −1 to 1, y from −2.5 to 2.5.)
p = polyfit(x, y, 3);
(Plot: the same data with a 3rd-degree polynomial fit.)
p = polyfit(x, y, 2);
(Plot: the same data with a 2nd-degree polynomial fit.)
p = polyfit(x, y, 1);
(Plot: the same data with a linear fit.)
Choosing the Right Polynomial
• The degree of the correct approximating function depends on the type of data being analyzed.
• When a certain behavior is expected, such as the linear response of beam deflection under increasing loads, we know what type of function to use and simply have to solve for its coefficients.
• When we don’t know what sort of response to expect, ensure your data sample size is large enough to clearly distinguish which degree is the best fit.
Basic Curve‐Fitting Tools
Method 2
Curve fitting tools can be accessed directly from the figure window: select ‘Basic Fitting’ from the ‘Tools’ pulldown menu.
Tools → Basic Fitting
This is a quick and easy method to calculate and visualize a variety of higher-order functions, including interpolation.
• Given x=[0:5]; y=[0,1,60,40,41,47];
Basic Curve‐Fitting Tools
clear;clc
x=[0:5];
y=[0,1,60,40,41,47];
plot(x,y,'rh')
axis([0 10 0 100]);
Solution
(Screenshots: the successive steps of the Basic Fitting dialog.)
Higher Order Curve Fitting
Caution:
• Higher-order polynomial fits should only be used when a large number of data points are available.
• Higher-order polynomial fitting functions may fit the data more accurately but may not yield an interpretable model.
• Higher-order polynomials can introduce unwanted wiggles.
Higher Order Curve Fitting
Method 3
(Screenshots: higher-order curve fitting with the Curve Fitting Tool.)
Curve Fitting Toolbox
1. Loading Data Sets:
Before you can import data into the Curve Fitting Tool, the data variables must exist in the MATLAB workspace.
You can import data into the Curve Fitting Tool with the Data GUI. You open this GUI by clicking the Data button on the Curve Fitting Tool.
The Data Sets pane allows you to:
• Import predictor (X) data, response (Y) data, and weights. If you do not import weights, they are assumed to be 1 for all data points.
• Specify the name of the data set.
• Preview the data.
• Click the Create data set button to complete the data import process.
(Screenshot: the Data GUI.)
Curve Fitting Toolbox
2. Smoothing Data Points:
If your data is noisy, you might need to apply a smoothing algorithm to expose its features and to provide a reasonable starting approach for parametric fitting. Smoothing assumes that:
1. The relationship between the response data and the predictor data is smooth.
2. The smoothing process results in a smoothed value that is a better estimate of the original value because the noise has been reduced.
Curve Fitting Toolbox
2. Smoothing Data Points:
The Curve Fitting Toolbox supports these smoothing methods:
Moving Average Filtering: Lowpass filter that takes the average
of neighboring data points.
Lowess and Loess: Locally weighted scatter plot smooth. These
methods use linear least squares fitting, and a first-degree
polynomial (lowess) or a second-degree polynomial (loess).
Robust lowess and loess methods that are resistant to outliers are
also available.
Savitzky-Golay Filtering: A generalized moving average where
you derive the filter coefficients by performing an unweighted
linear least squares fit using a polynomial of the specified degree.
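As an illustration of the first of these methods, a centered moving average can be sketched in Python/NumPy; the shrinking window at the endpoints is one common convention, not necessarily the toolbox's exact rule.

```python
import numpy as np

def moving_average(y, span=5):
    # Centered moving average; endpoints use a shrinking window
    # (one common convention, not necessarily the toolbox's exact rule).
    half = span // 2
    out = np.empty(len(y))
    for i in range(len(y)):
        lo, hi = max(0, i - half), min(len(y), i + half + 1)
        out[i] = np.mean(y[lo:hi])
    return out

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
truth = np.sin(2 * np.pi * x)
noisy = truth + 0.1 * rng.standard_normal(50)
smooth = moving_average(noisy, span=5)   # noise is reduced, features preserved
```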
Curve Fitting Toolbox
Smoothing Method and Parameters
(Screenshot: the smoothing method and parameter settings.)
Curve Fitting Toolbox
Excluding Data Points:
The Curve Fitting Toolbox provides two methods to exclude data:
• Marking Outliers: Outliers are defined as individual data points that you exclude because they are inconsistent with the statistical nature of the bulk of the data.
• Sectioning: Sectioning excludes a window of response or predictor data. For example, if many data points in a data set are corrupted by large systematic errors, you might want to section them out of the fit.
• For each of these methods, you must create an exclusion rule, which captures the range, domain, or index of the data points to be excluded.
Curve Fitting Toolbox
Plotting Fitting Curves:
The Fit Editor allows you to:
• Specify the fit name, the current data set, and the exclusion rule.
• Explore various fits to the current data set using a library or custom equation, a smoothing spline, or an interpolant.
• Override the default fit options, such as the coefficient starting values.
• Compare fit results, including the fitted coefficients and goodness-of-fit statistics.
Curve Fitting Toolbox
Plotting Fitting Curves:
The Table of Fits allows you to:
• Keep track of all the fits and their data sets for the current session.
• Display a summary of the fit results.
• Save or delete the fit results.
Curve Fitting Toolbox
Analyzing Fits:
• You can evaluate (interpolate or extrapolate), differentiate, or integrate a fit over a specified data range with the Analysis GUI. You open this GUI by clicking the Analysis button on the Curve Fitting Tool.
(Screenshot: the Analysis GUI.)
Curve Fitting Toolbox
Saving Your Work:
• You can save one or more fits and the associated fit results as variables to the MATLAB workspace.
• You can then use this saved information for documentation purposes, or to extend your data exploration and analysis.
• In addition to saving your work to MATLAB workspace variables, you can:
1. Save the session
2. Generate an M-file
Plot of Linear Fit
(Plot: a linear fit that uses every data point.)
Bad Data
(Plot annotations:)
• What about this point? Is it really a “good” data point?
• What do you know about the data?
• Is it monotonic?
A linear fit ignoring one data point
(Plot annotations:)
• Now the slope is greater and seems to follow the data points pretty well.
• The bad data point is ignored.
• But what if we "convicted" the wrong data point?
A linear fit ignoring one data point
(Plot annotation:) Ignoring a different data point allows us to approximate the data pretty well with a second-degree polynomial.
Common Fitting Functions
• The linear function y = mx + b
– its slope is m and its intercept is b.
• The power function y = b·x^m
– x raised to some power m
– m does not need to be an integer or positive
– no other terms; simpler than a polynomial
• The exponential function y = b·(10)^(mx)
– also written as y = b·e^(mx)
– a base (10 or e) raised to some power mx

Plots tell which model to use: each function gives a straight line when plotted using a specific set of axes.
Steps for Function Discovery
1- Examine the data near the origin. The linear function can pass through the origin only if b = 0.
Examples of power functions
clc;clear
x=linspace(0,4);
b=1;
m=[-1,-.5,0,.5,1,2];
y1=b*x.^(m(1));
y2=b*x.^(m(2));
y3=b*x.^(m(3));
y4=b*x.^(m(4));
y5=b*x.^(m(5));
y6=b*x.^(m(6));
plot(x,y1,'k.-',x,y2,'b.-',x,y3,'g.-',x,y4,'r.-',x,y5,'m.-',x,y6,'c.-')
axis([0 4 0 4])
gtext('m=2');gtext('m=1');gtext('m=0.5');gtext('m=0');
gtext('m=-0.5');gtext('m=-1');
(Figure 1: power curves y = b·x^m for m = −1, −0.5, 0, 0.5, 1, 2.)
Examples of exponential functions
clc;clear
x=linspace(0,2);
b=1;
m=[-2,-1,0,1,2];
y1=b*exp(m(1).*x);
y2=b*exp(m(2).*x);
y3=b*exp(m(3).*x);
y4=b*exp(m(4).*x);
y5=b*exp(m(5).*x);
plot(x,y1,'r.-',x,y2,'b.-',x,y3,'g.-',x,y4,'m.-',x,y5,'c.-')
axis([0 2 0 4])
gtext('m=2');gtext('m=1');gtext('m=0');gtext('m=-1');gtext('m=-2');
(Figure 2: exponential curves y = b·e^(mx) for m = −2, −1, 0, 1, 2.)
Steps for Function Discovery
2- Plot the data using rectilinear scales. If it forms a straight line, then it can be represented by the linear function and you are finished. Otherwise, if you have data at x = 0, then:
a. If y(0) = 0, try the power function.
b. If y(0) ≠ 0, try the exponential function.
If data is not given for x = 0, proceed to step 3.
Steps for Function Discovery
3- If you suspect a power function, plot the data using log-log scales. Only a power function will form a straight line on a log-log plot. If you suspect an exponential function, plot the data using semilog scales. Only an exponential function will form a straight line on a semilog plot.
Using Polyfit for Linear Curve Fit
• p = polyfit(x,y,1)
• p is the vector of coefficients, [p1 p2].
• p1 = m, p2 = b
Using Polyfit for Power Curve Fit
• In this case, taking the logarithm of y = b·x^m gives log(y) = m·log(x) + log(b), which is linear in log(x).
• Thus we can find the power function that fits the data by typing:
p = polyfit(log(x),log(y),1)
• p is the vector of coefficients, [p1 p2].
• p1 = m, p2 = log(b), so b = exp(p2)
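A sketch of this recovery in Python/NumPy with made-up parameters; `np.polyfit` plays the role of MATLAB's polyfit, and log is the natural logarithm in both.

```python
import numpy as np

b_true, m_true = 2.0, -1.5                 # illustrative power-law parameters
x = np.array([2.5, 3.0, 3.5, 4.0, 4.5, 5.0])
y = b_true * x ** m_true                   # exact data from y = b*x^m

p = np.polyfit(np.log(x), np.log(y), 1)    # p[0] = m, p[1] = log(b)
m, b = p[0], np.exp(p[1])                  # recover the power-law parameters
```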
Using Polyfit for Exponential Curve Fit
• In this case, taking the logarithm of y = b·e^(mx) gives log(y) = m·x + log(b), which is linear in x.
• Thus we can find the exponential function that fits the data by typing:
p = polyfit(x,log(y),1)
• p is the vector of coefficients, [p1 p2].
• p1 = m, p2 = log(b), so b = exp(p2)
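The exponential case works the same way; a Python/NumPy sketch with made-up parameters:

```python
import numpy as np

b_true, m_true = 500.0, -0.8               # illustrative exponential parameters
x = np.linspace(2.5, 9.0, 10)
y = b_true * np.exp(m_true * x)            # exact data from y = b*e^(m*x)

p = np.polyfit(x, np.log(y), 1)            # p[0] = m, p[1] = log(b)
m, b = p[0], np.exp(p[1])                  # recover the exponential parameters
```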
Example
Plot and determine which curve fits best
Determine the best fitting function (linear, exponential, or power) to describe the data, and plot the function on the same graph.

x: 2.5  3    3.5  4    4.5  5   5.5  6   7  8  9
y: 821  498  302  183  111  67  41   25  9  3  1
Program
clc; close; clear
x=[2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6, 7, 8, 9];
y=[821, 498, 302, 183, 111, 67, 41, 25, 9, 3, 1];
plot(x,y,'kh'); xlabel('x'), ylabel('y'); hold on
p1=polyfit(x,y,1);
p2=polyfit(log(x),log(y),1);
p3=polyfit(x,log(y),1);
x1=1:.1:10;
y1=polyval(p1,x1);
y2=polyval(p2,log(x1));
y3=polyval(p3,x1);
plot(x1,y1,'g.:',x1,exp(y2),'r.:',x1,exp(y3),'b.:')
axis([1 10 -200 1200]);
legend('Experimental Data','Line Fit.','Power Fit.','Exponential Fit.');
Result
(Plot: the experimental data with the line, power, and exponential fits; x from 1 to 10, y from −200 to 1200.)
Assessing Goodness of Fit
• The tough part of polynomial regression is knowing whether the "fit" is a good one.
• Determining the quality of the fit requires experience, a sense of balance, and some statistical summaries.
• One common goodness-of-fit measure is based on the least-squares residuals: it describes the distance of the entire set of data points from the fitted curve, and the fit minimizes the sum of the squares of all residual errors.
• The coefficient of determination (also referred to as the R² value) for the fit indicates the percent of the variation in the data that is explained by the model.
Assessing Goodness of Fit
• This coefficient can be computed via the commands:
ypred = polyval(coeff,x);  % predictions
dev = y - mean(y);         % deviations - measure of spread
SST = sum(dev.^2);         % total variation to be accounted for by the fit
resid = y - ypred;         % residuals - measure of mismatch
SSE = sum(resid.^2);       % variation NOT accounted for by the fit
normr = sqrt(SSE)          % the 2-norm of the vector of residuals for the fit
Rsq = 1 - SSE/SST;         % R^2 (fraction of the variation explained)
• The closer that Rsq is to 1, the more completely the fitted model "explains" the data.
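The same quantities can be sketched in Python/NumPy using the example data from earlier in these notes; the 2nd-order degree here is just for illustration.

```python
import numpy as np

x = np.array([2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6, 7, 8, 9])
y = np.array([821.0, 498, 302, 183, 111, 67, 41, 25, 9, 3, 1])

coeff = np.polyfit(x, y, 2)           # fit a quadratic (illustrative degree)
ypred = np.polyval(coeff, x)          # predictions
SST = np.sum((y - np.mean(y)) ** 2)   # total variation to be accounted for
SSE = np.sum((y - ypred) ** 2)        # variation NOT accounted for by the fit
normr = np.sqrt(SSE)                  # 2-norm of the residual vector
Rsq = 1 - SSE / SST                   # coefficient of determination
```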
Residuals
• It can be helpful to plot the model error samples directly: the residuals.
• This involves making another vector of model data that aligns with the original data and subtracting:
y1 = polyval(p1, x);  % evaluate the model at the original x vector
r1 = y - y1;          % vector of model errors
plot(x, r1);
Residual in Line Fit
x=[2.5 3 3.5 4 4.5 5 5.5 6 7 8 9];
y=[821,498,302,183,111,67,41,25,9,3,1];
p=polyfit(x,y,1);
x1=2:.1:10;
y1=polyval(p,x1);
y11=polyval(p,x);
r=y-y11;
subplot(2,1,1)
plot(x,y,'hr',x1,y1,'g-');
subplot(2,1,2)
plot(x,r,'ob')
axis([2 10 -800 800]);
Residual in Power Fit
x=[2.5 3 3.5 4 4.5 5 5.5 6 7 8 9];
y=[821,498,302,183,111,67,41,25,9,3,1];
p=polyfit(log(x),log(y),1);
x1=2:.1:10;
y1=polyval(p,log(x1));
y11=polyval(p,log(x));
r=y-exp(y11);
subplot(2,1,1)
plot(x,y,'hr',x1,exp(y1),'g-');
subplot(2,1,2)
plot(x,r,'ob')
axis([2 10 -800 800]);
Residual in Exponential Fit
x=[2.5 3 3.5 4 4.5 5 5.5 6 7 8 9];
y=[821,498,302,183,111,67,41,25,9,3,1];
p=polyfit(x,log(y),1);
x1=2:.1:10;
y1=polyval(p, x1);
y11=polyval(p, x);
r=y-exp(y11);
subplot(2,1,1)
plot(x,y,'hr',x1,exp(y1),'g-');
subplot(2,1,2)
plot(x,r,'ob')
axis([2 10 -800 800]);
Exercises
• Calculate the R² error and the norm of the residual error for a 2nd-order polynomial fit for the data in the previous example.
Solution
x=[0,.5,1,1.5,2,2.5,3,3.5,4];
y=[100,62,38,21,13,7,4,2,3];
p=polyfit(x,y,2);
x1=0:.1:4;
y1 = polyval(p,x1);
dev = y-mean(y);
SST = sum(dev.^2)
y11= polyval(p,x);
resid = y - y11;
SSE = sum(resid.^2)
normr = sqrt(SSE)   % residual norm
Rsq = 1 - SSE/SST   % R^2 error
subplot(2,1,1)
plot(x,y,'ro',x1,y1,'b-')
subplot(2,1,2)
plot(x,resid,'hr')

Output:
normr = 12.1376
Rsq = 0.9837
Linear Modeling with Non‐polynomial Terms
Fit the data in x and y with the following equation:
y = a1·x^0 + a2·e^(−x) + a3·x·e^(−x)
Linear Modeling with Non‐polynomial Terms
continued…
a = B\y
• Use the left division ( \ ) operator to tell MATLAB to solve the system of equations, where B is the matrix of basis-function values.
xx = [0:0.1:2.5]';
(Plot: the fitted curve over 0 ≤ x ≤ 2.5.) Try It!
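The left-division step can be sketched in Python/NumPy, where `np.linalg.lstsq` plays the role of MATLAB's `B\y`. The coefficient values and the construction of B below are illustrative assumptions, since the slide does not show how B was built.

```python
import numpy as np

# Illustrative data generated from the model itself (a_true is an assumption)
x = np.linspace(0.0, 2.5, 26)                  # like xx = [0:0.1:2.5]'
a_true = np.array([1.0, 2.0, -3.0])
y = a_true[0] + a_true[1] * np.exp(-x) + a_true[2] * x * np.exp(-x)

# Basis matrix B: one column per term, as [x.^0, exp(-x), x.*exp(-x)] in MATLAB
B = np.column_stack([x ** 0, np.exp(-x), x * np.exp(-x)])

# MATLAB's a = B\y corresponds to a least squares solve
a, residuals, rank, sv = np.linalg.lstsq(B, y, rcond=None)
```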