
MBAS901

Essential Elements for Business Analytics

Lecture: Estimation Using Regression


Lecture Outline

1.1 Predictive Analytics


1.2 Modeling: Linear Regression
1.3 Evaluation: Linear Regression
1.4 Going Beyond Linear Regression Models
1.5 Generalized Linear Models & Generalized Additive Models
Life Cycle of Business Analytics
Predictive Analytics in General
Predictive Analytics

Type – Explanation

• Causal modeling: attempts to help us understand what events or actions actually influence others.
  E.g.: "Did a particular drug help decrease blood pressure in patients?"

• Value estimation: attempts to estimate or predict, for each individual, the numerical value of some variable for that individual.
  E.g.: "How much will a given customer use the service?"

• Classification: attempts to predict, for each individual in a population, which of a (small) set of classes this individual belongs to.
  E.g.: "Among all the customers, which are likely to respond to a given offer?"

• Clustering: attempts to group individuals in a population together by their similarity, not driven by any specific purpose.
  E.g.: "Do our customers form natural groups or segments?"

• Co-occurrence grouping (also known as frequent item-set mining, association rule discovery, and market-basket analysis): attempts to find associations between entities based on transactions involving them.
  E.g.: "What items are purchased together?"

• Similarity matching: attempts to identify similar individuals based on data known about them; it can be used directly to find similar entities.
  E.g.: "Find people who are similar to you in terms of the products they have liked or purchased."

• Network link prediction: attempts to predict connections between data items, usually by suggesting that a link should exist, and possibly also estimating the strength of the link.
  E.g.: "Since you and Karen share 10 friends, maybe you'd like to be Karen's friend?"

• Change point detection: attempts to quickly detect changes in a time series that are significant, either so that action can be taken or as the result of an action taken.
  E.g.: "Has there been an increase in global temperature over a period of time that should alarm us about global warming?"

• Forecasting time series: involves taking models fit on historical data and using them to predict future observations.
  E.g.: "What will be the temperatures for the next few days in a city?"
Lecture Outline

1.1 Predictive Analytics


1.2 Modeling: Linear Regression
1.3 Evaluation: Linear Regression
1.4 Going Beyond Linear Regression Models
1.5 Generalized Linear Models & Generalized Additive Models
Regression Modeling

• Regression modeling: a very valuable tool for an analyst
  • Understand the relationship between variables
  • Predict the value of one dependent variable based on independent variables
• Simple linear regression models have only two variables
• Multiple regression models have more than one independent variable
• Can be adapted to handle non-linearity

Copyright ©2015 Pearson Education, Inc.
Basics of Regression Modeling

• The variable to be predicted is called the dependent variable or response variable
  • Its value depends on the value of the independent variable(s)
• Independent variables are also called explanatory or predictor variables

Dependent variable = Independent variable + Independent variable + …
Scatter Diagram

• A scatter diagram or scatter plot is often used to investigate the relationship between variables
• The independent variable is normally plotted on the X axis
• The dependent variable is normally plotted on the Y axis
Example: Triple A Construction

• Triple A Construction renovates old homes
• The dollar volume of renovation work is dependent on the area payroll

TABLE 4.1
Local | Sales ($100,000s) | Area Payroll ($100,000,000s)
1 | 6 | 3
2 | 8 | 4
3 | 9 | 6
4 | 5 | 4
5 | 4.5 | 2
6 | 9.5 | 5
Triple A Construction

FIGURE 4.1 – Scatter Diagram of the Triple A Construction data (Sales in $100,000s vs. Payroll in $100 millions)
Simple Linear Regression

Regression is used to model relationships between variables:

Y = β₀ + β₁X + ε

where
Y = dependent variable (response)
X = independent variable (predictor or explanatory)
β₀ = intercept (value of Y when X = 0)
β₁ = slope of the regression line
ε = random error
Triple A Construction

FIGURE 4.1 – Scatter Diagram with the regression line Y = β₀ + β₁X + ε drawn through the data (Sales vs. Payroll)
Simple Linear Regression

BUT the true values for the slope and intercept are not known
→ they are estimated using sample data
(This is a machine learning model.)

Ŷ = b₀ + b₁X

where
Ŷ = predicted value of Y
b₀ = estimate of β₀, based on sample results
b₁ = estimate of β₁, based on sample results
Example: Triple A Construction

Predict sales based on area payroll
Y = Sales
X = Area payroll

The line minimizes the errors:

Error = (Actual value) – (Predicted value)
e = Y − Ŷ

Regression analysis minimizes the sum of squared errors Σe² = Σ(Y − Ŷ)²
→ Least-squares regression
Example: Triple A Construction

Ŷ = b₀ + b₁X

Formulas for the simple linear regression intercept and slope:

X̄ = ΣX / n = average (mean) of X values
Ȳ = ΣY / n = average (mean) of Y values

b₁ = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)²

b₀ = Ȳ − b₁X̄
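As a concrete check of the formulas above, here is a minimal sketch in Python (nothing assumed beyond the standard library) that computes b₀ and b₁ for the Triple A Construction data:

```python
# Minimal sketch: least-squares estimates for the Triple A data,
# using the formulas above.
sales = [6, 8, 9, 5, 4.5, 9.5]    # Y, in $100,000s
payroll = [3, 4, 6, 4, 2, 5]      # X, in $100,000,000s

n = len(sales)
x_bar = sum(payroll) / n          # mean of X = 4
y_bar = sum(sales) / n            # mean of Y = 7

# b1 = sum of (X - X_bar)(Y - Y_bar) divided by sum of (X - X_bar)^2
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(payroll, sales)) \
     / sum((x - x_bar) ** 2 for x in payroll)
b0 = y_bar - b1 * x_bar

print(b0, b1)                     # 2.0 1.25  =>  Y_hat = 2 + 1.25 X
```

Prediction is then a single expression, e.g. `b0 + b1 * 6` for a $600 million payroll (the example worked out on the following slides).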
Example: Triple A Construction

TABLE 4.2 – Regression calculations

Y | X | (X − X̄)² | (X − X̄)(Y − Ȳ)
6 | 3 | (3 − 4)² = 1 | (3 − 4)(6 − 7) = 1
8 | 4 | (4 − 4)² = 0 | (4 − 4)(8 − 7) = 0
9 | 6 | (6 − 4)² = 4 | (6 − 4)(9 − 7) = 4
5 | 4 | (4 − 4)² = 0 | (4 − 4)(5 − 7) = 0
4.5 | 2 | (2 − 4)² = 4 | (2 − 4)(4.5 − 7) = 5
9.5 | 5 | (5 − 4)² = 1 | (5 − 4)(9.5 − 7) = 2.5
ΣY = 42 | ΣX = 24 | Σ(X − X̄)² = 10 | Σ(X − X̄)(Y − Ȳ) = 12.5

Ȳ = 42/6 = 7, X̄ = 24/6 = 4
Triple A Construction

Regression calculations:

X̄ = ΣX / 6 = 24/6 = 4
Ȳ = ΣY / 6 = 42/6 = 7

b₁ = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)² = 12.5 / 10 = 1.25

b₀ = Ȳ − b₁X̄ = 7 − (1.25)(4) = 2

→ Therefore Ŷ = 2 + 1.25X
Triple A Construction

Regression calculations (as above): Ŷ = 2 + 1.25X

sales = 2 + 1.25(payroll)

If the payroll next year is $600 million (X = 6)
=> What are the predicted sales?
Triple A Construction

sales = 2 + 1.25(payroll)

If the payroll next year is $600 million (X = 6):

Ŷ = 2 + 1.25(6) = 9.5

=> Predicted sales of $950,000 (Y is in $100,000s)
Lecture Outline

1.1 Predictive Analytics


1.2 Modeling: Linear Regression
1.3 Evaluation: Linear Regression
1.4 Going Beyond Linear Regression Models
1.5 Generalized Linear Models & Generalized Additive Models
Life Cycle of Business Analytics
Measuring the Fit of the Regression Model

• Regression models can be developed for any quantitative variables X and Y
• How helpful is the model in predicting Y?
• With the average error, positive and negative errors cancel each other out => the average error is always 0 => not a useful indicator!
Measuring the Fit of the Regression Model

TABLE 4.3 – Errors

Y | X | Ŷ | e = Y − Ŷ
6 | 3 | 2 + 1.25(3) = 5.75 | 0.25
8 | 4 | 2 + 1.25(4) = 7.00 | 1
9 | 6 | 2 + 1.25(6) = 9.50 | −0.5
5 | 4 | 2 + 1.25(4) = 7.00 | −2
4.5 | 2 | 2 + 1.25(2) = 4.50 | 0
9.5 | 5 | 2 + 1.25(5) = 8.25 | 1.25

Σe = 0
Measuring the Fit of the Regression Model

• How helpful is the model in predicting Y? Three measures of variability:
  • SST – Total variability about the mean (data)
  • SSE – Variability about the regression line (error): we want this as low as possible
  • SSR – Total variability that is explained by the model
Measuring the Fit of the Regression Model

• Sum of squares total: SST = Σ(Y − Ȳ)²
• Sum of squares error: SSE = Σe² = Σ(Y − Ŷ)²
• Sum of squares regression: SSR = Σ(Ŷ − Ȳ)²
• An important relationship: SST = SSR + SSE
Measuring the Fit of the Regression Model

TABLE 4.4 – Sum of Squares for Triple A Construction

Y | X | (Y − Ȳ)² | Ŷ | (Y − Ŷ)² | (Ŷ − Ȳ)²
6 | 3 | (6 − 7)² = 1 | 2 + 1.25(3) = 5.75 | 0.0625 | 1.5625
8 | 4 | (8 − 7)² = 1 | 2 + 1.25(4) = 7.00 | 1 | 0
9 | 6 | (9 − 7)² = 4 | 2 + 1.25(6) = 9.50 | 0.25 | 6.25
5 | 4 | (5 − 7)² = 4 | 2 + 1.25(4) = 7.00 | 4 | 0
4.5 | 2 | (4.5 − 7)² = 6.25 | 2 + 1.25(2) = 4.50 | 0 | 6.25
9.5 | 5 | (9.5 − 7)² = 6.25 | 2 + 1.25(5) = 8.25 | 1.5625 | 1.5625

Ȳ = 7 | SST = 22.5 | SSE = 6.875 | SSR = 15.625
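The same decomposition can be computed directly. A minimal sketch continuing the earlier Python example (reusing `sales`, `payroll`, `b0`, `b1`, and `y_bar`):

```python
# Sketch: variability decomposition for Triple A Construction.
y_hat = [b0 + b1 * x for x in payroll]                    # predictions
sst = sum((y - y_bar) ** 2 for y in sales)                # total:      22.5
sse = sum((y - yh) ** 2 for y, yh in zip(sales, y_hat))   # error:      6.875
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)              # regression: 15.625
print(sst, sse, ssr, sst - (ssr + sse))                   # SST = SSR + SSE
```

These are the values used for r² on the slides that follow (r² = SSR/SST ≈ 0.6944).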
Measuring the Fit of the Regression Model

• Sum of squares total: SST = Σ(Y − Ȳ)²
• Sum of squares error: SSE = Σe² = Σ(Y − Ŷ)²
• Sum of squares regression: SSR = Σ(Ŷ − Ȳ)²
• An important relationship: SST = SSR + SSE

For Triple A Construction:
SST = 22.5, SSE = 6.875, SSR = 15.625

Hard to interpret => need something else!
Coefficient of Determination

The coefficient of determination r²
= the proportion of the variability in Y explained by the regression

r² = SSR / SST = 1 − SSE / SST

For Triple A Construction: r² = 15.625 / 22.5 = 0.6944
Coefficient of Determination (cont.)

About 69% of the variability of the revenue (Y) is explained by the equation based on payroll (X).
The closer r² is to 1, the better.
Correlation Coefficient

• An expression of the strength of the linear relationship
• Always between +1 and –1
• The correlation coefficient satisfies:

|r| = √(r²)

For Triple A Construction: |r| = √0.6944 = 0.8333
(r takes the sign of the slope b₁; here b₁ > 0, so r = +0.8333)
Four Values of the Correlation Coefficient

FIGURE 4.3 – (a) perfect positive correlation: r = +1; (b) positive correlation: 0 < r < 1; (c) no correlation: r = 0; (d) perfect negative correlation: r = –1
Other Important Indicators

For the overall model: F-value (Pr > F) – Significance F
• The probability that the naïve model (prediction = the data average) is better than this one
• If it is less than
  • 0.10: the model is likely better
  • 0.05: very likely better
  • 0.01: extremely likely better

For the parameters: P-values
• The probability that the parameter equals 0
• If it is less than
  • 0.10: the parameter is likely nonzero
  • 0.05: very likely nonzero
  • 0.01: extremely likely nonzero
Recap

A good model has:

• High r²
• Small Significance F value (Pr > F)
  => model statistically significant
• Small P-values for the parameters
  => parameters statistically significant
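In practice these indicators are read off a fitted model's summary rather than computed by hand. Here is a sketch using the statsmodels package (an assumption; any OLS routine reports the same quantities), on the Triple A data:

```python
# Sketch: reading r-squared, Significance F, and p-values from an OLS fit.
import statsmodels.api as sm

payroll = [3, 4, 6, 4, 2, 5]
sales = [6, 8, 9, 5, 4.5, 9.5]

X = sm.add_constant(payroll)       # add the intercept column
fit = sm.OLS(sales, X).fit()

print(fit.rsquared)                # r-squared, ~0.6944
print(fit.f_pvalue)                # Significance F for the overall model
print(fit.pvalues)                 # p-values for intercept and slope
```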
Assumptions of the Regression Model
• Regression models require the following assumptions:

• Errors are independent


• Errors are normally distributed
• Errors have a mean of zero
• Errors have a constant variance

Residuals

The difference between the observed value of the dependent variable (y) and the predicted value (ŷ) is called the residual (e).
Each data point has one residual.

Residual = Observed value − Predicted value
e = y − ŷ

Both the sum and the mean of the residuals are equal to zero. That is, Σe = 0 and ē = 0.

→ A plot of the residuals often highlights violations of assumptions
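A residual plot is straightforward to produce. A sketch using matplotlib (an assumption), continuing the earlier Triple A variables; the patterns to look for are shown on the next slides:

```python
# Sketch: plot residuals against X to check the regression assumptions.
import matplotlib.pyplot as plt

residuals = [y - (b0 + b1 * x) for x, y in zip(payroll, sales)]

plt.scatter(payroll, residuals)
plt.axhline(0, linestyle="--")     # residuals should scatter randomly about 0
plt.xlabel("Payroll ($100 million)")
plt.ylabel("Residual")
plt.show()
```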
Residual Plots: if everything is OK

FIGURE 4.4A – Pattern of errors indicating randomness (residuals scattered randomly around zero)
Residual Plots: outlier

[Figure: residuals random around zero except for one point far from the rest]
Residual Plots: nonconstant error variance

[Figure: the spread of the residuals increases as X increases]
Residual Plots: trend in the residuals => nonlinear relationship

[Figure: residuals follow a curved pattern rather than random scatter]
Limitations of Linear Regression

Only for linear effects!
Lecture Outline

1.1 Predictive Analytics


1.2 Modeling: Linear Regression
1.3 Evaluation: Linear Regression
1.4 Going Beyond Linear Regression Models
1.5 Generalized Linear Models & Generalized Additive Models
Beyond Simple Linear Regression

What if:
• there is MORE THAN ONE independent variable?
• the relationship between variables is NOT linear?
• the data is NOT normally distributed?
Multiple Linear Regression

• An extension of the simple linear model
• Models with more than one independent variable

Y = β₀ + β₁X₁ + β₂X₂ + … + βₖXₖ + ε

where
Y = dependent variable (response variable)
Xᵢ = ith independent variable (predictor or explanatory variable)
β₀ = intercept (value of Y when all Xᵢ = 0)
βᵢ = coefficient of the ith independent variable
k = number of independent variables
ε = random error
More than 2 variables: finding a (hyper)plane

[Figure: a plane fitted through points in (X1, X2, Y) space]
Example: 2 input variables

Develop a model to determine the suggested listing price for houses based on the size and age of the house

Ŷ = b₀ + b₁X₁ + b₂X₂

where
Ŷ = predicted value of the dependent variable (selling price)
b₀ = Y intercept
X₁, X₂ = values of the two independent variables (square footage and age)
b₁, b₂ = slopes for X₁ and X₂

→ Select a sample of houses that have sold recently and record the data


Example: 2 input variables

This is the model:

Selling Price ($) = 150 + 16.6 SquareFootage – 0.8 Age
Example: 3 input variables

Develop a model to determine the suggested listing price for houses based on the size, age, and condition of the house

Ŷ = b₀ + b₁X₁ + b₂X₂ + b₃X₃

where
Ŷ = predicted value of the dependent variable (selling price)
b₀ = Y intercept
X₁, X₂, X₃ = values of the three independent variables (square footage, age, condition)
b₁, b₂, b₃ = slopes for X₁, X₂ and X₃

→ Select a sample of houses that have sold recently and record the data

Example: 3 input variables

This is the model:

Selling Price ($) = 150 + 16.6 SquareFootage – 0.8 Age + 15.6 Excellent + 35 Mint + 5 Good
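Note that condition is categorical, so in the fitted model it enters as 0/1 indicator (dummy) variables such as Excellent and Mint. A sketch using pandas and scikit-learn (both assumptions), on a few rows of the house data shown later in this lecture:

```python
# Sketch: multiple regression with a categorical predictor via dummy variables.
import pandas as pd
from sklearn.linear_model import LinearRegression

houses = pd.DataFrame({
    "sqft":      [1926, 2069, 1720, 1396, 1706, 1847, 1950, 2323],
    "age":       [30, 40, 30, 15, 32, 38, 27, 30],
    "condition": ["Good", "Excellent", "Excellent", "Good",
                  "Mint", "Mint", "Mint", "Excellent"],
    "price":     [95000, 119000, 124800, 135000,
                  142000, 145000, 159000, 165000],
})

# get_dummies turns 'condition' into one indicator column per level
X = pd.get_dummies(houses[["sqft", "age", "condition"]], columns=["condition"])
model = LinearRegression().fit(X, houses["price"])

print(dict(zip(X.columns, model.coef_)), model.intercept_)
```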
Evaluating Multiple Regression Models

• Similar to simple linear regression models
• r², Significance F and P-values are interpreted the same way


Evaluating Multiple Regression Models

• The best model is a statistically significant model with a high r² and few variables

• As more variables are added to the model, the r² value increases
  • In such cases, a high r² value may NOT mean that the model is good!

• For this reason, the adjusted r² value is often used to determine the usefulness of an additional variable
  • The adjusted r² takes into account the number of independent variables in the model:
    adjusted r² = 1 − (1 − r²)(n − 1) / (n − k − 1), where n = sample size and k = number of independent variables


Using the right predictors

SELLING PRICE ($) | SQUARE FOOTAGE | AGE | CONDITION | SELLING PRICE (AUD) | TODAY'S WEATHER
95,000 | 1,926 | 30 | Good | 135,000 | Good
119,000 | 2,069 | 40 | Excellent | 169,300 | Bad
124,800 | 1,720 | 30 | Excellent | 176,000 | Good
135,000 | 1,396 | 15 | Good | 192,000 | Meh
142,000 | 1,706 | 32 | Mint | 202,000 | Good
145,000 | 1,847 | 38 | Mint | 206,317 | Bad
159,000 | 1,950 | 27 | Mint | 226,237 | Bad
165,000 | 2,323 | 30 | Excellent | 234,775 | Ok
182,000 | 2,285 | 26 | Mint | 259,000 | Good
183,000 | 3,752 | 35 | Good | 260,500 | Bad
200,000 | 2,300 | 18 | Good | 284,500 | Bad
211,000 | 2,525 | 17 | Good | 300,000 | Good
215,000 | 3,800 | 40 | Excellent | 305,900 | Good
219,000 | 1,740 | 12 | Mint | 311,500 | Good

TODAY'S WEATHER: NOT RELEVANT
Using the right predictors

(Same table with TODAY'S WEATHER removed; SELLING PRICE (AUD) is still among the candidate predictors.)
Using the right input

Selling Price ($) = 150 + 0.7 Selling Price (AUD) + 16.6 SF – 0.8 Age + 15.6 Excellent + 35 Mint

• R² = 0.89
• Significance F < 0.00001
• All parameters significant

Is the model useful?

Using the right input

Selling Price ($) = 150 + 0.7 Selling Price (AUD) + 16.6 SF – 0.8 Age + 15.6 Excellent + 35 Mint

• R² = 0.89
• Significance F < 0.00001
• All parameters significant

Cannot get that information in production!
You cannot try to predict Y using something directly derived from Y (this is known as target leakage)!

Is the model making sense?

Selling Price ($) = 150 + 16.6 SF + 0.8 Age − 15.6 Excellent − 35 Mint

• R² = 0.89
• Significance F < 0.00001
• All parameters significant

Is the model useful?
(The fit statistics look good, but the coefficient signs contradict common sense: price increases with age and decreases for better condition.)


Variable selection methods

If you have thousands of variables, which ones should you use?

1. Make sure you select only the relevant ones first!
2. Use a variable selection method
Variable selection methods

Forward selection (see the sketch after this list):
1. Start with no variables in the model
2. Test the addition of each variable using a chosen model fit criterion
3. Add the variable (if any) whose inclusion gives the most statistically significant improvement of the fit
4. Go to 2 until no remaining variable improves the model to a statistically significant extent

Backward selection:
1. Start with all candidate variables
2. Test the deletion of each variable using a chosen model fit criterion
3. Delete the variable (if any) whose loss gives the most statistically insignificant deterioration of the model fit
4. Go to 2 until no further variables can be deleted without a statistically significant loss of fit
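A minimal forward-selection sketch using p-values as the fit criterion (statsmodels and pandas assumed; `df` holds the candidate predictors and `y` the response):

```python
# Sketch: forward selection with OLS p-values as the model fit criterion.
import statsmodels.api as sm

def forward_select(df, y, alpha=0.05):
    selected = []
    remaining = list(df.columns)
    while remaining:
        # step 2: test the addition of each remaining variable
        pvals = {}
        for var in remaining:
            X = sm.add_constant(df[selected + [var]])
            pvals[var] = sm.OLS(y, X).fit().pvalues[var]
        best = min(pvals, key=pvals.get)
        if pvals[best] >= alpha:       # step 4: no significant improvement
            break
        selected.append(best)          # step 3: add the best variable
        remaining.remove(best)
    return selected
```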
Variable selection methods

• Stepwise selection: a combination of forward and backward selection, testing at each step for variables to be included or excluded

• Fast backward: available for logistic regression models, this technique uses a numeric shortcut to compute the next selection iteration more quickly than backward selection

• Lasso: this method adds and removes candidate effects based on a version of ordinary least squares in which the sum of the absolute regression coefficients is constrained (see the sketch below)

• Adaptive Lasso: available for linear regressions, this is a modification of Lasso where selection weights are applied to each of the parameters used to create the Lasso constraint
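A sketch of Lasso-based selection with scikit-learn (an assumption): coefficients that the constraint shrinks exactly to zero drop out of the model.

```python
# Sketch: Lasso as variable selection (df = numeric predictors, y = response).
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

X_std = StandardScaler().fit_transform(df)   # Lasso is sensitive to scale
lasso = Lasso(alpha=0.1).fit(X_std, y)       # alpha sets the constraint strength

kept = [c for c, w in zip(df.columns, lasso.coef_) if w != 0]
print(kept)                                  # variables that survive selection
```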
Example: Colonel Motors

Use regression analysis to improve fuel efficiency
→ Study the impact of weight on miles per gallon (MPG)

TABLE 4.6
MPG | WEIGHT (1,000 LBS.)
12 | 4.58
13 | 4.66
15 | 4.02
18 | 2.53
19 | 3.09
19 | 3.11
20 | 3.18
23 | 2.68
24 | 2.65
33 | 1.70
36 | 1.95
42 | 1.92


Example: Colonel Motors

FIGURE 4.6A – Linear model for the MPG data (MPG vs. Weight in 1,000 lb.): does this line fit well?


Transformation

FIGURE 4.6B – Nonlinear model for the MPG data (MPG vs. Weight in 1,000 lb.)


Nonlinear Regression

In some situations, variables are not linear.
Transformations may be used to turn a nonlinear model into a linear model.

[Figure: a curved (nonlinear) scatter transformed into a linear relationship]


Transformation: Quadratic

The nonlinear model is a quadratic model
→ The easiest approach: develop a new variable

X₂ = (weight)²

→ New model:

Ŷ = b₀ + b₁X₁ + b₂X₂
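A sketch of this quadratic transformation on the Colonel Motors data (NumPy and scikit-learn assumed): create weight² as a new column, then fit an ordinary linear regression.

```python
# Sketch: quadratic transformation: add weight^2, then fit a linear model.
import numpy as np
from sklearn.linear_model import LinearRegression

weight = np.array([4.58, 4.66, 4.02, 2.53, 3.09, 3.11,
                   3.18, 2.68, 2.65, 1.70, 1.95, 1.92])
mpg = np.array([12, 13, 15, 18, 19, 19, 20, 23, 24, 33, 36, 42])

X = np.column_stack([weight, weight ** 2])   # X1 = weight, X2 = weight^2
model = LinearRegression().fit(X, mpg)

print(model.intercept_, model.coef_)         # b0, then (b1, b2)
```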


Transformation: log
• Y => Y’ = log(Y)
• and/or X => X’ = log(X)

Note: only for variables > 0


Transformation: log

[Figure: applying log(Y) turns a right-skewed distribution into an approximately Normal distribution]

Transformation: log

A log transformation can:
• make a distribution approximately Normal
• turn an exponential relationship into a linear one
• reduce the variance within each group

(Source: https://basicmedicalkey.com/transformations)
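A small sketch of the log transformation (NumPy assumed; the data below is hypothetical, roughly exponential in x): the fit is linear on the log scale and predictions are back-transformed with exp.

```python
# Sketch: log-transform a strictly positive response before a linear fit.
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([1.2, 3.5, 9.8, 28.0, 80.5])   # strictly positive, right-skewed

b1, b0 = np.polyfit(x, np.log(y), 1)        # linear fit on the log scale
pred = np.exp(b0 + b1 * x)                  # back-transform to original scale
```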
Lecture Outline

1.1 Predictive Analytics


1.2 Modeling: Linear Regression
1.3 Evaluation: Linear Regression
1.4 Going Beyond Linear Regression Models
1.5 Generalized Linear Models & Generalized Additive Models (GLM and GAM)
Generalized Linear Model (GLM)

• Generalized linear models extend the theory and methods of linear models
to data that is not normally distributed. A link function is used to take into
account the distribution of the response variable.
• There is only one response variable (Response).
• Model effects or explanatory variables can be any of the following effects:
• continuous (Continuous effects)
• categorical (Classification effects)
• interaction terms (Interaction effects)
Generalized Linear Model (GLM)

We can write a general equation for a linear model as follows:

y = W₀ + W₁*X₁ + W₂*X₂ + W₃*X₃ + … + Wₙ*Xₙ + e

Generalizing this model yields GLMs for modelling data originating from the exponential family of probability distributions, such as the Normal, Binomial, and Poisson, among others.

There are 3 components of a GLM:
1. Random component: defines the response variable y and its probability distribution. One important assumption is that the responses y₁ to yₙ are independent of each other.
2. Systematic component: defines which explanatory variables we want to include in our model. It also allows interactions among explanatory variables, such as X₁*X₂, X₁², etc. This is the part that we model. It is also called the linear predictor: a combination of the covariates X₁, X₂, …, Xₙ and the coefficients W₁, W₂, …, Wₙ.
3. Link component: connects the random and systematic components. It is a function of the expected value of the response variable, E(Y), which enables linearity in the parameters and allows E(Y) to be non-linearly related to the explanatory variables. It is the link function that generalizes the linear model.
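A sketch of a GLM fit with statsmodels (an assumption): a Poisson random component with its canonical log link, so E(Y) = exp(W₀ + W₁*x).

```python
# Sketch: Poisson GLM; random component: Poisson; link: log (the default).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(0, 2, 200)
y = rng.poisson(np.exp(0.5 + 1.2 * x))     # simulated counts, not Normal

X = sm.add_constant(x)                     # systematic component: W0 + W1*x
glm = sm.GLM(y, X, family=sm.families.Poisson()).fit()

print(glm.params)                          # roughly [0.5, 1.2]
```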
Generalized Linear Model (GLM)
Generalized Linear Model Results in SAS Viya

[Screenshot: summary bar, fit summary, residual plot, and assessment panels]
Generalized Additive Model (GAM)

• Generalized additive models are an extension of the generalized linear model. They relax the linearity assumption and allow spline terms so that nonlinear dependency structures can be captured.
• There is only one response variable (Response).
• Model effects or explanatory variables can be any of the following effects:
  • spline (Spline effects)
  • continuous (Continuous effects)
  • categorical (Classification effects)
  • interaction terms (Interaction effects)
Generalized Additive Model (GAM)

A GAM is a linear model with a key difference: a GAM is allowed to learn non-linear relationships (see the sketch below).
• GAMs relax the restriction that the relationship must be a simple weighted sum (the basic linear regression equation, a linear combination of the variables).
• Instead, they assume that the outcome can be modelled by a sum of arbitrary functions of each feature:
  Y = β₀ + f₁(X₁) + f₂(X₂) + … + fₖ(Xₖ) + ε
• To do this, we replace each linear term from linear regression with a flexible function called a spline.
• Splines are functions that allow us to model non-linear relationships for each feature.
• The sum of many splines forms a GAM.
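A sketch using the pyGAM package (an assumption; statsmodels also offers GAMs): one spline term per feature, summed into an additive model, on simulated data.

```python
# Sketch: a GAM with a spline per feature; s(i) attaches a spline to column i.
import numpy as np
from pygam import LinearGAM, s

rng = np.random.default_rng(0)
X = rng.uniform(0, 4, size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * (X[:, 1] ** 2) + rng.normal(0, 0.2, size=200)

gam = LinearGAM(s(0) + s(1)).fit(X, y)     # sum of splines = the additive model
gam.summary()                              # per-term significance and fit stats
```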
Generalized Additive Model (GAM)
Generalized Additive Model

• Like generalized linear models, generalized additive models enable you to specify a link function and a distribution.
• Spline terms can be either one- or two-dimensional.
• Spline terms are constructed by the thin-plate regression spline technique.
• Advantages of generalized additive models include pattern discovery and potentially better predictive capability. (An ordinary general linear model might overlook such patterns.)
• A disadvantage is the loss of interpretability of a spline effect as a predictor.

Copyright © SAS Institute Inc. All rights reserved.
Using Splines in a Generalized Additive Model

Splines can be thought of as piecewise polynomial regressions that are smoothly connected at interior knots.
Linear Regression vs. GLM vs. GAM

[Figure: models placed along a model-complexity axis, ranging from not complex enough to too complex]
Video Tutorials on SAS Viya
Regression Models in SAS Viya
