LM01 Basics of Multiple Regression and Underlying Assumptions 2024 Level II Notes

LM01 Basics of Multiple Regression and Underlying Assumptions

1. Introduction
2. Uses of Multiple Linear Regression
3. The Basics of Multiple Regression
4. Assumptions Underlying Multiple Linear Regression
Summary

This document should be read in conjunction with the corresponding learning module in the 2024
Level II CFA® Program curriculum. Some of the graphs, charts, tables, examples, and figures are
copyright 2023, CFA Institute. Reproduced and republished with permission from CFA Institute. All
rights reserved.

Required disclaimer: CFA Institute does not endorse, promote, or warrant the accuracy or quality of
the products or services offered by IFT. CFA Institute, CFA®, and Chartered Financial Analyst® are
trademarks owned by CFA Institute.

Version 1.0

© IFT. All rights reserved 1

1. Introduction
Multiple linear regression is used to model the linear relationship between one dependent
variable and two or more independent variables.
When constructing regression models, most of the heavy computational work is done by
statistical software such as Excel, Python, R, SAS, and Stata, which estimate the model
parameters and produce related statistics. An analyst’s primary role is to specify the model
correctly and to interpret the output from the software.
2. Uses of Multiple Linear Regression
In practice, multiple regression can be used:
• To explain the relationships between financial variables, e.g., the relationship
between inflation, GDP growth rates, and interest rates.
• To test existing theories, e.g., whether equity returns are affected by a stock’s market cap
and value/growth factors.
• To make forecasts, e.g., using variables such as financial leverage, profitability,
revenue growth, and changes in market share to predict whether a company will face
financial distress.
Exhibit 2 from the curriculum outlines a general process of regression analysis.
• We start by determining whether the dependent variable is continuous (e.g., returns) or
discrete (e.g., takeover target or not a takeover target). For continuous variables,
traditional regression models can be used. For discrete variables, a logistic regression
model is needed.
• We then estimate the regression model and analyze the residuals to see whether any key
underlying regression assumptions are violated. If violations occur, the model has to
be adjusted.
• Next, we examine the model’s ‘goodness of fit’ to check whether the overall fit is significant.
The model has to be adjusted until it meets the analyst’s criteria.
• Once a model has been deemed acceptable (i.e., the regression assumptions are
satisfied, the overall fit is significant, and the model is the best of the candidate
models), we can use it for analysis and forecasting.
Instructor’s Note: These steps are covered in detail later in this reading and in subsequent
readings.

3. The Basics of Multiple Regression

A multiple linear regression model has the general form:
Yi = b0 + b1X1i + b2X2i + ⋯ + bkXki + εi, i = 1, 2, …, n
where:
Yi = the ith observation of the dependent variable
X1i, …, Xki = the ith observations of the independent variables
b0 = the intercept term
b1, …, bk = slope coefficients for each of the independent variables
εi = the error term of the ith observation
n = the number of observations
A regression equation has one intercept coefficient and k slope coefficients (also called
partial regression coefficients), where k is equal to the number of independent variables.
The slope coefficient bj measures how much the dependent variable Y is expected to change
when the independent variable Xj changes by one unit, holding all other independent
variables constant. For example, consider the following regression equation:
Y = 0.2 + 0.6X1 + 0.5X2 + ε
If X1 changes by 1 unit and X2 remains constant, then Y is expected to change by 0.6 units.
Similarly, if X1 remains constant and X2 changes by 1 unit, then Y is expected to change by
0.5 units.
The intercept coefficient b0 represents the expected value of Y if all independent variables
are zero. In our example, if X1 and X2 are each zero, then the expected value of Y is 0.2.
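To make this interpretation concrete, the sketch below fits the example equation by ordinary least squares in plain Python. It is only an illustration under stated assumptions: least squares is the standard estimation method but is not spelled out in these notes, the six observations are made up and noise-free (generated directly from Y = 0.2 + 0.6X1 + 0.5X2), and the helper names solve_linear_system, fit_ols, and predict are hypothetical rather than part of any statistical package.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] so row operations update both sides together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution from the last row upwards.
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_ols(rows, y):
    """Return [b0, b1, ..., bk] by solving the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical, noise-free observations generated from Y = 0.2 + 0.6*X1 + 0.5*X2.
x_rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (5.0, 8.0), (6.0, 4.0)]
y_obs = [0.2 + 0.6 * x1 + 0.5 * x2 for x1, x2 in x_rows]

b0, b1, b2 = fit_ols(x_rows, y_obs)


def predict(x1, x2):
    """Fitted value of Y for given X1 and X2."""
    return b0 + b1 * x1 + b2 * x2


# Raise X1 by one unit while holding X2 fixed: the change equals the slope b1.
effect_x1 = predict(3.0, 2.0) - predict(2.0, 2.0)
print(round(b0, 4), round(b1, 4), round(b2, 4), round(effect_x1, 4))
```

Because the made-up data contain no error term, the fit should recover the intercept and slopes of the example equation up to floating-point error. With real data the estimates would differ from any "true" values, and the estimation would normally be left to statistical software.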
Example:
(This is the Knowledge Check example from Sec 3 in the curriculum.)
An institutional salesperson has just read the research report in which you estimated a
regression of monthly excess returns on a portfolio, RETRF, against the Fama–French three
factors:
• MKTRF, the market excess return;
• SMB, the difference in returns between small- and large-capitalization stocks; and
• HML, the difference in returns between value and growth stocks.
All returns are stated in whole percentages (that is, 1 for 1%), and the estimated regression
equation is
RETRF = 1.5324 + 0.5892MKTRF − 0.8719SMB − 0.0560HML.
Before this salesperson meets with her client firm, she asks you to do the following
regarding your estimated regression model:

1. Interpret the intercept.

Solution
If the market excess return, SMB, and HML are each zero, then we expect an excess return on
the portfolio of 1.5324%.

2. Interpret each slope coefficient.

Solution
Each slope coefficient is interpreted assuming the other variables are held constant.
• For MKTRF, if the market excess return increases by 1 percentage point, we expect the
portfolio’s excess return to increase by 0.5892 percentage points.
• For SMB, if the size factor return increases by 1 percentage point, we expect the
portfolio’s excess return to decrease by 0.8719 percentage points.
• For HML, if the value factor return increases by 1 percentage point, we expect the
portfolio’s excess return to decrease by 0.0560 percentage points.

3. Calculate the predicted value of the portfolio’s return if MKTRF = 1, SMB = 4, and HML = –2.

Solution
Given these values of the independent variables, the predicted excess return on the
portfolio is:
RETRF = 1.5324 + 0.5892(1) − 0.8719(4) − 0.0560(−2) = −1.2540, or about −1.25%.
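The same calculation can be checked with a few lines of Python. This is a minimal sketch that takes the intercept of 1.5324 and the three slopes exactly as printed in the estimated regression equation; the names coefficients, predict_excess_return, and scenario are hypothetical.

```python
# Estimated coefficients as printed in the regression equation (returns in percent).
coefficients = {"intercept": 1.5324, "MKTRF": 0.5892, "SMB": -0.8719, "HML": -0.0560}


def predict_excess_return(factors):
    """Intercept plus each slope multiplied by its factor value."""
    return coefficients["intercept"] + sum(
        coefficients[name] * value for name, value in factors.items()
    )


# Factor values given in part 3 of the example.
scenario = {"MKTRF": 1.0, "SMB": 4.0, "HML": -2.0}
predicted = predict_excess_return(scenario)
print(f"{predicted:.4f}")  # -1.2540
```

Setting all three factors to zero returns the intercept, which matches the interpretation given in part 1.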
4. Assumptions Underlying Multiple Linear Regression
The five main assumptions underlying multiple regression models are:
1. Linearity: The relationship between the dependent variable and the independent
variables is linear.
2. Homoskedasticity: The variance of the regression residuals is the same for all
observations.
3. Independence of errors: The observations are independent of one another. This
implies the regression residuals are uncorrelated across observations.
4. Normality: The regression residuals are normally distributed.
5. Independence of independent variables:
5a. Independent variables are not random.
5b. There is no exact linear relation between two or more of the independent
variables or combinations of the independent variables.
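Assumption 5b matters for a mechanical reason: if one independent variable is an exact linear function of another, the cross-product matrix X'X used in least-squares estimation is singular, so the coefficients cannot be determined uniquely. The sketch below illustrates this with made-up integer data; the helper names det3 and cross_product_matrix are hypothetical, and the use of least squares is an assumption about how the software estimates the model.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def cross_product_matrix(x1, x2):
    """Build X'X for a design matrix with columns [1, x1, x2]."""
    columns = [[1] * len(x1), x1, x2]
    return [[sum(p * q for p, q in zip(u, v)) for v in columns] for u in columns]


x1 = [1, 2, 3, 4]
x2_collinear = [2 * v for v in x1]  # exact linear relation: X2 = 2 * X1
x2_distinct = [2, 1, 4, 3]          # no exact linear relation with X1

det_collinear = det3(cross_product_matrix(x1, x2_collinear))
det_distinct = det3(cross_product_matrix(x1, x2_distinct))
print(det_collinear, det_distinct)
```

A zero determinant in the collinear case means X'X has no inverse; the non-zero determinant in the second case means a unique set of coefficients exists.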

Regression software produces diagnostic plots which can help detect if these assumptions
are violated. Commonly used diagnostic plots are discussed below.
Scatterplots of dependent and independent variables: A scatterplot matrix (also referred
to as a pairs plot) is useful for detecting non-linear relationships.
For example, consider a model that explains the excess return of ABC stock using market
excess return (MKTRF), size (SMB) and value (HML) as explanatory variables.
ABC_RETRFt = b0 + b1MKTRFt + b2SMBt + b3HMLt + εt
Using 10 years of monthly data (120 observations), the regression software produces the
following scatterplot matrix.
[Exhibit: scatterplot matrix of ABC_RETRF, MKTRF, SMB, and HML]
The bottom row shows the scatterplots of ABC_RETRF against each of the three independent
variables. We can draw the following conclusions:
• There is a positive relationship between ABC_RETRF and the market factor, MKTRF.
• There seems to be no apparent relationship between ABC_RETRF and the size factor, SMB.
• There is a negative relationship between ABC_RETRF and the value factor, HML.

Instructor’s Note: If you observe a non-linear relationship between Y and any independent
variable (e.g., a curve instead of a straight line), then the regression assumption of
‘linearity’ has been violated.
Next, we look at the scatterplots between pairs of independent variables. The relatively flat
line for the SMB-HML pair indicates that SMB and HML have little to no correlation. This is a
desirable characteristic between explanatory variables.
Instructor’s Note: If you observe a strong relationship between two independent variables,
this suggests that the regression assumption of ‘independence of independent variables’
may have been violated.
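The visual check on pairs of independent variables can be complemented with pairwise correlation coefficients. The sketch below is illustrative only: the factor returns are made up (they are not the data behind the exhibit), the 0.7 cut-off is an arbitrary assumption rather than a rule from these notes, and pearson is a hypothetical helper.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length series."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly factor returns in percent.
factors = {
    "MKTRF": [1.2, -0.5, 2.1, 0.3, -1.8, 0.9, 1.5, -0.7],
    "SMB": [0.4, 0.1, -0.6, 0.8, -0.2, 0.5, -0.9, 0.3],
    "HML": [-0.3, 0.6, 0.2, -0.8, 0.7, -0.1, 0.4, -0.5],
}
THRESHOLD = 0.7  # illustrative cut-off (an assumption, not a formal test)

# Correlation for every pair of independent variables.
pairwise = {
    pair: pearson(factors[pair[0]], factors[pair[1]])
    for pair in combinations(factors, 2)
}
for (a, b), rho in pairwise.items():
    note = "strong, investigate" if abs(rho) > THRESHOLD else "weak"
    print(f"{a}-{b}: {rho:+.2f} ({note})")
```

A correlation near zero corresponds to the flat line seen for the SMB-HML pair, while a value near +1 or −1 would point to a strong linear relationship between the two variables.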

Scatterplots of residuals: These plots are useful for detecting violations of homoskedasticity
and independence of errors. They can also help identify outliers in the data.
We first look at the scatterplot of residuals against the predicted values of the dependent
variable.
[Exhibit: scatterplot of regression residuals versus predicted values]
As indicated by the line centered near a residual value of 0.00, a visual inspection does not
reveal any directional relationship between the residuals and the predicted values from the
regression model. This suggests that the residuals behave in an independent manner and
that the regression’s errors have a constant variance and are uncorrelated with each other.
The square markers (months 7, 25, and 95) indicate potential outliers. These observations
can be used to look for shocks caused by factors not considered in the model that occurred
at these points in time.
Instructor’s Note: If you observe a systematic pattern (e.g., a trend, or a spread that widens
or narrows), then the regression assumptions of ‘homoskedasticity’ and ‘independence of
errors’ may have been violated.
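Two rough numerical companions to this plot are sketched below: the ratio of residual variance between the high and low halves of the predicted values (a value far from 1 hints at non-constant variance), and the lag-1 autocorrelation of the residuals in time order (a value far from 0 hints at correlated errors). These are informal checks chosen for illustration, not tests described in these notes; the data are made up, and the function names are hypothetical.

```python
from statistics import mean, pvariance


def variance_ratio(fitted, residuals):
    """Residual variance in the high-fitted half divided by the low-fitted half."""
    # Order residuals by their fitted value, then compare the two halves.
    ordered = [r for _, r in sorted(zip(fitted, residuals))]
    half = len(ordered) // 2
    return pvariance(ordered[-half:]) / pvariance(ordered[:half])


def lag1_autocorrelation(residuals):
    """Correlation between each residual and the one before it (time order)."""
    centre = mean(residuals)
    dev = [r - centre for r in residuals]
    numerator = sum(a * b for a, b in zip(dev[1:], dev[:-1]))
    return numerator / sum(d * d for d in dev)


# Hypothetical fitted values and residuals whose spread widens as the fit grows.
fitted = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
residuals = [0.1, -0.1, 0.2, -0.2, 0.8, -0.9, 1.1, -1.0]

ratio = variance_ratio(fitted, residuals)
rho1 = lag1_autocorrelation(residuals)
print(f"variance ratio: {ratio:.1f}, lag-1 autocorrelation: {rho1:.2f}")
```

In this made-up sample the ratio is well above 1 and the autocorrelation is strongly negative, which is the kind of pattern that would show up in the plot as a fan shape with alternating signs.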

Next, we look at the scatterplots of the regression residuals versus each of the three
independent variables.
[Exhibit: scatterplots of regression residuals versus MKTRF, SMB, and HML]
A visual inspection reveals no directional relationship between the residuals and the
explanatory variables, implying no violation of the multiple linear regression assumptions.

The same three potential outliers identified in the plot of residuals versus predicted values
are also apparent in these scatterplots, as indicated by the square markers. So, we can
conclude that these are indeed outliers.
Instructor’s Note: If you observe a significant relationship between the residuals and an
independent variable, then the model is misspecified.
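The visual identification of outliers described above can be mimicked by standardizing the residuals and flagging those that lie far from the mean. The sketch below is only one possible approach: the residuals are made up, the threshold of 2.5 standard deviations is an arbitrary assumption, and flag_outliers is a hypothetical helper.

```python
from statistics import mean, pstdev


def flag_outliers(residuals, threshold=2.5):
    """Return 1-based positions of residuals more than `threshold` std devs from the mean."""
    centre, spread = mean(residuals), pstdev(residuals)
    return [
        position
        for position, r in enumerate(residuals, start=1)
        if abs(r - centre) / spread > threshold
    ]


# Hypothetical monthly residuals; the sixth month contains a large shock.
residuals = [0.2, -0.1, 0.3, -0.4, 0.1, 5.0, -0.2, 0.0, 0.3, -0.3, 0.1, -0.2]
outlier_months = flag_outliers(residuals)
print(outlier_months)  # [6]
```

The flagged positions are the observations an analyst would then investigate for events not captured by the model.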
Normal Q-Q plot: A normal Q-Q plot is used to visualize the distribution of a variable by
comparing it to a normal distribution. If the variable is normally distributed, its points
should lie along the diagonal line. We can use this plot to check if the model’s residuals are
normally distributed.
A Q-Q plot for the residuals of our regression model is presented below:
[Exhibit: normal Q-Q plot of regression residuals]
Apart from the three outliers, all other observations are very close to the diagonal line.
Hence, we can conclude that the regression model error term is close to being normally
distributed.
Instructor’s Note: If the observations move away from the diagonal, then the regression
assumption of ‘normality’ is violated. Deviations from the diagonal beyond ±2 standard
deviations indicate that the distribution is ‘fat-tailed’, a commonly observed feature of
financial data.
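The coordinates behind a normal Q-Q plot are straightforward to compute: sort the standardized residuals and pair each with the quantile a standard normal distribution would give at the same rank. The sketch below assumes the plotting positions (i − 0.5)/n, which is one common convention among several, and uses made-up residuals; qq_points is a hypothetical helper.

```python
from statistics import NormalDist, mean, pstdev


def qq_points(residuals):
    """Pair each theoretical normal quantile with the matching standardized residual."""
    n = len(residuals)
    centre, spread = mean(residuals), pstdev(residuals)
    sample = sorted((r - centre) / spread for r in residuals)
    # Plotting positions (i - 0.5) / n keep the probabilities strictly inside (0, 1).
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sample))


# Hypothetical regression residuals.
residuals = [-1.9, -0.8, -0.5, -0.3, -0.1, 0.0, 0.2, 0.4, 0.6, 0.9, 1.5]
points = qq_points(residuals)

# Vertical distance of each point from the diagonal (sample minus theoretical).
largest_gap = max(abs(sample - theory) for theory, sample in points)
for theory, sample in points:
    print(f"{theory:+.2f}  {sample:+.2f}")
```

Points with a small gap lie close to the diagonal; large gaps in the tails are what the note above describes as fat-tailed behaviour.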

Summary
LO: Describe the types of investment problems addressed by multiple linear
regression and the regression process.
Multiple regression can be used:
• To explain the relationships between financial variables.
• To test existing theories.
• To make forecasts.
LO: Formulate a multiple linear regression model, describe the relation between the
dependent variable and several independent variables, and interpret estimated
regression coefficients.
A multiple linear regression model has the general form:
Yi = b0 + b1X1i + b2X2i + ⋯ + bkXki + εi, i = 1, 2, …, n
The slope coefficient bj measures how much the dependent variable Y changes when the
independent variable, Xj, changes by one unit holding all other independent variables
constant.
The intercept coefficient b0 represents the expected value of Y if all independent variables
are zero.
LO: Explain the assumptions underlying a multiple linear regression model and
interpret residual plots indicating potential violations of these assumptions.
The five main assumptions underlying multiple regression models are:
1. Linearity: The relationship between the dependent variable and the independent
variables is linear.
2. Homoskedasticity: The variance of the regression residuals is the same for all
observations.
3. Independence of errors: The observations are independent of one another. This
implies the regression residuals are uncorrelated across observations.
4. Normality: The regression residuals are normally distributed.
5. Independence of independent variables:
5a. Independent variables are not random.
5b. There is no exact linear relation between two or more of the independent
variables or combinations of the independent variables.
Scatterplots of dependent and independent variables are used to check if the assumptions of
‘linearity’ and ‘independence of independent variables’ have been violated.

Scatterplots of residuals are used to check if the assumptions of ‘homoskedasticity’ and
‘independence of errors’ have been violated.
A ‘Q-Q’ plot of residuals is used to check if the assumption of ‘normality’ has been violated.
