Predictive Analytics - Business Predictions Using Multiple Linear Regression

Predictive analytics uses statistics and modeling to predict future outcomes. Multiple linear regression analyzes relationships between dependent and independent variables to predict values. This document outlines assumptions and steps for conducting multiple linear regression to predict student achievement from factors like interest, anxiety, goals, gender identity. Key outputs include goodness of fit tests, predictive equations, and interpretations of coefficients.

Uploaded by

Sakshi Garg


PREDICTIVE ANALYTICS - BUSINESS PREDICTIONS USING MULTIPLE LINEAR REGRESSION
What is Predictive Analytics?
• The term predictive analytics refers to the use of statistics and
modeling techniques to make predictions about future outcomes
and performance.
• Predictive analytics looks at current and historical data patterns to
determine if those patterns are likely to emerge again.
• This allows businesses and investors to adjust where they use their
resources to take advantage of possible future events.
• Predictive modeling is often used to clean and optimize the quality
of data used for such forecasts.
• Modeling allows more data to be ingested by the system,
including data from customer-facing operations, which supports a
more accurate forecast.
What is Multiple Linear Regression?
• A multiple linear regression analysis is carried out to predict the values of a
dependent variable, Y, given a set of k predictor variables (X1, X2, …, Xk).
• Multiple linear regression is used to estimate the relationship between two or
more independent variables and one dependent variable.
• We also use it when we want to determine which variables are better predictors
than others (variable selection).
• The objective of multiple regression analysis is to use the independent variables
whose values are known to predict the value of the single dependent variable.
• For example, if you're doing a multiple regression to try to predict blood
pressure (the dependent variable) from independent variables such as height,
weight, age, and hours of exercise per week, you'd also want to include sex as
one of your independent variables.
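
In equation form, the model described above is usually written as follows. The symbols $\beta_0, \dots, \beta_k$ (intercept and slopes) and $\varepsilon$ (error term) are standard textbook notation and are not taken from the slides:

```latex
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon
```

Each $\beta_j$ is the expected change in $Y$ for a one-unit change in $X_j$ when the other predictors are held constant.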
ASSUMPTION 1

The dependent variable should be measured on a continuous scale (i.e., it is either an
interval or ratio variable).

An example of a variable that meets this criterion is Achieve (measured from 0 to 100).
ASSUMPTION 2

You have two or more independent variables, which can be either continuous (i.e., an interval
or ratio variable) or categorical (i.e., an ordinal or nominal variable).

Examples of variables that meet this criterion include Perfgoal, Mastery, Interest and
Anxiety (measured from 0 to 100) and Genderid (a categorical variable coded 0 or 1).
ASSUMPTION 3

You should have independence of observations (i.e., independence of residuals), which you can
easily check using the Durbin-Watson statistic, which is a simple test to run using SPSS Statistics.
ASSUMPTION 4

There needs to be a linear relationship between (a) the dependent variable and each of your
independent variables, and (b) the dependent variable and the independent variables collectively.
Whilst there are a number of ways to check for these linear relationships, we suggest creating
scatterplots and partial regression plots using SPSS Statistics, and then visually inspecting
these scatterplots and partial regression plots to check for linearity.
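
The Durbin-Watson statistic named under Assumption 3 can be sketched outside SPSS as below. This is a minimal illustration, not SPSS's implementation; the function name `durbin_watson` and the residual values are hypothetical, and the residuals are assumed to be listed in case order.

```python
def durbin_watson(residuals):
    """Return d = sum((e_t - e_{t-1})^2) / sum(e_t^2) for residuals in case order."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    squared_diffs = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    sum_squares = sum(e * e for e in residuals)
    if sum_squares == 0:
        raise ValueError("all residuals are zero")
    return squared_diffs / sum_squares


# Hypothetical residuals, for illustration only.
alternating = [1.0, -1.0, 1.0, -1.0]  # sign flips every case
constant = [1.0, 1.0, 1.0, 1.0]       # no change between cases
print(durbin_watson(alternating))  # 3.0
print(durbin_watson(constant))     # 0.0
```

The statistic ranges from 0 to 4. Values near 2 suggest little first-order autocorrelation in the residuals, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.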
ASSUMPTION 5

There should be no significant outliers, high leverage points or highly influential
points. Outliers, leverage and influential points are different terms used to
represent observations in your data set that are in some way unusual when you
wish to perform a multiple regression analysis.
ASSUMPTION 6

One of the assumptions of linear regression is that the residuals are
normally distributed.

Here we have a histogram of the standardized residuals. (Note: the unstandardized and
standardized residuals have the same mean (of zero) and shape (in terms of skewness
and kurtosis); they differ in terms of their standard deviations.) Here, the residuals
exhibit only a minor departure from normality.
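
The note above, that rescaling residuals changes their standard deviation but not their skewness or kurtosis, can be checked with a small sketch. The residual values and the scale factor are hypothetical, and the formulas use simple population moments, which differ slightly from the bias-corrected versions that statistical packages typically report.

```python
def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical raw residuals (they sum to zero) and a rescaled copy.
raw = [-3.0, -1.0, 0.0, 0.5, 1.5, 2.0]
scaled = [r / 1.7 for r in raw]  # dividing by a constant mimics standardizing

print(skewness(raw), skewness(scaled))
print(excess_kurtosis(raw), excess_kurtosis(scaled))
```

Both pairs agree up to floating-point rounding, which is why a histogram of standardized residuals has the same shape as one of unstandardized residuals.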
SCENARIO
We will be carrying out a multiple regression with student (a) interest, (b)
anxiety, (c) mastery goals, (d) performance goals and (e) gender identification as
predictors (IVs) of student achievement. Interest, anxiety, mastery goals,
performance goals, and achievement are all assumed to be continuous variables.

Gender identification is a binary IV (which is permissible in OLS regression)
that has been dummy coded (0 = identified male, 1 = identified female).
Dummy coding is a form of coding of binary variables that facilitates
interpretation of the intercept when included in a regression model.
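
A minimal sketch of why dummy coding helps with the intercept, assuming a simplified model with the dummy as the only predictor: the least-squares intercept is then the mean of the group coded 0, and the slope is the difference between the two group means. The function names and the scores are hypothetical.

```python
def dummy_code(labels, reference):
    """Code the reference category as 0 and every other label as 1."""
    return [0 if label == reference else 1 for label in labels]


def fit_single_dummy(y, dummy):
    """OLS fit of y on one 0/1 predictor, expressed through the group means."""
    group0 = [v for v, d in zip(y, dummy) if d == 0]
    group1 = [v for v, d in zip(y, dummy) if d == 1]
    intercept = sum(group0) / len(group0)          # mean of the reference group
    slope = sum(group1) / len(group1) - intercept  # difference in group means
    return intercept, slope


# Hypothetical achievement scores, for illustration only.
labels = ["male", "female", "female", "male"]
scores = [60.0, 70.0, 80.0, 50.0]
genderid = dummy_code(labels, reference="male")
print(genderid)                            # [0, 1, 1, 0]
print(fit_single_dummy(scores, genderid))  # (55.0, 20.0)
```

With further predictors in the model, the intercept is instead the predicted value for the reference group when every other predictor equals zero.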
STEPS TO PERFORM MULTIPLE LINEAR REGRESSION

STEP 1:
To open your Excel file in SPSS:
1. File, Open, Data, from the SPSS menu.
2. Select the type of file you want to open: Excel (*.xls, *.xlsx, *.xlsm).
3. Select file name.
4. Click 'Read variable names' if the first row of the
spreadsheet contains column headings.
5. Click Open.
STEP 2:
Analyze > Regression > Linear > Move Achieve to Dependent, all
other variables (Perfgoal, Mastery, Interest, Anxiety, Genderid)
to Independent(s).
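
For readers without SPSS, the fit requested in this step can be sketched in plain Python. This is an illustrative least-squares solver using the normal equations, not what SPSS runs internally; the function name `fit_ols` and the tiny data set are hypothetical.

```python
def fit_ols(X, y):
    """Return [b0, b1, ..., bk] minimising squared error, via the normal equations."""
    rows = [[1.0] + [float(v) for v in row] for row in X]  # prepend intercept column
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(p):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][p] for i in range(p)]


# Hypothetical data generated exactly from y = 2 + 3*x1 - 1*x2.
X = [[1, 2], [2, 1], [3, 5], [4, 3], [0, 1]]
y = [3, 7, 6, 11, 1]
print(fit_ols(X, y))  # approximately [2.0, 3.0, -1.0]
```

In the scenario here, each row of `X` would hold one student's Perfgoal, Mastery, Interest, Anxiety and Genderid values, and `y` would hold the Achieve scores.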
STEP 3:
Now we will fill out the sub-dialogs as shown below
In Statistics, we will check the following-
• Estimates
• Confidence Interval Level (95%)
• Model Fit
• Descriptives
• Part and Partial Correlations
• Collinearity Diagnostics
• Under Residuals: Casewise diagnostics, with Outliers outside 3 standard deviations
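
The casewise diagnostics option flags cases whose standardized residual lies more than 3 standard deviations from zero. A rough sketch of that check follows, assuming the standardized residual is the raw residual divided by the standard error of the estimate; the function name and residual values are hypothetical.

```python
import math


def casewise_outliers(residuals, n_predictors, threshold=3.0):
    """Return (case index, standardized residual) for cases beyond the threshold."""
    dof = len(residuals) - n_predictors - 1  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("not enough cases for this many predictors")
    std_error = math.sqrt(sum(e * e for e in residuals) / dof)
    return [
        (i, e / std_error)
        for i, e in enumerate(residuals)
        if abs(e / std_error) > threshold
    ]


# Hypothetical residuals: twenty small ones and one large one.
residuals = [1.0, -1.0] * 10 + [10.0]
print(casewise_outliers(residuals, n_predictors=5))  # flags only case index 20
```

A flagged case is a prompt to inspect the observation (for example for a data-entry error), not an automatic reason to delete it.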
STEP 4:
In Plots > *SRESID in Y > *ZPRED in X > under Standardized Residual Plots, check
Histogram and Normal probability plot > check Produce all partial plots.
STEP 5:
In Save Option
INTERPRETATION OF THE DATA
• Descriptive analysis is the type of analysis that helps describe,
show or summarize data points in a constructive way such that patterns
might emerge that fulfill every condition of the data.

• It gives you a summary of the distribution of your data, helps you
detect typos and outliers, and enables you to identify similarities
among variables, thus making you ready for conducting further
statistical analyses.
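
As a small illustration of the descriptive summary described above, the sketch below computes the count, mean, standard deviation, minimum and maximum for one variable using Python's standard `statistics` module. The helper name `describe` and the scores are hypothetical.

```python
import statistics


def describe(values):
    """Basic descriptive statistics for one variable."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "max": max(values),
    }


# Hypothetical Achieve scores, for illustration only.
achieve = [50, 60, 70, 80, 90]
print(describe(achieve))
```

A minimum or maximum outside the valid range of a variable (for example an Achieve score above 100) is a quick sign of a typo in the data.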
INTERPRETATION OF THE DATA

• The first table of interest is the Model Summary table. This table provides
the R, R², adjusted R², and the standard error of the estimate, which can be
used to determine how well a regression model fits the data.
• The "R" column represents the value of R, the multiple correlation
coefficient. R can be considered one measure of the quality of the prediction
of the dependent variable.
• In this case, R has a value of 0.642, which indicates a good level of
prediction.
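
The relationship between R, R² and adjusted R² in the Model Summary table can be sketched as follows. R = 0.642 and the five predictors come from the slides; the sample size `n = 140` is an assumption made only for this illustration.

```python
def r_squared(r):
    """Proportion of variance explained, from the multiple correlation R."""
    return r * r


def adjusted_r_squared(r2, n, k):
    """Adjust R^2 for sample size n and number of predictors k."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


r2 = r_squared(0.642)
adj = adjusted_r_squared(r2, n=140, k=5)  # n = 140 is a hypothetical sample size
print(round(r2, 3))  # 0.412
print(adj)
```

Adjusted R² is never larger than R², because it penalizes the model for each predictor it uses.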
INTERPRETATION OF THE DATA
• The F-ratio in the ANOVA table tests whether the overall regression
model is a good fit for the data. The table shows that the independent variables
statistically significantly predict the dependent variable, F(5, 134) = 18.770, p
< .0005 (i.e., the regression model is a good fit of the data).
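
The overall F-ratio is tied to R² by a standard identity, sketched below. The inputs are assumptions for illustration: R² = .412 as reported in the Conclusion, 5 predictors as in the scenario, and 134 residual degrees of freedom.

```python
def f_ratio(r2, k, df_residual):
    """Overall F statistic: (R^2 / k) / ((1 - R^2) / df_residual)."""
    return (r2 / k) / ((1 - r2) / df_residual)


# Assumed inputs: R^2 = 0.412, k = 5 predictors, 134 residual degrees of freedom.
print(round(f_ratio(0.412, k=5, df_residual=134), 2))  # 18.78
```

The result is close to the reported 18.770; the small gap is what rounding R² to three decimals would produce.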
INTERPRETATION OF THE DATA
• The general form of the equation to predict Achieve from
Perfgoal, Mastery, Interest, Anxiety and Genderid is
predicted Achieve = 2.357 - (0.010 x Perfgoal) - (0.325 x
Mastery) - (0.198 x Interest) - (0.023 x Anxiety) - (0.235 x
Genderid)

• This is obtained from the Coefficients table. Unstandardized
coefficients indicate how much the dependent variable varies with
an independent variable when all other independent variables are
held constant.
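
The predictive equation can be applied to a new case with a short sketch like the one below. The intercept and coefficients, including their signs, are copied as printed on the slide and have not been checked against the original Coefficients table; the function name and the input values are hypothetical.

```python
# Coefficients as printed on the slide (signs not independently verified).
INTERCEPT = 2.357
COEFFICIENTS = {
    "Perfgoal": -0.010,
    "Mastery": -0.325,
    "Interest": -0.198,
    "Anxiety": -0.023,
    "Genderid": -0.235,
}


def predict_achieve(case):
    """Predicted Achieve = intercept + sum of coefficient * predictor value."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * case[name] for name in COEFFICIENTS
    )


# Hypothetical case, for illustration only.
student = {"Perfgoal": 10, "Mastery": 0, "Interest": 0, "Anxiety": 0, "Genderid": 0}
print(predict_achieve(student))  # about 2.257
```

Changing one predictor while leaving the rest fixed changes the prediction by exactly that predictor's coefficient per unit, which is the "held constant" interpretation given above.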
CONCLUSION
• A multiple regression was run to predict Achieve from Perfgoal,
Mastery, Interest, Anxiety and Genderid.

• These variables statistically significantly predicted Achieve,
F(5, 134) = 18.770, p < .0005 (i.e., the regression model is a
good fit of the data).

• R² = .412. Whether each individual variable adds statistically significantly to
the prediction (p < .05) is read from the Coefficients table.
