Predictive Analytics - Business Predictions Using Multiple Linear Regression
BUSINESS PREDICTIONS
USING MULTIPLE LINEAR
REGRESSION
What is Predictive Analytics?
• The term predictive analytics refers to the use of statistics and
modeling techniques to make predictions about future outcomes
and performance.
• Predictive analytics looks at current and historical data patterns to
determine if those patterns are likely to emerge again.
• This allows businesses and investors to adjust where they use their
resources to take advantage of possible future events.
• Predictive modeling is often used to clean and optimize the quality
of data used for such forecasts.
• Modeling ensures that more data can be ingested by the system,
including from customer-facing operations, to ensure a more
accurate forecast.
What is Multiple Linear Regression?
• A multiple linear regression analysis is carried out to predict the values of a
dependent variable, Y, given a set of k predictor variables (X1, X2, …, Xk).
• Multiple linear regression is used to estimate the relationship between two or
more independent variables and one dependent variable.
• We also use it when we want to determine which variables are better predictors
than others. (Variables Selection).
• The objective of multiple regression analysis is to use the independent variables
whose values are known to predict the value of the single dependent variable.
• For example, if you're doing a multiple regression to try to predict blood
pressure (the dependent variable) from independent variables such as height,
weight, age, and hours of exercise per week, you'd also want to include sex as
one of your independent variables.
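To make the idea of estimating one dependent variable from several predictors concrete, here is a minimal sketch of an ordinary least squares fit in plain Python. It is an illustration only, not the SPSS procedure used in these slides: the function name fit_ols and the toy data are assumptions, and the coefficients are obtained by solving the normal equations directly.

```python
def fit_ols(X, y):
    """Return [b0, b1, ..., bk] that minimize the sum of squared residuals.

    X is a list of rows (one row per observation, k predictor values each);
    y is the list of observed values of the dependent variable.
    """
    # Prepend a 1 to every row so the first coefficient is the intercept.
    rows = [[1.0] + [float(v) for v in row] for row in X]
    p = len(rows[0])

    # Build the normal equations (X'X) b = X'y as an augmented matrix.
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]

    # Solve by Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(p):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]

    return [A[i][p] for i in range(p)]


# Hypothetical toy data generated from y = 1 + 2*x1 + 3*x2 with no noise.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y = [9, 8, 19, 18, 26]
coefficients = fit_ols(X, y)  # approximately [1.0, 2.0, 3.0]
```

Because the toy data contain no noise, the fit recovers the generating coefficients up to rounding error. Real data will leave residuals, which is what the assumptions in the following slides are about.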
ASSUMPTION 1
You should have independence of observations (i.e., independence of
residuals), which you can easily check using the Durbin-Watson
statistic, which is a simple test to run using SPSS Statistics.
ASSUMPTION 5
There needs to be a linear relationship between (a) the dependent
variable and each of your independent variables, and (b) the dependent
variable and the independent variables collectively. Whilst there are a
number of ways to check for these linear relationships, we suggest
creating scatterplots and partial regression plots using SPSS
Statistics, and then visually inspecting these scatterplots and partial
regression plots to check for linearity.
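The Durbin-Watson statistic mentioned for the independence assumption can also be computed by hand from the residuals of a fitted model. The sketch below assumes the residuals are already available in observation order; the function name durbin_watson and the example residuals are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals that flip sign at every observation.
alternating = [1.0, -1.0, 1.0, -1.0]
dw_alternating = durbin_watson(alternating)  # 12 / 4 = 3.0

# Hypothetical residuals that never change.
constant = [0.5, 0.5, 0.5]
dw_constant = durbin_watson(constant)  # 0 / 0.75 = 0.0
```

The statistic ranges from 0 to 4. Values near 2 suggest no first-order autocorrelation in the residuals, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.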
STEPS TO PERFORM
MULTIPLE LINEAR
REGRESSION
STEP 1:
To open your Excel file in SPSS:
1. Choose File > Open > Data from the SPSS menu.
2. Select the type of file you want to open: Excel (*.xls,
*.xlsx, *.xlsm).
3. Select file name.
4. Click 'Read variable names' if the first row of the
spreadsheet contains column headings.
5. Click Open.
STEP 2:
Analyze > Regression > Linear > Move Achieve to Dependent, all
other variables (Perfgoal, Mastery, Interest, Anxiety, Genderid)
to Independent(s).
STEP 3:
Now we will fill out the sub-dialogs.
In Statistics, we will check the following:
• Estimates
• Confidence intervals, Level (95%)
• Model fit
• Descriptives
• Part and partial correlations
• Collinearity diagnostics
• Under Residuals:
Casewise diagnostics
Outliers outside 3 standard deviations
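The "outliers outside 3 standard deviations" setting flags cases whose standardized residual is larger than 3 in absolute value. The sketch below illustrates that idea under a stated assumption: each residual is divided by the residual standard error, sqrt(SSE / (n - k - 1)), where k is the number of predictors. The function name flag_outliers and the example residuals are hypothetical, and SPSS output may differ in detail.

```python
import math


def flag_outliers(residuals, n_predictors, threshold=3.0):
    """Return the indices of cases whose standardized residual exceeds
    the threshold in absolute value."""
    n = len(residuals)
    df = n - n_predictors - 1  # residual degrees of freedom
    if df <= 0:
        raise ValueError("not enough observations for this many predictors")
    sse = sum(e ** 2 for e in residuals)
    std_error = math.sqrt(sse / df)
    return [
        i for i, e in enumerate(residuals)
        if abs(e / std_error) > threshold
    ]


# Hypothetical residuals: twenty small ones and a single large one.
residuals = [0.1, -0.1] * 10 + [5.0]
flagged = flag_outliers(residuals, n_predictors=2)  # [20]
```

Flagged cases are candidates for inspection, not automatic deletion; a data entry error and a genuinely unusual observation call for different handling.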
STEP 4:
• In Plots > put SRESID in Y > put ZPRED in X > under Standardized
Residual Plots, check Histogram and Normal probability plot >
check Produce all partial plots.
STEP 5:
In Save Option
INTERPRETATION OF THE
DATA
• Descriptive Analysis is the type of
analysis of data that helps describe,
show or summarize data points in a
constructive way such that patterns
might emerge that fulfill every
condition of the data.
• The independent variables statistically significantly predict the
dependent variable, F(45, 134) = 18.770, p < .0005 (i.e., the
regression model is a good fit of the data).
INTERPRETATION OF THE DATA
• The general form of the equation to predict Achieve from
Perfgoal, Mastery, Interest, Anxiety and Genderid is
predicted Achieve = 2.357 – (0.010 x Perfgoal) – (0.325 x
Mastery) – (0.198 x Interest) – (0.023 x Anxiety) – (0.235 x
Genderid)
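Once the coefficients are known, a prediction is simply the intercept plus each coefficient multiplied by the matching predictor value. The sketch below applies the equation from this slide; the function name predict_achieve and the example case are hypothetical, and the negative sign on Anxiety is read from the slide's "-0.023 x Anxiety" term.

```python
# Intercept and coefficients copied from the prediction equation on the slide.
INTERCEPT = 2.357
COEFFICIENTS = {
    "Perfgoal": -0.010,
    "Mastery": -0.325,
    "Interest": -0.198,
    "Anxiety": -0.023,   # sign as read from the slide
    "Genderid": -0.235,
}


def predict_achieve(case):
    """Return predicted Achieve for one case, given as a dict that maps
    each predictor name to its value."""
    return INTERCEPT + sum(
        coefficient * case[name]
        for name, coefficient in COEFFICIENTS.items()
    )


# Hypothetical case in which every predictor equals 1.
example_case = {name: 1 for name in COEFFICIENTS}
prediction = predict_achieve(example_case)  # 2.357 - 0.791 = 1.566
```

With every predictor set to 0 the function returns the intercept, 2.357, which is how the constant term is interpreted: the predicted value of Achieve when all independent variables are zero.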