
Chapter 7: Looking for Groups of Explanatory Variables through Multiple Regression: Predicting Important Factors in First-Grade Reading

7.4.5 Application activity: Multiple Regression


2.
ANALYZE > REGRESSION > LINEAR. Put the Grade 1 reading score into the "Dependent" box and the five explanatory variables (non-verbal reasoning, working memory, naming speed, L2 phonemic awareness and kindergarten L2 reading) into the "Independent" box. Leave the Method as "Enter".
Open the STATISTICS button and tick "confidence intervals", "casewise diagnostics", "descriptives", "part and partial correlations" and "collinearity diagnostics".
Open the PLOTS button and put SRESID in the "Y" axis box and ZPRED in the "X" axis box and tick "Normal probability plot".
Click the SAVE button and check Mahalanobis and Cook's under the "Distances" box.
Press OK and run the regression.
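If you would rather work from a syntax window than through the menus, the choices above paste out roughly as the command below. This is only a sketch: apart from KINDERL2READING, the variable names (GRADE1READING for the dependent variable and NONVERBAL, WORKMEM, NAMING and PHONAWARE for the other predictors) are placeholders, so substitute the names actually used in the data file.

* Sketch of the menu choices above; all names except KINDERL2READING are placeholders.
REGRESSION
  /DESCRIPTIVES MEAN STDDEV CORR SIG N
  /STATISTICS COEFF OUTS CI(95) R ANOVA COLLIN TOL ZPP
  /DEPENDENT GRADE1READING
  /METHOD=ENTER NONVERBAL WORKMEM NAMING PHONAWARE KINDERL2READING
  /SCATTERPLOT=(*SRESID, *ZPRED)
  /RESIDUALS NORMPROB(ZRESID)
  /CASEWISE PLOT(ZRESID) OUTLIERS(3)
  /SAVE MAHAL COOK.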

Looking at the relations between the dependent variable and the explanatory variables


The correlation between kindergarten L2 reading and the dependent variable is high (r = .807), so it is the strongest single predictor. The correlation between two of the explanatory variables is also high (r = .712), which could point to multicollinearity.

The model in the "Model Summary" box that includes all 5 explanatory variables has an R2 = .672, which is quite high. Of the individual terms of the regression equation, the Coefficients output box shows that only kindergarten L2 reading (KINDERL2READING) is statistical (t = 5.21, p < .0005).
The standardized coefficients are β = -.025 for non-verbal reasoning, β = .12 for working memory, β = -.106 for naming speed, β = -.083
for L2 phonemic awareness, and β = .769 for KINDERL2READING. You will find these coefficients in the “Coefficients” output box.
At the end of the Coefficients output you will find the VIF column. Here no values are over 5, so there does not appear to be a problem with multicollinearity.
The Residuals Statistics output box does not indicate problems with outliers (standardized residuals, Cook's distance or Mahalanobis distance), but the residuals vs. predicted values plot could indicate some heteroscedasticity (values on the right side of the plot are more constrained than values on the left). The P–P plot does show deviation from the straight line, indicating that the residuals may not be normally distributed.

3.
ANALYZE > REGRESSION > LINEAR. Put the Time 2 grammar score into the "Dependent" box.
Put the Time 1 grammar score into the "Independent" box and change the Method to "Stepwise". Press the NEXT button to indicate that you will enter that variable in the first step.
Now put intelligence test scores (INTELLIG) into the "Independent" box and press NEXT. The third step should enter L2CONTA, the fourth ANWR_1, and the last ENWR_1.
Open the STATISTICS button and tick “confidence intervals”, “casewise diagnostics”, “R squared change”, “descriptives” and “collinearity
diagnostics”. Open the PLOTS button and put SRESID in the “Y” axis box and ZPRED in the “X” axis box and tick “Normal probability
plot”. Click the SAVE button and check Mahalanobis and Cook’s under the “Distances” box. Press OK when back to the LINEAR
REGRESSION dialog box and run the regression.
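Pasting these choices gives syntax along the following lines; each block from the dialog becomes its own METHOD subcommand, and the CHANGE keyword is what produces the "R squared change" column. The names GRAM_2 and GRAM_1 for the Time 2 and Time 1 grammar scores are placeholders (the other four names appear in the instructions above).

* Sequential entry, one stepwise block per variable; GRAM_2 and GRAM_1 are placeholder names.
REGRESSION
  /DESCRIPTIVES MEAN STDDEV CORR SIG N
  /STATISTICS COEFF OUTS CI(95) R ANOVA CHANGE COLLIN TOL
  /DEPENDENT GRAM_2
  /METHOD=STEPWISE GRAM_1
  /METHOD=STEPWISE INTELLIG
  /METHOD=STEPWISE L2CONTA
  /METHOD=STEPWISE ANWR_1
  /METHOD=STEPWISE ENWR_1
  /SCATTERPLOT=(*SRESID, *ZPRED)
  /RESIDUALS NORMPROB(ZRESID)
  /CASEWISE PLOT(ZRESID) OUTLIERS(3)
  /SAVE MAHAL COOK.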

In looking at your output, first look at the box labeled “Variables Entered/Removed” and make sure everything was done in steps the way
you wanted. Next look at the “Model Summary” box.
The overall R2 for the model with all 5 variables entered was R2 = .688, adjusted R2 = .672. This explains quite a lot of what is going on! I
will give a table with the results for the change in R2 (found in the “Model Summary” box), the unstandardized coefficients and the
statistical results for each of the variables in the last model (found in the “Coefficients” box).

Dependent variable: Time 2 grammar

Variable          R2 change   Unstandardized coefficient   t-statistic   p-value
Time 1 grammar    .303         .045                          .577         .57
Intelligence      .013         .186                          .845         .40
L2 contact        .006        -.132                        -1.490         .14
ANWR1             .363         .546                         3.458         .001
ENWR1             .004         .213                         1.051         .30

We can compare the strength of the variables by looking at the R2 change. It is clear that, at least entered in this order, Time 1 grammar is highly predictive of Time 2 grammar scores, but even more highly predictive is the score on the Arabic non-word repetition test (its R2 change is even higher than that of the Time 1 grammar). The t-test shows that the ANWR is the only constituent which is statistical. (By the way, French and O'Brien tried reversing the order of the ENWR and ANWR and found that in that case the ENWR received most of the R2 change (.328) and the ANWR just a little (.038). So it is clear that a measure of phonological memory was the big predictor, and which one it was did not matter much.)

In examining regression assumptions, the VIF column shows that in the model with all 5 variables, both of the phonological memory tests received VIF values a little over 10, indicating a problem with multicollinearity. Given what I said above about reversing the order of the two tests, in order to find the optimal model it would be best to choose one or the other of the phonological memory tests (probably the ANWR, since it had a larger R2 change when entered first than the ENWR did when entered first). In the Residuals Statistics box, no standardized
residuals are above 3 (or below -3) so that is good. For Cook’s distance no scores are above 1, and for Mahalanobis’ distance no scores are
above 15, so we do not seem to have any problems with outliers. For normality, looking at the P–P plot, there appears to be a very good fit
of the data to the line, indicating the residuals are normally distributed. For looking at the homoscedasticity requirement, the scatterplot of
residuals vs. predicted values does not show any evidence of data being more constricted on one side over another. This is quite a clean data
set that satisfies all of the assumptions of regression (a rarity!).

4.
A scatterplot matrix of the data (Graphs > Legacy Dialogs > Scatter/Dot, then choose Matrix Scatter and press DEFINE; put all 6 variables into the "Matrix Variables" box) shows that all of the variables may have a linear relationship with OVERALL except for ENROLL, which looks like a vertical line with a few outliers. Opening the regression dialog box, put OVERALL in the "Dependent" box and all of the other variables in the "Independent" box. Leave the Method as "Enter". Open the same buttons and tick the same boxes as described for #2. This model explains R2 = .76 of the variance in overall scores, a large amount. The Coefficients output box indicates (from the t-test) that the statistical factors were Teach and Knowledge only.
Running another regression with just Teach and Knowledge as the two explanatory factors, the R2 is now .74 (not much lower, but a much
simpler equation). Both factors are statistical components of the regression equation (according to the t-test).
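As a syntax sketch, the reduced model looks like the following; TEACH and KNOWLEDGE are my guesses at the variable names behind the Teach and Knowledge factors, so check the spellings in the data file.

* Reduced model with only the two statistical predictors; TEACH and KNOWLEDGE are assumed names.
REGRESSION
  /STATISTICS COEFF OUTS CI(95) R ANOVA COLLIN TOL
  /DEPENDENT OVERALL
  /METHOD=ENTER TEACH KNOWLEDGE
  /SCATTERPLOT=(*SRESID, *ZPRED)
  /RESIDUALS NORMPROB(ZRESID)
  /SAVE MAHAL COOK.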

In looking at regression assumptions: the VIF does not indicate a problem with multicollinearity; the residuals statistics, Cook's distance and Mahalanobis distance do not indicate a problem with outliers or influential points; the P–P plot looks good, indicating normally distributed residuals; and there is no clear heteroscedasticity in the residuals vs. predicted values plot. Overall, this model seems to satisfy the regression assumptions quite well.

5.
First call for a scatterplot matrix using the commands described in #4 above. Look at the intersection of each explanatory variable with the response variable (SWEAR2). A scatterplot matrix of the intersections of SWEAR2 with the explanatory variables (L2FREQ, WEIGHT2, L2_COMP, L2SPEAK) shows a random scattering of points over pretty much the entire graph, which would violate the assumption of linearity. However, since the points are discrete and not jittered, we cannot see their frequency, so there could indeed be linear trends that are not apparent in the scatterplot. In other words, there may be many more points along a line in the plot, but because we can only see 25 discrete points on the scatterplot, we cannot tell how often each point occurs. If we add regression lines to the data (open the Chart Editor, push the ADD FIT LINE AT TOTAL button (or use the menu) and then CLOSE), there do seem to be linear relationships. We will continue with the analysis.
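For reference, the Matrix Scatter dialog pastes as a one-line GRAPH command using the variable names given above (the fit lines still have to be added by hand in the Chart Editor):

* Scatterplot matrix of the response and the four explanatory variables.
GRAPH
  /SCATTERPLOT(MATRIX)=SWEAR2 L2FREQ WEIGHT2 L2_COMP L2SPEAK.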

In the regression, put SWEAR2 in the “Dependent” box and the other variables in the “Independent” box. Leave the Method as “Enter”. Open
the same buttons and tick the same boxes as described for #2.

Looking at the output: the correlations between swearing frequency and the explanatory variables seem to be of acceptable effect size, but not so high as to pose a problem. The Coefficients box shows that only weight given to swearing in L2 (WEIGHT2), L2 speaking ability (L2SPEAK) and L2 frequency of use (L2FREQ) are statistical predictors of swearing frequency. Go back to the ANALYZE > REGRESSION > LINEAR menu and remove L2_COMP from the Independent box. Run the regression again, and the regression equation is:
Swearing frequency = .41 + .23(weight given to swearing in L2) + .21(L2 speaking ability) + .29(L2 frequency of use)
This equation can be obtained by looking at the constant and the unstandardized coefficients in the "Coefficients" box of the output.
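To see the equation at work, you can compute a predicted swearing-frequency score for each case with a COMPUTE command; PRED_SWEAR is just a made-up name for the new variable (checking "Unstandardized" predicted values in the regression's SAVE dialog would give the same numbers):

* Apply the regression equation by hand; PRED_SWEAR is a hypothetical variable name.
COMPUTE PRED_SWEAR = .41 + .23*WEIGHT2 + .21*L2SPEAK + .29*L2FREQ.
EXECUTE.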

This model explains R2 = .29 of the variance in swearing frequency (according to the "Model Summary" box), which is a goodly amount, but there is room for more explanation. The Residuals Statistics box does not indicate any problem with outliers (the maximum standardized residual is not over 3), and Cook's distance is less than 1. For very large samples like this (over 500) there is no problem with Mahalanobis distance unless values are over 25 (Field, 2005), so none of these diagnostics indicates a problem with influential points. The P–P plot looks pretty normal, but the residuals vs. predicted values plot does not look random. It has a clear downward slope to it, indicating a problem with heteroscedasticity in the data.

6. Larson-Hall (2008)
Use the LarsonHall2008.sav file. Open the regression dialog box and put GJTSCORE in the "Dependent" box. Enter the three explanatory variables one at a time into the "Independent" box after you have changed the Method to "Stepwise" (see the instructions in #3 if you can't remember how to do the hierarchical regression).
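In syntax this first ordering is three stepwise blocks; trying the other orderings discussed below is then just a matter of reordering the three METHOD lines.

* Hierarchical stepwise entry; reorder the METHOD lines to test the other orderings.
REGRESSION
  /STATISTICS COEFF OUTS CI(95) R ANOVA CHANGE
  /DEPENDENT GJTSCORE
  /METHOD=STEPWISE TOTALHRS
  /METHOD=STEPWISE RLWSCORE
  /METHOD=STEPWISE APTSCORE.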

With this order (TOTALHRS, RLWSCORE, APTSCORE) the R2 = .12 (fairly low). The R2 change is .034 for hours, .088 for RLW test, and .001
for aptitude.

Now open up the regression dialog box. You could redo the regression by pressing the RESET button, but then you would have to open up all
the sub-dialog boxes as well and tick everything again. It’s probably easiest to just trace back your steps and move each variable out from
the 3 blocks you created.

With this order (RLWSCORE, APTSCORE, TOTALHRS) the R2 = .12. The R2 change is .090 for RLW test, .002 for aptitude, and .031 for hours
of input.

With this order (APTSCORE, RLWSCORE, TOTALHRS), the R2 = .12. The R2 change is .034 for total hours and .088 for RLW test. Aptitude
doesn’t even get included when it is first!

The overall R2 doesn't really change depending on the order, but the R2 change for each variable does vary depending on where it is entered. Aptitude gets very little R2 change in any position, and its largest share comes when it is entered second, after RLW. RLW is the strongest variable, and it gets the most R2 change when it comes first.
