Multiple Linear Regression 2021
REGRESSION
Professor Dr. Syed Hatim Noor
Dr. Wan Arfah Nadiah Wan Abdul Jamil
Universiti Sultan Zainal Abidin
Introduction
• Multiple linear regression is the estimation of the linear relationship between a dependent variable and more than one independent variable (covariate)
• Applied in exploratory and explanatory studies
• The outcome is a numerical variable
• If the independent variables are all numerical, the model is called Multiple Linear Regression
• If the independent variables are a combination of numerical and categorical, or categorical only, the model is called General Linear Regression
Syed Hatim Noor 2
Multiple Linear Regression Model
• Y = β0 + β1X1 + β2X2 + β3X3 + … + βnXn
• Y is the outcome
• β0 is the intercept
• β1, …, βn are the regression coefficients of the individual independent variables
• X1, …, Xn are the individual independent variables
• Our interest is in each regression coefficient, its 95% confidence interval and the corresponding p-value
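As a rough illustration of how the coefficients β0, …, βn in the model above are estimated, the sketch below solves the least-squares normal equations in plain Python. The function name `fit_ols` and the data are hypothetical (the data are generated exactly from Y = 10 + 2·X1 − 3·X2, so the fit should recover those values); in practice a statistical package would be used and would also report confidence intervals and p-values.

```python
def fit_ols(X, y):
    """Least-squares estimates [b0, b1, ..., bn] for y = b0 + b1*x1 + ... + bn*xn."""
    rows = [[1.0] + list(r) for r in X]          # prepend 1 for the intercept
    p = len(rows[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for k in range(p):
            if k != col:
                f = A[k][col] / A[col][col]
                A[k] = [a - f * b for a, b in zip(A[k], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Hypothetical data, generated exactly from Y = 10 + 2*X1 - 3*X2
X = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8), (6, 2)]
y = [10 + 2 * x1 - 3 * x2 for x1, x2 in X]
coefs = fit_ols(X, y)
print([round(b, 6) for b in coefs])   # intercept first, then one slope per variable
```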
Descriptives: age
                                     Statistic   Std. Error
Mean                                 65.26       .216
95% Confidence Interval for Mean
  Lower Bound                        64.84
  Upper Bound                        65.68
5% Trimmed Mean                      64.76
Median                               64.00
Variance                             63.737
Std. Deviation                       7.984
Minimum                              45
Maximum                              95
Range                                50
Interquartile Range                  11
Skewness                             .824        .066
Kurtosis                             .337        .132
[Figure: histogram of age (y-axis: Frequency); Mean = 65.26, Std. Dev. = 7.984, N = 1,369]
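The standard error and the 95% confidence interval of the mean age can be checked from the reported mean, standard deviation and N. This is a small sketch that assumes the large-sample critical value 1.96; the package itself would use the t distribution, which is almost identical at N = 1,369.

```python
import math

# Values reported for age in the descriptives output
n, mean, sd = 1369, 65.26, 7.984

se = sd / math.sqrt(n)                       # standard error of the mean
lower, upper = mean - 1.96 * se, mean + 1.96 * se

print(f"SE = {se:.3f}, 95% CI = ({lower:.2f}, {upper:.2f})")
# SE = 0.216, 95% CI = (64.84, 65.68)
```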
• Y = a + bx
• Weight = -39.67 + (0.65*Height)
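A short sketch of how the fitted line above is used for prediction. The function name `predict_weight` and the input height of 160 are illustrative assumptions; the intercept and slope are the ones shown in the equation.

```python
def predict_weight(height):
    """Predicted weight from the fitted line Weight = -39.67 + 0.65 * Height."""
    a, b = -39.67, 0.65          # intercept and slope from the equation
    return a + b * height

print(round(predict_weight(160), 2))                        # 64.33
# The slope is the change in predicted weight per one-unit increase in height
print(round(predict_weight(161) - predict_weight(160), 2))  # 0.65
```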
[Figure: scatter plot of PEFR against age with fitted line; R Sq Linear = 0.048]
Coefficients (Dependent Variable: PEFR)
Model           B          Std. Error   Beta     t         Sig.    95% CI Lower   95% CI Upper
1 (Constant)    537.876    26.068                20.634    .000    486.739        589.013
  age           -3.289     .396         -.219    -8.295    .000    -4.067         -2.511
B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient; the bounds are the 95% confidence interval for B.
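The t statistic and the 95% confidence interval in the coefficients output for age follow from B and its standard error. The sketch below assumes the large-sample critical value 1.96 and uses the rounded B = -3.289 and Std. Error = .396, so the results agree with the reported t = -8.295 and interval (-4.067, -2.511) only up to rounding. The function name `wald_summary` is hypothetical.

```python
def wald_summary(b, se, crit=1.96):
    """Return the t statistic and an approximate 95% CI for a coefficient."""
    t = b / se
    return t, b - crit * se, b + crit * se

# B and Std. Error for age, as reported in the output
t_age, lo_age, hi_age = wald_summary(-3.289, 0.396)
print(f"t = {t_age:.2f}, 95% CI = ({lo_age:.2f}, {hi_age:.2f})")
```

An interval that excludes zero corresponds to a p-value below 0.05 for that coefficient.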
[Figure: scatter plot of PEFR against weight with fitted line; R Sq Linear = 0.098]
Coefficients (Dependent Variable: PEFR)
Model           B          Std. Error   Beta     t         Sig.    95% CI Lower   95% CI Upper
1 (Constant)    98.270     18.746                5.242     .000    61.495         135.044
  weight        4.190      .344         .313     12.166    .000    3.514          4.865
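With a single independent variable, the coefficient of determination equals the square of the standardized coefficient (Beta). The sketch below checks this against the reported values for age (Beta = -.219, R Sq Linear = 0.048) and weight (Beta = .313, R Sq Linear = 0.098); the variable names are illustrative.

```python
# (Beta, reported R-squared) from the simple regressions of PEFR on age and on weight
reported = {"age": (-0.219, 0.048), "weight": (0.313, 0.098)}

# In simple linear regression, R-squared is the square of the standardized slope
r_squared = {name: beta ** 2 for name, (beta, _) in reported.items()}

for name, (beta, r2) in reported.items():
    print(f"{name}: Beta^2 = {r_squared[name]:.3f}, reported R Sq = {r2}")
```

This identity holds only for one predictor; in a multiple regression the R² is not the sum of squared Betas unless the predictors are uncorrelated.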
[Figure: scatter plot of PEFR with fitted line; R Sq Linear = 0.091]
Coefficients (Dependent Variable: PEFR)
Model           B          Std. Error   Beta     t         Sig.    95% CI Lower   95% CI Upper
1 (Constant)    98.270     18.746                5.242     .000    61.495         135.044
  weight        4.190      .344         .313     12.166    .000    3.514          4.865
2 (Constant)    -242.718   45.631                -5.319    .000    -332.232       -153.204
  weight        3.155      .360         .235     8.774     .000    2.449          3.860
  height        2.585      .317         .219     8.159     .000    1.964          3.207
3 (Constant)    -63.934    53.764                -1.189    .235    -169.402       41.534
  weight        2.692      .363         .201     7.417     .000    1.980          3.404
  height        2.571      .313         .218     8.221     .000    1.958          3.185
  age           -2.326     .382         -.155    -6.090    .000    -3.076         -1.577
Coefficients (Dependent Variable: PEFR)
Model           B          Std. Error   Beta     t         Sig.    95% CI Lower   95% CI Upper
1 (Constant)    -63.934    53.764                -1.189    .235    -169.402       41.534
  age           -2.326     .382         -.155    -6.090    .000    -3.076         -1.577
  weight        2.692      .363         .201     7.417     .000    1.980          3.404
  height        2.571      .313         .218     8.221     .000    1.958          3.185
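The three-variable model reported above can be written as PEFR = -63.934 - 2.326·age + 2.692·weight + 2.571·height. The sketch below evaluates it for one hypothetical person (age 60, weight 60, height 160; these inputs are made up for illustration) and shows the meaning of an adjusted coefficient.

```python
# Unstandardized coefficients (B) from the model with age, weight and height
COEF = {"constant": -63.934, "age": -2.326, "weight": 2.692, "height": 2.571}

def predict_pefr(age, weight, height):
    """Predicted PEFR from the fitted multiple linear regression equation."""
    return (COEF["constant"] + COEF["age"] * age
            + COEF["weight"] * weight + COEF["height"] * height)

# Hypothetical person, for illustration only
pefr = predict_pefr(age=60, weight=60, height=160)
print(round(pefr, 1))

# Each coefficient is the change in predicted PEFR per one-unit increase
# in that variable while the other variables are held constant.
age_effect = predict_pefr(61, 60, 160) - predict_pefr(60, 60, 160)
print(round(age_effect, 3))
```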
Coefficients with collinearity statistics (Dependent Variable: PEFR)
Model           B          Std. Error   Beta     t         Sig.    95% CI Lower   95% CI Upper   Tolerance   VIF
1 (Constant)    -63.934    53.764                -1.189    .235    -169.402       41.534
  age-IC        -2.326     .382         -.155    -6.090    .000    -3.076         -1.577         .949        1.054
  weight        2.692      .363         .201     7.417     .000    1.980          3.404          .837        1.195
  height        2.571      .313         .218     8.221     .000    1.958          3.185          .875        1.142
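The two collinearity statistics carry the same information, because the variance inflation factor (VIF) is the reciprocal of the tolerance. A minimal check against the reported values follows; small differences in the third decimal come from the tolerances being rounded, and the dictionary keys are illustrative.

```python
# Tolerance values reported in the collinearity statistics
tolerance = {"age": 0.949, "weight": 0.837, "height": 0.875}

# VIF is the reciprocal of tolerance
vif = {name: 1 / tol for name, tol in tolerance.items()}

for name, value in vif.items():
    print(f"{name}: VIF = {value:.3f}")
```

VIF values close to 1, as here, indicate that each independent variable is only weakly explained by the others.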
Coefficients, model with interaction term (Dependent Variable: PEFR)
Model           B          Std. Error   Beta     t         Sig.
1 (Constant)    -237.204   155.635               -1.524    .128
  age-IC        .270       2.221        .018     .121      .903
  weight        5.962      2.780        .445     2.144     .032
  height        2.586      .313         .219     8.264     .000
  age_wt        -.050      .042         -.267    -1.186    .236
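The interaction check can be sketched as follows, assuming that `age_wt` in the output is the product of age and weight (the name suggests this, but the slides do not state it). The helper names and the example row are hypothetical; the coefficients are those reported for weight and age_wt.

```python
def add_interaction(rows):
    """Add an age-by-weight product term to each record (assumed meaning of age_wt)."""
    return [dict(r, age_wt=r["age"] * r["weight"]) for r in rows]

def weight_slope(age, b_weight=5.962, b_age_wt=-0.050):
    """Slope of weight at a given age in the model that includes the interaction term."""
    return b_weight + b_age_wt * age

# Hypothetical record, for illustration only
data = add_interaction([{"age": 60, "weight": 70, "height": 165}])
print(data[0]["age_wt"])

# With an interaction term, the effect of weight is no longer a single number
print(round(weight_slope(65), 3))
```

In the output above the interaction term has Sig. = .236, so it is not statistically significant at the 0.05 level.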
[Figure: scatter plot of Unstandardized Residual against Unstandardized Predicted Value]
• … of fitted observations, then the assumption is not met
• There is no peculiar feature here, so the linearity assumption is fulfilled
[Figure: scatter plot of Unstandardized Residual against Unstandardized Predicted Value]
• … shape of divergence or convergence or fan-shape of fitted observations, then the assumption is not met
• There is no peculiar feature here, so the equal variance assumption is fulfilled
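The residual plot is judged by eye, but the quantities behind it are easy to compute. The sketch below is a rough numeric companion to the visual check, not the method used in the slides: it computes residuals (observed minus predicted) and compares their spread in the lower and upper half of the predicted values. All data and function names are hypothetical.

```python
from statistics import pstdev

def residuals(observed, predicted):
    """Unstandardized residual = observed value - predicted value."""
    return [o - p for o, p in zip(observed, predicted)]

def spread_by_half(predicted, resid):
    """SD of the residuals for the lower and the upper half of predicted values."""
    pairs = sorted(zip(predicted, resid))
    mid = len(pairs) // 2
    low = [r for _, r in pairs[:mid]]
    high = [r for _, r in pairs[mid:]]
    return pstdev(low), pstdev(high)

# Hypothetical values, for illustration only
predicted = [200, 220, 240, 260, 280, 300, 320, 340]
observed = [190, 232, 229, 271, 268, 312, 309, 351]

res = residuals(observed, predicted)
sd_low, sd_high = spread_by_half(predicted, res)
# A ratio far from 1 would suggest a fan shape (unequal variance)
print(round(sd_high / sd_low, 2))
```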
• Observations along the line
[Figure: histogram of Unstandardized Residual (y-axis: Frequency); Mean = -1.4599433E-14, Std. Dev. = 109.77281718, N = 1,369]
[Figure: box plot of Unstandardized Residual; labelled cases 149, 212, 260, 875, 918 and 1,007]
Step 5: Checking assumption of normality
[Figure: histogram (y-axis: Frequency) and box plot of Unstandardized Residual; Mean = -1.4599433E-14, Std. Dev. = 109.77281718, N = 1,369; labelled cases 149, 212, 260, 875, 918 and 1,007]
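Alongside the histogram and box plot, the symmetry of the residuals can be summarised by their skewness. The sketch below uses the simple moment-based formula (third central moment divided by the 1.5 power of the second), which is an assumption for illustration; statistical packages usually report a sample-adjusted version. The data are hypothetical.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (0 for a perfectly symmetric sample)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical residuals, for illustration only
symmetric = [-30, -20, -10, 0, 10, 20, 30]
right_skewed = [-10, -8, -5, -2, 0, 5, 60]

print(round(skewness(symmetric), 3))
print(round(skewness(right_skewed), 3))
```

A value near zero is consistent with a symmetric, roughly normal distribution of residuals; a clearly positive or negative value points to a long tail on that side.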
[Figure: scatter plots of Unstandardized Residual against age, weight and height]
• … independent numerical variables and residuals
• The forms of independent numerical variables are appropriate
• They have a linear relationship with the residuals
A Checklist for reporting Multiple Linear Regression
• The relationship of interest or the purpose of the analysis (text)
• Sample size
• Multiple linear regression equation
• Regression coefficients, their 95% confidence interval and the actual p-value of each independent variable in the final model
• Coefficient of determination (R²)
• Confirm that the assumptions are fulfilled (text and footnote of the table)
• Inform about interactions and multicollinearity problem
• Report how any outlying data are treated (e.g. transformation of data if normality assumption is not met) (text)
• Name the statistical package used in the analysis (text)
Variable       Simple linear regression           Multiple linear regression
               b (95% CI)             p-value     b (95% CI)             p-value
Height (cm)    3.57 (2.97, 4.16)      <0.001      2.57 (1.96, 3.19)      <0.001
Age (years)    -3.29 (-4.07, -2.51)   <0.001      -2.33 (-3.08, -1.58)   <0.001