Quantitative Methods for Lawyers - Class #20 - Regression Analysis - Part 3
@ computational
computationallegalstudies.com
professor daniel martin katz danielmartinkatz.com
lexpredict.com slideshare.net/DanielKatz
Multiple Regression
Just a Reminder... Keep This Visual Image in Your Mind
Estimate a lawyer’s rate: Real Rate Report™ Regression model
From the CT TyMetrix/Corporate Executive Board 2012 Real Rate Report©
[Figure: components of the estimated rate, n = 15,353 Lawyers]
Base: $151
Law Firm Size: +$15 per 100 lawyers
Tier 1 Market: +$161
Partner Status: +$95
Experience: +$34 per 10 years
Practice Area: +$99 (Finance), -$15 (Litigation)
Source: 2012 Real Rate Report™
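Y = β0 +/- β1 ( X1 ) +/- β2 ( X2 ) +/- β3 ( X3 ) +/- β4 ( X4 ) +/- β5 ( X5 ) + ε
Y = $151 + $15 ( X1 ) + $161 ( X2 ) + $95 ( X3 ) + $34 ( X4 ) +/- β5 ( X5 ) + ε
where:
X1 = Law Firm Size, Per 100 Lawyers
X2 = 1 If Tier 1 Market is True
X3 = 1 If Partner Status is True
X4 = Experience, Per 10 Years
X5 = Practice Area (+$99 for Finance, -$15 for Litigation)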
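The rate model above can be turned into a small calculator. This is only a sketch: the function name `estimate_rate` and its parameter names are hypothetical, firm size and experience are assumed to scale linearly (per 100 lawyers and per 10 years), and only the two practice areas shown on the slide are included.

```python
# Coefficients as read off the slide's regression summary (dollars).
BASE_RATE = 151.0
PER_100_LAWYERS = 15.0
TIER1_MARKET = 161.0
PARTNER = 95.0
PER_10_YEARS = 34.0
# Only these two practice-area adjustments appear on the slide.
PRACTICE_AREA = {"Finance": 99.0, "Litigation": -15.0}


def estimate_rate(firm_lawyers, tier1_market, partner, years_experience, practice_area):
    """Return the model's estimated rate in dollars (error term omitted)."""
    return (
        BASE_RATE
        + PER_100_LAWYERS * (firm_lawyers / 100)
        + TIER1_MARKET * (1 if tier1_market else 0)
        + PARTNER * (1 if partner else 0)
        + PER_10_YEARS * (years_experience / 10)
        + PRACTICE_AREA[practice_area]
    )


# Hypothetical lawyer: finance partner, 20 years, 500-lawyer firm, Tier 1 market.
print(estimate_rate(500, True, True, 20, "Finance"))  # 649.0
```

The dummy variables (Tier 1 market, partner status) simply switch a fixed dollar amount on or off, while the continuous variables multiply their coefficient by the number of units.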
From The Last Time...
Now Let's Consider the More Complex Case:
Relationship Between SAT Score and Expenditures / a Variety of Other Variables?
Our Y: Dependent Variable
Our X: Predictors / Independent Variables
Multivariate Regression
Y = β0 + ( β1 * (X1) ) – ( β2 * (X2) ) + ( β3 * (X3) ) + ( β4 * (X4) ) + ( β5 * (X5) ) + ε
csat = 851.56 + 0.003*expense – 2.62*percent + 0.11*income + 1.63*high + 2.03*college + ε
Let's Consider Our “Beta Coefficients”: Are They Statistically Significant?
Look at the P Value on “Expense” - It is no longer Statistically Significant
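Two Ways to Think About Significance:
Is the P Value > .05? If so, the coefficient is not significant at the .05 level.
Is the absolute value of the Tstat < 1.96? If so, the coefficient is not significant at the .05 level.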
Variable     Significant @ .05 Level
expense      no
percent      yes
income       no
high         no
college      no
intercept    yes
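The two significance checks can be sketched in a few lines. The slides do not list the standard errors, so the coefficient and standard error used below are made-up numbers; the function names are hypothetical, and the p-value uses a standard normal approximation (which is what the 1.96 cutoff corresponds to) rather than an exact t distribution.

```python
import math


def two_sided_p_value(t_stat):
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(t_stat) / math.sqrt(2))


def significance_report(coefficient, std_error, alpha=0.05, critical_t=1.96):
    """Compute the t statistic and apply both rules of thumb from the slide."""
    t_stat = coefficient / std_error
    p_value = two_sided_p_value(t_stat)
    return {
        "t_stat": t_stat,
        "p_value": p_value,
        "significant_by_t": abs(t_stat) >= critical_t,
        "significant_by_p": p_value <= alpha,
    }


# Made-up inputs: a coefficient of 2.0 with a standard error of 1.0.
report = significance_report(2.0, 1.0)
print(report["t_stat"], round(report["p_value"], 4))
```

With small samples the exact cutoff comes from the t distribution and is somewhat larger than 1.96, so treat this as an approximation.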
Using Our Model to Predict
What if we had a Hypothetical State with the following factors -
• Per Pupil Expenditures Primary & Secondary (expense) - $6000
• % of HS graduates taking SAT (percent) - 20%
• Median Household Income, in $1,000s (income) - 33.000
• % adults with HS Diploma (high) - 70%
• % adults with College Degree (college) - 15%
• Southern State (regionSouth=true)
Please Predict the Mean Score for this Hypothetical State.
Here is our Model:
csat = 849.59 – 0.002*expense – 3.01*percent – 0.17*income + 1.81*high + 4.67*college
– 34.57*(1 if regionWest=true) + 34.87*(1 if regionNorthEast=true) – 9.18*(1 if regionSouth=true) + ε
Substituting the values for our Hypothetical State (only the regionSouth term applies):
csat = 849.59 – 0.002*(6000) – 3.01*(20) – 0.17*(33.000) + 1.81*(70) + 4.67*(15)
– 34.57*(0) + 34.87*(0) – 9.18*(1)
csat = 849.59 – 12 – 60.2 – 5.61 + 126.7 + 70.05 – 9.18
predicted composite SAT Score = 959.35
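The same arithmetic can be checked in code. This is a minimal sketch using the coefficients from the model above; the function name `predict_csat` and the dictionary layout are hypothetical, and any region without its own term in the model is assumed to contribute 0.

```python
# Coefficients copied from the fitted model on the slide.
COEFFICIENTS = {
    "intercept": 849.59,
    "expense": -0.002,
    "percent": -3.01,
    "income": -0.17,
    "high": 1.81,
    "college": 4.67,
}
# Region terms: only one of them is switched on for a given state.
REGION_EFFECTS = {"West": -34.57, "NorthEast": 34.87, "South": -9.18}


def predict_csat(expense, percent, income, high, college, region):
    """Predicted mean composite SAT score (error term omitted)."""
    values = {
        "expense": expense,
        "percent": percent,
        "income": income,
        "high": high,
        "college": college,
    }
    score = COEFFICIENTS["intercept"]
    for name, value in values.items():
        score += COEFFICIENTS[name] * value
    # Regions with no term in the model add nothing.
    score += REGION_EFFECTS.get(region, 0.0)
    return score


predicted = predict_csat(6000, 20, 33.0, 70, 15, "South")
print(round(predicted, 2))  # 959.35
```

Note that income is entered in thousands of dollars (33.0, not 33000), matching how the variable is measured in the data.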
Violation of Regression Assumptions
Heteroskedasticity
Regression Analysis assumes that error terms are independently, identically and normally distributed
Assumes that error terms have mean of zero and a constant variance (i.e. the variance of the error terms is the same across all values of the predictors)
What does this Mean?
If there is an error in our estimate - that estimate is still centered around the true parameter value
No Systematic Error in over/under estimating the regression coefficients
Heteroskedasticity does not cause ordinary least squares coefficient estimates to be biased, although it can cause ordinary least squares estimates of the variance (and, thus, standard errors) of the coefficients to be biased, possibly above or below the true or population variance.
Thus, regression analysis using heteroskedastic data will still provide an unbiased estimate for the relationship between the predictor variable and the outcome, but standard errors and therefore inferences obtained from data analysis are suspect.
Biased standard errors lead to biased inference, so results of hypothesis tests are possibly wrong.
[Figure: two example scatterplots, Heteroskedastic vs. Homoskedastic]
How Do I Detect
Heteroskedasticity?
Visual (Ocular) Method is a good starting point (although you
should probably also check with a more formal approach)
However, let's just start here:
(0) Load the Data
(1) Run the Regression
(2) Plot the Residuals against the fitted values
(3) Review the Resulting Plot -
When plotting residuals vs. predicted values (aka Yhat) we should not observe any pattern if the variance in the residuals is homoskedastic
Take a Look ...
Here we do observe residuals that slightly expand as we move along the fitted values
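The run-then-inspect steps can be sketched numerically. This is not the course's R code: it is a pure-Python illustration with one predictor and made-up data whose noise grows with x. Comparing the mean squared residual in the lower and upper halves of the fitted values is a crude stand-in for eyeballing the plot, and all function names are hypothetical.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals_vs_fitted(x, y):
    """Return (fitted, residual) pairs sorted by fitted value."""
    intercept, slope = fit_simple_ols(x, y)
    pairs = []
    for xi, yi in zip(x, y):
        fitted = intercept + slope * xi
        pairs.append((fitted, yi - fitted))
    return sorted(pairs)


def spread_by_half(pairs):
    """Mean squared residual in the lower and upper halves of the fitted values."""
    half = len(pairs) // 2
    lower = [r * r for _, r in pairs[:half]]
    upper = [r * r for _, r in pairs[half:]]
    return sum(lower) / len(lower), sum(upper) / len(upper)


# Made-up data: the disturbance alternates in sign and grows with x (a fan shape).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
signs = [1, -1, 1, -1, 1, -1, 1, -1, 1, -1]
y = [2.0 + 3.0 * xi + 0.5 * xi * s for xi, s in zip(x, signs)]

pairs = residuals_vs_fitted(x, y)
lower_spread, upper_spread = spread_by_half(pairs)
print(lower_spread, upper_spread)
```

A much larger spread in the upper half than in the lower half is the numeric counterpart of residuals that fan out along the fitted values.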
How Do I Detect Heteroskedasticity?
There is a More Formal Approach ... the Breusch-Pagan test
Test the Null Hypothesis of Constant Variance
(1) Run the Regression
(2) Execute the Breusch-Pagan test
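To make the mechanics of the test concrete, here is a sketch for the one-predictor case. It uses the studentized (Koenker) form of the statistic, n times the R-squared from regressing the squared residuals on the predictor, compared against a chi-square distribution with 1 degree of freedom. The data are made up and the function names are hypothetical; in practice a statistics package would normally be used.

```python
import math


def simple_ols(x, y):
    """Return (intercept, slope) for a one-predictor least-squares fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return mean_y - slope * mean_x, slope


def r_squared(x, y):
    """R-squared of the one-predictor regression of y on x."""
    intercept, slope = simple_ols(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def breusch_pagan_one_predictor(x, y):
    """Studentized Breusch-Pagan statistic and p-value (null: constant variance)."""
    intercept, slope = simple_ols(x, y)
    squared_resid = [(yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y)]
    # Auxiliary regression: squared residuals on the predictor.
    lm_stat = len(x) * r_squared(x, squared_resid)
    # Upper tail of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lm_stat / 2.0))
    return lm_stat, p_value


# Made-up fan-shaped data: the disturbance grows with x.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
signs = [1, -1, 1, -1, 1, -1, 1, -1, 1, -1]
y = [2.0 + 3.0 * xi + 0.5 * xi * s for xi, s in zip(x, signs)]

lm_stat, p_value = breusch_pagan_one_predictor(x, y)
print(lm_stat, p_value)
```

A small p-value leads to rejecting the null hypothesis of constant variance; a large one is a fail-to-reject result.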
This is a Fail to Reject Situation (we fail to reject the null hypothesis of constant variance)
However, it is generally considered wise to assume Heteroskedasticity and control for it in an appropriate manner
Robust Standard Errors
Robust Standard Errors control for heteroskedasticity
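In R you can use “rlm” (MASS package) instead of “lm”. Strictly, “rlm” fits a robust regression by M-estimation rather than adjusting the standard errors; heteroskedasticity-consistent standard errors for an “lm” fit are commonly obtained with vcovHC (sandwich package) together with coeftest (lmtest package).
Compare the Two Outputs
Coefficients are roughly the same but Std. Errors and T stats are different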
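To show what a heteroskedasticity-robust standard error changes, the sketch below computes both the classical standard error and White's HC0 robust standard error for the slope of a one-predictor regression. This is an illustration of the sandwich idea, not the output shown on the slides; the data are made up and the function name is hypothetical.

```python
import math


def slope_standard_errors(x, y):
    """Return (slope, classical_se, hc0_se) for a one-predictor OLS fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [xi - mean_x for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - mean_y) for d, yi in zip(dx, y)) / sxx
    intercept = mean_y - slope * mean_x
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Classical: one pooled error variance for every observation.
    classical_var = (sum(e * e for e in resid) / (n - 2)) / sxx
    # HC0 (White): each observation contributes its own squared residual.
    hc0_var = sum((d * e) ** 2 for d, e in zip(dx, resid)) / sxx ** 2
    return slope, math.sqrt(classical_var), math.sqrt(hc0_var)


# Made-up fan-shaped data: the disturbance grows with x.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
signs = [1, -1, 1, -1, 1, -1, 1, -1, 1, -1]
y = [2.0 + 3.0 * xi + 0.5 * xi * s for xi, s in zip(x, signs)]

slope, classical_se, hc0_se = slope_standard_errors(x, y)
print(slope, classical_se, hc0_se)
```

The slope is identical under both approaches; only the standard error (and therefore the t statistic) changes, and the robust value can come out above or below the classical one.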
Multicollinearity
A statistical phenomenon in which two or more predictor variables in a multiple regression model are highly correlated.
In this situation the coefficient estimates may change erratically in response to small changes in the model or the data.
Multicollinearity does not reduce the predictive power or reliability of the model as a whole, at least within the sample data themselves; it only affects calculations regarding individual predictors.
Take a Look at the Visual
[Figure: scatterplot matrices (one from Stata, one from R) of: Mean composite SAT score; Per pupil expenditures prim&sec; % HS graduates taking SAT; Median household income, $1,000; % adults HS diploma; % adults college degree]
http://cran.r-project.org/web/packages/car/car.pdf
How Do I Detect Multicollinearity?
(1) Run the Regression
(2) Obtain and then Examine the Variance Inflation Factor (“VIF”)
A VIF > 10 or a 1/VIF < 0.10 is an issue
Here we look to be okay
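The VIF rule of thumb can be illustrated for the simplest case. With exactly two predictors, the R-squared from regressing one predictor on the other equals their squared Pearson correlation, so VIF = 1 / (1 - r^2). The sketch below assumes that two-predictor case and uses made-up data; with more predictors each VIF needs its own auxiliary regression on all the other predictors. Function names are hypothetical.

```python
import math


def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


def flag_multicollinearity(vif, threshold=10.0):
    """Rule of thumb from the slide: VIF > 10 (equivalently 1/VIF < 0.10)."""
    return vif > threshold


# Made-up predictors: x2 is moderately related to x1, x3 is nearly a copy of x1.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
x3 = [1.1, 1.9, 3.0, 4.1, 4.9]

print(vif_two_predictors(x1, x2), flag_multicollinearity(vif_two_predictors(x1, x2)))
print(vif_two_predictors(x1, x3), flag_multicollinearity(vif_two_predictors(x1, x3)))
```

The moderately correlated pair stays well under the threshold, while the near-duplicate predictor produces a very large VIF and gets flagged.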
Daniel Martin Katz
@ computational
computationallegalstudies.com
lexpredict.com
danielmartinkatz.com
illinois tech - chicago kent college of law
Ad

More Related Content

What's hot (20)

Quantitative Methods for Lawyers - Class #15 - R Boot Camp - Part 2 - Profess...
Quantitative Methods for Lawyers - Class #15 - R Boot Camp - Part 2 - Profess...Quantitative Methods for Lawyers - Class #15 - R Boot Camp - Part 2 - Profess...
Quantitative Methods for Lawyers - Class #15 - R Boot Camp - Part 2 - Profess...
Daniel Katz
 
Quantitative Methods for Lawyers - Class #7 - Probability & Basic Statistics ...
Quantitative Methods for Lawyers - Class #7 - Probability & Basic Statistics ...Quantitative Methods for Lawyers - Class #7 - Probability & Basic Statistics ...
Quantitative Methods for Lawyers - Class #7 - Probability & Basic Statistics ...
Daniel Katz
 
Quantitative Methods for Lawyers - Class #9 - Bayes Theorem (Part 2), Skewnes...
Quantitative Methods for Lawyers - Class #9 - Bayes Theorem (Part 2), Skewnes...Quantitative Methods for Lawyers - Class #9 - Bayes Theorem (Part 2), Skewnes...
Quantitative Methods for Lawyers - Class #9 - Bayes Theorem (Part 2), Skewnes...
Daniel Katz
 
Statistical inference: Probability and Distribution
Statistical inference: Probability and DistributionStatistical inference: Probability and Distribution
Statistical inference: Probability and Distribution
Eugene Yan Ziyou
 
multiple regression
multiple regressionmultiple regression
multiple regression
Priya Sharma
 
Types of Probability Distributions - Statistics II
Types of Probability Distributions - Statistics IITypes of Probability Distributions - Statistics II
Types of Probability Distributions - Statistics II
Rupak Roy
 
Intro to Quant Trading Strategies (Lecture 4 of 10)
Intro to Quant Trading Strategies (Lecture 4 of 10)Intro to Quant Trading Strategies (Lecture 4 of 10)
Intro to Quant Trading Strategies (Lecture 4 of 10)
Adrian Aley
 
Intro to Quant Trading Strategies (Lecture 10 of 10)
Intro to Quant Trading Strategies (Lecture 10 of 10)Intro to Quant Trading Strategies (Lecture 10 of 10)
Intro to Quant Trading Strategies (Lecture 10 of 10)
Adrian Aley
 
Probability Distributions
Probability Distributions Probability Distributions
Probability Distributions
Anthony J. Evans
 
Machine learning session4(linear regression)
Machine learning   session4(linear regression)Machine learning   session4(linear regression)
Machine learning session4(linear regression)
Abhimanyu Dwivedi
 
Probability Distribution
Probability DistributionProbability Distribution
Probability Distribution
Sarabjeet Kaur
 
Intro to Quant Trading Strategies (Lecture 6 of 10)
Intro to Quant Trading Strategies (Lecture 6 of 10)Intro to Quant Trading Strategies (Lecture 6 of 10)
Intro to Quant Trading Strategies (Lecture 6 of 10)
Adrian Aley
 
Psych stats Probability and Probability Distribution
Psych stats Probability and Probability DistributionPsych stats Probability and Probability Distribution
Psych stats Probability and Probability Distribution
Martin Vince Cruz, RPm
 
Data Science - Part IV - Regression Analysis & ANOVA
Data Science - Part IV - Regression Analysis & ANOVAData Science - Part IV - Regression Analysis & ANOVA
Data Science - Part IV - Regression Analysis & ANOVA
Derek Kane
 
Intro to Quant Trading Strategies (Lecture 9 of 10)
Intro to Quant Trading Strategies (Lecture 9 of 10)Intro to Quant Trading Strategies (Lecture 9 of 10)
Intro to Quant Trading Strategies (Lecture 9 of 10)
Adrian Aley
 
Lecture 4
Lecture 4Lecture 4
Lecture 4
Subrat Sar
 
Basic probability theory and statistics
Basic probability theory and statisticsBasic probability theory and statistics
Basic probability theory and statistics
Learnbay Datascience
 
Intro to Quant Trading Strategies (Lecture 2 of 10)
Intro to Quant Trading Strategies (Lecture 2 of 10)Intro to Quant Trading Strategies (Lecture 2 of 10)
Intro to Quant Trading Strategies (Lecture 2 of 10)
Adrian Aley
 
Chapter 07
Chapter 07Chapter 07
Chapter 07
bmcfad01
 
Multiple regression presentation
Multiple regression presentationMultiple regression presentation
Multiple regression presentation
Carlo Magno
 
Quantitative Methods for Lawyers - Class #15 - R Boot Camp - Part 2 - Profess...
Quantitative Methods for Lawyers - Class #15 - R Boot Camp - Part 2 - Profess...Quantitative Methods for Lawyers - Class #15 - R Boot Camp - Part 2 - Profess...
Quantitative Methods for Lawyers - Class #15 - R Boot Camp - Part 2 - Profess...
Daniel Katz
 
Quantitative Methods for Lawyers - Class #7 - Probability & Basic Statistics ...
Quantitative Methods for Lawyers - Class #7 - Probability & Basic Statistics ...Quantitative Methods for Lawyers - Class #7 - Probability & Basic Statistics ...
Quantitative Methods for Lawyers - Class #7 - Probability & Basic Statistics ...
Daniel Katz
 
Quantitative Methods for Lawyers - Class #9 - Bayes Theorem (Part 2), Skewnes...
Quantitative Methods for Lawyers - Class #9 - Bayes Theorem (Part 2), Skewnes...Quantitative Methods for Lawyers - Class #9 - Bayes Theorem (Part 2), Skewnes...
Quantitative Methods for Lawyers - Class #9 - Bayes Theorem (Part 2), Skewnes...
Daniel Katz
 
Statistical inference: Probability and Distribution
Statistical inference: Probability and DistributionStatistical inference: Probability and Distribution
Statistical inference: Probability and Distribution
Eugene Yan Ziyou
 
multiple regression
multiple regressionmultiple regression
multiple regression
Priya Sharma
 
Types of Probability Distributions - Statistics II
Types of Probability Distributions - Statistics IITypes of Probability Distributions - Statistics II
Types of Probability Distributions - Statistics II
Rupak Roy
 
Intro to Quant Trading Strategies (Lecture 4 of 10)
Intro to Quant Trading Strategies (Lecture 4 of 10)Intro to Quant Trading Strategies (Lecture 4 of 10)
Intro to Quant Trading Strategies (Lecture 4 of 10)
Adrian Aley
 
Intro to Quant Trading Strategies (Lecture 10 of 10)
Intro to Quant Trading Strategies (Lecture 10 of 10)Intro to Quant Trading Strategies (Lecture 10 of 10)
Intro to Quant Trading Strategies (Lecture 10 of 10)
Adrian Aley
 
Probability Distributions
Probability Distributions Probability Distributions
Probability Distributions
Anthony J. Evans
 
Machine learning session4(linear regression)
Machine learning   session4(linear regression)Machine learning   session4(linear regression)
Machine learning session4(linear regression)
Abhimanyu Dwivedi
 
Probability Distribution
Probability DistributionProbability Distribution
Probability Distribution
Sarabjeet Kaur
 
Intro to Quant Trading Strategies (Lecture 6 of 10)
Intro to Quant Trading Strategies (Lecture 6 of 10)Intro to Quant Trading Strategies (Lecture 6 of 10)
Intro to Quant Trading Strategies (Lecture 6 of 10)
Adrian Aley
 
Psych stats Probability and Probability Distribution
Psych stats Probability and Probability DistributionPsych stats Probability and Probability Distribution
Psych stats Probability and Probability Distribution
Martin Vince Cruz, RPm
 
Data Science - Part IV - Regression Analysis & ANOVA
Data Science - Part IV - Regression Analysis & ANOVAData Science - Part IV - Regression Analysis & ANOVA
Data Science - Part IV - Regression Analysis & ANOVA
Derek Kane
 
Intro to Quant Trading Strategies (Lecture 9 of 10)
Intro to Quant Trading Strategies (Lecture 9 of 10)Intro to Quant Trading Strategies (Lecture 9 of 10)
Intro to Quant Trading Strategies (Lecture 9 of 10)
Adrian Aley
 
Basic probability theory and statistics
Basic probability theory and statisticsBasic probability theory and statistics
Basic probability theory and statistics
Learnbay Datascience
 
Intro to Quant Trading Strategies (Lecture 2 of 10)
Intro to Quant Trading Strategies (Lecture 2 of 10)Intro to Quant Trading Strategies (Lecture 2 of 10)
Intro to Quant Trading Strategies (Lecture 2 of 10)
Adrian Aley
 
Chapter 07
Chapter 07Chapter 07
Chapter 07
bmcfad01
 
Multiple regression presentation
Multiple regression presentationMultiple regression presentation
Multiple regression presentation
Carlo Magno
 

Similar to Quantitative Methods for Lawyers - Class #20 - Regression Analysis - Part 3 (20)

Regression for class teaching
Regression for class teachingRegression for class teaching
Regression for class teaching
Pakistan Gum Industries Pvt. Ltd
 
Statistics excellent
Statistics excellentStatistics excellent
Statistics excellent
National Institute of Biologics
 
Corrleation and regression
Corrleation and regressionCorrleation and regression
Corrleation and regression
Pakistan Gum Industries Pvt. Ltd
 
Statistical analysis
Statistical analysisStatistical analysis
Statistical analysis
highlandn
 
Measures of dispersion
Measures of dispersionMeasures of dispersion
Measures of dispersion
Nilanjan Bhaumik
 
Measures of dispersion discuss 2.2
Measures of dispersion discuss 2.2Measures of dispersion discuss 2.2
Measures of dispersion discuss 2.2
Makati Science High School
 
Topic 2b .pptx
Topic 2b .pptxTopic 2b .pptx
Topic 2b .pptx
tengshiankam
 
A General Manger of Harley-Davidson has to decide on the size of a.docx
A General Manger of Harley-Davidson has to decide on the size of a.docxA General Manger of Harley-Davidson has to decide on the size of a.docx
A General Manger of Harley-Davidson has to decide on the size of a.docx
evonnehoggarth79783
 
Basics of Stats (2).pptx
Basics of Stats (2).pptxBasics of Stats (2).pptx
Basics of Stats (2).pptx
madihamaqbool6
 
L1 statistics
L1 statisticsL1 statistics
L1 statistics
dapdai
 
Risk notes ch12
Risk notes ch12Risk notes ch12
Risk notes ch12
Ragheed I. Moghrabi MA, MBA
 
UNIT 3.pptx.......................................
UNIT 3.pptx.......................................UNIT 3.pptx.......................................
UNIT 3.pptx.......................................
vijayannamratha
 
Module-2_Notes-with-Example for data science
Module-2_Notes-with-Example for data scienceModule-2_Notes-with-Example for data science
Module-2_Notes-with-Example for data science
pujashri1975
 
Statistics.pdf
Statistics.pdfStatistics.pdf
Statistics.pdf
Shruti Nigam (CWM, AFP)
 
statistics
statisticsstatistics
statistics
Sanchit Babbar
 
best for normal distribution.ppt
best for normal distribution.pptbest for normal distribution.ppt
best for normal distribution.ppt
DejeneDay
 
statical-data-1 to know how to measure.ppt
statical-data-1 to know how to measure.pptstatical-data-1 to know how to measure.ppt
statical-data-1 to know how to measure.ppt
NazarudinManik1
 
Measure of Dispersion in statistics
Measure of Dispersion in statisticsMeasure of Dispersion in statistics
Measure of Dispersion in statistics
Md. Mehadi Hassan Bappy
 
DescriptiveStatistics.pdf
DescriptiveStatistics.pdfDescriptiveStatistics.pdf
DescriptiveStatistics.pdf
data2businessinsight
 
Exploratory Data Analysis EFA Factor analysis
Exploratory Data Analysis EFA Factor analysisExploratory Data Analysis EFA Factor analysis
Exploratory Data Analysis EFA Factor analysis
KathiravanGopalan
 
Statistical analysis
Statistical analysisStatistical analysis
Statistical analysis
highlandn
 
A General Manger of Harley-Davidson has to decide on the size of a.docx
A General Manger of Harley-Davidson has to decide on the size of a.docxA General Manger of Harley-Davidson has to decide on the size of a.docx
A General Manger of Harley-Davidson has to decide on the size of a.docx
evonnehoggarth79783
 
Basics of Stats (2).pptx
Basics of Stats (2).pptxBasics of Stats (2).pptx
Basics of Stats (2).pptx
madihamaqbool6
 
L1 statistics
L1 statisticsL1 statistics
L1 statistics
dapdai
 
UNIT 3.pptx.......................................
UNIT 3.pptx.......................................UNIT 3.pptx.......................................
UNIT 3.pptx.......................................
vijayannamratha
 
Module-2_Notes-with-Example for data science
Module-2_Notes-with-Example for data scienceModule-2_Notes-with-Example for data science
Module-2_Notes-with-Example for data science
pujashri1975
 
best for normal distribution.ppt
best for normal distribution.pptbest for normal distribution.ppt
best for normal distribution.ppt
DejeneDay
 
statical-data-1 to know how to measure.ppt
statical-data-1 to know how to measure.pptstatical-data-1 to know how to measure.ppt
statical-data-1 to know how to measure.ppt
NazarudinManik1
 
Exploratory Data Analysis EFA Factor analysis
Exploratory Data Analysis EFA Factor analysisExploratory Data Analysis EFA Factor analysis
Exploratory Data Analysis EFA Factor analysis
KathiravanGopalan
 
Ad

More from Daniel Katz (20)

Legal Analytics versus Empirical Legal Studies - or - Causal Inference vs Pre...
Legal Analytics versus Empirical Legal Studies - or - Causal Inference vs Pre...Legal Analytics versus Empirical Legal Studies - or - Causal Inference vs Pre...
Legal Analytics versus Empirical Legal Studies - or - Causal Inference vs Pre...
Daniel Katz
 
Can Law Librarians Help Law Become More Data Driven ? An Open Question in Ne...
Can Law Librarians Help Law Become More Data Driven ?  An Open Question in Ne...Can Law Librarians Help Law Become More Data Driven ?  An Open Question in Ne...
Can Law Librarians Help Law Become More Data Driven ? An Open Question in Ne...
Daniel Katz
 
Why We Are Open Sourcing ContraxSuite and Some Thoughts About Legal Tech and ...
Why We Are Open Sourcing ContraxSuite and Some Thoughts About Legal Tech and ...Why We Are Open Sourcing ContraxSuite and Some Thoughts About Legal Tech and ...
Why We Are Open Sourcing ContraxSuite and Some Thoughts About Legal Tech and ...
Daniel Katz
 
Fin (Legal) Tech – Law’s Future from Finance’s Past (Some Thoughts About the ...
Fin (Legal) Tech – Law’s Future from Finance’s Past (Some Thoughts About the ...Fin (Legal) Tech – Law’s Future from Finance’s Past (Some Thoughts About the ...
Fin (Legal) Tech – Law’s Future from Finance’s Past (Some Thoughts About the ...
Daniel Katz
 
Exploring the Physical Properties of Regulatory Ecosystems - Professors Danie...
Exploring the Physical Properties of Regulatory Ecosystems - Professors Danie...Exploring the Physical Properties of Regulatory Ecosystems - Professors Danie...
Exploring the Physical Properties of Regulatory Ecosystems - Professors Danie...
Daniel Katz
 
Law + Complexity & Prediction: Toward a Characterization of Legal Systems as ...
Law + Complexity & Prediction: Toward a Characterization of Legal Systems as ...Law + Complexity & Prediction: Toward a Characterization of Legal Systems as ...
Law + Complexity & Prediction: Toward a Characterization of Legal Systems as ...
Daniel Katz
 
Building Your Personal (Legal) Brand - Some Thoughts for Law Students and Oth...
Building Your Personal (Legal) Brand - Some Thoughts for Law Students and Oth...Building Your Personal (Legal) Brand - Some Thoughts for Law Students and Oth...
Building Your Personal (Legal) Brand - Some Thoughts for Law Students and Oth...
Daniel Katz
 
Measure Twice, Cut Once - Solving the Legal Profession Biggest Challenges Tog...
Measure Twice, Cut Once - Solving the Legal Profession Biggest Challenges Tog...Measure Twice, Cut Once - Solving the Legal Profession Biggest Challenges Tog...
Measure Twice, Cut Once - Solving the Legal Profession Biggest Challenges Tog...
Daniel Katz
 
Artificial Intelligence and Law - 
A Primer
Artificial Intelligence and Law - 
A Primer Artificial Intelligence and Law - 
A Primer
Artificial Intelligence and Law - 
A Primer
Daniel Katz
 
Machine Learning as a Service: #MLaaS, Open Source and the Future of (Legal) ...
Machine Learning as a Service: #MLaaS, Open Source and the Future of (Legal) ...Machine Learning as a Service: #MLaaS, Open Source and the Future of (Legal) ...
Machine Learning as a Service: #MLaaS, Open Source and the Future of (Legal) ...
Daniel Katz
 
Technology, Data and Computation Session @ The World Bank - Law, Justice, and...
Technology, Data and Computation Session @ The World Bank - Law, Justice, and...Technology, Data and Computation Session @ The World Bank - Law, Justice, and...
Technology, Data and Computation Session @ The World Bank - Law, Justice, and...
Daniel Katz
 
LexPredict - Empowering the Future of Legal Decision Making
LexPredict - Empowering the Future of Legal Decision MakingLexPredict - Empowering the Future of Legal Decision Making
LexPredict - Empowering the Future of Legal Decision Making
Daniel Katz
 
{Law, Tech, Design, Delivery} Observations Regarding Innovation in the Legal ...
{Law, Tech, Design, Delivery} Observations Regarding Innovation in the Legal ...{Law, Tech, Design, Delivery} Observations Regarding Innovation in the Legal ...
{Law, Tech, Design, Delivery} Observations Regarding Innovation in the Legal ...
Daniel Katz
 
Legal Analytics Course - Class 11 - Network Analysis and Law - Professors Dan...
Legal Analytics Course - Class 11 - Network Analysis and Law - Professors Dan...Legal Analytics Course - Class 11 - Network Analysis and Law - Professors Dan...
Legal Analytics Course - Class 11 - Network Analysis and Law - Professors Dan...
Daniel Katz
 
Legal Analytics Course - Class 12 - Data Preprocessing using dPlyR - Professo...
Legal Analytics Course - Class 12 - Data Preprocessing using dPlyR - Professo...Legal Analytics Course - Class 12 - Data Preprocessing using dPlyR - Professo...
Legal Analytics Course - Class 12 - Data Preprocessing using dPlyR - Professo...
Daniel Katz
 
Legal Analytics Course - Class 10 - Information Visualization + DataViz in R ...
Legal Analytics Course - Class 10 - Information Visualization + DataViz in R ...Legal Analytics Course - Class 10 - Information Visualization + DataViz in R ...
Legal Analytics Course - Class 10 - Information Visualization + DataViz in R ...
Daniel Katz
 
Legal Analytics Course - Class #4 - Github and RMarkdown Tutorial - Professor...
Legal Analytics Course - Class #4 - Github and RMarkdown Tutorial - Professor...Legal Analytics Course - Class #4 - Github and RMarkdown Tutorial - Professor...
Legal Analytics Course - Class #4 - Github and RMarkdown Tutorial - Professor...
Daniel Katz
 
Legal Analytics Course - Class 9 - Clustering Algorithms (K-Means & Hierarch...
Legal Analytics Course - Class 9 -  Clustering Algorithms (K-Means & Hierarch...Legal Analytics Course - Class 9 -  Clustering Algorithms (K-Means & Hierarch...
Legal Analytics Course - Class 9 - Clustering Algorithms (K-Means & Hierarch...
Daniel Katz
 
Legal Analytics Course - Class 8 - Introduction to Random Forests and Ensembl...
Legal Analytics Course - Class 8 - Introduction to Random Forests and Ensembl...Legal Analytics Course - Class 8 - Introduction to Random Forests and Ensembl...
Legal Analytics Course - Class 8 - Introduction to Random Forests and Ensembl...
Daniel Katz
 
Legal Analytics Course - Class 7 - Binary Classification with Decision Tree L...
Legal Analytics Course - Class 7 - Binary Classification with Decision Tree L...Legal Analytics Course - Class 7 - Binary Classification with Decision Tree L...
Legal Analytics Course - Class 7 - Binary Classification with Decision Tree L...
Daniel Katz
 
Legal Analytics versus Empirical Legal Studies - or - Causal Inference vs Pre...
Legal Analytics versus Empirical Legal Studies - or - Causal Inference vs Pre...Legal Analytics versus Empirical Legal Studies - or - Causal Inference vs Pre...
Legal Analytics versus Empirical Legal Studies - or - Causal Inference vs Pre...
Daniel Katz
 
Can Law Librarians Help Law Become More Data Driven ? An Open Question in Ne...
Can Law Librarians Help Law Become More Data Driven ?  An Open Question in Ne...Can Law Librarians Help Law Become More Data Driven ?  An Open Question in Ne...
Can Law Librarians Help Law Become More Data Driven ? An Open Question in Ne...
Daniel Katz
 
Why We Are Open Sourcing ContraxSuite and Some Thoughts About Legal Tech and ...
Why We Are Open Sourcing ContraxSuite and Some Thoughts About Legal Tech and ...Why We Are Open Sourcing ContraxSuite and Some Thoughts About Legal Tech and ...
Why We Are Open Sourcing ContraxSuite and Some Thoughts About Legal Tech and ...
Daniel Katz
 
Fin (Legal) Tech – Law’s Future from Finance’s Past (Some Thoughts About the ...
Fin (Legal) Tech – Law’s Future from Finance’s Past (Some Thoughts About the ...Fin (Legal) Tech – Law’s Future from Finance’s Past (Some Thoughts About the ...
Fin (Legal) Tech – Law’s Future from Finance’s Past (Some Thoughts About the ...
Daniel Katz
 
Exploring the Physical Properties of Regulatory Ecosystems - Professors Danie...
Exploring the Physical Properties of Regulatory Ecosystems - Professors Danie...Exploring the Physical Properties of Regulatory Ecosystems - Professors Danie...
Exploring the Physical Properties of Regulatory Ecosystems - Professors Danie...
Daniel Katz
 
Law + Complexity & Prediction: Toward a Characterization of Legal Systems as ...
Law + Complexity & Prediction: Toward a Characterization of Legal Systems as ...Law + Complexity & Prediction: Toward a Characterization of Legal Systems as ...
Law + Complexity & Prediction: Toward a Characterization of Legal Systems as ...
Daniel Katz
 
Building Your Personal (Legal) Brand - Some Thoughts for Law Students and Oth...
Building Your Personal (Legal) Brand - Some Thoughts for Law Students and Oth...Building Your Personal (Legal) Brand - Some Thoughts for Law Students and Oth...
Building Your Personal (Legal) Brand - Some Thoughts for Law Students and Oth...
Daniel Katz
 
Measure Twice, Cut Once - Solving the Legal Profession Biggest Challenges Tog...
Measure Twice, Cut Once - Solving the Legal Profession Biggest Challenges Tog...Measure Twice, Cut Once - Solving the Legal Profession Biggest Challenges Tog...
Measure Twice, Cut Once - Solving the Legal Profession Biggest Challenges Tog...
Daniel Katz
 
Artificial Intelligence and Law - 
A Primer
Artificial Intelligence and Law - 
A Primer Artificial Intelligence and Law - 
A Primer
Artificial Intelligence and Law - 
A Primer
Daniel Katz
 
Machine Learning as a Service: #MLaaS, Open Source and the Future of (Legal) ...
Machine Learning as a Service: #MLaaS, Open Source and the Future of (Legal) ...Machine Learning as a Service: #MLaaS, Open Source and the Future of (Legal) ...
Machine Learning as a Service: #MLaaS, Open Source and the Future of (Legal) ...
Daniel Katz
 
Technology, Data and Computation Session @ The World Bank - Law, Justice, and...
Technology, Data and Computation Session @ The World Bank - Law, Justice, and...Technology, Data and Computation Session @ The World Bank - Law, Justice, and...

Quantitative Methods for Lawyers - Class #20 - Regression Analysis - Part 3

  • 1. Quantitative Methods for Lawyers Class #20 Regression Analysis Part 3 @ computational computationallegalstudies.com professor daniel martin katz danielmartinkatz.com lexpredict.com slideshare.net/DanielKatz
  • 4. Keep This Visual Image in Your Mind
  • 5. Estimate a lawyer’s rate: Real Rate Report™ regression model, from the CT TyMetrix/Corporate Executive Board 2012 Real Rate Report© (n = 15,353 lawyers). Base: $151. Law firm size: +$15 per 100 lawyers. Tier 1 market: +$161. Partner status: +$95. Experience: +$34 per 10 years. Practice area: +$99 (Finance) or -$15 (Litigation). Source: 2012 Real Rate Report™
  • 6. Y = β0 +/- β1 ( X1 ) +/- β2 ( X2 ) +/- β3 ( X3 ) +/- β4 ( X4 ) +/- β5 ( X5 ) + ε Y = $151 + $15 ( per 100 lawyers ) + $161 ( if Tier 1 market is true ) + $95 ( if partner status is true ) + $34 ( per 10 years of experience ) +/- β5 ( practice area ) + ε
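The rate equation above can be evaluated directly. The following is a minimal sketch, assuming the coefficients read off the slide ($151 base, $15 per 100 lawyers, $161 for a Tier 1 market, $95 for partner status, $34 per 10 years, and +$99 / -$15 for Finance / Litigation). The function name `estimate_hourly_rate` and the example lawyer are hypothetical, and the error term ε is ignored.

```python
def estimate_hourly_rate(firm_size, tier1_market, partner,
                         years_experience, practice_area=None):
    """Point estimate of a lawyer's rate using the slide's coefficients.

    The error term is omitted, so this is only the fitted value.
    """
    # Practice-area adjustments shown on the slide; any other area adds 0 here
    # (an assumption, since the slide lists only these two).
    practice_adjustment = {"finance": 99.0, "litigation": -15.0}

    rate = 151.0                                  # base
    rate += 15.0 * (firm_size / 100.0)            # per 100 lawyers in the firm
    rate += 161.0 if tier1_market else 0.0        # Tier 1 market indicator
    rate += 95.0 if partner else 0.0              # partner status indicator
    rate += 34.0 * (years_experience / 10.0)      # per 10 years of experience
    rate += practice_adjustment.get(practice_area, 0.0)
    return rate


# Hypothetical lawyer: 500-lawyer firm, Tier 1 market, partner, 20 years, Finance.
example_rate = estimate_hourly_rate(500, True, True, 20, "finance")
print(example_rate)
```

Each indicator variable simply switches its coefficient on or off, while the size and experience terms scale with the units stated on the slide.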
  • 7. From The Last Time...
  • 8. Now Let's Consider the More Complex Case: What is the Relationship Between SAT Score and Expenditures plus a Variety of Other Variables? Our Y: Dependent Variable. Our X: Predictors/Independent Variables. Multivariate Regression
  • 9. Y = B0 + ( B1 * (X1) ) – ( B2 * (X2) ) + ( B3 * (X3) ) + ( B4 * (X4)) + ( B5 * (X5) ) + ε csat = 851.56 + 0.003*expense – 2.62*percent + 0.11*income + 1.63*high + 2.03*college + ε
  • 10. Let's Consider Our “Beta Coefficients”: Are They Statistically Significant? Look at the P Value on “Expense” - It is no longer Statistically Significant
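  • 11. Two Ways to Think About Significance: Is the P Value > .05? Is the absolute value of the T stat < 1.96? If so, the coefficient is not statistically significant at the .05 level. Variable (significant @ .05 level): expense (no), percent (yes), income (no), high (no), college (no), intercept (yes)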
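The two checks above are the same decision stated two ways. A minimal sketch, assuming a two-sided test and a large-sample normal approximation to the t distribution (which is where the 1.96 cutoff comes from); the function names are hypothetical.

```python
import math


def two_sided_p_from_t(t_stat):
    """Approximate two-sided p-value for a t statistic.

    Uses the standard normal distribution, which is a reasonable
    approximation only when the sample is large.
    """
    return math.erfc(abs(t_stat) / math.sqrt(2.0))


def is_significant(p_value=None, t_stat=None, alpha=0.05, critical_t=1.96):
    """Return True if the coefficient is significant at the given level."""
    if p_value is not None:
        return p_value < alpha
    if t_stat is not None:
        return abs(t_stat) > critical_t
    raise ValueError("supply either p_value or t_stat")


print(two_sided_p_from_t(1.96))      # close to 0.05
print(is_significant(p_value=0.03))  # significant
print(is_significant(t_stat=1.2))    # not significant
```

With small samples the exact cutoff is somewhat larger than 1.96, so the p-value reported by the regression output is the safer thing to read.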
  • 12. Using Our Model to Predict
  • 13. Using Our Model to Predict: What if we had a Hypothetical State with the following factors - • Per Pupil Expenditures Primary & Secondary (expense) - $6000 • % of HS graduates taking SAT (percent) - 20% • Median Household Income in $1,000s (income) - 33.000 • % adults with HS Diploma (high) - 70% • % adults with College Degree (college) - 15% • Southern State (Region=South). Please Predict the Mean Score for this Hypothetical State. Here is our Model: csat = 849.59 – 0.002*expense – 3.01*percent – 0.17*income + 1.81*high + 4.67*college – 34.57*(1 if regionWest=true) + 34.87*(1 if regionNorthEast=true) – 9.18*(1 if regionSouth=true) + ε
  • 14. Using Our Model to Predict: What if we had a Hypothetical State with the following factors - • Per Pupil Expenditures Primary & Secondary (expense) - $6000 • % of HS graduates taking SAT (percent) - 20% • Median Household Income in $1,000s (income) - 33.000 • % adults with HS Diploma (high) - 70% • % adults with College Degree (college) - 15% • Southern State (Region=South). Here is our Model: csat = 849.59 – 0.002*expense – 3.01*percent – 0.17*income + 1.81*high + 4.67*college – 34.57*(1 if regionWest=true) + 34.87*(1 if regionNorthEast=true) – 9.18*(1 if regionSouth=true) + ε. Substituting the values: csat = 849.59 – 0.002*(6000) – 3.01*(20) – 0.17*(33.000) + 1.81*(70) + 4.67*(15) – 34.57*(1 if regionWest=true) + 34.87*(1 if regionNorthEast=true) – 9.18*(1 if regionSouth=true) + ε
  • 15. Using Our Model to Predict: csat = 849.59 – 0.002*expense – 3.01*percent – 0.17*income + 1.81*high + 4.67*college – 34.57*(1 if regionWest=true) + 34.87*(1 if regionNorthEast=true) – 9.18*(1 if regionSouth=true) + ε. Substituting the values: csat = 849.59 – 0.002*(6000) – 3.01*(20) – 0.17*(33.000) + 1.81*(70) + 4.67*(15) – 34.57*(1 if regionWest=true) + 34.87*(1 if regionNorthEast=true) – 9.18*(1 if regionSouth=true) + ε. Only the South indicator is true, so the West and NorthEast terms drop out: csat = 849.59 – 12 – 60.2 – 5.61 + 126.7 + 70.05 – 9.18. Predicted composite SAT Score = 959.35
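The prediction above is just the model evaluated at the hypothetical state's values. A minimal sketch, assuming the coefficients printed on the slide and assuming the region with no indicator (the omitted category) contributes zero; the names `predict_csat`, `COEFFICIENTS` and `REGION_EFFECTS` are hypothetical.

```python
# Coefficients as printed on the slide (error term omitted).
COEFFICIENTS = {
    "intercept": 849.59,
    "expense": -0.002,   # per dollar of per-pupil expenditure
    "percent": -3.01,    # per percentage point of HS graduates taking the SAT
    "income": -0.17,     # per $1,000 of median household income
    "high": 1.81,        # per percentage point of adults with a HS diploma
    "college": 4.67,     # per percentage point of adults with a college degree
}

# Region indicators; a region not listed here is treated as the baseline.
REGION_EFFECTS = {"West": -34.57, "NorthEast": 34.87, "South": -9.18}


def predict_csat(expense, percent, income, high, college, region):
    """Predicted mean composite SAT score for one state."""
    score = COEFFICIENTS["intercept"]
    score += COEFFICIENTS["expense"] * expense
    score += COEFFICIENTS["percent"] * percent
    score += COEFFICIENTS["income"] * income
    score += COEFFICIENTS["high"] * high
    score += COEFFICIENTS["college"] * college
    score += REGION_EFFECTS.get(region, 0.0)
    return score


predicted = predict_csat(expense=6000, percent=20, income=33.0,
                         high=70, college=15, region="South")
print(round(predicted, 2))  # 959.35
```

Exactly one region indicator can be true for a given state, which is why the lookup adds a single region effect.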
  • 17. Heteroskedasticity: Regression analysis assumes that error terms are independently, identically and normally distributed. It assumes that error terms have a mean of zero and a constant variance (i.e., the variance of the error terms is the same across all subsets of the data). What does this mean? If there is an error in our estimate, that estimate is still centered around the true value: there is no systematic error in over- or under-estimating the regression coefficients.
  • 18. Heteroskedasticity: Heteroskedasticity does not cause ordinary least squares coefficient estimates to be biased, although it can cause ordinary least squares estimates of the variance (and, thus, standard errors) of the coefficients to be biased, possibly above or below the true or population variance. Thus, regression analysis using heteroskedastic data will still provide an unbiased estimate of the relationship between the predictor variable and the outcome, but the standard errors, and therefore the inferences drawn from them, are suspect. Biased standard errors lead to biased inference, so the results of hypothesis tests are possibly wrong.
  • 20. How Do I Detect Heteroskedasticity? The Visual (Ocular) Method is a good starting point (although you should probably also check with a more formal approach). However, let's just start here: (1) Run the Regression (2) Plot the Residuals against the fitted values (3) Review the Resulting Plot - When plotting residuals vs. predicted values (aka Yhat) we should not observe any pattern if the variance in the residuals is homoskedastic
  • 21. (0) Load the Data (1) Run the Regression
  • 22. (1) Run the Regression (2) Plot the Residuals against the fitted values (3) Review the Resulting Plot - When plotting residuals vs. predicted values (aka Yhat) we should not observe any pattern if the variance in the residuals is homoskedastic
  • 23. Take a Look ... Here we do observe residuals that slightly expand as we move along the fitted values
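The visual check in steps (1) to (3) can be sketched without a plotting library by comparing the size of the residuals at low and high fitted values. This is a minimal sketch on made-up data with a single predictor; it is not the SAT dataset from the slides, and the helper names are hypothetical.

```python
def fit_simple_ols(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def fitted_and_residuals(x, y):
    """Step (1) run the regression, then return fitted values and residuals."""
    intercept, slope = fit_simple_ols(x, y)
    fitted = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return fitted, residuals


# Made-up data: the noise grows with x, so the residuals should fan out.
x = list(range(1, 21))
y = [xi + 0.5 * xi * (1 if i % 2 == 0 else -1) for i, xi in enumerate(x)]

fitted, residuals = fitted_and_residuals(x, y)

# Step (2)/(3) without a plot: order by fitted value and compare the average
# absolute residual in the lower half against the upper half.
pairs = sorted(zip(fitted, residuals))
half = len(pairs) // 2
spread_low = sum(abs(r) for _, r in pairs[:half]) / half
spread_high = sum(abs(r) for _, r in pairs[half:]) / (len(pairs) - half)
print(spread_low, spread_high)
```

A much larger spread in the upper half is the numerical counterpart of residuals that expand as we move along the fitted values. In practice the (fitted, residual) pairs would be passed to a scatter plot instead.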
  • 24. How Do I Detect Heteroskedasticity? There is a More Formal Approach ... the Breusch-Pagan test Test the Null Hypothesis of Constant Variance (1) Run the Regression (2) Execute the Breusch-Pagan test
  • 25. How Do I Detect Heteroskedasticity? This is a Fail to Reject Situation: the test does not reject the null hypothesis of constant variance. However, it is generally considered wise to assume Heteroskedasticity and control for it in an appropriate manner
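The logic of the Breusch-Pagan test can be sketched in a few lines. This is a simplified illustration, assuming a single predictor and the studentized (Koenker) form of the statistic, n times the R-squared from regressing the squared residuals on the predictor; it is not the output of any R or Stata command, the data are made up, and the function names are hypothetical.

```python
import math


def ols_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def breusch_pagan_simple(x, y):
    """Return (LM statistic, p-value) for the null of constant variance."""
    n = len(x)
    b0, b1 = ols_fit(x, y)                                   # (1) main regression
    sq_resid = [(yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)]
    a0, a1 = ols_fit(x, sq_resid)                            # (2) auxiliary regression
    mean_sq = sum(sq_resid) / n
    ss_tot = sum((u - mean_sq) ** 2 for u in sq_resid)
    ss_res = sum((u - (a0 + a1 * xi)) ** 2 for xi, u in zip(x, sq_resid))
    r_squared = 0.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot
    lm_stat = max(0.0, n * r_squared)                        # guard against rounding
    # Chi-square with 1 degree of freedom: P(X > q) = erfc(sqrt(q / 2)).
    p_value = math.erfc(math.sqrt(lm_stat / 2.0))
    return lm_stat, p_value


x = list(range(1, 21))
signs = [1 if i % 2 == 0 else -1 for i in range(20)]
y_constant = [xi + s for xi, s in zip(x, signs)]             # same noise size everywhere
y_growing = [xi + 0.5 * xi * s for xi, s in zip(x, signs)]   # noise grows with x

lm_constant, p_constant = breusch_pagan_simple(x, y_constant)
lm_growing, p_growing = breusch_pagan_simple(x, y_growing)
print(p_constant, p_growing)
```

A large p-value is the fail to reject case; a small one is evidence against constant variance. With several predictors the auxiliary regression uses all of them and the degrees of freedom equal the number of predictors.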
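  • 27. Robust Standard Errors: Robust standard errors control for heteroskedasticity. In R, the slide uses “rlm” (from the MASS package) instead of “lm”. Note that rlm fits a robust regression that down-weights outlying observations; heteroskedasticity-consistent standard errors for an ordinary “lm” fit are usually obtained with vcovHC from the sandwich package together with coeftest from lmtest.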
  • 28. Robust Standard Errors Compare the Two Outputs Coefficients are roughly the same but Std. Errors and T stats are different
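To see why the coefficient stays put while the standard error moves, here is a minimal sketch of heteroskedasticity-consistent (White, HC0) standard errors for the slope of a one-predictor regression. This is a swapped-in illustration of the sandwich estimator, not what rlm computes; the data are made up and the function name is hypothetical.

```python
import math


def slope_standard_errors(x, y):
    """Return (slope, classical SE, HC0 robust SE) for y = a + b * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

    # Classical formula: one common error variance for every observation.
    sigma_sq = sum(u ** 2 for u in residuals) / (n - 2)
    classical_var = sigma_sq / sxx

    # HC0 sandwich formula: each observation keeps its own squared residual.
    hc0_var = sum(((xi - x_bar) ** 2) * (u ** 2)
                  for xi, u in zip(x, residuals)) / sxx ** 2

    return slope, math.sqrt(classical_var), math.sqrt(hc0_var)


# Made-up data: the noise is largest far from the mean of x.
x = list(range(1, 21))
x_mean = sum(x) / len(x)
y = [2 + xi + (1 if i % 2 == 0 else -1) * 0.1 * (xi - x_mean) ** 2
     for i, xi in enumerate(x)]

slope, classical_se, robust_se = slope_standard_errors(x, y)
print(slope, classical_se, robust_se)
```

Both standard errors are computed from the same fitted slope, so only the inference (t statistics and p-values) changes. In this made-up example the robust standard error is the larger of the two, but it can also be smaller.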
  • 30. Multicollinearity: a statistical phenomenon in which two or more predictor variables in a multiple regression model are highly correlated. In this situation the coefficient estimates may change erratically in response to small changes in the model or the data. Multicollinearity does not reduce the predictive power or reliability of the model as a whole, at least within the sample data themselves; it only affects calculations regarding individual predictors.
  • 31. Take a Look at the Visual (figure from Stata). Variables shown: Mean composite SAT score; Per pupil expenditures prim&sec; % HS graduates taking SAT; Median household income, $1,000; % adults HS diploma; % adults college degree
  • 32. Take a Look at the Visual From R
  • 33. Take a Look at the Visual (figure). Variables shown: Mean composite SAT score; Per pupil expenditures prim&sec; % HS graduates taking SAT; Median household income, $1,000; % adults HS diploma; % adults college degree
  • 35. How Do I Detect Multicollinearity? (1) Run the Regression (2) Obtain and then Examine the Variance Inflation Factor (“VIF”) for each predictor. A VIF > 10 (equivalently, a tolerance 1/VIF < 0.10) is an issue. Here we look to be okay
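The VIF for a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on the other predictors. A minimal sketch for the special case of exactly two predictors, where that R² is simply the squared correlation between them; the data are made up and the function names are hypothetical.

```python
import math


def pearson_r(a, b):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    cov = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    var_a = sum((ai - a_bar) ** 2 for ai in a)
    var_b = sum((bi - b_bar) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Made-up predictors with a correlation of 0.8.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]

vif = vif_two_predictors(x1, x2)
tolerance = 1.0 / vif
print(vif, tolerance)        # roughly 2.78 and 0.36
print(vif > 10 or tolerance < 0.10)  # False: no issue by the rule of thumb
```

With more than two predictors each VIF needs its own auxiliary regression, which is what the statistical packages compute.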
  • 36. Daniel Martin Katz @ computational computationallegalstudies.com lexpredict.com danielmartinkatz.com illinois tech - chicago kent college of law