RT1 Project 1&2 Assignment

1. The document describes several regression models analyzing factors that affect college GPA and wages. For college GPA, owning a computer and high school GPA are found to have a significant positive effect, while mother's and father's education do not. For wages, blacks earn approximately 18.8% less than non-blacks, and the return to education may be lower for blacks. When allowing for differences between married blacks, married non-blacks, single blacks, and single non-blacks, married blacks earn about 18% less than married non-blacks.

Uploaded by

Prashant kudrya
# RT1 Project 1&2 Assignment - Prashant Kudrya[EA20035]

# Project 1

library(skimr)

library(ggplot2)

library(dplyr)

library(broom)

library(stargazer)

library(wooldridge)

data <- wooldridge::gpa1

model <- lm(data$colGPA ~ data$age + data$soph + data$junior + data$senior + data$male +
              data$campus + data$business + data$engineer +
              data$hsGPA + data$ACT + data$job20 + data$drive + data$bike + data$voluntr + data$PC +
              data$greek + data$car +
              data$siblings + data$bgfriend + data$clubs + data$skipped + data$alcohol + data$gradMI +
              data$fathcoll + data$mothcoll)

stargazer(model, type = 'text')

#1. From the full model, only four regressors are statistically significant for college GPA: hsGPA, PC, gradMI and skipped

model1 <- lm(data$colGPA~ data$hsGPA + data$PC+ data$gradMI + data$skipped )

stargazer(model1, type = 'text')

#2. Will owning a computer increase college GPA?

#Yes. The coefficient on PC is positive and significant at the 5% level: holding the other
#regressors fixed, owning a computer is associated with a college GPA about .135 points higher.

#3. Is it statistically significant? (Hint: control as many variables as you can)

model3 <- lm(data$colGPA~data$PC)

stargazer(model3, type = 'text')

#Yes, owning a PC is statistically significant even at the 1% level in this simple regression of colGPA on PC alone

#4. Argue whether including mother's and father's college-level education has any bearing on college GPA.

#Adding mothcoll and fathcoll to the regression

#Unrestricted Model

model4 <- lm(data$colGPA ~ data$hsGPA + data$PC + data$gradMI + data$skipped +
               data$mothcoll + data$fathcoll)

summary(model4)

#Restricted Model

model1 <- lm(data$colGPA~ data$hsGPA + data$PC+ data$gradMI + data$skipped )

summary(model1)

#hsGPA coeff changes from 0.458 to 0.457

#PC coeff changes from 0.124 to 0.117

#gradMI coeff changes from 0.172 to 0.185

#skipped coeff undergoes no change

#The constant term changes from 1.372 to 1.332

#All of them remain significant, while fathcoll and mothcoll are individually insignificant

#Therefore, adding the parents' college variables has little effect on the estimated college GPA equation

#Also verifying the same with an F-test of joint significance

#H0: beta5 = beta6 = 0

#H1: at least one of beta5, beta6 is not 0

library(car)

nullhyp<- c("data$fathcoll", "data$mothcoll")

linearHypothesis(model4, nullhyp)

#The F statistic is 0.629

#The 10% critical value of F with 2 and 134 degrees of freedom (141 observations minus 7 estimated parameters) is about 2.34

#Since 0.629 < 2.34, we cannot reject the null hypothesis

#mothcoll and fathcoll are jointly insignificant in the model
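
#A minimal sketch, not part of the original assignment: instead of reading an F-table, the
#critical value and p-value can be computed in base R. The degrees of freedom are an
#assumption based on gpa1 having 141 observations and model4 estimating 7 parameters.

```r
f_stat <- 0.629          # F statistic reported by linearHypothesis(model4, nullhyp)
df_num <- 2              # two restrictions: fathcoll and mothcoll
df_den <- 141 - 7        # assumed residual df of the unrestricted model

f_crit_10 <- qf(0.90, df_num, df_den)                        # 10% critical value
f_pvalue  <- pf(f_stat, df_num, df_den, lower.tail = FALSE)  # p-value of the F test

print(c(critical_10pct = f_crit_10, p_value = f_pvalue))

# H0 is rejected only if the statistic exceeds the critical value
reject_h0 <- f_stat > f_crit_10
```

#The same p-value is also printed in the Pr(>F) column of the linearHypothesis() output.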

#5. Add hsGPA2 to the model that you constructed in (1) and decide whether this generalization is needed.

data$gpa2 <- data$hsGPA^2

model5 <- lm(data$colGPA~ data$hsGPA + data$PC+ data$gradMI + data$skipped + data$gpa2)

summary(model5)

#When hsGPA2 is added to the regression, its coefficient is about .334 and its t statistic is
#about 1.67. (The coefficient on hsGPA is about -1.78.) This is a borderline case. The
#quadratic in hsGPA has a U-shape, and it only turns up at about hsGPA* = 2.68, which is
#hard to interpret. The coefficient of main interest, on PC, falls to about .116 but is still significant.

#Adding hsGPA2 is a simple robustness check of the main finding.
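
#A small sketch of where the turning point comes from: for b1*hsGPA + b2*hsGPA^2 the
#minimum is at -b1 / (2*b2). The values below are the rounded coefficients quoted in the
#comments, so the result (roughly 2.66) differs slightly from the 2.68 quoted above, which
#presumably uses unrounded estimates.

```r
b_hsGPA  <- -1.78   # rounded coefficient on hsGPA from the comments above
b_hsGPA2 <- 0.334   # rounded coefficient on hsGPA^2 from the comments above

# Turning point of the quadratic b_hsGPA * x + b_hsGPA2 * x^2
turning_point <- -b_hsGPA / (2 * b_hsGPA2)
print(turning_point)

# With a fitted model the same calculation would use coef(model5), e.g.
# -coef(model5)["data$hsGPA"] / (2 * coef(model5)["data$gpa2"])
```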

###----------------------------------------------------------------------------------------------------------------------------

#### Project 2

#1. Estimate the model

# Log(wage) = B0 + B1 educ + B2 exper + B3 tenure + B4 married + B5 black +

# B6 south + B7 urban + u

# and report the results in tabular form. Holding other factors fixed, what is

# the approximate difference in monthly salary between blacks and nonblacks?

# Is this difference statistically significant?

data2 <- wooldridge::wage2

summary(data2)

reg1 <- lm(log(wage)~educ+exper+tenure+married+black+south+urban, data2)

stargazer(reg1, type = 'text')

#The coefficient on black implies that, at given levels of the other explanatory variables, black

#men earn about 18.8% less than nonblack men. The t statistic is about –5, and so it is very

#statistically significant.
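
#A hedged aside: in a log(wage) model the coefficient times 100 is only an approximation
#of the percentage difference. The exact figure is 100 * (exp(b) - 1). The sketch below
#uses the rounded coefficient -0.188 quoted above rather than the fitted value.

```r
b_black <- -0.188                       # rounded coefficient on black from reg1

pct_approx <- 100 * b_black             # approximation used in the comments: -18.8
pct_exact  <- 100 * (exp(b_black) - 1)  # exact percentage difference, about -17.1

print(c(approx = pct_approx, exact = pct_exact))
```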

#2. Add the variables exper2 and tenure2 to the equation and show that they are jointly insignificant at
#even the 20% level.

data2$exper2 <- data2$exper^2

data2$tenure2 <- data2$tenure^2

reg2 <- lm(log(wage)~educ+exper+tenure+married+black+south+urban+exper2+tenure2, data2)

summary(reg2)

#Checking f-statistic for joint significance

#H0: beta8 = beta9 = 0

#H1: at least one of beta8, beta9 is not 0

nullhypo <- c("exper2", "tenure2")

linearHypothesis(reg2,nullhypo)

# The F statistic for joint significance of exper2 and tenure2,
# with 2 and 925 df, is about 1.49 with p-value ≈ .226. Because the p-value is above .20,
# these quadratics are jointly insignificant at the 20% level
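
#A minimal check of the quoted p-value, using only the F statistic and degrees of freedom
#stated in the comments (no data needed). This is a sketch, not output from the original run.

```r
f_stat_quad <- 1.49   # F statistic quoted for exper2 and tenure2
df1 <- 2              # number of restrictions
df2 <- 925            # residual df quoted above

# Upper-tail probability of the F distribution
p_quad <- pf(f_stat_quad, df1, df2, lower.tail = FALSE)
print(p_quad)

# Jointly insignificant at the 20% level when the p-value exceeds 0.20
insignificant_at_20 <- p_quad > 0.20
```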

#3. Extend the original model to allow the return to education to depend on race and test whether the
#return to education does depend on race.

#H0: beta(black:educ) = 0 (the return to education does not depend on race)

#H1: beta(black:educ) != 0 (the return to education depends on race)

#black:educ adds only the interaction term; educ and black are already in the model

reg_race <- lm(log(wage)~educ+exper+tenure+married+black+south+urban+black:educ, data2)

summary(reg_race)

# The coefficient on the interaction is about −.0226, which implies that the return to
# another year of education is about 2.3 percentage points lower for black men than for nonblack men.
# Its t statistic is about -1.12, well inside the 5% critical values of roughly +/-1.96, so we
# cannot reject the null hypothesis: there is no statistically significant evidence that the
# return to education depends on race.
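
#A short sketch of the two-sided p-value behind a t statistic of -1.12. The residual
#degrees of freedom (926) are an assumption: 935 observations in wage2 minus 9 estimated
#parameters (intercept, seven regressors and the interaction).

```r
t_inter <- -1.12      # t statistic on the black-educ interaction quoted above
df_res  <- 935 - 9    # assumed residual degrees of freedom

p_inter <- 2 * pt(-abs(t_inter), df_res)  # two-sided p-value
t_crit  <- qt(0.975, df_res)              # 5% two-sided critical value

print(c(p_value = p_inter, critical_5pct = t_crit))

# The interaction is significant at 5% only if |t| exceeds the critical value
significant_5pct <- abs(t_inter) > t_crit
```

#A p-value above 0.05 means the interaction is not statistically significant at the 5% level.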

#4. Again, start with the original model, but now allow wages to differ across four groups of people:
#married and black, married and nonblack, single and black, and single and nonblack. What is the
#estimated wage differential between married blacks and married nonblacks?

#Creating variables for married blacks

data2$marriedblack<- ifelse(data2$married==1 & data2$black==1, 1,0)

data2$marriednonblack<- ifelse(data2$married==1 & data2$black==0, 1,0)

data2$unmarriedblack<- ifelse(data2$married==0 & data2$black==1, 1,0)

#We do not add a dummy for the fourth category (unmarried nonblack): it is the base group, and
#including all four dummies together with the intercept would cause perfect multicollinearity

#Adding these to the regression equation and removing the columns married and black to avoid
#multicollinearity
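
#An illustrative sketch of the dummy-variable trap on a made-up six-row data frame (the
#`toy` data and the `unmarriednonblack` column are hypothetical, not part of wage2): the
#four group dummies always sum to 1, so together with an intercept they are linearly dependent.

```r
toy <- data.frame(married = c(1, 1, 0, 0, 1, 0),
                  black   = c(1, 0, 1, 0, 0, 1))

toy$marriedblack      <- ifelse(toy$married == 1 & toy$black == 1, 1, 0)
toy$marriednonblack   <- ifelse(toy$married == 1 & toy$black == 0, 1, 0)
toy$unmarriedblack    <- ifelse(toy$married == 0 & toy$black == 1, 1, 0)
toy$unmarriednonblack <- ifelse(toy$married == 0 & toy$black == 0, 1, 0)

groups <- c("marriedblack", "marriednonblack", "unmarriedblack", "unmarriednonblack")

# Every row belongs to exactly one group, so the dummies sum to 1 in each row
dummy_sum <- rowSums(toy[, groups])

# Design matrix with an intercept and all four dummies: 5 columns, but rank below 5
X_all    <- cbind(intercept = 1, as.matrix(toy[, groups]))
rank_all <- qr(X_all)$rank

print(dummy_sum)
print(c(columns = ncol(X_all), rank = rank_all))
```

#Dropping one dummy (here the base group) restores full column rank, which is why only three enter reg3.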

reg3 <- lm(log(wage)~educ+exper+tenure+south+urban+ unmarriedblack+ marriedblack +
             marriednonblack, data2)

summary(reg3)
# The coefficient on marriedblack is about +0.009, which means that married black men
# earn about 0.9% more than single nonblack men (the base group).

# The coefficient on marriednonblack is about +0.189, which means that married nonblack men
# earn about 18.9% more than single nonblack men.

# The differential between married blacks and married nonblacks is given by
# the difference of their coefficients: .0094 − .189 = −.1796. That is, a
# married black man earns about 18% less than a married nonblack man
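
#A brief sketch of the arithmetic above, using the rounded coefficients quoted in the
#comments rather than coef(reg3). The exact percentage is shown next to the log-point
#approximation; neither comes with a standard error.

```r
b_marriedblack    <- 0.0094   # rounded coefficient quoted above
b_marriednonblack <- 0.189    # rounded coefficient quoted above

# Difference in log wages between married blacks and married nonblacks
diff_log <- b_marriedblack - b_marriednonblack

pct_approx_diff <- 100 * diff_log              # log-point approximation, about -18
pct_exact_diff  <- 100 * (exp(diff_log) - 1)   # exact percentage, about -16.4

print(c(diff_log = diff_log, approx = pct_approx_diff, exact = pct_exact_diff))
```

#To get a standard error for this differential, one option would be to re-estimate reg3 with
#marriednonblack as the omitted base group, so the differential appears as a single coefficient.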
