Homework Assignment #1 Review (Due On W 4/8)
Written problems:
Computer problems:
1. Use Davis2018.dta.
a. Use the substr and as.numeric functions in R to generate new variables representing
the year and month of the closing date.
Hints: For example, to get the year of the pending date, I would use the
following command:
Davis2018$PendingYear<-substr(Davis2018$PendingDate,1,4)
Davis2018$PendingYear<-as.numeric(Davis2018$PendingYear)
summary(Davis2018$PendingYear)
Min. 1st Qu. Median Mean 3rd Qu. Max.
2017 2018 2018 2018 2018 2019
b. Restrict the sample to sales of single-family houses with close dates in 2018.
c. Draw a bar plot to summarize the average sale price of houses with different
characteristics of your choice (e.g., bedrooms, bathrooms, closing month, etc.)
using the subsample created in part b.
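Hints: One possible way to build such a plot is to compute the group averages with tapply and pass them to barplot. The sketch below uses simulated data, so the data frame sim and its columns Bedrooms and SalePrice are placeholders, not the actual variable names in Davis2018.dta.

```r
# Simulated stand-in for the part b subsample; all names and values are hypothetical.
set.seed(6)
sim <- data.frame(Bedrooms = sample(2:5, 150, replace = TRUE))
sim$SalePrice <- 300000 + 80000 * sim$Bedrooms + rnorm(150, sd = 40000)

# Average sale price within each bedroom count
avg_price <- tapply(sim$SalePrice, sim$Bedrooms, mean)

# One bar per bedroom count, with bar height equal to the group mean
barplot(avg_price,
        xlab = "Bedrooms",
        ylab = "Average sale price",
        main = "Average sale price by bedrooms (simulated)")
```

The same pattern should work for any grouping variable (bathrooms, closing month, and so on) by swapping the second argument of tapply.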
d. Run a regression of sale price on month of closing and test the overall significance
of the regression at the 5% significance level.
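Hints: The overall F statistic is stored in the summary of an lm fit, and its p-value can be computed with pf. The sketch below runs on simulated data; sim, CloseMonth, and SalePrice are hypothetical names, and the month enters as a single numeric regressor, which is only one of several reasonable specifications (a set of month dummies via factor() is another).

```r
# Simulated stand-in for the 2018 subsample; every name and value here is hypothetical.
set.seed(1)
n <- 200
sim <- data.frame(CloseMonth = sample(1:12, n, replace = TRUE))
sim$SalePrice <- 500000 + 2000 * sim$CloseMonth + rnorm(n, sd = 50000)

fit <- lm(SalePrice ~ CloseMonth, data = sim)

# summary()$fstatistic is a named vector: value, numdf, dendf
fstat <- summary(fit)$fstatistic
p_value <- pf(fstat["value"], fstat["numdf"], fstat["dendf"], lower.tail = FALSE)

# Reject H0 (all slope coefficients are zero) when the p-value is below 0.05
reject_at_5pct <- unname(p_value < 0.05)
print(c(F = unname(fstat["value"]), p = unname(p_value)))
```

With a single regressor, the F statistic equals the square of the slope's t statistic, so the two tests lead to the same conclusion.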
e. How would you obtain heteroskedasticity-robust standard errors in the above
regression if you think the homoskedasticity assumption is violated?
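Hints: A minimal base-R sketch of the heteroskedasticity-robust (HC1) variance formula, (X'X)^(-1) [sum of u_i^2 x_i x_i'] (X'X)^(-1) scaled by n/(n-k), is shown below on simulated data. The names sim, CloseMonth, and SalePrice are hypothetical. In practice many people use vcovHC(fit, type = "HC1") from the sandwich package instead of coding the formula by hand; that is an alternative, not something this assignment prescribes.

```r
# Simulated data; all names and values are hypothetical.
set.seed(2)
n <- 200
sim <- data.frame(CloseMonth = sample(1:12, n, replace = TRUE))
# Error spread grows with the month, so homoskedasticity fails by construction
sim$SalePrice <- 500000 + 2000 * sim$CloseMonth +
  rnorm(n, sd = 5000 * sim$CloseMonth)

fit <- lm(SalePrice ~ CloseMonth, data = sim)
X <- model.matrix(fit)          # n x k design matrix, including the intercept
u <- residuals(fit)             # OLS residuals
k <- ncol(X)

bread <- solve(crossprod(X))    # (X'X)^(-1)
meat <- crossprod(X * u)        # sum over i of u_i^2 * x_i x_i'
vcov_hc1 <- n / (n - k) * bread %*% meat %*% bread

robust_se <- sqrt(diag(vcov_hc1))
usual_se <- sqrt(diag(vcov(fit)))
print(cbind(usual_se, robust_se))
```

The coefficient estimates are unchanged; only the standard errors, and therefore the t statistics and confidence intervals, differ.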
f. Run a regression of sale price on list price and days on market. How do you
interpret the slope coefficients of this regression? Do you think the zero
conditional mean condition is satisfied here?
g. Add house characteristics to the above regression model and test the joint
significance of all newly added house characteristics variables.
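Hints: One way to test joint significance is to compare the restricted model (without the new variables) to the unrestricted model (with them) using anova, which reports the F statistic for the added terms. The sketch below is on simulated data; sim, Bedrooms, and Bathrooms are hypothetical stand-ins for whatever house characteristics you choose.

```r
# Simulated data; all names and values are hypothetical.
set.seed(3)
n <- 300
sim <- data.frame(
  ListPrice = runif(n, 300000, 900000),
  DaysOnMarket = rpois(n, 30),
  Bedrooms = sample(2:5, n, replace = TRUE),
  Bathrooms = sample(1:3, n, replace = TRUE)
)
sim$SalePrice <- 0.97 * sim$ListPrice - 300 * sim$DaysOnMarket +
  5000 * sim$Bedrooms + rnorm(n, sd = 20000)

restricted <- lm(SalePrice ~ ListPrice + DaysOnMarket, data = sim)
unrestricted <- lm(SalePrice ~ ListPrice + DaysOnMarket + Bedrooms + Bathrooms,
                   data = sim)

# F test of H0: the coefficients on Bedrooms and Bathrooms are both zero
ftest <- anova(restricted, unrestricted)
print(ftest)
f_stat <- ftest$F[2]
p_value <- ftest[["Pr(>F)"]][2]
```

Both models need to be estimated on the same observations, so with real data it is worth dropping rows with missing house characteristics before fitting either model.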
h. Review your ECN 102 (or STA 108, ECN 140, etc…) notes on regressions with
quadratic terms. Now, add a quadratic term of DaysOnMarket to the regression in
part f. For two houses with the same list price, what is the predicted difference in sale
price if one stays on the market a week longer than the other?
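Hints: With a quadratic term, the predicted difference depends on how many days the first house has already been on the market. If b1 and b2 are the coefficients on DaysOnMarket and its square, the gap between d + 7 and d days is b1*7 + b2*((d + 7)^2 - d^2). The sketch below illustrates this on simulated data; sim and the helper week_gap are hypothetical names.

```r
# Simulated data; all names and values are hypothetical.
set.seed(4)
n <- 300
sim <- data.frame(ListPrice = runif(n, 300000, 900000),
                  DaysOnMarket = rpois(n, 30))
sim$SalePrice <- 0.98 * sim$ListPrice - 400 * sim$DaysOnMarket +
  2 * sim$DaysOnMarket^2 + rnorm(n, sd = 15000)

# I() keeps ^2 as arithmetic squaring inside the formula
fit <- lm(SalePrice ~ ListPrice + DaysOnMarket + I(DaysOnMarket^2), data = sim)
b <- coef(fit)

# Predicted sale price gap between d + 7 and d days on market, same list price
week_gap <- function(d) {
  unname(b["DaysOnMarket"] * 7 + b["I(DaysOnMarket^2)"] * ((d + 7)^2 - d^2))
}
print(week_gap(c(10, 30, 60)))
```

Because the list price is held fixed, its coefficient drops out of the difference, which is why the answer must be reported for a chosen starting value of days on market.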
2. Use the RENTAL.dta dataset. This dataset comes from the Wooldridge textbook. It
includes rental prices and other variables for 64 college towns in the years 1980 and
1990.
a. Review your ECN 102 (or STA 108, ECN 140, etc…) notes on regressions with
log transformed variables. Regress log of rent (lrent) on log of pop (lpop), log of
avginc (lavginc), and pctstu using only 1990 data. Interpret the slope coefficient of
lavginc as well as pctstu. Do you think the zero conditional mean assumption is
satisfied here?
b. The variable clrent only has non-missing values in 1990. Verify that those values are
equal to the change in lrent in each city between 1980 and 1990. Recall
that changes in log transformed variables can be interpreted as approximate % changes in the
original variable. Notice that clrent is equal to .5516071 for city 1. How do you
interpret this number?
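Hints: A small sketch of the verification step, using a made-up two-city panel rather than RENTAL.dta. The column layout (city, year coded as 1980/1990, lrent, clrent) and the helper column dlrent are assumptions for illustration; check the actual coding in the dataset. The last lines compare the log change quoted above with the exact percentage change, 100*(exp(x) - 1), since the log approximation becomes loose for changes this large.

```r
# Hypothetical two-city panel; values are made up, not taken from RENTAL.dta.
panel <- data.frame(
  city = c(1, 1, 2, 2),
  year = c(1980, 1990, 1980, 1990),
  lrent = c(5.20, 5.75, 5.10, 5.50)
)
panel <- panel[order(panel$city, panel$year), ]

# Stand-in for the supplied clrent column: missing in 1980, filled in 1990
is90 <- panel$year == 1990
panel$clrent <- NA_real_
panel$clrent[is90] <- c(0.55, 0.40)

# Change in lrent within each city; NA for the first year of each city
panel$dlrent <- ave(panel$lrent, panel$city,
                    FUN = function(x) c(NA, diff(x)))

# all.equal allows for floating-point rounding in the comparison
matches <- isTRUE(all.equal(panel$clrent[is90], panel$dlrent[is90]))
print(matches)

# Log change quoted in the problem versus the exact percentage change
log_change <- 0.5516071
exact_pct_change <- 100 * (exp(log_change) - 1)
print(c(approx_pct = 100 * log_change, exact_pct = exact_pct_change))
```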
c. Finally, we regress change in lrent (clrent) on change in lpop (clpop), change in
lavginc (clavginc), and change in pctstu (cpctstu) between year 1980 and year
1990. How do you interpret the intercept here? Explain what the zero conditional
mean assumption is requiring in this regression.
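Hints: In a regression on changes, the intercept is the predicted change in lrent for a town whose explanatory variables did not change at all. The sketch below checks this on simulated first differences; the data frame fd and every coefficient used to generate it are hypothetical, and only the variable names mirror those in the problem.

```r
# Simulated first differences for 64 towns; all values are hypothetical.
set.seed(5)
n <- 64
fd <- data.frame(clpop = rnorm(n, 0.1, 0.1),
                 clavginc = rnorm(n, 0.5, 0.1),
                 cpctstu = rnorm(n, 0, 2))
fd$clrent <- 0.4 + 0.07 * fd$clpop + 0.3 * fd$clavginc +
  0.01 * fd$cpctstu + rnorm(n, sd = 0.05)

fit_fd <- lm(clrent ~ clpop + clavginc + cpctstu, data = fd)
intercept <- unname(coef(fit_fd)["(Intercept)"])

# Predicted change in lrent when none of the regressors changed
no_change <- data.frame(clpop = 0, clavginc = 0, cpctstu = 0)
pred_zero <- unname(predict(fit_fd, newdata = no_change))
print(c(intercept = intercept, pred_zero = pred_zero))
```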