J PAL Statistical Programming Exam PDF

Uploaded by

Nora Soualhi

J-PAL Statistical Programming Exam

This test will ask you to manipulate, clean, and analyze data from a randomized evaluation conducted by
Miriam Bruhn, Dean Karlan, and Antoinette Schoar on the impact of consulting services on small and
medium enterprises in Mexico.1 The data and code have been modified for the purposes of this exam –
all data quality or coding issues are intentional. The abstract of the paper can be found below. All of the
files that you need to complete this test are included in the zip folder that accompanies this test.

You may complete this test using Stata or R. Please only return your code in a text file. You have
three hours to complete this exam, and will lose points for a late submission.

Please do not share or discuss this exercise or your answers with anyone.

The Impact of Consulting Services on Small and Medium Enterprises: Evidence from a Randomized Trial
in Mexico2

Abstract:

A randomized control trial with 432 small and medium enterprises in Mexico shows positive impact of
access to 1 year of management consulting services on total factor productivity and return on assets.
Owners also had an increase in “entrepreneurial spirit” (an index that measures entrepreneurial
confidence and goal setting). Using Mexican social security data, we find a persistent large increase
(about 50 percent) in the number of employees and total wage bill even 5 years after the program. We
document large heterogeneity in the specific managerial practices that improved as a result of the
consulting, with the most prominent being marketing, financial accounting, and long-term business
planning.

1 Karlan, Dean; Bruhn, Miriam; Schoar, Antoinette, 2017, "The Impact of Consulting Services on Small
and Medium Enterprises: Evidence from a Randomized Trial in Mexico",
https://doi.org/10.7910/DVN/H74D2A, Harvard Dataverse, V2, UNF:6:Bi3tfUloxxabN7TSMbpicQ==
[fileUNF]

2 Miriam Bruhn, Dean Karlan, and Antoinette Schoar, "The Impact of Consulting Services on Small and
Medium Enterprises: Evidence from a Randomized Trial in Mexico," Journal of Political Economy 126, no.
2 (April 2018): 635-687. https://doi.org/10.1086/696154

This study source was downloaded by 100000887014526 from CourseHero.com on 06-06-2024 11:35:30 GMT -05:00

https://www.coursehero.com/file/65314407/J-PAL-Statistical-Programming-Exampdf/

There are five sections in the exam:

1. Handling Data
2. Cleaning Data
3. Visualizing Data
4. Analyzing Data
5. Interpreting Results

In sections 1-2, you will write your own code. Within your code, please write the section and question
number. In sections 3 and 4 you will troubleshoot code written by an RA who was working on the project.
In section 5, you will write code and interpret the output. Please answer the questions which ask for
written responses as comments in your code file.

Section 1: Handling Data

1. Record all of your work in a do-file/R script named Cleaning.do/Cleaning.R. Include up to 6
informative header lines in the file.
2. The maindata.xls file contains the raw data for the baseline and first endline, while the
longterm.xls file contains data from the second, long-term follow-up endline. Load these datasets
into R/Stata and combine them.
3. The social_security variable contains personally identifiable information: if it were released,
people could identify the firm’s owner. Remove this variable from the dataset, but save
it in such a way that you can merge it back to the main dataset if needed.
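
The idea behind questions 2 and 3 can be sketched on toy data. The exam itself must be answered in Stata or R; the Python below only illustrates the logic and is not a solution. The key name firm_id, the other column names, and all values are invented for illustration, and whether the real files should be merged or appended depends on their actual structure.

```python
# Toy records. firm_id and every value here are hypothetical, not exam data.
main = [
    {"firm_id": 1, "sales_2008": 120.0, "social_security": "AAA-111"},
    {"firm_id": 2, "sales_2008": 85.5, "social_security": "BBB-222"},
]
longterm = [
    {"firm_id": 1, "employees_2013": 14},
    {"firm_id": 2, "employees_2013": 6},
]

# Combine the two sources with a one-to-one merge on the assumed key.
longterm_by_id = {row["firm_id"]: row for row in longterm}
combined = [{**row, **longterm_by_id.get(row["firm_id"], {})} for row in main]

# Keep the identifying variable in a separate crosswalk that carries the key,
# so it can be merged back later if needed.
pii_key = [
    {"firm_id": r["firm_id"], "social_security": r["social_security"]}
    for r in combined
]

# Drop the identifying variable from the analysis dataset.
deidentified = [
    {k: v for k, v in r.items() if k != "social_security"} for r in combined
]
```

The crosswalk would normally be stored apart from the analysis data, with restricted access.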

Section 2: Cleaning Data

1. Use the codebook to label the sec variable and its values.
2. The codebook mentions a categorical variable indicating which quartile of 2008 costs a firm fell
into, which is missing from the data. Create this variable.
3. Several of the firms in the study did not report their 2008 profit level.
   a. How are missing values currently coded for this variable? How many values are missing
      because the firm did not know its profit level?
   b. What is the risk of coding missing values this way? (Remember to answer questions as
      comments in your R script/do-file.)
   c. Recode all missing responses to this question so that they won’t affect analysis of 2008
      profit (i.e., the mean of 2008 profit should only be calculated for firms that reported their
      profit).
4. Save the dataset with all of these changes as finaldata.dta or finaldata.Rdata. You will need this
dataset for the next three sections.
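
As a rough illustration of the two cleaning ideas in this section (quartile groups and sentinel-coded missing values), here is a small Python sketch on invented numbers. The codes -99 and -98 are assumptions made up for this example, not the codes used in the exam data, and quartile cut points differ slightly across software conventions.

```python
import statistics

# Hypothetical sentinel codes, chosen only for this illustration.
DONT_KNOW = -99
REFUSED = -98

profit = [100.0, 250.0, DONT_KNOW, 400.0, REFUSED, 50.0]

# Treating sentinel codes as real numbers pulls the mean down.
naive_mean = statistics.mean(profit)

# Recode sentinels to a true missing value, then average reported values only.
clean = [None if p in (DONT_KNOW, REFUSED) else p for p in profit]
reported = [p for p in clean if p is not None]
clean_mean = statistics.mean(reported)

# Quartile groups for a toy cost variable: three cut points define four groups.
costs = [10, 20, 30, 40, 50, 60, 70, 80]
cuts = statistics.quantiles(costs, n=4)
cost_quartile = [1 + sum(c > cut for cut in cuts) for c in costs]
```

In this toy example the naive mean is roughly half the mean of the reported values, which is the kind of distortion part (b) asks about.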

Section 3: Visualizing Data

Each of the following questions in sections 3 and 4 is about code in the AnalysisVisualization.do/
AnalysisVisualization.R file. This code was sent to you by the RA on the project, but it has errors. Please
copy this code into your R script or do-file, and fix the code there.

1. Notice how the RA imports the data into the software. Will this code work on your computer?
Change it so that it will replicate easily on various computers.
2. This code is meant to create a LaTeX file with these regression results. Is it doing so? If not, fix it
(Note: You do not need to have TeX software to verify this. LaTeX files can be opened with text
editors).
3. The RA wanted to standardize the number of times a firm had applied for a bank loan. How can
you tell this was done incorrectly?
a. Create a standardized version of the variable loan_bank_number called
std_numb_bankloan2 and graph the histogram of it to confirm it is correctly coded.
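
For reference, a standardized variable is the original minus its mean, divided by its standard deviation, so it should end up with mean 0 and standard deviation 1. The Python sketch below checks this on invented values standing in for loan_bank_number; it is an illustration of the check, not the exam's expected Stata/R code.

```python
import statistics

# Invented toy values standing in for loan_bank_number; not exam data.
loan_bank_number = [0, 1, 1, 2, 3, 5]

mean = statistics.mean(loan_bank_number)
sd = statistics.stdev(loan_bank_number)  # sample standard deviation (n - 1)

# z-score: subtract the mean, divide by the standard deviation.
z = [(x - mean) / sd for x in loan_bank_number]

# A correctly standardized variable has mean 0 and standard deviation 1.
z_mean = statistics.mean(z)
z_sd = statistics.stdev(z)
```

A histogram of the standardized values should have the same shape as the original, only recentred at zero and rescaled.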

Section 4: Analyzing Data

1. The RA wants to see the coefficient on each year dummy. Fix the code so that no dummies are
dropped.
2. The RA runs two LATE regressions to determine the effect of participating in the program on
sales. Is the coefficient on in_program the same in both models? How can you tell?
a. Explain why adding controls does or does not change the coefficient.
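
As background on what a LATE regression estimates, the simplest case (one binary instrument, no controls) reduces to the Wald ratio: the difference in mean outcomes by assignment divided by the difference in participation rates by assignment. The Python sketch below uses invented data and assumes, purely for illustration, that random assignment is the instrument for participation; the variable names are hypothetical and the exam code may be set up differently.

```python
import statistics

# Invented toy data: assigned = random assignment (assumed instrument),
# in_program = actual participation, sales = outcome.
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
in_program = [1, 1, 1, 0, 0, 0, 0, 0]
sales = [14.0, 16.0, 15.0, 11.0, 10.0, 11.0, 9.0, 10.0]


def mean_by_assignment(values, flag):
    """Mean of values among units whose assignment equals flag."""
    return statistics.mean(v for v, a in zip(values, assigned) if a == flag)


# Intention-to-treat effect: outcome difference by assignment.
itt = mean_by_assignment(sales, 1) - mean_by_assignment(sales, 0)

# First stage: difference in participation rates by assignment.
first_stage = mean_by_assignment(in_program, 1) - mean_by_assignment(in_program, 0)

# Wald estimator of the LATE.
late = itt / first_stage
```

Because take-up is incomplete in this toy example, the first stage is below 1 and the LATE is larger than the intention-to-treat effect.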

Section 5: Interpreting Results

1. In Section 4, question 2, why did the RA need to instrument for the in_program variable?
2. Use the model $outcome_{it} = \beta_0 + \beta_1 \cdot treatment_{it} + \epsilon_{it}$ to determine for which of the
following outcomes the treatment effect is significant: sales, profits, has_trademark. Explain how
you know it is significant.
3. Regress sales on gender and ethnicity and their interaction, without a constant. Interpret what
each of the coefficients means.
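
For the simple model in question 2, with a binary treatment and no other regressors, the OLS slope equals the difference in mean outcomes between treated and control units, and the intercept equals the control mean. The Python sketch below verifies this on invented numbers and computes the usual t-statistic under homoskedastic errors; it illustrates the mechanics only and uses no exam data.

```python
import math
import statistics

# Invented toy data; not exam data.
treatment = [0, 0, 0, 1, 1, 1]
outcome = [10.0, 12.0, 11.0, 15.0, 14.0, 16.0]
n = len(outcome)

t_bar = statistics.mean(treatment)
y_bar = statistics.mean(outcome)

# OLS slope and intercept from the textbook formulas.
sxx = sum((t - t_bar) ** 2 for t in treatment)
sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(treatment, outcome))
beta1 = sxy / sxx
beta0 = y_bar - beta1 * t_bar

# Classical (homoskedastic) standard error and t-statistic for the slope.
residuals = [y - (beta0 + beta1 * t) for t, y in zip(treatment, outcome)]
sigma2 = sum(r ** 2 for r in residuals) / (n - 2)
se_beta1 = math.sqrt(sigma2 / sxx)
t_stat = beta1 / se_beta1

# The slope should match the raw difference in group means.
treated_mean = statistics.mean(y for t, y in zip(treatment, outcome) if t == 1)
control_mean = statistics.mean(y for t, y in zip(treatment, outcome) if t == 0)
diff_in_means = treated_mean - control_mean
```

The t-statistic is then compared with the critical value for n - 2 degrees of freedom, or equivalently the p-value is compared with the chosen significance level.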

Once you are done with the exam, copy your R script/do-file into a text file, and only submit the
text file.
