Stata for Econometrics Workshop 2

Erika Meucci

We are going to be working interactively – at the end you will be expected to create a do file to
reproduce what you are doing. A good idea is to have a do file editor open and add to it each time
you execute a command. Don’t forget to add the explanation of each step.

Create a new folder for this workshop – call it something useful like StataWorkshop02.
Download the Do-files and the data files and save them to that location.

Launch Stata
Change the working directory to the newly created folder
Open the Do-files when asked to

1. GETTING TO KNOW STATA

1.1 STATA DATA FILES

Stata data files have the extension *.dta. These files should not be opened with any program other than Stata. If
you double-click a *.dta file, Stata will start and load it. There are several ways to open, or load,
Stata data files. We will explain a couple of them.

1.1.1 The use command

With Stata started, change your working directory to the folder where you have stored the Stata data files. In the
Command window type use lifeexp and press Enter.

If you have a data file already open, and have changed it in some way, Stata will reply with an error
message:

no; data in memory would be lost

This feature will prevent you from losing changes to a data file you may wish to save. If this happens, you
can either save the previous data file [more on this below], or enter clear in the Command window.
The clear command will clear what is in Stata’s memory. If you want to open the data file and clear
memory, enter

use lifeexp, clear
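
As a minimal sketch of this step typed entirely as commands, assuming the workshop folder is called StataWorkshop02 and sits inside the current directory (the path is hypothetical; substitute your own):

* show the current working directory
pwd
* move to the workshop folder (hypothetical path)
cd "StataWorkshop02"
* load lifeexp.dta, discarding whatever is in memory
use lifeexp, clear

The quotes around the path matter only if it contains spaces, but they never hurt.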

1.1.2 Using the toolbar

To open a Stata data file click the Open (use) icon on the toolbar.

Locate the file you wish to open, select it, and click Open.
In the Review window the implied command is shown.

In Stata, opening a data file is achieved with the use command. The path of the data file is shown in
quotes. The quotes are necessary if the path name includes spaces. The option clear indicates that any
existing data is cleared from memory. You can also work with data files that are online.

Today we are going to work with the following data set

use http://stats.idre.ucla.edu/stat/stata/notes/hsb2
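
A hedged sketch of loading this file in a way that is safe to re-run: the clear option discards any data already in memory, and the two commands that follow confirm what was loaded. It assumes an internet connection and that the UCLA address is still served.

* load hsb2 from the web, discarding data in memory
use http://stats.idre.ucla.edu/stat/stata/notes/hsb2, clear
* brief overview: number of observations and variables
describe, short
* number of observations
count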

1.2 THE VARIABLES WINDOW

In the Variables window the data file variables are listed.

Also shown are variable Labels, if they are present, along with the Type of variable and its Format.

Labels are useful and can be easily added, changed or deleted. For example, type the following line
in the Command window:

label variable female "gender"

This command creates the label, overwriting any existing label for female.

Instead of the command approach, you can use the pull-down menus as follows:

On the Stata pull-down menu select

Data > Data Utilities > Label Utilities > Label Variable.

In the resulting dialog box, you can alter the existing label by choosing Attach a label to a variable,
choosing the variable from the Variable: drop-down list and typing in the New variable label. Click OK.
In the dialog box you can also choose to Remove a label.
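
For reference, a small sketch of the same labelling tasks typed as commands, assuming the hsb2 data are in memory. Giving label variable no text removes the existing label.

* remove the label currently attached to female
label variable female
* attach a new label
label variable female "gender"
* confirm the change
describe female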

1.3 DESCRIBING DATA AND OBTAINING SUMMARY STATISTICS

As we saw last class, there are a few things you should do each time a data file is opened, or when you
begin a new problem. First, enter into the Command window

describe

This produces a summary of the variables in the data file, information about them, and their labels.

Next, in the Command window, type summarize and press Enter.

In the Results window we find the summary statistics


If you forget a Stata command, the pull-down menus virtually assure that with enough clicking you can
obtain the desired result. To illustrate, click on Statistics on the Stata menu list.

The summarize output lists only the number of observations, mean, standard deviation, minimum and
maximum of each variable. If you want information about the median and other percentiles, you should use

sum varname, d

For example, type

sum write, d

in the Command window.


We will define skewness and kurtosis in the next lecture.
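
One way to pull individual numbers out of the detailed output is through the results that summarize leaves behind in r(). The sketch below assumes the hsb2 data are in memory; r(p25), r(p50) and r(p75) are the stored 25th, 50th and 75th percentiles.

summarize write, detail
* median
display r(p50)
* interquartile range
display r(p75) - r(p25)
* list everything summarize stored
return list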

1.4 THE STATA HELP SYSTEM

The Stata help system is one of its most powerful features. Click on Help on the Stata menu. Select
Contents.

Each of the blue words is linked to further screens. You should explore these to get a feel for what is
available.

1.4.1 Using keyword search

Now click on Help > Search

In the dialog box that opens there are several search options. To search all the Stata documentation and
Frequently Asked Questions (FAQs), simply type in a phrase describing what you want to find. It does not
have to be a specific Stata command. For example, let’s search for Summary Statistics.

Up comes a list of topics that might be of interest. Once again blue terms are links. Click on Summarize.

The resulting Viewer box shows the command syntax, which can be used when typing commands in the
Command window, and many options.

1.4.2 Using command search

If you know the name of the Stata command you want help with, click Help > Stata Command

In the resulting dialog box type in the name of the command and click OK.

Alternatively, on the command line type

help summarize

and press Enter.

1.4.3 Opening a dialog box

If you know the name of the command you want, but do not recall details and options, a dialog box can be
opened from the Command window. For example, if you wish to summarize the data using the dialog
box, enter

db summarize

Or, enter

help summarize

and click on the blue link to the viewer or click on Dialog to obtain the dialog box.
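
The three routes to help described in this section can all be typed directly, as sketched here:

* keyword search across the documentation and FAQs
search summary statistics
* help file for a known command
help summarize
* dialog box for a known command
db summarize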

1.5 STATA COMMAND SYNTAX

Stata commands have a common syntax. The name of the command, such as summarize, comes first.

command [varlist] [if] [in] [weight] [, options]

The terms in brackets [ ] are various optional command components that could be used.
• [varlist] is the list of variables for which the command is used.
• [if] is a condition imposed on the command.
• [in] specifies the range of observations for the command.
• [weight] is used when some sample observations are to be weighted differently than others.
• [, options] command options go here.

For more on these options use a Keyword Search for Command syntax, then click Language.

Remark: An important fact to keep in mind when using Stata is that its commands are case sensitive. This
means that lower case and capital letters have different meanings. Since Stata considers x to be different
from X, it is easy to make programming errors.
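
A quick illustration of case sensitivity, assuming hsb2 is loaded (it contains write but no variable named Write). The capture noisily prefix, which is not covered in this workshop, shows the error message without halting a do-file.

* works: the variable is called write
summarize write
* fails: Stata looks for a different variable called Write
capture noisily summarize Write
* _rc holds the return code of the captured command; it is nonzero after an error
display _rc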

1.5.1 Syntax of summarize

Consider the following examples using the syntax features. In each case type the command into the
Command window and press Enter.
For example,

summarize write, detail


computes detailed summary statistics for the variable write. The percentiles of write from smallest to
largest are shown, along with additional summary statistics (e.g., skewness and kurtosis) that we will
define in the next lecture. Note that Stata echoes the command you have issued with a preceding period (.).

summarize write if female==1

computes the simple summary statistics for the females in the sample. The variable female is 1 for
females and 0 for males. In the “if statement” [called an “if qualifier” by Stata] equality is indicated by
“==”.

summarize if write >= 40

computes simple summary statistics for those in the sample whose writing score (write) is greater than or
equal to 40.

summarize in 1/50

computes summary statistics for observations 1 through 50.

summarize write in 1/50, detail

computes detailed summary statistics for the variable write in the first 50 observations.
You may notice —more— at the bottom left of the Results window. When the Results window is full, output pauses
and you must click —more—, or press the space bar, for more results to appear.
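
The qualifiers can be combined in one command. As a sketch, again assuming hsb2 is in memory (read is the reading score in that file, and & means "and"):

* females with a reading score of at least 50
summarize write if female == 1 & read >= 50
* same condition, restricted to the first 100 observations, with detail
summarize write if female == 1 & read >= 50 in 1/100, detail
* switch off the pause at each full screen for the rest of the session
set more off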

1.5.2 Learning syntax using the review window

At this point you are wondering “How am I supposed to know all this?” Luckily you do not have to know
it all now, and learning comes with repeated use of Stata. One great tool is the combination of pull-down
menus and the Review window. Suppose we want detailed summary statistics for female write scores in
the first 100 observations. While you may be able to guess from previous examples how to do this, let’s
use the point and click approach. Select Statistics > Summaries, tables, and tests > Summary and
descriptive statistics and then Summary statistics from the pull-down menu.

In the resulting dialog box we will specify which variables we want to include, and select the option to
display additional statistics. Then click on the by/if/in tab at the top.

In the new dialog box you can enter the if condition in a box. Click the box next to Use a range of
observations. Use the selection boxes to choose observations 1 to 100. Then click OK.

Stata echoes the command, and produces detailed summary statistics for the women in the first 100
observations.

In the Review window is the list of commands we have typed. You will also find the list of commands
generated using the dialogs. After experimenting for just a few minutes you will learn the syntax for the
command summarize. Suppose you want to change the last command to include observations 1 to 150.
You can type the command

summarize write if female == 1 in 1/150, detail

into the Command window, but Stata offers us a much easier option. In the Review window, simply click
on the command. Instantly, this command appears in the Command window.

Simply edit this command, changing 100 to 150, then press Enter.

To edit a previously used command, click on that command in the Review window. The past command
will appear in the Command window, where it can be edited and executed. Not only do you obtain new
results, but the modified command now appears as the last item in the Review window.

1.6 SAVING YOUR WORK

When you carry out a long Stata session you will want to save your work.

1.6.1 Copying and pasting

One option is to highlight the output in the Results window, then right-click.

This gives you options to copy (Ctrl+C) the output as text, and then paste it into a document using the
shortcut (Ctrl+V) or by clicking the paste icon.
If you paste into a word processing document you may find that the nicely arranged Stata results become
a ragged, hard-to-read mess.

This is due to the word processor changing the font. While you may be using Times New Roman font for
standard text, use Courier New for Stata output. You may have to reduce the font size to 8 or 9 to make it
fit.

1.6.2 Using a log file

As we saw in the last session, Stata offers a better alternative. In addition to having results in the Results
window in Stata, it is a very good idea to have all results written (echoed) to an output file, which Stata
calls a log file. You can begin a log file by entering a command as we saw last session or by clicking on
the Log Begin/Close/Suspend/Resume icon on the Stata toolbar.

In the resulting dialog box, the log file can be named and the type of log file selected. The default is a
formatted log file with the extension *.smcl, which stands for Stata Markup and Control Language. It is better
to save it as a plain-text log (*.log) by choosing the Stata Log option. Give the file a meaningful name and
recall that it will be located in the working directory we set earlier. Click Save.

This dialog box can also be reached via the Stata toolbar by clicking File > Log > Begin.

The command log using, with the file name in quotes, appears in the Results window.

You can Begin/Close/Suspend/Resume a log by choosing the icon used to open the log file.
In the resulting dialog box select Close log file and press OK.

1.6.3 Viewing a log file

To View the log file, click on File > Log > View. In the dialog box enter (or browse for) the file
name and click OK. The log file file1.log opens in what is called the Stata Viewer. Now, you can print the
entire log file by clicking the printer icon.

1.6.4 Translating a log file to a text file

Advantages of the formatted (*.smcl) log file include the ability to view the formatted output, and to easily print it. A
disadvantage of these files is that they cannot be easily viewed without having Stata open. They are like
*.html files in that while they are text files, they also include lots and lots of formatting commands.
You can translate the Stata log files into simple text files. On the Stata toolbar select

File > Log > Translate.

Fill in the dialog box and click Translate.
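
The same translation can be done from the Command window with the translate command. In this sketch file1.smcl is a hypothetical formatted log in the working directory, and replace overwrites file1.log if it already exists.

* convert a formatted log to plain text
translate file1.smcl file1.log, replace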


1.6.5 Using Stata commands for log files

To open a log file using the Command window, enter

log using file1.log

This will open file1.log in the current directory. Variations of this command are:

log using file1.log, replace

will open the log file and replace one by the same name if it exists.

log using file1.log, append

will open an existing log file and add new results at the end.

The command

log close

closes a log file that is open.
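
Putting these together, a common pattern at the top and bottom of a do-file looks roughly like the sketch below. The text option, which requests a plain-text log, and the capture prefix are additions not discussed above.

* close any log left open by an earlier run; ignore the error if none is open
capture log close
* start a plain-text log, overwriting an older one of the same name
log using file1.log, replace text
* ... commands whose output should be recorded go here ...
summarize
* stop logging
log close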

1.7 USING THE DATA BROWSER

It is a good idea to examine the data to see the magnitudes of the variables and how they appear in the
data file. The Stata toolbar contains a number of icons.

Sliding the mouse pointer over each icon reveals its use. Click on Data Browser.

The data browser is a spreadsheet view. Use the slide bar at the bottom and the one on the right to view
the entire data array. The browser allows you to scroll through the data, but not to edit any of the entries.
This is a good feature that ensures we do not accidentally change a data value.
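
The browser can also be opened from the Command window, and list prints values straight to the Results window. A sketch assuming hsb2 is loaded:

* open the Data Browser for selected variables and observations
browse write read math if female == 1
* print the first five observations in the Results window
list write read math in 1/5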

1.8 WRITE YOUR DO-FILE

Do-files are very convenient after having pointed and clicked enough so that the commands you want to
execute appear in the Review window. If you have been following along on the computer with the
examples we have been doing, then your Review window is a clutter of commands right now. Let’s move
those commands to a new Do-file called WS02.do (or whatever you want), or to your existing Do-file if
you have not already done so. The extension *.do is recognized by Stata and should be used. Right-click in the
Review window, and on the pull-down menu click Select All. After all commands are selected right-click
again and choose Send to Do-file Editor.

The Do-file Editor is opened. To save this file click on File > Save as and enter the file name WS02.do.
The Stata Do-file editor is a simple text editor that allows you to edit the command list to include only
those commands to keep. In the file below we have eliminated some commands, done some rearranging,
and added some new commands. It also presumes that the log file is new, that you have saved and cleared
any previous work, and that the working directory has been specified.
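
As an illustration only (this is not the workshop's own file), a do-file for this session might look roughly like the sketch below. The log name WS02.log and the choice of commands are assumptions.

* WS02.do: Stata for Econometrics Workshop 2
* close any open log, then start a fresh plain-text log
capture log close
log using WS02.log, replace text
* load the data
use http://stats.idre.ucla.edu/stat/stata/notes/hsb2, clear
* describe and summarize
describe
summarize
summarize write, detail
summarize write if female == 1 in 1/150, detail
* stop logging
log close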

To execute this series of commands click the Do icon on the Do-file Editor toolbar.

The results appear in the Results window and will be written to the specified log file.
The Do-file editor has some useful features. Several Do-files can be open at once, and the Do-file editor
can be used to open and edit any text file. By highlighting several commands in the Do-file and selecting
Do Selected Lines parts of the Do-file can be executed one after the other. Of course the data file must be
open prior to attempting to execute the selected lines.

2 CREATING AND MANAGING VARIABLES

Stata offers a wide variety of functions that can be used to create new variables, and commands that let
you alter the variables you have created. In this section we examine some of these capabilities.

2.1 GENERATING NEW VARIABLES

To create a new variable use the generate command. Let’s start with the pull-down menu. Click on Data
> Create or change data > Create new variable on the Stata menu. A dialog box will
open.

Alternatively, in the Command window, enter db generate to open the dialog box. In the dialog box
you must fill in New variable name: choose something logical, informative and not too long.
Contents of new variable: this is a formula (no equal sign required) that is a mathematical
expression. write2 is a new variable that will be the square of write. The operator “^” is the symbol Stata
uses for “raise to a power”, so write^2 is the square of write, write^3 would be write cubed, and so on.
Click OK. In the Results window (and Review window) we see that the command implied by the menu
process is
generate float write2 = write^2
In this command float is automatically added by the menu-driven process and is a description of the type
of variable being created. It stands for floating point. Type help data types if you are curious. The type is
optional, so we can simply enter
generate write2 = write^2
The command can also be shortened to
gen write2 = write^2
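
A short sketch for checking the new variable, assuming hsb2 is in memory. The capture drop line removes write2 if it already exists so that the lines can be re-run without an error.

* remove an earlier copy of write2, if any
capture drop write2
generate write2 = write^2
* compare the first five values of write and its square
list write write2 in 1/5
summarize write write2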

2.2 USING THE EXPRESSION BUILDER

Suppose in the process of creating a new variable you forget the exact name of the function. This happens
all the time. To illustrate let us create a new variable lwrite which will be the natural logarithm of write.
Go through the steps in Section 2.1 until you reach the generate dialog box. Type in the name of the new
variable, and then click Create, opening Expression builder.

In the Expression builder dialog box you can locate a function by choosing a category, scrolling down the
function list while keeping an eye on the definitions at the bottom until you locate the function you need.

Double-click on the function log(), and it will appear in the Expression builder window.
Now fill in the name of the variable write in place of “x” and click OK.

In the generate dialog box you will now find the correct expression for the natural logarithm of write in
the Contents of new variable space. Click OK.

The command will be executed, creating the new variable lwrite which shows up in the Variables window.
Stata echoes the command to the Results window
generate float lwrite = log(write)
and to the Review window. The simple direct command is
gen lwrite = log(write)

2.3 DROPPING OR RENAMING A VARIABLE

To drop or rename a variable in the variable list, click on Data > Create or change data >
Keep or drop variables.

The command choice is Keep or Drop.

• Drop deletes the selected variables from the data file.


• Keep deletes all variables from the data file except the ones selected.

To Rename a variable, click Data > Data utilities > Rename groups of variables.

Suppose we want to rename math as mathematics. Then fill in the dialog box.

The drop and rename commands are simple to enter directly, and are

drop write2
rename math mathematics
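
Because drop and keep cannot be undone, it can help to experiment inside preserve and restore, which take a snapshot of the data and bring it back. This is a sketch assuming the original hsb2 variable names are in memory; preserve and restore are not covered in this workshop.

* take a snapshot of the data in memory
preserve
* keep only these four variables
keep id read write math
rename math mathematics
describe
* return to the data as they were before preserve
restore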

2.4 USING ARITHMETIC OPERATORS

The arithmetic operators are:


+ addition
- subtraction (or create negative of value, or negation)
* multiplication
/ division
^ raise to a power
To illustrate these operators consider the following generate statements:
generate write1 = write+1 (addition)
generate negwrite = -write (negative or negation)
generate sciencemath = science*math (multiplication)
generate sciencemath_read = sciencemath*read (multiplication with created variable)
generate write_read = write/read (division)
generate sciencemath_write = (science*math)*write (multiplication)
The last line shows the use of parentheses. As in regular algebra, parentheses control the order of
operations, with expressions in parentheses being performed first.
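
The effect of parentheses is easy to check with display, which evaluates an expression and prints the result. In this small sketch the first line prints 14 and the second prints 20.

* multiplication is performed before addition
display 2 + 3*4
* parentheses force the addition first
display (2 + 3)*4
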
Several of these constructions were for demonstration purposes only. We’ll drop them using
drop write1 negwrite sciencemath sciencemath_read write_read sciencemath_write

Stata shortcut: With a list of variables to type it is easier to type the command name, here drop, and then
click on the names of the variables in the Variables window. When selected they appear in the Command
window.

2.5 USING STATA MATH FUNCTIONS

Stata has a long list of mathematical and statistical functions that are easy to use. Type help functions in
the Command window. We will be using math functions and density functions extensively.

Click on math functions. Scrolling down the list you will see many functions that are new to you. A few
examples of the ones we will be using are:

generate lwrite = log(write) (natural logarithm)


generate elwrite = exp(lwrite) (exponential function is antilog of natural log)
generate rootwrite = sqrt(write) (square root)

Note that the exponential function is exp. Use the Stata browser to compare the values of write and
elwrite. These are identical, up to rounding, because the exponential function is the antilog of the natural logarithm. The
variable lwrite is the logarithm of write, and elwrite is the antilog of lwrite. The function log(write) is the
natural logarithm and so is ln(write). The notation ln(x) is often used to denote the natural
logarithm. If you already created lwrite in Section 2.2, Stata will refuse to generate it again; skip that line or drop lwrite first.
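
A sketch that checks both claims numerically, assuming hsb2 is loaded. The variable names ending in _chk are hypothetical, and double storage is requested so that rounding to float does not blur the comparison. The assert command stays silent when the condition holds for every observation.

generate double ln_chk = ln(write)
generate double log_chk = log(write)
* ln() and log() are the same function
assert ln_chk == log_chk
* exp() undoes log() up to rounding error
generate double exp_chk = exp(log_chk)
assert reldif(exp_chk, write) < 1e-12
* remove the temporary check variables
drop ln_chk log_chk exp_chk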

2.6 SAVING THE STATA DATA FILE AND ENDING THE SESSION

At this point it would be a good idea to save the Stata data file, since it has been changed by adding
several variables. Click File > Save as

So that you do not write over the original data, save the data file under a new name.
The Stata command is

save WS02.dta (or whatever name you like to give it)
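
In a do-file that will be run more than once, save stops with an error if WS02.dta already exists. A sketch of the same command with the replace option added, which overwrites the earlier copy:

save WS02.dta, replace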

As the final step you will Close the log file.

log close

QUESTIONS:

1) Using the file http://stats.idre.ucla.edu/stat/stata/notes/hsb2, find
information about the variables “race”, “math”, “read” and “write”.
2) How many observations are you considering? What is the range? What are the
mean and the standard deviation of each variable? What is the median of “math”,
“read” and “write”? What is the Inter Quartile Range of each variable?
3) Generate the variable: math+100. Drop the variable.

HOMEWORK 2:

Execute the following operations:


1) Open a new log file (remember to change the working directory if needed).
2) Load the data file census.dta that is shipped with Stata.
3) Summarize all the variables.
4) Summarize the population variable in detail.
5) Create a new variable called lpop equal to the natural logarithm of the population.
6) Create a new variable poplt5pc as the ratio between the poplt5 (population less than
5 y.o.) and pop.
7) Create a new variable marriagepc that is the number of marriages divided by the
state’s population.
8) Compile a do-file with all the aforementioned operations.
9) Make sure the do-file runs correctly.
10) Copy and paste the commands and the answers into a Word document. Export the
document as a PDF file, name it your_last_nameSTATAHW2.pdf, and send me the PDF
file via e-mail.