software material

Uploaded by

amanueco21


1. Introduction to Stata

What is Stata?
• Stata is a multi-purpose statistical package for exploring, summarizing, and analyzing datasets.
• It can handle and manipulate large datasets (e.g. millions of observations), and it has ever-growing capabilities for panel and time-series regression analysis.
• A dataset is a collection of several pieces of information called variables, usually arranged in columns.
• A variable can hold one or several values, one for each of the cases (observations) in the dataset.

Stata Interface
I. Stata Windows

Stata has four main windows:

Command window: used to submit commands to Stata. It supports basic text editing, copying and pasting, and a command history.

Results window: contains all the commands and their textual results.

Review window: shows the history of commands that have been entered. It displays successful commands in black and unsuccessful commands, along with their error codes, in red.

Variables window: shows the list of variables in the dataset, along with selected properties of the variables.

II. The Stata Toolbar
The toolbar contains buttons that provide quick access to Stata's more commonly used features.

The toolbar buttons and their functions:

Open: opens a Stata dataset. Click on the button to open a dataset with the Open dialog.
Save: saves the Stata dataset currently in memory to disk.
Print: displays a list of windows. Select a window name to print its contents.
Log: begins a new log or closes, suspends, or resumes the current log.
Viewer: opens the Viewer or brings a Viewer to the front of all other windows.
Graph: brings a Graph window to the front of all other windows.
Do-file Editor: opens the Do-file Editor or brings a Do-file Editor to the front of all other windows.
Data Editor (Edit): opens the Data Editor or brings the Data Editor to the front of the other Stata windows.
Data Editor (Browse): opens the Data Editor in browse mode.
Variables Manager: opens the Variables Manager.
Clear -more- Condition: tells Stata to continue when it has paused in the middle of long output.
Break: stops the current task in Stata.

III. Stata Menus and Dialogs

• Stata's Data, Graphics, and Statistics menus provide point-and-click access to almost every command in Stata.
• The dialogs for many commands have by/if/in and Weights tabs.
• These tabs provide access to Stata's commands and qualifiers for controlling the estimation sample and dealing with weighted data.

Getting Started

• If you are using Stata version 11 or earlier and want to read in a big dataset, you must first tell Stata to make enough computer memory available for your data. Stata 12 and later manage memory automatically, so this step is not needed there.
• If Stata 11 or earlier reports that there is not enough memory (for example, "no room to add more observations…"), you need to set the memory higher manually.
• First remove any data in memory by typing
  clear (or drop _all)
• Then set the memory to a large enough amount by typing
  set mem 700m (or something higher)

How to Read Data into Stata?

To load a file in Excel format into Stata, follow one of these two procedures:

1. Click "File" on the menu bar. In the File drop-down menu, click "Import" and then choose "Excel spreadsheet"; or

2. Open the Data Editor by typing "edit" or by clicking the Data Editor (Edit) button on the toolbar. Then copy the data from Excel, right-click in any cell of the Data Editor, and choose Paste.
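The menu route in procedure 1 has a command equivalent, import excel, in Stata 12 and later. The sketch below is only an illustration: the workbook name auto_demo.xlsx is hypothetical, and it is first created from Stata's bundled auto dataset so that the example does not depend on an existing file.

```stata
* Create a small Excel workbook to import (auto_demo.xlsx is a hypothetical name)
sysuse auto, clear
export excel using "auto_demo.xlsx", firstrow(variables) replace

* Read the workbook back in; firstrow treats the first row as variable names
import excel using "auto_demo.xlsx", firstrow clear
describe
```

With your own workbook, only the import excel line is needed, with the file name replaced by the location of your file.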

Saving Data in Stata

• If the dataset is new or has just been imported from another format, go to File > Save As, or
• just type: save filename
• To save a dataset that is already in use (overwriting the original data file),
  1. select File > Save; or
  2. click on the Save button; or
  3. type save, replace in the Command window.
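A minimal sketch of the two ways of saving, assuming the bundled auto dataset and a hypothetical file name mydata (neither comes from the notes above):

```stata
* Load example data and save it under a new name in the current directory
sysuse auto, clear
save mydata, replace

* Change the data, then overwrite the file that is currently in use
generate price_k = price / 1000
save, replace
```

The replace option on the first save is there only so that the example can be rerun; a plain save mydata stops with an error if mydata.dta already exists.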

Log file

• A log file is simply a record of your Results window. It records all commands and all textual output as it happens.
• Thus, it keeps your lab notebook for you as you work.
• Because it writes the file to disk while it writes the Results window, it also protects you from disastrous failures, be they power failures or computer crashes.

How to create it?

File > Log > Begin
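A log can also be started and closed from the Command window with the log command. This is a sketch only; the file name mysession.log is hypothetical, and the text option is used so that the log can be opened in any editor.

```stata
* Start a plain-text log, replacing any earlier file of the same name
log using mysession.log, text replace

* Everything typed from here on, and its output, is recorded
sysuse auto, clear
summarize price

* Stop recording and close the file
log close
```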

Do-file
• A do-file is a file containing a list of commands for Stata to run (called a batch file or a script in other settings). The Do-file Editor gets its name from the term do-file.
• The Do-file Editor has advanced features that can help in writing such files; it can also be used to build up a series of commands that can then be submitted to Stata all at once.
• The Do-file Editor can be opened either by clicking the Do-file Editor toolbar button or by typing doedit in the Command window.
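As an illustration of what a do-file might contain, here is a short sketch that uses the bundled auto dataset. The file name analysis.do is hypothetical.

```stata
* analysis.do (hypothetical name): a short list of commands run in order
clear all
sysuse auto, clear
summarize price mpg
regress price mpg
```

If these lines are saved as analysis.do in the current directory, typing do analysis in the Command window runs them all at once.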

2. DATA MANAGEMENT

Loading Data into Stata

Things to know about entering data in Stata

• A period (".") represents a missing numeric value.
• Press Tab or Return to input a missing numeric value.
• Press Tab or Return to input a missing value for a string variable.
• Stata will not allow empty columns or rows in the middle of your dataset.

Easy steps to load your data in Stata

Say you have an Excel file named datamgmt.

• Open the file in Excel.
• Open the Data Editor by typing "edit" or by clicking the Data Editor (Edit) button on the toolbar.
• Copy the data from Excel, right-click in any cell of the Data Editor, and choose Paste.
• To save the file, type: save filename
• Example: save datamgmt

Naming variables
• Variable names can have up to 32 characters, but many commands print only 12, and shorter names are easier to type.
• Stata names are case sensitive: Age and age are different variables!
• It pays to develop a convention for naming variables and to stick to it.
• It helps to use short lowercase names and single words or abbreviations rather than multi-word names.
• For example, use effort or fpe to represent a variable called family_planning_effort or familyPlanningEffort, although all four names are legal.
• Note the use of underscores or capital letters to separate words in the longer names.

Renaming variables

• Variables can be renamed using the following Stata syntax:
  rename oldname newname
• For example, rename female sex changes the name of the variable female to sex.

Labeling variables
• Variables can be labeled using the following Stata syntax:
  label variable var1 "description"
  where var1 is the variable to be labeled and description is the label of var1.
• The levels of a categorical variable can be labeled using the following two commands together:
  label define lblname 1 "name of the first category" 2 "name of the second category"
  label values var1 lblname
  where var1 is the name of the categorical variable, lblname is the name given to the set of value labels (it may be the same as var1), and 1 and 2 are the levels of the categorical variable.
• Example: a variable called gender has two categories, 1 for male and 2 for female. The categories of gender can be labeled as follows:
  label define gender 1 "male" 2 "female"
  label values gender gender
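The gender example can be tried on a tiny dataset typed in directly. The three observations and the id variable below are made up for illustration.

```stata
* Enter a small made-up dataset
clear
input id gender
1 1
2 2
3 1
end

* Label the variable itself, then define and attach labels for its values
label variable gender "Gender of respondent"
label define gender 1 "male" 2 "female"
label values gender gender

* The frequency table now shows male and female instead of 1 and 2
tabulate gender
```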

Generating (creating) new variables from existing variable(s)

• The most common command for creating new variables is generate (abbreviated gen).
• Syntax: generate newvar = expression
  where newvar is the name of the new variable.
• Example: generate a variable called income that is the sum of farm income (fincome) and nonfarm income (nfincome):
  generate income = fincome + nfincome
• To generate the natural logarithm of x: gen newvar = ln(x)
• To generate the square root of x: gen newvar = sqrt(x)
• To generate the natural exponential of x: gen newvar = exp(x)
• Note that generate uses a single equals sign (=); the double equals sign (==) is reserved for testing equality.
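The generate examples above can be run on a small made-up dataset. The income figures and the names lnincome, sqrtincome, and expshare are illustrative only.

```stata
* Enter made-up farm and nonfarm income for three households
clear
input fincome nfincome
100 50
200 0
400 100
end

* New variables built from existing ones (single = for assignment)
generate income = fincome + nfincome
generate lnincome = ln(income)
generate sqrtincome = sqrt(income)
generate expshare = exp(nfincome / income)
list
```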

Keeping and dropping variables
• Your dataset may contain variables you are not interested in or do not want to analyze.
• It's a good idea to get rid of these first; that way, they won't use up valuable memory and they won't inadvertently sneak into your analysis.
• You can tell Stata either to keep what you want or to drop what you don't want; the end result will be the same.
The syntax is:
  keep varlist (variables to remain)
  drop varlist (variables to remove)
  keep if var >= 0 (keep only observations that satisfy the condition)
  drop if var < 0 (drop observations that satisfy the condition)
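A short sketch of all four forms, assuming the bundled auto dataset. The cutoff of 5000 is arbitrary.

```stata
sysuse auto, clear

* Keep four variables, then drop one of them
keep make price mpg foreign
drop foreign

* Keep only the observations that satisfy a condition
keep if price >= 5000

* Report how many observations remain
count
```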

Examining the Data

• It is important to examine your data when you first read it into Stata: check that all the variables and observations are present and in the correct format.
• The browse and edit commands open a pop-up window in which you can examine the raw data.
• To examine the data within the Results window, use the list command.
• Note: listing the entire dataset is only feasible if it is small.
• If the dataset is large, you can use qualifiers and options to make the output of list more tractable.
• The list command displays the values of the variables in varlist, or of all variables if no varlist is given.
• Syntax: list varlist, options
  where varlist is the list of variables to be listed, and options is any one or a combination of the options associated with the list command.
• You can also add the if or in qualifiers, which go before the comma. For example:
  1. list x if x > 65 lists x only for observations where x exceeds 65.
  2. list x if gender == 1 lists x only for observations where gender equals 1 (note the double equals sign).
  3. list x in 1/5 lists only the first 5 observations.
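The list variants can be tried on the bundled auto dataset. The variables and cutoffs below are chosen only for illustration.

```stata
sysuse auto, clear

* First five observations of two variables
list make price in 1/5

* Only observations that satisfy a condition
list make mpg if mpg > 30

* Equality uses ==; noobs suppresses the observation numbers
list make price if foreign == 1, noobs
```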

Assert

• With large datasets, it is often impossible to check every single observation using list or browse.
• Stata has a number of additional commands for examining data, described in the following.
• A first useful command is assert, which verifies whether a certain statement is true or false.
• Syntax: assert expression
• For example, you might want to check whether all values in the math variable are nonnegative, as they should be:
  assert math >= 0 or, equivalently, assert !(math < 0)
• If the statement is true, assert does not yield any output on the screen.
• If it is false, assert gives an error message and the number of contradictions.
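A sketch of both outcomes, using price and mpg from the bundled auto dataset in place of the math variable. The capture prefix is used here only so that the failing check does not stop the example.

```stata
sysuse auto, clear

* True for every observation, so nothing is printed
assert price >= 0

* False for some observations; capture stores the error code in _rc
capture assert mpg > 20
display _rc
```

A nonzero _rc after capture means the assertion failed for at least one observation.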

Describe
The describe command produces a summary of the dataset in memory or of the data stored in a Stata-format dataset.
• Describe data in memory:
  describe
  describe varlist, memory_options
• Describe data in a file:
  describe varlist using "location and name of the file", file_options

Summarize
The summarize command provides summary statistics, such as means, standard deviations, and so on.
• Syntax: summarize varlist, or
• summarize varlist, detail (adds percentiles, skewness, and kurtosis)

Tabulate
The tabulate command is a versatile command that can be used, for example, to produce a frequency table of one variable or a cross-tab of two variables.
• Syntax (one-way frequency table): tabulate varname, options
• Syntax (two-way cross-tab): tabulate varname1 varname2, options

Inspect
• The inspect command is a way to eyeball the distribution of a variable; its output includes a mini-histogram.
• Syntax: inspect varlist
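The describe, summarize, tabulate, and inspect commands can be compared side by side on the bundled auto dataset. The choice of variables is illustrative only.

```stata
sysuse auto, clear

* Overview of the variables, their storage types, and labels
describe

* Basic and detailed summary statistics
summarize price mpg
summarize price, detail

* One-way frequency table and two-way cross-tab
tabulate foreign
tabulate rep78 foreign

* Quick look at the distribution, with a mini-histogram
inspect mpg
```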

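Correlations
• Correlation measures the association between variables.
• The correlate command displays the correlation matrix or covariance matrix for a group of variables.
• Syntax: correlate varlist (abbreviated corr)

How to get correlations and check whether they are significant (pairwise correlation)?

• Syntax: pwcorr varlist, sig star(.05)
  The sig option prints the significance level (p-value) under each correlation, and star(.05) places a star next to correlations significant at the 5% level; use star(.01) or star(.10) for the 1% or 10% level.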
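A short sketch of both commands on the bundled auto dataset. The three variables are picked only for illustration.

```stata
sysuse auto, clear

* Correlation matrix, then the covariance matrix
correlate price mpg weight
correlate price mpg weight, covariance

* Pairwise correlations with p-values and stars at the 5% level
pwcorr price mpg weight, sig star(.05)
```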

3. Application to Cross-sectional Analysis

Hypothesis Testing
ttest varname == #: tests the hypothesis that the mean of a variable is equal to some number, which you type in place of #.
ttest varname1 == varname2: tests the hypothesis that the mean of one variable equals the mean of another variable (a paired test on the same observations).
ttest varname, by(groupvar): tests the hypothesis that the mean of a single variable is the same in two groups. The groupvar must take exactly two distinct values, one for each group. For example, groupvar might be gender, to see whether the mean of a variable is the same for males and females.
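A sketch of the one-sample and two-group tests on the bundled auto dataset, where foreign (two values) plays the role of groupvar. The hypothesized mean of 20 is arbitrary.

```stata
sysuse auto, clear

* One-sample test: is the mean of mpg equal to 20?
ttest mpg == 20

* Two-group test: is the mean of mpg the same for domestic and foreign cars?
ttest mpg, by(foreign)
```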

Confidence Intervals
ci varname: confidence interval for the mean of varname (based on the t distribution). In Stata 14.1 and later, the syntax is ci means varname.
ci varname, level(#): confidence interval at #%. For example, use 99 for a 99% confidence interval.
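A minimal sketch on the bundled auto dataset, written in the newer ci means form; this assumes Stata 14.1 or later, and in earlier versions the word means is left out.

```stata
sysuse auto, clear

* Default 95% confidence interval for the mean of mpg
ci means mpg

* 99% confidence interval
ci means mpg, level(99)
```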

How to generate dummy variables

Dummy variables can be generated by combining the tabulate (tab) command with its generate() (gen) option. Say the variable Race is a categorical variable with 4 categories; a dummy variable for each category can be generated with the following syntax:
tab Race, gen(Race_dummy)
This creates Race_dummy1 through Race_dummy4, one for each category.
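The same idea can be tried on the bundled auto dataset, where rep78 (five categories) stands in for Race. The stub name rep_dummy is illustrative.

```stata
sysuse auto, clear

* One dummy per category of rep78: rep_dummy1 through rep_dummy5
tabulate rep78, generate(rep_dummy)
describe rep_dummy*
```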

OLS Regression
regress yvar xvarlist: regresses the dependent variable yvar on the independent variables xvarlist. For example: regress y x or regress y x1 x2 x3.
regress yvar xvarlist, robust: runs the same regression but computes robust standard errors.
regress yvar xvarlist, robust level(#): runs the regression with robust standard errors and changes the confidence level to #% (e.g. use 99 for a 99% confidence interval).

OLS Regression with dummy variable(s)

regress yvar xvarlist i.Race: regresses the dependent variable (yvar) on the continuous independent variables (xvarlist) and the categorical independent variable (Race). The i. prefix tells Stata to include a dummy for each category of Race except the base category. For example: regress y x i.Race, or regress y x1 x2 x3 i.Race.
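A sketch of the regression variants on the bundled auto dataset, with price as yvar, mpg and weight as xvarlist, and foreign standing in for the categorical variable Race. These choices are for illustration only.

```stata
sysuse auto, clear

* Plain OLS
regress price mpg weight

* Robust standard errors and 99% confidence intervals
regress price mpg weight, robust level(99)

* Adding a categorical regressor through the i. prefix
regress price mpg weight i.foreign
```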

Post-Estimation Commands
The commands described here work after OLS regression.
predict yhat: after a regression, creates a new variable, with the name you enter here (yhat in this case), that contains the predicted value of the dependent variable for each observation.
predict newvar, residuals: after a regression, creates a new variable, with the name you enter in place of newvar, that contains the residual for each observation.

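Post-Estimation Tests
1. Heteroskedasticity test
Syntax: estat hettest (hettest in older versions of Stata)
2. Functional form (specification error) test
Syntax: estat ovtest (ovtest in older versions)
3. Multicollinearity test
Syntax: estat vif (vif in older versions)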
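The post-estimation commands and tests can be run one after another following a single regression. The sketch assumes the bundled auto dataset; the variable name uhat is illustrative.

```stata
sysuse auto, clear
regress price mpg weight

* Predicted values and residuals as new variables
predict yhat
predict uhat, residuals

* Heteroskedasticity, functional form, and multicollinearity checks
estat hettest
estat ovtest
estat vif
```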

Logistic (Logit) Regression

logit yvar xvarlist: fits a logit model of a binary dependent variable (yvar) on the independent variables (xvarlist). For example: logit y x or logit y x1 x2 x3.
logit yvar xvarlist, or: fits the same model but reports odds ratios (or) instead of coefficients. For example: logit y x1 x2 x3, or
logit yvar xvarlist, robust: fits the same model but computes robust standard errors.

Logistic Regression with dummy variable(s)

logit yvar xvarlist i.Race: fits a logit model of a binary dependent variable (yvar) on the continuous independent variables (xvarlist) and the categorical independent variable (Race).
For example: logit y x1 x2 x3 i.Race
If you are interested in the odds ratios:
logit y x1 x2 x3 i.Race, or
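A sketch of the logit variants on the bundled auto dataset. The binary outcome expensive is constructed here purely for illustration (price above an arbitrary 6000), and foreign stands in for the categorical variable Race.

```stata
sysuse auto, clear

* Binary outcome: is the car foreign?
logit foreign mpg weight
logit foreign mpg weight, or
logit foreign mpg weight, robust

* Made-up binary outcome with a categorical regressor, reporting odds ratios
generate byte expensive = price > 6000
logit expensive mpg i.foreign, or
```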
