SAS - Introduction to working under change management

Introduction To
SAS

Good Data
Management Practices

Four Statistical Packages
• SPSS
• Stata
• R
• SAS

Three Ways to Work
• Point and Click
• Command Line
• Programs (the best way)

Outline
• Sermon on SYNTAX
• Cleaning data and creating variables
• Never overwrite original data
• Practices that will help you keep track of your work
• Safeguarding your work

A Sermon on SYNTAX
• Command line and Point and Click
– Advantages:
• Quick, may require less learning
– Disadvantages:
• Takes longer the second time – you must wade through the
point and click menu rather than just change a word
• You do not have a record of what you have done

SPSS
The King of Point and Click

You can point and click to get files, create variables, change variable
values, and do analysis, and end up without a record of what you
have done. You will be sorry.

Or, you can use Point and Click as an aid as you write programs.
You can copy syntax created by Point and Click into your program.
In SPSS, programs are written in a Syntax Window and have the
extension .sps when you save them.

You can modify SPSS defaults so that commands will be reflected in the
log. This allows you to copy commands from your log into your
program file. These changes also make debugging easier.

You will find information about how to
modify SPSS at the following URL.

With Stata you can point and click, issue commands on the command line, or
create .do files. “.do” files store your programs.

With R you can point and click, issue
commands on the command line, or
create .R files. “.R” files store your
programs.
Commands generated by point and click are
echoed, so you can copy them into your program.

SAS allows some point and
click, but immediately offers
an editor where you can write
your programs. SAS
programs end with the .sas
extension, and are text files.
SAS features an enhanced
editor with cool color coding
that makes it easier to write
and debug programs.

Never clean data in the data view

Scenario 1:
You get a data set and find errors in it.
You change the values in the data window.
You save it with point and click, over-writing your original data.
Later you try to recall what changes you made, when and why. Of
course you can’t. You can’t even be sure that you made the
“corrections” for the proper cases.
You can’t look back at older data sets to confirm what you did. You
sit there sweating.

Scenario 2 (starts the same as Scenario 1):
You save it with point and click, over-writing your original data and,
while you are saving the file,
1) Your computer goes down because of a power outage OR
2) There is a brief interruption in the network
HALF OF YOUR DATA SET IS LOST.
You cry.

Scenario 3:
You get a data set and find errors in it.
You write a program that:
1) gets the original data
2) makes changes in values with SYNTAX
3) includes comments about the changes
4) saves the new file under a different name
Science marches forward.
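
The four steps of Scenario 3 can be sketched in any language that has program files. The sketch below uses Python and CSV files rather than SPSS or SAS, purely for illustration. The function name clean_copy, the CSV layout, and the "id" column are assumptions of this sketch, not part of the original slides.

```python
import csv
from pathlib import Path


def clean_copy(original: Path, cleaned: Path, corrections: dict) -> int:
    """Read `original`, apply documented corrections, write `cleaned`.

    `corrections` maps (id, column) -> corrected value (all strings).
    Returns the number of values changed. The original file is only
    ever opened for reading.
    """
    # Step 4 of the scenario: the output must be a different, new file.
    if cleaned.resolve() == original.resolve():
        raise ValueError("refusing to overwrite the original data file")
    if cleaned.exists():
        raise FileExistsError(f"{cleaned} already exists; choose a new name")

    # Step 1: get the original data.
    with original.open(newline="") as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames
        rows = list(reader)
    if fieldnames is None:
        raise ValueError(f"{original} has no header row")

    # Step 2: change values in code, not by hand.
    changed = 0
    for row in rows:
        for (row_id, column), value in corrections.items():
            if row["id"] == row_id and row[column] != value:
                row[column] = value
                changed += 1

    with cleaned.open("w", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return changed


# Step 3: each correction carries a comment saying who, why, and when.
CORRECTIONS = {
    # PJG looked at survey form, educ for ID=1 should be 16, 10/9/05
    ("1", "educ"): "16",
}
```

Calling clean_copy(Path("dirty.csv"), Path("clean_new.csv"), CORRECTIONS) would leave dirty.csv untouched; both file names here are hypothetical.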

Creating Variables and Recoding
is not the same as Cleaning Data
• You always want clean data
• You may not always want the recoded or created
variables
• Make new variables, but keep the old ones (don’t
over-write). Use the original to check the new.

Examples of Recoding/Creating
• Creating a series of dummies from a categorical variable
• Creating an index from a series of scale variables
• Creating a dichotomous or categorical variable from a continuous
variable
• Always consider MISSING VALUES
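
The recodes listed above can be sketched as small functions that add new variables and never touch the originals. This is a Python illustration, not SPSS or SAS syntax. The variable names (educ, q1 to q3), the cutoff of 16, and the rule that an index is missing when any item is missing are assumptions of this sketch.

```python
MISSING = None  # this sketch's stand-in for a missing value


def dummies(value, categories, prefix):
    """One 0/1 variable per category; all MISSING if value is not a known category."""
    if value not in categories:
        return {f"{prefix}_{c}": MISSING for c in categories}
    return {f"{prefix}_{c}": int(value == c) for c in categories}


def dichotomize(value, cutoff):
    """1 if value >= cutoff, 0 if below; MISSING stays MISSING."""
    if value is MISSING:
        return MISSING
    return int(value >= cutoff)


def scale_index(items):
    """Mean of the scale items; MISSING if any item is missing."""
    if any(item is MISSING for item in items):
        return MISSING
    return sum(items) / len(items)


def add_recodes(row):
    """Return a copy of `row` with new variables added; originals are kept."""
    new = dict(row)
    new.update(dummies(row["gender"], ["m", "f"], "gender"))
    new["educ_16plus"] = dichotomize(row["educ"], 16)
    new["attitude_index"] = scale_index([row["q1"], row["q2"], row["q3"]])
    return new
```

Because the original columns survive in the returned row, each new variable can be cross-checked against its source, for example by confirming that gender_m is 1 exactly where gender is "m".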

Sample SPSS Program
* CleanNew.sps .
* 10/10/05 created dummy for male .
Get file = 'dirty.sav' .
* Cleaning data, PJG, looked at survey form, educ for ID=1 should be 16, 10/9/05 .
If (id = 1) educ = 16 .
* Create a dummy variable from "gender" .
If (gender = 'm') male = 1 .
If (gender = 'f') male = 0 .
If (gender = '') male = -9 .
Missing values male (-9) .
Variable labels male 'Male' .
Value labels male 1 'Male' 0 'Female' .
Save outfile = 'CleanNew.sav' / drop gender .

Summary for Cleaning and Creating Variables
• Use syntax (programs) to create and clean variables
• Document when and why in your programs
• Save new file – do not over-write the old

It may be months between the
time that you finish a paper,
submit it, and get to revise it for
publication.

What you will need to know:
• The origin of your variables:
– What is the source for each variable?
– How were they created?
• What programs created your final tables?
• What program files created the file you used for your final tables?

Create a Directory for the Project
• For example, c:\MA_Thesis
• Store all of the programs and data in that directory and
subdirectories

Naming Conventions
• For every data file you have, you should have a program
file with a corresponding name.
• When you have finished your paper, create a program
file for each table. For example: table1.sas, table2.sas
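
The first convention can be checked mechanically. The Python sketch below lists data files that have no program file with a corresponding name. The function name data_without_program is hypothetical, and the default extensions (.sas7bdat for SAS data sets, .sas for programs) are an assumption about how the project stores its files.

```python
from pathlib import Path


def data_without_program(project_dir, data_ext=".sas7bdat", program_ext=".sas"):
    """Return names of data files with no same-named program file.

    Searches `project_dir` and its subdirectories. Names are compared
    without extension and without regard to case.
    """
    project = Path(project_dir)
    programs = {p.stem.lower() for p in project.rglob(f"*{program_ext}")}
    return sorted(
        p.name
        for p in project.rglob(f"*{data_ext}")
        if p.stem.lower() not in programs
    )
```

An empty list means every data file can be traced back to a program; anything else is a file whose origin will have to be reconstructed from memory.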

Document your work
• Write comments in your program.
• Put a file in your directory called a_note, readme, or
something similar that includes a brief description of the
project and important information.

Safeguarding your work
• Multiple backups – not all stored in the same basket
• Worry about the future
– Keep up with formats (cards, tapes, floppy disks, CDs, what
next?)
– Store in portable formats
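
A minimal Python sketch of the "multiple backups" advice: copy a file to several locations and confirm that each copy matches the original byte for byte. The function names and the idea of verifying with a SHA-256 checksum are assumptions of this sketch, not something the slides prescribe.

```python
import hashlib
import shutil
from pathlib import Path


def sha256(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source, backup_dirs):
    """Copy `source` into every directory in `backup_dirs` and verify each copy.

    Returns the list of backup paths. Raises OSError if a copy does not
    match the original.
    """
    source = Path(source)
    expected = sha256(source)
    copies = []
    for directory in backup_dirs:
        directory = Path(directory)
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / source.name
        shutil.copy2(source, target)  # copy2 also keeps timestamps
        if sha256(target) != expected:
            raise OSError(f"backup {target} does not match the original")
        copies.append(target)
    return copies
```

For the backups to count as "not all in the same basket", the directories passed in should sit on different physical devices or services; the sketch cannot check that for you.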

For more information:
http://vibranttechnologies.co.in/sas-classes-in-mumbai.html
Thank you!
