PSTAT 130 Full Lecture Notes
SAS Components
This class
Base SAS – basic procedures and data management
Data management
Creating reports
Today’s Objectives
Open SAS
Menu -> All Programs -> Math & Stats -> SAS -> SAS 9.4
The Main Windows
Log
Output
Results
Explorer
Editor
Output window:
Results viewer
Default output for SAS 9.4
A single, continuous html report
Not affected by options like
Page size, page number, etc.
Another Useful Window: Table Editor
Getting Started
SAS statements
Always begin with a keyword
Always end with a semicolon (;)
Are free format
i.e. Can begin at any location and end at any location
Entire program can be written on one line, or many lines
EXCEPT when using the datalines; statement
SAS
Is not case sensitive
i.e. daTa nOtCaseSensitive;
EXCEPT in the case of string comparisons
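A small sketch of the rules above. The data set name and values are made up for illustration. The DATA step statements share one line and use mixed-case keywords, the data lines still sit on their own lines, and only the quoted string comparison is case sensitive.

```sas
/* Free format: several statements on one line, keywords in mixed case. */
dAtA work.casedemo; INPUT name $; DaTaLiNeS;
Ann
ann
;
run;

/* String comparisons ARE case sensitive: only the row 'Ann' is selected. */
proc print data=work.casedemo;
  where name = 'Ann';
run;
```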
The Basics
Comments
General format:
libname desktop 'C:\desktop';
The DATA Step
(Slide callouts: end of PROC step; keyword "Tables"; variable being summarized)
First SAS Program
DATA intelligence;   /* i.e. work.intelligence */
input IQ;
datalines;
99
140
125
118
104
;
run;
PROC print;
run;
William Qiu
Department of Statistics and Applied Probability
UCSB
PSTAT 130 - Summer 2017 - William Qiu
Objectives
Create a printed report using PROC PRINT
Learn PROC PRINT syntax and options
Define titles and footnotes to enhance reports
Define descriptive column headings
Use SAS system options
proc print;
run;
With Data File
proc print data=ia.empdata;
run;
With Variable Selection: the VAR statement identifies the variables to print.
PROC PRINT prints the variables in the order that you list them.
proc print data=ia.empdata;
var JobCode EmpID Salary;
run;
Proc Print Example
Libname ia <insert folder reference>
Examples
title1 'PSTAT 130 Homework #1';
footnote2 'Confidential';
A null TITLE statement (for example, title1;) suppresses a title on line 1 and
all lines after it.
OPTIONS option...;
Run the following program to generate a list of all options (the list will
appear in the LOG window):
PROC OPTIONS;
RUN;
Formats affect only how the data values appear in output, not the
actual data values as they are stored in the SAS data set.
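A minimal sketch of that point. The data set work.pay and its values are hypothetical. The FORMAT statement changes what PROC PRINT shows, while PROC MEANS still works on the stored numbers.

```sas
data work.pay;
  input EmpID $ Salary;
  datalines;
E001 52000
E002 48500
;
run;

/* The FORMAT statement changes only the printed appearance. */
proc print data=work.pay;
  format Salary dollar10.;
run;

/* The stored values are unchanged: the mean is computed on 52000 and 48500. */
proc means data=work.pay mean;
  var Salary;
run;
```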
(Diagram: SAS data set -> format -> report)
(Slide callouts for the example below: numeric format name, numeric data values, keywords)
The statement begins with the keyword VALUE and ends
with a semicolon after all the labels have been assigned.
proc format;
value boardfmt low-49='Below Average'
50-99='Average'
100-high='Above Average';
run;
(Slide callouts for the example below: data values, keywords)
proc format;
value $grade 'A'='Good'
'B'-'D'='Fair'
'F'='Poor'
'I','U'='See Instructor'
other='Miscoded';
run;
MMDDYYw. and DATEw. formats
Format      Displayed Value     Format    Displayed Value
MMDDYY6.    101601              DATE7.    16OCT01
MMDDYY8.    10/16/01            DATE9.    16OCT2001
MMDDYY10.   10/16/2001
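A short sketch that writes one date to the log with each of the formats listed above. The variable name d is arbitrary. The comments repeat the displayed values given in the table.

```sas
data _null_;
  d = '16OCT2001'd;      /* SAS date constant */
  put d mmddyy6.;        /* 101601     */
  put d mmddyy8.;        /* 10/16/01   */
  put d mmddyy10.;       /* 10/16/2001 */
  put d date7.;          /* 16OCT01    */
  put d date9.;          /* 16OCT2001  */
run;
```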
PROC FORMAT;
VALUE format-name range1='label'
range2='label'
...;
RUN;
Range(s)
Can be single values or
ranges of values
proc format;
value $codefmt "FLTAT"="Flight Attendant"
"PILOT"="Pilot";
run;
/* -< and <- exclude the shared endpoints, so 25000 and 50000 each fall in one range */
proc format;
value money low-<25000='Less than 25,000'
25000-50000='25,000 to 50,000'
50000<-high='More than 50,000';
run;
proc format;
value $codefmt 'FLTAT'='Flight Attendant'
'PILOT'='Pilot';
value money low-<25000='Less than 25,000'
25000-50000='25,000 to 50,000'
50000<-high='More than 50,000';
run;
Operands include
variables or constants
Operators include
comparison operators
logical operators
special operators
functions
Printing Selected Observations
Use the WHERE statement to control which observations
are processed.
proc print data=ia.empdata noobs;
var JobCode EmpID Salary;
where JobCode='PILOT';
run;
OR (|)
if either expression is true, then the compound expression is true
where JobCode='PILOT' or JobCode='FLTAT';
where JobCode='PILOT' | JobCode='FLTAT';
NOT
can be combined with other operators to reverse the logic of a
comparison.
where not JobCode in('PILOT','FLTAT');
CONTAINS (?)
selects observations that include the specified substring.
where LastName ? 'LAM';
(LAMBERT, BELLAMY, and ELAM are selected.)
Contrast In and ?
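A minimal sketch of the contrast. The data set work.names and the extra rows (LAM, SMITH) are hypothetical. IN needs the whole value to match a listed value, while ? only needs the substring to appear somewhere in the value.

```sas
data work.names;
  input LastName $ JobCode $;
  datalines;
LAMBERT PILOT
BELLAMY FLTAT
ELAM MECH
LAM PILOT
SMITH FLTAT
;
run;

/* IN: the whole value must equal a listed value, so only LAM is selected. */
proc print data=work.names;
  where LastName in ('LAM');
run;

/* ? (CONTAINS): LAMBERT, BELLAMY, ELAM and LAM all include the substring. */
proc print data=work.names;
  where LastName ? 'LAM';
run;
```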
Example:
proc print data=ia.empdata width=uniform;
run;
Example:
proc print data=ia.empdata noobs;
var JobCode EmpID Salary;
sum Salary;
run;
(Note: the SUM statement also produces subtotals if you
print the data in groups.)
Sequencing and Grouping Observations
(PROC PRINT output header: Obs, ID, Name, Age, Date, Height, Weight, Act Level, Fee)
Example:
proc print data=ia.empdata;
id JobCode;
var EmpID Salary;
run;
(Output header: Job Code, Emp ID, LastName, FirstName, Salary)
(Diagram: a conversion process produces a SAS data set, which consists of a
descriptor portion and a data portion)
SAS Data Sets
SAS data sets have a descriptor portion and a data portion.
Descriptor portion: general data set information
* data set name
* data set label
* date/time created
* storage information
* number of observations
Example:
proc contents data=data1.empdata;
run;
#  Variable   Type  Len
1  EmpID      Char    4
3  FirstName  Char   13
4  JobCode    Char    5
2  LastName   Char   13
5  Salary     Num     8
Example:
data staff;
set mydrive.empdata;
keep lastname firstname jobcode salary;
run;
General Form
Libref.filename
Examples:
ia.empdata
work.staff
Directory
Libref IA
Engine V9
Physical Name E:\UCSB\pstat 130\data1
File Name E:\UCSB\pstat 130\data1
# Name  Member Type  File Size  Last Modified
New SAS data set:
data mydrive.dfwlax;
set ia.dfwlax;
run;
Reading Raw Data within a SAS Program
If the raw data are contained in the SAS Program:
Use the DATALINES keyword, followed by the raw data lines.
Example:
data work.sample;
input firstname $ gender $ age;
datalines;
John Male 22
Jane Female 19
;
run;
Non-standard data (dates) are read with an informat:
data students;
input Name $ Gender $ Age Enroll mmddyy8.;
datalines;
David Male 19 06/18/10
Amelia Female 23 08/02/10
Ravi Male 17 07/22/10
Ashley Female . 09/14/10
Jim Male 26 08/26/10
;
run;
Data libref.new-data-set;
KEEP variables;
or
DROP variables;
run;
Operands are
Variable names
Constants
Operators are
Arithmetic symbols (+, -, /, *, etc)
SAS functions
Example:
Total=sum(Salary, Bonus);
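A small sketch comparing a function with an arithmetic operator in an assignment. The data set name and values are hypothetical. SUM ignores missing arguments, while + returns a missing value when either operand is missing.

```sas
data work.bonusdemo;
  input Salary Bonus;
  TotalFn   = sum(Salary, Bonus);   /* SUM ignores missing arguments          */
  TotalPlus = Salary + Bonus;       /* + gives missing if either side missing */
  datalines;
50000 2000
42000 .
;
run;

/* Second row: TotalFn is 42000, TotalPlus is missing (.). */
proc print data=work.bonusdemo;
run;
```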
Example:
evaldate = '14FEB2009'd;
IF expression;
data work.dfwlax;
infile 'raw-data-file';
input Flight $ 1-3 Date $ 4-11
Dest $ 12-14 FirstClass 15-17
Economy 18-20;
run;
It then loads the first line of data into the input buffer,
parses it into variables, and outputs those values to the
SAS data set.
Input buffer with 1st line of data:
         1         2         3
123456789012345678901234567890
01510/25/12LAX 14163
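A sketch of the same column input using in-stream data instead of an external file, so it can be run as is. The data set name is made up, and the blank in column 15 is an assumption about the slide's layout (FirstClass occupies columns 15-17).

```sas
data work.dfwlax_demo;
  /* Same column ranges as the INFILE example above. */
  input Flight $ 1-3 Date $ 4-11
        Dest $ 12-14 FirstClass 15-17
        Economy 18-20;
  datalines;
01510/25/12LAX 14163
;
run;

/* Expected values: Flight=015 Date=10/25/12 Dest=LAX FirstClass=14 Economy=163 */
proc print data=work.dfwlax_demo;
run;
```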
data students;
input @1 Name $8. @9 Gender $6. @18 Age 2.
@22 Enroll mmddyy8.;
datalines;
David   Male     19  06/18/10
Amelia  Female   23  08/02/10
Ashley  Female   20  09/14/10
Jim     Male     26  08/26/10
;
run;
DATA SAS-data-set ;
SET SAS-data-set1 SAS-data-set2 . . .;
<other SAS statements>
RUN;
Morning            Afternoon
Name   Score       Name   Score
Mary   75          Andy   78
Mark   82          Alice  85
Mike   68          Art    62

allsections
Name   Score
Mary   75
Mark   82
Mike   68
Andy   78
Alice  85
Art    62
data allsections;
set morning afternoon;
run;
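A self-contained sketch of the concatenation, with the Morning and Afternoon tables typed in as in-stream data. The result has all rows of the first data set followed by all rows of the second.

```sas
data morning;
  input Name $ Score;
  datalines;
Mary 75
Mark 82
Mike 68
;
run;

data afternoon;
  input Name $ Score;
  datalines;
Andy 78
Alice 85
Art 62
;
run;

/* SET with two data sets reads morning first, then afternoon. */
data allsections;
  set morning afternoon;
run;

proc print data=allsections;
run;
```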
DATA SAS-data-set;
MERGE SAS-data-sets;
BY BY-variable(s);
<other SAS statements>
RUN;
Midterm              Final
Name   Midscore      Name   Finalscore
Wendy  32            John   91
Andy   38            Wendy  73
John   27            Andy   82

allsections
Name   Midscore  Finalscore
Andy   38        82
John   27        91
Wendy  32        73
Data allscores;
Merge midterm (RENAME=(score=midterm))
final (RENAME=(score=final));
by Name;
run;
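A runnable sketch of this merge. It assumes both input data sets store the score in a variable called Score, as the RENAME= options suggest. Because a MERGE with a BY statement expects each input to be sorted by the BY variable, each data set is sorted by Name first.

```sas
data midterm;
  input Name $ Score;
  datalines;
Wendy 32
Andy 38
John 27
;
run;

data final;
  input Name $ Score;
  datalines;
John 91
Wendy 73
Andy 82
;
run;

/* Match-merging needs both inputs in BY-variable order. */
proc sort data=midterm; by Name; run;
proc sort data=final;   by Name; run;

data allscores;
  merge midterm (rename=(Score=Midterm))
        final   (rename=(Score=Final));
  by Name;
run;

proc print data=allscores;
run;
```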
Example:
Data allscores;
Merge midterm (IN=InMidterm)
final (IN=InFinal);
by Name;
if InMidterm and InFinal;
run;
Lookup Tables
Data set variable contains ‘codes’
Males are coded as 1
Females are coded as 2
Lookup table contains ‘labels’ that can be merged with
‘codes’
GenderLookup
GenderCode GenderLabel
1 Male
2 Female
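A minimal sketch of a lookup merge. The people data set and its codes are hypothetical, and GenderLookup is the table shown above. The IN= variable keeps only rows that came from the main data set.

```sas
data people;
  input Name $ GenderCode;
  datalines;
Alex 1
Cheryl 2
Curt 1
;
run;

data GenderLookup;
  input GenderCode GenderLabel $;
  datalines;
1 Male
2 Female
;
run;

/* GenderLookup is already in GenderCode order; people must be sorted. */
proc sort data=people; by GenderCode; run;

data labeled;
  merge people (in=InPeople) GenderLookup;
  by GenderCode;
  if InPeople;     /* keep only rows that came from people */
run;

proc print data=labeled;
run;
```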
(Names shown on the slide: Alex Shepard, Andy Potts, Cheryl Smith, Curt Forrest)
Task – Output Class List
Create a Class List for each Student showing details of the
classes each is taking.
Include the variables below
Assign appropriate variable labels
Use an appropriate format for FirstClassDate
Example on next slide
Class list columns: Course Name, First Class Date, Instructor Name, Building Name,
Room Number, Class Days, Class Time
Second listing columns: Student Name, Course Name, Instructor Name, Academic Rank, Salary
Example:
Act      N
Level    Obs   Variable    N           Mean        Std Dev
----------------------------------------------------------
HIGH       7   Age         7     34.2857143      7.5213980
               Height      7     70.1428571      4.2201332
               Weight      7    163.5714286     21.1412483
Example:
proc format;
value $codefmt
'FLTAT1'-'FLTAT3'='Flight Attendant'
'PILOT1'-'PILOT3'='Pilot';
run;
proc freq data = ia.crew;
format JobCode $codefmt.;
tables JobCode;
run;
JobCode           Salary
Frequency        |
Percent          |
Row Pct          |
Col Pct          |Less tha|25,000 t|More tha|  Total
                 |n 25,000|o 50,000|n 50,000|
-----------------+--------+--------+--------+
Flight Attendant |      5 |     39 |      0 |     44
                 |   7.25 |  56.52 |   0.00 |  63.77
                 |  11.36 |  88.64 |   0.00 |
                 | 100.00 | 100.00 |   0.00 |
-----------------+--------+--------+--------+
Pilot            |      0 |      0 |     25 |     25
                 |   0.00 |   0.00 |  36.23 |  36.23
                 |   0.00 |   0.00 | 100.00 |
                 |   0.00 |   0.00 | 100.00 |
-----------------+--------+--------+--------+
Total                   5       39       25       69
                     7.25    56.52    36.23   100.00
Lecture Outline
Summarizing Your Data (cont'd)
PROC MEANS
PROC FREQ
PROC TABULATE
PROC REPORT
+--------------------------------------+
|               Location               |
+------------+------------+------------+
|    CARY    | FRANKFURT  |   LONDON   |
+------------+------------+------------+
|     N      |     N      |     N      |
+------------+------------+------------+
|       17.00|       12.00|       15.00|
+------------+------------+------------+
Obtaining a Total
proc tabulate data=ia.fltat;
class Location;
table Location All;
run;
The blank operator between Location and All concatenates information.
+--------------------------------------+------------+
|               Location               |            |
+------------+------------+------------+            |
|    CARY    | FRANKFURT  |   LONDON   |    All     |
+------------+------------+------------+------------+
|     N      |     N      |     N      |     N      |
+------------+------------+------------+------------+
|       17.00|       12.00|       15.00|       44.00|
+------------+------------+------------+------------+
Two-Dimensional Tables
title2 'by JobCode';
proc tabulate data=ia.fltat;
class Location JobCode;
table JobCode, Location;
run;
Row dimension: JobCode
Column dimension: Location
The comma operator moves to a new dimension.
+----------------------+-------------------------+
|                      |        Location         |
|                      +------------+------------+
|                      |    CARY    | FRANKFURT  |
|                      +------------+------------+
|                      |     N      |     N      |
+----------------------+------------+------------+
|JobCode               |            |            |
+----------------------+            |            |
|FLTAT1                |        5.00|        4.00|
+----------------------+------------+------------+
|FLTAT2                |        7.00|        5.00|
+----------------------+------------+------------+
|FLTAT3                |        5.00|        3.00|
+----------------------+------------+------------+
(Slide callouts: row dimension, column dimension)
+----------------------+-------------------------+
|                      |        Location         |
|                      +------------+------------+
|                      |    CARY    | FRANKFURT  |
|                      +------------+------------+
|                      |   Salary   |   Salary   |
|                      +------------+------------+
|                      |    Sum     |    Sum     |
+----------------------+------------+------------+
|JobCode               |            |            |
+----------------------+            |            |
|FLTAT1                |   131000.00|   100000.00|
+----------------------+------------+------------+
|FLTAT2                |   245000.00|   181000.00|
+----------------------+------------+------------+
|FLTAT3                |   217000.00|   134000.00|
+----------------------+------------+------------+
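The program for the Salary table is not shown, so here is a hedged sketch of one way to get that layout. The demo data set and its values are hypothetical. A VAR statement names the analysis variable, and the asterisk operator nests Salary and the Sum statistic under each Location.

```sas
data work.fltat_demo;
  input JobCode $ Location :$9. Salary;
  datalines;
FLTAT1 CARY 26000
FLTAT1 CARY 27000
FLTAT1 FRANKFURT 25000
FLTAT2 CARY 35000
FLTAT2 FRANKFURT 36000
FLTAT3 FRANKFURT 45000
;
run;

proc tabulate data=work.fltat_demo;
  class JobCode Location;               /* classification variables       */
  var Salary;                           /* analysis variable              */
  table JobCode, Location*Salary*sum;   /* rows, then nested column cells */
run;
```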
Example:
proc report data=ia.admit nowindow;
run;
ID  Name  Sex  Age  Date  Height  Weight  Act Level  Fee
2458 Murray, W M 27 1 72 168 HIGH 85.20
2462 Almers, C F 34 3 66 152 HIGH 124.80
2501 Bonaventure, T F 31 17 61 123 LOW 149.75
2523 Johnson, R F 43 31 63 137 MOD 149.75
2539 LaMance, K M 51 4 71 158 LOW 124.80
2544 Jones, M M 29 6 76 193 HIGH 124.80
2552 Reberson, P F 32 9 67 151 MOD 149.75
2555 King, E M 35 13 70 173 MOD 149.75
2563 Pitts, D M 34 22 73 154 LOW 124.80
2568 Eberhardt, S F 49 27 64 172 LOW 124.80
2571 Nunnelly, A F 44 19 66 140 HIGH 149.75
2572 Oberon, M F 28 17 62 118 LOW 85.20
2574 Peterson, V M 30 6 69 147 MOD 149.75
2575 Quigley, M F 40 8 69 163 HIGH 124.80
2578 Cameron, L M 47 5 72 173 MOD 124.80
2579 Underwood, K M 60 22 71 191 LOW 149.75
2584 Takahashi, Y F 43 29 65 123 MOD 124.80
2586 Derber, B M 25 23 75 188 HIGH 85.20
2588 Ivan, H F 22 20 63 139 LOW 85.20
2589 Wilcox, E F 41 16 67 141 HIGH 149.75
2595 Warren, C M 54 7 71 183 MOD 149.75
COLUMN SAS-variables;
Example:
Selected options:
SUMMARIZE Prints the total.
OL Prints a single line above the total.
DOL Prints a double line above the total.
UL Prints a single line below the total.
DUL Prints a double line below the total.
Example:
ods html file='output.html'
style=analysis;
Description:
SUMVAR = identifies the analysis variable to use for
the sum or mean calculation.
TYPE= specifies that the height or length of the
bar or size of the slice represents a mean
or sum of the analysis-variable values.
COLOR=color | C=color
FONT=type-font | F=type-font
HEIGHT=n | H=n
Example:
plot Boarded*Date;
run;
goptions reset=all;
Producing Plots – GPLOT Output
General Form
SYMBOLn options;
I = rl
I = rlclm95
regeqn
Lecture Outline
Controlling When Records are Written to a SAS Dataset
Creating Multiple Records from a Single Observation
(OUTPUT)
Writing to Multiple SAS Data Sets (OUTPUT)
Creating a Single Record from Multiple Observations
(RETAIN)
Creating a running total – RETAIN Statement and Sum
Expression
Limiting the Variables Read from or Written to a SAS Data
Set (DROP= and KEEP=)
Limiting the Number of Observations Read from a Data
Set (FIRSTOBS= and OBS= options)
Automatic Output (Review)
data forecast;
set prog2.growth;
<additional SAS statements>;
run;
(Callouts at the end of the DATA step: 1. Automatic Output, 2. Automatic Return)
OUTPUT <SAS-data-set-1 ... SAS-data-set-n>;
Murray, W 1 176.400
Murray, W 2 185.220
Murray, W 3 194.481
Murray, W 4 204.205
Murray, W 5 214.415
Almers, C 1 159.600
Almers, C 2 167.580
Almers, C 3 175.959
Almers, C 4 184.757
Almers, C 5 193.995
Bonaventure, T 1 129.150
Bonaventure, T 2 135.608
Bonaventure, T 3 142.388
Bonaventure, T 4 149.507
Bonaventure, T 5 156.983
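A sketch that reproduces a listing like the one above with an explicit OUTPUT inside a DO loop. The starting weights (168, 152, 123) are taken from the PROC REPORT listing earlier in these notes. The 5% yearly increase and the variable names are assumptions inferred from the numbers, not the original program.

```sas
data work.growth_demo;
  input Name & $14. Weight;    /* & allows single embedded blanks in Name */
  datalines;
Murray, W  168
Almers, C  152
Bonaventure, T  123
;
run;

data work.forecast_demo;
  set work.growth_demo;
  do Year = 1 to 5;
    Weight = Weight * 1.05;    /* assumed 5% increase per year         */
    output;                    /* writes one record per loop iteration */
  end;
run;

proc print data=work.forecast_demo noobs;
  var Name Year Weight;
  format Weight 8.3;
run;
```

Each input observation produces five output records because OUTPUT runs once per pass through the loop.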
Writing to Multiple SAS Data Sets
GOAL: Use the ADMIT data set to create three separate
output data sets:
Patients with Low activity level
Patients with Moderate activity level
Patients with High activity level
This is the reverse of Concatenating (Appending) data sets
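A minimal sketch of that goal. The small admit_demo data set stands in for ADMIT, and the variable name ActLevel and the output data set names are assumptions. The DATA statement lists all three output data sets, and each OUTPUT statement names the one to write to.

```sas
data work.admit_demo;
  input ID $ ActLevel $;
  datalines;
2458 HIGH
2501 LOW
2523 MOD
2539 LOW
;
run;

data work.low work.mod work.high;
  set work.admit_demo;
  if ActLevel = 'LOW' then output work.low;
  else if ActLevel = 'MOD' then output work.mod;
  else if ActLevel = 'HIGH' then output work.high;
run;
```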
General Form
First.BY-variable
Last.BY-variable
First.Name
Last.Name
Outputting Only Summary Records
data weightchange;
set weightloss;
by name visit;
if first.name then do;
baseweight=weight;
weightchange = .;
end;
if last.name then do;
weightchange = baseweight - weight;
output;
end;
retain;
run;
KEEP=
SAS-data-set(KEEP=variable-1 variable-2 …variable-n)
Air Force NSF Camp Springs MD USA Andrews Air Force Base
Army EDG Edgewood Arsenal MD USA Weide Army Air Field
Army FME Fort Meade (Odenton) MD USA Tipton Army Air Field
Naval NHZ Brunswick ME USA Brunswick Naval Air
Air Force LIZ Limestone ME USA Loring Air Force Base
Air Force SAW Gwinn/Marquette MI USA K. I. Sawyer Air Force
Marine 76G Marine City MI USA Marine City Airport
Naval NFB Mount Clemens MI USA Detroit Naval Air
Air Force SZL Knob Noster MO USA Whiteman Air Force Base
Air Force BIX Biloxi MS USA Keesler Air Force Base
Air Force CBM Columbus MS USA Columbus Air Force Base
Naval NJW Moscow MS USA Joe Williams Naval
Air Force GFA Great Falls MT USA Malmstrom Air Force Base
Air Force POB Fayetteville NC USA Pope Air Force Base
Army FBG Fort Bragg NC USA Simmons Army Air Field
Air Force GSB Goldsboro NC USA Seymour Johnson Air Force
Army HFF Hoffmann/Camp Mackal NC USA Mackall Army Air Field
Example:
data army;
set prog2.military(obs=25);
if Type eq 'Army' then output;
run;
Note: SAS will warn you that this file is not of .xls type.
Just ignore this warning.
Note: INFILE vs. FILE, INFORMAT vs. FORMAT, INPUT vs. PUT
One controls data coming into SAS, the other controls data leaving SAS
Mary,21,Female
John,22,Male
David,18,Male
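A hedged sketch of reading these comma-delimited rows. The variable names are assumptions, and the DSD option on INFILE is one common way to treat commas as delimiters.

```sas
data work.people_csv;
  infile datalines dsd;     /* DSD: comma-delimited values */
  input Name $ Age Gender $;
  datalines;
Mary,21,Female
John,22,Male
David,18,Male
;
run;

proc print data=work.people_csv;
run;
```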
Missing Data at the End of a Row
By default, when there is missing data at the end of a row,
1. SAS loads the next record to finish the observation
2. a note is written to the log
3. SAS loads a new record at the top of the DATA step and
continues processing.
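A small sketch of one way to change that default. The data are hypothetical. With the MISSOVER option on INFILE, SAS sets the remaining variables to missing instead of loading the next record, so the short John row still yields its own observation.

```sas
data work.short_rows;
  infile datalines missover;   /* do not load the next record to finish a row */
  input Name $ Age Score;
  datalines;
Mary 21 88
John 22
David 18 75
;
run;

/* Three observations; Score is missing for John. */
proc print data=work.short_rows;
run;
```

Without MISSOVER, the David record would be consumed while SAS tries to complete John's observation.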
ID  In Service  Pass Cap  Carg Cap
Desired Output
Proc Print;
Var Week1-Week52;
Run;
Use this data set to create another data set suitable for
mailing labels.
Example
LENGTH("SMITH, JOHN") = 11
Example
INDEX('SMITH-JOHN','-') = 6
Example:
SUBSTR("PSTAT130",6,3) = "130"
123456789012345
Smith, John
Length = 11
Comma_pos = 6
Last_name = SUBSTR(Name,1,5)
First_name = SUBSTR(Name,8,4)
Example:
SCAN("Smith, John", 1) = "Smith" (first "word")
SCAN("Smith, John", 2) = "John" (second "word")
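A runnable sketch that applies LENGTH, INDEX, SUBSTR and SCAN to the same name and writes the results to the log. The variable names are made up, and the comments show the values for this one input.

```sas
data _null_;
  length Last_name First_name $ 20;
  Name = 'Smith, John';
  Len        = length(Name);                   /* 11    */
  Comma_pos  = index(Name, ',');               /* 6     */
  Last_name  = substr(Name, 1, Comma_pos - 1); /* Smith */
  First_name = substr(Name, Comma_pos + 2);    /* John  */
  Word1 = scan(Name, 1);                       /* Smith */
  Word2 = scan(Name, 2);                       /* John  */
  put Len= Comma_pos= Last_name= First_name= Word1= Word2=;
run;
```

SCAN needs no position arithmetic because blanks and commas are among its default delimiters.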
Examples:
"John" || "Smith" = "JohnSmith"
"John" || " " || "Smith" = "John Smith"
Example:
TRIM("JOHN   ") = "JOHN"
Note: CEIL(4) = 4
Note: FLOOR(4) = 4
Example
Areacode = 805
AreaChar = PUT(Areacode,3.) = "805"
BUT if you want to do more than one thing, you need a
DO-END block
IF sex = 'Male' THEN
DO;
Abbreviation = 'Mr.';
Salutation = 'Sir';
END;
END;
Year Capital
2004 17364.61
Year Capital
2001 5375.00
2002 11153.13
2003 17364.61
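The program behind these two listings is not shown. The sketch below is an assumption that reproduces the numbers: 5000 added each year, then 7.5% growth. Without OUTPUT in the loop, one record is written after the loop ends, when Year has already been incremented to 2004. With OUTPUT in the loop, one record is written per year.

```sas
/* One record: Year=2004, Capital=17364.61 (to two decimals). */
data work.invest_final;
  do Year = 2001 to 2003;
    Capital + 5000;               /* sum statement: starts at 0, retained */
    Capital + (Capital * 0.075);  /* assumed 7.5% yearly growth           */
  end;
run;

/* Three records, one per year. */
data work.invest_yearly;
  do Year = 2001 to 2003;
    Capital + 5000;
    Capital + (Capital * 0.075);
    output;
  end;
run;

proc print data=work.invest_yearly noobs;
  format Capital 10.2;
run;
```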
1 1 1 1
2 2 3 2
3 3 6 6
4 4 10 24
5 5 15 120
6 6 21 720
7 7 28 5040
8 8 36 40320
9 9 45 362880
10 10 55 3628800
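The table above has no header. It appears to show an observation number, a counter, a running sum, and a running product (factorial). A sketch under that assumption, with made-up variable names:

```sas
data work.running;
  Total = 0;
  Product = 1;
  do N = 1 to 10;
    Total = Total + N;         /* running sum: 1, 3, 6, ..., 55   */
    Product = Product * N;     /* running product: 1, 2, 6, ...   */
    output;                    /* one record per value of N       */
  end;
run;

proc print data=work.running;
  var N Total Product;
run;
```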