Top 10 Most Powerful Functions For PROC SQL

The document discusses the top 10 most powerful functions for PROC SQL in SAS. It provides examples of using functions like MONOTONIC(), COUNT(), NMISS(), COALESCE(), MISSING(), SPEDIS(), SOUNDEX(), RANUNI(), MAX(), IFC(), IFN(), UNIQUE(), and PUT() in PROC SQL statements. These functions help facilitate data management and descriptive statistics by providing functionality like specifying row numbers, counting values, combining columns, checking for missing values, fuzzy matching, random sampling, aggregating values, binary selection, finding variable levels, and creating filters.

Uploaded by

ArvindRaj


ABSTRACT
PROC SQL is not only one of the many SAS procedures but also a distinctive subsystem with all the
common features of SQL (Structured Query Language). Equipped with PROC SQL, SAS becomes
a full-fledged relational database management system. PROC SQL provides alternative ways to manage
data other than the traditional DATA step and SAS procedures. In addition, SAS's built-in functions are
add-on tools that increase the power of PROC SQL. In this paper, we illustrate ten popular SAS
functions, which extend the capacity of PROC SQL in data management and descriptive statistics.

INTRODUCTION
Structured Query Language (SQL) is a universal computer language for all relational database
management systems. PROC SQL is the implementation of the SQL syntax in SAS. It first appeared
in SAS 6.0, and since then has been widely used by SAS users. PROC SQL greatly increases SAS's
flexibility in handling data, especially for multiple-table joins and database access. There are a number
of comparisons between the DATA step and the SQL procedure in SAS [1]. A majority of SAS functions
can be used directly in the SQL procedure, and the SQL procedure also enjoys a few unique functions.
In this paper, we select 10 SAS functions and show their usage in the SQL procedure. For
demonstration purposes, we simulate a Social Security Number (SSN) dataset with two entries from
different sources. The two entries in each of the 1000 records should be identical, but some values are missing.
*****(0) Simulate two datasets for demonstration********************;
*****(0.1) Simulate a dataset for two SSN entries*******************;
data ssn_data;
do i = 1 to 1000;
ssn1 = ceil((ranuni(1234)*1E9));
ssn2 = ssn1;
if ssn1 le ceil((ranuni(1000)*1E9)) then call missing(ssn1);
if ssn2 le ceil((rannor(2000)*1E9)) then call missing(ssn2);
drop i;
output;
end;
format ssn1 ssn2 ssn11.;
run;
We also simulate a patient-visit dataset with three patient IDs. Every patient receives three different
treatments at each visit. The effects of the treatments (1 means effective; 0 means not effective) and the
cost for each visit are recorded. Besides the two simulated datasets, two datasets shipped with SAS,
SASHELP.CLASS and SASHELP.CARS, are also used in the paper.
*****(0.2) Simulate a dataset for hospital visits ******************;
data hospital_data;
input id visit treat1 treat2 treat3 cost;
format cost dollar8.2;
cards;
1 1 0 0 0 520
1 2 1 0 0 320
1 3 0 1 0 650
2 1 1 0 0 560
2 2 1 0 0 360
3 1 1 0 0 500
3 2 0 0 1 350
;;;
run;

TOP 10 FUNCTIONS FOR THE SQL PROCEDURE IN SAS

1. The MONOTONIC function

The MONOTONIC function is quite similar to the automatic variable _N_ in the DATA step. We can use it to
select records according to their row number. For example, we choose the SSNs from the 501st line to
the 888th line in the SSN dataset.
****(1) MONOTONIC: specify row numbers******************************;
proc sql;
select *
from ssn_data
where monotonic() between 501 and 888
;quit;
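MONOTONIC is generally described as an undocumented function, so its row numbering may not be dependable in joins or sorted queries. As a hedged alternative sketch, the same rows (501 to 888) can be requested with the FIRSTOBS= and OBS= data set options, which PROC SQL accepts in the FROM clause:

```sas
/* Sketch: select rows 501 through 888 without MONOTONIC,       */
/* using data set options on the table in the FROM clause.      */
proc sql;
    select *
    from ssn_data(firstobs=501 obs=888)
    ;
quit;
```

This form only applies to physical row positions in a single table; it is not a replacement when the row number itself is needed as a column.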
2. The COUNT, N and NMISS functions
These counting functions are especially useful in data cleaning. By using them, the detailed missing
status is shown in only one output table. For the SSN dataset, we can display the total numbers of the
missing and non-missing values for each SSN entry.
****(2) COUNT/N/NMISS: find total and missing values****************;
proc sql;
select count(*) as n 'Total number of the observations',
count(ssn1) as m_ssn1 'Number of the non-missing values for ssn1',
nmiss(ssn1) as nm_ssn1 'Number of the missing values for ssn1',
n(ssn2) as m_ssn2 'Number of the non-missing values for ssn2',
nmiss(ssn2) as nm_ssn2 'Number of the missing values for ssn2'
from ssn_data
;quit;
3. The COALESCE function
The COALESCE function combines multiple columns into a single one by taking the first non-missing
value. In this example, there are two columns of SSNs, which should be identical to each other.
However, some of the values are missing due to input errors or other reasons. The COALESCE function in the
SQL statement below checks the values of the two columns and returns the first non-missing value, which
maximizes the SSN information.
****(3) COALESCE: combine values among columns**********************;
proc sql;
select monotonic() as obs, coalesce(ssn1, ssn2) as ssn format = ssn11.
from ssn_data
;quit;
4. The MISSING function
The MISSING function returns a Boolean value for a variable (0 when non-missing; 1 when missing). In
the example below, the missing status of the values in the SSN dataset is displayed row by row.
****(4) MISSING: return Boolean for missing value*******************;
proc sql ;
select monotonic() as obs,
(case sum(missing(ssn1), missing(ssn2))
when 0 then 'No missing'
when 1 then 'One missing value'
else 'Both missing values'
end) as status 'Missing status'
from ssn_data
;quit;
5. The SPEDIS and SOUNDEX functions
The two functions support fuzzy matching. For example, if we want to examine the first entry of the SSN
dataset to see if there is any possible duplicate, we can use the SPEDIS function in the SQL statement to
compare every pair of records. Here we set the threshold to 25 in order to detect any singlet [2].
****(5)SPEDIS/SOUNDEX: fuzzy matching********************************;
****(5.1)SPEDIS: find spelling mistakes******************************;

proc sql;
select a.ssn1 as x, monotonic(a.ssn1) as x_obs,
b.ssn1 as y, monotonic(b.ssn1) as y_obs
from ssn_data as a, ssn_data as b
where (a.ssn1 gt b.ssn1) and (spedis(put(a.ssn1, z11.), put(b.ssn1, z11.)) le 25)
;quit;
For human names, we can check similarities with the SOUNDEX function to avoid duplicates [3]. The
SASHELP.CLASS dataset has 19 names. Phonetically, John and Jane sound similar according to the SOUNDEX
function.
****(5.2)SOUNDEX: find phonic similarity*****************************;
proc sql;
select a.name as name1, b.name as name2
from sashelp.class as a, sashelp.class as b
where soundex(a.name) = soundex(b.name) and (a.name gt b.name)
;quit;
6. The RANUNI function
Sorting the rows by RANUNI performs simple random sampling, much like PROC SURVEYSELECT. We can
specify the OUTOBS= option in the PROC SQL statement to choose the sample size.
****(6)RANUNI: simple random sampling********************************;
proc sql outobs=30;
select *
from ssn_data
order by ranuni(1234)
;quit;
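For comparison, a roughly equivalent draw with PROC SURVEYSELECT might look like the sketch below. The output dataset name ssn_sample is an assumption made for illustration. METHOD=SRS requests simple random sampling without replacement, and N= plays the role of OUTOBS=. The two approaches generate their random numbers differently, so the sampled rows may differ even with the same seed.

```sas
/* Sketch: draw 30 records by simple random sampling.           */
/* ssn_sample is a hypothetical output dataset name.            */
proc surveyselect data=ssn_data out=ssn_sample
                  method=srs n=30 seed=1234;
run;
```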
7. The MAX function
The MAX function returns the maximum value and sometimes simplifies column-wise aggregation. For
the patient-visit dataset, suppose we need to know whether each treatment is effective for the patients.
Because the effects are coded 0 or 1, the maximum over a patient's visits is 1 if the treatment was effective
at any visit. It may take some time to code the RETAIN statement and temporary variables in the DATA step,
while the MAX function in PROC SQL is quite straightforward.
****(7)MAX: find the maximum value for each column******************;
proc sql;
select id, max(treat1) as effect1 'Effect after Treatment 1',
max(treat2) as effect2 'Effect after Treatment 2',
max(treat3) as effect3 'Effect after Treatment 3'
from hospital_data
group by id
;quit;
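To make the comparison concrete, a DATA step version with RETAIN could be sketched as follows. The dataset names hospital_sorted and effect_by_id are assumptions introduced for illustration, and this is only one of several ways to write it.

```sas
/* Sketch: the same per-patient maximum with a DATA step.       */
/* hospital_sorted and effect_by_id are hypothetical names.     */
proc sort data=hospital_data out=hospital_sorted;
    by id;
run;

data effect_by_id;
    set hospital_sorted;
    by id;
    /* Carry the running maxima across rows of the same patient */
    retain effect1 effect2 effect3;
    if first.id then do;
        effect1 = 0;
        effect2 = 0;
        effect3 = 0;
    end;
    effect1 = max(effect1, treat1);
    effect2 = max(effect2, treat2);
    effect3 = max(effect3, treat3);
    /* Write one row per patient after the last visit */
    if last.id then output;
    keep id effect1 effect2 effect3;
run;
```

The PROC SQL query needs neither the sort nor the bookkeeping for the first and last rows of each group.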
8. The IFC and IFN functions
The two functions play a role like the CASE-WHEN-END expression in typical SQL syntax when the condition
is a binary selection. The IFC function returns a character value, while the IFN function returns a
number. For the patient-visit dataset, we can use the two functions together to find the total cost,
the discounted cost (a 15% discount is applied if the total cost is $1,000 or more), and whether the
first treatment is effective for each patient.
****(8)IFC/IFN: binary selection for either character or number*****;
proc sql;
select id, ifc(max(treat1) = 1, 'Yes', 'No') as overall_effect
length = 3 'Any effect after treatment 1',
sum(cost) as sum_cost format = dollar8.2 'Total cost',
ifn(calculated sum_cost ge 1000,
calculated sum_cost*0.85,
calculated sum_cost*1) as discounted_cost
format=dollar8.2 'Total cost after discount if any'
from hospital_data
group by id
;quit;
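For readers more used to standard SQL, the query above can be restated with CASE expressions. This is a sketch of an equivalent form rather than a recommendation; it uses the same columns, the same threshold, and LABEL= to attach the column labels.

```sas
/* Sketch: the IFC/IFN query rewritten with CASE expressions.   */
proc sql;
    select id,
           case when max(treat1) = 1 then 'Yes' else 'No' end
               as overall_effect length = 3
               label = 'Any effect after treatment 1',
           sum(cost) as sum_cost format = dollar8.2
               label = 'Total cost',
           case when calculated sum_cost ge 1000
                then calculated sum_cost * 0.85
                else calculated sum_cost
           end as discounted_cost format = dollar8.2
               label = 'Total cost after discount if any'
    from hospital_data
    group by id
    ;
quit;
```

IFC and IFN are more compact for a two-way choice, while CASE extends naturally to three or more branches.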
9. The UNIQUE function
This function conveniently shows the number of levels of every categorical variable.
****(9)UNIQUE: find the levels of categorical variables************;
proc sql;
select count(unique(make)) as u_make 'Number of the car makers',
count(unique(origin)) as u_origin 'Number of the car origins',
count(unique(type)) as u_type 'Number of the car types'
from sashelp.cars
;quit;
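In PROC SQL, UNIQUE is commonly treated as an alias for the standard DISTINCT keyword, so the same counts can be written in the more portable form sketched below. The column aliases are kept from the query above.

```sas
/* Sketch: count the levels with the standard DISTINCT keyword. */
proc sql;
    select count(distinct make)   as u_make
               label = 'Number of the car makers',
           count(distinct origin) as u_origin
               label = 'Number of the car origins',
           count(distinct type)   as u_type
               label = 'Number of the car types'
    from sashelp.cars
    ;
quit;
```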
10. The PUT function
We can apply the PUT function with a user-defined format from PROC FORMAT in the WHERE clause
to create filters. For the SASHELP.CARS dataset, this strategy is used to choose only the high or
medium priced cars.
****(10)PUT: create a filter by user-defined format****************;
proc format;
value range
40000-high='High'
26000 -< 40000='Medium'
other ='Low';
run;
proc sql;
select model, make, msrp,
msrp as range 'Price Range' format = range.
from sashelp.cars
where put(msrp, range.) in ('High', 'Medium')
;quit;
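Given the ranges defined above, 'High' and 'Medium' together cover every MSRP of 26000 or more, so the filter is equivalent to the direct comparison sketched here. The sketch is only meant to show what the format-based filter selects; the PUT approach pays off when the ranges are reused or changed in one place.

```sas
/* Sketch: the same rows selected with a direct comparison.     */
/* 26000 is the lower bound of the 'Medium' range above.        */
proc sql;
    select model, make, msrp
    from sashelp.cars
    where msrp ge 26000
    ;
quit;
```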
