SAS Interview Questions

The document discusses several topics related to SAS programming including: 1. The functions of Program Data Vector (PDV) which includes creating a database with one observation at a time and containing automatic variables like _N_ and _ERROR_. 2. The use of double trailing @@ in input statements which instructs SAS to hold the current record for execution of the next input statement. 3. The difference between the NODUP and NODUPKEY options in PROC SORT which compare all variables or just BY variables to identify and eliminate duplicate observations. 4. Character functions used for data cleaning like COMPRESS, TRIM, LOWCASE, UPCASE, and COMPBL.

Uploaded by

Chetan Sapkal

1) What is the PDV and what are its functions?

Answer: The Program Data Vector (PDV) is a logical area of memory in which SAS builds a data set, one observation at a time.
Functions of the PDV are as follows:
• It holds the single observation that is currently being built.
• When raw data is read from an external file, an input buffer for holding each record is also created at compile time.
• It contains two automatic variables: _N_ (counts the number of times the DATA step has iterated) and _ERROR_ (set to 1 when an error occurs during execution, 0 otherwise).
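
As a rough illustration of the two automatic variables, the sketch below writes _N_ and _ERROR_ to the log on every iteration. The data set name pdv_demo and the sample values are made up for this example.

```sas
data pdv_demo;
    input name $ score;
    /* Show the iteration counter and the error flag held in the PDV */
    put _n_= _error_= name= score=;
    datalines;
Ann 90
Bob 85
;
run;
```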

2) Why is the double trailing @@ used in INPUT statements?

Answer: A double trailing @@ at the end of an INPUT statement tells SAS to hold the current record across iterations of the DATA step, so that the next execution of the INPUT statement continues reading from the same record instead of moving to a new one. It is typically used when one input line contains several observations.
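
A minimal sketch of this behaviour, assuming a made-up input line that holds three name/score pairs. Each DATA step iteration reads one pair from the held record.

```sas
data scores;
    /* @@ holds the line so the next iteration keeps reading from it */
    input name $ score @@;
    datalines;
Ann 90 Bob 85 Cal 77
;
run;

proc print data=scores;
run;
```

With the double trailing @@ the single line yields three observations; without it, only the first pair would be read.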

3) Explain the difference between the NODUP and NODUPKEY options.

Answer: PROC SORT offers two options for removing duplicates from a table:
• NODUP
• NODUPKEY
The difference between these two options is as follows:

NODUPKEY
• Compares only the BY variables.
• Removes observations that have duplicate values for the variables listed in the BY statement.
• Syntax:
  PROC SORT DATA=readin NODUPKEY;
  BY variable-name;
  RUN;

NODUP
• Compares all the variables in the data set (each observation is compared with the previous one written).
• Identifies and eliminates duplicate observations.
• Syntax:
  PROC SORT DATA=readin NODUP;
  BY variable-name;
  RUN;
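
The contrast can be sketched with a small made-up table. The data set names readin, by_key and by_record are illustrative only, and the second step uses NODUPRECS, the full name of the NODUP option.

```sas
data readin;
    input id name $;
    datalines;
1 Ann
1 Ann
1 Bob
2 Cal
;
run;

/* NODUPKEY: keeps only the first observation for each value of id */
proc sort data=readin out=by_key nodupkey;
    by id;
run;

/* NODUPRECS (alias NODUP): drops an observation only when it matches the
   previous one on every variable, so sort by all variables to bring
   exact duplicates together */
proc sort data=readin out=by_record noduprecs;
    by id name;
run;
```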

Informat vs. Format

Informat
• Tells SAS how to read data into a SAS variable.
• Used to read, or input, data from external files.

Format
• Tells SAS how to display the values of a variable.
• Used to write the data.
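
A short sketch of the two roles, using a made-up variable visit_date: the informat controls how the raw text is read, and the format controls how the stored value is shown.

```sas
data dates;
    /* Informat: read the raw text as a month/day/year date */
    input visit_date mmddyy10.;
    /* Format: display the stored date value in DATE9. style */
    format visit_date date9.;
    datalines;
03/11/2014
;
run;

proc print data=dates;
run;
```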

4) Explain DATA _NULL_.
Answer: As the name suggests, DATA _NULL_ is a DATA step that does not create any data set.
It is used for:
• Creating macro variables.
• Writing output (for example, to the log or an external file) without creating a data set.
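
Both uses can be sketched in one step. The example assumes the SASHELP.CLASS sample data set is available, and the macro variable name n_students is made up.

```sas
data _null_;
    set sashelp.class end=last;
    /* Write a line to the log for each observation; no data set is created */
    put name= age=;
    /* On the last observation, store the row count in a macro variable */
    if last then call symputx('n_students', _n_);
run;

%put Number of students: &n_students;
```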

5) Explain the purpose of the SUBSTR function in SAS programming.

Answer: The SUBSTR function extracts a substring from a character variable. Given a start position and a length, it returns that many characters beginning at the start position.

Syntax: SUBSTR(char_var, start, length)
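
For instance, assuming a made-up phone number stored as text, the first three characters can be pulled out like this:

```sas
data _null_;
    phone = '415-555-0132';
    /* Three characters starting at position 1 */
    area_code = substr(phone, 1, 3);
    put area_code=;   /* writes area_code=415 to the log */
run;
```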


6) Name and briefly describe a few SAS character functions that are used for data cleaning.
Answer: A few SAS character functions that are used for data cleaning are listed below:
• COMPRESS(char_string) removes blanks, or other specified characters, from a string.
• TRIM(str) removes trailing blanks from a string.
• LOWCASE(char_string) converts all the characters in a string to lowercase.
• UPCASE(char_string) converts all the characters in a string to uppercase.
• COMPBL(str) converts multiple consecutive blanks to a single blank.
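
The functions above can be tried on one untidy value. The string and the variable names here are invented for the sketch.

```sas
data _null_;
    raw = '  John    SMITH  ';
    no_blanks = compress(raw);      /* all blanks removed: JohnSMITH */
    one_blank = compbl(raw);        /* each run of blanks becomes one blank */
    lower     = lowcase(raw);
    upper     = upcase(raw);
    tagged    = trim(raw) || '|';   /* trailing blanks dropped before the bar */
    put no_blanks= / one_blank= / lower= / upper= / tagged=;
run;
```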

8) Explain the purpose of the RETAIN statement.

Answer: As its name suggests, the RETAIN statement keeps a value once it has been assigned. By default, SAS resets variables created in the DATA step to missing at the start of each iteration; RETAIN tells SAS to carry the values over from the current iteration to the next instead.
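
A running total is a common use. This sketch assumes the SASHELP.CLASS sample data set; the variable name total is made up.

```sas
data running_total;
    set sashelp.class;
    /* Without RETAIN, total would be reset to missing on every iteration */
    retain total 0;
    total = total + weight;
run;
```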

9) What is the difference between MISSOVER and TRUNCOVER?

Answer:
MISSOVER: When the MISSOVER option is used on the INFILE statement, the INPUT statement does not jump to the next line when it reads a short line. When the INPUT statement reaches the end of the current record, any variables that have not been assigned a value are set to missing.

TRUNCOVER: The TRUNCOVER option behaves like MISSOVER, except that it assigns the raw data value to the variable even if the value is shorter than the length the INPUT statement expects. A partial value at the end of the record is used to fill the first unfilled variable.

In short, TRUNCOVER reads partial data that falls at the end of the record, whereas MISSOVER sets that value to missing.
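
The difference only shows up with short records, so this sketch first writes a small temporary file and then reads it both ways. The fileref rawtxt and the data set names are hypothetical.

```sas
filename rawtxt temp;

/* Write two records: one full (5 characters), one short (3 characters) */
data _null_;
    file rawtxt;
    put 'ABCDE';
    put 'ABC';
run;

/* MISSOVER: the short record cannot fill $5., so code is set to missing */
data with_missover;
    infile rawtxt missover;
    input code $5.;
run;

/* TRUNCOVER: the partial value ABC is kept */
data with_truncover;
    infile rawtxt truncover;
    input code $5.;
run;
```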

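Single Trailing @

It instructs SAS to hold a record in the input buffer so that the next INPUT statement in the same iteration of the DATA step can continue reading it. The record is released when the iteration ends.

Double Trailing @@

It instructs SAS to hold a record in the input buffer across multiple iterations of the DATA step.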
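
A single trailing @ is often used to test part of a record before reading the rest. In this sketch the data and the cut-off of 18 are made up.

```sas
data adults;
    /* Read age and hold the record for the next INPUT statement */
    input age @;
    /* Skip minors without reading the rest of the record */
    if age lt 18 then delete;
    /* Continue reading the same record */
    input name $;
    datalines;
25 Ann
12 Bob
40 Cal
;
run;
```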

10) What is the difference between PROC CONTENTS and PROC PRINT?

Answer: PROC PRINT lists the values of the variables in a SAS data set, which helps confirm that the data was read into SAS correctly.
PROC CONTENTS displays descriptive information about the data set, such as its variable names, types, and lengths.

11) In which phase are the PDV and the descriptor portion of the data set created?

Answer: In the compilation phase.

12) How do you print the PDV to the log?

Answer: Use the PUTLOG statement:
PUTLOG _ALL_;

How do you limit the number of decimal places shown by PROC MEANS?

Use the MAXDEC= option and set it to the number of decimal places that you prefer.
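
For example, assuming the SASHELP.CLASS sample data set, the statistics below are displayed with at most one decimal place:

```sas
proc means data=sashelp.class maxdec=1;
    var height weight;
run;
```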

Q9. What is the length assigned to the target variable by the SCAN function?

Ans. The SCAN function returns a given word from a character string, using either the default delimiters or delimiters that you specify. If the target variable has not already been given a length, the DATA step assigns it a length of 200.
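
One way to check this is to ask SAS for the variable's defined length with VLENGTH. The sample string and variable names are invented for the sketch.

```sas
data _null_;
    full_name = 'Ann Marie Lee';
    /* second has no LENGTH statement, so SCAN determines its length */
    second = scan(full_name, 2, ' ');
    /* VLENGTH returns the defined length of the variable, not of the value */
    len = vlength(second);
    put second= len=;
run;
```

Declaring the variable with a LENGTH statement before calling SCAN avoids the default.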
Q10. Explain the use of the TRANWRD function.

Ans. The TRANWRD function performs search and replace: it replaces all occurrences of a given substring in a character string. It does not remove trailing blanks from the target string or the replacement string.
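
A small sketch with a made-up sentence; the arguments are the source string, the text to find, and the replacement:

```sas
data _null_;
    text = 'Mrs. Smith met Mrs. Jones';
    /* Replace every occurrence of Mrs. with Ms. */
    updated = tranwrd(text, 'Mrs.', 'Ms.');
    put updated=;   /* writes updated=Ms. Smith met Ms. Jones */
run;
```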

Q11. Mention the methods to perform a “table lookup” in SAS.

Ans. The following are five methods to perform a “table lookup” in SAS:

• Match merging
• Format tables
• Direct access
• PROC SQL
• Arrays
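
As one illustration from the list, a format table can act as the lookup. This sketch assumes the SASHELP.CLASS sample data set; the format name $sexfmt and the variable sex_label are made up.

```sas
/* The format acts as the lookup table: code on the left, label on the right */
proc format;
    value $sexfmt 'M' = 'Male'
                  'F' = 'Female';
run;

data labelled;
    set sashelp.class;
    /* PUT looks up each code in the format and returns the label */
    sex_label = put(sex, $sexfmt.);
run;
```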

Explain the difference between One-to-One Merge and Match-Merge in SAS.

Ans. A one-to-one merge combines observations by position: the first observation from one data set is joined with the first observation from the other, and so on. No BY statement is used, so it does not matter whether the observations actually match.

For example, when merging an observation that contains an employee’s name and year with an observation that contains a date, time, and location for a conference, it does not matter which employee gets which time slot. In such a case, we use a one-to-one merge.

A match-merge, on the other hand, combines observations according to the values of one or more common variables named in a BY statement. It is used when observations must be paired by key, and it requires both data sets to be sorted (or indexed) by the BY variables.
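
The two kinds of merge differ only in the BY statement. The data sets names and slots below are made up, and both are already in id order.

```sas
data names;
    input id name $;
    datalines;
1 Ann
2 Bob
3 Cal
;
run;

data slots;
    input id room $;
    datalines;
1 A101
3 C303
;
run;

/* One-to-one merge: no BY statement, rows are paired by position */
data one_to_one;
    merge names slots;
run;

/* Match-merge: rows are paired by the value of id */
data matched;
    merge names slots;
    by id;
run;
```

In the one-to-one result Bob is paired with room C303 purely because both are second in their data sets; in the match-merge Bob has no room because no slot has id 2.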

What is the use of $BASE64X?

Ans: The $BASE64X format encodes character data as ASCII text using Base64.

Name statements that function at both compile and execution time.

OPTIONS, TITLE, FOOTNOTE

What is the difference between SAS functions and procedures?

A function takes its argument values from within a single observation of a SAS data set (across variables), whereas a procedure works on one variable value per observation (across all observations).
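
The contrast can be sketched with the MEAN function and PROC MEANS, assuming the SASHELP.CLASS sample data set; the variable name avg_hw is made up.

```sas
data row_stats;
    set sashelp.class;
    /* Function: averages its arguments within each observation */
    avg_hw = mean(height, weight);
run;

/* Procedure: averages one variable across all observations */
proc means data=sashelp.class mean;
    var height;
run;
```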
Name statements that are recognized at compile time only.

DROP, KEEP, RENAME, LABEL, FORMAT, INFORMAT, ATTRIB, WHERE, BY, RETAIN, LENGTH, ARRAY.

Differentiate ‘CEIL’ and ‘FLOOR’.

Ans: The CEIL function returns the smallest integer that is greater than or equal to its argument, while FLOOR does the opposite and returns the largest integer that is less than or equal to its argument.
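
A quick check in the log, using an arbitrary value:

```sas
data _null_;
    up   = ceil(11.85);    /* 12 */
    down = floor(11.85);   /* 11 */
    put up= down=;
run;
```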

Which statements take effect at execution time only?

Answer: INFILE, INPUT, OUTPUT, and CALL routines.

What does the trace option do?

Answer: ODS TRACE is used to find the names of the output objects that a procedure creates, which is useful when a procedure produces several of them. It is switched on and off with:
ODS TRACE ON;
ODS TRACE OFF;
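
As a sketch, wrapping a procedure in the two statements writes the name of each output object to the log. PROC UNIVARIATE and the SASHELP.CLASS sample data set are used here only as an example.

```sas
ods trace on;

proc univariate data=sashelp.class;
    var height;
run;

ods trace off;
```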

Difference between SCAN and SUBSTR.

SCAN extracts a word from a value in which the words are separated by delimiters. SUBSTR extracts a portion of a value starting at a specified position. SUBSTR is best used when we know the exact position of the substring to extract from a character value.
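
Both functions can pull the same piece out of a value, one by word and one by position. The identifier below is made up.

```sas
data _null_;
    id = 'NY-2024-0042';
    /* SCAN: second word when the value is split on hyphens */
    year_by_word = scan(id, 2, '-');
    /* SUBSTR: four characters starting at position 4 */
    year_by_pos = substr(id, 4, 4);
    put year_by_word= year_by_pos=;   /* both are 2024 */
run;
```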

When do you use PROC MEANS rather than PROC FREQ?

Ans: We use PROC MEANS for numeric variables, whereas we use PROC FREQ for categorical variables.

What is the function of the OUTPUT statement in a SAS program?

Answer: In a DATA step, the OUTPUT statement writes the current observation to the output data set. In a procedure such as CAPABILITY, you can use the OUTPUT statement to save summary statistics in a SAS data set. This information can then be used to create customized reports or to save historical information about a process.

You can use options in the procedure's OUTPUT statement to
• specify the statistics to save in the output data set,
• specify the name of the output data set, and
• compute and save percentiles not automatically computed by the CAPABILITY procedure.
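
The same idea can be sketched with PROC MEANS, which is used here instead of CAPABILITY only because it is part of Base SAS. The output data set and variable names are made up, and the SASHELP.CLASS sample data set is assumed.

```sas
proc means data=sashelp.class noprint;
    var height;
    /* Save the mean and the 90th percentile in a data set */
    output out=height_stats mean=avg_height p90=p90_height;
run;

proc print data=height_stats;
run;
```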
What is debugging?

Answer: Debugging is a technique for testing the program logic. In SAS it can be done with the help of the DATA step debugger.

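How to debug SAS macros?

The following system options can be used to debug SAS macros:
• MPRINT writes the SAS statements generated by a macro to the log.
• MLOGIC traces the flow of macro execution in the log.
• SYMBOLGEN shows what each macro variable reference resolves to.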
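
A minimal sketch of switching the options on around a macro call. The macro show_rows is hypothetical, and the SASHELP.CLASS sample data set is assumed.

```sas
options mprint mlogic symbolgen;

%macro show_rows(ds);
    proc print data=&ds (obs=3);
    run;
%mend show_rows;

%show_rows(sashelp.class)

/* Switch the options off again to keep later logs short */
options nomprint nomlogic nosymbolgen;
```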

Describe the basic structure of a SAS program.

Ans: A SAS program consists of DATA steps and PROC steps.
• A DATA step reads and manipulates the data.
• A PROC step analyzes the data and reports on it.

What is the difference between the COUNT and COUNTW functions?

The COUNT function counts the number of times a particular substring appears in a specified character string, while the COUNTW function counts the number of words in the string.
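
Both functions applied to one made-up sentence:

```sas
data _null_;
    text = 'the cat sat on the mat';
    n_the   = count(text, 'the');   /* substring occurrences: 2 */
    n_words = countw(text, ' ');    /* blank-delimited words: 6 */
    put n_the= n_words=;
run;
```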

Explain the RUN statement.

The RUN statement tells SAS to execute the preceding DATA or PROC step. Because it creates a step boundary, it also makes the SAS log easier to read.

Describe the ALTER= data set option.

The ALTER= data set option assigns an alter password to a SAS file. The password prevents users from replacing or deleting the file unless they supply it, and it also gives access to a read-protected or write-protected file.

What do PROC PRINT and PROC CONTENTS do?

PROC PRINT displays the contents of a SAS data set and helps confirm that the data was read into SAS correctly, while PROC CONTENTS displays information about a SAS data set.

What are SAS informats?

SAS informats are used to read, or input, data from external files known as flat files (ASCII files, text files, or sequential files). The informat tells SAS how to read the data into SAS variables.

15) Name the categories in which SAS informats are placed.

SAS informats are placed in three categories:

• Character informats: $INFORMATw.
• Numeric informats: INFORMATw.d
• Date/time informats: INFORMATw.

16) What does the CATX function do?

The CATX function concatenates character strings, removes leading and trailing blanks from each, and inserts a separator between them.
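
For example, with two made-up values padded with blanks and a hyphen as the separator:

```sas
data _null_;
    first = '  Ann ';
    last  = ' Lee  ';
    /* Blanks are stripped from each item before the separator is inserted */
    full = catx('-', last, first);
    put full=;   /* writes full=Lee-Ann */
run;
```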

SAS File Extensions

SAS programs, data files, and the results of programs are saved with various extensions in Windows.
*.sas - the SAS code file, which can be edited using the SAS Editor or any text editor.
*.log - the SAS log file, which contains information such as errors, warnings, and data set details for a submitted SAS program.
*.mht / *.html - the SAS results file.
*.sas7bdat - the SAS data file, which contains a SAS data set including variable names, labels, and the results of calculations.

SUBSTRN
This function extracts a substring using a start position and a length. If the length is not given, it extracts all the characters up to the end of the string.
Syntax
SUBSTRN('stringval',p1,p2)
Following is the description of the parameters used:
stringval is the value of the string variable.
p1 is the start position of the extraction.
p2 is the number of characters to extract.
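
A short sketch with an invented value, once with a length and once without:

```sas
data _null_;
    code = 'ABCDEFG';
    middle = substrn(code, 3, 2);   /* two characters from position 3: CD */
    tail   = substrn(code, 5);      /* from position 5 to the end: EFG */
    put middle= tail=;
run;
```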

Arrays in SAS are used to store and retrieve a series of values using an index value. The index represents the location in a reserved memory area.
Syntax
In SAS an array is declared by using the following syntax:
ARRAY ARRAY-NAME(SUBSCRIPT) ($) VARIABLE-LIST ARRAY-VALUES
In the above syntax:
ARRAY is the SAS keyword to declare an array.
ARRAY-NAME is the name of the array, which follows the same rules as variable names.
SUBSCRIPT is the number of values the array is going to store.
($) is an optional parameter, used only if the array is going to store character values.
VARIABLE-LIST is the optional list of variables which are the placeholders for array values.
ARRAY-VALUES are the actual values that are stored in the array. They can be declared here or can be read from a file or data line.
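
As a sketch of the syntax in use, the array below groups two existing numeric variables so that one loop can process both. It assumes the SASHELP.CLASS sample data set; the array name measures is made up.

```sas
data rounded;
    set sashelp.class;
    /* measures(1) refers to height, measures(2) to weight */
    array measures(2) height weight;
    do i = 1 to dim(measures);
        measures(i) = round(measures(i), 1);
    end;
    drop i;
run;
```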
SAS Variable Names

Variables in SAS represent columns in the SAS data set. Variable names follow these rules:
• They can be at most 32 characters long.
• They cannot include blanks.
• They must start with a letter (A through Z) or an underscore (_).
• They can include numbers, but not as the first character.
• They are not case sensitive.

SAS Statements
Let us now discuss the SAS statements:
• Statements can start anywhere and end anywhere. A semicolon marks the end of each statement.
• Several SAS statements can be on the same line, with each statement ending with a semicolon.
• Spaces can be used to separate the components of a SAS statement.
• SAS keywords are not case sensitive.
• A step is normally closed with a RUN statement, so a SAS program usually ends with one.

1. DO Index
   The loop continues from the start value to the stop value of the index variable.
2. DO WHILE
   The loop continues until the WHILE condition becomes false.
3. DO UNTIL
   The loop continues until the UNTIL condition becomes true.

SAS – DO Index Loop

The DO Index loop uses an index variable with a start value and an end value. The SAS statements are executed repeatedly until the final value of the index variable is reached.
Syntax
DO indexvariable = initialvalue TO finalvalue;
. . . SAS statements . . . ;
END;
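
A minimal sketch that writes one observation per value of the index; the data set and variable names are made up.

```sas
data squares;
    do i = 1 to 5;
        square = i * i;
        /* Write one observation for each value of i */
        output;
    end;
run;
```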

SAS – DO WHILE Loop

The DO WHILE loop uses a WHILE condition, which is checked at the top of the loop. The SAS statements are executed repeatedly until the WHILE condition becomes false.
Syntax
DO WHILE (variable condition);
. . . SAS statements . . . ;
END;
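
A sketch with invented figures: an amount grows by 10 percent per year, and the loop stops once it is no longer below 200.

```sas
data growth;
    amount = 100;
    year = 0;
    /* LT is the mnemonic operator for "less than" */
    do while (amount lt 200);
        year = year + 1;
        amount = amount * 1.10;
        output;
    end;
run;
```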

SAS – DO UNTIL Loop

The DO UNTIL loop uses an UNTIL condition, which is checked at the bottom of the loop, so the statements always run at least once. The SAS statements are executed repeatedly until the UNTIL condition becomes true.

Syntax
DO UNTIL (variable condition);
. . . SAS statements . . . ;
END;
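
A minimal sketch of a countdown; the names are made up.

```sas
data countdown;
    n = 3;
    /* The condition is tested after each pass through the loop */
    do until (n = 0);
        output;
        n = n - 1;
    end;
run;
```

The data set receives the values 3, 2 and 1; the loop ends once n reaches 0.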

1. IF Statement
   An IF statement consists of a condition. If the condition is true, the specific data is fetched.
2. IF-THEN-ELSE Statement
   An IF statement followed by an ELSE statement, which executes when the Boolean condition is false.
3. IF-THEN-ELSE-IF Statement
   An IF statement followed by an ELSE statement, which is in turn followed by another IF-THEN statement.
4. IF-THEN-DELETE Statement
   An IF statement whose condition, when true, deletes the specific data from the observations.
Syntax
The basic syntax for creating an IF-THEN-ELSE statement in SAS is:
IF (condition) THEN result1;
ELSE result2;

An IF-THEN-ELSE-IF statement consists of a Boolean expression with a THEN statement, which is in turn followed by an ELSE statement containing another IF-THEN.
Syntax
The basic syntax for creating an IF-THEN-ELSE-IF statement in SAS is:
IF (condition1) THEN result1;
ELSE IF (condition2) THEN result2;
ELSE IF (condition3) THEN result3;

The conditions are checked in order, and the result for the first condition that evaluates to true is applied to the observation.

An IF-THEN-DELETE statement consists of a Boolean expression followed by a THEN DELETE statement.
Syntax
The basic syntax for creating an IF-THEN-DELETE statement in SAS is:
IF (condition) THEN DELETE;
If the condition evaluates to true, the respective observation is deleted and is not written to the output data set.
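
The forms above can be combined in one DATA step. This sketch assumes the SASHELP.CLASS sample data set; the age cut-offs and the variable name group are made up.

```sas
data graded;
    set sashelp.class;
    length group $6;
    /* IF-THEN-DELETE: drop the youngest students */
    if age lt 12 then delete;
    /* IF-THEN-ELSE-IF: the first true condition decides the group */
    else if age le 13 then group = 'junior';
    else group = 'senior';
run;
```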

Function Categories
Depending on their usage, the functions in SAS are categorized as follows:
• Mathematical
• Date and Time
• Character
• Truncation
• Miscellaneous

Truncation Functions
These are the functions used to truncate numeric values.
Examples
The following SAS program shows the use of truncation functions.
data trunc_functions;
   /* Smallest integer greater than or equal to the value */
   ceil_ = CEIL(11.85);
   /* Largest integer less than or equal to the value */
   floor_ = FLOOR(11.85);
   /* Integer portion of a number */
   int_ = INT(32.41);
   /* Round off to the nearest integer */
   round_ = ROUND(5621.78);
run;
proc print data = trunc_functions noobs;
run;

Input Date           Date width   Informat
03/11/2014           10           mmddyy10.
03/11/14             8            mmddyy8.
December 11, 2012    20           worddate20.
14mar2011            9            date9.
14-mar-2011          11           date11.
14-mar-2011          15           anydtdte15.

Reading ASCII (Text) Data Set

These are files that contain data in text format. The data is usually delimited by a space, but SAS can also handle other types of delimiters. Let’s consider an ASCII file containing employee data. We read this file using the INFILE statement available in SAS.
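
To keep the sketch self-contained, it first writes a small space-delimited file to a temporary location and then reads it back with INFILE. The fileref emp, the variable names, and the employee records are all made up.

```sas
filename emp temp;

/* Create a small space-delimited text file to read */
data _null_;
    file emp;
    put '1 Rick 623.3';
    put '2 Dan 515.2';
run;

/* Read the text file: INFILE names the source, INPUT names the fields */
data employees;
    infile emp;
    input empid name $ salary;
run;

proc print data=employees;
run;
```

For a file that uses a different delimiter, such as a comma, the DLM= option on the INFILE statement names it.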

/* REFFILE is a fileref that must already point to the file to import */
PROC IMPORT DATAFILE=REFFILE
   DBMS=XLS
   OUT=WORK.IMPORT;
   GETNAMES=YES;
RUN;
PROC PRINT DATA=WORK.IMPORT;
RUN;

Following are the two prerequisites for merging the data sets:
• The input data sets must have at least one common variable to merge on.
• The input data sets must be sorted by the common variable(s) that will be used to merge on.
Syntax
The basic syntax for the MERGE and BY statements in SAS is:

MERGE Data-Set-1 Data-Set-2;
BY Common-Variable;

Following is the description of the parameters used:
• Data-Set-1, Data-Set-2 are data set names written one after another.
• Common-Variable is the variable based on whose matching values the data sets will be merged.

Subsetting Observations
In this method, we extract only a few observations from the entire data set.
Syntax
The syntax for subsetting observations is:
IF Var Condition THEN DELETE;
Following is the description of the parameters used:
• Var is the name of the variable based on whose value the observations will be deleted using the specified condition.
PROC FREQ can then be used to check which observations were selected for the new data set.
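
A sketch of the pattern, assuming the SASHELP.CLASS sample data set; the height cut-off of 60 and the data set name are made up.

```sas
data tall_students;
    set sashelp.class;
    /* Observations that meet the condition are left out of the new data set */
    if height lt 60 then delete;
run;

/* Check what was kept */
proc freq data=tall_students;
    tables sex;
run;
```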

Syntax
The basic syntax for a sort operation on a data set in SAS is:
PROC SORT DATA=original-dataset OUT=sorted-dataset;
BY variable-name;
RUN;
Following is the description of the parameters used:
• variable-name is the column name on which the sorting happens.
• original-dataset is the name of the data set to be sorted.
• sorted-dataset is the name of the data set after it is sorted.

Syntax
The basic syntax for using the ODS statement in SAS is:
ODS outputtype
PATH = path-name
FILE = filename-and-path
STYLE = StyleName
;
PROC some-proc
;
ODS outputtype CLOSE;

Following is the description of the parameters used:
• PATH specifies the output location and is used in the case of HTML output. For other types of output, we include the path in the filename.
• STYLE specifies one of the built-in styles available in the SAS environment.
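
A minimal sketch of the pattern for HTML output. The file name class_report.html is made up and is written to the current directory, which is assumed to be writable; HTMLBlue is one of the styles shipped with SAS, and the SASHELP.CLASS sample data set is assumed.

```sas
/* Open the HTML destination */
ods html file='class_report.html' style=HTMLBlue;

proc print data=sashelp.class;
run;

/* Close the destination so the file is completed */
ods html close;
```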

PROC EXPORT
PROC EXPORT is a built-in SAS procedure. It is used to export SAS data sets by writing the data into files of different formats. In this way SAS can share data sets from its environment with other applications, by creating files which can be read by different operating systems. Here we see the writing of SAS data sets using PROC EXPORT along with the options DLM and DBMS.

Syntax
The basic syntax for writing the procedure in SAS is:
PROC EXPORT
DATA=libref.SAS-data-set (SAS-data-set-options)
OUTFILE="filename"
DBMS=identifier LABEL (REPLACE);

Following is the description of the parameters used:
• SAS-data-set is the name of the data set being exported.
• SAS-data-set-options is used to specify a subset of columns to be exported.
• filename is the name of the file into which the data is written.
• identifier is used to mention the delimiter that will be written into the file.
• LABEL option is used to mention the name of the variables written to the file.

Example:
proc export data=sashelp.cars
   outfile='/folders/myfolders/sasuser.v94/TutorialsPoint/car_data.csv'
   dbms=csv;
run;

Syntax
Macro variables are declared with the syntax shown below.
%LET macro-variable-name = value;
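
For instance, a value can be stored once and referenced later with an ampersand. The macro variable name cutoff is made up, and the SASHELP.CLASS sample data set is assumed.

```sas
%let cutoff = 14;

proc print data=sashelp.class;
    /* &cutoff is replaced by 14 before the step runs */
    where age >= &cutoff;
run;
```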

Macro Programs
A macro is a group of SAS statements that is referred to by a name and can be used anywhere in a program by calling that name. It starts with a %MACRO statement and ends with a %MEND statement.
Syntax
A macro program is defined and called with the syntax given below.
/* Creating a macro program */
%MACRO MacroName (Param1, Param2, ..., Paramn);
Macro Statements;
%MEND;
/* Calling a macro program */
%MacroName (Value1, Value2, ..., Valuen);
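
A small sketch of defining and calling a macro with two parameters. The macro name print_top is hypothetical, and the SASHELP.CLASS sample data set is assumed.

```sas
/* Define a macro that prints the first n rows of a data set */
%macro print_top(ds, n);
    proc print data=&ds (obs=&n);
    run;
%mend print_top;

/* Call it with a data set name and a row count */
%print_top(sashelp.class, 5)
```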