SPSS GUIDE: Constructing An Index With A Compute Statement
This outline provides an example of how to use a COMPUTE STATEMENT to create
an INDEX for social disorder from variables in the NOSCJ (1995) dataset. This
guide should be taken only as a general example of how to construct an index using a
compute statement. Creating an index for a different concept would likely require
somewhat different choices than those described here (e.g., including different
variables, dropping different categories, and so on). Nonetheless, you would still
need to follow the same general procedure: 1) dropping unusable
categories, 2) dealing with the missing-value categories, and 3) developing a compute
statement that sums the variables to create the index.
Social Disorder
As discussed in lecture, one could argue that the following variables in
the NOSCJ 1995 dataset represent elements of social disorder:
PROBLEM: NEIGHBORHOOD DOGS LOOSE (N2),
PROBLEM: UNSUPERVISED YOUTH (N5), PROBLEM: TOO MUCH NOISE (N6),
and PROBLEM: PEOPLE DRUNK/HIGH IN PUBLIC (N7). While we could debate whether
they are sufficient as a measure of social disorder, we can all agree that each is
associated with social disorder. That is, they have face validity.
Since social disorder is a rather broad concept, we would do better to
combine these variables into a single composite score (i.e., an index) so that we can
begin to cover the full range of meanings included in the concept,
rather than using just one of these variables to measure social disorder.
Creating an Index for Social Disorder
First, we need to run a recode statement for each variable to be used to create the index
(i.e., 4 separate recode statements: one for each variable).
The purpose of these recode statements is to allow us to drop those values or attributes
within the variable that cannot be used for our index. Notice that there are actually 6
possible responses that respondents could choose from when asked about these
neighborhood conditions: 1 = serious problem; 2 = somewhat problem; 3 = minor
problem; 4 = not a problem; 8 = Dk; 9 = Rf. "Dk" refers to those respondents who didn't
know about these neighborhood conditions, and "Rf" refers to respondents who refused
to answer these questions. While the first four responses can be used for our index, the
two remaining responses (8 = Dk and 9 = Rf) pose a few problems.
Since there are very few, if any, respondents who chose these responses (8 or 9), and
these responses are not on the same numerical scale as the other attributes of the variable
(i.e., the first four values of the scale are ordinal, and these last two are nominal), we
need to create new variables that do not contain these responses. Additionally, notice that
the original coding scheme assigns lower scores to more serious problems. To make
our index scores more intuitive (i.e., so that high scores reflect a more serious
problem), we also need to reverse the coding scheme in addition to dropping the
unusable categories.
In secondary data analysis (what you are currently doing), such procedures are often
referred to as "cleaning" the data.
So, let's create four new "clean" variables with four recode statements.
In the recode statements, drop the values of 8 and 9 (see previous SPSS guides if you
need a review on how to run a recode statement) and, at the same time, reverse the
direction of the values so that higher numbers reflect higher degrees of social disorder
(i.e., 1 becomes 4, 2 becomes 3, 3 becomes 2, and 4 becomes 1). You can name these new
variables whatever you like (for example: N2_r, N5_r, N6_r, N7_r). In my scheme, the
"r" refers to recode, and the original variable name tells me which variable the new
variable was created from.
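
The recode logic itself can be illustrated outside SPSS. The short Python sketch below is only an illustration of the mapping described above, not SPSS syntax: the names REVERSE_MAP and recode_item are hypothetical, the responses are made-up values rather than NOSCJ data, and representing a dropped category as None is an assumption of this sketch.

```python
# Illustration only: mirrors the recode described in the guide, outside SPSS.
# REVERSE_MAP and recode_item are hypothetical names chosen for this sketch.
REVERSE_MAP = {1: 4, 2: 3, 3: 2, 4: 1}


def recode_item(value):
    """Return the reverse-coded score, or None when the response is
    8 (Dk), 9 (Rf), or any other code that is not on the 1-4 scale."""
    return REVERSE_MAP.get(value)


# Made-up responses for one variable (for example, N2); not real data.
n2 = [1, 4, 8, 2, 9, 3]
n2_r = [recode_item(v) for v in n2]
print(n2_r)  # [4, 1, None, 3, None, 2]
```

A respondent who originally answered 1 (serious problem) now scores 4, and the Dk and Rf responses no longer carry a numeric value.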
Second, we need to have SPSS run a compute statement that combines our newly
created, clean variables into a single composite score. This compute
statement takes a respondent's score on each variable and sums them. It then
divides the result by the total number of variables included. In effect, we are
creating an average score (derived from their responses to these questions)
for each respondent. This final score is the composite score (i.e., the index
score), and the compute statement is applied to all respondents in the dataset.
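
As a worked illustration of that arithmetic, here is a minimal Python sketch (again, not SPSS syntax). The function name index_score and the two respondents are hypothetical, and returning no score when any item is missing is an assumption of this sketch rather than something stated in this guide.

```python
# Illustration only: the sum-then-divide arithmetic behind the index.
# index_score is a hypothetical name; the scores below are made up.
def index_score(items):
    """Sum the recoded item scores and divide by the number of items.
    Assumption: if any item is missing (None), no index score is returned."""
    if any(v is None for v in items):
        return None
    return sum(items) / len(items)


# Recoded scores in the order N2_r, N5_r, N6_r, N7_r.
respondent_a = [4, 3, 2, 3]
respondent_b = [1, None, 2, 4]

print(index_score(respondent_a))  # 3.0, i.e. (4 + 3 + 2 + 3) / 4
print(index_score(respondent_b))  # None, because one item is missing
```

Because every item runs from 1 to 4, the averaged index also runs from 1 to 4, with higher values indicating more social disorder.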
Here's the procedure to follow in SPSS to create this compute statement:
From the menu bar, click on Transform.
From the drop-down menu, click on Compute.
Notice that the Compute Variable window just appeared (it has a list of
all the variables in the dataset in the bottom left-hand window, a calculator
in the middle, two empty windows at the top, and so on).
The first thing you need to do is name the variable that you are going to
create. Since you are going to create an index for social disorder, why
not call this variable SOCDIS? Type this variable name in the Target
Variable window.
Did you notice what happened? The Type & Label icon just below that
window became active. Go ahead and click on that icon and type in a
variable label for the variable you are creating. It's an arbitrary label, but
you might want to call it: social disorder index (created from N2, N5, N6,
and N7). Once you have typed in a label, click on Continue to go back to the
original Compute Variable window.