Module 2 Introduction To SPSS - Word
MODULE NO.: 02
TITLE: INTRODUCTION TO SPSS
OVERVIEW: SPSS stands for “Statistical Package for the Social Sciences”
and was first launched in 1968. Since IBM acquired SPSS in 2009, it has
been officially known as IBM SPSS Statistics, but most users still refer
to it simply as “SPSS”.
LEARNING OUTCOMES: The students will learn the basic terms and commands
of SPSS software and how to use it in data analysis.
Introduction to SPSS
1. Data Transformation
2. Data Examination
3. Descriptive Statistics
4. Reliability Tests
5. Correlation Tests
6. Regression Tests
7. T-Tests
8. ANOVA
SPSS offers scores of statistical and mathematical functions along with
flexible data handling. It can read several types of data (for example
numeric, alphanumeric, binary, dollar, date, and time), and it also has
data manipulation utilities.
1. Once SPSS (or a new file within the software) is open, a dialogue box
appears that gives the user six options to choose from:
a. Run a Tutorial: Runs the SPSS tutorial, which explains the basic tools
and tests that can be used within SPSS.
b. Type in data: Choose this option when you need to conduct a new
analysis, for example one based on a survey.
c. Run an existing query: Opens an existing query and runs it in SPSS
together with its data, which may have been obtained from other sources
or from a colleague.
d. Create a new query using Database Wizard: Lets the user create a new
query or a new data set using the Database Wizard.
e. Open an existing data source: Opens a data file saved in a folder so
that it can be used in SPSS.
f. Open another type of file: Similar to option “e”, except that it opens
a different type of file, such as an MS Excel file.
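The two "open" options above can also be written as SPSS command syntax. The following is a minimal sketch; the file paths and the sheet name are hypothetical and must be replaced with your own.

```spss
* Open an existing SPSS data file (hypothetical path).
GET FILE='C:\data\survey.sav'.

* Open an Excel workbook instead (hypothetical path and sheet name).
GET DATA
  /TYPE=XLSX
  /FILE='C:\data\survey.xlsx'
  /SHEET=NAME 'Sheet1'
  /READNAMES=ON.
EXECUTE.
```

READNAMES=ON tells SPSS to treat the first row of the sheet as variable names.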
2. For this example, choose Type in data in order to start a new data
set. This is the option used most often, since each new statistical
assignment usually needs its own data set. Once you select it, two views
are available: DATA VIEW and VARIABLE VIEW.
Data View: Displays the actual data (i.e. the responses from the
respondents who participated in the survey) for the variables that you
have created (see Figure 2 given below).
Data View in SPSS
Variable View
Width: Defines the width of the values in the “Data View” sheet of SPSS.
The width affects how long words such as “arachnophobia” or
“cardiovascular disease” are represented in tables or graphs. Generally,
the Width is kept at “8”.
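Values: Assigns a label to each numeric code of a variable, for example
on a five-point agreement scale:
Strongly Agree = 1
Agree = 2
Neutral = 3
Disagree = 4
Strongly Disagree = 5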
Align: Signifies the alignment in the box in Data View. This can be left,
right or centre.
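The Variable View properties described above can also be set with command syntax. This is a sketch only: the variable name satisfaction and its label are hypothetical, the variable is assumed to exist already, and the code 1 = Strongly Agree is assumed from the pattern of the scale.

```spss
* Width 8 with no decimals for a numeric variable.
FORMATS satisfaction (F8.0).

* A descriptive label that appears in output tables.
VARIABLE LABELS satisfaction 'Overall satisfaction'.

* Labels for each numeric code.
VALUE LABELS satisfaction
  1 'Strongly Agree'
  2 'Agree'
  3 'Neutral'
  4 'Disagree'
  5 'Strongly Disagree'.

* Alignment of the values in Data View.
VARIABLE ALIGNMENT satisfaction (CENTER).
```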
When you open the SPSS program, you will see a blank spreadsheet in Data
View. If you already have another dataset open but want to create a new one,
click File > New > Data to open a blank spreadsheet.
You will notice that each of the columns is labeled “var.” The column names will
represent the variables that you enter in your dataset. You will also notice that
each row is labeled with a number (“1,” “2,” and so on). The rows will represent
cases that will be a part of your dataset. When you enter values for your data in
the spreadsheet cells, each value will correspond to a specific variable (column)
and a specific case (row).
1. Click the Variable View tab. Type the name for your first variable under
the Name column. You can also enter other information about the variable,
such as the type (the default is “numeric”), width, decimals, label, etc. Type
the name for each variable that you plan to include in your dataset. In this
example, I will type “School_Class” since I plan to include a variable for the
class level of each student (i.e., 1 = first year, 2 = second year, 3 = third
year, and 4 = fourth year). I will also specify 0 decimals since my variable
values will only include whole numbers. (The default is two decimals.)
2. Click the Data View tab. Any variable names that you entered in Variable
View will now be included in the columns (one variable name per column).
You can see that School_Class appears in the first column in this example.
3. Now you can enter values for each case. In this example, cases represent
students. For each student, enter a value for their class level in the cell that
corresponds to the appropriate row and column. For example, the first
person’s information should appear in the first row, under the variable
column School_Class. In this example, the first person’s class level is “2,”
the second person’s is “1,” the third person’s is “1,” the fourth person’s is
“3,” and so on.
4. Repeat these steps for each variable that you will include in your dataset.
Don't forget to periodically save your progress as you enter data.
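The same data entry can be sketched in command syntax, which is useful when you want a record of exactly what was typed. The example below uses the four class levels from the steps above; the save path is hypothetical.

```spss
* Define School_Class as a whole-number variable and read four cases.
DATA LIST FREE / School_Class (F1.0).
BEGIN DATA
2
1
1
3
END DATA.

* Attach the class-level labels described in step 1.
VALUE LABELS School_Class
  1 'First year'
  2 'Second year'
  3 'Third year'
  4 'Fourth year'.

* Show the cases in the output window, then save (hypothetical path).
LIST.
SAVE OUTFILE='C:\data\students.sav'.
```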
ID Variables versus Row Numbers
Now that you know how to enter data, it is important to discuss a special type of
variable called an ID variable. When data are collected, each piece of information
is tied to a particular case. For example, perhaps you distributed a survey as part
of your data collection, and each survey was labeled with a number (“1,” “2,” etc.).
In this example, the survey numbers essentially represent ID numbers: numbers
that help you identify which pieces of information go with which respondents in
your sample. Without these ID numbers, you would have no way of tracking which
information goes with which respondent, and it would be impossible to enter the
data accurately into SPSS.
When you enter data into SPSS, you will need to make sure that you are entering
values for each variable that correspond to the correct person or object in your
sample. It might seem like a simple solution to use the conveniently labeled rows
in SPSS as ID numbers; you could enter your first respondent’s information in the
row that is already labeled “1,” the second respondent’s information in the row
labeled “2,” etc. However, you should never rely on these pre-numbered rows for
keeping track of the specific respondents in your sample. This is because the
numbers for each row are visual guides only—they are not attached to specific
lines of data, and thus cannot be used to identify specific cases in your data. If
your data become rearranged (e.g., after sorting data), the row numbers will no
longer be associated with the same case as when you first entered the data.
Again, the row numbers in SPSS are not attached to specific lines of data
and should not be used to identify certain cases. Instead, you should create a
variable in your dataset that is used to identify each case—for example, a variable
called StudentID.
Here is an example that illustrates why using the row numbers in SPSS as case
identifiers is flawed:
Let’s say that you have entered values for each person for
the School_Class variable. You relied on the row numbers in SPSS to correspond
to your survey ID numbers. Thus, for survey #1, you entered the first respondent’s
information in row 1, for survey #2 you entered the second person’s information in
row 2, and so on. Now you have entered all of your data.
But suppose the data get rearranged in the spreadsheet view. A common way of
rearranging data is by sorting—and you may very well need to do this as you
explore and analyze your data. Sorting will rearrange the rows of data so that the
values appear in ascending or descending order. If you right-click on any variable
name, you can select “Sort Ascending” or “Sort Descending.” In the example
below, the data are sorted in ascending order on the values for the
variable School_Class.
But what happens if you need to view a specific respondent’s information? Or
perhaps you need to double-check your entry of the data by comparing the
original survey to the values you entered in SPSS. Now that the data have been
rearranged, there is no way to identify which row corresponds to which
participant/survey number.
The main point is that you should not rely on the row numbers in SPSS since they
are merely visual guides and not part of your data. Instead, you should create a
specific variable that will serve as an ID for each case so that you can always
identify certain cases in your data, no matter how much you rearrange the data. In
the sample data file, the variable ids acts as the ID variable.
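One way to create such an ID variable is sketched below. It assumes the cases are still in their original entry order, so that the current row position matches the survey number; StudentID is the example name used above.

```spss
* Create an ID from the current case order. Run this before any sorting.
COMPUTE StudentID = $CASENUM.
FORMATS StudentID (F8.0).
EXECUTE.

* Sorting rearranges the rows, but StudentID stays with each case.
SORT CASES BY School_Class (A).

* Restore the original entry order whenever needed.
SORT CASES BY StudentID (A).
```

Because StudentID is stored as part of the data, it remains attached to the same respondent no matter how often the file is sorted.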
Sometimes you may need to add new cases or delete existing cases from your
dataset. For example, perhaps you notice that one observation in your data was
accidentally left out of the dataset. In that situation, you would refer to the original
data collection materials and enter the missing case into the dataset (as well as
the associated values for each variable in the dataset). Alternatively, you may
realize that you have accidentally entered the same case in your dataset more
than once and need to remove the extra case.
INSERTING A CASE
1. In Data View, click a row number or individual cell below where you want
your new row to be inserted.
2. Click Edit > Insert Cases, or right-click the row number and select
“Insert Cases”.
3. A new, blank row will appear above the row or cell you selected. Values for
each existing variable in your dataset will be missing (indicated by either a
“.” or a blank cell) for your newly created case since you have not yet
entered this information.
DELETING A CASE
1. In the Data View tab, click the case number (row) that you wish to delete.
This will highlight the row for the case you selected.
2. Press Delete on your keyboard, or right-click on the case number and
select “Clear”. This will remove the entire row from the dataset.
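A case can also be removed with command syntax. The sketch below assumes the dataset has an ID variable named StudentID and that the case to drop has the hypothetical ID 7.

```spss
* Keep every case except the one whose StudentID is 7.
SELECT IF (StudentID NE 7).
EXECUTE.
```

Note that SELECT IF removes the case permanently from the active dataset, and that it also drops any case for which the condition cannot be evaluated (for example, a case with a missing StudentID), so save a backup copy first.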
Sometimes you may need to add new variables or delete existing variables from
your dataset. For example, perhaps you are in the process of creating a new
dataset and you must add many new variables to your growing dataset.
Alternatively, perhaps you decide that some variables are not very useful to your
study and you decide to delete them from the dataset. Or, similarly, perhaps you
are creating a smaller dataset from a very large dataset in order to make the
dataset more manageable for a research project that will only use a subset of the
existing variables in the larger dataset.
INSERTING A VARIABLE
1. In the Data View window, click the name of the column to the right of
where you want your new variable to be inserted.
2. Click Edit > Insert Variable, or right-click the column name and
select “Insert Variable”.
New variables will be given a generic name (e.g. VAR00001). You can enter a
new name for the variable on the Variable View tab. You can quick-jump to the
Variable View screen by double-clicking on the generic variable name at the top
of the column. Once in the Variable View, under the column “Name,” type a new
name for the variable name you wish to change. You should also define the
variable's other properties (type, label, values, etc.) at this time.
All values for the newly created variable will be missing (indicated by a “.” in each
cell in Data View, by default) since you have not yet entered any values. You can
enter values for the new variable by clicking the cells in the column and typing the
values associated with each case (row).
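You can also insert a variable using command syntax. The lines below are a
sketch that assumes the new variable is called newvar and that it should
be placed directly after the ID variable ids:
/*Create the new variable with all values missing.*/
COMPUTE newvar = $SYSMIS.
EXECUTE.
/*Reorder the variables to place the new variable in the desired position.*/
MATCH FILES
  FILE = *
  /KEEP = ids newvar ALL.
EXECUTE.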
DELETING A VARIABLE
1. In the Data View tab, click the column name (variable) that you wish to
delete. This will highlight the variable column.
2. Press Delete on your keyboard, or right-click on the selected variable and
click “Clear.” The variable and associated values will be removed.
Alternatively, you can delete a variable through the Variable View window:
1. Click on the row number corresponding to the variable you wish to delete.
This will highlight the row.
2. Press Delete on your keyboard, or right-click on the row number
corresponding to the variable you wish to delete and click "Clear".
You can also delete variables using command syntax.
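A minimal sketch of that syntax follows; newvar is a placeholder for whichever variable you want to remove.

```spss
* Remove the variable newvar from the active dataset.
DELETE VARIABLES newvar.
```

Several variables can be listed after DELETE VARIABLES, separated by spaces.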
Data Analysis
SPSS can open many kinds of data files and display them, along with their
metadata, in two sheets in its Data Editor window. So how do you analyze
your data in SPSS? One option is to use SPSS’ elaborate menu options.
For instance, if our data contain a variable holding respondents’ incomes
for 2010, we can compute the average income by navigating to Analyze >
Descriptive Statistics > Descriptives, as shown below.
Doing so opens a dialog box in which we select one or many variables and one or
several statistics we'd like to inspect.
After clicking OK, a new window opens: SPSS’ Output Viewer window. It holds
a nice table with all statistics on all variables we chose. The screenshot below
shows what it looks like.
For non-SPSS users, the look and feel of SPSS’ Output Viewer window
probably comes closest to a PowerPoint slide holding items such as blocks
of text, tables and charts.
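The menu steps described above can also be run as command syntax. This sketch assumes the income variable is named income, which is a hypothetical name.

```spss
* Mean, standard deviation, minimum and maximum of income.
DESCRIPTIVES VARIABLES=income
  /STATISTICS=MEAN STDDEV MIN MAX.
```

Running this produces the same kind of table in the Output Viewer as the dialog box does.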
SPSS Reporting
SPSS Output items, typically tables and charts, are easily copy-pasted into other
programs. For instance, many SPSS users use a word processor such as MS
Word, OpenOffice or Google Docs for reporting. Tables are usually copied in rich
text format, which means they'll retain their styling such as fonts and borders. The
screenshot below illustrates the result.
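As an alternative to copy-pasting, output can be exported to a word-processor file with command syntax. The following is a sketch only: the file path is hypothetical, and the available subcommands should be checked against the Command Syntax Reference for your SPSS version.

```spss
* Export the visible Output Viewer items to a Word document.
OUTPUT EXPORT
  /CONTENTS EXPORT=VISIBLE
  /DOC DOCUMENTFILE='C:\reports\module2_output.doc'.
```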
ACTIVITY:
Input the following data in SPSS. Transfer your output to Microsoft Word.
Send to STATGrad@gmail.com
Prepared by: