Spss 1

Uploaded by

Mohamed Awad
Copyright
© © All Rights Reserved
Introduction

SPSS
Statistical Package for the Social Sciences
Introduction
SPSS
• It offers both a graphical (menu-driven) interface and a syntax (command) interface
• It is used for managing, analyzing, and presenting data
Learning Objectives

In these three sessions, you will learn to find your way around
the program, conduct statistical analyses, create
tables and charts, and prepare your output for
incorporation into external files such as spreadsheets,
PowerPoint presentations, and word-processor documents.
Some Definitions

• Population (universe) is the collection of things under
consideration
• Sample is a portion of the population selected for
analysis
• Statistic is a summary measure computed to describe a
characteristic of the sample
Some Definitions
• Mean (average) is the sum of the values divided by the
number of values
• Median is the midpoint of the values (50% above; 50%
below) after they have been ordered from the smallest to
the largest, or the largest to the smallest
• Mode is the value among all the values observed that
appears most frequently
• Range is the difference between the smallest and
largest observations in the sample
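
The four measures above can be sketched in a few lines of Python using only the standard library. The HbA1c readings are hypothetical values invented for illustration; they do not come from these slides.

```python
from statistics import mean, median, multimode

# Hypothetical sample of HbA1c readings (%), invented for illustration.
hba1c = [7.2, 8.1, 8.1, 9.4, 10.0, 6.8, 8.1]

sample_mean = mean(hba1c)                # sum of the values / number of values
sample_median = median(hba1c)            # middle value once the data are ordered
sample_modes = multimode(hba1c)          # most frequent value(s), returned as a list
sample_range = max(hba1c) - min(hba1c)   # largest minus smallest observation

print(f"mean={sample_mean:.2f}  median={sample_median}  "
      f"mode={sample_modes}  range={sample_range:.1f}")
```

`multimode` returns a list because a sample can have more than one most-frequent value.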
Some Definitions
Types of data

Categorical data:
There are basically two kinds of data in this group:

1. Nominal data (named categories), e.g. gender (male/female),
ethnicity (Malay, Chinese, Indian), outcome (dead/alive), etc.
Nominal data are summarised by percentages.

2. Ordinal data (ordered categories), e.g. tumour staging (Stage
1, 2, 3, 4), disease severity (mild, moderate, severe), Likert scale
(5-point scale, 1-5), etc. Ordinal data are summarised by the
median value.
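
As a minimal sketch of these two summaries, the Python below reports percentages for a nominal variable and the median category for an ordinal one. The data, the `rank` mapping, and all variable names are hypothetical choices made for this illustration.

```python
from collections import Counter
from statistics import median_low

# Hypothetical data, invented for illustration.
gender = ["male", "female", "female", "male", "female"]
severity = ["mild", "severe", "moderate", "mild", "moderate"]

# Nominal data: report each category as a percentage of all observations.
counts = Counter(gender)
percentages = {category: 100 * n / len(gender) for category, n in counts.items()}

# Ordinal data: map each category to its rank, then take the median rank.
# median_low always returns an observed rank, so it maps back to a category.
rank = {"mild": 1, "moderate": 2, "severe": 3}
label = {r: name for name, r in rank.items()}
median_severity = label[median_low(rank[s] for s in severity)]

print(percentages)
print("median severity:", median_severity)
```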
Some Definitions

• “Scale” (continuous) data:
Continuous data are sometimes referred to as interval data.
These data take the form of a range of numbers, and may
or may not have decimals, e.g. age, HbA1c, weight,
height, haemoglobin level, etc. Continuous data are
summarised by the mean and standard deviation (SD).
Some Definitions

Variance and Standard Deviation


• Variance is the average of the squared deviations from the
mean. It measures the dispersion of a sample (how closely
the observations cluster around the mean [average])
• Standard Deviation, the square root of the variance,
measures the variation in the observed values in the same
units as the data
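
A short worked example may make the relationship between the two measures concrete. It is a sketch using hypothetical readings and assumes the sample form of the variance (dividing by n - 1); the result is checked against Python's standard `statistics` module.

```python
import math
from statistics import mean, stdev, variance

# Hypothetical HbA1c readings (%), invented for illustration.
hba1c = [7.0, 8.0, 9.0, 10.0, 11.0]

m = mean(hba1c)
squared_deviations = [(x - m) ** 2 for x in hba1c]

# Sample variance: sum of squared deviations divided by (n - 1).
sample_variance = sum(squared_deviations) / (len(hba1c) - 1)

# Standard deviation: square root of the variance, in the units of the data.
sample_sd = math.sqrt(sample_variance)

# The hand calculation agrees with the library functions.
assert math.isclose(sample_variance, variance(hba1c))
assert math.isclose(sample_sd, stdev(hba1c))
print(f"variance={sample_variance:.2f}  SD={sample_sd:.2f}")
```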
Some Definitions
Dependent and independent variables:
• Dependent variable is the variable of interest (the outcome)
• Independent variable is the grouping variable
Suppose you want to find out whether HbA1c differs by gender or ethnicity.
Then HbA1c is the dependent variable, and gender and ethnicity are
independent variables.
Summarised data
A summary of the data, plus a graphical display (e.g. a graph or
scatter plot), is a very useful way of getting a sense of your data
before you embark on formal statistical analysis (so-called
“eyeballing the data”).
Some Definitions

P-Values and Q-Values


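• The p-value is the probability of obtaining a result at least as extreme as
the one observed, assuming the null hypothesis is true. It is not the
probability that the null hypothesis is true.
• If the p-value is less than 0.05, the null hypothesis is rejected.
• If the p-value is greater than 0.05, the null hypothesis is not rejected
(this does not prove that it is true).
When the p-value is greater than 0.05, the data are consistent with nothing
really happening: the observed differences could be due to chance, and the
result is reported as “no statistically significant difference”.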
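
The 0.05 decision rule can be written as a small helper. This is only a sketch: the function name `decide` is hypothetical, and treating a p-value of exactly 0.05 as not significant is an assumption, since the slides do not cover that case.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the conventional cut-off to a p-value (hypothetical helper)."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    # Assumption: p exactly equal to alpha is treated as not significant.
    if p_value < alpha:
        return "reject the null hypothesis"
    return "do not reject the null hypothesis"


# Illustrative inputs: the p-values quoted in the deck's t-test and ANOVA examples.
for p in (0.478, 0.012):
    print(p, "->", decide(p))
```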
Some Definitions
• Hypothesis is an assumed statement that the research analysis
sets out to support or reject.

• Confidence is the complement of the p-value cut-off (significance
level); these slides call it the q-value. If the cut-off is 5%,
the q-value (confidence) is 95%.
Some Definitions
• Qualitative research is based on the quality (non-numeric)
attributes of the data.
• Quantitative research is based on the quantity (numeric)
attributes of the data.
Some Definitions
• Inferential statistics:
The process of drawing conclusions about a population from a sample.
It is the type of statistics used for hypothesis testing.
Suppose you have observed that diabetic patients of a certain ethnic group
appear to have poorer diabetic control. Rather than stating that
“Malay diabetic patients have poor diabetic control”, we state
that “in the population of all diabetic patients, there is no difference
in glycaemic control by ethnicity” (this is the so-called Null
Hypothesis). By drawing a representative sample of diabetic
patients from the population, you then seek to disprove the Null
Hypothesis.
Some Definitions
INFERENTIAL STATISTICS
PARAMETRIC TESTS
t-test
If you want to find out whether HbA1c differs by gender, the statistical
output can appear as follows: mean HbA1c for males and females is 8.7%
(SD=1.9) and 8.9% (SD=2.3) respectively, t=-0.711, df=158,
p=0.478. As HbA1c is a continuous variable (and presumably
normally distributed), we use the t-test to compare the means of
two groups (males vs females). Since the p-value is more than 0.05 (the
conventional cut-off for statistical significance), we interpret the
result as no statistically significant difference, or “no real difference
in HbA1c between male and female diabetic patients”.
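
To show where t and df come from, here is a minimal sketch of the independent-samples t statistic with a pooled variance. The equal-variance form and the function name `pooled_t` are assumptions, and the readings are hypothetical; the slide's own figures cannot be reproduced because it does not give the group sizes.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Return (t, df) for two independent samples, assuming equal variances."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / standard_error, df


# Hypothetical HbA1c readings (%), invented for illustration.
males = [8.1, 9.0, 7.5, 10.2, 8.7]
females = [8.4, 9.6, 8.9, 7.8, 10.3]

t, df = pooled_t(males, females)
print(f"t={t:.3f}, df={df}")
```

The p-value is then read from the t distribution with `df` degrees of freedom. SPSS reports it directly, and a statistics library such as SciPy (`scipy.stats.ttest_ind`) can return it alongside t.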
Some Definitions
INFERENTIAL STATISTICS
PARAMETRIC TESTS
ANOVA
To compare the means of three or more groups (Malays, Chinese and
Indians), we use ANOVA (F test): mean HbA1c in Malays,
Chinese and Indians is 9.6% (SD=2.5), 8.4% (SD=2.0) and 8.7%
(SD=1.9) respectively, F=4.524, p=0.012. In this case, there is a
statistically significant difference (p is less than 0.05) in HbA1c
among these three ethnic groups.
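
The F statistic in a one-way ANOVA is the ratio of between-group to within-group variability. The sketch below computes it from raw data; the function name `one_way_f` and the readings are hypothetical and do not reproduce the figures quoted above.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)

    # Between-group sum of squares: distance of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: distance of each value from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical HbA1c readings (%) for three groups, invented for illustration.
groups = [
    [9.8, 10.4, 8.9, 9.5],
    [8.2, 8.6, 8.0, 8.9],
    [8.8, 8.5, 9.1, 8.3],
]
f_stat, df_between, df_within = one_way_f(groups)
print(f"F={f_stat:.3f}, df=({df_between}, {df_within})")
```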
Some Definitions
INFERENTIAL STATISTICS
NON-PARAMETRIC TESTS
Statistical tests that require either no assumptions or very few assumptions
about a population’s distribution

Chi-square test
The chi-square test is used to compare frequencies (counts) in two or more
groups, for example in a cross-tabulation of gender (or ethnicity) against
HbA1c categories.
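
As a rough sketch, the chi-square statistic compares each observed count in a cross-tabulation with the count expected if the two variables were unrelated. The function name `chi_square` and the counts are hypothetical, chosen only for illustration.

```python
def chi_square(table):
    """Return (chi-square statistic, df) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Hypothetical counts: rows are gender, columns are HbA1c category
# (controlled, poorly controlled). Invented for illustration.
observed = [
    [30, 50],  # male
    [28, 52],  # female
]
statistic, df = chi_square(observed)
print(f"chi-square={statistic:.3f}, df={df}")
```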
Some Definitions
• Cross-Tabulation:
A technique for organizing data by groups, categories, or classes, thus
facilitating comparisons; a joint frequency distribution of observations on two
or more sets of variables
  • Analyze data by groups or categories
  • Compare differences
  • Contingency table
  • Percentage cross-tabulations
• Frequency table:
A simple tabulation that indicates the frequency with which respondents give a
particular answer
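
Both tabulations can be sketched with Python's `collections.Counter`. The records and variable names below are hypothetical; in SPSS the same tables come from the Frequencies and Crosstabs procedures.

```python
from collections import Counter

# Hypothetical records of (gender, diabetic control), invented for illustration.
records = [
    ("male", "good"), ("female", "poor"), ("female", "good"),
    ("male", "poor"), ("female", "poor"), ("male", "poor"),
]

# Frequency table: how often each answer occurs for a single variable.
frequency = Counter(control for _, control in records)

# Cross-tabulation: joint frequency of two variables, one cell per pair.
crosstab = Counter(records)

genders = sorted({g for g, _ in records})
controls = sorted({c for _, c in records})
for g in genders:
    row_total = sum(crosstab[(g, c)] for c in controls)
    # Row percentages give the "percentage cross-tabulation".
    cells = ", ".join(
        f"{c}: {crosstab[(g, c)]} ({100 * crosstab[(g, c)] / row_total:.0f}%)"
        for c in controls
    )
    print(f"{g}: {cells}")
```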
SPSS Windows and Files
Window          File suffix   Function

Data Editor     .sav          Define, enter, and edit data, and run
                              statistical analyses.

Output Viewer   .spv          Contains the results of all statistical
                              analyses and graphical displays of data.

Syntax Editor   .sps          Compose SPSS commands and submit them to
                              the SPSS processor. This window is activated
                              when you click Paste.
Population and Sample

Population: use parameters to summarize features

Sample: use statistics to summarize features

Inference: conclusions about the population are drawn from the sample


Different Shapes of Distributions
