1. Descriptive Analysis and Histograms 1.1 Recode 1.2 Select Cases & Split File 2. Reliability
Reliability statistics: Cronbach's Alpha > .80 (.60 or .70 acceptable) => the reliability of the scale is high.
Report reliability analysis:
Item statistics: (mean) scale from x to y: average difficulty (x+y)/2 (low p values = difficult items); around .5 is medium, < .1 very difficult, > .9 very easy.
Item-total statistics: items with a "corrected item-total correlation" below .30 are discarded or revised.
Above .30: each of the items reflects sufficiently what the scale as a whole measures.
Cronbach's alpha if item deleted: the overall alpha should be higher than all listed values -> OK.
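The alpha rule of thumb above can be checked by hand. A minimal pure-Python sketch of Cronbach's alpha; the items and respondent scores are invented for illustration:

```python
from statistics import pvariance

def cronbachs_alpha(items):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = len(items)                                    # number of items
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_var = sum(pvariance(item) for item in items)
    return k / (k - 1) * (1 - item_var / pvariance(totals))

# Three hypothetical 5-point items, four respondents (columns = respondents)
items = [[2, 4, 4, 5],
         [3, 4, 5, 5],
         [2, 3, 4, 5]]
alpha = cronbachs_alpha(items)
print(round(alpha, 2))  # > .80 => high reliability by the rule above
```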
1. Analyze->Correlate->Bivariate: options: exclude cases listwise (correlation table)
2.1 Factor Analysis
2. Analyze -> Dimension Reduction -> Factor: check: (Initial solution, Coefficients, Significance levels, Inverse, KMO), (Correlation matrix, Based on eigenvalue, Scree plot), (Loading plots, Varimax, Rotated solution), (Save as variables: Regression; Sorted by size)
Inverse of Correlation Matrix: when the off-diagonal values are significantly smaller than the values on the diagonal -> the correlation structure is well suited for a factor analysis.
Cross-Loadings:
Evaluating the cross-loadings: two or more factor loadings > .3 or .4 (large difference > .2: OK; lower than .2: exclude).
Rotated Component Matrix: sorted by size.
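The cross-loading rule above can be written as a small helper; the cutoffs (.3 and .2) are the ones stated above, the function name is my own:

```python
def evaluate_cross_loading(loadings, cutoff=0.3, min_gap=0.2):
    """Rule above: if an item loads > cutoff on two or more factors,
    keep it only when the two largest loadings differ by more than min_gap."""
    ranked = sorted((abs(l) for l in loadings), reverse=True)
    high = [l for l in ranked if l > cutoff]
    if len(high) < 2:
        return "keep"                      # no cross-loading problem
    return "keep" if ranked[0] - ranked[1] > min_gap else "exclude"

print(evaluate_cross_loading([0.75, 0.10]))  # loads on one factor only
print(evaluate_cross_loading([0.55, 0.50]))  # difference .05 < .2
```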
Calculating Sum Scales or Factor Scores:
COMPUTE X_new=y+z+r .
VARIABLE LABELS X_new 'label it'.
**Sort Cases By clus_1. Split File separate by clus_1. (Do a Frequencies analysis per cluster.) Split File off.
Report Cluster Dendrogram: the largest increase in heterogeneity is found when stepping from a two-cluster to a one-cluster solution. This suggests a two-cluster solution.
Do the analysis again (Ward's Method; Between-Groups Linkage is for outliers) and give the number of clusters in Save: single solution!!
Scatter Plot: even with more than 2 clusters, first use the normal scatter without any cluster; then do it with the known cluster number, but still x and y axis only.
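The "largest increase in heterogeneity" rule for choosing the number of clusters can be sketched from the agglomeration schedule coefficients; the coefficients below are invented:

```python
def suggest_clusters(coefficients):
    """coefficients[i] = fusion coefficient at agglomeration stage i+1
    (for n cases there are n-1 stages). Stop before the largest jump."""
    jumps = [b - a for a, b in zip(coefficients, coefficients[1:])]
    biggest = jumps.index(max(jumps))
    n_cases = len(coefficients) + 1
    return n_cases - (biggest + 1)   # clusters remaining before the big jump

# Five cases: the jump from stage 3 to stage 4 (2 -> 1 clusters) is the largest
print(suggest_clusters([1.0, 1.5, 2.2, 9.8]))  # suggests 2 clusters
```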
Symmetric Measures: Cramér's V (not quadratic) and Contingency Coefficient (larger than 2x2), significant. Value < 0.3 => the relationship is not very strong. (Phi is for 2x2 tables.)
Report Contingency:
Directional Measures: Lambda: quantifies the strength of the relationship between two variables by assessing how much the prediction of one variable improves when a second variable is used to improve this prediction. A value of zero does not necessarily mean there is no relationship between the two variables at all. Report: knowledge about (row "Dependent") reduces the error in predicting the other by (value x%). Knowledge about (row "intensity Dependent") does the same for the prediction of the other (y%).
!! The problem with the calculation of significance levels in the case of Lambda is due to the large sample size.
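For reference, Cramér's V follows directly from the chi-square statistic of the crosstab; a stdlib-only sketch (the table is invented):

```python
from math import sqrt

def cramers_v(table):
    """Cramér's V = sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    chi2 = sum((table[i][j] - row_tot[i] * col_tot[j] / n) ** 2
               / (row_tot[i] * col_tot[j] / n)
               for i in range(len(table)) for j in range(len(table[0])))
    k = min(len(table), len(table[0]))
    return sqrt(chi2 / (n * (k - 1)))

print(cramers_v([[20, 5], [5, 20]]))   # < 0.3 would mean a weak relationship
```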
3. Optimize Model
To optimize the model when the dependent variable is skewed to the right: COMPUTE wage_1=ln(wage).
When the residual plots are U-shaped and the relationship is not linear (X²): COMPUTE x_r=x*x -> do the regression again including x_r.
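A quick way to see why the ln transform helps with right skew: the mean sits well above the median before the transform and much closer after. The wage values and the crude skew indicator below are invented for illustration:

```python
from math import log
from statistics import mean, median, pstdev

def skew_indicator(values):
    """Crude right-skew indicator: (mean - median) / sd; > 0 suggests right skew."""
    return (mean(values) - median(values)) / pstdev(values)

wages = [10, 12, 13, 15, 18, 25, 40, 90]   # invented, right-skewed wage data
log_wages = [log(w) for w in wages]         # SPSS: COMPUTE wage_1=ln(wage).

# The indicator shrinks after the transform
print(skew_indicator(wages), skew_indicator(log_wages))
```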
Coefficients: Constant (can be non-significant in a bivariate regression); report the variable's sig.
If the variable increases by 1 unit, the dependent variable changes by B units (with an ln-transformed dependent variable: by approximately B*100%).
Confidence interval: when it does not include 0, the regression coefficient is significantly different from 0.
Unstandardized coefficients (B) show the absolute change of the dependent variable if the independent variable increases by one unit.
The beta coefficients are the standardized coefficients of the regression; their magnitudes reflect their relative importance in predicting.
Report Regression:
Model Summary: R square = the model explains x% of the variance in Y; the higher, the better the fit. (y=B0+B1X1+U)
Coefficients t-test: both coefficients are sig.
Prerequisites of Gauss-Markov: 1. Linearity in the coefficients (linear regression model). 2. Random sample. 3. Zero conditional mean of the error term (the mean values of the residuals do not differ visibly from 0 across the range of standardized estimated values; check the scatterplot). 4. Sample variation in the explanatory variables (the scatter plot shows variance). 5. Homoscedasticity: when the residuals are trumpet-shaped they do not have constant variance and there is heteroscedasticity (when this assumption is not met, the model shall be improved and not interpreted further). If the residuals do not show a wave-like pattern, there is independence of errors; normal distribution of the errors and residuals shall be considered as well.
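The zero-conditional-mean check (assumption 3) and the R square report can be illustrated with a hand-rolled bivariate OLS; the data are invented:

```python
from statistics import mean

def ols(x, y):
    """Bivariate OLS: y = b0 + b1*x + u, fitted by least squares."""
    mx, my = mean(x), mean(y)
    b1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
          / sum((xi - mx) ** 2 for xi in x))
    b0 = my - b1 * mx
    return b0, b1

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = ols(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
r_square = 1 - (sum(r ** 2 for r in residuals)
                / sum((yi - mean(y)) ** 2 for yi in y))
# slope, residual mean (0 by construction in-sample), R square
print(round(b1, 2), round(mean(residuals), 10), round(r_square, 3))
```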
1. Recode gender (1=0)(2=1) Into Female. (the number of dummies is one less than the number of categories)
RECODE grade_n (2=0) (3=1) (4=0) (5=0) (6=0) INTO dgrade_3.
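The dummy rule (one dummy fewer than the number of categories) as a small helper; the function and variable names are my own:

```python
def dummy_code(values, reference):
    """One 0/1 dummy per category, except the reference category."""
    categories = sorted(set(values))
    categories.remove(reference)
    return {f"d_{c}": [1 if v == c else 0 for v in values] for c in categories}

gender = [1, 2, 1, 2, 2]           # 1 = male, 2 = female (as in the RECODE above)
female = dummy_code(gender, reference=1)
print(female)                       # two categories -> one dummy
```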
Box plot + Levene's test (as Levene's test is not significant, equal variances can be assumed).
Report One-Way ANOVA:
1. Tests of between-subjects effects: Corrected Model = sig => the model as a whole is sig; constant = intercept.
• When it is sig: there is a main effect of the independent variable on the dependent variable. The value of Adjusted R Square*100% shows the % of the variance in the dependent variable around the grand mean that can be predicted by the model. (categorical)
• When not sig: holding all other variables constant, there is no significant difference between X and the not-significant one.
2. Partial Eta Squared: (the higher the better.)
The categorical variable explains x% of the previously unexplained variation.
For the grand mean (intercept), partial η² is not interesting and therefore will not be interpreted.
3. Multiple comparisons: post hoc comparisons (whether the groups differ sig).
Groups x and y have a sig difference; if not => the non-sig ones are in one group.
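The between-subjects F test and eta squared reported above boil down to a sum-of-squares decomposition; a stdlib sketch with invented groups:

```python
from statistics import mean

def one_way_anova(groups):
    """Return the F statistic and eta squared for a one-way design."""
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = sum(len(g) for g in groups) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    eta_sq = ss_between / (ss_between + ss_within)  # share of variance explained
    return f, eta_sq

f, eta_sq = one_way_anova([[1, 2, 3], [2, 3, 4], [7, 8, 9]])
print(round(f, 1), round(eta_sq, 2))
```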
4. Profile Plot: profile plots are of particular interest if you want to get a quick overview of the relative differences between the mean values, or if there are several factors involved (and hence interactions). If the line is not horizontal => there is a main effect of the independent variable on the dependent variable.
Interaction = there is a dependency between the two factors (no parallel lines).
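For a 2x2 design, "no parallel lines" has a one-line check: the interaction contrast of the cell means (the means below are invented):

```python
def interaction_contrast(cell_means):
    """(m11 - m12) - (m21 - m22); 0 means the profile lines are parallel."""
    (m11, m12), (m21, m22) = cell_means
    return (m11 - m12) - (m21 - m22)

print(interaction_contrast([[2, 4], [5, 7]]))   # 0: parallel lines, no interaction
print(interaction_contrast([[2, 4], [7, 5]]))   # non-zero: interaction present
```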
**Boxplot:Graphs->Legacy Dialogs->Boxplots->Clustered.
9.1. Two-Way ANOVA
Analyze -> General Linear Model -> Univariate -> Plots: horizontal axis and separate lines: add 2 times (x*y, y*x).