Educational Statistics
IN EDUCATIONAL MANAGEMENT
Learning Objectives:
1. define statistics;
2. discuss the aim of statistics;
3. describe some useful vocabulary in statistics;
4. differentiate the different fields of statistics;
5. describe the kinds of data needed in educational management;
6. identify various kinds of variables; and,
7. distinguish between nominal, ordinal, interval and ratio scales.
variable
Introduction
Definition of Statistics
Fields of Statistics
Most statisticians agree that there are two primary fields of statistics as
applied to education. The first field involves the description of data. Statistical
techniques which are used to describe data are referred to as descriptive
statistics. Descriptive statistics are utilized to summarize sets of numerical data
such as test scores, ages, length of service, years of schooling, and economic
status. Using statistics reduces the time and space needed to describe data.
The second field allows the researcher to draw better inferences as to
whether a phenomenon which is observed in a relatively small number of
individuals (sample) can be generalized to a larger number of individuals
(population). This is called inferential statistics.
In inferential statistics, one is frequently concerned with relationships
between and among variables. The educator attempts to discover relationships
among such variables as mental ability, performance or achievement, study
Basic Statistical Concepts and Tools in Educational Management 3
[Introduction to Statistical Concepts]
habits, multiple intelligence etc. so that after establishing relationships, he may
use the results to improve school programs.
Thus, inferential statistics serves an important function in the design of
educational research. The researcher should organize his studies carefully so
that the data yielded from the study can be analyzed meaningfully.
Classification of Variables
Data for analysis result from the measurement of one or more variables.
Depending upon the variables and the way in which they are measured, different
kinds of data result, representing different scales of measurement.
It is important to know which type of scale is represented by your data
since different statistics are appropriate for different scales of measurement. A
characteristic may be measured using nominal, ordinal, interval and ratio scales.
The resulting data are termed nominal, ordinal, interval and ratio data.
Nominal Scale
Interval Scale
An ordinal scale not only classifies subjects but also ranks them in terms
of the degree to which they possess a characteristic of interest. In other words,
an ordinal scale puts the subjects in order from highest to lowest, from most to
least. Ordinal scales are typically seen in questions that call for ratings of quality
(e.g. excellent, very good, good, fair, poor, very poor) and agreement (e.g.
strongly agree, agree, disagree, strongly disagree).
Although ordinal scales indicate that some subjects are higher or lower
than others, they do not indicate how much higher or how much better. That is,
intervals between ranks are not equal: the difference between rank 1 and rank 2
is not necessarily the same as the difference between rank 2 and rank 3, as the
example below illustrates.
Rank Weight (in lbs.)
1 170
2 165
3 158
4 150
5 145
6 140
7 138
8 137
9 135
10 130
Ratio Scale
Subject Score
1 X1 8
2 X2 9
3 X3 5
4 X4 7
5 X5 6
The way to use the summation sign to indicate the sum of all five X's is:
$$\sum_{i=1}^{5} X_i$$
This notation is read as follows: sum the values of X from X1 through X5.
The index i (shown just under the Σ sign) indicates which values of X are to be
summed. The index i takes on values beginning with the value to the right of the
“=” sign (1 in this case) and continues sequentially until it reaches the value
above the Σ sign (5 in this case). Therefore i takes on the values 1, 2, 3, 4 and 5,
and the values of X1, X2, X3, X4 and X5 are summed (8 + 9 + 5 + 7 + 6 = 35).
Examples:
Given:
X 1 3 4 8 10
Y 2 5 7 9 12
$\sum X = 1 + 3 + 4 + 8 + 10 = 26$
$\sum X^2 = 1 + 9 + 16 + 64 + 100 = 190$
$\sum Y = 2 + 5 + 7 + 9 + 12 = 35$
$\sum Y^2 = 4 + 25 + 49 + 81 + 144 = 303$
$\sum XY = (1)(2) + (3)(5) + (4)(7) + (8)(9) + (10)(12) = 237$
$(\sum Y)^2 = (35)^2 = 1225$
$\sum X \sum Y = (26)(35) = 910$
$\sum X \sum Y^2 = (26)(303) = 7878$
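The sums in the examples above can be checked with a few lines of Python. This is only an illustrative sketch; the variable names are chosen here for readability and are not part of the module.

```python
# Paired observations from the example above.
X = [1, 3, 4, 8, 10]
Y = [2, 5, 7, 9, 12]

sum_x = sum(X)                               # sum of X
sum_x_sq = sum(x ** 2 for x in X)            # sum of squared X values
sum_y = sum(Y)                               # sum of Y
sum_y_sq = sum(y ** 2 for y in Y)            # sum of squared Y values
sum_xy = sum(x * y for x, y in zip(X, Y))    # sum of cross-products
square_of_sum_y = sum_y ** 2                 # (sum of Y) squared
product_of_sums = sum_x * sum_y              # (sum of X)(sum of Y)
sum_x_times_sum_y_sq = sum_x * sum_y_sq      # (sum of X)(sum of squared Y)

print(sum_x, sum_x_sq, sum_y, sum_y_sq, sum_xy)
print(square_of_sum_y, product_of_sums, sum_x_times_sum_y_sq)
```

Note the difference between the sum of squares (square each value, then add) and the square of the sum (add first, then square); the two are rarely equal.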
Reading Assignment:
Links: ebmlibrarian.wetpaint.com/thread/508093
barnettinternational.com/EducationalServices_Seminar.as
px?s=6725&id
www.statsoft.com/textbook/esc.html
E-Journals/E-Books
PUP website: infotrac.galegroup.com/itweb/pup
Password: powersearch
Exercises/Written Assignments:
Find:
1. x
2. x 2
3. y
4. y 2
5. xy
6. y
2
7. x2
8. x y
9. x y 2
10. x y
2
References/Bibliography:
Learning Objectives:
non-probability sampling
Reading Assignment:
Links: http://www.stat.yale.edu/Courses/1997-98/101/sample.htm
http://www.abs.gov.au/websitedbs/D3310116.NSF/0/116e0f
93f17283eb4a2567ac00213517?OpenDocument
E-Journals/E-Books
PUP website: infotrac.galegroup.com/itweb/pup
Password: powersearch
Exercises/Written Assignments
References/Bibliography
Blommers, Paul J. Statistical Methods in Psychology and Education.
Lanham University Press. 1987
Average
mode
LEARNER
Introduction
the Greek capital letter sigma which tells us that the subscripted variables are to
be added. Therefore, it directs us to sum the elements that appear to the right of
the sign, starting with X1 and then proceeding in order to Xn. Thus,
$$\sum_{i=1}^{n} X_i = X_1 + X_2 + X_3 + X_4 + \dots + X_n$$
From these ideas, we can express the formula for the sample mean as:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Example 1:
a. The data given below is the total number of hours lost due to tardiness
and absences of employees in a company in a given year. Find the
mean.
Solution:
$$\bar{x} = \frac{x_1 + x_2 + \dots + x_{12}}{n} = \frac{415}{12}$$
$\bar{x}$ = 34.58 hours lost per month
THE WEIGHTED ARITHMETIC MEAN
$$\bar{X} = \frac{W_1 X_1 + W_2 X_2 + \dots + W_n X_n}{W_1 + W_2 + \dots + W_n} = \frac{\sum W_i X_i}{\sum W_i}$$
1 5
2 6
3 11
4 14
5 20
6 12
7 12
8 8
TOTAL 88
Solution:
$$\bar{x} = \frac{5(1) + 6(2) + 11(3) + 14(4) + 20(5) + 12(6) + 12(7) + 8(8)}{88} = \frac{426}{88} \approx 4.84$$
$\bar{x} \approx 5$ is the average household size
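As a rough check of the weighted mean, the sketch below treats each household size as the value and its frequency as the weight, using the table above. The names are illustrative assumptions. The computation gives about 4.84, which rounds to the 5 members reported in the text.

```python
# Household size (value) and number of households (weight) from the table.
household_sizes = [1, 2, 3, 4, 5, 6, 7, 8]
frequencies = [5, 6, 11, 14, 20, 12, 12, 8]

# Weighted sum: each household size multiplied by how often it occurs.
weighted_sum = sum(w * x for w, x in zip(frequencies, household_sizes))
total_weight = sum(frequencies)

weighted_mean = weighted_sum / total_weight
print(f"weighted mean = {weighted_sum}/{total_weight} = {weighted_mean:.2f}")
```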
Disadvantages:
1. It is easily affected by the presence of out-lying or extreme values in the
group data.
2. It cannot be calculated if one or more items in the data are missing.
3. It cannot be computed from a frequency distribution that contains “open-
ended” intervals.
4. It is also an inappropriate average when the distribution is markedly
skewed, since skewness increases the error of the midpoint assumption.
Thus, the mean computed from grouped data may differ markedly
from the mean computed from ungrouped data.
5. It cannot be easily interpreted.
MEDIAN
The median is defined as the value of the middle item in a set of items.
Half of the observations are above and half are below the median. It is denoted
by the symbol Md. We can determine the median by using the following steps:
Arrange the items from lowest to highest or vice versa.
Count to the middlemost value. The median is the middlemost
value for an odd number of observations and the average of the two
middle values for an even number of observations.
It is an appropriate measure of average for data in ordinal scale of
measurement.
Example 1.
Using the example on total hours lost in a company, find the median.
Step 1. Sort the data according to ascending order of values
x1 = 20
x2 = 23
x3 = 24
x4 = 27
x5 = 30
x6 = 32
x7 = 37
x8 = 37
x9 = 40
x10 = 42
x11 = 48
x12 = 55
$$Md = \frac{x_{n/2} + x_{(n/2)+1}}{2} = \frac{x_6 + x_7}{2} = \frac{32 + 37}{2} = \frac{69}{2} = 34.5 \text{ hours lost}$$
Interpretation:
This means that in half of the months of the year the company lost fewer than
34.5 hours, while in the other half it lost more than 34.5 hours, due to
the tardiness and absences of employees.
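The mean and median of the hours-lost example can be reproduced with Python's standard `statistics` module. This is a minimal sketch; the list below is the twelve monthly values from the solution, and the variable names are illustrative.

```python
import statistics

# Hours lost per month (the twelve values from the example).
hours_lost = [20, 23, 24, 27, 30, 32, 37, 37, 40, 42, 48, 55]

mean_hours = statistics.mean(hours_lost)      # total divided by n
median_hours = statistics.median(hours_lost)  # average of the 6th and 7th sorted values

print(f"total = {sum(hours_lost)}, n = {len(hours_lost)}")
print(f"mean = {mean_hours:.2f} hours, median = {median_hours} hours")
```

Because n is even, `statistics.median` averages the two middle values, which matches the manual procedure described above.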
Example 2.
Distribution of Employees by Performance Rating
RATING Number Cumulative Frequency
(CF)
Outstanding 5 5
Very Satisfactory 35 40
Satisfactory 25 65
Unsatisfactory 2 67
Total 67
Example 2.
Distribution of Faculty by Performance Rating
RATING Number
Outstanding (O) 20
Very Satisfactory (VS) 35
Satisfactory (S) 10
Unsatisfactory (US) 2
Total 67
Median position = $\frac{1085}{2} = 542.5$
Median = High School
C. Religion
Religion Frequency
Iglesia ni Kristo 143
Catholic 282
Protestant 184
Muslim 150
Fundamentalist 210
Others 116
TOTAL 1085
Mode = Catholic
Median position = $\frac{985}{2} = 492.5$
Median = 4 household members
E. Sex Distribution
Male = 286 Female = 259
Edwards, Allen L. Statistical Analysis. New York, Holt, Rinehart and Winston.
1998
Leedy, P.D. and Ormrod, Jean E. Practical Research Planning and Design, 8th
edition, pp. 1-6, Prentice Hall, 2005.
Learning Objectives:
variance
LEARNER
standard deviation
quartile deviation
Introduction
RANGE
$\sum (x - \bar{x})^2 = 1 + 9 + 0 + 0 + 4 + 0 + 1 + 1 + 0 + 4 = 20$
5. Divide the result by n-1
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{20}{9} = 2.22$$
The resulting number is the variance. The more spread out the
items are, the larger the variance.
6. Take the square root of the variance. This is the standard deviation.
$$s = \sqrt{2.22} = 1.49$$
The larger the standard deviation, the more variation there is in the data
set.
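The last two steps can be sketched in Python starting from the squared deviations listed in the example. The raw scores are not repeated here, so the list of squared deviations is taken as given.

```python
import math

# Squared deviations from the mean, as listed in the worked example.
squared_deviations = [1, 9, 0, 0, 4, 0, 1, 1, 0, 4]
n = len(squared_deviations)

sum_of_squares = sum(squared_deviations)
variance = sum_of_squares / (n - 1)      # sample variance uses n - 1
standard_deviation = math.sqrt(variance)

print(f"sum of squares = {sum_of_squares}")
print(f"variance = {variance:.2f}, standard deviation = {standard_deviation:.2f}")
```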
Interpretation of the Standard Deviation
If the distribution of scores is approximately normal (symmetric and
bell-shaped), the following rules apply:
About 68.27% of all observations fall within 1 standard deviation of the mean.
About 95.45% of all observations fall within 2 standard deviations of the
mean.
About 99.73% of all observations fall within 3 standard deviations of the
mean.
In the above-illustrated example, with a mean of 7 and a standard
deviation of 1.49:
68.27% of the items fall between 5.51 and 8.49
95.45% of the items fall between 4.02 and 9.98
99.73% of the items fall between 2.53 and 11.47
QUARTILE DEVIATION
Percentiles
A percentile is a number that indicates the percentage of a distribution that
is equal to or below that number. To say that a student scored in the 85th
percentile means that 85% of the students who took the test scored the
same as or below that student.
Percentiles are also used to compare an individual value with a set of
standards or norms. To say that a student obtained the 60th percentile in the
National Admission Test (NAT) means that, compared to a standard or NAT
norm, 60% have the same or lower scores and 40% have the same or higher
scores. The median is equivalent to the 50th percentile. Half of the distribution is
at or above it, and half is at or below it.
Interquartile Range
The interquartile range makes use of percentiles. It is the difference
between the 75th percentile (Third Quartile or Q3) and the 25th percentile
(First Quartile or Q1). The interquartile range contains the middle 50% of the items.
Quartile Deviation
This is given by the formula:
$$QD = \frac{Q_3 - Q_1}{2}$$
The quartile deviation is half of the interquartile range. It is used
when the median is used as the measure of central tendency.
SKEWNESS
Measures of central tendency and dispersion indicate where the data are
located and how spread out they are. Measures of skewness are concerned with
the shape of the distribution, that is, whether the data are symmetrically
distributed.
We are familiar with the distribution referred to as the normal, symmetrical
or bell-shaped curve. We often assume that the data are distributed normally.
But this is not always the case, and in actual research it is always wise to
examine the set of data to see if the distribution is symmetrical or skewed.
If the mean is to the right of the median and mode, and if the tail is longer on
the right side, the distribution is said to be positively skewed or skewed to the
right.
Reading Assignment:
Links: http://www.quckmba.com/stats/dispersion/
http://mathbits.com/MathBits/TISection/Statistics2/dispersion.htm
http://www.emathzone.com/tutorials/basic-statistics/introduction-
to-measure-of-dispersion.html
E-Journals/E-Books:
PUP website: infotrac.galegroup.com/itweb/pup
Password: powersearch
Exercises/Written Assignments:
Introduction
The word hypothesis comes from the Greek words hypo, which means
“under” and tithenai which means “to place”.
A hypothesis is a preconceived idea that is provisionally assumed to be true
but has to be tested for its truth or falsity. Suppose a researcher is concerned
with testing the relationship between variables. Through inferential statistical
measures, he can discover important information even if no relationship is
established between the variables. The researcher may also discover differences
and therefore may test individual or group differences.
It is therefore helpful for the researcher to think of inferential statistics in
terms of whether they test for relationship or association or whether they test for
comparison or difference.
There are two types of hypotheses: the null hypothesis and the alternative
hypothesis.
There are two types of error involved in hypothesis testing. A Type I error
is committed when a researcher rejects a null hypothesis that is in fact true.
A Type II error occurs when the data from the sample produce results that
fail to reject the null hypothesis when in fact the null hypothesis is false
and should be rejected.
Parametric and nonparametric Statistics
Reading Assignment:
E-Journals/E-Books:
1. Distinguish between
1.1 Parametric and nonparametric tests
1.2 Null and alternative hypothesis
1.3 Statistical significance and practical significance
1.4 Type I and Type II Errors
2. Discuss the possibility of committing Type I and Type II errors in
hypothesis testing.
3. Explain the importance of tests of statistical significance.
4. You are interested in finding out whether young adults (25-34 years old)
are more likely to view basketball games on TV than older adults (35 or
more years old). State your null hypothesis. Write also the alternative
hypothesis accompanying the null hypothesis.
5. Give five (5) examples of a null hypothesis accompanied by an alternative
hypothesis.
References/Bibliography:
Learning Objectives:
Introduction
2. Choose the statistical test and perform the calculation. A researcher must
determine the measurement scale, the type of variable, the type of
data gathered, the number of groups or the number of categories.
3. State the level of significance for the statistical test. The level of
significance is chosen before the test is performed. By convention,
alpha (α) denotes the level of significance used in rejecting null
hypotheses. It is the amount of risk of wrongly rejecting a true null
hypothesis that the researcher is willing to accept. The levels most
frequently used are .05, .01, and .001. An α level of .05 implies that
the probability of rejecting a true null hypothesis by chance is 5 in 100.
4. Compute the calculated value. Use the formula for the appropriate
significance test to obtain the calculated value. The following formula
may be used.
A. Test of Significance of Difference
1. Between Means (t-test for independent samples)
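$$t = \frac{\bar{x}_1 - \bar{x}_2}{S_{\bar{x}_1 - \bar{x}_2}}, \qquad S_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{S^2}{N_1} + \frac{S^2}{N_2}}$$
$$S^2 = \frac{\sum_{i=1}^{N_1} (x_{i1} - \bar{x}_1)^2 + \sum_{i=1}^{N_2} (x_{i2} - \bar{x}_2)^2}{N_1 + N_2 - 2}, \qquad N_1 + N_2 = N$$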
2. One-Way Analysis of Variance (ANOVA)
3. A typical ANOVA Table
Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square   F-ratio
Between groups
Within groups
Total
Contingency Table
$$\chi^2 = \sum \frac{(O - E)^2}{E}, \qquad df = (r - 1)(c - 1)$$
Where $E = \frac{(\text{row total})(\text{column total})}{(\text{grand total})}$
5. Determine the critical value the test statistic must attain to be significant.
After we compute the calculated measure, we must look up the critical
[Figure: Areas of acceptance and rejection in a standard normal distribution, using α = .05. The rejection areas lie below z = -1.96 and above z = 1.96; the acceptance area lies between them.]
6. Make the decision. If the calculated value is greater than the critical
value, we reject the null hypothesis. If the critical value is larger, we
conclude that we have failed to reject the null hypothesis.
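The decision step for a two-tailed z test can be sketched as follows. This is an illustration only: the function name and the two calculated z values are hypothetical, and `statistics.NormalDist` from the Python standard library supplies the critical value in place of a printed table.

```python
from statistics import NormalDist


def two_tailed_z_decision(z_calculated, alpha=0.05):
    """Return the critical z value and whether the null hypothesis is rejected."""
    # For a two-tailed test, alpha is split equally between the two tails.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    reject = abs(z_calculated) > critical
    return critical, reject


# Hypothetical calculated values, used only to show both outcomes.
critical_value, reject_h0 = two_tailed_z_decision(2.30)
_, reject_h0_small = two_tailed_z_decision(1.20)

print(f"critical value = {critical_value:.2f}")
print(f"z = 2.30 -> reject H0: {reject_h0}")
print(f"z = 1.20 -> reject H0: {reject_h0_small}")
```

The comparison uses the absolute value of the calculated statistic, so a large negative z falls in the lower rejection area just as a large positive z falls in the upper one.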
Reading Assignment:
E-Journals/E-Books:
PUP website: infotrac.galegroup.com/itweb/pup
Password: powersearch
Exercises/Written Assignments:
6. Propose a problem and use the steps of hypothesis testing to solve the
problem.
7. Briefly explain the meaning and give an example for each.
7.1 null hypothesis
7.2 alternative hypothesis
7.3 tails of a test
7.4 acceptance and rejection region
8. Give three examples each of:
8.1 one-tailed test
8.2 two-tailed test
References/Bibliography:
Learning Objectives:
Introduction
Correlation Analysis
[Figure 1. Hypothetical scatter diagrams indicating some forms of relationship between X and Y: (a) perfect positive linear association, r = +1; (b) perfect negative linear association, r = -1; (c) positive linear association, 0 < r < +1; (d) negative linear association, -1 < r < 0; (e) no linear association, r = 0; (f) nonlinear association, r = 0.]
$$r = \frac{1845 - 1750}{\sqrt{(50)(200)}} = \frac{95}{100} = .95$$
A correlation coefficient of 0.95 indicates a strong positive correlation
between the values. A positive coefficient indicates a direct
relationship: when the values in one set increase, those of the other set also
increase.
In linear correlation analysis, it is important that we know the extent by
which the variation in the dependent variable is related to the variation in the
independent variable such that we may be able to see the need for examining
other related variables by such statistical tools as the multiple or partial
correlation.
The interdependence of the variables under consideration can be measured
by the square of the correlation coefficient, which is known as the
coefficient of determination ($r^2$). This value tells us the
proportion of the total variation in the dependent variable (y) which is explained
by the linear relationship with the independent variable (x). For r = .95,
$r^2$ = .9025, so about 90% of the variation in y is explained by x.
Testing the Significance of Pearson r
$$t = r\sqrt{\frac{N - 2}{1 - r^2}} = .95\sqrt{\frac{5 - 2}{1 - (.95)^2}} = .95\sqrt{\frac{3}{1 - .9025}} = .95\sqrt{\frac{3}{.0975}} = .95(5.547) = 5.27$$
This value, with (N-2) = 3 degrees of freedom, is
significant at the 5 percent level (t.05 = 3.182, df = 3). Therefore, there is a
significant relationship between x and y at the .05 level of significance.
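The significance test for Pearson r can be sketched in a few lines. The function name is an illustrative assumption, and the critical value 3.182 is copied from the text rather than computed.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a Pearson correlation r based on n pairs."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


r = 0.95
n = 5
t_value = t_from_r(r, n)
degrees_of_freedom = n - 2
critical_t = 3.182  # tabular value quoted in the text for df = 3, alpha = .05

is_significant = abs(t_value) > critical_t
print(f"t = {t_value:.2f}, df = {degrees_of_freedom}, significant: {is_significant}")
```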
Correlations
X Y
X Pearson Correlation 1.000 .950*
Sig. (2-tailed) .013
N 5.000 5
$$R = 1 - \frac{6\sum D^2}{n(n^2 - 1)}$$
Example:
A committee of five men and five women in a government agency
evaluated ten applications for a vacant position in the office. They were asked to
rate each applicant on the basis of his/her qualification and interview. The
ranking of the applicants by men and women are given in the following table:
Applicant 1 2 3 4 5 6 7 8 9 10
Men (X) 6 2 1 4 9 10 5 3 8 7
Women (Y) 5 8 4 6 3 2 7 10 1 9
Solution:
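Applicant   Rx   Ry    D   D²
1            6    5    1    1
2            2    8   -6   36
3            1    4   -3    9
4            4    6   -2    4
5            9    3    6   36
6           10    2    8   64
7            5    7   -2    4
8            3   10   -7   49
9            8    1    7   49
10           7    9   -2    4
$\sum D^2 = 256$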
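The Spearman computation for this example can be sketched as below, using the men's and women's rankings from the table. The variable names are illustrative. The result, about -0.552, agrees with the SPSS coefficient reported for the same data.

```python
# Rankings of the ten applicants by the men (X) and the women (Y).
men_ranks = [6, 2, 1, 4, 9, 10, 5, 3, 8, 7]
women_ranks = [5, 8, 4, 6, 3, 2, 7, 10, 1, 9]

n = len(men_ranks)

# D is the difference between paired ranks; the formula needs the sum of D squared.
differences = [x - y for x, y in zip(men_ranks, women_ranks)]
sum_d_squared = sum(d ** 2 for d in differences)

# Spearman rank correlation: R = 1 - 6 * sum(D^2) / (n * (n^2 - 1)).
spearman_r = 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

print(f"sum of D squared = {sum_d_squared}")
print(f"Spearman R = {spearman_r:.3f}")
```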
The following table shows the Spearman rank correlation from the SPSS result
window for the same set of values used in the manual computation.
Correlations
Men Women
Spearman's rho Men Correlation Coefficient 1.000 -.552
Sig. (2-tailed) . .098
N 10 10
[Figure: scatter diagrams of Y against X showing (a) positive linear relationship, (b) negative linear relationship, (c) no relationship, (d) nonlinear (quadratic) relationship.]
[Figure 3: Graph of Y = a + bX when b > 0. The line rises b units for each 1-unit increase in X.]
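$$b_{yx} = \frac{\sum XY - [(\sum X)(\sum Y)/N]}{\sum X^2 - [(\sum X)^2/N]}$$
$$a_{yx} = \bar{Y} - b_{yx}\bar{X}$$
$\sum X = 270 \qquad \sum X^2 = 5992$
$\sum Y = 125 \qquad \sum Y^2 = 1307$
$\sum XY = 2740 \qquad n = 15$
$\bar{X} = \sum X / N = 270 / 15 = 18$
$\bar{Y} = \sum Y / N = 125 / 15 = 8.33$
$$b_{yx} = \frac{\sum XY - [(\sum X)(\sum Y)/n]}{\sum X^2 - [(\sum X)^2/n]} = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - (\sum X)^2}$$
= [15(2740) – (270)(125)]/[15(5992) – (270)²]
= 0.43286
$a_{yx} = \bar{Y} - b_{yx}\bar{X}$
= 8.33 – (0.43286)(18)
= 8.33 – 7.7915
= 0.5385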
The least squares regression line is therefore
Y = a + bX
Y = 0.5385 + 0.43286X
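The slope and intercept can be recomputed from the summary totals given above. This is a sketch with illustrative names. Because it carries the unrounded mean of Y (8.333...) rather than 8.33, the intercept comes out near 0.54 instead of 0.5385; the slope is unaffected.

```python
# Summary totals from the worked example.
n = 15
sum_x = 270
sum_y = 125
sum_x_sq = 5992
sum_xy = 2740

# Least squares slope and intercept for Y = a + bX.
slope_b = (n * sum_xy - sum_x * sum_y) / (n * sum_x_sq - sum_x ** 2)
intercept_a = sum_y / n - slope_b * (sum_x / n)


def predict(x):
    """Predicted Y for a given X on the fitted line."""
    return intercept_a + slope_b * x


print(f"b = {slope_b:.5f}, a = {intercept_a:.4f}")
print(f"predicted Y at X = 18: {predict(18):.2f}")
```

A least squares line always passes through the point of means, so the prediction at X = 18 equals the mean of Y.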
ANOVAb
Model           Sum of Squares   df   Mean Square   F        Sig.
1 Regression    211.608          1    211.608       51.203   .000a
Coefficientsa
Unstandardized Standardized
Coefficients Coefficients
Model B Std. Error Beta t Sig.
1 (Constant) .685 1.191 .575 .575
Test1 .425 .059 .893 7.156 .000
a. Dependent Variable: Test2
* Difference in manual computation and use of SPSS is due to rounding off errors.
Chi-Square (χ²)
The chi-square (Greek letter chi, χ²) is the most commonly used method
of comparing proportions. It is particularly useful in tests evaluating a relationship
between nominal or ordinal data. Typical situations or settings are cases where
persons, events, or objects are grouped in two or more nominal categories such
as “Yes – No” responses, “Favor-Against-Undecided” or class “A, B, C, or D”.
Chi-square analysis compares the observed frequencies of the responses
with the expected frequencies. It is a measure of actual divergence of the
observed and expected frequencies. It is given by the formula:
$$\chi^2 = \sum \frac{(F_o - F_e)^2}{F_e}$$
Where: Fo = observed number of cases
Fe = expected number of cases under some hypothesis.
If the differences between the observed and the expected frequencies are
small, χ² will be small. The greater the difference between the observed and
expected frequencies under the null hypothesis, the larger χ² will
be. If the differences between the observed and expected values are collectively
so large that they would occur by chance with a probability of, say, 0.05 or
less when the null hypothesis is true, then the null hypothesis is rejected.
Illustration:
Consider the nomination of three (3) presidential candidates of a political
party, A, B and C. The chairman wonders whether or not they will be equally
popular among the members of the party. To test the hypothesis of equal
preference, a random sample of 315 men was selected and asked which
one of the three candidates they prefer. The following are the results of the
survey:
Candidates Frequency
A 98
B 115
C 102
Are you going to reject the null hypothesis that equal numbers of men in
the party prefer each of the three candidates? Or are you going to accept the null
hypothesis of equality of preference?
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 1.505$$
In order to test the significance of the computed χ² value using a specified
criterion of significance, the obtained value is referred to a table with appropriate
degrees of freedom, which is equal to k-1, where k is the number of
categories of the variable. In this problem, df = 3-1 = 2. Therefore, for the χ² to
be significant at the 0.05 level, the computed value should be more than (>) the
tabular value, which is 5.991.
Summarizing,
Level of significance = 5%
df = k-1 (number of categories minus one)
= 3-1
= 2
Critical value: χ²(.05, 2) = 5.991
Decision rule: Reject Ho if χ² computed > 5.991; otherwise do
not reject Ho.
Conclusion:
Since 1.505 < 5.991, do not reject Ho. There is insufficient
evidence to reject the null hypothesis that the frequencies in the
population are equal.
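The goodness-of-fit computation for the three candidates can be sketched as follows. Under the null hypothesis of equal preference, each candidate is expected to receive 315/3 = 105 votes. The critical value 5.991 is taken from the text, and the variable names are illustrative.

```python
# Observed preferences for candidates A, B and C.
observed = [98, 115, 102]

# Under equal preference, every category has the same expected frequency.
total = sum(observed)
expected = total / len(observed)

# Chi-square: sum of (observed - expected)^2 / expected over all categories.
chi_square = sum((o - expected) ** 2 / expected for o in observed)

degrees_of_freedom = len(observed) - 1
critical_value = 5.991  # tabular value quoted in the text for df = 2, alpha = .05

reject_h0 = chi_square > critical_value
print(f"expected frequency = {expected}")
print(f"chi-square = {chi_square:.3f}, df = {degrees_of_freedom}, reject H0: {reject_h0}")
```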
Chi-square and Goodness of Fit: One Sample Case in SPSS: From the Data
Editor window. Data are entered in the SPSS like this.
The following table will show the Chi-square correlation from the result window:
Correlations
Test Statistics
a 0 cells (.0%) have expected frequencies less than 5. The minimum expected
cell frequency is 105.0.
As you click on the cross tab, the following window will appear. From this
window, select the row variable and insert it as a marked row. Select the second
variable and put them in to mark a column.
The Chi-square tests table will show the value of Pearson Chi-Square
value, associated with the significance value.
Chi-Square Tests
                               Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                              (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square             10.632a   1    .001
Continuity Correctionb         9.728     1    .002
Likelihood Ratio               10.730    1    .001
Fisher's Exact Test                                         .002         .001
Linear-by-Linear Association   10.579    1    .001
N of Valid Cases               200
CHI-SQUARE TEST
Example.
A researcher is interested whether there is a significant relationship
between gender and the educational attainment of the respondents. To
investigate the issue, the researcher randomly selected 25 teachers and noted
their gender and educational attainment. Gender is coded 1 for male and 2 for
female, and the Educational Attainment is coded 1 for BA/BS Degree Holder,
2 for With Units in MA/MS, 3 for MA/MS Degree Holder. The data are shown
in Table 1.
                                            EducAttainment
                                     BA/BS Degree   With Units   MA/MS Degree
                                     Holder         in MA/MS     Holder         Total
Gender   Male     Count                    4             2             3           9
                  % within EducAttainment  80.0%         20.0%         30.0%       36.0%
         Female   Count                    1             8             7           16
                  % within EducAttainment  20.0%         80.0%         70.0%       64.0%
Total             Count                    5             10            10          25
                  % within EducAttainment  100.0%        100.0%        100.0%      100.0%
significantly related to educational attainment. Since the computed p-value
(0.065) with a chi-square value of 5.469 is greater than the assigned level of
significance (0.05), the null hypothesis is
accepted. It means that the two variables are not significantly related at the 5% level.
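The chi-square value and p-value reported for the gender by educational attainment table can be reproduced by hand. The sketch below computes each expected count as (row total)(column total)/(grand total). For the p-value it relies on the fact that, with exactly 2 degrees of freedom, the upper-tail chi-square probability equals exp(-χ²/2); that shortcut does not hold for other degrees of freedom.

```python
import math

# Observed counts: rows are Male and Female; columns are BA/BS Degree Holder,
# With Units in MA/MS, and MA/MS Degree Holder.
observed = [
    [4, 2, 3],
    [1, 8, 7],
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * column_totals[j] / grand_total
        chi_square += (count - expected) ** 2 / expected

degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

# Closed form valid only because degrees_of_freedom is 2 for this table.
p_value = math.exp(-chi_square / 2)

print(f"chi-square = {chi_square:.3f}, df = {degrees_of_freedom}, p = {p_value:.3f}")
```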
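Teaching Performance
Equal variances assumed:      F = .016, Sig. = .899, t = 8.253, df = 23, Sig. (2-tailed) = .000,
                              Mean Difference = 10.18831, Std. Error Difference = 1.23456,
                              95% Confidence Interval of the Difference = 7.63442 to 12.74220
Equal variances not assumed:  t = 8.283, df = 21.913, Sig. (2-tailed) = .000,
                              Mean Difference = 10.18831, Std. Error Difference = 1.22995,
                              95% Confidence Interval of the Difference = 7.63696 to 12.73967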
The mean teaching performance was 93.55 (SD = 3.01) for males and
83.36 (SD = 3.10) for females. The teaching performance of males and females
was compared using an independent samples t-test. There was a significant
difference between the teaching performance of males and females, t(23) =
8.253, p-value < 0.001.
PAIRED SAMPLES t-TEST
A researcher is interested whether a training course increases or
decreases the teaching performance of the teachers who attended the training.
The researcher randomly selected 25 teachers and measured their teaching
performance before and after they attended the training courses. The data are
shown in Table 3.
Table 3. The Teaching Performance of 25 Teachers
Before and After a Training Program
Case Teaching Performance Teaching Performance
Before Training After Training
1 85 95
2 84 98
3 86 97
4 87 92
5 89 96
6 82 93
7 80 94
8 84 95
9 86 90
10 82 82
11 89 97
12 87 98
13 82 95
14 81 95
15 86 92
16 89 91
17 89 94
18 84 95
19 85 96
20 88 97
21 81 99
22 80 98
23 86 99
24 87 91
25 89 98
Figure 3.1 Paired samples t-test data entered in SPSS Data Editor
Step 2. Run Paired Samples t-test.
Click on Analyze.
Click on Compare Means.
Click on Paired Samples T Test
A dialog box will appear. Click on both the variables you want to include in
the analysis, and then click on the small black arrow to the left of the white
box headed Paired Variables. The variables will move into the white box
as shown in Figure 3.2.
Click OK.
The mean Teaching Performance was 85.12 (SD = 3.00) before the
training course and 94.68 (SD = 3.70) after the training course. The difference
between the mean level of Teaching Performance before and after the training
course was examined with a paired samples t-test. Teaching Performance was
significantly higher after the training course than before it t(24) = 10.347, p-value
< 0.05.
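The paired samples result can be checked without SPSS. The sketch below uses the Table 3 scores and the difference method, t = mean(D) / (SD of D / √N), with Python's standard `statistics` module. Variable names are illustrative.

```python
import math
import statistics

# Teaching performance of the 25 teachers in Table 3.
before = [85, 84, 86, 87, 89, 82, 80, 84, 86, 82, 89, 87, 82,
          81, 86, 89, 89, 84, 85, 88, 81, 80, 86, 87, 89]
after = [95, 98, 97, 92, 96, 93, 94, 95, 90, 82, 97, 98, 95,
         95, 92, 91, 94, 95, 96, 97, 99, 98, 99, 91, 98]

# Work with the difference for each teacher (after minus before).
differences = [a - b for a, b in zip(after, before)]
n = len(differences)

mean_difference = statistics.mean(differences)
sd_difference = statistics.stdev(differences)        # sample SD, divisor n - 1
standard_error = sd_difference / math.sqrt(n)

t_value = mean_difference / standard_error
degrees_of_freedom = n - 1

print(f"mean before = {statistics.mean(before):.2f}, mean after = {statistics.mean(after):.2f}")
print(f"t({degrees_of_freedom}) = {t_value:.3f}")
```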
Reading Assignment:
Exercises/Written Assignments:
Monthly Income 20 14 15 18 30 34 36 47
Monthly expenses 15 12 25 20 25 30 35 32
2. The scores on an aptitude test of ten students and the number of hours
they reviewed for the test are as follows:
Number of Hours 18 27 20 10 30 24 32 27 12 16
Scores (y) 68 82 77 90 78 72 94 88 60 70
x 3 2 4 5 5 6 6 7 9 7 8 5 6 3 8
y 65 50 75 70 80 85 79 88 91 87 88 70 71 63 85
References/Bibliography
LESSON 1
TYPES OF ANALYSIS FOR COMPARISON
Learning Objectives:
t-test
T-test
group and $\bar{x}_2$, the mean of the experimental group, with $N_1$ and $N_2$ cases,
$N_1 + N_2 = N$.
The standard error of the difference between the two means is given by:
$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s^2}{N_1} + \frac{s^2}{N_2}}$$
To obtain the t-ratio, the difference between the means is divided by the
estimate of the standard error,
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}}$$
Another formula for calculating s2 (pooled variance) when raw scores are
given is:
$$s^2 = \frac{\left[\sum x_1^2 - \dfrac{(\sum x_1)^2}{N_1}\right] + \left[\sum x_2^2 - \dfrac{(\sum x_2)^2}{N_2}\right]}{N_1 + N_2 - 2}$$
Sample Calculation:
At the start of a semester, a teacher draws a sample of students in his
mathematics class and randomly assigned 15 of them to an experimental group
and 15 to a control group. The teacher taught the experimental group (Group I)
with Computer-Aided Instruction (CAI) and the control group with the traditional
method of teaching. At the end of semester, a mathematics achievement test
was given to both groups. The results are given below. On the basis of the data,
should the teacher conclude that the experimental method is more effective than
the traditional method?
Experimental Group Control Group
35 25
45 32
36 40
20 28
25 22
35 25
40 20
45 50
27 20
32 18
20 12
18 40
15 15
12 10
10 8
Hypothesis:
Ho: μ1 = μ2 or μ2 - μ1 = 0
The assumption is made that x1 , x2 are means of two samples drawn from
35 25 1225 625
45 32 2025 1024
36 40 1296 1600
x1 27.7 x2 24.3
x1 x2
2 2
N1 = 15 N2 = 15
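$$s^2 = \frac{\left[\sum x_1^2 - \dfrac{(\sum x_1)^2}{N_1}\right] + \left[\sum x_2^2 - \dfrac{(\sum x_2)^2}{N_2}\right]}{N_1 + N_2 - 2}$$
$$s^2 = \frac{\left[13{,}367 - \dfrac{(415)^2}{15}\right] + \left[10{,}899 - \dfrac{(365)^2}{15}\right]}{15 + 15 - 2}$$
$$s^2 = \frac{(13{,}367 - 11{,}481.67) + (10{,}899 - 8{,}881.67)}{28}$$
$$s^2 = 139.38$$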
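The t-ratio is:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s^2}{N_1} + \dfrac{s^2}{N_2}}} = \frac{27.67 - 24.33}{\sqrt{\dfrac{139.38}{15} + \dfrac{139.38}{15}}} = \frac{3.33}{4.31} = 0.77$$
Decision Rule:
The tabular value of t for df = 28 at the 0.05 level is 2.048. The observed value
of 0.77 indicates that the difference between the means was not large enough for
us to reject the null hypothesis at the 5% level. We can conclude that the two
methods of teaching do not exert a differential effect on the achievement of
students.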
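The pooled-variance t-test for the two groups can be sketched from the raw scores. The code follows the raw-score formula for s² given above; the function and variable names are illustrative. It yields a pooled variance of about 139.38 and a t-ratio of about 0.773, in line with the SPSS output for the same data.

```python
import math

experimental = [35, 45, 36, 20, 25, 35, 40, 45, 27, 32, 20, 18, 15, 12, 10]
control = [25, 32, 40, 28, 22, 25, 20, 50, 20, 18, 12, 40, 15, 10, 8]


def sum_of_squared_deviations(scores):
    """Raw-score form: sum of x squared minus (sum of x) squared over N."""
    return sum(x ** 2 for x in scores) - sum(scores) ** 2 / len(scores)


n1, n2 = len(experimental), len(control)
mean1 = sum(experimental) / n1
mean2 = sum(control) / n2

# Pooled variance with N1 + N2 - 2 degrees of freedom.
pooled_variance = (
    sum_of_squared_deviations(experimental) + sum_of_squared_deviations(control)
) / (n1 + n2 - 2)

# Standard error of the difference between the two means.
standard_error = math.sqrt(pooled_variance / n1 + pooled_variance / n2)

t_ratio = (mean1 - mean2) / standard_error
critical_t = 2.048  # tabular value quoted in the text for df = 28, alpha = .05

print(f"means: {mean1:.2f} and {mean2:.2f}")
print(f"pooled variance = {pooled_variance:.2f}, t = {t_ratio:.3f}")
print(f"reject H0: {abs(t_ratio) > critical_t}")
```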
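Scores
Equal variances assumed:      F = .072, Sig. = .791, t = .773, df = 28, Sig. (2-tailed) = .446,
                              Mean Difference = 3.33333, Std. Error Difference = 4.31093,
                              95% Confidence Interval of the Difference = -5.49721 to 12.16388
Equal variances not assumed:  t = .773, df = 27.968, Sig. (2-tailed) = .446,
                              Mean Difference = 3.33333, Std. Error Difference = 4.31093,
                              95% Confidence Interval of the Difference = -5.49766 to 12.16433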
Figure 2.1 SPSS Data Editor after the independent samples t-test data are entered
Step 2. Run the Independent Samples t-test.
Click on Analyze.
Click on Compare Means.
Click on Independent-Samples T Test
A dialog box will open with a list of your variables in the white rectangle
on the left hand side. Click on the dependent variable, and then click on
the small black arrow to the left of the white box headed Test Variable(s).
This will place the dependent variable (in this case TeachingPerformance)
in the box.
Group Statistics
Gender   N   Mean   Std. Deviation   Std. Error Mean
N: number of cases in each group.
Mean: the mean of each group (in this example, males and females).
Std. Deviation: the standard deviation of each group.
Independent Samples Test
Levene's Test for Equality of Variances; t-test for Equality of Means
Teaching Performance, equal variances assumed: .016 .899 8.253 23 .000 10.18831 1.23456 7.63442 12.74220
df: degrees of freedom.
Referring to the Independent Samples Test table again, we can see the
results of the t-test. The t-value is 8.253, there are 23 degrees of freedom, and the
(two-tailed) significance level is 0.000. The difference between the two groups'
means is significant in this case because 0.000 is less than the assigned level
of significance (0.05).
The mean teaching performance was 93.55 (SD = 3.01) for males and
83.36 (SD = 3.10) for females. The teaching performance of males and females
was compared using an independent samples t-test. There was a significant
difference between the teaching performance of males and females, t(23) =
8.253, p-value < 0.001.
t value
Significance of Levene's test for equality of variances: if this value
is less than 0.05, use the t-value, df, and significance level given for
Equal variances not assumed.
Significance level: the probability of a significant difference in the means as
great as the one here if the null hypothesis is true. This value needs to be below
the assigned level of significance for statistical significance.
SIGNIFICANCE OF THE DIFFERENCE BETWEEN TWO MEANS FOR
DEPENDENT SAMPLES
In the conduct of research, one may use two related samples, that is, one
may “match” or otherwise relate the two samples studied. This matching may be
achieved by using each subject as his own control (as in pre-post test designs) or
by matching subjects and then randomly assigning one member of each pair to
one of the two conditions, and his matched partner to the other condition. A
procedure for testing the significance between the mean difference of two related
samples may be applied without computing the correlation coefficient between
them. This method is sometimes called the difference method.
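Let D be the difference between any pair, $x_1 - x_2$. The mean difference of
all pairs is

$$\bar{D} = \frac{\sum D}{N}.$$

The difference between the means of the two groups equals this mean
difference, so we may test for the significance of the difference between two
means by using

$$S_D^2 = \frac{\sum (D - \bar{D})^2}{N - 1}, \qquad S_{\bar{D}}^2 = \frac{S_D^2}{N},$$

$$t = \frac{\sum D}{\sqrt{\dfrac{N \sum D^2 - (\sum D)^2}{N - 1}}}.$$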
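For readers who want to see why the computational formula for t follows from the definitions, a short derivation is sketched here. It uses the same symbols ($D$, $\bar{D}$, $S_D^2$, $S_{\bar{D}}^2$, $N$); the first line is the standard shortcut for a sum of squared deviations, which is assumed rather than proved.

$$
\begin{aligned}
\sum (D - \bar{D})^2 &= \sum D^2 - \frac{(\sum D)^2}{N} \\
S_{\bar{D}}^2 = \frac{S_D^2}{N} &= \frac{N \sum D^2 - (\sum D)^2}{N^2 (N - 1)} \\
t = \frac{\bar{D}}{S_{\bar{D}}} &= \frac{\sum D / N}{\dfrac{1}{N}\sqrt{\dfrac{N \sum D^2 - (\sum D)^2}{N - 1}}}
 = \frac{\sum D}{\sqrt{\dfrac{N \sum D^2 - (\sum D)^2}{N - 1}}}
\end{aligned}
$$

The test statistic has $N - 1$ degrees of freedom, where $N$ is the number of pairs.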
Example:
The following are the pre and post-test scores of 12 students on a test
which determines whether learning has taken place as a result of a new teaching
strategy.
Test the significance of the difference between the two means at the 5% level.
Subject No. Pre-Test Post-Test
1 18 22
2 14 11
3 9 10
4 17 15
5 7 11
6 16 19
7 12 14
8 11 13
9 11 13
10 17 20
11 15 16
12 8 10
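H0: There is no significant difference between the means of the pre-test and
post-test.

With D = pre-test score minus post-test score, $\sum D = -22$, $\sum D^2 = 78$, and $N = 12$.

Use

$$t = \frac{\sum D}{\sqrt{\dfrac{N \sum D^2 - (\sum D)^2}{N - 1}}}$$

$$t = \frac{-22}{\sqrt{\dfrac{12(78) - (-22)^2}{12 - 1}}}
   = \frac{-22}{\sqrt{\dfrac{936 - 484}{11}}}
   = \frac{-22}{\sqrt{41.09}}
   = -3.43$$

Conclusion:
Since the absolute value of the observed t (3.43) is greater than the tabular
value (2.201, df = 11) at the 5% level of significance, the null hypothesis is
rejected. This means that the post-test mean is significantly different from the
pre-test mean. It further suggests that exposure to the new teaching method
increased achievement.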
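The difference method can also be checked with a short script. The sketch below is a minimal illustration that works from the sums used in the worked example (N = 12, ΣD = -22, ΣD² = 78, with D taken as pre-test minus post-test); the function name `paired_t_from_sums` is hypothetical, and the critical value 2.201 (two-tailed, 5% level, df = 11) is an assumption taken from a standard t table.

```python
import math


def paired_t_from_sums(n, sum_d, sum_d2):
    """Difference-method t test from N, the sum of D, and the sum of D squared."""
    mean_d = sum_d / n
    var_d = (sum_d2 - sum_d ** 2 / n) / (n - 1)  # S_D squared
    sd_d = math.sqrt(var_d)
    se_mean = sd_d / math.sqrt(n)                # standard error of the mean difference
    t = mean_d / se_mean
    return {"mean": mean_d, "sd": sd_d, "se": se_mean, "t": t, "df": n - 1}


# Sums from the worked example (D = pre-test minus post-test).
result = paired_t_from_sums(n=12, sum_d=-22, sum_d2=78)

# Assumption: two-tailed 5% critical value of t for df = 11, from a t table.
T_CRIT_DF11 = 2.201
reject_null = abs(result["t"]) > T_CRIT_DF11

print(f"mean difference = {result['mean']:.5f}")
print(f"SD of differences = {result['sd']:.5f}")
print(f"standard error = {result['se']:.5f}")
print(f"t = {result['t']:.3f}, df = {result['df']}")
print(f"Reject H0 at the 5% level: {reject_null}")
```

The two-tailed decision compares the absolute value of t with the critical value, which is why a negative t can still lead to rejection of the null hypothesis.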
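N    Correlation    Sig.

                                Paired Differences
                                                                      95% Confidence Interval
                                            Std.        Std. Error    of the Difference                        Sig.
                                Mean        Deviation   Mean          Lower       Upper      t        df      (2-tailed)
Pair 1  pre_test - post_test    -1.83333    1.85047     .53418        -3.00907    -.65760    -3.432   11      .006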
Figure 3.1 Paired samples t-test data entered in SPSS Data Editor
Step 2. Run Paired Samples t-test.
Click on Analyze.
Click on Compare Means.
Click on Paired Samples T Test
A dialog box will appear. Click on both of the variables you want to include
in the analysis, and then click on the small black arrow to the left of the
white box headed Paired Variables. The variables will move into the white box
as shown in Figure 3.2.
Click OK.
Notes on reading the Paired Samples Test output: the table reports the t-value
and the degrees of freedom. Significance level: the probability of a difference
in the means as great as the one here if the null hypothesis is true. This value
needs to be below the assigned level of significance (alpha) for statistical
significance.
The mean Teaching Performance was 85.12 (SD = 3.00) before the
training course and 94.68 (SD = 3.70) after the training course. The difference
between the mean level of Teaching Performance before and after the training
course was examined with a paired samples t-test. Teaching Performance was
significantly higher after the training course than before it, t(24) = 10.347,
p-value < 0.05.
Analysis of Variance (ANOVA)
ANOVA relies on the F-ratio to test the hypothesis that the two variance
estimates (between groups and within groups) are equal; that is, that the
subgroups are from the same population. “Between groups” refers to the
variation between each group mean and the grand or overall mean. “Within
groups” refers to the variation between each subject and the subject's group
mean.
$$\sum X = 258 \qquad \sum X^2 = 3452$$

$$SSB = \sum n_i \bar{X}_i^2 - \frac{(\sum X)^2}{n} = 38$$

1. $SSG_A = \sum X_A^2 - (\sum X_A)^2 / n_A$
   $= (15^2 + 12^2 + 16^2 + 10^2 + 8^2 + 14^2 + 12^2) - (87)^2/7$
   $= (225 + 144 + 256 + 100 + 64 + 196 + 144) - 1081.29$
   $= 1129 - 1081.29 = 47.71$

2. $SSG_B = \sum X_B^2 - (\sum X_B)^2 / n_B$
   $= (16^2 + 15^2 + 17^2 + 16^2 + 10^2 + 15^2 + 8^2) - (97)^2/7$
   $= (256 + 225 + 289 + 256 + 100 + 225 + 64) - 1344.14$
   $= 1415 - 1344.14 = 70.86$

3. $SSG_C = \sum X_C^2 - (\sum X_C)^2 / n_C$
   $= (8^2 + 6^2 + 16^2 + 6^2 + 8^2 + 14^2 + 16^2) - (74)^2/7$
   $= (64 + 36 + 256 + 36 + 64 + 196 + 256) - 782.29$
   $= 908 - 782.29 = 125.71$

SSW = 47.71 + 70.86 + 125.71
    = 244.28

RELATION OF THE SUMS OF SQUARES
SST = SSB + SSW
SSW = SST - SSB
    = 282.28 - 38 = 244.28
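The sums of squares above can be reproduced with a short script. The sketch below is illustrative only: the scores are the ones that appear in the squared terms of the three within-group computations, the labels A, B, and C follow the subscripts used there, and the function names `sum_sq` and `one_way_anova` are hypothetical.

```python
def sum_sq(values):
    """Sum of squared deviations: sum of X squared minus (sum of X) squared over n."""
    return sum(x * x for x in values) - sum(values) ** 2 / len(values)


def one_way_anova(groups):
    """Return the sums of squares, degrees of freedom, and F-ratio for a one-way ANOVA."""
    all_scores = [x for scores in groups.values() for x in scores]
    sst = sum_sq(all_scores)                                # total
    ssw = sum(sum_sq(scores) for scores in groups.values())  # within groups
    ssb = sst - ssw                                         # between groups
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ssb / df_between) / (ssw / df_within)
    return {
        "SST": sst, "SSB": ssb, "SSW": ssw,
        "df_between": df_between, "df_within": df_within,
        "F": f_ratio,
    }


# Scores taken from the squared terms in the worked computation.
groups = {
    "A": [15, 12, 16, 10, 8, 14, 12],
    "B": [16, 15, 17, 16, 10, 15, 8],
    "C": [8, 6, 16, 6, 8, 14, 16],
}

table = one_way_anova(groups)
for name, value in table.items():
    print(f"{name} = {value:.3f}")
```

The F-ratio is the between-groups mean square divided by the within-groups mean square; it would then be compared with the tabular F for 2 and 18 degrees of freedom.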
ANOVA Table
To run an analysis of variance in SPSS, select Analyze, Compare Means, One-Way
ANOVA… from the Data Editor window.
Then specify scores in the Dependent List box and student in the Factor
box, and click the OK button.
ANOVA
Scores
          Sum of Squares    df
Total     282.286           20
Figure 4.1 Independent Samples ANOVA data entered in SPSS Data Editor
Step 2. Run the One-Way ANOVA.
Click on Analyze.
Click on Compare Means.
Click on One-Way ANOVA…
A dialog box will appear. Click on the dependent variable (Teaching
Performance). It will be highlighted in black. Click on the black triangle
to the left of the box headed Dependent List. The dependent variable
will be moved across to this box.
Click on the independent variable (Employment Status). It will be
highlighted in black. Click on the black triangle to the left of the box
headed Factor. The independent variable will be moved across to this box
(see Figure 4.2).
Click on Post Hoc … A dialog box will appear.
Click on the white box next to Tukey. A tick will appear in the box (see
Figure 4.3).
Click on Continue.
Click on Options…
Click in the white box next to Descriptive (see Figure 4.4).
Click on Continue.
Click OK.
Figure 4.2 Entering the independent (Factor) and dependent variables when
carrying out an independent samples ONE-WAY ANOVA
Figure 4.4 Entering the options (Descriptive) when carrying out an independent
samples ONE-WAY ANOVA
Descriptives
Teaching_Performance
                                                        95% Confidence
                                                        Interval for Mean
                              Std.                      Lower      Upper
              N    Mean       Deviation   Std. Error    Bound      Bound      Min      Max
Part-Timer    8    84.8750    3.22656     1.14076       82.1775    87.5725    80.00    89.00
Temporary     8    85.7500    2.12132     .75000        83.9765    87.5235    82.00    89.00
Permanent     8    84.2500    3.53553     1.25000       81.2942    87.2058    80.00    89.00
Total         24   84.9583    2.95590     .60337        83.7102    86.2065    80.00    89.00

N: the number of cases at each level of the independent variable.
Mean: the means for each of the three samples.
Std. Deviation: the standard deviation of each sample.
ANOVA
Teaching_Performance
                  Sum of
                  Squares    df    Mean Square    F       Sig.
Between Groups    9.083      2     4.542          .497    .615
Within Groups     191.875    21    9.137
Total             200.958    23
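The ANOVA table above can be cross-checked against the Descriptives table, since the between-groups and within-groups sums of squares depend only on each group's N, mean, and standard deviation. The sketch below is a minimal illustration; the function name `anova_from_summary` is hypothetical, and the significance value is simply read from the SPSS output rather than computed.

```python
def anova_from_summary(ns, means, sds):
    """Rebuild a one-way ANOVA table from group sizes, means, and standard deviations."""
    n_total = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / n_total
    ssb = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    ssw = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))
    df_between = len(ns) - 1
    df_within = n_total - len(ns)
    f_ratio = (ssb / df_between) / (ssw / df_within)
    return ssb, ssw, df_between, df_within, f_ratio


# Part-Timer, Temporary, Permanent rows of the Descriptives table.
ns = [8, 8, 8]
means = [84.8750, 85.7500, 84.2500]
sds = [3.22656, 2.12132, 3.53553]

ssb, ssw, df_between, df_within, f_ratio = anova_from_summary(ns, means, sds)

# Sig. value read from the SPSS ANOVA table, compared with alpha = 0.05.
reported_sig = 0.615
significant = reported_sig < 0.05

print(f"SSB = {ssb:.3f} (df = {df_between})")
print(f"SSW = {ssw:.3f} (df = {df_within})")
print(f"F = {f_ratio:.3f}")
print(f"Significant at the 5% level: {significant}")
```

Because the reported significance (.615) is above 0.05, the sketch indicates no significant difference in Teaching Performance among the three employment-status groups.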
Reading Assignment:
E-Journals/E-Books:
PUP website: infotrac.galegroup.com/itweb/pup
Password: powersearch
Exercises/Written Assignments:
1. Ten teachers who applied for the Licensure Examination for Teachers
(LET) were given a test to measure their readiness. They then took
refresher courses, after which a second test was given to
determine whether the courses affected their readiness. The results are
given below.
Teachers T1 T2
1 125 157
2 89 131
3 135 155
4 119 162
5 190 245
6 135 148
7 186 225
8 147 170
9 180 195
10 145 135
2. Ten students are paired in terms of age and year level. One member
of each pair was taught Physics in English and the other member in
Filipino. At the end of the course, a proficiency test was administered.
The data are tabulated below.
Student 1 English Filipino
1 70 82
2 80 80
3 70 81
4 69 70
5 80 86
6 90 55
7 87 82
8 75 35
9 82 78
10 80 85
Is there a significant difference?
3. The following are the test scores of three (3) groups of
respondents.
A B C
12 6 7
11 8 4
10 6 10
9 10 9
14 8 12
12 6 4
13 7 3
14 9 6
11 12 5
10 13 6
References/Bibliography