Testing If Two Measuring Procedures Measure The Same Dimension

This document describes a statistical technique for testing the hypothesis that two sets of measurements differ only due to measurement error, differing units of measurement, and differing origins. The technique involves computing sums of squares and cross products from replicate measurements on subjects using two measuring instruments. A matrix is constructed from these values and used to test if the instruments measure the same underlying dimension. If the matrix is positive definite, the hypothesis that the instruments only differ due to measurement error is rejected. A numerical example applying the technique to vocabulary test data is also provided.

Psychological Bulletin

1973, Vol. 79, No. 1, 71-72

TESTING IF TWO MEASURING PROCEDURES MEASURE THE SAME DIMENSION¹

FREDERIC M. LORD²
Educational Testing Service

A convenient statistical technique is described for testing the hypothesis that two sets of measurements differ only because of errors of measurement and because of differing origins and units of measurement.

This note is concerned with testing the hypothesis that two sets of measurements differ only because of (a) errors of measurement, (b) differing units of measurement, and (c) differing arbitrary origins for measurement. It describes a convenient statistical technique that, unnoticed, has become available (Villegas, 1964). It will probably be preferred to other techniques used for this purpose (Forsyth & Feldt, 1969, 1970; Lord, 1957; McNemar, 1958), though each has advantages and disadvantages.

STATISTICAL PROCEDURE

The necessary raw data consist of r ≥ 2 replicate measurements (indexed by the subscript k) on each of N people (indexed by a) by each of two measuring instruments or procedures (indexed by g or h). For each instrument or procedure separately, compute the usual among-persons and within-persons sums of squares. Also, compute the corresponding sums of cross products between instruments (or procedures). Let W denote the 2 × 2 matrix of within-persons sums of squares and cross products, and let A denote the corresponding 2 × 2 among-persons matrix. In a standard notation, the element of W in column g and row h is

\[ \sum_{a=1}^{N} \sum_{k=1}^{r} (x_{gak} - x_{ga\cdot})(x_{hak} - x_{ha\cdot}). \]

A typical element of A, similarly, is

\[ r \sum_{a=1}^{N} (x_{ga\cdot} - x_{g\cdot\cdot})(x_{ha\cdot} - x_{h\cdot\cdot}). \]

Let F_p denote the 1 − p percentile of the F distribution with N and N(r − 1) degrees of freedom, for the numerator and denominator, respectively. Compute the matrix:

\[ M = (r - 1)A - F_p W. \]

From the product of the two diagonal terms of M subtract the product of the two off-diagonal terms, thus obtaining the value of the determinant |M|. If the determinant is positive and if both diagonal terms are also positive, then the F test rejects at significance level < p the hypothesis H0 stated at the beginning of this paper. Otherwise, the hypothesis is not rejected. This same conclusion may be stated more compactly as follows: H0 is rejected if and only if the matrix M is positive definite.

DISCUSSION

The assumptions required for the validity of the foregoing statistical procedure and conclusions are as follows. Within each replication, the errors of measurement for instruments g and h for each person always have a bivariate normal distribution. The mean error is always taken to be zero. No restrictive assumptions are made about the variance or correlation parameters of this distribution, except that the variances are positive. In particular, the errors for g and h may be correlated within a replication and within a person. The bivariate distribution is the same for each replication and for each individual measured. For two replications or two persons, the errors are always uncorrelated with each other.

The unknown relationship between the two origins mentioned in H0, also that between the two units of measurement, represent "nuisance parameters" for the purpose of making a significance test of H0. Villegas (1964) shows that when (and only when) M is positive definite, there is no set of values for these (unknown) nuisance parameters such that an appropriate variance ratio calculated from them and from the data will lie below the p level of significance.

The significance level in the foregoing statement is precisely p. However, when the entire procedure is considered as a test for H0, it is known only (assuming that the replicate measurements are not perfectly correlated) that the significance level of the test will be less than p. Thus, the significance test is conservative in the sense that H0 will be rejected somewhat less often than the p value would indicate.³ It is important in practice that the replications meet the assumptions stated. Any practice effect between replications that increases the within-persons sums of squares, for example, will tend to decrease the chance of rejecting H0.

NUMERICAL EXAMPLE

In the numerical example, measuring instruments g are unspeeded 15-item vocabulary tests; measuring instruments h are highly speeded 75-item vocabulary tests. The r = 2 parallel forms of each test were administered to each of N = 649 examinees. The raw data are the same as those used in a previous numerical example (Lord, 1957, pp. 210-212). The required sums of squares and products are shown in Table 1.

TABLE 1
DATA FOR SPEEDED AND UNSPEEDED VOCABULARY TESTS

Source            Sum of squares    Sum of products    Sum of squares
Among persons           93524.56           76176.30         111220.58
Within persons          18533.00            -547.00          15403.50
Total                  112057.56           75629.30         126624.08

The .05 significance level of F for 649 and 649 degrees of freedom is 1.13. The matrix M is easily found (to the nearest integer) to be

\[ M = \begin{bmatrix} 72582 & 76794 \\ 76794 & 93815 \end{bmatrix}. \]

The determinant is positive, so H0 is rejected. This agrees with the conclusion reached previously by large-sample methods (Lord, 1957) under somewhat different assumptions.

¹ This research was sponsored in part by the Personnel and Training Research Programs, Psychological Sciences Division of the Office of Naval Research under Contract No. N00014-69-C-0017 and Contract Authority Identification NR 150-303 and by Educational Testing Service. Reproduction in whole or in part is permitted for any purpose of the United States Government.

² Requests for reprints should be sent to Frederic M. Lord, Division of Psychological Studies, Educational Testing Service, Princeton, New Jersey 08540.

³ The author is indebted to Murray Aitkin and to Leon Gleser for pointing out and clarifying this distinction. See also Saw (1966, Sections 2-3).

REFERENCES

FORSYTH, R. A., & FELDT, L. S. An investigation of empirical sampling distributions of correlation coefficients corrected for attenuation. Educational and Psychological Measurement, 1969, 29, 61-72.

FORSYTH, R. A., & FELDT, L. S. Some theoretical and empirical results related to McNemar's test that the population correlation coefficient corrected for attenuation equals 1.0. American Educational Research Journal, 1970, 7, 197-207.

LORD, F. M. A significance test for the hypothesis that two variables measure the same trait except for errors of measurement. Psychometrika, 1957, 22, 207-220.

McNEMAR, Q. Attenuation and interaction. Psychometrika, 1958, 23, 259-266.

SAW, J. G. A conservative test for the concurrence of several regression lines and related problems. Biometrika, 1966, 53, 272-275.

VILLEGAS, C. Confidence region for a linear relation. The Annals of Mathematical Statistics, 1964, 35, 780-788.

(Received August 23, 1971)
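
The computation described under STATISTICAL PROCEDURE, and the figures quoted in the NUMERICAL EXAMPLE, can be sketched in a few lines of Python. This is only an illustration and not code from the paper: the function names (sums_of_products, build_m, rejects_h0) are hypothetical, a balanced design with exactly r replicates for every person is assumed, and the critical value F_p is passed in as a number rather than computed (the example uses the 1.13 quoted in the text). The matrix M is taken to be (r − 1)A − F_p W, which reproduces the published values of M from Table 1.

```python
# Illustrative sketch only; names and structure are assumptions, not the author's code.

def sums_of_products(x, y):
    """Return (among, within) sums of cross products for two instruments.

    x[a][k] and y[a][k] hold replicate k for person a (balanced design assumed).
    Passing the same data for x and y gives the sums of squares.
    """
    n = len(x)       # number of persons, N
    r = len(x[0])    # number of replicates per person
    x_person = [sum(row) / r for row in x]   # person means for x
    y_person = [sum(row) / r for row in y]   # person means for y
    x_grand = sum(x_person) / n
    y_grand = sum(y_person) / n
    within = sum(
        (x[a][k] - x_person[a]) * (y[a][k] - y_person[a])
        for a in range(n)
        for k in range(r)
    )
    among = r * sum(
        (x_person[a] - x_grand) * (y_person[a] - y_grand) for a in range(n)
    )
    return among, within


def build_m(A, W, r, f_crit):
    """M = (r - 1) * A - f_crit * W for 2 x 2 matrices given as nested lists."""
    return [
        [(r - 1) * A[i][j] - f_crit * W[i][j] for j in range(2)]
        for i in range(2)
    ]


def rejects_h0(M):
    """True when the 2 x 2 matrix M is positive definite (H0 rejected)."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return det > 0 and M[0][0] > 0 and M[1][1] > 0


# Among-persons and within-persons rows of Table 1.
A = [[93524.56, 76176.30], [76176.30, 111220.58]]
W = [[18533.00, -547.00], [-547.00, 15403.50]]

M = build_m(A, W, r=2, f_crit=1.13)
M_rounded = [[round(v) for v in row] for row in M]
print(M_rounded)      # [[72582, 76794], [76794, 93815]]
print(rejects_h0(M))  # True
```

Checking the determinant together with both diagonal terms follows the wording of the procedure; for a symmetric 2 × 2 matrix this is equivalent to positive definiteness. In practice the critical value would come from an F table or a statistics library for N and N(r − 1) degrees of freedom.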
