Testing If Two Measuring Procedures Measure The Same Dimension
This document describes a statistical technique for testing the hypothesis that two sets of measurements differ only due to measurement error, differing units of measurement, and differing origins. The technique involves computing sums of squares and cross products from replicate measurements on subjects using two measuring instruments. A matrix is constructed from these values and used to test if the instruments measure the same underlying dimension. If the matrix is positive definite, the hypothesis that the instruments only differ due to measurement error is rejected. A numerical example applying the technique to vocabulary test data is also provided.
Psychological Bulletin
1973, Vol. 79, No. 1, 71-72
TESTING IF TWO MEASURING PROCEDURES
MEASURE THE SAME DIMENSION¹
FREDERIC M. LORD²
Educational Testing Service
A convenient statistical technique is described for testing the hypothesis that
two sets of measurements differ only because of errors of measurement and because of differing origins and units of measurement.
This note is concerned with testing the hypothesis that two sets of measurements differ only because of (a) errors of measurement, (b) differing units of measurement, and (c) differing arbitrary origins for measurement. It describes a convenient statistical technique that, unnoticed, has become available (Villegas, 1964). It will probably be preferred to other techniques used for this purpose (Forsyth & Feldt, 1969, 1970; Lord, 1957; McNemar, 1958), though each has advantages and disadvantages.

STATISTICAL PROCEDURE

The necessary raw data consist of $r \geq 2$ replicate measurements (indexed by the subscript k) on each of N people (indexed by a) by each of two measuring instruments or procedures (indexed by g or h). For each instrument or procedure separately, compute the usual among-persons and within-persons sums of squares. Also, compute the corresponding sums of cross products between instruments (or procedures). Let W denote the 2 × 2 matrix of within-persons sums of squares and cross products, and let A denote the corresponding 2 × 2 among-persons matrix. In a standard notation, the element of W in column g and row h is

$$\sum_{a=1}^{N} \sum_{k=1}^{r} (x_{gak} - \bar{x}_{ga\cdot})(x_{hak} - \bar{x}_{ha\cdot}).$$

A typical element of A, similarly, is

$$r \sum_{a=1}^{N} (\bar{x}_{ga\cdot} - \bar{x}_{g\cdot\cdot})(\bar{x}_{ha\cdot} - \bar{x}_{h\cdot\cdot}).$$

Let $F_p$ denote the 1 − p percentile of the F distribution with N and N(r − 1) degrees of freedom, for the numerator and denominator, respectively. Compute the matrix:

$$M = (r - 1)A - F_p W.$$

From the product of the two diagonal terms of M subtract the product of the two off-diagonal terms, thus obtaining the value of the determinant $|M|$. If the determinant is positive and if both diagonal terms are also positive, then the F test rejects at significance level < p the hypothesis $H_0$ stated at the beginning of this paper. Otherwise, the hypothesis is not rejected. This same conclusion may be stated more compactly as follows: $H_0$ is rejected if and only if the matrix M is positive definite.

DISCUSSION

The assumptions required for the validity of the foregoing statistical procedure and conclusions are as follows. Within each replication, the errors of measurement for instruments g and h for each person always have a bivariate normal distribution. The mean error is always taken to be zero. No restrictive assumptions are made about the variance or correlation parameters of this distribution, except that the variances are positive. In particular, the errors for g and h may be correlated within a replication and within a person. The bivariate distribution is the same for each replication and for each individual measured. For two replications or two persons, the errors are always uncorrelated with each other.

The unknown relationship between the two origins mentioned in $H_0$, also that between the two units of measurement, represent "nuisance parameters" for the purpose of making a significance test of $H_0$. Villegas (1964) shows that when (and only when) M is positive definite, there is no set of values for these (unknown) nuisance parameters such that an appropriate variance ratio calculated from them and from the data will lie below the p level of significance.

The significance level in the foregoing statement is precisely p. However, when the entire procedure is considered as a test for $H_0$, it is known only (assuming that the replicate measurements are not perfectly correlated) that the significance level of the test will be less than p. Thus, the significance test is conservative in the sense that $H_0$ will be rejected somewhat less often than the p value would indicate.³ It is important in practice that the replications meet the assumptions stated. Any practice effect between replications that increases the within-persons sums of squares, for example, will tend to decrease the chance of rejecting $H_0$.

NUMERICAL EXAMPLE

In the numerical example, measuring instruments g are unspeeded 15-item vocabulary tests; measuring instruments h are highly speeded 75-item vocabulary tests. The r = 2 parallel forms of each test were administered to each of N = 649 examinees. The raw data are the same as those used in a previous numerical example (Lord, 1957, pp. 210-212). The required sums of squares and products are shown in Table 1.

TABLE 1
DATA FOR SPEEDED AND UNSPEEDED VOCABULARY TESTS

Source            Sum of squares   Sum of cross products   Sum of squares
Among persons           93524.56                76176.30        111220.58
Within persons          18533.00                 -547.00         15403.50
Total                  112057.56                75629.30        126624.08

The .05 significance level of F for 649 and 649 degrees of freedom is 1.13. The matrix M is easily found (to the nearest integer) to be

$$M = \begin{bmatrix} 72582 & 76794 \\ 76794 & 93815 \end{bmatrix}.$$

The determinant is positive, so $H_0$ is rejected. This agrees with the conclusion reached previously by large-sample methods (Lord, 1957) under somewhat different assumptions.

¹ This research was sponsored in part by the Personnel and Training Research Programs, Psychological Sciences Division of the Office of Naval Research under Contract No. N00014-69-C-0017 and Contract Authority Identification NR 150-303 and by Educational Testing Service. Reproduction in whole or in part is permitted for any purpose of the United States Government.
² Requests for reprints should be sent to Frederic M. Lord, Division of Psychological Studies, Educational Testing Service, Princeton, New Jersey 08540.
³ The author is indebted to Murray Aitkin and to Leon Gleser for pointing out and clarifying this distinction. See also Saw (1966, Sections 2-3).

REFERENCES

FORSYTH, R. A., & FELDT, L. S. An investigation of empirical sampling distributions of correlation coefficients corrected for attenuation. Educational and Psychological Measurement, 1969, 29, 61-72.
FORSYTH, R. A., & FELDT, L. S. Some theoretical and empirical results related to McNemar's test that the population correlation coefficient corrected for attenuation equals 1.0. American Educational Research Journal, 1970, 7, 197-207.
LORD, F. M. A significance test for the hypothesis that two variables measure the same trait except for errors of measurement. Psychometrika, 1957, 22, 207-220.
McNEMAR, Q. Attenuation and interaction. Psychometrika, 1958, 23, 259-266.
SAW, J. G. A conservative test for the concurrence of several regression lines and related problems. Biometrika, 1966, 53, 272-275.
VILLEGAS, C. Confidence region for a linear relation. The Annals of Mathematical Statistics, 1964, 35, 780-788.

(Received August 23, 1971)