Measurement in Research
MEASUREMENT & DATA COLLECTION
LEARNING OBJECTIVES
■ Identify types of data that researchers collect
1. Face Validity
2. Criterion Related Validity
3. Construct-related Validity
4. Content-related Validity
Types of Validity-Face Validity
• Face validity is concerned with how a measure or procedure appears. Does it
seem like a reasonable way to gain the information the researchers are
attempting to obtain? Does it seem well designed? Does it seem as though it
will work reliably?
• For example, after a group of students sat a test, you could ask them for
feedback, specifically whether they thought the test was a good one. This
enables refinements for the next research project and adds another
dimension to establishing validity.
Types of Validity-Content Validity
Types of Reliability
1. Test-retest
2. Multiple (parallel) forms
3. Inter-rater
4. Split-half
• The test-retest technique is to administer your test, instrument, survey,
or measure to the same group of people at different points in time and then
compare the scores.
An example would be the method of maintaining weights used by the U.S. Bureau
of Standards. Platinum objects of fixed weight (one kilogram, one pound, etc.) are
kept locked away. Once a year they are taken out and weighed, allowing scales to
be reset so they are "weighing" accurately. Keeping track of how much the scales
are off from year to year establishes a stability reliability for these instruments. In
this instance, the platinum weights themselves are assumed to have a perfectly
fixed stability reliability.
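A minimal sketch of how such a stability (test-retest) coefficient might be computed, assuming made-up scores and Python 3.10+ for statistics.correlation; the data and names are illustrative only.

from statistics import correlation

# Hypothetical scores for the same eight people at two points in time
time_1 = [72, 85, 90, 65, 78, 88, 95, 70]
time_2 = [74, 83, 91, 63, 80, 86, 94, 72]

# The test-retest (stability) coefficient is the correlation between the two
# administrations; values near 1.0 indicate a stable measure.
print(f"Test-retest reliability: {correlation(time_1, time_2):.3f}")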
• Parallel-Forms or multi-forms technique
• You create a large set of questions that address the same construct
and then randomly divide the questions into two sets. You administer
both instruments to the same sample of people and compare the two
sets of scores.
• The success of this method hinges on the equivalence of the two
forms of the test.
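The same correlation-based check applies here; in the sketch below, the question pool, the random split into two forms, and the right/wrong responses are all hypothetical.

import random
from statistics import correlation

random.seed(1)  # fixed seed only so the illustrative split is repeatable

# Hypothetical right (1) / wrong (0) answers: six people on an eight-question pool
responses = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0, 1, 0, 1],
    [1, 0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 1, 0, 0],
]

# Randomly divide the question pool into two forms of equal length
n_items = len(responses[0])
items = list(range(n_items))
random.shuffle(items)
form_a, form_b = items[:n_items // 2], items[n_items // 2:]

# Each person's total score on each form, then the correlation between forms
score_a = [sum(person[i] for i in form_a) for person in responses]
score_b = [sum(person[i] for i in form_b) for person in responses]
print(f"Parallel-forms reliability: {correlation(score_a, score_b):.3f}")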
• Inter-Rater Reliability
• Inter-Scorer Reliability refers to the consistency or degree of agreement
between two or more scorers, judges, or raters.
• You could have two judges rate one set of papers. Then you would just
correlate their two sets of ratings to obtain the inter-scorer reliability
coefficient, showing the consistency of the two judges' ratings.
• If, for example, rater A observed a child act out aggressively eight times,
we would want rater B to observe the same number of aggressive acts. If
rater B witnessed 16 aggressive acts, then we know at least one of these
two raters is incorrect.
• If their ratings are positively correlated, however, we can be reasonably
sure that they are measuring the same construct of aggression. It does not,
however, assure that they are measuring it correctly, only that they are
both measuring it the same way.
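As a rough illustration, the inter-scorer coefficient for two judges rating one set of papers could be computed like this; the ratings are invented for the example.

from statistics import correlation

# Hypothetical ratings by two judges for the same ten papers (1-10 scale)
rater_a = [7, 5, 9, 6, 8, 4, 7, 6, 9, 5]
rater_b = [8, 5, 9, 7, 7, 4, 6, 6, 9, 6]

# A high positive correlation means the judges rate consistently,
# though not necessarily correctly.
print(f"Inter-scorer reliability: {correlation(rater_a, rater_b):.3f}")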
• Split-half reliability
• A measure of consistency where a test is split in two and the
scores for each half of the test are compared with one another. If the
two halves are consistent, the experimenter can reasonably believe that
the test is most likely measuring the same thing throughout.
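A hedged sketch of one common way to do this on made-up item scores: split the items into odd and even halves, correlate the half-scores, and then apply the Spearman-Brown formula, a standard correction for estimating full-length-test reliability from a half-test correlation.

from statistics import correlation

# Hypothetical right (1) / wrong (0) item scores for five people on an eight-item test
item_scores = [
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 1, 0, 1, 1],
]

# Split the test in two: odd-numbered items vs. even-numbered items
odd_half = [sum(p[0::2]) for p in item_scores]
even_half = [sum(p[1::2]) for p in item_scores]
r_half = correlation(odd_half, even_half)

# Spearman-Brown estimate of the full-length test's reliability
r_full = 2 * r_half / (1 + r_half)
print(f"Half-test correlation: {r_half:.3f}, full-test estimate: {r_full:.3f}")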
Reliability and Validity
If repeated measurements consistently produce scores that are pretty close to the true value, the test can be considered both valid and reliable.
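One way to picture this point, using hypothetical repeated weighings of a known standard: a small spread across readings suggests reliability, and a small bias from the true value suggests validity.

from statistics import mean, stdev

true_weight = 1000.0                               # grams: the known standard
readings = [998.7, 1001.2, 999.4, 1000.6, 999.9]   # hypothetical repeated weighings

spread = stdev(readings)             # small spread -> consistent (reliable)
bias = mean(readings) - true_weight  # small bias   -> accurate (valid)
print(f"Spread (reliability): {spread:.2f} g; bias (validity): {bias:.2f} g")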