Reliability & Validity

This document discusses the concepts of reliability and validity in research. Reliability refers to the consistency and dependability of measurement tools or procedures. There are several types of reliability, including test-retest reliability, internal consistency, and inter-rater reliability. Validity refers to whether a measurement tool accurately measures what it is intended to measure. Types of validity include construct validity, face validity, content validity, predictive validity, and concurrent validity. Together, reliability and validity are important for establishing the quality and accuracy of research measurements and results.


Reliability & Validity

What is Reliability?
Reliability: Consistency and dependability.
If a measurement device or procedure consistently
assigns the same score to individuals or objects with
equal values, the device is considered reliable.
Researchers must establish the reliability of their
measurement devices in order to be certain that
they are obtaining a systematic and consistent
record of the variation in X and Y.

Types of Reliability
Several types:
Test-retest reliability and alternate forms reliability
Inter-item reliability and internal consistency
Split-half reliability
Inter-rater reliability
Scorer reliability

Test-retest Reliability

Measure the scores twice with the same
instrument. Reliable measures should produce
very similar scores. Examples:
IQ tests typically show high test-retest reliability.
The reliability of a bathroom scale can be tested
by recording your weight 2-3 times within a
minute or two.

Alternate Forms Reliability

Test-retest procedures may not be useful when
participants may be able to recall their previous
responses and simply repeat them upon retesting. In
cases where administering the exact same test will not
necessarily be a good test of reliability, we may use
alternate forms reliability. As the name implies, two
or more versions of the test are constructed that are
equivalent in content and level of difficulty. Professors use
this technique to create makeup or replacement exams
because students may already know the questions from
the earlier exam.

Inter-item reliability
Inter-item reliability: The degree to which different
items measuring the same variable attain
consistent results.
Scores on different items designed to measure the
same construct should be highly correlated. It also
goes by the name internal consistency.
Example: Math tests often ask you to solve several
examples of the same type of problem. Your
scores on these questions will normally represent
your ability to solve this type of problem, and the
test would have high inter-item reliability.
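Internal consistency is commonly summarized with Cronbach's alpha, computed from the item variances and the variance of the total scores. A minimal sketch with invented data (rows are respondents, columns are items):

```python
# Internal consistency (inter-item reliability) via Cronbach's alpha:
# alpha = k/(k-1) * (1 - sum of item variances / variance of totals).
from statistics import pvariance

scores = [  # each inner list: one respondent's answers to 4 items
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 5],
    [3, 3, 3, 4],
    [1, 2, 2, 1],
]

k = len(scores[0])                                   # number of items
items = list(zip(*scores))                           # transpose: one tuple per item
item_vars = sum(pvariance(item) for item in items)   # sum of item variances
total_var = pvariance([sum(row) for row in scores])  # variance of total scores

alpha = k / (k - 1) * (1 - item_vars / total_var)
print(round(alpha, 3))  # alpha >= 0.7 is a common rule of thumb
```

Items that all tap the same construct move together, which makes the total-score variance large relative to the item variances and pushes alpha toward 1.0.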

Inter-rater reliability
When observers must use their own judgment to
interpret the events they are observing
(including live or videotaped behaviors and
written answers to open-ended interview
questions), scorer reliability must be measured.
Have different observers take measurements of
the same responses; the agreement between
their measurements is called inter-rater reliability.
Their results can be compared statistically and
represent the scorers' reliability.
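The simplest statistical comparison is percent agreement between the raters' codes. A minimal sketch (the behavior categories and codes are invented):

```python
# Inter-rater reliability as percent agreement between two observers
# coding the same eight observations (hypothetical data).
rater_a = ["aggr", "play", "play", "aggr", "rest", "play", "rest", "aggr"]
rater_b = ["aggr", "play", "rest", "aggr", "rest", "play", "rest", "play"]

agree = sum(a == b for a, b in zip(rater_a, rater_b))
agreement = agree / len(rater_a)
print(agreement)  # 0.75: the raters agreed on 6 of 8 observations
```

Percent agreement ignores chance agreement; a chance-corrected statistic such as Cohen's kappa is often preferred in practice.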

What is Validity?

A measure is valid if it measures what it is supposed to
measure, and does so cleanly without accidentally
including other factors.
Most experiments are designed to measure
hypothetical constructs such as intelligence, learning,
or love. The experimenter must create an operational
definition of the dependent variable because one
cannot measure these hypothetical constructs directly.
A valid measure is one that measures this hypothetical
construct accurately (such as intelligence) without
being influenced by other factors (such as motivation).

Types of Validity
Validity: (actually studying the
variables that we wish to study)
Construct validity
Face validity
Content validity
Criterion validity -- 2 types:
Predictive validity
Concurrent validity

Construct Validity
Do my dependent variables actually
measure the hypothetical construct that I
want to test?
Does my IQ test really measure IQ, and
nothing else?
Do my procedures actually measure
learning, (without being influenced by
motivation)?
Does my personality test really measure
personality traits without including fatigue?

Face Validity
The consensus (usually by experts in the field) that a
measure represents a particular concept. It is the least
stringent type of validity. Because most psychological
variables require indirect measures (like the intelligence
example before), the validity of a measured definition
may not be self-evident.
Does rate of eating really reflect hunger? In rats, does the
rate of lever pressing actually measure learning?
Does talking measure extroversion?
Does GPA or SAT score really reflect intelligence?

Comparing face validity with construct validity

Face validity: The consensus that a measure
represents a particular concept, the face
value of the measure. (Would a 130-pound,
5'3" college student be a good football or
basketball player?)
Construct validity: The accuracy with which
a measure represents the particular concept,
without influence of additional factors.
Construct validity implies that other
operational definitions of the same construct
will yield correlated results.

Content Validity
Does the content of our measure fairly reflect the
content of the thing we are measuring?
Example: Do the questions on an exam accurately
reflect what you have learned in the course, or were
the exam questions sampled from only a subsection of the material?
A test to measure your knowledge of mathematics
should not be limited to addition problems, nor
should it include questions about French literature.
It should cover the entire range of appropriate math
problems you are trying to measure.

Criterion Validity
A powerful indicator of the validity of a
measure is its ability to accurately predict
performance on other, independent outcome
measures (referred to as criterion measures).
The extent to which your SAT score predicts
your college GPA is an indication of the SAT's
criterion validity.
There are two approaches to criterion
validity: Concurrent validity and Predictive
validity.

Concurrent vs. Predictive Validity
In concurrent validity, the SAT test scores and
criterion measures (high school GPA) are
obtained at roughly the same time
(concurrent).
If the SAT shows high concurrent validity, it will
be highly correlated with GPA obtained at the
same time the SAT is taken.
Predictive validity, however, would be high if
your SAT score accurately predicted your
college GPA, which is obtained long after taking
the SAT.
