Validity and Reliability: Lesson 3

VALIDITY AND RELIABILITY

Commonly used terms…

“She has a valid point”

“My car is unreliable”

…in science…
“The conclusion of the study was not valid”

“The findings of the study were not reliable.”


• What’s validity?
Some definitions…
• Validity: the soundness or appropriateness of a test or instrument in measuring what it is designed to measure; it denotes the extent to which an instrument is measuring what it is supposed to measure.
• "Validation is inquiry into the soundness of
the interpretations proposed for scores from
a test" Cronbach (1990, p. 145)
• The tool is valid for a particular purpose and
for certain situations.
• For example, if the purpose of a measurement is to determine intelligence, you cannot use this tool to determine anxiety level or a personality disorder.
• So it cannot be used for any other purpose!
• While the definition of validity seems simple, there are several different types of validity that are relevant in the social sciences.
• Each of these types of validity takes a somewhat different approach in assessing the extent to which a measure measures what it purports to.
• Validity was traditionally subdivided into three
categories: criterion-related, content and
construct validity (see Brown 1996, pp. 231-
249).
• 1) Criterion-related validity
   – Concurrent validity
   – Predictive validity
• 2) Content validity
• 3) Construct validity
Criterion-Related Validity
• Criterion validity (or criterion-
related validity) measures how well
one measure predicts an outcome
for another measure.
• Validity is usually determined by comparing two instruments’ ability to predict a similar outcome with a single variable being measured.
• There are two major types of criterion validity: concurrent and predictive.
• 1) Concurrent criterion validity
• 2) Predictive validity
• Concurrent criterion validity is used when the
two instruments are used to measure the
same event at the same time.
• Example: surveys during an election can be said to have concurrent criterion validity if they predict outcomes similar to the actual election results.
CONCURRENT VALIDITY
• The extent to which a procedure correlates with the current behavior of subjects
• Concurrent Validity
– Implies that the test produces similar results to a previously validated test
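A minimal sketch of how this comparison might be computed: scores from a new instrument are correlated with scores from a previously validated one, both administered to the same subjects at the same time. All scores below are invented for illustration.

```python
# Concurrent validity sketch: correlate a new test with an established one.
from scipy.stats import pearsonr

new_test       = [12, 18, 25, 31, 22, 15, 28, 20]  # scores on the new instrument
validated_test = [14, 20, 27, 30, 21, 13, 29, 22]  # scores on the validated instrument

r, p = pearsonr(new_test, validated_test)
print(f"Concurrent validity coefficient: r = {r:.2f} (p = {p:.3f})")
# A strong positive r suggests the new test produces results similar to the
# previously validated test, as described above.
```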
• 2) Predictive validity is used when the instrument is administered, time is allowed to pass, and the scores are then measured against another outcome.
• A test has predictive validity if it accurately predicts what it is supposed to predict.
• For example, TOEFL exhibits predictive validity
for performance in English
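A companion sketch for predictive validity, again with invented numbers: scores from the first administration are correlated with an outcome measured after time has passed.

```python
# Predictive validity sketch: earlier test scores vs. a later outcome.
from scipy.stats import pearsonr

test_scores_t1 = [80, 95, 60, 72, 88, 55, 91, 67]  # instrument administered first
later_outcome  = [75, 92, 58, 70, 85, 60, 89, 64]  # outcome measured after a delay

r, _ = pearsonr(test_scores_t1, later_outcome)
print(f"Predictive validity coefficient: r = {r:.2f}")
```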
2) CONTENT VALIDITY
• A second basic type of validity is content
validity.
• This type of validity has played a major role in
the development and assessment of various
types of tests used in psychology and
especially education.
• Fundamentally, content validity depends on the extent to which an empirical measurement reflects a specific domain of content.
• In psychometrics, content validity
refers to the extent to which a
measure represents all facets of a
given construct.
• For example, a test in arithmetical operations would not be content valid if the test problems focused only on addition, thus neglecting subtraction, multiplication and division.
• For example, a depression scale may lack
content validity if it only assesses the affective
dimension of depression but fails to take into
account the behavioral dimension.
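Content validity is ultimately a matter of expert judgment, but a toy sketch can illustrate the idea of checking whether test items cover every facet of the domain. This is not a procedure from the slides; the facet names and item mapping are hypothetical.

```python
# Toy facet-coverage check for the arithmetic example above.
domain_facets = {"addition", "subtraction", "multiplication", "division"}

# Facet each test item is judged to measure (expert judgment in practice)
item_facets = ["addition", "addition", "addition", "subtraction"]

missing = domain_facets - set(item_facets)
if missing:
    print(f"Content validity concern: uncovered facets: {sorted(missing)}")
# -> Content validity concern: uncovered facets: ['division', 'multiplication']
# A test that samples only some facets under-represents the domain.
```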
3) Construct validity
• To understand the traditional definition of
construct validity, it is first necessary to
understand what a construct is.
• A construct, or psychological construct as it is also called, is an attribute, proficiency, ability, or skill that resides in the human mind and is defined by established theories.
In psychology, a construct is a skill, attribute, or ability (such as intelligence, self-concept, motivation, or aggression) that is based on one or more established theories and can be observed by some type of instrument.
• Construct validity refers to how well a test or
tool measures the constructs that it was
designed to measure.
• In other words, to what extent are the statements we use in the scale to measure depression actually measuring it?
• Construct validity is woven into the theoretical
fabric of the social sciences, and is thus central to
the measurement of abstract theoretical
concepts.
• Indeed, construct validation must be considered within a theoretical context.
• Rosenberg’s self-esteem scale:
theoretically, Rosenberg (1965) has argued that
a student’s level of self-esteem is positively
related to participation in school activities.
• Thus, the theoretical prediction is that the higher the level of self-esteem, the more active the student will be in school-related activities.
• A teacher can administer Rosenberg’s self-
esteem scale to a group of students and can
determine the extent of their involvement in
school activities.

• If the correlation is positive, then this evidence supports the construct validity of Rosenberg’s self-esteem scale.
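A minimal sketch of the correlation step in this example, with fabricated scores: a positive correlation between self-esteem totals and activity counts is consistent with the theoretical prediction.

```python
# Construct validity sketch for the Rosenberg example.
from scipy.stats import pearsonr

self_esteem = [22, 30, 18, 27, 35, 15, 29, 24]  # Rosenberg scale totals
activities  = [ 2,  4,  1,  3,  5,  1,  4,  2]  # school activities per student

r, _ = pearsonr(self_esteem, activities)
print(f"r = {r:.2f}")  # a positive r supports the construct-validity argument
```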
• Another example;
• Someone with depression could report being fatigued, anxious, hopeless, etc., and these symptoms could point towards depression or anxiety, but we cannot determine the severity of depression just by hearing the symptoms, because we need a tool to measure it as closely as possible. Here we are trying to measure the construct called depression.
FACTORS AFFECTING VALIDITY
1. Test-related factors
2. The criterion to which you compare
your instrument may not be well
enough established
3. Intervening events
4. Reliability
• Reliability?
• The consistency of measurements
A RELIABLE TEST
Produces similar scores across
various conditions and situations,
including different evaluators and
testing environments.
RELIABILITY COEFFICIENTS
• The statistic for expressing reliability.
• Expresses the degree of consistency
in the measurement of test scores.
• Denoted by the letter r with two identical subscripts (rxx)
• Reliability coefficients: 0 to 1
• The closer the value is to one (1.00), the higher the assumed reliability.
• 1) Test-retest reliability
• 2) Split-half reliability
• 3) Interrater reliability
1) TEST-RETEST RELIABILITY
• Suggests that subjects tend to obtain the same score when tested at different times.
• Test-retest reliability is a measure of reliability obtained by administering the same test twice over a period of time to a group of individuals. The scores from Time 1 and Time 2 can then be correlated in order to evaluate the test for stability over time.
• Example:  A test designed to assess student
learning in psychology could be given to a
group of students twice, with the second
administration perhaps coming a week after
the first.  The obtained correlation coefficient
would indicate the stability of the scores.
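A minimal sketch of this procedure, using invented scores for the two administrations:

```python
# Test-retest reliability sketch: correlate the same test given twice.
import numpy as np

time1 = np.array([70, 85, 60, 92, 77, 66, 88, 73])  # first administration
time2 = np.array([72, 83, 63, 90, 75, 69, 86, 74])  # same test, one week later

r_xx = np.corrcoef(time1, time2)[0, 1]
print(f"Test-retest reliability: r_xx = {r_xx:.2f}")  # close to 1 = stable over time
```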
Split-Half Reliability
• Sometimes referred to as internal
consistency
• Indicates that subjects’ scores on
some trials consistently match
their scores on other trials
– Split-half testing measures reliability. In split-half reliability, a test for a single knowledge area is split into two parts, and then both parts are given to one group of students at the same time. The scores from both parts of the test are correlated.
• Stages of split-half reliability:
• Administer the test to a large group of students (ideally, over about 30).
• Randomly divide the test questions into two parts. For example, separate even questions from odd questions.
• Score each half of the test for each student.
• Find the correlation coefficient for the two halves.
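A sketch of these stages with invented item responses. The final Spearman-Brown step is a standard correction for the halved test length, although the slides do not mention it.

```python
# Split-half reliability sketch: odd/even split, then correlate the halves.
import numpy as np

# rows = students, columns = items (1 = correct, 0 = incorrect)
responses = np.array([
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0, 1],
    [1, 0, 1, 1, 1, 0, 1, 1],
])

odd_half  = responses[:, 0::2].sum(axis=1)  # items 1, 3, 5, 7
even_half = responses[:, 1::2].sum(axis=1)  # items 2, 4, 6, 8

r_half = np.corrcoef(odd_half, even_half)[0, 1]
r_full = 2 * r_half / (1 + r_half)          # Spearman-Brown correction
print(f"Half-test r = {r_half:.2f}, corrected full-test r = {r_full:.2f}")
```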
INTERRATER RELIABILITY
Involves having two raters independently
observe and record specified behaviors,
such as hitting, crying, yelling, and getting
out of the seat, during the same time period
• Inter-rater reliability is a measure of reliability
used to assess the degree to which different judges
or raters agree in their assessment decisions. 
Inter-rater reliability is useful because human
observers will not necessarily interpret answers
the same way; raters may disagree as to how well
certain responses or material demonstrate
knowledge of the construct or skill being assessed. 
• Example:  Inter-rater reliability might be employed
when different judges are evaluating the degree to
which art portfolios meet certain standards.  Inter-
rater reliability is especially useful when judgments
can be considered relatively subjective.  Thus, the
use of this type of reliability would probably be
more likely when evaluating artwork as opposed to
math problems.
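A sketch of quantifying inter-rater agreement with invented ratings. Percent agreement follows directly from the definition above; Cohen's kappa, a common chance-corrected statistic not named in the slides, is also computed.

```python
# Inter-rater reliability sketch: two raters score the same portfolios.
def cohen_kappa(a, b):
    """Chance-corrected agreement between two raters' label lists."""
    labels = sorted(set(a) | set(b))
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n                    # observed agreement
    p_e = sum((a.count(l) / n) * (b.count(l) / n) for l in labels) # chance agreement
    return (p_o - p_e) / (1 - p_e)

rater_a = ["pass", "fail", "pass", "pass", "fail", "pass", "fail", "pass"]
rater_b = ["pass", "fail", "pass", "fail", "fail", "pass", "fail", "pass"]

agreement = sum(x == y for x, y in zip(rater_a, rater_b)) / len(rater_a)
print(f"Percent agreement = {agreement:.0%}, kappa = {cohen_kappa(rater_a, rater_b):.2f}")
```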
FACTORS AFFECTING RELIABILITY

1. Test length
2. Test-retest interval
3. Variability of scores
4. Guessing
5. Variation within the test situation
Threats to Reliability
• Fatigue

            8 am               9 am               10 am
Subject 1   60 ml·kg⁻¹·min⁻¹   55 ml·kg⁻¹·min⁻¹   50 ml·kg⁻¹·min⁻¹

Therefore, solution = increase time between tests.


Threats to Reliability
• Habituation

            8 am               9 am               10 am
Subject 1   60 ml·kg⁻¹·min⁻¹   65 ml·kg⁻¹·min⁻¹   70 ml·kg⁻¹·min⁻¹

Therefore, solution = familiarise prior to test.


Threats to Reliability
• Standardisation of Procedures
– Control of extraneous variables
Measurement Errors
• Ultimately, reliability is dependent on the
degree of measurement error in a given study

• The overall error in any measurement is composed of both systematic and random error.
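A toy simulation (an illustration under assumed numbers, not from the slides) of why this distinction matters for reliability: a purely systematic error shifts every score equally and leaves the correlation between administrations untouched, while random error attenuates it.

```python
# Systematic vs. random error sketch with simulated scores.
import numpy as np

rng = np.random.default_rng(0)
true_scores = rng.normal(100, 15, size=200)        # latent "true" scores

systematic = true_scores + 5                       # constant bias only
random_err = true_scores + rng.normal(0, 10, 200)  # random error only

print(f"r with systematic error: {np.corrcoef(true_scores, systematic)[0, 1]:.2f}")  # ~1.00
print(f"r with random error:     {np.corrcoef(true_scores, random_err)[0, 1]:.2f}")  # < 1
```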
Relationship between reliability
and validity
• If data are valid, they must be reliable. If people
receive very different scores on a test every
time they take it, the test is not likely to predict
anything.
• However, if a test is reliable, that
does not mean that it is valid.
• Reliability is a necessary, but not sufficient,
condition for validity
