Introduction to Psychometrics

• Psychometrics & Measurement Validity
• Some important language
• Properties of a “good measure”
– Standardization
– Reliability
– Validity
• Common Item types
• Reverse Keying
Psychometrics
(Psychological measurement)

The process of assigning values to represent the amounts and
kinds of specified attributes, to describe (usually) persons.
• We do not “measure people”
• We measure specific attributes or characteristics of a person

Psychometrics is the “centerpiece” of empirical
psychological research and practice.
• All data result from some form of “measurement”
• For those data to be useful we need “Measurement Validity”
• The better the measurement, the better the data, and the better the
conclusions of the psychological research or application
Most of what we try to measure in Psychology are constructs

They’re called constructs because most of what we care about as
psychologists are not physical measurements, such as height,
weight, pressure & velocity…
…rather the “stuff of psychology” → learning, motivation,
anxiety, social skills, depression, wellness, etc. are things that
“don’t really exist”.

They are attributes and characteristics that we’ve constructed to
give organization and structure to behavior. Essentially all of
the things we psychologists research, both as causes and
effects, are Attributive Hypotheses with different levels of
support and acceptance!
Measurement of constructs is more difficult than of physical
properties!
We can’t just walk up to someone with a scale, ruler, graduated
cylinder or velocimeter and measure how depressed they are.

We have to figure out some way to turn their behavior, self-reports
or traces of their behavior into variables that give values for the
constructs we want to measure.

So, measurement is, much like the rest of research that we’ve
learned about so far, all about representation!

Measurement Validity is the extent to which the
data (variable values) we have represent the
behaviors (constructs) we want to study.
What are the different types of constructs we measure?
The most commonly discussed types are ...
• Achievement -- “performance” broadly defined (judgments)
• e.g., scholastic skills, job-related skills, research DVs, etc.
• Attitude/Opinion -- “how things should be” (sentiments)
• polls, product evaluations, etc.
• Personality -- “characterological attributes” (keyed sentiments)
• anxiety, psychoses, assertiveness, etc.
There are other types of measures that are often used…
• Social Skills -- achievement or personality?
• Aptitude -- “how well someone will perform after they are trained and
experienced,” but measured before the training & experience
• some combo of achievement, personality and “likes”
• IQ -- is it achievement (things learned) or is it “aptitude for
academics, career and life”?
Each question/behavior is called an item

Kinds of items → objective items vs. subjective items
• “objective” does not mean “true,” “real,” or “accurate”
• “subjective” does not mean “made up” or “inaccurate”

Items are names for “how the observer/interviewer/coder
transforms participant’s responses into data”
Objective Items - no evaluation, judgment or decision is needed
• either “response = data” or a “mathematical transformation”
• e.g., multiple choice, T&F, matching, fill-in-the-blanks
Subjective Items – response must be evaluated and a decision or
judgment made about what the data value should be
• content coding, diagnostic systems, behavioral taxonomies
• e.g., essays, interview answers, drawings, facial expressions
Some more language …

A collection of items is called many things…
• e.g., survey, questionnaire, instrument, measure, test, or scale

Three “kinds” of item collections you should know…

• Scale (Test) - all items are “put together” to get a single score

• Subscale (Subtest) – item sets are “put together” to get
multiple separate scores

• Survey – each item gives a specific piece of information

Most “questionnaires,” “surveys” or “interviews” are a combination
of all three.
Desirable Properties of Psychological Measures
Interpretability of Individual and Group Scores

Population Norms

Validity

Reliability

Standardization
Standardization
Administration – test is “given” the same way every time
• who administers the instrument
• specific instructions, order of items, timing, etc.
• Varies greatly -- multiple-choice classroom test (hand it out)
vs. WAIS (100+ page administration manual)

Scoring – test is “scored” the same way every time
• who scores the instrument
• correct, “partial” and incorrect answers, points awarded, etc.
• Varies greatly -- multiple-choice test (fill in the sheet)
vs. WAIS (200+ page scoring manual)
Reliability (Agreement or Consistency)

Inter-rater or inter-observer reliability
• do multiple observers/coders score an item the same way?
→ important whenever using subjective items
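To illustrate, a minimal sketch (all ratings hypothetical) of two common agreement indices for a subjective item, percent agreement and Cohen’s kappa, computed by hand in Python:

from collections import Counter

# Hypothetical codes assigned by two raters to the same 10 responses
rater_a = ["yes", "yes", "no", "yes", "no", "no", "yes", "no", "yes", "yes"]
rater_b = ["yes", "no",  "no", "yes", "no", "yes", "yes", "no", "yes", "yes"]

n = len(rater_a)
observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n  # percent agreement

# Chance agreement: product of each rater's marginal proportions, summed over codes
pa, pb = Counter(rater_a), Counter(rater_b)
expected = sum((pa[c] / n) * (pb[c] / n) for c in set(rater_a) | set(rater_b))

kappa = (observed - expected) / (1 - expected)  # Cohen's kappa: agreement beyond chance
print(f"agreement = {observed:.2f}, kappa = {kappa:.2f}")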

Internal reliability -- do the items measure a central “thing”?
• Cronbach’s alpha → α = .00 – 1.00 → higher values mean
stronger internal consistency/reliability
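As a sketch (hypothetical respondent-by-item scores), Cronbach’s alpha from its standard formula, α = k/(k−1) × (1 − Σ item variances / variance of the total score):

import numpy as np

# Hypothetical data: 5 respondents x 4 items, each scored 1-5
scores = np.array([[4, 5, 4, 5],
                   [2, 1, 2, 1],
                   [3, 3, 4, 3],
                   [5, 4, 5, 5],
                   [1, 2, 1, 2]])

k = scores.shape[1]                          # number of items
item_vars = scores.var(axis=0, ddof=1)       # variance of each item
total_var = scores.sum(axis=1).var(ddof=1)   # variance of the summed scale score

alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")     # near 1.00 -> strong internal consistency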

External reliability -- consistency of scale/test scores
• test-retest reliability – correlate scores from the same test given
3-18 weeks apart
• alternate forms reliability – correlate scores from two
equivalent forms of the same test
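Both external reliability coefficients are just correlations; a sketch with hypothetical scores (for alternate forms, the second vector would hold scores on form B instead of a retest):

import numpy as np

time1 = np.array([23, 31, 18, 27, 35, 22, 29])  # scores at first administration
time2 = np.array([25, 30, 20, 26, 33, 21, 31])  # same people, weeks later

r = np.corrcoef(time1, time2)[0, 1]  # test-retest reliability coefficient
print(f"test-retest r = {r:.2f}")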
Validity (Consistent Accuracy)
Face Validity -- do the items appear to come from the “domain of interest”?
non-statistical -- decision of the “target population”

Content Validity -- do the items come from the “domain of interest”?
non-statistical -- decision of “experts in the field”
Criterion-related Validity -- does test correlate with “criterion”?
• statistical -- requires a criterion that you “believe in”
• predictive, concurrent, postdictive validity

Construct Validity -- does the test relate to other measures as it should?
• statistical -- discriminant validity
• convergent validity -- correlates with selected tests
• divergent validity -- doesn’t correlate with others
“Is the test valid?”
Jum Nunnally (one of the founders of modern psychometrics)
claimed this was a “silly question”! The point wasn’t that tests
shouldn’t be “valid” but that a test’s validity must be assessed
relative to…
• the construct it is intended to measure
• the population for which it is intended (e.g., age, level)
• the application for which it is intended (e.g., for classifying
folks into categories vs. assigning them quantitative values)

So, the real question is, “Is this test a valid measure of this
construct for this population in this application?” That
question can be answered!
Face Validity
Does the test “look like” a measure of the construct of interest?
• “looks like” a measure of the desired construct to a member
of the target population
• will someone recognize the type of information they are
responding to?
• Possible advantage of face validity…
• If the respondent knows what information we are looking for,
they can use that “context” to help interpret the
questions and provide more useful, accurate answers
• Possible limitation of face validity …
• if the respondent knows what information we are looking for,
they might try to “bend & shape” their answers to what
they think we want -- “fake good” or “fake bad”
Content Validity
Does the test contain items from the desired “content domain”?
• Based on assessment by “subject matter experts” (SMEs) in
that content domain
• Is especially important when a test is designed to have low face
validity
• e.g., tests of “honesty” used for hiring decisions
• Is generally simpler for “achievement tests” than for
“psychological constructs” (or other “less concrete” ideas)
• e.g., it is a lot easier for “math experts” to agree whether or
not an item should be on an algebra test than it is for
“psychological experts” to agree whether or not an item
should be on a measure of depression.
• Content validity is not “tested for”. Rather it is “assured” by the
informed item selections made by experts in the domain.
Criterion-related Validity
Do the test scores correlate with criterion behavior scores?

concurrent -- test taken now “replaces” criterion measured now
• often the goal is to substitute a “shorter” or “cheaper” test
• e.g., the written driver’s test replaces the road test
predictive -- test taken now predicts criterion measured later
• want to estimate what will happen before it does
• e.g., your GRE score (taken now) predicts grad school
performance (later)
postdictive – test taken now captures behavior & affect from before
• most of the behavior we study “has already happened”
• e.g., adult memories of childhood feelings or medical
history
When the criterion behavior occurs (test taken now):
Before → postdictive
Now → concurrent
Later → predictive
Construct Validity
• Does the test interrelate with other tests as a measure of this
construct should?
• We use the term construct to remind ourselves that many of the
terms we use do not have an objective, concrete reality.
• Rather they are “made up” or “constructed” by us in our
attempts to organize and make sense of behavior and
other psychological processes
• attention to construct validity reminds us that our defense of the
constructs we create is really based on the “whole
package” of how the measures of different constructs relate
to each other
• So, construct validity “begins” with content validity (are these the
right types of items?) and then adds the question, “does this
test relate as it should to other tests of similar and
different constructs?”
The statistical assessment of Construct Validity …

Discriminant Validity
• Does the test show the “right” pattern of interrelationships with
other variables? -- has two parts
• Convergent Validity -- test correlates with other measures of
similar constructs
• Divergent Validity -- test isn’t correlated with measures of
“other, different constructs”
• e.g., a new measure of depression should …
• have “strong” correlations with other measures of “depression”
• have negative correlations with measures of “happiness”
• have “substantial” correlation with measures of “anxiety”
• have “minimal” correlations with tests of “physical health”,
“faking bad”, “self-evaluation”, etc.
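A sketch of that pattern check with simulated, purely hypothetical scores (the variable names and the simulated relationships are illustrative only):

import numpy as np

rng = np.random.default_rng(0)
new_dep = rng.normal(size=100)                        # hypothetical new depression scale
old_dep = new_dep + rng.normal(scale=0.5, size=100)   # established measure of a similar construct
health  = rng.normal(size=100)                        # unrelated construct (physical health)

convergent = np.corrcoef(new_dep, old_dep)[0, 1]  # expect "strong"
divergent  = np.corrcoef(new_dep, health)[0, 1]   # expect "minimal"
print(f"convergent r = {convergent:.2f}, divergent r = {divergent:.2f}")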
Population Norms
In order to interpret a score from an individual or group, you must
know what scores are typical for that population
• Requires a large representative sample of the target population
• preferably → random, researcher-selected & stratified
• Requires solid standardization → both administration & scoring
• Requires great inter-rater reliability if subjective items are used

The result?
A scoring distribution of the population.
• lets us identify “normal,” “high” and “low” scores
• lets us identify “cutoff scores” to define important populations
and subpopulations
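For illustration, a sketch of interpreting one individual’s score against a hypothetical norm sample, via a percentile rank and a cutoff score (all numbers invented):

import numpy as np

norms = np.random.default_rng(1).normal(50, 10, size=5000)  # hypothetical norm sample

score = 68
percentile = (norms < score).mean() * 100  # percent of the population scoring lower
cutoff = np.percentile(norms, 95)          # e.g., top 5% defines a "high" subpopulation

print(f"score {score} is at the {percentile:.0f}th percentile; 'high' cutoff = {cutoff:.1f}")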
Desirable Properties of Psychological Measures
Interpretability of Individual and Group Scores

Population Norms
Scoring Distribution & Cutoffs

Validity
Face, Content, Criterion-Related, Construct

Reliability
Interrater, Internal Consistency, Test-Retest & Alternate Forms

Standardization
Administration & Scoring
Reverse Keying
We want the respondents to carefully read and separately respond
to each item of our scale/test. One thing we do is to write the
items so that some of them are “backwards” or “reversed”…
Consider these items from a depression measure…
(1 = disagree … 5 = agree)
1. It is tough to get out of bed some mornings. 1 2 3 4 5
2. I’m generally happy about my life. 1 2 3 4 5
3. I sometimes just want to sit and cry. 1 2 3 4 5
4. Most of the time I have a smile on my face. 1 2 3 4 5

If the person is “depressed,” we would expect them to give a fairly
high rating for questions 1 & 3, but a low rating on 2 & 4.
Before aggregating these items into a composite scale or test
score, we would “reverse key” items 2 & 4 (1=5, 2=4, 4=2, 5=1)
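A minimal sketch of that reverse-keying step (responses hypothetical; on a 1-5 scale the reversed value is 6 minus the response, which matches the 1=5, 2=4, 4=2, 5=1 mapping above):

responses = {1: 5, 2: 1, 3: 4, 4: 2}   # item number -> hypothetical rating (1-5)
reversed_items = {2, 4}                 # the "backwards" items to reverse-key

keyed = {item: (6 - r if item in reversed_items else r)
         for item, r in responses.items()}

depression_score = sum(keyed.values())  # composite score after reverse keying
print(keyed, depression_score)          # {1: 5, 2: 5, 3: 4, 4: 4} -> 18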

You might also like