1 Evaluating Selection Techniques and Decisions

Psy 3601 Industrial Psychology 11/19/2024


2 Learning objectives

 Understand how to determine the reliability of a test and the factors that
affect test reliability
 Understand the ways to validate a test
 Learn how to find information about tests
 Understand how to determine the utility of a selection test
 Understand how to use test scores to make personnel selection
decisions



3 Optimal Employee Selection Systems
 Are Reliable
 Are Valid
 Based on a job analysis (content validity)
 Predict work-related behavior (criterion validity)
 Reduce the chance of a legal challenge and are fair
 Face valid
 Don’t invade privacy
 Don’t intentionally discriminate
 Are Cost Effective
 Cost to purchase/create
 Cost to administer
 Cost to score
4 Reliability
 The extent to which a score from a test is consistent and
free from errors of measurement
 Methods of Determining Reliability
 Test-retest (temporal stability)
 Alternate forms (form stability)
 Internal reliability (item stability and item consistency)
 Scorer reliability—inter-rater reliability



5 Test-Retest Reliability
 Measures temporal stability
 Administration
 Same applicants
 Same test
 Two testing periods
 Scores at time one are correlated with scores at time two
 The correlation should be above 0.70; most tests used in
industry exceed this value
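The test-retest procedure above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the two score lists are hypothetical data for six applicants, and `pearson_r` is a helper written here for the example.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores: same applicants, same test, two testing periods.
time_1 = [82, 75, 91, 68, 88, 79]
time_2 = [85, 78, 90, 70, 91, 80]

test_retest_r = pearson_r(time_1, time_2)
meets_standard = test_retest_r >= 0.70  # the 0.70 rule of thumb from the slide
print(round(test_retest_r, 2), meets_standard)
```

The same function applies to alternate-forms reliability if the two lists hold scores from Form A and Form B instead of two testing periods.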


6 Test-Retest Reliability: Problems

Sources of measurement error
 The characteristic or attribute being measured may change over
time
 Reactivity
 Carryover effects (test scores improve when the test is taken a
second time: d = .46)
 Useful for sensory-motor and psychomotor tests
 Practical problems
 Time consuming
 Expensive (because of attrition, you may not find the same people
to take the test again)
 Inappropriate for some types of tests. Why?
 Establishing the time interval is difficult, but 3 days to 3 months
is typical

7 Alternate Forms Reliability

Administration
 Two forms of the same test are developed and, to the
highest degree possible, are equivalent in terms of
content, response process, and statistical characteristics
 One form is administered to examinees and, at some later
date, the same examinees take the second form, or
 The two forms are given to participants in counterbalanced
order, so that any effect of taking one form first is balanced:
each form is equally likely to be given first or second
 Why use alternate forms? Retaking the same test improves scores
on cognitive ability and knowledge tests (d = .46 for retaking the
same test vs. d = .24 for an alternate form), so creating alternate
forms is preferable. The need for alternate forms also arises when
large numbers of people take the test (faculty also use alternate
forms)


8 Alternate Forms Reliability: Scoring
 Scores from the first form of the test are correlated with
scores from the second form
 If the scores are highly correlated, the test has form
stability. There are two ways to estimate this type of reliability:
back-to-back administration and administration separated by a
time interval.


9 Alternate Forms Reliability

Disadvantages
 Difficult to develop, though easier for math and vocabulary tests
 Content sampling errors (items may not cover enough of the
content area)
 Time sampling errors (the time between the two administrations;
alternate-forms score difference: d = .24). The shorter the
interval, the better, because it is clearer where the error is
coming from
 This type of reliability estimate is usually the lowest of all the
reliability estimates, and it is the best type because it covers
more variance than the other reliability estimates


10 Internal Consistency Reliability
 Defines measurement error strictly in terms of
consistency or inconsistency in the content of the test.
 Used when it is impractical to administer two separate
forms of a test.
 With this form of reliability the test is administered only
once and measures item stability.



11 Common Methods

 Cronbach’s Coefficient Alpha
 Used with ratio or interval data. McDonald’s omega is now
considered more appropriate, but alpha has traditionally been
used
 Kuder-Richardson Formula 20 (KR-20)
 Used for tests with dichotomous items (yes-no, true-false)
 KR-20 is the average of all possible split-half reliabilities
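As a rough illustration of coefficient alpha, the sketch below uses the standard formula alpha = k/(k − 1) × (1 − sum of item variances ÷ variance of total scores). The response matrix is hypothetical (five respondents, four items), and `cronbach_alpha` is a helper written for this example rather than a library function. With dichotomous (0/1) items the same computation corresponds to KR-20.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Coefficient alpha for a respondents-by-items score matrix."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical data: each row is one respondent, each column one item (1-5 scale).
responses = [
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [1, 2, 2, 1],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 2), alpha >= 0.70)
```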


12 Interrater Reliability
 Used when human judgment of performance is involved in
the selection process
 Refers to the degree of agreement between two or more
raters
 Scorer reliability is especially an issue in projective or
subjective tests in which there is no one correct answer,
but even tests scored with the use of keys suffer from
scorer mistakes. Therefore, if two people are scoring the
test, interscorer correlations should be used
 If there are more than two raters, we use the intraclass correlation


13 Reliability: Conclusions
 The higher the reliability of a selection test the better.
Reliability should be 0.70 or higher
 Reliability can be affected by many factors (see the next
slide)
 If a selection test is not reliable, it is useless as a tool for
selecting individuals
 Meta-analyses of reliability coefficients give us a standard
against which to compare the reliability obtained from your own
data


14 Factors affecting reliability
 Individual differences among the respondents
(the greater the differences, the higher the reliability)
 Stability of traits (stable traits give better test-retest reliability)
 Sampling: representative samples are more reliable (large samples
give better estimates; 100-200 is best)
 Test difficulty (when difficulty restricts score variance, reliability is low)
 Administration and scoring
 Length of the test (longer tests are more reliable)


15 Validity
 Definition: The degree to which inferences from scores on
tests or assessments are justified by the evidence
(building evidence for the inferences)
 accuracy of judgments or inferences made from the test
scores
 Common Ways to Measure
 Content Validity
 Criterion Validity
 Construct Validity



16 Content Validity
 The extent to which test items sample the content that
they are supposed to measure
 Refers to the representativeness of the items of the
content domain.
 In industry the appropriate content of a test battery is
determined by a job analysis
 One way to test the content validity of a test is to have
subject matter experts (e.g., experienced employees,
supervisors) rate test items on the extent to which the
content and level of difficulty for each item are related to
the job in question.
 No validity coefficient is computed; the only index is the
content validity ratio
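The content validity ratio mentioned above can be sketched as follows. This assumes Lawshe's formulation, CVR = (n_e − N/2) ÷ (N/2), where N is the number of subject matter experts and n_e is the number who rate the item as essential. The "essential" rating and the panel size used below are assumptions for illustration; the slides do not specify them.

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe-style CVR for one item: ranges from -1.0 to +1.0."""
    half = n_experts / 2
    return (n_essential - half) / half


# Hypothetical panel of 10 subject matter experts rating three items.
for item, n_essential in [("item 1", 10), ("item 2", 8), ("item 3", 5)]:
    print(item, content_validity_ratio(n_essential, n_experts=10))
```

A CVR of 0 means exactly half of the experts rated the item as essential; values near +1 indicate strong agreement that the item belongs on the test.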


17 Criterion-Related Validity
 Criterion validity refers to the extent to which a test score
is related to some measure of job performance, called a
criterion. It is needed if you are measuring competencies
 Established using one of the following research designs:
 Concurrent validity (a weak design due to range restriction)
 Predictive validity (a better design, but may not be possible)
 Validity generalization


18 Concurrent Validity
 Uses current employees

 Range restriction can be a problem



19 Predictive Validity
 Correlates test scores with future behavior
 Reduces the problem of range restriction
 May not be practical



20 Validity Generalization
 Validity Generalization is the extent to which a test found
valid for a job in one location is valid for the same job in a
different location
 The keys to establishing validity generalization are
meta-analysis (finding the average validity of a test for a job)
and job analysis (showing that the jobs in which the test was
validated are similar to the job in question)
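The "average validity" step of a meta-analysis can be illustrated with a sample-size-weighted mean of validity coefficients. This is a bare-bones sketch: the three studies are hypothetical, and a full validity generalization analysis would also correct for artifacts such as range restriction and unreliability, which this example does not attempt.

```python
def weighted_mean_validity(studies):
    """Sample-size-weighted mean r across (sample_size, validity) pairs."""
    total_n = sum(n for n, _ in studies)
    return sum(n * r for n, r in studies) / total_n


# Hypothetical validation studies of the same test for the same job.
studies = [(100, 0.30), (200, 0.40), (50, 0.20)]

mean_validity = weighted_mean_validity(studies)
print(round(mean_validity, 3))
```

Weighting by sample size gives larger studies more influence, because their validity coefficients are more stable.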


21 Construct Validity
 The extent to which a test actually measures the
construct
 Is concerned with inferences about test scores
 Determined by correlating scores on a test with scores
from other test (different tests measuring same construct
should converge)
 Conscientiousness scores from paper pencil and
structured interview should correlate highly.
 Factorial validity is also assessed as a way of establishing
construct validity
 For assessment centers, the same method measuring
different constructs tends to correlate more highly than
different methods measuring the same construct, meaning
the correlations reflect the method rather than the
constructs (method variance)


22 Construct Validity (continued)
 Another method of establishing construct validity is known-group
validity. For example, when developing a transformational
leadership scale, give the test to known groups of supervisors and
non-supervisors and look at the mean difference: do the supervisors
score significantly higher on the test than the non-supervisors?
(This method is used less often.)


23 Face Validity
 The extent to which a test appears to be job related
 Reduces the chance of legal challenge
 Increasing face validity improves test taking motivation
 Increases faking if what the items measure is
obvious
 Face validity is not sufficient in itself


24 What type of validity when?
 If the relation between the test and the job is obvious, content validity is
enough (e.g., a work sample test)
 If not, use criterion-related validity. It is not easy to conduct, because
small sample sizes make the coefficients unstable, but it is necessary for
cognitive ability (CA) tests and personality tests
 If a criterion-related study is not feasible, use validity generalization or
synthetic validity


25 Test Information
 The objectives of the Mental Measurements
Yearbooks (2022, 21st):
 factual information on all known tests published in the
English-speaking countries of the world;
 candidly critical test reviews written for the Mental
Measurements Yearbook series by qualified
professionals in education, psychology,
speech/language/hearing, and other fields
representing a variety of viewpoints;
 unique publication of each volume in the Mental
Measurements Yearbook series, with new volumes
supplementing rather than supplanting previous series
volumes.
 Tests in Print IX

26 Cost efficiency

 Individual vs. group tests
 Unproctored internet tests can be cost-effective compared to
paper-and-pencil tests
 Computerized adaptive testing (expensive to develop)
 Objective vs. subjective scoring (objective scoring is less expensive)


27 Selection utility


28 Utility
 Can we improve the quality of selection by using tests vs
not using them?



29 Common Utility Methods
 Taylor-Russell Tables
 Proportion of Correct Decisions
 The Brogden-Cronbach-Gleser Model
 Lawshe tables



30 Utility Analysis:

1. Taylor-Russell Tables
Estimates the percentage of future employees that will be
successful
 Three components
 Validity
 Base rate % successful (successful employees ÷ total
employees)
 Selection ratio (hired ÷ total applicants)
 % improvement by selection



31 Taylor-Russell Example
 Suppose we have
 a test validity of 0.40
 a selection ratio of 0.30
 a base rate of 0.50
 Using the Taylor-Russell Tables, what percentage of future
employees would be successful? 69%, a relative improvement of
(69 – 50) ÷ 50 = 0.38 over the base rate
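The lookup in this example can be sketched as below. The dictionary holds only the r = 0.40 row of the 50% base-rate table shown in this deck; the function names are made up for the illustration.

```python
# Taylor-Russell values for base rate = 0.50 and validity r = 0.40,
# keyed by selection ratio (row copied from the table in this deck).
ROW_R40_BASE50 = {
    0.05: 0.82, 0.10: 0.78, 0.20: 0.73, 0.30: 0.69, 0.40: 0.66, 0.50: 0.63,
    0.60: 0.61, 0.70: 0.58, 0.80: 0.56, 0.90: 0.53, 0.95: 0.52,
}


def expected_success_rate(selection_ratio):
    """Proportion of hires expected to succeed when r = 0.40, base rate = 0.50."""
    return ROW_R40_BASE50[selection_ratio]


def relative_improvement(success_rate, base_rate):
    """Gain over the base rate, expressed as a proportion of the base rate."""
    return (success_rate - base_rate) / base_rate


success = expected_success_rate(0.30)
gain = relative_improvement(success, base_rate=0.50)
print(success, round(gain, 2))
```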


32 Taylor-Russell Table (base rate = 50%)
Rows: test validity (r). Columns: selection ratio.

r      0.05  0.10  0.20  0.30  0.40  0.50  0.60  0.70  0.80  0.90  0.95
0.00   0.50  0.50  0.50  0.50  0.50  0.50  0.50  0.50  0.50  0.50  0.50
0.10   0.58  0.57  0.56  0.55  0.54  0.53  0.53  0.52  0.51  0.51  0.50
0.20   0.67  0.64  0.61  0.59  0.58  0.56  0.55  0.54  0.53  0.52  0.51
0.30   0.74  0.71  0.67  0.64  0.62  0.60  0.58  0.56  0.54  0.52  0.51
0.40   0.82  0.78  0.73  0.69  0.66  0.63  0.61  0.58  0.56  0.53  0.52
0.50   0.88  0.84  0.76  0.74  0.70  0.67  0.63  0.60  0.57  0.54  0.52
0.60   0.94  0.90  0.84  0.79  0.75  0.70  0.66  0.62  0.59  0.54  0.52
0.70   0.98  0.95  0.90  0.85  0.80  0.75  0.70  0.65  0.60  0.55  0.53
0.80   1.00  0.99  0.95  0.90  0.85  0.80  0.73  0.67  0.61  0.55  0.53

33 Utility Analysis:
2. Proportion of correct decisions
 Proportion of Correct Decisions With Test
 (Correct rejections + correct acceptances) ÷
Total employees
 Baseline of Correct Decisions
 Successful employees ÷ Total employees



Determining the Proportion of Correct
Decisions (based on method explanation)

Michael G. Aamodt, Industrial/Organizational Psychology: An Applied Approach, 9th Edition. © 2023 Cengage. All Rights Reserved. May not be scanned,
copied or duplicated, or posted to a publicly accessible website, in whole or in part.
35 Proportion of Correct Decisions
 Proportion of Correct Decisions With Test
 (Quadrant II + Quadrant IV) ÷ (Quadrants I + II + III + IV)
 (10 + 11) ÷ (5 + 10 + 4 + 11)
 = 21 ÷ 30 = 0.70
 Baseline of Correct Decisions
 (Quadrants I + II) ÷ (Quadrants I + II + III + IV)
 (5 + 10) ÷ (5 + 10 + 4 + 11) = 15 ÷ 30 = 0.50
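The two proportions on this slide can be checked with a short sketch. The quadrant counts (I = 5, II = 10, III = 4, IV = 11) come from the slide; the function names are illustrative only.

```python
def proportion_correct_with_test(q1, q2, q3, q4):
    """Correct decisions (quadrants II and IV) over all employees."""
    return (q2 + q4) / (q1 + q2 + q3 + q4)


def baseline_correct(q1, q2, q3, q4):
    """Successful employees (quadrants I and II) over all employees."""
    return (q1 + q2) / (q1 + q2 + q3 + q4)


quadrants = (5, 10, 4, 11)  # counts for quadrants I, II, III, IV
with_test = proportion_correct_with_test(*quadrants)
baseline = baseline_correct(*quadrants)
print(with_test, baseline, with_test > baseline)
```

The test is useful when the proportion of correct decisions with the test exceeds the baseline, as it does here (0.70 vs. 0.50).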


36 Utility Analysis: Lawshe Tables
 Lawshe tables: will a particular applicant with a given test
score be successful? They estimate the probability of success for
an applicant with a certain test score
 Information needed: the base rate, the validity, and the
applicant's test score expressed as a percentile band within the
group (did the person score in the top 20%, the next 20%, the
middle 20%, the next lowest 20%, or the bottom 20%?)
 Example: an applicant scored the 5th highest among 15 people,
which places the person in the top 33% (5 ÷ 15 = 0.33). That means
the person's score falls in the second band (the next 20%) of the
following table


37 (Lawshe table; figure not included in this text version)


38 Brogden-Cronbach-Gleser Utility Formula

Gives an estimate of utility by estimating the amount of
money an organization would save if it used the test to
select employees.
Savings = (n) (t) (r) (SDy) (m) – cost of testing
 n = number of employees hired per year
 t = average tenure
 r = test validity
 SDy = standard deviation of performance in monetary terms, usually
taken as 40% of the yearly salary for the position
 m = mean standardized predictor score (z score) of the selected
applicants


39 Selection Ratio (SR) and Mean Standardized Predictor Score (m)
SR     m
1.00   0.00
0.90   0.20
0.80   0.35
0.70   0.50
0.60   0.64
0.50   0.80
0.40   0.97
0.30   1.17
0.20   1.40
0.10   1.76
0.05   2.08

40 Example 1
 Suppose:
 we hire 10 employees per year
 the average person in this position stays 2 years
 the validity coefficient is 0.40
 the average annual salary for the position is 80,000, so SDy = 0.40 × 80,000 = 32,000
 we have 50 applicants for 10 openings
 the cost of testing is 1,000 per applicant
 m = 1.40, taken from the table above (selection ratio = 10 ÷ 50 = 0.20)
 Our utility would be:
 (10 × 2 × 0.40 × 32,000 × 1.40) – (50 × 1,000) = ?


41 Example 2
 Suppose we will hire 250 people
 The average person in this position stays 4 years
 The validity coefficient is 0.30
 The average annual salary for the position is 70,000, so SDy = 0.40 × 70,000 = 28,000
 We have 500 applicants for 250 openings (selection ratio = 0.50, so m = 0.80 from the table on slide 39)
 The cost of testing is 450 per applicant
 Our utility will be (250 × 4 × 0.30 × 28,000 × 0.80) – (500 × 450)
= 6,720,000 – 225,000 = 6,495,000 (six million four hundred ninety-five thousand)
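The Brogden-Cronbach-Gleser formula and the two examples can be worked through with a small function. This is a sketch under the slides' own assumptions (SDy taken as 40% of annual salary, m read from the selection-ratio table); the function and parameter names are made up for the illustration.

```python
def bcg_savings(n_hired, tenure_years, validity, sd_y, mean_z,
                n_applicants, cost_per_applicant):
    """Savings = n * t * r * SDy * m - cost of testing all applicants."""
    gain = n_hired * tenure_years * validity * sd_y * mean_z
    return gain - n_applicants * cost_per_applicant


# Example 1: 10 hires, 2 years tenure, r = 0.40, salary 80,000,
# 50 applicants (SR = 0.20, so m = 1.40), testing cost 1,000 each.
example_1 = bcg_savings(10, 2, 0.40, 0.40 * 80_000, 1.40, 50, 1_000)

# Example 2: 250 hires, 4 years tenure, r = 0.30, salary 70,000,
# 500 applicants (SR = 0.50, so m = 0.80), testing cost 450 each.
example_2 = bcg_savings(250, 4, 0.30, 0.40 * 70_000, 0.80, 500, 450)

print(round(example_1))  # 308400
print(round(example_2))  # 6495000
```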


42 Test fairness


43 Definitions
 Technical aspects of the test
 A test is biased if there are group differences in test
scores (e.g., by race or gender) that are unrelated to the
construct being measured (e.g., personality). When one group's
test scores are higher, the result is adverse impact: that
group is hired at a higher rate than the other groups


44 Adverse Impact

                        Male   Female
 Number of applicants   50     30
 Number hired           20     6
 Selection ratio        0.40   0.20

0.20 ÷ 0.40 = 0.50 < 0.80 (adverse impact)


45 Adverse impact (continued)

                        Male   Female
 Number of applicants   50     30
 Number hired           20     10
 Selection ratio        0.40   0.33

0.33 ÷ 0.40 = 0.83 > 0.80 (no adverse impact)
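The comparison used in the two adverse impact slides can be sketched as follows: compute each group's selection ratio, divide the lower ratio by the higher one, and compare the result with 0.80. The function names are illustrative; the counts are those from the slides.

```python
def impact_ratio(hired_a, applicants_a, hired_b, applicants_b):
    """Lower group selection ratio divided by the higher one."""
    rate_a = hired_a / applicants_a
    rate_b = hired_b / applicants_b
    return min(rate_a, rate_b) / max(rate_a, rate_b)


def has_adverse_impact(ratio, threshold=0.80):
    """True when the impact ratio falls below the 0.80 threshold."""
    return ratio < threshold


first_case = impact_ratio(20, 50, 6, 30)    # males 20/50, females 6/30
second_case = impact_ratio(20, 50, 10, 30)  # males 20/50, females 10/30

print(round(first_case, 2), has_adverse_impact(first_case))
print(round(second_case, 2), has_adverse_impact(second_case))
```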


46 Definitions
 Predictive Bias
 A test is fair if people with an equal probability of success on a job have an
equal chance of being hired
 Single-group validity and differential validity (rare, but a regression
equation can be used to benefit the protected group). What to do?
 Do not use the test
 Use the test with separate regression equations (may not be possible)
 Other aspects of fairness: applicants’ perceptions of fairness may be
affected by the difficulty of the test, the amount of time allowed to
complete the test, the face validity of the test items, the manner in
which hiring decisions are made from the test, policies about retaking
the test, and the way in which requests for testing accommodations
for disabilities were handled.


47 Making the hiring decision


48 Top down selection
 Test the candidates
 List the scores of the candidates from highest to lowest
 Select the number needed (say, 10 people to fill the jobs)
 If there is more than one test, either use a regression
equation to predict each applicant’s performance and rank the
predicted scores from the top, or combine the test scores and
rank the combined scores, then select from the list (this raises
the question of differential vs. equal weighting)
 This approach is also called compensatory approach?


49 The Top-Down selection
 A “performance first” hiring formula

Applicant   Gender   Test score (or predicted test score)
Ali         M        99
Ercan       M        98
Özcan       M        91
Ömer        M        90
Melis       F        88
Mehmet      M        87
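Top-down selection on the applicant list above amounts to sorting by score and taking as many applicants as there are openings. In this sketch the number of openings (3) is an assumption made for the illustration; the names and scores are those from the slide.

```python
# (name, gender, test score) for each applicant on the slide.
applicants = [
    ("Melis", "F", 88),
    ("Ali", "M", 99),
    ("Mehmet", "M", 87),
    ("Özcan", "M", 91),
    ("Ercan", "M", 98),
    ("Ömer", "M", 90),
]


def top_down_select(applicants, n_openings):
    """Rank applicants from highest to lowest score and take the top n."""
    ranked = sorted(applicants, key=lambda a: a[2], reverse=True)
    return ranked[:n_openings]


selected = top_down_select(applicants, n_openings=3)  # 3 openings is hypothetical
print([name for name, _, _ in selected])
```

With three openings every selected applicant is male, which illustrates how a strict "performance first" rule can reduce workforce diversity.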
50 Top-Down Selection
Advantages
 Higher quality of selected applicants
 Objective decision making
Disadvantages
 Less flexibility in decision making
 Adverse impact = less workforce diversity
 Ignores measurement error
 Assumes test score accounts for all the variance in
performance (Zedeck, Cascio, Goldstein & Outtz, 1996).



51 The Passing Scores Approach
 Who will perform at an acceptable level?
 A passing score is a point in a distribution of scores that
distinguishes acceptable from unacceptable performance
(Kane, 1994).

 Uniform Guidelines (1978), Section 5H:
Passing scores should be reasonable and consistent with
expectations of acceptable proficiency (the method of
establishing a passing score should be based on valid
procedures)


52 Passing Scores

Applicant   Gender   Score
1           M        98
2           M        80
3           F        70 (passing score, or cutoff)
4           M        69
5           F        58
6           M        40
53 Passing Scores (cutoff point)
 Advantages
 Increased flexibility in decision making
 Less adverse impact against protected groups
 Disadvantages
 Lowered utility
 Can be difficult to set
 It is not easy to establish a cutoff score; the best-known
methods are the Angoff and Nedelsky methods
 Or we may use expectancy charts or tables


54 Expectancy Tables

 First obtain performance scores and test scores from past
employees
 Divide the test scores into intervals
 For each interval, establish the probability of success on the criterion
 Find the lowest interval in which the probability of success on the
criterion is acceptable. The lower end of that interval is taken as the
cutoff or passing score


55 Expectancy Table

Test score on the predictor(s)   Probability of success on the criterion
522 – 574                        85%
483 – 521                        75%
419 – 482                        66%
0 – 418                          56%
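Reading a cutoff score from this expectancy table can be sketched as below. The score bands and probabilities come from the table; the acceptable probability of success (0.75 here) is an assumption that each organization would set for itself.

```python
# (lowest score, highest score, probability of success) per band.
EXPECTANCY_TABLE = [
    (522, 574, 0.85),
    (483, 521, 0.75),
    (419, 482, 0.66),
    (0, 418, 0.56),
]


def probability_of_success(score):
    """Look up the band that contains the score."""
    for low, high, probability in EXPECTANCY_TABLE:
        if low <= score <= high:
            return probability
    return None  # score outside the table


def cutoff_score(acceptable_probability):
    """Lowest band boundary whose success probability is still acceptable."""
    qualifying = [low for low, _, p in EXPECTANCY_TABLE if p >= acceptable_probability]
    return min(qualifying) if qualifying else None


print(probability_of_success(500))
print(cutoff_score(0.75))
```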
56 Multiple cutoff

 All applicants take all the tests at the same time
 Determine the cutoff value (passing score) for each test
 Then, for applicants who pass every cutoff, combine the scores and
select from the list, or use a regression equation to predict future
performance. Used when a certain level of proficiency is required
for a job, as with pilots.


57 Multiple Hurdle

 Give one test at a time to a large number of individuals. Those who pass
the first hurdle go on to the second hurdle, and so on. This takes time
 A cutoff or passing score must be established for each hurdle. Each
hurdle reduces the number of people who continue in the testing
program, so at the end you have a handful of people to select from
 Start with the least expensive instrument and end with the most expensive
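The multiple-hurdle procedure can be sketched as a loop that applies one test at a time, cheapest first, and only tests the applicants who passed the previous hurdle. All test names, costs, cutoffs, and scores below are hypothetical.

```python
def multiple_hurdle(candidates, hurdles):
    """Apply hurdles from cheapest to most expensive.

    hurdles: list of (test_name, cost_per_person, cutoff) tuples.
    Returns the surviving candidates and the total testing cost.
    """
    remaining = list(candidates)
    total_cost = 0
    for test_name, cost, cutoff in sorted(hurdles, key=lambda h: h[1]):
        total_cost += cost * len(remaining)  # only remaining people are tested
        remaining = [c for c in remaining if c["scores"][test_name] >= cutoff]
    return remaining, total_cost


# Hypothetical applicants; later scores exist only for those who got that far.
candidates = [
    {"name": "A", "scores": {"application": 80, "ability": 70, "interview": 60}},
    {"name": "B", "scores": {"application": 50}},
    {"name": "C", "scores": {"application": 75, "ability": 90, "interview": 85}},
    {"name": "D", "scores": {"application": 65, "ability": 60}},
]
hurdles = [("interview", 200, 65), ("application", 5, 60), ("ability", 40, 65)]

survivors, cost = multiple_hurdle(candidates, hurdles)
print([c["name"] for c in survivors], cost)
```

Ordering the hurdles by cost means the expensive interview is given to only two of the four applicants, which is the main saving of this approach compared with testing everyone on everything.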


58 The End, and Questions for You

 What do you think about the technical aspects of the tests as a topic?
 What is your opinion of the decision making in selection?
 Do you think that utility is important?
