
Industrial/Organizational Psychology: An Applied Approach

Chapter 6: Evaluating Selection Techniques and Decisions

Presented to:
Prof. Lloyd Sajol, MPsy

Presented by:
Mariel Bajo
Erica May Cambarihan
Mark Joseph Cayabyab
Jessa Jane Fiel
Ayana Megan Florido
Jevan Carl Montales
Stacey Nanie Plazos
Princess Joy Sumalinog
Rica Tesoro

Sem 2 / A.Y. 2022-2023


CHAPTER 6: EVALUATING SELECTION TECHNIQUES AND DECISIONS

6.1 Characteristics of Effective Selection Techniques

The five characteristics of effective selection techniques:
● Reliable
● Valid
● Cost-efficient
● Fair
● Legally defensible

6.1.1 Reliability
Reliability: The extent to which a score from a test or an evaluation is consistent and free from error. If a score from a measure is not stable or error-free, it is not useful; reliability is therefore an essential characteristic of an effective measure.
Types of reliability:
● Test-retest reliability
● Alternate-forms reliability
● Internal reliability
● Scorer reliability
Test-Retest Reliability: The extent to which repeated administrations of the same test achieve similar results. Each of several people takes the same test twice; the scores from the first administration are correlated with the scores from the second to determine whether they are similar. If they are, the test is said to have temporal stability. Typical time intervals between test administrations range from three days to three months; usually, the longer the interval, the lower the reliability coefficient.
Temporal Stability: The consistency of test scores across time; temporally stable scores are not highly susceptible to random daily conditions such as illness, fatigue, stress, or uncomfortable testing conditions.
Alternate-Forms Reliability: The extent to which two forms of the same test yield similar scores. Half of the sample takes one form first and the other half takes the other form first; this counterbalancing of test-taking order is designed to eliminate any effect that taking one form of the test first may have on scores on the second form. The scores on the two forms are then correlated to determine whether they are similar; if they are, the test is said to have form stability.
Counterbalancing: A method of controlling for order effects by giving half of a sample Test A first, followed by Test B, and giving the other half Test B first, followed by Test A.
Form Stability: The extent to which the scores on two forms of a test are similar.
Internal Reliability (Internal Consistency): The consistency with which an applicant responds to items measuring a similar dimension or construct. The extent to which similar items are answered in similar ways is referred to as internal consistency, and it reflects item stability. Another factor that can affect internal reliability is item homogeneity.
Methods of determining internal consistency:
● Split-half method
● Coefficient alpha
● Kuder-Richardson formula 20 (K-R 20)
Item Stability: The extent to which responses to the same test items are consistent.
Item Homogeneity: The extent to which test items measure the same construct. That is, do all of the items measure the same thing, or do they measure different constructs? The more homogeneous the items, the higher the internal consistency.
Kuder-Richardson Formula 20 (K-R 20): A statistic used to determine the internal reliability of tests that use items with dichotomous answers (yes/no, true/false).
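For reference, the standard form of K-R 20 (a textbook-standard formula, not quoted from this chapter) is:

$$r_{KR20} = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} p_i q_i}{\sigma_X^2}\right)$$

where $k$ is the number of items, $p_i$ is the proportion of test takers answering item $i$ correctly, $q_i = 1 - p_i$, and $\sigma_X^2$ is the variance of total test scores.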
Split-Half Method: A form of internal reliability in which the consistency of item responses is determined by comparing scores on half of the items with scores on the other half of the items.
Spearman-Brown Prophecy Formula: Used to adjust correlations; specifically, it corrects reliability coefficients resulting from the split-half method, which understate reliability because splitting a test in half effectively halves its length.
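The correction itself is a standard one-line formula: if $r_{half}$ is the correlation between the two halves of the test, the corrected reliability is

$$r_{corrected} = \frac{2r_{half}}{1 + r_{half}}$$

For example, a split-half correlation of .60 corrects to (2 × .60) / (1 + .60) = .75.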
Coefficient Alpha: A statistic used to determine the internal reliability of tests that use items with interval or ratio scales.
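As an illustration only (the response matrix is hypothetical and the function names are ours), a minimal Python sketch of how the internal-consistency statistics above can be computed from an item-response matrix:

```python
import numpy as np

def spearman_brown(r_half):
    """Correct a split-half correlation for the halved test length."""
    return 2 * r_half / (1 + r_half)

def split_half_reliability(items):
    """Correlate odd-item totals with even-item totals, then apply the
    Spearman-Brown correction."""
    odd = items[:, 0::2].sum(axis=1)
    even = items[:, 1::2].sum(axis=1)
    r_half = np.corrcoef(odd, even)[0, 1]
    return spearman_brown(r_half)

def coefficient_alpha(items):
    """Coefficient alpha for item scores; with dichotomous (0/1) items
    this is essentially K-R 20."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical data: rows are test takers, columns are dichotomous items.
responses = np.array([
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
])

print("Split-half (corrected):", round(split_half_reliability(responses), 3))
print("Coefficient alpha:", round(coefficient_alpha(responses), 3))
```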
Scorer Reliability: The extent to which two people scoring a test agree on the test score, or the extent to which a test is scored correctly.
6.1.2 Validity
Validity: The degree to which inferences from scores on tests or assessments are justified by the evidence. As with reliability, a test must be valid to be useful, but just because a test is reliable does not mean it is valid. The potential validity of a test is limited by its reliability; thus, if a test has poor reliability, it cannot have high validity.
Five common strategies to investigate the validity of scores on a test:
● Content validity
● Criterion validity
● Construct validity
● Face validity
● Known-group validity
Content Validity: The extent to which test items sample the content that they are supposed to measure.
Criterion Validity: The extent to which a test score is related to some measure of job performance.
Criterion: A measure of job performance, such as attendance, productivity, or a supervisor rating.
Two forms of criterion validity:
● Concurrent validity
● Predictive validity
Concurrent Validity: A form of criterion validity that correlates test scores with measures of job performance for employees currently working for an organization.
Predictive Validity: A form of criterion validity in which test scores of applicants are compared at a later date with a measure of job performance.
Restricted Range: A narrow range of performance scores that makes it difficult to obtain a significant validity coefficient.
Validity Generalization: The extent to which inferences from test scores obtained in one organization can be applied to another organization; it tries to generalize the results of studies conducted on a particular job to the same job at another organization.
Synthetic Validity: A form of validity generalization in which validity is inferred on the basis of a match between job components and tests previously found valid for those job components; it tries to generalize the results of studies of different jobs to a job that shares a common component.
Construct Validity: The extent to which a test actually measures the construct that it purports to measure; it is concerned with inferences about test scores. One method of measuring construct validity is known-group validity.
Known-Group Validity: A form of validity in which test scores from two contrasting groups "known" to differ on a construct are compared; a test is given to two groups of people who are "known" to be different on the trait in question.
Face Validity: The extent to which a test appears to be valid or job related.
Barnum Statements: Statements, such as those used in astrological forecasts, that are so general that they can be true of almost anyone.
Mental Measurements Yearbook (MMY): A book containing information about the reliability and validity of various psychological tests.
6.1.3 Cost-Efficiency
Cost-Efficiency: If two or more tests have similar validities, then cost should be considered.
Computer Adaptive Testing: A type of test taken on a computer in which the computer adapts the difficulty level of the questions asked to the test taker's success in answering previous questions.
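A minimal sketch of the adapt-the-difficulty idea (illustrative only; real computer adaptive tests estimate ability with item response theory, and the names below are ours):

```python
def run_adaptive_test(item_bank, answer_correctly, n_items=10):
    """Toy adaptive test. item_bank maps a difficulty level to a question;
    answer_correctly(question) returns True if the test taker gets it right."""
    ability, step = 0.0, 1.0
    for _ in range(min(n_items, len(item_bank))):
        # Present the remaining item whose difficulty is closest to the
        # current ability estimate.
        difficulty = min(item_bank, key=lambda d: abs(d - ability))
        question = item_bank.pop(difficulty)
        if answer_correctly(question):
            ability += step   # correct answer: move to harder items
        else:
            ability -= step   # incorrect answer: move to easier items
        step *= 0.8           # shrink adjustments as the estimate settles
    return ability
```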
6.2 Establishing the Usefulness of a Selection Device
6.2.1 Taylor-Russell Tables
To determine how useful a test would be in any given situation, several formulas and tables have been designed, each providing slightly different information to an employer:
● Taylor-Russell tables
● Lawshe tables
● Utility formula
Taylor-Russell Tables: A series of tables based on the selection ratio, base rate, and test validity that yield information about the percentage of future employees who will be successful if a particular test is used.
Selection Ratio: The percentage of applicants an organization hires. It is determined by the following formula:

selection ratio = number of applicants hired ÷ total number of applicants
Base Rate: The percentage of current employees who are considered successful.
6.2.2 Proportion of Correct Decisions
Proportion of Correct Decisions: A utility method that compares the percentage of times a selection decision was accurate with the percentage of successful employees.
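A hypothetical worked example (all counts assumed for illustration): suppose the test scores and later performance of 100 employees are split by the test's passing score and by the criterion for success, giving 40 who scored high and succeeded, 10 who scored high but failed, 35 who scored low and failed, and 15 who scored low but succeeded. The proportion of correct decisions is (40 + 35) / 100 = .75. Because this exceeds the base rate of successful employees, (40 + 15) / 100 = .55, the test improves on hiring without it.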
6.2.3 Lawshe Tables
Lawshe Tables: Tables that use the base rate, test validity, and an applicant's percentile on a test to determine the probability of future success for that applicant; designed to determine the overall impact of a testing procedure.
6.2.4 Brogden-Cronbach-Gleser Utility Formula
Brogden-Cronbach-Gleser Utility Formula: Another way to determine the value of a test in a given situation is to compute the amount of money an organization would save if it used the test to select employees; this is a method of ascertaining the extent to which an organization will benefit from the use of a particular selection system. To use the utility formula, five items of information must be known (a worked example follows the list):
1. Number of employees hired per year (n)
2. Average tenure (t)
3. Test validity (r)
4. Standard deviation of performance in dollars (SD)
5. Mean standardized predictor score of selected applicants (m)
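Putting the five items together (this is the standard form of the formula; the numbers below are hypothetical):

savings = (n)(t)(r)(SD)(m) − cost of testing

For example, an organization that hires 10 employees per year (n = 10) who stay an average of 2 years (t = 2), using a test with validity r = .40, a standard deviation of performance in dollars of $10,000 (SD), and a mean standardized predictor score of selected applicants of m = 1.0, would expect savings of 10 × 2 × .40 × 10,000 × 1.0 = $80,000, minus whatever the testing itself costs.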
Tenure: The length of time an employee has been with an organization.
6.3 Determining the Fairness of a Test
Determining the Fairness of a Test: Once a test has been determined to be reliable and valid and to have utility for an organization, the next step is to ensure that the test is fair and unbiased. Two types of bias:
● Measurement bias
● Predictive bias
6.3.1 Measurement Bias
Measurement Bias: Refers to technical aspects of a test. A test is considered to have measurement bias if there are group differences (e.g., sex, race, or age) in test scores that are unrelated to the construct being measured. Fairness can include bias, but it also includes political and social issues: a test is considered fair if people with an equal probability of success on a job have an equal chance of being hired.
Adverse Impact: An employment practice that results in members of a protected class being negatively affected at a higher rate than members of the majority class. Adverse impact is usually determined by the four-fifths rule.
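A quick worked example of the four-fifths rule: the selection rate for the protected class should be at least four-fifths (80%) of the selection rate for the group with the highest rate. If 60% of majority applicants are hired, the hiring rate for the protected class should be at least .80 × 60% = 48%; a rate below 48% would suggest adverse impact.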
6.3.2 Predictive Bias
Predictive Bias: Refers to situations in which the predicted level of job success falsely favors one group (e.g., men) over another (e.g., women). Two forms of predictive bias:
● Single-group validity
● Differential validity
Single-Group Validity: The characteristic of a test that significantly predicts a criterion for one class of people but not for another; that is, the test is valid for only one group. Separate correlations are computed between the test and the criterion for each group. If both correlations are significant, the test does not exhibit single-group validity and passes this fairness hurdle; if only one of the correlations is significant, the test is considered fair for only that one group. Single-group validity is very rare and is usually the result of small sample sizes and other methodological problems.
Differential Validity: The characteristic of a test that significantly predicts a criterion for two groups, such as both minorities and nonminorities, but predicts significantly better for one of the two groups; the test is valid for both groups but more valid for one than for the other.
Perception of Fairness: Another important aspect of test fairness is the perception of fairness held by the applicants taking the test. That is, a test may have neither measurement nor predictive bias, yet applicants might perceive the test itself, or the way in which it is administered, as unfair.
Factors that might affect applicants' perceptions of fairness:
1. Difficulty of the test
2. Amount of time allowed to complete the test
3. Face validity of the test items
4. Manner in which hiring decisions are made from the test scores
5. Policies about retaking the test
6. Way in which requests for testing accommodations for disabilities were handled
6.4 Making the Hiring Decision
Making the Hiring Decision: If more than one criterion-valid test is used, the scores on the tests must be combined. Usually this is done by a statistical procedure known as multiple regression, with each test score weighted according to how well it predicts the criterion. Selection decisions can be made in four ways:
● Unadjusted top-down selection
● Rule of three
● Passing scores
● Banding
Multiple Regression: A statistical procedure in which the scores from more than one criterion-valid test are weighted according to how well each test score predicts the criterion.
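Illustrative only (the scores, ratings, and variable names below are hypothetical): a minimal sketch of weighting two test scores by ordinary least squares on a validation sample, then using the fitted weights to score new applicants.

```python
import numpy as np

# Hypothetical validation sample: two test scores per current employee
# and a job-performance criterion (e.g., a supervisor rating).
test_scores = np.array([[72, 85], [60, 70], [88, 90],
                        [55, 65], [75, 80], [92, 95]], dtype=float)
performance = np.array([3.8, 3.0, 4.5, 2.7, 3.9, 4.8])

# Add an intercept column and fit the regression weights by least squares.
X = np.column_stack([np.ones(len(test_scores)), test_scores])
weights, *_ = np.linalg.lstsq(X, performance, rcond=None)

# Combine the test scores of new applicants using the fitted weights.
applicants = np.array([[80, 75], [65, 90]], dtype=float)
predicted = np.column_stack([np.ones(len(applicants)), applicants]) @ weights
print("Predicted performance:", predicted.round(2))
```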
6.4.1 Unadjusted Top-Down Selection
Unadjusted Top-Down Selection: Applicants are rank-ordered on the basis of their test scores; selection is then made by starting with the highest score and moving down until all openings have been filled.

Advantage: by hiring the top scorers on a valid test, an organization gains the most utility.

Disadvantage: it can result in high levels of adverse impact, and it reduces an organization's flexibility to use nontest factors such as references or organizational fit.
Compensatory Approach: The assumption that, if multiple test scores are used, a low score on one test can be compensated for by a high score on another.
6.4.2 Rule of Three

Rule of Three: A method used in the public sector in which the names of the top three scorers are given to the person making the hiring decision; it ensures that the person hired will be well qualified but provides more choice than does top-down selection.
6.4.3 Passing Score
Passing Score: A means of reducing adverse impact and increasing flexibility: an organization determines the lowest score on a test that is associated with acceptable performance on the job.
Approaches that can be used when the relationship between the selection test and performance is not linear:
● Multiple-cutoff approach
● Multiple-hurdle approach
Multiple-Cutoff Approach: The applicants are administered all of the tests at one time, and each test has its own passing score.
Multiple-Hurdle Approach: Reduces the costs associated with applicants failing one or more tests: the applicant is administered one test at a time, usually beginning with the least expensive, and must pass each test before taking the next.
6.4.4 Banding
Banding: Attempts to hire the top test scorers while still allowing some flexibility for affirmative action. Banding takes into consideration the degree of error associated with any test score: even though one applicant might score two points higher than another, the two-point difference might be the result of chance (error) rather than an actual difference in ability.
Standard Error of Measurement (SEM): The number of points by which a test score could be off due to test unreliability. It is computed as:

SEM = SD × √(1 − reliability)
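A hypothetical worked example: for a test with a standard deviation of 10 and a reliability of .91, SEM = 10 × √(1 − .91) = 10 × .30 = 3. Under banding, scores within a band whose width is based on the SEM are treated as statistically equivalent, so an applicant scoring 90 and one scoring 88 could be treated as tied.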
