Psychological Testing_Introduction
"the gathering and integration of psychology-related data for the purpose of making a psychological evaluation that is accomplished through the use of tools such as tests, interviews, case studies, behavioral observation, and specially designed apparatuses and measurement procedures" - McGraw
Difference between test & assessment
(Slide diagram: common classifications of tests)
Intelligence test / Personality test
Paper-pencil test / Performance test
Group test / Individual test
Verbal test / Non-verbal test
Speed test / Power test
Aptitude test
Interest test
Types of tests used in practice
Personality tests
Rating scales
Neuropsychological tests
Psychological tests are used for analyzing,
describing and evaluating individuals to
predict and guide their behaviour.
What are the uses of psychological tests?
Clinical setting
School setting
Corporate setting
Vocational setting
Military setting
Research setting
Diagnostic utility
Treatment planning
Characteristics of a good test
Standardisation
Objectivity
Sampling
Reliability
Validity
Standardisation
Uniformity of procedure not only in administering and scoring the test but also in interpreting the test results.
1) Establishment of norms: the test is conducted on a group of people to see which scores are typically obtained. These scores become the standard / typical scores, so a test taker can make sense of his or her score by comparing it to them. These norms help the examiner in interpreting test results.
2) The test constructor provides detailed information regarding the test: he or she describes the exact material to be used while administering the test, time limits, oral instructions, preliminary demonstrations, ways of handling queries from the examinee, and the testing conditions under which the test should be administered.
3) A scoring procedure thoroughly explained in the test manual: the test constructor provides detailed guidelines, with examples, for scoring the test. This ensures that biases on the part of the examiner that might affect the results are eliminated, and that scoring is made more objective and quantifiable.
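As a minimal sketch of how norms support interpretation, a raw score can be converted to a standard (z) score and a percentile against the norm group. The norming data below is made up for illustration, and the percentile conversion assumes the norm scores are roughly normally distributed:

```python
from statistics import NormalDist, mean, stdev

# Hypothetical raw scores from a norming sample (made-up data for illustration)
norm_sample = [42, 55, 48, 61, 50, 53, 47, 58, 44, 52]

norm_mean = mean(norm_sample)   # the "typical" score in the norm group
norm_sd = stdev(norm_sample)    # spread of scores in the norm group

def interpret(raw_score):
    """Express a test taker's raw score relative to the norms."""
    z = (raw_score - norm_mean) / norm_sd    # standard (z) score
    percentile = NormalDist().cdf(z) * 100   # assumes roughly normal norms
    return z, percentile

z, pct = interpret(60)
print(f"z = {z:.2f}, percentile = {pct:.1f}")
```

A score well above the norm mean thus lands in a high percentile, which is exactly the comparison to "standard / typical scores" described above.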
Objectivity
Objectivity is the goal of test construction and has been achieved to a considerable extent in most tests.
It is achieved by establishing uniformity in testing situations, which includes factors like time limits, instructions, materials to be used and preliminary demonstrations, as well as controlling the physical environment of testing (lighting, noise, ventilation, etc.).
"Objectivity is the degree to which equally competent scorers obtain the same results" - free of personal opinion and biased judgement.
Gronlund & Linn (1995)
Sampling
Process of selecting the portion of the universe deemed to be representative of the whole population.
Validity
The test measures what it purports to measure / the test measures what it is supposed to measure.
Types:
1. Face validity: not validity in a technical sense; face validity is the extent to which the items on a test appear to be meaningful and relevant.
2. Content validity: the extent to which the content of the test provides an adequate representation of the conceptual domain it is designed to cover. It involves the systematic examination of the test content to determine whether it covers a representative sample of the behaviour domain to be measured.
Reliability
The consistency of a test's scores; a useful test is consistent over time. Reliability can also be a measure of a test's internal consistency.
Types:
1. Test-retest reliability: the consistency of scores obtained by the same person when retested with the same test on different occasions, with different sets of equivalent items, or under variable examining conditions.
2. Split-half reliability: also called odd-even reliability, because the reliability of the test is determined by splitting it into two equal halves and then determining the coefficient of reliability by correlating the scores on the two halves.
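Both estimates come down to a correlation coefficient. A sketch with hypothetical score data: the Pearson correlation between two testing occasions gives test-retest reliability, and the correlation between two half-tests, corrected to full test length with the Spearman-Brown formula, gives split-half reliability.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores for five examinees on two occasions
first_testing  = [10, 14, 18, 22, 26]
second_testing = [11, 13, 19, 21, 27]
test_retest_r = pearson(first_testing, second_testing)

# Split-half: correlate odd-item vs even-item subscores (hypothetical),
# then correct with the Spearman-Brown formula for full test length.
odd_half  = [5, 7, 9, 11, 13]
even_half = [4, 8, 9, 10, 14]
half_r = pearson(odd_half, even_half)
split_half_r = (2 * half_r) / (1 + half_r)   # Spearman-Brown correction
```

The Spearman-Brown step matters because the half-test correlation understates the reliability of the full-length test.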
Can a test be valid and not reliable?
Effectiveness
Classical Test Theory (CTT)
Difficulty index: a measure of individual test item difficulty, calculated as the proportion of students who answered the item correctly out of the total number of students:
Difficulty level = Number of correct responses / Total number of students
The greater the number of students answering an item correctly, the lower the item's difficulty level.
Discrimination index: a measure of the effectiveness of an item in discriminating between high- and low-ability students on a test. The notion is that high-ability students will tend to choose the right answer, while low-ability students will tend to choose the wrong answer; high or low ability is based purely on performance on the test.
Discrimination index = (Number of correct responses in the upper group - Number of correct responses in the lower group) / Number of students in each group
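The discrimination-index formula, sketched with hypothetical upper- and lower-group responses on one item:

```python
# 0/1 marks on one item for the top- and bottom-scoring groups
# (hypothetical equal-sized groups, formed from overall test performance)
upper_group = [1, 1, 1, 1, 0]   # 4 of 5 high scorers answered correctly
lower_group = [1, 0, 0, 0, 0]   # 1 of 5 low scorers answered correctly

n = len(upper_group)            # number of students in each group
discrimination = (sum(upper_group) - sum(lower_group)) / n
print(discrimination)  # 0.6 -> item separates high from low scorers well
```

A value near 0 (or negative) would suggest the item does not distinguish high- from low-ability students.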
Distractor analysis: conducted to evaluate the efficiency of each distractor in a multiple-choice question. Ideally, all distractors should be equally plausible to students who do not know the right answer. It is advisable to eliminate distractors that are never chosen, as this means they are not working.
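In practice a distractor analysis amounts to counting how often each incorrect option is chosen. A sketch with hypothetical response data for one item:

```python
from collections import Counter

# Options chosen by twelve students on one item (hypothetical data);
# "B" is the keyed correct answer, the rest are distractors.
choices = ["B", "A", "B", "C", "B", "A", "B", "B", "C", "A", "B", "B"]
options = ["A", "B", "C", "D"]
key = "B"

counts = Counter(choices)
distractor_counts = {opt: counts.get(opt, 0) for opt in options if opt != key}
print(distractor_counts)  # {'A': 3, 'C': 2, 'D': 0}
```

Here "A" and "C" are both functioning, while "D" is never chosen and is a candidate for replacement.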
Assumptions
Monotonicity – the assumption that as the trait level increases, the probability
of a correct response also increases
Unidimensionality – The model assumes that there is one dominant latent trait being
measured and that this trait is the driving force for the responses observed for each item in
the measure
Local Independence – Responses given to the separate items in a test are mutually
independent given a certain level of ability.
Invariance – We are allowed to estimate the item parameters from any position on the
item response curve. Accordingly, we can estimate the parameters of an item from any
group of subjects who have answered the item.
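The monotonicity assumption can be illustrated with a logistic item response curve. The two-parameter logistic form used here is one common instantiation (not stated in the material above), and the parameter values are illustrative:

```python
from math import exp

def icc(theta, a=1.0, b=0.0):
    """Item characteristic curve (two-parameter logistic form):
    probability of a correct response at trait level theta, with
    discrimination a and difficulty b (illustrative values)."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))

# Monotonicity: the probability of a correct response rises
# as the latent trait level increases.
probs = [icc(t) for t in (-2, -1, 0, 1, 2)]
assert all(p1 < p2 for p1, p2 in zip(probs, probs[1:]))
```

At theta equal to the item difficulty b, the curve crosses a probability of 0.5; shifting b moves the curve along the trait axis without breaking monotonicity.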