PSY 311 Week 2
INTRODUCTION
Psychological variables or characteristics are best measured using
psychometric tools referred to as tests or psychological tests. The term
‘test’ is broad and can include any tool or device used to measure
psychological attributes such as intelligence, aptitude, personality,
interests, and other variables of interest to a psychologist.
Types of Tests
There are different types of tests, which include:
1) Psychological Tests and Inventories:
As data-gathering devices, psychological tests are among the most
useful tools of educational research, for they provide the data for most
experimental and descriptive studies in education. In school surveys for
the past several decades, achievement tests have been used extensively
in the appraisal of instruction. Because tests yield quantitative
descriptions or measures, they make possible more precise analysis than
can be achieved through subjective judgment alone. There are many
ways of classifying psychological tests as seen earlier. One distinction
is made between performance tests and paper-and-pencil tests.
Performance tests, usually administered individually, require that
subjects manipulate objects or mechanical apparatus while their actions
are observed and recorded by the examiner. Paper-and-pencil tests,
usually administered in groups, require the subjects to mark their
responses on a prepared sheet.
Two other classes of tests are power tests and timed (speed) tests.
Power tests have no time limit, and the subjects attempt progressively
more difficult tasks until they are unable to continue successfully.
Timed or speed tests usually involve the element of power, but in
addition, they limit the time the subjects have to complete certain tasks.
Another distinction is that made between non-standardized,
teacher-made tests and standardized tests. The test that the classroom teacher
constructs is likely to be less expertly designed than that of the
professional, although it is based upon the best logic and skill that the
teacher can command and is usually “tailor-made” for a particular
group of pupils.
Which type of test is used depends on the test’s intended purpose. The
standardized test is designed for general use. The items and the total
scores have been carefully analyzed, and validity and reliability have
been established by careful statistical controls. Norms have been
established based upon the performance of many subjects of various
ages living in many different types of communities and geographic
areas. Not only has the content of the test been standardized, but the
administration and scoring have been set in one pattern so that those
subsequently taking the tests will take them under like conditions. As
far as possible, the interpretation has also been standardized.
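To illustrate how norms are used, the sketch below converts a raw score into a z-score and a percentile rank relative to a norm group. It is only an illustration: the norm scores are made-up numbers, the function names are hypothetical, and a real standardized test would rely on its published norm tables rather than on a small list like this one.

```python
from statistics import mean, pstdev


def z_score(raw, norm_scores):
    """Distance of a raw score from the norm-group mean, in standard deviations."""
    # Assumption: the norm group is treated as the whole reference population,
    # so the population standard deviation (pstdev) is used.
    return (raw - mean(norm_scores)) / pstdev(norm_scores)


def percentile_rank(raw, norm_scores):
    """Percentage of the norm group scoring below the raw score,
    counting half of those who obtained exactly the same score."""
    below = sum(1 for s in norm_scores if s < raw)
    equal = sum(1 for s in norm_scores if s == raw)
    return 100 * (below + 0.5 * equal) / len(norm_scores)


# Hypothetical norm group (invented scores, for illustration only).
norm_group = [40, 45, 50, 55, 60]

print(round(z_score(60, norm_group), 2))
print(percentile_rank(55, norm_group))
```

Expressing a score relative to the norm group is what allows the same raw score to be interpreted in the same way wherever the test is given.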
instruments to measure interest.
Interest blanks or inventories are examples of self-report instruments in
which individuals note their own likes and dislikes. These self-report
instruments are really standardized interviews in which the subjects,
through introspection, indicate feelings that may be interpreted in terms
of what is known about interest patterns.
Eysenck Personality Inventory (EPI) all share something in common
with the other items of this scale (Dooley, 2004).
An alternative to factor analysis, the empirical criterion approach,
selects test items according to their ability to discriminate previously
identified groups. For example, the developers of the Minnesota
Multiphasic Personality Inventory (MMPI) used the empirical criterion
approach.
familiar examples.
2. Completion. The respondent is asked to complete an incomplete
sentence or task. A sentence-completion instrument may include
such items as:
My greatest ambition is ...
My greatest fear is ...
I most enjoy ...
I dream a great deal about ...
I get very angry when ...
If I could do anything I wanted it would be to ...
3. Role-playing. Subjects are asked to improvise or act out a situation
in which they have been assigned various roles. The researcher
may observe such traits as hostility, frustration, dominance,
sympathy, insecurity, and prejudice, or the absence of such traits.
4. Creative or Constructive. Permitting subjects to model clay, finger
paint, play with dolls, play with toys, or draw or write imaginative
stories about assigned situations may be revealing. The choice of
colour, form, words, the sense of orderliness, evidence of tensions,
and other reactions may provide opportunities to infer deep-seated
feelings. Just like good tests, good inventories need to have a high
degree of both validity and reliability.
Note
Kuder-Richardson formula. This formula is a statistical procedure whose result is
equivalent to the average of all possible split-half correlations (Cronbach, 1951).
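As a rough illustration of the note above, the following sketch computes the Kuder-Richardson formula 20 (KR-20) for items scored 0 (wrong) or 1 (right). The response data are invented and the function name is hypothetical; the sketch assumes that every examinee answered every item and that the total scores are not all identical.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 reliability for dichotomous (0/1) items.

    responses: one row per examinee, one 0/1 entry per item.
    """
    n_people = len(responses)
    n_items = len(responses[0])

    # Variance of the examinees' total scores (population form).
    totals = [sum(row) for row in responses]
    total_var = pvariance(totals)
    if total_var == 0:
        raise ValueError("total scores do not vary; KR-20 is undefined")

    # Sum of p*q over items, where p is the proportion passing the item.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / total_var)


# Hypothetical data: 5 examinees, 4 items (invented for illustration).
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(data), 2))
```

Values closer to 1 indicate that the items hang together more consistently; what counts as acceptable depends on the purpose of the test.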
5). Economy
Tests that can be given in a short period of time are likely to gain the
cooperation of the subject and to conserve the time of all those
involved in test administration. The matter of expense of administering
a test is often a significant factor if the testing program is being
operated on a limited budget.
Ease of administration, scoring, and interpretation is an important
factor in selecting a test, particularly when expert personnel or an
adequate budget are not available. Many good tests are easily and
effectively administered, scored, and interpreted by the classroom
teacher, who may not be an expert.
6). Interest
When psychological tests are used in educational research, one should
remember that standardized test scores are only approximate measures
of the trait under consideration. This limitation is inevitable and may
be ascribed to a number of possible factors:
i. Errors inherent in any psychological test – no test is completely
valid or reliable.
ii. Errors that result from poor test conditions, inexpert or careless
administration or scoring of the test, or faulty tabulation of test
scores.
iii. Inexpert interpretation of test results.
iv. The choice of an inappropriate test for the specific purpose in mind.
can save a large amount of test development time. Moreover, using
existing measures improves the comparison of different studies. When
studies use the same measure, differences in their outcomes can be
traced to design and sample differences rather than to measurement
differences.
Definition of ‘test’:
A test is a systematic procedure for measuring a sample
of behaviour (a psychological variable). ‘Systematic
procedure’ indicates that a test is constructed,
administered, and scored (or marked) according to
prescribed rules, which must be followed to the letter.
Test items are systematically chosen to fit the test
specifications, the same or equivalent items are
administered to all persons (examinees) and the
directions and time limits are the same for all persons
taking the test. The use of predetermined rules [or a
marking scheme] for evaluating (scoring) responses
assures agreement between different persons who might
score (mark) the test; in other words, consistency
(reliability) is ensured.
Using standard procedures ensures comparability among
the examinees and uniformity in all aspects of testing.
The test should not unfairly favour any individual or
group of individuals. A test should not have any kind of
bias.
A second important term in the definition is behaviour.
In the strictest sense, a test measures only test-taking
behaviour, that is, the responses a person (examinee)
makes to the test items. Here we are talking about
psychological variables which, as we know, cannot be
measured directly; rather, we infer the characteristic
(trait) from the examinee’s responses to the given test
items. We have to measure their manifestations since
they are not tangible.
If the behaviour exhibited (manifested) on the test
adequately mirrors the construct (trait) being measured,
the test will provide useful information. Here we are
talking about the validity of the test, i.e. whether the
test measures what it is supposed to measure. If the test
does not adequately reflect the underlying characteristic,
inferences made from test scores will be in error, which
is why validity is important.
A test contains only a sample of all possible items. No
test is so comprehensive that it includes every possible
item that might be developed to measure the behaviour
domain [or population or universe]; e.g. a driver’s test
will not examine how well you drive at night, on a
slippery wet road, or in heavy rain. Thus any particular
test is better thought of as a sample of all possible
items.
Because a test contains only a sample of all possible items, two
problems arise.
1. We must ensure that the questions or items
included in the test are a representative sample
of all possible questions or items. [Validity]
2. Would an examinee get the same score if he were
given a different set of sampled items from the
same domain? [Reliability].
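The second question can be explored, in a rough way, with a split-half estimate: the items are divided into two halves (here odd-numbered versus even-numbered items), each examinee's scores on the two halves are correlated, and the Spearman-Brown formula steps the result up to the full test length. The sketch below is only one possible approach under stated assumptions; the data are invented, the function names are hypothetical, and both half-test scores must vary across examinees.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_reliability(responses):
    """Odd-even split-half reliability with the Spearman-Brown correction.

    responses: one row per examinee, one numeric score per item.
    """
    odd_half = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_half, even_half)
    # Spearman-Brown: estimate reliability of the full-length test.
    return 2 * r_half / (1 + r_half)


# Hypothetical data: 5 examinees, 4 items scored 0/1 (invented).
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(split_half_reliability(data), 2))
```

A high value suggests that examinees would keep roughly the same standing on a different sample of items from the same domain.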
A test is a measuring instrument. Thus we need to define measurement.
Measurement is the assignment of numbers to individuals in a
systematic way as a means of representing the properties
of the individuals, such that those with more of the
property being measured score higher and those with
less score lower.
Hence we seem to have very little interest in those who
are rejected (or left out). Socioeconomic status, poor
health, poor facilities and background, or other adverse
factors may contribute to a person being left out. Many
such factors are assumed to be uniform for all. In other
words, the assumption is that nobody is favoured, yet we
know this is not true.
2. In Placement:
There are several individuals and several alternative
courses of action. For instance, in universities there are
several departments, and each has its own requirements.
In general, each person is to be assigned to a program
using certain criteria.
3. Diagnosis:
It involves comparing an individual’s performance in
several areas in order to determine relative strengths and
weaknesses. Generally, diagnostic procedures are
instituted when an individual is having difficulty in some
area. Once the areas of disability are identified, a
program of remediation can be undertaken. For
example, if a child has a problem in reading or in doing
word problems in mathematics, you may give a test
covering phonics, word meaning (vocabulary),
sentence meaning, paragraph meaning and reading rate,
so as to identify which particular weaknesses or strengths
of the child need appropriate action.
4. Hypothesis Testing:
In psychological research, tests are often used for
hypothesis testing. And what is a hypothesis? This is
dealt with here briefly (for details see the appropriate
section on this). In brief, a hypothesis is a speculative
statement, or an educated guess, which you wish to
accept or reject on the basis of evidence. For instance,
we can manipulate our subjects in a certain way (perhaps
varying the degree of manipulation) and then give a test
to find out the effect of the manipulation. This is a type
of experimental study or design. In a correlational study
(design) we have cases of natural manipulation: we may
look at performance at a certain time or under certain
conditions. We study what has taken place and then we
make inferences. Using various methods, such as keeping
other variables constant or eliminating them analytically,
we are able to study the effect of the variable
manipulated.
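As a sketch of how test scores feed into hypothesis testing, the example below compares the mean test score of a group that received the manipulation with that of a control group, using Welch's t statistic. This is only one of several possible statistics and is chosen here purely for illustration; the scores are invented and the function name is hypothetical. Deciding whether to accept or reject the hypothesis would still require comparing the statistic with a critical value from a t table.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for the difference between two group means."""
    # Sample variances (n - 1 in the denominator) for each group.
    var_a, var_b = variance(group_a), variance(group_b)
    # Standard error of the difference between the two means.
    se = sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


# Hypothetical test scores (invented for illustration).
treated = [14, 15, 16, 17, 18]   # subjects who received the manipulation
control = [10, 11, 12, 13, 14]   # subjects who did not

print(round(welch_t(treated, control), 2))
```

The larger the statistic in absolute value, the less plausible it is that the two groups differ only by chance.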
Tests can also be used for hypothesis building. We may
find a difference in performance between the 1980s and
the 1990s and then go on to hypothesize what the reason
could be: perhaps a drop in socioeconomic status, the
8-4-4 educational system, or a combination of these and
others.
Psychologists or educators (even lay-people) use tests to
make a lot of deductions or build hypotheses. For
instance, Muthoni got a very good division I in the
O-level examination, but failed to obtain university
entrance after A-level. Why? Perhaps Muthoni went on to
do science at A-level because of parental pressure: her
father wanted her to be a doctor, but she did not have
much interest in the sciences (Biology and the like), and
she could have done very well had she taken Arts
(Humanities). Or Muthoni may have lost her father just
before the exams, and the trauma may have contributed
to her poor performance in the A-level exams.
5. Another use of tests is in Evaluation:
Formative evaluation and summative evaluation:
A teacher can use tests to find not only the weak students
but also his or her own weaknesses, topics not well
understood, etc. Thus classroom examinations and tests
are usually used to evaluate the instructional method or
the teacher.
All of these uses involve some decision. In selection, the
decision is whether to accept or reject an applicant. In
placement, it is where the candidate fits best in terms of
ability and skills, while in diagnosis it is which remedial
treatment to use after finding out the weakness.
In hypothesis testing, usually using statistics, you need to
decide whether to accept or reject the hypothesis. In
evaluation, it is what grade to give a student or how
effective the procedure is. Effectiveness is the concern
of summative evaluation, i.e. evaluation done at the end,
while evaluation done at the beginning of or during
instruction (e.g. to check entry behaviour) is formative
evaluation.
We know how seriously we take tests. We belong to a
culture which overrates exams. You get a lot of respect
if you are an A student, a division I or first-class
holder, or a Ph.D. scholar. If you do badly on your tests,
you seldom get a chance to say why you obtained a low
score. Research on tests shows that ‘ability’ is important
in doing well in a test, but accounts for less than 50% of
performance. Other factors also count, such as the
difficulty of the items, the quality of instructions,
personality variables, socioeconomic status, linguistic
variables, etc.
Nwana (1981) gives the following variety of functions of tests.
These are:
This is why tests are regularly used to motivate pupils to learn. They
study hard towards their weekly, terminal or end of the year
promotion examinations.
One of the functions of tests can be to find out the extent to which
the contents have been covered or mastered by the testees. For
instance, if you treat a topic in your class and give a test at the
end, and many of your students score high marks, this is an indication
that they have understood the topic very well. But if they score very
low marks, it implies that your efforts have not succeeded and you need
to do more teaching. It is the results of the test that will help you
decide whether to move to the next topic or repeat the current
topic.
and bring about purposeful and desirable changes in the students
entrusted to him.
There are goals and objectives set for the schools. Every school is
expected to achieve the goals and objectives through the
instructional programmes. The results of tests given to students are
used to evaluate how well the instructional programmes have
helped in the achievement of the goals and objectives.