Quality of a Good Instrument

Dr JIMOH Mohammed Idris

Department of Social Sciences Education


Faculty of Education
University of Ilorin, Ilorin, Nigeria
Recap of Previous Lectures
Types of Data Collection Instruments
1. Cognitive Based Instruments
Data are usually quantitative in nature (e.g., tests and examinations of academic subjects).
a) Free response format (e.g., Essay, Short answers etc)
b) Closed response format (e.g., Multiple-choice, Multiple response, True/False, Matching questions, Fill in the gaps etc)
2. Non-cognitive Based Instruments
The data obtained can be quantitative or qualitative.
a) Quantitative Data Instruments (e.g., Questionnaire, Opinionnaire, Inventories)
b) Qualitative Data Instruments (e.g., Interview, Focus Group Discussion etc)
QUALITY OF A GOOD INSTRUMENT
The quality of a good data collection instrument refers to its validity, reliability, fairness to all categories of examinees, and practicality.

In today’s lecture, emphasis is on the validity and reliability of an instrument.

Psychometric Properties of an Instrument


1. Validity of an Instrument
2. Reliability of an Instrument
Validity in an Instrument
 Validity is the extent to which an instrument measures
what it is supposed to measure or what it is designed
to perform.
 An instrument is valid when it is seen as truthful, accurate, and relevant in measuring what it is supposed to measure.
 Validity is generally measured in degrees.
 There are numerous statistical tests and measures to
assess the validity of quantitative instruments.
 It involves collecting and analyzing data to assess the
accuracy of an instrument.
Types of validity
There are a number of different ways that can be used to validate an instrument:
1. Face validity;
2. Content validity;
3. Construct validity; and
4. Criterion-related validity.
1. Face validity
 Face validity is concerned with whether an
instrument is relevant and appropriate for what it’s
assessing on the surface.

 Face validity is an informal review of an instrument by non-experts, who assess its clarity and appropriateness for the target group.

 Face validity is a subjective judgement, i.e., the test is viewed on its face as appearing to cover the concept it purports to measure.

 An instrument with face validity tends to “look like” it will measure what it is supposed to measure.
Shortcomings of Face Validity
 Face validity does not guarantee overall good measurement.

 It is a weak form of validity because the assessment takes place on the surface, without any in-depth examination of the instrument.

 Face validity is likely to be biased in the assessment of the instrument.
2. Content validity
 Content validity is the degree to which an instrument covers all aspects of the content it is meant to assess, judged by examining the statements or questions in the instrument one after the other.

 It investigates whether the items in the instrument fully cover the variable or concept being measured.

 It indicates the extent to which items adequately measure or represent the content of the trait or variable that the researcher wishes to measure.

 It is defined as the degree to which items in an instrument reflect the content universe to which the instrument can be generalized.

 It requires a judgmental approach that involves expert or panel judges.
Measurement of Content Validity
 Measuring content validity involves assessing individual items on a test and asking experts whether each item addresses the topic or variable the instrument is designed to measure.

 Lawshe proposed a standard method for measuring content validity that incorporates expert ratings.

 The Lawshe approach involves asking experts to determine whether each item in the instrument is relevant.
Item   Expert 1   Expert 2   Expert 3   Expert 4   Expert 5   Expert 6   Number in Agreement   Item CVI
1      —          √          √          √          √          √          5                     .83
2      √          —          √          √          √          √          5                     .83
3      √          √          —          √          √          √          5                     .83
4      √          √          √          —          √          √          5                     .83
5      √          √          √          √          —          √          5                     .83
6      √          √          √          √          √          —          5                     .83
7      √          √          √          √          √          √          6                     1.00
8      √          √          √          √          √          √          6                     1.00
9      √          √          √          √          √          √          6                     1.00
10     √          √          √          √          √          √          6                     1.00

Relevant: √; Not relevant: —
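As an illustration, each item’s CVI in the table is the number of experts rating the item relevant divided by the total number of experts. The Python sketch below assumes a 1/0 coding of the expert judgements and simply reproduces the table’s values:

```python
# Item-level content validity index (CVI).
# Each expert's judgement is coded 1 (relevant) or 0 (not relevant);
# the ratings below reproduce the table above.
ratings = {
    1:  [0, 1, 1, 1, 1, 1],
    2:  [1, 0, 1, 1, 1, 1],
    3:  [1, 1, 0, 1, 1, 1],
    4:  [1, 1, 1, 0, 1, 1],
    5:  [1, 1, 1, 1, 0, 1],
    6:  [1, 1, 1, 1, 1, 0],
    7:  [1, 1, 1, 1, 1, 1],
    8:  [1, 1, 1, 1, 1, 1],
    9:  [1, 1, 1, 1, 1, 1],
    10: [1, 1, 1, 1, 1, 1],
}

for item, votes in ratings.items():
    agreement = sum(votes)            # experts rating the item relevant
    cvi = agreement / len(votes)      # item CVI = agreement / number of experts
    print(f"Item {item}: {agreement}/{len(votes)} in agreement, CVI = {cvi:.2f}")
```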
3. Construct Validity
o Constructs are variables or concepts that are
abstract in nature and cannot be measured directly.

o Construct validity is concerned with how well a set of statements or items represents a concept that is not directly measurable.

o Construct validity indicates the extent to which a measurement instrument accurately measures a construct that cannot be measured directly and produces a distinct, observable, and measurable concept.
Types of Construct Validity
1. Convergent Validity
 Convergent validity refers to the degree to which two instruments measuring the same construct are related.

 Convergent validity is the extent to which measures of the same or similar constructs actually correspond to each other.

2. Discriminant Validity
 Discriminant validity is when two measures of constructs that should be unrelated are, in fact, found to be unrelated.
4. Criterion Validity
 Criterion validity focuses on how well an instrument compares with another instrument that has been established as a good instrument for measuring the behaviour.

 Criterion validity shows you how well an instrument correlates with an established standard of comparison called a criterion.

 It indicates the extent to which an instrument’s scores correlate with an external criterion (i.e., usually another measurement from a different instrument).
Types of criterion validity
1. Concurrent Validity
 This is demonstrated when a new instrument
correlates with another instrument that is already
considered valid.

2. Predictive Validity
 This is demonstrated when an instrument can
predict future performance.
 The test must correlate with a variable that can only be assessed at a future date, after the test has been administered.
Reliability of a Research Instrument
 Reliability is the degree to which an instrument produces stable and consistent results.

 Reliability is the consistency with which a measuring instrument measures what it is designed to measure.

 It is the extent to which the results obtained from such an instrument can be relied upon as a true score.

 It is obtained through correlation statistics, which yield a coefficient of reliability.

 A reliability coefficient of not less than 0.60 may be regarded as satisfactory, but not good enough.
Types of Reliability
There are a number of different ways to estimate the reliability of an instrument:
1. Test-retest reliability;
2. Internal consistency;
3. Parallel forms reliability; and
4. Inter-rater reliability.
1. Test Re-test Reliability
Test-retest reliability measures the consistency of
results when you repeat the same instrument on the
same sample at a different point in time.

Test-retest reliability is the measure of reliability obtained by administering the same test more than once, over a period of time, to the same sample group.

You use it when you are measuring something that you expect to stay constant in your sample.
Measurement of Test-retest Reliability
• To measure test-retest reliability, you conduct the
same test on the same group of people at two
different points in time.

• Then, calculate the correlation between the two sets of results using appropriate correlation statistics.

Examinee   1st Administration   2nd Administration
1          50                   54
2          45                   50
3          76                   70
4          39                   42
5          54                   56
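As a minimal sketch of this computation, the Pearson correlation between the two administrations can be obtained with NumPy (the scores are taken from the table above; the same computation applies to the parallel-forms example later in this lecture):

```python
import numpy as np

# Scores from the table above: the same examinees, two administrations.
first  = np.array([50, 45, 76, 39, 54])   # 1st administration
second = np.array([54, 50, 70, 42, 56])   # 2nd administration

# The Pearson correlation between the two sets of results is the
# test-retest reliability coefficient.
r = np.corrcoef(first, second)[0, 1]
print(f"Test-retest reliability r = {r:.2f}")
```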
2. Interrater Reliability
 Inter-rater reliability (also called inter-observer
reliability) measures the degree of agreement between
different people observing or assessing the same thing.

 Use it when the data collected involve assigning ratings, scores, or categories to one or more variables.

 Inter-rater reliability, as the name indicates, relates to the agreement between sets of results obtained by different assessors using the same method.

 The benefits and importance of assessing inter-rater reliability can be explained by referring to the subjectivity of assessments.
Measurement of Interrater Reliability
 To measure inter-rater reliability, administer the same instrument to the same sample once and calculate the degree of agreement between the different raters.

 If all the raters give similar ratings, the test has high inter-rater reliability.
Item   Rater 1   Rater 2   Rater 3   Rater 4   Rater 5   Agreement
1      √         √         √         √         √         5/5
2      √         √         0         √         √         4/5
3      √         0         √         0         √         3/5
4      0         √         √         √         √         4/5
5      √         0         √         √         √         4/5
6      0         √         0         √         √         3/5
7      √         √         √         √         √         5/5
8      √         √         √         √         √         5/5

√ = rater endorses the item; 0 = rater does not
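A minimal sketch of the per-item agreement in the table, assuming each √ is coded 1 and each 0 stays 0:

```python
# Per-item rater agreement, coded 1 (check) or 0 (no check),
# reproducing the table above.
ratings = [
    [1, 1, 1, 1, 1],   # item 1
    [1, 1, 0, 1, 1],   # item 2
    [1, 0, 1, 0, 1],   # item 3
    [0, 1, 1, 1, 1],   # item 4
    [1, 0, 1, 1, 1],   # item 5
    [0, 1, 0, 1, 1],   # item 6
    [1, 1, 1, 1, 1],   # item 7
    [1, 1, 1, 1, 1],   # item 8
]

for i, row in enumerate(ratings, start=1):
    print(f"Item {i}: agreement = {sum(row)}/{len(row)}")
```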
3. Parallel forms reliability
 Parallel forms reliability measures the correlation between two equivalent versions of an instrument.
 This is also referred to as alternate-forms or equivalent-forms reliability.
 You use it when you have two different assessment tools designed to measure the same thing.

Examinee   2022 WAEC Math   2022 NECO Math
1          50               54
2          45               50
3          76               70
4          39               42
5          54               56
4. Internal Consistency
 Internal consistency assesses the correlation between multiple items in a test that are intended to measure the same construct.

 Internal consistency reliability is applied to assess the extent to which test items that explore the same construct produce similar results.

 It is also known as inter-item consistency.

 It measures the correlation between items on the same test, i.e., whether several items used to measure the same general construct produce similar scores.
Methods of Determining Internal Consistency
1. Split-half
2. Kuder-Richardson’s KR20 and Kuder-Richardson’s
KR21
3. Cronbach’s Alpha or Coefficient Alpha

Split-half
 It is the process of splitting the test items into two halves.

Assuming there are 50 items in a test, the split can be done by:
a) using the even- and odd-numbered questions (i.e., even: 2, 4, 6, 8, etc.; odd: 1, 3, 5, 7, 9, etc.); or
b) using the first 25 items and the second 25 items in the test.
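A minimal sketch of the odd-even split on a hypothetical matrix of dichotomously scored items; the half-test correlation is stepped up with the Spearman-Brown formula, a standard adjustment not shown on the slide:

```python
import numpy as np

# Hypothetical data: 30 examinees by 50 dichotomously scored items.
rng = np.random.default_rng(0)
scores = rng.integers(0, 2, size=(30, 50))

# Odd-even split: columns 0, 2, 4, ... hold items 1, 3, 5, ...
odd_total  = scores[:, 0::2].sum(axis=1)   # odd-numbered items
even_total = scores[:, 1::2].sum(axis=1)   # even-numbered items

r_half = np.corrcoef(odd_total, even_total)[0, 1]  # half-test correlation
r_full = 2 * r_half / (1 + r_half)                 # Spearman-Brown step-up
print(f"Split-half reliability (corrected) = {r_full:.2f}")
```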
2 (a) Kuder-Richardson’s KR20
Kuder-Richardson’s KR20 is meant to compute the internal consistency of dichotomously scored items (e.g., multiple-choice items).
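A sketch of the KR20 computation on hypothetical 0/1 item scores, following the standard formula KR20 = (k/(k−1)) × (1 − Σpq/σ²), where p is the proportion passing each item, q = 1 − p, and σ² is the variance of the total scores:

```python
import numpy as np

# Hypothetical data: 30 examinees by 20 dichotomously scored items.
rng = np.random.default_rng(1)
scores = rng.integers(0, 2, size=(30, 20))

k = scores.shape[1]                      # number of items
p = scores.mean(axis=0)                  # proportion passing each item
pq = (p * (1 - p)).sum()                 # sum of item variances
total_var = scores.sum(axis=1).var()     # variance of total scores

kr20 = (k / (k - 1)) * (1 - pq / total_var)
print(f"KR-20 = {kr20:.2f}")
```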

2 (b) Kuder-Richardson’s KR21
It is a simplified version of KR20 that assumes all items are of approximately equal difficulty.

3. Cronbach’s Alpha or Coefficient Alpha
 It is used to determine whether multiple questions in a Likert-scale format are reliable.
 It is used to compute internal consistency for essay tests and for instruments that use a Likert scale (e.g., Strongly Disagree, Disagree, Agree, and Strongly Agree).
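A sketch of Cronbach’s alpha on hypothetical Likert responses (coded 1 = Strongly Disagree to 4 = Strongly Agree), using the standard formula α = (k/(k−1)) × (1 − Σ item variances / total-score variance):

```python
import numpy as np

# Hypothetical data: 5 respondents by 4 Likert items (1-4 ratings).
data = np.array([
    [4, 3, 4, 4],
    [3, 3, 2, 3],
    [2, 2, 2, 1],
    [4, 4, 3, 4],
    [3, 2, 3, 3],
])

k = data.shape[1]                           # number of items
item_vars = data.var(axis=0, ddof=1).sum()  # sum of item variances
total_var = data.sum(axis=1).var(ddof=1)    # variance of total scores

alpha = (k / (k - 1)) * (1 - item_vars / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")
```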
The End
