Module 2 Part 4 Criteria For Measurement
Criteria for Good Measurement & Scaling
• It is important to make sure that the instrument used to measure variables is accurate and efficient.
• The three major criteria (qualities/characteristics) for evaluating a measurement tool are:
• Reliability
• Validity
• Sensitivity
Reliability
• Reliability is the degree to which a measure can be depended upon to assess the intended construct.
• It is concerned with the accuracy and consistency of the scale.
• It refers to the extent to which the measurement process is free from random errors (accuracy).
• An instrument is said to be reliable if it gives the same result every time the measurement is repeated under similar conditions (consistency).
Methods of measuring Reliability
• Test-retest reliability
• Split-half reliability
• Inter-rater reliability
Test-retest reliability
• It measures the stability of a test over time.
• In this method, repeated measurements of the same variable or construct are taken using the same scale under similar conditions.
• The correlation coefficient between the scores from the repeated measurements is then computed (see the sketch below).
• A very high correlation between the scores indicates that the scale is reliable.
• It can also be done by taking the measurement twice and checking whether the two scores are the same.
• The main limitations of this method are the possibility of bias, the time gap between trials, situational factors, etc.
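A rough illustration of the computation described above, using Python/NumPy on hypothetical scores from two administrations of the same scale:

```python
import numpy as np

# Hypothetical scores from two administrations of the same scale
# to the same respondents under similar conditions.
time1 = np.array([12, 15, 9, 20, 18, 14, 11, 16])
time2 = np.array([13, 14, 10, 19, 18, 15, 10, 17])

# Pearson correlation between the two administrations;
# a value close to 1 indicates high test-retest reliability.
r = np.corrcoef(time1, time2)[0, 1]
print(f"Test-retest reliability (Pearson r): {r:.2f}")
```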
Split-half reliability
• It measures the extent to which all parts of the test contribute equally to what is being measured (internal consistency).
• Here the test is administered to a large number of students (30 or more).
• The items (questions) are randomly divided into two halves, and scores for both halves are obtained.
• A correlation coefficient between the two halves is then computed.
• A high correlation indicates internal consistency, which implies greater reliability.
• Split-half reliability can also be assessed by calculating the coefficient alpha (Cronbach's alpha, α), as sketched below:
• α = 0 means no consistency
• α = 1 means complete consistency
• α between 0.80 and 0.95 indicates very good reliability
• α between 0.70 and 0.80 indicates good reliability
• α between 0.60 and 0.70 indicates fair reliability
• α below 0.60 indicates poor reliability
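A minimal sketch of both computations on simulated item responses (Python/NumPy). The Spearman-Brown correction applied to the half-test correlation is a standard companion step not named in the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated responses: 30 students x 8 items that all tap the same
# underlying trait, so the two halves should correlate highly.
trait = rng.normal(size=(30, 1))
scores = trait + rng.normal(scale=0.7, size=(30, 8))

# Split-half: correlate the totals of the two halves
# (here, odd- vs even-numbered items stand in for a random split).
half1 = scores[:, 0::2].sum(axis=1)
half2 = scores[:, 1::2].sum(axis=1)
r_half = np.corrcoef(half1, half2)[0, 1]
# Spearman-Brown correction estimates the reliability of the full test.
r_full = 2 * r_half / (1 + r_half)

# Cronbach's alpha: (k / (k - 1)) * (1 - sum(item variances) / total variance)
k = scores.shape[1]
alpha = k / (k - 1) * (1 - scores.var(axis=0, ddof=1).sum()
                       / scores.sum(axis=1).var(ddof=1))

print(f"Split-half r: {r_half:.2f} (corrected: {r_full:.2f}), alpha: {alpha:.2f}")
```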
Inter-rater Reliability
• It is a measure of consistency between two or more independent raters (observers) of the same construct.
• It measures how much homogeneity exists in the ratings given by various examiners or observers using the scale.
• When all observers agree that the observed phenomenon fits the measurement scale, the scale is said to be reliable.
• More than 80% agreement is considered high reliability.
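A small sketch of the agreement check on hypothetical ratings (Python/NumPy). Cohen's kappa, which corrects raw agreement for chance, is a common refinement not mentioned in the slides:

```python
import numpy as np

# Hypothetical categorical ratings of the same ten subjects
# by two independent raters.
rater_a = np.array([1, 2, 2, 3, 1, 2, 3, 3, 1, 2])
rater_b = np.array([1, 2, 3, 3, 1, 2, 3, 2, 1, 2])

# Raw percent agreement (the >80% rule of thumb above).
agreement = (rater_a == rater_b).mean()

# Cohen's kappa corrects that agreement for chance.
categories = np.union1d(rater_a, rater_b)
p_chance = sum((rater_a == c).mean() * (rater_b == c).mean() for c in categories)
kappa = (agreement - p_chance) / (1 - p_chance)

print(f"Percent agreement: {agreement:.0%}, Cohen's kappa: {kappa:.2f}")
```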
Validity
• Validity refers to the extent to which a measurement tool measures what it is supposed to measure.
• Validity focuses on the question of whether we are measuring what we want to measure.
• It is the extent to which the measurement process is free from systematic errors.
• Validity of a scale is a more serious issue than reliability.
Ways to measure validity
2. Construct Validity
• Construct validity refers to how well a test or tool measures the construct (skill, ability or attribute) that it was designed to measure.
• It is the extent to which a scale adequately assesses the theoretical concept it is meant to assess.
• Eg. A test designed to measure ‘depression’ must measure only that particular construct, not closely related ideas such as anxiety or stress.
• The most commonly used method to check construct validity is Confirmatory Factor Analysis (CFA), sketched below.
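A minimal CFA sketch, assuming the open-source semopy package (its API may differ across versions); the data are simulated so the one-factor model should fit:

```python
import numpy as np
import pandas as pd
from semopy import Model, calc_stats  # third-party SEM package (assumption)

# Simulate 200 respondents whose four item scores share one latent factor.
rng = np.random.default_rng(1)
latent = rng.normal(size=200)
df = pd.DataFrame({f"item{i}": 0.8 * latent + rng.normal(scale=0.6, size=200)
                   for i in range(1, 5)})

# One-factor CFA: every item should load on the single 'depression' factor.
model = Model("depression =~ item1 + item2 + item3 + item4")
model.fit(df)
print(model.inspect())    # factor loadings and error variances
print(calc_stats(model))  # fit indices such as CFI and RMSEA
```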
Construct validity may be:
• Convergent validity: tests whether constructs that are expected to be related to other constructs are actually related.
• Discriminant validity: tests whether constructs that should have no relationship with other constructs are indeed unrelated.
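Both checks reduce to inspecting correlations; here is a sketch on simulated scores (the data and scale names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
latent = rng.normal(size=100)
new_scale = latent + rng.normal(scale=0.5, size=100)      # scale under test
related_scale = latent + rng.normal(scale=0.5, size=100)  # related construct
unrelated_scale = rng.normal(size=100)                    # unrelated construct

# Convergent validity: expect a strong correlation with the related measure.
print(f"Convergent r: {np.corrcoef(new_scale, related_scale)[0, 1]:.2f}")
# Discriminant validity: expect a near-zero correlation with the unrelated one.
print(f"Discriminant r: {np.corrcoef(new_scale, unrelated_scale)[0, 1]:.2f}")
```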
3. Concurrent validity:
• The validity of the new measurement scale is
measured by comparing it with an established
measuring tool.
• It is done by correlating the pilot data collected by
using the tool with the data collected by using a
properly validated standard tool.
• The correlation coefficient between the results of the two measures should be computed for this purpose (see the sketch below).
• The stronger the correlation, the higher the
degree of concurrent validity.
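A brief sketch with hypothetical pilot data (Python/SciPy); pearsonr also reports a p-value for the correlation:

```python
import numpy as np
from scipy import stats

# Hypothetical pilot data: the same respondents measured with the new
# tool and with an established, properly validated standard tool.
new_tool = np.array([34, 28, 45, 39, 31, 42, 37, 29, 44, 36])
standard_tool = np.array([33, 30, 47, 38, 33, 41, 35, 31, 45, 37])

# A strong correlation indicates high concurrent validity.
r, p = stats.pearsonr(new_tool, standard_tool)
print(f"Concurrent validity: r = {r:.2f} (p = {p:.3f})")
```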
4. Predictive validity
• Predictive validity is the extent to which a scale is able to measure the construct as predicted by the criterion.
• Predictive validity examines:
• whether the scale measures what it is intended to measure, and
• whether the results can be used to predict things about the participants.
• It addresses how well a specific tool predicts future behaviour.
• Predictive validity is determined by calculating the correlation coefficient between the assessment data (pilot data) and the targeted behaviour.
• The stronger the correlation, the higher the degree of predictive validity.
An Example of Testing Predictive Validity
• Assume that some specific traits and skills are required
for the employees to do a particular job.
• We want to prepare test questions to measure the personality traits of the applicants for that particular job.
• There may be several questions to assess the traits of
applicants.
• Some toppers of the test may be invited for an interview
and practical test to assess their actual abilities and
skills.
• If these toppers possess the expected abilities and skills required for the job, the test instrument is said to have predictive validity.
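A sketch of the computation behind this example, on hypothetical test scores and later performance ratings; the linear fit is an illustrative extra showing how the scale could predict performance for new applicants:

```python
import numpy as np

# Hypothetical data: entrance-test scores and job-performance ratings
# gathered for the same applicants after they were hired.
test_scores = np.array([55, 62, 70, 48, 81, 66, 74, 59])
performance = np.array([3.1, 3.4, 4.0, 2.8, 4.5, 3.6, 4.1, 3.2])

# The correlation quantifies predictive validity...
r = np.corrcoef(test_scores, performance)[0, 1]
# ...and a simple linear fit turns the scale into a predictor.
slope, intercept = np.polyfit(test_scores, performance, 1)
print(f"Predictive validity r = {r:.2f}")
print(f"Predicted rating for a test score of 65: {slope * 65 + intercept:.2f}")
```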
Sensitivity
• It is the ability of the scale to accurately measure the construct and the variability in responses.
• A dichotomous scale (yes or no) has less sensitivity, whereas a Likert scale has more sensitivity.
• More response categories (5, 7 or 11) help to increase the sensitivity of the scale (see the sketch below).
• The sensitivity of a scale may also be improved by eliminating items that poorly represent the intended construct and adding items that are expected to measure the construct well.
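To see why more response categories increase sensitivity, the sketch below collapses simulated 7-point Likert responses into yes/no answers and compares the variability that survives:

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical responses to one item on a 7-point Likert scale.
likert = rng.integers(1, 8, size=50)
# The same attitudes recorded dichotomously (agree if rating >= 4).
dichotomous = (likert >= 4).astype(int)

# The Likert version preserves far more between-respondent variability,
# which is what makes the scale more sensitive.
print(f"Distinct Likert responses: {np.unique(likert).size}")
print(f"Distinct yes/no responses: {np.unique(dichotomous).size}")
print(f"Likert SD: {likert.std():.2f}, dichotomous SD: {dichotomous.std():.2f}")
```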
Other Criteria or Qualities of a Good Measurement Scale
Sources of Errors
2. Situational Errors
• Unfavourable external environment of the study
• Anonymity of the interviewer
• Wrong time or setting of the study
• Situations which create biased data