The document discusses the concepts of validity and reliability in research instruments, defining validity as the degree to which a tool measures what it is intended to measure, and reliability as the consistency of scores across repeated measurements. It outlines different types of validity, including content, criterion-related, and construct validity, as well as various forms of reliability such as test-retest, rater, and internal consistency. The importance of establishing both validity and reliability is emphasized to ensure accurate and dependable research outcomes.
VALIDITY & RELIABILITY
Mrs Deborah Esan
Introduction
Reliability and validity are two important and essential concepts that relate to each potential instrument the investigator is considering. They have specific definitions in research and can easily be confused.
Validity refers to whether the instrument actually measures what it is supposed to measure. For example: does the instrument measure the concept being examined?
Reliability refers to whether the tool conveys consistent and reproducible data, for example, from one participant to another or from one point in time to another.

VALIDITY
Validity is the degree to which a tool measures what it is supposed to measure (Mateo, 1999). It assures that a test is measuring what it is intended to measure.
There are three types of validity: content, criterion-related, and construct.
Validity relies first and foremost on reliability: an instrument cannot be valid unless it is also reliable, although reliability alone does not guarantee validity.
Validity Contd.
Content validity relates to an instrument's adequacy in covering all concepts pertaining to the phenomena being studied.
For example, if the purpose of the tool is to learn whether the patient is anxious before taking an examination, the questions should include a list of behaviors that anxious people report when they are experiencing anxiety. It is important that an instrument be reviewed for content by persons who possess characteristics and experiences similar to those of the participants in a study.

Validity Contd.
Criterion-related validity, which can be either predictive or concurrent, measures the extent to which a tool is related to other criteria.
Predictive validity is the adequacy of the tool to estimate an individual's performance or behavior in the future. It deals with the association between the measure and a subsequent event thought to be an outcome, for example, prematurity and death.
For example, if a tool is developed to measure the clinical competence of nurses, persons who respond to the tool can be followed over time to see whether their scores correlate with other measures of competence, such as performance appraisals, commendations, or other indications of competence.

Validity Contd.
If results indicate a high correlation coefficient, it means that the clinical competence scale can be used to predict future performance appraisals.
Concurrent validity is the ability of a tool to compare the respondent's status at a given time to a criterion (Mateo, 1999). For example, when a patient is asked to complete a questionnaire to determine the presence of anxiety, results of the test can be compared to the same patient's ratings on an established measure of anxiety administered at the same time.

Validity Contd.
Construct validity is concerned with the ability of the instrument to adequately measure the underlying concept (Mateo, 1999). With this type of validity, the researcher's concern relates to whether the scores represent the degree to which a person possesses a trait. Construct validity deals with how a measure relates to other measures; for example, attitude toward HIV patients might be measured by reactions to people living with HIV/AIDS (PLWHA). Since construct validity is a judgment based on a number of studies, it takes time to establish this type of validity.
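As a concrete illustration of the criterion-related validity checks described above, here is a minimal Python sketch that correlates tool scores with criterion scores. All data and variable names are hypothetical, not from the slides; a high, statistically significant correlation would support predictive validity (criterion measured later) or concurrent validity (criterion measured at the same time).

    from scipy.stats import pearsonr

    # Hypothetical scores from a clinical-competence tool
    tool_scores = [72, 85, 90, 65, 78, 88, 70, 95]
    # Hypothetical criterion scores, e.g. later performance appraisals
    # (predictive) or an established measure given at the same time (concurrent)
    criterion_scores = [70, 82, 94, 60, 75, 90, 68, 97]

    r, p_value = pearsonr(tool_scores, criterion_scores)
    print(f"validity coefficient r = {r:.2f} (p = {p_value:.3f})")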
Types of validity
Face validity: indicates that an instrument appears to test what it is supposed to.
Content validity: indicates that the items that make up an instrument adequately sample the universe of content that defines the variable being measured. Most useful with questionnaires and inventories.
Criterion-related validity: indicates that the outcomes of one instrument, the target test, can be used as a substitute measure for an established reference standard (criterion) test. Can be tested as concurrent or predictive validity. Concurrent validity establishes validity when two measures are taken at relatively the same time; it is most often used when the target test is considered more efficient than the gold standard and can be used in its place. Predictive validity establishes that the outcome of the target test can be used to predict a future criterion score or outcome.
Construct validity: establishes the ability of an instrument to measure an abstract construct and the degree to which the instrument reflects the theoretical components of the construct.

RELIABILITY
The reliability of a tool is the degree of consistency in scores achieved by subjects across repeated measurements. It is the extent to which a measurement is consistent and free from error; it is also known as reproducibility or dependability, that is, the consistency of measurement or its stability.
The comparison is usually reported as a reliability coefficient. The reliability coefficient is the proportion of true variability (attributed to true differences among respondents) to the total obtained variability (attributed to true differences among respondents plus differences related to other factors).
Reliability coefficients normally range between 0 and 1.00; the higher the value, the greater the reliability. In general, coefficients greater than 0.70 are considered appropriate, although in some circumstances this will vary.

Acceptable reliability
Poor reliability: less than 0.5
Moderate reliability: 0.5 to 0.75
Good reliability: greater than 0.75
Researchers may be able to tolerate lower reliability for measurements used for description. Measurements used for decision making or diagnosis need to be higher, perhaps at least 0.9, to ensure valid interpretation of findings. The researcher takes a chance that data across repeated administrations will not be consistent when instruments with reliability estimates of 0.60 or lower are used (Mateo, 1999).
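The reliability coefficient described above can be written as the ratio of true score variance to total obtained variance. Here is a minimal Python sketch of that ratio together with the rule-of-thumb bands from the slide; the variance values are hypothetical, chosen only to illustrate the calculation.

    def reliability_coefficient(true_variance: float, error_variance: float) -> float:
        # total variability = true differences among respondents + other factors
        return true_variance / (true_variance + error_variance)

    def interpret(r: float) -> str:
        if r < 0.5:
            return "poor"
        elif r <= 0.75:
            return "moderate"
        return "good"

    r = reliability_coefficient(true_variance=8.0, error_variance=2.0)
    print(f"r = {r:.2f} -> {interpret(r)} reliability")  # r = 0.80 -> good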
Three aspects should be considered when determining the reliability of instruments: stability, equivalence, and homogeneity. The stability of a tool refers to its ability to consistently measure the phenomenon being studied. This is determined through test-retest reliability.

Test-retest reliability
The tool is administered to the same persons on two separate occasions. Scores of the two sets of data are then compared, and the correlation is derived. The recommended interval between testing times is 2 to 4 weeks (Burns & Grove, 2005; Mateo, 1999).
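A minimal Python sketch of the test-retest procedure just described, assuming hypothetical scores from two administrations a few weeks apart; the correlation between the two score sets serves as the reliability coefficient.

    import numpy as np

    time1 = np.array([14, 22, 18, 30, 25, 16, 28])  # first administration
    time2 = np.array([15, 20, 19, 29, 27, 15, 26])  # second administration, weeks later

    r = np.corrcoef(time1, time2)[0, 1]  # test-retest reliability coefficient
    print(f"test-retest r = {r:.2f}")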
Equivalence should be determined when two versions of the same tool are used to measure a concept (alternative forms) or when two or more persons are asked to rate the same event or the behavior of another person (interrater reliability; Mateo, 1999). In alternative-form reliability, two versions of the same instrument are developed and administered; the scores obtained from the two tools should be similar. It is helpful for the researcher to know whether a published instrument has alternate forms; when there are, a decision must be made about which form to use. Establishing interrater reliability is important when two or more observers are used for collecting data.
The homogeneity of an instrument is determined most commonly by calculating a Cronbach's alpha coefficient. This test is found in a number of statistical packages and is a way of determining whether each item on an instrument measures the same thing. Internal consistency reliability estimates are also calculated by using the Kuder-Richardson formula.

Types of reliability
Test-retest reliability
Rater reliability
Alternate forms reliability
Internal consistency

Test-retest reliability
A sample of individuals is subjected to the identical test on two separate occasions, keeping all testing conditions as constant as possible. The test-retest reliability coefficient is reported. This coefficient is indicative of reliability in situations where raters are not involved, e.g., self-report survey instruments and physical and physiological measures with mechanical or digital readouts.

Rater reliability
Many clinical measurements require that a human observer, or rater, be part of the measurement system. Rater reliability can be:
Intrarater reliability (stability of the data recorded by one individual across two or more trials)
Interrater reliability (variation between two or more raters who measure the same group of subjects)

Alternate forms reliability
Many measuring instruments exist in two or more versions, called equivalent or parallel forms. This approach is often used as an alternative to test-retest reliability when the nature of the test is such that subjects are likely to recall their responses to test items.

Internal consistency
The extent to which items measure aspects of the same characteristic. The most common approach involves looking at the correlation among all items in a scale; split-half reliability is another approach.
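To make the homogeneity calculation concrete, here is a minimal Python sketch of Cronbach's alpha computed by hand for a hypothetical 4-item scale (rows are respondents, columns are items); statistical packages report the same statistic.

    import numpy as np

    items = np.array([
        [3, 4, 3, 4],
        [2, 2, 3, 2],
        [4, 5, 4, 5],
        [1, 2, 1, 2],
        [3, 3, 4, 3],
    ])

    k = items.shape[1]                              # number of items
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
    print(f"Cronbach's alpha = {alpha:.2f}")

An alpha near or above 0.70 would suggest the items measure the same underlying characteristic, in line with the thresholds given earlier.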