
LESSON 6: ESTABLISHING TEST VALIDITY AND RELIABILITY

Desired Significant Learning Outcomes


In this lesson, you are expected to:
➢ Use procedures and statistical analyses to establish test validity and reliability;
➢ Decide whether a test is valid or reliable; and
➢ Decide which test items are easy and which are difficult.

What is test reliability?


Reliability is the consistency of responses to a measure under three conditions:
(1) when the same person is retested on the same test;
(2) when the person is retested on an equivalent measure; and
(3) when responses are similar across items that measure the same characteristic.
In the first condition, consistent responses are expected when the test is given to the same participants at two points in time. In the second condition, reliability is attained if the responses to a test are consistent with the responses to its equivalent form, or to another test that measures the same characteristic, when administered at a different time. In the third condition, there is reliability when a person responds consistently across items that measure the same characteristic.
There are different factors that affect the reliability of a measure. The reliability of a measure can be high or low, depending on the following factors:
1. The number of items in a test - The more items a test has, the higher the likelihood of reliability, because a larger pool of items raises the probability of obtaining consistent scores.

2. Individual differences of participants - Every participant possesses characteristics that affect test performance, such as fatigue, concentration, innate ability, perseverance, and motivation. These individual factors change over time and affect the consistency of answers in a test.

3. External environment - The external environment includes room temperature, noise level, depth of instruction, exposure to materials, and quality of instruction, all of which can cause changes in examinees' responses to a test.

What are the different ways to establish test reliability?

There are different ways of determining the reliability of a test. The specific kind of reliability depends on (1) the variable being measured, (2) the type of test, and (3) the number of versions of the test.
The different types of reliability are described below, together with how each is done and the statistical analysis needed to determine it.
1. Test-retest

How is this reliability done? Administer a test at one time to a group of examinees, then administer it again at another time to the same group. For tests that measure stable characteristics, such as standardized aptitude tests, the time interval between the two administrations should be no more than six months; the retest can be given with a minimum interval of 30 minutes. The responses should be more or less the same across the two points in time. Test-retest is applicable for tests that measure stable variables, such as aptitude and psychomotor measures (e.g., a typing test or tasks in physical education).

What statistics is used? Correlate the test scores from the first and the second administration. A significant and positive correlation indicates that the test has temporal stability over time. Correlation is a statistical procedure in which a linear relationship is expected between two variables. You may use the Pearson Product Moment Correlation, or Pearson r, because test data are usually on an interval scale (refer to a statistics book for Pearson r).

2. Parallel Forms

How is this reliability done? There are two versions of a test, each called a "form," whose items measure exactly the same skill. Administer one form at one time and the other form at another time to the same group of participants. The responses on the two forms should be more or less the same. Parallel forms are applicable when there are two versions of a test. This is usually done when a test is used repeatedly for different groups, such as entrance and licensure examinations, where different versions are given to different groups of examinees.

What statistics is used? Correlate the results of the first form with those of the second form. A significant and positive correlation coefficient is expected; it indicates that the responses on the two forms are consistent. Pearson r is usually used for this analysis.

3. Split-Half

How is this reliability done? Administer a test to a group of examinees, then split the items into halves, usually using the odd-even technique: get the sum of points on the odd-numbered items and the sum of points on the even-numbered items. Each examinee thus has two scores coming from the same test, and the two sets of scores should be close or consistent. Split-half is applicable when the test has a large number of items.

What statistics is used? Correlate the two sets of scores using Pearson r, then apply the Spearman-Brown formula to estimate the reliability of the whole test. The resulting coefficient should be significant and positive to indicate that the test has internal consistency reliability.

4. Test of Internal Consistency Using Kuder-Richardson and Cronbach's Alpha

How is this reliability done? This procedure determines whether the scores for each item are consistently answered by the examinees. After administering the test to a group of examinees, determine and record the score for each item. The idea is to see whether the responses per item are consistent with one another. This technique works well when the assessment tool has a large number of items. It is also applicable to scales and inventories (e.g., a Likert scale from "strongly agree" to "strongly disagree").

What statistics is used? A statistical analysis called Cronbach's alpha or the Kuder-Richardson formula is used to determine the internal consistency of the items. A Cronbach's alpha value of 0.60 and above indicates that the test items have internal consistency.

5. Inter-rater Reliability

How is this reliability done? This procedure determines the consistency of multiple raters when rating scales and rubrics are used to judge performance. Reliability here refers to similar or consistent ratings provided by more than one rater or judge using the same assessment tool. Inter-rater reliability is applicable when the assessment requires multiple raters.

What statistics is used? Kendall's coefficient of concordance (Kendall's w) is used to determine whether the ratings provided by multiple raters agree with one another. A significant Kendall's w value indicates that the raters concur with one another in their ratings.
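To make the split-half method concrete, here is a minimal Python sketch of the odd-even split followed by the Spearman-Brown step-up formula, SB = 2r / (1 + r). The item scores and function names are hypothetical illustrations, not data or code from the book.

```python
# Split-half reliability with the Spearman-Brown correction.
# Hypothetical data: each row is one examinee's per-item scores (1 = correct).

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sx2 = sum(a * a for a in x)
    sy2 = sum(b * b for b in y)
    num = n * sxy - sx * sy
    den = ((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2)) ** 0.5
    return num / den

def split_half_reliability(item_scores):
    """Odd-even split: correlate odd-item totals with even-item totals,
    then step the half-test correlation up with Spearman-Brown."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    return (2 * r_half) / (1 + r_half)              # Spearman-Brown step-up

scores = [
    [1, 1, 0, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 0, 1, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]
print(round(split_half_reliability(scores), 2))
```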

What statistical analyses are used to determine test reliability?

1. Linear regression
➢ Linear regression is demonstrated when two variables are measured, such as two sets of scores on a test taken at two different times by the same participants.
➢ When the two sets of scores are plotted on a graph (with an X- and a Y-axis), they tend to form a straight line.
➢ The straight line formed by the two sets of scores is the linear regression line.
➢ When a straight line is formed, we can say that there is a correlation between the two sets of scores.
➢ The graph is called a scatterplot.
➢ Each point in the scatterplot is a respondent with two scores (one for each test).
2. Computation of the Pearson r correlation
➢ The index of the linear relationship is called the correlation coefficient.
➢ When the points in the scatterplot fall close to the line, the correlation is said to be strong.
➢ When the trend of the scatterplot is directly proportional, the correlation coefficient has a positive value. When the trend is inverse, the correlation coefficient has a negative value.
➢ The statistical analysis used to determine the correlation coefficient is called the Pearson r.
Suppose that a teacher gave a 20-item spelling test of two-syllable words on Monday and again on Tuesday. The teacher wanted to determine the reliability of the two sets of scores by computing the Pearson r.
Formula:

r = [NΣXY - (ΣX)(ΣY)] / √{[NΣX² - (ΣX)²][NΣY² - (ΣY)²]}

Substitute the values in the formula:

r = 0.80
The value of a correlation coefficient does not exceed 1.00 or fall below -1.00. A value of 1.00 or -1.00 indicates a perfect correlation. In tests of reliability, though, we aim for a high positive correlation, which means that there is consistency in the way the students answered the tests.
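As an illustration, the raw-score computation can be carried out as in the Python sketch below. The Monday and Tuesday scores here are hypothetical stand-ins, since the original score table is not reproduced in this document, so the result will not be exactly 0.80.

```python
# Raw-score Pearson r, mirroring the formula above.
# Hypothetical data: 20-item spelling test scores for the same
# five students on Monday (X) and Tuesday (Y).

x = [18, 15, 12, 19, 10]  # Monday scores
y = [17, 16, 11, 20, 9]   # Tuesday scores

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

r = (n * sum_xy - sum_x * sum_y) / (
    ((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)) ** 0.5
)
print(f"r = {r:.2f}")  # a high positive r suggests consistent scores
```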

3. Difference between a positive and a negative correlation

➢ When the value of the correlation coefficient is positive, it means that the higher the scores
in X, the higher the scores in Y. This is called a positive correlation.
➢ When the value of the correlation coefficient is negative, it means that the higher the scores
in X, the lower the scores in Y, and vice versa. This is called a negative correlation.
➢ When the same test is administered to the same group of participants, usually a positive
correlation indicates reliability or consistency of the scores.

4. Determining the strength of a correlation


The strength of the correlation also indicates the strength of the reliability of the test. It is shown by the value of the correlation coefficient: the closer the value is to 1.00 or -1.00, the stronger the correlation. Below is the guide:

0.80 - 1.00 Very strong relationship

0.60 - 0.79 Strong relationship

0.40 - 0.59 Substantial/marked relationship

0.20 - 0.39 Weak relationship

0.00 - 0.19 Negligible relationship

5. Determining the significance of the correlation


The correlation obtained between two variables could be due to chance. To determine whether the correlation reflects a real relationship rather than a chance result, it is tested for significance. When a correlation is significant, it means that the relationship between the two variables is very unlikely to have occurred by chance.

To determine whether a correlation coefficient is significant, it is compared with a critical value, the value expected by chance at a given probability level. When the computed value is greater than the critical value, the correlation is significant, meaning there is at least a 95% probability (at the 0.05 level) that the two variables are truly related.
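One common way to carry out this significance test is to convert r to a t statistic, t = r√(n − 2) / √(1 − r²), and compare it with the critical t value. The Python sketch below assumes a hypothetical case of n = 20 examinees tested at the 0.05 level; the book does not show this computation, so treat it as one standard approach rather than the book's own method.

```python
# Testing whether a correlation is significant: convert r to a t statistic
# and compare it with a critical value from a t-table.
# Assumption: n = 20 examinees (df = 18), two-tailed test at the 0.05 level.

import math

r = 0.80          # correlation coefficient from the reliability analysis
n = 20            # number of examinees
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

critical_t = 2.101  # t-table value for df = 18, alpha = 0.05, two-tailed

if t > critical_t:
    print(f"t = {t:.2f} > {critical_t}: the correlation is significant")
else:
    print(f"t = {t:.2f} <= {critical_t}: not significant")
```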
Another statistical analysis used to determine the internal consistency of a test is Cronbach's alpha. Follow the procedure below to determine internal consistency.

Suppose that five students answered a checklist about their hygiene on a scale of 1 to 5, where the points correspond to the following responses:

5 - always, 4 - often, 3 - sometimes, 2 - rarely, 1 - never

The checklist has five items. The teacher wanted to determine whether the items have internal consistency.

Following the Cronbach's alpha procedure, the computed internal consistency of the responses is 0.10, indicating low internal consistency.
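Here is a minimal Python sketch of the Cronbach's alpha computation, α = [k/(k − 1)][1 − Σs²(item)/s²(total)], using hypothetical checklist responses since the book's data table is not reproduced in this document.

```python
# Cronbach's alpha for a five-item, 1-5 checklist.
# Hypothetical data: one list of item responses per student.

def variance(values):
    """Sample variance with n - 1 in the denominator."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def cronbach_alpha(rows):
    """alpha = (k / (k - 1)) * (1 - sum of item variances / total variance)."""
    k = len(rows[0])                       # number of items
    items = list(zip(*rows))               # responses grouped per item
    item_vars = sum(variance(list(col)) for col in items)
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_vars / total_var)

responses = [
    [5, 4, 3, 4, 5],
    [3, 2, 3, 2, 3],
    [4, 4, 4, 5, 4],
    [2, 3, 2, 2, 1],
    [5, 5, 4, 5, 5],
]
print(round(cronbach_alpha(responses), 2))  # 0.60 and above suggests consistency
```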
The consistency of ratings can also be obtained using a coefficient of concordance. The Kendall's w coefficient of concordance is used to test the agreement among raters.
Below is a performance task demonstrated by five students and rated by three raters. The rubric used a scale of 1 to 4, where 4 is the highest and 1 is the lowest.

Demonstration    Rater 1    Rater 2    Rater 3    Sum of Ratings      D       D²

A                4          4          3          11                  2.6     6.76
B                3          2          3          8                  -0.4     0.16
C                4          3          4          11                  2.6     6.76
D                3          3          2          8                  -0.4     0.16
E                1          1          2          4                  -4.4    19.36

Mean of the sums of ratings: x̄ = 8.4          ΣD² = 33.2


The sum of the ratings for each demonstration is computed first. The mean of the sums of ratings is then obtained (x̄ = 8.4) and subtracted from each sum of ratings to get the difference (D). Each difference is squared (D²), and the sum of the squared differences is computed (ΣD² = 33.2). These values are substituted in the Kendall's w formula, where m is the number of raters and n is the number of demonstrations:

w = 12ΣD² / [m²n(n² - 1)]

The resulting Kendall's w coefficient of about 0.37 indicates the degree of agreement of the three raters across the five demonstrations. There is moderate concordance among the raters because the value is some distance from 1.00.
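The computation above can be reproduced with the short Python sketch below, using the ratings from the table and the formula w = 12ΣD² / [m²n(n² − 1)], which follows the book's procedure of working on the raw rating sums.

```python
# Kendall's coefficient of concordance (w) for the table above:
# three raters scoring five demonstrations on a 1-4 rubric.

ratings = [           # one row per demonstration: [rater1, rater2, rater3]
    [4, 4, 3],
    [3, 2, 3],
    [4, 3, 4],
    [3, 3, 2],
    [1, 1, 2],
]

m = len(ratings[0])                      # number of raters, 3
n = len(ratings)                         # number of demonstrations, 5
sums = [sum(row) for row in ratings]     # sum of ratings per demonstration
mean = sum(sums) / n                     # 8.4
ssd = sum((s - mean) ** 2 for s in sums) # sum of squared deviations, 33.2

w = 12 * ssd / (m ** 2 * n * (n ** 2 - 1))
print(f"w = {w:.2f}")  # about 0.37: moderate agreement among the raters
```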

What is test validity?


A measure is valid when it measures what it is supposed to measure. If a quarterly exam is valid, its contents should directly measure the objectives of the curriculum. If a scale that measures personality is composed of five factors, the items under each of the five factors should be highly correlated. If an entrance exam is valid, it should predict students' grades after the first semester.

What are the different ways to establish test validity?


There are different ways to establish test validity.

Content Validity
Definition: The items represent the domain being measured.
Procedure: The items are compared with the objectives of the program. The items need to directly measure the objectives (for achievement tests) or the definition (for scales). A reviewer conducts the checking.

Face Validity
Definition: The test is presented well, free of errors, and administered well.
Procedure: The test items and layout are reviewed and tried out on a small group of respondents. A manual for administration can be made as a guide for the test administrator.

Predictive Validity
Definition: The measure should predict a future criterion. An example is an entrance exam predicting students' grades after the first semester.
Procedure: A correlation coefficient is obtained in which the X-variable is used as the predictor and the Y-variable as the criterion.

Construct Validity
Definition: The components or factors of the test should contain items that are strongly correlated.
Procedure: Pearson r can be used to correlate the items within each factor. A technique called factor analysis can also determine which items are highly correlated enough to form a factor.

Concurrent Validity
Definition: Two or more measures of the same characteristic are present for each examinee.
Procedure: The scores on the measures should be correlated.

Convergent Validity
Definition: The components or factors of a test are hypothesized to have a positive correlation.
Procedure: Correlation is done between the factors of the test.

Divergent Validity
Definition: The components or factors of a test are hypothesized to have a negative correlation. An example is correlating scores on a test of intrinsic and extrinsic motivation.
Procedure: Correlation is done between the factors of the test.

How do you determine whether an item is easy or difficult?

➢ An item is difficult if the majority of the students are unable to provide the correct answer.
➢ An item is easy if the majority of the students are able to answer it correctly.
➢ An item can discriminate if examinees who score high on the test answer it correctly more often than examinees who score low.

ITEM ANALYSIS: the process of examining students' responses to individual test items in order to assess the quality of those items and of the test as a whole (Mehta, 2011)
● An excellent question can separate the performing from the non-performing students

STEPS
1. Get the total score of each student and arrange the scores from highest to lowest.
2. Take the top 27% (upper group) and the bottom 27% (lower group) of the examinees.
3. For each item, compute the proportion of examinees in the upper group (pH) and in the lower group (pL) who answered it correctly.
4. Compute the difficulty index of each item. (It is a measure of the proportion of examinees who answered the item correctly.)
5. Compute the discrimination index. (This is a measure of how well an item distinguishes between examinees who are knowledgeable and those who are not, or between masters and non-masters.)
FORMULA:
➢ The difficulty index is obtained using the formula:

Item difficulty = (pH + pL) / 2

➢ The discrimination index is obtained using the formula:

Item discrimination = pH - pL

INTERPRETATION

DIFFICULTY INDEX REMARK

0.76 or higher Easy Item

0.25 to 0.75 Average Item

0.24 or lower Difficult Item

INDEX DISCRIMINATION REMARK

0.40 and above Very Good Item

0.30 - 0.39 Good Item

0.20 - 0.29 Reasonably Good Item

0.10 - 0.19 Marginal Item

Below 0.10 Poor Item


EXAMPLE
This is the result of a test given to a total of 10 students. (N=10)

ITEM ANALYSIS COMPUTATION

Index of Difficulty
Item 1: (0.67 + 0.00) / 2 = 0.33   Average item
Item 2: (0.67 + 0.33) / 2 = 0.50   Average item
Item 3: (1.00 + 0.67) / 2 = 0.83   Easy item
Item 4: (0.67 + 0.33) / 2 = 0.50   Average item
Item 5: (1.00 + 0.33) / 2 = 0.67   Average item

Index of Discrimination
Item 1: 0.67 - 0.00 = 0.67   Very good item
Item 2: 0.67 - 0.33 = 0.33   Good item
Item 3: 1.00 - 0.67 = 0.33   Good item
Item 4: 0.67 - 0.33 = 0.33   Good item
Item 5: 1.00 - 0.33 = 0.67   Very good item
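The whole item-analysis computation can be reproduced with the Python sketch below, starting from the pH and pL proportions used in the example; the two interpretation functions simply encode the remark tables above.

```python
# Difficulty and discrimination indices for the five items above,
# starting from the pH and pL proportions in the worked example.
# Printed values may differ from the text by 0.01 because pH and pL
# are already rounded to two decimals.

p_upper = [0.67, 0.67, 1.00, 0.67, 1.00]  # pH per item
p_lower = [0.00, 0.33, 0.67, 0.33, 0.33]  # pL per item

def interpret_difficulty(d):
    """Encodes the difficulty-index remark table."""
    return "Easy" if d >= 0.76 else "Average" if d >= 0.25 else "Difficult"

def interpret_discrimination(d):
    """Encodes the discrimination-index remark table."""
    if d >= 0.40: return "Very Good"
    if d >= 0.30: return "Good"
    if d >= 0.20: return "Reasonably Good"
    if d >= 0.10: return "Marginal"
    return "Poor"

for i, (ph, pl) in enumerate(zip(p_upper, p_lower), start=1):
    difficulty = (ph + pl) / 2   # proportion answering the item correctly
    discrimination = ph - pl     # upper-group minus lower-group proportion
    print(f"Item {i}: difficulty {difficulty:.2f} "
          f"({interpret_difficulty(difficulty)}), "
          f"discrimination {discrimination:.2f} "
          f"({interpret_discrimination(discrimination)})")
```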

References:

1. Ubiña-Balagtas, M., David, A., Golla, E., Magno, C., & Valladolid, V. (2020). Establishing test validity and reliability. In Assessment in Learning 1 (pp. 96-118). Rex Book Store, Inc.

2. Application of IRT Using the Rasch Model in Constructing Measures: https://www.slideshare.net/crimgn/the-application-of-irt-using-the-rasch-model-presnetation

3. Reliability and Validity in Student Assessment: https://www.youtube.com/watch?v=gzv8Cm1jC4M
