8602 SPRING 2024
2. Validity
• Validity refers to the degree to which an assessment actually measures what it is intended to measure.
• Example: A mathematics test written in unnecessarily difficult language may end up measuring reading ability rather than mathematical skill.
• Significance: Ensures that conclusions drawn from assessment results are meaningful and accurate.
3. Reliability
• Reliability refers to the consistency of assessment results across
time, different evaluators, or varied conditions.
• Example: If a student takes the same test on different days, the
results should be similar if their knowledge hasn't changed.
• Significance: Builds trust in the assessment process and ensures
fairness.
4. Fairness
• Assessments must be free from bias and provide all students an
equal opportunity to demonstrate their abilities.
• Example: Avoid using culturally specific language or contexts that
may disadvantage certain groups of students.
• Significance: Promotes equity in the classroom.
6. Feedback-Oriented
• Assessments should provide meaningful, timely, and constructive
feedback to students.
• Example: After a test, giving specific comments on strengths and
areas that need improvement.
• Significance: Encourages learning and growth by guiding students
in the right direction.
7. Variety of Methods
• Effective assessment employs diverse methods, such as written
tests, oral presentations, projects, and peer evaluations.
• Example: Combining multiple-choice questions, essays, and group
activities in a unit assessment.
• Significance: Captures a holistic picture of students’ abilities and
caters to different learning styles.
8. Student Involvement
• Students should take an active role in the assessment process, for example through self-assessment and peer assessment.
• Example: Asking students to evaluate their own work against a rubric before submitting it.
• Significance: Develops self-regulation and encourages students to take ownership of their learning.
9. Transparency
• The criteria, methods, and purposes of assessments should be clear
to both teachers and students.
• Example: Sharing rubrics, guidelines, and expectations before an
assignment.
• Significance: Reduces confusion and ensures that students
understand how they are being evaluated.
2. Measuring Achievement
Tests help determine how well students have mastered the learning objectives set for a course or grade level.
• Formative Assessment: Used during the learning process to
identify gaps and adjust instruction accordingly.
• Summative Assessment: Conducted at the end of a course or unit
to evaluate overall achievement.
3. Guiding Instruction
Assessment data from tests inform teaching practices by highlighting the
effectiveness of instructional strategies. Teachers can use test results to:
• Modify lesson plans.
• Allocate more time to challenging topics.
• Adapt teaching methods to cater to diverse learning styles.
4. Providing Feedback
Testing provides valuable feedback to students, parents, and educators.
5. Encouraging Accountability
Tests hold students, teachers, and educational institutions accountable
for achieving desired outcomes.
• Students are motivated to prepare and engage with the material.
• Teachers are driven to ensure their instruction aligns with
standards.
• Schools and districts are evaluated on their ability to meet
benchmarks.
6. Certifying Competence
In many cases, tests are used to certify that a student has achieved a
certain level of competence or skill. For example:
• High-stakes tests like graduation exams validate a student’s
readiness to advance or enter the workforce.
• Professional certifications or licensure exams confirm specialized
knowledge or skills.
7. Selection and Placement
Tests are often used for selection and placement purposes, such as:
• Identifying students for advanced courses, gifted programs, or
remedial education.
• Placing students in appropriate grade levels or subject groups
based on their proficiency.
Challenges in Testing
While testing serves these essential purposes, it must be carefully
designed and implemented to avoid:
• Overemphasis on rote learning.
Types of Questions in a Questionnaire
1. Closed-Ended Questions
Closed-ended questions provide respondents with a limited set of
predefined response options, making it easier to quantify and analyze
data.
Types of Closed-Ended Questions:
• Multiple Choice Questions
Respondents select one or more answers from a list of options.
Example:
What is your preferred mode of transport?
o ☐ Car
o ☐ Bike
o ☐ Bus
o ☐ Train
• Yes/No Questions
These require a simple binary response.
Example:
Have you traveled internationally in the last year?
o ☐ Yes
o ☐ No
• Likert Scale
Measures attitudes or opinions on a scale (e.g., agreement,
satisfaction).
Example:
How satisfied are you with our service?
o ☐ Very dissatisfied
o ☐ Dissatisfied
o ☐ Neutral
o ☐ Satisfied
o ☐ Very satisfied
• Rating Scale
Respondents rate a specific aspect on a numeric or descriptive
scale.
Example:
Rate the quality of our customer support (1 = Poor, 5 = Excellent).
o ☐ 1
o ☐ 2
o ☐ 3
o ☐ 4
o ☐ 5
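Because closed-ended questions restrict respondents to predefined options, analysis largely reduces to counting. A minimal sketch in Python, using made-up responses to the transport question above (the data are illustrative, not from any real survey):

```python
from collections import Counter

# Hypothetical responses to "What is your preferred mode of transport?"
responses = ["Car", "Bike", "Car", "Bus", "Train", "Car", "Bus", "Bike", "Car", "Bus"]

# With predefined options, quantifying the data is a simple tally.
tally = Counter(responses)
for option, count in tally.most_common():
    print(f"{option}: {count} ({count / len(responses):.0%})")
```

The same counting approach extends directly to Yes/No, Likert, and rating-scale items, which is why closed-ended formats are described as easier to quantify.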
2. Open-Ended Questions
Open-ended questions allow respondents to answer freely in their own
words. These questions provide rich, qualitative data but are harder to
analyze.
Example:
What improvements would you suggest for our service?
Response: ______________________________
3. Demographic Questions
These questions collect background information about the respondents.
They are often closed-ended but can include open-ended options when
appropriate.
Example:
What is your age group?
• ☐ Under 18
• ☐ 18–24
• ☐ 25–34
• ☐ 35–44
• ☐ 45 and above
What is your occupation?
Response: ______________________________
4. Contingency Questions
These are follow-up questions dependent on a previous response. They
help in filtering irrelevant questions.
Example:
Did you purchase any of our products in the last month?
• ☐ Yes (If yes, please answer the next question.)
• ☐ No (Skip to Question 5)
Which product did you purchase?
Response: ______________________________
5. Rank-Order Questions
These ask respondents to rank options in a specific order based on
preference or importance.
Example:
Rank the following features of our product in order of importance (1 =
Most important, 4 = Least important).
• ☐ Price
• ☐ Durability
• ☐ Design
• ☐ Functionality
6. Matrix Questions
Matrix questions combine multiple Likert-scale questions in a grid
format, making it efficient for respondents to answer.
Example:
How would you rate the following aspects of our service?
Aspect Poor Fair Good Very Good Excellent
Staff behavior ☐ ☐ ☐ ☐ ☐
Response time ☐ ☐ ☐ ☐ ☐
Ease of access ☐ ☐ ☐ ☐ ☐
7. Dichotomous Questions
These are a simpler form of closed-ended questions with only two
possible answers, often used for binary decision-making.
Example:
Are you currently employed?
• ☐ Yes
• ☐ No
9. Checklist Questions
These allow respondents to select multiple options from a list.
Example:
Which of the following apps do you use daily?
• ☐ Instagram
Comparison Table

Aspect                Restrictive Response                       Extended Response
Length of Response    Brief and focused                          Detailed and comprehensive
Scope                 Narrow, specific                           Broad, allowing for exploration
Skills Assessed       Recall, application, basic comprehension   Analysis, synthesis, evaluation
Examples              Definitions, problem-solving               Essays, in-depth explanations
Evaluation Criteria   Accuracy of facts, correctness of answer   Depth, organization, creativity, evidence
Types of Reliability
1. Test-Retest Reliability
Test-retest reliability measures the consistency of a test's results when
it is administered to the same group of people at two different points in
time. This type of reliability is particularly useful for measuring traits or
abilities that are relatively stable over time, such as intelligence or
personality traits.
• How it's measured: The test is given to the same group of people
twice, with a time interval between the two administrations. The
correlation between the two sets of scores is computed (usually
using the Pearson correlation coefficient). A higher correlation
indicates higher test-retest reliability.
• Ideal for: Situations where the characteristic being measured does
not change quickly (e.g., intelligence, personality).
• Limitations: Test-retest reliability can be affected by memory or
learning effects, particularly if the test items are the same or
similar each time.
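The Pearson correlation described above is straightforward to compute. A minimal sketch with hypothetical scores for five students tested twice (the numbers are illustrative only):

```python
import numpy as np

# Hypothetical scores for the same five students on two administrations
# of the same test, a few weeks apart.
time1 = np.array([82, 75, 91, 68, 88], dtype=float)
time2 = np.array([80, 78, 93, 65, 90], dtype=float)

# The Pearson correlation between the two administrations estimates
# test-retest reliability; values near 1.0 indicate stable scores.
r = np.corrcoef(time1, time2)[0, 1]
print(round(r, 3))
```

A high correlation here suggests the trait being measured stayed stable between administrations, consistent with the "ideal for" cases above.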
2. Internal Consistency
Internal consistency reliability measures the extent to which the items within a single test correlate with one another, indicating whether they measure the same underlying construct.
• How it's measured: Commonly with Cronbach's alpha, or by splitting the test into two halves, computing the scores for each half, and then correlating these scores.
• Ideal for: Multi-item tests, such as surveys or questionnaires,
where items are designed to measure different aspects of a single
construct.
• Limitations: Internal consistency is primarily concerned with how
items correlate with each other, so it doesn’t necessarily indicate
whether the test measures the construct in a valid or
comprehensive way.
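Internal consistency is most often summarized with Cronbach's alpha, which compares the sum of the individual item variances to the variance of the total scores. A minimal sketch with hypothetical Likert responses (the matrix and its values are invented for illustration):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of respondents' total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 1-5 Likert responses: 6 respondents x 4 items
# intended to measure the same construct.
scores = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
    [4, 4, 5, 4],
])
print(round(cronbach_alpha(scores), 3))
```

When the items co-vary strongly, as in this fabricated data, alpha approaches 1; note that a high alpha still does not, by itself, establish validity, as the limitation above points out.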
3. Inter-Rater Reliability
Inter-rater reliability (also known as inter-observer reliability) refers to
the degree to which different raters or observers agree on their
assessments when using the same test or tool. This is particularly
important in tests or evaluations that involve subjective judgment or
scoring, such as in clinical assessments, interviews, or performance
evaluations.
• How it's measured: It is assessed by comparing the ratings or
scores given by multiple raters for the same individuals or subjects.
Common measures include:
o Cohen’s Kappa: A statistical coefficient that adjusts for the
possibility of the agreement occurring by chance.
o Intra-class correlation (ICC): A measure of reliability for numeric ratings that can accommodate two or more raters.
• Ideal for: Situations where more than one person is responsible for
scoring or evaluating the same subject, such as in clinical settings,
peer reviews, or educational assessments.
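Cohen's kappa, mentioned above, can be computed directly from two raters' labels: it is the observed agreement minus the agreement expected by chance, divided by one minus the chance agreement. A minimal sketch with hypothetical pass/fail judgments (the ratings are invented for illustration):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical ratings of the same subjects."""
    n = len(rater_a)
    # Observed agreement: proportion of subjects both raters labelled identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: expected from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (p_o - p_e) / (1 - p_e)

# Hypothetical pass/fail judgments by two examiners on ten essays.
a = ["pass", "pass", "fail", "pass", "fail", "pass", "pass", "fail", "pass", "pass"]
b = ["pass", "pass", "fail", "fail", "fail", "pass", "pass", "fail", "pass", "pass"]
print(round(cohens_kappa(a, b), 3))
```

Here the raters agree on 9 of 10 essays, but kappa is lower than 0.9 because some of that agreement would be expected by chance alone, which is exactly the adjustment kappa makes.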
4. Parallel-Forms Reliability
Parallel-forms reliability involves comparing two different versions of
a test that are designed to measure the same construct. The goal is to
determine whether both forms of the test produce consistent results. This
type of reliability is useful when you want to avoid practice effects in a
test-retest situation.
• How it's measured: The test is administered to the same group of
people using two different but equivalent forms of the test. The
scores from both versions are correlated, and a high correlation
indicates good parallel-forms reliability.
• Ideal for: Testing the same construct with different items, often
used in large-scale assessments or standardized testing where
alternate forms of the test may be needed (e.g., SAT, GRE).
• Limitations: Developing parallel forms of a test that are
equivalent in difficulty and content can be challenging and time-
consuming.
5. Split-Half Reliability
Split-half reliability is a method of testing internal consistency by
splitting a test into two halves and checking how well the two halves
correlate with each other. The idea is that if the two halves of a test are
measuring the same thing, their scores should be highly correlated.
• How it's measured: The test is divided into two halves (for
example, odd-numbered and even-numbered items). The scores for
the two halves are correlated, and the result is used to estimate the
overall reliability of the test. This method is often used as an
alternative to Cronbach's alpha.
• Ideal for: Short tests or those with many items where full retesting
may not be feasible.
• Limitations: The way in which the test is split can affect the
results, and it may not always reflect the reliability of the full test.
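The odd/even split described above can be sketched as follows. Note that the raw half-to-half correlation understates the reliability of the full-length test, so it is standard practice to apply the Spearman-Brown correction (not named in the text, but the usual adjustment). The item scores are invented for illustration:

```python
import numpy as np

# Hypothetical item scores (1 = correct, 0 = incorrect) for 6 students
# on a 10-item test.
items = np.array([
    [1, 1, 1, 0, 1, 1, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 0, 1, 1, 1, 0, 1],
    [0, 1, 0, 1, 1, 0, 0, 1, 1, 0],
])

# Split into odd- and even-numbered items and total each half per student.
odd_half = items[:, 0::2].sum(axis=1)
even_half = items[:, 1::2].sum(axis=1)

# Correlate the two half scores...
r_half = np.corrcoef(odd_half, even_half)[0, 1]
# ...then apply the Spearman-Brown correction to estimate the
# reliability of the full-length test from the half-length correlation.
r_full = (2 * r_half) / (1 + r_half)
print(round(r_full, 3))
```

Splitting by odd/even items rather than first half/second half helps balance fatigue and difficulty effects across the two halves, though, as the limitation above notes, different splits can still yield different estimates.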