
ALLAMA IQBAL OPEN UNIVERSITY

(Department of Secondary Teacher Education)


Assignment No.1
Submitted to: Muhammad Naveed
Course code 8602
Educational Assessment and Evaluation
Name: Javeria Arshad
Roll no: CE604166
Q.1: What are formative and summative assessments? Distinguish between them
with the help of relevant examples.

Answer:

Formative assessment
Ideally, formative assessment strategies improve teaching and learning at the same time. Teachers can help students become better learners by actively encouraging them to self-assess their own skills and knowledge retention, and by giving clear directions and feedback. The goal of formative assessment is to monitor student learning and provide ongoing feedback that teachers can use to improve their teaching and students can use to improve their learning. More specifically, formative assessments:
 Help students identify their strengths and weaknesses and target areas that need work.
 Help faculty recognize where students are struggling and address problems right away.
Formative assessments are generally low stakes, which means that they have low or no point value.
Summative assessment
The goal of summative assessment is to evaluate student learning at the end of an instructional unit by comparing it against some standard or benchmark. Summative assessments are often high stakes, which means that they have a high point value. Information from summative assessments can be used formatively when students or faculty use it to guide their efforts and activities in subsequent courses.
Purpose of these Assessments
Formative Assessment

“The purpose of formative assessment is to monitor student learning and provide ongoing
feedback to staff and students”.

It is assessment for learning. If designed appropriately, it helps students identify their strengths
and weaknesses, and it can enable students to improve their self-regulatory skills so that they
manage their education in a less haphazard fashion than is commonly found. It also provides
information to the faculty about the areas students are struggling with so that sufficient support
can be put in place. Formative assessment can be tutor-led, peer assessment, or self-assessment.
Formative assessments have low stakes and usually carry no grade, which in some instances may
discourage students from doing the task or fully engaging with it.

Summative Assessment

“The goal of summative assessment is to evaluate student learning at the end of an instructional
unit by comparing it against some standard or benchmark”.

Summative assessments often have high stakes and are treated by the students as the priority over
formative assessments. However, feedback from summative assessments can be used
formatively by both students and faculty to guide their efforts and activities in subsequent courses.

Hence, achieving a balance between formative and summative assessments is important, although
it is a balance that students don't always fully grasp or take seriously. Formative assessments
provide a highly effective and risk-free environment in which students can learn and experiment.
Examples of Formative and Summative Assessments

Examples of Formative Assessment

1. Make an advertisement

Have your students create an advertisement for a concept they have just learned, using visuals
and text to really sell the idea. This makes students apply what they have learned in a creative
exercise, which helps with long-term retention.

2. Idea comparisons

Instruct students to lay out the main ideas of a new concept they have learned. Then have them
compare that concept with another to see where the two agree and conflict. As well as helping
students remember these ideas, this exercise makes them apply prior knowledge in a new format
so they can recall it better later on.

3. Misconceptions

After you introduce a concept to students, present a popular misconception about it. Have
students discuss why the misconception is false and where it may have originated. This exercise
makes students think critically about what they have just learned while showing them how to
debunk misinformation.

Examples of Summative assessments


1. In-depth reports
Ask students to choose a topic that resonated with them in class and report on it in depth. This
is an excellent opportunity for students to take an idea and run with it under your supervision.
These reports often showcase a student's interests, and you will be able to gauge a student's level
of engagement in the class by how they approach the report.
The goal is an enthusiastic, thoughtful, and comprehensive evaluation of a concept that matters
to the student.
2. Cumulative, individual projects
Have your students choose a project to complete. The project should in some way reflect what
they have learned throughout the course.
Projects are great for any practical-application class, from health science to physics. Creating a
cross-section of the human heart, planning a diet, or building a protective egg-drop vessel are all
fun ways students can show off their knowledge of a subject.
3. Personal evaluation papers
Require students to apply principles from your class to their own lives. These papers are excellent
fits for psychology, nutrition, finance, business, and other theory-based classes.
In a nutshell, personal evaluations let students look at themselves through a different lens while
exploring the details of the principles they learned in class. Besides, it lets students do something
everybody loves: talk about themselves.

Key comparison between Formative and Summative Assessment

1. Formative assessment refers to a variety of assessment procedures that provide the required
information to adjust teaching during the learning process, whereas summative assessment is
defined as a standard for evaluating the learning of students.
2. Formative assessment occurs on an ongoing basis, either monthly or quarterly, whereas
summative assessment occurs only at specific intervals, normally the end of the course.
3. Formative assessment is assessment for learning, whereas summative assessment is
assessment of learning.
4. Formative assessment is diagnostic in nature, whereas summative assessment is evaluative.
5. Formative assessment is conducted to enhance the learning of the students, whereas
summative assessment is conducted to judge the students' performance.
6. Formative assessment is undertaken to monitor students' learning, whereas summative
assessment aims at evaluating students' learning.
7. Formative assessment attempts to provide direct and detailed feedback to both teachers and
students regarding the performance and learning of the student; it is a continuous process that
observes students' needs and progress in the learning process. Summative assessment seeks to
evaluate the effectiveness of the course or program and checks learning progress; the scores,
grades, or percentages obtained act as indicators of the quality of the curriculum and form a
basis for rankings in schools.
8. A set of formal and informal assessment methods undertaken by teachers during the learning
process is called formative assessment, whereas summative assessment refers to the evaluation
of students with a focus on the result.
Q.2: How is a table of specifications prepared? What are the different ways of
developing a table of specifications?

Answer:
Table of specifications (TOS)
The table of specifications (TOS) is a tool used to ensure that a test or assessment measures the
content and thinking skills that the test intends to measure. Consequently, when used
appropriately, it can provide content- and construct-related (i.e., response process) validity
evidence. A TOS may be used for large-scale test construction, classroom-level assessments by
teachers, and psychometric scale development. It is an essential tool in designing tests or
measures for research and educational purposes.
The basic purpose of a TOS is to ensure alignment between the items or elements of an
assessment and the content, skills, or constructs that the assessment intends to assess. That is, a
TOS helps test constructors focus on the issue of response content, ensuring that the test or
assessment measures what it intends to measure. For example, if a teacher is interested in
assessing students' understanding of lunar phases, then it is appropriate to have a test item asking
them to draw the phases of the moon. However, a test item asking them to identify the first
person to walk on the moon would not have the same content validity for assessing students'
knowledge of lunar phases.
The educational objectives play a significant role in the development of classroom tests, because
the preparation of a classroom test is closely related to the curriculum and educational objectives.
A test should measure what was taught. What ensures similarity between classroom instruction
and test content is the development and application of a table of specifications, which is also
called a "test blueprint".
Steps of creating table of specifications

Tables of specification typically are designed based on the list of course objectives, the topics
covered in class, the amount of time spent on those topics, textbook chapter topics, and the
emphasis and space provided in the text. In some cases a greater weight will be assigned to a
concept that is extremely important, even if relatively little class time was spent on the topic.
Three steps are involved in creating a table of specifications: (1) choosing the measurement goals
and domain to be covered, (2) breaking the domain into key or fairly independent parts
(concepts, terms, procedures, applications), and (3) constructing the table.
Sample Table of Specifications
(cell values are the percentage weight given to each level of Bloom's Taxonomy per topic)

Subject Content | Knowledge & Comprehension | Application | Analysis, Synthesis & Evaluation | TOTALS
Topic A         | 10%                       | 20%         | 10%                              | 40%
Topic B         | 15%                       | 15%         | 30%                              | 60%
TOTALS          | 25%                       | 35%         | 40%                              | 100%

Ways of developing table of specifications

1- Determine the coverage of your exam

The first rule in making exams, and therefore in making the document called a table of
specifications, is to make sure the coverage of your exam is something that you have satisfactorily
taught in class. Select the topics that you wish to test in the exam. It is possible that you will not
be able to cover all of these topics, as that might create a test that is too long and unrealistic for
your students in the given time. So select only the most important topics.

2- Determine your testing objectives for each topic area

In this step, you will need to be familiar with Bloom's taxonomy of thinking skills. Bloom
identified a hierarchy of learning objectives, from the lower thinking skills of knowledge and
comprehension to the higher thinking skills of synthesis and evaluation.

Bloom's taxonomy has six categories (from lowest to highest): (1) Knowledge, (2) Comprehension,
(3) Application, (4) Analysis, (5) Synthesis, and (6) Evaluation.

So for each content area that you wish to test, you will have to determine how you will test it.
Will you test simply their recall of knowledge? Or will you be testing their comprehension of the
matter? Or perhaps you will be challenging them to analyze, compare, and contrast something.
Again, this will depend on your instructional objectives in the classroom. It is important that your
table of specifications reflects your instructional procedures during the semester. If your coverage
of a topic mostly dwelt on knowledge and comprehension of material, then you cannot test
students by going further up the hierarchy of Bloom's taxonomy. Thus it is crucial that you set a
balanced range of objectives throughout the semester, depending on the nature of your students.

3- Determine the duration for each content area

The next step in making the table of specifications is to write down how long you spent teaching
each topic. This is important because it determines how many points you should devote to each
topic. Logically, the longer you spent teaching a topic, the more questions should be devoted to
that area, as the sketch below illustrates.
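
As a rough sketch of how this proportional weighting works in practice (the topic names, hours,
and 40-item test length below are hypothetical examples, not taken from this assignment), the
number of items per topic can be computed from instructional time:

    # Allocate test items in proportion to the time spent teaching each topic.
    # Topic names, hours, and the 40-item test length are illustrative assumptions.
    hours_taught = {"Topic A": 4, "Topic B": 6}
    total_items = 40

    total_hours = sum(hours_taught.values())
    for topic, hours in hours_taught.items():
        items = round(total_items * hours / total_hours)
        print(f"{topic}: {items} items ({hours} of {total_hours} hours)")
    # Topic A: 16 items (4 of 10 hours)
    # Topic B: 24 items (6 of 10 hours)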

4- Determine the Test Types for each objective

Now that you have created your table of specifications by aligning your objectives to Bloom's
taxonomy, it is time to determine the test types that will accomplish your testing objectives. For
example, knowledge questions can be handled easily through multiple-choice questions or
matching-type exams. If you want to test evaluation or synthesis of a topic, you will want to create
essay-type questions, or perhaps you will ask the students to create diagrams and explain them in
their analysis. The important thing is that the test type should reflect your testing objective.

5- Polish your table of specifications

After your initial draft of the table of specifications, it is time to polish it. Make sure that your
table of specifications covers the important topics that you wish to test. The number of items in
your test should be sufficient for the time allotted for the test. You should also ask your academic
coordinator to comment on your table of specifications; they will be able to give good feedback
on how you can improve or modify it.

After their approval, it is time to put your blueprint into action by creating your exam. It is best
to use a spreadsheet such as Microsoft Excel so you can easily modify your table of specifications
in case you have some corrections.

Q.3: Define criterion-referenced and norm-referenced testing. Make a comparison
between them.
Answer:
Norm-referenced Tests and Criterion-Referenced Tests
Tests can be categorized into two major groups: norm-referenced tests and criterion-referenced
tests. These two types differ in their intended purposes, the way in which content is selected, and
the scoring process, which defines how the test results must be interpreted.
Norm-Referenced Test
Norm-referenced measures compare a person's knowledge or skills to the knowledge or skills of
the norm group. Norm-referenced tests are made to compare test takers to each other. The
composition of the norm group depends on the assessment. For student assessments, the norm
group is often a nationally representative sample of several thousand students in the same grade
(and sometimes at the same point in the school year). Norm groups may also be further narrowed
by age, English Language Learner (ELL) status, socioeconomic level, race/ethnicity, or many
other characteristics.

On an NRT driving test, test takers would be compared as to who knew the most or least about
driving rules, or who drove better or worse. Scores would be reported as a percentile rank, with
half scoring above and half below the mid-point.
This type of test determines a student's placement on a normal distribution curve. Students
compete against each other on this type of assessment; this is what is being referred to by the
phrase 'grading on a curve'.
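
As a minimal sketch of this norm-referenced interpretation (the score list below is invented for
illustration, not real norm-group data), a percentile rank simply reports the percentage of the
norm group scoring below a given test taker:

    # Percentile rank: the percentage of the norm group scoring below a given score.
    # The score list is a made-up example, not real norm-group data.
    scores = [55, 60, 62, 68, 70, 74, 78, 81, 88, 95]

    def percentile_rank(score, group):
        below = sum(1 for s in group if s < score)
        return 100 * below / len(group)

    print(percentile_rank(88, scores))  # 80.0: better than 80% of the norm group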

Criterion-Referenced Tests
Criterion-referenced tests compare a person’s knowledge or skills against a predetermined
standard, learning goal, performance level, or other criterion. With criterion-referenced tests, each
person’s performance is compared directly to the standard, without considering how other students
perform on the test. Criterion-referenced tests often use “cut scores” to place students into
categories such as “basic,” “proficient,” and “advanced.”

Criterion-referenced tests are intended to measure how well a person has learned a specific body
of knowledge and skills. "Criterion-referenced test" is a term which is used daily in classes. These
tests assess specific skills covered in class; they measure specific skills and concepts. Typically,
they are designed with 100 total points possible, students earn points for items completed
correctly, and the students' scores are expressed as a percentage. Criterion-referenced tests are the
most common type of test teachers use in daily classroom work.
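
A minimal sketch of this criterion-referenced interpretation (the cut scores of 60, 75, and 90 are
illustrative assumptions, not standards from any particular curriculum) maps a percentage score
directly to a category, without reference to how other test takers performed:

    # Map a percentage score to a performance category using fixed cut scores.
    # The cut scores here are hypothetical examples.
    def categorize(percent):
        if percent >= 90:
            return "advanced"
        if percent >= 75:
            return "proficient"
        if percent >= 60:
            return "basic"
        return "below basic"

    print(categorize(82))  # proficient: judged against the standard alone,
                           # regardless of how other test takers performed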

Comparison between criterion-referenced and norm-referenced testing

1. A norm-referenced test measures the performance of one group of test takers against another
group of test takers, whereas a criterion-referenced test measures the performance of test takers
against the criteria covered in the curriculum.
2. The purpose of a norm-referenced test is to measure how much a test taker knows compared
to another student, whereas the purpose of a criterion-referenced test is to measure how much
the test taker knew before and after the instruction is finished.
3. Norm-referenced tests measure broad skill areas sampled from a variety of textbooks and
syllabi, whereas criterion-referenced tests measure the skills the test taker has acquired on
finishing a curriculum.
4. In a norm-referenced test, each skill is tested by fewer than four items and the items vary in
difficulty, whereas in a criterion-referenced test each skill is tested by at least four items to obtain
an adequate sample of the student's performance.
5. Norm-referenced tests must be administered in a standardized format, whereas criterion-
referenced tests need not be administered in a standardized format.
6. Norm-referenced test scores are reported as a percentile rank, whereas criterion-referenced
test scores are reported in categories or as percentages.
7. In norm-referenced testing, if a test taker ranks at the 95th percentile, it implies that he/she
has performed better than 95% of the other test takers; in criterion-referenced testing, the score
determines how much of the curriculum is understood by the test taker.
8. Norm-referenced tests measure the acquisition of skills and knowledge from multiple sources
such as notes, texts, and syllabi, whereas criterion-referenced tests measure performance on
specific concepts and are often used in a pre-test/post-test format.
Q.4: What are the types of selection-type test items? What are the advantages
of multiple-choice questions?

Answer:
Selection Type Items (objective type)
There are four types of test items in the selection category that are in common use today:
 Multiple-choice
 Matching
 True-false
 Completion items

Multiple-Choice
Multiple-choice test items consist of a stem (a question or incomplete statement) and three or
more alternative answers (options), with the correct answer sometimes called the keyed response
and the incorrect answers called distracters. The question form of stem is generally better than
the incomplete-statement form because it is simpler and more natural. Gronlund (1995) writes
that the multiple-choice question is probably the most popular as well as the most widely
applicable and effective type of objective test item. The student selects a single response from a
list of options. It can be used effectively for any level of course outcome. It consists of two parts:
the stem, which states the problem, and a list of three to five alternatives, one of which is the
correct (keyed) answer while the others are distracters (incorrect options that draw the less
knowledgeable pupil away from the correct response).
Matching
According to Cunningham (1998), matching items consist of two parallel columns. The column
on the left contains the questions to be answered, called premises; the column on the right
contains the answers, called responses. The student is asked to associate each premise with a
response to form a matching pair.

True/False Questions
A true-false test item requires the student to determine whether a statement is true or false. The
chief disadvantage of this type is the opportunity for successful guessing.
Because the true-false option is the most common, this item type is usually referred to as the
true-false type. Students make a judgment about the validity of the statement. It is also known as
a "binary-choice" item because there are only two options to select from. These types of items are
most effective for assessing knowledge, comprehension, and application outcomes as defined in
the cognitive domain of Bloom's taxonomy of educational objectives.

Completion Items
Like true-false items, completion items are relatively easy to write. Perhaps the first tests
classroom teachers construct, and students take, are completion tests. Like items of all other
formats, though, there are good and poor completion items. The student fills in one or more
blanks in a statement; these are also known as "gap-fillers". They are most effective for assessing
knowledge and comprehension learning outcomes but can be written for higher-level outcomes,
e.g.:
The capital city of Pakistan is -----------------.

Advantages of multiple choice questions

Multiple-choice test items are not a panacea. They have advantages and disadvantages just as any
other type of test item, and teachers need to be aware of these characteristics in order to use
multiple-choice items effectively.
 They have fast processing times.
 There is no room for subjectivity.
 You can ask more questions, since it takes less time to complete a multiple-choice question
compared to an open question.
 Respondents do not have to formulate an answer but can focus on the content.
Reliability
Well-written multiple-choice test items compare favorably with other test item types on the issue
of reliability. They are less susceptible to guessing than true-false test items and are therefore
capable of producing more reliable scores. Their scoring is more clear-cut than short-answer test
item scoring because there are no misspelled or partial answers to deal with. Since multiple-choice
items are objectively scored, they are not affected by scorer inconsistencies as essay questions are,
and they are essentially immune to the influence of bluffing and writing-ability factors, both of
which can lower the reliability of essay test scores.
Versatility
Multiple-choice test items are appropriate for use in many different subject-matter areas, and can
be used to measure a great variety of educational objectives. They are adaptable to various levels
of learning outcomes, from simple recall of knowledge to more complex levels, such as the
student’s ability to:
 Analyze phenomena
 Apply principles to new situations
 Comprehend concepts and principles
 Discriminate between fact and opinion
 Interpret cause-and-effect relationships
 Interpret charts and graphs
 Judge the relevance of information
 Make inferences from given data
 Solve problems
The difficulty of multiple-choice items can be controlled by changing the alternatives, since the
more homogeneous the alternatives, the finer the distinction the students must make in order to
identify the correct answer. Multiple-choice items are amenable to item analysis, which enables
the teacher to improve the item by replacing distracters that are not functioning properly. In
addition, the distracters chosen by the student may be used to diagnose misconceptions of the
student or weaknesses in the teacher’s instruction.
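
As a brief sketch of what such an item analysis can involve (the response data below are invented,
and the two statistics shown are the conventional difficulty index p and a simple upper-lower
discrimination index D), a teacher might compute:

    # Two common item-analysis statistics for one multiple-choice item:
    # p = proportion of students answering correctly (difficulty index)
    # D = difference in success rate between high and low scorers (discrimination)
    def item_stats(correct):
        # correct: list of 0/1 responses, ordered from highest to lowest total score
        n = len(correct)
        k = n // 3  # conventional upper/lower groups of roughly a third each
        p = sum(correct) / n
        d = (sum(correct[:k]) - sum(correct[-k:])) / k
        return p, d

    responses = [1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0]  # invented data
    p, d = item_stats(responses)
    print(f"difficulty p = {p:.2f}, discrimination D = {d:.2f}")
    # difficulty p = 0.50, discrimination D = 1.00

A distracter that is rarely chosen, or one chosen mainly by high scorers, is a candidate for
replacement.
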
Effectiveness
Multiple-choice items are amenable to rapid scoring, which is often done by scoring machines.
This speeds up the reporting of test results to the student, so that any follow-up clarification of
instruction may be done before the course has proceeded much further. Essay questions, on the
other hand, must be graded manually, one at a time. Overall, multiple-choice tests are:
 Very effective
 Versatile at all levels
 Minimum of writing for student
 Guessing reduced
 Can cover broad range of content
Validity

In general, it takes much longer to respond to an essay test question than to a multiple-choice
test item, since composing and writing out an essay answer is a slow process. A student is
therefore able to answer many multiple-choice items in the time it would take to answer a single
essay question. This feature enables the teacher using multiple-choice items to test a broader
sample of course content in a given amount of testing time. Consequently, the grades will likely
be more representative of the students' overall achievement in the course.

Q.5: Which factors affect the reliability of a test?


Answer:
There are different factors that affect the reliability of a test; they can be grouped into intrinsic
and extrinsic factors.
Intrinsic Factors
The principal intrinsic factors (i.e. those factors which lie within the test itself) which affect
the reliability are:
Length of the Test:
Reliability has a definite relation with the length of the test. The more items the test contains,
the greater will be its reliability, and vice versa. Logically, the larger the sample of items we take
from a given area of knowledge or skill, the more reliable the test will be.
However, it is difficult to fix a maximum length of the test that ensures an appropriate value of
reliability; the length of the test should not give rise to fatigue effects in the testees. Still, it is
advisable to use longer tests rather than shorter tests, since shorter tests are less reliable, as the
formula below suggests.
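
This relationship between test length and reliability is commonly quantified by the
Spearman-Brown prophecy formula from classical test theory (a standard result, added here for
reference rather than taken from the original text):

    $r_k = \frac{k\,r}{1 + (k - 1)\,r}$

where r is the reliability of the existing test and r_k is the predicted reliability when the test is
lengthened by a factor of k. For example, doubling (k = 2) a test with reliability r = 0.60 predicts
a new reliability of 1.20/1.60 = 0.75.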

Homogeneity of Items:
Homogeneity of items has two aspects: item reliability and the homogeneity of the traits
measured from one item to another. If the items measure different functions and the
inter-correlations of items are zero or near zero, then the reliability is zero or very low, and
vice versa.
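
A standard index that reflects this dependence of reliability on item inter-correlations (again a
classical test theory result, added for reference rather than taken from the text) is Cronbach's
alpha:

    $\alpha = \frac{k}{k - 1}\left(1 - \frac{\sum_{i=1}^{k} \sigma_i^2}{\sigma_t^2}\right)$

where k is the number of items, $\sigma_i^2$ the variance of item i, and $\sigma_t^2$ the
variance of total scores. When items barely correlate, the item variances account for nearly all of
the total variance and alpha falls toward zero.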

Difficulty Value of Items:


The difficulty level and clarity of expression of a test item also affect the reliability of test scores.
If the test items are too easy or too difficult for the group members, the test will tend to produce
scores of low reliability, because both kinds of test have a restricted spread of scores.

Discriminative Value:
When items discriminate well between superior and inferior students, the item-total correlation
is high and the reliability is also likely to be high, and vice versa.

Test instructions:
Clear and concise instructions increase reliability. Complicated and ambiguous directions give rise
to difficulties in understanding the questions and the nature of the response expected from the
testees, ultimately leading to low reliability.

Extrinsic Factors
The important extrinsic factors (i.e. the factors which remain outside the test itself)
influencing the reliability are:
Group variability:
When the group of pupils being tested is homogeneous in ability, the reliability of the test scores
is likely to be lowered, and vice versa.
Guessing and chance errors:
Guessing in a test gives rise to increased error variance and as such reduces reliability. For
example, with two-alternative response options there is a 50% chance of answering an item
correctly purely by guessing.
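
A classical adjustment for this (a standard formula in the testing literature, added for illustration
rather than taken from the text) is the correction for guessing:

    $S = R - \frac{W}{k - 1}$

where R is the number of right answers, W the number of wrong answers, and k the number of
options per item. For two-alternative (true-false) items, k = 2, so each wrong answer cancels a
right one, reflecting the 50% chance of a correct guess.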

Environmental conditions:

As far as practicable, the testing environment should be uniform. Arrangements should be such
that light, sound, and other comforts are equal for all testees; otherwise these conditions will
affect the reliability of the test scores.

Momentary fluctuations:

Momentary fluctuations may raise or lower the reliability of the test scores. A broken pencil, a
momentary distraction caused by the sudden sound of a train running outside, anxiety about
unfinished homework, or a mistake in marking an answer with no way to change it are all factors
which may affect the reliability of test scores.

Errors that Can Increase or Decrease Individual Scores:


There may be errors committed by test developers that also affect the reliability of teacher-made
tests. These errors affect the students' scores, making the scores deviate from the students' true
ability, and therefore affect the reliability. Careful consideration of the following factors may help
to measure the true ability of the students.
 The test itself: the overall look of the test may affect the students' scores. A test should
normally be written in a readable font size and style, and the language of the test should be
simple and understandable.
 The test administration: after developing the test, the test developer may have to prepare a
manual for its administration; the timing, environment, invigilation, and test anxiety also affect
students' performance while attempting the test. Uniform administration of the test therefore
leads to increased reliability.
 The test scoring: marking of the test is another source of variation in students' scores.
Normally there are many raters to rate the students' responses/answers on a test. Objective-type
test items, and a marking rubric for essay-type/supply-type items, help to obtain consistent
scores.
