
The Assessment Network

Assessment 101
Glossary
101 words and phrases for assessment professionals

The world of assessment contains a lot of terminology and
phrasing that can present a barrier to fully engaging with
the topic.

At The Assessment Network, we are passionate about helping educators, learners, caregivers and parents understand the purposes and principles of assessment with the aim of supporting overall learning.

We achieve this in several different ways - from targeted courses in a range of assessment topics and practices, to thought leadership webinars and conferences. A common request we receive is to provide support in helping people develop their confidence in assessment - making complex technical ideas digestible and understandable.

This is the motivation behind the development of our glossary of assessment terms. The glossary aims to provide easy access to sound definitions and concepts that are regularly used in assessment discourse. It builds on previous work across Cambridge University Press & Assessment from the Assessment Projects Group and reflects the wide range of activities in assessment that Cambridge delivers and supports.

We hope you’ll find the glossary to be a useful and trusted resource, inspiring you to delve deeper and develop your assessment practice further.

Dr Simon Child, Head of Assessment Training, The Assessment Network

Glossary
The list below gives an indication of how these words are intended to be understood within the
context of educational assessment. It is not intended to provide formal definitions of these terms.

A Level
Advanced Level. An academic qualification taken by secondary school students (usually at age 17 or 18) in England, Wales and Northern Ireland. A Levels are offered in a wide range of subjects, such as mathematics, English literature, media studies and dance. These qualifications are normally sat at the end of two years of study and are often used as selection criteria by universities and employers.

Access arrangements
Where individual arrangements or adjustments to the test are identified and set up in advance of the assessment for candidates with particular needs, for example a permanent disability.

Accessibility
Accessibility refers to the ease with which candidates can understand a question or task. Arrangements can be put in place for candidates who have specific needs to promote accessibility: examples would be reprinting a paper in Braille for candidates with visual impairments, or giving extra time to candidates who have difficulty with language processing. However, it is important to note that the concept of accessibility applies to all candidates: a poorly worded question is likely to be less accessible to candidates than a well-worded one. Accessibility is one of the factors that determines the difficulty of a question.

Adaptive testing
A computer-based test that adapts to the ability level of the candidate, i.e. the next item or set of items presented depends on the candidate’s responses to earlier items.

Aggregation
The summing of marks from a number of different components.

Analytic mark scheme
A mark scheme in which separate marks are awarded for separate aspects or assessment objectives. The final mark is typically the aggregate of the separate marks.

Anchoring
Usually a set of common items used to establish an equivalent standard between tests.

Angoff
A method of determining cut-scores based on expert predictions of performance on individual test items.
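
As a rough illustration of the Angoff idea (procedures vary in practice, and all the numbers here are invented), each expert estimates the probability that a minimally competent candidate would answer each item correctly; averaging the experts’ summed estimates gives a candidate cut-score:

```python
# Invented ratings: each row is one expert's judged probability that a
# minimally competent candidate answers each of five 1-mark items correctly.
expert_ratings = [
    [0.8, 0.6, 0.5, 0.9, 0.4],
    [0.7, 0.6, 0.6, 0.8, 0.5],
    [0.9, 0.5, 0.4, 0.9, 0.4],
]

def angoff_cut_score(ratings):
    """Sum each expert's item predictions, then average across experts."""
    totals = [sum(expert) for expert in ratings]
    return sum(totals) / len(totals)

print(round(angoff_cut_score(expert_ratings), 2))  # 3.17 out of 5 marks
```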
Archived scripts
Examples of candidate work from previous testing, usually at the cut-scores / grade thresholds. Often used to exemplify the awarding standard.

Assessment
Educational assessment is the process of gathering, interpreting, recording and using information about pupils’ responses to educational tasks. It can be in more formal contexts such as timed tests under strict conditions, or less formal contexts such as reading pupils’ work and listening to what they say. Forms of assessment are also used in non-educational settings, for example to measure personality traits or diagnose psychological disorders.
Assessment claim
A statement that something is the case related to an assessment. In educational assessment, validation is the collection of empirical evidence to establish whether claims to validity are warranted.

Assessment objective
The skills that the learner is expected to demonstrate in a test. Individual items in an assessment may be linked to one or more assessment objectives.

Assessor
The person awarding a mark or other evaluative judgement on a candidate’s work.

Authentic assessment
Authentic assessment simulates real-world conditions and scenarios as part of the assessment task. Examples of authentic assessment activities include research projects, working with real-world case studies or data, and roleplay tasks.

Availability
When the assessment will be available to be taken by candidates, for example on demand or once a year in June.

Awarding standard
The quality of work necessary to be awarded a particular grade or other assessment outcome, taking into account the difficulty of the tasks. The awarding standard is what is maintained during standards-referenced grading.

Banded mark scheme
A mark scheme where the marks are divided into bands or levels with general marking criteria supplied for each band. Also known as a levels-based mark scheme.

Benchmark centres
A group of large schools with stable entry numbers, identified for the purpose of ensuring that the awarding standard does not change.

Candidate
The person whose work or performance is being assessed.

Classical test theory (CTT)
A branch of psychometrics that is concerned with improving the validity and reliability of assessments. CTT assumes that each person has an innate true score (i.e. their ability on a scale related to a trait). Various statistics can be used in CTT to determine the quality of the assessment as a measure of the targeted trait. One of the statistics used is the standard error of measurement (SEM), which is a measure of how scores on an assessment are spread around a true score.
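
As a sketch of how the SEM is typically estimated under CTT (using the standard relationship between score spread and reliability; the figures below are invented):

```python
import math

def standard_error_of_measurement(score_sd, reliability):
    """Standard CTT relationship: SEM = SD * sqrt(1 - reliability).
    Both inputs are assumed to be estimated from test data."""
    return score_sd * math.sqrt(1.0 - reliability)

# Scores with standard deviation 10 and reliability 0.91 give SEM = 3.0,
# so an observed mark of 65 suggests a true score of roughly 65 +/- 3.
print(round(standard_error_of_measurement(10.0, 0.91), 2))  # 3.0
```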
Cohort
Either the entire group of people taking an assessment, or a sub-group with a common shared characteristic such as year of birth or country of residence.

Command words
Words of instruction used within an assessment to tell the candidate what they need to do, e.g. State, Describe, Explain, Calculate…

Comparability
The degree to which the results from one assessment can be viewed as equivalent to results from a different assessment. This might include comparing equivalent grades between different examination boards, or results from one year with a previous year. Comparability is normally viewed as part of validity.

Competency
The ability to perform a specific skill, role or function to a predefined standard or target.

Constrained mark scheme
An objective mark scheme, in which the only acceptable answers are those provided in the mark scheme.

Construct
This term refers to unobservable characteristics that many types of test are designed to measure. For example, an intelligence test is designed to measure the construct of intelligence. Personality tests are designed to assess constructs such as introversion or psychopathy. They are referred to as ‘constructs’ because they are not directly observable but are ‘constructed’. That is to say, you can’t directly observe introversion, but you can observe someone regularly behaving in ways that taken together signify introversion. While some writers draw a distinction between observable attributes (proficiency in spoken English or mental arithmetic) and unobservable theoretical constructs (such as introversion), often the term construct is used simply to describe whatever it is that an assessment is intended to measure.

Construct-irrelevant variance
When candidates’ marks are affected by factors other than the knowledge, understanding and skills that the assessment is intended to assess.

Coursework
Classroom assignments undertaken by students as prescribed in the syllabus. These are normally marked by the student’s teacher according to criteria set by the assessment provider. This work is standardised within the centre and then standardised by the assessment provider.
Criterion referencing
The judgements about the work produced by learners are based on standards and criteria that are defined by the curriculum or other documentation. See also norm referencing.

Cut-scores
The minimum mark needed to obtain a certain grade or outcome for an assessment.

Demand
The demands of a task are the (mostly cognitive) mental processes that candidates have to carry out in order to complete the task (having accessed it). Demand is often intentionally varied in examinations and it is normal to try and provide a range of questions of various difficulties by varying the demands of the questions. Demand is one of the factors that determines the difficulty of a question.

Diagnostic use
The use of an assessment to identify strengths and weaknesses. This type of assessment may take place before, during or after a course of teaching.

Difficulty
The difficulty of a task is something that we can ascertain statistically from the marks data, provided that we already know something about the ability of the candidates. Difficulty can be defined as a statistical measure indicating how likely it is for candidates of a given ability to score marks. There are several factors which can contribute to making a task difficult. Distinguishing between these factors is a matter of judgement. Three in particular are important: the accessibility of the task, the demands of the task, and the severity of the marking. All three can affect the likelihood of a candidate scoring marks.

Discrimination
This concept refers to the ability of an assessment (or item in an assessment) to differentiate between high-performing and low-performing candidates. In an assessment that discriminates well, the strongest candidates will gain a high mark (with the strongest candidate possibly achieving full marks) and the weakest candidates will achieve a very low mark. For the assessment to be valid, this discrimination should be solely on the basis of the constructs and content that the test is designed to assess.
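
Neither entry prescribes particular statistics, but two classical ones make the concepts concrete: a facility value for difficulty and a top-versus-bottom discrimination index. A minimal sketch with invented marks:

```python
def facility(item_scores, max_mark):
    """Facility value: mean mark as a proportion of the maximum mark.
    Values near 1 mean the item was easy for this cohort."""
    return sum(item_scores) / (len(item_scores) * max_mark)

def discrimination_index(item_scores, totals, fraction=0.27):
    """Classical D-index: facility in the top group minus facility in the
    bottom group, with groups formed from overall test totals."""
    n = max(1, round(len(totals) * fraction))
    ranked = sorted(zip(totals, item_scores), reverse=True)
    top = [score for _, score in ranked[:n]]
    bottom = [score for _, score in ranked[-n:]]
    return (sum(top) / n) - (sum(bottom) / n)

item = [1, 0, 1, 1, 0, 1]          # invented marks on a 1-mark item
total = [48, 22, 51, 40, 18, 45]   # invented whole-test totals
print(round(facility(item, 1), 2))        # 0.67
print(discrimination_index(item, total))  # 1.0: top group right, bottom wrong
```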
Domain
This concept refers to all of the content or knowledge that could potentially be included in an assessment. Ideally this is clearly defined for those taking the assessment, for example, in a syllabus document (specification).

Double marking
Where more than one examiner reaches a judgement about a submitted response to an assessment. The final mark given to the student is typically a combination of the two given marks. Multiple marking is when more than two examiners are used.

Exemption
An arrangement which permits a candidate not to take a particular part or component of the assessment.

External assessment
An assessment that is marked by an examiner who is independent of the candidates’ own centre of learning.

Fairness
An assessment is fair if there is no bias which might cause the assessment to discriminate against a group of individuals according to some trait (e.g. gender, disability) which is not the construct being assessed.

Formative assessment
Often characterised as assessment for learning, formative assessments are carried out by teachers as part of instruction, so that they can modify and adapt their teaching in order to improve student attainment. An example of this would be a teacher questioning their students during a lesson to see if they have understood the topic currently being taught, and, if not, adapting and redelivering the material. Formative assessment is often viewed in contrast to summative assessment.

GCSE
General Certificate of Secondary Education. An academic qualification taken by secondary school students (usually at age 15 or 16) in England, Wales and Northern Ireland. GCSEs are offered in a wide range of subjects, from mathematics and English (which are normally compulsory for students) to psychology and Latin. These qualifications are normally sat at the end of two years of study and are often used as selection criteria for post-16 education.

Generic mark scheme
A mark scheme that does not change: the same marking criteria are applied even if the actual task changes.

Grade
A level on a scale of performance used to differentiate achievement (for example A, B, C or Distinction, Merit, Pass).

Grade descriptors
An overall statement about the standards that need to be reached in a subject discipline or qualification to achieve a particular grade outcome. They aim to define the requirements of a qualification and the main learning outcomes, and can inform the development of assessment objectives.

Grade inflation
The real or claimed increase of grade outcomes of consecutive cohorts. Grade inflation is considered an issue for standards if there is evidence to suggest that students of the same ability are receiving different grade outcomes on the basis of their cohort.

Grading
The process of converting assessment marks into grades.
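
Putting the Cut-scores and Grading entries together, converting marks to grades is a lookup against the grade boundaries; a toy sketch (boundaries invented):

```python
# Invented grade boundaries: the minimum mark needed for each grade.
boundaries = [(80, "A"), (65, "B"), (50, "C"), (35, "D")]

def to_grade(mark):
    """Return the highest grade whose cut-score the mark meets, else 'U'."""
    for cut, grade in boundaries:
        if mark >= cut:
            return grade
    return "U"

print(to_grade(72))  # B
```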

High-stakes
Where the outcome of an assessment has a high impact on the individual or group being assessed.

Holistic mark scheme
The final mark for an item is based on an overall judgement of the performance, usually using a best-fit judgement and possibly weighing up the candidate’s performance in more than one assessment objective.

Interim assessment
Assessment which takes place part way through a programme of study.
Internal assessment
Assessment that takes place and is marked within the candidates’ own centre of learning.

Invigilation
Supervising candidates taking examinations to ensure they follow all necessary rules.

Item
On an exam, an item is a question, or part question, that is not broken down further to be marked. For example, if an exam paper has Question 1, Question 2a, Question 2bi and Question 2bii, these are all items. So Question 1 is an item and Question 2 is made up of three different items (2a, 2bi and 2bii).

Item banking
A system for storing test items such that items can then be selected and compiled into a test.

Item characteristic curve (ICC)
Also known as item response curves, ICCs are used to describe the relationship between the ability, defined on an ability scale, and each item in an assessment. ICCs plot the probability of assessment-takers correctly answering an item based on their ability. As ability increases, the probability of correctly answering the item also increases. The shape of an ICC plot determines both the difficulty and the discriminatory properties of an item.

Item-level data
Outcome and performance data for individual test items.

Item response theory (IRT)
A statistical theory of testing. Also known as latent trait theory, IRT is a theory of testing based on the relationship between individuals’ performances on a test item and the assessment-takers’ levels of performance on an overall measure of the ability that item was designed to measure. IRT statistical methods aim to establish a link between the properties of items on an assessment or instrument, assessment-takers responding to the items, and the underlying trait being measured. IRT is based on the idea that the probability of a correct response to an item is a mathematical function of person and item parameters. Similar to classical test theory (CTT), IRT is used in the design, scoring and analysis of assessments. Compared to CTT, IRT brings greater flexibility and provides more sophisticated information. For example, it can provide more precise predictions for whether students of differing ability levels will answer a particular item correctly or not. IRT has many applications in educational assessment, including but not limited to providing a framework for the evaluation of item performance, maintaining banks of items (e.g. by analysing whether items within a bank have become overexposed), and using item banks to create assessments of equivalent demand.
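
As a concrete (if simplified) illustration of both this entry and the ICC entry above, the widely used two-parameter logistic model expresses the probability of a correct response as a function of ability and item parameters; tabulating it over a range of abilities traces out the item’s characteristic curve. The parameter values here are invented:

```python
import math

def p_correct(theta, a, b):
    """Two-parameter logistic IRT model: probability of a correct response
    for ability theta, item discrimination a and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Evaluating across abilities traces the item characteristic curve (ICC):
for theta in (-2, -1, 0, 1, 2):
    print(theta, round(p_correct(theta, a=1.2, b=0.5), 3))
```

The Rasch model described later in the glossary is the special case in which the discrimination parameter is fixed (a = 1), leaving difficulty as the only item parameter.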
Learning objective
The knowledge or understanding that the learner is expected to acquire.

Levels-based mark scheme
A mark scheme where the marks are divided into bands or levels with general marking criteria supplied for each band. Also known as a banded mark scheme.

Low-stakes
Where the outcome of an assessment has little or no impact on the individual or group being assessed.

Maladministration
Where the integrity of an assessment is threatened by the incorrect application of the assessment regulations and administration requirements by those delivering or supervising the assessment.

Malpractice
Where the integrity of an assessment is threatened by the actions of the candidate.
Marking
A process, undertaken by markers or examiners, which converts candidate responses to marks.

Marking scheme
A document used primarily by markers and assessors in the marking process that indicates the number of marks to be awarded for specific items, the approaches that can be used by candidates to attract marks, and the acceptable answers to them.

Measurement error
The difference between a measured value of a quantity and its true value. Measurement error may come from a variety of different sources. Human measurement error can occur in all stages of assessment design, including during marking. Systematic errors may occur when there are issues in the setting or administration of the assessment. Error can also be random (i.e. from an unknown source).

Moderation
The process of checking that assessment standards have been applied correctly and consistently between assessors, between assessment centres and over time.

Modular assessment
Where the overall assessment outcome is aggregated from separate assessment modules which take place at intervals throughout the course of study.

Multiple choice
An objective question where the candidate selects an answer from a list of available options. The correct option is called the ‘key’ and the incorrect options are called the ‘distractors’.

Norm referencing
An approach to grading which ensures that the same percentage of candidates achieve each grade as in previous years; the outcome for an individual candidate depends on their performance relative to the other candidates rather than relative to a standard.
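
A minimal sketch of the idea (grade shares and marks invented): fix the proportion of candidates receiving each grade, then read the cut-scores off the cohort’s mark distribution:

```python
from statistics import quantiles

marks = [23, 31, 38, 44, 47, 52, 55, 59, 63, 71]  # invented cohort marks

# Quartile boundaries: e.g. the top 25% of candidates get grade A,
# the next 25% grade B, and so on, whatever the absolute standard.
q1, q2, q3 = quantiles(marks, n=4)
print(f"A from {q3}, B from {q2}, C from {q1}")
```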
Objective response item
A question with a clearly defined correct answer, not subjective.

Office of Qualifications and Examinations Regulation (Ofqual)
A non-ministerial government department based in England. It is responsible for the maintenance of standards and instilling public confidence in qualifications and assessment. It is the authority that accredits and regulates assessment organisations offering high-stakes qualifications.
On demand
A test which is available to be taken at any time, not part of a fixed timetable.

Option / Optional route
Where there is more than one set of admissible components that could make up the total required assessment, each valid set is referred to as an optional route or option.

Points-based mark scheme
A mark scheme where marks are awarded for each instance of a credit-worthy point in a candidate’s response.

Pretesting
A pretest of an assessment is applied when trialling examinations with learners before they are used in live examinations or assessments. Pretests are used to ensure the accuracy and fairness of assessment materials and to check for appropriateness of test content.

Prior attainment
The outcomes of previous assessment.

Psychometric testing
The measurement of psychological attributes such as ability or aptitude.

Qualification
A qualification is a formal recognition of learning/achievement. Qualifications are designed and certificated by awarding bodies (also known as exam boards) and are usually part of a qualification framework, made up of different levels with qualifications at the same level recognised as equivalent. A qualification will often include more than one assessment and may include different types of assessment.

Rank order
A list of all candidates arranged in order of their test outcomes.

Rasch
A statistical theory of testing developed to improve the precision with which the properties of an assessment can be evaluated and analysed. It is a specific case of Item Response Theory. Compare with Classical Test Theory.

Reliability
This concept refers to the extent to which the results of an assessment are consistent and replicable. So if an assessment is highly reliable it means that if a student took a different version of the test, or if a different examiner marked the test, they would get exactly the same result.

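The entry doesn’t name a statistic, but one common way to quantify reliability is to correlate the same candidates’ scores on two versions of a test (a parallel-forms estimate); a sketch with invented scores:

```python
from statistics import mean, pstdev

def pearson(x, y):
    """Pearson correlation between paired score lists."""
    mx, my = mean(x), mean(y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (pstdev(x) * pstdev(y))

form_a = [48, 22, 51, 40, 18, 45]  # candidates' marks on version A
form_b = [46, 25, 50, 38, 20, 47]  # the same candidates on version B
print(round(pearson(form_a, form_b), 3))  # close to 1 => highly reliable
```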
Reporting scale
The range of marks, grades or statements on which the outcome of an assessment may be reported.

Rubric
A set of instructions. Assessment rubrics are tools used to mark students’ work against a set of criteria and standards. They may also be used to give students guidance as to what the assessment requires of them.

Sample or specimen assessment materials
Examples of questions or question papers or other assessment materials, including their mark schemes, made available before the actual assessment to illustrate to teachers or candidates the content and format of the actual assessment materials.

Scheme of assessment
The different combination(s) of components that make up the assessment.

Script scrutiny
A method of setting standards by comparing examples of candidates’ complete answer scripts.
Semi-constrained mark scheme
Marks are awarded for answers which match any of those listed, or have the same meaning.

Setter
A subject and assessment specialist with responsibility for writing an assessment task, question paper or individual test items.

Special considerations
Adjustments made to a candidate’s mark to make allowances for adverse circumstances which affected the candidate’s performance.

Specification grid
A table showing how the test meets the criteria specified for that assessment, e.g. the percentage of marks available for each assessment objective.

Stakeholder
A person with an interest in something, or who is affected by something. Key stakeholders in an assessment might be students, teachers, employers, or parents, among others.

Standards referencing
An approach to grading in which the outcome awarded depends on the candidate’s work meeting a certain standard, often exemplified by archived scripts, and is independent of the achievement of other candidates.

Summative assessment
Often characterised as assessment of learning, summative assessments are carried out by teachers after instruction has been completed, to sum up a student’s learning. An example of this would be an end-of-topic test. Summative assessment is often viewed in contrast to formative assessment.

Syllabus
Assessment and course documentation which includes all the information necessary to prepare for an assessment, including the learning objectives, the areas of knowledge and skills contained within the course, the assessment objectives and the scheme of the assessment. Often also referred to as a specification.

Task-specific mark scheme
A mark scheme that changes with the task set.

Teacher assessment
A test which is marked by the teacher; coursework is often teacher-assessed.

Technical and vocational qualifications
Technical and vocational qualifications focus on preparing students for a particular profession. They aim to teach the skills and knowledge required for a particular industry and often contain practical and workplace-based elements. An example is a vocational qualification preparing students to become electricians.

Test form
A single instance of a test which is repeated with differing content at different times or to different candidates.

Test specification
A document specifying the details of the assessment, usually including information about the content, the assessment objectives, the components and test items.

Validity
Traditional definitions of assessment validity describe it as the extent to which an assessment measures what it is intended to measure. However, in reality the same assessment could be seen as valid in one context and not valid in others (for example an intelligence test designed for adults would not be valid for children). For this reason, many people argue that it is not a test itself that is valid or not valid, but inferences that might be drawn on the basis of test results.

So rather than asking, for example, whether a given intelligence test is valid or not, we should really be asking whether there is evidence to support inferences that we might want to draw from the test (e.g. that someone who has scored highly in the test is more intelligent than someone who has scored lower in the test). Another way to think about it is that validity refers to the extent to which a candidate’s performance on a test results in decisions about that candidate that are ‘correct’ (or more accurately, defensible).

Washback
This concept refers to the positive and
negative effects that an educational
assessment may have upon those taking, and
preparing students to take, an assessment.
Positive effects could include students working
harder in preparation for the test or teachers
focusing more on something because it will be
assessed. Negative effects could be focusing
too heavily on preparation for the exam rather
than deep authentic learning, or narrowing
the curriculum to only teach things which will
be assessed at the expense of other valuable
learning or experience.

Weighting
A measure of how much any particular
component or assessment objective
contributes to the final assessment
outcome.
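
For example (component structure and figures invented), a weighted final outcome can be computed by scaling each component’s raw mark by its weight:

```python
# Invented scheme: the written paper carries 60% of the outcome and the
# coursework 40%, regardless of their raw maximum marks.
components = [
    {"raw": 54, "max": 80, "weight": 0.6},  # written paper
    {"raw": 17, "max": 20, "weight": 0.4},  # coursework
]

final_pct = 100 * sum(c["raw"] / c["max"] * c["weight"] for c in components)
print(round(final_pct, 1))  # 74.5
```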

The Assessment Network is a global
leader in professional development,
covering assessment principles,
practices and insights.
We equip organisations, teams and individuals with powerful
knowledge, recognised skills and inspiring network opportunities.
Part of Cambridge University Press & Assessment, we draw
on evidence from Europe’s largest assessment research division
and the wider Cambridge educational community.

Empowering practitioners in every profession
Our membership programme and accredited learning are open to all in
assessment, from awarding organisations and professional associations
to schools, training providers, colleges and higher education institutions.
Alongside online and face-to-face courses, we regularly deliver workforce
training, from continuing professional development (CPD) programmes to
bespoke solutions.

Trusted worldwide
Training 1,000 organisations in 100 countries,
supported by world-class research from the
University of Cambridge

cambridgeassessment.org.uk/events

Transform your
assessment practice
Professional development for impactful assessment.

Find out more at:
cambridgeassessment.org.uk

Email our experts at:
[email protected]

Join the conversation online:
Cambridge Assessment Network
AssessNetwork

© Cambridge University Press & Assessment 2024
