The Assessment Glossary
Assessment 101
Glossary
101 words and phrases for
assessment professionals
The world of assessment contains a lot of terminology and
phrasing that can present a barrier to fully engaging with
the topic.
Glossary
The list below gives an indication of how these words are intended to be understood within the
context of educational assessment. It is not intended to provide formal definitions of these terms.
A Level
Advanced Level. An academic qualification taken by secondary school students (usually at age 17 or 18) in England, Wales and Northern Ireland. A Levels are offered in a wide range of subjects, such as mathematics, English literature, media studies and dance. These qualifications are normally sat at the end of two years of study and are often used as selection criteria by universities and employers.

Aggregation
The summing of marks from a number of different components.

Analytic mark scheme
A mark scheme in which separate marks are awarded for separate aspects or assessment objectives. The final mark is typically the aggregate of the separate marks.
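To make the Aggregation and Analytic mark scheme entries above concrete, here is a minimal sketch in Python; the aspect names, component names and marks are invented for illustration and are not taken from any real mark scheme.

    # Analytic mark scheme: separate marks for separate aspects of one response,
    # with the item mark formed by summing them.
    analytic_marks = {"content": 4, "structure": 2, "accuracy": 3}    # hypothetical aspects
    item_mark = sum(analytic_marks.values())                          # 9

    # Aggregation: summing marks from a number of different components.
    component_marks = {"Paper 1": 42, "Paper 2": 51, "Coursework": 27}
    aggregate_mark = sum(component_marks.values())                    # 120
    print(item_mark, aggregate_mark)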
Assessment claim
A statement that something is the case in relation to an assessment. In educational assessment, validation is the collection of empirical evidence to establish whether claims to validity are warranted.
Cohort
Either the entire group of people taking an assessment, or a sub-group with a shared characteristic such as year of birth or country of residence.

Command words
Words of instruction used within an assessment to tell the candidate what they need to do, e.g. State, Describe, Explain, Calculate…

Comparability
The degree to which the results from one assessment can be viewed as equivalent to results from a different assessment. This might include comparing equivalent grades between different examination boards, or results from one year with a previous year. Comparability is normally viewed as part of validity.

Competency
The ability to perform a specific skill, role or function to a predefined standard or target.

Constrained mark scheme
An objective mark scheme, in which the only acceptable answers are those provided in the mark scheme.

Construct
This term refers to unobservable characteristics that many types of test are designed to measure. For example, an intelligence test is designed to measure the construct of intelligence. Personality tests are designed to assess constructs such as introversion or psychopathy. They are referred to as ‘constructs’ because they are not directly observable but are ‘constructed’. That is to say, you can’t directly observe introversion, but you can observe someone regularly behaving in ways that taken together signify introversion. While some writers draw a distinction between observable attributes (proficiency in spoken English or mental arithmetic) and unobservable theoretical constructs (such as introversion), often the term construct is used simply to describe whatever it is that an assessment is intended to measure.

Construct-irrelevant variance
When candidates’ marks are affected by factors other than the knowledge, understanding and skills that the assessment is intended to assess.
Coursework
Classroom assignments undertaken by students as prescribed in the syllabus. These are normally marked by the student’s teacher according to criteria set by the assessment provider. This work is standardised within the centre and then standardised by the assessment provider.
Criterion referencing
The judgements about the work produced by learners are based on standards and criteria that are defined by the curriculum or other documentation. See also norm referencing.

Cut-scores
The boundary marks on an assessment’s mark scale: a candidate must reach the relevant cut-score to be awarded a particular grade or outcome (see the sketch after this group of entries).

Discrimination
The extent to which an assessment separates stronger candidates from weaker ones. In an assessment that discriminates well, the strongest candidates will gain a high mark (with the strongest candidate possibly achieving full marks) and the weakest candidates will achieve a very low mark. For the assessment to be valid, this discrimination should be solely on the basis of the constructs and content that the test is designed to assess.
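The Cut-scores entry above describes boundary marks used when converting marks into grades. Here is a minimal sketch of that idea in Python; the boundary marks and grade labels are invented for illustration and are not real grade boundaries.

    # Hypothetical cut-scores: the minimum mark needed for each grade.
    cut_scores = {"A": 80, "B": 65, "C": 50}

    def grade(mark: int) -> str:
        # Return the highest grade whose cut-score the mark reaches, else "U" (ungraded).
        for label, boundary in sorted(cut_scores.items(), key=lambda kv: -kv[1]):
            if mark >= boundary:
                return label
        return "U"

    print(grade(72))  # prints "B"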
Formative assessment
Assessment used to inform teaching and learning, for example a teacher questioning students during a lesson to see if they have understood the topic currently being taught, and, if not, adapting and redelivering the material. Formative assessment is often viewed in contrast to summative assessment.

GCSE
General Certificate of Secondary Education. An academic qualification taken by secondary school students (usually at age 15 or 16) in England, Wales and Northern Ireland. GCSEs are offered in a wide range of subjects, from mathematics and English (which are normally compulsory for students) to psychology and Latin. These qualifications are normally sat at the end of two years of study and are often used as selection criteria for post-16 education.

Grade descriptors
An overall statement about the standards that need to be reached in a subject discipline or qualification to achieve a particular grade outcome. They aim to define the requirements of a qualification and the main learning outcomes, and can inform the development of assessment objectives.

Grade inflation
The real or claimed increase of grade outcomes of consecutive cohorts. Grade inflation is considered an issue for standards if there is evidence to suggest that students of the same ability are receiving different grade outcomes on the basis of their cohort.

Grading
The process of converting assessment marks into grades.

High-stakes
Where the outcome of an assessment has a high impact on the individual or group being assessed.
Internal assessment
Assessment that takes place and is marked within the candidates’ own centre of learning.

Invigilation
Supervising candidates taking examinations to ensure they follow all necessary rules.

Item
On an exam, an item is a question, or part question, that is not broken down further to be marked. For example, if an exam paper has Question 1, Question 2a, Question 2bi and Question 2bii, these are all items. So Question 1 is an item and Question 2 is made up of three different items (2a, 2bi and 2bii).

Item banking
A system for storing test items such that items can then be selected and compiled into a test.

Item characteristic curve (ICC)
Also known as item response curves, ICCs are used to describe the relationship between the ability, defined on an ability scale, and each item in an assessment. ICCs plot the probability of assessment-takers correctly answering an item based on their ability. As ability increases, the probability of correctly answering the item also increases. The shape of an ICC plot determines both the difficulty and the discriminatory properties of an item.

Item-level data
Outcome and performance data for individual test items.

Item response theory (IRT)
A statistical theory of testing. Also known as latent trait theory, IRT is a theory of testing based on the relationship between individuals’ performances on a test item and the assessment-takers’ levels of performance on an overall measure of the ability that item was designed to measure. IRT statistical methods aim to establish a link between the properties of items on an assessment or instrument, assessment-takers responding to the items, and the underlying trait being measured. IRT is based on the idea that the probability of a correct response to an item is a mathematical function of person and item parameters (see the sketch after this group of entries). Similar to classical test theory (CTT), IRT is used in the design, scoring and analysis of assessments. Compared to CTT, IRT brings greater flexibility and provides more sophisticated information. For example, it can provide more precise predictions for whether students of differing ability levels will answer a particular item correctly or not. IRT has many applications in educational assessment, including but not limited to providing a framework for the evaluation of item performance, maintaining banks of items (e.g. by analysing whether items within a bank have become overexposed), and using item banks to create assessments of equivalent demand.

Learning objective
The knowledge or understanding that the learner is expected to acquire.

Levels-based mark scheme
A mark scheme where the marks are divided into bands or levels with general marking criteria supplied for each band. Also known as a banded mark scheme.

Low-stakes
Where the outcome of an assessment has little or no impact on the individual or group being assessed.

Maladministration
Where the integrity of an assessment is threatened by the incorrect application of the assessment regulations and administration requirements by those delivering or supervising the assessment.

Malpractice
Where the integrity of an assessment is threatened by the actions of the candidate.
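The Item characteristic curve and Item response theory entries above describe the probability of a correct response as a mathematical function of person and item parameters. The sketch below uses the two-parameter logistic model, one common IRT formulation, with invented parameter values; it illustrates the general idea rather than anything specified in this glossary.

    import math

    def p_correct(ability: float, difficulty: float, discrimination: float = 1.0) -> float:
        # Two-parameter logistic IRT model: probability that a person of the given
        # ability answers an item of the given difficulty/discrimination correctly.
        return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))

    # Evaluating this across a range of abilities traces out the item characteristic curve.
    for theta in (-2, -1, 0, 1, 2):    # illustrative ability values
        print(theta, round(p_correct(theta, difficulty=0.5, discrimination=1.2), 2))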
Marking
A process, undertaken by markers or examiners, which converts candidate responses to marks.

Marking scheme
A document used primarily by markers and assessors in the marking process that indicates the number of marks to be awarded for specific items, the approaches that can be used by candidates to attract marks, and the acceptable answers to them.

Measurement error
The difference between a measured value of a quantity and its true value. Measurement error may come from a variety of different sources. Human measurement error can occur in all stages of assessment design, including during marking. Systematic errors may occur when there are issues in the setting or administration of the assessment. Error can also be random (i.e. from an unknown source).

Moderation
The process of checking that assessment standards have been applied correctly and consistently between assessors, between assessment centres and over time.

Modular assessment
Where the overall assessment outcome is aggregated from separate assessment modules which take place at intervals throughout the course of study.

Multiple choice
An objective question where the candidate selects an answer from a list of available options. The correct options are called the 'key' and the incorrect options are called the 'distractors'.

Norm referencing
An approach to grading which ensures that the same percentage of candidates achieve each grade as in previous years; the outcome for an individual candidate depends on their performance relative to the other candidates rather than relative to a standard (see the sketch after this group of entries).

Objective response item
A question with a clearly defined correct answer, not subjective.
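The Norm referencing entry above describes setting grade boundaries so that fixed proportions of the cohort achieve each grade. A minimal sketch of that idea in Python; the cohort marks and target proportions are invented for illustration.

    # Hypothetical cohort marks and target cumulative proportions for each grade.
    marks = [34, 41, 47, 52, 55, 58, 61, 63, 67, 72, 75, 78, 81, 86, 90, 94]
    targets = {"A": 0.25, "B": 0.55, "C": 0.85}   # top 25% get at least an A, top 55% at least a B, ...

    ranked = sorted(marks, reverse=True)
    boundaries = {}
    for label, proportion in targets.items():
        # The mark achieved by the candidate at the target position becomes the boundary.
        index = max(0, round(proportion * len(ranked)) - 1)
        boundaries[label] = ranked[index]

    print(boundaries)  # {'A': 81, 'B': 63, 'C': 47}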
On demand
A test which is available to be taken at any time, not part of a fixed timetable.

Option / Optional route
Where there is more than one set of admissible components that could make up the total required assessment, each valid set is referred to as an optional route or option.

Points-based mark scheme
A mark scheme where marks are awarded for each instance of a credit-worthy point in a candidate’s response.

Pretesting
A pretest involves trialling assessment materials with learners before they are used in live examinations or assessments. Pretests are used to ensure the accuracy and fairness of assessment materials and to check the appropriateness of test content.

Prior attainment
The outcomes of previous assessment.

Psychometric testing
The measurement of psychological attributes such as ability or aptitude.

Qualification
A qualification is a formal recognition of learning/achievement. Qualifications are designed and certificated by awarding bodies (also known as exam boards) and are usually part of a qualification framework, made up of different levels with qualifications at the same level recognised as equivalent. A qualification will often include more than one assessment and may include different types of assessment.

Rasch
A statistical theory of testing developed to improve the precision with which the properties of an assessment can be evaluated and analysed. It is a specific case of Item Response Theory; compare with Classical Test Theory (see the formula after this group of entries).

Reliability
This concept refers to the extent to which the results of an assessment are consistent and replicable. So if an assessment is highly reliable it means that if a student took a different version of the test, or if a different examiner marked the test, they would get (almost) exactly the same result.

Reporting scale
The range of marks, grades or statements on which the outcome of an assessment may be reported.

Rubric
A set of instructions. Assessment rubrics are tools used to mark students’ work against a set of criteria and standards. They may also be used to give students guidance as to what the assessment requires of them.

Sample or specimen assessment materials
Examples of questions or question papers or other assessment materials, including their mark schemes, made available before the actual assessment to illustrate to teachers or candidates the content and format of the actual assessment materials.

Scheme of assessment
The different combination(s) of components that make up the assessment.
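To make the Rasch entry above concrete: in the standard Rasch formulation (stated here as general background, not taken from this glossary), the probability that person p answers item i correctly depends only on the person's ability θ_p and the item's difficulty b_i:

    P(X_{pi} = 1) = \frac{\exp(\theta_p - b_i)}{1 + \exp(\theta_p - b_i)}

This is the special case of the two-parameter model sketched earlier in which every item's discrimination is fixed at 1.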
Semi-constrained mark scheme
Marks are awarded for answers which match any of those listed, or have the same meaning.

Syllabus
A document setting out the content of a qualification, its assessment objectives and the scheme of the assessment. Often also referred to as a specification.
Washback
This concept refers to the positive and negative effects that an educational assessment may have upon those taking an assessment and those preparing students to take it. Positive effects could include students working harder in preparation for the test or teachers focusing more on something because it will be assessed. Negative effects could be focusing too heavily on preparation for the exam rather than deep authentic learning, or narrowing the curriculum to only teach things which will be assessed at the expense of other valuable learning or experience.
Weighting
A measure of how much any particular component or assessment objective contributes to the final assessment outcome.
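A minimal worked sketch of weighting in Python; the component names, marks, maximums and weights are invented for illustration.

    # Hypothetical components: raw mark, maximum mark, and weight in the final outcome.
    components = {
        "Written paper": (54, 80, 0.6),   # weighted at 60% of the final outcome
        "Practical":     (27, 40, 0.4),   # weighted at 40% of the final outcome
    }

    # Each component's percentage score is scaled by its weight, then the results are summed.
    final_percentage = sum(mark / maximum * weight * 100
                           for mark, maximum, weight in components.values())
    print(round(final_percentage, 1))  # 67.5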
The Assessment Network is a global
leader in professional development,
covering assessment principles,
practices and insights.
We equip organisations, teams and individuals with powerful
knowledge, recognised skills and inspiring network opportunities.
Part of Cambridge University Press & Assessment, we draw
on evidence from Europe’s largest assessment research division
and the wider Cambridge educational community.
Trusted worldwide
Training 1,000 organisations in 100 countries,
supported by world-class research from the
University of Cambridge
cambridgeassessment.org.uk/events
Transform your
assessment practice
Professional development for impactful assessment.