Studies in Educational Evaluation 70 (2021) 101028

journal homepage: www.elsevier.com/locate/stueduc

Conceptualizing and exploring the quality of teaching using generic frameworks: A way forward

A. Panayiotou a, B. Herbert b, P. Sammons c, L. Kyriakides d, *
a Department of Education, University of Cyprus, Nicosia, Cyprus
b DIPF | Leibniz Institute for Research and Information in Education, Rostocker Str. 6, 60323 Frankfurt am Main, Germany
c University of Oxford, Oxford, UK
d Department of Education, University of Cyprus, P. O. Box 20537, 1678, Nicosia, Cyprus

ARTICLE INFO

Keywords:
Teacher effectiveness research
Educational effectiveness research
Quality of teaching

ABSTRACT

Empirical studies and meta-analyses conducted during the past 35 years led to the development of a number of theoretical frameworks of teacher effectiveness. In this paper, we aim to summarize the main characteristics of three dominant frameworks within the field of educational effectiveness and discuss both their conceptual differences and their similarities, as well as the different observational instruments used by each one to capture instructional quality. Specifically, the three frameworks are: a) the dynamic model of educational effectiveness; b) the International System for Teacher Observation and Feedback (ISTOF); and c) the Three Basic Dimensions of Teaching Quality (TBD). These frameworks were also used to analyze three video lessons in order to compare the quality of teaching through the lens of each respective framework. Based on the results of the three lesson analyses, possibilities for combining different generic frameworks of effective teaching to provide a more complete view of teaching quality are discussed.

1. Introduction

Educational quality has been a topic of interest for researchers in the field of educational effectiveness for more than two decades (Chapman, Muijs, Reynolds, Sammons, & Teddlie, 2016; Creemers, Kyriakides, & Antoniou, 2013; Sammons, 1999). A large number of studies and meta-analyses have shed some light on the factors that contribute to maximizing student gains from education. The direct and indirect effects of factors at the different levels of education (i.e., system, school, classroom and student) on student learning outcomes have been examined through research that seeks to predict variation in student outcomes, typically using multilevel statistical models. The long history of research into effective teaching has shown, however, that the classroom level has a more immediate and direct effect on student achievement than other levels, such as the school level (e.g., Caldwell & Spinks, 1993; Creemers & Kyriakides, 2008; Muijs & Reynolds, 2000), which was shown to have mostly indirect influences on student performance (Kyriakides, Creemers, Panayiotou, & Charalambous, 2021; Sammons, Davis, Day, & Gu, 2014). This is particularly evident in studies of student progress conducted during a single academic year. The efforts of researchers have therefore turned to the classroom level with the creation of different frameworks and models aiming to provide a description of those aspects of teaching considered important for promoting student outcomes, whether cognitive, meta-cognitive, affective or psychomotor.
Research during the past 40 years has identified a number of teacher factors that are positively related to student outcomes (e.g., Brophy & Good, 1986; Creemers, 1994; Doyle, 1986; Galton, 1987; Muijs & Reynolds, 2000; Muijs et al., 2014). These factors do not stem from only one approach to learning and teaching, such as the so-called direct or active teaching approach (Rosenshine & Stevens, 1986; Brophy & Good, 1986), but reflect a much wider spectrum of teacher behaviors that incorporates characteristics of different approaches. Such factors include management of the classroom, expectations of student performance, teacher objectives, structuring of lessons, questioning skills, and immediate exercise after presentation, as well as evaluation, feedback, and corrective instruction (Scheerens, 2013). Moreover, meta-analyses that have provided a synthesis of findings in the field of teacher effectiveness research (TER) have also identified specific teacher factors, such as reinforcement of content and feedback, that influence student outcomes (Kyriakides, Christoforou, & Charalambous, 2013; Seidel & Shavelson, 2007).
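
As a minimal illustration of the multilevel statistical models mentioned above, a two-level random-intercept model for students nested in classrooms can be sketched as follows. The notation is an assumption introduced here for illustration only and is not taken from the studies cited: $y_{ij}$ is the outcome of student $i$ in classroom $j$, $x_{ij}$ is a student-level predictor (e.g., prior attainment), and $w_j$ is a classroom-level predictor (e.g., a rating of teaching quality).

```latex
\begin{align}
y_{ij} &= \beta_0 + \beta_1 x_{ij} + \beta_2 w_j + u_j + e_{ij}, \\
u_j &\sim N(0, \sigma_u^2), \qquad e_{ij} \sim N(0, \sigma_e^2), \\
\rho &= \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}.
\end{align}
```

Under these assumptions, $\rho$ (the intraclass correlation) is the share of the unexplained outcome variance that lies between classrooms, and $\beta_2$ expresses the association between the classroom-level factor and student outcomes after controlling for $x_{ij}$. Models used in actual effectiveness studies typically include further levels (school, system) and additional predictors.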

* Corresponding author.
E-mail addresses: [email protected] (A. Panayiotou), [email protected] (B. Herbert), [email protected] (P. Sammons),
[email protected] (L. Kyriakides).

https://doi.org/10.1016/j.stueduc.2021.101028
Received 1 October 2020; Received in revised form 1 May 2021; Accepted 4 May 2021
Available online 13 May 2021
0191-491X/© 2021 Elsevier Ltd. All rights reserved.

The various frameworks and models developed in the field of educational effectiveness have incorporated the results of research in the field of TER, as well as the results of the dominant meta-analyses conducted in the field, while placing emphasis on different aspects of teaching. In this paper, we aim to summarize the main characteristics of three dominant frameworks within the field of educational effectiveness and discuss both their conceptual differences and their similarities, as well as the different observational instruments used by each one to capture features of instructional quality. Drawing on the analysis of the same three elementary-school mathematics lessons by using the three different frameworks, we then discuss the merits and the limitations of drawing on multiple frameworks and models of teaching quality as opposed to using single approaches. We also discuss the extent to which research on teaching quality may benefit from considering frameworks and models that build on and synthesize existing conceptualizations of teaching quality as compared to employing single frameworks and models. In addition, the possibilities for establishing better collaboration among scholars for untangling issues of teaching quality are explored. The main features of the three frameworks studied in this paper are presented and elaborated on in the following section.

2. Description of the three frameworks used for defining quality of teaching

2.1. The dynamic model of educational effectiveness

Based on the main findings of TER, the dynamic model (Creemers & Kyriakides, 2008) refers to eight factors that describe teachers’ instructional role and have been consistently shown to be associated with student outcomes: orientation, structuring, questioning, teaching modelling, application, management of time, teacher role in making classroom a learning environment, and classroom assessment. The model includes factors/teaching skills associated with both direct teaching and mastery learning (Joyce, Weil, & Calhoun, 2000), such as structuring and questioning, and with theories of teaching associated with constructivism (Brekelmans, Sleegers, & Fraser, 2000), including factors such as orientation and teaching modelling. Teachers’ ability to promote collaboration among students is also considered. Therefore, an integrated approach to quality of teaching is adopted. The model also argues that factors operating at the same level (i.e., the classroom) are related to each other. This assumption has been tested by seven studies conducted in different countries which reveal that the teaching factors can be classified into stages of effective teaching, structured in a developmental order which can be used for teacher professional development purposes (Creemers et al., 2013; Kyriakides, Creemers, & Antoniou, 2009).
The model also assumes that each factor can be defined and measured using the following five dimensions: frequency, focus, stage, quality, and differentiation. These dimensions help researchers better describe the functioning of a factor. Most effectiveness studies have examined how frequently an activity related to a factor occurs; therefore, these studies took into account only the quantitative characteristics of a factor. Frequency is in line with this need as it comprises a quantitative way to measure the functioning of each factor. However, only examining the number of activities related to a factor is not sufficient to determine the quality of teaching offered (Kyriakides & Creemers, 2008). For example, providing students with opportunities to apply new knowledge was found to have a positive impact on their outcomes. However, spending too much teaching time on application activities may not allow sufficient time for teaching new content, which in turn may have a negative effect on student outcomes. Therefore, when measuring the functioning of a factor one should also take into consideration its qualitative characteristics. The other four dimensions included in the dynamic model examine the qualitative characteristics of the functioning of a factor (for a description of each dimension see Creemers & Kyriakides, 2008). The dimensions are not only important from a measurement perspective, but also, and even more, from a theoretical point of view (Kyriakides et al., 2021). For example, the focus dimension is related to the synergy theory. In particular, the focus dimension takes into account that an activity may be expected to achieve single or multiple purposes. The importance of measuring this aspect of the focus dimension can be attributed to research findings which reveal that if all the activities are expected to achieve a single purpose, then the chances of achieving the purpose are high, but the effect of the factor might be small because other purposes are not achieved and/or synergy may not exist since the activities are isolated (Schoenfeld, 1998). On the other hand, if all the activities are expected to achieve multiple purposes, there is a danger that specific purposes are not addressed in such a way that they can be implemented successfully (Pellegrino, 2004). The dynamic model explicitly refers to the measurement of each factor and uses three observation instruments and a student questionnaire for their measurement (Kyriakides et al., 2014). Namely, two low-inference (LIO1 and LIO2) and one high-inference observational instruments are used since each type of instrument has advantages as well as disadvantages. Using all three instruments together provides more information on the lesson observed regarding the teaching factors of the dynamic model. In particular, these instruments were designed to collect data concerned with different aspects of the eight teaching factors of the dynamic model, and previous studies have provided empirical support for their construct validity (for more information about the instruments see Creemers & Kyriakides, 2012).

2.2. The International System for Teacher Observation and Feedback

The International System for Teacher Observation and Feedback (ISTOF) is a unique instrument in the educational effectiveness field that was derived from an international cross-country collaboration to try to conceptualize, define, and measure possible generic features of classroom practice relevant to teacher effectiveness. A team of experts from more than 20 countries was involved in its creation using an iterative ‘Delphi’ process via contacts with expert opinion in various countries to facilitate cross-cultural relevance and construct validity. The instrument development was also informed by a review of TER studies. In particular, an iterative, multiple-step, Internet-based, ‘modified’ Delphi technique (Teddlie, Creemers, Kyriakides, Muijs, & Yu, 2006) was used to develop the ISTOF model and instrument. It generated a number of purposively inter-linked components, indicators and specific items for constructing the final instrument. Teddlie et al. (2006) report that country experts identified teaching factors deemed important for student learning and attainment, generating a total of 103 components. Qualitative content analyses of responses by two independent teams of researchers using the ATLAS program facilitated a form of constant comparative analysis that combined the coding of qualitative data submitted by respondents with theory generation. A conceptual map of components of effective teaching that are considered highly important by experts across 17 countries was thus produced. This conceptual map refers to 11 components of effective teaching that can be grouped into five overarching factors (see Kyriakides, Creemers, Teddlie, & Muijs, 2010). A summary of the 11 components is provided below as described by Muijs et al. (2018).

1 assessment and evaluation (the extent to which effective feedback is provided and assessment is aligned to goals and objectives),
2 clarity of instruction (the extent to which lessons are well structured and purposeful and teacher communication is of high quality),
3 classroom climate (the extent to which the teacher communicates high expectations, communicates with and involves and values all students),
4 classroom management (the extent to which the teacher maximizes learning time and deals with disruptions),
5 differentiation and inclusion (the extent to which all students are involved in the lesson and the teacher takes student differences into account),
6 instructional skills (the extent to which the teacher can engage students, shows good questioning skills, and uses varied methods and strategies),
7 planning of single lessons (the extent to which the teacher has effectively planned the observed lesson),
8 long-term planning (the extent to which the teacher can plan a sequence of lessons),
9 teacher knowledge (subject, pedagogy and pedagogical content knowledge),
10 teacher professionalism and reflectivity (the extent to which the teacher can reflect on her/his own practice and contribute to the school’s learning community and the teaching profession), and
11 promoting active learning and developing metacognitive skills (the extent to which the teacher develops pupils’ metacognitive skills, provides opportunities for active learning, and fosters critical thinking skills).

Four of the 11 main overarching components (planning of single lessons, long-term planning, teacher knowledge, and teacher professionalism and reflectivity) were deemed unobservable in typical lessons, so the final ISTOF schedule was based around seven observable components.
Each component is based on between two and four indicators. Each indicator consists of two items rated on a five-point Likert scale with a ‘not applicable’ category available for most items. The instrument was translated from English into the relevant language of participating countries, then independently back-translated. Following this, an expert committee met to review the items and indicators to check for semantic and conceptual equivalence.
As noted above, the core international team drew on TER, building in particular on the comprehensive model of educational effectiveness (Creemers, 1994). Three ISTOF components (classroom climate, classroom management, clarity of instruction) can be related to established TER models derived from a longstanding tradition of research on the relationship between teacher behaviors and pupil attainment, particularly in basic skills, which has generally supported direct or explicit instruction (Kyriakides, Archambault, & Janosz, 2013; Kyriakides, Christoforou et al., 2013; Muijs et al., 2014), while two other components (promoting active learning and metacognition, and differentiation) derive from constructivist approaches to practice that seek to promote self-regulated learning (Tsai, 2001). Instructional skills (linked to questioning, engaging students and varied instruction) is a component identified in both conceptualizations of teaching (direct instruction and constructivist), highlighting the role of active learning and engagement and also the importance of teachers’ questioning skills. Assessment is a component linked to various theories of learning. Muijs et al. (2018) draw attention to the value of the qualitative grounded theory approach used to create ISTOF, which does not support any one theoretical or conceptual model.
Also, as Muijs et al. (2018) argue, ISTOF’s main aim was to develop a model to identify and measure teacher effectiveness factors hypothesized to generalize across different subjects and age groups in school. In order to support improvement, ISTOF was also intended to facilitate the processes of providing formative feedback on teaching (based on ratings obtained for constructs measured in the observation schedule). Thus, ISTOF was intended to be both a research instrument and a tool for engagement with practitioners to support improvement in the quality of teaching via feedback to support professional development.

2.3. Three Basic Dimensions of teaching quality

The Three Basic Dimensions of Teaching Quality (TBD) is the most prominent framework for instructional quality in German-speaking countries (Klieme et al., 2001; Lipowsky & Bleck, 2019; Praetorius, Klieme, et al., 2020). It was developed based on data of the TIMSS-Video study 1995, which included scales on educational effectiveness, classroom and school climate, as well as various paradigms of educational psychology and didactics (Gruehn, 2000) that were reworked into high-inference observation protocols (Clausen, 2002). By conducting an exploratory factor analysis to systematize and structure the observable facets of instructional quality, Klieme, Schümer and Knoll (2001) found a three-dimensional solution from which they inductively developed the TBD: Classroom Management, Student Support and Cognitive Activation. Each dimension represents an overarching factor composed of specific elements (sub-dimensions) that were found to be associated with student achievement gains. Later on, all three dimensions (or factors, the term more widely used in the TER tradition) were conceptually linked to well-established psychological theories that explain why each dimension supports students’ learning and motivation. Even though the framework was originally developed in the context of mathematics instruction, the three dimensions are conceptualized as being generalizable across school subjects and grade levels.
The dimension of classroom management goes back to Kounin (1970): the aim is to spend as much teaching time as possible on the task. Structured classroom management is therefore characterized by disruption-preventive teaching, appropriate reactions to disruptions, well-functioning routines, and smooth transitions between teaching phases (Seidel, 2009). The dimension of student support is based on the self-determination theory by Ryan and Deci (2000) and includes teaching characteristics that promote social relatedness as well as the experience of autonomy and competence for students. Characteristics include, for example, an appreciative way in which the teacher deals with students and a constructive approach to correcting errors (Klieme & Rakoczy, 2008). The dimension of cognitive activation is based on constructivist learning theories and is oriented towards whether the teacher succeeds in stimulating students to engage in complex thought processes and to deal with the subject matter in greater depth (Lipowsky, 2015). It includes aspects such as students being required to make connections between topics, to use different approaches to solve a problem, or to reflect on, explain and discuss the mathematical content (Baumert et al., 2010; Lipowsky et al., 2009; Renkel, 2011). Effects of the TBD on student outcomes are hypothesized theoretically: student support should have a positive effect on student motivation, cognitive activation a positive effect on performance, and good classroom management should be a prerequisite for both aspects (Klieme & Rakoczy, 2003, 2008). The predicted effects have already been investigated in numerous studies with mixed results (see Section 3 in this article).
The TBD framework is not linked to a single instrument for assessing the quality of teaching. Instead, it serves as a basis for developing different instruments fitting the requirements of individual studies. These may differ in terms of perspective (observer, student, teacher), subdimensions and items included, and the relation to a subject, school type or school system. Therefore, observations as well as student and teacher questionnaires can be used as a data basis. The gradations in the quality rating of a dimension or subdimension are usually based on qualitative criteria; in some cases, quantitative criteria or a mixture of quantitative and qualitative criteria are used. The exact criteria depend on the individual operationalization. A review of the different applications of the TBD framework in empirical studies, including the perspectives and subdimensions used in the operationalizations, was conducted by Praetorius, Klieme, Herbert, and Pinger (2018).
In recent years there have been various approaches to further develop the framework. There are proposals to add other dimensions (for a summary see Praetorius, Rogh, et al., 2020), e.g. a dimension focusing on content-specific aspects of instructional quality (Lipowsky & Bleck, 2019; Lipowsky, 2015; Lipowsky, Drollinger-Vetter, Klieme, Pauli, & Reusser, 2018; Schlesinger, Jentsch, Kaiser, König, & Blömeke, 2018). In addition, a recent publication by Praetorius, Klieme et al. (2020) focused on developing the TBD into a theory to help deal with its
main criticism that developments of the framework are empirically driven rather than resulting from a proper theoretical base.

3. Empirical support provided to each framework

3.1. The dynamic model of educational effectiveness

Some material supporting the validity of the dynamic model has been produced since 2003, when the model was first developed. Specifically, seventeen empirical studies and one meta-analysis have been conducted to examine the main assumptions of the dynamic model at classroom level (for a review of these studies see Kyriakides et al., 2021). Table 1 summarizes the findings of these studies, indicating the type of support that each of the assumptions of the model has received from the empirical studies and meta-analysis. Below, issues of validity, reliability and prediction of student outcomes are discussed.

Table 1
Empirical evidence supporting the main assumptions of the dynamic model at the classroom level emerging from empirical studies and a meta-analysis.

Assumptions of the dynamic model              Empirical studies             Meta-analysis
1. Five dimensions can be used to measure     1, 2, 3, 4, 5, 6, 8, 9, 10,
   the teacher factors                        11, 12, 13, 14, 16, 17
2. Impact of teacher factors on learning      1, 2, 3, 4, 5, 7, 8, 9, 10,   1
   outcomes                                   11, 12, 13, 14, 15, 16, 17
3. Relationships between factors operating    1, 4, 5, 6, 7, 8, 9, 17       1
   at the same level: stages of effective
   teaching (including assessment)

Studies:
1) A longitudinal study measuring teacher and school effectiveness in different subjects (i.e., mathematics, language and religious education) and different learning domains (cognitive and affective) (Kyriakides & Creemers, 2008).
2) A study investigating the impact of teacher factors on achievement of Cypriot students at the end of pre-primary school (Kyriakides & Creemers, 2009).
3) A European study testing the validity of the dynamic model at teacher, school and system level (Panayiotou et al., 2014).
4) A study in Canada searching for grouping of teacher factors included in the dynamic model and revealing specific stages of effective teaching (Kyriakides, Archambault et al., 2013).
5) An experimental study investigating the impact upon student achievement of a teacher professional development approach based on the dynamic approach (Antoniou & Kyriakides, 2011).
6) Examining not only the impact but also the sustainability of the dynamic approach on improving teacher behaviour and student outcomes (Antoniou & Kyriakides, 2013).
7) Searching for stages of teacher’s skills in assessment (Christoforidou et al., 2014).
8) The effects of two intervention programmes on teaching quality and student achievement revealing the added value of the dynamic approach (Azkiyah, Doolaard, Creemers, & Van Der Werf, 2014).
9) Using the dynamic model to identify stages of teacher skills in assessment in two different countries (Cyprus and Greece) (Christoforidou & Xirafidou, 2014).
10) Using observation and student questionnaire data to measure the impact of teaching factors on mathematical achievement of primary students in Ghana (Azigwe et al., 2016).
11) Examining the impact of teacher behaviour on promoting students’ cognitive and metacognitive skills (Kyriakides, Anthimou, & Panayiotou, 2020).
12) Investigating the impact of teacher factors on slow learners’ outcomes in language (Ioannou, 2017).
13) Integrating generic and content-specific teaching practices when exploring teaching quality in primary physical education (Kyriakides, Tsangaridou, Charalambous, & Kyriakides, 2018).
14) A longitudinal study investigating the short- and long-term effects of the home learning environment and teacher factors included in the dynamic model on student achievement in mathematics (Dimosthenous, Kyriakides, & Panayiotou, 2020).
15) A case study of policy and actions of Rivers State, Nigeria to improve teaching quality and the school learning environment (Lelei, 2019).
16) Do teachers exhibit the same generic teaching skills when they teach in different classrooms? (Kokkinou & Kyriakides, 2018).
17) A longitudinal study on the impact of instructional quality on student learning in primary schools of the Maldives (Musthafa, 2020).
Meta-analysis:
1) A quantitative synthesis of 167 studies investigating the impact of generic teaching skills on student achievement (Kyriakides, Christoforou et al., 2013).

a) Prediction of student outcomes

First, it should be noted that all studies have provided empirical support for the multilevel nature of the dynamic model since factors operating at different levels have been found to be associated with student achievement. These studies have also revealed that the teaching factors and their dimensions are associated with student achievement gains. Cognitive learning outcomes in different subjects (i.e., mathematics, language, science and religious education) as well as non-cognitive outcomes, such as student attitudes towards mathematics, were used to measure the impact of factors. Thereby some support for the assumption that these factors are associated with student achievement gains in different learning outcomes has been provided. The generic nature of the factors is also supported since these studies revealed that the effects of the factors of the dynamic model on different student learning outcomes were similar (i.e., Cohen’s d values were around 0.20). However, it should be noted that only two studies examined the impact of the teacher factors on non-cognitive outcomes and only one on student metacognitive outcomes.

b) Construct validity – factor structure

A multi-trait, multilevel model was used in a study by Kyriakides and Creemers (2008) to examine the need to use the five proposed dimensions for the measurement of the effectiveness factors included in the dynamic model. This model was then replicated in a series of studies (see Table 1), and confirmatory factor analyses revealed that each factor can be measured in relation to the five dimensions, thus providing support for the use of the five dimensions in measuring the functioning of teacher factors (see Kyriakides et al., 2021). It is relevant to point out that one of these studies was conducted in Ghana, where the observation instruments and the student questionnaire were used to collect data on the teacher factors of the dynamic model and measure the impact of teaching factors on mathematical achievement of primary students (see Azigwe, Kyriakides, Panayiotou, & Creemers, 2016). In this study no effect of the teacher factors was identified through the student questionnaire, which was able to collect data on all eight teacher factors but not on all measurement dimensions. However, data collected through the observation instruments revealed effects of the teacher factors on student achievement. This shows the need to take into consideration all five dimensions for the measurement of the factors, since by using only some dimensions the effect of factors may not be identified. Similar results were also found in a study in the Maldives, where again data collected through the student questionnaire were able to detect the effect of only a few factors on student learning outcomes whereas observation data were able to detect the effect of all factors on student learning outcomes (Musthafa, 2020). Finally, it should be noted that no analyses have been done to examine whether the factors may be grouped into second-order overarching factors; however, studies have supported the assumption that the teaching factors of the dynamic model and their dimensions are inter-related. By using the Rasch model, it was possible to classify the teaching factors and their dimensions into stages of effective teaching, structured in a developmental order (see Kyriakides et al., 2021).

c) Internal consistency - reliability

In each study included in Table 1, in which a student questionnaire
was used to collect data on the teacher factors, a generalisability study on the use of student ratings was conducted to examine whether the data could be generalised at the classroom level. In all cases, it was shown that students were able to provide reliable data on the teaching practices of their teachers. For the studies in which classroom observations were used, the inter-rater reliability was examined and satisfactory results emerged from these studies (see Creemers & Kyriakides, 2012).

3.2. The International System for Teacher Observation and Feedback

a) Prediction of student outcomes

Relatively few studies have investigated the associations between teaching quality as conceptualized and measured by the ISTOF instrument and student outcomes. Miao, Reynolds, Harris, and Jones (2015) used the ISTOF instrument to compare instruction in China and the UK. The ratings were strongly positively correlated with student attainment in mathematics. In Norway, the observation instrument demonstrated good predictive validity, with a correlation of 0.31 with student attainment in mathematics, and was also seen as useful in professional development (Soderlund, Sorlie, & Syse, 2015). In a study in the Spanish region of Mallorca the instrument was able to discriminate between teachers in high and low performing schools (Reynolds, Salom, Delaiglesia, & Ramon, 2012). The evaluation of Teach First in England produced some similar findings (Muijs, Chapman, & Armstrong, 2012; Muijs, Chapman, Collins, & Armstrong, 2010). It sought to investigate the link between observed pedagogy and student outcomes as part of an evaluation of the impact of Teach First (an alternative teacher certification programme). It used ISTOF to measure teaching quality among Teach First practitioners and showed associations at the school level between better value-added results for schools and numbers of Teach First teachers.

c) Internal consistency – reliability of the factors

The observation instrument was also used in the Effective Classroom Practice mixed methods study of teaching (Kington, Day, Sammons, Regan, & Brown, 2009), and in a study of ‘Inspiring Teaching’ in England (Sammons, Lindorff, Ortega, & Kington, 2016; Sammons et al., 2014, 2018), where it was employed alongside the Quality of Teaching observation instrument (Van de Grift, 2007; Van de Grift, Matthews, Tabak, & de Rijcke, 2004), as well as qualitative observations, student survey measures and teacher interviews, to explore the concepts of effectiveness and inspiring practice from different perspectives. In the Kington et al. (2009) study five factors were extracted, with Cronbach’s alpha values for these factors ranging from 0.55 to 0.84. A study in the Spanish region of Mallorca supported the validity and reliability of the instrument, with the hypothesized factor structure showing good model fit using confirmatory factor analysis, and reliability of the factors ranging from 0.73 to 0.86 (Reynolds et al., 2012).

3.3. The Three Basic Dimensions of teaching quality

Since the development of the TBD, many instruments have been designed on its basis to measure the quality of instruction. Praetorius et al. (2018) reviewed commonalities and differences of the operationalizations across 39 publications from 21 studies to summarize empirical support for the TBD.

a) Prediction of student outcomes

Regarding the predictive validity of the model, only results from studies using a longitudinal, multi-level design were reported, as other designs can be questioned with respect to their interpretability. 13
publications investigated level 2 (i.e., classroom level) effects; to prevent
b) Construct validity – factor structure counting a single study repeatedly if several publications exist on the
same data, 7 publications were selected based on criteria such as
The factor structure of the ISTOF instrument containing 7 factors to focusing on direct effects of the TBD on the dependent variables, using
match the theoretical components has produced different results in some observer ratings or being the earliest English language publication. For
countries indicating that the 7 factors are not always fully supported. In each of the theoretically hypothesized effects on student achievement
particular, a study in Flanders, Belgium found that teacher effectiveness and motivational student characteristics empirical examples were
measured using the ISTOF observation instrument tended to be quite found, however not consistently across all studies. With regard to stu­
unidimensional, with the ISTOF items loading onto one ‘overall’ teacher dent achievement, the hypothesized positive effects of classroom man­
effectiveness scale (Marciniak & Janssen, 2012). On the other hand, a agement and cognitive activation have been found in approximately half
study in Hong Kong by Ko (2010) supported the hypothesized of the studies. Six studies also investigated the effects on students’
multi-factor structure, which showed good model fit, though high cor­ motivational characteristics. Positive effects of classroom management
relations between factors suggest the possibility of an overarching were reported in one of these studies and positive effects of student
higher order overall ‘effectiveness’ factor too. Variability analysis support were identified in another study. In addition, two unexpected
showed that the instrument had good stability across classroom obser­ effects were found: two studies reported negative effects of classroom
vations of the same teachers, who were observed between 15 and 23 management and one study a positive effect for cognitive activation on
times each, and was able to distinguish distinct patterns of behavior students’ motivational characteristics. Praetorius et al. (2018) discussed
between the observed teachers. Reliability tests based on Cronbach’s a number of possible explanations for the inconsistent patterns of re­
alpha showed internal consistency for the components ranging from sults, pointing to differences in study designs (e.g., school type, subject
0.70 to 0.97 for each. The observation instrument was also validated in taught, sample size, operationalization of the TBD, etc.) and statistical
Ireland, where discriminant and factorial validity were tested. Teacher modeling (e.g., latent vs. manifest modeling, selection of control vari­
effectiveness was found to be higher on most scales in co-educational ables, centering options, etc.).
and girls’ schools compared to boys’ schools and differed between
components. The hypothesized factor structure was supported (Devine, b) Construct validity – factor structure
Fahie, & McGillicuddy, 2013; Devine, Fahie, McGillicuddy, MacRuairc,
& Harford, 2010). A number of studies have been conducted in England In six studies the factorial validity of the framework was investi­
which have generally shown the observation instrument to be discrim­ gated. Of those, four reported positive results according to the three-
inating, but not necessarily to factor onto the seven proposed compo­ factor structure of the model (Fauth, Decristan, Rieser, Klieme, & Bütt­
nents. Ko and Sammons (2008) for example found 8 rather than 7 ner, 2014; Fauth et al., 2014; Künsting, Neuber, & Lipowsky, 2016;
constructs using EFA, though the sample size was less than twice the Kunter & Voss, 2011; Lipowsky et al., 2009). The other two studies
number of items, making analysis problematic. In their comparison of covered additional aspects of teaching quality from the beginning and
teaching instruction in China and the UK, Miao et al. (2015) reported therefore ended up with more than three factors, but still found factors
good cross-country validity, with the study supporting the ISTOF factor that were close or identical to the TBD (Kunter et al., 2005; Taut &
structure in both China and the UK. Rakoczy, 2016).
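
The internal-consistency figures reported above for the ISTOF components are Cronbach's alpha coefficients. As a minimal sketch of how such a coefficient can be computed, the following Python function applies the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), to a small table of ratings. The function name and the example data are hypothetical and are not taken from any of the studies cited.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a list of rows, one row per observed lesson.

    Each row holds the ratings on the k items of one component.
    The data layout is an assumption made for this illustration.
    """
    k = len(ratings[0])
    if k < 2 or len(ratings) < 2:
        raise ValueError("need at least two items and two rows")
    # Sample variance of each item across all rows.
    item_variances = [variance([row[i] for row in ratings]) for i in range(k)]
    # Sample variance of the total (summed) score per row.
    total_variance = variance([sum(row) for row in ratings])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings: 4 lessons rated on 3 items of one component.
example = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
]
print(round(cronbach_alpha(example), 2))
```

Values close to 1 indicate that the items of a component vary together across lessons, which is the sense in which the ranges quoted in this section are read as evidence of reliability.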


c) Internal consistency - reliability

The temporal stability of the TBD was also tested in six studies. Four of them were based on student ratings and reported test-retest correlations ranging between 0.72 and 0.86 for classroom management, between 0.57 and 0.98 for student support, and between 0.27 and 0.91 for cognitive activation (Holzberger, Philipp, & Kunter, 2013; Pinger, Rakoczy, Besser, & Klieme, 2018; Praetorius, Lauermann, Klassen, & Dickhäuser, 2017; Wagner et al., 2016). The other two studies investigated the stability of the TBD with respect to teacher ratings and reported test-retest correlations between 0.71 and 0.80 for classroom management, between 0.56 and 0.67 for student support, and 0.67 for cognitive activation (Holzberger et al., 2013; Wagner et al., 2016). All studies reported acceptable to excellent reliability for the TBD with minor exceptions for student support (see Praetorius et al., 2018, p. 416).

4. Similarities and differences between the three frameworks

4.1. Common factors

With regard to similarities found between the three frameworks, it should first be noted that all three frameworks are generic in nature and thus refer to factors that are assumed to affect student learning irrespective of the age of students, ethnicity, gender and other contextual factors. The three frameworks also refer to some common factors such as classroom management, supportive learning environment, questioning skills etc. With regard to the management of time factor, the dynamic model considers teachers’ actions for a) organising the classroom environment; and b) maximising engagement rates. Thus, the main interest of this factor for the dynamic model is whether students are on task and whether their teacher is able to deal effectively with any kind of classroom disorder without wasting teaching time. It is also important to investigate whether teachers manage to decrease loss of time for different groups of students by taking into consideration their different learning needs and abilities (e.g., by allocating supplementary work to gifted students who finish work earlier than others). Similarly, the TBD considers teachers’ actions for ensuring that students spend as much teaching time as possible on the task. Classroom management is therefore depicted by: a) disruption-preventive teaching and/or an appropriate and time-saving way of dealing with disruptions; b) monitoring; and c) well-working routines and smooth transitions between teaching phases (e.g., students have all their materials on their desks as soon as the lesson starts; passing out papers goes smoothly and quickly). For ISTOF, classroom management focuses on the extent to which the teacher maximizes learning time and effectively deals with any disruptions that may occur. Time management is one of the factors that was consistently found to have an impact on learning outcomes through studies that took place in the field of TER. Management of time is considered one of the most important indicators of teacher ability to manage the classroom in an effective way and was also included in earlier models in the field of educational effectiveness. For example, Creemers’ comprehensive model considered opportunity to learn and time on task as two of the most significant factors of effectiveness that operate at different levels. Opportunity to learn is also related to student engagement and time on task (Emmer & Everston, 1981). Therefore, effective teachers are expected to organise and manage the classroom environment as an efficient learning environment and thereby to maximise engagement rates (Creemers & Reezigt, 1996). With regard to the other common factors between the three frameworks, an overview may be found in Figs. 1 and 2, which present a comparison between the factors included in the three frameworks.

Fig. 1 provides a summary of how each of the three models deals with aspects of the classroom climate such as the relations among teachers and students and between students. It is notable that all three models see the development of a supportive learning environment where teaching and learning can take place as important. In addition, the dynamic model also refers to a business-like environment that may promote
learning inside the classroom. Specifically, apart from the type of interactions that exist in a classroom and can be seen as important for measuring classroom climate (Den Brok, Brekelmans, & Wubbels, 2004; Harjunen, 2012), the dynamic model also refers to three other elements (i.e., students’ treatment by the teacher, competition between students, and collaboration) which refer to teachers’ efforts to create a well-organized and accommodating environment for learning in the classroom (Walberg, 1986). Apart from the promotion of a supportive learning environment, the TBD refers to teachers’ actions to promote social relatedness as well as the experience of autonomy and competence for students. With regard to social relatedness, it examines whether everyone deals with each other in a friendly and respectful manner, whilst with regard to support for autonomy, it examines whether a) the lesson is interesting and relevant for students (e.g., exercises refer to their daily lives); b) no performance pressure exists; and c) students are able to make decisions (e.g., between tasks/solution strategies/social forms). For ISTOF, the extent to which the teacher communicates high expectations, and involves and values all students, is additionally considered.

Fig. 1. Similarities between the three frameworks with regard to the relations among teachers and students and between students.

Fig. 2 demonstrates the similarities amongst the three frameworks with regard to the instructional practices that are considered important. First of all, it should be noted that, in contrast to ISTOF and the dynamic model, which consider effectiveness factors as being separate from one another, the TBD refers to three overarching factors/dimensions (i.e., classroom management, student support and cognitive activation). The factors that refer to instructional practices are thus included in the TBD under the overarching factor of Cognitive activation. As shown in Fig. 2, all three models acknowledge the importance of questioning and structuring, including aspects of these factors, as well as aspects of self-regulated learning, such as the development of learning strategies. For example, with regard to questioning, the TBD refers to the provision of challenging questions for which students will be asked to provide reasons for their answers, whilst the dynamic model refers to the teacher raising different types of question (i.e., process and product) at an appropriate difficulty level, giving students time to respond and dealing with student responses. With regard to structuring, ISTOF refers to the factor of clarity of instruction, which takes into consideration the extent to which lessons are well structured and purposeful. The TBD examines whether the new content is linked to prior knowledge, and the dynamic model refers to the lesson beginning with an overview and/or review of objectives, outlining the content to be covered, signalling transitions between lesson parts and drawing attention to, and reviewing, main ideas. Finally, with regard to strategy development, the dynamic model refers to the factor of teaching-modelling, which anticipates that effective teachers promote students’ use of learning strategies and/or development of their own strategies in order to address different types of problems and develop skills promoting active learning. The TBD refers to tasks that have to be solved with more than one given solution strategy under the factor of teaching in a non-transmissive way, and ISTOF refers to the use of varied methods and strategies.

Fig. 2. Similarities between the three frameworks with regard to factors concerning teacher instructional practices.

Apart from the similarities between the three models, Fig. 2 also portrays some of the individual characteristics of each framework. For example, one may notice that only the TBD includes specific methods of teaching such as the Genetic-socratic method, whilst the dynamic model and ISTOF consider a variety of methods and do not refer to specific methods of teaching. Also, whilst ISTOF treats differentiation as an effectiveness factor, the dynamic model considers it as one of the five dimensions which are used for the measurement of all the other teacher factors. More information on the conceptual and measurement differences of the three frameworks may be found in the following section.


4.2. Differences in conceptualization and measurement

It is commonly accepted by the research and wider educational community that teaching practice comprises a multifaceted phenomenon with several different factors determining its effectiveness (Kyriakides & Creemers, 2008; Muijs et al., 2014; Scheerens, 2013). These factors may be regarded as either generic, meaning that they may be relevant in different contexts, subjects and age groups of students, or domain-specific, as they are considered more closely related to the teaching of specific subjects. All three frameworks described in the previous section may be seen as generic in nature as they refer to generic factors measuring teacher behavior in the classroom. Nonetheless, the three frameworks also demonstrate several conceptual differences. Firstly, with regard to their development process, ISTOF was developed by a team of experts from 20 countries using an iterative Delphi process to ensure cross-cultural relevance and validity. Namely, ISTOF queries asked experts their opinions about what constitutes “effective teaching”. The process started with a number of queries from the central committee to the country teams, using an iterative process leading to ever more focused queries. This process was used to generate the components, indicators and items for the final instrument. The importance of these factors was then examined through TER as noted previously. The dynamic model’s starting point, on the other hand, was derived by taking into account the results in the field of TER prior to its conception. In particular, it was based on the results of a series of process–product studies which have led to the identification of a list of factors that link specific teaching behaviors and characteristics to student outcomes. Specifically, teaching factors that were shown to have an effect on student achievement, such as structuring of lessons, time management, questioning and application, were included in the model and thereafter in the instruments measuring the teacher factors of the dynamic model. With regard to the TBD framework, its theoretical rationale was based on general theories of schooling and teaching, while the psychological mechanisms underlying the impact of the dimensions on student outcomes were realized later, based on well-established theories of student cognition and motivation.

Secondly, an essential difference between the dynamic model and previous models in the field of Educational Effectiveness Research (EER), as well as the other two frameworks described in this paper, is that it explicitly refers to the measurement of each factor and assumes that these factors represent multi-dimensional constructs. By proposing five measurement dimensions, the dynamic model aims to provide a clear distinction and a more accurate measurement not only of the quantitative aspects of the factors (i.e., the frequency), but also of the qualitative ones (i.e., stage, focus, quality and differentiation). The other two frameworks (i.e., ISTOF and TBD) deal with the differences between the quality gradations of the individual dimensions as a discrete factor. While ISTOF uses only frequency as a means of assessing the quality of instruction, the TBD uses qualitative and/or quantitative criteria depending on the factor and element of factor to be assessed. One could therefore claim that while all three frameworks argue for the importance of treating differentiation as a separate aspect of effective teaching, taking into consideration that the students in a class of any age and in any culture will differ from one another not only cognitively but also with regard to their affective and psychomotor skills, their generalized and specialized prior knowledge, their interests and motives, their socioeconomic and cultural background, and several other factors related to their learning outcomes (Dowson & McInerney, 2003; Slavin, 1987; Teddlie & Reynolds, 2000), only the dynamic model regards differentiation as an integral part of the qualitative measurement dimensions.

Thirdly, differences may be identified between the three frameworks with regard to their structural characteristics. In particular, the TBD refers to a three-dimensional structure of teaching quality comprised of three overarching factors. The first overarching factor (called a dimension) was named classroom management and covered, amongst others, elements such as effective handling of disruptions, classroom discipline, clarity of rules, and monitoring. The second overarching factor represented student support, and covered, amongst others, the elements of teacher sensitivity for individual needs, achievement pressure (negative indicator) and the experience of social relatedness for students. The third overarching factor was named cognitive activation and contained, amongst others, the elements of challenging tasks and Socratic teaching. In contrast, therefore, to both the dynamic model and ISTOF, the TBD refers to wider, overarching factors.

Fourthly, similar to the development of the dynamic model, for the development of ISTOF, the core international team primarily drew from a teacher effectiveness perspective to develop the original proposal, building on, e.g., Creemers’ (1994) model, which draws extensively on that of Carroll (1963). As with the dynamic model, ISTOF and the TBD do not draw on only one approach to learning. For instance, ISTOF does not only draw on direct instruction approaches, as was the case in earlier teacher effectiveness studies, but also explicitly incorporates the development of self-regulated learning, and the TBD is also based on constructivist theories of learning (i.e., Cognitive Activation; Piaget, 1985; Vygotsky, 1978). The dynamic model includes factors/teaching skills associated with direct teaching and mastery learning (Joyce et al., 2000), such as structuring and questioning, and with theories of teaching associated with constructivism (Brekelmans et al., 2000), including factors such as teaching modelling. Cognitive load theory is also taken into consideration by including the application factor, and motivation theories by including the factor of orientation. Therefore, an integrated approach to quality of teaching is adopted in all three frameworks.

Fifthly, the frameworks differ with regard to how strongly the teaching factors and their dimensions are conceptually seen as inter-related. For ISTOF the individual factors are deemed independent of each other (though empirical studies reveal quite strong correlations). In the TBD the factors can be distinguished from one another empirically (see Section 3.3 in this article); in some publications, however, it is hypothesized that classroom management is a prerequisite for student support and cognitive activation, which is something that is also supported by some of the empirical studies conducted to test the validity of the TBD (e.g. Brunner, 2018; Klieme et al., 2001). In addition, there is an overlap between some elements of student support and cognitive activation, e.g., using a constructive approach to errors, differentiation and adaptive support, as well as constructive feedback, indicate good student support but can also contribute to students’ cognitive activation (Praetorius, Klieme et al., 2020). The strongest interrelations can be found in the dynamic model. The assumption of the dynamic model with regard to the relations between the different factors was supported by seven studies conducted in different countries, which revealed that the teaching factors and their dimensions can be classified into five stages of effective teaching, structured in a broad developmental order (Kyriakides et al., 2021). The first three stages are mainly related to teaching skills concerning the direct and active teaching approach, moving from the basic requirements concerning quantitative characteristics of teaching routines to the more advanced requirements concerning the appropriate use of these skills as measured by the qualitative characteristics of these factors. These skills also gradually move from the use of teacher-centered approaches to the active involvement of students in teaching and learning. The last two stages are more demanding since teachers are expected to differentiate their instruction (level 4) and also to demonstrate their ability to use the new teaching approach (level 5). The allocation of teaching skills into stages of teaching may provide information for differentiating teacher professional development efforts to meet the individual needs of teachers. However, the functioning of each factor separately should be taken into account in defining effective teaching, and observational data should be used for teacher improvement purposes, since teachers located at the same stage may need to set different improvement priorities.

Finally, one of the main differences between the dynamic model and the other two frameworks refers to its multilevel nature. Namely, despite the
emphasis given by the dynamic model to the classroom level, this framework also refers to factors of effectiveness operating at different levels (e.g., school or system). It is assumed that factors of the upper (i.e., system and school) and of the lower levels (i.e., student) affect classroom teaching and therefore also indirectly affect student outcomes. This is taken into consideration when assessing the effectiveness of the teacher factors.

As discussed in this section, the three frameworks portray both important differences and similarities. However, the aim of this comparison is not to merely suggest a combination of all the factors included in each framework for the development of a new one. In such a case, the parsimony principle could not be satisfied, since such a framework cannot easily be used for identifying improvement priorities and designing focused interventions. On the contrary, this comparison may provide a starting point for the development of a more comprehensive framework for studying teaching and learning. This may be achieved by examining which of the factors included in each framework have been found to be associated with student learning in order to be used in developing a comprehensive theory. Similarly, the development of a comprehensive framework of teaching and learning based on the three different generic frameworks discussed in this paper foresees the development of instruments, practical in use, that will be able to effectively collect information on the functioning of the factors that will be included in this comprehensive framework.

4.3. Differences in the use of the three frameworks

As discussed in the previous sections, the three frameworks aim to provide a basis for measuring teaching effectiveness through a generic scope so as to contribute to improving educational quality in different contexts, settings, subjects and age groups of students. Apart from the aforementioned conceptual differences of the three frameworks, differences also exist with regard to the measurement process of teaching quality, as well as to the focus placed by each framework on the use of the data collected. First, as regards the data collection process, for collecting data on the teacher factors of the dynamic model two low-inference (LIO1 and LIO2) and one high-inference observational instruments are used, since it is acknowledged that each type of instrument has both advantages and disadvantages. In particular, the dynamic model uses the low-inference observation instruments since they demonstrate a higher level of reliability; however, in the attempt to develop specific scores for each factor, information on its qualitative characteristics may be lost. On the other hand, even though the high-inference instrument provides a more holistic view of the lesson, reliability is more difficult to achieve. Using all three instruments together may provide more information on the lesson observed regarding the teaching factors of the dynamic model. In particular, these instruments were designed to collect data concerned with different aspects of the eight teaching factors of the dynamic model, and previous studies provided empirical support for their construct validity. Specifically, LIO1 and LIO2 are best used in combination, as they examine different aspects of the factors and together are able to generate data for all teaching factors of the dynamic model (except student assessment) and their five dimensions. The high-inference observation instrument provides a more general overview of the lesson and covers the five dimensions of all eight factors of the model. Observers are expected to complete a Likert scale to indicate how often each teaching behavior was observed. The high-inference observation instrument is completed as soon as the lesson finishes, since observers are asked to document a broader view of the lesson based on the factors of the dynamic model (for more information on the instruments of the dynamic model see Creemers & Kyriakides, 2012).

In contrast to the dynamic model, both the TBD framework and ISTOF make use of one instrument. Unlike many frameworks aiming at measuring teaching quality, the TBD is not connected to a single predefined instrument but is associated with many different operationalizations, which can be based on different types of assessments (observer ratings, student and teacher questionnaires). However, high-inference observation instruments are recommended. Operationalizations can be adapted to individual requirements: originally developed for mathematics instruction in a German secondary school sample, the framework was used with respect to several subjects (e.g., reading, see Lotz, 2016; science, see Fauth et al., 2014), school types (e.g., primary school, see Gabriel, 2013; vocational school, Helm, 2016), and educational systems (e.g., PISA 2012, see Klieme et al., 2013). The ISTOF instrument consists of seven components, each of which has between two and four indicators, and each indicator consists of two items. Each item in the instrument is rated on a five-point Likert scale (labelled 5=strongly agree, 4=agree somewhat, 3=neutral, 2=disagree somewhat, 1=strongly disagree), with a ‘not applicable’ category also available, which was included as not all items can necessarily be observed in all lessons.

Secondly, with regard to the use of the observational data collected, it is important to note that the dynamic model mainly aims to use observational data regarding teaching for improvement purposes. Namely, the dynamic model takes into consideration that the ultimate aim of effectiveness research is to link research and practice by exploiting research results and using them to achieve improved educational outcomes for students. It therefore uses observational data for teacher professional development purposes (see Creemers et al., 2013). Through proposing five stages of teaching skills, as these were demonstrated through research evidence (see Section 3.1), teachers may be classified based on observations of their teaching and receive individual training and feedback regarding areas of teaching in which they need to place more focus whilst undertaking improvement actions. The other two frameworks have been used for collecting data on the quality of teaching; however, they have not been used in any large-scale ways for teacher professional development purposes. For instance, even though previous studies have usually aggregated ISTOF observations over a relatively large number of lessons, using psychometrics to analyze the validity, reliability and factor structure of the instrument, they have not dealt with the analysis of individual lessons. However, ISTOF is freely available to use and there have been instances in England where teachers have used it to support informal, low-stakes, within-school observations (e.g. pairs of colleagues observing each other and discussing the ratings to support each other’s professional learning as part of school improvement foci on teaching quality). Nonetheless, larger scale research and development studies are essential for evaluating ISTOF’s usefulness for developmental purposes, such as providing feedback to teachers and addressing one of the original aims of its development. With regard to the TBD, widely available access is provided for anyone who wishes to use the framework, which has led to the development of applications in educational practice, e.g. in some individual cases in Germany the model is used as a theoretical basis for teacher training, further teacher education, as well as instruments for teacher and school evaluation.

5. Discussion on the analysis of three lessons – differences/similarities

The lessons analyzed for the purpose of comparing the quality of teaching through the lens of the three frameworks in this paper are three 4th grade mathematics lessons drawn from the NCTE video library at Harvard University. For further description of the lessons see Charalambous and Praetorius (2018). The lessons rated in this analysis include a geometry lesson taught by a teacher named Mr. Smith, a lesson on strategies for multiplication taught by a teacher named Ms. Young, and a lesson on multiplying a fraction by a whole number taught by a teacher named Ms. Jones (all names are pseudonyms). All three teachers volunteered to have their lessons video recorded and gave consent for their video-recorded lessons to be used for research purposes (for the complete analyses of the three lessons see Muijs et al., 2018; Kyriakides, Creemers, & Panayiotou, 2018; Praetorius et al., 2018).

5.1. Analysis procedure

For the lesson analysis using the ISTOF instrument, three observers were used who observed each lesson and scored it following the observation. Where observers disagreed on a rating, the mean was calculated and rounded to the nearest whole rating. For the analysis of the three video-lessons using the instruments of the dynamic model, four independent observers were used. After the four observers watched the whole video-lessons twice in order to code them using the three observation instruments, their scores were compared to examine whether there was consensus (i.e., the inter-rater reliability was examined). Differences in some scores, mainly concerned with the quality dimension, were identified but after a discussion, consensus was reached. To evaluate the three lessons according to the TBD framework, an instrument was created based on three already existing observer rating manuals that focused on teaching quality in mathematics (German TIMSS Video: Clausen, Reusser, & Klieme, 2003; PERLE: Lotz, 2016; Pythagoras: Rakoczy & Pauli, 2006). The instrument was applied by two experienced raters who were required to give a holistic rating for each of the three basic dimensions as well as ratings for multiple sub-dimensions. Both raters familiarized themselves with the manual beforehand and discussed it in detail to develop a common understanding of the codes. To calibrate the raters and to increase inter-rater reliability, the raters applied the codes to three training videos from German mathematics instruction in a first step, discussed the dimension and sub-dimension ratings, and agreed on a consensus score. After the training, the three target videos of this study were rated separately by each rater. In cases of disagreement, consensus scores were used. It was assumed that having three lessons from different teachers with different improvement needs may allow for an in-depth analysis of the teaching processes that take place. In the next section, the differences and similarities in how the three lessons were evaluated using the three frameworks are discussed.

5.2. Comparison of the analyses of the three lessons

All three lesson analyses as conducted for the three different frameworks demonstrated significant strengths and weaknesses of the observed teaching. Based on the factors which each framework considers, some common features were identified. All three frameworks assessed Ms. Jones’s lesson as the one with the highest score/stage overall and the other two (i.e., Mr. Smith’s and Ms. Young’s) as being equally weak in terms of teachers’ response to student learning needs, classroom management and classroom climate. The analysis done by ISTOF collectively showed that the three lessons were strongest in the area of assessment and evaluation, and weakest in the areas of differentiation and inclusion and encouraging active learning and metacognition. Similarly, neither the TBD nor the dynamic model analyses found any sign of differentiation either in task difficulty or in dealing with student responses, and this is one of the reasons that the dynamic model classified the teaching of these lessons in the lower stages of teacher development. In particular, based on the analysis of the three lessons conducted using the dynamic model, no differentiation activity associated with any factor was observed, and this finding accords with previous studies that found the differentiation dimension of most factors situated at stage 4.

With regard to student assessment, it should be noted that neither the TBD nor the dynamic model addresses this factor in lesson observation. Particularly, despite the fact that the dynamic model considers student assessment as an integral part of teaching (Stenmark, 1992), and formative assessment, in particular, has been shown to be one of the most important factors associated with effectiveness at all levels, especially at the classroom level (e.g., De Jong, Westerhof, & Kruiter, 2004; Kyriakides, 2008; Shepard, 1989), it assumes that the functioning of this factor may not be easily captured through lesson observations. Since the dynamic model refers to all four phases of assessment (i.e., assessment tool construction, administration, recording and reporting of data), it is not considered practically possible to measure teachers’ skills in assessment by using external observation. More specifically, observation of teacher behavior in the classroom could not provide information relating to teachers’ skills in assessment tool construction, recording and reporting of data since these tasks may take place outside the classroom (Christoforidou, Kyriakides, Antoniou, & Creemers, 2014). Moreover, it is assumed that to measure teachers’ skills in administering assessment tasks, it would have been necessary to observe a large number of lessons per teacher, especially since a significant percentage of teachers provide assessment tasks only at the end of a unit or series of lessons and it would therefore have been very difficult to obtain data on teachers’ skills in assessment unless many lessons for each teacher had been observed. In terms of the assessment component, ISTOF takes into consideration, amongst others, teachers’ ability to promptly correct errors when questioning, and provide correction and explanation to student answers. These aspects of teaching are however recorded under questioning techniques for the dynamic model and under student support and, more specifically, constructive approach to errors for the TBD. Despite the fact, however, that some teaching behaviors may be defined differently and thus fall within the spectrum of different factors by each framework, consensus exists on the assessment of teaching with regard to the specific aspects of each factor.

With regard to classroom management and classroom climate (or classroom as a learning environment as referred to in the dynamic model), differences were again found between the three lessons, with a common view shared by all three frameworks. For example, with regard to the first lesson (i.e., Mr. Smith’s) ISTOF rated both classroom climate and classroom management at the midpoint of the scale due to strengths in classroom climate, such as Mr. Smith’s attempts to engage all students in question-and-answer sessions and promote interaction among students, and limitations in the ability of the tasks to respond to the students’ varying learning needs. ISTOF also found the overall student behavior in the class to be good, with very little disruption, but also identified the weakness of Mr. Smith to address off-task behaviors and lack of concentration, which were observed during both the whole class and seatwork parts of the lesson. The limitations observed in Mr. Smith’s ability to manage the classroom effectively, in terms of keeping all students on-task and maximizing their engagement rates, were also noted by the observers using the dynamic framework. Similarly to ISTOF, the dynamic model noted that the teacher of lesson 1 did not attempt to promote any interactions among students either by inviting students to comment on their classmates’ answers to questions or by assigning them group application activities. The TBD found Mr. Smith’s lesson scored well in classroom management since focus was placed on the fact that no time was spent on noninstructional activities, classroom rules and routines appeared to be clear to all students and no disruptive behavior was observed, which is something that was also noted by the other two frameworks.

Concerning the second lesson (Ms. Young’s), observations made using all three frameworks demonstrated a higher level of questioning skills than lesson 1. Specifically, ISTOF was able to identify that questions were posed early in the lesson relating to learning from previous lessons but then generating new questions where new strategies were practiced. With regard to teachers’ response to students’ answers it was noted that Ms. Young frequently asked students to provide explanations and she used open-ended questions to develop understanding. Students were also asked to solve quite complex problems at various points during the lesson. With regard to questioning, the observation done using the dynamic model identified some process questions which were posed in the lesson and the teacher’s attempts to use questioning techniques to aid learning (for example, when a student gave an answer, she asked another student to say it in another way and then another one to explain why the answers given by the previous two students were correct). The dynamic model however, in contrast to ISTOF, found that the feedback provided was not constructive enough to help students understand their mistakes (e.g., the teacher said: “You’re making the mistake, so fix
it”) and the conclusions drawn from a student’s answer were derived in most cases from the teacher’s words instead of the teacher asking students to comment on their classmates’ answers. ISTOF, however, also noted the tendency for the lesson to remain teacher-driven as the teacher was found to respond to questions herself when students did not. Similarly to the dynamic model, the TBD also found that the teacher rarely provided constructive feedback and supported only those students who were engaged with the task. However, the TBD also rated this lesson as having a higher score in questioning than the other two since it found that the teacher posed questions and problems that required making connections between multiple representations. The teacher also asked students to justify their answers or to repeat another student’s solution in a different way. As with the other two frameworks, the TBD also found the lesson to be very teacher-directed and as the teacher was explaining and developing the mathematics for the solution to the problems, the chance was missed of starting a discourse among students. For this reason, cognitive activation was rated to be moderate to low.

Finally, the third lesson (i.e., Ms. Jones’s) rated highly on clarity of instruction, classroom climate and management when using ISTOF, and was scored highest of the three lessons on these aspects. Similarly, the dynamic model found the teaching skills of this teacher to be better than the skills of the other two teachers in terms of structuring activities which were observed at different stages of the lesson (i.e., beginning and end), as well as progression in the difficulty level of the activities in which students were involved, application opportunities at different points of the lesson and some established routines (e.g., how to show that students have finished the task they are working on). This is why the third teacher was allocated to a higher level than the previous two teachers (i.e., stage 2). The TBD also found this lesson to be of moderate to high rating on classroom management and student support with effectively implemented routines and appropriate teacher feedback, as there seemed to be a common understanding in class that there is nothing wrong in making errors and feedback was formulated benevolently. Despite the fact that the dynamic model found progression in the difficulty level of the activities in which students were involved, both the TBD and ISTOF noted that the provided tasks were either not cognitively demanding (creating posters and writing down what the teacher had presented) or broken down into easily manageable small steps, indicating not particularly high expectations on behalf of the teacher.

As shown from the analyses of these three lessons conducted using the three frameworks, despite the fact that the instruments used in each framework portray different strengths and limitations, they are all suited to identifying areas of improvement in the observed lessons and distinguishing different stages of teaching. Implications of using different frameworks and instruments for measuring the quality of teaching are, therefore, discussed in the next section.

6. Towards developing a comprehensive theory of teaching and learning

The aim of this paper is to contribute to the ongoing efforts of researchers to strengthen the theoretical basis of EER and add to the broader debates about what constitutes effective teaching, under which conditions and for whom. A variety of theories, models and frameworks have been developed during the past three decades, giving emphasis to different factors for defining and measuring the quality of teaching and also portraying some common views. These frameworks were either more generic in nature, given that they aim to describe teaching more universally in terms of identifying factors that may explain variation in student outcomes and apply in different settings, circumstances, students etc., or more domain-specific. In the second case, the focus lies on the teaching of specific subjects and a more in-depth analysis of the teaching of specific subjects is attempted. In this paper we made use of three different generic frameworks developed within the field of educational effectiveness to analyze three video-lessons in order to examine whether possibilities exist for combining different generic frameworks and instruments so as to provide a more complete view of teaching quality. Based on the lesson analyses presented in the previous section, the advantages as well as the limitations of drawing on multiple frameworks and models of teaching quality, as opposed to using single approaches, are discussed next.

With regard to the advantages of using multiple frameworks and instruments to collect data on quality of teaching, the possibilities of obtaining a more complete view of teaching have been illustrated and can be acknowledged. By using a combination of instruments which take into consideration different aspects of teaching, more information may be provided on the particular weaknesses and strengths of the lessons seen and subsequently the information gathered on the teaching skills of the teacher may be more effectively employed for teacher professional development purposes and for providing more specific feedback that can be used for teaching improvement purposes. In addition, the use of different instruments deriving from different frameworks may overcome the weaknesses of instruments coming from just a single framework. For example, whilst the ISTOF instrument sufficiently provides data on the quantitative aspects of factors (i.e., how frequently an activity related to a factor takes place), and many operationalizations of TBD include qualitative and quantitative criteria in the assessment, only the dynamic model assumes that when measuring the functioning of a factor one should always take into consideration both its quantitative and qualitative characteristics. Apart from frequency, therefore, it also foresees the measurement of factors through four dimensions which examine qualitative characteristics of the functioning of a factor. The dimensions may therefore add to the information provided by the instruments used in other frameworks, as well as to the further development of future frameworks of educational effectiveness, as they are seen as important not only from a measurement perspective but also, and even more, from a theoretical point of view.

Furthermore, a combination of different frameworks may provide a broader view of teaching and take into consideration a wider range of factors. Factors that may not be taken into consideration in assessing the quality of teaching by one framework may be included in another and therefore using different frameworks may provide a better linkage between different approaches to teaching. For example, the dynamic model refers to eight specific factors that describe teachers’ instructional role (i.e., orientation, structuring, questioning, teaching-modelling, application, management of time, teacher role in making classroom a learning environment, and classroom assessment). These eight factors are associated with both direct teaching and mastery learning and also with theories of teaching associated with constructivism. The TBD on the other hand includes three broader dimensions of teaching (i.e., classroom management, student support and cognitive activation) which cover sub-dimensions such as effective handling of disruptions, classroom discipline, clarity of rules, monitoring, teacher sensitivity for individual needs, teacher support, challenging tasks and Socratic teaching, adding to the factors measured by the dynamic model. Finally, ISTOF also includes measurements of teachers’ skills in promoting the development of metacognition which further complement the measurements of teaching made by the other two frameworks. Past research has provided examples of using ISTOF in combination with other frameworks such as the Quality of Teaching Framework by Van de Grift et al. (2004). In addition, a ‘mixed methods lens’ has been advocated to add in qualitative data to supplement quantitative ratings used in ISTOF, providing triangulation and enrichment with the ability to provide vignettes from field notes or interviews to expand on numeric ratings in particular (see the Effective Classroom Practice and the Inspiring Teaching studies in England (Kington et al., 2014; Sammons et al., 2016; Sammons, Kington, Lindorff, & Ortega, 2018)). However, one may argue that the combination of different frameworks and instruments also portrays significant limitations to the conceptualization of teaching, both due to practical and theoretical reasons. Firstly, with regard to practicality issues, it is acknowledged that using different instruments in combination for the observation of a single lesson may be both time and
money consuming in terms of training observers and finding adequate funds to do so. Namely, in the case of combining different frameworks for conducting classroom observations, different observers are needed, each of whom will be trained to use the instruments of each framework to avoid contamination and bias of raters. For using the instruments of the dynamic approach only, when conducting classroom observations using LIO1 and LIO2, two observers are needed (i.e., one observer codes the lesson using LIO1 and the other using LIO2). If the other two frameworks are also to be used, four different observers are therefore needed.

Secondly, despite the fact that richer data may be collected on the teaching skills of teachers by using different frameworks to assess quality of teaching, practical limitations may arise in using the observation results for providing feedback to teachers and for professional development purposes. Namely, by observing the functioning of a large number of factors the focus of the observation is widened and less specific suggestions may arise for improvement purposes. Despite the fact that all three frameworks used to analyze the lessons for this paper found some similar weaknesses and points for improvement in each lesson, the use of three different frameworks may be seen as too much for providing teachers with priorities for improvement. It should also be noted that all three frameworks described in this paper refer to generic factors measuring teacher behavior in the classroom without considering that some teachers may have insufficient knowledge about the topic they teach. If these generic frameworks were therefore combined with frameworks measuring the effects of more subject-specific teaching factors on student learning, then in some cases of teaching a large number of suggestions would arise for improvement, which would make it difficult to determine improvement priorities. Finally, from a theoretical point of view the combination of different frameworks which do not link to similar theories of teaching may not be in a position to provide solid information on the measurement and realization of quality of teaching that would allow for the promotion of developments of the theoretical basis of the fields of educational and teacher effectiveness research. Therefore, we do not argue for the combination of all three frameworks in terms of adopting all factors and instruments to create a new comprehensive theory of teaching and learning. On the contrary, we argue for the need to critically review the factors that are included in existing models of effective teaching and consider factors that were found through research evidence to affect student learning. It should also be taken into consideration that the ratings of the same lessons, using the three frameworks mentioned in this paper as a basis, yielded not so different results despite the different lenses. This shows the possibilities of creating a comprehensive theory of teaching and learning based on existing frameworks. The need for a common theoretical basis upon which lessons may be assessed in terms of their quality is therefore argued so as to provide practitioners with specific improvement priorities.

Such an attempt at developing a common theoretical basis by synthesizing different frameworks and models to provide a platform for focused discussions is made by the MAIN-TEACH model presented in the introductory paper of this special issue. Among others, the three generic frameworks presented in this paper were used as a basis for the development of the MAIN-TEACH model. Therefore, after carefully comparing the three models with each other and discussing advantages and limitations of drawing on multiple frameworks at once, additional insights can be gained by comparing the three models with the MAIN-TEACH model. First, one of the uses of the MAIN-TEACH model is to provide a basis for discussions about instructional quality, as well as a common language for those discussions. MAIN-TEACH proved to be helpful, because as comparing the three generic models showed, similar constructs do not necessarily share the same name across frameworks (e.g., TBD: classroom management; DM: the classroom as a learning environment and management of time). At the same time, constructs with the same name can differ in content, although deviations were usually minor (e.g., DM: Assessment; ISTOF: assessment and evaluation). Second, a recurring commonality among models is that the aspects of instructional quality they include can be grouped around three factors that are similar across models. Namely, the greatest common denominator for comparing the three generic models comprised: classroom management, classroom climate and teacher instructional practices. Similar dimensions can be found in the intermediate layer of MAIN-TEACH (classroom and time management, socio-emotional support, and support for active engagement) as well as several other models in instructional quality (e.g., CLASS, Pianta, La Paro, & Hamre, 2008; Global Teaching InSights, Opfer et al., 2020). Third, the quality of instruction in the generic models is primarily determined by the opportunities crafted for students. Only in the case of TBD are some aspects of the use that students make of these opportunities also considered. MAIN-TEACH covers both areas and thus has the potential to describe a more holistic picture of what is happening in the classroom and bring greater focus to the aspect of the use that students make of opportunities in teaching quality research, although the authors claim that this area needs to be further elaborated in the model first. Fourth, the model solves the problem of the lack of content specificity in such a way that the individual dimensions can be operationalized generically or subject-specifically as needed. Also, the layered design of MAIN-TEACH is a valuable innovation that establishes relationships between different aspects of instructional quality and describes which dimensions are a foundation for the effective functioning of others. Although all underlying, intermediate and upper layer factors are included in the generic models to some degree, the new structure adds assumptions about the relationships of instructional features that could be helpful to researchers and practitioners. However, it also makes for a much higher complexity of possible operationalizations, since for all factors both the aspect of differentiation (underlying layer) and possible subdivisions into generic and subject-related components must be taken into account. Standardized operationalizations similar to those of the Dynamic Model and ISTOF are thus difficult to achieve even in the long term. Finally, for the use of teacher feedback, MAIN-TEACH is accompanied by similar problems as those pointed out for a combination of the three generic models. Due to the comprehensive conceptualization of instructional quality and the large number of features included in the model, it would be very time-consuming to measure as well as too unspecific for purposeful teacher feedback. It is therefore more suitable as a starting point for the targeted selection of characteristics to be captured in studies or for teacher feedback.

7. Conclusions – future directions

In this paper, we have made detailed comparisons of three different frameworks to study teaching quality. Then these frameworks were employed to rate and analyse three video-lessons for three different teachers, aiming to capture details concerning the observed quality of teaching. Despite the fact that some different aspects of teaching are included in each framework, all three frameworks proved of utility and were able to come to common conclusions with regard to the rating of lessons and the overall quality of teaching of each lesson. At this point, however, it should be noted that the small number of lessons observed and the fact that they all concerned the teaching of mathematics at the same age-group of students in one country context (the US) did not provide the opportunity to test the generic nature of the factors included in each framework. Even though evidence has been provided through previous research regarding their generic nature (as discussed earlier in this paper), a larger number of lessons in different subjects and for classes with different ages of students would provide further support to comparisons of the generic nature of the frameworks. Different subjects taught by the same teacher would also be useful in identifying points for improvement in their teaching. It should further be acknowledged that since these frameworks are generic, they cannot provide measurements of the teachers’ pedagogical content knowledge or whether some teachers may have insufficient knowledge about the topic they teach.
For example, by observing these three lessons, we noticed that one of the teachers made some mathematical errors, but the frameworks and their instruments do not enable us to generate scores on teacher knowledge and identify relevant professional needs; for this reason raters did not examine any mathematical errors that teachers made during a lesson. In the case of the dynamic model, the stages of teaching in which teachers are allocated based on their generic teaching skills may be found useful for the purpose of teacher professional development for teachers found in the lower classification of stages in which they lack teaching skills related to the basic elements of direct teaching. The lower stages refer to the quantitative characteristics of factors such as management of time, structuring of lessons, posing questions and assigning application activities to students but also incorporate some qualitative features of the three factors associated with the direct teaching approach (i.e., structuring, application, and questioning). For teachers, however, located in higher stages who have achieved the basic teaching skills, emphasis may also be placed on improving domain-specific teaching practices. Therefore, the frameworks presented here could also be expanded to include the effects of more subject-specific teaching factors on student learning or they could be combined when observing teaching with domain-specific frameworks, since combining both generic and domain-specific factors may allow the development of a comprehensive framework for measuring quality of teaching. In addition, we should stress the need for further research using different models and observation instruments to explore a broader range of outcomes and particularly socio-emotional outcomes and to further explore effects for different student groups, since there may be arguments that different teaching approaches are more helpful for some student groups (e.g., gender, SES, low or high prior attainment students, younger or older age groups). Thus, the concept of teaching quality may shift depending on the context and educational goals (Kyriakides et al., 2021). The MAIN-TEACH model which is discussed in the previous section, as well as any other new forms and combinations of frameworks of teaching and learning, should also take the above into consideration, especially given the new practices in teaching such as the rise of on-line and blended learning.

Overall, we conclude that the purposes of observation (research or to support professional development or improvement) and the nature of any research questions should guide the choice of frameworks in any observational study. In addition, practical constraints such as resources to employ and train raters will also shape decisions about whether to use

in the availability of devices and digital literacy expertise will also shape the ways learning takes place in different home and school contexts. Teachers’ use of artefacts to support learning may need more prominence in studying the quality of teaching and learning. There will be increased opportunities to collect more detailed data (from videos and digital records of ‘lessons’ and associated activities). Collaboration with researchers in the digital and e-learning fields is likely to prove fruitful in advancing our current understanding and in revising or expanding existing generic frameworks to recognise the changing context and to provide new evidence on how teaching and learning are evolving and the implications for different groups of students, particularly those most affected by the digital divide.

References

Antoniou, P., & Kyriakides, L. (2011). The impact of a dynamic approach to professional development on teacher instruction and student learning: Results from an experimental study. School Effectiveness and School Improvement, 22(3), 291–311. https://doi.org/10.1080/09243453.2011.577078.
Antoniou, P., & Kyriakides, L. (2013). A dynamic integrated approach to teacher professional development: Impact and sustainability of the effects on improving teacher behavior and student outcomes. Teaching and Teacher Education, 29(1), 1–12. https://doi.org/10.1016/j.tate.2012.08.001.
Azigwe, J. B., Kyriakides, L., Panayiotou, A., & Creemers, B. P. M. (2016). The impact of effective teaching characteristics in promoting student achievement in Ghana. International Journal of Educational Development, 51, 51–61.
Azkiyah, S. N., Doolaard, S., Creemers, B. P. M., & Van Der Werf, M. P. C. (2014). The effects of two intervention programs on teaching quality and student achievement. Journal of Classroom Interaction, 49(1), 4–11.
Baumert, J., Kunter, M., Blum, W., Brunner, M., Voss, T., Jordan, A., & Tsai, Y.-M. (2010). Teachers’ mathematical knowledge, cognitive activation in the classroom, and student progress. American Educational Research Journal, 47(1), 133–180. https://doi.org/10.3102/0002831209345157.
Brekelmans, M., Sleegers, P., & Fraser, B. (2000). Teaching for active learning. New learning (pp. 227–242). Netherlands: Springer.
Brophy, J., & Good, T. L. (1986). Teacher behavior and student achievement. In M. C. Wittrock (Ed.), Handbook of research on teaching (3rd ed., pp. 328–375). New York: MacMillan.
Brunner, E. (2018). Qualität von Mathematikunterricht: Eine Frage der Perspektive. Journal für Mathematik-Didaktik, 39(2), 257–284.
Caldwell, J. B., & Spinks, M. J. (1993). The self-managing school. London: The Farmer Press.
Carroll, J. B. (1963). A model of school learning. Teachers College Record, 64(8), 723–733.
Chapman, C., Muijs, D., Reynolds, D., Sammons, P., & Teddlie, C. (2016). The Routledge international handbook of educational effectiveness and improvement. London: Routledge.
Charalambous, C. Y., & Praetorius, A. K. (2018). Studying mathematics instruction through different lenses: Setting the ground for understanding instructional quality more comprehensively. ZDM, 50(3), 355–366.
Christoforidou, M., & Xirafidou, E. (2014). Using the dynamic model to identify stages of
more than one framework and instrument. Such choices should be
teacher skills in assessment. Journal of Classroom Interaction, 49(1), 12–25. https://
considered carefully and their consequences discussed since they will doi.org/10.13140/2.1.4511.2969.
shape the findings and the uses of the research, or foci of professional Christoforidou, M., Kyriakides, L., Antoniou, P., & Creemers, B. P. M. (2014). Searching
development and thus understandings of the quality of teaching for for stages of teacher skills in assessment. Studies in Educational Evaluation, 40, 1–11.
Clausen, M. (2002). Unterrichtsqualität: Eine Frage der Perspektive? Waxmann.
different age groups, subjects and, contexts. Although it is recognised Clausen, M., Reusser, K., & Klieme, E. (2003). Unterrichtsqualität auf der Basis
that the quality of teaching and learning goes beyond what may be hochinferenter Unterrichtsbeurteilungen. Ein Vergleich zwischen Deutschland und
observed in regular classroom settings (and that some important di­ der deutschsprachigen Schweiz. Unterrichtswissenschaft, 31(2), 122–141.
Creemers, B. P. M. (1994). The effective classroom. London: Cassell.
mensions may not be covered e.g., the role of planning or assessment Creemers, B. P. M., & Kyriakides, L. (2008). The dynamics of educational effectiveness: A
etc) because the frameworks were developed based on the premise of contribution to policy, practice and theory in contemporary schools. London and New
observing natural behaviour in normal classroom settings. Increasingly York: Routledge.
Creemers, B. P. M., & Kyriakides, L. (2012). Improving quality in education: Dynamic
it is recognised that teaching and learning are occurring very differently approaches to school improvement. London and New York: Routledge.
in many schools due to the rise of online learning and platforms that Creemers, B. P. M., & Reezigt, G. J. (1996). School level conditions affecting the
support out of school learning, and most recently the impact of the effectiveness of instruction. School Effectiveness and School Improvement, 7, 197–228.
Creemers, B. P. M., Kyriakides, L., & Antoniou, P. (2013). Teacher professional
Covid-19 pandemic means that online and blended learning are on the development for improving quality of teaching. Dordrecht, the Netherlands: Springer.
rise in many countries due to school closures or restrictions on numbers De Jong, R., Westerhof, K. J., & Kruiter, J. H. (2004). Empirical evidence of a
attending in face-to-face classes. This highlights the need for a major comprehensive model of school effectiveness: A multilevel study in mathematics in
the 1st year of junior general education in the Netherlands. School Effectiveness and
new direction in research on teaching quality. How relevant are existing
School Improvement, 15(1), 3–31.
frameworks in this new teaching and learning context? What constitutes Den Brok, P., Brekelmans, M., & Wubbels, T. (2004). Interpersonal teacher behaviour
quality in online and blended learning and how can it be studied and and student outcomes. School Effectiveness and School Improvement, 15(3/4),
measured? 407–442.
Devine, D., Fahie, D., & McGillicuddy, D. (2013). What is ‘good’ teaching? Teacher
International collaborations will be important to explore the way beliefs and practices about their teaching. Irish Educational Studies, 32(1), 83–108.
teaching quality can be defined and measured in the new online and Devine, D., Fahie, E., McGillicuddy, D., MacRuairc, G., & Harford, J. (2010). Report on the
blended learning contexts that many schools are adopting. Planning and use of the ISTOF (International System of Teacher Observation and Feedback) protocol in
Irish schools; Challenges, issues and teacher effect. Dublin: School of Education,
assessment are likely to pose additional challenges to teachers. The University College Dublin.
quality and availability of online platforms and packages, equity issues


Dimosthenous, A., Kyriakides, L., & Panayiotou, A. (2020). Short- and long-term effects of the home learning environment and teachers on student achievement in mathematics: A longitudinal study. School Effectiveness and School Improvement, 31(1), 50–79.
Dowson, M., & McInerney, D. M. (2003). What do students say about motivational goals?: Towards a more complex and dynamic perspective on student motivation. Contemporary Educational Psychology, 28(1), 91–113.
Doyle, W. (1986). Classroom organization and management. In M. C. Wittrock (Ed.), Handbook of research on teaching (3rd ed., pp. 392–431). New York: Macmillan.
Emmer, E. T., & Evertson, C. M. (1981). Synthesis of research on classroom management. Educational Leadership, 38(4), 342–347.
Fauth, B., Decristan, J., Rieser, S., Klieme, E., & Büttner, G. (2014). Student ratings of teaching quality in primary school: Dimensions and prediction of student outcomes. Learning and Instruction, 29, 1–9. https://doi.org/10.1016/j.learninstruc.2013.07.001.
Gabriel, K. (2013). Videobasierte Erfassung von Unterrichtsqualität im Anfangsunterricht der Grundschule. Klassenführung und Unterrichtsklima in Deutsch und Mathematik. Kassel: University press GmbH.
Galton, M. (1987). An ORACLE chronicle: A decade of classroom research. Teaching and Teacher Education, 3(4), 299–313.
Gruehn, S. (2000). Unterricht und schulisches Lernen. Schüler als Quellen der Unterrichtsbeschreibung. Waxmann.
Harjunen, E. (2012). Patterns of control over the teaching–studying–learning process and classrooms as complex dynamic environments: A theoretical framework. European Journal of Teacher Education, 34(2), 139–161.
Helm, C. (2016). Zentrale Qualitätsdimensionen von Unterricht und ihre Effekte auf Schüleroutcomes im Fach Rechnungswesen. Zeitschrift für Bildungsforschung, 6(2), 101–119.
Holzberger, D., Philipp, A., & Kunter, M. (2013). How teachers’ self-efficacy is related to instructional quality: A longitudinal analysis. Journal of Educational Psychology, 105(3), 774–786. https://doi.org/10.1037/a0032198.
Ioannou, C. (2017). The dynamic model of educational effectiveness tested by investigating the impact of classroom level factors on slow learners’ outcomes in language: An effectiveness study on a specific student population. Unpublished doctoral dissertation. Nicosia, Cyprus: Department of Education, University of Cyprus.
Joyce, B., Weil, M., & Calhoun, E. (2000). Models of teaching. Boston: Allyn & Bacon.
Kington, A., Day, C., Sammons, P., Regan, E., & Brown, E. (2009). Effective classroom practice: A mixed-method study of influences and outcomes. In Symposium Paper Presented at the British Educational Research Association Annual Conference.
Kington, A., Sammons, P., Regan, E., Brown, E., Ko, J., & Buckler, S. (2014). Effective classroom practice. Maidenhead: McGraw Hill Open University Press.
Klieme, E., & Rakoczy, K. (2003). Unterrichtsqualität aus Schülerperspektive: Kulturspezifische Profile, regionale Unterschiede und Zusammenhänge mit Effekten von Unterricht. PISA 2000 – Ein differenzierter Blick auf die Länder der Bundesrepublik Deutschland (pp. 333–359). VS Verlag für Sozialwissenschaften.
Klieme, E., & Rakoczy, K. (2008). Empirische Unterrichtsforschung und Fachdidaktik. Outcome-orientierte Messung und Prozessqualität des Unterrichts. Zeitschrift für Pädagogik, 54(2), 222–237.
Klieme, E., Backhoff, E., Blum, W., Hong, Y., Kaplan, D., Levin, H., & Vieluf, S. (2013). PISA 2012 context questionnaire framework. In OECD (Ed.), PISA 2012 assessment and analytical framework (pp. 167–208). Paris: OECD.
Klieme, E., Schümer, G., & Knoll, S. (2001). Mathematikunterricht in der Sekundarstufe I: „Aufgabenkultur“ und Unterrichtsgestaltung. In Bundesministerium für Bildung und Forschung (Ed.), TIMSS-Impulse für Schule und Unterricht (pp. 43–57).
Ko, J. (2010). Consistency and variation in classroom practice: A mixed-method investigation based on case studies of four EFL teachers of a disadvantaged secondary school in Hong Kong. Doctoral thesis. Nottingham, UK: University of Nottingham. Retrieved from: http://etheses.nottingham.ac.uk/1363/1/CVCP_SUBMISSION(FINAL)PB3.pdf (Accessed May 2017).
Ko, J., & Sammons, P. (2008). Variations in effective classroom practices: Confirmatory factor analysis results from analysis of measures from the International System for Teacher Observation and Feedback (ISTOF) Scale and Quality and Teaching Lesson Observation Indicator (GRIFT) Scale. Nottingham: University of Nottingham, School of Education.
Kokkinou, E., & Kyriakides, L. (2018). Do teachers exhibit the same generic teaching skills when they teach in different classrooms? In Paper Presented at the Annual Meeting of the American Educational Research Association (AERA) 2018.
Kounin, J. S. (1970). Observing and delineating technique of managing behavior in classrooms. Journal of Research and Development in Education, 4(1), 62–67.
Künsting, J., Neuber, V., & Lipowsky, F. (2016). Teacher self-efficacy as a long-term predictor of instructional quality in the classroom. European Journal of Psychology of Education, 31, 299–322. https://doi.org/10.1007/s10212-015-0272-7.
Kunter, M., & Voss, T. (2011). Das Modell der Unterrichtsqualität in COACTIV: Eine multikriteriale Analyse. In M. Kunter, J. Baumert, W. Blum, U. Klusmann, S. Krauss, & M. Neubrand (Eds.), Professionelle Kompetenz von Lehrkräften – Ergebnisse des Forschungsprogramms COACTIV (pp. 85–113). Waxmann.
Kunter, M., Brunner, M., Baumert, J., Klusmann, U., Krauss, S., Blum, W., … Neubrand, M. (2005). Der Mathematikunterricht der PISA-Schülerinnen und -Schüler: Schulformunterschiede in der Unterrichtsqualität. Zeitschrift für Erziehungswissenschaft, 8(4), 502–520. https://doi.org/10.1007/s11618-005-0156-8.
Kyriakides, L. (2008). Testing the validity of the comprehensive model of educational effectiveness: A step towards the development of a dynamic model of effectiveness. School Effectiveness and School Improvement, 19(4), 429–446.
Kyriakides, L., & Creemers, B. P. M. (2009). The effects of teacher factors on different outcomes: Two studies testing the validity of the dynamic model. Effective Education, 1(1), 61–86. https://doi.org/10.1080/19415530903043680.
Kyriakides, L., & Creemers, B. P. M. (2008). Using a multidimensional approach to measure the impact of classroom level factors upon student achievement: A study testing the validity of the dynamic model. School Effectiveness and School Improvement, 19(2), 183–205.
Kyriakides, L., Anthimou, M., & Panayiotou, A. (2020). Searching for the impact of teacher behavior on promoting students’ cognitive and metacognitive skills. Studies in Educational Evaluation, 64. https://doi.org/10.1016/j.stueduc.2019.100810.
Kyriakides, L., Creemers, B. P., & Antoniou, P. (2009). Teacher behaviour and student outcomes: Suggestions for research on teacher training and professional development. Teaching and Teacher Education, 25(1), 12–23.
Kyriakides, L., Creemers, B. P. M., Panayiotou, A., & Charalambous, E. (2021). Quality and equity in education: Revisiting theory and research on educational effectiveness and improvement. London and New York: Routledge.
Kyriakides, L., Creemers, B. P. M., Panayiotou, A., Vanlaar, G., Pfeifer, M., Cankar, G., & McMahon, L. (2014). Using student ratings to measure quality of teaching in six European countries. European Journal of Teacher Education, 37(2), 125–143.
Kyriakides, L., Creemers, B. P. M., Teddlie, C., & Muijs, D. (2010). The international system for teacher observation and feedback: A theoretical framework for developing international instruments. In P. Peterson, E. Baker, & B. McGaw (Eds.), International encyclopaedia of education (pp. 726–734). Oxford: Elsevier.
Kyriakides, L., Archambault, I., & Janosz, M. (2013). Searching for stages of effective teaching: A study testing the validity of the dynamic model in Canada. Journal of Classroom Interaction, 48(2), 11–24.
Kyriakides, L., Christoforou, C., & Charalambous, C. Y. (2013). What matters for student learning outcomes: A meta-analysis of studies exploring factors of effective teaching. Teaching and Teacher Education, 36, 143–152.
Kyriakides, L., Creemers, B. P., & Panayiotou, A. (2018). Using educational effectiveness research to promote quality of teaching: The contribution of the dynamic model. ZDM, 50(3), 381–393.
Kyriakides, E., Tsangaridou, N., Charalambous, C., & Kyriakides, L. (2018). Integrating generic and content-specific teaching practices in exploring teaching quality in primary physical education. European Physical Education Review, 24(4), 418–448.
Lelei, H. (2019). A case study of policy and actions of Rivers State, Nigeria to improve teaching quality and the school learning environment. Unpublished doctoral dissertation. Sydney, Australia: School of Education, UNSW.
Lipowsky, F. (2015). Unterricht. In E. Wild, & J. Möller (Eds.), Pädagogische Psychologie (pp. 69–105). Springer. https://doi.org/10.1007/978-3-642-41291-2_4.
Lipowsky, F., & Bleck, V. (2019). Was wissen wir über guten Unterricht? - Ein Update. In U. Steffens, & R. Messner (Eds.), Unterrichtsqualität: Konzepte und Bilanzen gelingenden Lehrens und Lernens. Waxmann (Bd. 3).
Lipowsky, F., Drollinger-Vetter, B., Klieme, E., Pauli, C., & Reusser, K. (2018). Generische und fachdidaktische Dimensionen von Unterrichtsqualität – zwei Seiten einer Medaille? In M. Martens, K. Rabenstein, K. Bräu, M. Fetzer, H. Gresch, I. Hardy, & C. Schelle (Eds.), Konstruktionen von Fachlichkeit Ansätze, Erträge und Diskussionen in der empirischen Unterrichtsforschung (pp. 183–202). Julius Klinkhardt.
Lipowsky, F., Rakoczy, K., Pauli, C., Drollinger-Vetter, B., Klieme, E., & Reusser, K. (2009). Quality of geometry instruction and its short-term impact on students’ understanding of the Pythagorean Theorem. Learning and Instruction, 19(6), 527–537. https://doi.org/10.1016/j.learninstruc.2008.11.001.
Lotz, M. (2016). Kognitive Aktivierung im Leseunterricht der Grundschule. Springer Fachmedien Wiesbaden. https://doi.org/10.1007/978-3-658-10436-8.
Marciniak, J., & Janssen, R. (2012). The international system for teacher observation and feedback questionnaire in the biology assessment–theory, evaluation, utility. In Paper Presented at Biennial Meeting of the Special Interest Group Educational Effectiveness (SIG 18) of the European Association for Research on Learning and Instruction (EARLI).
Miao, Z., Reynolds, D., Harris, A., & Jones, M. (2015). Comparing performance: A cross-national investigation into the teaching of mathematics in primary classrooms in England and China. Asia Pacific Journal of Education, 35(3), 392–403.
Muijs, D., & Reynolds, D. (2000). School effectiveness and teacher effectiveness in mathematics: Some preliminary findings from the evaluation of the mathematics enhancement programme. School Effectiveness and School Improvement, 11, 273–303.
Muijs, D., Chapman, C., & Armstrong, P. (2012). Teach First: Pedagogy and outcomes. The impact of an alternative certification programme. Journal for Educational Research, 4(2), 29–64.
Muijs, D., Chapman, C., Collins, A., & Armstrong, P. (2010). Maximum impact evaluation the impact of teach first teachers in schools an evaluation funded by the maximum impact programme for teach first final report. University of Southampton & University of Manchester.
Muijs, D., Kyriakides, L., van der Werf, G., Creemers, B., Timperley, H., & Earl, L. (2014). State of the art–teacher effectiveness and professional learning. School Effectiveness and School Improvement, 25(2), 231–256. https://doi.org/10.1080/09243453.2014.885451.
Muijs, D., Reynolds, D., Sammons, P., Kyriakides, L., Creemers, B. P., & Teddlie, C. (2018). Assessing individual lessons using a generic teacher observation instrument: How useful is the International System for Teacher Observation and Feedback (ISTOF)? ZDM, 50(3), 395–406.
Musthafa, H. S. (2020). A longitudinal study on the impact of instructional quality on student learning in primary schools of Maldives. Doctoral dissertation. Nicosia, Cyprus: Department of Education, University of Cyprus.
Opfer, D., Bell, C. A., Klieme, E., McCaffrey, D., Schweig, J., & Stecher, B. (2020). Understanding and measuring mathematics teaching practice. In OECD (Ed.), OECD Global Teaching InSights: A video study of teaching. Paris: OECD Publishing.
Panayiotou, A., Kyriakides, L., Creemers, B. P., McMahon, L., Vanlaar, G., Pfeifer, M., … Bren, M. (2014). Teacher behavior and student outcomes: Results of a European study. Educational Assessment, Evaluation and Accountability, 26(1), 73–93.


Pellegrino, J. W. (2004). Complex learning environments: Connecting learning theory, instructional design, and technology. In N. M. Seel, & S. Dijkstra (Eds.), Curriculum, plans, and processes in instructional design (pp. 25–49). Mahwah, NJ: Lawrence Erlbaum Associates.
Piaget, J. (1985). The equilibration of cognitive structures. The central problem of intellectual development. Chicago: The University of Chicago Press.
Pianta, R. C., La Paro, K. M., & Hamre, B. K. (2008). Classroom assessment scoring system™: Manual K-3. Paul H Brookes Publishing.
Pinger, P., Rakoczy, K., Besser, M., & Klieme, E. (2018). Interplay of formative assessment and instructional quality – Interactive effects on students’ mathematics achievement. Learning Environments Research, 21(1), 61–79. https://doi.org/10.1007/s10984-017-9240-2.
Praetorius, A.-K., Klieme, E., Herbert, B., & Pinger, P. (2018). Generic dimensions of teaching quality: The German framework of three basic dimensions. ZDM, 50(3), 407–426. https://doi.org/10.1007/s11858-018-0918-4.
Praetorius, A.-K., Lauermann, F., Klassen, R. M., & Dickhäuser, O. (2017). Longitudinal relations between teaching-related motivations and student-reported teaching quality. Teaching and Teacher Education, 65, 241–254.
Praetorius, A.-K., Klieme, E., Kleickmann, T., Brunner, E., Lindmeier, A., Taut, S., & Charalambous, C. Y. (2020). Towards developing a theory of generic teaching quality. Origin, current status, and necessary next steps regarding the three basic dimensions model. Zeitschrift für Pädagogik, 66, 15–36. Beiheft.
Praetorius, A.-K., Rogh, W., & Kleickmann, T. (2020). Blinde Flecken des Modells der drei Basisdimensionen von Unterrichtsqualität? Das Modell im Spiegel einer internationalen Synthese von Merkmalen der Unterrichtsqualität. Unterrichtswissenschaft. https://doi.org/10.1007/s42010-020-00072-w.
Rakoczy, K., & Pauli, C. (2006). Hoch inferentes Rating: Beurteilung der Qualität unterrichtlicher Prozesse. Dokumentation der Erhebungs- und Auswertungsinstrumente zur schweizerisch-deutschen Videostudie “Unterrichtsqualität, Lernverhalten und mathematisches Verständnis”. Teil 3: Videoanalysen (pp. 206–233). GFPF.
Renkl, A. (2011). Aktives Lernen = gutes Lernen? Reflektion zu einer (zu) einfachen Gleichung. Unterrichtswissenschaft, 39(3), 194–196.
Reynolds, D., Salom, K., Delaiglesia, B., & Ramon, R. M. (2012). The ISTOF project-A preliminary report. Spain: IAQSE (Institut d’Avaluacio del Sistema Education), Ministry of Education of the Balearic Isles.
Rosenshine, B., & Stevens, R. (1986). Teaching functions. In M. C. Wittrock (Ed.), Handbook of research on teaching (3rd ed., pp. 376–391). New York: Macmillan.
Ryan, R. M., & Deci, E. L. (2000). Self-determination theory and the facilitation of intrinsic motivation, social development, and wellbeing. American Psychologist, 55(1), 68–78.
Sammons, P. (1999). School effectiveness: Coming of age in the twenty-first century. Lisse, Netherlands: Swets and Zeitlinger.
Sammons, P., Davis, S., Day, C., & Gu, Q. (2014). Using mixed methods to investigate school improvement and the role of leadership: An example of a longitudinal study in England. Journal of Educational Administration, 52(5), 565–589.
Sammons, P., Lindorff, A., Ortega, L., & Kington, A. (2016). Inspiring teaching: Learning from exemplary practitioners. Journal of Professional Capital & Community, 1(2), 124–144.
Sammons, P., Kington, A., Lindorff, A., & Ortega, L. (2018). ‘It ain’t (only) what you do, it’s the way that you do it’: A mixed method approach to the study of inspiring teachers. Review of Education, 6(3), 303–356. https://doi.org/10.1002/rev3.3141. October 2018.
Scheerens, J. (2013). The use of theory in school effectiveness research revisited. School Effectiveness and School Improvement, 24(1), 1–38.
Schlesinger, L., Jentsch, A., Kaiser, G., König, J., & Blömeke, S. (2018). Subject-specific characteristics of instructional quality in mathematics education. ZDM, 50(3), 475–490. https://doi.org/10.1007/s11858-018-0917-5.
Schoenfeld, A. H. (1998). Toward a theory of teaching in context. Issues in Education, 4(1), 1–94.
Seidel, T. (2009). Klassenführung. In E. Wild, & J. Möller (Eds.), Pädagogische Psychologie (pp. 135–148). Berlin Heidelberg: Springer. https://doi.org/10.1007/978-3-540-88573-3_6.
Seidel, T., & Shavelson, R. J. (2007). Teaching effectiveness research in the past decade: The role of theory and research design in disentangling meta-analysis results. Review of Educational Research, 77(4), 454–499.
Shepard, L. A. (1989). Why we need better assessments. Educational Leadership, 46(2), 4–8.
Slavin, R. E. (1987). Ability grouping and student achievement in elementary schools: A best evidence synthesis. Review of Educational Research, 57(3), 293–336.
Soderlund, G., Sorlie, K., & Syse, I. (2015). Mestringsforventninger i matematikk. Paper Presented at Finnut, Lillehammer, 29-01-2015.
Stenmark, J. K. (1992). Mathematics assessment: Myths, models, good questions, and practical suggestions. Reston: NCTM.
Taut, S., & Rakoczy, K. (2016). Observing instructional quality in the context of school evaluation. Learning and Instruction, 46, 45–60. https://doi.org/10.1016/j.learninstruc.2016.08.003.
Teddlie, C., & Reynolds, D. (2000). The international handbook of school effectiveness research. London: Falmer Press.
Teddlie, C., Creemers, B., Kyriakides, L., Muijs, D., & Yu, F. (2006). The International System for Teacher Observation and Feedback: Evolution of an international study of teacher effectiveness constructs. Educational Research and Evaluation, 12(6), 561–582. https://doi.org/10.1080/13803610600874067.
Tsai, C. C. (2001). Relationships between student scientific epistemological beliefs and perceptions of constructivist learning environments. Educational Research, 42(2), 193–205. https://doi.org/10.1080/001318800363836.
Van de Grift, W. (2007). Quality of teaching in four European countries: A review of the literature and application of an assessment instrument. Educational Research, 49, 127–152.
Van de Grift, W., Matthews, P., Tabak, L., & de Rijcke, F. (2004). Comparative research into the inspection of teaching in England and the Netherlands (HMI 2251). London: Ofsted.
Vygotsky, L. S. (1978). Mind in society. The development of higher psychological processes. Cambridge: Harvard University Press.
Wagner, W., Göllner, R., Werth, S., Voss, T., Schmitz, B., & Trautwein, U. (2016). Student and teacher ratings of instructional quality: Consistency of ratings over time, agreement, and predictive power. Journal of Educational Psychology, 108(5), 705–721. https://doi.org/10.1037/edu0000075.
Walberg, H. J. (1986). Syntheses of research on teaching. In M. C. Wittrock (Ed.), Handbook of research on teaching (pp. 214–229). New York: Macmillan.
