Considerations for Intensifying Word Problem Interventions for Students With MD: A Qualitative Umbrella Review of Relevant Meta-Analyses
Myers et al. (2024)
Journal of Learning Disabilities
DOI: 10.1177/00222194241281293
Abstract
Word problem-solving (WPS) poses a significant challenge for many students, particularly those with mathematics
difficulties (MD), hindering their overall mathematical development. To improve WPS proficiency, providing individualized
and intensive interventions is critical. This umbrella review examined 11 medium- to high-quality meta-analyses to identify
intervention and participant characteristics, informed by the Taxonomy of Intervention Intensity (TII) framework, that
consistently moderate WPS outcomes for students with MD. Our analysis identified four characteristics with consistent
moderating effects: intervention model, number of treatment sessions, group size, and academic risk area. This result
suggests that these variables are potential considerations when customizing and intensifying WPS interventions to
maximize their effectiveness for students with MD. We discuss the implications of these findings for practice and research
and acknowledge the limitations of our review.
Keywords
learning disabilities (LD), mathematics difficulties, word problems
1 Georgia State University, Atlanta, USA
2 The University of Texas-El Paso, USA
3 The University of Texas-Austin, USA
4 Western Carolina University, Cullowhee, NC, USA
Corresponding Author: Jonte A. Myers, Department of Learning Sciences, College of Education and Human Development, Georgia State University, 30 Pryor Street, Suite 724, Atlanta, GA 30303, USA. Email: [email protected]; [email protected]
Problem-solving skills, such as critical and analytical thinking, are crucial for student success in post-secondary outcomes, including employment, college enrollment, graduation, and personal finance (Lombardi et al., 2015; Ritchie & Bates, 2013). Word problem-solving (WPS) in mathematics relies on these skills. However, WPS poses a significant challenge for many students in the United States. For instance, recent National Assessment of Educational Progress (NAEP) data reveal most students fall below proficiency in mathematics, as measured through WPS, with students with disabilities showing lower performance than their non-disabled peers (National Center for Education Statistics, 2020). In 2022, only 37% of all fourth-graders and 27% of eighth-graders performed at or above the NAEP proficient level in mathematics. For fourth grade, 53% of students with disabilities scored below the basic level on the NAEP, compared with 20% of students without disabilities. A similar performance gap was evident among eighth-grade students, with nearly three-fourths (73%) of students with disabilities performing below the basic level versus one-third of students without disabilities. These data highlight the prevalence of mathematics difficulties (MD) among U.S. students, emphasizing the need for enhanced WPS intervention to improve student proficiency in this skill.
Using proven instructional strategies, or evidence-based practices (EBPs), is crucial for improving academic outcomes for students with MD (Bouck et al., 2022; Gersten et al., 2008; Witzel et al., 2024). EBPs are backed by rigorous research and consistently demonstrate effectiveness in improving student outcomes. They empower educators to increase the percentage of students attaining or surpassing grade-level proficiency in mathematics (Doabler et al., 2021; Feliz, 2020). To address persistent low mathematics performance among students with MD, schools often implement multi-tiered systems of support (MTSS) integrated with EBPs (Mason et al., 2019; Schumacher et al., 2017). An MTSS instructional framework typically comprises three distinct tiers (L. S. Fuchs & Fuchs, 2002). Tier 1 core
instruction utilizes the general education curriculum to cater to all students within inclusive classroom settings. At Tier 2, students who exhibit inadequate responsiveness to core instruction receive targeted small-group instruction (3–7) accompanied by monthly progress monitoring (D. Fuchs et al., 2014; Powell & Fuchs, 2018). Tier 3 instruction explicitly targets students with more intensive learning needs, often those with diagnosed disabilities or requiring specialized interventions beyond general education support. These students receive individualized or small-group instruction (2–3) with more frequent progress monitoring (D. Fuchs et al., 2014). Practitioners at Tiers 2 and 3 use data-driven intervention adaptations (e.g., dosage adjustments) to support students with MD within general education settings before disability identification and special education supports (L. S. Fuchs et al., 2017). Therefore, understanding the factors influencing the effectiveness of EBPs for supporting WPS performance is crucial to successfully providing targeted and individualized WPS intervention at Tiers 2 and 3 of MTSS.
Understanding the factors that influence the effectiveness of existing interventions is crucial for designing targeted and individualized WPS instruction for students with MD at Tiers 2 and 3 of the intervention (Powell & Fuchs, 2015; Vaughn et al., 2012). These factors can be related to the intervention (e.g., instructional dosage) or the student (e.g., specific learning needs). We conducted an umbrella review to identify these factors, synthesizing findings from multiple meta-analyses focusing on WPS interventions for students with MD. Our goal was to identify characteristics of both the interventions and the participants that consistently moderate (influence) the outcomes of these interventions.
Definition of MD
Students with MD represent diverse learners with persistent low mathematics achievement (Nelson & Powell, 2018). This group includes students with Individualized Education Programs (IEPs) and mathematics-focused goals under the Individuals with Disabilities Education Act (IDEA) category of Specific Learning Disability (SLD), as well as students without this diagnosis. Many students without SLD but with persistent mathematics challenges are also included. Schools often identify these students as at-risk based on set cutoff scores on standardized achievement tests, such as the 10th, 25th, 35th, or 40th percentiles (Nelson & Powell, 2018). Also, teacher recommendations play a role in identifying students with MD (Clarke et al., 2014). Despite the variability in MD criteria, students with MD share common challenges in cognitive processes, such as working memory, and academic skills, such as vocabulary and computational fluency, which limit their WPS proficiency (Powell et al., 2019).
Word-Problem Challenges Faced by Students With MD
Effective WPS is a complex process that requires students to apply various cognitive functions and academic skills (Björn et al., 2016; Boonen et al., 2013; Pongsakdi et al., 2020). It requires utilizing cognitive processes, such as working memory, metacognition, self-regulation, and proficiency in diverse academic domains, including mathematics, language, and reading (Verschaffel et al., 2020). The process begins with carefully reading the problem to grasp its meaning and relationships, interpreting complex vocabulary, and extracting essential details (Daroczy et al., 2015; Kintsch & Greeno, 1985). Next, students create visual representations, structure a solution plan, select an appropriate strategy, perform accurate computations, and verify their solutions (Björn et al., 2016; Verschaffel et al., 2020). These tasks demand cognitive skills like self-regulation and a conceptual and procedural understanding of mathematics (Borkowski et al., 1989; Decker & Roberts, 2015).
In addition to the cognitive and mathematical skills mentioned earlier, effective WPS demands proficiency in language and reading, including verbal reasoning and vocabulary decoding (Leiss et al., 2019; Verschaffel et al., 2020). Word problems often present complex scenarios, requiring students to comprehend intricate details, decode complex vocabulary, disregard extraneous information, and extract essential clues (Ng et al., 2017; Powell et al., 2019). Students with MD often need help with one or more of these domains and skills necessary for successful WPS (Fuchs et al., 2021; Griffin et al., 2018; Peltier et al., 2020; Swanson et al., 2015). As a result, WPS becomes exceptionally challenging for these students, emphasizing their need for an intensive and individualized WPS intervention (Powell et al., 2019).
Intensifying WPS Instruction for Students With MD
Within MTSS frameworks, schools can use data-based individualization (DBI) to tailor WPS interventions for students with MD (National Center on Intensive Intervention [NCII], 2023). DBI is a systematic, data-driven approach that practitioners can utilize to intensify instruction for students with disabilities (NCII, 2023). It incorporates a multi-component design comprising sequential steps (Powell, Bos, et al., 2022). First, teachers select an appropriate EBP for students eligible for Tier 2 instruction and implement it through targeted small-group instruction following established guidelines (Powell & Fuchs, 2015). Next, teachers gather student performance data through regular progress monitoring to track their progress and identify areas requiring additional support (Park et al., 2023). They then use diagnostic
data to inform intervention adaptations for students not making sufficient progress after Tier 2 intervention, providing more intensive instruction for students with persistent and significant difficulties (Powell, Benz, et al., 2022; Powell, Bos, et al., 2022). At Tier 3, teachers intensify instruction by reducing instructional group size, increasing intervention frequency, or extending instructional duration (NCII, 2023). However, interventions should be adapted cautiously, as alterations may diminish effectiveness (L. S. Fuchs et al., 2017). Providing guidelines and evidence-based adaptations is crucial to assist teachers in delivering individualized WPS interventions for students with more intensive needs.
The Taxonomy of Intervention Intensity (TII), developed by L. S. Fuchs et al. (2017), serves as a promising framework for intensifying and adapting WPS interventions for students with MD. It provides valuable guidelines for selecting and adapting interventions to enhance the efficacy of EBPs for these students (NCII, 2023). This taxonomy comprises seven essential dimensions: strength, dosage, alignment, attention to transfer, comprehensiveness, behavioral or academic support, and individualization (L. S. Fuchs et al., 2017). The TII offers a valuable conceptual framework for tailoring WPS interventions for students with MD. However, it does not explicitly identify specific intervention (e.g., group size and intervention frequency) and participant (e.g., academic risk area and English learner [EL] status) characteristics, which are considered moderators of WPS interventions’ outcomes. Verifying these moderators is crucial for optimizing WPS interventions for students with MD. Intervention features such as group size and intervention frequency can interact with participant characteristics, such as their academic risk area (e.g., MD vs. MD and reading difficulty [RD]) or EL status, to influence WPS outcomes. For instance, a reduced group size might benefit students with both MD and RD more than those with only MD (Powell et al., 2019). Similarly, ELs and students with RD who face mathematics challenges might require additional language and reading support to improve their WPS skills (King & Powell, 2023; Lei et al., 2020). This highlights the need to identify intervention and participant characteristics influencing WPS outcomes for students with MD.
Literature Review
Researchers have conducted numerous meta-analyses to identify critical factors influencing the effectiveness of WPS interventions for students with MD. These reviews analyzed findings from group design research (GDR) and single-subject research (SSR), providing a comprehensive understanding of the factors moderating the outcomes of such interventions. This research has significantly advanced our knowledge, enabling educators to tailor WPS interventions to individual needs and optimize their effectiveness in supporting students with MD. Yet, the evidence is inconsistent, with some results showing agreement while others are divergent or inconclusive. For example, while the dependent measure consistently predicts WPS outcomes across various reviews (e.g., Kong et al., 2021; Myers et al., 2022), other essential intervention features, such as group size, exhibit inconsistent results. Myers et al. (2022) concluded that interventions with larger groups yielded better outcomes among students in Grades 1–5, while Xin and Jitendra (1999) inferred that those done with individual students were more effective among students in K through post-secondary. Lein et al. (2020) focused on K–12 students and reported no difference in effect sizes (ESs) based on group size. Similar inconsistencies emerged for participant MD status (i.e., learning disability [LD] vs. at-risk), with some studies reporting significant moderating impacts (e.g., Xin & Jitendra, 1999) and others finding none (e.g., Lein et al., 2020). A deeper literature analysis is needed to reconcile findings and establish a clearer understanding of intervention and participant characteristics impacting WPS outcomes for students with MD.
The conflicting results across different studies on WPS interventions for students with MD may be due to variations in inclusion criteria, such as the targeted populations (e.g., grade and age range). These inconsistencies highlight the risk of relying solely on the findings of a single study to make decisions about intensifying and individualizing WPS interventions for students who do not respond well to regular classroom instruction. Instead, educators should consider the collective findings of several studies in a meta-review, such as an umbrella review. This review can provide a more comprehensive and balanced understanding of the intervention and participant characteristics associated with the TII dimensions (e.g., intensity, duration, individualization) that moderate the efficacy of WPS interventions. Nonetheless, no umbrella reviews currently focus on the factors that influence the effectiveness of WPS interventions for students with MD. This gap in research represents a critical area requiring further investigation. Conducting an umbrella review in this domain could offer invaluable insights into the complex interplay between intervention features, student characteristics, and WPS outcomes. Such research would equip educators with the knowledge to tailor WPS interventions and maximize learning outcomes for students with MD.
Definition of Umbrella Review
Umbrella reviews offer a comprehensive and systematic approach to evidence synthesis, leveraging the strengths of multiple existing systematic reviews and meta-analyses centered on a specific research question, ultimately producing high-quality evidence for various stakeholders (Choi & Kang, 2022; Fusar-Poli & Radua, 2018). This expansive
approach provides a broader and more complete understanding of the available evidence. Furthermore, the foundation of a robust umbrella review lies in the well-established methodologies employed by the included systematic reviews and meta-analyses, such as comprehensive searches for relevant studies, rigorous data-extraction procedures to minimize bias, and the use of appropriate statistical methods for combining findings (Fusar-Poli & Radua, 2018). These reviews follow protocols that reduce bias and ensure reliable data extraction and analysis (Aromataris et al., 2015). Umbrella reviews then build upon this strong foundation by applying similar methods to assess the quality and consistency of evidence across the included studies (Fusar-Poli & Radua, 2018; Papatheodorou & Evangelou, 2022). This multi-layered approach allows them to synthesize findings and identify potential biases that might be present in individual studies (Fusar-Poli & Radua, 2018; Papatheodorou & Evangelou, 2022).
Umbrella reviews can provide a more robust and trustworthy assessment of the overall evidence by considering a more comprehensive range of perspectives and methodologies (Papatheodorou & Evangelou, 2022). Consequently, experts consider evidence derived from well-conducted umbrella reviews among the highest quality available (Fusar-Poli & Radua, 2018). This high-quality evidence is valuable not only in healthcare but also in fields like education and psychology, where umbrella reviews are becoming increasingly utilized to synthesize research across diverse areas (Choi & Kang, 2022; Faulkner et al., 2022; Papatheodorou & Evangelou, 2022). Ultimately, well-designed umbrella reviews provide a reliable and trustworthy foundation for stakeholders to make informed decisions and recommendations based on the best available evidence (Choi & Kang, 2022).
Rationale and Research Questions
This qualitative systematic umbrella review aims to equip practitioners with the knowledge to make informed decisions about intensifying and individualizing WPS interventions for students with MD. We analyzed and synthesized findings across relevant meta-analyses, meticulously evaluating their reporting quality to ensure reliable and trustworthy evidence. This comprehensive approach, guided by the TII dimensions, allowed us to gain a deeper understanding of the critical intervention and participant characteristics influencing the effectiveness of WPS interventions. Specifically, we addressed the following research question: Which intervention and participant characteristics associated with the TII dimensions are consistent moderators of the efficacy of WPS interventions for students with MD?
Our approach of using a qualitative umbrella review is well-suited for several reasons. First, the differences in research methods and student populations across meta-analyses on WPS interventions make it challenging to apply quantitative synthesis methods such as meta-analyses (Bushman & Wang, 2009; McKenzie & Brennan, 2022). These methods may overlook important details and contextual factors that influence the effectiveness of interventions. A qualitative umbrella review, on the other hand, provides a more comprehensive and refined understanding of these factors, allowing practitioners to make informed decisions about tailoring WPS interventions for students with MD.
Second, while individual meta-analyses offer valuable insights, they present certain limitations. Overreliance on the findings of a single study can be problematic due to potential biases, such as publication bias, which may restrict the generalizability and validity of conclusions (Allen, 2020; Greco et al., 2013). This can lead to discrepancies in findings across multiple meta-analyses, highlighting the need for a more comprehensive analysis to identify the sources of variability.
Third, advancements in statistical methodologies have accompanied the growing number of meta-analyses on WPS interventions in recent years. Unlike traditional approaches that aggregate multiple effects within studies, advanced techniques including multi-level modeling and robust variance estimation (RVE) help researchers address ES dependency. These methods provide more precise estimates by considering data structure and correlated factors that might influence the estimates (Hedges et al., 2010). In addition, to obtain a deeper understanding of moderating factors in intervention studies, there has been a paradigm shift from traditional methods for assessing heterogeneity (e.g., multiple subgroup analyses) to meta-regression techniques (Tipton et al., 2023). Meta-regression models are contemporary approaches that simultaneously assess numerous moderators. These models allow for a deeper understanding of moderators by exploring how study-level characteristics systematically influence the size and direction of intervention effects beyond the simple identification of ES differences achieved through traditional methods (Tipton et al., 2023).
The evolving landscape of WPS intervention research, marked by model improvement and precision, requires a thorough analysis of findings from both recent and older studies. Examining shifts in moderators over time is vital to ensure practitioners have access to the latest information for planning WPS interventions for students with MD requiring more intensive support. Understanding how these methodological advancements influence the interpretation of WPS intervention outcomes for students with MD is crucial. This knowledge equips researchers to better address issues of study quality, confidence in the findings, and the overall strength of the evidence base in their analyses. Therefore, a qualitative umbrella review is needed to synthesize findings from recent and older meta-analyses and provide
practitioners with evidence-based recommendations for tailoring WPS interventions for these students.
Method
We conducted this study following the best practices for systematic reviews outlined by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA; Page et al., 2021). We used a systematic process to locate meta-analyses of studies on WPS interventions for students with MD. This process included three standard approaches. First, we searched electronic databases, including Academic Search Premier, Education Source, ERIC, and PsycINFO, from 1975, the advent of Federal law guaranteeing a free, appropriate public education to each child with a disability in the United States, to July 2023. We used the following Boolean string to search the titles, abstracts, and keywords: “meta-analysis” OR “systematic review” OR “synthesis” OR “review” AND “LD” OR “learning disabilit*” OR “remedial” OR “at-risk” OR “at risk” OR “MD” OR “math* difficult*” OR “disabilit*” OR “low achieving” OR “low-achieving” OR “low performing” OR “low-performing” AND “word problem*” OR “word-problem*” “problem-solving” OR “problem solving.” Second, we conducted hand searches of relevant and available electronic journals to identify meta-analyses of WPS interventions published between 1975 and September 2023. We searched the following journals: Review of Educational Research, Learning Disabilities Research and Practice, Journal of Learning Disabilities, Exceptional Children, Remedial and Special Education, Journal of Educational Effectiveness, and Learning Disabilities Quarterly. Lastly, we manually searched the references of the articles identified using the first two approaches to locate additional studies.
Search Results, Title, Abstract, and Article Screening
Figure 1 shows a PRISMA diagram of the study-selection process. The electronic search yielded 856 titles and abstracts. Two experienced screeners with expertise in conducting systematic reviews and meta-analyses independently checked these records, ensuring double screening for accuracy. At each stage, we calculated inter-rater reliability
Table 1. Inclusion and Exclusion Criteria for Selecting Meta-Analyses in the Systematic Umbrella Review.
(IRR) as the number of agreements divided by the sum of disagreements and agreements, multiplied by 100, and discussed discrepancies to reach a consensus. We used Endnote to identify and remove 371 duplicates. After de-duplication, we retained 485 unique abstracts and titles for screening using the inclusion and exclusion criteria (Table 1). We used Rayyan, a Web-based application, to screen the titles and abstracts obtained through Endnote. Rayyan uses machine learning algorithms to select studies to include in a systematic review (Ouzzani et al., 2016).
We screened the 485 abstracts and titles using Rayyan and excluded 465, with an initial IRR of 85%. After discussing disagreements and reaching a consensus, we retrieved the full-text versions of the remaining 20 records and screened them using the inclusion and exclusion criteria. We excluded 13 articles, with an initial IRR of 87%. After discussion, we reached 100% agreement and retained the remaining seven articles for inclusion in the study. Our manual searches of the reference lists of reviews meeting the inclusion criteria yielded two additional studies. We retained one of these studies in the final sample as it met our inclusion criteria. Our hand search did not produce any new articles. The IRR for this process was 100%.
Our systematic search yielded eight articles, three of which included only GDR (i.e., Kong et al., 2021; Lein et al., 2020; Myers et al., 2022), two focused solely on SSR (i.e., Lei et al., 2020; Shin et al., 2021), and three reported separate findings for both types of research (i.e., Xin & Jitendra, 1999; Zhang & Xin, 2012; Zheng et al., 2013). Our final sample comprised 11 meta-analyses (i.e., six GDR and five SSR reviews).
Study Quality Appraisal
Rigorous methodology is the foundation for trustworthy research; umbrella reviews that synthesize evidence from multiple meta-analyses are no exception. The quality of the underlying studies significantly influences the interpretation and reliability of findings presented in umbrella reviews (Faulkner et al., 2022). High-quality meta-analyses, employing robust methodologies that minimize bias and random error, ultimately provide a more solid foundation for the interpretations presented in this umbrella review. Conversely, lower-quality meta-analyses can introduce bias or methodological flaws that propagate to the umbrella review, potentially leading to misleading interpretations of the overall evidence (Fusar-Poli & Radua, 2018). Therefore, assessing and ensuring the methodological quality of meta-analyses included in umbrella reviews is crucial. By evaluating the quality of the included research, we enhance the credibility and trustworthiness of the findings, providing a more accurate and dependable synthesis of the evidence. To minimize bias and establish reliable findings in our umbrella review, we assessed the methodological rigor of the included meta-analyses, recognizing the importance of quality assessment (Fusar-Poli & Radua, 2018). While tools like the Assessment of Multiple Systematic Reviews (AMSTAR) exist, they are primarily designed for healthcare research (Kung et al., 2010). This highlights a critical gap in the social sciences where dedicated instruments for evaluating the quality of meta-analyses are lacking.
We adapted the revised AMSTAR (R-AMSTAR) scale (Kung et al., 2010) for our research. The R-AMSTAR
checklist is well-suited for our study as it assesses 11 core methodological aspects of meta-analyses: (a) a priori design; (b) duplicate study selection and data extraction; (c) comprehensiveness of literature search; (d) inclusion of grey literature; (e) list of included and excluded studies; (f) characteristics of included studies; (g) quality assessment of included studies; (h) appropriate use of study quality in formulating conclusions; (i) methods for combining findings; (j) publication bias; and (k) conflict of interest (Kung et al., 2010). We made minor modifications to the tool, substituting healthcare-related examples (e.g., disease status) with information pertinent to our research (e.g., grade level). These changes were minimal, preserving the tool’s integrity. The tool’s flexible structure allows it to be applied across various fields, including social science research (e.g., Filiz, 2023).
We developed a comprehensive coding manual through a systematic process involving discussions among the research team to ensure consistent and reliable quality assessment. This manual provided detailed criteria and definitions for evaluating each aspect of the R-AMSTAR checklist. Using this established coding manual, the first and second authors independently assessed the quality of each meta-analysis using the R-AMSTAR checklist (see Supplemental Figures S1 and S2 for the checklist coding and manual, respectively). They coded each criterion within an item as “Yes” (met) or “No” (not met). Following the specific scoring guidelines for each item, they assigned an overall score (1–4), with higher scores indicating better quality. Finally, using the final score (sum of all item scores, with a maximum of 44), they ranked the quality of each meta-analysis as insufficient (0–11), low (12–22), medium (23–33), or high (34–44) based on Youngs’ (2017) criteria. We calculated the IRR to ensure consistency, obtaining 82% agreement and resolving discrepancies to achieve 100% agreement. We retained all studies in the review but reported the quality scores alongside the findings to allow readers to assess the potential influence of methodological limitations on the result.
Overlap Analysis
To ensure the validity and reliability of our umbrella review, we evaluated the extent of overlap across the included meta-analyses, following established guidelines (Hennessey et al., 2019). Overlap occurs when multiple meta-analyses incorporate the same primary studies, potentially introducing bias if not adequately addressed (Faulkner et al., 2022; Hennessy & Johnson, 2020). We assessed overlap using two complementary approaches: visual inspections of citation matrices and interpretation of the Corrected Covered Area (CCA) index (Pieper et al., 2014). We evaluated overlap across the SSR and GDR meta-analyses separately. For the visual checks, we created a table with each primary study citation in a row and each meta-analysis in a separate column, marking the included primary studies with checkmarks (Pieper et al., 2014). We then examined the patterns and relationships in the matrix.
For the second approach, we calculated and interpreted the CCA using data from the citation matrices. The CCA was calculated using the following formula: CCA = (k − r) / (rc − r). In this formula, k is the total number of primary studies across the meta-analyses (i.e., all checkmarks), r is the number of rows (i.e., the number of unique primary studies), and c is the number of columns (i.e., the number of meta-analyses). We used criteria given by Pieper et al. (2014) to assess the degree of overlap: slight (0%–5%), moderate (6%–10%), high (11%–15%), or very high (>15%). Based on these categories, we interpreted the extent of overlap to understand its potential impact on our findings and to ensure that any conclusions drawn account for the possible bias introduced by overlapping studies. Recognizing that high overlap can inflate the precision of meta-analytic estimates, we accounted for this overlap in our review to mitigate potential bias (Hennessy & Johnson, 2020; Pieper et al., 2014).
Coding Procedures
To ensure consistency in the coding process, we employed a systematic approach for developing coding tools and utilized a double-screening method to extract information from the included meta-analyses. The first author initially created a web-based coding survey and a comprehensive coding manual. Subsequently, a co-author reviewed the documents and provided feedback for further refinement. The first and second authors collaboratively discussed the feedback and incorporated the suggested modifications to create draft versions of the coding survey and manual. We conducted a pilot study to assess the draft coding tools’ effectiveness. The first and second authors independently coded a single study using the draft coding manual and survey. They then compared and discussed their coding results, meticulously evaluating any discrepancies to identify areas for improvement in the coding tools. Based on the pilot study’s findings, we made final revisions to the coding manual and survey.
Using the finalized coding tools, the first and second authors independently coded various aspects of each study, including descriptive information (e.g., search year and number of studies) and results of the moderator analyses of intervention and participant characteristics. Notably, we did not extract data on methodological characteristics, such as the nature of treatment conditions, as they were not of interest in this study. The IRR for this process was 92%. After discussing inconsistencies and applying the established
Feature Description
Study Identification We recorded the full title, authors, and publication year of the meta-analysis.
Design of Included Research We recorded the research design used in the primary studies included in the meta-analysis,
distinguishing between reviews synthesizing group design research (i.e., experimental
and quasi-experimental designs) and those synthesizing single-subject research (SSR)
methodologies (e.g., multiple-baseline, multiple-probe across subjects).
Grade Level We coded the range or categories of grade levels targeted in the review (e.g., K–12, K–6,
Grades 1–5).
Search Years We recorded the years covered in the literature search for primary studies included in the
meta-analysis.
MD Criteria We recorded specific details or operationalization of the criteria used to define and identify
students with MD.
Addressed Dependency in Effects We coded if the study included methods to address effect size (ES) dependency, such as
Robust Variance Estimation (RVE). (Yes/No)
Addressed Outliers We recorded whether the methods included specific statistical treatments for outliers (e.g.,
winsorization or sensitivity analyses) of ESs. (Yes/No)
Number of Studies We noted the total number of primary studies included in the meta-analysis.
Number of Effect Sizes We noted the total number of effects extracted from the primary studies included in the
review.
Pooled Effect Size We extracted the mean ESs reported in the studies, specifying the measures used (e.g.,
Cohen’s d, Hedges’ g, and Tau-U). When authors provided estimates with and without
outliers, we prioritized those excluding outliers to avoid bias.
Moderator Analysis Results Through detailed analyses of the moderator sections in each meta-analysis, we extracted
information on potential moderators representing intervention (e.g., intervention model)
and participant characteristics (e.g., academic risk area):
• Categories examined: The specific categorization the authors used for the moderator (e.g., academic risk area: MD vs. MD & RD).
• ESs: Reported ES for each moderator category and the specific measure used.
• Heterogeneity index: Statistical measure of the variability in ESs across studies between and within each moderator category, such as I² (I-squared) or τ² (Tau-squared).
• 95% confidence intervals (CI): Range of plausible values for the estimated ES in each moderator category.
• p-value: Statistical significance level for the observed heterogeneity within each moderator category.
coding definitions, we reached 100% agreement. The features we coded for each study are presented in Table 2.

Connecting Moderators of WPS Outcomes to TII Dimensions

We thoroughly reviewed the literature to understand how intervention and participant characteristics, examined as moderators of WPS outcomes across the meta-analyses in our umbrella review, correlate with relevant TII dimensions. We aimed to connect these potential moderators to the TII framework, focusing on six dimensions, excluding intensity. These dimensions pinpoint factors crucial for intensifying instruction for students with MD. Unlike the others, the intensity aspect concentrates solely on the overall intervention effectiveness (mean effect), which does not align with our goal of understanding how to intensify instruction for these students. The links were not mutually exclusive, allowing for a comprehensive understanding of multiple influences on outcomes. The first and second authors collaboratively drafted the initial linkage, which was reviewed and refined through feedback from two additional authors. This iterative process involved multiple rounds of review and revision until we reached a consensus on all the links between the dimensions and moderators.

Table 3 serves as a structured overview of the findings, presenting the potential theoretical associations between intervention dimensions and various participant and intervention characteristics examined as moderators across the meta-analyses. The first column lists the dimensions (e.g., Dosage) and potential moderators indicative of that dimension. The second column provides a detailed explanation of each dimension’s application in the context of intervention intensity, breaking down the components and explaining their significance in the overall intervention process. For instance, under “Dosage,” the second column distinguishes “Dosage Across the Intervention” from “Dosage Within the Intervention,” highlighting aspects such as the number of
Table 3. Connections Between the Dimensions of the Taxonomy of Intervention Intensity and Intervention and Participant Characteristics Examined as Moderators.
Dimension & associated moderators Description of dimension Explanations
Dimension: Dosage
Associated moderators: (a) Intervention Frequency; (b) Number of Treatment Sessions; (c) Intervention Duration; (d) Group Size
Description of dimension: In intensifying instruction, two areas of instructional dosage need consideration: (a) dosage across the intervention, which refers to the number of minutes per session, total number of sessions, and frequency of instruction, and (b) dosage within session, which addresses factors related to the opportunities for students to actively engage with the intervention and receive corrective feedback. It focuses on the number of opportunities for students to respond and receive feedback during each session.
Explanations:
Dosage Across the Intervention. Three possible study characteristics include intervention frequency, number of treatment sessions, and duration of interventions (Powell, Benz, et al., 2022). An increase in at least one of these characteristics would increase student exposure to explicit instruction, increase the number of opportunities to learn and practice, and subsequently improve student performance.
Dosage Within the Intervention. Reducing instructional group size empowers teachers to personalize instruction and maximize instructional dosage, fostering greater student engagement and opportunities to respond (D. Fuchs et al., 2014). Smaller groups enable frequent corrective feedback, progress monitoring, and individualization, ensuring instruction supports each student’s needs (D. Fuchs et al., 2014).

Dimension: Alignment
Associated moderators: (a) Participant Math Difficulty (MD) Status; (b) Academic Risk Area; (c) English Learner (EL) Status; (d) Subject Area Focus; (e) Culturally Responsive Practices; (f) Content Domain; (g) Word Problem Type/Task; (h) Broad Math Achievement; (i) CCSS-Math Content Standards; (j) CCSS-Math Practice Standards; (k) Student-Directed Learning
Description of dimension: Two essential areas of alignment are critical for word problem instruction: (a) Student alignment: Consider students’ academic characteristics (such as MD status, academic risk area, and EL status), cultural background, and behavioral characteristics when selecting word problem interventions, and (b) Curriculum alignment: Select interventions that align with the specific math content standards and practice standards addressed by the word problems being taught. These two areas of alignment can interact and influence each other. For example, an EL student with MD and low student-directed learning might need an intervention combining CRP strategies and explicit instruction on self-regulation strategies to manage their behavior and learning. The selected intervention should also be linked to the specific math content and practice standards covered in the word problems they must solve.
Explanations:
Student Alignment
The success of WPS interventions hinges on understanding and addressing individual student characteristics. Two key factors influencing intervention outcomes are participants’ MD status (e.g., LD vs. At-risk) and academic risk area (i.e., MD only vs. MD and RD). Research consistently demonstrates that students with LD generally require more intensive instruction than their at-risk peers (Compton et al., 2012; L. S. Fuchs et al., 2002). This increased intensity arises from the specific learning challenges associated with LD, which often require more explicit and long-term support. Furthermore, studies have shown that students experiencing comorbidity of MD and RD perform significantly lower in WPS tasks than those with MD alone (e.g., Myers et al., 2022). This suggests that RD adds additional complexity to the learning process, necessitating tailored intervention approaches that simultaneously address MD and RD (Powell et al., 2020). Therefore, carefully considering student MD status, the presence of comorbid RD, and individual academic risk areas is crucial for ensuring appropriate instructional alignment and maximizing the effectiveness of WPS interventions (Myers et al., 2022).
Additionally, the language demands of word problems can further complicate the situation for English learners (ELs) with MD. This population often faces more significant WPS challenges than their peers without language difficulties (Lei et al., 2020). They must simultaneously decipher the mathematical concepts and the linguistic intricacies of the problem itself (King & Powell, 2023). Consequently, ELs with MD necessitate targeted and individualized instruction that directly addresses their multifaceted learning needs (Lei et al., 2020). This necessitates incorporating culturally responsive practices (CRP) into their word problem instruction. CRP strategies (e.g., utilizing culturally relevant contexts and promoting the use of students’ native languages) may help ELs with MD better connect with the material and overcome the additional linguistic barriers they face (Kong et al., 2023). Addressing the linguistic and cultural needs of ELs with MD is crucial for enhancing their WPS skills (Lei et al., 2020).
Student-directed learning (SDL), where students actively manage their learning process, is crucial in enhancing word problem interventions for students with MD, particularly those with LD. Compared to their typically developing peers, MD/LD students are more likely to display internalizing (e.g., anxiety) and externalizing (e.g., aggression) behaviors that negatively impact their learning (Jordan et al., 2020). SDL empowers students to become more self-aware and develop coping mechanisms for managing these behaviors. Students with high levels of SDL can effectively manage their learning and behavior independently (Fitzpatrick & Knowlton, 2009). However, students exhibiting lower levels of SDL may significantly benefit from WPS programs that incorporate explicit instruction on self-regulation strategies. These strategies equip students with tools like self-monitoring (observing and recording one’s behavior and emotions) and self-talk (positive self-affirmations) to manage their behavior effectively (Feeney, 2022), fostering a more conducive learning environment.
Alignment with Curriculum
Furthermore, students with MD struggle to perform calculations involving number operations and fractions (Aunio et al., 2015; Shin & Bryant, 2017); therefore, the content domain targeted by an intervention should be paramount in instructional alignment.
The subject area focus of the intervention is crucial for linking word problem instruction with the curriculum because it ensures the skills being taught are relevant and applicable to the specific content area (Lei et al., 2020). For example, a WPS intervention targeting both reading comprehension and mathematics may help students make connections between literacy and numeracy, enhancing their understanding of word problems in math contexts. Alternatively, interventions focusing solely on one area can provide more targeted support for that skill. Notably, students with MD and RD might significantly benefit from interventions addressing both areas (Myers et al., 2022). This integrated approach can facilitate connections between literacy and numeracy, enhancing their overall understanding and application within word problem contexts (Arsenault & Powell, 2022a).
Students must also solve different word problems, such as those requiring one or multiple steps. A key consideration is how the type of word problem addresses students’ academic skill deficits (Powell, Berry, et al., 2022).
The Broad Math Achievement score serves as a screening measure of students’ achievement on standardized achievement tests, such as the Woodcock-Johnson Psycho-Educational Battery Tests of Achievement. These tests assess foundational skills, including operations with whole numbers and applied problems. Many students with MD struggle with these foundational skills, which are essential for understanding and solving word problems (Powell et al., 2020). Broad Math scores are valuable as they provide a quantifiable measure of these foundational skills. By considering students’ current achievement levels, educators can pinpoint areas of weakness and customize WPS instruction to bridge these gaps.
Connecting WPS instruction for students with MD with rigorous mathematics standards is also vital (Shin et al., 2020). These students are expected to demonstrate proficiency on assessments built on rigorous general education math content standards, such as the Common Core State Standards-Mathematics (National Governors Association Center for Best Practices & Council of Chief State School Officers, 2010). These assessments may include word problems related to CCSSM’s math content standards (i.e., math topic) and practice standards (i.e., pedagogical approach). Hence, in planning WPS instruction for students with MD, practitioners are encouraged to select interventions that emphasize building a solid foundation in the specific math content standards and the practice standards outlined in the CCSSM (Shin et al., 2021).

Dimension: Attention to Transfer
Associated moderators: (a) Group Size; (b) Intervention Model
Description of dimension: The intervention program should support students in (a) applying mathematical skills or concepts learned in one context to a different context and (b) realizing connections between the concepts they have learned and related skills.
Explanations:
Manipulating instructional group size to create small learning groups can be a powerful strategy for enhancing the transfer of learning (Pai et al., 2015). It allows for personalized instruction, immediate feedback, peer collaboration, and continuous adjustment of instruction based on student needs, all of which contribute to a deeper, more flexible understanding that students can apply in new contexts (Pai et al., 2015).
Furthermore, intensive WPS intervention programs for students with MD should prioritize explicit instruction on the transfer of learning (Powell, Berry, et al., 2022). Simply acquiring WPS knowledge and skills is insufficient; students require the tools and strategies to apply them across contexts and tasks. Intervention models that explicitly guide students through this process are crucial for maximizing the impact of WPS programs and fostering long-term WPS success (Powell & Fuchs, 2018).

Dimension: Comprehensiveness
Associated moderators: (a) Intervention Model
Description of dimension: The number of explicit instruction principles within the intervention is essential for providing clear and concise explanations, modeling efficient solution strategies, providing guided and independent practice, and incorporating strategies to help students monitor their understanding and adjust as needed.
Explanations:
Practitioners’ choice of intervention model may significantly impact the comprehensiveness of WPS instruction for students with MD. Schema-based instruction (SBI) models are multi-component programs offering a holistic approach to WPS instruction. These programs go beyond simply teaching problem-solving steps; they explicitly guide students to identify the underlying structure (schema) of each problem, represent it accurately, formulate a plan for solving it, execute the plan, and check the solution’s appropriateness (Powell et al., 2020). Additionally, SBI incorporates strategy instruction (e.g., the use of cognitive and meta-cognitive strategies), where students regulate and monitor their learning and use attack strategies, mnemonics, or general heuristics to effectively remember and implement the solution steps (Witzel et al., 2022). In contrast, other WPS models, like strategy instruction and direct instruction, while valuable, are less comprehensive (Myers et al., 2022). While they, like SBI, prioritize assisting students with sequencing the steps of solving problems, they lack the explicit focus on the fundamental problem structure that distinguishes SBI (Lein et al., 2020). This difference can significantly impact the development of students’ deep understanding and flexible WPS skills.

Dimension: Behavioral Support
Associated moderators: (a) Intervention Setting; (b) Student-Directed Learning; (c) Group Size
Description of dimension: Including (a) self-regulation and executive function components and (b) behavioral principles to support productive behavior in word problem intervention may be essential, particularly for students with MD who display behavioral challenges. Specifically, these components may help students with MD to regulate their attention, manage their emotions, and stay on task during word problem instruction.
Explanations:
The setting in which WPS interventions take place and the group size may interact to play a critical role in the effectiveness of behavioral support. Large general education classrooms, while convenient, often limit the ability of practitioners to provide individualized behavioral support to students with MD (Powell, Berry, et al., 2022). This can hinder their ability to focus, manage distractions, and ultimately learn effectively. Smaller instructional groups in separate settings offer a significantly more supportive environment for enhancing behavioral support in WPS interventions (Powell, Berry, et al., 2022). These settings provide several benefits for students with MD, such as increased teacher attention, reduced distractions, and enhanced self-regulation training (Powell, Benz, et al., 2022).
Student-directed learning is also crucial in tailoring behavioral support and WPS intervention strategies for students with MD. As noted, many of these students display behavioral challenges that may inhibit their learning (Jordan et al., 2020). While SDL empowers students with practical self-regulation skills, those requiring additional support may benefit from explicit instruction in self-regulation strategies like self-monitoring and self-talk (Feeney, 2022). When WPS instruction matches students’ individual behavioral needs, educators can create a more supportive and conducive learning environment for all students (Powell, Berry, et al., 2022).

Dimension: Individualization
Associated moderators: (a) Group Size; (b) Intervention Setting
Description of dimension: A key consideration for word problem instructional programs for students with MD is a data-driven process for individualizing intervention, in which the educator systematically adjusts the intervention over time in response to the student’s progress-monitoring data.
Explanations:
Smaller instructional groups are more effective for facilitating instruction as they allow more concentration on students’ needs (Powell, Berry, et al., 2022).
The intervention setting should also be considered given that teachers typically provide such instruction outside the general education classroom (Vaughn et al., 2012).

sessions and opportunities for student engagement. The third column offers explanations, grounded in the literature, of how each moderator may be linked to the dimensions.

Data Analysis

We used a two-step process to answer our research question. In the first step, we conducted a detailed qualitative synthesis of each meta-analysis. Instead of focusing only on characteristics correlated with the TII framework, our approach examined all intervention and participant characteristics reported as potentially influencing intervention effectiveness. This broader approach ensured we captured the full range of potential moderators identified across the studies, aligning with Booth’s (2016) recommendations for transparency. It also enhances the research’s credibility by providing a more complete picture of the factors considered in the original moderator analyses. Specifically, we summarized the results of these analyses to understand how these characteristics affected the outcomes. Our summaries included the ESs, their corresponding confidence intervals (CIs), and measures of heterogeneity, such as I² and tau-squared (τ²).

We used established criteria to evaluate the strength of the ESs and the level of heterogeneity among the studies included in each meta-analysis. We interpreted the magnitude of the reported effects using benchmarks specific to educational interventions in mathematics, established by Bloom et al. (2008). These benchmarks focus on annual gains in student achievement in mathematics, providing a more meaningful context for interpreting the intervention effects than generic benchmarks. For example, an ES of 0.3 might be considered a moderate ES based on the benchmarks set by Bloom et al., indicating an annual gain equivalent to 3 months of additional learning. Higher values of the I² statistic and τ² indicate greater heterogeneity or inconsistency among the included studies. Common interpretations suggest that I² values above 50% or τ² values exceeding 0.1 indicate substantial heterogeneity (Higgins & Thompson, 2002).

The second step focused on a cross-study analysis of moderators correlated with the TII framework (details in Table 3). We aimed to identify intervention and participant characteristics within this framework that consistently exhibited statistically significant moderating impacts on the effectiveness of WPS interventions for students with MD. Using a qualitative approach, we examined moderators demonstrating consistency in the direction and magnitude of their effects across the studies. This analysis sought to identify patterns and trends in how these TII-linked moderators impact intervention effectiveness. Ultimately, we aimed to inform efforts to intensify WPS instruction for students with MD by determining characteristics related to the relevant TII dimensions that consistently moderate intervention effectiveness.

Results

Descriptive Summary of Study Information and Quality Appraisal Results

We present descriptive information and the quality appraisal results for the GDR and SSR reviews in Tables 4 and 5, respectively. A more detailed version of the quality appraisal results is available in the Supplemental Material (see Table S1). Across the GDR reviews (n = 6), researchers synthesized 85 unique primary studies, while the SSR reviews (n = 5) comprised 41 studies. The mean ESs from these reviews were positive, ranging from moderate to large. These findings suggest that students with MD who participated in WPS interventions experienced meaningful improvements in their problem-solving skills. A moderate ES, for example, translates to approximately 3 months of additional learning compared to students who did not receive such interventions (Bloom et al., 2008). Notably, while the authors of the included meta-analyses reported measures of heterogeneity, the majority relied on the Q statistic, which has limitations (Higgins & Thompson, 2002). More robust measures of heterogeneity, such as I² and τ², were not consistently reported. This inconsistency limits our ability to definitively analyze the extent of variation (heterogeneity) across the studies’ findings.

Results of the quality appraisal revealed the included meta-analyses exhibited medium to high methodological quality, with a mean score of 33.75, a median of 34.0, a range of 27–40, and a standard deviation of 3.83. None of the studies were considered insufficient or low. More recent studies (e.g., Lei et al., 2020; Myers et al., 2022; Shin et al., 2021) tended to have higher quality ratings than older ones, suggesting a possible improvement in the methodological quality of meta-analyses over time. Strengths across the meta-analyses included comprehensive literature searches and appropriate outcome syntheses. However, areas for improvement were often related to detailed reporting of study characteristics and publication bias assessment. The distribution of quality ratings (seven medium and six high) bolsters our confidence in the synthesized results.

Overlap in Primary Studies Across the Included Meta-Analyses

The overlap analysis revealed substantial overlap in the included research (CCA: 15% for GDR and 11% for SSR), suggesting the reviews relied heavily on the same primary studies. Visual inspections of the citation matrices
Table 4. Study Design Summary and Quality Appraisal Results for the Meta-Analyses of Group Design Research (GDR).

| Feature | Xin & Jitendra (1999) | Zhang & Xin (2012) | Zheng et al. (2013) | Lein et al. (2020) | Kong et al. (2021) | Myers et al. (2022) |
|---|---|---|---|---|---|---|
| Study Design | | | | | | |
| Grade Level | K to Post-secondary | K to 12 | K to 12 | K to 12 | K to 6 | Grades 1 to 5 |
| Search Years | 1960–1996 | 1996–2009 | 1986–2009 | 1989–2019 | 1990–2019 | 1975–2021 |
| MD Criteria | Diagnosis of a mild disability, such as LD & ADHD, or at risk for math failure | Diagnosis of LD | Scores below the 25th percentile (standard score of 90) on a norm-referenced math test | Diagnosis of LD or scores below the 35th percentile on a standardized math measure | Diagnosis of LD or at risk for LD based on scores below a designated cutoff on a math assessment | Diagnosis of LD or at risk for LD based on scores below a designated cutoff on a math assessment |
| Addressed Dependency in Effects | No | No | No | Yes^a | No | Yes^b |
| Addressed Outliers | Yes | No | Yes | Yes | No | Yes |
| Number of Studies | 14 | 29 | 7 | 31 | 18 | 52 |
| Number of Effect Sizes | 14 | 29 | 7 | 34 | 113 | 231 |
| Pooled Effect Size | d = 0.89 | d = 1.58^c | g = 0.45 (MD and RD)^c; g = 0.95 (MD only)^c | g = 0.56 | g = 1.08 | g = 0.81 |
| Quality Appraisal (R-AMSTAR) Results | | | | | | |
| Overall Score and Quality Rating | 33 (High) | 31 (Medium) | 31 (Medium) | 35 (High) | 27 (Medium) | 40 (High) |
| Score by Each Item | | | | | | |
| 1. A priori design | 3 | 3 | 3 | 3 | 3 | 3 |
| 2. Duplicate study selection and data extraction | 4 | 4 | 4 | 4 | 4 | 4 |
| 3. Comprehensive literature search | 4 | 4 | 4 | 4 | 3 | 4 |
| 4. Inclusion of gray literature | 3 | 3 | 4 | 3 | 3 | 3 |
| 5. List of included and excluded studies | 3 | 3 | 3 | 4 | 1 | 4 |
| 6. Characteristics of included studies | 4 | 4 | 3 | 4 | 3 | 4 |
| 7. Quality assessment of included studies | 1 | 1 | 1 | 1 | 1 | 4 |
| 8. Use of quality in formulating conclusions | 1 | 1 | 1 | 1 | 1 | 4 |
| 9. Methods for combining findings | 4 | 4 | 4 | 4 | 4 | 4 |
| 10. Assessment of publication bias | 3 | 1 | 1 | 4 | 1 | 3 |
| 11. Conflict of interest | 3 | 3 | 3 | 3 | 3 | 3 |

^a The authors selected the treatment that best represented the word problem-solving instruction. ^b Used robust variance estimation (RVE). ^c Lacked assessment of publication bias.
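The overall scores in Table 4 are described in the text as the sum of the 11 item scores (maximum 44). The following minimal Python sketch checks that arithmetic against the item rows transcribed from Table 4. The names `TABLE4_ITEMS`, `REPORTED_TOTALS`, and `overall_score` are illustrative and not part of the original review.

```python
# Item-level R-AMSTAR scores transcribed from Table 4 (items 1-11, in order).
TABLE4_ITEMS = {
    "Xin & Jitendra (1999)": [3, 4, 4, 3, 3, 4, 1, 1, 4, 3, 3],
    "Zhang & Xin (2012)": [3, 4, 4, 3, 3, 4, 1, 1, 4, 1, 3],
    "Zheng et al. (2013)": [3, 4, 4, 4, 3, 3, 1, 1, 4, 1, 3],
    "Lein et al. (2020)": [3, 4, 4, 3, 4, 4, 1, 1, 4, 4, 3],
    "Kong et al. (2021)": [3, 4, 3, 3, 1, 3, 1, 1, 4, 1, 3],
    "Myers et al. (2022)": [3, 4, 4, 3, 4, 4, 4, 4, 4, 3, 3],
}

# Overall scores as printed in the "Overall Score and Quality Rating" row.
REPORTED_TOTALS = {
    "Xin & Jitendra (1999)": 33,
    "Zhang & Xin (2012)": 31,
    "Zheng et al. (2013)": 31,
    "Lein et al. (2020)": 35,
    "Kong et al. (2021)": 27,
    "Myers et al. (2022)": 40,
}


def overall_score(items):
    """Sum 11 item scores, each on the 1-4 scale described in the text."""
    if len(items) != 11 or any(not 1 <= s <= 4 for s in items):
        raise ValueError("expected 11 item scores between 1 and 4")
    return sum(items)


totals = {review: overall_score(items) for review, items in TABLE4_ITEMS.items()}

for review, total in totals.items():
    print(f"{review}: computed {total}, reported {REPORTED_TOTALS[review]}")
```

Each computed sum matches the overall score printed in the table, so the item rows and the totals are internally consistent.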
Table 5. Study Design Summary and Quality Appraisal Results for the Meta-Analyses of Single-Subject Research (SSR).

| Feature | Xin & Jitendra (1999) | Zhang & Xin (2012) | Zheng et al. (2013) | Lei et al. (2020) | Shin et al. (2021) |
|---|---|---|---|---|---|
| Study Design | | | | | |
| Grade Level | K to Post Secondary | K to 12 | K to 12 | K to 12 | K to 6 |
| Search Years | 1960–1996 | 1996–2009 | 1986–2009 | Not Reported | 1975–2020 |
| MD Criteria | Diagnosis of a mild disability, such as LD & ADHD, or at-risk for math failure | Diagnosis of LD | Scores below the 25th percentile (standard score of 90) on a norm-referenced math test | Diagnosis of LD or low math scores and considered as English learners (ELs) | Diagnosis of LD |
| Addressed Dependency in Effects | No | No | No | No | Yes^a |
| Addressed Outliers | Yes | No | Yes | Yes | No |
| Number of Studies | 12 | 10 | 8 | 10 | 20 |
| Number of Effects | 15 | 13 | 27 | 72 | 39 |
| Pooled Effect Size | PND = 89% | PND = 95%^b | d = 0.90^b | TAU = 0.81 | BC-SMD = 4.52^c |
| Quality Appraisal (R-AMSTAR) Results | | | | | |
| Overall Score and Quality Rating | 33 (High) | 31 (Medium) | 31 (Medium) | 36 (High) | 37 (High) |
| Score by Each Item | | | | | |
| 1. A priori design | 3 | 3 | 3 | 3 | 3 |
| 2. Duplicate study selection and data extraction | 4 | 4 | 4 | 4 | 4 |
| 3. Comprehensive literature search | 4 | 4 | 4 | 4 | 4 |
| 4. Inclusion of gray literature | 3 | 3 | 4 | 3 | 2 |
| 5. List of included and excluded studies | 3 | 3 | 3 | 1 | 3 |
| 6. Characteristics of included studies | 4 | 4 | 3 | 4 | 4 |
| 7. Quality assessment of included studies | 1 | 1 | 1 | 4 | 4 |
| 8. Use of study quality in formulating conclusions | 1 | 1 | 1 | 2 | 2 |
| 9. Methods for combining findings | 4 | 4 | 4 | 4 | 4 |
| 10. Assessment of publication bias | 3 | 1 | 1 | 4 | 4 |
| 11. Conflict of interest | 3 | 3 | 3 | 3 | 3 |
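The summary statistics reported for the quality appraisal (mean 33.75, median 34.0, range 27–40, standard deviation 3.83) can be reproduced from the overall scores in Tables 4 and 5 under two assumptions that the text does not state explicitly: each publication is counted once, even when it appears in both tables, and the standard deviation is the population form. The sketch below is only a consistency check under those assumptions; the variable names are illustrative.

```python
from statistics import mean, median, pstdev

# Overall R-AMSTAR scores as printed in Table 4 (GDR) and Table 5 (SSR).
gdr_scores = {
    "Xin & Jitendra (1999)": 33,
    "Zhang & Xin (2012)": 31,
    "Zheng et al. (2013)": 31,
    "Lein et al. (2020)": 35,
    "Kong et al. (2021)": 27,
    "Myers et al. (2022)": 40,
}
ssr_scores = {
    "Xin & Jitendra (1999)": 33,
    "Zhang & Xin (2012)": 31,
    "Zheng et al. (2013)": 31,
    "Lei et al. (2020)": 36,
    "Shin et al. (2021)": 37,
}

# ASSUMPTION: a publication listed in both tables (with the same score)
# contributes a single value, giving eight unique publications.
unique_scores = {**gdr_scores, **ssr_scores}
scores = list(unique_scores.values())

mean_score = mean(scores)
median_score = median(scores)
# ASSUMPTION: population standard deviation (divides by n, not n - 1).
sd_score = pstdev(scores)

print(f"n = {len(scores)}")
print(f"mean = {mean_score:.2f}, median = {median_score:.1f}")
print(f"range = {min(scores)}-{max(scores)}, SD = {sd_score:.2f}")
```

Under these assumptions the sketch returns the same mean, median, range, and standard deviation as the text. Treating all 11 table columns as separate entries gives a different mean, which is why the single-count assumption is made here.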
(see Figures 2 and 3) confirmed this overlap. This high degree of overlap in primary studies validates our choice of a qualitative synthesis approach for assessing the quality of the meta-analyses and generating informative qualitative summaries (Gates et al., 2020). However, due to this high degree of overlap, careful interpretation of the findings from our qualitative synthesis remains crucial.

Qualitative Summaries of Included Meta-Analyses

We present a qualitative summary of each meta-analysis chronologically for easy reference. For reviews reporting results with and without outliers, we focused on the results excluding outliers to minimize bias and obtain more reliable ESs. These qualitative summaries are complemented by the detailed information in Tables 4–6. These tables provide a comprehensive overview of the included meta-analyses, detailing study design characteristics, quality assessment, and the results of moderator analyses. We first summarize the GDR reviews, followed by the SSR reviews.

Group Design Research

Xin and Jitendra (1999). Xin and Jitendra (1999) conducted subgroup analyses to explore the moderating influence of various intervention and participant characteristics on the effectiveness of WPS interventions. Although a majority of the analyzed variables significantly influenced WPS outcomes with moderate to large ESs, the small sample sizes limit the reliability of these findings. Among the seven intervention characteristics examined, five (intervention model, number of treatment sessions, group size, interventionist, and word problem task) had positive impacts, while two (setting and level of student-directed learning) showed no significant effect. The intervention model estimates showed studies classified as technology-based produced the largest mean ESs (d = 1.80; CI = [1.27, 2.33]), followed by those classified as the use of representations (e.g., schema-based instruction [SBI], diagrams, and manipulatives; d = 1.77; CI = [1.43, 2.12]) and strategy instruction (e.g., general heuristics, cognitive, and meta-cognitive strategies; d = 0.74; CI = [0.56, 0.93]). For the number of treatment sessions, long-term interventions (i.e., >1 month) produced the highest ESs (d = 2.51; CI = [1.93, 3.09]), followed by short-term (i.e., seven sessions or less; d = 1.72; CI = [1.46, 1.98]) and intermediate-term (i.e., more than seven sessions but <1 month; d = 0.73; CI = [0.51, 0.95]) interventions.

instruction (k = 20). Regarding the interventionist, interventions delivered jointly by teachers and researchers had the highest ES (d = 6.01; CI = [4.77, 7.24]), followed by those implemented by teachers (d = 1.93; CI = [1.52, 2.34]) and interventions provided by researchers (d = 0.65; CI = [0.49, 0.82]). For the word problem task, one-step word problems had higher ESs (d = 1.89; CI = [1.64, 2.13]) than mixed (one- and multi-step) problems (d = 0.63; CI = [0.43, 0.83]), and multi-step problems had no effect.

Xin and Jitendra (1999) found moderating effects for three participant characteristics. For IQ, students with IQ < 85 scored higher (d = 1.87; CI = [1.26, 2.49]) than those with IQ ≥ 85 (d = 0.51; CI = [0.35, 0.67]). For grade level, post-secondary students had better outcomes (d = 1.68; CI = [1.29, 2.07]) than secondary students (d = 0.78; CI = [0.58, 0.99]) and elementary students (d = 0.47; CI = [0.23, 0.72]). Also, for participant MD status, mixed samples, including at-risk and students with LD, produced the highest ES (d = 1.96; CI = [1.18, 2.74]), followed by at-risk (d = 1.90; CI = [1.54, 2.26]) and LD (d = 0.50; CI = [0.35, 0.65]) samples. The mixed and IQ < 85 samples were small (k = 2 and 3, respectively).

Zhang and Xin (2012). Zhang and Xin (2012) conducted a follow-up study to expand upon the findings of Xin and Jitendra (1999). They conducted a comprehensive moderator analysis using multiple subgroup analyses to examine the impact of seven variables representing educational policies and laws affecting students with disabilities, including No Child Left Behind (NCLB), Response to Intervention (RTI), and the IDEA. Specifically, they explored two variables related to inclusive education (instructional setting and participant status), one variable associated with RTI (diagnostic approach), two variables focused on standards-based education (dependent measure type and intervention model), and two variables tied to mathematics education reform (algebraic problem-solving and word problem tasks).

Results from the study by Zhang and Xin (2012) showed ESs varied based on three of five variables representing intervention characteristics, including the intervention model, instructional setting, and dependent measure type, with moderate to large ESs. For the intervention model, interventions using representations had the highest impact (d = 2.64; CI = [1.96, 3.31]), followed by strategy instruction (d = 1.86; CI = [1.07, 2.64]) and technology-based interventions (d = 1.22; CI = [0.62, 1.81]). For the instructional setting, interventions done in general education class-
In Xin and Jitendra’s analysis, for group size, one-on- rooms (d = 2.60; CI = [1.99, 3.21]) yielded higher
one instruction (d = 2.18; CI = [1.76, 2.61]) produced outcomes than those conducted in special education settings
higher ESs than group instruction (d = 0.54; CI = [0.39, (d = 1.35, CI = [0.86, 1.83]). For dependent measure type,
0.68]). However, the sample of studies using one-on-one researcher-made measures (d = 1.87; CI = [1.48, 2.26])
instruction (k = 5) was smaller than those using group produced higher outcomes than standardized assessments
(d = 0.60; CI = [−0.43, 1.63]). Results showed no variations in impact based on the remaining two intervention characteristics: algebraic problem-solving level (arithmetic vs. pre-algebraic) and problem task type (simple vs. real-world). In addition, neither participant MD status (MD vs. average achieving) nor diagnostic approach (discrepancy model vs. RTI model) impacted WPS effects.
Zheng et al. (2013). Zheng et al. (2013) conducted a meta-analysis to compare the effectiveness of interventions for students with MD and comorbid MD and RD. The authors used subgroup analyses to examine the influence of four factors: academic risk area (MD vs. MD and RD), age (older vs. younger), IQ (high vs. low), and broad math achievement (high vs. low). Only the academic risk area indicator yielded significant results, with medium to large ESs. Results showed interventions had a large positive mean impact for students with MD who received treatment (g = 0.95; CI = [0.58, 1.33]) compared to students with MD in the control groups. However, they reported a conflicting, medium negative mean ES for students with MD and RD (g = −0.45; CI = [−0.72, −0.18]) compared to students with MD and RD assigned to the control groups.
Table 6. Summary and Results of Comparative Analysis of Moderator Effects Across Meta-Analyses.
Note. Y = significant moderator effect reported (p < .05); N = non-significant moderator effect reported (p > .05); CCSS-M = Common Core State Standards for Mathematics. Variables in bold emerged as consistent moderators with uniform results.
a Connections were established between these variables and at least one dimension of the Taxonomy of Intervention Intensity (TII).
b No connections were established between these variables and any of the TII dimensions.
Lein et al. (2020). Lein et al. (2020) conducted a meta-analysis to identify WPS EBPs for serving students with MD within MTSS frameworks. They used subgroup analyses and meta-regression techniques to examine the moderating impact of two participant characteristics (participant
status and grade level) and six intervention characteristics (dependent measure type, intervention model, interventionist, group size, intervention duration, and content focus). Results showed moderating outcomes for one participant (grade level) and three intervention characteristics (dependent measure type, intervention model, and interventionist). The remaining variables did not exhibit a differential impact. Due to the low statistical power, the authors cautioned against definitive interpretations of these findings. For grade level, the effects for elementary students (g = 0.66; CI = [0.42, 0.85]) were larger than those for secondary students (g = 0.33; CI = [0.20, 0.46]). Regarding the dependent measure type, researcher-made tests produced higher ESs than standardized assessments (g = 0.09; CI = [−0.16, 0.34]). For the intervention model, Schema Broadening (and Transfer) Instruction (SBTI; g = 1.06; CI = [0.88, 1.24]) and SBI (g = 0.40; CI = [0.23, 0.58]) yielded larger ESs than strategy (cognitive and explicit) instruction (g = 0.28; CI = [0.06, 0.50]) and models categorized as “other” (g = 0.11; CI = [−0.20, 0.43]). For the interventionist, studies delivered jointly by researchers and school personnel (e.g., teachers) produced the highest ESs (g = 0.76; CI = [0.20, 1.32]), followed by those involving researchers (g = 0.71; CI = [0.47, 0.95]) and school personnel (g = 0.28; CI = [0.18, 0.38]).
Kong et al. (2021). Kong et al. (2021) conducted a selective meta-analysis of interventions for students with MD. Like Lein et al. (2020), they used a combination of subgroup analyses and meta-regression to examine eight potential moderators of treatment. The subgroup analyses targeted five intervention characteristics (intervention duration, number of treatment sessions, group size, interventionist, and dependent measure type), while the meta-regression evaluated three participant characteristics (participant status, EL status, and grade level). The findings supported the moderating effect of most of these variables, with medium to large ESs. However, the medium to high I² values indicate substantial heterogeneity among the included studies.
Results showed the two intervention dosage measures, intervention duration and the number of treatment sessions, influenced WPS outcomes. For intervention duration, 50-minute sessions produced the largest ESs, while 25-minute sessions had the lowest impact. The authors did not report statistics to determine these ESs’ magnitude and their CIs. For the number of treatment sessions in the study by Kong et al. (2021), results indicated interventions with 34 sessions produced the largest impact (g = 3.24; CI = [1.15, 4.96]; I² = 98%), followed by interventions with 26 sessions (g = 1.75; CI = [0.45, 3.05]; I² = 97%), 36 sessions (g = 1.45; CI = [0.53, 2.38]; I² = 94%), and 12 sessions (g = 1.15; CI = [0.15, 2.16]; I² = 94%). Results for the remaining numbers of sessions (18, 20, 24, 32, and 60) were not significant (p > .05). For the dependent measure type, researcher-made tests (g = 1.27; CI = [0.92, 1.63]; I² = 95%) produced higher ESs than standardized assessments (g = 0.37; CI = [0.03, 0.71]; I² = 67%).
Regarding group size, interventions delivered in large groups (g = 1.64; CI = [1.09, 2.19]; I² = 94%) had larger ESs than those provided using small groups (g = 0.78; CI = [0.44, 1.13]; I² = 94%). The estimate for individualized instruction was not significant (p > .05). In terms of the interventionist, teacher-delivered (g = 1.23; CI = [0.73, 1.74]; I² = 90%) and university student-delivered (g = 1.15; CI = [0.73, 1.58]; I² = 96%) interventions had higher effects than researcher-implemented ones (g = 0.35; CI = [0.05, 0.64]; I² = 0%) and those delivered by hired community members (g = 0.36; CI = [0.28, 0.44]; I² = 0%). Studies implemented by parents yielded no effects. However, the authors urged caution in interpreting these findings due to limited studies focusing on individualized instruction and involving researchers and community hires.
Based on the findings of the meta-regression analysis, Kong et al. (2021) concluded that the three examined participant characteristics significantly influenced the variance in treatment effectiveness. Results for participant status showed higher ESs for at-risk students (g = 1.35; CI = [0.94, 1.77]; I² = 96%) than for those with LD (g = 0.74; CI = [0.34, 1.14]; I² = 87%). For EL status, samples comprising mainly non-ELs produced smaller ESs (g = 0.77; CI = [0.41, 1.12]; I² = 87%) than those primarily including ELs (g = 1.40; CI = [0.94, 1.86]; I² = 96%). Kong et al. (2021) adopted a more granular approach to investigate the moderating effect of grade level, examining each grade separately rather than in broad categories. Their findings revealed the largest outcomes for third graders (g = 1.31; CI = [0.95, 1.68]; I² = 95%), followed by fourth graders (g = 0.77; CI = [0.24, 1.31]; I² = 83%). However, results for Grades 2 and 5 were not statistically significant (p > .05) due to limited sample sizes.
Myers et al. (2022). In a recent meta-analysis, Myers et al. (2022) applied an MTSS framework to examine how various factors influence the effectiveness of WPS interventions. They considered three participant characteristics (grade level, participant status, and academic risk area) and nine intervention characteristics (intervention setting, group size, duration, number of sessions, intervention frequency, intervention model, word-problem type, content focus, and fidelity). In addition, the analysis included nine methodological design characteristics that were not directly correlated with the TII framework. First, they calculated the average ES for each level of each variable. Then, they used a meta-regression model to simultaneously explore how these variables influenced the effectiveness of WPS interventions for students with MD.
Among the participant characteristics, only the academic risk area emerged as a significant factor influencing
intervention effectiveness. Students at risk for only MD showed higher average gains (g = 1.04; CI = [0.76, 1.32]) than those with MD and RD (g = 0.66; CI = [0.31, 1.02]). Results of the meta-regression showed the difference in these ESs, represented by the regression coefficient (β), was significant (β = −0.61; CI = [−1.09, −0.13]). Myers et al. (2022) also reported that only two intervention characteristics influenced outcomes: group size and how often the intervention was delivered (intervention frequency). Interventions delivered to larger groups (more than eight students) had higher ESs (g = 1.41; CI = [0.91, 1.91]) than smaller groups (eight or fewer students; g = 0.86; CI = [0.56, 1.15]). The difference in these estimates was statistically significant (β = 1.58; CI = [0.78, 2.38]).
Single-Subject Research
Xin and Jitendra (1999). Xin and Jitendra (1999) conducted subgroup analyses to investigate the moderating effects of three participant characteristics (grade level, IQ, and participant status) and five intervention characteristics (intervention approach, length of treatment, interventionists, word-problem task, and level of student-directed learning) on the efficacy of SSR interventions. Results showed significant differences in ESs based on two intervention characteristics: intervention model and treatment length. Representational models (PND = 100) were more effective than strategy instruction (PND = 87). Intermediate- (PND = 97) and long-term interventions (PND = 100) produced higher ESs than short-term interventions (PND = 49). Nevertheless, these findings warrant cautious interpretation due to the limited sample size.
Zhang and Xin (2012). Like Xin and Jitendra (1999), Zhang and Xin (2012) conducted a meta-analysis of SSR on WPS interventions for students with MD. The authors computed the moderating influence of the same variables examined in group studies, but all the calculations produced PND values close to 100, suggesting no difference in the moderators.
Zheng et al. (2013). Zheng et al. (2013) also analyzed the findings of SSR. They calculated ESs as the PND and then converted these to Cohen’s d for standardization. They examined the moderating influence of four participant characteristics, including age, IQ, broad math achievement, and academic risk area (MD vs. MD and RD vs. no reading scores), and a single intervention feature: the type of materials (curriculum-based vs. experimenter-developed). Among these factors, only the academic risk area showed significant differential outcomes with small to large impacts. Studies of interventions for students with only MD (d = 1.45) showed larger ESs than those for students with both MD and RD (d = 0.58) and those without any reading scores reported (d = 0.35).
Lei et al. (2020). Lei et al. (2020) conducted a meta-analysis to investigate the effectiveness of SSR interventions in improving WPS outcomes for ELs with MD. They used subgroup analyses to examine potential differential effects based on five participant-related (gender, EL status, grade, native language, and participant status) and six intervention-related characteristics (content focus, instructional focus, interventionist, duration, group size, and culturally responsive pedagogy). The findings revealed differential outcomes based on five moderators, with moderate to large ESs: two participant-related (grade level and participant status) and three intervention-related characteristics (interventionist, content focus, and instructional focus). However, it is essential to note the samples of studies used in these analyses were small, which may limit the generalizability of the findings.
In the analysis for grade level by Lei et al. (2020), fourth grade (Tau-U = 1.00; CI = [0.83, 1.00]) and fifth grade (Tau-U = 1.00; CI = [0.53, 1.00]) showed the highest ESs, followed by second grade (Tau-U = 0.82; CI = [0.68, 0.96]) and third grade (Tau-U = 0.75; CI = [0.67, 0.83]). The fourth- and fifth-grade calculations involved nine subjects and one subject, respectively. Regarding participant status, studies including students with LD produced higher ESs (Tau-U = 1.00; CI = [0.81, 1.00]) than those with samples of at-risk learners (Tau-U = 0.78; CI = [0.71, 0.84]). For the interventionist, teacher-delivered interventions were the most effective (Tau-U = 0.91; CI = [0.76, 1.00]), followed by researcher-delivered ones (Tau-U = 0.81; CI = [0.72, 0.90]) and those delivered jointly by teachers and researchers (Tau-U = 0.74; CI = [0.64, 0.84]). For the content focus, fractions interventions yielded higher ESs (Tau-U = 1.00; CI = [0.80, 1.00]) than whole number computations interventions (Tau-U = 0.78; CI = [0.71, 0.84]). For the instructional focus, interventions focused primarily on mathematics instruction (Tau-U = 1.00; CI = [0.81, 1.00]) produced higher ESs than those targeting math and reading jointly (Tau-U = 0.81; CI = [0.73, 0.89]) and those mainly focused on reading (Tau-U = 0.71; CI = [0.60, 0.83]).
Shin et al. (2021). Shin et al. (2021) conducted a multilevel meta-analysis of SSR on WPS for students with LD. They used multiple meta-regression analyses to examine potential moderators of intervention outcomes. The variables they investigated included two intervention characteristics: Common Core State Standards-Mathematics (CCSSM) content standards and CCSSM practice standards. They obtained differential effects based on both these variables. However, the authors reported evidence of publication bias, warranting caution in interpreting the findings. The heterogeneity in effects was medium to high, with I² values ranging from 51.50 to 93.16.
For the content standards, results showed studies addressing operations and algebraic thinking produced
smaller ESs than those focused on other standards, including numbers and operations with fractions, ratio and proportional relationships, geometry, number systems, expressions and equations, and mathematical practice. However, further analysis showed only the contrast between operations and algebraic thinking (BC-SMD = 2.98; CI = [1.82, 4.41]; I² = 52.91) and number system (BC-SMD = 3.04; CI = [0.53, 5.55]; I² = 51.50) yielded a significant and large ES (β = 2.99, p < .05). For the practice standards, the authors compared the ESs of studies focused on reasoning, modeling, and using tools strategically with those addressing a combination of standards (e.g., making sense of problems and attending to precision). The only significant difference in outcomes was a large effect observed between interventions addressing a combination of reasoning, modeling, and using tools strategically and those focused on making sense of problems, attending to precision, and looking for and using structure (β = 3.24, p < .05).
Summary of Findings Across Studies. The moderator analyses examining intervention and participant characteristics yielded reasonably consistent results across the SSR meta-analyses. The SSR review findings consistently showed that the most effective interventions emphasized explicit instruction in problem-solving strategies, such as representational techniques and strategy instruction. In addition, interventions that were longer in duration (i.e., intermediate or long term) were more effective than those that were shorter (i.e., short term). Notably, these meta-analyses differed in the specific moderators they examined. For instance, Xin and Jitendra (1999) and Zhang and Xin (2012) solely focused on the moderators of intervention approach and length of treatment, while Lei et al. (2020) additionally examined the moderators of grade level, participant status, interventionist, content focus, and instructional focus. Shin et al. (2021) examined CCSSM content and practice standards.
Results of the GDR reviews were mixed, with both consistent and inconsistent results. For example, all studies examining the effect of the dependent measure type consistently reported that researcher-made tests produced larger ESs than standardized measures. Moreover, Lein et al. (2020) and Myers et al. (2022) concluded that intervention duration, measured by the total number of hours of instruction, did not influence the reported outcomes. Also, all three studies examining the content domain (Lein et al., 2020; Myers et al., 2022; Zhang & Xin, 2012) reported no moderating influences.
However, we observed several inconsistencies across the GDR reviews. For instance, Zhang and Xin (2012) identified a significant moderating effect for the intervention setting, while Xin and Jitendra (1999) and Myers et al. (2022) did not detect variations in impact related to this variable. Similarly, Xin and Jitendra (1999) reported differential results associated with the word problem type or problem task, while Zhang and Xin (2012) and Myers, Witzel, et al. (2022) identified no evidence of its impact on the outcomes. In addition, all the reviews except that of Zheng et al. (2013) examined the effect of participants’ MD status, with two (Kong et al., 2021; Xin & Jitendra, 1999) finding evidence supporting its moderating influence and three (Lein et al., 2020; Myers et al., 2022; Zhang & Xin, 2012) reporting contradictory results.
The inconsistent findings among GDR reviews extended beyond differences in conclusions about the significance of a particular moderator. Researchers also reported differences in the direction (positive or negative) and magnitude of the ESs within categories of variables representing moderators that consistently influenced outcomes, such as academic risk area and group size. For example, while Myers et al. (2022) and Zheng et al. (2013) determined that students’ academic risk area moderated treatment, they reported contrasting results. Zheng et al. (2013) identified adverse impacts for students with MD and RD, while Myers et al. (2022) inferred that the interventions positively impacted WPS outcomes. Similarly, although three out of four studies examining group size as a predictor of intervention outcomes showed significant moderating effects, the category with the most substantial impact varied. In two studies (Kong et al., 2021; Myers et al., 2022), larger groups had larger ESs than smaller groups. Conversely, the third study (Xin & Jitendra, 1999) reported smaller groups (individual instruction) yielded more favorable outcomes. Similar trends emerged for the results supporting the moderating impact of the interventionist and intervention models.
The divergent findings across these GDR meta-analyses stem from differences in statistical design, variable specification, inclusion criteria, and scope of the reviews. Some meta-analyses (e.g., Lein et al., 2020; Myers et al., 2022) used statistical designs addressing publication bias and outliers, while others (e.g., Kong et al., 2021; Zhang & Xin, 2012) did not adequately account for these potential sources of distortion. In addition, some meta-analyses (e.g., Zheng et al., 2013) aggregated data at the study level, while others (e.g., Myers et al., 2022) considered each ES individually, leading to variations in the overall estimates.
Two notable discrepancies in variable specifications are particularly evident in the indicators of participants’ MD status and group size. Regarding participants’ MD status, Zhang and Xin (2012) and Xin and Jitendra (1999) used the discrepancy model to define LD but used different coding criteria for this variable. For instance, when studies included a diverse sample of students with MD, including low-performing students and those with LD but lacking formal LD determination, Zhang and Xin classified these samples as “at-risk,” while Xin and Jitendra categorized similar samples as “mixed.” Similarly, authors differed in how they
defined group size. Some authors provided detailed breakdowns (e.g., whole class, small group, small-to-medium group, one-on-one; Lein et al., 2020), while others used more straightforward yet clearly defined classifications (i.e., small group ≤8 vs. large group >8; Myers et al., 2022). Still other researchers (i.e., Kong et al., 2021) used corresponding categories (i.e., small group, large group, individual) without precise definitions.
The group meta-analyses also exhibited variations in their inclusion criteria and research scope. For instance, Zheng et al. (2013) used a stringent criterion for MD, identifying students with MD as those scoring at or below the 25th percentile on standardized math achievement assessments. In contrast, Lein et al. (2020) used a less strict threshold, specifically the 35th percentile, to define this group. Myers et al. (2022) identified students with MD based on low scores on standardized achievement tests but did not explicitly specify a cutoff score. Also, the meta-analysis conducted by Zhang and Xin (2012) had a narrower focus, centering on interventions within the context of education reforms and laws impacting mathematics education for students with disabilities, such as IDEA and RTI. Similarly, Zheng et al. (2013) used a selective analysis to compare the outcomes of students with MD to those with comorbid MD and RD, resulting in a more targeted analysis. In contrast, Myers et al. (2022) and Lein et al. (2020) took a broader approach, emphasizing WPS interventions within the MTSS framework.
The substantial variability in results and contradictory outcomes for the essential intervention and participant characteristics related to intensive interventions across the meta-analyses suggest that practitioners intensifying and tailoring WPS interventions for students with MD within DBI instructional frameworks should carefully consider the findings of multiple meta-analyses. Furthermore, these findings emphasize the necessity for a qualitative umbrella analysis that thoroughly examines and synthesizes results, offering a more holistic understanding of the factors influencing the outcomes of WPS interventions for students with MD.
Findings of Umbrella Analysis: Consistent Moderators of WPS Interventions
Table 6 summarizes the moderating variables examined across the reviews and their findings. Among 24 unique variables investigated across the reviews, 16 were intervention-related, and eight were participant-related. Findings showed 17 variables (13 intervention and 4 participant characteristics) directly mapped onto at least one TII dimension, highlighting their potential for intensifying and tailoring WPS interventions for students with MD. The remaining seven (gender, dependent measure type, grade level, LD diagnostic approach, IQ scores, fidelity, and interventionist) fell outside the scope of the TII dimensions. Notably, the type of dependent measure produced consistent results, with several studies reporting higher effects for researcher-made measures than for standardized assessments.
Our analysis revealed that most of the 17 potential moderators correlated with the TII framework did not significantly influence WPS outcomes. Factors like word problem type and content focus lacked consistent effects across studies. In addition, moderators with limited coverage, like EL status and broad math achievement (examined by only one or two studies), made it difficult to draw definitive conclusions about their moderating role. Inconsistencies also emerged for other moderators, such as participant MD status, where some studies showed significant effects while others did not.
We identified four variables that consistently moderated WPS outcomes: intervention model, academic risk area, number of treatment sessions, and group size. Notably, the academic risk area represents the sole participant characteristic among these moderators. The remaining three variables (intervention model, number of treatment sessions, and group size) pertain to intervention characteristics. The ESs associated with these moderators were generally substantial and positive, suggesting these factors hold substantial educational significance for improving the effectiveness of WPS interventions for students with MD (Bloom et al., 2008). We summarize the results of each of these moderators in the following sections.
Intervention Model. We identified the intervention model as a consistent moderator of treatment outcomes linked to two TII dimensions: attention to transfer and comprehensiveness. Three GDR reviews (Lein et al., 2020; Xin & Jitendra, 1999; Zhang & Xin, 2012) and one SSR review (Xin & Jitendra, 1999) consistently showed interventions using representational techniques outperformed other models, despite the overall effectiveness of all models. In particular, SBI techniques, which require students to identify word problem structures, produced higher ESs than other models, such as technology-based interventions and strategy instruction. SBI models yielded moderate-to-large ESs, ranging from 0.40 to 2.46. Similarly, Xin and Jitendra (1999) reported higher ESs for SBI than for other models in their SSR analysis. Myers et al. (2022) inferred that SBI produced larger ESs than strategy instruction and other models, but the differences in the effects were not statistically significant.
Academic Risk Area. Three meta-analyses, comprising two GDR reviews (Myers et al., 2022; Zheng et al., 2013) and one SSR (Zheng et al., 2013), explored the moderating influence of academic risk area, a crucial factor in
models, such as strategy instruction. We hypothesize that the promising results for SBI are due to its association with critical dimensions of intensive instruction, namely comprehensiveness and attention to transfer. SBI’s multi-step and strategic approach to WPS provides a more comprehensive framework than other strategies, such as strategy instruction (Lein et al., 2020). SBI also helps students to apply metacognitive (e.g., self-questioning) and cognitive techniques (e.g., mnemonics) to select and use appropriate schematic diagrams or equations to represent the problem’s underlying structure, choose an appropriate attack strategy, apply proper algorithms to obtain a solution, and verify their answers (Witzel et al., 2022).
The attention to transfer is also evident in SBI as it equips students with a strategic process that can be applied to different problem types, such as problems with irrelevant information and multi-step and authentic/real problems (Powell & Fuchs, 2018). In contrast, while other models, such as general heuristics strategies, may be transferable to different types of problems, they offer a more simplified and generic approach to problem-solving, where students apply cognitive techniques (e.g., mnemonics) or attack strategies (e.g., FOPS: Find the Problem Type; Organize Information; Plan to Solve; and Solve and Check) to remember the sequential steps involved in solving a problem (Lein et al., 2020). These models also may not explicitly include behavior supports that benefit students with MD who require supplemental support. Although promising, SBI models need research to assess their impact on critical aspects of instructional intensity, such as their impact on students’ behavioral and transfer outcomes and long-term effects (Powell, Benz, et al., 2022). Additional research is needed to bolster these conclusions.
Academic Risk Area. Our analysis revealed a consistent pattern across the studies: Interventions were more effective for students with MD only than for students with both MD and RD. This conclusion supports previous research suggesting that students with MD and RD face additional challenges due to deficits in essential WPS skills, such as comprehension and vocabulary decoding, which hinder their performance on WPS tasks (Powell et al., 2020). While students with MD and RD can benefit from WPS interventions, they tend to show lower gains than their peers with MD only (Myers et al., 2022). These results underscore the importance of instructional alignment in meeting the diverse learning needs of students. Tailoring supplemental instruction within MTSS to address the specific challenges faced by students with MD and RD is crucial for maximizing their learning outcomes (Powell et al., 2020). Understanding students’ prior academic performance, including reading proficiency levels, is essential for align-
additional support in foundational reading skills before they can fully benefit from interventions for enhancing their WPS proficiency (Powell et al., 2020). For example, providing explicit instruction in vocabulary development and comprehension strategies alongside WPS instruction, such as planning, organization, and self-monitoring, can empower students with MD and RD to participate more actively in the writing process and achieve greater gains (Arsenault & Powell, 2022b). Tailoring instruction within MTSS to address these specific challenges can significantly improve these students’ WPS outcomes. Further research is warranted to validate these findings and explore the most effective instructional approaches for supporting WPS outcomes among students with MD and RD.
Number of Intervention Sessions. Our analysis suggests that longer interventions are generally more effective for students with MD who require additional support. While a definitive minimum number of sessions cannot be established, the findings indicate that interventions lasting at least 30 sessions may be necessary to produce meaningful outcomes. This conclusion is congruent with research by Myers et al. (2022), which suggests a positive correlation between the number of treatment sessions and instructional frequency, a known predictor of WPS intervention effectiveness for students with MD. This result is vital for intensifying WPS instruction for students with MD who require additional support. These students often have substantial deficits in areas critical for WPS skills, such as foundational math knowledge and metacognitive strategies (Powell, Benz, et al., 2020). Longer interventions provide more dosage, a key concept within the TII framework, allowing these students to practice and receive corrective feedback to address these deficits. In addition, extended interventions enable more in-depth exploration of concepts, targeted practice opportunities tailored to individual needs, and more opportunities to adjust instruction to address the unique needs of each student, including those exhibiting behavioral challenges (Powell, Benz, et al., 2022). However, further research is needed to refine these findings.
Group Size. Our analysis revealed that group size consistently impacts WPS outcomes for students with MD. However, the evidence regarding the optimal group configuration for instruction is inconclusive. While some meta-analyses showed benefits for large or small groups, others reported higher effects for one-on-one interventions. Notably, the meta-analyses examining one-on-one instruction had limited sample sizes, hindering the generalizability of their findings. The potential higher benefits of large-group instruction may be due to positive peer interaction and exposure to successful WPS strategies used by classmates
ing instruction to meet their unique needs (Abrams et al., (L. S. Fuchs et al., 2006). Another plausible explanation is
2016). Students with MD and RD typically require that large groups tend to have students with varying
baseline skills. Students with higher starting points might benefit from observing their peers, while those with lower skills might struggle to keep pace. The inconsistent findings may also be due to differences in scope, inclusion criteria, and statistical design across the meta-analyses we examined (see Tables 4 and 5). Additional research is needed to offer better guidance on the most effective instructional group configurations for providing WPS instruction for students with MD.

Considering the relevant dimensions of the TII linked to group size, including dosage, attention to transfer, behavioral support, and individualization, it is crucial to address the specific needs of students with MD when intensifying WPS instruction. While large-group instruction can be beneficial under certain circumstances, it is vital to consider the specific individualized needs of students with MD, especially those requiring intensive support, when intensifying WPS instruction within the TII framework. Group instruction may not be optimal for these students for several reasons. First, large-group settings may limit the time available for tailored instruction and feedback (D. Fuchs et al., 2014), which can be detrimental to these students who require more individualized support. Second, the pace of instruction in large groups might be too fast for them, limiting their ability to transfer their learning to new situations (Pai et al., 2015). Third, large instructional groups can hinder teachers’ ability to effectively manage the classroom environment, identify and support students needing additional behavioral support, and teach self-regulation strategies to students with behavioral challenges (Powell, Benz, et al., 2022). Finally, smaller instructional groups allow teachers to tailor instruction to students’ individual needs, offer more personalized support, and collect reliable progress data (Powell, Berry, et al., 2022; Vaughn et al., 2012). Therefore, when designing WPS interventions for students with MD, particularly those requiring intensive support, carefully considering the instructional group size and tailoring the approach to individual needs is crucial for maximizing effectiveness.

Other Notable Findings. A significant observation from our study is the notable impact of intervention frequency, a crucial measure of intervention dosage. The meta-analysis conducted by Myers et al. (2022) revealed substantial ESs for interventions administered at least three times a week (g = 1.15), as well as those provided once or twice per week (g = 0.76). The meta-regression results indicated significant differences in these estimates, whereby more frequent interventions resulted in greater student gains than those delivered once or twice weekly. This conclusion is consistent with research on aiding students with MD in operations involving whole numbers (Codding et al., 2016). Our findings suggest that providing instruction more frequently may offer students with MD additional opportunities for peer interaction, participation in explicit instruction, independent practice, and receipt of corrective feedback (L. S. Fuchs et al., 2017). However, we recommend careful interpretation of this finding as we based it on a single meta-analysis. Further research is required to understand the impact of intervention frequency on WPS interventions.

Our analysis also uncovered several noteworthy null findings. The results suggest that various intervention characteristics, including the intervention setting, word-problem type or task, and content domain, did not significantly influence intervention effectiveness. These findings are noteworthy given that these variables correlate with essential dimensions of instructional intensity for students with MD, such as behavioral support and alignment (Powell, Benz, et al., 2022). Therefore, the absence of significant differential effects implies teachers may have flexibility in selecting the setting (e.g., general or special education classrooms) for implementing WPS interventions for students with MD. Also, teachers may utilize WPS interventions to improve students’ scores on measures covering various word problem types (e.g., one-step, multiple-step, and real-world) across diverse content domains, such as fractions and computations.

Limitations

We note several limitations in our study. First, the small sample size of meta-analytic studies (n = 11) used in our analysis may limit the conclusiveness of its findings. This lack of research underscores the need for more primary research on EBPs promoting WPS outcomes among students with MD. Second, we did not empirically synthesize findings across the meta-analyses, preventing us from making clear inferences about the exact mean magnitude of ESs across studies. Variations in the statistical and clinical designs across the studies examined precluded our ability to empirically summarize the literature via a meta-analysis of the meta-analyses (McKenzie & Brennan, 2022). Third, we only included peer-reviewed meta-analyses, excluding unpublished reviews such as dissertations, which may introduce publication bias. Fourth, our use of a modified tool (R-AMSTAR) for assessing methodological quality, while a common practice in umbrella reviews, may introduce limitations. Social science research can differ from healthcare research in its approach to evidence and inquiry. Ideally, a tool specifically designed for evaluating the quality of social science meta-analyses might capture these nuances more effectively. A fifth limitation of our study is the assessment of heterogeneity. Most meta-analyses we reviewed reported the Q statistic primarily, with few providing I² or τ² values, restricting our ability to infer the magnitude and distribution of heterogeneity. The Q statistic indicates significant heterogeneity but lacks detail on its extent or nature. Metrics such as I² and τ² are more appropriate for a substantive discussion on heterogeneity (Higgins & Thompson,
2002). The absence of these metrics restricted our comprehensive assessment and comparison of heterogeneity across studies. Finally, we note that potential biases arising from overlapping primary studies in the meta-analyses we synthesized deserve consideration. While such overlap can introduce biases and threaten the validity of findings, it is often inevitable and can even have advantages (Hennessy & Johnson, 2020). In this case, the overlap suggests our meta-analyses drew from a consistent pool of conceptually similar studies, potentially leading to more reliable evidence (Hennessy & Johnson, 2020). While potential biases remain, our findings offer valuable insights into crucial moderators for intensifying WPS interventions, providing essential guidance for future research and practice.

Implications for Practice

Our study has five notable implications for using the TII to tailor WPS interventions for students with MD within MTSS. First, practitioners may find it beneficial to adopt SBI as an effective instructional strategy for WPS interventions due to its association with critical dimensions of intensive instruction, including comprehensiveness, behavioral support, and attention to transfer. SBI’s multi-step intentional approach to solving word problems offers a more comprehensive framework than other models, such as strategy instruction (Jitendra et al., 2021). It encourages students to use metacognitive strategies, such as self-instruction and self-monitoring, which are crucial for promoting behavioral regulation and fostering the transferability of students’ WPS skills to various problem types (Powell & Fuchs, 2018). Second, our findings demonstrate consistently larger ESs for researcher-developed WPS measures than standardized assessments, suggesting potential implications for data-collection practices in progress monitoring and intervention evaluation and intensification for students with MD. We urge practitioners to exercise caution when selecting measures for progress monitoring. Researcher-made assessments may inflate ESs, while standardized ones may underestimate them (Scammacca et al., 2007). Both measures may improve practitioners’ understanding of student WPS progress and inform targeted instructional adjustments. Third, we deduced that students with both MD and RD exhibited lower WPS outcomes than their peers with MD only. Therefore, it is crucial to anticipate additional WPS challenges among this student group and plan proactively to address these challenges. We recommend that practitioners use pretreatment performance data to identify students with MD who demonstrate language difficulty and RD and embed language and reading supports within WPS interventions targeting these students (Arsenault & Powell, 2022b). Also, we urge practitioners to incorporate strategies, such as paraphrasing and explicit instruction in comprehension and vocabulary, to help students with RD overcome reading challenges, ultimately advancing their WPS abilities (Powell et al., 2019). Fourth, our results for the number of intervention sessions suggest that practitioners should consider extending WPS interventions for students with MD. This extended instruction allows for more in-depth teaching, personalized practice, and targeted feedback tailored to students’ needs (Powell, Benz, et al., 2022). Finally, our null findings for important intervention characteristics, including setting, word problem type or task, and content domain, suggest that the WPS interventions we analyzed may be effectively implemented in various locations, including general or special education settings, across diverse content domains, and using multiple word problem types, without compromising intervention effectiveness. These findings highlight the potential flexibility and adaptability of WPS interventions for students with MD, enabling teachers to tailor instruction to meet individual student needs.

Implications for Research

The results of our umbrella review highlight several critical areas for future research on WPS interventions for students with MD. First, the inconsistencies in our results suggest the need for a more comprehensive approach to identifying gaps in the existing research. Evidence mapping, a visual tool for identifying areas where research is lacking, could be instrumental in exploring these gaps and advancing the field (Saran & White, 2018). Second, the limited sample sizes in previous meta-analyses necessitate more research. This research should include primary studies that involve groups of participants (e.g., GRD designs) and those focused on a single participant over time (SSR designs) on WPS interventions for students with MD. Building a richer research base will enable researchers to conduct more robust meta-analyses and ultimately identify factors influencing these interventions’ outcomes.

Third, future primary research incorporating designs that directly manipulate features of WPS interventions (e.g., number of sessions) may be necessary to understand how these features impact student outcomes. However, researchers must prioritize student well-being and adhere to ethical guidelines when designing interventions. Fourth, future research should consistently report suitable metrics to enable more robust evaluations of variability in intervention effects. As noted, indices such as I² and τ² provide more accurate assessments of the variation in results between studies, which facilitates better comparisons across studies. Fifth, developing and validating a tool for more precise and relevant evaluations of methodological quality in social science meta-analyses is crucial. This instrument would enhance the reliability of findings and provide more accurate guidance for educational interventions. Finally, given
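
The heterogeneity indices named in the Limitations section (Q, I², and τ²) are related by simple formulas. The sketch below is illustrative only: the function name `heterogeneity` and the input values are hypothetical and are not taken from any of the reviewed meta-analyses, and the DerSimonian-Laird estimator for τ² is one common choice assumed here, not the method of any cited study. It shows why Q alone says little about the extent of heterogeneity, whereas I² (Higgins & Thompson, 2002) expresses it as a proportion of total variation and τ² as a between-study variance.

```python
from typing import Sequence, Tuple


def heterogeneity(effects: Sequence[float],
                  variances: Sequence[float]) -> Tuple[float, float, float]:
    """Return (Q, I2 in percent, tau2) for k independent effect sizes.

    Hypothetical helper for illustration; inputs are effect sizes
    (e.g., Hedges' g) and their sampling variances.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two effects with matching variances")

    weights = [1.0 / v for v in variances]  # inverse-variance weights
    sum_w = sum(weights)
    pooled = sum(w * g for w, g in zip(weights, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (g - pooled) ** 2 for w, g in zip(weights, effects))
    df = len(effects) - 1

    # I2: share of total variation attributable to between-study differences.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # tau2: between-study variance (DerSimonian-Laird, assumed here).
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)
    return q, i2, tau2
```

For two hypothetical effect sizes of 0.2 and 0.8, each with a sampling variance of 0.04, the function returns Q = 4.5, I² of approximately 77.8%, and τ² = 0.14.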
contributing to their difficulty. Frontiers in Psychology, 6, Article 348. https://doi.org/10.3389/fpsyg.2015.00348
Decker, S. L., & Roberts, A. M. (2015). Specific cognitive predictors of early math problem solving. Psychology in the Schools, 52(5), 477–488. https://doi.org/10.1002/pits.21837
Doabler, C. T., Clarke, B., Kosty, D., Turtura, J. E., Sutherland, M., Maddox, S. A., & Smolkowski, K. (2021). Using direct observation to document “Practice-Based Evidence” of evidence-based mathematics instruction. Journal of Learning Disabilities, 54(1), 20–35. https://doi.org/10.1177/0022219420911375
Faulkner, G., Fagan, M. J., & Lee, J. (2022). Umbrella reviews (systematic review of reviews). International Review of Sport and Exercise Psychology, 15(1), 73–90. https://doi.org/10.1080/1750984X.2021.1934888
Feeney, D. M. (2022). Self-talk monitoring: A how-to guide for special educators. Intervention in School and Clinic, 57(5), 298–305. https://doi.org/10.1177/10534512211032575
Feliz, V. A. (2020). Educational practices that decrease opportunity gaps in literacy. Journal for Leadership, Equity, and Research, 6(2), 1–21. https://journals.sfu.ca/cvj/index.php/cvj/article/view/101
Filiz, T. (2023). The effect of mathematics difficulty intervention programs on mathematics performance: A second-order meta-analysis. Research on Education and Psychology, 7(Special Issue 2), 454–477. https://doi.org/10.54535/rep.1360558
Fitzpatrick, M., & Knowlton, E. (2009). Bringing evidence-based self-directed intervention practices to the trenches for students with emotional and behavioral disorders. Preventing School Failure: Alternative Education for Children and Youth, 53(4), 253–266. https://doi.org/10.3200/PSFL.53.4.253-266
Fuchs, D., Fuchs, L. S., Mathes, P. G., Lipsey, M. W., & Roberts, P. H. (2002). Is “learning disabilities” just a fancy term for low achievement? A meta-analysis of reading differences between low achievers with and without the label. Paper written for the Office of Special Education Programs, U.S. Department of Education, and presented at the OSEP’s LD Summit conference, Washington, DC. https://eric.ed.gov/?id=ED459544
Fuchs, D., Fuchs, L. S., & Vaughn, S. (2014). What is intensive instruction and why is it important? TEACHING Exceptional Children, 46(4), 13–18. https://doi.org/10.1177/0040059914522966
Fuchs, L. S., & Fuchs, D. (2002). Principles for the prevention and intervention of mathematics difficulties. Learning Disabilities Research and Practice, 16(2), 85–95. https://doi.org/10.1111/0938-8982.00010
Fuchs, L. S., Fuchs, D., Compton, D. L., Powell, S. R., Seethaler, P. M., Capizzi, A. M., Schatschneider, C., & Fletcher, J. M. (2006). The cognitive correlates of third-grade skill in arithmetic, algorithmic computation, and arithmetic word problems. Journal of Educational Psychology, 98(1), 29–43. https://doi.org/10.1037/0022-0663.98.1.29
Fuchs, L. S., Fuchs, D., & Malone, A. S. (2017). The taxonomy of intervention intensity. TEACHING Exceptional Children, 50(1), 35–43. https://doi.org/10.1177/0040059918758166
Fuchs, L. S., Seethaler, P. M., Sterba, S. K., Craddock, C., Fuchs, D., Compton, D. L., Geary, D. C., & Changas, P. (2021). Closing the word-problem achievement gap in first grade: Schema-based word-problem intervention with embedded language comprehension instruction. Journal of Educational Psychology, 113(1), 86–103. https://doi.org/10.1037/edu0000467
Fusar-Poli, P., & Radua, J. (2018). Ten simple rules for conducting umbrella reviews. Evidence-Based Mental Health, 21(3), 95–100. https://doi.org/10.1136/ebmental-2018-300014
Gates, M., Gates, A., Guitard, S., Pollock, M., & Hartling, L. (2020). Guidance for overviews of reviews continues to accumulate, but important challenges remain: A scoping review. Systematic Reviews, 9(1), 1–19. https://doi.org/10.1186/s13643-020-01509-0
Gersten, R., Chard, D., Jayanthi, M., Baker, S., Morphy, P., & Flojo, J. (2008). Mathematics instruction for students with learning disabilities or difficulty learning mathematics: A synthesis of the intervention research. Portsmouth, NH: RMC Research Corporation, Center on Instruction.
Greco, T., Zangrillo, A., Biondi-Zoccai, G., & Landoni, G. (2013). Meta-analysis: Pitfalls and hints. Heart, Lung, and Vessels, 5(4), 219–225.
Griffin, C. C., Gagnon, J. C., Jossi, M. H., Ulrich, T. G., & Myers, J. A. (2018). Priming mathematics word problem structures in a rural elementary classroom. Rural Special Education Quarterly, 37(3), 150–163. https://doi.org/10.1177/8756870518772164
Hedges, L. V., Tipton, E., & Johnson, M. C. (2010). Robust variance estimation in meta-regression with dependent effect size estimates. Research Synthesis Methods, 1(1), 39–65. https://doi.org/10.1002/jrsm.5
Hennessy, E. A., Johnson, B. T., & Keenan, C. (2019). Best practice guidelines and essential methodological steps to conduct rigorous and systematic meta-reviews. Applied Psychology: Health and Well-Being, 11(3), 353–381. https://doi.org/10.1111/aphw.12169
Hennessy, E. A., & Johnson, B. T. (2020). Examining overlap of included studies in meta-reviews: Guidance for using the corrected covered area index. Research Synthesis Methods, 11(1), 134–145. https://doi.org/10.1002/jrsm.1390
Higgins, J. P., & Thompson, S. G. (2002). Quantifying heterogeneity in a meta-analysis. Statistics in Medicine, 21(11), 1539–1558. https://doi.org/10.1002/sim.1186
Horner, R. H., Carr, E. G., Halle, J., McGee, G., Odom, S., & Wolery, M. (2005). The use of single-subject research to identify evidence-based practice in special education. Exceptional Children, 71(2), 165–179. https://doi.org/10.1177/00144029050710020
Jitendra, A. K., Alghamdi, A., Edmunds, R., McKevett, N. M., Mouanoutoua, J., & Roesslein, R. (2021). The effects of tier 2 mathematics interventions for students with mathematics difficulties: A meta-analysis. Exceptional Children, 87(3), 307–325. https://doi.org/10.1177/0014402920969187
Jordan, R. L. P., Fernandez, E. P., Costa, L.-J. C., & Hooper, S. R. (2020). Internalizing and externalizing behaviors of children with writing disabilities. Learning Disabilities Research & Practice, 35(2), 72–81. https://doi.org/10.1111/ldrp.12216
King, S. G., & Powell, S. R. (2023). Language proficiency and the relation to word-problem performance in emergent bilingual students with mathematics difficulties. Learning Disabilities Research and Practice, 38(4), 263–273. https://doi.org/10.1111/ldrp.12325
Kintsch, W., & Greeno, J. G. (1985). Understanding and solving word arithmetic problems. Psychological Review, 92(1), 109–129. https://doi.org/10.1037/0033-295X.92.1.109
Kong, J. E., Arizmendi, G. D., & Doabler, C. T. (2023). Implementing the science of math in a culturally sustainable
framework for students with and at risk for math learning disabilities. TEACHING Exceptional Children, 56(1), 44–51. https://doi.org/10.1177/00400599221127385
*Kong, J. E., Yan, C., Serceki, A., & Swanson, H. L. (2021). Word-problem-solving interventions for elementary students with learning disabilities: A selective meta-analysis of the literature. Learning Disability Quarterly, 44(4), 248–260. https://doi.org/10.1177/0731948721994843
Kung, J., Chiappelli, F., Cajulis, O. O., Avezova, R., Kossan, G., Chew, L., & Maida, C. A. (2010). From systematic reviews to clinical recommendations for evidence-based health care: Validation of revised assessment of multiple systematic reviews (R-AMSTAR) for grading of clinical relevance. The Open Dentistry Journal, 4, 84–91. https://doi.org/10.2174/1874210601004020084
*Lei, Q., Mason, R. A., Xin, Y. P., Davis, J. L., David, M., & Lory, C. (2020). A meta-analysis of single-case research on mathematics word problem-solving interventions for English learners with learning disabilities and mathematics difficulties. Learning Disabilities Research & Practice, 35(4), 201–217. https://doi.org/10.1111/ldrp.12233
*Lein, A. E., Jitendra, A. K., & Harwell, M. R. (2020). Effectiveness of mathematical word problem solving interventions for students with learning disabilities and/or mathematics difficulties: A meta-analysis. Journal of Educational Psychology, 112(7), 1388–1408. https://doi.org/10.1037/edu0000453
Leiss, D., Plath, J., & Schwippert, K. (2019). Language and mathematics—Key factors influencing the comprehension process in reality-based tasks. Mathematical Thinking and Learning, 21(2), 131–153. https://doi.org/10.1080/10986065.2019.1570835
Lombardi, A. R., Kowitt, J. S., & Staples, F. E. (2015). Correlates of critical thinking and college and career readiness for students with and without disabilities. Career Development and Transition for Exceptional Individuals, 38(3), 142–151. https://doi.org/10.1177/2165143414534888
Mason, E. N., Benz, S. A., Lembke, E. S., Burns, M. K., & Powell, S. R. (2019). From professional development to implementation: A district’s experience implementing mathematics tiered systems of support. Learning Disabilities Research and Practice, 34(4), 207–214. https://doi.org/10.1111/ldrp.12206
McKenzie, J. E., & Brennan, S. E. (2022). Chapter 12: Synthesizing and presenting findings using other methods. In J. P. T. Higgins, J. Thomas, J. Chandler, M. Cumpston, T. Li, M. J. Page, & V. A. Welch (Eds.), Cochrane handbook for systematic reviews of interventions (Version 6.3). www.training.cochrane.org/handbook
Myers, J. A., Brownell, M. T., Griffin, C. C., Hughes, E. M., Witzel, B. S., Gage, N. A., Peyton, D., Acosta, K., & Wang, J. (2021). Mathematics interventions for adolescents with mathematics difficulties: A meta-analysis. Learning Disabilities Research and Practice, 36(2), 145–166. https://doi.org/10.1111/ldrp.12244
Myers, J. A., Hughes, E. M., Witzel, B. S., Anderson, R. D., & Owens, J. (2023). A meta-analysis of mathematical interventions for increasing the word problem solving performance of upper elementary and secondary students with mathematics difficulties. Journal of Research on Educational Effectiveness, 16(1), 1–35. https://doi.org/10.1080/19345747.2022.2080131
*Myers, J. A., Witzel, B. S., Powell, S. R., Li, H., Pigott, T. D., Xin, Y. P., & Hughes, E. M. (2022). A meta-analysis of mathematics word-problem solving interventions for elementary students who evidence mathematics difficulties. Review of Educational Research, 92(5), 695–742. https://doi.org/10.3102/00346543211070049
National Center for Education Statistics. (2020). NAEP report card: The 2022 NAEP mathematics assessment highlighted results at grades 4 and 8 for the nation, states, and districts. U.S. Department of Education.
National Center on Intensive Intervention. (2023). American Institutes for Research. https://intensiveintervention.org/
National Governors Association Center for Best Practices & Council of Chief State School Officers. (2010). Common core state standards for mathematics.
Nelson, G., & Powell, S. R. (2018). Computation error analysis: Students with mathematics difficulty compared to typically achieving students. Assessment for Effective Intervention, 43(3), 144–156. https://doi.org/10.1177/1534508417745627
Ng, J., Lee, K., & Khng, K. H. (2017). Irrelevant information in math problems need not be inhibited: Students might just need to spot them. Learning and Individual Differences, 60, 46–55. https://doi.org/10.1016/j.lindif.2017.09.008
Ouzzani, M., Hammady, H., Fedorowicz, Z., & Elmagarmid, A. (2016). Rayyan—A web and mobile app for systematic reviews. Systematic Reviews, 5(1), 1–10. https://doi.org/10.1186/s13643-016-0384-4
Page, M. J., McKenzie, J., Bossuyt, P., Boutron, I., Hoffmann, T., & Mulrow, C. D. (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. British Medical Journal, 372, 71. https://doi.org/10.1136/bmj.n71
Pai, H. H., Sears, D. A., & Maeda, Y. (2015). Effects of small-group learning on transfer: A meta-analysis. Educational Psychology Review, 27, 79–102. https://doi.org/10.1007/s10648-014-9260-8
Papatheodorou, S. I., & Evangelou, E. (2022). Umbrella reviews: What they are and why we need them. Methods in Molecular Biology, 2345, 135–146. https://doi.org/10.1007/978-1-0716-1566-9_8
Park, S., Stecker, P. M., & Powell, S. R. (2023). A teacher’s toolkit for assessment when implementing data-based individualization in mathematics. Intervention in School and Clinic, 59(4), 243–253. https://doi.org/10.1177/10534512231182042
Peltier, C., Sinclair, T. E., Pulos, J. M., & Suk, A. (2020). Effects of schema-based instruction on immediate, generalized, and combined structured word problems. The Journal of Special Education, 54(2), 101–112. https://doi.org/10.1177/0022466919883397
Peltier, C., & Vannest, K. J. (2017). A meta-analysis of schema instruction on the problem-solving performance of elementary school students. Review of Educational Research, 87(5), 899–920. https://doi.org/10.3102/0034654317720163
Pieper, D., Antoine, S. L., Mathes, T., Neugebauer, E. A., & Eikermann, M. (2014). Systematic review finds overlapping reviews were not mentioned in every other overview. Journal of Clinical Epidemiology, 67(4), 368–375. https://doi.org/10.1016/j.jclinepi.2013.11.007
Pongsakdi, N., Kajamies, A., Veermans, K., Lertola, K., Vauras, M., & Lehtinen, E. (2020). What makes mathematical word
problem solving challenging? Exploring the roles of word problem characteristics, text comprehension, and arithmetic skills. ZDM Mathematics Education, 52, 33–44. https://doi.org/10.1007/s11858-019-01118-9
Powell, S. R., Benz, S. A., Mason, E. N., & Lembke, E. S. (2022). How to structure and intensify mathematics intervention. Beyond Behavior, 31(1), 5–15. https://doi.org/10.1177/10742956211072267
Powell, S. R., Berry, K. A., Acunto, A. N., Fall, A.-M., & Roberts, G. (2022). Applying an individual word-problem intervention to a small-group setting: A pilot study’s evidence of improved word-problem performance for students experiencing mathematics difficulty. Journal of Learning Disabilities, 55(5), 359–374. https://doi.org/10.1177/00222194211047635
Powell, S. R., Berry, K. A., & Barnes, M. A. (2020). The role of pre-algebraic reasoning within a word-problem intervention for third-grade students with mathematics difficulty. ZDM Mathematics Education, 52, 151–163. https://doi.org/10.1007/s11858-019-01093-1
Powell, S. R., Bos, S. E., King, S. G., Ketterlin-Geller, L., & Lembke, E. S. (2022). Using the data-based individualization framework within math intervention. TEACHING Exceptional Children. https://doi.org/10.1177/00400599221111114
Powell, S. R., Doabler, C. T., Akinola, O. A., Therrien, W. J., Maddox, S. A., & Hess, K. E. (2019). A synthesis of elementary mathematics interventions: Comparisons of students with mathematics difficulty with and without comorbid reading difficulty. Journal of Learning Disabilities, 53(4), 244–276. https://doi.org/10.1177/0022219419881646
Powell, S. R., & Fuchs, L. S. (2015). Intensive intervention in mathematics. Learning Disabilities Research and Practice, 30(4), 182–192. https://doi.org/10.1111/ldrp.120
Powell, S. R., & Fuchs, L. S. (2018). Effective word-problem instruction: Using schemas to facilitate mathematical reasoning. TEACHING Exceptional Children, 51(1), 31–42. https://doi.org/10.1177/0040059918777250
Ritchie, S. J., & Bates, T. C. (2013). Enduring links from childhood mathematics and reading achievement to adult socioeconomic status. Psychological Science, 24(7), 1301–1308. https://doi.org/10.1177/0956797612466268
Saran, A., & White, H. (2018). Evidence and gap maps: A comparison of different approaches. Campbell Systematic Reviews, 14(1), 1–38. https://doi.org/10.4073/cmdp.2018.2
*Shin, M., Bryant, D. P., Powell, S. R., Jung, P. G., Ok, M. W., & Hou, F. (2021). A meta-analysis of single-case research on word-problem instruction for students with learning disabilities. Remedial and Special Education, 42(6), 398–411. https://doi.org/10.1177/0741932520964918
Swanson, H. L., Lussier, C. M., & Orosco, M. J. (2015). Cognitive strategies, working memory, and growth in word problem solving in children with math difficulties. Journal of Learning Disabilities, 48(4), 339–358. https://doi.org/10.1177/0022219413498771
Tipton, E., Bryan, C., Murray, J., McDaniel, M. A., Schneider, B., & Yeager, D. S. (2023). Why meta-analyses of growth mindset and other interventions should follow best practices for examining heterogeneity: Commentary on Macnamara and Burgoyne (2023) and Burnette et al. (2023). Psychological Bulletin, 149(3–4), 229–241. https://doi.org/10.1037/bul0000384
Vaughn, S., Wanzek, J., Murray, C. S., & Roberts, G. (2012). Intensive interventions for students struggling in reading and mathematics: A practice guide. RMC Research Corporation, Center on Instruction. https://files.eric.ed.gov/fulltext/ED531907.pdf
Verschaffel, L., Schukajlow, S., Star, J., & Van Dooren, W. (2020). Word problems in mathematics education: A survey. ZDM Mathematics Education, 52(1), 1–16. https://doi.org/10.1007/s11858-020-01130-4
Williams, R., Citkowicz, M., Miller, D. I., Lindsay, J., & Walters, K. (2022). Heterogeneity in mathematics intervention effects: Evidence from a meta-analysis of 191 randomized experiments. Journal of Research on Educational Effectiveness, 15(3), 584–634. https://doi.org/10.1080/19345747.2021.2009072
Witzel, B., Myers, J. A., & Xin, Y. P. (2022). Intensive word problem solving for students with learning disabilities in mathematics. Intervention in School and Clinic, 58(1), 9–14. https://doi.org/10.1177/10534512211047580
Witzel, B., Myers, J., Root, J., Freeman-Green, S., Riccomini, P., & Mims, P. (2024). Research should focus on improving mathematics proficiency for students with disabilities. The Journal of Special Education, 57(4), 240–247. https://doi.org/10.1177/00224669231168373
*Xin, Y. P., & Jitendra, A. K. (1999). The effects of instruction in solving mathematical word problems for students with learning problems: A meta-analysis. The Journal of Special Education, 32(4), 207–225. https://doi.
Scammacca, N., Roberts, G., Vaughn, S., Edmonds, M., Wexler, org/10.1177/002246699903200402
J., Reutebuch, C. K., & Torgesen, J. K. (2007). Interventions Young, J. (2017). Technology-enhanced mathematics instruc-
for adolescent struggling readers: A meta-analysis with tion: A second-order meta-analysis of 30 years of research.
implications for practice. RMC Research Corporation, Center Educational Research Review, 22, 19–33. https://doi.
on Instruction. org/10.1016/j.edurev.2017.07.001
Schumacher, R. F., Zumeta Edmonds, R., & Arden, S. V. (2017). *Zhang, D., & Xin, Y. P. (2012). A follow-up meta-analysis for
Examining implementation of intensive intervention in math- word-problem-solving interventions for students with math-
ematics. Learning Disabilities Research and Practice, 32(3), ematics difficulties. The Journal of Educational Research,
189–199. https://doi.org/10.1111/ldrp.12141 105(5), 303–318. https://www.jstor.org/stable/26586944
Shin, M., & Bryant, D. P. (2017). Improving the frac- *Zheng, X., Flynn, L. J., & Swanson, H. L. (2013). Experimental
tion word problem solving of students with mathemat- intervention studies on word problem solving and
ics learning disabilities: Interactive computer application. math disabilities: A selective analysis of the literature.
Remedial and Special Education, 38(2), 76–86. https://doi. Learning Disability Quarterly, 36(2), 97–111. https://doi.
org/10.1177/0741932516669052 org/10.1177/0731948712444277