Tracing Distinct Learning Trajectories in Introductory Programming Course
International Journal of STEM Education (2025) 12:27
https://doi.org/10.1186/s40594-025-00546-2
Abstract
Background With the increasing interdisciplinarity between computer science (CS) and other fields, a growing
number of non-CS students are embracing programming. However, there is a gap in research concerning differences
in programming learning between CS and non-CS students. Previous studies predominantly relied on outcome-based
assessments, focusing on summative evaluations and surveys while providing little insight into the real learning
process and differences therein. This study aims to provide a process-oriented comparison of programming learning
between two novice student groups, CS and Math, under uniform instructional conditions, focusing on their
semester-long scores, engagement, and code metrics.
Results Our research involves 75 novice students enrolled in a compulsory introductory programming course
designed for a mixed class, comprising 35 Math and 40 CS students. Through Latent Class Analysis and
Self-Organizing Maps, we identify distinct learning states throughout the semester and employ sequence mining to
explore the differences in learning trajectories and state transitions between the two cohorts. Our results reveal
that the association between engagement and scores diverges across different majors as the course progresses,
deviating from the widely discussed positive correlation. In the semester-long code metrics analysis of students
exhibiting the over-engineering state, the two cohorts display opposing trends. Moreover, CS students demonstrate
significant alignment between formative and summative scores, whereas their Math peers exhibit phenomena of
cold-start and learning avoidance.
Conclusions This study underscores the importance of understanding distinct learning trajectories to improve
instructional design for diverse learner groups. Our findings indicate that CS students follow increasingly efficient
learning patterns with decreasing code complexity over time, while Math students need strategies to overcome
phenomena of cold-start and learning avoidance. Code metrics can provide valuable insights into students’
programming performance and patterns. The research also highlights the importance of active engagement and fostering
computational thinking in the early stages. Based on these insights, we propose recommendations for instructional
design to better support students in introductory programming courses. This study also makes a methodological
contribution to the process-oriented research in programming education.
Keywords Programming learning, Major differences, Sequence mining, Latent class analysis, Self-organizing maps
*Correspondence:
Xia Sun
[email protected]
Jun Feng
[email protected]
Full list of author information is available at the end of the article
© The Author(s) 2025. Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0
International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long
as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if
you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or
parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated
otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To
view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
Gao et al. International Journal of STEM Education (2025) 12:27 Page 2 of 26
low at the beginning of the semester, but it exhibits an overall increasing trend over time (Kelly & Allen, 2023;
Hogan et al., 2023). Furthermore, some researchers have found that non-majors approach learning programming
differently (O’Malley, 2020).
For research that concurrently considers both CS and non-CS students, previous studies mainly reveal stark
differences in cognitive approaches and influencing factors on programming between the two groups (Shell & Soh,
2013; Gao et al., 2022; Zhong et al., 2023). CS students tend to exhibit a preference for planning, facilitating
more efficient self-regulation of their studies. Conversely, non-CS students lean towards knowing, spending
considerable time collecting information to solve problems but often overlooking their own logical and analytical
problem-solving skills (Zhong et al., 2023). Moreover, perceptions of course climate, encompassing five
motivational dimensions (empowerment, usefulness, success, interest, caring), are significantly higher among CS
students than among non-CS students (Jones et al., 2023). Some researchers have also designed a scale questionnaire
to evaluate students’ attitudes towards programming across different majors (Cetin & Ozden, 2015).
Among all non-CS disciplines, Math has the closest relationship with CS (Knuth, 1974). Numerous studies have
established that mathematical proficiency is one of the most critical factors for success in CS, and mathematical
logic and formal symbolic reasoning are foundational elements that permeate CS (Wilson & Shrock, 2001; Arnold
et al., 2007; Chea et al., 2020). Meanwhile, programming as a problem-solving tool is often guided by mathematical
principles (Olsson & Granberg, 2024), which not only enhances the understanding of concepts (Benton et al., 2018;
Zhong & Xia, 2020) but also fosters the development of logical thinking and problem-solving skills (Forsström &
Kaufmann, 2018; Kaufmann & Stenseth, 2021; Laurent et al., 2022).
However, research on non-CS students, especially Math students, is relatively scarce and still in an exploratory
stage (Chou et al., 2021; Kelly & Allen, 2023). Despite the well-documented historical ties between Math and CS,
which position Math as one of the disciplines with the deepest historical connections to CS (Knuth, 1974; Wilson &
Shrock, 2001; Brodley et al., 2024), there has been little investigation into the distinct experiences and
challenges Math students face in learning programming (Castle, 2023, 2024). Furthermore, existing research on the
differences in programming learning across various majors primarily relies on outcome-based assessments, such as
surveys or self-reported data, which assess students’ perceptions and experiences in programming learning (Cetin &
Ozden, 2015; Chou et al., 2021; De Santo et al., 2022; Jones et al., 2023; Zhong et al., 2023; Kelly & Allen, 2023;
Shah et al., 2023), as well as summative assessments like final scores or the state of course completion (Jones
et al., 2023). Yet, solely relying on outcome-based assessments without incorporating data generated during the
entire learning process of the semester may overlook authentic differences among students in terms of learning
patterns, developmental trajectories and abilities. Consequently, the understanding of this aspect remains limited
(Villamor, 2020).
Competency learning framework during programming process
The IT2017 (Information Technology 2017) and CC2020 (Computing Curricula 2020) reports issued by ACM underscored
the critical shift from knowledge-based to competency-based programming education (Sabin et al., 2018;
Impagliazzo & Pears, 2018; Raj et al., 2021). To address this shift, extensive research has proposed competency
learning frameworks that define student programming competency as the integration of three dimensions:
contextualized knowledge, skills, and dispositions (Frezza et al., 2018; Clear et al., 2020; Kiesler, 2020; Sabin
et al., 2023; Gao et al., 2025). Knowledge reflects mastery of content and core concepts, skills encompass critical
thinking, problem-solving, and high-level strategies, while dispositions capture socio-emotional patterns,
behaviors, and attitudes (Frezza et al., 2018; Raj et al., 2021).
Capturing and analyzing data during the programming process could facilitate student modeling across these three
dimensions (Li et al., 2022; Sun et al., 2023). For instance, the sequences of code submissions during
problem-solving record solving strategies and iterative processes. They not only evaluate mastery of directly or
indirectly tested knowledge but also reveal students’ problem-solving abilities, programming skills, and
dispositions (such as active engagement, persistence in the face of failure, and response to challenges), offering
a rich portrait of learner profiles (Li et al., 2022). Thus, compared to traditional outcome-based research,
programming process-oriented analysis holds greater potential for understanding the learning trajectories and
evolution of students, enabling targeted support (Sun et al., 2021).
While some studies have recognized the importance of the programming process, they mainly focused on
single-dimensional cross-sectional analyses, such as coding errors (Altadmri & Brown, 2015), engagement
(López-Pernas & Saqr, 2021), code update frequency (Blikstein et al., 2014), compilation runs (Song et al., 2021), live
manifestation of the other two dimensions of engagement (Saqr & López-Pernas, 2021; Saqr et al., 2023a).
While initial conceptualizations of engagement focused on face-to-face teaching, engagement concepts have been
extended to online and blended learning environments (Martin & Borup, 2022). In recent years, online learning has
experienced a substantial surge, emerging as a pivotal component of contemporary education, particularly OJS in the
field of programming education (Quadir et al., 2023; Liu et al., 2023). This advancement greatly aids the
collection of fine-grained data on learners’ behavioral engagement, facilitating a deeper understanding of their
authentic learning processes. Consequently, educators are empowered to observe, track, and comprehend these
phenomena, thereby refining instructional design practices (Saqr & López-Pernas, 2021).
In programming education, existing studies based on outcome-based assessments have investigated the impact of
engagement on CS students’ learning experiences and performance, revealing a positive correlation between higher
engagement levels and better learning outcomes (Sinclair & Kalvala, 2015; Lishinski & Rosenberg, 2021; Lishinski
et al., 2022). However, limited research has focused on the engagement evolution of non-CS students and its
comparison with CS counterparts (Kelly & Allen, 2023). With the development of OJS and the collection of behavioral
log data, examining fluctuations in engagement and their associations with performance throughout the semester is
becoming increasingly essential for instructors, not only to understand the potentially unique experiences and
challenges of individual students but also to shed light on differences in the learning process, and their
underlying causes, between student cohorts (Sun et al., 2021). In this study, we use “engagement” to refer to
behavioral engagement during the programming process, while code metrics can be considered a mapping of cognitive
engagement. Our goal is to explore differences in programming learning between students from different majors by
jointly analyzing three trajectories: score, code metrics, and engagement.
Goal congruity and belongingness
The goal congruity theory suggests that for students to develop an interest in a specific field, their personal
goals must align with their perceived ability to achieve these goals within that field, referred to as goal
affordance (Brinkman & Diekman, 2016). This alignment is essential for students to enter and persist in the field.
If individuals do not believe they can achieve their personal goals within a particular career path, they are less
likely to pursue it (Lewis et al., 2019). Empirical research has identified this congruity as a key predictor of
students’ retention and success in a field. Fields vary in their perceived goal affordance, with CS often seen as
offering fewer public opportunities compared to social sciences or life sciences (Diekman et al., 2010). This
perception has been identified as a contributing factor to the significant gender and racial disparities in the CS
field (Landivar, 2013; Brinkman & Diekman, 2016).
Situated learning theory highlights the crucial relationship between learning behaviors and the social environment
in which learning occurs. Students who do not feel belongingness within a course group may struggle to engage
fully, limiting their ability to connect with peers and instructors and potentially impacting their performance
(Salguero et al., 2021). Research in CS has demonstrated that belongingness is closely related to critical student
outcomes, including academic performance, memory retention, motivation, and persistence in computing, with race and
gender being significant predictors of belongingness (Lewis et al., 2019; Salguero et al., 2021; Krause-Levy et al.,
2021).
Person‑centered and process‑oriented methods
Most studies examining major differences in programming education predominantly employ variable-oriented analytical
methods and outcome-based assessment. There is limited research that genuinely investigates individual learner
development and learning trajectories in programming (Wu et al., 2019; Sun et al., 2021; Shah et al., 2023).
The variable-oriented data analysis methods, such as Analysis of Variance (ANOVA), various regression models, and
factor analysis, are valuable for capturing relationships and interactions among variables, thereby providing
insights into the average expectations of the entire learner cohort (Hickendorff et al., 2018; Saqr, 2024).
However, they are criticized for their tendency to overlook learners’ heterogeneity, as the average learning
pattern may not sufficiently describe individual learners (Asikainen & Gijbels, 2017; Saqr & López-Pernas, 2021).
In contrast, person-centered methods that explore potential subgroups of students focus on understanding the unique
characteristics, learning patterns and behaviors of individual learners, rather than averaging across the entire
group (Saqr & López-Pernas, 2021a). These approaches yield more consistent findings by recognizing that distinct
subgroups of students develop differently, with development inherently individualized, thereby acknowledging this
variability as heterogeneity among learners (Fryer, 2017; Bryan et al., 2021; Saqr et al., 2023a). By examining
each learner’s data separately, researchers can identify diverse learning trajectories, uncover hidden patterns,
and tailor interventions to
meet individual needs (Beck & Jackson, 2021; Saqr et al., 2023a; Gao et al., 2025).
Outcome-based assessments can unveil learners’ direct demonstrations of mastery in knowledge or skills and their
perceptions towards learning (Sun et al., 2021). However, considering that the process of programming entails
thoughtful problem-solving, meaning-making, and knowledge construction, process-oriented research is advantageous
in depicting the holistic landscape of learners’ programming behaviors and the differences in learning trajectories
among students from different majors (Pereira et al., 2020; Shah et al., 2023).
Methods
Context
Structures of mixed program
The context of this study is a public research-intensive university in China. Data for our research were gathered
from a compulsory introductory programming course (CS1) designed for a mixed class setting at this university,
which included students majoring in both CS and Math. This class was the first student cohort of a mixed program,
which was supported by both the schools of Mathematics and Computer Science. The purpose of establishing this mixed
class was to integrate the strengths of both CS and Math disciplines, aiming to cultivate students’ computational
thinking, mathematical logic, and problem-solving skills. This program’s duration aligns with general CS or Math
programs, spanning 4 years. However, students were informed from the outset that the program requires them to
complete mandatory courses from both the computer science and mathematics fields, which may add academic pressure.
In particular, the curriculum of this class in the first 2 years of university included both CS and Math
compulsory foundation courses, such as mathematical analysis and the introductory programming course (i.e., the CS1
course in this study), with the following 2 years including necessary elective credits from both disciplines. Upon
graduation, students will defend their thesis before both the schools of Math and CS, and if successful, they will
receive dual degrees (not a minor).
Two‑stage selection
The formation of this class went through two stages. First, students participated in the national college entrance
examination (with over 10 million participants each year in China) (Carnoy et al., 2013; Loyalka et al., 2021).
Based on their scores, they selected their preferred universities and majors, each with varying admission
thresholds. Consequently, students entering the same university and major generally have similar scores. In the
second stage, among those who chose to study Math or CS at the university of this study, all students had a chance
to apply to join this mixed class. Thus, only those who were curious about or interested in learning programming
would apply for this class, as its academic demands actually exceed those of their respective original majors.
Applicants had to take an exam assessing computational thinking and mathematical logic, and only those who passed
were admitted to the mixed class. In addition, students in this class had the option to withdraw at any time and
return to their original major in the first year. As a result, the sample size of this class was not predetermined
but was instead determined by the students themselves. Furthermore, after the two stages, the initial level of
students in this class, including computational thinking and mathematical logic, could be considered the same.
Sample background
Finally, this mixed class consisted of 75 students, with 40 majoring in CS (30 males and 10 females) and 35 in Math
(20 males and 15 females). Based on a brief questionnaire at the first class, which aimed to collect basic
information, none of these students had prior programming experience or had taken a systematic programming course.
This is likely because the existing K-12 education system in China does not incorporate systematic programming
courses within its curriculum, although some developed cities in China have begun to introduce introductory
programming courses (Sun & Luo, 2021). Meanwhile, due to the large number of competitors, the pressure of the
national college entrance examination is immense, leaving students in this education phase with little time to
study subjects beyond the curriculum. Moreover, the questionnaire also asked “Do you have a personal computer?” and
“Have you ever used a computer before college?”, and collected students’ scores and ranks in each subject of the
national college entrance examination across 24 different provinces, which indicated similar results. Furthermore,
there were even two students who had never used a computer before.
Course settings and content
This course serves as an introductory program design course for the C language requiring no prior experience and
spans a 16-week term, offered during the first semester in university for this mixed-class cohort. The term
structure comprises an initial 2-week introduction, followed by 12 weeks of regular instruction accompanied by
tests (formative assessment), and concludes with a final 2-week period allocated for review and preparation for the
final exam (summative assessment). It covers basic concepts and applications of six modules: variables and
expressions, conditionals and loops, arrays and strings, functions, pointers, and structures. The course content
was jointly adjusted and supported by both the schools of CS and Math, to ensure that all students, regardless of
their major, gain the necessary knowledge and skills to succeed in programming. Meanwhile, the difficulty and
workload were also controlled, especially for Math students.
Following this course, other foundational courses were planned, including advanced programming and mathematical
courses, which were still compulsory for both CS and Math students in this mixed class. These courses were also
adjusted to ensure that one cohort was not systematically disadvantaged or overburdened compared to the other.
Despite the class comprising students from both CS and Math, the instructor maintained equal treatment for all
students, and the instructional approach integrated traditional face-to-face teaching with an online judge system,
remaining consistent throughout the course. Each week, students attended both theoretical and practical lectures
face-to-face using OJS, engaging in communications with the teacher or peers. Theoretical sessions prioritized
teacher-led lectures, covering course materials sequentially, while practical sessions involved hands-on
programming exercises with guidance and explanation provided by the instructor. For classroom dynamics, the
instructor was always available to answer student questions and address learning difficulties. In addition, two
graduate teaching assistants from CS participated throughout the semester, assisting the instructor and answering
student inquiries. At the beginning of the semester, the instructor also created a chat group on a shared messaging
platform, allowing all students and teaching assistants to join for easy communication and support outside of
class hours.
Assessments
After class, students were required to complete weekly programming tests assigned on the OJS by their teacher,
which were regarded as formative assessments. These tests differed from the proctored final exam, as students could
complete them anywhere at any time within a specified time-frame (usually 1 week) without supervision. The final
exam had to be taken in a designated computer lab at the school during the designated time, where computers were
pre-logged into the OJS in exam mode. The lab was monitored by teachers and surveillance cameras to prevent
plagiarism. Tests and the final exam encompassed various question types, including multiple-choice, true or false,
code completion, function writing, and programming problems, and students were allowed to revise their submitted
responses freely before final submission. The former three types primarily assessed knowledge points and were
considered objective, while the latter two involved comprehensive code writing and were regarded as subjective
questions. The grading of both formative assessments and the final exam was fully automated by the OJS platform,
resulting in five categories: fully correct, partially correct (where the program only passes part of the test
cases), format error (where the program’s output format does not meet the requirements), incorrect answer (where
the output of the program is wrong), and compilation error (where the code fails to compile). The teacher did not
participate in the grading process, ensuring that the evaluation criteria were consistent across different majors
and could be considered free from teachers’ bias. However, to some extent, this may also introduce potential bias
of machine scoring, as only code submissions that pass at least some test cases will receive a score. Students who
write extensive code but fail to pass any test cases will not receive any points. Nonetheless, this mechanism is
applied equally to all students as well, ensuring fairness.
Indicators and pre‑processing
The data were collected from the back-end of OJS. An initial overview of the datasets indicates a total of 224
questions across 12 formative assessments and a final exam, and a total of 22,950 submission records, including
1023 in multiple-choice, 1129 in true or false, 1464 in code completion, 4779 in function writing, 12,162 in
programming problems, and 2393 submissions for the final exam. It is worth noting that, since this course was
compulsory and the assigned formative assessments contributed to the final credit, all students participated in
every test, thereby ensuring no missing data in our datasets. Table 1 shows a brief introduction of all indicators
for clarity. In this study, all student privacy information has been anonymized and replaced with numerical
identifiers, as approved by the Research Ethics Committee of this University (ProjectID: 240704074).
Engagement indicators
While the three dimensions of engagement have been discussed in prior related work, this study focuses on the
indicators of engagement intensity reflected in formative assessments under the dimension of behavioral engagement.
Notably, behavioral engagement serves as the outward expression of cognitive and emotional engagement, thereby
allowing researchers to infer the internal processes associated with these dimensions (Saqr et al., 2023a). Here we
collected three indicators
for each student in every test, which served to reflect their investment and enthusiasm in finishing the tests,
with a time indicator representing the sequence of tests. These indicators have previously been selected to
evaluate students’ study efficiency and learning attitude during the programming process (Liu et al., 2022; Shen
et al., 2023; Gao et al., 2024, 2025). It is important to note that the magnitude of these variables does not
indicate the quality of student performance, i.e., a low level of a certain indicator or of all indicators does not
mean that the student performs poorly. It just reflects their behavioral engagement in the programming process.
Code metrics indicators
When a student submitted code to answer a subjective question on the OJS, the system compiled it and dynamically
evaluated code metrics based on correctness, time efficiency, space complexity and other aspects using test cases
that were invisible to students. From this process, we extracted all nine indicators for each subjective question
in every test and calculated the average value for each student across the test. This dimension reflected the
quality and efficiency of code submissions in solving the specific problem (Harrison et al., 1996). Overall,
program_score, pass_rate and sqs reflect students’ performance in answering subjective questions, nloc,
fundef_number and token_number represent the basic code quality, while code complexity is made up of average_ccn,
memoryC and timeC. They jointly reflect the computational thinking and problem-solving skills of students during
the programming process (Robins et al., 2003; Zhang et al., 2022; Gao et al., 2025).
Table 2 Results of Levene’s test for comparisons between CS and Math of all indicators (Df = 1)
Fig. 1 Box plot of indicators in each test with comparisons between CS and Math
Pre‑processing
This study conducted process-oriented comparisons to identify major differences, necessitating an evaluation of the
aforementioned indicators across the entire sequence of all tests, along with final exam scores as a reference.
However, considering that each test may examine different knowledge points, skills, difficulty levels and
discrimination, the significance of students’ score, engagement and code metrics during the answering process
varied. This introduced heterogeneity in the indicators across different time points. Therefore, we conducted
standardized pre-processing on these indicators to extract the relative levels of each student in the class for
each test. Thus, over the entire semester, the sequence of each indicator for a student represented the
evolutionary trajectory of their relative level among all students in this class. The box plot in Fig. 1
illustrates the distributions of all pre-processed indicators across 12 tests with t test comparisons conducted
between CS and Math. Subsequently, Levene’s test with median as the default, which is considered a robust test for
homogeneity even under non-normality, was employed to examine all indicators
for pairwise comparisons (Brown & Forsythe, 1974). The results of 156 t tests as shown in Fig. 1 and Levene’s test
as shown in Table 2 indicated that the two student cohorts from CS and Math did not exhibit significant differences
within the majority of indicators across the whole semester, thus largely meeting the assumption of homogeneity of
variances. Here, the focus was on the scenario of accepting the null hypothesis (no significance); thus, the risk
of p-hacking and Type I error inflation was not considered.
Modeling a long-term learning process introduces significant variability. Within this period, our focus lay on
identifying main transitions and continuous stable sequences in students’ states, thus necessitating the coarsening
of nuanced changes (Saqr & López-Pernas, 2021). Extensive research has shown that employing discretization (or
binning techniques) (Dewar et al., 2021) in process-oriented learning analytics to adjust the granularity of
students’ log data (neither too coarse to mask internal differences nor too fine to preclude reliable
differentiation (Winne, 2020)) can enhance the reliability of research and yield more interpretable results (Dewar
et al., 2021; Saqr & López-Pernas, 2022; Saqr et al., 2023c). Therefore, for the three engagement indicators with a
wide range of values, we discretized them into equal-frequency deciles, ensuring roughly equal numbers of students
in each decile. The least active students fall within the lowest interval, whereas the most active ones are in the
highest interval. Furthermore, this pre-processing served a similar purpose to standardization in extracting the
relative levels of students’ indicators and eliminating the influence of heterogeneity.
Many studies have systematically and extensively investigated how to classify students’ performance based on their
scores, with the objective of achieving a balance between interpretability and effectiveness (Norcini, 2003;
De Champlain, 2018). Here, we adopted the classical ternary grouping for the scores of 12 tests and the final exam
(Saqr et al., 2023a): high, average, and low, with approximately equal numbers of students allocated to each group.
Data analysis
Clustering of engagement
To answer RQ2, we utilized Latent Class Analysis (LCA) to explore potential distinct subgroups among students based
on their discretized engagement indicators. As a case of person-centered and parametric clustering methods, LCA is
a classic social statistical model aimed (Hickendorff et al., 2018; Saqr et al., 2023a). LCA not only considers the
distribution of individual observed indicators but also takes into account the joint distribution of different
indicators, enabling a better capture of the complex relationships in the data (Nylund-Gibson & Choi, 2018). In
this study, we first conducted the LCA on three engagement indicators (submission, cost_time, and start_time) over
the entire semester to determine the optimal clustering categories, which were then classified based on their
practical significance. Next, we aligned the engagement categories for each time point in chronological order for
individual students, constructing personal engagement sequences. This approach clearly illustrates each student’s
relative performance in the engagement dimension at each time point, as well as the stability of long-term learning
throughout the semester. These sequences can help us understand students’ learning trajectories.
Because there is no absolute criterion for determining the number of clusters, we experimented with scenarios
ranging from one to ten clusters, employing the Akaike information criterion (AIC) and Bayesian information
criterion (BIC) to ascertain the optimal number of engagement subgroups. Lower values of both AIC and BIC indicated
better fit, and we chose the clustering result with the best fit. As shown in Fig. 2a, the AIC curve reached its
minimum at 3 clusters (1067.54), whereas the BIC curve showed a consistent upward trend with the smallest increment
observed at 3 clusters (50.53 = 1259.89 − 1209.36), considering BIC’s stricter penalty for model complexity
(Spiegelhalter et al., 2002). These results consistently indicated that 3 clusters represented the optimal number
of clusters. This finding aligned with previous related studies, where three clusters have been demonstrated to be
the most interpretable model (Saqr & López-Pernas, 2021; Saqr et al., 2023a).
To validate our clustering choice, we conducted two assessments. First, one-way analysis of variance (ANOVA) was
employed to examine whether there were significant differences among engagement indicators across clustered
subgroups as shown in Table 3. Then, the Dunn test was utilized for pairwise comparisons to identify specific
indicators with significant differences among clustered subgroups, with Bonferroni correction for multiple tests as
shown in Table 4. Overall, the differences among indicators of the three clustered engagement states were largely
significant, thereby affirming the validity of our clustering approach.
Clustering and visualization of code metrics
at identifying latent subgroups (potential heterogene- To answer RQ3, we utilized Self-Organizing Maps (SOM)
ity) within samples based on categorical data, commonly to cluster and visualize code metrics while preserving
employed for comprehending various behavioral patterns underlying topological structure. SOM constitutes a class
of unsupervised nonlinearly projecting mapping based on competitive learning, designed to cluster and visualize high-dimensional input data by mapping it onto a lower-dimensional topology (Kohonen, 1990; Shalaginov & Franke, 2015). The fundamental principle involves iterative weight adjustments during training to learn the distribution and structure of data while preserving the topological relationships among input elements. Thus, employing SOM, compared to other clustering methods, enables the preservation of internal topological structure information of input code metrics indicators and showcases relationships between clustered nodes through 2D visualization (Wehrens & Buydens, 2007; Kohonen, 2013). Given these advantages, SOM has been used in code analysis, aiding in a better understanding of structure, similarities and disparities within students’ code metrics (Zhu & Zhu, 2010a, b; Zhang et al., 2022; Jevtić et al., 2023). In this study, similar to LCA, we first conducted the SOM on nine indicators of code metrics based on the optimal number of nodes. Then another clustering algorithm was used to determine the optimal clustering categories among nodes over the entire semester, which were then classified based on their practical significance. Furthermore, a personal code metrics sequence was also constructed by ordering the categories at each time point chronologically.

Fig. 2 Fit values for each number of clusters for LCA (a) and SOM (b)

For the optimal number of nodes, we adhered to the empirical findings and recommendations from previous studies (Vesanto et al., 1999; Shalaginov & Franke, 2015), setting it at 25 to balance accuracy and interpretability. Following the SOM results, we employed non-parametric hierarchical clustering to explore latent subgroups within the 25 nodes, with silhouette width (Silwidth) for evaluation, as shown in Fig. 2b. The Silwidth value, a metric where higher values indicate better clustering quality, reached its maximum at 4 clusters (0.42). Thus, the 25 nodes of SOM were clustered into 4 clusters. Here, we took the third test (t = 3) as an example (figures of all tests are
included in the supplementary materials) to present the results of SOM and hierarchical clustering, as shown in Fig. 3. It illustrated the distributions of the nine indicators and the two student cohorts across the 25 nodes, alongside the segmentation of the four states. On each node, the nine indicators of code metrics, from program_score to pass_rate, were arranged counterclockwise from deep green to white, starting from the right side of the node. The height of each bar represented the magnitude of the corresponding indicator. Each small circle denoted a Math student, while each triangle represented a CS student.

Table 4 Dunn test for pairwise comparisons of engagement states of LCA with p value correction (columns: Indicators, Comparison, r, Cohen’s d, Z, P.unadj, P.adj)

We then validated this clustering as well. Initially, we used the Kruskal–Wallis test, a non-parametric test, to examine whether there were significant differences among different subgroups. Then we also used the Dunn test for pairwise
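
The pre-processing described earlier in this section (decile discretization of the engagement indicators and ternary grouping of scores) can be illustrated with a minimal sketch. It assumes rank-based, equal-frequency binning with ties sharing a bin; the function name `quantile_bins`, the tie rule, and the sample values are hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def quantile_bins(values: Sequence[float], n_bins: int) -> List[int]:
    """Assign each value to one of n_bins equal-frequency bins (1 = lowest).

    Values are ranked and the ranks are split into n_bins groups of (almost)
    equal size. Tied values share the bin of the first tied rank (an
    assumption; other tie rules are possible).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    prev_value, prev_bin = None, 0
    for rank, idx in enumerate(order):
        if rank > 0 and values[idx] == prev_value:
            bins[idx] = prev_bin  # keep tied values together
        else:
            bins[idx] = rank * n_bins // n + 1
        prev_value, prev_bin = values[idx], bins[idx]
    return bins


# Hypothetical engagement indicator (e.g. minutes spent on one test).
cost_time = [12, 45, 7, 30, 22, 51, 18, 9, 40, 27]
deciles = quantile_bins(cost_time, 10)

# Hypothetical test scores mapped to the ternary score states.
SCORE_LABELS = {1: "low", 2: "average", 3: "high"}
test_scores = [55, 90, 70, 40, 85, 60]
score_states = [SCORE_LABELS[b] for b in quantile_bins(test_scores, 3)]
```

Because the bins are derived from ranks within the cohort at each time point, the resulting categories express relative rather than absolute levels, which is the property the pre-processing relies on.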
Fig. 4 Sequence plots for states of score (a), engagement (b), and code metrics (c) in all formative assessments throughout the semester with final
scores as a reference
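
Sequence plots of this kind, and the transition tables reported alongside them, rest on ordering each student’s states chronologically and counting short runs of consecutive states. The following is a minimal sketch of that counting step. It assumes that the transition probability (TP) of a run is its count divided by the count of all runs of the same length that start in the same state; the source does not spell out its exact definition, and the function name and sample sequences are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Row = Tuple[Tuple[str, ...], float, int]


def transition_table(sequences: Sequence[Sequence[str]], steps: int = 2) -> List[Row]:
    """Return (transition, TP, count) rows for all runs of `steps` states.

    Rows are sorted by descending count, then alphabetically, so the first
    five rows give a "top 5 most common transitions" listing.
    """
    counts: Counter = Counter()
    starts: Counter = Counter()
    for seq in sequences:
        for i in range(len(seq) - steps + 1):
            gram = tuple(seq[i:i + steps])
            counts[gram] += 1
            starts[gram[0]] += 1  # denominator: runs sharing the first state
    rows = [(gram, c / starts[gram[0]], c) for gram, c in counts.items()]
    rows.sort(key=lambda r: (-r[2], r[0]))
    return rows


# Hypothetical score-state sequences for three students.
sequences = [
    ["high", "high", "average"],
    ["low", "low", "low"],
    ["high", "average", "average"],
]
two_step = transition_table(sequences, steps=2)
three_step = transition_table(sequences, steps=3)
top5 = two_step[:5]
```

Running the same function separately on the early and later tests of each cohort would give per-stage tables for CS and Math students.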
the final exam across test sequences, we observed a more balanced distribution among CS students across different segments. Even among those positioned lower in the test sequences, a considerable portion (n = 4, 30.8%) managed to attain high scores in the final exam. In contrast, most high-performing students (n = 5, 83.3%) in Math were positioned at least in the middle or higher in the test sequences, with only one student making a remarkable leap from a lower position. It was noteworthy that most CS students (n = 34, 85%) maintained consistency or even showed improvement in scores between the test sequences and the final exam. However, a considerable portion of Math students (n = 7, 20%) experienced significant declines in the final exam compared to their test sequences. This stark contrast in the upper positions of test sequences between the two student groups was clearly observable.

Table 5 presented the top 5 most common state transitions (two-step and three-step) with transition probability (TP) within score sequences. During the early stage of the semester, CS students who were in a high state had a 90% probability of maintaining an average or high state, and even a 36% probability of continuously maintaining a high state. This phenomenon was significantly reinforced in the later stage of the semester. For Math students, although a few initially maintained a high score state (two-step 54.0%, three-step 47.9%), struggling at a low state became the prevailing pattern during the early sequences (two-step 61.0%, three-step 51.4%). This trend was further exacerbated as the semester progressed, with the probability of consistently maintaining a low state reaching 67.9% and 54.8%, which were even higher than the probability of CS students maintaining a high state. Moreover, remaining in a high state was no longer among the top 5 most common transitions. Therefore, combining the findings of Table 5 with Fig. 4a, it became apparent that CS students were more likely to stabilize at a high state, whereas Math peers exhibited a higher propensity to remain in lower states.

Table 5 Top 5 most common state transitions within score sequences of CS and Math students (columns: Major; Transition, TP and Counts for the early stage and for the later stage)

RQ2

As shown in Fig. 5, in the early stages, the main factors distinguishing the three engagement states were cost_time and start_time, while in the later stages, submission and cost_time were key factors. This implied that in the early stages, significant differences existed in students’ active response to tests and their cognitive investment during the tests. However, as the stages advanced, differences in start_time diminished, while those in submissions increased. This shift may be attributed to the escalating complexity and difficulty of programming questions, alongside heightened expectations for comprehensive program design. Consequently, students may find it challenging to provide fully correct answers in the initial few submission attempts.

Fig. 5 Box-plot of each engagement indicator for each cluster in each test

Figure 4b illustrates engagement sequences. Table 6 presented the top 5 most common state transitions with transition probability within engagement sequences. CS students tended to be more active than their Math peers in the early stages, with fewer individuals in the low engagement state (n = 58, 24.2%) than Math (n = 71, 33.8%). In the early stage, CS students exhibited a higher engagement state, with maintaining average and high states being the predominant sequence. However, in the latter half of the semester, their engagement state declined and remained at low levels, which became the predominant pattern. As for Math, the transitions in engagement were more evenly distributed, but the states tended to concentrate at the high and low ends. Thus, CS students exhibited a trend of declining engagement, while Math students demonstrated polarization.

RQ3

Figure 4c illustrates state sequences of code metrics for the two student cohorts, and Table 7 presented the top 5 most common state transitions with transition probability within code metrics sequences. It became evident that during the early and middle stages, Math students primarily exhibited poor and average states (n = 171, 61.1%), while there was a noticeable shift towards more instances of over-engineering in the middle to later stages. Conversely, CS students displayed a higher prevalence of over-engineering (n = 35, 21.9%) during the early to middle stages, transitioning to predominantly good performance in the later stages (n = 18, 75%). Overall, CS students demonstrated better performance in code metrics, showing greater diligence in the early stages (more over-engineering and good states) and higher efficiency in the later stages (fewer over-engineering states and more good states).

Associations

Figure 6 offers a holistic perspective on the associations between engagement and test scores, as well as between code metrics and final scores, from both undifferentiated and differentiated major viewpoints. Overall, regardless of major, a significant positive correlation existed between high engagement and high score, as well as between low engagement and low score, with a negative correlation between high engagement and low score. Moreover, over-engineering and a high score in the final exam showed a positive correlation.

However, upon separate examination of majors, high engagement was positively correlated with high score and negatively correlated with low score in CS, while in Math low engagement was positively correlated with low score. Thus, low engagement among Math students was likely to result in low scores, yet high engagement did not necessarily guarantee high scores. Conversely, high engagement among CS peers was more likely to yield commensurate scores, while low engagement did not necessarily lead to low scores. On the other hand, the good and over-engineering states were positively correlated with high score and negatively correlated with low score in the final exam among CS students. In contrast, in the Math cohort, the good state was negatively correlated with high score and positively correlated with low score in the final exam.

Figure 7 illustrates three correlations: final score and test score, final score and test engagement, and test score and engagement. Primarily, it was evident that formative scores of CS students maintained a significant positive correlation with their final scores throughout most of the semester. This phenomenon indicated that the
scores obtained by CS students in formative assessments reflected their genuine proficiency level. Conversely, a significant positive correlation in Math became apparent only in the final stages of the semester. Furthermore, in later stages of the course, negative correlations emerged between the final score and engagement, as well as between test score and engagement, among CS students. This was evidenced by the combination of sequence plots depicting test scores and engagement, suggesting that a considerable proportion of them experienced a decrease in engagement without a corresponding decrease in scores (n = 23, 57.5%), or even with an improvement. Conversely, students who showed increased engagement were more likely to experience a decrease in scores (n = 11, 47.8%). Meanwhile, Math students exhibited a positive correlation in most of both aspects. Some remained struggling at low levels of engagement and score (n = 6, 17.1%) during the whole learning process, while some demonstrated high levels of engagement in pursuit of better scores (n = 17, 48.6%).

Table 6 Top 5 most common state transitions within engagement sequences of CS and Math students (columns: Major; Transition, TP and Counts for the early stage and for the later stage)

Table 7 Top 5 most common state transitions within code metrics sequences of CS and Math students (columns: Major; Transition, TP and Counts for the early stage and for the later stage)

Discussion

Dynamic associations between engagement, code metrics, and performance

Substantial prior research has demonstrated a positive correlation between engagement and academic performance (Wang & Eccles, 2013; Wang & Degol, 2014; Lei et al., 2018; Huang & Wang, 2023), particularly underscored within programming courses, emphasizing the significance of active involvement in programming tasks for both CS and Math students (Sinclair & Kalvala, 2015; Lishinski & Rosenberg, 2021; Lishinski et al., 2022; Kelly & Allen, 2023). The results presented in Fig. 6 from a holistic perspective aligned with previous studies. However, our process-oriented investigations have unveiled dynamic associations among engagement, code metrics, and scores, different from prior research. In the early stage, CS students did indeed exhibit more proactive engagement, more over-engineering and higher scores, which corroborated findings in existing literature on CS students’ performance (Lishinski & Rosenberg, 2021; Lishinski et al., 2022; Kelly & Allen, 2023). Nevertheless, in later stages, both the engagement level and code complexity (less over-engineering, much more good) of CS students notably declined, especially among those who achieved a high state in the final exam, yet their scores did not decrease and even improved. For Math students, low engagement generally resulted in low scores, while high engagement did not guarantee high scores.

This suggested that a considerable portion of CS students have “learned how to learn” (Zhen et al., 2020), demonstrating more efficient learning patterns. As a result, students achieved similar or even better grades in less time than before, and their level of effort may not have changed; rather, the efficiency of their problem-solving processes in programming has improved. Consequently, their apparent decrease in engagement and code complexity could be misleading, as it reflected enhanced productivity rather than a reduction in investment (Saqr et al., 2023a). As for Math students, while low engagement typically implied lower scores, similar to CS students, high engagement in later stages did not necessarily equate to higher scores. Instead, it often manifested as complex and inefficient code alongside lower scores, indicating struggles in learning among Math students in later stages. These dynamic associations among the three dimensions underscored the importance of our multidimensional research, which aimed to analyze student learning trajectories through a process-oriented approach.
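
One way to examine how an association changes over the semester is to compute a correlation separately at each formative assessment instead of once over pooled data. The sketch below uses Spearman rank correlation implemented with the standard library only. The choice of Spearman, the function names, and the sample numbers are assumptions made for illustration; the source does not state which coefficient underlies its correlation figure.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")  # correlation is undefined for constant input
    return cov / (vx * vy) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation = Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def stagewise_correlation(per_test: Sequence[Sequence[float]],
                          final: Sequence[float]) -> List[float]:
    """Correlate each test's values (one list per test) with the final score."""
    return [spearman(test_values, final) for test_values in per_test]


# Hypothetical cohort of four students: engagement at two tests, final scores.
final_scores = [60, 70, 80, 90]
engagement_by_test = [[1, 2, 3, 4], [4, 3, 2, 1]]
trend = stagewise_correlation(engagement_by_test, final_scores)
```

Applying `stagewise_correlation` to each cohort separately would expose a sign change over time that a single pooled coefficient would hide.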
Fig. 6 Mosaic plot for the association between engagement and score states (OE represents over-engineering)
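
A mosaic plot of two categorical states is drawn from their contingency table, and the direction of an association in each cell is commonly read from the Pearson residual, (observed − expected) / √expected, where the expected count assumes independence. The sketch below computes these residuals. Whether the figure uses exactly this residual is an assumption, and the counts are hypothetical.

```python
from typing import List, Sequence


def pearson_residuals(table: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pearson residuals of a contingency table under independence.

    Positive values mark cells observed more often than expected,
    negative values mark cells observed less often.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        res_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            # An empty row or column gives expected == 0; report 0.0 there.
            res_row.append((observed - expected) / expected ** 0.5 if expected else 0.0)
        residuals.append(res_row)
    return residuals


# Hypothetical counts: rows = engagement (low, average, high),
# columns = score state (low, average, high).
counts = [
    [12, 6, 2],
    [6, 10, 6],
    [2, 6, 12],
]
residuals = pearson_residuals(counts)
```

In this made-up table the diagonal cells (low/low and high/high) have positive residuals and the opposite corners have negative ones, which is the pattern a positive engagement-score association would produce.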
We believed that fluctuations in the association between engagement, code metrics, and formative performance reflected the developmental process of computational thinking during programming learning (Sun et al., 2022). In the initial stages, students needed to invest significant time and enthusiasm in engaging with programming tasks to devise solutions, thereby establishing computational thinking. As the course progressed, facing increasingly comprehensive assessments and more complex problem designs, students who had mastered computational thinking would demonstrate more adept and efficient performance (Hattie & Donoghue, 2016; Saqr et al., 2023b). Conversely, it was not surprising that students, particularly Math ones, who had not yet acquired these skills may display relatively high engagement but low efficiency, resulting in even lower scores. As mentioned before, submission and cost_time mainly contributed to engagement states in later stages. Moreover, all indicators reflected students’ relative level among the whole student group after pre-processing. Thus, a high engagement state suggested that students may not swiftly and accurately solve programming problems in the initial few submission attempts. This indirectly reflected potential deficiencies or gaps in their knowledge, skills and computational thinking compared to more efficient students. Consequently, they may require more time to conceptualize program designs, produce possibly more complex and inefficient code, troubleshoot issues, and rectify errors, with the possibility of not resolving them ultimately.

This finding highlighted the importance of cultivating computational thinking in the early stages of learning programming (Bati, 2022). From the perspective of complex systems theory of engagement dynamics, individual learning trajectories tend to exhibit considerable stability over time, aligning with the general dynamics of complex systems (López-Pernas & Saqr, 2024). This can be understood as having inertia or a reluctance to shift from the current state to alternative states. For students to enhance their learning status or progress, they must invest substantial and tangible effort to address the shortcomings in their previous learning experiences. However, once students surpass their prior selves, this transformation can establish a new steady state for future learning, making the learning process feel more intuitive and manageable. Therefore, entering an ideal steady state as early as possible will be advantageous for programming learning.

Learning avoidance

Furthermore, our person-centered process-oriented analysis revealed fine-grained learning phenomena, which were not considered by prior outcome-based research. A considerable portion of Math students (n = 7, 20%) exhibited tendencies of learning avoidance: they excelled in formative assessments but showed a significant decline in the final exam. Shell et al. discovered that students may complete all assignments and perform adequately on tests to get good scores or grades, yet they often refrain from exerting additional effort to fully grasp the learning materials or knowledge, a phenomenon referred to as learning avoidance (Shell et al., 2013). They viewed the course merely as a series of tasks to complete rather than as an opportunity for learning (Shell & Soh, 2013). Thus, although these Math students performed well in formative assessments throughout the process (possibly by relying on classmates’ answers, online searches, or querying ChatGPT), they had not truly mastered the programming knowledge and computational thinking. Consequently, their performance in the supervised final exam declined, sharply contrasting with their outstanding formative scores. In contrast, the vast majority of CS students’ performance in the final exam aligned consistently with their formative score sequences, with some even surpassing expectations (n = 12, 30%). This suggested that formative scores of CS students generally reflected their genuine proficiency levels in the learning process.

Such a difference between the CS and Math majors aligned with goal congruity theory (Brinkman & Diekman, 2016). Goal congruity posits that the alignment of students’ personal goals with the opportunities to achieve them within a field is crucial for students to enter and remain in that field. When these goals are misaligned, students are more inclined to leave the field (Diekman et al., 2011, 2017). Among all non-CS disciplines, Math has the closest relationship with CS (Knuth, 1974). Numerous studies have established that mathematical proficiency is one of the most critical factors for success in CS, and mathematical logic and formal symbolic reasoning are foundational elements that permeate CS (Wilson & Shrock, 2001; Arnold et al., 2007; Chea et al., 2020). Meanwhile, programming as a problem-solving tool is often guided by mathematical principles (Olsson & Granberg, 2024), which not only enhances the understanding of concepts (Benton et al., 2018; Zhong & Xia, 2020) but also fosters the development of logical thinking and problem-solving skills (Forsström & Kaufmann, 2018; Kaufmann & Stenseth, 2021; Laurent et al., 2022).

Despite the recognized benefits, Math students may not perceive programming as an essential skill for their future learning or professional goals, thus lacking sufficient motivation to develop robust programming
skills (De Santo et al., 2022). Moreover, these students were all still first-year university students, lacking the ability to clearly recognize the meaningful link between programming and Math, or the importance of programming in their future mathematical careers, even though teachers had explained this several times during the semester (Kaufmann & Stenseth, 2021). Although this CS1 course was compulsory for Math students in this mixed class, they still exhibited a certain degree of relative passivity. Their learning goals, namely getting assigned tasks done and scoring on a test, differed from those of CS students; thus they may not consider how to enhance their understanding of knowledge and their problem-solving skills to tackle programming tasks better and more efficiently (Shell & Soh, 2013).

Building on the results, even among the best-performing group of Math students, a significant portion was unable to maintain high performance in the final exam, exhibiting considerable decline. Furthermore, other Math student groups also struggled to achieve significant progress. This suggested that the Math cohort lacked sufficient self-regulated motivation for effective programming learning, affirming the discussion above (De Santo et al., 2022). It was important to note that, based on the two-stage process that constituted this mixed class, we believed that the abilities of Math students were comparable to those of CS students. Therefore, it was not that Math students lacked the capability to excel in programming.

For CS students, programming skills and computational thinking were pivotal for their future development. Consequently, this alignment of goals fostered their genuine active engagement from the beginning in hands-on programming tasks, collaboration with peers, and a deeper interest and curiosity in the subject matter (Brinkman & Diekman, 2016). At the same time, when completing programming tasks, they may exert extra effort not merely to fulfill requirements or earn credits but to enhance their computational thinking and deepen their understanding of knowledge (Shell & Soh, 2013). Thus, their performance in formative assessments, which basically reflected their genuine proficiency in programming, remained consistent with their final scores or even improved. Furthermore, even among the lowest-performing group of CS students, a substantial portion demonstrated significant improvement in the final exam, achieving high performance. In addition, other groups of CS students with varying performance rankings in formative assessments also possessed similar opportunities for advancement, resulting in a more balanced distribution compared to their Math counterparts. This further validated that, regardless of their daily performance, CS students tended to show a stronger willingness to learn (Fleming et al., 2023).

Cold-start

Among Math students the cold-start phenomenon (engagement, code metrics and formative scores were generally low in the initial stages) was more prevalent, including in the state transitions. For Math students, it was easier to fall into and remain in the low state across each dimension compared to their CS peers. Even though this course was a requirement for Math students, there still existed a considerable portion who persisted in a state of low scores and engagement over the long term, while some may have been compelled to invest substantial effort in the later stages due to credit pressure. In this context, even the credit pressure contributed by formative assessments in a compulsory course failed to motivate some students towards a more active engagement state in these tests during the process of programming learning.

Programming courses required an understanding of many abstract concepts, along with extensive practice to enhance computational thinking and coding skills. Even when students grasped the material and invested significant time, they may still struggle to complete exercises correctly. This can lead to frustration for first-year novices, potentially creating a crisis in their sense of belonging and marginalizing students. Students who lacked a sense of belonging in a course or student group, such as those who felt marginalized, may find it challenging to engage fully. This limitation restricted their ability to interact with teachers and peers, ultimately affecting their learning outcomes (Rybarczyk, 2020; Salguero et al., 2021; Krause-Levy et al., 2021). In fact, prior research has extensively shown that a lack of belongingness was a significant factor contributing to students’ struggles in introductory programming courses, impacting their overall experiences and academic success, and potentially giving rise to negative emotions (Salguero et al., 2021; Torbey, 2023). On the other hand, as mentioned before, Math students may not realize the potential applications of programming in their future academic pursuits or life plans. This lack of understanding could diminish their belongingness in introductory programming courses, ultimately resulting in high failure rates and poor retention rates within the programming field (Torbey, 2023).

These findings collectively indicated a notable lack of interest or motivation towards learning, particularly in the early stages. As discussed before, students experiencing a cold-start phase in the early stages might not have fully undergone the process of nurturing computational thinking. Some students, even if they intended to exert renewed effort, would find it difficult to make
significant progress due to various obstacles, which also aligned well with the complex systems theory discussed in the last section (López-Pernas & Saqr, 2024). Others who entered a state of effort may still have exhibited high code complexity and inefficiency. Meanwhile, as illustrated in Fig. 7, Math students exhibited a significant positive correlation between engagement and final scores during both the mid-term and later stages, particularly evident in the sixth formative assessment, which was the last assessment before the mid-term exam. This strong correlation suggested that a considerable number of students, especially those with better final scores, significantly increased their engagement level before the mid-term exam. Thus, the pressure of exams can objectively motivate students to invest sufficient effort. In fact, research indicated that setting examinations can enhance students’ focus and attention, and attention is an important component of learning motivation (Quadir et al., 2023).

Implications

Findings in this paper reveal that, under identical teaching environments and with no prior programming experience, students entering with similar levels after a two-stage selection process exhibit starkly different learning patterns and outcomes within the same class. Based on research findings and teaching practices, there are some potential implications for instructional design.

First, instructors should not assume that conclusions based on group-level averages from variable-centered studies apply to all students, such as the positive correlations among engagement, code metrics, and performance (Lishinski et al., 2022; Kelly & Allen, 2023; Saqr & López-Pernas, 2024). At different stages of learning, it may be essential to focus on student groups with varying performance levels. In the early stages of the course, which are critical for building foundational knowledge and fostering computational thinking, instructors should pay attention to students with low engagement and scores to understand the challenges they face and encourage their active participation (O’Malley, 2020; Hogan et al., 2023), thus avoiding cold-start. In later stages, in addition to focusing on these students, attention should also be directed towards those with high engagement and code complexity, as this may indicate gaps in their knowledge, skills or computational thinking (Sun et al., 2021). Furthermore, due to the presence of the learning avoidance phenomenon, high formative scores among non-CS students may not entirely reflect their genuine proficiency level, prompting instructors to allocate specific attention to these students (Shell & Soh, 2013).

Second, the two-stage selection process before joining this program ensured that students had similar baseline abilities, including the necessary computational thinking and mathematical logic skills required for learning programming. However, despite these shared foundational capabilities, a significant portion of students, especially in Math, still exhibited cold-start behaviors, characterized by insufficient learning attention and engagement in the early stages of the course. Thus, we suggest increasing the proportion of formative assessments in the final grading or adding more formative assessments, particularly in the early stages of the course. This approach can indirectly exert some pressure to enhance students’ attention to learning, thereby alleviating the cold-start phenomenon (Quadir et al., 2023). Addressing this issue may significantly help students who are struggling to catch up later in the course.

Third, we recommend integrating programming instruction with practical applications relevant to non-CS majors while imparting programming knowledge. This approach allows students to experience the authentic relevance of programming in their respective majors and recognize the importance of cultivating computational thinking, thereby enhancing their sense of belonging, experience and performance in programming courses (Dawson et al., 2018). On the other hand, instructors can employ various strategies to enhance goal alignment or perceived usefulness; for instance, they could connect course activities to students’ future career goals and professional requirements (McGough Spence et al., 2022; Jones et al., 2023). In addition, fostering situational interest, such as incorporating novelty, sparking curiosity, and stimulating emotional engagement, can further motivate students and sustain their attention (Jones, 2018).

Fourth, this study underscores the importance of person-centered process-oriented methods. The data can be used to evaluate individual students’ learning processes, enabling a deeper understanding of the diverse learning trajectory patterns within the class. Such individual-level analysis can support educators in providing a more equitable and inclusive learning experience for all students with different backgrounds, which could be applied in the broader STEM education community (Jones et al., 2023).

Finally, there are some reflections on the implementation of this mixed program. Building upon the preceding discussions, we believe that this disparity arises from the early major choices made
This work is also supported in part by Key Research and Development Projects in Shaanxi Province of China (No. 2019ZDLGY03-10).
Data availability
The data sets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Declarations
Competing interests
The authors declare that they have no competing interests.
Author details
1 Northwest University, Xi’an 710127, China. 2 University of California, Davis, CA 95616, USA.
Received: 18 June 2024 Accepted: 10 April 2025
References
Allen, J., & Robbins, S. B. (2008). Prediction of college major persistence based on vocational interests, academic preparation, and first-year academic performance. Research in Higher Education, 49, 62–79.
Altadmri, A., & Brown, N.C. (2015). 37 million compilations: Investigating novice programming mistakes in large-scale student data. Proceedings of the 46th ACM technical symposium on computer science education (pp. 522–527).
Arnold, R., Langheinrich, M. & Hartmann, W. (2007). Infotraffic: teaching important concepts of computer science and math through real-world examples. Proceedings of the 38th SIGCSE technical symposium on computer science education (pp. 105–109).
Asikainen, H., & Gijbels, D. (2017). Do students develop towards more deep approaches to learning during studies? A systematic review on the development of students’ deep and surface approaches to learning in higher education. Educational Psychology Review, 29, 205–234.
Bati, K. (2022). A systematic literature review regarding computational thinking and programming in early childhood education. Education and Information Technologies, 27(2), 2059–2082.
Beck, E. D., & Jackson, J. J. (2021). Within-person variability (pp. 75–100). Elsevier.
Benton, L., Saunders, P., Kalas, I., Hoyles, C., & Noss, R. (2018). Designing for learning mathematics through programming: A case study of pupils engaging with place value. International Journal of Child–Computer Interaction, 16, 68–76.
Blikstein, P., Worsley, M., Piech, C., Sahami, M., Cooper, S., & Koller, D. (2014). Programming pluralism: Using learning analytics to detect patterns in the learning of computer programming. Journal of the Learning Sciences, 23(4), 561–599.
Brinkman, B., & Diekman, A. (2016). Applying the communal goal congruity perspective to enhance diversity and inclusion in undergraduate computing degrees. Proceedings of the 47th ACM technical symposium on computing science education, 102–107.
Broadbent, J., Sharman, S., Panadero, E., & Fuller-Tyszkiewicz, M. (2021). How does self-regulated learning influence formative assessment and summative grade? Comparing online and blended learners. The Internet and Higher Education, 50, 100805.
Brodley, C. E., Quam, M., & Weiss, M. (2024). An analysis of the math requirements of 199 CS BS/BA degrees at 158 U.S. universities. Communications of the ACM, 67(8), 122–31.
Brown, M. B., & Forsythe, A. B. (1974). Robust tests for the equality of variances. Journal of the American Statistical Association, 69(346), 364–367.
Bryan, C. J., Tipton, E., & Yeager, D. S. (2021). Behavioural science is unlikely to change the world without a heterogeneity revolution. Nature Human Behaviour, 5(8), 980–989.
Carnoy, M., Loyalka, P., Dobryakova, M., Dossani, R., Froumin, I., Kuhns, K., & Wang, R. (2013). University expansion in a changing global economy: Triumph of the BRICS? Stanford University Press.
Castle, S.D. (2023). Leveraging computational science students’ coding strengths for mathematics learning. Proceedings of the 54th ACM technical symposium on computer science education v. 1 (pp. 263–269).
Castle, S.D. (2024). Embracing mathematical conjecture through coding and computational thinking. Proceedings of the 55th ACM technical symposium on computer science education v. 2 (pp. 1594–1595).
Cetin, I., & Ozden, M. Y. (2015). Development of computer programming attitude scale for university students. Computer Applications in Engineering Education, 23(5), 667–672.
Chea, K., Moore, C. & Bares, W. (2020). Motivating future adventures in computing by unmasking math behind movie special effects. Proceedings of the 51st ACM technical symposium on computer science education (pp. 275–281).
Cheng, G., Zou, D., Xie, H., & Wang, F. L. (2024). Exploring differences in self-regulated learning strategy use between high- and low-performing students in introductory programming: An analysis of eye-tracking and retrospective think-aloud data from program comprehension. Computers & Education, 208, 104948.
Chou, T.-L., Tang, K.-Y., & Tsai, C.-C. (2021). A phenomenographic analysis of college students’ conceptions of and approaches to programming learning: Insights from a comparison of computer science and non-computer science contexts. Journal of Educational Computing Research, 59(7), 1370–1400.
Clear, A., Clear, T., Vichare, A., Charles, T., Frezza, S., Gutica, M. et al. (2020). Designing computer science competency statements: A process and curriculum model for the 21st century. Proceedings of the working group reports on innovation and technology in computer science education (pp. 211–246).
Dawson, J.Q., Allen, M., Campbell, A. & Valair, A. (2018). Designing an introductory programming course to improve non-majors’ experiences. Proceedings of the 49th ACM Technical Symposium on Computer Science Education, 26–31.
De Champlain, A. F. (2018). Standard setting methods in medical education: High-stakes assessment. Understanding medical education: Evidence, theory, and practice (pp. 347–359).
De Santo, A., Farah, J. C., Martínez, M. L., Moro, A., Bergram, K., Purohit, A. K., & Holzer, A. (2022). Promoting computational thinking skills in non-computer-science students: Gamifying computational notebooks to increase student engagement. IEEE Transactions on Learning Technologies, 15(3), 392–405.
Dewar, A., Hope, D., Jaap, A., & Cameron, H. (2021). Predicting failure before it happens: A 5-year, 1042 participant prospective study. Medical Teacher, 43(9), 1039–1043.
Diekman, A. B., Brown, E. R., Johnston, A. M., & Clark, E. K. (2010). Seeking congruity between goals and roles: A new look at why women opt out of science, technology, engineering, and mathematics careers. Psychological Science, 21(8), 1051–1057.
Diekman, A. B., Clark, E. K., Johnston, A. M., Brown, E. R., & Steinberg, M. (2011). Malleability in communal goals and beliefs influences attraction to STEM careers: Evidence for a goal congruity perspective. Journal of Personality and Social Psychology, 101(5), 902.
Diekman, A. B., Steinberg, M., Brown, E. R., Belanger, A. L., & Clark, E. K. (2017). A goal congruity model of role entry, engagement, and exit: Understanding communal goal processes in STEM gender gaps. Personality and Social Psychology Review, 21(2), 142–175.
Fleming, M., O’Sheaa, P., Vrbik, P., Webb, B. & Birkett, G. (2023). Learning to program-a tale of two cohorts. 34th Australasian Association for Engineering Education conference (AAEE2023) (pp. 431–439).
Forsström, S. E., & Kaufmann, O. T. (2018). A literature review exploring the use of programming in mathematics education. International Journal of Learning, Teaching and Educational Research, 17(12), 18–32.
Frezza, S., Daniels, M., Pears, A., Cajander, Å., Kann, V., Kapoor, A. & Wallace, C. (2018). Modelling competencies for computing education beyond 2020: a research based approach to defining competencies in the computing disciplines. Proceedings companion of the 23rd annual ACM conference on innovation and technology in computer science education (pp. 148–174).
Fryer, L. K. (2017). (Latent) transitions to learning at university: A latent profile transition analysis of first-year Japanese students. Higher Education, 73, 519–537.
Gale, J., Alemdar, M., Boice, K., Hernández, D., Newton, S., Edwards, D., & Usselman, M. (2022). Student agency in a high school computer science course. Journal for STEM Education Research, 5(2), 270–301.
Gao, Z., Cui, C., Yan, H., Liu, J., Sun, X. & Feng, J. (2025). Towards a quantitative competency model for CS1 via five-channel learning sequences. Proceedings of the 56th ACM Technical Symposium on Computer Science Education V. 1, 25–31.
Gao, Z., Yan, H., Wu, Y., Cui, C., Zhang, Y. & Feng, J. (2024). Exploring relations between programming learning trajectories and students’ majors. Proceedings of the ACM Turing Award Celebration Conference-China 2024 (pp. 177–180).
Gao, Z., Zhang, Y., Zhang, R., Sun, X., & Feng, J. (2022). Do gender or major influence the performance in programming learning? Teaching mode decision based on exercise series analysis. Computational Intelligence and Neuroscience, 2022(1), 7450669.
Guo, J.-P., Lv, S., Wang, S.-C., Wei, S.-M., Guo, Y.-R., & Yang, L.-Y. (2023). Reciprocal modeling of university students’ perceptions of the learning environment, engagement, and learning outcome: A longitudinal study. Learning and Instruction, 83, 101692.
Hao, X., Xu, Z., Guo, M., Hu, Y., & Geng, F. (2023). The effect of embedded structures on cognitive load for novice learners during block-based code comprehension. International Journal of STEM Education, 10(1), 42.
Harrison, R., Samaraweera, L., Dobie, M. R., & Lewis, P. H. (1996). An evaluation of code metrics for object-oriented programs. Information and Software Technology, 38(7), 443–450.
Hattie, J. A., & Donoghue, G. M. (2016). Learning strategies: A synthesis and conceptual model. NPJ Science of Learning, 1(1), 1–13.
Hickendorff, M., Edelsbrunner, P. A., McMullen, J., Schneider, M., & Trezise, K. (2018). Informative tools for characterizing individual differences in learning: Latent class, latent profile, and latent transition analysis. Learning and Individual Differences, 66, 4–15.
Hogan, E., Li, R. & Soosai Raj, A.G. (2023). CS0 vs. CS1: Understanding fears and confidence amongst non-majors in introductory CS courses. Proceedings of the 54th ACM Technical Symposium on Computer Science Education V. 1, 25–31.
Huang, Y., & Wang, S. (2023). How to motivate student engagement in emergency online learning? Evidence from the COVID-19 situation. Higher Education, 85(5), 1101–1123.
Impagliazzo, J. & Pears, A.N. (2018). The CC2020 project-computing curricula guidelines for the 2020s. 2018 IEEE Global Engineering Education Conference (EDUCON) (pp. 2021–2024).
Jevtić, M., Mladenović, S., & Granić, A. (2023). Source code analysis in programming education: Evaluating learning content with self-organizing maps. Applied Sciences, 13(9), 5719.
Jones, B.D. (2018). Motivating students by design: Practical strategies for professors. CreateSpace Independent Publishing Platform.
Jones, B. D., Ellis, M., Gu, F., & Fenerci, H. (2023). Motivational climate predicts effort and achievement in a large computer science course: Examining differences across sexes, races/ethnicities, and academic majors. International Journal of STEM Education, 10(1), 65.
Kaufmann, O. T., & Stenseth, B. (2021). Programming in mathematics education. International Journal of Mathematical Education in Science and Technology, 52(7), 1029–1048.
Kelly, R., & Allen, M. (2023). Exploring engagement and self-efficacy in an introductory computer science course. Proceedings of the 2023 ACM SIGPLAN International Symposium on SPLASH-E, 60–68.
Kiesler, N. (2020). On programming competence and its classification. Proceedings of the 20th Koli Calling international conference on computing education research (pp. 1–10).
Knuth, D. E. (1974). Computer science and its relation to mathematics. The American Mathematical Monthly, 81(4), 323–343.
Kohonen, T. (1990). The self-organizing map. Proceedings of the IEEE, 78(9), 1464–1480.
Kohonen, T. (2013). Essentials of the self-organizing map. Neural Networks, 37, 52–65.
Krause-Levy, S., Griswold, W.G., Porter, L. & Alvarado, C. (2021). The relationship between sense of belonging and student outcomes in CS1 and beyond. Proceedings of the 17th ACM conference on international computing education research (pp. 29–41).
Landivar, L. C. (2013). Disparities in STEM employment by sex, race, and Hispanic origin. Education Review, 29(6), 911–922.
Laurent, M., Crisci, R., Bressoux, P., Chaachoua, H., Nurra, C., de Vries, E., & Tchounikine, P. (2022). Impact of programming on primary mathematics learning. Learning and Instruction, 82, 101667.
Lee, G. & Lee, E-H. (2019). Computational thinking and programming for non-CS majors. E-Learn: World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education, 1013–1018.
Lee, H.-Y., Lin, C.-J., Wang, W.-S., Chang, W.-C., & Huang, Y.-M. (2023). Precision education via timely intervention in K-12 computer programming course to enhance programming skill and affective-domain learning objectives. International Journal of STEM Education, 10(1), 52.
Lei, H., Cui, Y., & Zhou, W. (2018). Relationships between student engagement and academic achievement: A meta-analysis. Social Behavior and Personality: An International Journal, 46(3), 517–528.
Lewis, C., Bruno, P., Raygoza, J. & Wang, J. (2019). Alignment of goals and perceptions of computing predicts students’ sense of belonging in computing. Proceedings of the 2019 ACM conference on international computing education research (pp. 11–19).
Li, R., Yin, Y., Dai, L., Shen, S., Lin, X., Su, Y. & Chen, E. (2022). PST: measuring skill proficiency in programming exercise process via programming skill tracing. Proceedings of the 45th international ACM SIGIR conference on research and development in information retrieval (pp. 2601–2606).
Liang, J.-C., Su, Y.-C., & Tsai, C.-C. (2015). The assessment of Taiwanese college students’ conceptions of and approaches to learning computer science and their relationships. The Asia-Pacific Education Researcher, 24, 557–567.
Lishinski, A. & Rosenberg, J. (2021). All the pieces matter: The relationship of momentary self-efficacy and affective experiences with CS1 achievement and interest in computing. Proceedings of the 17th ACM Conference on International Computing Education Research, 252–265.
Lishinski, A., Narvaiz, S. & Rosenberg, J.M. (2022). Self-efficacy, interest, and belongingness–URM students’ momentary experiences in CS1. Proceedings of the 2022 ACM Conference on International Computing Education Research-Volume 1, 44–60.
Liu, K., Han, Y., Zhang, J.M., Chen, Z., Sarro, F., Harman, M. & Ma, Y. (2023). Who judges the judge: An empirical study on online judge tests. Proceedings of the 32nd ACM SIGSOFT international symposium on software testing and analysis (pp. 334–346).
Liu, F., Zhao, L., Zhao, J., Dai, Q., Fan, C., & Shen, J. (2022). Educational process mining for discovering students’ problem-solving ability in computer programming education. IEEE Transactions on Learning Technologies, 15(6), 709–719.
López-Pernas, S., & Saqr, M. (2021). Bringing synchrony and clarity to complex multi-channel data: A learning analytics study in programming education. IEEE Access, 9, 166531–166541.
López-Pernas, S., & Saqr, M. (2024). How the dynamics of engagement explain the momentum of achievement and the inertia of disengagement: A complex systems theory approach. Computers in Human Behavior, 153, 108126.
Loyalka, P., Liu, O. L., Li, G., Kardanova, E., Chirikov, I., Hu, S., et al. (2021). Skill levels and gains in university STEM education in China, India, Russia and the United States. Nature Human Behaviour, 5(7), 892–904.
Martin, F., & Borup, J. (2022). Online learner engagement: Conceptual definitions, research themes, and supportive practices. Educational Psychologist, 57(3), 162–177.
McGough Spence, C., Kirn, A., & Benson, L. (2022). Perceptions of future careers for middle year engineering students. Journal of Engineering Education, 111(3), 595–615.
Nguyen, H., Lim, M., Moore, S., Nyberg, E., Sakr, M. & Stamper, J. (2021). Exploring metrics for the analysis of code submissions in an introductory data science course. LAK21: 11th International Learning Analytics and Knowledge Conference, 632–638.
Norcini, J. J. (2003). Setting standards on educational tests. Medical Education, 37(5), 464–469.
Nuñez-Varela, A. S., Pérez-Gonzalez, H. G., Martínez-Perez, F. E., & Soubervielle-Montalvo, C. (2017). Source code metrics: A systematic mapping study. Journal of Systems and Software, 128, 164–197.
Nylund-Gibson, K., & Choi, A. Y. (2018). Ten frequently asked questions about latent class analysis. Translational Issues in Psychological Science, 4(4), 440.
Olsson, J., & Granberg, C. (2024). Teacher–student interaction supporting students’ creative mathematical reasoning during problem solving using Scratch. Mathematical Thinking and Learning, 26(3), 278–305.
O’Malley, C. (2020). How do non-majors approach a CS1 course? Proceedings of the 51st ACM Technical Symposium on Computer Science Education, 1425–1425.
Parker, M.C., Ren, H., Li, M. & Wang, C. (2024). Intersectional biases within an introductory computing assessment. Proceedings of the 55th ACM technical symposium on computer science education v. 1 (pp. 1021–1027).
Pereira, F. D., Oliveira, E. H., Oliveira, D. B., Cristea, A. I., Carvalho, L. S., Fonseca, S. C., & Isotani, S. (2020). Using learning analytics in the Amazonas: Understanding students’ behaviour in introductory programming. British Journal of Educational Technology, 51(4), 955–972.
Quadir, B., Mostafa, K., Yang, J. C., Shen, J., & Akter, R. (2023). ARCS approach to PTA-based programming language practice sessions: Factors influencing programming problem-solving skills. Education and Information Technologies, 28(10), 13713–13735.
Quille, K., & Bergin, S. (2018). Programming: predicting student success early in CS1. A re-validation and replication study. Proceedings of the 23rd annual ACM conference on innovation and technology in computer science education (pp. 15–20).
Raj, R., Sabin, M., Impagliazzo, J., Bowers, D., Daniels, M., Hermans, F., et al. (2021). Professional competencies in computing education: Pedagogies and assessment. Proceedings of the 2021 working group reports on innovation and technology in computer science education (pp. 133–161).
Raj, R.K., Sabin, M., Impagliazzo, J., Bowers, D., Daniels, M., Hermans, F., et al. (2021). Toward practical computing competencies. Proceedings of the 26th ACM conference on innovation and technology in computer science education v. 2 (pp. 603–604).
Rezadoust Siah Khaleh Sar, H. (2023). The role of clandestine assessment in Iranian EFL context: a teacher’s perceptions about the impact of educational digital-based games on teaching and learning English. Education and Information Technologies, 1–25.
Robins, A., Rountree, J., & Rountree, N. (2003). Learning and teaching programming: A review and discussion. Computer Science Education, 13(2), 137–172.
Rybarczyk, R. (2020). Non-major peer mentoring for CS1. Proceedings of the 51st ACM technical symposium on computer science education (pp. 1068–1074). New York, NY, USA: Association for Computing Machinery.
Sabin, M., Impagliazzo, J., Alrumaih, H., Tang, C. & Zhang, M. (2018). IT2017 report: Implementing a competency-based information technology program. Proceedings of the 49th ACM technical symposium on computer science education (pp. 1045–1046).
Sabin, M., Kiesler, N., Kumar, A.N., MacKellar, B.K., McCauley, R., Raj, R.K. & Impagliazzo, J. (2023). Fostering dispositions and engaging computing educators. Proceedings of the 54th ACM technical symposium on computer science education v. 2 (pp. 1216–1217).
Salguero, A., Griswold, W.G., Alvarado, C. & Porter, L. (2021). Understanding sources of student struggle in early computer science courses. Proceedings of the 17th ACM Conference on International Computing Education Research, 319–333.
Saqr, M., & López-Pernas, S. (2021a). Idiographic learning analytics: A single student (n = 1) approach using psychological networks. CEUR Workshop Proceedings, 2868.
Saqr, M. (2024). Group-level analysis of engagement poorly reflects individual students’ processes: Why we need idiographic learning analytics. Computers in Human Behavior, 150, 107991.
Saqr, M., & López-Pernas, S. (2021). The longitudinal trajectories of online engagement over a full program. Computers & Education, 175, 104325.
Saqr, M., & López-Pernas, S. (2022). How CSCL roles emerge, persist, transition, and evolve over time: A four-year longitudinal study. Computers & Education, 189, 104581.
Saqr, M., & López-Pernas, S. (2024). Mapping the self in self-regulation using complex dynamic systems approach. British Journal of Educational Technology, 55(4), 1376–1397.
Saqr, M., López-Pernas, S., Helske, S., & Hrastinski, S. (2023). The longitudinal association between engagement and achievement varies by time, students’ profiles, and achievement state: A full program study. Computers & Education, 199, 104787.
Saqr, M., López-Pernas, S., Jovanović, J., & Gašević, D. (2023). Intense, turbulent, or wallowing in the mire: A longitudinal study of cross-course online tactics, strategies, and trajectories. The Internet and Higher Education, 57, 100902.
Saqr, M., López-Pernas, S., & Vogelsmeier, L. V. (2023). When, how and for whom changes in engagement happen: A transition analysis of instructional variables. Computers & Education, 207, 104934.
Shah, A., Hogan, E., Agarwal, V., Driscoll, J., Porter, L., Griswold, W.G. & Soosai Raj, A.G. (2023). An empirical evaluation of live coding in CS1. Proceedings of the 2023 ACM Conference on International Computing Education Research-Volume 1, 476–494.
Shalaginov, A., & Franke, K. (2015). A new method for an optimal SOM size determination in neuro-fuzzy for the digital forensics applications. Advances in Computational Intelligence: 13th International Work-Conference on Artificial Neural Networks, IWANN 2015, Palma de Mallorca, Spain, June 10-12, 2015. Proceedings, Part II 13, 549–563.
Shell, D. F., Hazley, M. P., Soh, L.-K., Ingraham, E., & Ramsay, S. (2013). Associations of students’ creativity, motivation, and self-regulation with learning and achievement in college computer science courses. IEEE Frontiers in Education Conference (FIE), 2013, 1637–1643.
Shell, D. F., & Soh, L.-K. (2013). Profiles of motivated self-regulation in college computer science courses: Differences in major versus required non-major courses. Journal of Science Education and Technology, 22, 899–913.
Shen, G., Yang, S., Huang, Z., Yu, Y., & Li, X. (2023). The prediction of programming performance using student profiles. Education and Information Technologies, 28(1), 725–740.
Sinclair, J. & Kalvala, S. (2015). Exploring societal factors affecting the experience and engagement of first year female computer science undergraduates. Proceedings of the 15th Koli Calling Conference on Computing Education Research, 107–116.
Song, D., Hong, H., & Oh, E. Y. (2021). Applying computational analysis of novice learners’ computer programming patterns to reveal self-regulated learning, computational thinking, and learning performance. Computers in Human Behavior, 120, 106746.
Spiegelhalter, D. J., Best, N. G., Carlin, B. P., & Van Der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society Series B: Statistical Methodology, 64(4), 583–639.
Stanja, J., Gritz, W., Krugel, J., Hoppe, A., & Dannemann, S. (2023). Formative assessment strategies for students’ conceptions-the potential of learning analytics. British Journal of Educational Technology, 54(1), 58–75.
Sun, Y. & Luo, C. (2021). A perspective on computer programming education for children in China. 2021 international conference on modern educational technology and social sciences (ICMETSS 2021) (pp. 154–158).
Sun, L., Hu, L., & Zhou, D. (2022). Programming attitudes predict computational thinking: Analysis of differences in gender and programming experience. Computers & Education, 181, 104457.
Sun, X., Li, B., Sutcliffe, R., Gao, Z., Kang, W., & Feng, J. (2023). WSE-MF: A weighting-based student exercise matrix factorization model. Pattern Recognition, 138, 109285.
Sun, D., Ouyang, F., Li, Y., & Zhu, C. (2021). Comparing learners’ knowledge, behaviors, and attitudes between two instructional modes of computer programming in secondary education. International Journal of STEM Education, 8, 1–15.
Sun, Q., Wu, J., & Liu, K. (2020). Toward understanding students’ learning performance in an object-oriented programming course: The perspective of program quality. IEEE Access, 8, 37505–37517.
Tellhed, U., Björklund, F., Kallio Strand, K., & Schöttelndreier, K. (2023). “Programming is not that hard!” when a science center visit increases young women’s programming ability beliefs. Journal for STEM Education Research, 6(2), 252–274.
Torbey, R. (2023). Inequities of enrollment: A quantitative analysis of participation in high school computer science coursework across a 4-year period. Proceedings of the 2023 ACM Conference on International Computing Education Research-Volume 1, 344–355.
Tran, H., Vu-Van, T., Bang, T., Le, T.-V., Pham, H.-A., & Huynh-Tuong, N. (2023). Data mining of formative and summative assessments for improving teaching materials towards adaptive learning: A case study of programming courses at the university level. Electronics, 12(14), 3135.
Vehec, I., & Pietriková, E. (2020). Metrics for student source code analysis. 2020 18th International Conference on Emerging eLearning Technologies and Applications (ICETA), 739–744.
Vesanto, J., Himberg, J., Alhoniemi, E., Parhankangas, J., et al. (1999). Self-organizing map in Matlab: The SOM toolbox. Proceedings of the Matlab DSP conference, 99, 16–17.
Villamor, M. M. (2020). A review on process-oriented approaches for analyzing novice solutions to programming problems. Research and Practice in Technology Enhanced Learning, 15(1), 8.
Wang, M.-T., & Degol, J. (2014). Staying engaged: Knowledge and research needs in student engagement. Child Development Perspectives, 8(3), 137–143.
Wang, M.-T., & Eccles, J. S. (2013). School context, achievement motivation, and academic engagement: A longitudinal study of school engagement using a multidimensional perspective. Learning and Instruction, 28, 12–23.
Wehrens, R., & Buydens, L. M. (2007). Self- and super-organizing maps in R: The Kohonen package. Journal of Statistical Software, 21, 1–19.
Wilson, B. C., & Shrock, S. (2001). Contributing to success in an introductory computer science course: A study of twelve factors. ACM SIGCSE Bulletin, 33(1), 184–188.
Winne, P. H. (2020). Construct and consequential validity for learning analytics based on trace data. Computers in Human Behavior, 112, 106457.
Wu, B., Hu, Y., Ruis, A. R., & Wang, M. (2019). Analysing computational thinking in collaborative programming: A quantitative ethnography approach. Journal of Computer Assisted Learning, 35(3), 421–434.
Xu, W., & Ouyang, F. (2022). The application of AI technologies in STEM education: A systematic review from 2011 to 2021. International Journal of STEM Education, 9(1), 59.
Xu, Z., Ritzhaupt, A. D., Umapathy, K., Ning, Y., & Tsai, C.-C. (2021). Exploring college students’ conceptions of learning computer science: A draw-a-picture technique study. Computer Science Education, 31(1), 60–82.
Zhang, W., Zeng, X., Wang, J., Ming, D., & Li, P. (2022). An analysis of learners’ programming skills through data mining. Education and Information Technologies, 27(8), 11615–11633.
Zhao, X., Luo, X., Shi, Q., Chen, C., Wang, S., Che, W. & Sun, M. (2025). Chartcoder: Advancing multimodal large language model for chart-to-code generation. arXiv preprint arXiv:2501.06598.
Zhen, R., Liu, R.-D., Wang, M.-T., Ding, Y., Jiang, R., Fu, X., & Sun, Y. (2020). Trajectory patterns of academic engagement among elementary school students: The implicit theory of intelligence and academic self-efficacy matters. British Journal of Educational Psychology, 90(3), 618–634.
Zhong, H-X., Chang, J-H., Lai, C-F., Chen, P-W., Ku, S-H. & Chen, S-Y. (2023). Information undergraduate and non-information undergraduate on an artificial intelligence learning platform: An artificial intelligence assessment model using PLS-SEM analysis. Education and Information Technologies, 1–30.
Zhong, B., & Xia, L. (2020). A systematic review on exploring the potential of educational robotics in mathematics education. International Journal of Science and Mathematics Education, 18(1), 79–101.
Zhu, G., & Zhu, X. (2010a). The growing self-organizing map for clustering algorithms in programming codes. 2010 International Conference on Artificial Intelligence and Computational Intelligence, 3, 178–182.
Zhu, X., & Zhu, G. (2010). Self-organizing map for clustering algorithms in programming codes. Third International Conference on Business Intelligence and Financial Engineering, 2010, 24–27.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.