Exercise Makes Better Mind A Data Mining Study On

This study examined the relationship between physical activity and academic achievement in 2,219 college students over 12 weeks. Decision tree models were used to analyze physical activity data from a running app and average academic course scores. The results found that higher regularity and a reduced average step frequency were associated with better academic performance. Students who exercised 1 time/week for 16-25 minutes had higher academic performance than 88% of other students. Decision trees can effectively analyze complex physical activity and academic data to provide insights for enhancing student achievement.


TYPE Original Research
PUBLISHED 16 October 2023
DOI 10.3389/fpsyg.2023.1271431

Exercise makes better mind: a data mining study on effect of physical activity on academic achievement of college students

Shuang Du 1, Hang Hu 2, Kaiwen Cheng 1 and Huan Li 2*
1 College of Language Intelligence, Sichuan International Studies University, Chongqing, China
2 College of Teacher Education, Southwest University, Chongqing, China

OPEN ACCESS

EDITED BY
Jorge Carlos-Vivas, University of Extremadura, Spain

REVIEWED BY
Corrado Lupo, University of Turin, Italy
Noelia Belando Pedreño, European University of Madrid, Spain

*CORRESPONDENCE
Huan Li
[email protected]

RECEIVED 02 August 2023
ACCEPTED 27 September 2023

CITATION
Du S, Hu H, Cheng K and Li H (2023) Exercise makes better mind: a data mining study on effect of physical activity on academic achievement of college students. Front. Psychol. 14:1271431. doi: 10.3389/fpsyg.2023.1271431

COPYRIGHT
© 2023 Du, Hu, Cheng and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

The effect of physical activity (PA) on academic achievement has long been a hot research issue in physical education, but few studies have been conducted using machine learning methods for analyzing activity behavior. In this paper, we collected the data on both physical activity and academic performance from 2,219 undergraduate students (Mean = 19 years) over a continuous period of 12 weeks within one academic semester. Based on students’ behavioral indicators transformed from a running app interface and the average academic course scores, two models were constructed and processed by CHAID decision tree for regression analysis and significance detection. It was found that first, to attain higher academic performance, it is imperative for students to not only exhibit exceptional activity regularity, but also sustain a reduced average step frequency; second, the students completing running exercise with an average frequency of 1 time/week and a duration of 16–25 min excelled over approximately 88% of other students on academic performance; third, the processing validity and reliability of physical observation data in complex systems can be improved by utilizing the decision tree as both a machine learning tool and a statistical method. These findings provide insights for educational practitioners and policymakers who seek to enhance college students’ academic performance through physical education programs, combined with data mining methods.

KEYWORDS
complex systems, college students, physical activity, running, academic performance, decision tree

Introduction
The relationship between physical activity and academic performance has been studied in various adolescent populations in different countries. For instance, data from public schools in the northeastern United States confirmed a positive correlation between physical fitness test scores and pass rates in math and English course assessments (Chomitz et al., 2010). Moreover, middle school students who met the aerobic endurance running standards not only had a higher likelihood of meeting standardized test benchmarks but also demonstrated improved academic performance (Bass et al., 2013). In Spain, after controlling for BMI z-scores, waist circumference, and body fat percentage, the levels of aerobic fitness and motor skills were positively correlated with the grades on math and language tests among 6–18-year-old adolescents (Esteban-Cornejo et al., 2014). Similarly, in Japan, cardiorespiratory fitness and overall health-related fitness were found to have a significant positive effect on academic performance among middle school students (Ishihara et al., 2018). Meanwhile, in a study involving 183 college students examining the relationship between physical fitness and academic performance, it was found that, apart from body mass index (BMI), all students’ physical fitness tests showed a significant positive correlation with average academic scores, indicating that high levels of physical fitness contribute positively to academic success (Başkurt et al., 2020). Zhang (2022) further investigated the factors influencing physical fitness scores among college students and identified physical fitness level, exercise frequency, and physical injuries as key factors. Currently, there is a contentious debate in the academic community regarding the apparent association between physical activity and academic performance due to varying research methodologies and data sources employed (Rodriguez et al., 2020).

In addition to the correlation and predictability of physical exercise on academic performance, some previous research has incorporated social cognitive theories from psychology to explain the underlying mechanisms. This suggests that the enhancement of students’ cognitive abilities through physical activity primarily manifests in self-control, specifically focusing on self-regulatory efficacy (Anderson et al., 2006). The impact of self-efficacy on self-regulation and its association with exercise are highlighted, with self-regulatory efficacy positively correlated with exercise intensity (Bauman et al., 2012). This explanation aligns well with social cognitive theory, as identifying oneself as an exerciser is, to some extent, influenced by past exercise experiences and serves as a source of self-efficacy (Bandura, 1997). Moreover, achieving the desired intensity of exercise is associated with various behavioral outcomes related to academic development (Strachan and Whaley, 2013), including weekly exercise minutes (Strachan et al., 2010), weekly exercise frequency, duration and intensity of vigorous exercise (Strachan and Brawley, 2008), and the number of weeks engaging in exercise (Anderson et al., 1998). These studies indicate a correlation between exercise intensity and self-regulation. Therefore, the question arises as to which specific aspect of cognitive processes in adolescents may be impacted by physical exercise and how exactly it influences cognition. Current research has only scratched the surface by exploring certain facets of cognitive processes, and the studies conducted thus far remain fragmented (Balk and Englert, 2020).

In the study of the mechanisms underlying the impact of physical activity on academic performance, two approaches are commonly used: examining the mediating variables in the causal pathway between the two factors and exploring the underlying mechanisms from other disciplines such as psychology and cognitive science. The former approach, as proposed by Kayani et al. (2018), was “physical activity → self-esteem → learning motivation and performance,” which suggests that the strongest mediator between physical activity and academic performance is self-esteem. To put it another way, physical activity could enhance students’ self-esteem, which may serve as a guarantee for their motivation and academic success. Liang and Li (2020) explored the pathway of “physical activity → physical health → academic performance” by considering both explicit physical appearance and implicit physical skills as mediating factors. The scholars underscored the pivotal role of physical fitness as a significant mediating factor influencing academic achievement (Chacón-Cuberos et al., 2020; Koçak et al., 2021). The aforementioned studies illuminate the substantial correlation existing between psychological factors, physical well-being, and academic attainment. Specifically, factors such as self-control and low self-efficacy have been found to exert a significant influence on tendencies toward overeating, weight gain, and diminished physical fitness. As the volume of data utilized in sports research continues to grow, the expansive magnitude and complex nature of sports-related data necessitate enhanced data processing techniques.

In the field of sports research, there is an increasing inclination toward the utilization of non-linear data mining techniques. These approaches offer practical insights into associations between predictor variables (e.g., team performance indicators) and dependent variables (e.g., match outcomes) (Robertson et al., 2016). Unlike linear methods, these approaches can reveal multiple patterns within the data (Mandorino et al., 2021; Teixeira et al., 2022). One widely used non-linear method is the decision tree, which partitions samples based on maximum information entropy (Mooney et al., 2017). Hijriana and Muttaqin (2016) applied decision trees to classify academic achievement, while You et al. (2018) used them to analyze physical activity’s impact on hypertension prevention in middle-aged and older adults in China. Pei et al. (2019) evaluated five classifiers for identifying individuals with diabetes based on clinical features. Benediktus and Oetama (2020) employed the decision tree C5.0 classification algorithm, based on information entropy, to predict student academic performance and explore the role of student activeness as a predictor. The use of information entropy allows for a comprehensive exploration of intricate relationships and patterns within the complex system of physical activity (Silva et al., 2016). In this study, information entropy was also employed to construct indicators of activity patterns, with the aim of quantitatively assessing the uncertainty and randomness in the exercise patterns and trends of college students.

The progression of research involving the CHAID (Chi-squared Automatic Interaction Detector) method, in contrast to the commonly used decision tree algorithm, can be traced through multiple studies. Sanz Arazuri and de Leon Elizondo (2010) initially elucidated the application of hierarchical segmentation with CHAID, laying the foundation. Subsequently, Gómez et al. (2015) employed CHAID to pinpoint influential variables in ball screens, demonstrating its practical use. Building on this, Robertson et al. (2016) delved deeper, revealing distinctions between teams and showcasing CHAID’s effectiveness in crafting performance indicator profiles. In a more recent study, Eagle et al. (2022) extended the research by utilizing CHAID for subgroup analysis and examining its role in assessing sport-related suicide risk. Throughout these studies, CHAID consistently displayed its potential in predicting behavior indicators and elucidating causal relationships, as underscored by Schnell et al. (2014), thus emphasizing its evolving significance in the field.

In the realm of academic inquiry, a contentious debate persists regarding the connection between physical activity and academic performance. This debate stems from the diverse research methodologies and data sources employed in previous studies (Rodriguez et al., 2020). Our research endeavors to contribute to this discourse by addressing several key objectives. Firstly, we aim to unravel the intricate relationship between physical activity and academic achievement among college students. We aspire to delve deeper into the impact of physical exercise on cognitive processes in adolescents. While prior research has touched upon this topic, our goal is to identify specific facets of cognition influenced by exercise intensity. Secondly, we recognize the need for advanced data processing techniques in sports research due to the complex and

expansive nature of sports-related data. By embracing non-linear data mining methodologies and leveraging information entropy, we aim to offer a fresh approach to exploring intricate relationships and patterns within the realm of physical activity and its impact on academic achievement. Furthermore, we also aim to elucidate the interplay between psychological factors, physical well-being, and academic attainment. By focusing on variables such as self-control and self-efficacy, we intend to shed light on their significant influence on behaviors related to physical fitness. Our research seeks to provide a holistic perspective on student well-being and academic success. We focused on three principal research objectives:

• Q1: Is there a correlation between the data model constructed using behavioral indicators and academic performance?
• Q2: How can machine learning techniques effectively uncover the factors that influence academic performance and attribute interpretability to physical activity metrics?
• Q3: How can the establishment of a pathway depicting the factors of physical activity on academic performance aid in revealing the potential mechanisms?

Methods

Data source and preprocessing

The research data was gathered over a continuous 12-week period during one academic semester from undergraduate students at Sichuan International Studies University in China, with an average age of 19.08 years. The data was obtained from two different systems. Firstly, approximately 9,000 academic records, including the grades of three subjects and physical fitness test scores, were retrieved from the Educational Administration System. Secondly, the physical activity log data for the research subjects during the semester was extracted from a running app installed on their mobile phones, yielding approximately 34,000 records.

In the context of this study, the log data was distributed across various business systems, necessitating a series of preprocessing steps to fully harness the data’s intrinsic value when constructing predictive indicators. Initially, the log data underwent anonymization and aggregation, involving the removal of sensitive information such as names, ID numbers, and phone numbers, followed by the correlation and integration of multiple datasets. Subsequently, common issues associated with log data, such as missing and imbalanced data, were addressed. Specifically, post-aggregation data underwent cleansing and adjustments. For instance, approximately 3.5% of students lacked running data, and there existed an imbalance in the gender ratio at the college (male-to-female ratio: 1:4.3). Hence, during the preprocessing stage, missing data were addressed by eliminating invalid and duplicate records. Additionally, for datasets exhibiting skewed distributions, a stratified sampling approach was employed for female students to reduce the sample size, while a bootstrap method was applied to male students to augment the sample size. This adjustment resulted in a more balanced male-to-female student data ratio of approximately 1:1.5, ensuring the integrity and validity of the predictive dataset. Ultimately, following data processing, a sample of 2,129 students was retained for the purposes of this research.

Physical behavioral indicators

Behavioral indicators are input datasets used for machine learning modeling. Wearable sports monitoring devices or mobile apps are applied to quantify various parameters and indicators of individuals and even groups, such as movement trajectories, exercise habits, energy expenditure, and health status. There are two main types of indicators: demographic indicators and behavioral indicators. Demographic indicators include basic personal information about students, such as age, gender, and major; they represent static data and have good predictive capabilities in the early stages of learning activities (Whitener, 1989). Behavioral indicators, on the other hand, encompass changing data generated during learning activities, such as activity frequency, duration and speed. These indicators represent dynamic data and exhibit better predictive effects in the middle and later stages of activities (Hussain et al., 2018; Karthikeyan et al., 2020). The research primarily investigates students’ behavioral performance, specifically the impact of dynamic indicators on academic performance. Hence, in the construction of the analytical model, performance indicators pertaining to physical exercise are carefully chosen. Subsequently, directional indicators are employed to visually represent and classify the findings, thereby providing an effective means to elucidate the observed outcomes.

The utilization of information entropy in constructing an activity regularity indicator for college students aims to quantitatively measure the uncertainty and randomness pertaining to their exercise patterns and trends. Information entropy plays a vital role in the analysis of intricate systems in sports research, providing researchers with quantitative measures to assess and analyze various aspects of complex sports systems (Rhea et al., 2011). For instance, the utilization of entropy measurements in team sports has exhibited considerable potential in evaluating the uncertainty pertaining to players’ spatial distributions, dominant regions, and various collective team behaviors (Silva et al., 2016). Additionally, entropy has been employed to analyze the complexity and information content of heart rate variability as an indicator of activity (Namazi, 2021). In this study, entropy measures have been employed in investigating the variability of performance to unveil the underlying interactions governing activity regulation among college students, and the indicator HX was calculated based on the distribution of exercise frequency. The entropy value was computed using the proportion f_jm of the number of exercise sessions on the jth day for the mth student out of the total number of exercise sessions over n days. The HX indicator codes and descriptions are presented in Table 1.

Physical behavioral indicators in the current study were constructed based on the key indicators of the Physical Activity Readiness Questionnaire (PAR-Q). These indicators were developed from three aspects: exercise intensity, duration, and frequency (Thomas et al., 1992; Liang, 1994; Shephard, 2015). PAR-Q is widely used to assess physical activity levels. By scoring the three dimensions in the questionnaire, the individual’s exercise volume is calculated using the formula “intensity * duration * frequency = exercise volume.” This study built exercise indicators reflecting students’ physical activity (running) over a 12-week period in one semester. These indicators included distance covered (in meters), average step frequency (steps per minute), average pace (meters per minute), running duration (in seconds), exercise regularity, and frequency. Among them, distance, step frequency, and pace reflected exercise intensity; running duration reflected exercise time; exercise regularity and frequency reflected

TABLE 1 Descriptive characteristics of physical activity behavioral indicators and academic achievement indicators.

Dimension | Variable | Code | Data range | Description
Physical activity behavioral indicators | Distance covered (meters) | DX | 592–7,413 | The average distance (in meters) covered by students per running session during the 12-week semester.
Physical activity behavioral indicators | Average step frequency (steps/min) | FX | 41.7–193.1 | The average step frequency (in steps per minute) of students' running sessions during the 12-week semester.
Physical activity behavioral indicators | Average running speed (meters/min) | SX | 4.2–11.3 | The average speed (in meters per minute) of students during each running session over the 12-week semester.
Physical activity behavioral indicators | Average running duration (seconds) | TX | 328–3,415 | The average duration (in seconds) of each running session for students over the 12-week semester.
Physical activity behavioral indicators | Activity regularity | HX | 0–1 | The regularity of exercise H_X was calculated based on the distribution of exercise frequency. The entropy value was computed using the proportion f_{jm} of the number of exercise sessions on the jth day for the mth student out of the total number of exercise sessions over n days (Formula 1).
Physical activity behavioral indicators | Activity frequency | VX | 1–43 | The total number of running sessions in the 12-week semester.
Academic achievement indicators | Academic performance score (AP) | AP | 23–98.7 | f_i denotes the final exam score for the ith major-specific course, and the weight is determined based on the credit value g_i of the course. This weight is used to calculate the weighted average score for the student's major courses (Formula 2).

Formula 1: H_X = -\frac{1}{\ln(n)} \sum_{j=1}^{n} f_{jm} \ln(f_{jm})

Formula 2: AP = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{g_i}{\sum_{k=1}^{n} g_k} \times f_i \right)

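As a minimal sketch of how the two indicators defined in Table 1 could be computed, the Python below implements the normalized entropy of Formula 1 and a credit-weighted mean for AP. The function names, the input layout (one session count per day, one score and credit per course), and the sample values are assumptions made for illustration; they are not taken from the study's code or data. The sketch also omits the leading 1/n factor printed in Formula 2, on the assumption that AP stays on the score scale, as the reported range of 23–98.7 suggests.

```python
import math
from typing import Sequence


def activity_regularity(daily_sessions: Sequence[int]) -> float:
    """Normalized entropy H_X of one student's session counts over n days.

    daily_sessions[j] is the number of sessions on day j (hypothetical layout).
    Returns a value between 0 (all sessions on one day) and 1 (evenly spread).
    """
    n = len(daily_sessions)
    total = sum(daily_sessions)
    if n < 2 or total == 0:
        # Assumption: with no sessions or a single day there is no spread to measure.
        return 0.0
    entropy = 0.0
    for count in daily_sessions:
        if count > 0:  # a term with f = 0 contributes nothing (0 * ln 0 := 0)
            f = count / total
            entropy -= f * math.log(f)
    return entropy / math.log(n)


def academic_performance(scores: Sequence[float], credits: Sequence[float]) -> float:
    """Credit-weighted mean of final exam scores f_i with weights g_i / sum(g)."""
    if len(scores) != len(credits) or not scores:
        raise ValueError("scores and credits must be non-empty and equally long")
    total_credits = sum(credits)
    if total_credits <= 0:
        raise ValueError("total credits must be positive")
    return sum(g / total_credits * f for f, g in zip(scores, credits))


# Illustrative values only
print(activity_regularity([1, 0, 1, 0, 1, 0, 1]))
print(academic_performance([90.0, 80.0], [3.0, 1.0]))
```

Under this reading, a student whose sessions are spread evenly over the observed days scores close to 1 on H_X, while a student who packs all sessions into one day scores 0.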
exercise frequency. The specific indicator codes and descriptions are presented in Table 1.

Academic achievement indicators

Academic performance (AP) indicators are influenced by a number of factors such as teacher subjectivity, selection bias, and student behavior (Marques et al., 2018). Scholars commonly employ standardized tests to assess AP. Examples include the Scholastic Aptitude Test (SAT) in the United States, the National High School Examination (ENEM) in Brazil, and the General Scholastic Ability Test (GSAT) for higher education admission in Taiwan. Some researchers also use final grades from common courses and major-specific courses within the students’ respective schools as indicators of academic performance. In the current study, the physical fitness scores and standardized average scores from major-specific courses of first-year university students over one semester were used as predictive targets to evaluate their physical fitness and academic performance. As for the selection of major-specific scores, due to the large sample size and the variation among students’ colleges and majors, AP was primarily determined by the average scores of their highest credit courses. The conversion method is detailed in Table 1.

Data mining based on machine learning

In order to enhance the interpretability of the study’s predictions, the target variables for prediction were not the conventional classification categories such as “pass,” “good,” and “excellent,” but rather continuous variables directly associated with academic performance scores. This choice transformed the task into a typical regression problem. The study had two main parts. Firstly, the data collected from the administration system and mobile apps were anonymized, aggregated, and cleaned, and the predictive variables were screened by correlation analysis and the variance inflation factor (VIF) to identify the optimal predictors. Secondly, the CHAID decision tree algorithm was utilized for significance testing and branch prediction, providing statistical explanations and attributions to the results, and identifying potential factors influencing academic performance from the patterns of physical activity behavior among college students. The flowchart covering data collection, preprocessing, the screening process, data model construction, and CHAID decision tree modeling is shown in Figure 1.

Data model

To validate and compare the predictive capabilities of physical behavioral indicators on academic performance, the behavioral dataset was divided into two subsets. Both subsets were associated with the predictive target variables of academic performance, forming the learner data models Model 1 and Model 2, as follows. These data models served as the data source for subsequent prediction model construction and performance comparison.

Model 1: Physical behavioral indicators (Variables) → Academic Performance Score (All Target).


Model 2: Physical behavioral indicators (Variables) → Academic Performance Score (Only AP > 80).

FIGURE 1
Flowchart of data mining based on physical activities.

Analysis tools

The predictive tools employed in this study utilized prediction algorithms provided by machine learning models, specifically SPSS Modeler for predictive modeling and analysis. The CHAID module in SPSS Modeler was used for decision tree visualization modeling. This module was used for branch prediction and significance analysis in the two data models. By utilizing the CHAID method, we could quickly and effectively unearth the primary influencing factors. This approach could handle nonlinear and highly correlated physical behavioral data. Furthermore, it could accommodate missing values, thus overcoming restrictions faced by traditional parametric tests in these aspects.

Results

Correlation analysis

Correlation analysis and variance inflation factor (VIF) tests were conducted on the behavioral indicators. The former assessed the correlation between the predictive indicators and the target variable, while the latter evaluated whether the collinearity among the indicators stayed within a controllable range. If the VIF value was less than 0.1 or greater than 10, it indicated poor predictive performance and necessitated adjustment or removal of the respective indicator (as shown in Table 2). From Table 2, it can be observed that the average running speed (SX) has a relatively high VIF value, but it still falls within a reasonable range. All other indicator VIF values are less than 3, indicating that all predictive indicators satisfy the collinearity condition and should be retained.

Impact of exercise performance indicators on academic performance from data model 1

The analysis of academic performance was conducted based on the indicators from data model 1, as shown in Figure 2.

From Figure 2, it is evident that exercise regularity significantly influences academic performance (p < 0.00). In Node 2, 70% of students exhibited exercise regularity ranging from 0.488 to 0.753. These students, as long as they maintain good exercise regularity, can achieve satisfactory academic performance (AP = 79.951, comparable to the overall average of 79.553). Within the subset of students with higher exercise regularity, some individuals (Node 6) not only demonstrate regular exercise habits but also fulfill the designated running distance (DX > 2731.63), resulting in above-average scores (AP = 80.896). The highest score is observed in Node 7, where students with the best exercise regularity (HX > 0.796) and not necessarily fast running or high step frequency (FX > 155.13) achieve the best academic performance (AP = 78.0). It is the students who exhibit regular, slower-paced, and lower step frequency exercise patterns that excel in academic performance.

Impact of exercise performance indicators on academic performance from data model 2

When investigating the impact of exercise frequency and duration on academic performance, no significant differences were found in the decision tree analysis among all study subjects (p > 0.05). Therefore, the study sample was reduced, focusing primarily on students with good academic performance (AP > 80). From a total of 2,129 students, 1,468 individuals (accounting for 68.9%) were selected as the new sample for further analysis, as depicted in Figure 3.

Based on Figure 3, it was evident that exercise frequency had a significant impact on achieving better academic performance (p < 0.00). As the number of exercise sessions (VX) increased from 8 to 10, academic performance also increased from 80.01 to 82.46, exhibiting a linear correlation trend. Among the majority of students (65.6%), exercise frequency exceeded 10 sessions (VX > 10). However, it was not the duration of each running session that determined the academic performance; instead, students (12.057%) with an average running time between 982.17 and 1555.33 s (16.4–26.1 min) achieved the best academic performance (AP = 83.632). Additionally, within this group of students, 44.69% had a running duration of less than 16 min, indicating relatively shorter running times and only meeting the minimum requirements. On the other hand, a small percentage (8.86%) of students had an average running time exceeding 26 min, indicating slower running speeds, primarily jogging or even walking,



TABLE 2 Descriptive statistics, correlations, and VIF between physical behavioral indicators and academic achievement indicators.

Variable | M | SD | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
1. Age | 19.08 | 0.776 | – | | | | | | |
2. TX | 1389.370 | 366.333 | 0.012 | – | | | | | |
3. DX | 2997.380 | 752.255 | 0.041 | 0.875** | – | | | | |
4. FX | 132.742 | 26.929 | 0.025 | −0.164** | −0.082** | – | | | |
5. SX | 7.707 | 0.917 | −0.063** | 0.360** | −0.107** | −0.207** | – | | |
6. HX | 0.649 | 0.136 | −0.051* | −0.240** | −0.306** | 0.040 | 0.085** | – | |
7. VX | 12.130 | 3.982 | −0.042 | −0.488** | −0.520** | 0.089** | 0.007 | 0.513** | – |
8. AP | 81.944 | 6.210 | −0.0076 | −0.002 | 0.048* | −0.027 | 0.010 | 0.090** | 0.097** | –
VIF | – | – | – | 1.416 | 2.883 | 1.061 | 8.909 | 1.37 | 1.731 | –

N = 2,129; M, mean; SD, standard deviation; VIF, variance inflation factor; *p < 0.05; **p < 0.01.
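For readers who want to reproduce a collinearity screen of the kind reported in the VIF row of Table 2, the sketch below computes variance inflation factors in plain Python. It relies on the standard identity that the VIF of predictor j, 1 / (1 − R_j²), equals the jth diagonal element of the inverse of the predictors' correlation matrix. The function names and the toy data are hypothetical, and this is not the SPSS procedure used in the study. Because a VIF is never below 1, the "less than 0.1" criterion quoted in the text presumably refers to the tolerance 1 / VIF; that reading is an assumption.

```python
import math
from typing import List, Sequence


def correlation_matrix(columns: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pearson correlation matrix of equally long predictor columns."""
    unit_columns = []
    for col in columns:
        mu = sum(col) / len(col)
        dev = [x - mu for x in col]
        norm = math.sqrt(sum(d * d for d in dev))
        if norm == 0:
            raise ValueError("constant column: correlation is undefined")
        unit_columns.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(u, v)) for v in unit_columns]
            for u in unit_columns]


def invert(matrix: Sequence[Sequence[float]]) -> List[List[float]]:
    """Gauss-Jordan inversion with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vif(columns: Sequence[Sequence[float]]) -> List[float]:
    """VIF per predictor: the diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Toy predictors (illustrative only): the first two are strongly related.
duration = [900.0, 1200.0, 1500.0, 1800.0, 2100.0]
distance = [2000.0, 2600.0, 3100.0, 3900.0, 4300.0]
cadence = [150.0, 120.0, 160.0, 125.0, 140.0]
print(vif([duration, distance, cadence]))
```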

FIGURE 2
CHAID decision tree analysis diagram based on data model 1.
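To make the idea of a significance-driven split on a continuous target more concrete, the following sketch scores every candidate cut point of one predictor with the one-way ANOVA F statistic and keeps the strongest one. It is a deliberately simplified stand-in, not the CHAID procedure of SPSS Modeler: CHAID-type algorithms also merge predictor categories and work with adjusted p-values, neither of which is reproduced here. The function names, the `min_size` parameter, and the sample values are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def f_statistic(groups: Sequence[Sequence[float]]) -> float:
    """One-way ANOVA F statistic for the target values in each group."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    if within == 0:
        return float("inf")  # groups are perfectly separated
    return (between / (k - 1)) / (within / (n - k))


def best_threshold(x: Sequence[float], y: Sequence[float],
                   min_size: int = 2) -> Optional[Tuple[float, float]]:
    """Return (cut point, F) of the binary split of x that best separates y.

    min_size is the smallest group allowed on either side of the cut.
    Returns None when no admissible cut exists.
    """
    pairs = sorted(zip(x, y))
    best: Optional[Tuple[float, float]] = None
    for i in range(min_size, len(pairs) - min_size + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between identical predictor values
        left: List[float] = [target for _, target in pairs[:i]]
        right: List[float] = [target for _, target in pairs[i:]]
        score = f_statistic([left, right])
        if best is None or score > best[1]:
            cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (cut, score)
    return best


# Illustrative values only: a regularity-like predictor and a score-like target.
regularity = [0.20, 0.30, 0.40, 0.80, 0.85, 0.90]
scores = [78.0, 79.0, 78.0, 83.0, 84.0, 83.0]
print(best_threshold(regularity, scores))
```

Applying such a search recursively to each resulting subgroup, and stopping when no cut is statistically significant, would give a tree with the same general shape as the node diagrams, though not necessarily the same thresholds.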

and insufficient intensity for cardiovascular exercise. Nodes 4 and 6 demonstrated a threshold effect, displaying an inverted U-shaped trend. While these students can also achieve satisfactory academic performance (AP > 82), their overall exercise effectiveness was inferior to that of students in Node 5, whose academic performance exceeded that of approximately 88% of all other students.

Discussion

The effect of activity tasks on academic performance

We undertook a correlation validation analysis to address the first research question (Q1) and used the CHAID methodology to identify the most substantial influencing factors for the second research question (Q2). In terms of academic performance, students who successfully complete assigned tasks may achieve satisfactory average grades. However, to attain higher academic performance (AP = 82.932), as depicted in Node 7 of Figure 2, students need not only to demonstrate excellent activity regulation (HX > 0.796) but also to maintain a lower stride frequency (FX < 155.13). This implies that these students predominantly engage in jogging or walking, indicating lower exercise intensity than students in Node 8. It can be inferred that consistent engagement in low-intensity running promotes regular and sustained physical activity, indirectly affirming the endurance training component of exercise. This contributes to the development of students’ self-control and self-efficacy, which in turn aligns with their academic performance. In the academic domain, students are encouraged to cultivate a mindset of continuous learning


FIGURE 3
CHAID decision tree analysis diagram based on data model 2.

and steadfastness, rather than relying solely on intense and short-term bursts of studying. It is through consistent effort and perseverance that students can build a solid foundation of knowledge and skills, enhancing their academic achievements in the long run. By integrating regular physical activity into their routines, students not only improve their cardiovascular and aerobic fitness but also develop important qualities such as discipline, focus, and resilience, all of which are conducive to academic success. This highlights the significance of maintaining a balanced approach to both physical exercise and academic pursuits, recognizing the synergistic relationship between the two domains. Therefore, emphasizing the value of consistent and moderate exercise contributes to the overall well-being and holistic development of students, ultimately benefiting their academic endeavors.

Optimal activity frequency and duration for academic performance

For the second research question (Q2), we sought to identify the factors that substantially influence academic performance and to make the physical activity metrics interpretable using machine learning techniques. To this end, we used the CHAID method to identify and highlight the most pivotal influencing factors. According to Figure 2, 67% of college students engage in physical activity with a frequency ranging from 7 to 14 times over the course of 12 weeks, which yields the maximum improvement in AP. Furthermore, 44.29% of students participate in physical activity between 10 and 14 times (at least once per week on average), resulting in favorable academic achievements (AP > 82). According to Figure 3, students who engage in physical activity for durations ranging from 16 to 26 min demonstrate the highest predictive capability for academic performance. Although the proportion of these students in Node 5 is not high (12.06%), it reflects the positive impact of physical activity on improving cardiorespiratory endurance and regulating self-efficacy. Considering the average running distance, most students have covered over 2 kilometers after running for 16 min, which is a critical period for cardiorespiratory/aerobic fitness (C/AF) development. These students are capable of maintaining a moderate pace during running without rushing to complete the distance task. Their awareness of self-regulation efficiency influences goal selection, persistence in goal achievement, and response to setbacks, thereby enhancing their self-regulatory abilities (Maddux et al., 2012). Allocating up to approximately an additional half hour per day of curricular time to a physical activity program does not negatively affect the academic performance of primary school students, even though the time allocated to other subjects usually shows a corresponding reduction.

Mechanism underlying the impact of physical activity on academic performance

To address the third research question (Q3), which pertains to elucidating the potential mechanisms by which establishing


pathways may be beneficial, our study furnishes evidence for a mediating pathway within the impact mechanism. Specifically, we propose the pathway as follows: “physical exercise → self-control ability → academic performance.” The self-control ability is derived from college students engaging in low-intensity running during physical exercise, which allows them to control their speed without rushing to reach their fitness goals while still achieving the required intensity. This also supports the findings of Xu et al. (2018), who concluded that executive function serves as an intermediate variable by which physical exercise promotes academic performance, explaining the pathway as “physical exercise → executive function → academic performance.” Furthermore, physical exercise offers the advantage of being regularly and consistently performed on a weekly basis, thus enhancing college students’ confidence and self-efficacy. This finding further corroborates the conclusion of Anderson et al. (2006) that while exercise directly influences academic performance, psychological-social factors and physical fitness levels play a mediating role. Through the expenditure of body fat calories during exercise, college students enhance their self-control ability and willpower, representing a self-regulatory structure that impacts individuals’ efforts to maintain consistency between cognition and behavior (Anderson et al., 2006).

Data mining in sport education research

Physical activity involves complex decision-making processes, necessitating effective tools and techniques to support physical educators. In physical education research, it is essential to continue exploring research and experimental tools in practical investigations, fostering the in-depth application of advanced quantitative research methods. In regression problems, machine learning algorithms must demonstrate not just robust predictive abilities but also effective generalization. Therefore, in this study, the analysis extended beyond examining mean values of each indicator. To better capture the model’s generalization and explanatory power, the CHAID decision tree was employed, enabling statistical significance testing and offering comprehensive regression results (Morgan et al., 2013). Decision trees, as a machine learning tool, have played a role in researching and solving complex problems in many fields and have gained attention as a promising approach for tackling the intricacies and uncertainties associated with analyzing physical activity. Especially in the current era of big data, the abundance of data collected from observations of physical exercise (PE) and physical activity (PA) enables the emergence of behavioral patterns. By leveraging machine learning tools and statistical methods, the processing validity and reliability of physical observation data in complex systems can be improved (Robertson and Joyce, 2015). This serves as the material foundation and underlying logic for educational data mining and data-driven approaches, which are essential for enhancing educational management and informed decision-making. For instance, unsupervised learning methods can be employed to classify or cluster groups based on sports-related data using entropy-based techniques (Rhea et al., 2011; Namazi, 2021; Yang, 2021).

Limitations

First, this study leveraged a sizable sample for evaluating academic performance in relation to physical activity. Our research demonstrated an approach to enhance the interpretability and effectiveness of decision trees in processes. The challenges pertaining to missing physical exercise data, overfitting during model construction, and optimization of model parameters remain to be addressed. Secondly, participants’ levels of physical activity may not be fully reflected in the data obtained from the running application (APP), since some special cases may not have been excluded completely: low physical activity values could be due to student dropout or illness-related leave, and exceptionally high values could be attributed to student athletes or long-distance running enthusiasts (Lupo et al., 2017a). Thirdly, university students may engage in physical exercise for varying objectives, such as medals, participation in competitive events, or improving their academic performance. Therefore, future research will delve further into the motivations behind physical exercise and their direct or indirect (mediating) impact on academic performance (Lupo et al., 2017b; Liang and Li, 2020).

Conclusion

This study utilized machine learning methods to investigate the impact of physical activity on academic achievement among undergraduates. The decision tree model effectively captured the relationship between physical and academic performance. Activity regularity exhibited varying degrees of influence on the interaction between physical test scores and academic achievement, and helps explain the relationship between physical activity and academic achievement in terms of psycho-social factors and physical fitness level. These findings contribute to the existing literature on the subject and provide insights for educational practitioners to enhance academic performance through physical activity interventions.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Ethics statement

The studies involving humans were approved by Sichuan International Studies University. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.

Author contributions

SD: Formal analysis, Funding acquisition, Methodology, Writing – original draft, Writing – review & editing. HH: Supervision, Writing – review & editing. KC: Investigation,


Validation, Writing – original draft, Writing – review & editing. HL: Methodology, Writing – review & editing.

Funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This research was supported by the Science and Technology Research Program of Chongqing Municipal Education Commission (Grant No. KJZD-K202200903).

Acknowledgments

The data analyzed in this study came from college students at Sichuan International Studies University in China.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References
Anderson, D. F., Cychosz, C. M., and Franke, W. D. (1998). Association of exercise identity with measures of exercise commitment and physiological indicators of fitness in a law enforcement cohort. J. Sport Behav. 21, 233–241.
Anderson, E. S., Wojcik, J. R., Winett, R. A., and Williams, D. M. (2006). Social-cognitive determinants of physical activity: the influence of social support, self-efficacy, outcome expectations, and self-regulation among participants in a church-based health promotion study. Health Psychol. 25, 510–520. doi: 10.1037/0278-6133.25.4.510
Balk, Y. A., and Englert, C. (2020). Recovery self-regulation in sport: theory, research, and practice. Int. J. Sports Sci. Coach. 15, 273–281. doi: 10.1177/1747954119897528
Bandura, A. (1997). Self-efficacy: the exercise of control. Freeman Press, New York, NY.
Bauman, A. E., Reis, R. S., Sallis, J. F., Wells, J. C., Loos, R. J. F., and Martin, B. W. (2012). Correlates of physical activity: why are some people physically active and others not? Lancet 380, 258–271. doi: 10.1016/S0140-6736(12)60735-1
Başkurt, Z., Başkurt, F., and Ercan, S. (2020). Correlations of physical fitness and academic achievement in undergraduate students. J. Phys. Educ. Hum. Move. 2, 9–20. doi: 10.24310/JPEHMjpehm.v2i1.6770
Bass, R. W., Brown, D. D., Laurson, K. R., and Coleman, M. M. (2013). Physical fitness and academic performance in middle school students. Acta Paediatr. Scand. 102, 832–837. doi: 10.1111/apa.12278
Benediktus, N., and Oetama, R. S. (2020). The decision tree c5.0 classification algorithm for predicting student academic performance. Ultimatics: Jurnal Teknik Informatika 12, 14–19. doi: 10.31937/ti.v12i1.1506
Chacón-Cuberos, R., Zurita-Ortega, F., Ramírez-Granizo, I., and Castro-Sánchez, M. (2020). Physical activity and academic performance in children and preadolescents: a systematic review. Apunts. Educación Física y Deportes 139, 1–9. doi: 10.5672/apunts.2014-0983.es.(2020/1).139.01
Chomitz, V. R., Slining, M. M., McGowan, R. J., Mitchell, S. E., Dawson, G. F., Hacker, K. A., et al. (2010). Is there a relationship between physical fitness and academic achievement? Positive results from public school children in the northeastern United States. J. Sch. Health 79, 30–37. doi: 10.1111/j.1746-1561.2008.00371.x
Colley, R. C., Garriguet, D., Janssen, I., Craig, C. L., Clarke, J., and Tremblay, M. S. (2011). Physical activity of Canadian adults: accelerometer results from the 2007 to 2009 Canadian health measures survey (no. 82-003-X). Retrieved from Statistics Canada Canadian Centre for Health website. Available at: http://www.statcan.gc.ca/pub/82-003-x/2011001/article/11396-eng.htm
Eagle, S. R., Brent, D., Covassin, T., Elbin, R. J., Wallace, J., Ortega, J., et al. (2022). Exploration of race and ethnicity, sex, sport-related concussion, depression history, and suicide attempts in US youth. JAMA Netw. Open 5:e2219934. doi: 10.1001/jamanetworkopen.2022.19934
Esteban-Cornejo, I., Tejero-González, C. M., Martinez-Gomez, D., del-Campo, J., González-Galo, A., Padilla-Moledo, C., et al. (2014). Independent and combined influence of the components of physical fitness on academic performance in youth. J. Pediatr. 165, 306–312.e2. doi: 10.1016/j.jpeds.2014.04.044
Farrahi, V., Niemelä, M., Kärmeniemi, M., Puhakka, S., Kangas, M., Korpelainen, R., et al. (2020). Correlates of physical activity behavior in adults: a data mining approach. Int. J. Behav. Nutr. Phys. Act. 17, 1–14. doi: 10.1186/s12966-020-00996-7
Gómez, M. Á., Battaglia, O., Lorenzo, A., Lorenzo, J., Jiménez, S., and Sampaio, J. (2015). Effectiveness during ball screens in elite basketball games. J. Sports Sci. 33, 1844–1852. doi: 10.1080/02640414.2015.1014829
Hijriana, N., and Muttaqin, R. (2016). Penerapan metode decision tree algoritma c4.5 untuk klasifikasi mahasiswa berprestasi. Al-Ulum: J. Sains Teknol. 2, 39–43. doi: 10.31602/ajst.v2i1.651
Hussain, M., Zhu, W., Zhang, W., and Abidi, S. M. R. (2018). Student engagement predictions in an e-learning system and their impact on student course assessment scores. Comput. Intell. Neurosci. 2018:6347186. doi: 10.1155/2018/6347186
Ishihara, T., Morita, N., Nakajima, T., Okita, K., Sagawa, M., and Yamatsu, K. (2018). Modeling relationships of achievement motivation and physical fitness with academic performance in Japanese school children: moderation by gender. Physiol. Behav. 194, 66–72. doi: 10.1016/j.physbeh.2018.04.031
Karthikeyan, V. G., Thangaraj, P., and Karthik, S. (2020). Towards developing hybrid educational data mining model (HEDM) for efficient and accurate student performance evaluation. Soft. Comput. 24, 18477–18487. doi: 10.1007/s00500-020-05075-4
Kayani, S., Kiyani, T., Wang, J., Zagalaz Sánchez, M., Kayani, S., and Qurban, H. (2018). Physical activity and academic performance: the mediating effect of self-esteem and depression. Sustainability 10:3633. doi: 10.3390/su10103633
Koçak, Ö., Göksu, İ., and Göktas, Y. (2021). The factors affecting academic achievement: a systematic review of meta analyses. Int. Onl. J. Educ. Teach. 8, 454–484.
Liang, Z., and Li, M. L. (2020). Progress of research on physical health promotion and academic performance of adolescent children. J. Phys. Educ. 27, 96–102. doi: 10.16237/j.cnki.cn44-1404/g8.2020.03.015
Liang, D. C. (1994). Stress levels of college students and their relationship with physical activity. Chin. J. Ment. Health 1, 5–6. doi: 10.3321/j.issn:1000-6729.1994.01.020
Lupo, C., Mosso, C. O., Guidotti, F., Cugliari, G., Pizzigalli, L., and Rainoldi, A. (2017a). The adapted Italian version of the baller identity measurement scale to evaluate the student-athletes’ identity in relation to gender, age, type of sport, and competition level. PLoS One 12:e0169278. doi: 10.1371/journal.pone.0169278
Lupo, C., Mosso, C. O., Guidotti, F., Cugliari, G., Pizzigalli, L., and Rainoldi, A. (2017b). Motivation toward dual career of Italian student-athletes enrolled in different university paths. Sport Sci. Health 13, 485–494. doi: 10.1007/s11332-016-0327-4
Maddux, J. M. N., Schiffino, F. L., and Chang, S. E. (2012). The amygdala central nucleus: a new region implicated in habit learning. J. Neurosci. 32, 7769–7770. doi: 10.1523/JNEUROSCI.1223-12.2012
Mandorino, M., Figueiredo, A., Cima, G., and Tessitore, A. (2021). A data mining approach to predict non-contact injuries in young soccer players. Int. J. Comp. Sci. Sport 20, 147–163. doi: 10.2478/ijcss-2021-0009
Marques, A., Da, S., Hillman, C., and Sardinha, L. B. (2018). How does academic achievement relate to cardiorespiratory fitness, self-reported physical activity and objectively reported physical activity: a systematic review in children and adolescents aged 6-18 years. Br. J. Sports Med. 52:1039. doi: 10.1136/bjsports-2016-097361
Mooney, M., Charlton, P. C., Soltanzadeh, S., and Drew, M. K. (2017). Who ‘owns’ the injury or illness? Who ‘owns’ performance? Applying systems thinking to integrate health and performance in elite sport. Br. J. Sports Med. 51, 1054–1055. doi: 10.1136/bjsports-2016-096649
Morgan, S., Williams, M. D., and Barnes, C. (2013). Applying decision tree induction for identification of important attributes in one-versus-one player interactions: a hockey exemplar. J. Sports Sci. 31, 1031–1037. doi: 10.1080/02640414.2013.770906
Namazi, H. (2021). Complexity and information-based analysis of the heart rate variability (HRV) while sitting, hand biking, walking, and running. Fractals 29:2150201. doi: 10.1142/S0218348X21502017
Pei, D., Gong, Y., Kang, H., Zhang, C., and Guo, Q. (2019). Accurate and rapid screening model for potential diabetes mellitus. BMC Med. Inform. Decis. Mak. 19, 1–8. doi: 10.1186/s12911-019-0790-3


Rhea, C. K., Silver, T. A., Hong, S. L., Ryu, J. H., Studenka, B. E., Hughes, C. M., et al. (2011). Noise and complexity in human postural control: interpreting the different estimations of entropy. PLoS One 6:e17696. doi: 10.1371/journal.pone.0017696
Robertson, S. J., and Joyce, D. G. (2015). Informing in-season tactical periodization in team sport: development of a match difficulty index for super Rugby. J. Sports Sci. 33, 99–107. doi: 10.1080/02640414.2014.925572
Robertson, S., Back, N., and Bartlett, J. D. (2016). Explaining match outcome in elite Australian rules football using team performance indicators. J. Sports Sci. 34, 637–644. doi: 10.1080/02640414.2015.1066026
Rodriguez, C. C., Camargo, E. M., Rodriguez-Añez, C. R., and Reis, R. S. (2020). Physical activity, physical fitness and academic achievement in adolescents: a systematic review. Rev. Bras. Med. Esporte 26, 441–448. doi: 10.1590/1517-8692202026052019_0048
Strachan, S. M., and Whaley, D. (2013). “Identities, schemas, and definitions: how aspects of the self influence exercise behaviour” in Handbook of physical activity and mental health. ed. P. Ekkekakis (New York: Routledge), 212–223.
Strachan, S. M., Brawley, L. R., and Spink, K. (2010). Glazebrook. Older adults’ physically-active identity: relationships between social cognitions, physical activity and satisfaction with life. Psychol. Sport Exerc. 11, 114–121. doi: 10.1016/j.psychsport.2009.09.002
Strachan, S. M., and Brawley, L. R. (2008). Reactions to a challenge to identity: a focus on exercise and healthy eating. J. Health Psychol. 13, 575–588. doi: 10.1177/1359105308090930
Sanz Arazuri, E., and de Leon Elizondo, A. P. (2010). Key to applying the CHAID algorithm: a study of university physical-sport leisure activities. Revista De Psicologia Del Deporte 19, 319–333.
Schnell, A., Mayer, J., Diehl, K., Zipfel, S., and Thiel, A. (2014). Giving everything for athletic success! – sports-specific risk acceptance of elite adolescent athletes. Psychol. Sport Exerc. 15, 165–172. doi: 10.1016/j.psychsport.2013.10.012
Shephard, R. J. (2015). Qualified fitness and exercise as professionals and exercise prescription: evolution of the PAR-Q and Canadian aerobic fitness test. J. Phys. Act. Health 12, 454–461. doi: 10.1123/jpah.2013-0473
Silva, P., Duarte, R., Esteves, P., Travassos, B., and Vilar, L. (2016). Application of entropy measures to analysis of performance in team sports. Int. J. Perform. Anal. Sport 16, 753–768. doi: 10.1080/24748668.2016.11868921
Thomas, S., Reading, J., and Shephard, R. J. (1992). Revision of the physical activity readiness questionnaire (PAR-Q). Can. J. Sport Sci. 17, 338–345.
Teixeira, J. E., Forte, P., Ferraz, R., Branquinho, L., Silva, A. J., Barbosa, T. M., et al. (2022). Methodological procedures for non-linear analyses of physiological and behavioural data in football. Exerc. Physiol. 1, 1–25. doi: 10.5772/intechopen.102577
Whitener, E. M. (1989). A meta-analytic review of the effect on learning of the interaction between prior achievement and instructional support. Rev. Educ. Res. 59, 65–86. doi: 10.3102/00346543059001065
Xu, W., Zhang, Y., Zhou, L., Hua, J., School, E., University, Z. (2018). Influence of physical fitness on academic achievement in adolescents: evidences from a longitudinal study. Journal of Beijing Sports University 41, 70–76. doi: 10.19582/j.cnki.11-3785/g8.2018.07.010
Yang, B. (2021). Learning motivations and learning Behaviors of sports majors based on big data. Int. J. Emerg. Technol. Learn. 16, 86–97. doi: 10.3991/ijet.v16i23.27823
You, Y., Teng, W., Wang, J., Ma, G., Ma, A., Wang, J., et al. (2018). Hypertension and physical activity in middle-aged and older adults in China. Sci. Rep. 8:16098. doi: 10.1038/s41598-018-34617-y
Zhang, Y. (2022). An empirical study on the influence of college students’ physical fitness on the level of public health. J. Environ. Public Health 8:8197903. doi: 10.1155/2022/8197903
