Impact of Practical Skills on Academic Performance: A Data-Driven Analysis
All content following this page was uploaded by Md. Mostafizer Rahman on 24 October 2021.
Received August 26, 2021, accepted October 5, 2021, date of publication October 8, 2021, date of current version October 19, 2021.
Digital Object Identifier 10.1109/ACCESS.2021.3119145
ABSTRACT Most academic courses in information and communication technology (ICT) or engineering
disciplines are designed to improve practical skills; however, practical skills and theoretical knowledge are
equally important to achieve high academic performance. This research aims to explore how practical skills
are influential in improving students’ academic performance by collecting real-world data from a computer
programming course in the ICT discipline. Today, computer programming has become an indispensable
skill for its wide range of applications and significance across the world. In this paper, a novel framework to
extract hidden features and related association rules using a real-world dataset is proposed. An unsupervised
k-means clustering algorithm is applied for data clustering, and then the frequent pattern-growth algorithm
is used for association rule mining. We leverage students’ programming logs and academic scores as
an experimental dataset. The programming logs are collected from an online judge (OJ) system, as OJs
play a key role in conducting programming practices, competitions, assignments, and tests. To explore
the correlation between practical (e.g., programming, logical implementations, etc.) skills and overall
academic performance, the statistical features of students are analyzed and the related results are presented.
A number of useful recommendations are provided for students in each cluster based on the identified hidden
features. In addition, the analytical results of this paper can help teachers prepare effective lesson plans,
evaluate programs with special arrangements, and identify the academic weaknesses of students. Moreover,
a prototype of the proposed approach and data-driven analytical results can be applied to other practical
courses in ICT or engineering disciplines.
INDEX TERMS Practical skills, programming education, feature extraction, educational data mining,
learning analytics, e-learning, online judge, clustering, association rule mining.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 9, 2021 139975
M. M. Rahman et al.: Impact of Practical Skills on Academic Performance: A Data-Driven Analysis
that students achieve computer literacy [1], most educational institutions that teach programming
have redesigned their academic curriculum to effectively meet the basic literacy requirements of
programming education.
Basic computer programming courses are normally available in the first semester of university
studies. Initial programming classes have the ancillary role of attracting students to the field of
computer programming. Because students may make decisions based on these initial programming
classes, it is essential that those classes impart positive programming experiences. Note that
introductory programming courses have a significant rate of failure and dropout [2]. However,
due to limited class time, classrooms, and teachers, and limitations in other forms of logistic
support, it is difficult to fully educate students in programming through traditional programming
classes alone. To overcome these problems, online judge (OJ) systems provide additional platforms
that enable students to continue their programming studies over a period of years [3]. Such systems
normally contain large collections of interesting programming problems [4] that students can pursue
independently or teachers can assign to stimulate students’ interest. The concept of the OJ system
was first introduced at the 1977 International Collegiate Programming Contest (ICPC) [5], [6],
which is now held annually. Furthermore, because OJ systems have proven useful, many universities
and colleges are now attempting to develop online support systems for programming
education [7]–[9].
Today, OJs are used by many educational institutions to conduct courses related to programming,
computing, and software engineering [10], [11]. Many universities have created their own automated
program assessment (APA) systems for programming courses to accelerate students’ learning
[12]–[14]. As a result, a large number of programming-related submission logs are created every day
by OJ or APA systems in various organizations worldwide, which can be valuable resources for
research and analysis [15], [16]. Therefore, this research aims to use programming-related
resources (submission logs) for empirical research and analysis.
Educational data collected from various e-learning platforms such as Moodle, MOOCs, OJs, and APAs
are not unified, structured, or well organized, because the data archiving format differs from one
e-learning platform to another. Therefore, educational data mining (EDM) and learning analytics
(LA) techniques are effective in transforming these big educational data into useful knowledge and
patterns that can be applied to improve overall education. EDM has become an effective technique
for exploring invisible knowledge and useful patterns in educational data. Nowadays, traditional
education is changing at an unprecedented pace and many academic activities are conducted on
e-learning platforms. The collection of this vast amount of educational data has opened up
opportunities for research and analysis to understand and improve learning outcomes. V. Hegde and
S. Rao H.S. [17] presented an EDM-based framework to analyze students’ performance in programming.
The results of the analysis helped students to improve their weak concepts through frequent faculty
support and also offered benefits to institutions. Ang et al. [18] conducted a comprehensive survey
and presented architectures and their challenges for growing big educational data.
On the other hand, learning analytics (LA) refers to the collection, analysis, and visualization
of educational data to better understand and improve learning processes and outcomes. LA provides
interventions based on the analysis of educational data to improve both learning and the learning
environment [19]. Also, LA encompasses broader components of other disciplines such as EDM,
academic analytics, learning sciences, cognitive sciences, human factors, psychology, and so on.
Maher et al. [20] proposed a Personalized Adaptive Gamified E-learning (PAGE) model to enhance
MOOCs LA and visualization in the learning process. The PAGE model helped learners in learning
adaptation and visualization.
In this research, our goal is to investigate the impact of practical skills on academic performance
through a comprehensive analysis using real-world e-learning data. In the context of this study,
two important terms, practical skills and academic performance, are defined as follows.
Practical skills relate to reasoning, critical-thinking, problem-solving, and implementation
skills. Consider a basic programming course that consists of two kinds of learning activities:
theory-based and practice-based. The practice-based activities include programming,
programming-related assignments, and coding tests. In this research, performance in practice-based
activities is referred to as practical skills. On the other hand, academic performance refers to
theoretical knowledge, innovative ideas, and memorization. Performance in theory-based activities,
including algorithm-based assignments, theory-based assignments, and paper-based tests, is referred
to as academic performance.
To accomplish this study, a novel framework is proposed to extract students’ hidden features and
association rules. Hidden features derived from submission logs and scores carry significant
meaning. This work makes the following contributions:
• We present the correlation between practical skills and academic performance.
• We find students’ programming and academic weaknesses and strengths through empirical analysis
using submission logs and scores. Further, necessary recommendations have been provided
accordingly.
• We extract important and relevant features from the submission logs and scores that are not
clearly visible in a simple form of dataset.
• We determine that the hidden features are useful to students as well as teachers to achieve
programming and academic goals.
• The proposed framework and its data analysis process can be useful for other related academic
courses and
disciplines to discover hidden features/correlations in e-learning data. For example, this
framework can be applied to a course that consists of theory and hands-on activities and collects
resources/data, like a programming course.
The rest of the article is structured as follows. In Section II, the background and related works
are presented. Section III describes the dataset and preprocessing. In Section IV, the proposed
approach is presented. The experimental results are presented in Section V and discussed in
Section VI. Section VII concludes the study with an outlook on future work.
II. BACKGROUND AND RELATED WORKS
In this section, we briefly introduce OJ or APA systems and their applications in programming
education. In addition, supervised and unsupervised learning algorithms, association rule mining
(ARM) algorithms, educational data mining, and learning analytics are also presented.
A. ONLINE PROGRAMMING LEARNING PLATFORM
OJ or APA systems are now widely used by educational institutions as academic learning tools in
programming and other exercise-based classes. These platforms play an important role in improving
students’ programming skills, knowledge, and overall academic performance. The vast resources
(e.g., code archives, submission logs, etc.) generated by these systems can help researchers to
find students’ flaws in programming and thus expand the scope of available improvements. As a
result, numerous studies have focused on programming education, educational data mining, and
data-driven analysis using resources from OJ or APA systems.
In [7], the authors used learning log data extracted from the M2B system. A recurrent neural
network is used to predict student performance. This study showed that numerous useful hidden
features can be extracted by analyzing the M2B system’s data. Mekterović et al. [12] proposed an
APA system for conducting programming courses and created the educational software Edgar to
automatically evaluate programming assignments and other programming-related tasks. Edgar provides
a variety of services, including content writing, course administration, system monitoring, and
troubleshooting. Furthermore, Edgar produces the results of various statistics in a visual format.
APA systems provide many benefits for students as well as instructors. Meanwhile, a ranking
system [21] based on student performance and quick responses has positively impacted programming
learning. APA systems have extended the conventional use of the OJ systems for evaluating
programming assignments and their use significantly stimulates students’ interest in programming.
In [22], the authors extended the BOCA OJ system to improve its suitability for programming
classes. The resulting PROBOCA project was used to aid classroom teachers. This method identifies
problems by degree of difficulty, thus making it easier for teachers to match problems with each
student’s programming experience. Another study [23] presents a continuous programming assessment
system for programming courses using automated assessment tools (AATs). A quantitative analysis was
performed based on the relationship between the student and the AAT outcome. The submitted
solutions are analyzed in depth using an AAT and judgments (either correct or incorrect) are
provided. The experimental results showed that AATs help students to better understand computer
programming. Lu et al. [24] presented programming education via an OJ system that has increased
students’ performance curve in programming and other academic activities. Their experimental
results show that the OJ system enhanced performance levels, as well as stimulated students’
interest throughout the year- or semester-long course.
Toledo et al. [25] presented a fuzzy recommender system for OJ programming that provides
suggestions to learners regarding their upcoming problems based on their past performance in the
OJ system. That method also provided useful information to students via recommendations. In [26],
OJ programming problems were classified using two topic-modeling algorithms, latent Dirichlet
allocation and non-negative matrix factorization, in order to extract relevant features from
problem descriptions. The classification of OJ programming problems can help novice and advanced
students to pick and solve appropriate problems.
Our approach differs from that of existing research by focusing on discovering hidden features from
submission logs and scores to improve programming skills and academic performance. We also focus on
finding the correlation between practical skills and academic performance based on the extracted
hidden features. To the best of our knowledge, no study has been conducted to address this issue by
using submission logs and scores.
B. SUPERVISED AND UNSUPERVISED LEARNING ALGORITHMS
Within the context of artificial intelligence and machine learning (ML), supervised and
unsupervised learning algorithms are frequently used in real-world applications. In short, both
input data and output labels are known in supervised learning (SL) algorithms. Formally, SL involves
ML algorithms that are trained with known input data and associated output labels. Let
U = {u1, u2, u3, ..., un} be the set of input data and V = {v1, v2, v3, ..., vn} be the set of
corresponding output labels of the input data U. Thus, the output function can be written as
V = f(U), where the output V depends on the input U and f is a mapping function. After training,
the ML algorithm can predict the output label for all new input data. SL algorithms are divided
into two categories: classification and regression.
Classification is an SL approach that classifies a given set of data into classes. The
classification model predicts the target class for a given data point. After training, the model
predicts class names for data that it has not seen before. There are two types of classification
in ML: binary
classification (true or false) and multi-class classification. Typically, the evaluation of a
classification model is done by computing the precision, recall, and accuracy scores. Examples of
classification algorithms include support vector machine, decision tree, random forest, artificial
neural network, similarity learning, and k-nearest neighbor [27].
Similarly, regression is an SL approach used to predict a continuous output variable based on one
or more independent (predictor) variables. Mainly, this approach is used for forecasting, time
series modeling, prediction, and determining market trends. Examples of regression algorithms
include linear regression, logistic regression, polynomial regression, decision tree regression,
and random forest regression [27].
In contrast, unsupervised learning (USL) is a kind of ML algorithm in which models are trained with
unlabeled datasets. A USL algorithm can group the data based on their similarity features by
applying some mathematical procedures. USL algorithms have several advantages over SL, including
hidden feature extraction, useful insights, human-like learning, and the handling of unlabeled and
uncategorized data. USL algorithms are divided into two categories: clustering and association.
Clustering is a USL technique used to group data into clusters such that the data within a group
are highly similar, whereas they share minimal similarity with the data of other groups. Examples
of clustering algorithms include k-means, k-medoids, Density-based Spatial Clustering of
Applications with Noise (DBSCAN), Clustering Large Applications based on RANdomized Search
(CLARANS), and Clustering Large Applications (CLARA). Association is a USL technique that is used
to find relationships between items in a large database. This algorithm determines the set of
items that co-occur in a database. For example, if three items M, N, and O exist in the database,
the algorithm can generate patterns/rules that co-occur such as M → N, (M & N) → O, and N → M.
These patterns/rules are useful for analyzing market-basket data, educational data, and so on.
Examples of association algorithms include Apriori and frequent pattern (FP)-growth.
In [28], students have been classified by a clustering approach based on their learning behaviors.
The clustering by fast search and finding of density peaks via heat diffusion (CFSFDP-HD) algorithm
has achieved a better clustering performance than other clustering algorithms. The authors also
proposed an e-learning system architecture that detects and responds to teaching content based on
student learning capabilities. Tabanao et al. [29] proposed a method that classifies programmers
using submission log data, such as compilation profiles, error profiles, compilation frequency, and
error quotient profiles produced during an introductory programming course. This study identified
correlations between the submission log data and the midterm examination scores of students.
In our dataset, the output labels are unknown because the submission logs and class performance
scores do not contain output information that could be used for labeling (e.g., poor, good, very
good, or genius). Accordingly, as it is necessary to select an algorithm that can group students
based on their source code submission logs and class performance scores, we expected that a
clustering approach would provide the best-suited solution to group the students from unlabeled
datasets. A review identified the most commonly used and effective clustering approaches, such as
k-means, k-medoids, DBSCAN, agglomerative hierarchical cluster tree, and other variations of
k-means. The modified k-means clustering algorithm [30], which we found to be a robust, scalable,
and effective tool, is a variant of the conventional k-means clustering algorithm.
C. ASSOCIATION RULE MINING ALGORITHMS
An ARM algorithm is a USL algorithm used for data mining in big data. ARM was first proposed by
Agrawal [31] and has since been used in many fields, such as educational data analysis, medical
data analysis, market-basket analysis, and census data. Usually, ARM aims to find a set of
co-occurring high-frequency items and extract the correlation among items from a large dataset.
Although the Apriori algorithm [31] is often used for data mining, many enhancements based on
Apriori have been proposed to improve performance and scalability, such as the sampling
approach [32], hashing technique [33], dynamic counting [34], partitioning technique [35], and
incremental mining [36]. Prior studies showed that the Apriori algorithm achieved significant
results, but some studies also reported worse results caused by the generation of a large number of
candidate item sets, additional scans, etc.
Subsequently, a new algorithm called FP-growth was proposed that does not rely on candidate item
set generation [37]. This method used a partitioning-based divide-and-conquer approach. Previous
studies have shown that it significantly reduced the search space and time compared to
Apriori [38]. Similarly, many extensions have been added to the FP-growth algorithm to improve
efficiency. Some examples of enhanced FP-growth algorithms are h-mine [39], depth-first
mining [40], pattern-growth mining in both directions (bottom-up and top-down), and tree
structures [41], [42]. In contrast, Zaki [43] proposed the Equivalence CLASS Transformation (Eclat)
algorithm for ARM applied to vertical data. Eclat uses the same candidate-generation process as
Apriori. In brief, the Apriori, FP-growth, and Eclat ARM algorithms are the most frequently used in
many applications, and also serve as the foundation of many other ARM algorithms. In this research,
an FP-growth algorithm is used.
D. EDUCATIONAL DATA MINING AND LEARNING ANALYTICS
EDM is the same as traditional data mining, except that it is applied to educational fields. EDM is
used to extract hidden knowledge and discover patterns from the data in different educational
learning platforms [44]. In the study [44], various data mining techniques including clustering,
classification, and ARM are exploited to discover useful information from the educational data.
They used EDM tools (RapidMiner and Weka) to analyze data from Moodle in a programming
course. Fernandes et al. [45] presented a predictive analysis
of students’ academic performance. The Gradient Boosting
Machine (GBM) classification model was applied to predict
students’ academic performance at the end of the school year.
In another study [46], a semi-supervised learning algorithm
was used to predict the students’ performance in the final
exams. In the study [47], a survey of EDM and its future
directions is presented. It also discusses some recent trends
in the field of EDM research.
TABLE 1. Topic-wise problem list of ALDS1 course.
LA has become an important research topic in the field of
educational technology. This involves understanding and ana-
lyzing real-world educational data to provide useful support
for improving learning and teaching. Tran et al. [48] used
LA for a learning management system (LMS). The exper-
imental results showed that LA plays an important role in
improving productivity, learning, and support for LMS users.
Ang et al. [18] discussed LA from five different perspec-
tives: learning and assessment analysis, personalized learn-
ing, behavior learning, collaborative and interactive learning,
and social network analytics. Current LA trends and practices
to improve teaching and learning in education are presented
in [49], [50].
In addition, numerous studies have been conducted using
the resources of OJ or APA systems. These systems are
actively used for education, e-learning, computing, pro-
gramming competitions, and software engineering. The
importance of empirical data-driven analysis to make criti-
cal decisions, and even to change algorithm configurations
automatically, is growing [51]. However, this data-driven
analytical research differs from previous studies in that a real-
world dataset has been used. The analytical findings of this
research are beneficial to assist students in improving their
academic and practical performance, as well as for educa-
tional planning.
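To make the rule notation used in Section II-B concrete (M → N, (M & N) → O, N → M), the following
minimal Python sketch computes two standard ARM measures, support and confidence, for those rules.
The toy transaction list and the function names are hypothetical and chosen only for illustration;
they are not taken from the dataset or the implementation used in this study.

```python
# Illustrative only: toy transactions over the items M, N, O (hypothetical data).
transactions = [
    {"M", "N", "O"},
    {"M", "N"},
    {"M", "O"},
    {"N", "O"},
    {"M", "N", "O"},
]


def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of (antecedent and consequent) divided by support of antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# The three example rules from the text.
rules = [({"M"}, {"N"}), ({"M", "N"}, {"O"}), ({"N"}, {"M"})]
for lhs, rhs in rules:
    print(
        sorted(lhs), "->", sorted(rhs),
        "support=%.2f" % support(transactions, lhs | rhs),
        "confidence=%.2f" % confidence(transactions, lhs, rhs),
    )
```

Algorithms such as Apriori and FP-growth differ in how they enumerate the frequent item sets, not in
how these two measures are defined; the brute-force counting above is only practical for very small
examples.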
three or four problems, which we call problems A, B, C, and D.
The logs are generated by the AOJ system based on the solution codes submitted by the students over
two semesters; the submission logs number approximately 69,000. Each solution log has a set of
information, such as the judge id (jid), user id (uid), problem id (pid), language (C, C++, Python,
etc.), accuracy, verdict (accepted, wrong answer, compile error, etc.), CPU time, memory usage,
code size, submission date, and judge date. Let UID be a set of users (students), i.e.,
UID = {uid_1, uid_2, ..., uid_n}, n ≥ 1; JID is a set of judge IDs
JID = {jid_1, jid_2, ..., jid_m}, m ≥ 1; Prob is a set of problems
Prob = {prob_1, prob_2, ..., prob_j}, j ≥ 1, where prob_1, prob_2, ..., prob_j are unique problems;
judge verdicts are Verd = {AC, WA, CE, RTE, MLE, TLE, OLE, PrE}, where AC = Accepted,
WA = Wrong Answer, CE = Compile Error, RTE = Run Time Error, MLE = Memory Limit Exceeded,
TLE = Time Limit Exceeded, OLE = Output Limit Exceeded, and PrE = Presentation Error; and the
programming languages are Lang = {C, C++, C++11, Ruby, Python 2, Python 3, Java, Haskell, C#, PHP,
Rust, ...}. A corresponding submission log is created immediately after submitting a solution code
to the AOJ system. Thus, a sample output log of the AOJ system can be written as
Ologs = {u_r, j_s, p_t, v_u, l_v, ct, mu, cs, sd, jd}, where u_r ∈ UID, j_s ∈ JID, p_t ∈ Prob,
v_u ∈ Verd, l_v ∈ Lang, ct = CPU time, mu = Memory usage, cs = Code size, sd = Submission date,
and jd = Judge date. Some sample logs generated by the AOJ system are listed in Table 2.
C. CLASS PERFORMANCE SCORES
In addition to the submission logs, we collected various test (exam) scores for the ALDS1 course
from 357 students in two different years at the University of Aizu, Japan. Usually, most students
take the ALDS1 course as part of their regular study. This course consists of various tests, such
as algorithm assignment (AA), programming assignment (PA), code validation (CVal), coding test
(CoT), and paper-based test (PT). Note that PA and CVal are calculated based on the student’s
program submission to the AOJ system. To check the plagiarism/similarity/duplication of submitted
solution codes, a plagiarism checking software (PCS) has been developed and integrated into the
management system of AOJ. The PCS checks solution codes submitted by the students against the
existing source codes in the AOJ. The PCS generates a CVal score for submitted codes based on the
degree of similarity, and the codes are collected from a specific time period and users. A score of
1 means that there is no copying/duplication, 0.5 means that a few codes are copied from others,
and 0 means that a number of codes are copied from others. In addition, CVal is used to justify the
scores of PA. Some sample data distributions of student evaluations are listed in Table 3.
TABLE 3. Sample data distribution of student evaluations.
Definition 1: The CVal score refers to the degree/level of program code plagiarism.
Example 1: If a user u15 copies/replicates programs from others, then u15 receives a CVal score
of 0.5; if user u15 copies/replicates the code from others with malicious intent, CVal is 0.
Each exercise class is divided into two parts. First, students are asked to submit an AA, which
consists of a few questions and is also considered student attendance (AT). Second, three or four
problems are given to the students as a PA. The students are then encouraged to submit their
solutions through the AOJ system. Students are allowed to consult with each other, teachers, and
teaching assistants to solve problems during PA. In contrast, CoT is conducted in exercise rooms
with a separate workstation for each student, providing a process by which each student’s actual
programming capabilities can be verified. Note that it is strictly forbidden for a student to
consult with other students during the CoT. Similarly, the PT is a closed-book test that is given
to check the true level of each student’s theoretical understanding. The test score distribution
can be expressed as Tscore = {UID, AT, AA, PA, CVal, PT, CoT, T, Prac}, where AT ∈ N and
0 ≤ AT ≤ 13, AA ∈ N and 0 ≤ AA ≤ 100, PA ∈ N and 0 ≤ PA ≤ 120, CVal ∈ R and 0 ≤ CVal ≤ 1,
PT ∈ N and 0 ≤ PT ≤ 120, CoT ∈ N and 0 ≤ CoT ≤ 120, T ∈ N and 0 ≤ T ≤ 120, Prac ∈ N and
0 ≤ Prac ≤ 120. To better evaluate the students of the ALDS1 course
by considering the importance of theoretical and practical knowledge, equations (1), (2), and (3)
are developed for the Theory (T), Practical (Prac), and Final Score (FS) calculations,
respectively, based on the different test scores. Note that the equations for the ALDS1 course are
approved by the course coordinator.
T = \sqrt{AA \times PT}    (1)
Prac = \sqrt{PA \times CoT} \times CVal    (2)
FS = \min\left(100, \left\lfloor \frac{AT + 1}{10} \right\rfloor \times \sqrt{T \times Prac}\right)    (3)
For explanation, the student evaluation process is compared using the following two scenarios:
(i) the conventional case and (ii) the proposed case (based on the equations).
1) CONVENTIONAL CASE
In this case, the final results are usually generated by averaging Prac and T scores. For example,
if student s1 gets 10 points on the Prac test and 90 points on the T test, the final result of s1
using the conventional method is (10 + 90)/2 = 50.
2) PROPOSED CASE
In this case, Prac and T scores are given equal priority to generate final results, so equations
(1)–(3) are introduced to emphasize both the Prac and T scores. Let us assume that student s1 gets
10 points on the Prac test and 90 points on the T test; the final result of s1 using our equations
will be \sqrt{10 \times 90} = 30. We observed that the proposed evaluation method considers both
the Prac and T scores, whereas the conventional method does not balance the Prac and T scores when
calculating the final result.
For statistical feature extraction and ARM, Tables 2 and 3 are joined (Ologs ⋈ Tscores) to produce
the operational data, as shown in Table 4. In addition to the existing attributes, a new attribute
(Accuracy) has been added to the operational data.
Definition 2: The number of accepted solutions out of total submissions is called the solution
Accuracy of users.
Accuracy(Accu) = \frac{\sum_{i=1}^{pn} TAS_i}{\sum_{i=1}^{pn} TS_i}    (4)
where pn = number of problems, TAS = total accepted solutions, and TS = total submissions.
Example 2: Let u5 be a user who has submitted a total of 39 solutions to the AOJ system, of which a
total of 28 have been accepted. Then, the solution accuracy of the user u5 is (28/39) ≈ 71.8%,
according to (4).
Another important term is trial and error (T&E). We use the T&E method to estimate a programmer’s
ability to solve problems. In this study, the following definition is adopted for the T&E method.
Definition 3: A number of repeated attempts are taken until a problem is successfully solved; this
process is called trial and error (T&E).
T&E = \frac{\sum_{j=1}^{pn} TS_j}{\sum_{j=1}^{pn} TAS_j}    (5)
where pn = number of problems, TS = total submissions, and TAS = total accepted solutions.
Example 3: Suppose that u10 is a user who has received a total of 25 accepted (AC) verdicts from the
AOJ for 5 problems, but has taken a total of 129 attempts (T&E) to achieve it. Then, the average
T&E of user u10 for each solved problem is (129/25) = 5.16, according to (5).
IV. APPROACH
Figure 1 shows the framework of our proposed educational data-driven approach. We applied the
framework to a real-world dataset to extract the hidden features and association rules of students
to explore the importance of practical skills. Experimental data are collected from the AOJ system
and the ALDS1 programming course. The proposed approach consists of four main steps: (i) data
collection and preprocessing, (ii) data clustering, (iii) statistical hidden feature extraction
from clusters, and (iv) association rule mining from clusters. A modified k-means clustering
algorithm is applied for data clustering, where the elbow method is used to select the optimal k
value for the k-means. Furthermore, the FP-growth ARM algorithm is leveraged to extract the
association rules from each cluster. The methods and algorithms used for the proposed approach are
discussed below.
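As a rough illustration of the clustering step of this pipeline, the following Python sketch
selects initial centroids by sorting the points by their distance from the origin and splitting
them into k groups (in the spirit of the initial centroid selection described in Section IV-B),
runs plain k-means, and reports the SSE for several values of k, as the elbow method in
Section IV-A requires. The toy data, the function names, the equal-size split, and the choice to
take the mean over the points of each group are assumptions made for illustration; outlier
detection is omitted, and this is not the implementation used in this study.

```python
import math

# Hypothetical 2-D toy data: three well-separated groups of three points each.
data = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8), (15, 1), (15, 2), (16, 1)]


def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def initial_centroids(points, k):
    """Sort by distance from the origin, split into k groups, and take from
    each group the data point closest to that group's mean (assumes k <= n)."""
    origin = (0.0,) * len(points[0])
    ordered = sorted(points, key=lambda p: dist(p, origin))
    n = len(ordered)
    groups = [ordered[i * n // k:(i + 1) * n // k] for i in range(k)]
    return [min(g, key=lambda p: dist(p, mean(g))) for g in groups]


def kmeans(points, k, max_iter=100):
    """Plain k-means started from the deterministic centroids above."""
    centroids = initial_centroids(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster happens to be empty.
        updated = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


def sse(centroids, clusters):
    """Sum of squared distances of each point to its cluster center."""
    return sum(dist(m, x) ** 2 for m, c in zip(centroids, clusters) for x in c)


# SSE for k = 1..5; the elbow is read off where the decrease levels off.
sse_by_k = {k: sse(*kmeans(data, k)) for k in range(1, 6)}
for k, value in sse_by_k.items():
    print(k, round(value, 3))
```

On this toy data the SSE falls sharply up to k = 3 and changes little afterwards, which is the
pattern the elbow method looks for; real submission-log features would first need to be scaled to
comparable ranges.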
TABLE 4. Sample operational data distributions by joining submission logs (Table 2) and evaluation scores (Table 3).
A. ELBOW METHOD
The elbow method is a proven technique to determine the optimal number of clusters k for the k-means algorithm. It uses the sum of squared errors (SSE) of each cluster to calculate the optimal number of clusters. The SSE is calculated by equation (6).

$$SSE = \sum_{i=1}^{k} \sum_{x \in C_i} dist^2(m_i, x) \qquad (6)$$

where k is the number of clusters, x is a data point in cluster $C_i$, and $m_i$ is the center of cluster $C_i$.

The elbow method reduces unnecessary clustering in the dataset, where a small SSE value indicates a better cluster. Normally, increasing the value of k automatically decreases the SSE value. The point at which the SSE value decreases drastically is taken as the ideal number (k) of clusters for k-means. The elbow method was applied to our dataset to obtain the optimal k value (k = 4) for the k-means clustering algorithm, as shown in Figure 2.

FIGURE 2. Elbow method for optimal k value selection.

B. MODIFIED K-MEANS CLUSTERING ALGORITHM
Usually, the k-means clustering algorithm randomly chooses the initial centroid, so it is possible to select an irrelevant data point as the initial centroid. In addition, conventional k-means algorithms cannot detect and remove outliers from the dataset. Consequently, the results may have a negative impact on the overall clustering process and results. To address these problems, the modified k-means clustering algorithm [30] integrates two important modules, (i) optimal initial centroid selection and (ii) outlier detection and removal. To the best of our knowledge, this is a unique modification of the k-means clustering algorithm, and these two modules make the algorithm more efficient, robust, and scalable. This algorithm takes approximately 17.33% fewer iterations to construct clusters for our dataset than other random initial centroid-selection algorithms. The first module is the initial centroid selection module (ICSM), which is used to (i) select optimal centroids and (ii) build clusters with the most similar data. The pseudocode of ICSM is provided in Algorithm 1.

Algorithm 1 Initial Centroid Selection Module (ICSM)
Define: Distance: D, Origin: O(0, 0), Cluster Number: K
Input: Dataset: X = {x1, x2, x3, ..., xn}
Output: Optimal initial centroids Cn = []
for xj ∈ X do
    D ← distance(xj, O)
end
for di ∈ D do
    Apply sorting on D
    D ← d1, d2, d3, ..., dn
end
if K ≤ |X| then
    Divide sorted data D into K subsets
    s1 ⊆ D, s2 ⊆ D, s3 ⊆ D, s4 ⊆ D, ..., sk ⊆ D
end
while k ≤ K do
    Calculate mean value of each subset
    $M_k = \frac{\sum_{x \in S_k} x}{|S_k|}$
    for xj ∈ Sk do
        Cn ← mindistance(Mk, xj ∈ Sk)
    end
end

The second module is the outlier detection module (ODM), which is used to (i) detect outliers (irrelevant/insignificant data points), (ii) remove them from the datasets, and (iii) improve the overall cluster quality. The pseudocode of the ODM is presented in Algorithm 2.

C. FP-GROWTH ALGORITHM
In the field of data mining, the Apriori, Eclat, and FP-growth algorithms are the most commonly used [58]. The FP-growth algorithm is much more efficient and faster than Apriori because the Apriori algorithm repeatedly scans the database, whereas the FP-growth algorithm only scans twice to complete the process. The FP-growth algorithm basically consists of two (2) main steps [37], namely (i) construction of the FP-tree and (ii) FP mining based on the FP-tree. Let L = {l1, l2, l3, l4, ..., ld} be the set of all items in the database. The databases are built based on a set of tuples/transactions
To visualize the data distribution of each cluster, we applied principal component analysis (PCA) technique to multidimensional clustered data to convert it into a two-dimensional (2D) shape. For this reason, the first two components (PCA 1 and PCA 2) of the PCA that explain the majority of the variance in the data are used for the 2D visualization. The visualized clusters are shown in Figure 3.

Algorithm 2 Outlier Detection Module (ODM)
Define: Number of Cluster: K, Cluster: C
Input: Dataset: X = {x1, x2, x3, ..., xn}, Distance: D
Output: Outliers: O, SSE
Run ICSM and Calculate min-max average (MMA) using sorted distance d ∈ D
$MMA = \frac{d_{min} + d_{max}}{2}$
while k ≤ K do
    for $x_i \in X_{C_k}$ do
        if distance($x_i$, center$_{C_k}$) > MMA then
            Remove $x_i$ from the cluster $C_k$
            O ← $x_i$
            Recalculate SSE
        end
    end
end
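Algorithms 1 and 2, together with the SSE of equation (6), can be sketched in a few lines of Python. This is an illustrative reading of the pseudocode, not the authors' implementation, and it rests on several assumptions: each initial centroid is the data point nearest to its subset mean, a single nearest-centroid assignment pass stands in for the full iterative k-means loop, and the MMA is computed from the origin distances D produced by ICSM (the pseudocode does not pin this down). All function names and sample points are hypothetical.

```python
import math

def icsm(points, k):
    """Algorithm 1 sketch: choose k initial centroids from sorted origin distances."""
    origin = (0.0,) * len(points[0])
    ordered = sorted(points, key=lambda p: math.dist(p, origin))
    n = len(ordered)
    # Split the sorted data into k contiguous, non-empty subsets (needs k <= n).
    subsets = [ordered[i * n // k:(i + 1) * n // k] for i in range(k)]
    centroids = []
    for subset in subsets:
        mean = tuple(sum(c) / len(subset) for c in zip(*subset))
        # The data point closest to the subset mean becomes the centroid.
        centroids.append(min(subset, key=lambda p: math.dist(p, mean)))
    return centroids

def assign(points, centroids):
    """One assignment pass: each point joins its nearest centroid's cluster."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters

def sse(clusters, centroids):
    """Equation (6): squared distance of every point to its cluster center."""
    return sum(math.dist(c, p) ** 2
               for pts, c in zip(clusters, centroids) for p in pts)

def odm(points, clusters, centroids):
    """Algorithm 2 sketch: remove points farther from their center than the MMA."""
    origin = (0.0,) * len(points[0])
    d = [math.dist(p, origin) for p in points]  # assumed meaning of D
    mma = (min(d) + max(d)) / 2
    kept, outliers = [], []
    for pts, c in zip(clusters, centroids):
        kept.append([p for p in pts if math.dist(p, c) <= mma])
        outliers.extend(p for p in pts if math.dist(p, c) > mma)
    return kept, outliers, sse(kept, centroids)

# Hypothetical 2D data: two tight groups plus one distant point.
points = [(1, 1), (1, 2), (2, 1), (2, 2),
          (8, 8), (8, 9), (9, 8), (9, 9), (1, 20)]
centroids = icsm(points, 2)
clusters = assign(points, centroids)
kept, outliers, total_sse = odm(points, clusters, centroids)
print(centroids, outliers, round(total_sse, 6))
```

On this toy input the distant point (1, 20) is the only one flagged as an outlier. Because the centroids come from sorted data rather than random draws, repeated runs give the same starting point, which is the property the ICSM relies on.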
TABLE 6. Overview of the judge verdicts of ALDS1 course.

can be drawn: (i) the students of clusters P and Q have more T&E as well as higher solution accuracy, and (ii) both the solution accuracy and T&E of cluster R are lower than those of other clusters. Note that the solution accuracy and T&E are calculated by equations (4) and (5), respectively.
We found more interesting features from the clusters. For example, students solved numerous additional problems beyond their regular exercise assignments through the AOJ platform, solely for their own interests and amusement. The cluster-wise extra problem solution statistics are listed in Table 10. The following observations can be drawn from Table 10: (i) the students of cluster P solved a huge number of problems beyond their regular exercise assignments, which clearly indicates their enthusiasm for programming, and (ii) the students in other clusters (Q, R, and S) did not solve a significant number of extra problems.

TABLE 10. Statistics of extra problem solutions.

during the first few days of the submission period compared to clusters P and Q, and (iii) more students from clusters R and S submitted their assignments on the last day (8th) of the submission deadline than students from clusters P and Q.

Sometimes students submitted their solutions after the deadlines. The topic-wise accepted (AC) solution rate and average accepted (AC) rate for all clusters are calculated and listed in Table 11. Moreover, a visual comparison between all topics for all clusters is presented in Figure 7. A few observations can be made: (i) the students of cluster P received the highest acceptance rate across all their assignment submissions and (ii) the students of cluster R obtained the lowest acceptance rate compared to those in clusters P, Q, and S.
clusters R and S submitted most solutions on the last day and (ii) students of cluster P submitted the fewest solutions on the last day.

TABLE 13. Repetition tendency of accepted problems.
TABLE 18. Association rules for the students of cluster S.

follows: W1 = {29, 60, 70, 81, 90, 100}, W2 = {17, 61, 71, 80, 92, 102}, and W3 = {32, 58, 70, 81, 91, 101}.
Interesting and relevant association rules are obtained
from each cluster by setting the optimal minimum support
(minSup) and confidence (minConf ) threshold values. For
cluster P, we set minSup = 1500 and minConf = 90%.
Consequently, the frequent rules shown in Table 15 are
obtained.
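To make the roles of the two thresholds concrete, the sketch below filters rules by an absolute support count (minSup) and a confidence ratio (minConf). It deliberately counts itemsets by brute force instead of building an FP-tree, so it is not FP-growth itself; any correct frequent-itemset miner returns the same itemsets for the same thresholds. The item labels, transactions, and threshold values are hypothetical and far smaller than the minSup = 1500 used for cluster P.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_sup):
    """Count every itemset by brute force; keep those with support >= min_sup."""
    counts = {}
    for t in transactions:
        items = sorted(set(t))
        for r in range(1, len(items) + 1):
            for combo in combinations(items, r):
                counts[combo] = counts.get(combo, 0) + 1
    return {s: c for s, c in counts.items() if c >= min_sup}

def association_rules(frequent, min_conf):
    """Emit rules lhs -> rhs with confidence = support(lhs + rhs) / support(lhs)."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                # Every subset of a frequent itemset is itself frequent,
                # so the lookup below cannot fail.
                conf = sup / frequent[lhs]
                if conf >= min_conf:
                    rhs = tuple(i for i in itemset if i not in lhs)
                    rules.append((lhs, rhs, sup, conf))
    return rules

# Hypothetical discretized student records (one transaction per record).
transactions = [
    ("AC=high", "CoT=high", "early"),
    ("AC=high", "CoT=high", "early"),
    ("AC=high", "CoT=high"),
    ("AC=low", "CoT=low", "late"),
    ("AC=high", "CoT=low", "early"),
]
frequent = frequent_itemsets(transactions, min_sup=3)
rules = association_rules(frequent, min_conf=0.9)
for lhs, rhs, sup, conf in rules:
    print(lhs, "->", rhs, "support:", sup, "confidence:", conf)
```

Raising minSup shrinks the set of candidate itemsets, while raising minConf discards rules whose antecedent often occurs without the consequent, which is why the number of extracted rules depends on both values.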
lower tendency to submit solutions on the last day than students of clusters R and S; instead, they submitted their assignments early (Table 12). The students of cluster Q obtained higher scores in PA and PT than in CoT (Table 9), as shown by the association rules (Table 16). Most of the features involved high values, indicating that they put great effort into programming. However, fewer attempts to solve additional problems likely affected the CoT scores in this cluster.

Students of cluster R (i) took an average of 9.52 attempts/trials to solve problems, (ii) did not solve many additional problems outside of regular academic assignments (Table 10), (iii) submitted their assignments on the deadline or the day before (Figure 6), and (iv) rarely repeated the AC problems more than once (Table 13). These features are related to their various programming activities and indicate that they did not put much effort into programming. These students also received the highest rate of error verdicts (WA, CE, RTE, TLE, etc.), 67.30%, and the lowest AC rate, 32.70%, compared to clusters P, Q, and S. Their topic-wise AC rate is not consistent across all thirteen topics. Students in cluster R obtained good scores in PA (60.92) and PT (68.26), but lower scores in CoT (28.46) (Table 9). The association rules showed that these students were associated with lower scores and infrequent AC verdicts. Note that the coding test (CoT) is used to verify the students’ core programming skills. Thus, less effort in programming negatively affects this CoT score.

Students of cluster S (i) undertook an average of 9.86 attempts/trials to solve problems (Table 8), (ii) solved no additional problems (Table 10), (iii) had the highest rate of last-day submission for each assignment, with an average of 22.22% of solutions submitted on the last day (Figure 6 and Table 12) compared to clusters P, Q, and R, and (iv) obtained a very low CVal score of 0.44. Furthermore, an insignificant number of students attempted to repeat the AC problems more than once (Table 13). Collectively, these features indicate that students in cluster S did not perform well in programming. Most features are negatively prioritized. Consequently, students in this cluster obtained the lowest scores in CoT (16.55), which is alarming for actual coding performance. Most of the association rules are connected with lower CoT scores (Table 18). In addition, we found an interesting correlation: most of the students submitted their solutions on the last day, but achieved higher AC rates and accuracy. This trend differs from that of clusters P and Q.

VI. DISCUSSION
In this study, many hidden features are obtained by employing the proposed framework, where modified k-means is applied for data clustering and then FP-growth is applied to the clustered data to discover the association rules. Interesting features and behaviors are observed that are not readily apparent in the base dataset. After applying the elbow and k-means algorithms to the dataset, four (04) clusters are found. Different features and rules are extracted from each cluster considering the different conditions presented in the experimental results section. Next, we discuss the features and the resulting explanations, recommendations, assessments, practical applications, and limitations.

TABLE 19. Main features.

For a better understanding, ten main features are listed in Table 19 with three indicator values: higher (H), medium (M), and lower (L). Feature c, which indicates when the assignments are submitted within the allotted time, uses indicator values of early (E), mid-time (M), and delay (D).

A. ANALYSIS AND RECOMMENDATIONS
In the summary graph of the main features shown in Figure 9, we observe that the students of cluster P performed extraordinarily well in different programming activities and academic tests. Importantly, most students in this cluster are highly enthusiastic about programming, with more than 62% of total solutions (Figure 6) submitted on the very first day of all assignments. They also solved an average of 56.14 extra problems in addition to their academic assignments. The tendency to submit solutions on the last day is approximately 2.90%, which is the lowest compared to clusters Q, R, and S (Table 12). For solution optimization, a large number of students repeated their AC solutions (Table 13). In addition, these students achieved higher AC rates of 46.24% for problems A and B (Table 11), accuracy of 55.71% (Table 8), and scores on various tests of 81.22%, 65.46%, and 82.33% for PA, CoT, and PT, respectively, than those of clusters Q, R, and S (Table 9), as reflected by the association rules (Table 15). In contrast, analysis of the error verdicts of this cluster shows approximately 45% of errors due to WA, 19% due to resource limitations (TLE, MLE, OLE), and 15% due to RTE (Figure 4).

Note that, to develop students’ programming skills and ensure the efficiency of the source codes, several constraints are set for problems, such as input and output limits/numbers and space and time complexity. In this case, a solution code must satisfy the set of constraints to be accepted; otherwise, it receives error verdicts such as TLE and MLE. Figure 4 shows that students in clusters Q, R, and S received 10.87%, 9.32%, and 6.25% errors due to TLE and MLE, respectively. In contrast, the students of cluster P received about 19.28% errors due to TLE and MLE, which is the highest compared to clusters Q, R, and S. However, students in cluster P took about 12.40 attempts (T&E) to solve a problem, which is
higher than students in clusters Q, R, and S (Table 8). In general, problems C and D are comparatively more difficult and contain tougher constraints than problems A and B. Students in cluster P submitted the highest percentage of solutions, about 43.36%, for problems C and D compared to clusters Q, R, and S (Table 7). Moreover, each student in cluster P solved an average of 56.14 additional problems, which is significantly higher than students in clusters Q, R, and S (Table 10). Students in cluster P attempted many additional and challenging problems, resulting in a high percentage of TLE and MLE errors. Usually, complex algorithm-based problems contain various tough constraints, and sometimes it is very difficult to deal with these kinds of constraints alone without prerequisite knowledge. Our analysis shows that students in cluster P have a high tendency to take on difficult problems independently (Table 7), and have achieved significant success in solving problems with tough constraints (Table 6). Besides, students in this cluster still have the opportunity to further improve their programming skills in dealing with tough constraint-based problems. Based on the overall empirical and analytical results, we can summarize that the students of cluster P are highly skilled and enthusiastic about programming and perform well on academic tests.

Similarly, students in cluster Q achieved higher values in most features, as shown in Figure 9. More than 40% of their assignments were submitted on the very first day (Figure 6), with higher accuracy, AC rate, repetition tendency, and scores in PA and PT. The error verdicts of this cluster have been analyzed: approximately 45% of the errors occur due to WA, 19% due to CE, 11% due to PrE, 11% due to resource limitation (TLE, MLE, OLE), and 14% due to RTE (Figure 4). These students can be understood by analyzing the reasons for each type of error. In addition to these positive features, we found some flaws. The students in cluster Q achieved medium (M) scores in CoT and did not solve a significant number of problems outside of their regular assignments. Usually, CoT is used to verify actual programming ability; a medium score in CoT means students need to pay more attention to programming. Accordingly, to improve programming skills, students can practice more outside of their academic workload.

Considering all the results and analysis, we determined the following recommendations for clusters P and Q: (i) special attention to these students can further improve their skills and knowledge; (ii) more difficult problems can be assigned to these students because they find general assignments very easy; and (iii) they can be involved in real-world problem-solving tasks.

For the students of cluster R, the rate of last-day submission was approximately 20.06% (Figure 6), which indicates a tendency to delay submission, and they show inconsistent acceptance (AC) for all topics (Figure 7). Moreover, this cluster had the lowest acceptance (AC) and accuracy rates among all clusters; very few students repeated their accepted solutions for optimization or solved extra problems. Students of this cluster obtained the highest error rate (67.30%) among all the clusters. The extracted association rules show that most of these students are associated with lower CoT scores, accuracy, and AC rate, and with WA verdicts. The students in this cluster scored much higher in PA (60.92) than CoT (28.46). During PA, students can consult with others to solve problems. This may allow some students to solve problems with the help of other students without understanding the problems properly. In contrast, students are not allowed to consult/talk with others during CoT, in which students of this cluster rarely obtain good scores. The average CoT score is 28.46 out of 120. Considering all the features, it is concluded that (i) students may solve assignments with the help of
others without understanding the problems, and that this cluster (ii) lacks actual programming skills and (iii) puts less effort into programming.

Figure 9 shows that students of cluster S achieved lower values in most features. Their last-day submission rate is 22.22%, which is the highest among all clusters. They achieved lower scores in coding and paper-based examinations (CoT and PT), but obtained relatively high scores in PA. Similarly, they had fewer T&E attempts but achieved higher accuracy and AC rate. Most of the association rules also involve lower scores (CoT and PT) and accuracy. The following observations can be obtained from the extracted features: (i) while a large number of solutions were submitted on the last day, there may have been some students who waited for other solutions to become available; this is justified by the CVal score. (ii) There is an unusual trend where students obtained lower scores in CoT while achieving higher AC rate, accuracy, and PA scores; this suggests (iii) a lack of actual programming skills and (iv) less effort in programming, and that (v) the students may solve their assignments through collaboration with others.

After analyzing the features and association rules from different perspectives, some deficiencies have been identified in the programming and academic fields for the students of clusters R and S. Accordingly, we provide some recommendations that may help improve students’ programming skills and academic performance: (i) special assistance can be provided in the development of algorithms and mathematical logic; (ii) students can be encouraged to solve problems with their own knowledge and understanding; (iii) students can participate in different programming activities, such as competitions, programming lectures, and workshops; and (iv) teachers should give these students additional attention and support in theory and exercise classes and observe their responses.

B. OVERALL ASSESSMENTS AND PRACTICAL APPLICATIONS
Considering all the empirical results and analysis, we see that the students of cluster P obtained the highest acceptance rate of 40.88% for all problems (A, B, C, and D) (Table 6), an average solution accuracy of 55.71% (Table 8), an average of 56.14 additional problems solved (Table 10), a strong propensity to submit assignments early (Figure 6), a topic-wise accepted solution rate of 46.24% for problems A and B (Table 11), the lowest submission rate on the last day of 2.90% (Table 12), the highest number of repetitions (Table 13), and the highest PA, CoT, and PT scores of 81.22%, 65.46%, and 82.33%, respectively (Table 9). All the features indicate that the students of cluster P invested great effort in programming-related tasks. In addition, the summary graph (Figure 9) of the features shows that the students of cluster P are associated with better indicators in all the features. Similarly, students in cluster Q received an acceptance rate of 36.34% for all problems (A, B, C, and D) (Table 6), solution accuracy of 48.45% (Table 8), a high tendency to submit assignments early (Figure 6), a topic-wise acceptance rate of 40.06% for problems A and B (Table 11), a last-day submission rate of 10.28% (Table 12), and PA, CoT, and PT scores of 67.86%, 40.18%, and 70.43%, respectively (Table 9). As shown in Figure 9, most of the features are associated with good indicators. It can be seen that the students of cluster Q also performed well in programming.

On the other hand, students in cluster S did not solve any additional problems (Table 10), had a lower repetition tendency (Table 13), had a higher last-day submission rate of 22.22% compared to clusters P, Q, and R (Table 12 and Figure 6), and received the lowest CVal score of 0.44 compared to clusters P, Q, and R. Besides, students scored 54.33%, 13.79%, and 44.83% on PA, CoT, and PT, respectively, which is very poor compared to clusters P, Q, and R (Table 9). The overall results show that the students in this cluster did not perform well in programming. In addition, the summary graph (Figure 9) of the features and the association rules (Table 18) show that they were associated with lower indicators in most features.

From the above results, it can be seen that the students of clusters P and Q made a good effort in programming and obtained good results in various tests, while the students of cluster S made less effort and therefore achieved poor results in various tests. So, we can conclude that if students (especially in ICT-related disciplines) perform well in practical applications (e.g., programming, logical implementation), then they are also likely to perform well in different academic activities, including tests. In addition, the current research provided some recommendations for students based on the identified features and flaws. Teachers, instructors, and faculty advisors can use these analytical results and recommendations to improve students’ programming and academic performance levels. In addition, our proposed framework, experiments, and overall analytical results can be applied to other related courses/disciplines.

The ultimate goal of this research is to support and improve student learning by identifying their weaknesses and strengths. For this purpose, a real-world dataset from a programming course was used. The proposed framework included EDM techniques and LA to find invisible knowledge from the e-learning data. The knowledge was then analyzed and visualized from various perspectives. The results of these analyses highlight the weaknesses and strengths of the students and can help improve their learning. The proposed research can be suitable for practical applications for the following reasons: (i) the proposed research can provide a useful direction, that is, how to deal with e-learning data; (ii) e-learning data processing has always been a challenging task, and the proposed research, having already processed OJ (e-learning) data for EDM and LA, shows a way of handling real-world e-learning data; (iii) the process of data analysis and its results can be helpful for other related courses to improve students’ learning; and (iv) the proposed framework can be integrated with existing e-learning platforms for EDM and LA purposes.
C. LIMITATIONS
The proposed framework is leveraged for data clustering, and then the hidden features and association rules are extracted from each cluster. The results are generated based on a dataset comprising submission logs and scores collected from the AOJ system; they may vary for other datasets due to noise or irrelevant data. The number of association rules may vary depending on the threshold values of minSup and minConf. The value of k for the modified k-means clustering algorithm may differ based on the dataset. Therefore, the proposed framework can produce better or worse results for other datasets.

VII. CONCLUSION AND FUTURE WORK
In this research, a novel framework for exploring the effects of practical skills on academic performance was proposed. Subsequently, a programming course was selected as a sample course for experiments and analyses. By employing the framework, many meaningful and significant features were extracted from the dataset. The extracted features are deeply correlated with the students’ behavior. The analytical results showed that better practical (e.g., programming) skills have a positive effect on academic performance. Moreover, the interaction and interdependence between practical skills and academic performance are presented based on the experimental results. Thus, we have concluded that if a student of an ICT or engineering discipline performs well in practical assignments (e.g., programming, logical implementation, PL/SQL, etc.), then they are likely to perform well in other academic activities. The overall approach of this research is applicable to other fields such as education, educational data mining, data analytics, and behavior analysis. In future work, we will consider an automated recommender system that can guide students to improve their practical skills. Moreover, other types of datasets in addition to programming logs and scores will be included.

AVAILABILITY OF DATA AND MATERIALS
In the present research, all the experimental data are collected from the AOJ system and the class performance scores of a course (ALDS1). Source code submission logs are accessed through these two (02) web applications: http://developers.u-aizu.ac.jp/index and https://onlinejudge.u-aizu.ac.jp.

CONFLICT OF INTEREST
The authors declare that they have no conflicts of interest.

ETHICS APPROVAL
This research was approved by the Research Ethics Examination Boards, The University of Aizu, Japan.

REFERENCES
[1] A. Vee, “Understanding computer programming as a literacy,” Literacy Composition Stud., vol. 1, no. 2, pp. 42–64, Nov. 2013, doi: 10.21623/1.1.2.4.
[2] L. E. Margulieux, B. B. Morrison, and A. Decker, “Reducing withdrawal and failure rates in introductory programming with subgoal labeled worked examples,” Int. J. STEM Educ., vol. 7, no. 1, pp. 1–16, May 2020, doi: 10.1186/s40594-020-00222-7.
[3] S. Wasik, M. Antczak, J. Badura, A. Laskowski, and T. Sternal, “A survey on online judge systems and their applications,” ACM Comput. Surv., vol. 51, no. 1, pp. 1–34, Apr. 2018, doi: 10.1145/3143560.
[4] R. Yera and L. Martínez, “A recommendation approach for programming online judges supported by data preprocessing techniques,” Appl. Intell., vol. 47, pp. 277–290, Mar. 2017, doi: 10.1007/s10489-016-0892-x.
[5] S. Manzoor, “Common mistakes in online and real-time contests,” ACM Mag. Students, vol. 14, no. 4, pp. 10–16, Jun. 2008, doi: 10.1145/1375972.1375976.
[6] M. A. Revilla, S. Manzoor, and R. Liu, “Competitive learning in informatics: The UVA online judge experience,” Olympiads Informat., vol. 2, no. 10, pp. 131–148, 2008.
[7] F. Okubo, T. Yamashita, and A. Shimada, “Students’ performance prediction using data of multiple courses by recurrent neural network,” in Proc. 25th Int. Conf. Comput. Educ. (ICCE), Christchurch, New Zealand, Dec. 2017, pp. 439–444.
[8] J. Petit, S. Roura, J. Carmona, J. Cortadella, J. Duch, O. Giménez, A. Mani, J. Mas, E. Rodríguez-Carbonell, E. Rubio, and E. de San Pedro, “Jutge.org: Characteristics and experiences,” IEEE Trans. Learn. Technol., vol. 11, no. 3, pp. 321–333, Jul./Sep. 2018, doi: 10.1109/TLT.2017.2723389.
[9] J. L. Bez, N. A. Tonin, and P. R. Rodegheri, “URI online judge academic: A tool for algorithms and programming classes,” in Proc. 9th Int. Conf. Comput. Sci. Educ., Vancouver, BC, Canada, Aug. 2014, pp. 149–152.
[10] R. Romli, S. Sulaiman, and K. Z. Zamli, “Improving automated programming assessments: User experience evaluation using fast-generator,” Proc. Comput. Sci., vol. 72, pp. 186–193, Jan. 2015, doi: 10.1016/j.procs.2015.12.120.
[11] C. A. Higgins, G. Gray, P. Symeonidis, and A. Tsintsifas, “Automated assessment and experiences of teaching programming,” J. Educ. Resour. Comput., vol. 5, no. 3, pp. 1–21, Sep. 2005, doi: 10.1145/1163405.1163410.
[12] I. Mekterovic, L. Brkic, B. Milasinovic, and M. Baranovic, “Building a comprehensive automated programming assessment system,” IEEE Access, vol. 8, pp. 81154–81172, 2020, doi: 10.1109/ACCESS.2020.2990980.
[13] A. Kosowski, M. Malafiejski, and T. Noinski, “Application of an online judge & contester system in academic tuition,” in Proc. 6th Int. Conf. Adv. Web Based Learn. (ICWL), Edinburgh, U.K., Aug. 2007, pp. 343–354.
[14] N. A. Rashid, L. W. Lim, O. S. Eng, T. H. Ping, Z. Zainol, and O. Majid, “A framework of an automatic assessment system for learning programming,” in Advanced Computer and Communication Engineering Technology (Lecture Notes in Electrical Engineering), vol. 362. Cham, Switzerland: Springer, Dec. 2016, pp. 967–977, doi: 10.1007/978-3-319-24584-3_82.
[15] M. M. Rahman, Y. Watanobe, and K. Nakamura, “Source code assessment and classification based on estimated error probability using attentive LSTM language model and its application in programming education,” Appl. Sci., vol. 10, no. 8, p. 2973, Apr. 2020, doi: 10.3390/app10082973.
[16] M. M. Rahman, Y. Watanobe, and K. Nakamura, “A neural network based intelligent support model for program code completion,” Sci. Program., vol. 2020, pp. 1–18, Jul. 2020, doi: 10.1155/2020/7426461.
[17] V. Hegde and H. S. S. Rao, “A framework to analyze performance of Student’s in programming language using educational data mining,” in Proc. IEEE Int. Conf. Comput. Intell. Comput. Res. (ICCIC), Dec. 2017, pp. 1–4, doi: 10.1109/ICCIC.2017.8524244.
[18] K. L.-M. Ang, F. L. Ge, and K. P. Seng, “Big educational data & analytics: Survey, architecture and challenges,” IEEE Access, vol. 8, pp. 116392–116414, 2020, doi: 10.1109/ACCESS.2020.2994561.
[19] J. Knobbout and E. Van Der Stappen, “Where is the learning in learning analytics? A systematic literature review on the operationalization of learning-related constructs in the evaluation of learning analytics interventions,” IEEE Trans. Learn. Technol., vol. 13, no. 3, pp. 631–645, Jul. 2020, doi: 10.1109/TLT.2020.2999970.
[20] Y. Maher, S. M. Moussa, and M. E. Khalifa, “Learners on focus: Visualizing analytics through an integrated model for learning analytics in adaptive gamified e-learning,” IEEE Access, vol. 8, pp. 197597–197616, 2020, doi: 10.1109/ACCESS.2020.3034284.
[21] J. L. F. Aleman, “Automated assessment in a programming tools course,” IEEE Trans. Educ., vol. 54, no. 4, pp. 576–581, Nov. 2011, doi: 10.1109/TE.2010.2098442.
[22] R. E. Francisco and A. P. Ambrosio, “Mining an online judge system to support introductory computer programming teaching,” in Proc. 8th Int. Conf. Educ. Data Mining, Madrid, Spain, Jun. 2015, pp. 1–6.
[23] F. Restrepo-Calle, J. J. R. Echeverry, and F. A. González, “Continuous assessment in a computer programming course supported by a software tool,” Comput. Appl. Eng. Educ., vol. 27, no. 1, pp. 80–89, Sep. 2018, doi: 10.1002/cae.22058.
[24] X. Lu, D. Zheng, and L. Liu, “Data driven analysis on the effect of online judge system,” in Proc. IEEE Int. Conf. Internet Things (iThings) IEEE Green Comput. Commun. (GreenCom) IEEE Cyber, Phys. Social Comput. (CPSCom) IEEE Smart Data (SmartData), Exeter, U.K., Jun. 2017, pp. 573–577.
[25] R. Y. Toledo, Y. C. Mota, and L. Martínez, “A recommender system for programming online judges using fuzzy information modeling,” Information, vol. 5, no. 2, pp. 1–17, Apr. 2018, doi: 10.3390/informatics5020017.
[26] C. M. Intisar, Y. Watanobe, M. Poudel, and S. Bhalla, “Classification of programming problems based on topic modeling,” in Proc. 7th Int. Conf. Inf. Educ. Technol., Aizu-Wakamatsu, Japan, Mar. 2019, pp. 275–283.
[27] A. R. Anaya and J. Boticario, “A data mining approach to reveal representative collaboration indicators in open collaboration frameworks,” in Proc. 2nd Int. Conf. Educ. Data Mining (EDM), Córdoba, Spain, Jul. 2009, pp. 210–219.
[28] S. Kausar, X. Huahu, I. Hussain, W. Zhu, and M. Zahid, “Integration of data mining clustering approach in the personalized e-learning system,” IEEE Access, vol. 6, pp. 72724–72734, 2018, doi: 10.1109/ACCESS.2018.2882240.
[29] E. S. Tabanao, M. M. T. Rodrigo, and M. C. Jadud, “Predicting at-risk novice Java programmers through the analysis of online protocols,” in Proc. 7th Int. Workshop Comput. Educ. Res., New York, NY, USA, Aug. 2011, pp. 58–92.
[30] M. M. Rahman, Y. Watanobe, and K. Nakamura, “An efficient approach for selecting initial centroid and outlier detection of data clustering,” in Proc. 18th Int. Conf. Intell. Softw. Methodol., Tools, Techn. (SOMET), Kuching, Malaysia, Sep. 2019, pp. 616–628.
[31] R. Agrawal and R. Srikant, “Fast algorithms for mining association rules
[45] E. Fernandes, M. Holanda, M. Victorino, V. Borges, R. Carvalho, and G. V. Erven, “Educational data mining: Predictive analysis of academic performance of public school students in the capital of Brazil,” J. Bus. Res., vol. 94, pp. 335–343, Jan. 2019, doi: 10.1016/j.jbusres.2018.02.012.
[46] I. E. Livieris, K. Drakopoulou, V. T. Tampakas, T. A. Mikropoulos, and P. Pintelas, “Predicting secondary school students’ performance utilizing a semi-supervised learning approach,” J. Educ. Comput. Res., vol. 57, no. 2, pp. 448–470, Jan. 2018, doi: 10.1177/0735633117752614.
[47] S. A. Salloum, M. Alshurideh, A. Elnagar, and K. Shaalan, “Mining in educational data: Review and future directions,” in Proceedings of the International Conference on Artificial Intelligence and Computer Vision (AICV2020) (Advances in Intelligent Systems and Computing), vol. 1153, A. E. Hassanien, A. Azar, T. Gaber, D. Oliva, and F. Tolba, Eds. Cham, Switzerland: Springer, 2020, doi: 10.1007/978-3-030-44289-7_9.
[48] T. P. Tran and D. Meacheam, “Enhancing learners’ experience through extending learning systems,” IEEE Trans. Learn. Technol., vol. 13, no. 3, pp. 540–551, Jul./Sep. 2020, doi: 10.1109/TLT.2020.2989333.
[49] O. Viberg, M. Hatakka, O. Bälter, and A. Mavroudi, “The current landscape of learning analytics in higher education,” Comput. Hum. Behav., vol. 89, pp. 98–110, Dec. 2018, doi: 10.1016/j.chb.2018.07.027.
[50] L.-K. Lee, S. K. S. Cheung, and L.-F. Kwok, “Learning analytics: Current trends and innovative practices,” J. Comput. Educ., vol. 7, no. 1, pp. 1–6, Feb. 2020, doi: 10.1007/s40692-020-00155-8.
[51] F. Dunke and S. Nickel, “A data-driven methodology for the automated configuration of online algorithms,” Decis. Support Syst., vol. 137, Oct. 2020, Art. no. 113343, doi: 10.1016/j.dss.2020.113343.
[52] Y. Watanobe. Aizu Online Judge. Accessed: May 19, 2020. [Online]. Available: https://onlinejudge.u-aizu.ac.jp
[53] Aizu Online Judge: Developers Site (API). Accessed: Dec. 19, 2019. [Online]. Available: http://developers.u-aizu.ac.jp/index
[54] T. Saito and Y. Watanobe, “Learning path recommendation system for programming education based on neural networks,” Int. J. Distance
in large databases,’’ in Proc. 20th Int. Conf. Very Large Data Bases, Educ. Technol., vol. 18, no. 1, pp. 36–64, Jan. 2020, doi: 10.4018/IJDET.
San Francisco, CA, USA, Sep. 1994, pp. 487–499. 2020010103.
[32] H. Toivonen, ‘‘Sampling large databases for association rules,’’ in Proc. [55] Y. Watanobe, C. M. Intisar, R. Cortez, and A. Vazhenin, ‘‘Next-
22nd Int. Conf. Very Large Data Bases, San Francisco, CA, USA, generation programming learning platform: Architecture and challenges,’’
Sep. 1996, pp. 134–145. in Proc. 2nd ACM Chapter Conf. Educ. Technol., Lang. Tech. Commun.,
[33] J. S. Park, M.-S. Chen, and P. S. Yu, ‘‘An effective hash-based algo- Aizu-Wakamatsu, Japan, Nov. 2020, pp. 1–11.
rithm for mining association rules,’’ ACM SIGMOD Rec., vol. 24, no. 2, [56] M. M. Rahman, Y. Watanobe, and K. Nakamura, ‘‘A bidirectional LSTM
pp. 175–186, May 1995, doi: 10.1145/568271.223813. language model for code evaluation and repair,’’ Symmetry, vol. 13, no. 2,
[34] S. Brin, R. Motwani, J. D. Ullman, and S. Tsur, ‘‘Dynamic itemset counting p. 247, Feb. 2021, doi: 10.3390/sym13020247.
and implication rules for market basket data,’’ ACM SIGMOD Rec., vol. 26, [57] International Business Machines (IBM). (May 2021). Project CodeNet.
no. 2, pp. 255–264, Jun. 1997, doi: 10.1145/253262.253325. [Online]. Available: https://github.com/IBM/Project_CodeNet
[35] A. Savasere, E. Omiecinski, and S. B. Navathe, ‘‘An efficient algorithm for [58] P. Wang, C. An, and L. Wang, ‘‘An improved algorithm for mining associa-
mining association rules in large databases,’’ in Proc. 21th Int. Conf. Very tion rule in relational database,’’ in Proc. Int. Conf. Mach. Learn. Cybern.,
Large Data Bases, San Francisco, CA, USA, Sep. 1995, pp. 432–444. Lanzhou, China, Jul. 2014, pp. 247–252.
[36] D. W. Cheung, J. Han, V. T. Ng, and C. Y. Wong, ‘‘Maintenance of
discovered association rules in large databases: An incremental updating
technique,’’ in Proc. 12th Int. Conf. Data Eng., New Orleans, LA, USA,
1996, pp. 106–114, doi: 10.1109/ICDE.1996.492094.
[37] J. Han, J. Pei, and Y. Yin, ‘‘Mining frequent patterns without candidate
generation,’’ in Proc. ACM SIGMOD Int. Conf. Manage. Data (SIGMOD),
New York, NY, USA, May 2000, pp. 1–12.
[38] D. Ai, H. Pan, X. Li, Y. Gao, and D. He, ‘‘Association rule mining
algorithms on high-dimensional datasets,’’ Artif. Life Robot., vol. 23, no. 3,
pp. 420–427, May 2018, doi: 10.1007/s10015-018-0437-y.
MD. MOSTAFIZER RAHMAN received the B.Sc. degree in engineering from the
Department of Computer Science and Engineering, Hajee Mohammad Danesh Science
and Technology University, Dinajpur, Bangladesh, in 2009, and the M.Sc. degree
in engineering from the Department of Computer Science and Engineering, Dhaka
University of Engineering & Technology, Gazipur, Bangladesh, in 2014. He is
currently pursuing the Ph.D. degree with the Database Systems Laboratory,
Department of Computer and Information Systems, The University of Aizu,
Aizuwakamatsu, Fukushima, Japan. He is also working (on study leave) at the
Dhaka University of Engineering & Technology. His research interests include
machine learning, deep learning, machine learning application in programming,
programming education, data mining, and big data analytics.

YUTAKA WATANOBE (Member, IEEE) received the master’s and Ph.D. degrees from
The University of Aizu, Japan, in 2004 and 2007, respectively. He was a
Research Fellow of the Japan Society for the Promotion of Science (JSPS), The
University of Aizu, in 2007. He was a Coach of several ACM-ICPC World Final
teams. He is currently a Senior Associate Professor with the School of
Computer Science and Engineering, The University of Aizu. He is a Key Member
of the Aizu Online Judge (AOJ) System. His research interests include visual
programming language, programming education, data mining, e-learning systems,
filmification of methods, and cloud robotics.

TRUONG CONG THANG (Senior Member, IEEE) received the B.E. degree from the
Hanoi University of Science and Technology, Vietnam, in 1997, and the Ph.D.
degree from KAIST, South Korea, in 2006. From 1997 to 2000, he was a Network
Engineer with the Vietnam Post and Telecommunications (VNPT). Since 2002, he
has been an Active Member of Korean and Japanese delegations to standard
meetings of ISO/IEC and ITU-T. From 2007 to 2011, he was a member of Research
Staff with the Electronics and Telecommunications Research Institute (ETRI),
South Korea. Since 2011, he has also been an Associate Professor with The
University of Aizu, Japan. His research interests include multimedia
networking, image/video processing, content adaptation, IPTV, and MPEG/ITU
standards.