Impact of Practical Skills On Academic Performance A Data-Driven Analysis

This document summarizes a research article that analyzed the impact of practical skills like programming on academic performance using educational data. The researchers collected programming logs from an online judge system and students' academic scores. They applied clustering and association rule mining to extract hidden features and relationships. The results showed correlations between practical skills and overall performance. The findings can help students improve and help educators develop effective lessons.

Impact of Practical Skills on Academic Performance: A Data-Driven Analysis

Article in IEEE Access · October 2021

DOI: 10.1109/ACCESS.2021.3119145


IEEE EDUCATION SOCIETY SECTION

Received August 26, 2021, accepted October 5, 2021, date of publication October 8, 2021, date of current version October 19, 2021.
Digital Object Identifier 10.1109/ACCESS.2021.3119145

Impact of Practical Skills on Academic Performance: A Data-Driven Analysis
MD. MOSTAFIZER RAHMAN (1,2), YUTAKA WATANOBE (1) (Member, IEEE), RAGE UDAY KIRAN (1), TRUONG CONG THANG (1) (Senior Member, IEEE), AND INCHEON PAIK (1) (Senior Member, IEEE)
(1) Graduate Department of Computer Science and Engineering, The University of Aizu, Aizuwakamatsu, Fukushima 965-8580, Japan
(2) Department of Computer Science and Engineering, Dhaka University of Engineering & Technology, Gazipur 1707, Bangladesh
Corresponding authors: Md. Mostafizer Rahman ([email protected]) and Yutaka Watanobe ([email protected])
This work was supported by the Japan Society for the Promotion of Science (JSPS) KAKENHI under Grant 19K12252.
This work involved human subjects or animals in its research. Approval of all ethical and experimental procedures and protocols was
granted by the Research Ethics Examination Boards, The University of Aizu, Japan.

ABSTRACT Most academic courses in information and communication technology (ICT) or engineering
disciplines are designed to improve practical skills; however, practical skills and theoretical knowledge are
equally important to achieve high academic performance. This research aims to explore how practical skills
are influential in improving students’ academic performance by collecting real-world data from a computer
programming course in the ICT discipline. Today, computer programming has become an indispensable
skill for its wide range of applications and significance across the world. In this paper, a novel framework to
extract hidden features and related association rules using a real-world dataset is proposed. An unsupervised
k-means clustering algorithm is applied for data clustering, and then the frequent pattern-growth algorithm
is used for association rule mining. We leverage students’ programming logs and academic scores as
an experimental dataset. The programming logs are collected from an online judge (OJ) system, as OJs
play a key role in conducting programming practices, competitions, assignments, and tests. To explore
the correlation between practical (e.g., programming, logical implementations, etc.) skills and overall
academic performance, the statistical features of students are analyzed and the related results are presented.
A number of useful recommendations are provided for students in each cluster based on the identified hidden
features. In addition, the analytical results of this paper can help teachers prepare effective lesson plans,
evaluate programs with special arrangements, and identify the academic weaknesses of students. Moreover,
a prototype of the proposed approach and data-driven analytical results can be applied to other practical
courses in ICT or engineering disciplines.

INDEX TERMS Practical skills, programming education, feature extraction, educational data mining,
learning analytics, e-learning, online judge, clustering, association rule mining.

The associate editor coordinating the review of this manuscript and approving it for publication was Meriel Huggard.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 9, 2021 139975
M. M. Rahman et al.: Impact of Practical Skills on Academic Performance: A Data-Driven Analysis

I. INTRODUCTION
Most courses in information and communication technology (ICT), computer science, and engineering-related disciplines are designed with a practical basis. Basically, each course consists of two parts, namely theory and exercise, where theory develops students’ theoretical knowledge, ideas, and memorization. In contrast, exercise or practical application develops logic, critical thinking, problem-solving skills, and implementation skills. Computer programming is an example of a practical course in these disciplines. The necessity of programming education is rapidly growing in pace with the increasing expansion of computers into our daily lives; thus, computer programming is among the key courses in the ICT discipline and has become a foundational course in other disciplines as well [1]. In a recent effort to encourage students, including children, to take an increased interest in programming, numerous online programming platforms have become available. Here, it should be noted that because the primary requirement of programming education is to ensure
that students achieve computer literacy [1], most educational institutions that teach programming have redesigned their academic curriculum to effectively meet the basic literacy requirements of programming education.

Basic computer programming courses are normally available in the first semester of university studies. Initial programming classes have the ancillary role of attracting students to the field of computer programming. Because students may make decisions based on these initial programming classes, it is essential that those classes impart positive programming experiences. Note that introductory programming courses have a significant rate of failure and dropout [2]. However, due to limited amounts of class time, classrooms, and teachers, and limitations in other forms of logistic support, it is difficult to fully educate students in programming through traditional programming classes alone. To overcome these problems, online judge (OJ) systems provide additional platforms that enable students to continue their programming studies over a period of years [3]. Such systems normally contain large collections of interesting programming problems [4] that students can pursue independently or teachers can assign to stimulate students’ interest. The concept of the OJ system was first introduced at the 1977 International Collegiate Programming Contest (ICPC) [5], [6], which is now held annually. Furthermore, because OJ systems have proven useful, many universities and colleges are now attempting to develop online support systems for programming education [7]–[9].

Today, OJs are used by many educational institutions to conduct courses related to programming, computing, and software engineering [10], [11]. Many universities have created their own automated program assessment (APA) systems for programming courses to accelerate students’ learning [12]–[14]. As a result, a large number of programming-related submission logs are created every day by OJ or APA systems in various organizations worldwide, which can be valuable resources for research and analysis [15], [16]. Therefore, this research aims to use programming-related resources (submission logs) for empirical research and analysis.

Educational data collected from various e-learning platforms such as Moodle, MOOCs, OJs, and APAs are not unified, structured, well organized, or stored in a common format, because the data archiving format differs from one e-learning platform to another. Therefore, educational data mining (EDM) and learning analytics (LA) techniques are effective in transforming these big educational data into useful knowledge and patterns that can be applied to improve overall education. EDM has become an effective technique for exploring invisible knowledge and useful patterns in educational data. Nowadays, traditional education is changing at an unprecedented pace and many academic activities are conducted on e-learning platforms. The collection of this vast amount of educational data has opened up opportunities for research and analysis to understand and improve learning outcomes. V. Hegde and S. Rao H.S. [17] presented an EDM-based framework to analyze students’ performance in programming. The results of the analysis helped students to improve their weak concepts through frequent faculty support and also offered benefits to institutions. Ang et al. [18] conducted a comprehensive survey and presented architectures and their challenges for growing big educational data.

On the other hand, learning analytics (LA) refers to the collection, analysis, and visualization of educational data to better understand and improve learning processes and outcomes. LA provides interventions based on the analysis of educational data to improve both learning and the learning environment [19]. Also, LA encompasses broader components of other disciplines such as EDM, academic analytics, learning sciences, cognitive sciences, human factors, psychology, and so on. Maher et al. [20] proposed a Personalized Adaptive Gamified E-learning (PAGE) model to enhance MOOC LA and visualization in the learning process. The PAGE model helped learners in learning adaptation and visualization.

In this research, our goal is to investigate the impact of practical skills on academic performance through a comprehensive analysis using real-world e-learning data. Considering the context of this study, two important terms, practical skills and academic performance, are defined as follows.

Practical skills relate to reasoning, critical-thinking, problem-solving, and implementation skills. Let us consider a basic programming course that consists of two kinds of learning activities, theory-based and practice-based. The practice-based activities include programming, programming-related assignments, and coding tests. In this research, performance in practice-based activities is referred to as practical skills. On the other hand, academic performance refers to theoretical knowledge, innovative ideas, and memorization. Performance in theory-based activities, which include algorithm-based assignments, theory-based assignments, and paper-based tests, is referred to as academic performance.

To accomplish this study, a novel framework is proposed to extract students’ hidden features and association rules. Hidden features derived from submission logs and scores carry significant meaning. This work makes the following contributions:
• We present the correlation between practical skills and academic performance.
• We find students’ programming and academic weaknesses and strengths through empirical analysis using submission logs and scores. Further, necessary recommendations have been provided accordingly.
• We extract important and relevant features from the submission logs and scores that are not clearly visible in a simple form of dataset.
• We determine that the hidden features are useful to students as well as teachers to achieve programming and academic goals.
• The proposed framework and its data analysis process can be useful for other related academic courses and
disciplines to discover hidden features/correlations in e-learning data. For example, this framework can be applied to a course that consists of theory and hands-on activities and collects resources/data, like a programming course.

The rest of the article is structured as follows. In Section II, the background and related works are presented. Section III describes the dataset and preprocessing. In Section IV, the proposed approach is presented. The experimental results are presented in Section V and discussed in Section VI. Section VII concludes the study with an outlook on future work.

II. BACKGROUND AND RELATED WORKS
In this section, we briefly introduce OJ or APA systems and their applications in programming education. In addition, supervised and unsupervised learning algorithms, association rule mining (ARM) algorithms, educational data mining, and learning analytics are also presented.

A. ONLINE PROGRAMMING LEARNING PLATFORM
OJ or APA systems are now widely used by educational institutions as academic learning tools in programming and other exercise-based classes. These platforms play an important role in improving students’ programming skills, knowledge, and overall academic performance. The vast resources (e.g., code archives, submission logs, etc.) generated by these systems can help researchers to find students’ flaws in programming and thus expand the scope of available improvements. As a result, numerous studies have focused on programming education, educational data mining, and data-driven analysis using resources from OJ or APA systems.

In [7], the authors used learning log data extracted from the M2B system. A recurrent neural network is used to predict student performance. This study showed that numerous useful hidden features can be extracted by analyzing the M2B system’s data. Mekterović et al. [12] proposed an APA system for conducting programming courses and created the educational software Edgar to automatically evaluate programming assignments and other programming-related tasks. Edgar provides a variety of services, including content writing, course administration, system monitoring, and troubleshooting. Furthermore, Edgar produces the results of various statistics in a visual format. APA systems provide many benefits for students as well as instructors. Meanwhile, a ranking system [21] based on student performance and quick responses has positively impacted programming learning. APA systems have extended the conventional use of the OJ systems for evaluating programming assignments, and their use significantly stimulates students’ interest in programming.

In [22], the authors extended the BOCA OJ system to improve its suitability for programming classes. The resulting PROBOCA project was used to aid classroom teachers. This method identifies problems by degree of difficulty, thus making it easier for teachers to match problems with each student’s programming experience. Another study [23] presents a continuous programming assessment system for programming courses using automated assessment tools (AATs). A quantitative analysis was performed based on the relationship between the student and the AAT outcome. The submitted solutions are analyzed in depth using an AAT and judgments (either correct or incorrect) are provided. The experimental results showed that AATs help students to better understand computer programming. Lu et al. [24] presented programming education via an OJ system that has increased students’ performance curve in programming and other academic activities. Their experimental results show that the OJ system enhanced performance levels, as well as stimulated students’ interest throughout the year- or semester-long course.

Toledo et al. [25] presented a fuzzy recommender system for OJ programming that provides suggestions to learners regarding their upcoming problems based on their past performance in the OJ system. That method also provided useful information to students via recommendations. In [26], OJ programming problems were classified using two topic-modeling algorithms, latent Dirichlet allocation and non-negative matrix factorization, in order to extract relevant features from problem descriptions. The classification of OJ programming problems can help novice and advanced students to pick and solve appropriate problems.

Our approach differs from that of existing research by focusing on discovering hidden features from submission logs and scores to improve programming skills and academic performance. We also focus on finding the correlation between practical skills and academic performance based on the extracted hidden features. To the best of our knowledge, no study has been conducted to address this issue by using submission logs and scores.

B. SUPERVISED AND UNSUPERVISED LEARNING ALGORITHMS
Within the context of artificial intelligence and machine learning (ML), supervised and unsupervised learning algorithms are frequently used in real-world applications. In short, both input data and output labels are known in supervised learning (SL) algorithms. Formally, SL involves ML algorithms that are trained with known input data and associated output labels. Let $U = \{u_1, u_2, u_3, \ldots, u_n\}$ be the set of input data and $V = \{v_1, v_2, v_3, \ldots, v_n\}$ be the set of corresponding output labels of the input data $U$. Thus, the output function can be written as $V = f(U)$, where the output $V$ depends on the input $U$ and $f$ is a mapping function. After training, the ML algorithm can predict the output label for new input data. SL algorithms are divided into two categories: classification and regression.

Classification is an SL approach that classifies a given set of data into classes. The classification model predicts the target class for a given data point. After training, the model predicts class names for data that it has not seen before. There are two types of classification in ML, namely binary
classification (true or false) and multi-class classification. Typically, the evaluation of a classification model is done by computing the precision, recall, and accuracy scores. Examples of classification algorithms include support vector machine, decision tree, random forest, artificial neural network, similarity learning, and k-nearest neighbor [27].

Similarly, regression is an SL approach used to predict a continuous output variable based on one or more independent (predictor) variables. Mainly, this approach is used for forecasting, time series modeling, prediction, and determining market trends. Examples of regression algorithms include linear regression, logistic regression, polynomial regression, decision tree regression, and random forest regression [27].

In contrast, unsupervised learning (USL) is a kind of ML algorithm in which models are trained with unlabeled datasets. A USL algorithm can group the data based on their similarity features by applying some mathematical procedures. USL algorithms have the following advantages over SL: hidden feature extraction, useful insights, human-like learning, and handling of unlabeled and uncategorized data. USL algorithms are divided into two categories: clustering and association. Clustering is a USL technique that groups data into clusters such that data within a cluster are highly similar, whereas similarity with the data of other clusters is minimal. Examples of clustering algorithms include k-means, k-medoids, Density-based Spatial Clustering of Applications with Noise (DBSCAN), Clustering Large Applications based on RANdomized Search (CLARANS), and Clustering Large Applications (CLARA). Association is a USL technique that is used to find relationships between items in a large database. This algorithm determines the sets of items that co-occur in a database. For example, if three items $M$, $N$, and $O$ exist in the database, the algorithm can generate patterns/rules over co-occurring items such as $M \rightarrow N$, $(M \,\&\, N) \rightarrow O$, and $N \rightarrow M$. These patterns/rules are useful for analyzing market-basket data, educational data, and so on. Examples of association algorithms include Apriori and frequent pattern (FP)-growth.

In [28], students have been classified by a clustering approach based on their learning behaviors. The clustering by fast search and finding of density peaks via heat diffusion (CFSFDP-HD) algorithm has achieved a better clustering performance than other clustering algorithms. The authors also proposed an e-learning system architecture that detects and responds to teaching content based on student learning capabilities. Tabanao et al. [29] proposed a method that classifies programmers using submission log data, such as compilation profiles, error profiles, compilation frequency, and error quotient profiles produced during an introductory programming course. This study identified correlations between the submission log data and the midterm examination scores of students.

In our dataset, the output labels are unknown because the submission logs and class performance scores do not yet contain information that could be used for labeling (e.g., poor, good, very good, or genius). Accordingly, as it is necessary to select an algorithm that can group students based on their source code submission logs and class performance scores, we expected that a clustering approach would provide the best-suited solution to group the students from unlabeled datasets. The most commonly used and effective clustering approaches, such as k-means, k-medoids, DBSCAN, agglomerative hierarchical cluster tree, and other variations of k-means, were identified through a review. The modified k-means clustering algorithm [30], which we found to be a robust, scalable, and effective tool, is a variant of the conventional k-means clustering algorithm.

C. ASSOCIATION RULE MINING ALGORITHMS
An ARM algorithm is a USL algorithm used for data mining in big data. ARM was first proposed by Agrawal [31] and has since been used in many fields, such as educational data analysis, medical data analysis, market-basket analysis, and census data. Usually, ARM aims to find a set of co-occurring high-frequency items and extract the correlation among items from large datasets. Although the Apriori algorithm [31] is often used for data mining, many enhancements based on Apriori have been proposed to improve performance and scalability, such as the sampling approach [32], hashing technique [33], dynamic counting [34], partitioning technique [35], and incremental mining [36]. Prior studies showed that the Apriori algorithm achieved significant results, but some methods also reported worse results caused by generating a large number of candidate item sets, additional scans, etc.

Subsequently, a new algorithm called FP-growth was proposed that does not rely on candidate item set generation [37]. This method used a partitioning-based divide-and-conquer approach. Previous studies have shown that it significantly reduced the search space and time compared to Apriori [38]. Similarly, many extensions have been added to the FP-growth algorithm to improve efficiency. Some examples of enhanced FP-growth algorithms are h-mine [39], depth-first mining [40], pattern-growth mining in both directions (bottom-up and top-down), and tree structures [41], [42]. In contrast, Zaki [43] proposed the Equivalence CLASS Transformation (Eclat) algorithm for ARM applied to vertical data. Eclat uses the same candidate-generation process as Apriori. In brief, the Apriori, FP-growth, and Eclat ARM algorithms are the most frequently used in many applications, and also serve as the foundation of many other ARM algorithms. In this research, an FP-growth algorithm is used.

D. EDUCATIONAL DATA MINING AND LEARNING ANALYTICS
EDM is the same as traditional data mining, except that it is applied to educational fields. EDM is used to extract hidden knowledge and discover patterns from the data in different educational learning platforms [44]. In the study [44], various data mining techniques including clustering, classification, and ARM are exploited to discover useful information from the educational data. They used EDM tools (RapidMiner and Weka) to analyze data from Moodle in a programming
course. Fernandes et al. [45] presented a predictive analysis of students’ academic performance. The Gradient Boosting Machine (GBM) classification model was applied to predict students’ academic performance at the end of the school year. In another study [46], a semi-supervised learning algorithm was used to predict the students’ performance in the final exams. In the study [47], a survey of EDM and its future directions is presented. It also discusses some recent trends in the field of EDM research.

TABLE 1. Topic-wise problem list of ALDS1 course.
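
The clustering idea discussed in Section II-B can be illustrated with a minimal sketch of conventional k-means (assignment step followed by centroid update). This is not the modified k-means of [30] and not the authors' implementation; the function name `kmeans`, the fixed seed, and the two-dimensional (practical score, theory score) points are hypothetical values chosen only for illustration.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Sequence[Point], k: int, iterations: int = 100,
           seed: int = 0) -> Tuple[List[int], List[Point]]:
    """Conventional k-means; returns (cluster label per point, centroids)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k input points drawn without replacement.
    centroids = rng.sample(list(points), k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:  # converged: centroids stopped moving
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical (practical score, theory score) pairs for six students.
scores = [(20.0, 30.0), (25.0, 35.0), (22.0, 28.0),
          (90.0, 95.0), (85.0, 88.0), (95.0, 92.0)]
labels, centroids = kmeans(scores, k=2)
```

With two well-separated groups such as these, the first three students and the last three students end up in different clusters; which cluster receives label 0 depends on the random initialization.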
LA has become an important research topic in the field of educational technology. This involves understanding and analyzing real-world educational data to provide useful support for improving learning and teaching. Tran et al. [48] used LA for a learning management system (LMS). The experimental results showed that LA plays an important role in improving productivity, learning, and support for LMS users. Ang et al. [18] discussed LA from five different perspectives: learning and assessment analysis, personalized learning, behavior learning, collaborative and interactive learning, and social network analytics. Current LA trends and practices to improve teaching and learning in education are presented in [49], [50].
In addition, numerous studies have been conducted using the resources of OJ or APA systems. These systems are actively used for education, e-learning, computing, programming competitions, and software engineering. The importance of empirical data-driven analysis to make critical decisions, and even to change algorithm configurations automatically, is growing [51]. However, this data-driven analytical research differs from previous studies in that a real-world dataset has been used. The analytical findings of this research can help students improve their academic and practical performance and can support educational planning.
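
The rule notation of Section II-B, such as M → N, can be made concrete with the standard support and confidence measures: the support of an item set is the fraction of transactions containing it, and the confidence of X → Y is support(X ∪ Y) / support(X). The sketch below enumerates item sets by brute force, so it is neither FP-growth nor the authors' implementation; the transactions, thresholds, and function names are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Tuple

# (antecedent, consequent, support, confidence)
Rule = Tuple[Tuple[str, ...], Tuple[str, ...], float, float]


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def mine_rules(transactions: List[FrozenSet[str]], min_support: float,
               min_confidence: float) -> List[Rule]:
    """Brute-force rule mining; only suitable for a handful of items."""
    all_items = sorted(set().union(*transactions))
    rules: List[Rule] = []
    for size in range(2, len(all_items) + 1):
        for itemset in combinations(all_items, size):
            s = support(itemset, transactions)
            if s == 0 or s < min_support:
                continue  # infrequent item sets produce no rules
            # Split the frequent item set into antecedent -> consequent.
            for r in range(1, size):
                for antecedent in combinations(itemset, r):
                    consequent = tuple(i for i in itemset if i not in antecedent)
                    confidence = s / support(antecedent, transactions)
                    if confidence >= min_confidence:
                        rules.append((antecedent, consequent, s, confidence))
    return rules


# Hypothetical transactions over the three items M, N, and O.
transactions = [frozenset(t) for t in (
    {"M", "N", "O"}, {"M", "N"}, {"M", "O"}, {"N"}, {"M", "N", "O"})]
rules = mine_rules(transactions, min_support=0.4, min_confidence=0.7)
for antecedent, consequent, s, confidence in rules:
    print(antecedent, "->", consequent,
          f"support={s:.2f} confidence={confidence:.2f}")
```

Apriori and FP-growth search for the same frequent item sets far more efficiently; the brute-force loop here only serves to show what a mined rule and its two measures look like.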

III. DATASET AND PREPROCESSING

In this section, we introduce the Aizu Online Judge (AOJ) system, which is the source of the submission logs. Moreover, we describe submission logs and class performance scores collected from the AOJ and a programming course, respectively, as our datasets. The data types, structure, and preprocessing steps are also presented.
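
As a rough guide to the data described in this section, one AOJ submission log could be represented as a record like the one below. The fields follow the log tuple given in Section III-B (user, judge id, problem, verdict, language, CPU time, memory usage, code size, submission date, judge date). The Python types, the absence of units, the string dates, and the helper `verdict_counts` are assumptions made for illustration and do not reflect the AOJ's actual schema.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable


class Verdict(Enum):
    """Judge verdicts named in the paper."""
    AC = "Accepted"
    WA = "Wrong Answer"
    CE = "Compile Error"
    RTE = "Run Time Error"
    MLE = "Memory Limit Exceeded"
    TLE = "Time Limit Exceeded"
    OLE = "Output Limit Exceeded"
    PRE = "Presentation Error"


@dataclass(frozen=True)
class SubmissionLog:
    uid: str              # user (student) id
    jid: int              # judge id of this submission
    pid: str              # problem id
    verdict: Verdict
    language: str         # e.g. "C", "C++", "Python 3"
    cpu_time: int         # unit not stated in the text (assumption: integer)
    memory_usage: int     # unit not stated in the text (assumption: integer)
    code_size: int        # unit not stated in the text (assumption: integer)
    submission_date: str  # date format is an assumption
    judge_date: str       # date format is an assumption


def verdict_counts(logs: Iterable[SubmissionLog]) -> Dict[str, Counter]:
    """Hypothetical helper: count verdicts per user."""
    counts: Dict[str, Counter] = {}
    for log in logs:
        counts.setdefault(log.uid, Counter())[log.verdict] += 1
    return counts


# Made-up logs for two users, for illustration only.
sample_logs = [
    SubmissionLog("u1", 1, "p1", Verdict.WA, "C++", 0, 0, 0, "", ""),
    SubmissionLog("u1", 2, "p1", Verdict.AC, "C++", 0, 0, 0, "", ""),
    SubmissionLog("u2", 3, "p1", Verdict.CE, "C", 0, 0, 0, "", ""),
]
counts = verdict_counts(sample_logs)
```

Per-user counts of this kind are one simple example of a statistic that can be derived from raw logs; the features actually used in the study are described by the authors in later sections.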
A. AIZU ONLINE JUDGE SYSTEM
The AOJ system [52], [53] is a popular OJ platform in Japan and worldwide. It has been running for more than 15 years to host programming competitions, practices, assignments, and education. In addition, the AOJ system is officially employed to conduct programming- and algorithm-related courses at the University of Aizu, Japan. The AOJ’s typical courses include Introduction to Programming I (ITP1), Algorithms and Data Structures I (ALDS1), Introduction to Programming II (ITP2), Datasets and Queries, Discrete Optimization Problems, Graph Algorithms, Computational Geometry, and Number Theory. Thus, plenty of source codes and submission logs are generated on a regular basis. AOJ has a rich repository with approximately 100,000 users, 3,000 problems, and 5.5 million code archives and submission logs. All the problems are systematically categorized [54]. The AOJ’s resources have been used for various research and application purposes [55], [56]. Recently, AOJ’s dataset has been used in the IBM CodeNet Project [57].

B. SOLUTION SUBMISSION LOGS
In this research, submission logs from a programming course (ALDS1) were collected for experiments. Usually, the problems in the ALDS1 course are assigned to students to solve, as shown in Table 1. The overall topic-wise success rate (%) of this course is also given in Table 1. The ALDS1 course has 13 topics, and each topic consists of
three (03) or four (04) problems, which we call problems A, B, C, and D.

The logs are generated by the AOJ system based on the solution codes submitted by the students over two semesters; the size of the submission logs is approximately 69,000. Each solution log has a set of information, such as the judge id (jid), user id (uid), problem id (pid), language (C, C++, Python, etc.), accuracy, verdict (accepted, wrong answer, compile error, etc.), CPU time, memory usage, code size, submission date, and judge date. Let $UID$ be a set of users (students), i.e., $UID = \{uid_1, uid_2, \ldots, uid_n\}$, $n \geq 1$; $JID$ is a set of judge IDs, $JID = \{jid_1, jid_2, \ldots, jid_m\}$, $m \geq 1$; $Prob$ is a set of problems, $Prob = \{prob_1, prob_2, \ldots, prob_j\}$, $j \geq 1$, where $prob_1, prob_2, \ldots, prob_j$ are unique problems; the judge verdicts are $Verd = \{AC, WA, CE, RTE, MLE, TLE, OLE, PrE\}$, where AC = Accepted, WA = Wrong Answer, CE = Compile Error, RTE = Run Time Error, TLE = Time Limit Exceeded, OLE = Output Limit Exceeded, MLE = Memory Limit Exceeded, and PrE = Presentation Error; and the programming languages are $Lang$ = {C, C++, C++11, Ruby, Python 2, Python 3, Java, Haskell, C#, PHP, Rust, ...}. A corresponding submission log is created immediately after a solution code is submitted to the AOJ system. Thus, a sample output log of the AOJ system can be written as $O_{logs} = \{u_r, j_s, p_t, v_u, l_v, ct, mu, cs, sd, jd\}$, where $u_r \in UID$, $j_s \in JID$, $p_t \in Prob$, $v_u \in Verd$, $l_v \in Lang$, $ct$ = CPU time, $mu$ = memory usage, $cs$ = code size, $sd$ = submission date, and $jd$ = judge date. Some sample logs generated by the AOJ system are listed in Table 2.

TABLE 2. Sample submission logs generated by the AOJ system.

C. CLASS PERFORMANCE SCORES
In addition to the submission logs, we collected various test (exam) scores for the ALDS1 course from 357 students in two different years at the University of Aizu, Japan. Usually, most students take the ALDS1 course as part of their regular study. This course consists of various tests, such as algorithm assignment (AA), programming assignment (PA), code validation (CVal), coding test (CoT), and paper-based test (PT). Note that PA and CVal are calculated based on the student’s program submission to the AOJ system. To check the plagiarism/similarity/duplication of submitted solution codes, a plagiarism checking software (PCS) has been developed and integrated into the management system of AOJ. The PCS checks solution codes submitted by the students against the existing source codes in the AOJ. The PCS generates a CVal score for submitted codes based on the degree of similarity, and the codes are collected from a specific time period and users. A score of 1 means that there is no copying/duplication, 0.5 means that a few codes are copied from others, and 0 means that a number of codes are copied from others. In addition, CVal is used to justify the scores of PA. A sample data distribution of student evaluations is listed in Table 3.

TABLE 3. Sample data distribution of student evaluations.

Definition 1: The CVal score refers to the degree/level of program code plagiarism.

Example 1: If a user $u_{15}$ copies/replicates programs from others, then $u_{15}$ receives a CVal score of 0.5; if user $u_{15}$ copies/replicates the code from others with malicious intent, CVal is 0.

Each exercise class is divided into two parts. First, students are asked to submit an AA, which consists of a few questions and is also considered student attendance (AT). Second, three or four problems are given to the students as a PA. The students are then encouraged to submit their solutions through the AOJ system. Students are allowed to consult with each other, teachers, and teaching assistants to solve problems during PA. In contrast, CoT is conducted in exercise rooms with a separate workstation for each student, providing a process by which each student’s actual programming capabilities can be verified. Note that it is strictly forbidden for a student to consult with other students during the CoT. Similarly, the PT is a closed-book test that is given to check the true level of each student’s theoretical understanding. The test score distribution can be expressed as $T_{score} = \{UID, AT, AA, PA, CVal, PT, CoT, T, Prac\}$, where $AT \in \mathbb{N}$ and $0 \leq AT \leq 13$, $AA \in \mathbb{N}$ and $0 \leq AA \leq 100$, $PA \in \mathbb{N}$ and $0 \leq PA \leq 120$, $CVal \in \mathbb{R}$ and $0 \leq CVal \leq 1$, $PT \in \mathbb{N}$ and $0 \leq PT \leq 120$, $CoT \in \mathbb{N}$ and $0 \leq CoT \leq 120$, $T \in \mathbb{N}$ and $0 \leq T \leq 120$, $Prac \in \mathbb{N}$ and $0 \leq Prac \leq 120$. To better evaluate the students of the ALDS1 course

139980 VOLUME 9, 2021

M. M. Rahman et al.: Impact of Practical Skills on Academic Performance: A Data-Driven Analysis

FIGURE 1. The overall framework of the data-driven approach.

by considering the importance of theoretical and practical knowledge, equations (1), (2), and (3) are developed for the Theory (T), Practical (Prac), and Final Score (FS) calculations, respectively, based on the different test scores. Note that the equations for the ALDS1 course are approved by the course coordinator.

$$T = \sqrt{AA \times PT} \tag{1}$$

$$Prac = \sqrt{PA \times CoT} \times CVal \tag{2}$$

$$FS = \min\left(100,\ \left\lfloor \frac{AT + 1}{10} \right\rfloor \times \sqrt{T \times Prac}\right) \tag{3}$$

For explanation, the student evaluation process is compared using the following two scenarios: (i) the conventional case and (ii) the proposed case (based on the equations).

1) CONVENTIONAL CASE
In this case, the final results are usually generated by averaging the Prac and T scores. For example, if student s1 gets 10 points on the Prac test and 90 points on the T test, the final result of s1 using the conventional method is (10 + 90)/2 = 50.

2) PROPOSED CASE
In this case, the Prac and T scores are given equal priority to generate the final results, so equations (1)-(3) are introduced to emphasize both the Prac and T scores. Let us assume that student s1 gets 10 points on the Prac test and 90 points on the T test; the final result of s1 using our equations will be √(10 × 90) = 30. We observed that the proposed evaluation method considers both the Prac and T scores, whereas there is no balance between the Prac and T scores when the final result is calculated using the conventional method.

For statistical feature extraction and ARM, Tables 2 and 3 are joined (Ologs ⋈ Tscore) to produce the operational data, as shown in Table 4. In addition to the existing attributes, a new attribute (Accuracy) has been added to the operational data.

Definition 2: The number of accepted solutions out of total submissions is called the solution Accuracy of users.

$$Accuracy\,(Accu) = \frac{\sum_{i=1}^{pn} TAS_i}{\sum_{i=1}^{pn} TS_i} \tag{4}$$

where pn = number of problems, TAS = total accepted solutions, and TS = total submissions.

Example 2: Let u5 be a user who has submitted a total of 39 solutions to the AOJ system, of which a total of 28 have been accepted. Then, the solution accuracy of user u5 is 28/39 ≈ 71.8%, according to (4).

Another important term is trial and error (T&E). We use the T&E method to estimate a programmer's ability to solve problems. In this study, the following definition is adopted for the T&E method.

Definition 3: A number of repeated attempts are taken until a problem is successfully solved; this process is called trial and error (T&E).

$$T\&E = \frac{\sum_{j=1}^{pn} TS_j}{\sum_{j=1}^{pn} TAS_j} \tag{5}$$

where pn = number of problems, TS = total submissions, and TAS = total accepted solutions.

Example 3: Suppose that u10 is a user who has received a total of 25 accepted (AC) verdicts from the AOJ for 5 problems, but has taken a total of 129 attempts (T&E) to achieve it. Then, the average T&E of user u10 for each solved problem is 129/25 = 5.16, according to (5).

IV. APPROACH
Figure 1 shows the framework of our proposed educational data-driven approach. We applied the framework to a real-world dataset to extract the hidden features and association rules of students and to explore the importance of practical skills. Experimental data are collected from the AOJ system and the ALDS1 programming course, respectively. The proposed approach consists of four main steps: (i) data collection and preprocessing, (ii) data clustering, (iii) statistical hidden feature extraction from clusters, and (iv) association rule mining from clusters. A modified k-means clustering algorithm is applied for data clustering, where the elbow method is used to select the optimal k value for the k-means. Furthermore, the FP-growth ARM algorithm is leveraged to extract the association rules from each cluster. The methods and algorithms used for the proposed approach are discussed below.
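Before turning to the individual methods, the scoring scheme of equations (1)-(3) can be illustrated with a minimal Python sketch. The function names are hypothetical and chosen only for this illustration, and the attendance value AT = 13 in the example is an assumption (the text does not state the attendance of student s1).

```python
import math


def theory_score(aa: float, pt: float) -> float:
    """Equation (1): T = sqrt(AA * PT)."""
    return math.sqrt(aa * pt)


def practical_score(pa: float, cot: float, cval: float) -> float:
    """Equation (2): Prac = sqrt(PA * CoT) * CVal."""
    return math.sqrt(pa * cot) * cval


def final_score(at: int, t: float, prac: float) -> float:
    """Equation (3): FS = min(100, floor((AT + 1) / 10) * sqrt(T * Prac))."""
    attendance_factor = (at + 1) // 10  # integer floor division
    return min(100.0, attendance_factor * math.sqrt(t * prac))


def conventional_score(t: float, prac: float) -> float:
    """Conventional case: plain average of the T and Prac scores."""
    return (t + prac) / 2


# Worked example from the text: Prac = 10, T = 90.
# AT = 13 is an assumed attendance value, so that floor((AT + 1) / 10) = 1.
print(conventional_score(90, 10))  # 50.0
print(final_score(13, 90, 10))     # 30.0
```

Under equation (3) as written, and with 0 ≤ AT ≤ 13, the attendance factor takes only the values 0 or 1, so in this sketch a student with AT below 9 receives a final score of 0 regardless of T and Prac.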


TABLE 4. Sample operational data distributions by joining submission logs (Table 2) and evaluation scores (Table 3).
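The join that produces the operational data, together with the derived Accuracy attribute of equation (4) and the T&E ratio of equation (5), could look roughly like the following sketch. The sample rows, the dictionary-based layout, and all function names are assumptions made for illustration; the paper does not specify how the join was implemented.

```python
from collections import defaultdict

# Hypothetical sample rows; only uid, pid, and verdict are kept from a submission log.
logs = [
    {"uid": "u1", "pid": "p1", "verdict": "WA"},
    {"uid": "u1", "pid": "p1", "verdict": "AC"},
    {"uid": "u1", "pid": "p2", "verdict": "AC"},
    {"uid": "u2", "pid": "p1", "verdict": "CE"},
    {"uid": "u3", "pid": "p1", "verdict": "AC"},  # no score row, dropped by the join
]
# Hypothetical evaluation scores keyed by uid.
scores = {
    "u1": {"PA": 110, "CoT": 80, "PT": 95},
    "u2": {"PA": 60, "CoT": 20, "PT": 70},
}


def submission_counts(logs):
    """Count total submissions (TS) and accepted solutions (TAS) per user."""
    counts = defaultdict(lambda: {"TS": 0, "TAS": 0})
    for row in logs:
        counts[row["uid"]]["TS"] += 1
        if row["verdict"] == "AC":
            counts[row["uid"]]["TAS"] += 1
    return dict(counts)


def accuracy(c):
    """Equation (4): accepted solutions divided by total submissions."""
    return c["TAS"] / c["TS"]


def trial_and_error(c):
    """Equation (5): total submissions per accepted solution (None if nothing was accepted)."""
    return c["TS"] / c["TAS"] if c["TAS"] else None


def build_operational_data(logs, scores):
    """Inner-join log rows with score rows on uid and append the Accuracy attribute."""
    counts = submission_counts(logs)
    return [
        {**row, **scores[row["uid"]], "Accuracy": accuracy(counts[row["uid"]])}
        for row in logs
        if row["uid"] in scores
    ]


operational = build_operational_data(logs, scores)
```

An inner join is assumed here, so log rows from users without an evaluation record are discarded; whether the original study kept such rows is not stated.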

A. ELBOW METHOD
The elbow method is a proven technique to determine the optimal number of clusters k for the k-means algorithm. It uses the sum of squared errors (SSE) of each cluster to calculate the optimal number of clusters. The SSE is calculated by equation (6).

$$SSE = \sum_{i=1}^{k} \sum_{x \in C_i} dist^2(m_i, x) \tag{6}$$

where k is the number of clusters, x is a data point in cluster Ci, and mi is the center of cluster Ci.

The elbow method reduces unnecessary clustering in the dataset, where a small SSE value indicates a better cluster. Normally, increasing the value of k automatically decreases the SSE value. The value of k at which the SSE has decreased drastically, and beyond which further increases in k bring only small reductions, is taken as the ideal number (k) of clusters for k-means. The elbow method was applied to our dataset to obtain the optimal k value (k = 4) for the k-means clustering algorithm, as shown in Figure 2.

FIGURE 2. Elbow method for optimal k value selection.

B. MODIFIED K-MEANS CLUSTERING ALGORITHM
Usually, the k-means clustering algorithm randomly chooses the initial centroid, so it is possible to select an irrelevant data point as the initial centroid. In addition, conventional k-means algorithms cannot detect and remove outliers from the dataset. Consequently, the results may have a negative impact on the overall clustering process and results. To address these problems, the modified k-means clustering algorithm [30] integrates two important modules: (i) optimal initial centroid selection and (ii) outlier detection and removal. To the best of our knowledge, this is a unique modification of the k-means clustering algorithm, and these two modules make the algorithm more efficient, robust, and scalable. This algorithm takes approximately 17.33% fewer iterations to construct clusters for our dataset than other random initial centroid-selection algorithms. The first module is the initial centroid selection module (ICSM), which is used to (i) select optimal centroids and (ii) build clusters with the most similar data. The pseudocode of ICSM is provided in Algorithm 1.

Algorithm 1 Initial Centroid Selection Module (ICSM)
Define: Distance: D, Origin: O(0, 0), Cluster Number: K
Input: Dataset: X = {x1, x2, x3, ..., xn}
Output: Optimal initial centroids Cn = []
for xj ∈ X do
    D ← distance(xj, O)
end
for di ∈ D do
    Apply sorting on D
    D ← d1, d2, d3, ..., dn
end
if K ≤ |X| then
    Divide sorted data D into K subsets
    s1 ⊆ D, s2 ⊆ D, s3 ⊆ D, s4 ⊆ D, ..., sK ⊆ D
end
while k ≤ K do
    Calculate the mean value of each subset:
    Mk = (Σ_{x ∈ Sk} x) / |Sk|
    for xj ∈ Sk do
        Cn ← mindistance(Mk, xj ∈ Sk)
    end
end

The second module is the outlier detection module (ODM), which is used to (i) detect outliers (irrelevant/insignificant data points), (ii) remove them from the datasets, and (iii) improve the overall cluster quality. The pseudocode of the ODM is presented in Algorithm 2.

C. FP-GROWTH ALGORITHM
In the field of data mining, the Apriori, Eclat, and FP-growth algorithms are the most commonly used [58]. The FP-growth algorithm is much more efficient and faster than Apriori because the Apriori algorithm repeatedly scans the database, whereas the FP-growth algorithm scans it only twice to complete the process. The FP-growth algorithm basically consists of two main steps [37], namely (i) construction of the FP-tree and (ii) FP mining based on the FP-tree. Let L = {l1, l2, l3, l4, ..., ld} be the set of all items in the database. The databases are built based on a set of tuples/transactions


T = {t1, t2, t3, t4, ..., tN}, where each transaction ti is a subset of L (ti ⊆ L). An association rule can be written as R = X → Y, where X and Y are subsets of L (X ⊆ L, Y ⊆ L) and X ∩ Y = ∅. The set of items in X is often called the antecedent (if), and the set of items in Y is called the consequent (then). Mathematically, the support count for item set X is expressed as ς(X) = |{ti | X ⊆ ti, ti ∈ T}|, where |·| denotes the number of elements in a set. The minimum support (minSup) and minimum confidence (minConf) are two important terms that are used to create association rules. The minSup threshold is used to find frequent items in a database, whereas the minConf threshold value is applied to these frequent items to construct the association rules. The support (Sup) and confidence (Con) are represented by equations (7) and (8), respectively.

$$Sup(X \rightarrow Y) = \frac{\varsigma(X \cup Y)}{N} \tag{7}$$

$$Con(X \rightarrow Y) = \frac{\varsigma(X \cup Y)}{\varsigma(X)} \tag{8}$$

where N = total number of transactions.

Algorithm 2 Outlier Detection Module (ODM)
Define: Number of Clusters: K, Cluster: C
Input: Dataset: X = {x1, x2, x3, ..., xn}, Distance: D
Output: Outliers: O, SSE
Run ICSM and calculate the min-max average (MMA) using the sorted distances d ∈ D:
    MMA = (dmin + dmax) / 2
while k ≤ K do
    for xi ∈ X_Ck do
        if distance(xi, center_Ck) > MMA then
            Remove xi from the cluster Ck
            O ← xi
            Recalculate SSE
        end
    end
end

V. EXPERIMENTAL RESULTS
In this section, the experimental results are presented. We first clustered students based on their submission logs and scores, and then extracted the hidden features from each cluster. The association rules are generated from each cluster using the FP-growth algorithm to validate the features. Finally, all the correlated features are accumulated for discussion.

A. CLUSTERING THE DATA
According to the proposed framework (Figure 1), the modified k-means clustering algorithm is applied to the data in Table 3 for the clustering process. Before clustering begins, the elbow method is applied to the same data to generate the optimal number (k = 4) of clusters, as shown in Figure 2. Four clusters are thus formed, named clusters P, Q, R, and S. Note that multidimensional data (Table 3) are clustered.

To visualize the data distribution of each cluster, we applied the principal component analysis (PCA) technique to the multidimensional clustered data to convert it into a two-dimensional (2D) shape. For this reason, the first two components (PCA 1 and PCA 2) of the PCA, which explain the majority of the variance in the data, are used for the 2D visualization. The visualized clusters are shown in Figure 3.

FIGURE 3. 2D visualization of the clusters.

First, the preliminary statistical information related to each cluster, namely (i) the number of students per cluster and (ii) the total number of problems solved by the students in each cluster, is presented in Table 5. We found that approximately 33.33% of the students are in cluster Q, which is the largest, and approximately 16.01% of the students are in cluster P, which is the smallest. On the other hand, the students in cluster Q generated the largest submission log of 22,110, and the students in cluster S produced the smallest submission log of 5,153.

TABLE 5. Preliminary statistical information of each cluster.

B. EXTRACTING HIDDEN FEATURES
In this section, different features of students are extracted from clusters P, Q, R, and S. We calculated the solution verdicts (considering problems A, B, C, and D) in each cluster, as denoted in Table 6. Each submission log contains at least one judge verdict out of many (AC, WA, CE, etc.). Therefore, each verdict determined the ultimate result of a submitted solution. A few observations can be drawn from Table 6: (i) clusters P and S have the highest AC rates, (ii) the students of cluster R achieved the lowest AC rates, and (iii) the students of cluster R received more error verdicts than those in other clusters.

Also, we enumerated problem-wise statistics of the submitted solutions to find out how many submissions belong to each problem (A, B, C, and D) in each cluster, as presented in Table 7. A few observations can be drawn from Table 7: (i) Students of cluster P submitted the fewest solutions to problem A, at 30.99%, compared to clusters Q, R, and S. (ii) Cluster P students submitted the highest number


of solutions for problems C and D, at 32.82% and 10.55%, respectively, compared to clusters Q, R, and S. (iii) Students of clusters R and S submitted the fewest solutions for problems C and D, compared to clusters P and Q, respectively.

TABLE 6. Overview of the judge verdicts of ALDS1 course.

TABLE 7. Overview of the submission statistics for each type of problem.

In contrast, the error verdicts of each cluster are also calculated. The segmentation of error verdicts received by the students in each cluster is shown in Figure 4. For that, the error verdicts are divided into five categories based on the error types in codes: (i) WA, (ii) CE, (iii) RTE, (iv) PrE, and (v) Resource Limitation (TLE, MLE, OLE), i.e., RL. Detailed error statistics for each cluster are presented in Figure 4.

FIGURE 4. Segmentation of error verdicts received by the students.

Now, the solution accuracy and T&E are calculated for each cluster, as enumerated in Table 8. A few observations can be drawn: (i) the students of clusters P and Q have more T&E as well as higher solution accuracy, and (ii) both the solution accuracy and T&E of cluster R are lower than those of other clusters. Note that the solution accuracy and T&E are calculated by equations (4) and (5), respectively.

TABLE 8. Cluster-wise solution accuracy and problem-solving T&E.

Next, the average score and standard deviation (σ) for each cluster are calculated, as shown in Table 9 and the comparative views presented in Figure 5. Standard deviation (σ) is used to measure the variation of values in a cluster. Thus, a low value of σ indicates that the values are likely close to the mean (average). The following observations can be drawn: (i) the CoT, PA, and PT scores of cluster P are much higher than those of clusters Q, R, and S; (ii) the PA, CoT, and PT scores of cluster Q are higher than those of clusters R and S; (iii) the CoT score of cluster R is comparatively lower than the other PA and PT scores of this cluster; (iv) the CoT score of cluster S is also much lower than those of clusters P, Q, and R.

TABLE 9. Overview of the average scores and standard deviation (σ) in each cluster.

FIGURE 5. Comparison of scores in different tests.
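The per-cluster averages and standard deviations reported in Table 9 can be computed along the lines of the sketch below. The rows are hypothetical and are not taken from Table 9, and the use of the population standard deviation (`pstdev`) is an assumption, since the text does not say whether the population or the sample form was used.

```python
from statistics import mean, pstdev


def cluster_summary(rows, keys=("PA", "CoT", "PT")):
    """Return {cluster: {key: (mean, population std)}} for the given score keys."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["cluster"], []).append(row)
    return {
        cluster: {
            k: (mean([m[k] for m in members]), pstdev([m[k] for m in members]))
            for k in keys
        }
        for cluster, members in grouped.items()
    }


# Hypothetical rows, not taken from Table 9.
rows = [
    {"cluster": "P", "PA": 100, "CoT": 60, "PT": 90},
    {"cluster": "P", "PA": 90, "CoT": 70, "PT": 80},
    {"cluster": "S", "PA": 50, "CoT": 10, "PT": 40},
]
summary = cluster_summary(rows)
```

A small σ for a score within a cluster then means that the members of that cluster scored close to the cluster average, which is the property the observations above rely on.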


We found more interesting features from the clusters. For example, students solved numerous additional problems beyond their regular exercise assignments through the AOJ platform, solely for their own interests and amusement. The cluster-wise extra problem solution statistics are listed in Table 10. The following observations can be drawn from Table 10: (i) the students of cluster P solved a huge number of problems beyond their regular exercise assignments, which clearly indicates their enthusiasm for programming, and (ii) the students in other clusters (Q, R, and S) did not solve a significant number of extra problems.

TABLE 10. Statistics of extra problem solutions.

The tendency to submit each assignment in the ALDS1 course is analyzed for more information. There are a few rules for submitting each PA task through the AOJ platform: (i) problems A and B must be solved by a certain predetermined deadline, where students usually have eight days to submit each assignment, and (ii) problems C and D can be submitted by the end of the semester. One of our goals is to observe students' submission trends for each topic, that is, how they submitted solutions to problems A and B within the allotted time, because problems A and B are mandatory for scoring. The average submission trend among all clusters over a period of eight days is shown in Figure 6.

FIGURE 6. Tendency to submit assignments within the allotted period (8 days).

The following observations can be drawn from Figure 6: (i) the students of cluster P tried very hard to solve and submit their assignments (problems A and B) on the very first day of the submission period, (ii) the students of clusters R and S made less effort in submitting assignments during the first few days of the submission period compared to clusters P and Q, and (iii) more students from clusters R and S submitted their assignments on the last day (8th) of the submission deadline than students from clusters P and Q.

Sometimes students submitted their solutions after the deadlines. The topic-wise accepted (AC) solution rate and average accepted (AC) rate for all clusters are calculated and listed in Table 11. Moreover, a visual comparison between all topics for all clusters is presented in Figure 7. A few observations can be made: (i) the students of cluster P received the highest acceptance rate across all their assignment submissions and (ii) the students of cluster R obtained the lowest acceptance rate compared to those in clusters P, Q, and S.

TABLE 11. Topic-wise average accepted (AC) solution rate.

FIGURE 7. Comparison of topic-wise accepted (AC) rate.

We also analyzed the data across all clusters to find the assignment submission trends on the last (8th) day of the allotted time. A comparative analysis of the tendency to submit assignments on the last day of each topic is presented in Figure 8. The average submission rate on the last day is also calculated across all topics, as shown in Table 12. It can be observed that among all clusters, (i) students of


clusters R and S submitted most solutions on the last day and (ii) students of cluster P submitted the fewest solutions on the last day.

FIGURE 8. Tendency to submit assignments on the last day.

TABLE 12. Average submission rate on the last day.

To obtain more interesting hidden features that are not plainly visible in the dataset, we analyzed the data of each cluster and found that many students repeatedly solved problems (already accepted) for optimization in terms of memory usage, CPU time, code refactoring, etc. The repetition tendency (only for accepted problems) of students in each cluster is calculated. The ALDS1 course has thirteen topics, each with four problems (A, B, C, and D), for a total of 52 unique problems (total problems TP = 52). We determined how many students in each cluster repeatedly solved 25%, 50%, and 75% of the TP, as enumerated in Table 13. The students' participation (maximum and minimum) from each cluster is as follows: (i) 92.86% of the students of cluster P repeatedly solved 25% of the TP, whereas 28.21% of the students of cluster S participated, which is the lowest; (ii) for 50% TP repetition, 44.05% and 24.16% of students from clusters P and Q participated, respectively; and (iii) for 75% TP repetition, 9.52% of students participated from cluster P, which is the largest, and no students (0%) participated from clusters R and S.

TABLE 13. Repetition tendency of accepted problems.

Here, the CVal scores are calculated for each cluster, with average scores of 1, 0.96, 0.96, and 0.44 for clusters P, Q, R, and S, respectively. The students in cluster S received an average CVal score of 0.44, indicating poor coding skills. According to Definition 1, they likely copied someone else's code to solve the assignments. In contrast, the students of clusters P, Q, and R obtained higher CVal scores.

C. ASSOCIATION RULE MINING
The FP-growth algorithm is applied to the clustered data to find the association rules, which help identify the actual relationship between programming skills and academic performance. In addition, the association rules are used to verify the extracted statistical features of each cluster.

TABLE 14. Dictionary for the set attributes.

Before the FP-growth algorithm is applied, we prepare each cluster's data in a uniform data format. Therefore, the prominent attributes such as Prob, Accu, Verd, PA, CoT, and PT are selected for ARM. Let W = {{Prob}, {Accu}, {Verd}, {PA}, {CoT}, {PT}} be a set of attributes, where Prob = {catId | catId ∈ N, 1 ≤ catId ≤ 59}, Accu = {catId | catId ∈ N, 60 ≤ catId ≤ 69}, Verd = {catId | catId ∈ N, 70 ≤ catId ≤ 79}, PA = {catId | catId ∈ N, 80 ≤ catId ≤ 89}, CoT = {catId | catId ∈ N, 90 ≤ catId ≤ 99}, and PT = {catId | catId ∈ N, 100 ≤ catId ≤ 109}. Thus, the sets Prob, Accu, Verd, PA, CoT, and PT are subsets of W, i.e., Prob ⊆ W, Accu ⊆ W, Verd ⊆ W, PA ⊆ W, CoT ⊆ W, PT ⊆ W. The values of the elements in each set have been converted into uniform categorical IDs (catId) according to the definition in Table 14. After the cluster data is converted into uniform catId, the sample data formats of the tuples are as


follows: W1 = {29, 60, 70, 81, 90, 100}, W2 = {17, 61, 71, 80, 92, 102}, and W3 = {32, 58, 70, 81, 91, 101}.

Interesting and relevant association rules are obtained from each cluster by setting the optimal minimum support (minSup) and confidence (minConf) threshold values. For cluster P, we set minSup = 1500 and minConf = 90%. Consequently, the frequent rules shown in Table 15 are obtained.

TABLE 15. Association rules for the students of cluster P.

In Table 16, the association rules extracted from cluster Q using the values minSup = 2000 and minConf = 90% are listed.

TABLE 16. Association rules for the students of cluster Q.

Similarly, valuable rules are also extracted from cluster R when minSup = 3000 and minConf = 90%. The generated rules are listed in Table 17.

TABLE 17. Association rules for the students of cluster R.

Finally, rules are generated for cluster S when we set minSup = 1500 and minConf = 90%, as shown in Table 18. The following observations can be obtained based on the association rules from different clusters: (i) in cluster P, association rules are involved with higher accuracy, PA, PT, and CoT, as well as the most frequent accepted (AC) verdicts; (ii) students of cluster Q showed higher accuracy, PA, and PT but lower scores in CoT; (iii) students of cluster R tended to have lower scores in CoT, PT, and PA, as well as infrequent AC verdicts; and (iv) cluster S students showed lower scores in CoT, PT, and PA, as well as lower accuracy.

TABLE 18. Association rules for the students of cluster S.

D. ACCUMULATION OF CORRELATED FEATURES
Many significant features are generated from each cluster by employing the proposed framework. These features are deeply correlated to each other and meaningful. These correlated features and rules are accumulated for each cluster.

Students of cluster P (i) took an average problem-solving T&E of 12.40 (Table 8), (ii) solved an average of 56.14 extra problems beyond their academic assignments (Table 10), (iii) repeated more accepted (AC) solutions for optimization than those of other clusters (Table 13), (iv) submitted their assignments on the very first day more than those in other clusters (Figure 6), and (v) had the lowest average last-day submission rate, approximately 2.90%, compared to clusters Q, R, and S (Table 12). These features are interdependent and deeply correlated to each other. The features mentioned above have interesting meanings; overall, these features indicate that students in this cluster are committed to programming, which has a positive impact on their programming and academic performance.

Cluster P also had an overall AC rate (considering problems A, B, C, and D) of 40.88% (Table 6), a topic-wise AC rate (considering problems A and B) of 46.24% (Table 11), an average solution accuracy of 55.71% (Table 8), and a higher CVal score of 1. These higher success rates in programming enabled higher scores in PA, CoT, and PT, which are also validated by the association rules (Table 15).

In cluster Q, the overall AC rate considering all problems is 36.34% (Table 6) and the topic-wise average AC rate (considering problems A and B) is 40.06% (Table 11), which are lower than those of clusters P and S. The students of cluster Q consistently maintained a high AC rate throughout the thirteen topics (Figure 7) but solved minimal extra problems beyond their academic assignments (Table 10). They had a


lower tendency to submit solutions on the last day than students of clusters R and S and instead submitted their assignments early (Table 12). The students of cluster Q obtained higher scores in PA and PT than in CoT (Table 9), as shown by the association rules (Table 16). Most of the features involved high values, indicating that they put a great effort into programming. However, the lower attempt to solve additional problems likely affected the CoT scores in this cluster.

Students of cluster R (i) took an average of 9.52 attempts/trials to solve problems, (ii) did not solve many additional problems outside of regular academic assignments (Table 10), (iii) submitted their assignments on the deadline or the day before (Figure 6), and (iv) rarely repeated the AC problems more than once (Table 13). These features are related to their various programming activities and indicate that they did not put much effort into programming. These students also received the highest error (WA, CE, RTE, TLE, etc.) verdicts of 67.30% and the lowest AC rate of 32.70% compared to clusters P, Q, and S. Their topic-wise AC rate is not coherent across all thirteen topics. Students in cluster R obtained good scores in PA (60.92) and PT (68.26), but lower scores in CoT (28.46) (Table 9). The association rules showed that students were involved with lower scores and infrequent AC verdicts. Note that the coding test (CoT) is used to verify the students' core programming skills. Thus, less effort in programming negatively affects this CoT score.

Students of cluster S (i) undertook an average of 9.86 attempts/trials to solve problems (Table 8), (ii) solved no additional problems (Table 10), (iii) had the highest rate of last-day submission for each assignment, with an average of 22.22% of solutions submitted on the last day (Figure 6 and Table 12) compared to clusters P, Q, and R, and (iv) obtained a very low CVal score of 0.44. Furthermore, an insignificant number of students attempted to repeat the AC problems more than once (Table 13). Collectively, these features indicate that students in cluster S did not perform well in programming. Most features are negatively prioritized. Consequently, students in this cluster obtained the lowest scores in CoT (16.55), which is alarming for actual coding performance. Most of the association rules are connected with lower CoT scores (Table 18). In addition, we found an interesting correlation: most of the students submitted their solutions on the last day, but achieved higher AC rates and accuracy. This trend differs from that of clusters P and Q.

VI. DISCUSSION
In this study, many hidden features are obtained by employing the proposed framework, where modified k-means is applied for data clustering and then FP-growth is applied to the clustered data to discover the association rules. Interesting features and behaviors are observed that are not readily apparent in the base dataset. After applying the elbow and k-means algorithms to the dataset, four clusters are found. Different features and rules are extracted from each cluster considering the different conditions presented in the experimental results section. Next, we discuss the features and the resulting explanations, recommendations, assessments, practical applications, and limitations.

TABLE 19. Main features.

For a better understanding, ten main features are listed in Table 19 with three indicator values: higher (H), medium (M), and lower (L). Feature c, which indicates when the assignments are submitted within the allotted time, uses indicator values of early (E), mid-time (M), and delay (D).

A. ANALYSIS AND RECOMMENDATIONS
In the summary graph of the main features shown in Figure 9, we observe that the students of cluster P performed extraordinarily well in different programming activities and academic tests. Importantly, most students in this cluster are highly enthusiastic about programming, with more than 62% of total solutions (Figure 6) submitted on the very first day of all assignments. They also solved an average of 56.14 extra problems in addition to their academic assignments. The tendency to submit solutions on the last day is approximately 2.90%, which is the lowest compared to clusters Q, R, and S (Table 12). For solution optimization, a large number of students repeated their AC solutions (Table 13). In addition, these students achieved a higher AC rate of 46.24% for problems A and B (Table 11), a higher accuracy of 55.71% (Table 8), and higher test scores of 81.22%, 65.46%, and 82.33% for PA, CoT, and PT, respectively, than those of clusters Q, R, and S (Table 9), as reflected by the association rules (Table 15). In contrast, the error verdicts of this cluster are also analyzed, with approximately 45% of errors due to WA, 19% due to resource limitations (TLE, MLE, OLE), and 15% due to RTE (Figure 4).

Note that, to develop students' programming skills and ensure the efficiency of the source codes, several constraints are set for problems, such as input and output limits/numbers and space and time complexity. In this case, a solution code must satisfy the set of constraints to be accepted; otherwise, it receives error verdicts such as TLE and MLE. Figure 4 shows that students in clusters Q, R, and S received 10.87%, 9.32%, and 6.25% errors due to TLE and MLE, respectively. In contrast, the students of cluster P received about 19.28% errors due to TLE and MLE, which is the highest compared to clusters Q, R, and S. However, students in cluster P took about 12.40 attempts (T&E) to solve a problem, which is

139988 VOLUME 9, 2021

M. M. Rahman et al.: Impact of Practical Skills on Academic Performance: A Data-Driven Analysis

FIGURE 9. A summary graph of the main features.

higher than students in clusters Q, R, and S (Table 8). In general, problems C and D are comparatively more difficult and contain tougher constraints than problems A and B. Students in cluster P submitted the highest percentage of solutions, about 43.36%, for problems C and D compared to clusters Q, R, and S (Table 7). Moreover, each student in cluster P solved an average of 56.14 additional problems, which is significantly higher than students in clusters Q, R, and S (Table 10).

Students in cluster P attempted many additional and challenging problems, resulting in a high percentage of TLE and MLE errors. Usually, complex algorithm-based problems contain various tough constraints, and sometimes it is very difficult to deal with these kinds of constraints alone without prerequisite knowledge. Our analysis shows that students in cluster P have a high tendency to take on difficult problems independently (Table 7), and have achieved significant success in solving problems with tough constraints (Table 6). Besides, students in this cluster still have the opportunity to further improve their programming skills in dealing with tough constraint-based problems. Based on the overall empirical and analytical results, we can summarize that the students of cluster P are highly skilled and enthusiastic about programming and perform well on academic tests.

Similarly, students in cluster Q achieved higher values in most features, as shown in Figure 9. More than 40% of their assignments were submitted on the very first day (Figure 6), with higher accuracy, AC rate, repetition tendency, and scores in PA and PT. The error verdicts of this cluster have been analyzed: approximately 45% of the errors occur due to WA, 19% due to CE, 11% due to PrE, 11% due to resource limitation (TLE, MLE, OLE), and 14% due to RTE (Figure 4). These students can be understood by analyzing the reasons for each type of error. In addition to these positive features, we found some flaws. The students in cluster Q achieved medium (M) scores in CoT and did not solve a significant number of problems outside of their regular assignments. Usually, CoT is used to verify actual programming ability; a medium score in CoT means students need to pay more attention to programming. Accordingly, to improve programming skills, students can practice more outside of their academic workload.

Considering all the results and analysis, we determined the following recommendations for clusters P and Q: (i) special attention to these students can further improve their skills and knowledge; (ii) more difficult problems can be assigned to these students because they find general assignments very easy; and (iii) they can be involved in real-world problem-solving tasks.

For the students of cluster R, the rate of last-day submission was approximately 20.06% (Figure 6), which indicates a tendency to delay submission, and they show inconsistent acceptance (AC) for all topics (Figure 7). Moreover, this cluster had the lowest acceptance (AC) and accuracy rates among all clusters; very few students repeated their accepted solutions for optimization and solved extra problems. Students of this cluster obtained the highest error rate (67.30%) among all the clusters. The extracted association rules show that most of these students achieve lower CoT scores, accuracy, AC rate, and WA verdicts. The students in this cluster scored much higher in PA (60.92) than CoT (28.46). During PA, students can consult with others to solve problems. This may allow some students to solve problems with the help of other students without understanding the problems properly. In contrast, students are not allowed to consult/talk with others during CoT, in which students of this cluster rarely obtain good scores. The average CoT score is 28.46 out of 120. Considering all the features, it is concluded that (i) students may solve assignments with the help of

others without understanding the problems and this cluster (ii) lacks actual programming skills, and (iii) puts less effort into programming.

Figure 9 shows that students of cluster S achieved lower values in most features. Their last-day submission rate is 22.22%, which is the highest among all clusters. They achieved lower scores in coding and paper-based examinations (CoT and PT), but obtained relatively high scores in PA. Similarly, they had fewer T&E attempts but achieved higher accuracy and AC rate. Most of the association rules are involved with lower scores (CoT and PT) and accuracy, as well. The following observations can be obtained from the extracted features: (i) while a large number of solutions were submitted on the last day, there may have been some students who waited for other solutions to become available; this is justified by the CVal score. (ii) There is an unusual trend where students obtained lower scores in CoT while achieving higher AC rate, accuracy, and PA scores; this suggests (iii) a lack of actual programming skills and (iv) less effort in programming, and that (v) the students may solve their assignments through collaboration with others.

After analyzing the features and association rules from different perspectives, some deficiencies have been identified in the programming and academic fields for the students of clusters R and S. Accordingly, we provide some recommendations that may help improve students’ programming skills and academic performance: (i) special assistance can be provided in the development of algorithms and mathematical logic; (ii) students can be encouraged to solve problems with their own knowledge and understanding; (iii) students can participate in different programming activities, such as competitions, programming lectures, and workshops; and (iv) teachers should give these students additional attention and support in theory and exercise classes and observe their responses.

B. OVERALL ASSESSMENTS AND PRACTICAL APPLICATIONS
Considering all the empirical results and analysis, we see that the students of cluster P obtained the highest acceptance rate of 40.88% for all problems (A, B, C, and D) (Table 6), an average solution accuracy of 55.71% (Table 8), an average of 56.14 additional problems solved (Table 10), a stronger propensity to submit assignments early (Figure 6), a topic-wise accepted solution rate of 46.24 for problems (A and B) (Table 11), the lowest submission rate on the last day of 2.90% (Table 12), the highest number of repetitions (Table 13), and the highest PA, CoT, and PT scores of 81.22%, 65.46%, and 82.33%, respectively (Table 9). All the features indicate that the students of cluster P invested great efforts in programming-related tasks. In addition, the summary graph (Figure 9) of the features shows that the students of cluster P are involved in better indicators in all the features. Similarly, students in cluster Q received an acceptance rate of 36.34% for all problems (A, B, C, and D) (Table 6), solution accuracy of 48.45% (Table 8), a high tendency to submit assignments early (Figure 6), a topic-wise acceptance rate of 40.06% for problems (A and B) (Table 11), a last-day submission rate of 10.28% (Table 12), and PA, CoT, and PT scores of 67.86%, 40.18%, and 70.43%, respectively (Table 9). As shown in Figure 9, most of the features are associated with good indicators. It can be seen that the students of cluster Q also performed well in programming.

On the other hand, students in cluster S did not solve any additional problems (Table 10), had a lower repetition tendency (Table 13), a higher last-day submission rate of 22.22% compared to clusters P, Q, and R (Table 12 and Figure 6), and received the lowest CVal score of 0.44 compared to clusters P, Q, and R. Besides, students scored 54.33%, 13.79%, and 44.83% on PA, CoT, and PT, respectively, which is very poor compared to clusters P, Q, and R (Table 9). The overall results show that the students in this cluster did not perform well in programming. In addition, the summary graph (Figure 9) of the features and association rules (Table 18) show that they were involved with lower indicators in most features.

From the above results, it can be seen that the students of clusters P and Q made a good effort in programming and obtained good results in various tests, while the students of cluster S made less effort and therefore achieved poor results in various tests. So, we can conclude that if students (especially in ICT-related disciplines) perform well in practical applications (e.g., programming, logical implementation) then they are also likely to perform well in different academic activities, including tests. In addition, the current research provided some recommendations for students based on the identified features and flaws. Teachers, instructors, and faculty advisors can use these analytical results and recommendations to improve students’ programming and academic performance levels. In addition, our proposed framework, experiments, and overall analytical results can be applied to other related courses/disciplines.

The ultimate goal of this research is to support and improve student learning by identifying their weaknesses and strengths. For this purpose, a real-world dataset from a programming course was used. The proposed framework included EDM techniques and LA to find invisible knowledge from the e-learning data. The knowledge was then analyzed and visualized from various perspectives. The results of these analyses highlight the weaknesses and strengths of the students and can help improve their learning. The proposed research can be suitable for practical applications for the following reasons: (i) the proposed research can provide a useful direction, that is, how to deal with e-learning data; (ii) e-learning data processing has always been a challenging task, and the proposed research shows a way of handling real-world e-learning data, having already processed OJ (e-learning) data for EDM and LA; (iii) the process of data analysis and its results can be helpful for other related courses to improve students’ learning; and (iv) the proposed framework can be integrated with existing e-learning platforms for EDM and LA purposes.
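
The CoT results in this paper are quoted both as raw averages and as percentages. As a hedged consistency check, the sketch below converts the raw CoT averages reported for cluster S (16.55) and cluster R (28.46) to percentages. It assumes that the 120-point scale stated for cluster R applies to every cluster; the names COT_FULL_MARKS and to_percent are illustrative and are not part of the original study.

```python
# Assumption: the CoT full mark of 120 (stated for cluster R) applies to all clusters.
COT_FULL_MARKS = 120


def to_percent(raw_score, full_marks=COT_FULL_MARKS):
    """Convert a raw average score to a percentage, rounded to two decimals."""
    return round(raw_score / full_marks * 100, 2)


# Raw CoT averages quoted in the text.
cluster_s_percent = to_percent(16.55)  # 13.79, the CoT percentage quoted for cluster S
cluster_r_percent = to_percent(28.46)  # no percentage for cluster R is quoted in this section

print(cluster_s_percent, cluster_r_percent)
```

Under that assumption, the raw average of 16.55 for cluster S corresponds to the 13.79% quoted for the same cluster, which suggests the two figures describe the same result on different scales.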

C. LIMITATIONS
The proposed framework is leveraged for data clustering, and then the hidden features and association rules are extracted from each cluster. The results are generated based on a dataset comprising submission logs and scores collected from the AOJ system; they may vary for other datasets due to noise or irrelevant data. The number of association rules may vary depending on the threshold values of minSup and minConf. The value of k for the modified k-means clustering algorithm may differ based on the dataset. Therefore, the proposed framework can produce better or worse results for other datasets.
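
As a small illustration of the threshold sensitivity noted above, the following sketch counts association rules on a toy dataset under two settings of minSup and minConf. It is not the FP-growth implementation used in this study: the transactions, the item labels (such as CoT=L), and the function names support and mine_rules are hypothetical, and rules are enumerated by brute force, which is practical only for very small inputs.

```python
from itertools import combinations

# Hypothetical toy transactions: each set holds discretized features of one student.
transactions = [
    {"CoT=L", "ACC=L", "AC=L"},
    {"CoT=L", "ACC=L"},
    {"CoT=L", "AC=L"},
    {"CoT=H", "ACC=H", "AC=H"},
    {"CoT=L", "ACC=L", "AC=L"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def mine_rules(transactions, min_sup, min_conf):
    """Brute-force rule enumeration (not FP-growth); suitable only for tiny inputs."""
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, len(items) + 1):
        for itemset in map(frozenset, combinations(items, size)):
            sup = support(itemset, transactions)
            if sup == 0 or sup < min_sup:
                continue  # infrequent itemsets cannot produce a rule
            for k in range(1, size):
                for antecedent in map(frozenset, combinations(sorted(itemset), k)):
                    # confidence = support(antecedent and consequent) / support(antecedent)
                    conf = sup / support(antecedent, transactions)
                    if conf >= min_conf:
                        rules.append((set(antecedent), set(itemset - antecedent), sup, conf))
    return rules


loose = mine_rules(transactions, min_sup=0.2, min_conf=0.6)
strict = mine_rules(transactions, min_sup=0.4, min_conf=0.9)
print(len(loose), len(strict))
```

Because raising either threshold can only discard rules, the stricter setting never yields more rules than the looser one; the exact counts depend entirely on the data.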

VII. CONCLUSION AND FUTURE WORK
In this research, a novel framework for exploring the effects of practical skills on academic performance was proposed. Subsequently, a programming course was selected as a sample course for experiments and analyses. By employing the framework, many meaningful and significant features were extracted from the dataset. The extracted features are deeply correlated to the students’ behavior. The analytical results showed that better practical (e.g., programming) skills have a positive effect on academic performance. Moreover, the interaction and interdependence between practical skills and academic performance are presented based on the experimental results. Thus, we have concluded that if a student of an ICT or engineering discipline performs well in practical assignments (e.g., programming, logical implementation, PL/SQL), then they are likely to perform well in other academic activities. The overall approach of this research is applicable to other fields such as education, educational data mining, data analytics, and behavior analysis. In future work, we will consider an automated recommender system that can guide students to improve their practical skills. Moreover, other types of datasets in addition to programming logs and scores will be included.

AVAILABILITY OF DATA AND MATERIALS
In the present research, all the experimental data were collected from the AOJ system and from the class performance scores of a course (ALDS1). Source code submission logs were accessed through two web applications: http://developers.u-aizu.ac.jp/index and https://onlinejudge.u-aizu.ac.jp.

CONFLICT OF INTEREST
The authors declare that they have no conflicts of interest.

ETHICS APPROVAL
This research was approved by the Research Ethics Examination Boards, The University of Aizu, Japan.

MD. MOSTAFIZER RAHMAN received the B.Sc. degree in engineering from the Department of Computer Science and Engineering, Hajee Mohammad Danesh Science and Technology University, Dinajpur, Bangladesh, in 2009, and the M.Sc. degree in engineering from the Department of Computer Science and Engineering, Dhaka University of Engineering & Technology, Gazipur, Bangladesh, in 2014. He is currently pursuing the Ph.D. degree with the Database Systems Laboratory, Department of Computer and Information Systems, The University of Aizu, Aizuwakamatsu, Fukushima, Japan. He is also working (on study leave) at the Dhaka University of Engineering & Technology. His research interests include machine learning, deep learning, machine learning application in programming, programming education, data mining, and big data analytics.

YUTAKA WATANOBE (Member, IEEE) received the master’s and Ph.D. degrees from The University of Aizu, Japan, in 2004 and 2007, respectively. He was a Research Fellow of the Japan Society for the Promotion of Science (JSPS), The University of Aizu, in 2007. He was a Coach of several ACM-ICPC World Final teams. He is currently a Senior Associate Professor with the School of Computer Science and Engineering, The University of Aizu. He is a Key Member of the Aizu Online Judge (AOJ) System. His research interests include visual programming language, programming education, data mining, e-learning systems, filmification of methods, and cloud robotics.

RAGE UDAY KIRAN received the Ph.D. degree in computer science from the International Institute of Information Technology, Hyderabad, Telangana, India. He was a Project Assistant Professor with the Kitsuregawa Laboratory, Institute of Industrial Science, The University of Tokyo, Tokyo, Japan. He was a Researcher with the Social Big Data Research Collaboration Center, National Institute of Information and Communications Technology, Tokyo. He is currently an Associate Professor with the School of Computer Science and Engineering, The University of Aizu, Japan. His current research interests include data mining, parallel computation, air pollution data analytics, traffic congestion data analytics, recommender systems, and ICTs for agriculture.

TRUONG CONG THANG (Senior Member, IEEE) received the B.E. degree from the Hanoi University of Science and Technology, Vietnam, in 1997, and the Ph.D. degree from KAIST, South Korea, in 2006. From 1997 to 2000, he was a Network Engineer with the Vietnam Post and Telecommunications (VNPT). Since 2002, he has been an Active Member of Korean and Japanese delegations to standard meetings of ISO/IEC and ITU-T. From 2007 to 2011, he was a member of Research Staff with the Electronics and Telecommunications Research Institute (ETRI), South Korea. Since 2011, he has also been an Associate Professor with The University of Aizu, Japan. His research interests include multimedia networking, image/video processing, content adaptation, IPTV, and MPEG/ITU standards.

INCHEON PAIK (Senior Member, IEEE) received the master’s and Ph.D. degrees in electronics engineering from Korea University, in 1987 and 1992, respectively. He is currently a Full Professor with The University of Aizu, Japan. His research interests include semantic web, web services and their composition, web data mining, big data analytics, deep learning, awareness computing, and agents on semantic web. He served in several conferences as the chair. He also serves as an Editor for the journals of JIPS and IEICE.

VOLUME 9, 2021 139993

View publication stats

