SlideShare a Scribd company logo
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 09 | Sep 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 684
UNIVERSITY ADMISSION SYSTEMS USING DATA MINING TECHNIQUES TO
PREDICT STUDENT PERFORMANCE TO SUPPORT DECISION MAKING
PRANEETH REDDY PENUGONDA1, MEENALOCHANI GANDHAM 2, VENKATA SATYA KOPPULA 3,
RADHA KRISHNA BALUSU 4.
1, 2, 3, 4. SCHOOL OF COMPUTER SCIENCE AND ENGINEERING VELLORE INSTITUTE OF TECHNOLOGY VELLORE (TN.), INDIA.
----------------------------------------------------------------------------------***--------------------------------------------------------------------------
ABSTRACT – The admissions process is a struggle for universities nowadays, particularly those that focus on STEM fields like
computer science and engineering. In order to identify students who would be successful in their programmers, universities
should use objective criteria for admissions. The suggested technique was tested using a dataset consisting of 2,039 students
from 2016 through 2019 who were enrolled in the Information and Computer Science College of the a Saudi state institution.
The findings show that early academic success at university may be predicted prior to admission using certain criteria. The
findings also suggest that a student's score on the Scholastic Proficiency Admission Test is the best predictive factor for
admission. This score should thus be given greater weight in selection procedures.
Index terms – Data Mining, SVM, ANN, Decision tree.
I. INTRODUCTION
In order to identify students who would be successful in their
programmes, universities should use objective criteria for
admissions. Additionally, each school should use the most
advanced methods for estimating an individual student's
potential in the classroom before enrolling them. This would
help policymakers at universities establish effective
admissions standards. Nonetheless, most universities have
difficulty analysing their massive educational datasets to
foretell students' success [1]. This is because they rely only
on traditional statistical approaches, as opposed to more
modern and effective prediction methods like Educational
Data Mining (EDM), the most widely used method for
assessing and predicting students' performance [2–6]. In
order to anticipate students' performance, EDM first involves
gathering meaningful information and trends from a massive
educational database [2]. Better data allows for more
planned approaches to boosting students' academic standing.
This research aims to aid colleges in their admissions
processes by providing more accurate predictions of
applicants' future academic achievement using data mining
methods.
In various areas, this research adds to the existing body of
knowledge. At first, we use four data mining classification
models to foretell candidates' early academic success based
on their profiles before they enrol. Quiz and final exam
scores, extracurricular involvement, student demographics,
cumulative grade point average, and social network contacts
are among the profile data most often utilised for predicting
students' success in higher education (e.g. [7]–[10]).
However, factors that may be used to predict student
achievement, like as pre-admission test results, are seldom
taken into account in the admissions process (e.g. [11]–[13]).
This research focuses on these underappreciated indicators.
We also compare four categorization methods for making
predictions about students' performance and determine
which one is most effective in terms of accuracy, precision,
recall, and F1-Measure.
Second, we use a correlation coefficient analysis to find out
how the selection criteria for incoming freshmen affect their
GPA after the first semester. We also determine the best
predictive admissions criteria for student achievement so
that decision-makers may give that factor greater weight.
Third, the institution where this research was done adopted
a new admissions policy that gave different weight to the
qualities that were shown to be most important. By
comparing the cumulative grade point averages of freshmen
accepted under the old and new systems, this research
demonstrated the wisdom of the latter. The number of
freshmen with outstanding or very good cumulative GPAs
rose by 31%, while the number with acceptable or bad GPAs
fell by 18%.
The huge sample size of 2,039 students from the Faculty of
Information and Computer Sciences (CCIS) of Prince Nourah
bint Abdulrahman University (PNU) in Riyadh, Kingdom of
Saudi Arabia, sets this research apart from others in the area
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 09 | Sep 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 685
of forecasting student performance (KSA). It has more female
students than any other institution in the world. Most
previous research in this area validates the efficacy of their
models using significantly smaller samples.
II. BACKGROUND WORK
Attributes and prediction techniques are the two most
important aspects in predicting student achievement. The
cumulative grade point average (CGPA) of college students is
the single most important factor in determining their
academic success. Many studies have benefited from its use
(e.g. [7]–[10]). Assessments, quiz scores, lab work, & final
exam grades are additional indicators employed by academic
performance studies (e.g. [8], [9],). Only a small number of
studies have accounted for demographics, student activities,
and social networks as independent variables.
However, in the admissions process, input factors such as
pre-admission exams are seldom employed (e.g. [11]-[13]) to
predict student success in university. This is what we'll be
looking at in detail.
Many different data mining categorization methods have
been used to attempt to foretell how well students would do
in their courses. In one research, for instance, ANN is used to
predict how well 505 students would do in their eighth-
semester classes. Using Decision Trees, a model was
presented to predict student success in specific courses with
little data (32 and 42 students). An analysis of 1,600
students' grades for a single class using Naive Bayes. The
research uses SVM on a dataset of 1,074 individuals to
forecast the academic success of at-risk freshmen.
Predicting first-year cumulative grade point averages in the
computer engineering departments of Saudi public
institutions was investigated here using a variety of
admissions factors as input qualities. Few research
conducted in KSA and published on the topic have focused
outside of medical schools (e.g. [11]–[13], [15]). While EDM
may help uncover hidden patterns in huge datasets, it has
not been applied in these investigations. Study, which used
one of the EDM approaches (i.e., J48 decision tree) to usually
considered' final GPA based on grades in all classes, is one of
the few on the issue that has been limited to a computer
science institution. Authors gathered information from the
transcripts of 236 Computer Science College students at King
Saud University (KSU) from Saudi Arabia. They determined
which classes had the most influence on the cumulative
grade point average. However, they only used a single EDM
methodology on a tiny dataset to make predictions about
student performance & did not double-check their work with
any other EDM methods.
III. PROPOSED SYSTEM
This study used the Linear Regression method, a common
method for determining the connection between
independent variables (i.e., predictors) or a predictor
variables (i.e., future academic performance), to answer the
first question (Can the admission criterion that best predicts
future academic performance be identified?) (i.e., response).
We used this model to analyse the correlation between the
three entrance factors (HSGA, SAAT, and GAT) and the
cumulative grade point average (CGPA) after the first two
semesters of study. We utilised the correlation coefficient, a
standard statistical measure of the strength and direction of
linear correlations between two variables, to characterise the
linear link between each admission criterion and the CGPA.
In addition, we employed the determination coefficient to
characterise the relative impact of each admission criterion
on the students' first-year CGPA.
Using four well-known data mining classification
techniques—Artificial Neural Network (ANN), Decision Tree
(DT), Support Vector Machine (SVM), or Naive Bayes—we
created four prediction models to address the second issue in
this research.
Although there are many other methods for classifying DMs,
the following four are the ones we settled on for this article
due to the salient qualities they share.
A. ARTIFICIAL NEURAL NETWORK (ANN)
The artificial neural network (ANN) is a common tool in
evolutionary computational modelling (EDM) because it
attempts to replicate the way the human brain works to
address difficult issues. It is made up of a collection of
modules that take in a set of weighted inputs and produce a
corresponding output. Predicting student performance using
ANN has been the subject of several published publications
(e.g., [8]). We utilised it as well since it can learn from a small
sample and can discover all conceivable relationships
between variables. The accuracy with which ANN models
predicted which applicants would be approved and which
would be rejected was also superior to that of traditional
classification methods. Due to the limited size of the datasets,
the ANN model in this investigation used a simple topology
called Multilayer Perception (MLP).
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 09 | Sep 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 686
B. DECISION TREE
In a decision tree, the nodes are ranked from most important
to least important. Each node in the graph represents a
property of an instance, or the edges between them stand for
the range of values the property may take. Since this method
is so popular among scientists, we choose to employ it (e.g.,
[1], [6], [7], [9]). It makes uncomplicated and easy-to-
understand value predictions. In addition, it performs well
with both category and numeric features, requires nothing in
the way of complicated data preparation, and expresses rules
that can be simply understood and comprehended by users
[6].
C. SUPPORT VECTOR MACHINE (SVM)
In order to divide items into their respective categories, this
method constructs a hyperplane. The generalisation error of
a SVM method decreases with increasing distance from of the
hyper - plane to the closest object. SVM is utilised in this
work because it is well-suited for tiny datasets, and it has
been used in only a limited number of previous studies (e.g.,
[7]). Additionally, it is quicker than the other methods.
D. NAIVE BAYES
Naive Bayes is a straightforward probabilistic method that
uses independent assumptions across variables to apply
Bayes' theorem. It calculates the chances of each item
belonging to each category. Due to its computational
efficiency, widespread usage in relevant literature, and
general ease of implementation, this method was selected for
this investigation.
3) We used accuracy, recall, precision, or F1-Measure metrics
to analyse and compare the performances of the four models
in order to answer the third research question in this work
(Which datamining prediction approach performs best in
this study?). (For a detailed breakdown of evaluation
metrics, please refer to Section A, Experimental Design
below.)
4) We developed this same second stage of the research to
make a comparison the first-year CGPAs of incoming
freshmen admitted in 2018-2019 using the new admission
weights of criteria to the inaugural CGPAs of former students
admitted in 2016-2 using the old admission weight of each
criterion, in order to answer the 4th question in this
research.
IV. RESULTS
Data from PNU's CCIS student database was used for this
analysis. However, the approaches used are universal and
may be implemented in any university setting. These figures
are from the Admitted & Registration Deanship's
computerised academic database. The Institution Review
Panel at PNU provided the necessary ethical clearance
(Number 19-0152). In the first phase of our research, we
gathered data on 1,569 students across all three
departments and two cohorts: 902 students from the 2016–
2017 school year and 667 students from the 2017–2018
school year.
Second, we gathered 470 student records across the three
departments from the 2018-2019 school year, when
admissions were made using the revised weightings. Similar
situations like those described in the first section of the
research occurred in other student bodies. We utilised this
information to evaluate how newly accepted students' first-
year GPAs stacked up against those of incoming students
under the previous weighting system.
Based on the initial numeric parameter CGPA, we built a
category target variable (class). The PNU grading system
uses a five-point scale, broken down as follows: exceptional
(4.5), extremely good (3.75 to 4.5), great (2.75 to 3.75),
ordinary (2.0 to 2.75), and bad (2.0). An illustration of the
grading scale used by students in the 2016–17 and 2017–18
academic years.
Just like we did with the data sets spanning 2016–2018
academic years, we performed which was before to the set of
data spanning 2018–2019. There were 437 student files left
over from the 2018-2019 school year.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 09 | Sep 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 687
We used Artificial Neural Networks (ANNs), Decision Trees,
Support Vector Machines (SVMs), and Naive Bayes to build
four different prediction models. Each model was developed
using a ten-fold cross validation, with 9 data sets used during
training and 1 set employed while testing. This procedure
was carried out a total of ten times, once for every one of the
distinct sets.
The following ideas may be used to evaluate the efficacy of
data mining models:
Number of occasions for which a positive prediction was
made (True Positive Rate, or TP).
Number of times a negative result was wrongly labelled as
positive; sometimes called the "False Positive Rate" (FP).
Number of occurrences when a negative prediction was
made accurately; sometimes called the "True Negative Rate"
(TN).
One measure of accuracy is the False Negative Rate (FN),
which is the percentage of positive events that were wrongly
labelled negative.
Predictive accuracy is quantified as the proportion of times
an outcome is accurately anticipated.
Preciseness equals (Total Probability) divided by (Total
Probability) plus (First Probability) plus (Final Probability)
(1)
Measured by (2), recall is the proportion of true positives
that were accurately anticipated.
Calculating Recall: TP / (TP + FN) (2)
Accuracy is quantified as the proportion of true positives,
and it may be calculated as (3):
TP / (TP + FP) = Accuracy (3)
The F1-Measure highlights a classifier's performance on both
common and unusual categories, and it represents the trade-
off between recall and accuracy. There are four units of
measurement for it:
Formula for F1 Measurement = 2 * Recall * Precision *
(Recall + Precision) (4)
To improve first-year student performance, the PNU
Admitted & Registration Deanship has chosen to implement
changes to the present admissions procedure based on the
study's findings and suggestions. For the 2018-2019 school
year, the dean's office decided to increase the weight of the
SAAT criterion and adjust the weights of the other two
criteria (HSGA and GAT) to 30% and 40%, respectively. Prior
to this change, the percentages were 60%, 20%, and 20%.
Figure 2 displays a comparison of first-year CGPAs from new
students to those from prior years.
V. CONCLUSION
This research was conducted with the aim of assisting
universities in making informed admissions choices based on
accurate predictions of prospective students' academic
success after they are admitted. Artificial Neural Networks
(ANNs), Decision Trees, Support Vector Machines (SVMs),
and Naive Bayes were used to propose and create four
different models for making predictions. The research used a
database including 2,039 student records from PNU, the
biggest university in KSA. However, the approaches used are
universal and may be implemented in any university setting.
The results of the research lend credence to the employment
of prediction models in higher education, where they may be
put to good use in the allocation of scarce resources.
Moreover, the findings demonstrate that, with sufficient pre-
admission data, a high-performance model to predict
students' early performance may be created. In this research,
for instance, the ANN model achieved an accuracy rate of
roughly 79.22% in terms of its performance. The research
also found that the ANN method achieved the highest levels
of accuracy and precision, while the Decision Tree method
achieved the highest levels of recall and F1-Measure. Out of
all the methods, Naive Bayes performed the poorest.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 09 | Sep 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 688
Students' personalities, demographics, families, and
communication abilities are just a few of the pre-admission
characteristics that have been shown to influence future
academic success; further research is required. It's possible
to employ other data mining methods, such as clustering, in
subsequent research.
REFRENCES
[1] H. Guruler, A. Istanbullu, and M. Karahasan, ‘‘A new
student performance analysing system using knowledge
discovery in higher educational databases,’’ Comput. Edu.,
vol. 55, no. 1, pp. 247–254, Aug. 2010.
[2] S. K. Mohamad and Z. Tasir, ‘‘Educational data mining: A
review,’’ Procedia Social Behav. Sci., vol. 97, pp. 320–324,
Nov. 2013.
[3] A. Peña-Ayala, ‘‘Educational data mining: A survey and a
data miningbased analysis of recent works,’’ Expert Syst.
Appl., vol. 41, no. 4, pp. 1432–1462, Mar. 2014.
[4] C. Romero and S. Ventura, ‘‘Educational data mining: A
review of the state of the art,’’ IEEE Trans. Syst., Man, Cybern.
C, Appl. Rev., vol. 40, no. 6, pp. 601–618, Nov. 2010.
[5] H. Aldowah, H. Al-Samarraie, and W. M. Fauzy,
‘‘Educational data mining and learning analytics for 21st
century higher education: A review and synthesis,’’
Telematics Informat., vol. 37, pp. 13–49, Apr. 2019.
[6] C. Anuradha and T. Velmurugan, ‘‘A comparative analysis
on the evaluation of classification algorithms in the
prediction of students performance,’’ Indian J. Sci. Technol.,
vol. 8, no. 15, pp. 974–6846, Jan. 2015.
[7] V. L. Miguéis, A. Freitas, P. J. V. Garcia, and A. Silva, ‘‘Early
segmentation of students according to their academic
performance: A predictive modelling approach,’’ Decis.
Support Syst., vol. 115, pp. 36–51, Nov. 2018.
[8] M. Mayilvaganan and D. Kalpanadevi, ‘‘Comparison of
classification techniques for predicting the performance of
students academic environment,’’ in Proc. Int. Conf. Commun.
Netw. Technol., Sivakasi, India, Dec. 2014, pp. 113–118.
[9] S. Natek and M. Zwilling, ‘‘Student data mining solution–
knowledge management system related to higher education
institutions,’’ Expert Syst. Appl., vol. 41, no. 14, pp. 6400–
6407, Oct. 2014.
[10] T. M. Christian and M. Ayub, ‘‘Exploration of
classification using NB tree for predicting students’
performance,’’ in Proc. Int. Conf. Data Softw. Eng. (ICODSE),
Bandung, ID, USA, Nov. 2014, pp. 1–6.
[11] J. Albishri, S. Aly, and Y. Alnemary, ‘‘Admission criteria to
Saudi medical schools. Which is the best predictor for
successful achievement?’’ Saudi Med. J., vol. 33, pp. 1222–
1228, 2012.
[12] M. O. Al-Rukban, F. M. Munshi, H. Abdulghani, and I. Al-
Hoqail, ‘‘The ability of the pre-admission criteria to predict
performance in a Saudi medical school,’’ Saudi Med. J., vol. 31,
pp. 560–564, 2010.
[13] A. M. Alhadlaq, O. F. Alshammari, S. M. Alsager, K. A. F.
Neel, and A. G. Mohamed, ‘‘Ability of admissions criteria to
predict early academic performance among students of
health science colleges at King Saud University, Saudi
Arabia,’’ J. Dental Educ., vol. 79, pp. 665–670, Jan. 2015.
[14] S. M. Hassan and M. S. Al-Razgan, ‘‘Pre-university exams
effect on students GPA: A case study in IT department,’’
Procedia Comput. Sci., vol. 82, pp. 127–131, 2016.
[15] M. F. Al-Qahtani and T. M. Alanzi, ‘‘Comparisons of the
predictive values of admission criteria for academic
achievement among undergraduate students of health and
non-health science professions: A longitudinal cohort study,’’
Psychol. Res. Behav. Manage., vol. 12, pp. 1–6, Dec. 2018.
VI. FUTURE ENHANCEMENT

More Related Content

Similar to UNIVERSITY ADMISSION SYSTEMS USING DATA MINING TECHNIQUES TO PREDICT STUDENT PERFORMANCE TO SUPPORT DECISION MAKING (20)

PDF
IRJET- Analysis of Student Performance using Machine Learning Techniques
IRJET Journal
 
PDF
A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...
Editor IJCATR
 
PDF
A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...
Editor IJCATR
 
PDF
IRJET- Predictive Analytics for Placement of Student- A Comparative Study
IRJET Journal
 
PDF
A WEB BASED APPLICATION FOR TUTORING SUPPORT IN HIGHER EDUCATION USING EDUCAT...
IRJET Journal
 
DOC
Performance Evaluation of Feature Selection Algorithms in Educational Data Mi...
IIRindia
 
PDF
IRJET- Performance for Student Higher Education using Decision Tree to Predic...
IRJET Journal
 
PDF
Survey on Techniques for Predictive Analysis of Student Grades and Career
IRJET Journal
 
PDF
IRJET- Tracking and Predicting Student Performance using Machine Learning
IRJET Journal
 
PDF
Data Mining Techniques in Higher Education an Empirical Study for the Univer...
IJMER
 
PDF
Student Performance Prediction via Data Mining & Machine Learning
IRJET Journal
 
PDF
IRJET- Academic Performance Analysis System
IRJET Journal
 
PDF
Graduate Admission Predictor
IRJET Journal
 
DOCX
mini project on artificial intelligence and machine learning
Gayu Ram
 
PDF
Predicting students performance using classification techniques in data mining
Lovely Professional University
 
PDF
The Architecture of System for Predicting Student Performance based on the Da...
Thada Jantakoon
 
PDF
Empirical Study on Classification Algorithm For Evaluation of Students Academ...
iosrjce
 
PDF
K017626773
IOSR Journals
 
PDF
Identifying the Key Factors of Training Technical School and College Teachers...
ijtsrd
 
IRJET- Analysis of Student Performance using Machine Learning Techniques
IRJET Journal
 
A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...
Editor IJCATR
 
A Model for Predicting Students’ Academic Performance using a Hybrid of K-mea...
Editor IJCATR
 
IRJET- Predictive Analytics for Placement of Student- A Comparative Study
IRJET Journal
 
A WEB BASED APPLICATION FOR TUTORING SUPPORT IN HIGHER EDUCATION USING EDUCAT...
IRJET Journal
 
Performance Evaluation of Feature Selection Algorithms in Educational Data Mi...
IIRindia
 
IRJET- Performance for Student Higher Education using Decision Tree to Predic...
IRJET Journal
 
Survey on Techniques for Predictive Analysis of Student Grades and Career
IRJET Journal
 
IRJET- Tracking and Predicting Student Performance using Machine Learning
IRJET Journal
 
Data Mining Techniques in Higher Education an Empirical Study for the Univer...
IJMER
 
Student Performance Prediction via Data Mining & Machine Learning
IRJET Journal
 
IRJET- Academic Performance Analysis System
IRJET Journal
 
Graduate Admission Predictor
IRJET Journal
 
mini project on artificial intelligence and machine learning
Gayu Ram
 
Predicting students performance using classification techniques in data mining
Lovely Professional University
 
The Architecture of System for Predicting Student Performance based on the Da...
Thada Jantakoon
 
Empirical Study on Classification Algorithm For Evaluation of Students Academ...
iosrjce
 
K017626773
IOSR Journals
 
Identifying the Key Factors of Training Technical School and College Teachers...
ijtsrd
 

More from IRJET Journal (20)

PDF
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
IRJET Journal
 
PDF
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
IRJET Journal
 
PDF
Kiona – A Smart Society Automation Project
IRJET Journal
 
PDF
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
IRJET Journal
 
PDF
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
IRJET Journal
 
PDF
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
IRJET Journal
 
PDF
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
IRJET Journal
 
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
IRJET Journal
 
PDF
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
IRJET Journal
 
PDF
BRAIN TUMOUR DETECTION AND CLASSIFICATION
IRJET Journal
 
PDF
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
IRJET Journal
 
PDF
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
IRJET Journal
 
PDF
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
IRJET Journal
 
PDF
Breast Cancer Detection using Computer Vision
IRJET Journal
 
PDF
Auto-Charging E-Vehicle with its battery Management.
IRJET Journal
 
PDF
Analysis of high energy charge particle in the Heliosphere
IRJET Journal
 
PDF
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
IRJET Journal
 
PDF
Auto-Charging E-Vehicle with its battery Management.
IRJET Journal
 
PDF
Analysis of high energy charge particle in the Heliosphere
IRJET Journal
 
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
IRJET Journal
 
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
IRJET Journal
 
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
IRJET Journal
 
Kiona – A Smart Society Automation Project
IRJET Journal
 
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
IRJET Journal
 
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
IRJET Journal
 
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
IRJET Journal
 
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
IRJET Journal
 
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
IRJET Journal
 
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
IRJET Journal
 
BRAIN TUMOUR DETECTION AND CLASSIFICATION
IRJET Journal
 
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
IRJET Journal
 
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
IRJET Journal
 
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
IRJET Journal
 
Breast Cancer Detection using Computer Vision
IRJET Journal
 
Auto-Charging E-Vehicle with its battery Management.
IRJET Journal
 
Analysis of high energy charge particle in the Heliosphere
IRJET Journal
 
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
IRJET Journal
 
Auto-Charging E-Vehicle with its battery Management.
IRJET Journal
 
Analysis of high energy charge particle in the Heliosphere
IRJET Journal
 
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
IRJET Journal
 
Ad

Recently uploaded (20)

PDF
MODULE-5 notes [BCG402-CG&V] PART-B.pdf
Alvas Institute of Engineering and technology, Moodabidri
 
PPTX
Knowledge Representation : Semantic Networks
Amity University, Patna
 
PDF
3rd International Conference on Machine Learning and IoT (MLIoT 2025)
ClaraZara1
 
PPTX
UNIT 1 - INTRODUCTION TO AI and AI tools and basic concept
gokuld13012005
 
PDF
Electrical Machines and Their Protection.pdf
Nabajyoti Banik
 
PDF
Water Industry Process Automation & Control Monthly July 2025
Water Industry Process Automation & Control
 
PPTX
Alan Turing - life and importance for all of us now
Pedro Concejero
 
PPT
New_school_Engineering_presentation_011707.ppt
VinayKumar304579
 
PPTX
2025 CGI Congres - Surviving agile v05.pptx
Derk-Jan de Grood
 
PDF
NTPC PATRATU Summer internship report.pdf
hemant03701
 
PPTX
Basics of Electrical Engineering and electronics .pptx
PrabhuNarayan6
 
PDF
Submit Your Papers-International Journal on Cybernetics & Informatics ( IJCI)
IJCI JOURNAL
 
PPTX
Numerical-Solutions-of-Ordinary-Differential-Equations.pptx
SAMUKTHAARM
 
PDF
Artificial Neural Network-Types,Perceptron,Problems
Sharmila Chidaravalli
 
PPTX
Explore USA’s Best Structural And Non Structural Steel Detailing
Silicon Engineering Consultants LLC
 
PPTX
Water Resources Engineering (CVE 728)--Slide 4.pptx
mohammedado3
 
PPTX
Fundamentals of Quantitative Design and Analysis.pptx
aliali240367
 
PPTX
fatigue in aircraft structures-221113192308-0ad6dc8c.pptx
aviatecofficial
 
PPTX
How Industrial Project Management Differs From Construction.pptx
jamespit799
 
PDF
WD2(I)-RFQ-GW-1415_ Shifting and Filling of Sand in the Pond at the WD5 Area_...
ShahadathHossain23
 
MODULE-5 notes [BCG402-CG&V] PART-B.pdf
Alvas Institute of Engineering and technology, Moodabidri
 
Knowledge Representation : Semantic Networks
Amity University, Patna
 
3rd International Conference on Machine Learning and IoT (MLIoT 2025)
ClaraZara1
 
UNIT 1 - INTRODUCTION TO AI and AI tools and basic concept
gokuld13012005
 
Electrical Machines and Their Protection.pdf
Nabajyoti Banik
 
Water Industry Process Automation & Control Monthly July 2025
Water Industry Process Automation & Control
 
Alan Turing - life and importance for all of us now
Pedro Concejero
 
New_school_Engineering_presentation_011707.ppt
VinayKumar304579
 
2025 CGI Congres - Surviving agile v05.pptx
Derk-Jan de Grood
 
NTPC PATRATU Summer internship report.pdf
hemant03701
 
Basics of Electrical Engineering and electronics .pptx
PrabhuNarayan6
 
Submit Your Papers-International Journal on Cybernetics & Informatics ( IJCI)
IJCI JOURNAL
 
Numerical-Solutions-of-Ordinary-Differential-Equations.pptx
SAMUKTHAARM
 
Artificial Neural Network-Types,Perceptron,Problems
Sharmila Chidaravalli
 
Explore USA’s Best Structural And Non Structural Steel Detailing
Silicon Engineering Consultants LLC
 
Water Resources Engineering (CVE 728)--Slide 4.pptx
mohammedado3
 
Fundamentals of Quantitative Design and Analysis.pptx
aliali240367
 
fatigue in aircraft structures-221113192308-0ad6dc8c.pptx
aviatecofficial
 
How Industrial Project Management Differs From Construction.pptx
jamespit799
 
WD2(I)-RFQ-GW-1415_ Shifting and Filling of Sand in the Pond at the WD5 Area_...
ShahadathHossain23
 
Ad

UNIVERSITY ADMISSION SYSTEMS USING DATA MINING TECHNIQUES TO PREDICT STUDENT PERFORMANCE TO SUPPORT DECISION MAKING

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 09 | Sep 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 684 UNIVERSITY ADMISSION SYSTEMS USING DATA MINING TECHNIQUES TO PREDICT STUDENT PERFORMANCE TO SUPPORT DECISION MAKING PRANEETH REDDY PENUGONDA1, MEENALOCHANI GANDHAM 2, VENKATA SATYA KOPPULA 3, RADHA KRISHNA BALUSU 4. 1, 2, 3, 4. SCHOOL OF COMPUTER SCIENCE AND ENGINEERING VELLORE INSTITUTE OF TECHNOLOGY VELLORE (TN.), INDIA. ----------------------------------------------------------------------------------***-------------------------------------------------------------------------- ABSTRACT – The admissions process is a struggle for universities nowadays, particularly those that focus on STEM fields like computer science and engineering. In order to identify students who would be successful in their programmers, universities should use objective criteria for admissions. The suggested technique was tested using a dataset consisting of 2,039 students from 2016 through 2019 who were enrolled in the Information and Computer Science College of the a Saudi state institution. The findings show that early academic success at university may be predicted prior to admission using certain criteria. The findings also suggest that a student's score on the Scholastic Proficiency Admission Test is the best predictive factor for admission. This score should thus be given greater weight in selection procedures. Index terms – Data Mining, SVM, ANN, Decision tree. I. INTRODUCTION In order to identify students who would be successful in their programmes, universities should use objective criteria for admissions. Additionally, each school should use the most advanced methods for estimating an individual student's potential in the classroom before enrolling them. This would help policymakers at universities establish effective admissions standards. Nonetheless, most universities have difficulty analysing their massive educational datasets to foretell students' success [1]. This is because they rely only on traditional statistical approaches, as opposed to more modern and effective prediction methods like Educational Data Mining (EDM), the most widely used method for assessing and predicting students' performance [2–6]. In order to anticipate students' performance, EDM first involves gathering meaningful information and trends from a massive educational database [2]. Better data allows for more planned approaches to boosting students' academic standing. This research aims to aid colleges in their admissions processes by providing more accurate predictions of applicants' future academic achievement using data mining methods. In various areas, this research adds to the existing body of knowledge. At first, we use four data mining classification models to foretell candidates' early academic success based on their profiles before they enrol. Quiz and final exam scores, extracurricular involvement, student demographics, cumulative grade point average, and social network contacts are among the profile data most often utilised for predicting students' success in higher education (e.g. [7]–[10]). However, factors that may be used to predict student achievement, like as pre-admission test results, are seldom taken into account in the admissions process (e.g. [11]–[13]). This research focuses on these underappreciated indicators. We also compare four categorization methods for making predictions about students' performance and determine which one is most effective in terms of accuracy, precision, recall, and F1-Measure. Second, we use a correlation coefficient analysis to find out how the selection criteria for incoming freshmen affect their GPA after the first semester. We also determine the best predictive admissions criteria for student achievement so that decision-makers may give that factor greater weight. Third, the institution where this research was done adopted a new admissions policy that gave different weight to the qualities that were shown to be most important. By comparing the cumulative grade point averages of freshmen accepted under the old and new systems, this research demonstrated the wisdom of the latter. The number of freshmen with outstanding or very good cumulative GPAs rose by 31%, while the number with acceptable or bad GPAs fell by 18%. The huge sample size of 2,039 students from the Faculty of Information and Computer Sciences (CCIS) of Prince Nourah bint Abdulrahman University (PNU) in Riyadh, Kingdom of Saudi Arabia, sets this research apart from others in the area
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 09 | Sep 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 685 of forecasting student performance (KSA). It has more female students than any other institution in the world. Most previous research in this area validates the efficacy of their models using significantly smaller samples. II. BACKGROUND WORK Attributes and prediction techniques are the two most important aspects in predicting student achievement. The cumulative grade point average (CGPA) of college students is the single most important factor in determining their academic success. Many studies have benefited from its use (e.g. [7]–[10]). Assessments, quiz scores, lab work, & final exam grades are additional indicators employed by academic performance studies (e.g. [8], [9],). Only a small number of studies have accounted for demographics, student activities, and social networks as independent variables. However, in the admissions process, input factors such as pre-admission exams are seldom employed (e.g. [11]-[13]) to predict student success in university. This is what we'll be looking at in detail. Many different data mining categorization methods have been used to attempt to foretell how well students would do in their courses. In one research, for instance, ANN is used to predict how well 505 students would do in their eighth- semester classes. Using Decision Trees, a model was presented to predict student success in specific courses with little data (32 and 42 students). An analysis of 1,600 students' grades for a single class using Naive Bayes. The research uses SVM on a dataset of 1,074 individuals to forecast the academic success of at-risk freshmen. Predicting first-year cumulative grade point averages in the computer engineering departments of Saudi public institutions was investigated here using a variety of admissions factors as input qualities. Few research conducted in KSA and published on the topic have focused outside of medical schools (e.g. [11]–[13], [15]). While EDM may help uncover hidden patterns in huge datasets, it has not been applied in these investigations. Study, which used one of the EDM approaches (i.e., J48 decision tree) to usually considered' final GPA based on grades in all classes, is one of the few on the issue that has been limited to a computer science institution. Authors gathered information from the transcripts of 236 Computer Science College students at King Saud University (KSU) from Saudi Arabia. They determined which classes had the most influence on the cumulative grade point average. However, they only used a single EDM methodology on a tiny dataset to make predictions about student performance & did not double-check their work with any other EDM methods. III. PROPOSED SYSTEM This study used the Linear Regression method, a common method for determining the connection between independent variables (i.e., predictors) or a predictor variables (i.e., future academic performance), to answer the first question (Can the admission criterion that best predicts future academic performance be identified?) (i.e., response). We used this model to analyse the correlation between the three entrance factors (HSGA, SAAT, and GAT) and the cumulative grade point average (CGPA) after the first two semesters of study. We utilised the correlation coefficient, a standard statistical measure of the strength and direction of linear correlations between two variables, to characterise the linear link between each admission criterion and the CGPA. In addition, we employed the determination coefficient to characterise the relative impact of each admission criterion on the students' first-year CGPA. Using four well-known data mining classification techniques—Artificial Neural Network (ANN), Decision Tree (DT), Support Vector Machine (SVM), or Naive Bayes—we created four prediction models to address the second issue in this research. Although there are many other methods for classifying DMs, the following four are the ones we settled on for this article due to the salient qualities they share. A. ARTIFICIAL NEURAL NETWORK (ANN) The artificial neural network (ANN) is a common tool in evolutionary computational modelling (EDM) because it attempts to replicate the way the human brain works to address difficult issues. It is made up of a collection of modules that take in a set of weighted inputs and produce a corresponding output. Predicting student performance using ANN has been the subject of several published publications (e.g., [8]). We utilised it as well since it can learn from a small sample and can discover all conceivable relationships between variables. The accuracy with which ANN models predicted which applicants would be approved and which would be rejected was also superior to that of traditional classification methods. Due to the limited size of the datasets, the ANN model in this investigation used a simple topology called Multilayer Perception (MLP).
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 09 | Sep 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 686 B. DECISION TREE In a decision tree, the nodes are ranked from most important to least important. Each node in the graph represents a property of an instance, or the edges between them stand for the range of values the property may take. Since this method is so popular among scientists, we choose to employ it (e.g., [1], [6], [7], [9]). It makes uncomplicated and easy-to- understand value predictions. In addition, it performs well with both category and numeric features, requires nothing in the way of complicated data preparation, and expresses rules that can be simply understood and comprehended by users [6]. C. SUPPORT VECTOR MACHINE (SVM) In order to divide items into their respective categories, this method constructs a hyperplane. The generalisation error of a SVM method decreases with increasing distance from of the hyper - plane to the closest object. SVM is utilised in this work because it is well-suited for tiny datasets, and it has been used in only a limited number of previous studies (e.g., [7]). Additionally, it is quicker than the other methods. D. NAIVE BAYES Naive Bayes is a straightforward probabilistic method that uses independent assumptions across variables to apply Bayes' theorem. It calculates the chances of each item belonging to each category. Due to its computational efficiency, widespread usage in relevant literature, and general ease of implementation, this method was selected for this investigation. 3) We used accuracy, recall, precision, or F1-Measure metrics to analyse and compare the performances of the four models in order to answer the third research question in this work (Which datamining prediction approach performs best in this study?). (For a detailed breakdown of evaluation metrics, please refer to Section A, Experimental Design below.) 4) We developed this same second stage of the research to make a comparison the first-year CGPAs of incoming freshmen admitted in 2018-2019 using the new admission weights of criteria to the inaugural CGPAs of former students admitted in 2016-2 using the old admission weight of each criterion, in order to answer the 4th question in this research. IV. RESULTS Data from PNU's CCIS student database was used for this analysis. However, the approaches used are universal and may be implemented in any university setting. These figures are from the Admitted & Registration Deanship's computerised academic database. The Institution Review Panel at PNU provided the necessary ethical clearance (Number 19-0152). In the first phase of our research, we gathered data on 1,569 students across all three departments and two cohorts: 902 students from the 2016– 2017 school year and 667 students from the 2017–2018 school year. Second, we gathered 470 student records across the three departments from the 2018-2019 school year, when admissions were made using the revised weightings. Similar situations like those described in the first section of the research occurred in other student bodies. We utilised this information to evaluate how newly accepted students' first- year GPAs stacked up against those of incoming students under the previous weighting system. Based on the initial numeric parameter CGPA, we built a category target variable (class). The PNU grading system uses a five-point scale, broken down as follows: exceptional (4.5), extremely good (3.75 to 4.5), great (2.75 to 3.75), ordinary (2.0 to 2.75), and bad (2.0). An illustration of the grading scale used by students in the 2016–17 and 2017–18 academic years. Just like we did with the data sets spanning 2016–2018 academic years, we performed which was before to the set of data spanning 2018–2019. There were 437 student files left over from the 2018-2019 school year.
  • 4. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 09 | Sep 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 687 We used Artificial Neural Networks (ANNs), Decision Trees, Support Vector Machines (SVMs), and Naive Bayes to build four different prediction models. Each model was developed using a ten-fold cross validation, with 9 data sets used during training and 1 set employed while testing. This procedure was carried out a total of ten times, once for every one of the distinct sets. The following ideas may be used to evaluate the efficacy of data mining models: Number of occasions for which a positive prediction was made (True Positive Rate, or TP). Number of times a negative result was wrongly labelled as positive; sometimes called the "False Positive Rate" (FP). Number of occurrences when a negative prediction was made accurately; sometimes called the "True Negative Rate" (TN). One measure of accuracy is the False Negative Rate (FN), which is the percentage of positive events that were wrongly labelled negative. Predictive accuracy is quantified as the proportion of times an outcome is accurately anticipated. Preciseness equals (Total Probability) divided by (Total Probability) plus (First Probability) plus (Final Probability) (1) Measured by (2), recall is the proportion of true positives that were accurately anticipated. Calculating Recall: TP / (TP + FN) (2) Accuracy is quantified as the proportion of true positives, and it may be calculated as (3): TP / (TP + FP) = Accuracy (3) The F1-Measure highlights a classifier's performance on both common and unusual categories, and it represents the trade- off between recall and accuracy. There are four units of measurement for it: Formula for F1 Measurement = 2 * Recall * Precision * (Recall + Precision) (4) To improve first-year student performance, the PNU Admitted & Registration Deanship has chosen to implement changes to the present admissions procedure based on the study's findings and suggestions. For the 2018-2019 school year, the dean's office decided to increase the weight of the SAAT criterion and adjust the weights of the other two criteria (HSGA and GAT) to 30% and 40%, respectively. Prior to this change, the percentages were 60%, 20%, and 20%. Figure 2 displays a comparison of first-year CGPAs from new students to those from prior years. V. CONCLUSION This research was conducted with the aim of assisting universities in making informed admissions choices based on accurate predictions of prospective students' academic success after they are admitted. Artificial Neural Networks (ANNs), Decision Trees, Support Vector Machines (SVMs), and Naive Bayes were used to propose and create four different models for making predictions. The research used a database including 2,039 student records from PNU, the biggest university in KSA. However, the approaches used are universal and may be implemented in any university setting. The results of the research lend credence to the employment of prediction models in higher education, where they may be put to good use in the allocation of scarce resources. Moreover, the findings demonstrate that, with sufficient pre- admission data, a high-performance model to predict students' early performance may be created. In this research, for instance, the ANN model achieved an accuracy rate of roughly 79.22% in terms of its performance. The research also found that the ANN method achieved the highest levels of accuracy and precision, while the Decision Tree method achieved the highest levels of recall and F1-Measure. Out of all the methods, Naive Bayes performed the poorest.
  • 5. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 09 | Sep 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 688 Students' personalities, demographics, families, and communication abilities are just a few of the pre-admission characteristics that have been shown to influence future academic success; further research is required. It's possible to employ other data mining methods, such as clustering, in subsequent research. REFRENCES [1] H. Guruler, A. Istanbullu, and M. Karahasan, ‘‘A new student performance analysing system using knowledge discovery in higher educational databases,’’ Comput. Edu., vol. 55, no. 1, pp. 247–254, Aug. 2010. [2] S. K. Mohamad and Z. Tasir, ‘‘Educational data mining: A review,’’ Procedia Social Behav. Sci., vol. 97, pp. 320–324, Nov. 2013. [3] A. Peña-Ayala, ‘‘Educational data mining: A survey and a data miningbased analysis of recent works,’’ Expert Syst. Appl., vol. 41, no. 4, pp. 1432–1462, Mar. 2014. [4] C. Romero and S. Ventura, ‘‘Educational data mining: A review of the state of the art,’’ IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 40, no. 6, pp. 601–618, Nov. 2010. [5] H. Aldowah, H. Al-Samarraie, and W. M. Fauzy, ‘‘Educational data mining and learning analytics for 21st century higher education: A review and synthesis,’’ Telematics Informat., vol. 37, pp. 13–49, Apr. 2019. [6] C. Anuradha and T. Velmurugan, ‘‘A comparative analysis on the evaluation of classification algorithms in the prediction of students performance,’’ Indian J. Sci. Technol., vol. 8, no. 15, pp. 974–6846, Jan. 2015. [7] V. L. Miguéis, A. Freitas, P. J. V. Garcia, and A. Silva, ‘‘Early segmentation of students according to their academic performance: A predictive modelling approach,’’ Decis. Support Syst., vol. 115, pp. 36–51, Nov. 2018. [8] M. Mayilvaganan and D. Kalpanadevi, ‘‘Comparison of classification techniques for predicting the performance of students academic environment,’’ in Proc. Int. Conf. Commun. Netw. Technol., Sivakasi, India, Dec. 2014, pp. 113–118. [9] S. Natek and M. Zwilling, ‘‘Student data mining solution– knowledge management system related to higher education institutions,’’ Expert Syst. Appl., vol. 41, no. 14, pp. 6400– 6407, Oct. 2014. [10] T. M. Christian and M. Ayub, ‘‘Exploration of classification using NB tree for predicting students’ performance,’’ in Proc. Int. Conf. Data Softw. Eng. (ICODSE), Bandung, ID, USA, Nov. 2014, pp. 1–6. [11] J. Albishri, S. Aly, and Y. Alnemary, ‘‘Admission criteria to Saudi medical schools. Which is the best predictor for successful achievement?’’ Saudi Med. J., vol. 33, pp. 1222– 1228, 2012. [12] M. O. Al-Rukban, F. M. Munshi, H. Abdulghani, and I. Al- Hoqail, ‘‘The ability of the pre-admission criteria to predict performance in a Saudi medical school,’’ Saudi Med. J., vol. 31, pp. 560–564, 2010. [13] A. M. Alhadlaq, O. F. Alshammari, S. M. Alsager, K. A. F. Neel, and A. G. Mohamed, ‘‘Ability of admissions criteria to predict early academic performance among students of health science colleges at King Saud University, Saudi Arabia,’’ J. Dental Educ., vol. 79, pp. 665–670, Jan. 2015. [14] S. M. Hassan and M. S. Al-Razgan, ‘‘Pre-university exams effect on students GPA: A case study in IT department,’’ Procedia Comput. Sci., vol. 82, pp. 127–131, 2016. [15] M. F. Al-Qahtani and T. M. Alanzi, ‘‘Comparisons of the predictive values of admission criteria for academic achievement among undergraduate students of health and non-health science professions: A longitudinal cohort study,’’ Psychol. Res. Behav. Manage., vol. 12, pp. 1–6, Dec. 2018. VI. FUTURE ENHANCEMENT