Procedia Computer Science 163 (2019) 472–481
Abstract
This paper presents a novel application of data mining techniques to guide academic program design and assessment. More
specifically, it proposes using association rule mining techniques to discover a set of rules that govern the relationship
between two core components of an academic program: program educational objectives (PEOs) and student outcomes (SOs). As a
case study, this paper demonstrates how association rule mining techniques are applied to mine mapping rules between the
PEOs and a predefined set of SOs adopted by the American Board for Engineering and Technology-Engineering Accreditation
Commission (ABET-EAC) for engineering programs. To this end, the self-study reports of 152 ABET-EAC accredited engineering
programs have been collected, and the mapping data between their PEOs and the ABET-EAC SOs have been extracted. This dataset
has been pre-processed and transformed into a representation suitable for applying association rule mining techniques. This
involves identifying a set of PEO labels, annotating data instances with PEO labels, and projecting each multi-label data
instance into a set of single-label instances. The Apriori algorithm is then applied to discover the rules that govern the
mapping between PEOs and ABET-EAC SOs. The discovered rules are of particular importance for guiding the design and
assessment of engineering academic programs. In addition, the discovered rules unveil a number of interesting correlations
between PEOs and ABET-EAC SOs that need further investigation by pedagogists.
© 2019 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/)
Peer-review under responsibility of the scientific committee of the 16th International Learning & Technology Conference 2019.
Keywords: Association rule mining; educational data mining; academic program management; program educational objectives; student
outcomes; ABET accreditation.
1. Introduction
In academia, an academic program is a unique course of study that culminates in the awarding of a specific degree
(or certificate) in combination with a specific major. Like the student lifecycle, each academic program has a
lifecycle, which typically includes conceptualization, introduction, growth, maturity, and decline [1]. During the program
lifecycle, the availability of information to its decision makers is crucial for its success. This paper shows how data
mining techniques can be exploited to elicit valuable information on the relationship between two core components
of an academic program, namely program educational objectives (PEOs) and student outcomes (SOs). The elicited information
is of particular importance for program management at different stages of its lifecycle, for example, to optimize its
design, evaluate its quality, and develop a robust assessment plan. However, prior to this, it is useful to shed light on
the structure of the Outcome Based Education (OBE) [2] academic program and the relationship between its components.
OBE is a new education paradigm that focuses on graduate attributes that allow students to accept challenges, adapt
to technological changes, and translate their knowledge to new contexts for the benefit of society. That is to say,
OBE-based academic programs are designed to develop various abilities as per the requirements of graduate
attributes. An essential step in designing an OBE-based academic program is the identification of long-term PEOs and
SOs [3]. While the SOs represent the knowledge, skills, and capabilities that students should possess by the time of
graduation, PEOs represent the achievements graduates should attain a few years (3 to 5 years or more) after graduation
[4]. Once the PEOs and SOs of the OBE-based academic program have been specified and the mapping between
them established, the curriculum is designed in such a way that students ultimately gain the knowledge and develop the
skills stated in the SOs. Accordingly, the curriculum is viewed as a number of courses that aim to attain certain Course
Outcomes (COs), which map to SOs, which themselves map to PEOs, which in turn map to the mission of the institution, in
the hierarchical structure shown in Fig. 1 [2].
A robust mapping between the PEOs and SOs of an OBE-based academic program plays a crucial role in its design and
assessment. It is a connection point between a program and its context, and it plays a key role in constructing the
program curriculum and assessing the educational objectives. For example, the questions of the alumni questionnaire,
a direct program assessment method used to measure the attainment of the PEOs, are excerpted from the SOs and take into
consideration their mapping to PEOs. The attainment of each PEO is obtained from the average rating given by the alumni,
and the program is assessed accordingly [3].
Despite the key role of PEOs, SOs, and their mapping in OBE-based academic programs, there is a consensus that
confusion about them lingers among practitioners. The drastic consequences of such confusion are a poor
design of curriculum and teaching strategies, a misleading assessment of PEOs, and ultimately an inaccurate program
corrective plan. In part, this confusion is attributed to the inherent difficulty of understanding the meaning of these
terms and their relationship to each other [4]. In the case of ABET accreditation, this terminological confusion has
created a persistent need for progressive clarifying changes, over the years, in the ABET wording of Criterion 2 (Program
Educational Objectives) and Criterion 3 (Student Outcomes) and in the accreditation policy and procedure manual. This
concern has been raised at several events, such as the fall 2005 ABET Summit [5], where questions regarding the
difference between PEOs and SOs and their mapping were asked again.
474 Anwar Ali Yahya et al. / Procedia Computer Science 163 (2019) 472–481
Based on the foregoing premises, it can be concluded that the discovery of a set of well-founded and accurate
mapping rules between PEOs and SOs would be of significant benefit for the design and assessment of OBE-based
academic programs. This paper proposes an effective way of doing so by inductively interrogating a set of PEOs-SOs
mapping data of academic programs using Association Rule Mining (ARM) techniques [6] [7]. Doing so requires a suitable
source of data. For this purpose, programs' self-study reports (SSRs), documents formally submitted
to the academic accreditation agency to obtain academic accreditation, are used as the source of the required data.
The remainder of this paper reviews the related literature, presents the collected dataset, describes how ARM
techniques can be applied to this dataset to discover a set of PEOs-SOs mapping rules, and evaluates these rules.
2. Related Works
The huge amount of digital content available in different fields has motivated research in, and the development of,
different techniques that make it easier to search, organize, and analyze this content. Data mining is a discipline that
has emerged to analyze data in an automated manner by finding patterns and relationships in raw data [6]. Data
mining has already achieved significant success in many areas, including medicine, business, robotics, and computer
vision, to name just a few. By the same token, the continuous growth of data in educational institutions has led to the
emergence of Educational Data Mining (EDM), which is concerned with developing, researching, and applying computerized
methods to detect patterns in large collections of educational data that would otherwise be hard or impossible to
analyze [8]. In an educational context, there are many interesting and difficult problems that may arise from four
aspects: administrative problems and problems associated with the school, academic staff, and students. The application
of EDM to solve these problems has been reviewed in several surveys [8] [9] [10]; however, for the purposes of this
research, this review focuses on the efforts made to study various aspects of the academic program at any stage of its
lifecycle. In this regard, a combination of a neural network and experts' prior knowledge is applied in [11] to predict
and evaluate the student learning outcomes of an academic program and ultimately enhance teaching quality. The K-means
clustering algorithm is applied in [12] to investigate the relationship between the skills taught in business programs and
the title of the program, using a dataset extracted from program catalogues. The analysis shows that, with very limited
exceptions, the labels of programs match the skills one would expect to learn. Data mining methods are used in [13]
to identify the similarities between course content at the learning object, module, and program levels. The hierarchy and
structures of the learning environment created by educators are extracted, the similarity between documents
is calculated, and both are used for narrowing and selecting applicable learning objects with similar content.
Regression and classification techniques are used in [14] to investigate the effect of academic program type, among
other factors such as years of study and gender, on the mental health of students. The dataset, which consists of survey
responses of undergraduate students in engineering programs at a large Canadian university, is analyzed, and interestingly
the results show that, due to the competitive nature of the program, students of electrical engineering had lower mental
health, though high self-actualization, whereas, due to strong classmate relationships and a flexible curriculum,
students of systems design had higher mental health scores.
Applications of EDM to academic program assessment can be found in many works. In the first work [15],
which dates back to 2006, data-driven course assessment and program assessment are applied to quantify the level to which
a program's curriculum meets the program outcomes. This involves the use of clustering and prediction techniques for
question evaluation, topic clustering, predicting missing scores, and clustering with partial topic information to
construct a score matrix and a relevancy matrix. Using these matrices with the course matrix, the numeric value for each
program outcome can be determined by simply finding the average score for each topic and multiplying that vector by the
course matrix. More on using data mining in program assessment is found in [16], where different data mining
techniques are applied to an online professional development program that contains workshops for pre-K-12 teachers. In
this work, data mining techniques are applied to study the shared learning characteristics: frequent learning paths
(association rules), engagement prediction (clustering), expectation prediction (decision tree), workshop satisfaction
prediction (decision tree), and instructor quality prediction (decision tree).
3. Methodology
This section describes the detailed methodology of mining the relationship between PEOs and SOs of academic
programs. It is based on the general methodology of the knowledge discovery process, depicted in Fig. 2, which
involves raw data collection, selection, preprocessing, transformation, mining, and evaluation [17].
In the data collection step, raw data is collected and used to create a target dataset, either as a whole or by focusing
on a subset of variables or data samples on which discovery is to be performed. In the pre-processing step, the target
data is cleaned and preprocessed in order to obtain consistent data. The transformation step transforms the data using
dimensionality reduction or other transformation methods. The data mining step applies procedures to search for patterns
of interest in a particular representational form, depending on the data mining objective. Finally, in the
interpretation/evaluation step, the mined patterns are interpreted and evaluated.
One family of data mining techniques applied to discover all hidden associations that satisfy some user-predefined
criteria is association rule algorithms [6] [18]. They work by dividing the problem into two parts: mining
frequent itemsets and discovering rules from the frequent itemsets. A frequent itemset is a set of items whose number
of occurrences in the dataset (its support) is at least a given threshold. The procedure for finding frequent itemsets is
simple but very time consuming because of the large number of possible combinations. Once the frequent itemsets have
been discovered, rule production is a simple process.
A widely used algorithm for association rule mining is the Apriori algorithm [18]. It is based on the following
rule: all sub-itemsets of a frequent itemset must also be frequent. By using this rule, Apriori is able to prune a huge
number of itemset examinations, since any itemset with an infrequent sub-itemset is certain not to be frequent. Frequent
itemsets are extended one item at a time (candidate generation), and groups of candidates are tested against the data.
The algorithm terminates when no further extensions are found. In other words, the Apriori algorithm generates candidate
itemsets of length l from frequent itemsets of length l−1 and then prunes the candidates that have an infrequent
sub-itemset. Thus, it keeps only the frequent itemsets among the candidates.
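The join-and-prune procedure described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work; the function name `apriori_frequent_itemsets`, the item encoding (strings such as "PEO=LL" or "i=1"), the toy transactions, and the support threshold of 2 are all assumptions made for illustration.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support_count):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]

    def count_frequent(candidates):
        # Count the transactions containing each candidate and keep
        # only the candidates that reach the minimum support count.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        return {c: n for c, n in counts.items() if n >= min_support_count}

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = count_frequent({frozenset([i]) for i in items})
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: drop candidates with an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        current = count_frequent(candidates)
        frequent.update(current)
        k += 1
    return frequent


# Toy transactions (hypothetical, for illustration only).
toy = [
    {"PEO=LL", "i=1", "k=1"},
    {"PEO=LL", "i=1"},
    {"PEO=EC", "f=1"},
    {"PEO=LL", "i=1", "k=1"},
]
result = apriori_frequent_itemsets(toy, min_support_count=2)
```

In this toy run the itemset {PEO=LL, i=1} occurs in three transactions and is therefore kept, while every itemset containing PEO=EC is discarded at the first level and never extended.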
The following subsections give a detailed description of how the above methodology of knowledge discovery has
been particularized to discover the PEOs-SOs mapping rules in engineering programs.
As mentioned above, this research is concerned with the discovery of the association rules between the PEOs and
SOs; therefore, the SSRs of academic programs represent a suitable source of raw data. In accreditation, the SSR is the
primary document used by a program to demonstrate its compliance with all applicable criteria and policies adopted
by an accreditation agency such as ABET. According to ABET, the SSR addresses all paths to completion of the degree,
all methods of instructional delivery used for the program, and all remote location offerings. In practice, ABET requests
each program to develop its PEOs and link them to the set of SOs it adopts. Since ABET adopts a different set of SOs for
each discipline, only engineering programs previously accredited by ABET-EAC between 2000 and 2017 have been
considered in this collection [19]. The set of SOs adopted by ABET-EAC is listed below.
(a) an ability to apply knowledge of mathematics, science, and engineering
(b) an ability to design and conduct experiments, as well as to analyze and interpret data
(c) an ability to design a system, component, or process to meet desired needs within realistic constraints such as
economic, environmental, social, political, ethical, health and safety, manufacturability, and sustainability
(d) an ability to function on multidisciplinary teams
(e) an ability to identify, formulate, and solve engineering problems
(f) an understanding of professional and ethical responsibility
(g) an ability to communicate effectively
(h) the broad education necessary to understand the impact of engineering solutions in a global, economic,
environmental, and societal context
(i) a recognition of the need for, and an ability to engage in life-long learning
(j) a knowledge of contemporary issues
(k) an ability to use the techniques, skills, and modern engineering tools necessary for engineering practice
The result of data collection is a set of 152 engineering program SSRs distributed over the years 2002-2017, as shown in Fig. 3.
In this step, the PEOs-SOs mapping data have been drawn from the collected programs' SSRs. These data
have been extracted from subsection B of the third criterion (Student Outcomes) of each SSR and consolidated in
tabular form as shown in Table 1, where ✓ denotes the presence of the SO in the PEOs-SOs mapping and × denotes its
absence.
Table 1. Example of PEOs-SOs mapping data.
PEOs                                                  a  b  c  d  e  f  g  h  i  j  k
Practice the disciplines of transportation,           ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓
  environmental, structural, water resources, and
  geotechnical engineering, and/or related fields.
Engage in advanced education, research, and           ✓  ✓  ✓  ×  ✓  ×  ×  ×  ✓  ×  ✓
  development
Pursue continuing education and professional          ×  ×  ×  ×  ×  ×  ×  ×  ✓  ×  ✓
  licensure
Act in a responsible, professional and ethical        ×  ×  ×  ×  ×  ×  ×  ✓  ×  ✓  ×
  manner
The preprocessing of the dataset involves substantial verification and validation of the content, removal of
spurious or duplicated objectives, bringing the objectives and outcomes into a consistent format, etc. Figure 4 shows the
distribution of SOs over the dataset.
Moreover, Table 2 shows the main statistical aspects of the collected dataset.
Since the PEOs in the collected dataset are in textual form, they are unsuitable for association mining algorithms.
Therefore, a transformation procedure is proposed that transforms each PEO text into a single label or a combination of
labels representing the graduate attributes expressed in its text. First, a set of PEO labels is identified;
this set is then used to annotate data instances with one or more PEO labels, and finally each multi-label data instance
is projected into single-label data instances. The details of the transformation procedure are given in the following
subsections.
In the OBE model, PEOs describe both technical and professional aspects of the expected achievements of the graduate
after graduation. On this basis, they can be grouped into a finite set of common attributes. Typical PEOs cover the
following attributes: technical skills; professional, ethical, and communicational aspects; management and leadership;
life-long learning and continuous education; pursuit of advanced and graduate studies; and other aspects [20]. Based
on the PEO wordings of a number of engineering programs, the set of common PEO labels shown in Table 3
has been identified.
Table 3. PEOs Label Set
No.  PEO label               Abbreviation
1    Life-long Learning      LL
2    Communication           C
3    Leadership              L
4    Teaming                 T
5    Ethical Conduct         EC
6    Professionalism         P
7    Social and Community    SC
8    Career Success          CS
9    Technical Competency    TC
10   Knowledge Competency    KC
11   Graduate Studies        GS
12   Others                  O
Using the identified PEO label set, each data instance is annotated with one or more labels that match its PEO text.
This process has been accomplished by three annotators, who initially annotated the dataset individually using the
PEO label set. The three annotators then met to resolve the conflicting cases of PEO annotation. It should be
mentioned that some PEO texts have been annotated with more than one PEO label, because a PEO text may
describe multiple skills and attributes. Table 4 shows excerpts of the dataset after PEO annotation. It
should be noted that a value of 1 in a data instance indicates the presence of the SO in the mapping between its PEO
and the SOs set, whereas 0 indicates its exclusion from the mapping.
As shown in Table 4, the PEO annotation step results in a set of data instances with multiple PEO labels. As the
aim is to discover the rules that govern the relationship between each PEO and the SOs set, each multi-label data instance
is decomposed into single-label data instances, each of which is annotated with a single PEO label from the original
data instance. The result of this step is an enlarged dataset with 1196 single-label data instances.
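The projection step can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than the procedure actually implemented: the function name `project_to_single_label` and the pair representation (a list of PEO labels plus an 11-element 0/1 vector for SOs a to k) are hypothetical. The first SO vector below reproduces the third row of Table 1, but its labels (LL, P) and the whole second instance are invented for illustration.

```python
def project_to_single_label(instances):
    """Split each (labels, so_vector) instance into one instance per label."""
    projected = []
    for labels, so_vector in instances:
        for label in labels:
            # Every copy keeps the full SO mapping of the original instance.
            projected.append((label, tuple(so_vector)))
    return projected


# SO vector order: a, b, c, d, e, f, g, h, i, j, k.
# Labels are hypothetical annotations, not taken from the dataset.
annotated = [
    (["LL", "P"], (0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1)),
    (["EC"], (0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0)),
]
single = project_to_single_label(annotated)
```

Here the first instance, which carries two labels, yields two single-label instances that share the same SO vector, which is why the projected dataset is larger than the annotated one.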
In this step, the Apriori algorithm is applied within the WEKA framework [21]. It works by iteratively generating, for each
PEO, the frequent k-itemsets whose count is equal to or greater than a pre-specified minimum support count,
which can be calculated as follows
A process of joining and pruning is applied to the current frequent k-itemsets to generate the frequent (k + 1)-itemsets.
The frequent k-itemsets are joined together, and the resulting itemsets that cannot satisfy the minimum support count are
pruned since they are infrequent. The process of joining and pruning is applied repeatedly until no further frequent
itemsets are found. Thereafter, the algorithm terminates and the generation of association rules begins. The confidence
level of each rule must satisfy the user-specified confidence threshold. The confidence level of a rule can be calculated
using Equation 2.
confidence(PEO→SOs)=count(PEO∪SOs)/count(PEO) (2)
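As a worked illustration of Equation 2, the following Python sketch computes the confidence of a rule over single-label instances. The function name `confidence`, the dictionary form of the SO conditions, and the five toy instances are assumptions made for this example; they are not drawn from the collected dataset.

```python
SO_INDEX = {name: i for i, name in enumerate("abcdefghijk")}


def confidence(instances, peo, so_conditions):
    """confidence(PEO -> SOs) = count(PEO and SOs) / count(PEO)."""
    with_peo = [so for label, so in instances if label == peo]
    if not with_peo:
        return 0.0  # the PEO never occurs, so no rule can be supported
    matching = [
        so for so in with_peo
        if all(so[SO_INDEX[name]] == value
               for name, value in so_conditions.items())
    ]
    return len(matching) / len(with_peo)


# Toy single-label instances (hypothetical); SO order is a..k.
data = [
    ("LL", (0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1)),
    ("LL", (0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0)),
    ("LL", (1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0)),
    ("LL", (1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0)),
    ("EC", (0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0)),
]
conf_ll_i = confidence(data, "LL", {"i": 1})            # 3 of 4 LL instances
conf_ll_ik = confidence(data, "LL", {"i": 1, "k": 1})   # 1 of 4 LL instances
```

In the toy data, three of the four LL instances have i=1, so the rule "if (PEO=LL) then i=1" has confidence 0.75, while adding the condition k=1 lowers it to 0.25.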
To apply the Apriori algorithm as described above in WEKA, the minimum support and minimum confidence
parameters must be specified. Since the interest of this research is confined to the relationship between PEOs and SOs,
the application of the Apriori algorithm focuses on generating rules that contain a particular PEO in the antecedent and a
combination of SOs in the consequent. This can be achieved by setting the minimum support parameter of
WEKA to lie in an interval centered on the probability of that PEO. The rules generated for each PEO are then filtered
to select only those that have that particular PEO in the antecedent and a combination of SOs in the consequent. Thereafter,
the selected rules are sorted by their confidence. Table 5 shows examples of the discovered rules along with
their confidences.
Table 5. Examples of the discovered PEOs-SOs mapping rules
Rule                                   Confidence
if (PEO=LL) then i=1                   0.85
if (PEO=C) then g=1 and i=0            0.79
if (PEO=L) then d=1                    0.76
if (PEO=T) then a=0 and b=0            0.76
if (PEO=EC) then f=1                   0.85
if (PEO=P) then b=0 and c=0            0.70
if (PEO=SC) then b=0 and k=0           0.81
if (PEO=CS) then a=1 and e=1           0.66
if (PEO=TC) then a=1                   0.74
if (PEO=KC) then d=0                   0.75
if (PEO=GS) then a=1 and e=1           0.55
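The selection and sorting of rules described above can be sketched as follows. This is a hedged illustration only: the rule representation (a dictionary with `antecedent`, `consequent`, and `confidence` keys), the function name `select_peo_rules`, and the sample rules are assumptions. The first sample rule mirrors the LL row of Table 5; the other three are hypothetical.

```python
def select_peo_rules(rules, peo, top_n=10):
    """Keep rules of the form {PEO=peo} -> {SO conditions}, best first."""
    target = frozenset({f"PEO={peo}"})
    selected = [
        r for r in rules
        if r["antecedent"] == target          # only the PEO on the left
        and r["consequent"]                   # non-empty right-hand side
        and not any(item.startswith("PEO=") for item in r["consequent"])
    ]
    # Sort by confidence in descending order and keep the top_n rules.
    return sorted(selected, key=lambda r: r["confidence"], reverse=True)[:top_n]


mined = [
    {"antecedent": frozenset({"PEO=LL"}), "consequent": frozenset({"i=1"}),
     "confidence": 0.85},
    # Hypothetical rules below, for illustration only.
    {"antecedent": frozenset({"PEO=LL"}), "consequent": frozenset({"k=0"}),
     "confidence": 0.60},
    {"antecedent": frozenset({"i=1"}), "consequent": frozenset({"PEO=LL"}),
     "confidence": 0.30},
    {"antecedent": frozenset({"PEO=LL", "a=0"}),
     "consequent": frozenset({"i=1"}), "confidence": 0.90},
]
ll_rules = select_peo_rules(mined, "LL")
```

Rules with SOs in the antecedent or a PEO in the consequent are discarded, so only the first two sample rules survive, ordered by confidence.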
The set of rules for each PEO has been sorted in descending order of confidence, and the top
10 rules have been used to generate a more compact representation of the discovered rules, as shown in Table 6. It can
be observed that the PEOs Ethical Conduct (EC) and Social and Community (SC) have the highest average confidence,
whereas Knowledge Competency (KC) and Graduate Studies (GS) have the lowest. In this rule representation, the PEOs
Social and Community (SC) and Knowledge Competency (KC) do not depend on the presence of any SO; they
depend mainly on the absence of different combinations of SOs. The PEOs Life-long Learning (LL),
Communication (C), Ethical Conduct (EC), and Professionalism (P) each depend on the presence of a single SO. This
indicates a one-to-one mapping between the graduate attributes of the PEO and the corresponding skills of the SO.
Interestingly, the PEOs Leadership (L) and Teaming (T) depend on the presence of the same SOs (d and g) and on
the absence of almost the same combination of SOs. This suggests a correlation between the two graduate attributes
Leadership (L) and Teaming (T) and a correlation between the skills d and g. A somewhat related PEO is
Communication (C), which depends on a similar set of SOs. Concerning the PEOs Career Success (CS) and
Technical Competency (TC), their dependence on the same combination of SOs suggests a very high correlation
between their corresponding graduate attributes. A closely related PEO is Graduate Studies (GS), which depends
on the same set of SOs except c. As for the PEO Knowledge Competency (KC), it does not depend on the
presence of any SO; however, it depends on the absence of seven SOs.
Finally, the recommended PEOs-SOs mapping rules provide useful insights to the decision makers of
engineering academic programs during the continuous improvement process. They can serve as a guidance manual
to assist academics when designing a new academic program or reviewing an existing one.
5. Conclusion
An association rule mining technique is proposed and applied to discover the set of rules that govern the mapping
between the PEOs and SOs of academic programs. It is applied to a dataset extracted from the SSRs of 152 ABET-accredited
engineering programs. The discovered rules are useful for academic program design and assessment. In addition, the
discovered rules disclose a number of interesting correlations between PEOs and ABET SOs that are informative for
program design and assessment and deserve further investigation.
References