Rough Sets in Machine Learning: A Review
2016 RSTML Chapter
Abstract This chapter emphasizes the role played by rough set theory (RST)
within the broad field of Machine Learning (ML). As a sound data analysis and
knowledge discovery paradigm, RST has much to offer to the ML community. We
survey the existing literature and report on the most relevant RST theoretical
developments and applications in this area. The review starts with RST in the context
of data preprocessing (discretization, feature selection, instance selection and meta-
learning) as well as the generation of both descriptive and predictive knowledge
via decision rule induction, association rule mining and clustering. Afterward, we
examined several special ML scenarios in which RST has been recently introduced,
such as imbalanced classification, multi-label classification, dynamic/incremental
learning, Big Data analysis and cost-sensitive learning.
1 Introduction
Rafael Bello
Department of Computer Science, Universidad Central de Las Villas
Carretera Camajuanı́ km 5.5, Santa Clara, Cuba
e-mail: [email protected]
Rafael Falcon
Research & Engineering Division, Larus Technologies Corporation
170 Laurier Ave West - Suite 310, Ottawa ON, Canada
e-mail: [email protected]
indistinguishability, similarity or functionality mechanism. Note that the data and object
spaces can actually coincide [142]. The Granular Computing (GrC) paradigm [7]
[185] encompasses several computational models based on fuzzy logic, Computing
With Words, interval computing, rough sets, shadowed sets, near sets, etc.
The main purpose behind Granular Computing is to find a novel way to syn-
thesize knowledge in a more human-centric fashion and from vast, unstructured,
possibly high-dimensional raw data sources. Not surprisingly, GrC is closely
related to Machine Learning [259] [95] [83]. The aim of a learning
process is to derive a certain rule or system for either the automatic classification
of the system objects or the prediction of the values of the system control variables.
The key challenge with prediction lies in modeling the relationships among the sys-
tem variables in such a way that it allows inferring the value of the control (target)
variable.
Rough set theory (RST) [1] was developed by Zdzislaw Pawlak in the early 1980s
[181] as a mathematical approach to intelligent data analysis and data mining [182].
This methodology is based on the premise that lowering the degree of precision
in the data makes the data pattern more visible, i.e., the rough set approach can be
formally considered as a framework for pattern discovery from imperfect data [222].
Several reasons are given in [34] to employ RST in knowledge discovery, including:
• It does not require any preliminary or additional information about the data
• It provides a valuable analysis even in presence of incomplete data
• It allows the interpretation of large amounts of both quantitative and qualitative
data
• It can model highly nonlinear or discontinuous functional relations to provide
complex characterizations of data
• It can discover important facts hidden in the data and represent them in the form
of decision rules, and
• At the same time, the decision rules derived from rough set models are based on
facts, because every decision rule is supported by a set of examples.
Mert Bal [3] brought up other RST advantages, such as: (a) it performs a clear
interpretation of the results and evaluation of the meaningfulness of data; (b) it can
identify and characterize uncertain systems and (c) the patterns discovered using
rough sets are concise, strong and sturdy.
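These claims rest on RST's core machinery: the indiscernibility classes induced by a set of attributes and the lower and upper approximations of a target concept. A minimal sketch of that machinery follows; the toy patient table and attribute names are illustrative, not taken from the chapter.

```python
from collections import defaultdict

def indiscernibility_classes(universe, attrs):
    """Group objects that share identical values on every attribute in attrs."""
    classes = defaultdict(set)
    for name, values in universe.items():
        classes[tuple(values[a] for a in attrs)].add(name)
    return list(classes.values())

def approximations(universe, attrs, concept):
    """Lower/upper approximation of a concept (a set of object names)."""
    lower, upper = set(), set()
    for block in indiscernibility_classes(universe, attrs):
        if block <= concept:      # block lies entirely inside the concept
            lower |= block
        if block & concept:       # block overlaps the concept
            upper |= block
    return lower, upper

# Toy decision table: p1 and p2 are indiscernible on the recorded symptoms
# yet only p1 has the flu, so both fall in the boundary region.
patients = {
    "p1": {"headache": "yes", "weakness": "yes"},
    "p2": {"headache": "yes", "weakness": "yes"},
    "p3": {"headache": "yes", "weakness": "no"},
    "p4": {"headache": "no",  "weakness": "no"},
}
flu = {"p1", "p3"}
lower, upper = approximations(patients, ("headache", "weakness"), flu)
print(sorted(lower), sorted(upper))   # → ['p3'] ['p1', 'p2', 'p3']
```

The boundary region (upper minus lower) is precisely where the "imperfect data" mentioned above lives: objects that cannot be classified unambiguously with the available attributes.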
Among the main components of the knowledge discovery process we can men-
tion:
• PREPROCESSING
– Discretization
– Training set edition (instance selection)
– Feature selection
– Characterization of the learning problem (data complexity, metalearning)
• KNOWLEDGE DISCOVERY
– Feature selection
– Inter-attribute dependency characterization
– Feature reduction
– Feature weighting
– Feature discretization
– Feature removal
• Formulation of the discovered knowledge
– Discovery of decision rules
– Quantification of the uncertainty in the decision rules.
This section briefly goes over reported studies showcasing RST as a tool in data
preprocessing and descriptive/predictive knowledge discovery.
2.1 Preprocessing
2.1.1 Discretization
on these concepts is presented and illustrated with data from concrete frost-
resistance investigations.
Nguyen [175] considers the problem of searching for a minimal set of cuts that
preserves the discernibility between objects with respect to any subset of s attributes,
where s is a user-defined parameter. It was shown that this problem is NP-hard and
its heuristic solution is more complicated than that for the problem of searching for
an optimal, consistent set of cuts. The author proposed a scheme based on Boolean
reasoning to solve this problem.
Bazan [5] put forth a method to search for irreducible sets of cuts of an information
system. The method is based on the notion of dynamic reduct. These reducts
are calculated for the information system and the one with the best stability coeffi-
cient is chosen. Next, as an irreducible set of cuts, the author selected cuts belonging
to the chosen dynamic reduct.
Bazan et al. [6] proposed a discretization technique named maximal discernibility
(MD), which is based on rough sets and Boolean reasoning. MD is a greedy
heuristic that searches for cuts along the domains of all numerical attributes that dis-
cern the largest number of object pairs in the dataset. These object pairs are removed
from the information system before the next cut is sought. The set of cuts obtained
that way is optimal in terms of object indiscernibility; however, this procedure is not
feasible since computing one cut requires O(|A| · |U|³) operations. Locally optimal cuts [6]
are computed in O(|A| · |U|) steps using only O(|A| · |U|) space.
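The greedy MD idea can be illustrated for a single numeric attribute. The following is an illustrative sketch of the strategy, not Bazan et al.'s exact implementation: candidate cuts are midpoints between consecutive distinct values, and at each step the cut discerning the most still-undiscerned object pairs with different labels is selected.

```python
from itertools import combinations

def md_cuts(values, labels):
    """Greedy MD-style sketch for one numeric attribute: repeatedly pick
    the cut discerning the most remaining pairs with different labels."""
    order = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(order, order[1:])]
    # object pairs that still need to be discerned
    pairs = {(i, j) for i, j in combinations(range(len(values)), 2)
             if labels[i] != labels[j]}
    cuts = []
    while pairs and candidates:
        def discerned(c):
            return {(i, j) for i, j in pairs
                    if (values[i] < c) != (values[j] < c)}
        best = max(candidates, key=lambda c: len(discerned(c)))
        gain = discerned(best)
        if not gain:
            break
        cuts.append(best)
        pairs -= gain
        candidates.remove(best)
    return sorted(cuts)

print(md_cuts([1, 2, 3, 4], [0, 0, 1, 1]))   # → [2.5]
```

On this toy attribute a single cut at 2.5 discerns all four mixed-label pairs, so the greedy search stops after one step.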
Dai and Li [46] improved Nguyen’s discretization techniques by reducing the
time and space complexity required to arrive at the set of candidate cuts. They
proved that all bound cuts can discern the same object pairs as the entire set of
initial cuts. A strategy to select candidate cuts was proposed based on that proof.
They obtained identical results to Nguyen’s with a lower computational overhead.
Chen et al. [26] employ a genetic algorithm (GA) to derive the minimal cut set
in a numerical attribute. Each gene in a binary chromosome represents a particular
cut value. Enabling this gene means the corresponding cut value has been selected
as a member of the minimal cut set. Some optimization strategies such as elitist
selection and father-offspring combined selection helped the GA converge faster.
The experimental evidence showed that the GA-based scheme is more efficient than
Nguyen’s basic heuristic based on rough sets and Boolean reasoning.
Xie et al. [251] defined an information entropy value for every candidate cut
point in their RST-based discretization algorithm. The final cut points are selected
based on this metric and some RST properties. The authors report that their approach
outperforms other discretization techniques and scales well with the number of cut
points.
Su and Hsu [221] extended the modified Chi2 discretizer by learning the pre-
defined misclassification rate (input parameter) from data. The authors additionally
considered the effect of variance in the two adjacent intervals. In the modified Chi2,
the inconsistency check in the original Chi2 is replaced with the “quality of approxi-
mation” measure from RST. The result is a more robust, parameterless discretization
method.
2.1.2 Feature Selection

The purpose behind feature selection is to discard irrelevant features, which are generally
detrimental to the classifier's performance, generate noise, and increase both the amount
of information to be stored and the computational cost of the classification process
[224] [304]. Feature selection is a computationally expensive problem: it requires
searching among the n original features in a space of 2ⁿ − 1 candidate subsets
according to a predefined evaluation criterion. The main components of a feature se-
lection algorithm are: (1) an evaluation function (EF), used to calculate the fitness
of a feature subset and (2) a generation procedure that is responsible for generating
different subsets of candidate features.
Different feature selection schemes that integrate RST into the feature subset
evaluation function have been developed. The quality of the classification γ is the
most frequently used RST metric to judge the suitability of a candidate feature sub-
set, as shown in [9] [11] [64] [10] etc. Other indicators are conditional independence
[210] and approximate entropy [211].
The concept of reduct is the basis for these results. Essentially, a reduct is a
minimal subset of features that generates the same granulation of the universe as
that induced by all features. Among these works we can list [37] [38] [249] [304]
[225] [250] [170] [85] [89] [198] [112] [137] [241] [223] [257] [272]. One of the
pioneering methods is the QuickReduct algorithm, which is typical of algorithms
that resort to a greedy search strategy to find a relative reduct [204] [249] [137].
Generally speaking, feature selection algorithms are based on heuristic search [304]
[97] [166]. Other RST-based methods for reduct calculation are [98] [211].
More advanced methods employ metaheuristic algorithms (such as Genetic Al-
gorithms, Ant Colony Optimization or Particle Swarm Optimization) as the underly-
ing feature subset generation engine [247] [248] [102] [9] [243] [11] [15] [244] [10]
[64] [120] [299] [8] [276] [270]. Feature selection methods based on the hybridiza-
tion between fuzzy and rough sets have been proposed in [126] [101] [199] [103]
[104] [205] [13] [90] [87] [42] [227] [301] [105] [43] [44] [75] [51] [28] [92] [195].
Some studies aim at calculating all possible reducts of a decision system [208] [227]
[301] [27] [28].
Feature selection is arguably the Machine Learning (ML) area that has witnessed
the greatest influx of rough-set-based methods. Other RST contributions to ML are
concerned with providing metrics to calculate the inter-attribute dependence and
the importance (weight) of any attribute [121] [224].
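A greedy reduct search in the spirit of QuickReduct can be sketched as follows; the dependency degree γ (the quality of classification) serves as the evaluation function. The attribute names and the toy decision table are illustrative assumptions, not drawn from any cited method's code.

```python
from collections import defaultdict

def blocks(rows, attrs):
    """Partition row indices by their values on attrs (U/B)."""
    part = defaultdict(list)
    for i, r in enumerate(rows):
        part[tuple(r[a] for a in attrs)].append(i)
    return part.values()

def gamma(rows, attrs, decision):
    """Dependency degree: fraction of objects in the positive region."""
    if not attrs:
        return 0.0
    pos = 0
    for block in blocks(rows, attrs):
        if len({rows[i][decision] for i in block}) == 1:  # pure block
            pos += len(block)
    return pos / len(rows)

def quickreduct(rows, conditional, decision):
    """Greedy sketch: add the attribute that raises gamma the most."""
    reduct, full = [], gamma(rows, conditional, decision)
    while gamma(rows, reduct, decision) < full:
        best = max((a for a in conditional if a not in reduct),
                   key=lambda a: gamma(rows, reduct + [a], decision))
        reduct.append(best)
    return reduct

# Toy table: the decision 'd' is fully determined by attribute 'b' alone.
rows = [{"a": 0, "b": 0, "c": 0, "d": 0},
        {"a": 0, "b": 1, "c": 0, "d": 1},
        {"a": 1, "b": 0, "c": 1, "d": 0},
        {"a": 1, "b": 1, "c": 1, "d": 1}]
print(quickreduct(rows, ["a", "b", "c"], "d"))   # → ['b']
```

The search stops as soon as the candidate subset reaches the dependency degree of the full attribute set, yielding a relative reduct (not necessarily a minimal one, as the greedy strategy offers no such guarantee).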
2.1.3 Training Set Edition (Instance Selection)

Another important data preprocessing task is the editing of the training set, also re-
ferred to as instance selection. The aim is to reduce the number of examples in order
to bring down the size of the training set while maintaining the system's efficiency.
By doing so, a smaller training set is obtained that usually also yields higher
efficiency.
Some training set edition approaches using rough sets have been published in
[19] and [16]. The simplest idea is to remove all examples in the training set that
are not contained in the lower approximation of any of the decision classes. A more
thorough investigation also considers those examples that lie in the boundary re-
gion of any of the decision classes. Fuzzy rough sets have been also applied to the
instance selection problem in [99] [235] [234].
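The simplest editing idea mentioned above — keeping only the examples that lie in the lower approximation of their own decision class — can be sketched as follows (the toy table is an illustrative assumption):

```python
from collections import defaultdict

def edit_training_set(rows, attrs, decision):
    """Sketch: keep only objects in consistent (pure) indiscernibility
    blocks, i.e. in the lower approximation of some decision class."""
    blocks = defaultdict(list)
    for i, r in enumerate(rows):
        blocks[tuple(r[a] for a in attrs)].append(i)
    keep = []
    for members in blocks.values():
        if len({rows[i][decision] for i in members}) == 1:
            keep.extend(members)
    return sorted(keep)

# Rows 0 and 1 are indiscernible on 'a' yet disagree on 'd': both dropped.
rows = [{"a": 0, "d": 0}, {"a": 0, "d": 1}, {"a": 1, "d": 1}]
print(edit_training_set(rows, ("a",), "d"))   # → [2]
```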
2.1.4 Meta-learning
2.2 Knowledge Discovery

The knowledge uncovered by the different data analysis techniques can be either
descriptive or predictive. The former characterizes the general properties of the data
in the data set (e.g., association rules) while the latter allows performing inferences
from the available data (e.g., decision rules). A decision rule summarizes the rela-
tionship between the properties (features), describing a causal link among
them. For example, IF Headache = YES AND Weakness = YES THEN Influenza =
YES. The most common rule induction task is to generate a rule base R that is both
consistent and complete.
According to [163], RST-based rule induction methods provide the following
benefits:
[300] [24], which includes working with the so called “fuzzy decision information
systems” [2].
One of the most popular rule induction methods based on rough sets is the so-
called three-way decisions model [262] [263] [264] [265] [81]. This methodology
is strongly related to decision making. Essentially, for each decision alternative,
this method defines three rules based on the RST’s positive, negative and bound-
ary regions. They respectively indicate acceptance, rejection or abstention (non-
commitment, denotes weak or insufficient evidence).
This type of rules, derived from the basic RST concepts, is a suitable knowledge
representation vehicle in a plethora of application domains. Hence, it has been inte-
grated into common machine learning tasks to facilitate the knowledge engineering
process required for a successful modeling of the domain under consideration. The
three-way decisions model has been adopted in feature selection [267] [165] [106]
[134] [295] [108], classification [275] [295] [284] [283], clustering [278] [279] and
face recognition [291] [133].
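The acceptance/rejection/abstention rules can be sketched with the pair of probability thresholds commonly used in decision-theoretic formulations; the threshold values below are arbitrary assumptions for illustration.

```python
def three_way(prob, alpha=0.75, beta=0.35):
    """Three-way decision sketch: alpha/beta carve the positive,
    negative and boundary regions of a concept."""
    if prob >= alpha:
        return "accept"    # positive region
    if prob <= beta:
        return "reject"    # negative region
    return "abstain"       # boundary region: weak or insufficient evidence

print(three_way(0.9), three_way(0.1), three_way(0.5))
# → accept reject abstain
```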
2.2.2 Association Rule Mining

The discovery of association rules is one of the classical data mining tasks. Its goal
is to uncover relationships among attributes that frequently appear together; i.e., the
presence of one implies the presence of the other. One of the typical examples is the
purchase of beer and diapers during the weekends. Association rules are representa-
tive of descriptive knowledge. A particular case is that of the so-called "class association
rules", which are used to build classifiers. Several methods have been developed for
discovering association rules using rough sets, including [49] [128] [94] [70] [268]
[135] [112] [213].
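The support/confidence mechanics behind the beer-and-diapers example can be illustrated with a toy, generic rule miner (single-antecedent rules only; this is not an RST-specific algorithm, and the threshold values are assumptions):

```python
from itertools import combinations

def association_rules(transactions, min_support=0.5, min_conf=0.8):
    """Toy sketch: single-antecedent rules scored by support/confidence."""
    n = len(transactions)
    items = {i for t in transactions for i in t}
    rules = []
    for a, b in combinations(sorted(items), 2):
        for x, y in ((a, b), (b, a)):        # try both directions x -> y
            sup_x = sum(x in t for t in transactions) / n
            sup_xy = sum(x in t and y in t for t in transactions) / n
            if sup_xy >= min_support and sup_x and sup_xy / sup_x >= min_conf:
                rules.append((x, y))
    return rules

baskets = [{"beer", "diapers"}, {"beer", "diapers"}, {"beer"}, {"milk"}]
print(association_rules(baskets))   # → [('diapers', 'beer')]
```

Here "diapers → beer" holds with confidence 1.0, while the converse fails the confidence threshold (0.67 < 0.8), illustrating that the two directions of an association are scored independently.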
2.2.3 Clustering
The clustering problem is another learning task that has been approached from
a rough set perspective. Clustering is a landmark unsupervised learning problem
whose main objective is to group similar objects in the same cluster and separate ob-
jects that are different from each other by assigning them to different clusters [169]
[96]. The objects are grouped in such a way that those in the same group exhibit a
high degree of association among them whereas those in different groups show a low
degree of association. Clustering algorithms map the original N-dimensional feature
space to a 1-dimensional space describing the cluster each object belongs to. This is
why clustering is considered both an important dimensionality reduction technique
and also one of the most prevalent Granular Computing [185] manifestations.
One of the most popular and efficient clustering algorithms for conventional ap-
plications is K-means clustering [71]. In the K-means approach, randomly selected
objects serve as initial cluster centroids. The objects are then assigned to different
clusters based on their distance to the centroids. In particular, an object gets assigned
to the cluster with the nearest centroid. The newly modified clusters then
employ this information to determine new centroids. The process continues itera-
tively until the cluster centroids are stabilized. K-means is a very simple clustering
algorithm, easy to understand and implement. The underlying alternate optimiza-
tion approach iteratively converges but might get trapped into a local minimum of
the objective function. K-means’ best performance is attained in those applications
where clusters are well separated and a crisp (bivalent) object-to-cluster decision is
required. Its disadvantages include the sensitivity to outliers and the initial cluster
centroids as well as the a priori specification of the desired number of clusters k.
Pawan Lingras [143] [146] found that the K-means algorithm often yields
clustering results with unclear, vague boundaries. He pointed out that the "hard par-
titioning" performed by K-means does not meet the needs of grouping vague data.
Lingras then proposed to combine K-means with RST in the so-called "Rough
K-means" approach. In this technique, each cluster is modeled as a rough set and
each object belongs either to the lower approximation of a cluster or to the upper
approximation of multiple clusters. Instead of building each cluster, its lower and
upper approximations are defined based on the available data. The basic properties
of the Rough K-means method are: (i) an object can be a member of at most one lower
approximation; (ii) an object that is a member of the lower approximation of a clus-
ter is also a member of its upper approximation and (iii) an object that does not
belong to the lower approximation of any cluster is a member of at least the upper
approximation of two clusters. Other pioneering works on rough clustering methods
are put forth in [78] [194] [238] [237].
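The assignment and centroid-update rules described above can be sketched as follows. The parameter names (`w_lower`, `w_upper`, the distance-ratio threshold) and their values are assumptions for illustration rather than Lingras' exact formulation.

```python
import math
import random

def rough_kmeans(points, k, w_lower=0.7, w_upper=0.3,
                 ratio=1.3, max_iter=100, seed=0):
    """Sketch of a Lingras-style Rough K-means. lower[i] holds indices
    assigned unambiguously to cluster i; upper[i] additionally holds
    boundary objects shared with other clusters."""
    rng = random.Random(seed)
    dim = len(points[0])
    centroids = [tuple(p) for p in rng.sample(points, k)]
    lower = upper = None
    for _ in range(max_iter):
        lower = [set() for _ in range(k)]
        upper = [set() for _ in range(k)]
        for idx, p in enumerate(points):
            dists = [math.dist(p, c) for c in centroids]
            nearest = min(range(k), key=dists.__getitem__)
            # clusters whose centroid is almost as close as the nearest one
            close = [j for j in range(k)
                     if j != nearest and dists[j] <= ratio * dists[nearest]]
            upper[nearest].add(idx)
            if close:                     # ambiguous object -> boundary
                for j in close:
                    upper[j].add(idx)
            else:                         # clear-cut -> lower approximation
                lower[nearest].add(idx)
        def mean(idxs):
            return tuple(sum(points[j][d] for j in idxs) / len(idxs)
                         for d in range(dim))
        new_centroids = []
        for i in range(k):
            boundary = upper[i] - lower[i]
            if lower[i] and boundary:     # weighted mix of both regions
                lo, bd = mean(lower[i]), mean(boundary)
                new_centroids.append(tuple(w_lower * a + w_upper * b
                                           for a, b in zip(lo, bd)))
            elif lower[i] or boundary:
                new_centroids.append(mean(lower[i] or boundary))
            else:                         # empty cluster: re-seed it
                new_centroids.append(tuple(points[rng.randrange(len(points))]))
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return lower, upper, centroids
```

By construction the sketch satisfies the three basic properties listed above: every lower-approximation member is also an upper-approximation member, and an object outside every lower approximation belongs to at least two upper approximations.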
Rough K-means has been the subject of several subsequent studies aimed at im-
proving its clustering capabilities. Georg Peters [189] concludes that rough cluster-
ing offers the possibility of reducing the number of incorrectly clustered objects,
which is relevant to many real-world applications where minimizing the number of
wrongly grouped objects is more important than maximizing the number of cor-
rectly grouped objects. Hence, in these scenarios, Rough K-means arises as a power-
ful and more robust alternative to K-means. The same author proposes some improve-
ments to the method regarding the calculation of the centroids, thus aiming to make
the method more stable and robust to outliers [186] [187]. The authors in [293] pro-
posed a Rough K-means improvement based on a variable weighted distance mea-
sure. Another enhancement brought forward in [188] suggested that well-defined
objects must have a greater impact on the cluster centroid calculation rather than
having this impact be governed by the number of cluster boundaries an object be-
longs to, as proposed in the original method. An extension to Rough K-means based
on the decision-theoretic rough sets model was developed in [131]. An evolutionary
approach for rough partitive clustering was designed in [170] [191] while [45] and
[192] elaborate on dynamic rough clustering approaches.
Other works that tackle the clustering problem using rough sets are [123] [164]
[180] [215] [144] [77] [76] [125] [35] [273] [277] [72] [136] [145] [274] [294]
[179]. These methods handle more specific scenarios (such as sequential, imbal-
anced, categorical and ordinal data), as well as applications of this clustering ap-
proach to different domains. The rough-fuzzy K-means method is put forward in
[88] and [172] whereas the fuzzy-rough K-means is unveiled in [171] and [190].
Both approaches amalgamate the main features of Rough K-means and Fuzzy C-
means by using the fuzzy membership of the objects to the rough clusters. Other
variants of fuzzy and rough set hybridization for the clustering problem are pre-
sented in [162] [173] [56] [127].
The traditional knowledge discovery methods presented in the previous section have
to be adapted if we are dealing with an imbalanced dataset [21]. A dataset is bal-
anced if it has an approximately equal percentage of positive and negative examples
(i.e., those belonging to the concept to be classified and those belonging to other
concepts, respectively). However, there are many application domains where we
find an imbalanced dataset; for instance, in healthcare scenarios there are usually
a plethora of patients that do not have a particularly rare disease. When learning a
normalcy model for a certain environment, the number of labeled anomalous events
is often scarce as most of the data corresponds to normal behaviour. The problem
with imbalanced classes is that the classification algorithms have a tendency towards
favoring the majority class. This occurs because the classifier attempts to reduce the
overall error, hence the classification error does not take into account the underlying
data distribution [23].
Several solutions have been researched to deal with this kind of situation. Two of
the most popular avenues are either resampling the training data (i.e., oversampling
the minority class or undersampling the majority class) or modifying the learning
method [155]. One of the classical methods for learning with imbalanced data is
SMOTE (synthetic minority oversampling technique) [22]. Different learning meth-
ods for imbalanced classification have been developed from an RST-based stand-
point. For instance, Hu et al. [91] proposed models based on probabilistic rough
sets where each example has an associated probability p(x) instead of the default
1/n. Ma et al. [160] introduced weights in the variable-precision rough set model
(VPRS) to denote the importance of each example. Liu et al. [155] bring about
some weights in the RST formulation to balance the class distribution and develop
a method based on weighted rough sets to solve the imbalanced class learning prob-
lem. Ramentol et al. [196] proposed a method that integrates SMOTE with RST.
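SMOTE's core interpolation step can be sketched as follows; the parameter names and neighbourhood size are illustrative assumptions, and no RST component is included here.

```python
import math
import random

def smote(minority, n_new, k=2, seed=0):
    """Minimal SMOTE sketch: synthesize points by interpolating between
    a minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted((p for p in minority if p != x),
                            key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        t = rng.random()                          # interpolation factor
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(x, nb)))
    return synthetic

new_points = smote([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], n_new=4)
print(len(new_points))   # → 4
```

Each synthetic example lies on a segment between two genuine minority examples, so the oversampled class stays inside its original region of the feature space.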
Data are continuously being updated in today's information systems. New data
are added and obsolete data are purged over time. Traditional batch-learning meth-
ods lean on the principle of running these algorithms on all data when the informa-
tion is updated, which obviously affects the system's efficiency while ignoring any
previous learning. Instead, learning should occur as new information arrives. Man-
aging this learning while adapting the previously learned knowledge is the essence
behind incremental learning. This term refers to an efficient strategy for the anal-
ysis of data in dynamic environments that allows acquiring additional knowledge
from an uninterrupted information flow. The advantage of incremental learning is
not to have to analyze the data from scratch but to utilize the learning process’ pre-
vious outcomes as much as possible [178] [202] [73] [57] [113]. The continuous
and massive acquisition of data becomes a challenge for knowledge discovery;
especially in the context of Big Data, it becomes essential to develop the
capacity to assimilate continuous data streams [29].
As an information-based methodology, RST is not exempt from being scrutinized
in the context of dynamic data. The fundamental RST concepts and the knowledge
discovery methods ensuing from them are geared towards the analysis of static data;
hence, they need to be thoroughly revised in light of the requirements posed by
data stream mining systems [153]. The purpose of the incremental learning strategy
in rough sets is the development of incremental algorithms to quickly update the
concept approximations, the reduct calculation or the discovered decision rules [40]
[286]. The direct precursor of these studies can be found in [177]. According to
[150], in recent years RST-based incremental learning approaches have become “hot
topics” in knowledge extraction from dynamic data given their proven data analysis
efficiency.
The study of RST in the context of learning with dynamic data can be approached
from two different angles: what kind of information is considered to be dynamic and
what type of learning task must be carried out. In the first case, the RST-based incre-
mental updating approach could be further subdivided into three alternatives: (i) ob-
ject variation (insertion or deletion of objects in the universe), (ii) attribute variation
(insertion/removal of attributes) and (iii) attribute value variation (insertion/deletion
of attribute values). In the second case, we can mention (i) incremental learning of
the concept approximations [140] [33]; (ii) incremental learning of attribute reduc-
tion [52] [252] [239] [240] [141] and (iii) incremental learning of decision rules
[303] [66] [59] [149].
Object variations include so-called object immigration and emigration [149].
Variations of the attributes include feature insertion or deletion [289] [139]. Vari-
ations in attribute values are primarily manifested via the refinement or scaling of
the attribute values [147] [32]. Other works that propose modifications to RST-based
methods for the case of dynamic data are [148] [151] [159].
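For the object-insertion case, the incremental idea can be sketched as follows: only the equivalence class touched by the new object needs its approximation status revisited, with no full recomputation over the universe. This is a simplification intended to convey the principle, not the implementation of any cited approach.

```python
from collections import defaultdict

class IncrementalApprox:
    """Sketch: maintain equivalence classes and the lower/upper
    approximations of a target concept under object insertion."""
    def __init__(self, attrs, concept=()):
        self.attrs, self.concept = attrs, set(concept)
        self.blocks = defaultdict(set)   # attribute values -> object ids
        self.lower, self.upper = set(), set()

    def insert(self, obj_id, row, in_concept):
        key = tuple(row[a] for a in self.attrs)
        block = self.blocks[key]
        block.add(obj_id)
        if in_concept:
            self.concept.add(obj_id)
        # only the touched block can change its approximation status
        if block & self.concept:
            self.upper |= block
        if block <= self.concept:
            self.lower |= block
        else:
            self.lower -= block

ia = IncrementalApprox(("color",))
ia.insert("o1", {"color": "red"}, True)    # lower = upper = {o1}
ia.insert("o2", {"color": "red"}, False)   # o1's block becomes boundary
ia.insert("o3", {"color": "blue"}, True)   # a new, pure block appears
print(sorted(ia.lower), sorted(ia.upper))  # → ['o3'] ['o1', 'o2', 'o3']
```

Inserting `o2` demotes the previously certain `o1` to the boundary region, illustrating why approximations must be revised, not merely extended, as objects arrive.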
The following studies deal with dynamic object variation:
• The update of the lower and upper approximations of the target concept is ana-
lyzed in [33] [138] [158].
• The update in the reduction of attributes is studied in [82] [252].
• The update of the decision rule induction mechanism is discussed in [201] [4]
[246] [271] [303] [59] [149] [232] [40] [93].
If the variation occurs in the set of attributes, its effects have been studied with
respect to these aspects:
• The update of the lower and upper approximations of the target concept is ana-
lyzed in [20] [140] [36] [289] [139] [152].
• The update of the decision rule induction mechanism is discussed in [39].
The effect of the variations in the attribute values (namely, via refinement or
extension of the attribute domains) with respect to the update of the lower and upper
approximations of the target concept is analyzed in [50] [310] [30] [32] [31] [239].
The calculation of reducts for dynamic data has also been investigated. The effect
when the set of attributes varies is studied in [39]. The case of varying the attribute
values is explored in [50] and [69] whereas the case of dynamic object update is dis-
sected in [201] and [246]. Other studies on how dynamic data affect the calculation
of reducts appear in [239] [240] [141] and [206].
On the other hand, the accelerated pace of technology has led to an exponential
growth in the generation and collection of digital information. This growth is not
only limited to the amount of data available but to the plethora of diverse sources
that emit these data streams. It becomes paramount then to efficiently analyze and
extract knowledge from many dissimilar information sources within a certain appli-
cation domain. This has led to the emergence of the Big Data era [25], which has a
direct impact on the development of RST and its applications. Granular Computing,
our starting point in this chapter, has a strong relation to Big Data [25], as its in-
herent ability to process information at multiple levels of abstraction and interpret
information from different perspectives greatly facilitates the efficient management
of large data volumes.
Simply put, Big Data can be envisioned as a large and complex data collec-
tion. These data are very difficult to analyze through traditional data management
and processing tools. Big Data scenarios require new architectures, techniques, al-
gorithms and processes to manage and extract value and knowledge hidden in the
data streams. Big Data is often characterized by the 5 V's: Volume, Velocity,
Variety, Veracity and Value. Big Data includes both structured and unstructured
data, such as images, videos and textual reports. Big Data frameworks such as
MapReduce and Spark have been recently developed and constitute indispensable
tools for the accurate and seamless knowledge extraction from an array of disparate
data sources. For more information on the Big Data paradigm, the reader is referred
to the following articles: [48] [25] [60] [118].
As a data analysis and information extraction methodology, RST needs to adapt
and evolve in order to cope with this new phenomenon. A major motivation to do so
lies in the fact that the sizes of nowadays’ decision systems are already extremely
large. This poses a significant challenge to the efficient calculation of the underlying
RST concepts and the knowledge discovery methods that emanate from them. Recall
that the computational complexity of computing the target concept’s approximations
is O(lm²), the computational cost of finding a reduct is bounded by O(l²m²) and the
time complexity to find all reducts is O(2ˡ · J), where l is the number of attributes
characterizing the objects, m is the number of objects in the universe and J is the
computational cost required to calculate a reduct.
Some researchers have proposed RST-based solutions to the Big Data challenge
[288] [193]. These methods are concerned with the design of parallel algorithms
to compute equivalence classes, decision classes, associations between equivalence
classes and decision classes, approximations, and so on. They are based on partition-
ing the universe, concurrently processing those information subsystems and then
integrating the results. In other words, given the decision system S = (U,C ∪ D),
generate the subsystems {S₁, S₂, . . . , Sₘ}, where Sᵢ = (Uᵢ, C ∪ D) and U = ⋃ᵢ Uᵢ; then
process each subsystem Sᵢ, i ∈ {1, 2, . . . , m}, computing Uᵢ/B for B ⊆ C. Afterwards, the results are
amalgamated. This MapReduce-compliant workflow is supported by several theo-
rems stating that (a) equivalence classes can be independently computed for each
subsystem and (b) the equivalence classes from different subsystems can be merged
if they are based on the same underlying attribute set. These results enable the par-
allel computation of the equivalence classes of the decision system S. Zhang et al.
[288] developed the PACRSEC algorithm to that end.
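The partition-process-merge workflow can be sketched as a map step and a reduce step; this is an illustration of the theorems above, not the PACRSEC implementation.

```python
from collections import defaultdict

def local_classes(subsystem, attrs):
    """Map step: equivalence classes of one partition Si = (Ui, C ∪ D)."""
    classes = defaultdict(set)
    for obj_id, row in subsystem:
        classes[tuple(row[a] for a in attrs)].add(obj_id)
    return classes

def merge_classes(partials):
    """Reduce step: classes keyed on identical attribute values merge."""
    merged = defaultdict(set)
    for part in partials:
        for key, members in part.items():
            merged[key] |= members
    return merged

rows = [("o1", {"a": 0}), ("o2", {"a": 1}), ("o3", {"a": 0}), ("o4", {"a": 1})]
part1, part2 = rows[:2], rows[2:]
merged = merge_classes([local_classes(part1, ("a",)),
                        local_classes(part2, ("a",))])
print(dict(merged) == dict(local_classes(rows, ("a",))))   # → True
```

Because the classes are keyed on the same underlying attribute set, merging the per-partition results reproduces exactly the equivalence classes of the whole decision system, which is what licenses the parallel computation.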
Analogously, RST-based knowledge discovery methods, including reduct calcu-
lation and decision rule induction, have been investigated in the context of Big
Data [258] [287] [58].
4 Reference categorization
Table 1 lists the different RST studies according to the ML tasks they perform.
5 Conclusions
References
1. Abraham, A., Falcon, R., Bello, R.: Rough Set Theory: a True Landmark in Data Analysis.
Springer Verlag, Berlin-Heidelberg, Germany (2009)
2. Bai, H., Ge, Y., Wang, J., Li, D., Liao, Y., Zheng, X.: A method for extracting rules from
spatial data based on rough fuzzy sets. Knowledge-Based Systems 57, 28–40 (2014)
3. Bal, M.: Rough sets theory as symbolic data mining method: an application on complete
decision table. Information Sciences Letters 2(1), 111–116 (2013)
4. Bang, W.C., Bien, Z.: New incremental learning algorithm in the framework of rough set
theory. International Journal of Fuzzy Systems 1, 25–36 (1999)
5. Bazan, J.G.: A comparison of dynamic and non-dynamic rough set methods for extracting
laws from decision tables. Rough sets in knowledge discovery 1, 321–365 (1998)
6. Bazan, J.G., Nguyen, H.S., Nguyen, S.H., Synak, P., Wróblewski, J.: Rough set algorithms in
classification problem. In: Rough set methods and applications, pp. 49–88. Springer (2000)
7. Bello, R., Falcon, R., Pedrycz, W., Kacprzyk, J.: Granular Computing: at the Junction of
Rough Sets and Fuzzy Sets. Springer Verlag, Berlin-Heidelberg, Germany (2008)
8. Bello, R., Gómez, Y., Caballero, Y., Nowe, A., Falcon, R.: Rough Sets and Evolutionary
Computation to Solve the Feature Selection Problem. In: A. Abraham, R. Falcon, R. Bello
(eds.) Rough Set Theory: A True Landmark in Data Analysis, Studies in Computational
Intelligence, vol. 174, pp. 235–260. Springer Berlin / Heidelberg (2009)
9. Bello, R., Nowe, A., Gómez, Y., Caballero, Y.: Using ACO and rough set theory to feature
selection. WSEAS Transactions on Information Science and Applications 2(5), 512–517
(2005)
10. Bello, R., Puris, A., Falcon, R., Gómez, Y.: Feature Selection through Dynamic Mesh Opti-
mization. In: J. Ruiz-Shulcloper, W. Kropatsch (eds.) Progress in Pattern Recognition, Im-
age Analysis and Applications, Lecture Notes in Computer Science, vol. 5197, pp. 348–355.
Springer Berlin / Heidelberg (2008)
11. Bello, R., Puris, A., Nowe, A., Martı́nez, Y., Garcı́a, M.M.: Two step ant colony system to
solve the feature selection problem. In: Iberoamerican Congress on Pattern Recognition, pp.
588–596. Springer (2006)
12. Bello, R., Verdegay, J.L.: Rough sets in the soft computing environment. Information Sci-
ences 212, 1–14 (2012)
13. Bhatt, R.B., Gopal, M.: On fuzzy-rough sets approach to feature selection. Pattern Recognition Letters 26(7), 965–975 (2005)
14. Błaszczyński, J., Słowiński, R., Szeląg, M.: Sequential covering rule induction algorithm for
variable consistency rough set approaches. Information Sciences 181(5), 987–1002 (2011)
15. Caballero, Y., Bello, R., Alvarez, D., Garcia, M.M.: Two new feature selection algorithms
with rough sets theory. In: IFIP International Conference on Artificial Intelligence in Theory
and Practice, pp. 209–216. Springer (2006)
16. Caballero, Y., Bello, R., Alvarez, D., García, M.M., Pizano, Y.: Improving the k-nn method:
Rough set in edit training set. In: Professional Practice in Artificial Intelligence, pp. 21–30.
Springer (2006)
17. Caballero, Y., Bello, R., Arco, L., Garcı́a, M., Ramentol, E.: Knowledge discovery using
rough set theory. In: Advances in Machine Learning I, pp. 367–383. Springer (2010)
18. Caballero, Y., Bello, R., Arco, L., Márquez, Y., León, P., Garcı́a, M.M., Casas, G.: Rough
set theory measures for quality assessment of a training set. In: Granular Computing: At the
Junction of Rough Sets and Fuzzy Sets, pp. 199–210. Springer (2008)
19. Caballero, Y., Joseph, S., Lezcano, Y., Bello, R., Garcia, M.M., Pizano, Y.: Using rough sets
to edit training set in k-nn method. In: ISDA, pp. 456–463 (2005)
20. Chan, C.C.: A rough set approach to attribute generalization in data mining. Information
Sciences 107(1), 169–176 (1998)
21. Chawla, N.V.: Data mining for imbalanced datasets: An overview. In: Data mining and
knowledge discovery handbook, pp. 853–867. Springer (2005)
22. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: Smote: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research 16, 321–357 (2002)
23. Chawla, N.V., Cieslak, D.A., Hall, L.O., Joshi, A.: Automatically countering imbalance and
its empirical relationship to cost. Data Mining and Knowledge Discovery 17(2), 225–252
(2008)
24. Chen, C., Mac Parthaláin, N., Li, Y., Price, C., Quek, C., Shen, Q.: Rough-fuzzy rule inter-
polation. Information Sciences 351, 1–17 (2016)
25. Chen, C.P., Zhang, C.Y.: Data-intensive applications, challenges, techniques and technolo-
gies: A survey on big data. Information Sciences 275, 314–347 (2014)
26. Chen, C.Y., Li, Z.G., Qiao, S.Y., Wen, S.P.: Study on discretization in rough set based on
genetic algorithm. In: Machine Learning and Cybernetics, 2003 International Conference
on, vol. 3, pp. 1430–1434. IEEE (2003)
27. Chen, D., Hu, Q., Yang, Y.: Parameterized attribute reduction with gaussian kernel based
fuzzy rough sets. Information Sciences 181(23), 5169–5179 (2011)
28. Chen, D., Zhang, L., Zhao, S., Hu, Q., Zhu, P.: A novel algorithm for finding reducts with
fuzzy rough sets. IEEE Transactions on Fuzzy Systems 20(2), 385–389 (2012)
29. Chen, H., Chiang, R.H., Storey, V.C.: Business intelligence and analytics: From big data to
big impact. MIS Quarterly 36(4), 1165–1188 (2012)
30. Chen, H., Li, T., Qiao, S., Ruan, D.: A rough set based dynamic maintenance approach for
approximations in coarsening and refining attribute values. International Journal of Intelli-
gent Systems 25(10), 1005–1026 (2010)
31. Chen, H., Li, T., Ruan, D.: Dynamic maintenance of approximations under a rough-set based
variable precision limited tolerance relation. Journal of Multiple-Valued Logic & Soft Com-
puting 18 (2012)
32. Chen, H., Li, T., Ruan, D.: Maintenance of approximations in incomplete ordered decision
systems while attribute values coarsening or refining. Knowledge-Based Systems 31, 140–
161 (2012)
33. Chen, H., Li, T., Ruan, D., Lin, J., Hu, C.: A rough-set-based incremental approach for up-
dating approximations under dynamic maintenance environments. IEEE Transactions on
Knowledge and Data Engineering 25(2), 274–284 (2013)
Rough Sets in Machine Learning: A Review 21
34. Chen, Y.S., Cheng, C.H.: A delphi-based rough sets fusion model for extracting payment
rules of vehicle license tax in the government sector. Expert Systems with Applications
37(3), 2161–2174 (2010)
35. Cheng, X., Wu, R.: Clustering path profiles on a website using rough k-means method. Jour-
nal of Computational Information Systems 8(14), 6009–6016 (2012)
36. Cheng, Y.: The incremental method for fast computing the rough fuzzy approximations. Data
& Knowledge Engineering 70(1), 84–100 (2011)
37. Choubey, S.K., Deogun, J.S., Raghavan, V.V., Sever, H.: A comparison of feature selection
algorithms in the context of rough classifiers. In: Fuzzy Systems, 1996. Proceedings of the
Fifth IEEE International Conference on, vol. 2, pp. 1122–1128. IEEE (1996)
38. Chouchoulas, A., Shen, Q.: A rough set-based approach to text classification. In: Interna-
tional Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft Computing,
pp. 118–127. Springer (1999)
39. Ciucci, D.: Attribute dynamics in rough sets. In: International Symposium on Methodologies
for Intelligent Systems, pp. 43–51. Springer (2011)
40. Ciucci, D.: Temporal dynamics in information tables. Fundamenta informaticae 115(1), 57–
74 (2012)
41. Coello, L., Fernandez, Y., Filiberto, Y., Bello, R.: Improving the multilayer perceptron learn-
ing by using a method to calculate the initial weights with the similarity quality measure
based on fuzzy sets and particle swarms. Computación y Sistemas 19(2), 309–320 (2015)
42. Cornelis, C., Jensen, R.: A noise-tolerant approach to fuzzy-rough feature selection. In: Fuzzy Systems, 2008. FUZZ-IEEE 2008 (IEEE World Congress on Computational Intelligence), IEEE International Conference on, pp. 1598–1605. IEEE (2008)
43. Cornelis, C., Jensen, R., Hurtado, G., Ślęzak, D.: Attribute selection with fuzzy decision
reducts. Information Sciences 180(2), 209–224 (2010)
44. Cornelis, C., Verbiest, N., Jensen, R.: Ordered weighted average based fuzzy rough sets. In:
International Conference on Rough Sets and Knowledge Technology, pp. 78–85. Springer
(2010)
45. Crespo, F., Peters, G., Weber, R.: Rough clustering approaches for dynamic environments.
In: Rough Sets: Selected Methods and Applications in Management and Engineering, pp.
39–50. Springer (2012)
46. Dai, J.H., Li, Y.X.: Study on discretization based on rough set theory. In: Machine Learning
and Cybernetics, 2002. Proceedings. 2002 International Conference on, vol. 3, pp. 1371–
1373. IEEE (2002)
47. De Comité, F., Gilleron, R., Tommasi, M.: Learning multi-label alternating decision trees
from texts and data. In: International Workshop on Machine Learning and Data Mining in
Pattern Recognition, pp. 35–49. Springer (2003)
48. Dean, J., Ghemawat, S.: Mapreduce: simplified data processing on large clusters. Commu-
nications of the ACM 51(1), 107–113 (2008)
49. Delic, D., Lenz, H.J., Neiling, M.: Improving the quality of association rule mining by means
of rough sets. In: Soft Methods in Probability, Statistics and Data Analysis, pp. 281–288.
Springer (2002)
50. Deng, D., Huang, H.: Dynamic reduction based on rough sets in incomplete decision systems.
In: International Conference on Rough Sets and Knowledge Technology, pp. 76–83. Springer
(2007)
51. Derrac, J., Cornelis, C., Garcı́a, S., Herrera, F.: Enhancing evolutionary instance selection
algorithms by means of fuzzy rough set based feature selection. Information Sciences 186(1),
73–92 (2012)
52. Dey, P., Dey, S., Datta, S., Sil, J.: Dynamic discreduction using rough sets. Applied Soft
Computing 11(5), 3887–3897 (2011)
53. Dougherty, J., Kohavi, R., Sahami, M., et al.: Supervised and unsupervised discretization of
continuous features. In: Machine learning: proceedings of the twelfth international confer-
ence, vol. 12, pp. 194–202 (1995)
54. Dubois, D., Prade, H.: Twofold fuzzy sets and rough sets – some issues in knowledge representation. Fuzzy Sets and Systems 23(1), 3–18 (1987)
55. Dubois, D., Prade, H.: Rough fuzzy sets and fuzzy rough sets. International Journal of
General System 17(2-3), 191–209 (1990)
56. Falcon, R., Jeon, G., Bello, R., Jeong, J.: Rough clustering with partial supervision. In:
Rough Set Theory: A True Landmark in Data Analysis, pp. 137–161. Springer (2009)
57. Falcon, R., Nayak, A., Abielmona, R.: An Online Shadowed Clustering Algorithm Applied
to Risk Visualization in Territorial Security. In: IEEE Symposium on Computational Intelli-
gence for Security and Defense Applications (CISDA), pp. 1–8. Ottawa, Canada (2012)
58. Fan, Y.N., Chern, C.C.: An agent model for incremental rough set-based rule induction: a
Big Data analysis in sales promotion. In: System Sciences (HICSS), 2013 46th Hawaii
International Conference on, pp. 985–994. IEEE (2013)
59. Fan, Y.N., Tseng, T.L.B., Chern, C.C., Huang, C.C.: Rule induction based on an incremental
rough set. Expert Systems with Applications 36(9), 11,439–11,450 (2009)
60. Fernández, A., del Rı́o, S., López, V., Bawakid, A., del Jesus, M.J., Benı́tez, J.M., Herrera, F.:
Big data with cloud computing: an insight on the computing environment, mapreduce, and
programming frameworks. Wiley Interdisciplinary Reviews: Data Mining and Knowledge
Discovery 4(5), 380–409 (2014)
61. Filiberto, Y., Caballero, Y., Larrua, R., Bello, R.: A method to build similarity relations into
extended rough set theory. In: 2010 10th International Conference on Intelligent Systems
Design and Applications, pp. 1314–1319. IEEE (2010)
62. Filiberto Cabrera, Y., Caballero Mota, Y., Bello Pérez, R., Frías, M.: Algoritmo para el aprendizaje de reglas de clasificación basado en la teoría de los conjuntos aproximados extendida [An algorithm for learning classification rules based on extended rough set theory]. DYNA 78(169), 62–70 (2011)
63. Gogoi, P., Bhattacharyya, D.K., Kalita, J.K.: A rough set-based effective rule generation
method for classification with an application in intrusion detection. International Journal of
Security and Networks 8(2), 61–71 (2013)
64. Gómez, Y., Bello, R., Puris, A., Garcia, M.M., Nowe, A.: Two step swarm intelligence to
solve the feature selection problem. J. UCS 14(15), 2582–2596 (2008)
65. Greco, S., Matarazzo, B., Słowiński, R.: Parameterized rough set model using rough member-
ship and bayesian confirmation measures. International Journal of Approximate Reasoning
49(2), 285–300 (2008)
66. Greco, S., Słowiński, R., Stefanowski, J., Żurawski, M.: Incremental versus non-incremental
rule induction for multicriteria classification. In: Transactions on Rough Sets II, pp. 33–53.
Springer (2004)
67. Grzymala-Busse, J.W.: LERS - a system for learning from examples based on rough sets. In:
Intelligent decision support, pp. 3–18. Springer (1992)
68. Grzymała-Busse, J.W.: Characteristic relations for incomplete data: A generalization of the
indiscernibility relation. In: International Conference on Rough Sets and Current Trends in
Computing, pp. 244–253. Springer (2004)
69. Grzymala-Busse, J.W., Grzymala-Busse, W.J.: Inducing better rule sets by adding missing
attribute values. In: International Conference on Rough Sets and Current Trends in Comput-
ing, pp. 160–169. Springer (2008)
70. Guan, J., Bell, D.A., Liu, D.: The rough set approach to association rule mining. In: Data
Mining, 2003. ICDM 2003. Third IEEE International Conference on, pp. 529–532. IEEE
(2003)
71. Hartigan, J.A., Wong, M.A.: Algorithm as 136: A k-means clustering algorithm. Journal of
the Royal Statistical Society. Series C (Applied Statistics) 28(1), 100–108 (1979)
72. Hassanein, W., Elmelegy, A.A.: An algorithm for selecting clustering attribute using signif-
icance of attributes. International Journal of Database Theory and Application 6(5), 53–66
(2013)
73. He, H., Chen, S., Li, K., Xu, X.: Incremental learning from stream data. IEEE Transactions
on Neural Networks 22(12), 1901–1914 (2011)
74. He, H., Min, F., Zhu, W.: Attribute reduction in test-cost-sensitive decision systems with
common-test-costs. In: Proceedings of the 3rd International Conference on Machine Learn-
ing and Computing, vol. 1, pp. 432–436 (2011)
75. He, Q., Wu, C., Chen, D., Zhao, S.: Fuzzy rough set based attribute reduction for information
systems with fuzzy decisions. Knowledge-Based Systems 24(5), 689–696 (2011)
76. Herawan, T.: Rough set approach for categorical data clustering. Ph.D. thesis, Universiti Tun
Hussein Onn Malaysia (2010)
77. Herawan, T., Deris, M.M., Abawajy, J.H.: A rough set approach for selecting clustering at-
tribute. Knowledge-Based Systems 23(3), 220–231 (2010)
78. Hirano, S., Tsumoto, S.: Rough clustering and its application to medicine. Journal of Infor-
mation Science 124, 125–137 (2000)
79. Ho, T.K., Basu, M.: Complexity measures of supervised classification problems. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(3), 289–300 (2002)
80. Hong, T.P., Tseng, L.H., Wang, S.L.: Learning rules from incomplete training examples by
rough sets. Expert Systems with Applications 22(4), 285–293 (2002)
81. Hu, B.Q.: Three-way decisions space and three-way decisions. Information Sciences 281,
21–52 (2014)
82. Hu, F., Wang, G., Huang, H., Wu, Y.: Incremental attribute reduction based on elementary
sets. In: International Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft
Computing, pp. 185–193. Springer (2005)
83. Hu, H., Shi, Z.: Machine learning as granular computing. In: Granular Computing, 2009,
GRC’09. IEEE International Conference on, pp. 229–234. IEEE (2009)
84. Hu, Q., Che, X., Zhang, L., Zhang, D., Guo, M., Yu, D.: Rank entropy-based decision trees
for monotonic classification. IEEE Transactions on Knowledge and Data Engineering 24(11),
2052–2064 (2012)
85. Hu, Q., Liu, J., Yu, D.: Mixed feature selection based on granulation and approximation.
Knowledge-Based Systems 21(4), 294–304 (2008)
86. Hu, Q., Pan, W., Zhang, L., Zhang, D., Song, Y., Guo, M., Yu, D.: Feature selection for
monotonic classification. IEEE Transactions on Fuzzy Systems 20(1), 69–81 (2012)
87. Hu, Q., Xie, Z., Yu, D.: Hybrid attribute reduction based on a novel fuzzy-rough model and
information granulation. Pattern Recognition 40(12), 3509–3521 (2007)
88. Hu, Q., Yu, D.: An improved clustering algorithm for information granulation. In: Inter-
national Conference on Fuzzy Systems and Knowledge Discovery, pp. 494–504. Springer
(2005)
89. Hu, Q., Yu, D., Liu, J., Wu, C.: Neighborhood rough set based heterogeneous feature subset
selection. Information Sciences 178(18), 3577–3594 (2008)
90. Hu, Q., Yu, D., Xie, Z.: Information-preserving hybrid data reduction based on fuzzy-rough
techniques. Pattern Recognition Letters 27(5), 414–423 (2006)
91. Hu, Q., Yu, D., Xie, Z., Liu, J.: Fuzzy probabilistic approximation spaces and their information measures. IEEE Transactions on Fuzzy Systems 14(2), 191–201 (2006)
92. Hu, Q., Zhang, L., An, S., Zhang, D., Yu, D.: On robust fuzzy rough set models. IEEE Transactions on Fuzzy Systems 20(4), 636–651 (2012)
93. Huang, C.C., Tseng, T.L.B., Fan, Y.N., Hsu, C.H.: Alternative rule induction methods based
on incremental object using rough set theory. Applied Soft Computing 13(1), 372–389 (2013)
94. Huang, Z., Hu, Y.Q.: Applying ai technology and rough set theory to mine association rules
for supporting knowledge management. In: Machine Learning and Cybernetics, 2003 Inter-
national Conference on, vol. 3, pp. 1820–1825. IEEE (2003)
95. Hüllermeier, E.: Granular computing in machine learning and data mining. Handbook of
Granular Computing pp. 889–906 (2008)
96. Jain, A.K., Murty, M.N., Flynn, P.J.: Data clustering: a review. ACM Computing Surveys (CSUR) 31(3), 264–323 (1999)
97. Janusz, A., Slezak, D.: Rough set methods for attribute clustering and selection. Applied
Artificial Intelligence 28(3), 220–242 (2014)
98. Janusz, A., Stawicki, S.: Applications of approximate reducts to the feature selection prob-
lem. In: International Conference on Rough Sets and Knowledge Technology, pp. 45–50.
Springer (2011)
99. Jensen, R., Cornelis, C.: Fuzzy-rough instance selection. In: Fuzzy Systems (FUZZ), 2010
IEEE International Conference on, pp. 1–7. IEEE (2010)
100. Jensen, R., Cornelis, C., Shen, Q.: Hybrid fuzzy-rough rule induction and feature selection.
In: Fuzzy Systems, 2009. FUZZ-IEEE 2009. IEEE International Conference on, pp. 1151–
1156. IEEE (2009)
101. Jensen, R., Shen, Q.: Fuzzy-rough sets for descriptive dimensionality reduction. In: Fuzzy
Systems, 2002. FUZZ-IEEE’02. Proceedings of the 2002 IEEE International Conference on,
vol. 1, pp. 29–34. IEEE (2002)
102. Jensen, R., Shen, Q.: Finding rough set reducts with ant colony optimization. In: Proceedings
of the 2003 UK workshop on computational intelligence, vol. 1, pp. 15–22 (2003)
103. Jensen, R., Shen, Q.: Fuzzy–rough attribute reduction with application to web categorization.
Fuzzy Sets and Systems 141(3), 469–485 (2004)
104. Jensen, R., Shen, Q.: Semantics-preserving dimensionality reduction: rough and fuzzy-
rough-based approaches. IEEE Transactions on knowledge and data engineering 16(12),
1457–1471 (2004)
105. Jensen, R., Shen, Q.: New approaches to fuzzy-rough feature selection. IEEE Transactions
on Fuzzy Systems 17(4), 824–838 (2009)
106. Jia, X., Liao, W., Tang, Z., Shang, L.: Minimum cost attribute reduction in decision-theoretic
rough set models. Information Sciences 219, 151–167 (2013)
107. Jia, X., Liao, W., Tang, Z., Shang, L.: Minimum cost attribute reduction in decision-theoretic
rough set models. Information Sciences 219, 151–167 (2013)
108. Jia, X., Shang, L., Zhou, B., Yao, Y.: Generalized attribute reduct in rough set theory.
Knowledge-Based Systems 91, 204–218 (2016)
109. Jiang, F., Sui, Y., Cao, C.: Outlier detection based on rough membership function. In: Inter-
national Conference on Rough Sets and Current Trends in Computing, pp. 388–397. Springer
(2006)
110. Jiang, F., Sui, Y., Cao, C.: Some issues about outlier detection in rough set theory. Expert
Systems with Applications 36(3), 4680–4687 (2009)
111. Jiang, Y.C., Liu, Y.Z., Liu, X., Zhang, J.K.: Constructing associative classifier using rough
sets and evidence theory. In: International Workshop on Rough Sets, Fuzzy Sets, Data Min-
ing, and Granular-Soft Computing, pp. 263–271. Springer (2007)
112. Jiao, X., Lian-cheng, X., Lin, Q.: Association rules mining algorithm based on rough set. In: International Symposium on Information Technology in Medicine and Education (2012)
113. Joshi, P., Kulkarni, P.: Incremental learning: areas and methods - a survey. International
Journal of Data Mining & Knowledge Management Process 2(5), 43 (2012)
114. Ju, H., Yang, X., Song, X., Qi, Y.: Dynamic updating multigranulation fuzzy rough set: ap-
proximations and reducts. International Journal of Machine Learning and Cybernetics 5(6),
981–990 (2014)
115. Ju, H., Yang, X., Yang, P., Li, H., Zhou, X.: A moderate attribute reduction approach in
decision-theoretic rough set. In: Rough Sets, Fuzzy Sets, Data Mining, and Granular Com-
puting, pp. 376–388. Springer (2015)
116. Ju, H., Yang, X., Yu, H., Li, T., Yu, D.J., Yang, J.: Cost-sensitive rough set approach. Infor-
mation Sciences 355, 282–298 (2016)
117. Jun, Z., Zhou, Y.h.: New heuristic method for data discretization based on rough set theory.
The Journal of China Universities of Posts and Telecommunications 16(6), 113–120 (2009)
118. Kambatla, K., Kollias, G., Kumar, V., Grama, A.: Trends in big data analytics. Journal of
Parallel and Distributed Computing 74(7), 2561–2573 (2014)
119. Kaneiwa, K.: A rough set approach to mining connections from information systems. In: Pro-
ceedings of the 2010 ACM Symposium on Applied Computing, pp. 990–996. ACM (2010)
120. Ke, L., Feng, Z., Ren, Z.: An efficient ant colony optimization approach to attribute reduction
in rough set theory. Pattern Recognition Letters 29(9), 1351–1357 (2008)
121. Komorowski, J., Pawlak, Z., Polkowski, L., Skowron, A.: A rough set perspective on data
and knowledge. The handbook of data mining and knowledge discovery. Oxford University
Press, Oxford (1999)
122. Kryszkiewicz, M.: Rough set approach to incomplete information systems. Information Sciences 112(1), 39–49 (1998)
123. Kumar, P., Krishna, P.R., Bapi, R.S., De, S.K.: Rough clustering of sequential data. Data &
Knowledge Engineering 63(2), 183–199 (2007)
124. Kumar, P., Vadakkepat, P., Poh, L.A.: Fuzzy-rough discriminative feature selection and clas-
sification algorithm, with application to microarray and image datasets. Applied Soft Com-
puting 11(4), 3429–3440 (2011)
125. Kumar, P., Wasan, S.K.: Comparative study of k-means, pam and rough k-means algorithms
using cancer datasets. In: Proceedings of CSIT: 2009 International Symposium on Comput-
ing, Communication, and Control (ISCCC 2009), vol. 1, pp. 136–140 (2011)
126. Kuncheva, L.I.: Fuzzy rough sets: application to feature selection. Fuzzy Sets and Systems
51(2), 147–153 (1992)
127. Lai, J.Z., Juan, E.Y., Lai, F.J.: Rough clustering using generalized fuzzy clustering algorithm.
Pattern Recognition 46(9), 2538–2547 (2013)
128. Lee, S.C., Huang, M.J.: Applying ai technology and rough set theory for mining associa-
tion rules to support crime management and fire-fighting resources allocation. Journal of
Information, Technology and Society 2(65), 65–78 (2002)
129. Lenarcik, A., Piasta, Z.: Discretization of condition attributes space. In: Intelligent Decision
Support, pp. 373–389. Springer (1992)
130. Leung, Y., Fischer, M.M., Wu, W.Z., Mi, J.S.: A rough set approach for the discovery of clas-
sification rules in interval-valued information systems. International Journal of Approximate
Reasoning 47(2), 233–246 (2008)
131. Li, F., Ye, M., Chen, X.: An extension to rough c-means clustering based on decision-
theoretic rough sets model. International Journal of Approximate Reasoning 55(1), 116–129
(2014)
132. Li, H., Li, D., Zhai, Y., Wang, S., Zhang, J.: A variable precision attribute reduction approach
in multilabel decision tables. The Scientific World Journal 2014 (2014)
133. Li, H., Zhang, L., Huang, B., Zhou, X.: Sequential three-way decision and granulation for
cost-sensitive face recognition. Knowledge-Based Systems 91, 241–251 (2016)
134. Li, H., Zhou, X., Zhao, J., Liu, D.: Non-monotonic attribute reduction in decision-theoretic
rough sets. Fundamenta Informaticae 126(4), 415–432 (2013)
135. Li, J., Cercone, N.: A rough set based model to rank the importance of association rules. In:
International Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft Com-
puting, pp. 109–118. Springer (2005)
136. Li, M., Deng, S., Wang, L., Feng, S., Fan, J.: Hierarchical clustering algorithm for categorical
data using a probabilistic rough set model. Knowledge-Based Systems 65, 60–71 (2014)
137. Li, M., Shang, C., Feng, S., Fan, J.: Quick attribute reduction in inconsistent decision tables.
Information Sciences 254, 155–180 (2014)
138. Li, S., Li, T., Liu, D.: Dynamic maintenance of approximations in dominance-based rough
set approach under the variation of the object set. International Journal of Intelligent Systems
28(8), 729–751 (2013)
139. Li, S., Li, T., Liu, D.: Incremental updating approximations in dominance-based rough sets
approach under the variation of the attribute set. Knowledge-Based Systems 40, 17–26
(2013)
140. Li, T., Ruan, D., Geert, W., Song, J., Xu, Y.: A rough sets based characteristic relation
approach for dynamic attribute generalization in data mining. Knowledge-Based Systems
20(5), 485–494 (2007)
141. Liang, J., Wang, F., Dang, C., Qian, Y.: A group incremental approach to feature selection
applying rough set technique. IEEE Transactions on Knowledge and Data Engineering 26(2),
294–308 (2014)
142. Lin, T.Y., Yao, Y.Y., Zadeh, L.A.: Data mining, rough sets and granular computing, vol. 95.
Physica (2013)
143. Lingras, P.: Unsupervised rough set classification using gas. Journal of Intelligent Informa-
tion Systems 16(3), 215–228 (2001)
144. Lingras, P., Chen, M., Miao, D.: Rough cluster quality index based on decision theory. IEEE
Transactions on Knowledge and Data Engineering 21(7), 1014–1026 (2009)
145. Lingras, P., Chen, M., Miao, D.: Qualitative and quantitative combinations of crisp and rough
clustering schemes using dominance relations. International Journal of Approximate Rea-
soning 55(1), 238–258 (2014)
146. Lingras, P., West, C.: Interval set clustering of web users with rough k-means. Journal of
Intelligent Information Systems 23(1), 5–16 (2004)
147. Liu, D., Li, T., Liu, G., Hu, P.: An approach for inducing interesting incremental knowledge
based on the change of attribute values. In: Granular Computing, 2009, GRC’09. IEEE
International Conference on, pp. 415–418. IEEE (2009)
148. Liu, D., Li, T., Ruan, D., Zhang, J.: Incremental learning optimization on knowledge discov-
ery in dynamic business intelligent systems. Journal of Global Optimization 51(2), 325–344
(2011)
149. Liu, D., Li, T., Ruan, D., Zou, W.: An incremental approach for inducing knowledge from
dynamic information systems. Fundamenta Informaticae 94(2), 245–260 (2009)
150. Liu, D., Li, T., Zhang, J.: A rough set-based incremental approach for learning knowledge in
dynamic incomplete information systems. International Journal of Approximate Reasoning
55(8), 1764–1786 (2014)
151. Liu, D., Li, T., Zhang, J.: A rough set-based incremental approach for learning knowledge in
dynamic incomplete information systems. International Journal of Approximate Reasoning
55(8), 1764–1786 (2014)
152. Liu, D., Li, T., Zhang, J.: Incremental updating approximations in probabilistic rough sets
under the variation of attributes. Knowledge-Based Systems 73, 81–96 (2015)
153. Liu, D., Liang, D.: Incremental learning researches on rough set theory: status and future.
International Journal of Rough Sets and Data Analysis (IJRSDA) 1(1), 99–112 (2014)
154. Liu, J., Hu, Q., Yu, D.: A comparative study on rough set based class imbalance learning.
Knowledge-Based Systems 21(8), 753–763 (2008)
155. Liu, J., Hu, Q., Yu, D.: A weighted rough set based method developed for class imbalance
learning. Information Sciences 178(4), 1235–1256 (2008)
156. Liu, Y., Xu, C., Zhang, Q., Pan, Y.: Rough rule extracting from various conditions: Incre-
mental and approximate approaches for inconsistent data. Fundamenta Informaticae 84(3,
4), 403–427 (2008)
157. Lu, J., Tan, Y.P.: Cost-sensitive subspace analysis and extensions for face recognition. IEEE
Transactions on Information Forensics and Security 8(3), 510–519 (2013)
158. Luo, C., Li, T., Chen, H., Liu, D.: Incremental approaches for updating approximations in
set-valued ordered information systems. Knowledge-Based Systems 50, 218–233 (2013)
159. Luo, C., Li, T., Yi, Z., Fujita, H.: Matrix approach to decision-theoretic rough sets for evolv-
ing data. Knowledge-Based Systems 99, 123–134 (2016)
160. Ma, T., Tang, M.: Weighted rough set model. In: Sixth International Conference on Intelli-
gent Systems Design and Applications, vol. 1, pp. 481–485. IEEE (2006)
161. Maji, P., Garai, P.: Fuzzy–rough simultaneous attribute selection and feature extraction algo-
rithm. IEEE Transactions on Cybernetics 43(4), 1166–1177 (2013)
162. Maji, P., Pal, S.K.: Rfcm: A hybrid clustering algorithm using rough and fuzzy sets. Funda-
menta Informaticae 80(4), 475–496 (2007)
163. Mak, B., Munakata, T.: Rule extraction from expert heuristics: A comparative study of rough
sets with neural networks and id3. European Journal of Operational Research 136(1), 212–
229 (2002)
164. Miao, D., Chen, M., Wei, Z., Duan, Q.: A reasonable rough approximation for clustering
web users. In: International Workshop on Web Intelligence Meets Brain Informatics, pp.
428–442. Springer (2006)
165. Min, F., He, H., Qian, Y., Zhu, W.: Test-cost-sensitive attribute reduction. Information Sci-
ences 181(22), 4928–4942 (2011)
166. Min, F., Hu, Q., Zhu, W.: Feature selection with test cost constraint. International Journal of
Approximate Reasoning 55(1), 167–179 (2014)
167. Min, F., Liu, Q.: A hierarchical model for test-cost-sensitive decision systems. Information
Sciences 179(14), 2442–2452 (2009)
168. Min, F., Zhu, W.: Attribute reduction of data with error ranges and test costs. Information
Sciences 211, 48–67 (2012)
169. Mirkin, B.: Mathematical classification and clustering: From how to what and why. In:
Classification, data analysis, and data highways, pp. 172–181. Springer (1998)
170. Mitra, S.: An evolutionary rough partitive clustering. Pattern Recognition Letters 25(12),
1439–1449 (2004)
171. Mitra, S., Banka, H.: Application of rough sets in pattern recognition. In: Transactions on
rough sets VII, pp. 151–169. Springer (2007)
172. Mitra, S., Banka, H., Pedrycz, W.: Rough-fuzzy collaborative clustering. IEEE Transactions
on Systems, Man, and Cybernetics, Part B (Cybernetics) 36(4), 795–805 (2006)
173. Mitra, S., Barman, B.: Rough-fuzzy clustering: an application to medical imagery. In: In-
ternational Conference on Rough Sets and Knowledge Technology, pp. 300–307. Springer
(2008)
174. Nanda, S., Majumdar, S.: Fuzzy rough sets. Fuzzy Sets and Systems 45(2), 157–160 (1992)
175. Nguyen, H.S.: Discretization problem for rough sets methods. In: International Conference
on Rough Sets and Current Trends in Computing, pp. 545–552. Springer (1998)
176. Nguyen, H.S.: On efficient handling of continuous attributes in large data bases. Fundamenta
Informaticae 48(1), 61–81 (2001)
177. Orlowska, E.: Dynamic information systems. Institute of Computer Science, Polish Academy
of Sciences (1981)
178. Ozawa, S., Pang, S., Kasabov, N.: Incremental learning of chunk data for online pattern
classification systems. IEEE Transactions on Neural Networks 19(6), 1061–1074 (2008)
179. Park, I.K., Choi, G.S.: Rough set approach for clustering categorical data using information-
theoretic dependency measure. Information Systems 48, 289–295 (2015)
180. Parmar, D., Wu, T., Blackhurst, J.: Mmr: an algorithm for clustering categorical data using
rough set theory. Data & Knowledge Engineering 63(3), 879–893 (2007)
181. Pawlak, Z.: Rough sets. International Journal of Computer & Information Sciences 11(5),
341–356 (1982)
182. Pawlak, Z.: Rough sets and intelligent data analysis. Information Sciences 147(1), 1–12
(2002)
183. Pawlak, Z., Skowron, A.: Rough sets: some extensions. Information Sciences 177(1), 28 –
40 (2007)
184. Pawlak, Z., Wong, S.K.M., Ziarko, W.: Rough sets: probabilistic versus deterministic ap-
proach. International Journal of Man-Machine Studies 29(1), 81–95 (1988)
185. Pedrycz, W.: Granular computing: an emerging paradigm, vol. 70. Springer Science & Busi-
ness Media (2001)
186. Peters, G.: Outliers in rough k-means clustering. In: International Conference on Pattern
Recognition and Machine Intelligence, pp. 702–707. Springer (2005)
187. Peters, G.: Some refinements of rough k-means clustering. Pattern Recognition 39(8), 1481–
1491 (2006)
188. Peters, G.: Rough clustering utilizing the principle of indifference. Information Sciences
277, 358–374 (2014)
189. Peters, G.: Is there any need for rough clustering? Pattern Recognition Letters 53, 31–37
(2015)
190. Peters, G., Crespo, F., Lingras, P., Weber, R.: Soft clustering–fuzzy and rough approaches
and their extensions and derivatives. International Journal of Approximate Reasoning 54(2),
307–322 (2013)
191. Peters, G., Lampart, M., Weber, R.: Evolutionary rough k-medoid clustering. In: Transactions
on rough sets VIII, pp. 289–306. Springer (2008)
192. Peters, G., Weber, R., Nowatzke, R.: Dynamic rough clustering and its applications. Applied
Soft Computing 12(10), 3193–3207 (2012)
193. Pradeepa, A., Selvadoss Thanamani, A.: Hadoop file system and fundamental concept of MapReduce interior and closure rough set approximations. International Journal of Advanced Research in Computer and Communication Engineering 2 (2013)
194. do Prado, H.A., Engel, P.M., Chaib Filho, H.: Rough clustering: An alternative to find mean-
ingful clusters by using the reducts from a dataset. In: International Conference on Rough
Sets and Current Trends in Computing, pp. 234–238. Springer (2002)
195. Qian, Y., Wang, Q., Cheng, H., Liang, J., Dang, C.: Fuzzy-rough feature selection accelerator.
Fuzzy Sets and Systems 258, 61–78 (2015)
196. Ramentol, E., Caballero, Y., Bello, R., Herrera, F.: SMOTE-RSB*: a hybrid preprocessing ap-
proach based on oversampling and undersampling for high imbalanced data-sets using SMOTE
and rough sets theory. Knowledge and Information Systems 33(2), 245–265 (2012)
197. Riza, L.S., Janusz, A., Bergmeir, C., Cornelis, C., Herrera, F., Ślęzak, D., Benítez, J.M.:
Implementing algorithms of rough set theory and fuzzy rough set theory in the R package
“roughsets”. Information Sciences 287, 68–89 (2014)
198. Salamó, M., López-Sánchez, M.: Rough set based approaches to feature selection for case-
based reasoning classifiers. Pattern Recognition Letters 32(2), 280–292 (2011)
199. Salido, J.F., Murakami, S.: Rough set analysis of a general type of fuzzy data using transitive
aggregations of fuzzy similarity relations. Fuzzy Sets and Systems 139(3), 635–660 (2003)
200. Schapire, R.E., Singer, Y.: BoosTexter: A boosting-based system for text categorization.
Machine Learning 39(2-3), 135–168 (2000)
201. Shan, N., Ziarko, W.: Data-based acquisition and incremental modification of classification
rules. Computational Intelligence 11(2), 357–370 (1995)
202. Shen, F., Yu, H., Kamiya, Y., Hasegawa, O.: An online incremental semi-supervised learning
method. JACIII 14(6), 593–605 (2010)
203. Shen, Q., Chouchoulas, A.: Combining rough sets and data-driven fuzzy learning for gener-
ation of classification rules. Pattern Recognition 32(12), 2073–2076 (1999)
204. Shen, Q., Chouchoulas, A.: A modular approach to generating fuzzy rules with reduced
attributes for the monitoring of complex systems. Engineering Applications of Artificial
Intelligence 13(3), 263–278 (2000)
205. Shen, Q., Jensen, R.: Selecting informative features with fuzzy-rough sets and its application
for complex systems monitoring. Pattern Recognition 37(7), 1351–1363 (2004)
206. Shu, W., Shen, H.: Incremental feature selection based on rough set in dynamic incomplete
data. Pattern Recognition 47(12), 3890–3906 (2014)
207. Singh, G.K., Minz, S.: Discretization using clustering and rough set theory. In: Computing:
Theory and Applications, 2007. ICCTA’07. International Conference on, pp. 330–336. IEEE
(2007)
208. Skowron, A., Rauszer, C.: The discernibility matrices and functions in information systems.
In: Intelligent Decision Support, pp. 331–362. Springer (1992)
209. Skowron, A., Stepaniuk, J.: Tolerance approximation spaces. Fundamenta Informaticae
27(2-3), 245–253 (1996)
210. Ślęzak, D.: Approximate Bayesian networks. In: Technologies for Constructing Intelligent
Systems 2, pp. 313–325. Springer (2002)
211. Ślęzak, D.: Approximate entropy reducts. Fundamenta Informaticae 53(3-4), 365–390 (2002)
212. Ślęzak, D., Ziarko, W.: The investigation of the Bayesian rough set model. International
Journal of Approximate Reasoning 40(1), 81–91 (2005)
213. Slimani, T.: Class association rules mining based rough set method. arXiv preprint
arXiv:1509.05437 (2015)
214. Słowiński, R., Vanderpooten, D.: A generalized definition of rough approximations
based on similarity. IEEE Transactions on Knowledge and Data Engineering 12(2), 331–336
(2000)
215. Soni, R., Nanda, R.: Neighborhood clustering of web users with rough k-means. In: Pro-
ceedings of the 6th WSEAS International Conference on Circuits, Systems, Electronics, Control
& Signal Processing, pp. 570–574 (2007)
216. Stefanowski, J.: The rough set based rule induction technique for classification problems. In:
Proceedings of the 6th European Conference on Intelligent Techniques and Soft Computing
(EUFIT '98) (1998)
217. Stefanowski, J.: On combined classifiers, rule induction and rough sets. In: Transactions on
rough sets VI, pp. 329–350. Springer (2007)
218. Stefanowski, J., Vanderpooten, D.: Induction of decision rules in classification and discovery-
oriented perspectives. International Journal of Intelligent Systems 16(1), 13–27 (2001)
219. Stefanowski, J., Wilk, S.: Rough sets for handling imbalanced data: combining filtering and
rule-based classifiers. Fundamenta Informaticae 72(1-3), 379–391 (2006)
220. Stefanowski, J., Wilk, S.: Extending rule-based classifiers to improve recognition of imbal-
anced classes. In: Advances in Data Management, pp. 131–154. Springer (2009)
221. Su, C.T., Hsu, J.H.: An extended Chi2 algorithm for discretization of real value attributes.
IEEE Transactions on Knowledge and Data Engineering 17(3), 437–441 (2005)
222. Su, C.T., Hsu, J.H.: Precision parameter in the variable precision rough sets model: an appli-
cation. Omega 34(2), 149–157 (2006)
223. Susmaga, R.: Reducts and constructs in classic and dominance-based rough sets approach.
Information Sciences 271, 45–64 (2014)
224. Świniarski, R.W.: Rough sets methods in feature reduction and classification. International
Journal of Applied Mathematics and Computer Science 11(3), 565–582 (2001)
225. Świniarski, R.W., Skowron, A.: Rough set methods in feature selection and recognition. Pat-
tern Recognition Letters 24(6), 833–849 (2003)
226. Tay, F.E., Shen, L.: Economic and financial prediction using rough sets model. European
Journal of Operational Research 141(3), 641–659 (2002)
227. Tsang, E.C., Chen, D., Yeung, D.S., Wang, X.Z., Lee, J.W.: Attributes reduction using fuzzy
rough sets. IEEE Transactions on Fuzzy Systems 16(5), 1130–1141 (2008)
228. Tsoumakas, G., Katakis, I.: Multi-label classification: An overview. Dept. of Informatics,
Aristotle University of Thessaloniki, Greece (2006)
229. Tsoumakas, G., Vlahavas, I.: Random k-labelsets: An ensemble method for multilabel clas-
sification. In: European Conference on Machine Learning, pp. 406–417. Springer (2007)
230. Tsumoto, S.: Automated extraction of medical expert system rules from clinical databases
based on rough set theory. Information Sciences 112(1), 67–84 (1998)
231. Tsumoto, S.: Automated extraction of hierarchical decision rules from clinical databases
using rough set model. Expert Systems with Applications 24(2), 189–197 (2003)
232. Tsumoto, S.: Incremental rule induction based on rough set theory. In: International Sympo-
sium on Methodologies for Intelligent Systems, pp. 70–79. Springer (2011)
233. Vanderpooten, D.: Similarity relation as a basis for rough approximations. Advances in
Machine Intelligence and Soft Computing 4, 17–33 (1997)
234. Verbiest, N.: Fuzzy rough and evolutionary approaches to instance selection. Ph.D. thesis,
Ghent University (2014)
235. Verbiest, N., Cornelis, C., Herrera, F.: FRPS: a fuzzy rough prototype selection method. Pat-
tern Recognition 46(10), 2770–2782 (2013)
236. Vilalta, R., Drissi, Y.: A perspective view and survey of meta-learning. Artificial Intelligence
Review 18(2), 77–95 (2002)
237. Voges, K., Pope, N., Brown, M.: A rough cluster analysis of shopping orientation data. In:
Proceedings Australian and New Zealand Marketing Academy Conference, Adelaide, pp.
1625–1631 (2003)
238. Voges, K.E., Pope, N., Brown, M.R.: Cluster analysis of marketing data examining on-line
shopping orientation: A comparison of k-means and rough clustering approaches. Heuristics
and Optimization for Knowledge Discovery pp. 207–224 (2002)
239. Wang, F., Liang, J., Dang, C.: Attribute reduction for dynamic data sets. Applied Soft Com-
puting 13(1), 676–689 (2013)
240. Wang, F., Liang, J., Qian, Y.: Attribute reduction: a dimension incremental strategy.
Knowledge-Based Systems 39, 95–108 (2013)
241. Wang, G., Yu, H., Li, T., et al.: Decision region distribution preservation reduction in
decision-theoretic rough set model. Information Sciences 278, 614–640 (2014)
242. Wang, X., An, S., Shi, H., Hu, Q.: Fuzzy rough decision trees for multi-label classification.
In: Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, pp. 207–217. Springer
(2015)
243. Wang, X., Yang, J., Peng, N., Teng, X.: Finding minimal rough set reducts with particle
swarm optimization. In: International Workshop on Rough Sets, Fuzzy Sets, Data Mining,
and Granular-Soft Computing, pp. 451–460. Springer (2005)
244. Wang, X., Yang, J., Teng, X., Xia, W., Jensen, R.: Feature selection based on rough sets and
particle swarm optimization. Pattern Recognition Letters 28(4), 459–471 (2007)
245. Wei, M.H., Cheng, C.H., Huang, C.S., Chiang, P.C.: Discovering medical quality of total
hip arthroplasty by rough set classifier with imbalanced class. Quality & Quantity 47(3),
1761–1779 (2013)
246. Wojna, A.: Constraint based incremental learning of classification rules. In: International
Conference on Rough Sets and Current Trends in Computing, pp. 428–435. Springer (2000)
247. Wróblewski, J.: Finding minimal reducts using genetic algorithms. In: Proceedings of the
Second Annual Joint Conference on Information Sciences, pp. 186–189 (1995)
248. Wróblewski, J.: Theoretical foundations of order-based genetic algorithms. Fundamenta
Informaticae 28(3-4), 423–430 (1996)
249. Wróblewski, J.: Ensembles of classifiers based on approximate reducts. Fundamenta Infor-
maticae 47(3-4), 351–360 (2001)
250. Wu, Q., Bell, D.: Multi-knowledge extraction and application. In: International Workshop on
Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft Computing, pp. 274–278. Springer
(2003)
251. Xie, H., Cheng, H.Z., Niu, D.X.: Discretization of continuous attributes in rough set the-
ory based on information entropy. Chinese Journal of Computers (Chinese Edition) 28(9),
1570 (2005)
252. Xu, Y., Wang, L., Zhang, R.: A dynamic attribute reduction algorithm based on 0-1 integer
programming. Knowledge-Based Systems 24(8), 1341–1347 (2011)
253. Xu, Z., Liang, J., Dang, C., Chin, K.: Inclusion degree: a perspective on measures for rough
set data analysis. Information Sciences 141(3), 227–236 (2002)
254. Yang, Q., Ling, C., Chai, X., Pan, R.: Test-cost sensitive classification on data with missing
values. IEEE Transactions on Knowledge and Data Engineering 18(5), 626–638 (2006)
255. Yang, X., Qi, Y., Song, X., Yang, J.: Test cost sensitive multigranulation rough set: model
and minimal cost selection. Information Sciences 250, 184–199 (2013)
256. Yang, X., Qi, Y., Yu, H., Song, X., Yang, J.: Updating multigranulation rough approximations
with increasing of granular structures. Knowledge-Based Systems 64, 59–69 (2014)
257. Yang, Y., Chen, D., Dong, Z.: Novel algorithms of attribute reduction with variable precision
rough set model. Neurocomputing 139, 336–344 (2014)
258. Yang, Y., Chen, Z., Liang, Z., Wang, G.: Attribute reduction for massive data based on rough
set theory and mapreduce. In: International Conference on Rough Sets and Knowledge Tech-
nology, pp. 672–678. Springer (2010)
259. Yao, J., Yao, Y.: A granular computing approach to machine learning. FSKD 2, 732–736
(2002)
260. Yao, Y.: Combination of rough and fuzzy sets based on α-level sets. In: Rough Sets and Data
Mining, pp. 301–321. Springer (1997)
261. Yao, Y.: Decision-theoretic rough set models. In: International Conference on Rough Sets
and Knowledge Technology, pp. 1–12. Springer (2007)
262. Yao, Y.: Three-way decision: an interpretation of rules in rough set theory. In: International
Conference on Rough Sets and Knowledge Technology, pp. 642–649. Springer (2009)
263. Yao, Y.: Three-way decisions with probabilistic rough sets. Information Sciences 180(3),
341–353 (2010)
264. Yao, Y.: The superiority of three-way decisions in probabilistic rough set models. Information
Sciences 181(6), 1080–1096 (2011)
265. Yao, Y.: An outline of a theory of three-way decisions. In: International Conference on
Rough Sets and Current Trends in Computing, pp. 1–17. Springer (2012)
266. Yao, Y., Greco, S., Słowiński, R.: Probabilistic rough sets. In: Springer Handbook of Com-
putational Intelligence, pp. 387–411. Springer (2015)
267. Yao, Y., Zhao, Y.: Attribute reduction in decision-theoretic rough set models. Information
Sciences 178(17), 3356–3373 (2008)
268. Yao, Y., Zhao, Y., Maguire, R.B.: Explanation oriented association mining using rough set
theory. In: International Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-
Soft Computing, pp. 165–172. Springer (2003)
269. Yao, Y., Zhou, B.: Two Bayesian approaches to rough sets. European Journal of Operational
Research 251(3), 904–917 (2016)
270. Ye, D., Chen, Z., Ma, S.: A novel and better fitness evaluation for rough set based minimum
attribute reduction problem. Information Sciences 222, 413–423 (2013)
271. Yong, L., Congfu, X., Yunhe, P.: An incremental rule extracting algorithm based on Pawlak
reduction. In: Systems, Man and Cybernetics, 2004 IEEE International Conference on, vol. 6,
pp. 5964–5968. IEEE (2004)
272. Yong, L., Wenliang, H., Yunliang, J., Zhiyong, Z.: Quick attribute reduct algorithm for neigh-
borhood rough set model. Information Sciences 271, 65–81 (2014)
273. Yu, H., Chu, S., Yang, D.: Autonomous knowledge-oriented clustering using decision-
theoretic rough set theory. Fundamenta Informaticae 115(2-3), 141–156 (2012)
274. Yu, H., Liu, Z., Wang, G.: An automatic method to determine the number of clusters using
decision-theoretic rough set. International Journal of Approximate Reasoning 55(1), 101–
115 (2014)
275. Yu, H., Su, T., Zeng, X.: A three-way decisions clustering algorithm for incomplete data. In:
International Conference on Rough Sets and Knowledge Technology, pp. 765–776. Springer
(2014)
276. Yu, H., Wang, G., Lan, F.: Solving the attribute reduction problem with ant colony optimiza-
tion. In: Transactions on rough sets XIII, pp. 240–259. Springer (2011)
277. Yu, H., Wang, Y.: Three-way decisions method for overlapping clustering. In: International
Conference on Rough Sets and Current Trends in Computing, pp. 277–286. Springer (2012)
278. Yu, H., Wang, Y., Jiao, P.: A three-way decisions approach to density-based overlapping
clustering. In: Transactions on Rough Sets XVIII, pp. 92–109. Springer (2014)
279. Yu, H., Zhang, C., Hu, F.: An incremental clustering approach based on three-way decisions.
In: International Conference on Rough Sets and Current Trends in Computing, pp. 152–159.
Springer (2014)
280. Yu, Y., Miao, D., Zhang, Z., Wang, L.: Multi-label classification using rough sets. In: Inter-
national Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft Computing,
pp. 119–126. Springer (2013)
281. Yu, Y., Pedrycz, W., Miao, D.: Multi-label classification by exploiting label correlations.
Expert Systems with Applications 41(6), 2989–3004 (2014)
282. Zhai, J., Zhang, S., Zhang, Y.: An extension of rough fuzzy set. Journal of Intelligent &
Fuzzy Systems (Preprint), 1–10 (2016)
283. Zhai, J., Zhang, Y., Zhu, H.: Three-way decisions model based on tolerance rough fuzzy set.
International Journal of Machine Learning and Cybernetics pp. 1–9 (2016)
284. Zhang, H.R., Min, F.: Three-way recommender systems based on random forests.
Knowledge-Based Systems 91, 275–286 (2016)
285. Zhang, J., Li, T., Chen, H.: Composite rough sets. In: International Conference on Artificial
Intelligence and Computational Intelligence, pp. 150–159. Springer (2012)
286. Zhang, J., Li, T., Chen, H.: Composite rough sets for dynamic data mining. Information
Sciences 257, 81–100 (2014)
287. Zhang, J., Li, T., Pan, Y.: Parallel rough set based knowledge acquisition using mapreduce
from big data. In: Proceedings of the 1st International Workshop on Big Data, Streams and
Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applica-
tions, pp. 20–27. ACM (2012)
288. Zhang, J., Li, T., Ruan, D., Gao, Z., Zhao, C.: A parallel method for computing rough set
approximations. Information Sciences 194, 209–223 (2012)
289. Zhang, J., Li, T., Ruan, D., Liu, D.: Rough sets based matrix approaches with dynamic at-
tribute variation in set-valued information systems. International Journal of Approximate
Reasoning 53(4), 620–635 (2012)
290. Zhang, L., Hu, Q., Duan, J., Wang, X.: Multi-label feature selection with fuzzy rough sets. In:
International Conference on Rough Sets and Knowledge Technology, pp. 121–128. Springer
(2014)
291. Zhang, L., Li, H., Zhou, X., Huang, B., Shang, L.: Cost-sensitive sequential three-way deci-
sion for face recognition. In: International Conference on Rough Sets and Intelligent Systems
Paradigms, pp. 375–383. Springer (2014)
292. Zhang, M.L., Zhou, Z.H.: ML-KNN: A lazy learning approach to multi-label learning. Pattern
Recognition 40(7), 2038–2048 (2007)
293. Zhang, T., Chen, L., Ma, F.: An improved algorithm of rough k-means clustering based on
variable weighted distance measure. International Journal of Database Theory and Applica-
tion 7(6), 163–174 (2014)
294. Zhang, T., Chen, L., Ma, F.: A modified rough c-means clustering algorithm based on hybrid
imbalanced measure of distance and density. International Journal of Approximate Reason-
ing 55(8), 1805–1818 (2014)
295. Zhang, X., Miao, D.: Three-way weighted entropies and three-way attribute reduction. In:
International Conference on Rough Sets and Knowledge Technology, pp. 707–719. Springer
(2014)
296. Zhang, Y., Zhou, Z.H.: Cost-sensitive face recognition. IEEE Transactions on Pattern Anal-
ysis and Machine Intelligence 32(10), 1758–1769 (2010)
297. Zhao, H., Min, F., Zhu, W.: Test-cost-sensitive attribute reduction based on neighborhood
rough set. In: Granular Computing (GrC), 2011 IEEE International Conference on, pp. 802–
806. IEEE (2011)
298. Zhao, H., Wang, P., Hu, Q.: Cost-sensitive feature selection based on adaptive neighborhood
granularity with multi-level confidence. Information Sciences 366, 134–149 (2016)
299. Zhao, M., Luo, K., Liao, X.X.: Rough set attribute reduction algorithm based on immune ge-
netic algorithm. Jisuanji Gongcheng yu Yingyong (Computer Engineering and Applications)
42(23), 171–173 (2007)
300. Zhao, S., Chen, H., Li, C., Du, X., Sun, H.: A novel approach to building a robust fuzzy
rough classifier. IEEE Transactions on Fuzzy Systems 23(4), 769–786 (2015)
301. Zhao, S., Tsang, E.C., Chen, D.: The model of fuzzy variable precision rough sets. IEEE
Transactions on Fuzzy Systems 17(2), 451–467 (2009)
302. Zhao, S., Tsang, E.C., Chen, D., Wang, X.: Building a rule-based classifier: a fuzzy-rough set
approach. IEEE Transactions on Knowledge and Data Engineering 22(5), 624–638 (2010)
303. Zheng, Z., Wang, G., Wu, Y.: A rough set and rule tree based incremental knowledge acqui-
sition algorithm. In: International Workshop on Rough Sets, Fuzzy Sets, Data Mining, and
Granular-Soft Computing, pp. 122–129. Springer (2003)
304. Zhong, N., Dong, J., Ohsuga, S.: Using rough sets with heuristics for feature selection. Jour-
nal of Intelligent Information Systems 16(3), 199–214 (2001)
305. Zhou, Z.H.: Cost-sensitive learning. In: International Conference on Modeling Decisions for
Artificial Intelligence, pp. 17–18. Springer (2011)
306. Zhou, Z.H., Liu, X.Y.: Training cost-sensitive neural networks with methods addressing the
class imbalance problem. IEEE Transactions on Knowledge and Data Engineering 18(1),
63–77 (2006)
307. Zhu, W.: Generalized rough sets based on relations. Information Sciences 177(22), 4997–
5011 (2007)
308. Zhu, W.: Topological approaches to covering rough sets. Information Sciences 177(6), 1499–
1508 (2007)
309. Ziarko, W.: Variable precision rough set model. Journal of Computer and System Sciences
46(1), 39–59 (1993)
310. Zou, W., Li, T., Chen, H., Ji, X.: Approaches for incrementally updating approximations
based on set-valued information systems while attribute values’ coarsening and refining. In:
2009 IEEE International Conference on Granular Computing (2009)