Rough Sets in Machine Learning: A Review
2016 RSTML Chapter
Abstract This chapter emphasizes the role played by rough set theory (RST)
within the broad field of Machine Learning (ML). As a sound data analysis and
knowledge discovery paradigm, RST has much to offer to the ML community. We
survey the existing literature and report on the most relevant RST theoretical
developments and applications in this area. The review starts with RST in the context
of data preprocessing (discretization, feature selection, instance selection and meta-
learning) as well as the generation of both descriptive and predictive knowledge
via decision rule induction, association rule mining and clustering. Afterward, we
examined several special ML scenarios in which RST has been recently introduced,
such as imbalanced classification, multi-label classification, dynamic/incremental
learning, Big Data analysis and cost-sensitive learning.
1 Introduction
Rafael Bello
Department of Computer Science, Universidad Central de Las Villas
Carretera Camajuanı́ km 5.5, Santa Clara, Cuba
e-mail: [email protected]
Rafael Falcon
Research & Engineering Division, Larus Technologies Corporation
170 Laurier Ave West - Suite 310, Ottawa ON, Canada
e-mail: [email protected]
indistinguishability, similarity or functionality mechanism. Note that the data and object
spaces can actually coincide [142]. The Granular Computing (GrC) paradigm [7]
[185] encompasses several computational models based on fuzzy logic, Computing
With Words, interval computing, rough sets, shadowed sets, near sets, etc.
The main purpose behind Granular Computing is to find a novel way to syn-
thesize knowledge in a more human-centric fashion and from vast, unstructured,
possibly high-dimensional raw data sources. Not surprisingly, GrC is closely
related to Machine Learning [259] [95] [83]. The aim of a learning
process is to derive a certain rule or system for either the automatic classification
of the system objects or the prediction of the values of the system control variables.
The key challenge with prediction lies in modeling the relationships among the sys-
tem variables in such a way that it allows inferring the value of the control (target)
variable.
Rough set theory (RST) [1] was developed by Zdzislaw Pawlak in the early 1980s
[181] as a mathematical approach to intelligent data analysis and data mining [182].
This methodology is based on the premise that lowering the degree of precision
in the data makes the data pattern more visible, i.e., the rough set approach can be
formally considered as a framework for pattern discovery from imperfect data [222].
Several reasons are given in [34] to employ RST in knowledge discovery, including:
• It does not require any preliminary or additional information about the data
• It provides a valuable analysis even in presence of incomplete data
• It allows the interpretation of large amounts of both quantitative and qualitative
data
• It can model highly nonlinear or discontinuous functional relations to provide
complex characterizations of data
• It can discover important facts hidden in the data and represent them in the form
of decision rules, and
• At the same time, the decision rules derived from rough set models are based on
facts, because every decision rule is supported by a set of examples.
Mert Bal [3] brought up other RST advantages, such as: (a) it performs a clear
interpretation of the results and evaluation of the meaningfulness of data; (b) it can
identify and characterize uncertain systems and (c) the patterns discovered using
rough sets are concise, strong and sturdy.
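These claims rest on RST's core machinery: the indiscernibility classes induced by a set of attributes and the lower and upper approximations of a target concept. A minimal sketch of that machinery follows; the toy patient table and attribute names are illustrative, not taken from the chapter.

```python
from collections import defaultdict

def indiscernibility_classes(universe, attrs):
    """Group objects that share identical values on every attribute in attrs."""
    classes = defaultdict(set)
    for name, values in universe.items():
        classes[tuple(values[a] for a in attrs)].add(name)
    return list(classes.values())

def approximations(universe, attrs, concept):
    """Lower/upper approximation of a concept (a set of object names)."""
    lower, upper = set(), set()
    for block in indiscernibility_classes(universe, attrs):
        if block <= concept:      # block lies entirely inside the concept
            lower |= block
        if block & concept:       # block overlaps the concept
            upper |= block
    return lower, upper

# Toy decision table: p1 and p2 are indiscernible on the recorded symptoms
# yet only p1 has the flu, so both fall in the boundary region.
patients = {
    "p1": {"headache": "yes", "weakness": "yes"},
    "p2": {"headache": "yes", "weakness": "yes"},
    "p3": {"headache": "yes", "weakness": "no"},
    "p4": {"headache": "no",  "weakness": "no"},
}
flu = {"p1", "p3"}
lower, upper = approximations(patients, ("headache", "weakness"), flu)
print(sorted(lower), sorted(upper))   # → ['p3'] ['p1', 'p2', 'p3']
```

The boundary region (upper minus lower) is precisely where the "imperfect data" mentioned above lives: objects that cannot be classified unambiguously with the available attributes.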
Among the main components of the knowledge discovery process we can men-
tion:
• PREPROCESSING
– Discretization
– Training set edition (instance selection)
– Feature selection
– Characterization of the learning problem (data complexity, metalearning)
• KNOWLEDGE DISCOVERY
– Feature selection
– Inter-attribute dependency characterization
– Feature reduction
– Feature weighting
– Feature discretization
– Feature removal
• Formulation of the discovered knowledge
– Discovery of decision rules
– Quantification of the uncertainty in the decision rules.
This section briefly goes over reported studies showcasing RST as a tool in data
preprocessing and descriptive/predictive knowledge discovery.
2.1 Preprocessing
2.1.1 Discretization
on these concepts is presented and illustrated with data from concrete frost-
resistance investigations.
Nguyen [175] considers the problem of searching for a minimal set of cuts that
preserves the discernibility between objects with respect to any subset of s attributes,
where s is a user-defined parameter. It was shown that this problem is NP-hard and
its heuristic solution is more complicated than that for the problem of searching for
an optimal, consistent set of cuts. The author proposed a scheme based on Boolean
reasoning to solve this problem.
Bazan [5] put forth a method to search for irreducible sets of cuts of an information
system. The method is based on the notion of dynamic reduct. These reducts
are calculated for the information system and the one with the best stability coeffi-
cient is chosen. Next, as an irreducible set of cuts, the author selected cuts belonging
to the chosen dynamic reduct.
Bazan et al. [6] proposed a discretization technique named maximal discernibility
(MD), which is based on rough sets and Boolean reasoning. MD is a greedy
heuristic that searches for cuts along the domains of all numerical attributes that dis-
cern the largest number of object pairs in the dataset. These object pairs are removed
from the information system before the next cut is sought. The set of cuts obtained
that way is optimal in terms of object indiscernibility; however, this procedure is not
feasible since computing one cut requires O(|A| · |U|³) operations. Locally optimal cuts [6]
are computed in O(|A| · |U|) steps using only O(|A| · |U|) space.
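The greedy MD idea can be illustrated for a single numeric attribute. The following is an illustrative sketch of the strategy, not Bazan et al.'s exact implementation: candidate cuts are midpoints between consecutive distinct values, and at each step the cut discerning the most still-undiscerned object pairs with different labels is selected.

```python
from itertools import combinations

def md_cuts(values, labels):
    """Greedy MD-style sketch for one numeric attribute: repeatedly pick
    the cut discerning the most remaining pairs with different labels."""
    order = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(order, order[1:])]
    # object pairs that still need to be discerned
    pairs = {(i, j) for i, j in combinations(range(len(values)), 2)
             if labels[i] != labels[j]}
    cuts = []
    while pairs and candidates:
        def discerned(c):
            return {(i, j) for i, j in pairs
                    if (values[i] < c) != (values[j] < c)}
        best = max(candidates, key=lambda c: len(discerned(c)))
        gain = discerned(best)
        if not gain:
            break
        cuts.append(best)
        pairs -= gain
        candidates.remove(best)
    return sorted(cuts)

print(md_cuts([1, 2, 3, 4], [0, 0, 1, 1]))   # → [2.5]
```

On this toy attribute a single cut at 2.5 discerns all four mixed-label pairs, so the greedy search stops after one step.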
Dai and Li [46] improved Nguyen’s discretization techniques by reducing the
time and space complexity required to arrive at the set of candidate cuts. They
proved that all bound cuts can discern the same object pairs as the entire set of
initial cuts. A strategy to select candidate cuts was proposed based on that proof.
They obtained identical results to Nguyen’s with a lower computational overhead.
Chen et al. [26] employ a genetic algorithm (GA) to derive the minimal cut set
in a numerical attribute. Each gene in a binary chromosome represents a particular
cut value. Enabling this gene means the corresponding cut value has been selected
as a member of the minimal cut set. Some optimization strategies such as elitist
selection and father-offspring combined selection helped the GA converge faster.
The experimental evidence showed that the GA-based scheme is more efficient than
Nguyen’s basic heuristic based on rough sets and Boolean reasoning.
Xie et al. [251] defined an information entropy value for every candidate cut
point in their RST-based discretization algorithm. The final cut points are selected
based on this metric and some RST properties. The authors report that their approach
outperforms other discretization techniques and scales well with the number of cut
points.
Su and Hsu [221] extended the modified Chi2 discretizer by learning the pre-
defined misclassification rate (input parameter) from data. The authors additionally
considered the effect of variance in the two adjacent intervals. In the modified Chi2,
the inconsistency check in the original Chi2 is replaced with the “quality of approxi-
mation” measure from RST. The result is a more robust, parameterless discretization
method.
2.1.2 Feature Selection

The purpose behind feature selection is to discard irrelevant features, which are generally
detrimental to the classifier's performance, generate noise, and increase both the amount
of information to be stored and the computational cost of the classification process
[224] [304]. Feature selection is a computationally expensive problem: it requires
searching among the n original features in a space of 2ⁿ − 1 candidate subsets
according to a predefined evaluation criterion. The main components of a feature se-
lection algorithm are: (1) an evaluation function (EF), used to calculate the fitness
of a feature subset and (2) a generation procedure that is responsible for generating
different subsets of candidate features.
Different feature selection schemes that integrate RST into the feature subset
evaluation function have been developed. The quality of the classification γ is the
most frequently used RST metric to judge the suitability of a candidate feature sub-
set, as shown in [9] [11] [64] [10] etc. Other indicators are conditional independence
[210] and approximate entropy [211].
The concept of reduct is the basis for these results. Essentially, a reduct is a
minimal subset of features that generates the same granulation of the universe as
that induced by all features. Among these works we can list [37] [38] [249] [304]
[225] [250] [170] [85] [89] [198] [112] [137] [241] [223] [257] [272]. One of the
pioneering methods is the QuickReduct algorithm, which is typical of algorithms
that resort to a greedy search strategy to find a relative reduct [204] [249] [137].
Generally speaking, feature selection algorithms are based on heuristic search [304]
[97] [166]. Other RST-based methods for reduct calculation are [98] [211].
More advanced methods employ metaheuristic algorithms (such as Genetic Al-
gorithms, Ant Colony Optimization or Particle Swarm Optimization) as the underly-
ing feature subset generation engine [247] [248] [102] [9] [243] [11] [15] [244] [10]
[64] [120] [299] [8] [276] [270]. Feature selection methods based on the hybridiza-
tion between fuzzy and rough sets have been proposed in [126] [101] [199] [103]
[104] [205] [13] [90] [87] [42] [227] [301] [105] [43] [44] [75] [51] [28] [92] [195].
Some studies aim at calculating all possible reducts of a decision system [208] [227]
[301] [27] [28].
Feature selection is arguably the Machine Learning (ML) area that has witnessed
the greatest influx of rough-set-based methods. Other RST contributions to ML are
concerned with providing metrics to calculate the inter-attribute dependence and
the importance (weight) of any attribute [121] [224].
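A greedy reduct search in the spirit of QuickReduct can be sketched as follows; the dependency degree γ (the quality of classification) serves as the evaluation function. The attribute names and the toy decision table are illustrative assumptions, not drawn from any cited method's code.

```python
from collections import defaultdict

def blocks(rows, attrs):
    """Partition row indices by their values on attrs (U/B)."""
    part = defaultdict(list)
    for i, r in enumerate(rows):
        part[tuple(r[a] for a in attrs)].append(i)
    return part.values()

def gamma(rows, attrs, decision):
    """Dependency degree: fraction of objects in the positive region."""
    if not attrs:
        return 0.0
    pos = 0
    for block in blocks(rows, attrs):
        if len({rows[i][decision] for i in block}) == 1:  # pure block
            pos += len(block)
    return pos / len(rows)

def quickreduct(rows, conditional, decision):
    """Greedy sketch: add the attribute that raises gamma the most."""
    reduct, full = [], gamma(rows, conditional, decision)
    while gamma(rows, reduct, decision) < full:
        best = max((a for a in conditional if a not in reduct),
                   key=lambda a: gamma(rows, reduct + [a], decision))
        reduct.append(best)
    return reduct

# Toy table: the decision 'd' is fully determined by attribute 'b' alone.
rows = [{"a": 0, "b": 0, "c": 0, "d": 0},
        {"a": 0, "b": 1, "c": 0, "d": 1},
        {"a": 1, "b": 0, "c": 1, "d": 0},
        {"a": 1, "b": 1, "c": 1, "d": 1}]
print(quickreduct(rows, ["a", "b", "c"], "d"))   # → ['b']
```

The search stops as soon as the candidate subset reaches the dependency degree of the full attribute set, yielding a relative reduct (not necessarily a minimal one, as the greedy strategy offers no such guarantee).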
2.1.3 Training Set Edition (Instance Selection)

Another important data preprocessing task is the editing of the training set, also re-
ferred to as instance selection. The aim is to reduce the number of examples in order
to bring down the size of the training set while maintaining the system's efficiency.
By doing so, a smaller training set is obtained that usually also yields higher
efficiency.
Some training set edition approaches using rough sets have been published in
[19] and [16]. The simplest idea is to remove all examples in the training set that
are not contained in the lower approximation of any of the decision classes. A more
thorough investigation also considers those examples that lie in the boundary re-
gion of any of the decision classes. Fuzzy rough sets have been also applied to the
instance selection problem in [99] [235] [234].
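The simplest editing idea mentioned above — keeping only the examples that lie in the lower approximation of their own decision class — can be sketched as follows (the toy table is an illustrative assumption):

```python
from collections import defaultdict

def edit_training_set(rows, attrs, decision):
    """Sketch: keep only objects in consistent (pure) indiscernibility
    blocks, i.e. in the lower approximation of some decision class."""
    blocks = defaultdict(list)
    for i, r in enumerate(rows):
        blocks[tuple(r[a] for a in attrs)].append(i)
    keep = []
    for members in blocks.values():
        if len({rows[i][decision] for i in members}) == 1:
            keep.extend(members)
    return sorted(keep)

# Rows 0 and 1 are indiscernible on 'a' yet disagree on 'd': both dropped.
rows = [{"a": 0, "d": 0}, {"a": 0, "d": 1}, {"a": 1, "d": 1}]
print(edit_training_set(rows, ("a",), "d"))   # → [2]
```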
2.1.4 Meta-learning
2.2 Knowledge Discovery

The knowledge uncovered by the different data analysis techniques can be either
descriptive or predictive. The former characterizes the general properties of the data
in the data set (e.g., association rules) while the latter allows performing inferences
from the available data (e.g., decision rules). A decision rule summarizes the rela-
tionship between the properties (features), describing a causal link among
them. For example, IF Headache = YES AND Weakness = YES THEN Influenza =
YES. The most common rule induction task is to generate a rule base R that is both
consistent and complete.
According to [163], RST-based rule induction methods provide the following
benefits:
[300] [24], which includes working with the so called “fuzzy decision information
systems” [2].
One of the most popular rule induction methods based on rough sets is the so-
called three-way decisions model [262] [263] [264] [265] [81]. This methodology
is strongly related to decision making. Essentially, for each decision alternative,
this method defines three rules based on the RST’s positive, negative and bound-
ary regions. They respectively indicate acceptance, rejection or abstention (non-
commitment, denotes weak or insufficient evidence).
This type of rules, derived from the basic RST concepts, is a suitable knowledge
representation vehicle in a plethora of application domains. Hence, it has been inte-
grated into common machine learning tasks to facilitate the knowledge engineering
process required for a successful modeling of the domain under consideration. The
three-way decisions model has been adopted in feature selection [267] [165] [106]
[134] [295] [108], classification [275] [295] [284] [283], clustering [278] [279] and
face recognition [291] [133].
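The acceptance/rejection/abstention rules can be sketched with the pair of probability thresholds commonly used in decision-theoretic formulations; the threshold values below are arbitrary assumptions for illustration.

```python
def three_way(prob, alpha=0.75, beta=0.35):
    """Three-way decision sketch: alpha/beta carve the positive,
    negative and boundary regions of a concept."""
    if prob >= alpha:
        return "accept"    # positive region
    if prob <= beta:
        return "reject"    # negative region
    return "abstain"       # boundary region: weak or insufficient evidence

print(three_way(0.9), three_way(0.1), three_way(0.5))
# → accept reject abstain
```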
2.2.2 Association Rule Mining

The discovery of association rules is one of the classical data mining tasks. Its goal
is to uncover relationships among attributes that frequently appear together; i.e., the
presence of one implies the presence of the other. One of the typical examples is the
purchase of beer and diapers during the weekends. Association rules are representa-
tive of descriptive knowledge. A particular case is that of the so-called "class association
rules", which are used to build classifiers. Several methods have been developed for
discovering association rules using rough sets, including [49] [128] [94] [70] [268]
[135] [112] [213].
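The support/confidence mechanics behind the beer-and-diapers example can be illustrated with a toy, generic rule miner (single-antecedent rules only; this is not an RST-specific algorithm, and the threshold values are assumptions):

```python
from itertools import combinations

def association_rules(transactions, min_support=0.5, min_conf=0.8):
    """Toy sketch: single-antecedent rules scored by support/confidence."""
    n = len(transactions)
    items = {i for t in transactions for i in t}
    rules = []
    for a, b in combinations(sorted(items), 2):
        for x, y in ((a, b), (b, a)):        # try both directions x -> y
            sup_x = sum(x in t for t in transactions) / n
            sup_xy = sum(x in t and y in t for t in transactions) / n
            if sup_xy >= min_support and sup_x and sup_xy / sup_x >= min_conf:
                rules.append((x, y))
    return rules

baskets = [{"beer", "diapers"}, {"beer", "diapers"}, {"beer"}, {"milk"}]
print(association_rules(baskets))   # → [('diapers', 'beer')]
```

Here "diapers → beer" holds with confidence 1.0, while the converse fails the confidence threshold (0.67 < 0.8), illustrating that the two directions of an association are scored independently.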
2.2.3 Clustering
The clustering problem is another learning task that has been approached from
a rough set perspective. Clustering is a landmark unsupervised learning problem
whose main objective is to group similar objects in the same cluster and separate ob-
jects that are different from each other by assigning them to different clusters [169]
[96]. The objects are grouped in such a way that those in the same group exhibit a
high degree of association among them whereas those in different groups show a low
degree of association. Clustering algorithms map the original N-dimensional feature
space to a 1-dimensional space describing the cluster each object belongs to. This is
why clustering is considered both an important dimensionality reduction technique
and also one of the most prevalent Granular Computing [185] manifestations.
One of the most popular and efficient clustering algorithms for conventional ap-
plications is K-means clustering [71]. In the K-means approach, randomly selected
objects serve as initial cluster centroids. The objects are then assigned to different
clusters based on their distance to the centroids. In particular, an object gets assigned
to the cluster with the nearest centroid. The newly modified clusters then
employ this information to determine new centroids. The process continues itera-
tively until the cluster centroids are stabilized. K-means is a very simple clustering
algorithm, easy to understand and implement. The underlying alternate optimiza-
tion approach iteratively converges but might get trapped into a local minimum of
the objective function. K-means’ best performance is attained in those applications
where clusters are well separated and a crisp (bivalent) object-to-cluster decision is
required. Its disadvantages include the sensitivity to outliers and the initial cluster
centroids as well as the a priori specification of the desired number of clusters k.
Pawan Lingras [143] [146] found that the K-means algorithm often yields
clustering results with unclear, vague boundaries. He pointed out that the "hard par-
titioning" performed by K-means does not meet the needs of grouping vague data.
Lingras then proposed to combine K-means with RST in the so-called "Rough
K-means" approach. In this technique, each cluster is modeled as a rough set and
each object belongs either to the lower approximation of a cluster or to the upper
approximation of multiple clusters. Instead of building each cluster, its lower and
upper approximations are defined based on the available data. The basic properties
of the Rough K-means method are: (i) an object can be a member of at most one lower
approximation; (ii) an object that is a member of the lower approximation of a clus-
ter is also a member of its upper approximation and (iii) an object that does not
belong to the lower approximation of any cluster is a member of at least the upper
approximation of two clusters. Other pioneering works on rough clustering methods
are put forth in [78] [194] [238] [237].
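The assignment and centroid-update rules described above can be sketched as follows. The parameter names (`w_lower`, `w_upper`, the distance-ratio threshold) and their values are assumptions for illustration rather than Lingras' exact formulation.

```python
import math
import random

def rough_kmeans(points, k, w_lower=0.7, w_upper=0.3,
                 ratio=1.3, max_iter=100, seed=0):
    """Sketch of a Lingras-style Rough K-means. lower[i] holds indices
    assigned unambiguously to cluster i; upper[i] additionally holds
    boundary objects shared with other clusters."""
    rng = random.Random(seed)
    dim = len(points[0])
    centroids = [tuple(p) for p in rng.sample(points, k)]
    lower = upper = None
    for _ in range(max_iter):
        lower = [set() for _ in range(k)]
        upper = [set() for _ in range(k)]
        for idx, p in enumerate(points):
            dists = [math.dist(p, c) for c in centroids]
            nearest = min(range(k), key=dists.__getitem__)
            # clusters whose centroid is almost as close as the nearest one
            close = [j for j in range(k)
                     if j != nearest and dists[j] <= ratio * dists[nearest]]
            upper[nearest].add(idx)
            if close:                     # ambiguous object -> boundary
                for j in close:
                    upper[j].add(idx)
            else:                         # clear-cut -> lower approximation
                lower[nearest].add(idx)
        def mean(idxs):
            return tuple(sum(points[j][d] for j in idxs) / len(idxs)
                         for d in range(dim))
        new_centroids = []
        for i in range(k):
            boundary = upper[i] - lower[i]
            if lower[i] and boundary:     # weighted mix of both regions
                lo, bd = mean(lower[i]), mean(boundary)
                new_centroids.append(tuple(w_lower * a + w_upper * b
                                           for a, b in zip(lo, bd)))
            elif lower[i] or boundary:
                new_centroids.append(mean(lower[i] or boundary))
            else:                         # empty cluster: re-seed it
                new_centroids.append(tuple(points[rng.randrange(len(points))]))
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return lower, upper, centroids
```

By construction the sketch satisfies the three basic properties listed above: every lower-approximation member is also an upper-approximation member, and an object outside every lower approximation belongs to at least two upper approximations.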
Rough K-means has been the subject of several subsequent studies aimed at im-
proving its clustering capabilities. Georg Peters [189] concludes that rough cluster-
ing offers the possibility of reducing the number of incorrectly clustered objects,
which is relevant to many real-world applications where minimizing the number of
wrongly grouped objects is more important than maximizing the number of cor-
rectly grouped objects. Hence, in these scenarios, Rough K-means arises as a power-
ful and more robust alternative to K-means. The same author proposes some improve-
ments to the method regarding the calculation of the centroids, thus aiming to make
the method more stable and robust to outliers [186] [187]. The authors in [293] pro-
posed a Rough K-means improvement based on a variable weighted distance mea-
sure. Another enhancement brought forward in [188] suggested that well-defined
objects must have a greater impact on the cluster centroid calculation rather than
having this impact be governed by the number of cluster boundaries an object be-
longs to, as proposed in the original method. An extension to Rough K-means based
on the decision-theoretic rough sets model was developed in [131]. An evolutionary
approach for rough partitive clustering was designed in [170] [191] while [45] and
[192] elaborate on dynamic rough clustering approaches.
Other works that tackle the clustering problem using rough sets are [123] [164]
[180] [215] [144] [77] [76] [125] [35] [273] [277] [72] [136] [145] [274] [294]
[179]. These methods handle more specific scenarios (such as sequential, imbal-
anced, categorical and ordinal data), as well as applications of this clustering ap-
proach to different domains. The rough-fuzzy K-means method is put forward in
[88] and [172] whereas the fuzzy-rough K-means is unveiled in [171] and [190].
Both approaches amalgamate the main features of Rough K-means and Fuzzy C-
means by using the fuzzy membership of the objects to the rough clusters. Other
variants of fuzzy and rough set hybridization for the clustering problem are pre-
sented in [162] [173] [56] [127].
The traditional knowledge discovery methods presented in the previous section have
to be adapted if we are dealing with an imbalanced dataset [21]. A dataset is bal-
anced if it has an approximately equal percentage of positive and negative examples
(i.e., those belonging to the concept to be classified and those belonging to other
concepts, respectively). However, there are many application domains where we
find an imbalanced dataset; for instance, in healthcare scenarios there are usually
a plethora of patients that do not have a particularly rare disease. When learning a
normalcy model for a certain environment, the number of labeled anomalous events
is often scarce as most of the data corresponds to normal behaviour. The problem
with imbalanced classes is that the classification algorithms have a tendency towards
favoring the majority class. This occurs because the classifier attempts to reduce the
overall error, hence the classification error does not take into account the underlying
data distribution [23].
Several solutions have been researched to deal with this kind of situation. Two of
the most popular avenues are either resampling the training data (i.e., oversampling
the minority class or undersampling the majority class) or modifying the learning
method [155]. One of the classical methods for learning with imbalanced data is
SMOTE (synthetic minority oversampling technique) [22]. Different learning meth-
ods for imbalanced classification have been developed from an RST-based stand-
point. For instance, Hu et al. [91] proposed models based on probabilistic rough
sets where each example has an associated probability p(x) instead of the default
1/n. Ma et al. [160] introduced weights in the variable-precision rough set model
(VPRS) to denote the importance of each example. Liu et al. [155] bring about
some weights in the RST formulation to balance the class distribution and develop
a method based on weighted rough sets to solve the imbalanced class learning prob-
lem. Ramentol et al. [196] proposed a method that integrates SMOTE with RST.
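SMOTE's core interpolation step can be sketched as follows; the parameter names and neighbourhood size are illustrative assumptions, and no RST component is included here.

```python
import math
import random

def smote(minority, n_new, k=2, seed=0):
    """Minimal SMOTE sketch: synthesize points by interpolating between
    a minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted((p for p in minority if p != x),
                            key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        t = rng.random()                          # interpolation factor
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(x, nb)))
    return synthetic

new_points = smote([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], n_new=4)
print(len(new_points))   # → 4
```

Each synthetic example lies on a segment between two genuine minority examples, so the oversampled class stays inside its original region of the feature space.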
Data are continuously being updated in today's information systems. New data
are added and obsolete data are purged over time. Traditional batch-learning meth-
ods lean on the principle of running these algorithms on all data when the informa-
tion is updated, which obviously affects the system's efficiency while ignoring any
previous learning. Instead, learning should occur as new information arrives. Man-
aging this learning while adapting the previously learned knowledge is the essence
behind incremental learning. This term refers to an efficient strategy for the anal-
ysis of data in dynamic environments that allows acquiring additional knowledge
from an uninterrupted information flow. The advantage of incremental learning is
not to have to analyze the data from scratch but to utilize the learning process’ pre-
vious outcomes as much as possible [178] [202] [73] [57] [113]. The continuous
and massive acquisition of data becomes a challenge for knowledge discovery;
especially in the context of Big Data, it becomes essential to develop the
capacity to assimilate continuous data streams [29].
As an information-based methodology, RST is not exempt from being scrutinized
in the context of dynamic data. The fundamental RST concepts and the knowledge
discovery methods ensuing from them are geared towards the analysis of static data;
hence, they need to be thoroughly revised in light of the requirements posed by
data stream mining systems [153]. The purpose of the incremental learning strategy
in rough sets is the development of incremental algorithms to quickly update the
concept approximations, the reduct calculation or the discovered decision rules [40]
[286]. The direct precursor of these studies can be found in [177]. According to
[150], in recent years RST-based incremental learning approaches have become “hot
topics” in knowledge extraction from dynamic data given their proven data analysis
efficiency.
The study of RST in the context of learning with dynamic data can be approached
from two different angles: what kind of information is considered to be dynamic and
what type of learning task must be carried out. In the first case, the RST-based incre-
mental updating approach could be further subdivided into three alternatives: (i) ob-
ject variation (insertion or deletion of objects in the universe), (ii) attribute variation
(insertion/removal of attributes) and (iii) attribute value variation (insertion/deletion
of attribute values). In the second case, we can mention (i) incremental learning of
the concept approximations [140] [33]; (ii) incremental learning of attribute reduc-
tion [52] [252] [239] [240] [141] and (iii) incremental learning of decision rules
[303] [66] [59] [149].
Object variations include so-called object immigration and emigration [149].
Variations of the attributes include feature insertion or deletion [289] [139]. Vari-
ations in attribute values are primarily manifested via the refinement or scaling of
the attribute values [147] [32]. Other works that propose modifications to RST-based
methods for the case of dynamic data are [148] [151] [159].
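For the object-insertion case, the incremental idea can be sketched as follows: only the equivalence class touched by the new object needs its approximation status revisited, with no full recomputation over the universe. This is a simplification intended to convey the principle, not the implementation of any cited approach.

```python
from collections import defaultdict

class IncrementalApprox:
    """Sketch: maintain equivalence classes and the lower/upper
    approximations of a target concept under object insertion."""
    def __init__(self, attrs, concept=()):
        self.attrs, self.concept = attrs, set(concept)
        self.blocks = defaultdict(set)   # attribute values -> object ids
        self.lower, self.upper = set(), set()

    def insert(self, obj_id, row, in_concept):
        key = tuple(row[a] for a in self.attrs)
        block = self.blocks[key]
        block.add(obj_id)
        if in_concept:
            self.concept.add(obj_id)
        # only the touched block can change its approximation status
        if block & self.concept:
            self.upper |= block
        if block <= self.concept:
            self.lower |= block
        else:
            self.lower -= block

ia = IncrementalApprox(("color",))
ia.insert("o1", {"color": "red"}, True)    # lower = upper = {o1}
ia.insert("o2", {"color": "red"}, False)   # o1's block becomes boundary
ia.insert("o3", {"color": "blue"}, True)   # a new, pure block appears
print(sorted(ia.lower), sorted(ia.upper))  # → ['o3'] ['o1', 'o2', 'o3']
```

Inserting `o2` demotes the previously certain `o1` to the boundary region, illustrating why approximations must be revised, not merely extended, as objects arrive.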
The following studies deal with dynamic object variation:
• The update of the lower and upper approximations of the target concept is ana-
lyzed in [33] [138] [158].
• The update in the reduction of attributes is studied in [82] [252].
• The update of the decision rule induction mechanism is discussed in [201] [4]
[246] [271] [303] [59] [149] [232] [40] [93].
If the variation occurs in the set of attributes, its effects have been studied with
respect to these aspects:
• The update of the lower and upper approximations of the target concept is ana-
lyzed in [20] [140] [36] [289] [139] [152].
• The update of the decision rule induction mechanism is discussed in [39].
The effect of the variations in the attribute values (namely, via refinement or
extension of the attribute domains) with respect to the update of the lower and upper
approximations of the target concept is analyzed in [50] [310] [30] [32] [31] [239].
The calculation of reducts for dynamic data has also been investigated. The effect
when the set of attributes varies is studied in [39]. The case of varying the attribute
values is explored in [50] and [69] whereas the case of dynamic object update is dis-
sected in [201] and [246]. Other studies on how dynamic data affect the calculation
of reducts appear in [239] [240] [141] and [206].
On the other hand, the accelerated pace of technology has led to an exponential
growth in the generation and collection of digital information. This growth is not
only limited to the amount of data available but to the plethora of diverse sources
that emit these data streams. It becomes paramount then to efficiently analyze and
extract knowledge from many dissimilar information sources within a certain appli-
cation domain. This has led to the emergence of the Big Data era [25], which has a
direct impact on the development of RST and its applications. Granular Computing,
our starting point in this chapter, has a strong relation to Big Data [25], as its in-
herent ability to process information at multiple levels of abstraction and interpret
information from different perspectives greatly facilitates the efficient management
of large data volumes.
Simply put, Big Data can be envisioned as a large and complex data collec-
tion. These data are very difficult to analyze through traditional data management
and processing tools. Big Data scenarios require new architectures, techniques, al-
gorithms and processes to manage and extract value and knowledge hidden in the
data streams. Big Data is often characterized by the 5 V's: Volume, Velocity,
Variety, Veracity and Value. Big Data includes both structured and unstructured
data, such as images, videos and textual reports. Big Data frameworks such as
MapReduce and Spark have been recently developed and constitute indispensable
tools for the accurate and seamless knowledge extraction from an array of disparate
data sources. For more information on the Big Data paradigm, the reader is referred
to the following articles: [48] [25] [60] [118].
As a data analysis and information extraction methodology, RST needs to adapt
and evolve in order to cope with this new phenomenon. A major motivation to do so
lies in the fact that the sizes of nowadays’ decision systems are already extremely
large. This poses a significant challenge to the efficient calculation of the underlying
RST concepts and the knowledge discovery methods that emanate from them. Recall
that the computational complexity of computing the target concept’s approximations
is O(lm²), the computational cost of finding a reduct is bounded by O(l²m²) and the
time complexity to find all reducts is O(2ˡ · J), where l is the number of attributes
characterizing the objects, m is the number of objects in the universe and J is the
computational cost required to calculate a reduct.
Some researchers have proposed RST-based solutions to the Big Data challenge
[288] [193]. These methods are concerned with the design of parallel algorithms
to compute equivalence classes, decision classes, associations between equivalence
classes and decision classes, approximations, and so on. They are based on partition-
ing the universe, concurrently processing those information subsystems and then
integrating the results. In other words, given the decision system S = (U,C ∪ D),
generate the subsystems {S₁, S₂, . . . , Sₘ}, where Sᵢ = (Uᵢ, C ∪ D) and U = ⋃ᵢ Uᵢ; then
process each subsystem Sᵢ, i ∈ {1, 2, . . . , m}, computing Uᵢ/B for B ⊆ C. Afterwards, the results are
amalgamated. This MapReduce-compliant workflow is supported by several theo-
rems stating that (a) equivalence classes can be independently computed for each
subsystem and (b) the equivalence classes from different subsystems can be merged
if they are based on the same underlying attribute set. These results enable the par-
allel computation of the equivalence classes of the decision system S. Zhang et al.
[288] developed the PACRSEC algorithm to that end.
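The partition-process-merge workflow can be sketched as a map step and a reduce step; this is an illustration of the theorems above, not the PACRSEC implementation.

```python
from collections import defaultdict

def local_classes(subsystem, attrs):
    """Map step: equivalence classes of one partition Si = (Ui, C ∪ D)."""
    classes = defaultdict(set)
    for obj_id, row in subsystem:
        classes[tuple(row[a] for a in attrs)].add(obj_id)
    return classes

def merge_classes(partials):
    """Reduce step: classes keyed on identical attribute values merge."""
    merged = defaultdict(set)
    for part in partials:
        for key, members in part.items():
            merged[key] |= members
    return merged

rows = [("o1", {"a": 0}), ("o2", {"a": 1}), ("o3", {"a": 0}), ("o4", {"a": 1})]
part1, part2 = rows[:2], rows[2:]
merged = merge_classes([local_classes(part1, ("a",)),
                        local_classes(part2, ("a",))])
print(dict(merged) == dict(local_classes(rows, ("a",))))   # → True
```

Because the classes are keyed on the same underlying attribute set, merging the per-partition results reproduces exactly the equivalence classes of the whole decision system, which is what licenses the parallel computation.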
Analogously, RST-based knowledge discovery methods, including reduct calcu-
lation and decision rule induction, have been investigated in the context of Big
Data [258] [287] [58].
4 Reference categorization
Table 1 lists the different RST studies according to the ML tasks they perform.
5 Conclusions
References
1. Abraham, A., Falcon, R., Bello, R.: Rough Set Theory: a True Landmark in Data Analysis.
Springer Verlag, Berlin-Heidelberg, Germany (2009)
2. Bai, H., Ge, Y., Wang, J., Li, D., Liao, Y., Zheng, X.: A method for extracting rules from
spatial data based on rough fuzzy sets. Knowledge-Based Systems 57, 28–40 (2014)
3. Bal, M.: Rough sets theory as symbolic data mining method: an application on complete
decision table. Information Sciences Letters 2(1), 111–116 (2013)
4. Bang, W.C., Bien, Z.: New incremental learning algorithm in the framework of rough set
theory. International Journal of Fuzzy Systems 1, 25–36 (1999)
5. Bazan, J.G.: A comparison of dynamic and non-dynamic rough set methods for extracting
laws from decision tables. Rough sets in knowledge discovery 1, 321–365 (1998)
6. Bazan, J.G., Nguyen, H.S., Nguyen, S.H., Synak, P., Wróblewski, J.: Rough set algorithms in
classification problem. In: Rough set methods and applications, pp. 49–88. Springer (2000)
7. Bello, R., Falcon, R., Pedrycz, W., Kacprzyk, J.: Granular Computing: at the Junction of
Rough Sets and Fuzzy Sets. Springer Verlag, Berlin-Heidelberg, Germany (2008)
8. Bello, R., Gómez, Y., Caballero, Y., Nowe, A., Falcon, R.: Rough Sets and Evolutionary
Computation to Solve the Feature Selection Problem. In: A. Abraham, R. Falcon, R. Bello
(eds.) Rough Set Theory: A True Landmark in Data Analysis, Studies in Computational
Intelligence, vol. 174, pp. 235–260. Springer Berlin / Heidelberg (2009)
9. Bello, R., Nowe, A., Gómez, Y., Caballero, Y.: Using ACO and rough set theory to feature
selection. WSEAS Transactions on Information Science and Applications 2(5), 512–517
(2005)
10. Bello, R., Puris, A., Falcon, R., Gómez, Y.: Feature Selection through Dynamic Mesh Opti-
mization. In: J. Ruiz-Shulcloper, W. Kropatsch (eds.) Progress in Pattern Recognition, Im-
age Analysis and Applications, Lecture Notes in Computer Science, vol. 5197, pp. 348–355.
Springer Berlin / Heidelberg (2008)
11. Bello, R., Puris, A., Nowe, A., Martı́nez, Y., Garcı́a, M.M.: Two step ant colony system to
solve the feature selection problem. In: Iberoamerican Congress on Pattern Recognition, pp.
588–596. Springer (2006)
12. Bello, R., Verdegay, J.L.: Rough sets in the soft computing environment. Information Sci-
ences 212, 1–14 (2012)
13. Bhatt, R.B., Gopal, M.: On fuzzy-rough sets approach to feature selection. Pattern Recognition Letters 26(7), 965–975 (2005)
14. Błaszczyński, J., Słowiński, R., Szeląg, M.: Sequential covering rule induction algorithm for
variable consistency rough set approaches. Information Sciences 181(5), 987–1002 (2011)
15. Caballero, Y., Bello, R., Alvarez, D., Garcia, M.M.: Two new feature selection algorithms
with rough sets theory. In: IFIP International Conference on Artificial Intelligence in Theory
and Practice, pp. 209–216. Springer (2006)
16. Caballero, Y., Bello, R., Alvarez, D., García, M.M., Pizano, Y.: Improving the k-nn method:
Rough set in edit training set. In: Professional Practice in Artificial Intelligence, pp. 21–30.
Springer (2006)
17. Caballero, Y., Bello, R., Arco, L., Garcı́a, M., Ramentol, E.: Knowledge discovery using
rough set theory. In: Advances in Machine Learning I, pp. 367–383. Springer (2010)
18. Caballero, Y., Bello, R., Arco, L., Márquez, Y., León, P., Garcı́a, M.M., Casas, G.: Rough
set theory measures for quality assessment of a training set. In: Granular Computing: At the
Junction of Rough Sets and Fuzzy Sets, pp. 199–210. Springer (2008)
19. Caballero, Y., Joseph, S., Lezcano, Y., Bello, R., Garcia, M.M., Pizano, Y.: Using rough sets
to edit training set in k-nn method. In: ISDA, pp. 456–463 (2005)
20. Chan, C.C.: A rough set approach to attribute generalization in data mining. Information
Sciences 107(1), 169–176 (1998)
21. Chawla, N.V.: Data mining for imbalanced datasets: An overview. In: Data mining and
knowledge discovery handbook, pp. 853–867. Springer (2005)
22. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: Smote: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research 16, 321–357 (2002)
23. Chawla, N.V., Cieslak, D.A., Hall, L.O., Joshi, A.: Automatically countering imbalance and
its empirical relationship to cost. Data Mining and Knowledge Discovery 17(2), 225–252
(2008)
24. Chen, C., Mac Parthaláin, N., Li, Y., Price, C., Quek, C., Shen, Q.: Rough-fuzzy rule inter-
polation. Information Sciences 351, 1–17 (2016)
25. Chen, C.P., Zhang, C.Y.: Data-intensive applications, challenges, techniques and technolo-
gies: A survey on big data. Information Sciences 275, 314–347 (2014)
26. Chen, C.Y., Li, Z.G., Qiao, S.Y., Wen, S.P.: Study on discretization in rough set based on
genetic algorithm. In: Machine Learning and Cybernetics, 2003 International Conference
on, vol. 3, pp. 1430–1434. IEEE (2003)
27. Chen, D., Hu, Q., Yang, Y.: Parameterized attribute reduction with gaussian kernel based
fuzzy rough sets. Information Sciences 181(23), 5169–5179 (2011)
28. Chen, D., Zhang, L., Zhao, S., Hu, Q., Zhu, P.: A novel algorithm for finding reducts with
fuzzy rough sets. IEEE Transactions on Fuzzy Systems 20(2), 385–389 (2012)
29. Chen, H., Chiang, R.H., Storey, V.C.: Business intelligence and analytics: From big data to
big impact. MIS Quarterly 36(4), 1165–1188 (2012)
30. Chen, H., Li, T., Qiao, S., Ruan, D.: A rough set based dynamic maintenance approach for
approximations in coarsening and refining attribute values. International Journal of Intelli-
gent Systems 25(10), 1005–1026 (2010)
31. Chen, H., Li, T., Ruan, D.: Dynamic maintenance of approximations under a rough-set based
variable precision limited tolerance relation. Journal of Multiple-Valued Logic & Soft Com-
puting 18 (2012)
32. Chen, H., Li, T., Ruan, D.: Maintenance of approximations in incomplete ordered decision
systems while attribute values coarsening or refining. Knowledge-Based Systems 31, 140–
161 (2012)
33. Chen, H., Li, T., Ruan, D., Lin, J., Hu, C.: A rough-set-based incremental approach for up-
dating approximations under dynamic maintenance environments. IEEE Transactions on
Knowledge and Data Engineering 25(2), 274–284 (2013)
Rough Sets in Machine Learning: A Review 21
34. Chen, Y.S., Cheng, C.H.: A delphi-based rough sets fusion model for extracting payment
rules of vehicle license tax in the government sector. Expert Systems with Applications
37(3), 2161–2174 (2010)
35. Cheng, X., Wu, R.: Clustering path profiles on a website using rough k-means method. Jour-
nal of Computational Information Systems 8(14), 6009–6016 (2012)
36. Cheng, Y.: The incremental method for fast computing the rough fuzzy approximations. Data
& Knowledge Engineering 70(1), 84–100 (2011)
37. Choubey, S.K., Deogun, J.S., Raghavan, V.V., Sever, H.: A comparison of feature selection
algorithms in the context of rough classifiers. In: Fuzzy Systems, 1996. Proceedings of the
Fifth IEEE International Conference on, vol. 2, pp. 1122–1128. IEEE (1996)
38. Chouchoulas, A., Shen, Q.: A rough set-based approach to text classification. In: Interna-
tional Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft Computing,
pp. 118–127. Springer (1999)
39. Ciucci, D.: Attribute dynamics in rough sets. In: International Symposium on Methodologies
for Intelligent Systems, pp. 43–51. Springer (2011)
40. Ciucci, D.: Temporal dynamics in information tables. Fundamenta informaticae 115(1), 57–
74 (2012)
41. Coello, L., Fernandez, Y., Filiberto, Y., Bello, R.: Improving the multilayer perceptron learn-
ing by using a method to calculate the initial weights with the similarity quality measure
based on fuzzy sets and particle swarms. Computación y Sistemas 19(2), 309–320 (2015)
42. Cornelis, C., Jensen, R.: A noise-tolerant approach to fuzzy-rough feature selection. In: Fuzzy Systems, 2008. FUZZ-IEEE 2008 (IEEE World Congress on Computational Intelligence), IEEE International Conference on, pp. 1598–1605. IEEE (2008)
43. Cornelis, C., Jensen, R., Hurtado, G., Ślęzak, D.: Attribute selection with fuzzy decision
reducts. Information Sciences 180(2), 209–224 (2010)
44. Cornelis, C., Verbiest, N., Jensen, R.: Ordered weighted average based fuzzy rough sets. In:
International Conference on Rough Sets and Knowledge Technology, pp. 78–85. Springer
(2010)
45. Crespo, F., Peters, G., Weber, R.: Rough clustering approaches for dynamic environments.
In: Rough Sets: Selected Methods and Applications in Management and Engineering, pp.
39–50. Springer (2012)
46. Dai, J.H., Li, Y.X.: Study on discretization based on rough set theory. In: Machine Learning
and Cybernetics, 2002. Proceedings. 2002 International Conference on, vol. 3, pp. 1371–
1373. IEEE (2002)
47. De Comité, F., Gilleron, R., Tommasi, M.: Learning multi-label alternating decision trees
from texts and data. In: International Workshop on Machine Learning and Data Mining in
Pattern Recognition, pp. 35–49. Springer (2003)
48. Dean, J., Ghemawat, S.: Mapreduce: simplified data processing on large clusters. Commu-
nications of the ACM 51(1), 107–113 (2008)
49. Delic, D., Lenz, H.J., Neiling, M.: Improving the quality of association rule mining by means
of rough sets. In: Soft Methods in Probability, Statistics and Data Analysis, pp. 281–288.
Springer (2002)
50. Deng, D., Huang, H.: Dynamic reduction based on rough sets in incomplete decision systems.
In: International Conference on Rough Sets and Knowledge Technology, pp. 76–83. Springer
(2007)
51. Derrac, J., Cornelis, C., Garcı́a, S., Herrera, F.: Enhancing evolutionary instance selection
algorithms by means of fuzzy rough set based feature selection. Information Sciences 186(1),
73–92 (2012)
52. Dey, P., Dey, S., Datta, S., Sil, J.: Dynamic discreduction using rough sets. Applied Soft
Computing 11(5), 3887–3897 (2011)
53. Dougherty, J., Kohavi, R., Sahami, M., et al.: Supervised and unsupervised discretization of
continuous features. In: Machine learning: proceedings of the twelfth international confer-
ence, vol. 12, pp. 194–202 (1995)
54. Dubois, D., Prade, H.: Twofold fuzzy sets and rough sets – some issues in knowledge representation. Fuzzy Sets and Systems 23(1), 3–18 (1987)
55. Dubois, D., Prade, H.: Rough fuzzy sets and fuzzy rough sets. International Journal of
General System 17(2-3), 191–209 (1990)
56. Falcon, R., Jeon, G., Bello, R., Jeong, J.: Rough clustering with partial supervision. In:
Rough Set Theory: A True Landmark in Data Analysis, pp. 137–161. Springer (2009)
57. Falcon, R., Nayak, A., Abielmona, R.: An Online Shadowed Clustering Algorithm Applied
to Risk Visualization in Territorial Security. In: IEEE Symposium on Computational Intelli-
gence for Security and Defense Applications (CISDA), pp. 1–8. Ottawa, Canada (2012)
58. Fan, Y.N., Chern, C.C.: An agent model for incremental rough set-based rule induction: a
Big Data analysis in sales promotion. In: System Sciences (HICSS), 2013 46th Hawaii
International Conference on, pp. 985–994. IEEE (2013)
59. Fan, Y.N., Tseng, T.L.B., Chern, C.C., Huang, C.C.: Rule induction based on an incremental
rough set. Expert Systems with Applications 36(9), 11,439–11,450 (2009)
60. Fernández, A., del Rı́o, S., López, V., Bawakid, A., del Jesus, M.J., Benı́tez, J.M., Herrera, F.:
Big data with cloud computing: an insight on the computing environment, mapreduce, and
programming frameworks. Wiley Interdisciplinary Reviews: Data Mining and Knowledge
Discovery 4(5), 380–409 (2014)
61. Filiberto, Y., Caballero, Y., Larrua, R., Bello, R.: A method to build similarity relations into
extended rough set theory. In: 2010 10th International Conference on Intelligent Systems
Design and Applications, pp. 1314–1319. IEEE (2010)
62. Filiberto Cabrera, Y., Caballero Mota, Y., Bello Pérez, R., Frías, M.: Algoritmo para el aprendizaje de reglas de clasificación basado en la teoría de los conjuntos aproximados extendida [An algorithm for learning classification rules based on extended rough set theory]. DYNA 78(169), 62–70 (2011)
63. Gogoi, P., Bhattacharyya, D.K., Kalita, J.K.: A rough set-based effective rule generation
method for classification with an application in intrusion detection. International Journal of
Security and Networks 8(2), 61–71 (2013)
64. Gómez, Y., Bello, R., Puris, A., Garcia, M.M., Nowe, A.: Two step swarm intelligence to
solve the feature selection problem. J. UCS 14(15), 2582–2596 (2008)
65. Greco, S., Matarazzo, B., Słowiński, R.: Parameterized rough set model using rough member-
ship and bayesian confirmation measures. International Journal of Approximate Reasoning
49(2), 285–300 (2008)
66. Greco, S., Słowiński, R., Stefanowski, J., Żurawski, M.: Incremental versus non-incremental
rule induction for multicriteria classification. In: Transactions on Rough Sets II, pp. 33–53.
Springer (2004)
67. Grzymala-Busse, J.W.: LERS - a system for learning from examples based on rough sets. In:
Intelligent decision support, pp. 3–18. Springer (1992)
68. Grzymała-Busse, J.W.: Characteristic relations for incomplete data: A generalization of the
indiscernibility relation. In: International Conference on Rough Sets and Current Trends in
Computing, pp. 244–253. Springer (2004)
69. Grzymala-Busse, J.W., Grzymala-Busse, W.J.: Inducing better rule sets by adding missing
attribute values. In: International Conference on Rough Sets and Current Trends in Comput-
ing, pp. 160–169. Springer (2008)
70. Guan, J., Bell, D.A., Liu, D.: The rough set approach to association rule mining. In: Data
Mining, 2003. ICDM 2003. Third IEEE International Conference on, pp. 529–532. IEEE
(2003)
71. Hartigan, J.A., Wong, M.A.: Algorithm as 136: A k-means clustering algorithm. Journal of
the Royal Statistical Society. Series C (Applied Statistics) 28(1), 100–108 (1979)
72. Hassanein, W., Elmelegy, A.A.: An algorithm for selecting clustering attribute using signif-
icance of attributes. International Journal of Database Theory and Application 6(5), 53–66
(2013)
73. He, H., Chen, S., Li, K., Xu, X.: Incremental learning from stream data. IEEE Transactions
on Neural Networks 22(12), 1901–1914 (2011)
74. He, H., Min, F., Zhu, W.: Attribute reduction in test-cost-sensitive decision systems with
common-test-costs. In: Proceedings of the 3rd International Conference on Machine Learn-
ing and Computing, vol. 1, pp. 432–436 (2011)
75. He, Q., Wu, C., Chen, D., Zhao, S.: Fuzzy rough set based attribute reduction for information
systems with fuzzy decisions. Knowledge-Based Systems 24(5), 689–696 (2011)
76. Herawan, T.: Rough set approach for categorical data clustering. Ph.D. thesis, Universiti Tun
Hussein Onn Malaysia (2010)
77. Herawan, T., Deris, M.M., Abawajy, J.H.: A rough set approach for selecting clustering at-
tribute. Knowledge-Based Systems 23(3), 220–231 (2010)
78. Hirano, S., Tsumoto, S.: Rough clustering and its application to medicine. Journal of Infor-
mation Science 124, 125–137 (2000)
79. Ho, T.K., Basu, M.: Complexity measures of supervised classification problems. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(3), 289–300 (2002)
80. Hong, T.P., Tseng, L.H., Wang, S.L.: Learning rules from incomplete training examples by
rough sets. Expert Systems with Applications 22(4), 285–293 (2002)
81. Hu, B.Q.: Three-way decisions space and three-way decisions. Information Sciences 281,
21–52 (2014)
82. Hu, F., Wang, G., Huang, H., Wu, Y.: Incremental attribute reduction based on elementary
sets. In: International Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft
Computing, pp. 185–193. Springer (2005)
83. Hu, H., Shi, Z.: Machine learning as granular computing. In: Granular Computing, 2009,
GRC’09. IEEE International Conference on, pp. 229–234. IEEE (2009)
84. Hu, Q., Che, X., Zhang, L., Zhang, D., Guo, M., Yu, D.: Rank entropy-based decision trees
for monotonic classification. IEEE Transactions on Knowledge and Data Engineering 24(11),
2052–2064 (2012)
85. Hu, Q., Liu, J., Yu, D.: Mixed feature selection based on granulation and approximation.
Knowledge-Based Systems 21(4), 294–304 (2008)
86. Hu, Q., Pan, W., Zhang, L., Zhang, D., Song, Y., Guo, M., Yu, D.: Feature selection for
monotonic classification. IEEE Transactions on Fuzzy Systems 20(1), 69–81 (2012)
87. Hu, Q., Xie, Z., Yu, D.: Hybrid attribute reduction based on a novel fuzzy-rough model and
information granulation. Pattern Recognition 40(12), 3509–3521 (2007)
88. Hu, Q., Yu, D.: An improved clustering algorithm for information granulation. In: Inter-
national Conference on Fuzzy Systems and Knowledge Discovery, pp. 494–504. Springer
(2005)
89. Hu, Q., Yu, D., Liu, J., Wu, C.: Neighborhood rough set based heterogeneous feature subset
selection. Information Sciences 178(18), 3577–3594 (2008)
90. Hu, Q., Yu, D., Xie, Z.: Information-preserving hybrid data reduction based on fuzzy-rough
techniques. Pattern Recognition Letters 27(5), 414–423 (2006)
91. Hu, Q., Yu, D., Xie, Z., Liu, J.: Fuzzy probabilistic approximation spaces and their information measures. IEEE Transactions on Fuzzy Systems 14(2), 191–201 (2006)
92. Hu, Q., Zhang, L., An, S., Zhang, D., Yu, D.: On robust fuzzy rough set models. IEEE Transactions on Fuzzy Systems 20(4), 636–651 (2012)
93. Huang, C.C., Tseng, T.L.B., Fan, Y.N., Hsu, C.H.: Alternative rule induction methods based
on incremental object using rough set theory. Applied Soft Computing 13(1), 372–389 (2013)
94. Huang, Z., Hu, Y.Q.: Applying ai technology and rough set theory to mine association rules
for supporting knowledge management. In: Machine Learning and Cybernetics, 2003 Inter-
national Conference on, vol. 3, pp. 1820–1825. IEEE (2003)
95. Hüllermeier, E.: Granular computing in machine learning and data mining. Handbook of
Granular Computing pp. 889–906 (2008)
96. Jain, A.K., Murty, M.N., Flynn, P.J.: Data clustering: a review. ACM Computing Surveys (CSUR) 31(3), 264–323 (1999)
97. Janusz, A., Slezak, D.: Rough set methods for attribute clustering and selection. Applied
Artificial Intelligence 28(3), 220–242 (2014)
98. Janusz, A., Stawicki, S.: Applications of approximate reducts to the feature selection prob-
lem. In: International Conference on Rough Sets and Knowledge Technology, pp. 45–50.
Springer (2011)
99. Jensen, R., Cornelis, C.: Fuzzy-rough instance selection. In: Fuzzy Systems (FUZZ), 2010
IEEE International Conference on, pp. 1–7. IEEE (2010)
100. Jensen, R., Cornelis, C., Shen, Q.: Hybrid fuzzy-rough rule induction and feature selection.
In: Fuzzy Systems, 2009. FUZZ-IEEE 2009. IEEE International Conference on, pp. 1151–
1156. IEEE (2009)
101. Jensen, R., Shen, Q.: Fuzzy-rough sets for descriptive dimensionality reduction. In: Fuzzy
Systems, 2002. FUZZ-IEEE’02. Proceedings of the 2002 IEEE International Conference on,
vol. 1, pp. 29–34. IEEE (2002)
102. Jensen, R., Shen, Q.: Finding rough set reducts with ant colony optimization. In: Proceedings
of the 2003 UK workshop on computational intelligence, vol. 1, pp. 15–22 (2003)
103. Jensen, R., Shen, Q.: Fuzzy–rough attribute reduction with application to web categorization.
Fuzzy Sets and Systems 141(3), 469–485 (2004)
104. Jensen, R., Shen, Q.: Semantics-preserving dimensionality reduction: rough and fuzzy-
rough-based approaches. IEEE Transactions on knowledge and data engineering 16(12),
1457–1471 (2004)
105. Jensen, R., Shen, Q.: New approaches to fuzzy-rough feature selection. IEEE Transactions
on Fuzzy Systems 17(4), 824–838 (2009)
106. Jia, X., Liao, W., Tang, Z., Shang, L.: Minimum cost attribute reduction in decision-theoretic
rough set models. Information Sciences 219, 151–167 (2013)
107. Jia, X., Liao, W., Tang, Z., Shang, L.: Minimum cost attribute reduction in decision-theoretic
rough set models. Information Sciences 219, 151–167 (2013)
108. Jia, X., Shang, L., Zhou, B., Yao, Y.: Generalized attribute reduct in rough set theory.
Knowledge-Based Systems 91, 204–218 (2016)
109. Jiang, F., Sui, Y., Cao, C.: Outlier detection based on rough membership function. In: Inter-
national Conference on Rough Sets and Current Trends in Computing, pp. 388–397. Springer
(2006)
110. Jiang, F., Sui, Y., Cao, C.: Some issues about outlier detection in rough set theory. Expert
Systems with Applications 36(3), 4680–4687 (2009)
111. Jiang, Y.C., Liu, Y.Z., Liu, X., Zhang, J.K.: Constructing associative classifier using rough
sets and evidence theory. In: International Workshop on Rough Sets, Fuzzy Sets, Data Min-
ing, and Granular-Soft Computing, pp. 263–271. Springer (2007)
112. Jiao, X., Lian-cheng, X., Lin, Q.: Association rules mining algorithm based on rough set. In: International Symposium on Information Technology in Medicine and Education (2012)
113. Joshi, P., Kulkarni, P.: Incremental learning: areas and methods - a survey. International
Journal of Data Mining & Knowledge Management Process 2(5), 43 (2012)
114. Ju, H., Yang, X., Song, X., Qi, Y.: Dynamic updating multigranulation fuzzy rough set: ap-
proximations and reducts. International Journal of Machine Learning and Cybernetics 5(6),
981–990 (2014)
115. Ju, H., Yang, X., Yang, P., Li, H., Zhou, X.: A moderate attribute reduction approach in
decision-theoretic rough set. In: Rough Sets, Fuzzy Sets, Data Mining, and Granular Com-
puting, pp. 376–388. Springer (2015)
116. Ju, H., Yang, X., Yu, H., Li, T., Yu, D.J., Yang, J.: Cost-sensitive rough set approach. Infor-
mation Sciences 355, 282–298 (2016)
117. Jun, Z., Zhou, Y.h.: New heuristic method for data discretization based on rough set theory.
The Journal of China Universities of Posts and Telecommunications 16(6), 113–120 (2009)
118. Kambatla, K., Kollias, G., Kumar, V., Grama, A.: Trends in big data analytics. Journal of
Parallel and Distributed Computing 74(7), 2561–2573 (2014)
119. Kaneiwa, K.: A rough set approach to mining connections from information systems. In: Pro-
ceedings of the 2010 ACM Symposium on Applied Computing, pp. 990–996. ACM (2010)
120. Ke, L., Feng, Z., Ren, Z.: An efficient ant colony optimization approach to attribute reduction
in rough set theory. Pattern Recognition Letters 29(9), 1351–1357 (2008)
121. Komorowski, J., Pawlak, Z., Polkowski, L., Skowron, A.: A rough set perspective on data
and knowledge. The handbook of data mining and knowledge discovery. Oxford University
Press, Oxford (1999)
122. Kryszkiewicz, M.: Rough set approach to incomplete information systems. Information Sciences 112(1), 39–49 (1998)
123. Kumar, P., Krishna, P.R., Bapi, R.S., De, S.K.: Rough clustering of sequential data. Data &
Knowledge Engineering 63(2), 183–199 (2007)
124. Kumar, P., Vadakkepat, P., Poh, L.A.: Fuzzy-rough discriminative feature selection and clas-
sification algorithm, with application to microarray and image datasets. Applied Soft Com-
puting 11(4), 3429–3440 (2011)
125. Kumar, P., Wasan, S.K.: Comparative study of k-means, pam and rough k-means algorithms
using cancer datasets. In: Proceedings of CSIT: 2009 International Symposium on Comput-
ing, Communication, and Control (ISCCC 2009), vol. 1, pp. 136–140 (2011)
126. Kuncheva, L.I.: Fuzzy rough sets: application to feature selection. Fuzzy Sets and Systems
51(2), 147–153 (1992)
127. Lai, J.Z., Juan, E.Y., Lai, F.J.: Rough clustering using generalized fuzzy clustering algorithm.
Pattern Recognition 46(9), 2538–2547 (2013)
128. Lee, S.C., Huang, M.J.: Applying ai technology and rough set theory for mining associa-
tion rules to support crime management and fire-fighting resources allocation. Journal of
Information, Technology and Society 2(65), 65–78 (2002)
129. Lenarcik, A., Piasta, Z.: Discretization of condition attributes space. In: Intelligent Decision
Support, pp. 373–389. Springer (1992)
130. Leung, Y., Fischer, M.M., Wu, W.Z., Mi, J.S.: A rough set approach for the discovery of clas-
sification rules in interval-valued information systems. International Journal of Approximate
Reasoning 47(2), 233–246 (2008)
131. Li, F., Ye, M., Chen, X.: An extension to rough c-means clustering based on decision-
theoretic rough sets model. International Journal of Approximate Reasoning 55(1), 116–129
(2014)
132. Li, H., Li, D., Zhai, Y., Wang, S., Zhang, J.: A variable precision attribute reduction approach
in multilabel decision tables. The Scientific World Journal 2014 (2014)
133. Li, H., Zhang, L., Huang, B., Zhou, X.: Sequential three-way decision and granulation for
cost-sensitive face recognition. Knowledge-Based Systems 91, 241–251 (2016)
134. Li, H., Zhou, X., Zhao, J., Liu, D.: Non-monotonic attribute reduction in decision-theoretic
rough sets. Fundamenta Informaticae 126(4), 415–432 (2013)
135. Li, J., Cercone, N.: A rough set based model to rank the importance of association rules. In:
International Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft Com-
puting, pp. 109–118. Springer (2005)
136. Li, M., Deng, S., Wang, L., Feng, S., Fan, J.: Hierarchical clustering algorithm for categorical
data using a probabilistic rough set model. Knowledge-Based Systems 65, 60–71 (2014)
137. Li, M., Shang, C., Feng, S., Fan, J.: Quick attribute reduction in inconsistent decision tables.
Information Sciences 254, 155–180 (2014)
138. Li, S., Li, T., Liu, D.: Dynamic maintenance of approximations in dominance-based rough
set approach under the variation of the object set. International Journal of Intelligent Systems
28(8), 729–751 (2013)
139. Li, S., Li, T., Liu, D.: Incremental updating approximations in dominance-based rough sets
approach under the variation of the attribute set. Knowledge-Based Systems 40, 17–26
(2013)
140. Li, T., Ruan, D., Geert, W., Song, J., Xu, Y.: A rough sets based characteristic relation
approach for dynamic attribute generalization in data mining. Knowledge-Based Systems
20(5), 485–494 (2007)
141. Liang, J., Wang, F., Dang, C., Qian, Y.: A group incremental approach to feature selection
applying rough set technique. IEEE Transactions on Knowledge and Data Engineering 26(2),
294–308 (2014)
142. Lin, T.Y., Yao, Y.Y., Zadeh, L.A.: Data mining, rough sets and granular computing, vol. 95.
Physica (2013)
143. Lingras, P.: Unsupervised rough set classification using gas. Journal of Intelligent Informa-
tion Systems 16(3), 215–228 (2001)
144. Lingras, P., Chen, M., Miao, D.: Rough cluster quality index based on decision theory. IEEE
Transactions on Knowledge and Data Engineering 21(7), 1014–1026 (2009)
145. Lingras, P., Chen, M., Miao, D.: Qualitative and quantitative combinations of crisp and rough
clustering schemes using dominance relations. International Journal of Approximate Rea-
soning 55(1), 238–258 (2014)
146. Lingras, P., West, C.: Interval set clustering of web users with rough k-means. Journal of
Intelligent Information Systems 23(1), 5–16 (2004)
147. Liu, D., Li, T., Liu, G., Hu, P.: An approach for inducing interesting incremental knowledge
based on the change of attribute values. In: Granular Computing, 2009, GRC’09. IEEE
International Conference on, pp. 415–418. IEEE (2009)
148. Liu, D., Li, T., Ruan, D., Zhang, J.: Incremental learning optimization on knowledge discov-
ery in dynamic business intelligent systems. Journal of Global Optimization 51(2), 325–344
(2011)
149. Liu, D., Li, T., Ruan, D., Zou, W.: An incremental approach for inducing knowledge from
dynamic information systems. Fundamenta Informaticae 94(2), 245–260 (2009)
150. Liu, D., Li, T., Zhang, J.: A rough set-based incremental approach for learning knowledge in
dynamic incomplete information systems. International Journal of Approximate Reasoning
55(8), 1764–1786 (2014)
151. Liu, D., Li, T., Zhang, J.: A rough set-based incremental approach for learning knowledge in
dynamic incomplete information systems. International Journal of Approximate Reasoning
55(8), 1764–1786 (2014)
152. Liu, D., Li, T., Zhang, J.: Incremental updating approximations in probabilistic rough sets
under the variation of attributes. Knowledge-Based Systems 73, 81–96 (2015)
153. Liu, D., Liang, D.: Incremental learning researches on rough set theory: status and future.
International Journal of Rough Sets and Data Analysis (IJRSDA) 1(1), 99–112 (2014)
154. Liu, J., Hu, Q., Yu, D.: A comparative study on rough set based class imbalance learning.
Knowledge-Based Systems 21(8), 753–763 (2008)
155. Liu, J., Hu, Q., Yu, D.: A weighted rough set based method developed for class imbalance
learning. Information Sciences 178(4), 1235–1256 (2008)
156. Liu, Y., Xu, C., Zhang, Q., Pan, Y.: Rough rule extracting from various conditions: Incre-
mental and approximate approaches for inconsistent data. Fundamenta Informaticae 84(3,
4), 403–427 (2008)
157. Lu, J., Tan, Y.P.: Cost-sensitive subspace analysis and extensions for face recognition. IEEE
Transactions on Information Forensics and Security 8(3), 510–519 (2013)
158. Luo, C., Li, T., Chen, H., Liu, D.: Incremental approaches for updating approximations in
set-valued ordered information systems. Knowledge-Based Systems 50, 218–233 (2013)
159. Luo, C., Li, T., Yi, Z., Fujita, H.: Matrix approach to decision-theoretic rough sets for evolv-
ing data. Knowledge-Based Systems 99, 123–134 (2016)
160. Ma, T., Tang, M.: Weighted rough set model. In: Sixth International Conference on Intelli-
gent Systems Design and Applications, vol. 1, pp. 481–485. IEEE (2006)
161. Maji, P., Garai, P.: Fuzzy–rough simultaneous attribute selection and feature extraction algo-
rithm. IEEE Transactions on Cybernetics 43(4), 1166–1177 (2013)
162. Maji, P., Pal, S.K.: Rfcm: A hybrid clustering algorithm using rough and fuzzy sets. Funda-
menta Informaticae 80(4), 475–496 (2007)
163. Mak, B., Munakata, T.: Rule extraction from expert heuristics: A comparative study of rough
sets with neural networks and id3. European Journal of Operational Research 136(1), 212–
229 (2002)
164. Miao, D., Chen, M., Wei, Z., Duan, Q.: A reasonable rough approximation for clustering
web users. In: International Workshop on Web Intelligence Meets Brain Informatics, pp.
428–442. Springer (2006)
165. Min, F., He, H., Qian, Y., Zhu, W.: Test-cost-sensitive attribute reduction. Information Sci-
ences 181(22), 4928–4942 (2011)
166. Min, F., Hu, Q., Zhu, W.: Feature selection with test cost constraint. International Journal of
Approximate Reasoning 55(1), 167–179 (2014)
167. Min, F., Liu, Q.: A hierarchical model for test-cost-sensitive decision systems. Information
Sciences 179(14), 2442–2452 (2009)
168. Min, F., Zhu, W.: Attribute reduction of data with error ranges and test costs. Information
Sciences 211, 48–67 (2012)
169. Mirkin, B.: Mathematical classification and clustering: From how to what and why. In:
Classification, data analysis, and data highways, pp. 172–181. Springer (1998)
170. Mitra, S.: An evolutionary rough partitive clustering. Pattern Recognition Letters 25(12),
1439–1449 (2004)
171. Mitra, S., Banka, H.: Application of rough sets in pattern recognition. In: Transactions on
rough sets VII, pp. 151–169. Springer (2007)
172. Mitra, S., Banka, H., Pedrycz, W.: Rough-fuzzy collaborative clustering. IEEE Transactions
on Systems, Man, and Cybernetics, Part B (Cybernetics) 36(4), 795–805 (2006)
173. Mitra, S., Barman, B.: Rough-fuzzy clustering: an application to medical imagery. In: In-
ternational Conference on Rough Sets and Knowledge Technology, pp. 300–307. Springer
(2008)
174. Nanda, S., Majumdar, S.: Fuzzy rough sets. Fuzzy Sets and Systems 45(2), 157–160 (1992)
175. Nguyen, H.S.: Discretization problem for rough sets methods. In: International Conference
on Rough Sets and Current Trends in Computing, pp. 545–552. Springer (1998)
176. Nguyen, H.S.: On efficient handling of continuous attributes in large data bases. Fundamenta
Informaticae 48(1), 61–81 (2001)
177. Orlowska, E.: Dynamic information systems. Institute of Computer Science, Polish Academy
of Sciences (1981)
178. Ozawa, S., Pang, S., Kasabov, N.: Incremental learning of chunk data for online pattern
classification systems. IEEE Transactions on Neural Networks 19(6), 1061–1074 (2008)
179. Park, I.K., Choi, G.S.: Rough set approach for clustering categorical data using information-
theoretic dependency measure. Information Systems 48, 289–295 (2015)
180. Parmar, D., Wu, T., Blackhurst, J.: Mmr: an algorithm for clustering categorical data using
rough set theory. Data & Knowledge Engineering 63(3), 879–893 (2007)
181. Pawlak, Z.: Rough sets. International Journal of Computer & Information Sciences 11(5),
341–356 (1982)
182. Pawlak, Z.: Rough sets and intelligent data analysis. Information Sciences 147(1), 1–12
(2002)
183. Pawlak, Z., Skowron, A.: Rough sets: some extensions. Information Sciences 177(1), 28 –
40 (2007)
184. Pawlak, Z., Wong, S.K.M., Ziarko, W.: Rough sets: probabilistic versus deterministic ap-
proach. International Journal of Man-Machine Studies 29(1), 81–95 (1988)
185. Pedrycz, W.: Granular computing: an emerging paradigm, vol. 70. Springer Science & Busi-
ness Media (2001)
186. Peters, G.: Outliers in rough k-means clustering. In: International Conference on Pattern
Recognition and Machine Intelligence, pp. 702–707. Springer (2005)
187. Peters, G.: Some refinements of rough k-means clustering. Pattern Recognition 39(8), 1481–
1491 (2006)
188. Peters, G.: Rough clustering utilizing the principle of indifference. Information Sciences
277, 358–374 (2014)
189. Peters, G.: Is there any need for rough clustering? Pattern Recognition Letters 53, 31–37
(2015)
190. Peters, G., Crespo, F., Lingras, P., Weber, R.: Soft clustering–fuzzy and rough approaches
and their extensions and derivatives. International Journal of Approximate Reasoning 54(2),
307–322 (2013)
191. Peters, G., Lampart, M., Weber, R.: Evolutionary rough k-medoid clustering. In: Transactions
on rough sets VIII, pp. 289–306. Springer (2008)
192. Peters, G., Weber, R., Nowatzke, R.: Dynamic rough clustering and its applications. Applied
Soft Computing 12(10), 3193–3207 (2012)
193. Pradeepa, A., Selvadoss Thanamani, A.: Hadoop file system and fundamental concept of MapReduce interior and closure rough set approximations. International Journal of Advanced Research in Computer and Communication Engineering 2 (2013)
194. do Prado, H.A., Engel, P.M., Chaib Filho, H.: Rough clustering: An alternative to find mean-
ingful clusters by using the reducts from a dataset. In: International Conference on Rough
Sets and Current Trends in Computing, pp. 234–238. Springer (2002)
195. Qian, Y., Wang, Q., Cheng, H., Liang, J., Dang, C.: Fuzzy-rough feature selection accelerator.
Fuzzy Sets and Systems 258, 61–78 (2015)
196. Ramentol, E., Caballero, Y., Bello, R., Herrera, F.: SMOTE-RSB*: a hybrid preprocessing ap-
proach based on oversampling and undersampling for high imbalanced data-sets using SMOTE
and rough sets theory. Knowledge and Information Systems 33(2), 245–265 (2012)
197. Riza, L.S., Janusz, A., Bergmeir, C., Cornelis, C., Herrera, F., Ślęzak, D., Benítez, J.M.:
Implementing algorithms of rough set theory and fuzzy rough set theory in the R package
“roughsets”. Information Sciences 287, 68–89 (2014)
198. Salamó, M., López-Sánchez, M.: Rough set based approaches to feature selection for case-
based reasoning classifiers. Pattern Recognition Letters 32(2), 280–292 (2011)
199. Salido, J.F., Murakami, S.: Rough set analysis of a general type of fuzzy data using transitive
aggregations of fuzzy similarity relations. Fuzzy Sets and Systems 139(3), 635–660 (2003)
200. Schapire, R.E., Singer, Y.: BoosTexter: A boosting-based system for text categorization.
Machine Learning 39(2-3), 135–168 (2000)
201. Shan, N., Ziarko, W.: Data-based acquisition and incremental modification of classification
rules. Computational Intelligence 11(2), 357–370 (1995)
202. Shen, F., Yu, H., Kamiya, Y., Hasegawa, O.: An online incremental semi-supervised learning
method. JACIII 14(6), 593–605 (2010)
203. Shen, Q., Chouchoulas, A.: Combining rough sets and data-driven fuzzy learning for gener-
ation of classification rules. Pattern Recognition 32(12), 2073–2076 (1999)
204. Shen, Q., Chouchoulas, A.: A modular approach to generating fuzzy rules with reduced
attributes for the monitoring of complex systems. Engineering Applications of Artificial
Intelligence 13(3), 263–278 (2000)
205. Shen, Q., Jensen, R.: Selecting informative features with fuzzy-rough sets and its application
for complex systems monitoring. Pattern Recognition 37(7), 1351–1363 (2004)
206. Shu, W., Shen, H.: Incremental feature selection based on rough set in dynamic incomplete
data. Pattern Recognition 47(12), 3890–3906 (2014)
207. Singh, G.K., Minz, S.: Discretization using clustering and rough set theory. In: Computing:
Theory and Applications, 2007. ICCTA’07. International Conference on, pp. 330–336. IEEE
(2007)
208. Skowron, A., Rauszer, C.: The discernibility matrices and functions in information systems.
In: Intelligent Decision Support, pp. 331–362. Springer (1992)
209. Skowron, A., Stepaniuk, J.: Tolerance approximation spaces. Fundamenta Informaticae
27(2-3), 245–253 (1996)
210. Ślęzak, D.: Approximate Bayesian networks. In: Technologies for Constructing Intelligent
Systems 2, pp. 313–325. Springer (2002)
211. Ślęzak, D.: Approximate entropy reducts. Fundamenta Informaticae 53(3-4), 365–390 (2002)
212. Ślęzak, D., Ziarko, W.: The investigation of the Bayesian rough set model. International
Journal of Approximate Reasoning 40(1), 81–91 (2005)
213. Slimani, T.: Class association rules mining based rough set method. arXiv preprint
arXiv:1509.05437 (2015)
214. Słowiński, R., Vanderpooten, D.: A generalized definition of rough approximations
based on similarity. IEEE Transactions on Knowledge and Data Engineering 12(2), 331–336
(2000)
215. Soni, R., Nanda, R.: Neighborhood clustering of web users with rough k-means. In: Pro-
ceedings of the 6th WSEAS International Conference on Circuits, Systems, Electronics, Control
& Signal Processing, pp. 570–574 (2007)
216. Stefanowski, J.: The rough set based rule induction technique for classification problems. In:
Proceedings of the 6th European Conference on Intelligent Techniques and Soft Computing
(EUFIT '98) (1998)
217. Stefanowski, J.: On combined classifiers, rule induction and rough sets. In: Transactions on
rough sets VI, pp. 329–350. Springer (2007)
218. Stefanowski, J., Vanderpooten, D.: Induction of decision rules in classification and discovery-
oriented perspectives. International Journal of Intelligent Systems 16(1), 13–27 (2001)
219. Stefanowski, J., Wilk, S.: Rough sets for handling imbalanced data: combining filtering and
rule-based classifiers. Fundamenta Informaticae 72(1-3), 379–391 (2006)
220. Stefanowski, J., Wilk, S.: Extending rule-based classifiers to improve recognition of imbal-
anced classes. In: Advances in Data Management, pp. 131–154. Springer (2009)
221. Su, C.T., Hsu, J.H.: An extended Chi2 algorithm for discretization of real value attributes.
IEEE Transactions on Knowledge and Data Engineering 17(3), 437–441 (2005)
222. Su, C.T., Hsu, J.H.: Precision parameter in the variable precision rough sets model: an appli-
cation. Omega 34(2), 149–157 (2006)
223. Susmaga, R.: Reducts and constructs in classic and dominance-based rough sets approach.
Information Sciences 271, 45–64 (2014)
224. Świniarski, R.W.: Rough sets methods in feature reduction and classification. International
Journal of Applied Mathematics and Computer Science 11(3), 565–582 (2001)
225. Świniarski, R.W., Skowron, A.: Rough set methods in feature selection and recognition. Pat-
tern Recognition Letters 24(6), 833–849 (2003)
226. Tay, F.E., Shen, L.: Economic and financial prediction using rough sets model. European
Journal of Operational Research 141(3), 641–659 (2002)
227. Tsang, E.C., Chen, D., Yeung, D.S., Wang, X.Z., Lee, J.W.: Attributes reduction using fuzzy
rough sets. IEEE Transactions on Fuzzy Systems 16(5), 1130–1141 (2008)
228. Tsoumakas, G., Katakis, I.: Multi-label classification: An overview. Dept. of Informatics,
Aristotle University of Thessaloniki, Greece (2006)
229. Tsoumakas, G., Vlahavas, I.: Random k-labelsets: An ensemble method for multilabel clas-
sification. In: European Conference on Machine Learning, pp. 406–417. Springer (2007)
230. Tsumoto, S.: Automated extraction of medical expert system rules from clinical databases
based on rough set theory. Information Sciences 112(1), 67–84 (1998)
231. Tsumoto, S.: Automated extraction of hierarchical decision rules from clinical databases
using rough set model. Expert Systems with Applications 24(2), 189–197 (2003)
232. Tsumoto, S.: Incremental rule induction based on rough set theory. In: International Sympo-
sium on Methodologies for Intelligent Systems, pp. 70–79. Springer (2011)
233. Vanderpooten, D.: Similarity relation as a basis for rough approximations. Advances in
Machine Intelligence and Soft Computing 4, 17–33 (1997)
234. Verbiest, N.: Fuzzy rough and evolutionary approaches to instance selection. Ph.D. thesis,
Ghent University (2014)
235. Verbiest, N., Cornelis, C., Herrera, F.: FRPS: a fuzzy rough prototype selection method. Pat-
tern Recognition 46(10), 2770–2782 (2013)
236. Vilalta, R., Drissi, Y.: A perspective view and survey of meta-learning. Artificial Intelligence
Review 18(2), 77–95 (2002)
237. Voges, K., Pope, N., Brown, M.: A rough cluster analysis of shopping orientation data. In:
Proceedings Australian and New Zealand Marketing Academy Conference, Adelaide, pp.
1625–1631 (2003)
238. Voges, K.E., Pope, N., Brown, M.R.: Cluster analysis of marketing data examining on-line
shopping orientation: A comparison of k-means and rough clustering approaches. Heuristics
and Optimization for Knowledge Discovery pp. 207–224 (2002)
239. Wang, F., Liang, J., Dang, C.: Attribute reduction for dynamic data sets. Applied Soft Com-
puting 13(1), 676–689 (2013)
240. Wang, F., Liang, J., Qian, Y.: Attribute reduction: a dimension incremental strategy.
Knowledge-Based Systems 39, 95–108 (2013)
241. Wang, G., Yu, H., Li, T., et al.: Decision region distribution preservation reduction in
decision-theoretic rough set model. Information Sciences 278, 614–640 (2014)
242. Wang, X., An, S., Shi, H., Hu, Q.: Fuzzy rough decision trees for multi-label classification.
In: Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, pp. 207–217. Springer
(2015)
243. Wang, X., Yang, J., Peng, N., Teng, X.: Finding minimal rough set reducts with particle
swarm optimization. In: International Workshop on Rough Sets, Fuzzy Sets, Data Mining,
and Granular-Soft Computing, pp. 451–460. Springer (2005)
244. Wang, X., Yang, J., Teng, X., Xia, W., Jensen, R.: Feature selection based on rough sets and
particle swarm optimization. Pattern Recognition Letters 28(4), 459–471 (2007)
245. Wei, M.H., Cheng, C.H., Huang, C.S., Chiang, P.C.: Discovering medical quality of total
hip arthroplasty by rough set classifier with imbalanced class. Quality & Quantity 47(3),
1761–1779 (2013)
246. Wojna, A.: Constraint based incremental learning of classification rules. In: International
Conference on Rough Sets and Current Trends in Computing, pp. 428–435. Springer (2000)
247. Wróblewski, J.: Finding minimal reducts using genetic algorithms. In: Proceedings of the
Second Annual Joint Conference on Information Sciences, pp. 186–189 (1995)
248. Wróblewski, J.: Theoretical foundations of order-based genetic algorithms. Fundamenta
Informaticae 28(3-4), 423–430 (1996)
249. Wróblewski, J.: Ensembles of classifiers based on approximate reducts. Fundamenta Infor-
maticae 47(3-4), 351–360 (2001)
250. Wu, Q., Bell, D.: Multi-knowledge extraction and application. In: International Workshop on
Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft Computing, pp. 274–278. Springer
(2003)
251. Xie, H., Cheng, H.Z., Niu, D.X.: Discretization of continuous attributes in rough set the-
ory based on information entropy. Chinese Journal of Computers (Chinese Edition) 28(9),
1570 (2005)
252. Xu, Y., Wang, L., Zhang, R.: A dynamic attribute reduction algorithm based on 0-1 integer
programming. Knowledge-Based Systems 24(8), 1341–1347 (2011)
253. Xu, Z., Liang, J., Dang, C., Chin, K.: Inclusion degree: a perspective on measures for rough
set data analysis. Information Sciences 141(3), 227–236 (2002)
254. Yang, Q., Ling, C., Chai, X., Pan, R.: Test-cost sensitive classification on data with missing
values. IEEE Transactions on Knowledge and Data Engineering 18(5), 626–638 (2006)
255. Yang, X., Qi, Y., Song, X., Yang, J.: Test cost sensitive multigranulation rough set: model
and minimal cost selection. Information Sciences 250, 184–199 (2013)
256. Yang, X., Qi, Y., Yu, H., Song, X., Yang, J.: Updating multigranulation rough approximations
with increasing of granular structures. Knowledge-Based Systems 64, 59–69 (2014)
257. Yang, Y., Chen, D., Dong, Z.: Novel algorithms of attribute reduction with variable precision
rough set model. Neurocomputing 139, 336–344 (2014)
258. Yang, Y., Chen, Z., Liang, Z., Wang, G.: Attribute reduction for massive data based on rough
set theory and mapreduce. In: International Conference on Rough Sets and Knowledge Tech-
nology, pp. 672–678. Springer (2010)
259. Yao, J., Yao, Y.: A granular computing approach to machine learning. FSKD 2, 732–736
(2002)
260. Yao, Y.: Combination of rough and fuzzy sets based on α-level sets. In: Rough Sets and Data
Mining, pp. 301–321. Springer (1997)
261. Yao, Y.: Decision-theoretic rough set models. In: International Conference on Rough Sets
and Knowledge Technology, pp. 1–12. Springer (2007)
262. Yao, Y.: Three-way decision: an interpretation of rules in rough set theory. In: International
Conference on Rough Sets and Knowledge Technology, pp. 642–649. Springer (2009)
263. Yao, Y.: Three-way decisions with probabilistic rough sets. Information Sciences 180(3),
341–353 (2010)
264. Yao, Y.: The superiority of three-way decisions in probabilistic rough set models. Information
Sciences 181(6), 1080–1096 (2011)
265. Yao, Y.: An outline of a theory of three-way decisions. In: International Conference on
Rough Sets and Current Trends in Computing, pp. 1–17. Springer (2012)
266. Yao, Y., Greco, S., Słowiński, R.: Probabilistic rough sets. In: Springer Handbook of Com-
putational Intelligence, pp. 387–411. Springer (2015)
267. Yao, Y., Zhao, Y.: Attribute reduction in decision-theoretic rough set models. Information
Sciences 178(17), 3356–3373 (2008)
268. Yao, Y., Zhao, Y., Maguire, R.B.: Explanation oriented association mining using rough set
theory. In: International Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-
Soft Computing, pp. 165–172. Springer (2003)
269. Yao, Y., Zhou, B.: Two Bayesian approaches to rough sets. European Journal of Operational
Research 251(3), 904–917 (2016)
270. Ye, D., Chen, Z., Ma, S.: A novel and better fitness evaluation for rough set based minimum
attribute reduction problem. Information Sciences 222, 413–423 (2013)
271. Yong, L., Congfu, X., Yunhe, P.: An incremental rule extracting algorithm based on Pawlak
reduction. In: Systems, Man and Cybernetics, 2004 IEEE International Conference on, vol. 6,
pp. 5964–5968. IEEE (2004)
272. Yong, L., Wenliang, H., Yunliang, J., Zhiyong, Z.: Quick attribute reduct algorithm for neigh-
borhood rough set model. Information Sciences 271, 65–81 (2014)
273. Yu, H., Chu, S., Yang, D.: Autonomous knowledge-oriented clustering using decision-
theoretic rough set theory. Fundamenta Informaticae 115(2-3), 141–156 (2012)
274. Yu, H., Liu, Z., Wang, G.: An automatic method to determine the number of clusters using
decision-theoretic rough set. International Journal of Approximate Reasoning 55(1), 101–
115 (2014)
275. Yu, H., Su, T., Zeng, X.: A three-way decisions clustering algorithm for incomplete data. In:
International Conference on Rough Sets and Knowledge Technology, pp. 765–776. Springer
(2014)
276. Yu, H., Wang, G., Lan, F.: Solving the attribute reduction problem with ant colony optimiza-
tion. In: Transactions on rough sets XIII, pp. 240–259. Springer (2011)
277. Yu, H., Wang, Y.: Three-way decisions method for overlapping clustering. In: International
Conference on Rough Sets and Current Trends in Computing, pp. 277–286. Springer (2012)
278. Yu, H., Wang, Y., Jiao, P.: A three-way decisions approach to density-based overlapping
clustering. In: Transactions on Rough Sets XVIII, pp. 92–109. Springer (2014)
279. Yu, H., Zhang, C., Hu, F.: An incremental clustering approach based on three-way decisions.
In: International Conference on Rough Sets and Current Trends in Computing, pp. 152–159.
Springer (2014)
280. Yu, Y., Miao, D., Zhang, Z., Wang, L.: Multi-label classification using rough sets. In: Inter-
national Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft Computing,
pp. 119–126. Springer (2013)
281. Yu, Y., Pedrycz, W., Miao, D.: Multi-label classification by exploiting label correlations.
Expert Systems with Applications 41(6), 2989–3004 (2014)
282. Zhai, J., Zhang, S., Zhang, Y.: An extension of rough fuzzy set. Journal of Intelligent &
Fuzzy Systems (Preprint), 1–10 (2016)
283. Zhai, J., Zhang, Y., Zhu, H.: Three-way decisions model based on tolerance rough fuzzy set.
International Journal of Machine Learning and Cybernetics pp. 1–9 (2016)
284. Zhang, H.R., Min, F.: Three-way recommender systems based on random forests.
Knowledge-Based Systems 91, 275–286 (2016)
285. Zhang, J., Li, T., Chen, H.: Composite rough sets. In: International Conference on Artificial
Intelligence and Computational Intelligence, pp. 150–159. Springer (2012)
286. Zhang, J., Li, T., Chen, H.: Composite rough sets for dynamic data mining. Information
Sciences 257, 81–100 (2014)
287. Zhang, J., Li, T., Pan, Y.: Parallel rough set based knowledge acquisition using mapreduce
from big data. In: Proceedings of the 1st International Workshop on Big Data, Streams and
Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applica-
tions, pp. 20–27. ACM (2012)
288. Zhang, J., Li, T., Ruan, D., Gao, Z., Zhao, C.: A parallel method for computing rough set
approximations. Information Sciences 194, 209–223 (2012)
289. Zhang, J., Li, T., Ruan, D., Liu, D.: Rough sets based matrix approaches with dynamic at-
tribute variation in set-valued information systems. International Journal of Approximate
Reasoning 53(4), 620–635 (2012)
290. Zhang, L., Hu, Q., Duan, J., Wang, X.: Multi-label feature selection with fuzzy rough sets. In:
International Conference on Rough Sets and Knowledge Technology, pp. 121–128. Springer
(2014)
291. Zhang, L., Li, H., Zhou, X., Huang, B., Shang, L.: Cost-sensitive sequential three-way deci-
sion for face recognition. In: International Conference on Rough Sets and Intelligent Systems
Paradigms, pp. 375–383. Springer (2014)
292. Zhang, M.L., Zhou, Z.H.: ML-KNN: A lazy learning approach to multi-label learning. Pattern
Recognition 40(7), 2038–2048 (2007)
293. Zhang, T., Chen, L., Ma, F.: An improved algorithm of rough k-means clustering based on
variable weighted distance measure. International Journal of Database Theory and Applica-
tion 7(6), 163–174 (2014)
294. Zhang, T., Chen, L., Ma, F.: A modified rough c-means clustering algorithm based on hybrid
imbalanced measure of distance and density. International Journal of Approximate Reason-
ing 55(8), 1805–1818 (2014)
295. Zhang, X., Miao, D.: Three-way weighted entropies and three-way attribute reduction. In:
International Conference on Rough Sets and Knowledge Technology, pp. 707–719. Springer
(2014)
296. Zhang, Y., Zhou, Z.H.: Cost-sensitive face recognition. IEEE Transactions on Pattern Anal-
ysis and Machine Intelligence 32(10), 1758–1769 (2010)
297. Zhao, H., Min, F., Zhu, W.: Test-cost-sensitive attribute reduction based on neighborhood
rough set. In: Granular Computing (GrC), 2011 IEEE International Conference on, pp. 802–
806. IEEE (2011)
298. Zhao, H., Wang, P., Hu, Q.: Cost-sensitive feature selection based on adaptive neighborhood
granularity with multi-level confidence. Information Sciences 366, 134–149 (2016)
299. Zhao, M., Luo, K., Liao, X.X.: Rough set attribute reduction algorithm based on immune ge-
netic algorithm. Jisuanji Gongcheng yu Yingyong (Computer Engineering and Applications)
42(23), 171–173 (2007)
300. Zhao, S., Chen, H., Li, C., Du, X., Sun, H.: A novel approach to building a robust fuzzy
rough classifier. IEEE Transactions on Fuzzy Systems 23(4), 769–786 (2015)
301. Zhao, S., Tsang, E.C., Chen, D.: The model of fuzzy variable precision rough sets. IEEE
Transactions on Fuzzy Systems 17(2), 451–467 (2009)
302. Zhao, S., Tsang, E.C., Chen, D., Wang, X.: Building a rule-based classifier: a fuzzy-rough set
approach. IEEE Transactions on Knowledge and Data Engineering 22(5), 624–638 (2010)
303. Zheng, Z., Wang, G., Wu, Y.: A rough set and rule tree based incremental knowledge acqui-
sition algorithm. In: International Workshop on Rough Sets, Fuzzy Sets, Data Mining, and
Granular-Soft Computing, pp. 122–129. Springer (2003)
304. Zhong, N., Dong, J., Ohsuga, S.: Using rough sets with heuristics for feature selection. Jour-
nal of Intelligent Information Systems 16(3), 199–214 (2001)
305. Zhou, Z.H.: Cost-sensitive learning. In: International Conference on Modeling Decisions for
Artificial Intelligence, pp. 17–18. Springer (2011)
306. Zhou, Z.H., Liu, X.Y.: Training cost-sensitive neural networks with methods addressing the
class imbalance problem. IEEE Transactions on Knowledge and Data Engineering 18(1),
63–77 (2006)
307. Zhu, W.: Generalized rough sets based on relations. Information Sciences 177(22), 4997–
5011 (2007)
308. Zhu, W.: Topological approaches to covering rough sets. Information Sciences 177(6), 1499–
1508 (2007)
309. Ziarko, W.: Variable precision rough set model. Journal of Computer and System Sciences
46(1), 39–59 (1993)
310. Zou, W., Li, T., Chen, H., Ji, X.: Approaches for incrementally updating approximations
based on set-valued information systems while attribute values’ coarsening and refining. In:
2009 IEEE International Conference on Granular Computing (2009)