Analysis of Imbalanced Classification Algorithms: A Perspective View
Volume: 3 | Issue: 2 | Jan-Feb 2019 Available Online: www.ijtsrd.com e-ISSN: 2456 - 6470
ABSTRACT
Classification of data has become an important research area, and classifying documents into predefined categories is a typical example. Unbalanced data sets, a problem often found in real-world applications, can have a seriously negative effect on the classification performance of machine learning algorithms. There have been many attempts at dealing with the classification of unbalanced data sets. In this paper we present a brief review of existing solutions to the class-imbalance problem proposed at both the data and algorithmic levels. Although a common practice for handling imbalanced data is to rebalance the classes artificially by oversampling and/or under-sampling, some researchers have shown that modified support vector machines, rough set based minority-class-oriented rule learning methods, and cost-sensitive classifiers also perform well on imbalanced data sets. We observe that current research on the imbalanced data problem is moving toward hybrid algorithms.
Keywords: cost-sensitive learning, imbalanced data set, modified SVM, oversampling, undersampling
I. INTRODUCTION
A data set is called imbalanced if it contains many more samples from one class than from the rest of the classes. Data sets are unbalanced when at least one class is represented by only a small number of training examples (called the minority class) while the other classes make up the majority. In this scenario, classifiers can have good accuracy on the majority class but very poor accuracy on the minority class(es), due to the influence that the larger majority class has on traditional training criteria. Most classification algorithms aim to minimize the error rate, the percentage of incorrect predictions of class labels, and ignore the difference between types of misclassification errors. In particular, they implicitly assume that all misclassification errors cost the same.

In many real-world applications, this assumption is not true, and the differences between misclassification errors can be quite large. For example, in medical diagnosis of a certain cancer, if the cancer is regarded as the positive class and non-cancer (healthy) as negative, then missing a cancer (the patient is actually positive but is classified as negative, a "false negative") is much more serious, and thus more expensive, than a false-positive error: the patient could lose his or her life because of the delay in correct diagnosis and treatment. Similarly, if carrying a bomb is positive, then it is much more expensive to miss a terrorist who carries a bomb onto a flight than to search an innocent person.

The unbalanced data set problem appears in many real-world applications such as text categorization, fault detection, fraud detection, oil-spill detection in satellite images, toxicology, cultural modeling, and medical diagnosis [1]. Many research papers on imbalanced data sets have commonly agreed that, because of this unequal class distribution, the performance of existing classifiers tends to be biased towards the majority class. The reasons for the poor performance of existing classification algorithms on imbalanced data sets are:
1. They are accuracy driven, i.e., their goal is to minimize the overall error, to which the minority class contributes very little.
2. They assume an equal distribution of data across all the classes.
3. They assume that the errors coming from different classes have the same cost [2].

With unbalanced data sets, data mining algorithms produce degenerate models that do not take the minority class into account, since most of them assume a balanced data set.

A number of solutions to the class-imbalance problem have been proposed at both the data and algorithmic levels [3]. At the data level, these solutions include many different forms of re-sampling, such as random oversampling with replacement, random undersampling, directed oversampling (in which no new examples are created, but the choice of samples to replace is informed rather than random), directed undersampling (where, again, the choice of examples to eliminate is informed), oversampling with informed generation of new samples, and combinations of the above techniques. At the algorithmic level, solutions include adjusting the costs of the various classes so as to counter the class imbalance, adjusting the probabilistic estimate at the tree leaf (when working with decision trees), adjusting the decision threshold, and recognition-based (i.e., learning from one class) rather than discrimination-based (two-class) learning. The most common techniques for dealing with unbalanced data include resizing training data sets, cost-sensitive classifiers, and the snowball method. Recently, several methods with good performance on unbalanced data have been proposed, including modified SVMs, k-nearest neighbor (kNN), neural networks, genetic programming, rough set based algorithms, and probabilistic decision trees. The next sections discuss some of these methods in detail.

II. SAMPLING METHODS
A simple data-level method for balancing the classes consists of re-sampling the original data set, either by over-sampling the minority class or by under-sampling the majority class, until the classes are approximately equally represented. Both strategies can be applied in any learning system, since they act as a preprocessing phase, allowing the learning system to receive the training instances as if they belonged to a well-balanced data set. Thus, any bias of the system towards the majority class due to the different proportion of examples per class would be expected to be suppressed.

Hulse et al. [4] suggest that the utility of the re-sampling methods depends on a number of factors, including the ratio between positive and negative examples, other characteristics of the data, and the nature of the classifier. However, re-sampling methods have shown important drawbacks. Under-sampling may throw out potentially useful data, while over-sampling artificially increases the size of the data set and, consequently, worsens the computational burden of the learning algorithm.

A. Oversampling
The simplest method to increase the size of the minority class is random over-sampling, a non-heuristic method that balances the class distribution through the random replication of positive examples. Nevertheless, since this method replicates existing examples in the minority class, overfitting is more likely to occur.
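As a minimal sketch of random over-sampling (plain NumPy; the function name, 0/1 labels, and the fully balanced 1:1 target are our own assumptions, not from the papers surveyed):

```python
import numpy as np

def random_oversample(X, y, minority_label=1, seed=0):
    """Balance a binary data set by replicating randomly chosen
    minority-class examples until both classes are the same size."""
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(y == minority_label)
    majority = np.flatnonzero(y != minority_label)
    # sample WITH replacement: duplicates are unavoidable, which is
    # exactly why random over-sampling is prone to overfitting
    extra = rng.choice(minority, size=len(majority) - len(minority),
                       replace=True)
    keep = np.concatenate([majority, minority, extra])
    return X[keep], y[keep]
```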
Chawla proposed the Synthetic Minority Over-sampling Technique (SMOTE) [5], an over-sampling approach in which the minority class is over-sampled by creating synthetic examples rather than by over-sampling with replacement. The minority class is over-sampled by taking each minority class sample and introducing synthetic examples along the line segments joining any or all of its k nearest minority class neighbors. Depending upon the amount of over-sampling required, neighbors from the k nearest neighbors are randomly chosen. Several modifications of the original SMOTE algorithm have been proposed in the literature. Since the SMOTE approach does not handle data sets with nominal features, it was generalized to handle mixed data sets of continuous and nominal features: Chawla proposed SMOTE-NC (Synthetic Minority Over-sampling Technique Nominal Continuous) and SMOTE-N (Synthetic Minority Over-sampling Technique Nominal), by which SMOTE can also be extended to nominal features.
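A minimal sketch of the SMOTE interpolation step just described (plain NumPy; parameter names are ours, and production implementations such as the one in the imbalanced-learn library add many refinements):

```python
import numpy as np

def smote(X_min, n_synthetic, k=5, seed=0):
    """Generate n_synthetic new minority samples by interpolating
    between a randomly chosen minority sample and one of its k
    nearest minority-class neighbors, as in Chawla et al.'s SMOTE."""
    rng = np.random.default_rng(seed)
    # pairwise Euclidean distances within the minority class
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)               # never pick yourself
    neighbors = np.argsort(dist, axis=1)[:, :k]  # k nearest per sample
    out = np.empty((n_synthetic, X_min.shape[1]))
    for s in range(n_synthetic):
        i = rng.integers(len(X_min))             # a minority sample
        j = rng.choice(neighbors[i])             # one of its neighbors
        gap = rng.random()                       # position on the segment
        out[s] = X_min[i] + gap * (X_min[j] - X_min[i])
    return out
```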
Andrew Estabrooks et al. proposed a multiple re-sampling method which selects the most appropriate re-sampling rate adaptively [6]. Taeho Jo et al. put forward a cluster-based over-sampling method which deals with between-class imbalance and within-class imbalance simultaneously [7]. Hongyu Guo et al. identified hard examples of the majority and minority classes during the process of boosting, then generated new synthetic examples from the hard examples and added them to the data sets [8]. Based on the SMOTE method, Hui Han and Wen-Yuan Wang [9] presented two new minority over-sampling methods, borderline-SMOTE1 and borderline-SMOTE2, in which only the minority examples near the borderline are over-sampled. These approaches achieve a better TP rate and F-value than SMOTE and random over-sampling methods.

B. Undersampling
Under-sampling is an efficient method for class-imbalance learning. This method uses a subset of the majority class to train the classifier. Since many majority class examples are ignored, the training set becomes more balanced and the training process becomes faster. The most common preprocessing technique is random majority under-sampling (RUS), in which instances of the majority class are randomly discarded from the data set.
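A corresponding sketch of random majority under-sampling (again assuming 0/1 labels and a 1:1 target):

```python
import numpy as np

def random_undersample(X, y, minority_label=1, seed=0):
    """Randomly discard majority-class examples until the two
    classes are the same size (random under-sampling, RUS)."""
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(y == minority_label)
    majority = np.flatnonzero(y != minority_label)
    kept = rng.choice(majority, size=len(minority), replace=False)
    keep = np.concatenate([kept, minority])
    return X[keep], y[keep]
```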
However, the main drawback of under-sampling is that potentially useful information contained in the ignored examples is neglected. There have been many attempts to improve upon the performance of random sampling, such as Tomek links, the Condensed Nearest Neighbor Rule, and one-sided selection. One-sided selection (OSS), proposed by Kubat and Matwin, attempts to intelligently under-sample the majority class by removing majority class examples that are considered either redundant or noisy. Simple over-sampling, by contrast, improves minority class recognition by randomly duplicating minority data; this adds no new information about the minority class and can also lead to over-fitting.

For problems like fraud detection, a highly overlapped unbalanced classification problem in which non-fraud samples heavily outnumber fraud samples, T. Maruthi Padmaja [10] proposed a hybrid sampling technique: a combination of SMOTE to over-sample the minority data (fraud samples) and random under-sampling to under-sample the majority data (non-fraud samples). If extreme outliers are eliminated from the minority samples, classification accuracy can be improved for highly skewed data sets like fraud detection.
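A sketch of this hybrid idea, reusing the smote() routine from the sketch above (the meet-in-the-middle target size is our assumption, not the exact procedure of [10]):

```python
import numpy as np

def hybrid_sample(X, y, seed=0):
    """Hybrid sampling sketch (0/1 labels, 1 = minority): SMOTE the
    minority class part of the way up while randomly under-sampling
    the majority class down, meeting at a common intermediate size."""
    rng = np.random.default_rng(seed)
    X_min, X_maj = X[y == 1], X[y == 0]
    target = (len(X_min) + len(X_maj)) // 2        # meet in the middle
    synth = smote(X_min, target - len(X_min), seed=seed)
    keep = rng.choice(len(X_maj), size=target, replace=False)
    X_new = np.vstack([X_maj[keep], X_min, synth])
    y_new = np.concatenate([np.zeros(target, int), np.ones(target, int)])
    return X_new, y_new
```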
Sampling methods consider the class skew and properties of the data set as a whole. However, machine learning and data mining often face nontrivial data sets which exhibit characteristics and properties at a local, rather than global, level. A classifier improved through global sampling levels may be insensitive to the peculiarities of different components or modalities in the data, resulting in suboptimal performance. David A. Cieslak and Nitesh V. Chawla [11] have suggested that, to improve classifier performance, sampling can be treated locally instead of applying uniform levels of sampling globally. They proposed a framework which first identifies meaningful regions of data and then proceeds to find optimal sampling levels within each.

There are known disadvantages associated with the use of sampling to implement cost-sensitive learning. The disadvantage of undersampling is that it discards potentially useful data. The main disadvantage of oversampling is that, by making exact copies of existing examples, it makes overfitting likely; in fact, with oversampling it is quite common for a learner to generate a classification rule to cover a single, replicated example. A second disadvantage of oversampling is that it increases the number of training examples, thus increasing the learning time.
Given these disadvantages, sampling is still a more popular way to deal with imbalanced data than cost-sensitive learning algorithms. There are several reasons for this. The most obvious is that cost-sensitive implementations are not available for all learning algorithms, so a wrapper-based approach using sampling is the only option. While this is certainly less true today than in the past, many learning algorithms (e.g., C4.5) still do not directly handle costs in the learning process. A second reason for using sampling is that many highly skewed data sets are enormous and the size of the training set must be reduced in order for learning to be feasible.

In this case, undersampling seems to be a reasonable, and valid, strategy: if one needs to discard some training data, it still might be beneficial to discard some of the majority class examples in order to reduce the training set to the required size, and then also employ a cost-sensitive learning algorithm, so that the amount of discarded training data is minimized. A final reason that may have contributed to the use of sampling rather than a cost-sensitive learning algorithm is that misclassification costs are often unknown. However, this is not a valid reason for preferring sampling, since the analogous issue arises with sampling: what should the class distribution of the final training data be? If cost information is not known, a measure such as the area under the ROC curve can be used to measure classifier performance, and both approaches can then empirically determine the proper cost ratio or class distribution [12].
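For example, a practitioner could sweep candidate cost ratios and keep the one with the best validation AUC. A minimal sketch with scikit-learn (the decision-tree learner, the candidate ratios, and the use of class_weight to emulate a cost ratio are our assumptions, not a procedure from [12]):

```python
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from sklearn.tree import DecisionTreeClassifier

def pick_cost_ratio(X, y, ratios=(1, 2, 5, 10, 20), seed=0):
    """Empirically choose a minority:majority cost ratio (0/1 labels)
    by validation AUC, since true misclassification costs are unknown."""
    X_tr, X_va, y_tr, y_va = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)
    best = None
    for r in ratios:
        clf = DecisionTreeClassifier(class_weight={0: 1, 1: r},
                                     random_state=seed).fit(X_tr, y_tr)
        auc = roc_auc_score(y_va, clf.predict_proba(X_va)[:, 1])
        if best is None or auc > best[0]:
            best = (auc, r)
    return best[1]   # the cost ratio with the highest validation AUC
```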
III. COST-SENSITIVE LEARNING
At the algorithmic level, solutions include adjusting the costs of the various classes so as to counter the class imbalance, adjusting the probabilistic estimate at the tree leaf (when working with decision trees), adjusting the decision threshold, and recognition-based (i.e., learning from one class) rather than discrimination-based (two-class) learning.

Cost-sensitive learning is a type of learning in data mining that takes misclassification costs (and possibly other types of cost) into consideration. There are many ways to implement cost-sensitive learning; in [13] they are categorized into three classes: the first applies misclassification costs to the data set as a form of data space weighting, the second applies cost-minimizing techniques to the combination schemes of ensemble methods, and the last incorporates cost-sensitive features directly into classification paradigms, essentially fitting the cost-sensitive framework into these classifiers.
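Both the cost weighting and the threshold adjustment mentioned above can be illustrated with the standard cost-matrix decision rule: given class-membership probabilities, predict the class with the smallest expected cost. A minimal sketch (the cost values are illustrative assumptions):

```python
import numpy as np

# COST[i][j] = cost of predicting class j when the true class is i;
# here a false negative (row 1, col 0) is 10x worse than a false positive.
COST = np.array([[0.0, 1.0],
                 [10.0, 0.0]])

def min_expected_cost(proba):
    """proba: (n, 2) class-membership probabilities.
    Expected cost of predicting j is sum_i p(i) * COST[i, j];
    pick the column (predicted class) that minimizes it."""
    return np.argmin(proba @ COST, axis=1)

# With this matrix the implied decision threshold on p(class 1) drops
# from 0.5 to 1/(1+10), about 0.09: cost weighting moves the threshold.
```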
Cost can be incorporated into decision tree classification, one of the most widely used and simplest classifiers, in various ways: cost can be applied to adjust the decision threshold, cost can be used in splitting-attribute selection during decision tree construction, and cost-sensitive pruning schemes can be applied to the tree. Ref. [14] proposes a method for building and testing decision trees that minimizes the total sum of the misclassification and test costs. Their algorithm chooses a splitting attribute that minimizes the total cost, the sum of the test cost and the misclassification cost, rather than choosing an attribute that minimizes the entropy.

Information gain and Gini measures are considered to be skew sensitive [15]. In Ref. [16] a new decision tree algorithm called Class Confidence Proportion Decision Tree (CCPDT) is proposed which is robust and insensitive to class sizes and generates rules which are statistically significant. Ref. [17] analytically and empirically demonstrates the strong skew insensitivity of Hellinger distance and its advantages over popular alternative metrics, concluding that for imbalanced data it is sufficient to use Hellinger trees with bagging, without any sampling methods.
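As a concrete illustration of the criterion, the Hellinger distance between the class-conditional distributions induced by a binary split can be computed directly from per-branch true- and false-positive rates; a sketch following the definition used for Hellinger trees (variable names are ours):

```python
import numpy as np

def hellinger_split_value(y, branch):
    """Hellinger distance between the positive- and negative-class
    conditional distributions induced by a candidate binary split.
    y: 0/1 labels; branch: branch index (0/1) of each example.
    It uses P(branch | class), so class priors, and hence class
    skew, cancel out of the criterion."""
    pos, neg = (y == 1), (y == 0)
    total = 0.0
    for v in (0, 1):
        tpr = (pos & (branch == v)).sum() / max(pos.sum(), 1)
        fpr = (neg & (branch == v)).sum() / max(neg.sum(), 1)
        total += (np.sqrt(tpr) - np.sqrt(fpr)) ** 2
    return np.sqrt(total)  # larger = better separation; maximize over splits
```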
Ref. [18] uses different operators of genetic algorithms for oversampling to enlarge the ratio of positive samples, and then applies clustering to the oversampled training data set as a data cleaning method for both classes, removing redundant or noisy samples. They used AUC as the evaluation metric and found that their algorithm performed better.

Nguyen Ha Vo and Yonggwan Won [19] extended the Regularized Least Squares (RLS) algorithm to penalize errors of different samples with different weights, together with some rules of thumb to determine those weights. The significantly better classification accuracy of weighted RLS classifiers showed it to be a promising substitute for previous cost-sensitive classification methods on unbalanced data sets. This approach is equivalent to up-sampling or down-sampling depending on the cost chosen; for example, doubling the cost-sensitivity of one class is said to be equivalent to doubling the number of samples in that class.
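Per-sample weighting in regularized least squares has a closed-form solution, which makes the approach easy to sketch (our notation, with inverse-frequency class weights as one possible rule of thumb; the paper's exact rules may differ):

```python
import numpy as np

def weighted_rls(X, y, sample_weight, lam=1.0):
    """Weighted regularized least squares:
    minimize sum_i w_i (y_i - x_i . beta)^2 + lam * ||beta||^2,
    solved in closed form: beta = (X'WX + lam I)^-1 X'Wy."""
    W = np.diag(sample_weight)
    d = X.shape[1]
    return np.linalg.solve(X.T @ W @ X + lam * np.eye(d), X.T @ W @ y)

def class_weights(y):
    """Weight each class inversely to its frequency, so doubling a
    class's weight acts like doubling its number of samples (the
    equivalence noted above). Assumes 0/1 labels."""
    return np.where(y == 1, len(y) / (2 * (y == 1).sum()),
                            len(y) / (2 * (y == 0).sum()))
```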
Ref. [20] proposed BABoost, a variant of AdaBoost that reduces each within-group error. The AdaBoost algorithm gives equal weight to each misclassified example, but the misclassification error of each class is not the same: generally, the misclassification error of the minority class will be larger than the majority's, so AdaBoost will exhibit higher bias and a smaller margin when encountering a skewed distribution. The BABoost algorithm, in each round of boosting, assigns more weight to the misclassified examples, especially those in the minority class.
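The flavor of such class-aware boosting is visible in the weight-update step. The sketch below shows one AdaBoost-style round in which misclassified examples carrying higher (e.g., minority-class) cost factors are up-weighted faster; it is an illustrative variant, not the published BABoost pseudocode:

```python
import numpy as np

def boost_round_update(w, y, pred, cost):
    """One AdaBoost-style reweighting round with per-example costs.
    w: current example weights; y, pred: true/predicted labels in {-1,+1};
    cost: per-example cost factors (larger for minority examples).
    Misclassified examples grow by exp(alpha * cost)."""
    miss = (y != pred)
    err = w[miss].sum() / w.sum()            # weighted error rate
    err = np.clip(err, 1e-10, 1 - 1e-10)
    alpha = 0.5 * np.log((1 - err) / err)    # classifier weight
    w = w * np.exp(alpha * cost * miss)      # costly mistakes grow faster
    return w / w.sum(), alpha                # renormalize
```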
Yanmin Sun and Mohamed S. Kamel [21] explored three cost-sensitive boosting algorithms, developed by introducing cost items into the learning framework of AdaBoost. These boosting algorithms are also studied with respect to their weighting strategies towards different types of samples, and their effectiveness in identifying rare cases through experiments on several real-world medical data sets where the class imbalance problem prevails.

IV. SVM AND IMBALANCED DATASETS
The success of SVM is very limited when it is applied to the problem of learning from imbalanced data sets in which negative instances heavily outnumber the positive instances. Even though undersampling the majority class does improve SVM performance, there is an inherent loss of valuable information in this process. Rehan Akbani [22] combined sampling and cost-sensitive learning to improve the performance of SVM. Their algorithm is based on a variant of the SMOTE algorithm by Chawla et al., combined with Veropoulos et al.'s different error costs algorithm.
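The different-error-costs idea assigns a larger penalty C+ to slack on minority (positive) examples than the C- used for the majority. With scikit-learn's SVC this can be expressed through per-class weights that multiply C (the 10:1 ratio is an illustrative assumption):

```python
from sklearn.svm import SVC

# C+ > C-: penalize slack on the (positive) minority class more
# heavily, pushing the separating hyperplane away from the minority.
svm = SVC(kernel="rbf", C=1.0, class_weight={1: 10.0, 0: 1.0})
# svm.fit(X_train, y_train) then trains with effective penalties of
# C*10 on positive-class errors and C*1 on negative-class errors.
```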
Tao Xiao-yan [23] presented a modified proximal support vector machine (MPSVM) which assigns different penalty coefficients to the positive and negative samples by adding a new diagonal matrix to the primal optimization problem, from which the decision function is then obtained. A real-coded immune clone algorithm (RICA) is employed to select the globally optimal parameters for high generalization performance.

M. Muntean and H. Vălean [24] provided the Enhancer, a viable algorithm for improving the SVM classification of unbalanced data sets. They improve cost-sensitive classification for support vector machines by multiplying the instances of the underrepresented classes in the training step.

Yuchun Tang and Nitesh Chawla [25] also implemented and rigorously evaluated four SVM modeling techniques, showing that SVM can be effective when different "rebalance" heuristics are incorporated into SVM modeling, including cost-sensitive learning and over- and under-sampling.

Genetic programming (GP) can evolve biased classifiers when data sets are unbalanced. Cost-sensitive learning uses cost adjustment within the learning algorithm to factor in the uneven distribution of class examples in the original (unmodified) unbalanced data set during the training process. In GP, cost adjustment can be enforced by adapting the fitness function: solutions with good classification accuracy on both classes are rewarded with better fitness, while those that are biased toward one class are penalized with poor fitness.
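A sketch of such a fitness function, assuming the simple formulation of averaging per-class accuracies (our own choice; published GP fitness functions vary):

```python
def balanced_fitness(y_true, y_pred):
    """Fitness for evolved classifiers: mean of per-class accuracies.
    A degenerate classifier that predicts only the majority class
    scores about 0.5, while a balanced one approaches 1.0."""
    classes = set(y_true)
    per_class = []
    for c in classes:
        idx = [i for i, t in enumerate(y_true) if t == c]
        per_class.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(per_class) / len(per_class)
```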
Common techniques include using fixed misclassification costs for minority and majority class examples [26], [27], or improved performance criteria such as the area under the receiver operating characteristic (ROC) curve (AUC) [28], in the fitness function. While these techniques have substantially improved minority class performance in evolved classifiers, they can incur both a tradeoff in majority class accuracy, and thus a loss in overall classification ability, and long training times due to the computational overhead of evaluating these improved fitness measures. In addition, these approaches can be problem specific, i.e., fitness functions are handcrafted for a particular problem domain only.

V. HYBRID ALGORITHMS
The EasyEnsemble classifier is an under-sampling algorithm which independently samples several subsets from the negative examples and builds one classifier for each subset; all generated classifiers are then combined for the final decision using AdaBoost. In imbalanced problems, some features are redundant or even irrelevant, and these features will hurt the generalization performance of learning machines. Feature selection, the process of choosing a subset of features from the original ones, is frequently used as a preprocessing technique in data analysis. It has been proved effective in reducing dimensionality, improving mining efficiency, increasing mining accuracy, and enhancing result comprehensibility. Ref. [29] combined feature selection with EasyEnsemble in order to improve accuracy.
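A sketch of the EasyEnsemble idea (using scikit-learn's AdaBoostClassifier as the per-subset learner and averaging predicted probabilities are our assumptions; the published algorithm combines the weak hypotheses of all subsets into a single ensemble):

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def easy_ensemble(X, y, n_subsets=10, seed=0):
    """EasyEnsemble sketch (0/1 labels, 1 = minority): draw several
    balanced subsets by randomly under-sampling the majority class,
    then train one AdaBoost classifier per subset."""
    rng = np.random.default_rng(seed)
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == 0)
    models = []
    for _ in range(n_subsets):
        sub = rng.choice(neg, size=len(pos), replace=False)  # balanced subset
        idx = np.concatenate([pos, sub])
        models.append(AdaBoostClassifier(random_state=seed).fit(X[idx], y[idx]))
    return models

def ensemble_predict(models, X):
    """Average the members' positive-class probabilities and threshold."""
    p = np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)
    return (p >= 0.5).astype(int)
```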
In Ref. [30] a hybrid algorithm based on random over-sampling, decision trees (DT), particle swarm optimization (PSO), and feature selection is proposed to classify unbalanced data. The proposed algorithm has the ability to select beneficial feature subsets, automatically adjust parameter values, and obtain the best classification accuracy. The zoo data set is used to test the performance; from simulation results, the classification accuracy of the proposed algorithm outperforms other existing methods.

Decision trees, supplemented with sampling techniques, have proven to be an effective way to address the imbalanced data problem. Despite their effectiveness, however, sampling methods add complexity and the need for parameter selection. To bypass these difficulties, a new decision tree technique called Hellinger Distance Decision Trees (HDDT), which uses Hellinger distance as the splitting criterion, is suggested in Ref. [17]. It takes advantage of the strong skew insensitivity of Hellinger distance and its advantages over popular alternatives such as entropy (gain ratio). For imbalanced data it is sufficient to use Hellinger trees with bagging, without any sampling methods.

VI. CONCLUSION
This paper provides an overview of the classification of imbalanced data sets. At the data level, sampling is the most common approach to dealing with imbalanced data. Over-sampling clearly appears better than under-sampling for local classifiers, whereas some under-sampling strategies outperform over-sampling when employing classifiers with global learning. Researchers have shown that hybrid sampling techniques can perform better than oversampling or undersampling alone. At the algorithmic level, solutions include adjusting the costs of the various classes so as to counter the class imbalance, adjusting the probabilistic estimate at the tree leaf (when working with decision trees), adjusting the decision threshold, and recognition-based (i.e., learning from one class) rather than discrimination-based (two-class) learning. Solutions based on modified support vector machines, rough set based minority-class-oriented rule learning methods, and cost-sensitive classifiers have also been proposed to deal with unbalanced data. There are of course many other worthwhile research possibilities not included here. Developing classifiers that are robust and skew-insensitive, or hybrid algorithms, can be a point of interest for future research on imbalanced data sets.

REFERENCES
[1] Miho Ohsaki, Peng Wang, Kenji Matsuda, Shigeru Katagiri, Hideyuki Watanabe, and Anca Ralescu, "Confusion-matrix-based Kernel Logistic Regression for Imbalanced Data Classification", IEEE Transactions on Knowledge and Data Engineering, 2017.
[2] Alberto Fernández, Sara del Río, Nitesh V. Chawla, Francisco Herrera, "An insight into imbalanced Big Data classification: outcomes and challenges", Springer Journal of Big Data, 2017.
[3] Vaibhav P. Vasani, Rajendra D. Gawali, "Classification and performance evaluation using data mining algorithms", International Journal of Innovative Research in Science, Engineering and Technology, 2014.
[4] Kaile Su, Huijing Huang, Xindong Wu, Shichao Zhang, "Rough Sets for Feature Selection and Classification: An Overview with Applications", International Journal of Recent Technology and Engineering (IJRTE), Vol. 3, Issue 5, November 2014.
[5] Senzhang Wang, Zhoujun Li, Wenhan Chao and Qinghua Cao, "Applying Adaptive Over-sampling Technique Based on Data Density and Cost-Sensitive SVM to Imbalanced Learning", IEEE World Congress on Computational Intelligence, June 2012.
[6] Mikel Galar, Alberto Fernandez, Edurne Barrenechea, Humberto Bustince and Francisco Herrera, "A Review on Ensembles for the Class Imbalance Problem: Bagging, Boosting, and Hybrid-Based Approaches", IEEE Transactions on Systems, Man and Cybernetics, Part C: Applications and Reviews, Vol. 42, No. 4, July 2012.
[7] Nada M. A. Al Salami, "Mining High Speed Data Streams", UbiCC Journal, 2011.
[8] Dian Palupi Rini, Siti Mariyam Shamsuddin and Siti Sophiyati, "Particle Swarm Optimization: Technique, System and Challenges", International Journal of Computer Applications, Vol. 14, No. 1, January 2011.
[9] Amit Saxena, Leeladhar Kumar Gavel, Madan Madhaw Shrivas, "Online Streaming Feature Selection", 27th International Conference on Machine Learning, 2010.
[10] Yuchun Tang, Yan-Qing Zhang, Nitesh V. Chawla and Sven Krasser, "SVMs Modeling for Highly Imbalanced Classification", IEEE Transactions on Systems, Man and Cybernetics, Vol. 39, No. 1, Feb 2009.
[11] Haibo He and Edwardo A. Garcia, "Learning from Imbalanced Data", IEEE Transactions on Knowledge and Data Engineering, September 2009.
[12] Thair Nu Phyu, "Survey of Classification Techniques in Data Mining", International Multi Conference of Engineers and Computer Scientists, IMECS 2009, March 2009.
[13] Haibo He, Yang Bai, Edwardo A. Garcia and Shutao Li, "ADASYN: Adaptive Synthetic Sampling Approach for Imbalanced Learning", IEEE Transaction of Data Mining, 2009.
[14] Swagatam Das, Ajith Abraham and Amit Konar, "Particle Swarm Optimization and Differential Evolution Algorithms: Technical Analysis, Applications and Hybridization Perspectives", Springer Journal on Knowledge Engineering, 2008.
[15] "A logical framework for identifying quality knowledge from different data sources", International Conference on Decision Support Systems, 2006.
[16] "Database classification for multi-database mining", International Conference on Decision Support Systems, 2005.
[17] Volker Roth, "Probabilistic Discriminative Kernel Classifiers for Multi-class Problems", Springer-Verlag journal, 2001.
[18] R. Chen, K. Sivakumar and H. Kargupta, "Collective Mining of Bayesian Networks from Distributed Heterogeneous Data", Kluwer Academic Publishers, 2001.
[19] Shigeru Katagiri, Biing-Hwang Juang and Chin-Hui Lee, "Pattern Recognition Using a Family of Design Algorithms Based Upon the Generalized Probabilistic Descent Method", IEEE Journal of Data Mining, 1998.
[20] I. Katakis, G. Tsoumakas, and I. Vlahavas, "Tracking recurring contexts using ensemble classifiers: an application to email filtering", Knowledge and Information Systems, pp. 371–391, 2010.
[21] J. Kolter and M. Maloof, "Using additive expert ensembles to cope with concept drift", Proc. ICML, pp. 449–456, 2005.
[22] D. D. Lewis, Y. Yang, T. Rose, and F. Li, "RCV1: A new benchmark collection for text categorization research", Journal of Machine Learning Research, pp. 361–397, 2004.
[23] X. Li, P. S. Yu, B. Liu, and S.-K. Ng, "Positive unlabeled learning for data stream classification", Proc. SDM, pp. 257–268, 2009.
[24] M. M. Masud, Q. Chen, J. Gao, L. Khan, J. Han, and B. M. Thuraisingham, "Classification and novel class detection of data streams in a dynamic feature space", Proc. ECML PKDD, Vol. II, pp. 337–352, 2010.
[25] P. Zhang, X. Zhu, J. Tan, and L. Guo, "Classifier and Cluster Ensembles for Mining Concept Drifting Data Streams", Proc. 10th Int'l Conf. Data Mining, 2010.
[26] X. Zhu, P. Zhang, X. Lin, and Y. Shi, "Active Learning from Stream Data Using Optimal Weight Classifier Ensemble", IEEE Trans. Systems, Man, Cybernetics Part B, Vol. 40, No. 6, pp. 1607–1621, Dec. 2010.
[27] Q. Zhang, J. Liu, and W. Wang, "Incremental Subspace Clustering over Multiple Data Streams", Proc. Seventh Int'l Conf. Data Mining, 2007.
[28] Q. Zhang, J. Liu, and W. Wang, "Approximate Clustering on Distributed Data Streams", Proc. 24th Int'l Conf. Data Eng., 2008.
[29] C. C. Aggarwal, "On classification and segmentation of massive audio data streams", Knowledge and Information Systems, pp. 137–156, July 2009.
[30] C. C. Aggarwal, J. Han, J. Wang, and P. S. Yu, "A framework for on-demand classification of evolving data streams", IEEE Trans. Knowledge and Data Engineering, pp. 577–589, 2006.
[31] A. Bifet, G. Holmes, B. Pfahringer, R. Kirkby, and R. Gavald, "New ensemble methods for evolving data streams", Proc. SIGKDD, pp. 139–148, 2009.
[32] S. Chen, H. Wang, S. Zhou, and P. Yu, "Stop chasing trends: Discovering high order models in evolving data", Proc. ICDE, pp. 923–932, 2008.
[33] P. Zhang, X. Zhu, and L. Guo, "Mining data streams with labeled and unlabeled training examples", Proc. ICDM, pp. 627–636, 2009.
[34] O. R. Terrades, E. Valveny, and S. Tabbone, "Optimal classifier fusion in a non-Bayesian probabilistic framework", IEEE Trans. Pattern Anal. Mach. Intell., Vol. 31, No. 9, pp. 1630–1644, Sep. 2009.