
Chi2: Feature Selection and Discretization of Numeric Attributes

Huan Liu and Rudy Setiono


Department of Information Systems and Computer Science
National University of Singapore
Kent Ridge, Singapore 0511
{liuh,rudys}@iscs.nus.sg

Abstract

Discretization can turn numeric attributes into discrete ones. Feature selection can eliminate some irrelevant attributes. This paper describes Chi2, a simple and general algorithm that uses the χ² statistic to discretize numeric attributes repeatedly until some inconsistencies are found in the data, and achieves feature selection via discretization. The empirical results demonstrate that Chi2 is effective in feature selection and discretization of numeric and ordinal attributes.

1 Introduction

Feature selection is the task of selecting the minimum number of attributes needed to represent the data accurately. By using relevant features, classification algorithms can in general improve their predictive accuracy, shorten the learning period, and result in simpler concepts. There are abundant feature selection algorithms [5]. Our work adopts an approach that selects a subset of the original attributes, since it not only has the above virtues but also serves as an indicator of what kind of data (along those selected features) should be collected. Feature selection algorithms can be further divided based on the data types they operate on. The two basic types of data are nominal (e.g., attribute color may have values of red, green, yellow) and ordinal (e.g., attribute winning-position can have values of 1, 2, and 3, or attribute salary can have 22345.00, 46543.89, etc. as its values). Many feature selection algorithms [1, 3, 5] are shown to work effectively on discrete data or, even more strictly, on binary data (and/or a binary class value). In order to deal with numeric attributes, a common practice for those algorithms is to discretize the data before conducting feature selection. This paper provides a way to select features directly from numeric attributes while discretizing them. Numeric data are very common in real-world problems. However, many classification algorithms require that the training data contain only discrete attributes, and some would work better on discretized or binarized data [2, 4]. If those numeric data can be automatically transformed into discrete ones, these classification algorithms would be readily at our disposal. Chi2 is our effort towards this goal: discretize the numeric attributes as well as select features among them.

The problem this work tackles is as follows: there are data sets with numeric attributes, some of which are irrelevant, and the range of each numeric attribute can be very wide; find an algorithm that can automatically discretize the numeric attributes as well as remove the irrelevant ones.

This work stems from Kerber's ChiMerge [4], which is designed to discretize numeric attributes based on the χ² statistic. ChiMerge consists of an initialization step and a bottom-up merging process, where intervals are continuously merged until a termination condition, determined by a significance level α (set manually), is met. It is an improvement over the most obvious simple methods such as equal-width or equal-frequency intervals. Instead of defining a width or frequency threshold (which is not easy without scrutinizing each attribute and knowing what it is), ChiMerge requires α to be specified. Nevertheless, too big or too small an α will over- or under-discretize an attribute. An extreme example of under-discretization is the continuous attribute itself. Over-discretization introduces many inconsistencies¹ that did not exist before, thus changing the characteristics of the data. In short, it is not easy to find a proper α for ChiMerge. It is therefore ideal to let the data determine what value α should take. This leads to Phase 1 of Chi2. Naturally, if the discretization continues without generating more inconsistencies than in the original data, it is possible that some attributes will be discretized into one interval only. Hence, they can be removed.

2 Chi2 Algorithm

The Chi2 algorithm (summarized below) is based on the χ² statistic and consists of two phases. In the first phase, it begins with a high significance level (sigLevel), e.g., 0.5, for all numeric attributes. Each attribute is sorted according to its values. Then the following is performed: 1. calculate the χ² value as in equation (1) for every pair of adjacent intervals (at the beginning, each pattern is put into its own interval that contains only one value of an attribute); 2. merge the pair of adjacent intervals with the lowest χ² value. Merging continues until all pairs of intervals have χ² values exceeding the threshold determined by sigLevel (initially 0.5, whose corresponding χ² value is 0.455 if the degree of freedom is 1; more on this below).

¹ By inconsistency we mean that two patterns are the same but are classified into different categories.

The above process is repeated with a decreased sigLevel until an inconsistency rate δ is exceeded in the discretized data. Phase 1 is, as a matter of fact, a generalized version of Kerber's ChiMerge [4]. Instead of specifying a χ² threshold, Chi2 wraps ChiMerge in a loop that automatically increments the χ² threshold (decrementing sigLevel). A consistency check is also introduced as a stopping criterion in order to guarantee that the discretized data set accurately represents the original one. With these two new features, Chi2 automatically determines a proper χ² threshold that keeps the fidelity of the original data.

The formula for computing the χ² value is

    \chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{k} \frac{(A_{ij} - E_{ij})^2}{E_{ij}}    (1)

where:
k = number of classes,
A_ij = number of patterns in the i-th interval, j-th class,
R_i = number of patterns in the i-th interval = Σ_{j=1}^{k} A_ij,
C_j = number of patterns in the j-th class = Σ_{i=1}^{2} A_ij,
N = total number of patterns = Σ_{i=1}^{2} R_i,
E_ij = expected frequency of A_ij = R_i · C_j / N.

If either R_i or C_j is 0, E_ij is set to 0.1. The degree of freedom of the χ² statistic is one less than the number of classes.
Phase 2 is a finer process of Phase 1. Starting with sigLevel0 determined in Phase 1, each attribute i is associated with a sigLevel[i] and takes turns for merging. Consistency checking is conducted after each attribute's merging. If the inconsistency rate is not exceeded, sigLevel[i] is decremented for attribute i's next round of merging; otherwise attribute i will not be involved in further merging. This process continues until no attribute's values can be merged. At the end of Phase 2, if an attribute is merged to only one value, it simply means that this attribute is not relevant in representing the original data set. As a result, when discretization ends, feature selection is accomplished.
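Both phases stop on the consistency check. The paper defines only what an inconsistency is (footnote 1); the sketch below uses one common counting convention (everything beyond the majority class within each group of identical patterns) and is our assumption, not the paper's code.

    from collections import Counter, defaultdict

    def inconsistency_rate(patterns, labels):
        """Fraction of patterns that are inconsistent: identical (discretized)
        patterns whose class labels disagree. Each group of duplicates
        contributes its size minus its majority-class count (assumed convention)."""
        groups = defaultdict(list)
        for row, y in zip(patterns, labels):
            groups[tuple(row)].append(y)
        bad = sum(len(ys) - max(Counter(ys).values()) for ys in groups.values())
        return bad / len(labels)

    # two identical patterns with different classes -> 1 inconsistency out of 3
    print(inconsistency_rate([(1, 0), (1, 0), (2, 1)], ["a", "b", "a"]))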
Chi2 Algorithm:

Phase 1:
    set sigLevel = 0.5;
    do while (InConsistency(data) < δ) {
        for each numeric attribute {
            Sort(attribute, data);
            chi-sq-initialization(attribute, data);
            do {
                chi-sq-calculation(attribute, data)
            } while (Merge(data))
        }
        sigLevel0 = sigLevel;
        sigLevel = decreSigLevel(sigLevel);
    }

Phase 2:
    set all sigLvl[i] = sigLevel0 for attribute i;
    do until no-attribute-can-be-merged {
        for each attribute i that can be merged {
            Sort(attribute, data);
            chi-sq-initialization(attribute, data);
            do {
                chi-sq-calculation(attribute, data)
            } while (Merge(data))
            if (InConsistency(data) < δ)
                sigLvl[i] = decreSigLevel(sigLvl[i]);
            else
                attribute i cannot be merged;
        }
    }
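A minimal Python sketch of the inner merging loop for one attribute, under our own assumptions: class counts are kept per interval as NumPy arrays, and scipy.stats.chi2.ppf converts sigLevel into the χ² threshold (e.g. sigLevel = 0.5 at one degree of freedom gives the 0.455 quoted above). None of these names come from the paper.

    import numpy as np
    from scipy.stats import chi2 as chi2_dist

    def chi_square_pair(a, b):
        """Equation (1) for the class-count vectors of two adjacent intervals."""
        A = np.array([a, b], dtype=float)
        E = np.outer(A.sum(axis=1), A.sum(axis=0)) / A.sum()
        E[E == 0] = 0.1
        return float(((A - E) ** 2 / E).sum())

    def merge_attribute(intervals, sig_level, n_classes):
        """Bottom-up merging for one attribute. `intervals` is a list of
        (lower_bound, class_count_vector); repeatedly merge the adjacent pair
        with the lowest chi-square until every pair exceeds the threshold."""
        threshold = chi2_dist.ppf(1.0 - sig_level, df=n_classes - 1)
        while len(intervals) > 1:
            chis = [chi_square_pair(intervals[i][1], intervals[i + 1][1])
                    for i in range(len(intervals) - 1)]
            i = int(np.argmin(chis))
            if chis[i] >= threshold:      # all remaining pairs exceed the threshold
                break
            low, counts = intervals[i]
            intervals[i] = (low, counts + intervals[i + 1][1])
            del intervals[i + 1]
        return intervals

In Phase 1 such a routine would be called for every numeric attribute, followed by the consistency check, before sigLevel is lowered, as in the pseudocode above.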
3 Experiments

Two sets of experiments are conducted. In the first set, we want to establish that 1. Chi2 helps improve predictive accuracy; and 2. Chi2 properly and effectively discretizes data as well as eliminates some irrelevant attributes. C4.5 [8] (an extension of ID3 [7]) is used to verify the effectiveness of Chi2. The reasons for our choice are 1. C4.5 (or ID3) works well for many problems and is well known, thus requiring no further description; and 2. C4.5 selects relevant features by itself in tree branching, so it can be used as a benchmark, as in [5, 9, 1], to verify the effects of Chi2. In the second set of experiments, we take a closer look at Chi2's ability of discretization and feature selection by introducing a synthetic data set and adding noise attributes to an existing data set. Through these more controlled data sets, we can better understand how effective Chi2 is.

3.1 Real data

Three data sets used in the experiments are Iris, Wisconsin Breast Cancer and Heart Disease². They have different types of attributes. The Iris data are of continuous attributes, the breast cancer data are of ordinal discrete ones, and the heart disease data have mixed attributes (numeric and discrete).

² They are all obtained from the University of California-Irvine machine learning repository via anonymous ftp to ics.uci.edu.

3.2 Controlled data

Two extra data sets are designed to test whether noise attributes can be removed. One is synthetic; the other is the Iris data with noise attributes added.

The synthetic data consist of 600 items and are described by four attributes, among which only one attribute determines each item's class label. The values v1 of attribute A1 are generated from a uniform distribution between the lower bound (L = 0) and the upper bound (U = 75); each item's class label is determined as follows: v1 < 25 → class 1, v1 < 50 → class 2, v1 < 75 → class 3. Then we add noise attributes A2, A3 and A4. The values of A2 are generated from a normal distribution with μ = U/2 (i.e. 37.5) and σ = μ/3. The values of A3 are generated from two normal distributions with μ = U/3 (i.e. 25) and μ = 2U/3 (i.e. 50), and σ = μ/3 respectively, 300 items from each distribution. The values of A4 are generated from a uniform distribution.
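A small NumPy sketch (ours, not the paper's) of how such a synthetic set could be generated; the random seed and the range used for the uniform noise attribute A4 are assumptions, since the paper does not state them.

    import numpy as np

    rng = np.random.default_rng(0)                 # arbitrary seed
    n, L, U = 600, 0.0, 75.0

    a1 = rng.uniform(L, U, n)                      # relevant attribute A1
    labels = np.where(a1 < 25, 1, np.where(a1 < 50, 2, 3))   # class depends on A1 only

    a2 = rng.normal(U / 2, (U / 2) / 3, n)         # noise: mu = 37.5, sigma = mu/3
    a3 = np.concatenate([rng.normal(U / 3, (U / 3) / 3, n // 2),           # 300 items
                         rng.normal(2 * U / 3, (2 * U / 3) / 3, n // 2)])  # 300 items
    rng.shuffle(a3)                                # pairing with items is irrelevant noise
    a4 = rng.uniform(L, U, n)                      # noise: uniform (range assumed)

    data = np.column_stack([a1, a2, a3, a4, labels])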
The second data set is a modified version of the Iris data. Four noise attributes A5, A6, A7 and A8 are added to the Iris training data, corresponding to the four original attributes. The values of each noise attribute are determined by a normal distribution with μ = ave and σ = (max − min)/6, where ave is the average value, and max and min are the maximum and minimum values, of the corresponding original attribute. The choice of σ approximates μ/3 if the corresponding original attribute has a uniform distribution. There are now eight attributes in total. The number of patterns used is 75.

Figure 1: Number of attributes: original vs. those after Chi2 processing (data sets: Iris, Heart, Breast).

Table 1: The initial intervals, class frequencies, and χ² values for sepal-length. (Values not reproduced.)

Table 2: The intervals, class frequencies, and χ² values for attribute sepal-length after Phase 1 and Phase 2. The χ² thresholds are (a) 3.22 and (b) 50.6. (Values not reproduced.)
3.3 Example

In this section, some steps of Chi2 processing for the Iris data are shown to demonstrate the behavior of Chi2. Table 1 shows the intervals, class frequencies, and χ² values of sepal-length after the initialization in Phase 1. The results for sepal-length after Phase 1 and Phase 2 are shown in Table 2. An inconsistency rate δ = 5% is allowed in the experiment, which means up to 3 (75 × 0.05) inconsistencies are acceptable. Phase 1 stops at sigLevel = 0.2, χ² = 3.22; that means the next sigLevel (0.1) would introduce more inconsistencies. When Phase 2 terminates, the values of both sepal-length and sepal-width are merged into one value, so these attributes can be removed, and attributes petal-length and petal-width are discretized into four discrete values each. With the χ² threshold 3.22, for example, six discrete values are needed for attribute sepal-length: < 4.4 → 0, < 4.9 → 1, ..., < 6.1 → 4, and ≥ 6.1 → 5. The last one reads: if a numeric value is greater than or equal to 6.1, it is quantized to 5.
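Such a mapping amounts to a sorted list of cut points. A small lookup sketch (ours, not the paper's) follows; only the cut points 4.4, 4.9 and 6.1 are quoted in the text above, so the two middle ones below are hypothetical placeholders.

    import bisect

    cuts = [4.4, 4.9, 5.0, 5.5, 6.1]   # 5.0 and 5.5 are hypothetical middle cut points

    def discretize(value, cuts):
        """Map a numeric value to its interval index: values below cuts[0]
        map to 0, values >= cuts[-1] map to len(cuts)."""
        return bisect.bisect_right(cuts, value)

    print(discretize(4.3, cuts))   # -> 0
    print(discretize(6.1, cuts))   # -> 5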
3.4 Empirical results on real data

First we show that after discretization, the number of attributes decreases for the three data sets (Figure 1). For the Iris data, the number of attributes is reduced from 4 to 2 (petal-length and petal-width), each with four values. For the breast cancer data, 3 attributes are removed from the original 9; the remaining 6 attributes have 3, 4, 4, 5, 3, and 3 discrete values respectively. For the heart disease data, the discrete attributes are left out of discretization and feature selection, although they are used for consistency checking. Among the 5 continuous attributes (1, 4, 5, 8 and 10), only 2 attributes (5 and 8) should remain as suggested by Chi2, having 8 and 4 discrete values respectively. For the cancer and disease data sets, the default inconsistency rate is used, i.e., 0.

Second, we run C4.5 on both the original data sets and the dimensionally reduced ones. C4.5 is run using its default settings. Chi2 discretizes the training data and generates a mapping table, based on which the testing data are discretized.

Shown in Figure 2 are the predictive accuracies and tree sizes of C4.5 for the three data sets. Predictive accuracy improves and tree size drops (by half) for the breast cancer and heart disease data. As for the Iris data, accuracy and tree size remain the same using only two attributes (with 4 values each); in a way, this shows that C4.5 works quite well without Chi2 for this data set.

Figure 2: (a) Predictive accuracy and (b) size of decision trees of C4.5 for the three data sets after and before the Chi2 processing.

3.5 Empirical results on controlled data

The purpose of experimenting on the controlled data is to verify how effective Chi2 is in removing irrelevant attributes through discretizing numeric attributes. Therefore, it is only necessary to see whether Chi2 can 1. discretize the relevant attribute(s) properly and 2. remove the irrelevant attributes.

Chi2 merged A1 into three discrete values (1, 2 and 3) corresponding to the three classes (1, 2 and 3), and merged the other three attributes A2, A3 and A4 into one value each. That means that only A1 should remain, and the noise (irrelevant) attributes should be removed.

For the modified Iris data, Chi2 merged six attributes out of eight: attributes 0, 1, 4, 5, 6 and 7. The first two are sepal-length and sepal-width; the last four are the added noise (irrelevant) attributes. The remaining two attributes have been merged into 4 discrete values each, as in the real-data experiment.

Through this set of controlled experiments, it is shown that Chi2 effectively discretizes numeric attributes and removes irrelevant attributes.

4 Discussions

ChiMerge requires a user to specify a proper significance level (α), which is used for merging values of all the attributes. No definite rule is given to choose this α. In other words, it is still a matter of trial and error, and clearly it is not easy to find a proper significance level for each problem. Phase 1 of Chi2 extends ChiMerge to an automated one: α is automatically varied until further merging is stopped by the stopping criterion (the inconsistency rate). What makes Chi2 special is its capability of feature selection, a big step forward from discretization. In Phase 2 of Chi2, each attribute has its own significance level for merging in a round-robin fashion. Merging stops when the inconsistency rate exceeds a specified rate δ. This phase of Chi2 accomplishes feature selection. Another feature of Chi2 is that it can be applied to data with mixed attributes (e.g., the Heart Disease data). In addition, Chi2 can work with multi-class data; this is an advantage over some statistic-based feature selection algorithms such as Relief [5], which is applicable only to two-class data.

Other issues such as selecting δ, the limitations of Chi2, and its computational complexity can be found in [6].

5 Conclusion

Chi2 is a simple and general algorithm that can automatically select a proper χ² value, determine the intervals of a numeric attribute, and select features according to the characteristics of the data. It guarantees that the fidelity of the training data remains after Chi2 is applied. The empirical results on both the real data and the controlled data have shown that Chi2 is a useful and reliable tool for discretization and feature selection of numeric attributes.

References

[1] H. Almuallim and T.G. Dietterich. Learning boolean concepts in the presence of many irrelevant features. Artificial Intelligence, 69(1-2):279-305, November 1994.

[2] J. Catlett. On changing continuous attributes into ordered discrete attributes. In European Working Session on Learning, 1991.

[3] U.M. Fayyad and K.B. Irani. The attribute selection problem in decision tree generation. In AAAI-92, Proceedings Ninth National Conference on Artificial Intelligence, pages 104-110. AAAI Press/The MIT Press, 1992.

[4] R. Kerber. ChiMerge: Discretization of numeric attributes. In AAAI-92, Proceedings Ninth National Conference on Artificial Intelligence, pages 123-128. AAAI Press/The MIT Press, 1992.

[5] K. Kira and L.A. Rendell. The feature selection problem: Traditional methods and a new algorithm. In AAAI-92, Proceedings Ninth National Conference on Artificial Intelligence, pages 129-134. AAAI Press/The MIT Press, 1992.

[6] H. Liu and R. Setiono. Discretization of ordinal attributes and feature selection. Technical Report TRB4/95, Department of Information Systems and Computer Science, National University of Singapore, April 1995. http://www.iscs.nus.sg/~liuh/chi2.ps.

[7] J.R. Quinlan. Induction of decision trees. Machine Learning, 1(1):81-106, 1986.

[8] J.R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.

[9] H. Ragavan and L. Rendell. Lookahead feature construction for learning hard concepts. In Machine Learning: Proceedings of the Seventh International Conference, pages 252-259. Morgan Kaufmann, San Mateo, California, 1993.