Feature Selection based on F-score for Enhancing CTG Data Classification
Abstract—The large number of features complicates the manual interpretation of Cardiotocography (CTG) data. Feature selection methods are therefore useful for selecting the relevant features and reducing the complexity of the interpretation. Reducing the complexity also shortens computation time and improves the accuracy of the classification and prediction results. This study proposes a statistical approach that uses a feature selection method based on the F-score. The method aims to handle imbalanced data with multi-class output. In this method, the features are assessed individually and rated by their F-score. The features with an F-score above the average are chosen as the relevant features. We use a Support Vector Machine (SVM) as the classifier to evaluate the F-score method. The experiment also employs other datasets to test the compatibility of the F-score method. Scalability and stability testing are conducted to evaluate the performance of the F-score method. The experiment results show that the F-score method can be implemented successfully. For the CTG dataset, the accuracy of the classifier improves from 94.35% using 21 features to 99.91% using eight relevant features. This improvement is also found in the experiment results for all of the other datasets.

Keywords— feature selection; statistical approach; CTG

In the wrapper category, the quality of the feature selection methods depends on the classifier. This dependence affects the search space size and the computing time, especially for data with many features. Meanwhile, in the filter category, the performance of the feature selection methods is not affected by the classifier, which results in faster computing time. Filter methods consider the characteristics of the data, regardless of the evaluation criteria, to select the features [9]. The last category, hybrid methods, is a combination of the filter and wrapper categories. These methods interact with the classifier but do not evaluate feature sets iteratively [8]. Several factors can be used to choose a feature selection method, such as simplicity, stability, number of reduced features, computational requirement, and classification accuracy [7].

The statistical approach to feature selection works by considering the characteristics of the data. This approach is used for flat data in which there is a relation between features. Previous research on CTG has generally used evolutionary algorithms as the feature selection method [2][3][6][10][11]. Those algorithms do not include a statistical approach as part of their fitness functions.
978-1-7281-0867-4/19/$31.00 ©2019 IEEE
Authorized licensed use limited to: Universitas Indonesia. Downloaded on February 28,2023 at 03:55:00 UTC from IEEE Xplore. Restrictions apply.
II. PROPOSED METHOD
In the feature selection process, the features are ranked by F-score, and the relevant features are those that describe the CTG data clearly. The advantages of the F-score include its simplicity, its ability to handle high-dimensional data, and its suitability for continuous signals. Using the relevant features is expected to eliminate bias in the interpretation of CTG signals and to enhance the accuracy of the classification. Figure 1 shows the general steps of the proposed method, which uses the F-score as the feature selection method and SVM as the classifier. The detailed process of the F-score feature selection method is illustrated by the red square in Figure 2.

[Figure: Dataset → Feature Selection Method → Selected Features → Classifier: SVM → F-score Performance Testing: Stability, Scalability, Accuracy]
Fig. 1. The general steps of the proposed method

Figure 2 shows that the proposed method takes both the instances and the features of the dataset as its initial step. The dataset was obtained from the UCI Machine Learning Repository [13][14]. The next step is the computational process, which is the F-score computation for each feature. The proposed method also computes the mean F-score over all the features, which is set as the threshold for choosing the relevant features. The selected (relevant) features that satisfy the condition are used as the input for the classifier, while the unselected features are ignored. From the classifier output, the accuracy is calculated using the confusion matrix.

A. Dataset
The CTG dataset used in this paper was obtained from the UCI Machine Learning Repository [13][14]. The UCI Machine Learning Repository is a website that provides a collection of databases that the public can access for free. The CTG dataset consists of 2126 samples. It is divided into three output classes: Normal, Suspect, and Pathological. There are 1656 samples in the Normal class, 176 samples in the Pathological class, and 295 samples in the Suspect class. The CTG dataset has 21 features. The features describe acceleration, deceleration, variability, and quantities derived from them. The derived features include the mean value of long-term variability, the number of prolonged decelerations per second, the mean value of short-term variability, and the percentage of time with abnormal long-term variability.

Acceleration refers to the CTG signal above the FHR baseline. It indicates fetal alertness, such as fetal distress. It commonly occurs in early labor and is associated with fetal movement or uterine contractions [1]. Deceleration is the opposite of acceleration. It refers to the CTG signal below the FHR baseline and indicates a fetal disorder. Based on its shape and its association with contractions, deceleration is divided into early, late, variable, and prolonged deceleration [1]. Variability is the most important feature for determining the current fetal condition. Short-term variability (STV) and long-term variability (LTV) are the commonly used variability features. STV refers to beat-to-beat variability, which describes the differences between successive beats. LTV describes FHR changes with 3 to 5 cycles per minute in response to uterine contractions or fetal movement [1].
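The selection rule described in this section (score each feature, use the mean F-score as the threshold, and pass only the features above it to the classifier) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `select_above_mean` and the feature names and scores in `toy_scores` are hypothetical values chosen for the example.

```python
def select_above_mean(f_scores):
    """Return (selected feature names, threshold).

    f_scores maps a feature name to its F-score. The threshold is the
    mean of all scores, and a feature is kept only if its score is
    strictly greater than that mean. Kept features are returned in
    descending order of score, mirroring the ranking step.
    """
    threshold = sum(f_scores.values()) / len(f_scores)
    kept = [name for name, score in f_scores.items() if score > threshold]
    kept.sort(key=lambda name: f_scores[name], reverse=True)
    return kept, threshold


# Hypothetical scores, for illustration only (not values from the paper).
toy_scores = {
    "feature_a": 12.0,
    "feature_b": 3.0,
    "feature_c": 0.5,
    "feature_d": 8.5,
}

selected, threshold = select_above_mean(toy_scores)
print(selected, threshold)
```

With these toy scores the mean is 6.0, so only `feature_a` and `feature_d` would be passed on to the classifier.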
B. F-Score
F-score is a feature selection method based on a statistical approach. It ranks the features by assessing each feature individually [8][12]. A higher F-score indicates a more relevant feature. The F-score is applied here because the CTG data are continuous and imbalanced. A ranking method is used to choose the subset of features. Equation (1) is the F-score formula used in this study
$$F\text{-score}(i) = \frac{\sum \cdots}{\sum \cdots} \tag{1}$$
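As an illustration of how a per-feature score of this kind can be computed, the sketch below assumes the multi-class F-score in the form given in [12]: the sum over classes of the squared difference between the class mean and the overall mean, divided by the sum over classes of the within-class sample variance. Whether Equation (1) uses exactly this normalization is an assumption, so the scale of the output may differ from the values reported in this paper. The function name and the toy data are hypothetical.

```python
from statistics import mean, variance


def multiclass_f_score(values, labels):
    """Multi-class F-score of a single feature (form assumed from [12]).

    values: the feature's value for every instance.
    labels: the class label of every instance.
    Each class needs at least two instances (sample variance), and the
    feature must not be constant within every class, otherwise the
    denominator is zero.
    """
    overall_mean = mean(values)

    # Group the feature values by class label.
    by_class = {}
    for value, label in zip(values, labels):
        by_class.setdefault(label, []).append(value)

    # Numerator: spread of the class means around the overall mean.
    between = sum((mean(v) - overall_mean) ** 2 for v in by_class.values())
    # Denominator: sum of within-class sample variances (n_j - 1 divisor).
    within = sum(variance(v) for v in by_class.values())
    return between / within


# Toy data, for illustration only: three classes, seven instances.
labels = ["N", "N", "N", "S", "S", "P", "P"]
informative = [1.0, 2.0, 3.0, 5.0, 7.0, 10.0, 12.0]  # class means differ
uninformative = [1.0, 5.0, 9.0, 2.0, 8.0, 3.0, 7.0]  # class means coincide

print(multiclass_f_score(informative, labels))
print(multiclass_f_score(uninformative, labels))
```

The first toy feature separates the class means and gets a high score, while the second has identical class means and scores zero, which is the behavior the ranking relies on.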
[Figure 2: dataset matrix with columns Feature 1, Feature 2, ..., Feature n and rows Instance 1, ...]

C. Support Vector Machine (SVM)
SVM is generally used for binary classification and multi-class classification. SVM works by dividing the classes using a surface called a hyperplane. SVM positions the hyperplane by maximizing the margin between the classes. Equation (2) gives the formula for the hyperplane

$$w \cdot x + b = 0 \tag{2}$$

where w is a normal vector to the hyperplane, b is a scalar, and x is the input. Lagrange multipliers are used to maximize the margin. Equation (3) is the formulation for the hyperplane optimization

$$L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{l} \alpha_i \left( y_i (w \cdot x_i + b) - 1 \right) \tag{3}$$

Here i = 1, 2, ..., l and α_i is the i-th Lagrange multiplier. The CTG data is assumed not to be fully linearly separable. The RBF kernel is used in place of the dot product between input vectors when maximizing the margin [2][10]. The RBF kernel also keeps the computational cost low. The formulation of the RBF is expressed by Equation (4). The values of γ and σ² are set to 90 and 0.4, respectively [10]

$$K(x_i, x_j) = \exp\left( -\frac{\|x_i - x_j\|^2}{2\sigma^2} \right) \tag{4}$$

TABLE I. F-SCORE FROM EIGHT SELECTED FEATURES
Feature Name | F-score
Number of acceleration per second | 196.03

The selected features satisfy the selection condition: their F-score is greater than the mean F-score, which equals 147.537743. Table I shows the F-score values of the eight selected features. The selected features in Table I show that the number of accelerations, the duration of the decelerations, and the variability are selected as the relevant factors for the CTG dataset. This result is in line with [1].

B. SVM Performance with Selected Features
The automated classification of fetal status based on CTG is assessed through its performance. This research uses the SVM as a classifier to test the performance of the selected features in predicting fetal status. The RBF kernel is used with the SVM because the CTG dataset is not fully linearly separable. The experiment results show that using the selected features to classify the fetal status improves the accuracy.

The accuracy measures the overall efficiency of a classifier [12]. In this research, the accuracy is calculated from the confusion matrix. Each element of the confusion matrix relates the expert annotation to the classification result. It can be written as

$$C = [c_{i,j}] \tag{5}$$
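The accuracy computation from a confusion matrix can be sketched as below. The sketch assumes the usual convention that rows are the expert annotation, columns are the classifier output, and accuracy is the sum of the diagonal divided by the sum of all entries. The function name and the counts in `toy_matrix` are hypothetical and are not results from this study.

```python
def accuracy_from_confusion(matrix):
    """Overall accuracy of a square confusion matrix.

    matrix[i][j] is the number of instances annotated as class i
    and classified as class j. Correct predictions lie on the diagonal.
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# Hypothetical counts for three classes (Normal, Suspect, Pathological).
toy_matrix = [
    [50, 2, 0],
    [3, 20, 1],
    [0, 1, 23],
]

accuracy = accuracy_from_confusion(toy_matrix)
print(f"accuracy = {accuracy:.2%}")
```

In this toy case 93 of the 100 instances sit on the diagonal. On imbalanced data such as CTG, the per-class rows of the matrix are also worth inspecting, since a high overall accuracy can hide errors in the smaller classes.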
C. Stability and Scalability Testing
Stability and scalability are two of the challenges related to feature selection methods [8]. A good feature selection method has to improve the performance of the classifier even when it is used on different types of datasets with various amounts of data. Stability testing examines the ability of the F-score method when it is applied to different datasets. Scalability testing examines the ability of the F-score method to handle various amounts of data. Table II shows the classification performance using the full features and using the features chosen by the F-score feature selection method for all the benchmark datasets.
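No. | Dataset | Full features | Accuracy (%) | Selected features | Accuracy (%)
2 | Liver Disorder | 6 | 97.98 | 2 | 99.71
3 | Breast Cancer | 10 | 98.14 | 3 | 100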
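The effect of the selection can be summarized with simple arithmetic on the reported figures. The sketch below uses the CTG numbers stated in the abstract and the two benchmark rows that are legible in Table II. It reads each row as (full feature count, accuracy with full features, selected feature count, accuracy with selected features), which is an assumption about the column order, and the function name `summarize` is hypothetical.

```python
# (full features, accuracy %, selected features, accuracy %)
# CTG values are from the abstract; the other rows are from Table II,
# read under the assumed column order described above.
reported = {
    "CTG": (21, 94.35, 8, 99.91),
    "Liver Disorder": (6, 97.98, 2, 99.71),
    "Breast Cancer": (10, 98.14, 3, 100.0),
}


def summarize(row):
    """Return (accuracy gain in percentage points, % of features removed)."""
    n_full, acc_full, n_selected, acc_selected = row
    gain = round(acc_selected - acc_full, 2)
    removed = round(100 * (n_full - n_selected) / n_full, 1)
    return gain, removed


summary = {name: summarize(row) for name, row in reported.items()}
for name, (gain, removed) in summary.items():
    print(f"{name}: +{gain} points, {removed}% of features removed")
```

Under this reading, every listed dataset gains accuracy while dropping more than 60% of its features.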
IV. CONCLUSION
This study applied the F-score method to assess the relevance between features and data. The relevant features that describe the CTG data are determined by a ranking method, with the mean F-score used as the threshold. The experiment results show that the F-score method was implemented successfully on the CTG dataset and on three other benchmark datasets. Using the F-score method improves the accuracy of fetal status prediction. In the future, the performance of the F-score method when combined with other classifiers will be tested. Further investigation will assess the correlation between the number of features in the dataset and the accuracy of the classifier. The use of the F-score as a statistical approach in the fitness function of wrapper or hybrid feature selection methods also needs to be explored in depth.

ACKNOWLEDGMENT
The dataset was obtained from the UCI Machine Learning Repository. The authors are very grateful for the support of the data. This research is funded by the grant from Konsorsium Riset Unggulan Perguruan Tinggi (KRUPT) NKB-1070/UN2.R3.1/HKP.05.00/2019.

REFERENCES
[1] R. K. Freeman, M. P. Nageotte, T. J. Garite, and L. A. Miller, Fetal Heart Rate Monitoring, 4th ed., Lippincott Williams & Wilkins: USA, 2012, pp. 85-111.
[2] H. Ocak, “A Medical Decision Support System Based on Support Vector Machine and the Genetic Algorithm for The Evaluation of Fetal Well-Being,” J Med Syst, 37: 9913, pp. 1-9, 2013.
[3] Z. Comert, A. F. Kocamaz, and V. Subha, “Prognosis Model Based on Image-Based Time-Frequency Features and Genetic Algorithm for Fetal Hypoxia Assessment,” Computers in Biology and Medicine, 99, pp. 85-97, 2018.
[4] V. Chudacek et al., “Assessment of Features for Automatic CTG Analysis Based on Expert Annotation,” Proc. 33rd Annual International Conference of the IEEE EMBS, Sept. 2011, pp. 6051-6054.
[5] A. Georgieva, S. J. Payne, M. Moulden, C. W. G. Redman, “Artificial Neural Networks Applied to Fetal Monitoring in Labour,” Neural Comput. Applic., 22, pp. 85-93, 2013.
[6] L. Xu, C. W. G. Redman, S. J. Payne, A. Georgieva, “Feature Selection Using Genetic Algorithms for Fetal Heart Rate Analysis,” Physiological Measurement, 35, pp. 1357-1371, 2014.
[7] G. Chandrashekar and F. Sahin, “A Survey on Feature Selection Methods,” Computers and Electrical Engineering, 40, pp. 16-28, 2014.
[8] J. Li et al., “Feature Selection: A Data Perspective,” ACM Computing Surveys, Vol. 50, No. 6, Article 94, pp. 94:1-94:45, 2017.
[9] A. A. Nadri, F. Rad, and H. Parvin, “A Framework for Categorize Feature Selection Algorithms for Classification and Clustering,” Bulletin de la Societe Royale des sciences de Liege, Vol. 85, pp. 850-862, 2016.
[10] S. Ravindran, A. B. Jambek, H. Muthusamy, and S-C. Neoh, “A Novel Clinical Decision Support System Using Improved Adaptive Genetic Algorithm for the Assessment of Fetal Well-Being,” Computational and Mathematical Methods in Medicine, Vol. 2015, pp. 1-11, 2015.
[11] Z. Chen, T. Lin, N. Tang, and X. Xia, “A Parallel Genetic Algorithm Based Feature Selection and Parameter Optimization for Support Vector Machine,” Scientific Programming, Volume 2016, pp. 1-10, 2016.
[12] S. Gunes, K. Polat, and S. Yosunkaya, “Multi-class f-score Feature Selection Approach to Classification of Obstructive Sleep Apnea Syndrome,” Expert System with Applications, 37, pp. 998-1004, 2010.
[13] D. Dua and C. Graff, UCI Machine Learning Repository [http://archive.ics.uci.edu/ml], University of California, School of Information and Computer Science: Irvine, CA, 2019.
[14] K. P. Bennet and O. L. Mangasarian, “Robust Linear Programming Discrimination of Two Linearly Inseparable Sets,” Optimization Methods and Software, 1, pp. 23-34, 1992.