Abstract—In this paper, an application of genetic algorithms (GAs) and the Gaussian Naïve Bayesian (GNB) approach is studied to explore brain activity by decoding specific cognitive states from functional magnetic resonance imaging (fMRI) data. In fMRI data analysis, however, the large number of attributes can make classifying cognitive states a serious problem, as it significantly increases the computational cost and memory usage of a classifier. To address this problem, we use GAs to select an optimal set of attributes and then a GNB classifier in a pipeline to classify different cognitive states. The experimental outcomes demonstrate the effectiveness of this approach in classifying different cognitive states, and a detailed comparison with popular machine learning classifiers illustrates the importance of such a GA-Bayesian pipeline for fMRI data analysis.

Keywords—functional Magnetic Resonance Imaging (fMRI); Genetic Algorithms; Gaussian Naïve Bayes; Decision Tree; Support Vector Machine

I. INTRODUCTION
Neuroimaging has shown that it is possible to decode a person's conscious experience from their brain activity using non-invasive techniques [1]. fMRI is a non-invasive technique based on the Blood Oxygen Level Dependent (BOLD) contrast that measures neural activity [2]. State-of-the-art machine learning techniques are widely used by neuroscientists for a variety of fMRI data analyses [3]. In fMRI data analysis, the challenging task is dealing with high dimensional data, also known as the "curse of dimensionality", since a single fMRI volume of the brain typically contains tens of thousands of voxels [4]. Data analysis and classification become harder as the dimensionality of the data increases [5]. In fMRI data analysis, feature selection techniques can alleviate this problem by selecting relevant features, which are then passed as input to machine learning classifiers. GAs have been successfully applied in medical domains to return the best set of features from high dimensional data [6]. The Bayesian approach has also been successfully applied to the analysis of fMRI data. These motivations drive us to combine their best attributes in a pipeline for building a robust and accurate classifier. The strength of the Bayesian approach is that it naturally leads to a distribution that can be used to make inferences for models containing more complex parameters than simple amplitudes and variances [7].

The rest of the paper is organized as follows. Section II discusses the preliminaries of this work. The overall framework of the proposed method, which combines the GA and GNB in a pipeline, is explained in Section III. The details of the fMRI data, the experimental setup, and an overview of the comparative methods are discussed in Section IV. The experimental results, analysis, and comparative study are presented in Section V. Conclusions and future research directions are discussed in Section VI.

II. PRELIMINARIES
The machine learning techniques used in the proposed method are explained in this section.

A. Gaussian Naïve Bayesian
Over the decades, Bayesian statistical decision theory has gained attention in diverse research areas. Its importance in perception has been recognized recently, as it provides a rigorous mathematical framework for describing the tasks that a perceptual system performs [11]. Compared to other machine learning models such as neural networks and support vector machines, the Bayes model has the advantage of modeling inner relationships by incorporating prior knowledge through probability theory [12]. Using the training data, the GNB classifier estimates the probability distribution over fMRI observations conditioned on the subject's cognitive state. It classifies a new example $\vec{x} = \langle x_1, \ldots, x_n \rangle$ by estimating the probability $P(c_i \mid \vec{x})$ of cognitive state $c_i$ given the fMRI observation $\vec{x}$. It estimates $P(c_i \mid \vec{x})$ using the following equation, under the assumption that the features are conditionally independent given the class.
P(c_i \mid \vec{x}) = \frac{P(c_i)\, P(\vec{x} \mid c_i)}{\sum_j P(c_j)\, P(\vec{x} \mid c_j)},    (1)

where $P(\vec{x} \mid c_i) = \prod_j P(x_j \mid c_i)$ can be estimated from the training set. Some extensions of GNB in the context of fMRI are GNB-pooled and hierarchical GNB, discussed in [13] and [14], respectively.
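For concreteness, the sketch below shows one way Eq. (1) can be evaluated when the class-conditional densities are Gaussian. It is an illustrative Matlab sketch rather than the paper's implementation; the function name gnbPosterior, the variables mu, sigma, prior, and x, and the log-space formulation (used for numerical stability) are assumptions made here.

  % Minimal sketch of the GNB posterior in Eq. (1) for Gaussian features.
  % mu, sigma : [nClasses x nFeatures] per-class feature means / std. deviations
  % prior     : [nClasses x 1] class priors P(c_i), estimated from training data
  % x         : [1 x nFeatures] new fMRI observation
  function post = gnbPosterior(x, mu, sigma, prior)
      nClasses = numel(prior);
      logLik = zeros(nClasses, 1);
      for c = 1:nClasses
          % log P(x|c) = sum_j log N(x_j; mu_cj, sigma_cj), by conditional independence
          logLik(c) = sum(-0.5*log(2*pi*sigma(c,:).^2) ...
                          - (x - mu(c,:)).^2 ./ (2*sigma(c,:).^2));
      end
      % Eq. (1): P(c|x) = P(c) P(x|c) / sum_j P(c_j) P(x|c_j), computed in log space
      logPost = log(prior(:)) + logLik;
      logPost = logPost - max(logPost);          % avoid underflow before exponentiating
      post    = exp(logPost) / sum(exp(logPost));
  end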
B. Genetic Algorithms
Genetic algorithms are parallel, iterative optimizers which have been successfully applied to a large number of optimization problems, including classification tasks. Given a set of feature vectors of the form $F = \{f_1, f_2, \ldots, f_d\}$, the GA produces a transformed set of vectors of the form $F' = \{w_1 f_1, w_2 f_2, \ldots, w_d f_d\}$, where $w_i$ is the weight associated with the $i$-th feature. The feature values are normalized and scaled by the associated weights before being used for training, testing, and classification [8].

The GA follows Darwin's survival-of-the-fittest principle, where the next generation is produced from the current generation using three operators: reproduction, crossover, and mutation [9]. The fittest chromosomes move on to the next generation. The classification accuracy is returned as a measure of the quality of the transformation matrix, which the GA uses to search for a transformation that minimizes the dimensionality of the transformed patterns and maximizes the classification accuracy [10].
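To make the three operators concrete, the following Matlab sketch shows one schematic generation step over a population of integer (feature-index) chromosomes. It is not the toolbox code used in the paper; the helper name nextGeneration, the fitness handle, and the convention that larger fitness is better are assumptions, and elitism and the exact crossover fraction are omitted for brevity.

  % Schematic single GA generation: roulette selection, two-point crossover,
  % and uniform mutation over integer feature-index chromosomes.
  % fitnessFcn is assumed to return a scalar where larger means fitter.
  function newPop = nextGeneration(pop, fitnessFcn, mutRate, nFeatures)
      [popSize, genomeLen] = size(pop);
      fit = zeros(popSize, 1);
      for k = 1:popSize
          fit(k) = fitnessFcn(pop(k, :));            % evaluate each chromosome
      end
      % Reproduction: fitness-proportionate (roulette-wheel) selection
      p = fit - min(fit) + eps;   p = p / sum(p);
      parents = pop(randsample(popSize, popSize, true, p), :);
      % Two-point crossover on consecutive parent pairs
      newPop = parents;
      for k = 1:2:popSize - 1
          cut = sort(randperm(genomeLen - 1, 2));    % two distinct crossover points
          seg = cut(1) + 1 : cut(2);
          newPop(k,   seg) = parents(k+1, seg);
          newPop(k+1, seg) = parents(k,   seg);
      end
      % Mutation: reset a small fraction of genes to random feature indices
      mask = rand(popSize, genomeLen) < mutRate;
      newPop(mask) = randi(nFeatures, nnz(mask), 1);
  end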
III. PROPOSED METHOD
The overall framework of the proposed technique is shown in Fig. 1. In the proposed technique, the processed fMRI data (.mat) file is supplied as input to the GA for selecting the most promising features from the high dimensional dataset (details are discussed in Subsection III A). The selected features are then used to construct the GNB classifier for classification (cf. Subsection III B). The popular k-fold cross validation method is used for validating the data, and we have partitioned the data by group (the class of each observation). The classification accuracy of the constructed classifier is determined from the confusion matrix based on the classification results obtained on the test data.

We have compared the performance of the proposed technique with popular classifiers such as Decision Tree (DT), Support Vector Machine (SVM), and Multilayer Perceptron Network (MLP), which are highlighted in Section IV B.

A. Genetic Algorithms for Feature Selection from fMRI
Here we discuss the working structure of the GA, from the generation of the initial population pool, through the fitness function and genetic operators, to the parameter configuration, in connection with feature selection.

The initial population is generated by populating a matrix with population size rows and number of independent variables (genome length) columns. The values in this matrix are integers, randomly selected from the processed input data based on ranking, as shown in Fig. 2.

Figure 2. Structure of the initial population matrix.

The fitness of the population is estimated using the fitness function, as shown in Fig. 3. We have used the fitness function provided by the GA Toolbox, which maximizes the separability of two classes using a linear combination of the posterior probability and the empirical error rate of the linear classifier (classify).
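The exact toolbox-provided fitness function is not reproduced here; the sketch below illustrates one plausible form of such a function, combining the empirical (apparent) error rate and the posterior probabilities returned by classify. The function name featureSubsetCost, the variables X and y (observations and numeric class labels), and the specific weighting of the two terms are assumptions, not the code used in the paper.

  % One plausible feature-subset cost built around CLASSIFY: the apparent error
  % rate minus the mean posterior probability of the true class, so that a GA
  % which minimizes this cost favors low error and well-separated classes.
  % idx : chromosome = indices of the selected features (columns of X)
  % y   : column vector of numeric class labels
  function cost = featureSubsetCost(idx, X, y)
      Xs = X(:, idx);                                  % keep only selected features
      [~, err, post] = classify(Xs, Xs, y, 'linear');  % apparent error + posteriors
      classes = unique(y);                             % class order assumed sorted
      [~, col] = ismember(y, classes);                 % posterior column of true class
      truePost = post(sub2ind(size(post), (1:numel(y))', col));
      cost = err - mean(truePost);
  end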
Table I. GA Parameters
Genetic Algorithm Parameter     Value
Population Size                 48
Number of Generations           100
Selection function              Roulette
Crossover function / rate       Two-point crossover / 0.8
Mutation rate                   0.01

Based on the above initial population and parameter configuration, the input is passed to the genetic algorithm function provided by the GA toolbox, which returns the best features. The genetic algorithm function is run multiple times (as it is stochastic) to obtain the best set of features, which can contribute significantly to the subsequent GNB classification.
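A sketch of how the toolbox ga function could be configured with the Table I parameters and run repeatedly is given below. The fitness handle featureSubsetCost is carried over from the earlier sketch; the genome length nSelected, the number of runs, and the rounding of continuous genes to integer feature indices are illustrative assumptions, not details reported in the paper.

  % Sketch: configure the toolbox GA with the Table I parameters and run it
  % several times (the GA is stochastic), keeping the best feature subset.
  nFeatures = size(X, 2);
  nSelected = 50;                                   % genome length (assumed)
  opts = gaoptimset( ...
      'PopulationSize',    48, ...
      'Generations',       100, ...
      'SelectionFcn',      @selectionroulette, ...
      'CrossoverFcn',      @crossovertwopoint, ...
      'CrossoverFraction', 0.8, ...
      'MutationFcn',       {@mutationuniform, 0.01}, ...
      'PopInitRange',      [1; nFeatures]);
  % Genes are continuous in ga, so round and clamp them to valid feature indices.
  fitness = @(v) featureSubsetCost(max(1, min(nFeatures, round(v))), X, y);
  bestCost = inf;
  for run = 1:5
      [v, cost] = ga(fitness, nSelected, [], [], [], [], [], [], [], opts);
      if cost < bestCost
          bestCost = cost;
          bestIdx  = unique(max(1, min(nFeatures, round(v))));
      end
  end

Because the toolbox GA minimizes its fitness function, featureSubsetCost is written as a cost (lower is better), and the repeated runs simply keep the feature subset with the lowest cost.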
B. Gaussian Naïve Bayesian for Cognitive State Classification
The best features selected by the GA are then given as input to the GNB classifier for predicting the true class labels. The best features obtained from the GA are arranged in matrix form [X × Y], where X denotes the observations (the data of the best features) and Y denotes the class labels of the observations. We have divided the dataset into two parts in an 80-20 ratio, where 80% of the data is used for training the GNB classifier and 20% is kept for testing. The GNB classifier is trained using the training data and the classification accuracy of the model is evaluated on the test data. The confusion matrix is prepared to obtain the classification accuracy.
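The hold-out split, GNB training, and confusion-matrix step described above could be coded along the following lines. The variables X, y, and bestIdx are carried over from the earlier sketches, and fitcnb is the current Statistics Toolbox name for the Gaussian Naïve Bayes fit (the paper's R2010a environment used the toolbox classes available at that time).

  % Sketch of the 80-20 hold-out, GNB training, and confusion-matrix step.
  Xsel = X(:, bestIdx);                         % keep only GA-selected features
  cvp  = cvpartition(y, 'HoldOut', 0.2);        % stratified 80-20 split by class
  Xtr  = Xsel(training(cvp), :);  ytr = y(training(cvp));
  Xte  = Xsel(test(cvp), :);      yte = y(test(cvp));

  nb    = fitcnb(Xtr, ytr);                     % Gaussian NB (normal distributions)
  ypred = predict(nb, Xte);

  C   = confusionmat(yte, ypred);               % rows: true class, cols: predicted
  acc = 100 * sum(diag(C)) / sum(C(:));         % overall accuracy in percent

Sensitivity and specificity can be derived from the same confusion matrix, as defined later in Eqs. (2)-(4).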
IV. EXPERIMENTAL STUDY
The experimental setup along with the data set description is presented in this section.

A. Experimental Setup and Data Preparation
The experiment is carried out on a 32-bit platform with an Intel 2.70 GHz processor and 4.00 GB of RAM, running the Windows 7 operating system. The programs for the experiment are coded in Matlab R2010a (The MathWorks©), and the GA toolbox functions provided by Matlab are used for feature selection. The fMRI data set is collected from Carnegie Mellon University (CMU)'s public StarPlus fMRI data repository. The data is taken for a single subject (`04847') and is partitioned into trials. The experiment consists of a set of trials. In some of these intervals, the subject simply rested or gazed at a fixation point on the screen. In other trials, the subject was shown a picture and a sentence and instructed to press a button to indicate whether the sentence correctly described the picture, as shown in Fig. 4.

Figure 4. Picture Sentence Study.

For these trials, the sentence and picture were presented in sequence, with the picture presented first on half of the trials and the sentence presented first on the other half. Forty such trials were available for each subject. The timing within each such trial is as follows:
• The first stimulus (sentence or picture) was presented at the beginning of the trial (image=1).
• Four seconds later (image=9) the stimulus was removed and replaced by a blank screen.
• Four seconds later (image=17) the second stimulus was presented. This remained on the screen for four seconds, or until the subject pressed the mouse button, whichever came first.
• A rest period of 15 seconds (30 images) was added after the second stimulus was removed from the screen. Thus, each trial lasted a total of approximately 27 seconds (approximately 54 images).

The images were collected every 500 msec. There are 54 trials and 2800 snapshots in total. The data is stored in a [54 × 1] cell array with one cell per 'trial' in the experiment. Each element of the cell array is an [N × V] array of observed fMRI activations, and each array contains 4698 voxels (features) per snapshot. A sample of voxel activity at a specific time course is shown in Fig. 5.

Figure 5. Voxel activity at a particular time course.

The genetic algorithm is applied to reduce the number of features (V) of each [N × V] array. During the initial population generation we have ignored Cond=0, which marks data to be ignored, and Cond=1, which marks a rest or fixation interval. The dataset of features and class labels applied to the genetic algorithm is of size [4698 × 219]. The number of samples used for training and testing is shown in Table II.

We have partitioned the training and test samples in an 80-20 ratio using 'cvpartition' based on the group/class label. The classification performance is evaluated in terms of accuracy, sensitivity, and specificity:
Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \times 100\%,    (2)

Sensitivity = \frac{TP}{TP + FN} \times 100\%,    (3)

Specificity = \frac{TN}{TN + FP} \times 100\%,    (4)
where TP (True Positives) = correctly classified positive cases,
TN (True Negatives) = correctly classified negative cases,
FP (False Positives) = incorrectly classified negative cases,
FN (False Negatives) = incorrectly classified positive cases.
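Given a 2×2 confusion matrix C from confusionmat (rows: true class, columns: predicted class), Eqs. (2)-(4) reduce to the few lines below; treating class 1 as the positive class is an assumption made for illustration.

  % Computing Eqs. (2)-(4) from a 2x2 confusion matrix C = confusionmat(yte, ypred),
  % assuming class 1 is the positive class.
  TP = C(1,1);  FN = C(1,2);
  FP = C(2,1);  TN = C(2,2);
  accuracy    = 100 * (TP + TN) / (TP + TN + FP + FN);
  sensitivity = 100 *  TP / (TP + FN);
  specificity = 100 *  TN / (TN + FP);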
Algorithm         Sensitivity   Specificity   Accuracy
GNB without FS    40.94         84.93         71.75
GA FS + GNB       99.54         14.28         96.46
DT without FS     51.19         56.16         57.40
GA FS + DT        100           0             94.69
SVM without FS    51.07         83.1          84.73
MLP without FS    49.37         93.15         91.79

Fig. 8 illustrates the accuracy comparison of the classifiers. The generation vs. fitness value over 51 generations is shown in Fig. 9.