Epileptic Seizure Detection Using Dynami
www.elsevier.com/locate/eswa
Abstract
Epileptic seizures are manifestations of epilepsy. Careful analyses of the electroencephalograph (EEG) records can provide valuable
insight and improved understanding of the mechanisms causing epileptic disorders. The detection of epileptiform discharges in the EEG is an
important component in the diagnosis of epilepsy. Wavelet transform is particularly effective for representing various aspects of non-
stationary signals such as trends, discontinuities, and repeated patterns where other signal processing approaches fail or are not as effective.
Through wavelet decomposition of the EEG records, transient features are accurately captured and localized in both time and frequency
context. This paper deals with a novel method of analysis of EEG signals using discrete wavelet transform, and classification using ANN.
EEG signals were decomposed into the frequency sub-bands using wavelet transform. Then these sub-band frequencies were used as an input
to an ANN with two discrete outputs: normal and epileptic. In this study, FEBANN and DWN based classifiers were developed and compared
in relation to their accuracy in classification of EEG signals. The comparisons between the developed classifiers were primarily based on
analysis of the ROC curves as well as a number of scalar performance measures pertaining to the classification. The DWN-based classifier
outperformed its FEBANN-based counterpart in classification accuracy.
© 2005 Elsevier Ltd. All rights reserved.
Keywords: Electroencephalogram (EEG); Epileptic seizure; Discrete wavelet transform (DWT); Feedforward error backpropagation artificial neural network
(FEBANN); Dynamic wavelet network (DWN)
autoregressive (AR), reduces the spectral loss problems and gives better frequency resolution. But, since the EEG signals are non-stationary, the parametric methods are not suitable for frequency decomposition of these signals (Subasi, 2005).

A powerful method was proposed in the late 1980s to perform time-scale analysis of signals: the wavelet transforms (WT). This method provides a unified framework for different techniques that have been developed for various applications (Adeli, Zhou, & Dadmehr, 2003; Basar, Schurmann, Demiralp, Basar-Eroglu, & Ademoglu, 2001; Folkers, Mosch, Malina, & Hofmann, 2003; Geva, & Kerem, 1998; Hazarika, Chen, Tsoi, & Sergejew, 1997; Kalayci, & Ozdamar, 1995; Khan, & Gotman, 2003; Patwardhan, Dhawan, & Relue, 2003; Petrosian, Prokhorov, Homan, Dashei, & Wunsch, 2000; Quiroga, Sakowitz, Basar, & Schurmann, 2001; Quiroga, & Schurmann, 1999; Rosso, Blanco, & Rabinowicz, 2003; Rosso, Martin, & Plastino, 2002; Samar, Bopardikar, Rao, & Swartz, 1999; Soltani, Simard, & Boichu, 2004; Zhang, Kawabata, & Liu, 2001). It should also be emphasized that the WT is appropriate for analysis of non-stationary signals, and this represents a major advantage over spectral analysis. Hence the WT is well suited to locating transient events. Such transient events as spikes can occur during epileptic seizures.

The wavelet transform is an effective time–frequency analysis tool for analyzing transient signals. Its feature extraction and representation properties can be used to analyze various transient events in biological signals. Adeli et al. (2003) gave an overview of the DWT developed for recognizing and quantifying spikes, sharp waves and spike-waves. They used wavelet transform to analyze and characterize epileptiform discharges in the form of 3-Hz spike and wave complex in patients with absence seizure. Through wavelet decomposition of the EEG records, transient features are accurately captured and localized in both time and frequency context. The capability of this mathematical microscope to analyze different scales of neural rhythms is shown to be a powerful tool for investigating small-scale oscillations of the brain signals. A better understanding of the dynamics of the human brain through EEG analysis can be obtained through further analysis of such EEG records.

Numerous other techniques from the theory of signal analysis have been used to obtain representations and extract the features of interest for classification purposes. Neural networks and statistical pattern recognition methods have been applied to EEG analysis. Neural network detection systems have been proposed by a number of researchers (Gabor, Leach, & Dowla, 1996; Haselsteiner, & Pfurtscheller, 2000; Kiymik, Akin, & Subasi, in press; Peters, Pfurtscheller, & Flyvbjerg, 2001; Pradhan, Sadasivan, & Arunodaya, 1996; Qu, & Gotman, 1997; Robert, Gaudy, & Limoge, 2002; Sun, & Sclabassi, 2000; Webber, Lesser, Richardson, & Wilson, 1996; Weng, & Khorasani, 1996).

Kalayci and Ozdamar (1995) showed that an ANN performs better if the input and output data can be processed to capture the characteristic features of the signal. They used a wavelet representation for automated detection of the EEG spikes. More recently, ANNs that apply Bayesian methods have been shown to be more robust than other techniques because they incorporate measures of confidence in their output for the Levenberg-Marquardt (LM) procedure (Vuckovic, Radivojevic, Chen, & Popovic, 2002). In addition, standard MLP was improved by using finite impulse response filters (FIR) instead of static weights for a temporal processing of data (Haselsteiner, & Pfurtscheller, 2000). Petrosian et al. (2000) showed the ability of specifically designed and trained recurrent neural networks (RNN), combined with wavelet pre-processing, to predict the onset of epileptic seizures on both scalp and intracranial recordings. Recently, Kiymik et al. (2004) presented time–frequency analysis of EEG signals for detecting the information on alertness and drowsiness using spectral densities of DWT coefficients as an input to ANN.

Some studies, such as those of Petrosian et al. (2000), report seizure prediction after analyzing one channel of electroencephalogram (EEG) from an intracranial depth electrode in one patient. In these studies, which used univariate techniques, no analysis of baseline data far removed from the seizure was undertaken. A potential pitfall of conclusions based upon such limited data is that quantitative changes identified prior to seizure onset may not be specific to the pre-seizure period, but may occur at other times as well, unrelated to epileptic events. Validation of prediction algorithms on long, continuous sets of clinical data, representing all states of awareness, is an important part of more recent seizure prediction studies. A number of promising quantitative features derived from the EEG, each with different theoretical bases, have demonstrated utility for seizure prediction.

As compared to the conventional method of frequency analysis using Fourier transform or short time Fourier transform, wavelets enable analysis with a coarse to fine multi-resolution perspective of the signal (Subasi, 2005). In this work, DWT has been applied for the time–frequency analysis of EEG signals and ANN for the classification using wavelet coefficients. EEG signals were decomposed into frequency sub-bands using discrete wavelet transform (DWT). An ANN based system was implemented to classify the EEG signal to one of the categories: epileptic or normal. The aim of this study was to develop a simple algorithm for the detection of epileptic seizure which could also be applied in real time.

This paper aims to compare the more advanced and relatively recent neural network techniques, as mathematical tools for developing classifiers for the detection of epileptic seizure. In the neural network techniques, both the feedforward error backpropagation ANN (FEBANN), and the dynamic wavelet neural network (DWN) (Becerikli, 2004;
A. Subasi / Expert Systems with Applications 29 (2005) 343–355 345
Oysal, Yilmaz, & Koklukaya, 2005) will be used. The choice of these two networks was based on the fact that the former is the most popular type of ANNs and the latter is one of the most powerful networks commonly used in solving classification/discrimination problems. The accuracy of the various classifiers will be assessed and cross-compared, and advantages and limitations of each technique will be discussed.

recording entirely for epileptic seizures that had been overlooked by all during the first pass and marked them as definite or possible. This validated set provided the reference evaluation to estimate the sensitivity and specificity of computer scorings. Nevertheless, a preliminary analysis was carried out solely on events in the training set, as each stage in these sets had a definite start and duration.
2.3. Analysis using discrete wavelet transforms
[Figure: EEG traces from channels F8-C4, F7-C3, T6-O2 and T5-O1; vertical axis: amplitude; horizontal axis: samples 0–3000.]
The proposed method was applied on a wide variety of EEG data for both epileptic and normal signals. Four channels of EEG (F7-C3, F8-C4, T5-O1 and T6-O2) recorded from a patient with absence seizure epileptic discharges are shown in Fig. 1 and a normal EEG signal is shown in Fig. 2. Fig. 3 shows five different levels of approximation (identified by A1–A5 and displayed in the left column) and details (identified by D1–D5 and displayed in the right column) of an epileptic EEG signal. Fig. 4 shows five different levels of approximation (identified by A1–A5 and displayed in the left column) and details (identified by D1–D5 and displayed in the right column) of a normal EEG signal. These approximation and detail records are reconstructed from the wavelet coefficients obtained with the Daubechies 4 (DB4) wavelet filter. Approximation A4 is obtained by superimposing details D5 on approximation A5. Approximation A3 is obtained by superimposing details D4 on approximation A4, and so on. Finally, the original signal is obtained by superimposing details D1 on approximation A1. Wavelet transform acts like a mathematical microscope, zooming into small scales to reveal compactly spaced events in time and zooming out into large scales to exhibit the global waveform patterns (Adeli et al., 2003).

The extracted wavelet coefficients provide a compact representation that shows the energy distribution of the EEG signal in time and frequency. Table 1 presents frequencies corresponding to different levels of decomposition for Daubechies order 4 wavelet with a sampling frequency of 200 Hz. It can be seen from Table 1 that the components of the A5 decomposition are within the δ range (1–4 Hz), those of the D5 decomposition are within the θ range (4–8 Hz), those of the D4 decomposition are within the α range (8–13 Hz), and those of the D3 decomposition are within the β range (13–30 Hz). Lower level decompositions corresponding to higher frequencies have negligible magnitudes in a normal EEG.

2.4. Classification using artificial neural networks

Artificial neural networks (ANNs) are formed of cells simulating the low level functions of biological neurons. In ANN, knowledge about the problem is distributed in neurons and connection weights of links between neurons. The neural network has to be trained to adjust the connection weights and biases in order to produce the desired mapping. At the training stage, the feature vectors are applied as input to the network and the network adjusts its variable parameters, the weights and biases, to capture the relationship between the input patterns and outputs. ANNs are particularly useful for complex pattern recognition and classification tasks. The capability of learning from examples, the ability to reproduce arbitrary non-linear functions of input, and the highly parallel and regular structure of ANN make them especially suitable for pattern
[Figure: EEG traces from channels F8-C4, F7-C3, T6-O2 and T5-O1; vertical axis: amplitude; horizontal axis: number of samples, 0–3000.]
classification tasks (Basheer, & Hajmeer, 2000; Fausett, 1994; Haselsteiner, & Pfurtscheller, 2000; Shimada, Shiina & Saito, 2000; Sun, & Sclabassi, 2000). In this paper, two neural networks relevant to the application being considered (i.e., classification of epileptic/normal EEG data) will be employed for designing classifiers; namely the FEBANN and DWN.

For a network trained by error backpropagation, a number of issues have to be addressed to ensure successful network development (Basheer, & Hajmeer, 2000). Most important among those issues are the network size (architecture) and number of training cycles. If training is insufficient, the network will not learn the examples presented to it. In contrast, excessive training of the network will force it to memorize the training examples. This will result in a network that is unable to generalize to cases from outside the training database. Additionally, an oversized ANN comprising a large number of units in the hidden layers tends to learn the noise and over-fit the data rather than uncover the overall underlying trend (similar to over-parameterized polynomials). One practical approach to avoid these problems is through cross validation in which test examples (different from training examples) selected randomly from the parent database are continuously used to examine generalization of the network after each cycle during training. The quality of network predictions for these test examples, quantified using some error measure, can serve as a criterion for stopping training or determining the optimum network size. Unlike the error in training data that continues to decline with network size and number of training cycles, the test set error reaches a minimum at the optimum ANN size and/or number of training cycles. The 'optimum' network is considered to contain sufficient knowledge about the phenomenon being modelled.

2.4.1. Selection of network parameters

For solving the pattern classification problem, an ANN employing the back-propagation training algorithm was used. An effective training algorithm and better-understood system behaviour are the advantages of this type of neural network. Selection of network input parameters and performance of neural network are important for epileptic seizure detection.

The classification scheme of 1-of-C coding has been used for classifying the signal into one of the output categories. For each type of EEG signals, a corresponding output class is associated. The feature vector set, x, represents the ANN inputs, and the corresponding class, once coded, constitutes the ANN outputs. In order to make the neural network training more efficient, the input feature vectors were normalized so that they fall in the range [0, 1.0]. Since the number of output classes is 2, the ANN with one output is
Fig. 3. Approximate and detailed coefficients of EEG signal taken from unhealthy subject (epileptic patient). [Panels: approximations A1–A5 (left column) and details D1–D5 (right column); horizontal axis: number of samples.]
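As a rough aid to reading the A1–A5 and D1–D5 panels, the sketch below computes the nominal frequency band covered by each level of a dyadic decomposition at the 200 Hz sampling frequency stated in the text. It assumes ideal half-band filters, which real Daubechies filters only approximate; the function name `dwt_band_edges` is hypothetical and not part of the paper.

```python
def dwt_band_edges(fs, levels):
    """Nominal band (Hz) of each DWT sub-band, assuming ideal half-band filters."""
    bands = {}
    upper = fs / 2.0  # Nyquist frequency of the sampled signal
    for j in range(1, levels + 1):
        lower = upper / 2.0
        bands[f"D{j}"] = (lower, upper)  # detail at level j keeps the upper half
        upper = lower                    # approximation keeps the lower half
    bands[f"A{levels}"] = (0.0, upper)   # final approximation covers the remainder
    return bands


bands = dwt_band_edges(fs=200.0, levels=5)
for name, (low, high) in bands.items():
    print(f"{name}: {low:g}-{high:g} Hz")
```

Under this idealisation the bands only roughly line up with the δ, θ, α and β ranges quoted in the text, so the correspondence should be read as approximate. In practice the coefficients themselves could be obtained with PyWavelets, where `pywt.wavedec(x, 'db4', level=5)` returns the list `[cA5, cD5, cD4, cD3, cD2, cD1]`.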
sufficient to produce a code for each class. The outputs are represented by basis vectors:

[0.1] = normal
[0.9] = epileptic

Each dummy variable is given the value 0.1 except for the one corresponding to the correct category, which is given the value 0.9. Using target values of 0.1 and 0.9 instead of the common practice of 0 and 1 prevents the outputs of the network from being directly interpretable as posterior probabilities (Kandaswamy, Kumar, Ramanathan, Jayaraman, & Malmurugan, 2004).

2.4.2. Cross validation

Cross validation (CV) (Basheer, & Hajmeer, 2000; Haselsteiner, & Pfurtscheller, 2000) is often used for comparing two or more learning ANN models to estimate which model will perform the best on the problem at hand. With n-fold CV, the available data is partitioned into n disjoint subsets, the union of which is equal to the original set. Each learning model is trained on n − 1 of the available subsets, and then tested on the one subset which was not used during training. This process is repeated n times, each time using a different test set chosen from the n available partitions of the training data, until all possible choices for the test set have been exhausted. The n test set scores for each learning model are then averaged, and the model with the highest average test set score is chosen as the one most likely to perform well on unseen data (Kandaswamy et al., 2004).

2.4.3. Measuring error

Given a random set of initial weights, the outputs of the network will be very different from the desired classifications. As the network is trained, the weights of the system are continually adjusted to reduce the difference between the output of the system and the desired response. The difference is referred to as the error and can be measured in several ways. The most common measurement is sum
Fig. 4. Approximate and detailed coefficients of EEG signal taken from a healthy subject. [Panels: approximations A1–A5 (left column) and details D1–D5 (right column); horizontal axis: number of samples.]
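The n-fold cross-validation procedure of Section 2.4.2 can be sketched as follows. This is a minimal illustration rather than the author's implementation: the helper names, the fixed random seed and the `dummy_score` stand-in (which replaces actual network training) are all assumptions made for the example.

```python
import random


def n_fold_splits(num_examples, n, seed=0):
    """Partition example indices into n disjoint folds; yield (train, test) lists."""
    indices = list(range(num_examples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n] for i in range(n)]  # disjoint; union is all indices
    for i in range(n):
        test = folds[i]
        train = [idx for k, fold in enumerate(folds) if k != i for idx in fold]
        yield train, test


def cross_validate(train_and_score, num_examples, n):
    """Average the n test-set scores returned by train_and_score(train, test)."""
    scores = [train_and_score(tr, te) for tr, te in n_fold_splits(num_examples, n)]
    return sum(scores) / len(scores)


def dummy_score(train, test):
    # Hypothetical stand-in: a real version would train a network on `train`
    # and return its accuracy on `test`.
    return len(train) / (len(train) + len(test))


average_score = cross_validate(dummy_score, num_examples=500, n=5)
```

Each candidate model would be passed through `cross_validate` in turn, and the one with the highest average score retained, as the text describes.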
squared error (SSE) and mean squared error (MSE). SSE is the sum of the squares of the differences between each output and the desired output, and MSE is their average (Basheer, & Hajmeer, 2000; Haselsteiner, & Pfurtscheller, 2000). In this study, SSE was used for measuring performance of the neural network.

2.5. Dynamic wavelet network

Dynamic wavelet network (DWN) models have been used in the meaning of a network. The DWN model we used has unconstrained connectivity and has dynamic elements in the wavelon (neuron of DWN) processing units. A schematic diagram for the dynamic networks with three neurons is shown in Fig. 5. $W_i$ can be a wavelon in a DWN. In general, there are L input signals which can be time-varying, n dynamic units, n bias terms, and M output signals. The units have dynamics associated with them and they receive the input from themselves, the bias term, and from all other units. The output of a unit $y_i$ is an activation function $h(x_i)$ of a state variable $x_i$ associated with the unit. The output of the overall network is a linear weighted sum of the unit outputs. The bias term $b_i$ is added to the unit inputs. $p_{ij}$ is the input connection weight from the jth input to the ith wavelon, $w_{ij}$ is the interconnection weight from the jth wavelon to the ith wavelon and $q_{ij}$ is the output connection weight from the jth wavelon to the ith output. $T_i$ is the dynamic constant of the ith wavelon and $b_i$ is the bias (or polarization) term of the ith wavelon (Becerikli, 2004).

Fig. 5. Schematic diagram of a DWN with three wavelons. [Diagram: input u, wavelons W1, W2 and W3, output y.]

In DWNs, the input of each wavelet neuron (wavelon) is transported to the output over a lag dynamic via a wavelet activation function. Wavelets are usually explained as basis functions which are compact (closed and bounded), orthogonal
(or orthonormal), and have time–frequency localization properties. But, to provide all of those properties is very difficult. Basis functions are called 'activation functions' in ANN literature, and can be a global or local feature in time. Global basis functions are active for the wide values of inputs and the receptive field of the basis function is approximately constant far from the center (e.g., logarithmic sigmoid function). But, the local basis functions are only active near the center; the value tends to zero far from the centre (Becerikli, 2004).

If the global basis function is used in a network, all activation functions interact with each other and each node, and they cover a wide input interval. This causes the large number of parameters to adjust and necessitates a long computation time. In addition, for wide input intervals, much more extrapolation error occurs. The most important disadvantage of orthonormal compact basis functions is that they can not be obtained in the closed analytical form.

To remove all those disadvantages, the local basis functions can be used. The local basis functions are only active for certain inputs. In addition, the generalization errors decrease (Becerikli, 2004). In this study, only the local basis functions have been used. The most important local function is Gaussian:

$$\phi(t) = \exp\left(-\frac{t^{2}}{2}\right), \quad t \in \mathbb{R} \tag{1}$$

where $\phi \in L^{2}(\mathbb{R})$. For the more general case:

$$\phi\left(\frac{t-b}{a}\right) = \exp\left(-\frac{1}{2}\left(\frac{t-b}{a}\right)^{2}\right), \quad t \in \mathbb{R} \tag{2}$$

where b is the center or translation and a is the standard deviation or dilation. However, the Gaussian function is not local in frequency (Becerikli, 2004). The locality features in both time and frequency is a very important concept for the representation of the signals. Therefore, the mission of the wavelet functions is comprehensive.

The locality in time and frequency can be explained as follows:

• If a function is described in a bounded interval and has a very small value outside the boundary, then that function is local in time. The local function in time can be shifted by changing its centre.
• If the frequency spectrum of the local function in time is described in a bounded frequency interval and has very small value outside the boundary, and also can be shifted by changing its dilation, then that function is local in frequency.

A deficiency of Gaussian-based ANNs is that they do not have localization capabilities in frequency. Since the Gaussian function is not local in frequency, it is very difficult to use Gaussian-based functions in some applications (Sanner, & Slotine, 1992). To overcome these problems, there is a very effective way to use wavelet functions with time–frequency localization properties (Cannon, & Slotine, 1995). In some studies, the first derivative of the Gaussian function has been used (Mallat, 1987; Qussar, Rivals, Personnaz, & Dreyfus, 1998). However, the locality properties of the second derivative of the Gaussian function are clearer. A non-orthonormal Mexican Hat basis function (second derivative of the Gaussian function) can be easily written in the analytical form and its Fourier transform can be found (Becerikli, 2004), thus:

$$\phi(t_i) = \left(1 - t_i^{2}\right)\exp\left(-\frac{t_i^{2}}{2}\right), \quad t \in \mathbb{R} \tag{3}$$

$$\hat{\phi}(\omega) = \sqrt{2\pi}\,\omega^{2}\exp\left(-\frac{\omega^{2}}{2}\right), \quad \omega \in \mathbb{R} \tag{4}$$

where $\omega$ is a real frequency. The last equation can be generalized as follows:

$$\phi\left(\frac{t_i-b_i}{a_i}\right) = \left[1 - \left(\frac{t_i-b_i}{a_i}\right)^{2}\right]\exp\left(-\frac{1}{2}\left(\frac{t_i-b_i}{a_i}\right)^{2}\right) \tag{5}$$

where $b_i$ and $a_i$ are the translation (center) and dilation (standard deviation) parameters, respectively. Wavelet functions have efficient time–frequency localization properties, as shown from the frequency spectrum (Mallat, 1987). If the dilation parameter is changed, the support region width of the wavelet function changes, but the number of cycles does not change. That is, the peak number does not change; however, when the dilation parameter decreases, the peak point of the spectrum shifts to a higher frequency. Therefore, all frequency spectrums can be obtained by changing the dilation. In this study, Eq. (4) has been used as a mother (main) wavelet (Becerikli, 2004). An N-dimensional mother wavelet can be given in the separable structure with the product rule as follows (Cannon, & Slotine, 1995; Mallat, 1987; Qussar et al., 1998; Zhang, & Benveniste, 1992; Zhang, Walter, & Lee, 1995):

$$\Phi_i(t) = \prod_{j=1}^{N} \phi_j\left(\frac{t_j-b_{ij}}{a_{ij}}\right) \tag{6}$$

where $t_i \in \mathbb{R}^{N}$ is the input and N is the input number. A function $y = f(t)$ can be represented with wavelets obtained from the mother wavelet (Cannon, & Slotine, 1995; Mallat, 1987; Qussar et al., 1998) as below:

$$y_i = h_i(t) = \sum_{j=1}^{N_w} s_{ij}\,\phi_j(t) + a_{i0} + \sum_{k=1}^{N} a_{ik}\,t_k \tag{7}$$

where $s_{ij}$ are the coefficients of the mother wavelets, $N_w$ is the number of wavelets, $a_{i0}$ is a mean or bias term, and $a_{ik}$ are the linear term coefficients of this approach.
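To make Eqs. (5) and (7) concrete, the following sketch evaluates the dilated and translated Mexican Hat function and a single-input version of the wavelet expansion. It is an illustrative reading of the formulas above, not the author's code; the function names and the example parameter values are assumptions.

```python
import math


def mexican_hat(t, a=1.0, b=0.0):
    """Eq. (5): (1 - z^2) * exp(-z^2 / 2), with z = (t - b) / a."""
    z = (t - b) / a
    return (1.0 - z * z) * math.exp(-0.5 * z * z)


def wavelet_expansion(t, s, a, b, a0=0.0, a1=0.0):
    """Single-input form of Eq. (7): sum_j s[j] * phi((t - b[j]) / a[j]) + a0 + a1 * t."""
    wavelet_sum = sum(sj * mexican_hat(t, aj, bj) for sj, aj, bj in zip(s, a, b))
    return wavelet_sum + a0 + a1 * t


# Hypothetical parameters, chosen only to show the call
y = wavelet_expansion(0.5, s=[1.0, -0.5], a=[1.0, 2.0], b=[0.0, 1.0], a0=0.1, a1=0.2)
```

The function peaks at 1 when t equals the translation b and crosses zero one dilation a away from it, which is the time locality the text relies on.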
The wavelet function in this structure will be used in the DWN given in Fig. 5. The structure used in (Becerikli, 2004; Becerikli, Konar, & Samad, 2003; Becerikli, Oysal, & Konar, 2004; Oysal et al., 2005) has been adapted to this network. The wavelets in Eqs. (6) and (7) will be used as the activation functions in the network. Each activation function has a single input/single output (SISO), and can be re-expressed as:

$$\Phi_i(t_i) = \phi_i\left(\frac{t_i-b_{ij}}{a_{ij}}\right) \tag{8}$$

$$y_i = h_i(t_i) = \sum_{j=1}^{N_w} s_{ij}\,\phi_i\left(\frac{t_i-b_{ij}}{a_{ij}}\right) + a_{i0} + a_{i1}\,t_i \tag{9}$$

$$\phi_i\left(\frac{t_i-b_{ij}}{a_{ij}}\right) = \left[1 - \left(\frac{t_i-b_{ij}}{a_{ij}}\right)^{2}\right]\exp\left(-\frac{1}{2}\left(\frac{t_i-b_{ij}}{a_{ij}}\right)^{2}\right) \tag{10}$$

2.6. Evaluation of performance

The coherence of the diagnosis of the expert neurologists and diagnosis information was calculated at the output of the classifier. Prediction success of the classifier may be evaluated by examining the confusion matrix. In order to analyze the output data obtained from the application, sensitivity (true positive ratio) and specificity (true negative ratio) are calculated by using the confusion matrix. The sensitivity value (true positive, same positive result as the diagnosis of expert neurologists) was calculated by dividing the number of positive diagnoses that agree with the expert neurologists by the total number of positive diagnoses stated by the expert neurologists. Sensitivity, also called the true positive ratio, is calculated by the formula:

$$\text{Sensitivity} = \text{TPR} = \frac{TP}{TP + FN} \times 100\% \tag{11}$$

On the other hand, the specificity value (true negative, same diagnosis as the expert neurologists) is calculated by dividing the number of negative diagnoses that agree with the expert neurologists by the total number of negative diagnoses stated by the expert neurologists. Specificity, also called the true negative ratio, is calculated by the formula:

$$\text{Specificity} = \text{TNR} = \frac{TN}{TN + FP} \times 100\% \tag{12}$$

Neural network analyses were compared to each other by receiver operating characteristic (ROC) analysis. ROC analysis is an appropriate means to display sensitivity and specificity relationships when a predictive output for two possibilities is continuous. In its tabular form, the ROC analysis displays true and false positive and negative totals and sensitivity and specificity for each listed cutoff value between 0 and 1 (Subasi, 2005).

In order to perform the performance measure of the output classification graphically, the ROC curve was calculated by analyzing the output data obtained from the test. Furthermore, the performance of the model may be measured by calculating the region under the ROC curve. The ROC curve is a plot of the true positive rate (sensitivity) against the false positive rate (1 − specificity) for each possible cutoff. A cutoff value is selected that may classify the degree of epileptic seizure detection correctly by determining the input parameters optimally according to the used model.

3. Results and discussion

In this study, we used EEG signals of normal and epileptic patients in order to perform comparison between two neural network models. EEG recordings were divided into sub-band frequencies such as α, β, δ and θ by using DWT (Figs. 3 and 4). Then these wavelet sub-band frequencies δ (1–4 Hz), θ (4–8 Hz), α (8–13 Hz) and β (13–30 Hz) are applied to neural networks.

The classification efficiency, which is defined as the percentage ratio of the number of EEG signals correctly classified to the total number of EEG signals considered for classification, also depends on the type of wavelet chosen for the application. In the previous work on application of WT in EEG analysis (Subasi, 2005), Daubechies wavelet of order 2 (db2) was used and found to yield good results. In order to investigate the effect of other wavelets on classification efficiency, tests were carried out using other wavelets also. Apart from db2, Symmlet of order 10 (sym10), Coiflet of order 4 (coif4), Daubechies of order 4 (db4) and Daubechies of order 8 (db8) were also tried. The average efficiency was obtained for each wavelet when EEG signals were classified using various ANN structures. It can be seen that the Daubechies wavelet offers better efficiency than the others, and db4 is marginally better than db2 and db8. Hence db4 wavelet is chosen for this application.

3.1. Development of neural network model

The objective of the modelling phase in this application was to develop classifiers that are able to identify any input combination as belonging to either one of the two classes: normal or epileptic. For developing neural network classifiers, 300 examples were randomly taken from the 500 examples and used for training the neural networks, and 100 for the cross validation. The remaining 100 examples were kept aside and used for testing the developed models. The class distribution of the samples in the training and validation data set is summarized in Table 2.

The ANNs were designed with wavelet sub-band frequencies of EEG signal using DWT, in the input layer; and the output layer consisted of one node representing whether epileptic seizure was detected or not. A value of 0.1 was used when the experimental investigation indicated a normal and 0.9 for epileptic seizure. The preliminary
be used in clinical studies in the future after it is developed. This application brings objectivity to the evaluation of EEG signals and its automated nature makes it easy to be used in clinical practice. Besides the feasibility of a real-time implementation of the expert diagnosis system, diagnosis may be made more accurately by increasing the variety and the number of parameters. A 'black box' device that may be developed as a result of this study may provide feedback to the neurologists for classification of the EEG signals quickly and accurately by examining the EEG signals with real-time implementation.

robustness to noisy data (with outliers) which can severely hamper many types of ANNs as well as most traditional statistical methods. Finally, the fact that a DWN-based classifier can be developed quickly makes such classifiers efficient tools that can be easily re-trained, as additional data become available, when implemented in the hardware of EEG signal processing systems.

With specificity and sensitivity values both above 92%, the wavelet neural network classification may be used as an important diagnostic decision support mechanism to assist physicians in the treatment of epileptic patients.
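For reference, sensitivity and specificity figures of the kind quoted above follow from the confusion-matrix counts via Eqs. (11) and (12). The sketch below shows one way to compute them from single-node network outputs; the 0.5 cutoff (midway between the 0.1 and 0.9 targets), the function names and the sample outputs are assumptions for illustration, not data from the study.

```python
def confusion_counts(outputs, is_epileptic, cutoff=0.5):
    """Count TP, FN, TN, FP for outputs thresholded at `cutoff`."""
    tp = fn = tn = fp = 0
    for y, positive in zip(outputs, is_epileptic):
        predicted_positive = y >= cutoff
        if positive and predicted_positive:
            tp += 1
        elif positive:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    """Eq. (11): TP / (TP + FN) * 100%."""
    return 100.0 * tp / (tp + fn)


def specificity(tn, fp):
    """Eq. (12): TN / (TN + FP) * 100%."""
    return 100.0 * tn / (tn + fp)


# Made-up network outputs and expert labels, for illustration only
outputs = [0.85, 0.92, 0.40, 0.12, 0.08, 0.61]
labels = [True, True, True, False, False, False]
tp, fn, tn, fp = confusion_counts(outputs, labels)
```

Repeating the count for a range of cutoff values and plotting sensitivity against 1 − specificity gives the points of an ROC curve as described in Section 2.6.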
$\psi_{j,k}(t)$ is called the discrete wavelet transform (DWT) basis. Although it is called the DWT, the time variable of the transform is still continuous. The DWT coefficients of a continuous-time function are similarly defined as

$$d_{j,k} = \langle f_w(t), \psi_{j,k}(t) \rangle = \frac{1}{a_0^{j/2}} \int f_w(t)\, \psi(a_0^{-j} t - k b_0)\, dt \tag{A.4}$$

When the DWT set $(\psi_{j,k}(t))$ is complete, the wavelet representation of a function $f_w(t)$ is expressed as

$$f_w(t) = \sum_{j} \sum_{k} \langle f_w(t), \psi_{j,k}(t) \rangle\, \psi_{j,k}(t) \tag{A.5}$$

In general, a function can be completely represented by using $L$ finite resolutions of the wavelet and the scaling function, with parameter values $a_0 = 2$ and $b_0 = 1$, as

$$f_w(t) = \sum_{k=-\infty}^{\infty} c_{L,k}\, 2^{-L/2}\, \phi(2^{-L} t - k) + \sum_{j=1}^{L} \sum_{k=-\infty}^{\infty} d_{j,k}\, 2^{-j/2}\, \psi(2^{-j} t - k) \tag{A.6}$$

where the scaling coefficients $c_{L,k}$ are similarly defined as

$$c_{L,k} = \langle f_w(t), \phi_{L,k}(t) \rangle = \int f_w(t)\, 2^{-L/2}\, \phi\!\left(\frac{t}{2^{L}} - k\right) dt \tag{A.7}$$

and

$$\phi_{L,k}(t) = 2^{-L/2}\, \phi(2^{-L} t - k) \tag{A.8}$$

$$\psi(t) = 2 \sum_{k} h_1(k)\, \phi(2t - k) \tag{A.9}$$

$$\phi(t) = 2 \sum_{k} h_0(k)\, \phi(2t - k) \tag{A.10}$$
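The expansion in Eq. (A.6), one set of scaling coefficients $c_{L,k}$ plus detail coefficients $d_{j,k}$ for $j = 1, \dots, L$, can be illustrated on a sampled signal. The sketch below is only an illustration under stated assumptions: it uses the Haar wavelet (the appendix does not fix a particular wavelet), it works on discrete samples rather than the continuous integrals of Eqs. (A.4) and (A.7), and it uses the orthonormal $1/\sqrt{2}$ filter normalisation rather than the factor 2 of Eqs. (A.9) and (A.10). The function names `haar_dwt` and `haar_idwt` and the test signal are hypothetical.

```python
import numpy as np


def haar_dwt(signal, levels):
    """Split a signal into (c_L, [d_1, ..., d_L]) with orthonormal Haar filters."""
    c = np.asarray(signal, dtype=float)
    if levels < 1 or c.size % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    details = []
    for _ in range(levels):
        even, odd = c[0::2], c[1::2]
        details.append((even - odd) / np.sqrt(2.0))  # detail coefficients d_j
        c = (even + odd) / np.sqrt(2.0)  # approximation passed to the next level
    return c, details


def haar_idwt(c_L, details):
    """Invert haar_dwt: rebuild the signal from c_L and the details d_1..d_L."""
    c = np.asarray(c_L, dtype=float)
    for d in reversed(details):  # start from the coarsest level L
        out = np.empty(2 * c.size)
        out[0::2] = (c + d) / np.sqrt(2.0)
        out[1::2] = (c - d) / np.sqrt(2.0)
        c = out
    return c


# Synthetic test signal (illustrative only, not EEG data): a sinusoid plus a step.
t = np.arange(64)
x = np.sin(2 * np.pi * t / 16) + 0.5 * (t >= 32)

c_L, details = haar_dwt(x, levels=3)
x_rec = haar_idwt(c_L, details)

print("approximation length:", c_L.size)
print("detail lengths:", [d.size for d in details])
print("perfect reconstruction:", np.allclose(x, x_rec))
```

Each level halves the number of coefficients, so the detail arrays correspond to successively lower frequency sub-bands, and summing all levels back recovers the original samples, which is the discrete counterpart of the completeness statement behind Eq. (A.6).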
Mallat, S. G. (1987). Multifrequency channel decompositions of images
References and wavelet models. IEEE Transactions on ASSP, 37(12), 2091–2109.
Oysal, Y., Yilmaz, A.S., Koklukaya, E. (2005). A dynamic wavelet network
Adeli, H., Zhou, Z., & Dadmehr, N. (2003). Analysis of EEG records in an based adaptive load frequency control in power systems. Electrical
epileptic patient using wavelet transform. Journal of Neuroscience Power and Energy Systems, 27, 21–29.
Methods, 123, 69–87. Patwardhan, S. V., Dhawan, A. P., & Relue, P. A. (2003). Classification of
Basar, E., Schurmann, M., Demiralp, T., Basar-Eroglu, C., & Ademoglu, A. melanoma using tree structured wavelet transforms. Computer Methods
(2001). Event-related oscillations are ‘real brain responses’—wavelet and Programs in Biomedicine, 72, 223–239.
analysis and new strategies. International Journal of Psychophysiology, Peters, B. O., Pfurtscheller, G., & Flyvbjerg, H. (2001). Automatic
39, 91–127. differentiation of multichannel EEG signals. IEEE Transactions on
Basheer, I. A., & Hajmeer, M. (2000). Artificial neural networks: Biomedical Engineering, 48, 111–116.
Fundamentals, computing, design, and application. Journal of Micro- Petrosian, A., Prokhorov, D., Homan, R., Dashei, R., & Wunsch, D. (2000).
biological Methods, 43, 3–31. Recurrent neural network based prediction of epileptic seizures in intra-
Becerikli, Y. (2004). On three intelligent systems: Dynamic neural, fuzzy and extracranial EEG. Neurocomputing, 30, 201–218.
and wavelet networks for training trajectory. Neural Computing Pradhan, N., Sadasivan, P. K., & Arunodaya, G. R. (1996). Detection of
Applications, 13(4), 339–351. seizure activity in EEG by an artificial neural network: A preliminary
Becerikli, Y., Konar, A. F., & Samad, T. (2003). Intelligent optimal control study. Computers and Biomedical Research, 29, 303–313.
with dynamic neural networks. Neural Networks, 16(2), 251–259. Qu, H., & Gotman, J. (1997). A patient-specific algorithm for the detection
Becerikli, Y., Oysal, Y., & Konar, A. F. (2004). Trajectory priming with of seizure onset in long-term EEG monitoring: Possible use as a
dynamic fuzzy networks in nonlinear optimal control. IEEE Trans- warning device. IEEE Transactions on Biomedical Engineering, 44,
actions Neural Networks, 15(2), 383–394. 115–122.
A. Subasi / Expert Systems with Applications 29 (2005) 343–355 355
Quiroga, R. Q., Sakowitz, O. W., Basar, E., & Schurmann, M. (2001). Soltani, S., Simard, P., & Boichu, D. (2004). Estimation of the self-
Wavelet transform in the analysis of the frequency composition of similarity parameter using the wavelet transform. Signal Processing,
evoked potentials. Brain Research Protocols, 8, 16–24. 84, 117–123.
Quiroga, R. Q., & Schurmann, M. (1999). Functions and sources of event- Subasi, A. (2005). Automatic recognition of alertness level from EEG by
related EEG alpha oscillations studied with the wavelet transform. using neural network and wavelet coefficients. Expert Systems with
Clinical Neurophysiology, 110, 643–654. Applications, 28, 701–711.
Qussar, Y., Rivals, I., Personnaz, L., & Dreyfus, G. (1998). Training Sun, M., & Sclabassi, R. J. (2000). The forward EEG solutions can be
wavelet networks for nonlinear dynamic input–output modeling. computed using artificial neural networks. IEEE Transactions on
Neurocomputing, 20, 173–188. Biomedical Engineering, 47, 1044–1050.
Rioul, O., & Vetterli, M. (1991). Wavelet and signal processing. IEEE Vuckovic, A., Radivojevic, V. A., Chen, C. N., & Popovic, D. (2002).
Signal Processing Magazine , 14–46. Automatic recognition of alertness and drowsiness from EEG by an
Robert, C., Gaudy, J. F., & Limoge, A. (2002). Electroencephalogram artificial neural network. Medical Engineering and Physics, 24,
processing using neural networks. Clinical Neurophysiology, 113, 349–360.
694–701. Webber, W. R. S., Lesser, R. P., Richardson, R. T., & Wilson, K. (1996).
Rosso, O. A., Blanco, S., & Rabinowicz, A. (2003). Wavelet analysis of An approach to seizure detection using an artificial neural network
generalized tonic-clonic epileptic seizures. Signal Processing, 83, (ANN). Electroencephalography and Clinical Neurophysiology, 98,
1275–1289. 250–272.
Rosso, O. A., Martin, M. T., & Plastino, A. (2002). Brain electrical activity Weng, W., & Khorasani, K. (1996). An adaptive structure neural network
analysis using wavelet-based informational tools. Physica A, 313, 587– with application to EEG automatic seizure detection. Neural Networks,
608. 9, 1223–1240.
Samar, V. J., Bopardikar, A., Rao, R., & Swartz, K. (1999). Wavelet Zhang, Q., & Benveniste, A. (1992). Wavelet networks. IEEE Transactions
analysis of neuroelectric waveforms: A conceptual tutorial. Brain and Neural Networks, 3(6), 889–898.
Language, 66, 7–60. Zhang, M., Kawabata, H., & Liu, Z. Q. (2001). Electroencephalogram
Sanner, R., & Slotine, J. J. E. (1992). Gaussian networks for direct adaptive analysis using fast wavelet transform. Computers in Biology and
control. IEEE Transactions Neural Network, 13(6), 837–863. Medicine, 31, 429–440.
Shimada, T., Shiina, T., & Saito, Y. (2000). Detection of characteristic Zhang, J., Walter, G. G., & Lee, W. (1995). Wavelet neural networks for
waves of sleep EEG by neural network analysis. IEEE Transactions on function learning. IEEE Transactions Signal Processing, 43(6), 1485–
Biomedical Engineering, 47, 369–379. 1497.