Medical & Biological Engineering & Computing (2018) 56:2259–2271

https://doi.org/10.1007/s11517-018-1857-5

ORIGINAL ARTICLE

Evaluation of feature extraction techniques and classifiers for finger movement recognition using surface electromyography signal
Pornchai Phukpattaranont 1 & Sirinee Thongpanja 1 & Khairul Anam 2 & Adel Al-Jumaily 2 & Chusak Limsakul 1

Received: 13 December 2017 / Accepted: 27 May 2018 / Published online: 18 June 2018
© International Federation for Medical and Biological Engineering 2018

Abstract
Electromyography (EMG) in a bio-driven system is used as a control signal, for driving a hand prosthesis or other wearable
assistive devices. Processing to get informative drive signals involves three main modules: preprocessing, dimensionality
reduction, and classification. This paper proposes a system for classifying a six-channel EMG signal from 14 finger movements.
A feature vector of 66 elements was determined from the six-channel EMG signal for each finger movement. Subsequently,
various feature extraction techniques and classifiers were tested and evaluated. We compared the performance of six feature
extraction techniques, namely principal component analysis (PCA), linear discriminant analysis (LDA), uncorrelated linear
discriminant analysis (ULDA), orthogonal fuzzy neighborhood discriminant analysis (OFNDA), spectral regression linear
discriminant analysis (SRLDA), and spectral regression extreme learning machine (SRELM). In addition, we also evaluated
the performance of seven classifiers consisting of support vector machine (SVM), linear classifier (LC), naive Bayes (NB),
k-nearest neighbors (KNN), radial basis function extreme learning machine (RBF-ELM), adaptive wavelet extreme learning
machine (AW-ELM), and neural network (NN). The results showed that the combination of SRELM as the feature extraction
technique and NN as the classifier yielded the best classification accuracy of 99%, which was significantly higher than those from
the other combinations tested.

Keywords Electromyography (EMG) · Feature extraction · Dimensionality reduction · Finger movement classification · EMG pattern recognition

* Pornchai Phukpattaranont

1 Department of Electrical Engineering, Faculty of Engineering, Prince of Songkla University, Hat Yai, Songkhla 90112, Thailand
2 School of Electrical, Mechanical and Mechatronic Systems, Faculty of Engineering and Information Technology, University of Technology Sydney, 15 Broadway, Ultimo, NSW 2007, Australia

1 Introduction

The loss of finger functions is a major disability that limits everyday capabilities and interactions [1]. Hence, myoelectric control-based devices using residual muscles, such as the muscles of the shoulder and/or arm, are used for improving the quality of life for people with physical disabilities [2, 3]. Surface electromyography (EMG) observes electrical activities of the muscles by detection with surface electrodes [4]. The EMG signal contains useful information related to muscular activity, neuromuscular disease, and intended movements [5]. It can be used for controlling a prosthetic arm or hand, as well as other devices such as a wheelchair, a mouse, and a keyboard. This requires that the pattern of an EMG signal is classified into a predefined class that is matched with the command for controlling the device [6, 7].

A finger movement classification system consists of three main modules, namely preprocessing, dimensionality reduction, and classification. In the preprocessing module, a D-dimensional vector of numerical features is generated from

each segment of EMG data. Then, to increase the classification accuracy and decrease the computational complexity, the dimensionality reduction techniques are applied in the second module. As a result, a d-dimensional vector is obtained. Note that the dimension of the reduced feature vector is smaller than the dimension of the original feature vector (d < D). Finally, the reduced feature vector is used as an input of a classifier for finger movement classification in the last module.

When the number of movements to be classified was small, the dimensionality reduction was not applied because the dimension of the original feature vector was also not high. Classification of eight finger movements was proposed in [8] using mean absolute value (MAV) and the spectra from Gabor transform as feature values. The number of EMG channels was 2, resulting in a feature vector of dimension 16. The classification accuracy was 85.10%. Uchida et al. [9] reported that the classification accuracy of five finger movements with the feature values based on fast Fourier transform (FFT) was 86% when the feature vector with dimension 20 (10 FFT coefficients × two-channel EMG) was used.

When the number of movements to be classified increases, the number of elements in the feature vector increases to improve the classification accuracy. The high-dimensional feature vector has been proposed by combining time domain, frequency domain, and/or statistical feature values. However, the increase in the dimension of the feature vector can introduce redundancy and add to the computational complexity of classification. Therefore, various dimensionality reduction techniques were proposed to reduce the redundancy and computational complexity [10]. There are two main strategies of dimensionality reduction, i.e., feature extraction and feature selection. While feature extraction tries to determine the best combinations of the original feature vectors to form a new feature vector with smaller dimension, feature selection chooses the best subset of elements from the original feature vector. Previous studies applied various feature extraction methods in EMG classification including principal component analysis (PCA) [11, 12], linear discriminant analysis (LDA) [13, 14], uncorrelated linear discriminant analysis (ULDA) [15], orthogonal fuzzy neighborhood discriminant analysis (OFNDA) [16], and spectral regression linear discriminant analysis (SRLDA) [17].

Our previous study [18] proposed a new feature extraction, namely spectral regression extreme learning machine (SRELM), and evaluated its performance along with other feature extraction techniques, including SRLDA, ULDA, OFNDA, and PCA. Moreover, in [18], five classifiers including adaptive wavelet ELM (AW-ELM), radial basis function ELM (RBF-ELM), support vector machine (SVM), k-nearest neighbors (KNN), and linear classifier (LC) were evaluated for their performances in classifying two channels of EMG signals from 10 hand and finger movements. We reported that SRELM gave the best performance. Moreover, we found that the classification accuracy depended on the classifier. In other words, while SRELM provided the best performance when the KNN classifier was used, ULDA gave the best performance with the SVM classifier. These results indicated that the pairing of a feature extraction technique with a type of classifier affects the classification accuracy. Therefore, another effective classifier, neural network (NN), which was not used in [18], was investigated in this current study.

2 Theory

2.1 Preprocessing methods

In the preprocessing methods, we transform segments of EMG data into an original feature vector. Feature values, which are elements of the original feature vector, are usually determined from the EMG data in the time domain and/or the frequency domain [6]. Recent studies have proposed further feature values based on statistical methods. In this paper, we used Hudgins’s feature set [2, 3, 19]: MAV, waveform length (WL), zero crossing (ZC), and slope sign change (SSC), which are popular time domain features used in previous studies. In addition, we also used the fourth-order autoregressive (AR) coefficient for representing information on the prediction model [12, 20], mean frequency (MNF) for representing information on the power spectral density [21], kurtosis (KURT) for representing information on peakedness of distribution [22], and skewness (SKW) for representing information on the symmetry of distribution in the EMG signal [13]. As a result, the original feature vector of 11 elements from each segment of EMG data per EMG channel consists of (1) MAV, (2) WL, (3) ZC, (4) SSC, (5)–(8) four AR coefficients, (9) MNF, (10) KURT, and (11) SKW. The detailed mathematical definition of each feature is as follows:

(1) MAV represents the signal energy, which is frequently used for detecting the onset of an EMG signal. MAV feature is the average of the absolute value of the EMG signal. It can be defined as [2]

$$\mathrm{MAV} = \frac{1}{N}\sum_{i=1}^{N} |x_i| \tag{1}$$

where $x_i$ is the amplitude of the EMG signal at sample $i$ and $N$ is the length of the EMG signal.

(2) WL is the cumulative length of the EMG waveform over the segment and is indicative of the complexity of the EMG signal. It can be expressed as [2]

$$\mathrm{WL} = \sum_{i=1}^{N-1} |x_{i+1} - x_i| \tag{2}$$

(3) ZC is the number of times that the EMG signal amplitude crosses zero. In other words, it is the number of times that the signal amplitude changes its sign. A threshold must be set to reduce the noise (i.e., threshold was set to 10 μV). It can be defined as [2]

$$\mathrm{ZC} = \sum_{i=1}^{N-1} \left[ f(x_i \times x_{i+1}) \cap |x_i - x_{i+1}| \geq 10 \right] \tag{3}$$

$$f(x) = \begin{cases} 1, & \text{if } x < 0 \\ 0, & \text{otherwise} \end{cases} \tag{4}$$

(4) SSC is the number of times that the slope of the EMG signal changes sign. It is defined as [2]

$$\mathrm{SSC} = \sum_{i=2}^{N} \left[ s\{(x_i - x_{i-1})(x_i - x_{i+1})\} \cap \{ |x_i - x_{i-1}| \geq 10 \cup |x_i - x_{i+1}| \geq 10 \} \right] \tag{5}$$

$$s(x) = \begin{cases} 1, & \text{if } x > 0 \\ 0, & \text{otherwise} \end{cases} \tag{6}$$

(5) AR model describes each sample of the EMG signal as a linear combination of the previous samples plus a white noise error term, which can be defined as [20]

$$x_i = \sum_{p=1}^{P} a_p x_{i-p} + w_i \tag{7}$$

where $a_p$ is the coefficient in the AR model, $P$ is the order of the AR model, and $w_i$ is the white noise or error sequence. In this paper, $P$ is set to 4. As a result, the number of feature values from the AR model is 4.

(6) MNF is the average frequency. It is defined as the sum of the product of power spectrum and frequency divided by the total spectrum intensity, which can be expressed as [21]

$$\mathrm{MNF} = \sum_{j=1}^{M} f_j P_j \Big/ \sum_{j=1}^{M} P_j \tag{8}$$

where $f_j$ is the frequency of spectrum at frequency bin $j$, $P_j$ is the EMG power spectrum at frequency bin $j$, and $M$ is the number of bins.

(7) KURT is a classical higher-order statistical characteristic, indicating non-Gaussianity, and is used to quantify the peakedness of a distribution. It is the fourth-order cumulant of the data, which can be defined as [22]

$$\mathrm{KURT} = \frac{\frac{1}{N}\sum_{i=1}^{N} y_i^4}{\left(\frac{1}{N}\sum_{i=1}^{N} y_i^2\right)^2} - 3 \tag{9}$$

where $y_i$ represents the $i$th normalized EMG amplitude, which has zero mean and unit variance. $N$ denotes the total number of the normalized EMG samples. Kurtosis can be either positive or negative.

(8) SKW is a measure used for characterizing the degree of asymmetry of the distribution of a random variable $y$. It is the third-order cumulant of the data, which can be defined as [13]

$$\mathrm{SKW} = \frac{\frac{1}{N}\sum_{i=1}^{N} (y_i - \bar{y})^3}{\left(\sqrt{\frac{1}{N}\sum_{i=1}^{N} (y_i - \bar{y})^2}\right)^3} \tag{10}$$

2.2 Feature extraction

Six feature extraction techniques are evaluated in this paper including PCA, LDA, ULDA, OFNDA, SRLDA, and SRELM. It should be noted that the dimension of the reduced feature vector from each feature extraction technique except PCA was 13, matching the total number of movements minus 1. On the other hand, the dimension of the reduced feature vector from PCA was 14. The brief details on each technique are as follows:

• PCA tries to find a set of orthogonal basis vectors that captures maximum information from the original dimensions. PCA decomposes the covariance structure of the original dimensions by calculating the eigenvalues and

eigenvectors of the data. The components, i.e., eigenvalues and eigenvectors, are ranked according to their variance to the principal axes ranging from the highest contribution to the lowest.
• LDA tries to find an optimal transformation vector by maximizing the ratio of the between-class distance to the within-class distance, so that the maximum class discrimination is achieved.
• ULDA is an extension of classical LDA, such that the features in the transformed space are uncorrelated, so the redundancy in the transformed space could be reduced. The objective of ULDA is to find the optimal discriminant vectors.
• OFNDA minimizes the distances within the classes and maximizes the distances between the centers of different classes, while taking into account the contribution of the samples to the different classes and to efficiently overcome the singularity problems of classical LDA by employing the QR decomposition.
• SRLDA combines the spectral analysis of the graph matrix and regression techniques and is essentially developed from LC [23]. A set of linear regression problems is solved to obtain the transformation vectors.
• SRELM was proposed in our previous study [18]. It is integrated from ELM and spectral regression (SR), which utilizes the obtained eigenvector to project the hidden layer output to the output layer. The hidden layer weights are determined randomly. The output weight is computed using SR. There are two parameters in optimizing SRELM performance: the number of hidden nodes and alpha. In order to evaluate the optimal parameters in this paper, the number of hidden nodes was varied from 100 to 1500 nodes with an increment of 100 nodes and alpha was varied from 1 to 20 with an increment of 1.

2.3 Feature evaluation

In this paper, we applied the statistical criteria, namely the ratio of a Euclidean distance to a standard deviation (RES) index, to evaluate class separation performance of the reduced feature vector obtained from each feature extraction technique. The advantage of the RES index is that its result is independent of any classifier. The RES index can be defined as [24]

$$\text{RES index} = \frac{ED}{\sigma} \tag{11}$$

ED is the distance between coordinates of a pair of clusters $p$ and $q$ in n-dimensional Euclidean space, which can be defined as

$$ED = \frac{2}{K(K-1)} \sum_{p=1}^{K-1} \sum_{q=p+1}^{K} \sqrt{(m_1^p - m_1^q)^2 + (m_2^p - m_2^q)^2} \tag{12}$$

where $m$ is the average value of feature, $p$ and $q$ are indexes representing the movements, and $K$ is the total number of movements (1 ≤ k ≤ K, K = 14). σ is dispersion of clusters $p$ and $q$, which can be expressed as

$$\sigma = \frac{1}{IK} \sum_{i=1}^{I} \sum_{k=1}^{K} s_{ik} \tag{13}$$

where $s$ is the standard deviation of a feature and $I$ is the length of feature vector (1 ≤ i ≤ I, I = 13 or 14). The RES index increases when the class separation performance of EMG features increases.

2.4 Classification

Seven classifiers are tested and compared in this paper, i.e., SVM, LC, naive Bayes (NB), KNN, RBF-ELM, AW-ELM, and NN. Brief details of each classifier and its corresponding parameters used are as follows:

• SVM uses a discriminant hyperplane to separate the classes [25]. SVM aims to find the optimal hyperplane that maximizes the margins between the points of different classes. The margins are the distances between the hyperplane and the nearest training points. In this study, SVM type was C-support vector classification. Kernel type was radial basis function. Gamma in kernel function was set to 1/number of features, and cost was set to 1.
• LC was implemented using a simple max gate function as a classification rule [26]. It is assumed that the feature vectors have multivariate normal distribution with mean vector and common covariance matrix.
• NB classifier aims to reach the best hypothesis through a given training data set [27]. Bayes theorem provides a way to calculate the probability of a hypothesis based on its prior probability of both the data found and the total data. NB often performs well although independence assumptions between data are violated.
• KNN is a process to assign a new feature vector to a class in all available cases using a similarity measure such as distance functions [26]. After the distances between the feature vector and all the training samples are determined, the new case is assigned to the class with the largest probability. In other words, it is

classified by a majority vote of its k neighbors. In this paper, k was set to 14.
• RBF-ELM is a variant of ELM classifier, which is a single-layer feed-forward network with radial basis function [28]. It employs a randomized method to initialize the centers and widths of RBF kernels, and the output weights of the RBF network are calculated analytically. In order to select the optimal parameters, a grid search method was used. From the step sizes at 0.1, 0.5, 1, 5, 10, 15, and 20, we can obtain 49 combinations of cost and kernel parameters under test. The optimal parameters were selected from the combination that gave the maximum accuracy.
• AW-ELM proposed by Anam and Al-Jumaily [29] is the combination of ELM with wavelet neural network. It utilizes a wavelet function as the activation function in the hidden node. The function is adjusted according to changes in the input. In order to select the optimal parameter, the number of hidden nodes was varied from 25 to 500 nodes with an increment of 25 nodes.
• NN is a multilayer perceptron, which is composed of several layers: one input layer, one layer or several hidden layers, and one output layer [30]. Each neuron in each layer is connected with the output of the previous one. In this paper, we designed three-layered feed-forward back-propagation neural networks consisting of input layer, tan-sigmoid hidden layer, and linear output layer. The number of neurons in the input layer was either 14 for PCA or 13 for the other feature extraction techniques, matching the reduced feature dimensions given in Section 2.2. The number of neurons in the hidden layer was 10, 20, or 30, with the best alternative selected for obtaining maximal accuracy. The number of neurons in the output layer was 14, i.e., one neuron per movement type. In addition, NN was trained using scaled gradient descent algorithm.

3 Materials and methods

3.1 EMG data acquisition and experimental setup

A commercial EMG measurement system (Mobi6-6b, TMS International B.V.) with built-in band-pass filter (20–500 Hz) and amplifier with a gain factor of 19.5 was used for recording EMG signals at a sampling rate of 1024 Hz. The EMG signals from six forearm muscles were recorded using 12 pairs of bipolar disposable Ag/AgCl electrodes (H124SG, Kendel ARBO) with an inter-electrode distance of 20 mm. In addition, an Ag/AgCl electrode was placed on the wrist to provide a common ground reference. Figure 1 (left) shows the electrode placements on the six forearm muscles used for EMG data acquisition. While the first group of muscles, namely extensor carpi ulnaris (CH6), extensor carpi radialis longus (CH5), and extensor digitorum (CH4), is located on the posterior compartment of the forearm to perform extension at the fingers, the second group of muscles, namely flexor carpi ulnaris (CH3), palmaris longus (CH2), and flexor carpi radialis (CH1), is located on the anterior compartment of the forearm to produce flexion at the fingers.

Fig. 1 Left: the electrode locations on forearm muscles. Right: the 14 finger movements

Ten able-bodied subjects (seven males and three females) with ages ranging from 20 to 23 years participated in the experiments. Each subject performed 14 different finger movements in a random sequence for a trial consisting of thumb flexion (M1), index flexion (M2), middle flexion (M3), ring flexion (M4), little flexion (M5), hand close (M6), index-middle-ring-little flexion (M7), middle-ring-little flexion (M8), ring-little flexion (M9), middle-ring flexion (M10), index-middle-ring flexion (M11), thumb-little flexion (M12), thumb-ring-little flexion (M13), and thumb-middle-ring-little flexion (M14), as shown in Fig. 1 (right). Within the trial, the beginning of each movement activity is triggered by an auditory

cue. Following the cue, the subject performed the movement and held the contraction for 5 s until a rest cue was given. A 1-min rest period was taken between each movement in the trial. The trial was repeated five times with a 10-min rest period. As a result, each movement was performed five times.

Figure 2 shows an example of EMG signals obtained from six muscles during thumb flexion (M1). The EMG signals with 5 s in duration (5120 samples) from CH1 to CH6 are shown in the top to bottom rows, respectively. The differences in amplitudes of EMG signals from different muscles are clearly seen. While the amplitudes of EMG signals from CH6 are largest, the amplitudes of EMG signals from CH1 are smallest.

Fig. 2 Example of the six-channel EMG signal from thumb flexion (M1)

3.2 Methods

Figure 3 shows the method for evaluating feature extraction techniques and classifiers used in recognizing the EMG signals from finger movements in this paper. After six channels of EMG signals from 14 hand and finger movements were acquired, they were processed using the analytical method consisting of five steps, i.e., (1) segmentation, (2) feature generation, (3) feature extraction, (4) performance evaluation with RES index, and (5) performance evaluation with classifiers.

Fig. 3 EMG acquisition and analytical method (flowchart: EMG; Step 1: collect and segment 6 channels of EMG from 14 finger movements; Step 2: generate a feature vector from EMG segments; Step 3: apply 6 types of feature extractions; Step 4: evaluate performance with RES index; Step 5: evaluate performance with 7 classifiers; evaluation results)

The details on each step are as follows:

Step 1: segmentation: In this step, the collected EMG data with a length of 5120 samples was segmented by the disjoint windowing technique with a window length of 256 samples (250 ms), resulting in 20 segmented EMG data for each EMG channel of each movement.
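The disjoint windowing of step 1 can be sketched as follows, together with the four time-domain features of Eqs. (1)–(6) evaluated on each window. This is a minimal illustration under stated assumptions, not the authors' implementation: the recording is assumed to be a NumPy array of shape (samples, channels), the names `segment_disjoint` and `hudgins_features` are hypothetical, the 10 threshold is assumed to be in the same unit as the signal (μV), and the input is synthetic noise rather than real EMG. The SSC sum here runs only over samples that have both neighbors inside the window.

```python
import numpy as np

FS_HZ = 1024   # sampling rate from Section 3.1
WINDOW = 256   # disjoint window length in samples (250 ms at 1024 Hz)


def segment_disjoint(recording, window=WINDOW):
    """Split a (n_samples, n_channels) array into non-overlapping windows.

    Returns shape (n_segments, window, n_channels); trailing samples that
    do not fill a whole window are dropped.
    """
    n_segments = recording.shape[0] // window
    trimmed = recording[: n_segments * window]
    return trimmed.reshape(n_segments, window, recording.shape[1])


def hudgins_features(segments, threshold=10.0):
    """MAV, WL, ZC, and SSC per segment and channel, following Eqs. (1)-(6).

    `threshold` is assumed to share the unit of the signal (microvolts).
    Returns shape (n_segments, n_channels, 4).
    """
    d = np.diff(segments, axis=1)                 # x[i+1] - x[i]
    mav = np.mean(np.abs(segments), axis=1)
    wl = np.sum(np.abs(d), axis=1)
    sign_flip = segments[:, :-1] * segments[:, 1:] < 0
    zc = np.sum(sign_flip & (np.abs(d) >= threshold), axis=1)
    back, fwd = d[:, :-1], -d[:, 1:]              # x[i]-x[i-1], x[i]-x[i+1]
    big = (np.abs(back) >= threshold) | (np.abs(fwd) >= threshold)
    ssc = np.sum((back * fwd > 0) & big, axis=1)
    return np.stack([mav, wl, zc, ssc], axis=-1)


# Synthetic stand-in for one 5-s, six-channel recording (not real EMG data).
rng = np.random.default_rng(0)
recording = 50.0 * rng.standard_normal((5 * FS_HZ, 6))
segments = segment_disjoint(recording)
features = hudgins_features(segments)
print(segments.shape, features.shape)  # (20, 256, 6) (20, 6, 4)
```

With 5120 samples and a 256-sample window, the reshape yields the 20 segments per channel stated in step 1.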

Step 2: feature generation: In this step, the 11 feature values described in Section 2.1 including MAV, WL, ZC, SSC, MNF, KURT, SKW, and four coefficients from the AR model were calculated for each EMG segment. The feature values from six EMG channels were formed as an original feature vector. As a result, the dimension of the original feature vector was 66 for each movement (11 feature values per EMG channel × 6 EMG channels).

Step 3: feature extraction: In this step, the six feature extraction techniques described in Section 2.2 including PCA, LDA, ULDA, OFNDA, SRLDA, and SRELM were applied to the original feature vector in step 2. As a result, the dimension of the original feature vector, which was 66 from step 2, was reduced to 14 for PCA and 13 for the others in this step.

Step 4: performance evaluation with RES index: In this step, the performance on class separation ability of all reduced feature vectors from each feature extraction technique resulting from step 3 was evaluated with the RES index described in Section 2.3. As a result, six RES indexes from six feature extraction techniques were obtained and compared.

Step 5: performance evaluation with classifiers: In this step, all reduced feature vectors from each feature extraction technique in step 3 were used as the inputs of seven classifiers, which were briefly described in Section 2.4. Therefore, there are 42 combinations of the reduced feature vector with the classifier under test. The performance based on classification accuracy from each combination was evaluated and compared.

Note that the reduced feature vectors were classified with a 10-fold cross-validation. In other words, the reduced feature vectors were randomly partitioned into 10 subsets. The classifier training was performed using nine subsets, and the remaining subset was used for classifier testing. This process was repeated 10 times such that each of the 10 subsets was used as the testing data. Finally, the performance of each pairing of the reduced feature vector with the classifier was evaluated and compared using the mean and standard deviation of classification accuracies. The classification accuracy can be expressed as

$$\text{classification accuracy} = \frac{\text{Number of correct classifications}}{\text{Total number of finger movements under test}} \times 100\% \tag{14}$$

Table 1 Mean and standard deviation of classification accuracies for 14 movements obtained with various pairs of feature extraction (FE) and classifier

| FE | SVM | LC | NB | KNN | RBF-ELM | AW-ELM | NN |
|---|---|---|---|---|---|---|---|
| SRELM | 92.92 ± 4.35 | *93.64 ± 4.00* | 90.04 ± 4.57 | *93.04 ± 4.09* | 93.24 ± 3.88 | *92.12 ± 4.34* | *99.09 ± 0.83* |
| LDA | 93.30 ± 3.91 | 92.42 ± 3.69 | *90.39 ± 4.41* | 92.29 ± 4.37 | 93.33 ± 4.11 | 91.08 ± 4.55 | 95.51 ± 2.74 |
| ULDA | 93.01 ± 3.97 | 92.34 ± 3.77 | 90.01 ± 4.30 | 92.15 ± 4.46 | 93.12 ± 4.06 | 90.76 ± 4.98 | 95.58 ± 2.82 |
| SRLDA | *93.70 ± 3.55* | 92.13 ± 3.89 | 89.81 ± 4.39 | 93.01 ± 3.65 | *93.89 ± 3.54* | 92.07 ± 4.17 | 95.12 ± 3.08 |
| OFNDA | 93.09 ± 3.98 | 92.31 ± 3.84 | 90.30 ± 4.21 | 92.06 ± 4.30 | 93.31 ± 3.90 | 90.84 ± 4.68 | 95.59 ± 2.76 |
| PCA | 83.96 ± 6.93 | 83.23 ± 6.46 | 72.61 ± 7.26 | 79.51 ± 7.76 | 81.91 ± 8.27 | 75.46 ± 7.67 | 85.59 ± 6.58 |

The italics indicate the highest classification accuracy for each classifier

Table 2 Mean and standard deviation of classification accuracies for 14 movements obtained from the NN classifier with three alternative sizes of the hidden layer

| FE | 10 neurons | 20 neurons | 30 neurons |
|---|---|---|---|
| SRELM | *99.09 ± 0.83* | *99.57 ± 0.42* | *99.54 ± 0.46* |
| LDA | 95.51 ± 2.74 | 96.61 ± 2.45 | 96.84 ± 2.25 |
| ULDA | 95.58 ± 2.82 | 96.68 ± 2.34 | 96.83 ± 2.21 |
| SRLDA | 95.12 ± 3.08 | 96.37 ± 2.33 | 96.49 ± 2.40 |
| OFNDA | 95.59 ± 2.76 | 96.47 ± 2.56 | 96.86 ± 2.25 |
| PCA | 85.59 ± 6.58 | 87.87 ± 6.21 | 88.47 ± 6.01 |

The italics indicate the highest classification accuracy for each size of the hidden layer

4 Results

4.1 Characteristics of the reduced feature vectors

Figure 4 shows, as an example, the scatter plot between the first two elements of the reduced feature vectors from each feature extraction technique. The result shows that the first two elements of the reduced feature vectors by SRELM provided better separation than those from other feature extraction techniques, while the first two elements of the reduced feature vectors from LDA, ULDA, OFNDA, and SRLDA are quite overlapped. In addition, PCA provided results that had the worst performance in separating finger movements.

Fig. 4 Scatter plots of the first two elements of the reduced feature vectors when using a SRELM, b LDA, c ULDA, d SRLDA, e OFNDA, and f PCA

Fig. 5 RES index determined using all reduced feature vectors from six feature extraction techniques

Figure 5 shows the RES index calculated from all reduced feature vectors by each feature extraction technique. The RES index of reduced feature vectors by SRELM is higher than that of other feature extraction techniques. In other words, SRELM provides the reduced feature vectors that have the best performance in separating finger movements.

Table 3 Mean and standard deviation (SD) of classification accuracies for 14 movements obtained from the SRELM feature extraction and the NN classifier as the number of available EMG channels is reduced step by step

| Channel combination | Mean ± SD | Note |
|---|---|---|
| CH1-CH2-CH3-CH4-CH5-CH6 | *99.57 ± 0.52* | 6 channels |
| CH2-CH3-CH4-CH5-CH6 | *99.24 ± 0.51* | Remove CH1 |
| CH1-CH2-CH3-CH4-CH5 | 98.71 ± 1.12 | |
| CH1-CH2-CH3-CH4-CH6 | 98.90 ± 0.90 | |
| CH1-CH2-CH3-CH5-CH6 | 99.05 ± 1.00 | |
| CH1-CH2-CH4-CH5-CH6 | 98.90 ± 1.60 | |
| CH1-CH3-CH4-CH5-CH6 | 98.86 ± 1.31 | |
| CH2-CH3-CH5-CH6 | *97.95 ± 1.52* | Remove CH1 and CH4 |
| CH2-CH3-CH4-CH5 | 97.33 ± 1.70 | |
| CH2-CH3-CH4-CH6 | 97.90 ± 1.06 | |
| CH2-CH4-CH5-CH6 | 96.95 ± 2.24 | |
| CH3-CH4-CH5-CH6 | 97.90 ± 1.29 | |
| CH3-CH5-CH6 | *93.71 ± 3.94* | Remove CH1, CH4, and CH2 |
| CH2-CH3-CH5 | 93.38 ± 2.68 | |
| CH2-CH3-CH6 | 92.81 ± 3.79 | |
| CH2-CH5-CH6 | 92.90 ± 3.32 | |
| CH3-CH6 | *85.38 ± 4.55* | Remove CH1, CH4, CH2, and CH5 |
| CH3-CH5 | 84.76 ± 4.92 | |
| CH5-CH6 | 80.19 ± 5.67 | |
| CH3 | *58.95 ± 8.49* | Remove CH1, CH4, CH2, CH5, and CH6 |
| CH6 | 56.24 ± 9.45 | |

The italics indicate the highest classification accuracy for each subset of channels
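The step-by-step reduction summarized in Table 3 behaves like a greedy backward elimination: at each stage every subset with one channel fewer is scored and only the best one is carried forward. The sketch below illustrates that loop under stated assumptions. The function name `backward_channel_elimination` and the `evaluate` callback are hypothetical; in a real experiment `evaluate` would return a cross-validated accuracy for the given channels, whereas here it simply looks up the mean accuracies reported in Table 3.

```python
from itertools import combinations

# Mean accuracies (%) copied from Table 3, keyed by channel subset.
TABLE3 = {
    (1, 2, 3, 4, 5, 6): 99.57,
    (2, 3, 4, 5, 6): 99.24, (1, 2, 3, 4, 5): 98.71, (1, 2, 3, 4, 6): 98.90,
    (1, 2, 3, 5, 6): 99.05, (1, 2, 4, 5, 6): 98.90, (1, 3, 4, 5, 6): 98.86,
    (2, 3, 5, 6): 97.95, (2, 3, 4, 5): 97.33, (2, 3, 4, 6): 97.90,
    (2, 4, 5, 6): 96.95, (3, 4, 5, 6): 97.90,
    (3, 5, 6): 93.71, (2, 3, 5): 93.38, (2, 3, 6): 92.81, (2, 5, 6): 92.90,
    (3, 6): 85.38, (3, 5): 84.76, (5, 6): 80.19,
    (3,): 58.95, (6,): 56.24,
}


def backward_channel_elimination(channels, evaluate):
    """Drop one channel per stage, keeping the best-scoring smaller subset."""
    current = tuple(sorted(channels))
    path = [(current, evaluate(current))]
    while len(current) > 1:
        candidates = combinations(current, len(current) - 1)
        current = max(candidates, key=evaluate)
        path.append((current, evaluate(current)))
    return path


# Here `evaluate` is a table lookup; a real run would train and test a model.
path = backward_channel_elimination(range(1, 7), TABLE3.__getitem__)
for subset, accuracy in path:
    print(subset, accuracy)
```

Because each stage only explores subsets of the previously selected set, the search evaluates 21 subsets instead of all 63 non-empty channel combinations, at the cost of possibly missing a better subset that is not nested in the earlier choices.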

The RES indexes of reduced feature vectors from SRLDA, OFNDA, LDA, and ULDA are quite similar, while the reduced feature vectors from PCA give the lowest RES index. We can clearly see that the RES index of reduced feature vectors in Fig. 5 is consistent with the scatter plot of reduced feature vectors in Fig. 4.

4.2 Classification accuracy

Table 1 presents the classification accuracy using various feature extraction techniques paired with different classifiers. While the best classification accuracies from LC, KNN, AW-ELM, and NN are obtained with the reduced feature vectors from SRELM, the best classification accuracies from SVM and RBF-ELM are obtained with the reduced feature vectors from SRLDA. However, for each feature extraction technique, we can observe that NN with 10 nodes in the hidden layer provides the highest classification accuracy. Moreover, the combination of SRELM and NN gives the maximum classification accuracy at 99.09%.

Table 2 presents the classification accuracies for 14 movements obtained from the NN classifier with different numbers of nodes in the hidden layer, i.e., 10, 20, or 30 neurons. When we increase the number of neurons in the hidden layer from 10 to 20 and to 30, the classification accuracy changes slightly for each feature extraction technique. Results show that 20 neurons in the hidden layer give the best accuracy at 99.57% among all combinations of feature extraction techniques and classifiers, when the reduced feature vectors from SRELM are used.

Table 3 presents classification accuracies with channel reduction. The subset of channels was optimized by considering the classification accuracies obtained from all combinations of each channel set. Firstly, all possible combinations of five channels out of the six total were trialed for classification. Only the set of five channels providing the highest classification accuracy was selected. Secondly, all possible combinations of four channels out of the five total from the first step were trialed for classification. For instance, the accuracies from all combinations of five channels are shown in the second row to the seventh row in Table 3. We can see that the combination of CH2, CH3, CH4, CH5, and CH6 provides the highest classification accuracy, so this channel set was selected as the best combination of five channels. Then, all possible combinations of four channels out of the five selected channels from the first step were trialed. As a result, the combination of CH2, CH3, CH5, and CH6 provides the best classification accuracy and it was chosen as the optimal set of four channels. The procedure was repeated for three channels, two
channels, and one channel, respectively. The results show that the classification accuracy decreases
from 99.57 to 58.95% when the number of channels decreases from 6 to 1. Moreover, to obtain a high
classification accuracy, EMG signals from the muscles located on the anterior and posterior
compartments of the forearm are needed. For example, the maximum classification accuracy from two
EMG channels, 85.38%, is obtained from the combination of flexor carpi radialis (CH3) and extensor
carpi ulnaris (CH6), which are located on the anterior and posterior compartments of the forearm,
respectively.

Table 4 presents classification accuracies from movement reduction using two channels of EMG
signals, namely CH3 and CH6. The selection of these two EMG channels was guided by Table 3. The
subset of finger movements was optimized by considering the classification accuracy of each
movement. All EMG signals from 14 finger movements were first classified, and then the
classification accuracy was investigated individually for each movement from the confusion matrix
[31]. The movement providing the lowest classification accuracy was removed from the movement set.
The procedure was repeated until the number of movements decreased to two movements. The results
show that the classification accuracy increases from 85.38 to 100% when the number of movements
decreases from 14 to 10 movements. In other words, the reduction in the number of movements
decreases the complexity of classification, resulting in better classification accuracy.

Table 4 Mean and standard deviation of classification accuracies for movement reduction obtained
from the SRELM feature extraction and the NN classifier using the EMG signals from CH3 and CH6

No. of movements   Mean ± SD        Movement removal
14                 85.38 ± 4.55     –
13                 99.08 ± 0.68     M7
12                 99.28 ± 0.59     M7 and M13
11                 99.94 ± 0.19     M7, M13, and M6
10                 100.00 ± 0.00    M7, M13, M6, and M14

5 Discussion

The scatter plot in Fig. 4 and the RES index in Fig. 5 show that the reduced feature vectors from
SRELM provide the best performance in separating finger movements. Anam and Al-Jumaily [18] reported
that SRELM is an ELM for supervised feature extraction with consideration of the class label. The aim
of the training is to produce output that is very close to the output target. In other words, the
training tries to minimize the error between the actual output and the target. As a result, the
reduced feature vectors from SRELM show better performance in separating 14 finger movements than
those from other feature extraction techniques. In addition, LDA also considers the class label in
the extraction step (i.e., supervised feature extraction), and ULDA was developed to solve the
limitation of LDA by producing a set of uncorrelated discriminant features employing the singular
value decomposition [14]. In contrast, as Chu et al. [32] reported, PCA does not consider the class
labels in the extraction process (i.e., it performs unsupervised feature extraction). Therefore, the
output is another representation of the reduced feature vectors and its performance is lower than
with other feature extraction techniques.
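
The contrast between supervised and unsupervised feature extraction can be illustrated with a small two-dimensional sketch. It is not the SRELM method of this study: Fisher's LDA is used as the supervised stand-in, the data points are invented, and all function names are hypothetical. The toy classes differ only along the low-variance axis, so PCA, which ignores the labels, projects onto the high-variance axis and mixes the classes, while LDA uses the labels and keeps them apart.

```python
import math


def mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter(points):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix about the mean."""
    mx, my = mean(points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    syy = sum((y - my) ** 2 for _, y in points)
    return sxx, sxy, syy


def pca_direction(points):
    """Unit vector of largest variance; class labels are never used."""
    sxx, sxy, syy = scatter(points)
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)  # principal-axis angle
    return (math.cos(theta), math.sin(theta))


def lda_direction(class0, class1):
    """Fisher direction w = Sw^-1 (m1 - m0); labels define Sw and the means."""
    a0, b0, c0 = scatter(class0)
    a1, b1, c1 = scatter(class1)
    a, b, c = a0 + a1, b0 + b1, c0 + c1  # within-class scatter Sw
    det = a * c - b * b
    (m0x, m0y), (m1x, m1y) = mean(class0), mean(class1)
    dx, dy = m1x - m0x, m1y - m0y
    wx = (c * dx - b * dy) / det
    wy = (-b * dx + a * dy) / det
    norm = math.hypot(wx, wy)
    return (wx / norm, wy / norm)


def project(points, w):
    return [x * w[0] + y * w[1] for x, y in points]


def ranges_overlap(a, b):
    """True if the 1-D projections of the two classes share any interval."""
    return max(min(a), min(b)) <= min(max(a), max(b))


# Invented toy data: wide spread along x, classes differ only along y.
class0 = [(-4, 0.1), (-2, -0.1), (0, 0.0), (2, 0.1), (4, -0.1)]
class1 = [(-4, 1.1), (-2, 0.9), (0, 1.0), (2, 1.1), (4, 0.9)]

w_pca = pca_direction(class0 + class1)
w_lda = lda_direction(class0, class1)

pca_mixed = ranges_overlap(project(class0, w_pca), project(class1, w_pca))
lda_mixed = ranges_overlap(project(class0, w_lda), project(class1, w_lda))
print("PCA projection mixes the classes:", pca_mixed)
print("LDA projection mixes the classes:", lda_mixed)
```

The range-overlap check is a deliberately crude separability measure chosen for brevity; it is not the RES index used in this paper.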

Table 5 Performance comparisons with other techniques from previous publications

Ref.  #M  #Ch  Features in each EMG channel                                #DF  FE      Classifiers  Acc. (%)
[8]   8   2    MAV, SGT                                                    16   –       NN           85.10
[9]   5   2    FFT                                                         20   –       NN           86.00
[13]  10  2    7th-order AR coefficient, SSC, ZC, WL, SKW, HTD             28   LDA     SVM          ≈ 92.00
[18]  10  2+1  6th-order AR coefficient, SSC, ZC, WL, SKW, MAV, HTD        42   SRELM   AW-ELM       86.73
[A]   10  2    4th-order AR coefficient, SSC, ZC, WL, SKW, MAV, MNF, KURT  22   SRELM   NN           100.00
[11]  12  32   WL                                                          32   PCA     NN           94.30
[12]  15  6    6th-order AR coefficient, RMS, WL, ZC, IEMG, SSC            66   OFNDA   LDA          98.25
[19]  11  7    IEMG, WL, VAR, ZC, SSC, WAMP                                42   –       NN           93.90
[20]  15  4    4th-order AR coefficient, WL, RMS                           24   –       SVM          97.60
[B]   14  6    4th-order AR coefficient, MAV, WL, ZC, SSC, MNF, KURT, SKW  66   SRELM   NN           99.57

#M the number of movements, #Ch the number of EMG channels used, #DF the dimension of the feature vector before applying feature extraction, FE
feature extraction, Acc accuracy, SGT the spectra from Gabor transform, FFT fast Fourier transform, HTD Hjorth time domain, IEMG integrated EMG,
VAR variance of EMG, RMS root mean square, [A] the proposed method when using two-EMG channels for classifying 10 finger movements, [B] the
proposed method when using six-EMG channels for classifying 14 finger movements

Table 5 presents the performance comparisons of the proposed method with those from previous
publications. The classification performance can be divided into two groups. In the first group, the
number of EMG channels used is 2 [8, 9, 13, 18, A]. The dimensions of feature vectors from [8, 9] are
16 and 20, respectively. The classifier used is NN. The classification accuracy is 85–86%. It is
important to note that there is no application of feature extraction for classifying movements from
both individual and combined fingers in [8, 9]. This may be the cause of poor classification
accuracy. However, feature extraction is applied for reducing the dimension of the feature vector in
[13, 18, A]. The classification accuracy of the proposed technique for classifying 10 movements from
two-channel EMG signals achieves 100% [A], compared to approximately 92.00% in [13] and 86.72% in
[18]. Note that, in [18], the feature vectors were generated from two EMG channels plus one channel
formed from summation of the two channels. Moreover, Bayesian fusion was applied as a post-processing
step in [13]. The comparison between [A] and [18] indicates that the pairing of a feature extraction
technique with a type of classifier affects the classification accuracy. Another way to increase
classification accuracy when the number of movements increases is to increase the number of EMG
channels as shown in [11, 12, 19, 20, B]. Results show that the proposed technique achieves good
accuracy in classifying 14 movements from six-channel EMG signals at 99.57% [B]. The results of this
study clearly illustrate that using high-dimensional feature vectors with feature extraction could
improve the classification performance.

6 Conclusions

This paper proposed a system for classifying 14 finger movements, involving individual and combined
finger flexion observed by six channels of EMG signals. Six feature extraction techniques were
evaluated including principal component analysis (PCA), linear discriminant analysis (LDA),
uncorrelated linear discriminant analysis (ULDA), orthogonal fuzzy neighborhood discriminant analysis
(OFNDA), spectral regression linear discriminant analysis (SRLDA), and spectral regression extreme
learning machine (SRELM). The results show that the reduced feature vectors from SRELM give the best
performance in terms of feature separation among these feature extraction techniques. In addition,
the best feature separation ability obtained with SRELM was confirmed by a quantitative measure,
namely the RES index. Subsequently, seven classifiers were validated, namely support vector machine
(SVM), linear classifier (LC), naive Bayes (NB), k-nearest neighbors (KNN), radial basis function
extreme learning machine (RBF-ELM), adaptive wavelet extreme learning machine (AW-ELM), and neural
network (NN). The results show that NN provides the best performance in separating six-channel EMG
signals to identify 14 finger movements. Classification accuracy of up to 99% was reached when using
SRELM and NN in combination.

Acknowledgements The authors would like to thank the Research and Development Office (RDO), Prince of
Songkla University, and Associate Professor Dr. Seppo Karrila, Faculty of Science and Industrial
Technology, Prince of Songkla University, for commenting on the manuscript.

Funding information This work was jointly funded by the Thailand Research Fund and Faculty of
Engineering, Prince of Songkla University, through Contract No. RSA5980049, in part by the Higher
Education Research Promotion and National Research University Project of Thailand, Office of the
Higher Education Commission, and UTS International Research Scholarship, University of Technology,
Sydney.

References

1. Kuiken TA, Li G, Lock BA, Lipschutz RD, Miller LA, Stubblefield KA, Englehart K (2009) Targeted muscle reinnervation for real-time myoelectric control of multifunction artificial arms. J Am Med Assoc 301(6):619–628
2. Englehart K, Hudgins B (2003) A robust, real-time control scheme for multifunction myoelectric control. IEEE Trans Biomed Eng 50(7):848–854
3. Hudgins B, Parker P, Scott RN (1993) A new strategy for multifunction myoelectric control. IEEE Trans Biomed Eng 40(1):82–94
4. De Luca CJ (1979) Physiology and mathematics of myoelectric signals. IEEE Trans Biomed Eng 26(6):313–325
5. Orosco EC, Lopez NM, Di Sciascio F (2013) Bispectrum-based features classification for myoelectric control. Biomed Signal Proces 8(2):153–168
6. Oskoei MA, Hu H (2007) Myoelectric control systems - a survey. Biomed Signal Proces 2(4):275–294
7. Parker P, Englehart K, Hudgins B (2006) Myoelectric signal processing for control of powered limb prostheses. J Electromyogr Kinesiol 16(6):541–548
8. Nishikawa D, Yu W, Yokoi H, Kakazu Y (1999) EMG prosthetic hand controller using real-time learning method. In: Proc IEEE International Conference on Systems, Man and Cybernetics, pp. 153–158
9. Uchida N, Hiraiwa A, Sonehara N, Shimohara K (1992) EMG pattern recognition by neural networks for multi fingers control. In: Proc 14th Annual International Conference of the IEEE Engineering in Medicine and Biology, 1992, pp. 1016–1018
10. Zecca M, Micera S, Carrozza MC, Dario P (2002) Control of multifunctional prosthetic hands by processing the electromyographic signal. Crit Rev Biomed Eng 30(4–6):459–485
11. Tenore FVG, Ramos A, Fahmy A, Acharya S, Cummings RE, Thakor NV (2009) Decoding of individuated finger movements using surface electromyography. IEEE Trans Biomed Eng 56(5):1427–1434
12. Al-Timemy AH, Bugmann G, Escudero J, Outram N (2013) Classification of finger movements for the dexterous hand prosthesis control with surface electromyography. IEEE J Biomed Health Inform 17(3):608–618
13. Khushaba RN, Kodagoda S, Takruri M, Dissanayake G (2012) Toward improved control of prosthetic fingers using surface electromyogram (EMG) signals. Expert Syst Appl 39(12):10731–10738
14. Khushaba RN, Kodagoda S, Liu D, Dissanayake G (2013) Muscle computer interfaces for driver distraction reduction. Comput Methods Prog Biomed 110(2):137–149

15. Phinyomark A, Phukpattaranont P, Limsakul C (2012) Investigating long-term effects of feature extraction methods for continuous EMG pattern classification. Fluct Noise Lett 11(4):1250028
16. Khushaba RN, Al-Ani A, Al-Jumaily A (2010) Orthogonal fuzzy neighborhood discriminant analysis for multifunction myoelectric hand control. IEEE Trans Biomed Eng 57(6):1410–1419
17. Anam K, Al-Jumaily A (2014) Swarm-wavelet based extreme learning machine for finger movement classification on transradial amputees. In: Proc 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2014, pp. 4192–4195
18. Anam K, Al-Jumaily A (2015) A novel extreme learning machine for dimensionality reduction on finger movement classification using sEMG. In: Proc 7th International IEEE/EMBS Conference on Neural Engineering (NER), pp. 824–827
19. Du YC, Lin CH, Shyu LY, Chen T (2010) Portable hand motion classifier for multi-channel surface electromyography recognition using grey relational analysis. Expert Syst Appl 37(6):4283–4291
20. Tavakolan M, Xiao ZG, Menon C (2011) A preliminary investigation assessing the viability of classifying hand postures in seniors. Biomed Eng Online 10:79
21. Phinyomark A, Phukpattaranont P, Limsakul C (2012) Feature reduction and selection for EMG signal classification. Expert Syst Appl 39:7420–7431
22. Al-Timemy A, Khushaba R, Bugmann G, Escudero J (2016) Improving the performance against force variation of EMG controlled multifunctional upper-limb prostheses for transradial amputees. IEEE Trans Neural Syst Rehabil Eng 24(6):650–661
23. Cai D, He X, Han J (2008) SRDA: an efficient algorithm for large-scale discriminant analysis. IEEE Trans Knowl Data Eng 20(1):1–12
24. Phinyomark A, Limsakul C, Phukpattaranont P (2011) Application of wavelet analysis in EMG feature extraction for pattern classification. Meas Sci Rev 11:45–52
25. Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intel Syst Technol 2(3):27:1–27:27
26. Kim KS, Choi HH, Moon CS, Muna CW (2011) Comparison of k-nearest neighbor, quadratic discriminant and linear discriminant analysis in classification of electromyogram signals based on the wrist-motion directions. Curr Appl Phys 11(3):740–745
27. Domingos P, Pazzani M (1996) Beyond independence: conditions for the optimality of the simple Bayesian classifier. In: Proc International Conference on Machine Learning, pp. 105–112
28. Huang GB, Siew CK (2004) Extreme learning machine: RBF network case. In: Proc 8th Control, Automation, Robotics and Vision Conference, pp. 1029–1036
29. Anam K, Al-Jumaily A (2014) Adaptive wavelet extreme learning machine (AW-ELM) for index finger recognition using two-channel electromyography. In: Proc International Conference on Neural Information Processing (ICONIP 2014), pp. 471–478
30. Ibrahimy MI, Ahsan MR, Khalifa OO (2013) Design and performance analysis of artificial neural network for hand motion detection from EMG signals. World Appl Sci J 23(6):751–758
31. Al-Timemy A, Khushaba RN, Escudero J (2016) Selecting the optimal movement subset with different pattern recognition based EMG control algorithms. In: Proc 38th IEEE EMBC Annual International Conference
32. Chu JU, Moon I, Mun MS (2006) A supervised feature extraction for real-time multifunction myoelectric hand control. In: Proc 28th IEEE EMBS Annual International Conference, pp. 2417–2420

Pornchai Phukpattaranont received the B.Eng. (Hons.) and M.Eng. degrees in electrical engineering from the Prince of Songkla University, Songkhla, Thailand, in 1993 and 1997, respectively, and the Ph.D. degree in electrical and computer engineering from the University of Minnesota, Minneapolis, MN, USA, in 2004. He is currently an Associate Professor of Electrical Engineering with the Prince of Songkla University. Examples of his ongoing research include the pattern recognition system based on electromyographic signal, electrocardiographic signal, and microscopic images of breast cancer cells. His current research interests include signal and image analysis for medical applications and ultrasound signal processing. Dr. Phukpattaranont is a member of the ECTI Association and Thai Biomedical Engineering Research Societies.

Sirinee Thongpanja was born in Songkhla, Thailand. She received the B.Eng. degree in biomedical engineering and the M.Eng. degree in electrical engineering from the Prince of Songkla University, Songkhla, Thailand, in 2011 and 2012, respectively, and the Ph.D. degree in electrical engineering from the Prince of Songkla University, Songkhla, Thailand, in 2016. Her current research interests include surface electromyography signal processing and pattern recognition.

Khairul Anam was born in Buleleng-Bali on the 5th of April 1978. He received his B.Eng. degree from the Department of Electrical Engineering, University of Brawijaya, in 2002; M.Eng. degree from the Institut Teknologi Sepuluh Nopember (ITS) Surabaya in 2008; and Ph.D. degree from the University of Technology, Sydney, Australia, in 2016. He is currently a Senior Lecturer in the Department of Electrical Engineering, University of Jember, Indonesia. His main interest is artificial intelligence and its application in electrical engineering, biomedical engineering, and other fields.

Dr. Adel Al-Jumaily received his B.Sc. (Eng.) in Electrical Engineering and Education and M.Sc. in Engineering Management from UT Bagdad and Ph.D. in Electrical Engineering from UTM Malaysia. Currently, he is an Associate Professor in the University of Technology Sydney. His research interest is in the fields of computational intelligence, bio-mechatronics systems, health technology and biomedical, vision-based cancer diagnosing, and artificial intelligent systems.

Chusak Limsakul received the B.Eng. degree in electrical engineering from the King Mongkut’s Institute of Technology Ladkrabang, Bangkok, Thailand, in 1978, and the D.E.A. and Dr. Ing. degrees from the Institute National des Sciences Appliquees de Toulouse, Toulouse, France, in 1982 and 1985, respectively. He was a Lecturer with the Department of Electrical Engineering, Prince of Songkla University, Songkhla, Thailand, in 1978, where he is currently an Associate Professor of Electrical Engineering and the President. His current research interests include biomedical signal processing, biomedical instrumentation, and neural network.
