
Heliyon 8 (2022) e10240

Contents lists available at ScienceDirect

Heliyon
journal homepage: www.cell.com/heliyon

Research article

Data augmentation strategies for EEG-based motor imagery decoding


Olawunmi George a,*, Roger Smith b, Praveen Madiraju a, Nasim Yahyasoltani a, Sheikh Iqbal Ahamed a

a Marquette University, Milwaukee, Wisconsin, USA
b University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, USA

ARTICLE INFO

Keywords: BCI; Data augmentation; Deep learning; EEG; Motor imagery; VAE

ABSTRACT

The wide use of motor imagery as a paradigm for brain-computer interfacing (BCI) points to its characteristic ability to generate discriminatory signals for communication and control. In recent times, deep learning techniques have increasingly been explored in motor imagery decoding. While deep learning techniques are promising, a major challenge limiting their wide adoption is the amount of data available for decoding. To combat this challenge, data augmentation can be performed to enhance decoding performance. In this study, we performed data augmentation by synthesizing motor imagery (MI) electroencephalography (EEG) trials, following six approaches. Data generated using these methods were evaluated based on four criteria, namely: the accuracy of prediction, the Fréchet Inception distance (FID), the t-distributed Stochastic Neighbour Embedding (t-SNE) plots and topographic head plots. We show, based on these, that the synthesized data exhibit similar characteristics to real data, gaining up to 3% and 12% increases in mean accuracies across two public datasets. Finally, we believe these approaches should be utilized in applying deep learning techniques, as they not only have the potential to improve prediction performances, but also to save time spent on subject data collection.

1. Introduction

The use of brain-computer interfaces in health-related applications, such as the prognosis of abnormality conditions like epilepsy [1, 2] and the restoration of hand grasping functionality in patients with movement impairments and disorders, such as stroke [3, 4, 5, 6, 7], is very common. In non-health related applications like gaming and vehicle use [8, 9, 10, 11, 12], brain-computer interfaces can also be used to communicate and control. Many of these applications make use of motor imagery (MI), either solely or in a hybrid fashion with other paradigms.

Motor imagery remains one of the most popular and widely explored BCI paradigms. It is a state wherein a person imagines the performance of a particular body movement action. This involves thinking as though performing the action [13, 14, 15, 16]; it could be viewed as performing the action in the mind. Previous works have shown that motor imagery and the actual performance of an activity have similar neural mechanisms over the sensorimotor cortex [17, 18]. Since these imaginations create neuronal activity in the sensorimotor area in a similar way as the actual action does, many works have investigated the use of motor imagery in performing commands of action. This has proven to be very useful, particularly in neurorehabilitation, where the goal is to gradually help a patient gain functionality of a damaged part of their body. In such cases, which typically include stroke, the goal is to help the patient gradually make use of the affected part through motor imagery. Whenever an imagined action is performed in this scenario, the BCI decodes the signals for the imagined action and sends a command to the orthosis, to gradually lift that part of the body. Repeated use can help the patient gain functionality of the affected area over time.

In processing MI signals via deep learning techniques, fairly large amounts of data are required to train a model. This has been stated as a challenge to the wide adoption of deep learning techniques in the decoding process, since most MI experiments yield datasets which are small and typically only a few hundred trials in number [19, 20, 21, 22]. The laboratory process of data collection for MI can be exhausting and have a tiring effect on participants. The repeated instructions to perform the imagined action [23, 24, 25] can cause subjects to be easily worn out, hampering their ability to generate the neurological signals needed for the experiment [26, 27]. Prolonged repetition of trials can lead to the acquisition of bad quality data, unsuitable for building a good decoding model.

Given the wide use of MI and its efficacy in BCIs [23, 24, 25], the investigation of possible enhancements to the decoding process is beneficial, since that can potentially improve communication and control.

* Corresponding author.
E-mail address: [email protected] (O. George).

https://doi.org/10.1016/j.heliyon.2022.e10240
Received 17 February 2022; Received in revised form 1 May 2022; Accepted 5 August 2022
2405-8440/© 2022 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Deep learning techniques prevalent in computer vision have seen wide successes in enhancing model performance via data augmentation and transfer learning. Data augmentation techniques, in this case, include image rotation, flips, noise addition and shearing. These techniques, though prevalent in computer vision, need to be adapted for MI data augmentation and ought not to be used without a consideration of motor imagery peculiarities. In motor imagery, for instance, random flips of the signal or its representation might yield a totally different representation than expected, leading to data corruption.

Considering the limitation due to dataset sizes, we explore the use of six data generation techniques in synthesizing motor imagery signals, which may be used in augmenting data for the decoding process. The techniques are:

1. Averaging randomly selected trials
2. Recombining time slices of randomly selected trials
3. Recombining frequency slices of randomly selected trials
4. Gaussian noise addition
5. Cropping
6. Variational autoencoder (VAE) data synthesis

More details on these techniques are presented in Section 3.2.

2. Related works

While data augmentation has been widely used in computer vision and natural language processing (NLP), it has not seen wide use in motor imagery decoding. A few works have, however, explored data augmentation for motor imagery decoding. This section presents some of such works.

Zhang et al., in their work [28], explored data augmentation in the motor imagery decoding process. They applied empirical mode decomposition (EMD) on the EEG data to obtain intrinsic mode functions (IMFs). Artificial EEG frames were then generated by mixing the IMFs: for any given class, real EEG trials were randomly selected and the IMFs of the real trials were summed for each channel. The signals were then transformed into the time-frequency domain using complex Morlet wavelets. Classification was done with neural networks and traditional machine learning classifiers, with the neural networks outperforming the machine learning classifiers. Their approach was validated on their motor imagery EEG dataset and dataset III from the BCI Competition II [29].

In Li et al.'s work [30], the authors took an amplitude-perturbation approach to data augmentation. First, the time-frequency representation of the signals was generated using the short time Fourier transform (STFT). Then, the amplitude and phase information at each time and frequency were obtained. Random noise was added to the amplitude, after which the perturbed amplitude and original phase information were combined in the representation. Afterwards, the inverse STFT was computed to get the artificially generated EEG time series. The data were then classified using a variety of neural network architectures and filter bank common spatial patterns. Their approach was validated on the BCI Competition IV 2a [31] and High Gamma [32] datasets. Across both datasets and for the same network architectures, the results obtained with augmentation were better than those without augmentation.

A recent work by Dai et al. [33] took an approach of performing recombination across the time and frequency domains to generate trials. First, the trials were grouped into their classes, after which real trials of the same class were randomly selected and time slices of the trials were swapped. After the time domain swapping, frequency swaps were done, in which slices of the same frequency bands of the intermediate artificial trials were swapped. A CNN was used for classification, with validation performed using the BCI Competition IV 2a and 2b datasets. Average classification accuracies on the latter dataset, with and without data augmentation, were reported as 87.6% and 85.6%, respectively, showing that augmentation improved classification. The authors also reported individual subject improvements of 2.9–19.7%.

Another work by Tayeb et al. [34] performed data augmentation by cropping. The authors used a time window of 4 s, on trials 7 s long, with a stride of 125 ms to create crops, yielding 25 times more trials. Though results for a direct comparison between the use and non-use of augmentation were not provided, the authors reported having better results with augmentation than without it. They also reported that the approach helped curb overfitting and forced their convolutional neural network (CNN) model to learn features from all the crops, leading to better classification.

Other approaches to data generation have made use of generative adversarial networks (GANs). GANs are a combination of neural networks in a generating-discriminating cycle. They were first introduced in Goodfellow et al.'s work [35], have progressively evolved and are being used in many applications. GANs have been used in image processing and medical analysis [36, 37, 38], generation of financial data [39, 40, 41] and also in EEG signal processing, for signal generation and reconstruction [19, 42, 43, 44, 45, 46]. These works demonstrate that data augmentation is beneficial to classification performance.

In contrast to many of these works, where a single dataset and one or two methods are explored for augmentation, this work contributes by exploring two datasets across a wider range of data augmentation methods. Exploring different datasets of varying trial lengths helps to better validate the strengths or limitations of the approaches.

3. Method

3.1. Datasets

Two public datasets were used in this work. The first dataset, by Cho et al. [47], is a dataset of 3-second left- and right-hand motor imageries of 52 subjects. A Biosemi ActiveTwo system, with a 64-electrode 10-10 montage and 512 Hz sampling rate, was used for data acquisition. Subjects had between 100 and 200 trials recorded. Electromyographic (EMG) readings were also made available for muscular artifact removal. The second dataset, provided by Kaya et al. [48], contained imageries of 6 tasks: left hand, right hand, left foot, right foot, tongue and a passive period, during which the subject was not performing any imagined action. Data for 12 subjects were made available for the 6-class imagery, with subjects having between 700 and 900 trials recorded. The EEG-1200 EEG system, a standard medical EEG station, was used for data acquisition, with a sampling rate of 200 Hz and 19 EEG channels in a 10-20 montage. Similar pre-processing steps were carried out on both datasets. The steps were as follows, with a sketch of the pipeline shown after the list:

1. Data were bandpass filtered for 1–40 Hz.
2. Baseline correction was performed with the first 200 ms pre-cue.
3. Artifact correction was done slightly differently for each dataset. The major artifacts of concern were the oculographic and myographic artifacts. For the first dataset [47], EMG artifacts were eliminated by using the EMG readings and independent component analysis (ICA) to remove artifact-like components. For the second [48], a simulated electrooculographic (EOG) bipolar channel was constructed using the two pre-frontal electrodes, Fp1 and Fp2. The bipolar channel readings were used to remove EOG artifacts in a similar manner as EMG artifacts in the first dataset, before ICA application for removal of other artifact-like components.
4. Data were re-referenced to the average to improve the signal-to-noise ratio.
5. A final round of artifact repair and rejection was performed, using the autoreject package [49].


Figure 1. Schematic depicting trial generation by averaging N randomly selected trials.

3.2. Augmentation techniques

Six augmentation techniques were used, namely: trial averaging, time slice recombination, frequency slice recombination, noise addition, cropping and the use of a VAE. These range from simple approaches (averaging, time slice recombination) to more sophisticated ones (cropping, VAE), and were chosen to allow comparisons across a wide range of augmentation techniques. First, the data for each subject and/or session was split into train, validation and test sets, after which data augmentation was performed on the training set. A 70:12:18 split ratio was used for the train, validation and test sets. This was chosen due to the small number of trials, particularly in dataset I, and the need to have a validation set. For all data generation approaches, data were grouped for each subject and class. Artificial trials were generated on a per-class basis for approaches requiring direct synthesis from real trials.
3.2.1. Trial averaging (AVG)

This approach involved the random selection of N real trials, which were then averaged to create a new trial. We set N to 5 for a baseline, as that provides a good number of samples for averaging. N was then varied to see how the results change with the number of averaged samples; with a varied number of epochs, we noticed no significant increase in performance. Figure 1 depicts the data generation process via averaging. The significance of this method lies in the fact that averaging trials generates a trial with different numerical values but a similar distribution to that of the original trials. A minimal sketch is shown below.
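The following sketch assumes trials is a NumPy array of shape (n_trials, n_channels, n_samples) holding one subject's trials for a single class; the names are illustrative, not the authors' code:

import numpy as np

rng = np.random.default_rng(seed=0)
trials = rng.standard_normal((100, 64, 1536))   # placeholder for real class trials

def average_trials(trials, n=5):
    # Average n randomly selected real trials into one synthetic trial.
    idx = rng.choice(len(trials), size=n, replace=False)
    return trials[idx].mean(axis=0)

# Generate as many synthetic trials as there are real ones.
synthetic = np.stack([average_trials(trials) for _ in range(len(trials))])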
3.2.2. Recombination of time slices (RT)

This involves selecting N trials randomly and generating a new trial by recombining roughly equal time slices from all selected trials. We set N to 5 and combined roughly equal time slices from the 5 trials for a baseline. N was varied and, as with AVG, we noticed no significant increase in performance. Figure 2 depicts the process of data generation via recombination of time slices. Recombining slices of trials generates unique trials, which take on patterns from across the originating trials. A sketch is shown below.
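Under the same assumptions as the AVG sketch (trials of shape (n_trials, n_channels, n_samples)), RT could look as follows; each of the N roughly equal time slices of the new trial comes from a different randomly selected real trial:

import numpy as np

rng = np.random.default_rng(seed=0)

def recombine_time_slices(trials, n=5):
    idx = rng.choice(len(trials), size=n, replace=False)
    edges = np.linspace(0, trials.shape[-1], n + 1, dtype=int)  # slice boundaries
    new_trial = np.empty_like(trials[0])
    for k in range(n):
        # Copy the k-th time slice from the k-th selected trial.
        new_trial[:, edges[k]:edges[k + 1]] = trials[idx[k], :, edges[k]:edges[k + 1]]
    return new_trial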

Figure 2. Schematic depicting trial generation by recombining roughly equal time slices of N randomly selected trials.


Figure 3. Schematic depicting trial generation by recombining roughly equal frequency slices of N randomly selected trials.

3.2.3. Recombination of frequency slices (RF)

For recombination in frequency, first, the time-frequency representations of all trials are generated using the short time Fourier transform (STFT). N trials are then randomly selected and roughly equal frequency slices of the trials are combined to generate a new trial. After recombination in the frequency domain, the inverse STFT is applied to get the time series representation of all generated trials. N was set to 5 for a baseline and, as with the previous two approaches, we noticed no significant increase in performance with a varied number of recombination trials. Figure 3 depicts the process of data generation via recombination of frequency slices. A sketch is shown below.
where KL cost annealing [52], training the VAE on the reconstruction loss only
crop size ¼ wlen * sampling rate ¼ 0.5 * 200 ¼ 60 for the first 40 epochs and gradually increasing the weight of the KL loss
Number of crops ¼ floor [((1 * 200) - 100)/((1-0.5) * 100)] þ 1 component over the next 20 epochs. The VAE architecture comprises
¼ floor(2) þ 1 ¼ 2 þ 1 ¼ 3 convolutional layers and is seen in Tables A1 and A2 of the appendix. The
For instance, to generate crops of wlen ¼ 0.5, with n overlap ¼ 50% on original training set of the VAE was oversampled as that yielded better
dataset II, having trial length ¼ 1 s and 200Hz sampling rate, the number losses and training stability. Training was done for 300 epochs with a
of crops generated is 3. This yields 3 times more data on dataset II. batch size of 32 and the Adam optimizer was used in the VAE and across
Figure 5 depicts the cropping process. Using crops not only generates all neural networks in this study. Figure 6 below shows the VAE learning
more training instances but could allow for learning task-specific patterns process. The VAE differs from other methods given its sophisticated
in a time window. approach of learning the distribution of the real data and before gener-
ating synthetic ones.
3.2.6. Variational autoencoder (VAE)
A conditional variational autoencoder was used to learn the distri-
bution of the data and to generate artificial trials based on real ones. The 3.3. Data evaluation
VAE [50] consists of the encoder and the decoder, just as in the case of an
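A sketch of the cropping procedure for a single trial of shape (n_channels, n_samples), following Eq. (1); the parameter values are those used for dataset II:

import numpy as np

def crop_trial(trial, sampling_rate=200, wlen=0.5, overlap=0.5):
    crop_size = int(wlen * sampling_rate)          # 0.5 * 200 = 100 samples
    step = int((1 - overlap) * crop_size)          # 50-sample hop
    starts = range(0, trial.shape[-1] - crop_size + 1, step)
    return np.stack([trial[:, s:s + crop_size] for s in starts])

trial = np.zeros((19, 200))                        # a 1 s dataset II trial
print(crop_trial(trial).shape)                     # (3, 19, 100): 3 crops, per Eq. (1)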
3.2.6. Variational autoencoder (VAE)

A conditional variational autoencoder was used to learn the distribution of the data and to generate artificial trials based on real ones. The VAE [50] consists of an encoder and a decoder, just as in the case of an autoencoder. In the typical encoder-decoder combination of an autoencoder, the encoder learns a representation of the data and encodes it by giving a lower-dimensional representation. The decoder, on the other hand, takes the encoded representation and aims to reconstruct the signal back to its original form, thereby decoding the representation. With a VAE, the autoencoder does not simply learn a function that maps the input to a compressed form and back to its original form. Rather, it learns the parameters of the probability distribution describing the data. The representation learned is, therefore, constrained, based on the mean and standard deviation of the data. We conditioned the learning by merging the labels with the trials during the learning process. The loss function of the VAE was a summed loss, consisting of the Kullback-Leibler (KL) divergence [51] and the mean square reconstruction loss. We applied KL cost annealing [52], training the VAE on the reconstruction loss only for the first 40 epochs and gradually increasing the weight of the KL loss component over the next 20 epochs; a sketch of this schedule is shown below. The VAE architecture comprises convolutional layers and is shown in Tables A1 and A2 of the appendix. The original training set of the VAE was oversampled, as that yielded better losses and training stability. Training was done for 300 epochs with a batch size of 32, and the Adam optimizer was used for the VAE and across all neural networks in this study. Figure 6 shows the VAE learning process. The VAE differs from the other methods given its more sophisticated approach of learning the distribution of the real data before generating synthetic trials.
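A sketch of the KL cost annealing schedule described above, assuming a summed loss of the form reconstruction + beta(epoch) * KL; the linear ramp is one plausible reading of the schedule, not the authors' exact implementation:

def kl_weight(epoch, warmup=40, ramp=20):
    # beta = 0 for the first 40 epochs, then ramps linearly to 1 over 20 epochs.
    if epoch < warmup:
        return 0.0
    return min(1.0, (epoch - warmup) / ramp)

# Inside the training loop (per-batch loss, pseudocode):
# loss = mse(x_reconstructed, x) + kl_weight(epoch) * kl_divergence(mu, log_var)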
3.3. Data evaluation

The quality of the generated data was evaluated using four criteria, namely:

i Accuracy of prediction using the augmented set as against no augmentation.
ii Fréchet inception distance (FID).
iii t-SNE plots of both real and synthetic trials.
iv Topographic head plots of both real and synthetic trials.

Figure 4. Schematic depicting trial generation by adding noise to N randomly selected trials.

Figure 5. Schematic depicting the cropping process.


Figure 6. Illustration of the learning process of a VAE [53].

For accuracy evaluation, classification was done using a CNN, the Deep Net, with an architecture inspired by Schirrmeister et al.'s work [32]. The structure of the network is shown in Table A3 of the appendix. In training the classifier, the already partitioned training set was used with and without augmentation across the mentioned techniques, with the classifier being trained for 50 epochs and a batch size of 32, along the lines of the sketch below. As earlier stated, the baseline approach for augmentation was to generate an equal number of samples as the number of available trials for each subject and session. Using more augmentation samples yielded no significant improvement. Hence, the baseline was used for comparison across methods. Mean accuracies are shown in Figures 7 and 11 for datasets I and II, respectively.
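A minimal sketch of the training setup, assuming TensorFlow/Keras and a hypothetical deep_net builder for the Table A3 architecture; the data variables are placeholders:

import tensorflow as tf

model = deep_net(input_shape=(1, 64, 1536), n_classes=2)  # hypothetical builder
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=50, batch_size=32,     # settings stated above
          validation_data=(x_val, y_val))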
The FID [54] calculates the distance between feature vectors computed for real and generated samples. It measures how similar they are, in statistical terms, based on features of the trials calculated using the Inception v3 model [55]. Lower scores show closeness or similarity in properties, being correlated with higher quality data, while larger scores show a greater dissimilarity; the FID between two identical datasets is 0. For each augmentation approach, we computed FID values against the original data and compared the distributions for similarity via lower FID scores. FID values are shown in Figures 8 and 12 for datasets I and II, respectively. A sketch of the computation is shown below.
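A sketch of the FID computation between real and generated feature sets, assuming feat_real and feat_fake are (n_samples, n_features) arrays of Inception v3 features, following Heusel et al. [54]:

import numpy as np
from scipy.linalg import sqrtm

def fid(feat_real, feat_fake):
    mu1, mu2 = feat_real.mean(axis=0), feat_fake.mean(axis=0)
    s1 = np.cov(feat_real, rowvar=False)
    s2 = np.cov(feat_fake, rowvar=False)
    covmean = sqrtm(s1 @ s2)                   # matrix square root of s1*s2
    if np.iscomplexobj(covmean):
        covmean = covmean.real                 # drop tiny imaginary parts
    return float(((mu1 - mu2) ** 2).sum() + np.trace(s1 + s2 - 2.0 * covmean))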
The t-SNE visualization [56] is a dimension-reduced plot of the data. The t-SNE procedure compresses an n-dimensional sample into two dimensions, and the plot of the 2-d representations of the samples shows how they are clustered together. Similarity between well generated and real datasets would be observed as similar clustering and shapes. For each augmentation approach, we generated t-SNE plots to show the clustering of original versus synthetic data for all classes, along the lines of the sketch below. Sample plots for AVG are presented in Figures 9 and 13 for datasets I and II, respectively.
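A sketch of such a plot using scikit-learn, with each trial flattened to one vector and real and synthetic trials embedded together; the sizes are placeholders:

import numpy as np
from sklearn.manifold import TSNE

real = np.random.randn(100, 64 * 1536)     # placeholder flattened real trials
synth = np.random.randn(100, 64 * 1536)    # placeholder flattened synthetic trials
emb = TSNE(n_components=2, perplexity=30).fit_transform(np.vstack([real, synth]))
# emb[:100] holds the 2-d points for real trials, emb[100:] those for synthetic ones.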
Topographic head plots of the evoked responses of both real and synthetic trials were generated, showing brain activity over the course of the trial. It is expected that generated trials would show evoked response patterns similar to the real ones. This was noticed across the head plots, giving a consistent pattern of events, as expected. Sample plots for datasets I and II are seen in Figures 10 and 14, respectively.

4. Results and discussion

In this section, we present the results of our analyses on the two datasets. The accuracies show how well a model distinguishes the classes, with and without augmentation; the FID shows the degree of divergence between the synthesized and real data; the t-SNE plot shows how real and synthetic trials for all tasks are clustered; and the topographic head plot shows signal characteristics and power across the trial period. For comparisons across methods, p-values were computed using repeated measures one-way analysis of variance (ANOVA) and the Bonferroni-corrected alpha value was set at 7.14E-03, as in the sketch below.
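A sketch of this comparison, assuming per-subject accuracies in a long-format pandas DataFrame and statsmodels' AnovaRM as one possible implementation of the repeated measures ANOVA; the data are placeholders:

import pandas as pd
from statsmodels.stats.anova import AnovaRM

alpha = 0.05 / 7                         # Bonferroni over 7 tests -> 7.14E-03

# Placeholder data: one accuracy per (subject, method) pair.
df = pd.DataFrame({"subject":  [1, 1, 2, 2, 3, 3],
                   "method":   ["NA", "AVG"] * 3,
                   "accuracy": [0.78, 0.80, 0.75, 0.79, 0.81, 0.82]})
res = AnovaRM(df, depvar="accuracy", subject="subject", within=["method"]).fit()
print(res.anova_table)                   # compare the p-value against alpha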

Figure 7. Summary plot of accuracies across all methods for dataset I. NA – No augmentation; AVG – Averaging; RT – Recombination in time; RF – Recombination in frequency; NS – Noise addition; CPS – Crops; VAE – VAE. Crosses and horizontal markers depict the mean and median, respectively.

Figure 8. Summary plot of FID values across all methods. AVG – Averaging; RT – Recombination in time; RF – Recombination in frequency; NS – Noise addition; VAE – VAE. CPS is excluded since no new data is calculated and the resulting data form is not of the same dimension as the original trial. Crosses and horizontal markers depict the mean and median, respectively.


Figure 9. TSNE plots across all 52 subjects of dataset I using AVG augmentation. Real and synthetic (synth) data are plotted as circles and crosses, respectively. Left
and right classes are denoted by blue and red colours, respectively. The reader is referred to the electronic copy for better viewing.

4.1. Accuracies of classification for Dataset I

Summaries of the resulting accuracies are shown in Figure 7 and the computed p-values are shown in Table 1. Initial comparisons were made for a varied number of generation epochs and a varied number of augmentation samples. There was, however, no significant change in results with a higher number of generation epochs or augmentation samples. So, an equal number of augmentation trials as real trials was generated across subjects and/or sessions for both datasets.

I All augmentation methods yielded increases in decoding accuracy

Compared with no augmentation, all augmentation techniques yielded higher (78.30–86.51 > 77.73) mean accuracies. Though all p-values except for CPS do not show significance (all p-values > α = 7.14E-03), the increments do show that slight improvements are achievable using these techniques.


Table 1. p-values for comparisons across all methods - Dataset I. α = 5E-02; Bonferroni-corrected α with 7 tests = 7.14E-03. Bold values show significance in mean differences.

            NA        AVG       RT        RF        NS        CPS       VAE
AVG         7.20E-01  -         -         -         -         -         -
RT          4.01E-01  5.33E-01  -         -         -         -         -
RF          1.09E-01  2.23E-01  6.13E-01  -         -         -         -
NS          1.81E-01  3.07E-01  7.07E-01  8.97E-01  -         -         -
CPS         1.14E-04  2.05E-04  5.32E-04  5.46E-03  3.47E-03  -         -
VAE         3.15E-01  5.24E-01  9.99E-01  5.24E-01  6.64E-01  2.30E-03  -
Cho et al.  1.40E-05  9.44E-07  2.26E-06  1.59E-09  5.36E-07  6.83E-10  6.96E-07

II Most significant increases were seen using CPS

The augmentation techniques yielding the most increases on Dataset I were CPS and frequency recombination. RF yielded a higher mean accuracy (80.24 > 77.73) than NA, though not statistically significant (1.09E-01 > α = 7.14E-03). CPS, on the other hand, yielded a higher mean accuracy (86.51 > 77.73), with statistical significance (1.14E-04 < α = 7.14E-03). Model performances improved across 75% of subjects, with up to a 12% increase in mean accuracy and most individual increases of 3–30%, using CPS. In some cases, percentage increases were slightly above 50%. This shows how cropping improved model performances on the dataset. The crops of window length 0.5 (1500 ms) with 50% overlap yielded 3 times more data for each subject, providing smaller windows of data with certain characteristics that may be peculiar to similar window lengths across trials of the same class. Based on this, the model can learn window-specific features across trials of the same class.

III All methods significantly outperformed results reported by authors of the dataset

In comparison with results reported by the original authors [47], all methods, with and without augmentation, gave superior performance with statistical significance (all p-values < α = 7.14E-03). Our approaches significantly differ from the authors' due to the pre-processing, the use of CNNs for classification rather than traditional machine learning algorithms, and augmentation. Results show better performances following our approaches, as compared to the original authors', meaning that our approaches are preferred.

4.2. FID values for Dataset I

A summary of the resulting FID values is shown in Figure 8. FIDs for all methods were calculated based on the real and synthetic data, so values were only generated for augmentation techniques yielding trials with dimensions like the original data. CPS was excluded, since crops have smaller dimensions than the original data; moreover, crops are not generated by modifying the values of the original data in any way. Lower FID values are usually preferred, as they show lower divergence, in terms of the distribution, from the real data. NS yielded the lowest FID. RF and RT also yielded low FID values. The FIDs of the VAE and averaging are quite high compared to the others. The low FID of NS shows that the perturbations were not severe enough to distort the signals. A recommendation would be to explore RF and NS, as these yielded similar values in terms of accuracy improvements and low FIDs.

4.3. TSNE plots for Dataset I

Figure 9 shows the t-SNE plots of real and synthetic data across all subjects in the dataset, using AVG. t-SNE plots are helpful in showing how samples are clustered, based on inherent characteristics of the data. Closely clustered points would suggest that such points have similar characteristics and so, it is expected that data for the same class, either real or synthetic, would be clustered together, if the augmentation is done well and the resulting data is not too divergent from the real. Data points are shown to be well clustered, with overlaps between real and synthetic data for the left and right class labels. The overlaps show similarities between the real and synthetic data. A similar plot for Dataset II is presented in Section 4.7.

4.4. Topographic head plots for Dataset I

The topographic head plots are plots of the evoked response for each class, for both real and synthetic data. The evoked responses are generated by averaging trials for each class. The plots in Figure 10 show that the characteristics of the synthetic signals are not widely different from those of the real signals. Similar characteristics, in terms of the peaks, are seen across time points of the experiment. With S01, peaks in signals for the left class are observed at about 355 ms, 506 ms and 680 ms for the real trial. For the synthetic trial, as well, peaks are observed at 350 ms, 510 ms and 682 ms. Similarly, for the right class, we see peaks around 340 ms, 496 ms, 756 ms and 779 ms, for both real and synthetic data. These time points are close, showing that the augmentation did not significantly change the observed peaks in evoked responses across classes for the subjects. With S52, there is a slight variation, though not too wide. A similar plot for dataset II is presented in Section 4.8.

4.5. Accuracies of classification for Dataset II

Summaries of the resulting accuracies and computed p-values are shown in Figure 11 and Table 2, respectively. As with dataset I, initial comparisons were done for a varied number of generation epochs and augmentation samples. However, no significant change in results was noticed and so, an equal number of augmentation trials as real trials was generated across subjects and/or sessions in the dataset.

I Most augmentation methods yielded increases in decoding accuracy

Compared with no augmentation, all augmentation techniques except CPS and the VAE yielded higher (81.74–83.01 > 80.73) mean accuracies. Though all p-values do not show significant increases (p-values > α = 7.14E-03), the increments do show that slight improvements are achievable using these techniques.

II All methods, except CPS, significantly outperformed results reported by authors of the dataset

In comparison with results reported by the original authors [48], all methods except CPS gave superior performance with statistical significance (p-values < α = 7.14E-03). As with dataset I, our approaches differ from the authors' due to the pre-processing, the use of CNNs and augmentation.

III Results significantly worsened using CPS, inferred to be due to the trial length

CPS and the VAE yielded lower mean accuracies than without augmentation (65.17, 79.92 < 80.73), with the decrease from the VAE being statistically insignificant (4.17E-01 ≥ α = 7.14E-03). However, the decrease resulting from the crops is significant (1.37E-10 < α = 7.14E-03), with 93% of subjects having decreases in individual performance, resulting in a 20% total decrease in mean accuracy.


Figure 10. Topographic plots of subjects of dataset I showing real and synthetic signal characteristics. Plots are shown for subjects (a) S01 and (b) S52, for both left
and right classes, using AVG. The reader is referred to the electronic copy for better viewing.

We varied the parameters, such as the window length and overlap, for generating the crops, but our variation of the parameters, particularly in Dataset II, did not yield performances better than without augmentation. In the comparison of the CPS approach across both datasets, we conclude that CPS might be more suitable with longer trials than shorter ones. The trial length of Dataset I (3 s) is 3 times that of Dataset II (1 s). The cropped window length for Dataset I covers 1500 ms, whereas that of Dataset II covers 600 ms. It may be inferred that the 1500 ms window length is sufficient to contain discriminatory patterns over the imagery period, as compared to 600 ms. Typical reactionary times have been placed at between 200–500 ms [56], which is one reason why a longer window length may be more appropriate, since it would more consistently capture key signal changes over the course of the task. The significant worsening of results in Dataset II shows that the length of trials should be considered before using CPS, as trials of length greater than 1 s might be more suitable than shorter trials when applying CPS.

4.6. FID values for Dataset II

A summary of the resulting FID values is shown in Figure 12. FIDs for all methods were calculated based on the real and synthetic data. As with the first dataset, FID values were not computed for CPS, since crops have smaller dimensions than the original data.

Since lower FID values are preferred, NS, RF and RT yielded the lowest FIDs, followed by AVG and VAE. Just as with the first dataset, the low FID of NS shows that the perturbations were not severe enough to distort the signals. Also, a recommendation would be to explore RF, NS and RT, as these yielded low FIDs and improvements in accuracies.

4.7. TSNE plots for Dataset II

t-SNE plots for all 12 subjects A–M (excluding D) are shown in Figure 13. The 6 classes of motor imageries are left hand (LH), right hand (RH), left leg (LL), right leg (RL), tongue (TT) and a passive mode of inactivity (PV). The 6 classes are denoted with the following colours: blue (LH), orange (LL), green (PV), red (RH), purple (RL) and brown (TT). Real and synthetic data points are denoted with circles and crosses, respectively.


Figure 11. Summary plot of accuracies across all methods for dataset II. NA – No augmentation; AVG – Averaging; RT – Recombination in time; RF – Recombination in frequency; NS – Noise addition; CPS – Crops; VAE – VAE. Crosses and horizontal markers depict the mean and median, respectively.

Figure 12. Summary plot of FID values across all methods. AVG – Averaging; RT – Recombination in time; RF – Recombination in frequency; NS – Noise addition; VAE – VAE. CPS is excluded since no new data is calculated and the resulting data form is not of the same dimension as the original trial. Crosses and horizontal markers depict the mean and median, respectively.
The data points are shown to be well clustered, with overlaps between the real and synthetic data for the class labels. The overlaps show similarities between the real and synthetic data. The t-SNE plots of AVG show much finer clustering compared with the other techniques, where the plots show greater dispersion in the data.

4.8. Topographic head plots for Dataset II

The topographic head plots of the evoked response for each class, for both real and synthetic data, are presented in Figure 14 for dataset II. As seen with dataset I, the plots also show similar characteristics between synthetic and real trials. Similar characteristics, in terms of the peaks, are seen across time points of the experiment. With the plot for Subject A (Session 160223), peaks in signals are observed at about 235 ms, 360 ms, 425 ms and 435 ms in the evoked response plot for the left-hand class, for real and synthetic trials. This same pattern is seen for all 6 classes. Also, for Subject H (Session 160720), similarities in peaks are seen for the tongue imagery, with peaks noticed at 225 ms, 330 ms, 425 ms and 430 ms, for the real and synthetic data. These similarities in peaks are also seen across other imageries. In some cases, slight differences between the observed peaks in real and synthetic data are seen. However, these changes are not so significant, showing that the signals were not undesirably distorted.

4.9. Timings for data generation across all methods

In scenarios where computation time might be a factor to be considered, techniques with less computation time would be desired. Table 3 shows the time taken for data generation across all methods. Cropping took the least time and might be the preferred augmentation method, where the trial length is also suitable. Compared with the others, the VAE is seen to be the most computationally expensive, due to the need to train the networks.

4.10. Comparisons with other works

Some recent works exploring data augmentation in motor imagery decoding include works by Freer and Yang [57], K. Zhang et al. [58], Z. Zhang et al. [28] and Dai et al. [33]. In all of these works, only a few augmentation techniques were explored. Freer and Yang [57] applied 5 methods for comparisons. Also, much of their augmentation effort was toward handling data imbalances, which we handled by oversampling, and not toward exploring in detail the effect of augmentation across different datasets. K. Zhang et al. [58] also explored 3 augmentation methods. Their augmentation was based on spectrograms alone and no experimentation was done with the raw data. So also, Z. Zhang et al. [28] and Dai et al. [33] only made use of empirical mode decomposition (EMD) and time-frequency recombination, respectively. These recent works have limited their exploration to only a few augmentation techniques and have mostly used the BCI competition datasets [29]. These contrast greatly with our work, where we have used more augmentation techniques across two datasets of varying trial lengths, showing statistically which methods tend to provide more significant results compared with others.

5. Conclusion

In this comparative study, we presented our findings on the use of different data augmentation techniques for motor imagery decoding, using neural networks. We compared a no-augmentation approach with six different augmentation techniques, which can be applied in generating synthetic trials for enhancing decoding performance. The six techniques include: averaging of trials, recombination in time, recombination in frequency, noise addition, cropping and the use of a variational autoencoder (VAE).

Table 2. p-values for comparisons across all methods - Dataset II. α = 5E-02; Bonferroni-corrected α with 7 tests = 7.14E-03. Bold values show significance in mean differences.

            NA        AVG       RT        RF        NS        CPS       VAE
AVG         4.74E-02  -         -         -         -         -         -
RT          2.60E-01  2.22E-01  -         -         -         -         -
RF          6.70E-02  4.81E-01  4.66E-01  -         -         -         -
NS          2.27E-02  9.99E-01  1.24E-01  4.73E-01  -         -         -
CPS         1.37E-10  8.57E-12  8.81E-11  1.79E-12  1.46E-13  -         -
VAE         4.17E-01  4.80E-03  1.05E-01  1.35E-03  4.26E-03  4.70E-09  -
Kaya et al. 1.18E-06  4.69E-06  2.60E-06  1.08E-06  9.01E-07  7.76E-01  1.86E-10


Figure 13. TSNE plot across all subjects and sessions using AVG augmentation. Real and synthetic (synth) data are plotted as circles and crosses, respectively. The
reader is referred to the electronic copy for better viewing.

These techniques range from simple ones (AVG, RT, RF and NS) to more computationally intensive ones, such as the training of the VAE for the augmentation. Our results from applying these techniques are presented across two public datasets, to investigate the techniques on datasets of different trial lengths. The time taken for data generation across all techniques is also shown.

Our findings generally show that these techniques offer improvements in performance compared to an un-augmented approach. Across both datasets, noise addition and recombination in frequency seemed to improve decoding performance, with both techniques yielding mean accuracies ≥ 80% on both datasets, albeit not significantly. All techniques gave some form of improvement on both datasets, except for the VAE and cropping, which gave reduced performances on dataset II. The drastic difference in results obtained with cropping on the two datasets implies that the trial length needs to be considered when applying cropping.

In future, we can consider other techniques, such as the use of a robust lightweight generative adversarial network (GAN) that can learn the data distribution with a smaller number of samples, as are available in motor imagery experiments. GANs have become increasingly popular for data generation and several modifications to the original GAN have been proposed for different tasks [59, 60, 61, 62].

Noise addition must be done carefully to avoid overly perturbing the data and distorting it. Using the mean and standard deviation for generating Gaussian noise is not recommended, due to the non-stationarity of the signal. We chose to add noise with zero mean and standard deviation equal to the mean of trials for a task. This preserves the data and avoids introducing impurities which would adversely affect learning. Another approach to noise addition could be performing the perturbations in the frequency space, as the signals are more stationary in that case. In future, we could also investigate noise addition in the frequency domain for comparison with noise addition in time.

A wider gap or variation in performance across the datasets is seen with CPS, as compared with the other methods. With the other, simpler methods, there is less variation, which we infer to be because the full length of the trials is used.


Figure 14. Topographic plots of subjects of dataset II showing real and synthetic signal characteristics, for 6 classes, using AVG. Plots are shown for subjects (a) A –
Session 160223 and (b) H – Session 160720. The reader is referred to the electronic copy for better viewing.

Table 3. Timings for the data augmentation techniques: time (in seconds) to generate 1000 trials. CPS is for window length = 0.5 and 50% overlap. Times were generated using a Kaggle GPU (NVIDIA Tesla P100). CPS took the least time, shown in bold.

            AVG    RT     RF      NS     CPS    VAE
Dataset I   1.83   0.85   88.52   4.17   0.05   1022.35
Dataset II  0.08   0.10   19.00   0.15   0.02   168.16

Also, parameters for the generation of crops should be selected to achieve optimal crops containing useful information for the learning. Another approach to cropping could involve filtering crops temporally, so that only certain portions of the trial are used. In choosing the most relevant region, a good choice might be 500–2500 ms for the 3-second-long trial or 100–900 ms for the 1-second-long trial. The purpose of crop filtering would be to discard, as much as possible, crops not significantly contributing to the decoding process. Since there exist latencies in subject reactionary times for the performance of the imagery task, filtering out crops of earlier times would reduce the amount of noise resulting from including non-discriminatory crop regions. In future, we could perform correlation analyses on the crops to determine which ones contain the most significant amount of information. With this, an empirically optimal time range can be selected, though this may vary across subjects.

In conclusion, we recommend using NS or RT for quick augmentation, and crops for trial lengths greater than 1 s. Where computational power is not a constraint, VAEs could also be explored. In future, we could also use a hybrid approach, combining more than one augmentation technique.

Declarations

Author contribution statement

Olawunmi George: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper.
Roger Smith: Contributed reagents, materials, analysis tools or data; Wrote the paper.
Praveen Madiraju: Contributed reagents, materials, analysis tools or data; Wrote the paper.
Nasim Yahyasoltani: Contributed reagents, materials, analysis tools or data; Wrote the paper.
Sheikh Iqbal Ahamed: Contributed reagents, materials, analysis tools or data; Wrote the paper.

Funding statement

This work is partially supported by a number of grants of the Ubicomp Lab, Marquette University, USA [10.13039/100012793].

Data availability statement

Data associated with this study has been deposited at https://www.nature.com/articles/sdata2018211 and https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5493744/.

Declaration of interests statement

The authors declare no conflict of interest.

Additional information

No additional information is available for this paper.


Appendix A. Structure of networks

Outputs and dimensions are shown for Dataset I only.


Table A.1. VAE encoder architecture. trial dimension = (64, 1536); n nodes 1 = product(trial dimension) = 98304; kernel size 1 = (1, floor(2 * trial dimension[1] / 96)) = (1, 32); kernel size 2 = (trial dimension[0], 1) = (64, 1); strides = (1, max(floor(trial dimension[1] / 96), 4)) = (1, 16).

Layer Type Filters Kernel size Strides Output Connected to


Input 1 - - - (None, 2) -
Embedding 1 - - - (None, 2, 10) Input 1
Flatten 1 - - - (None, 20) Embedding 1
Dense 1 5 - - (None, 5) Flatten 1
Dense 2 n nodes 1 - - (None, 98304) Dense 1
Reshape 1 - - - (None, 64, 1536, 1) Dense 2
Input 2 - - - (None, 64, 1536, 1) -
Concatenate 1 - - - (None, 64, 1536, 2) Input 2
Reshape 1
Conv 1 16 kernel size 1 strides (None, 64, 95, 16) Concatenate 1
LeakyReLU 1 - - - (None, 64, 95, 16) Conv 1
Conv 2 32 kernel size 2 - (None, 1, 95, 32) LeakyReLU 1
LeakyReLU 2 - - - (None, 1, 95, 32) Conv 2
Flatten 2 - - - (None, 3040) LeakyReLU 2
Dense 3 16 - - (None, 16) Flatten 2
Dense 4 10 - - (None, 10) Dense 3
Dense 5 10 - - (None, 10) Dense 3
Sampling 1 - - - (None, 10) Dense 4
Dense 5

Table A.2. VAE decoder architecture. trial dimension = (64, 1536); kernel size 1 = (trial dimension[0], 1) = (64, 1); kernel size 2 = (1, floor(2 * trial dimension[1] / 96)) = (1, 32); strides = (1, max(floor(trial dimension[1] / 96), 4)) = (1, 16). * Padding is added to retain the previous dimension.

Layer Type Filters Kernel size Strides Output Connected to


Input 1 - - - (None, 10) -
Dense 1 3040 - - (None, 3040) Input 1
LeakyReLU 1 - - - (None, 3040) Dense 1
Reshape 1 - - - (None, 1, 95, 32) LeakyReLU 1
ConvTranspose 1 64 kernel size 1 - (None, 64, 95, 64) Reshape 1
LeakyReLU 2 - - - (None, 64, 95, 64) ConvTranspose 1
ConvTranspose 2 32 kernel size 2 strides (None, 64, 1536, 32) LeakyReLU 2
LeakyReLU 3 - - - (None, 64, 1536, 32) ConvTranspose 2
ConvTranspose 3 * 1 7 - (None, 64, 1536, 1) LeakyReLU 3
LeakyReLU 4 - - - (None, 64, 1536, 1) ConvTranspose 3

Table A.3. Structure of the Deep Net classifier. trial dimension = (64, 1536); kernel size = (1, floor((5 * trial dimension[1]) / (256 * 1))) = (1, 30); pool size = (1, max(floor((trial dimension[1] * 2) / (256 * 4)), 2)) = (1, 3); strides = (1, max(floor((trial dimension[1] * 2) / (256 * 4)), 2)) = (1, 3).

Layer Type Filters Kernel size Pool size Strides Dropout rate Output
Input - - - - - (None, 1, 64, 1536)
Conv 25 (1, 30) - - - (None, 25, 64, 1507)
Conv 25 (1, 30) - - - (None, 25, 1, 1507)
Batch Normalization - - - - - (None, 25, 1, 1507)
Activation (SELU) - - - - - (None, 25, 1, 1507)
Average pooling - - (1, 3) (1, 3) - (None, 25, 1, 502)
Dropout - - - - 0.4 (None, 25, 1, 502)
Conv 50 (1, 30) - - - (None, 50, 1, 473)
Batch Normalization - - - - - (None, 50, 1, 473)
Activation (SELU) - - - - - (None, 50, 1, 473)
Average pooling - - (1, 3) (1, 3) - (None, 50, 1, 157)
Dropout - - - - 0.4 (None, 50, 1, 157)
Conv 100 (1, 30) - - - (None, 100, 1, 128)
Batch Normalization - - - - - (None, 100, 1, 128)
Activation (SELU) - - - - - (None, 100, 1, 128)
Average pooling - - (1, 3) (1, 3) - (None, 100, 1, 42)
Dropout - - - - 0.4 (None, 100, 1, 42)

Conv 200 (1, 30) - - - (None, 200, 1, 13)
Batch Normalization - - - - - (None, 200, 1, 13)
Activation (SELU) - - - - - (None, 200, 1, 13)
Max pooling - - (1, 3) (1, 3) - (None, 200, 1, 4)
Dropout - - - - 0.4 (None, 200, 1, 4)
Flatten - - - - - (None, 800)
Dense 6 - - - - (None, 2)
Activation (Softmax) - - - - - (None, 2)

References

[1] L. Huang, G. van Luijtelaar, Brain computer interface for epilepsy treatment, Brain-Computer Interface Systems - Recent Progress and Future Prospects.
[2] A.T. Tzallas, N. Giannakeas, K.N. Zoulis, M.G. Tsipouras, E. Glavas, K.D. Tzimourta, L.G. Astrakas, S. Konitsiotis, EEG classification and short-term epilepsy prognosis using brain computer interface software, 2017.
[3] J.A. Stevens, M.E.P. Stoykov, Using motor imagery in the rehabilitation of hemiparesis, Arch. Phys. Med. Rehabil. 84.
[4] S. de Vries, T. Mulder, Motor imagery and stroke rehabilitation: a critical discussion, 2007.
[5] A. Zimmermann-Schlatter, C. Schuster, M.A. Puhan, E. Siekierka, J. Steurer, Efficacy of motor imagery in post-stroke rehabilitation: a systematic review, 2008.
[6] R. Dickstein, A. Dunsky, E. Marcovitz, Motor imagery for gait rehabilitation in post-stroke hemiparesis, Phys. Ther. 84.
[7] G. Pfurtscheller, G.R. Müller-Putz, J. Pfurtscheller, R. Rupp, EEG-based asynchronous BCI controls functional electrical stimulation in a tetraplegic patient, EURASIP J. Appl. Signal Process. (2005).
[8] B.A.S. Hasan, J.Q. Gan, Hangman BCI: an unsupervised adaptive self-paced brain-computer interface for playing games, Comput. Biol. Med. 42.
[9] C. Soraghan, F. Matthews, D. Kelly, T. Ward, C. Markham, B.A. Pearlmutter, R. O'Neill, A dual-channel optical brain-computer interface in a gaming environment, 2006.
[10] D. Marshall, D. Coyle, S. Wilson, M. Callaghan, Games, gameplay, and BCI: the state of the art, IEEE Transactions on Computational Intelligence and AI in Games 5.
[11] T. Shi, H. Wang, C. Zhang, Brain computer interface system based on indoor semi-autonomous navigation and motor imagery for unmanned aerial vehicle control, Expert Syst. Appl. 42.
[12] J. Zhuang, G. Yin, Motion control of a four-wheel-independent-drive electric vehicle by motor imagery EEG based BCI system, 2017.
[13] C. Neuper, M. Wörtz, G. Pfurtscheller, ERD/ERS patterns reflecting sensorimotor activation and deactivation, Event-Related Dynamics of Brain Oscillations 159.
[14] J.R. Wolpaw, N. Birbaumer, D.J. McFarland, G. Pfurtscheller, T.M. Vaughan, Brain-computer interfaces for communication and control, 2002.
[15] D.J. McFarland, J.R. Wolpaw, Brain-computer interfaces for communication and control, Commun. ACM 54.
[16] G. Schalk, D.J. McFarland, T. Hinterberger, N. Birbaumer, J.R. Wolpaw, BCI2000: a general-purpose brain-computer interface (BCI) system, IEEE Trans. Biomed. Eng. 51.
[17] J. Decety, The neurophysiological basis of motor imagery, Behav. Brain Res. 77.
[18] G. Pfurtscheller, C. Neuper, Motor imagery and direct brain-computer communication, Proc. IEEE 89.
[19] S.M. Abdelfattah, G.M. Abdelrahman, M. Wang, Augmenting the size of EEG datasets using generative adversarial networks, 2018.
[20] I. Ullah, M. Hussain, E. ul Haq Qazi, H. Aboalsamh, An automated system for epilepsy detection using EEG brain signals based on deep learning approach, Expert Syst. Appl. 107.
[21] S.U. Amin, M. Alsulaiman, G. Muhammad, M.A. Mekhtiche, M.S. Hossain, Deep learning for EEG motor imagery classification based on multi-layer CNNs feature fusion, Future Generation Computer Systems 101 (2019) 542–554.
[22] A.M. Azab, L. Mihaylova, K.K. Ang, M. Arvaneh, Weighted transfer learning for improving motor imagery-based brain-computer interface, IEEE Trans. Neural Syst. Rehabil. Eng. 27 (2019) 1352–1359.
[23] R. Boostani, B. Graimann, M.H. Moradi, G. Pfurtscheller, A comparison approach toward finding the best feature and classifier in cue-based BCI, Med. Biol. Eng. Comput. 45.
[24] D. Choi, Y. Ryu, Y. Lee, M. Lee, Performance evaluation of a motor-imagery-based EEG-brain computer interface using a combined cue with heterogeneous training data in BCI-naive subjects, Biomed. Eng. Online 10 (2011) 91.
[25] R. Ron-Angevin, F. Velasco-Alvarez, S. Sancha-Ros, L.D. Silva-Sauer, A two-class self-paced BCI to control a robot in four directions, 2011.
[26] C. Wang, B. Xia, J. Li, W. Yang, D. Xiao, A.C. Velez, H. Yang, Motor imagery BCI-based robot arm system, Vol. 1, 2011.
[27] F. Velasco-Alvarez, R. Ron-Angevin, L. da Silva-Sauer, S. Sancha-Ros, Audio-cued motor imagery-based brain-computer interface: navigation through virtual and real environments, Neurocomputing 121.
[28] Z. Zhang, F. Duan, J. Solé-Casals, J. Dinarès-Ferran, A. Cichocki, Z. Yang, Z. Sun, A novel deep learning approach with data augmentation to classify motor imagery signals, IEEE Access 7 (2019) 15945–15954.
[29] BCI Competition I–IV, 2003. URL: http://www.bbci.de/competition/.
[30] Y. Li, X.-R. Zhang, B. Zhang, M.-Y. Lei, W.-G. Cui, Y.-Z. Guo, A channel-projection mixed-scale convolutional neural network for motor imagery EEG decoding, IEEE Trans. Neural Syst. Rehabil. Eng. 27 (2019) 1170–1180.
[31] B. Blankertz, BCI Competition IV, 2008, pp. 2–4. URL: http://www.bbci.de/competition/iv/.
[32] R.T. Schirrmeister, J.T. Springenberg, L.D.J. Fiederer, M. Glasstetter, K. Eggensperger, M. Tangermann, F. Hutter, W. Burgard, T. Ball, Deep learning with convolutional neural networks for EEG decoding and visualization, Hum. Brain Mapp. 38.
[33] G. Dai, J. Zhou, J. Huang, N. Wang, HS-CNN: a CNN with hybrid convolution scale for EEG motor imagery classification, J. Neural Eng. 17.
[34] Z. Tayeb, J. Fedjaev, N. Ghaboosi, C. Richter, L. Everding, X. Qu, Y. Wu, G. Cheng, J. Conradt, Validating deep neural networks for online decoding of motor imagery movements from EEG signals, Sensors 19.
[35] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, Generative adversarial nets, Adv. Neural Inf. Process. Syst. 27.
[36] A. Odena, C. Olah, J. Shlens, Conditional image synthesis with auxiliary classifier GANs, Vol. 6, 2017.
[37] T.C. Wang, M.Y. Liu, J.Y. Zhu, A. Tao, J. Kautz, B. Catanzaro, High-resolution image synthesis and semantic manipulation with conditional GANs, 2018.
[38] S. Kazeminia, C. Baur, A. Kuijper, B. van Ginneken, N. Navab, S. Albarqouni, A. Mukhopadhyay, GANs for medical image analysis, Artificial Intelligence in Medicine, 2020, 101938.
[39] M. Wiese, R. Knobloch, R. Korn, P. Kretschmer, Quant GANs: deep generation of financial time series, Quant. Finance 20.
[40] S. Takahashi, Y. Chen, K. Tanaka-Ishii, Modeling financial time-series with generative adversarial networks, Phys. Stat. Mech. Appl. 527.
[41] X. Zhou, Z. Pan, G. Hu, S. Tang, C. Zhao, Stock market prediction on high-frequency data using generative adversarial nets, Math. Probl. Eng. (2018).
[42] Y. Luo, B.-L. Lu, EEG data augmentation for emotion recognition using a conditional Wasserstein GAN, 2018, pp. 2535–2538.
[43] I.A. Corley, Y. Huang, Deep EEG super-resolution: upsampling EEG spatial resolution with generative adversarial networks, 2018.
[44] F. Wang, S.H. Zhong, J. Peng, J. Jiang, Y. Liu, Data augmentation for EEG-based emotion recognition with deep convolutional neural networks, LNCS 10705, 2018.
[45] S. Roy, S. Dora, K. McCreadie, G. Prasad, MIEEG-GAN: generating artificial motor imagery electroencephalography signals, 2020.
[46] T.J. Luo, Y. Fan, L. Chen, G. Guo, C. Zhou, EEG signal reconstruction using a generative adversarial network with Wasserstein distance and temporal-spatial-frequency loss, Front. Neuroinf. 14.
[47] H. Cho, M. Ahn, S. Ahn, M. Kwon, S.C. Jun, EEG datasets for motor imagery brain-computer interface, 2017.
[48] M. Kaya, M.K. Binli, E. Ozbay, H. Yanar, Y. Mishchenko, Data descriptor: a large electroencephalographic motor imagery dataset for electroencephalographic brain computer interfaces, Sci. Data 5.
[49] M. Jas, D.A. Engemann, Y. Bekhti, F. Raimondo, A. Gramfort, Autoreject: automated artifact rejection for MEG and EEG data, Neuroimage 159.
[50] D.P. Kingma, M. Welling, Auto-encoding variational Bayes, 2014.
[51] T. van Erven, P. Harremoës, Rényi divergence and Kullback-Leibler divergence, IEEE Trans. Inf. Theor. 60.
[52] X. Chen, D.P. Kingma, T. Salimans, Y. Duan, P. Dhariwal, J. Schulman, I. Sutskever, P. Abbeel, Variational lossy autoencoder, 2017.
[53] L. Weng, From autoencoder to beta-VAE, lilianweng.github.io. URL: https://lilianweng.github.io/posts/2018-08-12-vae/.
[54] M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, S. Hochreiter, GANs trained by a two time-scale update rule converge to a local Nash equilibrium, 2017.
[55] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, Z. Wojna, Rethinking the Inception architecture for computer vision, 2016.
[56] L. van der Maaten, G. Hinton, Visualizing data using t-SNE, J. Mach. Learn. Res. 9.
[57] D. Freer, G.-Z. Yang, Data augmentation for self-paced motor imagery classification with C-LSTM, J. Neural Eng. 17 (1) (2020) 016041.
[58] K. Zhang, G. Xu, Z. Han, K. Ma, X. Zheng, L. Chen, N. Duan, S. Zhang, Data augmentation for motor imagery signal classification based on a hybrid neural network, Sensors 20 (16) (2020) 4485.
[59] X. Mao, Q. Li, H. Xie, R.Y. Lau, Z. Wang, S.P. Smolley, Least squares generative adversarial networks, 2017.
[60] M. Arjovsky, S. Chintala, L. Bottou, Wasserstein GAN, arXiv preprint arXiv:1701.07875.
[61] P. Isola, J.Y. Zhu, T. Zhou, A.A. Efros, Image-to-image translation with conditional adversarial networks (pix2pix), Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
[62] A. Radford, L. Metz, S. Chintala, Unsupervised representation learning with deep convolutional generative adversarial networks, 2016.
