2020 - A Benchmark Dataset For RSVP-Based Brain-Computer Interfaces - Zhang Et Al - Frontiers in Neuroscience
remarkable progress has been made in the performance and practicability of BCIs due to the optimization of experimental paradigms, the improvement of signal processing algorithms, and the application of machine learning methods (Chen et al., 2015b; Nakanishi et al., 2018; Zhang et al., 2018). Especially in recent years, the emergence of free, open datasets has saved the time, money, and labor costs of data collection, providing convenience for the majority of BCI researchers and promoting the progress of algorithm development. These datasets cover many BCI paradigms such as steady-state visual evoked potentials (SSVEPs) (Wang et al., 2017; Lee et al., 2019), event-related P300 potentials (Abibullaev and Zollanvari, 2019; Vaineau et al., 2019), and motor imagery (Cho et al., 2017; Kaya et al., 2018). In addition, there are some open multimodal datasets for BCIs obtained synchronously with EEG (Lioi et al., 2019). As the broad audience of these datasets, researchers in a wide range of fields have contributed their expertise to BCI technology.

Rapid serial visual presentation (RSVP)-based BCI is a special type of BCI that identifies target stimuli (e.g., letters or images) presented sequentially in a stream by detecting the brain's response to the target. RSVP is the process of sequentially displaying images in the same spatial position at a high presentation rate, with multiple images per second (such as 2-20 Hz) (Lees et al., 2017). In the applications that benefit from this paradigm, computers are unable to analyze and understand images with deep semantic and unstructured features as successfully as humans, and manual analysis tools are slow, which has made the study of RSVP-BCI increasingly popular in recent decades. RSVP-BCI has been used in counterintelligence, policing, and health care, applications that require professionals to review objects, scenes, people, and other relevant information contained in a large number of images (Huang et al., 2017; Singh and Jotheeswaran, 2018; Wu et al., 2018).

Different EEG components are associated with target and non-target stimuli (Bigdely-Shamlo et al., 2008; Cohen, 2014), and BCI signal processing algorithms have been used to recognize event-related potential (ERP) responses and link them to target images. The most commonly exploited ERP in RSVP-based BCI applications is the P300, ideally detected on a single-trial basis (Manor et al., 2016). In order to detect ERPs induced by target images, researchers have developed a variety of algorithms and evaluated them with independently collected data (Sajda et al., 2003; Alpert et al., 2014; Zhao et al., 2019). Unfortunately, as far as we know, there is still a lack of a benchmark dataset for the RSVP-based BCI paradigm. It is always difficult to compare the performance of different algorithms with a small amount of data. One of the main difficulties in collecting a benchmark dataset is the large number of system parameters in RSVP-based BCIs (e.g., frequency of image presentation, target definition, target sparsity and identifiability, and number of trials and subjects). There is a great need to collect and publish a large benchmark dataset using the RSVP-based BCI paradigm.

This study provides an open dataset for BCI studies based on the RSVP paradigm. The characteristics of this dataset are as follows. (1) A large number of subjects (64 in total) were recorded. (2) A large number of stimulation image circles (16,000 for each subject) were included. (3) Complete data were provided as the original continuous recordings without any processing, including EEG data, electrode positions, and subject information. (4) Stimulus events (onsets and offsets) were precisely synchronized to the EEG data. (5) The 64-channel whole-brain EEG data were recorded. That means that this dataset contains a total of 64 subjects, 10,240 trials, 1,024,000 image circles, and 102,400 s of 64-channel EEG data. This dataset provides potential opportunities for developing signal processing and machine learning algorithms that rely on large amounts of EEG data. These features also make it possible to study algorithms for ERP detection and methods for stimulus coding with the dataset. In addition, through offline simulation, stimulus coding and target recognition methods can be jointly optimized toward the highest performance of an online BCI.

The rest of this paper is organized as follows. The Methods section introduces the experimental setup of data recording. The Data Recording section introduces the data records and other relevant information. The Technical Validations section introduces the basic methods in data analysis and gives three examples to illustrate how to use the dataset to study methods of target detection in RSVP-based BCIs. The Discussions and Conclusion section summarizes the study and discusses future work to improve the dataset.

MATERIALS AND METHODS

Subjects
Sixty-four subjects (32 females; aged 19-27 years, mean age 22 years) with normal or corrected-to-normal vision were recruited for this study. Each subject signed a written informed consent before the experiment and received monetary compensation for his or her participation. This study was approved by the Research Ethics Committee of Tsinghua University.

Experimental Design
This study developed an offline RSVP-BCI system. A 23.6-inch liquid crystal display (LCD) screen was used to present visual stimuli. The resolution of the screen was 1,920 × 1,080 pixels, and the refresh rate was 60 Hz. The visual stimulus images were rendered within a 1,200 × 800-pixel rectangle in the center of the screen. The screen area surrounding the stimulus images was gray [red green blue (RGB): (128, 128, 128)].

The stimulus program was developed in MATLAB (MathWorks, Inc.) using the Psychophysics Toolbox Ver. 3 (PTB-3) (Brainard, 1997). The stimulus images, downloaded from the Computer Science and Artificial Intelligence Laboratory of MIT, were street-view images of two categories: target images containing humans and non-target images without humans. During the experiment, subjects were asked to search for the target images and ignore the non-target images in a subjective manner. As previous studies have shown similar performance between motor and non-motor response tasks (Gerson et al., 2006), subjects in this study were required to make a manual button press to maintain attention once they detected a target image in the RSVP task.
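To give a concrete picture of how such a presentation pipeline can be built, the following is a minimal, illustrative Psychophysics Toolbox (PTB-3) sketch of a 10-Hz image stream (the presentation rate described in the next paragraphs) rendered in a 1,200 × 800-pixel region on a gray background. It is a sketch only, not the authors' stimulus program; the stimulus folder name and the trigger-sending step are placeholders.

    % Minimal PTB-3 sketch of a 10-Hz RSVP stream (illustrative only).
    Screen('Preference', 'SkipSyncTests', 1);          % relax sync checks for this demo
    screenId = max(Screen('Screens'));
    [win, winRect] = Screen('OpenWindow', screenId, [128 128 128]);  % gray background
    ifi = Screen('GetFlipInterval', win);               % refresh interval (~16.7 ms at 60 Hz)
    framesPerImage = round(0.1 / ifi);                  % 100 ms per image -> 10 images per second
    dstRect = CenterRect([0 0 1200 800], winRect);      % stimulus region at the screen center
    imgFiles = dir(fullfile('stimuli', '*.jpg'));       % hypothetical image folder
    vbl = Screen('Flip', win);                          % initial blank flip
    for k = 1:numel(imgFiles)
        img = imread(fullfile('stimuli', imgFiles(k).name));
        tex = Screen('MakeTexture', win, img);
        Screen('DrawTexture', win, tex, [], dstRect);
        vbl = Screen('Flip', win, vbl + (framesPerImage - 0.5) * ifi);
        % --> an event trigger would be sent to the amplifier here (hardware specific)
        Screen('Close', tex);
    end
    sca;                                                % close the window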
Figure 1 shows the time course of the RSVP paradigm. Each trial started with a 0.5-s blank screen with a cross mark at the center, and subjects were asked to shift their gaze to the cross mark as soon as possible. The frequency of image presentation was set to 10 Hz (10 images per second).

Figure 2 shows the parameter settings of the experiment for each group. Each group covered two blocks, each containing 40 trials. Each trial contained 100 images, including one, two, three, or four target images. Images in each trial were presented in a random order. At the beginning of each image's presentation, a time marker named "event trigger" was sent by the stimulation program to mark the current stimulus image and was recorded on an event channel of the amplifier synchronized with the EEG. There was a short key-controlled pause between trials. The duration of each block was about 10 min. There was an average rest time of about 5 min between two blocks to relieve subjects' fatigue.

Data Acquisition
Electroencephalogram data were recorded using the Synamps2 system (Neuroscan, Inc.) at a sampling rate of 1,000 Hz. All 64 electrodes were used to record EEG and were placed according to the international 10-20 system. The reference electrode, with the 10-20 electrode name of "Ref," was located at the vertex. Electrode impedances were kept below 10 kΩ. During the experiment, subjects were seated in a comfortable chair in a dimly lit soundproof room at a distance of approximately 70 cm from the monitor. The EEG data were filtered from 0.15 to 200 Hz by the system, and power-line noise was removed by a notch filter at 50 Hz. It is worth mentioning that the impedance of the M1 and M2 electrodes (channels 33 and 43) was higher than 10 kΩ for some subjects. We therefore suggest selecting EEG data from the other 62 channels for analysis; the EEG analysis in this study used the 62-channel data with the channel indices [1:32 34:42 44:64], removing the bad channels.

Data Preprocessing
The dataset consists of continuous data at a sampling rate of 250 Hz, obtained from the raw EEG data (sampling rate of 1,000 Hz) by downsampling by a factor of four. For each of the datasets from 1 to 64 (sub1, ..., sub64), the EEG data contained four blocks, which were divided into two groups (namely, groups A and B) in chronological order. Each group contained two blocks, and each block contained 40 trials. Each trial contained 100 circles, and each circle corresponded to one image. For each group, the two blocks were used for training and testing in the ERP-based target detection, respectively. In addition, a 10-fold cross-validation using both blocks 1 and 2 was performed to further evaluate the classification performance.

To verify the validity of the dataset, the continuous EEG data at a sampling rate of 250 Hz were processed by a fourth-order Butterworth filter with a passband of 2-30 Hz. EEG data epochs were extracted according to the event triggers generated by the stimulus program. In this study, time 0 represented the beginning of each image stimulus period (marked by a trigger), and the EEG data corresponding to each image (namely, one circle) were extracted within the time interval from -200 to 1,000 ms. The waveforms of ERPs and SSVEPs corresponding to target and non-target images were obtained using the averaged EEG data within the time interval of (-200, 1,000) ms.
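The following sketch illustrates this preprocessing chain (band-pass filtering and epoching) on one group file, using the variable names described later in the Data Recording section (EEGdata1, class_labels, trigger_positions). It assumes, without guarantee, that the first row of class_labels and trigger_positions corresponds to block 1 and that trigger positions are sample indices at 250 Hz; readers should verify these details against the Readme.txt distributed with the dataset.

    % Illustrative preprocessing of one group file (assumptions noted in comments).
    load('sub1A.mat');                       % provides EEGdata1, EEGdata2, class_labels, trigger_positions
    fs   = 250;                              % sampling rate of the released continuous data (Hz)
    chan = [1:32 34:42 44:64];               % 62 channels, excluding M1/M2 (channels 33 and 43)
    [b, a] = butter(4, [2 30] / (fs/2), 'bandpass');      % fourth-order Butterworth, 2-30 Hz
    eeg = filtfilt(b, a, double(EEGdata1(chan, :))')';    % zero-phase filtering along time
    pre  = round(0.2 * fs);                  % 200 ms before each trigger
    post = round(1.0 * fs);                  % 1,000 ms after each trigger
    onsets = trigger_positions(1, :);        % assumed: row 1 = block 1, values = sample indices
    labels = class_labels(1, :);             % assumed: row 1 = block 1; 1 = target, 2 = non-target
    epochs = zeros(numel(chan), pre + post + 1, numel(onsets));
    for k = 1:numel(onsets)
        epochs(:, :, k) = eeg(:, onsets(k) - pre : onsets(k) + post);   % one circle per epoch
    end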
FIGURE 1 | The time course of the rapid serial visual presentation (RSVP) paradigm.

FIGURE 2 | The parameter settings of the experiment for each group.

Target Classification
Single-circle EEG data were first processed by spatial filtering methods, and then target detection was realized by classification algorithms. Four spatial filtering methods, namely, common spatial pattern (CSP), signal-to-noise ratio maximizer (SIM), task-related component analysis (TRCA), and principal component analysis (PCA) whitening, were compared in this study. The effects of the number of components (from 1 to 50) of the different spatial filtering methods on the classification performance were compared. The performance of spatial filtering was evaluated by the subsequent classification results of the classical hierarchical discriminant component analysis (HDCA) algorithm, which was adopted as a baseline measure of classification performance for single-circle EEG between target and non-target images (Gerson et al., 2006; Sajda et al., 2010). As a classical classification method widely used in RSVP-BCIs, the HDCA algorithm realizes target image recognition based on spatial and temporal projection features of ERP signals. EEG data were first divided into 100-ms data segments, and then feature extraction and classification were conducted according to the spatial and temporal characteristics of the data segments.

To evaluate the performance of the classification methods, four classification algorithms, namely, support vector machine (SVM), spatially weighted Fisher linear discriminant (FLD)-PCA (SWFP), discriminative canonical pattern matching (DCPM), and HDCA, were compared based on this dataset.
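The paper does not give implementation details for the spatial filters in this section, but the simplest of the four, PCA whitening, can be sketched as follows. Here 'epochs' is the channels × samples × circles array from the preprocessing sketch above, and the number of retained components is a free parameter (1-50 in the comparison described here); this is an illustrative version, not the authors' exact implementation.

    % Illustrative PCA-whitening spatial filter.
    X = reshape(epochs, size(epochs, 1), []);            % channels x (samples * circles)
    X = bsxfun(@minus, X, mean(X, 2));                   % remove each channel's mean
    C = (X * X') / (size(X, 2) - 1);                     % channel covariance matrix
    [V, D] = eig(C);
    [d, order] = sort(diag(D), 'descend');               % sort components by explained variance
    V = V(:, order);
    nComp = 30;                                          % number of retained components (30 is used later for Figure 5)
    W = diag(1 ./ sqrt(d(1:nComp))) * V(:, 1:nComp)';    % whitening projection (nComp x channels)
    filtered = zeros(nComp, size(epochs, 2), size(epochs, 3));
    for k = 1:size(epochs, 3)
        filtered(:, :, k) = W * epochs(:, :, k);         % spatially filtered single-circle data
    end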
The EEG data used for single-circle classification were the data in the time interval of [0, t] ms, where t was 200, 300, ..., 1,000 ms. The SIM algorithm was used as the basic spatial filtering method before the performance comparison of the four classification algorithms.

Performance Evaluation
R-square values for each time point were used to show the separability between target and non-target stimuli. For each subject, we selected all the target data and an equal amount of randomly selected non-target data to calculate the r-square values. For each time point, the input was composed of two one-dimensional vectors, containing the target data and the non-target data, respectively. The r-square values of each subject were calculated, and the r-square values of all subjects were averaged to obtain the final results, as shown in Figure 3B. Classification performance of single-circle EEG data for target and non-target circles was measured using the area under the receiver operating characteristic (ROC) curve (Fawcett, 2006). ROC curves are used when applications have an unbalanced class distribution, which is typically the case with RSVP-BCI, where the number of target stimuli is much smaller than that of non-target stimuli.

Statistical Analysis
Statistical analyses were conducted using SPSS software (IBM SPSS Statistics, IBM Corporation). One-way repeated-measures analysis of variance (ANOVA) was used to test the difference in the classification performances among different algorithms. The Greenhouse-Geisser correction was applied if the data did not conform to the sphericity assumption by Mauchly's test of sphericity. All pairwise comparisons were Bonferroni corrected. Statistical significance was defined as p < 0.05.
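A minimal sketch of the two evaluation measures described above (the per-time-point r-square index and the single-circle AUC), applied to the epoched data from the earlier sketches ('epochs', 'labels', 'pre', 'fs'). The channel index and the naive amplitude-based score are placeholders used only to make the example self-contained; they are not the features used in the paper.

    % Point-biserial r-square per time point for one channel (illustrative).
    ch  = 1;                                             % placeholder channel index
    tgt = squeeze(epochs(ch, :, labels == 1));           % samples x target circles
    nt  = squeeze(epochs(ch, :, labels == 2));           % samples x non-target circles
    nt  = nt(:, randperm(size(nt, 2), size(tgt, 2)));    % equal amount of randomly selected non-target data
    r2  = zeros(1, size(tgt, 1));
    for t = 1:size(tgt, 1)
        x = [tgt(t, :), nt(t, :)];                                   % amplitudes at this time point
        y = [ones(1, size(tgt, 2)), zeros(1, size(nt, 2))];          % class membership
        r = corrcoef(x, y);
        r2(t) = r(1, 2)^2;                                           % squared correlation
    end

    % AUC of a (naive) single-circle score: mean amplitude 250-500 ms after onset.
    scores = squeeze(mean(epochs(ch, pre + round(0.25*fs) : pre + round(0.5*fs), :), 2));
    [~, ~, ~, auc] = perfcurve(double(labels(:) == 1), scores(:), 1);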
FIGURE 3 | Waveforms and amplitude spectrum of EEG. (A) The temporal waveforms of EEG [including target images, non-target images, and event-related potential (ERP) data] and the scalp topographies of amplitudes of ERP. (B) The r-square values in channels FPz, Cz, and Oz. (C) Spectral characteristics of EEG for target and non-target images.
DATA RECORDING

EEG Data
The dataset is freely available at http://bci.med.tsinghua.edu.cn/download.html. The dataset consists of the raw continuous data without any processing. It contains 128 MATLAB MAT files corresponding to data from all 64 subjects (approximately 15 GB in total). Data were stored as double-precision floating-point values in MATLAB. Each MAT file covers one group of EEG data. There are two sets of EEG data (groups A and B) for subjects 1 to 64, and the files were named by subject and group indices (i.e., sub1A.mat, sub1B.mat, ..., sub64A.mat, sub64B.mat). For each file, the data loaded in MATLAB generate two 2-D matrices named "EEGdata1" (block 1) and "EEGdata2" (block 2) with dimensions of [64, L] (the two dimensions indicate "electrode index" and "time points," respectively) and two 2-D matrices named "class_labels" and "trigger_positions" with dimensions of [2, 4000], which store the class label and the trigger position of each of the group's circles. The parameter L (the number of time points) might differ between blocks. In "class_labels," the values "2" and "1" indicate "non-target images" and "target images," respectively. Each circle corresponds to the EEG data of one visual stimulus image. For each group, the data matrix consists of 8,000 circles (100 circles × 40 trials × 2 blocks), and each circle consists of 64 channels of EEG data. A "Readme.txt" file explains the data structure and other task-related information.

Electrode Position
The electrode positions are listed in a "64-channels.loc" file, which contains all channel locations in polar coordinates. The information for each electrode contains four columns: "Electrode Index," "Degree," "Radius," and "Label." For example, the information for the first electrode is ("1," "-18," "0.51111," "FP1"), which indicates that the degree is -18 and the radius is 0.51111 for the first electrode (FP1). The electrode file can be used for topographic analysis with the topoplot() function in the EEGLAB toolbox (Delorme and Makeig, 2004).

TECHNICAL VALIDATIONS

Temporal Waveform and Amplitude Spectrum Analysis
To evaluate the signal quality of the dataset, this study analyzed the temporal waveform and amplitude spectrum of EEG across all subjects. EEG data were re-referenced to the average of all electrodes. Figure 3A shows the temporal waveform of the averaged EEG across all subjects. Three representative midline electrodes (FPz, Cz, and Oz) were selected for the temporal waveform display. For each subject, all EEG data corresponding to target and non-target images were averaged. Then, the averaged target and non-target EEG data for each subject were averaged across all subjects. Finally, the cross-subject averaged EEG data corresponding to the non-target images were subtracted from those of the target images to generate the target-related ERP, as shown in Figure 3A. To better observe the temporal characteristics of the SSVEPs, the data were band-pass filtered between 2 and 30 Hz within the time window from -200 to 1,000 ms.

The EEG signals in this dataset were sensitive to target and non-target image stimuli, and the difference in the evoked EEG between the target and non-target image stimuli could be reflected by the ERP components within a short data length at specific brain regions. Figure 3A shows the temporal waveforms of EEG for target images, non-target images, and target-related ERP data. The waveform for non-target EEG is a near-sinusoidal signal at 10 Hz with the characteristics of an SSVEP. The frequency and phase of the SSVEPs are stable over the 1.2-s stimulation time. The waveforms of the ERP at FPz and Oz showed obvious P300 (FPz: 3.18 µV, Oz: 2.54 µV) and N400 (FPz: -3.49 µV, Oz: -1.29 µV) components. The latencies of the P300 and N400 components in the prefrontal cortex were clearly smaller than those in the occipital cortex. For example, the latencies of the P300 component at FPz and Oz were 272 and 336 ms, while the latencies of the N400 component were 448 and 484 ms, respectively. The ERP signal at Cz showed an obvious negative peak around 300 ms (latency: 328 ms, amplitude: -1.29 µV). From the scalp topographies of the ERP amplitudes in Figure 3A, it can be seen that the areas highly sensitive to the ERP response were mainly located in the occipital and prefrontal regions. For example, these two regions showed significant positive potentials at 300 ms and negative potentials at 400 and 500 ms. The sensitivity of ERPs at the parietal electrodes to the stimulation of target images was limited, partly because these electrodes were close to the reference electrode.

The results of the r-square values indicated the separability between target and non-target stimuli, as shown in Figure 3B. R-square values indicate the importance of features: the larger the value, the greater the contribution to classification. In the time range of 0-200 ms, the r-square values of the three channels were close to 0, which indicated that the features did not contain information valid for classification. After 200 ms, the r-square values of the three channels increased significantly, which was consistent with the emergence of the main components of the ERP. For example, the r-square value of Oz reached its maximum value (0.07) at 340 ms, and at the same time, the ERP of Oz also reached its peak value (2.54 µV). Similar results were also found in Cz and Oz. These results indicated that the emergence of the main components of the ERP was accompanied by a greater separability between target and non-target stimuli, and that the ERP was a potentially effective classification feature. Compared with Cz, the r-square values of FPz and Oz were larger, indicating that FPz and Oz contained more effective information and contributed more to classification.

The results of Figure 3 indicate that the rapid periodic stimulation in RSVP produces a brain response characterized by a "quasi-sinusoidal" waveform whose frequency components are constant in amplitude and phase. Figure 3C illustrates the amplitude spectra of EEG evoked by target and non-target images. EEG data were first averaged across all subjects, and then the spectra were calculated by the fast Fourier transform (FFT) method. As the temporal waveforms in Figure 3A have shown, the non-target EEG is a quasi-sinusoidal signal with stable frequency and phase,
and its amplitude spectrum shows clear peaks at the 10-Hz stimulation frequency and its harmonics (second harmonic: 0.30 µV, third harmonic: 0.10 µV). Since the signals were filtered from 2 to 30 Hz, the amplitudes at frequencies above the fourth harmonic were close to 0. Figure 3C also illustrates the scalp topographies of the amplitudes of the target and non-target SSVEP at 10 Hz and its harmonic frequencies. Consistent with previous studies (Gao et al., 2014; Chen et al., 2015a), the occipital area shows the highest amplitude of SSVEPs. In addition to the occipital area, lower amplitudes can also be observed at the prefrontal area for components related to the stimulus frequency (at 10 and 20 Hz). These characteristics show very robust and reliable frequency features for the fundamental and harmonic SSVEP components.

Evaluating the Performance of Spatial Filtering Methods
Figure 4 shows the effect of the number of components on the classification performance of the four spatial filtering methods (block 1 for training, block 2 for testing). A one-way repeated-measures ANOVA showed a statistically significant difference in accuracies among the four spatial filtering methods for the component numbers of 1 [F(2.110,268.008) = 4.648, p = 0.009] and from 2 to 50 (p < 0.001). Pairwise comparisons showed that the classification accuracies of TRCA were significantly higher (p < 0.05) than those of CSP for the component numbers from 2 to 50 and were significantly higher than those of SIM and PCA whitening for the component numbers from 1 to 6. The classification accuracies of SIM and PCA whitening were significantly higher (p < 0.05) than those of CSP for the component numbers from 6 to 50 and were significantly higher than those of TRCA for the component numbers from 11 to 50.

FIGURE 4 | The effect of the number of components on classification performance (block 1 for training, block 2 for testing).

Figure 5 shows the results of classification performance for the four spatial filtering methods with different data lengths of EEG. The number of components for the four spatial filtering methods was set to 30. Two validation methods were used, that is, block 1 for training and block 2 for testing (Figure 5A) and a 10-fold cross-validation using both blocks 1 and 2 (Figure 5B). For each spatial filtering method, the classification accuracy increased obviously as the data length increased when it was less than 500 ms. For example, in Figure 5A, the average results of SIM for all subjects were 67.7% ± 7.3%, 80.5% ± 8.8%, 88.1% ± 8.2%, and 91.1% ± 7.2% with the data length from 200 to 500 ms, respectively. The changes in accuracy were no longer significant when the length of EEG data increased to 600 ms and above.

FIGURE 5 | Performance of different data lengths for the spatial filtering methods. (A) Block 1 for training and block 2 for testing. (B) Result of 10-fold cross-validation using both blocks 1 and 2.

In addition, there was a significant difference in the classification performance among the different spatial filtering methods. The CSP method corresponded to the worst classification performance, followed by the TRCA method. The SIM and PCA whitening methods had higher classification performance, with no statistically significant difference between them. For example, in Figure 5A, the classification results were 77.2% ± 10.1%, 78.3% ± 9.4%, 80.5% ± 8.8%, and 80.7% ± 8.8% for the data length of 300 ms in the conditions of CSP, TRCA, SIM, and PCA whitening, respectively. The statistical difference among CSP, SIM, and TRCA was no longer significant when the data length was more than 500 ms. Meanwhile, the high classification results based on EEG with short data lengths indicated that the dataset was collected in a well-designed experimental environment and that the collected EEG data were of high quality.

The 10-fold cross-validation method showed results similar to the original verification method by blocks, i.e., SIM and PCA whitening performed best among the four spatial filtering methods, and HDCA was the best among the four classification methods. The difference between the two validation methods was that the accuracies of the 10-fold cross-validation method were slightly higher, and its variances slightly smaller, than those of the method by blocks. For example, the classification results for CSP, SIM, TRCA, and PCA whitening were 66.0% ± 7.0%, 70.2% ± 7.1%, 67.9% ± 7.0%, and 70.3% ± 7.1% and 63.4% ± 7.1%, 67.7% ± 7.3%, 65.4% ± 7.3%, and 67.8% ± 7.3% for the 10-fold cross-validation method and the validation method by blocks, respectively. This was due to the fact that the 10-fold cross-validation method used more data for training than the original verification method by blocks. Since the two validation methods showed similar results, we only chose the classification results of the validation method by blocks to perform the statistical analysis in this study.

A one-way repeated-measures ANOVA showed that there was a statistically significant difference in accuracies among the four spatial filtering methods for the data length of 200 ms [F(1.326,168.403) = 76.929, p < 0.001], 300 ms [F(1.324,168.179) = 115.527, p < 0.001], 400 ms [F(1.204,152.967) = 128.453, p < 0.001], 500 ms [F(1.333,169.256) = 124.089, p < 0.001], 600 ms [F(1.247,158.402) = 131.426, p < 0.001], 700 ms [F(1.248,158.528) = 101.262, p < 0.001], 800 ms [F(1.409,178.955) = 100.214, p < 0.001], 900 ms [F(1.404,178.285) = 99.643, p < 0.001], and 1,000 ms [F(1.350,171.387) = 102.250, p < 0.001]. Pairwise comparisons showed that the classification accuracies of SIM and PCA whitening were significantly higher (p < 0.001) than those of CSP and TRCA for the data length from 200 to 1,000 ms. The classification accuracies of TRCA were significantly higher (p < 0.01) than those of CSP for the data length from 200 to 300 ms and were significantly lower (p < 0.001) than those of CSP for the data length from 400 to 1,000 ms. There was no significant
difference between SIM and PCA whitening for the performance of classification.

Evaluating the Performance of Classification Methods
In addition to the evaluation of spatial filtering methods, the dataset can also be used to evaluate the performance of classification methods. Figure 6 shows the performance of different classification methods with EEG data lengths from 200 to 1,000 ms. After preprocessing with the SIM method, the EEG data for each image were classified by four different algorithms, including SVM, SWFP, DCPM, and HDCA. SVM finds a separating hyperplane that maximizes the margin between the two classes. SWFP is based on a two-step linear classification of event-related responses using an FLD classifier and PCA for dimensionality reduction (Alpert et al., 2014). DCPM performs well in classifying miniature asymmetric visual evoked potentials (aVEPs) by first suppressing the common-mode noise of the background EEG and then recognizing canonical patterns of ERPs (Xiao et al., 2020). Two validation methods were used, that is, block 1 for training and block 2 for testing (Figure 6A), and a 10-fold cross-validation using both blocks 1 and 2 (Figure 6B). As shown in Figure 6A, HDCA had the best classification performance, while the other three algorithms had approximately similar classification performance. This was especially true when the data length was less than 500 ms. For example, the AUC results for HDCA were 67.7% ± 7.3%, 80.5% ± 8.8%, 88.1% ± 8.2%, and 91.1% ± 7.2% for single-circle EEG classification between target and non-target images with data lengths of 200, 300, 400, and 500 ms, respectively. When the data length was greater than 500 ms, the performance of the four classification algorithms was similar, while the classification performance of the HDCA algorithm was still the best. Figure 6B shows results similar to Figure 6A; the only difference was that the SVM method performed the worst.

A one-way repeated-measures ANOVA based on the validation method by blocks showed that there was a statistically significant difference in accuracies among the four classification methods for the data length of 200 ms [F(2.124,269.799) = 144.651, p < 0.001], 300 ms [F(1.942,246.670) = 55.645, p < 0.001], 400 ms [F(2.095,266.046) = 42.243, p < 0.001], 500 ms [F(2.183,277.251) = 38.436, p < 0.001], 600 ms [F(2.362,299.935) = 35.408, p < 0.001], 700 ms [F(3,381) = 27.146, p < 0.001], 800 ms [F(2.820,358.107) = 33.019, p < 0.001], 900 ms [F(2.601,330.287) = 29.985, p < 0.001], and 1,000 ms [F(3,381) = 32.344, p < 0.001]. Pairwise comparisons showed that the classification accuracies of HDCA were significantly higher (p < 0.001) than those of SVM, SWFP, and DCPM for the data length from 200 to 1,000 ms. The classification accuracies of SVM were significantly higher (p < 0.05) than those of SWFP for the data length from 400 to 1,000 ms and were significantly higher (p < 0.05) than those of DCPM for the data length from 900 to 1,000 ms. The classification accuracies of DCPM were significantly higher (p < 0.01) than those of SWFP for the data length from 300 to 1,000 ms.

Evaluating the Performance of Cross-Subject Zero-Training Methods
The dataset can be used to study zero-training classification methods for RSVP-based BCIs. To improve the performance of the system, most of the current RSVP-based BCIs adopt supervised feature extraction and classification algorithms that require system calibration. The long time required for training data collection and algorithm template extraction brings challenges to system practicability and user experience. Benefiting from the large scale of the dataset, which contains a total of 64 subjects, 10,240 trials, 1,024,000 image circles, and 102,400 s of 64-channel EEG data, it is possible to extract common information of EEG for target classification. A cross-subject strategy can be used to design zero-training algorithms suitable for target identification in the RSVP paradigm.

In this paper, the dataset was used to design a zero-training classification algorithm based on a cross-subject template. The performance was estimated using a leave-one-subject-out cross-validation. EEG data of each subject were trained separately to obtain his or her algorithm template parameters for the HDCA algorithm. In the testing session, by using the cross-subject template, the EEG classification performance of one subject was
FIGURE 6 | Performance of different classification methods with different data lengths. (A) Block 1 for training and block 2 for testing. (B) Result of 10-fold cross-validation using both blocks 1 and 2.
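To illustrate the leave-one-subject-out protocol described in the last subsection, the sketch below scores the held-out subject with a simple cross-subject ERP-template correlation instead of the cross-subject HDCA templates used in the paper. getEpochs is a hypothetical helper that wraps the loading/epoching sketch shown earlier and returns epochs (channels × samples × circles) and labels (1 = target, 2 = non-target) for one subject; it is not part of the released dataset or of any toolbox.

    % Leave-one-subject-out, zero-training evaluation (simplified stand-in for HDCA templates).
    nSub = 64;
    aucLoso = zeros(nSub, 1);
    for s = 1:nSub
        tgtSum = 0; ntSum = 0; nTgt = 0; nNt = 0;
        for tr = setdiff(1:nSub, s)                       % build templates from the other subjects
            [ep, lab] = getEpochs(tr);                    % hypothetical loader (see earlier sketch)
            tgtSum = tgtSum + sum(ep(:, :, lab == 1), 3);  nTgt = nTgt + nnz(lab == 1);
            ntSum  = ntSum  + sum(ep(:, :, lab == 2), 3);  nNt  = nNt  + nnz(lab == 2);
        end
        tgtTpl = tgtSum / nTgt;                           % cross-subject target ERP template
        ntTpl  = ntSum  / nNt;                            % cross-subject non-target template
        [ep, lab] = getEpochs(s);                         % held-out subject, no calibration data used
        score = zeros(size(ep, 3), 1);
        for k = 1:size(ep, 3)
            x = ep(:, :, k);
            score(k) = corr(x(:), tgtTpl(:)) - corr(x(:), ntTpl(:));   % template-similarity score
        end
        [~, ~, ~, aucLoso(s)] = perfcurve(double(lab(:) == 1), score, 1);
    end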
REFERENCES

Abibullaev, B., and Zollanvari, A. (2019). Event-related potentials (P300, EEG) - BCI dataset. IEEE DataPort. Kazakhstan: Nazarbayev University. doi: 10.21227/8aae-d579

Alpert, G. F., Manor, R., Spanier, A. B., Deouell, L. Y., and Geva, A. B. (2014). Spatiotemporal representations of rapid visual target detection: a single-trial EEG classification algorithm. IEEE Trans. Biomed. Eng. 61, 2290-2303. doi: 10.1109/TBME.2013.2289898

Bigdely-Shamlo, N., Vankov, A., Ramirez, R. R., and Makeig, S. (2008). Brain activity-based image classification from rapid serial visual presentation. IEEE Trans. Neural Syst. Rehabil. Eng. 16, 432-441. doi: 10.1109/TNSRE.2008.2003381

Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision 10, 433-436. doi: 10.1163/156856897X00357

Chen, X. G., Wang, Y. J., Gao, S. K., Jung, T. P., and Gao, X. R. (2015a). Filter bank canonical correlation analysis for implementing a high-speed SSVEP-based brain-computer interface. J. Neural Eng. 12:046008. doi: 10.1088/1741-2560/12/4/046008

Chen, X. G., Wang, Y. J., Nakanishi, M., Gao, X. G., Jung, T. P., and Gao, S. K. (2015b). High speed spelling with a noninvasive brain-computer interface. Proc. Natl. Acad. Sci. U.S.A. 112, E6058-E6067. doi: 10.1073/pnas.1508080112

Cho, H., Ahn, M., Ahn, S., Kwon, M., and Jun, S. C. (2017). EEG datasets for motor imagery brain-computer interface. Gigascience 6:8. doi: 10.1093/gigascience/gix034

Cohen, M. (2014). Analyzing Neural Time Series Data: Theory and Practice. Cambridge, MA: The MIT Press.

Delorme, A., and Makeig, S. (2004). EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J. Neurosci. Methods 134, 9-21. doi: 10.1016/j.jneumeth.2003.10.009

Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recogn. Lett. 27, 861-874. doi: 10.1016/j.patrec.2005.10.010

Gao, S. K., Wang, Y. J., Gao, X. R., and Hong, B. (2014). Visual and auditory brain-computer interfaces. IEEE Trans. Biomed. Eng. 61, 1436-1447. doi: 10.1109/TBME.2014.2300164

Gerson, A. D., Parra, L. C., and Sajda, P. (2006). Cortically coupled computer vision for rapid image search. IEEE Trans. Neural Syst. Rehabil. Eng. 14, 174-179. doi: 10.1109/TNSRE.2006.875550

Han, C. H., Muller, K. R., and Hwang, H. J. (2020). Brain-switches for asynchronous brain-computer interfaces: a systematic review. Electronics 9:422. doi: 10.3390/electronics9030422

Huang, L. T., Zhao, Y. Q., Zeng, Y., and Lin, Z. M. (2017). BHCR: RSVP target retrieval BCI framework coupling with CNN by a Bayesian method. Neurocomputing 238, 255-268. doi: 10.1016/j.neucom.2017.01.061

Hyvarinen, A., and Oja, E. (2000). Independent component analysis: algorithms and applications. Neural Netw. 13, 411-430. doi: 10.1016/s0893-6080(00)00026-5

Kaya, M., Binli, M. K., Ozbay, E., Yanar, H., and Mishchenko, Y. (2018). A large electroencephalographic motor imagery dataset for electroencephalographic brain computer interfaces. Sci. Data 5:180211. doi: 10.1038/sdata.2018.211

Lee, M. H., Kwon, O. Y., Kim, Y. J., Kim, H. K., Lee, Y. E., Williamson, J., et al. (2019). EEG dataset and OpenBMI toolbox for three BCI paradigms: an investigation into BCI illiteracy. Gigascience 8, 1-16. doi: 10.1093/gigascience/giz002

Lees, S., Dayan, N., Cecotti, H., Mccullagh, P., Maguire, L., Lotte, F., et al. (2017). A review of rapid serial visual presentation-based brain-computer interfaces. J. Neural Eng. 15:021001. doi: 10.1088/1741-2552/aa9817

Lees, S., McCullagh, P., Payne, P., Maguire, L., Lotte, F., and Coyle, D. (2020). Speed of rapid serial visual presentation of pictures, numbers and words affects event-related potential-based detection accuracy. IEEE Trans. Neural Syst. Rehabil. Eng. 28:122. doi: 10.1109/TNSRE.2019.2953975

Lioi, G., Cury, C., Perronnet, L., Mano, M., Bannier, E., Lecuyer, A., et al. (2019). Simultaneous MRI-EEG during a motor imagery neurofeedback task: an open access brain imaging dataset for multi-modal data integration. bioRxiv [Preprint]. doi: 10.1101/862375

Lotte, F., and Guan, C. T. (2011). Regularizing common spatial patterns to improve BCI designs: unified theory and new algorithms. IEEE Trans. Biomed. Eng. 58, 355-362. doi: 10.1109/TBME.2010.2082539

Manor, R., Mishali, L., and Geva, A. B. (2016). Multimodal neural network for rapid serial visual presentation brain computer interface. Front. Comput. Neurosci. 10:130. doi: 10.3389/fncom.2016.00130

Nakanishi, M., Wang, Y. J., Chen, X. G., Wang, Y. T., Gao, X. R., and Jung, T. P. (2018). Enhancing detection of SSVEPs for a high-speed brain speller using task-related component analysis. IEEE Trans. Biomed. Eng. 65, 104-112. doi: 10.1109/TBME.2017.2694818

Sajda, P., Gerson, A., and Parra, L. (2003). "High-throughput image search via single-trial event detection in a rapid serial visual presentation task," in Proceedings of the 1st International IEEE/EMBS Conference on Neural Engineering, Capri, Italy, 7-10. doi: 10.1109/CNE.2003.1196297

Sajda, P., Pohlmeyer, E., Wang, J., Parra, L., Christoforou, C., Dmochowski, J., et al. (2010). In a blink of an eye and a switch of a transistor: cortically coupled computer vision. Proc. IEEE 98, 462-478. doi: 10.1109/JPROC.2009.2038406

Singh, A., and Jotheeswaran, J. (2018). "P300 brain waves instigated semi supervised video surveillance for inclusive security systems," in Advances in Brain Inspired Cognitive Systems: Proceedings of the 9th International Conference, China, 184-194. doi: 10.1007/978-3-030-00563-4_18

Stegman, P., Crawford, C. S., Andujar, M., Nijholt, A., and Gilbert, J. E. (2020). Brain-computer interface software: a review and discussion. IEEE Trans. Hum. Mach. Syst. 50:115. doi: 10.1109/THMS.2020.2968411

Vaineau, E., Barachant, A., Andreev, A., Rodrigues, P. C., Cattan, G., and Congedo, M. (2019). Brain invaders adaptive versus non-adaptive P300 brain-computer interface dataset. arXiv [Preprint]. doi: 10.5281/zenodo.1494163

Wang, Y. J., Chen, X. G., Gao, X. R., and Gao, S. K. (2017). A benchmark dataset for SSVEP-based brain-computer interfaces. IEEE Trans. Neural Syst. Rehabil. Eng. 25, 1746-1752. doi: 10.1109/TNSRE.2016.2627556

Wu, Q. J., Yan, B., Zeng, Y., Zhang, C., and Tong, L. (2018). Anti-deception: reliable EEG-based biometrics with real-time capability from the neural response of face rapid serial visual presentation. Biomed. Eng. Online 17:55. doi: 10.1186/s12938-018-0483-7

Wu, W., and Gao, S. K. (2011). "Learning event-related potentials (ERPs) from multichannel EEG recordings: a spatio-temporal modeling framework with a fast estimation algorithm," in Proceedings of the 33rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS), Boston, MA, 6959-6962.

Xiao, X. L., Xu, M. P., Jin, J., Wang, Y. J., Jung, T. P., and Ming, D. (2020). Discriminative canonical pattern matching for single-trial classification of ERP components. IEEE Trans. Biomed. Eng. 67, 2266-2275. doi: 10.1109/TBME.2019.2958641

Zhang, S. G., and Gao, X. R. (2019). The effect of visual stimuli noise and fatigue on steady-state visual evoked potentials. J. Neural Eng. 16:056023. doi: 10.1088/1741-2552/ab1f4e

Zhang, S. G., Han, X., Chen, X. G., Wang, Y. J., Gao, S. K., and Gao, X. R. (2018). A study on dynamic model of steady-state visual evoked potentials. J. Neural Eng. 15:046010. doi: 10.1088/1741-2552/aabb82

Zhang, S. G., Han, X., and Gao, X. R. (2020). Studying the effect of the pre-stimulation paradigm on steady-state visual evoked potentials with dynamic models based on the zero-pole analytical method. Tsinghua Sci. Technol. 25, 435-446. doi: 10.26599/TST.2019.9010028

Zhao, H. Z., Wang, Y. J., Sun, S., Pei, W. H., and Chen, H. D. (2019). "Obviating session-to-session variability in a rapid serial visual presentation-based brain-computer interface," in Proceedings of the 9th International IEEE/EMBS Conference on Neural Engineering (NER), San Francisco, CA, 174. doi: 10.1109/NER.2019.8716892

Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Zhang, Wang, Zhang and Gao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.