Bridge Graph Attention Based Graph Convolution Network With Multi-Scale Transformer for EEG Emotion Recognition
Abstract—In multichannel electroencephalograph (EEG) emotion recognition, most graph-based studies employ shallow graph models for spatial characteristic learning, because increasing the network depth causes node over-smoothing. To address over-smoothing, we propose the bridge graph attention-based graph convolution network (BGAGCN). It bridges previous graph convolution layers to the attention coefficients of the final layer by adaptively combining each graph convolution output based on the graph attention network, thereby enhancing feature distinctiveness. Considering that graph-based networks primarily focus on local EEG channel relationships, we introduce a transformer for global dependency. Inspired by the neuroscience finding that neural activities at different timescales reflect distinct spatial connectivities, we modify the transformer into a multi-scale transformer (MT) by applying multi-head attention to multichannel EEG signals after 1D convolutions at different scales. MT learns spatial features more elaborately to enhance feature representation ability. By combining BGAGCN and MT, our model BGAGCN-MT achieves state-of-the-art accuracy under subject-dependent and subject-independent protocols across three benchmark EEG emotion datasets (SEED, SEED-IV, and DREAMER). Notably, our model effectively addresses over-smoothing in graph neural networks and provides an efficient solution for learning spatial relationships of EEG features at different scales.

Index Terms—EEG, emotion recognition, graph attention network, multi-scale transformer, over-smoothing.

Our code is available at https://github.com/LogzZ.

I. INTRODUCTION

EMOTION is a fundamental aspect of human cognition, and emotion recognition can enhance human-computer interaction and collaboration [1]. Affective computing aims to enable computers to understand emotional states like humans. Compared to methods that rely on external information, such as facial expressions [2], [3], [4] and body postures [5], the obvious advantage of physiological signals (e.g., electrocardiogram (ECG) [6], electroencephalogram (EEG) [7], and electrodermal activity (EDA) [8]) is that they directly reflect the underlying state of the human nervous system with more reliable characteristics. Among these physiological signals, the EEG signal is particularly suitable for affective computing due to its temporally fine-grained resolution of cognitive psychological processes [9]. The EEG signal is defined as the aggregate response of the electrophysiological activities of neurons on the surface of the cerebral cortex and scalp. Generally, a set of EEG signals is collected by placing multiple electrodes according to the 10–20 system [10], enabling the simultaneous capture of EEG signals from different regions of the brain.
A typical multichannel EEG emotion recognition task comprises two main parts: feature extraction and classification. For each EEG channel, the most commonly used feature extraction method is to decompose the signal into five frequency bands (δ (1–4 Hz), θ (4–8 Hz), α (8–14 Hz), β (14–30 Hz), and γ (30–50 Hz) [11]) and then extract features from each frequency band. Some studies [12], [13] have found that differential entropy (DE) and power spectral density (PSD) are closely associated with emotional processes, making them commonly used features for EEG emotion recognition. Traditional methods usually utilize classical classifiers such as the support vector machine (SVM) and deep belief network (DBN) [13] to classify the extracted features. With the development of deep learning, feature extraction and classification have been merged into one process through neural networks [14], based on raw EEG signals or pre-extracted features.
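To make the pipeline above concrete, the following is a minimal sketch (not the authors' released code) of per-band DE extraction. The sampling rate and filter order are assumptions; the DE formula is the standard Gaussian-case closed form used in [12].

```python
import numpy as np
from scipy.signal import butter, filtfilt

BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 14),
         "beta": (14, 30), "gamma": (30, 50)}

def de_features(segment, fs=200):
    """segment: (n_channels, n_samples) raw EEG; returns (n_channels, 5) DE values."""
    feats = []
    for lo, hi in BANDS.values():
        # Band-pass filter the segment into one of the five frequency bands.
        b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        filtered = filtfilt(b, a, segment, axis=-1)
        # Differential entropy under the Gaussian assumption: 0.5*ln(2*pi*e*var).
        feats.append(0.5 * np.log(2 * np.pi * np.e * filtered.var(axis=-1)))
    return np.stack(feats, axis=-1)
```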
Inspired by the collaborative relationship between brain regions, an increasing number of studies [1] are focusing on mapping the spatial relationships between EEG channels. Recurrent neural networks (RNN) [15], [16], convolution neural networks (CNN) [17], [18], [19], and graph neural networks (GNN) [20], [21], [22] have received much attention for their spatial relationship modeling ability. Some studies [15], [16], [23] predefined sequences of EEG electrodes along different placement directions and input them into RNN models to extract the spatial structure of multichannel EEG. However, reshaping the electrodes for an RNN disrupts the original spatial structure of the EEG channels. CNNs [18], [19] can preserve the spatial structure of EEG channels when extracting spatial features from multichannel EEG data. However, a CNN requires mapping the spatial distribution of EEG channels to a matrix and assigning values to non-existent EEG channels, which may introduce irrelevant noise. Additionally, according to findings in neuroimaging, different brain regions often exhibit unstructured relationships with each other under different emotional states [24].

Graph-based methods are able to cope with unstructured data directly and hold promise for describing the intrinsic spatial relationships between different nodes in a graph [25]. For multichannel EEG emotion recognition, many studies construct the adjacency matrix from different aspects, e.g., Gaussian kernels [26], [27], correlations [28], [29], and traceability methods [30]. However, these studies ignore the functional activities of the brain. Inspired by neuroscience, Song et al. [31] and Ye et al. [22] constructed adjacency matrices according to functional regions of the brain and showed better results. In terms of utilizing the adjacency matrix, an earlier study [20] exploited the advantage of flexibility by updating the adjacency matrix dynamically, and there have been several follow-ups [22], [32]. Recently, considering the high stability and low computational complexity of static adjacency matrices, some studies [28], [29], [30] employed static adjacency matrices in GCNs for spatial learning of multichannel EEG and achieved better performance.

However, the above graph-based methods predominantly employ shallow GCNs to investigate the spatial relationships among EEG channels, which raises the issue of insufficient representational ability. Simply increasing the depth of a GCN takes more neighboring nodes into account, which can lead to the over-smoothing problem [33].

In EEG emotion recognition, most studies have attempted to address the insufficient representation ability by combining shallow GCNs with other models. Zhang et al. [34] and Lin et al. [35] connected a graph convolution layer and multiple 1D convolution layers to construct deeper models for EEG emotion recognition. Ye et al. [22] proposed a hierarchical dynamic GCN and paralleled it with two 1D CNN models to achieve improved performance. However, the issue of over-smoothing in graph methods for EEG emotion recognition remains unresolved. In fact, the essence of graph convolution can be seen as a localized filter; applying it repeatedly mixes neighboring nodes and makes the graph more confusing. This especially holds for small-scale datasets with fewer nodes [25], such as EEG datasets.

Some studies have proposed solutions to deal with over-smoothing. GraphSAGE [36] aggregates neighbor nodes to downsample nodes, which can be viewed as dropping nodes to mitigate over-smoothing [37]. GraphSAGE has been applied to fake news assessment by using the inclusive relationship of text to construct the graph. Liu et al. [38] generalize GraphSAGE to multiple traffic graphs by incorporating their temporal correlation. DropEdge [37] randomly drops a percentage of edges, which is a classic strategy for addressing over-smoothing. Some studies improved DropEdge by setting the dropping probability with attention mechanisms [39] or edge importance [40]. However, applying node or edge dropping to graph-based EEG emotion recognition loses the interpretability of brain connectivity. Motivated by CNNs, Li et al. [25] computed the proximity similarity of point clouds to build a graph and introduced skip connections into GCN, partially mitigating over-smoothing. However, feature maps at different layers have different importance [41], and skip connections lack discriminability across different depths of the network.

To tackle the above issue, this paper proposes the bridge graph attention based GCN, namely BGAGCN, for EEG emotion recognition. BGAGCN consists of two parts: the deep graph convolution path and the bridge graph attention (BGAT) block. Skip layer connections are performed in the deep graph convolution path, which allows the model to go deeper and learn high-order topological representations from multichannel EEG. The BGAT block bridges previous graph convolution layers to the attention coefficients of the final graph convolution layer by adaptively combining each graph convolution output based on the graph attention network (GAT). BGAGCN captures the varying importance of spatial features at different depths of the GCN and enriches the node information of the final layer to overcome over-smoothing. Since the activities of the brain include both regional and global activity, similar to previous studies [22], [31], this paper also uses multi-graph data to simulate these two types of brain activity.
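As a purely illustrative, speculative sketch of the bridging idea described above (the paper's exact BGAT formulation is not reproduced in this excerpt), one can score every layer's output with a learned head, normalize the scores over depth, and fuse the outputs into the final node features. The scoring head and all shapes below are assumptions, not the authors' design.

```python
import torch
import torch.nn as nn

class BridgeFusion(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # hypothetical per-layer scoring head

    def forward(self, layer_outputs):          # list of L tensors, each (B, C, dim)
        stack = torch.stack(layer_outputs, dim=2)        # (B, C, L, dim)
        alpha = torch.softmax(self.score(stack), dim=2)  # attention over depth
        return (alpha * stack).sum(dim=2)                # (B, C, dim) fused features
```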
Graph-based methods can be considered as building the dependency among EEG channels through spatial relationships of local regions in the graph space. However, during the generation of emotional EEG, neurons in the brain are stimulated to generate neural oscillations across the whole cerebral cortex [42], which means that multichannel EEG also has non-local characteristics. The transformer [43] is capable of establishing dense correlations to explore non-local characteristics among tokens and has achieved state-of-the-art (SOTA) results in image classification and natural language processing. Recently, the transformer has also been employed for constructing global dependencies of EEG in different domains such as the time domain [44], frequency domain [45], and spatial domain [46], [47]. Among them, Sun et al. [46] combined a transformer with graph learning to build the global spatial dependency of multichannel EEG for emotion recognition, which demonstrated that the transformer and graph learning are complementary to each other. However, these studies have all overlooked the impact of multi-scale EEG on the spatial relationships of multichannel EEG.
Inspired by the neuroscience finding that brain activity at different timescales shows distinct spatial characteristics [48], we propose the multi-scale transformer (MT) in our model to remedy this gap. In MT, we regard each EEG channel in each sample as a token and introduce multi-scale convolution to act on each token for feature extraction. Subsequently, multi-head attention (MHA) is employed to establish global spatial dependency within different scales among all EEG channels. MT captures different spatial information from all the EEG channels at different scales after graph-based learning.
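As an illustration of this design (not the paper's exact architecture — the kernel sizes, feature width, and pooling below are assumptions; only the three attention heads match the stated setting), per-scale 1D convolution followed by channel-wise multi-head attention can be sketched as:

```python
import torch
import torch.nn as nn

class MultiScaleAttention(nn.Module):
    def __init__(self, dim=48, heads=3, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv1d(1, dim, k, padding=k // 2) for k in kernel_sizes])
        self.attns = nn.ModuleList(
            [nn.MultiheadAttention(dim, heads, batch_first=True)
             for _ in kernel_sizes])

    def forward(self, x):                      # x: (B, C, D) per-channel features
        B, C, D = x.shape
        outs = []
        for conv, attn in zip(self.convs, self.attns):
            h = conv(x.reshape(B * C, 1, D))   # 1D conv on each token at this scale
            h = h.mean(-1).reshape(B, C, -1)   # pool -> one token per EEG channel
            outs.append(attn(h, h, h)[0])      # MHA over channels within the scale
        return torch.cat(outs, dim=-1)         # concatenate scale-specific features
```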
The contributions are summarized as follows:
1) The proposed BGAGCN treats the information of different depths in the GCN distinctly during integration, and it outperforms other GNN methods in mitigating over-smoothing by producing more distinguishable features in the final layer of the GCN.
2) Combining BGAGCN and MT, our model BGAGCN-MT has a strong representation capability that learns global spatial dependency at different timescales, and it surpasses the SOTA EEG emotion recognition methods.
3) The features extracted at different convolution scales in MT exhibit distinct spatial representations, which is consistent with discoveries in neuroscience.

II. RELATED WORK

To address the sequencing issues, CNN is introduced as an alternative method to learn the spatial features among multichannel EEG. The first step of this method is to map the multichannel EEG into a matrix and then employ 2D convolution to extract spatial features. Shen et al. [19] proposed a compact mapping of the 62 EEG channels into an 8×9 matrix for CNN learning. Li et al. [18] mapped the 62 EEG channels into a bigger 20×20 matrix to cope with the information loss caused by pooling in deep CNNs. Furthermore, Cui et al. [17] mapped the 32 EEG channels into an 8×8 matrix and calculated the differences of paired EEG channels to obtain a difference matrix; a CNN is finally employed to capture asymmetric features of the brain from the difference matrix. Although CNNs can preserve the spatial structure of multichannel EEG, during the mapping process nonexistent EEG channels need to be zeroed or interpolated. This introduces spatial noise that is irrelevant to the actual spatial structure of multichannel EEG, as the sketch below makes concrete.
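A toy sketch of this zero-filling, with a hypothetical three-electrode layout (the actual 8×9 and 20×20 mappings of [19] and [18] place every real electrode; the coordinates here are assumptions for illustration):

```python
import numpy as np

GRID_POS = {"FP1": (0, 3), "FPZ": (0, 4), "FP2": (0, 5)}  # assumed coordinates

def to_grid(features, rows=8, cols=9):
    """features: dict channel -> value; returns a zero-filled (rows, cols) map."""
    grid = np.zeros((rows, cols))
    for ch, (r, c) in GRID_POS.items():
        grid[r, c] = features[ch]
    # Every cell without an electrode stays 0 -- the noise source noted above.
    return grid
```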
Fig. 1. The framework of BGAGCN-MT for multichannel EEG emotion recognition. BGAGCN-MT consists of two main parts. The first part is the BGAGCN,
which aims to deepen the graph network while overcoming node over-smoothing. The second part is the MT, which is designed to learn non-local spatial features
and simulate brain activity at different scales with different spatial characteristics to improve EEG emotion recognition.
strengthening the feature representations. Du et al. [55] employed four 1D CNNs of different depths to extract multi-dimensional features of each EEG channel; the features under all dimensions were concatenated and then fed into a single-layer GCN for learning the spatial relationships of multichannel EEG data. Although the studies mentioned above achieved decent performance by employing shallow GCNs or combining them with other networks, the issue of over-smoothing in deep GCNs has been overlooked. Inspired by the attention mechanism, we consider the important information of each layer in the GCN and reinforce the importance of emotion-related nodes to overcome the over-smoothing problem.

Basically, a transformer is composed of multiple encoders and decoders. Both the encoder and decoder have the same structure, which includes a multi-head self-attention layer, an add-and-normalize layer, and a feed-forward layer. The transformer was originally designed to solve the sequence-related problems of RNNs in natural language processing (NLP). Vaswani et al. [43] first proposed the transformer model based on the multi-head attention mechanism, which effectively addressed the non-parallelization problem in NLP models. Benefiting from the transformer's ability to build dense dependencies between tokens, the transformer has also achieved great success in image classification [56] and object detection [57].

In EEG emotion recognition, some studies also use transformers in different domains for classification tasks. Gong et al. [45] employed a transformer after spatial and spectral learning to construct the temporal global dependence of EEG. Wang et al. [47] proposed a hierarchical transformer to capture both channel-level and region-level spatial dependencies from multichannel EEG. Ma et al. [44] employed three transformers to perceive global dependencies of multichannel EEG from multiple domains, including the spatial, spectral, and frequency domains. Sun et al. [46] applied a transformer after dual-branch graph-based learning, which demonstrates that the transformer can enhance the graph-based spatial features of multichannel EEG and improve the accuracy of emotion recognition.

However, the above studies construct spatial dependence only at a single scale of EEG, which overlooks the impact of temporal scales on spatial characteristics.

III. THE PROPOSED MODEL

In this section, we introduce the proposed model, including the construction of the multigraph, BGAGCN, and MT. The overall framework of our model is shown in Fig. 1.

In order to record EEG signals for emotion recognition, emotion induction experiments composed of multiple trials are conducted for each subject. The recorded EEG signal from each trial is decomposed into different frequency bands. For each band, the entire EEG signal is divided into one-second segments, and then feature extraction methods, such as DE and PSD, are applied to each segment.

The extracted features from a set of EEG signals in one trial are of size C × F × T, where C is the number of electrodes (channels), F denotes the number of frequency bands, and T is the number of segments. To obtain sufficient samples for training, the features in each trial are split into non-overlapping segments of temporal length D. The final segment is discarded if its length is less than D. We denote each segment as X ∈ R^{C×F×D}. When only a single frequency band of the feature is utilized, the size of the input feature is C × 1 × D, and it can be further reshaped into size C × D by squeezing the frequency dimension. When all the frequency bands are used, we flatten the last dimensions of an input feature and obtain X ∈ R^{C×FD}. Subsequently, X is right-multiplied by a learnable transformation matrix to fuse the emotional information among different frequency bands:

  X̃ = XP + B.   (1)
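A minimal sketch of (1) in PyTorch, with SEED-style dimensions assumed for illustration (the output width of P is an assumption):

```python
import torch
import torch.nn as nn

C, F, D = 62, 5, 10                  # channels, bands, segment length (assumed)
X = torch.randn(C, F * D)            # input feature with band/time dims flattened
P = nn.Parameter(0.02 * torch.randn(F * D, D))  # learnable transformation matrix
B = nn.Parameter(torch.zeros(C, D))             # learnable bias
X_tilde = X @ P + B                  # Eq. (1): fuses information across the bands
```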
Fig. 2. The region partition for 62-channel EEG and 14-channel EEG. The
EEG channels with the same color represent the same region.
B. Construction of Multigraph

According to neuroscience [54], the activity of the brain can be regarded as two types of circuits: global dynamics and regional functional connectivity. Inspired by this, we propose a multigraph method with global and local parts to model the spatial relationships of EEG channels.

For each sample, we construct a global graph G_G = (V, E_G), where V = {v_1, v_2, ..., v_C} represents the set of nodes (EEG channels), i.e., v_i is the ith row of X̃, and E_G denotes the set of edges {(v_i, v_j) | i, j ∈ (1, C)}. Suppose A^G ∈ R^{C×C} is the adjacency matrix of G_G. The Gaussian kernel is commonly used for defining an adjacency matrix but may result in noisy connections. It is typically believed that the connection between two variables is stronger when they exhibit a high correlation and a small spatial distance. Therefore, we modify the Gaussian kernel method by incorporating the Pearson correlation coefficient (PCC) and the Manhattan distance (MD) to filter out noisy connections. The PCC provides correlation information about two variables, and the MD measures the distance between two vectors. The element in the ith row and jth column of A^G is defined as follows:

  a^G_{ij} = exp(−‖v_i − v_j‖₂² / 2σ²)  if ρ_{i,j} ≥ ρ_τ and m_{i,j} ≤ m_τ, and a^G_{ij} = 0 otherwise,   (2)

where σ is the kernel size, and ρ_τ and m_τ are the correlation and distance thresholds, respectively. The mathematical definitions of the PCC and MD are given as follows:

  ρ_{i,j} = cov(v_i, v_j) / (σ_{v_i} σ_{v_j}),   (3)

  m_{i,j} = Σ_d |v_i(d) − v_j(d)|.   (4)

For the regional functional graph, the EEG channels are partitioned into brain regions (Fig. 2), and the element in the ith row and jth column of A^R is defined as follows:

  a^R_{ij} = a^G_{ij}  if v_i and v_j are in the same region, and a^R_{ij} = 0 otherwise.   (5)
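A minimal sketch of (2)–(5); the threshold and kernel values below are placeholders, not the paper's tuned settings:

```python
import numpy as np

def global_adjacency(V, sigma=1.0, rho_tau=0.3, m_tau=50.0):
    """V: (C, D) node features; returns the PCC/MD-filtered Gaussian adjacency A^G."""
    C = V.shape[0]
    rho = np.corrcoef(V)                                 # Pearson correlation, Eq. (3)
    A = np.zeros((C, C))
    for i in range(C):
        for j in range(C):
            md = np.abs(V[i] - V[j]).sum()               # Manhattan distance, Eq. (4)
            if rho[i, j] >= rho_tau and md <= m_tau:     # keep non-noisy connections
                d2 = ((V[i] - V[j]) ** 2).sum()
                A[i, j] = np.exp(-d2 / (2 * sigma ** 2))  # Gaussian kernel, Eq. (2)
    return A

def region_adjacency(A_G, region_of):
    """Keep only intra-region edges, per Eq. (5). region_of: (C,) region indices."""
    mask = region_of[:, None] == region_of[None, :]
    return np.where(mask, A_G, 0.0)
```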
Fig. 3. The framework of the BGAGCN. GCL is the abbreviation of the graph convolution layer.

C. Bridge Graph Attention Based GCN

To deepen the GCN and solve the node over-smoothing problem at the same time, we design BGAGCN. As shown in Fig. 3, BGAGCN consists of two parts: the deep graph convolution path (DGCP) and the bridge graph attention (BGAT) block.

Given the adjacency matrix A of a graph G, the normalized Laplacian matrix can be expressed as L = I_n − D^{−1/2} A D^{−1/2}, where I_n ∈ R^{C×C} is an identity matrix and D ∈ R^{C×C} is the degree matrix. Since the operations for the global graph and the regional functional graph are the same, we abuse L to denote the normalized Laplacian matrix of either of them in the following.

The deep graph convolution path contains L1 graph convolution layers. K-order Chebyshev polynomials [58] are employed in each convolution layer. Given the input Ĥ_l of the lth Chebyshev graph convolution layer, the corresponding output Ĥ_{l+1} ∈ R^{C×D} is given by:

  Ĥ_{l+1} = σ1( Σ_{k=0}^{K−1} T_k(L) Ĥ_l W_l^k ),   (6)
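A minimal sketch of the Chebyshev convolution in (6), using the polynomial recurrence T_0(L) = I, T_1(L) = L, T_k(L) = 2L T_{k−1}(L) − T_{k−2}(L); the customary rescaling of L by its largest eigenvalue is omitted for brevity:

```python
import torch
import torch.nn as nn

class ChebConv(nn.Module):
    def __init__(self, in_dim, out_dim, K=3):
        super().__init__()
        # One weight matrix W_l^k per Chebyshev order, as in Eq. (6).
        self.weights = nn.ParameterList(
            [nn.Parameter(0.02 * torch.randn(in_dim, out_dim)) for _ in range(K)])
        self.K = K

    def forward(self, H, L):     # H: (C, in_dim), L: (C, C) normalized Laplacian
        Tk_prev, Tk = torch.eye(L.shape[0]), L
        out = Tk_prev @ H @ self.weights[0]          # k = 0 term: T_0(L) = I
        for k in range(1, self.K):
            out = out + Tk @ H @ self.weights[k]     # accumulate T_k(L) H W_l^k
            Tk_prev, Tk = Tk, 2 * L @ Tk - Tk_prev   # Chebyshev recurrence
        return torch.relu(out)                       # sigma_1 in Eq. (6)
```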
TABLE I
COMPARISONS OF THE RESULTS (ACC(%) / STD(%)) OF DIFFERENT METHODS FOR SUBJECT-DEPENDENT EXPERIMENTS ON SEED DATASET
For the subject-independent experiments on the three datasets, the training and testing data are obtained from different subjects. We apply leave-one-subject-out cross-validation as described in [16], [67] to evaluate the performance of the proposed model. Specifically, the data from one subject are used for testing and the data from the remaining subjects are used for training. This process is repeated for each subject, and the results are averaged over all subjects.
C. Implementation Details

The number of graph convolution layers L1 is set to 9, and the Chebyshev kernel size K is set to 3. The number of self-attention heads in BGAT is set to 3. The number of attention heads H_A and the number of layers L2 in MT are set to 3 and 6, respectively. Adam is utilized for optimization. The batch size is set to 32, the learning rate is 0.001, and the number of training epochs is 150. Following [26], [34], [46], we use mean accuracy (ACC) and standard deviation (STD) to evaluate the performance. Our model is trained with PyTorch on a GeForce RTX 2080Ti GPU.
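As a sketch of this training setup, only the listed hyperparameters come from the text; the loss function and data pipeline below are assumptions:

```python
import torch

def train(model, train_loader, epochs=150, lr=1e-3):
    """Stated settings: Adam, lr 0.001, 150 epochs; batch size 32 via the DataLoader."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()  # assumed classification loss
    for _ in range(epochs):
        for x, y in train_loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
```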
D. Experiment Results on SEED

1) Subject-Dependent Experiment: We compare BGAGCN-MT with two baseline methods (SVM and DBN) [13], two deep learning methods without graph structure (the bi-hemisphere domain adversarial neural network (BiDANN) [16] and the regional-to-global spatial-temporal neural network (R2G-STNN) [23]), and seven graph-based methods (the graph convolution neural network (GCNN) [26], dynamic GCNN (DGCNN) [26], regularized graph neural networks (RGNN) [21], GCBNet+BLS [34], the variational instance-adaptive graph (V-IAG) [31], the multi-scale feature reconstruction graph convolutional network (MSFR-GCN) [62], and graph-based multi-task self-supervised learning (GMSS) [27]).

The subject-dependent recognition results on SEED are shown in Table I. BGAGCN-MT outperforms the other methods when a single frequency band is used, except for the θ band, where it is the second best with accuracy still markedly higher than that of the remaining methods. The accuracy on the α, β, and γ bands is much better than that on the δ and θ bands for all the methods. It can be inferred that the high-frequency components contain more relevant emotional information, which is consistent with previous studies [12], [16]. Nevertheless, the low-frequency components still carry some useful emotional information. This can be seen from the result that all methods achieve the highest accuracy when taking all bands as input. It is worth noting that BGAGCN-MT achieves a new SOTA result (96.82%) on SEED when all frequency bands are taken as input and significantly outperforms the second-best method, V-IAG.

2) Subject-Independent Experiments: To demonstrate the generalization ability of BGAGCN-MT, we conduct subject-independent experiments on SEED. The results are reported in Table II. All the compared methods in the subject-dependent experiments are used for the subject-independent experiments if the results have been reported in the corresponding papers. We add four transfer learning methods (subspace alignment (SA) [64], transfer component analysis (TCA) [63], prototypical representation-based pairwise learning (PR-PL) [65], and unsupervised dynamic domain adaptation (UDDA) [66]) for performance comparison.

In Table II, BGAGCN-MT achieves the highest accuracy of 89.66% with a standard deviation of 4.72% when using all frequency band features as input. In the δ band, our model performs slightly worse than RGNN, but in the higher frequency bands (θ, α, β, and γ), it performs much better than the second best as well as the other compared methods. We can also observe that all the methods perform better in the higher frequency bands. This indicates that higher frequency bands contain more subject-invariant information, which is consistent with the fact that the higher frequency bands are associated with more general cognitive processes [68]. BGAGCN-MT outperforms the other methods, even the latest transfer learning methods (PR-PL [65] and UDDA [66]), which suggests that it is able to learn more subject-invariant representations.

3) Performance Comparison With Different Features: To evaluate the impact of different features on our model, we compare DE with another popular feature, PSD. We present the results in Table III. We can see that, when replacing the input feature from DE with PSD, all the methods degenerate. However, BGAGCN-MT still outperforms the other methods. This shows that the superiority of BGAGCN-MT is invariant to the input features.

E. Experiment Results on SEED-IV

We conduct both subject-dependent and subject-independent experiments on SEED-IV and report the results in Table IV. We carefully choose eleven models from two groups for further comparison, including two baseline methods (SVM [13],
TABLE II
COMPARISONS OF THE RESULTS (ACC(%) / STD(%)) OF DIFFERENT METHODS FOR SUBJECT-INDEPENDENT EXPERIMENTS ON SEED DATASET
TABLE III
THE RESULTS (ACC(%) / STD(%)) OF SUBJECT-DEPENDENT EXPERIMENT
WITH DIFFERENT TYPES OF INPUT FEATURES ON SEED DATASET
Fig. 5. Confusion matrices on SEED dataset. (a) and (b) are the results of
subject-dependent and subject-independent experiments, respectively.
TABLE IV
COMPARISONS OF THE RESULTS (ACC(%) / STD(%)) OF DIFFERENT METHODS
FOR SUBJECT-DEPENDENT AND SUBJECT-INDEPENDENT EXPERIMENT ON
SEED-IV
Fig. 6. Confusion matrices on SEED-IV dataset. (a) and (b) are the results of
subject-dependent and subject-independent experiments, respectively.
TABLE V
COMPARISONS OF THE RESULTS (ACC(%) / STD(%)) OF DIFFERENT METHODS IN SUBJECT-DEPENDENT EXPERIMENT WITH PSD FEATURES ON DREAMER

TABLE VI
COMPARISONS OF THE RESULTS OF AVERAGE EUCLIDEAN DISTANCE (AED) FOR ALL SAMPLES IN SEED FROM EACH NODE TO THE CENTER AFTER T-SNE FOR DIFFERENT MODELS

TABLE VII
THE RESULTS (ACC(%) / STD(%)) OF ABLATION STUDIES IN SUBJECT-DEPENDENT AND SUBJECT-INDEPENDENT EXPERIMENTS ON SEED AND SEED-IV

generalized emotional patterns. Compared to the SOTA methods GMSS [27] and MSFR-GCN [62], our model achieves comparable or best performance on most emotions, and its superiority on the happy emotion is significant.
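A minimal sketch of how the AED measure of Table VI can be computed, assuming (as the caption states) t-SNE [76] embeddings and distances from each point to their centroid — larger values suggest less collapsed, less over-smoothed representations:

```python
import numpy as np
from sklearn.manifold import TSNE

def average_euclidean_distance(features):
    """features: (n_samples, dim) final-layer representations."""
    emb = TSNE(n_components=2).fit_transform(features)  # 2D t-SNE embedding
    center = emb.mean(axis=0)
    return np.linalg.norm(emb - center, axis=1).mean()  # AED to the centroid
```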
TABLE IX
THE RESULTS OF DIFFERENT PARTITION METHODS IN SUBJECT-DEPENDENT AND SUBJECT-INDEPENDENT EXPERIMENTS ON SEED

3) Performance Comparison With Different Subgraphs: The human brain consists of various functional regions that become active during different emotional processes [14], and different partition methods for the functional regions lead to different subgraphs. Following [31], we divide the 62 EEG channels into 7 regions, 10 regions, and 17 regions (illustrated in Fig. 7) to study the impact of different partitioning methods on the performance of our model. The results are shown in Table IX. The 10-region partition method consistently achieves the highest accuracy when using different features as input. The results imply that more partitions may introduce more interference, leading to performance degradation. Fewer brain regions also show an accuracy reduction, which may be due to inadequate functional activity connections.

I. Analysis of Multi-Scale Convolution in Transformer

The introduction of multi-scale convolution in MT is inspired by neuroscience priors. To elucidate its relationship with existing neurophysiological discoveries, we analyze feature maps before and after convolution. Multi-scale entropy (MSE), a well-established tool for exploring EEG patterns in various scenarios [48], is employed in this analysis. MSEs for correctly categorized samples in the last encoder at three different scales, both before and after convolution, are calculated. The scales of MSE align with the sizes of the convolution kernels. The calculated MSE values are averaged for each EEG channel, and the results are visualized with topographic maps, as shown in Fig. 8.

Fig. 8. Visualization of topographic maps of all EEG channels before and after convolution.

The after-convolution results reveal enhanced functional connectivity in specific brain regions. For short scales, connections in prefrontal regions are strengthened, with fewer changes in the left and right parietal lobes. For medium scales, the prefrontal lobes exhibit more obvious strengthening, and internal connections among left parietal regions show increased strength and extent of connectivity. Large scales highlight enhanced strength and extent of internal connections in the prefrontal lobes, along with improved internal connectivity in both the right and left parietal lobe regions. These findings indicate that brain activity at different timescales shows distinct spatial characteristics.
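A simplified sketch of the MSE computation described above (coarse-graining plus sample entropy); the tolerance, template length, and scale values are common defaults, not the paper's settings:

```python
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    """Simplified SampEn of a 1D sequence; r is a fraction of the signal's std."""
    r *= x.std()
    def count(mm):
        templ = np.lib.stride_tricks.sliding_window_view(x, mm)
        d = np.abs(templ[:, None] - templ[None, :]).max(-1)  # Chebyshev distance
        return (d <= r).sum() - len(templ)                   # exclude self-matches
    B, A = count(m), count(m + 1)
    return -np.log(A / B) if A > 0 and B > 0 else np.inf

def multiscale_entropy(x, scales=(3, 5, 7)):
    """MSE: coarse-grain the sequence at each scale, then compute sample entropy."""
    out = []
    for s in scales:
        coarse = x[: len(x) // s * s].reshape(-1, s).mean(1)  # coarse-graining
        out.append(sample_entropy(coarse))
    return out
```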
V. CONCLUSION

In this article, we propose a novel multichannel EEG emotion recognition model based on graph-learning methods and the transformer. We design the bridge graph attention based graph convolution network to improve the importance of the nodes in the graph, addressing the over-smoothing problem caused by increasing the depth of the graph model. Moreover, the multi-scale transformer is designed to learn non-local spatial features among EEG channels and to utilize the spatial characteristics of EEG at different scales to improve emotion recognition ability. Comprehensive experiments are conducted on different types of datasets. The results indicate that our model successfully overcomes the over-smoothing problem and effectively extracts distinct spatial characteristics at different scales of EEG. Our model provides an effective approach for information mining in multichannel EEG emotion recognition.

REFERENCES

[1] Y. Wang et al., "A systematic review on affective computing: Emotion models, databases, and recent advances," Inf. Fusion, vol. 83/84, pp. 19–52, 2022.
[2] Y. Li, G. Lu, J. Li, Z. Zhang, and D. Zhang, "Facial expression recognition in the wild using multi-level features and attention mechanisms," IEEE Trans. Affect. Comput., vol. 14, no. 1, pp. 451–462, Jan.-Mar. 2023.
[3] Y. Li, Z. Zhang, B. Chen, G. Lu, and D. Zhang, "Deep margin-sensitive representation learning for cross-domain facial expression recognition," IEEE Trans. Multimedia, vol. 25, pp. 1359–1373, 2023.
[4] Y. Li, J. Huang, S. Lu, Z. Zhang, and G. Lu, "Cross-domain facial expression recognition via contrastive warm up and complexity-aware self-training," IEEE Trans. Image Process., vol. 32, pp. 5438–5450, 2023.
[5] A. Kleinsmith and N. Bianchi-Berthouze, "Affective body expression perception and recognition: A survey," IEEE Trans. Affect. Comput., vol. 4, no. 1, pp. 15–33, Jan.-Mar. 2013.
[6] R. Harper and J. Southern, "A Bayesian deep learning framework for end-to-end prediction of emotion from heartbeat," IEEE Trans. Affect. Comput., vol. 13, no. 2, pp. 985–991, Apr.-Jun. 2022.
[7] X. Zhang et al., "Fusing of electroencephalogram and eye movement with group sparse canonical correlation analysis for anxiety detection," IEEE Trans. Affect. Comput., vol. 13, no. 2, pp. 958–971, Apr.-Jun. 2022.
[8] J. Shukla, M. Barreda-Angeles, J. Oliver, G. C. Nandi, and D. Puig, "Feature extraction and selection for emotion recognition from electrodermal activity," IEEE Trans. Affect. Comput., vol. 12, no. 4, pp. 857–869, Oct.-Dec. 2021.
[9] X. Li et al., "EEG based emotion recognition: A tutorial and review," ACM Comput. Surv., vol. 55, no. 4, pp. 1–57, 2022.
[10] R. Oostenveld and P. Praamstra, "The five percent electrode system for high-resolution EEG and ERP measurements," Clin. Neurophysiol., vol. 112, pp. 713–719, 2001.
[11] B. Garcia-Martinez, A. Martinez-Rodrigo, R. Alcaraz, and A. Fernandez-Caballero, "A review on nonlinear methods using electroencephalographic recordings for emotion recognition," IEEE Trans. Affect. Comput., vol. 12, no. 3, pp. 801–820, Jul.-Sep. 2021.
[12] R. Duan, J. Zhu, and B. Lu, "Differential entropy feature for EEG-based emotion classification," in Proc. 6th Int. IEEE/EMBS Conf. Neural Eng., 2013, pp. 81–84.
[13] W. Zheng and B. Lu, "Investigating critical frequency bands and channels for EEG-based emotion recognition with deep neural networks," IEEE Trans. Auton. Ment. Develop., vol. 7, no. 3, pp. 162–175, Sep. 2015.
[14] S. M. Alarcao and M. J. Fonseca, "Emotions recognition using EEG signals: A survey," IEEE Trans. Affect. Comput., vol. 10, no. 3, pp. 374–393, Jul.-Sep. 2019.
[15] Y. Li et al., "A novel bi-hemispheric discrepancy model for EEG emotion recognition," IEEE Trans. Cogn. Develop. Syst., vol. 13, no. 2, pp. 354–367, Jun. 2021.
[16] Y. Li, W. Zheng, Y. Zong, Z. Cui, T. Zhang, and X. Zhou, "A bi-hemisphere domain adversarial neural network model for EEG emotion recognition," IEEE Trans. Affect. Comput., vol. 12, no. 2, pp. 494–504, Apr.-Jun. 2021.
[17] H. Cui, A. Liu, X. Zhang, X. Chen, K. Wang, and X. Chen, "EEG-based emotion recognition using an end-to-end regional-asymmetric convolutional neural network," Knowl. Based Syst., vol. 205, 2020, Art. no. 106243.
[18] J. Li, Z. Zhang, and H. He, "Hierarchical convolutional neural networks for EEG-based emotion recognition," Cogn. Comput., vol. 10, no. 2, pp. 368–380, 2017.
[19] F. Shen, G. Dai, G. Lin, J. Zhang, W. Kong, and H. Zeng, "EEG-based emotion recognition using 4D convolutional recurrent neural network," Cogn. Neurodynamics, vol. 14, no. 6, pp. 815–828, 2020.
[20] Z. Yin, M. Zhao, Y. Wang, J. Yang, and J. Zhang, "Recognition of emotions using multimodal physiological signals and an ensemble deep learning model," Comput. Methods Programs Biomed., vol. 140, pp. 93–110, 2017.
[21] P. Zhong, D. Wang, and C. Miao, "EEG-based emotion recognition using regularized graph neural networks," IEEE Trans. Affect. Comput., vol. 13, no. 3, pp. 1290–1301, Jul.-Sep. 2022.
[22] M. Ye, C. L. P. Chen, and T. Zhang, "Hierarchical dynamic graph convolutional network with interpretability for EEG-based emotion recognition," IEEE Trans. Neural Netw. Learn. Syst., early access, Dec. 9, 2022, doi: 10.1109/TNNLS.2022.3225855.
[23] L. Yang, Z. Wenming, W. Lei, Z. Yuan, and C. Zhen, "From regional to global brain: A novel hierarchical spatial-temporal neural network model for EEG emotion recognition," IEEE Trans. Affect. Comput., vol. 13, no. 2, pp. 568–578, Apr.-Jun. 2022.
[24] R. J. Davidson, H. Abercrombie, J. B. Nitschke, and K. Putnam, "Regional brain function, emotion and disorders of emotion," Curr. Opin. Neurobiol., vol. 9, no. 2, pp. 228–234, 1999.
[25] G. Li, M. Muller, A. Thabet, and B. Ghanem, "DeepGCNs: Can GCNs go as deep as CNNs?," in Proc. IEEE 17th Int. Conf. Comput. Vis., 2019, pp. 9267–9276.
[26] T. Song, W. Zheng, P. Song, and Z. Cui, "EEG emotion recognition using dynamical graph convolutional neural networks," IEEE Trans. Affect. Comput., vol. 11, no. 3, pp. 532–541, Jul.-Sep. 2020.
[27] Y. Li et al., "GMSS: Graph-based multi-task self-supervised learning for EEG emotion recognition," IEEE Trans. Affect. Comput., vol. 14, no. 3, pp. 2512–2525, Jul.-Sep. 2023, doi: 10.1109/TAFFC.2022.3170428.
[28] M. Li, M. Qiu, W. Kong, L. Zhu, and Y. Ding, "Fusion graph representation of EEG for emotion recognition," Sensors (Basel), vol. 23, no. 3, 2023, Art. no. 1404.
[29] T. Chen, Y. Guo, S. Hao, and R. Hong, "Exploring self-attention graph pooling with EEG-based topological structure and soft label for depression detection," IEEE Trans. Affect. Comput., vol. 13, no. 4, pp. 2106–2118, Oct.-Dec. 2022.
[30] S. Asadzadeh, T. Y. Rezaii, S. Beheshti, and S. Meshgini, "Accurate emotion recognition utilizing extracted EEG sources as graph neural network nodes," Cogn. Comput., vol. 15, pp. 176–189, 2023.
[31] T. Song et al., "Variational instance-adaptive graph for EEG emotion recognition," IEEE Trans. Affect. Comput., vol. 14, no. 1, pp. 343–356, Jan.-Mar. 2023.
[32] R. Jenke, A. Peer, and M. Buss, "Feature extraction and selection for emotion recognition from EEG," IEEE Trans. Affect. Comput., vol. 5, no. 3, pp. 327–339, Jul.-Sep. 2014.
[33] Q. Li, Z. Han, and X.-M. Wu, "Deeper insights into graph convolutional networks for semi-supervised learning," in Proc. 32nd AAAI Conf. Artif. Intell., vol. 32, 2018, pp. 3538–3545.
[34] T. Zhang, X. Wang, X. Xu, and C. L. P. Chen, "GCB-Net: Graph convolutional broad network and its application in emotion recognition," IEEE Trans. Affect. Comput., vol. 13, no. 1, pp. 379–388, Jan.-Mar. 2022.
[35] X. Lin, J. Chen, W. Ma, W. Tang, and Y. Wang, "EEG emotion recognition using improved graph neural network with channel selection," Comput. Methods Programs Biomed., vol. 231, 2023, Art. no. 107380.
[36] W. L. Hamilton, R. Ying, and J. Leskovec, "Inductive representation learning on large graphs," in Proc. 30th Conf. Neural Inf. Process. Syst., 2017, pp. 1025–1035.
[37] Y. Rong, W. Huang, T. Xu, and J. Huang, "DropEdge: Towards deep graph convolutional networks on node classification," in Proc. 6th Int. Conf. Learn. Representations, 2020, pp. 1–11.
[38] T. Liu, A. Jiang, J. Zhou, M. Li, and H. K. Kwan, "GraphSAGE-based dynamic spatial–temporal graph convolutional network for traffic prediction," IEEE Trans. Intell. Transp. Syst., vol. 24, no. 10, pp. 11210–11224, Oct. 2023.
[39] Y. Liu, Y. Deng, J. Su, R. Wang, and C. Li, "Multiple input branches shift graph convolutional network with dropedge for skeleton-based action recognition," in Proc. 21st Int. Conf. Image Anal. Process., 2022, pp. 584–596.
[40] C. Duong, L. Zhang, and C.-T. Lu, "HateNet: A graph convolutional network approach to hate speech detection," in Proc. Int. Conf. Big Data Smart Comput., 2022, pp. 5698–5707.
[41] Y. Zhao, J. Chen, Z. Zhang, and R. Zhang, "BA-Net: Bridge attention for deep convolutional neural networks," in Proc. Eur. Conf. Comput. Vis., 2022, pp. 297–312.
[42] J. D. Semedo et al., "Feedforward and feedback interactions between visual cortical areas use different population activity patterns," Nat. Commun., vol. 13, no. 1, 2022, Art. no. 1099.
[43] A. Vaswani et al., "Attention is all you need," in Proc. 31st Conf. Neural Inf. Process. Syst., 2017, pp. 6000–6010.
[44] Y. Ma, Y. Song, and F. Gao, "A novel hybrid CNN-transformer model for EEG motor imagery classification," in Proc. Int. Joint Conf. Neural Netw., 2022, pp. 1–8.
[45] L. Gong, M. Li, T. Zhang, and W. Chen, "EEG emotion recognition using attention-based convolutional transformer neural network," Biomed. Signal Process. Control, vol. 84, 2023, Art. no. 104835.
[46] M. Sun, W. Cui, S. Yu, H. Han, B. Hu, and Y. Li, "A dual-branch dynamic graph convolution based adaptive transformer feature fusion network for EEG emotion recognition," IEEE Trans. Affect. Comput., vol. 13, no. 4, pp. 2218–2228, Oct.-Dec. 2022.
[47] Z. Wang, Y. Wang, C. Hu, Z. Yin, and Y. Song, "Transformers for EEG-based emotion recognition: A hierarchical spatial information learning model," IEEE Sens. J., vol. 22, no. 5, pp. 4359–4368, Mar. 2022.
[48] S. Li and X.-J. Wang, "Hierarchical timescales in the neocortex: Mathematical mechanism and biological insights," Proc. Nat. Acad. Sci. USA, vol. 119, no. 6, 2022, Art. no. e2110274119.
[49] H. Cui, A. Liu, X. Zhang, X. Chen, J. Liu, and X. Chen, "EEG-based subject-independent emotion recognition using gated recurrent unit and minimum class confusion," IEEE Trans. Affect. Comput., vol. 14, no. 4, pp. 2740–2750, Oct.-Dec. 2023.
[50] W. Zheng, J. Zhu, and B. Lu, "Identifying stable patterns over time for emotion recognition from EEG," IEEE Trans. Affect. Comput., vol. 10, no. 3, pp. 417–429, Jul.-Sep. 2019.
[51] T. Zhang, W. Zheng, Z. Cui, Y. Zong, and Y. Li, "Spatial-temporal recurrent neural network for emotion recognition," IEEE Trans. Cybern., vol. 49, no. 3, pp. 839–847, Mar. 2019.
[52] J. Yu, H. Yin, J. Li, M. Gao, Z. Huang, and L. Cui, "Enhancing social recommendation with adversarial graph convolutional networks," IEEE Trans. Knowl. Data Eng., vol. 34, no. 8, pp. 3727–3739, Aug. 2022.
[53] M. Zitnik and J. Leskovec, "Predicting multicellular function through multi-layer tissue networks," Bioinformatics, vol. 33, no. 14, pp. i190–i198, 2017.
[54] M. Rubinov and O. Sporns, "Complex network measures of brain connectivity: Uses and interpretations," Neuroimage, vol. 52, no. 3, pp. 1059–1069, 2010.
[55] G. Du et al., "A multi-dimensional graph convolution network for EEG emotion recognition," IEEE Trans. Instrum. Meas., vol. 71, 2022, Art. no. 2518311.
[56] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, and X. Zhai, "An image is worth 16x16 words: Transformers for image recognition at scale," in Proc. 8th Int. Conf. Learn. Representations, 2020, pp. 1–22.
[57] N. Carion, F. Massa, G. Synnaeve, N. Usunier, A. Kirillov, and S. Zagoruyko, "End-to-end object detection with transformers," in Proc. Eur. Conf. Comput. Vis., 2020, pp. 213–229.
[58] M. Defferrard, X. Bresson, and P. Vandergheynst, "Convolutional neural networks on graphs with fast localized spectral filtering," in Proc. 30th Conf. Neural Inf. Process. Syst., 2017, pp. 3844–3852.
[59] P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Lio, and Y. Bengio, "Graph attention networks," in Proc. 6th Int. Conf. Learn. Representations, 2018, pp. 1–12.
[60] W. L. Zheng, W. Liu, Y. Lu, B. L. Lu, and A. Cichocki, "EmotionMeter: A multimodal framework for recognizing human emotions," IEEE Trans. Cybern., vol. 49, no. 3, pp. 1110–1122, Mar. 2019.
[61] S. Katsigiannis and N. Ramzan, "DREAMER: A database for emotion recognition through EEG and ECG signals from wireless low-cost off-the-shelf devices," IEEE J. Biomed. Health Inform., vol. 22, no. 1, pp. 98–107, Jan. 2018.
[62] D. Pan et al., "MSFR-GCN: A multi-scale feature reconstruction graph convolutional network for EEG emotion and cognition recognition," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 31, pp. 3245–3254, 2023.
[63] S. J. Pan, I. W. Tsang, J. T. Kwok, and Q. Yang, "Domain adaptation via transfer component analysis," IEEE Trans. Neural Netw., vol. 22, no. 2, pp. 199–210, Feb. 2011.
[64] B. Fernando, A. Habrard, M. Sebban, and T. Tuytelaars, "Unsupervised visual domain adaptation using subspace alignment," in Proc. IEEE Int. Conf. Comput. Vis., 2013, pp. 2960–2967.
[65] R. Zhou et al., "PR-PL: A novel prototypical representation based pairwise learning framework for emotion recognition using EEG signals," IEEE Trans. Affect. Comput., early access, Jun. 23, 2023, doi: 10.1109/TAFFC.2023.3288118.
[66] Z. Li et al., "Dynamic domain adaptation for class-aware cross-subject and cross-session EEG emotion recognition," IEEE J. Biomed. Health Inform., vol. 26, no. 12, pp. 5964–5973, 2022.
[67] W. Zheng and B. Lu, "Personalizing EEG-based affective models with transfer learning," in Proc. 34th Int. Joint Conf. Artif. Intell., 2016, pp. 2732–2738.
[68] K. Yang, L. Tong, J. Shu, N. Zhuang, B. Yan, and Y. Zeng, "High gamma band EEG closely related to emotion: Evidence from functional network," Front. Hum. Neurosci., vol. 14, 2020, Art. no. 89.
[69] T. Song, W. Zheng, C. Lu, Y. Zong, X. Zhang, and Z. Cui, "MPED: A multi-modal physiological emotion database for discrete emotion recognition," IEEE Access, vol. 7, pp. 12177–12191, 2019.
[70] W. Guo, G. Xu, and Y. Wang, "Horizontal and vertical features fusion network based on different brain regions for emotion recognition," Knowl. Based Syst., vol. 247, 2022, Art. no. 108819.
[71] L. Zhu et al., "Multisource Wasserstein adaptation coding network for EEG emotion recognition," Biomed. Signal Process. Control, vol. 76, 2022, Art. no. 103687.
[72] W. Zheng, "Multichannel EEG-based emotion recognition via group sparse canonical correlation analysis," IEEE Trans. Cogn. Develop. Syst., vol. 9, no. 3, pp. 281–290, Sep. 2017.
[73] L. Yang, Z. Wenming, C. Zhen, and Z. Xiaoyan, "A novel graph regularized sparse linear discriminant analysis model for EEG emotion recognition," in Proc. 23rd Int. Conf. Neural Inf. Process., 2016, pp. 175–182.
[74] G. Wu, S. Lin, Y. Zhuang, and J. Qiao, "Alleviating over-smoothing via graph sparsification based on vertex feature similarity," Appl. Intell., vol. 53, no. 17, pp. 20223–20238, 2023.
[75] Y. Liu et al., "CurvDrop: A Ricci curvature based approach to prevent graph neural networks from over-smoothing and over-squashing," in Proc. ACM Web Conf., 2023, pp. 221–230.
[76] L. van der Maaten and G. Hinton, "Visualizing data using t-SNE," J. Mach. Learn. Res., vol. 9, pp. 2579–2605, 2008.

Huachao Yan is currently working toward the PhD degree with the School of Electronic and Information Engineering, South China University of Technology. His research interests include EEG emotion recognition and affective computing.

Kailing Guo (Member, IEEE) received the PhD degree from the South China University of Technology, Guangzhou, China. He is currently an associate professor with the School of Electronic and Information Engineering, South China University of Technology. His research interests include low-rank and sparse learning, deep learning optimization and model compression, and multimodal human data processing.

Xiaofen Xing (Member, IEEE) received the BS, MS, and PhD degrees from the South China University of Technology, Guangzhou, China, in 2001, 2004, and 2013, respectively. Since 2017, she has been an associate professor with the School of Electronic and Information Engineering, South China University of Technology. Her main research interests include speech emotion analysis, image/video processing, and human-computer interaction.

Xiangmin Xu (Senior Member, IEEE) received the PhD degree from the South China University of Technology, Guangzhou, China. He is currently a full professor with the School of Electronic and Information Engineering and the School of Future Technology, South China University of Technology. His recent research focuses on image/video processing, human-computer interaction, computer vision, and machine learning.