Auto-ABSA: Automatic Detection of Aspects in Aspect-Based Sentiment Analysis
Teng Wang [email protected]
Abstract
Since the transformer was proposed, many pre-trained language models have been developed and the sentiment analysis (SA) task has improved along with them. In this paper, we propose a method that uses an auxiliary sentence describing the aspects a sentence contains to help sentiment prediction. The first step is aspect detection: a multi-aspect detection model predicts all the aspects that the sentence mentions, and the predicted aspects are combined with the original sentence as the input of the sentiment analysis model. The second step is out-of-domain aspect-based sentiment analysis (ABSA): we train a sentiment classification model on one kind of dataset and validate it on another kind of dataset. Finally, we build two baselines that feed no aspect and all aspects, respectively, to the sentiment classification model. Comparing the two baselines with our method shows that supplying the detected aspects is indeed helpful.
1 Introduction
The rapid development of information technology provides users with a wealth of information and services, and many fields now use natural language processing (NLP) to extract valuable information from massive amounts of unstructured text. Sentiment analysis (SA) is one such NLP task. Most sentences carry sentiment, which can be used to analyze reviews of products and news; it can also be used for election polling.
Predicting polarity is easy when a sentence has only one polarity, but a sentence usually carries more than one. One sentence may mention more than one target, and each target may have different polarities for different aspects. For example, consider the review: "I love using iPhone because of its quality, but it's too expensive." Here the polarity toward price is negative while the polarity toward quality is positive. Aspect-based sentiment analysis (ABSA) is a sub-task of SA that detects the polarity of every aspect of every target, so it is not a simple text classification task. Building a model that can detect the polarity of different aspects is the central problem of ABSA.
[Tang et al., 2015] uses two LSTM networks: one takes the preceding context plus the target as input, and the other takes the following context plus the target. The two LSTM outputs are combined and passed through a softmax function to obtain the polarity probabilities. Because the model relies on LSTMs, it performs worse than attention-based models and cannot be computed in parallel.
[Sun et al., 2019] utilizes BERT and an auxiliary sentence to predict polarity. [Wan et al., 2020] proposed a method in which, before a sentence is fed to the model, the aspect range of the dataset must be known. All permutations of the aspects and the polarities (positive, negative, neutral) are then generated as "auxiliary sentences." Each auxiliary sentence is combined with the original sentence and fed to the model, which outputs the probability of that aspect-polarity pair. This method has two drawbacks. First, the aspect range of the dataset must be known in advance; if another type of dataset with different aspects is used for prediction, the method fails. Second, it causes data imbalance: each sentence usually contains only one or two aspects, while the aspect range can be very large, so after preprocessing most training examples have a label of 0. When the aspect range is large, the model is hard to train and tends to output 0. [Gao et al., 2019] proposed a method that uses an auxiliary sentence as prior knowledge to obtain the probabilities of positive, negative, and neutral. The auxiliary sentence consists only of the target of the sentence and does not consider the aspect, yet one target sometimes corresponds to more than one aspect. [He et al., 2020] proposed disentangled attention and DeBERTa; after fine-tuning, the model is the state of the art on ABSA.
When natural language processing is applied to a new domain, the data for the new task must be re-analyzed. Extracting useful features is costly, especially when the domain characteristics change frequently. Ensuring model accuracy usually requires a large amount of high-quality manually annotated data, and manual annotation is itself difficult and expensive. The cross-domain aspect-based sentiment analysis (ABSA) studied in this paper aims to perform aspect-based sentiment analysis on unlabeled data in a target domain by training on annotated data from a source domain. The two datasets can differ considerably; for example, the training data may consist of restaurant reviews while the test data consists of real estate reviews. Realizing cross-domain sentiment analysis is therefore very challenging.
Words are very important in sentiment classification. Apart from stop words, which have little influence on sentence meaning, a word may carry different emotional tendencies in different fields: "unpredictable," for example, usually contributes positive sentiment in movie reviews but is negative in evaluations of electronic components. To overcome the constraint of using an aspect-based sentiment analysis model across domains, reusing a model trained in one field for tasks in other fields is a way to reduce the time and economic cost of retraining. Automatic detection of aspects in aspect-based sentiment analysis is one of the most effective solutions in this situation.
In this paper, we propose a cross-domain sentiment analysis model built on deep learning methods. First, we test the performance of the sentiment classification model when all the aspects in the dataset, or no aspects at all, are given as part of the model input. Then we feed the correct aspects of the SemEval-2016 data to a sentiment classification model trained on the Sentihood dataset. Compared with the two baseline tests, we find that performance improves when the correct aspects are available for aspect-based sentiment analysis. To obtain these aspects, we find that a modified SpanEmo [Alhuzali and Ananiadou, 2021] works well. SpanEmo was not originally designed to detect aspects; it was built for multi-label emotion detection. We therefore adapted SpanEmo to predict the aspects a sentence contains and found that it performs well at this task. Combining SpanEmo with the Sentiment Predictor into a "Big Model", we propose Auto-ABSA, which predicts the aspects a sentence contains in order to conduct cross-domain aspect-based sentiment analysis; its performance improves over our two baselines.
Our contributions are listed as follows:
1. Modified SpanEmo to perform multi-aspect detection.
2. Trained the model on one type of dataset and validated it on another type of dataset; an auxiliary sentence about the aspects helps the model perform Auto-ABSA.
3. Improved interpretability. When the aspects a sentence contains are supplied, we can see the influence of the predicted aspects on the model. We created two baselines and compared them with our model, showing that our method is effective.
We now turn to describing our framework in detail.
2 Related Work
In early studies, sentiment classifiers were built with LSTM [Hochreiter and Schmidhuber, 1997], GRU [Chung et al., 2014], and similar architectures. [Tang et al., 2015] combines two LSTM networks, one taking the preceding context plus the target as input and the other taking the following context plus the target. [Wang et al., 2016] proposed an attention-based LSTM to predict polarity at the aspect level. [Chen et al., 2017] proposed a model that uses a Bi-RNN, location-weighted memory, and attention. However, these models do not perform well.
After the transformer [Vaswani et al., 2017] was proposed, the attention mechanism [Bahdanau et al., 2014] became widely popular, and more and more pre-trained models, attention variants, and training strategies appeared in NLP, such as BERT [Devlin et al., 2018], RoBERTa [Liu et al., 2019], and DeBERTa [He et al., 2020]. [Gao et al., 2019] and [Sun et al., 2019] proposed methods that use BERT and auxiliary sentences as prior knowledge to improve sentiment prediction. [Wan et al., 2020] employs BERT to construct a multi-task model that predicts not only the polarity of each aspect but also the word-level location of the target. However, it enumerates all permutations of the polarities (positive, negative, neutral) and the aspects the dataset contains, which causes data imbalance and makes the model tend to output 0. This already makes the sentiment prediction task hard to train, let alone the additional target-position prediction task.
Our aspect detection model is based on SpanEmo [Alhuzali and Ananiadou, 2021]. SpanEmo predicts all of the emotions that a sentence expresses. We replace each emotion label with an aspect label to see whether SpanEmo can accurately predict every aspect that a sentence contains. Figure 1 shows the SpanEmo architecture: the input is the combination of all candidate aspects and a sentence, and the output is a probability for each aspect.
3 Methodology
3.1 SpanEmo
Let $\{S, Y\}$ be a set of $N$ examples with a fixed label set of $C$ aspect classes, where $s_i$ denotes the $i$-th sentence and $y_i \in \{0, 1\}^{C}$ denotes the aspects that the sentence contains. The input of SpanEmo has two parts: the set of fixed aspect classes and the corresponding sentence. After the input is fed to the BERT encoder, the hidden representations $H_i \in \mathbb{R}^{L \times D}$ are obtained (where $L$ is the maximum input length and $D$ is the hidden state dimension), and the prediction is

\hat{y} = \mathrm{sigmoid}(\mathrm{FCN}(H_i))
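To make the adaptation concrete, the following is a minimal sketch of the aspect-detection forward pass described above, using PyTorch and the HuggingFace transformers library. The class and variable names are ours, not from the SpanEmo release, and we assume for simplicity that every aspect label tokenizes to a single word piece.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class AspectSpanDetector(nn.Module):
    """Multi-aspect detector: encode "[CLS] aspect_1 ... aspect_C [SEP] sentence"
    with BERT and score the hidden state of each aspect token with a small FFN.
    Assumes each aspect label is a single word piece (a simplification)."""

    def __init__(self, aspects, model_name="bert-base-uncased", hidden=768):
        super().__init__()
        self.aspects = aspects
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.encoder = AutoModel.from_pretrained(model_name)
        self.ffn = nn.Sequential(nn.Linear(hidden, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 1))

    def forward(self, sentence):
        enc = self.tokenizer(" ".join(self.aspects), sentence,
                             return_tensors="pt", truncation=True, max_length=128)
        H = self.encoder(**enc).last_hidden_state        # H_i in R^{L x D}
        # aspect tokens sit right after [CLS], i.e. positions 1 .. C
        label_states = H[:, 1:1 + len(self.aspects)]
        logits = self.ffn(label_states).squeeze(-1)      # (1, C)
        return torch.sigmoid(logits)                     # \hat{y}
```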
The original SpanEmo paper employs a loss function that combines a Label-Correlation Aware (LCA) term with Binary Cross-Entropy (BCE); this loss penalizes the model when it predicts a pair of labels that should not co-exist for a given example. We do not change the loss function.
L_{LCA}(y, \hat{y}) = \frac{1}{|y^0|\,|y^1|} \sum_{(p,q) \in y^0 \times y^1} \exp(\hat{y}_p - \hat{y}_q)

L = (1 - \alpha)\, L_{BCE} + \alpha \sum_{i=1}^{M} L_{LCA}
where $y^0$ denotes the set of negative labels, $y^1$ denotes the set of positive labels, and $\hat{y}_p$ denotes the $p$-th element of the vector $\hat{y}$.
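The combined loss can be written compactly in PyTorch. The sketch below is our own illustration of the formula above (the default `alpha` of 0.2 matches the hyper-parameter table later in the paper; the function is not taken from the SpanEmo codebase).

```python
import torch
import torch.nn.functional as F

def lca_loss(y_true, y_prob):
    """Label-Correlation Aware loss for one example:
    mean of exp(score of a negative label - score of a positive label)."""
    neg = y_prob[y_true == 0]            # \hat{y}_p for p in y^0
    pos = y_prob[y_true == 1]            # \hat{y}_q for q in y^1
    if neg.numel() == 0 or pos.numel() == 0:
        return y_prob.new_zeros(())      # no valid (p, q) pairs -> no penalty
    return torch.exp(neg.unsqueeze(1) - pos.unsqueeze(0)).mean()

def combined_loss(y_true, y_prob, alpha=0.2):
    """(1 - alpha) * BCE + alpha * LCA, averaged over the batch."""
    bce = F.binary_cross_entropy(y_prob, y_true.float())
    lca = torch.stack([lca_loss(t, p) for t, p in zip(y_true, y_prob)]).mean()
    return (1 - alpha) * bce + alpha * lca
```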
Figure 1: The SpanEmo architecture.
3.2 Sentiment Predictor
Define $(A, T, S)$ as our dataset, where $A$ denotes the aspect that the sentence contains, $T$ denotes the target of the sentence, and $S$ denotes the sentence. The model input representation for each example is

input_i = [CLS] + s_{1,i} + [SEP] + s_{2,i}

where $s_{1,i}$ is "what do you think of [aspects] of [target]" and $s_{2,i}$ is the original sentence. After the input is fed to the BERT encoder, the hidden representation $H_i$ is obtained. Since our task is a classification task, we use a fully connected network and a softmax activation to obtain the probabilities of positive, negative, and neutral sentiment toward the target. The output corresponding to the [CLS] token represents these probabilities, so $\hat{y}$ is a $3 \times 1$ vector:

H_i = \mathrm{BERT}(input_i)
\hat{y} = \mathrm{Softmax}(\mathrm{FCN}(H_i))
The loss function of the sentiment predictor is the cross-entropy loss, where $M$ is the number of classes, $y_{o,c}$ is a binary indicator (0 or 1) of whether class label $c$ is the correct classification for observation $o$, and $p_{o,c}$ is the predicted probability that observation $o$ belongs to class $c$:

L_o = -\sum_{c=1}^{M} y_{o,c} \log(p_{o,c})
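As an illustration of the input construction and classification head described in this subsection, here is a minimal PyTorch sketch. The auxiliary-sentence template follows the text above, while the class name, tokenizer usage, and the choice of pooling the [CLS] hidden state are our own assumptions.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class SentimentPredictor(nn.Module):
    """Classify positive / negative / neutral for an (aspects, target, sentence) triple."""

    def __init__(self, model_name="bert-base-uncased", hidden=768, num_classes=3):
        super().__init__()
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.encoder = AutoModel.from_pretrained(model_name)
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, aspects, target, sentence):
        # s_1: auxiliary sentence, s_2: original sentence
        s1 = f"what do you think of {', '.join(aspects)} of {target}"
        enc = self.tokenizer(s1, sentence, return_tensors="pt",
                             truncation=True, max_length=128)
        H = self.encoder(**enc).last_hidden_state            # (1, L, D)
        cls = H[:, 0]                                         # [CLS] representation
        return torch.softmax(self.classifier(cls), dim=-1)   # (1, 3)

# usage sketch:
# probs = SentimentPredictor()(["price"], "iPhone", "I love it but it's too expensive.")
```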
3.3 Out-of-Domain-ABSA
Our goal is out-of-domain aspect-based sentiment analysis: we want to make predictions on a kind of dataset, such as a restaurant dataset, even though we have no data of that kind to train a model on.
One method is to use SpanEmo together with the Sentiment Predictor; for convenience, we call this the "Big-Model", whose architecture is shown in Figure 3. First, we train SpanEmo on a dataset. To predict the aspects that a sentence contains, we feed the sentence into SpanEmo. Then, we train the Sentiment Predictor on another dataset and combine the predicted aspects, the target, and the sentence as the Sentiment Predictor input, which yields the sentiment probabilities for the target.
The second way is to use the Sentiment Predictor with all aspects: if we know the full aspect set of the dataset, we feed that set into the input, test the model, and again obtain the sentiment probabilities.
The third way is to use the Sentiment Predictor with no aspect: we feed NULL as the aspect, test the model, and obtain the sentiment probabilities.
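A minimal sketch of the Big-Model inference pipeline follows. It wires together the two model sketches given earlier and assumes a fixed candidate aspect list and a 0.5 decision threshold, neither of which is stated in the paper.

```python
# Big-Model inference: detect aspects with the SpanEmo-style detector, then
# predict sentiment for a target. Assumes AspectSpanDetector and
# SentimentPredictor from the earlier sketches; the aspect list is illustrative.

CANDIDATE_ASPECTS = ["price", "quality", "service", "ambience"]

def big_model_predict(detector, predictor, sentence, target, threshold=0.5):
    # 1) Predict the probability of each candidate aspect for this sentence.
    probs = detector(sentence).squeeze(0).tolist()

    # 2) Keep the aspects whose probability exceeds the threshold.
    aspects = [a for a, p in zip(detector.aspects, probs) if p > threshold]

    # 3) Feed (predicted aspects, target, sentence) to the Sentiment Predictor;
    #    fall back to NULL when no aspect clears the threshold.
    sentiment = predictor(aspects or ["NULL"], target, sentence)
    return aspects, sentiment

# usage sketch:
# detector = AspectSpanDetector(CANDIDATE_ASPECTS)
# predictor = SentimentPredictor()
# big_model_predict(detector, predictor, "Great food but slow service.", "service")
```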
Figure 3: The BigModel architecture.
4 Experiments
This section describes our datasets and evaluation and reports statistics on our results. We used PyTorch [Paszke et al., 2019] for implementation and ran all experiments on a V100 GPU with 16 GB of memory. Each experiment fine-tunes and compares three pre-trained models: BERT-base, DeBERTa-base, and RoBERTa-base.
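For concreteness, the three backbones could be loaded through the HuggingFace transformers library as below; the checkpoint names are the standard public ones and are our assumption, since the paper does not list the exact identifiers.

```python
from transformers import AutoModel, AutoTokenizer

# Standard public checkpoints for the three backbones (assumed, not stated in the paper).
BACKBONES = {
    "BERT-base": "bert-base-uncased",
    "DeBERTa-base": "microsoft/deberta-base",
    "RoBERTa-base": "roberta-base",
}

def load_backbone(name):
    """Return (tokenizer, encoder) for one of the three compared backbones."""
    checkpoint = BACKBONES[name]
    return AutoTokenizer.from_pretrained(checkpoint), AutoModel.from_pretrained(checkpoint)
```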
4.1 Datasets
We mainly use four datasets for training, evaluation, and testing: the Sentihood dataset and the SemEval-2014, SemEval-2015, and SemEval-2016 datasets.
Table 2: Train SpanEmo hyper-parameters

    max length      128
    dropout         0.2
    train bs        32
    validation bs   32
    BERT lr         2e-5
    FFN lr          1e-3
    alpha (loss)    0.2
    optimizer       Adam
    weight decay    linear weight decay
    warm up         0.1 * total steps
Table 3: Train Sentiment Predictor hyper-parameters

    max length      128
    dropout         0.1
    train bs        32
    validation bs   32
    BERT lr         2e-5
    FFN lr          1e-3
    optimizer       Adam
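The two learning rates and the warm-up schedule in the tables above can be realized with parameter groups and a linear scheduler; the snippet below is a sketch of one way to do this with PyTorch and transformers (the `model` argument stands for either SpanEmo or the Sentiment Predictor, and the parameter-name filter matches the earlier sketches rather than any released code).

```python
import torch
from transformers import get_linear_schedule_with_warmup

def build_optimizer(model, total_steps, bert_lr=2e-5, ffn_lr=1e-3):
    # Separate learning rates for the BERT encoder and the feed-forward head,
    # matching the "BERT lr" / "FFN lr" rows of the hyper-parameter tables.
    encoder_params = [p for n, p in model.named_parameters() if "encoder" in n]
    head_params = [p for n, p in model.named_parameters() if "encoder" not in n]
    optimizer = torch.optim.Adam([
        {"params": encoder_params, "lr": bert_lr},
        {"params": head_params, "lr": ffn_lr},
    ])
    # Linear decay with warm-up over the first 10% of training steps.
    scheduler = get_linear_schedule_with_warmup(
        optimizer,
        num_warmup_steps=int(0.1 * total_steps),
        num_training_steps=total_steps,
    )
    return optimizer, scheduler
```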
                JS      F1-macro  F1-micro  precision  recall
    BERT-base   0.7551  0.6999    0.7126    0.7551     0.7171
    DeBERTa     0.7371  0.7129    0.6726    0.7911     0.6899
    RoBERTa     0.7752  0.7707    0.7519    0.7752     0.7752
Table 5: Train & evaluate SpanEmo on SemEval 2015

                JS      F1-macro  F1-micro  precision  recall
    BERT-base   0.7000  0.4660    0.6750    0.7000     0.7000
    DeBERTa     0.7347  0.4191    0.6687    0.8060     0.6750
    RoBERTa     0.5862  0.2786    0.5312    0.6538     0.5312
                JS      F1-macro  F1-micro  precision  recall
    BERT-base   0.8213  0.5307    0.7748    0.8438     0.8000
    DeBERTa     0.8401  0.6424    0.8168    0.8433     0.8370
    RoBERTa     0.8333  0.5364    0.8333    0.8527     0.8148
Table 7: Train Sentiment Predictor on Sentihood dataset and test it on SemEval 2016 dataset with the right aspect

                F1-micro  F1-macro
    BERT-base   0.7734    0.6330
    DeBERTa     0.7750    0.6076
    RoBERTa     0.7508    0.6026
Table 8: Train Sentiment Predictor on Sentihood dataset and evaluate it on SemEval 2016 dataset with all aspects

                F1-micro  F1-macro
    BERT-base   0.6586    0.2647
    DeBERTa     0.7453    0.2847
    RoBERTa     0.6586    0.2647
Table 9: Train Sentiment Predictor on Sentihood dataset and evaluate it on SemEval 2016 dataset with no aspect

                F1-micro  F1-macro
    BERT-base   0.6648    0.2662
    DeBERTa     0.6531    0.2634
    RoBERTa     0.6630    0.2659
Table 10: Train Sentiment Predictor on Sentihood dataset and evaluate it on SemEval 2016 dataset with the aspects predicted by SpanEmo (Big-Model)

                F1-micro  F1-macro
    BERT-base   0.7000    0.5350
    DeBERTa     0.5000    0.5000
    RoBERTa     0.5813    0.4299
Table 10 can be compared with the two baselines in Table 8 and Table 9. It is clear that, after the predicted aspects are given as prior knowledge, the Sentiment Predictor has better interpretability and performance. This shows that the right aspects can serve as knowledge for out-of-domain ABSA. When we need to classify another kind of dataset but lack the money and time to rebuild or retrain a model, this method enables out-of-domain text classification.
One odd observation is that the RoBERTa-based model with the right aspects performs worse than the RoBERTa-based Big-Model with the predicted aspects (see Table 10 and Table 7), even though the predicted aspects always contain some wrongly predicted ones. We believe this phenomenon is caused by random factors.
5 Conclusion
We propose a new method aimed at improving the performance of cross-domain aspect-based sentiment analysis. We first modify SpanEmo to predict the correct aspects of a sentence, and we evaluate the proposed method with different Sentiment Predictor backbones (BERT-base, DeBERTa, RoBERTa). Our empirical evaluation and analysis demonstrate the effectiveness and advantages of our model in cross-domain aspect-based sentiment analysis: with continued training, the model improves over both baseline tests. However, our model can currently only process a single target as input, and we plan to extend it to multiple targets in future work. We hope this study will inspire further research on cross-domain aspect-based sentiment analysis and greatly reduce the time and economic cost of retraining.
References
[Alhuzali and Ananiadou, 2021] Alhuzali, H. and Ananiadou, S. (2021). Spanemo: Casting multi-label
emotion classification as span-prediction. arXiv preprint arXiv:2101.10038.
[Bahdanau et al., 2014] Bahdanau, D., Cho, K., and Bengio, Y. (2014). Neural machine translation
by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.
[Chen et al., 2017] Chen, P., Sun, Z., Bing, L., and Yang, W. (2017). Recurrent attention network on
memory for aspect sentiment analysis. In Proceedings of the 2017 conference on empirical methods
in natural language processing, pages 452–461.
[Chung et al., 2014] Chung, J., Gulcehre, C., Cho, K., and Bengio, Y. (2014). Empirical evaluation of
gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555.
[Devlin et al., 2018] Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2018). Bert: Pre-training
of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
[Gao et al., 2019] Gao, Z., Feng, A., Song, X., and Wu, X. (2019). Target-dependent sentiment clas-
sification with bert. IEEE Access, 7:154290–154299.
[He et al., 2020] He, P., Liu, X., Gao, J., and Chen, W. (2020). Deberta: Decoding-enhanced bert
with disentangled attention. arXiv preprint arXiv:2006.03654.
[Hochreiter and Schmidhuber, 1997] Hochreiter, S. and Schmidhuber, J. (1997). Long short-term
memory. Neural computation, 9(8):1735–1780.
[Liu et al., 2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M.,
Zettlemoyer, L., and Stoyanov, V. (2019). Roberta: A robustly optimized bert pretraining approach.
arXiv preprint arXiv:1907.11692.
[Paszke et al., 2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T.,
Lin, Z., Gimelshein, N., Antiga, L., et al. (2019). Pytorch: An imperative style, high-performance
deep learning library. Advances in neural information processing systems, 32:8026–8037.
[Sun et al., 2019] Sun, C., Huang, L., and Qiu, X. (2019). Utilizing bert for aspect-based sentiment
analysis via constructing auxiliary sentence. arXiv preprint arXiv:1903.09588.
[Tang et al., 2015] Tang, D., Qin, B., Feng, X., and Liu, T. (2015). Effective lstms for target-dependent
sentiment classification. arXiv preprint arXiv:1512.01100.
[Vaswani et al., 2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N.,
Kaiser, L., and Polosukhin, I. (2017). Attention is all you need. In Advances in neural information
processing systems, pages 5998–6008.
[Wan et al., 2020] Wan, H., Yang, Y., Du, J., Liu, Y., Qi, K., and Pan, J. Z. (2020). Target-aspect-
sentiment joint detection for aspect-based sentiment analysis. In Proceedings of the AAAI Conference
on Artificial Intelligence, volume 34, pages 9122–9129.
[Wang et al., 2016] Wang, Y., Huang, M., Zhu, X., and Zhao, L. (2016). Attention-based lstm for
aspect-level sentiment classification. In Proceedings of the 2016 conference on empirical methods in
natural language processing, pages 606–615.