
Received 11 July 2023, accepted 7 August 2023, date of publication 14 August 2023, date of current version 21 August 2023.

Digital Object Identifier 10.1109/ACCESS.2023.3304990

Quantum-Enhanced Support Vector Machine for Sentiment Classification

FARISKA ZAKHRALATIVA RUSKANDA 1,2 (Member, IEEE), MUHAMMAD RIFAT ABIWARDANI 1, RAHMAT MULYAWAN 1,3,4 (Member, IEEE), INFALL SYAFALNI 1,3 (Member, IEEE), AND HARASHTA TATIMMA LARASATI 1,5 (Member, IEEE)
1 School of Electrical Engineering and Informatics, Bandung Institute of Technology, Bandung 40132, Indonesia
2 Artificial Intelligence Center, Bandung Institute of Technology, Bandung 40132, Indonesia
3 University Center of Excellence on Microelectronics, Bandung Institute of Technology, Bandung 40132, Indonesia
4 Research Collaboration Center for Quantum Technology 2.0, Bandung Institute of Technology, Bandung 40132, Indonesia
5 School of Computer Science and Engineering, Pusan National University, Busan 46241, South Korea

Corresponding author: Infall Syafalni ([email protected])


This work was supported and funded by the 2022 Young Researcher Grant from the School of Electrical Engineering and Informatics,
Bandung Institute of Technology.

ABSTRACT Quantum computers offer potential computational advantages such as speeding up complex computations, parallelism through superposition, and handling large data sets. Meanwhile, the field of natural language processing (NLP) is rapidly attracting researchers and engineers who are building ever larger NLP models. The use of quantum technology for NLP tasks, especially sentiment classification, is therefore a promising direction. In this research, we investigate the best technique to represent sentiment sentences so that sentiment can be analyzed using the Quantum-Enhanced Support Vector Machine (QE-SVM) algorithm. Investigations were carried out using circuit parameter optimization methods and data transformation. The pipeline of the proposed method consists of sentence-to-circuit conversion, circuit parameter training, state vector formation, and finally the training and testing processes. As a result, we obtained the best classification results with an accuracy of 93.33% using the SPSA optimization method and PCA data transformation. These results also outperform the baseline SVM method.

INDEX TERMS Sentiment classification, SVM, quantum-enhanced, quantum representation.

I. INTRODUCTION
Nowadays, opinions can be expressed easily in various online media by anyone. Therefore, this data is an important source that can be used to derive a person's sentiment value for something, such as a product, service, or person. The current sentiment classification process is generally carried out using Natural Language Processing (NLP) technology. Opinions or subjective sentences are automatically labeled as positive or negative sentiment values in sentiment classification [1]. This sentiment value can also be used further to make product profile summaries [2], vote predictions [3], or improve customer service [4].

(The associate editor coordinating the review of this manuscript and approving it for publication was Okyay Kaynak.)
Sentiment classification is generally solved using a Machine Learning (ML) algorithm that utilizes labeled training data to predict sentiment values. The learning algorithm allows the prediction process to better deal with opinion sentences characteristic of human language, or natural sentences. Quantum machine learning (QML) technology can be used to solve this problem. This technology combines quantum computers and artificial intelligence, especially learning algorithms. Quantum computers are used to solve complex problems that are intractable for classical computers [5]. One of the QML methods is the Quantum-Enhanced Support Vector Machine (QE-SVM), proposed in 2018 [6]. This method has an advantage over SVM in the form of a quantum kernel, namely a kernel function that can be computed using quantum circuits. This kernel accepts as input a feature map representing a complex vector space. This method outperforms SVM on various structured (numeric) datasets, using feature map transformations and adjustments to the rotation factor [7].

The nature of complex vectors that QE-SVM can handle aligns with the nature of subjective sentences in sentiment classification data. Therefore, using QE-SVM in the sentiment classification task is a potential research area. As one of the tasks in NLP, handling sentiment classification in a quantum environment is carried out using the Quantum NLP (QNLP) methodology [8]. This methodology uses a compositional language structure in the form of grammar and semantics constructed in a quantum way.

The main problem of this research is to find the best data representation for quantum NLP to represent the sentiment of a sentence, i.e., sentiment classification. Moreover, several optimizers such as SPSA and ANN are explored in order to improve the classification performance. Finally, we also extend sentiment classification from the classical SVM method to the Quantum-Enhanced SVM (QE-SVM) method.

Our previous work [9] formulated a quantum representation for the sentiment classification task. We used a state vector representation and particular negation handling with the Not-box operation. However, the dimensions of the vector representation are large, and the prediction results left room for improvement (81.67% accuracy). The challenge that needs to be solved is building a proper quantum representation of subjective sentences that can be computed quickly and precisely using the QE-SVM learning algorithm.

In this paper, the focus is on exploring how to use quantum natural language processing (QNLP) to represent the sentiment of a sentence. The aim is to come up with an effective and efficient quantum representation of subjective sentences that can be used for quantum sentiment classification. We modified an existing experimental QNLP pipeline (described in [10]) to better suit our needs, particularly during the optimization stage. The methodology involves converting sentences into circuits, training circuit parameters, and reading state vectors, followed by techniques for transforming the state vector data to work with the QE-SVM classifier.

In summary, this paper has two main contributions. The first is developing an effective and efficient quantum representation of subjective sentences. We suggest using the X-gate quantum operation to represent negative sentences in a quantum circuit. In addition, we propose two alternative data transformation methods - double angles and PCA - to make the data compatible with the QE-SVM classifier. The second contribution is being the first to apply QE-SVM to natural language processing tasks, specifically sentiment classification. We demonstrate that using QE-SVM with the appropriate representation leads to better predictive performance than SVM. Moreover, compared to the previous work [9], the proposed method improves the accuracy performance up to 93.33% using SPSA circuit parameter training and PCA with n = 14 data transformation. This work opens a potential path for using a quantum kernel in NLP on quantum computers.

The remainder of this paper is organized as follows. Section II covers sentiment analysis in quantum computing, the optimization method, the quantum kernel, and SVM in brief. Section III explains our proposed QE-SVM method. Section IV comprises the findings of our experiments as well as some discussions. The final section concludes the paper.

II. RELATED WORKS
A. SENTIMENT ANALYSIS AND QUANTUM NLP
Sentiment analysis, one of the most developed fields of NLP, has been widely researched because of its significant use. One potential approach is to use quantum machine learning. Several methods try to imitate quantum mechanisms, including [11], which examined sentiment analysis on Twitter data using a quantum-inspired representation model. This method uses quantum mechanisms to model semantic and sentiment information on a series of projectors in a probabilistic space. This method was later developed into a quantum-like multimodal network (QMN), which combines quantum theory with long short-term memory (LSTM) networks for multimodal sentiment analysis on conversations [12]. Quantum algorithms in Variational Quantum Classifiers (VQC) can also be used to solve sentiment analysis problems; the work in [13] carried out one such study using EfficientSU2 and RealAmplitudes, built-in libraries from the Qiskit quantum computing simulator. Although similar, this method outperforms the classification results of classical ML models.

One of the critical steps in the QNLP methodology is circuit parameter training after converting sentences into circuits. This learning process is carried out using a learning/optimization algorithm. One widely used method is the Simultaneous Perturbation Stochastic Approximation (SPSA) [14]. An essential feature of SPSA is the gradient approximation, which requires only two measurements of the objective function regardless of the dimensions of the optimization problem. This feature significantly reduces optimization costs, especially in problems with many variables to optimize. Moreover, this method often outperforms other optimization methods, especially in variational quantum algorithms [15].

B. QUANTUM KERNEL AND OPTIMIZATION
To carry out the classification process, the kernel method is one of the most widely used approaches in machine learning; among kernel methods, the Support Vector Machine (SVM) is the most well-known traditional learning method [16]. Combining the advantages of SVM with quantum computing, the authors of [6] proposed the concept of a quantum variational classifier that is run using a quantum variational circuit. They also proposed a quantum kernel estimator, which optimizes the SVM classifier by estimating the kernel function.
The latter method is the basis for the development of the QSVM module/library in Qiskit, which makes this method easily adopted by many parties.

The quantum kernel method utilizes a quantum feature space. Recalling that quantum states exist in Hilbert space [17], one can calculate the inner product between two quantum states. Theoretically, this can be achieved directly on a quantum circuit; the inner product between the state Ψ1, prepared by a set of unitaries U1, and the state Ψ2, prepared by a set of unitaries U2, can be calculated by applying the unitaries U2†U1 and observing the resulting state [18]. Alternatively, one can measure each state Ψ1 and Ψ2 and calculate the inner product classically. In both cases, the value of the inner product is used for further interpretation. Most commonly in machine learning, it is used to find the support vectors of a support vector classifier [6]. The motivation for using quantum kernels is that quantum feature maps are more difficult to calculate classically while potentially partitioning the data/input space in a more distinguishable manner [7].

The QSVM concept that was previously developed was then continued at the application level by [7], using the Noisy Intermediate-Scale Quantum (NISQ) assumption. In their work, the authors use quantum states built from quantum feature maps of structured data. Subsequently, the vector is handled by the quantum kernel to carry out the classification. The datasets used are three standard UCI datasets, namely wine, breast cancer, and handwritten digits, as well as two artificial numeric datasets.

One of the essential stages before the quantum kernel is data transformation, which produces feature maps. This process can be done using special functions or rules, such as Principal Component Analysis (PCA) or double angles, or automatically using a learning algorithm. The last category was developed by [19], using a genetic algorithm to minimize circuit parameters. This approach was tested on structured data, namely the Parkinson's dataset, IoT irrigation, and drug classification.

III. PROPOSED METHOD
In this work, we design a sentiment classifier based on a quantum feature map. Figure 1 shows the illustration of the fundamental difference between classical feature maps and quantum feature maps. Basically, the classical feature space is formed by classical values, where the data points are represented by their original features before any kernel is applied. On the other hand, the quantum feature space is formed by quantum states. Thus, the quantum feature map is also formed by quantum circuits, as depicted in Figure 1.

FIGURE 1. Illustration of comparison between quantum and classical kernel.

To classify sentiment using a quantum representation, we use an experimental QNLP pipeline similar to the one used in [10]. The pipeline involves converting sentences into circuits, optimizing them, and using the resulting circuits to classify sentiment using the QE-SVM method. We made some modifications to the pipeline, particularly during the optimization stage. The general pipeline stages used in this study include: (1) generating circuits from sentences, (2) training circuit parameters, (3) extracting state vectors from the circuits as sentence embeddings, and (4) using these embeddings to train the QE-SVM classifier and predict the sentiment of each sentence. This process is illustrated in Figure 2.

FIGURE 2. General pipeline.

The sentiment classification task used in this study involved the restaurant sentiment dataset and required binary sentiment classification (positive or negative). In the circuit representation, each sentence type 's' was mapped to 1 qubit. The conversion process from sentences to circuits was adapted from previous studies, e.g., [8], [20], and [21], with some modifications. The training process for circuit parameters was conducted using SPSA. Each stage is explained further below.
A. SENTENCES-TO-CIRCUIT
1) Sentences to circuits: The process of converting sentences into circuits is performed using the Quantum Pipeline with Tket, which can be found in the Lambeq 0.1.2 documentation. First, the sentences are transformed into DisCoCat diagrams through the DepCCG parser [22], with ''not'' words being ignored for the time being. A DisCoCat diagram is defined as a model of semantic word interactions in a sentence [22]. A complete explanation of sentence representation using graphical language, or the DisCoCat method, is given in [9]. Then, the diagrams are simplified by rewriting them with the Lambeq Rewrite package, which uses a set of transformation rules to change the strings or boxes of the diagram. The determiner, pre-adverb, and post-adverb rules are used in all experiments, while the auxiliary rule usage varies. The cups in the diagrams are removed using the bigraph method in the Lambeq 0.1.2 documentation. To remove the cups, some restructuring may be necessary, such as moving all the cups below all the word boxes and ordering them such that all the cups on the right of a cup are positioned above it. The algorithm for this conversion is provided in [9].
2) Stemming: To reduce the complexity of the words in the diagram, stemming is performed using NLTK's PorterStemmer [23] after the removal of the stop words.
3) Diagrams to circuits: The next step involves converting the diagrams into quantum circuits using the IQP ansatz, which is available in the Lambeq Ansatz package [10].
4) Apply Not-Box settings to negative sentences: To deal with negative sentences, we added the Not-Box settings to the circuit by using the Pauli-X gate on the output qubit, since the Not-Box settings were previously removed during the parsing stage. The output qubit refers to the qubit that represents the resulting type 's' after grammatical type reduction. In situations where the output qubit is located in the middle, the Not-Box may be applied there instead. This approach was inspired by [24], which directly captures the negative meaning of negated words. The Pauli-X gate was selected as one of the representations since it flips the probability of measurement on a single qubit, resulting in the sentence's sentiment being flipped when applied to the output 's' qubit. This decision was also supported by the findings of the Not-box experiment in our prior work [9].

B. CIRCUIT-TO-PARAMETER TRAINING
The quantum circuits are transformed into the objective function for SPSA optimization [25], where the output qubit of type 's' is compared with the sentiment label to determine the cost of the model. During the training process, the free variables of the circuits, which are the gate parameters, are optimized. Once the parameters are optimized, the circuit values are used to predict the sentiment of a sentence by measuring the output qubit.

C. READING STATE VECTORS
1) Stemming and diagrams to circuits: The training process involves using the state vector values of each circuit. These state vectors are obtained from the word boxes of a sentence, which are extracted from a string diagram. The sentence's vector representation is the result of taking the tensor product of the state vectors of each word box in the sentence. In order to measure the states, circuits are created from diagrams that have not had their cups removed.
2) Applying Not-Box settings to negative sentences: To handle negative sentences containing the word ''not'', the Not-Box settings are applied based on the chosen representation.
3) Reading state vectors and adding the padding: The circuit parameters are initialized, and the state vector values are read. To ensure a uniform feature size across all inputs, the state vectors are padded with the |0⟩ state to fill quantum registers of the same size.

D. TRAINING WITH QE-SVM
Finally, these inputs are trained with QE-SVM using a quantum kernel [9]. The state vector data trained in the previous section needs to be processed to be fit on a QE-SVM. The pipeline to train the QE-SVM is described in the following section. Note that the output of the training is a feature that is defined as a state vector measurement after quantum circuit optimizer training. In this case, we use SPSA or ANN as the optimizer.
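To make the sentence-to-circuit stage concrete, Listing 1 gives a minimal Python sketch of stages A.1-A.3 using the lambeq library. It is illustrative rather than the exact implementation used in this work: the experiments used lambeq 0.1.2 with the DepCCG parser, while the listing uses the API of more recent lambeq releases, with BobcatParser standing in for DepCCG.

Listing 1. Sketch of the sentences-to-circuit stage with lambeq (recent API assumed).

  from lambeq import AtomicType, BobcatParser, IQPAnsatz, Rewriter, remove_cups

  parser = BobcatParser()
  diagram = parser.sentence2diagram("the service is not bad")  # "not" handled later

  # Simplify the DisCoCat diagram with rewriting rules (determiner,
  # pre-adverb, post-adverb; the auxiliary rule usage varies per experiment).
  rewriter = Rewriter(['determiner', 'preadverb', 'postadverb', 'auxiliary'])
  diagram = rewriter(diagram).normal_form()

  # Remove the cups, then map the diagram to a circuit with the IQP ansatz,
  # assigning 1 qubit to each atomic type as in this paper.
  ansatz = IQPAnsatz({AtomicType.NOUN: 1, AtomicType.SENTENCE: 1}, n_layers=1)
  circuit = ansatz(remove_cups(diagram))
  # For a negated sentence, a Pauli-X ("Not-box") would now be appended to
  # the qubit carrying the output type 's'.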

IV. QE-SVM PIPELINE
This section explains the steps taken to fit the state vector data trained in the previous section on a QE-SVM. In general, the quantum kernel uses a quantum feature map to take data of dimension n and map it onto n corresponding qubits. The quantum kernel is then used by a classical SVM to approximate the separating hyperplane. Each data point is supplied to the SVM, where the values are used as the parameters of the feature map to be used as the inputs of the SVM.

A. DATA TRANSFORMATION
First, the data is transformed from high-dimensional state vector data into lower-dimensional data. This is done in order to train the QE-SVM in a reasonable time. For this experiment, the data is transformed into 14 columns. The value 14 is chosen because the original state vector data is obtained from reading a quantum circuit with 7 qubits, where each qubit state has a |0⟩ and a |1⟩ component (i.e., in superposition).

This paper investigates two data transformation methods: the double angles method, and PCA with 14 principal components.

1) DOUBLE ANGLES
For a state vector composed of n qubits with states [α1, β1], ..., [αn, βn], each state can be described by two angles θ1 and θ2. Given that qubit states are normalized, |α|² + |β|² = 1, we can calculate θ1 as

tan(θ1) = β / α, (1)
θ1 = tan⁻¹(β / α). (2)

On the other hand, because α and β are complex values, we can find the angle between them in complex space using the cosine rule. Therefore, we can calculate θ2 as

α · β = |α||β| cos(θ2), (3)
θ2 = cos⁻¹( (α · β) / (|α||β|) ). (4)

These two angles are chosen because they describe the magnitude and similarity of each component, respectively. Furthermore, this decorrelates a majority of the high-dimensional state vector data.

2) PCA (n=14)
Principal Component Analysis is used to obtain the principal components and project the input data onto lower dimensions. PCA takes the input data and projects it onto a set of orthogonal vectors, which describe a p-dimensional ellipsoid fitted onto the reference dataset. The coordinates are ordered such that the components have descending variance (the projection of the data with the greatest variance is known as the first principal component and lies on the first coordinate, and so on).

PCA takes a data matrix X with n records and p fields (assuming the value of each column has been preprocessed such that the mean of each column is zero) and transforms it by a set of l weight vectors, each with dimension p, onto a target vector space known as the principal component scores, such that the set of scores t of a data entry has the maximum possible variance of X. It is noted that the weight vectors w have been normalized, and the cardinality of the set of weight vectors l is less than p, such that the resulting transformation of X yields data with reduced dimensionality as follows [26]:

w(k) = (w1, ..., wp)(k), (5)
t(i)k = x(i) · w(k). (6)

B. FEATURE MAP SELECTION
The experiments presented in this paper focus on three Pauli feature maps: Pauli Y, Pauli YY, and Pauli Y YY. The decision is inspired by the work in [7], which uses the Pauli Y, Pauli YY, Pauli Y YY and Pauli Z, Pauli ZZ, Pauli Z ZZ feature maps. Preliminary experimentation showed that the Y and Z counterparts yielded the same results, which is explained by the SVM only reading the real values of the quantum kernel output. Therefore, for brevity, the methods listed will cover the Y counterparts of those three feature maps.

A feature map with Pauli Y rotation gates takes input data x and encodes it onto a quantum circuit by the following transformation. The general form can be written as [7]:

U_φ,Y(x) = exp( i Σ_S φ_S(x) Π_{j∈S} σ_Y^(j) ). (7)

The above gate encodes the transformation matrix as a set of Pauli Y rotations with input φ_S(x), where S denotes the connectivity between a subset of qubits in the quantum circuit, and φ_S(x) is x0 when only a single qubit is concerned and is Π_{j∈S}(π − xj) otherwise.

FIGURE 3. Pauli Y feature map circuit.

1) PAULI Y FEATURE MAP
The Pauli Y feature map is a simple feature map with a P gate between a π/2 X-rotation gate and its inverse. The result is a Y-rotation gate with angle x, which may be repeated multiple times. There is no entanglement in this feature map (Figure 3).

2) PAULI YY FEATURE MAP
The Pauli YY feature map is a second-order Pauli Y evolution circuit with Pauli Y and Pauli YY components. In the YY feature map, binary entanglement is introduced between all pairs of qubits in the circuit, with its input parameter corresponding to the index of the qubit pair permutation. As with the Pauli Y feature map, this Pauli YY circuit may be repeated multiple times.
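Such feature maps can be instantiated, for example, with Qiskit's PauliFeatureMap, as sketched in Listing 2. This is an illustration under the qiskit.circuit.library API; the exact circuits of Figures 3-5 may differ in detail.

Listing 2. Sketch of the Pauli Y, YY, and Y YY feature maps with Qiskit.

  from qiskit.circuit.library import PauliFeatureMap

  n_features = 14  # 14 transformed columns, one per qubit

  fm_y = PauliFeatureMap(feature_dimension=n_features, reps=1,
                         paulis=['Y'])                      # Pauli Y: no entanglement
  fm_yy = PauliFeatureMap(feature_dimension=n_features, reps=1,
                          paulis=['YY'], entanglement='linear')  # Pauli YY: entangling
  fm_y_yy = PauliFeatureMap(feature_dimension=n_features, reps=1,
                            paulis=['Y', 'YY'], entanglement='linear')  # Pauli Y YY
  # The 'alpha' keyword scales the rotation angles and plays the role of the
  # rotation factor of Section IV-C, e.g. alpha=0.9. A Pauli Y Y YY map could
  # plausibly be composed as paulis=['Y', 'Y', 'YY'], though this composition
  # is an assumption of the sketch.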
FIGURE 4. Pauli YY feature map circuit.

3) PAULI Y YY FEATURE MAP
The Pauli Y YY feature map is a Pauli Y feature map followed by a Pauli YY feature map. This feature map starts out without entanglement, then has linear entanglement introduced by the second-order Pauli Y evolution circuit component. The Pauli Y YY circuit may be repeated multiple times.

FIGURE 5. Pauli Y YY feature map circuit.

4) PAULI Y Y YY FEATURE MAP
Following the construction pattern of the Pauli Y YY feature map, the Pauli Y Y YY feature map is a Pauli Y feature map, followed by another Pauli Y feature map, followed by a Pauli YY feature map. That is, this feature map prepends an additional Pauli Y encoding circuit to the Pauli Y YY feature map. The Pauli Y Y YY circuit may be repeated multiple times.

C. ROTATION FACTOR APPLICATION
To handle overfitting, a rotation factor α is applied to the rotation gate parameter angles φ_S(x), such that the values are multiplied by this scaling factor, modifying the feature map transformation into the following equation [7]:

U_φ(x) = exp( i Σ_S α φ_S(x) Π_{j∈S} σ_j ), σ_j ∈ {X, Y, Z}. (8)

The rotation factor values chosen in this paper range from 0.5 to 2.0 with an increment of 0.1, as well as several other interesting values (0.75, 1.25, and 1.75).

1) QUANTUM KERNEL PREPARATION
This step prepares a Qiskit Quantum Instance from a Qiskit backend, then instantiates a Quantum Kernel with the chosen Pauli feature map. The Quantum Instance is a Qiskit object that contains a Qiskit Backend, as well as the configuration for circuit transpilation and execution. It is used to run the Quantum Kernel when called by the SVC during later steps. The Quantum Kernel is a Qiskit object that packages a quantum kernel function by transforming two sets of n-dimensional data, say x and y, onto higher-dimensional data (typically of dimension 2^n) through the use of a quantum feature map, which takes x as its input parameters, and calculates the dot product between them. The dot product result in matrix form can then be used in common machine-learning techniques:

K(x, y) = ⟨f(x), f(y)⟩. (9)
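Listing 3 sketches this preparation step. It assumes the QuantumKernel and QuantumInstance classes of qiskit-machine-learning as of around 2022; newer releases replace these with FidelityQuantumKernel and backend primitives.

Listing 3. Sketch of the quantum kernel preparation (older Qiskit API assumed).

  from qiskit import BasicAer
  from qiskit.utils import QuantumInstance
  from qiskit.circuit.library import PauliFeatureMap
  from qiskit_machine_learning.kernels import QuantumKernel

  feature_map = PauliFeatureMap(feature_dimension=14, reps=1,
                                paulis=['Y', 'YY'], entanglement='linear',
                                alpha=0.9)  # alpha acts as the rotation factor
  quantum_instance = QuantumInstance(BasicAer.get_backend('qasm_simulator'),
                                     shots=16)  # 16 shots, as in Section V
  kernel = QuantumKernel(feature_map=feature_map,
                         quantum_instance=quantum_instance)

  # Kernel (Gram) matrices for training and testing, cf. Eq. (9):
  # K_train = kernel.evaluate(x_vec=X_train)
  # K_test  = kernel.evaluate(x_vec=X_test, y_vec=X_train)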

2) QE-SVM TRAINING
In this paper, we use the Scikit-learn SVC as the classical basis of the SVM. It takes the previously defined quantum kernel as a hyperparameter and the transformed training set as its input, and then classically trains on them to obtain the separating hyperplane. The transformed testing set is then used to validate the results of the QE-SVM.
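Listing 4 sketches this training step on top of the kernel of Listing 3. The variables X_train, X_test, y_train, and y_test denote the transformed data and labels and are assumed to exist.

Listing 4. Sketch of QE-SVM training with scikit-learn's SVC and a quantum kernel.

  from sklearn.svm import SVC

  # Pass the quantum kernel function as a callable kernel.
  qesvm = SVC(kernel=kernel.evaluate)
  qesvm.fit(X_train, y_train)

  train_acc = qesvm.score(X_train, y_train)
  test_acc = qesvm.score(X_test, y_test)

  # Equivalently, precompute the Gram matrix and use kernel='precomputed':
  # qesvm = SVC(kernel='precomputed').fit(kernel.evaluate(X_train), y_train)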

D. QE-SVM ALGORITHM IMPLEMENTATIONS
This subsection explains the implementations of the data transformations, i.e., double angles and PCA. Finally, the QE-SVM algorithm is described to predict the testing data.

Algorithm 1 Double Angles Data Transformation (DA)
Input: inputData: state vector decomposition of training/test data. nQubits: number of qubits used in training/test circuits.
Output: outputData: double angles of training/test data.
begin
  DataFrame inputData
  DataFrame outputData
  integer nQubit
  complex v0, v1, α, β
  float norm
  integer N ← length of inputData
  statevector sv, complex vector of length N/2
  arrayFloat angles
  for i = 1 to N do
    angles ← [0.0, ..., 0.0]
    sv ← inputData_i
    v0 ← sv_0
    for q = 1 to nQubits do
      v1 ← the entry of sv for the basis state |00...1...00⟩ with the 1 at the q-th position; note that the index of v1 is j = 2^(nQubits−q)
      norm ← √(v0·v0† + v1·v1†), for calculating α, β in α|0⟩ + β|1⟩
      α ← v0 / norm
      β ← v1 / norm
      angles_(2q−1) ← arctan |α/β|
      angles_(2q) ← arccos( (α·β†) / (||α|| ||β||) )
    outputData_i ← angles
  return outputData

Algorithm 1 shows the double angles data transformation. The function converts a state vector into double-angle data. For each qubit, we take the entry of sv at the basis state |00...1...00⟩, where the q-th position is 1. Next, the normalization of v0 and v1 is calculated to find the values of α and β. Finally, the double angles are calculated by arctan |α/β| and arccos( (α·β†) / (||α|| ||β||) ). The angle values are stored in outputData at the corresponding index.

Algorithm 2 PCA Data Transformation (PCA)
Input: trainData: state vector decomposition of training data. testData: state vector decomposition of test data. nQubits: number of qubits used in training/test circuits.
Output: outputTrain: PCA of training data. outputTest: PCA of test data.
begin
  DataFrame trainData
  DataFrame testData
  DataFrame outputTrain
  DataFrame outputTest
  integer nQubit
  initialize TransformerPCA(nQubit)
  fit TransformerPCA(nQubit)
  outputTrain ← TransformerPCA(trainData)
  outputTest ← TransformerPCA(testData)
  return (outputTrain, outputTest)

Next, the other data transformation is PCA (Algorithm 2). The PCA is formed from the state vector. First, the initialization of the PCA transformer is conducted. The PCA uses a set of nQubit weight vectors, where each weight inherits the dimension of X. In this case, X represents a 7-qubit state vector and contains 128 elements. It uses the weights to map each record x in X to the resulting scores t such that t maintains the maximum variance of X. The outputs of the PCA transformation are stored in outputTrain and outputTest.

Finally, Algorithm 3 shows the main QE-SVM procedure. The transformed data (by double angles or PCA) is the input of the QE-SVM. First, we apply the rotation factor as expressed in Equation (8) by changing the value of α. Next, we create the feature map with the following properties: type of feature map, number of qubits (nQubit), repetitions, and entanglement. After that, we run the simulation using the quantum kernel with n shots (nShots) and store the result in adhocKernel. Finally, we run the QE-SVM classifier, run the training and testing predictions, and compute the accuracy performances. The results are given by the variables trainAccuracy and testAccuracy.

Algorithm 3 QE-SVM
Input: DataFrame trainData, DataFrame testData, arrayInteger trainLabels, arrayInteger testLabels, integer nQubits, dictionary trainingConfig.
Output: arrayInteger trainPredictions, arrayInteger testPredictions, float trainAccuracy, float testAccuracy.
begin
  1) Apply the rotation factor by ApplyRotationFactor(trainData, testData)
  2) Create the feature map by FeatureMap(Pauli, nQubits, Repetitions, Entanglement)
  3) Run the quantum simulation and kernel by QuantumSimulationAndKernel(nShots, FeatureMap), stored in adhocKernel
  4) Initialize the SVM from the QESVM: qesvmClassifier(adhocKernel, scaledTrainData)
  5) Call the training prediction by qesvmClassifier.predict(scaledTrainData) → trainPredictions
  6) Call the testing prediction by qesvmClassifier.predict(scaledTestData) → testPredictions
  return (trainAccuracy, testAccuracy)
V. EXPERIMENT AND RESULTS
The experiments in this paper explore and compare three circuit-parameter-training methods and two data transformation methods for identifying the sentiment of a sentence using QE-SVM. These experiments are executed by implementing each step of the general pipeline in Figure 2. The purposes of our experiments can be described as follows:
1) To study the effect of using the two data transformation and circuit parameter training methods on the prediction result
2) To study the impact of the ANN architecture for circuit parameter training on the prediction result
3) To study the effect of the rotation factor on the prediction result
4) To perform a prediction comparison with the SVM baseline

First, the experimental hardware used is a Linux OS with 8 vCPUs and 52 GB of RAM with VM type n1-highmem-8. The hardware is the same as in our preliminary work [9]. The dataset used in the experiment is a collection of simple subjective sentences in the restaurant domain. These sentences are generated from 29 vocabularies, consisting of positive and negative sentences, with each sentence having a length of 4-5 words. This dataset is divided into 170 training, 50 development, and 60 test sentences.

TABLE 1. Data transformation and circuit parameter training experiment result.

A. DATA TRANSFORMATION AND CIRCUIT PARAMETER TRAINING EXPERIMENTS
This experiment was conducted to determine which combination of methods is most appropriate to improve the sentiment classification results in QE-SVM. The combination of methods covers: circuit parameter training, data transformation method, feature map, and rotation factor. For the circuit parameter training method, we used three options: SPSA, ANN 1 (3 layers), and ANN 2 (5 layers). We use two alternatives for the data transformation method, namely double angles and PCA (n=14). As for the feature map, we use four methods: Pauli Y, Pauli YY, Pauli Y YY, and Pauli Y Y YY. The rotation factor parameter varies from 0.5 to 1, and repetitions from 1 to 3, whereas the fixed parameters are linear entanglement with 16 shots. To simplify the presentation of the results, only the three best results for each combination of circuit parameter training method and representation are shown (Table 1). Based on the experimental results, it can be seen that the combination of the SPSA optimization method with the PCA transformation method, the Pauli Y Y YY feature map, and a rotation factor of 0.9 gives the best accuracy result of 93.33%.

B. ANN LAYER EXPERIMENTS
The ANN layer experiment was carried out to determine the effect of the number of ANN layers on the sentiment classification results. The experimental parameters used are the data transformation method, feature map, and rotation factor for two ANN architectures: ANN 1 and ANN 2, with 3 and 5 layers, respectively. Based on the experimental results in Table 3 and Figure 6, ANN 1 gives better accuracy than ANN 2. The best configuration is ANN 1 with double angles, Pauli Y, a rotation factor of 1, and a repetition of 1. The detailed configuration is shown in Table 2.
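For reference, Listing 6 sketches how SPSA-based circuit parameter training (the strongest of the three methods compared here) can be set up with Qiskit's SPSA optimizer. The cost function is a runnable stand-in, since the actual cost in this work compares the measured output 's' qubit with the sentiment label; the minimize-style API assumes a recent Qiskit release.

Listing 6. Sketch of SPSA circuit parameter training with Qiskit.

  import numpy as np
  from qiskit.algorithms.optimizers import SPSA

  def cost(params):
      # Stand-in objective so the sketch runs; the real cost would bind the
      # parameters to the sentence circuits, measure the output 's' qubit,
      # and return the average disagreement with the sentiment labels.
      return float(np.sum(np.sin(params) ** 2))

  n_params = 10  # hypothetical number of free gate angles
  x0 = np.random.uniform(0, 2 * np.pi, size=n_params)
  result = SPSA(maxiter=100).minimize(cost, x0)
  trained_params = result.x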

TABLE 2. ANN configurations.

FIGURE 6. Summary of ANN layer experiment.

C. ROTATION FACTOR EXPERIMENT
We also conducted rotation factor experiments to find the best values for classification in QE-SVM. We use the PCA transformation method and the best SPSA circuit parameter training method. The experimental results in Table 4 and Figure 7 show that the rotation factor value that gives the best classification results is 0.9, with an accuracy value of 93.33%.

FIGURE 7. Rotation factor parameter.

D. COMPARISON WITH BASELINE: SVM
The baseline method used as a comparison in this experiment is the (classical) SVM. In addition, we use the circuit parameter training methods SPSA, ANN 1, and ANN 2, as well as the data transformation methods double angles and PCA. As can be inferred from Table 5 and Figure 8, the best accuracy for the baseline method was obtained at 80.00% using ANN 1-double angles. Meanwhile, for the proposed method, the best accuracy was obtained at 93.33%, and the highest increase was 16.66% using SPSA-PCA. Moreover, compared to the previous work [9], the proposed method improves the accuracy performance up to 93.33% using SPSA circuit parameter training and PCA with n = 14 data transformation.

FIGURE 8. CPT and representation method.

E. CASE STUDY
To better understand the sentiment classification capability of our model, we evaluate some of the best models and analyze several cases. In the first evaluation, we compared the three models representing the best circuit parameter training methods and representations. The following are the three models used and their parameter configurations:
1) Model 1: circuit parameter training method SPSA - representation PCA - feature map Pauli Y Y YY - rotation factor 0.9 - repetitions 1
2) Model 2: circuit parameter training method ANN 1 - representation double angles - feature map Pauli Y - rotation factor 1 - repetitions 2
3) Model 3: circuit parameter training method ANN 2 - representation PCA - feature map Pauli Y - rotation factor 2 - repetitions 3

We take examples of several sentences with representations of sentence types and their properties, along with their prediction results for the three models (Table 6). For example, on a negative sentence with the negation word ''not'' (sentence 3), model 2 (ANN 1) failed to predict correctly, whereas on a negative sentence without negation (sentence 1), model 3 (ANN 2) failed to predict correctly. On a positive sentence (sentence 2), models 2 and 3 both fail to predict correctly. On the other hand, model 1 succeeded in predicting all three types of sentences. This shows that the SPSA-PCA combination on QE-SVM provides the best performance across various types of positive and negative sentences. However, the combination of ANN with PCA or double angles still needs further optimization in handling positive and negative sentences.

TABLE 3. ANN layer experiment result.

TABLE 4. Rotation factor experiment result.

TABLE 5. Comparison with SVM baseline.

TABLE 6. Comparison of QE-SVM prediction results with three variations of the circuit parameter training method.

In the second evaluation (Table 5), we compared sentences that could be handled by our method (QE-SVM) to the baseline method (SVM). The following are the three models used and their parameter configurations:
1) Baseline 1: circuit parameter training method ANN 1 - representation double angles - classifier SVM
2) Baseline 2: circuit parameter training method SPSA - representation PCA - classifier SVM
3) Our method: circuit parameter training method SPSA - representation PCA - classifier QE-SVM

We found some cases when comparing the three models (Table 7).

TABLE 7. Comparison of the predicted results of our method and the baselines.

1) The prediction is correct in our method but wrong in baseline 1 or baseline 2 (cases a1 and a2). Observation of the prediction results shows that baseline 1's errors are more often false positives than false negatives. From the test set prediction results, baseline 1 has a minimal tendency to predict positively compared to our method. As additional information, the false positive rate of baseline 1 is 34.62%, while our method's false positive rate is 10.71% (baseline 1 accuracy of 80%, our method's accuracy of 93.33%). In the case of baseline 2, there is no particular tendency to predict negative or positive, with predicted negative = 31 and predicted positive = 29. Therefore, it cannot be concluded that the model tends to predict positively or negatively, and the prediction errors occur due to shortcomings of the models in other aspects. By transforming the state vector, QE-SVM provides more accurate prediction results for both positive and negative sentences. Moreover, classical SVM is unsuitable for transformed data (double angles/PCA) and performs better before transformation, because the untransformed data is more descriptive and no information is lost. Classical SVM can handle high-dimensional state-vector data because it does not need to simulate a quantum kernel; the dot product between two high-dimensional vectors is only O(n). However, its overall performance is still below QE-SVM.
2) The prediction is wrong in our method but correct in baseline 1 or baseline 2 (cases b1 and b2). The case where our method is wrong and baseline 1 or baseline 2 is correct occurs when the label is negative and our best model incorrectly predicts it as positive. It is conjectured that baseline 1 happens to be accurate in these cases, given its slight tendency to predict negatively. In addition, our method has a wrong prediction on a positive sentence (the 9th sentence). This is presumably due to the similarity of the sentence to one of the sentences in the train set.

Lastly, the comparison among the several QE-SVM methods in terms of accuracy with respect to epoch is depicted in Figure 9. As shown in the figure, at higher epochs the combination of PCA and SPSA yields the highest accuracy for both training and testing. This is due to the state vector information being well represented and optimized by the PCA and the SPSA. On the other hand, the double angles data (DA) may eliminate some information. For comparison, the PCA (n=14) approximates the distribution of the 128-dimensional data with 14 values, whereas the double angles represent each of the 7 qubits with only 2 values (the angle and the amplitude).

FIGURE 9. Comparisons of several methods of QE-SVM.

VI. CONCLUSION
This paper described a study on the implementation of QE-SVM on an NLP task: sentiment classification. The subjective sentence, which contains a sentiment value, was analyzed by transforming it into a quantum representation that can be used as input to the quantum kernel. The experimental results proved that the combination of the sentences-to-circuit steps, the SPSA optimization method, and the PCA data transformation method on QE-SVM provided the best sentiment classification result of 93.33% accuracy, an increase of 16.6% compared to the baseline SVM. This approach worked on both positive and negative subjective sentences. Our work points to a potential path for using quantum kernels for NLP on quantum computers.

In future research, we suggest further development by using Variational Quantum Algorithms and by implementing them on a quantum computer.

ACKNOWLEDGMENT
An earlier version of this paper was presented at the 2022 IEEE International Conference on Quantum Computing and Engineering (QCE), Broomfield, CO, USA [DOI: 10.1109/QCE53715.2022.00025].

REFERENCES
[1] B. Liu, Sentiment Analysis: Mining Opinions, Sentiments, and Emotions. Cambridge, U.K.: Cambridge Univ. Press, 2020.
[2] V. Vyas and V. Uma, "Approaches to sentiment analysis on product reviews," in Sentiment Analysis and Knowledge Discovery in Contemporary Business. Hershey, PA, USA: IGI Global, 2019, pp. 15-30.
[3] S. Unankard, X. Li, M. Sharaf, J. Zhong, and X. Li, "Predicting elections from social networks based on sub-event detection and sentiment analysis," in Proc. 15th Int. Conf. Web Inf. Syst. Eng. (WISE), Thessaloniki, Greece. Cham, Switzerland: Springer, Oct. 2014, pp. 1-16.
[4] W. Duan, Q. Cao, Y. Yu, and S. Levy, "Mining online user-generated content: Using sentiment analysis technique to study hotel service quality," in Proc. 46th Hawaii Int. Conf. Syst. Sci., Jan. 2013, pp. 3119-3128.
[5] J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, "Quantum machine learning," Nature, vol. 549, no. 7671, pp. 195-202, 2017.
[6] V. Havlíček, A. D. Córcoles, K. Temme, A. W. Harrow, A. Kandala, J. M. Chow, and J. M. Gambetta, "Supervised learning with quantum-enhanced feature spaces," Nature, vol. 567, no. 7747, pp. 209-212, Mar. 2019.
[7] J.-E. Park, B. Quanz, S. Wood, H. Higgins, and R. Harishankar, "Practical application improvement to quantum SVM: Theory to practice," 2020, arXiv:2012.07725.
[8] W. Zeng and B. Coecke, "Quantum algorithms for compositional natural language processing," 2016, arXiv:1608.01406.
[9] F. Z. Ruskanda, M. R. Abiwardani, M. A. Al Bari, K. A. Bagaspati, R. Mulyawan, I. Syafalni, and H. T. Larasati, "Quantum representation for sentiment classification," in Proc. IEEE Int. Conf. Quantum Comput. Eng. (QCE), Sep. 2022, pp. 67-78.
[10] D. Kartsaklis, I. Fan, R. Yeung, A. Pearson, R. Lorenz, A. Toumi, G. de Felice, K. Meichanetzidis, S. Clark, and B. Coecke, "Lambeq: An efficient high-level Python library for quantum NLP," 2021, arXiv:2110.04236.
[11] Y. Zhang, D. Song, P. Zhang, X. Li, and P. Wang, "A quantum-inspired sentiment representation model for Twitter sentiment analysis," Appl. Intell., vol. 49, no. 8, pp. 3093-3108, 2019.
[12] Y. Zhang, D. Song, X. Li, P. Zhang, P. Wang, L. Rong, G. Yu, and B. Wang, "A quantum-like multimodal network framework for modeling interaction dynamics in multiparty conversational sentiment analysis," Inf. Fusion, vol. 62, pp. 14-31, Oct. 2020.
[13] N. Joshi, P. Katyayan, and S. A. Ahmed, "Comparing classical ML models with quantum ML models with parametrized circuits for sentiment analysis task," J. Phys., Conf. Ser., vol. 1854, no. 1, Apr. 2021, Art. no. 012032.
[14] A. Liu, X. Deng, Z. Tong, Y. Luo, and B. Liu, "A simultaneous perturbation stochastic approximation enhanced teaching-learning based optimization," in Proc. IEEE Congr. Evol. Comput. (CEC), Jul. 2016, pp. 3186-3192.
[15] X. Bonet-Monroig, H. Wang, D. Vermetten, B. Senjean, C. Moussa, T. Bäck, V. Dunjko, and T. E. O'Brien, "Performance comparison of optimization methods on variational quantum algorithms," 2021, arXiv:2111.13454.
[16] C. Cortes and V. Vapnik, "Support-vector networks," Mach. Learn., vol. 20, no. 3, pp. 273-297, 1995.
[17] M. Schuld and N. Killoran, "Quantum machine learning in feature Hilbert spaces," Phys. Rev. Lett., vol. 122, no. 4, Feb. 2019, Art. no. 040504.
[18] M. Schuld, "Supervised quantum machine learning models are kernel methods," 2021, arXiv:2101.11020.
[19] S. Altares-López, A. Ribeiro, and J. J. García-Ripoll, "Automatic design of quantum feature maps," Quantum Sci. Technol., vol. 6, no. 4, Oct. 2021, Art. no. 045015.
[20] K. Meichanetzidis, A. Toumi, G. de Felice, and B. Coecke, "Grammar-aware sentence classification on quantum computers," 2020, arXiv:2012.03756.
[21] R. Lorenz, A. Pearson, K. Meichanetzidis, D. Kartsaklis, and B. Coecke, "QNLP in practice: Running compositional models of meaning on a quantum computer," 2021, arXiv:2102.12846.
[22] R. Yeung and D. Kartsaklis, "A CCG-based version of the DisCoCat framework," 2021, arXiv:2105.07720.
[23] M. F. Porter, "An algorithm for suffix stripping," Program, vol. 40, no. 3, pp. 211-218, Jul. 2006.
[24] B. Coecke, M. Sadrzadeh, and S. Clark, "Mathematical foundations for a compositional distributional model of meaning," 2010, arXiv:1003.4394.
[25] J. C. Spall, "An overview of the simultaneous perturbation method for efficient optimization," Johns Hopkins APL Tech. Dig., vol. 19, no. 4, pp. 482-492, 1998.
[26] I. T. Jolliffe, Principal Component Analysis. Aberdeen, U.K.: Univ. of Aberdeen, 2002.

FARISKA ZAKHRALATIVA RUSKANDA (Member, IEEE) received the B.S., M.S., and Ph.D. degrees from the School of Electrical Engineering and Informatics, Bandung Institute of Technology, Bandung, Indonesia. She is currently an Assistant Professor of natural language processing with the Informatics Research Group, Bandung Institute of Technology.

MUHAMMAD RIFAT ABIWARDANI received the B.Sc. degree (cum laude) in informatics from Institut Teknologi Bandung, Indonesia, in 2023. His current research interests include quantum machine learning and quantum NLP.

RAHMAT MULYAWAN (Member, IEEE) received the B.Eng. degree in electrical engineering from ITB, Indonesia, in 2008, and the M.Sc. degree in electrical engineering from TU Delft, The Netherlands, in 2011. He is currently a member of the Microelectronics Centre, ITB. His current research interests include intelligent signal processing, MIMO systems, and transceiver design for optical wireless communications.

INFALL SYAFALNI (Member, IEEE) received the B.Eng. degree in electrical engineering from Institut Teknologi Bandung (ITB), Bandung, Indonesia, in 2008, the M.Sc. degree in electronic engineering from the University of Science Malaysia (USM), Penang, Malaysia, in 2011, and the Dr.Eng. degree in engineering from the Kyushu Institute of Technology (KIT), Iizuka, Fukuoka, Japan, in 2014. From 2014 to 2015, he held a research position with KIT. From 2015 to 2018, he was an ASIC Engineer with the ASIC Development Group, Logic Research Company Ltd., Fukuoka. In 2019, he joined ITB, where he is currently an Assistant Professor with the School of Electrical Engineering and Informatics and a Researcher with the University Center of Excellence on Microelectronics. His current research interests include logic synthesis, logic design, VLSI design, efficient circuits, and algorithms.

HARASHTA TATIMMA LARASATI (Member, IEEE) received the B.S. and M.S. degrees in telecommunication engineering from Institut Teknologi Bandung (ITB), Bandung, Indonesia, in 2016 and 2017, respectively. She is currently pursuing the Ph.D. degree in computer engineering with Pusan National University, Busan, Republic of Korea. She is a Junior Lecturer with ITB. Her current research interests include quantum computing and cryptanalysis, quantum machine learning, AI security, and networking.
