Neurocomputing
Communicated by J. Wu

Machine learning using quantum convolutional neural networks (QCNNs) has demonstrated success in both quantum and classical data classification. In previous studies, QCNNs attained a higher classification accuracy than their classical counterparts under the same training conditions in the few-parameter regime. However, the general performance of large-scale quantum models is difficult to examine because of the limited size of the quantum circuits that can be reliably implemented in the near future. We propose transfer learning as an effective strategy for utilizing small QCNNs in the noisy intermediate-scale quantum era to the full extent. In the classical-to-quantum transfer learning framework, a QCNN can solve complex classification problems without requiring a large-scale quantum circuit by utilizing a pre-trained classical convolutional neural network (CNN). We perform numerical simulations of QCNN models with various sets of quantum convolution and pooling operations for MNIST data classification under transfer learning, in which a classical CNN is trained with Fashion-MNIST data. The results show that transfer learning from a classical to a quantum CNN performs considerably better than purely classical transfer learning models under similar training conditions.

Keywords: Quantum computing; Quantum machine learning; Quantum convolutional neural network; Transfer learning
∗ Corresponding author at: Department of Applied Statistics, Yonsei University, Seoul, Republic of Korea.
E-mail addresses: [email protected] (J. Huh), [email protected] (D.K. Park).

1. Introduction

Machine learning (ML) with a parameterized quantum circuit (PQC) is a promising approach for improving existing methods beyond classical capabilities [1–7]. This is a classical–quantum hybrid algorithm in which the cost function and its corresponding gradient are computed using quantum circuits [8,9] and the model parameters are updated classically. Such hybrid ML models are particularly advantageous when cost function minimization is difficult to perform classically [4,10,11]. These models optimize the quantum gate parameters under the given experimental setup, and hence can be robust to systematic errors. Furthermore, they are less prone to decoherence because iterative computation can be exploited to reduce the quantum circuit depth. Thus, the hybrid algorithm has the potential to achieve a quantum advantage in solving various problems in the noisy intermediate-scale quantum (NISQ)¹ era [12,13].

¹ NISQ refers to the regime of quantum computing in which the number of qubits that can be manipulated reliably is limited by noise, yet which holds the potential to surpass classical capabilities to a certain extent. As NISQ technology becomes increasingly accessible, the discovery of its real-world applications has become crucially important.

A critical challenge in utilizing PQCs for solving real-world problems is the barren plateau phenomenon in the optimization landscape, which makes training a quantum model that samples from the Haar measure difficult as the number of qubits increases [14]. One way to avoid the barren plateau is to adopt a hierarchical structure [15,16], in which the number of qubits decreases exponentially with quantum circuit depth, such as in quantum convolutional neural networks (QCNNs) [4]. The hierarchical structure is interesting from a theoretical perspective because of its close connection to tensor networks [15,17]. Moreover, the shallow depth of a QCNN, which grows logarithmically with the number of input qubits, makes it well suited for NISQ computing. In addition, an information-theoretic analysis shows that the QCNN architecture can help reduce the generalization error [18], which is one of the central goals of machine learning. All these factors motivate the application of QCNNs to machine learning. QCNNs have been shown to be useful for solving both quantum [4,19] and classical [20] problems despite their restricted structure with a shallow-depth quantum circuit. In Ref. [20], for binary classification with the MNIST [21] and Fashion-MNIST [22] datasets, the QCNN yielded higher classification accuracy than the classical convolutional neural network (CNN) when only 51 or fewer parameters were used to construct these models.
The best-known classical CNN-based classifiers for the same datasets typically employ millions of parameters. However, the size of the quantum circuits that can be implemented with current quantum devices is too small to incorporate such a large number of parameters. Therefore, two important issues remain. The first is to verify whether a QCNN can continue to outperform its classical counterpart as the number of trainable model parameters increases. The second is to utilize small QCNNs that can be realized in the near future to the full extent, so that a quantum advantage can be achieved in solving practical problems. The latter is the main focus of this work.

An ML problem for which the quantum advantage in the few-parameter regime can be exploited is transfer learning (TL) [23–26]. TL aims to utilize what has been learned in one setting to improve generalization in another setting that is independent of the former. TL can be applied to classical–quantum hybrid networks such that the parameters learned for a classical model are transferred to training a quantum model, or vice versa [27]. In the classical-to-quantum (C2Q) TL scheme, the number of qubits increases with the number of output nodes (or features) of the pre-trained classical neural network. This indicates that the transferred part of a classical neural network should have a small number of output nodes to find applications in the NISQ era. For example, using a pre-trained feedforward neural network with a large number of nodes throughout the layers would not be well suited for near-term hybrid TL. By contrast, building a TL model with a classical and a quantum CNN is viable because the number of features in the CNN progressively decreases via subsampling (i.e., pooling), and the QCNN has already exhibited an advantage with a small number of input qubits.

Motivated by the aforementioned observations, we propose a TL framework for classical-to-quantum convolutional neural networks (C2Q-CNNs). Unlike previous works, C2Q-CNN transfers knowledge from a pre-trained (source) classical CNN to a quantum CNN, thereby preserving the benefits of the quantum CNN. Our method avoids the need for classical data dimensionality reduction, commonly required in existing methods, because the classical CNN serves as the source for TL. Additionally, we introduce new ansatzes for both quantum pooling and convolutional operations, enriching the model selection in the QCNN. To evaluate the performance of C2Q-CNN, we conduct numerical simulations on the MNIST data classification task using PennyLane [28]. The classical CNN is pre-trained on the Fashion-MNIST dataset. The simulations assess the classification accuracy under different quantum convolution and pooling operations and compare C2Q-CNN with various classical-to-classical CNN (C2C-CNN) TL schemes. The results show that C2Q-CNN outperforms C2C-CNN with respect to classification accuracy under similar training conditions. Furthermore, the new quantum pooling operation developed in this work is more effective in demonstrating the quantum advantage.

The remainder of this paper is organized as follows. Section 2 reviews QCNNs and TL, and also introduces the generalization of the pooling operation of the QCNN. Section 3 explains the general framework for C2Q-CNN TL. The simulation results are presented in Section 4, where MNIST data classification is performed with a CNN pre-trained on Fashion-MNIST data and the performance of the C2Q-CNN models is compared with that of various C2C-CNN models. The conclusions and outlook are presented in Section 5.

2. Preliminaries

2.1. Quantum convolutional neural network

Quantum convolutional neural networks are parameterized quantum circuits with unique structures inspired by classical CNNs [4,20]. In general, QCNNs follow two basic principles of classical CNNs: translational invariance of convolutional operations and dimensionality reduction via pooling. However, QCNNs differ from classical CNNs in several aspects. First, the data are defined in a quantum Hilbert space, which grows exponentially with the number of qubits. Consequently, the quantum convolutional operation is not an inner product, as in the classical case, but a unitary transformation of a state vector, i.e., a linear map that transforms a vector to a vector, whereas a classical convolution operation is a linear map that transforms a vector to a scalar. Pooling in a QCNN traces out half of the qubits, similar to pooling in a CNN, which subsamples the feature space. Typically, the pooling layer includes parameterized two-qubit controlled-unitary gates, and the control qubits are traced out after the gate operations. Without loss of generality, we refer to the structure of a parameterized unitary operator for either convolution or pooling as an ansatz. The cost function of a model with given parameters is defined as the expectation value of some observable with respect to the final quantum state obtained after repeating the quantum convolutional and pooling operations. The QCNN is trained by updating the model parameters to minimize the cost function until a pre-determined convergence condition is met. The general concept of a QCNN is illustrated in Fig. 1(a), and an example of a circuit with eight input qubits is shown in Fig. 1(b). The depth of the QCNN circuit after repeating the convolution and pooling until one qubit remains is 𝑂(log 𝑁), where 𝑁 is the number of input qubits. This shallow depth allows the QCNN to perform well on quantum hardware that will be developed in the near future.

The quantum convolution and pooling operations can be parameterized in many ways. The convolution ansatzes evaluated in this study are illustrated in Fig. 2. Among them, circuits (b) to (j) are the nine ansatzes previously tested in Ref. [20]. These ansatzes are motivated by past studies. For instance, circuit (b) is a parameterized quantum circuit that was used to train a tree tensor network (TTN) [15]. The four-qubit parameterized quantum circuits analyzed by Sim et al. [29] were modified to two-qubit circuits to serve as building blocks for the convolution layer, resulting in circuits (c), (d), (e), (f), (h), and (i). Circuits (h) and (i) are two-qubit versions of the circuits with the best expressibility, while circuit (c) is a two-qubit version of the circuit with the best entangling capability. Circuits (d), (e), and (f) represent a good balance of expressibility and entangling capability. Circuit (g) is used as the entangler of the two-body variational quantum eigensolver [30] and can generate arbitrary 𝑆𝑂(4) gates [31], making it a suitable candidate for building the convolution layer in a QCNN. Circuit (j) is a parameterized arbitrary 𝑆𝑈(4) gate [19,32] capable of performing arbitrary two-qubit unitary operations. Because the convolutional operations act on two qubits, parameterized 𝑆𝑈(4) operations provide the most general ansatz. In this study, we introduce two new convolutional ansatzes, (a) and (k), to our benchmark. The former aims to study the classification capability of a QCNN when only pooling operations are trained. The latter is inspired by the generalized pooling operation described in the following paragraph, with an 𝑆𝑈(2) gate applied to a control qubit to split the subspaces into an arbitrary superposition.

The pooling ansatzes used in previous studies were simple single-qubit-controlled rotations followed by tracing out the control qubit. For example, in Ref. [20], a pooling operation of the following form was used:

$$\mathrm{Tr}_A\left[\left(|1\rangle\langle 1|_A \otimes R_z(\theta_1)_B + |0\rangle\langle 0|_A \otimes R_x(\theta_2)_B\right)\rho_{AB}\,U_p^{\dagger}\right], \qquad (1)$$

where $\mathrm{Tr}_A(\cdot)$ denotes the partial trace over subsystem $A$, $R_i(\theta)$ is the rotation around the $i$ axis of the Bloch sphere by an angle of $\theta$, $\theta_1$ and $\theta_2$ are the free parameters, $\rho_{AB}$ is the two-qubit state subject to pooling, and $U_p^{\dagger}$ is the conjugate transpose of the unitary gate for pooling, so that Eq. (1) equals $\mathrm{Tr}_A[U_p\,\rho_{AB}\,U_p^{\dagger}]$ with $U_p$ the controlled rotation written in the parentheses. The pooling operation in Eq. (1) is referred to as ZX pooling. In addition to ZX pooling, generalized pooling is introduced as

$$\mathrm{Tr}_A\left[\left(|1\rangle\langle 1|_A \otimes U(\theta_1,\phi_2,\lambda_3)_B + |0\rangle\langle 0|_A \otimes U(\theta_4,\phi_5,\lambda_6)_B\right)\rho_{AB}\,U_p^{\dagger}\right]. \qquad (2)$$

Here, $U(\theta,\phi,\lambda) = R_z(\phi)R_x(-\pi/2)R_z(\theta)R_x(\pi/2)R_z(\lambda)$, which can implement any unitary operator in $SU(2)$. Again, $U_p^{\dagger}$ is the conjugate transpose of the corresponding unitary gate for pooling. The unitary gates used in ZX pooling and generalized pooling are compared in Fig. 3.
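To make the layer structure and the pooling of Eqs. (1)–(2) concrete, the following is a minimal PennyLane sketch of a single convolution–pooling layer on four qubits. It is an illustration under assumed choices (a simple RY–CNOT convolution block and the generalized pooling), not the exact circuits of Fig. 2 or the implementation used in this work.

```python
# Minimal sketch of one QCNN convolution + pooling layer on four qubits (PennyLane).
# The convolution block and parameter counts below are illustrative assumptions.
import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=4)

def conv_ansatz(params, wires):
    """Toy two-qubit convolution ansatz: single-qubit rotations plus an entangler."""
    qml.RY(params[0], wires=wires[0])
    qml.RY(params[1], wires=wires[1])
    qml.CNOT(wires=wires)

def generalized_pooling(params, control, target):
    """Eq. (2): one SU(2) rotation on the target when the control is |1>, another
    when it is |0>.  Simply not using the control qubit afterwards plays the role
    of tracing it out."""
    qml.ctrl(qml.Rot, control=control)(params[0], params[1], params[2], wires=target)
    qml.PauliX(wires=control)  # flip so the |0> branch acts as the control
    qml.ctrl(qml.Rot, control=control)(params[3], params[4], params[5], wires=target)
    qml.PauliX(wires=control)

@qml.qnode(dev)
def qcnn_layer(features, conv_params, pool_params):
    qml.AmplitudeEmbedding(features, wires=range(4), normalize=True)
    # Convolution: the same two-qubit ansatz on nearest-neighbour pairs,
    # including the pair that closes the boundary (translational invariance).
    for pair in [(0, 1), (2, 3), (1, 2), (3, 0)]:
        conv_ansatz(conv_params, wires=pair)
    # Pooling: qubits 0 and 2 act as controls and are discarded afterwards.
    generalized_pooling(pool_params, control=0, target=1)
    generalized_pooling(pool_params, control=2, target=3)
    return qml.expval(qml.PauliZ(1)), qml.expval(qml.PauliZ(3))

# Example call: 16 amplitudes (4 qubits), 2 convolution and 6 pooling parameters.
out = qcnn_layer(np.random.rand(16), np.random.uniform(0, np.pi, 2),
                 np.random.uniform(0, np.pi, 6))
```

In a full QCNN, this layer is repeated on the surviving qubits until a single qubit remains, and the measured expectation value defines the cost function that is minimized classically.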
Fig. 1. (a) Schematics of the QCNN algorithm with (b) an example for eight input qubits. Given a quantum state, |𝜓⟩𝑑 , which encodes classical data, the quantum circuit comprises
two parts: convolutional filters (rectangles) and pooling (circles). The convolutional filter and pooling use parameterized quantum gates. Three layers of convolution–pooling pairs
are presented in this example. In each layer, the convolutional filter applies the identical two-qubit ansatz to the nearest neighbor qubits in a translationally invariant manner.
The quantum convolutional operations in the QCNN circuits are designed to meet the closed boundary condition, as indicated by the open-ended gates in the figure, ensuring the
top and bottom qubits in each layer are connected. Pooling operations within the layer are identical to each other, but differ from convolutional filters. The pooling operation is
represented as a controlled unitary transformation, and the half-filled circle on the control qubit indicates that different unitary gates can be applied to each subspace of the control
qubit. The measurement outcome of the quantum circuit is used to calculate the user-defined cost function. A classical computer is used to compute the new set of parameters
based on the gradient, and the quantum circuit parameters are updated for the subsequent round. The optimization process is iterated until pre-selected conditions are met.
Fig. 2. Parameterized quantum circuits used in the convolutional layer. The convolutional circuits from (b) to (j) are adapted from Ref. [20], whereas (a) and (k) are the new
convolutional circuits tested in this study. 𝑅𝑖 (𝜃) is the rotation around the 𝑖-axis of the Bloch sphere by an angle of 𝜃, and 𝐻 is the Hadamard gate. 𝑈 (𝜃, 𝜙, 𝜆) is an arbitrary
single-qubit gate, which can be expressed as 𝑈 (𝜃, 𝜙, 𝜆) = 𝑅𝑧 (𝜙)𝑅𝑥 (−𝜋∕2)𝑅𝑧 (𝜃)𝑅𝑥 (𝜋∕2)𝑅𝑧 (𝜆). 𝑈 (𝜃, 𝜙, 𝜆) can implement any unitary operation in 𝑆𝑈 (2). As (j) can express an arbitrary
two-qubit unitary gate, we test it without any parameterized gates for pooling in addition to ZX pooling and generalized pooling. For (k), we do not apply parameterized gates
for pooling. In these cases, pooling simply traces out the top qubit after convolution.
2.2. Transfer learning

Transferring the knowledge accumulated from one task to another is a typical intelligent behavior that human learners always experience. TL refers to the application of this concept in ML. Specifically, TL aims to improve the training of a new ML model by utilizing a reference (or source) ML model that is pre-trained for a different but related task with a different dataset [23–26]. Transfer learning encompasses three main categories: inductive transfer learning (ITL), transductive transfer learning (TTL), and unsupervised transfer learning (UTL) [24,33]. ITL applies when label information is available for the target domain, while TTL applies when label information is only available for the source domain.
Fig. 3. Parameterized quantum gates used in the pooling layer. The pooling circuit (a) is adapted from Ref. [20], and (b) is the generalized pooling method introduced in
this work. Generalized pooling applies two arbitrary single-qubit unitary gate rotations, 𝑈 (𝜃1 , 𝜙2 , 𝜆3 ) and 𝑈 (𝜃4 , 𝜙5 , 𝜆6 ), which are activated when the control qubit is 1 (filled
circle) or 0 (open circle), respectively. The control (first) qubit is traced out after the gate operations to reduce the dimensions. The single-qubit unitary gate is defined as
𝑈 (𝜃, 𝜙, 𝜆) = 𝑅𝑧 (𝜙)𝑅𝑥 (−𝜋∕2)𝑅𝑧 (𝜃)𝑅𝑥 (𝜋∕2)𝑅𝑧 (𝜆), and it can implement any unitary in 𝑆𝑈 (2). The thinner horizontal line (top qubit) indicates the qubit that is being traced out after
gate operations.
UTL, on the other hand, applies when label information is unavailable for both the source and target domains. In this study, we have chosen to focus exclusively on ITL to ensure simplicity and clarity in our explanations and demonstrations. Henceforth, when we refer to transfer learning, it pertains specifically to ITL. Detailed information regarding our numerical simulations will be presented later in the manuscript.

TL is known to be particularly useful for training a deep learning model that takes a long time owing to the large amount of data, especially if the features extracted in early layers are generic across various datasets. In such cases, starting from a pre-trained network such that only a portion of the model parameters is fine-tuned for a particular task can be more practical than training the entire network from scratch. For example, suppose that a neural network is trained with data 𝐴 to solve task 𝐴 and finds the set of parameters (i.e., weights and biases) 𝒘𝐴 ∈ ℝ^𝑁𝐴. To solve task 𝐵 given dataset 𝐵, the neural network is not trained from scratch, as this may require vast computational resources. Instead, the parameters associated with some of the earlier layers of the reference neural network are used as a set of fixed parameters for the new neural network that is subjected to solving task 𝐵 with data 𝐵. In other words, some elements of the parameters for this new learning problem, denoted by 𝒘𝐵 ∈ ℝ^𝑁𝐵, are identical to those of 𝒘𝐴. Hence, the number of parameters subject to new optimization is less than 𝑁𝐵. The successful application of TL can improve training performance by starting from a higher training accuracy, achieving a faster rate of accuracy improvement, and converging to a higher asymptotic training accuracy [34].

The aforementioned observations imply that TL is also beneficial when the amount of available data is insufficient or too small to build a good model. Because processing big data in the NISQ era will be challenging, working with small amounts of data through TL is a promising strategy for near-term quantum ML. The target ML model subjected to fresh training (i.e., fine-tuning) in TL typically has a much smaller number of parameters than the pre-trained model. This, together with the success of QCNNs in the few-parameter regime, motivates the development of classical-to-quantum CNN transfer learning.

3. Classical-to-quantum transfer learning

An extension of TL to quantum ML was proposed, and its general concept was formulated in Ref. [27]. Although the performances of the quantum models were not compared with those of their classical counterparts, three different scenarios of quantum TL, namely C2Q, quantum-to-classical, and quantum-to-quantum, were shown to be feasible. Among these three possible scenarios, we focus on C2Q TL as mentioned in Section 1, because we aim to utilize QCNNs in the few-parameter regime to the full extent. Sufficient reduction of the data dimensionality (i.e., the number of attributes or features) by classical learning would ensure that the size of a quantum circuit subject to training is sufficiently small for implementation with NISQ devices. The dimensionality reduction technique is also necessary to simplify expensive quantum state preparation routines to represent classical data in a quantum state [35–43].

C2Q TL has been utilized for image data classification [27,44] and spoken command recognition [45]. These works serve as proof of principle for the general idea and present interesting examples to motivate further investigations and benchmarks. The parameterized quantum circuits therein are vulnerable to the barren plateau problem, because they follow the basic structure of a fully connected feedforward neural network with the same number of input and output qubits. Moreover, these studies used classical neural networks to significantly reduce the number of data features to only four or eight. This means that most of the feature extraction is performed classically; hence, the necessity of the quantum part is unclear. These studies encode the reduced data onto a quantum circuit using simple single-qubit rotations, also known as qubit encoding [15,42], which makes the number of model parameters grow polynomially with the number of data features. In contrast, the number of model parameters in our ML algorithm scales logarithmically with the number of input qubits. Furthermore, all of these works use only one type of ansatz, based on repetitive applications of single-qubit rotations and controlled-NOT gates. Finally, the performance of C2Q TL was not compared with that of the C2C version in any of these studies. Because the pre-trained classical neural network performs a significant dimensionality reduction (and hence feature extraction), the absence of a direct comparison with C2C TL raises the question of whether the quantum model achieves any advantage over its classical counterparts.

In this study, we present a classical-to-quantum transfer learning framework with a QCNN. Our framework facilitates the transfer of knowledge from a pre-trained classical CNN to a quantum CNN, leveraging the unique advantages offered by QCNNs. The adoption of QCNNs as the target model holds crucial importance for several reasons. Firstly, QCNNs possess the capability to circumvent the barren plateau effect, a critical bottleneck encountered during the training of quantum neural networks. This property of QCNNs addresses a major challenge in quantum machine learning and enhances the training process. Furthermore, previous research has demonstrated the advantages of QCNNs over their classical counterparts, particularly in the few-parameter regime, along with their good generalization capabilities. As a result, fine-tuning a machine learning model using a QCNN is expected to yield enhanced classification performance compared to fine-tuning with a traditional CNN. To illustrate the practical implementation of our C2Q TL framework, we provide an example of transfer learning using C2Q-CNN, which serves as the basis for our benchmark studies. A schematic representation of this process can be found in Fig. 4, showcasing the application of our proposed framework.

The general model is flexible with respect to the choice of data encoding, which loads classical data features into a quantum state |𝜓⟩𝑑, and of ansatz, the quantum circuit model subject to training. We performed extensive benchmarking over the various ansatzes presented in Section 2.1 to classify MNIST data using a classical model pre-trained with Fashion-MNIST data. Finally, we compared the classification accuracies of C2Q and various C2C models. The C2Q models performed noticeably better than all C2C models tested in this study under similar training conditions. More details on the simulation and results are presented in the following section.
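As an illustration of the workflow in Fig. 4, the sketch below freezes the earlier layers of a Keras CNN and trains only a small PennyLane head on the extracted features. Everything in it is a placeholder under stated assumptions: the source CNN is built on the spot (in the actual workflow it would be pre-trained on Fashion-MNIST), the layer named "features", the toy circuit standing in for the QCNN, the random mini-batch, and the training loop are all illustrative rather than the benchmark configuration of this work.

```python
# Sketch of the C2Q-CNN transfer-learning workflow of Fig. 4 (illustrative only).
import numpy as np
import tensorflow as tf
import pennylane as qml
from pennylane import numpy as pnp

# 1) Source CNN whose earlier layers are transferred.  The "features" layer has
#    256 outputs, matching the 2^8 amplitudes of an 8-qubit register.
source_cnn = tf.keras.Sequential([
    tf.keras.layers.Conv2D(4, 3, activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation="relu", name="features"),
    tf.keras.layers.Dense(10, activation="softmax"),   # replaced by the quantum head
])
feature_extractor = tf.keras.Model(source_cnn.input,
                                   source_cnn.get_layer("features").output)
feature_extractor.trainable = False          # transferred layers stay frozen

# 2) Quantum head: amplitude-encode the 256 features into 8 qubits and apply a toy
#    hierarchy that halves the active qubits, standing in for convolution + pooling.
n_qubits = 8
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def qcnn_head(features, params):
    qml.AmplitudeEmbedding(features, wires=range(n_qubits), normalize=True)
    active, p = list(range(n_qubits)), 0
    while len(active) > 1:
        for a, b in zip(active[0::2], active[1::2]):
            qml.CNOT(wires=[a, b])
            qml.RY(params[p], wires=b)
            p += 1
        active = active[1::2]                # keep only the "pooled" qubits
    return qml.expval(qml.PauliZ(active[0]))

def cost(params, feats, labels):             # labels in {-1, +1}
    loss = 0.0
    for f, y in zip(feats, labels):
        loss = loss + (qcnn_head(f, params) - y) ** 2
    return loss / len(labels)

# 3) Only the 7 quantum parameters are optimized; the classical weights stay fixed.
images = np.random.rand(8, 28, 28, 1)        # stand-in for a mini-batch of MNIST digits
labels = np.array([-1, 1, -1, 1, -1, 1, -1, 1])
feats = feature_extractor.predict(images, verbose=0) + 1e-6   # avoid an all-zero vector
params = pnp.array(np.random.uniform(0, np.pi, 7), requires_grad=True)
opt = qml.GradientDescentOptimizer(stepsize=0.1)
for _ in range(10):
    params = opt.step(lambda p: cost(p, feats, labels), params)
```

Replacing the toy quantum head with one of the convolution and pooling ansatzes of Section 2.1 recovers the kind of C2Q-CNN model benchmarked in the following section.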
Fig. 4. An example of classical-to-quantum convolutional neural network transfer learning simulated in this work for benchmarking and comparison to purely classical models.
A source CNN is trained on the Fashion-MNIST dataset. Then, the transfer learning trains a QCNN for MNIST data classification by utilizing the earlier layers of the pre-trained
CNN for feature extraction. The source CNN contains convolution (conv.), pooling, dense and batch normalization (BN) layers.
Fig. 5. Summary of the classification results with PennyLane simulations (quantum part) and Keras (classical part). Each bar represents the classification test accuracy of C2Q
TL averaged over 10 instances given by the random initialization of parameters. The different bars along the 𝑥-axis indicate that the results are for different convolution ansatz,
labeled according to Fig. 2. The unfilled, filled, and hatched bars represent the results of ZX pooling, generalized pooling, and trivial pooling, respectively. The number of trainable
model parameters for each case is shown at the top of the 𝑥-axis. The horizontal lines represent the results of the C2C TL with 1D and 2D CNN architectures. The number of
trainable model parameters for each case is provided in the legend.
The 2 vs. 3 classification results are shown in Fig. 5(b). Most of the ZX pooling average accuracies were between 70% and 90%, and most of the generalized pooling average accuracies were between 85% and 90%. The test accuracies in (b) are lower than those in (a) because 2 vs. 3 image classification is more difficult than 0 vs. 1 image classification. All C2Q TL classification accuracies are higher than the C2C classification accuracies except for ZX pooling with convolutions 1 and 3, which use a much smaller number of parameters than the purely classical TL. The test accuracy of generalized pooling is greater than that of ZX pooling when the convolution is the same. In addition, the accuracy of trivial pooling with convolution ansatz 10 is higher than that with ansatz 11. These results are consistent with those obtained from the 0 vs. 1 classification, providing further evidence that an ansatz with more model parameters and improved expressibility is favored. This suggests that the use of more complex models, such as generalized pooling, is advantageous in solving classification problems, particularly in comparison to less expressive models such as ZX pooling.

The 8 vs. 9 classification results are shown in Fig. 5(c). Most of the ZX pooling average accuracies were between 70% and 90%, and most of the generalized pooling average accuracies were between 85% and 90%. The test accuracies in (c) are lower than those in (a) because 8 vs. 9 image classification is more difficult than 0 vs. 1 image classification, but the accuracies in (c) are similar to those in (b). All C2Q TL classification accuracies are higher than the C2C TL classification accuracies except for ZX pooling with convolutions 1, 2, and 3, which use a much smaller number of parameters than the purely classical TL. As before, trivial pooling with convolution ansatz 10 has higher accuracy than trivial pooling with ansatz 11. The accuracy of generalized pooling is consistently higher than that of ZX pooling when the underlying convolution is identical. However, there are instances where the performance of ZX pooling is comparable to that of generalized pooling, such as with the use of convolution ansatz 7. Nevertheless, a Welch's t-test analysis [49] revealed that these results are not statistically significant. Based on these findings, we can conclude that generalized pooling is a more favorable approach than ZX pooling for all of the classification problems tested.

The results in Fig. 5(a), (b), and (c) show a QCNN's tendency to perform better when the convolution circuits have a larger number of trainable parameters. However, simply increasing the number of trainable parameters does not always guarantee improved test accuracy, because the accuracy is affected by various conditions, such as statistical error and quantum gate arrangement. For example, ZX pooling with convolution ansatzes 5, 6, and 7 in (b) have the same number of trainable parameters, but their average accuracies are different. In the current study, overfitting was not observed, as the number of model parameters was much smaller than the number of data samples, but it is important to keep this issue in mind when designing and implementing larger models in the future.

In summary, generalized pooling mostly produces higher classification accuracy than ZX pooling with the same convolution circuit. This is as expected, since generalized pooling has more model parameters. Moreover, it can be reduced to ZX pooling under appropriate parameter selection. The accuracy of all ZX pooling, generalized pooling, and trivial pooling circuits tends to be higher when the convolution circuits have a larger number of gate parameters. Although C2Q models have fewer trainable parameters than C2C models, most C2Q models outperform C2C models.

To validate our findings, we conducted a Welch's t-test analysis [49] to determine the statistical significance of the improved classification results obtained by the C2Q models. Our results show that ZX pooling with convolution ansatz 9 and generalized pooling with convolution ansatzes 4, 5, 9, and 10 have a statistically significant quantum advantage over both 1D and 2D classical models for all classification problems, despite having a smaller number of model parameters. Further details on the statistical analysis can be found in Appendix C. These findings underscore the potential of quantum-enhanced machine learning models in solving complex classification tasks, even with limited model resources.

The underlying source of the quantum advantage in quantum computing remains an open question. However, it is speculated that the advantage is related to certain properties of quantum computing that have no classical equivalent. The first property is the ability of quantum measurements to discriminate non-orthogonal states, which enables quantum computers to capture subtle differences in data that are not captured by classical computers. The second property is the ability of quantum convolutional operations to create entanglement among all qubits through the use of two-qubit gates between nearest neighbors, which allows for the capture of non-local correlations. In addition, the ability of a quantum computer to store 𝑁-dimensional data in ⌈log2(𝑁)⌉ qubits, and the ability of the QCNN to classify 𝑀-qubit quantum states using only 𝑂(log(𝑀)) parameters, make it possible to construct an extremely compact machine learning model.

5. Conclusion

In this study, we proposed a classical-to-quantum CNN (C2Q-CNN), a transfer learning (TL) model that uses some layers of a pre-trained
CNN as a starting point for a quantum CNN (QCNN). The QCNN constitutes an extremely compact machine learning (ML) model because the number of trainable parameters grows logarithmically with the number of initial qubits [4], and it is promising because of the absence of barren plateaus [16] and its generalization capabilities [18]. Supervised learning with a QCNN has also demonstrated classification performance superior to that of its classical counterparts under similar training conditions for a number of canonical datasets [20]. C2Q-CNN TL provides an approach to utilize the advantages of the QCNN in the few-parameter regime to the full extent. Moreover, the proposed method is suitable for implementation on quantum hardware expected to be developed in the near future because it is robust to systematic errors and can be implemented with a shallow-depth quantum circuit. Therefore, C2Q-CNN TL is a strong candidate for practical applications of NISQ computing in ML with a quantum advantage.

To demonstrate the quantum advantage of C2Q-CNN, we conducted a comparative study between two classical-to-classical (C2C) transfer learning (TL) models and C2Q TL models. The C2C and C2Q TL models shared the same pre-trained CNN, with the C2C TL models having slightly more parameters than the C2Q TL models. The pre-training was performed on the Fashion-MNIST dataset for multinomial classification. Then the target model replaced the final dense layer of the source model and was trained for three independent binary classification tasks using the MNIST data. Our simulation results, obtained using PennyLane and Keras, revealed that the C2Q models consistently achieved higher classification accuracy compared to the C2C models, despite having fewer trainable parameters. These results highlight the potential of quantum-enhanced transfer learning in improving the performance of machine learning models. It is important to note that while our simulation utilized a source CNN specifically designed for this study, the C2Q-CNN TL framework is compatible with other existing CNNs, such as VGGNet [50], ResNet [51], and DenseNet [52]. This compatibility enhances the versatility of our approach, enabling researchers to leverage established CNN designs within the quantum-enhanced transfer learning paradigm.

The potential future research directions are as follows. First, the reason behind the quantum advantage demonstrated by C2Q-CNN remains unclear. Although rigorous analysis is lacking, we speculate that this advantage is related to the ability of a quantum measurement to discriminate non-orthogonal states, for which a classical analog does not exist. Moreover, verifying whether the quantum advantage continues to hold as the number of trainable parameters increases and for other datasets would be interesting. To increase the number of model parameters for a fixed number of features and input qubits, one may consider generalizing the QCNN model to utilize multiple channels, as in many classical CNN models. Note that, in the TL tested in our experiment, the final dense layer was replaced with a model subjected to fine-tuning, while the entire convolutional part was frozen. Testing various depths of frozen layers would be an interesting topic for future research. For example, freezing a smaller number of layers to use the features of an earlier layer of the convolutional stage can be beneficial when the new dataset is small and significantly different from the source. The focus of this study was on inductive TL, for which both the source and new datasets were labeled. Exploring the potential of leveraging quantum techniques in other TL scenarios, such as self-taught, unsupervised, and transductive TL [24], is a promising direction for future research. Furthermore, extending the C2Q TL approach to address other machine learning problems, such as semi-supervised learning [53,54] and one-class classification [55,56], poses an open challenge for future research.

CRediT authorship contribution statement

Juhyeon Kim: Performed simulations, Analyzed and discussed the results, Contributed to the writing of manuscript. Joonsuk Huh: Analyzed and discussed the results, Contributed to the writing of manuscript. Daniel K. Park: Conceived the main framework, Designed the experiments, Analyzed and discussed the results, Contributed to the writing of manuscript.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

Acknowledgments

This research was supported by the Yonsei University Research Fund of 2022 (2022-22-0124), the National Research Foundation of Korea, South Korea (Grant Nos. 2021M3H3A1038085, 2019M3E4A1079666, 2022M3E4A1074591, 2022M3H3A1063074, and 2021M3E4A1038308), and the KIST Institutional Program (2E32241-23-010).

Appendix A. Encoding classical data to a quantum state

The first step in applying quantum ML to a classical dataset is to transform the classical data into a quantum state. Without loss of generality, we consider the classical data given as an 𝑁-dimensional real vector 𝑥⃗ ∈ ℝ^𝑁. Several encoding methods exist to achieve this, such as algorithms that require a quantum circuit with 𝑂(𝑁) width and 𝑂(1) depth and algorithms that require 𝑂(log(𝑁)) width and 𝑂(poly(𝑁)) depth [40–43]. Among the various encoding methods explored previously [20], we observed that amplitude encoding performs best in most cases.

A.1. Amplitude encoding

Amplitude encoding encodes classical data into the probability amplitudes of the computational basis states. It transforms classical data 𝑥 = (𝑥1, …, 𝑥𝑁)^⊤ of dimension 𝑁 = 2^𝑛 into an 𝑛-qubit quantum state |𝜓(𝑥)⟩ as follows:

$$U(x):\; x \in \mathbb{R}^{N} \mapsto |\psi(x)\rangle = \frac{1}{\lVert x \rVert}\sum_{i=1}^{N} x_i\,|i\rangle, \qquad (A.1)$$

where |𝑖⟩ denotes the 𝑖th computational basis state. Amplitude encoding represents the data with only 𝑂(log(𝑁)) qubits. However, the quantum circuit depth of amplitude encoding typically increases as 𝑂(poly(𝑁)).

A.2. Qubit encoding

Qubit encoding uses a constant quantum circuit depth while using 𝑁 qubits. Qubit encoding rescales each classical feature 𝑥𝑖 to lie between 0 and 𝜋, and then encodes 𝑥𝑖 into a single qubit as |𝜓(𝑥𝑖)⟩ = cos(𝑥𝑖/2)|0⟩ + sin(𝑥𝑖/2)|1⟩ for 𝑖 = 1, …, 𝑁. Therefore, qubit encoding transforms 𝑥 = (𝑥1, …, 𝑥𝑁)^⊤ into 𝑁 qubits as

$$U(x):\; x \in \mathbb{R}^{N} \mapsto |\psi(x)\rangle = \bigotimes_{i=1}^{N}\left(\cos\frac{x_i}{2}\,|0\rangle + \sin\frac{x_i}{2}\,|1\rangle\right), \qquad (A.2)$$

where 𝑥𝑖 ∈ [0, 𝜋) for all 𝑖. This unitary operator 𝑈(𝑥) can be expressed as the tensor product of single-qubit unitary operators, $U(x) = \bigotimes_{j=1}^{N} U_{x_j}$, where

$$U_{x_j} = e^{-i\frac{x_j}{2}\sigma_y} = \begin{bmatrix}\cos\frac{x_j}{2} & -\sin\frac{x_j}{2}\\ \sin\frac{x_j}{2} & \cos\frac{x_j}{2}\end{bmatrix}. \qquad (A.3)$$
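A minimal sketch of the two encodings of Eqs. (A.1)–(A.3), using PennyLane's built-in embedding templates, is shown below; it is an illustration, not the implementation used in this work, and the rescaling step for qubit encoding is one simple choice among several.

```python
# Sketch of the two data encodings of Appendix A using PennyLane templates.
import numpy as np
import pennylane as qml

x = np.random.rand(8)                          # classical data, N = 8

# Amplitude encoding (Eq. (A.1)): N features -> log2(N) = 3 qubits.
dev_amp = qml.device("default.qubit", wires=3)

@qml.qnode(dev_amp)
def amplitude_encoded(x):
    qml.AmplitudeEmbedding(x, wires=range(3), normalize=True)   # |psi> = x / ||x||
    return qml.state()

# Qubit (angle) encoding (Eqs. (A.2)-(A.3)): N features -> N qubits, constant depth.
dev_ang = qml.device("default.qubit", wires=8)

@qml.qnode(dev_ang)
def qubit_encoded(x_scaled):
    qml.AngleEmbedding(x_scaled, wires=range(8), rotation="Y")  # R_y(x_i) per qubit
    return qml.state()

x_scaled = np.pi * (x - x.min()) / (x.max() - x.min() + 1e-12)  # map features to [0, pi)
print(amplitude_encoded(x).shape)   # (8,)   -> 2^3 amplitudes
print(qubit_encoded(x_scaled).shape)  # (256,) -> 2^8 amplitudes
```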
Table C.1
ZX pooling with the 0 vs. 1 classification 𝑝-value results.
Convolution circuit 1 2 3 4 5 6 7 8 9 10
1D 64 𝑝-value 0.0312 0.1289 0.7327 0.0793 0.1795 0.2957 0.0316 0.0240 0.0166 0.0422
2D 76 𝑝-value 0.0047 0.8761 0.0323 0.5240 0.9725 0.5402 0.1348 0.0888 0.0353 0.2441
Table C.2
ZX pooling with the 2 vs. 3 classification 𝑝-value results.
Convolution circuit 1 2 3 4 5 6 7 8 9 10
1D 64 𝑝-value 0.0768 0.9352 0.9927 0.0497 0.2332 0.1700 0.0304 0.3209 0.0446 0.0119
2D 76 𝑝-value 0.0289 0.6896 0.7954 0.0047 0.0579 0.0375 0.0023 0.0985 0.0040 0.0006
Table C.3
ZX pooling with the 8 vs. 9 classification 𝑝-value results.
Convolution circuit 1 2 3 4 5 6 7 8 9 10
1D 64 𝑝-value 0.0008 0.2373 0.4708 0.0035 0.0024 0.0012 0.0007 0.0010 0.0004 0.0002
2D 76 𝑝-value 0.0000 0.2806 0.0495 0.1591 0.0904 0.0204 0.0117 0.0085 0.0015 0.0002
Table C.4
Generalized pooling with the 0 vs. 1 classification 𝑝-value results.
Convolution circuit 1 2 3 4 5 6 7 8 9 10
1D 64 𝑝-value 0.1079 0.0215 0.0302 0.0169 0.0139 0.0297 0.0217 0.0206 0.0151 0.0103
2D 76 𝑝-value 0.7661 0.0535 0.1171 0.0308 0.0197 0.1026 0.0598 0.0609 0.0268 0.0107
Table C.5
Generalized pooling with the 2 vs. 3 classification 𝑝-value results.
Convolution circuit 1 2 3 4 5 6 7 8 9 10
1D 64 𝑝-value 0.0921 0.0290 0.0389 0.0106 0.0099 0.0138 0.0109 0.0074 0.0068 0.0048
2D 76 𝑝-value 0.0129 0.0021 0.0034 0.0006 0.0006 0.0008 0.0006 0.0004 0.0003 0.0002
Table C.6
Generalized pooling with the 8 vs. 9 classification 𝑝-value results.
Convolution circuit 1 2 3 4 5 6 7 8 9 10
1D 64 𝑝-value 0.0080 0.0079 0.0013 0.0009 0.0012 0.0010 0.0016 0.0004 0.0007 0.0002
2D 76 𝑝-value 0.4616 0.4574 0.0202 0.0085 0.0261 0.0062 0.0279 0.0010 0.0038 0.0002
B.1. 1D CNN

Fig. B.6. 1D CNN model.

The structure of the 1D CNN model is illustrated in Fig. B.6. The CNN takes the 256 features produced by the source (pre-trained) CNN as input and passes them on to 1D convolution and max pooling layers. The output feature size is reduced to 28 at the end of the max pooling layer. Finally, a dense layer is applied to reduce the size of the output features to two for binary classification. The total number of trainable parameters is 64.

B.2. 2D CNN

The structure of the 2D CNN model is shown in Fig. B.7. The CNN takes 16 × 16 data, reshaped from the 256 features produced by the source (pre-trained) CNN, as input. This two-dimensional data is passed through a 2D convolution and max pooling layer twice, and the output features are reduced to eight. Finally, a dense layer is applied to reduce the size of the output features to two for binary classification. The total number of trainable parameters is 76.

Appendix C. Welch's t-test

Welch's t-test [49] is a widely utilized method for assessing the equality of means between two populations with unequal variances. In our study, Welch's t-test was implemented using SciPy [57] to obtain 𝑝-values between the C2Q TL model and the C2C TL models. In accordance with standard statistical practice [58,59], a 𝑝-value less than 𝛼, where 𝛼 is typically set to 0.05, is considered to indicate statistical significance.
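A minimal SciPy sketch of the test used here follows; the accuracy values are placeholders, not the values underlying Tables C.1–C.7.

```python
# Welch's t-test between two sets of test accuracies (e.g., 10 random
# initializations each), as in Appendix C.  The numbers are placeholders only.
from scipy import stats

c2q_accuracies = [0.91, 0.93, 0.90, 0.92, 0.94, 0.91, 0.92, 0.93, 0.90, 0.92]
c2c_accuracies = [0.87, 0.88, 0.86, 0.89, 0.87, 0.88, 0.86, 0.87, 0.88, 0.86]

# equal_var=False selects Welch's version, which does not assume equal variances.
t_stat, p_value = stats.ttest_ind(c2q_accuracies, c2c_accuracies, equal_var=False)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")  # p < 0.05 -> statistically significant
```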
Table C.7
Trivial pooling 𝑝-value results.
Classification 0 vs 1 2 vs 3 8 vs 9
Convolution 10 11 10 11 10 11
1D 64 𝑝-value 0.0340 0.9678 0.0602 0.1097 0.0002 0.0005
2D 76 𝑝-value 0.1305 0.1192 0.0065 0.0173 0.0003 0.0048
Based on the obtained 𝑝-values, we conclude that if a C2Q TL model demonstrates statistical significance compared to the C2C TL models and achieves a higher accuracy, it can be inferred to possess a meaningful advantage (quantum advantage). The 𝑝-value results were organized into groups based on the ZX pooling, generalized pooling, and trivial pooling approaches, as previously discussed in the paper. For each pooling type, the 𝑝-value results were further grouped by the type of learning problem, which in this case was binary classification for 0 and 1, 2 and 3, and 8 and 9. The results of ZX pooling are presented in Tables C.1, C.2, and C.3 for classifying between 0 and 1, 2 and 3, and 8 and 9, respectively. The results of generalized pooling are presented in Tables C.4, C.5, and C.6 in the same order. Finally, the results of trivial pooling are listed in Table C.7. We highlighted in bold any statistically significant 𝑝-values where the corresponding C2Q model exhibited higher accuracy than the C2C model.

References

[1] Jonathan Romero, Jonathan P. Olson, Alan Aspuru-Guzik, Quantum autoencoders for efficient compression of quantum data, Quantum Sci. Technol. 2 (4) (2017) 045001.
[2] K. Mitarai, M. Negoro, M. Kitagawa, K. Fujii, Quantum circuit learning, Phys. Rev. A 98 (2018) 032309.
[3] Marcello Benedetti, Erika Lloyd, Stefan Sack, Mattia Fiorentini, Parameterized quantum circuits as machine learning models, Quantum Sci. Technol. 4 (4) (2019) 043001.
[4] Iris Cong, Soonwon Choi, Mikhail D. Lukin, Quantum convolutional neural networks, Nat. Phys. 15 (12) (2019) 1273–1278.
[5] M. Cerezo, Andrew Arrasmith, Ryan Babbush, Simon C. Benjamin, Suguru Endo, Keisuke Fujii, Jarrod R. McClean, Kosuke Mitarai, Xiao Yuan, Lukasz Cincio, Patrick J. Coles, Variational quantum algorithms, Nat. Rev. Phys. 3 (9) (2021) 625–644.
[6] S. Mangini, F. Tacchino, D. Gerace, D. Bajoni, C. Macchiavello, Quantum computing models for artificial neural networks, Europhys. Lett. 134 (1) (2021) 10002.
[7] Yuxuan Du, Min-Hsiu Hsieh, Tongliang Liu, Shan You, Dacheng Tao, Learnability of quantum neural networks, PRX Quantum 2 (2021) 040337.
[8] Jun Li, Xiaodong Yang, Xinhua Peng, Chang-Pu Sun, Hybrid quantum–classical approach to quantum optimal control, Phys. Rev. Lett. 118 (2017) 150503.
[9] Maria Schuld, Ville Bergholm, Christian Gogolin, Josh Izaac, Nathan Killoran, Evaluating analytic gradients on quantum hardware, Phys. Rev. A 99 (2019) 032331.
[10] Alberto Peruzzo, Jarrod McClean, Peter Shadbolt, Man-Hong Yung, Xiao-Qi Zhou, Peter J. Love, Alán Aspuru-Guzik, Jeremy L. O'Brien, A variational eigenvalue solver on a photonic quantum processor, Nature Commun. 5 (1) (2014) 4213.
[11] Jarrod R. McClean, Jonathan Romero, Ryan Babbush, Alán Aspuru-Guzik, The theory of variational hybrid quantum–classical algorithms, New J. Phys. 18 (2) (2016) 023023.
[12] John Preskill, Quantum computing in the NISQ era and beyond, Quantum 2 (2018) 79.
[13] Kishor Bharti, Alba Cervera-Lierta, Thi Ha Kyaw, Tobias Haug, Sumner Alperin-Lea, Abhinav Anand, Matthias Degroote, Hermanni Heimonen, Jakob S. Kottmann, Tim Menke, Wai-Keong Mok, Sukin Sim, Leong-Chuan Kwek, Alán Aspuru-Guzik, Noisy intermediate-scale quantum algorithms, Rev. Modern Phys. 94 (2022) 015004.
[14] Jarrod R. McClean, Sergio Boixo, Vadim N. Smelyanskiy, Ryan Babbush, Hartmut Neven, Barren plateaus in quantum neural network training landscapes, Nature Commun. 9 (1) (2018) 4812.
[15] Edward Grant, Marcello Benedetti, Shuxiang Cao, Andrew Hallam, Joshua Lockhart, Vid Stojevic, Andrew G. Green, Simone Severini, Hierarchical quantum classifiers, npj Quantum Inf. 4 (1) (2018) 65.
[16] Arthur Pesah, M. Cerezo, Samson Wang, Tyler Volkoff, Andrew T. Sornborger, Patrick J. Coles, Absence of barren plateaus in quantum convolutional neural networks, Phys. Rev. X 11 (2021) 041011.
[17] Rui Huang, Xiaoqing Tan, Qingshan Xu, Variational quantum tensor networks classifiers, Neurocomputing 452 (2021) 89–98.
[18] Leonardo Banchi, Jason Pereira, Stefano Pirandola, Generalization in quantum machine learning: A quantum information standpoint, PRX Quantum 2 (2021) 040321.
[19] Ian MacCormack, Conor Delaney, Alexey Galda, Nidhi Aggarwal, Prineha Narang, Branching quantum convolutional neural networks, Phys. Rev. Res. 4 (2022) 013117.
[20] Tak Hur, Leeseok Kim, Daniel K. Park, Quantum convolutional neural network for classical data classification, Quantum Mach. Intell. 4 (1) (2022) 3.
[21] Y. Lecun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to document recognition, Proc. IEEE 86 (11) (1998) 2278–2324.
[22] Han Xiao, Kashif Rasul, Roland Vollgraf, Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms, 2017.
[23] Stevo Bozinovski, Reminder of the first paper on transfer learning in neural networks, 1976, Informatica (Slovenia) 44 (2020).
[24] Sinno Jialin Pan, Qiang Yang, A survey on transfer learning, IEEE Trans. Knowl. Data Eng. 22 (10) (2010) 1345–1359.
[25] Chuanqi Tan, Fuchun Sun, Tao Kong, Wenchang Zhang, Chao Yang, Chunfang Liu, A survey on deep transfer learning, in: Věra Kůrková, Yannis Manolopoulos, Barbara Hammer, Lazaros Iliadis, Ilias Maglogiannis (Eds.), Artificial Neural Networks and Machine Learning, ICANN 2018, Springer International Publishing, Cham, 2018, pp. 270–279.
[26] Ian Goodfellow, Yoshua Bengio, Aaron Courville, Deep Learning, MIT Press, 2016, http://www.deeplearningbook.org.
[27] Andrea Mari, Thomas R. Bromley, Josh Izaac, Maria Schuld, Nathan Killoran, Transfer learning in hybrid classical-quantum neural networks, Quantum 4 (2020) 340.
[28] Ville Bergholm, Josh Izaac, Maria Schuld, Christian Gogolin, M. Sohaib Alam, Shahnawaz Ahmed, Juan Miguel Arrazola, Carsten Blank, Alain Delgado, Soran Jahangiri, Keri McKiernan, Johannes Jakob Meyer, Zeyue Niu, Antal Száva, Nathan Killoran, PennyLane: Automatic differentiation of hybrid quantum–classical computations, 2020.
[29] Sukin Sim, Peter D. Johnson, Alán Aspuru-Guzik, Expressibility and entangling capability of parameterized quantum circuits for hybrid quantum-classical algorithms, Adv. Quantum Technol. 2 (12) (2019) 1900070.
[30] Robert M. Parrish, Edward G. Hohenstein, Peter L. McMahon, Todd J. Martínez, Quantum computation of electronic transitions using a variational quantum eigensolver, Phys. Rev. Lett. 122 (23) (2019) 230401.
[31] Hai-Rui Wei, Yao-Min Di, Decomposition of orthogonal matrix and synthesis of two-qubit and three-qubit orthogonal gates, Quantum Inf. Comput. 12 (3–4) (2012) 262–270.
[32] Farrokh Vatan, Colin Williams, Optimal quantum circuits for general two-qubit gates, Phys. Rev. A 69 (3) (2004) 032315.
[33] Fuzhen Zhuang, Zhiyuan Qi, Keyu Duan, Dongbo Xi, Yongchun Zhu, Hengshu Zhu, Hui Xiong, Qing He, A comprehensive survey on transfer learning, Proc. IEEE 109 (1) (2021) 43–76.
[34] Emilio Soria Olivas, Jose David Martin Guerrero, Marcelino Martinez Sober, Handbook of Research on Machine Learning Applications and Trends: Algorithms, Methods and Techniques - 2 Volumes, Information Science Reference - Imprint of: IGI Publishing, Hershey, PA, 2009.
[35] Gui-Lu Long, Yang Sun, Efficient scheme for initializing a quantum register with an arbitrary superposed state, Phys. Rev. A 64 (2001) 014303.
[36] Andrei N. Soklakov, Rüdiger Schack, Efficient state preparation for a register of quantum bits, Phys. Rev. A 73 (2006) 012307.
[37] Michele Mosca, Phillip Kaye, Quantum networks for generating arbitrary quantum states, in: Optical Fiber Communication Conference and International Conference on Quantum Information, Optical Society of America, 2001, p. PB28.
[38] Martin Plesch, Časlav Brukner, Quantum-state preparation with universal gate decompositions, Phys. Rev. A 83 (2011) 032302.
[39] Mikko Möttönen, Juha J. Vartiainen, Ville Bergholm, Martti M. Salomaa, Transformation of quantum states using uniformly controlled rotations, Quantum Inf. Comput. 5 (6) (2005) 467–473.
[40] Israel F. Araujo, Daniel K. Park, Francesco Petruccione, Adenilton J. da Silva, A divide-and-conquer algorithm for quantum state preparation, Sci. Rep. 11 (1) (2021) 6329.
[41] T.M.L. Veras, I.C.S. De Araujo, K.D. Park, A.J. da Silva, Circuit-based quantum random access memory for classical data with continuous amplitudes, IEEE Trans. Comput. (2020) 1.
[42] Ryan LaRose, Brian Coyle, Robust data encodings for quantum classifiers, Phys. Rev. A 102 (2020) 032420.
[43] Israel F. Araujo, Daniel K. Park, Teresa B. Ludermir, Wilson R. Oliveira, Francesco Petruccione, Adenilton J. da Silva, Configurable sublinear circuits for quantum state preparation, 2021, arXiv preprint arXiv:2108.10182.
[44] Harshit Mogalapalli, Mahesh Abburi, B. Nithya, Surya Kiran Vamsi Bandreddi, Classical–Quantum transfer learning for image classification, SN Comput. Sci. 3 (1) (2022) 20.
[45] Jun Qi, Javier Tejedor, Classical-to-quantum transfer learning for spoken command recognition based on quantum neural networks, in: ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP, 2022, pp. 8627–8631.
[46] François Chollet, et al., Keras, 2015, https://keras.io.
[47] Vojtech Havlícek, Antonio D. Córcoles, Kristan Temme, Aram W. Harrow, Abhinav Kandala, Jerry M. Chow, Jay M. Gambetta, Supervised learning with quantum-enhanced feature spaces, Nature 567 (7747) (2019) 209–212.
[48] Diederik P. Kingma, Jimmy Ba, Adam: A method for stochastic optimization, 2014, arXiv preprint arXiv:1412.6980.
[49] B.L. Welch, The generalization of 'student's' problem when several different population variances are involved, Biometrika 34 (1–2) (1947) 28–35.
[50] Karen Simonyan, Andrew Zisserman, Very deep convolutional networks for large-scale image recognition, 2014, arXiv preprint arXiv:1409.1556.
[51] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, Deep residual learning for image recognition, in: 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2016, pp. 770–778.
[52] Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger, Densely connected convolutional networks, in: 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2017, pp. 2261–2269.
[53] Oliver Chapelle, Bernhard Schölkopf, Alexander Zien, Semi-Supervised Learning, MIT Press, Cambridge, 2006.
[54] Wandong Zhang, Q.M. Jonathan Wu, Yimin Yang, Semisupervised manifold regularization via a subnetwork-based representation learning model, IEEE Trans. Cybern. PP (99) (2022) 1–14.
[55] Lukas Ruff, Robert Vandermeulen, Nico Goernitz, Lucas Deecke, Shoaib Ahmed Siddiqui, Alexander Binder, Emmanuel Müller, Marius Kloft, Deep one-class classification, in: International Conference on Machine Learning, PMLR, 2018, pp. 4393–4402.
[56] Wandong Zhang, Q.M. Jonathan Wu, W.G. Will Zhao, Haojin Deng, Yimin Yang, Hierarchical one-class model with subnetwork for representation learning and outlier detection, IEEE Trans. Cybern. PP (99) (2022) 1–14.
[57] Pauli Virtanen, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, Evgeni Burovski, Pearu Peterson, Warren Weckesser, Jonathan Bright, Stéfan J. van der Walt, Matthew Brett, Joshua Wilson, K. Jarrod Millman, Nikolay Mayorov, Andrew R.J. Nelson, Eric Jones, Robert Kern, Eric Larson, C.J. Carey, İlhan Polat, Yu Feng, Eric W. Moore, Jake VanderPlas, Denis Laxalde, Josef Perktold, Robert Cimrman, Ian Henriksen, E.A. Quintero, Charles R. Harris, Anne M. Archibald, Antônio H. Ribeiro, Fabian Pedregosa, Paul van Mulbregt, SciPy 1.0 Contributors, SciPy 1.0: Fundamental algorithms for scientific computing in Python, Nature Methods 17 (2020) 261–272.
[58] Valen E. Johnson, Revised standards for statistical evidence, Proc. Natl. Acad. Sci. 110 (48) (2013) 19313–19317.
[59] Martin Krzywinski, Naomi Altman, Significance, P values and t-tests, Nat. Methods 10 (11) (2013) 1041–1042.

Juhyeon Kim received the B.S. degree in physics from Sungkyunkwan University (SKKU) in 2021. He is a master's student at the Advanced Institute of Nano Technology at SKKU. His research interests include quantum algorithms, machine learning, and bioinformatics, especially quantum algorithms for machine learning.

Joonsuk Huh is an associate professor in the chemistry department of Sungkyunkwan University (SKKU), and he is visiting Xanadu Quantum Technologies Inc. for his sabbatical year of 2023. He received his Ph.D. in theoretical physics in 2011 from the University of Frankfurt, where he developed computational methods for the classical simulation of vibronic spectra. He then joined Harvard University as a postdoc in 2011. In 2015, Huh and his co-workers linked molecular vibronic spectra to quantum sampling. His M-Qudit lab at SKKU develops quantum algorithms for chemistry, bioinformatics, machine learning, and matrix functions.

Daniel K. Park is an Assistant Professor at Yonsei University in Korea, where he works on quantum information processing and machine learning. Before joining Yonsei University in 2022, Daniel was a research professor at Sungkyunkwan University for approximately 1 year and at KAIST for approximately 2 years, where he also worked as a post-doctoral researcher for 3 years. Daniel obtained his Ph.D. degree in Physics-Quantum Information in 2015 at the University of Waterloo and the Institute for Quantum Computing.