Computer Networks 254 (2024) 110672

Contents lists available at ScienceDirect

Computer Networks
journal homepage: www.elsevier.com/locate/comnet

Distributed quantum computing: A survey


Marcello Caleffi a,b ,∗, Michele Amoretti c , Davide Ferrari c , Jessica Illiano a , Antonio Manzalini d ,
Angela Sara Cacciapuoti a,b
a FLY: Future Communications Laboratory, Department of Electrical Engineering and Information Technology (DIETI), University of Naples Federico II, Naples, 80125, Italy^1
b Laboratorio Nazionale di Comunicazioni Multimediali, National Inter-University Consortium for Telecommunications (CNIT), Naples, 80126, Italy
c QSLab: Quantum Software Laboratory, Department of Engineering and Architecture (DIA), University of Parma, Parma, 43124, Italy^2
d TIM, Turin, 10148, Italy

ARTICLE INFO

Keywords: Quantum internet; Quantum networks; Quantum communications; Quantum computing; Quantum computation; Distributed quantum computing; Quantum algorithms; Quantum compiler; Quantum compiling; Simulator

ABSTRACT

Nowadays, quantum computing has reached the engineering phase, with fully-functional quantum processors integrating hundreds of noisy qubits. Yet – to fully unveil the potential of quantum computing out of the labs into the business reality – the challenge ahead is to substantially scale the qubit number, reaching orders of magnitude exceeding thousands of fault-tolerant qubits. To this aim, the distributed quantum computing paradigm is recognized as the key solution for scaling the number of qubits. Indeed, according to such a paradigm, multiple small-to-moderate-scale quantum processors communicate and cooperate for executing computational tasks exceeding the computational power of single processing devices. The aim of this survey is to provide the reader with an overview of the main challenges and open problems arising with distributed quantum computing from a computer and communications engineering perspective. Furthermore, this survey provides easy access and a guide to the relevant literature and the prominent results in the field.

∗ Corresponding author at: FLY: Future Communications Laboratory, Department of Electrical Engineering and Information Technology (DIETI), University of Naples Federico II, Naples, 80125, Italy.
E-mail addresses: [email protected] (M. Caleffi), [email protected] (M. Amoretti), [email protected] (D. Ferrari), [email protected] (J. Illiano), [email protected] (A. Manzalini), [email protected] (A.S. Cacciapuoti).
1 Web: www.quantuminternet.it
2 Web: www.qis.unipr.it/quantumsoftware

https://doi.org/10.1016/j.comnet.2024.110672
Received 14 May 2024; Received in revised form 28 June 2024; Accepted 22 July 2024
Available online 8 August 2024
1389-1286/© 2024 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

Quantum computing has finally reached the engineering phase, with fully-functional quantum processors integrating hundreds of noisy qubits [1,2]. It has the potential to completely change markets and industries, since a quantum computer can, in principle, tackle classes of problems that choke classical machines [3,4].

However, to fully unlock the potential of quantum computing, thousands of fault-tolerant interconnected qubits are required [1]. Quantum technologies are still far away from this ambitious goal, since there still exist hard technological limitations on the number of qubits that can be embedded in a single quantum chip [5]. Indeed, we are in the age of noisy intermediate-scale quantum (NISQ) processors [6–8].

In this context, the consensus of both academic and industry communities for realizing large-scale quantum processors is to adopt the distributed quantum computing (DQC) paradigm, which relies on a quantum network infrastructure for clustering together modular and small quantum chips in order to scale the number of qubits [3,4,9–11]. Indeed, according to the DQC paradigm, individual quantum processors, limited in the number of qubits, work together to solve computational tasks exceeding the computational power of single processing devices [3,5,12–16]. And, differently from distributed classical computing, a linear increase in the number of interconnected quantum processors unlocks an exponential increase of the quantum computing power [3,4,12,16].

DQC architectures are expected to be realized, in the very near future, in the form of local quantum server farms [4,10] whereas, on a longer time horizon, geographically-distributed server farms are envisioned to be interconnected around the globe [10,16]. Indeed, Rigetti has already developed high-fidelity, low-latency quantum interconnects between modules, providing technological foundations for modular quantum computers [17]. IBM plans to introduce in 2025 Kookaburra – a 1386-qubit multi-chip processor with communication link support for quantum parallelization – with three Kookaburra chips inter-connected into a 4158-qubit system [18]. Metropolitan-area and wide-area quantum networks are also under research and development [19–23], which would enable DQC among geographically-distributed quantum farms and/or devices.

Fig. 1. The Distributed Quantum Computing ecosystem, represented by highlighting the four different pillars overviewed within the survey: Algorithms, Compiling, Networking and Simulation.
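As a rough illustration of the scaling claim made in the Introduction (linear growth in the number of interconnected processors, exponential growth in computing power), the following worked equation may help. It is only a sketch under one assumption: that "computing power" is read as the dimension of the joint state space. The symbols n (qubits per processor) and k (number of processors) are illustrative and do not appear in the survey.

```latex
% Assumption: computing power is measured by the dimension of the state space.
% n = qubits per processor, k = number of interconnected processors (illustrative symbols).
\[
  \dim \mathcal{H}_{\text{single}} = 2^{n},
  \qquad
  \dim \mathcal{H}_{\text{clustered}} = \left(2^{n}\right)^{k} = 2^{kn}.
\]
% Example: n = 10 and k = 4 give 2^{40} \approx 1.1 \times 10^{12} basis states
% for the clustered system, against 4 \cdot 2^{10} = 4096 for four isolated processors.
```

In practice the usable exponent is likely smaller than kn, since some qubits in each processor have to be reserved for inter-processor communication.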
Unfortunately, the existing literature on DQC is spread among dif-
ferent research communities – ranging from the physics through the
communications/computer engineering to the computer science com-
munity – leading to a fundamental gap. The aim of this survey is
precisely to bridge this gap, by introducing the astonishing and intrigu-
ing properties of distributed quantum computing, with the objective of
allowing the reader:

(i) to master the implications of the novel and distinct characteristics of quantum information, for understanding the differences between distributed classical computing and distributed quantum computing;
(ii) to grasp the challenges as well as to appreciate the marvels arising with the paradigmatic shift from monolithic to distributed quantum computing.

Due to the fast growth of this research field, such an understanding gives the computer science and the communications engineering communities easy access to, and a guide through, the relevant literature and the prominent results, which is of paramount importance for advancing the state-of-the-art.
The survey provides perspectives – including state-of-the-art and challenges – on four different areas related to DQC, namely: algorithms, networking, compiling, and simulation, as detailed in Section 1.1.

Fig. 2. Paper Structure.

1.1. Outline

As illustrated in Fig. 1, for each of the aforementioned four pillars, the most relevant aspects are analyzed and discussed.

A quantum network infrastructure is a fundamental pre-requisite for any form of DQC. Therefore, throughout the survey we shed light on the communication primitives required for inter-networking different quantum processors. We discuss the main challenges arising with this inter-networking, by introducing the reader to the fundamental differences between interconnecting remote classical processors versus interconnecting remote quantum processors. Regarding algorithms, the focus is on the crucial and specific challenges arising when moving from monolithic to distributed quantum computing, ranging from appropriate description formats through quantum algorithm partitioning to execution management. Quantum Compiling, instead, deals with translating a hardware-agnostic description of the algorithm into a functionally equivalent description that takes into account the physical constraints of the underlying computing architecture [13,15]. Indeed, within the context of DQC, the compiling must account also for the network constraints, which impact the strategy adopted for splitting the algorithm into "portions" to be concurrently executed on the individual quantum processing units (QPUs).^3 As a matter of fact, a key goal is to minimize the number of remote operations, i.e., operations involving different QPUs. Last but not least, the design of DQC architectures can be highly facilitated by adequate simulation tools, as discussed and detailed in the manuscript.

The paper is structured as depicted in Fig. 2. Specifically, in Section 2, we introduce some preliminaries by highlighting the differences between monolithic and distributed computing. Then, in Section 3, we provide the reader with the networking functionalities required by the distributed computing paradigm, by detailing the pivotal role played by communication infrastructure to enable distributed quantum computing. In Section 4, we focus on quantum algorithms as well as their execution management in the light of the distributed paradigm. In Section 5, we describe some relevant approaches to the problem of compiling quantum algorithms for distributed execution. In Section 6, we provide an overview of the most advanced simulation tools, by discussing their suitability for the design and analysis of distributed quantum computing architectures. In Section 7, we discuss the open issues and the research directions for each of the four pillars of this survey. Finally, we conclude our survey in Section 8, by first providing a discussion about the main differences between distributed classical and quantum computing, and then by providing an industrial perspective on DQC.

3 Throughout the manuscript, the two terms quantum processor and quantum processing unit are used as synonyms.

2. Distributed quantum computing

The purpose of this section is to briefly introduce the main differences between monolithic and distributed quantum computing. To this aim, intra-chip connectivity is discussed first, as it plays a key role in both paradigms.

2.1. Monolithic quantum computing

Monolithic Quantum Computing refers to the execution of a quantum algorithm on a single quantum processor.

As briefly described in the Quantum Circuit Model box, a quantum algorithm is commonly modeled by a quantum circuit. Quantum circuits are made of quantum gates and, in general, the set of gates that can be executed on a certain quantum processor is finite, i.e., constituted by few quantum gates, as a consequence of the constraints imposed by the underlying qubit technology [26–45]. Thus, any gate outside the aforementioned set must be obtained with a proper combination of the allowed gates, through a process known as gate synthesis. As an example, IBM quantum processors are realized exploiting the superconducting technology, as mentioned before. And any logical gate that can be run on the current IBM quantum platform is built from a gate set composed of the CNOT gate and four single-qubit gates, namely, the I, R_z, SX, and X gates.

It must be observed that a discrete set of gates cannot be used to implement any arbitrary unitary operation exactly, since the set of unitary operations is continuous [46]. In other words, for any finite set of gates there exist unitary transformations that cannot be realized as a combination of these gates. However, there exist finite sets of gates – referred to as universal gate sets – that can approximate any unitary transformation to arbitrary accuracy [47]. And indeed, for any level of accuracy, this approximation can be done efficiently according to the Solovay–Kitaev theorem [46,47].

Fig. 3. Coupling map of a superconducting quantum processor [24,25]. The five physical qubits stored within the processor are represented by circles. The arrows denote the possibility to realize a two-qubit CNOT gate between the five qubits. As an example, a CNOT between qubits q0 and q1 can be directly executed by the quantum processor, whereas a CNOT between qubits q0 and q2 cannot.

From the above, one can safely assume that a quantum algorithm can be executed on a given quantum processor, either in the form described by the original quantum circuit or by properly replacing unavailable gates with equivalent sequences of the available ones. To do this, in the context of monolithic quantum computing, it is necessary that the number of physical qubits within the processor is at least equal to the circuit width, i.e., to the number of logical qubits. In fact, each logical qubit within the quantum circuit must be assigned to a physical qubit of the quantum processor.

However, in the NISQ age, a logical qubit usually has to be mapped onto several physical qubits for implementing proper fault-tolerant techniques [48]. Hence, the number of physical qubits available in a single processor may not be sufficient to execute the quantum algorithm. As a consequence, the consensus of both academic and industry communities for realizing large-scale quantum processors is to adopt the DQC paradigm, discussed below.

2.2. Archetypes for distributed quantum computing

In Distributed Quantum Computing, quantum processors, limited in the number of qubits, work together for solving the computation associated with a quantum algorithm. Hence, and as illustrated with the toy model in Fig. 4, a distributed quantum computation involves non-local gates, i.e., it involves operations between qubits belonging to different processors. Also (classical) distributed computing involves operations between bits stored at different processors. And these non-local operations are executed through data replications, namely, by simply copying and sending the bits from one processor to another. So one might be tempted to believe that the same strategy can be adopted when it comes to DQC. Unfortunately, this is not true due to the laws of quantum mechanics, such as the quantum measurement postulate and the no-cloning theorem. Hence DQC requires a paradigm shift for dealing with inter-processor communications, as discussed in depth in the remaining part of the manuscript.

Regardless of the challenges connected to DQC, one must notice that DQC can be realized according to different archetypes, related to the development of the underlying network infrastructure and maturity of the quantum technologies, as represented in Fig. 5. In the following we introduce these archetypes, while in Section 7 we describe the challenges and open problems connected to them.

2.2.1. Multi-core quantum architectures

The first archetype for DQC is the one exploiting the interconnection of multiple QPUs within a single quantum computer. This results in an architecture known as multi-chip [49] or multi-core [50,51]. The quantum hardware underlying the qubits is likely to be homogeneous among the different processors. Yet, some sort of hardware heterogeneity may arise within each processor due to the differences in terms of requirements between qubits devoted to storing quantum states, referred to as memory qubits, and qubits devoted to computational/processing


Intra-chip Connectivity

Quantum computing requires quantum states to be manipulated not only via single-qubit gates, but also through multi-qubit gates, mainly two-qubit gates such as CNOT or CZ gates^a. This implies that, within a quantum chip, physical qubits must be able to interact in a controlled fashion. Indeed, the underlying quantum technology influences the interaction features of the physical qubits. Specifically, there exists a large class of different quantum computing platforms where two-qubit gates cannot be applied to any physical qubit pair of a quantum processor, but they are instead restricted to certain pairs. These limitations arise as a consequence of both: (i) noise effects induced by qubit-interactions, and (ii) physical-space constraints within a single processor [13]. In this class of quantum computing platforms, the quantum devices are characterized by a coupling graph that specifies which qubits may interact. In more detail, in the coupling graph^b, vertices denote qubits and arrows denote the possibility of realizing a two-qubit gate between the connected qubits – as illustrated in Fig. 3. Among the technologies in the aforementioned category, we can mention superconducting qubits, utilized for example by Google [26], IBM [27], Rigetti [28], Alice & Bob [29], Anyon [30], IQ [31], and OQC [32]. But also quantum dots – utilized for example by Intel [33], C12 [34], and Quobly [35] – and color-center qubits – such as NV centers in diamonds utilized by Quantum Brilliance [36] – belong to this category.

From the above, it becomes clear that any quantum computation executed on a quantum processor belonging to this category requires that each multi-qubit operation between non-adjacent (within the coupling graph) physical qubits is mapped into a sequence of operations between adjacent physical qubits [13]. This mapping process, known as quantum compilation and described in detail in Section 5, must be properly optimized so that the overhead for satisfying all the constraints imposed by the coupling graph is minimized.

Quantum compilation must be performed regardless of the adopted quantum computing paradigm – i.e., monolithic vs distributed – since in both paradigms multi-qubit operations are restricted to act only on adjacent physical qubits. However, the overall optimization process underlying quantum compilation is more challenging in DQC, since multi-qubit operations can involve qubits belonging to different chips.

Another category of quantum technologies, instead, does not constrain the interactions among qubits, i.e., their coupling graph is fully connected. It is the case of quantum devices based on trapped ions – such as the ones developed by AQT [37], IonQ [38], Quantinuum [39], and Oxford Ionics [40] – or neutral atoms, such as the ones developed by PASQAL [41], QuEra [42], Atom Computing [43], and Infleqtion [44]. Complete connectivity among physical qubits constitutes an advantage over the partially connected topologies exhibited by the first category of qubit technologies, since extra gates are not needed for moving quantum states to nearest-neighbor qubits. Yet, these technologies, although they can operate at room temperature, exhibit different weaknesses [2]. Regarding trapped ions, scaling the number of particles to large numbers is challenging. With neutral atoms, the repetition rate – i.e., the "computing clock" – is currently lower than other platforms, mainly limited by the time required for the preparation of the qubit array and a destructive readout. Furthermore, the interfacing with classical electronic hardware is generally more complex when compared with other technologies.

a See the Quantum Circuit Model box for a concise description of controlled gates such as CNOT or CZ.
b We highlight that, in literature, coupling graphs are also referred to as coupling maps. And in the remaining part of the manuscript we adopt such a widely-adopted terminology.
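The mapping of a non-adjacent two-qubit gate onto adjacent pairs, as described in the Intra-chip Connectivity box, can be sketched as follows. This is a minimal, naive illustration and not the compilation strategy of any specific toolchain. The five-qubit coupling map `COUPLING_EDGES` is hypothetical (it only reproduces the property stated for Fig. 3 that q0 and q1 are adjacent while q0 and q2 are not), edges are assumed undirected, and the helper names are invented for this example.

```python
from collections import deque

# Hypothetical 5-qubit coupling map (assumption: not the exact map of Fig. 3).
# Each pair lists two physical qubits that can interact directly.
COUPLING_EDGES = [(0, 1), (1, 2), (1, 3), (3, 4)]


def build_adjacency(edges):
    """Turn an undirected edge list into a neighbor dictionary."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def shortest_path(adjacency, source, target):
    """Breadth-first search returning the list of qubits from source to target."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(adjacency[node]):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    raise ValueError("qubits are not connected in the coupling map")


def route_cnot(adjacency, control, target):
    """Express CNOT(control, target) using only operations on adjacent qubits.

    Naive strategy: SWAP the control state along the shortest path until it
    sits next to the target, apply the CNOT, then undo the SWAPs so that the
    original qubit layout is restored.
    """
    path = shortest_path(adjacency, control, target)
    swaps = [("SWAP", path[i], path[i + 1]) for i in range(len(path) - 2)]
    cnot = [("CNOT", path[-2], path[-1])]
    return swaps + cnot + swaps[::-1]


adjacency = build_adjacency(COUPLING_EDGES)
direct = route_cnot(adjacency, 0, 1)   # adjacent pair: a single CNOT
routed = route_cnot(adjacency, 0, 2)   # non-adjacent pair: SWAP, CNOT, SWAP
print(direct)
print(routed)
```

Each inserted SWAP costs extra two-qubit gates, which is the overhead that the optimization mentioned in the box aims to minimize. Practical compilers reason over the whole circuit rather than one gate at a time.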

tasks. Indeed, memory qubits likely require coherence times several orders of magnitude larger than computational qubits.

Multi-core quantum architectures are also called Quantum Networks-on-Chip (QNoC). The rationale for this naming is to highlight that some sort of chip-scale network is employed for interconnecting different quantum computing modules [52–55]. In a QNoC architecture, the physical distance between remote qubits – i.e., qubits belonging to different computing modules – is very short, ranging from same-rack/same-refrigerator to same-optical table distances. Accordingly, the degrees of freedom in scaling the number of cores to be interconnected within such a space are lower than the degrees available in Multi-Computer architectures, as analyzed in the next subsection. This, in turn, implies that the number of physical qubits that can be clustered together in a Multi-Core quantum architecture [48] is limited as well.

We further observe that this DQC archetype is less demanding than the other two archetypes in terms of maturity of the underlying quantum computing technologies. In fact, several issues characterizing the other two archetypes – such as quantum transduction and medium-to-long-range quantum communications – are not present in the multi-core architecture. This, in turn, implies that multi-core DQC architectures are also the most investigated in literature, whereas the state-of-the-art of the other architectures is still in its infancy.

2.2.2. Multi-computer quantum architectures

In the second DQC archetype, the distributed computation is performed collectively by multiple quantum computers located within the same farm and interconnected via some sort of Quantum LAN (QLAN) [56]. Thus, some sort of hardware heterogeneity might arise, given that different quantum computers are involved in the computation. Such a heterogeneity must be taken into consideration by the distributed quantum computing ecosystem [12,16], as analyzed in Section 7. As represented in Fig. 5, the physical distance between remote qubits in a multi-computer quantum architecture increases with respect to the multi-core architectures, since the qubits that may interact could belong to different computers. Accordingly, the distances among remote qubits are between room-wide and building-wide, and the number of physical qubits that can be clustered together is bounded by the number of computers that can be interconnected within a server farm.

From the above, it is evident that, although more demanding in terms of quantum technologies maturity than QNoC, this type of DQC archetype provides more degrees of freedom in optimizing the distributed computation.

2.2.3. Multi-farm quantum architectures

In the third archetype of DQC, the distributed computation exploits multiple geographically-distributed quantum farms. Hence, the hardware heterogeneity is significant, given that the different quantum farms are likely operated by different companies. Furthermore, the interconnection of geographically-distributed quantum farms requires a wide-scale network infrastructure, likely achieved through the Quantum Internet [4,16,57].

From the above, one can conclude that the main features of this last DQC archetype are the increasing number of quantum devices to be


Quantum Circuit Model

The quantum circuit [46] is the most popular model of quantum computation, where quantum operators are described as quantum gates. In more detail, by sequentially interconnecting different quantum gates, a quantum circuit models the processing of quantum information corresponding to a specific quantum algorithm [13]. Indeed, there exist several equivalent quantum circuits modeling the same computation with a different arrangement or different ordering of gates.

A very simple example of quantum circuit is provided in the figure below, where each horizontal line represents the time evolution of the state of a single (logical) qubit, with time flowing from left to right, dictating the order of execution of the different gates.

Quantum gates are described by unitary matrices relative to some basis, i.e., matrices U such that U†U = I. It follows that ideal (or noise-free) quantum computation is reversible: it is always possible to invert a quantum computation.

[Figure: two-qubit circuit. Both qubits start in |0⟩; an H gate acts on the first qubit, followed by a CNOT with the first qubit as control, producing |Φ+⟩ = (|00⟩ + |11⟩)/√2.]

As a matter of fact, every unitary operator U on a single qubit can be formulated as:

\[ U = e^{i\theta_1} R_x(\theta_2) R_y(\theta_3) R_z(\theta_4), \qquad \theta_i \in \mathbb{R} \tag{1} \]

with R_i denoting the i-axis rotation operator. More precisely, the possibility of implementing two arbitrary rotation operators is sufficient, as their combined application can be exploited to obtain the third type of rotation in (1).

Among two-qubit gates, highly relevant are the controlled ones. The generic Controlled-U two-qubit gate operates on two qubits, namely on a control qubit (controlling the operation) and on a target qubit (subjected to the operation). By denoting with |φ_c⟩ and |φ_t⟩ the control and target qubits respectively, the effect of the Controlled-U gate on the target qubit is the following:

\[ \begin{cases} I|\varphi_t\rangle & \text{if } |\varphi_c\rangle = |0\rangle \\ U|\varphi_t\rangle & \text{if } |\varphi_c\rangle = |1\rangle \end{cases} \tag{2} \]

with I denoting the identity operation. The CNOT gate is a Controlled-U gate, where U is the Pauli-X gate. The CNOT gate can be used to create or destroy entanglement among the qubits. Specifically, as depicted in the circuit model figure, to obtain an entangled state we may start from the separable input |00⟩ and, by applying H to the first qubit, we obtain (|00⟩ + |10⟩)/√2. Finally, by applying a CNOT gate (where the first qubit acts as control qubit), the resulting state is the Bell state |Φ+⟩:

\[ |\Phi^+\rangle = \frac{1}{\sqrt{2}} \left( |00\rangle + |11\rangle \right). \tag{3} \]
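The Bell-state preparation described in the Quantum Circuit Model box (H on the first qubit, then a CNOT controlled by it) can be checked with a small state-vector calculation. The sketch below is illustrative only: it uses plain Python lists instead of a quantum SDK, the helper names are invented for this example, and it assumes that qubit 0 is the leftmost bit of each basis label.

```python
import math

SQRT_HALF = 1 / math.sqrt(2)
H = [[SQRT_HALF, SQRT_HALF], [SQRT_HALF, -SQRT_HALF]]  # Hadamard gate


def apply_single_qubit_gate(state, gate, qubit, num_qubits):
    """Apply a 2x2 gate to one qubit of a state vector (qubit 0 = leftmost bit)."""
    stride = 2 ** (num_qubits - 1 - qubit)
    new_state = [0j] * len(state)
    for index, amplitude in enumerate(state):
        bit = (index // stride) % 2
        base = index - bit * stride  # same basis index with this qubit set to 0
        for new_bit in (0, 1):
            new_state[base + new_bit * stride] += gate[new_bit][bit] * amplitude
    return new_state


def apply_cnot(state, control, target, num_qubits):
    """Flip the target bit of every basis state whose control bit is 1."""
    c_stride = 2 ** (num_qubits - 1 - control)
    t_stride = 2 ** (num_qubits - 1 - target)
    new_state = [0j] * len(state)
    for index, amplitude in enumerate(state):
        if (index // c_stride) % 2 == 1:
            target_bit = (index // t_stride) % 2
            flipped = index + t_stride if target_bit == 0 else index - t_stride
            new_state[flipped] += amplitude
        else:
            new_state[index] += amplitude
    return new_state


# Amplitudes are ordered as |00>, |01>, |10>, |11>.
initial = [1 + 0j, 0j, 0j, 0j]                        # separable input |00>
after_h = apply_single_qubit_gate(initial, H, 0, 2)   # (|00> + |10>)/sqrt(2)
bell = apply_cnot(after_h, 0, 1, 2)                   # (|00> + |11>)/sqrt(2)

print([round(abs(a) ** 2, 3) for a in bell])          # outcome probabilities
```

Only the amplitudes of |00⟩ and |11⟩ are non-zero in `bell`, which is the signature of the state in (3): the two measurement outcomes are perfectly correlated.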

wired and the heterogeneity of the environments hosting the quantum computers.

Regardless of the considered DQC archetype, it is worthwhile to mention that networking primitives with no counterpart in the classical world are needed for enabling the data transfer among remote qubits required by a distributed quantum computation. These primitives are introduced and overviewed in the following section.

Fig. 4. Toy model for distributed quantum computation. The quantum circuit is composed of three two-qubit gates, i.e., CNOTs. The first and last gates operate locally, namely, between qubits stored within the same QPU, whereas the intermediate gate operates remotely, namely, between qubits stored within different QPUs.

3. Quantum networking: Enabling remote operations

As mentioned in the previous sections – regardless of the specific DQC archetype – when it comes to distributed quantum computing, qubits are distributed among multiple devices, interconnected by some sort of quantum network infrastructure.

Accordingly, whenever a quantum gate must operate on remote qubits, some sort of communication primitive must be available for performing inter-processor operations. Unfortunately, this communication primitive cannot be accomplished through classical protocols. Indeed, the physical phenomena underlying quantum communications with no classical counterpart impose a paradigm shift.

To better substantiate the above statement, in Section 3.1 we introduce the communication primitives required by DQC, by also discussing the role played in DQC by the classical communication infrastructure. Then, in Section 3.2, we present a key strategy – referred to as entanglement swapping – for artificially augmenting the connectivity among different quantum processors, by exploiting entangled states.

3.1. Communication primitives

Here, we discuss the two main strategies – namely, TeleGate and TeleData – for implementing quantum gates between remote qubits.

3.1.1. Direct qubit transmission

Distributed classical computing extensively relies on the possibility of freely duplicating information. But this basic assumption does not hold when it comes to DQC [58,59], according to the no-cloning theorem. Furthermore, according to the measurement postulate, even the simple action of measuring a qubit – i.e., reading the quantum information stored within – irreversibly alters its quantum properties, such as superposition and entanglement.

The above peculiarities of quantum mechanics have deep implications on the communication primitives underlying DQC [4]. To further elaborate on the above statement, let us clarify that it is possible to map a qubit into a photon degree of freedom by directly transmitting

Fig. 5. Archetypes for Distributed Quantum Computing, with three different dimensions – i.e., scale, interconnection and heterogeneity – highlighted for the sake of comparison.

this qubit to a remote processor, e.g., via a fiber link or free space.^4 However, if the traveling photon is lost due to attenuation or it is corrupted by decoherence, the associated quantum information cannot be recovered via a measuring process or by re-transmitting a copy of the original information. Specifically, any quantum system inevitably interacts with the environment and it is afflicted by decoherence, a phenomenon that irreversibly scrambles the quantum state and therefore its inner information [60]. This kind of quantum noise affects every quantum operation, from qubit processing through qubit storing to qubit transmission, and it causes an irreversible loss of the quantum information as time passes. As a consequence of the peculiarities of quantum decoherence, the techniques for mitigating the imperfections introduced by qubit transmission cannot be directly borrowed from classical communications [61], and the direct transmission of qubits remains, at the time of writing, limited to special cases characterized by relatively short distances and tolerance to losses and low transmission success rate, such as Quantum Key Distribution (QKD) networks [4]. Clearly, DQC cannot be considered as an application tolerating losses or low transmission success rate, since reliable computations need fault-tolerance to errors.

Thankfully, quantum entanglement [62] can be exploited as the key communication resource to avoid the issues arising with the direct transmission of data qubits. Indeed, entanglement enables a communication technique, known as quantum teleportation [4], for transmitting an unknown qubit without the physical transfer of the particle storing the qubit, as described in the following.

3.1.2. TeleData primitive

Whenever two qubits are entangled – as for the Bell state |Φ+⟩ given in (3) – they exist in a shared state, such that any action on a qubit instantaneously affects the other qubit as well, regardless of the distance [4,58]. This unconventional correlation is exploited by the quantum teleportation protocol [61], which enables the possibility of "transmitting" – namely, teleporting – an unknown qubit without the physical transfer of the particle storing the qubit, by exploiting a pair of maximally entangled qubits (such as |Φ+⟩) shared between source and destination (see Fig. 6).

In more detail, the source performs a pre-processing, namely, a Bell State Measurement (BSM) on both the unknown qubit encoding the information to be transmitted – say |ψ⟩ – and the entangled qubit. As represented in the gray box in the figure, the BSM consists of a CNOT gate – with |ψ⟩ acting as control and the entangled qubit acting as target – followed by a Hadamard gate on |ψ⟩ and, finally, a measurement of both qubits. Then, the source transmits – through classical communications – two classical bits encoding the measurement outcomes of the BSM. Remarkably, after the BSM, the source quantum state has already been teleported to the destination. Nevertheless, the teleported state may have undergone a phase-flip and/or a bit-flip, with each of the four combinations of flip events occurring with a probability equal to 0.25. Luckily, the measurement of the two qubits at the source allows the destination – once the measurement outcomes have been received through a classical communication channel – to determine whether these flip events occurred. Hence, the destination performs a post-processing to reconstruct the original state |ψ⟩, as detailed in Table 1.

Briefly, by pre-sharing a maximally-entangled pair of qubits,^5 two nodes can reliably exchange quantum information through the teleportation process [63], overcoming the limitations of direct data transmissions. Hence, quantum teleportation constitutes the fundamental communication protocol underlying the communication paradigms known as TeleData and TeleGate [64], which generalize the concept of moving quantum states among remote devices in DQC.

To provide concrete insights on the TeleData and TeleGate concepts, we must first classify qubits within a QPU either as communication qubits or as data qubits [12]. Specifically, within each quantum processor, a subset of qubits is reserved for storing entangled states enabling inter-processor communications. And we refer to these qubits

4 If the first DQC archetype, i.e., the multi-core one, is analyzed this consideration still holds, but a conversion – aka quantum transduction as analyzed in Section 7 – between quantum states may be unnecessary.
5 We may observe that direct transmission of qubits is still needed to distribute entangled states among the network nodes. However, and as clarified in depth in [58], differently from unknown qubits, entangled states can be repeatedly prepared for coping with losses and/or noise corruptions.

Table 1
Quantum teleportation: post-processing operations to be performed at the destination for recovering the original quantum state, stemming from the measurement output.

Measurement output    Decoding operation
00                    I
01                    X
10                    Z
11                    X followed by Z

6
M. Caleffi et al. Computer Networks 254 (2024) 110672

Fig. 6. Circuital representation of the quantum teleportation process. The first two wires belong to the source node, whereas the bottom wire belongs to the destination node. A
generic qubit |𝜓⟩ is initially stored at the source, and a Bell state such as |𝛷+ ⟩ given in 3 must be distributed through a quantum link so that one entangled particle is stored
at the source and the other at the destination. Once the Bell state is available, the teleportation is obtained with some processing on |𝜓⟩ and on the entangled qubit at the
source, followed by two conditional gates on the entangled qubit at the destination, depending on the measurement of the two qubits at the source. Each double line denotes
the transmission of one classical bit – i.e., the measurement output – between the remote processors. The two classical bits are thus used, as detailed in Table 1, for determining
whether the two conditional gates X and Z must be applied to recover the original state |𝜓⟩ from the entangled qubit available at the destination.

as communication qubits [57], to distinguish them from the remaining qubits within the device devoted to processing/storage, which we refer to as either data or memory qubits, as pointed out in Section 2.2. Specifically, at least one qubit at each processor must be a communication qubit, i.e., a qubit reserved for the generation and distribution of the entangled state [57]. The more communication qubits are available within a quantum processor, the more entanglement resources are available at that processor, with an obvious positive effect on the achievable entanglement rate [58]. But the more communication qubits are available, the fewer data qubits are available for quantum computing.

For instance, let us consider two quantum processors interconnected via a quantum network as depicted in Fig. 7. Qubits 𝑞₃ and 𝑞′₀ are communication qubits, and any interaction between the two remote processors is carried out by exploiting them via either a TeleData or a TeleGate process.

With a TeleData, quantum information stored within a data qubit at the first processor – say 𝑞₄ in Fig. 7(a) – is teleported into a communication qubit – say 𝑞′₀ in the same figure – of the second processor. Once the quantum state is teleported into 𝑞′₀, any remote operation – originally involving 𝑞₄ and some data qubits at the second processor – can now be implemented through local operations, as shown with the last CNOT in Fig. 7(b). It must be noted, though, that if the teleported quantum state must subsequently interact with data qubits at the first processor, a new teleportation process must be performed to teleport the quantum state back to the first processor.

3.1.3. TeleGate primitive

TeleData is not the only available option for implementing remote quantum operations in DQC. Indeed, TeleGate represents another option, which exploits a variation of the teleportation process to overcome the limitations of direct data transmissions. Specifically, TeleGate enables a direct gate between remote physical qubits stored at different processors without the need to move the data qubits between the processors, as long as a Bell state such as |𝛷+⟩ is distributed between the two processors. For the sake of exemplification, let us assume that a remote CNOT between data qubits 𝑞₄ and 𝑞′₁ in Fig. 7(a), with 𝑞₄ acting as control and 𝑞′₁ as target, must be performed. According to the TeleGate primitive, this remote CNOT can be implemented with two local CNOTs between the data and the communication qubit at each processor, followed by a conditional gate on the data qubit depending on the measurement of the remote communication qubit, as shown with the quantum circuit in Fig. 7(c). As a consequence, TeleGate performs the remote operation mimicking a direct gate – i.e., as if the involved qubits were directly connected on the same processor – without “moving” data qubits between the processors.

From a communication resource perspective, TeleData and TeleGate consume the same amount of quantum and classical resources, namely one EPR pair and the transmission of two classical bits. Yet the overall performance of the two strategies depends on a range of factors, including (i) the pattern of remote operations to be executed, (ii) the characteristics of the network interconnecting the remote quantum processors, and (iii) the ratio between data and communication qubits [13,58,64].

With reference to the latter factor, a fundamental trade-off arises [13]. Specifically, each remote operation – regardless of whether it is implemented with a TeleData or a TeleGate – consumes the entangled resource. Consequently, a new Bell state must be distributed between the remote processors before another remote operation can be executed. Hence, the more communication qubits are available within each processor, the more remote operations can be executed in parallel, reducing the communication overhead induced by the distributed computation. But the more communication qubits there are, the fewer data qubits are available for computing in each processor.

According to the above reasoning, the selection of the set of communication qubits is a crucial task for DQC, with profound effects on the overall performance of the distributed computation, as analyzed in the next sections.

3.1.4. Classical control and communications

It is worthwhile to mention that other communication primitives required by DQC are the ones provided by the classical network. Specifically, as highlighted above and regardless of the technology and the qubit archetype, DQC depends on the availability of classical communication and network functionalities for managing the classical signaling. As a matter of fact, classical signaling is required by both TeleData and TeleGate. However, classical signaling is not limited to the 2 bits required by quantum teleportation: it rather constitutes a requirement widespread within different DQC tasks, ranging from entanglement generation through distillation to swapping (discussed in the next section). And, indeed, it is fairly reasonable to assume these classical services as provided by existing classical network infrastructures [59,65].

3.2. Augmented coupling map

From Fig. 7, it may seem that DQC requires a fully-connected network topology, where each quantum processor must be directly inter-connected with all the other processors. This would, in turn, imply that the communication primitives would heavily depend on the availability of a direct (one-hop) entanglement generation and distribution architecture. However, the reality is quite the opposite. Specifically, DQC can exploit a strategy known as entanglement swapping [45], as summarized in Fig. 8, to implement a remote CNOT between qubits stored at remote processors, even if the processors are not directly connected through a quantum link.

Fig. 7. Remote quantum operations through either TeleData or TeleGate. Fig. 7(a) shows the network topology along with the processors coupling maps, whereas Figs. 7(b)
and 7(c) illustrate the quantum circuit detailing the classical (2 bits) and the quantum (the Bell state) resources needed to execute a TeleData and a TeleGate, respectively.
Source: Figure reproduced from [13].
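The TeleGate circuit of Fig. 7(c) can be checked numerically. The following is a minimal NumPy sketch under stated assumptions, not the implementation of [13]: the helper names (`apply_1q`, `apply_cnot`, `measure`, `telegate_cnot`), the qubit ordering and the noiseless setting are all illustrative choices. It uses one shared |𝛷+⟩ pair, two local CNOTs, two measurements and two classically conditioned corrections, and then compares the resulting data-qubit state with a direct CNOT, up to a global phase.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])


def apply_1q(state, gate, q):
    """Apply a single-qubit gate to axis q of an n-qubit state tensor."""
    state = np.tensordot(gate, state, axes=([1], [q]))
    return np.moveaxis(state, 0, q)


def apply_cnot(state, control, target):
    """Flip `target` inside the subspace where `control` is |1>."""
    state = state.copy()
    idx = [slice(None)] * state.ndim
    idx[control] = 1
    # Indexing with an integer drops the control axis, shifting later axes.
    t = target if target < control else target - 1
    state[tuple(idx)] = np.flip(state[tuple(idx)], axis=t).copy()
    return state


def measure(state, q, rng):
    """Projective Z measurement of qubit q: (outcome, collapsed state)."""
    p1 = np.sum(np.abs(np.take(state, 1, axis=q)) ** 2)
    outcome = int(rng.random() < p1)
    idx = [slice(None)] * state.ndim
    idx[q] = 1 - outcome
    state = state.copy()
    state[tuple(idx)] = 0  # discard the branch that was not observed
    return outcome, state / np.linalg.norm(state)


def telegate_cnot(psi_ctrl, psi_tgt, rng):
    """Remote CNOT using one Bell pair and two classical bits (assumed layout)."""
    bell = np.eye(2, dtype=complex) / np.sqrt(2)  # |Phi+> on (C1, C2)
    # Axes: 0 = control data qubit, 1 = C1 (same QPU as control),
    #       2 = C2 (same QPU as target), 3 = target data qubit.
    state = np.einsum("a,cd,b->acdb", psi_ctrl, bell, psi_tgt)
    CTRL, C1, C2, TGT = 0, 1, 2, 3
    state = apply_cnot(state, CTRL, C1)      # local CNOT at the first QPU
    state = apply_cnot(state, C2, TGT)       # local CNOT at the second QPU
    m1, state = measure(state, C1, rng)      # first classical bit
    if m1:
        state = apply_1q(state, X, TGT)      # conditional gate on the target
    state = apply_1q(state, H, C2)
    m2, state = measure(state, C2, rng)      # second classical bit
    if m2:
        state = apply_1q(state, Z, CTRL)     # conditional gate on the control
    return state[:, m1, m2, :], (m1, m2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ctrl = np.array([0.6, 0.8j])
    tgt = np.array([1, 1]) / np.sqrt(2)
    out, bits = telegate_cnot(ctrl, tgt, rng)
    ref = apply_cnot(np.einsum("a,b->ab", ctrl, tgt), 0, 1)
    print("classical bits exchanged:", bits)
    print("fidelity with a direct CNOT:", abs(np.vdot(ref, out)) ** 2)
```

The data qubits never leave their processors in this sketch; only the two measurement outcomes cross the network, which is the resource count quoted in Section 3.1.3.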

For the sake of exemplification, to distribute a Bell state between remote processors – say quantum processors #1 and #3 in Fig. 8(a) – two Bell states must first be distributed, so that one Bell state is shared between the first processor and an intermediate processor/node – usually referred to as quantum repeater [66] – and another Bell state is shared by the same intermediate node and the second processor. Then, by performing a BSM on the communication qubits at the intermediate node – i.e., qubits 𝑞′₀ and 𝑞′₃ in Fig. 8(b) – a Bell state is obtained at the remote communication qubits 𝑞₃ and 𝑞″₀ in Fig. 8(b), by applying some local processing at the remote processors depending on the (classical) output of the Bell state measurement.

Once remote processors share Bell states, they may operate as if they were neighbors in the physical topology [67], since remote operations can be promptly performed. In other words, entanglement enables half-duplex unicast links between any pair of processors sharing it, regardless of their relative positions within the underlying physical network topology, thus redefining the very concept of topological neighborhood, with no counterpart in the classical world [67,68]. As a consequence, the entanglement-enabled connectivity allows the neighbor set to be augmented, by creating “additional” links, referred to as artificial quantum links [56,69], toward remote processors.

From the above, it becomes clear that entanglement swapping significantly increases the degrees of freedom in performing remote quantum operations, through the artificial quantum links, by enabling a dynamic coupling map, which includes the physical links as well, as represented in Fig. 8. The higher the number of available quantum processors, the higher the number of possible artificial links. Indeed, this number scales linearly with the number of available processors when only two communication qubits are available at each intermediate processor. If this constraint is relaxed, the number of artificial links via entanglement swapping scales more than linearly.

It must be acknowledged that such an augmented connectivity does not come for free. Entanglement swapping consumes Bell states at each intermediate processor. And the longer the path between the two processors involved in the remote operation, the higher the number of consumed Bell states. The more Bell states are devoted to entanglement swapping, the fewer Bell states are available for implementing remote operations between neighbor quantum processors. Hence, a trade-off between “augmented connectivity” and “EPR cost” arises with entanglement swapping [13], and the impact of this trade-off on the overall performance of distributed quantum computing must be carefully accounted for.
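To make the “EPR cost” side of this trade-off concrete, the following is a minimal sketch of how the cost of one artificial link could be estimated. It rests on simplifying assumptions that are not taken from [13]: swapping follows a shortest path of physical links, each hop contributes exactly one elementary Bell state, every swap succeeds, and no purification is performed. The function names and the example topology are hypothetical.

```python
from collections import deque


def hop_count(links, src, dst):
    """Breadth-first search over physical quantum links; None if unreachable."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == dst:
            return hops
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None


def artificial_link_cost(links, src, dst):
    """Elementary Bell states and swaps consumed by one end-to-end Bell state.

    Assumption: one Bell state per hop and one BSM per intermediate node.
    """
    hops = hop_count(links, src, dst)
    if not hops:  # unreachable (None) or src == dst (0)
        return None
    return {"hops": hops, "bell_pairs": hops, "swaps": hops - 1}


if __name__ == "__main__":
    # Hypothetical line of four QPUs: 1 - 2 - 3 - 4.
    physical_links = [(1, 2), (2, 3), (3, 4)]
    for pair in [(1, 2), (1, 3), (1, 4)]:
        print(pair, artificial_link_cost(physical_links, *pair))
```

Under these assumptions the cost grows with the path length, so every Bell state spent on a long artificial link is one that neighboring processors cannot use for their own remote operations.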

Fig. 8. Augmented connectivity. Entanglement swapping increases the connectivity between physical qubits, with a number of possible remote CNOTs that scales at least linearly
with the number of processors.
Source: Figure reproduced from [13].
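The swapping step behind Fig. 8 can also be verified on a four-qubit state vector. The sketch below is illustrative only: it assumes ideal |𝛷+⟩ pairs, noiseless gates, and corrections applied entirely at the far end, and the helper names are not taken from any library. It performs the BSM on the two communication qubits of the intermediate node and checks that the two end qubits are left in |𝛷+⟩.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])


def apply_1q(state, gate, q):
    """Apply a single-qubit gate to axis q of an n-qubit state tensor."""
    state = np.tensordot(gate, state, axes=([1], [q]))
    return np.moveaxis(state, 0, q)


def apply_cnot(state, control, target):
    """Flip `target` inside the subspace where `control` is |1>."""
    state = state.copy()
    idx = [slice(None)] * state.ndim
    idx[control] = 1
    t = target if target < control else target - 1
    state[tuple(idx)] = np.flip(state[tuple(idx)], axis=t).copy()
    return state


def measure(state, q, rng):
    """Projective Z measurement of qubit q: (outcome, collapsed state)."""
    p1 = np.sum(np.abs(np.take(state, 1, axis=q)) ** 2)
    outcome = int(rng.random() < p1)
    idx = [slice(None)] * state.ndim
    idx[q] = 1 - outcome
    state = state.copy()
    state[tuple(idx)] = 0
    return outcome, state / np.linalg.norm(state)


def swap_entanglement(rng):
    """Return the BSM outcomes and the fidelity of the end-to-end pair with |Phi+>."""
    bell = np.eye(2, dtype=complex) / np.sqrt(2)
    # Axes: 0 = q3 (QPU #1), 1 = q0' and 2 = q3' (intermediate node), 3 = q0'' (QPU #3).
    state = np.einsum("ab,cd->abcd", bell, bell)
    # Bell State Measurement at the intermediate node: CNOT, Hadamard, two measurements.
    state = apply_cnot(state, 1, 2)
    state = apply_1q(state, H, 1)
    m1, state = measure(state, 1, rng)
    m2, state = measure(state, 2, rng)
    # Local processing at the remote end, driven by the two classical bits.
    if m2:
        state = apply_1q(state, X, 3)
    if m1:
        state = apply_1q(state, Z, 3)
    end_to_end = state[:, m1, m2, :]
    fidelity = abs(np.vdot(bell, end_to_end)) ** 2
    return (m1, m2), fidelity


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    for _ in range(4):
        print(swap_entanglement(rng))
```

Both elementary Bell states are consumed by the measurement, and the intermediate node ends up holding only classical outcomes, which is why swapping adds to the classical signaling load discussed in Section 3.1.4.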

3.3. More on communication protocols for DQC

From the above subsections, the key role played in DQC by maximally-entangled qubit pairs is evident. However, entanglement is affected by decoherence as well. This, in turn, affects the fidelity of the remote operations. In [70], an extensive review of entanglement purification protocols and quantum error correction is carried out. Indeed, entanglement purification protocols process entangled states in order to improve their fidelity. Among these protocols, recurrence entanglement purification protocols process 𝑁 disjoint entangled pairs through iterative purification steps, extracting 𝑀 < 𝑁 entangled pairs characterized by higher fidelity [70]. From a communication perspective, such protocols exhibit the advantage of being iterative, and they easily adapt to different input states. However, they introduce additional delay arising from the need to process multiple entangled pairs.

Additional communication protocols in DQC concern the entanglement generation and distribution functionalities. In [71], these functionalities are achieved by leveraging two main protocols, referred to as Midpoint Heralding Protocol (MHP) and Quantum Entanglement Generation Protocol (QEGP). Specifically, the MHP protocol acts in a time-slotted environment and is responsible for the generation of entangled pairs at a given time-slot through the activation of dedicated entanglement generation devices. Furthermore, this protocol is able to perform different operations on the generated entangled pairs, such as measurement or storing. Differently, the QEGP protocol is responsible for managing the entanglement requests through a scheduler. Besides, it is in charge of carefully addressing some key needs of the entity exploiting the entanglement, such as the quality of the generated states, the number of entangled pairs and some specifics of the operations to be performed, such as the measurement basis. The inter-operation of the two protocols is achieved through the definition of dedicated messages to be exchanged between the network entities.

Differently, the so-called entanglement routing protocols aim at the distribution of entangled pairs between network nodes interconnected by a multi-hop path of physical links. In this regard, several contributions have been provided in the literature, each aiming at engineering the swapping operations, described in the previous subsection, to be performed at the intermediate nodes in order to optimize some network metrics such as the entanglement throughput or the fidelity of the distributed pairs [72–74].

We refer the reader to Section 7 for insights on the challenges related to the design of network protocols and architecture for DQC.

4. Quantum algorithms

There exist several quantum algorithms known or expected to outperform classical algorithms for problems spanning different areas, including cryptography, search and optimization, simulation of quantum systems and learning [75]. Remarkably, most known quantum algorithms use a combination of algorithmic paradigms – actually,
sub-routines – specific to quantum computing [76]. These paradigms include the Quantum Fourier Transform (QFT) [77], the Grover Operator (GO) [78], the Harrow/Hassidim/Lloyd (HHL) method for linear systems [79], Variational Quantum Algorithms (VQA) [80], and direct Hamiltonian simulation (SIM). A prominent example is Shor’s algorithm for integer factorization [77], which is based on the QFT, illustrated by the quantum circuit in Fig. 9.

Fig. 9. Quantum Fourier Transform (QFT) circuit compilation for DQC.

For most practical applications, quantum algorithms require large quantum computing resources – in terms of qubit number – much larger than those available with current noisy intermediate-scale quantum (NISQ) processors. For example, the IBM Quantum Osprey device has 433 qubits, which is an impressive progress with respect to state-of-the-art quantum processors, but not yet sufficient, as an example, for running practical implementations of Shor’s algorithm.⁶

⁶ Factoring an 𝐿 = 2048-bit integer – for breaking current RSA implementations – requires about 3𝐿 = 6144 noise-free qubits [46]. It is worth noting that merely increasing the number of physical qubits is not sufficient, as some sort of quantum error correction [81] is also required to guarantee high-quality – namely, noise-free – computations.

In the following, we discuss three topics that concern quantum algorithms in a DQC context. First, we review the state of the art of description formats for quantum circuits. Second, we discuss the suitability of certain quantum algorithms to be partitioned. Third, we focus on execution management.

4.1. Description formats

As illustrated in the box titled Quantum Circuit Model, in the quantum circuit model of computation, quantum states are manipulated by means of quantum gates. Therefore, quantum algorithms correspond to sequential layers of quantum operators applied to the quantum states. Each layer comprises operators that can be executed simultaneously. Long ago, Knill [82] introduced a few conventions for thinking about and writing quantum pseudocode. Subsequently, several languages have been proposed to describe quantum algorithms in a user-friendly and high-level fashion.

Nowadays, the vast majority of Software Development Kits (SDKs) for quantum algorithm implementation and testing rely on the classical Python language. Major examples are Qiskit [83] by IBM, Cirq [84] by Google, and PennyLane [85] by Xanadu.

Python is very convenient for writing software that includes not only quantum circuit descriptions, but also instructions for executing the quantum programs on simulated or real quantum hardware. Sticking to circuit description, a few quantum assembly (QASM) languages have emerged, such as OpenQASM [86,87] and NetQASM [88]. These languages are characterized by a simple, hardware-agnostic but still precise syntax for describing atomic gate-level operations. To facilitate the compilation process, intermediate representations between QASM and hardware-specific control instructions have been designed, such as QSSA [89], QIRO [90] and InQuIR [91].

OpenQASM [86] was proposed as an imperative programming language for quantum circuits based on earlier QASM dialects. The current version 3 encompasses a broader set of circuits beyond the language of qubits and gates, focusing on real-time classical computations that must be performed within the coherence times of the qubits.

With respect to other QASM languages, NetQASM provides elements for remote entanglement generation. On the other hand, NetQASM contains no provision for classical communication with remote nodes. Synchronization between the NetQASM programs (through classical
send/recv primitives) of multiple nodes is the responsibility of the application programmer.

QSSA [89] is based on static single assignment (SSA), and it models quantum operations as being side-effect-free. The inputs and outputs of the operation are in one-to-one correspondence; qubits cannot be created or destroyed. As a result, QSSA supports a static analysis pass that verifies no-cloning at compile-time. The quantum circuit is fully encoded within the def-use chain of the intermediate representation, allowing the compiler developer to leverage existing optimization passes on SSA representations such as redundancy elimination and dead-code elimination. In practice, QSSA enables decades of research in compiler optimizations to be applied to quantum compilation.

QIRO [90] is an intermediate representation for quantum computing that directly exposes quantum and classical data dependencies for the purpose of optimization. QIRO consists of two dialects, one input dialect and one that is specifically tailored to enable quantum–classical co-optimization. The first dialect employs memory-semantics (quantum operations act on qubits via side-effects), while the second one uses value-semantics (operations consume and produce states) to integrate quantum dataflow in the intermediate representation’s SSA graph. This allows for a number of optimizations that leverage dataflow analysis.

Last but not least, InQuIR [91] is a DQC-specialized intermediate representation, allowing the use of classical and quantum communication instructions between different QPUs. InQuIR is provided with a formal semantics that has enough instructions to describe complicated behaviors of distributed quantum programs. In particular, it is able to cope with runtime errors such as qubit memory exhaustion and deadlock in intercommunication between QPUs.

4.2. Partitioning

A first issue that arises with quantum algorithms is whether a given algorithm – equivalently, a given quantum circuit – is natively suitable for distributed execution. More specifically, a perfectly distributable quantum algorithm is a quantum algorithm that can be split into autonomous parts that do not interact – or, at least, weakly interact – with each other. If this is the case, each part can be assigned to some quantum processor, and each processor can contribute autonomously to the overall computation without introducing communication overhead for interacting with other processors.

Unfortunately, many relevant quantum algorithms are characterized by intricate structures and multi-qubit gates, which move them away from perfect distributability. As an example, let us consider the QFT algorithm, whose circuit is given in Fig. 9(a), notably used as a sub-routine in many quantum algorithms – e.g., Shor’s algorithm and the quantum phase estimation algorithm – as mentioned above. From Fig. 9(a), it is easy to see that QFT requires each qubit to strongly interact with all the other qubits through controlled 𝚁_𝑚 gates [92]. Hence, QFT cannot be considered as perfectly distributable. A portion of the compiled QFT circuit, encompassing two QPUs, is illustrated in Fig. 9(b).

To distribute a monolithic quantum algorithm, a quantum compiler must be used to find the best breakdown, i.e., the one that minimizes the number of gates that are applied to qubits stored at different devices. Quantum compilation is reviewed in Section 5. Here we discuss some literature that addresses the partitioning of relevant quantum algorithms, using techniques that are tailored to the specific algorithms considered rather than general-purpose. These works may represent a good reference for a comparative evaluation of quantum compilers.

In [93], the authors present two distribution schemes for the quantum phase estimation algorithm, give the resource requirements for both, and show that using less noisy shared entangled states results in a higher overall fidelity. Introduced by Kitaev [94], the quantum phase estimation algorithm returns an approximation of an eigenvalue of a given unitary 𝑈 and a corresponding eigenvector. It has numerous applications, including Shor’s algorithm [77]. The solution proposed in [93] is based on the distributed version of the QFT circuit, obtained by means of non-local controlled 𝑈-gates.⁷

⁷ The non-local controlled 𝑈-gate generalizes the TeleGate operation discussed in Section 3.1 to an arbitrary unitary 𝑈 [95].

Another example of distributable quantum algorithm is the Variational Quantum Eigensolver (VQE), a VQA that can be used to estimate ground state energies of molecular chemical Hamiltonians. In [65], the authors provide a Local to Distributed Circuit algorithm that, given a circuit representation as a series of layers and a mapping of qubits, searches for any control gates where the control and target are physically separated between two QPUs. When found, the algorithm inserts, between the current layer and the next layer in the circuit, the necessary steps to perform the control gate in a nonlocal way.⁸ The size (maximum number of qubits) of the achievable Ansatz state for the VQE algorithm grows linearly with the number of QPUs, with slope linearly increasing with the number of qubits per QPU. The depth of the resulting quantum circuit is 𝛺(𝑛), meaning it has a tight upper and lower bound proportional to the number 𝑛 of qubits. An example of a portion of a distributed three-qubit VQA over two QPUs is depicted in Fig. 10.

⁸ By using the cat-entangling method by Yimsiriwattana et al. [96], which is substantially equivalent to the TeleGate introduced in Section 3.1.

In [97], the authors present a distributed adder and a distributed distance-based classification algorithm. Both applications are framed in a way where a quantum server and 𝐾 other quantum nodes interact, with specific behaviors. In particular, the server is responsible for orchestrating the computation by means of non-local CNOT gates, while the 𝐾 parties provide inputs. It is possible to reframe these applications, such that the proposed quantum circuits are considered as monolithic and subsequently split in 𝐾 + 1 parts to be submitted for execution to a quantum network.

4.3. Execution management

Another relevant aspect is the execution management of distributed quantum computations. In general, given a collection 𝒫 of quantum circuit instances to be executed, this collection should be partitioned into non-overlapping subsets 𝒫_𝑖, such that 𝒫 = ∪_𝑖 𝒫_𝑖. One after the other, each subset will be assigned to the available QPUs. In other words, for each execution round 𝑖, there exists a schedule 𝑆(𝑖) that maps some quantum circuit instances to the quantum network. If DQC is supported, some quantum circuit instances may be split into sub-circuit instances, each one to be assigned to a different QPU, as illustrated in Fig. 11. A QPU scheduling algorithm that partially addresses this service was proposed in [11]. Such an algorithm is based on a greedy approach, trying to fill all available QPUs while minimizing the number of distributed quantum circuit instances. Here the partitioning of quantum circuit instances is arbitrary, not taking into account the features of the programs. Recalling Section 4.1, we stress that partitioning should be an orthogonal service with respect to QPU scheduling.

It is reasonable to assume that the QPU scheduling plane should be separated from the networking plane, because of the separation of concerns principle. This means that entanglement routing must be provided by the network infrastructure to support the execution of the DQC jobs, whose allocation to the QPUs is decided previously. We demand that any subset of the available QPUs can be the target of any quantum computation, provided that the total number of physical qubits fits the circuit width. This means that the underlying network should allow entangled quantum states to be created across any two QPUs. Technical details on entanglement distribution were discussed in Section 3. Here we recall a recent work [98], which investigates the requirements and objectives of DQC from the perspective of quantum network provisioning. In particular, the authors elaborate on two different classes of traffic, namely constant-rate flows and DQC applications. More recently, the
same authors investigated the issue of service differentiation in the DQC environment [99]. They defined the problem of how to select which computation nodes should participate in each pool, so as to achieve a fair share of the quantum network resources available.

Fig. 10. Variational Quantum Algorithm (VQA) circuit compilation for DQC.

Fig. 11. Execution of multiple quantum circuit instances with 𝑘 QPUs. For each execution round 𝑖, a schedule 𝑆(𝑖) maps some quantum circuit instances to the quantum network, with each QPU receiving a quantum circuit 𝑃_𝑖𝑗 that is either a monolithic one or a sub-circuit of a monolithic one. The classical outputs are accumulated into an output vector 𝑂.

Recently, two frameworks with similar names have been proposed almost at the same time, namely Quantum Network Utility Maximization (QNUM) [100] and Quantum Network Utility (𝑈_𝑄𝑁) [101]. While QNUM is specifically tailored to the evaluation of entanglement routing schemes in quantum networks (see Section 3 for details about entanglement), 𝑈_𝑄𝑁 is more abstract, aiming to capture the social and economic value of quantum networks, for a variety of applications (from secure communications to distributed sensing). Incidentally, in [101] the example of DQC is studied in detail, through the lens of 𝑈_𝑄𝑁. More specifically, a quantum network utility metric is presented, which applies the Quantum Volume⁹ proposed in [102] to the 𝑈_𝑄𝑁 framework. Such a metric quantifies the value derived from performing QC tasks, and it is viewed as a “quantum volume throughput”. It differs from the quantum volume in two ways: (i) it explicitly considers the rate at which non-local operations can be performed, and (ii) it accounts for the utility derived simultaneously from tasks executed on different parts of the network.

⁹ Quantum Volume (QV) is a single-number metric that can be measured using a concrete protocol on near-term quantum computers of modest size. The QV method quantifies the largest random circuit of equal width and depth that the quantum processor successfully executes.

In a recent work [16], the authors observed that DQC execution management deals with the parallel job scheduling problem, a widely investigated optimization problem in which a set of jobs of varying processing times needs to be scheduled on multiple machines while trying to minimize the makespan, i.e., the length of the schedule. Each job has a processing time (in the DQC domain, it can be approximated with the number of layers of computations of the distributed circuit), and requires the simultaneous use of multiple machines. In general, the problem is NP-hard. In [16], two novel metrics are introduced for the purpose of evaluating QPU utilization and quantum network utilization with different parallel job scheduling strategies. Using two well-known parallel job scheduling algorithms – namely, FIFO and List-Scheduling – it is demonstrated that high QPU utilization may also involve high quantum network utilization. In a classical computing setting, optimal makespan and full resource utilization would be highly appreciated. In DQC, the story is quite different. Indeed, makespan optimality needs highly effective and efficient entanglement routing between QPUs, in order to guarantee timely execution of non-local gates that are all concentrated in a short time frame. The conclusion is that searching for a reasonable tradeoff between QPU utilization and quantum network utilization is crucial.

5. Quantum compiling

For quantum devices characterized by constrained connectivity among qubits, the monolithic execution of a quantum algorithm on a single quantum processor requires a circuit pre-processing known as quantum compiling [13,15,103–105]. Specifically, compiling a quantum circuit is a two-step¹⁰ process where:

¹⁰ With the two steps being inter-dependent, affecting each other.

(i) each logical qubit of the quantum circuit must be mapped onto one (or more, when adopting fault-tolerant techniques [106]) physical qubit of the quantum processor, and
(ii) each two-qubit gate – for instance, a CNOT – between physical qubits that are non-adjacent within the coupling map must be mapped into a computationally equivalent sequence of gates between adjacent physical qubits, as exemplified in Fig. 12.

Fig. 12. Pictorial representation of quantum compiling. The circuit on the left is translated into the circuit on the right, in order to cope with the coupling map provided in Fig. 3. Within the rightmost figure, the $q_i$ in purple font denote the physical qubits assigned to the logical qubits $q_j$ in black font. The SWAP gate – represented by two × symbols interconnected by a vertical line – introduced between logical qubits $q_1$ and $q_2$ swaps their quantum states, so that the last CNOT gate can be applied between two neighboring physical qubits.

The overall process must be optimized to account for the key performance metrics affecting quantum computation [107–109]. Typically, this consists in minimizing the depth of the compiled circuit, namely, the equivalent quantum circuit satisfying all the constraints imposed by the quantum processor coupling map.

An example of quantum compilation is provided in Fig. 12, where the original quantum circuit is translated into the compiled one to account for the coupling characteristics of the quantum processor shown in Fig. 3. Indeed, as long as the hardware provides a universal set of operations, there exists a feasible transformation.

Compilers are well-established in NISQ architectures, because of their role as intermediary between the user and the hardware. Specifically, in designing a quantum algorithm using the quantum circuit formalism, the designer is generally focused on expressing the computation required by the algorithm with a circuit that minimizes the number of utilized qubits and gates, regardless of the particulars of the quantum hardware that will execute the circuit. This abstract circuit is then mapped to a circuit to be executed on a specific quantum hardware by means of a suitable compiler. Introducing such an abstract circuit has two main advantages: (i) the user can focus on the logic of the circuit, namely, on the essence of the quantum algorithm, without caring too much about the hardware constraints, and (ii) the designed quantum circuit is portable, in theory, to any quantum back-end.

Intuitively, a circuit transformation may introduce some overhead, in terms of number of operations and noise. In DQC architectures, there is also a non-negligible communication cost, as discussed in Section 3. Therefore, the compiler faces an optimization problem, i.e., finding a feasible transformation while minimizing the overhead. In general, this problem is known to be NP-hard [103,110], even for the case of a single processor.

A fundamental issue in quantum compiling is related to qubit connectivity. From the perspective of the quantum algorithm designer, any qubit is assumed to be directly connected with any other qubit, i.e., any two-qubit gate can be placed across any qubit pair. However, even on a single quantum processor as introduced in the Intra-chip connectivity box, the actual connectivity degree is usually low, to mitigate the noise caused by cross-talk phenomena [111]. Qubit routing refers to the task of modifying quantum circuits so that they satisfy the connectivity constraints of a target quantum computer. This involves inserting SWAP gates into the circuit so that the logical gates only ever occur between adjacent physical qubits. Of course, the number of SWAP gates should be minimized, in order to keep the circuit depth reasonably small. The problem gets harder when considering distributed quantum processors, where the connectivity degree of the physical qubits can be even lower.

For DQC to be effective and efficient, the quantum compiler must perform some preliminary ebit optimization (such as the one illustrated in Fig. 13), then find the best split for the abstract circuit, i.e., the split that minimizes the overall communication cost required to execute the distributed circuit. At the same time, the quantum compiler must find the best local transformation for each piece of computation.

From the above, it should be clear that designing an efficient compiler is a tough task. Because of this, a plethora of proposals to tackle the problem emerges from the literature. In future work, some of them may be combined into more sophisticated compilers. This has already happened for local computing. For example, the quantum compiler from the IBM Q framework [112] has several layers of optimization, each tackling the problem from a different perspective.

Most quantum compilers for DQC are characterized by two fundamental steps, namely qubit assignment and non-local gate handling. In the following, we present these two compilation steps, with reference to the most relevant literature. In Table 2, we compare some prominent DQC-oriented quantum compiling strategies. To this purpose, we consider the programming language, the supported network topologies, the qubit assignment strategy, the non-local gate handling strategy, and the availability of an open source release of the software.

In the remainder of the section, we first present some of the most representative strategies for qubit assignment and non-local gate handling. Then, we discuss some open issues.

5.1. Hardware matching

A fundamental step in every quantum compiler is translating general-purpose quantum gate instructions into instructions specific to the underlying quantum hardware. This translation can go down to the level of analog signals for the control hardware [125–127] or remain at the same abstraction level of the input instructions, albeit using the specific gate set supported by the target quantum computer. In this section, we focus on the latter compilation case, which is commonly denoted as transpiling.

Depending on the technology used to manufacture a quantum computer, the set of natively supported gates that can be executed varies. Even inside the same "family" of quantum devices, the native gate set may vary. This is the case, for example, of superconducting qubits. Among IBM quantum computers, there are three types of superconducting devices, each supporting a different flavor of single-qubit rotations and two-qubit gates. Furthermore, Google's superconducting devices have a different gate set. Regarding ion trap devices, the native two-qubit gate is the RXX gate, meaning that a CNOT must be decomposed into multiple single-qubit rotations and an RXX gate [128]. Finally, in NV centers [88], the only two-qubit gate available is the CNOT, and it can only be applied between the central electron and the surrounding carbon atoms.

To support different flavors of native gate sets, the compiler usually employs a collection of decomposition rules to translate each non-native gate into a sequence of native gates. This procedure usually produces a more complex circuit than the input one. Therefore, other circuit optimization techniques may be adopted to reduce the number of native gates used and the depth of the circuit.

5.2. Qubit assignment

An abstract circuit is composed of logical qubits, while a quantum processor is equipped with a register of physical qubits. An assignment, in its most basic form, is a one-to-one mapping between logical and physical qubits.¹¹ Whether it is better to tackle it dynamically –

¹¹ One can also consider fault-tolerant mappings, where more than one physical qubit encodes a single logical qubit. However, we consider this as side work, outside the scope of this survey for the sake of simplicity.
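As an illustration of the qubit routing task described in this section, the following minimal sketch inserts SWAP gates along a shortest path of the coupling map until the two qubits of each CNOT are adjacent. It is a naive greedy strategy written for illustration only, not the method of any compiler cited here; the function names (`shortest_path`, `route`), the tuple-based gate encoding and the assumption of a connected coupling map are all hypothetical choices.

```python
from collections import deque


def shortest_path(coupling, src, dst):
    """Breadth-first search on the coupling map (assumed connected)."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in coupling[node]:
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]


def route(cnots, coupling, layout):
    """Greedy routing: move the control qubit next to the target via SWAPs.

    cnots    -- list of (control, target) pairs on logical qubits
    coupling -- dict: physical qubit -> list of adjacent physical qubits
    layout   -- dict: logical qubit -> physical qubit (initial assignment)
    """
    layout = dict(layout)
    occupant = {phys: log for log, phys in layout.items()}
    compiled = []
    for ctrl, tgt in cnots:
        path = shortest_path(coupling, layout[ctrl], layout[tgt])
        # Swap along the path, stopping one hop before the target.
        for a, b in zip(path[:-2], path[1:-1]):
            compiled.append(("swap", a, b))
            la, lb = occupant.get(a), occupant.get(b)
            occupant[a], occupant[b] = lb, la
            if la is not None:
                layout[la] = b
            if lb is not None:
                layout[lb] = a
        compiled.append(("cx", layout[ctrl], layout[tgt]))
    return compiled, layout


if __name__ == "__main__":
    line = {0: [1], 1: [0, 2], 2: [1]}  # three physical qubits in a line
    gates, final = route([(0, 1), (0, 2)], line, {0: 0, 1: 1, 2: 2})
    print(gates, final)
```

Because each SWAP permanently updates the layout, later gates see the new assignment; practical routers additionally look ahead in the circuit to reduce the total number of SWAP gates, which this sketch does not attempt.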


Fig. 13. Example of ebit optimization: the left part of the equivalence can be optimized to the right one, which reduces the number of non-local gates.

Table 2
Comparison of DQC-oriented quantum compiling strategies. Some strategies find the best partition of the input monolithic quantum circuit in a completely network-agnostic fashion. Some strategies are purely theoretical, not supported by a software implementation.
Compiler | Language | Network Topologies | Qubit Assignment | Non-local Gate Handling | Open Source
[113] | Haskell | hypergraph | minimum k-cut | TeleGate and TeleData | YES
[114] | unknown | hypergraph | minimum k-cut | TeleGate | NO
[115] | unknown | any | Tabu search | TeleGate and TeleData | NO
[116] | MATLAB | / | heuristic | TeleData | NO
[117] | MATLAB | / | dynamic programming | TeleData | NO
[15] | Python | any | minimum k-cut | TeleGate and TeleData | NO
[118] | C++ and CPLEX | n.a. | minimum k-cut | TeleGate and TeleData | NO
[13] | Python | LNN | sorting | TeleGate and TeleData | NO
[119] | pseudo-code | any | integer linear programming | TeleGate | /
[120] | MATLAB | n.a. | genetic alg. | TeleData | NO
[121] | / | any | sorting | TeleData | /
[122] | / | hypercube | sorting | TeleData | /
[123] | Python | any | minimum k-cut | TeleGate | YES
[124] | Python | any | reinforcement learning | TeleGate and TeleData | YES
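Several strategies in Table 2 cast qubit assignment as a minimum k-cut on a graph whose vertices are logical qubits and whose edge weights count the two-qubit gates between them. The following minimal sketch illustrates the idea for two QPUs of equal size by exhaustive search, which is only viable for toy circuits; the cited works rely on heuristics or dedicated partitioners instead. The function names and the gate encoding as qubit pairs are assumptions made for illustration, and an even number of qubits is assumed.

```python
from collections import Counter
from itertools import combinations


def interaction_graph(two_qubit_gates):
    """Edge weights: number of two-qubit gates acting on each qubit pair."""
    return Counter(tuple(sorted(gate)) for gate in two_qubit_gates)


def cut_weight(weights, part_a):
    """Number of gates that become non-local for a given bipartition."""
    return sum(w for (u, v), w in weights.items()
               if (u in part_a) != (v in part_a))


def best_bipartition(n_qubits, two_qubit_gates):
    """Exhaustively search balanced bipartitions (n_qubits assumed even)."""
    weights = interaction_graph(two_qubit_gates)
    best = None
    # Qubit 0 is pinned to the first QPU to skip mirrored partitions.
    for others in combinations(range(1, n_qubits), n_qubits // 2 - 1):
        part_a = {0, *others}
        cost = cut_weight(weights, part_a)
        if best is None or cost < best[0]:
            best = (cost, part_a)
    cost, part_a = best
    part_b = set(range(n_qubits)) - part_a
    return part_a, part_b, cost


if __name__ == "__main__":
    circuit = [(0, 1), (0, 1), (2, 3), (1, 2)]  # CNOTs as qubit pairs
    print(best_bipartition(4, circuit))
```

In this toy circuit, only the gate between qubits 1 and 2 crosses the cut of the best bipartition, so a single non-local gate remains to be handled through TeleGate or TeleData.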

Fig. 14. Toy example of qubit assignment. Once the logical qubits composing the quantum circuit have been assigned to the different QPUs, the CNOTs between remote qubits – highlighted in violet – become non-local.

changing the assignment while computing – or statically – defining the assignment at the beginning and keeping it for the whole execution of the computation – is an open problem, which also depends on whether the partition between communication qubits and computing qubits is static or dynamic.

In DQC, qubit assignment is a general-purpose approach to the partitioning problem, introduced in Section 4.2. Specifically, for a given set of logical qubits, we need to choose a partition that maps sub-sets of logical qubits to processors, while minimizing the number of required interactions among different sub-sets, as shown in Fig. 14. Several authors investigate this research direction [13,15,113–115]. The reader will find in these works different proposals to address the qubit assignment problem. Not all the papers agree on the minimum assumptions for the technology. Specifically, as described in Section 5, we are at a stage where one needs to make predictions on the most likely DQC architecture that will run in the near future. If one assumes any connectivity, the resulting model is general-purpose, but it is also hard to tackle. Restricting the connectivity to one that satisfies some properties makes the model less general, but a good set of assumptions in this direction may shape future implementations as well. Currently, the preferred line is to keep connectivity general [115].

The authors in [113] propose to encode a logical circuit as a hypergraph. A hyperedge represents one ebit – i.e., one EPR pair shared between QPUs – which allows for a TeleGate to be performed. Qubit assignment works by minimizing the number of cuts, as each cut corresponds to an ebit. In [114], the authors present a two-step solution, where the first step is qubit assignment. Circuits are represented as edge-weighted graphs with qubits as vertices. The edge weights correspond to an estimation of the number of cat-entanglements⁸. The problem is then solved as a minimum k-cut, where partitions have roughly the same size. In [115], the same authors extend their approach to the case of an arbitrary-topology network of heterogeneous quantum computers by means of a Tabu search algorithm. In [116], the circuit becomes an undirected graph with qubits as vertices, while edge weights correspond to the number of two-qubit gates between them. In [117], the authors represent circuits as bipartite graphs with two sets of vertices – one set for the qubits and one for the gates – and edges to encode dependencies of qubits and gates. Then, for the qubit assignment problem, they propose a partitioning algorithm via dynamic programming to minimize the number of TeleData operations. In [15], the authors devise a two-step process for the qubit assignment. First, the circuit is translated into a weighted graph and partitioned, using an efficient k-way graph partitioning algorithm, into $k$ partitions of roughly equal size (where $k$ equals the number of available QPUs). Finally, the authors employ a heuristic algorithm to improve over this initial solution, as equal partitions may not be optimal.

When qubit assignment is dynamic, new challenges – as well as new possibilities – arise. In [118] the authors propose a minimum k-cut partitioning algorithm formulated as an ILP optimization problem, to minimize the number of remote interactions. They use a moving window and apply the partitioning algorithm to small sections of the circuit, thus the partition may change with the moving window by means of TeleData operations. In [13], the authors consider the worst-case scenario of QPUs interconnected through an LNN topology.¹² Rather than focusing on the number of remote interactions, they design a sorting algorithm to reduce the depth overhead induced by such time-consuming operations. The authors show that the overhead is upper-bounded by a factor that grows linearly with the number of

¹² The Linear Nearest Neighbor (LNN) topology [129] consists of processors arranged in a single line – namely, in a 1-dimensional lattice – where each processor is interconnected with two neighbors. In the worst-case scenario – namely, the most challenging one – each QPU is equipped with a single computational qubit, and only neighboring qubits can interact with each other.


qubits. In [119], the authors model the compilation problem with an Integer Linear Programming formulation. The formulation is inspired by the vast theory on dynamic network problems. The authors managed to define the problem as a special case of quickest multi-commodity flow. Such a result allows optimization to be performed by means of techniques coming from the literature, such as a time-expanded representation of the distributed architecture.

5.3. Remote operations optimization

As described in Section 3, assumptions on the architectures do not only concern connectivity. Predicting the best kind of remote interactions is of critical importance as well. In this sense, the general agreement is that the generation and distribution of entangled states is a fundamental resource to be used sparingly. Indeed, a common goal in the literature is to minimize the number of consumed ebits, as it is the main bottleneck to distributed quantum computation. To this aim, the qubit assignment discussed above represents a starting point for further optimization steps, which now concern circuit manipulation.

As described in Section 3, there are two main approaches for implementing non-local gates, namely TeleData and TeleGate.

The TeleData approach is considered, for example, in [116,117,120–122]. In [121], the authors prove that the quantum circuit model, the quantum parallel RAM model, and the DQC model are equivalent up to polylogarithmic depth overhead. Other than this major result, they provide an algorithm for emulating circuits on any network graph. In [122], the authors focus on $n$-qubit cyclic butterfly networks (a special case of hypercubic network) and prove that there is a sequence of local gates with depth $6 \log n$ such that the qubit at node $a$ is sent to node $\pi(a)$ for all $a = 1, \ldots, n$ and any permutation $\pi : [1, n] \to [1, n]$. In other words, the butterfly network can implement any quantum algorithm with an overhead of $6 \log n$. Such a network topology is suitable for multi-chip quantum devices or small controlled networks. In medium-scale or global networks, it is hard to implement such a constrained architecture. In [116], the authors propose a method to minimize the number of quantum teleportations between DQC partitions. The main idea is to turn the monolithic quantum circuit into an undirected weighted graph, where the weight of each edge represents the number of gates involving a specific pair of qubits for execution. Then, the graph is partitioned using the Kernighan–Lin (K-L) algorithm for VLSI design [130], so that the number of edges between partitions is minimized. Finally, each graph partition is converted to a quantum circuit. In [117], the authors propose an algorithm for minimizing teleportations consisting of two steps: first, the quantum circuit is converted into a bipartite graph model, and then a dynamic programming (DP) approach is used to partition the model into low-capacity quantum circuits. Finally, in [120], the authors propose a heuristic approach to replace the equivalent circuits in the initial quantum circuit. Then, they use a genetic algorithm to partition the placement of qubits so that the number of teleportations can be optimized for the communications of a DQC.

Conversely, the TeleGate direction is pursued, for example, in [15,113,114,119,123]. In [113], the authors use cat-entanglement⁸ to implement non-local quantum gates. The chosen gate set contains every one-qubit gate and a single two-qubit gate, namely the CZ gate (i.e., the controlled version of the Z gate). The authors consider no restriction on the ebit connectivity between QPUs. Then, they reduce the problem of distributing a circuit across multiple QPUs to hypergraph partitioning. The proposed approach is evaluated against five quantum circuits, including QFT. The proposed solution has some drawbacks, in particular that there is no way to customize the number of communication qubits of each QPU. As previously mentioned, in [114], a two-step quantum compiling approach is introduced. The first step is qubit assignment, while the second step is finding the smallest set of cat-entanglement operations that will enable the execution of all TeleGates. The authors state that, in a special setting, this problem can be reduced to a vertex-cover problem, allowing for a polynomial-time optimal solution based on integer linear programming. They also provide an $O(\log n)$-approximate solution, where $n$ is the total number of global gates, for a generalized setting by means of a greedy search algorithm. The aforementioned work in [119] also adopts the TeleGate approach. The authors of [15] use a heuristic approach where the compiler can be set to use only TeleGates or both TeleGates and TeleData operations. The results show that the best choice is highly dependent on the type of circuit. The authors in [123] focus on the concept of gate embedding, which makes it possible to use the same EPR pair for multiple TeleGates. The authors combine this embedding technique with different qubit assignment and non-local gate handling techniques from the literature to provide a flexible compilation workflow for heterogeneous quantum networks. Finally, in [124] the authors propose an MDP formulation that models non-local gate optimization and local compilation on each QPU. Given the prohibitive complexity of such a model, they propose a relaxation of the MDP formulation and tackle the problem with a reinforcement learning approach. The authors show that the RL approach performs well with small random circuits and could be scaled up to bigger circuits by sacrificing some degree of optimality for the solution. Besides [15], this is the only other work considering local compilation.

6. Simulation tools

To support the research community in the design and evaluation of quantum computing and quantum network technologies – including hardware, protocols and applications – many simulation tools have been developed recently.

Simulations are very important for several reasons. First of all, they allow for defining hardware requirements using a top-down approach, i.e., starting from applications and protocols. In this way, hardware design is driven by high-level KPIs (key performance indicators), rather than proceeding by trial and error. Another advantage of simulations is related to network sizing. Given the number of potential users and the number of available quantum processors, simulation allows for devising and evaluating different network topologies and entanglement routing schemes, which results in saving time and money. Regarding DQC, simulation plays a crucial role for establishing the correctness of the compiled distributed quantum programs, and evaluating the quality of their execution against different hardware platforms, network configurations and scheduling algorithms.

In Table 3, we compare some prominent simulation tools that, in our view, can be used for designing and evaluating DQC systems. We propose to classify each tool as belonging to one of three possible classes: (i) hardware-oriented (HW), (ii) protocol-oriented (PR), and (iii) application-oriented (AP). In the remainder of the section, we present each class with some of the most representative simulation tools.

6.1. Hardware-oriented

We denote as HW simulation tools those that allow the user to model the physical entities with the desired degree of detail, including noise models. Prominent examples are SQUANCH [131] and NetSquid [132], discussed in the following. Regarding DQC, we note that HW simulation tools are useful for evaluating the impact of different hardware technologies (including noise models) on the quality of the distributed program execution.

The Simulator for Quantum Networks and Channels (SQUANCH) [131] is an open-source Python framework for creating parallelized simulations of distributed quantum information processing. Although the framework includes many features of a general-purpose quantum computing simulator, it is optimized specifically for simulating quantum networks. It includes functionality to allow users to design complex multi-party quantum networks, extensible classes for modeling noisy


Table 3
Comparison of simulation tools that can be used for designing and evaluating DQC systems.
Simulation Tool | Language | Multiprocessing | Multithreading | Noise Models | Open Source | Class
SQUANCH [131] | Python | NO | NO | YES | YES | HW
NetSquid [132] | Python | NO | NO | YES | NO | HW
SimulaQron [133] | Python | YES | NO | NO | YES | PR
SeQUeNCe [134] | C++/Python | YES | NO | YES | YES | PR
QuiSP [135] | C++ | NO | NO | YES | YES | PR
QuNetSim [136] | Python | NO | YES | NO | YES | PR
NetQASM SDK [88] | C++/Python | NO | YES | YES | YES | AP
QNE-ADK [137] | C++/Python | NO | NO | YES | NO | AP
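To give a concrete flavor of the noise-model support compared in Table 3, the following minimal sketch represents a two-qubit entangled state as a density matrix and applies a single-qubit depolarizing channel to one half of the pair. It is written directly in NumPy and does not use the API of any simulator listed above; the function names, the choice of the depolarizing channel and the error probability are assumptions made for illustration.

```python
import numpy as np

# Single-qubit identity and Pauli operators.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def bell_state():
    """Density matrix of the Bell state (|00> + |11>)/sqrt(2)."""
    psi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
    return np.outer(psi, psi.conj())


def depolarize_second_qubit(rho, p):
    """Apply X, Y or Z to the second qubit, each with probability p/3."""
    noisy = (1 - p) * rho
    for pauli in (X, Y, Z):
        op = np.kron(I2, pauli)
        noisy = noisy + (p / 3) * (op @ rho @ op.conj().T)
    return noisy


def fidelity_with_bell(rho):
    """Overlap <Phi+| rho |Phi+> between rho and the ideal Bell state."""
    return float(np.real(np.trace(bell_state() @ rho)))


if __name__ == "__main__":
    noisy_pair = depolarize_second_qubit(bell_state(), p=0.3)
    print(fidelity_with_bell(noisy_pair))
```

Since each Pauli error on one qubit maps the Bell state onto an orthogonal Bell state, the fidelity of the noisy pair in this sketch equals $1 - p$, which is the kind of figure a hardware-oriented simulation would feed into the evaluation of a distributed program.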

quantum channels, and a multiprocessed NumPy backend for performant simulations. The core modules are QSystem, representing a multi-body quantum system as a density matrix in the computational basis, and QStream, which is an iterable ensemble of separable $N$-qubit QSystems optimized for cache locality. By default, the QStream state is stored in shared memory as a C-type array of doubles, which is type-cast as a 3D array of np.complex64 values. During simulations, Agents run in parallel from separate processes, synchronizing clocks and passing information between each other through Channels. There is no explicit concurrency safety when a QSystem is modified by multiple agents, as sending and receiving Qubits are blocking operations that allow for naturally safe parallelism. However, the scalability of this simulation tool is hindered by the lack of support for distributed multiprocessing, as all the processes must run on the same machine. The source code has not been maintained since 2018.

NetSquid [132] is one of the most advanced platforms for simulating quantum networking and modular computing systems subject to physical non-idealities. It ranges from the physical layer and its control plane up to the application level. This is achieved by integrating several key technologies: a discrete-event simulation engine, a specialized quantum computing library, a modular framework for modeling quantum hardware devices, and an asynchronous programming framework for describing quantum protocols. NetSquid has been used for different purposes, such as the evaluation of a benchmarking procedure for quantum protocols [138], the evaluation of end-to-end entanglement generation strategies in terms of capacity bounds and impact on Quantum Key Distribution (QKD) [139,140], and the performance evaluation of request scheduling algorithms for quantum networks [141].

6.2. Protocol-oriented

In the proposed classification, PR simulation tools are mostly devoted to the design and evaluation of general-purpose quantum protocols – such as quantum state teleportation, quantum leader election, etc. [142] – with the possibility to model hardware-agnostic networked quantum processors, with very limited (if not missing) support for noise modeling. Relevant examples are SimulaQron [133], SeQUeNCe [134], QuiSP [135] and QuNetSim [136]. Regarding DQC, PR simulation tools are useful for evaluating the impact of different compiling and execution management strategies on the quality of the distributed program execution, in (almost) ideal conditions.

SimulaQron [133] is a tool for developing distributed software that runs on real or simulated classical and quantum end-nodes, connected by classical and quantum links. SimulaQron spawns three stacked processes per network node: the lowest one for wrapping a simulated quantum register, based on a hardware-specific third-party simulator; the intermediate process exposing simulated qubits that map one-to-one to those of the quantum register; the upper process providing virtual qubits that are manipulated within a platform-independent application. For example, if two virtual qubits belonging to different processes, running on physically separated servers, are manipulated in order to share an entangled state (say, a Bell state), the corresponding simulated qubits (and quantum register ones) are both stored in the memory of one server, in order to make it possible to simulate measurements in a consistent fashion. This process-oriented approach makes SimulaQron quite scalable and able to leverage multicore server architectures in order to speed up the execution of the simulations. However, SimulaQron does not come with noise model support, thus preventing the simulation of quantum protocols over non-ideal networks.

SeQUeNCe [134] is an open-source discrete-event quantum network simulator, whose latest release fully supports parallel simulation. The authors designed and developed a quantum state manager (QSM) that maintains shared quantum information distributed across multiple processes, and also optimized their parallel code by minimizing the overhead of the QSM and by decreasing the amount of synchronization among processes.

QuiSP [135] is an event-driven Quantum Internet simulation package. QuiSP is built on top of the OMNeT++ discrete event simulation framework. Compared to the simulators discussed so far, many of which focus on physically realistic simulation of a single small network, QuiSP is oriented to protocol design for complex, heterogeneous networks at large scale while keeping the physical layer as realistic as possible. Emphasis has been placed on realistic noise models. The declared long-term goal for the simulator is to be able to handle an internetwork with 100 networks of 100 nodes each. To simulate quantum networks at the cost of only a few classical bits per qubit, QuiSP works in the error basis, i.e., tracking only errors, not states. The premise is that the desired quantum state is known and only deviations from this ideal state must be tracked. This is a novel approach for simulating quantum networks, adapted from quantum error correction [143]. The performance of QuISP was investigated in terms of events processed per second and the duration of CPU time taken to generate one end-to-end Bell pair, using the Docker environment that QuISP provides. It was shown in [135] that the average CPU time (in seconds) per end-to-end Bell pair generated grows no worse than polynomially in the number of quantum repeaters. Increasing the number of repeaters results in longer simulation times, as expected. It also emerged that QuISP might have some kind of unintended overhead which scales linearly with the number of buffer qubits, which the authors expect to fix in a near-term release [135].

QuNetSim [136] implements a layered model of network component objects inspired by the OSI model. In particular, application, transport, and network layers are considered. QuNetSim does not explicitly incorporate features of the link and physical layers. Indeed, QuNetSim relies on open-source qubit simulators that are used to simulate the physical qubits in the network, namely SimulaQron [133], ProjectQ [144] and EQSN [145] (the latter being the default backend, as it was developed by the QuNetSim team). In QuNetSim, network nodes can run both classical and quantum applications. The transport layer component prepares classical packets, encodes qubits for superdense message transmission, handles the generation of the two correction bits for quantum state teleportation, etc. The network layer component can route classical and quantum information using two internal network graphs and two different routing algorithms. The network component objects are implemented using threading and observing queues. Extensive use of threading allows each task to wait


without blocking the main program thread, which simulates the behavior of sending information and waiting for an acknowledgment, or expecting information to arrive for some period of time from another host. QuNetSim works well for small-scale simulations using five to ten hosts that are separated by a small number of hops, while it tends to reach its limits when many entangled qubits are being generated across the network with many parallel operations.

6.3. Application-oriented

The third class is devoted to AP simulation tools, which are tailored to the design and implementation of quantum network applications. Usually, these tools rely on simulated backends offered by other packages that are not directly accessible to the user – for example, NetQASM SDK [88] relying on NetSquid [132]. Regarding DQC, AP simulation tools are useful for quickly assessing the quality of quantum circuit splits produced by quantum compilers. The execution management scheme (i.e., job scheduling, entanglement routing, etc.) is hidden from the user, who is at most allowed to specify the network topology (from a short list of preconfigured networks) and the values of a few parameters characterizing the hardware of the quantum processors.

The process of setting up a simulation requires strong expertise in the simulator itself, thus being inconvenient for those who are only interested in quantum protocol evaluation or in the design of supporting tools such as quantum compilers. Recently, Ferrari et al. [146] presented a software tool, denoted as DQC Executor, that accepts as input the description of the network and the code of the algorithm, and then executes the simulation by automatically constructing the network topology and mapping the computation onto it, in a framework-agnostic way and transparently to the user. The tool is in its early stages and currently supports automatic deployment of distributed quantum algorithms to the NetSquid [132] simulator. The description of the network is provided by the user in a specific YAML format. The distributed algorithm, instead, is defined with the OpenQASM [147] language.

NetQASM SDK [88] is a high-level software development kit, in Python, whose purpose is to make it easier to write quantum network applications, to simulate them through NetSquid [132] or SimulaQron [133], and (expected in the near future) to execute them on real hardware. Indeed, the quantum programs developed with NetQASM SDK are translated into low-level programs based on the NetQASM language.

The Quantum Network Explorer Application Development Kit (QNE-ADK) [137] allows the user to create applications and experiments and

7.1. Quantum networking

The open issues and research directions for the quantum networking pillar are first discussed around the three DQC archetypes presented in Section 2, as depicted in Fig. 15. Then, issues and directions crossing a single archetype are gathered at the end of the subsection.

7.1.1. Multi-core quantum architectures

In this type of DQC architecture, the physical distance between remote qubits is very short. Hence, it is reasonable to assume that the underlying communication infrastructure exploits short-range communication links, such as microwave links in the case of superconducting computing technology. The network topology is likely static, so that only simple quantum network functionalities are required. Quantum decoherence must be carefully accounted for, so that the decoherence time can be used as the overall key metric. Local operations between qubits within a single processor must be complemented by remote operations between qubits placed at different processors. The trade-off between qubits devoted to computation and entangled qubits devoted to communication represents a fundamental issue with no counterpart in classical distributed computing. The very challenging task of designing distributed quantum algorithms must explicitly take such a trade-off – as well as the delay induced by remote operations – into consideration.

7.1.2. Multi-computer quantum architectures

In this type of DQC architecture, as said, the computation is performed collectively by multiple quantum computers located within the same farm. Hence, entanglement distribution still benefits from a tightly controlled environment – reasonable to assume available within a single quantum farm – and the relatively short distances. For the sake of exemplification, the communication infrastructure can still be composed of cold microwave links [148] for superconducting-based qubit technology, although optical links would greatly simplify the hardware requirements, albeit at the price of significant technological advances in microwave-optical conversion.

In more detail, an interface – also known as a quantum transducer – between the processing unit and the inter-computer communication infrastructure is eventually needed, but it still represents an open problem. For instance, superconducting technologies demand the so-called matter-flying interface [4,149], namely, a device able to convert a qubit belonging to the QPU to a qubit suitable for the transmission over a quantum physical channel [149–151]. In multi-computer architectures, such an interface represents a technology challenge which constitutes the major source of complexity from a networking perspective.

Delay imposed by classical and quantum communication times is slightly longer, when compared to Multi-Core architectures. Hence,
run them on a simulator. When configuring an application, the user more sophisticated timing and synchronization functionalities are re-
specifies the different roles and what types of inputs the application quired. The network topology becomes more complex, and it may
uses. In addition, the user writes the functionality of the application present some sort of temporal dynamics as the number of intercon-
using the NetQASM SDK [88]. When configuring an experiment, the nected quantum computers might change in time. This, in turn, induces
user can give values to the inputs that were specified when creating network functionalities dynamics that must be carefully taken into
the application. The user also chooses which channels and nodes are account. The problem of remote operations compiling – and, hence, the
used in the network and which role is linked to which node. Once trade-off between computational and communication qubits – becomes
configured, the experiment is parsed and sent to the NetSquid simu- even more intricate. Finally, in this type of architecture, the execution
lator [132]. QNE-ADK is particularly useful when the application code management problem discussed in Section 4.3, arises, with multiple
developed with NetQASM SDK is provided to the user, whose only duty users performing concurrent access to the resources.
is to configure and perform experiments. Indeed, using the execution
environment is straightforward. There is also a visual interface that 7.1.3. Multi-farm quantum architectures
further simplifies the experiment configuration. This last archetype for DQC architectures involves interconnecting
multiple geographically-distributed quantum farms. Two are the key
challenges here. First, as mentioned in Section 2, there exists a likely
7. Open issues and research directions spread heterogeneity, which requires significant efforts in terms of stan-
dardization and interoperability [57]. Furthermore, the heterogeneity
In this section we discuss the open issues and research directions among quantum links – e.g., optical vs terrestrial free-space vs satellite
related to DQC, by focusing on the four pillars around which this survey free-space – will arise. And efficient quantum transducers are now
is organized on. mandatory [4,149,151].
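The growth of communication delay across the three archetypes can be made concrete with a back-of-the-envelope estimate. The Python sketch below is illustrative only: it assumes a signal speed of about 2×10^8 m/s (a common rule of thumb for optical fibre), and the link lengths and the 100 µs coherence time are hypothetical placeholders, not values taken from this survey. It compares the one-way propagation delay of a single classical message, such as the measurement outcomes a teleportation-based remote operation has to communicate, with the assumed coherence time.

```python
# Assumed signal speed: roughly two thirds of the vacuum speed of light,
# a common rule of thumb for optical fibre. Not a value from the survey.
ASSUMED_SIGNAL_SPEED_M_PER_S = 2.0e8

# Hypothetical link lengths for the three DQC archetypes (illustration only).
ASSUMED_DISTANCES_M = {
    "multi-core": 1.0,
    "multi-computer": 100.0,
    "multi-farm": 1_000_000.0,
}

# Hypothetical qubit coherence time (illustration only).
ASSUMED_COHERENCE_TIME_S = 100e-6


def one_way_delay_s(distance_m: float) -> float:
    """Propagation delay of one classical message over the given distance."""
    return distance_m / ASSUMED_SIGNAL_SPEED_M_PER_S


def delay_to_coherence_ratio(distance_m: float, coherence_time_s: float) -> float:
    """Fraction of the coherence time spent waiting for one classical message."""
    return one_way_delay_s(distance_m) / coherence_time_s


if __name__ == "__main__":
    for name, distance_m in ASSUMED_DISTANCES_M.items():
        ratio = delay_to_coherence_ratio(distance_m, ASSUMED_COHERENCE_TIME_S)
        print(f"{name:>14}: delay = {one_way_delay_s(distance_m):.2e} s, "
              f"delay / coherence time = {ratio:.2e}")
```

Under these assumed numbers the delay is a negligible fraction of the coherence time at the multi-core scale, whereas at the multi-farm scale a single message already takes many coherence times, which is one way to read the need for more sophisticated timing, synchronization and routing as the architecture grows.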


Fig. 15. Networking challenges for distributed quantum computing. It is reasonable to assume that the underlying hardware complexity scales proportionally with the three DQC archetypes in terms of: (i) extension of the communication infrastructure, (ii) number of interconnected quantum devices, and (iii) hardware heterogeneity among the quantum devices.

The delays induced by the distances pose severe challenges for entanglement generation and distribution, and effective routing techniques are required.

The increasing number of quantum devices to be wired and the heterogeneity of the environments hosting the quantum computers must be taken into account as well. At this stage, the compiling and execution management problems are even more complex, demanding specific network services to be integrated with those of the classical Internet (such as DNS, DHCP, etc.).

We emphasize that, although each type of architecture is characterized by an increasing amount of interconnected quantum resources, the actual deployment evolution of DQC towards the multi-farm architecture is strongly dependent on the technological advances and the experimental implementations of the different entities composing a distributed quantum computing ecosystem [4,12].

7.1.4. Cross-architecture challenges

Another fundamental issue arising when networking remote quantum processors, regardless of the specific DQC archetype, is represented by noise and imperfections affecting the quality of the distributed entangled states. The noisier the distributed entangled state, the noisier the overall distributed quantum computation. Fortunately, a well-known technique for counteracting the noise impairments affecting the entanglement generation/distribution process is entanglement distillation (also known as entanglement purification) [70,152–157]. Accordingly, as long as the "quality" of the noisy entanglement exceeds a certain threshold, it is possible to purify multiple imperfect Bell states into a single "almost-maximally entangled" pair, albeit at the price of consuming multiple noisy entangled states within the process. From the above, it follows that one of two orthogonal resources must be exploited for implementing the distillation process, namely, time or space. In more detail, time-expensive distillation requires multiple rounds of entanglement generation and distribution, with each round requiring at least two communication qubits at each processor. Conversely, space-expensive distillation can be completed with few rounds, but with each round involving several communication qubits. Hence, there exists a fundamental trade-off between (i) quality of the overall computation, (ii) delay induced by entanglement distillation, and (iii) communication qubits reserved for distilling a high-quality Bell state.

Furthermore, entanglement is not limited to Bell pairs. In fact, multipartite entanglement (i.e., entanglement shared between more than two parties) has the potential to be a powerful resource for DQC [69,158–160]. In more detail, multipartite entangled states are exploited in the so-called measurement-based quantum computing, which gives rise to a computational model referred to as "one-way computing" [161,162]. One-way computing is different from the circuit model, as it relies on sequences of Pauli measurements on a multipartite entangled state to perform the computation. Remarkably, in such a computing model, the process of generating entanglement is decoupled from the process of consuming it for computation. This, in principle, could be exploited for the design of the network functionalities. Indeed, one can proactively distribute the multipartite resource for one-way computing among the remote processors, and then reactively proceed with the computation, i.e., the Pauli measurements. In this context, from a network perspective, it becomes crucial to efficiently distribute the multipartite entangled resource. Even more importantly, this approach demands tight coordination among the networked processors, for the local corrections to be made after the measurements [69,163].

However, multipartite entanglement is still a widely unexplored research area, and fully unleashing its potential in DQC remains an open research direction.

7.2. Quantum algorithms

Future directions are both theoretical and practical. Despite a considerable amount of work on the fundamentals of distributed quantum computing [121,122,154], a definitive theory of distributable quantum algorithms is still missing. It is known that the quantum circuit model and the DQC model are equivalent up to polylogarithmic depth overhead [121], but a general framework for ranking quantum algorithms in terms of distributability has not been defined. To this purpose, it is necessary to provide a quantitative definition of quantum circuit distributability.

Regarding execution management, the broad literature on parallel job scheduling may be a starting point, but it is clear that the peculiarities of quantum computing (quantum parallelism, no-cloning, entanglement, etc.) demand novel and specific strategies for the efficient execution of concurrent distributed quantum computations. A


trade-off between the complexity of the distributed quantum circuit and the physical distance between quantum processors is also envisaged. Furthermore, to compare different deployments and schedules, DQC-specific key performance indicators must be defined [100,101].

7.3. Quantum compiling

As described in Section 3, the entanglement generation and distribution functionality plays a key role in quantum compiling. Although there exists no standard model assigning to specific entities the responsibility for entanglement generation, distribution and management, we can identify two possible approaches, namely, network-centric and computing-centric.

In the network-centric approach, the communication infrastructure and its protocols are responsible for entanglement generation, distribution and management (seen as both functionality and actuation), while the compiler exploits the entanglement information gathered by the network to perform distributed algorithms. The other approach can be identified as computing-centric. Specifically, the compiler takes the physical topology as input and instructs the quantum communication infrastructure, through requests about the entangled states to be generated and distributed for performing a distributed algorithm.

Although these two approaches are yet to be completely investigated, some general considerations can be made.

In multi-computer and multi-core architectures, it is likely that network operations, such as entanglement generation and distribution, can be engineered and optimized according to the specific algorithm to be performed. In such a case, an ad-hoc augmented coupling map can likely be generated in advance (proactively), prior to the algorithm execution and at compilation time, and the compiler can issue requests to the quantum network before performing the algorithm. This approach entails an additional set-up time to be added to the time interval required for compiling a distributed algorithm. In contrast, in large-scale networks such as multi-farm architectures, this can be a hard task due to the large number of computers and remote operations involved. As a consequence, there exists a deep bi-directional impact between network and compiler design for DQC, which, in turn, is affected by the considered variant of DQC.

The most advanced quantum compilers for execution on single quantum processors are noise-aware, i.e., they take the noise statistics of the device into account, for some or all steps [105,164–167]. A noise-aware quantum compiler for DQC is still missing. Indeed, it is still an open question what kind of noise-awareness such a compiler should have. The different options range from a compiler that has complete knowledge of the target execution platform (quantum processors, quantum links, etc.) to a compiler that only knows generic features of the target quantum processors and network, as the execution manager will decide the actual execution platform assigned to the computation.

Further work could be done regarding the integration of quantum compilers with simulation tools, in line with the preliminary attempt made by Ferrari et al. [146], enabling automated workflows for faster comparative evaluation of compiling strategies. Finally, the problem of combining compilation for DQC, i.e., partitioning and non-local gates optimization, with the local compilation of each circuit partition on the QPUs is rarely studied in the literature.

So far, testing the quality of compiled circuits on real execution platforms has not been possible for the majority of researchers. Once a quantum network is available to the public, much like the current single quantum devices from IBM, Rigetti, etc., it will be possible to evaluate DQC compilers more effectively, with key performance indicators including the resulting computation quality, state fidelity, and other performance metrics [168].

7.4. Quantum simulation

There is a sufficiently varied choice of simulation tools for quantum networks and backends to support DQC research, with specialization on hardware, protocols, or applications. Yet a simulation tool allowing for full-stack simulation of large networks is still missing. Such a tool should support multiprocessing and multithreading, and simple deployment of DQC simulations on high performance computing facilities.

Another possible direction is the development of tools for orchestrating DQC simulations, with automated instantiation of simulation objects representing QPUs and quantum network components. Having quantum compilers for DQC in the loop would also be very useful. Last but not least, it would be valuable to be able to seamlessly replace simulated hardware with real devices.

8. Discussion and future perspective

8.1. Discussion

In order to further highlight the peculiarities of DQC, we summarize in Table 4 the main differences between distributed classical and quantum computing, by focusing on the different archetypes analyzed in Section 2. To this aim, we start from single-core computation.

From an architectural perspective and by oversimplifying, we can identify the classical computing unit as the system comprising: (i) an Arithmetic and Logic Unit (ALU), responsible for performing logic operations; (ii) a hierarchy of memories, with at least an L1 cache and an L2 cache; and (iii) a Control Unit (CU). Accordingly, a single-core is identified as a computing system composed of the CU, the ALU and an L1 cache, and the single-core paradigm uses such a computing unit to process information.

In contrast, the architectural definition of a quantum computing unit is not well-established, due to the differences among the underlying technologies. Indeed, there exists no single hardware unit responsible for logic operations on qubits. For instance, superconducting technology requires the physical qubits to be confined within a cryostat, and logic operations are implemented through pulses properly shaped to change the electrodynamics and the energy levels of the qubit archetype. In contrast, photonic technology requires an optical table as support, and the logic operations are performed through optical components acting on a single photon, which acts as the physical qubit archetype [169].

Nevertheless, with a strong abstraction effort, in a way resembling a single-core classical computing unit, a single QPU can be identified as the system comprising: (i) one entity responsible for the quantum information processing; (ii) another entity responsible for storing such information, namely a quantum memory; (iii) one or more devices in charge of measurement (accessing the information); and (iv) a control system which mainly comprises classical ad-hoc resources, both software and hardware, in charge of the management of such entities. It is worthwhile to clarify that a unified architectural model that can be commonly identified as "quantum single-core" is still missing.

The step from single-core to multi-core in the classical domain was simple and effective. In more detail, a distributed classical multi-core computing architecture comprises a single control unit in charge of multiple cores, which, in turn, share an L2 cache. The cores are usually interconnected via a BUS and placed within the same chip. Hence, for classical multi-core architectures we can clearly identify the shared resources, namely, L2 cache and control unit, along with the BUS for accessing the shared cache.

In contrast, the evolution from single-core to multi-core architecture is complex in the quantum domain and it appears to be deeply technology-dependent. By accounting for the early-stage technology readiness and for the high abstraction used in the concept of QPU, the key observation here lies within the additional hardware and software resources required for the interconnection of multiple QPUs. Specifically,


Table 4
A schematic summary of the differences between the quantum and classical distributed computing paradigms.

Quantum Domain
Feature | Multi-farm Computing | Multi-computer Computing | Multi-core Computing | Single-core Computing
Architecture | multiple quantum farms interconnected via Quantum Internet | multiple quantum computers interconnected via medium-range communication links | multiple refrigerators or optical tables interconnected via dedicated or short-range communication links | single quantum computing unit
Geographical Distribution | global / wide area (QWAN) | same room (QLAN) / same building (QLANs) | same refrigerator / same optical table
Physical Qubit Scalability | yes: exponential scaling of the computing power with linear increasing of interconnected entities | limited: number of physical qubits limited due to crosstalk effects or complex optical setting
Shared Resources | entangled states, communication infrastructure | classical control system, entanglement generation device, measurement and readout system
Resource Sharing | communication infrastructure | dedicated
Programming model | quantum circuit partitioning + TeleData/TeleGate | monolithic execution
Entangled topology coherence | mandatory, delegated to classical communications | Not applicable
Hardware Cost | increased as network hardware and infrastructure are required | limited increase as some hardware resources are shared | fixed by technology
Hardware Heterogeneity | very likely | possible and required for some computing unit technologies | absent or very moderate
Interoperability | mandatory | highly suggested | discretional
Standards | mandatory and to be defined | to be defined

Classical Domain
Feature | Distributed Computing | Multi-processor Computing | Multi-core Computing | Single-core Computing
Architecture | multiple computers connected via network | multiple processors connected via network or BUS | multiple processing cores connected via BUS | single processing core
Geographical Distribution | Global/WLAN | LAN | limited to single physical location | same rack (chip)
Bit Scalability | horizontally scalable by connecting computers to the network | bounded by the number of processors | bounded by the number of cores | Not applicable
Shared Resources | network resources | BUS, level 3 cache | BUS, level 2 cache, CU | level 1 cache, CU
Resource Sharing | network | system | dedicated
Programming model | message passing | shared memory | shared memory
Cache Coherence | Not applicable as no shared cache | fundamental for data consistency and coherence among nodes | Not applicable
Hardware Cost | increased, network requires additional hardware | definitely improved, cost-effective solution | can be improved only at design stage
Hardware Heterogeneity | likely and easily managed | can be both homogeneous and heterogeneous | typically no but allowed | no
Interoperability | required | limited | Not applicable
Standards | network standards and protocols | specific parallel programming models and libraries
Examples | TCP/IP, HTTP-REST, SOAP | OpenMP, MPI, CUDA
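As a rough illustration of the "quantum circuit partitioning + TeleData/TeleGate" programming model listed in Table 4, the Python sketch below counts how many two-qubit gates become remote under a given assignment of qubits to QPUs. It assumes, for simplicity, that every remote gate is executed via TeleGate and consumes one Bell pair, with no compiler optimization; the example circuit, the two placements and all function names are hypothetical and are not taken from any tool discussed in this survey.

```python
from typing import Dict, List, Tuple

# A two-qubit gate is represented only by the pair of qubit indices it acts on.
TwoQubitGate = Tuple[int, int]


def count_remote_gates(gates: List[TwoQubitGate], placement: Dict[int, str]) -> int:
    """Count the gates whose two qubits are assigned to different QPUs."""
    return sum(1 for a, b in gates if placement[a] != placement[b])


def bell_pairs_needed(gates: List[TwoQubitGate], placement: Dict[int, str]) -> int:
    """Assumption: one Bell pair per remote gate (plain TeleGate, no optimization)."""
    return count_remote_gates(gates, placement)


# Hypothetical 4-qubit circuit, listed as the qubit pairs of its two-qubit gates.
EXAMPLE_CIRCUIT: List[TwoQubitGate] = [(0, 1), (1, 2), (2, 3), (0, 3), (1, 2)]

# Two hypothetical ways of splitting the four qubits over two QPUs.
PLACEMENT_A = {0: "qpu0", 1: "qpu0", 2: "qpu1", 3: "qpu1"}
PLACEMENT_B = {0: "qpu0", 1: "qpu1", 2: "qpu1", 3: "qpu0"}

if __name__ == "__main__":
    for label, placement in (("A", PLACEMENT_A), ("B", PLACEMENT_B)):
        print(f"placement {label}: "
              f"{bell_pairs_needed(EXAMPLE_CIRCUIT, placement)} Bell pairs")
```

For this toy circuit, placement A yields three remote gates and placement B yields two, so the same algorithm consumes a different amount of entanglement depending on how it is partitioned. A realistic compiler would also weigh link quality, gate ordering and TeleData moves, which this sketch deliberately ignores.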


differently from the classical domain, a multi-core quantum architecture has some specific requirements, which at least include: (i) dedicated hardware for the entanglement generation and distribution unit; (ii) a classical control unit enabling the classical communications between the cores needed by the quantum communication primitives and by the entanglement generation and distribution unit. So, while in the classical domain parallel processing is enabled by a shared memory, in the quantum domain parallelism is enabled by shared entangled states.

Despite this increase in the realization complexity, the interconnection of quantum multi-cores is worth pursuing. Indeed, while in classical multi-core architectures a linear increase of the number of cores roughly corresponds to a linear increase of the computational power, in the quantum domain the increase of the computational power is, in principle, exponential.

The differences between classical and quantum multi-core architectures are not limited to architectural aspects. Indeed, while in the classical domain we have the issue of cache coherence, which concerns the consistency of data stored and processed by multiple entities, in the quantum domain entanglement coherence, concerning the consistency of entangled states shared among the quantum computers, plays a fundamental role [170]. This, in turn, changes the design principles. Indeed, any action on one entangled qubit affects the overall augmented coupling map. Hence it impacts the remote operations that can be performed [58]. Additionally, differently from classically stored data, entanglement changes its state over time.

As mentioned in Section 2, in this type of DQC architecture, the physical distance between remote qubits is very short, and the number of cores that can be interconnected within a limited space cannot scale without bound. As a consequence, the number of physical qubits that can be clustered together is also limited. The physical qubit scalability can be enhanced by moving to multi-computer quantum architectures. However, we cannot adopt the reasoning adopted in the classical domain for moving from multi-core to multi-computer architectures. Indeed, the specificities analyzed above for multi-core quantum architectures become even more pronounced. For instance, the dedicated hardware for the entanglement generation and distribution unit(s) must be more effective and efficient for covering the longer distances involved in this type of architecture. Additional hardware with no counterpart in the classical architectures, such as the quantum transducer described in Section 7, may arise.

Similar considerations hold for the step from multi-computer to multi-farm architectures, with a clear worsening of the challenges due to the scale (both in terms of distances and nodes) of the distributed quantum computing architecture.

8.2. Industrial and standardization perspective

A first quantum revolution has already exploited quantum technologies in our everyday life, creating a deep techno-economic and social impact. Today, a second revolution is underway, and it is safe to predict it will have a major impact in many markets, ranging from Telecom and ICT, through Medicine, to Finance and Transportation, and so on.

Significant work is still needed to develop enabling components and systems for DQC. Yet, considering the foreseen industrial opportunities, significant investments are being made worldwide across public and private organizations.

One major obstacle on the way to industrial exploitation of distributed quantum computing is that, nowadays, the industry has not yet consolidated around one type of quantum hardware technology. In this scenario, a quantum hardware abstraction layer (Quantum-HAL), embracing the two killer domains of quantum technologies for ICT, namely quantum computing and quantum networking, would allow application and service developers to start using abstractions of the underlying quantum hardware, even if it is still under consolidation. This would definitely simplify and speed up the development of quantum platforms, services, and applications. Indeed, a Quantum-HAL for distributed quantum computing would provide unified northbound quantum application programming interfaces (APIs) for the higher layers, decoupling them from the different types of quantum hardware technologies (e.g., trapped ions, superconducting qubits, silicon photonic qubits).

Another key aspect for increasing the TRL (Technology Readiness Level) of distributed quantum computing concerns its integration with current Telecom and ICT infrastructures. This implies the definition and standardization of a management and control approach (architectures and APIs) capable of interworking with current solutions. All these activities require coordinated and joint efforts including, where appropriate, existing projects, industry and standards bodies (ITU-T, ETSI, IETF, CEN/CENELEC [171] and IEEE, just to mention a few) active in the area of quantum technologies.

Overall, the final goal is to bridge the gap between DQC and the established cloud and edge computing platforms, tools and methods, and to focus on the inter-related constraints between the different aspects of the architectural design, so as to enable the development of practical DQC solutions. To achieve this goal, research and innovation activities are required in diverse and complementary fields, ranging from computational complexity and networked systems through quantum information and optics to communications and computer science engineering.

CRediT authorship contribution statement

Marcello Caleffi: Writing – review & editing, Writing – original draft, Supervision. Michele Amoretti: Writing – review & editing, Writing – original draft. Davide Ferrari: Writing – review & editing, Writing – original draft. Jessica Illiano: Writing – review & editing, Writing – original draft. Antonio Manzalini: Writing – review & editing, Writing – original draft. Angela Sara Cacciapuoti: Writing – review & editing, Writing – original draft, Supervision.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

No data was used for the research described in the article.

Acknowledgment

Angela Sara Cacciapuoti and Michele Amoretti acknowledge financial support from the European Union – NextGenerationEU, PNRR MUR project PE0000023-NQSTI.

References

[1] K. Nemoto, Our future with quantum computers, JSAP Rev. 2023 (2023) 230212.
[2] CSA QUCATS, Strategic Research and Industry Agenda, European Commission, 2024.
[3] M. Caleffi, A.S. Cacciapuoti, G. Bianchi, Quantum internet: From communication to distributed computing!, in: Proc. of ACM NANOCOM ’18, Association for Computing Machinery, 2018, pp. 1–4, http://dx.doi.org/10.1145/3233188.3233224.
[4] A.S. Cacciapuoti, M. Caleffi, F. Tafuri, F.S. Cataliotti, S. Gherardini, G. Bianchi, Quantum internet: Networking challenges in distributed quantum computing, IEEE Netw. 34 (1) (2020) 137–143, http://dx.doi.org/10.1109/MNET.001.1900092.
[5] R. Van Meter, S.J. Devitt, The Path to Scalable Distributed Quantum Computing, Computer 49 (9) (2016) 31–42, http://dx.doi.org/10.1109/MC.2016.291.
[6] J. Preskill, Quantum Computing in the NISQ era and beyond, Quantum 2 (79) (2018).


[7] Y. Kim, A. Eddins, S. Anand, K.X. Wei, E. van den Berg, S. Rosenblatt, H. Nayfeh, Y. Wu, M. Zaletel, K. Temme, A. Kandala, Evidence for the utility of quantum computing before fault tolerance, Nature 618 (7965) (2023) 500–505.
[8] Y. Kim, C.J. Wood, T.J. Yoder, S.T. Merkel, J.M. Gambetta, K. Temme, A. Kandala, Scalable error mitigation for noisy quantum circuits produces competitive expectation values, Nat. Phys. 19 (5) (2023) 752–759.
[9] S. Wehner, D. Elkouss, R. Hanson, Quantum Internet: a Vision for the Road Ahead, Science 362 (6412) (2018).
[10] M. Caleffi, D. Chandra, D. Cuomo, S. Hasaanpour, A.S. Cacciapuoti, The Rise of the Quantum Internet, IEEE Comput. (2020).
[11] R. Parekh, A. Ricciardi, A. Darwish, S. DiAdamo, Quantum algorithms and simulation for parallel and distributed quantum computing, in: 2021 IEEE/ACM Second International Workshop on Quantum Computing Software, QCS, IEEE Computer Society, Los Alamitos, CA, USA, 2021, pp. 9–19, http://dx.doi.org/10.1109/QCS54837.2021.00005, URL https://doi.ieeecomputersociety.org/10.1109/QCS54837.2021.00005.
[12] D. Cuomo, M. Caleffi, A.S. Cacciapuoti, Towards a distributed quantum computing ecosystem, IET Quantum Commun. 1 (2020) 3–8(5), http://dx.doi.org/10.1049/iet-qtc.2020.0002.
[13] D. Ferrari, A.S. Cacciapuoti, M. Amoretti, M. Caleffi, Compiler design for distributed quantum computing, IEEE Trans. Quantum Eng. 2 (2021) 1–20, http://dx.doi.org/10.1109/TQE.2021.3053921.
[14] J. Avron, O. Casper, I. Rozen, Quantum advantage and noise reduction in distributed quantum computing, Phys. Rev. A 104 (2021) 052404, http://dx.doi.org/10.1103/PhysRevA.104.052404.
[15] D. Ferrari, S. Carretta, M. Amoretti, A Modular Quantum Compilation Framework for Distributed Quantum Computing, IEEE Trans. Quantum Eng. 4 (2023) 1–13.
[16] D. Ferrari, M. Amoretti, A design framework for the simulation of distributed quantum computing, in: HPQCI Workshop in Conjunction with the 33rd ACM International Symposium on High-Performance Parallel and Distributed Computing, 2024.
[17] A. Gold, J.P. Paquette, A. Stockklauser, M.J. Reagor, M.S. Alam, A. Bestwick, N. Didier, A. Nersisyan, F. Oruc, A. Razavi, B. Scharmann, E.A. Sete, B. Sur, D. Venturelli, C.J. Winkleblack, F. Wudarski, M. Harburn, C. Rigetti, Entanglement across separate silicon dies in a modular superconducting qubit device, npj Quantum Inf. 7 (1) (2021) 142.
[18] IBM, Expanding the IBM Quantum roadmap to anticipate the future of quantum-centric supercomputing, URL https://research.ibm.com/blog/ibm-quantum-roadmap-2025.
[19] Y. Zhong, H.-S. Chang, A. Bienfait, et al., Deterministic multi-qubit entanglement in a quantum network, Nature 590 (7847) (2021) 571–575.
[45] R. Van Meter, S. Devitt, The path to scalable distributed quantum computing, Computer 49 (9) (2016) 31–42, http://dx.doi.org/10.1109/MC.2016.291.
[46] M.A. Nielsen, I.L. Chuang, Quantum computation and quantum information, Cambridge University Press, 2011.
[47] E. Rieffel, W. Polak, Quantum Computing: A Gentle Introduction, The MIT Press, 2011.
[48] J. Kim, D. Min, J. Cho, H. Jeong, I. Byun, J. Choi, J. Hong, J. Kim, A fault-tolerant million qubit-scale distributed quantum computer, in: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, 2024, pp. 1–19.
[49] J. Gambetta, Expanding the IBM Quantum roadmap to anticipate the future of quantum-centric supercomputing, https://www.ibm.com/quantum/blog/ibm-quantum-roadmap-2025.
[50] A. Ovide, S. Rodrigo, M. Bandic, H. Van Someren, S. Feld, S. Abadal, E. Alarcon, C.G. Almudever, Mapping quantum algorithms to multi-core quantum computing architectures, in: 2023 IEEE International Symposium on Circuits and Systems, ISCAS, IEEE, 2023, pp. 1–5.
[51] P. Escofet, S.B. Rached, S. Rodrigo, C.G. Almudever, E. Alarcón, S. Abadal, Interconnect fabrics for multi-core quantum processors: A context analysis, in: Proceedings of the 16th International Workshop on Network on Chip Architectures, NoCArc ’23, ACM, 2023, pp. 34–39.
[52] S. Rodrigo, S. Abadal, E. Alarcon, M. Bandic, H. Van Someren, C.G. Almudéver, On double full-stack communication-enabled architectures for multicore quantum computers, IEEE Micro 41 (5) (2021) 48–56.
[53] H. Jnane, B. Undseth, Z. Cai, S.C. Benjamin, B. Koczor, Multicore quantum computing, Phys. Rev. Appl. 18 (2022) 044064, http://dx.doi.org/10.1103/PhysRevApplied.18.044064, URL https://link.aps.org/doi/10.1103/PhysRevApplied.18.044064.
[54] P. Escofet, S.B. Rached, S. Rodrigo, C.G. Almudever, E. Alarcón, S. Abadal, Interconnect fabrics for multi-core quantum processors: A context analysis, in: Proceedings of the 16th International Workshop on Network on Chip Architectures, 2023, pp. 34–39.
[55] P. Escofet, A. Ovide, M. Bandic, L. Prielinger, H. van Someren, S. Feld, E. Alarcón, S. Abadal, C.G. Almudéver, Revisiting the mapping of quantum circuits: Entering the multi-core era, ACM Trans. Quantum Comput. (2024).
[56] F. Mazza, M. Caleffi, A.S. Cacciapuoti, Intra-QLAN connectivity: beyond the physical topology, 2024, arXiv preprint arXiv:2406.09963.
[57] W. Kozlowski, S. Wehner, R. Van Meter, B. Rijsman, A.S. Cacciapuoti, M. Caleffi, S. Nagayama, Architectural principles for a quantum internet, 2023, http://dx.doi.org/10.17487/RFC9340, RFC 9340, URL https://www.rfc-editor.org/info/rfc9340.
[58] J. Illiano, M. Caleffi, A. Manzalini, A.S. Cacciapuoti, Quantum internet protocol
[20] M. Pompili, S.L.N. Hermans, S. Baier, et al., Realization of a multinode quantum
stack: a comprehensive survey, Comput. Netw. 213 (2022) 109092.
network of remote solid-state qubits, Science 372 (6539) (2021) 259–264.
[59] A.S. Cacciapuoti, J. Illiano, S. Koudia, K. Simonov, M. Caleffi, The quantum
[21] S.L.N. Hermans, M. Pompili, H.K.C. Beukers, et al., Qubit teleportation between
internet: Enhancing classical services one qubit at a time, IEEE Netw. 36 (5)
non-neighbouring nodes in a quantum network, Nature 605 (7911) (2022)
(2022) 6–12.
663–668.
[60] A.S. Cacciapuoti, M. Caleffi, Toward the quantum internet: A directional-
[22] J.V. Rakonjac, S. Grandi, S. Wengerowsky, D. Lago-Rivera, F. Appas, H. de
dependent noise model for quantum signal processing, in: IEEE ICASSP ’19,
Riedmatten, Transmission of light–matter entanglement over a metropolitan
2019, pp. 7978–7982, http://dx.doi.org/10.1109/ICASSP.2019.8683195.
network, Optica Quantum 1 (2) (2023) 94–102.
[61] A.S. Cacciapuoti, M. Caleffi, R. Van Meter, L. Hanzo, When entanglement meets
[23] V. Krutyanskiy, M. Canteri, M. Meraner, V. Krcmarsky, B. Lanyon, Multi-
classical communications: Quantum teleportation for the quantum internet,
mode ion-photon entanglement over 101 kilometers, PRX Quantum 5 (2024)
IEEE Trans. Commun. 68 (6) (2020) 3808–3833, invited paper.
020308, http://dx.doi.org/10.1103/PRXQuantum.5.020308, URL https://link.
[62] R. Horodecki, P. Horodecki, M. Horodecki, K. Horodecki, Quantum
aps.org/doi/10.1103/PRXQuantum.5.020308.
entanglement, Rev. Modern Phys. 81 (2) (2009) 865.
[24] N.M. Linke, D. Maslov, M. Roetteler, et al., Experimental comparison of two
[63] A. Unnikrishnan, D. Markham, Authenticated teleportation and verification in
quantum computing architectures, Proc. Natl. Acad. Sci. 114 (13) (2017)
a noisy network, Phys. Rev. A 102 (2020) 042401.
3305–3310.
[64] R. Van Meter, K. Nemoto, W. Munro, K. Itoh, Distributed arithmetic on
[25] A. Kandala, K. Temme, A.D. Córcoles, et al., Error mitigation extends the
a quantum multicomputer, in: 33rd International Symposium on Computer
computational reach of a noisy quantum processor, Nature 567 (7749) (2019)
Architecture, ISCA’06, 2006, pp. 354–365.
491–495.
[65] S. DiAdamo, M. Ghibaudi, J. Cruise, Distributed Quantum Computing and
[26] Google Quantum AI, Official web site, https://quantumai.google/.
Network Control for Accelerated VQE, IEEE Trans. Quantum Eng. 2 (2021)
[27] IBM Quantum, Official web site, https://www.ibm.com/quantum.
1–21, http://dx.doi.org/10.1109/TQE.2021.3057908.
[28] Rigetti, Official web site, https://www.rigetti.com/.
[29] Alice, Bob, Official web site, https://www.alice-bob.com/. [66] K. Azuma, S.E. Economou, D. Elkouss, P. Hilaire, L. Jiang, H.-K. Lo, I. Tzitrin,
[30] Anyon, Official web site, https://anyonsys.com/. Quantum repeaters: From quantum networks to the quantum internet, Rev.
[31] IQM, Official web site, https://www.meetiqm.com/. Modern Phys. 95 (2023) 045006.
[32] OQC, Official web site, https://oxfordquantumcircuits.com/. [67] J. Illiano, A.S. Cacciapuoti, A. Manzalini, M. Caleffi, The impact of the quantum
[33] Intel, Intel–s New Chip to Advance Silicon Spin Qubit Research for Quantum data plane overhead on the throughput, in: Proc. of ACM NANOCOM ’21, 2021,
Computing, https://rb.gy/3kz9ih. pp. 1–6, http://dx.doi.org/10.1145/3477206.3477448.
[34] C12, Official web site, https://www.c12qe.com/. [68] A.S. Cacciapuoti, J. Illiano, M. Caleffi, Quantum internet addressing, IEEE Netw.
[35] Quobly, Official web site, https://www.quobly.io/. 38 (1) (2024) 104–111, http://dx.doi.org/10.1109/MNET.2023.3328393.
[36] Quantum Brilliance, Official web site, https://quantumbrilliance.com/. [69] S.-Y. Chen, J. Illiano, A.S. Cacciapuoti, M. Caleffi, Entanglement-based artificial
[37] Alpine Quantum Computing, Official web site, https://www.aqt.eu/. topology: Neighboring remote network nodes, 2024, arXiv preprint arXiv:2404.
[38] IonQ, Official web site, https://ionq.com/. 16204.
[39] Quantinuum, Official web site, https://www.quantinuum.com/. [70] W. Dür, H.J. Briegel, Entanglement purification and quantum error correction,
[40] Oxford Ionics, Official web site, https://www.oxionics.com/. Rep. Progr. Phys. 70 (8) (2007) 1381.
[41] PASQAL, Official web site, https://www.pasqal.com/. [71] A. Dahlberg, M. Skrzypczyk, T. Coopmans, L. Wubben, F. Rozpędek, M. Pompili,
[42] Quera, Official web site, https://www.quera.com/. A. Stolk, P. Pawełczak, R. Knegjens, J. de Oliveira Filho, et al., A link layer
[43] Atom Computing, Official web site, https://atom-computing.com/. protocol for quantum networks, in: Proceedings of the ACM Special Interest
[44] Infleqtion, Official web site, https://www.infleqtion.com/. Group on Data Communication, 2019, pp. 159–173.


[72] R. Van Meter, Distributed digital computation and communication, in: Quantum Networking, John Wiley & Sons, Ltd, 2014, pp. 113–130.
[73] S. Shi, C. Qian, Concurrent entanglement routing for quantum networks: Model and designs, in: Proceedings of the Annual Conference of the ACM Special Interest Group on Data Communication on the Applications, Technologies, Architectures, and Protocols for Computer Communication, 2020, pp. 62–75.
[74] F. Dupuy, C. Goursaud, F. Guillemin, A survey of quantum entanglement routing protocols–challenges for wide-area networks, Adv. Quantum Technol. 6 (5) (2023) 2200180.
[75] A. Montanaro, Quantum algorithms: An overview, npj Quantum Inf. 2 (1) (2016) 15023, http://dx.doi.org/10.1038/npjqi.2015.23.
[76] A. J., A. Adedoyin, J. Ambrosiano, et al., Quantum algorithm implementations for beginners, ACM Trans. Quantum Comput. 3 (4) (2022).
[77] P.W. Shor, Polynomial time algorithms for discrete logarithms and factoring on a quantum computer, in: Algorithmic Number Theory, Springer Berlin Heidelberg, 1994, p. 289.
[78] L.K. Grover, A fast quantum mechanical algorithm for database search, in: Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing, STOC ’96, 1996, pp. 212–219.
[79] A.W. Harrow, A. Hassidim, S. Lloyd, Quantum algorithm for linear systems of equations, Phys. Rev. Lett. 103 (2009) 150502.
[80] M. Cerezo, A. Arrasmith, R. Babbush, et al., Variational quantum algorithms, Nat. Rev. Phys. 3 (9) (2021) 625–644.
[81] B.M. Terhal, Quantum error correction for quantum memories, Rev. Modern Phys. 87 (2015) 307–346.
[82] E. Knill, Conventions for quantum pseudocode, in: Tech. rep, Los Alamos National Lab, United States, 1996, URL https://www.osti.gov/biblio/366453.
[83] H. Abraham, I.Y. Akhalwaya, G. Aleksandrowicz, et al., Qiskit: An open-source framework for quantum computing, 2019.
[84] Google Quantum AI, Cirq, https://quantumai.google/cirq.
[85] Xanadu, PennyLane, https://pennylane.ai/.
[86] A.W. Cross, L.S. Bishop, J.A. Smolin, J.M. Gambetta, Open quantum assembly language, 2017, arXiv:1707.03429.
[87] A. Cross, A. Javadi-Abhari, T. Alexander, N. De Beaudrap, L.S. Bishop, S. Heidel, C.A. Ryan, P. Sivarajah, J. Smolin, J.M. Gambetta, B.R. Johnson, Openqasm 3: A broader and deeper quantum assembly language, ACM Trans. Quantum Comput. 3 (3) (2022) 1–50.
[88] A. Dahlberg, B. van der Vecht, C.D. Donne, et al., NetQASM - a low-level instruction set architecture for hybrid quantum–classical programs in a quantum internet, Quantum Sci. Technol. 7 (3) (2022) 035023.
[89] A. Peduri, S. Bhat, T. Grosser, QSSA: an SSA-based IR for Quantum computing, in: Proceedings of the 31st ACM SIGPLAN International Conference on Compiler Construction, 2022, pp. 2–14.
[90] D. Ittah, T. Häner, V. Kliuchnikov, T. Hoefler, QIRO: A static single assignment-based quantum program representation for optimization, ACM Trans. Quantum Comput. 3 (3) (2022).
[91] S. Nishio, R. Wakizaka, Inquir: Intermediate representation for interconnected quantum computers, 2023, arXiv:2302.00267.
[92] R. Cleve, J. Watrous, Fast parallel circuits for the quantum Fourier transform, in: Proceedings 41st Annual Symposium on Foundations of Computer Science, 2000, pp. 526–536.
[93] N.M.P. Neumann, R. van Houte, T. Attema, Imperfect Distributed Quantum Phase Estimation, in: Computational Science – ICCS 2020, in: Lecture Notes in Computer Science, Springer International Publishing, 2020, pp. 605–615.
[94] A. Kitaev, Quantum computations: algorithms and error correction, Russian Math. Surveys 52 (6) (1997) 1191–1249.
[95] J. Eisert, K. Jacobs, P. Papadopoulos, M.B. Plenio, Optimal local implementation of nonlocal quantum gates, Phys. Rev. A 62 (2000) 052317.
[96] A. Yimsiriwattana, S. Lomonaco, Generalized GHZ states and distributed quantum computing, Contemp. Math. 381 (2005) http://dx.doi.org/10.1090/conm/381.
[97] N.M.P. Neumann, R.S. Wezeman, Distributed quantum machine learning, in: Innovations for Community Services, Springer International Publishing, 2022, pp. 281–293.
[98] C. Cicconetti, M. Conti, A. Passarella, Resource allocation in quantum networks for distributed quantum computing, in: 2022 IEEE International Conference on Smart Computing, SMARTCOMP, 2022, pp. 124–132.
[99] C. Cicconetti, M. Conti, A. Passarella, Service differentiation and fair sharing in distributed quantum computing, Pervasive Mob. Comput. 90 (2023) 101758.
[100] G. Vardoyan, S. Wehner, Quantum Network Utility Maximization, 2022, arXiv e-prints, arXiv:2210.08135v1.
[101] Y. Lee, W. Dai, D. Towsley, D. Englund, Quantum Network Utility: A Framework for Benchmarking Quantum Networks, 2022, arXiv e-prints, arXiv:2210.10752v1.
[102] A.W. Cross, L.S. Bishop, S. Sheldon, P.D. Nation, J.M. Gambetta, Validating quantum computers using randomized model circuits, Phys. Rev. A 100 (2019) 032328, http://dx.doi.org/10.1103/PhysRevA.100.032328.
[103] A. Botea, A. Kishimoto, R. Marinescu, On the Complexity of Quantum Circuit Compilation, in: The Eleventh International Symposium on Combinatorial Search (SOCS 2018), 2018.
[104] J. Kusyk, S.M. Saeed, M.U. Uyar, Survey on quantum circuit compilation for noisy intermediate-scale quantum computers: Artificial intelligence to heuristics, IEEE Trans. Quantum Eng. 2 (2021) 1–16, http://dx.doi.org/10.1109/TQE.2021.3068355.
[105] S. Sivarajah, S. Dilkes, A. Cowtan, et al., T|ket⟩: a retargetable compiler for NISQ devices, Quantum Sci. Technol. 6 (1) (2020) 014003.
[106] A.D. Carcoles, A. Kandala, A. Javadi-Abhari, et al., Challenges and opportunities of near-term quantum computing systems, Proc. of the IEEE (2020) 1–15, in press.
[107] D. Ferrari, M. Amoretti, Efficient and effective quantum compiling for entanglement-based machine learning on IBM Q devices, Int. J. Quantum Inf. 16 (08) (2018) 1840006.
[108] L. Cincio, Y. Subaşı, A.T. Sornborger, P.J. Coles, Learning the quantum algorithm for state overlap, New J. Phys. 20 (11) (2018) 113022.
[109] A. Zulehner, A. Paler, R. Wille, An efficient methodology for mapping quantum circuits to the IBM qx architectures, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 38 (7) (2019) 1226–1236.
[110] M. Soeken, G. Meuli, B. Schmitt, et al., Boolean satisfiability in quantum compilation, Phil. Trans. Royal Soc. A 378 (2164) (2019) 1–16, http://dx.doi.org/10.1098/rsta.2019.0161.
[111] C. Chamberland, G. Zhu, T.J. Yoder, et al., Topological and Subsystem Codes on Low-Degree Graphs with Flag Qubits, Phys. Rev. X 10 (011022) (2020).
[112] IBM Q, Transpiler, https://qiskit.org/documentation/apidoc/transpiler.html.
[113] P. Andrés-Martínez, C. Heunen, Automated distribution of quantum circuits via hypergraph partitioning, Phys. Rev. A 100 (2019) 032308, http://dx.doi.org/10.1103/PhysRevA.100.032308.
[114] R. G. Sundaram, H. Gupta, C.R. Ramakrishnan, Efficient Distribution of Quantum Circuits, in: 35th International Symposium on Distributed Computing (DISC 2021), 2021.
[115] R.G. Sundaram, H. Gupta, C.R. Ramakrishnan, Distribution of Quantum Circuits Over General Quantum Networks, in: 2022 IEEE International Conference on Quantum Computing and Engineering, QCE, 2022, pp. 415–425.
[116] O. Daei, K. Navi, M. Zomorodi-Moghadam, Optimized quantum circuit partitioning, Internat. J. Theoret. Phys. 59 (12) (2020) 3804–3820, http://dx.doi.org/10.1007/s10773-020-04633-8.
[117] Z. Davarzani, M. Zomorodi-Moghadam, M. Houshmand, M. Nouri-baygi, A dynamic programming approach for distributing quantum circuits by bipartite graphs, Quantum Inf. Process. 19 (2020) http://dx.doi.org/10.1007/s11128-020-02871-7.
[118] E. Nikahd, N. Mohammadzadeh, M. Sedighi, M.S. Zamani, Automated window-based partitioning of quantum circuits, Phys. Scr. 96 (3) (2021) 035102, http://dx.doi.org/10.1088/1402-4896/abd57c.
[119] D. Cuomo, M. Caleffi, K. Krsulich, F. Tramonto, G. Agliardi, E. Prati, A.S. Cacciapuoti, Optimized compiler for distributed quantum computing, ACM Trans. Quantum Comput. 4 (2) (2023) 1–29.
[120] D. Dadkhah, M. Zomorodi, S.E. Hosseini, A New Approach for Optimization of Distributed Quantum Circuits, Internat. J. Theoret. Phys. 60 (9) (2021) 3271–3285, http://dx.doi.org/10.1007/s10773-021-04904-y.
[121] R. Beals, S. Brierley, O. Gray, et al., Efficient distributed quantum computing, Proc. R. Soc. A Math. Phys. Eng. Sci. 469 (2153) (2013) 20120686.
[122] S. Brierley, Efficient implementation of quantum circuits with limited qubit interactions, Quantum Info. Comput. 17 (13–14) (2017) 1096–1104.
[123] P. Andres-Martinez, T. Forrer, D. Mills, J. Wu, L. Henaut, K. Yamamoto, M. Murao, R. Duncan, Distributing circuits over heterogeneous, modular quantum computing network architectures, Quantum Science and Technology (2024) http://dx.doi.org/10.1088/2058-9565/ad6734.
[124] P. Promponas, A. Mudvari, L. Della Chiesa, P. Polakos, L. Samuel, L. Tassiulas, Compiler for distributed quantum computing: a reinforcement learning approach, 2024, arXiv preprint arXiv:2404.17077.
[125] Y. Shi, N. Leung, P. Gokhale, Z. Rossi, D.I. Schuster, H. Hoffmann, F.T. Chong, Optimized compilation of aggregated instructions for realistic quantum computers, in: Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS ’19, Association for Computing Machinery, New York, NY, USA, 2019, pp. 1031–1044, http://dx.doi.org/10.1145/3297858.3304018, URL https://doi.org/10.1145/3297858.3304018.
[126] P. Gokhale, A. Javadi-Abhari, N. Earnest, Y. Shi, F.T. Chong, Optimized quantum compilation for near-term algorithms with OpenPulse, in: 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO, 2020, pp. 186–200, http://dx.doi.org/10.1109/MICRO50266.2020.00027.
[127] J. Cheng, H. Deng, X. Qia, Accqoc: Accelerating quantum optimal control based pulse generation, in: 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture, ISCA, 2020, pp. 543–555, http://dx.doi.org/10.1109/ISCA45697.2020.00052.
[128] S. Debnath, N.M. Linke, C. Figgatt, K.A. Landsman, K. Wright, C. Monroe, Demonstration of a small programmable quantum computer with atomic qubits, Nature 536 (7614) (2016) 63–66, http://dx.doi.org/10.1038/nature18648.
[129] A.G. Fowler, S.J. Devitt, L.C.L. Hollenberg, Implementation of Shor’s algorithm on a linear nearest neighbor qubit array, Quantum Inf. Process. 4 (2004) 237–251, http://dx.doi.org/10.26421/QIC4.4.


[130] B.W. Kernighan, S. Lin, An efficient heuristic procedure for partitioning graphs, Bell Syst. Tech. J. 49 (2) (1970) 291–307, http://dx.doi.org/10.1002/j.1538-7305.1970.tb01770.x.
[131] B. Bartlett, A distributed simulation framework for quantum networks and channels, 2018, arXiv e-prints, arXiv:1808.07047.
[132] T. Coopmans, R. Knegjens, A. Dahlberg, et al., NetSquid, a NETwork Simulator for QUantum Information using Discrete events, Commun. Phys. 4 (1) (2021) 164.
[133] A. Dahlberg, S. Wehner, SimulaQron - a simulator for developing quantum internet software, Quantum Sci. Technol. 4 (1) (2018) 015001.
[134] X. Wu, A. Kolar, J. Chung, et al., SeQUeNCe: a customizable discrete-event simulator of quantum networks, Quantum Sci. Technol. 6 (4) (2021) 045027.
[135] T. Matsuo, Simulation of a dynamic, RuleSet-based quantum network, 2021, arXiv e-prints, arXiv:1908.10758.
[136] S. DiAdamo, J. Notzel, B. Zanger, M.M. Bese, QuNetSim: A Software Framework for Quantum Networks, IEEE Trans. Quantum Eng. 2 (2021) 1–12.
[137] QuTech, Quantum Network Explorer ADK, 2022, URL https://github.com/QuTech-Delft/qne-adk.
[138] C.-T. Liao, S. Bahrani, F.F. da Silva, E. Kashefi, Benchmarking of quantum protocols, Sci. Rep. 12 (1) (2022) 5298, http://dx.doi.org/10.1038/s41598-022-08901-x.
[139] M. Mehic, M. Niemiec, S. Rass, et al., Quantum key distribution: A networking perspective, ACM Comput. Surv. 53 (5) (2020).
[140] A. Manzalini, M. Amoretti, End-to-end entanglement generation strategies: Capacity bounds and impact on quantum key distribution, Quantum Rep. 4 (3) (2022) 251–263.
[141] C. Cicconetti, M. Conti, A. Passarella, Request scheduling in quantum networks, IEEE Trans. Quantum Eng. 2 (2021) 2–17.
[142] Various Authors, Quantum protocol zoo, 2022, URL https://wiki.veriqloud.fr/index.php.
[143] S. Devitt, W. Munro, K. Nemoto, Quantum error correction for beginners, Rep. Progr. Phys. 76 (7) (2013).
[144] D.S. Steiger, T. Häner, M. Troyer, ProjectQ: An Open Source Software Framework for Quantum Computing, Quantum 2 (2018) 49, http://dx.doi.org/10.22331/q-2018-01-31-49, arXiv:1612.08091.
[145] B. Zanger, S. DiAdamo, EQSN: Effective Quantum Simulator for Networks, 2020, URL https://github.com/tqsd/EQSN_python.
[146] D. Ferrari, S. Nasturzio, M. Amoretti, A software tool for mapping and executing distributed quantum computations on a network simulator, 2021, URL https://2021.qcrypt.net/speakers/#list-of-accepted-posters.
[147] A.W. Cross, L.S. Bishop, J.A. Smolin, J.M. Gambetta, Open quantum assembly language, 2017, arXiv e-prints, arXiv:1707.03429.
[148] P. Magnard, S. Storz, P. Kurpiers, et al., Microwave quantum link between superconducting circuits housed in spatially separated cryogenic systems, Phys. Rev. Lett. 125 (2020) 260502.
[149] L. d’Avossa, M. Caleffi, C. Wang, J. Illiano, S. Zorzetti, A.S. Cacciapuoti, Towards the quantum internet: entanglement rate analysis of high-efficiency electro-optic transducer, in: 2023 IEEE International Conference on Quantum Computing and Engineering, QCE, 1, IEEE, 2023, pp. 1325–1334.
[150] N. Lauk, N. Sinclair, S. Barzanjeh, J.P. Covey, M. Saffman, M. Spiropulu, C. Simon, Perspectives on quantum transduction, Quantum Sci. Technol. 5 (2) (2020) 020501.
[151] L. d’Avossa, A.S. Cacciapuoti, M. Caleffi, Quantum transduction models for multipartite entanglement distribution, IEEE QCNC24 (2024).
[152] C.H. Bennett, G. Brassard, S. Popescu, et al., Purification of noisy entanglement and faithful teleportation via noisy channels, Phys. Rev. Lett. 76 (5) (1996) 722.
[153] C.H. Bennett, D.P. DiVincenzo, J.A. Smolin, W.K. Wootters, Mixed-state entanglement and quantum error correction, Phys. Rev. A 54 (5) (1996) 3824.
[154] J.I. Cirac, A.K. Ekert, S.F. Huelga, C. Macchiavello, Distributed quantum computation over noisy channels, Phys. Rev. A 59 (1999) 4249–4254.
[155] L. Ruan, W. Dai, M.Z. Win, Adaptive recurrence quantum entanglement distillation for two-kraus-operator channels, Phys. Rev. A 97 (5) (2018) 052332.
[156] F. Rozpędek, T. Schiet, D. Elkouss, et al., Optimizing practical entanglement distillation, Phys. Rev. A 97 (6) (2018) 062333.
[157] L. Ruan, B.T. Kirby, M. Brodsky, M.Z. Win, Efficient entanglement distillation for quantum channels with polarization mode dispersion, Phys. Rev. A 103 (3) (2021) 032425.
[158] F. Mazza, M. Caleffi, A.S. Cacciapuoti, Quantum LAN: On-demand network topology via two-colorable graph states, IEEE QCNC24 (2024).
[159] K. Simonov, M. Caleffi, J. Illiano, A.S. Cacciapuoti, Universal quantum computation via superposed orders of single-qubit gates, 2023, arXiv preprint arXiv:2311.13654.
[160] F. Riera-Sàbat, W. Dür, A modular entanglement-based quantum computer architecture, 2024, arXiv preprint arXiv:2406.05735.
[161] R. Raussendorf, H. Briegel, A one-way quantum computer, Phys. Rev. Lett. 86 (2001) 5188–5191.
[162] R. Raussendorf, H. Briegel, Computational model underlying the one-way quantum computer, 2001, arXiv preprint quant-ph/0108067.
[163] M. Hein, J. Eisert, H.J. Briegel, Multiparty entanglement in graph states, Phys. Rev. A 69 (6) (2004) 062311.
[164] P. Murali, J.M. Baker, A.J. Abhari, et al., Noise-adaptive compiler mappings for noisy intermediate-scale quantum computers, 2019, arXiv e-prints, arXiv:1901.11054.
[165] S. Nishio, Y. Pan, T. Satoh, et al., Extracting success from ibm’s 20-qubit machines using error-aware compilation, J. Emerg. Technol. Comput. Syst. 16 (3) (2020).
[166] S. Niu, A. Suau, G. Staffelbach, A. Todri-Sanial, A hardware-aware heuristic for the qubit mapping problem in the nisq era, IEEE Trans. Quantum Eng. 1 (2020) 1–14.
[167] D. Ferrari, M. Amoretti, Noise-adaptive quantum compilation strategies evaluated with application-motivated benchmarks, in: Proceedings of the 19th ACM International Conference on Computing Frontiers, CF ’22, 2022, pp. 237–243, http://dx.doi.org/10.1145/3528416.3530250.
[168] J. Wang, G. Guo, Z. Shan, Sok: Benchmarking the performance of a quantum computer, Entropy 24 (10) (2022) http://dx.doi.org/10.3390/e24101467.
[169] F. Bouchard, K. Fenwick, K. Bonsma-Fisher, D. England, P.J. Bustard, K. Heshami, B. Sussman, Programmable photonic quantum circuits with ultrafast time-bin encoding, 2024, arXiv preprint arXiv:2404.17657.
[170] M. Amoretti, S. Carretta, Entanglement verification in quantum networks with tampered nodes, IEEE J. Sel. Areas Commun. 38 (3) (2020) 598–604.
[171] O. van Deventer, N. Spethmann, M. Loeffler, M. Amoretti, et al., Towards European Standards for Quantum Technologies, EPJ Quantum Technol. 9 (3) (2022).

Marcello Caleffi is currently Professor with the DIETI Department, University of Naples Federico II, where he co-leads the Quantum Internet Research Group. He is also with the National Laboratory of Multimedia Communications, National Inter-University Consortium for Telecommunications. From 2010 to 2011, he was with the Broadband Wireless Networking Laboratory with the Georgia Institute of Technology, as a Visiting Researcher. In 2011, he was also with the NaNoNetworking Center in Catalunya (N3Cat) with the Universitat Politecnica de Catalunya, as a Visiting Researcher. Since July 2018, he has held the Italian National Habilitation as a Full Professor of Telecommunications Engineering. His work has appeared in several premier IEEE Transactions and Journals, and he has received multiple awards, including the ‘‘2024 IEEE Communications Society Award for Advances in Communication’’ and the ‘‘2022 IEEE Communications Society Best Tutorial Paper Award’’. He currently serves as an Editor/Associate Editor for IEEE Trans. on Wireless Communications, IEEE Trans. on Communications, IEEE Trans. on Quantum Engineering, IEEE Open Journal of the Communications Society and IEEE Internet Computing. He has served as the chair, the TPC chair, and a TPC member for several premier IEEE conferences. In 2017, he was appointed as Distinguished Visitor Speaker from the IEEE Computer Society and was elected treasurer of the IEEE ComSoc/VT Italy Chapter. In 2019, he was also appointed as a member of the IEEE New Initiatives Committee from the IEEE Board of Directors and, in 2023, he was appointed as ComSoc Distinguished Lecturer.

Michele Amoretti received his Ph.D. in Information Technologies in 2006 from the University of Parma, Parma, Italy. He is an Associate Professor of Computer Engineering at the University of Parma (Italy), where he leads the Quantum Software Laboratory (QSLab) in the Department of Engineering and Architecture. He has authored or co-authored over 130 research papers in refereed international journals, conference proceedings, and books. He serves as Associate Editor for the journal IEEE Trans. on Quantum Engineering. He is the Principal Investigator of the University of Parma’s research unit involved in the ‘‘Quantum Internet Alliance’’ project funded by the European Union – Horizon Europe – Quantum Flagship initiative. He is also involved as a researcher in the ‘‘National Quantum Science and Technology Institute (NQSTI)’’, funded by the European Union – Next Generation EU. He is a member of the Italian delegation in CEN/CENELEC’s JTC 22 ‘‘Quantum Technologies’’.


Davide Ferrari received his Ph.D. in Information Technologies at the Department of Engineering and Architecture of the University of Parma, Italy, in 2023. During the Ph.D. he worked on quantum compiling, quantum optimization and distributed quantum computing. He has been a research scholar at the Future Technology Lab of the University of Parma, working on the design of efficient algorithms for quantum compiling. He is now a research fellow at the Department of Engineering and Architecture of the University of Parma. He is involved in the Quantum Information Science (QIS) research initiative at the University of Parma, where he is a member of the Quantum Software Laboratory. In 2020, he won the ‘IBM Quantum Awards Circuit Optimization Developer Challenge’. His research focuses on quantum optimization applications and efficient quantum compiling for local and distributed quantum computing.

Jessica Illiano received the B.Sc. degree in 2018 and the M.Sc. degree in 2020, both summa cum laude, in Telecommunications Engineering from the University of Naples Federico II (Italy). In 2020 she won the scholarship "Quantum Communication Protocols for Quantum Security and Quantum Internet", fully funded by TIM S.p.A., and in 2024 she received her Ph.D. degree in Information Technologies and Electrical Engineering at the University of Naples Federico II. Since 2017, she has been a member of the Quantum Internet Research Group, FLY: Future Communications Laboratory at the University of Naples Federico II, where she is currently Assistant Professor. She is also website co-chair of N2Women and student Associate Editor for IET Quantum Communication. Her research interests include quantum communications, quantum networks and quantum information processing.

Antonio Manzalini received the M.Sc. degree in Electronic Engineering from the Politecnico of Turin (Italy) and the Ph.D. (cum laude) from Sorbonne Universités (France). In 1990 he joined Telecom Italia (formerly CSELT), where he was involved in innovation activities on technologies and architectures for optical networks. He actively participated in standardization, mainly in ETSI and ITU-T, and he was involved in several EURESCOM and European projects, holding positions of responsibility. He was Chair of the IEEE initiative on Software Defined Networks (SDN), and he was General Chair of several IEEE conferences. He owns several patents on methods and systems for networks and services. His results have been published in more than 130 technical papers and publications. Currently, he is working in TIM (Telecom Italia) Innovation, addressing Cloud-Edge Computing, Beyond-5G Networks and Quantum Communications and Computing. He is currently Chair in GSMA of a group on Quantum Networking and Services.

Angela Sara Cacciapuoti (www.quantuminternet.it) is a Professor of Quantum Communications and Networks at the University of Naples Federico II (Italy). Her work has appeared in first-tier IEEE journals and she has received several awards, including the ‘‘2024 IEEE ComSoc Award for Advances in Communication’’, the ‘‘2022 IEEE ComSoc Best Tutorial Paper Award’’, the ‘‘2022 WICE Outstanding Achievement Award’’ for her contributions in the quantum communication and network fields, and ‘‘2021 N2Women: Stars in Networking and Communications’’. Lately, she also received the IEEE ComSoc Distinguished Service Award for EMEA 2023, assigned for outstanding service to IEEE ComSoc in the EMEA Region. Currently, she is an IEEE ComSoc Distinguished Lecturer with lecture topics on the Quantum Internet design and Quantum Communications. She also serves as Member of the TC on SPCOM within the IEEE Signal Processing Society. Moreover, she serves as Area Editor for IEEE Trans. on Communications and as Editor/Associate Editor for the journals IEEE Trans. on Quantum Engineering, IEEE Network and IEEE Communications Surveys & Tutorials. She served as Area Editor for IEEE Communications Letters (2019–2023), and she was the recipient of the 2017 Exemplary Editor Award of the IEEE Communications Letters. In 2023, she also served as Lead Guest Editor for the IEEE JSAC special issue "The Quantum Internet: Principles, Protocols, and Architectures". From 2020 to 2021, Angela Sara was the Vice-Chair of the IEEE ComSoc Women in Communications Engineering. Previously, she was appointed as Publicity Chair of WICE. From 2017 to 2020, she was the Treasurer of the IEEE Women in Engineering (WIE) Affinity Group of the IEEE Italy Section. Her research interests are in Quantum Information Processing, Quantum Communications and Quantum Networks.
