
INTEGRATION, the VLSI journal 53 (2016) 39–53


Invited paper

SyReC: A hardware description language for the specification and synthesis of reversible circuits

Robert Wille (a,b,*), Eleonora Schönborn (c), Mathias Soeken (b,c), Rolf Drechsler (b,c)

(a) Institute for Integrated Circuits, Johannes Kepler University Linz, A-4040 Linz, Austria
(b) Cyber-Physical Systems, DFKI GmbH, Bibliothekstr. 1, 28359 Bremen, Germany
(c) Institute of Computer Science, University of Bremen, Bibliothekstr. 1, 28359 Bremen, Germany

Article info

Article history: Received 18 December 2014; Received in revised form 5 October 2015; Accepted 10 October 2015; Available online 14 November 2015

Keywords: Reversible logic; Hardware description languages; Synthesis; Optimization

Abstract

Although researchers and engineers originally focused on a preponderantly irreversible computing paradigm, alternative models receive more and more attention. Reversible computation is a promising example which has applications in many emerging technologies such as quantum computation or alternative directions for low-power design. Accordingly, the design of reversible circuits has become an intensely studied research area. In particular, the efficient synthesis of complex reversible circuits poses an important and difficult research question. Most of the solutions proposed thus far are based on pure Boolean function representations such as truth tables or decision diagrams.

In this paper, we provide a comprehensive introduction to and present extensions for the hardware description language SyReC, which allows for the specification and automatic synthesis of reversible circuits. Besides a detailed presentation of the language's concepts and operations, we additionally propose algorithms that optimize the resulting circuits with respect to different objectives. A case study on a RISC CPU as well as a thorough experimental evaluation of both the synthesis approach and its optimizations show the applicability and demonstrate the advantage of SyReC compared to other solutions based on Boolean function representations.

© 2015 Published by Elsevier B.V.

1. Introduction

Researchers and engineers have focused the investigation of computing machines on a preponderantly irreversible computing paradigm. In fact, most of the established computations are not invertible, as a standard logical operation such as an AND illustrates. Although it is possible to obtain the inputs of an AND gate if the output is set to 1 (then, both inputs must be set to 1 as well), it is not possible to uniquely determine the input values if the gate outputs 0. While mainly relying on this conventional way of computation, also its alternative reversible paradigm receives increasing attention.

Reversible computation is a computing paradigm which only allows bijective operations, i.e. reversible n-input n-output functions in which no two input vectors are mapped to the same output pattern. Hence, all computations can be reverted as the inputs can be obtained from the outputs and vice versa.

Although not so well established thus far, reversible computation enables several promising applications and is superior to conventional computation paradigms in many domains. Recent accomplishments and experimental validations in these domains made this alternative paradigm also interesting for computer-aided design. In particular, applications to low-power design and quantum computation accelerated this development.

Low-power computation may significantly profit from reversible circuits because, as observed by Landauer [1], power is always dissipated when information is lost during computation. This indeed happens independently of the applied technology. Hence, all computing machines following the conventional paradigm always lose power if irreversible operations (such as the above-mentioned AND) are performed. Although the fraction of dissipated power is negligible today, it will become substantial with the expected ongoing miniaturization. Moreover, Gershenfeld has shown that the actual power dissipation corresponds to the amount of energy used to represent the signal [2]. With ongoing miniaturization of circuits, this amount will soon become substantial. Since reversible computations are information-lossless (inputs can always be restored from the outputs and vice versa), this power loss can significantly be reduced or even entirely avoided with the alternative paradigm (e.g. [3,4]). Recently, this has experimentally been verified in [5].

Quantum computation [6] allows for breaching complexity bounds which are valid for computing devices based on conventional computing by using quantum mechanical phenomena such as superposition and entanglement. Considering that many of the established quantum algorithms include a significant Boolean component (e.g. the oracle transformation in the Deutsch–Jozsa algorithm [7], the database in Grover's search algorithm [8], and the modulo exponentiation in Shor's algorithm [9]), it is crucial to have efficient methods to synthesize quantum gate realizations of Boolean functions. Since any quantum operation is inherently reversible, reversible circuits can be exploited for this purpose. Moreover, further promising applications have recently been investigated in the design of low-power encoding and decoding devices for on-chip interconnects in systems-on-a-chip [10], for adiabatic circuits [11], and for circuits employing energy recovery logic [12].

Motivated by these promising developments, design and synthesis of reversible circuits have received significant attention in the last decade. This led to several synthesis techniques following complementary schemes and exploiting different kinds of function representations. Examples include approaches

- relying on a permutation-based [13] or cycle-based [14] representation,
- following a transformation-based scheme [15],
- using function descriptions such as positive-polarity Reed–Muller expansion [16], Reed–Muller spectra [17], or Exclusive Sum of Products [18], and
- exploiting efficient data structures such as Binary Decision Diagrams [19] or Quantum Multiple-valued Decision Diagrams [20].

However, all these approaches only allow one to automatically synthesize reversible circuits up to a certain size that is bounded by the underlying function representation. In order to provide an efficient design flow for this kind of computation and its applications, synthesis of reversible logic has to reach a level which allows for the description of circuits at higher levels of abstraction. For this purpose, hardware description languages can be exploited. In conventional synthesis, approaches using languages such as VHDL [21], SystemC [22], or SystemVerilog [23] are used to specify and subsequently synthesize circuits. Even in some of the application domains of reversible logic, namely quantum computation, first programming languages have been proposed (see e.g. Quipper [24] or Scaffold [25] as well as related overviews such as e.g. [26]).

In this work, we provide a comprehensive introduction to and extensions for the hardware description language SyReC.¹ SyReC allows for the specification and automatic synthesis of complex reversible circuits. For this purpose, established concepts from the previously introduced reversible software language Janus [28,29] are adapted. To the best of our knowledge, only two other hardware description languages for reversible logic have been proposed after SyReC was introduced. In [30] a functional reversible language has been proposed that ensures reversibility using a type system based on linear types. As a consequence, only pure reversible functionality can be addressed, i.e. irreversible operations are not supported. A combinator-style functional language has been presented in [31] which offers the design of reversible circuits at the gate level and therefore focuses on lower levels than SyReC.

In the remainder of this paper, the concepts of the SyReC language as well as the corresponding synthesis methodology are described in detail. After a brief review on the basics of reversible logic and circuits, Section 3 introduces the general concept as well as the precise syntax and the (informal) semantics of the proposed language. It is shown how reversibility in the description is ensured while at the same time a wide range of (also non-reversible) functionality can be provided. The realization of a reversible control flow is also discussed. All concepts are illustrated by means of examples. Section 4 describes how the resulting programs are realized as a reversible circuit. A hierarchical synthesis approach is presented that automatically transforms the respective statements and operations of the new language into a reversible circuit. The respective realizations of the individual statements and expressions as building blocks are presented and discussed. While this allows for the synthesis of complex reversible circuits, the quality of the results can still be improved. Sections 5 and 6 introduce optimization techniques which allow for a reduction of the number of lines or of the gate costs, depending on the individual design needs. Finally, the applicability of the language as well as its synthesizer is demonstrated in Section 7. Here, a fully functional RISC CPU is designed and realized using SyReC. This confirms that even complex logic can easily be realized with this language. Furthermore, the effects of the different optimizations are evaluated and the synthesizer is compared to another solution which is based on a Boolean function representation.

Overall, a comprehensive overview on this new hardware description language for reversible circuit design is provided. This eventually lifts the design of circuits following the reversible computation paradigm from the Boolean level to a higher abstraction.

¹ A preliminary introduction of SyReC is available in [27].

* Corresponding author at: Institute for Integrated Circuits, Johannes Kepler University Linz, A-4040 Linz, Austria. Tel.: +43 732 2468 4739; fax: +43 732 2468 4735. E-mail addresses: [email protected] (R. Wille), [email protected] (E. Schönborn), [email protected] (M. Soeken), [email protected] (R. Drechsler).
http://dx.doi.org/10.1016/j.vlsi.2015.10.001
0167-9260/© 2015 Published by Elsevier B.V.

2. Preliminaries

To keep the paper self-contained, this section introduces necessary definitions on reversible logic and circuits.

2.1. Reversible functions

A propositional or Boolean function f : B^n → B^n over the variables X = {x1, ..., xn} is called reversible if it is bijective. Clearly, many Boolean functions of practical interest are not reversible (e.g. the conjunction of two propositional variables with a truth table represented by the bit-string 0001). In order to realize such functionality as a reversible circuit, the corresponding functions are embedded [32,33]. This is achieved by adding so-called garbage outputs which are used to distinguish equal output patterns, thus making the function injective. As a last step in the embedding, constant inputs are added to equalize the number of input variables and output variables of the function, thus making it bijective.

2.2. Reversible circuits

Reversible functions can be realized by reversible circuits in which each variable of the function is represented by a circuit line. To maintain the bijectivity property of the reversible function, fan-out and feedback are not directly allowed in reversible circuits. As a consequence, reversible circuits can be built as a cascade of reversible gates G = g1 ... gd. There exist different gate libraries that are being used to build reversible circuits. However, in the scope of this work we restrict ourselves to the most commonly used ones, containing the Toffoli gate [34] and the Fredkin gate [35]. For this purpose, each gate gi in the circuit is denoted by t(C, T) with

- a gate type t ∈ {T, F},
- control lines C ⊆ X, and
- target lines T ⊆ X \ C.

Each gate gi realizes a reversible function fi : B^n → B^n. If t = T, i.e. the gate is a Toffoli gate, we have T = {xt} and fi maps

  (x_1, ..., x_n) ↦ (x_1, ..., x_{t-1}, x_t ⊕ ⋀_{c ∈ C} c, x_{t+1}, ..., x_n),

i.e. the value on line xt is inverted if, and only if, all control values are assigned 1. A Toffoli gate is called a NOT gate if |C| = 0. For a Fredkin gate, i.e. t = F, we have T = {xs, xt} and fi maps

  (x_1, ..., x_n) ↦ (x_1, ..., x_{s-1}, x'_s, x_{s+1}, ..., x_{t-1}, x'_t, x_{t+1}, ..., x_n),

with x'_s = ¬c'·x_s ⊕ c'·x_t, x'_t = ¬c'·x_t ⊕ c'·x_s, and c' = ⋀_{c ∈ C} c, i.e. the values of the target lines are interchanged (swapped) if, and only if, all control values are assigned 1. A Fredkin gate is also referred to as a SWAP gate if |C| = 0. The function realized by the circuit is the composition of the functions realized by the gates, i.e. f = f1 ∘ f2 ∘ ⋯ ∘ fd.

In addition to the constant inputs and garbage outputs that are added to a function in the process of embedding, for circuits we also consider so-called ancilla lines. Ancilla lines hold a constant input assigned some Boolean value v and are used in such a way that their output is always v. Moreover, when considering circuits that realize a complex functionality, some lines may be semantically grouped as a signal, e.g. if the circuit realizes the addition of two 32-bit values.

Example 1. Fig. 1 shows a reversible circuit with three lines and four gates. The first, second, and fourth gates are Toffoli gates with a different number of control lines. The target line is denoted by ⊕ whereas the control lines are denoted as solid black dots. The third gate is a Fredkin gate whose target lines are denoted by ×.

Fig. 1. Reversible circuit.

In order to measure the costs of a circuit, different metrics can be applied. Besides the number of gates, so-called quantum costs and transistor costs approximate a better cost considering the actual physical implementation based on quantum mechanics and classical mechanics, respectively. Most of the cost metrics are applied to the gates and are accumulated in order to calculate the costs for the overall circuit. In this paper, we use the quantum cost metric presented in [36,37], which grows exponentially with respect to the number of control lines of a gate, and the transistor costs as presented in [4], which grow linearly with respect to the number of control lines.

3. The SyReC language

In the following, the SyReC language is introduced. SyReC allows for the specification and the synthesis of complex logic through common HDL description means. Since every (valid) SyReC program is inherently reversible, the reversibility of the specification is ensured at the same time. The general concepts to achieve this are summarized in the first part of this section. Afterwards, the syntax and semantics of all SyReC description means are explained in detail.

3.1. General concepts

In order to ensure reversibility in its description, SyReC adapts established concepts from the previously introduced reversible programming language Janus [29] and is additionally enhanced by hardware-related language constructs as it is targeting the description of reversible circuits. The general concepts of SyReC are summarized in the following.

3.1.1. Only reversible assignments

Being one of the most elementary language constructs, variable assignments as used in the majority of the imperative languages are irreversible and can therefore not be part of a reversible language. The concept of reversible assignments (or sometimes also called reversible updates) is used as an alternative. Reversible assignments have the form v ⊙= e with ⊙ ∈ {^, +, -} such that the variable v does not appear in the right-hand side expression e. Although SyReC is limited to this set of operators, in general any operator f can be used for the reversible assignment if there exists an inverse operator f⁻¹ such that

  v = f⁻¹(f(v, e), e)   (1)

for all variables v and for all expressions e. Note that '+' (addition) is inverse to '-' (subtraction), and vice versa, and '^' (bit-wise exclusive OR) is inverse to itself. For example, for the increase assignment v += e we have (v + e) - e = v, so the old value of v can always be recovered. When executing the program in reverse order, all reversible assignment operators are replaced by their inverse operators.

3.1.2. Syntactical expressiveness

Due to the construction of the reversible assignment, the right-hand side expression can also be irreversible and compute any operation. The most common operations are directly applicable using a wide variety of syntax including arithmetic (+, -, *, /, %, *>), bit-wise (&, |, ^), logical (&&, ||), and relational (<, >, =, !=, <=, >=) operations. The reversibility is ensured, since the input values to the operation are also given to the inverse operation when reverting the assignment (cf. (1)). In order to specify e.g. a multiplication a*b, a new free signal c must be introduced which is used to store the result (i.e. c ^= (a*b) is applied).

3.1.3. Reversible control flow

A reversible data flow is ensured due to the above-mentioned assignment operations, and the control flow is made bijectively executable in a similar fashion. This becomes particularly manifest in conditional statements. In contrast to non-reversible languages, SyReC requires an additional fi-condition for each if-condition which is applied as an assertion. This fi-condition is required, since a conditional statement may not be computed in both directions using the same condition, i.e. it cannot be ensured that the same block (then-block or else-block) is processed when computing an if-statement in the reverse direction. As a solution, a fi-condition that is asserted when computing the statement in the reverse direction is added, ensuring a consistent execution semantic. This language principle is illustrated in more detail in the next section.

3.1.4. Specific hardware description properties

Since SyReC is used for the synthesis of reversible circuits, it obeys some HDL-related properties:

- The single data-type is a circuit signal with parameterized bit-width.
- Access to single bits (x.N), a range of bits (x.N:N), as well as the size (#x) of a signal is provided.
- Since loops must be completely unrolled during synthesis, the number of iterations has to be available before compilation. That is, dynamic loops (defined by expressions) are not allowed.
- Further operations as used in hardware design (e.g. the shifts '<<' and '>>') are provided.

Overall, the implementation of all these general concepts led to the SyReC syntax as defined by means of the EBNF in Fig. 2. In the following, the syntax and the semantics of all description means are explained and illustrated in detail.

Fig. 2. Syntax of the hardware description language SyReC.

3.2. Module and signal declarations

Each SyReC specification (denoted by 〈program〉 in Line 1 in Fig. 2) consists of one or more modules (denoted by 〈module〉 in Line 2). A module is introduced with the keyword module and includes an identifier (represented by a string as defined in Line 23), a list of parameters representing global signals (denoted by 〈parameterlist〉 in Line 3), local signal declarations (denoted by 〈signallist〉 in Line 5), and a sequence of statements (denoted by 〈statementlist〉 in Line 7). The top-module of a program is defined by the special identifier main. If no module with this name exists, the last module declared is used as the top-module instead.

SyReC uses a signal representing a non-negative integer as its sole data type. The bit-width of signals can optionally be defined by round brackets after the signal name (Line 6). If no bit-width is specified, a default value is assumed. For each signal, an access modifier has to be defined. For a parameter signal (used in a module declaration), this can be either in, out, or inout (Line 4). Local signals can either work as internal signals (denoted by wire) or, in the case of sequential circuits, as state signals² (denoted by state; Line 5). The access modifier affects properties in the synthesized circuits as summarized in Table 1. Besides that, signals can be grouped into multi-dimensional arrays of constant length using square brackets after the signal name and before the optional bit-width declaration (Line 6).

² Note that, depending on the application, feedback and, hence, state signals might not be allowed in reversible circuits. Nevertheless, SyReC supports this concept in principle. For a more detailed discussion on reversible sequential circuits, we refer to [38,39].

Table 1. Signal access modifiers and implied circuit properties.

Modifier   Constant input   Garbage output   State   Initial value
in         –                Yes              No      Given by primary input
out        0                No               No      0
inout      –                No               No      Given by primary input
wire       0                Yes              No      0
state      –                No               Yes     Given by pseudo-primary input

Example 2. Fig. 3 shows several module declarations possible in SyReC, including an adder-module with two inputs and one output (adder1), an adder-module with fixed bit-widths for the inputs and outputs (adder2), an adder-module where four operands are given by a 4-element array composed of 16-bit signals (adder3), and an arbitrary module with local and state signals (myCircuit).

Fig. 3. Exemplary module declarations.
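Fig. 3 itself is not reproduced in this text-only version. The following sketch illustrates what such declarations may look like when written against the syntax described above; all identifiers, bit-widths, and module bodies are illustrative assumptions rather than the contents of the original figure, and the '//' comment markers are used here only for annotation.

    module adder1(in a, in b, out c)         // default bit-width assumed
      c ^= (a + b)

    module adder2(in a(16), in b(16), out c(16))
      c ^= (a + b)

    module adder3(in x[4](16), out sum(16))  // four 16-bit operands in an array
      sum += x[0];
      sum += x[1];
      sum += x[2];
      sum += x[3]

    module myCircuit(in a(8), inout b(8))
      wire t(8)
      state s(8)
      t ^= a;                                // buffer the input in a local wire
      b += t;
      s += b;                                // accumulate into a state signal
      t ^= a                                 // clear the wire again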

3.3. Statements

Statements include call and uncall of other modules, loops, conditional statements, and various data operations (i.e. reversible assignment operations, unary operations, and swap statements; Line 8). The empty statement can explicitly be modeled using the skip keyword (Line 15). Statements are separated by semicolons (Line 7). Signals within statements are denoted by 〈signal〉, allowing access to the whole signal (e.g. x), a certain bit (e.g. x.4), or a range of bits (e.g. x.2:4; Line 16). The bit-width of a signal can also be accessed (e.g. #x; Line 25).

3.3.1. Call and uncall of modules

Hierarchic descriptions are realized in SyReC by means of modules which can be called and uncalled. For this purpose, the keyword call (uncall) has to be applied together with the identifier of the module to be called and its parameters (Line 9). Call executes the selected module in forward direction, while uncall executes the selected module backwards.

Example 3. If a SyReC description of an adder is available (as e.g. declared in Fig. 3), it can be added to a design by the call command as shown in Fig. 4.

Fig. 4. Calling a module identified by adder1.
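Fig. 4 is likewise not reproduced. A minimal sketch of such a call, reusing the assumed adder1 interface from the sketch above, could look as follows:

    module main(in x(16), in y(16), out sum(16))
      call adder1(x, y, sum)       // execute adder1 in forward direction;
                                   // "uncall adder1(x, y, sum)" would run it backwards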

3.3.2. Loops

An iterative execution of a block is defined by means of loops (defined in Line 10). The number of iterations has to be available prior to the compilation, i.e. dynamic loops are not allowed. Therefore, e.g. fixed integer values, the bit-width of a signal, or internal (local) $-variables can be applied. Furthermore, the current value of internal counter variables can be accessed during the iterations. Using the optional keyword step, also the iteration itself can be modified. A loop is terminated by rof.

Example 4. Fig. 5 shows several loop descriptions possible in SyReC, including (a) a simple loop with 10 iterations, (b) an iteration over all bits of an n-bit signal, and (c) a loop with a step definition.

Fig. 5. Loops in SyReC. (a) Simple loop, (b) loop over bits of a signal, and (c) loop with step keyword.
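Fig. 5 is not reproduced either; the fragments below sketch one plausible rendering of the three cases (a)-(c) described above. The loop-variable syntax, the 16-bit width in (b), and the use of the loop variable as a bit index are assumptions based on the description in the text.

    // (a) a simple loop with 10 iterations
    for $i = 1 to 10 do
      ++= x
    rof

    // (b) iterating over all bits of a (here 16-bit) signal x
    for $i = 0 to 15 do
      x.$i ^= y.$i
    rof

    // (c) a loop with a step definition, visiting every second iteration value
    for $i = 0 to 14 step 2 do
      ++= z
    rof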
3.3.3. Conditional statements

Conditional statements (defined in Line 11) need an expression to be evaluated, followed by the respective then- and else-block. Each of these blocks is a sequence of statements. In a forward computation, the then-block is executed if, and only if, the if-expression evaluates to 1; otherwise, the else-block is executed. In order to ensure reversibility, a conditional statement is terminated by a fi together with an adjusted expression. In a backward computation, the fi-expression decides whether the then- or the else-block is reversibly executed. In case neither the then- nor the else-block modifies an input value of the conditional expression, the if- and the fi-expression are identical.

Example 5. Fig. 6 shows two different conditional statements in SyReC. The first one does not modify any of the inputs of the conditional expressions (signal b in this case). Hence, the if- and the fi-expression are identical. In contrast, the then-block of the second conditional statement modifies the value of signal b. Hence, a suitable fi-expression different from the if-expression has to be provided to ensure correct execution semantics in both directions.

Fig. 6. Conditional statements in SyReC.
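Fig. 6 is not reproduced; the two fragments below are assumed illustrations of the two kinds of conditional statements just described. In the first one, the condition input b is left untouched, so the fi-expression simply repeats the if-expression. In the second one, both branches modify b, so a different fi-expression (here: the parity of b after the update) identifies which branch was taken.

    // (1) the branches do not modify the condition input b
    if (b > 5) then
      a += b
    else
      a -= b
    fi (b > 5)

    // (2) both branches modify b, so the fi-expression differs
    if ((b % 2) = 0) then
      b += 1
    else
      b -= 1
    fi ((b % 2) = 1)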
3.3.4. Assignment statements

All further statements include the reversible assignment statements (denoted by 〈assignstatement〉), unary statements (denoted by 〈unarystatement〉), and the swap statement (denoted by 〈swapstatement〉) as defined in Lines 12–14. The semantics of these statements is summarized in Table 2, whereby signals are denoted by x, y and expressions are denoted by e. Since these statements perform only reversible operations, they may assign new values to signals. Therefore, the respective signal(s) to be modified must not appear in the expression on the right-hand side.

Table 2. Statements in SyReC.

Operation    Semantic
x ^= e       Bit-wise XOR assignment of e to x, i.e. x := x ⊕ e
x += e       Increase of x by the value of e, i.e. x := x + e
x -= e       Decrease of x by the value of e, i.e. x := x - e
~= x         Bit-wise inversion of x
++= x        Increment of x
--= x        Decrement of x
x <=> y      Swapping the value of x with the value of y

Example 6. Fig. 7 shows some of these statements in action. It can easily be seen that all these operations can be executed in both directions, i.e. forward and backward computation always lead to unique results.

Fig. 7. Assignment, unary, and swap statements in SyReC.
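The original Fig. 7 is not included here. The following short sequence, an assumed example written with the operators of Table 2, illustrates such statements:

    a ^= (b & c);    // bit-wise XOR assignment
    a += b;          // increase a by b
    a -= c;          // decrease a by c
    ~= a;            // bit-wise inversion of a
    ++= a;           // increment
    --= a;           // decrement
    a <=> b          // swap the values of a and b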


3.4. Expressions

Expressions as defined in Lines 17–20 are applied e.g. in the right-hand side of assignment statements or as branching condition in if/fi-statements. Since expressions do not modify the values of any signal, also non-reversible operations can be applied in expressions without jeopardizing the reversibility. By this, a wide range of different description means is provided. Table 3 lists the semantics of all operations which can be used in expressions, whereby sub-expressions are denoted by e, f and natural numbers are denoted by N.

Table 3. Expressions in SyReC.

Operation    Semantic
e + f        Addition of e and f
e - f        Subtraction of e and f
e * f        Lower bits of the multiplication of e and f
e *> f       Upper bits of the multiplication of e and f
e / f        Division of e and f
e % f        Remainder of the division of e and f
e ^ f        Bit-wise XOR of e and f
e & f        Bit-wise AND of e and f
e | f        Bit-wise OR of e and f
~e           Bit-wise inversion of e
e && f       Logical AND of e and f
e || f       Logical OR of e and f
!e           Logical NOT of e
e < f        True if, and only if, e is less than f
e > f        True if, and only if, e is greater than f
e = f        True if, and only if, e equals f
e != f       True if, and only if, e does not equal f
e <= f       True if, and only if, e is less than or equal to f
e >= f       True if, and only if, e is greater than or equal to f
e << N       Logical left shift of e by N
e >> N       Logical right shift of e by N

Example 7. Fig. 8 shows some statements including expressions that demonstrate the range of description means available in SyReC. Although the language is restricted in order to ensure reversibility (e.g. statements such as c = a*b are not allowed), common functionality can easily be specified nevertheless (e.g. with a new free signal c and c ^= (a*b)). It can easily be seen that, despite the usage of non-reversible operations in Fig. 8, all statements can still be executed in both directions.

Fig. 8. Application of expressions.
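As Fig. 8 is not reproduced, the fragment below gives an assumed illustration of expressions on the right-hand side of reversible assignments, combining arithmetic, shift, bit-range, and relational operations from Table 3 (c, d, and e are assumed to be free signals):

    c ^= (a * b);            // product stored in the free signal c
    c += ((a + b) >> 1);     // arithmetic combined with a shift
    d ^= (a.0:7 & b.0:7);    // bit-wise AND of two bit ranges
    e ^= (a < b)             // relational operations yield a Boolean result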
Using the language introduced in this section, it is possible to specify reversible circuits on a higher level of abstraction. In particular for the design of complex functionality, SyReC clearly outperforms currently applied description means such as truth tables, permutations, and decision diagrams. Later, in Section 7, this is further demonstrated by means of a complete design of a processor in SyReC. Beforehand, the synthesis of a reversible circuit based on a SyReC description is introduced.

4. Synthesis of SyReC specifications

In order to synthesize a given SyReC specification, we developed a hierarchical synthesis method that uses existing realizations, so-called building blocks, of the individual statements and expressions and combines them so that the desired circuit results. More precisely, our approach (1) traverses the whole program and (2) adds cascades of reversible gates to the circuit realizing each statement or expression.

Modules are synthesized independently of each other and afterwards cascaded according to the respective call and uncall statements. All signals are realized by buses of common reversible circuit lines with the specified bit-width. In the following, the individual mappings of the statements to the respective reversible cascades are described. We distinguish between the synthesis of (A) assignment statements (including unary statements and swap statements), (B) expressions, as well as (C) control logic including call/uncall, loops, and conditional statements.

4.1. Synthesis of assignment statements

As introduced in Section 3.3.4, assignment statements in SyReC must be reversible. As a consequence, signal values are not overwritten but rather updated with a new value such that the old value can still be recovered by applying the inverted assignment operation. In the following, we use the notation depicted in Fig. 9(a) to denote such an operation in a circuit structure. Solid lines that cross the box represent the signal(s) on the right-hand side of the statement, i.e. the signal(s) whose values are preserved.

Fig. 9. Synthesis of assignment statements.

The simplest reversible assignment operation is the bit-wise XOR (e.g. a ^= b). For 1-bit signals, this operation can be synthesized by a single Toffoli gate as shown in Fig. 9(b). If signals with a bit-width greater than 1 are applied, a Toffoli gate is applied analogously for each bit.

To synthesize the increase operation (e.g. a += b), a modified addition network is added. In the past, several realizations of addition in reversible logic have been investigated. In particular, it is well known that the minimal realization of a one-bit adder
consists of four Toffoli gates (e.g. [40]). Thus, cascading the required number of one-bit adders is a possible realization. But since every one-bit adder also requires one constant input, this is a very poor solution with respect to circuit lines. In contrast, heuristic realizations exist that require fewer additional lines (e.g. [41]). Since the increase operation, unlike the addition in general, is reversible, it can even be synthesized without any additional lines. Such a realization is used in our approach. A corresponding cascade for a 3-bit increase is depicted in Fig. 9(c).

The mapping for the decrease operation (e.g. a -= b) is handled analogously: since the decrease operation is the inverse of the increase operation, the same realization as depicted in Fig. 9(c) is used in reverse.

Finally, the realizations for the unary and swap statements are straight-forward. The bit-wise inversion (e.g. ~= a) is realized by adding a NOT gate to each circuit line representing a bit of a. Similarly, a swap (e.g. a <=> b) is realized by adding a SWAP gate to the corresponding circuit lines of a and b. To synthesize an increment (e.g. ++= a), a cascade as depicted in Fig. 9(d) is applied. A decrement (e.g. --= a) is realized by using the same cascade in reverse.

4.2. Synthesis of expressions

Expressions include operations that are not necessarily reversible, so that their inputs have to be preserved to allow a (reversible) computation in both directions. To denote such operations, in the following the notation depicted in Fig. 10(a) is used. Again, solid lines represent the signal(s) whose values are preserved (i.e. in this case the input signals).

Synthesis of irreversible functions in reversible logic is not new, so that for most of the respective operations reversible circuit realizations already exist. Additional lines with constant inputs are applied to make an irreversible function reversible (e.g. [32,33]). As an example, Fig. 10 shows a reversible gate that realizes an AND operation. As can be seen, this requires one additional circuit line with a constant input 0. Similar mappings exist for all other operations.

Fig. 10. Synthesis of expressions.

Since expressions can be applied together with assignment statements (e.g. c ^= a&b), sometimes a more compact realization is possible. More precisely, additional (constant) circuit lines can be saved (at least for some statements) if the result of an expression is applied to an assignment statement. As an example, Fig. 10(c) shows the realization for c ^= a&b where no constant input is needed, since the circuit line representing c is used instead.

However, such a simple "combination" is not possible for all statements. As an example, Fig. 10(d) shows a two-bit addition whose result is applied to a bit-wise XOR, i.e. c ^= a+b. Here, removing the constant lines and directly applying the XOR operation on the lines representing c would lead to a wrong result. This is because intermediate results are stored at the lines representing the sum. Since these values are reused later, performing the XOR operation "in parallel" would destroy the result. Thus, to have a combined realization of a bit-wise XOR and an addition, a precise embedding for this case must be generated. Since determining the respective embeddings and circuits for arbitrary combinations of statements and expressions is a cumbersome task, constant lines are applied to realize the respective functionality. However, later in Section 5 an extended synthesis scheme that removes many of these additional lines is presented.

4.3. Synthesis of the control logic

Finally, the synthesis of control logic is considered. Module calls/uncalls and loops are realized in a straightforward manner. More precisely, loops are realized by simply cascading (i.e. unrolling) the respective statements within a loop block for each iteration. Since the number of iterations must be fixed (cf. Section 3.3.2), this results in a finite number of statements which are subsequently processed. Call and uncall of modules are handled similarly. Here, the respective statements in the modules are cascaded.

To realize conditional statements (i.e. if-statements as introduced in Section 3.3.3), two complementary variants are proposed. The first one is depicted in Fig. 11(b). Here, the statements in the then- and else-block are mapped to reversible cascades with an additional control line added to all gates. Thus, the respective
operations of the statements in the then-block (else-block) are computed if, and only if, the result of the expression (stored in signal e) is 1 (0). A NOT gate is applied to flip the value of e so that the gates of the else-block can be "controlled" as well.

Fig. 11. Synthesis of conditional statements. (a) Code. (b) Without additional lines. (c) With additional lines.

Fig. 11(c) shows the second realization of a conditional statement, which is realized in three steps:

1. All signals in the then- or else-block which potentially are assigned a new value (e.g. that are on the left-hand side of a reversible assignment operation) are duplicated. This requires an additional circuit line with constant input 0.
2. The statements within the blocks are mapped to reversible cascades. The duplications introduced in the previous step are applied to intermediately store the results of the then-block and the original values of the signals in the else-block.
3. Depending on the result of the conditional expression e, the values of the duplicated lines and the original lines are swapped. More precisely, in the example of Fig. 11(a) the value of a is swapped with its (newly assigned) duplication if e evaluates to 1. Analogously, if e evaluates to 0 the (newly assigned) value of c is passed through unaltered.

Having both realizations, it is up to the designer which one should be applied during synthesis. The second realization leads to additional circuit lines in contrast to the first realization. However, due to the additional control lines, both the quantum cost and the transistor cost of the circuit increase significantly in the first realization. This, and other aspects, are further evaluated in Section 7.

4.4. Summary

Using the building blocks for statements and expressions as introduced above, it is possible to automatically synthesize reversible circuits specified in SyReC. More precisely, the following two steps are performed for each statement:

1. Compose a sub-circuit realizing all the expressions in a statement using the respective building blocks. The result of the expressions is buffered by means of additional circuit lines.
2. Compose a sub-circuit realizing the overall statement using the existing building blocks of the statement itself together with the buffered results of the expressions.

Hence, the resulting circuits basically have a structure as shown in Fig. 12, i.e. cascades of building blocks for the respective assignment statements and their expressions result.

Fig. 12. Resulting circuit structure.

Obviously, this leads to a significant number of additional circuit lines with constant inputs which are used to buffer intermediate results of the expressions. The precise number of additional circuit lines increases with the complexity of the expression. Usually, a large number of circuit lines is seen as a disadvantage.

5. Line-aware synthesis of SyReC specifications

In order to realize SyReC specifications with a smaller number of additional circuit lines, an extended synthesis scheme is presented in this section (based on [42]). The idea is to use the same building blocks as introduced in the previous section, but to undo intermediate results of the expressions as soon as they are not needed anymore. A similar idea (for reversible software programs) has previously been proposed in [43]. This enables circuit lines which have previously been occupied by expressions to be reused.

In the following, the general concept of this scheme is illustrated before the extended synthesis is described in detail for all possible SyReC statements. Afterwards, the necessary number of additional circuit lines is discussed.

5.1. General concept

The extended synthesis approach follows the scheme as introduced at the end of the previous section, but is extended by an additional third step:

3. Add the inverse of the sub-circuit from Step 1 to the circuit in order to reset the circuit lines buffering the result of the expressions to the constant 0.

Example 8. Consider the two following generic HDL statements (with ⊙ denoting a generic reversible assignment operator and ⊗ a generic binary operator):

  a ⊙= (b ⊗ c); d ⊙= (e ⊗ f);

Fig. 13 sketches the resulting circuit after applying the extended synthesis scheme. The first two sub-circuits G_{b⊗c} and G_{a⊙=(b⊗c)} ensure that the first statement is realized. This is equal to the scheme proposed in Section 4 and leads to additional lines with constant inputs (highlighted thick). Afterwards, a further sub-circuit G⁻¹_{b⊗c} is applied. Since G⁻¹_{b⊗c} is the inverse of G_{b⊗c}, this sets the circuit lines buffering the result of b ⊗ c back to the constant 0. As a result, these circuit lines can be reused in order to realize the following statements, as illustrated for d ⊙= (e ⊗ f) in Fig. 13.

Fig. 13. Line reduction.

5.2. Resulting synthesis scheme

Following the proposed concept, each statement can be realized with zero garbage outputs. In the following, the precise realization of this scheme is detailed for each possibly affected statement. The unary statements, the swap statement (<=>), and the skip statement are not considered here as they are realized without additional circuit lines.
5.2.1. Assignment statements

In order to realize statements of the form a ⊙= e, with e being an arbitrary expression, basically the respective building blocks are arranged as already illustrated in Fig. 13. First, a sub-circuit realizing the expression e, i.e. the right-hand side of the statement, is created. This requires additional lines to store the result of e. Next, a sub-circuit realizing the assignment operation is created as well as a sub-circuit reversing the result of e into a constant value. The latter is done by reversing the order of gates of the first sub-circuit. Finally, all three sub-circuits are composed, leading to the desired realization of the statement.

Example 9. Fig. 14 shows the circuit obtained by synthesizing c ^= (a + b) using the extended synthesis scheme. The respective sub-circuits G_{a+b}, G_{c^=a+b}, and G⁻¹_{a+b} are highlighted by dashed rectangles. Since all gates considered in this work are self-inverse, G⁻¹_{a+b} is obtained by reversing the order of the gates of G_{a+b}.

Fig. 14. Synthesizing c ^= (a + b).

Applying this procedure, any arbitrary combination of assignment statements and expressions can be realized in a garbage-free manner. That is, the required additional circuit lines are ancilla lines and can be reused for other statements and operations.

5.2.2. Conditional statements

As described in Section 4.3, there are two proposed realizations for conditional statements (cf. Fig. 11).

Fig. 15(b) illustrates the adjusted procedure for the synthesis of a conditional statement according to the first realization (i.e. according to the scheme illustrated in Fig. 11(b)). The gates needed to realize the then-block (else-block) are highlighted in dark gray (light gray). Also here, a sub-circuit G_if evaluating the respective if-expression is created. The intermediate results of that expression are handled analogously to assignment statements as described above. An additional circuit line is applied to store the Boolean result of the if-expression and to control the execution of the then- and else-block as described in Section 4.3. The flip on the additional line, which is done to control the gates of the else-block, is then restored by another NOT gate. Afterwards, the original (constant) value of that line is restored by applying a sub-circuit G_fi which evaluates the fi-expression of the statement analogously to G_if. As defined in Section 3.3.3, SyReC requires the definition of a fi-expression that evaluates to the same Boolean value as the if-expression did in G_if.

Besides that, Fig. 15(c) illustrates the adjusted procedure for the synthesis of a conditional statement according to the second realization (i.e. according to the scheme illustrated in Fig. 11(c)). The gates highlighted in dark gray (light gray) correspond to the then-block (else-block). Also here, a sub-circuit G_if is created as in the first realization, and the result of the if-expression is stored in an additional line e (the top line in Fig. 15(c)). The conditional statement is then realized by applying the procedure described in Section 4.3. Afterwards, the values of the additional lines that were used to duplicate signals are reset to the constant value 0. This is done by applying the gates used in the then- and else-block again with e as an additional control line. The additional lines are set to the values of the corresponding signal lines, which are then used to undo the duplication and set the additional lines back to 0. The value of e is reset to 0 by creating a sub-circuit G_fi as in the first realization.

The original advantage of the second realization was lower quantum cost and transistor cost, since the realization of the then- and else-block does not have an extra control line on each gate. This advantage is lost here. Since the values of the additional lines depend on the value of e (e.g. a if e = 1 and a' if e = 0) and the realizations of the then- and the else-block are needed to set the additional line to the same value as the signal line (e.g. a' if e = 1 and a if e = 0), both the realization of the then- and the else-block have to be added to the circuit with an extra control line on each gate. As a consequence, the second realization of conditional statements in the line-aware synthesis leads to both additional circuit lines and higher costs, and is therefore not considered any further.

Fig. 15. Synthesizing conditional statements. (a) Code. (b) Without additional lines. (c) With additional lines.

5.2.3. Loops and calls

The realization of loops and module calls is treated in a straightforward manner, exploiting the procedures proposed above. More precisely, calls are substituted by the corresponding statements inside the body of the call. Loops are realized by explicitly cascading (i.e. unrolling) the respective statements within a loop block according to the fixed and finite number of iterations.

5.3. Discussion

Applying the extended synthesis scheme, every statement is synthesized with zero garbage outputs and only additional ancilla
lines. Consequently, the total number of additional lines which are required to realize a SyReC specification with the proposed solution is determined by the statement that requires the largest number of additional lines in order to buffer intermediate results.

Example 10. Consider a sequence of three assignment statements to be synthesized. Additionally, assume that 1, 3, and 2 circuit lines are needed to buffer the intermediate results of the respective expressions. Then, in total max{1, 3, 2} = 3 additional circuit lines are needed to realize the statements. Fig. 16 illustrates how these circuit lines are applied. For comparison, the synthesis scheme from Section 4 needs 1 + 3 + 2 = 6 additional circuit lines.

Fig. 16. Effect of expression size.

The number of additional circuit lines can further be reduced in many cases by restructuring the SyReC code. In general, larger expressions lead to more intermediate results to be buffered. Thus, if the same functionality can be represented by more but smaller statements, a further reduction in the number of lines is possible.

Example 11. Consider the following statement:

  a += ((b & c) + ((d * e) - f))

In order to execute the outer expression (i.e. the addition operation), the intermediate results of the inner expressions (b & c), (d * e), and ((d * e) - f) are buffered at the same time. Considering 32-bit signals, this requires 96 circuit lines (in addition to 32 circuit lines needed to buffer the result of the outer expression itself, i.e. 128 in total).

In contrast, the same functionality can also be specified by the following statements:

  a += (b & c);
  a += (d * e);
  a -= f;

Here, the respective binary operations are applied separately with an assignment operation. Hence, no more than 32 ancilla lines are needed to buffer the intermediate results.

Overall, the price for the smaller number of circuit lines is an expected increase in the number of gates, and thus in the gate costs. However, the increase in the gate costs is bounded. For example, in comparison to the synthesis scheme from Section 4, where a building block for the expressions and a building block for the statement itself are applied for each assignment statement, the extended scheme uses just one more building block, namely the inverse of the expression sub-circuit. Since this block is the inverse of the first one, the circuit can at most double its gate cost.

Overall, the resulting circuits still include additional circuit lines with constant inputs. But considering that, previously, the synthesis of complex functionality as a reversible circuit with the minimal number of lines was a cumbersome task (e.g. [33]), the proposed solution makes it possible to keep this number relatively small.

6. Cost-aware synthesis of SyReC specifications

Finally, all synthesis approaches proposed in the previous sections can further be refined in order to reduce the costs of the resulting circuits. An observation made in [44] is exploited for this purpose. Here, it has been observed that many reversible circuits are composed of cascades of gates with several common control lines. As reviewed in Section 2, the costs of single gates mainly depend on their respective number of control lines. Hence, buffering the results of common control conditions of a cascade of gates allows for reducing the number of required control lines in each gate. As a result, the costs of each gate and, hence, the costs of the entire circuit are decreased significantly.

Example 12. Consider an 8-bit realization of the increment statement (++= a) as shown in Fig. 17(a). The gates in this cascade have several common control lines, e.g. C0 = {a0, a1, a2}. By adding two Toffoli gates T(C0, {h}), the result of the common control conditions C0 can be buffered in an ancilla line h as shown in Fig. 17(b) (the new gates are emphasized with a gray box and the line h is on the top). This enables all gates with control lines C ⊇ C0 to be simplified, i.e. instead of C a smaller set of control lines (C \ C0) ∪ {h} is sufficient (in Fig. 17(b), the saved control lines are indicated with dashed circles). As a result, the costs of the gates and, hence, the costs of the overall circuit are significantly reduced. In fact, the quantum costs can be improved from 431 to 116 (73%) and the transistor costs can be improved from 224 to 192 (14%).

Fig. 17. Cost-aware synthesis. (a) Original realization. (b) Revised realization.

Similar observations can be made for many other building blocks as well. Particularly (nested) conditional statements frequently lead to large cascades of gates with common control lines. This is because the circuit lines representing the conditional expressions control whole cascades realizing the respective then- and else-blocks. Hence, it is worthwhile to exploit these observations. Note that an improvement is obviously only possible if the total costs of the two added gates are less than the costs saved by buffering the common control lines. Furthermore, a free ancilla line has to be available. This is either already the case (e.g. when a constant circuit line is required anyway for the realization of another expression) or such a line can explicitly be added by the designer to enable this reduction.
R. Wille et al. / INTEGRATION, the VLSI journal 53 (2016) 39–53 49

Fig. 18. Schematic diagram of the CPU implementation.

7. Case study and experimental evaluation

SyReC as proposed above enables the design and synthesis of


complex logic as a reversible circuit. This has been demonstrated
by means of a case study, i.e. the design of a RISC CPU. In this
section, the design issues as well as the application of the result
are summarized and discussed.
Furthermore, we conducted a thorough study on the effects of
the respective optimization approaches presented in Sections 5
Fig. 19. Implementation of the program counter. and 6. For this purpose, we used the components of the designed
CPU as well as further SyReC specifications as benchmarks. The
results of this evaluation are also summarized in this section.

7.1. Design of a RISC CPU

To demonstrate the applicability and the benefits of the pro-


posed hardware description language and its synthesizer, the
design of a CPU is suitable. A CPU provides a well-known piece of
hardware but remains complex enough to bring previously pro-
posed synthesis approaches as discussed in Section 1 to their
limits. In this case study, the basic design steps as well as the
Fig. 20. Assembler program for Fibonacci number computation.
results are summarized. For a detailed consideration of this
experiment, the reader is referred to [45].
The specification of the CPU is inspired by the design of a
Following these concepts, synthesis of SyReC specifications can conventional CPU (cf. [46]). The CPU was created in order to exe-
be refined as follows: cute software programs provided in terms of an assembler lan-
guage which includes arithmetic, logic, jump, and load/store
1. Synthesize a statement as described in the previous sections. instructions. The CPU has been designed as a Harvard architecture
2. Determine cascades of gates t 1 ðC 1 ; T 1 Þ ... t k ðC k ; T k Þ which satisfy with a bit-width of 16 bit for both the program memory and the
the following criteria: data memory. The CPU has 8 registers, the size of the program
(a) The gates in the cascade have a common set C 0 of control memory is 4 kByte, and the size of the data memory is 128 kByte.
lines, i.e. C i + C 0 for 1 ri r k. Fig. 18 provides a schematic overview showing the imple-
(b) The values of the common control lines are not modified mentation of the proposed CPU. In the following, the respective
within this cascade, i.e. C 0 \ T i ¼ ∅ for 1 r i rk. components are briefly described from the left-hand side to the
3. Create a new cascade TðC 0 ; fhgÞ t 1 ððC 1 ⧹C 0 Þ [ fhg; T 1 Þ ... t k ððC k right-hand side.
þ 1⧹C 0 Þ [ fhg; T k ÞTðC 0 ; fhgÞ. In each cycle, first the current instruction is fetched from the
4. If a free circuit line h is available and the new cascade is cheaper program memory. That is, depending on the current value of the
than the original cascade, replace the original cascade with the
program counter pc, the respective instruction word is stored in
new one.
the signal instr. Using this signal, the control unit decodes the
instruction. Afterwards, as defined in the instruction, the respec-
This procedure is applicable to both synthesis approaches, i.e.
tive operation is performed in the ALU. Depending on the value of
to the scheme proposed in Section 4 as well as to the extended
oprt as well as the operands op1 and op2, a result is determined
scheme proposed in Section 5. Determining the best possible and assigned to data. This value is then stored in a register
cascades for replacement is a complex task as the order in which addressed by dest. Finally, the program counter is updated. If no
common control lines are exploited typically has an effect. Hence, control operation has been performed (i.e. if inc ¼ 1), the value of
we apply this procedure only for single statements leading to local signal pc is simply increased by one. Otherwise, pc is assigned the
optima. As confirmed by the experiments in the next section, this value given by jmp. An exception occurs, if the primary input
leads to significant improvements in short run-time. reset is set to 1. Then, the whole execution of the program is

and 6. For this purpose, we used the components of the designed CPU as well as further SyReC specifications as benchmarks. The results of this evaluation are also summarized in this section.

7.1. Design of a RISC CPU

To demonstrate the applicability and the benefits of the proposed hardware description language and its synthesizer, the design of a CPU is suitable. A CPU provides a well-known piece of hardware but remains complex enough to bring previously proposed synthesis approaches as discussed in Section 1 to their limits. In this case study, the basic design steps as well as the results are summarized. For a detailed consideration of this experiment, the reader is referred to [45].

The specification of the CPU is inspired by the design of a conventional CPU (cf. [46]). The CPU was created in order to execute software programs provided in terms of an assembler language which includes arithmetic, logic, jump, and load/store instructions. The CPU has been designed as a Harvard architecture with a bit-width of 16 bit for both the program memory and the data memory. The CPU has 8 registers, the size of the program memory is 4 kByte, and the size of the data memory is 128 kByte.

Fig. 18 provides a schematic overview showing the implementation of the proposed CPU. In the following, the respective components are briefly described from the left-hand side to the right-hand side.

In each cycle, first the current instruction is fetched from the program memory. That is, depending on the current value of the program counter pc, the respective instruction word is stored in the signal instr. Using this signal, the control unit decodes the instruction. Afterwards, as defined in the instruction, the respective operation is performed in the ALU. Depending on the value of oprt as well as the operands op1 and op2, a result is determined and assigned to data. This value is then stored in a register addressed by dest. Finally, the program counter is updated. If no control operation has been performed (i.e. if inc = 1), the value of signal pc is simply increased by one. Otherwise, pc is assigned the value given by jmp.

Fig. 21. Waveform illustrating the execution of the program given in Fig. 20.
Table 4
Comparison to previous work.

Benchmark | Bit-width | PI/PO | BDD-based synth. [19]: Add. lines, QC, TC, Run-time (CPU seconds) | SyReC synth., if-stm. without add. lines: Add. lines, QC, TC | SyReC synth., if-stm. with add. lines: Add. lines, QC, TC
(Entries "–" and ">500" indicate that the BDD-based approach did not produce a result within the timeout of 500 CPU seconds.)


CPU from Section 7.1

cpu_alu 16 55 1852 20,660 77,704 165.99 349 662,531 568,328 2085 31,244 67,896
cpu_alu 32 103 – – – >500 653 2,235,491 1,917,448 6101 112,396 218,680
cpu_control_unit 16 233 618 7119 27,264 0.12 158 40,433 43,888 413 22,343 31,432
cpu_pc 11 24 39 392 1456 0.00 13 857 912 68 797 1336
cpu_register 16 149 512 7,040 25,600 0.05 18 9833 8472 162 7,560 8472
cpu_register 32 293 1024 14,080 51,200 0.21 34 19,641 16,792 322 15,096 16,920

Benchmarks from RevLib [40]

alu 16 50 – – – >500 67 258,872 234,424 115 146,385 151,168


alu 32 98 – – – >500 131 1,704,912 1,402,232 227 1,230,577 1,064,000
alu_flat 16 50 – – – >500 68 181,662 179,464 132 146,496 151,472
alu_flat 32 98 – – – >500 132 1,380,526 1,177,928 260 1,230,784 1,064,560
simple_alu 16 50 – – – >500 67 35,463 39,552 115 6275 17,568
simple_alu 32 98 – – – >500 131 144,791 154,432 227 25,531 67,744
bubblesort 16 64 – – – >500 254 29,327 44,248 748 21,149 43,272
bubblesort 32 128 – – – >500 494 58,739 88,840 1468 42,281 87,096
callif 16 33 499 7031 26,128 3.80 1 1522 3816 33 641 2664
callif 32 65 – – – >500 1 3154 7912 65 1313 5480
mult_stmts 16 96 – – – >500 32 6122 16,960 32 6122 16,960
mult_stmts 32 192 – – – >500 64 25,282 66,752 64 25,282 66,752
nestedif 16 34 752 10,534 39,128 11.04 3 6982 11,000 99 1475 5848
nestedif 32 66 – – – >500 3 14,470 22,776 195 3011 11,992
nestedif2 16 34 257 3348 12,312 1.72 4 8423 8856 100 6034 6824
nestedif2 32 66 – – – >500 4 31,703 27,736 196 26,674 23,784
varops 16 48 – – – >500 64 1361 6512 64 1361 6512
varops 32 96 – – – >500 128 2801 13,424 128 2801 13,424

An exception occurs if the primary input reset is set to 1. Then, the whole execution of the program is reset, i.e. the program counter is set to 0. The updated value of the program counter is used in the next cycle.

The SyReC language has been applied to realize all combinational components of this CPU.³ Being restricted to a reversible language has an impact on the design phase and requires the integration of new design patterns during the implementation. As an example, one new design paradigm becomes already evident in the implementation of the program counter, whose SyReC code is given in Fig. 19. According to the specification, the program counter should be assigned 0 if the primary input reset is assigned 1. Due to a lack of conventional assignment operations, this is realized by a new additional signal (denoted by zero and set to 0) as well as a swapping operation (Line 6 of Fig. 19). Similar design decisions have to be made, e.g. to realize the desired control path or to implement the respective functionality of the ALU. In contrast, the increase of the program counter is a reversible operation and, thus, can easily be implemented by the respective ++= instruction (Line 9). In a similar fashion, all remaining components of the CPU are realized. All resulting SyReC codes are available at RevLib [40].
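For illustration, the reset/increment pattern described above can be emulated as follows. This is a rough sketch in plain C++, not the SyReC code of Fig. 19 (which is not reproduced here); the helper signal zero is assumed to be a constant-0 line as described in the text, and the handling of jump targets is omitted.

    #include <cstdint>
    #include <utility>

    // Sketch of the design pattern only: a reversible-friendly update of the program counter.
    void update_program_counter(std::uint16_t& pc, std::uint16_t& zero,
                                bool reset, bool inc) {
      if (reset) {
        // No plain assignment is available in a reversible setting, so the value 0
        // is brought in by swapping pc with the additional zero signal.
        std::swap(pc, zero);
      } else if (inc) {
        // Incrementing is reversible and corresponds to SyReC's ++= statement.
        ++pc;
      }
      // The jump case (pc taking the value given by jmp) is omitted in this sketch.
    }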
Using the synthesized realizations of these components and plugging them together yields a reversible circuit which is able to process assembler programs as e.g. shown in Fig. 20 (for Fibonacci number computation). Each assembler instruction is translated into a sequence of respective instruction words by applying techniques proposed in [46]. Afterwards, the resulting instruction words are loaded into the program memory, while the data memory is initialized with desired values. Overall, this supports running the translated object code.

Running the program from Fig. 20 on the designed and synthesized CPU yields the waveform given in Fig. 21. The identifiers clk, pc', and instr[15:11] denote the values of the clock signal, the program counter, and the operation code extracted from the instr signal, respectively. Furthermore, the values of the used registers are provided. For the sake of clarity, all other signal values are omitted. Note that the value of the program counter always corresponds to the respective line number of the code given in Fig. 20. As can be seen, the CPU processes this program as intended. Using SyReC, the design of this complex circuitry was significantly easier than with pure Boolean function representations.

³ Note that the sequential elements, i.e. the memories, are realized by an external controller; here emulated by a Python script.

Table 5
Effect of line- and cost-aware synthesis.

Benchmark | Bit-width | Line-aware synth. (Section 5): a.l., QC, TC | Cost-aware synth. (Section 6), if-stm. without add. lines: a.l., QC, TC | Cost-aware synth. (Section 6), if-stm. with add. lines: a.l., QC, TC | Cost-aware + Line-aware synth.: a.l., QC, TC
(a.l. = additional lines)


CPU from Section 7.1

cpu_alu 16 87 1,281,717 1,103,200 350 63,025 112,048 2086 30,208 67,144 88 118,751 215,568
cpu_alu 32 151 4,381,653 3,766,496 654 178,783 337,000 6102 107,136 215,112 152 331,151 648,208
cpu_control_unit 16 57 80,142 87,176 159 10,513 23,808 414 7463 21,448 58 20,756 47,304
cpu_pc 11 13 865 944 14 505 672 69 609 1224 14 513 704
cpu_register 16 17 9848 8512 19 2217 3352 163 2600 5144 18 2232 3392
cpu_register 32 33 19,656 16,832 35 3577 5528 323 4760 9496 34 3592 5568

Benchmarks from RevLib [40]

alu 16 19 516,628 467,184 68 44,782 81,888 116 35,152 72,008 20 88,566 162,208
alu 32 35 3,407,588 2,801,136 132 174,594 319,888 228 150,928 297,224 36 347,198 636,672
alu_flat 16 17 363,012 357,904 69 38,657 76,872 133 35,263 72,312 18 77,002 152,720
alu_flat 32 33 2,760,420 2,353,808 133 158,241 307,208 261 151,135 297,784 34 315,850 612,368
simple_alu 16 19 69,810 77,440 68 8975 21,088 115 6275 17,568 20 17,262 40,816
simple_alu 32 35 287,346 305,536 132 30,775 74,592 227 25,531 67,744 36 60,206 146,544
bubblesort 16 153 34,374 53,512 255 11,615 31,960 749 12,653 36,360 154 13,830 38,920
bubblesort 32 297 68,766 107,320 495 21,827 61,192 1469 24,569 70,968 298 25,950 74,296
callif 16 1 1524 3824 1 1522 3816 33 641 2664 1 1524 3824
callif 32 1 3156 7920 1 3154 7912 65 1313 5480 1 3156 7920
mult_stmts 16 16 11,572 30,704 32 6122 16,960 32 6122 16,960 16 11,572 30,704
mult_stmts 32 32 49,172 126,832 64 25,282 66,752 64 25,282 66,752 32 49,172 126,832
nestedif 16 2 6996 11,056 4 3094 7800 99 1475 5848 3 3108 7856
nestedif 32 2 14,484 22,832 4 6358 15,992 195 3011 11,992 3 6372 16,048
nestedif2 16 3 8504 9072 5 5243 6568 101 3809 5224 4 5324 6784
nestedif2 32 3 31,784 27,952 5 17,269 17,960 197 14,424 15,464 4 17,350 18,176
varops 16 48 2032 9680 64 1361 6512 64 1361 6512 48 2032 9680
varops 32 96 4176 19,920 128 2801 13,424 128 2801 13,424 96 4176 19,920

Table 6
Average values of the respective metrics for all schemes.

Scheme                                    Add. lines   QC          TC

Initial approach (Section 4)
  if-stm. w/o additional lines            120.0        286,037.4   252,612.7
  if-stm. w/ additional lines             559.1        129,734.5   131,327.3
Line-aware scheme (Section 5)             48.8         558,967.7   490,699.7
Cost-aware scheme (Section 6)
  if-stm. w/o additional lines            120.0        34,178.8    67,533.0
  if-stm. w/ additional lines             559.7        27,271.7    58,410.7
Cost- & Line-aware scheme                 49.5         63,610.2    126,376.0

7.2. Evaluation of the resulting circuits

Besides the case study on the applicability of the hardware description language, we also conducted a thorough study on the quality of the resulting circuits. For this purpose, we implemented all synthesis schemes as described above in C++ on top of RevKit [47]. As benchmarks for the evaluation, we used the SyReC specifications from the respective CPU components discussed in the previous section as well as further designs which have been made available at RevLib [40]. All experiments have been performed on a 2.8 GHz Intel Core i7 processor with 7.8 GB of main memory. In the following, the results are summarized and discussed.

7.2.1. Comparison to previous work

In a first evaluation, we compared the quality of the circuits obtained using the initial synthesis scheme (as introduced in Section 4) to previously proposed solutions. As discussed in Section 1, most of the existing synthesis approaches for reversible circuits rely on non-compacted Boolean descriptions and are therefore often not scalable. In fact, the complex circuitry considered here cannot be realized by most of them. The BDD-based synthesis approach presented in [19] represents an exception as it relies on a compacted Boolean representation. Hence, we compared the circuits generated by SyReC with the equivalent realizations generated by the approach from [19].

The results are summarized in Table 4. The first columns give the name of the benchmark, the bit-width of the realization as well as the number of primary inputs and outputs (denoted by Benchmark, Bit-width, and PI/PO, respectively). The following columns give the number of additional circuit lines (Add. lines), the quantum cost (QC), and the transistor cost (TC) of the circuits obtained using the BDD-based approach (denoted by BDD-based synth.) and the SyReC synthesizer. For the latter, we distinguish between the realization of if-statements according to Fig. 11(b) (denoted by if-stm. without add. lines) and according to Fig. 11(c) (denoted by if-stm. with add. lines). For the BDD-based approach, the run-time is additionally listed. This is omitted for the SyReC solution as all circuits have been realized in less than one CPU second.

As can be clearly seen, the proposed approach outperforms the BDD-based synthesis with respect to scalability. In particular for the benchmarks including arithmetic (e.g. the alu realizations), BDD-based synthesis requires a significant amount of time to generate a result; often the results cannot be achieved within the applied timeout of 500 CPU seconds. This can be explained by the fact that, in particular for the multiplication, no efficient representation as a BDD exists. Thus, for these components the BDD-based approach suffers from memory explosion.

Besides that, these results also confirm the discussion from Section 4 concerning the different realizations of the if-statements. If additional circuit lines are applied, the respective costs can be reduced significantly. In comparison to the realization without additional circuit lines for if-statements, approx. 40% (95% in the best cases) of the quantum costs and more than 20% (90% in the best cases) of the transistor costs can be saved; for example, for the 16-bit cpu_alu the quantum costs drop from 662,531 to 31,244. In contrast, this leads to a significant increase in the number of additional lines.

7.2.2. Effect of line- and cost-aware synthesis

In a second evaluation, the effect of the optimized synthesis schemes presented in Section 5 (for line-aware synthesis) and Section 6 (for cost-aware synthesis) has been evaluated. Here, Table 5 presents the results generated with the following schemes:

• The synthesis scheme as described in Section 5 using the realization of if-statements according to Fig. 11(b) (denoted by Line-aware synth.).⁴
• The synthesis scheme as described in Section 6 using the realization of if-statements according to Fig. 11(b) (denoted by Cost-aware synth.; if-stm. without add. lines).
• The synthesis scheme as described in Section 6 using the realization of if-statements according to Fig. 11(c) (denoted by Cost-aware synth.; if-stm. with add. lines).
• The synthesis scheme as described in Sections 5 and 6 combined, together with the realization of if-statements according to Fig. 11(b) (denoted by Cost-aware + Line-aware synth.).

Beyond that, Table 5 uses the same denotation as Table 4. To further ease the interpretation of the numbers, we additionally provide the average values of the respective metrics for all considered synthesis schemes in Table 6.

The observations from above are confirmed. In fact, it becomes clearly evident that the selection of the respective scheme is crucial to the resulting circuit sizes. Differences of several orders of magnitude can be observed for all objectives. On average, the number of additional lines varies from 48.8 (if the line-aware scheme is applied) to 559.7 (if schemes are applied realizing if-statements according to Fig. 11(c)). Similarly, the worst-case quantum costs (transistor costs) of 558,967.7 (490,699.7) can be reduced to 27,271.7 (58,410.7) if cost-aware synthesis and the realization of if-statements with additional lines is applied. However, the two metrics are complementary to each other. That is, if a designer picks the circuit with the best number of additional lines, he also gets the circuit with the worst circuit costs. This is in line with observations previously made e.g. in [48].

Nevertheless, combining the line- and cost-aware schemes provides a good trade-off. In doing so, circuits with 49.5 additional lines (just a bit more than the best result) and quantum costs (transistor costs) of 63,610.2 (126,376.0) (about twice the best result) are achieved on average.

⁴ Note that a realization of if-statements according to Fig. 11(c) has not been considered for this scheme since, as discussed in Section 5.2.2, line-aware synthesis would always lead to an increase in both additional lines and costs in this case.

8. Conclusions

In this paper, we investigated, extended, and evaluated the reversible hardware description language SyReC. Besides new syntactical features, two optimization approaches have been proposed that can be applied to reduce synthesis results based on the designer's individual needs. An adjusted synthesis scheme "uncomputes" intermediate results and therefore allows one to keep the number of additional lines small. A second optimization scheme adds a new line which is then used in order to buffer intermediate values. This allows for a reduction of the size of individual gates and, by this, improves the costs of the circuit.

In an experimental evaluation, we first demonstrated the applicability by means of a case study in which a RISC CPU has been designed using the hardware description language. Furthermore, SyReC's advantage compared to previously proposed synthesis approaches based on Boolean function representations has been shown. Finally, a closer examination has been made for the two optimization schemes. The results showed that a combination of the line- and cost-aware synthesis schemes provides a good trade-off and, hence, leads to a significantly more efficient compromise.

Future work will focus on the development of strategies which further reduce the resulting costs as well as the number of lines in the resulting circuits. This includes the consideration of more efficient building blocks as well as schemes which reduce the number of required building blocks (in particular the ones for "uncomputing"). In parallel, solutions for code optimization, e.g. term rewriting techniques that best exploit the potential of the proposed synthesis method, shall be investigated. Further ideas addressing these objectives have recently been proposed e.g. in [49].

Acknowledgments

This work has partially been supported by the Graduate School SyDe, funded by the German Excellence Initiative within the University of Bremen's institutional strategy, and by the European Union through the COST Action IC1405.

References

[1] R. Landauer, Irreversibility and heat generation in the computing process, IBM J. Res. Dev. 5 (1961) 183.
[2] N. Gershenfeld, Signal entropy and the thermodynamics of computation, IBM Syst. J. 35 (3–4) (1996) 577–586.
[3] C.H. Bennett, Logical reversibility of computation, IBM J. Res. Dev. 17 (6) (1973) 525–532.
[4] B. Desoete, A.D. Vos, A reversible carry-look-ahead adder using control gates, INTEGRATION, VLSI J. 33 (1-2) (2002) 89–104.
[5] A. Berut, A. Arakelyan, A. Petrosyan, S. Ciliberto, R. Dillenschneider, E. Lutz, Experimental verification of Landauer's principle linking information and thermodynamics, Nature 483 (2012) 187–189.
[6] M. Nielsen, I. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, 2000.
[7] D. Deutsch, R. Jozsa, Rapid solution of problems by quantum computation, Proc. R. Soc. Lond. A 439 (1992) 553–558.
[8] L.K. Grover, A fast quantum mechanical algorithm for database search, Theory Comput. (1996) 212–219.
[9] P.W. Shor, Algorithms for quantum computation: discrete logarithms and factoring, Found. Comput. Sci. (1994) 124–134.
[10] R. Wille, R. Drechsler, C. Oswald, A. Garcia-Ortiz, Automatic design of low-power encoders using reversible circuit synthesis, In: Design, Automation and Test in Europe, 2012, pp. 1036–1041.
[11] P. Patra, D. Fussell, On efficient adiabatic design of MOS circuits, In: Workshop on Physics and Computation, Boston, 1996, pp. 260–269.
[12] J. Lim, D.-G. Kim, S.-I. Chae, nMOS reversible energy recovery logic for ultra-low-energy applications, J. Solid-State Circuits 35 (6) (2000) 865–875.
[13] V.V. Shende, A.K. Prasad, I.L. Markov, J.P. Hayes, Synthesis of reversible logic circuits, IEEE Trans. CAD 22 (6) (2003) 710–722.

[14] M. Saeedi, M.S. Zamani, M. Sedighi, Z. Sasanian, Reversible circuit synthesis using a cycle-based approach, J. Emerg. Technol. Comput. Syst. 6 (4), 2010.
[15] D. Maslov, G.W. Dueck, D.M. Miller, Toffoli network synthesis with templates, IEEE Trans. CAD 24 (6) (2005) 807–817.
[16] P. Gupta, A. Agrawal, N.K. Jha, An algorithm for synthesis of reversible logic circuits, IEEE Trans. CAD 25 (11) (2006) 2317–2330.
[17] D. Maslov, G.W. Dueck, D.M. Miller, Techniques for the synthesis of reversible Toffoli networks, ACM Trans. Des. Autom. Electron. Syst. 12 (4), 2007.
[18] K. Fazel, M. Thornton, J. Rice, ESOP-based Toffoli gate cascade generation, In: IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, 2007, pp. 206–209.
[19] R. Wille, R. Drechsler, BDD-based synthesis of reversible logic for large functions, In: Design Automation Conference, 2009, pp. 270–275.
[20] M. Soeken, R. Wille, C. Hilken, N. Przigoda, R. Drechsler, Synthesis of reversible circuits with minimal lines for large functions, In: ASP Design Automation Conference, 2012, pp. 85–92.
[21] R. Lipsett, C. Schaefer, C. Ussery, VHDL: Hardware Description and Design, Kluwer Academic Publishers, Intermetrics, Inc., 1989.
[22] T. Grötker, S. Liao, G. Martin, S. Swan, System Design with SystemC, Kluwer Academic Publishers, 2002.
[23] S. Sutherland, S. Davidmann, P. Flake, SystemVerilog for Design and Modeling, Kluwer Academic Publishers, 2004.
[24] A.S. Green, P.L. Lumsdaine, N.J. Ross, P. Selinger, B. Valiron, Quipper: a scalable quantum programming language, In: Conference on Programming Language Design and Implementation, 2013, pp. 333–342.
[25] A.J. Abhari, A. Faruque, M.J. Dousti, L. Svec, O. Catu, A. Chakrabati, C.-F. Chiang, S. Vanderwilt, J. Black, F. Chong, M. Martonosi, M. Suchara, K. Brown, M. Pedram, T. Brun, Scaffold: Quantum Programming Language, 〈http://ftp://ftp.cs.princeton.edu/techreports/2012/934.pdf〉, 2012.
[26] S.J. Gay, Quantum programming languages: survey and bibliography, Math. Struct. Comput. Sci. 16 (4) (2006) 581–600.
[27] R. Wille, S. Offermann, R. Drechsler, SyReC: a programming language for synthesis of reversible circuits, Forum Specif. Des. Lang. (2010) 184–189.
[28] C. Lutz, Janus: a time-reversible language, Letter to R. Landauer, 〈http://www.tetsuo.jp/ref/janus.html〉, 1986.
[29] T. Yokoyama, R. Glück, A reversible programming language and its invertible self-interpreter, In: Symposium on Partial Evaluation and Semantics-Based Program Manipulation, 2007, pp. 144–153.
[30] M.K. Thomsen, A functional language for describing reversible logic, Forum Specif. Des. Lang. (2012) 135–142.
[31] M.K. Thomsen, Describing and optimising reversible logic using a functional language, Implem. Appl. Funct. Lang. (2011) 148–163.
[32] D. Maslov, G.W. Dueck, Reversible cascades with minimal garbage, IEEE Trans. CAD 23 (11) (2004) 1497–1509.
[33] R. Wille, O. Keszöcze, R. Drechsler, Determining the minimal number of lines for large reversible circuits, In: Design, Automation and Test in Europe, 2011, pp. 1204–1207.
[34] T. Toffoli, Reversible Computing, In: W. de Bakker, J. van Leeuwen (Eds.), Automata, Languages and Programming, Springer, vol. 632, Technical Memo MIT/LCS/TM-151, MIT Lab for Computer Science, 1980.
[35] E.F. Fredkin, T. Toffoli, Conservative logic, Int. J. Theoret. Phys. 21 (3/4) (1982) 219–253.
[36] A. Barenco, C.H. Bennett, R. Cleve, D. DiVinchenzo, N. Margolus, P. Shor, T. Sleator, J. Smolin, H. Weinfurter, Elementary gates for quantum computation, Phys. Rev. A 52 (1995) 3457–3467.
[37] D.M. Miller, R. Wille, Z. Sasanian, Elementary quantum gate realizations for multiple-control Toffoli gates, In: International Symposium on Multi-Valued Logic, 2011, pp. 288–293.
[38] M.-L. Chuang, C.-Y. Wang, Synthesis of reversible sequential elements, In: ASP Design Automation Conference, 2007, pp. 420–425.
[39] M. Lukac, M. Perkowski, Quantum finite state machines as sequential quantum circuits, In: International Symposium on Multi-Valued Logic, 2009, pp. 92–97.
[40] R. Wille, D. Große, L. Teuber, G.W. Dueck, R. Drechsler, RevLib: an online resource for reversible functions and reversible circuits, In: International Symposium on Multi-Valued Logic, 2008, pp. 220–225. RevLib is available at 〈http://www.revlib.org〉.
[41] Y. Takahashi, N. Kunihiro, A linear-size quantum circuit for addition with no ancillary qubits, Quant. Inf. Comput. 5 (2005) 440–448.
[42] R. Wille, M. Soeken, E. Schönborn, R. Drechsler, Circuit line minimization in the HDL-based synthesis of reversible logic, In: IEEE Annual Symposium on VLSI, 2012, pp. 213–218.
[43] H.B. Axelsen, Clean translation of an imperative reversible programming language, In: International Conference on Compiler Construction, 2011, pp. 144–163.
[44] D.M. Miller, R. Wille, R. Drechsler, Reducing reversible circuit cost by adding lines, In: International Symposium on Multi-Valued Logic, 2010, pp. 217–222.
[45] R. Wille, M. Soeken, D. Große, E. Schönborn, R. Drechsler, Designing a RISC CPU in reversible logic, In: International Symposium on Multi-Valued Logic, 2011, pp. 170–175.
[46] D. Große, U. Kühne, R. Drechsler, HW/SW co-verification of embedded systems using bounded model checking, In: ACM Great Lakes Symposium on VLSI, 2006, pp. 43–48.
[47] M. Soeken, S. Frehse, R. Wille, R. Drechsler, RevKit: an open source toolkit for the design of reversible circuits, In: Reversible Computation 2011, Lecture Notes in Computer Science, vol. 7165, pp. 64–76. RevKit is available at 〈http://www.revkit.org〉, 2012.
[48] R. Wille, M. Soeken, D.M. Miller, R. Drechsler, Trading off circuit lines and gate costs in the synthesis of reversible logic, INTEGRATION, VLSI J. 47 (2) (2014) 284–294.
[49] Z. Al-Wardi, R. Wille, R. Drechsler, Towards line-aware realizations of expressions for HDL-based synthesis of reversible circuits, In: Reversible Computation, 2015.

Robert Wille received the Diploma and Dr.-Ing. degrees in computer science from the University of Bremen, Bremen, Germany, in 2006 and 2009, respectively. From 2006 to 2015, he has been with the Group of Computer Architecture, University of Bremen, and, since 2013, the German Research Center for Artificial Intelligence, Bremen. He was a Lecturer with the University of Applied Science, Bremen, and a Visiting Professor with the University of Potsdam, Potsdam, Germany, and Technical University Dresden, Dresden, Germany. Since 2015, he is full professor at the Johannes Kepler University Linz, Austria. In the nine years of his research activity, he has published over 100 papers in journals and conferences and served in program committees of numerous conferences such as ASPDAC, DAC, and ICCAD. His current research interests include design of circuits and systems for both conventional and emerging technologies with a focus in the domain of verification and proof engines.

Eleonora Schönborn received the Diploma degree in computer science from the University of Bremen, Bremen, Germany, in 2012. From 2012 to 2015 she was with the Group of Computer Architecture, University of Bremen, and the Graduate School System Design, a cooperation of the University of Bremen with the German Research Center for Artificial Intelligence (DFKI) and the German Aerospace Center (DLR). Her research interests include the design and synthesis of reversible circuits and systems.

Mathias Soeken received the Dr.-Ing. degree in computer science from the University of Bremen in 2013. Since 2009, he is with the Group of Computer Architecture at the University of Bremen and, since 2012, with the German Research Center for Artificial Intelligence (DFKI). His research interests are in electronic design automation, formal verification, natural language processing, and circuit complexity. Since 2012, Mathias Soeken teaches graduate courses at the University of Bremen.

Rolf Drechsler received the Diploma and Dr. phil. nat. degrees in computer science from the J. W. Goethe University Frankfurt am Main, Frankfurt am Main, Germany, in 1992 and 1995, respectively. He was with the Institute of Computer Science, Albert-Ludwigs University, Freiburg im Breisgau, Germany, and with the Corporate Technology Department, Siemens AG, Munich, Germany. Since October 2001, he has been with the University of Bremen, Bremen, Germany, where he is currently a Full Professor and the Head of the Group for Computer Architecture, Institute of Computer Science. Since 2011, he is also the Director of the Cyber-Physical Systems group at the German Research Center for Artificial Intelligence (DFKI) in Bremen. His research interests include the development and design of data structures and algorithms with a focus on circuit and system design. In these areas, he published more than 250 papers and served in program committees of international conferences such as DAC, DATE, and ICCAD.
