Sum Propagate Adders

This article introduces a novel approach to binary addition called sum-propagate addition, which replaces the traditional carry-propagation method by directly propagating sum bits. The authors present new parallel-prefix structures and an associative prefix operator to facilitate this new addition method, which is expected to perform better with future technologies. While current implementations favor carry-propagate adders, the proposed sum-propagate adders may offer advantages as technology advances, particularly in FPGA applications.

Uploaded by

Pradeep K

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TETC.2021.3068729, IEEE Transactions on Emerging Topics in Computing

Sum Propagate Adders


Giorgos Dimitrakopoulos, Kleanthis Papachatzopoulos, and Vassilis Paliouras, Member, IEEE

Abstract—Binary adders are present in every digital computer system. Even if their structure has evolved significantly over the last
decades following the progress in logic and circuit design, the scaling of implementation technologies, and the improvement of logic
synthesis tools, the fundamental carry-propagation algorithm that guides their operation remains unchanged. This work takes a
different path and explores the possibility of performing addition by propagating directly the sum bits of previous bit positions instead of
carries. The transformation of binary carry-propagate addition to an equivalent sum propagate addition opens up a whole new design
space that spans from ripple-sum to sum-lookahead adders. New parallel-prefix structures that follow the sum-propagation paradigm
are presented using a newly introduced associative prefix operator. Sum-propagate and carry-propagate adders have asymptotically
the same area and delay complexity. In practice, however, carry-propagate adders exhibit better characteristics when implemented in
currently established implementation technologies. This gap is expected to narrow in the future with multiple-independent-gate
transistors, a promising functionality-enhanced beyond-CMOS device technology that allows the cost-efficient implementation of
the AND-XOR operations involved in sum-propagate adders.

Index Terms—Binary addition, Parallel prefix adders, FPGA adders, Computer arithmetic, Logic design

1 INTRODUCTION

Adder design is under constant evolution, following the evolution of implementation technologies and the discovery of new logic and circuit design techniques [1], [2], [3]. Irrespective of how many transformations the logic design of adders has undergone over the last decades, the one attribute that has remained unchanged is the fundamental algorithm of carry propagation used for implementing binary addition. In all cases, for computing the sum of two numbers, carries are generated locally and propagated to the next bit positions, producing new carries that in turn compute the needed sum bits.

This work explores, for the first time to the best of our knowledge, the possibility of performing addition by propagating directly the sum bits of the previous bit positions, instead of carries. Carry generation/propagation is actually hidden and the computation of each sum bit is based solely on local information and the sum bit of the previous bit position. This new sum-only recursive bit-level algorithm allows the design of new sum-propagate adders that can be useful in both ASIC and FPGA chips. For instance, in the case of FPGA chips, the introduced sum-propagation chains can readily replace the existing hard carry chains with simpler sum-only chain structures.

To enable fast and regular sum-propagate adders, a novel associative parallel-prefix operator is introduced that allows the design of a new family of sum-lookahead adders based on the parallel-prefix paradigm.

The introduced formulation of sum-propagate addition allows the design of sum-propagate adders that are asymptotically equivalent, in terms of logic levels and number of gates, to carry-propagate adders. However, their internal structure differs and this leads to measurable performance differences in terms of delay. The operation of sum-propagate adders is based on a series of AND-XOR operations, which are slower in currently available CMOS standard-cell libraries than the AND-OR operations involved in carry-propagate adders. This result is verified experimentally by synthesizing both sum-propagate and carry-propagate adders with the Cadence digital implementation flow, using commercial-grade 28nm standard-cell libraries under two operating voltages.

The delay overhead of sum-propagate adders is expected to shrink in the future with the adoption of multiple-independent-gate nanowire transistors [4], which are promising functionality-enhanced beyond-CMOS device technologies [5]. To support this argument, we implemented the same adders under comparison using a 10nm triple-independent-gate CMOS technology [6], [7] and measured the delay of each design in Spectre-Spice. The availability of three independent gates per transistor reduces the logic depth of sum-propagate adders, which in effect reduces the delay gap across various adder architectures.

Overall, the introduced direct sum-propagation paradigm and its parallel-prefix computation is an interesting theoretical result that can prove useful: (a) for adder design in future implementation technologies or FPGAs, (b) for the design of approximate adders or in-memory adders and other addition-related operations, or even (c) for handling addition in logic synthesis flows that focus on the optimization of AND-XOR logic for preserving security properties [8], such as the ones introduced in [9], [10].

The remainder of the paper is organized as follows: Section 2 revisits the basics of carry-propagate addition. Section 3 introduces sum-propagate adders. Section 4 organizes sum-lookahead computation in a parallel-prefix structure, while Section 5 presents the experimental results. Finally, conclusions are drawn in Section 6.

Giorgos Dimitrakopoulos is with the Electrical and Computer Engineering Department, Democritus University of Thrace, Xanthi, Greece. (e-mail: [email protected]).

Kleanthis Papachatzopoulos and Vassilis Paliouras are with the Electrical and Computer Engineering Department, University of Patras, Patras, Greece. (e-mail: [email protected], [email protected]).

2168-6750 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: Carleton University. Downloaded on June 03,2021 at 04:05:43 UTC from IEEE Xplore. Restrictions apply.

Fig. 1. (a) Ripple-carry adder using AND-OR carry propagation. (b) MUX-based equivalent logic employed at FPGA hard carry chains.

2 BASICS OF CARRY-PROPAGATE ADDERS

When adding two n-bit binary numbers $A = A_{n-1} A_{n-2} \ldots A_0$ and $B = B_{n-1} B_{n-2} \ldots B_0$, the sum bit $S_i$ at the i-th bit position is computed by combining the modulo-2 sum (exclusive OR) of bits $A_i$ and $B_i$, i.e., $H_i = A_i \oplus B_i$ (XOR), and the carry $C_{i-1}$ computed in the previous bit position:

$$S_i = H_i \oplus C_{i-1}. \tag{1}$$

For computing the sum bits of the following bit positions, the incoming carry $C_{i-1}$ needs to propagate to the next position. The carry out of the i-th bit position $C_i$ is computed using the local carry generate and propagate bits $G_i = A_i B_i$ (AND) and $P_i = A_i + B_i$ (OR) and the fundamental recursive carry propagation formula

$$C_i = G_i + P_i C_{i-1}. \tag{2}$$

For short bit widths, ripple-carry adder structures [2] offer compact implementations. Fig. 1 depicts two versions of ripple-carry cells [3]. In Fig. 1(a) carry propagation follows the AND-OR logic of (2), where $H_i$ is legally used in place of $P_i$, while in Fig. 1(b) carry out is computed using equivalent MUX-based logic. The latter structure is adopted in FPGA chips that employ dedicated carry chains for carry propagation instead of mapping addition solely on lookup tables (LUTs) [11].

For increased bit widths, carry-skip adders [12], [13] have been proposed that improve the linear growth of carry chain delay by allowing carries to skip across blocks of bits, rather than rippling through them [14].

Carry-lookahead adders can reduce the delay further [15], [16], [17]. In this case, carries are computed in parallel and addition is implemented in logarithmic logic depth. Mapping carry-lookahead adders to parallel-prefix structures maximized their efficiency and also increased their placement and wiring regularity [18], [19], [20], [21], [22], [23]. Carry computation is transformed to a prefix problem [18] by using the associative operator $\circ$ that associates pairs of generate and propagate bits as follows:

$$(G, P) \circ (G', P') = (G + P G',\; P P').$$

In a series of consecutive associations of generate and propagate pairs, $(G_{k:j}, P_{k:j})$ denotes the group generate and propagate term produced out of bits $k, k-1, \ldots, j$,

$$(G_{k:j}, P_{k:j}) = (G_k, P_k) \circ (G_{k-1}, P_{k-1}) \circ \ldots \circ (G_j, P_j), \tag{3}$$

and carry $C_i$ corresponds to $G_{i:0}$.

Fig. 2. The (a) Kogge-Stone [20] and (b) Ladner-Fischer [19] parallel-prefix carry-propagate adders.

Fig. 3. The blocks used in the (a) first and (b) last stage of a parallel-prefix adder. (c) The logic-level implementation of the prefix operator $\circ$. (d) The simplified operator $\circ$ used in the last node of each column.

Fig. 2 depicts two parallel-prefix carry-propagate adders and Fig. 3 highlights the logic-level implementation of their basic blocks. Each adder is organized in three stages. The pre-processing stage computes bits $G_i$, $P_i$ and $H_i$. Parallel-prefix carry computation is done in the second stage using black and grey nodes that implement (3), whereas the last stage computes the sum bits according to (1).

To avoid the large wiring overhead of some parallel-prefix structures, hybrid designs have been proposed that compute only a subset of the required carries, while the intermediate ones are pre-computed in parallel using conditional sum computation [24], [25], [26]. Conditional sum can be simplified using select-prefix/carry-increment blocks [27]. Also, fully unrolling the concept of conditional sum computation [28] allows the design of adders using parallel-prefix structures that consist of multiplexer-based prefix operators.

Ling in [29] simplified the definition of each carry by propagating the OR of two consecutive carries. This allowed the design of reduced logic depth parallel-prefix adders [30] as well as fast ripple-carry adders for small bitwidths [31], [32]. Exploring similar concepts, Dimitrakopoulos et al. in [33] introduced carries that exhibit lower switching activity than normal carries and can lead to reduced-complexity parallel-prefix structures.

3 SUM-PROPAGATE ADDITION

To transform binary addition from a carry-propagate operation to a sum-propagate operation, we need to associate the computation of each sum bit $S_i$ directly with the sum bit of the previous bit position $S_{i-1}$ without using a carry bit as an intermediate. This is achieved by Lemma 1, assuming that Boolean AND takes precedence over Boolean XOR, i.e., $a \oplus bc$ should be read as $a \oplus (bc)$.

Lemma 1. $S_i = X_i \oplus H_{i-1} S_{i-1}$, where $X_i = H_i \oplus P_{i-1}$.

Proof. First, it will be helpful to express the fundamental carry operation (2) in a XOR-AND form. To do this we use the property that for any two bits $a, b$ it holds that $a + b = a \oplus b \oplus ab$.

$$C_i = G_i + P_i C_{i-1} = G_i \oplus P_i C_{i-1} \oplus G_i P_i C_{i-1}$$

Since $G_i P_i = G_i$ and $P_i \oplus G_i = H_i$,

$$C_i = G_i \oplus P_i C_{i-1} \oplus G_i C_{i-1} = G_i \oplus (P_i \oplus G_i) C_{i-1} = G_i \oplus H_i C_{i-1} \tag{4}$$

Second, using (1), we can express the carry bits $C_i$ and $C_{i-1}$ as a function of the corresponding sum bits they are meant to compute. For instance, by adding modulo-2 (XOR) the term $H_i$ to both sides of (1) we get:

$$H_i \oplus S_i = H_i \oplus H_i \oplus C_{i-1}.$$

Simplifying $H_i$ from the right side we get that

$$H_i \oplus S_i = C_{i-1}. \tag{5}$$

Equivalently, for the next bit position we can write that

$$C_i = S_{i+1} \oplus H_{i+1}. \tag{6}$$

Substituting (5) and (6) into the XOR-based recursive carry formula (4) and recognizing that $G_i \oplus H_i = P_i$ we get:

$$S_{i+1} \oplus H_{i+1} = G_i \oplus H_i (S_i \oplus H_i) = G_i \oplus H_i \oplus H_i S_i = P_i \oplus H_i S_i \tag{7}$$

Adding modulo-2 (XOR) the term $H_{i+1}$ to both sides of (7) and simplifying $H_{i+1}$ from the left side, we get

$$S_{i+1} = H_{i+1} \oplus P_i \oplus H_i S_i$$

Defining $X_{i+1} = H_{i+1} \oplus P_i$ (or equivalently $X_i = H_i \oplus P_{i-1}$) we can write the sum bit of the i-th bit position as $S_i = X_i \oplus H_{i-1} S_{i-1}$.

Lemma 1 removes the dependency of addition on carry bits, while the actual information of carry propagation is handled implicitly by the direct propagation of previous sum bits. The addition is carry free in the sense that no carry is generated or propagated, while the addition still remains non-redundant. The arithmetic example in Fig. 4 shows the values of the intermediate signals $H_i$, $P_i$ and $X_i$ needed for the recursive computation of the sum bits. As highlighted in Fig. 4, the sum bit at position 4, $S_4$, is computed using $X_4$, $H_3$ and the previous sum bit $S_3$. For bit position 0, we assume that $H_{-1} = P_{-1} = 0$.

Fig. 4. Arithmetic example of adding two 8-bit numbers using the recursive formula of Lemma 1.

The most primitive form of a binary adder that can be designed using this new formulation is the ripple-sum adder. Fig. 5 depicts a 4-bit ripple-sum adder. The H and P bits are computed directly from the input bits in one logic level (the G bits used in a carry-propagate adder are not needed). Then those signals are used to compute the X bits according to the definition in Lemma 1. The computation of the i-th sum bit $S_i$ requires the pair $(X_i, H_{i-1})$ and the sum bit computed in the previous bit position $S_{i-1}$. To achieve this, the sum produced at each bit position drives the adder's output and also feeds the next bit position in a ripple manner.

3.1 Carry interfaces

For backward compatibility, sum-propagate adders need to support carry-in and carry-out interfaces. The easiest way to incorporate a carry-in signal into the proposed sum-propagate adders is to connect it to $P_{-1}$. By setting $P_{-1} = C_{in}$ and keeping $H_{-1} = 0$, we get that

$$S_0 = X_0 = H_0 \oplus P_{-1} = H_0 \oplus C_{in}, \tag{8}$$

which matches the definition of the least-significant sum bit according to the basic definition (1). In this way, the carry-in signal is incorporated into the computation of the remaining sum bits, since they are recursively derived from $S_0$, following Lemma 1.

A carry-out bit can be derived from the P and H bits of the most significant bit position and the most significant sum bit according to Lemma 2.

Lemma 2. $C_{out} = C_{n-1} = P_{n-1} \oplus H_{n-1} S_{n-1}$


Proof. From (6), $C_{n-1} = S_n \oplus H_n$. Replacing $S_n$ with its definition from Lemma 1, we get that

$$C_{n-1} = (X_n \oplus H_{n-1} S_{n-1}) \oplus H_n$$

Expanding the definition of $X_n$ and simplifying the redundant terms $H_n$, we get the needed result.

$$C_{n-1} = H_n \oplus P_{n-1} \oplus H_{n-1} S_{n-1} \oplus H_n = P_{n-1} \oplus H_{n-1} S_{n-1}$$

Fig. 5. A 4-bit ripple-sum adder. Addition is based solely on sum bit propagation. Carry input/output interfaces also included.

Fig. 6. (a) Ripple-sum adder mapped in a series of 5-input LUTs. (b) Sum-propagation multiplexer chain that can replace FPGA carry chains.
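The recursion of Lemma 1, the carry-in wiring of (8) and the carry-out of Lemma 2 can be exercised together in a small bit-level model. The following is a minimal Python sketch, not the authors' implementation: the function names `ripple_sum_add` and `ripple_carry_add` are hypothetical, and the reference adder simply follows (1) and (2) so that the two recurrences can be compared exhaustively for a small bit width.

```python
def ripple_carry_add(a, b, n, cin=0):
    """Reference model: carry recurrence of equations (1) and (2)."""
    c, total = cin, 0
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        h, g, p = ai ^ bi, ai & bi, ai | bi
        total |= (h ^ c) << i            # S_i = H_i xor C_{i-1}
        c = g | (p & c)                  # C_i = G_i + P_i C_{i-1}
    return total, c


def ripple_sum_add(a, b, n, cin=0):
    """Ripple-sum model: Lemma 1 for the sum bits, Lemma 2 for the carry out."""
    # Boundary conditions of Section 3.1: H_{-1} = 0 and P_{-1} = Cin.
    h_prev, p_prev, s_prev = 0, cin, 0
    total = 0
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        h, p = ai ^ bi, ai | bi          # no generate bit G_i is needed
        x = h ^ p_prev                   # X_i = H_i xor P_{i-1}
        s = x ^ (h_prev & s_prev)        # S_i = X_i xor H_{i-1} S_{i-1}
        total |= s << i
        h_prev, p_prev, s_prev = h, p, s
    cout = p_prev ^ (h_prev & s_prev)    # Cout = P_{n-1} xor H_{n-1} S_{n-1}
    return total, cout


if __name__ == "__main__":
    n = 4
    for a in range(1 << n):
        for b in range(1 << n):
            for cin in (0, 1):
                assert ripple_sum_add(a, b, n, cin) == ripple_carry_add(a, b, n, cin)
    print("ripple-sum and ripple-carry models agree for all 4-bit inputs")
```

Note that the only state carried from one bit position to the next in `ripple_sum_add` is the previous sum bit together with the locally computed H and P bits; no carry signal appears inside the loop.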
Fig. 5 also highlights the extra circuit needed in the least and the most-significant bit position for handling the carry-in and carry-out signals, respectively.

3.2 Ripple sum adders for FPGAs

The introduced sum-propagate adders open up new possibilities on how adders can be implemented in FPGAs. At the moment, FPGAs, besides the adders built inside DSP blocks, can implement binary addition either using only LUTs or using a combination of LUTs and the built-in carry chains for higher speed [11].

In the first case, of using only LUTs for implementing addition, the removal of carry propagation allows the design of ripple-sum adders built only with 5-input LUTs, as shown in Fig. 6. The sum bits at the output of each LUT can feed the next 5-input LUT without any additional logic.

In the second case, we argue that hard carry chains are redundant and can be replaced by equivalent sum-propagate chains, thus removing the need for additional logic that would compute sum bits from the computed carries. Actually, the multiplexer structures used in those carry chains can be replaced by equivalent multiplexer-based structures for propagating directly the corresponding sum bits. Using two 4-input LUTs per bit position, one can produce from Lemma 1 two speculative variants of each sum bit, i.e., $S_i^0$ and $S_i^1$, assuming that $S_{i-1}$ is equal to 0 or 1, respectively.

$$S_i^0 = X_i \quad \text{and} \quad S_i^1 = X_i \oplus H_{i-1}$$

Then, as shown in Fig. 6(b), the correct sum bit is derived by selecting one of the two speculative results. The selection is done by the sum bit of the previous bit position that drives the select signal of the corresponding multiplexer.

4 PARALLEL PREFIX SUM LOOKAHEAD ADDERS

Unrolling the recursive definition of sum bits given in Lemma 1 allows the design of sum-lookahead adders. In this case, the sum bits are computed in parallel directly from the input bits without the need for ripple-like sum propagation.

$$S_i = X_i \oplus \bigoplus_{j=0}^{i-1} \left( \prod_{k=j}^{i-1} H_k \right) X_j \tag{9}$$

For example, in the case of a 4-bit adder the sum bits are given by the following equations:

$$S_0 = X_0$$
$$S_1 = X_1 \oplus H_0 X_0$$
$$S_2 = X_2 \oplus H_1 X_1 \oplus H_1 H_0 X_0$$
$$S_3 = X_3 \oplus H_2 X_2 \oplus H_2 H_1 X_1 \oplus H_2 H_1 H_0 X_0$$

In traditional carry-propagate adders, parallel-prefix carry computation introduces structure and regularity to the lookahead computation of carries.

We would like to take advantage of the favorable properties of parallel-prefix computation to design efficient and regular sum-lookahead adders. To do this, we introduce the operator $\odot$, defined as

$$(x, h) \odot (x', h') = (x \oplus h\, x',\; h\, h'). \tag{10}$$

Then, by leveraging operator $\odot$ and Lemmas 3 and 4, we can compute the sum bits following the parallel-prefix paradigm.

Lemma 3. Let

$$(X_{i:0}, H_{i-1:0}) = \begin{cases} (X_0, H_{-1}), & \text{if } i = 0 \\ (X_i, H_{i-1}) \odot (X_{i-1:0}, H_{i-2:0}), & \text{if } 1 \le i \le n-1 \end{cases}$$

then $S_i = X_{i:0}$, for all $i = 0, 1, \ldots, n-1$.
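The unrolled expression (9), the operator of (10) and the recursion stated in Lemma 3 can be checked numerically against ordinary integer addition. The sketch below is an illustration under stated assumptions rather than the authors' code: the helper names `op`, `xh_pairs`, `sums_by_prefix` and `sums_unrolled` are hypothetical, the carry-in is taken as zero (so $H_{-1} = P_{-1} = 0$), and the prefix terms are evaluated serially, which is enough to confirm the algebra but does not model any particular parallel-prefix topology.

```python
def op(left, right):
    """Prefix operator of (10): (x, h) . (x', h') = (x xor h x', h h')."""
    x, h = left
    x2, h2 = right
    return (x ^ (h & x2), h & h2)


def xh_pairs(a, b, n):
    """Return the X bits, the H bits and the pairs (X_i, H_{i-1}), with H_{-1} = P_{-1} = 0."""
    H = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    P = [((a >> i) | (b >> i)) & 1 for i in range(n)]
    X = [H[i] ^ (P[i - 1] if i > 0 else 0) for i in range(n)]
    H_prev = [H[i - 1] if i > 0 else 0 for i in range(n)]
    return X, H, list(zip(X, H_prev))


def sums_by_prefix(a, b, n):
    """Sum bits via the recursion of Lemma 3: S_i is the X part of the group i:0."""
    _, _, pairs = xh_pairs(a, b, n)
    group = pairs[0]                     # (X_0, H_{-1})
    sums = [group[0]]
    for i in range(1, n):
        group = op(pairs[i], group)      # (X_i, H_{i-1}) . (X_{i-1:0}, H_{i-2:0})
        sums.append(group[0])
    return sums


def sums_unrolled(a, b, n):
    """Sum bits via the unrolled sum-lookahead expression (9)."""
    X, H, _ = xh_pairs(a, b, n)
    sums = []
    for i in range(n):
        s = X[i]
        for j in range(i):
            term = X[j]
            for k in range(j, i):        # product H_j ... H_{i-1}
                term &= H[k]
            s ^= term
        sums.append(s)
    return sums


if __name__ == "__main__":
    n = 5
    for a in range(1 << n):
        for b in range(1 << n):
            expected = [((a + b) >> i) & 1 for i in range(n)]
            assert sums_by_prefix(a, b, n) == expected
            assert sums_unrolled(a, b, n) == expected
    print("Equation (9) and the prefix recursion match integer addition for n = 5")
```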


Proof. We prove the lemma by induction on i. For $i = 0$, $S_0 = X_0 = H_0$, which is true according to (1) assuming a carry-in ($C_{-1}$) equal to zero and $H_{-1} = P_{-1} = 0$.

For $i \ge 1$, let us assume that $S_{i-1} = X_{i-1:0}$. Then,

$$(X_{i:0}, H_{i-1:0}) = (X_i, H_{i-1}) \odot (X_{i-1:0}, H_{i-2:0}) = (X_i, H_{i-1}) \odot (S_{i-1}, H_{i-2:0}) = (X_i \oplus H_{i-1} S_{i-1},\; H_{i-1} H_{i-2:0})$$

Thus $X_{i:0} = X_i \oplus H_{i-1} S_{i-1}$. Since, according to Lemma 1, $S_i = X_i \oplus H_{i-1} S_{i-1}$, we get $S_i = X_{i:0}$ and the result follows by induction.

Fig. 7. Examples of 8-bit sum-propagate parallel-prefix adders. Panels: (a) Ladner-Fischer, (b) Kogge-Stone, (c) Roy - auto synthesis, (d) Knowles [1, 2, 2], (f) Han-Carlson, (g) Sum Increment.

Lemma 4. The operator $\odot$ is associative.

Proof. For any $(X_3, H_2) \odot (X_2, H_1) \odot (X_1, H_0)$ we have

$$[(X_3, H_2) \odot (X_2, H_1)] \odot (X_1, H_0) = (X_3 \oplus H_2 X_2,\; H_2 H_1) \odot (X_1, H_0) = (X_3 \oplus H_2 X_2 \oplus H_2 H_1 X_1,\; H_2 H_1 H_0)$$

$$(X_3, H_2) \odot [(X_2, H_1) \odot (X_1, H_0)] = (X_3, H_2) \odot (X_2 \oplus H_1 X_1,\; H_1 H_0) = (X_3 \oplus H_2 X_2 \oplus H_2 H_1 X_1,\; H_2 H_1 H_0)$$

The right-hand sides of both expressions are equal.

Using the operator $\odot$ and the definition of Lemma 3, in a series of consecutive associations of X and H bits the


notation $(X_{k:j}, H_{k-1:j-1})$ is used to denote the group term produced out of bits $k, k-1, \ldots, j$.

$$(X_{k:j}, H_{k-1:j-1}) = (X_k, H_{k-1}) \odot (X_{k-1}, H_{k-2}) \odot \cdots \odot (X_j, H_{j-1})$$

According to Lemma 3, for the computation of the i-th sum bit $S_i$ only the term $X_{i:0}$ is needed. For example, in the case of 4 bits, the four sum bits can be computed using the operator $\odot$ as follows:

$$S_0 : (X_0, H_{-1})$$
$$S_1 : (X_1, H_0) \odot (X_0, H_{-1})$$
$$S_2 : (X_2, H_1) \odot (X_1, H_0) \odot (X_0, H_{-1})$$
$$S_3 : (X_3, H_2) \odot (X_2, H_1) \odot (X_1, H_0) \odot (X_0, H_{-1})$$

Fig. 7 presents six variants of the new 8-bit parallel-prefix sum-propagate adders using the topologies proposed by Ladner-Fischer [19], Kogge-Stone [20], one representative of the Knowles adders [21] and Roy [23] for parallel carry computation. Also, a Han-Carlson [34] and a sum-increment adder [2] (analogous to carry-increment topologies) are presented.

The logic-level implementation of the basic cells used in a parallel-prefix sum-propagate adder is shown in Fig. 8. The first stage computes in parallel the bits $P_i$ and $H_i$, while the second stage computes in one logic level the pairs $(X_i, H_{i-1})$. Then, the sum bits are produced in a parallel-prefix manner using $\log_2 n$ levels. The last node of each bit column requires a simpler implementation of the operator $\odot$ since only a group X term of the form $X_{i:0}$ needs to be computed.

Fig. 8. The logic-level implementation of the basic blocks used to construct parallel-prefix sum-propagate adders.

Fig. 9. Idempotency does not hold for the operator $\odot$. (a) Valid connection with no overlap. (b) Invalid connectivity.

The proposed parallel-prefix sum-propagate adders can take many forms, leading to a whole family of parallel-prefix adders. However, not all structures are possible since the operator $\odot$ is not idempotent, i.e., $(x, h) \odot (x, h) \ne (x, h)$. Therefore, parallel-prefix structures that compute overlapping groups of $(X_{i:j}, H_{i-1:j-1})$ should be avoided. For instance, Fig. 9(a) depicts a valid parallel-prefix structure, while the one presented in Fig. 9(b) is illegal, since for the computation of group 2:0 it associates the terms 2:1 and 1:0 that overlap at bit position 1.

Finally, it should be noted that although the proposed parallel-prefix sum-lookahead adders structurally resemble the traditional carry-lookahead adders and can be built using the same construction principles, their logic-level netlists are completely different. Their fundamental differences can be easily identified in Fig. 10, which presents the unfolded logic-level implementation of an 8-bit Kogge-Stone sum-propagate adder.

5 EXPERIMENTAL RESULTS

The experimental results aim at highlighting the delay performance of conventional parallel-prefix carry-propagate (CP) adders and the introduced sum-propagate (SP) adders under two scenarios. In the first case, we employed a well-established digital design flow using a commercial-grade standard-cell library. In the second case, the adders under comparison were implemented with multiple-independent-gate nanowire transistors [4], which are promising functionality-enhanced beyond-CMOS device technologies, and their delay was measured in Spice.

5.1 Unit gate model analysis

Before moving on to the actual hardware measurements, it would be interesting to discuss analytically the properties of SP adders relative to their CP counterparts. Asymptotically, SP and CP adders have the same properties. For instance, sum bits can be computed in as few as $2 + 2 \log_2 n$ logic levels using minimum-depth parallel-prefix SP or CP adders, assuming that each 2-input gate counts as one, independent of its type. This can be easily verified by comparing the CP and SP Kogge-Stone adders of Fig. 2(a) and Fig. 7(b), respectively.

Also, SP and CP adders can be built using the same number of 2-input gates. In both cases, every prefix operator consists of three or two gates and introduces a delay of two gates in series. This can be easily verified by the 8-bit example SP adder shown in Fig. 10. The depicted adder consists of 70 gates, which is equal to the number of gates used by the Kogge-Stone CP adder shown in Fig. 2(a).

However, such favorable properties do not translate to equivalent performance in practice, since the basic AND-XOR sum propagation operation is fundamentally slower than the basic AND-OR carry propagation operation in current CMOS implementations. Also, implementing AND-XOR with discrete gates adds more transistors overall than the combined AND-OR gates (either AND-OR-Invert or OR-AND-Invert) that can be used in the CP adders.

5.2 Logic synthesis results

To quantify the area-delay gap of SP adders with respect to CP adders, various SP and CP adder architectures were implemented in Verilog and synthesised to a commercial 28 nm


standard-cell library using Cadence Genus. Synthesis was performed for two different operating voltages: a typical case of 1V and a low-voltage scenario of 0.72V. In both cases, cells with only regular threshold voltage were employed. To apply delay constraints in a uniform manner and enable an equivalent output loading for all adders under comparison, all inputs and outputs are assumed to be registered.

The minimum achievable delay of each architecture for various bitwidths and the area that corresponds to this delay point are depicted in Table 1. In all cases, SP adders are slower than CP adders. The reason for this delay gap is threefold. First, the AND-XOR operation involved in SP adders is inherently more complex than the AND-OR logic needed in CP adders. Second, AND-OR gates in standard-cell libraries are highly optimized logic cells relative to XOR gates that are considered less often used. Third, the intermediate representations used in logic synthesis tools [35] promote sum-of-product implementations relative to the direct synthesis of XOR logic, thus missing additional possible optimizations [36].

For smaller bitwidths, where even ripple-carry/sum architectures make sense, SP adders are still slower. For instance, an 8-bit ripple-sum adder has a worst-case delay of 376 ps, while an equivalent 8-bit ripple-carry adder has a worst-case delay of 290 ps.

Fig. 10. The gate-level implementation of an 8-bit Kogge-Stone sum-propagate adder.

TABLE 1
The minimum achievable delay of all adders under comparison and their corresponding area after logic synthesis at 28nm.

                    |          16-bit           |          32-bit           |          64-bit
                    | Delay (ps)  | Area (µm²)  | Delay (ps)  | Area (µm²)  | Delay (ps)  | Area (µm²)
Architecture        |   SP    CP  |   SP    CP  |   SP    CP  |   SP    CP  |   SP    CP  |   SP    CP
--------------------+-------------+-------------+-------------+-------------+-------------+-------------
1 V
Kogge-Stone         |  309   236  |  155   115  |  352   272  |  284   262  |  400   292  |  694   675
Knowles [1,2,2,2]   |  305   240  |  120   101  |  356   268  |  300   262  |  399   296  |  725   638
Ladner-Fischer      |  308   244  |   97    89  |  355   269  |  208   202  |  405   302  |  473   451
Brent-Kung          |  370   277  |   84    68  |  448   320  |  170   148  |  534   369  |  377   299
Han-Carlson         |  337   254  |  108    88  |  391   283  |  212   203  |  426   314  |  465   438
0.72 V
Kogge-Stone         | 1283  1052  |  223   118  | 1438  1150  |  546   313  | 1626  1253  | 1221   693
Knowles [1,2,2,2]   | 1270  1067  |  171   111  | 1444  1158  |  421   285  | 1629  1277  |  981   654
Ladner-Fischer      | 1305  1055  |  138   100  | 1447  1173  |  307   213  | 1641  1298  |  679   475
Brent-Kung          | 1487  1199  |  117    76  | 1790  1366  |  232   165  | 2077  1550  |  496   327
Han-Carlson         | 1350  1130  |  164    98  | 1554  1218  |  354   235  | 1716  1330  |  914   520

5.3 Adder design with three-independent-gate field-effect transistors

Here, we assess the delay performance of parallel-prefix adders leveraging Three-Independent-Gate Field-Effect Transistors (TIGFETs). The particular technology allows for increased transistor density by combining multiple MOS gates on a single device [37], [38], [39], [40].

TIGFET devices provide three independent gate terminals, namely, center gate (CG), polarity gate at source (PGS) and polarity gate at drain (PGD), as depicted in Fig. 11(a). Specifically, the PGS and PGD terminals configure the Schottky junction's effective barrier height and allow either holes or electrons to flow through the channel of the


device. Therefore, the device is configured either as an n-type or a p-type device by setting the polarity gates at high or low logic state, respectively [5]. When the CG, PGS and PGD terminals are all biased at the same potential, the device turns on and carriers flow through the channel, while the remaining bias conditions lead to an off-state for the device (cf. [41, Fig. 2]). When one gate terminal of a TIGFET device is held at a fixed bias and the device is controlled by the other two, it operates as two series transistors, offering the possibility of a mapping between MOSFET and TIGFET devices. For example, Fig. 11(b) depicts the two series n-type MOSFETs equivalent of a TIGFET device with the PGD terminal at high logic state.

Fig. 11. (a) TIGFET device symbol and (b) TIGFET device operating as two series n-type MOSFETs by setting PGD=1.

Fig. 12. TIGFET-based transistor-level implementations of the main cells required for parallel-prefix SP and CP adders. Panel functions: (a) A ⊕ B, (b) AB + C, (c) A ⊕ BC.
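The conduction rule described above for a TIGFET device can be checked with a small switch-level sketch. This is only an idealized illustration under stated assumptions: gate voltages are reduced to logic levels 0 and 1, and the function names `tigfet_conducts` and `series_nmos_conducts` are hypothetical helpers introduced here, not part of the paper or of the TIGFET PDK.

```python
from itertools import product


def tigfet_conducts(cg: int, pgs: int, pgd: int) -> bool:
    """Idealized rule (assumption): the device is on only when
    CG, PGS and PGD are all at the same logic level."""
    return cg == pgs == pgd


def series_nmos_conducts(gate_a: int, gate_b: int) -> bool:
    """Two ideal n-type MOSFETs in series conduct only when both gates are 1."""
    return bool(gate_a and gate_b)


# With PGD tied to logic 1, the device should behave like two series
# n-type MOSFETs controlled by CG and PGS.
for cg, pgs in product((0, 1), repeat=2):
    assert tigfet_conducts(cg, pgs, 1) == series_nmos_conducts(cg, pgs)

# Enumerate all bias combinations (CG, PGS, PGD) that turn the device on.
on_states = [bits for bits in product((0, 1), repeat=3) if tigfet_conducts(*bits)]
print(on_states)  # [(0, 0, 0), (1, 1, 1)]
```

Of the eight bias combinations, only the two with all terminals equal turn the device on, and fixing PGD at 1 leaves exactly the AND behavior of two series n-type transistors.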
The proposed 16-bit SP and CP parallel-prefix adders are implemented in a 10-nm TIGFET technology [6] that is publicly available in [42]. The most interesting cells for implementing both types of adders are depicted in Fig. 12. Fig. 12(a) depicts a 2-input XOR gate used in both types of adders, Fig. 12(b) shows the simplified implementation of an AND-OR gate (its inverted version) used in the parallel-prefix operator of CP addition, while Fig. 12(c) shows the single-gate implementation of the AND-XOR operation used in the parallel-prefix operator of an SP adder. In all cases, stacked devices are completely avoided with the requirement that both non-inverting and inverting inputs (generated locally) are available.

The delay performance of the TIGFET-based adder implementation is assessed using 16-bit parallel-prefix SP and CP adders. The worst-case delay of each design is depicted in Fig. 13. Delay measurements are performed in Cadence Spectre Simulator using appropriate pairs of input vectors and assuming a nominal supply voltage of 0.7 V, according to the available transistor models [6] and a fanout load of a minimum-sized inverter per sum output.

The delay differences between SP and CP adders are mainly attributed to the increased delay of the AND-XOR cell used in SP adders relative to the AND-OR cell used in CP adders. From the transistor-level implementation of the basic AND-XOR and AND-OR cells, it is evident that the AND-XOR operation is performed using only parallel transistors, as in the case of an AND-OR cell, but with two more branches. These extra transistors increase both the capacitance of the output node and the input capacitance of the cell. In this way, an AND-XOR cell both has more output capacitance to drive and adds more delay to the previous stage that drives it.

Fig. 13. Worst-case delay of 16-bit TIGFET-based parallel-prefix adders at a 10-nm technology node and 0.7 V. The Han-Carlson design has a smaller delay than the Knowles and Ladner-Fischer adders, even though it comprises an extra prefix stage, due to its lower internal signal fanout. [Plot: critical path delay (ps) of SP and CP adders for the Kogge-Stone, Knowles, Ladner-Fischer, Han-Carlson and Brent-Kung architectures.]

6 CONCLUSIONS

This work redefined the decades-old carry-propagate recursion using an equivalent sum-propagate recursion that works directly on sum bits. Based on this fundamental new formulation, various sum-propagate adder architectures have been presented that cover most of the adder design space: from ripple-sum adders to parallel-prefix sum-lookahead architectures. The presented adders open up a whole new design space that can prove useful in current and future implementation technologies. Also, the proposed formulation allows the replacement of the dedicated carry chains in FPGA chips with equivalent sum chains that can work directly on the sum bits without the need to introduce carry signals as an intermediate step for quickly computing the result of the addition.

At the moment, and using current state-of-the-art implementation technologies, sum-propagate adders are slower than their


carry-propagate counterparts. In the future, we expect this gap to diminish with the adoption of multi-gate transistors, examined in this work, and the enhancement of the intermediate representations of logic synthesis tools [10], [36] so that they treat exclusive-or-rich logic on an equal footing.

Overall, we believe that this new formulation would help redefine several problems that are directly connected to carry propagation. For instance, solutions to well-known problems such as leading-zero anticipation [43] and parity prediction of addition [44] could possibly be revised and enhanced. Also, new approximate adders [45] with tighter error bounds are expected to be designed, while the in-memory computation of addition [46] can possibly be simplified by omitting the computation of carry bits and reusing previously computed sum bits.

REFERENCES

[1] B. Parhami, Computer Arithmetic - Algorithms and Hardware Designs. New York: Oxford University Press, 2000.
[2] N. Weste and D. Harris, CMOS VLSI Design: A Circuits and Systems Perspective. Addison Wesley (3rd Edition), 2010.
[3] V. G. Oklobdzija, “High-Speed VLSI Arithmetic Units: Adders and Multipliers,” in Design of High-Performance Microprocessor Circuits. IEEE Press, 2000.
[4] J. Trommer, A. Heinzig, T. Baldauf, T. Mikolajick, W. M. Weber, M. Raitza, and M. Völp, “Reconfigurable Nanowire Transistors with Multiple Independent Gates for Efficient and Programmable Combinational Circuits,” in 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2016, pp. 169–174.
[5] G. V. Resta, A. Leonhardt, Y. Balaji, S. De Gendt, P. Gaillardon, and G. De Micheli, “Devices and Circuits Using Novel 2-D Materials: A Perspective for Future VLSI Systems,” IEEE Trans. on VLSI Systems, vol. 27, no. 7, pp. 1486–1503, 2019.
[6] G. Gore, P. Cadareanu, E. Giacomin, and P. Gaillardon, “A Predictive Process Design Kit for Three-Independent-Gate Field-Effect Transistors,” in 2019 IFIP/IEEE 27th International Conference on Very Large Scale Integration (VLSI-SoC), 2019, pp. 172–177.
[7] A. Antidormi, S. Frache, M. Graziano, P. Gaillardon, G. Piccinini, and G. De Micheli, “Computationally Efficient Multiple-Independent-Gate Device Model,” IEEE Transactions on Nanotechnology, vol. 15, no. 1, pp. 2–14, 2016.
[8] J. Boyar, P. Matthews, and R. Peralta, “Logic Minimization Techniques with Applications to Cryptology,” J. Cryptology, Springer, pp. 280–312, 2013.
[9] E. Testa, M. Soeken, L. Amarù, and G. De Micheli, “Reducing the Multiplicative Complexity in Logic Networks for Cryptography and Security Applications,” in Proceedings of the 56th Annual Design Automation Conference 2019. Association for Computing Machinery, 2019.
[10] E. Testa, M. Soeken, H. Riener, L. Amarù, and G. De Micheli, “A Logic Synthesis Toolbox for Reducing the Multiplicative Complexity in Logic Networks,” in 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2020, pp. 568–573.
[11] K. E. Murray, J. Luu, M. J. P. Walker, C. McCullough, S. Wang, S. Huda, B. Yan, C. Chiasson, K. B. Kent, J. Anderson, J. Rose, and V. Betz, “Optimizing FPGA Logic Block Architectures for Arithmetic,” IEEE Trans. on VLSI Systems, vol. 28, no. 6, pp. 1378–1391, 2020.
[12] M. Lehman and N. Burla, “Skip Techniques for High-Speed Carry-Propagation in Binary Arithmetic Units,” IRE Trans. on Electronic Computers, vol. EC-10, no. 4, pp. 691–698, 1961.
[13] V. Kantabutra, “Designing Optimum One-Level Carry-Skip Adders,” IEEE Trans. on Computers, vol. 42, no. 6, pp. 759–764, 1993.
[14] I. Koren, Computer Arithmetic Algorithms. A. K. Peters, Ltd., 2002.
[15] A. Weinberger and J. L. Smith, “A Logic for High-Speed Addition,” in National Bureau of Standards, Circular 591, 1958, pp. 3–12.
[16] B. Lee and V. Oklobdzija, “Improved CLA Scheme with Optimized Delay,” Journal of VLSI Signal Processing Systems for Signal, Image and Video Technology, no. 3, pp. 265–274, 1991.
[17] J. P. Fishburn, “A Depth-Decreasing Heuristic for Combinational Logic; or How to Convert a Ripple-Carry Adder into a Carry-Lookahead Adder or Anything In-Between,” in 27th ACM/IEEE Design Automation Conference, 1990, pp. 361–364.
[18] R. P. Brent and H. T. Kung, “A Regular Layout for Parallel Adders,” IEEE Trans. on Computers, vol. 31, no. 3, pp. 260–264, Mar. 1982.
[19] R. E. Ladner and M. J. Fischer, “Parallel Prefix Computation,” Journal of the ACM, vol. 27, no. 4, pp. 831–838, Oct. 1980.
[20] P. M. Kogge and H. S. Stone, “A Parallel Algorithm for the Efficient Solution of a General Class of Recurrence Equations,” IEEE Trans. on Computers, vol. C-22, pp. 786–792, Aug. 1973.
[21] S. Knowles, “A Family of Adders,” in Proc. of the IEEE Symp. on Computer Arithmetic, Apr. 1999, pp. 30–34.
[22] R. Zimmermann, “Binary Adder Architectures for Cell-Based VLSI and their Synthesis,” Ph.D. dissertation, ETHZ, 1998.
[23] S. Roy, M. Choudhury, R. Puri, and D. Z. Pan, “Towards Optimal Performance-Area Trade-Off in Adders by Synthesis of Parallel Prefix Structures,” in Design Automation Conference (DAC), 2013, pp. 1–8.
[24] T. Lynch and E. E. Swartzlander, “A Spanning Tree Carry Lookahead Adder,” IEEE Trans. on Computers, vol. 41, no. 8, pp. 931–939, 1992.
[25] S. K. Mathew, M. A. Anders, B. Bloechel, Trang Nguyen, R. K. Krishnamurthy, and S. Borkar, “A 4-GHz 300-mW 64-bit Integer Execution ALU with Dual Supply Voltages in 90-nm CMOS,” IEEE Journal of Solid-State Circuits, vol. 40, no. 1, pp. 44–51, 2005.
[26] A. Beaumont-Smith and C. C. Lim, “Parallel-Prefix Adder Design,” in Proc. of the IEEE Symp. on Computer Arithmetic, Apr. 2001, pp. 218–225.
[27] A. Tyagi, “A Reduced-Area Scheme for Carry-Select Adders,” IEEE Trans. on Computers, vol. 42, no. 10, pp. 1163–1170, 1993.
[28] H. Lindkvist and P. Andersson, “Techniques for Fast CMOS-Based Conditional Sum Adders,” in Proceedings 1994 IEEE International Conference on Computer Design: VLSI in Computers and Processors, 1994, pp. 626–635.
[29] H. Ling, “High-Speed Binary Adder,” IBM Journal of Research and Development, vol. 25, pp. 156–166, May 1981.
[30] G. Dimitrakopoulos and D. Nikolos, “High-Speed Parallel-Prefix VLSI Ling Adders,” IEEE Trans. on Computers, vol. 54, no. 2, pp. 225–231, 2005.
[31] N. Burgess, “Fast Ripple-Carry Adders in Standard-Cell CMOS VLSI,” in Proc. of the IEEE Symp. on Computer Arithmetic (ARITH 2011), Jul. 2011, pp. 103–111.
[32] J. Grad and J. E. Stine, “New Algorithms for Carry Propagation,” in Proc. of the ACM Great Lakes Symp. on VLSI, 2005, pp. 396–399.
[33] G. Dimitrakopoulos, P. Kolovos, P. Kalogerakis, and D. Nikolos, “Design of High-Speed Low-Power Parallel-Prefix VLSI Adders,” in Power and Timing Modeling, Optimization and Simulation (PATMOS), 2004, pp. 248–257.
[34] T. Han and D. Carlson, “Fast Area-Efficient VLSI Adders,” in Proc. of the IEEE Symp. on Computer Arithmetic, May 1987, pp. 49–56.
[35] D. S. Marakkalage, E. Testa, H. Riener, A. Mishchenko, M. Soeken, and G. De Micheli, “Three-Input Gates for Logic Synthesis,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, pp. 1–1, 2020.
[36] A. K. Verma and P. Ienne, “Improving XOR-Dominated Circuits by Exploiting Dependencies between Operands,” in 2007 Asia and South Pacific Design Automation Conference, 2007, pp. 601–608.
[37] J. Zhang, X. Tang, P.-E. Gaillardon, and G. De Micheli, “Configurable Circuits Featuring Dual-Threshold-Voltage Design With Three-Independent-Gate Silicon Nanowire FETs,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 61, no. 10, pp. 2851–2861, 2014.
[38] D. Vana, P. Gaillardon, and A. Teman, “C2TIG: Dynamic C2MOS Design Based on Three-Independent-Gate Field-Effect Transistors,” IEEE Transactions on Nanotechnology, vol. 19, pp. 123–136, 2020.
[39] E. Giacomin, J. R. Gonzalez, and P. Gaillardon, “Low-Power Multiplexer Designs Using Three-Independent-Gate Field Effect Transistors,” in 2017 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH), 2017, pp. 33–38.
[40] H. G. Mohammadi, P.-E. Gaillardon, J. Zhang, G. De Micheli, E. Sanchez, and M. S. Reorda, “A Fault-Tolerant Ripple-Carry Adder with Controllable-Polarity Transistors,” J. Emerg. Technol. Comput. Syst., vol. 13, no. 2, Dec. 2016.


[41] J. Romero-González and P.-E. Gaillardon, “An Efficient Adder Architecture with Three-Independent-Gate Field-Effect Transistors,” in 2018 IEEE International Conference on Rebooting Computing (ICRC). IEEE, 2018, pp. 1–8.
[42] A 10-nm TIGFET PDK, accessed: Jan., 2021. [Online]. Available: https://github.com/LNIS-Projects/TIGFET-10nm-model
[43] M. S. Schmookler and K. J. Nowka, “Leading Zero Anticipation and Detection: A Comparison of Methods,” in Proc. 15th IEEE Symp. on Computer Arithmetic (ARITH), 2001, pp. 7–12.
[44] M. Nicolaidis, “Carry Checking/Parity Prediction Adders and ALUs,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 11, no. 1, pp. 121–128, 2003.
[45] H. Jiang, J. Han, and F. Lombardi, “A Comparative Review and Evaluation of Approximate Adders,” in Proc. of the Great Lakes Symp. on VLSI (GLSVLSI), 2015, pp. 343–348.
[46] A. Velasquez and B. Shaia, “Spatially Efficient In-Memory Addition Through Destructive and Non-Destructive Operations,” in IEEE Intern. Symp. on Circuits and Systems (ISCAS), 2019, pp. 1–5.

Giorgos Dimitrakopoulos received the B.S., M.Sc. and Ph.D. degrees in Computer Engineering from the University of Patras, Patras, Greece, in 2001, 2003 and 2007, respectively.
He is currently an Associate Professor with the Department of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi, Greece. He is interested in the design of digital integrated circuits, electronic design automation, computer arithmetic and computer architecture, with emphasis on the design of energy-efficient systems.

Kleanthis Papachatzopoulos received the Diploma degree in Electrical and Computer Engineering, and the M.Sc. degree in Integrated Hardware-Software Systems from the University of Patras, Patras, Greece, in 2016 and 2018, respectively.
Currently, he is pursuing a Ph.D. degree and he is working as a Research Assistant with the VLSI Design Laboratory, ECE Dept., University of Patras, Patras, Greece. His current research interests include VLSI architectures for signal processing and computer arithmetic.

Vassilis Paliouras (Member, IEEE) is currently a Full Professor with the Electrical and Computer Engineering Department, University of Patras, Patras, Greece. His research interests are in the areas of VLSI architectures for signal processing and communications, low-power digital design and computer arithmetic. He has authored or coauthored more than 150 research articles in international journals, conferences, and book chapters and has edited three books. He is advisor to three Ph.D. students, and has supervised four Ph.D., 36 masters, and 43 diploma theses.
Prof. Paliouras has received the IEEE CASS Guillemin-Cauer Best-Paper Award for the year 2000. He has served as the General Co-Chair for the International Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS 2004). He has also served as a Technical Program Chair of PATMOS 2005 and of the IEEE Workshop on Signal Processing Systems Implementation (SiPS) 2005, as Technical Program Co-Chair of the IEEE International Conference on Electronics Circuits and Systems (ICECS) 2010, and as a European liaison for the IEEE ISCAS 2012, South Korea. He has served on editorial boards of journals and technical program committees of numerous conferences in the areas of signal processing, circuits, systems, and communications.
