A High Speed and Energy Efficient Full Adder Design

Using Complementary & Level Restoring Carry Logic


Jin-Fa Lin
Department of Electronic Engineering, Wu-Feng Institute of Technology, Chai-yi, Taiwan

Yin-Tsung Hwang, Ming-Hwa Sheu and Cheng-Che Ho
Department of Electronic Engineering, National Yunlin University of Science & Technology, Touliu, Yunlin, Taiwan

Abstract - In this paper, we propose a low complexity full adder design (10 transistors per bit) featuring higher computing speed, lower operating voltage, and lower energy consumption compared with peer designs. The design adopts inverter buffered XOR/XNOR designs to alleviate the threshold voltage loss problem, which often prevents full adder designs from low supply voltage operation and from direct cascading. The proposed design successfully embeds the buffering circuit in the full adder design so as to enhance the speed performance while keeping the transistor count to a minimum. For performance comparison, both the DC and AC performances of the proposed design against various full adder designs are evaluated via extensive HSPICE simulations. The simulation results, based on TSMC 2P4M 0.35 μm process models, indicate that the proposed design has the lowest working Vdd and highest working frequency among all designs using 10 transistors. It also features the lowest energy consumption per addition among these designs. In addition, the performance edge of the proposed design in both speed and energy consumption becomes even more significant as the word length of the adder increases.

I. INTRODUCTION

The essence of digital computing lies in the full adder design. The design criteria of a full adder are usually multi-fold. Transistor count is, of course, a primary concern which largely affects the design complexity of many function units such as multipliers and ALUs. Two other important yet often conflicting design criteria are power consumption and speed. A better metric would be the power delay product, or energy consumption per operation, to indicate the optimal design tradeoffs. Related to the power consumption is the lowest supply voltage at which the design can still operate properly.

Numerous full adder designs [1-6] in the categories of fully static CMOS, dynamic circuit, transmission gate, or pass transistor logic have been presented. The full adder design in fully static CMOS is the most conventional one but requires as many as 28 transistors. Dynamic circuits can significantly reduce the transistor count but the incurred power consumption, including that of the clock tree, is usually high. Building logic in transmission gates is another alternative to reduce the circuit complexity. In [1], transmission gate plus inverter based full adder designs were presented using 20 and 16 transistors, respectively. To pursue even lower transistor count full adder designs, pass transistor logic can be used in lieu of transmission gates. In [2], pass transistor logic based XOR/XNOR circuits were used and the full adder design consists of only 14 transistors. Despite the saving in transistor count, the output voltage level is degraded at certain input combinations due to the threshold voltage loss problem. At the cost of two additional transistors, the design was further improved in [3] and can eliminate the inverter from the critical path to avoid the possible short circuit power consumption for low power operation. In [4], a new pass transistor based Static Energy-Recovery Full (SERF) adder with as few as 10 transistors was presented. Despite its claimed superiority in energy consumption, the design is relatively slower than peer designs and cannot be cascaded at low Vdd operation due to the multiple-threshold loss problem. In [5], improved 10-transistor full adder designs were derived based on systematic exploration of the combinations of various XOR, sum and carry out modules. Again, these designs suffer from the severe threshold loss problem and cannot operate properly in cascade under low supply voltage. In [6], another 10-transistor full adder design consisting of two pass transistor based XORs and a 2-to-1 multiplexer was presented. Its Cout voltage swing is degraded by a total of 3 VT's and cascaded operation at low supply voltage becomes problematic. In this paper, we propose a novel 10-transistor full adder design with an alleviated threshold loss problem. This leads to faster ripple carry addition while maintaining the performance edge in energy consumption per operation. The design can also sustain lower Vdd operation than peer designs.

II. THE PROPOSED CLRCL FULL ADDER DESIGN

The logic function of a full adder can be represented as

$$\mathrm{Sum} = (A \odot B)\cdot C_{in} + (A \oplus B)\cdot \overline{C_{in}} \qquad (1)$$

$$C_{out} = (A \oplus B)\cdot C_{in} + (A \odot B)\cdot A \qquad (2)$$

where $\oplus$ denotes XOR and $\odot$ denotes XNOR. From Eq. (1) and (2), we can easily identify two basic modules needed in implementing the functions, i.e. XOR and 2-to-1 multiplexer. An XOR/XNOR function can be achieved with only 4 transistors in pass transistor logic. A 2-to-1 multiplexer can be implemented using as few as 2 transistors if complementary control signals are available. Note that these circuits all encounter various degrees of threshold voltage loss problems and should be used with care. In this paper, we propose a novel full adder design featuring Complementary and Level Restoring Carry Logic (CLRCL). The goal is to reduce the circuit complexity and to achieve faster cascade operation. The strategy is to avoid multiple threshold voltage loss in the carry chain by proper level restoring. We first rewrite the full adder sum and carry logics as

$$\mathrm{Sum} = (A \oplus C_{in})\cdot \overline{C_{out}} + (A \odot C_{in})\cdot B \qquad (3)$$

$$C_{out} = (A \oplus C_{in})\cdot B + (A \odot C_{in})\cdot A \qquad (4)$$

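As a quick sanity check on the rewritten logic, the sketch below enumerates all eight input combinations and compares the conventional form (Eqs. (1) and (2)) and the rewritten form (Eqs. (3) and (4)) against plain binary addition. It assumes the reading of the equations in which one selector term is the XOR and the other the XNOR of its two inputs, and in which the Sum term of Eq. (3) uses the complemented carry. The function names are illustrative and do not come from the paper.

```python
from itertools import product


def full_adder_reference(a, b, cin):
    """Plain binary addition of three bits: returns (sum, carry)."""
    total = a + b + cin
    return total & 1, total >> 1


def full_adder_eq12(a, b, cin):
    """Conventional form: A xor B / A xnor B select between Cin terms."""
    xor_ab = a ^ b
    xnor_ab = 1 - xor_ab
    s = (xnor_ab & cin) | (xor_ab & (1 - cin))   # Eq. (1) as read here
    cout = (xor_ab & cin) | (xnor_ab & a)        # Eq. (2) as read here
    return s, cout


def full_adder_eq34(a, b, cin):
    """Rewritten form: A xor Cin / A xnor Cin select; Sum reuses NOT Cout."""
    xor_ac = a ^ cin
    xnor_ac = 1 - xor_ac
    cout = (xor_ac & b) | (xnor_ac & a)          # Eq. (4) as read here
    s = (xor_ac & (1 - cout)) | (xnor_ac & b)    # Eq. (3) as read here
    return s, cout


def forms_agree():
    """True if both forms match binary addition on all 8 input patterns."""
    return all(
        full_adder_reference(a, b, c)
        == full_adder_eq12(a, b, c)
        == full_adder_eq34(a, b, c)
        for a, b, c in product((0, 1), repeat=3)
    )


if __name__ == "__main__":
    print("forms agree:", forms_agree())
```

In the rewritten form the carry is computed first and its complement is reused for the sum, which appears to be why a single inverter on the carry can serve both outputs.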
The logic block diagram of the proposed design is shown in Figure 1. The design rationales are as follows. First, avoid the usage of degraded output in the following stage as gate control signals. This is the common problem existing in most 10-transistor full adder designs. It will lead to multiple threshold voltage loss and may prevent the cascaded circuit from correct operation. Second, eliminate unbuffered carry signal propagation in a pass transistor chain. According to the Elmore formula, the propagation delay is a quadratic function of the number of cascaded pass transistors [7]. Even for a moderate cascade length, the delay is still intolerable.

Figure 1. Logic block diagram of the CLRCL full adder

As shown in Figure 1, the XNOR circuit adopted in the proposed design is realized by a 2-to-1 multiplexer followed by an inverter. The role of the inverter is 3-fold. Firstly, it is used as a level restoring circuit to combat the output threshold voltage loss. The level restored output is then fed to MUX 2/3 to generate the Sum and Cout signals. The threshold voltage loss of Sum and Cout will be confined to only one VT away from the power supplies. Secondly, the inverter (INV 2) serves as a buffer along the carry chain to speed up the carry propagation. Thirdly, the inverter (INV 2) provides the complementary signal ($\overline{C_{out}}$) needed in the following stage. The complementary input signals provided also help simplify the XNOR design, where only one signal is needed in selection control. The MOS schematic circuit design of the CLRCL full adder is depicted in Figure 2. The entire full adder circuit requires only 10 transistors (5 PMOS and 5 NMOS) - the one with the least transistor count we have learned of so far from the literature. In the following section, we will conduct various analyses and simulations to demonstrate the performance superiority of the proposed adder over other designs.
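The quadratic growth cited from the Elmore formula in the design rationale can be illustrated with a small numeric sketch. It assumes a deliberately simplified model that is not taken from the paper: each pass transistor is a lumped resistance `r`, each internal node a capacitance `c`, and a buffered stage adds a fixed hypothetical delay `t_buf`. All values are in normalized units and the function names are illustrative.

```python
def elmore_delay_chain(n, r=1.0, c=1.0):
    """Elmore delay to the far end of n identical unbuffered RC sections.

    Node k is charged through k series resistances, so the delay is
    sum_{k=1..n} (k * r) * c = r * c * n * (n + 1) / 2, i.e. quadratic in n.
    """
    return sum(k * r * c for k in range(1, n + 1))


def buffered_chain_delay(n, t_buf=1.0, r=1.0, c=1.0):
    """Delay of n stages when every stage is one RC section plus a buffer.

    The buffer isolates each stage, so stage delays simply add up and the
    total grows linearly in n. t_buf is an assumed, hypothetical value.
    """
    return n * (elmore_delay_chain(1, r, c) + t_buf)


if __name__ == "__main__":
    for bits in (2, 4, 8, 16):
        print(bits, elmore_delay_chain(bits), buffered_chain_delay(bits))
```

Under these assumptions a 16-section unbuffered chain is 136 time units against 32 for the buffered one, which is the qualitative argument for placing an inverter on the carry path of every bit.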

Figure 2. MOS schematic circuit design of the CLRCL full adder

III. PERFORMANCE ANALYSES AND SIMULATION RESULTS

In this paper, several comparable full adder designs are included for performance comparison. Since the design goals are low circuit complexity and high speed operation subject to competitive energy consumption, we focus basically on low gate count and static pass transistor based full adder designs. Besides the proposed CLRCL full adder design, 9 more designs are employed. The features of these designs are summarized in Table I.

TABLE I. FEATURES OF FULL ADDER DESIGNS UNDER COMPARISON

CKT        Ref    TXRs #   Circuit type                      Voltage swing
28T               28       Complementary CMOS logic          Full
TG-CMOS           20       Transmission gate logic (TGL)     Full
TFA        [1]    16       Transmission gate logic           Full
14T        [2]    14       PTL + TGL + inverter              Full
16T        [3]    16       PTL + TGL                         Full
SERF       [4]    10       Pass transistor logic (PTL)       Degraded
9A         [5]    10       Pass transistor logic (PTL)       Degraded
9B         [5]    10       Pass transistor logic (PTL)       Degraded
13A        [5]    10       Pass transistor logic (PTL)       Degraded

A. Simulation results of single-bit full adder designs

We will first conduct static circuit analyses to determine the voltage swings of different full adder designs. We compare only the 10-transistor (10T) designs, where the threshold voltage loss problem comes hand in hand with the low gate count approach. Among the three inputs to a full adder, inputs A and B are assumed to be perfect and have full voltage swing (from Vdd to Gnd). Input Cin, however, is drawn from the Cout of another full adder to faithfully reflect the situation of cascaded operation in most parallel adder designs. The HSPICE simulation results are summarized in Table II. The simulations are based on circuit DC analysis
using the TSMC 0.35 μm 2P4M process and a 3.3V power supply (except for the Vdd,min simulations). Typical transistor sizes, i.e. (W/L)p = 2μm/0.35μm and (W/L)n = 1μm/0.35μm, are employed. Worst case output voltage levels are recorded. Note that the two values in each entry for the proposed CLRCL circuit correspond to Cout and $\overline{C_{out}}$, respectively. Among these 10T designs, the SERF, 9A, 9B and 13A designs all have twice the threshold voltage loss in Cout high. The proposed CLRCL design, however, encounters only 1 threshold voltage loss along the unbuffered Cout signal. Similarly, other 10T designs encounter 2 to 3 times the threshold voltage loss in the Sum high signal while the proposed CLRCL design suffers only one threshold voltage loss. As a consequence, the proposed design requires the lowest Vdd among all 10T designs. To determine the minimum Vdd at which the circuits can still function properly, every output of the full adder is loaded with a typically sized inverter where (W/L)p = 1.4μm/0.35μm and (W/L)n = 0.7μm/0.35μm. A full adder design is considered as working properly subject to a power supply if its output voltage swing is larger than the range confined by the VIH and the VIL of the loading inverter. From Table II, the proposed CLRCL design has the lowest Vdd,min among all the 10-transistor full adder designs. Note that the Vdd,min numbers in Table II are obtained under the condition of a sufficient time period for signals to reach their steady states. Vdd,min will increase with the working frequency as a shorter time period is allocated for higher frequency operations. Table III summarizes the Vdd,min values at different working frequencies. Again, the proposed CLRCL design has the lowest Vdd,min among all 10T designs at different working frequencies.

TABLE II. HSPICE DC ANALYSIS RESULTS OF FULL ADDER DESIGNS (V)

Design   Cout high,min   Cout low,max   Sum high,min   Sum low,max   Vdd,min
SERF     1.59            0.97           1.59           0             2.8
9A       1.58            0.98           1.12           0.96          2.8
9B       1.58            0.98           1.58           0.94          2.8
13A      1.59            0.98           1.59           0.96          2.8
CLRCL    2.36/3.3        0.97/0         2.21           0.98          1.9

TABLE III. Vdd,min VERSUS WORKING FREQUENCY (V)

Design   @100MHz   @125MHz   @250MHz   @500MHz
SERF     2.8       2.8       2.9       3.1
9A       2.8       2.8       2.9       3.1
9B       2.8       2.8       2.8       3.1
13A      2.8       2.8       2.8       3.1
CLRCL    2.0       2.1       2.2       2.5

B. Simulation results of ripple adder designs

Since the speed performance of pass transistor logic based designs tends to degrade drastically with the depth of logic chaining, the simulations will be conducted subject to different ripple adder sizes ranging from 2, 4, 8 to 16-bit. The performance indexes include worst case delay (maximum working frequency) and average energy consumption per addition. To evaluate the worst case delays, it is clear that the most critical timing lies in the carry propagation from LSB to MSB. Since there is more than one input pattern leading to carry generation or propagation in each bit of the full adder, only the one that results in the longest delay (usually degraded outputs) will be selected. Note that the test patterns are structure dependent. We use Vdd = 3.3V in our worst case delay simulations and the results are summarized in Table IV. Since the proposed CLRCL design has complementary carry signals, two delay numbers are given. The maximum working frequency (fmax in MHz) is calculated as the reciprocal of the larger delay. Among the ten designs under comparison, CLRCL, 9A, 9B, 13A and SERF are 10T designs. The remaining 5 designs are higher gate count designs with full voltage swing operation. They are considered to have better speed performance, when compared with 10T designs, at the cost of increased circuit complexity.

From the table, it is clear that the proposed CLRCL design has the minimum delays among all the 10T designs, and the gap of the delays becomes even wider as the size of the ripple adder grows. At a word length of 16 bits, the delay of the CLRCL design is almost an order of magnitude smaller than those of the other 10T designs. Compared with the higher gate count designs, the proposed CLRCL design is inferior in speed but the delay remains in the same order of magnitude as those of the higher gate count designs. This is mainly attributed to the inverter buffering strategy of the design.

TABLE IV. SPEED ANALYSIS OF DIFFERENT RIPPLE ADDER DESIGNS (DELAY IN ns)

Category            Design    2-bit       4-bit       8-bit       16-bit
10T                 CLRCL     1.57/1.89   3.44/3.73   7.15/7.43   14.48/14.86
10T                 9A        2.99*       9.03*       22.54*      77.3*
10T                 9B        3.52        10.04       30.8        106
10T                 13A       3.53        8.78        24.9        77
10T                 SERF      2.93        7.99        22.45       70.8
Higher gate count   14T       0.775       1.43        3.66        11.51
Higher gate count   16T       0.608       1.26        3.1         9.7
Higher gate count   TFA       0.547       1.13        2.845       8.386
Higher gate count   TG-CMOS   0.502       1.082       3.05        10.21
Higher gate count   28T       0.793       1.411       2.66        5.097
* measurement for Cout only, but Sum signal is void

We will next examine the power / energy consumption issues of these designs. The index term, as given in Eq. (5), is the average energy consumption per addition (or per input transition).

$$\text{energy index} = P \cdot T = \left. (P / f_{max}) \right|_{V_{dd}} \qquad (5)$$

Since most of the power consumption occurs due to input transition, the index is calculated as the product of average power and the clock period under the condition of input transition at fmax. This term is also equivalent to the normalized power consumption per MHz. Again, the Vdd is set to be 3.3V in our simulation. In Figure 3, the comparison of energy consumption per addition for the five 10T designs is illustrated. The proposed CLRCL design has the smallest figures and the discrepancy with other designs becomes
larger and larger with the increase of full adder word length. This is mainly because other 10T based designs lack proper driving capability in cascading and take much longer to accomplish the computation in large word length operation. The energy, or the power delay product, will thus deteriorate. Note that design 9B has the worst energy performance in that it suffers from the slowest operation. To compare the performance of those designs with different gate counts, a new performance index called area weighted energy consumption is devised. It is defined as the product of area, power and delay. In Figure 4, we show the comparison results of the proposed CLRCL design versus other high gate count designs measured in this index. Except for the 28T design, the discrepancy in index performance for different designs at word lengths less than 8-bit is not significant. However, when the word length reaches 16 bits, the index values of the 14T, 16T, TFA and TG-CMOS designs start to soar up. On the other hand, the CLRCL and 28T designs exhibit only a mild increase in index values. Meanwhile, the CLRCL design enjoys the lowest index value at 16-bit design. This, again, demonstrates the superiority of the proposed design.

Figure 3. Comparison of energy consumption per addition for different 10T based designs (curves: CLRCL, 9A, 9B, SERF, 13A; horizontal axis: word length in bits, 2 to 16)

Figure 4. Comparison of the area weighted energy consumption index for CLRCL versus high gate count full adder designs (curves: CLRCL, 14T, 16T, TFA, TG-CMOS, 28T; horizontal axis: word length in bits, 2 to 16)

The layout size of the CLRCL design is 14.3 μm by 15.8 μm. We also conduct post layout simulations as a check, with the results given in Table V.

TABLE V. PRE-LAYOUT VS. POST-LAYOUT SIMULATION RESULTS

CLRCL    Speed, pre-layout    Speed, post-layout   Power (W), pre-layout   Power (W), post-layout
16-bit   66.7 MHz / 14.9 ns   42 MHz / 23.6 ns     11.697e-04              11.981e-04

IV. CONCLUSION

In conclusion, we presented a novel full adder design using as few as 10 transistors per bit. In the DC aspect, the CLRCL design requires the lowest power supply among the 10T designs. In the AC aspect, under a fixed Vdd, the CLRCL design also enjoys the highest working frequency. We also include other higher gate count full adder designs in our comparison. The CLRCL design can achieve comparable speed performance while using a much smaller gate count. In terms of energy efficiency, the CLRCL design features the lowest energy consumption per addition among all 10T designs. Its area weighted energy consumption index is also superior to those of higher gate count designs when the word length equals 16 bits.

ACKNOWLEDGMENT

This work was partially supported by National Chip Implementation Center of Taiwan and National Science Council of Taiwan under the Grant NSC94-EC-17-A-01-S1-037.

REFERENCES

[1] N. Zhuang and H. Wu, "A new design of the CMOS full adder," IEEE J. Solid-State Circuits, vol. 27, pp. 840-844, May 1992.
[2] J. Wang, S. Fang, and W. Feng, "New efficient designs for XOR and XNOR functions on the transistor level," IEEE J. Solid-State Circuits, vol. 29, pp. 780-786, July 1994.
[3] A. M. Shams and M. A. Bayoumi, "A novel high-performance CMOS 1-bit full adder cell," IEEE Trans. Circuits and Systems II, vol. 47, no. 5, May 2000.
[4] R. Shalem, E. John, and L. K. John, "A novel low-power energy recovery full adder cell," in Proc. Great Lakes Symp. VLSI, pp. 380-383, Feb. 1999.
[5] H. T. Bui, Y. Wang, and Y. Jiang, "Design and analysis of low-power 10-transistor full adders using novel XOR-XNOR gates," IEEE Trans. Circuits and Systems II, vol. 49, no. 1, pp. 25-30, Jan. 2002.
[6] A. Fayed and M. Bayoumi, "A low power 10-transistor full adder cell for embedded architectures," IEEE Symposium of Circuits and Systems, Sydney, Australia, pp. 226-229, May 2001.
[7] N. Weste and K. Eshraghian, Principles of CMOS VLSI Design, a System Perspective. Reading, MA: Addison-Wesley, 1993.
