A High Speed and Energy Efficient Full Adder Design Using Complementary & Level Restoring Carry Logic
to-1 multiplexer can be implemented using as few as 2
transistors if complementary control signals are available.
Note that these circuits all encounter various degrees of
threshold voltage loss problems and should be used with care.
In this paper, we propose a novel full adder design featuring
Complementary and Level Restoring Carry Logic (CLRCL).
The goal is to reduce the circuit complexity and to achieve
faster cascade operation. The strategy is to avoid multiple
threshold voltage losses in the carry chain through proper level
restoring. We first rewrite the full adder sum and carry logics
Figure 1. Logic block diagram of the CLRCL full adder
As shown in Figure 1, the XNOR circuit adopted in the
proposed design is realized by a 2-to-1 multiplexer followed
by an inverter. The role of the inverter is threefold. Firstly, it
restores the signal level degraded by threshold voltage loss.
The level restored output is then fed to MUX 2/3 to generate
the Sum and Cout signals. The threshold voltage loss of Sum
and Cout will thus be confined to only one VT away from the
power supplies. Secondly, the
inverter (INV 2) serves as a buffer along the carry chain to
speed up the carry propagation. Thirdly, the inverter (INV 2)
provides the complementary carry-out signal needed in the
following stage. The complementary input signals provided
also help simplify the XNOR design where only one signal
is needed in selection control. The MOS schematic circuit
design of the CLRCL full adder is depicted in Figure 2. The
entire full adder circuit requires only 10 transistors (5
PMOS and 5 NMOS), the lowest transistor count we have
found so far in the literature. In the following
section, we will conduct various analyses and simulations to
demonstrate the performance superiority of the proposed
adder over other designs.
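
The multiplexer-based structure described above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions: the rewritten sum and carry equations are not reproduced in this text, so the mux wiring below is the standard XNOR-selected formulation (Sum is Cin when A equals B and its complement otherwise; Cout is A when A equals B and Cin otherwise). The helper names mux2, inv, full_adder and ripple_add are hypothetical, and the model ignores threshold voltage loss and timing entirely.

```python
def mux2(sel, in0, in1):
    """2-to-1 multiplexer: returns in1 when sel is 1, otherwise in0."""
    return in1 if sel else in0


def inv(x):
    """Inverter on a single bit."""
    return 1 - x


def full_adder(a, b, cin):
    """Behavioural full adder built from 2-to-1 muxes and inverters.

    Assumed wiring (not taken from the paper's schematic):
    the XNOR is a mux followed by an inverter, and the XNOR output
    selects between Cin / its complement (Sum) and Cin / A (Cout).
    """
    # Mux selects b or its complement under control of a, giving a XOR b;
    # the following inverter turns it into a XNOR b.
    xnor_ab = inv(mux2(a, b, inv(b)))
    # When a == b the sum bit equals Cin, otherwise its complement.
    s = mux2(xnor_ab, inv(cin), cin)
    # When a == b the carry equals a, otherwise Cin propagates.
    cout = mux2(xnor_ab, cin, a)
    # Return the carry together with its complement for the next stage.
    return s, cout, inv(cout)


def ripple_add(x, y, width):
    """Cascade `width` full adders, LSB first; returns (sum, carry_out)."""
    carry = 0
    total = 0
    for i in range(width):
        s, carry, _ = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry


if __name__ == "__main__":
    print(ripple_add(9, 7, 4))
```

The third return value mirrors the idea that each stage hands both carry polarities to the next stage; in this purely logical model it is redundant, but in the circuit it is what allows the next XNOR and multiplexers to use a single select signal.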
Figure 2. MOS schematic circuit design of the CLRCL full adder

III. PERFORMANCE ANALYSES AND SIMULATION RESULTS

CKT       Ref   TXRs #   Circuit type                     Voltage swing
28T             28       Complementary CMOS logic         Full
TG-CMOS         20       Transmission gate logic (TGL)    Full
TFA       [1]   16       Transmission gate logic          Full
14T       [2]   14       PTL + TGL + inverter             Full
16T       [3]   16       PTL + TGL                        Full
SERF      [4]   10       Pass transistor logic (PTL)      Degraded
9A        [5]   10       Pass transistor logic (PTL)      Degraded
9B        [5]   10       Pass transistor logic (PTL)      Degraded
13A       [5]   10       Pass transistor logic (PTL)      Degraded

A. Simulation results of single-bit full adder designs
Simulations are conducted using the TSMC 0.35 µm 2P4M process and a 3.3 V power supply
(except for the Vdd,min simulations). Typical transistor sizes, i.e.
(W/L)p = 2 µm/0.35 µm and (W/L)n = 1 µm/0.35 µm, are
employed. Worst case output voltage levels are recorded.
Note that the two values in each entry for the proposed
CLRCL circuit correspond to the two complementary carry outputs
(Cout and its complement). Among these 10T designs, the SERF, 9A, 9B
and 13A designs all have two threshold voltage losses in Cout high. The
proposed CLRCL design, however, encounters only one
threshold voltage loss, along the unbuffered carry output signal.
Similarly, the other 10T designs encounter 2 to 3 times the
threshold voltage loss in the Sum high signal, while the
proposed CLRCL design suffers only one threshold voltage
loss. As a consequence, the proposed design requires the
lowest Vdd among all 10T designs. To determine the minimum
Vdd at which the circuits can still function properly, every output
of the full adder is loaded with a typically sized inverter, where
(W/L)p = 1.4 µm/0.35 µm and (W/L)n = 0.7 µm/0.35 µm.
A full adder design is considered as working
properly subject to a power supply if its output voltage swing
is larger than the range confined by the VIH and the VIL of the
loading inverter. From Table II, the proposed CLRCL design
has the lowest Vdd,min among all the 10-transistor full adder
designs. Note that the Vdd,min numbers in Table II are obtained
under the condition of sufficient time for signals to
reach their steady states. Vdd,min will increase with the
working frequency, as a shorter time period is allocated for
higher frequency operations. Table III summarizes the Vdd,min
values at different working frequencies. Again, the proposed
CLRCL design has the lowest Vdd,min among all 10T designs
at every working frequency.

TABLE II. HSPICE DC ANALYSIS RESULTS OF FULL ADDER DESIGNS

Designs   Cout high,min (V)   Cout low,max (V)   Sum high,min (V)   Sum low,max (V)   Vdd,min (V)
SERF      1.59                0.97               1.59               0                 2.8
9A        1.58                0.98               1.12               0.96              2.8
9B        1.58                0.98               1.58               0.94              2.8
13A       1.59                0.98               1.59               0.96              2.8
CLRCL     2.36/3.3            0.97/0             2.21               0.98              1.9

TABLE III. Vdd,min VERSUS WORKING FREQUENCY

Vdd,min (V)   @100MHz   @125MHz   @250MHz   @500MHz
SERF          2.8       2.8       2.9       3.1
9A            2.8       2.8       2.9       3.1
9B            2.8       2.8       2.8       3.1
13A           2.8       2.8       2.8       3.1
CLRCL         2.0       2.1       2.2       2.5

B. Simulation results of ripple adder designs

Since the speed performance of pass transistor logic
based designs tends to degrade drastically with the depth of
logic chaining, the simulations are conducted for
different ripple adder sizes of 2, 4, 8 and 16 bits.
The performance indexes include worst case delay
(maximum working frequency) and average energy
consumption per addition. To evaluate the worst case delays,
it is clear that the most critical timing lies in the carry
propagation from LSB to MSB. Since there is more than
one input pattern leading to carry generation or propagation
in each bit of the full adder, only the one that results in the
longest delay (usually degraded outputs) is selected.
Note that the test patterns are structure dependent. We use
Vdd = 3.3 V in our worst case delay simulations and the
results are summarized in Table IV. Since the proposed
CLRCL design has complementary carry signals, two delay
numbers are given. The maximum working frequency (fmax
in MHz) is calculated as the reciprocal of the larger delay.
Among the ten designs under comparison, CLRCL, 9A, 9B,
13A and SERF are 10T designs. The remaining 5 designs are
higher gate count designs with full voltage swing operation.
They are considered to have better speed performance, when
compared with 10T designs, at the cost of increased circuit
complexity.

From the table, it is clear that the proposed
CLRCL design has the minimum delays among all the 10T
designs, and the gap of the delays becomes even wider as the
size of the ripple adder grows. At a word length of 16 bits,
the delay of the CLRCL design is almost an order of magnitude
smaller than those of the other 10T designs. Compared with the
higher gate count designs, the proposed CLRCL design is
inferior in speed, but the delay remains in the same
order of magnitude as those of the higher gate count designs. This
is mainly attributed to the inverter buffering strategy of the
design.

TABLE IV. SPEED ANALYSIS OF DIFFERENT RIPPLE ADDER DESIGNS

Category            Design    2-bit delay (ns)   4-bit delay (ns)   8-bit delay (ns)   16-bit delay (ns)
10T                 CLRCL     1.57/1.89          3.44/3.73          7.15/7.43          14.48/14.86
10T                 9A        2.99*              9.03*              22.54*             77.3*
10T                 9B        3.52               10.04              30.8               106
10T                 13A       3.53               8.78               24.9               77
10T                 SERF      2.93               7.99               22.45              70.8
Higher gate count   14T       0.775              1.43               3.66               11.51
Higher gate count   16T       0.608              1.26               3.1                9.7
Higher gate count   TFA       0.547              1.13               2.845              8.386
Higher gate count   TG-CMOS   0.502              1.082              3.05               10.21
Higher gate count   28T       0.793              1.411              2.66               5.097
* measurement for Cout only, but Sum signal is void

We will next examine the power / energy consumption
issues of these designs. The index term, as given in Eq.
(5), is the average energy consumption per addition (or per
input transition).

$$\text{energy per addition} = P \cdot T = \left. \frac{P}{f_{max}} \right|_{V_{dd}} \qquad (5)$$

Since most of the power consumption occurs due to input
transition, the index is calculated as the product of average
power and the clock period under the condition of input
transition at fmax. This term is also equivalent to the
normalized power consumption per MHz. Again, Vdd is
set to 3.3 V in our simulation. In Figure 3, the comparison
of energy consumption per addition for the five 10T designs
is illustrated. The proposed CLRCL design has the smallest
figures and the discrepancy with other designs becomes
larger and larger with the increase of full adder word length.
This is mainly because the other 10T based designs lack
proper driving capability in cascading and take much longer
to accomplish the computation in large word length
operation. The energy, or the power delay product, will thus
deteriorate. Note that design 9B has the worst energy
performance in that it suffers from the slowest operation. To
compare the performance of those designs with different gate
counts, a new performance index called area weighted
energy consumption is devised. It is defined as the product of
area, power and delay. In Figure 4, we show the comparison
results of the proposed CLRCL design versus other high gate
count designs measured in this index. Except for the 28T design, the
discrepancy in index performance for different designs at word
lengths less than 8 bits is not significant. However, when the word
length reaches 16 bits, the index values of the 14T, 16T, TFA and
TG-CMOS designs start to soar. On the other hand, the
CLRCL and 28T designs exhibit only a mild increase in index values.
Meanwhile, the CLRCL design enjoys the lowest index value at
the 16-bit design. This, again, demonstrates the superiority of the
proposed design.

The layout size of the CLRCL design is 14.3 µm by
15.8 µm. We also conduct post layout simulations; the results
are summarized in Table V.

TABLE V. PRE-LAYOUT VS. POST-LAYOUT SIMULATION RESULTS

CLRCL     Speed, pre-layout    Speed, post-layout   Power (W), pre-layout   Power (W), post-layout
16-bit    66.7 MHz / 14.9 ns   42 MHz / 23.6 ns     11.697e-04              11.981e-04

IV. CONCLUSION

In conclusion, in this paper, we presented a novel full
adder design using as few as 10 transistors per bit. In the
DC aspect, the CLRCL design requires the lowest power
supply among the 10T designs. In the AC aspect, under a
fixed Vdd, the CLRCL design also enjoys the highest
working frequency. We also include other higher gate count
full adder designs in our comparison. The CLRCL design
can achieve comparable speed performance while using a
much smaller gate count. In terms of energy efficiency, the
CLRCL design features the lowest energy consumption per
addition among all 10T designs. Its area weighted energy
consumption index is also superior to those of higher gate
count designs when the word length equals 16 bits.
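
The performance indexes used in the comparison (maximum working frequency, energy per addition, and area weighted energy consumption) can be illustrated with a short numerical sketch. The function names are hypothetical. The delay pair is the 16-bit CLRCL entry of Table IV; using the pre-layout power figure of Table V as the average power at fmax, and the layout footprint as the "area" term, are assumptions made only for illustration, since the text does not state which power and area values enter the indexes.

```python
def fmax_mhz(delays_ns):
    """Maximum working frequency in MHz: reciprocal of the larger delay (ns)."""
    return 1e3 / max(delays_ns)


def energy_per_addition_j(power_w, f_mhz):
    """Average energy per addition in joules: P * T = P / fmax."""
    return power_w / (f_mhz * 1e6)


def area_weighted_energy(area, power_w, delay_ns):
    """Product of area, power and delay (units follow the chosen area unit)."""
    return area * power_w * (delay_ns * 1e-9)


# 16-bit CLRCL ripple adder: the two delays reported for the complementary carries.
clrcl_delays_ns = (14.48, 14.86)
f16 = fmax_mhz(clrcl_delays_ns)

# Assumption: pre-layout power from Table V taken as the average power at fmax.
power_w = 11.697e-4
e16 = energy_per_addition_j(power_w, f16)

# Assumption: layout footprint (14.3 um x 15.8 um) used as the area term.
area_um2 = 14.3 * 15.8
awe16 = area_weighted_energy(area_um2, power_w, max(clrcl_delays_ns))

if __name__ == "__main__":
    print(f"fmax = {f16:.2f} MHz")
    print(f"energy per addition = {e16:.3e} J")
    print(f"area weighted energy = {awe16:.3e} um^2*J")
```

Because the index multiplies area by energy, a design with a larger gate count can only stay competitive if its delay and power advantages outweigh its area, which is the trade-off the comparison in Figure 4 is meant to expose.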
Figure 3. Comparison of energy consumption per addition for different 10T
based designs (CLRCL, 9A, 9B, SERF, 13A; horizontal axis: word length of 2, 4, 8, 16 bits)

Figure 4. Comparison of index for CLRCL versus high gate count full
adder designs (CLRCL, 14T, 16T, TFA, TG-CMOS, 28T; horizontal axis: word length of 2, 4, 8, 16 bits)

ACKNOWLEDGMENT

This work was partially supported by National Chip
Implementation Center of Taiwan and National Science
Council of Taiwan under the Grant NSC94-EC-17-A-01-S1-037.

REFERENCES

[1] N. Zhuang and H. Wu, "A new design of the CMOS full adder,"
IEEE J. Solid-State Circuits, vol. 27, pp. 840-844, May 1992.
[2] J. Wang, S. Fang, and W. Feng, "New efficient designs for XOR and
XNOR functions on the transistor level," IEEE J. Solid-State Circuits,
vol. 29, pp. 780-786, July 1994.
[3] A. M. Shams and M. A. Bayoumi, "A novel high-performance
CMOS 1-bit full adder cell," IEEE Trans. Circuits and Systems II,
vol. 47, no. 5, May 2000.
[4] R. Shalem, E. John, and L. K. John, "A novel low-power energy
recovery full adder cell," in Proc. Great Lakes Symp. VLSI, pp. 380-383,
Feb. 1999.
[5] H. T. Bui, Y. Wang, and Y. Jiang, "Design and analysis of low-power
10-transistor full adders using novel XOR-XNOR gates," IEEE
Trans. Circuits and Systems II, vol. 49, no. 1, pp. 25-30, Jan. 2002.
[6] A. Fayed and M. Bayoumi, "A low power 10-transistor full adder cell
for embedded architectures," IEEE Symposium of Circuits and
Systems, Sydney, Australia, pp. 226-229, May 2001.
[7] N. Weste and K. Eshraghian, Principles of CMOS VLSI Design, a
System Perspective. Reading, MA: Addison-Wesley, 1993.