A Review of Adiabatic Computing: AT&T Bell
1994 IEEE Symposium on Low Power Electronics. 0-7803-1953-2/94/$3.00 © 1994 IEEE
Authorized licensed use limited to: INDIAN INST OF INFO TECH AND MANAGEMENT. Downloaded on September 07,2020 at 19:37:46 UTC from IEEE Xplore. Restrictions apply.
gate-drain capacitance of the PFET.) The energy curve at time 2 ns includes the energy stored in the capacitance, plus the energy dissipated during charging (in the form of $I^2R$ heating in the PFET channel, wiring, etcetera). The stored energy is dissipated during discharge (in the NFET channel, etcetera).

During the two operations (charge and discharge), an energy $CV_{dd}^2$ disappears from the supply. Therefore all circuits of this type must dissipate at least $\tfrac{1}{2}CV_{dd}^2$ per operation. This is a powerful result. It is independent of the internal details of the circuit, and a similar result holds even if the node’s charge-versus-voltage characteristic is nonlinear.

The dissipated energy can easily exceed this lower bound. For instance, if the NFET and PFET are ever turned on simultaneously, a “crowbar current” will flow directly from $V_{dd}$ to ground, dissipating energy without contributing to the desired charging or discharging of the node X.

The conventional approaches to reducing the energy per computation are:
• Reducing the operating voltage $V_{dd}$.
• Reducing the capacitance C.
• Reducing the activity factor (i.e. the number of node transitions per useful computation).

Obviously such reductions are advantageous, but there are limits; in any case for present purposes we take such reductions for granted and show how dissipation can be further reduced at any particular C, $V_{dd}$, and activity factor.

An Adiabatic Example

The foregoing energy theorem depends on the assumption that all the needed electrons are extracted from the supply via the $V_{dd}$ terminal and returned via the ground terminal. The essential idea of adiabatic computing is to lift this assumption. That is:

    We extract charge from the supply at the lowest feasible voltage, and return it at the highest feasible voltage.

We now show how this can be done; figure 3 is an example. (This circuit is similar to the vanilla CMOS inverter in figure 1, and performs a similar function - but is not quite as practical, for reasons to be discussed below.) The solid waveforms in figure 2 apply to this circuit. The transistors remain on while the power supply is ramped up and down. As we shall see later, this time-dependent power supply can take over the timing functions of a traditional clock signal, so we will call it a power-clock. The rise and fall times of the power-clock are several-fold longer than the natural RC time of the node (where R includes the resistance of the transistor channels, wiring resistance, etcetera).

Fig. 3. Toy Example: Adiabatic CMOS

The adiabatic circuit dissipates less energy during charging; the energy trace at time 2 ns is mostly transferred energy. During discharge (2.25 - 2.75 ns) most of this energy is returned to the supply.

The adiabatic circuit charges node X to the same voltage as the vanilla circuit. This can be seen in the voltage waveform at time 2 ns. However, the charge is delivered over a longer time. This circuit’s peak currents are several-fold smaller. This is crucial. Remember, the dissipated power, $I^2R$, is a nonlinear function of I. If we slow down the power-clock’s risetime by a factor of N,
• The time required increases by a factor of N.
• The current decreases by a factor of N.
• The power decreases by a factor of $N^2$.
• The dissipated energy per operation decreases by a factor of N.
• The transferred charge and energy are unchanged.

This is in sharp contrast to vanilla CMOS, where simply reducing the clock rate reduces the power by only one factor of N and reduces the energy not at all.

The voltage at node X closely tracks the power-clock voltage (the dashed line in the figure). This is crucial: at no time does a current flow across a large potential drop. This is in contrast to vanilla CMOS, where currents routinely flow across drops on the order of $V_{dd}$.

Types of Adiabatic Circuitry

Circuits can be categorized on the basis of their energy performance as follows:
1 - Fully adiabatic circuits (e.g. figure 3) - which, if operated arbitrarily slowly, would dissipate arbitrarily little energy per operation. Remember, “full” adiabaticity is a statement about the asymptotic behavior at very low frequencies.
2 - Partially adiabatic circuits - in which charge is transferred across reduced potential drops, and some energy is recovered. However, some energy is lost due to operations that are irreversible in principle.
3 - Utterly non-adiabatic circuits (e.g. vanilla CMOS, figure 1) - in which no attempt is made to minimize potential drops, or to recover the transferred energy.

This field is rooted in discussions of “logical reversibility” and the thermodynamics of computation[1; 2]. A logically reversible device is one where if you tell me the output I can tell you what the input must have been; an inverter is a perfect example. In contrast, an adder is not logically reversible, because if you tell me the sum I cannot tell you what the addends must have been.

Figure 4 shows a mechanical adder. Even though it is not logically reversible, it is thermodynamically and mechanically reversible in the usual sense. If operated slowly, it would dissipate arbitrarily little energy per operation. As will become clearer below, logical reversibility is neither necessary nor sufficient for thermodynamic reversibility (or adiabaticity).

We classify the mechanical adder as a push-through logic device. The output changes almost as soon as the input is changed. The bad news is that the device provides no
power gain; the force and energy required to move the output pointer must be supplied by whoever is moving the input pointer. Furthermore, there is no logic-level restoration (i.e. a degraded input produces a degraded output).

Fig. 4. Mechanical Adder

The circuit of figure 3 can be extended to perform a NAND function. (It needs another NFET in series and another PFET in parallel.) We classify this as an escalator logic device, which is very different from a push-through device. The output is carried up and down by attaching it to the power-clock. The output energy and the output timing come from the power-clock, not from the inputs. The device provides power gain, and performs logic-level restoration. It is not logically reversible, but it is fully adiabatic.

In a push-through device, the outputs necessarily and immediately follow the inputs, so an adiabatic input generally guarantees an adiabatic output. In an escalator device, however, some restrictions must be enforced to prevent an “oops” - an accidental non-adiabatic transition. Specifically, an output that is low must not be connected to a power-clock that is already high; similarly, an output that is high (perhaps as the result of a previous calculation) must not be connected to a power-clock that is already low. Consequently, most escalator devices are pulse-mode escalator devices. The circuit in figure 3 illustrates this. The output has a predetermined “resting” level (ground in this case). Whenever the output makes a transition away from the resting level, it must be returned (“recharged”) to the resting level before the start of the next calculation. This recharge step carries a terrible price.

The problem is that recharge must be performed if and only if it was necessary. Suppose (as is diagrammed) the input is asserted (i.e. A = high, Ā = low) while the power-clock ramps up. Then the input must remain asserted while the power-clock ramps down, so that the output may return to its resting level.

On the other hand, now suppose that (contrary to what is diagrammed) the input had been unasserted (i.e. A = low, Ā = high) while the power-clock ramped up. The output, node X, would have remained at the resting level (low). The input must not become asserted until the power-clock has returned to the resting level - lest an oops occur.

Adiabatic Recharge Schemes

There are three ways to deal with the recharge problem:
• Retractile cascade schemes.
• Memory schemes.
• Regenerative schemes.

Retractile Cascade - The most direct way to guarantee that recharge will be performed if and only if necessary is to require not just (a) that the input be valid while the desired output is being computed (the evaluate phase) but also (b) that it remain valid until the recharge phase is completed.

Let’s see what this implies for a complex calculation consisting of M stages. Let’s suppose the final output, the output of the Mth stage, must be valid for one clock phase. The input to that stage must be valid for one phase before and after, for a total of 3 phases. The input to stage 1 must be valid for 2M + 1 phases. This is illustrated in figure 5 for the case M = 3.

Fig. 5. Retractile Cascade Timing (traces, outermost to innermost: Stage 1 In; Stage 2 In / Stage 1 Out; Stage 3 In / Stage 2 Out; Stage 3 Out)

This staggered timing diagram is the signature of the retractile cascade scheme. The basic idea[2] has several electronic embodiments[6; 8].

One undesirable aspect is that the latency (the interval between accepting one input and accepting the next) is increased by a factor of two, because time is needed for the retractions. What’s worse is that the throughput is reduced by a factor of M or so, since no pipelining is possible.

Memory Schemes - To permit pipelining, the first thing that comes to mind is to recharge the output node without having valid inputs available at recharge time. The circuit of figure 6 uses this scheme.

This approach carries its own price. If the output remains valid after the input has gone invalid, then the device is performing a memory function, whether we intended it or not. The laws of physics[1] tell us that erasing one bit of memory must cost some energy, no matter how slowly it is done. (Retraction, unlike erasure, can be dissipation-free since the still-valid input allows us to know the prior state of the node we are trying to recharge.) The fundamental physical limit is on the order of $kT$; the practical limit appears to be on the order of $CV_t^2$, where $V_t$ is a threshold voltage - several orders of magnitude larger than $kT$, but still much less than $CV_{dd}^2$.

The memory function need not be explicit or complex. Some logic families[4; 5; 9] leave the output node in a high impedance state until recharge time; charge trapped on the node constitutes the memory.

Because of this erasure energy, memory schemes are not asymptotically adiabatic. However, for the applications of most interest to us, the operating frequency is well above the asymptotic regime anyway. At any given nonzero frequency, all we care about is the actual dissipation at that frequency.

Regenerative schemes - As mentioned above, the recharge energy can be arbitrarily small if we know the prior state of the node. If the gate at stage m implements a logically reversible function, a tantalizing possibility arises: we can use
the stage-m outputs to control the recharge of the stage-m inputs[10; 11]. This idea can be applied without difficulty to a chain of inverters and/or buffers (i.e. a shift register). Unfortunately, for typical logic functions F, it is prohibitively difficult to implement the functional inverse $F^{-1}$.

Clockability

In any digital circuit, dissipation in the clock generator is an important part of the overall energy budget. An adiabatic system can use about N constant-voltage sources (capacitors and/or batteries) and about N switches to synthesize an N-step approximation to the required ramplike waveforms[7]. This works fine at low frequencies, but at high frequencies the energy required to operate the switches is prohibitive.

We prefer resonant supplies; that is, an RLC circuit, where R is the parallel combination of all the transistor channel resistance and wiring resistance on the chip, and C is the parallel combination of all the gate capacitance and wiring capacitance. We need one or two inductors per chip (certainly not per logic gate). Resonant schemes, alas, require that the clock driver see a relatively constant load capacitance - independent of data patterns.

A Working System

Figure 6 shows a high-performance adiabatic logic gate. It has excellent noise immunity compared to previous logic families[9]. The transistor area required for an inverter in this family is large compared to vanilla CMOS, but is actually smaller for many-input gates. Two-wire (differential) signalling ensures a constant load to the power-clock driver - but doubles the wiring area.

Fig. 6. Adiabatic Logic Gate

Figure 7 shows the energy per operation for a chain of inverters, comparing the adiabatic family to vanilla CMOS. At 200 MHz, the adiabatic circuit dissipates fourfold less energy. At lower frequencies, the advantage is even larger. The dissipation of the vanilla CMOS can be reduced by lowering $V_{dd}$ from 5 volts to 3.5 volts, but then the operating speed is compromised.

Fig. 7. Energy versus Frequency (horizontal axis: log10 Frequency / MHz; curve label: Vanilla 5.0)

Conclusions + Acknowledgements

We will not see fully adiabatic microprocessor circuits on the market any time soon. In the short run, there are easier ways to reduce system energy requirements. On the other hand, these are novel ideas that will have important applications eventually.

Many of the ideas presented here were developed in collaboration with Alan Kramer. The Bell Labs adiabatic computing effort has benefited from the contributions of Steve Avery, Bryan Ackland, Alex Dickinson, Yann leCun, Larry Jackel, Tom Wik, and many others.

References

[1] Rolf Landauer, “Irreversibility and Heat Generation in the Computing Process,” IBM J. Res. Devel. 5 (1961).
[2] C. H. Bennett, “Logical Reversibility of Computation,” IBM J. Res. Devel. 17, 525-532 (1973).
[3] C. Seitz et al., “Hot Clock nMOS,” Proceedings of the 1985 Chapel Hill Conference on VLSI. Computer Science Press (1985).
[4] Roderick T. Hinman and Martin F. Schlecht, “Power Dissipation Measurements on Recovered Energy Logic,” 1994 Symposium on VLSI Circuits / Digest of Technical Papers, 19. IEEE (June 1994); also R. T. Hinman and M. F. Schlecht, “Recovered Energy Logic ...,” Proceedings of the IEEE Power Electronics Specialists Conference. (1993).
[5] A. G. Dickinson and J. S. Denker, “Adiabatic Dynamic Logic,” Proceedings of the Custom Integrated Circuits Conference. IEEE (1994).
[6] J. G. Koller and W. C. Athas, “Adiabatic Switching, Low Energy Computing, and the Physics of Storing and Erasing Information,” PhysComp ’92: Proc. of the Workshop on Physics and Computation. IEEE (1993).
[7] W. C. Athas, L. “J.” Svensson, J. G. Koller, N. Tzartzanis, Y-C Chou, “A Framework for Practical Low-Power Digital CMOS Systems Using Adiabatic Switching Principles,” Int’l Workshop on Low-Power Design, (unpublished, 1994).
[8] Ralph C. Merkle, “Reversible Electronic Logic using Switches,” Nanotechnology 4, 21-40. (1993).
[9] Alan Kramer, John S. Denker, Stephen C. Avery, Alex G. Dickinson, and Thomas R. Wik, “Adiabatic Computing with the 2N-2N2D Logic Family,” 1994 Symposium on VLSI Circuits / Digest of Technical Papers, 25. IEEE (June 1994).
[10] J. S. Hall, “An Electroid Switching Model for Reversible Computer Architectures,” PhysComp ’92: Proc. of the Workshop on Physics and Computation. IEEE (1993).
[11] S. G. Younis and T. Knight, “Practical Implementation of Charge Recovering Asymptotically Zero Power CMOS,” Proc. of 1993 Symposium on Integrated Systems, 234-250. MIT Press (1993).
[12] T. J. Gabara, “Pulsed Low Power CMOS,” Inter. J. of High Speed Elec. and Systems, 5, 2 (1994).
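
Worked example (illustrative only). The scaling argument in “An Adiabatic Example” and the phase count in the retractile cascade discussion can be checked numerically. The sketch below is not taken from the paper: it assumes an idealized linear RC node driven by a linear voltage ramp, uses the closed-form dissipation for that idealized case, and the function names and component values (1 kΩ, 100 fF, 5 V) are hypothetical choices made for illustration.

```python
import math


def conventional_energy(c, vdd):
    """Energy dissipated when a vanilla CMOS node is charged (or discharged): C*Vdd^2/2."""
    return 0.5 * c * vdd ** 2


def ramp_energy(r, c, vdd, ramp_time):
    """Energy dissipated in R while a linear ramp (0 -> vdd in ramp_time) charges C.

    Assumption: ideal linear RC circuit; the settling tail after the ramp is
    included. With x = ramp_time / (R*C) the closed form is
        E = C*Vdd^2 * (1/x) * (1 - (1 - exp(-x)) / x).
    """
    x = ramp_time / (r * c)
    one_minus_exp = -math.expm1(-x)  # 1 - exp(-x), computed stably
    return c * vdd ** 2 * (1.0 - one_minus_exp / x) / x


def retractile_valid_phases(m):
    """Number of phases the input of each stage 1..m must stay valid."""
    phases = 1  # the final output is valid for one phase
    per_stage = []
    for _ in range(m):
        phases += 2  # one extra phase before and one after
        per_stage.append(phases)
    return per_stage[::-1]  # list starts with stage 1


if __name__ == "__main__":
    R, C, VDD = 1e3, 100e-15, 5.0  # hypothetical values
    tau = R * C
    print("vanilla charge:", conventional_energy(C, VDD), "J")
    for n in (1, 10, 100):
        t_ramp = n * 10 * tau  # slow the ramp by a factor of n
        print("ramp", t_ramp, "s ->", ramp_energy(R, C, VDD, t_ramp), "J")
    print("valid phases for M = 3:", retractile_valid_phases(3))
```

Under these assumptions, each tenfold increase in ramp time lowers the dissipated energy by roughly a factor of ten once the ramp is much longer than RC, while a very fast ramp approaches the vanilla figure of $\tfrac{1}{2}CV_{dd}^2$. The phase count for M = 3 reproduces the 2M + 1 = 7 phases quoted for the stage-1 input.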