
Module 4_Combination_circuit_design_notes.docx

Uploaded by

rashmi
Copyright
© © All Rights Reserved

Module 4

Syllabus:
Combinational Circuit Design: Circuit Families- Static CMOS, Ratioed
Circuits, Cascode Voltage Switch Logic, Dynamic Circuits, Pass-Transistor
Circuits.
Sequential Circuit Design: Circuit Design of Latches and Flip-flops,
Conventional CMOS Latches, Conventional CMOS Flip-flops, Pulsed Latches,
Resettable Latches and Flip-flops, Enabled Latches and Flip-flops. (Text-1)

Combinational Circuit Design


Circuit Families
Static CMOS circuits with complementary nMOS pulldown and pMOS pullup
networks are used for the vast majority of logic gates in integrated circuits.
They have good noise margins and are fast, low power, insensitive
to device variations, easy to design, widely supported by CAD tools, and
readily available in standard cell libraries. When noise exceeds the margins,
the gate delay increases because of the glitch, but the gate eventually
settles to the correct answer.
Some of the circuit families are:
a) Static CMOS,
b) Ratioed Circuits,
c) Cascode Voltage Switch Logic,
d) Dynamic Circuits,
e) Pass-Transistor Circuits
Static CMOS circuits are mostly used for combinational logic.

1. Static CMOS
For designing static CMOS circuits the following aspects need to be considered:
• Designers accustomed to AND and OR functions must learn to think
in terms of NAND and NOR to take advantage of static CMOS.
• Manual circuit design is done through bubble pushing.
• Compound gates are particularly useful to perform complex
functions with relatively low logical efforts.
• When a particular input is known to be latest, the gate can be
optimized to favour that input.
• Similarly, when either the rising or falling edge is known to be more
critical, the gate can be optimized to favour that edge.
• The usual focus is on building gates with equal rising and falling delays;
however, using smaller pMOS transistors can reduce power, area,
and delay.
• In processes with multiple threshold voltages, multiple flavours of
gates can be constructed with different speed/leakage power trade-
offs.
1.1. Bubble Pushing
CMOS stages are inherently inverting, so AND and OR functions must be built from NAND and NOR
gates. DeMorgan’s law helps with this conversion:

(A · B)' = A' + B'
(A + B)' = A' · B'

These relations are illustrated graphically in Figure 9.1. A NAND gate is
equivalent to an OR of inverted inputs. A NOR gate is equivalent to an AND of
inverted inputs. The same relationship applies to gates with more inputs.
Switching between these representations is easy to do on a whiteboard and is
often called bubble pushing.
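The equivalences described above can be checked exhaustively. The following is a minimal Python sketch (the function names are illustrative and do not come from the notes) that compares a NAND with an OR of inverted inputs, and a NOR with an AND of inverted inputs, for every input combination.

```python
from itertools import product

def nand(a, b):
    return not (a and b)

def nor(a, b):
    return not (a or b)

def or_of_inverted(a, b):
    # "Bubbles" pushed to the inputs of an OR gate
    return (not a) or (not b)

def and_of_inverted(a, b):
    # "Bubbles" pushed to the inputs of an AND gate
    return (not a) and (not b)

# Check both identities over the full 2-input truth table
demorgan_holds = all(
    nand(a, b) == or_of_inverted(a, b) and nor(a, b) == and_of_inverted(a, b)
    for a, b in product([False, True], repeat=2)
)
print(demorgan_holds)
```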
1.2. Compound Gates
Compound gates computing various inverting combinations of AND/OR
functions can be designed in a single stage. The function F = AB + CD can be
computed with an AND-OR-INVERT-22 (AOI22) gate and an inverter, as
shown in Figure 9.3.

Logical effort of compound gates can be different for different inputs. Figure
9.4 shows how logical efforts can be estimated for the AOI21, AOI22, and a
more complex compound AOI gate.
a) The transistor widths are chosen to give the same drive as a unit
inverter.
b) The logical effort of each input is the ratio of the input capacitance of
that input to the input capacitance of the inverter. For the AOI21 gate,
this means the logical effort is slightly lower for the OR terminal (C) than
for the two AND terminals (A, B).
c) The parasitic delay is coarsely estimated from the total diffusion
capacitance on the output node by summing the sizes of the transistors
attached to the output.
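As a worked example of steps (a) and (b), the sketch below computes the per-input logical effort of an AOI21 gate. The transistor widths are an assumption (the figure is not reproduced in these notes): they follow from sizing the gate for unit drive with a unit inverter of nMOS width 1 and pMOS width 2, which gives nMOS widths of 2 for the series pair A, B, an nMOS width of 1 for C, and pMOS widths of 4 throughout.

```python
from fractions import Fraction

UNIT_INVERTER_CIN = Fraction(3)  # nMOS width 1 + pMOS width 2

# Assumed AOI21 sizing for Y = (A*B + C)' with unit-inverter drive
aoi21_widths = {
    "A": {"nmos": 2, "pmos": 4},
    "B": {"nmos": 2, "pmos": 4},
    "C": {"nmos": 1, "pmos": 4},
}

def logical_effort(widths):
    """Input capacitance of one terminal divided by that of a unit inverter."""
    return Fraction(widths["nmos"] + widths["pmos"]) / UNIT_INVERTER_CIN

efforts = {name: logical_effort(w) for name, w in aoi21_widths.items()}
for name, g in efforts.items():
    print(f"g_{name} = {g}")
```

Under these assumed widths the OR terminal C comes out slightly lower than the AND terminals, in line with statement (b).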
1.3. Input Ordering Delay Effect
In some logic gates the logical effort and parasitic delay differ from one
input terminal to another; such gates are called asymmetric. Even gates such as NANDs and NORs, which are
nominally symmetric, actually have slightly different logical efforts and parasitic delays for their
different inputs.
✓ Figure 9.6 shows a 2-input NAND gate annotated with diffusion
parasitics.
✓ Consider the falling output transition that occurs when one input holds a
stable 1 value and the other rises from 0 to 1. If input B rises last, node
x will initially be at VDD – Vt ≈ VDD because it was pulled up through the
nMOS transistor on input A. The Elmore delay is (R/2)(2C) + R(6C) =
7RC = 2.33τ, where τ = 3RC.
✓ If input A rises last, node x will initially be at 0 V because it was
discharged through the nMOS transistor on input B.
✓ No charge must be delivered to node x, so the Elmore delay is simply
R(6C) = 6RC = 2τ.
✓ In general, the outer input (e.g., B) is closer to the supply rail, i.e., VDD or GND,
and the inner input (e.g., A) is closer to the output.
✓ The parasitic delay is smallest when the inner input switches last
because the intermediate nodes have already been discharged.
✓ Therefore, if one signal is known to arrive later than the others, the gate
is fastest when that signal is connected to the inner input.
✓ The logical efforts are lower than initial estimates might predict because
of velocity saturation. Interestingly, the inner input has a slightly higher
logical effort because the intermediate node x tends to rise and cause
negative feedback when the inner input turns ON.
✓ This effect is seldom significant to the designer because the inner input
remains faster over the range of fanouts used in reasonable circuits.
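The two Elmore delays quoted above can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions: R and C are normalized to 1, each series nMOS has resistance R/2, node x carries 2C and the output 6C (as in the expressions above), and the delays are normalized with τ = 3RC, which is the value implied by 7RC = 2.33τ.

```python
from fractions import Fraction

R = Fraction(1)          # unit transistor resistance (normalized)
C = Fraction(1)          # unit transistor capacitance (normalized)
TAU = 3 * R * C          # assumed normalization: tau = 3RC

R_SERIES_NMOS = R / 2    # each series nMOS in the NAND2
C_NODE_X = 2 * C         # diffusion capacitance on intermediate node x
C_OUTPUT = 6 * C         # diffusion capacitance on the output

def falling_delay(last_input):
    """Elmore delay of the falling output when `last_input` ('A' or 'B') rises last."""
    # The output always discharges through both series transistors.
    delay = (R_SERIES_NMOS + R_SERIES_NMOS) * C_OUTPUT
    if last_input == "B":
        # Outer input last: node x starts near VDD and drains through the B transistor.
        delay += R_SERIES_NMOS * C_NODE_X
    return delay

for name in ("A", "B"):
    d = falling_delay(name)
    print(f"input {name} last: {d} RC = {float(d / TAU):.2f} tau")
```

The inner input A gives the smaller delay, which is why a late-arriving signal is best connected there.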
1.4. Asymmetric Gates
✓ Asymmetry is provided by sizing the transistors differently for different inputs.
When one input is far less critical than another, even nominally
symmetric gates can be made asymmetric to favor the late input at the
expense of the early one.
✓ In a series network, this involves connecting the early input to the outer
transistor and making the transistor wider so that it offers less series
resistance when the critical input arrives.
✓ In a parallel network, the early input is connected to a narrower
transistor to reduce the parasitic capacitance.
✓ For example, consider the path in Figure 9.7(a).
✓ If reset occurs only under exceptional circumstances and can take place
slowly, the circuit should be optimized for input-to-output delay at the
expense of reset.
✓ This can be done with the asymmetric NAND gate in Figure 9.7(b). The
pulldown resistance is R/4 + R/(4/3) = R, so the gate still offers the
same drive as a unit inverter.
✓ The capacitance on input A is only 10/3, so the logical effort is 10/9.
This is better than 4/3, which is normally associated with a NAND gate.
✓ In the limit of an infinitely large reset transistor and unit-sized nMOS
transistor for input A, the logical effort approaches 1, just like an
inverter.
✓ The improvement in logical effort of input A comes at the cost of much
higher effort on the reset input.
✓ Note that the pMOS transistor on the reset input is also shrunk. This
reduces its diffusion capacitance and parasitic delay at the expense of
slower response to reset.
Figure 9.8 shows how to construct a symmetric NAND gate.
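The numbers for the asymmetric NAND can be reproduced, and the limiting case explored, with the sketch below. It assumes a pMOS width of 2 on input A (inferred from the stated input capacitance of 10/3) and a unit inverter input capacitance of 3; the function names are illustrative.

```python
from fractions import Fraction

UNIT_INVERTER_CIN = 3
PMOS_A = Fraction(2)   # assumed pMOS width on input A

def a_nmos_width(reset_width):
    """nMOS width on A so that the series pulldown resistance stays at R.

    A width-w transistor has resistance R/w, so 1/reset_width + 1/w_a = 1.
    Requires reset_width > 1.
    """
    reset_width = Fraction(reset_width)
    return reset_width / (reset_width - 1)

def pulldown_resistance(reset_width):
    """Series resistance of the two nMOS transistors, in units of R."""
    return 1 / Fraction(reset_width) + 1 / a_nmos_width(reset_width)

def effort_on_a(reset_width):
    """Logical effort of input A for a given reset nMOS width."""
    return (a_nmos_width(reset_width) + PMOS_A) / UNIT_INVERTER_CIN

for w in (2, 4, 1000):
    print(w, a_nmos_width(w), effort_on_a(w), float(effort_on_a(w)))
```

A reset width of 2 recovers the ordinary NAND effort of 4/3, a width of 4 gives 10/9, and very wide reset transistors push the effort towards 1.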
1.5. Skewed Gates
HI-skew gates favour the rising output transition and LO-skew gates
favour the falling output transition. This favouring is done by
decreasing the size of the noncritical transistor. The logical efforts for
the rising (up) and falling (down) transitions are called gu and gd,
respectively, and are the ratio of the input capacitance of the skewed
gate to the input capacitance of an unskewed inverter with equal drive
for that transition. Figure 9.9(a) shows how a HI-skew inverter is
constructed by downsizing the nMOS transistor. This maintains the
same effective resistance for the critical transition while reducing the
input capacitance relative to the unskewed inverter of Figure 9.9(b),
thus reducing the logical effort on that critical transition to gu = 2.5/3
= 5/6. The logical effort for the falling transition is estimated by
comparing the inverter to a smaller unskewed inverter with equal
pulldown current, shown in Figure 9.9(c), giving a logical effort of gd =
2.5/1.5 = 5/3. The degree of skewing (e.g., the ratio of effective
resistance for the fast transition relative to the slow transition) impacts
the logical efforts and noise margins; a factor of two is common. Figure
9.10 catalogs HI skew and LO-skew gates with a skew factor of two.
Skewed gates are sometimes denoted with an H or an L on their symbol
in a schematic.
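The skewed-inverter efforts follow directly from the definition above. The sketch below assumes that an unskewed inverter uses a pMOS twice as wide as its nMOS, and that the HI-skew inverter has a pMOS width of 2 and an nMOS width of 1/2 (consistent with the input capacitance of 2.5 used in the text); the LO-skew widths of 1 and 1 are an additional illustrative assumption.

```python
from fractions import Fraction

def skewed_inverter_efforts(pmos, nmos):
    """Return (gu, gd) for an inverter with the given pMOS and nMOS widths."""
    cin = pmos + nmos
    # Unskewed inverter with the same pullup: pMOS = pmos, nMOS = pmos / 2
    cin_equal_rise = pmos + pmos / 2
    # Unskewed inverter with the same pulldown: nMOS = nmos, pMOS = 2 * nmos
    cin_equal_fall = nmos + 2 * nmos
    return cin / cin_equal_rise, cin / cin_equal_fall

hi_gu, hi_gd = skewed_inverter_efforts(Fraction(2), Fraction(1, 2))
lo_gu, lo_gd = skewed_inverter_efforts(Fraction(1), Fraction(1))
print("HI-skew:", hi_gu, hi_gd)
print("LO-skew:", lo_gu, lo_gd)
```

In each case the effort is below 1 for the favoured transition and above 1 for the other.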
1.6. P/N Ratios
The P/N ratio (pMOS/nMOS size ratio) of a library of cells should be
chosen on the basis of area, power, and reliability, not average delay.
For NOR gates, reducing the size of the pMOS transistors significantly
improves both delay and area. In most standard cell libraries, the pitch
of the cell determines the P/N ratio that can be achieved in any
particular gate.
The pMOS transistors in the unskewed gate are enormous in order to
provide equal rise delay. They contribute input capacitance for both
transitions, while only helping the rising delay. By accepting a slower
rise delay, the pMOS transistors can be downsized to reduce input
capacitance and average delay significantly.
For processes with a mobility ratio of μn/μp = 2, as we have generally
been assuming, the best ratios are shown in Figure 9.11.

➢ Reducing the pMOS size from 2 to √2 ≈ 1.41 for the inverter gives the
theoretical fastest average delay, but this delay improvement is
only 3%.
➢ However, the smaller pMOS significantly reduces the transistor area. It also
reduces input capacitance, which in turn reduces power
consumption.
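The size of this improvement can be estimated with a simple first-order model. The sketch below is an illustration under stated assumptions, not a characterization of any real process: the nMOS has width 1 and resistance R, a pMOS of width P has resistance 2R/P, and the inverter drives identical copies of itself so that the load is proportional to 1 + P.

```python
def average_delay(p):
    """Average of rising and falling delay (arbitrary units) for pMOS width p."""
    rising = 2.0 / p      # pullup resistance with a mobility ratio of 2
    falling = 1.0         # pulldown resistance of the unit nMOS
    load = 1.0 + p        # gate capacitance of an identical inverter
    return load * (rising + falling) / 2.0

# Scan pMOS widths from 0.5 to about 3.5 and pick the fastest
candidates = [0.5 + 0.001 * i for i in range(3000)]
best_p = min(candidates, key=average_delay)
improvement = 1.0 - average_delay(best_p) / average_delay(2.0)

print(f"best pMOS width ~ {best_p:.3f}")
print(f"improvement over width 2 ~ {improvement:.1%}")
```

In this model the minimum falls near 1.414 and the gain over a width of 2 is roughly 3%, which is why area and power, rather than average delay, are the stronger arguments for a smaller P/N ratio.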
1.7. Multiple Threshold Voltages:
➢ Some CMOS processes offer two or more threshold voltages.
➢ Transistors with lower threshold voltages produce more ON
current, but also leak exponentially more OFF current.
➢ Libraries can provide both high- and low-threshold versions of
gates. The low-threshold gates can be used sparingly to reduce
the delay of critical paths.
➢ Skewed gates can use low-threshold devices on only the critical
network of transistors.
2. Ratioed Circuits
➢ Ratioed circuits use weak pull-up devices and stronger pull-down
devices. They reduce the input capacitance, and hence
improve logical effort, by eliminating the large pMOS transistors
loading the inputs, but they depend on the correct ratio of pull-up to
pull-down strength.
➢ If the pull-up is too strong, VOLmax may be too high; VOLmax is best
kept below the nMOS threshold voltage so that the low output does not turn ON the
next stage.
➢ If the pull-up is too weak, the rising transition will be too slow.
➢ Ratioed circuits also dissipate static power while the output is low, so they
must be used in a limited fashion where they provide significant
benefits.
2.1. Pseudo-nMOS
Figure 9.14 shows pseudo-nMOS logic gates, which are the most
common form of CMOS ratioed logic.

The pulldown network is like that of an ordinary static gate, but the pullup
network has been replaced with a single pMOS transistor that is grounded so
it is always ON.
The pMOS transistor is sized to have about 1/4 the strength (i.e.,
1/2 the effective width) of the nMOS pulldown network as a compromise
between noise margin and speed.
Consider a complementary CMOS unit inverter that delivers current I in both
rising and falling transitions. For the widths shown, the pMOS transistors
produce I/3 and the nMOS networks produce 4I/3. The logical effort for each
transition is computed as the ratio of the input capacitance to that of a
complementary CMOS inverter with equal current for that transition. For the
falling transition, the pMOS transistor current is less than the nMOS
pulldown. The output current is estimated as the pulldown current minus the
pullup current, (4I/3 – I/3) = I. Therefore, we will compare each gate to a unit
inverter to calculate gd. For example, the logical effort for a falling transition
of the pseudo-nMOS inverter is the ratio of its input capacitance (4/3) to that
of a unit complementary CMOS inverter (3), i.e., 4/9. gu is three times as great
because the current is 1/3 as much.
The pseudo-nMOS NOR has 10/3 units of diffusion capacitance as compared
to 3 for a unit-sized complementary CMOS inverter, so its parasitic delay
pulling down is 10/9. The pullup current is 1/3 as great, so the parasitic
delay pulling up is 10/3.
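The effort and parasitic figures above can be tabulated for NOR gates of any width. The sketch below assumes an nMOS width of 4/3 per input and a pMOS width of 2/3 (these widths are inferred from the stated currents of 4I/3 and I/3 and are not given explicitly in the notes), and compares against a unit inverter with an input capacitance of 3.

```python
from fractions import Fraction

NMOS_WIDTH = Fraction(4, 3)   # assumed: delivers 4I/3
PMOS_WIDTH = Fraction(2, 3)   # assumed: delivers I/3, gate tied to ground
UNIT_INVERTER = Fraction(3)   # input (and diffusion) capacitance of a unit inverter

def pseudo_nmos_nor(n_inputs):
    """Return (gd, gu, pd, pu) for an n-input pseudo-nMOS NOR."""
    gd = NMOS_WIDTH / UNIT_INVERTER          # only the nMOS loads each input
    gu = 3 * gd                              # pullup current is 1/3 as large
    diffusion = n_inputs * NMOS_WIDTH + PMOS_WIDTH
    pd = diffusion / UNIT_INVERTER
    pu = 3 * pd
    return gd, gu, pd, pu

for n in (1, 2, 4, 8):
    print(n, *pseudo_nmos_nor(n))
```

The logical effort stays at 4/9 (falling) and 4/3 (rising) regardless of the number of inputs, while the parasitic delay grows with each added pulldown transistor.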
Pseudo-nMOS is slower on average than static CMOS for NAND structures.
Pseudo-nMOS works well for NOR structures. The logical effort is independent
of the number of inputs in wide NORs, so pseudo-nMOS is useful for fast wide
NOR gates or NOR-based structures like ROMs and PLAs when power permits.
➢ Pseudo-nMOS gates will not operate correctly if VOL > VIL of the receiving
gate. This is most likely in the SF (Slow-Fast) design corner where
nMOS transistors are weak and pMOS transistors are strong.
➢ Designing for acceptable noise margin in the SF corner forces a
conservative choice of weak pMOS transistors in the normal corner.
➢ A biasing circuit can be used to reduce process sensitivity, as shown in
Figure 9.17.
➢ The goal of the biasing circuit is to create a Vbias that causes P2 to deliver
1/3 the current of N2, independent of the relative mobilities of the
pMOS and nMOS transistors. Transistor N2 has width of 3/2 and hence
produces current 3I/2 when ON.
➢ Transistor N1 is tied ON to act as a current source with 1/3 the current
of N2, i.e., I/2. P1 acts as a current mirror, using feedback to establish
the bias voltage needed to carry the same current as N1, I/2.
➢ The size of P1 is noncritical so long as it is large enough to produce
sufficient current and is equal in size to P2.
➢ Now, P2 ideally also provides I/2. In summary, when A is low, the
pseudo-nMOS gate pulls up with a current of I/2.
➢ When A is high, the pseudo-nMOS gate pulls down with an effective
current of (3I/2 – I/2) = I.
➢ To first order, this biasing technique sets the relative currents strictly
by transistor widths, independent of relative pMOS and nMOS
mobilities.

➢ Such replica biasing permits the 1/3 current ratio rather than the
conservative 1/4 ratio in the previous circuits, resulting in lower logical
effort.
➢ The bias voltage Vbias can be distributed to multiple pseudo-nMOS
gates.
➢ Ideally, Vbias will adjust itself to keep VOL constant across process
corners.
➢ The currents through the two pMOS transistors do not exactly match
because their drain voltages are unequal, so this technique still has
some process sensitivity.
➢ This bias is relative to VDD, so any noise on either the bias voltage line
or the VDD supply rail will impact circuit performance.
➢ Turning off the pMOS transistor can reduce power when the logic is idle
or during IDDQ test mode as shown in Figure 9.18.

2.2. Ganged CMOS


➢ Figure 9.19 illustrates pairs of CMOS inverters ganged together.
➢ The truth table is given in Table 9.1, showing that the pair compute the
NOR function.
➢ Such a circuit is sometimes called a symmetric 2 NOR, or more
generally, ganged CMOS.
➢ When one input is 0 and the other 1, the gate can be viewed as a
pseudo-nMOS circuit with appropriate ratio constraints.
➢ When both inputs are 0, both pMOS transistors turn on in parallel,
pulling the output high faster than they would in an ordinary
pseudo-nMOS gate.
➢ When both inputs are 1, both pMOS transistors turn OFF, saving static
power dissipation. As in pseudo-nMOS, the transistors are sized so the
pMOS are about 1/4 the strength of the nMOS and the pulldown
current matches that of a unit inverter.
➢ Hence, the symmetric NOR achieves both better performance and lower
power dissipation than a 2-input pseudo-nMOS NOR.
2.3. Source Follower Pull-up Logic (SFPL)
It is similar to a pseudo-nMOS gate except that the pull-up is controlled by the
inputs. In the figure below, N6-N9 and P1 form a pseudo-nMOS NOR function. The
gate of the pull-up is driven by a parallel source follower consisting of drive
transistors N1-N4 and load transistor Nload. When one input turns on, the
source follower pulls node x to approximately VDD/2. This partially turns off
P1, which allows smaller nMOS pulldowns N6-N9 to be used. SFPL is used
to construct wide NOR gates.
3. Cascode Voltage Switch Logic
➢ Cascode Voltage Switch Logic (CVSL) provides the benefits of ratioed
circuits without the static power consumption. It uses both true and
complementary input signals and computes both true and
complementary outputs using a pair of nMOS pulldown networks, as
shown in Figure 9.20(a).
➢ The pulldown network f implements the logic function as in a static
CMOS gate, while f’ uses inverted inputs feeding transistors arranged
in the conduction complement.
➢ For any given input pattern, one of the pulldown networks will be ON
and the other OFF.
➢ The pulldown network that is ON will pull that output low. This low
output turns ON the pMOS transistor to pull the opposite output high.
➢ When the opposite output rises, the other pMOS transistor turns OFF
so no static power dissipation occurs.
➢ Figure 9.20(b) shows a CVSL AND/NAND gate. Herein the pulldown
networks are complementary, with parallel transistors in one and series
in the other.
➢ Figure 9.20(c) shows a 4-input XOR gate. The pulldown networks share
A and A’ transistors to reduce the transistor count by two.
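➢ CVSL gates tend to be slower than static CMOS because one output must
first be pulled low before the cross-coupled pMOS transistor can pull the
other output high.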
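The behaviour of the AND/NAND gate can be modelled at the switch level. The following sketch is a logical model only (no timing or contention) and uses illustrative names: the f network is a series pair driven by A and B, and the complementary network is a parallel pair driven by the inverted inputs.

```python
from itertools import product

def cvsl_and_nand(a, b):
    """Switch-level model of a CVSL AND/NAND gate. Returns (y, y_bar, f_on, f_bar_on)."""
    f_on = a and b                   # series nMOS pair driven by A and B
    f_bar_on = (not a) or (not b)    # parallel nMOS pair driven by A' and B'
    # The ON network pulls its output low; the cross-coupled pMOS then
    # pulls the opposite output high.
    y_bar = not f_on
    y = not f_bar_on
    return y, y_bar, f_on, f_bar_on

results = {
    (a, b): cvsl_and_nand(a, b)
    for a, b in product([False, True], repeat=2)
}
for inputs, (y, y_bar, f_on, f_bar_on) in results.items():
    print(inputs, "Y =", y, "Y' =", y_bar)
```

For every input pattern exactly one network conducts, so the two outputs are always complementary once the gate has settled.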
4. Dynamic Circuits
The drawbacks of ratioed circuits include slow rising transitions,
contention on the falling transitions, static power dissipation, and a
nonzero VOL. Dynamic circuits overcome the drawbacks of the ratioed
circuits by using a clocked pullup transistor rather than a pMOS that is
always ON. Figure 9.21 compares (a) static CMOS, (b) pseudo-nMOS, and
(c) dynamic inverters. Dynamic circuit operation is divided into two modes,
as shown in Figure 9.22. During precharge, the clock φ is 0, so the clocked
pMOS is ON and initializes the output Y high. During evaluation, the clock
is 1 and the clocked pMOS turns OFF. The output may remain high or may
be discharged low through the pulldown network. Dynamic circuits are the
fastest commonly used circuit family because they have lower input
capacitance and no contention during switching. They also have zero static
power dissipation.
➢ In Figure 9.21(c), if the input A is 1 during precharge, contention will
take place because both the pMOS and nMOS transistors will be ON.
➢ When the input cannot be guaranteed to be 0 during precharge, an
extra clocked evaluation transistor can be added to the bottom of the
nMOS stack to avoid contention as shown in Figure 9.23.
➢ The extra transistor is sometimes called a foot. Figure 9.24 shows
generic footed and unfooted gates.
➢ Figure 9.25 estimates the falling logical effort of both footed and
unfooted dynamic gates. As usual, the pulldown transistors’ widths are
chosen to give unit resistance.
➢ Precharge occurs while the gate is idle and often may take place more
slowly. Therefore, the precharge transistor width is chosen for twice unit
resistance.
➢ This reduces the capacitive load on the clock and the parasitic
capacitance at the expense of greater rising delays.
➢ Footed gates have higher logical effort than their unfooted counterparts
but are still an improvement over static logic.
➢ The logical effort of footed gates is better than predicted because velocity
saturation means series nMOS transistors have less resistance than we
have estimated. Moreover, logical efforts are also slightly better than
predicted because there is no contention between nMOS and pMOS
transistors during the input transition.
➢ The size of the foot can be
increased relative to the other nMOS transistors to reduce logical effort
of the other inputs at the expense of greater clock loading.
➢ Like pseudo-nMOS gates, dynamic gates are particularly well suited to wide NOR
functions or multiplexers because the logical effort is independent of the
number of inputs.
➢ The parasitic delay does increase with the number of inputs because
there is more diffusion capacitance on the output node.
➢ Characterizing the logical effort and parasitic delay of dynamic gates is
tricky because the output tends to fall much faster than the input rises,
leading to potentially misleading dependence of propagation delay on
fanout.
➢ A fundamental difficulty with dynamic circuits is the monotonicity
requirement. While a dynamic gate is in evaluation, the inputs must be
monotonically rising. That is, the input can start LOW and remain LOW,
start LOW and rise HIGH, start HIGH and remain HIGH, but not start
HIGH and fall LOW.
➢ Figure 9.26 shows waveforms for a footed dynamic inverter in which the
input violates monotonicity.
➢ During precharge, the output is pulled HIGH. When the clock rises, the
input is HIGH so the output is discharged LOW through the pulldown
network, as you would want to have happen in an inverter.
➢ The input later falls LOW, turning off the pulldown network. However,
the precharge transistor is also OFF so the output floats, staying LOW
rather than rising as it would in a normal inverter.
➢ The output will remain low until the next precharge step. In summary,
the inputs must be monotonically rising for the dynamic gate to
compute the correct function.
➢ The output of a dynamic gate begins HIGH and monotonically falls LOW
during evaluation. This monotonically falling output X is not a suitable
input to a second dynamic gate expecting monotonically rising signals,
as shown in Figure 9.27. Dynamic gates sharing the same clock cannot
be directly connected.
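The monotonicity failure described above can be reproduced with a small behavioural model of a footed dynamic inverter. This is a sketch with idealized assumptions: time is split into discrete steps, the floating output holds its value perfectly (no leakage), and the waveforms are made up for illustration.

```python
def simulate_footed_dynamic_inverter(clock, a):
    """Return the output Y at each time step for the given clock and input waveforms."""
    y = 1
    trace = []
    for phi, a_t in zip(clock, a):
        if phi == 0:
            y = 1            # precharge: clocked pMOS ON, foot OFF
        elif a_t == 1:
            y = 0            # evaluate: pulldown path conducts
        # otherwise the output floats and keeps its previous value
        trace.append(y)
    return trace

clock = [0, 1, 1, 1, 0, 1]
a_bad = [1, 1, 0, 0, 0, 0]   # A falls during evaluation: violates monotonicity

dynamic_y = simulate_footed_dynamic_inverter(clock, a_bad)
static_y = [1 - a_t for a_t in a_bad]   # what an ordinary static inverter would output

print("dynamic:", dynamic_y)
print("static: ", static_y)
```

After A falls in the third step, the static inverter output returns to 1 but the dynamic output stays at 0 until the next precharge.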
4.1. Domino Logic
The monotonicity problem can be solved by placing a static CMOS inverter between dynamic
gates, as shown in Figure 9.28(a). This converts the monotonically falling output into a
monotonically rising signal suitable for the next gate, as shown in Figure 9.28(b). The
dynamic-static pair together is called a domino gate. A single clock can be used to precharge
and evaluate all the logic gates within the chain. The dynamic output is monotonically falling
during evaluation, so the static inverter output is monotonically rising. Therefore, the static
inverter is usually a HI-skew gate to favor this rising output. Precharge occurs in parallel, but
evaluation occurs sequentially. The symbols for the dynamic NAND, HI-skew inverter, and
domino AND are shown in Figure 9.28(c).
More complex inverting static CMOS gates such as NANDs or NORs can be used in place of
the inverter; the result is called compound domino. For example, Figure 9.29 shows an 8-input
domino multiplexer built from two 4-input dynamic multiplexers and a HI-skew NAND gate.
This is often faster than an 8-input dynamic mux and HI-skew inverter because the dynamic
stage has less diffusion capacitance and parasitic delay. Domino gates are inherently
noninverting, while some functions like XOR gates necessarily require inversion. Three
methods of addressing this problem include pushing inversions into static logic, delaying
clocks, and using dual-rail domino logic.
4.2. Dual-Rail Domino Logic
Dual-rail domino gates encode each signal with a pair of wires. The two wires of each input and output
pair are denoted with the suffixes _h and _l. Table 9.2 summarizes the encoding. The _h wire
is asserted to indicate that the output of the gate is “high” or 1. The _l wire is asserted to indicate
that the output of the gate is “low” or 0.
When the gate is precharged, neither _h nor _l is asserted. The pair of lines should never be
both asserted simultaneously during correct operation. Dual-rail domino gates accept both
true and complementary inputs and compute both true and complementary outputs, as shown
in Figure 9.30(a).
Figure 9.30(b) shows a dual-rail AND/NAND gate and Figure 9.30(c) shows a dual-rail
XOR/XNOR gate. The gates are shown with clocked evaluation transistors.
Dual-rail domino requires more area, wiring, and power. Dual-rail domino signals indicate
when the computation is done. Before computation completes, both rails are precharged.
When the computation completes, one rail will be asserted. A NAND gate can be used for
completion detection, as shown in Figure 9.31.
Coupling can be reduced in dual-rail signal busses by interdigitating the bits of the bus, as
shown in Figure 9.32. Each wire will never see more than one aggressor switching at a time
because only one of the two rails switches in each cycle.
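The encoding and completion detection can be illustrated with a logical model of a dual-rail AND gate. This is a sketch with illustrative names; signals are (_h, _l) tuples, and completion is modelled as an OR of the two asserted-high rails, which corresponds to a NAND of the two active-low dynamic nodes (an assumption about where the detector is attached).

```python
PRECHARGED = (0, 0)   # neither rail asserted

def encode(bit):
    """Dual-rail encoding: 1 -> (_h=1, _l=0), 0 -> (_h=0, _l=1)."""
    return (1, 0) if bit else (0, 1)

def dual_rail_and(a, b):
    """Dual-rail AND/NAND: the _h rail computes AND, the _l rail computes its complement."""
    a_h, a_l = a
    b_h, b_l = b
    return (a_h & b_h, a_l | b_l)

def done(signal):
    """Completion detection: true once either rail has been asserted."""
    h, l = signal
    return bool(h or l)

for a in (0, 1):
    for b in (0, 1):
        y = dual_rail_and(encode(a), encode(b))
        print(a, b, y, done(y))
print(dual_rail_and(PRECHARGED, PRECHARGED), done(PRECHARGED))
```

With valid inputs exactly one output rail is asserted, and with precharged inputs neither is, so the completion signal stays low until the result is ready.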
4.3. Keepers
If a dynamic node is precharged high and then left floating, the voltage on the dynamic node will
drift due to charge leakage. The time constants tend to be in the millisecond to nanosecond
range, depending on process and temperature. Also dynamic circuits have poor input noise
margins. If the input rises above Vt while the gate is in evaluation, the input transistors will turn
on weakly and can incorrectly discharge the output. Both leakage and noise margin problems
can be addressed by adding a keeper circuit.

Figure 9.33 shows a conventional keeper on a domino buffer. The keeper is a weak transistor that
holds, or staticizes, the output at the correct level when it would otherwise float. When the
dynamic node X is high, the output Y is low and the keeper is ON to prevent X from floating.
When X falls, the keeper initially opposes the transition so it must be much weaker than the
pulldown network. Eventually Y rises, turning the keeper OFF and avoiding static power
dissipation.

The keeper must be strong (i.e., wide) enough to compensate for any leakage current drawn
when the output is floating and the pulldown stack is OFF. Strong keepers also improve the
noise margin because when the inputs are slightly above Vt the keeper can supply enough
current to hold the output high.
Strong keepers also increase delay.
For small dynamic gates, the keeper must be weaker than a minimum-sized transistor. This is
achieved by increasing the keeper length, as shown in Figure 9.34(a).
Long keeper transistors increase the capacitive load on the output Y. This can be avoided by
splitting the keeper, as shown in Figure 9.34(b).
4.4. Multiple-Output Domino Logic (MODL)
It is often necessary to compute multiple functions where one is a subfunction of another or
shares a subfunction. Multiple-output domino logic (MODL) saves area by combining all of the
computations into a multiple-output gate.
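A popular application is in addition, where the carry-out ci of each bit of a 4-bit block must be
computed. Each bit position i in the block can either propagate the carry (pi) or generate a carry
(gi). The carry-out logic is

ci = gi + pi · ci-1, for i = 1 to 4

which expands to

c1 = g1 + p1 c0
c2 = g2 + p2 (g1 + p1 c0)
c3 = g3 + p3 (g2 + p2 (g1 + p1 c0))
c4 = g4 + p4 (g3 + p3 (g2 + p2 (g1 + p1 c0)))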

This can be implemented in four compound AOI gates, as shown in Figure below. Notice that
each output is a function of the less significant outputs.
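The dependence of each carry on the less significant ones can be written as a simple recurrence. The sketch below assumes the standard generate/propagate form c_i = g_i + p_i·c_(i-1), with g_i taken as the AND and p_i as the XOR of the operand bits (an assumption; the notes do not spell out these definitions), and checks it against ordinary integer addition.

```python
def carry_outs(g, p, c0):
    """Carry-out of each bit position, least significant first."""
    carries = []
    c = c0
    for g_i, p_i in zip(g, p):
        c = g_i | (p_i & c)      # generate, or propagate the incoming carry
        carries.append(c)
    return carries

def block_carries(a, b, c0, width=4):
    """Carries of a width-bit addition of integers a and b with carry-in c0."""
    g = [((a >> i) & (b >> i)) & 1 for i in range(width)]   # assumed g_i = A_i AND B_i
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]   # assumed p_i = A_i XOR B_i
    return carry_outs(g, p, c0)

print(block_carries(0b1011, 0b0110, 0))
```

The last entry is the carry out of the 4-bit block; each earlier entry is the carry into the next bit position.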

The more compact MODL design shown in Figure below is often called a Manchester carry chain.
Note that the intermediate outputs require secondary precharge transistors. Also note that care
must be taken for certain inputs to be mutually exclusive in order to avoid sneak paths. For
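example, in the adder we must define gi = Ai · Bi and pi = Ai ⊕ Bi (rather than pi = Ai + Bi), where Ai and Bi are the operand bits, so that gi and pi are never both 1.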
5. Pass-Transistor Circuits
In many circuit families inputs are applied only to the gate terminals of transistors, but in
pass-transistor circuits, inputs are also applied to the source/drain diffusion terminals. These
circuits build switches using either nMOS pass transistors or parallel pairs of nMOS and
pMOS transistors called transmission gates.
Substantial area, speed, and/or power improvements have been obtained using pass
transistors compared to static CMOS logic. Pass transistors are essential to the design of
efficient 6-transistor static RAM cells and are well suited to full adders and other circuits rich in XORs.
For the purpose of comparison, Figure 9.47 below shows a 2-input multiplexer constructed in
a wide variety of pass-transistor circuit families along with static CMOS, pseudo-nMOS,
CVSL, and single- and dual-rail domino.
Some of the circuit families are dual-rail, producing both true and complementary outputs, while
others are single-rail and may require an additional inversion if the other polarity of output is
needed. U XOR V can be computed with exactly the same logic using S = U, S' = U', A = V, B = V'. This
shows that static CMOS is particularly poorly suited to XOR because the complex gate and two
additional inverters are required; hence, pass-transistor circuits become attractive. In
comparison, static CMOS NAND and NOR gates are relatively efficient and benefit less from
pass transistors.
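The reuse of the multiplexer for XOR can be confirmed with a truth-table check. The sketch below assumes the select convention Y = S'·A + S·B (the figure is not reproduced here, so the polarity is an assumption) and connects the true and complemented V inputs to the two data terminals.

```python
def mux2(s, a, b):
    """2-input multiplexer, assumed convention: Y = S'*A + S*B."""
    return ((1 - s) & a) | (s & b)

def xor_from_mux(u, v):
    """XOR built from the mux: select with U, feed V and its complement as data."""
    return mux2(s=u, a=v, b=1 - v)

xor_table = {(u, v): xor_from_mux(u, v) for u in (0, 1) for v in (0, 1)}
print(xor_table)
```

The complemented data input is the reason a static CMOS implementation needs extra inverters, whereas dual-rail families already have both polarities available.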
5.1. CMOS with Transmission Gates
Structures such as tristates, latches, and multiplexers are often drawn as transmission gates in
conjunction with simple static CMOS logic. The transmission gate multiplexer using two
transmission gates is nonrestoring; i.e., the logic levels on the output are no better than those on
the input so a cascade of such circuits may accumulate noise. To buffer the output and restore
levels, a static CMOS output inverter can be added, as shown in Figure 9.47 (CMOSTG).
A single nMOS or pMOS pass transistor suffers from a threshold drop. If used alone, additional
circuitry may be needed to pull the output to the rail. Transmission gates solve this problem but
require two transistors in parallel. The resistance of a unit-sized transmission gate can be
estimated as R for the purpose of delay estimation. Current flows through the parallel
combination of the nMOS and pMOS transistors. One of the transistors is passing the value well
and the other is passing it poorly; for example, a logic 1 is passed well through the pMOS but
poorly through the nMOS.
Estimate the effective resistance of a unit transistor passing a value in its poor direction as twice
the usual value: 2R for nMOS and 4R for pMOS. Figure 9.48 shows the parallel combination of
resistances. When passing a 0, the resistance is R || 4R = (4/5)R. The effective resistance passing a 1
is 2R || 2R = R. Hence, a transmission gate made from unit transistors is approximately R in either
direction. Note that transmission gates are commonly built using equal-sized nMOS and pMOS
transistors. Boosting the size of the pMOS transistor only slightly improves the effective
resistance while significantly increasing the capacitance.
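The parallel combinations above, and the limited benefit of a wider pMOS, can be checked numerically. The sketch below normalizes R to 1 and uses the resistance estimates from the text (R and 2R for a unit nMOS in its good and poor directions, 2R and 4R for a unit pMOS); scaling resistance inversely with width is a first-order assumption.

```python
from fractions import Fraction

def parallel(r1, r2):
    """Resistance of two resistors in parallel."""
    return (r1 * r2) / (r1 + r2)

def transmission_gate_resistance(nmos_width=1, pmos_width=1):
    """Return (resistance passing 0, resistance passing 1) in units of R."""
    n, p = Fraction(nmos_width), Fraction(pmos_width)
    pass_0 = parallel(1 / n, 4 / p)   # nMOS passes 0 well, pMOS passes it poorly
    pass_1 = parallel(2 / n, 2 / p)   # nMOS passes 1 poorly, pMOS passes it well
    return pass_0, pass_1

print("unit nMOS, unit pMOS:   ", transmission_gate_resistance(1, 1))
print("unit nMOS, double pMOS: ", transmission_gate_resistance(1, 2))
```

Doubling the pMOS width doubles its capacitance but only lowers the resistance from about R to 2R/3 in this model, which supports the usual choice of equal-sized transistors.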

If multiple stages of CMOS with transmission gates logic are cascaded, they can be viewed as
alternating transmission gates and inverters. Figure 9.49(a) redraws the multiplexer to include
the inverters from the previous stage that drive the diffusion inputs but to exclude the output
inverter. Figure 9.49(b) shows this multiplexer drawn at the transistor level. Observe that this is
identical to the static CMOS multiplexer of Figure 9.47 except that the intermediate nodes in the
pullup and pulldown networks are shorted together as N1 and N2.
The shorting of the intermediate nodes has two effects on delay. The effective resistance
decreases somewhat because the output is pulled up or down through the parallel combination
of both pass transistors rather than through a single transistor. But, the effective capacitance
increases slightly because of the extra diffusion and wire capacitance required for this shorting.
This is apparent from layouts of the multiplexers; the transmission gate design in Figure 9.50(a)
requires contacted diffusion on N1 and N2 while the static CMOS gate in Figure 9.50(b) does not.
In most processes, the improved resistance dominates for gates with moderate fanouts, making
shorting generally faster at a small cost in power.

The logical effort of circuits involving transmission gates is computed by drawing stages that
begin at gate inputs rather than diffusion inputs, as in Figure 9.52 for a transmission gate
multiplexer. The effect of the shorting can be ignored, so the logical effort from either the A or B
terminals is 6/3, just as in a static CMOS multiplexer. Note that the parasitic delay of transmission
gate circuits with multiple series transmission gates increases rapidly because of the internal
diffusion capacitance, so it is seldom beneficial to use more than two transmission gates
in series without buffering.
5.2. Complementary Pass Transistor Logic (CPL)
CPL can be understood as an improvement on CVSL. CVSL is slow because one side of the gate
pulls down, and then the cross-coupled pMOS transistor pulls the other side up. The size of the
cross coupled device is an inherent compromise between a large transistor that opposes the
pulldown excessively and a small transistor that is slow pulling up. CPL resolves this problem
by making one half of the gate pull up while the other half pulls down.
Figure 9.53(a) shows the CPL multiplexer from Figure 9.47 rotated sideways. If a path consists of
a cascade of CPL gates, the inverters can be viewed equally well as being on the output of one
stage or the input of the next. Figure 9.53(b) redraws the mux to include the inverters from the
previous stage that drives the diffusion input, but to exclude the output inverters. Figure 9.53(c)
shows the mux drawn at the transistor level. Observe that this is identical to the CVSL gate from
Figure 9.47 except that the internal node of the stack can be pulled up through the weak pMOS
transistors in the inverters. When the gate switches, one side pulls down well through its nMOS
transistors. The other side pulls up. CPL can be constructed without cross-coupled pMOS
transistors, but the outputs would only rise to VDD – Vt (or slightly lower because the nMOS
transistors experience the body effect). This costs static power because the output inverter will
be turned slightly ON. Adding weak cross-coupled devices helps bring the rising output to
the supply rail while only slightly slowing the falling output. The output inverters can be
LO-skewed to reduce sensitivity to the slowly rising output.
