AVLIS Final Merge
Lec 1: Introduction
Lec 2: Sequential Logic
Lec 3: Dynamic Sequential
Lec 4A: Pipelined Registers
Lec 4B: Static Timing Analysis
Lec 5: Clock Generation and Distribution, Part 1
Lec 5: Clock Generation and Distribution, Part 2
Lec 6: Latch-based Clocking and Asynchronous Clocking
Lec 7A: Interfacing Circuits, Part 1
Lec 7B: Interfacing Circuits, Part 2
Lec 8: Interfacing Circuits, Part 3
Lec 9: Arithmetic Circuits, Part 1
Lec 10: Arithmetic Circuits, Part 2
Lec 11: Memory Circuits
Lec 12A: LPC, Part 1
Lec 12B: LPC, Part 2
Lec 12C: LPC, Part 3 (Adiabatic Logic)
Lec 13: Interconnects
Lec 14A: Deep Submicron MOSFET Operation
Lec 14B: CMOS Technology Scaling
BITS Pilani Presentation
BITS Pilani Dr. Sanjay Vidhyadharan
EEE WILP
Pilani Campus
➢ Reference Books:
1945
Advantages
➢ The primary advantage of RTL technology
was that it involved a minimum number of
transistors, which was an important
consideration before integrated circuit
technology
Disadvantages
➢ The obvious disadvantage of RTL is its high
current dissipation when the transistor
conducts to overdrive the output biasing
resistor.
➢ Passive Pull up.
➢ Limited Fanout
➢ No Rail-to-Rail Output
Advantages
➢ Diodes perform the logical function
Disadvantages
➢ The obvious disadvantage of DTL is its high
current dissipation when the transistor
conducts to overdrive the output biasing
resistor.
➢ Passive Pull up.
➢ Limited Fanout
➢ No Rail-to-Rail Output
Advantages
➢ A multi-emitter transistor performs the logical function
➢ Active Pull-up and Active Pull down
Disadvantages
➢ The obvious disadvantage of TTL is its high
current dissipation
➢ No Rail-to-Rail Output
Advantages
➢ Fast
➢ High Fan-out
Disadvantages
➢ High current dissipation
[Figure: Die size (mm) versus year, 1970-2010, log scale, for Intel processors from the 4004 to the P6: ~7% growth per year, ~2X growth in 10 years]
[Figure: log-scale trend versus year, 1970-2010, for the same processors (4004 to P6)]
Total Power
Propagation Delay: $t_{pd} = \frac{K_1 C_L}{I_D} = \frac{K_2 C_L}{(V_{DD} - V_{th})^2}$
Effect of Decreasing Vth on Power
➢ Use devices with Vth close to VDD/2 and steep switching characteristics
➢ Short Circuit loss reduces with Cload
Energy stored in CLoad: $E_{C_L} = \int_0^{V_{DD}} C_L V_C \, dV_C = \frac{1}{2} C_L V_{DD}^2$
Energy consumed from power supply: $E_{supply} = V_{DD} \int_0^T i(t)\, dt = V_{DD} \cdot Q_{C_L} = C_L V_{DD}^2$
Energy dissipated in pMOSFET during charging $= \frac{1}{2} C_L V_{DD}^2$
Energy dissipated in nMOSFET during discharging $= \frac{1}{2} C_L V_{DD}^2$
Power Consumption $= \text{Frequency} \cdot V_{DD}^2 \cdot C_L$
ELECTRICAL ELECTRONICS COMMUNICATION INSTRUMENTATION
High Dynamic Power Consumption
PDynamic = Freq × VDD² × CL
➢ VDD & CL : Reduced by 30 % each generation
➢ Frequency : 43 % Increase
▪ Architectural optimizations
▪ Consumer requirements
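The per-generation trend above can be checked numerically. The sketch below is a minimal illustration, assuming the scale factors quoted on the slide (VDD and CL each multiplied by 0.7, frequency by 1.43); the function names and the 1 GHz / 1 V / 1 pF operating point are hypothetical.

```python
def dynamic_power(freq_hz: float, vdd: float, c_load: float) -> float:
    """Dynamic switching power P = f * VDD^2 * CL, in watts."""
    return freq_hz * vdd ** 2 * c_load


def scaled_power_ratio(vdd_scale: float = 0.7,
                       cl_scale: float = 0.7,
                       freq_scale: float = 1.43) -> float:
    """Ratio of new to old dynamic power after one technology generation."""
    return freq_scale * vdd_scale ** 2 * cl_scale


# Hypothetical operating point: 1 GHz, 1 V, 1 pF.
p_example = dynamic_power(1e9, 1.0, 1e-12)
ratio = scaled_power_ratio()
print(f"P = {p_example:.3e} W, per-generation ratio = {ratio:.3f}")
```

Under these assumptions the power of a fixed circuit roughly halves per generation, so any growth in total chip power has to come from the added devices and the faster-than-scaled frequency increase.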
Propagation Delay: $t_{pd} = \frac{K_1 C_L}{I_D} = \frac{K_2 C_L}{(V_{DD} - V_{th})^2}$
[Figure: Sequential circuit. A combinational circuit with n inputs and m outputs produces the next state, which is held in storage elements and fed back as the present state.]
S  R  Q  Q’
0  0  Q  Q’  No change
0  1  0  1   Reset Q = 0
1  0  1  0   Set Q = 1
1  1  0  0   Forbidden
S’ R’ Q Q’
0 0 1 1 Forbidden
0 1 1 0 Set
1 0 0 1 Reset
1 1 Q Q’ No change
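Both truth tables can be reproduced by iterating the cross-coupled gate equations until they settle. This is a minimal behavioural sketch, assuming ideal zero-delay gates; the function names are hypothetical, and the first table is taken to be the NOR latch and the second the NAND latch with active-low inputs.

```python
def nor_sr_latch(s: int, r: int, q: int) -> tuple:
    """Settle a cross-coupled NOR latch from previous state q; returns (Q, Q')."""
    q_bar = int(not (s or q))
    for _ in range(4):  # a few passes are enough for the loop to settle
        q_new = int(not (r or q_bar))
        q_bar_new = int(not (s or q_new))
        if (q_new, q_bar_new) == (q, q_bar):
            break
        q, q_bar = q_new, q_bar_new
    return q, q_bar


def nand_sr_latch(s_n: int, r_n: int, q: int) -> tuple:
    """Settle a cross-coupled NAND latch with active-low S', R'; returns (Q, Q')."""
    q_bar = int(not (r_n and q))
    for _ in range(4):
        q_new = int(not (s_n and q_bar))
        q_bar_new = int(not (r_n and q_new))
        if (q_new, q_bar_new) == (q, q_bar):
            break
        q, q_bar = q_new, q_bar_new
    return q, q_bar


for s, r in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print("NOR ", s, r, nor_sr_latch(s, r, q=1))
    print("NAND", s, r, nand_sr_latch(s, r, q=1))
```

The forbidden rows show up as both outputs being equal, (0, 0) for the NOR latch and (1, 1) for the NAND latch.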
[Figure: input Xn delayed by one clock through a D flip-flop to give Xn-1]
Yn = Xn . X’n-1
The penalty for the reduced clock load is increased design complexity. The
transmission gate (T1) and its source driver must overpower the feedback
inverter (I2 ) to switch the state of the cross-coupled inverter.
Negative latch
➢ During normal mode of operation, the sleep devices are turned on.
➢ The shaded inverters and transmission gates are implemented in low-threshold devices.
➢Area Large
➢Complexity
➢Static Dissipation
➢Dynamic Dissipation
Short Circuit
Switching Loss
Principle of Operation:
➢ Temporary storage of charge on parasitic capacitors.
➢ A stored value can hence only be kept for a limited amount of
time, typically in the range of milliseconds.
➢ To preserve signal integrity, a periodic refresh of the stored value is necessary.
➢ Registers used in computational structures that are constantly clocked, such as a pipelined
datapath, do not need to hold state for extended periods.
➢ Only 8 transistors
➢ Only 6 transistors if NMOS Gates used
➢ Low Power
➢ Low Propagation Delay ( One Pass Transistor Delay + One Inverter Delay)
➢ Set-up Time : (One Pass Transistor Delay + One Inverter Delay)
➢ Hold Time : Nil
Limitations
During the 0-0 overlap, there is a direct path for data from D to Q (through the PMOS devices of T1 and T2).
During the 1-1 overlap, there is a direct path for data from D to Q (through the NMOS devices of T1 and T2).
Q slaved to D
If the D input changes during the overlap period, node X can make a transition, but cannot
propagate to the output.
If the D input changes during the overlap period, node X can make a transition, but cannot propagate to the
output. However, as soon as the overlap period is over, the PMOS M8 is turned on and the 0 propagates to
output. This effect is not desirable. The problem is fixed by imposing a hold time constraint on the input
data, D, or, in other words, the data D should be stable during the overlap period.
➢ In two-phase clocking schemes, care must be taken in routing the two clock signals to
ensure that overlap is minimized.
➢ A register can be constructed by cascading positive and negative latches.
➢ The main advantage is the use of a single clock phase.
➢ The disadvantage is the slight increase in the number of transistors -12 transistors are
required.
In a 0.25 µm process, the set-up time of such a circuit using minimum-size devices is 140 psec. A
conventional approach, composed of an AND gate followed by a positive latch, has an effective set-up
time of 600 psec (we treat the AND plus latch as a black box that performs both functions). The
embedded logic approach hence results in significant performance improvements.
➢ The TSPC latch circuits can be further reduced in complexity, where only the first
inverter is controlled by the clock.
➢ Not all node voltages in the latch experience the full logic swing: the voltage at node A
(for Vin = 0 V) for the positive latch maximally equals VDD - VTn, which results in a
reduced drive for the output NMOS transistor and a loss in performance.
When CLK = 0 :
The input inverter is sampling the inverted D input on node X.
The second (dynamic) inverter is in the precharge mode, Y to VDD.
The third inverter is in the hold mode, since M8 and M9 are off.
On the rising edge of the clock :
The dynamic inverter M4-M6 evaluates. If X is high on the rising edge, Y discharges.
The third inverter M7-M8 is on during the high phase, and the node value on Y is
passed to the output Q.
Hold Time: On the positive phase of the clock, note that node X transitions to a low if the D
input transitions to a high level. Therefore, the input must be kept stable till the value on node
X before the rising edge of the clock propagates to Y. This represents the hold time of the
register (note that the hold time is less than one inverter delay, since it takes one inverter delay for the input to
affect node X).
Set-up Time: The set-up time is the time for node X to be valid, which is one inverter delay.
Propagation Delay : The propagation delay of the register is essentially three inverters since
the value on node X must propagate to the output Q.
Sense amplifier circuits are used extensively in memory cores and in low swing bus drivers
to amplify small voltage swings present in heavily loaded wires.
Power 4
Throughput @ (3 * _ ( ) _ ( ))
Gate complexity 4*
A non-overlapping clock is essential for correct operation; otherwise a race condition occurs.
It combines C2MOS pipeline registers and NORA dynamic logic functional blocks.
NORA offers designers a wide range of design choices. Dynamic and static logic can be mixed freely. A NORA datapath
consists of a chain of alternating CLK and CLK’ modules. While one class of modules is precharging with its output latch
in hold mode, preserving the previous output value, the other class is evaluating. Data is passed in a pipelined fashion
from module to module.
Clock Jitter: External sources such as noise and voltage variations can disrupt the natural
periodicity or frequency of the clock. This deviation of a clock edge from its ideal position
in time is termed clock jitter.
There are two main problems that can arise in synchronous logic:
Max Delay: The data doesn’t have enough time to pass from one register
to the next before the next clock edge.
Min Delay: The data path is so short that it passes through several
registers during the same clock cycle.
Max delay violations are a result of a slow data path, including the
register's tsu; therefore it is often called the “Setup” path.
Min delay violations are a result of a short data path, causing the data to
change before thold has passed; therefore it is often called the “Hold”
path.
After the clock rises, it takes tcq for the data to propagate to point A.
Then the data goes through the delay of the logic to get to point B.
The data has to arrive at point B, tsu before the next clock.
T = 15 ns
Setup slack = Min. Clock Path Delay - Max. Data Arrival Time
= (15 ns + 2 ns + 5 ns + 2 ns - 4 ns) - (2 ns + 11 ns + 2 ns + 9 ns + 2 ns)
= 20 ns - 26 ns = -6 ns : Setup Time Violation.
Hold time slack = Min. Data Arrival Time - Max. Clock Path Delay
= (1 ns + 9 ns + 1 ns + 6 ns + 1 ns) - (3 ns + 9 ns + 3 ns + 2 ns)
= 18 ns - 17 ns = +1 ns : No Hold Time Violation.
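The slack arithmetic in this example can be scripted as follows. This is a minimal sketch: the delay lists are copied from the slide in the order they appear, but the circuit figure is not preserved, so what each individual term represents (buffer, wire, register or set-up delay) is an assumption left unnamed; the function names are hypothetical.

```python
def setup_slack(min_clock_path_terms, max_data_arrival_terms):
    """Setup slack = min. clock path delay - max. data arrival time (ns)."""
    return sum(min_clock_path_terms) - sum(max_data_arrival_terms)


def hold_slack(min_data_arrival_terms, max_clock_path_terms):
    """Hold slack = min. data arrival time - max. clock path delay (ns)."""
    return sum(min_data_arrival_terms) - sum(max_clock_path_terms)


# Terms as listed on the slide; the first setup term is the clock period T = 15 ns.
setup = setup_slack([15, 2, 5, 2, -4], [2, 11, 2, 9, 2])
hold = hold_slack([1, 9, 1, 6, 1], [3, 9, 3, 2])

for name, slack in (("setup", setup), ("hold", hold)):
    status = "violation" if slack < 0 else "met"
    print(f"{name} slack = {slack:+d} ns ({status})")
```

A negative slack flags a violation; here only the setup check fails, which could be fixed by lengthening the clock period by at least 6 ns or by shortening the data path.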
PLL
➢ Any deviation in a positive or negative direction from the perfect in-phase condition
produces the same change in duty factor resulting in the same average voltage.
➢ If the local clock is a multiple of the reference clock frequency, the output of the phase
detector will still be a square wave of 50% duty cycle, albeit at a different frequency
Charge Pump
Advantages
Low cost wiring
Low Capacitance
Disadvantages
Difficult to balance path delays due to
asymmetric FF distribution
Sensitive to variations
Wire usage = 3% of
metals 3 & 4
4 major clock quadrants, each with a large driver connected to local grid
structures
Grid/Mesh
• Gridded clock distribution was common on
earlier DEC Alpha microprocessors
• Advantages:
– Skew determined by grid density and not
overly sensitive to load position
– Clock signals are available everywhere
– Tolerant to process variations
– Usually yields extremely low skew values
Grid/Mesh
• Huge amounts of wiring & power
– Wire cap large
– Strong drivers needed – pre-driver cap large
– Routing area large
• To minimize all these penalties, make grid pitch coarser
– Skew gets worse
– Losing the main advantage
• Don’t overdesign
– let the skew be as large as tolerable
• Still, grids seem infeasible for SoCs
Mesh
-- excellent for low skew, jitter
-- high power, area, capacitance
-- difficult to analyze
-- used in modern processors
Tree
-- low cost (wiring, power, cap)
-- higher skew, jitter than mesh
-- widely used in ASIC designs
Best architecture depends on the application
[Figure: clock source driving a tree with crosslinks to the flip-flops]
[Figure: power (mW) bar chart comparing conventional and modified designs for circuit blocks: shift register, scaler, ALU, adder, FF, MUX, equal-to]
▪ A stable input is available to combinational logic block A (CLB_A) on the falling edge of CLK1
(at edge 2), and it has a maximum time equal to TCLK/2 to evaluate (that is, the entire low
phase of CLK1). On the falling edge of CLK2 (at edge 3), the output of CLB_A is latched and the
computation of CLB_B is launched. CLB_B computes on the low phase of CLK2 and the output is
available on the falling edge of CLK1 (at edge 4).
➢ It is possible for a logic block to utilize time that is left over from the previous logic block;
this is referred to as slack borrowing.
➢ Clock period is chosen to be larger than the worst-case delay of each pipeline stage.
➢ Hence, The throughput rate of the pipelined system is directly linked to the worst-case
delay of the slowest element in the pipeline
➢ As all the clocks in a circuit transition at the same time, significant current flows over a
very short period of time (due to the large capacitive load). This causes significant noise
problems due to package inductance and power supply grid resistance.
This avoids all problems and overheads associated with distributing high-speed clocks.
A self-timed circuit proceeds at the average speed of the hardware in contrast to the worst-case model
of synchronous logic.
The automatic shut-down of blocks that are not in use can result in power savings. Additionally, the
power consumption overhead of generating and distributing high-speed clocks can be partially avoided.
Self-timed circuits are by nature robust to variations in manufacturing and operating conditions such as
temperature.
Dual-Rail Coding
While dual-rail coding allows tracking of the signal statistics, it comes at the cost of power
dissipation: every single gate must transition for every new input vector, regardless of the
value of the data vector.
For 8-bit adder ≈ 24 gate delays; for 16-bit adder ≈ 48 gate delays
For 4-bit adder ≈ 6 gate delays; for 8-bit adder ≈ 12 gate delays; for 16-bit adder ≈ 18 gate delays
The advantage of this approach is that the logic can be implemented using a standard non-
redundant circuit style such as complementary CMOS.
Also, if multiple logic units are computing in parallel, it is possible to amortize the
overhead of the delay line over multiple blocks
The four events, data change, request, data acceptance, and acknowledge, proceed in a cyclic
order. This protocol is called two-phase.
➢ The Req event terminates the active cycle of the sender. The sender is free to change the data
during its active cycle.
➢ The receiver’s cycle is completed by the Ack event. The receiver can only accept data during
its active cycle.
PT Switch
With Vin less than the threshold voltage of M1, VX remains at VDD - VTHN3
2.0V
0.8V
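For $V_{DD} = 5\ \text{V}$, $V_M = \frac{0.8 + 2}{2} = 1.4\ \text{V}$, $V_{Tn} = 1\ \text{V}$ and $V_{Tp} = -1\ \text{V}$:
$r = 6.5$
$r = \sqrt{\frac{\mu_n C_{ox} W_n/L_n}{\mu_p C_{ox} W_p/L_p}}$
$\frac{W_n/L_n}{W_p/L_p} = \frac{1}{3}\,(6.5)^2 = \frac{169}{12}$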
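The numbers in this sizing example can be reproduced with a short calculation. The sketch assumes the long-channel square-law model with both transistors saturated at the switching threshold VM, so that r = (VDD - VM - |VTp|) / (VM - VTn), and a mobility ratio µn/µp = 3 as implied by the factor 1/3 on the slide; the function names are hypothetical.

```python
def drive_ratio(vdd: float, vm: float, vtn: float, vtp: float) -> float:
    """r = sqrt(kn/kp) that places the inverter switching threshold at vm.

    Assumes square-law devices, both in saturation at Vin = Vout = vm.
    vtp is negative for a pMOS device.
    """
    return (vdd - vm + vtp) / (vm - vtn)


def size_ratio(r: float, mobility_ratio: float = 3.0) -> float:
    """(Wn/Ln) / (Wp/Lp) for a given r, assuming mobility_ratio = mu_n / mu_p."""
    return r ** 2 / mobility_ratio


vm = (0.8 + 2.0) / 2          # midpoint of the 0.8 V and 2.0 V input levels
r = drive_ratio(vdd=5.0, vm=vm, vtn=1.0, vtp=-1.0)
ratio = size_ratio(r)
print(f"VM = {vm:.2f} V, r = {r:.2f}, (Wn/Ln)/(Wp/Lp) = {ratio:.3f}")
```

The result, about 14.08, matches 169/12, which shows how strongly the nMOS must dominate to pull the threshold down to 1.4 V from a 5 V supply.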
VDDL = 1.2 V
VDDL = 0.9 V VDDL = 0.7 V
VDDH = 1 V ( 90 nm)
VDDL = 0.18 V
VHVT = 0.535 V
VSVT = 0.360 V
VLVT = 0.230 V
1-3 kΩ
$\frac{I_{max}}{2} \cdot \frac{t_s}{2} = C_{load} \cdot \frac{V_{DD}}{2}$
for a bonding wire with L = 2 nH,
The role of two nMOS transistors controlled by the strobe signal (ST) is to pre-charge
the gate potentials of the last-stage driver transistors at an approximate midpoint
between the initial and final potentials of the load capacitor.
Package functions
– Electrical connection of signals and power from chip to board
– Little delay or distortion
– Mechanical connection of chip to board
– Removes heat produced on chip
– Protects chip from mechanical damage
– Compatible with thermal expansion
– Inexpensive to manufacture and test
[Figure: Processor block diagram. Control: state sequencer, control store, instruction decoders. Datapath: address-out buffer, PC, registers R0 to Rn, shifter, ALU and data in/out register, connected by internal A and B buses.]
S = A ⊕ B ⊕ Cin
Cout = AB + BCin + ACin
= AB + Cin (A+B)
= AB + Cin (A ⊕ B+ AB)
= AB + Cin (A ⊕ B) + Cin AB = AB(1+ Cin) + Cin (A ⊕ B)
= AB + Cin (A ⊕ B)
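The simplification above can be confirmed by exhaustive enumeration of the eight input combinations. This is a minimal check; the function name is hypothetical.

```python
from itertools import product


def full_adder(a: int, b: int, cin: int) -> tuple:
    """Return (sum, carry in majority form, carry in XOR form)."""
    s = a ^ b ^ cin
    cout_majority = (a & b) | (b & cin) | (a & cin)   # AB + BCin + ACin
    cout_xor_form = (a & b) | (cin & (a ^ b))         # AB + Cin(A xor B)
    return s, cout_majority, cout_xor_form


results = {}
for a, b, cin in product((0, 1), repeat=3):
    s, c_maj, c_xor = full_adder(a, b, cin)
    results[(a, b, cin)] = (s, c_maj, c_xor)
    print(a, b, cin, "->", s, c_maj, c_xor)
```

The XOR form is attractive in hardware because A ⊕ B is already needed for the sum output.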
Carry = AB + BC + CA = AB + C(A + B)
1-bit CLA
[Figure: CLA cells with inputs A, B feeding a CLLB block]
C1 = G0 + P0C0
C2 = G1 + P1C1
   = G1 + P1(G0 + P0C0)
   = G1 + P1G0 + P1P0C0
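The carry expansion C1 = G0 + P0C0, C2 = G1 + P1G0 + P1P0C0 generalizes to any bit position. The sketch below assumes the usual definitions Gi = AiBi and Pi = Ai ⊕ Bi, which the surviving slide text does not state; the function names are hypothetical.

```python
def to_bits(x: int, n: int) -> list:
    """Little-endian bit list of x, n bits wide."""
    return [(x >> i) & 1 for i in range(n)]


def cla_carries(a_bits: list, b_bits: list, c0: int) -> list:
    """Return [C1, ..., Cn] computed from flattened look-ahead expressions."""
    g = [a & b for a, b in zip(a_bits, b_bits)]   # generate (assumed Gi = Ai.Bi)
    p = [a ^ b for a, b in zip(a_bits, b_bits)]   # propagate (assumed Pi = Ai xor Bi)
    carries = []
    for i in range(len(g)):
        # Term P_i ... P_0 . C_0
        c = c0
        for k in range(i + 1):
            c &= p[k]
        # Terms P_i ... P_(j+1) . G_j for j = 0..i
        for j in range(i + 1):
            term = g[j]
            for k in range(j + 1, i + 1):
                term &= p[k]
            c |= term
        carries.append(c)
    return carries


print(cla_carries(to_bits(0b1011, 4), to_bits(0b0110, 4), c0=1))
```

Every carry depends only on the inputs and C0, not on the previous carry, which is what removes the ripple delay at the cost of wider gates.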
$N = M + (M+1) + (M+2) + (M+3) + \cdots + (M+P-1)$
$M \ll N$, e.g. $M = 2$ and $N = 64$
$N \approx \frac{P^2}{2}$
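The stage-count relation can be evaluated for the example values M = 2 and N = 64. The sketch assumes an adder partitioned into P stages, each one bit wider than the previous (as in a square-root carry-select adder; the slide title is not preserved, so that name is an assumption). Function names are hypothetical.

```python
import math


def bits_covered(m: int, p: int) -> int:
    """Total bits in p stages of widths m, m+1, ..., m+p-1."""
    return sum(m + i for i in range(p))


def stages_needed(n: int, m: int) -> int:
    """Smallest p whose stages cover at least n bits."""
    p = 0
    while bits_covered(m, p) < n:
        p += 1
    return p


p_exact = stages_needed(64, 2)
p_approx = math.sqrt(2 * 64)   # from N ~ P^2 / 2
print(f"exact P = {p_exact}, approximation sqrt(2N) = {p_approx:.2f}")
```

The approximation slightly overestimates P because it drops the M·P term, but it shows that the number of stages, and hence the delay, grows as the square root of N.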
Overflow
4 + 8
    4   0100
  + 8   1000
        1100   Is this the expected result?
The Array Multiplier
M-bit × N-bit → P = (N + M − 1)
[Figure: 4 × 4 array multiplier. Partial products AiBj are summed by rows of half adders (HA) and full adders (FA) to produce P7 to P0]
Requires (N − 1) M-bit adders
Requires M × N AND gates
[Figure: multiplier with rows of HA/FA cells and a final vector-merging adder producing P7 to P0]
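The resource counts quoted above (M × N AND gates and N − 1 adder rows) can be seen in a behavioural model. This is a minimal sketch that counts operations rather than modelling the HA/FA cells individually; the function name is hypothetical.

```python
def array_multiply(a: int, b: int, m: int, n: int) -> tuple:
    """Multiply an m-bit a by an n-bit b.

    Returns (product, number of AND gates used, number of adder rows used).
    """
    and_gates = 0
    rows = []
    for j in range(n):                     # one partial-product row per bit of b
        row = 0
        for i in range(m):
            bit = ((a >> i) & 1) & ((b >> j) & 1)   # one AND gate per AiBj
            and_gates += 1
            row |= bit << i
        rows.append(row << j)              # row j is weighted by 2^j
    product = rows[0]
    adder_rows = 0
    for row in rows[1:]:                   # n - 1 additions of m-bit rows
        product += row
        adder_rows += 1
    return product, and_gates, adder_rows


product, and_gates, adder_rows = array_multiply(13, 11, m=4, n=4)
print(product, and_gates, adder_rows)
```

For the 4 × 4 case this gives 16 AND gates and 3 adder rows, matching the three rows of HA/FA cells in the array.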
G0= A0B0’
L0= A0’B0
[Figure: 1-bit comparator with inputs A0, B0 and outputs L0, E0, G0]
E (A = B)
2-bit comparator, G output:
A > B when A1 > B1, OR when A1 = B1 AND A0 > B0
G = G1 + E1 . G0
2-bit comparator, L output:
A < B when A1 < B1, OR when A1 = B1 AND A0 < B0
L = L1 + E1 . L0
2-Bit Comparator
G = G1 + E1. G0 L = L1 + E1. L0 E = E1. E0
3-Bit Comparator
G = G2 + E2. G1 + E2. E1. G0
E = E2 .E1. E0
L = L2 + E2. L1 + E2. E1. L0
4-Bit Comparator
E = E3 . E2 . E1 . E0
G = G3 + E3.G2 + E3.E2.G1 + E3.E2.E1.G0
L = L3 + E3.L2 + E3.E2.L1 + E3.E2.E1.L0
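The comparator equations follow one pattern for any width: bit i decides the result only if all more significant bits are equal. The sketch below implements that pattern, assuming Gi = AiBi’ and Li = Ai’Bi as given for bit 0 and Ei = (Ai XNOR Bi) for the equality output; the function name is hypothetical.

```python
def compare(a: int, b: int, n: int) -> tuple:
    """Return (G, E, L) for n-bit unsigned a and b using bitwise G/E/L terms."""
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> i) & 1 for i in range(n)]
    g_bits = [x & (1 - y) for x, y in zip(a_bits, b_bits)]   # Gi = Ai.Bi'
    l_bits = [(1 - x) & y for x, y in zip(a_bits, b_bits)]   # Li = Ai'.Bi
    e_bits = [1 - (x ^ y) for x, y in zip(a_bits, b_bits)]   # Ei = Ai xnor Bi (assumed)

    g = l = 0
    higher_equal = 1                     # product of Ej for all j > i
    for i in reversed(range(n)):         # scan from the MSB down
        g |= higher_equal & g_bits[i]
        l |= higher_equal & l_bits[i]
        higher_equal &= e_bits[i]
    return g, higher_equal, l            # E is the product of all Ei


print(compare(0b1011, 0b1001, 4))
```

Exactly one of G, E and L is 1 for any input pair, which is a convenient sanity check when wiring the gate-level version.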
• Logical Shift:
– Shifts number left or right and fills with 0’s
• 1011 LSR1 = 0101, 1011 LSL1 = 0110
• Arithmetic Shift:
– Shifts number left or right. Right shift sign extends
• 1011 ASR1 = 1101, 1011 ASL1 = 0110
• Rotate:
– Shifts number left or right and fills with lost bits
• 1011 ROR1 = 1101, 1011 ROL1 = 0111
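The six one-bit shift operations can be written as bit manipulations on a fixed-width word. This is a minimal sketch for single-position shifts, with the word width as a parameter; the function names mirror the mnemonics on the slide.

```python
def lsr(x: int, n: int = 4) -> int:
    """Logical shift right by 1: MSB filled with 0."""
    return x >> 1


def lsl(x: int, n: int = 4) -> int:
    """Logical shift left by 1: LSB filled with 0, MSB discarded."""
    return (x << 1) & ((1 << n) - 1)


def asr(x: int, n: int = 4) -> int:
    """Arithmetic shift right by 1: the sign bit (MSB) is replicated."""
    return (x >> 1) | (x & (1 << (n - 1)))


def asl(x: int, n: int = 4) -> int:
    """Arithmetic shift left by 1: same bit pattern as a logical shift left."""
    return lsl(x, n)


def ror(x: int, n: int = 4) -> int:
    """Rotate right by 1: the lost LSB re-enters at the MSB."""
    return (x >> 1) | ((x & 1) << (n - 1))


def rol(x: int, n: int = 4) -> int:
    """Rotate left by 1: the lost MSB re-enters at the LSB."""
    return ((x << 1) & ((1 << n) - 1)) | (x >> (n - 1))


for op in (lsr, lsl, asr, asl, ror, rol):
    print(f"1011 {op.__name__.upper()}1 = {op(0b1011):04b}")
```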
Advantages:
1. Shorter wires within blocks
2. Block address activates only 1 block => power savings
[Figure: SRAM cell waveforms, voltage (0.0 to 1.0) versus time (0 to 700 ps), showing the bit and word signals; example with A = 0, A_b = 1]
SRAM Sizing
High bitlines must not overpower inverters during reads
But low bitlines must write new value into cell
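[Figure: 6T SRAM cell with word line, bit lines bit and bit_b, and internal nodes A and A_b; relative strengths: weak pull-up transistors, medium access transistors, strong pull-down transistors]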
➢ 1T DRAM requires a sense amplifier for each bit line, due to charge redistribution
read-out
➢ The read-out of the 1T DRAM cell is destructive; read and refresh operations are
necessary for correct operation.
➢ When writing a “1” into a DRAM cell, a threshold voltage is lost. This charge loss
can be circumvented by bootstrapping the word lines to a higher value than VDD
X : means connection
4. EEPROM
X(A, B, C, D) = ∑ (7,8,9,10,11,12,13,14,15)
Y(A, B, C, D) = ∑ (1,2,8,12,13)
X(A, B, C, D) = A + BCD
Y(A, B, C, D) = ABC’ + A’B’CD’+ AC’D’ + A’B’C’D
= W + AC’D’ + A’B’C’D
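The minimized expressions can be checked against the minterm lists by enumerating all 16 input combinations, with A as the most significant bit. This is a minimal verification sketch; the helper names are hypothetical.

```python
from itertools import product


def minterms(f) -> list:
    """Indices (A = MSB, D = LSB) for which the 4-input function f is 1."""
    return [8 * a + 4 * b + 2 * c + d
            for a, b, c, d in product((0, 1), repeat=4)
            if f(a, b, c, d)]


def x_func(a, b, c, d):
    """X = A + BCD"""
    return a | (b & c & d)


def y_func(a, b, c, d):
    """Y = ABC' + A'B'CD' + AC'D' + A'B'C'D"""
    na, nb, nc, nd = 1 - a, 1 - b, 1 - c, 1 - d
    return (a & b & nc) | (na & nb & c & nd) | (a & nc & nd) | (na & nb & nc & d)


print("X:", minterms(x_func))
print("Y:", minterms(y_func))
```

Each product term corresponds to one row of the AND plane, and each output to one column of the OR plane, so the same check carries over directly to a PLA programming table.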
Example
X(A, B, C, D) = A + BCD
NOR flash is faster to read than NAND flash, but it's also more expensive.
NAND has a higher memory capacity than NOR.
NOR flash memories are very fast to program and read. Erasure through tunneling is
much slower. However, this kind of array suffers from low density for the same
reason that limits NOR ROM density: the need for multiple ground lines.
➢ An input signal to a gate is called critical if it is the last signal of all inputs to assume a
stable value.
➢ The path through the logic which determines the ultimate speed of the structure is called
the critical path.
➢ Putting the critical-path transistors closer to the output of the gate can result in a speed-up.
Gray Counter
Adiabatic Logic
CMOS Symmetric Pass Gate Adiabatic Logic
Adiabatic Logic vs. Static CMOS
Sanjay Vidhyadharan et al., “An advanced adiabatic logic using Gate Overlap Tunnel FET
(GOTFET) devices for ultra-low power VLSI sensor applications”
Wire Geometry
Layer Stack
Choice of Metals
Metal Layers 45 nm Technology
Wire Resistance
Sheet Resistance
Contacts Resistance
Wire Capacitance
Capacitance Trends
Polysilicon
Lumped Element Models
L Model
The delay of a wire is a quadratic function of its length! This means that doubling
the length of the wire quadruples its delay
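The quadratic dependence follows because both the total resistance and the total capacitance grow linearly with length. The sketch below uses the common distributed-RC estimate t ≈ r·c·L²/2, which is an assumption rather than the slide's own formula; the per-unit-length values are hypothetical.

```python
def wire_delay(r_per_um: float, c_per_um: float, length_um: float) -> float:
    """Approximate delay of a distributed RC wire: 0.5 * (r*L) * (c*L), in seconds."""
    return 0.5 * (r_per_um * length_um) * (c_per_um * length_um)


R_PER_UM = 0.1       # ohm/um, hypothetical
C_PER_UM = 0.2e-15   # F/um, hypothetical

d_1mm = wire_delay(R_PER_UM, C_PER_UM, 1000.0)
d_2mm = wire_delay(R_PER_UM, C_PER_UM, 2000.0)
print(f"1 mm: {d_1mm * 1e12:.1f} ps, 2 mm: {d_2mm * 1e12:.1f} ps, "
      f"ratio = {d_2mm / d_1mm:.1f}")
```

This quadratic growth is the motivation for repeaters: splitting a long wire into k driven segments makes the wire contribution grow roughly linearly with total length.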
Repeaters
Crosstalk
Crosstalk effects
Noise on nonswitching wires
Increased delay on switching wires
Crosstalk Delay
Driven Victims
If the noise is less than the noise margin, nothing happens.
Static CMOS logic will eventually settle to the correct output even if disturbed by large noise spikes.
But glitches cause extra delay and extra power from false transitions.
Memories and other sensitive circuits can also produce the wrong answer.
Cut-off Region
The body effect occurs in a MOSFET when the source is not tied to the substrate
(which is always connected to the most negative power supply in the integrated circuit
for n-channel devices and to the most positive for p-channel devices). The substrate
then acts as a “second gate” or a back-gate for the MOSFET
$C_{ox} W L = \frac{Q}{V_{ov}}$
Charge per unit length $= \frac{Q}{L} = C_{ox} W V_{ov}$
Electric field in channel $E = \frac{V_{DS}}{L}$
Velocity of charge in channel $v = \mu_n E = \frac{\mu_n V_{DS}}{L}$
Current in channel $I_D = v \cdot \frac{Q}{L} = \frac{\mu_n V_{DS}}{L} \cdot C_{ox} W V_{ov}$
$I_D = \frac{k_n' W (V_{GS} - V_T)^2}{2L}$
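The saturation-current expression is easy to evaluate numerically. This is a minimal sketch of the square-law model only, returning zero in cut-off; the process and bias values in the example are hypothetical.

```python
def drain_current_sat(kn_prime: float, w: float, l: float,
                      vgs: float, vt: float) -> float:
    """Square-law saturation current ID = kn' * W * (VGS - VT)^2 / (2L), in amperes."""
    vov = vgs - vt               # overdrive voltage
    if vov <= 0:
        return 0.0               # cut-off region: no channel
    return kn_prime * w * vov ** 2 / (2 * l)


# Hypothetical device: kn' = 100 uA/V^2, W/L = 10, VT = 0.5 V.
i_on = drain_current_sat(100e-6, w=10.0, l=1.0, vgs=1.5, vt=0.5)
i_off = drain_current_sat(100e-6, w=10.0, l=1.0, vgs=0.3, vt=0.5)
print(f"ID(VGS=1.5 V) = {i_on * 1e6:.0f} uA, ID(VGS=0.3 V) = {i_off * 1e6:.0f} uA")
```

The model ignores channel-length modulation and the short-channel effects listed later in the lecture, so it is a first-order estimate only.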
➢ VT Roll Off
➢ Drain-induced barrier lowering (DIBL)
When the VDS is increased beyond VOV , the pinch-off point is moved slightly away from the drain,
toward the source. The additional voltage applied to the drain appears as a voltage drop across the narrow
depletion region between the end of the channel and the drain region. This voltage accelerates the
electrons that reach the drain end of the channel and sweeps them across the depletion region into the
drain.
$V_A$: Early voltage
$\lambda = \frac{1}{V_A}$, $\lambda \propto 1/L$
CMOS technology, Trieste, 8-10 November 1999
Technology Scaling
• The scaling variables are:
– Supply voltage: Vdd → Vdd / α
– Gate length: L → L / α
– Gate width: W → W / α
– Gate-oxide thickness: tox → tox / α
– Junction depth: Xj → Xj / α
– Substrate doping: NA → NA × α
This is called constant field scaling because the electric field across
the gate-oxide does not change when the technology is scaled
If the power supply voltage is maintained constant the scaling is called
constant voltage. In this case, the electric field across the gate-oxide
increases as the technology is scaled down.
Due to gate-oxide breakdown, below 0.8µm only “constant field”
scaling is used.
• Device/die area:
W L → (1/α)² = 0.49
– In practice, microprocessor die size grows about 25% per technology
generation! This is a result of added functionality.
• Transistor density:
(unit area) / (W L) → α² = 2.04
– In practice, memory density has been scaling as expected.
(not true for microprocessors…)
• Gate capacitance:
W L / tox → 1/α = 0.7
• Drain current:
(W/L) (V²/tox) → 1/α = 0.7
• Gate delay:
(C V) / I → 1/α = 0.7
Frequency → α = 1.43
– In practice, microprocessor frequency has doubled every
technology generation (2 to 3 years)! This faster increase rate is
due to two factors:
• the number of gate delays in a clock cycle decreases with time (the
designs become highly pipelined)
• advanced circuit techniques reduce the average gate delay beyond
30% per generation.
• Power:
C V² f → (1/α)² = 0.49
• Power density:
(1/tox) V² f → 1
• Active capacitance/unit-area:
Power dissipation is a function of the operation frequency, the power
supply voltage and the circuit size (number of devices).
If we normalize the power density to V² f we obtain the active
capacitance per unit area for a given circuit. This parameter can be
compared with the oxide capacitance per unit area:
1/tox → α = 1.43
– In practice, for microprocessors, the active capacitance/unit-area
only increases between 30% and 35%. Thus, the twofold
improvement in logic density between technologies is not
achieved.
• Interconnects scaling:
– Higher densities are only possible if the interconnects
also scale.
– Reduced width → increased resistance
– Denser interconnects → higher capacitance
– To account for increased parasitics and integration
complexity more interconnection layers are added:
• thinner and tighter layers → local interconnections
• thicker and sparser layers → global interconnections and
power
Interconnects are scaling as expected
Quantity                          Sensitivity     Constant Field   Constant Voltage
Scaling Parameters
Length                            L               1/S              1/S
Width                             W               1/S              1/S
Gate Oxide Thickness              tox             1/S              1/S
Supply Voltage                    Vdd             1/S              1
Threshold Voltage                 VT0             1/S              1
Doping Density                    NA, ND          S                S²
Device Characteristics
Area (A)                          WL              1/S²             1/S²
β                                 W/(L tox)       S                S
D-S Current (IDS)                 β(Vdd - VT)²    1/S              S
Gate Capacitance (Cg)             WL/tox          1/S              1/S
Transistor On-Resistance (Rtr)    Vdd/IDS         1                1/S
Intrinsic Gate Delay (τ)          RtrCg           1/S              1/S²
Clock Frequency                   f               f                f
Power Dissipation (P)             IDSVdd          1/S²             S
Power Dissipation Density (P/A)   P/A             1                S³
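The device-characteristic rows can be derived mechanically from the sensitivity column once the scaling of L, W, tox and Vdd is fixed. The sketch below does this for a scale factor S, assuming the threshold voltage scales with the supply so that (Vdd − VT) scales like Vdd; the function name and dictionary keys are hypothetical.

```python
def scale_device(s: float, mode: str) -> dict:
    """Scaling factors of derived quantities for 'constant_field' or 'constant_voltage'."""
    if mode not in ("constant_field", "constant_voltage"):
        raise ValueError(f"unknown scaling mode: {mode}")
    dim = 1.0 / s                                   # L, W and tox all scale by 1/S
    v = 1.0 / s if mode == "constant_field" else 1.0

    area = dim * dim                                # W * L
    beta = dim / (dim * dim)                        # W / (L * tox)
    ids = beta * v ** 2                             # beta * (Vdd - VT)^2
    cg = dim * dim / dim                            # W * L / tox
    rtr = v / ids                                   # Vdd / IDS
    delay = rtr * cg                                # Rtr * Cg
    power = ids * v                                 # IDS * Vdd
    return {
        "area": area, "beta": beta, "ids": ids, "cg": cg,
        "rtr": rtr, "delay": delay, "power": power,
        "power_density": power / area,
    }


cf = scale_device(2.0, "constant_field")
cv = scale_device(2.0, "constant_voltage")
for key in cf:
    print(f"{key:14s} constant field: {cf[key]:<6g} constant voltage: {cv[key]:g}")
```

With S = 2 the constant-field power density stays at 1 while the constant-voltage power density rises to 8 (S³), which is the practical limit on constant-voltage scaling.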
Lithography:
Optics technology              Technology node
248 nm mercury-xenon lamp      180 - 250 nm
248 nm krypton-fluoride laser  130 - 180 nm
193 nm argon-fluoride laser    100 - 130 nm
157 nm fluorine laser          70 - 100 nm
13.4 nm extreme UV             50 - 70 nm
Lithography:
• Electron Beam Lithography (EBL)
– Patterns are derived directly from digital data
– The process can be direct: no masks
– Pattern changes can be implemented quickly
– However:
• Equipment cost is high
• Large amount of time required to access all the points
on the wafer