vlsi digital circuits full power point presentation

VLSI Digital Circuits
Lecture 15: Timing Issues
[Adapted from Prof. Mary Jane Irwin’s slides, Rabaey’s Digital Integrated
Circuits, Second Edition, ©2003 J. Rabaey, A. Chandrakasan, B. Nikolic]

Timing Classifications
- Synchronous systems
  - All memory elements in the system are updated simultaneously using a globally distributed periodic synchronization signal (i.e., a global clock signal)
  - Functionality is ensured by strict constraints on clock signal generation and distribution, to minimize
    - Clock skew (spatial variations in clock edges)
    - Clock jitter (temporal variations in clock edges)
- Asynchronous systems
  - Self-timed (controlled) systems
  - No need for a globally distributed clock, but they incur asynchronous circuit overheads (handshaking logic, etc.)
- Hybrid systems
  - Synchronization between different clock domains
  - Interfacing between asynchronous and synchronous domains

Review: Synchronous Timing Basics
- Under ideal conditions (i.e., when tclk1 = tclk2)
    T ≥ tc-q + tplogic + tsu
    thold ≤ tcdlogic + tcdreg
- Under real conditions, the clock signal can have both spatial (clock skew) and temporal (clock jitter) variations
  - Skew is constant from cycle to cycle (by definition); skew can be positive (clock and data flowing in the same direction) or negative (clock and data flowing in opposite directions)
  - Jitter causes T to change on a cycle-by-cycle basis
[Figure: In feeds register R1 (clocked at tclk1), whose output passes through combinational logic to register R2 (clocked at tclk2); both registers are driven from clk. Register parameters: tc-q, tsu, thold, tcdreg. Logic parameters: tplogic, tcdlogic.]
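The two ideal-clock constraints can be checked numerically. The following is a minimal Python sketch; the function names and the delay values (in picoseconds) are hypothetical and chosen only for illustration, not taken from the slides.

```python
def min_clock_period(t_cq, t_plogic, t_su):
    """Setup constraint with ideal clocks: T >= tc-q + tplogic + tsu."""
    return t_cq + t_plogic + t_su


def hold_is_met(t_hold, t_cdlogic, t_cdreg):
    """Hold constraint with ideal clocks: thold <= tcdlogic + tcdreg."""
    return t_hold <= t_cdlogic + t_cdreg


# Hypothetical delays in ps (assumed values, not from the lecture).
T_min = min_clock_period(t_cq=100, t_plogic=700, t_su=50)
print(T_min)                 # 850 (ps)
print(1e6 / T_min)           # maximum clock frequency in MHz, about 1176
print(hold_is_met(t_hold=40, t_cdlogic=30, t_cdreg=60))  # True
```

Note that the setup check uses the worst-case (propagation) delays, while the hold check uses the best-case (contamination) delays.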

Sources of Clock Skew and Jitter in Clock Network
[Figure: clock network fed by a PLL, annotated with seven numbered sources of skew and jitter: clock generation, clock drivers, power supply, interconnect, capacitive load, capacitive coupling, temperature]
- Skew
  - manufacturing device variations in clock drivers
  - interconnect variations
  - environmental variations (power supply and temperature)
- Jitter
  - clock generation
  - capacitive loading and coupling
  - environmental variations (power supply and temperature)

Positive Clock Skew
- Clock and data flow in the same direction
[Figure: In feeds R1, then combinational logic, then R2; the clock passes through a delay before reaching R2, so tclk2 lags tclk1 by δ > 0. The timing diagram marks clock edges 1 to 4, the period T, the interval T + δ, and the window δ + thold.]
T:     T + δ ≥ tc-q + tplogic + tsu, so T ≥ tc-q + tplogic + tsu - δ
thold: thold + δ ≤ tcdlogic + tcdreg, so thold ≤ tcdlogic + tcdreg - δ
- δ > 0: Improves performance, but makes thold harder to meet. If thold is not met (race condition), the circuit malfunctions independent of the clock period!
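As a rough illustration of the trade-off with positive skew, the sketch below computes the largest skew that still satisfies the hold constraint and shows how the minimum period shrinks as the skew grows. The function names and delay values (in ps) are assumptions made for this example.

```python
def max_safe_skew(t_hold, t_cdlogic, t_cdreg):
    """Largest delta that still satisfies thold <= tcdlogic + tcdreg - delta."""
    return t_cdlogic + t_cdreg - t_hold


def min_period_with_skew(t_cq, t_plogic, t_su, delta):
    """Setup constraint with skew: T >= tc-q + tplogic + tsu - delta."""
    return t_cq + t_plogic + t_su - delta


# Hypothetical delays in ps (assumed values).
limit = max_safe_skew(t_hold=40, t_cdlogic=30, t_cdreg=60)  # 50 ps
for delta in (0, 30, 60):
    T_min = min_period_with_skew(100, 700, 50, delta)
    status = "ok" if delta <= limit else "RACE"
    print(delta, T_min, status)
# 0 850 ok
# 30 820 ok
# 60 790 RACE
```

The last row has the shortest period but violates hold, and no choice of clock period can repair that.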

Negative Clock Skew
- Clock and data flow in opposite directions
[Figure: In feeds R1, then combinational logic, then R2; the clock is routed against the data through a delay, so it reaches R2 before R1 (δ < 0). The timing diagram marks clock edges 1 to 4, the period T, and the interval T + δ.]
T:     T + δ ≥ tc-q + tplogic + tsu, so T ≥ tc-q + tplogic + tsu - δ
thold: thold + δ ≤ tcdlogic + tcdreg, so thold ≤ tcdlogic + tcdreg - δ
- δ < 0: Degrades performance, but thold is easier to meet (eliminating race conditions)
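To see how negative skew can remove a race, the sketch below evaluates the hold slack and the minimum period for a path that fails hold at zero skew. All names and numbers (in ps) are hypothetical.

```python
def hold_slack(t_hold, t_cdlogic, t_cdreg, delta):
    """Slack of thold <= tcdlogic + tcdreg - delta; negative means a race."""
    return t_cdlogic + t_cdreg - delta - t_hold


def min_period(t_cq, t_plogic, t_su, delta):
    """Setup constraint: T >= tc-q + tplogic + tsu - delta."""
    return t_cq + t_plogic + t_su - delta


# Hypothetical path with a long hold time (assumed values, in ps).
for delta in (0, -20):
    print(delta, hold_slack(100, 30, 60, delta), min_period(100, 700, 50, delta))
# 0 -10 850    (hold violated)
# -20 10 870   (hold met, but the period grows by 20 ps)
```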

Clock Jitter
- Jitter causes T to vary on a cycle-by-cycle basis
[Figure: In feeds R1, then combinational logic, clocked by tclk; each clock edge can move between -tjitter and +tjitter around its nominal position in the period T.]
T: T - 2tjitter ≥ tc-q + tplogic + tsu, so T ≥ tc-q + tplogic + tsu + 2tjitter
- Jitter directly reduces the performance of a sequential circuit
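The cost of jitter can be put in numbers with a short sketch. The worst case is a late launching edge followed by an early capturing edge, which removes 2tjitter from the usable period. The function name and delay values (in ps) below are assumptions for illustration.

```python
def min_period_with_jitter(t_cq, t_plogic, t_su, t_jitter):
    """Setup constraint with jitter: T >= tc-q + tplogic + tsu + 2*tjitter."""
    return t_cq + t_plogic + t_su + 2 * t_jitter


# Hypothetical delays in ps (assumed values).
base = min_period_with_jitter(100, 700, 50, t_jitter=0)    # 850
jit = min_period_with_jitter(100, 700, 50, t_jitter=25)    # 900
loss_percent = 100 * (1 - base / jit)                      # frequency lost to jitter
print(base, jit, round(loss_percent, 2))                   # 850 900 5.56
```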

Combined Impact of Skew and Jitter
- Constraints on the minimum clock period (δ > 0)
[Figure: In feeds R1 (tclk1), then combinational logic, then R2 (tclk2), with δ > 0. The timing diagram marks clock edges 1, 6 and 12, the period T, the interval T + δ, and a -tjitter offset.]
T:     T ≥ tc-q + tplogic + tsu - δ + 2tjitter
thold: thold ≤ tcdlogic + tcdreg - δ - 2tjitter
- δ > 0 with jitter: Degrades performance, and makes thold even harder to meet. (The acceptable skew is reduced by jitter.)
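Both combined constraints can be evaluated together. The sketch below is a hypothetical helper (the `PathTiming` class, its field names, and all values in ps are assumptions) that reports the minimum period and the largest acceptable skew for a given jitter.

```python
from dataclasses import dataclass


@dataclass
class PathTiming:
    """Timing parameters of one register-to-register path, in ps."""
    t_cq: float
    t_su: float
    t_hold: float
    t_cdreg: float
    t_plogic: float
    t_cdlogic: float


def min_period(p, delta, t_jitter):
    """T >= tc-q + tplogic + tsu - delta + 2*tjitter."""
    return p.t_cq + p.t_plogic + p.t_su - delta + 2 * t_jitter


def max_skew(p, t_jitter):
    """Largest delta satisfying thold <= tcdlogic + tcdreg - delta - 2*tjitter."""
    return p.t_cdlogic + p.t_cdreg - p.t_hold - 2 * t_jitter


# Hypothetical path (assumed values).
path = PathTiming(t_cq=100, t_su=50, t_hold=40, t_cdreg=60,
                  t_plogic=700, t_cdlogic=30)
print(max_skew(path, t_jitter=0))             # 50
print(max_skew(path, t_jitter=10))            # 30: jitter shrinks the acceptable skew
print(min_period(path, delta=30, t_jitter=10))  # 840
```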

Clock Distribution Networks
- Clock skew and jitter can ultimately limit the performance of a digital system, so designing a clock network that minimizes both is important
  - In many high-speed processors, a majority of the dynamic power is dissipated in the clock network.
  - To reduce dynamic power, the clock network must support clock gating (shutting down units by disabling their clock)
- Clock distribution techniques
  - Balanced paths (H-tree network, matched RC trees)
    - In the ideal case, can eliminate skew
    - Could take multiple cycles for the clock signal to propagate to the leaves of the tree
  - Clock grids
    - Typically used in the final stage of the clock distribution network
    - Minimizes absolute delay (not relative delay)
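The balanced-path idea behind an H-tree can be illustrated geometrically: every leaf sits at the same wire distance from the root, so in the ideal case every leaf sees the same delay. The sketch below is a simplified model with assumed geometry (branch length halves after each vertical segment) and ignores buffers, loads, and process variation.

```python
def h_tree_leaves(x, y, length, levels, horizontal=True, dist=0.0):
    """Return (x, y, wire_length_from_root) for every leaf of an idealized H-tree."""
    if levels == 0:
        return [(x, y, dist)]
    leaves = []
    for sign in (-1, 1):
        # Branch left/right on horizontal levels, down/up on vertical levels.
        nx = x + sign * length if horizontal else x
        ny = y if horizontal else y + sign * length
        # Halve the branch length after each vertical segment (every two levels).
        next_len = length if horizontal else length / 2
        leaves += h_tree_leaves(nx, ny, next_len, levels - 1,
                                not horizontal, dist + length)
    return leaves


leaves = h_tree_leaves(0.0, 0.0, 4.0, levels=4)
path_lengths = {d for _, _, d in leaves}
print(len(leaves), path_lengths)   # 16 {12.0}
```

A single value in `path_lengths` means the root-to-leaf wire length is identical for all 16 leaves, which is the zero-skew condition under this idealized model.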

H-Tree Clock Network
- If the paths are perfectly balanced, clock skew is zero
[Figure: H-tree clock network. A gate combines Clock with an Idle condition signal to produce a Gated clock for a subtree.]
- Can insert clock gating at multiple levels in clock tree
- Can shut off entire subtree if all gating conditions are satisfied
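Gating at multiple levels of the tree can be modeled as a recursive walk in which an idle node silences everything beneath it. The sketch below is a toy model: the `ClockNode` class, the unit names, and the capacitance values are hypothetical, and it relies on dynamic power being proportional to the switched capacitance.

```python
from dataclasses import dataclass, field


@dataclass
class ClockNode:
    name: str
    cap_pf: float                      # capacitance switched by this node (assumed, pF)
    idle: bool = False                 # gating condition: True shuts off the subtree
    children: list = field(default_factory=list)


def switched_cap(node):
    """Capacitance still toggling; a gated (idle) node silences its whole subtree."""
    if node.idle:
        return 0.0
    return node.cap_pf + sum(switched_cap(c) for c in node.children)


# Hypothetical two-level tree.
fpu = ClockNode("fpu", 5.0, children=[ClockNode("fpu0", 3.0), ClockNode("fpu1", 3.0)])
alu = ClockNode("alu", 5.0, children=[ClockNode("alu0", 3.0), ClockNode("alu1", 3.0)])
root = ClockNode("root", 10.0, children=[fpu, alu])

print(switched_cap(root))   # 32.0 with nothing gated
fpu.idle = True
print(switched_cap(root))   # 21.0 with the fpu subtree shut off
```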

Clock Grid Network
- Distributed buffering reduces absolute delay and makes clock gating easier, but is sensitive to variations in the buffer delay
[Figure: a main clock buffer drives secondary clock buffers, each of which delivers the clock to a local logic area]
- The secondary buffers isolate the local clock nets from the upstream load and amplify the clock signals degraded by the RC network
  - decreases absolute skew
  - gives steeper clock edges
- Only have to bound the skew within the local logic area

DEC Alpha 21164 (EV5) Example
- 300 MHz clock (9.3 million transistors on a 16.5 x 18.1 mm die in 0.5 micron CMOS technology)
  - Single-phase clock
- 3.75 nF total clock load
  - Extensive use of dynamic logic
- 20 W (out of 50 W) in clock distribution network
- Two-level clock distribution
  - Single six-stage inverter main clock buffer at the center of the chip
  - Secondary clock buffers drive the left and right sides of the clock grid in m3 and m4
- Total equivalent driver size of 58 cm!

Clock Skew in Alpha Processor
- Absolute skew smaller than 90 ps
- The critical instruction and execution units all see the clock within 65 ps

Dealing with Clock Skew and Jitter
- To minimize skew, balance clock paths using H-tree or matched-tree clock distribution structures.
- If possible, route data and clock in opposite directions; this eliminates races at the cost of performance.
- The use of gated clocks to reduce dynamic power consumption makes jitter worse.
- Shield clock wires (route power lines, VDD or GND, next to clock lines) to minimize or eliminate coupling with neighboring signal nets.
- Use dummy fills to reduce skew by reducing variations in interconnect capacitances due to interlayer dielectric thickness variations.
- Beware of temperature and supply rail variations and their effects on skew and jitter. Power supply noise fundamentally limits the performance of clock networks.

Major Components of a Computer
[Figure: block diagram of a computer: Processor (Control, Datapath), Memory, and Devices (Input, Output)]
- Modern processor architecture styles (CSE 431)
  - Pipelined, single issue (e.g., ARM)
  - Pipelined, hardware-controlled multiple issue (superscalar)
  - Pipelined, software-controlled multiple issue (VLIW)
  - Pipelined, multiple issue from multiple process threads (multithreaded)

Basic Building Blocks
- Datapath
  - Execution units
    - Adder, multiplier, divider, shifter, etc.
  - Register file and pipeline registers
  - Multiplexers, decoders
- Control
  - Finite state machines (PLA, ROM, random logic)
- Interconnect
  - Switches, arbiters, buses
- Memory
  - Caches, TLBs, DRAM, buffers

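MIPS 5-Stage Pipelined (Single Issue) Datapath
[Figure: five-stage MIPS datapath with stages Fetch, Decode, Execute, Memory, and WriteBack, separated by the IF/Dec, Dec/Exec, Exec/Mem, and Mem/WB pipeline stage isolation registers. Blocks shown: PC, I$ (Read Address), an adder that adds 4 to the PC, Register File (Read Addr 1, Read Addr 2, Write Addr, Write Data, Read Data 1, Read Data 2), Sign Extend (16 to 32 bits), Shift left 2, a second adder, ALU, D$ (Address, Write Data, Read Data), and two-input multiplexers. Timing signals shown: clk, Icache precharge, Dcache precharge, RegWrite.]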

Datapath Bit-Sliced Organization
- Tile identical bit-slice elements
[Figure: bit slices Bit 0 to Bit 3 stacked along the control-flow direction. Along the data-flow direction the datapath contains pipeline registers, the register file, multiplexers, an adder, and a shifter, with data arriving from I$ and going to/from D$.]
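The idea of tiling identical bit slices can be sketched in software with an adder built from one repeated slice. The ripple-carry structure used here is an assumption for illustration (the slide does not state the adder type), and the function names are hypothetical.

```python
def full_adder_slice(a, b, carry_in):
    """One bit slice: returns (sum_bit, carry_out) for single-bit inputs."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return s, carry_out


def bit_sliced_add(a, b, width=4):
    """Tile `width` identical slices; the carry ripples from bit 0 upward."""
    carry = 0
    result = 0
    for i in range(width):
        s, carry = full_adder_slice((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry


print(bit_sliced_add(5, 6))   # (11, 0)
print(bit_sliced_add(9, 8))   # (1, 1): 17 overflows 4 bits, carry out is set
```

Widening the datapath only changes `width`; the slice itself is unchanged, which is the point of the bit-sliced layout.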
