VLSI Digital Signal Processing Systems
Keshab K. Parhi
• Textbook:
– K.K. Parhi, VLSI Digital Signal Processing Systems: Design and
Implementation, John Wiley, 1999
• Buy Textbook:
– http://www.bn.com
– http://www.amazon.com
– http://www.bestbookbuys.com
Chap. 2 2
Chapter 1. Introduction to DSP Systems
• Introduction (Read Sec. 1.1, 1.3)
• Non-Terminating Programs Require Real-Time
Operations
• Applications dictate different speed constraints
(e.g., voice, audio, cable modem, set-top box,
Gigabit Ethernet, 3-D graphics)
• Need to design Families of Architectures for
specified algorithm complexity and speed
constraints
• Representations of DSP Algorithms (Sec. 1.4)
Typical DSP Programs
• Usually highly real-time: hardware and/or software must be designed to meet
the application speed constraint
[Figure: samples in → DSP System → out]
• Non-terminating
– Example:
for n = 1 to ∞
y(n) = a·x(n) + b·x(n−1) + c·x(n−2)
end
[Figure: input signals sampled at times 0, T, 2T, 3T, ..., nT → Algorithms → out]
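The non-terminating loop above can be sketched as a Python generator that produces one output per incoming sample for as long as input keeps arriving. This is only an illustration: the function name `fir3`, the zero initial state, and the coefficient values are assumptions, not part of the slides.

```python
from itertools import count, islice

def fir3(samples, a, b, c):
    """Yield y(n) = a*x(n) + b*x(n-1) + c*x(n-2) for each incoming sample."""
    x1 = x2 = 0  # x(n-1) and x(n-2); assumed zero before the first sample
    for x in samples:          # never ends if the input stream never ends
        yield a * x + b * x1 + c * x2
        x1, x2 = x, x1         # shift the two-element delay line

# Illustrative use: an endless input 1, 2, 3, ... with made-up coefficients.
# islice takes only the first four outputs so the demo terminates.
first_four = list(islice(fir3(count(1), a=1, b=2, c=3), 4))
print(first_four)
```

The generator itself never terminates on an endless input; only the consumer decides how many outputs to take, which mirrors the "for n = 1 to ∞" structure.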
Area-Speed-Power Tradeoffs
• 3-Dimensional Optimization (Area, Speed, Power)
• Achieve Required Speed, Area-Power Tradeoffs
• Power Consumption
$P = C \cdot V^2 \cdot f$
• Latency reduction Techniques => Increase in speed or
power reduction through lower supply voltage operation
• Since the capacitance of the multiplier is usually dominant,
reduction of the number of multiplications is important
(this is possible through strength reduction)
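To make the quadratic dependence on supply voltage concrete, here is a small sketch of the power formula. The capacitance, voltage, and frequency values are invented for illustration; only the relation P = C·V²·f comes from the slide.

```python
def dynamic_power(capacitance_f, voltage_v, frequency_hz):
    """Dynamic power P = C * V^2 * f, in watts."""
    return capacitance_f * voltage_v ** 2 * frequency_hz

# Hypothetical operating points: same capacitance and clock, half the supply voltage.
base = dynamic_power(1e-12, 3.3, 100e6)
low = dynamic_power(1e-12, 1.65, 100e6)
ratio = low / base   # halving V cuts power to about one quarter
print(ratio)
```

Under these assumed numbers, halving the supply voltage at a fixed clock rate reduces dynamic power by roughly a factor of four, which is why speed gained from latency reduction can be traded for lower-voltage operation.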
Representation Methods of DSP systems
Example: y(n)=a*x(n)+b*x(n-1)+c*x(n-2)
[Figure: diagram of the example with coefficients a, b, c and output y(n)]
• Graphical Representation Method 2: Signal-Flow Graph
– SFG: a collection of nodes and directed edges
– Nodes: represent computations and/or tasks; a node sums all incoming signals
– Directed edge (j, k): denotes a linear transformation from the input signal
at node j to the output signal at node k
– Linear SFGs can be transformed into different forms without changing the
system functions. For example, Flow graph reversal or transposition is
one of these transformations (Note: only applicable to single-input-single-
output systems)
– Usually used for linear time-invariant DSP systems representation
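As an illustration of transposition leaving the system function unchanged, the sketch below runs an FIR filter in direct form and in transposed form and compares the outputs. The function names and the test coefficients are assumptions chosen for the example.

```python
def fir_direct(xs, coeffs):
    """Direct form: delay the input samples, then multiply and sum."""
    delay = [0] * (len(coeffs) - 1)            # holds x(n-1), x(n-2), ...
    out = []
    for x in xs:
        taps = [x] + delay
        out.append(sum(c * t for c, t in zip(coeffs, taps)))
        delay = taps[:-1]                      # shift the delay line
    return out

def fir_transposed(xs, coeffs):
    """Transposed form: multiply the input by every coefficient, delay partial sums."""
    state = [0] * (len(coeffs) - 1)            # registers holding partial sums
    out = []
    for x in xs:
        products = [c * x for c in coeffs]
        out.append(products[0] + (state[0] if state else 0))
        # Each register takes the next product plus the following register's old value.
        state = [products[k + 1] + (state[k + 1] if k + 1 < len(state) else 0)
                 for k in range(len(state))]
    return out

xs = [1, 2, 3, 4]
coeffs = [1, 2, 3]                             # stand-ins for a, b, c
print(fir_direct(xs, coeffs), fir_transposed(xs, coeffs))
```

Both structures compute the same single-input, single-output response; they differ only in where the delays sit relative to the multipliers and adders.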
• Graphical Representation Method 3: Data-Flow Graph
– DFG: nodes represent computations (or functions or subtasks), while the
directed edges represent data paths (data communications between nodes),
each edge has a nonnegative number of delays associated with it.
– DFG captures the data-driven property of DSP algorithm: any node can
perform its computation whenever all its input data are available.
– Each edge describes a precedence constraint between two nodes in DFG:
• Intra-iteration precedence constraint: if the edge has zero delays
• Inter-iteration precedence constraint: if the edge has one or more delays
• DFGs and Block Diagrams can be used to describe both linear single-rate and
nonlinear multi-rate DSP systems
• Fine-Grain DFG
[Figure: fine-grain DFG with input x(n), two delays D, coefficients a, b, c, and output y(n)]
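One possible way to hold a DFG in code is a list of edges, each carrying its delay count; the precedence type then follows directly from that count. The `Edge` type and the two-node graph below are hypothetical, chosen only to show the classification rule.

```python
from collections import namedtuple

# Hypothetical edge record: source node, destination node, number of delays.
Edge = namedtuple("Edge", "src dst delays")

def precedence_kind(edge):
    """Classify an edge by the rule on the slide: zero delays vs. one or more."""
    if edge.delays < 0:
        raise ValueError("a DFG edge must have a nonnegative number of delays")
    return "intra-iteration" if edge.delays == 0 else "inter-iteration"

# Made-up two-node DFG: A feeds B directly, B feeds A through one delay.
dfg = [Edge("A", "B", 0), Edge("B", "A", 1)]
kinds = {(e.src, e.dst): precedence_kind(e) for e in dfg}
print(kinds)
```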
Examples of DFG
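[Figure: DFG with the blocks FFT, Adaptive filtering, and IFFT]
[Figure: decimator (↓2) taking N samples to N/2 samples, shown as equivalent (≡) to an edge labelled 2 and 1]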
• Introduction
• Loop Bound
– Important Definitions and Examples
• Iteration Bound
– Important Definitions and Examples
– Techniques to Compute Iteration Bound
Introduction
• Iteration: execution of all computations (or functions) in an algorithm
once
– Example 1: [Figure: DFG A → B → C, with the numbers 1, 2, 2, 3, 2, 1 labelling its edges]
• For 1 iteration, computations are: A 2 times, B 2 times, C 3 times
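The firing counts per iteration can be derived by balancing samples produced and consumed on every edge. The sketch below assumes one reading of the figure labels, namely that A produces 2 and B consumes 2 on edge A → B, and B produces 3 and C consumes 2 on edge B → C. That reading is an assumption; it is chosen because it reproduces the counts stated on the slide. The function name `repetitions` is also hypothetical.

```python
from fractions import Fraction
from math import lcm

def repetitions(nodes, edges):
    """Smallest integer firing counts r with produced * r[src] == consumed * r[dst].

    edges: list of (src, dst, produced, consumed).
    Assumes the graph is connected and its rates are consistent (not checked).
    """
    rate = {nodes[0]: Fraction(1)}
    pending = list(edges)
    while pending:
        remaining = []
        for src, dst, produced, consumed in pending:
            if src in rate:
                rate.setdefault(dst, rate[src] * produced / consumed)
            elif dst in rate:
                rate[src] = rate[dst] * consumed / produced
            else:
                remaining.append((src, dst, produced, consumed))
        if len(remaining) == len(pending):
            raise ValueError("graph is not connected")
        pending = remaining
    scale = lcm(*(r.denominator for r in rate.values()))   # clear the fractions
    return {n: int(r * scale) for n, r in rate.items()}

counts = repetitions(["A", "B", "C"], [("A", "B", 2, 2), ("B", "C", 3, 2)])
print(counts)
```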
Introduction (cont’d)
– Assume the execution times of the multiplier and adder are Tm and Ta; then the
iteration period for this example is Tm + Ta (assume 10 ns, see the red
box). So the sample period Ts of the signal must satisfy:
Ts ≥ Tm + Ta
• Definitions:
– Iteration rate: the number of iterations executed per second
– Sample rate: the number of samples processed in the DSP system per
second (also called throughput)
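As a worked instance of these definitions, assuming one sample is processed per iteration and using the 10 ns iteration period from the example, the highest achievable sample rate would be:

$$f_s = \frac{1}{T_s} \le \frac{1}{T_m + T_a} = \frac{1}{10\ \text{ns}} = 100\ \text{MHz}$$

In that case the iteration rate and the sample rate coincide.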
Iteration Bound
• Definitions:
– Loop: a directed path that begins and ends at the same node
– Loop bound of the j-th loop: defined as Tj/Wj, where Tj is the loop
computation time & Wj is the number of delays in the loop
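– Example 1: a → b → c → a is a loop (see the same example in Note 2, PP2);
its loop bound:
$T_{\text{loopbound}} = T_m + T_a = 10\ \text{ns}$
[Figure: loop with input x(n), an adder, a multiplier a, a 2D delay, and the signal y(n-2)]
$T_{\text{loopbound}} = \frac{T_m + T_a}{2} = 5\ \text{ns}$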
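The loop-bound definition Tj/Wj can be written as a short helper. The split Tm = 6 ns and Ta = 4 ns below is a made-up assumption (the slide only gives the sum Tm + Ta = 10 ns), and the function names are illustrative. `iteration_bound` takes the maximum loop bound over all loops, which is the standard definition of the iteration bound.

```python
def loop_bound(node_times, num_delays):
    """Loop bound T_j / W_j: total computation time over number of delays."""
    if num_delays == 0:
        return float("inf")      # a delay-free loop has no finite bound
    return sum(node_times) / num_delays

def iteration_bound(loops):
    """Maximum loop bound over all loops, given as (node_times, num_delays) pairs."""
    return max(loop_bound(times, delays) for times, delays in loops)

T_M, T_A = 6, 4                  # hypothetical split of the 10 ns total
one_delay = loop_bound([T_M, T_A], 1)    # loop with a single delay
two_delays = loop_bound([T_M, T_A], 2)   # loop with a 2D delay
print(one_delay, two_delays)
```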
Iteration Bound (cont’d)
– Example 3: compute the loop bounds of the following loops:
Iteration Bound (cont’d)
• If there is no delay element in the loop, then $T_\infty = T_L / 0 = \infty$
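– Delay-free loops are non-computable, see the example:
[Figure: delay-free loop between nodes A and B]
• Non-causal systems cannot be implemented
[Figure: nodes A and B connected through Z and through Z^{-1}]
$B = A \cdot Z$ (non-causal)
$A = B \cdot Z^{-1}$ (causal)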
• Speed of the DSP system: depends on the “critical path comp. time”
– Paths: do not contain delay elements (4 possible path locations)
• (1) input node →delay element
• (2) delay element’s output → output node
• (3) input node → output node
• (4) delay element → delay element
– Critical path of a DFG: the path with the longest computation time among
all paths that contain zero delays
– Clock period is lower bounded by the critical path computation time
Iteration Bound (cont’d)
– Example: Assume Tm = 10ns, Ta = 4ns, then the length of the critical
path is 26ns (see the red lines in the following figure)
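[Figure: FIR filter with input x(n), four delays D, coefficients a, b, c, d, e, and output y(n); the paths through the five multipliers are annotated 26, 26, 22, 18, and 14]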
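The 26 ns figure can be reproduced by a longest-path search over the zero-delay part of the graph. The sketch assumes a direct-form structure in which the five products are summed by a chain of four adders; that structure is inferred from the annotations 26, 26, 22, 18, 14 and is not stated in the extracted text. Node names such as `m0` and `a1` are hypothetical.

```python
from functools import lru_cache

T_M, T_A = 10, 4                                     # ns, from the example

# Computation time of each node: multipliers m0..m4 (a..e), adders a1..a4.
times = {f"m{k}": T_M for k in range(5)}
times.update({f"a{k}": T_A for k in range(1, 5)})

# Zero-delay edges only (assumed adder chain): m0 and m1 feed a1, then each
# later multiplier joins the chain one adder further along.
succ = {
    "m0": ["a1"], "m1": ["a1"], "m2": ["a2"], "m3": ["a3"], "m4": ["a4"],
    "a1": ["a2"], "a2": ["a3"], "a3": ["a4"], "a4": [],
}

@lru_cache(maxsize=None)
def longest_from(node):
    """Longest computation time of any zero-delay path starting at node."""
    return times[node] + max((longest_from(n) for n in succ[node]), default=0)

per_multiplier = [longest_from(f"m{k}") for k in range(5)]
critical_path = max(longest_from(n) for n in times)
print(per_multiplier, critical_path)
```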
Precedence Constraints
y(n) = a·y(n-1) + x(n)
[Figure: DFG for this recursion with input x(n), nodes A (10) and B (3), and delay elements D]
Inter-iteration precedence constraint: A1 → B2, A2 → B3
Tloop = 13 u.t.: A1 → B1 ⇒ A2 → B2 ⇒ A3 ...
• Algorithms to compute iteration bound
– Longest Path Matrix (LPM)
– Minimum Cycle Mean (MCM)
• Longest Path Matrix Algorithm
– Let d be the number of delays in the DFG.
– A series of matrices $L^{(m)}$, m = 1, 2, …, d, are constructed
such that $l_{i,j}^{(m)}$ is the longest computation time of all paths
from delay element $d_i$ to $d_j$ that pass through exactly
(m-1) delays. If such a path does not exist, $l_{i,j}^{(m)} = -1$.
– The longest path between any two nodes can be
computed using either the Bellman-Ford algorithm or the
Floyd-Warshall algorithm (Appendix A).
– Usually, $L^{(1)}$ is computed using the DFG. The higher order
matrices are computed recursively as follows:
$l_{i,j}^{(m+1)} = \max_{k \in K}\left(-1,\; l_{i,k}^{(1)} + l_{k,j}^{(m)}\right)$
where K is the set of integers k in the interval [1, d] such
that neither $l_{i,k}^{(1)} = -1$ nor $l_{k,j}^{(m)} = -1$ holds.
– The iteration bound is given by
$T_\infty = \max_{i, m \in \{1, 2, \ldots, d\}} \left\{ l_{i,i}^{(m)} / m \right\}$
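• Example:
[Figure: DFG with nodes 1 to 6, annotated with computation times (1) and (2), and four delay elements d1 to d4]
$$L^{(1)} = \begin{bmatrix} -1 & 0 & -1 & -1 \\ 4 & -1 & 0 & -1 \\ 5 & -1 & -1 & 0 \\ 5 & -1 & -1 & -1 \end{bmatrix}
\qquad
L^{(2)} = \begin{bmatrix} 4 & -1 & 0 & -1 \\ 5 & 4 & -1 & 0 \\ 5 & 5 & -1 & -1 \\ -1 & 5 & -1 & -1 \end{bmatrix}$$
$$L^{(3)} = \begin{bmatrix} 5 & 4 & -1 & 0 \\ 8 & 5 & 4 & -1 \\ 9 & 5 & 5 & -1 \\ 9 & -1 & 5 & -1 \end{bmatrix}
\qquad
L^{(4)} = \begin{bmatrix} 8 & 5 & 4 & -1 \\ 9 & 8 & 5 & 4 \\ 10 & 9 & 5 & 5 \\ 10 & 9 & -1 & 5 \end{bmatrix}$$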
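The LPM recursion can be sketched in a few lines, starting from the L(1) matrix of this example. This is a minimal illustration rather than the textbook's implementation: the function names are assumptions, and L(1) is entered by hand instead of being derived from the DFG with a longest-path algorithm.

```python
from fractions import Fraction

NO_PATH = -1   # marker used on the slides for "no such path"

def next_matrix(L1, Lm):
    """Compute L(m+1) from L(1) and L(m): max over k of l_ik(1) + l_kj(m)."""
    d = len(L1)
    out = [[NO_PATH] * d for _ in range(d)]
    for i in range(d):
        for j in range(d):
            candidates = [L1[i][k] + Lm[k][j] for k in range(d)
                          if L1[i][k] != NO_PATH and Lm[k][j] != NO_PATH]
            out[i][j] = max(candidates, default=NO_PATH)
    return out

def lpm_matrices(L1):
    """Return [L(1), L(2), ..., L(d)]."""
    mats = [L1]
    for _ in range(len(L1) - 1):
        mats.append(next_matrix(L1, mats[-1]))
    return mats

def iteration_bound_lpm(L1):
    """Largest l_ii(m) / m over all diagonal entries that are not NO_PATH."""
    ratios = [Fraction(L[i][i], m)
              for m, L in enumerate(lpm_matrices(L1), start=1)
              for i in range(len(L1)) if L[i][i] != NO_PATH]
    return max(ratios)   # raises ValueError if the graph has no loop

L1 = [[-1, 0, -1, -1],
      [4, -1, 0, -1],
      [5, -1, -1, 0],
      [5, -1, -1, -1]]
mats = lpm_matrices(L1)
t_inf = iteration_bound_lpm(L1)
print(mats[3], t_inf)
```

For this example the diagonal entries give the ratios 4/2, 5/3, 8/4, and 5/4, so the iteration bound works out to 2.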
– To compute the maximum cycle mean of $G_d$, the MCM of $G_d'$
is computed and multiplied by −1. $G_d'$ is similar to $G_d$
except that its weights are the negatives of those of $G_d$.
Algorithm for MCM:
– Construct a series of d+1 vectors, $f^{(m)}$, m = 0, 1, …, d, which
are each of dimension d×1.
– An arbitrary reference node s is chosen and $f^{(0)}$ is formed
by setting $f^{(0)}(s) = 0$ and the remaining entries of $f^{(0)}$ to ∞.
– The remaining vectors $f^{(m)}$, m = 1, 2, …, d, are recursively
computed according to
$f^{(m)}(j) = \min_{i \in I}\left(f^{(m-1)}(i) + w'(i,j)\right)$
where I is the set of nodes in $G_d'$ such that there exists
an edge from node i to node j.
– The iteration bound is given by:
$T_\infty = -\min_{i \in \{1, 2, \ldots, d\}} \left( \max_{m \in \{0, 1, \ldots, d-1\}} \frac{f^{(d)}(i) - f^{(m)}(i)}{d - m} \right)$
• Example:
[Figure: graph Gd on nodes 1 to 4 with edge weights 0, 4, and 5, converted to Gd' with edge weights 0, -4, and -5]
Entries are (f(d)(i) - f(m)(i))/(d-m); the last column is the maximum over m ∈ {0, 1, …, d-1}:
        m=0    m=1    m=2    m=3    max
i=1     -2     -∞     -2     -3     -2
i=2     -∞     -5/3   -∞     -1     -1
i=3     -∞     -∞     -2     -∞     -2
i=4     ∞-∞    ∞-∞    ∞-∞    ∞      ∞
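The MCM procedure can be sketched as follows. The edge weights of Gd are taken from the L(1) matrix of the LPM example, which is an assumption about what the lost figure showed; it is consistent with the surviving weights 0, 4, 5 and reproduces the table. The function name, the 0-based node numbering, and the handling of ∞ entries (rows whose f(d) entry is ∞ are skipped, since they cannot supply the minimum) are choices made for this illustration.

```python
import math

def iteration_bound_mcm(weights, d, s=0):
    """Iteration bound via minimum cycle mean of the negated graph Gd'.

    weights: {(i, j): w} edge weights of Gd, nodes numbered 0..d-1.
    s: reference node. Assumes at least one node has a finite f(d) entry.
    """
    neg = {edge: -w for edge, w in weights.items()}      # weights of Gd'
    f = [[math.inf] * d]
    f[0][s] = 0
    for m in range(1, d + 1):
        row = [math.inf] * d
        for (i, j), w in neg.items():
            row[j] = min(row[j], f[m - 1][i] + w)        # relax edge i -> j
        f.append(row)
    maxima = []
    for i in range(d):
        if f[d][i] == math.inf:
            continue     # row evaluates to inf or inf - inf; never the minimum
        maxima.append(max(
            (f[d][i] - f[m][i]) / (d - m) if f[m][i] != math.inf else -math.inf
            for m in range(d)))
    return -min(maxima)

# Gd assumed from L(1): node k here corresponds to delay d(k+1).
gd = {(0, 1): 0, (1, 0): 4, (1, 2): 0, (2, 0): 5, (2, 3): 0, (3, 0): 5}
t_inf = iteration_bound_mcm(gd, d=4)
print(t_inf)
```

With reference node s = 1 (index 0 in the code), the per-node maxima are -2, -1, and -2, matching the last column of the table, so the bound is 2, the same value the LPM algorithm gives.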