Approaches To Design & Implement High Speed-Low Power Digital Filter: Review
Dr.K.B.Khanchandani
H.O.D,Department of electronics and telecommunication,
S.S G.M College of Engineering, Shegaon (M.S.), INDIA
T.P.Marode
M.E. Student
S.S G.M College of Engineering, Shegaon (M.S.), INDIA
http://www.ijccr.com
Abstract
Digital filters are important elements of Digital Signal Processing (DSP) systems. They can be
realized by various structures such as Direct Form-I, Direct Form-II, Cascade, Parallel and
Transposed structures. These structures give the designer room to select the one that best
reduces power consumption and improves speed, both of which are significantly important for
high-performance DSP applications. The major factors influencing the choice of a specific
realization are computational complexity, memory requirements and finite word length effects.
The techniques used to achieve low power consumption in VLSI-DSP applications span a wide
range, from the algorithm and architecture levels to the logic, circuit and device levels.
Pipelining and parallel processing can reduce power consumption by allowing the supply voltage
to be reduced. Power consumption can also be reduced by reducing effective capacitance, which
can be achieved by reducing the number of gates or by algorithmic strength reduction, where the
number of operations in an algorithm is reduced. Power can further be reduced by reducing
memory accesses. This paper reviews several techniques and approaches used by previous
authors for designing and implementing low-power, high-speed digital filters.
Keywords
I. Introduction
Electronic technology is developing at a tremendous speed. Digital Signal Processing (DSP) is
now used in numerous applications such as video compression, digital set-top boxes, cable
modems, digital versatile disks, portable video systems/computers, digital audio, multimedia and
wireless communications, digital radio, digital still and network cameras, speech processing,
transmission systems, radar imaging, acoustic beamformers, global positioning systems, and
biomedical signal processing. The field of DSP has always been driven by advances in DSP
applications and in scaled very-large-scale-integrated (VLSI) technologies. Therefore, at any
given time, DSP applications impose several challenges on the implementation of DSP systems.
These implementations must satisfy the sampling rate constraints enforced by real-time DSP
applications while requiring less space and power.
DSP computation differs from general-purpose computation in that DSP programs are
nonterminating: the same program is executed repetitively on an infinite time series. This
nonterminating nature can be exploited to design more efficient DSP systems by exploiting the
dependency of tasks both within an iteration and among multiple iterations. Furthermore, long
critical paths in DSP algorithms limit the performance of DSP systems, so these algorithms need
to be transformed for high-speed or low-power implementation. The techniques used to achieve
low power consumption in VLSI-DSP applications span a wide range, from the algorithm and
architecture levels to the logic, circuit and device levels [1-2]. Digital filters are essential
elements of DSP systems. They are classified into two categories: Finite Impulse Response (FIR)
filters and Infinite Impulse Response (IIR) filters. Although FIR filters offer linear phase, low
coefficient sensitivity and stability compared to IIR filters, they generally consume more power.
Strength reduction transformations are applied to reduce the number of multiplications in FIR
digital filters.
Consistent efforts have been made over the last few decades to reduce power consumption,
which can be achieved by a combination of several techniques. Pipelining and parallel
processing can reduce power consumption by allowing the supply voltage to be reduced. Power
consumption can also be reduced by reducing effective capacitance, which can be achieved by
reducing the number of gates or by algorithmic strength reduction, where the number of
operations in an algorithm is reduced. Power can further be reduced by reducing memory
accesses. The single most effective means of reducing power consumption is clock gating, where
all functional units that need not compute any useful outputs are switched off by using gated
clocks. The use of multiple supply voltages and a simultaneous reduction of threshold and supply
voltages are also effective in reducing power consumption. Most power-reduction approaches
apply to dedicated, programmable or Field Programmable Gate Array (FPGA) systems in a dual
manner [1-2]. The speed and power consumption of digital filters can be optimized by using
dedicated operators instead of general ones whenever possible. The Multiply and Accumulate
(MAC) unit is an important unit of DSP systems: it largely determines their power consumption
and speed of operation.
Pipelining leads to a reduction in the critical path, which can be exploited either to increase
the clock speed or sample speed, or to reduce power consumption at the same speed. The critical
path is defined as the path with the longest computation time among all paths that contain zero
delays, and its computation time is the lower bound on the clock period of the circuit. In
parallel processing, multiple outputs are computed in parallel in a clock period, so the effective
sampling speed is increased by the level of parallelism. Like pipelining, parallel processing can
also be used to reduce power consumption. Pipelining reduces the effective critical path by
introducing pipelining latches along the datapath. The power consumption of the pipelined filter
is given by $P_{pip} = \beta^2 C_{total} V_0^2 f$, where $C_{total}$ is the total capacitance of
the circuit, $V_0$ is the supply voltage, $f$ is the clock frequency, and $\beta$ ($0 < \beta < 1$)
is the factor by which the supply voltage is scaled, so that power is reduced by $\beta^2$.
Parallel processing can likewise reduce the power consumption of a system by allowing the supply
voltage to be reduced. An L-parallel system has total capacitance $L C_{total}$ and is clocked at
$f/L$, so its power consumption can be computed as
$P_{par} = (L C_{total})(\beta V_0)^2 (f/L) = \beta^2 C_{total} V_0^2 f$. Since the same sample
rate is maintained, the clock period is increased to $L T_{seq}$, where $T_{seq} = 1/f$. This
means that $C_{charge}$, the capacitance charged along the critical path, has a time $L T_{seq}$
in which to charge, and the supply voltage can be reduced to $\beta V_0$. The supply voltage
cannot be lowered indefinitely by applying more and more levels of pipelining and parallelism [3].
Cascade and parallel structures for digital filters are developed on the same concept.
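The relationship between the number of pipelining or parallelism levels and the supply-voltage scaling factor β can be sketched numerically. The example below is a minimal illustration, not a method taken from the reviewed papers. It assumes a first-order CMOS delay model in which the propagation delay is proportional to C_charge·V/(V − Vt)², where Vt is a threshold voltage. Under that assumption, keeping the sample rate unchanged with M levels requires M(βV0 − Vt)² = β(V0 − Vt)². The function names and the numeric values (V0 = 5 V, Vt = 0.6 V) are hypothetical.

```python
import math


def voltage_scaling_factor(levels, v0, vt):
    """Solve levels*(beta*v0 - vt)**2 = beta*(v0 - vt)**2 for beta.

    Assumes the first-order delay model T ~ C*V / (V - vt)**2 (an assumption,
    not stated in the text). Returns the root with beta*v0 > vt and beta <= 1.
    """
    # Expand the condition into a*beta^2 + b*beta + c = 0.
    a = levels * v0 ** 2
    b = -(2 * levels * v0 * vt + (v0 - vt) ** 2)
    c = levels * vt ** 2
    root = math.sqrt(b * b - 4 * a * c)
    candidates = [(-b + root) / (2 * a), (-b - root) / (2 * a)]
    # Keep only the physically meaningful root (scaled supply above threshold).
    valid = [beta for beta in candidates if beta * v0 > vt and beta <= 1 + 1e-9]
    return min(max(valid), 1.0)


def scaled_power(c_total, v0, f, beta):
    """Power of the pipelined or parallel system: beta^2 * C_total * V0^2 * f."""
    return beta ** 2 * c_total * v0 ** 2 * f


beta = voltage_scaling_factor(levels=3, v0=5.0, vt=0.6)
ratio = scaled_power(1.0, 5.0, 1.0, beta) / scaled_power(1.0, 5.0, 1.0, 1.0)
print(f"beta = {beta:.4f}, power ratio = {ratio:.4f}")
```

With these illustrative values, three levels give β near 0.47, so power falls to a little over a fifth of the original. Because βV0 must stay above Vt, adding more levels gives diminishing returns, which is consistent with the remark that the supply voltage cannot be lowered indefinitely.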
B. Unfolding
Unfolding is a transformation technique that can be applied to a DSP program to create a new
program describing more than one iteration of the original program. More specifically, unfolding
a DSP program by the unfolding factor J creates a new program that describes J consecutive
iterations of the original program. Unfolding allows the DSP program to be implemented with an
iteration period equal to the iteration bound [B1]. If the basic iteration is unfolded J times, the
number of simultaneously processed samples increases linearly while the critical path is not
altered, so the throughput increases by a factor of J. An arbitrarily fast implementation can
therefore be achieved by choosing an appropriate unfolding factor [4]. Unfolding can also be
applied to generate word-parallel architectures that can be used for high-speed and low-power
applications.
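A minimal sketch of how unfolding acts on a data-flow graph may help here. It uses the standard unfolding rule from the DSP literature, which the text above does not spell out: an edge U → V carrying w delays becomes J edges U_i → V_((i + w) mod J), each carrying floor((i + w) / J) delays. The edge-list representation and the function name `unfold` are assumptions made for illustration.

```python
def unfold(edges, J):
    """Unfold a data-flow graph by factor J.

    edges: list of (source, destination, delays) tuples.
    Returns edges between (node, copy_index) pairs of the unfolded graph.
    """
    unfolded = []
    for u, v, w in edges:
        for i in range(J):
            # Copy i of u feeds copy (i + w) % J of v through (i + w) // J delays.
            unfolded.append(((u, i), (v, (i + w) % J), (i + w) // J))
    return unfolded


# Hypothetical two-node loop: A -> B with no delay, B -> A with three delays.
loop = [("A", "B", 0), ("B", "A", 3)]
for edge in unfold(loop, 2):
    print(edge)
```

Each original edge yields J edges whose delays sum to the original delay count, so the unfolded program describes J consecutive iterations without adding or removing delays.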
For applications demanding low-power, high-speed digital filters, the various approaches
developed so far to reduce the number of multiplications and additions are discussed below.
Strength reduction at the algorithmic level can be used to reduce the number of additions and
multiplications. Multiplication by a constant is common in digital signal processing. The first
solution proposed to optimize multiplication by a constant was constant recoding, such as
Booth’s recoding. This solution merely avoids long strings of consecutive ones in the binary
representation of the constant. Better solutions are based on the factorization of common
subexpressions, simulated annealing, tree exploration, pattern search methods, etc. In 2001,
Lefèvre proposed a new algorithm to efficiently multiply a variable integer by a given set of
integer constants, which was later modified by Nicolas Boullis and Arnaud Tisserand. The
modified algorithm reduces the total number of additions/subtractions by up to 40 percent [7].
Strength reduction leads to a reduction in hardware complexity by exploiting substructure
sharing.
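To make the idea of constant recoding concrete, the sketch below recodes a positive integer constant into signed digits and multiplies by it using only shifts and additions/subtractions. It uses the canonical signed digit (CSD) form, a standard signed-digit recoding. It is not Booth's exact algorithm and not the algorithm of Lefèvre or of Boullis and Tisserand. The function names are hypothetical.

```python
def csd_digits(n):
    """Return the CSD digits of a positive integer n, least significant first.

    Each digit is -1, 0 or +1, and no two adjacent digits are both nonzero.
    """
    digits = []
    while n != 0:
        if n & 1:
            # +1 when n mod 4 == 1, -1 when n mod 4 == 3 (breaks runs of ones).
            d = 2 - (n & 3)
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def multiply_by_constant(x, digits):
    """Multiply integer x by the recoded constant using shifts and adds/subtracts."""
    acc = 0
    for shift, d in enumerate(digits):
        if d:
            acc += d * (x << shift)
    return acc


digits = csd_digits(7)  # binary 111 has three ones; CSD needs two nonzero digits
print(digits, multiply_by_constant(5, digits))
```

For the constant 7, the plain binary form needs two additions (x + 2x + 4x), while the recoded form 8x − x needs a single subtraction. This is how recoding avoids long strings of consecutive ones.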
subexpressions and produce better results than those produced by methods that can only find
common subexpressions involving only a single variable at a time. However, the algorithm
suffers from two major drawbacks. The first is that it is unable to detect common
subexpressions that have their signs reversed. This is a serious disadvantage, since a number
of such opportunities are missed when signed digit representations like canonical signed digit
(CSD) are used. The second is that the problem of finding the best rectangle in the Kernel
Intersection Matrix (KIM) is exponential in the number of rows/columns of the matrix.
Therefore heuristic algorithms such as the ping-pong algorithm have to be used to find the best
common subexpression in each iteration [9]. Multi-level logic synthesis techniques use a faster
algorithm called Fast Extract (FX) for quick Boolean decomposition and factoring. This
technique is much faster than the rectangle covering methods and produces results close to
those of the most expensive rectangle covering routines. This method is called the 2-term CSE
method. Its better results can be attributed to the fact that it can detect common
subexpressions with reversed signs [10-11].
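The benefit of detecting sign-reversed subexpressions can be illustrated with a simplified candidate-detection step. This is only a sketch under stated assumptions, not the FX algorithm or the method of [9]. Each linear expression is assumed to be a list of (sign, variable, shift) terms, and a pair of terms is normalised by its relative shift and relative sign, so that a pair and its negation map to the same key. All names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_key(t1, t2):
    """Normalise a pair of (sign, variable, shift) terms.

    The key keeps the two variables, their shift difference and the product of
    their signs, so x + (x << 2) and -(x << 1) - (x << 3) share one key.
    """
    a, b = sorted((t1, t2), key=lambda t: (t[2], t[1]))
    return (a[1], b[1], b[2] - a[2], a[0] * b[0])


def most_common_pair(expressions):
    """Return the most frequent 2-term pattern and its count across expressions."""
    counts = Counter()
    for terms in expressions:
        counts.update(pair_key(a, b) for a, b in combinations(terms, 2))
    return counts.most_common(1)[0]


# Hypothetical expressions: y1 = x1 + (x1 << 2) + x2, y2 = -(x1 << 1) - (x1 << 3) + x2
y1 = [(+1, "x1", 0), (+1, "x1", 2), (+1, "x2", 0)]
y2 = [(-1, "x1", 1), (-1, "x1", 3), (+1, "x2", 0)]
print(most_common_pair([y1, y2]))
```

Here the pattern x1 + (x1 << 2) appears in y1 directly and in y2 shifted and negated. A method that ignores reversed signs would count it once and miss the sharing opportunity.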
enabling technique, the delay of each stage is obtained. Every block gets enabled only after the
expected delay. For the entire duration until the inputs are available, the successive blocks are
disabled, thus saving power [12].
In the paper titled “Low-Complexity Constant Coefficient Matrix Multiplication Using a Minimum
Spanning Tree Approach”, Oscar Gustafsson, Henrik Ohlsson, and Lars Wanhammar proposed
a difference-based algorithm for low-complexity constant coefficient matrix multiplication. It
uses a minimum spanning tree (MST) to select the coefficients, which keeps the execution time
low because an MST can be found in polynomial time [17-18].
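A rough sketch of the difference-based idea follows. It is a simplified scalar illustration, not the authors' matrix algorithm. Coefficients are treated as graph vertices, and the cost of an edge is assumed to be the number of nonzero binary digits in the difference between two coefficients. The tree is built with Prim's algorithm. The cost function and the function names are assumptions made for this example.

```python
def nonzero_digits(n):
    """Number of nonzero binary digits in |n| (an assumed adder-cost proxy)."""
    return bin(abs(n)).count("1")


def difference_mst(coeffs):
    """Build a minimum spanning tree over coefficients with Prim's algorithm.

    Returns a list of (from_index, to_index, cost) edges, where cost is the
    assumed cost of deriving coeffs[to_index] from coeffs[from_index].
    """
    in_tree = {0}
    edges = []
    while len(in_tree) < len(coeffs):
        # Pick the cheapest edge leaving the current tree.
        cost, i, j = min(
            (nonzero_digits(coeffs[j] - coeffs[i]), i, j)
            for i in in_tree
            for j in range(len(coeffs))
            if j not in in_tree
        )
        in_tree.add(j)
        edges.append((i, j, cost))
    return edges


tree = difference_mst([7, 8, 23, 24])
print(tree, sum(cost for _, _, cost in tree))
```

For these hypothetical coefficients, every tree edge corresponds to a difference that is a single power of two, so each further coefficient is reached from one already in the tree with a cheap shift-and-add step.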
In general, the optimization techniques usually used for multiplierless filter design are complex,
can require long run times, and provide no performance guarantees (Koter et al., 2003). Gordana
Jovanovic Dolecek and Sanjit K. Mitra, in their paper titled “Computationally Efficient
Multiplier-Free Fir Filter Design”, proposed a simple, efficient method for the design of
multiplier-free FIR filters without optimization. The method rounds to the nearest integer the
coefficients of the equiripple filter that satisfies the desired specification. Since integer
coefficient multiplications can be accomplished with only shift-and-add operations, the rounded
impulse response filter is multiplier-free. The complexity of the rounded filter (the number of
sums and the number of integer multiplications) depends on the choice of the rounding constant.
Higher values of the rounding constant lead to lower complexity of the rounded filter but also to
more distortion in the desired gain response. In the next step, the sharpening technique is used
to improve the magnitude characteristic and satisfy the specification. In that way the overall
filter is based on combining one simple filter with integer coefficients [19].
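The rounding step can be sketched as follows. This is a minimal illustration that assumes the rounded coefficients take the form r·round(h(n)/r) for a rounding constant r. The impulse response values, the adder-count proxy and the function names are hypothetical, and the sharpening step is not shown.

```python
def round_coefficients(h, r):
    """Return the integer coefficients round(h[n] / r) for rounding constant r.

    The rounded filter is r times these integers (assumed form of the rounding).
    """
    return [round(c / r) for c in h]


def shift_add_cost(int_coeffs):
    """Rough adder count: nonzero binary digits beyond the first per coefficient."""
    return sum(max(bin(abs(c)).count("1") - 1, 0) for c in int_coeffs)


h = [0.02, 0.26, 0.5, 0.26, 0.02]  # hypothetical symmetric impulse response
fine = round_coefficients(h, 2 ** -6)    # small rounding constant
coarse = round_coefficients(h, 2 ** -2)  # large rounding constant
print(fine, shift_add_cost(fine))
print(coarse, shift_add_cost(coarse))
```

The larger rounding constant yields smaller integers that need fewer shift-and-add operations, but it also zeroes the outer taps. That loss of detail in the impulse response is the source of the gain-response distortion that sharpening is then used to correct.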
V. Conclusion
Low-power, high-speed techniques for digital filter implementation are reviewed in this paper.
Cascade and parallel structures may be used for low-power, high-speed digital filters. The speed
of a digital filter can be improved by unfolding the iterations of the DSP program. A reduction
of up to 40 percent in the total number of additions/subtractions is obtained by using the
modification of Lefèvre’s approach for optimization of hardware multiplication by constant
matrices. The ping-pong algorithm may be the best option for finding the best common
subexpression in each iteration. Designing the MAC unit using the block enabling technique
results in low power consumption. Symbolic mathematics is an extremely powerful tool for
technical computing that reduces computation, thereby improving speed and optimizing power
consumption. The method for designing multiplier-free FIR filters without optimization has been
shown to have lower complexity, which results in a high-speed digital filter.
REFERENCES
[1] Keshab K. Parhi, “Approaches to Low-Power Implementations of DSP Systems,” IEEE
Transactions on Circuits and Systems I: Fundamental Theory and Applications, Vol. 48,
No. 10, October 2001, pp. 1214-1224.
[2] Shanthala S and S. Y. Kulkarni, “High Speed and Low Power FPGA Implementation of FIR
Filter for DSP Applications,” European Journal of Scientific Research, ISSN 1450-216X,
Vol. 31, No. 1 (2009), pp. 19-28.
[4] Miodrag Potkonjak and Jan Rabaey, “Maximally Fast and Arbitrarily Fast Implementation of
Linear Computations,” 1992 IEEE proceedings.
[5] Mandeep Kaur and Vikas Sharma, “Analysis of Various Algorithms for Low Power
Consumption in Embedded System using Different Architecture,” International Journal of
Electronics Engineering, 2(1), 2010, pp. 213-217.
[7] Nicolas Boullis and Arnaud Tisserand, “Some Optimizations of Hardware Multiplication
by Constant Matrices,” IEEE Transactions on Computers, Vol. 54, No. 10, October 2005.
[9] Anup Hosangadi, Farzan Fallah, and Ryan Kastner, “Common Subexpression Elimination
Involving Multiple Variables for Linear DSP Synthesis,” 15th IEEE International Conference
on Application-Specific Systems, Architectures and Processors (ASAP'04), 2004, pp. 202-212.
[11] J. Vasudevamurthy and J. Rajski, “A Method for Concurrent Decomposition and
Factorization of Boolean Expressions,” IEEE International Conference on Computer-Aided
Design (ICCAD-90), Digest of Technical Papers, 1990.
[12] Shanthala and Y. V. Kulkarni, “VLSI Design and Implementation of Low Power MAC
Unit with Block Enabling Technique,” European Journal of Scientific Research, (2009),
pp. 19-28.
[13] Miroslav D. Lutovac, Jelena D. Ćertić, and Ljiljana D. Milić, “Digital Filter Design Using
Computer Algebra Systems,” Circuits, Systems, and Signal Processing (2010) 29: 51-64.
[14] M. D. Lutovac and D. V. Tošić, “Symbolic Analysis and Design of Control Systems Using
Mathematica,” Int. Journal on Control 79(11), 1368-1381 (2006). Special Issue on
Symbolic Computing in Control.
[15] B. Lutovac and M. D. Lutovac, “Design and VHDL Description of Multiplierless Half-Band IIR
Filter,” Int. Journal on Electronic Communication AEÜ 56, 348-350 (2002).
[19] Gordana Jovanovic Dolecek and Sanjit K. Mitra, “Computationally Efficient Multiplier-
Free FIR Filter Design,” Computación y Sistemas, Vol. 10, No. 3, 2007, pp. 251-267, ISSN
1405-5546.