ASIC Design of A High Speed Low Power Circuit For Factorial Calculation Using Ancient Vedic Mathematics
Microelectronics Journal
journal homepage: www.elsevier.com/locate/mejo
ASIC design of a high speed low power circuit for factorial calculation using
ancient Vedic mathematics
P. Saha a, A. Banerjee b, A. Dandapat c, P. Bhattacharyya d,*
a School of VLSI Technology, Bengal Engineering and Science University, Shibpur, Howrah 711103, West Bengal, India
b Department of Electronics and Communication Engineering, JIS College of Engineering, Kalyani 741235, India
c Department of Electronics and Telecommunication Engineering, Jadavpur University, Kolkata 700032, India
d Department of Electronics and Telecommunication Engineering, Bengal Engineering and Science University, Shibpur, Howrah 711103, West Bengal, India
Article info
Article history:
Received 28 January 2011
Received in revised form 2 September 2011
Accepted 5 September 2011
Available online 29 September 2011
Abstract
The ASIC design of a high speed, low power circuit for calculating the factorial of a number is reported in this
paper. The factorial of a number can be calculated by iterative multiplication, using an incrementing or
decrementing process, and the iterative multiplication can be computed through a parallel implementation
methodology. Parallel implementation combined with a Vedic multiplication methodology for factorial
calculation ensures a significant reduction in propagation delay and switching power
consumption, owing to the reduced number of stages in the multiplication process, in comparison with the conventionally used Vedic multiplication methodologies such as Urdhva-tiryakbhyam (UT) and Nikhilam Navatascaramam Dasatah (NND) based implementations. Transistor level implementation was
carried out using Spice Spectre with standard 90 nm CMOS technology and the results were compared
with the above mentioned conventional methodologies. The propagation delay for calculating the
factorial of a 4-bit number was only 42.13 ns, while the power consumption was
58.82 mW for a layout area of 6 mm². The improvement in speed was found to be 33% and 24%,
with a corresponding reduction in power consumption of 34.48% and 24%, for the factorial
calculation circuitry in comparison with UT and NND based implementations, respectively.
© 2011 Elsevier Ltd. All rights reserved.
Keywords:
Vedic multiplier
Incrementer
Zero detectors
Decrementer
Factorial design
High speed.
1. Introduction
ASIC implementation of logarithmic, exponential, trigonometric and other arithmetic circuits plays a pivotal role in the
field of general and special purpose computers [1,2]. Generally,
such computations are implemented through software
programs using Newton-Raphson, Taylor-Maclaurin series, or
polynomial approximations. The factorial computation
circuitry is therefore of immense importance for ASIC implementation of
such series (Newton-Raphson, Taylor-Maclaurin series, or polynomial approximations).
The principal components required for hardware implementation
of factorial calculation circuitry are an incrementer/decrementer and
a multiplier for successive multiplication. The successive
multiplication and the incrementer/decrementer therefore limit the overall speed
of the factorial implementation technique. A substantial amount of
work has so far been reported on multipliers [3-10], such as shift and
From Eq. (4), the general expression for the product term after the nth
iteration is P_n.
Mathematically, P_I can be formulated as:
$$P_I = 2^k P_{I-1} + z P_{I-1} \pm I P_{I-1}$$
$$P_1 = 2^k (X + z) + z \cdot z \pm X$$
Assume I = 2.
Now consider again that Y is either incremented or decremented by
one, so Y is replaced by its new value:
$$P_2 = P_1 (X \pm 2)$$
$$P_2 = 2^k P_1 + z P_1 \pm 2 P_1$$
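As a quick numerical check of the recurrence, the sketch below evaluates P_I = 2^k P_{I-1} + z P_{I-1} - I P_{I-1} for the decrementing case and returns the final product. It assumes X is written as 2^k + z with 2^k the largest power of two not exceeding X; that decomposition and the function name `factorial_by_recurrence` are illustrative choices, not details confirmed by the text.

```python
def factorial_by_recurrence(x):
    """Compute x! with P_I = 2^k*P_{I-1} + z*P_{I-1} - I*P_{I-1} (decrementing case)."""
    if x < 0:
        raise ValueError("x must be non-negative")
    if x < 2:
        return 1
    # Assumed decomposition: x = 2^k + z, with 2^k the largest power of two <= x.
    k = x.bit_length() - 1
    z = x - (1 << k)
    p = x  # P_0 = X
    # Multiply successively by (x - 1), (x - 2), ..., 2.
    for i in range(1, x - 1):
        # (p << k) is 2^k * P_{I-1}; the whole expression equals P_{I-1} * (x - i).
        p = (p << k) + z * p - i * p
    return p


print(factorial_by_recurrence(5))
```

The shift replaces the multiplication by the base 2^k, so only the residue term z and the iteration index I require a true multiplier in each step.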
Fact(Num)
for each i from 0 to Num-1
    Arr1[i] = i + 1
end for
for each i from Num to 15
    Arr1[i] = 1
end for
for each j from 0 to i/2
    Arr2[j] = Arr1[2*j] * Arr1[2*j+1];
end for
for each k from 0 to j/2
    Arr3[k] = Arr2[2*k] * Arr2[2*k+1];
end for
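The pairwise scheme in the pseudocode can be sketched in software as a reduction tree. The version below is a minimal illustration, assuming a fixed array width of 16 (as in the loop bound above) and repeating the pairing until one product remains; the name `parallel_factorial` and the `width` parameter are hypothetical.

```python
def parallel_factorial(num, width=16):
    """Factorial via pairwise (tree) multiplication over a fixed-width array."""
    if width < 1 or width & (width - 1):
        raise ValueError("width must be a power of two")
    if not 0 <= num <= width:
        raise ValueError("num must lie between 0 and width")
    # Arr1: the operands 1..num, padded with 1 so that every slot is filled.
    level = [i + 1 if i < num else 1 for i in range(width)]
    # Each pass multiplies adjacent pairs; in hardware these products run in parallel.
    while len(level) > 1:
        level = [level[2 * j] * level[2 * j + 1] for j in range(len(level) // 2)]
    return level[0]


print(parallel_factorial(5))
```

With 16 operands the tree has four levels, so the number of sequential multiplication stages grows with the logarithm of the array width rather than with the operand count.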
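$$A = \sum_{i=0}^{n-1} A_i 10^i \quad \text{and} \quad B = \sum_{i=0}^{n-1} B_i 10^i \tag{10}$$
[Eqs. (11)-(13): only the equation numbers survive in the extracted text.]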
Eq. (13) can be derived for both numbers, whether they are
greater than the base or less than the base:
$$P = \begin{cases} 10^n (A + B - 10^n) + (A - 10^n)(B - 10^n), & \text{if } A, B > 10^n \\ 10^n (A - \bar{B}) + \bar{A}\bar{B}, & \text{if } A, B < 10^n \end{cases} \tag{14}$$
where n is any positive integer and $\bar{A}$ and $\bar{B}$ are the 10^n's
complements of A and B. The mathematical expression of the Nikhilam
Navatascaramam Dasatah sutra for the binary number system is
given hereunder:
Consider two n-bit numbers X and Y, where k is the exponent and z1, z2 are
the residues of X and Y, respectively. Mathematically, X and Y can be
represented as:
$$X = 2^k \pm z_1, \quad Y = 2^k \pm z_2$$
The product of X and Y is denoted by P and can be
represented as:
$$P = X \cdot Y = (2^k \pm z_1)(2^k \pm z_2) \tag{15}$$
For fast multiplication using the extended rule of the sutra, the
bases of the multiplicand and the multiplier are assumed to be the same; thus
Eq. (15) can be rewritten as
$$P = XY = 2^k (X \pm z_2) \pm z_1 z_2 \tag{16}$$
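Eq. (16) can be illustrated with a short sketch that treats the residues as signed quantities, so that the ± signs are absorbed into z1 and z2. The base is taken here as the smallest power of two above both operands, which is an assumption made for the example; the name `nikhilam_multiply` is hypothetical.

```python
def nikhilam_multiply(x, y):
    """Multiply two non-negative integers as P = 2^k * (X + z2) + z1 * z2."""
    if x < 0 or y < 0:
        raise ValueError("operands must be non-negative")
    # Assumed common base: the smallest power of two strictly above both operands.
    k = max(x, y).bit_length()
    base = 1 << k
    # Signed residues; both are negative here because the operands lie below the base.
    z1 = x - base
    z2 = y - base
    # The multiplication by 2^k is a left shift; only z1 * z2 needs a multiplier.
    return ((x + z2) << k) + z1 * z2


print(nikhilam_multiply(13, 14))
```

For 13 x 14 the base is 16 and the residues are -3 and -2, so the only real multiplication is 3 x 2; the remaining work is one addition and one shift.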
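$$X = \sum_{i=0}^{N-1} x_i 2^i \tag{17}$$
and
$$Y = \sum_{j=0}^{N-1} y_j 2^j \tag{18}$$
$$P = \sum_{i=0}^{N-1} x_i 2^i \cdot \sum_{j=0}^{N-1} y_j 2^j \tag{19}$$
$$P = \sum_{i} \sum_{j} x_i y_j 2^{i+j} \tag{20}$$
Let k = i + j:
$$P = \sum_{k=0}^{2N-1} \sum_{i=0}^{N-1} x_i y_{k-i} 2^k \tag{21}$$
$$P = \sum_{k=0}^{2N-1} p_k 2^k \tag{22}$$
where
$$p_k = \sum_{i=0}^{N-1} x_i y_{k-i}$$
Fig. 3. Implementation of multiplication using NND sutra.
4.2. Comparator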
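[Eqs. (23), (24) and (29): only the equation numbers survive in the extracted text. The operators in the fragments below were lost in extraction.]
Y3 buffered x3 (25)
Y2 buffered x2 (26)
Y1 buffered x1 (27)
Y0 Ctrl x0 (28)
1 ≤ j ≤ n-1, j = 0 (30)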
Fig. 4. Circuitry for checking zero value at the input bit stream.
The conventional adder/subtractor block has been implemented [27] to perform addition as well as subtraction in a single
block, and its performance parameters have been checked
using standard 90 nm CMOS technology. A control
signal (addsub) selects between addition and subtraction: for addition the addsub signal is held low, and for
subtraction it is held high. The circuit level diagram of the
adder/subtractor is shown in Fig. 7.
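The behaviour of such a block can be sketched at the bit-vector level. The example below assumes the usual two's complement arrangement, in which the addsub signal inverts the second operand and also serves as the carry-in; the text does not confirm that the circuit of Fig. 7 is built this way, and the name `add_sub` and the `width` parameter are hypothetical.

```python
def add_sub(a, b, addsub, width=4):
    """Return (result, carry_out) of a + b when addsub is 0, or a - b when addsub is 1."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    # Assumed structure: every bit of b is XORed with the control signal...
    b_eff = b ^ (mask if addsub else 0)
    # ...and the control signal also drives the carry-in, giving a + ~b + 1 = a - b.
    total = a + b_eff + (1 if addsub else 0)
    return total & mask, (total >> width) & 1


print(add_sub(5, 3, 0), add_sub(5, 3, 1))
```

In subtract mode a carry-out of 1 indicates that no borrow occurred, while a carry-out of 0 means the result is negative and is given in two's complement form.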
Table 1
Combination of shifting operation.
A7 A6 A5 A4 A3 A2 A1 A0    When S2S1S0 = 000
A6 A5 A4 A3 A2 A1 A0 0     When S2S1S0 = 001
A5 A4 A3 A2 A1 A0 0  0     When S2S1S0 = 010
A4 A3 A2 A1 A0 0  0  0     When S2S1S0 = 011
A3 A2 A1 A0 0  0  0  0     When S2S1S0 = 100
A2 A1 A0 0  0  0  0  0     When S2S1S0 = 101
A1 A0 0  0  0  0  0  0     When S2S1S0 = 110
A0 0  0  0  0  0  0  0     When S2S1S0 = 111
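The mapping in Table 1 is a left shift of the 8-bit word by the amount encoded in S2S1S0, with zeros filled in from the right. The sketch below models it as three cascaded stages that shift by 1, 2 and 4 positions; this staged, barrel-shifter style structure is an assumption made for illustration, and the name `shift_left_select` is hypothetical.

```python
def shift_left_select(a, s2, s1, s0):
    """Shift the 8-bit word a left by the amount S2S1S0, filling with zeros."""
    a &= 0xFF
    # Each select bit enables one stage; the mask drops bits shifted past A7.
    if s0:
        a = (a << 1) & 0xFF
    if s1:
        a = (a << 2) & 0xFF
    if s2:
        a = (a << 4) & 0xFF
    return a


# S2S1S0 = 011 shifts by three positions: A4 A3 A2 A1 A0 0 0 0.
print(format(shift_left_select(0b10110011, 0, 1, 1), "08b"))
```

Because the stage shifts are powers of two, any shift from 0 to 7 is obtained by enabling the stages that match the binary value of the select lines.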
Table 2
Performance parameters (propagation delay, average dynamic power
consumption and energy delay product) of the individual
components: zero detector, incrementer/decrementer, comparator, adder/
subtractor and REU.
Circuit module              Delay (ps)   Power (µW)   EDP (10^-27 J s)
Zero detector               120          1.02         14.16
Incrementer/decrementer     180          3.14         101.74
Comparator                  148          2.15         47.09
Adder/subtractor (4-bit)    140          0.856        16.3
REU                         376          0.678        95.85
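The energy delay product column can be cross-checked against the other two. The sketch below assumes that the energy delay product is power multiplied by the square of the delay and that the power figures are in microwatts, which is what the stated 10^-27 J s scale implies; the function name `energy_delay_product` is hypothetical.

```python
def energy_delay_product(delay_ps, power_uw):
    """Return power * delay^2, expressed in units of 1e-27 J s."""
    delay_s = delay_ps * 1e-12   # picoseconds to seconds
    power_w = power_uw * 1e-6    # microwatts to watts (assumed unit)
    return power_w * delay_s ** 2 / 1e-27


# (delay in ps, power in uW) as listed in Table 2.
modules = {
    "Zero detector": (120, 1.02),
    "Incrementer/decrementer": (180, 3.14),
    "Comparator": (148, 2.15),
    "Adder/subtractor (4-bit)": (140, 0.856),
    "REU": (376, 0.678),
}
for name, (delay_ps, power_uw) in modules.items():
    print(f"{name}: {energy_delay_product(delay_ps, power_uw):.2f}")
```

Under these assumptions the incrementer/decrementer, comparator and REU rows reproduce the tabulated values to two decimal places, while the zero detector and adder/subtractor rows come out a few percent higher (about 14.7 and 16.8).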
[Figure: two plots comparing the multipliers of [20], [26] and the proposed design for 4x4, 8x8, 16x16 and 32x32 operands; one vertical axis is labelled "Power (uW)".]
Fig. 13. Comparison of different types of Vedic multipliers (VM), implemented in the same environment, in terms of
propagation delay (ns) and dynamic switching power (µW) as a function of the number of
input bits.
34.48% and 24% for the factorial calculation circuitry in comparison with UT and NND based implementations, respectively. Fig. 15
shows the layout of the proposed factorial circuitry for a 4-bit
number using the parallel Vedic multiplication methodology, which occupies a layout
area of only 6 mm². It can be inferred from the above discussion
that the Vedic multiplier is the most critical element in improving
the speed of the circuit that computes the factorial of a number.
[Figure: two plots comparing [20], [26] and the proposed design for 3-bit, 4-bit and 5-bit inputs; the axis labels and caption are not present in the extracted text.]
Fig. 15. Layout of the factorial design circuit using the parallel Vedic multiplication
methodology. The layout occupies only 6 mm² of area and was implemented using L-Edit V-13 of the T-Spice simulator.
6. Conclusion
In this paper, based on ancient Vedic mathematics, we report
a novel circuit for computing the factorial of a 4-bit number
through a parallel implementation methodology. This novel architecture combines the advantages of ancient Vedic formulae and
parallel implementation techniques, leading to a significant
reduction in the number of stages and hence to high speed operation.
In the circuit realization, an (N × N)-bit multiplier implementation was
transformed into just one small multiplication (bit length ≪ N) and
one adder/subtractor implementation, which yields high speed operation
for factorial computation. The propagation delay for calculating
the factorial of a 4-bit number was only 42.13 ns, while the power
consumption was 58.82 mW for a layout area of
6 mm². The improvement in speed was found to be 33% and 24%,
with a corresponding reduction in power consumption of 34.48% and
24%, for the factorial calculation circuitry in comparison with UT
and NND based implementations, respectively. The speed improvement
of the factorial computation circuit is attributed largely to the incorporation of the Vedic multiplier.
References
[1] J.P. Deschamps, G.J.A. Bioul, G.D. Sutter, Synthesis of Arithmetic Circuits: FPGA,
ASIC and Embedded Systems, Wiley Interscience Publications, 2006, pp. 180-198.
[2] J.F. Hart, E.W. Cheney, C.L. Lawson, H.J. Maehly, C.K. Mesztenyi, J.R. Rice,
H.G. Thacher Jr., C. Witzgall, Computer Approximations, Wiley,
1968.
[3] M. M.-Dastjerdi, A. A.-Kusha, M. Pedram, BZ-FAD: a low-power low-area
multiplier based on shift-and-add architecture, IEEE Trans. Very Large Scale
Integr. (VLSI) Syst. 17 (2) (2009) 302-306.
[4] A.D. Booth, A signed binary multiplication technique, Q. J. Mech. Appl. Math.
IV (1952) 236-240.
[5] Y.-H. Seo, D.-W. Kim, A new VLSI architecture of parallel multiplier
accumulator based on radix-2 modified Booth algorithm, IEEE Trans. Very
Large Scale Integr. (VLSI) Syst. 18 (2) (2010) 201-208.
[6] J. Hu, L. Wang, T. Xu, A low-power adiabatic multiplier based on modified
Booth algorithm, in: Proceedings of the IEEE International Symposium on
Integrated Circuits, Singapore, September 2007, pp. 489-492.
[7] C.S. Wallace, A suggestion for a fast multiplier, IEEE Trans. Electron. Comput.
EC-13 (1) (1964) 14-17.
[8] M. Young, The Technical Writer's Handbook, University Science, Mill
Valley, CA, 1989.
[9] F. Carbognani, F. Buergin, N. Felber, H. Kaeslin, W. Fichtner, A 2.7-µW/
MHz transmission-gate-based 16-bit multiplier for digital hearing aids, in:
Proceedings of the IEEE 48th Midwest Symposium on Circuits and Systems,
Covington, KY, August 2005, pp. 1406-1409.
[10] Z. Wang, G.A. Jullien, W.C. Miller, A new design technique for column
compression multipliers, IEEE Trans. Comput. 44 (8) (1995) 962-970.
[11] K.-J. Cho, S. Jo, Y.-E. Kim, Y.-N. Xu, J.-G. Chung, Constant multiplier design
using specialized bit pattern adders, in: Proceedings of the IEEE Fifteenth
International Conference on Electronics, Circuits and Systems, St. Juliens,
August 2008, pp. 41-44.
[12] S.L. Chen, X.-Y. Tian, X.-J. Zhao, Improved multiplier of CSD used in digital
signal processing, in: Proceedings of the IEEE International Conference on
Machine Learning and Cybernetics, Kunming, July 2008, pp. 2905-2908.
[13] A. Avizienis, Signed-digit number representations for fast parallel arithmetic,
IRE Trans. Electron. Comput. EC-10 (1961) 389-400.
[14] M.R. Stan, A.F. Tenca, M.D. Ercegovac, Long and fast up/down counters, IEEE
Trans. Comput. 47 (7) (1998) 722-735.
[15] D.R. Lutz, D.N. Jayasimha, Programmable modulo-K counters, IEEE Trans.
Circuits Syst.: Fund. Theory Appl. 43 (11) (1996) 939-941.
[16] R. Hashemian, Highly parallel increment/decrement using CMOS technology,
in: Proceedings of the 33rd IEEE Midwest Symposium on Circuits and Systems,
Calgary, Alberta, Canada, August 1990, vol. 2, pp. 866-869.
[17] C.-H. Huang, J.-S. Wang, Y.-C. Huang, A high-speed CMOS incrementer/
decrementer, in: Proceedings of the IEEE International Symposium on Circuits
and Systems, Sydney, Australia, May 2001, vol. 4, pp. 88-91.
[18] S. Bi, W.J. Gross, W. Wang, A. Al-Khalili, M.N.S. Swamy, An area-reduced
scheme for modulo 2^n - 1 addition/subtraction, in: Proceedings of the IEEE
Ninth International Database Engineering and Application Symposium, July
2005, pp. 396-399.
[19] J.S.S.B.K.T. Maharaja, Vedic Mathematics, Motilal Banarsidass Publishers Pvt
Ltd, Delhi, 2001.
[20] P. Mehta, D. Gawali, Conventional versus Vedic mathematical method for
hardware implementation of a multiplier, in: Proceedings of the IEEE
International Conference on Advances in Computing, Control, and Telecommunication Technologies, Trivandrum, Kerala, December 2009, pp. 640-642.
[21] M. Ramalatha, K. Thanushkodi, K.D. Dayalan, P. Dharani, A novel time and
energy efficient cubing circuit using Vedic mathematics for finite field
arithmetic, in: Proceedings of the IEEE International Conference on Advances
in Recent Technologies in Communication and Computing, Kerala, October
2009, pp. 873-875.
[22] M. Ramalatha, K.D. Dayalan, P. Dharani, S.D. Priya, High speed energy
efficient ALU design using Vedic multiplication techniques, in: Proceedings
of the IEEE International Conference on Advances in Computational Tools for
Engineering Applications, Zouk Mosbeh, July 2009, pp. 600-603.
[23] S. Akhter, VHDL implementation of fast N × N multiplier based on Vedic
mathematic, in: Proceedings of the IEEE Eighteenth European Conference on
Circuit Theory and Design, Seville, August 2007, pp. 472-475.
[24] P. Mehta, D. Gawali, Conventional versus Vedic mathematical method for
hardware implementation of a multiplier, in: Proceedings of the IEEE
International Conference on Advances in Computing, Control, and Telecommunication, Trivandrum, Kerala, December 2009, pp. 640-642.
[25] H.D. Tiwari, G. Gankhuyag, C.M. Kim, Y.B. Cho, Multiplier design based on
ancient Indian Vedic Mathematics, in: Proceedings of the IEEE International
SoC Design Conference, Busan, November 2008, pp. 65-68.