Paper For Final Submission
Abstract—Multiplication is a basic arithmetic operation. Multiplication-heavy operations such as
Fast Fourier Transforms, multiplication and accumulation units, and convolution are some of the
computation-intensive arithmetic functions often encountered in Digital Signal Processing
applications. Logarithm based multipliers are commonly used in these cases, but they introduce
errors, which are reduced by various approximation methods. In this paper a simple architecture of
a 16x16 logarithm based multiplier is proposed which uses simple combinational and sequential
circuits to obtain an exact product. The multiplier has a variable, input-dependent execution
time, with a maximum of 15 clock cycles and a mean of 7.5 clock cycles. The architecture is
designed and simulated in the ModelSim simulation tool.

Keywords—Logarithmic Multiplier, Logarithmic Number Systems, Modified Iterative Block, Check
Block, Control Block, Exact Product.

I. INTRODUCTION

Logarithmic multiplication calculates the product of two operands by converting the operands into
the Logarithmic Number System. The procedure involves converting the operands into their
respective logarithms, adding the logarithms, and computing the anti-logarithm of the sum. This
procedure is simpler because addition replaces multiplication in Logarithmic Number Systems [7].
However, logarithms and anti-logarithms cannot be computed exactly, so their values must be
approximated, which introduces errors [5] [6] [8] [9] [11] [12]. One such method is the Mitchell's
Algorithm based multiplier [1], which approximates $\log_2(1 + x)$ as $x$, where $x$ represents
the mantissa of a number.

An iterative architecture similar to the Mitchell's Algorithm based multiplier was proposed by
Patricio Bulić and his team [4] [10]. It models the true product as the sum of an approximate
product and an error. The error is itself the product of two new operands, which can be fed into a
similar block; the resulting approximate product is added to the previous result so as to reduce
the overall error. The overall error is not reduced through direct approximation techniques, which
are the principal constituents of most logarithm based multipliers; instead, the architecture
provides an iterative solution to reduce the error. Examples of direct error approximation
techniques are segmentation and interpolation [6]. The proposed architecture follows an algorithm
similar to the iterative algorithm [4] and uses combinational and sequential logic circuits to
achieve an exact result.

The rest of the paper is organized as follows. Section II is divided into two parts: Part A
reviews Mitchell's algorithm [1] and Part B reviews the iterative multiplication algorithm [4].
Section III presents the proposed architecture, which uses a modified iterative block to arrive at
exact results, and explains the functionality of each block used in the architecture in detail.
Section IV provides simulation results. Section V draws conclusions and Section VI provides the
references.

II. PREVIOUS WORK

A. MITCHELL'S ALGORITHM

Any binary integer can be written as:

$N = 2^{k} \left[ 1 + \sum_{i=0}^{k-1} 2^{i-k} Z_i \right]$    (1)

where $k$ is the position of the most significant bit whose value is '1' and $Z_i$ is the value of
the bit in the $i$-th position.

The above equation can be rewritten as:

$N = 2^{k} [ 1 + X ]$    (2)

where $X$ is the mantissa part. Taking the logarithm gives:

$\log_2 N = k + \log_2(1 + X)$    (3)

Mitchell's algorithm approximates $\log_2(1 + X)$ with $X$.

So, for any two operands $N_1$ and $N_2$,

$N_1 = 2^{k_1} [ 1 + X_1 ]$    (4)

$N_2 = 2^{k_2} [ 1 + X_2 ]$    (5)
$\log_2 N_1 = k_1 + \log_2(1 + X_1)$    (6)

$\log_2 N_2 = k_2 + \log_2(1 + X_2)$    (7)

From Mitchell's approximation, $\log_2 N_1 \approx k_1 + X_1$ and $\log_2 N_2 \approx k_2 + X_2$, so

$\log_2(N_1 \cdot N_2) \approx k_1 + X_1 + k_2 + X_2$    (8)

The approximation never overestimates the logarithm, as $\log_2(1 + X)$ is always greater than or
equal to $X$ for $0 \le X < 1$, and the error ranges from 0 to 11% [2]. Various techniques were
proposed to reduce this error, among them the Operand Decomposition method [5], look-up tables,
and Segmentation and interpolation methods [6]. Each method has its own tradeoffs between
architecture complexity, [...]

B. ITERATIVE ALGORITHM

Mathematics Involved:

$N_1 \cdot N_2 = 2^{k_1+k_2} + X_1 \cdot 2^{k_1+k_2} + X_2 \cdot 2^{k_1+k_2} + [ X_1 \cdot X_2 ] \cdot 2^{k_1+k_2}$    (11)

From equation (2), we can write $X \cdot 2^{k} = (N - 2^{k})$. Therefore, the approximate product
$P_{appx}$ (the first three terms of (11)) and the error $E$ (the last term) are

$P_{appx} = 2^{k_1+k_2} + (N_1 - 2^{k_1}) \cdot 2^{k_2} + (N_2 - 2^{k_2}) \cdot 2^{k_1}$    (14)

$E = (N_1 - 2^{k_1}) \cdot (N_2 - 2^{k_2})$    (15)

The error is again in the form of a product of two operands, $N_1^{*} \cdot N_2^{*}$, where
$N_1^{*} = (N_1 - 2^{k_1})$ and $N_2^{*} = (N_2 - 2^{k_2})$ are the error operands. The same
arithmetic can be applied to them, and adding the results gives a more accurate value of the
product. On employing the same procedure repeatedly, the accurate product can be achieved at some
point. The block diagram below shows the architecture of the iterative block [3]. A 16x16 bit
iterative block uses Leading One Detectors (16 bit), Priority Encoders (16 x 4), Barrel Shifters
(32 bit), Ripple Carry Adders (4 bit and 32 bit), Decoders (5 x 32) and XOR banks (16 bit).

[Figure: block diagram of the iterative block. Recoverable labels: N1, N2, LOD, LOD, ADDER, ADDER]

1. The Leading One Detectors take the operands $N_1$ and $N_2$ as inputs and give $2^{k_1}$ and
   $2^{k_2}$ as outputs.
2. With $2^{k_1}$ and $2^{k_2}$ as inputs, the priority encoders compute the values of $k_1$ and
   $k_2$.
3. $(N_1 - 2^{k_1})$ and $(N_2 - 2^{k_2})$ are the outputs of the XOR banks, whose inputs are the
   operands and the outputs of the Leading One Detectors.
4. The barrel shifters compute the values of $(N_1 - 2^{k_1}) \cdot 2^{k_2}$ and
   $(N_2 - 2^{k_2}) \cdot 2^{k_1}$.
5. The results of the two barrel shifters are added to obtain the sum
   $(N_1 - 2^{k_1}) \cdot 2^{k_2} + (N_2 - 2^{k_2}) \cdot 2^{k_1}$.
6. The values of $k_1$ and $k_2$ obtained in step 2 are added and the result is given as an input
   to a decoder, which gives the value of $2^{k_1+k_2}$ as output.
7. The results obtained in step 5 and step 6 are added to give
   $2^{k_1+k_2} + (N_1 - 2^{k_1}) \cdot 2^{k_2} + (N_2 - 2^{k_2}) \cdot 2^{k_1}$ as output.
8. The error operands are the outputs of the XOR banks.

III. PROPOSED ARCHITECTURE

Note that each error operand in the above algorithm is the result of removing the most significant
bit with the value '1' from the corresponding input operand of the iterative block. So, by
successive iterations (and adding the approximate products of each iteration), at least one of the
error operands becomes '0' at some point, which means that the error is '0' at that point and the
exact product is obtained. To explain this more clearly, we take an example and look at the errors
after each iteration.

TABLE I

[Figure: block diagram of the proposed architecture. Recoverable labels: CONTROL BLOCK, N2,
MODIFIED ITERATIVE BLOCK, CHECK BLOCK, FINAL PRODUCT]

Modified iterative block:

The iterative block explained in Part B of Section II uses 16x4 priority encoders, for which 16'b0
is an invalid input, so the block does not perform as expected when an operand is zero. The block
can be used iteratively to reduce the error, but may not achieve the exact product. We therefore
modified the iterative block by including a combinational logic circuit which bypasses the case of
an input being 16'b0, so that the block acts as expected (the iterative block is expected to give
16'b0 as output if any one of its operands is 16'b0).

[Figure: recoverable labels: N1, N2, Iterative Block, N1*, N2*]

[Figure: recoverable labels: NOR Bank, NOR Bank]
Fig. 4. Block Diagram of the Check Block

Control Block:

[Figure: recoverable labels: N1, N1*, N2, N2*, ADDER, BUFFER REGISTER, MUX, MUX, MUX, 0, Register,
Register]
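The arithmetic of the iterative procedure can be sketched in software. The following is a minimal behavioral model in Python, not the Verilog design itself: the function names (`leading_one`, `iterative_block`, `exact_multiply`) are hypothetical, Python integers are used instead of fixed 16-bit and 32-bit buses, and clock-level timing is not modelled. Each pass evaluates equation (14), strips the leading '1' from both operands, and accumulates the approximate product until one error operand is zero.

```python
def leading_one(n: int) -> int:
    """Return 2**k, where k is the position of the most significant 1 bit (0 if n == 0)."""
    return 1 << (n.bit_length() - 1) if n else 0


def iterative_block(n1: int, n2: int):
    """One pass of the block: (approximate product, error operand N1*, error operand N2*)."""
    if n1 == 0 or n2 == 0:
        # Bypass case of the modified block: a zero operand yields a zero output.
        return 0, 0, 0
    p1, p2 = leading_one(n1), leading_one(n2)            # LOD outputs 2**k1 and 2**k2
    k1, k2 = p1.bit_length() - 1, p2.bit_length() - 1    # priority encoder outputs k1, k2
    e1, e2 = n1 ^ p1, n2 ^ p2                            # XOR banks: N1 - 2**k1, N2 - 2**k2
    approx = (1 << (k1 + k2)) + (e1 << k2) + (e2 << k1)  # equation (14)
    return approx, e1, e2


def exact_multiply(n1: int, n2: int):
    """Accumulate approximate products until an error operand becomes zero.

    Returns (product, number of passes through the block).
    """
    product, passes = 0, 0
    while n1 != 0 and n2 != 0:
        approx, n1, n2 = iterative_block(n1, n2)
        product += approx
        passes += 1
    return product, passes
```

For instance, `exact_multiply(5, 3)` accumulates 14 in the first pass and 1 in the second, returning the exact product 15 after two passes.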
Simulation Results:

TABLE II

N1                N2                1's in N1  1's in N2  Minimum  Delay (clock cycles)
1111010101010101  1000100101010000  10         5          5        4
1000111100001010  0100100011110010  7          7          7        6
0010001100010000  1010101010100101  4          8          4        3

Discussions:

The architecture is coded in Verilog HDL and tested on the ModelSim simulation tool. As discussed
earlier, the delay is input-dependent and depends on the minimum of the number of 1's in the two
operands, which can be observed from the simulation results. The wire 'w5' shows the execution
time for different inputs. The mean delay is 7.5 clock cycles.
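The dependence on the number of 1's can be checked with a short Python sketch. The helper names (`ones`, `passes_until_zero`) are hypothetical, and the sketch counts passes of leading-one removal only; how passes map to clock cycles is an assumption not modelled here. Since every pass removes one leading '1' from each operand, the number of passes until an operand reaches zero equals the smaller of the two counts of 1's. For the three operand pairs of Table II this gives 5, 7 and 4, and the last entry in each row of the table is one less than that.

```python
def ones(n: int) -> int:
    """Number of 1 bits in n."""
    return bin(n).count("1")


def passes_until_zero(n1: int, n2: int) -> int:
    """Strip the leading 1 of both operands until one of them becomes zero."""
    passes = 0
    while n1 and n2:
        n1 ^= 1 << (n1.bit_length() - 1)  # remove the most significant 1 of n1
        n2 ^= 1 << (n2.bit_length() - 1)  # remove the most significant 1 of n2
        passes += 1
    return passes


# Operand pairs taken from the rows of Table II.
table_rows = [
    (0b1111010101010101, 0b1000100101010000),
    (0b1000111100001010, 0b0100100011110010),
    (0b0010001100010000, 0b1010101010100101),
]

for n1, n2 in table_rows:
    print(ones(n1), ones(n2), passes_until_zero(n1, n2))
```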
V. CONCLUSIONS

In this paper we have designed a Logarithmic Multiplier which gives an exact result, unlike other
logarithm based multipliers. The architectural design is simple, as it uses simple combinational
and sequential logic circuits. The architecture was designed by modifying the existing iterative
block and adding combinational and sequential circuits that monitor and control the inputs and
iterations to arrive at exact results.

VI. REFERENCES

[1] Mitchell, John N. "Computer multiplication and division using binary logarithms." IRE
    Transactions on Electronic Computers 4 (1962): 512-517.
[2] McLaren, Duncan J. "Improved Mitchell-based logarithmic multiplier for low-power DSP
    applications." SOC Conference, 2003. Proceedings. IEEE International [Systems-on-Chip].
    IEEE, 2003.
[3] Agrawal, Ritesh Kumar, and Harish Mallikarjun Kittur. "ASIC based logarithmic multiplier using
    iterative pipelined architecture." Information & Communication Technologies (ICT), 2013 IEEE
    Conference on. IEEE, 2013.
[4] Bulić, Patricio, Zdenka Babić, and Aleksej Avramović. "A simple pipelined logarithmic
    multiplier." Computer Design (ICCD), 2010 IEEE International Conference on. IEEE, 2010.
[5] Mahalingam, Venkataraman, and N. Ranganathan. "An efficient and accurate logarithmic
    multiplier based on operand decomposition." VLSI Design, 2006. Held jointly with 5th
    International Conference on Embedded Systems and Design., 19th International Conference on.
    IEEE, 2006.
[6] Selina, R. Rachel. "VLSI implementation of piecewise approximated antilogarithmic converter."
    Communications and Signal Processing (ICCSP), 2013 International Conference on. IEEE, 2013.
[7] Hoefflinger, B., M. Selzer, and F. Warkowski. "Digital logarithmic CMOS multiplier for
    very-high-speed signal processing." Custom Integrated Circuits Conference, 1991., Proceedings
    of the IEEE 1991. IEEE, 1991.
[8] Kong, Man Yan, JM Pierre Langlois, and Dhamin Al-Khalili. "Efficient FPGA implementation of
    complex multipliers using the logarithmic number system." Circuits and Systems, 2008. ISCAS
    2008. IEEE International Symposium on. IEEE, 2008.
[9] Ahmed, Syed Ershad, Sanket Kadam, and M. B. Srinivas. "An Iterative Logarithmic Multiplier
    with Improved Precision." Computer Arithmetic (ARITH), 2016 IEEE 23rd Symposium on. IEEE,
    2016.
[10] Babic, Zdenka, Aleksej Avramovic, and Patricio Bulic. "An iterative Mitchell's algorithm
    based multiplier." Signal Processing and Information Technology, 2008. ISSPIT 2008. IEEE
    International Symposium on. IEEE, 2008.
[11] Kim, Min Soo, et al. "Low-power implementation of Mitchell's approximate logarithmic
    multiplication for convolutional neural networks." Design Automation Conference (ASP-DAC),
    2018 23rd Asia and South Pacific. IEEE, 2018.
[12] Klinefelter, Alicia, et al. "Error-energy analysis of hardware logarithmic approximation
    methods for low power applications." Circuits and Systems (ISCAS), 2015 IEEE International
    Symposium on. IEEE, 2015.