
Design and Simulation of 16x16 bit Iterative Logarithmic Multiplier for Accurate Results


Alen Anurag Pandit Ch. Achuth Reddy Dr. Gautam Narayan
B.Tech 4th year,ECE B.Tech 4th year,ECE Associate Professor
School of Electronics Engineering School of Electronics Engineering School of Electronics Engineering
VIT Vellore VIT Vellore VIT Vellore
[email protected] [email protected] [email protected]

Abstract—Multiplication is a basic arithmetic operation. Fast Fourier Transforms, multiply-and-accumulate units and convolution are some of the computation-intensive arithmetic functions often encountered in Digital Signal Processing applications. Usually, logarithm-based multipliers are used in these cases, and they introduce certain errors; these errors are approximated by various methods. In this paper a simple architecture of a 16x16 logarithm-based multiplier is proposed which uses simple combinational and sequential circuits to obtain an exact product. The multiplier has a variable execution time, with the maximum execution time being 15 clock cycles and the mean being 7.5 clock cycles. The architecture is designed and simulated in the 'ModelSim' simulation tool.

Keywords—Logarithmic Multiplier, Logarithmic Number Systems, Modified Iterative Block, Check Block, Control Block, Exact Product.

I. INTRODUCTION

Logarithmic multiplication is a process which involves calculating the product of two operands by converting the operands into the Logarithmic Number System. The procedure for calculating the product involves converting the operands into their respective logarithms, adding the logarithmic results, and computing the anti-logarithm of that sum. This procedure is simpler because the addition operation replaces the product operation in Logarithmic Number Systems [7]. However, this procedure introduces a setback: logarithms and anti-logarithms cannot be computed exactly. These methods therefore introduce errors, and one is obliged to approximate the results of logarithms and anti-logarithms [5] [6] [8] [9] [11] [12]. One such method is the Mitchell's Algorithm based multiplier [1], which approximates log2( 1 + x ) as x, where x represents the mantissa of a number.

An iterative architecture similar to the Mitchell's Algorithm based multiplier was proposed by Patricio Bulić and his team [4] [10], which models the true product as the sum of an approximate product and an error. The error here is in the form of the product of two new operands, which can again be fed into a similar block and whose approximate product can be added to the previous result, so as to reduce the overall error. The overall error is not reduced through any direct approximation techniques, which are the principal constituents of most logarithm-based multipliers; rather, an iterative solution is provided to reduce the error. Some of the direct error approximation techniques are segmentation and interpolation techniques [6]. The proposed architecture follows an algorithm similar to the iterative algorithm [4] and uses combinational and sequential logic circuits to achieve an exact result.

The rest of the paper is organized as follows. Section II is subdivided into two parts (A, B); Parts A and B explain the previous work on Mitchell's algorithm [1] and on iterative multiplication algorithms [4], respectively. Section III presents the proposed architecture, which uses a modified iterative block to arrive at exact results, and explains the functionality of each block used in the architecture in detail. Section IV provides simulation results. Section V draws conclusions and Section VI provides the references.

II. PREVIOUS WORK

A. MITCHELL'S ALGORITHM

Any binary integer can be written as:

    N = 2^k [ 1 + sum_{i=0}^{k-1} 2^(i-k) . Z_i ]    (1)

where 'k' is the position of the most significant bit whose value is '1' and 'Z_i' is the value of the bit in the i-th position.

The above equation can be further modeled as:

    N = 2^k [ 1 + X ]    (2)

where X is the mantissa part.

    log2 N = k + log2( 1 + X )    (3)

Mitchell's algorithm approximates log2( 1 + X ) with X.

So, for any two operands N1 and N2,

    N1 = 2^k1 [ 1 + X1 ]    (4)

    N2 = 2^k2 [ 1 + X2 ]    (5)
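The decomposition and approximation in equations (1)-(3) can be illustrated numerically. The following Python sketch is an illustration only, not part of the paper's hardware:

```python
import math

def decompose(n):
    """Split a positive integer as N = 2**k * (1 + X), per equations (1)-(2):
    k is the index of the leading '1' bit and X is the fractional mantissa."""
    k = n.bit_length() - 1
    x = n / (1 << k) - 1.0          # X lies in [0, 1)
    return k, x

def mitchell_log2(n):
    """Mitchell's approximation: log2(N) = k + log2(1 + X) is replaced by k + X."""
    k, x = decompose(n)
    return k + x

# The approximation never overestimates, since log2(1 + X) >= X on [0, 1].
for n in (3, 100, 599, 65535):
    assert mitchell_log2(n) <= math.log2(n)
```

The assertion in the loop reflects the observation made later in this section: the approximation error of equation (8) is always positive.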
    log2 N1 = k1 + log2( 1 + X1 )    (6)

    log2 N2 = k2 + log2( 1 + X2 )    (7)

From Mitchell's approximation, log2 N1 ≈ k1 + X1 and log2 N2 ≈ k2 + X2.

    log2( N1 . N2 ) ≈ k1 + X1 + k2 + X2    (8)

The error here is positive, as log2( 1 + X ) is always greater than or equal to X, and the error ranges from 0-11% [2]. Various techniques were proposed to reduce this error, some of them being the operand decomposition method [5], the use of look-up tables, and segmentation and interpolation methods [6]. Each method has its own tradeoffs between architecture complexity, accuracy and execution time, and is generally used where certain errors are tolerable. Such conditions are often met in Digital Signal Processing applications.

B. SIMPLE ITERATIVE LOGARITHMIC MULTIPLIER

This method is similar to Mitchell's algorithm and uses an iterative method which makes it possible to achieve an error as small as one desires, and may even achieve an exact result [3] [4] [10].

Mathematics involved:

    N1 . N2 = 2^k1 [ 1 + X1 ] . 2^k2 [ 1 + X2 ]    (9)

    N1 . N2 = 2^(k1+k2) . [ 1 + X1 + X2 ] + 2^(k1+k2) . [ X1 . X2 ]    (10)

    N1 . N2 = 2^(k1+k2) + X1 . 2^(k1+k2) + X2 . 2^(k1+k2) + [ X1 . X2 ] . 2^(k1+k2)    (11)

From equation (2), we can write X . 2^k = ( N − 2^k ). Therefore

    X1 . 2^k1 = ( N1 − 2^k1 ) ;  X2 . 2^k2 = ( N2 − 2^k2 )    (12)

    N1 . N2 = 2^(k1+k2) + ( N1 − 2^k1 ) . 2^k2 + ( N2 − 2^k2 ) . 2^k1 + ( N1 − 2^k1 ) . ( N2 − 2^k2 )    (13)

Writing Ptrue = N1 . N2, we have Ptrue = Pappx + E, where Ptrue is the exact product, Pappx is the approximate product and E is the error:

    Pappx = 2^(k1+k2) + ( N1 − 2^k1 ) . 2^k2 + ( N2 − 2^k2 ) . 2^k1    (14)

    E = ( N1 − 2^k1 ) . ( N2 − 2^k2 )    (15)

The error here is again in the form of a product of two operands (N1*) . (N2*), where N1* = ( N1 − 2^k1 ) and N2* = ( N2 − 2^k2 ) are the error operands. The same arithmetic can be followed for these operands, and subsequently adding the results will give a more accurate value of the product. On employing the same procedure repeatedly, the accurate product can be achieved at some point. The block diagram below shows the architecture of the iterative block [3]. The architecture of a 16x16 bit iterative block uses Leading One Detectors ( 16 bit ), priority encoders ( 16 x 4 ), barrel shifters ( 32 bit ), ripple carry adders ( 4 bit and 32 bit ), decoders ( 5 x 32 ) and XOR banks ( 16 bit ).

Fig. 1. Architecture of the iterative block.

Algorithm:

- Inputs are given to the Leading One Detectors (LODs), the outputs of which will be 2^k1 and 2^k2.
- With inputs as 2^k1 and 2^k2, the priority encoders compute the values of k1 and k2.
- ( N1 − 2^k1 ) and ( N2 − 2^k2 ) are the outputs of the XOR banks, whose inputs are the operands and the outputs of the Leading One Detectors.
- The barrel shifters compute the values of ( N1 − 2^k1 ) . 2^k2 and ( N2 − 2^k2 ) . 2^k1.
- The results of the two barrel shifters are added to obtain the sum ( N1 − 2^k1 ) . 2^k2 + ( N2 − 2^k2 ) . 2^k1.
- The values of k1 and k2 obtained in step 2 are added and the result is given as an input to a decoder, which gives the value of 2^(k1+k2) as output.
- The results obtained in steps 5 and 6 are added to give 2^(k1+k2) + ( N1 − 2^k1 ) . 2^k2 + ( N2 − 2^k2 ) . 2^k1 as output.
- The error operands are the outputs of the XOR banks.

III. PROPOSED ARCHITECTURE

We have to note that each error operand in the above algorithm is the result of removing the most significant '1' bit from the corresponding input operand of the iterative block. So by successive iterations (and adding the approximate products of each iteration) at least one of the error operands becomes '0' at some point, which means that the error is '0' at that point and the accurate product is obtained. To explain this more clearly we shall take an example and look at the errors after each iteration.

Let N1 = 1001010011 and N2 = 10000101.

We know that error = ( N1 − 2^k1 ) . ( N2 − 2^k2 ).

We should note that ( N − 2^k ) is the value after removing the MSB from the operand.

TABLE I
ERROR AFTER EACH ITERATION FOR THE ABOVE EXAMPLE

Iteration | ( N1 − 2^k1 ) . ( N2 − 2^k2 )
1st       | (1010011) * (101)
2nd       | (10011) * (1)
3rd       | (11) * (0)

After the 3rd iteration the error is equal to (11) * (0), which is '0'; that means the accurate product is achieved at this instant.

Therefore we can state that the number of iterations needed to obtain an accurate product is equal to the minimum of the number of 1's in the two operands. Let n1 and n2 be the number of 1's in the input operands N1 and N2. Then,

    number of iterations = Min[ n1, n2 ]    (16)

    Execution time = ( number of iterations − 1 ) . ( clock period )    (17)

The question here is how to detect the instance at which the accurate product is achieved.

Modified iterative block:

The iterative block explained in part B of Section II uses 16x4 priority encoders, for which 16'b0 is an invalid input, so the block does not perform as expected. The block can be used iteratively to reduce the error, but may not achieve the exact product. We therefore modified the iterative block by including a combinational logic circuit which bypasses the case of an input being 16'b0 and makes the block act as expected (the iterative block is expected to give 16'b0 as output if any one of its operands is 16'b0).

Fig. 2. Modified iterative block.

Concept of the proposed architecture:

To address the question of detecting the instance at which the accurate product is obtained, we use a check block which checks whether any of the error operands is '0'. The check block takes the error operands from the modified iterative block as inputs, gives a 'High' output when no operand is '0' and gives a 'Low' output when any one of the operands is '0'.

We have to note that a transition from 'High' to 'Low' occurs in the output of the check block when the error becomes '0'. We use this condition to detect the instance at which the error becomes zero, i.e. the accurate product is achieved. The architecture should not allow new inputs or the initial inputs while the iterations are in progress; this is controlled by a control block which takes its input from the check block.

The control block allows the error operands as inputs to the modified iterative block when the output of the check block is 'High', and it keeps allowing the error operands as long as the output of the check block remains 'High'. The output of the check block being 'Low' means that the final result is achieved and there is no need to iterate the error operands further (since at least one of the error operands is zero at this point and the error becomes '0'). At this point, new inputs can be accepted. The control and check blocks are simple and can be constructed using logic gates, multiplexers and registers. These blocks are explained in detail in the following discussion. A buffer register is used to store the temporary results of the successive additions of products after each iteration.

Fig. 3. Block Diagram of the Proposed Architecture.

The register which stores the final product is driven by the output of the check block, which serves as a negative-edge clock to the register. So the values in the register change only at the instances when the transition from 'High' to 'Low' occurs.

Check Block:

As discussed earlier, the check block takes the error operands as inputs and gives a 'High' output when no operand is '0' and a 'Low' output when any one of the operands is '0'. This is done by an 'OR' operation over every digit of each operand individually and subsequently using an 'AND' gate. This check block is a typical zero detector.

Fig. 4. Block Diagram of the Check Block.

Control Block:

Fig. 5. Block diagram of Control Block.

The control block controls the inputs to the modified iterative block. It does not allow new inputs or initial inputs while the iterations are in progress. The selection line of the MUXes in the above block is taken from the output of the check block. When the output of the check block is 'High' (when no error operand is '0'), the selection line is 'High', which allows the error operands for further iterations and blocks the initial or new inputs. The selection line becomes 'Low' when any of the error operands is '0', and then new inputs can be allowed.
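The behavior of the complete architecture (modified iterative block, check block, adder and buffer register) can be summarized at the algorithmic level. The sketch below is a Python behavioral model for illustration only, not the paper's Verilog implementation:

```python
def leading_one(n):
    """Leading One Detector: returns 2**k, the highest set bit of n (n > 0)."""
    return 1 << (n.bit_length() - 1)

def iterative_block(n1, n2):
    """One pass of the iterative block: returns (Pappx, N1*, N2*).
    The modified block's bypass is modeled by returning 0 when an operand is 0."""
    if n1 == 0 or n2 == 0:           # bypass case added in the modified block
        return 0, 0, 0
    p1, p2 = leading_one(n1), leading_one(n2)
    r1, r2 = n1 - p1, n2 - p2        # XOR-bank outputs: operand minus leading one
    # Pappx = 2^(k1+k2) + (N1 - 2^k1)*2^k2 + (N2 - 2^k2)*2^k1, equation (14)
    pappx = p1 * p2 + r1 * p2 + r2 * p1
    return pappx, r1, r2

def exact_multiply(n1, n2):
    """Accumulate Pappx over iterations until an error operand is 0."""
    product, iterations = 0, 0
    while n1 != 0 and n2 != 0:       # check block: stop when any operand is '0'
        pappx, n1, n2 = iterative_block(n1, n2)
        product += pappx             # adder + buffer register
        iterations += 1
    return product, iterations
```

Each loop pass corresponds to one clock cycle of the hardware, and the loop terminates exactly when the check block would drive its output 'Low'. Since N1·N2 = Pappx + N1*·N2* by equation (13), the accumulated sum equals the exact product once an error operand reaches zero.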
IV. SIMULATIONS AND DISCUSSIONS

Simulation Results:

Fig. 6. Simulation result for N1 = b’10100110011 and N2 = b’10000101

Fig. 7. Simulation result for N1 = b’1010101001010101 and N2 = b’1001000111010101

TABLE II
EXECUTION TIME FOR DIFFERENT INPUTS

Input 1 ( N1 )     | Input 2 ( N2 )     | Number of 1's in input 1 ( n1 ) | Number of 1's in input 2 ( n2 ) | Min[ n1, n2 ] | Execution time in clock cycles
1010100000000001   | 0101010000011011   | 4  | 7 | 4 | 3
1111010101010101   | 1000100101010000   | 10 | 5 | 5 | 4
1000111100001010   | 0100100011110010   | 7  | 7 | 7 | 6
0010001100010000   | 1010101010100101   | 4  | 8 | 4 | 3
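The execution times in Table II follow directly from equations (16)-(17) and can be cross-checked with a short Python snippet (illustrative only; the paper's measurements come from ModelSim):

```python
def predicted_cycles(n1, n2):
    """Execution time per equations (16)-(17): Min[n1, n2] iterations,
    hence Min[n1, n2] - 1 clock cycles, where n1, n2 count the '1' bits."""
    ones1 = bin(n1).count("1")
    ones2 = bin(n2).count("1")
    return min(ones1, ones2) - 1

# The rows of Table II, re-checked against the formula:
rows = [
    (0b1010100000000001, 0b0101010000011011, 3),
    (0b1111010101010101, 0b1000100101010000, 4),
    (0b1000111100001010, 0b0100100011110010, 6),
    (0b0010001100010000, 0b1010101010100101, 3),
]
for n1, n2, cycles in rows:
    assert predicted_cycles(n1, n2) == cycles
```

For 16-bit operands, Min[ n1, n2 ] can be at most 16, which is consistent with the maximum execution time of 15 clock cycles stated in the abstract.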

Discussions:

The architecture is coded in Verilog HDL and tested in the ModelSim simulation tool. As discussed earlier, the delay is variable and depends on the minimum of the number of 1's in the two operands, which can be observed from the simulation results. The wire 'w5' shows the execution time for different inputs. The mean delay is 7.5 clock cycles.
V. CONCLUSIONS

In this paper we have designed a logarithmic multiplier which gives an exact result, unlike other logarithm-based multipliers. The architectural design is simple, as it uses simple combinational and sequential logic circuits. The architecture was designed by modifying the existing iterative block, using the modified iterative block, and designing proper combinational and sequential circuits to monitor and control the inputs and iterations so as to arrive at exact results.

VI. REFERENCES

[1] Mitchell, John N. "Computer multiplication and division using binary logarithms." IRE Transactions on Electronic Computers EC-11.4 (1962): 512-517.
[2] McLaren, Duncan J. "Improved Mitchell-based logarithmic multiplier for low-power DSP applications." SOC Conference, 2003. Proceedings. IEEE International [Systems-on-Chip]. IEEE, 2003.
[3] Agrawal, Ritesh Kumar, and Harish Mallikarjun Kittur. "ASIC based logarithmic multiplier using iterative pipelined architecture." Information & Communication Technologies (ICT), 2013 IEEE Conference on. IEEE, 2013.
[4] Bulić, Patricio, Zdenka Babić, and Aleksej Avramović. "A simple pipelined logarithmic multiplier." Computer Design (ICCD), 2010 IEEE International Conference on. IEEE, 2010.
[5] Mahalingam, Venkataraman, and N. Ranganathan. "An efficient and accurate logarithmic multiplier based on operand decomposition." VLSI Design, 2006. Held jointly with 5th International Conference on Embedded Systems and Design, 19th International Conference on. IEEE, 2006.
[6] Selina, R. Rachel. "VLSI implementation of piecewise approximated antilogarithmic converter." Communications and Signal Processing (ICCSP), 2013 International Conference on. IEEE, 2013.
[7] Hoefflinger, B., M. Selzer, and F. Warkowski. "Digital logarithmic CMOS multiplier for very-high-speed signal processing." Custom Integrated Circuits Conference, 1991. Proceedings of the IEEE 1991. IEEE, 1991.
[8] Kong, Man Yan, J. M. Pierre Langlois, and Dhamin Al-Khalili. "Efficient FPGA implementation of complex multipliers using the logarithmic number system." Circuits and Systems, 2008. ISCAS 2008. IEEE International Symposium on. IEEE, 2008.
[9] Ahmed, Syed Ershad, Sanket Kadam, and M. B. Srinivas. "An iterative logarithmic multiplier with improved precision." Computer Arithmetic (ARITH), 2016 IEEE 23rd Symposium on. IEEE, 2016.
[10] Babic, Zdenka, Aleksej Avramovic, and Patricio Bulic. "An iterative Mitchell's algorithm based multiplier." Signal Processing and Information Technology, 2008. ISSPIT 2008. IEEE International Symposium on. IEEE, 2008.
[11] Kim, Min Soo, et al. "Low-power implementation of Mitchell's approximate logarithmic multiplication for convolutional neural networks." Design Automation Conference (ASP-DAC), 2018 23rd Asia and South Pacific. IEEE, 2018.
[12] Klinefelter, Alicia, et al. "Error-energy analysis of hardware logarithmic approximation methods for low power applications." Circuits and Systems (ISCAS), 2015 IEEE International Symposium on. IEEE, 2015.
