Implementation of Area Optimized Low Power Multiplication and Accumulation
S. China Venkateswarlu, N. Uday Kumar, N. Sandeep Kumar, Aannam Karthik, V. Vijay

International Journal of Innovative Technology and Exploring Engineering (IJITEE)
ISSN: 2278-3075, Volume-9, Issue-1, November 2019
2928
Published By:
Blue Eyes Intelligence Engineering
& Sciences Publication
Retrieval Number: A9110119119/2019©BEIESP
DOI: 10.35940/ijitee.A9110.119119
Abstract: A number of computations are involved at every stage
in Digital Signal Processing (DSP). At every stage of computation,
terms derived from the previous and present stages are added and
multiplied. The general computation uses ordinary multiplication
and addition, but the circuitry for ordinary multiplication and
addition is inefficient: it occupies more chip area, consumes more
power, and computes slowly. These drawbacks can be avoided by
switching to the proposed method, called Multiplication and
Accumulation (MAC). The aim of this project is to develop an
area-optimized, low-power digital circuit for the MAC (Multiply
and Accumulate) operation. We develop Verilog Hardware Description
Language (HDL) code for various implementations of the MAC; that
is, we avoid using multipliers and prefer combinational circuits
such as multiplexers. The Verilog HDL code is simulated to check
its functionality. Once the expected results are obtained, we
proceed to the implementation of the digital circuits. We analyze
all the MAC digital circuits to find the one that consumes the
minimum area and power. The importance of the MAC in FPGA designs
is explained through some filter designs. We also give some
suggestions on system-level solutions based on the MAC.
Keywords: Multiplier Accumulator Unit, Digital Signal Processing,
Embedded Systems, Algorithmic Noise Tolerant, Replica Redundancy
Block
Revised Manuscript Received on August 05, 2019.
Dr. S. China Venkateswarlu, Professor-ECE, Institute of Aeronautical
Engineering, Affiliated to JNTUH, Hyderabad, Dundigal, Hyderabad,
Telangana, India. (Email: prof.cvsonagiri@gmail.)
Dr. N. Uday Kumar, Professor-ECE, Marri Laxman Institute of
Technology, Affiliated to JNTUH, Hyderabad, Dundigal, Hyderabad,
Telangana, India. (Email: joyudaya@gmail.com)
N. Sandeep Kumar, Assistant Professor-ECE, Institute of Aeronautical
Engineering, Affiliated to JNTUH, Hyderabad, Dundigal, Hyderabad,
Telangana, India. (Email: prof.cvsonagiri@gmail.com)
Aannam Karthik, Assistant Professor-ECE, Institute of Aeronautical
Engineering, Affiliated to JNTUH, Hyderabad, Dundigal, Hyderabad,
Telangana, India. (Email: karthik011190@gmail.com)
Dr. V. Vijay, Professor-ECE, Institute of Aeronautical Engineering,
Affiliated to JNTUH, Hyderabad, Dundigal, Hyderabad, Telangana, India.
(Email: V.vijay@iare.ac.in)
I. INTRODUCTION
The multiplier and the multiplier-accumulator (MAC) are
essential elements of digital signal processing operations such
as filtering, convolution, and inner products. The MAC unit
computes a running total of products, which is at the core of
algorithms such as the FIR filter and the FFT. The ability to
compute with a fast MAC unit is fundamental to achieving high
performance in many DSP algorithms, which is why every
contemporary commercial DSP processor has at least one MAC unit.
Most digital signal processing methods use transforms such as
the discrete cosine transform or the discrete wavelet transform.
In most digital signal processing applications the basic
operations are multiplication and accumulation. A MAC unit that
consumes low power is therefore key to achieving a
high-performance digital signal processing system. Finite
impulse response (FIR) filters are widely used in various DSP
applications.
Figure 1.1: Basic Structure of Multiply Accumulate
Unit
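The running total of products that a MAC unit computes can be
sketched in C as follows. This is a minimal software illustration,
not the hardware described in this paper; the function name
fir_mac, the 16-bit sample width and the 32-bit accumulator are
assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: one FIR output sample computed as a chain of
 * multiply-accumulate steps, y = sum over k of h[k] * x[k].
 * x[0] is taken to be the newest input sample, x[ntaps-1] the oldest. */
int32_t fir_mac(const int16_t *h, const int16_t *x, size_t ntaps)
{
    int32_t acc = 0;                     /* accumulator register */
    for (size_t k = 0; k < ntaps; ++k) {
        acc += (int32_t)h[k] * x[k];     /* one MAC operation per tap */
    }
    return acc;
}
```

Each loop iteration corresponds to one MAC operation, so an N-tap
filter needs N such operations per output sample, which is why the
speed, area and power of the MAC unit dominate the filter.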
Another structure has been adopted to increase the throughput
rate, with special pipeline structures in the accumulator to
reduce the overall latency. It is used in the design and
implementation of a finite impulse response (FIR) filter built
on a low-power MAC unit, with clock gating and pipelining
techniques to save power.
We first design a 1-bit or 2-bit MAC unit with appropriate
geometries that give improved power, area, and delay. The delay
in the pipeline stages of the MAC unit is estimated, and based
on it a control unit is designed to control the data flow
between the MAC blocks for low power. Similarly, the N-bit MAC
unit is designed and controlled for low power using control
logic that enables the pipelined stages at the appropriate time.
The adder cell designed has the advantages of high operating
speed, a small transistor count, and low power. A ripple carry
adder is mostly used as the accumulator in the design. This
research work also examines different architectures of
multipliers and adders that are suitable for implementing
high-throughput signal processing while achieving low power
consumption. The efficiency of the proposed MAC unit
architecture in terms of area and speed is seen through reduced
area, low critical delay, and low hardware complexity. The
proposed MAC unit reduces the area by decreasing the number of
multiplications and additions in the multiplier unit. The
proposed MAC unit can be implemented on a field programmable
gate array (FPGA) device. The performance evaluation results in
terms of speed and device utilization are compared with earlier
MAC architectures. Although the use of Vedic mathematics
techniques for multiplication is described in the literature,
the proposed MAC unit implementation using the multiplication
unit shows improvements in delay and area. The general form of
the MAC operation can be expressed by a single equation.
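In its standard textbook form, the MAC operation accumulates one
product per step. The symbols a_i, b_i and acc below are
illustrative assumptions and are not taken from this paper:

```latex
\mathrm{acc}_{i} = \mathrm{acc}_{i-1} + a_i \times b_i,
\qquad \mathrm{acc}_{0} = 0,
\qquad\text{so that}\qquad
\mathrm{acc}_{N} = \sum_{i=1}^{N} a_i\, b_i .
```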
A. Literature Review
a) Maroju SaiKumar & D. Ashok Kumar, in “Design & Performance
Analysis of Multiply-Accumulate (MAC) Unit”, present a MAC unit
model designed by combining different multipliers, such as the
array multiplier, the ripple carry multiplier with the row
bypassing technique, the Wallace multiplier, and the Dadda
multiplier. The performance of the MAC unit is analyzed in terms
of area, delay, and power. The proposed MAC unit model
(DM+CSA+Acc) achieves improved performance in terms of area and
delay compared with the existing technique, at the cost of a
small increase in power. The performance analysis is carried out
by designing the MAC unit models in Verilog HDL. The models are
then simulated and synthesized in Xilinx ISE 13.2 for the
Virtex-6 family in 40 nm technology.
b) Thirumala Rao V., Girish Gandhi S. & Leela Mohan C., in
“Performance Evaluation of Parallel Multipliers for High Speed
MAC Design”, present the implementation of high-speed signed and
unsigned fast multipliers and their comparative analysis. Design
approaches for widely used parallel multipliers, such as the
Booth multiplier, the Wallace multiplier, and the Dadda tree
multiplier, are compared to obtain design characteristics such
as speed and area. The MAC implemented using the Wallace
multiplier had the least delay compared with the others. The
resulting design parameters of the multipliers were analyzed to
design an optimum-speed multiply and accumulate (MAC) unit for
multimedia applications such as filters and synthesizers.
B. Problem Identification
The MAC unit has been implemented in many ways, some of which
use specific multipliers and adders. There are many good
multipliers, such as the Wallace tree multiplier and the Booth
multiplier, and many adders, such as the ripple carry adder and
the carry look-ahead adder, but in spite of that we observed
small flaws in their operation. There is no particular issue
with the adders; the issue lies with the multipliers. In those
cases the multiplication is carried out using the asterisk “*”,
the star operator used for multiplication.
It is therefore necessary to counter this flaw, and we came up
with the idea of a multiplier-less algorithm for carrying out
the multiplication operation. A program runs extremely fast when
it is built from processor-level commands. We made use of this
fact and developed a MAC unit that is actually a multiplier-less
unit.
II. METHODOLOGY
A key part of any study or thesis is the methodology. This is
not exactly the same as “methods”. The methodology describes the
broad philosophical underpinning of the chosen research methods,
including whether qualitative or quantitative methods, or a
mixture of both, are used, and why.
A. METHODOLOGY OF EXISTING METHODS
In a 2 x 2-bit multiplier, the multiplicand and the multiplier
have two bits each and the result of the multiplication has four
bits. Each input ranges from “00” to “11”, and the output lies
between “0000” and “1001”, since the largest product is
3 x 3 = 9. Figure 2.1 shows the stepwise multiplication of two
binary numbers using the Vedic mathematics technique.
Figure 2.1: Block Diagram of Developed Multiplier
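The 2 x 2-bit multiplication can be modelled at gate level as in
the sketch below. It assumes the common vertical-and-crosswise
formulation built from four AND gates and two half adders; the
paper's own block diagram may differ, and the function name
vedic_mul_2x2 is hypothetical.

```c
#include <stdint.h>

/* Hypothetical gate-level model of a 2 x 2-bit multiplier built from
 * four AND gates and two half adders (vertical-and-crosswise scheme).
 * Only the two low bits of a and b are used. */
uint8_t vedic_mul_2x2(uint8_t a, uint8_t b)
{
    uint8_t a0 = a & 1u, a1 = (a >> 1) & 1u;
    uint8_t b0 = b & 1u, b1 = (b >> 1) & 1u;

    uint8_t p0 = a0 & b0;   /* vertical product of the LSBs */
    uint8_t x  = a1 & b0;   /* first crosswise partial product */
    uint8_t y  = a0 & b1;   /* second crosswise partial product */
    uint8_t p1 = x ^ y;     /* half adder 1: sum */
    uint8_t c1 = x & y;     /* half adder 1: carry */
    uint8_t z  = a1 & b1;   /* vertical product of the MSBs */
    uint8_t p2 = z ^ c1;    /* half adder 2: sum */
    uint8_t p3 = z & c1;    /* half adder 2: carry */

    return (uint8_t)((p3 << 3) | (p2 << 2) | (p1 << 1) | p0);
}
```

For inputs “11” and “11” the model returns “1001” (decimal 9),
the largest product a 2 x 2-bit multiplier can produce.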

III. METHODOLOGY OF PROPOSED METHOD
The basic operation in most digital signal processing
applications is the MAC. The proposed method concerns the design
and development of an area-optimized MAC (Multiply and
Accumulate) block. The proposed method involves:
a) Developing the traditional multiplier-based MAC, with
synthesis and area calculation.
b) Developing a Windows-based application to generate the
multiplier-less MAC equations.
c) Developing the various multiplier-less MAC blocks.
d) Synthesizing all the multiplier-less MAC blocks.
e) Summarizing the area utilization of all the multiplier-less
MAC blocks and comparing it with that of the traditional
multiplier-based MAC.
f) Explaining the importance of the proposed method in real-time
applications such as video processing systems.
IV. IMPLEMENTATION
This section discusses the techniques and methodologies by which
we implemented our research work: the multiplier-based
traditional method and multiplier-less MAC method I.
Multiplier-Less MAC Method I. In multiplier-less method I, we
implemented the MAC using a multiplier-less algorithm. To make
it multiplier-less, we used the shift operations “<<” and “>>”
instead of the asterisk “*”. Since shift commands are executed
at the processor level, they are much faster than the asterisk
operator and consume fewer processor resources, as shown in a
later section. We designed a MAC with a 4-bit coefficient and an
8-bit variable. We first wrote a C program that generates the
MAC equations when given the length of the coefficients.
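The shift-based idea can be sketched in C as follows. This is an
illustration under stated assumptions, not the authors' program:
the names shift_add_mul and mac_equation are hypothetical, and the
generator here takes a coefficient value rather than a coefficient
length.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical multiplier-less product: adds a shifted copy of var for
 * every set bit of the 4-bit coefficient, so no '*' operator is used.
 * The result is at most 15 * 255 = 3825, which fits in 12 bits. */
uint16_t shift_add_mul(uint8_t coeff, uint8_t var)
{
    uint16_t product = 0;
    for (unsigned bit = 0; bit < 4; ++bit) {
        if ((coeff >> bit) & 1u) {
            product = (uint16_t)(product + ((uint16_t)var << bit));
        }
    }
    return product;
}

/* Hypothetical equation generator: writes the shift-and-add expression
 * for the low 4 bits of coeff into buf. Returns 0 on success and -1 if
 * the buffer is too small. */
int mac_equation(uint8_t coeff, char *buf, size_t len)
{
    size_t used = 0;
    int n;

    if ((coeff & 0x0Fu) == 0) {
        n = snprintf(buf, len, "0");
        return (n < 0 || (size_t)n >= len) ? -1 : 0;
    }
    for (int bit = 3; bit >= 0; --bit) {
        if (!((coeff >> bit) & 1u)) {
            continue;                    /* this bit adds no term */
        }
        n = snprintf(buf + used, len - used, "%s(var << %d)",
                     used ? " + " : "", bit);
        if (n < 0 || (size_t)n >= len - used) {
            return -1;                   /* output was truncated */
        }
        used += (size_t)n;
    }
    return 0;
}
```

For the coefficient “1110” (decimal 14), mac_equation produces
(var << 3) + (var << 2) + (var << 1), which is 8 var + 4 var +
2 var = 14 var, and shift_add_mul(14, 86) returns 1204.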
The addition program using the “+” operator was written in
Verilog with the Mentor Graphics ModelSim tool and later
synthesized with the Synopsys Synplify Pro synthesis tool to
generate the resource consumption report. The resource
utilization report generated by Synplify Pro is shown in
Figure 5.4.
V. SIMULATION RESULTS
This section presents the simulation waveforms of the proposed
method, together with an explanation of the functionality shown
in the waveforms.
INPUTS: 4-bit coeff, 8-bit var, clk, rst
OUTPUTS: macout
Multiplication Algorithm: Multiplier-Less Algorithm-I
We first consider Multiplier-Less Algorithm-I (MLA-I) from the
implementation section. MLA-I has been simulated and a snippet
of the simulation is included, with a brief explanation of the
waveforms. In addition to simulation, MLA-I has been synthesized
using the Synopsys Synplify Pro synthesis tool, and the snippet
is attached below.
CASE-I: When rst = 1. The inputs are the 4-bit coeff “1110”,
whose decimal value is 14, and the 8-bit var “01010110”, whose
decimal value is 86. When rst = 1, the output is zero,
“000000000000” (12'b0), irrespective of the inputs. In short,
the system is reset.
Now consider rst = 0.
Observed simulation waveform of MLA-I when rst = 0
Here we have set rst = 0, so the system is out of the reset
state and functions as designed. It now produces the output that
is the product of the two inputs. The output obtained is
macout = “010010110100”, whose decimal value is 1204
(14 x 86 = 1204). This verifies the functionality of the
proposed idea. The snippet below is the synthesis outcome of
MLA-I from the Synopsys Synplify Pro synthesis tool.
Figure 5.1: Synthesis Outcome of MLA-I
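The two simulated cases can be reproduced with a small behavioural
model in C. The sketch assumes an active-high synchronous reset
and a macout register that accumulates and wraps at 12 bits;
neither detail is stated in the text, and the names mla_model and
mla_clock are hypothetical.

```c
#include <stdint.h>

/* Hypothetical behavioural model of the simulated block: inputs coeff
 * (4 bits), var (8 bits) and rst; output macout (12 bits). */
typedef struct {
    uint16_t macout;    /* 12-bit output register */
} mla_model;

/* One rising clock edge. Assumes an active-high synchronous reset and
 * that macout accumulates and wraps at 12 bits. */
uint16_t mla_clock(mla_model *m, int rst, uint8_t coeff, uint8_t var)
{
    if (rst) {
        m->macout = 0;  /* CASE-I: output forced to 12'b0 */
        return 0;
    }
    uint16_t product = 0;
    for (unsigned bit = 0; bit < 4; ++bit) {
        if ((coeff >> bit) & 1u) {
            /* shift-and-add in place of the '*' operator */
            product = (uint16_t)(product + ((uint16_t)var << bit));
        }
    }
    m->macout = (uint16_t)((m->macout + product) & 0x0FFFu);
    return m->macout;
}
```

Clocking the model once with rst = 1 and then once with rst = 0,
coeff = “1110” and var = 86 gives 0 followed by 1204, matching the
two cases described above.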

Figure 5.2: Synthesis Outcome of MLA-II
Figure 5.3: Simulation waveform of MLA-II when rst = 0
Figure 5.4: Synthesis Outcome of Adder implemented using “+”
operator on Synopsys Synplify Pro Synthesis Tool
VI. CONCLUSION AND FUTURE SCOPE
The electronics and semiconductor industry occupies a critical
place in modern society by serving human needs and comfort. With
the passage of time and growing demand for electronic gadgets,
the industry has seen remarkable growth. The growth has been
most pronounced in portable communication devices such as mobile
phones, iPads, and notepads. The current high-performance
systems used in communication employ multiply-accumulate blocks,
which consume large area and power and are characterized by high
data throughput. However, in a power-starved society there is a
basic need for systems that reduce the power and raise the speed
of the MAC system. In this context, several MAC designs with
area and power reduction techniques have been addressed in this
project. In addition, this investigation should help future
researchers explore further the field of low-power, low-area,
high-speed multiply-accumulate units.
The proposed methodology was designed with the computational
operations of digital signal processing applications in mind,
but it can be extended to other domains such as video processing
and, interestingly, image processing. The main aim of a MAC unit
is the pace at which an operation is performed, i.e., the speed
of operation. With further enhancements to the proposed
methodology, the MAC unit can be made more area efficient and
faster, with even lower power consumption. Such units can be
used to implement finite impulse response filter designs for
DSP, which are advantageous for low-power applications. An ASIC
design for a low-power digital filter with low latency and power
gating can be carried out. These arithmetic subsystems can be
used to implement arithmetic and logic units and multiply and
accumulate units for DSP processors, improving system
performance at different levels of abstraction. The GDI-based
counter designs can be improved by using clock swing techniques
in flip-flops.
VII. APPENDIX
Appendix A: Structure of Multiply Accumulate Unit, Developed
Multiplier, Synthesis Outcome of MLA-I, Synthesis Outcome of
MLA-II, Simulation waveform of MLA-II as the rst value varies,
Synthesis Outcome of Adder implemented using the “+” operator on
the Synopsys Synplify Pro Synthesis Tool.
VIII. ACKNOWLEDGMENT
The work was carried out through the research facilities at the
Department of Electronics & Communication Engineering, Institute
of Aeronautical Engineering, Dundigal, Hyderabad, Telangana,
India, and the Department of Electronics & Communication
Engineering, MLRITM, Hyderabad. The authors would also like to
thank the authorities of JNTUH for encouraging this research
work.
AUTHORS PROFILE
Dr. S. China Venkateswarlu obtained his B.Tech (ECE), M.Tech
(DSCE), and Ph.D (DSP) and is working as a Professor in the
Dept. of ECE at the Institute of Aeronautical Engineering,
Hyderabad, TG, India. He has more than 22 years of teaching
experience and has published more than 25 papers in refereed
national and international journals. He has presented more than
15 papers in national and international journals. He has
reviewed one book on Digital Signal Processing from M/S Persons
Educations. He is an IEEE Member and a Life Member of ISTE, CSI,
IAENG, SDIWC, ISRD, and IFERP. He works as a reviewer for
different international journals (Elsevier-Signal Processing,
Edas). His areas of interest are Digital Signal Processing,
Speech Processing, Hearing Aid, Embedded Systems, and Digital
Image Processing.
Dr. N. Uday Kumar obtained his B.Tech, M.Tech, and Ph.D and is
working as a Professor in the Dept. of ECE at Marri Laxman Reddy
Institute of Technology & Management, Hyderabad, TG, India. He
has more than 10 years of teaching experience and has published
more than 6 papers in refereed national and international
journals. He has presented more than 9 papers in national and
international journals. He is a Life Member of MIE and IAENG.
His areas of interest are Embedded Systems and Digital Image
Processing.
N. Sandeep Kumar obtained his B.Tech and M.Tech and is working
as an Assistant Professor in the Dept. of ECE at the Institute
of Aeronautical Engineering, Hyderabad, TG, India. He has more
than 7 years of teaching experience and has published more than
6 papers in refereed national and international journals. He has
presented more than 5 papers in national and international
journals. His areas of interest are Digital Signal Processing,
Speech Processing, Hearing Aid, Embedded Systems, and Digital
Image Processing.
A. Karthik is a research scholar in the department of ECE,
Veltech Dr.RR & Dr.Sr University, Chennai. He obtained his
M.Tech (ECE) from Mallareddy Institute of Technology & Science
(JNTUH) and his B.Tech (ECE) from Vaageswari College of
Engineering (JNTUH). He worked as an Assistant Professor at
Narsimha Reddy Engineering College. His areas of interest are
Digital Signal Processing, Speech Processing, and Embedded
Systems. Email: karthik011190@gmail.com.
Dr. V. Vijay obtained his B.Tech, M.Tech, and Ph.D and is
working as a Professor in the Dept. of ECE at the Institute of
Aeronautical Engineering, Hyderabad, TG, India. He has more than
15 years of teaching experience and has published more than 20
papers in refereed national and international journals. He has
presented more than 10 papers in national and international
journals. He is a member of IEEE, ISTE, and IAENG. His areas of
interest are VLSI, Signal Processing, and Hearing Aid.
