Developing_and_Assessinginexact_Multiplierarchitec (1)
Thiruvenkadam Krishnan
Research Article
Keywords: Approximate computing, inexact multiplier, Partial product reduction circuitry, Design
parameter, picture multiplication and picture sharpening
DOI: https://doi.org/10.21203/rs.3.rs-4476303/v1
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Abstract
In approximate computing, the inexact multiplier is a core building block
for error-tolerant applications. This article examines three inexact
multiplier architectures tailored for image processing tasks. The approach
partitions the partial product stage into smaller modules and then applies
decoder logic and truncation techniques to obtain the final result of the
suggested multiplier. The resulting 8×8 imprecise multipliers have reduced
design overhead and reasonable error metrics. Implementation with the Cadence
RTL Compiler using TSMC 90 nm technology shows substantial area and power
savings compared to conventional imprecise multipliers. Compared with
precise multipliers, one of the proposed approximate models reduces area and
power by 37.19% and 46.14%, respectively, while maintaining acceptable error
metrics. Furthermore, in comparison to alternative approximate multiplier
designs, the suggested 8×8 multiplier shows superior performance metrics. It
achieves acceptable mean Structural Similarity Index (SSI) values, making it
particularly suitable for tasks such as image multiplication and sharpening.
Springer Nature 2021 LATEX template
2 Research paper
1 Introduction
Approximate computing has indeed captured significant attention due to
its potential advantages in error-tolerant applications [1], especially within
arithmetic circuits [2]. By embracing approximate computing methodolo-
gies, systems can potentially achieve faster operation, improved efficiency,
and reduced power consumption. These benefits are particularly appealing in
domains such as human vision or hearing, where a certain level of imprecision
can be tolerated without significantly impacting the user experience. In the
context of arithmetic circuits, approximate computing techniques offer various
strategies for optimizing performance and resource utilization. These strate-
gies may involve trading off computational accuracy for gains in efficiency.
For instance, reducing the precision of arithmetic operations or employing
approximation algorithms can lead to computational savings without significantly
compromising overall system functionality. In applications like human vision,
where perception is inherently tolerant to certain degrees of error or impreci-
sion, approximate computing can be leveraged to exploit these characteristics.
Systems can achieve the desired functionality with reduced computational
overhead by judiciously introducing controlled errors or simplifications in
computational processes. In arithmetic circuits, this amounts to trading
circuit parameters against the overall performance of the system, for example
by introducing controlled errors in computations [3]. In the study by [4], three
innovative designs of approximate 4-2 compressors are presented, seamlessly
integrated into the partial product reduction circuit of a multiplier. Each
design achieves high accuracy while keeping the error metrics within
acceptable limits. The multiplier architecture combines truncation and
approximation methods to obtain further improvements in power, area, and
delay. Comparative evaluation against precise designs and alternative
approximation strategies shows substantial benefits in power consumption,
latency, and area utilization for image-sharpening applications. In the
letter [5], a design is introduced that integrates an error
correction unit, based on a previous inexact 4-2 compressor concept. Compared
with other 4-2 compressor-based imprecise multipliers, this design
offers higher accuracy, requires fewer hardware resources, and consumes
less power, even with the incorporation of the fault-tolerant
mechanism. In [6], generate and propagate signals are used to
alter the partial products of the multiplier, paving the way for effective
approximate multipliers. Simple OR gates approximate the altered generate
partial products. To reduce the remaining partial products, the paper also
proposes approximate versions of the adder blocks. By contrasting with
accurate designs, the paper puts forth
two variations of approximate multipliers, which yield substantial reductions
in both area and power consumption. In references [7] and [8], specialized
low-power and energy-efficient approximate arithmetic multipliers have been
meticulously designed to cater to the demands of image sharpening and JPEG
compression applications. The hybrid and high-speed imprecise multiplier has
been purposefully engineered for image multiplication and conventional neural
network-based applications, as detailed in references [9] and [10]. The authors
of [11] have devised an imprecise multiplier that approximates the partial
product reduction and creation steps to reduce computing expenses. Extensive
hardware evaluations demonstrate that the proposed solutions exhibit superior
performance in terms of design parameters compared to existing approaches.
Validation using image sharpening and image multiplication applications indi-
cates that the suggested designs offer a better balance between performance
and image quality. Additionally, it’s worth noting that several imprecise multi-
plier structures have been developed for image filtering [12], edge detection [13],
and neural network applications [14]. These structures offer versatile solutions
to enhance performance across various domains.
Initially, utilizing conventional AND gate logic, the 8×8 exact multiplier (A
and B) produces sixty-four partial product values. Here, A represents the mul-
tiplier, while B represents the multiplicand. These partial products undergo
processing via a large number of Half-adder (HA) and
Full-adder (FA) blocks to derive the product value, thereby escalating design
complexity. To confront this challenge, a novel approach introduces a 2-bit-
decoder logic-based 8×8 imprecise multiplier, as detailed in [16]. Within this
structure, the 2-bit decoder logic facilitates the grouping of the multiplier
(A) bits on the LSB side (A6-A0), while exact AND logic operates on the
MSB (A8-A7) bits for partial product generation. As a result, this method
effectively reduces the row count for partial products compared to the pre-
cise model. Nonetheless, despite outperforming previous imprecise models, the
2-bit decoder logic-based imprecise multiplier still contends with high design
complexity. To address this further, [15] proposes a 3-bit decoder logic-based
imprecise multiplier. Here, the 8-bit multiplier bits (A) are grouped via
3-bit decoder logic on the LSB side (A6-A0), with exact AND logic utilized on
the MSB side (A8-A7) for partial product generation. This particular
imprecise multiplier structure exhibits superior performance in minimizing area
overhead compared to prior inexact multiplier designs presented in the liter-
ature with admissible error metrics. However, despite these advancements, a
notable research gap remains concerning circuit parameters and the quality of
outcomes.
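As a minimal sketch of the exact baseline described above, the following Python model forms the sixty-four AND-gate partial products of an 8×8 multiplier and accumulates them with their binary weights. The function names are hypothetical, and the plain sum stands in for the HA/FA reduction tree; it is a behavioural illustration, not the hardware structure of any cited design.

```python
def and_partial_products(a, b, width=8):
    """Return a width x width grid where entry [i][j] = A_i AND B_j."""
    return [
        [((a >> i) & 1) & ((b >> j) & 1) for j in range(width)]
        for i in range(width)
    ]


def accumulate(partial_products):
    """Sum all partial product bits, each weighted by 2**(i + j).

    In hardware this accumulation is done by half-adder and full-adder
    blocks; here an integer sum models the same arithmetic.
    """
    return sum(
        bit << (i + j)
        for i, row in enumerate(partial_products)
        for j, bit in enumerate(row)
    )


if __name__ == "__main__":
    grid = and_partial_products(0b10110101, 0b01101110)
    print(sum(len(row) for row in grid))  # number of partial product bits
    print(accumulate(grid) == 0b10110101 * 0b01101110)
```

Every row removed from this grid removes adder cells from the reduction stage, which is why the decoder-based designs discussed above focus on reducing the row count.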
evenly partitioned into Most Significant Bit (MSB) and Least Significant Bit
(LSB) components. Specifically, (A3-A0) and (A7-A4) denote the LSB and
MSB bits of the multiplier, while (B3-B0) and (B7-B4) denote the LSB and
MSB bits of the multiplicand. The partial product values p7-p0 result from
the multiplication (B3-B0 × A3-A0); p15-p8 arise from (B7-B4 × A3-A0);
p23-p16 stem from (B3-B0 × A7-A4); and p31-p24 emerge from
(B7-B4 × A7-A4). Following the production of the partial product terms, the
partial product reduction circuitry uses adder components to derive the final
outcome. The range P15-P0 denotes the final product value obtained from the
8×8 multiplication.
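The partitioning above can be checked with a short behavioural sketch. It assumes exact 4×4 sub-multipliers (the inexact sub-modules of the proposed designs are not modelled) and uses hypothetical function names. The four 8-bit sub-products are recombined with shifts of 0, 4, 4, and 8 bit positions.

```python
def split_nibbles(x):
    """Split an 8-bit operand into (MSB nibble, LSB nibble)."""
    return (x >> 4) & 0xF, x & 0xF


def multiply_8x8_partitioned(a, b):
    """Exact 8x8 product assembled from four exact 4x4 sub-products."""
    a_hi, a_lo = split_nibbles(a)  # (A7-A4), (A3-A0)
    b_hi, b_lo = split_nibbles(b)  # (B7-B4), (B3-B0)

    p_7_0 = b_lo * a_lo    # p7-p0:   (B3-B0) x (A3-A0), weight 2**0
    p_15_8 = b_hi * a_lo   # p15-p8:  (B7-B4) x (A3-A0), weight 2**4
    p_23_16 = b_lo * a_hi  # p23-p16: (B3-B0) x (A7-A4), weight 2**4
    p_31_24 = b_hi * a_hi  # p31-p24: (B7-B4) x (A7-A4), weight 2**8

    # Reduction stage: align each sub-product by its weight and add.
    return p_7_0 + (p_15_8 << 4) + (p_23_16 << 4) + (p_31_24 << 8)


if __name__ == "__main__":
    print(multiply_8x8_partitioned(0xB5, 0x6E) == 0xB5 * 0x6E)
```

Replacing any of the four sub-products with an approximate version changes only that term, so the error contributed by each sub-module scales with its shift weight.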
Conversely, the square denotes the partial product generation of the
(B3-B0 × A7-A4) sub-module, employing both 3-bit decoder logic and AND logic.
This approach entails grouping the least significant bits of the multiplier
(A6-A4) via 3-bit decoder logic, while the most significant bit (A7) is
utilized in AND logic to produce imprecise partial product results.
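To illustrate how grouping reduces the row count, the sketch below models the (B3-B0 × A7-A4) sub-module with one decoder-selected row for (A6-A4) and one AND row for A7, giving two rows instead of four. This is an exact reference version under stated assumptions: the text here does not specify how the decoder outputs are simplified to make the result imprecise, so that step is not reproduced, and all function names are hypothetical.

```python
def decode_3bit(sel):
    """One-hot 3-to-8 decoder: output line k is 1 only when sel == k."""
    return [1 if sel == k else 0 for k in range(8)]


def submodule_rows(a_hi, b_lo):
    """Return (decoder_row, and_row) for a_hi = (A7-A4), b_lo = (B3-B0)."""
    sel = a_hi & 0b111     # grouped bits (A6-A4)
    msb = (a_hi >> 3) & 1  # remaining bit A7

    # The active decoder line selects the multiple k * B as a single row.
    lines = decode_3bit(sel)
    decoder_row = sum(line * (k * b_lo) for k, line in enumerate(lines))

    # A7 gates B with plain AND logic; its weight within the nibble is 2**3.
    and_row = (b_lo if msb else 0) << 3
    return decoder_row, and_row


if __name__ == "__main__":
    rows = submodule_rows(0b1011, 0b0110)
    print(len(rows), sum(rows) == 0b1011 * 0b0110)
```

In this exact form the two rows always sum to the true sub-product; an inexact design would trade some of that accuracy for simpler selection logic.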
Fig. 5 The numerical example of the proposed 8x8 inexact multiplier Design-3.
Fig. 6 The numerical example of one of the proposed 4x4 inexact multipliers.
Fig. 7 The numerical example of one of the proposed 4x4 inexact multipliers using 2-bit
decoder logic.
3 Hardware realization
The suggested and existing circuits were implemented in
TSMC 90 nm technology using the Cadence RTL Compiler v7.1 (slow-normal
library). Both the existing and the proposed approximate multipliers were
described in Verilog. Table 1 provides a detailed comparison of 8 × 8
imprecise multipliers based on circuit parameters. Additionally, Table 1
presents a comprehensive analysis of accuracy
existing designs, excluding [9] D1 and [14]. In terms of circuit parameters, when
compared to [9] D1 , the suggested multiplier D3 showcased reductions in area,
delay, power, and ADP by 29.09%, 35.00%, 12.65%, and 38.08%, respectively.
Similarly, in comparison to [14] based on circuit parameters, the suggested
multiplier D3 displayed reductions in area, delay, power, and ADP by 13.63%,
12.03%, 1.25%, and 14.74%, respectively. Table 2 indicates that the
strength of the suggested approximate multipliers lies in their capacity
to achieve a favorable balance between ADP and image quality.
5 Conclusion
In this paper, we have devised three 8×8 approximate multiplier
structures employing decoding and truncation schemes. The primary aim is to
improve the design parameters by minimizing the number of partial product
rows. Compared with precise and existing inexact multipliers, the presented
8 × 8 model reduces the design parameters while maintaining acceptable
accuracy metrics. In particular, comparing one of the proposed 8 × 8
approximate multipliers with accurate multipliers reveals reductions in
area, delay, power, and ADP of 37.19%, 21.68%, 46.14%, and 50.82%,
respectively. Because the proposed models achieve a favorable balance between
design parameters and image quality, they are promising candidates
for many image-processing applications.
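As a quick consistency check on the figures above, and assuming ADP denotes the area-delay product (the usual meaning, though it is not defined in this text), the ADP reduction follows from the individual area and delay reductions. The function name below is hypothetical.

```python
def combined_reduction(area_reduction_pct, delay_reduction_pct):
    """Percentage reduction of a product of two quantities.

    If area shrinks to (1 - a) and delay to (1 - d) of their original
    values, their product shrinks to (1 - a) * (1 - d).
    """
    area_ratio = 1.0 - area_reduction_pct / 100.0
    delay_ratio = 1.0 - delay_reduction_pct / 100.0
    return (1.0 - area_ratio * delay_ratio) * 100.0


if __name__ == "__main__":
    # Area and delay reductions quoted in the conclusion.
    print(f"{combined_reduction(37.19, 21.68):.2f}%")
```

With the quoted 37.19% and 21.68%, this gives roughly 50.81%, which agrees with the reported 50.82% to within the rounding of the inputs.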
References
[1] P. J. Edavoor, S. Raveendran and A. D. Rahulkar, "Approximate
Multiplier Design Using Novel Dual-Stage 4:2 Compressors," in IEEE Access,
vol. 8, pp. 48337-48351, 2020, doi: 10.1109/ACCESS.2020.2978773.
[6] S. Venkatachalam and S.-B. Ko, "Design of Power and Area Efficient
Approximate Multipliers," in IEEE Transactions on Very Large Scale
Integration (VLSI) Systems, vol. 25, no. 5, pp. 1782-1786, May 2017, doi:
10.1109/TVLSI.2016.2643639.