Error Diluted Approximate Multipliers Using Positive and Negative Compressors

The document discusses using positive and negative compressors in different stages of a partial product reduction multiplier to reduce accumulated errors compared to using only one type of compressor. It proposes approximate multiplier designs using appropriately placed positive and negative compressors and evaluates them for image smoothing and convolutional neural network applications.

Error Diluted Approximate Multipliers Using Positive and Negative Compressors

Bindu G Gowda, Prashanth H C, Madhav Rao
IIIT Bangalore, Bangalore, India

2023 24th International Symposium on Quality Electronic Design (ISQED) | 979-8-3503-3475-3/23/$31.00 ©2023 IEEE | DOI: 10.1109/ISQED57927.2023.10129376

Abstract: Introducing approximation has shown significant benefits in performance and throughput, besides lowering on-chip power consumption and silicon footprint requirements. Approximation in digital computing was designed for and targeted towards error-resilient applications, primarily involving image or signal processing modules. Previous works focus on approximating various arithmetic operator designs, including dividers, multipliers, adders, subtractors and multiply-and-accumulate units. Approximating compressor designs for multipliers was found to improve performance, power and area effectively. In addition, they offer regularity in cascading the partial product bits. Conventional multiplier designs employ compressors of the same kind throughout the partial product reduction stages, leading to the accumulation of errors. This paper proposes to utilize two different types of compressors, positive and negative, in successive partial product reduction stages, with the intention of reducing the accumulated error. The proposed multiplier designs with appropriately placed positive and negative compressors along the stages and columns of the Partial Product Matrix (PPM) are investigated and characterized for hardware and error metrics. These designs were further evaluated for image smoothing and Convolutional Neural Network (CNN) applications. The CNN built for four datasets using the proposed approximate multipliers demonstrated accuracy comparable to that of an exact multiplier-based CNN in the Lenet-5 architecture.

Index Terms: Approximate multiplier, Compressor, Image processing, Gaussian smoothing, Approximate CNN

I. INTRODUCTION

The multiplier is one of the standard design elements seen in most compute-intensive processing modules [1], [2]. Due to the emergence of machine learning and neural network realization in hardware, it has become imperative to make power- and performance-efficient multiplier designs that cater to the large volume of data. Typically, the critical path of any image processing system design involves a multiplier unit, and the performance is constrained primarily by this compute-intensive multiplier block [3]. One of the common research goals is to make the multiplier design faster, and hence different architectures have evolved over the years, which include Booth Encoding, the compressor-based partial product reduction technique, Wallace tree, Dadda designs, and many more [4]–[9]. Each of these multiplier designs works on either generating the partial products faster, adding them efficiently, or arranging the partial products such that addition is performed swiftly to obtain the product bits. Most of these designs improve the computation in one of three hardware metrics: performance, energy, and silicon footprint [10]–[12]. Relaxing exact circuit implementation on the least significant part (LSP) of the design not only reduces the gate count but also offers power-saving benefits, a reduced silicon footprint with its associated cost benefits, and improved throughput of the designed system block [13]–[15]. However, the approximation technique generates erroneous results, making it applicable only to error-resilient applications. A few image and signal processing applications are considered error-tolerant, wherein the error in the processed output does not affect the overall inference [1], [5], [14], [16]. If the output error is confined to smaller values, then the overall decision drawn from the output image remains unchanged. To keep the error magnitude relatively small, most approximation schemes are applied to the less significant part of the design [17].

One of the easiest methods to design a hardware multiplier is to generate partial products and then accumulate them using compressor designs during the partial product reduction stage [18]–[22]. This approach is considered relatively easy to comprehend, and approximation at the partial product reduction stage is controllable [23]. In this stage, the approximation is introduced by placing inexact compressor designs, thereby achieving all the hardware benefits stated above [18], [20].

In the past, various approximate compressor designs have been designed, evaluated and applied in the partial product reduction stage [18], [19]. Compressors of different sizes, from 15:4 down to 3:2, were evaluated [24]. However, most of them generate one-sided errors, and employing them for multiplier design is likely to accumulate errors of relatively large magnitude. Compressors with a one-sided error distribution are likely to accumulate errors over multiple stages of partial product reduction, leading to a large error in the final product bits. Hence, it is crucial not only to design approximate compressors but also to minimize the cumulative error. One recent work [25] has shown that alternating positive and negative compressors in the partial product reduction stages reduces the error. However, much more improvement can be achieved by placing the approximate compressors wisely in the design.

The paper primarily focuses on eight different configurations of multiplier units designed by appropriately placing positive and negative compressors along the PPM stages and columns to reduce errors. Optimally balanced error-distributed compressor designs for both positive and negative variants are designed, evaluated and applied in the multiplier designs. These multipliers are characterized for hardware and error metrics to realize the benefits. Further, the proposed approximate multipliers are applied to image processing and CNN applications to realize the impact on quality metrics and classification accuracy, respectively. To the best of the authors' knowledge, this is the first time that the error-diluted technique in different configurations of approximate multipliers is evaluated and presented.

This paper is organized so that Section II describes the design of positive and negative compressors of different sizes, and Section III explains the novel approaches explored to build approximate multipliers that balance out or dilute the error accumulation when used in compute-intensive systems. Section IV gives a detailed analysis of error, performance and hardware utilization. Section V presents the application of all the designed multipliers in image smoothing and Convolutional Neural Networks (CNN).

II. PROPOSED APPROXIMATE COMPRESSORS

In this proposed work, approximate compressors of three different sizes, 3:2, 4:2 and 5:2, are designed and evaluated considering their error probability, statistical mean error ($E_{mean}$), error distribution and the direction of errors. The compressors are categorised as Positive and Negative based on the direction of their mean error ($E_{mean}$). The positive compressor is configured such that most of the errors are accumulated towards the positive side of the error distribution, whereas the negative compressors are configured with errors accumulated towards the negative side of the error distribution.

The inputs to the compressors are the partial products generated by performing the AND operation on the multiplier and multiplicand bits. Similar to previous works [26], it is assumed in this work that the multiplier and multiplicand bits are independent and uniformly distributed such that the probability of each of these bits being 1 is 1/2 and being 0 is 1/2. Hence, the probability that a partial product generated from these bits is 1 is 1/4 and that it is 0 is 3/4. The truth tables considered to design the 3:2 and 5:2 Positive Compressors are reported in Tables I and II respectively. Similarly, the truth tables considered to design the 3:2 and 5:2 Negative Compressors are reported in Tables III and IV respectively. Note that the tables list only the input combinations that produce output errors. The other combinations, which generate exact outputs, are not presented. In these tables, 'Err' represents the difference between the approximate result and the exact result, and 'P(Err)' represents the probability of occurrence of these errors due to a particular input combination. The statistical mean error is defined as:

$$E_{mean} = \sum_{i} P(Err)_i \times Err_i$$

In all compressor designs, the Sum bit is produced by performing the OR operation on all the inputs, and the Carry bit decides whether the compressor is positive or negative.

TABLE I: Error (Err) and Probability of Error (P(Err)) for 3-2 Positive Compressor.

a  b  c    Exact Sum   Apprx. Sum   Err   P(Err)
0  1  1    2           3             1    3/64
1  0  1    2           3             1    3/64
1  1  0    2           1            -1    3/64

TABLE II: Error (Err) and Probability of Error (P(Err)) for 5-2 Positive Compressor

a  b  c  d  e    Exact Sum   Apprx. Sum   Err   P(Err)
0  0  0  1  1    2           3             1    27/1024
0  0  1  0  1    2           3             1    27/1024
0  0  1  1  0    2           3             1    27/1024
0  1  0  0  1    2           3             1    27/1024
0  1  0  1  0    2           3             1    27/1024
0  1  1  0  0    2           1            -1    27/1024
0  1  1  1  1    4           3            -1    3/1024
1  0  0  0  1    2           3             1    27/1024
1  0  0  1  0    2           3             1    27/1024
1  0  1  0  0    2           3             1    27/1024
1  0  1  1  1    4           3            -1    3/1024
1  1  0  0  0    2           3             1    27/1024
1  1  0  1  1    4           3            -1    3/1024
1  1  1  0  1    4           3            -1    3/1024
1  1  1  1  0    4           3            -1    3/1024
1  1  1  1  1    5           3            -2    1/1024

TABLE III: Error (Err) and Probability of Error (P(Err)) for 3-2 Negative Compressor

a  b  c    Exact Sum   Apprx. Sum   Err   P(Err)
0  1  1    2           3             1    3/64
1  0  1    2           1            -1    3/64
1  1  0    2           1            -1    3/64

A. Design of 3-2 Approximate Compressors

The expression for the Sum bit of the 3-2 Approximate Compressors can be written as: $Sum = a + b + c$.
The expression for the Carry that makes this a Positive Compressor is: $Carry_{+ve} = bc + ac$.
The expression for the Carry that makes this a Negative Compressor is: $Carry_{-ve} = bc$.
The resulting statistical mean errors $E_{mean}$ for the Positive and Negative compressors are 3/64 and -3/64, respectively.

B. Design of 4-2 Approximate Compressors

The expression for the Sum bit of the 4-2 Approximate Compressors can be written as: $Sum = a + b + c + d$.
The expression for the Carry that makes this a Positive Compressor is: $Carry_{+ve} = (a + c)b + (b + c)d$.
The expression for the Carry that makes this a Negative Compressor is: $Carry_{-ve} = ab + cd$.
The resulting $E_{mean}$ values for the Positive and Negative Compressors are 17/256 and -19/256, respectively. The designed 4-2 approximate compressors are seen to be logically equivalent to the ones stated in [25].
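The mean-error figures quoted for the 3-2 and 4-2 compressors can be checked by brute force. The sketch below is illustrative only: it assumes that the compressor output value is 2 × Carry + Sum (the reading consistent with the "Apprx. Sum" values of 1 and 3 in the tables), that '+' in the Sum and Carry expressions denotes logical OR, and that every input is 1 with probability 1/4. The function and variable names are hypothetical and do not come from the paper.

```python
from fractions import Fraction
from itertools import product

P_ONE = Fraction(1, 4)  # probability that a partial-product bit is 1


def mean_error(n_inputs, carry):
    """Return E_mean = sum_i P(Err)_i * Err_i for an OR-sum approximate compressor."""
    total = Fraction(0)
    for bits in product((0, 1), repeat=n_inputs):
        exact = sum(bits)
        # Assumed output encoding: Carry has weight 2, Sum (OR of inputs) has weight 1.
        approx = 2 * carry(*bits) + (1 if any(bits) else 0)
        prob = Fraction(1)
        for b in bits:
            prob *= P_ONE if b else 1 - P_ONE
        total += prob * (approx - exact)
    return total


# Carry expressions from Sections II-A and II-B, written with bitwise AND/OR.
def carry_3_2_pos(a, b, c):
    return (b & c) | (a & c)


def carry_3_2_neg(a, b, c):
    return b & c


def carry_4_2_pos(a, b, c, d):
    return ((a | c) & b) | ((b | c) & d)


def carry_4_2_neg(a, b, c, d):
    return (a & b) | (c & d)


print(mean_error(3, carry_3_2_pos))  # 3/64
print(mean_error(3, carry_3_2_neg))  # -3/64
print(mean_error(4, carry_4_2_pos))  # 17/256
print(mean_error(4, carry_4_2_neg))  # -19/256
```

Using exact rational arithmetic (`fractions.Fraction`) keeps the results directly comparable with the fractions stated in the text, without floating-point rounding.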

Authorized licensed use limited to: SARAVANAKUMAR UMATHURAI. Downloaded on May 14,2024 at 08:13:46 UTC from IEEE Xplore. Restrictions apply.
C. Design of 5-2 Approximate Compressors

The expression for the Sum bit of the 5-2 Approximate Compressors can be written as: $Sum = a + b + c + d + e$.
The Carry that makes this a Positive Compressor is: $Carry_{+ve} = a(b + c + d + e) + (b + c)(d + e) + de$.
The Carry that makes this a Negative Compressor is: $Carry_{-ve} = a(b + c + d + e) + bc(d + e) + de(b + c)$.
The resulting $E_{mean}$ values for the Positive and Negative Compressors are 199/1024 and -71/1024, respectively.

TABLE IV: Error (Err) and Probability of Error (P(Err)) for 5-2 Negative Compressor

a  b  c  d  e    Exact Sum   Apprx. Sum   Err   P(Err)
0  0  0  1  1    2           1            -1    27/1024
0  0  1  0  1    2           1            -1    27/1024
0  0  1  1  0    2           1            -1    27/1024
0  1  0  0  1    2           1            -1    27/1024
0  1  0  1  0    2           1            -1    27/1024
0  1  1  0  0    2           1            -1    27/1024
0  1  1  1  1    4           3            -1    3/1024
1  0  0  0  1    2           3             1    27/1024
1  0  0  1  0    2           3             1    27/1024
1  0  1  0  0    2           3             1    27/1024
1  0  1  1  1    4           3            -1    3/1024
1  1  0  0  0    2           3             1    27/1024
1  1  0  1  1    4           3            -1    3/1024
1  1  1  0  1    4           3            -1    3/1024
1  1  1  1  0    4           3            -1    3/1024
1  1  1  1  1    5           3            -2    1/1024

Note that the current designs are finalized considering factors such as the number of gates used, hardware complexity and error distribution. The work attempts to design compressors that achieve a similar spread of errors in the positive and negative compressors, but in opposite directions with respect to each other, as shown in Figure 1 for the 5-2 positive and negative compressors.

Fig. 1: Error Distribution of 5-2 Positive and Negative Approximate Compressors.

III. PROPOSED NOVEL APPROXIMATE MULTIPLIERS

The conventional method of using a single type of compressor in all the reduction stages of a multiplier design results in the accumulation of errors in one direction; the error keeps increasing as the number of reduction stages or the multiplier size increases. Also, when such multipliers are used in large numbers in a bigger system, the magnitude of error increases, resulting in a system with very poor accuracy. A method to reduce such high error accumulation is explored in [25] by alternately employing positive and negative compressors, but only of size 4-2, in the partial-product reduction stages of the multiplier block. Multipliers designed using a sequence of positive and negative compressors are expected to dilute the accumulation of errors in the product bits. In this work, we have investigated as many configurations as possible by appropriately placing positive and negative compressors on the partial-product matrix of the multiplier design block. In addition, the compressors of sizes 3-2, 4-2 and 5-2 designed in Section II are also evaluated for improved performance.

The four different configurations of multiplier designs that are investigated in this work are as follows:

(a) Non Interleaving (NI): This configuration results in the traditional type of multiplier in which only one type of compressor is employed in all the reduction stages. The use of positive compressors results in a Positive Multiplier (PM), and the use of negative compressors results in a Negative Multiplier (NM) with NI configuration.
(b) Stage-wise Interleaving (SI): In this configuration, positive and negative compressors are used alternately in the subsequent reduction stages, as shown in Figure 2 (a). The type of compressor used in the first reduction stage defines the type of multiplier. The use of positive compressors in the first reduction stage, followed by negative compressors in the second reduction stage, and so on, makes it a Positive Multiplier (PM) with SI configuration; and the use of negative compressors in the first reduction stage, followed by positive compressors in the second reduction stage, and so on, makes it a Negative Multiplier (NM) with SI configuration.
(c) Column-wise Interleaving (CI): As the bit weight varies by a power of 2 along the columns of the Partial Product Matrix (PPM), so does the weight of the error. Hence, there is scope for alternately introducing positive and negative compressors along the subsequent columns. This configuration is shown in Figure 2 (b). Note that a column 'i' is assigned the same type of compressor, i.e., either positive or negative, in all the subsequent reduction stages. The type of compressor employed along the column with the highest height defines the type of multiplier. Using positive compressors along the column with the highest height in the PPM makes it a Positive Multiplier (PM) with CI configuration, and using negative compressors along the column with the highest height in the PPM makes it a Negative Multiplier (NM) with CI configuration.
(d) Column-cum-Stage-wise Interleaving (CSI): This is the hybrid configuration constructed by using both Stage-wise and Column-wise interleaving techniques to curb cumulative errors as well as errors along the bit positions, as shown in Figure 2 (c). Considering the columns 'i' and 'i ± 1' of the PPM, suppose a positive compressor
is applied along column 'i' of the first reduction stage; then a negative compressor is employed along the same column 'i' of the second reduction stage. Likewise, column 'i ± 1' of the first reduction stage is assigned a negative compressor, and a positive compressor is applied along column 'i ± 1' of the second reduction stage, and so on. Thus, this configuration employs positive and negative compressors alternately along the subsequent columns and reduction stages. The type of compressor employed along the column with the highest height in the PPM/first reduction stage defines the type of the multiplier, i.e., either PM with CSI or NM with CSI configuration.

Fig. 2: Reduction of the PPM of an 8 × 8 Positive Multiplier using the proposed techniques: (a) Stage-wise Interleaving (SI), (b) Column-wise Interleaving (CI), and (c) Column-cum-Stage-wise Interleaving (CSI).
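The four placement rules (NI, SI, CI and CSI) can be summarised as a parity rule on the stage index and the column distance from the tallest column. The sketch below is one possible reading of the descriptions above, not the authors' implementation. The function names, the 0-based indexing, and the choice of three reduction stages are assumptions made for illustration; an 8 × 8 PPM has 15 columns, with the tallest at index 7.

```python
def compressor_type(config, first, stage, column, ref_column):
    """Return 'P' or 'N' for the compressor placed at (stage, column).

    config     -- 'NI', 'SI', 'CI' or 'CSI'
    first      -- type at stage 0 / tallest column; it names the multiplier (PM or NM)
    ref_column -- index of the tallest PPM column (assumed reference for the alternation)
    """
    if first not in ("P", "N"):
        raise ValueError("first must be 'P' or 'N'")
    other = "N" if first == "P" else "P"
    if config == "NI":        # one compressor type everywhere
        flips = 0
    elif config == "SI":      # alternate from one reduction stage to the next
        flips = stage
    elif config == "CI":      # alternate from one column to the next, fixed across stages
        flips = abs(column - ref_column)
    elif config == "CSI":     # alternate along both stages and columns
        flips = stage + abs(column - ref_column)
    else:
        raise ValueError(f"unknown configuration: {config}")
    return first if flips % 2 == 0 else other


def layout(config, first="P", stages=3, columns=15, ref_column=7):
    """One string per reduction stage, one character per PPM column (index 0 first)."""
    return [
        "".join(compressor_type(config, first, s, c, ref_column) for c in range(columns))
        for s in range(stages)
    ]


for name in ("NI", "SI", "CI", "CSI"):
    print(name)
    for s, row in enumerate(layout(name)):
        print(f"  stage {s}: {row}")
```

Passing `first="N"` gives the corresponding Negative Multiplier layouts. In a real reduction tree only some columns hold a compressor in a given stage, so the grid indicates which type would be used if one is placed there.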
IV. ERROR AND PERFORMANCE ANALYSIS OF APPROXIMATE MULTIPLIERS

The distribution of errors against error magnitude/distance achieved for all 8 × 8 approximate multiplier designs is shown in Figure 3. As expected, the PM and NM with NI configured multiplier designs follow the same trend as the reference multiplier design [25]. Evidently, most error values are accumulated in the positive direction for PM and in the negative direction for NM. The use of approximate multipliers with SI, CI, and CSI configurations primarily helps in reducing the error count and in achieving a balanced error distribution. Error metrics such as Error Rate (ER), Mean Absolute Error (MAE), Root Mean Square Error (ERMS), Normalized Mean Error Distance (NMED), and Number of Effective Bits (NoEB) are also evaluated on all the proposed multipliers and compared with the reference multiplier from [25], as shown in Figures 4 and 5.

Fig. 3: Error Distribution of eight different Positive and Negative Multiplier Designs along with a pair multiplier design referred to as Park from [25].

Figure 4 clearly shows that all the multiplier configurations proposed in this work (NI, SI, CI and CSI), for both positive and negative types, achieve a lower error rate than the multipliers built using the configuration proposed in [25] (Park). It is also evident that the PM and NM with CSI configuration offer the lowest error rate compared with the other proposed 8 × 8 designs. Compared with the design from the literature, the CSI multiplier showcases 2.35% and 1.1% improvements in error rate for positive and negative multipliers, respectively. The SI configured design closely matches the CSI design. The CI configuration also offers a low error rate when compared to the balanced multiplier design of Park et al. [25]. There is hardly any difference in the other error metrics, including NoEB, NMED, ERMS, and MAE, across all the proposed multiplier designs, as shown in Figure 5 (a-d). Except for the CSI configuration, all other configurations, including NI, SI, CI and Park [25], show disparate error metrics between positive and negative multipliers, and the negative multiplier consistently offers lower error metrics than its corresponding positive multiplier.

Fig. 4: Error Rate of proposed Positive and Negative Multipliers against Park [25].

Fig. 5: Error Analysis of the proposed Multiplier Designs.

All the multipliers discussed in this work were synthesized in the Cadence Genus tool using the gpdk45 (45nm Generic PDK) library to perform the hardware characterization. The NI, CI and CSI configured positive and negative multipliers show reduced critical path delay but incur more silicon footprint when compared to the literature work referred to as Park, as shown in Figure 6 (a-b). Among all the eight multipliers, the CSI and NI based negative multipliers offer the lowest critical path delay, with an improvement of 5.30% over the reference multiplier design. The power consumption of the eight variant designs remains similar to that of the reference designs. Between the positive and negative multiplier designs for the CSI configuration, the negative multiplier design presents better hardware characteristics than the positive multiplier design.

Fig. 6: Hardware Analysis of: (a) Positive Multipliers, (b) Negative Multipliers.

V. APPLICATIONS

A. Image Processing: Gaussian Smoothing

All the multipliers discussed in this work are used in an image smoothing application on the standard grayscale images Lena and Cameraman using a Gaussian filter. The resulting Gaussian filtered images for Lena are presented in Figure 7.

Fig. 7: Gaussian Filtered Images using positive multipliers of the configurations: (a) PM NI, (b) PM SI, (c) PM CI, (d) PM CSI; SSIM extracted output images post filtering using positive multipliers of the configurations: (e) PM NI, (f) PM SI, (g) PM CI, (h) PM CSI; Gaussian Filtered Images using negative multipliers of the configurations: (i) NM NI, (j) NM SI, (k) NM CI, (l) NM CSI; SSIM extracted output images post filtering using negative multipliers of the configurations: (m) NM NI, (n) NM SI, (o) NM CI, (p) NM CSI.
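The smoothing experiment can be mimicked in software by routing every pixel-by-coefficient product through a pluggable multiply function. The following is a minimal sketch under stated assumptions: the paper does not give the kernel, so a common 3 × 3 integer Gaussian kernel is assumed; `toy_approx_mul` is only a placeholder error model and is not one of the proposed multipliers; and all names are hypothetical. PSNR, one of the quality metrics used in this section, is computed with its standard definition for 8-bit images.

```python
import math

# Assumed 3x3 integer Gaussian kernel (weights sum to 16); not specified in the paper.
KERNEL = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]


def smooth(image, mul):
    """Gaussian-smooth a 2-D list of 8-bit pixels; every product goes through mul(a, b)."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    yy = min(max(y + ky - 1, 0), h - 1)  # replicate border pixels
                    xx = min(max(x + kx - 1, 0), w - 1)
                    acc += mul(image[yy][xx], KERNEL[ky][kx])
            out[y][x] = min(255, max(0, acc // 16))  # normalise and clamp to 8 bits
    return out


def psnr(ref, test):
    """Peak signal-to-noise ratio in dB between two equally sized 8-bit images."""
    n = len(ref) * len(ref[0])
    mse = sum((r - t) ** 2 for rr, tr in zip(ref, test) for r, t in zip(rr, tr)) / n
    return math.inf if mse == 0 else 10 * math.log10(255 ** 2 / mse)


def exact_mul(a, b):
    return a * b


def toy_approx_mul(a, b):
    # Placeholder error model (forces the product LSB to 1); NOT a design from this work.
    return (a * b) | 1


image = [[(17 * x + 31 * y) % 256 for x in range(8)] for y in range(8)]
reference = smooth(image, exact_mul)
approximate = smooth(image, toy_approx_mul)
print(psnr(reference, approximate))
```

To evaluate a specific design, `toy_approx_mul` would be replaced by a bit-accurate software model of that multiplier, and the synthetic `image` by the actual test image.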
All the filtered images using the proposed variants of approximate multiplier designs are comparable to those extracted from the exact multiplier design. The Structural Similarity Index (SSIM) and Peak Signal-to-Noise Ratio (PSNR) are determined to evaluate the quality of the filtered output images. As reported in Tables V and VI, PM and NM with CSI configuration exhibit the best SSIM for both Lena and Cameraman images. Between the positive and negative multiplier designs, the negative multiplier design consistently reported better PSNR for both images under investigation across the different variants of multiplier designs. Additionally, the critical path delay is also favourable for negative multiplier designs. Among the negative multiplier designs, the CI multiplier design reported slightly better results compared to others. For the positive multiplier designs, the CSI design offers the best PSNR. When compared to the earlier work referred to as Park, PM CSI, PM CI, NM CSI, and NM CI show benefits in PSNR across both Lena and Cameraman images, with a similar SSIM index.

TABLE V: Performance Measure of Positive Multipliers in Gaussian Smoothing Application.

                  Lena                 Cameraman
                  SSIM     PSNR        SSIM     PSNR
PM Park [25]      0.9974   45.9055     0.9975   46.3090
PM NI             0.9972   46.2881     0.9960   45.2469
PM SI             0.9970   48.1556     0.9957   46.3542
PM CI             0.9973   48.2737     0.9965   47.4166
PM CSI            0.9974   50.7835     0.9965   48.4852

TABLE VI: Performance Measure of Negative Multipliers in Gaussian Smoothing Application.

                  Lena                 Cameraman
                  SSIM     PSNR        SSIM     PSNR
NM Park [25]      0.9975   48.8719     0.9974   48.7751
NM NI             0.9971   48.1021     0.9973   48.5188
NM SI             0.9974   53.0217     0.9976   52.2254
NM CI             0.9974   52.4603     0.9974   51.5770
NM CSI            0.9975   52.1785     0.9976   51.2997

B. CNN Inference

All eight 8 × 8 approximate multipliers proposed in this work, along with the multipliers proposed in [25] and [11], are evaluated in the convolution layers of the Lenet-5 architecture, which is a 5-layer CNN model. The model was trained on MNIST, Fashion MNIST, CIFAR10 and CIFAR100. The reduction in validation accuracy compared to exact 8-bit multipliers is shown in Figure 8. The model is quantized to 8 bits, with ≈ 0.01% reduction in validation accuracy because of quantization, before using the proposed approximate multipliers.

Two approximate multiplier designs proposed in the literature [11], reported with error rates of 22% and 49%, were used in this application, and the best of them resulted in an accuracy loss of 37.62%, 67.04%, 76.99%, and 89.14% for the datasets MNIST, Fashion MNIST, CIFAR10, and CIFAR100, respectively. Despite exhibiting lower error rates, these multipliers could not perform well in comparison with the proposed multipliers when employed in the CNN, due to their unidirectional error distribution.

As can be inferred from Figure 8, negative multipliers exhibit better performance than positive multipliers. However, among the positive multipliers, PM CSI shows the least impact on model performance degradation. Likewise, among negative multipliers, the NM CI configuration showcases the best accuracy with negligible loss.

Fig. 8: Percentage reduction in validation accuracy for all 8 proposed and Park [25] multipliers with respect to the exact multiplier, when applied to the Lenet-5 CNN trained on four different datasets: MNIST, Fashion-MNIST, CIFAR10 and CIFAR100.

VI. CONCLUSIONS

Compressors of different error groups were appropriately utilized to reduce the error rate of the 8 × 8 bit multiplier designs. Eight forms of multiplier designs packed with positive and negative approximate compressors were characterized and evaluated. These compressors, defined by their error distribution, were optimally utilized to reduce the accumulated errors in the inexact multiplier design. The CSI configured multiplier designs, which stack compressors of different error groups simultaneously along columns and stages, are reported to yield the lowest error rate among the positive and negative multiplier designs. The CSI configured negative multiplier exhibits the lowest critical path delay compared to other design variants. Gaussian filtering using the CSI arrangement for the negative multiplier presents the best performance metrics in terms of SSIM and PSNR. The proposed multipliers were successfully implemented across four different CNN implementations of the Lenet-5 architecture, and among all eight variants, the CSI and CI configured positive and negative multipliers, respectively, showed the best accuracy. In the future, multiply-and-accumulate units designed using a combination of positive and negative multipliers for neural network implementation are planned, to showcase significant performance improvements.

REFERENCES

[1] R. Zendegani, M. Kamal, M. Bahadori, A. Afzali-Kusha, and M. Pedram, “Roba multiplier: A rounding-based approximate multiplier for high-speed yet energy-efficient digital signal processing,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 25, no. 2, pp. 393–401, 2017.
[2] H. Jiang, J. Han, F. Qiao, and F. Lombardi, “Approximate radix-8 booth multipliers for low-power and high-performance operation,” IEEE Transactions on Computers, vol. 65, no. 8, pp. 2638–2644, 2016.
[3] S. E. Mironov, O. I. Bureneva, and A. D. Milakin, “Analysis of multiplier architectures for neural networks hardware implementation,” in 2022 III International Conference on Neural Networks and Neurotechnologies (NeuroNT), 2022, pp. 32–35.
[4] Y.-J. Chang, Y.-C. Cheng, S.-C. Liao, and C.-H. Hsiao, “A low power radix-4 booth multiplier with pre-encoded mechanism,” IEEE Access, vol. 8, pp. 114 842–114 853, 2020.
[5] W. Liu, L. Qian, C. Wang, H. Jiang, J. Han, and F. Lombardi, “Design of approximate radix-4 booth multipliers for error-tolerant computing,” IEEE Transactions on Computers, vol. 66, no. 8, pp. 1435–1441, 2017.
[6] A. A. Del Barrio, R. Hermida, and S. Ogrenci-Memik, “A combined arithmetic-high-level synthesis solution to deploy partial carry-save radix-8 booth multipliers in datapaths,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 66, no. 2, pp. 742–755, 2019.
[7] R. Pilipović and P. Bulić, “On the design of logarithmic multiplier using radix-4 booth encoding,” IEEE Access, vol. 8, pp. 64 578–64 590, 2020.
[8] L. Qian, C. Wang, W. Liu, F. Lombardi, and J. Han, “Design and evaluation of an approximate wallace-booth multiplier,” in 2016 IEEE International Symposium on Circuits and Systems (ISCAS), 2016, pp. 1974–1977.
[9] H. Waris, C. Wang, W. Liu, and F. Lombardi, “Axbms: Approximate radix-8 booth multipliers for high-performance fpga-based accelerators,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 68, no. 5, pp. 1566–1570, 2021.
[10] Y. Mannepalli, V. B. Korede, and M. Rao, Novel Approximate Multiplier Designs for Edge Detection Application. New York, NY, USA: Association for Computing Machinery, 2021, p. 371–377. [Online]. Available: https://doi.org/10.1145/3453688.3461482
[11] P. H C, S. S R, B. G Gowda, and M. Rao, “Design and evaluation of in-exact compressor based approximate multipliers,” in Proceedings of the Great Lakes Symposium on VLSI 2022, ser. GLSVLSI ’22. New York, NY, USA: Association for Computing Machinery, 2022, p. 431–436. [Online]. Available: https://doi.org/10.1145/3526241.3530320
[12] S. Singh, P. K. Pothula, and M. Rao, “Design and evaluation of on-chip dct accelerators based on novel approximate reverse carry propagate adders,” in 2022 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 2022, pp. 8–13.
[13] J. Liang, J. Han, and F. Lombardi, “New metrics for the reliability of approximate and probabilistic adders,” IEEE Transactions on Computers, vol. 62, no. 9, pp. 1760–1771, 2013.
[14] H. Jiang, C. Liu, L. Liu, F. Lombardi, and J. Han, “A review, classification, and comparative evaluation of approximate arithmetic circuits,” J. Emerg. Technol. Comput. Syst., vol. 13, no. 4, aug 2017. [Online]. Available: https://doi.org/10.1145/3094124
[15] S. Mittal, “A survey of techniques for approximate computing,” ACM Comput. Surv., vol. 48, no. 4, mar 2016. [Online]. Available: https://doi.org/10.1145/2893356
[16] M. Ha and S. Lee, “Multipliers with approximate 4–2 compressors and error recovery modules,” IEEE Embedded Systems Letters, vol. 10, no. 1, pp. 6–9, 2018.
[17] A. Böttcher, M. Kumm, and F. de Dinechin, “Resource optimal truncated multipliers for fpgas,” in 2021 IEEE 28th Symposium on Computer Arithmetic (ARITH), 2021, pp. 102–109.
[18] A. Momeni, J. Han, P. Montuschi, and F. Lombardi, “Design and analysis of approximate compressors for multiplication,” IEEE Transactions on Computers, vol. 64, no. 4, pp. 984–994, 2015.
[19] S. Venkatachalam, E. Adams, H. J. Lee, and S.-B. Ko, “Design and analysis of area and power efficient approximate booth multipliers,” IEEE Transactions on Computers, vol. 68, no. 11, pp. 1697–1703, 2019.
[20] T. Kong and S. Li, “Design and analysis of approximate 4–2 compressors for high-accuracy multipliers,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 29, no. 10, pp. 1771–1781, 2021.
[21] A. Saha, R. Pal, A. G. Naik, and D. Pal, “Novel cmos multi-bit counter for speed-power optimization in multiplier design,” AEU - International Journal of Electronics and Communications, vol. 95, pp. 189–198, 2018.
[22] Z. Wang, G. Jullien, and W. Miller, “A new design technique for column compression multipliers,” IEEE Transactions on Computers, vol. 44, no. 8, pp. 962–970, 1995.
[23] A. Cilardo, D. De Caro, N. Petra, F. Caserta, N. Mazzocca, E. Napoli, and A. Strollo, “High speed speculative multipliers based on speculative carry-save tree,” Circuits and Systems I: Regular Papers, IEEE Transactions on, vol. 61, pp. 3426–3435, 12 2014.
[24] W. Guo and S. Li, “Fast binary counters and compressors generated by sorting network,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 29, no. 6, pp. 1220–1230, 2021.
[25] J. K. Gunho Park and Y. Lee, “Design and analysis of approximate compressors for balanced error accumulation in mac operator,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 68, no. 7, pp. 2950–2961, 2021.
[26] D. Esposito, A. G. M. Strollo, E. Napoli, D. De Caro, and N. Petra, “Approximate multipliers based on new approximate compressors,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 65, no. 12, pp. 4169–4182, 2018.
