paper6

This paper presents energy-efficient approximate multiplication techniques for digital signal processing (DSP) and classification applications that tolerate computational errors. The proposed multipliers operate on short segments taken from a few fixed bit positions of each operand and can consume up to 58% less energy per operation than a precise multiplier, with an average computational error of about 1%. The results show that such small errors do not notably affect the quality of DSP and classification tasks, making the technique suitable for energy-constrained devices.

Energy-Efficient Approximate Multiplication for Digital Signal Processing and Classification Applications
Srinivasan Narayanamoorthy, Hadi Asghari Moghaddam, Zhenhong Liu, Taejoon Park, Member, IEEE,
and Nam Sung Kim, Senior Member, IEEE

Abstract— The need to support various digital signal processing (DSP) and classification applications on energy-constrained devices has steadily grown. Such applications often extensively perform matrix multiplications using fixed-point arithmetic while exhibiting tolerance for some computational errors. Hence, improving the energy efficiency of multiplications is critical. In this paper, we propose multiplier architectures that can trade off computational accuracy against energy consumption at design time. Compared to a precise multiplier, the proposed multiplier can consume 58% less energy/op with an average computational error of ∼1%. Finally, we demonstrate that such a small computational error does not notably impact the quality of DSP and the accuracy of classification applications.

I. INTRODUCTION
Achieving high energy efficiency has become a key design objective for embedded and mobile computing devices due to their limited battery capacity and power budget. To improve the energy efficiency of such computing devices, significant effort has already been devoted at various levels, from software to architecture, and all the way down to circuit and technology levels.

Embedded and mobile computing devices are frequently required to execute some key digital signal processing (DSP) and classification applications. To further improve the energy efficiency of executing such applications, first, dedicated specialized processors are often integrated in computing devices. It has been reported that the use of such specialized processors can improve energy efficiency by 10∼100× compared to general-purpose processors at the same voltage and technology generation [1].

Second, many DSP and classification applications heavily rely on complex probabilistic mathematical models and are designed to process information that typically contains noise. Thus, for some computational error, they exhibit graceful degradation in overall DSP quality and classification accuracy instead of a catastrophic failure. Such computational error tolerance has been exploited by trading accuracy for energy consumption (e.g., [2]).

Finally, these algorithms are initially designed and trained with floating-point (FP) arithmetic, but they are often converted to fixed-point (FxP) arithmetic due to the area and power cost of supporting FP units in embedded computing devices [3]. Although this conversion process leads to some loss of computational accuracy, it does not notably affect the quality of DSP and the accuracy of classification applications due to computational error tolerance.

Most of such algorithms extensively perform matrix multiplications as their fundamental operation, while a multiplier is typically an inherently energy-hungry component. To improve the energy efficiency of multipliers, previous studies have explored various techniques exploiting computational error tolerance. They can be classified into three categories: (i) aggressive voltage scaling [4,5], (ii) truncation of bit-width [4,6], and (iii) use of inaccurate building blocks [7]. Chippa et al. proposed scalable effort hardware design and explored algorithm-, architecture-, and circuit-level scaling to minimize energy consumption while offering acceptable classification quality by aggressively scaling voltage and truncating least-significant bits [4]. Kulkarni et al. proposed an under-designed 16 × 16 multiplier using inaccurate 2 × 2 partial product generators (PPGs) while guaranteeing the minimum and maximum accuracy fixed at design time. Each PPG has fewer transistors than the accurate 2 × 2 one, reducing both dynamic and leakage energy at the cost of some accuracy loss. Babić et al. proposed a novel iterative log approximate multiplier using leading one detectors (LODs) to support variable accuracy [8].

In this paper, we propose an approximate multiplication technique that takes m consecutive bits (i.e., an m-bit segment) from each n-bit operand, where m is equal to or greater than n/2. An m-bit segment can start only from one of two or three fixed bit positions, depending on where the leading one bit is located for a positive number. This approach can provide much higher accuracy than one simply truncating the LSBs, because it captures the more significant bits more effectively. Although we can capture m-bit segments starting from the exact leading one bit position, such an approach requires expensive LODs and shifters to take m-bit segments starting from the leading one position, steer them to an m × m multiplier, and expand 2m bits to 2n bits. In contrast, our approach is more scalable than one that captures m-bit segments starting from the leading one bits, since it limits the possible starting bit positions of an m-bit segment to two or three regardless of the m and n chosen at design time, eliminating LODs and replacing shifters with multiplexers. Finally, we also observe that one of the two operands in each multiplication for DSP and classification algorithms is often stored in memory (e.g., coefficients in filter algorithms and trained weight values in artificial neural networks (ANN)) and repeatedly used. We exploit this to further improve the energy efficiency of our approximate multiplier.
Digital Object Identifier: 10.1109/TVLSI.2017.2333366

1557-9999 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
The remainder of the paper is organized as follows. Section 2 details the proposed multiplier architecture. Section 3 analyzes the energy consumption and computational accuracy of various approximate multipliers and the impact of such multipliers on the quality of DSP and the accuracy of classification algorithms. Section 4 concludes this study.

II. APPROXIMATE MULTIPLIER EXPLOITING SIGNIFICANT SEGMENTS OF OPERANDS
In order to motivate and describe our proposed multiplier, we define an m-bit segment as m contiguous bits starting with the leading one in an n-bit positive operand. We dub this method the dynamic segment method (DSM), in contrast to the static segment method (SSM) that will be discussed later in this section. With two m-bit segments from two n-bit operands, we can perform a multiplication using an m × m multiplier. Figure 1 illustrates an example of a multiplication after taking 8-bit segments from 16-bit operands. In this example, we can achieve 99.4% accuracy for a 16 × 16 multiplication even with an 8 × 8 multiplier.

Fig. 1. An example of a multiplication with 8-bit segments of two 16-bit operands; bold-font bits comprise the segments.

Such a multiplication approach has little negative impact on computational accuracy, because it eliminates redundant bits (i.e., sign-extension bits) while feeding the most useful m significant bits to the multiplier; we will provide detailed evaluations of computational accuracy for various m in Section 3. Furthermore, an m × m multiplier consumes much less energy than an n × n multiplier, because the complexity (and thus energy consumption) of multipliers increases quadratically with n. For example, the 4 × 4 and 8 × 8 multipliers consume almost 20× and 5× less energy than a 16 × 16 multiplier per operation on average. However, a DSM requires: (i) two LODs, (ii) two n-bit shifters to align the leading one position of each n-bit operand to the MSB position of each m-bit segment so that the m-bit segments can be applied to the m × m multiplier, and (iii) one 2n-bit shifter to expand a 2m-bit result to 2n bits. (i), (ii), and (iii) incur considerable area and energy penalties, completely negating the energy benefit of using the m × m multiplier; we provide detailed evaluations for two m values in Section 3.

The area and energy penalties associated with (i), (ii), and (iii) in DSM arise from the need to capture an m-bit segment starting from an arbitrary bit position in an n-bit operand, because the leading one bit can be anywhere. Thus, we propose to limit the possible starting bit positions for extracting an m-bit segment from an n-bit operand to two or three at most in SSM; Figure 2 shows examples of extracting 8- and 10-bit segments from a 16-bit operand. Regardless of m and n, we have four possible combinations of taking two m-bit segments from two n-bit operands for a multiplication using the m-bit SSM.

Fig. 2. Possible starting bit positions of 8- and 10-bit segments indicated by arrows; the dotted arrow is the case for supporting three possible starting bit positions.

For a multiplication, we choose the m-bit segment that contains the leading one bit of each operand and apply the chosen segments from both operands to the m × m multiplier. The SSM greatly simplifies the circuit that chooses m-bit segments and steers them to the m × m multiplier by replacing the two n-bit LODs and shifters of the DSM with two (n − m)-input OR gates and m-bit 2-to-1 multiplexers; if the first (n − m) bits starting from the MSB are all zeros, the lower m-bit segment must contain the leading one. Furthermore, the SSM also allows us to replace the 2n-bit shifter used for the DSM with a 2n-bit 3-to-1 multiplexer. Since the segment for each operand is taken from one of two possible segments in an n-bit operand, a 2m-bit result can be expanded to a 2n-bit result by left-shifting the 2m-bit result by one of three possible shift amounts: (i) no shift when both segments are the lower m-bit segments; (ii) an (n − m) shift when the two segments are the upper and lower ones, respectively; and (iii) a 2 × (n − m) shift when both segments are the upper ones, as shown in Figure 3.

Fig. 3. Examples of 16 × 16 multiplications based on 8-bit segments with two possible starting bit positions for 8-bit segments. The shaded cells represent 8-bit segments and the aligned position of 8 × 8 multiplication results.

Fig. 4. An example of low accuracy for SSM16 × 16.

Fig. 5. Probability distribution of compute accuracy of AM2 × 2, DSM8 × 8, DSM6 × 6, SSM8 × 8, ESSM8 × 8, and 8 × 8 (truncated) for random vectors, audio/image processing, and recognition applications.

Note that the accuracy of an SSM with m = n/2 can be significantly low for operands such as those shown in Figure 4, where many MSBs of the m-bit segments containing the leading one bit are filled with zeros. On the other hand, such a problem becomes less severe as m becomes larger than n/2; there is an overlap in the range of bits covered by both possible m-bit segments, as shown for m = 10 in Figure 2. Thus, for an SSM with m = n/2 we propose to support one more bit position that allows us to extract an m-bit segment, indicated by the dotted arrow in Figure 2. This effectively captures operand pairs similar to the one shown in Figure 4.
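The dependence of SSM accuracy on m can be made concrete with a short worst-case bound. The derivation below is a sketch under assumptions that the text does not state explicitly: operands are unsigned, the segment is obtained by plain truncation (no rounding), m ≥ n/2, and accuracy is measured through relative error. The symbols \hat{A} and \varepsilon are introduced here for illustration only.

```latex
% Operand A whose upper segment is selected, i.e. A >= 2^m.
% SSM keeps A[n-1:n-m] and drops the lower (n-m) bits:
\hat{A} = \left\lfloor \frac{A}{2^{\,n-m}} \right\rfloor 2^{\,n-m},
\qquad
A - \hat{A} = A \bmod 2^{\,n-m} \le 2^{\,n-m} - 1 .

% The relative error is largest for the smallest such A with all dropped
% bits set, A = 2^{m} + 2^{\,n-m} - 1:
\varepsilon_{\max}
  = \frac{2^{\,n-m} - 1}{2^{m} + 2^{\,n-m} - 1}
  < \frac{1}{1 + 2^{\,2m-n}} .

% For n = 16:
%   m = 8  :  \varepsilon_{\max} < 1/2
%   m = 10 :  \varepsilon_{\max} < 1/17 \approx 5.9\%
% With both operands at their worst case, the product retains at least
% (1 - \varepsilon_{\max})^2 of its exact value:
%   m = 8  :  > 1/4
%   m = 10 :  > (16/17)^2 \approx 88.6\%
```

When the lower segment is selected (A < 2^m) no bits are dropped and the operand is exact, so under these assumptions the error comes only from operands that use the upper segment. This is consistent with the observation that m = n/2 is the problematic case and that the problem shrinks quickly once m exceeds n/2.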
Figure 6 illustrates an SSM that takes an m-bit segment from one of two possible bit positions of an n-bit operand. The key advantage is its scalability for various m and n, because the complexity (i.e., area and energy consumption) of the auxiliary circuits for choosing/steering m-bit segments and expanding a 2m-bit result to a 2n-bit result scales linearly with m.
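The two segment methods can be compared with a small behavioural model. The following Python sketch is not the authors' Verilog design; it assumes unsigned operands, the two-position SSM, and an accuracy measure of 1 − |error|/exact, which the text does not define explicitly. The function names `ssm_multiply`, `dsm_multiply`, and `accuracy` are hypothetical.

```python
def ssm_multiply(a, b, n=16, m=8):
    """Static segment method: approximate a*b for unsigned n-bit operands."""
    seg_mask = (1 << m) - 1

    def pick(x):
        # Models the (n - m)-input OR over x[n-1:m]: is the leading one
        # located above the lower m-bit segment?
        upper = (x >> m) != 0
        shift = (n - m) if upper else 0        # models the 2-to-1 multiplexer
        return (x >> shift) & seg_mask, shift

    seg_a, shift_a = pick(a)
    seg_b, shift_b = pick(b)
    # m x m multiply, then expand 2m bits to 2n bits. The total shift can
    # only be 0, (n - m) or 2 * (n - m), hence a 3-to-1 multiplexer.
    return (seg_a * seg_b) << (shift_a + shift_b)


def dsm_multiply(a, b, m=8):
    """Dynamic segment method: each segment starts at the exact leading one."""
    def pick(x):
        lead = x.bit_length()                  # models the leading one detector
        shift = max(lead - m, 0)               # models the n-bit shifter
        return x >> shift, shift

    seg_a, shift_a = pick(a)
    seg_b, shift_b = pick(b)
    return (seg_a * seg_b) << (shift_a + shift_b)


def accuracy(approx, exact):
    """Assumed accuracy measure: 1 - relative error (1.0 when exact is 0)."""
    return 1.0 if exact == 0 else 1.0 - abs(exact - approx) / exact


if __name__ == "__main__":
    # An operand pair in the spirit of Figure 4: the leading one sits just
    # above the lower 8-bit segment, so the upper segment is mostly zeros.
    a = b = 0x01FF
    exact = a * b
    for name, approx in [
        ("SSM8x8", ssm_multiply(a, b, m=8)),
        ("SSM10x10", ssm_multiply(a, b, m=10)),
        ("DSM8x8", dsm_multiply(a, b, m=8)),
    ]:
        print(f"{name}: {approx} (accuracy {accuracy(approx, exact):.3f})")
```

In this model the only operand-dependent logic in `ssm_multiply` is a single comparison per operand, whereas `dsm_multiply` needs the position of the leading one and a variable shift, which mirrors the LOD and shifter overhead described for the DSM.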
Fig. 6. Proposed approximate multiplier architecture; the logic and wires denoted by the dotted lines are not needed if B is pre-processed as proposed.

For applications where one of the operands of each multiplication is often a fixed coefficient, we propose to pre-compute the bit-wise OR value of B[n − 1:m], pre-select between the two possible m-bit segments (i.e., B[n − 1:n − m] and B[m − 1:0]) in Figure 6, and store them instead of the native B value in memory. This allows us to remove the (n − m)-input OR gate and the m-bit 2-to-1 multiplexer denoted by the dotted lines in Figure 6.
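The coefficient pre-processing step can be sketched in a few lines. This is an illustrative Python model under the same assumptions as before (unsigned operands, two-position SSM); `preprocess_coefficient` and `ssm_multiply_preprocessed` are hypothetical names, and the (segment, select) storage layout is an assumption about how the pre-computed values might be kept.

```python
def preprocess_coefficient(b, n=16, m=8):
    """Offline step: return (segment, select) to store instead of native B."""
    select = int((b >> m) != 0)                # pre-computed OR of B[n-1:m]
    shift = (n - m) if select else 0
    segment = (b >> shift) & ((1 << m) - 1)    # B[n-1:n-m] or B[m-1:0]
    return segment, select


def ssm_multiply_preprocessed(a, seg_b, sel_b, n=16, m=8):
    """Run-time step: only operand A still needs the OR gate and the mux."""
    sel_a = int((a >> m) != 0)
    seg_a = (a >> ((n - m) * sel_a)) & ((1 << m) - 1)
    # The stored select bit of B feeds the output-stage shift directly.
    return (seg_a * seg_b) << ((n - m) * (sel_a + sel_b))


if __name__ == "__main__":
    # Hypothetical filter coefficients, pre-processed once and reused.
    coefficients = [0x1234, 0x0056, 0x0100]
    stored = [preprocess_coefficient(c) for c in coefficients]
    sample = 0x0042
    print([ssm_multiply_preprocessed(sample, seg, sel) for seg, sel in stored])
```

In this sketch each stored coefficient occupies m + 1 bits rather than n, and the run-time path for B contains no selection logic, which corresponds to removing the dotted OR gate and multiplexer in the architecture figure.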
Finally, to support three possible starting bit positions for picking an m-bit segment where m = n/2, the two 2-to-1 multiplexers at the input stage and the one 3-to-1 multiplexer at the output stage are replaced with 3-to-1 and 5-to-1 multiplexers, respectively, along with some minor changes in the logic functions generating the multiplexer control signals. We will show later that this enhanced SSM design for m = 8 and n = 16 (denoted by ESSM8 × 8) can provide accuracy as good as that of SSM10 × 10 at notably lower energy consumption.

III. EVALUATION
Evaluation Methodology: In this section we describe the methodology for evaluating the computational accuracy and energy consumption of precise and various approximate multipliers. All the multipliers are described in Verilog HDL to support two 16-bit inputs and a 32-bit output, and synthesized using Synopsys Design Compiler® and a TSMC 45nm standard cell library at the typical process corner. We repeatedly synthesize each multiplier until it achieves the highest operating frequency. Then we choose the frequency of the slowest one (i.e., 2GHz) to re-synthesize all other multipliers.

To evaluate computational accuracy, we take four sets of 16-bit operand pairs from: all possible pairs of 16-bit values (denoted by “random”), a noise cancelling algorithm [9] (denoted by “audio”), 2-dimensional optical coherence tomography (2D OCT) [10] (denoted by “image”), and isolated spoken digit recognition [11] (denoted by “recognition”), where each set is comprised of billions of operand pairs. To evaluate energy consumption, we use Synopsys PrimeTime-PX®, which can estimate the energy consumption of a synthesized design based on annotated switching activities from gate-level simulation. The input vectors for energy estimation are directly taken from the execution of multiplication-intensive kernels in each application. We observe that the extracted input vectors exhibit inherent periodicity in the operand values applied to the multiplier. Thus, we take enough such periods that the number of vectors is at least 10,000.

Computational Accuracy: Figure 5 plots the probability distribution of computational accuracy of AM, DSM8 × 8, DSM6 × 6, SSM8 × 8, SSM10 × 10, ESSM8 × 8, and TRUN8 × 8 for four sets of operand pairs. We observe that the average computational accuracy of all these approximate multipliers is very high. For “random,” AM, DSM8 × 8, DSM6 × 6, SSM8 × 8, SSM10 × 10, ESSM8 × 8, and TRUN8 × 8 exhibit average compute accuracy of 96.7%, 99.7%, 97.8%, 98.0%, 99.6%,
99.0%, and 97.1%, respectively. However, for three classes of applications, AM, SSM8 × 8, and TRUN8 × 8 show notably deteriorating accuracy compared to SSM10 × 10 and ESSM8 × 8. For example, AM, SSM8 × 8, and TRUN8 × 8 can achieve computational accuracy higher than 95% only for 45%, 64%, and 61% of operand pairs from “image.” In contrast, other approximate multipliers such as DSM8 × 8, SSM10 × 10, and ESSM8 × 8 can offer computational accuracy higher than 95% for 100%, 98%, and 98% of operand pairs, respectively. We expect that the high computational accuracy of SSM10 × 10 and ESSM8 × 8 for such a high fraction of operand pairs will barely impact quality of computing (QoC). The computational accuracy trend of “audio” and “recognition” is similar to that of “image,” as shown in Figure 5.

QoC: Table 1 tabulates the QoC obtained using different approximate multipliers relative to PM. To measure QoC, we use perceptual evaluation of speech quality (PESQ) [12] and structural similarity (SSIM) [13] for “audio” and “image,” respectively. In “audio” and “image,” PESQ and SSIM values higher than 99% do not incur a notable perceptual difference. Thus, our SSM10 × 10 and ESSM8 × 8 are sufficient for QoC, while TRUN8 × 8 shows considerable QoC degradation for all three applications.

TABLE I. QoC OF APPLICATIONS USING APPROXIMATE MULTIPLIERS RELATIVE TO THE PRECISE MULTIPLIER.

Energy and Area Analysis: Figure 7 shows the average energy/op of AM, DSM8 × 8, DSM6 × 6, SSM8 × 8, SSM10 × 10, ESSM8 × 8, and TRUN8 × 8 for each of the four sets of operand pairs, normalized to that of PM. The average energy/op of AM, DSM8 × 8, and DSM6 × 6 across all four sets of operand pairs is 13%, 3%, and 28% lower than that of PM, respectively. In contrast, the average energy/op of SSM10 × 10 and ESSM8 × 8, which can offer sufficient computational accuracy and QoC, is 35% and 58% lower than that of PM. ESSM8 × 8 simplified to accept pre-processed fixed coefficients consumes 6% less energy than the original ESSM8 × 8. We do not provide detailed energy/op analysis for TRUN8 × 8, SSM8 × 8, and SSM12 × 12, because (i) TRUN8 × 8 and SSM8 × 8 may not provide sufficient computational accuracy and QoC despite their very low energy/op and (ii) SSM12 × 12 does not exhibit notably higher computational accuracy and QoC than SSM10 × 10 while consuming much higher energy/op.

Fig. 7. Energy/op of AM, DSM8 × 8, DSM6 × 6, SSM8 × 8, SSM10 × 10, and ESSM10 × 10, relative to that of PM for four sets of operand pairs.

Analyzing the energy/op reduction of the evaluated multipliers, we first note that the energy/op of AM can be even lower than that of PM at a higher target synthesis frequency [7]. However, the target synthesis frequency is limited by DSM8 × 8, while the energy/op of the multipliers should be compared at the same target frequency. Even if we remove DSM8 × 8 from the comparison, which allows us to increase the target synthesis frequency for AM, SSM10 × 10, ESSM8 × 8, and PM, we see that the relative energy/op difference between AM and SSM10 × 10 (or ESSM8 × 8) does not notably change. In other words, SSM10 × 10 and ESSM8 × 8 also benefit from a higher target synthesis frequency, exhibiting lower energy/op.

Second, we observe that the power overhead of extra logic such as LODs and shifters is considerably larger than that of the 8 × 8 and 6 × 6 multipliers themselves for DSM8 × 8 and DSM6 × 6, although the bit width of the multiplier is at most half that of PM. This significantly reduces the overall benefit of DSM8 × 8 and DSM6 × 6. Furthermore, the fraction of logic gates switching in the base multiplier of DSM8 × 8 and DSM6 × 6 can be much higher than in the 16 × 16 multiplier of PM. This is because DSM8 × 8 and DSM6 × 6 remove redundant sign-extension bits, which in PM may incur no switching in many logic gates corresponding to the MSB portion before each multiplication.

Fig. 8. Breakdown of area and energy/op of 16 × 16.

Figure 8 depicts the breakdown of area and energy/op of 16 × 16 multipliers using DSM, SSM, and ESSM for various m, normalized to those of the 16 × 16 PM. In the plot, each bar is comprised of the area (or energy/op) of the base m × m multiplier (denoted by “base m × m mult”) and the remaining
components (denoted by “rest”), such as the segment selection logic and multiplexers in SSMm × m, and the LODs and shifters in DSMm × m. The average energy/op in the plot is based on “random.”

First, the area of SSMm × m and DSMm × m is very closely correlated with the energy/op. For example, SSM10 × 10 consumes only 62% and 58% of PM’s area and energy/op, respectively. Second, the base m × m multiplier contributes considerable area and energy/op for SSMm × m, while the remaining components dominate those for DSMm × m. For example, the 10 × 10 multiplier is responsible for 67% and 71% of the total area and energy/op of SSM10 × 10, respectively. In other words, SSM is much more efficient than DSM because the overhead of the extra circuits to support SSM is small.

Third, we observe that increasing the number of possible starting bit positions for taking an m-bit segment from two to three does not notably increase the area and energy/op, because both the area and energy/op are dominated by the base m × m multiplier. Finally, DSM6 × 6 does not significantly reduce area and energy/op compared to DSM8 × 8, because its area and energy/op are dominated by the peripheral gates, as discussed earlier.

IV. CONCLUSION
In this paper, we propose an approximate multiplier that can trade off accuracy and energy/op at design time for DSP and recognition applications. Our proposed approximate multiplier takes m consecutive bits (i.e., an m-bit segment) of an n-bit operand, either starting from the MSB or ending at the LSB, and applies the two segments that include the leading ones of the two operands (i.e., SSM) to an m × m multiplier. Compared to PM and to an approach that identifies the exact leading one positions of two operands and applies two m-bit segments starting from the leading one positions (i.e., DSM), ours consumes much less energy and area. This improved energy and area efficiency comes at the cost of slightly lower compute accuracy than PM and DSM. However, we demonstrate that the small loss of compute accuracy using SSM does not notably impact the QoC of the image, audio, and recognition applications we evaluated. On average, 16 × 16 ESSM8 × 8 can achieve 99% computational accuracy with negligible degradation in QoC for audio, image, and recognition applications. At the same time, 16 × 16 ESSM8 × 8 consumes only 42% of the energy/op of PM.

ACKNOWLEDGEMENT
This work was supported in part by generous grants from NSF (CCF-0953603) and DARPA (HR0011-12-2-0019). Nam Sung Kim has a financial interest in AMD and Samsung Electronics. Nam Sung Kim and Taejoon Park equally contributed to this work.

REFERENCES
[1] R.K. Krishnamurthy and H. Kaul, "Ultra-low Voltage Technologies for Energy-efficient Special-Purpose Hardware Accelerators," Intel Technology J., vol. 13, no. 4, pp. 102-117, 2009.
[2] R. Hegde and N.R. Shanbhag, "Energy-efficient signal processing via algorithmic noise-tolerance," in IEEE/ACM Int. Symp. Low Power Electronics and Design (ISLPED), 1999, pp. 30-35.
[3] D. Menard, D. Chillet, C. Charot, and O. Sentieys, "Automatic floating-point to fixed-point conversion for DSP code generation," in ACM Int. Conf. Compilers, Arch., and Syn. for Embedded Syst. (CASES), 2002, pp. 270-276.
[4] V.K. Chippa, D. Mohapatra, A. Raghunathan, K. Roy, and S.T. Chakradhar, "Scalable effort hardware design: Exploiting algorithmic resilience for energy efficiency," in IEEE/ACM Design Automation Conf., 2010, pp. 555-560.
[5] D. Mohapatra, G. Karakonstantis, and K. Roy, "Significance driven computation: a voltage-scalable, variation-aware, quality-tuning motion estimator," in IEEE/ACM Int. Symp. Low Power Electronics and Design (ISLPED), 2009, pp. 195-200.
[6] C.H. Chang and R.K. Satzoda, "A low error and high performance multiplexer-based truncated multiplier," IEEE T. on Very Large Scale Integration (VLSI) Syst., vol. 18, no. 12, pp. 1767-1771, Dec 2010.
[7] P. Kulkarni, P. Gupta, and M. Ercegovac, "Trading Accuracy for Power with an Underdesigned Multiplier Architecture," in IEEE Int. Conf. VLSI Design (VLSID), 2011, pp. 346-351.
[8] Z. Babić, A. Avramović, and P. Bulić, "An iterative logarithmic multiplier," Microprocessors and Microsyst., vol. 35, no. 1, pp. 23-33, 2011.
[9] B. Widrow et al., "Adaptive noise cancelling: Principles and applications," Proceedings of the IEEE, vol. 63, no. 12, pp. 1692-1716, Dec 1975.
[10] K. Zhang and J. U. Kang, "Graphics Processing Unit-Based Ultrahigh Speed Real-Time Fourier Domain Optical Coherence Tomography," IEEE J. Selected Topics in Quantum Electronics, vol. 18, no. 4, pp. 1270-1279, Jul-Aug 2012.
[11] D. Verstraeten, B. Schrauwen, D. Stroobandt, and J. Van Campenhout, "Isolated word recognition with the Liquid State Machine: a case study," Inf. Process. Lett., vol. 95, no. 6, pp. 521-528, Sep 2005.
[12] Y. Hu and P.C. Loizou, "Evaluation of Objective Quality Measures for Speech Enhancement," IEEE T. Audio, Speech, and Lang. Process., vol. 16, no. 1, pp. 229-238, Jan 2008.
[13] Z. Wang, A.C. Bovik, H.R. Sheikh, and E.P. Simoncelli, "Image quality assessment: from error visibility to structural similarity," IEEE T. Image Processing, vol. 13, no. 4, pp. 600-612, Apr 2004.

Class Schedule Shifting: Understanding The Perspective in Class Schedule Shifting of Talisay Senior High School
80% (5)
Class Schedule Shifting: Understanding The Perspective in Class Schedule Shifting of Talisay Senior High School
15 pages
Energy-Efficient Approximate Multiplication For Digital Signal Processing and Classification Applications
No ratings yet
Energy-Efficient Approximate Multiplication For Digital Signal Processing and Classification Applications
5 pages
A Two-Stage Operand Trimming Approximate
No ratings yet
A Two-Stage Operand Trimming Approximate
11 pages
1 s2.0 S143484112100385X Main
No ratings yet
1 s2.0 S143484112100385X Main
12 pages
Design of Roba Multiplier Using Mac Unit
No ratings yet
Design of Roba Multiplier Using Mac Unit
15 pages
2021 A Hybrid Radix-4 and Approximate Logarithmic Multiplier - Lotric
No ratings yet
2021 A Hybrid Radix-4 and Approximate Logarithmic Multiplier - Lotric
20 pages
DRUM: A Dynamic Range Unbiased Multiplier For Approximate Applications
No ratings yet
DRUM: A Dynamic Range Unbiased Multiplier For Approximate Applications
8 pages
Design of Roba Multiplier For High-Speed Yet Energy-Efficient Digital Signal Processing Using Verilog HDL
No ratings yet
Design of Roba Multiplier For High-Speed Yet Energy-Efficient Digital Signal Processing Using Verilog HDL
16 pages
Example of Multiplier
No ratings yet
Example of Multiplier
4 pages
ROBA
67% (3)
ROBA
11 pages
Approximate Recursive Multipliers Using Low Power
No ratings yet
Approximate Recursive Multipliers Using Low Power
16 pages
Power and Area Efficient Approximate Multipliers
No ratings yet
Power and Area Efficient Approximate Multipliers
5 pages
On The Use of Low-Power Devices, Approximate Adders and Near-Threshold Operation For Energy-Efficient Multipliers
No ratings yet
On The Use of Low-Power Devices, Approximate Adders and Near-Threshold Operation For Energy-Efficient Multipliers
12 pages
FPGA-Based Multi-Level Approximate Multipliers For High-Performance Error-Resilient Applications
No ratings yet
FPGA-Based Multi-Level Approximate Multipliers For High-Performance Error-Resilient Applications
17 pages
electronics-12-00446-v2
No ratings yet
electronics-12-00446-v2
21 pages
New Approximate Multiplier For Low Power Digital Signal Processing
No ratings yet
New Approximate Multiplier For Low Power Digital Signal Processing
6 pages
Approximate Hybrid High Radix Encoding For Energy-Efficient Inexact Multipliers
No ratings yet
Approximate Hybrid High Radix Encoding For Energy-Efficient Inexact Multipliers
10 pages
Energy-Ef Cient Low-Latency Signed Multiplier For FPGA-based Hardware Accelerators
No ratings yet
Energy-Ef Cient Low-Latency Signed Multiplier For FPGA-based Hardware Accelerators
4 pages
Low Power DSP Using Approximate Adders
No ratings yet
Low Power DSP Using Approximate Adders
14 pages
Ijlbps 66006543d0393
No ratings yet
Ijlbps 66006543d0393
8 pages
Design and Analysis of Approximate Compressors For Multiplication
No ratings yet
Design and Analysis of Approximate Compressors For Multiplication
11 pages
1 s2.0 S0141933119305976 Main
No ratings yet
1 s2.0 S0141933119305976 Main
8 pages
Camus Dac16
No ratings yet
Camus Dac16
6 pages
A Theoretical Framework For Quality Estimation and Optimization of DSP Applications Using Low-Power Approximate Adders
No ratings yet
A Theoretical Framework For Quality Estimation and Optimization of DSP Applications Using Low-Power Approximate Adders
14 pages
Published Paper - High Speed Low Power Approximate Multipliers
No ratings yet
Published Paper - High Speed Low Power Approximate Multipliers
6 pages
MICPRO2011-An Iterative Logarithmic Multiplier
No ratings yet
MICPRO2011-An Iterative Logarithmic Multiplier
11 pages
AxRMs_Approximate_Recursive_Multipliers_Using_High-Performance_Building_Blocks
No ratings yet
AxRMs_Approximate_Recursive_Multipliers_Using_High-Performance_Building_Blocks
7 pages
A Low-Power, High-Performance Approximate Multiplier With Configurable Partial Error Recovery
No ratings yet
A Low-Power, High-Performance Approximate Multiplier With Configurable Partial Error Recovery
4 pages
Developing_and_Assessinginexact_Multiplierarchitec (1)
No ratings yet
Developing_and_Assessinginexact_Multiplierarchitec (1)
16 pages
×32 Bit Multiprecision Razor-Based Dynamic: 32 Bit Voltage Scaling Multiplier With Operands Scheduler
No ratings yet
×32 Bit Multiprecision Razor-Based Dynamic: 32 Bit Voltage Scaling Multiplier With Operands Scheduler
12 pages
Low Power Approximate Unsigned Multipliers With Configurable Error Recovery
No ratings yet
Low Power Approximate Unsigned Multipliers With Configurable Error Recovery
8 pages
31_Design_JJ_new
No ratings yet
31_Design_JJ_new
8 pages
Power Efficient Approximate Booth Multiplier
No ratings yet
Power Efficient Approximate Booth Multiplier
4 pages
Area-Efficient_Iterative_Logarithmic_Approximate_Multipliers_for_IEEE_754_and_Posit_Numbers
No ratings yet
Area-Efficient_Iterative_Logarithmic_Approximate_Multipliers_for_IEEE_754_and_Posit_Numbers
13 pages
MACcelerator Approximate Arithmetic Unit For Computational Acceleration
No ratings yet
MACcelerator Approximate Arithmetic Unit For Computational Acceleration
6 pages
DesignandimplementationofMultiplierunitMAC ROBA
No ratings yet



