Design of Energy-Efficient RFET-Based Exact and Approximate 4:2 Compressors and Multipliers
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: EXPRESS BRIEFS, VOL. 70, NO. 9, SEPTEMBER 2023
Abstract—The ever-increasing demand for low-power and area-efficient circuits for use in battery-powered devices and the CMOS scaling problems have attracted the attention of VLSI designers to beyond-CMOS technologies like Reconfigurable Field-Effect Transistors (RFETs). Improving the efficiency of multipliers is critical, as they are the core component of many applications such as image processing and Machine Learning (ML). This brief proposes a compact and energy-efficient RFET-based architecture for the 4:2 compressor and Dadda multiplier, leveraging transistor-level reconfigurability and multi-input support of the RFET. Moreover, we propose a novel approximate 4:2 compressor based on efficient RFET logic cells to cater to the needs of error-resilient applications. Extensive circuit-level simulations with 14 nm germanium nanowire (GeNW) RFET technology show that the proposed RFET-based exact multiplier improves the power consumption and power-delay product (PDP) by 65% and 45%, respectively, compared to the conventional CMOS-based counterpart in 14 nm FinFET technology. Besides, we show that utilizing the proposed approximate compressor, the area and PDP of the multiplier reduce by 46% and 42%, respectively. The effectiveness of the approximate multiplier is evaluated in an image multiplication application, and the average PSNR and SSIM values are 31.39 and 0.87, respectively.

Index Terms—RFET, GeNW, compressor, multiplier, approximate computing.

I. INTRODUCTION

In recent years, the wide usage of various microprocessors in battery-powered portable devices has pushed VLSI designers to employ different methods to gain more compact and energy-efficient circuits. Among digital arithmetic blocks, multipliers play an important role in many applications, such as image processing and machine learning (ML). Multiplier circuits are large and power-hungry and contribute considerably to overall system performance. A multiplier usually comprises three phases: 1) Partial Product Generation (PPG), 2) Partial Product Reduction (PPR), 3) Final Addition. As the second stage accounts for the largest portion of the area, propagation delay, and power consumption, efficient design of this phase is crucial. The Dadda method is one of the fastest and most well-known methods for PPR, and 4:2 compressors are commonly used for implementing it [1].

Several works have proposed CMOS-based 4:2 compressors that are efficient in terms of delay, power, and area [2]. However, ever-increasing CMOS scaling problems and the slowdown of Dennard scaling have made it challenging to create more compact and low-power circuits with each new technology generation. For this reason, several compressors have been proposed based on beyond-CMOS technologies [3], [4]. Among emerging technologies, the Reconfigurable Field-Effect Transistor (RFET) follows a top-down manufacturing process similar to CMOS [5], [6], and its unique features, like ambipolarity and multi-input support, open new doors to designing compact low-power circuits [7]. Several previous works showed the benefits of RFETs in creating basic logic gates [6], [8], [9], [10].

In this brief, we propose an area- and energy-efficient 4:2 compressor and Dadda multiplier exploiting RFET features. Moreover, as error-resilient applications like image processing or ML allow designers to use approximate compressors and multipliers to reduce energy consumption and area [1], [4], [11], [12], [13], [14], [15], [16], [17], [18], we also propose a novel approximate 4:2 compressor designed based on efficient RFET logic cells. Although most research in the RFET domain has focused on silicon nanowire (SiNW) transistors, recently introduced GeNW transistors [19], [20], [21] can offer much better performance. As the power and performance trade-off is essential in a multiplier, in this brief, we use the GeNW RFET [21] to analyze 4:2 compressors and the Dadda multiplier.

The contributions of this brief are as follows:
• RFET-based exact 4:2 compressor: We propose an RFET-based architecture for a compact and low-power exact 4:2 compressor. We demonstrate that mere 1-to-1 replacement of CMOS transistors with RFETs does not lead to an optimal circuit. Thus, we propose a design that exploits reconfigurability and multi-input support of RFETs.
• RFET-based approximate 4:2 compressor: We introduce a novel approximate 4:2 compressor utilizing efficient RFET-based logic cells like minority and multiplexer for error-resilient applications.
• Exact and approximate Dadda multiplier: We implement exact and approximate 8 × 8 multipliers employing our proposed compressors. We propose a structure for intra- and inter-compressor connections within the exact multiplier to minimize the number of cascaded transmission gates (TG) in the critical path of the multiplier.
• HW and accuracy analysis: We performed SPICE simulations using the 14 nm GeNW RFET Verilog-A model [21] to evaluate our proposed compressors and multipliers in terms of area and energy efficiency. Besides, we analyze the accuracy of the approximate multiplier and investigate the quality of this multiplier in an image multiplication application.

II. BACKGROUND

RFET Structure and Functionality: RFETs are a group of ambipolar transistors that can be electrostatically programmed at run-time to act either as nMOS or pMOS transistors.

Manuscript received 30 March 2023; accepted 6 May 2023. Date of publication 15 May 2023; date of current version 29 August 2023. This work was supported in part by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) through SecuReFET under Project 439891087, and in part by the German Federal Ministry for Education and Research (BMBF) under the Framework of VE-CirroStrato. This brief was recommended by Associate Editor A. Calimera. (Corresponding author: Akash Kumar.)
The authors are with the Department of Computer Science, Technische Universität Dresden, 01062 Dresden, Germany (e-mail: nima.kavand@tu-dresden.de; [email protected]; [email protected]; [email protected]).
Color versions of one or more figures in this article are available at https://doi.org/10.1109/TCSII.2023.3275983.
Digital Object Identifier 10.1109/TCSII.2023.3275983
1549-7747
© 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: Rajeev Gandhi Memorial College of Eng and Tech. Downloaded on August 09,2024 at 09:58:52 UTC from IEEE Xplore. Restrictions apply.
KAVAND et al.: DESIGN OF ENERGY-EFFICIENT RFET-BASED EXACT AND APPROXIMATE 4:2 COMPRESSORS AND MULTIPLIERS 3645
Fig. 1. a) RFET reconfigurability b) RFET with two, three, and multi gates.
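The reconfigurability summarized in Fig. 1 can be illustrated with a switch-level sketch. It assumes an idealized device in which a program gate selects the polarity (1 for n-type, 0 for p-type) and the channel conducts only when every control gate matches that polarity. The function name `rfet_conducts` and the series (AND-like) treatment of multiple control gates are illustrative assumptions, not details taken from this brief.

```python
def rfet_conducts(program_gate: int, control_gates: list[int]) -> bool:
    """Idealized switch-level RFET model (assumption, not the authors' device model).

    program_gate = 1 configures the device as n-type: it conducts when all
    control gates are 1. program_gate = 0 configures it as p-type: it conducts
    when all control gates are 0.
    """
    if not control_gates:
        raise ValueError("at least one control gate is required")
    return all(cg == program_gate for cg in control_gates)


# Same physical device, two run-time configurations.
print(rfet_conducts(1, [1]))        # n-type behaviour, gate high
print(rfet_conducts(0, [0]))        # p-type behaviour, gate low
print(rfet_conducts(1, [1, 1, 0]))  # three control gates, one disagrees
```

Under this simplification, a single-control-gate device behaves like an XNOR of the program and control signals, which gives an intuition for why RFET-based cells can realize some functions with fewer transistors than a 1-to-1 CMOS mapping.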
3646 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: EXPRESS BRIEFS, VOL. 70, NO. 9, SEPTEMBER 2023
TABLE I
HW ANALYSIS OF 4:2 COMPRESSORS

TABLE II
HW ANALYSIS OF 8 × 8 DADDA MULTIPLIERS

TABLE III
THE ACCURACY OF THE APPROXIMATE 8 × 8 MULTIPLIERS

TABLE IV
THE PSNR AND SSIM OF THE APPROXIMATE MULTIPLIERS

Because a three-gate RFET is approximately 1.5× larger than a single CMOS device due to its extra gate signals [28], we estimate the circuit area based on the unit size transistor (UST) [28]. The proposed RFET-based exact compressor consumes 72% less power than the CMOS counterpart with a negligible increase in delay. Hence, PDP and EDP are improved in our design by 72% and 71%, respectively. Besides, the area is reduced by 25% thanks to the reconfigurability and multi-input support of the RFET.

To show the strength of our design, we also evaluated the naive RFET-based compressor. According to Table I, although the naive implementation reduces power consumption, its delay and area increase significantly compared to the CMOS-based design. Hence, to achieve an efficient RFET-based circuit, 1-to-1 replacement of CMOS transistors with RFETs is not sufficient, and choosing a proper design that can exploit RFET features is crucial.

According to Table I, the proposed RFET-based approximate compressor improves the delay, power, PDP, EDP, and area by 38%, 8%, 42%, 64%, and 46%, respectively, compared to the RFET-based exact compressor. Since our approximate compressor is built from efficient RFET cells such as the Min and MUX-INV cells, the area and energy efficiency of its RFET implementation is better than that of its CMOS implementation.
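Since PDP is the product of power and delay, and EDP multiplies PDP by the delay once more, the reported PDP and EDP improvements can be cross-checked from the power and delay figures alone. The sketch below does this with the rounded percentages quoted above; the function name `relative_reductions` is hypothetical, and because the inputs are rounded, the results should only be expected to match the reported values to within about one percentage point.

```python
def relative_reductions(power_reduction: float, delay_reduction: float) -> tuple[float, float]:
    """Return (PDP reduction, EDP reduction) as fractions.

    Inputs are fractional reductions relative to a baseline design,
    e.g. 0.08 for "8% lower power". PDP = P * D and EDP = P * D**2.
    """
    power_ratio = 1.0 - power_reduction
    delay_ratio = 1.0 - delay_reduction
    pdp_ratio = power_ratio * delay_ratio
    edp_ratio = pdp_ratio * delay_ratio
    return 1.0 - pdp_ratio, 1.0 - edp_ratio


# Approximate vs. exact RFET compressor: 8% lower power, 38% lower delay.
pdp_red, edp_red = relative_reductions(power_reduction=0.08, delay_reduction=0.38)
print(f"PDP reduction: {pdp_red:.1%}, EDP reduction: {edp_red:.1%}")
```

With these rounded inputs the sketch gives roughly 43% and 65%, close to the reported 42% and 64%. The same check applied to the exact compressor (72% lower power, delay essentially unchanged) gives a PDP and EDP reduction of about 72%, in line with the reported 72% and 71%.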
B. Multipliers

1) HW Analysis: We simulated exact and approximate 8 × 8 Dadda multipliers using the exact and approximate compressors introduced in Section IV-A. All the compressors in the approximate multipliers are approximate. The simulation results of the multipliers are given in Table II. Although the CMOS-based multiplier has a 38% lower delay than the RFET-based multiplier, our design reduces PDP and EDP by 45% and 11%, respectively, owing to a 65% reduction in power consumption. As CMOS-based AND gates have higher performance than RFET-based AND gates, this increase in delay is partly due to the AND gates in the PPG phase. Comparing the results of our design with the naive RFET-based design, we again see that 1-to-1 replacement of CMOS transistors with RFETs does not lead to an optimal design.

In comparison to the RFET-based exact multiplier, the RFET-based approximate multiplier reduces delay, power, PDP, EDP, and area by 48%, 2%, 49%, 74%, and 21%, respectively. Moreover, the RFET-based approximate multiplier has better area and energy efficiency than the CMOS-based approximate multiplier with the same structure.

2) Accuracy Analysis of Approximate Multipliers: To report the accuracy of approximate designs, we used Error Rate (ER), Mean Error Distance (MED), and Normalized Error Distance (NED), which are commonly used accuracy metrics [31]. To evaluate the accuracy of our approximate compressor in multiplication, we compared multipliers implemented using our compressor and other compressors in the literature. For a fair comparison, all the compressors in the multipliers are replaced with approximate compressors, and we do not consider any truncation. We applied all the possible input combinations (65536 inputs) to the approximate multipliers and calculated the accuracy metrics using MATLAB. The results of the accuracy analysis are given in Table III. According to this table, the accuracy of the proposed multiplier is in the range of other approximate multipliers in the literature. However, if we implement all these designs using RFETs, our compressor has the lowest hardware complexity. This shows that RFET properties should be considered when designing an efficient RFET-based approximate circuit. Note that the designs with better accuracy incur much more area overhead.

3) Approximate Multiplier in the Image Multiplication: To evaluate the efficiency of the approximate multipliers
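
The ER, MED, and NED metrics used in the accuracy analysis can be sketched as an exhaustive sweep over all 65536 input pairs of an 8 × 8 multiplier. This is a Python illustration rather than the MATLAB flow used in the brief. The `toy_approx_mul` function is a hypothetical stand-in that simply drops the least significant partial product; it is not the proposed compressor-based multiplier. NED is taken here as MED divided by the maximum exact product, which is one common convention; the normalization used in the brief is not stated in this excerpt.

```python
from typing import Callable


def accuracy_metrics(approx_mul: Callable[[int, int], int], bits: int = 8) -> dict[str, float]:
    """Exhaustively compare an approximate unsigned multiplier with the exact product."""
    n = 1 << bits                     # number of values per operand (256 for 8 bits)
    max_product = (n - 1) ** 2        # largest exact result, used to normalize MED
    wrong = 0                         # inputs whose result differs from the exact one
    total_error_distance = 0          # sum of |approximate - exact|
    for a in range(n):
        for b in range(n):
            error_distance = abs(approx_mul(a, b) - a * b)
            wrong += error_distance != 0
            total_error_distance += error_distance
    count = n * n
    med = total_error_distance / count
    return {"ER": wrong / count, "MED": med, "NED": med / max_product}


def toy_approx_mul(a: int, b: int) -> int:
    """Hypothetical approximate multiplier: ignores the partial product a0 AND b0."""
    return a * b - (a & 1) * (b & 1)


print(accuracy_metrics(toy_approx_mul))
```

Any behavioral model of an approximate multiplier with the same `(a, b) -> product` signature can be passed to `accuracy_metrics` in place of the toy function.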