3.1. Spurious Power Suppression Technique

The document discusses techniques for reducing power consumption and improving performance in multipliers and multiply-accumulate (MAC) units. It proposes: 1) A spurious power suppression technique (SPST) that detects cases where the most significant bits of partial products don't affect the result, allowing those circuits to be skipped to reduce power. 2) A modified Booth encoding algorithm and hybrid carry-save adder structure for MACs to reduce critical path delays and improve throughput. 3) Pipelining the MAC to accumulate intermediate sums and carries rather than final outputs to further increase throughput.

Uploaded by

achuu1987


The figure shows the five cases of a 16-bit addition in which spurious switching activities occur. The first case illustrates a transient state in which spurious transitions of the carry signals occur in the MSP even though the final result of the MSP is unchanged. The second and third cases describe the addition of one negative operand and one positive operand without and with a carry from the LSP, respectively. The fourth and fifth cases demonstrate the addition of two negative operands without and with a carry-in from the LSP, respectively. In these cases, the results of the MSP are predictable; therefore, the computations in the MSP are useless and can be neglected. The data are separated into the Most Significant Part (MSP) and the Least Significant Part (LSP). To know whether the MSP affects the computation results, we need a detection logic unit to detect the effective ranges of the inputs. The Boolean logical equations shown below express the behavioral principles of the detection logic unit in the MSP circuits of the SPST-based adder/subtractor:

Figure 2. Spurious transition cases in multimedia/DSP processing

\[
\begin{aligned}
A_{MSP} &= A[15:8]; \qquad B_{MSP} = B[15:8]; \\
A_{and} &= A[15] \cdot A[14] \cdots A[8]; \\
B_{and} &= B[15] \cdot B[14] \cdots B[8];
\end{aligned}
\]

where A[m] and B[n] respectively denote the mth bit of the operand A and the nth bit of the operand B, and AMSP and BMSP respectively denote the MSP parts, i.e., the 9th bit to the 16th bit, of the operands A and B. When the bits in AMSP and/or those in BMSP are all ones, the value of Aand and/or that of Band respectively becomes one; when the bits in AMSP and/or those in BMSP are all zeros, the value of Anor and/or that of Bnor respectively becomes one. Being one of the three outputs of the detection logic unit, close denotes whether the MSP circuits can be neglected or not. When the two input operands can be classified into one of the five classes shown in figure 1, the value of close becomes zero, which indicates that the MSP circuits can be closed. Figure 1 also shows that it is necessary to compensate the sign bit of the computing results. Accordingly, we derive the Karnaugh maps which lead to the Boolean equations (7) and (8) for the Carr_ctrl and the sign signals, respectively. In equations (7) and (8), CLSP denotes the carry propagated from the LSP circuits.
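The behaviour of the detection logic can be illustrated with a small Python model. This is only a sketch under stated assumptions: the function names `detect` and `spst_add16` are hypothetical, the form used for close (zero when both MSPs are all ones or all zeros) is inferred from the description above, and the MSP compensation is a behavioural stand-in, not equations (7) and (8), which are not reproduced in this text.

```python
def detect(a: int, b: int):
    """Return (a_and, a_nor, b_and, b_nor, close) for two 16-bit operands."""
    a_msp = (a >> 8) & 0xFF          # A[15:8]
    b_msp = (b >> 8) & 0xFF          # B[15:8]
    a_and = a_msp == 0xFF            # all MSP bits of A are ones
    a_nor = a_msp == 0x00            # all MSP bits of A are zeros
    b_and = b_msp == 0xFF
    b_nor = b_msp == 0x00
    # Assumed form: close is 0 (MSP can be closed) when both MSPs carry
    # nothing but sign/zero extension.
    close = 0 if (a_and or a_nor) and (b_and or b_nor) else 1
    return a_and, a_nor, b_and, b_nor, close


def spst_add16(a: int, b: int):
    """16-bit wrap-around addition that skips the MSP adder when close == 0."""
    a &= 0xFFFF
    b &= 0xFFFF
    a_and, _, b_and, _, close = detect(a, b)
    lsp = (a & 0xFF) + (b & 0xFF)    # the LSP adder always runs
    c_lsp = lsp >> 8                 # carry propagated from the LSP
    if close == 0:
        # Behavioural compensation (assumption): an all-ones MSP is -1 and an
        # all-zeros MSP is 0, so the MSP result follows from three 1-bit facts.
        msp = (c_lsp - int(a_and) - int(b_and)) & 0xFF
    else:
        msp = ((a >> 8) + (b >> 8) + c_lsp) & 0xFF
    return (msp << 8) | (lsp & 0xFF), close


print(spst_add16(0x0005, 0xFFFE))    # 5 + (-2) -> (3, 0): MSP closed
print(spst_add16(0x1234, 0x0001))    # -> (4661, 1): MSP needed
```

The model returns the same sum as an ordinary 16-bit adder; the second element of the result only reports whether the MSP adder had to be evaluated.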

The figure shows a 16-bit adder/subtractor design example based on the proposed SPST. In this example, the 16-bit adder/subtractor is divided into the MSP and the LSP between the 8th bit and the 9th bit. Latches implemented by simple AND gates are used to control the input data of the MSP. When the MSP is necessary, the input data of the MSP remain the same as usual; when the MSP is negligible, the input data of the MSP become zeros to avoid switching power consumption. From the derived Boolean equations (1) to (8), the detection logic unit of the SPST is designed as shown in figure 4. Whether the MSP is used is determined by whether its input data are latched. Moreover, we add three 1-bit registers to control the assertion of the close, sign, and Carr-ctrl signals in order to further decrease the glitch signals that occur in the cascaded circuits usually adopted in VLSI architectures designed for video coding.
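To give a feel for why zeroing the MSP inputs saves switching power, the following Python sketch counts bit toggles on the MSP inputs of the adder with and without the AND-gate latching. It is a rough illustration, not a power model: the function name `msp_input_toggles` and the operand sequence are hypothetical, and toggle count is used only as a simple proxy for switching activity.

```python
def msp_input_toggles(pairs, gated):
    """Count bit flips seen on the two 8-bit MSP inputs over a sequence of additions."""
    prev_a = prev_b = 0
    toggles = 0
    for a, b in pairs:
        a_msp = (a >> 8) & 0xFF
        b_msp = (b >> 8) & 0xFF
        # The MSP is negligible when both parts are all ones or all zeros.
        negligible = a_msp in (0x00, 0xFF) and b_msp in (0x00, 0xFF)
        if gated and negligible:
            a_msp = b_msp = 0        # AND-gate latches force the inputs to zero
        toggles += bin(prev_a ^ a_msp).count("1")
        toggles += bin(prev_b ^ b_msp).count("1")
        prev_a, prev_b = a_msp, b_msp
    return toggles


# Small signed values, typical of the data the technique targets,
# stored as 16-bit two's-complement words.
pairs = [(x & 0xFFFF, y & 0xFFFF) for x, y in [(5, -3), (-7, 2), (100, -100), (-1, -1)]]
print(msp_input_toggles(pairs, gated=False))  # 48
print(msp_input_toggles(pairs, gated=True))   # 0
```

With small operands of alternating sign, the ungated MSP inputs flip between all zeros and all ones on almost every cycle, whereas the gated inputs stay at zero.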

The figure shows a 16-bit adder/subtractor design example adopting the proposed SPST. In this example, the 16-bit adder/subtractor is divided into the MSP and the LSP between the eighth and the ninth bits. Latches implemented by simple AND gates are used to control the input data of the MSP. When the MSP is necessary, the input data of the MSP remain unchanged. However, when the MSP is negligible, the input data of the MSP become zeros to avoid glitching power consumption. The two operands of the MSP enter the detection-logic unit in addition to the adder/subtractor, so that the detection-logic unit can decide whether to turn off the MSP or not. Based on the derived Boolean equations (1) to (8), the detection-logic unit of the SPST is shown in Fig. 6(a); it determines whether the input data of the MSP should be latched or not.

Moreover, we propose a novel glitch-diminishing technique that adds three 1-bit registers to control the assertion of the close, sign, and carr-ctrl signals, to further decrease the transient signals that occur in the cascaded circuits usually adopted in VLSI architectures designed for multimedia/DSP applications. The timing diagram is shown in Fig. 6(b). The close, sign, and carr-ctrl signals are asserted a certain delay after the period of data transition, which is achieved by controlling the three 1-bit registers at the outputs of the detection-logic unit. Hence, the transients of the detection-logic unit can be filtered out; thus, the data latches shown in the figure can prevent the glitch signals from flowing into the MSP at a tiny cost. The data transient time and the earliest required time of all the inputs are also illustrated. The delay should be set within the range shown as the shaded area in the figure, so as to filter out the glitch signals while keeping the computation results correct. Based on Figs. 5 and 6, the timing issue of the SPST is analyzed as follows.

3.1.1. When the detection-logic unit turns off the MSP: At this moment, the outputs of the MSP are directly compensated by the SE unit; therefore, the time saved by skipping the computations in the MSP circuits cancels out the delay caused by the detection-logic unit.

3.1.2. When the detection-logic unit turns on the MSP: The MSP circuits must wait for the notification of the detection-logic unit to turn on the data latches and let the data in. Hence, the delay caused by the detection-logic unit contributes to the delay of the whole combinational circuitry, i.e., the 16-bit adder/subtractor in this design example.

3.1.3. When the detection-logic unit keeps its decision: No matter whether the last decision was to turn the MSP on or off, the delay of the detection logic is negligible because the path of the combinational circuitry (i.e., the 16-bit adder/subtractor in this design example) remains the same.

From the analysis above, the total delay is affected only when the detection-logic unit turns on the MSP. Even so, the detection-logic unit should be a speed-oriented design. When the SPST is applied to combinational circuitry, we should first determine the longest transitions of the cross sections of interest in each combinational circuit, which is a timing characteristic and is also related to the adopted technology. The longest transitions can be obtained by analyzing the timing differences between the earliest-arriving and the latest-arriving signals at the cross sections of a combinational circuit.

3.2. MAC

3.2.1 Block Diagram of MAC

In this project, a new architecture for a high-speed MAC is proposed. In this MAC, the computations of multiplication and accumulation are combined, and a hybrid-type CSA structure is proposed to reduce the critical path and improve the output rate. It uses the MBA algorithm based on the 1's complement number system. A modified array structure for the sign bits is used to increase the density of the operands. A carry look-ahead adder (CLA) is inserted in the CSA tree to reduce the number of bits in the final adder. In addition, in order to increase the output rate by optimizing the pipeline efficiency, intermediate calculation results are accumulated in the form of sum and carry instead of as final adder outputs.

A multiplier can be divided into three operational steps. The first is radix-2 Booth encoding, in which a partial product is generated from the multiplicand X and the multiplier Y. The second is the adder array, or partial product compression, which adds all partial products and converts them into the form of sum and carry. The last is the final addition, in which the final multiplication result is produced by adding the sum and the carry. If the process of accumulating the multiplied results is included, a MAC consists of four steps, as shown in Fig. 1, which shows the operational steps explicitly.
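The idea of accumulating in sum-and-carry form can be sketched at word level in Python. This is a simplified illustration under assumptions, not the proposed architecture: the names `csa` and `mac`, and the 32-bit accumulator width, are hypothetical, and each product is computed directly with `*` rather than by a Booth encoder and compression tree.

```python
WIDTH = 32
MASK = (1 << WIDTH) - 1


def csa(x, y, z):
    """3:2 compression: return (sum, carry) with sum + carry == x + y + z (mod 2**WIDTH)."""
    s = x ^ y ^ z                               # bitwise sum without carries
    c = ((x & y) | (x & z) | (y & z)) << 1      # carries, moved to the next bit position
    return s & MASK, c & MASK


def mac(pairs):
    """Accumulate x*y over all pairs, keeping the accumulator as a (sum, carry) pair."""
    s = c = 0
    for x, y in pairs:
        # No carry propagation inside the loop: the new product is merged
        # into the running (sum, carry) pair by a single 3:2 compression.
        s, c = csa(s, c, (x * y) & MASK)
    # The only carry-propagate addition happens once, at the end.
    return (s + c) & MASK


print(mac([(3, 4), (5, 6), (7, 8)]))  # 98
```

Because the carry-propagate addition is taken out of the loop, each accumulation step costs only one level of 3:2 compression, which is the motivation for accumulating sum and carry instead of final adder outputs.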

3.2.3. Proposed MAC Architecture

In this section, the expression for the new arithmetic is derived from the equations of the standard design. From this result, a VLSI architecture for the new MAC is proposed. In addition, a hybrid-type CSA architecture that can satisfy the operation of the proposed MAC is proposed.

3.3. A Radix-4 modified Booth's algorithm

Booth's algorithm is simple but powerful. The speed of the VMFU depends on the number of partial products and on the speed at which the partial products are accumulated. Booth's algorithm lets us reduce the number of partial products. We choose the radix-4 algorithm for the following reasons. The original Booth's algorithm has an inefficient case: 17 partial products are generated in a 16-bit x 16-bit signed or unsigned multiplication. The modified Booth's radix-8 algorithm, on the other hand, has a prohibitive encoding time in a 16-bit x 16-bit multiplication.

The radix-8 algorithm has a 3x term, which means that a partial product cannot be generated by shifting alone. Therefore, 2x + 1x is needed in the encoding process. One solution is to handle the additional 1x term in the Wallace tree. However, a large Wallace tree has problems of its own.

A radix-4 modified Booth's algorithm: Booth's radix-4 algorithm is widely used to reduce the area of a multiplier and to increase its speed. Grouping the multiplier bits three at a time with overlap halves the number of partial products, which improves the system speed. The radix-4 modified Booth's algorithm is as follows:

1. Set x-1 = 0, i.e., insert a 0 to the right of the LSB of the multiplier.
2. Starting from x-1, group the bits three at a time, with adjacent groups overlapping by one bit.
3. If the number of multiplier bits is odd, add one extra bit to the left of the MSB.
4. Generate each partial product from the truth table.
5. As each new partial product is generated, add it with a further 2-bit left shift, in regular sequence.

x: multiplicand, y: multiplier
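The grouping and recoding steps above can be sketched in Python as follows. This is a minimal illustration under assumptions: the function names `booth_radix4_digits` and `booth_multiply` are hypothetical, the truth table is expressed by the standard radix-4 digit formula (low bit + previous bit - 2 x high bit), and the multiplier is taken as a signed Python integer that fits in `n_bits` two's-complement bits.

```python
def booth_radix4_digits(multiplier: int, n_bits: int):
    """Recode a signed n_bits multiplier into radix-4 digits in {-2,-1,0,1,2}, LSB group first."""
    if n_bits % 2:
        n_bits += 1                  # odd width: add one extra (sign) bit on the left
    digits = []
    for i in range(0, n_bits, 2):
        # Overlapping 3-bit group (bit i+1, bit i, bit i-1); the bit right of the LSB is 0.
        prev = (multiplier >> (i - 1)) & 1 if i > 0 else 0
        low = (multiplier >> i) & 1
        high = (multiplier >> (i + 1)) & 1   # Python's >> sign-extends negative ints
        digits.append(low + prev - 2 * high)
    return digits


def booth_multiply(multiplicand: int, multiplier: int, n_bits: int) -> int:
    """Sum the partial products; partial product k is shifted left by 2*k bits."""
    product = 0
    for k, d in enumerate(booth_radix4_digits(multiplier, n_bits)):
        product += (d * multiplicand) << (2 * k)
    return product


print(booth_radix4_digits(7, 4))     # [-1, 2], since -1 + 2*4 = 7
print(booth_multiply(13, -7, 8))     # -91
```

An `n_bits`-wide multiplier yields `n_bits / 2` digits (rounded up), so the number of partial products is half that of bit-by-bit multiplication, and each digit only requires 0, ±x or ±2x, which are obtained by shifting and negation.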

3.4. Sign or zero extension

Our MAC supports signed or unsigned multiplication, and the produced result is 64 bits wide, stored in two special 32-bit registers. The MAC first receives a multiplicand and a multiplier, but Booth's radix-4 algorithm treats its operands as signed numbers. Hence, an extension bit is required so that unsigned operands can also be expressed as signed numbers. The core idea is that an unsigned number can be expressed as a signed number that is one bit wider, so a 32-bit unsigned number can be expressed as a 33-bit signed number. Seventeen partial products are generated in the 33-bit x 33-bit case (16 partial products in the 32-bit x 32-bit case). Here is an example of signed and unsigned multiplication. When x (the multiplicand) is the 3-bit value 111 and y (the multiplier) is the 3-bit value 111, the signed and unsigned products differ: in the signed case x x y = 1 (-1 x -1 = 1), and in the unsigned case x x y = 49 (7 x 7 = 49).
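The 3-bit example can be checked with a short Python sketch of one-bit extension. The helper names `to_signed` and `extend` are hypothetical; the sketch only illustrates that, after sign or zero extension by one bit, a single signed multiplier handles both cases.

```python
def to_signed(bits: int, width: int) -> int:
    """Interpret a width-bit pattern as a two's-complement signed integer."""
    return bits - (1 << width) if bits & (1 << (width - 1)) else bits


def extend(bits: int, width: int, signed: bool) -> int:
    """Widen a width-bit operand by one bit and return its signed value.

    Signed operands copy their MSB into the new bit (sign extension);
    unsigned operands get a 0 in the new bit (zero extension).
    """
    top = (bits >> (width - 1)) & 1 if signed else 0
    wide = bits | (top << width)
    return to_signed(wide, width + 1)


x = y = 0b111
print(extend(x, 3, signed=True) * extend(y, 3, signed=True))    # 1  (-1 x -1)
print(extend(x, 3, signed=False) * extend(y, 3, signed=False))  # 49 (7 x 7)
```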

3.5. Carry-Save Adder

When three or more operands are to be added simultaneously using two-operand adders, the time-consuming carry propagation must be repeated several times. If the number of operands is k, then carries have to propagate (k - 1) times (Weste & Harris, 3rd Ed). In carry-save addition, we let the carry propagate only in the last step, while in all the other steps we generate the partial sum and the sequence of carries separately. A CSA is capable of reducing the number of operands to be added from 3 to 2 without any carry propagation. A CSA can be implemented in different ways. In the simplest implementation, the basic element of the carry-save adder is a combination of two half adders, or a 1-bit full adder (Weste & Harris, 3rd Ed).
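A bit-level view of the CSA built from 1-bit full adders can be sketched in Python as follows. The names `full_adder`, `csa_3to2` and `add_many` are hypothetical, and the sketch works modulo 2^width (carries out of the top bit are dropped), which is an assumption made to keep the example short.

```python
def full_adder(a: int, b: int, cin: int):
    """1-bit full adder: return (sum bit, carry-out bit)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


def csa_3to2(x: int, y: int, z: int, width: int):
    """Reduce three operands to a (sum, carry) pair using one full adder per bit.

    The full adders are independent of each other: no carry ripples from
    bit i to bit i+1 inside this function.
    """
    s = c = 0
    for i in range(width):
        bit_s, bit_c = full_adder((x >> i) & 1, (y >> i) & 1, (z >> i) & 1)
        s |= bit_s << i
        c |= bit_c << (i + 1)        # a carry has the weight of the next bit
    return s, c & ((1 << width) - 1)


def add_many(operands, width: int) -> int:
    """Add k operands with k carry-save steps and a single carry-propagate addition."""
    s = c = 0
    for op in operands:
        s, c = csa_3to2(s, c, op, width)
    return (s + c) & ((1 << width) - 1)   # the only step where carries propagate


print(csa_3to2(0b101, 0b011, 0b110, 4))  # (0, 14): 5 + 3 + 6 = 14
print(add_many([3, 5, 7, 9], 8))         # 24
```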

3.6 Circuit Design Features

One of the most advanced types of MAC for general-purpose digital signal processing has been proposed by Elguibaly. It is an architecture in which accumulation has been combined with the carry-save adder (CSA) tree that compresses the partial products. In that architecture, the critical path was reduced by eliminating the adder for accumulation and decreasing the number of input bits in the final adder. While it performs better than the previous VMFU architectures because of the reduced critical path, its output rate still needs to be improved because the final adder results are used for accumulation. An architecture that merges the adder block into the accumulator register in the VMFU operator was proposed to make it possible to use two separate N/2-bit adders instead of one N-bit adder to accumulate the MAC results. Recently, Zicari proposed an architecture that uses a merging technique to fully utilize the 4-2 compressor. It also takes this compressor as the basic building block of the multiplication circuit.
