Hardware Implementation Low Power High Speed FFT Core

The International Arab Journal of Information Technology, Vol. 6, No. 1, January 2009


Muniandi Kannan and Srinivasa Srivatsa
Department of Electronics Engineering, Chennai, India

Abstract: In recent times, DSP algorithms have received increased attention due to rapid advancements in multimedia computing and high-speed wired and wireless communications. In response to these advances, the search for novel implementations of arithmetic-intensive circuitry has intensified. Because telecommunication systems must be portable, there is a need for low power hardware implementations of fast Fourier transform (FFT) algorithms. This paper proposes the hardware implementation of a low power, multiplier-less, radix-4 single-path delay commutator pipelined FFT processor architecture of sizes 16, 64 and 256 points. The multiplier-less architecture uses common sub-expression sharing to replace complex multiplications with simpler shift and add operations. Power reduction is achieved by combining this approach with a new commutator architecture and a low power butterfly architecture. When compared with a conventional FFT architecture based on a non-Booth-coded Wallace tree multiplier, the power reduction in this implementation is 44% and 60% for 64-point and 16-point radix-4 FFTs, respectively. The power dissipation is estimated using Cadence RTL Compiler. The operating frequencies are 166 MHz and 200 MHz for the 64-point and 16-point FFTs, respectively. Our implementation of the 256-point FFT architecture consumes 153 mW at an operating speed of 125 MHz.

Keywords: Pipelined architecture, shift register, finite state machine, common sub-expression, multiplier-less architecture.

Received March 15, 2007; accepted June 15, 2007

1. Introduction
The Fast Fourier Transform (FFT) is a fast implementation of the Discrete Fourier Transform (DFT) that relies on mathematical simplification and classification of the input sequence to achieve its performance gain. The FFT typically requires O(N log2 N) operations, compared with O(N^2) operations for the direct DFT [2]. The FFT processor is used in a wide range of DSP and communication applications, such as radar signal processing and wireless LAN. Recent research work has demonstrated the pipelined FFT as a leading architecture for real-time applications. In this paper, a low power and efficient multiplier-less approach is employed to substitute the complex multipliers in pipelined FFTs. In pipelined FFTs, a commutator is also needed to reorder the input data. It is well known that switching power is mainly responsible for power consumption in CMOS circuits. This power, P_sw, is given by

$$P_{sw} = \frac{1}{2}\, k\, C_{load}\, V_{dd}^{2}\, f \tag{1}$$

where k is the average number of times the gate makes an active transition during one clock cycle, f is the clock frequency, V_dd is the supply voltage and C_load is the load capacitance of the gate. Hence, to achieve low power, one or more of the parameters C_load, V_dd and k need to be minimized. However, since C_load and V_dd are fixed by the target technology, k becomes the main point of improvement.
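As a rough illustration of equation (1), the sketch below evaluates the switching power of a single gate and shows that halving the activity factor k halves the power when the other parameters are held constant. The numerical values (10 fF load, 1.8 V supply, 125 MHz clock, activity factors 0.5 and 0.25) are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def switching_power(k, c_load, v_dd, f):
    """Equation (1): P_sw = 0.5 * k * C_load * V_dd^2 * f, in watts."""
    return 0.5 * k * c_load * v_dd ** 2 * f


# Hypothetical gate parameters (assumptions, not values from the paper).
C_LOAD = 10e-15   # load capacitance in farads
V_DD = 1.8        # supply voltage in volts
F_CLK = 125e6     # clock frequency in hertz

base = switching_power(k=0.5, c_load=C_LOAD, v_dd=V_DD, f=F_CLK)
reduced = switching_power(k=0.25, c_load=C_LOAD, v_dd=V_DD, f=F_CLK)

# Power scales linearly with k, so the ratio below is 0.5.
print(f"base = {base:.4e} W, reduced = {reduced:.4e} W, ratio = {reduced / base:.2f}")
```

Because P_sw is linear in k, any architectural change that lowers the average switching activity translates directly into a proportional saving, which is the motivation for the design choices that follow.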

1.1. Radix-4 Pipelined FFT

The DFT of N complex data points x(n) is defined by

$$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \qquad k = 0, 1, 2, \ldots, N-1 \tag{2}$$

where $W_N = e^{-j 2\pi / N}$ is the twiddle factor [10]. Since 16 and 64 are powers of four, the radix-4 decimation-in-frequency (DIF) algorithm is used to break the DFT formula into four smaller DFTs. The FFT is the speed-up algorithm of the DFT [7]. The final sets of transforms are

$$X(4k) = \sum_{n=0}^{N/4-1} \left[ x(n) + x\!\left(n+\tfrac{N}{4}\right) + x\!\left(n+\tfrac{N}{2}\right) + x\!\left(n+\tfrac{3N}{4}\right) \right] W_N^{0}\, W_{N/4}^{kn} \tag{3}$$

$$X(4k+1) = \sum_{n=0}^{N/4-1} \left[ x(n) - j\,x\!\left(n+\tfrac{N}{4}\right) - x\!\left(n+\tfrac{N}{2}\right) + j\,x\!\left(n+\tfrac{3N}{4}\right) \right] W_N^{n}\, W_{N/4}^{kn} \tag{4}$$

$$X(4k+2) = \sum_{n=0}^{N/4-1} \left[ x(n) - x\!\left(n+\tfrac{N}{4}\right) + x\!\left(n+\tfrac{N}{2}\right) - x\!\left(n+\tfrac{3N}{4}\right) \right] W_N^{2n}\, W_{N/4}^{kn} \tag{5}$$

$$X(4k+3) = \sum_{n=0}^{N/4-1} \left[ x(n) + j\,x\!\left(n+\tfrac{N}{4}\right) - x\!\left(n+\tfrac{N}{2}\right) - j\,x\!\left(n+\tfrac{3N}{4}\right) \right] W_N^{3n}\, W_{N/4}^{kn} \tag{6}$$

For the 16-point FFT, k = 0, 1, 2, 3, which gives 16 equations; for the 64-point FFT, k varies from 0 to 15. The flow graph of the 16-point FFT is shown in Figure 1. In this figure, the numbers inside the open circles represent the equations that are used for computing the output in the butterfly stage.

Figure 1. Signal flow graph of a radix-4 16-point FFT (DIF algorithm).
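The decomposition in equations (3) to (6) can be checked numerically. The following is a minimal software sketch, not the hardware architecture of this paper: it applies the four butterfly sums and their twiddle factors recursively and compares the result with a direct evaluation of equation (2). The function names `dft` and `radix4_dif_fft` are illustrative, and the input is the 64-point test vector quoted in Section 3.1.

```python
import cmath


def dft(x):
    """Direct O(N^2) evaluation of equation (2), used only as a reference."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]


def radix4_dif_fft(x):
    """Recursive radix-4 DIF FFT; len(x) must be a power of four."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    q = n // 4
    y = [[], [], [], []]
    for i in range(q):
        a, b, c, d = x[i], x[i + q], x[i + 2 * q], x[i + 3 * q]
        w = cmath.exp(-2j * cmath.pi * i / n)               # W_N^n
        y[0].append(a + b + c + d)                          # equation (3)
        y[1].append((a - 1j * b - c + 1j * d) * w)          # equation (4)
        y[2].append((a - b + c - d) * w ** 2)               # equation (5)
        y[3].append((a + 1j * b - c - 1j * d) * w ** 3)     # equation (6)
    out = [0j] * n
    for m in range(4):
        # X(4k + m) is the N/4-point transform of the m-th butterfly output.
        out[m::4] = radix4_dif_fft(y[m])
    return out


x = [1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0] * 4
X = radix4_dif_fft(x)
err = max(abs(p - r) for p, r in zip(X, dft(x)))
print(f"X[0] = {X[0].real:.1f}, max deviation from direct DFT = {err:.2e}")
```

Each recursion level corresponds to one butterfly stage of the flow graph, so a 64-point transform uses three levels and a 16-point transform uses two.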

The number outside the open circle is the twiddle factor used. The four outputs from two commutators are fed into each simplified butterfly unit. The butterfly unit computes the four equations in a clock cycle. Coefficients are fed into the complex multipliers. A pipelined N-point radix-4 FFT processor based on this architecture [6], shown in Figure 2, has log4 N stages. Each stage produces one output within each word cycle. Each stage contains a commutator, a butterfly element and a complex multiplier. The sequential outputs at each stage must be ordered in accordance with the value of mt. For instance, from Figure 3 at stage 1, the outputs associated with mt = 0 are produced in the first four word cycles, then those associated with mt = 1 in the next four cycles, and so on. It is clear from the FFT equations that the input data for each summation at stage t are separated in time by Nt words.

Figure 2. N-point radix-4 pipelined FFT processor architecture.

2. Implementation
2.1. Commutator
In real-time applications, input data arrives as a sequential stream. This does not match the FFT algorithm, which requires temporal reordering of data. For this reason, a commutator is needed to reorder the input data. Among several pipelined FFT architectures, the Radix-4 Single-path Delay Commutator (R4SDC) [4] is widely used, owing to its high utilization of multipliers, butterfly elements and memory blocks. The commutators account for a growing share of the overall power consumption as the FFT size increases. Therefore, reducing the power consumption of the commutator units is crucial for the low power implementation of a pipelined FFT processor. The requisite commutator is shown in Figure 3 (one is required for each of the real and imaginary parts). It consists of six shift registers, each providing Nt word delays. Control signals (denoted c1, c2 and c3) select the appropriate data via 2:1 multiplexers. In accordance with the value of mt, the four complex outputs from the commutator are connected to its associated butterfly. The commutator supplies the same set of data for Nt word cycles. Each FIFO is implemented through a set of shift registers. The FIFO size Nt equals 4(5-t), where t is the stage number.

Figure 3. Commutator for stage t. Control signals for m1 = 0, 1, 2, 3: c1 = 1, 0, 0, 0; c2 = 1, 1, 0, 0; c3 = 1, 1, 1, 0.

2.2. Low Power Butterfly

The butterfly operation is the heart of the FFT algorithm. It takes data words from memory and computes the FFT. A Low power Butterfly (LB) architecture is employed to replace the conventional butterfly based on adder/subtracters. For radix-4, the complex multiplications within the sum can be replaced by a combination of additions, subtractions, and swapping between the real and imaginary parts, as shown in Figure 4. Three complex adder/subtracters (each comprising a real and an imaginary element) are used instead of eight complex adders [1]. Control signals again select data and functions in accordance with the value of mt. The butterfly element produces N outputs consecutively over N word cycles, in contrast to conventional configurations, which leave 3N/4 word cycles unused. Thus, only one complex multiplier is needed for the twiddle rotation at each stage, instead of three in other butterfly realizations.

Figure 4. Butterfly element for stage t. Control signals (0 = addition, 1 = subtraction) for m1 = 0, 1, 2, 3: (c1, c2, c3) = (0, 0, 0), (1, 0, 1), (0, 1, 1), (1, 1, 0).

2.3. Multiplier-Less Unit

Minimization of silicon area is achieved by reducing the number of functional units (such as adders and multipliers), multiplexers, and interconnection wires. A conventional complex multiplier consists of four real multipliers, one adder and one subtracter. Shift and addition operations with common sub-expression sharing are used to pre-compute twiddle coefficients, which reduces area as well as power [5]. The coefficients for the 16-point FFT are shown in Table 1. The multiplier-less unit, shown in Figure 7, consists of shift and addition operations with common sub-expression sharing that replace the complex multiplications [3].

Table 1. The coefficients for 16-point.

Sequence (m1 = 0, 1)   Quantized coefficient   Sequence (m1 = 2, 3)   Quantized coefficient
W0                     7fff, 0000              W0                     7fff, 0000
W0                     7fff, 0000              W2                     5a82, a57d
W0                     7fff, 0000              W4                     0000, 8000
W0                     7fff, 0000              W6                     a57d, a57d
W0                     7fff, 0000              W0                     7fff, 0000
W1                     7641, cf04              W3                     30fb, 89be
W2                     5a82, a57d              W6                     a57d, a57d
W3                     30fb, 89be              W9                     89be, 30fb

A close observation reveals that (7fff, 0000), which occurs seven times, and (0000, 8000) are trivial coefficients: they are the quantized representations of (1, 0) and (0, -1), respectively, in 16-bit two's complement format. In each pair, the first entry corresponds to the cosine function (the real part, Wr) and the second corresponds to the sine function (the imaginary part, Wi). For the trivial coefficients, complex multiplication is not necessary. Data multiplied by (7fff, 0000) can pass directly through the multiplier unit without any multiplication. Data multiplied by (0000, 8000) needs only an additional unit that swaps the real and imaginary parts of the input data and inverts the imaginary part. The rest of the coefficients can be represented by three constants (7641, 5a82 and 30fb). For example, a multiplication by the constant a57d can be realized by first multiplying the data by 5a82 and then two's complementing the result. The other two constants (89be and cf04) can be realized in a similar manner, using the constants 7641 and 30fb, respectively.

5a82 is represented in two's complement format, while 7641 and 30fb are represented in Canonical Signed-Digit (CSD) format: 5a82 (0101101010000010), 7641 (1000-10-1001000001) and 30fb (010-1000100000-10-1). Shifters and adders based on the three constants carry out the nontrivial complex multiplications as shown below, where a term such as 5X << 12 denotes the product 5X shifted left by 12 bits:

5a82X = 5X << 12 + 5X << 9 + 65X << 1
7641X = X << 15 + 65X - 5X << 9
30fbX = 65X << 8 - X << 12 - 5X

The common sub-expressions for the constants are 101 (5) and 1000001 (65). Figure 5 shows the shift-and-addition module for the three constants in the multiplier-less unit. In the multiplier-less approach, the ROM unit storing coefficients is replaced by an FSM unit generating control signals (s1-s8).

Figure 5. Block diagram of shift-and-addition module.

The same multiplier architecture applied to the 64-point FFT is shown in Figure 6. All coefficients for the 64-point FFT are represented in terms of 7f62, 7d8a, 7a7d, 7641, 70c2, 6a6d, 62f2, 5a82, 5133, 471c, 3c56, 30fb, 2528, 18f8, 0c8b [8]. The following coefficients are pre-computed using common sub-expression based shift and addition:

7f62X = X << 15 - 5X << 5 + X << 1
7d8aX = X << 15 - 5X << 7 + 5X << 1
7a7dX = 65X << 9 - X << 11 + X << 7 - X << 2 + X
7641X = X << 15 + 65X - 5X << 9
70c2X = X << 1 - X << 6 + X << 8 - X << 12 + X << 15
6a6dX = X - 5X << 2 + X << 7 + 65X << 9 - X << 13
62f2X = X << 1 + X << 10 + X << 15 - X << 4 - X << 8 - X << 13
5a82X = 5X << 12 + 5X << 9 + 65X << 1
5133X = 5X << 12 + 65X << 2 - X << 4 + X << 6
471cX = X << 5 - 65X << 2 + X << 11 + X << 14
3c56X = X << 7 - 5X << 1 - X << 5 - X << 10 + X << 14
30fbX = 65X << 8 - X << 12 - 5X
2528X = 5X << 3 + 5X << 8 + X << 13
18f8X = X << 8 - X << 3 - X << 11 + X << 13
0c8bX = X << 11 - 5X + 65X << 4 + X << 7

Similarly, the coefficients for the 256-point FFT are represented in terms of 7ff6, 7fd8, 7fa7, 7f62, 7f09, 7e9d, 7e1d, 7d8a, 7ce2, 7c29, 7b5d, 7a7d, 798a, 7884, 776c, 7641, 7504, 73b5, 7255, 70e2, 6f5f, 6dca, 6c24, 6a6d, 68a6, 66cf, 64e8, 62f2, 60ec, 5ed7, 5cb4, 5a82, 5842, 55f5, 539b, 5133, 4ebf, 4c3f, 49b4, 471c, 447a, 41ce, 3f17, 3c56, 398c, 36ba, 33de, 30fb, 2e11, 2b1f, 282b, 2528, 2223, 1f19, 1c0b, 18f8, 15e2, 12c8, 0fab, 0c8b, 096a, 0647, 0324. These coefficients are also pre-computed using common sub-expression based shift and addition.
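The shift-and-addition scheme for the three 16-point constants can be sketched in software as follows. This is an illustrative model rather than the RTL of the design: the function names are hypothetical, the Q15 scaling (division by 2^15 through an arithmetic right shift) is an assumption, and the negative sine term is formed by exact negation instead of the quantized value cf04 listed in Table 1.

```python
import math


def shift_add_products(x):
    """Multiply integer x by 0x5a82, 0x7641 and 0x30fb using shifts and adds only."""
    x5 = (x << 2) + x     # shared sub-expression 101b = 5
    x65 = (x << 6) + x    # shared sub-expression 1000001b = 65
    return {
        0x5A82: (x5 << 12) + (x5 << 9) + (x65 << 1),
        0x7641: (x << 15) + x65 - (x5 << 9),
        0x30FB: (x65 << 8) - (x << 12) - x5,
    }


def rotate_by_w1(re, im):
    """Multiply (re + j*im) by W1 ~ (0x7641 - j*0x30fb) / 2**15 (assumed Q15 scaling)."""
    pr = shift_add_products(re)
    pi = shift_add_products(im)
    real = pr[0x7641] + pi[0x30FB]   # re*cos + im*sin
    imag = pi[0x7641] - pr[0x30FB]   # im*cos - re*sin
    return real >> 15, imag >> 15


# Each shift-add expression reproduces an ordinary integer multiplication.
for x in (1, 3, -7, 12345):
    for const, prod in shift_add_products(x).items():
        assert prod == const * x

re_out, im_out = rotate_by_w1(1000, 0)
print(re_out, im_out, 1000 * math.cos(math.pi / 8), -1000 * math.sin(math.pi / 8))
```

Both sub-expressions, 5X and 65X, are computed once and reused by all three products, which is the saving that common sub-expression sharing is meant to capture. The same kind of check can be applied to the longer 64-point expressions.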

Figure 6. Block diagram of shift-and-addition module (64-point).

Figure 7. Block diagram of the multiplier-less unit.

3. Results
3.1. Simulation Results Using Modelsim Tool
The FFT blocks are simulated in Verilog HDL using the Modelsim tool, and the results are shown below. The resulting Verilog HDL simulation models can then be used as building blocks in larger circuits (using schematics, block diagrams or system-level Verilog HDL descriptions) for the purpose of simulation. The top module is simulated for 32-bit (complex data) fixed point using the 16-point, 64-point and 256-point radix-4 DIF FFT algorithm. The given inputs and corresponding outputs are as follows:

Input: 1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0

Output: 64, 0, 0, 0, -16.109-24.109i, 0, 0, 0, 0, 0, 0, 0, 9.9864-1.9864i, 0, 0, 0, 0, 0, 0, 0, 1.3273-6.6727i, 0, 0, 0, 0, 0, 0, 0, 4.7956+3.2044i, 0, 0, 0, 0, 0, 0, 0, 4.7956-3.2044i, 0, 0, 0, 0, 0, 0, 0, 1.3273+6.6727i, 0, 0, 0, 0, 0, 0, 0, 9.9864+1.9864i, 0, 0, 0, 0, 0, 0, 0, -16.109+24.109i, 0, 0, 0.

3.2. Synthesis Results

The proposed FFT architecture has been synthesized for 16-point, 64-point and 256-point sizes using Cadence RTL Compiler, targeting the TSMC 0.18 µm CMOS technology library.

3.3. Reports
RTL Compiler was used to generate the power, area and timing reports for the FFTs. The timing and power reports for the 16-point, 64-point and 256-point FFT cores are shown in Tables 2 and 3. The power and area reports of the different modules in the top-level FFT core are shown in Tables 4 and 5 for the 16-point and 64-point FFTs, and in Tables 6 and 7 for the 256-point FFT.

Table 2. Timing report for FFT core (different points).

FFT size     Frequency (MHz)
16-point     200
64-point     166.66
256-point    125

Table 3. Power report for FFT core.

Attribute         16-point   64-point   256-point
Leakage (mW)      0.0012     0.0015     0.0038
Internal (mW)     11.106     21.372     69.153
Net (mW)          1.779      2.555      7.590
Switching (mW)    12.885     23.927     76.744
Total (mW)        25.772     47.856     153.49
Area (mm2)        0.2112     0.3216     0.8417

Table 4. Power report for 16-point FFT core (different modules).

Attribute          Comm (I)   Comm (II)   Multiplier   Butterfly
Leakage (µW)       0.31       0.09        0.3          0.25
Internal (mW)      3.978      0.993       3.5793       3.4008
Net (mW)           0.430      0.128       0.9848       0.9624
Switching (mW)     4.408      1.122       4.5641       4.3632
Total (mW)         8.817      2.444       9.128        8.726
Cell area (mm2)    0.069      0.018       0.0442       0.0420

Table 5. Power report for 64-point FFT core (different modules).

Attribute          Comm (I)   Comm (II)   Comm (III)   Multiplier   Butterfly
Leakage (µW)       1.14       0.31        0.09         0.97         0.25
Internal (mW)      15.303     3.9781      0.9932       6.676        3.40
Net (mW)           1.572      0.4303      0.1287       1.889        0.96
Switching (mW)     16.875     4.4083      1.122        8.566        4.36
Total (mW)         33.751     8.817       2.444        17.13        8.72
Cell area (mm2)    0.2606     0.0690      0.0185       0.159        0.04

Table 6. Power report for 256-point FFT core (different modules).

Attribute          Comm (I)   Comm (II)   Comm (III)   Comm (IV)
Leakage (µW)       4.4        1.14        0.31         0.09
Internal (mW)      60.601     15.303      3.9781       0.9932
Net (mW)           6.140      1.572       0.4303       0.1287
Switching (mW)     66.742     16.875      4.4083       1.122
Total (mW)         133.487    33.751      8.817        2.444
Cell area (mm2)    1.0270     0.2606      0.0690       0.0185

Table 7. Power report for 256-point FFT core (different modules).

Attribute          Multiplier   Butterfly
Leakage (µW)       1.5          0.25
Internal (mW)      21.859       3.4008
Net (mW)           5.276        0.9624
Switching (mW)     27.135       4.3632
Total (mW)         54.271       8.726
Cell area (mm2)    0.044224     0.0420

4. Conclusion
In this paper, a pipelined architecture for 16-point, 64-point and 256-point radix-4 DIF FFTs in fixed-point representation is implemented. The low power FFT processor is implemented using a multiplier-less (shift-add) approach for multiplying by the twiddle coefficients. This paper presents a multiplier-less pipelined FFT processor architecture suitable for shorter FFTs. This design approach can also be applied to longer FFTs. The multiplier-less architecture employs the minimum number of shift and addition operations to realize the complex multiplications. By combining a commutator architecture and a low power butterfly architecture with this approach, the resulting power savings are around 43% and 59% for 64-point and 16-point radix-4 FFTs, respectively, as compared to a conventional FFT architecture based on a non-Booth-coded Wallace tree multiplier. The parameterization impact on power/speed performance has been compared.

References
[1] Bi G. and Jones E., A Pipelined FFT Processor for Word Sequential Data, IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 37, no. 12, pp. 1982-1985, 1989.
[2] Cooley J. and Tukey J., An Algorithm for the Machine Computation of the Complex Fourier Series, Mathematics of Computation, vol. 19, pp. 297-301, 1965.
[3] Han W., Arslan T., Erdogan A., and Hasan M., A Novel Low Power Pipelined FFT Based on Sub Expression Sharing for Wireless LAN Applications, IEEE Workshop on Signal Processing Systems, pp. 83-88, 2004.
[4] Han W., Arslan T., Erdogan A., and Hasan M., Low Power Commutator for Pipelined FFT Processors, IEEE International Symposium on Circuits and Systems, vol. 5, pp. 5274-5277, 2005.
[5] Han W., Arslan T., Erdogan A., and Hasan M., Multiplier-Less Based Parallel-Pipelined FFT Architecture for Wireless Communication Applications, IEEE Proceedings on Acoustics, Speech and Signal Processing, vol. 5, pp. 45-48, 2005.
[6] Han W., Arslan T., Erdogan A., and Hasan M., The Development of High Performance FFT IP Cores Through Hybrid Low Power Algorithmic Methodology, in Proceedings of the Asia South Pacific Design Automation, pp. 549-552, China, 2005.
[7] John G. and Manolakis D., Digital Signal Processing, MacMillian, London, 1988.
[8] Maharatna K., Grass E., and Jagdhold U., A 64-Point Fourier Transform Chip for High-Speed Wireless LAN Application Using OFDM, IEEE Journal of Solid-State Circuits, vol. 39, no. 3, pp. 484-493, 2004.
[9] Mohd H. and Tughrul A., A Triple Port RAM Based Low Power Commutator Architecture for a Pipelined FFT Processor, in Proceedings of the 2003 International Symposium of Circuits and Systems (ISCAS03), vol. 5, pp. 353-356, 2003.
[10] Rabiner L. and Gold B., Theory and Application of Digital Signal Processing, Prentice Hall, 1975.

Muniandi Kannan received his BE in electronics and communication engineering from MK University, Madurai, and his ME from Anna University, Chennai. Since 1993 he has been working at Anna University, Chennai, India. His areas of interest include computer architecture, VLSI design, and VLSI for signal processing.

Srinivasa Srivatsa received his BE in electronics and telecommunication engineering from Jadavpur University, and his ME in electrical communication engineering and his PhD from the Indian Institute of Science, Bangalore, India. He was a professor of electronics engineering at Anna University, Chennai, India for nearly 20 years. He is the author of 191 publications in reputed journals and conference proceedings. His areas of interest include computer networks, digital logic design, design of algorithms, and robotics.
