The Design and Implementation of FFT Algorithm Based On The Xilinx FPGA IP Core
In actual hardware, module execution speed is a very important parameter, so this paper uses the
Pipelined, Streaming I/O architecture, which processes data continuously, for simulation and
validation. The Pipelined, Streaming I/O architecture pipelines a series of radix-2 butterfly
processing engines, and each butterfly processing engine has its own independent memory for
input data and intermediate data (Figure 1). With this structure, the FFT IP core can process the
current frame of N points, load the next frame of N points, and output the previous frame of N
points at the same time.

The Xilinx FFT IP core V5.0 supports three arithmetic types: full-precision unscaled, block
floating-point, and scaled fixed-point (with a user-defined scaling schedule).

In the full-precision unscaled mode, all significant integer bits in the data path are retained,
and the fractional bits produced during the operation are truncated or rounded. With fixed-point
arithmetic, the data width grows through the successive multiplication stages, so the output
width increases to (input data width + log2(transform length) + 1) bits.

In the block floating-point mode, all data points in a frame share the same scaling factor, which
is reported on the Block Exponent output. Scaling is performed only when the FFT IP core detects
that the data would otherwise overflow.

This paper adopts the scaled fixed-point mode. Compared with the full-precision unscaled mode, it
greatly reduces the use of FPGA internal resources (XtremeDSP slices and block RAM), and compared
with the block floating-point mode, its scaling can be adjusted flexibly. In the scaled
fixed-point mode, the scaling schedule (Scale_SCH) scales the data at each stage by a factor of
1, 2, 4, or 8, which corresponds to a right shift of 0, 1, 2, or 3 bits. If the scaling is
insufficient, the butterfly outputs exceed the dynamic range and the data overflow.

For the Burst I/O architectures, Scale_SCH is expressed as follows: the scaling of each stage is
specified by a 2-bit field, and the field for stage 0 occupies the two least significant bits.
The form is [... N4, N3, N2, N1, N0], where each 2-bit field gives the scaling of the
corresponding stage. For example, for the radix-4 architecture with transform length N = 1024,
Scale_SCH = [01 10 00 11 10] means a right shift of 2 bits at stage 0, 3 bits at stage 1, 0 bits
at stage 2, 2 bits at stage 3, and 1 bit at stage 4.

As an empirical rule (which prevents data overflow): for the 1024-point radix-4 Burst I/O
architecture, Scale_SCH = [10 10 10 10 11]; for the 1024-point radix-2 architecture,
Scale_SCH = [01 01 01 01 01 01 01 01 01 10].

For the Pipelined, Streaming I/O architecture, adjacent pairs of radix-2 stages are grouped
together: stage 0 and stage 1 form group 0, stage 2 and stage 3 form group 1, and so on.
Scale_SCH specifies the scaling of each group with a 2-bit field, and the field for group 0
occupies the two least significant bits. The form is [... N4, N3, N2, N1, N0], where each 2-bit
field gives the scaling of the corresponding group; that is, the two radix-2 stages of a group
share the same scaling. For example, with transform length N = 1024, Scale_SCH = [10 10 00 01 11]
means a right shift of 3 bits for group 0 (stage 0 and stage 1), 1 bit for group 1 (stage 2 and
stage 3), no shift for group 2 (stage 4 and stage 5), 2 bits for group 3 (stage 6 and stage 7),
and 2 bits for group 4 (stage 8 and stage 9). If the transform length N is not a power of 4, the
last group contains only one radix-2 stage, and its field can only be 00 or 01.

As an empirical rule (which prevents data overflow): for N = 512, Scale_SCH = [01 10 10 10 11];
for N = 1024, Scale_SCH = [10 10 10 10 11].

The bit width of Scale_SCH is 2 * ceil(0.5 * log2(N)) for the Pipelined, Streaming I/O
architecture and the Radix-4, Burst I/O architecture, and 2 * log2(N) for the Radix-2, Burst I/O
architecture and the Radix-2 Lite, Burst I/O architecture, where N is the transform length.

III. SIMULATION VALIDATION OF THE FFT IP CORE

A 512-point FFT module with 16-bit input data and 16-bit phase factors was implemented by
instantiating the Xilinx IP core, with a clock frequency of 50 MHz (a higher clock frequency
allows a higher degree of resource reuse and saves more area). The module uses the Pipelined,
Streaming I/O architecture with scaled fixed-point arithmetic and was debugged on a medium- or
low-end FPGA to verify its reliability and feasibility. To make the function of the FFT IP core
easy to verify, a counter starts from zero and increments by one on every rising clock edge, and
the count value is used as both the real part and the imaginary part of the input signal.
Scale_SCH = [01 10 10 01 11]. The project was built in ISE 10.1, the Xilinx FFT IP core was
instantiated, and the design was simulated with ModelSim SE 6.5. The simulation timing is shown
in Figure 2.

Timing validation: the whole timing sequence is correct. As can be seen from the timing diagram,
the busy signal is high while the FFT IP core is performing the FFT, and it goes low when the
operation has ended and the FFT result is output. The edone signal goes high one cycle before
the done signal; one cycle later, done goes high and marks the completion of the FFT operation.
Because the transform has 512 points, edone and done each produce one pulse every 512 clock
cycles. The RFD signal stays high, which indicates that input data is being transferred to the
input port of the FFT IP core and is consistent with the continuous data processing of the
Pipelined, Streaming I/O architecture. The DV signal being high indicates that the output is
valid.

Functional verification: in the Pipelined, Streaming I/O architecture, the result of each frame
is output after a delay of three frames. From this it follows that the output shown in the
simulation above is the FFT result for the input [94:605] + [94:605]*j. The Matlab result for
the same input, scaled according to Scale_SCH, is consistent with the simulation output, which
shows that the FFT IP core works correctly.

IV. CONCLUSION

This paper tests the FFT IP core as a whole and validates the feasibility and reliability of the
FFT algorithm
in medium- or low-end FPGAs. Implementing the FFT with the pipelined architecture and scaled
fixed-point arithmetic reduces the time spent reading and processing data and better meets the
needs of FFT data processing.
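The functional check described in Section III (comparing the core output with a software FFT scaled according to Scale_SCH) can be sketched in Python as follows. This is only a floating-point reference model under stated assumptions, not the authors' Matlab script and not a bit-accurate model of the Xilinx core: the function names `fft` and `scaled_reference` are hypothetical, and the per-group right shifts are modeled as a single division by 2 raised to the sum of the shifts.

```python
import cmath


def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def scaled_reference(frame, shifts):
    """Scale the ideal FFT by the total right shift of the schedule.

    Assumption: intermediate truncation inside the hardware is ignored, so
    the hardware output may differ from this model in the low-order bits.
    """
    total_shift = sum(shifts)
    return [value / (1 << total_shift) for value in fft(frame)]


# Scale_SCH = [01 10 10 01 11]: groups 0..4 shift right by 3, 1, 2, 2, 1 bits.
SHIFTS = [3, 1, 2, 2, 1]

# Counter values 94..605 used as both real and imaginary part (512 points).
frame = [complex(v, v) for v in range(94, 606)]
reference = scaled_reference(frame, SHIFTS)
```

With this schedule the total shift is 9 bits, so the model divides the ideal FFT by 512. The DC bin `reference[0]` is then the mean of the 512 inputs, which is approximately 349.5 + 349.5j and gives a quick sanity check before comparing the remaining bins.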
[Figure 1: Pipelined, Streaming I/O architecture. Input data passes through a series of radix-2 butterfly processing engines (stage 0, stage 1, ..., stage n-1, stage n), each with its own memory bank, followed by output reordering that produces the output data.]
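The Scale_SCH encoding and the width formulas from Section II can be illustrated with a short Python sketch. The helper names (`decode_scale_sch`, `scale_sch_width`, `full_precision_output_width`) are hypothetical and are not part of any Xilinx tool. The schedule is assumed to be written as in the text, as space-separated 2-bit fields with the field for stage 0 (or group 0) on the right.

```python
import math


def decode_scale_sch(fields):
    """Return the right-shift amounts, index 0 = stage 0 (or group 0).

    `fields` is a string such as "10 10 00 01 11"; the rightmost 2-bit
    field belongs to stage/group 0, as described in the text.
    """
    tokens = fields.replace(",", " ").split()
    for tok in tokens:
        if len(tok) != 2 or set(tok) - {"0", "1"}:
            raise ValueError(f"not a 2-bit field: {tok!r}")
    return [int(tok, 2) for tok in reversed(tokens)]


def _log2_exact(n):
    """log2 of a power-of-two transform length."""
    if n < 2 or n & (n - 1):
        raise ValueError("transform length must be a power of two")
    return n.bit_length() - 1


def scale_sch_width(n, radix2_burst=False):
    """Bit width of Scale_SCH for transform length n.

    radix2_burst=True covers the Radix-2 and Radix-2 Lite Burst I/O
    architectures; False covers Pipelined Streaming and Radix-4 Burst I/O.
    """
    log2n = _log2_exact(n)
    if radix2_burst:
        return 2 * log2n
    return 2 * math.ceil(0.5 * log2n)


def full_precision_output_width(input_width, n):
    """Output width in full-precision unscaled mode: input + log2(N) + 1."""
    return input_width + _log2_exact(n) + 1


# Streaming example from the text: N = 1024, Scale_SCH = [10 10 00 01 11].
group_shifts = decode_scale_sch("10 10 00 01 11")
print(group_shifts)                       # shift per group, group 0 first
print(2 ** sum(group_shifts))             # overall scaling factor
print(scale_sch_width(1024))              # Pipelined Streaming / Radix-4 Burst
print(scale_sch_width(1024, True))        # Radix-2 Burst
print(full_precision_output_width(16, 512))
```

For the streaming example the decoded list is group 0 first, so it can be compared directly with the shifts listed in the text (3, 1, 0, 2, 2). Summing the list gives the total number of bits by which the output is scaled down, which is the factor a software reference has to apply before comparison.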