RTL Synthesizable Asynchronous FIFO
Shruti Sharma (a)
(a) University School of Information and Communication Technology, Guru Gobind Singh Indraprastha
University (GGSIPU), Delhi, India. Email: [email protected]
(Student of Masters of Technology in Electronics and Communication Engineering, Final Year)
models (augmented with SystemVerilog features), allowing the use of any standard Verilog simulator that supports SystemVerilog [3] features for functional verification, and its generation of Verilog models suitable for register-transfer level (RTL) synthesis.

The four-phase handshake protocol [3,4] is used in this work mainly to obtain an asynchronous FIFO that differs from flow-through FIFOs, i.e. one in which the movement of data through the FIFO is avoided in order to reduce latency. Briefly, each cycle of the four-phase protocol has two phases: a work phase and a reset phase. The work phase extends from the rising edge of the request to the rising edge of the acknowledgement, and indicates the handling of a request and its completion. The reset phase consists of the return-to-zero of both the request and acknowledgement signals. The signalling transitions can be arranged differently for different trade-offs between cost and performance. The work proposed in [3] presents an early acknowledgement protocol as an improvement over the normal four-phase protocol that entirely conceals the reset phase of the signalling, i.e. the acknowledgement of the receiver is indicated by the falling edge of the acknowledge signal instead of by the rising edge as in the normal four-phase protocol. In the early acknowledgement protocol the request signal is reset whenever both the request signal and the acknowledgement signal are high. The end of the ongoing transaction is marked by the actual acknowledgement, given by the falling edge of the acknowledgement signal, which also resets the acknowledgement signal for the upcoming request-acknowledge cycle. In this way the protocol removes the reset phase inherited from the simple four-phase protocol.

The International Technology Roadmap for Semiconductors (ITRS) in its 2008 edition states a clear need for asynchronous communication protocols for control and synchronization in integrated circuits (ICs) [5]. The 2011 edition further emphasized the importance of asynchronous handshaking and of the globally asynchronous locally synchronous (GALS) design approach in integrated circuits [6]. Most system-on-chip (SoC) designs adopt the synchronous style, which assumes a discrete notion of time; this assumption considerably reduces the complexity of digital circuit design and allows designs to be modelled easily using register-transfer level (RTL) languages such as VHDL or Verilog. The work proposed in [7] presents a set of modelling guidelines and synthesis techniques for designing asynchronous pipelines. In the context of retaining low area and power consumption in any asynchronous control system, this work offered a method which avoids the conventional syntax-directed translation approach. As an alternative, it adopts a data-driven, coarse-grain design approach to synthesise an asynchronous control system, thereby restricting the communication channels from being implemented asynchronously. The presented method is adopted because it can be easily integrated into conventional synchronous design styles, since it is based on Verilog and SystemVerilog specifications, and can produce the appropriate register-transfer level (RTL) models for functional simulation and logic synthesis with currently available computer-aided design tools.

The work in [8] proposed an RTL synthesizable asynchronous FIFO design that uses a modified pipeline construct, which proves beneficial in designing FIFOs in which data movement within the FIFO is avoided. With the modified pipeline concept, the complexity of the control pipelines that exchange the tokens or signals granting permission to read or write in the FIFO can be reduced. A further benefit described in that work is its design in RTL, i.e. using the hardware description language VHDL, which makes it possible to combine it with other synchronous RTL designs to construct a large, heterogeneous system with both asynchronous and synchronous parts, such as a GALS on-chip system.

2. ASYNCHRONOUS VS. SYNCHRONOUS DESIGN

Asynchronously designed circuits offer numerous benefits over their synchronous equivalents; one such benefit is the complete avoidance of clock skew, which has become a major concern in deep sub-micron technologies [9]. Synchronously designed circuits face increasing difficulty in distributing a global clock signal while maintaining low clock slew throughout the circuit, due to the scaling of technologies with rapidly growing design sizes and the demand for higher clock speeds. The global synchronization of events in synchronous designs makes it necessary to slow their circuits down further to compensate for the skew; otherwise this skew may become a much bigger concern as the feature size is reduced further. Clock distribution also consumes additional circuit power.

In contrast, an asynchronous design, which is demand-driven, has emerged as a solution to such difficulties: it removes the clock signal and employs a handshaking protocol among the neighbors participating in a transfer of data
without being globally synchronized by a single clock, making it more energy-efficient. However, fully asynchronous circuit designs are still not accepted as a complete replacement for synchronous designs due to the absence of advanced computer-aided design tools.

A digital circuit is said to be synchronous if all activities in the circuit are controlled by a single clock signal. A circuit that operates otherwise, with no clock signal, is said to be non-synchronous or simply asynchronous. The work in [10] presents an implementation of a GALS pipelined processor on commercial synchronous FPGAs, in which a pipelined accumulator-based processor is implemented as a sample pipelined processor with variable stage delays. That work presents a port controller that allows the pipeline to function correctly under varying stage delays, resulting in increased performance and relatively reduced power consumption. The work in [9] also described the categories of asynchronous circuits based on the delays of wires and gates. The delay-insensitive (DI) model, which is the most robust, operates correctly irrespective of any wire or gate delay, but it is a very restrictive class. The other class, the quasi-delay-insensitive (QDI) model, relaxes this by assuming that signal transitions arrive at the same time only at the end points of designated (isochronic) forks.

3. ARCHITECTURE OF ASYNCHRONOUS FIFO

In this paper, an asynchronous FIFO is presented which avoids data movement in a pipeline: the data do not actually move through the pipeline but instead pass through latches connected in an asynchronous manner in the data cell blocks. The presented FIFO contains two main units: a control block and a data cell/bank block. The basic block diagram of the asynchronous FIFO architecture is provided in Figure 1.

A. Control Logic Block:

The control block has two inputs (wr_req and rd_req) and two outputs (wr_ack and rd_ack). This FIFO works with the four-phase bundled-data handshaking protocol. The process of moving data across the FIFO (asynchronous/synchronous) can be explained in the following steps:

Step a: The sender sets the request signal (wr_req) after the data to be sent (Data_In) is ready.
Step b: The FIFO sets the acknowledge signal (wr_ack) after receiving the incoming data successfully.
Step c: The sender then responds to the wr_ack signal by resetting the wr_req signal.
Step d: Finally, the FIFO resets the push_ack (wr_ack) signal after push_req (wr_req) has been reset.

The process of reading data from the FIFO is similar to the writing/pushing process, except that the data is supplied by the FIFO and obtained by the receiver.

The control block contains one control cell for each data cell in the data bank block and provides control signals to the data cells as their second input. Each control cell controls one data cell in the data bank block, and the wr_req/rd_req input signals are fed into every control cell by default. The basic principle of the control block is that every control cell responds to write and read request signals only when it holds the permission (token) for that operation. A token for the read operation is granted to control cell 1 after system reset. After that, the write/push token is transferred to the next control cell once the cell currently holding it has completed writing/pushing data into the FIFO; when the write/push token reaches the last cell in the control block, it returns to control cell 1. Thus, by following this token-passing scheme, data can be written into or read from the asynchronous FIFO without actually moving the data across the FIFO.

B. Data Bank Block:

The data block is composed of data cells, each containing an array of latches, and a multiplexer. The function of the data block is to latch the incoming data (data_in) and to output the data (data_out) requested by the control signals from the control block. Each data cell takes two inputs: data_in, which is fed to every cell by default, and an enable/control signal coming from its control cell. The enable/control signals from the control block are used as enable signals for latching the incoming data. The latches used in the data cells are the same as those used in the control block. The number of latches in a data cell depends on the data width of the FIFO. The read_ctr/en signal shown in the figure above is the combination of all the read_ctr/en signals coming from the control cells. It is used as the control signal for selecting the requested data among the data cells. An arbiter is also used in the proposed asynchronous FIFO because the FIFO can handle only one request (write or read) at a time; the arbiter solves this problem by granting either the read or the write request at any one time. Table 1 illustrates the characteristics of the implemented FIFO.
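
The write handshake of Steps a to d in Section 3.A can be traced with a small behavioural sketch. It is written in Python purely for illustration and is not the RTL of the presented design; the function name `four_phase_write` and the tuple layout are hypothetical, and only the signal names wr_req and wr_ack come from the text.

```python
def four_phase_write(data):
    """Yield (step, wr_req, wr_ack, latched) for one four-phase write.

    Behavioural sketch only: real hardware reacts to signal edges,
    while this generator simply lists the states in protocol order.
    """
    wr_req, wr_ack, latched = 0, 0, None
    yield ("idle", wr_req, wr_ack, latched)

    # Step a: the sender raises wr_req once the data is ready.
    wr_req = 1
    yield ("a", wr_req, wr_ack, latched)

    # Step b: the FIFO latches the data and raises wr_ack.
    latched = data
    wr_ack = 1
    yield ("b", wr_req, wr_ack, latched)

    # Step c: the sender sees wr_ack and lowers wr_req.
    wr_req = 0
    yield ("c", wr_req, wr_ack, latched)

    # Step d: the FIFO lowers wr_ack; both signals are back at zero.
    wr_ack = 0
    yield ("d", wr_req, wr_ack, latched)


trace = list(four_phase_write(0xA5))
for step, req, ack, latched in trace:
    print(step, req, ack, latched)
```

Steps a and b correspond to the work phase and Steps c and d to the reset (return-to-zero) phase described in the introduction. A read transaction would follow the same pattern with rd_req and rd_ack, with the data supplied by the FIFO.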
[Figure 1: block diagram of the asynchronous FIFO, showing the control logic block]
Signal | Description
Wr req (push_req) | The write (push_req) input signal is fed into every cell of the control block by default. Every control cell responds to push/write and pop/read request signals only when it has permission for the corresponding operation.
Rd req (pop_req) | The read (pop_req) input signal is fed into every control cell by default.
Wr ack / Rd ack | Composed of the push/pop-ack signals from each cell of the control block, which is reasonable because only one cell of this block asserts its push/pop acknowledge signal for the current push/pop request.
Data in / Data out | The data bank block latches the incoming data or outputs the data requested by the control signals from the Control Logic block.
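
The token-passing scheme and the signal roles listed above can be illustrated with a minimal behavioural model. This is a sketch under stated assumptions, not the presented RTL: the class name `TokenRingFifo`, the per-cell `full` flag, and the choice to start both tokens at control cell 1 (index 0) are hypothetical. Calling the methods one at a time stands in for the arbiter, which grants only one request at once.

```python
class TokenRingFifo:
    """Behavioural sketch of a FIFO in which stored data never moves."""

    def __init__(self, depth):
        self.cells = [None] * depth   # data bank: one data cell per control cell
        self.full = [False] * depth   # assumption: per-cell occupancy flag
        self.wr_token = 0             # assumption: tokens start at control cell 1
        self.rd_token = 0

    def write(self, value):
        """Push a value; return True if the token-holding cell acknowledges."""
        i = self.wr_token
        if self.full[i]:
            return False              # no cell asserts wr_ack
        self.cells[i] = value         # only the token-holding cell latches data_in
        self.full[i] = True
        self.wr_token = (i + 1) % len(self.cells)  # pass token, wrapping to cell 1
        return True

    def read(self):
        """Pop the oldest value, or return None if nothing is stored."""
        i = self.rd_token
        if not self.full[i]:
            return None               # no cell asserts rd_ack
        value = self.cells[i]         # multiplexer selects the token-holding cell
        self.full[i] = False
        self.rd_token = (i + 1) % len(self.cells)
        return value


fifo = TokenRingFifo(3)
for v in (1, 2, 3):
    fifo.write(v)
print(fifo.read())   # oldest value is read from cell 1 without shifting the others
```

Because only the tokens advance, a value stays in the cell where it was latched until it is read, which is the property that distinguishes this structure from a flow-through FIFO.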
place and route (PAR tool), using the XST tool, receives the generated input netlist (.edt) file. First, this input netlist file is translated by a translation program, together with the other design constraints, into a XILINX database file. Following a successful run of the translation program, the map program maps the design onto a XILINX FPGA. Lastly, the PAR tool accepts the mapped design and places and routes the FPGA accordingly, resulting in the bit-stream generator (BitGen) output. Finally, to program the XILINX device, the placed and routed design is taken and a programming file (.bit) is created for configuration. iMPACT is then used to program the device through a programming cable. Prior to programming the FPGA, a timing simulation needs to be executed, generating a JTAG file for debugging the circuit, regarded as a set of timing functions and requirements corresponding to the RTL code.
[1] Nowick, S. and Singh, M., "High-Performance [...] Design & Test of Computers, vol. 28, no. 5, pp. 8-[...]
[2] Steininger, A., Veeravalli, V. S., Alexandrescu, D., Costenaro, E., Anghel, L., "Exploring the State Dependent SET Sensitivity of [...] Example", 32nd IEEE International Conference on Computer Design (ICCD), pp. [...]
[3] Mannakkara, C. and Yoneda, T., "Asynchronous Pipeline Controller Based on [...]
[4] Furber, S. B. and Day, P., "Four-phase [...] (VLSI) Systems, vol. 4, no. 2, pp. 247-253, [...]
[5] Semiconductor Industry Association, "The [...]
[6] [...] Semiconductors: 2011", ITRS 2011 Edition [...]
[7] Law, C., Gwee, B. H., Senior Member, IEEE, [...] Asynchronous Pipelines", IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 19, no. 4, pp. 682-695, April [...]
[8] Wang, X. and Nurmi, J., "A RTL Asynchronous FIFO Design Using Modified [...] Conference, pp. 1-4, Oct. 2006.
[9] Moreira, M. T., Oliveira, B. S., Moraes, F. G. [...]
[10] Farouk, H. A., El-Hadidi, M. T., [...] Commercial Synchronous FPGAs", IEEE 17th [...] Telecommunications (ICT), pp. 989-994, April [...]