Vlsi Systems For Simultaneous in Logic Simulation: Karthik. S Priyadarsini. K Jeanshilpa. V

1. Logic simulation is commonly used for functional verification of integrated circuits, but suffers from limitations as design complexity increases.
2. Parallel simulation can improve simulation performance by distributing the simulation across multiple processor cores. However, communication overhead between cores limits speedup gains.
3. This paper presents a platform for efficiently simulating high-level specifications of systems in parallel using the Zybo heterogeneous FPGA board. The goal is to evaluate parallel simulation performance gains on this platform.
VLSI SYSTEMS FOR SIMULTANEOUS IN LOGIC SIMULATION

Karthik. S, AP, Department of ECE, SRMIST, Chennai, India
Priyadarsini. K, Research Scholar, Department of CSE, VISTAS, Chennai, India
Jeanshilpa. V, AP, Department of ECE, B.S. Abdur Rahman Crescent IST, Chennai, India

Abstract— Logic simulation, used in conjunction with functional verification, verifies the correctness of an integrated circuit. Verification methodologies can be formal or simulation based. Formal methodologies use exhaustive mathematical techniques to prove that the circuit's responses to all possible inputs, in all reachable states of the circuit, conform to the specification. These methods do not rely on generated input to verify the design. Simulation-based methodologies aim to uncover design errors by thoroughly exercising the current model of the circuit. The aim of this paper is to present a platform for efficient parallel simulation of a high-level specification of the system written in RTL, C or C++. For the experiments, the Zybo heterogeneous platform board is considered in this work. The Zybo board is a development kit that lets the designer develop and test designs. The board contains all the essential interfaces and supporting functions to enable a wide range of applications. The most significant part of the board is the Xilinx Zynq-7000 All Programmable SoC (Zynq SoC), which combines a dual-core ARM Cortex-A9 based PS (Processing System) and Xilinx PL (Programmable Logic) on the same chip and supports hardware-software co-design.

Keywords—Zybo, Profiling, ZynQ, Verification

I. INTRODUCTION

Logic simulation is a mandatory step in the development of today's complex digital designs. Its worth can be seen in the way a project team is put together. A typical project team has about as many verification engineers as design engineers. Sometimes the verification engineers outnumber the design engineers, because to validate a design one must first understand the specification, know the design and, more importantly, develop a different design approach from the specification. It must be stressed that the method adopted by the verification engineer should not be the same as that of the design engineer. If the verification engineer follows the same design style as the design engineer, both will commit the same faults and little will be validated. Hardware complexity continues to grow according to Moore's law, but verification complexity grows even faster. The effort that goes into DV (design verification) is visible in the development cycle of a project: history shows that around 70% of the entire project development cycle is spent on DV. Simulation-based verification is the most frequently used verification method [1]. As described in books and articles, simulation-based verification is a form of verification by redundancy.

In simulation-based verification, the design or circuit is placed under a TB (test bench), input vectors are applied through the test bench, and the output of the design is compared with the expected output. A TB contains code that supports the operation of the design, and at times generates the input vectors and compares the output obtained with the expected output. The input vectors can be produced before simulation and read into the design from a database at simulation time, or they can be produced during the simulation run. The expected output can be produced in the same way. At the end, the two are compared.

Simulators are classified as event-driven simulators, cycle-based simulators and hardware emulators. An event-driven simulator (EDS) evaluates a gate, a module or a block of code every time an input of the gate, or a variable to which the module is sensitive, changes. An event is a change in value. A cycle-based simulator (CBS) partitions the circuit by clock domain and evaluates each sub-circuit at every triggering edge of the clock in its domain. Hence, the event count affects simulator speed. A design with little activity, that is, few events, is likely to run faster on an EDS, whereas a design with a large event count runs faster on a CBS. In practice, most circuits have enough activity that cycle-based simulators outperform their event-driven counterparts, although cycle-based simulators have their own drawbacks. For a design to be simulated in a CBS, the clock domains in the design must be clearly defined. A design with no clock, such as an asynchronous circuit, has no clear clock domain definition. Hence, it cannot be simulated in a CBS.

A hardware emulator, or hardware simulator, models a design by means of hardware elements such as microprocessor arrays and programmable ASICs like field programmable gate arrays (FPGAs). First, the elements of the hardware simulator are configured to model the design. In a processor-array-based hardware simulator, the design is converted into a set of instructions whose execution is equivalent to simulating the design. In an FPGA-based hardware emulator, the FPGAs are configured or programmed to represent the logic present in the design. The outcome of running the hardware is then taken as the simulation outcome of the design. A hardware simulator can be either cycle based or event driven, like a software simulator. Each kind of hardware simulator has its own coding style guidelines, and these are more rigid than those of software simulators. A circuit can be run on a simulator only if it meets all the coding requirements of that simulator. For instance, descriptions containing

978-1-5386-4310-5/18/$31.00 ©2018 IEEE
delay information are not allowed on a hardware-based simulator.

II. LIMITATIONS OF FORMAL VERIFICATION

An alternative to simulation-based verification is formal verification, shown in Figure 1.5, together with static timing analysis (STA). These techniques optionally use simulation internally to improve their efficiency. Formal verification proves a design without stimulus or test vectors. This gives it a vast advantage, because it entirely eliminates the need for a test bench. Formal verification can be divided into equivalence checking and property (or model) checking. Equivalence checking (EC) determines whether two designs are equivalent in their implementation. For instance, equivalence checking is used to determine whether a synthesized gate-level netlist and its Register Transfer Level description are functionally equivalent. Property (or model) checking takes a design and proves or disproves a set of properties given as the specification of the design. A verification approach needs two forms of a design or circuit. Therefore, an error in the specification creates an incomplete reference model, and the error may escape verification. A missing specification is a classic type of specification error. Bugs in the formal-verification software itself can cause errors in the design to be missed and produce a false validation. As is well known, no software can be guaranteed to be bug free.

III. PARALLEL LOGIC SIMULATION

The problem with logic simulation is that simulation time is long for a large design, and running application software against the hardware design is very slow. With an SoC, however, prototyping is quick and low cost, and provides full visibility into tens of thousands of internal signals. A few recent works have demonstrated the effectiveness of using GPUs to speed up gate-level simulation, but fell short in communication and load balancing. With the advancement of FPGA architectures such as the ZYNQ-7000 SoC, simulation can be sped up by creating a heterogeneous architecture, that is, by using both the PS and the PL. In this work we introduce a heterogeneous architecture for speeding up functional verification with the help of application profiling.

Design partitioning [2] is a significant aspect of distributed parallel simulation. Partitioning determines the signals that flow between the partitions and also affects synchronization. Partitioning can be done by functionality, number of modules, number of gates and so on. The biggest difficulty is minimizing communication between the partitioned modules while still running the modules concurrently.

Another major challenge is communication overhead, defined as the time spent exchanging values or messages between the partitions. Synchronization overhead, the cost of coordinating all the simulations running in parallel, is a further factor. As the number of partitions increases, all of these factors become dominant. Figure 1 shows the performance improvement in CPU, memory latency and interconnect technologies. CPU performance has grown significantly faster than memory access time and interconnect technology have improved. This is the reason parallel simulation has not been widely adopted, as figure 1 shows.

Fig. 1. Performance Enhancement (curves: CPU, memory latency, interconnect; axis: performance in decades)

Equation 1.1 shows the formula for speedup, where $t_{p1}$ is the time taken in simulating partition 1, $t_{parallel}$ is the overall simulation time and $t_{communication}$ is the time taken in communicating intermediate messages between the cores.

$$\mathrm{Speedup} = t_{p1} + (t_{parallel} + t_{communication}) \qquad (1.1)$$

Speedup suffers for larger designs, because there are many interdependencies between the modules. In spite of this, researchers have come up with many more techniques, which will be reviewed in further sections. Figure 2 shows distributed simulation.

Fig. 2. Distributed Simulation (Partition1, Partition2 and Partition3 mapped to Core1, Core2 and Core3)

IV. HARDWARE/SOFTWARE CO-DESIGN

Hardware-Software (HW/SW) co-design is the simultaneous development of both the HW and SW sides of the system. The HW parts are implemented in FPGAs [3] or ASICs [4], while the SW parts are translated to a low-level programming language and run on a general-purpose processor. The primary goal of HW/SW co-design is to partition the application so that some parts are implemented in hardware and the others run in software, and this partitioning is critical for performance. To achieve this objective, the compute-intensive parts of the application are mapped to the hardware side. The hardware/software design flow is shown in figure 3.
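The partitioning rule of Section IV (compute-intensive parts go to hardware, the rest stays in software) can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the function names, the time shares, the `synthesizable` flag and the 25% threshold are all hypothetical and are not taken from the paper's designs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfiledFunction:
    name: str             # hypothetical function name
    time_share: float     # fraction of total execution time (0.0 to 1.0)
    synthesizable: bool   # assumption: known in advance for each function


def partition(functions, threshold=0.25):
    """Split functions into (hardware, software) name lists.

    A function goes to hardware only if it is synthesizable and its share
    of execution time reaches the threshold; everything else stays in software.
    """
    hardware = [f.name for f in functions
                if f.synthesizable and f.time_share >= threshold]
    software = [f.name for f in functions if f.name not in hardware]
    return hardware, software


# Illustrative profile; names and numbers are made up for this sketch.
example_profile = [
    ProfiledFunction("encode_block", 0.60, True),
    ProfiledFunction("uart_receive", 0.30, False),
    ProfiledFunction("print_report", 0.10, True),
]

hw_side, sw_side = partition(example_profile)
print("hardware (PL):", hw_side)
print("software (PS):", sw_side)
```

In this example only `encode_block` is moved to the hardware side: `uart_receive` is time consuming but marked as not synthesizable, and `print_report` is synthesizable but too cheap to be worth moving.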

2018 International Conference on Recent Trends in Electrical, Control and Communication (RTECC).
Fig. 3. HW/SW Design flow

V. EXPERIMENTAL PLATFORM

An application is accelerated by distributing its load among the processing resources of a heterogeneous platform. For the experiments, the Zybo [5] heterogeneous platform is considered in this work. The Zybo board, shown in figure 4, is a comprehensive development kit for developers who want to build and test designs. The board includes all the essential interfaces and supporting functions to allow a broad range of applications. The most important part of the board is the Xilinx Zynq-7000 All Programmable SoC (Zynq SoC), which provides all the computational resources of the system. These resources are divided between the ARM (Advanced RISC Machine) based Processing System and the FPGA-based, performance-oriented Programmable Logic.

Fig. 4. Zybo Board from Digilent

VI. PROFILING

Profiling is a technique that shows where a program spends its time and which functions call which other functions during execution. This information reveals which parts of the code take more time, or run slower, than anticipated and are therefore candidates for rework to make the code execute faster. In this work, the function that takes the most time is run on the FPGA rather than on the processor. The profiling workflow is shown in figure 5.

Fig. 5. Profiling work flow (source Xilinx)

In Xilinx SDK, a program running on embedded hardware can be profiled. The profiling technique is software intrusive and is built on the GNU gprof tool. The tool offers two types of information that can be used to improve the code:
• A histogram, which identifies the functions in the application program that take up most of the execution time.
• A call graph, shown in figure 6, which displays which functions called which other functions, and how many times.

Fig. 6. Profiling example
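The two kinds of profiling information listed above (time per function, and who called whom and how often) can be illustrated on a desktop with Python's standard `cProfile` and `pstats` modules. This is a rough analogue, not the Xilinx SDK or gprof flow used in this work; the workload functions `encode_block`, `send_byte` and `application` are hypothetical stand-ins.

```python
import cProfile
import pstats


def encode_block(n):
    # Stand-in for a compute-heavy kernel (hypothetical workload).
    total = 0
    for i in range(n):
        total ^= (i * i) & 0xFF
    return total


def send_byte(value):
    # Stand-in for a cheap I/O helper.
    return value & 0xFF


def application():
    acc = 0
    for _ in range(3):
        acc += encode_block(20000)
    return send_byte(acc)


def profile_calls(func):
    """Run func under cProfile; return {name: (calls, own_time, callers)}."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    stats = pstats.Stats(profiler)
    table = {}
    # Each entry maps (file, line, name) to
    # (primitive calls, total calls, own time, cumulative time, callers).
    for (_file, _line, name), (_cc, ncalls, own_time, _ct, callers) in stats.stats.items():
        caller_names = sorted(caller[2] for caller in callers)
        table[name] = (ncalls, own_time, caller_names)
    return table


profile = profile_calls(application)

# Flat view: functions ordered by the time spent in their own code.
for name, (ncalls, own_time, caller_names) in sorted(
        profile.items(), key=lambda item: item[1][1], reverse=True):
    print(f"{name:45s} calls={ncalls:<3d} own_time={own_time:.6f}s "
          f"called_by={caller_names}")
```

The `own_time` column plays the role of the histogram, and the `called_by` list together with the call count is a minimal form of the call graph. Timings vary from run to run, so only the call counts are stable.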

VII. IMPLEMENTATION FLOW

Figure 7 shows how the simulation of a design was carried out using the Zynq SoC. The Vivado IDE was used to create the Zynq hardware platform, which defines how the ARM PS is configured and which IP blocks are added alongside the ARM.

Fig. 7. Implementation Flow

After the various designs were profiled, the synthesizable functions that took the most time were mapped as custom IP in the PL section of the Zynq. Profiling results are recorded in the gmon.out file. For a deeper view of the time taken by the functions, the Debug option can be chosen and the application run in Debug mode. Debug mode clearly shows the inclusive timing, that is, the time taken by a function together with its child functions, and the exclusive timing, that is, the time taken by the function alone. The following tables show the profiling results of various designs. The function that will be mapped to the PL section is highlighted. Tables I and II show the profiling results of Hamming code and PCI, and figure 8 shows the pie chart representation of the profiling.

TABLE I. PROFILING RESULTS OF HAMMING CODE

name(location)       Samples   Calls   time/call   %time
                     2356                          100%
xiicps_mastersend    1221      76      23us        23.34%
H1.c                 121       68      35.33us     32.43%
outbyte.c            23        21      22.55us     10.23%
xintx_l.c            11        2       16.34us     5.23%
xi_cache.c           2         3       13.44us     1.2%
xbasic_type.c        1         4       1us         1%

Fig. 8. Pie chart representation of profiling

TABLE II. PROFILING RESULTS OF PCI

name(location)              Samples   Calls   time/call   %time
                            29806                         100%
XScuGic_DeviceInitialize    1         1       90.9us      0.01%
XscuGic_RegisterHandler     0         1       0ns         0%
XuartPs_sendByte            232       233     23.34us     12.45%
XUartPs_RecByte             28778     34495   83.456us    79.45%
Xil_in32                    242       -       -           0%
Xil_out16                   485       -       -           0%

VIII. SIMULATION SPEED UP FOR VARIOUS CIRCUIT DESIGNS

Once a design was profiled and the functions to be exported to hardware were identified, testbench simulations were performed on the entire design using Vivado HLS, and the simulation time was noted. The simulation time obtained with the existing methodology was then compared with that of the methodology followed in this paper. Table III shows the simulation time and speed up for various circuit designs.

TABLE III. SIMULATION SPEED UP FOR VARIOUS CIRCUIT DESIGNS

Circuit design   Simulation time using      Simulation time using   Speed up
name             existing technique (sec)   our methodology         factor
Hamming Code     12421                      7210                    1.722746
ALU              8763                       4234                    2.069674
AES              23442                      12311                   1.904151
PCI              34452                      24241                   1.421228

IX. CONCLUSION

From the above results one can conclude that the total time taken by each function can be analyzed, after which the time-consuming function can be transferred to hardware to accelerate verification. In addition, the approach offers considerable flexibility for verification. The Vivado tool, with its variety of environments, made the verification process simple and cost effective.

Another salient feature of this method is that it meets current requirements using existing tools and their associated hardware. The tool is board compatible: by adding the libraries needed by a particular board, the same design can be run on any board. For this work, we used the Zybo board because of its distinguishing feature, the presence of both an ARM Cortex-A9 processor and an FPGA. We applied the method to several filters and to Hamming code, but it can be extended to more intricate and time-consuming designs. In line with our objective, the method is cost effective, since the boards and the software are available at an affordable price. The method also enhances system performance by reducing the number of functions that run on the processor; in other words, the processor is relieved of some of its functions. The processor can use this time for its usual routines and interrupts, so the efficiency of the system improves and the total time taken is reduced. That is, simulation and testing are accelerated, and results are obtained in much less time. Through this work, many aspects of the Vivado tool have been explored, from the wide range of tools available to the various platforms. The ability to use design sources in low-level or high-level languages in this process is a welcome feature of the tool. Another major advantage of this work is that both the processor and the FPGA are used, so the available resources are utilized to the maximum.
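The speed up factors reported in Table III are the ratio of the two simulation-time columns. A minimal Python sketch that recomputes them from the tabulated times is given below; the function and variable names are illustrative and are not part of the original tool flow.

```python
# Simulation times transcribed from Table III:
# (time using existing technique, time using the proposed methodology).
TABLE_III = {
    "Hamming Code": (12421, 7210),
    "ALU": (8763, 4234),
    "AES": (23442, 12311),
    "PCI": (34452, 24241),
}


def speedup_factor(existing_time, proposed_time):
    """Return how many times faster the proposed flow is than the existing one."""
    if proposed_time <= 0:
        raise ValueError("proposed_time must be positive")
    return existing_time / proposed_time


speedups = {name: speedup_factor(old, new) for name, (old, new) in TABLE_III.items()}

for name, factor in speedups.items():
    print(f"{name:13s} speed up factor = {factor:.6f}")
```

Printed to six decimal places, the ratios agree with the last column of Table III, with ALU showing the largest gain and PCI the smallest.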
REFERENCES

[1] William K. Lam (2005). Hardware Design Verification: Simulation and Formal Method-Based Approaches, Prentice Hall PTR.
[2] Tariq B. Ahmad, Maciej Ciesielski, "Parallel Multi-core Verilog HDL Simulation based on Domain", 2014 IEEE Computer Society Annual Symposium on VLSI.
[3] Xilinx-FPGA. Virtex-7 FPGA DSP48E1 slice. https://www.chipestimate.com/techtalk.php?d=2013-01-08.
[4] Martin Rieger. Application specific integrated circuits (ASICs). In The Electronic Design Automation Handbook, pages 384–397. Springer, 2003.
[5] Avnet, Inc. Avnet product brief. http://www.em.avnet.com/en-us/design/drc/Documents/Xilinx/PB-AES-Z7EV-7Z020-G-v7-web.pdf.
[6] gprof. (2007-08-28). Retrieved from https://sourceware.org/binutils/docs-2.18/gprof/index.html#Top
